# troubleshoot
important-afternoon-19755
Hi, team. My DataHub version is v0.10.2. I'm trying to populate the Queries tab using `DatasetUsageStatisticsClass`. I can see the Queries tab, and it works well in a test (I emitted to about 30 URNs). But after I emitted `DatasetUsageStatisticsClass` to about 4k URNs, clicking a data source loads for about 10 seconds and then shows the error "An unknown error occurred. (code 500)", and the page looks like the picture I attached. This happens even for data sources whose Queries tab I haven't populated. Is there a limit to how many datasets can have a populated Queries tab? Also, I set the max length of each query I emit to the Queries tab to 10000; is there a limit on the length of each query?
DataHub Community Support bot
Hey there 👋 I'm The DataHub Community Support bot. I'm here to help make sure the community can best support you with your request. Let's double-check a few things first:
1️⃣ There's a lot of good information on our docs site: www.datahubproject.io/docs. Have you searched there for a solution?
2️⃣ It's not uncommon that someone has run into your exact problem before in the community. Have you searched Slack for similar issues?
delightful-ram-75848
Hi Jiyun - how are you deploying DataHub? I'm wondering whether the resources of each pod are being exceeded during the ingestion. Also, could you post the log of datahub-gms?
important-afternoon-19755
I'm deploying it using docker-compose. After ingestion, running `datahub docker check` reports `No issues detected`. Below is the log of datahub-gms.
delightful-ram-75848
Seems like this is the issue... Can you share your recipe file?
Copy code
.","caused_by":{"type":"illegal_state_exception","reason":"unexpected docvalues type NONE for field 'topSqlQueries' (expected one of [SORTED, SORTED_SET]). Re-index with correct docvalues type."}}}}]},"status":500}
important-afternoon-19755
I emitted the data using this Python code.
Copy code
# Imports added for completeness; the original message omitted them.
from datahub.emitter.mce_builder import get_sys_time
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.metadata.schema_classes import (
    CalendarIntervalClass,
    ChangeTypeClass,
    DatasetUsageStatisticsClass,
    TimeWindowSizeClass,
)

# trim_query truncates a query to a character budget (assumed to be the helper
# from datahub.utilities.sql_formatter; the original snippet does not show its source).
from datahub.utilities.sql_formatter import trim_query


def _emit_to_datahub(queries, query_count: int, db_name: str, tb_name: str):
    if queries and db_name and tb_name:
        # Trim each query to at most 10000 characters before emitting.
        top_sql_queries = [
            trim_query(
                query,
                budget_per_query=10000,
            )
            for query in queries
        ]

        # Daily usage statistics carrying the top SQL queries for this dataset.
        usageStats = DatasetUsageStatisticsClass(
            timestampMillis=get_sys_time(),
            eventGranularity=TimeWindowSizeClass(unit=CalendarIntervalClass.DAY, multiple=1),
            totalSqlQueries=query_count,
            topSqlQueries=top_sql_queries,
        )

        mcp = MetadataChangeProposalWrapper(
            entityType="dataset",
            aspectName="datasetUsageStatistics",
            changeType=ChangeTypeClass.UPSERT,
            entityUrn=f'urn:li:dataset:(urn:li:dataPlatform:glue,{db_name}.{tb_name},PROD)',
            aspect=usageStats,
        )

        # Emit metadata; `emitter` is a DataHub emitter constructed elsewhere
        # (e.g. a DatahubRestEmitter, see the sketch below).
        emitter.emit(item=mcp)
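For reference, a minimal sketch of how the emitter and a call to the function above might look; the GMS endpoint, the sample queries, and the database/table names are placeholder assumptions, not values from this thread.
Copy code
from datahub.emitter.rest_emitter import DatahubRestEmitter

# Assumed GMS endpoint for a local docker-compose deployment.
emitter = DatahubRestEmitter(gms_server="http://localhost:8080")

# Hypothetical queries and Glue database/table names, purely for illustration.
sample_queries = [
    "SELECT id, name FROM customers WHERE signup_date >= '2023-01-01'",
    "SELECT COUNT(*) FROM orders",
]

_emit_to_datahub(
    queries=sample_queries,
    query_count=len(sample_queries),
    db_name="sample_db",
    tb_name="sample_table",
)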
delightful-ram-75848
Thank you for sharing - this seems to be an issue on our side - we'll get back to you! @gentle-hamburger-31302
important-afternoon-19755
Hey, I found the cause. Some of my top_sql_queries entries are longer than 32766 bytes, so the Elasticsearch bulk request failed. After I kept the length of top_sql_queries under 32766, everything works well.
delightful-ram-75848
Glad you figured it out!
bland-lighter-26751
Hey @important-afternoon-19755. What do you mean by "you fixed top_sql_queries's length"? Is that a DataHub setting, or something else?
worried-laptop-98985
I set up DataHub locally yesterday for the first time. Ingesting one BigQuery project went without issue. On ingesting a second project, I now see the message below in the logs, and any attempt to view a dataset (any dataset) fails. The front end gives the "Something went wrong, Error 500" message. Is this a known issue? It's a very early roadblock for me.
The exact message is:
{"error":{"root_cause":[{"type":"exception","reason":"java.util.concurrent.ExecutionException: java.lang.IllegalStateException: unexpected docvalues type NONE for field 'topSqlQueries' (expected one of [SORTED, SORTED_SET])
important-afternoon-19755
@bland-lighter-26751 Oh, sorry. My Slack notifications were off, so I'm only seeing your message now 😂
When I clicked on a data source, I saw that the gms container logged an error related to Elasticsearch, as shown in the earlier comments of this thread. I assumed there was a problem when ingesting topSqlQueries and that Elasticsearch was not able to index it. When I checked the logs of the gms container again while ingesting topSqlQueries, I found the following error and realized it was caused by a limit on the length of topSqlQueries.
Copy code
2023-05-12 14:25:12,148 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener - Failed to feed bulk request. Number of events: 21 Took time ms: -1 Message: failure in bulk execution:
[7]: index [dataset_datasetusagestatisticsaspect_v1], type [_doc], id [25e23835a2de64beff172907fc73c967], message [ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Document contains at least one immense term in field="topSqlQueries" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[91, 34, 83, 69, 76, 69, 67, 84, 92, 110, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 116, 97]...', original message: bytes can be at most 32766 in length; got 33502]]; nested: ElasticsearchException[Elasticsearch exception [type=max_bytes_length_exceeded_exception, reason=bytes can be at most 32766 in length; got 33502]];]
[8]: index [dataset_datasetusagestatisticsaspect_v1], type [_doc], id [5b70b999469bf8232731c1fb3a7e8a9d], message [ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Document contains at least one immense term in field="topSqlQueries" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[91, 34, 83, 69, 76, 69, 67, 84, 92, 110, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 116, 97]...', original message: bytes can be at most 32766 in length; got 51237]]; nested: ElasticsearchException[Elasticsearch exception [type=max_bytes_length_exceeded_exception, reason=bytes can be at most 32766 in length; got 51237]];]
[9]: index [dataset_datasetusagestatisticsaspect_v1], type [_doc], id [5ab06f02d23bfc976a63bdccd605d465], message [ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Document contains at least one immense term in field="topSqlQueries" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[91, 34, 83, 69, 76, 69, 67, 84, 92, 110, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 116, 97]...', original message: bytes can be at most 32766 in length; got 56755]]; nested: ElasticsearchException[Elasticsearch exception [type=max_bytes_length_exceeded_exception, reason=bytes can be at most 32766 in length; got 56755]];]
I'm using DatasetUsageStatisticsClass. It can be used like this:
Copy code
DatasetUsageStatisticsClass(
    timestampMillis=get_sys_time(),
    eventGranularity=TimeWindowSizeClass(unit=CalendarIntervalClass.DAY, multiple=1),
    totalSqlQueries=query_count,
    topSqlQueries=top_sql_queries,
)
So I kept the total UTF-8 length of top_sql_queries under 32766 bytes, like this:
Copy code
# top_sql_queries: List[str]
sum(len(top_sql_query.encode("utf-8")) for top_sql_query in top_sql_queries) < 32766
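Below is a minimal sketch of one way to enforce that bound before building the aspect; the 32766 figure comes from the Elasticsearch error above, and the helper name is hypothetical.
Copy code
from typing import List

# Max term length (in bytes) reported by Elasticsearch in the gms log above.
ES_MAX_TERM_BYTES = 32766


def cap_top_sql_queries(queries: List[str], max_bytes: int = ES_MAX_TERM_BYTES) -> List[str]:
    """Keep queries, in order, while their combined UTF-8 size stays under max_bytes."""
    capped: List[str] = []
    used = 0
    for query in queries:
        size = len(query.encode("utf-8"))
        if used + size >= max_bytes:
            break  # drop the remaining queries, as in the workaround above
        capped.append(query)
        used += size
    return capped


# e.g. top_sql_queries = cap_top_sql_queries(top_sql_queries) before passing
# them to DatasetUsageStatisticsClass(topSqlQueries=...)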
@worried-laptop-98985 I'm not sure if this is a common problem with DataHub, but I encountered a similar error message, just like you did. To troubleshoot, it might be helpful to examine the logs of the GMS container during the ingestion process. If you come across error messages related to Elasticsearch, it could be worth checking whether keeping the length of `topSqlQueries` under 32766 bytes resolves the issue.
worried-laptop-98985
This does feel like a bug that should be handled by DataHub. I may have misread your workaround, but it looks like you're just dropping any longer SQL queries. Is that right?
Connor had the same issue, and it appears it was fixed by deleting and re-creating the ingestion connection, which makes it seem like a problem of state.
important-afternoon-19755
That does sound a bit unusual. In my case, the source I was working with didn't populate the Queries tab during ingestion, so I manually ingested SQL queries using a Python emitter. I had assumed that DataHub's ingestion process would handle the length limit of SQL queries. However, if you encountered the bug while using DataHub's own ingestion, it could indeed be a bug that needs to be addressed by DataHub.
worried-laptop-98985
Hi @delightful-ram-75848 - you mentioned above that this might be a code issue. Did anything get raised for this? Thx
Just to add a bit more info to this: deleting and re-adding the ingestion connection did not fix it for me the way it did for Connor.
bland-lighter-26751
Issue is back for me today 😞
and re-creating the connection fixed it again
worried-laptop-98985
I tried that but no joy