# all-things-deployment
p
Hi Team, I deployed DataHub on my k8s cluster. When I run ingestion through the UI, this is the error I'm getting:
04:17:34.148 [ThreadPoolTaskExecutor-1] WARN  c.l.m.k.DataHubUsageEventsProcessor:56 - Failed to apply usage events transform to record: {"type":"HomePageViewEvent","actorUrn":"urn:li:corpuser:datahub","timestamp":1665375452822,"date":"Mon Oct 10 2022 09:47:32 GMT+0530 (India Standard Time)","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36","browserId":"bbaaab7a-052b-40d6-a41f-715191432a39"}
04:17:34.159 [pool-14-thread-1] INFO  c.l.m.filter.RestliLoggingFilter:55 - GET /entitiesV2?ids=List(urn%3Ali%3Acorpuser%3Adatahub) - batchGet - 200 - 7ms
04:17:34.237 [I/O dispatcher 1] INFO  c.l.m.k.e.ElasticsearchConnector:41 - Successfully feeded bulk request. Number of events: 1 Took time ms: -1
04:17:43.580 [pool-14-thread-1] INFO  c.l.m.filter.RestliLoggingFilter:55 - GET /entitiesV2?ids=List(urn%3Ali%3Acorpuser%3Adatahub) - batchGet - 200 - 37ms
04:17:43.659 [I/O dispatcher 1] INFO  c.l.m.k.e.ElasticsearchConnector:41 - Successfully feeded bulk request. Number of events: 1 Took time ms: -1
04:17:43.677 [Thread-62] WARN  c.l.m.s.e.q.r.SearchRequestHandler:444 - Found invalid filter field for entity search. Invalid or unrecognized facet ingestionSource
04:17:46.502 [Thread-66] WARN  c.l.m.s.e.q.r.SearchRequestHandler:444 - Found invalid filter field for entity search. Invalid or unrecognized facet ingestionSource
04:19:14.391 [ThreadPoolTaskExecutor-1] INFO  c.l.m.k.h.i.IngestionSchedulerHook:56 - Received UPSERT to Ingestion Source. Rescheduling the source (if applicable). urn: urn:li:dataHubIngestionSource:e6330868-94ec-4339-9df5-3c72f9d628ea, key: null.
04:19:14.392 [ThreadPoolTaskExecutor-1] INFO  c.d.m.ingestion.IngestionScheduler:105 - Unscheduling ingestion source with urn urn:li:dataHubIngestionSource:e6330868-94ec-4339-9df5-3c72f9d628ea
04:19:14.393 [ThreadPoolTaskExecutor-1] INFO  c.d.m.ingestion.IngestionScheduler:138 - Scheduling next execution of Ingestion Source with urn urn:li:dataHubIngestionSource:e6330868-94ec-4339-9df5-3c72f9d628ea. Schedule: 0 0 * * *
04:19:14.401 [ThreadPoolTaskExecutor-1] INFO  c.d.m.ingestion.IngestionScheduler:167 - Scheduled next execution of Ingestion Source with urn urn:li:dataHubIngestionSource:e6330868-94ec-4339-9df5-3c72f9d628ea in 51045601ms.
04:19:15.403 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 8 Took time ms: -1 Message: failure in bulk execution:
[3]: index [datahubexecutionrequestindex_v2], type [_doc], id [urn%3Ali%3AdataHubExecutionRequest%3A7846970b-514c-4a9e-931a-980dca6c8e52], message [[datahubexecutionrequestindex_v2/wfhCmJ_jR0e48O5ItrryJA][[datahubexecutionrequestindex_v2][0]] ElasticsearchException[Elasticsearch exception [type=document_missing_exception, reason=[_doc][urn%3Ali%3AdataHubExecutionRequest%3A7846970b-514c-4a9e-931a-980dca6c8e52]: document missing]]]
04:19:16.965 [Thread-77] WARN  c.l.m.s.e.q.r.SearchRequestHandler:444 - Found invalid filter field for entity search. Invalid or unrecognized facet ingestionSource
04:19:18.367 [Thread-80] WARN  c.l.m.s.e.q.r.SearchRequestHandler:444 - Found invalid filter field for entity search. Invalid or unrecognized facet ingestionSource
04:19:18.921 [Thread-83] WARN  c.l.m.s.e.q.r.SearchRequestHandler:444 - Found invalid filter field for entity search. Invalid or unrecognized facet ingestionSource
f
Hey @polite-application-51650, I hit the same problem last week and solved it. The root cause is probably in Kafka. What kind of Kafka did you deploy?
t
We use our own setup for all dependencies: Kafka, ELK, and Postgres instead of the default SQL database.
f
## For AWS MSK set this to a number larger than 1
    partitions: 3
    replicationFactor: 3
Have you added this Kafka config?
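For context on those two values: Kafka refuses to create a topic whose replication factor is larger than the number of available brokers, so `replicationFactor: 3` only works with at least three brokers. A minimal sketch of that sanity check follows. The helper name `check_topic_settings` and the `broker_count` argument are hypothetical and not part of the DataHub chart; the broker count is something you supply yourself.

```python
from typing import List


def check_topic_settings(partitions: int, replication_factor: int, broker_count: int) -> List[str]:
    """Return a list of problems; an empty list means the values are consistent.

    Hypothetical helper: the argument names mirror the `partitions` and
    `replicationFactor` keys from the config snippet, nothing more.
    """
    problems = []
    if partitions < 1:
        problems.append("partitions must be at least 1")
    if replication_factor < 1:
        problems.append("replicationFactor must be at least 1")
    # Kafka cannot place more replicas of a partition than there are brokers.
    if replication_factor > broker_count:
        problems.append(
            f"replicationFactor {replication_factor} exceeds broker count {broker_count}"
        )
    return problems


if __name__ == "__main__":
    # Values from the snippet, checked against an assumed 3-broker cluster.
    print(check_topic_settings(partitions=3, replication_factor=3, broker_count=3))
```

An empty list for the 3-broker case means the snippet's values fit; the same call with `broker_count=1` would report the replication factor problem.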
t
Nope, I've commented them out:
# For AWS MSK set this to a number larger than 1
# partitions: 3
# replicationFactor: 3
f
Please let me know how you deployed your Kafka. Also make sure that Kafka's Brokers Spread % is 100.
t
@polite-application-51650, can you check whether Kafka's Brokers Spread % is 100?
p
Sure
m
I see the same warning messages:
INFO: Invalid event type: SearchAcrossLineageResultsViewEvent
WARN: Failed to apply usage events transform to record: {"type":"SearchAcrossLineageResultsViewEvent","query":"", .... }
I think it's just WIP.
h
@famous-florist-7218 Hi, I'm facing the same problem. I use AWS MSK for Kafka. How can I check the Brokers Spread % in Kafka? FYI, I set partitions and replicationFactor equal to the number of brokers I have set up. Thanks for your help!
f
@helpful-byte-81711 you can use this tool to check your Kafka health: https://github.com/provectus/kafka-ui
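If no UI is at hand, the number can also be worked out from a topic's partition assignment. The sketch below assumes that Brokers Spread % means the share of brokers in the cluster that hold at least one replica of the topic, which is how Kafka management UIs such as CMAK appear to define it; treat that definition as an assumption. The function name and the input layout (partition id mapped to a list of replica broker ids) are hypothetical.

```python
from typing import Dict, List


def brokers_spread_percent(assignment: Dict[int, List[int]], broker_ids: List[int]) -> float:
    """Percentage of cluster brokers that host at least one replica of a topic.

    assignment: partition id -> broker ids holding a replica of that partition.
    broker_ids: all broker ids in the cluster.
    """
    if not broker_ids:
        raise ValueError("broker_ids must not be empty")
    cluster = set(broker_ids)
    # Collect every broker that appears in any replica list, ignoring unknown ids.
    used = {b for replicas in assignment.values() for b in replicas} & cluster
    return 100.0 * len(used) / len(cluster)


if __name__ == "__main__":
    # Illustrative data: 3 partitions, replication factor 3, brokers 1..3.
    assignment = {0: [1, 2, 3], 1: [2, 3, 1], 2: [3, 1, 2]}
    print(brokers_spread_percent(assignment, [1, 2, 3]))
```

With every broker holding a replica the result is 100; a single-partition, single-replica topic on a four-broker cluster would come out at 25.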