# troubleshoot
g
Hi team, I think I have encountered a problem related to mae-consumer. The general symptom is that when we ingest a large amount of data at once (in our actual case, tens of thousands of records), the mae-consumer's offset stops advancing. Both the standalone consumer and the consumer embedded in the metadata service hit this. The version I am currently using is v0.9.2; if I roll back to v0.9.0 the problem goes away. I am not sure what to do next and hope to get some guidance.
a
Hi @great-computer-16446, just so I understand: you validated that the problem does not exist on v0.9.0? If that's the case, could you submit a bug on GitHub detailing the issue?
g
Sure, I will.
a
Thank you!
g
Hi Paul, I found some logs that should explain the problem. It looks like the consumer's throughput is insufficient: our write volume is relatively large, with several real-time ingestion tasks plus the airflow-plugin.
```
012846.778 [kafka-coordinator-heartbeat-thread | generic-mae-consumer-job-client] INFO o.a.k.c.c.i.AbstractCoordinator:979 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Member consumer-generic-mae-consumer-job-client-4-e3c4cd12-3525-45c4-9fe8-9dd96f597c10 sending LeaveGroup request to coordinator prerequisites-kafka-0.prerequisites-kafka-headless.datahub.svc.cluster.local:9092 (id: 2147483647 rack: null) due to consumer poll timeout has expired. This means the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time processing messages. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
```
I have a few ideas to solve the problem:
1. Raise `kafkaListenerConcurrency`. But the topic `MetadataChangeLog_Versioned_v1` currently has only 1 partition, so it looks like the partition count would need to be increased first?
2. Following the log's suggestion, tune `max.poll.interval.ms` and/or `max.poll.records`, adding the parameter settings in `com.linkedin.gms.factory.kafka.KafkaEventConsumerFactory#createInstance`.

What is your suggestion? Thank you.
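For reference, here is a minimal sketch of how the two poll settings named in the log could be supplied when building a Kafka consumer configuration. This is plain Apache Kafka client code, not DataHub's actual `KafkaEventConsumerFactory` implementation; the class name and values are hypothetical. Note also that for idea 1, concurrency beyond the partition count has no effect, so the partition count of `MetadataChangeLog_Versioned_v1` would indeed need to be raised first (e.g. with `kafka-topics.sh --alter --topic MetadataChangeLog_Versioned_v1 --partitions N`; Kafka only allows increasing, never decreasing, partitions).

```java
// Minimal sketch, assuming plain Apache Kafka consumer config; NOT the actual
// DataHub factory code. Class name and values are hypothetical and would need tuning.
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class MaeConsumerPollTuningSketch {
  public static Properties consumerProps() {
    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
        "prerequisites-kafka-0.prerequisites-kafka-headless.datahub.svc.cluster.local:9092");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "generic-mae-consumer-job-client");

    // Give the poll loop more headroom: the Kafka default for
    // max.poll.interval.ms is 300000 (5 minutes).
    props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 600000); // hypothetical value

    // And/or shrink each batch so one poll() iteration finishes sooner:
    // the Kafka default for max.poll.records is 500.
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 100); // hypothetical value

    return props;
  }
}
```

The two levers attack the same symptom from opposite sides: a larger `max.poll.interval.ms` tolerates slow batch processing, while a smaller `max.poll.records` makes each `poll()` cycle cheaper so it completes within the existing deadline.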
a
CC: @brainy-tent-14503, who should be able to help here. I'm unsure about the two specific changes you brought up.