bland-balloon-48379
08/25/2022, 6:48 PM
gorgeous-dinner-4055
08/26/2022, 1:12 AM
browse_paths aspect of your entity correctly
bland-balloon-48379
08/26/2022, 12:45 PM
bland-balloon-48379
08/26/2022, 12:47 PM
bland-balloon-48379
08/29/2022, 1:33 PM
green-football-43791
08/30/2022, 9:30 PM
green-football-43791
08/30/2022, 9:30 PM
green-football-43791
08/30/2022, 9:30 PM
green-football-43791
08/30/2022, 9:31 PM
bland-balloon-48379
08/30/2022, 9:50 PM
Successfully fed bulk request. Number of events: 5 Took time ms: -1
request [POST http://elasticsearch-master:9200/_bulk?timeout=1m]
green-football-43791
08/30/2022, 9:52 PM
* query?
green-football-43791
08/30/2022, 9:52 PM
green-football-43791
08/30/2022, 9:53 PM
green-football-43791
08/30/2022, 9:53 PM
green-football-43791
08/30/2022, 9:53 PM
green-football-43791
08/30/2022, 9:53 PM
green-football-43791
08/30/2022, 9:53 PM
bland-balloon-48379
08/31/2022, 3:09 PM
dataPlatform\:oracle* in the index, I got no results.
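For reference, a wildcard search such as dataPlatform\:oracle* can be written as a query_string request body. This is only a minimal sketch, not the exact request that was run: the index name datasetindex_v2 and the helper build_wildcard_search are assumptions, and the host is the one that appears in the bulk-request log earlier in this thread.

```python
import json

# Host as it appears in the bulk-request log line in this thread.
ES_HOST = "http://elasticsearch-master:9200"
# Assumption: name of the dataset search index; adjust to your deployment.
INDEX = "datasetindex_v2"


def build_wildcard_search(query_string: str, size: int = 5) -> dict:
    """Build a query_string search body (hypothetical helper)."""
    return {
        "size": size,
        "query": {"query_string": {"query": query_string}},
    }


# The colon is escaped with a backslash so that it is matched literally
# instead of being parsed as a field:value separator.
body = build_wildcard_search(r"dataPlatform\:oracle*")
url = f"{ES_HOST}/{INDEX}/_search"
print("POST", url)
print(json.dumps(body, indent=2))
```

The printed body can be sent with any HTTP client or pasted into the Kibana dev console.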
- what happens if you navigate directly to an oracle datasets entity page?
This is not possible as no oracle datasets appear in the UI.
- output of your logs while the reindexing job is running. Do you just see Successfully fed bulk request? Are there any errors? Are there no messages at all?
I ran the reindexing job again and did not see any errors in either the GMS pod or the reindex job pod, but there were a few warnings. In the indexing pod, the logs look like the following:
2022-08-31T14:46:24.874918315Z Reading rows 335000 through 336000 from the aspects table.
2022-08-31T14:46:25.932621207Z Successfully sent MAEs for 336000 rows
In the GMS pod, while the job was running I continued to just see the "Successfully fed bulk request" messages, but after it finished I got the following MAE logs. After this, the logs went back to "Successfully fed bulk request."
2022-08-31T14:47:51.062Z | 14:47:51.061 [I/O dispatcher 1] INFO c.l.m.s.e.update.BulkListener:28 - Successfully fed bulk request. Number of events: 5 Took time ms: -1
2022-08-31T14:47:54.438Z | 14:47:54.437 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:1100 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Failing OffsetCommit request since the consumer is not part of an active group
2022-08-31T14:47:54.438Z | 14:47:54.437 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:1100 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Failing OffsetCommit request since the consumer is not part of an active group
2022-08-31T14:47:54.438Z | 14:47:54.437 [ThreadPoolTaskExecutor-1] WARN o.a.k.c.c.i.ConsumerCoordinator:1041 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Synchronous auto-commit of offsets {MetadataChangeLog_Timeseries_v1-0=OffsetAndMetadata{offset=15, leaderEpoch=0, metadata=''}, MetadataChangeLog_Versioned_v1-0=OffsetAndMetadata{offset=160458, leaderEpoch=0, metadata=''}} failed: Offset commit cannot be completed since the consumer is not part of an active group for auto partition assignment; it is likely that the consumer was kicked out of the group.
2022-08-31T14:47:54.438Z | 14:47:54.437 [ThreadPoolTaskExecutor-1] WARN o.a.k.c.c.i.ConsumerCoordinator:1041 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Synchronous auto-commit of offsets {MetadataChangeLog_Timeseries_v1-0=OffsetAndMetadata{offset=15, leaderEpoch=0, metadata=''}, MetadataChangeLog_Versioned_v1-0=OffsetAndMetadata{offset=160458, leaderEpoch=0, metadata=''}} failed: Offset commit cannot be completed since the consumer is not part of an active group for auto partition assignment; it is likely that the consumer was kicked out of the group.
2022-08-31T14:47:54.438Z | 14:47:54.437 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:669 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Giving away all assigned partitions as lost since generation has been reset,indicating that consumer is no longer part of the group
2022-08-31T14:47:54.438Z | 14:47:54.437 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:669 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Giving away all assigned partitions as lost since generation has been reset,indicating that consumer is no longer part of the group
2022-08-31T14:47:54.438Z | 14:47:54.437 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:311 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Lost previously assigned partitions MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Versioned_v1-0
2022-08-31T14:47:54.438Z | 14:47:54.437 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:311 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Lost previously assigned partitions MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Versioned_v1-0
2022-08-31T14:47:54.438Z | 14:47:54.438 [ThreadPoolTaskExecutor-1] INFO o.s.k.l.KafkaMessageListenerContainer:292 - generic-mae-consumer-job-client: partitions lost: [MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Versioned_v1-0]
2022-08-31T14:47:54.438Z | 14:47:54.438 [ThreadPoolTaskExecutor-1] INFO o.s.k.l.KafkaMessageListenerContainer:292 - generic-mae-consumer-job-client: partitions revoked: [MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Versioned_v1-0]
2022-08-31T14:47:54.438Z | 14:47:54.438 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.AbstractCoordinator:552 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] (Re-)joining group
2022-08-31T14:47:54.438Z | 14:47:54.438 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.AbstractCoordinator:552 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] (Re-)joining group
2022-08-31T14:47:54.439Z | 14:47:54.439 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.AbstractCoordinator:455 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Join group failed with org.apache.kafka.common.errors.MemberIdRequiredException: The group member needs to have a valid member id before actually entering a consumer group
2022-08-31T14:47:54.439Z | 14:47:54.439 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.AbstractCoordinator:455 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Join group failed with org.apache.kafka.common.errors.MemberIdRequiredException: The group member needs to have a valid member id before actually entering a consumer group
2022-08-31T14:47:54.439Z | 14:47:54.439 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.AbstractCoordinator:552 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] (Re-)joining group
2022-08-31T14:47:54.439Z | 14:47:54.439 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.AbstractCoordinator:552 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] (Re-)joining group
2022-08-31T14:47:54.441Z | 14:47:54.441 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:604 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Finished assignment for group at generation 2367: {consumer-generic-mae-consumer-job-client-4-37ce43dd-14f5-4a94-8a15-17707cfede99=Assignment(partitions=[MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Versioned_v1-0])}
2022-08-31T14:47:54.441Z | 14:47:54.441 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:604 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Finished assignment for group at generation 2367: {consumer-generic-mae-consumer-job-client-4-37ce43dd-14f5-4a94-8a15-17707cfede99=Assignment(partitions=[MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Versioned_v1-0])}
2022-08-31T14:47:54.443Z | 14:47:54.442 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.AbstractCoordinator:503 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Successfully joined group with generation 2367
2022-08-31T14:47:54.443Z | 14:47:54.442 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.AbstractCoordinator:503 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Successfully joined group with generation 2367
2022-08-31T14:47:54.443Z | 14:47:54.443 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:273 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Adding newly assigned partitions: MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Versioned_v1-0
2022-08-31T14:47:54.443Z | 14:47:54.443 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:273 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Adding newly assigned partitions: MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Versioned_v1-0
2022-08-31T14:47:54.444Z | 14:47:54.444 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:792 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Setting offset for partition MetadataChangeLog_Versioned_v1-0 to the committed offset FetchPosition{offset=159958, offsetEpoch=Optional[0], currentLeader=LeaderAndEpoch{leader=Optional[prerequisites-kafka-0.prerequisites-kafka-headless.datagovernance.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}
2022-08-31T14:47:54.444Z | 14:47:54.444 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:792 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Setting offset for partition MetadataChangeLog_Versioned_v1-0 to the committed offset FetchPosition{offset=159958, offsetEpoch=Optional[0], currentLeader=LeaderAndEpoch{leader=Optional[prerequisites-kafka-0.prerequisites-kafka-headless.datagovernance.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}
2022-08-31T14:47:54.444Z | 14:47:54.444 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:792 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Setting offset for partition MetadataChangeLog_Timeseries_v1-0 to the committed offset FetchPosition{offset=15, offsetEpoch=Optional[0], currentLeader=LeaderAndEpoch{leader=Optional[prerequisites-kafka-0.prerequisites-kafka-headless.datagovernance.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}
2022-08-31T14:47:54.444Z | 14:47:54.444 [ThreadPoolTaskExecutor-1] INFO o.a.k.c.c.i.ConsumerCoordinator:792 - [Consumer clientId=consumer-generic-mae-consumer-job-client-4, groupId=generic-mae-consumer-job-client] Setting offset for partition MetadataChangeLog_Timeseries_v1-0 to the committed offset FetchPosition{offset=15, offsetEpoch=Optional[0], currentLeader=LeaderAndEpoch{leader=Optional[prerequisites-kafka-0.prerequisites-kafka-headless.datagovernance.svc.cluster.local:9092 (id: 0 rack: null)], epoch=0}}
2022-08-31T14:47:54.445Z | 14:47:54.445 [ThreadPoolTaskExecutor-1] INFO o.s.k.l.KafkaMessageListenerContainer:292 - generic-mae-consumer-job-client: partitions assigned: [MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Versioned_v1-0]
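The warnings above show a failed commit at offset 160458 for MetadataChangeLog_Versioned_v1-0, followed by a rejoin at the committed offset 159958. Below is a minimal sketch of pulling those two numbers out of such log lines to estimate how many messages would be read again after the rebalance. The regular expressions and function names are illustrative assumptions, and the sample strings are shortened copies of the log lines above.

```python
import re

# Shortened copies of two log lines from the GMS output above.
FAILED_COMMIT = (
    "Synchronous auto-commit of offsets "
    "{MetadataChangeLog_Timeseries_v1-0=OffsetAndMetadata{offset=15, leaderEpoch=0, metadata=''}, "
    "MetadataChangeLog_Versioned_v1-0=OffsetAndMetadata{offset=160458, leaderEpoch=0, metadata=''}} failed"
)
RESUMED = (
    "Setting offset for partition MetadataChangeLog_Versioned_v1-0 "
    "to the committed offset FetchPosition{offset=159958, offsetEpoch=Optional[0]}"
)


def failed_commit_offsets(line: str) -> dict:
    """Map partition -> offset the consumer tried (and failed) to commit."""
    pairs = re.findall(r"([\w-]+)=OffsetAndMetadata\{offset=(\d+)", line)
    return {partition: int(offset) for partition, offset in pairs}


def resumed_offset(line: str) -> tuple:
    """Return (partition, offset) the consumer resumed from after rejoining."""
    m = re.search(
        r"partition ([\w-]+) to the committed offset FetchPosition\{offset=(\d+)",
        line,
    )
    if m is None:
        raise ValueError("not a 'Setting offset' line")
    return m.group(1), int(m.group(2))


attempted = failed_commit_offsets(FAILED_COMMIT)
partition, committed = resumed_offset(RESUMED)
# Messages between the last committed offset and the failed commit
# are fetched again once the consumer has rejoined the group.
replayed = attempted[partition] - committed
print(f"{partition}: {replayed} messages will be consumed again")
```

For the lines in this thread the difference is 500 on the versioned topic, while the timeseries topic resumes at the same offset (15) it tried to commit.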
bland-balloon-48379
08/31/2022, 3:17 PM
bland-balloon-48379
08/31/2022, 4:11 PM
datahub ingest list-runs. I added a description to give some more info on what each run is. There are two things which are weird about this:
1. The no-run-id-provided run. I believe this corresponds to a Greenplum ingestion run that was stopped partway through. It says 98 rows created, but I can only find 3 in mysql that say no-run-id-provided and fall within the appropriate timeframe. However, these records have an incorrect platform ID, so other records may have been overwritten later that day.
2. Any ingestion runs after the ldap ingestion do not show up in this table. However, if I go into mysql I can find entries with "runId":"oracle-2022_08_24-19_24_06".
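One way to cross-check the second point is to count aspect rows per run ID directly from the system metadata JSON stored with each row. This is a minimal sketch, assuming each row's system metadata is a JSON string with a runId key, as in the entry quoted above. The sample rows are made up for illustration, the helper names are hypothetical, and fetching the rows from mysql is left out.

```python
import json
from collections import Counter


def run_id_of(system_metadata):
    """Extract runId from one row's system metadata JSON string."""
    if not system_metadata:
        return "<missing>"
    return json.loads(system_metadata).get("runId", "<missing>")


def rows_per_run(system_metadata_rows) -> Counter:
    """Count rows for each runId."""
    return Counter(run_id_of(raw) for raw in system_metadata_rows)


# Illustrative sample only; real values would come from the aspects table.
sample = [
    '{"runId":"oracle-2022_08_24-19_24_06"}',
    '{"runId":"oracle-2022_08_24-19_24_06"}',
    '{"runId":"no-run-id-provided"}',
    None,
]
counts = rows_per_run(sample)
for run_id, count in counts.most_common():
    print(run_id, count)
```

Comparing these counts with the rows column reported by the CLI would show which runs exist in the database but are absent from the listing.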
Datahub deployment created: 2022-08-22 20:16:17 (UTC)
Ingestion runs from Datahub:
+-----------------------------------------------+--------+---------------------------+---------------------+
| runId | rows | created at | Description |
+===============================================+========+===========================+=====================+
| file-2022_08_24-18_49_19 | 19593 | 2022-08-24 18:49:24 (UTC) | ldap users & groups |
+-----------------------------------------------+--------+---------------------------+---------------------+
| no-run-id-provided | 98 | 2022-08-24 18:12:06 (UTC) | Canceled greenplum |
+-----------------------------------------------+--------+---------------------------+---------------------+
| datahub-business-glossary-2022_08_23-19_52_43 | 10857 | 2022-08-23 19:52:46 (UTC) | Glossary terms |
+-----------------------------------------------+--------+---------------------------+---------------------+
| datahub-business-glossary-2022_08_23-19_51_43 | 45 | 2022-08-23 19:51:43 (UTC) | Glossary terms |
+-----------------------------------------------+--------+---------------------------+---------------------+
| datahub-business-glossary-2022_08_23-19_51_20 | 36 | 2022-08-23 19:51:20 (UTC) | Glossary terms |
+-----------------------------------------------+--------+---------------------------+---------------------+
| datahub-business-glossary-2022_08_23-19_51_02 | 48 | 2022-08-23 19:51:02 (UTC) | Glossary terms |
+-----------------------------------------------+--------+---------------------------+---------------------+
| datahub-business-glossary-2022_08_23-19_50_30 | 86 | 2022-08-23 19:50:31 (UTC) | Glossary terms |
+-----------------------------------------------+--------+---------------------------+---------------------+
| datahub-business-glossary-2022_08_23-19_49_43 | 208 | 2022-08-23 19:49:43 (UTC) | Glossary terms |
+-----------------------------------------------+--------+---------------------------+---------------------+
| sqlalchemy-2022_08_23-19_26_59 | 178 | 2022-08-23 19:27:07 (UTC) | Greenplum schema |
+-----------------------------------------------+--------+---------------------------+---------------------+
| sqlalchemy-2022_08_23-15_59_03 | 115493 | 2022-08-23 16:15:55 (UTC) | Greenplum db |
+-----------------------------------------------+--------+---------------------------+---------------------+
First oracle ingestion: 2022-08-24 19:24:02 (UTC)
Second oracle ingestion: 2022-08-25 13:08:32 (UTC)
bland-balloon-48379
08/31/2022, 4:27 PM
green-football-43791
08/31/2022, 4:30 PM
green-football-43791
08/31/2022, 4:30 PM
bland-balloon-48379
08/31/2022, 4:34 PM
green-football-43791
08/31/2022, 4:38 PM
bland-balloon-48379
08/31/2022, 8:57 PM
18:40:59.724 [pool-11-thread-1] ERROR c.l.d.g.a.service.AnalyticsService:264 - Search query failed: Elasticsearch exception [type=index_not_found_exception, reason=no such index [datahub_usage_event]]
18:40:59.725 [pool-11-thread-1] ERROR o.s.s.s.TaskUtils$LoggingErrorHandler:95 - Unexpected error occurred in scheduled task
java.lang.RuntimeException: Search query failed:
...
18:41:37.625 [pool-9-thread-1] ERROR c.d.m.ingestion.IngestionScheduler:243 - Failed to retrieve ingestion sources! Skipping updating schedule cache until next refresh. start: 0, count: 30
Using Kibana, I verified that no datahub_usage_event index exists in ES. Also, only the system_metadata_service_v1, corpuserindex_v2, and datahubpolicyindex_v2 indexes have any documents at all, with 500, 167, and 11 respectively.
I ran the reindexing job after this. It resulted in no change, and there were no errors or meaningful messages in the logs.
bland-balloon-48379
09/02/2022, 5:27 PM
green-football-43791
09/02/2022, 5:28 PM
green-football-43791
09/02/2022, 5:29 PM
green-football-43791
09/02/2022, 5:29 PM
kind-dawn-17532
09/02/2022, 5:31 PM
green-football-43791
09/02/2022, 5:32 PM
green-football-43791
09/02/2022, 5:32 PM
green-football-43791
09/02/2022, 5:32 PM
kind-dawn-17532
09/02/2022, 5:33 PM