fancy-alligator-33404
09/26/2022, 10:54 AM
gray-shoe-75895
09/27/2022, 12:48 AM
gray-shoe-75895
09/27/2022, 12:49 AM
fancy-alligator-33404
09/27/2022, 4:26 AM~~~~ Execution Summary ~~~~
RUN_INGEST - {'errors': [],
'exec_id': '26cb9a6b-395a-4a11-a539-62025765b1f2',
'infos': ['2022-09-27 04:18:07.857268 [exec_id=26cb9a6b-395a-4a11-a539-62025765b1f2] INFO: Starting execution for task with name=RUN_INGEST',
'2022-09-27 04:18:24.032800 [exec_id=26cb9a6b-395a-4a11-a539-62025765b1f2] INFO: stdout=Elapsed seconds = 0\n'
' --report-to TEXT Provide an destination to send a structured\n'
'This version of datahub supports report-to functionality\n'
'datahub ingest run -c /tmp/datahub/ingest/26cb9a6b-395a-4a11-a539-62025765b1f2/recipe.yml --report-to '
'/tmp/datahub/ingest/26cb9a6b-395a-4a11-a539-62025765b1f2/ingestion_report.json\n'
'[2022-09-27 04:18:11,108] INFO {datahub.cli.ingest_cli:179} - DataHub CLI version: 0.8.44\n'
'[2022-09-27 04:18:11,139] INFO {datahub.ingestion.run.pipeline:165} - Sink configured successfully. DataHubRestEmitter: configured '
'to talk to <http://datahub-datahub-gms:8080>\n'
'[2022-09-27 04:18:16,524] INFO {datahub.ingestion.run.pipeline:190} - Source configured successfully.\n'
'[2022-09-27 04:18:16,526] INFO {datahub.cli.ingest_cli:126} - Starting metadata ingestion\n'
'[2022-09-27 04:18:22,081] INFO {datahub.ingestion.source.state_provider.datahub_ingestion_checkpointing_provider:73} - Querying for '
"the latest ingestion checkpoint for pipelineName:'urn:li:dataHubIngestionSource:0f7c7bfb-d1d0-4e4f-93d1-0e248952aa26', "
"platformInstanceId:'hive_192.168.91.140:10000_gsc_ods', job_name:'common_ingest_from_sql_source'\n"
'[2022-09-27 04:18:22,095] INFO {datahub.ingestion.source.state_provider.datahub_ingestion_checkpointing_provider:93} - The last '
"committed ingestion checkpoint for pipelineName:'urn:li:dataHubIngestionSource:0f7c7bfb-d1d0-4e4f-93d1-0e248952aa26', "
"platformInstanceId:'hive_192.168.91.140:10000_gsc_ods', job_name:'common_ingest_from_sql_source' found with start_time: 2022-09-26 "
'10:52:01.128000+00:00 and a bucket duration of None.\n'
'[2022-09-27 04:18:22,096] INFO {datahub.ingestion.source.state.checkpoint:130} - Successfully constructed last checkpoint state for '
'job common_ingest_from_sql_source\n'
'[2022-09-27 04:18:22,159] INFO {datahub.ingestion.run.pipeline:420} - Processing commit request for '
'DatahubIngestionCheckpointingProvider. Commit policy = CommitPolicy.ON_NO_ERRORS, has_errors=False, has_warnings=False\n'
'[2022-09-27 04:18:22,159] INFO {datahub.ingestion.source.state_provider.datahub_ingestion_checkpointing_provider:140} - Committing '
'ingestion checkpoint for '
"pipeline:'urn:li:dataHubIngestionSource:0f7c7bfb-d1d0-4e4f-93d1-0e248952aa26',instance:'hive_192.168.91.140:10000_gsc_ods', "
"job:'common_ingest_from_sql_source'\n"
'[2022-09-27 04:18:22,169] INFO {datahub.ingestion.source.state_provider.datahub_ingestion_checkpointing_provider:166} - Committed '
'ingestion checkpoint for '
"pipeline:'urn:li:dataHubIngestionSource:0f7c7bfb-d1d0-4e4f-93d1-0e248952aa26',instance:'hive_192.168.91.140:10000_gsc_ods', "
"job:'common_ingest_from_sql_source'\n"
'[2022-09-27 04:18:22,169] INFO {datahub.ingestion.run.pipeline:440} - Successfully committed changes for '
'DatahubIngestionCheckpointingProvider.\n'
'[2022-09-27 04:18:22,170] INFO {datahub.ingestion.reporting.file_reporter:54} - Wrote SUCCESS report successfully to '
"<_io.TextIOWrapper name='/tmp/datahub/ingest/26cb9a6b-395a-4a11-a539-62025765b1f2/ingestion_report.json' mode='w' encoding='UTF-8'>\n"
'[2022-09-27 04:18:22,170] INFO {datahub.cli.ingest_cli:147} - Finished metadata ingestion\n'
'\n'
'Cli report:\n'
"{'cli_version': '0.8.44',\n"
" 'cli_entry_location': '/tmp/datahub/ingest/venv-hive-0.8.44/lib/python3.9/site-packages/datahub/__init__.py',\n"
" 'py_version': '3.9.9 (main, Dec 21 2021, 10:03:34) \\n[GCC 10.2.1 20210110]',\n"
" 'py_exec_path': '/tmp/datahub/ingest/venv-hive-0.8.44/bin/python3',\n"
" 'os_details': 'Linux-5.4.0-65-generic-x86_64-with-glibc2.31'}\n"
'Source (hive) report:\n'
"{'events_produced': '42',\n"
" 'events_produced_per_sec': '7',\n"
" 'event_ids': ['gsc_ods.sstp_stp_item_sbc-subtypes',\n"
" 'gsc_ods.sstp_stp_mst-subtypes',\n"
" 'sstp_stp_item-viewProperties',\n"
' '
"'container-urn:li:container:21c2cec8d1e1252753fdf82a6eb422af-to-urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item_cls,PROD)',\n"
" 'sstp_stp_item_cls-subtypes',\n"
' '
"'container-urn:li:container:21c2cec8d1e1252753fdf82a6eb422af-to-urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item_sbc,PROD)',\n"
' '
"'container-urn:li:container:21c2cec8d1e1252753fdf82a6eb422af-to-urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_mst,PROD)',\n"
" 'gsc_ods.sstp_stp_rslt',\n"
" 'sstp_stp_rslt-subtypes',\n"
" 'sstp_stp_rslt-viewProperties',\n"
" '... sampled of 42 total elements'],\n"
" 'warnings': {},\n"
" 'failures': {},\n"
" 'tables_scanned': '5',\n"
" 'views_scanned': '5',\n"
" 'entities_profiled': '0',\n"
" 'filtered': [],\n"
" 'soft_deleted_stale_entities': [],\n"
" 'start_time': '2022-09-27 04:18:16.274423',\n"
" 'running_time_in_seconds': '6'}\n"
'Sink (datahub-rest) report:\n'
"{'total_records_written': '42',\n"
" 'records_written_per_second': '3',\n"
" 'warnings': [],\n"
" 'failures': [],\n"
" 'start_time': '2022-09-27 04:18:10.041446',\n"
" 'current_time': '2022-09-27 04:18:22.418061',\n"
" 'total_duration_in_seconds': '12.38',\n"
" 'gms_version': 'v0.8.44',\n"
" 'pending_requests': '0'}\n"
'\n'
' Pipeline finished successfully ; produced 42 events in 6 seconds.\n',
"2022-09-27 04:18:24.033103 [exec_id=26cb9a6b-395a-4a11-a539-62025765b1f2] INFO: Successfully executed 'datahub ingest'"],
'structured_report': '{"source": {"type": "hive", "report": {"events_produced": "42", "events_produced_per_sec": "8", "event_ids": '
'["gsc_ods.sstp_stp_item_sbc-subtypes", "gsc_ods.sstp_stp_mst-subtypes", "sstp_stp_item-viewProperties", '
'"container-urn:li:container:21c2cec8d1e1252753fdf82a6eb422af-to-urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item_cls,PROD)", '
'"sstp_stp_item_cls-subtypes", '
'"container-urn:li:container:21c2cec8d1e1252753fdf82a6eb422af-to-urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item_sbc,PROD)", '
'"container-urn:li:container:21c2cec8d1e1252753fdf82a6eb422af-to-urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_mst,PROD)", '
'"gsc_ods.sstp_stp_rslt", "sstp_stp_rslt-subtypes", "sstp_stp_rslt-viewProperties", "... sampled of 42 total elements"], '
'"warnings": {}, "failures": {}, "tables_scanned": "5", "views_scanned": "5", "entities_profiled": "0", "filtered": [], '
'"soft_deleted_stale_entities": [], "start_time": "2022-09-27 04:18:16.274423", "running_time_in_seconds": "5"}}, "sink": '
'{"type": "datahub-rest", "report": {"total_records_written": "42", "records_written_per_second": "3", "warnings": [], '
'"failures": [], "start_time": "2022-09-27 04:18:10.041446", "current_time": "2022-09-27 04:18:22.169722", '
'"total_duration_in_seconds": "12.13", "gms_version": "v0.8.44", "pending_requests": "0"}}}'}
Execution finished successfully!
fancy-alligator-33404
09/27/2022, 4:26 AM~~~~ Execution Summary ~~~~
RUN_INGEST - {'errors': [],
'exec_id': 'dd500399-b2b8-4393-91b8-344f6bc4b9e3',
'infos': ['2022-09-27 04:21:55.525345 [exec_id=dd500399-b2b8-4393-91b8-344f6bc4b9e3] INFO: Starting execution for task with name=RUN_INGEST',
'2022-09-27 04:22:05.634332 [exec_id=dd500399-b2b8-4393-91b8-344f6bc4b9e3] INFO: stdout=Elapsed seconds = 0\n'
' --report-to TEXT Provide an destination to send a structured\n'
'This version of datahub supports report-to functionality\n'
'datahub ingest run -c /tmp/datahub/ingest/dd500399-b2b8-4393-91b8-344f6bc4b9e3/recipe.yml --report-to '
'/tmp/datahub/ingest/dd500399-b2b8-4393-91b8-344f6bc4b9e3/ingestion_report.json\n'
'[2022-09-27 04:21:57,486] INFO {datahub.cli.ingest_cli:179} - DataHub CLI version: 0.8.44\n'
'[2022-09-27 04:21:57,515] INFO {datahub.ingestion.run.pipeline:165} - Sink configured successfully. DataHubRestEmitter: configured '
'to talk to <http://datahub-datahub-gms:8080>\n'
'[2022-09-27 04:21:59,473] INFO {datahub.ingestion.run.pipeline:190} - Source configured successfully.\n'
'[2022-09-27 04:21:59,474] INFO {datahub.cli.ingest_cli:126} - Starting metadata ingestion\n'
'[2022-09-27 04:22:02,982] INFO {datahub.ingestion.source.state_provider.datahub_ingestion_checkpointing_provider:73} - Querying for '
"the latest ingestion checkpoint for pipelineName:'urn:li:dataHubIngestionSource:0f7c7bfb-d1d0-4e4f-93d1-0e248952aa26', "
"platformInstanceId:'hive_192.168.91.140:10000_gsc_ods', job_name:'common_ingest_from_sql_source'\n"
'[2022-09-27 04:22:02,997] INFO {datahub.ingestion.source.state_provider.datahub_ingestion_checkpointing_provider:93} - The last '
"committed ingestion checkpoint for pipelineName:'urn:li:dataHubIngestionSource:0f7c7bfb-d1d0-4e4f-93d1-0e248952aa26', "
"platformInstanceId:'hive_192.168.91.140:10000_gsc_ods', job_name:'common_ingest_from_sql_source' found with start_time: 2022-09-27 "
'04:18:22.097000+00:00 and a bucket duration of None.\n'
'[2022-09-27 04:22:02,997] INFO {datahub.ingestion.source.state.checkpoint:130} - Successfully constructed last checkpoint state for '
'job common_ingest_from_sql_source\n'
'[2022-09-27 04:22:02,997] INFO {datahub.ingestion.source.sql.sql_common:626} - Soft-deleting stale entity of type view - '
'urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_rslt,PROD).\n'
'[2022-09-27 04:22:02,998] INFO {datahub.ingestion.source.sql.sql_common:626} - Soft-deleting stale entity of type view - '
'urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item_cls,PROD).\n'
'[2022-09-27 04:22:02,998] INFO {datahub.ingestion.source.sql.sql_common:626} - Soft-deleting stale entity of type view - '
'urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_mst,PROD).\n'
'[2022-09-27 04:22:02,998] INFO {datahub.ingestion.source.sql.sql_common:626} - Soft-deleting stale entity of type view - '
'urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item_sbc,PROD).\n'
'[2022-09-27 04:22:02,999] INFO {datahub.ingestion.source.sql.sql_common:626} - Soft-deleting stale entity of type view - '
'urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item,PROD).\n'
'[2022-09-27 04:22:03,132] INFO {datahub.ingestion.run.pipeline:420} - Processing commit request for '
'DatahubIngestionCheckpointingProvider. Commit policy = CommitPolicy.ON_NO_ERRORS, has_errors=False, has_warnings=False\n'
'[2022-09-27 04:22:03,132] INFO {datahub.ingestion.source.state_provider.datahub_ingestion_checkpointing_provider:140} - Committing '
'ingestion checkpoint for '
"pipeline:'urn:li:dataHubIngestionSource:0f7c7bfb-d1d0-4e4f-93d1-0e248952aa26',instance:'hive_192.168.91.140:10000_gsc_ods', "
"job:'common_ingest_from_sql_source'\n"
'[2022-09-27 04:22:03,140] INFO {datahub.ingestion.source.state_provider.datahub_ingestion_checkpointing_provider:166} - Committed '
'ingestion checkpoint for '
"pipeline:'urn:li:dataHubIngestionSource:0f7c7bfb-d1d0-4e4f-93d1-0e248952aa26',instance:'hive_192.168.91.140:10000_gsc_ods', "
"job:'common_ingest_from_sql_source'\n"
'[2022-09-27 04:22:03,140] INFO {datahub.ingestion.run.pipeline:440} - Successfully committed changes for '
'DatahubIngestionCheckpointingProvider.\n'
'[2022-09-27 04:22:03,141] INFO {datahub.ingestion.reporting.file_reporter:54} - Wrote SUCCESS report successfully to '
"<_io.TextIOWrapper name='/tmp/datahub/ingest/dd500399-b2b8-4393-91b8-344f6bc4b9e3/ingestion_report.json' mode='w' encoding='UTF-8'>\n"
'[2022-09-27 04:22:03,141] INFO {datahub.cli.ingest_cli:147} - Finished metadata ingestion\n'
'\n'
'Cli report:\n'
"{'cli_version': '0.8.44',\n"
" 'cli_entry_location': '/tmp/datahub/ingest/venv-hive-0.8.44/lib/python3.9/site-packages/datahub/__init__.py',\n"
" 'py_version': '3.9.9 (main, Dec 21 2021, 10:03:34) \\n[GCC 10.2.1 20210110]',\n"
" 'py_exec_path': '/tmp/datahub/ingest/venv-hive-0.8.44/bin/python3',\n"
" 'os_details': 'Linux-5.4.0-65-generic-x86_64-with-glibc2.31'}\n"
'Source (hive) report:\n'
"{'events_produced': '27',\n"
" 'events_produced_per_sec': '6',\n"
" 'event_ids': ['container-platforminstance-gsc_ods-urn:li:container:93a7f4080f9d1d30c44551ff89691612',\n"
" 'container-subtypes-gsc_ods-urn:li:container:93a7f4080f9d1d30c44551ff89691612',\n"
" 'container-info-gsc_ods-urn:li:container:21c2cec8d1e1252753fdf82a6eb422af',\n"
' '
"'container-parent-container-gsc_ods-urn:li:container:21c2cec8d1e1252753fdf82a6eb422af-urn:li:container:93a7f4080f9d1d30c44551ff89691612',\n"
" 'gsc_ods.sstp_stp_item',\n"
" 'gsc_ods.sstp_stp_item-subtypes',\n"
" 'gsc_ods.sstp_stp_item_sbc',\n"
" 'gsc_ods.sstp_stp_mst-subtypes',\n"
' '
"'container-urn:li:container:21c2cec8d1e1252753fdf82a6eb422af-to-urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_rslt,PROD)',\n"
" 'soft-delete-view-urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_rslt,PROD)',\n"
" '... sampled of 27 total elements'],\n"
" 'warnings': {},\n"
" 'failures': {},\n"
" 'tables_scanned': '5',\n"
" 'views_scanned': '0',\n"
" 'entities_profiled': '0',\n"
" 'filtered': [],\n"
" 'soft_deleted_stale_entities': ['urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_rslt,PROD)',\n"
" 'urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item_cls,PROD)',\n"
" 'urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_mst,PROD)',\n"
" 'urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item_sbc,PROD)',\n"
" 'urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item,PROD)'],\n"
" 'start_time': '2022-09-27 04:21:59.259988',\n"
" 'running_time_in_seconds': '4'}\n"
'Sink (datahub-rest) report:\n'
"{'total_records_written': '27',\n"
" 'records_written_per_second': '3',\n"
" 'warnings': [],\n"
" 'failures': [],\n"
" 'start_time': '2022-09-27 04:21:56.553535',\n"
" 'current_time': '2022-09-27 04:22:03.351593',\n"
" 'total_duration_in_seconds': '6.8',\n"
" 'gms_version': 'v0.8.44',\n"
" 'pending_requests': '0'}\n"
'\n'
' Pipeline finished successfully ; produced 27 events in 4 seconds.\n',
"2022-09-27 04:22:05.634597 [exec_id=dd500399-b2b8-4393-91b8-344f6bc4b9e3] INFO: Successfully executed 'datahub ingest'"],
'structured_report': '{"source": {"type": "hive", "report": {"events_produced": "27", "events_produced_per_sec": "9", "event_ids": '
'["container-platforminstance-gsc_ods-urn:li:container:93a7f4080f9d1d30c44551ff89691612", '
'"container-subtypes-gsc_ods-urn:li:container:93a7f4080f9d1d30c44551ff89691612", '
'"container-info-gsc_ods-urn:li:container:21c2cec8d1e1252753fdf82a6eb422af", '
'"container-parent-container-gsc_ods-urn:li:container:21c2cec8d1e1252753fdf82a6eb422af-urn:li:container:93a7f4080f9d1d30c44551ff89691612", '
'"gsc_ods.sstp_stp_item", "gsc_ods.sstp_stp_item-subtypes", "gsc_ods.sstp_stp_item_sbc", "gsc_ods.sstp_stp_mst-subtypes", '
'"container-urn:li:container:21c2cec8d1e1252753fdf82a6eb422af-to-urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_rslt,PROD)", '
'"soft-delete-view-urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_rslt,PROD)", "... sampled of 27 total elements"], '
'"warnings": {}, "failures": {}, "tables_scanned": "5", "views_scanned": "0", "entities_profiled": "0", "filtered": [], '
'"soft_deleted_stale_entities": ["urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_rslt,PROD)", '
'"urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item_cls,PROD)", '
'"urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_mst,PROD)", '
'"urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item_sbc,PROD)", '
'"urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item,PROD)"], "start_time": "2022-09-27 04:21:59.259988", '
'"running_time_in_seconds": "3"}}, "sink": {"type": "datahub-rest", "report": {"total_records_written": "27", '
'"records_written_per_second": "4", "warnings": [], "failures": [], "start_time": "2022-09-27 04:21:56.553535", '
'"current_time": "2022-09-27 04:22:03.141100", "total_duration_in_seconds": "6.59", "gms_version": "v0.8.44", '
'"pending_requests": "0"}}}'}
Execution finished successfully!
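To make the difference between the two runs above easier to see, here is a minimal sketch that pulls the relevant counters out of each `structured_report` string and lists what changed. The helper name `summarize` is hypothetical, and the two payloads are trimmed copies of the reports pasted above (only the fields being compared are kept).

```python
import json


def summarize(structured_report: str) -> dict:
    """Extract the source-side counters relevant to the soft-delete question."""
    report = json.loads(structured_report)["source"]["report"]
    return {
        "events_produced": int(report["events_produced"]),
        "tables_scanned": int(report["tables_scanned"]),
        "views_scanned": int(report["views_scanned"]),
        "soft_deleted": list(report["soft_deleted_stale_entities"]),
    }


# Trimmed copy of the first run (exec_id 26cb9a6b-...).
first_run = json.dumps({"source": {"type": "hive", "report": {
    "events_produced": "42",
    "tables_scanned": "5",
    "views_scanned": "5",
    "soft_deleted_stale_entities": [],
}}})

# Trimmed copy of the second run (exec_id dd500399-...).
second_run = json.dumps({"source": {"type": "hive", "report": {
    "events_produced": "27",
    "tables_scanned": "5",
    "views_scanned": "0",
    "soft_deleted_stale_entities": [
        "urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_rslt,PROD)",
        "urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item_cls,PROD)",
        "urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_mst,PROD)",
        "urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item_sbc,PROD)",
        "urn:li:dataset:(urn:li:dataPlatform:hive,gsc_ods.sstp_stp_item,PROD)",
    ],
}}})

before = summarize(first_run)
after = summarize(second_run)

# Keep only the fields whose values differ between the two runs.
changed = {key: (before[key], after[key]) for key in before if before[key] != after[key]}

for key, (old, new) in changed.items():
    print(f"{key}: {old} -> {new}")
```

In both runs `tables_scanned` stays at 5, while `views_scanned` drops from 5 to 0 and five URNs appear in `soft_deleted_stale_entities`.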
gray-shoe-75895
09/28/2022, 3:14 AM'tables_scanned': '5'. I suspect there’s an issue with stateful ingestion marking the tables as soft-deleted. Could you try running ingestion with include_views: false and stateful_ingestion.ignore_old_state set to false?
fancy-alligator-33404
09/28/2022, 3:49 AM
gray-shoe-75895
09/28/2022, 4:12 AM
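For reference, a minimal sketch of the change suggested at 3:14 AM (`include_views: false` and `stateful_ingestion.ignore_old_state: false`), written as a Python dict that mirrors the recipe's YAML layout. The actual recipe was never posted in this thread, so `base_recipe` is an assumption: the `pipeline_name` and sink server are copied from the logs, `host_port` and `database` are guessed from the `platformInstanceId` shown there, and the helper `with_overrides` is hypothetical.

```python
import copy
import json


def with_overrides(recipe: dict, include_views: bool, ignore_old_state: bool) -> dict:
    """Return a copy of the recipe with the two suggested settings applied."""
    updated = copy.deepcopy(recipe)
    config = updated["source"].setdefault("config", {})
    config["include_views"] = include_views
    stateful = config.setdefault("stateful_ingestion", {})
    stateful["ignore_old_state"] = ignore_old_state
    return updated


# Assumed recipe: the real one is not shown in the thread.
base_recipe = {
    "pipeline_name": "urn:li:dataHubIngestionSource:0f7c7bfb-d1d0-4e4f-93d1-0e248952aa26",
    "source": {
        "type": "hive",
        "config": {
            "host_port": "192.168.91.140:10000",  # guessed from platformInstanceId
            "database": "gsc_ods",                # guessed from platformInstanceId
            "stateful_ingestion": {
                "enabled": True,
                "remove_stale_metadata": True,
            },
        },
    },
    "sink": {
        "type": "datahub-rest",
        "config": {"server": "http://datahub-datahub-gms:8080"},
    },
}

test_recipe = with_overrides(base_recipe, include_views=False, ignore_old_state=False)

# Print the adjusted source config so it can be copied back into recipe.yml.
print(json.dumps(test_recipe["source"]["config"], indent=2))
```

The helper works on a deep copy, so `base_recipe` is left untouched and the two versions can be compared side by side.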