many-rocket-80549
05/29/2023, 10:36 AM
~~~~ Execution Summary - RUN_INGEST ~~~~
Execution finished with errors.
{'exec_id': '06b7698c-048e-470e-bf2c-1ff4fca75bd0',
'infos': ['2023-05-29 10:18:33.415813 INFO: Starting execution for task with name=RUN_INGEST',
"2023-05-29 10:18:37.476974 INFO: Failed to execute 'datahub ingest'",
'2023-05-29 10:18:37.477118 INFO: Caught exception EXECUTING task_id=06b7698c-048e-470e-bf2c-1ff4fca75bd0, name=RUN_INGEST, '
'stacktrace=Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 122, in execute_task\n'
' task_event_loop.run_until_complete(task_future)\n'
' File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete\n'
' return future.result()\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 231, in execute\n'
' raise TaskError("Failed to execute \'datahub ingest\'")\n'
"acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"],
'errors': []}
~~~~ Ingestion Report ~~~~
{
  "cli": {
    "cli_version": "0.10.0.7",
    "cli_entry_location": "/usr/local/lib/python3.10/site-packages/datahub/__init__.py",
    "py_version": "3.10.10 (main, Mar 14 2023, 02:37:11) [GCC 10.2.1 20210110]",
    "py_exec_path": "/usr/local/bin/python",
    "os_details": "Linux-5.15.0-72-generic-x86_64-with-glibc2.31",
    "peak_memory_usage": "57.82 MB",
    "mem_info": "57.82 MB"
  },
  "source": {
    "type": "file",
    "report": {
      "events_produced": 0,
      "events_produced_per_sec": 0,
      "entities": {},
      "aspects": {},
      "warnings": {},
      "failures": {},
      "total_num_files": 0,
      "num_files_completed": 0,
      "files_completed": [],
      "percentage_completion": "0%",
      "estimated_time_to_completion_in_minutes": -1,
      "total_bytes_read_completed_files": 0,
      "total_parse_time_in_seconds": 0,
      "total_count_time_in_seconds": 0,
      "total_deserialize_time_in_seconds": 0,
      "aspect_counts": {},
      "entity_type_counts": {},
      "start_time": "2023-05-29 10:18:35.206188 (now)",
      "running_time": "0 seconds"
    }
  },
  "sink": {
    "type": "datahub-rest",
    "report": {
      "total_records_written": 0,
      "records_written_per_second": 0,
      "warnings": [],
      "failures": [],
      "start_time": "2023-05-29 10:18:35.161225 (now)",
      "current_time": "2023-05-29 10:18:35.208860 (now)",
      "total_duration_in_seconds": 0.05,
      "gms_version": "v0.10.3",
      "pending_requests": 0
    }
  }
}
~~~~ Ingestion Logs ~~~~
Obtaining venv creation lock...
Acquired venv creation lock
venv setup time = 0
This version of datahub supports report-to functionality
datahub ingest run -c /tmp/datahub/ingest/06b7698c-048e-470e-bf2c-1ff4fca75bd0/recipe.yml --report-to /tmp/datahub/ingest/06b7698c-048e-470e-bf2c-1ff4fca75bd0/ingestion_report.json
[2023-05-29 10:18:35,113] INFO {datahub.cli.ingest_cli:173} - DataHub CLI version: 0.10.0.7
No ~/.datahubenv file found, generating one for you...
[2023-05-29 10:18:35,164] INFO {datahub.ingestion.run.pipeline:184} - Sink configured successfully. DataHubRestEmitter: configured to talk to http://datahub-gms:8080
[2023-05-29 10:18:35,206] INFO {datahub.ingestion.run.pipeline:201} - Source configured successfully.
[2023-05-29 10:18:35,207] INFO {datahub.cli.ingest_cli:129} - Starting metadata ingestion
[2023-05-29 10:18:35,209] INFO {datahub.ingestion.reporting.file_reporter:52} - Wrote UNKNOWN report successfully to <_io.TextIOWrapper name='/tmp/datahub/ingest/06b7698c-048e-470e-bf2c-1ff4fca75bd0/ingestion_report.json' mode='w' encoding='UTF-8'>
[2023-05-29 10:18:35,209] INFO {datahub.cli.ingest_cli:134} - Source (file) report:
{'events_produced': 0,
'events_produced_per_sec': 0,
'entities': {},
'aspects': {},
'warnings': {},
'failures': {},
'total_num_files': 0,
'num_files_completed': 0,
'files_completed': [],
'percentage_completion': '0%',
'estimated_time_to_completion_in_minutes': -1,
'total_bytes_read_completed_files': 0,
'total_parse_time_in_seconds': 0,
'total_count_time_in_seconds': 0,
'total_deserialize_time_in_seconds': 0,
'aspect_counts': {},
'entity_type_counts': {},
'start_time': '2023-05-29 10:18:35.206188 (now)',
'running_time': '0 seconds'}
[2023-05-29 10:18:35,210] INFO {datahub.cli.ingest_cli:137} - Sink (datahub-rest) report:
{'total_records_written': 0,
'records_written_per_second': 0,
'warnings': [],
'failures': [],
'start_time': '2023-05-29 10:18:35.161225 (now)',
'current_time': '2023-05-29 10:18:35.210294 (now)',
'total_duration_in_seconds': 0.05,
'gms_version': 'v0.10.3',
'pending_requests': 0}
[2023-05-29 10:18:35,809] ERROR {datahub.entrypoints:188} - Command failed: Failed to process /home/miquelp/datahub/file_onboarding/test_containers.json
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/datahub/entrypoints.py", line 175, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 379, in wrapper
    raise e
  File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 334, in wrapper
    res = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/datahub/utilities/memory_leak_detector.py", line 95, in wrapper
    return func(ctx, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 198, in run
    loop.run_until_complete(run_func_check_upgrade(pipeline))
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 158, in run_func_check_upgrade
    ret = await the_one_future
  File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 149, in run_pipeline_async
    return await loop.run_in_executor(
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 140, in run_pipeline_to_completion
    raise e
  File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 132, in run_pipeline_to_completion
    pipeline.run()
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 339, in run
    for wu in itertools.islice(
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/file.py", line 196, in get_workunits
    for f in self.get_filenames():
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/file.py", line 193, in get_filenames
    raise Exception(f"Failed to process {self.config.path}")
Exception: Failed to process /home/miquelp/datahub/file_onboarding/test_containers.json
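
The traceback ends in `get_filenames`, and the source report shows `total_num_files: 0`, so the file source never picked up a single file. One possible explanation (an assumption, the log does not confirm it) is that the path is not visible as a file or directory to the process that actually runs `datahub ingest`. A minimal sketch for checking that from the same environment is shown here; `describe_path` is a hypothetical helper, not part of the `datahub` package:

```python
import os
from pathlib import Path


def describe_path(path_str: str) -> str:
    """Report how a local path looks to the current process.

    Hypothetical diagnostic helper; it only uses the standard library
    and does not call into datahub.
    """
    path = Path(path_str)
    if path.is_file():
        # A regular file must also be readable by the current user.
        return "file" if os.access(path, os.R_OK) else "unreadable file"
    if path.is_dir():
        return "directory"
    # Neither a file nor a directory from this process's point of view.
    return "missing"


if __name__ == "__main__":
    # Path copied from the error message above.
    target = "/home/miquelp/datahub/file_onboarding/test_containers.json"
    print(f"{target}: {describe_path(target)}")
```

Running this with the same interpreter and in the same container or host as the ingestion (here `/usr/local/bin/python`, per the report) would show whether the path resolves there at all.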
better-orange-49102
05/29/2023, 10:40 AM
many-rocket-80549
05/29/2023, 10:55 AM
source:
  type: file
  config:
    filename: /home/miquelp/datahub/file_onboarding/test_containers.json
many-rocket-80549
05/29/2023, 10:55 AM
many-rocket-80549
05/29/2023, 10:57 AM
better-orange-49102
05/29/2023, 10:58 AM
better-orange-49102
05/29/2023, 11:32 AM
many-rocket-80549
05/29/2023, 4:22 PM
better-orange-49102
05/30/2023, 2:34 AM
gray-shoe-75895
06/02/2023, 6:28 PM