late-arm-1146
11/15/2022, 3:15 AM
silly-finland-62382
11/15/2022, 7:10 AM
ancient-policeman-73437
11/15/2022, 8:36 AM
swift-judge-22731
11/15/2022, 3:57 PM
sink:
  type: datahub-rest
source:
  type: mssql
  config:
    use_odbc: 'True'
    host_port: '----:1433'
    password: ----
    database: ----
    username: ----
    uri_args:
      driver: 'ODBC Driver 17 for SQL Server'
      Encrypt: yes
      TrustServerCertificate: Yes
      ssl: 'True'
And we get the following error:
RUN_INGEST - {'errors': [],
'exec_id': '75ff2262-fcbb-47c3-8a60-bcf0445697e0',
'infos': ['2022-11-15 15:37:57.369176 [exec_id=75ff2262-fcbb-47c3-8a60-bcf0445697e0] INFO: Starting execution for task with name=RUN_INGEST',
'2022-11-15 15:38:01.452989 [exec_id=75ff2262-fcbb-47c3-8a60-bcf0445697e0] INFO: stdout=venv setup time = 0\n'
'This version of datahub supports report-to functionality\n'
'datahub --debug ingest run -c /tmp/datahub/ingest/75ff2262-fcbb-47c3-8a60-bcf0445697e0/recipe.yml --report-to '
'/tmp/datahub/ingest/75ff2262-fcbb-47c3-8a60-bcf0445697e0/ingestion_report.json\n'
'[2022-11-15 15:37:58,617] DEBUG {datahub.telemetry.telemetry:210} - Sending init Telemetry\n'
'[2022-11-15 15:37:59,245] DEBUG {datahub.telemetry.telemetry:243} - Sending Telemetry\n'
'[2022-11-15 15:37:59,501] INFO {datahub.cli.ingest_cli:182} - DataHub CLI version: 0.9.0\n'
"[2022-11-15 15:37:59,504] DEBUG {datahub.cli.ingest_cli:196} - Using config: {'pipeline_name': "
"'urn:li:dataHubIngestionSource:79107ae9-94bc-4d08-82f8-7e3769edae25', 'run_id': '75ff2262-fcbb-47c3-8a60-bcf0445697e0', 'sink': {'type': "
"'datahub-rest'}, 'source': {'config': {'database': '--------', 'host_port': "
"'---------', 'password': '--------', 'uri_args': {'Encrypt': 'yes', 'TrustServerCertificate': "
"'Yes', 'driver': 'ODBC Driver 17 for SQL Server', 'ssl': 'True'}, 'use_odbc': 'True', 'username': '--------'}, 'type': 'mssql'}}\n"
'[2022-11-15 15:37:59,504] DEBUG {datahub.telemetry.telemetry:243} - Sending Telemetry\n'
'[2022-11-15 15:37:59,764] ERROR {datahub.entrypoints:165} - 1 validation error for PipelineConfig\n'
'datahub_api -> __root__\n'
' DataHubGraphConfig expected dict not NoneType (type=type_error)\n'
'[2022-11-15 15:37:59,765] DEBUG {datahub.entrypoints:198} - DataHub CLI version: 0.9.0 at '
'/tmp/datahub/ingest/venv-mssql-0.9.0/lib/python3.10/site-packages/datahub/__init__.py\n'
'[2022-11-15 15:37:59,765] DEBUG {datahub.entrypoints:201} - Python version: 3.10.7 (main, Sep 13 2022, 14:31:33) [GCC 10.2.1 '
'20210110] at /tmp/datahub/ingest/venv-mssql-0.9.0/bin/python3 on Linux-5.4.0-1091-azure-x86_64-with-glibc2.31\n'
'[2022-11-15 15:37:59,766] DEBUG {datahub.entrypoints:204} - GMS config {}\n',
"2022-11-15 15:38:01.453254 [exec_id=75ff2262-fcbb-47c3-8a60-bcf0445697e0] INFO: Failed to execute 'datahub ingest'",
'2022-11-15 15:38:01.453461 [exec_id=75ff2262-fcbb-47c3-8a60-bcf0445697e0] INFO: Caught exception EXECUTING '
'task_id=75ff2262-fcbb-47c3-8a60-bcf0445697e0, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 123, in execute_task\n'
' task_event_loop.run_until_complete(task_future)\n'
' File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete\n'
' return future.result()\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 168, in execute\n'
' raise TaskError("Failed to execute \'datahub ingest\'")\n'
"acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"]}
Execution finished with errors.
Does anyone have any tips for ingesting data from mssql? Using the CLI we get a pyodbc missing-driver error and can't figure out how to get it working that way either.
Appreciate any suggestions!
bland-orange-13353
11/15/2022, 3:57 PM
ancient-jordan-41401
11/16/2022, 8:17 AM
~~~~ Execution Summary ~~~~
RUN_INGEST - {'errors': [],
'exec_id': '889d2fc9-8e50-4605-b9b9-a5af6db7fb08',
'infos': ['2022-11-16 07:58:54.589510 [exec_id=889d2fc9-8e50-4605-b9b9-a5af6db7fb08] INFO: Starting execution for task with name=RUN_INGEST',
'2022-11-16 07:58:58.677408 [exec_id=889d2fc9-8e50-4605-b9b9-a5af6db7fb08] INFO: stdout=venv setup time = 0\n'
'This version of datahub supports report-to functionality\n'
'datahub ingest run -c /tmp/datahub/ingest/889d2fc9-8e50-4605-b9b9-a5af6db7fb08/recipe.yml --report-to '
'/tmp/datahub/ingest/889d2fc9-8e50-4605-b9b9-a5af6db7fb08/ingestion_report.json\n'
'[2022-11-16 07:58:56,721] INFO {datahub.cli.ingest_cli:182} - DataHub CLI version: 0.9.0\n'
'[2022-11-16 07:58:56,754] INFO {datahub.ingestion.run.pipeline:175} - Sink configured successfully. DataHubRestEmitter: configured '
'to talk to http://datahub-datahub-gms:8080\n'
'[2022-11-16 07:58:57,527] ERROR {datahub.entrypoints:192} - \n'
'Traceback (most recent call last):\n'
' File "/tmp/datahub/ingest/venv-mssql-0.9.0/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 196, in __init__\n'
' self.source: Source = source_class.create(\n'
' File "/tmp/datahub/ingest/venv-mssql-0.9.0/lib/python3.10/site-packages/datahub/ingestion/source/sql/mssql.py", line 177, in create\n'
' return cls(config, ctx)\n'
' File "/tmp/datahub/ingest/venv-mssql-0.9.0/lib/python3.10/site-packages/datahub/ingestion/source/sql/mssql.py", line 123, in __init__\n'
' for inspector in self.get_inspectors():\n'
' File "/tmp/datahub/ingest/venv-mssql-0.9.0/lib/python3.10/site-packages/datahub/ingestion/source/sql/mssql.py", line 213, in '
'get_inspectors\n'
' url = self.config.get_sql_alchemy_url()\n'
' File "/tmp/datahub/ingest/venv-mssql-0.9.0/lib/python3.10/site-packages/datahub/ingestion/source/sql/mssql.py", line 75, in '
'get_sql_alchemy_url\n'
' import pyodbc # noqa: F401\n'
"ModuleNotFoundError: No module named 'pyodbc'\n"
'\n'
'The above exception was the direct cause of the following exception:\n'
'\n'
'Traceback (most recent call last):\n'
' File "/tmp/datahub/ingest/venv-mssql-0.9.0/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 197, in run\n'
' pipeline = Pipeline.create(\n'
' File "/tmp/datahub/ingest/venv-mssql-0.9.0/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 317, in create\n'
' return cls(\n'
' File "/tmp/datahub/ingest/venv-mssql-0.9.0/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 202, in __init__\n'
' self._record_initialization_failure(\n'
' File "/tmp/datahub/ingest/venv-mssql-0.9.0/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 129, in '
'_record_initialization_failure\n'
' raise PipelineInitError(msg) from e\n'
'datahub.ingestion.run.pipeline.PipelineInitError: Failed to configure source (mssql)\n'
'[2022-11-16 07:58:57,528] ERROR {datahub.entrypoints:195} - Command failed: \n'
'\tFailed to configure source (mssql) due to \n'
"\t\t'No module named 'pyodbc''.\n"
'\tRun with --debug to get full stacktrace.\n'
"\te.g. 'datahub --debug ingest run -c /tmp/datahub/ingest/889d2fc9-8e50-4605-b9b9-a5af6db7fb08/recipe.yml --report-to "
"/tmp/datahub/ingest/889d2fc9-8e50-4605-b9b9-a5af6db7fb08/ingestion_report.json'\n",
"2022-11-16 07:58:58.677623 [exec_id=889d2fc9-8e50-4605-b9b9-a5af6db7fb08] INFO: Failed to execute 'datahub ingest'",
'2022-11-16 07:58:58.677807 [exec_id=889d2fc9-8e50-4605-b9b9-a5af6db7fb08] INFO: Caught exception EXECUTING '
'task_id=889d2fc9-8e50-4605-b9b9-a5af6db7fb08, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 123, in execute_task\n'
' task_event_loop.run_until_complete(task_future)\n'
' File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete\n'
' return future.result()\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 168, in execute\n'
' raise TaskError("Failed to execute \'datahub ingest\'")\n'
"acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"]}
Execution finished with errors.
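The traceback above ends in `ModuleNotFoundError: No module named 'pyodbc'`, raised from inside the `venv-mssql-0.9.0` virtualenv. That suggests the Python package itself is absent from that environment, which is a separate question from whether the system-level "ODBC Driver 17 for SQL Server" is installed. A minimal sketch for checking both from the same interpreter that runs `datahub` (the helper names `has_module` and `describe_odbc_setup` are made up for illustration):

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be imported by the current interpreter."""
    return importlib.util.find_spec(name) is not None


def describe_odbc_setup() -> str:
    """Summarize whether pyodbc and any ODBC drivers are visible."""
    if not has_module("pyodbc"):
        return "pyodbc is not installed in this environment"
    try:
        # The import can still fail if the system ODBC libraries are missing.
        import pyodbc
    except ImportError as exc:
        return f"pyodbc is installed but failed to load: {exc}"
    # pyodbc.drivers() lists the driver names registered with the ODBC manager.
    return f"pyodbc {pyodbc.version}, drivers: {pyodbc.drivers()}"


print(describe_odbc_setup())
```

If pyodbc turns out to be missing, installing it into that environment is the likely next step, for example with `pip install pyodbc`. DataHub may also ship an ODBC-flavoured extra (I believe it is `acryl-datahub[mssql-odbc]`, but treat that name as an assumption and confirm it in the docs).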
billowy-pilot-93812
11/16/2022, 10:54 AM
'[2022-11-16 10:16:23,427] ERROR {datahub.entrypoints:185} - File '
'"/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/datahub/entrypoints.py", line 164, in main\n'
' 161 def main(**kwargs):\n'
' 162 # This wrapper prevents click from suppressing errors.\n'
' 163 try:\n'
'--> 164 sys.exit(datahub(standalone_mode=False, **kwargs))\n'
' 165 except click.Abort:\n'
'\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/click/core.py", line 1130, in __call__\n'
' 1128 def __call__(self, *args: t.Any, **kwargs: t.Any) -> t.Any:\n'
' (...)\n'
'--> 1130 return self.main(*args, **kwargs)\n'
'\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/click/core.py", line 1055, in main\n'
' rv = self.invoke(ctx)\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/click/core.py", line 1657, in invoke\n'
' return _process_result(sub_ctx.command.invoke(sub_ctx))\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/click/core.py", line 1657, in invoke\n'
' return _process_result(sub_ctx.command.invoke(sub_ctx))\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/click/core.py", line 1404, in invoke\n'
' return ctx.invoke(self.callback, **ctx.params)\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/click/core.py", line 760, in invoke\n'
' return __callback(*args, **kwargs)\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func\n'
' return f(get_current_context(), *args, **kwargs)\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 347, in wrapper\n'
' 290 def wrapper(*args: Any, **kwargs: Any) -> Any:\n'
' (...)\n'
' 343 "status": "error",\n'
' 344 "error": get_full_class_name(e),\n'
' 345 },\n'
' 346 )\n'
'--> 347 raise e\n'
'\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 299, in wrapper\n'
' 290 def wrapper(*args: Any, **kwargs: Any) -> Any:\n'
' (...)\n'
' 295 telemetry_instance.ping(\n'
' 296 "function-call", {"function": function, "status": "start"}\n'
' 297 )\n'
' 298 try:\n'
'--> 299 res = func(*args, **kwargs)\n'
' 300 telemetry_instance.ping(\n'
'\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/datahub/utilities/memory_leak_detector.py", line 95, in '
'wrapper\n'
' 86 def wrapper(ctx: click.Context, *args: P.args, **kwargs: P.kwargs) -> Any:\n'
' (...)\n'
' 91 )\n'
' 92 _init_leak_detection()\n'
' 93 \n'
' 94 try:\n'
'--> 95 return func(ctx, *args, **kwargs)\n'
' 96 finally:\n'
'\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 192, in run\n'
' 103 def run(\n'
' 104 ctx: click.Context,\n'
' 105 config: str,\n'
' 106 dry_run: bool,\n'
' 107 preview: bool,\n'
' 108 strict_warnings: bool,\n'
' 109 preview_workunits: int,\n'
' 110 test_source_connection: bool,\n'
' 111 report_to: str,\n'
' 112 no_default_report: bool,\n'
' 113 no_spinner: bool,\n'
' 114 ) -> None:\n'
' (...)\n'
' 188 raw_pipeline_config,\n'
' 189 )\n'
' 190 \n'
' 191 loop = asyncio.get_event_loop()\n'
'--> 192 loop.run_until_complete(run_func_check_upgrade(pipeline))\n'
'\n'
'File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete\n'
' 610 def run_until_complete(self, future):\n'
' (...)\n'
' 642 future.remove_done_callback(_run_until_complete_cb)\n'
' 643 if not future.done():\n'
" 644 raise RuntimeError('Event loop stopped before Future completed.')\n"
' 645 \n'
'--> 646 return future.result()\n'
'\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 151, in '
'run_func_check_upgrade\n'
' 146 async def run_func_check_upgrade(pipeline: Pipeline) -> None:\n'
' 147 version_stats_future = asyncio.ensure_future(\n'
' 148 upgrade.retrieve_version_stats(pipeline.ctx.graph)\n'
' 149 )\n'
' 150 the_one_future = asyncio.ensure_future(run_pipeline_async(pipeline))\n'
'--> 151 ret = await the_one_future\n'
' 152 \n'
'\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 142, in run_pipeline_async\n'
' 140 async def run_pipeline_async(pipeline: Pipeline) -> int:\n'
' 141 loop = asyncio._get_running_loop()\n'
'--> 142 return await loop.run_in_executor(\n'
' 143 None, functools.partial(run_pipeline_to_completion, pipeline)\n'
'\n'
'File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run\n'
' 53 def run(self):\n'
' 54 if not self.future.set_running_or_notify_cancel():\n'
' 55 return\n'
' 56 \n'
' 57 try:\n'
'--> 58 result = self.fn(*self.args, **self.kwargs)\n'
' 59 except BaseException as exc:\n'
'\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 133, in '
'run_pipeline_to_completion\n'
' 117 def run_pipeline_to_completion(\n'
' 118 pipeline: Pipeline, structured_report: Optional[str] = None\n'
' 119 ) -> int:\n'
' (...)\n'
' 129 )\n'
' 130 logger.info(\n'
' 131 f"Sink ({pipeline.config.sink.type}) report:\\n{pipeline.sink.get_report().as_string()}"\n'
' 132 )\n'
'--> 133 raise e\n'
' 134 else:\n'
'\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 125, in '
'run_pipeline_to_completion\n'
' 117 def run_pipeline_to_completion(\n'
' 118 pipeline: Pipeline, structured_report: Optional[str] = None\n'
' 119 ) -> int:\n'
' (...)\n'
' 121 with click_spinner.spinner(\n'
' 122 beep=False, disable=no_spinner, force=False, stream=sys.stdout\n'
' 123 ):\n'
' 124 try:\n'
'--> 125 pipeline.run()\n'
' 126 except Exception as e:\n'
'\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 344, in run\n'
' 332 def run(self) -> None:\n'
' (...)\n'
' 340 else DeadLetterQueueCallback(\n'
' 341 self.ctx, self.config.failure_log.log_config\n'
' 342 )\n'
' 343 )\n'
'--> 344 for wu in itertools.islice(\n'
' 345 self.source.get_workunits(),\n'
'\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/datahub/ingestion/source/superset.py", line 354, in '
'get_workunits\n'
' 353 def get_workunits(self) -> Iterable[MetadataWorkUnit]:\n'
'--> 354 yield from self.emit_dashboard_mces()\n'
' 355 yield from self.emit_chart_mces()\n'
'\n'
'File "/tmp/datahub/ingest/venv-superset-0.9.2/lib/python3.10/site-packages/datahub/ingestion/source/superset.py", line 263, in '
'emit_dashboard_mces\n'
' 247 def emit_dashboard_mces(self) -> Iterable[MetadataWorkUnit]:\n'
' (...)\n'
' 259 \n'
' 260 current_dashboard_page += 1\n'
' 261 \n'
' 262 payload = dashboard_response.json()\n'
'--> 263 for dashboard_data in payload["result"]:\n'
' 264 dashboard_snapshot = self.construct_dashboard_from_api_data(\n'
'\n'
polite-ghost-91039
11/16/2022, 12:20 PM
steep-family-13549
11/16/2022, 12:52 PM
steep-family-13549
11/16/2022, 12:54 PM
microscopic-mechanic-13766
11/16/2022, 1:28 PM
transformers:
  type: pattern_add_dataset_terms
  config:
    term_patter:
      rules:
        '*metadata*':
          - 'urn:li:glossaryTerm:metadata'
source:
  type: postgres
  config:
    include_tables: true
    database: knoxdb
    password: <password>
    profiling:
      enabled: false
    host_port: 'postgresql:5432'
    include_views: true
    username: <username>
ERROR {datahub.entrypoints:182} - 1 validation error for PipelineConfig\n'
'transformers\n'
' value is not a valid list (type=type_error.list)\n'
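The validation error says `transformers` must be a list, while the recipe above gives it a single mapping. In YAML that usually means starting the entry with `- type: ...` so that it becomes the first item of a list. A minimal Python sketch of the same shape change on an already-parsed recipe (the helper `normalize_transformers` is hypothetical and not part of DataHub):

```python
from typing import Any


def normalize_transformers(recipe: dict[str, Any]) -> dict[str, Any]:
    """Return a shallow copy of the recipe whose 'transformers' value is a list."""
    fixed = dict(recipe)
    transformers = fixed.get("transformers")
    if isinstance(transformers, dict):
        # A single transformer given as a mapping becomes a one-item list.
        fixed["transformers"] = [transformers]
    return fixed


# Transformer section copied from the posted recipe as-is.
recipe = {
    "transformers": {
        "type": "pattern_add_dataset_terms",
        "config": {
            "term_patter": {
                "rules": {"*metadata*": ["urn:li:glossaryTerm:metadata"]},
            },
        },
    },
}

fixed = normalize_transformers(recipe)
print(type(fixed["transformers"]).__name__)
```

Separately, the posted recipe spells the key `term_patter`, whereas the DataHub transformer docs use `term_pattern`. The rule keys there are regular expressions, so `'*metadata*'` may need to become `'.*metadata.*'`. Both are worth checking once the list error is gone.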
colossal-smartphone-90274
11/16/2022, 4:20 PM
rich-state-73859
11/16/2022, 6:55 PM
Caused by: java.lang.ClassNotFoundException: datahub.shaded.org.apache.http.ssl.TrustStrategy
when using datahub-protobuf-0.9.2.jar, but it works with datahub-protobuf-0.8.45.jar. Is there any solution for this?
chilly-truck-63841
11/16/2022, 9:40 PM
wonderful-egg-79350
11/17/2022, 1:40 AM
thousands-branch-81757
11/17/2022, 5:18 AM
bumpy-journalist-41369
11/17/2022, 10:28 AM
best-wire-59738
11/17/2022, 1:25 PM
bland-orange-13353
11/17/2022, 1:25 PM
better-fireman-33387
11/17/2022, 1:43 PM
lively-dusk-19162
11/17/2022, 6:15 PM
aloof-art-29270
11/17/2022, 8:18 PM
billowy-pilot-93812
11/18/2022, 9:29 AM
little-spring-72943
11/18/2022, 10:47 AM
bright-receptionist-94235
11/18/2022, 1:36 PM
polite-monitor-47621
11/18/2022, 2:54 PM
lively-dusk-19162
11/18/2022, 4:25 PM
lively-dusk-19162
11/18/2022, 4:25 PM
lively-dusk-19162
11/18/2022, 4:26 PM
melodic-book-17939
11/18/2022, 9:36 PM