green-lion-58215
11/09/2022, 11:26 PM
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/airflow/models/dagbag.py", line 256, in process_file
m = imp.load_source(mod_name, filepath)
File "/usr/local/lib/python3.7/imp.py", line 171, in load_source
module = _load(spec)
File "<frozen importlib._bootstrap>", line 696, in _load
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/usr/local/airflow/dags/marketing/bing_ads/bing_ads_dag.py", line 66, in <module>
outlets=Dataset("delta-lake", "l1_dev.bing_ads_ads"),
File "/usr/local/lib/python3.7/site-packages/airflow/utils/decorators.py", line 98, in wrapper
result = func(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/airflow/contrib/operators/databricks_operator.py", line 448, in __init__
super(DatabricksRunNowOperator, self).__init__(**kwargs)
File "/usr/local/lib/python3.7/site-packages/airflow/utils/decorators.py", line 98, in wrapper
result = func(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/airflow/models/baseoperator.py", line 447, in __init__
self._inlets.update(inlets)
TypeError: cannot convert dictionary update sequence element #0 to a sequence
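That TypeError is an Airflow 1.10.x behaviour: BaseOperator there stores lineage as a dict and calls self._inlets.update(...), so bare Dataset objects can't be passed. The DataHub lineage backend docs for Airflow 1.10 wrap entities under a "datasets" key; a sketch against the DAG above (task_id and job_id are hypothetical):

from airflow.contrib.operators.databricks_operator import DatabricksRunNowOperator
from datahub_provider.entities import Dataset

bing_ads_task = DatabricksRunNowOperator(
    task_id="bing_ads_ads",  # hypothetical task id
    job_id=123,              # hypothetical Databricks job id
    # Airflow 1.10.x expects dict-shaped lineage, not bare Dataset objects:
    outlets={"datasets": [Dataset("delta-lake", "l1_dev.bing_ads_ads")]},
)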
brief-ability-41819
11/10/2022, 1:31 PM
source:
  type: kafka
  config:
    schema_registry_class: datahub.ingestion.source.confluent_schema_registry.ConfluentSchemaRegistry
    platform_instance: Kafka
    connection:
      bootstrap: 'bootstrap-server:9092'
but only 34 out of ~1200 topics are ingested, and those 34 also show up "empty" inside.
No errors are being thrown 🤔 Thanks in advance!
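In case it helps narrow this down: the kafka source filters topics through topic_patterns allow/deny regexes (internal ^_.* topics are denied by default), so one debugging step is to set an explicit allow-all and rerun with datahub --debug ingest -c recipe.yaml to see what the source actually enumerates. A sketch, with illustrative patterns:

source:
  type: kafka
  config:
    schema_registry_class: datahub.ingestion.source.confluent_schema_registry.ConfluentSchemaRegistry
    platform_instance: Kafka
    connection:
      bootstrap: 'bootstrap-server:9092'
    topic_patterns:
      allow:
        - '.*'    # illustrative: allow every topic while debugging
      deny:
        - '^_.*'  # mirrors the default deny for internal topics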
green-lion-58215
11/10/2022, 3:45 PM
[2022-11-10 06:01:41,020] {{taskinstance.py:1150}} ERROR - 1 validation error for DatahubLineageConfig
enabled
extra fields not permitted (type=value_error.extra)
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 990, in _run_raw_task
task_copy.post_execute(context=context, result=result)
File "/usr/local/lib/python3.7/site-packages/airflow/lineage/__init__.py", line 82, in wrapper
outlets=self.outlets, context=context)
File "/usr/local/lib/python3.7/site-packages/datahub_provider/lineage/datahub.py", line 75, in send_lineage
config = get_lineage_config()
File "/usr/local/lib/python3.7/site-packages/datahub_provider/lineage/datahub.py", line 35, in get_lineage_config
return DatahubLineageConfig.parse_obj(kwargs)
File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
File "pydantic/main.py", line 342, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for DatahubLineageConfig
enabled
extra fields not permitted (type=value_error.extra)
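The "extra fields not permitted" on enabled suggests the datahub_kwargs in airflow.cfg carry an enabled key that this version of DatahubLineageConfig doesn't accept. The backend configuration documented for the provider looks roughly like the following (the connection id is an assumption); dropping the unrecognized enabled key should get past the validation error:

[lineage]
backend = datahub_provider.lineage.datahub.DatahubLineageBackend
datahub_kwargs = {
    "datahub_conn_id": "datahub_rest_default",
    "capture_ownership_info": true,
    "capture_tags_info": true,
    "graceful_exceptions": true }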
purple-monitor-41675
11/10/2022, 6:34 PM
0.2.109
Any help on what is missing?
alert-fall-82501
11/11/2022, 5:34 AM
Can anybody suggest on the below error messages? I am ingesting the bigquery-beta source.
alert-fall-82501
11/11/2022, 5:34 AM
/tmp
[2022-11-10, 06:17:38 UTC] {{subprocess.py:74}} INFO - Running command: ['bash', '-c', 'python3 -m datahub ingest -c /usr/local/airflow/dags/dt_datahub/recipes/prod/bigquery/bmp5.yaml']
[2022-11-10, 06:17:38 UTC] {{subprocess.py:85}} INFO - Output:
[2022-11-10, 06:17:40 UTC] {{subprocess.py:89}} INFO - [2022-11-10, 06:17:40 UTC] INFO {datahub.cli.ingest_cli:179} - DataHub CLI version: 0.8.44
[2022-11-10, 06:17:40 UTC] {{subprocess.py:89}} INFO - [2022-11-10, 06:17:40 UTC] INFO {datahub.ingestion.run.pipeline:165} - Sink configured successfully. DataHubRestEmitter: configured to talk to https://datahub-gms.digitalturbine.com:8080
[2022-11-10, 06:17:44 UTC] {{subprocess.py:89}} INFO - [2022-11-10, 06:17:44 UTC] INFO {datahub.ingestion.run.pipeline:190} - Source configured successfully.
[2022-11-10, 06:17:44 UTC] {{subprocess.py:89}} INFO - [2022-11-10, 06:17:44 UTC] INFO {datahub.cli.ingest_cli:126} - Starting metadata ingestion
[2022-11-10, 06:18:08 UTC] {{subprocess.py:89}} INFO - [2022-11-10, 06:18:08 UTC] INFO {datahub.ingestion.source.bigquery_v2.lineage:145} - Populating lineage info via GCP audit logs
[2022-11-10, 06:18:08 UTC] {{subprocess.py:89}} INFO - [2022-11-10, 06:18:08 UTC] INFO {datahub.ingestion.source.bigquery_v2.lineage:208} - Start loading log entries from BigQuery start_time=2022-11-08T23:45:00Z and end_time=2022-11-10T06:32:44Z
[2022-11-10, 06:18:11 UTC] {{subprocess.py:89}} INFO - [2022-11-10, 06:18:11 UTC] INFO {datahub.ingestion.source.bigquery_v2.lineage:227} - Finished loading 0 log entries from BigQuery so far
[2022-11-10, 06:18:11 UTC] {{subprocess.py:89}} INFO - [2022-11-10, 06:18:11 UTC] INFO {datahub.ingestion.source.bigquery_v2.lineage:319} - Parsing BigQuery log entries: number of log entries successfully parsed=0
[2022-11-10, 06:18:11 UTC] {{subprocess.py:89}} INFO - [2022-11-10, 06:18:11 UTC] INFO {datahub.ingestion.source.bigquery_v2.lineage:433} - Built lineage map containing 0 entries.
[2022-11-10, 06:21:00 UTC] {{subprocess.py:93}} INFO - Command exited with return code -9
[2022-11-10, 06:21:00 UTC] {{taskinstance.py:1703}} ERROR - Task failed with exception
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1332, in _run_raw_task
self._execute_task_with_callbacks(context)
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1458, in _execute_task_with_callbacks
result = self._execute_task(context, self.task)
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1509, in _execute_task
result = execute_callable(context=context)
File "/usr/local/lib/python3.7/site-packages/airflow/operators/bash.py", line 188, in execute
f'Bash command failed. The command returned a non-zero exit code {result.exit_code}.'
airflow.exceptions.AirflowException: Bash command failed. The command returned a non-zero exit code -9.
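A return code of -9 means the ingestion process was killed with SIGKILL rather than exiting on its own, and on a worker this is most often the kernel OOM killer reclaiming memory. The kernel log on the worker usually confirms it:

# check whether the kernel OOM killer terminated the ingestion process
dmesg -T | grep -iE 'killed process|out of memory'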
full-chef-85630
11/11/2022, 6:09 AM
[2022-11-11T11:10:48.012+0800] {logging_mixin.py:117} INFO - Exception: Traceback (most recent call last):
File "/opt/miniconda3/envs/xavier/lib/python3.10/site-packages/datahub_airflow_plugin/datahub_plugin.py", line 339, in custom_on_success_callback
datahub_on_success_callback(context)
File "/opt/miniconda3/envs/xavier/lib/python3.10/site-packages/datahub_airflow_plugin/datahub_plugin.py", line 192, in datahub_on_success_callback
inlets = get_inlets_from_task(task, context)
File "/opt/miniconda3/envs/xavier/lib/python3.10/site-packages/datahub_airflow_plugin/datahub_plugin.py", line 46, in get_inlets_from_task
and isinstance(task._inlets, list)
AttributeError: '_PythonDecoratedOperator' object has no attribute '_inlets'
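The plugin assumes every operator exposes a private _inlets attribute, which the @task-decorated operator here evidently doesn't. A defensive lookup along these lines (a sketch, not the plugin's actual fix) would avoid the AttributeError:

def safe_get_inlets(task):
    # Hypothetical helper: try the private attribute, fall back to the
    # public property, and default to [] when neither exists.
    inlets = getattr(task, "_inlets", None)
    if inlets is None:
        inlets = getattr(task, "inlets", None)
    return inlets if isinstance(inlets, list) else []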
glamorous-library-1322
11/11/2022, 9:51 AM
urn:li:dataset:(urn:li:dataPlatform:s3,path/to/my/data/my_file_1.csv,PROD), and
datahub get --urn 'urn:li:dataset:(urn:li:dataPlatform:s3,path/to/my/data/my_file_1.csv,PROD)'
works well. But when I try to see the timeline with
datahub timeline -c TECHNICAL_SCHEMA --urn 'urn:li:dataset:(urn:li:dataPlatform:s3,path/to/my/data/my_file_1.csv,PROD)'
I get an error from Jetty ("status": 404). I'm guessing my URN is wrong, so what is it? How do I find out? No problem with other datasets (with only special chars '.'). Help appreciated. datahub version 0.9.1
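A guess about the 404: the timeline endpoint carries the URN in the URL path, and this URN contains '/' characters that a servlet container like Jetty treats as path separators unless they are percent-encoded; datasets whose names only contain '.' would be unaffected, which matches the behaviour described. A quick way to produce the fully encoded form for testing:

from urllib.parse import quote

urn = "urn:li:dataset:(urn:li:dataPlatform:s3,path/to/my/data/my_file_1.csv,PROD)"
# Encode everything, including '/', so the URN stays one path segment
# instead of being split by the server's routing.
print(quote(urn, safe=""))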
11/11/2022, 7:39 PM"[2022-11-11 19:33:18,895] ERROR {datahub.ingestion.run.pipeline:127} - s3 is disabled; try running: pip install 'acryl-datahub[s3]'\n"
This is using the vanilla ingestion functionality offered through the UI, so I'm unsure where to interject with a pip install
command or some other way of providing a custom container image...
Any ideas?
Thanks!
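UI-based ingestion executes inside the datahub-actions container, so the s3 extra has to be installed there rather than on the host. One option is a small derived image that the deployment is then pointed at; the base tag below is an assumption, match it to the actions image you deploy:

# Dockerfile: bake the s3 ingestion extra into the actions image
FROM acryldata/datahub-actions:head
USER root
RUN pip install 'acryl-datahub[s3]'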
alert-fall-82501
11/14/2022, 8:44 AM
Can anybody suggest on this? Not able to start DataHub using the command python3 -m datahub docker quickstart
alert-fall-82501
11/14/2022, 8:44 AM
requests.exceptions.SSLError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /datahub-project/datahub/master/docker/quickstart/docker-compose-without-neo4j.quickstart.yml (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1131)')))
[2022-11-14 14:11:44,353] ERROR {datahub.entrypoints:195} - Command failed:
HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /datahub-project/datahub/master/docker/quickstart/docker-compose-without-neo4j.quickstart.yml (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1131)'))).
Run with --debug to get full stacktrace.
e.g. 'datahub --debug docker quickstart'
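The SSLEOFError while downloading the compose file usually points at a proxy or TLS-intercepting network between the CLI and raw.githubusercontent.com. One workaround is to fetch the compose file out of band and point quickstart at the local copy:

curl -LO https://raw.githubusercontent.com/datahub-project/datahub/master/docker/quickstart/docker-compose-without-neo4j.quickstart.yml
datahub docker quickstart --quickstart-compose-file docker-compose-without-neo4j.quickstart.yml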
alert-petabyte-2924
11/14/2022, 12:27 PM
dbt version - 1.2.2
I am getting the below error when I am running my recipe:
RUN_INGEST - {'errors': [], 'exec_id': '70be018f-168c-41e6-b7a3-eb01a9468ff8'}
2022-11-14 12:20:28.468413 [exec_id=70be018f-168c-41e6-b7a3-eb01a9468ff8] INFO: Starting execution for task with name=RUN_INGEST
2022-11-14 12:20:44.744602 [exec_id=70be018f-168c-41e6-b7a3-eb01a9468ff8] INFO: stdout=venv setup time = 0
This version of datahub supports report-to functionality
datahub ingest run -c /tmp/datahub/ingest/70be018f-168c-41e6-b7a3-eb01a9468ff8/recipe.yml --report-to /tmp/datahub/ingest/70be018f-168c-41e6-b7a3-eb01a9468ff8/ingestion_report.json
[2022-11-14 12:20:31,186] INFO {datahub.cli.ingest_cli:167} - DataHub CLI version: 0.9.2
[2022-11-14 12:20:43,924] ERROR {datahub.entrypoints:206} - Command failed: Failed to set up framework context: Failed to connect to DataHub
Traceback (most recent call last):
File "/tmp/datahub/ingest/venv-dbt-0.9.2/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "/tmp/datahub/ingest/venv-dbt-0.9.2/lib/python3.10/site-packages/urllib3/util/connection.py", line 95, in create_connection
raise err
File "/tmp/datahub/ingest/venv-dbt-0.9.2/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
My recipe looks like this:
sink:
  type: datahub-rest
  config:
    server: 'http://localhost:8080'
source:
  type: dbt
  config:
    manifest_path: /Users/annujoshi/Downloads/artifacts/manifest_file.json
    test_results_path: /Users/annujoshi/Downloads/artifacts/run_results.json
    sources_path: /Users/annujoshi/Downloads/artifacts/sources_file.json
    target_platform: snowflake
    catalog_path: /Users/annujoshi/Downloads/artifacts/catalog_file.json
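"Connection refused" on localhost:8080 is the classic symptom of a recipe running inside the UI ingestion executor (which the RUN_INGEST wrapper in the log suggests): localhost then resolves inside the actions container, not on the machine running GMS. Pointing the sink at an address the executor can actually reach usually fixes it; the hostname below is an assumption about the deployment:

sink:
  type: datahub-rest
  config:
    server: 'http://datahub-gms:8080'  # hypothetical: GMS as seen from the executor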
bulky-salesclerk-62223
11/14/2022, 12:35 PM
pipeline_name: "snowflake_metadata"
source:
  type: snowflake
  config:
    ignore_start_time_lineage: true
    account_id: XXX
    warehouse: XXXX
    role: XXX
    include_table_lineage: true
    include_view_lineage: true
    profiling:
      enabled: true
    stateful_ingestion:
      fail_safe_threshold: 100.0
      enabled: true
    username: XXX
    password: '${PASSWORD}'
I have given my role: grant imported privileges on database snowflake to role XXX;
but the "Queries" tab is still greyed out. The CLI logs also show "include_usage_stats" set to True.
Any ideas why it's not working?
square-ocean-28447
11/14/2022, 1:51 PM
datahub ingest -c recipe.yaml
-> from there, after the Airflow job completes, the rendered lineage will change from dataset to table with the corresponding schema and relationship.
I was wondering if there's already a way to bundle those two separate steps programmatically?
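On bundling the two steps: the ingestion framework exposes the same recipe structure through a programmatic Pipeline API, so the datahub ingest -c recipe.yaml step can run inside the Airflow task itself. A minimal sketch, where the source block is an illustrative stand-in for whatever recipe.yaml contains:

from datahub.ingestion.run.pipeline import Pipeline

# The dict mirrors recipe.yaml; the source config here is hypothetical.
pipeline = Pipeline.create(
    {
        "source": {
            "type": "postgres",
            "config": {"host_port": "localhost:5432", "database": "mydb"},
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://localhost:8080"},
        },
    }
)
pipeline.run()
pipeline.raise_from_status()  # surface ingestion failures to the caller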
green-lion-58215
11/14/2022, 9:15 PM