bumpy-pharmacist-66525
01/16/2023, 1:33 PM
It seems I am not able to add either a tag or a term at the column level using the column_meta_mapping feature; however, the meta_mapping feature, which adds tags/terms to the node itself, works fine. Not sure if this is important, but for reference purposes, I am using DBT and Iceberg together.
I actually went into the source code of the DBT ingestion source to try to find out why it wasn't working. It seems that on line 1201 of the dbt_common.py file, the columns field (node.columns) of the DBTNode is always empty (https://github.com/datahub-project/datahub/blob/ce5545ed27eeb56669d0adccc0030fad7c[…]tadata-ingestion/src/datahub/ingestion/source/dbt/dbt_common.py). It does not appear to be populated at any point, which I believe is why column-level meta mapping is not working.
Is anyone able to confirm whether the column_meta_mapping feature for the DBT ingestion source is working for them?
alert-fall-82501
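For reference, a dbt recipe fragment exercising the feature might look roughly like this. The meta key (`terms`) and the term template are made-up examples, and the match/operation/config shape should be checked against the dbt source docs rather than taken as authoritative:

```yaml
source:
  type: dbt
  config:
    # ... manifest_path, catalog_path, target_platform, etc. ...
    enable_meta_mapping: true
    # Applies operations based on keys in each column's dbt `meta` block.
    column_meta_mapping:
      terms:                     # hypothetical meta key on a column
        match: ".*"
        operation: "add_term"
        config:
          term: "{{ $match }}"   # attach the matched meta value as a glossary term
```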
01/16/2023, 4:16 PM
lemon-daybreak-58504
01/16/2023, 6:35 PM
gentle-portugal-21014
01/16/2023, 8:29 PM
polite-actor-701
01/17/2023, 1:07 AM
bland-appointment-45659
01/17/2023, 5:04 AM
miniature-branch-33689
01/17/2023, 6:55 AM
[2023-01-17 02:17:36,141] INFO {datahub_actions.cli.actions:98} - Action Pipeline with name 'ingestion_executor' is now running.
...
File "/usr/local/lib/python3.9/site-packages/avrogen/avrojson.py", line 358, in _record_from_json
raise ValueError(f'{readers_schema.fullname} contains extra fields: {input_keys}')
ValueError: com.linkedin.pegasus2avro.common.AuditStamp contains extra fields: {'message'}
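That error likely points to a schema/version mismatch between the component that wrote the event and the one reading it: the payload carries a 'message' key that the reader's AuditStamp schema does not know about. A generic sketch of the strict-parsing behaviour (a hypothetical stand-in, not the real avrogen code):

```python
# Hypothetical stand-in for strict JSON-to-record parsing: a payload may only
# contain keys that exist in the reader's schema, otherwise parsing fails.

# Fields of an older AuditStamp schema that predates the optional 'message' field.
AUDIT_STAMP_FIELDS = {"time", "actor", "impersonator"}

def record_from_json(data: dict, schema_fields: set, fullname: str) -> dict:
    extra = set(data) - schema_fields
    if extra:
        raise ValueError(f"{fullname} contains extra fields: {extra}")
    return data

# A payload produced by a newer writer carries 'message', which the older reader rejects:
try:
    record_from_json(
        {"time": 0, "actor": "urn:li:corpuser:x", "message": "hi"},
        AUDIT_STAMP_FIELDS,
        "com.linkedin.pegasus2avro.common.AuditStamp",
    )
except ValueError as err:
    print(err)  # com.linkedin.pegasus2avro.common.AuditStamp contains extra fields: {'message'}
```

The usual fix for this class of error is aligning the versions of the producing and consuming components so both use the same schema.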
acceptable-morning-73148
01/17/2023, 9:34 AM
We are getting warnings "unable to map type INTERVAL_MONTH(precision=4) to metadata schema" and "unable to map type INTERVAL_DAY(precision=4) to metadata schema". Those are valid column types in our system, but it seems they are not recognized in DataHub. Any suggestions on how to handle this?
thousands-yacht-8284
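As a general pattern (not DataHub's actual implementation; the mapping table below is a hypothetical stand-in), ingestion frameworks typically fall back to a catch-all null/unknown type for native types they cannot map, so warnings like these are usually lossy but non-fatal:

```python
# Hypothetical sketch of a native-to-metadata type mapping with a fallback
# for unmapped types such as INTERVAL_MONTH / INTERVAL_DAY.
TYPE_MAP = {
    "INTEGER": "NumberType",
    "VARCHAR": "StringType",
    "TIMESTAMP": "TimeType",
}

def map_column_type(native_type: str, default: str = "NullType") -> str:
    # Strip a precision suffix like "(precision=4)" before the lookup.
    base = native_type.split("(", 1)[0]
    return TYPE_MAP.get(base, default)

print(map_column_type("VARCHAR"))                      # StringType
print(map_column_type("INTERVAL_MONTH(precision=4)"))  # NullType (the fallback)
```

Proper support would mean adding those interval types to the connector's mapping table upstream; until then, the columns should still appear, just with an unknown/null type.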
01/17/2023, 10:20 AM
curved-planet-99787
01/17/2023, 12:58 PM
There is the max_number_of_fields_to_profile option, which allows profiling only a certain number of fields, but I want to exclude or include specific fields by their name.
limited-forest-73733
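One thing that may help here: many DataHub sources accept a profile_pattern with allow/deny regexes matched against fully qualified field names, which selects columns by name rather than by count. The source type and patterns below are made-up examples, so verify the exact option names against your connector's docs:

```yaml
source:
  type: snowflake          # hypothetical source; any profiling-capable source is analogous
  config:
    profiling:
      enabled: true
      max_number_of_fields_to_profile: 10
    # Regexes are matched against names like "db.schema.table.column".
    profile_pattern:
      allow:
        - ".*\\.orders\\.amount.*"   # hypothetical column to profile
      deny:
        - ".*\\.users\\.ssn.*"       # hypothetical column to skip
```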
01/17/2023, 2:27 PM
creamy-machine-95935
01/17/2023, 6:46 PM
late-gpu-33114
01/17/2023, 7:12 PM
lively-dusk-19162
01/17/2023, 7:15 PM
brash-helicopter-28341
01/18/2023, 7:36 AM
brash-helicopter-28341
01/18/2023, 7:38 AM
elegant-salesmen-99143
01/18/2023, 11:27 AM
flat-engineer-75197
01/18/2023, 12:21 PM
important-helmet-98156
01/18/2023, 12:58 PM
By using this software, you agree that the following text is incorporated into the terms of the Developer Agreement:
If you are an existing SAP customer for On-Premise software, your use of this current software is also covered by the terms of your software license agreement with SAP, including the Use Rights, the current version of which can be found at: <https://www.sap.com/about/agreements/product-use-and-support-terms.html?tag=agreements:product-use-support-terms/on-premise-software/software-use-rights>
For me, this would mean that I could use the connector and get the metadata into DataHub from our existing SAP HANA on-premise database. However, I am not sure, so I am asking whether someone has had a similar issue.
Thank you in advance and best regards
Martin
#sap #sap-hana
lemon-daybreak-58504
01/18/2023, 1:27 PM
magnificent-lawyer-97772
01/18/2023, 4:38 PM
alert-fall-82501
01/18/2023, 5:54 PM
alert-fall-82501
01/18/2023, 5:54 PM
bright-receptionist-94235
01/19/2023, 4:43 AM
orange-actor-62586
01/19/2023, 6:57 AM
great-computer-16446
01/19/2023, 7:54 AM
salmon-helmet-338
01/19/2023, 8:53 AM
source:
  type: s3
  config:
    path_specs:
      - include: <mys3path>
    aws_config:
      aws_region: <myregion>
      aws_access_key_id: <mykey>
      aws_secret_access_key: <mysecret>
    env: "test"
    profiling:
      enabled: false
and I got the following error when I run it from the UI:
'Collecting pyspark==3.0.3\n'
'/usr/local/bin/ingestion_common.sh: line 3: 44 Killed pip install -r $req_file\n',
"2023-01-19 08:20:42.689903 [exec_id=4ba00e6f-ecac-405a-a5cd-281dd4f1cf94] INFO: Failed to execute 'datahub ingest'",
'2023-01-19 08:20:42.692510 [exec_id=4ba00e6f-ecac-405a-a5cd-281dd4f1cf94] INFO: Caught exception EXECUTING '
'task_id=4ba00e6f-ecac-405a-a5cd-281dd4f1cf94, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 123, in execute_task\n'
' task_event_loop.run_until_complete(task_future)\n'
' File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete\n'
' return future.result()\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 168, in execute\n'
' raise TaskError("Failed to execute \'datahub ingest\'")\n'
"acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"]}
Execution finished with errors.
Would you know what the reason and solution could be? Thanks a lot
salmon-angle-92685
01/19/2023, 11:07 AM
elegant-salesmen-99143
01/19/2023, 1:42 PM
include_views option in config details for Hive. Is it not possible to display Hive views in DataHub? 🤔
brief-oyster-50637
01/19/2023, 5:26 PM