bland-orange-13353
04/18/2023, 6:48 AM

breezy-kangaroo-27287
04/18/2023, 9:06 AM

agreeable-table-54007
04/18/2023, 2:02 PM

early-hydrogen-27542
04/18/2023, 3:22 PM

miniature-policeman-55414
04/18/2023, 3:35 PM
meta:
owner: "@sree"
terms_list: core_transport_gross_profit;core_adjusted_transport_gross_profit
The following is the configuration in the dbt metadata ingestion recipe:
"meta_mapping": {
"term": {
"match": ".*",
"operation": "add_term",
"config":
{
"term": "{{ $match }}"
}
},
"terms_list": {
"match": ".*",
"operation": "add_terms",
"config":
{
"seperator": ";"
}
}
},
The glossary terms are defined in the glossary YAML (sketched below) and are being ingested successfully (core_transport_gross_profit;core_adjusted_transport_gross_profit). However, I don't see both glossary terms added to my dbt model after metadata ingestion. Is this the right way to add multiple terms? If not, please suggest the right way.
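For reference, the terms are defined in our business glossary YAML roughly like this (a sketch; the node name and descriptions are placeholders):

version: 1
source: DataHub
owners:
  users:
    - sree
nodes:
  - name: Transport
    terms:
      - name: core_transport_gross_profit
        description: Core transport gross profit (placeholder description)
      - name: core_adjusted_transport_gross_profit
        description: Core adjusted transport gross profit (placeholder description)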

gray-airplane-39227
04/18/2023, 6:28 PM
For IngestionSource, currently only the property name is indexed and searchable. Is there any reason the property type is not searchable by default? Any concerns if I make a contribution to annotate type as a searchable field for IngestionSource?
record DataHubIngestionSourceInfo {
  /**
   * The display name of the ingestion source
   */
  @Searchable = {
    "fieldType": "TEXT_PARTIAL"
  }
  name: string

  /**
   * The type of the source itself, e.g. mysql, bigquery, bigquery-usage. Should match the recipe.
   */
  type: string
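Concretely, the change I have in mind just mirrors the annotation already used on name (a sketch; whether TEXT_PARTIAL or KEYWORD is the right fieldType for an enum-like value is an open question):

  /**
   * The type of the source itself, e.g. mysql, bigquery, bigquery-usage. Should match the recipe.
   */
  @Searchable = {
    "fieldType": "TEXT_PARTIAL" // assumption: KEYWORD may fit better for exact-match filtering
  }
  type: string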

early-hydrogen-27542
04/18/2023, 6:33 PM

lively-dusk-19162
04/18/2023, 8:42 PM

adamant-sugar-28445
04/19/2023, 2:08 AM
When a Spark job executes spark.sql("select * from db.tableX") and tableX already exists in DataHub on the Hive platform, DataHub lineage shows this input as an HDFS path rather than a Hive table. How can I make it present the input as a Hive table?
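For reference, the lineage agent is enabled with the standard listener settings (a sketch; the package version and server URL are placeholders):

spark.jars.packages         io.acryl:datahub-spark-lineage:<version>
spark.extraListeners        datahub.spark.DatahubSparkListener
spark.datahub.rest.server   http://datahub-gms:8080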

late-arm-1146
04/19/2023, 3:54 AM
I'm using csv-enricher with v0.8.45. I noticed that the resource description I provide in the CSV overwrites the existing description even if I don't set write_semantics to OVERRIDE. Is this expected behavior for descriptions?
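For reference, a minimal sketch of the recipe I'm running, with PATCH made explicit (the filename and delimiters are placeholders):

source:
  type: csv-enricher
  config:
    filename: ./enrichment.csv
    write_semantics: PATCH   # expecting descriptions to be merged, not overwritten
    delimiter: ","
    array_delimiter: "|"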

breezy-kangaroo-27287
04/19/2023, 8:50 AM

rapid-airport-61849
04/19/2023, 9:30 AM
04/19/2023, 9:30 AMdatahub.ingestion.run.pipeline.PipelineInitError: Failed to configure the source (mssql): No module named 'pyodbc'
Hello!!! Have you ever seen that error guys? I am using quickstart docker.late-furniture-56629
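From the docs it looks like the mssql plugin has to be installed in the environment that executes the recipe, e.g.:

pip install 'acryl-datahub[mssql]'

(pyodbc also needs system-level ODBC drivers), but I'm not sure how to apply that inside the quickstart containers.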

late-furniture-56629
04/19/2023, 9:50 AM
datahub get --urn "urn:li:dataset:(urn:li:dataPlatform:mssql,UnifiedJobs.dbo.AccountCompanyMapping,PROD)"
And I got this error:
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Has anybody had this problem, and how did you fix it?
In the end I would like to be able to dump all ingested metadata to a file as a backup 🙂
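For context, the CLI is pointed at our GMS via environment variables (a sketch; host and token are placeholders):

export DATAHUB_GMS_URL=http://localhost:8080
export DATAHUB_GMS_TOKEN=<token-if-auth-enabled>
datahub get --urn "urn:li:dataset:(urn:li:dataPlatform:mssql,UnifiedJobs.dbo.AccountCompanyMapping,PROD)"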

agreeable-table-54007
04/19/2023, 10:22 AM

brainy-oxygen-20792
04/19/2023, 10:25 AM

adamant-sugar-28445
04/19/2023, 10:34 AM

agreeable-table-54007
04/19/2023, 1:16 PM

gifted-diamond-19544
04/19/2023, 1:56 PM
We are running DataHub v0.10.2, with the actions container on version v0.0.12. When we changed the CLI version to match our server version, our Tableau ingestion, set up via the UI, started failing with the following error:
Failed to find a registered source for type tableau: 'str' object is not callable
Has anyone had the same problem?
cc @ancient-ocean-36062

strong-parrot-78481
04/19/2023, 8:57 PM

microscopic-machine-90437
04/20/2023, 9:37 AM
ERROR: The ingestion process was killed, likely because it ran out of memory. You can resolve this issue by allocating more memory to the datahub-actions container.
When I go through the values.yml file, I can see that the datahub-actions container has 512Mi of memory.
My questions: when we ingest metadata, in which container is it stored? If the data we are ingesting from Snowflake is in the GBs, how much do we have to scale up the memory of the actions container? And is there a way to find out the size of the data/metadata we are trying to ingest (from Snowflake or any other source)?
Can someone help me with this?
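In case it helps others, the override we are experimenting with in values.yml looks like this (a sketch; the acryl-datahub-actions key and the sizes are assumptions based on the standard DataHub helm chart):

acryl-datahub-actions:
  resources:
    limits:
      memory: 2Gi      # assumption: bumped from the 512Mi default mentioned above
    requests:
      cpu: 300m
      memory: 1Gi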

quiet-television-68466
04/20/2023, 11:05 AM
BashOperator(
    task_id="run_data_task",
    dag=dag,
    bash_command="echo 'hello world'",
    owner='john.claro@checkout.com',
    inlets=[
        Dataset("snowflake", "data_platform.cfirth_test.CFIRTH_TEST_UPLOAD"),
        # You can also put dataset URNs in the inlets/outlets lists.
    ],
    outlets=[Urn("urn:li:dataset:(urn:li:dataPlatform:snowflake,landing.data_platform.cfirth_tbl,PROD)")],
)
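For completeness, the entity helpers above are imported like this (assuming the datahub_provider package layout that ships with acryl-datahub[airflow]):

from datahub_provider.entities import Dataset, Urn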
Currently, we are trying to set up our custom DbtOperator to automatically parse the lineage of dbt jobs using the manifest file, and then set the results as inlets and outlets respectively.
if self.operation == 'run':
    inlets, outlets = self._get_lineage()  # works as expected and returns lists of Dataset('snowflake', '<snowflake_table_name>')
    self.add_inlets(inlets)
    self.log.info(f"Added inlets: {self.get_inlet_defs()}")
    self.add_outlets(outlets)
    self.log.info(f"Added outlets: {self.get_outlet_defs()}")
Airflow picks up the inlets and outlets correctly, as seen in the logs here:
Added inlets: [Dataset(platform='snowflake', name='<db>.<schema>.<table>', env='PROD'), ...]
But when they are emitted to DataHub (logs in 🧵), it looks like nothing is happening lineage-wise. Anyone have any ideas?
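Our airflow.cfg enables the DataHub lineage backend roughly as in the acryl docs (a sketch; the connection id datahub_rest_default is an assumption):

[lineage]
backend = datahub_provider.lineage.datahub.DatahubLineageBackend
datahub_kwargs = {
    "datahub_conn_id": "datahub_rest_default",
    "capture_ownership_info": true,
    "capture_tags_info": true,
    "graceful_exceptions": true }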

numerous-byte-87938
04/20/2023, 5:57 PM

adorable-megabyte-63781
04/21/2023, 8:48 AM
RuntimeError: Query workbooksConnection error: [{'message': "Validation error of type FieldUndefined: Field 'projectLuid' in type 'Workbook' is undefined @ 'workbooksConnection/nodes/projectLuid'", 'locations': [{'line': 9, 'column': 7, 'sourceName': None}], 'description': "Field 'projectLuid' in type 'Workbook' is undefined", 'validationErrorType': 'FieldUndefined', 'queryPath': ['workbooksConnection', 'nodes', 'projectLuid'], 'errorType': 'ValidationError', 'path': None, 'extensions': None}]
Here is my recipe for reference:
source:
  type: tableau
  config:
    connect_uri: 'tableau_url'
    ssl_verify: false
    stateful_ingestion:
      enabled: false
    site: site_name
    project_pattern:
      allow:
        - default
      ignoreCase: true
    username: '${my_username}'
    password: '${my_password}'

microscopic-machine-90437
04/21/2023, 9:15 AM

hundreds-airline-29192
04/21/2023, 9:55 AM

hundreds-airline-29192
04/21/2023, 10:10 AM