steep-midnight-37232
06/16/2022, 7:06 PM
stocky-energy-24880
06/17/2022, 10:58 AM
# see https://datahubproject.io/docs/generated/ingestion/sources/mysql for complete documentation
source:
  type: "postgres"
  config:
    host_port: localhost:54320
    database: test
    stateful_ingestion:
      enabled: True
    table_pattern:
      deny:
        - '.*company$'
pipeline_name: "my_postgres_pipeline_2"
transformers:
  - type: "simple_add_dataset_ownership"
    config:
      owner_urns:
        - "urn:li:corpGroup:mobilede@DataHub"
datahub_api:
  server: "http://localhost:8080"
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
Here we are trying to soft-delete a table using a deny pattern. With stateful ingestion, the soft-deleted table should no longer be displayed in the UI, but it is still visible there.
Looking at the debug logs, we found that when a transformer is used, the deleted entity gets upserted again.
Please see the logs below:
[2022-06-17 11:35:20,773] DEBUG {datahub.emitter.rest_emitter:229} - Attempting to emit to DataHub GMS; using curl equivalent to:
curl -X POST -H 'User-Agent: python-requests/2.27.1' -H 'Accept-Encoding: gzip, deflate' -H 'Accept: */*' -H 'Connection: keep-alive' -H 'X-RestLi-Protocol-Version: 2.0.0' -H 'Content-Type: application/json' --data '{"proposal": {"entityType": "dataset", "entityUrn": "urn:li:dataset:(urn:li:dataPlatform:postgres,test.public.company,PROD)", "changeType": "UPSERT", "aspectName": "status", "aspect": {"value": "{\"removed\": true}", "contentType": "application/json"}, "systemMetadata": {"lastObserved": 1655458520711, "runId": "postgres-2022_06_17-11_35_18"}}}' 'http://localhost:8080/aspects?action=ingestProposal'
[2022-06-17 11:35:20,794] INFO {datahub.ingestion.run.pipeline:84} - sink wrote workunit soft-delete-table-urn:li:dataset:(urn:li:dataPlatform:postgres,test.public.company,PROD)
[2022-06-17 11:35:20,795] DEBUG {datahub.emitter.rest_emitter:229} - Attempting to emit to DataHub GMS; using curl equivalent to:
curl -X POST -H 'User-Agent: python-requests/2.27.1' -H 'Accept-Encoding: gzip, deflate' -H 'Accept: */*' -H 'Connection: keep-alive' -H 'X-RestLi-Protocol-Version: 2.0.0' -H 'Content-Type: application/json' --data '{"proposal": {"entityType": "dataset", "entityUrn": "urn:li:dataset:(urn:li:dataPlatform:postgres,test.public.company,PROD)", "changeType": "UPSERT", "aspectName": "ownership", "aspect": {"value": "{\"owners\": [{\"owner\": \"urn:li:corpGroup:mobilede@DataHub\", \"type\": \"DATAOWNER\"}], \"lastModified\": {\"time\": 0, \"actor\": \"urn:li:corpuser:unknown\"}}", "contentType": "application/json"}, "systemMetadata": {"lastObserved": 1655458520711, "runId": "postgres-2022_06_17-11_35_18"}}}' 'http://localhost:8080/aspects?action=ingestProposal'
[2022-06-17 11:35:20,830] INFO {datahub.ingestion.run.pipeline:84} - sink wrote workunit txform-urn:li:dataPlatform:postgres-test.public.company-PROD-ownership
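To make the pattern in these logs easier to spot across a whole run, here is a minimal sketch that scans captured proposal payloads for aspects written to a URN after that URN received a `status` aspect with `removed: true`. The helper `find_writes_after_soft_delete` is hypothetical and not part of DataHub; it only assumes the payload shape shown in the curl commands, and the sample payloads are trimmed.

```python
import json


def find_writes_after_soft_delete(proposals):
    """Return (entityUrn, aspectName) pairs emitted after the same URN was soft-deleted.

    Hypothetical helper: it inspects proposal dicts shaped like the curl payloads
    in the debug log and does not call any DataHub API.
    """
    soft_deleted = set()
    conflicts = []
    for proposal in proposals:
        urn = proposal["entityUrn"]
        aspect_name = proposal["aspectName"]
        # The aspect value is itself a JSON-encoded string in the payload.
        value = json.loads(proposal["aspect"]["value"])
        if aspect_name == "status":
            if value.get("removed"):
                soft_deleted.add(urn)
            else:
                soft_deleted.discard(urn)
        elif urn in soft_deleted:
            conflicts.append((urn, aspect_name))
    return conflicts


URN = "urn:li:dataset:(urn:li:dataPlatform:postgres,test.public.company,PROD)"

# Trimmed versions of the two proposals from the log, in emission order.
proposals = [
    {"entityUrn": URN, "aspectName": "status", "aspect": {"value": '{"removed": true}'}},
    {"entityUrn": URN, "aspectName": "ownership", "aspect": {"value": '{"owners": []}'}},
]

conflicts = find_writes_after_soft_delete(proposals)
print(conflicts)
```

For the two proposals above, the ownership write from the transformer is flagged because it follows the soft delete of the same dataset URN.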
Is this expected? I mean, is stateful ingestion with a transformer not supported?
Or is there any configuration that makes transformers check for soft-deleted entities?
adventurous-apple-98365
06/17/2022, 10:25 PM
changeTypes for MetadataChangeProposals? Ideally I need to PATCH (to support external metadata enrichment), but per the docs only UPSERT is supported
better-orange-49102
06/20/2022, 2:31 AM
datahub telemetry command, what's the purpose of letting the CLI enable and disable it? Is it a global setting or a per-CLI-session setting?
lemon-zoo-63387
06/20/2022, 10:55 AM
numerous-diamond-76461
06/20/2022, 11:30 AM
numerous-diamond-76461
06/20/2022, 11:31 AM
numerous-diamond-76461
06/20/2022, 11:32 AM
bland-orange-13353
06/20/2022, 11:33 AM
bulky-jackal-3422
06/20/2022, 1:21 PM
File source make the most sense here?
cool-actor-73767
06/20/2022, 8:36 PM
bulky-jackal-3422
06/20/2022, 8:58 PM
SchemaMetadataClass, what exactly should I be using as a platformSchema? The documentation isn't very clear to me: https://datahubproject.io/docs/graphql/unions#platformschema
many-house-53659
06/21/2022, 4:20 AM
many-house-53659
06/21/2022, 6:00 AM
nutritious-vegetable-81282
06/21/2022, 7:20 AM
few-air-56117
06/21/2022, 8:43 AM
acoustic-quill-54426
06/21/2022, 12:54 PM
ge-temp-{uuid}. I found a related feature request, but that is about not showing the query in the original dataset. I believe this is a bug rather than a feature 😅 Do you guys want me to create an issue?
straight-refrigerator-31859
06/21/2022, 3:44 PM
high-family-71209
06/21/2022, 3:49 PM
delightful-barista-90363
06/21/2022, 5:19 PM
loud-shampoo-64092
06/21/2022, 7:59 PM
polite-application-51650
06/22/2022, 8:28 AM
modern-monitor-81461
06/22/2022, 12:22 PM
match this or that or this...):
meta_mapping:
  data_tier:
    - match: "Bronze"
      operation: "add_term"
      config:
        term: "Bronze"
    - match: "Gold"
      operation: "add_term"
      config:
        term: "Gold"
    - match: "Silver"
      operation: "add_term"
      config:
        term: "Silver"
The current implementation only performs the last match and discards the previous ones (in this example, only Silver would be considered).
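To illustrate the behaviour being asked for, here is a minimal sketch that checks every rule for a meta key instead of only the last one. It is not DataHub's implementation: `terms_for_meta_value` is a hypothetical helper, the rule dicts simply mirror the list form of the recipe above, and treating `match` as a regular expression that must match the whole value is an assumption.

```python
import re


def terms_for_meta_value(value, rules):
    """Return the term of every add_term rule whose `match` pattern matches the value.

    Hypothetical helper: `match` is assumed to be a regular expression that has
    to match the entire meta value.
    """
    terms = []
    for rule in rules:
        if rule["operation"] == "add_term" and re.fullmatch(rule["match"], value):
            terms.append(rule["config"]["term"])
    return terms


# Same rules as the data_tier mapping in the recipe, written as Python dicts.
data_tier_rules = [
    {"match": "Bronze", "operation": "add_term", "config": {"term": "Bronze"}},
    {"match": "Gold", "operation": "add_term", "config": {"term": "Gold"}},
    {"match": "Silver", "operation": "add_term", "config": {"term": "Silver"}},
]

for tier in ("Bronze", "Gold", "Silver", "Platinum"):
    print(tier, terms_for_meta_value(tier, data_tier_rules))
```

With this approach each of the three tiers resolves to its own term, and a value that matches no rule yields no term.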
2- The dbt model supports meta fields for columns (see docs), but the current code seems to only support meta information in the DBTNode (not in DBTColumn). I would like to be able to map terms to columns, not only to datasets. Was that ever considered?
sparse-barista-40860
06/22/2022, 4:18 PM
sparse-barista-40860
06/22/2022, 4:18 PM
sparse-barista-40860
06/22/2022, 4:18 PM
datahub ingest -c /root/datahub/metadata-ingestion/examples/demo_data/bigquery_covid19_to_file.dhub.yaml
sparse-barista-40860
06/22/2022, 4:30 PM
cd datahub/
pip install 'acryl-datahub[kafka]'
datahub ingest -c /root/datahub/metadata-ingestion/examples/recipes/secured_kafka.dhub.yaml
sparse-barista-40860
06/22/2022, 4:30 PM
sparse-barista-40860
06/22/2022, 4:30 PM
sparse-barista-40860
06/22/2022, 4:32 PM
cd datahub/
pip install 'acryl-datahub[nifi]'
datahub ingest -c metadata-ingestion/examples/recipes/nifi_to_datahub_rest.dhub.yaml