loud-account-57875
06/08/2023, 6:12 AM

adventurous-apple-52621
06/08/2023, 8:32 AM

brief-evening-58385
06/08/2023, 1:46 PM

wonderful-tomato-83083
06/08/2023, 5:09 PM
I'm trying to use extraHeaders, but I haven't found an example of what that would look like. Can anyone help?
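A minimal sketch of what that could look like, assuming this refers to the extra-headers option of the datahub-rest sink (in recipe YAML the key I've seen is extra_headers; the header names and values below are placeholders):

sink:
  type: datahub-rest
  config:
    server: http://datahub-gms:8080       # placeholder GMS endpoint
    extra_headers:                        # assumed key name; the question mentions extraHeaders
      Authorization: "Bearer <your-token>"
      X-Custom-Header: "example-value"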
adventurous-apple-52621
06/09/2023, 6:50 AM

cuddly-garden-9148
06/09/2023, 8:12 AM

billions-rose-75566
06/09/2023, 11:00 AM
pipeline_name: DatabaseNameToBeIngested
source:
  type: postgres
  config:
    host_port: postgres:5432
    database: db
    username: db
    password: password
    profiling:
      enabled: true
    stateful_ingestion:
      enabled: true
sink:
  type: "datahub-kafka"
  config:
    connection:
      bootstrap: "broker:29092"
      schema_registry_url: "http://schema-registry:8081"
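For context, a recipe like the one above is typically saved to a file and run with the DataHub CLI via datahub ingest -c recipe.yaml (the filename is a placeholder). One hedged caveat: as far as I know, stateful ingestion keeps its checkpoints in DataHub itself, so with a datahub-kafka sink the recipe may also need an explicit pointer to GMS, along these lines:

datahub_api:                        # assumption: needed when the sink is not datahub-rest
  server: http://datahub-gms:8080   # placeholder GMS endpoint

Worth double-checking against the stateful ingestion docs for your version.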
swift-agency-2567
06/09/2023, 1:29 PM
snowflake.account_usage.access_history

cuddly-garden-9148
06/09/2023, 2:13 PM

wonderful-tomato-83083
06/09/2023, 3:08 PM
ssl_verify isn't supported in the openapi recipe, is there a way around that?
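Not a confirmed answer, but one generic workaround, assuming the openapi source makes its HTTP calls through the Python requests library: point requests at your CA bundle with the REQUESTS_CA_BUNDLE environment variable before running the ingestion, e.g. export REQUESTS_CA_BUNDLE=/path/to/ca.pem. That covers custom or self-signed CAs, though it won't disable verification outright.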
limited-forest-73733
06/09/2023, 3:10 PM

dazzling-london-20492
06/10/2023, 1:57 AM

quiet-scientist-40341
06/10/2023, 8:23 AM

loud-account-57875
06/11/2023, 4:27 PM

wonderful-tomato-83083
06/12/2023, 5:06 PM

swift-painter-68980
06/12/2023, 7:40 PM

numerous-refrigerator-15664
06/13/2023, 1:45 AM
I read that file-based lineage supports the dataset entity type only. Is that still true?
I have an external mysql DB that has dataset-datajob-dataset metadata, and I'm looking for a way to ingest it into datahub.
I already checked out pipeline lineage too, but since I need to export my mysql data in the desired format anyway, the yaml-file-based approach seems more doable, so I wish I could use the datajob or dataflow entity types in file-based lineage too.
Thanks!
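For reference, a sketch of the dataset-only file-based lineage format (entity names and platforms below are made-up placeholders; as far as I know the type field here accepts only dataset, which is exactly the limitation being described):

version: 1
lineage:
  - entity:
      name: db.downstream_table      # hypothetical dataset
      type: dataset
      env: PROD
      platform: mysql
    upstream:
      - entity:
          name: db.upstream_table    # hypothetical dataset
          type: dataset
          env: PROD
          platform: mysql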
hallowed-farmer-50988
06/13/2023, 7:34 AM
Has anything changed after upgrading to 0.10.3 (from 0.10.0)? I'm ingesting dbt metadata with Athena as the target platform, and I noticed that the browse structure as well as the dataset urn now have the catalog name in them. For instance:
Browse structure:
• then: dataset/{ENV}/dbt/{platform_instance}/{database}/{table}
• now: dataset/{ENV}/dbt/{platform_instance}/{catalog_name}/{database}/{table}
urn:
• then: urn:li:dataPlatform:dbt,{platform_instance}.{database}.{table},{ENV}
• now: urn:li:dataPlatform:dbt,{platform_instance}.{catalog_name}.{database}.{table},{ENV}
I searched the code for what could have caused that change and couldn't find anything. The problem is that the entity created by the dbt ingestion for the target platform (Athena in my case) no longer matches the existing entity for that platform, since the latter doesn't have the catalog_name in its urn. Any help will be much appreciated.

cool-tiger-42613
06/13/2023, 7:45 AM

adventurous-apple-52621
06/13/2023, 11:04 AM

ripe-eye-60209
06/13/2023, 2:04 PM

orange-gpu-90973
06/13/2023, 3:18 PM

rich-restaurant-61261
06/13/2023, 5:53 PM
pip install 'acryl-datahub[superset]'
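That extra installs the Superset source; a minimal recipe sketch for it (connection values are placeholders, and defaults may vary by version):

source:
  type: superset
  config:
    connect_uri: http://localhost:8088   # placeholder Superset URL
    username: admin                      # placeholder credentials
    password: admin
sink:
  type: datahub-rest
  config:
    server: http://datahub-gms:8080      # placeholder GMS endpoint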
creamy-battery-20182
06/13/2023, 6:07 PM
2023-06-12 22:40:55,684 [qtp944427387-17466] INFO c.l.m.r.entity.AspectResource:166 - INGEST PROPOSAL proposal: {aspectName=assertionInfo, systemMetadata={lastObserved=1686609651915, runId=dbt-2023_06_12-22_40_42}, entityUrn=urn:li:assertion:d8691f1c759e159221940a3696e48cf8, entityType=assertion, aspect={contentType=application/json, value=ByteString(length=1375,bytes=7b226375...6e227d7d)}, changeType=UPSERT}
2023-06-12 22:40:55,687 [qtp944427387-17421] ERROR c.l.m.filter.RestliLoggingFilter:38 - Rest.li error:
com.linkedin.restli.server.RestLiServiceException: com.datahub.util.exception.RetryLimitReached: Failed to add after 3 retries
But these are the underlying exceptions (logs are from the GMS pod):
Caused by: io.ebean.DuplicateKeyException: Error when batch flush on sql: insert into metadata_aspect_v2 (urn, aspect, version, metadata, createdOn, createdBy, createdFor, systemmetadata) values (?,?,?,?,?,?,?,?)
Caused by: java.sql.BatchUpdateException: Duplicate entry 'urn:li:assertion:04063f0fbcbe627b390598a883fb0272-assertionInfo-' for key 'PRIMARY'
Caused by: java.sql.SQLIntegrityConstraintViolationException: Duplicate entry 'urn:li:assertion:04063f0fbcbe627b390598a883fb0272-assertionInfo-' for key 'PRIMARY'
Has anyone seen these before? What could be the underlying issue here, is there an issue with the data itself?
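Not a definitive answer, but some context that might help narrow it down: metadata_aspect_v2 has a composite primary key on (urn, aspect, version), and the "Duplicate entry ... for key 'PRIMARY'" error means two inserts raced on the same (urn, aspect, version) row. Since assertion urns are deterministic hashes, one hypothesis (an assumption, not a confirmed diagnosis) is that the run emits the same assertion definition more than once, or that two ingestion runs or retries write the same assertion concurrently; checking for duplicate assertion urns in the dbt run output would be a reasonable first step.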
purple-terabyte-64712
06/14/2023, 9:31 AM

billions-lawyer-94523
06/14/2023, 8:13 PM

limited-forest-73733
06/15/2023, 12:17 PM

wonderful-book-58712
06/16/2023, 1:38 AM

creamy-pizza-80433
06/16/2023, 8:32 AM
Some of my tables are being ingested into a prd_db Schema container inside the Database container, while other tables are ingested directly into the Database container.
Do you have any insights into why this might be happening? Could there be any issues with how I'm ingesting the data?
Thank you!