alert-fall-82501 (08/02/2022, 12:03 PM)

alert-football-80212 (08/02/2022, 1:50 PM)

big-zoo-81740 (08/02/2022, 1:56 PM)
I'm using the `base_folder` config option with /home/ubuntu/github/myreponame, but I keep getting an error saying it can't find the directory or it doesn't exist. The folder has read/write permissions, so DataHub should be able to read from it. Is there something I am obviously doing wrong? Does the repo need to be located in a specific folder in order for the `base_folder` config to be able to read from it?
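(The thread doesn't name the source, but `base_folder` is a LookML-source option. If that's the case here, a minimal recipe sketch might look like the following; the sink block is an assumption, and only the path is taken from the message above.)

```yml
source:
  type: lookml
  config:
    # Absolute path to the checked-out repo. It must be readable by the
    # user (or container) actually running the ingestion, not just by
    # the repo owner.
    base_folder: /home/ubuntu/github/myreponame
sink:
  type: datahub-rest
  config:
    server: http://localhost:8080
```

One common cause of "directory does not exist" despite correct permissions: if the CLI runs inside a Docker container, /home/ubuntu/... must be mounted into the container, otherwise the path genuinely does not exist from the process's point of view.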
green-lion-58215 (08/02/2022, 8:33 PM)

proud-accountant-49377 (08/02/2022, 3:16 PM)

gifted-knife-16120
(08/03/2022, 10:19 AM)

    datahub delete --urn "urn:li:assertion:35b8d904a367f5d02b07b20bac478408" --soft

error msg:

    OperationalError: ('Unable to emit metadata to DataHub GMS', {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException', 'stackTrace': 'com.linkedin.restli.server.RestLiServiceException [HTTP Status:500]: java.lang.RuntimeException: Unknown aspect status for entity assertion
        at com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:42)
        at com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:50)
        at com.linkedin.metadata.resources.entity.AspectResource.ingestProposal(AspectResource.java:133)
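(Not from the thread, but for reference: the stack trace above fails while writing the status aspect, which is what a soft delete does. The CLI also supports a hard delete, which removes the records outright and is irreversible; a sketch, using the same URN:)

```shell
# Hard delete (irreversible); -f skips the confirmation prompt.
datahub delete --urn "urn:li:assertion:35b8d904a367f5d02b07b20bac478408" --hard -f
```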
rich-salesmen-55640 (08/03/2022, 11:55 AM)
the `datahub-actions` image. However, unixODBC seems not to be present in that image. I saw that `datahub-gms` references MySQL's JDBC driver. Would that be the appropriate image to put a custom driver file in? Do both JDBC and ODBC work, or only the former?
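(A general observation, not an answer from the thread: `datahub-gms` is a Java service, which is why it ships a JDBC driver; Python-based ingestion running in `datahub-actions` would typically need ODBC/native drivers instead. One way to get such a driver into the image is to extend it with a small Dockerfile. This is a sketch: the base tag, package names, and driver file paths are all assumptions, and the package manager may differ depending on the base distro.)

```dockerfile
# Extend the stock actions image; pin a specific tag in practice.
FROM acryldata/datahub-actions:head

USER root
# Install unixODBC; package names assume a Debian-based image.
RUN apt-get update && \
    apt-get install -y --no-install-recommends unixodbc unixodbc-dev && \
    rm -rf /var/lib/apt/lists/*

# Drop in a custom ODBC driver and register it (paths are hypothetical).
COPY my-driver.so /opt/odbc/my-driver.so
COPY odbcinst.ini /etc/odbcinst.ini
```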
echoing-alligator-70530 (08/03/2022, 1:44 PM)

alert-football-80212 (08/03/2022, 1:58 PM)

alert-football-80212
(08/03/2022, 3:09 PM)

    source:
      type: "kafka"
      config:
        # Coordinates
        env: $ENV
        connection:
          bootstrap: $KAFKA_BOOTSTRAP_SERVER
          consumer_config:
            security.protocol: "SASL_SSL"
            sasl.mechanism: "PLAIN"
            sasl.username: "user_name"
            sasl.password: $KAFKA_PASSWORD
          schema_registry_url: $SCHEMA_REGISTRY_URL
        topic_patterns:
          allow:
            - $TOPIC_NAME
        topic_subject_map:
          topicName-value: $SCHEMA_NAME
    transformers:
      - type: "simple_add_dataset_ownership"
        config:
          owner_urns:
            - some_owner_name
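(The `$VAR` placeholders in a recipe like the one above are expanded from the environment when the recipe runs. A sketch of driving it from a shell; all values and the recipe file name are examples, not from the thread:)

```shell
# Illustrative values; recipe.yml holds the kafka source config above.
export ENV=PROD
export KAFKA_BOOTSTRAP_SERVER=broker-1:9093
export KAFKA_PASSWORD='secret'
export SCHEMA_REGISTRY_URL=http://schema-registry:8081
export TOPIC_NAME=my_topic
export SCHEMA_NAME=my_subject
datahub ingest -c recipe.yml
```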
mysterious-pager-59554 (08/03/2022, 3:27 PM)

full-toddler-4661 (08/03/2022, 5:03 PM)

colossal-sandwich-50049 (08/03/2022, 9:24 PM)
I'm looking through the `metadata-models` repo and have the following question: I notice that `upstreamLineage` is a dataset aspect, but it is not listed in the aspects list for dataset in `entity-registry.yml`; can someone advise how this is the case?
• https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/dataset/UpstreamLineage.pdl
• https://github.com/datahub-project/datahub/blob/master/metadata-models/src/main/resources/entity-registry.yml
Edit: in general, I see that the aspects under `metadata-models/src/main/pegasus/com/linkedin/dataset` correspond to the aspects listed below, but `entity-registry.yml` doesn't seem to correspond with this:
• https://datahubproject.io/docs/generated/metamodel/entities/dataset#aspects

alert-fall-82501
(08/04/2022, 6:24 AM)

alert-fall-82501 (08/04/2022, 6:26 AM)

alert-fall-82501 (08/04/2022, 6:26 AM)

alert-fall-82501
(08/04/2022, 7:25 AM)

    /usr/lib/python3/dist-packages/paramiko/transport.py:219: CryptographyDeprecationWarning: Blowfish has been deprecated
      "class": algorithms.Blowfish,
    [2022-08-03 14:24:08,362] INFO {datahub.cli.ingest_cli:170} - DataHub CLI version: 0.8.41.2
    [2022-08-03 14:24:08,410] INFO {datahub.ingestion.run.pipeline:163} - Sink configured successfully. DataHubRestEmitter: configured to talk to http://localhost:8080
    [2022-08-03 14:24:08,749] ERROR {logger:26} - Please set env variable SPARK_VERSION
    [2022-08-03 14:24:08,875] ERROR {datahub.ingestion.run.pipeline:127} - 's3'
    [2022-08-03 14:24:08,876] INFO {datahub.cli.ingest_cli:119} - Starting metadata ingestion
    [2022-08-03 14:24:08,876] INFO {datahub.cli.ingest_cli:137} - Finished metadata ingestion
    Failed to configure source (delta-lake) due to 's3'
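(A possible next step, not confirmed by the thread: the log explicitly asks for `SPARK_VERSION`, so exporting it before re-running addresses the first ERROR. The bare `'s3'` failure is only a guess here; it can indicate missing s3-related dependencies in the environment. The version number and recipe file name below are placeholders:)

```shell
# The log asks for SPARK_VERSION; set it to the installed Spark
# major.minor version (3.0 is a placeholder).
export SPARK_VERSION=3.0
# Guess: reinstall the delta-lake plugin so its dependencies are present.
pip install 'acryl-datahub[delta-lake]'
datahub ingest -c delta_lake_recipe.yml   # recipe file name is hypothetical
```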
calm-dinner-63735 (08/04/2022, 11:33 AM)

calm-dinner-63735 (08/04/2022, 12:20 PM)

full-toddler-4661 (08/03/2022, 7:46 PM)

brave-pencil-21289 (08/04/2022, 12:31 PM)

gifted-knife-16120 (08/04/2022, 3:17 PM)

little-spring-72943 (08/04/2022, 4:16 PM)

green-lion-58215 (08/04/2022, 7:57 PM)

dazzling-insurance-83303
(08/04/2022, 8:14 PM)

    source:
      type: postgres
      config:
        # Coordinates
        host_port: db_server-001:5432
        database: db001

I cannot cycle through DBs in a postgres cluster/instance by providing the following syntax, right?

    source:
      type: postgres
      config:
        # Coordinates
        host_port: db_server-001:5432
        database:
          - db001
          - db002
          - db003

Thanks!
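(A possible workaround, not from the thread: since the recipe's `database` field takes a single value, generate one recipe per database and ingest each in a loop. The database names and host are taken from the example above; the sink block is an assumption.)

```shell
# Generate one recipe per database; the actual ingest call is left
# commented out so the loop can be dry-run safely.
for db in db001 db002 db003; do
  cat > "recipe_${db}.yml" <<EOF
source:
  type: postgres
  config:
    host_port: db_server-001:5432
    database: ${db}
sink:
  type: datahub-rest
  config:
    server: http://localhost:8080
EOF
  # datahub ingest -c "recipe_${db}.yml"   # uncomment to actually run
done
echo "wrote recipes for: db001 db002 db003"
```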
rapid-fall-7147 (08/04/2022, 8:56 PM)

cool-actor-73767 (08/04/2022, 10:58 PM)

microscopic-mechanic-13766
(08/05/2022, 10:33 AM)

    source:
      type: <source>
      config:
        host_port: <host>:<port>
        database: DB1, DB2

or

    source:
      type: <source>
      config:
        host_port: <host>:<port>
        database: !DB3
few-grass-66826 (08/05/2022, 12:19 PM)

future-student-30987 (08/05/2022, 1:19 PM)