lively-jackal-83760
03/02/2023, 8:44 AM
alert-football-80212
03/02/2023, 9:45 AM
best-notebook-58252
03/02/2023, 1:34 PM
aloof-holiday-45827
03/02/2023, 1:51 PM
chilly-potato-57465
03/02/2023, 2:50 PM
blue-engineer-74605
03/02/2023, 5:20 PM
brave-animal-98220
03/02/2023, 5:34 PM
Hey guys. I'm ingesting from Superset (DataHub v0.10.0), but I get the error below. Can anyone help?
stocky-portugal-689
03/02/2023, 5:38 PM
late-dawn-4912
03/02/2023, 5:47 PM
prehistoric-furniture-42991
03/02/2023, 6:15 PM
put command in CLI. I'm able to update the ownership with the example JSON from the documentation:
datahub put --urn "urn:li:dataset:(urn:li:dataPlatform:s3,test.parquet,PROD)" --aspect ownership -d editable_schema.json
I'm also trying to update the description using --aspect editableSchemaMetadata.
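For context, here is a minimal sketch of how a payload file for the editableSchemaMetadata aspect could be generated. The field names (editableSchemaFieldInfo, fieldPath, description) are recalled from the DataHub metadata model and should be checked against the deployed version. The column name payment_id, the description text and the output file name are hypothetical, and other fields of the aspect are omitted.

```python
import json
import tempfile
from pathlib import Path


def build_editable_schema_metadata(field_descriptions):
    """Build an editableSchemaMetadata-style payload from {fieldPath: description}.

    Assumption: the aspect carries a list named "editableSchemaFieldInfo",
    one entry per column, each with "fieldPath" and "description".
    """
    return {
        "editableSchemaFieldInfo": [
            {"fieldPath": path, "description": text}
            for path, text in field_descriptions.items()
        ]
    }


# Hypothetical column and description, for illustration only.
payload = build_editable_schema_metadata(
    {"payment_id": "Unique identifier of the payment"}
)

# Write the payload to a file that could be passed to the CLI with -d.
out_path = Path(tempfile.gettempdir()) / "editable_schema_metadata.json"
out_path.write_text(json.dumps(payload, indent=2))
print(out_path.read_text())
```

The generated file would then take the place of the file passed with -d in the command above, together with --aspect editableSchemaMetadata.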
Now I need an example JSON file for this. I tried several different JSON files, but none of them were accepted.
salmon-vr-6357
03/02/2023, 6:47 PM
cuddly-dinner-641
03/02/2023, 7:31 PM
gifted-bear-4760
03/03/2023, 11:04 AM
rich-policeman-92383
03/03/2023, 11:07 AM
elegant-salesmen-99143
03/03/2023, 12:38 PM
rich-policeman-92383
03/03/2023, 1:43 PM
most-animal-32096
03/03/2023, 2:59 PM
datahub:metadata-integration:java:datahub-client
the very basic example using RestEmitter, as described in the documentation, fails with a ClassNotFoundException: org.apache.http.ssl.TrustStrategy error?
lively-jackal-83760
03/03/2023, 3:26 PM
witty-monitor-636
03/03/2023, 6:02 PM
bitter-midnight-96257
03/03/2023, 8:12 PM
options:
  tls: true
  tlsCRLFile: /home/ec2-user/crl.pem
But it didn't recognize the path.
Thank you in advance, and a great weekend to all 🙂
able-evening-90828
03/03/2023, 8:50 PM
bigquery ingestion. I want to ingest data from Google's public bigquery-public-data project.
I was only able to get it working using the project_id: bigquery-public-data setting in the recipe, although the doc says it is deprecated.
I tried to use project_id_pattern as follows, but it wasn't able to pick up any datasets.
project_id_pattern:
  allow:
    - '.*bigquery-public-data.*'
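As a quick sanity check of the pattern itself, the sketch below tests the allow regex against the project ID. It assumes allow patterns are evaluated with semantics similar to Python's re.match, which is an assumption about the ingestion source and not something confirmed in this thread. The helper is_allowed and the second project ID are hypothetical.

```python
import re


def is_allowed(project_id, allow_patterns):
    """Return True if any allow pattern matches the project ID.

    Assumption: patterns are anchored at the start of the string,
    as with re.match.
    """
    return any(re.match(pattern, project_id) for pattern in allow_patterns)


allow = [".*bigquery-public-data.*"]

print(is_allowed("bigquery-public-data", allow))  # True
print(is_allowed("my-own-project", allow))        # False
```

Under that assumption the regex does match the project ID, so the pattern alone would not explain why no datasets were picked up.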
How do you read data from a project that is different from the service account's project?
few-branch-52297
03/05/2023, 4:19 AM
few-branch-52297
03/05/2023, 4:40 AM
datahub-datahub-system-update-job-nqrrl 0/1 CreateContainerConfigError
W0305 04:38:31.609377 15424 warnings.go:70] spec.template.spec.containers[0].env[31].name: duplicate name "DATAHUB_UPGRADE_HISTORY_TOPIC_NAME"
W0305 04:38:31.609414 15424 warnings.go:70] spec.template.spec.containers[0].env[33].name: duplicate name "ENTITY_REGISTRY_CONFIG_PATH"
W0305 04:38:31.609423 15424 warnings.go:70] spec.template.spec.containers[0].env[34].name: duplicate name "KAFKA_BOOTSTRAP_SERVER"
W0305 04:38:31.609430 15424 warnings.go:70] spec.template.spec.containers[0].env[35].name: duplicate name "KAFKA_SCHEMAREGISTRY_URL"
W0305 04:38:31.609438 15424 warnings.go:70] spec.template.spec.containers[0].env[39].name: duplicate name "ELASTICSEARCH_HOST"
W0305 04:38:31.609445 15424 warnings.go:70] spec.template.spec.containers[0].env[40].name: duplicate name "ELASTICSEARCH_PORT"
W0305 04:38:31.609453 15424 warnings.go:70] spec.template.spec.containers[0].env[41].name: duplicate name "SKIP_ELASTICSEARCH_CHECK"
W0305 04:38:31.609460 15424 warnings.go:70] spec.template.spec.containers[0].env[42].name: duplicate name "ELASTICSEARCH_USE_SSL"
W0305 04:38:31.609467 15424 warnings.go:70] spec.template.spec.containers[0].env[43].name: duplicate name "ELASTICSEARCH_USERNAME"
W0305 04:38:31.609474 15424 warnings.go:70] spec.template.spec.containers[0].env[44].name: duplicate name "ELASTICSEARCH_PASSWORD"
W0305 04:38:31.609496 15424 warnings.go:70] spec.template.spec.containers[0].env[48].name: duplicate name "GRAPH_SERVICE_IMPL"
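The duplicate-name warnings above can be reproduced offline on a rendered container spec. This is a minimal sketch that assumes the env section has already been loaded into a list of name/value dictionaries; the helper duplicate_env_names and all values shown are hypothetical.

```python
from collections import Counter


def duplicate_env_names(env):
    """Return the sorted env var names that appear more than once."""
    counts = Counter(entry["name"] for entry in env)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical excerpt of spec.template.spec.containers[0].env
env = [
    {"name": "KAFKA_BOOTSTRAP_SERVER", "value": "broker-a:9092"},
    {"name": "ELASTICSEARCH_HOST", "value": "es-a"},
    {"name": "KAFKA_BOOTSTRAP_SERVER", "value": "broker-b:9092"},
    {"name": "ELASTICSEARCH_HOST", "value": "es-b"},
    {"name": "GRAPH_SERVICE_IMPL", "value": "elasticsearch"},
]

print(duplicate_env_names(env))  # ['ELASTICSEARCH_HOST', 'KAFKA_BOOTSTRAP_SERVER']
```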
alert-football-80212
03/05/2023, 9:59 AM
bitter-evening-61050
03/06/2023, 7:51 AM
best-wire-59738
03/06/2023, 8:21 AM
KAFKA_LISTENER_CONCURRENCY to 10 to support parallel processing of the Kafka offsets. At this point our UI is also frozen, because the consumer is stuck in a rebalancing loop: it is not consuming offsets, and the offset lag keeps increasing as ingestion pulls in more info.
While debugging, we found that DataHub uses the MetadataChangeLog_Versioned_v1 topic for all changes made to the metadata graph through the UI and also when the Kafka sink is used for ingestion. As a result, our UI stays frozen until the consumer (generic-mae-consumer-job-client) has read all the partitions of the topic, because a change made in the UI is also waiting somewhere in the queue in that Kafka topic.
1. Can we use a separate topic for the changes made through the UI, so that the UI no longer freezes?
2. How can we get out of the consumer group rebalancing issue and speed up our ingestion? As Kafka is asynchronous, the MCE consumers are slow in reading the offsets. We have yet to create standalone MCE and MAE consumers. We hope that will increase ingestion speed, but we still have to find a solution for the rebalancing issue.
Pulling up this thread, which has the logs for the Kafka client rebalancing issue: https://datahubspace.slack.com/archives/CV2UVAPPG/p1677676301370689
calm-dinner-63735
03/06/2023, 10:25 AM
microscopic-machine-90437
03/06/2023, 10:52 AM
best-notebook-58252
03/06/2023, 11:27 AM
…
view_name: payment_events {
  fields: [
    payment_events.payment_id,
    …
  ]
}
…
This does not seem to be supported, because it causes an error and the explore is skipped:
Traceback (most recent call last):
  File "…/datahub/ingestion/source/looker/lookml_source.py", line 1634, in get_internal_workunits
    explore: LookerExplore = LookerExplore.from_dict(
  File "…/datahub/ingestion/source/looker/looker_common.py", line 550, in from_dict
    view_names.add(dict.get("view_name") or dict.get("from") or dict["name"])
TypeError: unhashable type: 'dict'
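The failure mode in the traceback can be reproduced in isolation. The sketch below assumes the LookML parser turns the view_name block into a nested dictionary with a "name" key; that shape, the explore name payments, and the helper view_name_of are assumptions for illustration, not the DataHub implementation.

```python
def add_view_name(parsed_explore):
    """Mirror the expression from the traceback: add the value to a set."""
    view_names = set()
    view_names.add(
        parsed_explore.get("view_name")
        or parsed_explore.get("from")
        or parsed_explore["name"]
    )
    return view_names


def view_name_of(parsed_explore):
    """One possible defensive variant: unwrap a nested block to its name."""
    raw = (
        parsed_explore.get("view_name")
        or parsed_explore.get("from")
        or parsed_explore["name"]
    )
    if isinstance(raw, dict):
        raw = raw["name"]  # assumed key holding the view's name
    return raw


# Assumed parse result for: view_name: payment_events { fields: [...] }
parsed = {
    "name": "payments",
    "view_name": {
        "name": "payment_events",
        "fields": ["payment_events.payment_id"],
    },
}

try:
    add_view_name(parsed)
    error_message = None
except TypeError as exc:
    # A dict cannot be a set element, hence the TypeError in the traceback.
    error_message = str(exc)
    print(error_message)

print(view_name_of(parsed))  # payment_events
```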
Should I open a bug/feature request?
elegant-salesmen-99143
03/06/2023, 11:45 AM