better-oyster-93449
02/23/2022, 2:52 PM
The icon URL is /assets/platforms/lookerlogo.png. However, when I deploy the frontend docker container, this URL is unreachable, so icons do not show up in my deployments. Has anyone experienced something similar? I'm new to the project, so any help is appreciated!
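(For context, a quick way to check whether the asset is actually being served, as a minimal sketch: the port and URL below assume a local port-forward of the frontend and are placeholders, not details from this thread.)
import requests

# Hypothetical frontend address; swap in the deployed frontend URL.
resp = requests.get("http://localhost:9002/assets/platforms/lookerlogo.png")
print(resp.status_code)  # 200 = asset served, 404 = missing from the image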
lively-fall-12210
02/23/2022, 2:54 PM
datahub delete --urn xyz
fails because the status aspect is unknown for domain entities.
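(One possible workaround, offered as an assumption rather than a confirmed fix: a hard delete removes the rows directly instead of writing a soft-delete status aspect, e.g. datahub delete --urn "urn:li:domain:xyz" --hard, where the urn is a placeholder.)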
nutritious-bird-77396
02/23/2022, 6:43 PM
Is there a way to show Full Titles by default in the Lineage View? https://demo.datahubproject.io/dataset/urn:li:dataset:(urn:li:dataPlatform:kafka,cdc.UserAccount_ChangeEvent,PROD)?is_lineage_mode=true
02/23/2022, 9:34 PMprovided_configs
property for? https://datahubproject.io/docs/metadata-ingestion/source_docs/kafka-connect
I am having an issue where the kafka connect ingestion also ingests the destination topics but it ingests with PROD
as env instead of the provided env
value in the recipe such as STG
just wondering if the provided_configs could be a solution here.bland-orange-95847
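(For reference, a minimal sketch of how env is normally supplied to the source config, using the same programmatic Pipeline style that appears later in this thread; the connect_uri and server values are placeholders, not taken from this report.)
from datahub.ingestion.run.pipeline import Pipeline

pipeline = Pipeline.create(
    {
        "source": {
            "type": "kafka-connect",
            "config": {
                "connect_uri": "http://localhost:8083",  # placeholder
                "env": "STG",  # expected to tag emitted entities with STG rather than PROD
            },
        },
        "sink": {"type": "datahub-rest", "config": {"server": "http://localhost:8080"}},
    }
)
pipeline.run()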
bland-orange-95847
02/24/2022, 7:17 AM
We are seeing Validation error of type FieldUndefined: Field 'displayName' in type 'CorpUserEditableProperties' is undefined @ 'searchResults/searchResults/entity/editableProperties/displayName' (code undefined)
errors. Anyone else facing this after the upgrade?
02/24/2022, 4:01 PMdocker-compose -f docker-compose.dev.yml up
view error log in thread 🙏nutritious-machine-80578
nutritious-machine-80578
02/24/2022, 7:57 PM
rich-policeman-92383
02/25/2022, 10:09 AM
freezing-nightfall-82415
02/25/2022, 10:16 AM
witty-painting-90923
02/25/2022, 4:03 PM
curl http://<datahub-gms-endpoint>/config
says statefulIngestionCapable: true, so it should be fine.
We have:
gms v0.8.26
datahub cli v0.8.26.3
Any help would be much appreciated, thank you!
from datahub.ingestion.run.pipeline import Pipeline

pipeline = Pipeline.create(
    # This configuration is analogous to a recipe configuration.
    {
        "source": {
            "type": "postgres",
            "config": {
                "env": ENV,
                "host_port": sql_host_port,
                "database": database,
                "username": sql_login,
                "password": sql_password,
                "include_views": False,
                "profiling": {
                    "enabled": True,
                },
                "stateful_ingestion": {
                    "enabled": True,
                    "remove_stale_metadata": True,
                    "state_provider": {
                        "type": "datahub",
                        "config": {"datahub_api": {"server": datahub_host}},
                    },
                },
            },
        },
        "pipeline_name": "my_postgres_pipeline_1",
        "sink": {
            "type": "datahub-rest",
            "config": {"server": datahub_host},
        },
    }
)
broad-thailand-41358
02/25/2022, 5:38 PM
We set -DsocksProxyHost=127.0.0.1 -DsocksProxyPort=8080 in VM Options under the Advanced Settings. Is there any way to mimic this in a data ingestion recipe yml file?
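(Not a documented recipe option as far as this thread goes, but one workaround sketch: the datahub-rest sink and emitter are built on the requests library, which honors the standard proxy environment variables, so a SOCKS proxy can be injected from the environment before the pipeline runs. The proxy address is a placeholder, and requests needs its socks extra, i.e. pip install "requests[socks]".)
import os

# Route the REST sink's HTTP traffic through the SOCKS proxy.
os.environ["HTTP_PROXY"] = "socks5://127.0.0.1:8080"
os.environ["HTTPS_PROXY"] = "socks5://127.0.0.1:8080"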
02/25/2022, 9:09 PM*:metadata-service:restli-servlet-impl:checkRestModel* FAILED
issue. anyone has any idea of how to fix it?numerous-application-54063
numerous-application-54063
02/28/2022, 2:03 PM
The conflict is caused by:
#17 110.4 jinja2 2.11.3 depends on MarkupSafe>=0.23
#17 110.4 apache-airflow 2.1.2 depends on markupsafe<2.0 and >=1.1.1
#17 110.4 acryl-datahub 0.8.27.1 depends on markupsafe==2.0.1
Is the markupsafe==2.0.1 requirement strictly necessary for the datahub CLI?
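(A common workaround for this class of pin conflict, noted here as a general suggestion rather than project guidance: install the CLI into its own virtual environment, e.g. python -m venv datahub-venv && datahub-venv/bin/pip install acryl-datahub, so its markupsafe pin never has to co-resolve with Airflow's constraints.)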
numerous-camera-74294
02/28/2022, 2:43 PM
com.linkedin.restli.server.RestLiServiceException [HTTP Status:400]: Invalid value type for parameter aspects
Looking deeper into it, that error is thrown when the header X-RestLi-Protocol-Version is set to 2.0.0; if I remove it or change it to e.g. 1.0.0, the request looks good.
The header is added in https://github.com/acryldata/datahub/blob/master/metadata-ingestion/src/datahub/cli/cli_utils.py#L163
Any hint why this is causing the backend to crash?
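(A minimal repro sketch of the comparison being described; the GMS address, resource path, and aspect name below are placeholders, not taken from the report.)
import requests

url = "http://localhost:8080/entitiesV2/urn%3Ali%3Adataset%3A..."  # hypothetical endpoint and urn
params = {"aspects": "status"}

# Reportedly returns the 400 above when the v2 protocol header is present:
requests.get(url, params=params, headers={"X-RestLi-Protocol-Version": "2.0.0"})
# Reportedly succeeds without the header, or with 1.0.0:
requests.get(url, params=params, headers={"X-RestLi-Protocol-Version": "1.0.0"})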
02/28/2022, 5:53 PM<http://service.beta.kubernetes.io/azure-load-balancer-internal|service.beta.kubernetes.io/azure-load-balancer-internal>: "true"
at values.yaml
in the frontend and gms charts. Addticionally, I have included the the LoadBalancerIP
for each at their correspondant service.yaml
inside templates.
Once the pipeline reach the installation task, I get this message:
wait.go:225: [debug] Service does not have load balancer ingress IP address: default/datahub-datahub-frontend
I don't know what do I need to include in the files, what am I missing here? Thank you in advance for any help! 🙂strong-architect-67189
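(For reference, a hedged sketch of where the annotation and IP can live in the chart values rather than in edited templates; the exact keys below assume the standard service block of the datahub-frontend subchart and are not confirmed from this thread.)
datahub-frontend:
  service:
    type: LoadBalancer
    loadBalancerIP: 10.0.0.10  # placeholder internal IP
    annotations:
      service.beta.kubernetes.io/azure-load-balancer-internal: "true"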
02/28/2022, 9:13 PMacryl-datahub, version 0.8.27
• Port-forwarding datahub-datahub-frontend --> 9002 and datahub-datahub-gms --> 8080
• Currently have a working ingress for the frontend using my domain which was created by following the exact instructions from the docs.
• curl '<http://datahub.xxxx.xxxx/api/gms/config>'
returns nothing at the moment
• Trying to ingest from BigQuery located in the same private project as my GKE instance
Here is the recipe I created in the UI:
source:
  type: bigquery
  config:
    project_id: xxxxxxxxxxxx
    credential:
      project_id: xxxxxxxxxx
      private_key_id: xxxxxxxxxxx
      private_key: "-----BEGIN PRIVATE KEY-----xxxxxxxxxxxxxx-----END PRIVATE KEY-----\n"
      client_email: xxxxxx@xxxxxxx.iam.gserviceaccount.com
      client_id: 'xxxxxxxxxxx'
sink:
  type: datahub-rest
  config:
    server: 'http://datahub.xxxx.xxxx/api/gms'
Keep receiving this error (did not include the entire stack trace):
'ConfigurationError: Unable to connect to http://datahub.xxxx.xxx/api/gms/config with status_code: 401. Maybe you need to set up authentication? Please check your configuration and make sure you are talking to the DataHub GMS (usually <datahub-gms-host>:8080) or Frontend GMS API (usually <frontend>:9002/api/gms).'
I've tried including an access token generated from the UI, but that still doesn't seem to give me the right authentication. Is this an authentication issue with GKE? Do I need to create an ingress for the datahub-gms service? I've tried quite a few combinations and I'm still very confused. Any help would be greatly appreciated.
Thank you in advance!!!!
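(One hedged pointer, since the 401 suggests metadata service authentication is enabled: the datahub-rest sink accepts a token field alongside server, so the UI-generated access token can be passed in the recipe itself; values below are placeholders.)
sink:
  type: datahub-rest
  config:
    server: 'http://datahub.xxxx.xxxx/api/gms'
    token: '<access token generated in the UI>'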
cool-painting-92220
02/28/2022, 10:59 PM
I was hitting Unable to emit metadata to DataHub GMS. My new focus was to update my datahub instance, so I updated acryl-datahub to 0.8.27 and then took a look at the datahub-upgrade.sh script. I attempted to run it and was met with the following message:
Starting upgrade with id NoCodeDataMigration...
Cleanup has not been requested.
Skipping Step 1/6: RemoveAspectV2TableStep...
Executing Step 2/6: GMSQualificationStep...
Completed Step 2/6: GMSQualificationStep successfully.
Executing Step 3/6: UpgradeQualificationStep...
-- V1 table does not exist
Any pointers on how I can fix this?
I know that versioning and instances can be a bit tricky to debug without the full context, so I'd be happy to hop on a quick call as well to sort out my instance if that's easier.
able-rain-74449
03/01/2022, 2:42 PM
able-rain-74449
03/01/2022, 2:47 PM
The cp-schema-registry pod is in CrashLoopBackOff:
NAME                                                        READY   STATUS             RESTARTS   AGE
datahub-elasticsearch-master-0                              1/1     Running            0          42m
datahub-elasticsearch-master-1                              1/1     Running            0          42m
datahub-elasticsearch-master-2                              0/1     Running            0          20m
datahub-prerequisites-cp-schema-registry-65d8777cc8-m88mn   1/2     CrashLoopBackOff   10         38m
datahub-prerequisites-kafka-0                               1/1     Running            0          48m
datahub-prerequisites-mysql-0                               1/1     Running            0          66m
datahub-prerequisites-neo4j-community-0                     1/1     Running            0          70m
datahub-prerequisites-zookeeper-0                           1/1     Running            0          45m
miniature-account-72792
03/01/2022, 2:53 PM
The datahub-acryl-datahub-actions component is constantly logging the following error:
%3|1646144911.225|FAIL|rdkafka#consumer-1| [thrd:ssl://testing-strimzi-cluster-kafka-bootstrap:9092/bootstrap]: ssl://testing-strimzi-cluster-kafka-bootstrap:9092/bootstrap: SSL handshake failed: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed: broker certificate could not be verified, verify that ssl.ca.location is correctly configured or root CA certificates are installed (install ca-certificates package) (after 5ms in state SSL_HANDSHAKE, 31 identical error(s) suppressed)
I currently pass the following environment variables via the deployment yaml:
• KAFKA_PROPERTIES_SSL_KEY_PASSWORD
• KAFKA_PROPERTIES_KAFKASTORE_SSL_TRUSTSTORE_PASSWORD
• KAFKA_PROPERTIES_SSL_TRUSTSTORE_PASSWORD
• KAFKA_PROPERTIES_SSL_KEYSTORE_PASSWORD
• KAFKA_PROPERTIES_KAFKASTORE_SSL_KEYSTORE_PASSWORD
• KAFKA_PROPERTIES_SSL_TRUSTSTORE_TYPE
• KAFKA_PROPERTIES_SSL_KEYSTORE_LOCATION
• KAFKA_PROPERTIES_SSL_TRUSTSTORE_LOCATION
• KAFKA_PROPERTIES_KAFKASTORE_SSL_TRUSTSTORE.LOCATION
• KAFKA_PROPERTIES_SECURITY_PROTOCOL
• KAFKA_PROPERTIES_KAFKASTORE_SECURITY_PROTOCOL
• KAFKA_PROPERTIES_SSL_PROTOCOL
• KAFKA_PROPERTIES_SSL_ENDPOINT_IDENTIFICATION.ALGORITHM
• KAFKA_PROPERTIES_SSL_CA_LOCATION (= truststore location)
plain-farmer-27314
03/01/2022, 3:37 PM
We are on 0.8.26.3 (also tried this on the latest pip version, 0.8.27.1).
We use BQ with Looker.
There were no errors associated with this view in the logs.
parse_table_names_from_sql is set to true.
I have double-checked the view definition in datahub and it matches what we have in Looker; there are very clearly 3-4 tables that were not picked up by ingestion, and one that was ingested incorrectly.
handsome-football-66174
03/01/2022, 5:22 PM
{
  search(input: {start: 0, count: 100, query: "*", type: DATASET, filters: {field: "description", value: "Project*"}}) {
    searchResults {
      entity {
        urn
        type
      }
      matchedFields {
        name
        value
      }
    }
  }
}
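(For anyone wanting to run such a query programmatically, a minimal sketch that posts it to the GraphQL endpoint; the host is a placeholder, and an Authorization header would be needed if token auth is enabled.)
import requests

query = """
{
  search(input: {start: 0, count: 100, query: "*", type: DATASET, filters: {field: "description", value: "Project*"}}) {
    searchResults { entity { urn type } matchedFields { name value } }
  }
}
"""

# Hypothetical GMS address.
resp = requests.post("http://localhost:8080/api/graphql", json={"query": query})
print(resp.json())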
some-crayon-90964
03/01/2022, 6:21 PM
The build fails at testQuick. I can build with -x testQuick, but I would like to make sure the issues go away in the long term. Please advise, thanks in advance!
red-napkin-59945
03/01/2022, 9:31 PM
red-napkin-59945
03/02/2022, 1:01 AM
java.lang.UnsupportedOperationException: Failed to find Typeref schema associated with Config-based Entity
adorable-flower-19656
03/02/2022, 5:37 AM
red-napkin-59945
03/02/2022, 5:38 AM
I am working through the datahub-graphql-core module according to the "DataHub GraphQL Core" readme page:
1. Looks like the ReadMe needs some updates? I could not easily find what the doc told me to look for, like resources/gms.graphql, DataLoaders, Mappers, and DataFetchers.
2. Looks like SearchableEntityType is deprecated. Should the new LoadableType extend SearchableEntityType? If not, what's the alternative?
3. I am a little confused about RestliEntityClient and JavaEntityClient. It looks like they both ultimately call EntityService -> EbeanAspectDao -> DB; the difference is that RestliEntityClient sends a Rest.li request and the Rest.li server calls EntityService, whereas JavaEntityClient calls EntityService directly. If I want to introduce a new entity, it looks like I do not need to change them?
4. Looks like Mappers and DataFetchers are not needed now, since batchLoad() returns GraphQL objects?
rapid-sundown-8805
03/02/2022, 1:45 PM
KafkaException: KafkaError{code=_INVALID_ARG,val=-186,str="Failed to create consumer: No provider for SASL mechanism GSSAPI: recompile librdkafka with libsasl2 or openssl support. Current build options: PLAIN SASL_SCRAM OAUTHBEARER"}
2022/03/02 13:30:11 Command exited with error: exit status 1
We don't use GSSAPI but PLAIN, so there is some setting that the container does not pick up correctly.
However, when I look at the deployment manifest for the actions container, it has these variables set:
spec:
  containers:
  - env:
    - name: GMS_HOST
      value: dfds-datahub-datahub-gms
    - name: GMS_PORT
      value: "8080"
    - name: KAFKA_BOOTSTRAP_SERVER
      value: REDACTED
    - name: SCHEMA_REGISTRY_URL
      value: http://datadelivery-schema-registry:8081
    - name: KAFKA_AUTO_OFFSET_POLICY
      value: latest
    - name: ACTION_FILE_NAME
      value: executor.yaml
    - name: KAFKA_PROPERTIES_KAFKASTORE_SECURITY_PROTOCOL
      value: SASL_SSL
    - name: KAFKA_PROPERTIES_SASL_JAAS_CONFIG
      value: org.apache.kafka.common.security.plain.PlainLoginModule required
        username="REDACTED" password="REDACTED";
    - name: KAFKA_PROPERTIES_SASL_MECHANISM
      value: PLAIN
    - name: KAFKA_PROPERTIES_SASL_PASSWORD
      value: REDACTED
    - name: KAFKA_PROPERTIES_SASL_USERNAME
      value: REDACTED
    - name: KAFKA_PROPERTIES_SECURITY_PROTOCOL
      value: SASL_SSL
    image: public.ecr.aws/datahub/acryl-datahub-actions:v0.0.1-beta.8
    imagePullPolicy: IfNotPresent
    name: acryl-datahub-actions
    ports:
    - containerPort: 9093
      name: http
      protocol: TCP
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 300m
        memory: 256Mi
    securityContext: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
So the SASL_MECHANISM should be set to PLAIN, no? It is also set to PLAIN in the global values in the helm chart; see our values file here: https://github.com/dfds-data/datahub-infrastructure/blob/master/datahub/dfdsvals.yaml
Is it a bug?
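(For isolating this, a minimal sketch using the confluent-kafka client that produces the rdkafka error above: if this explicit config connects, the KAFKA_PROPERTIES_* to client-config translation in the container is the likely culprit. Bootstrap address and credentials are placeholders.)
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "testing-strimzi-cluster-kafka-bootstrap:9092",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",  # the mechanism the container should be using, not GSSAPI
    "sasl.username": "REDACTED",
    "sasl.password": "REDACTED",
    "group.id": "debug-consumer",
})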
gentle-father-80172
03/02/2022, 2:20 PM
{
  search(input: {start: 0, count: 100, query: "*", type: DATASET, filters: {field: "Dataset.platform.name", value: "glue"}}) {
    searchResults {
      entity {
        urn
        type
      }
      matchedFields {
        name
        value
      }
    }
  }
}
salmon-area-51650
03/02/2022, 3:01 PM
The datahub user is not an admin and I cannot access the Ingestion UI, Policies, etc. How can I activate the datahub user as an administrator?