bumpy-activity-74405
05/02/2023, 9:54 AM
linkedin or under acryldata?
proud-dusk-671
05/02/2023, 11:25 AM
chilly-potato-57465
05/02/2023, 12:38 PM
wonderful-jordan-36532
05/03/2023, 9:58 AM
REST_API_AUTHORIZATION in helm charts during deployment? Maybe @gentle-hamburger-31302 can you help?
lemon-scooter-69730
05/03/2023, 10:29 AM
square-football-37770
05/03/2023, 9:44 PM
datahub-kafka-setup-job since I can see its yaml has this conf:
spec:
  containers:
  - env:
    - name: KAFKA_ZOOKEEPER_CONNECT
      value: prerequisites-zookeeper:2181
which obviously doesn't exist. It seems this is causing the job to time out. Any ideas what can be done about it?
Thanks
blue-microphone-24514
05/04/2023, 2:17 PM
datahub-gms-6698965898-bjv4x datahub-gms 2023-05-04 14:10:38,024 [qtp447981768-278] WARN c.d.a.a.AuthenticatorChain:80 - Authentication chain failed to resolve a valid authentication. Errors: [(com.datahub.authentication.authenticator.DataHubSystemAuthenticator,Failed to authenticate inbound request: Provided credentials do not match known system client id & client secret. Check your configuration values...), (com.datahub.authentication.authenticator.DataHubTokenAuthenticator,Failed to authenticate inbound request: Authorization header missing 'Bearer' prefix.)]
datahub-gms-6698965898-bjv4x datahub-gms 2023-05-04 14:10:40,028 [qtp447981768-269] WARN c.d.a.a.AuthenticatorChain:80 - Authentication chain failed to resolve a valid authentication. Errors: [(com.datahub.authentication.authenticator.DataHubSystemAuthenticator,Failed to authenticate inbound request: Provided credentials do not match known system client id & client secret. Check your configuration values...), (com.datahub.authentication.authenticator.DataHubTokenAuthenticator,Failed to authenticate inbound request: Authorization header missing 'Bearer' prefix.)]
datahub-frontend-7b758459b7-vs8sj datahub-frontend 2023-05-04 14:10:44,033 [application-akka.actor.default-dispatcher-12] ERROR auth.sso.oidc.OidcCallbackLogic - Failed to perform post authentication steps. Redirecting to error page.
datahub-frontend-7b758459b7-vs8sj datahub-frontend java.lang.RuntimeException: Failed to provision user with urn urn:li:corpuser:me@company.com.
datahub-frontend-7b758459b7-vs8sj datahub-frontend at auth.sso.oidc.OidcCallbackLogic.tryProvisionUser(OidcCallbackLogic.java:340)
datahub-frontend-7b758459b7-vs8sj datahub-frontend at auth.sso.oidc.OidcCallbackLogic.handleOidcCallback(OidcCallbackLogic.java:129)
datahub-frontend-7b758459b7-vs8sj datahub-frontend at auth.sso.oidc.OidcCallbackLogic.perform(OidcCallbackLogic.java:107)
datahub-frontend-7b758459b7-vs8sj datahub-frontend at controllers.SsoCallbackController$SsoCallbackLogic.perform(SsoCallbackController.java:89)
datahub-frontend-7b758459b7-vs8sj datahub-frontend at controllers.SsoCallbackController$SsoCallbackLogic.perform(SsoCallbackController.java:75)
datahub-frontend-7b758459b7-vs8sj datahub-frontend at org.pac4j.play.CallbackController.lambda$callback$0(CallbackController.java:54)
datahub-gms-6698965898-bjv4x datahub-gms 2023-05-04 14:10:44,031 [qtp447981768-277] WARN c.d.a.a.AuthenticatorChain:80 - Authentication chain failed to resolve a valid authentication. Errors: [(com.datahub.authentication.authenticator.DataHubSystemAuthenticator,Failed to authenticate inbound request: Provided credentials do not match known system client id & client secret. Check your configuration values...), (com.datahub.authentication.authenticator.DataHubTokenAuthenticator,Failed to authenticate inbound request: Authorization header missing 'Bearer' prefix.)]
datahub-frontend-7b758459b7-vs8sj datahub-frontend at
...
auth.sso.oidc.OidcCallbackLogic.tryProvisionUser(OidcCallbackLogic.java:321)
datahub-frontend-7b758459b7-vs8sj datahub-frontend ... 14 common frames omitted
datahub-frontend-7b758459b7-vs8sj datahub-frontend Caused by: com.linkedin.r2.RemoteInvocationException: Received error 401 from server for URI http://datahub-gms:8080/entities/urn:li:corpuser:me@company.com
datahub-frontend-7b758459b7-vs8sj datahub-frontend at com.linkedin.restli.internal.client.ExceptionUtil.exceptionForThrowable(ExceptionUtil.java:98)
datahub-frontend-7b758459b7-vs8sj datahub-frontend at com.linkedin.restli.client.RestLiCallbackAdapter.convertError(RestLiCallbackAdapter.java:66)
datahub-frontend-7b758459b7-vs8sj datahub-frontend at com.linkedin.common.callback.CallbackAdapter.onError(CallbackAdapter.java:86)
datahub-frontend-7b758459b7-vs8sj datahub-frontend at com.linkedin.r2.message.timing.TimingCallback.onError(TimingCallback.java:81)
datahub-frontend-7b758459b7-vs8sj datahub-frontend at com.linkedin.r2.transport.common.bridge.client.TransportCallbackAdapter.onResponse(TransportCallbackAdapter.java:47)
...
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
datahub-frontend-7b758459b7-vs8sj datahub-frontend at java.base/java.lang.Thread.run(Thread.java:829)
datahub-frontend-7b758459b7-vs8sj datahub-frontend Caused by: com.linkedin.r2.message.rest.RestException: Received error 401 from server for URI http://datahub-gms:8080/entities/urn:li:corpuser:me@company.com
datahub-frontend-7b758459b7-vs8sj datahub-frontend at com.linkedin.r2.transport.http.common.HttpBridge$1.onResponse(HttpBridge.java:76)
datahub-frontend-7b758459b7-vs8sj datahub-frontend ... 4 common frames omitted
quick-television-59428
05/04/2023, 9:21 PM
billions-baker-82097
05/08/2023, 7:00 AM
brief-nail-41206
05/08/2023, 2:17 PM
2023/05/08 09:30:02 Waiting for: tcp://prerequisites-mysql.datahub-temp:3306
2023/05/08 09:30:02 Waiting for: tcp://*****-kafka-bootstrap:9092
2023/05/08 09:30:02 Waiting for: http://*****@datahub-elasticsearch-*****:9200
2023/05/08 09:30:02 Waiting for: http:
2023/05/08 09:30:02 Problem with request: Get http:: http: no Host in request URL. Sleeping 1s
2023/05/08 09:30:02 Connected to tcp://*****-kafka-bootstrap:9092
2023/05/08 09:30:02 Received 200 from http://*****@datahub-elasticsearch-*****:9200
2023/05/08 09:30:02 Connected to tcp://prerequisites-mysql.datahub-temp:3306
2023/05/08 09:30:03 Problem with request: Get http:: http: no Host in request URL. Sleeping 1s
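The `http:` wait target in the log above has no host, which matches the repeated "no Host in request URL" error. A minimal sketch for spotting such empty targets in a startup log follows. The helper name `hostless_targets` and the sample string are illustrative assumptions, not part of DataHub; only the "Waiting for:" line format is taken from the log. Which setting produces the empty target is not established here.

```python
import re
from urllib.parse import urlparse

# Sample lines in the same format as the log above (hosts are masked in the source).
LOG = """\
2023/05/08 09:30:02 Waiting for: tcp://prerequisites-mysql.datahub-temp:3306
2023/05/08 09:30:02 Waiting for: tcp://*****-kafka-bootstrap:9092
2023/05/08 09:30:02 Waiting for: http:
"""

# Captures the target that follows "Waiting for: " on each line.
WAIT_RE = re.compile(r"Waiting for: (\S+)")


def hostless_targets(log_text):
    """Return every 'Waiting for' target whose URL has no host part."""
    targets = WAIT_RE.findall(log_text)
    return [t for t in targets if not urlparse(t).hostname]


if __name__ == "__main__":
    # Lists the targets that can never become reachable because no host is set.
    print(hostless_targets(LOG))
```

Any target returned by this check points at a service address that resolved to an empty value in the container's environment.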
Based on previous posts, despite using graph_service_impl: elasticsearch the gms is somehow still waiting for the neo4j service to start. Any way to disable this?
quick-megabyte-61846
05/08/2023, 5:10 PM
0.2.161
Recently we updated the application with Azure AD SSO and created a permission model based on group UUIDs from Azure, which are pulled from Azure AD when logging into DataHub. Here a problem arose: not every group is being synced to DataHub from Azure AD (only groups with a specific prefix are being pulled into DataHub from AD).
I've tried to search through the docs and check if there is any variable to specify a regex for groups, but there is nothing, or I didn't catch it.
https://github.com/datahub-project/datahub/blob/master/datahub-frontend/conf/application.conf#L156
https://datahubproject.io/docs/authentication/guides/sso/configure-oidc-react/#user--group-provisioning-jit-provisioning
https://datahubproject.io/docs/authentication/guides/sso/configure-oidc-react-azure/
Our config:
datahub-frontend:
  extraEnvs:
    - name: AUTH_OIDC_JIT_PROVISIONING_ENABLED
      value: "true"
    - name: AUTH_OIDC_EXTRACT_GROUPS_ENABLED
      value: "true"
    - name: AUTH_OIDC_GROUPS_CLAIM
      value: "groups"
    - name: AUTH_JAAS_ENABLED
      value: "true"
  oidcAuthentication:
    enabled: true
    provider: azure
    clientId: change_me
    azureTenantId: change_me
    clientSecretRef:
      secretRef: "change_me"
      secretKey: "change_me"
I know that we can accomplish this somehow using https://datahubproject.io/docs/generated/ingestion/sources/azure-ad but I wanted to ask whether there is any chance to pull all groups into DataHub with the Azure AD provider rather than using an additional recipe for this.
My idea was to look for a regex for groups and permissions in the OIDC attributes/applications to access a wider list of groups?
Or maybe there is a limitation that only a few groups are pulled while logging in, and we cannot overcome this?
creamy-machine-95935
05/08/2023, 6:55 PM
lively-rose-99563
05/09/2023, 9:38 AM
square-football-37770
05/09/2023, 2:28 PM
create-indices.sh on the elasticsearch-setup job creates a number of indexes starting with underscore _
# 1. ILM policy
create_if_not_exists "_ilm/policy/${PREFIX}datahub_usage_event_policy" policy.json
# 2. index template
create_if_not_exists "_index_template/${PREFIX}datahub_usage_event_index_template" index_template.json
# 3. although indexing request creates the data stream, it's not queryable before creation, causing GMS to throw exceptions
create_if_not_exists "_data_stream/${PREFIX}datahub_usage_event" "datahub_usage_event"
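For reference, in the three calls above the leading underscore is part of the Elasticsearch REST endpoint (`_ilm/policy`, `_index_template`, `_data_stream`), and the resource name is whatever follows the last slash. A small sketch that separates the two, assuming an empty `PREFIX` and using a hypothetical helper `split_endpoint` (neither is taken from the script). Whether the hosted instance rejects the endpoint or the name is not established in this thread.

```python
# Paths as built by the three create_if_not_exists calls above.
# PREFIX is assumed to be empty for this sketch.
PREFIX = ""
PATHS = [
    f"_ilm/policy/{PREFIX}datahub_usage_event_policy",
    f"_index_template/{PREFIX}datahub_usage_event_index_template",
    f"_data_stream/{PREFIX}datahub_usage_event",
]


def split_endpoint(path):
    """Split a request path into (API endpoint, resource name) at the last slash."""
    endpoint, _, name = path.rpartition("/")
    return endpoint, name


if __name__ == "__main__":
    for p in PATHS:
        endpoint, name = split_endpoint(p)
        # Shows whether the resource name itself begins with an underscore.
        print(endpoint, name, name.startswith("_"))
```

With a non-empty `PREFIX`, the same split shows whether the prefix itself introduces a leading underscore in the resource name.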
Turns out names are forbidden to start with _ on my ES instance (Aiven). If I change the script to use another name... what other components would I need to change? Or would I just be better off starting my own ES on GKE?
lively-addition-55180
05/09/2023, 2:28 PM
creamy-ram-28134
05/09/2023, 4:34 PM
[root@adkube06 ~]# kubectl logs -f datahub-datahub-system-update-job-w6qr7 -n gopikab
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::        (v2.1.4.RELEASE)
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
SLF4J: Failed to load class "org.slf4j.impl.StaticMDCBinder".
SLF4J: Defaulting to no-operation MDCAdapter implementation.
SLF4J: See http://www.slf4j.org/codes.html#no_static_mdc_binder for further details.
May 09, 2023 4:30:03 PM org.neo4j.driver.internal.logging.JULogger info
INFO: Direct driver instance 1495445111 created for server address localhost:7687
ERROR SpringApplication Application run failed
java.lang.IllegalStateException: Failed to execute CommandLineRunner
at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:816)
at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:797)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:324)
at org.springframework.boot.builder.SpringApplicationBuilder.run(SpringApplicationBuilder.java:139)
at com.linkedin.datahub.upgrade.UpgradeCliApplication.main(UpgradeCliApplication.java:13)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:87)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:50)
at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:51)
Caused by: java.lang.IllegalArgumentException: No upgrade with id SystemUpdate could be found. Aborting...
at com.linkedin.datahub.upgrade.impl.DefaultUpgradeManager.execute(DefaultUpgradeManager.java:32)
at com.linkedin.datahub.upgrade.UpgradeCli.run(UpgradeCli.java:44)
at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:813)
... 12 more
May 09, 2023 4:30:10 PM org.neo4j.driver.internal.logging.JULogger info
INFO: Closing driver instance 1495445111
Does anyone know how to fix this?
rich-restaurant-61261
05/10/2023, 9:34 AM
ImagePullBackOff error on elasticsearch-setup-job, prometheus-jmx-exporter, and cp-schema-registry-server, but I do see that the images exist on Docker Hub. Can anyone help me take a look? Thanks.
kubectl --kubeconfig ~/.kube/di_config describe pod datahub-elasticsearch-setup-job-ntmk5
Name: datahub-elasticsearch-setup-job-ntmk5
Namespace: feature-aoc-27123-crawler
Priority: 0
Node: didevwkrvm9/xx.2.xx.37
Start Time: Wed, 10 May 2023 15:59:13 +0800
Labels: controller-uid=94aec4cc-060c-46c7-bf92-xxxxx
job-name=datahub-elasticsearch-setup-job
Annotations: cni.projectcalico.org/containerID: 48d897d76f8487556c564f51e97236fe423d46f5a5651ff7d4873032bd39370a
cni.projectcalico.org/podIP: 10.42.11.63/32
cni.projectcalico.org/podIPs: 10.42.11.63/32
Status: Pending
IP: 10.42.11.xx
IPs:
IP: 10.42.11.xx
Controlled By: Job/datahub-elasticsearch-setup-job
Containers:
elasticsearch-setup-job:
Container ID:
Image: linkedin/datahub-elasticsearch-setup:v0.10.2
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Limits:
cpu: 500m
memory: 512Mi
Requests:
cpu: 300m
memory: 256Mi
Environment:
ELASTICSEARCH_HOST: elasticsearch-master
ELASTICSEARCH_PORT: 9200
SKIP_ELASTICSEARCH_CHECK: false
ELASTICSEARCH_INSECURE: false
ELASTICSEARCH_USE_SSL: false
DATAHUB_ANALYTICS_ENABLED: true
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xm27c (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-xm27c:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal BackOff 4m29s (x264 over 64m) kubelet Back-off pulling image "linkedin/datahub-elasticsearch-setup:v0.10.2"
fresh-toothbrush-9306
05/10/2023, 3:32 PM
miniature-room-15319
05/11/2023, 8:35 AM
adamant-rain-51672
05/11/2023, 9:21 AM
The command will provision an EKS cluster powered by 3 EC2 m3.large nodes and provision a VPC based networking layer.
https://datahubproject.io/docs/deploy/aws
Is it possible to run datahub on smaller instances?
careful-lunch-53644
05/12/2023, 3:34 AM
steep-soccer-91284
05/12/2023, 6:21 AM
great-monkey-52307
05/12/2023, 4:09 PM
future-controller-3884
05/13/2023, 11:55 AM
shy-dog-84302
05/13/2023, 7:43 PM
careful-lunch-53644
05/15/2023, 7:55 AM
steep-vr-39297
05/15/2023, 8:23 AM
enablePrometheus was set to true via Helm on the cluster, and DataHub was deployed.
Do I only need to install Grafana separately on the cluster (helm repo add grafana https://grafana.github.io/helm-charts)?
The DataHub documentation only describes the docker-compose setup. Is there any information related to Helm?
proud-dusk-671
05/15/2023, 11:45 AM
rough-summer-14442
05/15/2023, 11:59 AM
bland-orange-13353
05/15/2023, 4:28 PM