# all-things-deployment
  • creamy-wall-36971

    08/30/2023, 6:49 AM
    Hi team, please tell me how to delete all (or one) glossary terms on the Glossary tab with Python or GraphQL (code example). I ran the code below and got "{'batchRemoveTerms': True}" back, but the terms were not actually deleted from the Glossary tab. I need your help.
    Copy code
    from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
    import time
    
    gms_endpoint = "#######"
    graph = DataHubGraph(DatahubClientConfig(server=gms_endpoint))
    
    start = time.time()
    
    # df is a pandas DataFrame defined elsewhere; the '공통표준용어명' column
    # ("common standard term name") holds the glossary term names.
    for i in range(len(df)):
        term_urn = "urn:li:glossaryTerm:" + df['공통표준용어명'][i]
        # NOTE: batchRemoveTerms only detaches the terms from the given
        # resources (an empty list here); it does not delete the glossary
        # term entities, which is why the Glossary tab does not change.
        mutation = f"""
        mutation batchRemoveTerms {{
            batchRemoveTerms(
              input: {{
                termUrns: ["{term_urn}"],
                resources: []
              }}
            )
        }}
        """
        result = graph.execute_graphql(query=mutation)
        print(result)
    
    end = time.time()
    print(f"{end - start: .2f} sec")
  • freezing-controller-51212

    08/30/2023, 7:33 PM
    Hello, I have been tasked with debugging error messages appearing on our DataHub deployment on Kubernetes using the helm chart provided by acryldata. We bumped the chart version from
    0.2.181
    to
    0.2.182
    We are ingesting metadata from Databricks, SageMaker, and dbt. When displaying the logs of these 3 jobs, I get recurring similar errors: • datahub-gms returns status 500 (see picture) • Error registering Avro Schema. Even after trying the solution proposed in this issue, the problem keeps recurring. Thank you for the help!
  • tall-gigabyte-99212

    08/31/2023, 3:50 AM
    Hello, I have a few questions... 1. DataHub has generic models for datasets, and it doesn't have Hive/Redshift/BigQuery-specific ones. Does this mean that datasets from platforms like MySQL, Hive, and others have exactly the same aspects? Are their entity models completely identical? 2. Does the glossary support bulk import? Or is there an API for bulk import, as I have many terms in my Apache Atlas that need to be migrated to DataHub? 🤔
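    On question 2, one option is to create the terms programmatically. A minimal sketch using the Python SDK's REST emitter and the GlossaryTermInfo aspect (the GMS address and the term list exported from Atlas are hypothetical):
    Copy code
    from datahub.emitter.mce_builder import make_term_urn
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import GlossaryTermInfoClass
    
    emitter = DatahubRestEmitter(gms_server="http://localhost:8080")  # hypothetical GMS address
    
    # Hypothetical (name, definition) pairs, e.g. exported from Apache Atlas.
    terms = [
        ("RateOfReturn", "A performance measure over a period of time."),
        ("Churn", "The rate at which customers stop doing business."),
    ]
    
    # Emit one glossaryTermInfo aspect per term.
    for name, definition in terms:
        mcp = MetadataChangeProposalWrapper(
            entityUrn=make_term_urn(name),
            aspect=GlossaryTermInfoClass(name=name, definition=definition, termSource="EXTERNAL"),
        )
        emitter.emit(mcp)
    There is also a file-based Business Glossary ingestion source that loads terms from a YAML file, which may be a better fit for a one-off migration.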
  • rhythmic-sundown-12093

    08/31/2023, 6:27 AM
    Hi team, I encountered the following error when using DataHub to integrate internal OIDC authentication (Casdoor):
    Copy code
    java.util.concurrent.CompletionException: org.pac4j.core.exception.TechnicalException: com.nimbusds.jose.proc.BadJOSEException: Signed JWT rejected: Another algorithm expected, or no matching key(s) found
            at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
            at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
            at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1702)
            at play.core.j.HttpExecutionContext.$anonfun$execute$1(HttpExecutionContext.scala:64)
            at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:49)
            at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:48)
            at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
            at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
            at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
            at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
            at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
    Caused by: org.pac4j.core.exception.TechnicalException: com.nimbusds.jose.proc.BadJOSEException: Signed JWT rejected: Another algorithm expected, or no matching key(s) found
            at org.pac4j.oidc.profile.creator.OidcProfileCreator.create(OidcProfileCreator.java:145)
            at org.pac4j.oidc.profile.creator.OidcProfileCreator.create(OidcProfileCreator.java:45)
            at org.pac4j.core.client.BaseClient.retrieveUserProfile(BaseClient.java:119)
            at org.pac4j.core.client.BaseClient.getUserProfile(BaseClient.java:99)
            at org.pac4j.core.engine.DefaultCallbackLogic.perform(DefaultCallbackLogic.java:88)
            at auth.sso.oidc.OidcCallbackLogic.perform(OidcCallbackLogic.java:100)
            at controllers.SsoCallbackController$SsoCallbackLogic.perform(SsoCallbackController.java:91)
            at controllers.SsoCallbackController$SsoCallbackLogic.perform(SsoCallbackController.java:77)
            at org.pac4j.play.CallbackController.lambda$callback$0(CallbackController.java:54)
            at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
            ... 8 common frames omitted
    Caused by: com.nimbusds.jose.proc.BadJOSEException: Signed JWT rejected: Another algorithm expected, or no matching key(s) found
            at com.nimbusds.jwt.proc.DefaultJWTProcessor.process(DefaultJWTProcessor.java:384)
            at com.nimbusds.openid.connect.sdk.validators.IDTokenValidator.validate(IDTokenValidator.java:288)
            at com.nimbusds.openid.connect.sdk.validators.IDTokenValidator.validate(IDTokenValidator.java:224)
            at org.pac4j.oidc.profile.creator.TokenValidator.validate(TokenValidator.java:103)
            at org.pac4j.oidc.profile.creator.OidcProfileCreator.create(OidcProfileCreator.java:93)
            ... 17 common frames omitted
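    The root-cause message says the ID token's signing algorithm did not match what the client expected, or no matching key was found in the provider's JWKS. A quick diagnostic sketch, assuming PyJWT and requests are installed (the token value and JWKS URL are hypothetical; the real JWKS URL is listed in the provider's /.well-known/openid-configuration):
    Copy code
    import jwt        # PyJWT
    import requests
    
    id_token = "eyJ..."  # hypothetical: an ID token captured from the OIDC provider
    
    # Inspect the algorithm and key id the provider actually signed with.
    print(jwt.get_unverified_header(id_token))  # e.g. {'alg': 'RS256', 'kid': '...'}
    
    # Compare against the keys the provider publishes (hypothetical JWKS URL).
    jwks = requests.get("https://casdoor.example.com/.well-known/jwks").json()
    print([(k.get("kid"), k.get("alg")) for k in jwks["keys"]])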
  • some-flower-21264

    08/31/2023, 8:05 AM
    Hi all, does DataHub version 0.10.3 have column-level lineage?
  • tall-gigabyte-99212

    08/31/2023, 10:36 AM
    Since column-level lineage cannot be added in the web UI (only table-level lineage can), how can I add column-level lineage? I would therefore like to know how column-level lineage is stored in MySQL. I noticed that table-level lineage is stored as the "upstreamLineage" aspect in the metadata_aspect_v2 table. Can someone provide an answer? It would be great if sample data for column-level lineage could be provided, or even a sample image, just like mine: 😊
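    Column-level lineage is stored in the same upstreamLineage aspect row of metadata_aspect_v2, in its fineGrainedLineages field. A minimal emit sketch using the Python SDK (all platform, table, and column names are hypothetical):
    Copy code
    import datahub.emitter.mce_builder as builder
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import (
        DatasetLineageTypeClass,
        FineGrainedLineageClass,
        FineGrainedLineageDownstreamTypeClass,
        FineGrainedLineageUpstreamTypeClass,
        UpstreamClass,
        UpstreamLineageClass,
    )
    
    # Hypothetical upstream/downstream datasets and a shared column.
    up = builder.make_dataset_urn("hive", "db.src_table")
    down = builder.make_dataset_urn("hive", "db.dst_table")
    
    lineage = UpstreamLineageClass(
        upstreams=[UpstreamClass(dataset=up, type=DatasetLineageTypeClass.TRANSFORMED)],
        fineGrainedLineages=[
            FineGrainedLineageClass(
                upstreamType=FineGrainedLineageUpstreamTypeClass.FIELD_SET,
                upstreams=[builder.make_schema_field_urn(up, "user_id")],
                downstreamType=FineGrainedLineageDownstreamTypeClass.FIELD,
                downstreams=[builder.make_schema_field_urn(down, "user_id")],
            )
        ],
    )
    
    # Writing the aspect for the downstream dataset persists one upstreamLineage
    # row (with fineGrainedLineages populated) in metadata_aspect_v2.
    emitter = DatahubRestEmitter(gms_server="http://localhost:8080")  # hypothetical GMS address
    emitter.emit(MetadataChangeProposalWrapper(entityUrn=down, aspect=lineage))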
  • tall-gigabyte-99212

    08/31/2023, 10:39 AM
    Does DataHub support making dynamic modifications to models online? If I want to add some custom models, how can I do that? Thank you all very much for your help!
  • powerful-cat-68806

    08/31/2023, 12:48 PM
    Hi all 🙂 Does DH support `k8s 1.27` (AWS EKS)?
  • acoustic-lawyer-52945

    08/31/2023, 8:03 PM
    Is the Kafka schema registry component strictly required? It seems as though I'm able to successfully ingest from multiple sources (Glue, Redshift) using the datahub-rest sink without it. I still get tons of logs within datahub-gms related to Avro deserialization failing. I'm deploying on Kubernetes using the Strimzi operator to define Kafka. Strimzi has no official schema registry support, so I'd like to avoid installing something like the Confluent schema registry if I can.
  • some-actor-27079

    08/31/2023, 9:07 PM
    Hi, We are facing some problems when trying to deploy DataHub with Kubernetes on Azure Kubernetes Service (AKS). If I do a clean helm install of the prerequisites and the DataHub charts as in the tutorial https://datahubproject.io/docs/deploy/kubernetes/, I am able to install the prerequisites, but the installation of DataHub keeps failing. The culprit is the
    datahub-datahub-system-update-job
    which keeps failing. If I modify the prerequisites and the datahub values.yaml as suggested here: https://github.com/acryldata/datahub-helm/issues/347, that is: """ Filling values in prerequisites:
    Copy code
    cp-helm-charts:
      enabled: true
      cp-schema-registry:
        enabled: true
    and values in datahub:
    Copy code
    kafka:
      schemaregistry:
        type: KAFKA
        url: "<http://datahub-datahub-gms:8080/schema-registry/api/>"
    """ I am able to helm install the prerequisites without issues. I am also able to helm install datahub. However, in the front-end I don't have an "Ingestion" tab. If I write "http:my-datahub.domain/ingestion" I get an error "Failed to load ingestion sources! An unexpected error occurred" (see image). If I try to ingest some data, I get another error: "Unauthorized to perform this action. Please contact your DataHub administrator (code 403)." (see image). Does anyone have any clue on how to fix these errors? Any tips would be greatly appreciated!
  • tall-gigabyte-99212

    09/01/2023, 7:16 AM
    I want to know how DataHub converts the data in MySQL into a graph structure, such as lineage. 😄 Which classes in the source code can I look at?
  • ancient-kitchen-28586

    09/01/2023, 12:31 PM
    hi, I'm trying to configure Azure OIDC login (https://datahubproject.io/docs/authentication/guides/sso/configure-oidc-react-azure/). The login comes up, but after that I get the message "Sorry, but we're having trouble signing you in. AADSTS900971: No reply address provided." From the MS docs it looks like the redirect URI would be wrong, but I have configured https://iodatahub.domain.com/callback/oidc in the app registration and I still get the error. Has anybody encountered this?
  • gifted-laptop-70221

    09/01/2023, 1:07 PM
    Hi everyone, I’m testing DataHub integration with Keboola. I have followed the quickstart guide, started DataHub locally, and generated an access token via GraphQL, but to set up the DataHub component in Keboola it is also necessary to specify the DataHub server address. Could someone give me a hand on how to proceed when deploying locally?
  • some-flower-21264

    09/01/2023, 3:37 PM
    Hi Team, MCE Issue: Earlier, when we were upgrading to version 0.10.4, we were getting "Error creating bean with name 'awsGlueSchemaRegistryFactory': Injection of autowired dependencies failed; nested exception is java.lang.IllegalArgumentException: Could not resolve placeholder 'kafka.schemaRegistry.awsGlue.region' in value "${kafka.schemaRegistry.awsGlue.region}"". As suggested in Tuesday's breakout session we downgraded to version 0.10.3, and now we are getting the error below:
    2023-09-01 11:45:37,439 [main] WARN o.eclipse.jetty.webapp.WebAppContext - Failed startup of context o.s.b.w.e.j.JettyEmbeddedWebAppContext@758e6acd{application,/,[file:///tmp/jetty-docbase.9090.3581628884752846733/],UNAVAILABLE}
    org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'restliServletRegistration' defined in class path resource [com/linkedin/metadata/restli/RestliServletConfig.class]: Unsatisfied dependency expressed through method 'restliServletRegistration' parameter 0; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'restliHandlerServlet': Unsatisfied dependency expressed through field '_r2Servlet'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'rapServlet' defined in class path resource [com/linkedin/restli/server/RAPServletFactory.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [com.linkedin.r2.transport.http.server.RAPServlet]: Factory method 'rapServlet' threw exception; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'authorizerChainFactory': Unsatisfied dependency expressed through field 'dataHubAuthorizer'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'dataHubAuthorizerFactory': Unsatisfied dependency expressed through field 'entityClient'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'javaEntityClientFactory': Unsatisfied dependency expressed through field '_entityService'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'entityAspectDao' defined in class path resource [com/linkedin/gms/factory/entity/EntityAspectDaoFactory.class]: Unsatisfied dependency expressed through method 'createEbeanInstance' parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'ebeanServer' defined in class path resource [com/linkedin/gms/factory/entity/EbeanServerFactory.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.ebean.EbeanServer]: Factory method 'createServer' threw exception; nested exception is java.lang.NullPointerException
    Caused by: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'restliHandlerServlet': Unsatisfied dependency expressed through field '_r2Servlet'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'rapServlet' defined in class path resource [com/linkedin/restli/server/RAPServletFactory.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [com.linkedin.r2.transport.http.server.RAPServlet]: Factory method 'rapServlet' threw exception; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'authorizerChainFactory': Unsatisfied dependency expressed through field 'dataHubAuthorizer'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'dataHubAuthorizerFactory': Unsatisfied dependency expressed through field 'entityClient'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'javaEntityClientFactory': Unsatisfied dependency expressed through field '_entityService'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'entityAspectDao' defined in class path resource [com/linkedin/gms/factory/entity/EntityAspectDaoFactory.class]: Unsatisfied dependency expressed through method 'createEbeanInstance' parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'ebeanServer' defined in class path resource [com/linkedin/gms/factory/entity/EbeanServerFactory.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.ebean.EbeanServer]: Factory method 'createServer' threw exception; nested exception is java.lang.NullPointerException
    For GMS:
    Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [https://url:443], URI [/datahubpolicyindex_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 400 Bad Request] {"error":{"root_cause":[{"type":"query_shard_exception","reason":"No mapping found for [lastUpdatedTimestamp] in order to sort on","index_uuid":"xyz","index":"datahubpolicyindex_v2"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"can_match","grouped":true,"failed_shards":[{"shard":0,"index":"datahubpolicyindex_v2","node":"UpzpcqaERpuGr-3NwBZoIQ","reason":{"type":"query_shard_exception","reason":"No mapping found for [lastUpdatedTimestamp] in order to sort on","index_uuid":"xyz","index":"datahubpolicyindex_v2"}}]},"status":400}
            at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:326)
            at org.elasticsearch.client.RestClient.performRequest(RestClient.java:296)
            at org.elasticsearch.client.RestClient.performRequest(RestClient.java:270)
            at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1632)
            ... 17 common frames omitted
    2023-09-01 11:54:22,710 [pool-11-thread-1] ERROR c.d.authorization.DataHubAuthorizer:230 - Failed to retrieve policy urns! Skipping updating policy cache until next refresh. start: 0, count: 30
    com.datahub.util.exception.ESQueryException: Search query failed:
    2023-09-01 11:54:22,836 [pool-14-thread-1] INFO c.l.m.boot.OnBootApplicationListener:68 - Failed to connect to open servlet: schema-registry
    2023-09-01 11:54:22,837 [pool-14-thread-1] INFO c.l.m.boot.OnBootApplicationListener:60 - Sleeping for 1 second
    2023-09-01 11:54:23,838 [pool-14-thread-1] INFO c.l.m.boot.OnBootApplicationListener:68 - Failed to connect to open servlet: schema-registry
    2023-09-01 11:54:23,838 [pool-14-thread-1] INFO c.l.m.boot.OnBootApplicationListener:60 - Sleeping for 1 second
    2023-09-01 11:54:24,839 [pool-14-thread-1] INFO c.l.m.boot.OnBootApplicationListener:68 - Failed to connect to open servlet: schema-registry
    2023-09-01 11:54:24,839 [pool-14-thread-1] INFO c.l.m.boot.OnBootApplicationListener:60 - Sleeping for 1 second
    2023-09-01 11:54:25,841 [pool-14-thread-1] INFO c.l.m.boot.OnBootApplicationListener:68 - Failed to connect to open servlet: schema-registry
    2023-09-01 11:54:25,841 [pool-14-thread-1] INFO c.l.m.boot.OnBootApplicationListener:60 - Sleeping for 1 second
    2023-09-01 11:54:26,842 [pool-14-thread-1] INFO c.l.m.boot.OnBootApplicationListener:68 - Failed to connect to open servlet: schema-registry
    2023-09-01 11:54:26,842 [pool-14-thread-1] INFO c.l.m.boot.OnBootApplicationListener:60 - Sleeping for 1 second
    2023-09-01 11:54:27,848 [pool-14-thread-1] INFO c.l.m.boot.OnBootApplicationListener:68 - Failed to connect to open servlet: schema-registry: Name does not resolve
    2023-09-01 11:54:27,849 [pool-14-thread-1] INFO c.l.m.boot.OnBootApplicationListener:60 - Sleeping for 1 second
    2023-09-01 11:54:17,888 [pool-17-thread-1] ERROR c.l.g.factory.telemetry.DailyReport:115 - Error reporting telemetry: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
            at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
    Could anyone please have a look and assist us? It's blocking us now.
  • white-appointment-14217

    09/01/2023, 5:23 PM
    Hi all, I am looking to resolve security alerts on Azure by setting up a securityContext for all the DataHub pods running on AKS. For cp-schema-registry (prerequisites version 0.0.14), neither podSecurityContext nor securityContext seems to take effect, and the prometheus-jmx-exporter container is still running as the root user. Below is a snapshot of the values file for cp-schema-registry. Could someone please guide me on resolving this?
    Copy code
    cp-helm-charts:
      cp-schema-registry:
        enabled: true
        podSecurityContext:
          fsGroup: 1000
        securityContext:
          runAsUser: 1000
        prometheus:
          securityContext:
            runAsUser: 1000
          jmx:
            securityContext:
              runAsUser: 1000
        kafka:
          bootstrapServers: "prerequisites-kafka:9092"
      cp-kafka:
        enabled: false
      cp-zookeeper:
        enabled: false
      cp-kafka-rest:
        enabled: false
      cp-kafka-connect:
        enabled: false
      cp-ksql-server:
        enabled: false
      cp-control-center:
        enabled: false
  • chilly-potato-57465

    09/04/2023, 2:12 PM
    Hello Everyone! I was reading about stemming and synonyms support around the beginning of the summer and thought it was a really useful feature. But since then it has disappeared from the DataHub documentation. I found the link to it in this post https://datahubspace.slack.com/archives/CV2UVAPPG/p1690483464290849 but it is not working. So I am wondering: has this feature been removed or deprecated? I would like to implement it. Thank you!
  • wide-ability-48958

    09/05/2023, 5:06 AM
    Hi All, We are trying to deploy DataHub in an air-gapped Kubernetes environment where we need to build/point all repos to an internal Docker hub due to security concerns. Would anyone have any suggestions on how to proceed? datahub-helm does not seem to have templates for the prerequisites. Reference: https://github.com/acryldata/datahub-helm. It would be helpful if anyone has documentation or details for such requirements.
  • orange-gpu-90973

    09/05/2023, 5:56 AM
    Hi team, I want to deploy DataHub behind a load balancer that has been provided to me. We need to add a prefix to the DataHub application so that it is separated from the other applications using the same load balancer. Is there any way in the DataHub front-end (React) to add a route prefix so that it gets added to all of DataHub's routes?
  • better-orange-49102

    09/05/2023, 8:41 AM
    re: data retention. I noticed that deploying 0.10.3 inserted a dataHubRetention aspect into the MySQL DB with a value of 20 for all aspects and entities: https://datahubproject.io/docs/advanced/db-retention/#how-to-configure I noticed that my GMS didn't have ENTITY_SERVICE_ENABLE_RETENTION=true as an environment variable, yet it was still pruning my entries. What's the best way to disable pruning? Remove the aspect?
  • future-controller-3884

    09/05/2023, 2:58 PM
    Hi all, I have the Editor role on the DataHub tool, but sometimes I get the error below when I edit the description of a column. I'm using v0.10.3. Do you have any suggestions for how to investigate the issue?
  • brainy-butcher-66683

    09/05/2023, 6:11 PM
    Hi Team, I am using a standalone Postgres DB. Is there a way for me to create the
    datahub
    database needed in my standalone DB via the deployment helm chart?
  • wonderful-cpu-12969

    09/05/2023, 7:12 PM
    Hi all, we are trying to deploy DataHub in Azure AKS, using Azure MySQL Flexible Server for the DB. We can confirm access to the database server from the AKS cluster, so there isn't a network connectivity issue. However, we are seeing a failure in the
    datahub-system-update-job
    with the below error. I have also attached the log.
    Copy code
    2023-09-04 16:29:19,467 [main] INFO  i.e.d.pool.PooledConnectionQueue:405 - Reseting DataSourcePool [gmsEbeanServiceConfig] min[2] max[50] free[0] busy[0] waiting[0] highWaterMark[0] waitCount[0] hitCount[0]
    2023-09-04 16:29:19,468 [main] INFO  i.e.d.pool.PooledConnectionQueue:411 - Busy Connections:
    
    2023-09-04 16:29:19,470 [main] ERROR c.l.g.f.entity.EbeanServerFactory:38 - Failed to connect to the server. Is it up?
    ERROR SpringApplication Application run failed
     org.springframework.beans.fa
    This is the JDBC connection URL: "jdbc:mysql://mysqlsrv05092023.mysql.database.azure.com:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8&enabledTLSProtocols=TLSv1.2" Has anyone come across a similar error and resolved it?
    current_error.txt
  • creamy-wall-36971

    09/06/2023, 2:04 AM
    Hi everyone, please tell me how to delete all glossary terms through the CLI. Thank you.
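    The datahub delete CLI command covers this (check datahub delete --help for the exact flags in your version). As a sketch, here is the same bulk delete via the Python SDK, assuming your acryl-datahub version provides get_urns_by_filter and delete_entity (the GMS address is hypothetical):
    Copy code
    from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
    
    graph = DataHubGraph(DatahubClientConfig(server="http://localhost:8080"))  # hypothetical GMS address
    
    # Iterate over every glossary term URN known to GMS and hard-delete it.
    for urn in graph.get_urns_by_filter(entity_types=["glossaryTerm"]):
        graph.delete_entity(urn, hard=True)  # hard=False would only soft-delete
        print(f"deleted {urn}")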
  • adamant-iron-62899

    09/06/2023, 7:58 AM
    Dear Community, I have checked previous threads and there were some discussions about using podman instead of Docker, especially the podman-compose part. However, I have not seen any confirmation that somebody has successfully deployed DataHub with podman. Is the trick here the alias in bash?
  • loud-gpu-36404

    09/08/2023, 1:09 PM
    Heya everyone! I'm currently deploying DataHub (0.10.5) on GCP using helm charts, and have been struggling with auth issues within DataHub. I'm wondering if anyone can shed some light on them :D I've tried: • Frontend default auth and GMS with no metadata service authentication - ✅ • Frontend default auth and GMS metadata service authentication enabled (auto-generated secrets) - ✅ • Frontend OIDC auth (Google provider) and GMS with no metadata service authentication - ✅ • Frontend OIDC auth (Google provider) and GMS metadata service authentication enabled - ❌ Has anyone got OIDC auth on the frontend to play nice with GMS metadata service auth?
  • orange-gpu-90973

    09/08/2023, 2:31 PM
    Hi team, I am getting the below error in the GMS logs, and I am also not able to see datasets; only the platform is added. Logs are here:
    ERROR c.l.m.s.e.update.BulkListener:56 - Error feeding bulk request. No retries left. Request:
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGM%2CPROD%29];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGMGRP%2CPROD%29];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGMGRP%2CPROD%29];
    Failed to perform bulk request: index [system_metadata_service_v1], optype: [UPDATE], type [_doc], id [MulrbF+m7pOX8FZWKhFvfA==];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTG%2CPROD%29];
    Failed to perform bulk request: index [system_metadata_service_v1], optype: [UPDATE], type [_doc], id [+dVcgKkZo0b54/VULBLKVA==];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGL%2CPROD%29];
    Failed to perform bulk request: index [graph_service_v1], optype: [UPDATE], type [_doc], id [q8v+RyJ/4GuJiUEeie9BYQ==];
    Failed to perform bulk request: index [system_metadata_service_v1], optype: [UPDATE], type [_doc], id [n3GQtcgTWWWXYCsCtIxcIA==];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGL%2CPROD%29];
    Failed to perform bulk request: index [system_metadata_service_v1], optype: [UPDATE], type [_doc], id [1Pe69BbgsIXutJ+qQx46Sg==];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGL%2CPROD%29];
    Failed to perform bulk request: index [system_metadata_service_v1], optype: [UPDATE], type [_doc], id [0EvAuIBMmx2PfhUA010aFQ==];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGL%2CPROD%29];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGRP%2CPROD%29];
    Failed to perform bulk request: index [system_metadata_service_v1], optype: [UPDATE], type [_doc], id [yvOt1wkWazchptZYmJtRzg==];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGL%2CPROD%29];
    Failed to perform bulk request: index [system_metadata_service_v1], optype: [UPDATE], type [_doc], id [OvHgSEtTPUICetikdlM0+g==];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGL%2CPROD%29];
    Failed to perform bulk request: index [system_metadata_service_v1], optype: [UPDATE], type [_doc], id [7c9EvHBwA7Xss4O7rNsE2g==];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGL%2CPROD%29];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGRP%2CPROD%29];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTITLE%2CPROD%29];
    Failed to perform bulk request: index [system_metadata_service_v1], optype: [UPDATE], type [_doc], id [sQtpzVgs7goDUM/HXNvkGw==];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGL%2CPROD%29];
    Failed to perform bulk request: index [system_metadata_service_v1], optype: [UPDATE], type [_doc], id [qIM8ghhj7xP2qzARnIDXwQ==];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGM%2CPROD%29];
    Failed to perform bulk request: index [graph_service_v1], optype: [UPDATE], type [_doc], id [q7vdPq8m0+zZpY5saCJV6A==];
    Failed to perform bulk request: index [system_metadata_service_v1], optype: [UPDATE], type [_doc], id [6OV/6/4OgKUF/Y5Rp0hbEg==];
    Failed to perform bulk request: index [datasetindex_v2], optype: [UPDATE], type [_doc], id [urn%3Ali%3Adataset%3A%28urn%3Ali%3AdataPlatform%3AAmpere%2CCTGS.STNET.DTS2.CTGM%2CPROD%29]
    java.io.IOException: Unable to parse response body for Response{requestLine=POST /_bulk?timeout=1m HTTP/1.1, host=http://logging-es-http.monitoring.svc.cluster.local:9200, response=HTTP/1.1 200 OK}
            at org.elasticsearch.client.RestHighLevelClient$1.onSuccess(RestHighLevelClient.java:1783)
            at org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onSuccess(RestClient.java:636)
            at org.elasticsearch.client.RestClient$1.completed(RestClient.java:376)
            at org.elasticsearch.client.RestClient$1.completed(RestClient.java:370)
            at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:122)
            at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:181)
            at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:448)
            at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:338)
            at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
            at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
            at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
            at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
            at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
            at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
            at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
            at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
            at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
            at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
            at java.base/java.lang.Thread.run(Thread.java:829)
    Caused by: java.lang.NullPointerException: null
            at java.base/java.util.Objects.requireNonNull(Objects.java:221)
            at org.elasticsearch.action.DocWriteResponse.<init>(DocWriteResponse.java:127)
            at org.elasticsearch.action.update.UpdateResponse.<init>(UpdateResponse.java:65)
            at org.elasticsearch.action.update.UpdateResponse$Builder.build(UpdateResponse.java:172)
            at org.elasticsearch.action.update.UpdateResponse$Builder.build(UpdateResponse.java:160)
            at org.elasticsearch.action.bulk.BulkItemResponse.fromXContent(BulkItemResponse.java:159)
            at org.elasticsearch.action.bulk.BulkResponse.fromXContent(BulkResponse.java:188)
            at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1911)
            at org.elasticsearch.client.RestHighLevelClient.lambda$performRequestAsyncAndParseEntity$10(RestHighLevelClient.java:1699)
            at org.elasticsearch.client.RestHighLevelClient$1.onSuccess(RestHighLevelClient.java:1781)
            ... 18 common frames omitted
    2023-09-07 14:55:08,612 [qtp71399214-18] INFO c.l.m.r.platform.PlatformResource:61 - Emitting platform event. name: entityChangeEvent, key: entityChangeEvent-urn:li:dataset:(urn:li:dataPlatform:Ampere,CTGS.STNET.DTS2.CTGM,PROD)
    Any idea why it is failing while making the bulk request to Elasticsearch? Context: using a helm deployment of DataHub and ingesting a custom data source.
  • gorgeous-tent-62316

    09/08/2023, 4:38 PM
    Hi Team, Looking at the following datahub-gms Docker image released three days ago: https://hub.docker.com/layers/linkedin/datahub-gms/065a290/images/sha256-21e522d1168[…]5307b6f6897d6110b2d169656adb849e2ec184f44ea73?context=explore We are seeing many critical security vulnerabilities; please see the attached file. Our strict security compliance has blocked us from using these images. On our side, we will try to mitigate these issues, and of course, if we find a viable solution we will contribute it back to the community. My question: is any work scheduled on your side to remedy these vulnerabilities?
    Docker_linkedin-datahub-gms-sha256__21e522d1168912ec1795307b6f6897d6110b2d169656adb849e2ec184f44ea73_Security_Export.pdf
  • average-vr-23088

    09/08/2023, 6:51 PM
    Hi guys, I see that 0.11.0 is in pre-release. How long does it usually take for a pre-release to become an actual release? Is it on the order of a few days? Wondering if we should wait for 0.11.0 to drop or just update to 0.10.5.
  • proud-table-38689

    09/10/2023, 7:33 PM
    Where is the code that DataHub uses to create the Kafka Schema Registry schemas for the topics it uses internally? (e.g.
    MetadataAuditEvent_v4
    ,
    DataHubUpgradeHistory_v1
    )
  • many-wolf-97286

    09/11/2023, 6:36 PM
    Hi team, recently we installed the DataHub & GMS components on a private AKS cluster. During the ingestion process we get the below error, and we are also not able to add datasets; only the platform is added. Any suggestions or recommendations would be helpful. Errors:
    Could not fetch URL https://pypi.org/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/pip/ (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1007)'))) - skipping
    [2023-09-08 14:33:22,977] ERROR {datahub.ingestion.run.pipeline:418} - Caught error
    Traceback (most recent call last):
      File "/usr/local/lib/python3.10/site-packages/datahub_classify/infotype_helper.py", line 35, in <module>
        nlp_english = spacy.load(spacy_model_name)
      File "/usr/local/lib/python3.10/site-packages/spacy/__init__.py", line 54, in load
        return util.load_model(
      File "/usr/local/lib/python3.10/site-packages/spacy/util.py", line 439, in load_model
        raise IOError(Errors.E050.format(name=name))
    OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a Python package or a valid path to a data directory.
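    The [E050] error means the spaCy model en_core_web_sm that the classifier loads is missing, and the SSL failure is preventing pip from fetching it. A sketch of checking for and installing the model from inside the ingestion environment (the wheel path is hypothetical; in a restricted cluster the wheel would come from an internal mirror):
    Copy code
    import spacy
    
    try:
        spacy.load("en_core_web_sm")
        print("model present")
    except OSError:
        # Download it if the environment has network access...
        from spacy.cli import download
        download("en_core_web_sm")
        # ...or, in an air-gapped environment, install a mirrored wheel instead:
        # pip install /mnt/wheels/en_core_web_sm-3.6.0-py3-none-any.whl  (hypothetical path)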