# all-things-deployment

    handsome-flag-16272

    02/24/2023, 7:20 PM
    Hi team,
    1. Is datahub-mce-consumer responsible for consuming the events?
    2. To improve throughput, we need to increase the number of Kafka partitions and scale out the datahub-mce-consumer instances. Are all messages sent to one topic, or can we configure different Kafka topics per domain? Does DataHub support scaling consumers out based on the number of to-be-processed messages in the Kafka topic?
    3. If there are no standalone datahub-mce-consumer instances, which component does the job? From the logs it appears to be datahub-gms. If we start mce-consumer, mae-consumer, and datahub-gms together, will one change proposal event be processed by all three components?
    4. Where does datahub-mce-consumer send data: to datahub-gms, or directly to Elasticsearch for index building? It appears to be datahub-gms.
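    The scaling model in question 2 — one topic, more partitions, more consumer replicas — can be sketched roughly as below. This is only an illustration: the topic name assumes the default (`MetadataChangeProposal_v1` in recent versions, overridable via the topic-name env vars), and the broker address and deployment name are placeholders for your own cluster.
    Copy code
    # Sketch, assuming the default MCP topic name; adjust --bootstrap-server
    # and the partition count for your cluster.
    kafka-topics.sh --bootstrap-server localhost:9092 \
      --alter --topic MetadataChangeProposal_v1 --partitions 10

    # With standalone consumers enabled, scale replicas up to at most the
    # partition count (Kubernetes example; deployment name varies by release):
    kubectl scale deployment datahub-mce-consumer --replicas 10
    Note that consumers beyond the partition count sit idle, so the partition change has to come first.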

    numerous-byte-87938

    02/24/2023, 9:44 PM
    🤔 Looking for some insights on ES index reindexing during GMS deployment (our version predates this PR). My current understanding is that GMS uses Spring, and before creating servlets it needs to initialize beans first, such as elasticSearchService and elasticSearchGraphService. One common method of those services is configure(), which rebuilds the indices as needed. What confuses me is that I'm only able to find references to this method in mae-consumer and nowhere else on the GMS side. But we do see this configure() method being called (the reindex step happening) inside our GMS pod deployment, despite the fact that we've separated mae-consumer into another service.

    future-analyst-98466

    02/27/2023, 7:19 AM
    Hi team, are there any sizing guidelines for deploying DataHub on Docker? What parameters does sizing depend on (for example, number of tables, how big the schemas are, etc.)?

    agreeable-belgium-70840

    02/27/2023, 12:00 PM
    So I posted yesterday on #troubleshoot about this, but the issue persists. Sorry for the spam, but I'll need to post once more with some further details. I am trying to upgrade DataHub from v0.9.5 to v0.10.0. I ran all the init jobs and the datahub-upgrade job. Everything gets deployed, but datahub-gms seems to hang without giving any error. Helm times out at some point and the deployment fails. Here are DataHub's logs:
    2023-02-24 14:43:11,561 [ThreadPoolTaskExecutor-1] INFO  o.s.k.l.KafkaMessageListenerContainer:292 - mce-consumer-job-client: partitions revoked: []
    2023-02-24 14:43:11,561 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator:552 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] (Re-)joining group
    2023-02-24 14:43:11,561 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator:552 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] (Re-)joining group
    2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator:503 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Successfully joined group with generation 1837
    2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator:503 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Successfully joined group with generation 1837
    2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.ConsumerCoordinator:273 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Adding newly assigned partitions: 
    2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.ConsumerCoordinator:273 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Adding newly assigned partitions: 
    2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.s.k.l.KafkaMessageListenerContainer:292 - mce-consumer-job-client: partitions assigned: []
    Any recommendation is welcome. Regards

    gifted-diamond-19544

    02/27/2023, 12:17 PM
    Hello all! I have a question: how are the weekly active users on the Analytics tab calculated? I ask because the number shown at the top of the page is not the same as the one shown in the graph for this week.

    wooden-breakfast-17692

    02/27/2023, 3:22 PM
    Hi all, I’m trying to deploy datahub locally with ./gradlew quickstartDebug using neo4j. It looks like neo4j is disabled by default. Is there a way I can use quickstartDebug with neo4j? Thanks!
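    For what it's worth, the graph backend is selected through the graph service implementation setting; a minimal sketch, assuming the quickstartDebug containers actually pick up GRAPH_SERVICE_IMPL from the environment (worth verifying against the docker profile's compose files):
    Copy code
    # Sketch: ask GMS to use neo4j as the graph backend, then start the debug stack.
    export GRAPH_SERVICE_IMPL=neo4j
    ./gradlew quickstartDebug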

    bland-balloon-48379

    02/27/2023, 4:06 PM
    Hey everyone! My team recently tested switching from neo4j to elasticsearch as our graph database backend. We followed the documentation for the switch and things seemed to go pretty smoothly; however, the logs for the restore-indices job show 521 skipped rows during re-indexing. Looking through the logs in more depth, all of the failures have the same format:
    java.lang.IllegalArgumentException: Failed to find entity with name X in EntityRegistry.
    Aggregating these, I got the following counts by entity name: globalSettings: 1, dataHubView: 8, dataHubStepState: 512. I just want to know whether these are all internal DataHub items that don't have a place in ES and can be safely ignored, or whether I need to investigate them further. For context, we are still running datahub v0.9.5. Any insights would be appreciated. Thanks!

    alert-traffic-45034

    02/28/2023, 5:20 AM
    Hi, I hope this is the right place for the question below. We onboard users with Okta SSO login, so a user is created in DataHub only when they log in for the first time. What I would like to do is add users to a specific group via the provided API, even when the user has not been created yet.
    mutation {
      addGroupMembers(input:{
        groupUrn:"<existing-group-Urn>"
        userUrns:[
          "urn:li:corpuser:<new-user-1>"
          "urn:li:corpuser:<new-user-2>"
        ]
      })
    }
    But currently this fails, even though I have the exact string pattern for the new user(s). Is there another good approach to doing this? Thanks in advance.

    cuddly-arm-8412

    02/28/2023, 7:23 AM
    Hi team, I modified the GMS startup port. When I ran the GMS service, it reported the following error: Connection refused: localhost/127.0.0.1:14142. I don't know why the service didn't start successfully.
    task run(type: JavaExec, dependsOn: build) {
        main = "org.eclipse.jetty.runner.Runner"
        systemProperties System.getProperties()
        args = ["--port", 14142, war.archivePath]
        classpath configurations.jetty9
    }
    2023-02-28 15:07:10,822 [R2 Nio Event Loop-1-1] DEBUG c.l.r.t.http.client.AsyncPoolImpl:733 - localhost/127.0.0.1:14142/fb07564 object creation failed
    com.linkedin.r2.RetriableRequestException: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:14142
        at com.linkedin.r2.transport.http.client.common.ChannelPoolLifecycle.onError(ChannelPoolLifecycle.java:142)
        at com.linkedin.r2.transport.http.client.common.ChannelPoolLifecycle.lambda$create$0(ChannelPoolLifecycle.java:97)
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)
        at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583)
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559)
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)
        at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)
        at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:629)
        at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:118)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:321)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:337)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at java.base/java.lang.Thread.run(Thread.java:834)
    Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:14142
    Caused by: java.net.ConnectException: Connection refused
        at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)
        at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at java.base/java.lang.Thread.run(Thread.java:834)
    2023-02-28 15:07:12,721 [ThreadPoolTaskExecutor-1] DEBUG o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer:313 - Received: 0 records

    bright-receptionist-94235

    02/28/2023, 7:53 AM
    Hey, we have upgraded datahub-actions to version v0.0.11 (image: docker.taboolasyndication.com/data-apps/acryldata/datahub-actions:v0.0.11), but inside the actions pod the CLI reports DataHub CLI version: 0.9.6.2. Why is that?

    cuddly-arm-8412

    02/28/2023, 11:34 AM
    Hi team, I merged the latest official code. When I debugged the code to fetch dataset information, an error was reported. Does the neo4j image/database need to be upgraded? Caused by: com.google.common.util.concurrent.UncheckedExecutionException: org.neo4j.driver.exceptions.ClientException: The server does not support any of the protocol versions supported by this driver. Ensure that you are using driver and server versions that are compatible with one another.

    gifted-diamond-19544

    02/28/2023, 12:38 PM
    Hello all! We are currently having a problem with our DataHub instance. Basically, we cannot trigger an ingestion from the UI or add new ingestion sources. When we trigger an ingestion, a green popup says the ingestion started, but nothing changes in the UI. In the logs, there is this error message:
    [0]: index [datahubexecutionrequestindex_v2], type [_doc], id [urn%3Ali%3AdataHubExecutionRequest%3Acb0e3a90-c4b5-47de-9f60-88c9301d7866], message [[datahubexecutionrequestindex_v2/hcm4-xQ0T2CYNwTo9WLH4Q][[datahubexecutionrequestindex_v2][0]] ElasticsearchException[Elasticsearch exception [type=document_missing_exception, reason=[_doc][urn%3Ali%3AdataHubExecutionRequest%3Acb0e3a90-c4b5-47de-9f60-88c9301d7866]: document missing]]]
    what would be the best way to fix this? Thank you!

    busy-mechanic-8014

    02/28/2023, 2:53 PM
    Hello everyone, I'm trying to deploy datahub on my Kubernetes cluster and I'm stuck on a datahub-gms pod error. I've seen that others have encountered the problem before, but the solutions that were proposed did not work for me. One of the many errors:
    Caused by: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'restliEntityClientFactory': Unsatisfied dependency expressed through field 'gmsPort'; nested exception is org.springframework.beans.TypeMismatchException: Failed to convert value of type 'java.lang.String' to required type 'int'; nested exception is java.lang.NumberFormatException: For input string: "<tcp://10.43.117.128:8080>"
    See the complete log file (interesting lines from line 2824) (datahub-gms-...log). I deploy from the helm charts like this (without modifying the values.yaml):
    helm repo add datahub https://helm.datahubproject.io/
    helm install prerequisites datahub/datahub-prerequisites --namespace datahub
    helm install datahub datahub/datahub --namespace datahub
    All components are in the Running state (except acryl-datahub and datahub-gms), and all Jobs are in the Succeeded state with no error logs (especially mysql). I've attached all the Job log files to this message. Can someone help me? Thanks a lot! Don't hesitate to ask me for more information if needed 🙂
    datahub-system-update-job-z25bm.log kafka-setup-job-6bqzz.log elasticsearch-setup-job-smlp2.log mysql-setup-job-vq48j.log
    datahub-gms-84d748899c-nn7c5.log

    agreeable-belgium-70840

    02/28/2023, 3:43 PM
    Hello guys, I have posted about this before but I have some more evidence now, sorry for the spam. I am trying to upgrade from v0.9.5 to v0.10.0. I ran the upgrade job and the elasticsearch, kafka, and mysql init jobs. Everything looks fine, but GMS hangs here:
    2023-02-24 14:43:11,561 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator:552 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] (Re-)joining group
    2023-02-24 14:43:11,561 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator:552 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] (Re-)joining group
    2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator:503 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Successfully joined group with generation 1837
    2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator:503 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Successfully joined group with generation 1837
    2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.ConsumerCoordinator:273 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Adding newly assigned partitions: 
    2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.ConsumerCoordinator:273 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Adding newly assigned partitions: 
    2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.s.k.l.KafkaMessageListenerContainer:292 - mce-consumer-job-client: partitions assigned: []
    At some point helm times out and the deployment fails. What I can observe now is an extra warning:
    2023-02-24 14:42:10,554 [main] WARN  c.l.metadata.entity.EntityService:798 - Unable to produce legacy MAE, entity may not have legacy Snapshot schema.
    java.lang.UnsupportedOperationException: Failed to find Typeref schema associated with Config-based Entity
    	at com.linkedin.metadata.models.ConfigEntitySpec.getAspectTyperefSchema(ConfigEntitySpec.java:80)
    	at com.linkedin.metadata.entity.EntityService.toAspectUnion(EntityService.java:1480)
    	at com.linkedin.metadata.entity.EntityService.buildSnapshot(EntityService.java:1429)
    	at com.linkedin.metadata.entity.EntityService.produceMetadataAuditEvent(EntityService.java:1239)
    	at com.linkedin.metadata.entity.EntityService.sendEventForUpdateAspectResult(EntityService.java:794)
    The full log is attached. I would be grateful for some assistance, as I am stuck at this point. Regards
    logs-from-datahub-gms-in-datahub-gms-566ddf4574-rxmk7.log

    cuddly-arm-8412

    03/01/2023, 7:01 AM
    Hi team, I want to understand the purpose of the index [datahubstepstateindex_v2]. I pulled the latest code to debug locally and hit this error:
    Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://172.21.204.35:9201], URI [/datahubstepstateindex_v2/_count?ignore_throttled=true&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true], status line [HTTP/1.1 404 Not Found]
    2023-03-01T13:35:55.598+0800 [QUIET] [system.out] {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index [datahubstepstateindex_v2]","resource.type":"index_or_alias","resource.id":"datahubstepstateindex_v2","index_uuid":"_na_","index":"datahubstepstateindex_v2"}],"type":"index_not_found_exception","reason":"no such index [datahubstepstateindex_v2]","resource.type":"index_or_alias","resource.id":"datahubstepstateindex_v2","index_uuid":"_na_","index":"datahubstepstateindex_v2"},"status":404}

    better-sunset-65466

    03/01/2023, 10:07 AM
    Hello, what are the minimum/recommended system requirements to install via docker quickstart?

    billowy-jewelry-47039

    03/01/2023, 10:44 AM
    After installing the datahub helm chart I noticed the default datahub user doesn't have administrator privileges. Is there any way to create a user with administrator privileges?
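    One common approach for local accounts is a user.props file consumed by datahub-frontend's JAAS PropertyFileLoginModule; a sketch under assumptions (the path below is the quickstart convention — with helm you would need to mount the file into the frontend container yourself, and admin rights are granted separately via a DataHub Policy in the UI):
    Copy code
    # Sketch: add a local user (username:password, one per line).
    mkdir -p ~/.datahub/plugins/frontend/auth
    cat > ~/.datahub/plugins/frontend/auth/user.props <<'EOF'
    admin2:changeme
    EOF
    # Then, in the UI, grant privileges under Settings -> Permissions -> Policies
    # by adding the user to a policy with the desired platform privileges.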

    rapid-airport-61849

    03/01/2023, 11:13 AM
    How can we add a library to the Docker quickstart image? PipelineInitError: Failed to configure the source (mssql): No module named ‘pyodbc’
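    One common workaround is to extend the actions image with the missing Python module and point the quickstart at the custom image; a sketch under assumptions (the base image tag, the package manager, and which compose service runs your ingestion are all things to verify for your setup):
    Copy code
    # Sketch: build a custom actions image that includes pyodbc.
    cat > Dockerfile.actions <<'EOF'
    FROM acryldata/datahub-actions:head
    # pyodbc needs the system ODBC headers; package manager may differ by base image.
    RUN apt-get update && apt-get install -y unixodbc-dev && pip install pyodbc
    EOF
    docker build -f Dockerfile.actions -t my-datahub-actions:latest .
    # Then point the quickstart compose file's actions service at my-datahub-actions:latest.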

    best-wire-59738

    03/01/2023, 1:11 PM
    Hello team, we are facing a Kafka rebalancing issue. We set KAFKA_LISTENER_CONCURRENCY to 10 and increased partitions to 10 to make consumers work in parallel, since we use the Kafka sink and ingested data takes a long time to show up in neo4j. We confirmed that we run into group rebalancing whenever concurrency is set to anything other than 1. We reduced max.poll.records to 10 (default 500) thinking it might resolve the issue, but it didn't help. Could you please help us? We are currently on datahub version 0.9.2, using the MSK service from AWS.
    19:43:59 [ThreadPoolTaskExecutor-2] INFO  o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-9, groupId=generic-mae-consumer-job-client] Finished assignment for group at generation 5728: {consumer-generic-mae-consumer-job-client-9-fa499c12-3ac0-440b-aa27-711c3b60c14d=Assignment(partitions=[MetadataChangeLog_Timeseries_v1-5, MetadataChangeLog_Timeseries_v1-6, MetadataChangeLog_Timeseries_v1-7, MetadataChangeLog_Timeseries_v1-8, MetadataChangeLog_Timeseries_v1-9, MetadataChangeLog_Versioned_v1-5, MetadataChangeLog_Versioned_v1-6, MetadataChangeLog_Versioned_v1-7, MetadataChangeLog_Versioned_v1-8, MetadataChangeLog_Versioned_v1-9]), consumer-generic-mae-consumer-job-client-10-a8633ab0-830e-4e75-9d1c-593e100e1505=Assignment(partitions=[MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Timeseries_v1-1, MetadataChangeLog_Timeseries_v1-2, MetadataChangeLog_Timeseries_v1-3, MetadataChangeLog_Timeseries_v1-4, MetadataChangeLog_Versioned_v1-0, MetadataChangeLog_Versioned_v1-1, MetadataChangeLog_Versioned_v1-2, MetadataChangeLog_Versioned_v1-3, MetadataChangeLog_Versioned_v1-4])}
    19:43:59 [ThreadPoolTaskExecutor-3] INFO  o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-10, groupId=generic-mae-consumer-job-client] Successfully joined group with generation 5728
    19:43:59 [ThreadPoolTaskExecutor-2] INFO  o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-9, groupId=generic-mae-consumer-job-client] Successfully joined group with generation 5728
    19:47:45 [ThreadPoolTaskExecutor-1] WARN  o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-8, groupId=generic-mae-consumer-job-client] Synchronous auto-commit of offsets {MetadataChangeLog_Timeseries_v1-1=OffsetAndMetadata{offset=0, leaderEpoch=null, metadata=''}, MetadataChangeLog_Versioned_v1-4=OffsetAndMetadata{offset=8, leaderEpoch=0, metadata=''}, MetadataChangeLog_Timeseries_v1-0=OffsetAndMetadata{offset=0, leaderEpoch=null, metadata=''}, MetadataChangeLog_Versioned_v1-3=OffsetAndMetadata{offset=1, leaderEpoch=0, metadata=''}, MetadataChangeLog_Timeseries_v1-3=OffsetAndMetadata{offset=0, leaderEpoch=null, metadata=''}, MetadataChangeLog_Versioned_v1-2=OffsetAndMetadata{offset=168196, leaderEpoch=0, metadata=''}, MetadataChangeLog_Timeseries_v1-2=OffsetAndMetadata{offset=3, leaderEpoch=null, metadata=''}, MetadataChangeLog_Versioned_v1-1=OffsetAndMetadata{offset=189482, leaderEpoch=0, metadata=''}, MetadataChangeLog_Versioned_v1-0=OffsetAndMetadata{offset=168795, leaderEpoch=0, metadata=''}, MetadataChangeLog_Timeseries_v1-4=OffsetAndMetadata{offset=0, leaderEpoch=null, metadata=''}} failed: Offset commit cannot be completed since the consumer is not part of an active group for auto partition assignment; it is likely that the consumer was kicked out of the group.
    19:47:45 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-8, groupId=generic-mae-consumer-job-client] Giving away all assigned partitions as lost since generation has been reset,indicating that consumer is no longer part of the group
    19:47:45 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-8, groupId=generic-mae-consumer-job-client] Lost previously assigned partitions MetadataChangeLog_Timeseries_v1-1, MetadataChangeLog_Versioned_v1-4, MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Versioned_v1-3, MetadataChangeLog_Timeseries_v1-3, MetadataChangeLog_Versioned_v1-2, MetadataChangeLog_Timeseries_v1-2, MetadataChangeLog_Versioned_v1-1, MetadataChangeLog_Versioned_v1-0, MetadataChangeLog_Timeseries_v1-4
    19:47:45 [ThreadPoolTaskExecutor-1] INFO  o.s.k.l.KafkaMessageListenerContainer - generic-mae-consumer-job-client: partitions lost: [MetadataChangeLog_Timeseries_v1-1, MetadataChangeLog_Versioned_v1-4, MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Versioned_v1-3, MetadataChangeLog_Timeseries_v1-3, MetadataChangeLog_Versioned_v1-2, MetadataChangeLog_Timeseries_v1-2, MetadataChangeLog_Versioned_v1-1, MetadataChangeLog_Versioned_v1-0, MetadataChangeLog_Timeseries_v1-4]
    19:47:45 [ThreadPoolTaskExecutor-1] INFO  o.s.k.l.KafkaMessageListenerContainer - generic-mae-consumer-job-client: partitions revoked: [MetadataChangeLog_Timeseries_v1-1, MetadataChangeLog_Versioned_v1-4, MetadataChangeLog_Timeseries_v1-0, MetadataChangeLog_Versioned_v1-3, MetadataChangeLog_Timeseries_v1-3, MetadataChangeLog_Versioned_v1-2, MetadataChangeLog_Timeseries_v1-2, MetadataChangeLog_Versioned_v1-1, MetadataChangeLog_Versioned_v1-0, MetadataChangeLog_Timeseries_v1-4]
    19:47:45 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-8, groupId=generic-mae-consumer-job-client] (Re-)joining group
    19:47:45 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-8, groupId=generic-mae-consumer-job-client] Join group failed with org.apache.kafka.common.errors.MemberIdRequiredException: The group member needs to have a valid member id before actually entering a consumer group
    19:47:45 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-8, groupId=generic-mae-consumer-job-client] (Re-)joining group
    19:47:47 [ThreadPoolTaskExecutor-2] INFO  o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-9, groupId=generic-mae-consumer-job-client] Attempt to heartbeat failed since group is rebalancing
    19:47:47 [ThreadPoolTaskExecutor-2] INFO  o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-9, groupId=generic-mae-consumer-job-client] Revoke previously assigned partitions MetadataChangeLog_Timeseries_v1-7, MetadataChangeLog_Timeseries_v1-6, MetadataChangeLog_Timeseries_v1-9, MetadataChangeLog_Timeseries_v1-8, MetadataChangeLog_Versioned_v1-9, MetadataChangeLog_Versioned_v1-8, MetadataChangeLog_Versioned_v1-7, MetadataChangeLog_Versioned_v1-6, MetadataChangeLog_Versioned_v1-5, MetadataChangeLog_Timeseries_v1-5
    19:47:47 [ThreadPoolTaskExecutor-2] INFO  o.s.k.l.KafkaMessageListenerContainer - generic-mae-consumer-job-client: partitions revoked: [MetadataChangeLog_Timeseries_v1-7, MetadataChangeLog_Timeseries_v1-6, MetadataChangeLog_Timeseries_v1-9, MetadataChangeLog_Timeseries_v1-8, MetadataChangeLog_Versioned_v1-9, MetadataChangeLog_Versioned_v1-8, MetadataChangeLog_Versioned_v1-7, MetadataChangeLog_Versioned_v1-6, MetadataChangeLog_Versioned_v1-5, MetadataChangeLog_Timeseries_v1-5]
    19:47:47 [ThreadPoolTaskExecutor-2] INFO  o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-9, groupId=generic-mae-consumer-job-client] (Re-)joining group
    19:47:47 [kafka-coordinator-heartbeat-thread | generic-mae-consumer-job-client] INFO  o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-10, groupId=generic-mae-consumer-job-client] Attempt to heartbeat failed since group is rebalancing
    19:47:50 [kafka-coordinator-heartbeat-thread | generic-mae-consumer-job-client] INFO  o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-10, groupId=generic-mae-consumer-job-client] Attempt to heartbeat failed since group is rebalancing
    19:47:56 [kafka-coordinator-heartbeat-thread | generic-mae-consumer-job-client] INFO  o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-10, groupId=generic-mae-consumer-job-client] Attempt to heartbeat failed since group is rebalancing
    19:48:59 [kafka-coordinator-heartbeat-thread | generic-mae-consumer-job-client] INFO  o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-generic-mae-consumer-job-client-10, groupId=generic-mae-consumer-job-client] Member consumer-generic-mae-consumer-job-client-10-a8633ab0-830e-4e75-9d1c-593e100e1505 sending LeaveGroup request to coordinator b-1.datahubmsk.6wyvlb.c8.kafka.us-west-2.amazonaws.com:9092 (id: 2147483646 rack: null) due to consumer poll timeout has expired. This means the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time processing messages. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
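    The last log line suggests the poll interval is being exceeded, which matches the rebalance symptoms above. A sketch of the usual mitigation, assuming the documented convention that DataHub maps SPRING_KAFKA_PROPERTIES_* environment variables onto Kafka client properties (worth verifying for v0.9.2):
    Copy code
    # Sketch: give each poll loop more time so slow downstream writes (neo4j)
    # don't get the consumer kicked out of the group.
    export SPRING_KAFKA_PROPERTIES_MAX_POLL_INTERVAL_MS=600000   # Kafka default is 300000
    export SPRING_KAFKA_PROPERTIES_MAX_POLL_RECORDS=10
    In a helm deployment these would go into the consumer's extraEnvs rather than a shell.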

    creamy-machine-95935

    03/01/2023, 9:41 PM
    Hi everyone! Is there any example of how to deploy datahub on kubernetes using Terraform? thanks! 🙌

    witty-toddler-69828

    03/02/2023, 9:40 AM
    Hello everyone, I'm looking to deploy datahub to AWS using ECS and the Opensearch / Elasticsearch service. I'm finding that the GMS container is starting up successfully and running, but it doesn't seem to be listening on port 8080. Checking the GMS logs, I'm getting lots of Elasticsearch errors and wondering if that may be the issue. This is one of them:
    {
      "error": {
        "root_cause": [
          {
            "type": "index_not_found_exception",
            "reason": "no such index [datahubpolicyindex_v2]",
            "resource.type": "index_or_alias",
            "resource.id": "datahubpolicyindex_v2",
            "index_uuid": "_na_",
            "index": "datahubpolicyindex_v2"
          }
        ],
        "type": "index_not_found_exception",
        "reason": "no such index [datahubpolicyindex_v2]",
        "resource.type": "index_or_alias",
        "resource.id": "datahubpolicyindex_v2",
        "index_uuid": "_na_",
        "index": "datahubpolicyindex_v2"
      },
      "status": 404
    }
    I've checked and the index mentioned doesn't exist. I ran the datahub-elasticsearch-setup task, which seems to run successfully, but it doesn't create the index that the GMS task is expecting. The indexes look like:
    health status index                          uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    green  open   .kibana_1                      -ePAdnrdRzii8blPjg6iqA   1   0          1            0        5kb            5kb
    yellow open   .opendistro-job-scheduler-lock 4Uf-Vr2jTAi6jm5AAuP_5g   5   1          1            0     12.1kb         12.1kb
    yellow open   datahub_usage_event-000001     8JEBpkYDQX6z5XVZXvYFuQ   5   1          0            0        1kb            1kb
    Does anyone know why the index that GMS is looking for doesn't match what is created? Is it right that the http service wouldn't be available if the index isn't there or should I be looking elsewhere for the issue?

    aloof-dentist-85908

    03/02/2023, 9:58 AM
    Hi, does anyone know when the Confluent schema registry will be removed as a hard dependency? Is there any plan for when this will be released? https://github.com/datahub-project/datahub/pull/6552 @incalculable-ocean-74010 Do you have any news for us? 🙂 Thanks a lot!
    ✅ 1
    i
    a
    • 3
    • 4
  • m

    microscopic-machine-90437

    03/02/2023, 12:06 PM
Hi Team, I'm trying to deploy DataHub using Kubernetes, for which I need to install minikube. When I try to install minikube on my Linux server, I get the below error. Can someone help...!
    ✅ 1
    g
    • 2
    • 1
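The error itself didn't come through in the archive, but for reference, the documented way to install minikube on an x86_64 Linux host (commands from the minikube install docs; the resource flags are a suggestion, not a requirement) is:

```shell
# Download the latest minikube release binary and put it on PATH
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

# DataHub needs a fairly large cluster; give minikube generous resources
minikube start --memory 8192 --cpus 4
```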
  • s

    silly-angle-91497

    03/02/2023, 5:00 PM
    Hello everyone. When we are using AWS MSK with IAM authentication, how are we supposed to configure this in the prerequisites values.yaml file?
    b
    • 2
    • 2
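One sketch for MSK with IAM auth (the property names come from AWS's aws-msk-iam-auth client library; the exact placement of the overrides in values.yaml depends on your chart version, so verify before applying): point the brokers at the IAM listener port 9098 and pass the IAM client properties through the Kafka configuration overrides:

```yaml
kafka:
  bootstrap:
    # Placeholder broker hostname; use your MSK IAM endpoint on port 9098
    server: "b-1.your-cluster.xxxxxx.kafka.us-east-1.amazonaws.com:9098"

global:
  springKafkaConfigurationOverrides:
    security.protocol: SASL_SSL
    sasl.mechanism: AWS_MSK_IAM
    sasl.jaas.config: software.amazon.msk.auth.iam.IAMLoginModule required;
    sasl.client.callback.handler.class: software.amazon.msk.auth.iam.IAMClientCallbackHandler
```

The pods also need IAM permissions for the relevant kafka-cluster actions, e.g. via an IAM role for service accounts.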
  • f

    famous-fall-59477

    03/02/2023, 6:42 PM
    Hi, when I try to do a
    ./gradlew build
, I get the following yarn-related error:
    > Task :datahub-web-react:yarnGenerate FAILED
    yarn run v1.22.0
    $ graphql-codegen --config codegen.yml
    node:internal/modules/cjs/loader:936
      throw err;
      ^
    
    Error: Cannot find module './_baseClone'
    Require stack:
    - /Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/lodash/clone.js
    - /Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/graphql-tag-pluck/node_modules/@babel/types/lib/builders/builder.js
    - /Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/graphql-tag-pluck/node_modules/@babel/types/lib/builders/generated/index.js
    - /Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/graphql-tag-pluck/node_modules/@babel/types/lib/utils/react/cleanJSXElementLiteralChild.js
    - /Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/graphql-tag-pluck/node_modules/@babel/types/lib/builders/react/buildChildren.js
    - /Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/graphql-tag-pluck/node_modules/@babel/types/lib/index.js
    - /Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/graphql-tag-pluck/index.cjs.js
    - /Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/code-file-loader/index.cjs.js
    - /Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-codegen/cli/bin.js
        at Function.Module._resolveFilename (node:internal/modules/cjs/loader:933:15)
        at Function.Module._load (node:internal/modules/cjs/loader:778:27)
        at Module.require (node:internal/modules/cjs/loader:1005:19)
        at require (node:internal/modules/cjs/helpers:94:18)
        at Object.<anonymous> (/Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/lodash/clone.js:1:17)
        at Module._compile (node:internal/modules/cjs/loader:1101:14)
        at Object.Module._extensions..js (node:internal/modules/cjs/loader:1153:10)
        at Module.load (node:internal/modules/cjs/loader:981:32)
        at Function.Module._load (node:internal/modules/cjs/loader:822:12)
        at Module.require (node:internal/modules/cjs/loader:1005:19) {
      code: 'MODULE_NOT_FOUND',
      requireStack: [
        '/Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/lodash/clone.js',
        '/Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/graphql-tag-pluck/node_modules/@babel/types/lib/builders/builder.js',
        '/Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/graphql-tag-pluck/node_modules/@babel/types/lib/builders/generated/index.js',
        '/Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/graphql-tag-pluck/node_modules/@babel/types/lib/utils/react/cleanJSXElementLiteralChild.js',
        '/Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/graphql-tag-pluck/node_modules/@babel/types/lib/builders/react/buildChildren.js',
        '/Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/graphql-tag-pluck/node_modules/@babel/types/lib/index.js',
        '/Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/graphql-tag-pluck/index.cjs.js',
        '/Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-tools/code-file-loader/index.cjs.js',
        '/Users/subhajoy/Documents/personal_repositories/datahub/datahub-web-react/node_modules/@graphql-codegen/cli/bin.js'
      ]
    }
    error Command failed with exit code 1.
    info Visit <https://yarnpkg.com/en/docs/cli/run> for documentation about this command.
    I am on a Mac (non M1), and
    java --version
    gives:
    openjdk 11.0.18 2023-01-17
    OpenJDK Runtime Environment Homebrew (build 11.0.18+0)
    OpenJDK 64-Bit Server VM Homebrew (build 11.0.18+0, mixed mode)
Any idea why this is happening? I noticed several posts here about roughly the same issue, but I could not find any meaningful resolution yet. Any help would be appreciated, thank you!
    lookaround 3
    a
    e
    i
    • 4
    • 3
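This kind of module-not-found inside lodash/@babel during yarnGenerate usually points at a corrupted or partially installed node_modules rather than a code problem. A common fix (a sketch, assuming yarn is on PATH and you are at the repo root) is to wipe and reinstall the web-react dependencies before rebuilding:

```shell
cd datahub-web-react
rm -rf node_modules
yarn cache clean
yarn install
cd ..
./gradlew :datahub-web-react:build
```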
  • c

    cuddly-arm-8412

    03/03/2023, 5:38 AM
hi team, I downloaded the project locally, and when I run the command python3 -m datahub docker quickstart --quickstart-compose-file /project/github-datahub/docker/docker-compose.yml I get: Error response from daemon: manifest for elasticsearch:7.10.2 not found: manifest unknown: manifest unknown
    b
    o
    • 3
    • 5
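The compose file references the bare image name elasticsearch:7.10.2, which Docker resolves against Docker Hub; that tag may not exist there, while Elastic's own registry does carry it. One workaround (a sketch; verify the exact tag your compose file asks for) is to pull the fully qualified image and retag it locally so the bare name resolves:

```shell
docker pull docker.elastic.co/elasticsearch/elasticsearch:7.10.2
docker tag docker.elastic.co/elasticsearch/elasticsearch:7.10.2 elasticsearch:7.10.2
```

Alternatively, edit the compose file to use the docker.elastic.co/elasticsearch/elasticsearch:7.10.2 name directly.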
  • s

    shy-dog-84302

    03/03/2023, 2:24 PM
Hi! I'm looking for some advice on Stackdriver log integration of DataHub components in GCP. Is there any way I can configure logging for the various components, like the metadata service, front-end, etc.?
    b
    • 2
    • 14
  • r

    rapid-spoon-75609

    03/03/2023, 4:45 PM
Hello! Is there a way to run custom actions from the DataHub helm chart? I see that a container is running when I deploy, but there is no documentation describing how it works:
    datahub-acryl-datahub-actions-58b676f77c-c6pfx
    What is the purpose of this subchart? It’s enabled by default but I can’t find info on how to use it: https://github.com/acryldata/datahub-helm/tree/master/charts/datahub/subcharts/acryl-datahub-actions Thanks!
    s
    • 2
    • 15
  • w

    white-horse-97256

    03/03/2023, 6:22 PM
Hi Team, we are trying to deploy a new DataHub instance using the helm charts approach, and we see there is a schema registry URL in the values.yaml file with type KAFKA:
    schemaregistry:
      url: "<http://prerequisites-cp-schema-registry:8081>"
      type: KAFKA
What does type: KAFKA mean? We are trying to use our own Kafka servers and wanted to check what schema registry URL we should give. How do we provide the creds required to authenticate to our servers?
    a
    b
    • 3
    • 7
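For context: the type field selects which schema registry implementation the Kafka clients use. KAFKA means a Confluent-compatible registry reachable at url, and AWS_GLUE is the other option the chart supports. A sketch for pointing at your own Confluent-compatible registry with basic auth (the override property names are standard Confluent client settings, but their placement under global.springKafkaConfigurationOverrides is an assumption to verify against your chart version):

```yaml
kafka:
  schemaregistry:
    # Placeholder URL; use your registry's HTTPS endpoint
    url: "https://your-schema-registry.example.com:8081"
    type: KAFKA

global:
  springKafkaConfigurationOverrides:
    basic.auth.credentials.source: USER_INFO
    basic.auth.user.info: "registry-user:registry-password"
```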
  • w

    white-horse-97256

    03/03/2023, 8:40 PM
Hi Team, I am also facing an issue connecting to the ES server. Our ES server uses HTTPS with self-signed certificates; how do we configure those in the values.yaml file for ES in the helm charts?
    a
    a
    +2
    • 5
    • 53
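For reference, a sketch of wiring up an HTTPS Elasticsearch with a self-signed certificate: enable SSL via the global elasticsearch settings and point GMS at a truststore containing your CA through environment variables (GMS supports the ELASTICSEARCH_SSL_* variables; the exact values.yaml keys and paths below are assumptions to verify against your chart version):

```yaml
global:
  elasticsearch:
    host: "your-es-host.internal"   # placeholder hostname
    port: "9200"
    useSSL: "true"

datahub-gms:
  extraEnvs:
    - name: ELASTICSEARCH_SSL_TRUSTSTORE_FILE
      value: /mnt/datahub/certs/truststore.jks   # assumed mount path
    - name: ELASTICSEARCH_SSL_TRUSTSTORE_TYPE
      value: JKS
    - name: ELASTICSEARCH_SSL_TRUSTSTORE_PASSWORD
      value: changeit
```

The truststore itself would still need to be mounted into the pod, e.g. from a Kubernetes Secret via extraVolumes/extraVolumeMounts.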