# all-things-deployment
  • m

    modern-belgium-81337

    06/09/2022, 11:15 PM
    Hi team, is there a recommended way to convert the helm charts to Kustomize charts? I want to deploy the dependencies with Kustomize but am unsure of the best way to get started
  • t

    tall-fall-45442

    06/10/2022, 12:17 AM
    I am trying to use AWS resources in my DataHub deployment and am following the instructions in the documentation. However, when I run
    helm upgrade --install datahub datahub/datahub --values values.yaml --debug
    (where I've placed all the AWS configuration details in the values.yaml file), I get the following output:
    history.go:56: [debug] getting history for release datahub
    upgrade.go:139: [debug] preparing upgrade for datahub
    upgrade.go:147: [debug] performing update for datahub
    upgrade.go:319: [debug] creating upgraded release for datahub
    client.go:299: [debug] Starting delete for "datahub-elasticsearch-setup-job" Job
    client.go:128: [debug] creating 1 resource(s)
    client.go:528: [debug] Watching for changes to Job datahub-elasticsearch-setup-job with timeout of 5m0s
    client.go:556: [debug] Add/Modify event for datahub-elasticsearch-setup-job: ADDED
    client.go:595: [debug] datahub-elasticsearch-setup-job: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
    client.go:556: [debug] Add/Modify event for datahub-elasticsearch-setup-job: MODIFIED
    client.go:299: [debug] Starting delete for "datahub-kafka-setup-job" Job
    client.go:128: [debug] creating 1 resource(s)
    client.go:528: [debug] Watching for changes to Job datahub-kafka-setup-job with timeout of 5m0s
    client.go:556: [debug] Add/Modify event for datahub-kafka-setup-job: ADDED
    client.go:595: [debug] datahub-kafka-setup-job: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
    upgrade.go:430: [debug] warning: Upgrade "datahub" failed: pre-upgrade hooks failed: timed out waiting for the condition
    Error: UPGRADE FAILED: pre-upgrade hooks failed: timed out waiting for the condition
    helm.go:88: [debug] pre-upgrade hooks failed: timed out waiting for the condition
    UPGRADE FAILED
    main.newUpgradeCmd.func2
            helm.sh/helm/v3/cmd/helm/upgrade.go:202
    github.com/spf13/cobra.(*Command).execute
            github.com/spf13/cobra@v1.2.1/command.go:856
    github.com/spf13/cobra.(*Command).ExecuteC
            github.com/spf13/cobra@v1.2.1/command.go:974
    github.com/spf13/cobra.(*Command).Execute
            github.com/spf13/cobra@v1.2.1/command.go:902
    main.main
            helm.sh/helm/v3/cmd/helm/helm.go:87
    runtime.main
            runtime/proc.go:255
    runtime.goexit
            runtime/asm_amd64.s:1581
    Where can I find information about the pre-upgrade hooks? I don't seem to see it mentioned in the deployment documentation.
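The debug output above already hints at where the hook stalls: Helm runs its pre-upgrade hook Jobs one after another, so the last Job it starts watching before the timeout is the one that never completed. A minimal sketch of pulling that out of a saved log, assuming the `Watching for changes to Job ... with timeout of ...` line format shown above; `last_watched_job` is a hypothetical helper name, not part of Helm:

```python
import re
from typing import Optional, Tuple

# Matches lines such as:
#   client.go:528: [debug] Watching for changes to Job datahub-kafka-setup-job with timeout of 5m0s
WATCH_RE = re.compile(r"Watching for changes to Job (\S+) with timeout of (\S+)")


def last_watched_job(helm_debug_log: str) -> Optional[Tuple[str, str]]:
    """Return (job_name, timeout) for the last hook Job Helm started watching."""
    matches = WATCH_RE.findall(helm_debug_log)
    return matches[-1] if matches else None


# Sample lines copied from the output quoted in the message above.
log = """\
client.go:528: [debug] Watching for changes to Job datahub-elasticsearch-setup-job with timeout of 5m0s
client.go:556: [debug] Add/Modify event for datahub-elasticsearch-setup-job: MODIFIED
client.go:528: [debug] Watching for changes to Job datahub-kafka-setup-job with timeout of 5m0s
upgrade.go:430: [debug] warning: Upgrade "datahub" failed: pre-upgrade hooks failed: timed out waiting for the condition
"""
print(last_watched_job(log))
```

For this log the result points at `datahub-kafka-setup-job`, so that Job's pod logs are the natural next place to look.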
  • q

    quick-megabyte-61846

    06/10/2022, 7:16 AM
    Hello, I'm not sure this is the right channel, but has anybody managed to integrate DataHub with dbt-expectations?
  • t

    tall-fall-45442

    06/10/2022, 5:26 PM
    If I plan to deploy DataHub using AWS resources that have already been provisioned, do I still need to go through the process of installing datahub-prerequisites?
  • c

    creamy-artist-77191

    06/10/2022, 9:20 PM
    Hey there, very new to DataHub. I've successfully ingested data from BigQuery locally. As a next step, I deployed DataHub using GKE, and everything seems to be working as intended. What is the best way to go about removing the default datahub user, then adding and inviting new users?
  • t

    tall-fall-45442

    06/12/2022, 4:55 AM
    I am now having issues setting up Kafka for DataHub. I am using AWS MSK with SASL/SCRAM. After doing some searching I see that I need to make the following update to my values.yaml file:
    global:
      springKafkaConfigurationOverrides:
        security.protocol: SASL_SSL
        sasl.mechanism: SCRAM-SHA-512
        sasl.jaas.config: org.apache.kafka.common.security.scram.ScramLoginModule required username="username" password="password";
    However, with only these changes the datahub-kafka-setup-job fails, and I see the following in the logs:
    org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
            at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:535)
            at org.apache.kafka.clients.admin.Admin.create(Admin.java:75)
            at org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:49)
            at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:138)
            at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
    Caused by: java.lang.IllegalArgumentException: Could not find a 'KafkaClient' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set
            at org.apache.kafka.common.security.JaasContext.defaultContext(JaasContext.java:131)
            at org.apache.kafka.common.security.JaasContext.load(JaasContext.java:96)
            at org.apache.kafka.common.security.JaasContext.loadClientContext(JaasContext.java:82)
            at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:134)
            at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:73)
            at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:105)
            at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:508)
            ... 4 more
    Does anyone know if there is something that I am missing or did incorrectly?
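For reference, an inline `sasl.jaas.config` value is a single JAAS entry that has to end with a semicolon. A minimal sketch of assembling it for SCRAM in plain Python; `scram_jaas_config` is a hypothetical helper, the credentials are placeholders, and the sketch deliberately refuses values containing double quotes rather than attempting to escape them:

```python
def scram_jaas_config(username: str, password: str) -> str:
    """Build the value of Kafka's `sasl.jaas.config` client property for SCRAM."""
    for value in (username, password):
        # Assumption for this sketch: no escaping is attempted, so reject quotes.
        if '"' in value:
            raise ValueError("double quotes are not supported by this sketch")
    return (
        "org.apache.kafka.common.security.scram.ScramLoginModule required "
        f'username="{username}" password="{password}";'
    )


print(scram_jaas_config("username", "password"))
```

The exception quoted above is what the Kafka client raises when it finds neither this property nor a JAAS login file, so it may be worth checking whether the setup job actually receives the value.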
  • b

    breezy-controller-54597

    06/13/2022, 12:48 AM
    elasticsearch-setup-job cannot connect to elasticsearch-master.
    2022/06/13 00:38:01 Waiting for: http://elasticsearch-master:9200
    2022/06/13 00:38:05 Problem with request: Get http://elasticsearch-master:9200: dial tcp: lookup elasticsearch-master on 10.43.0.10:53: server misbehaving. sleeping 1s
    …
    2022/06/13 01:08:48 Timeout after 2m0s waiting on dependencies to become available: [http://elasticseach-master:9200]
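The `lookup elasticsearch-master on 10.43.0.10:53: server misbehaving` line reads as a DNS failure rather than an Elasticsearch one: the name is not resolving to an address. A minimal sketch of checking resolution from inside a pod with the Python standard library; `can_resolve` is a hypothetical helper and the default port 9200 is taken from the log above:

```python
import socket


def can_resolve(host: str, port: int = 9200) -> bool:
    """Return True if `host` resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(host, port)) > 0
    except socket.gaierror:
        # Name resolution failed (unknown host, DNS server error, ...).
        return False
```

Note that the timeout line spells the host `elasticseach-master` while the earlier lines use `elasticsearch-master`, so calling `can_resolve` with both spellings from a pod in the same namespace would show whether either name is known to the cluster DNS.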
  • m

    mammoth-fountain-32989

    06/13/2022, 5:24 AM
    Hi, any pointers on deploying DataHub with high availability? Please share the steps if they are available or if someone has already done this. Thanks in advance.
  • b

    breezy-controller-54597

    06/13/2022, 5:51 AM
    Using the helm chart, I was able to deploy DataHub, create an L7 ingress, and access the frontend from outside the cluster. 🎉 I created an L7 ingress for gms as well, but could not access it. Do I need to do anything special to access gms from outside the cluster?
  • c

    creamy-van-28626

    06/13/2022, 7:39 AM
    Hi team, what is the value of the build platform for the frontend image?
  • n

    numerous-bird-27004

    06/13/2022, 5:22 PM
    When will v0.8.38 be available in helm chart for deployment?
  • b

    big-carpet-38439

    06/14/2022, 4:37 PM
    @bitter-dog-24903 Let's chat more here!
  • f

    fancy-thailand-73281

    06/14/2022, 9:38 PM
    Hi all, from https://datahubproject.io/docs/deploy/aws it looks like the Ingress service is deployed only through OIDC. Do we have any options other than OIDC?
  • b

    better-orange-49102

    06/15/2022, 3:48 AM
    Got a question about safeguarding against deletes. Currently, we allow other teams to ingest datasets via the REST endpoint with metadata-authentication enabled. They can ingest via https://<url>/api/gms using their recipe and token, as we did not expose GMS, only the frontend. But we also noticed that the datahub delete command is accessible to the other parties as well. Just wondering how you guys would go about blocking, or at least logging, people who send hard delete commands? Some kind of IP whitelist/blacklist for the Ingress and endpoint combination? Is that possible?
  • f

    fresh-napkin-5247

    06/15/2022, 12:53 PM
    Hello all. We are trying to spin up a minimal version of DataHub on ECS to let our BA team try the tool. We intend to use the AWS managed services for MySQL and Elasticsearch, and run the gms and the frontend in ECS containers. So in total, we would have 4 services running in this ‘minimal’ setup. Given that we are looking to ingest the data via HTTP (at least at this stage), can we skip Kafka entirely? Also, what would be the advantages of using neo4j over Elasticsearch? Is there someone in here with a similar deployment architecture? Thank you! 🙂
  • b

    bland-orange-13353

    06/15/2022, 5:43 PM
    This message was deleted.
  • m

    mysterious-portugal-30527

    06/15/2022, 8:16 PM
    Hi All! I have been digging into the security vulnerabilities around the DataHub canned images. It looks to me like all of the remaining vulnerabilities, at least the severe ones, come from the use of Docker’s default Python image, which appears to be based on a vulnerable Debian base Linux distro. These vulnerabilities were identified at least a year-ish ago, and as far as I can tell whoever owns this image just does not seem to care about it. I think as an org / community DataHub should care about this! (One guy’s opinion!!) So IMHO a better base Python image is needed. Not sure of the best way to make that happen, but there it is.
  • r

    rapid-book-98432

    06/16/2022, 10:27 AM
    Hi there 🙂 How do you know which version of the datahub/datahub-prerequisites helm chart matches which datahub/datahub chart version? There is datahub/datahub-prerequisites 0.0.6 / 0.0.5 / 0.0.4 ..., and the latest datahub versions are 0.8.38 / 0.8.36, with multiple chart versions as well. Thanks if you have any advice.
  • m

    mammoth-fountain-32989

    06/16/2022, 10:28 AM
    Hi, can we integrate LDAP/Kerberos authentication for users of DataHub? Please point me to reference docs if it can be done. Thanks.
  • c

    creamy-van-28626

    06/16/2022, 2:09 PM
    Hi, I have installed datahub on top of a dbt edp image using pip install acryl-datahub==0.8.38, but I am still getting the error: No module named - datahub provider
  • f

    faint-translator-23365

    06/16/2022, 8:48 PM
    Hi, what's the best way to enable SSL for the datahub frontend? I don't see any configuration for adding certificates in the helm chart, and there isn't an option to add sidecar containers in the values.yaml of the datahub frontend helm chart. (I wanted to run nginx as a sidecar to enable SSL.)
  • b

    bright-cpu-56427

    06/17/2022, 5:24 AM
    I want to create custom DataHub datasets and custom charts using Python. Is there any way to do this?
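Entities in DataHub are addressed by URNs, so a first step is usually constructing those identifiers. A minimal sketch in plain Python, assuming the `urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)` and `urn:li:chart:(<tool>,<chart_id>)` layouts; the helper names and the example platform, table, and chart ids are hypothetical, and in practice the `acryl-datahub` package's own builders and REST emitter would likely take over from here:

```python
def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a dataset URN from a platform id, dataset name, and environment."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def chart_urn(tool: str, chart_id: str) -> str:
    """Build a chart URN from a dashboard tool id and a chart id."""
    return f"urn:li:chart:({tool},{chart_id})"


# Hypothetical example values, for illustration only.
print(dataset_urn("bigquery", "my_project.my_dataset.my_table"))
print(chart_urn("custom_dashboards", "weekly_signups"))
```

Metadata aspects such as properties or ownership are then attached to these URNs when they are emitted to GMS.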
  • g

    gray-architect-29447

    06/17/2022, 7:49 AM
    Hello guys, I searched for this but cannot find the answer. How do I get the count of active users in a given time period? I mean users that have logged in to the system at least once. I hope this kind of data is stored in the Elasticsearch db.
  • c

    creamy-church-10353

    06/17/2022, 9:37 AM
    Hi, my DataHub setup is running on EKS, using AWS Elasticsearch 7.10, and also using it as a backend for the GMS graph service by setting global.graph_service_impl = elasticsearch in the helm chart. I'm getting errors while accessing data in the DataHub UI, e.g. on the Analytics page. I have attached the corresponding errors produced by the GMS service (see datahub-gms.log). Any help or hint would be appreciated.
    datahub-gms.log
  • b

    bitter-lizard-32293

    06/17/2022, 2:14 PM
    hey folks, are there any rough guidelines on scaling GMS instances based on ingest load? We started to dial up traffic from our ingest systems -> gms (currently all over http) and we're doing around 60 calls / min to 570 calls / min. When we see our spikes, we notice that our ingest latencies also tend to go up from 200ms to 3s. This is being handled by 3 GMS instances and a 12 data node + 3 master + 3 router ES cluster. Trying to get a better sense for where the bottlenecks might be to inform the right scaling. I looked at our DB metrics and they seem healthy (low latencies, low cpu, well below conn count limit). I suspect the bottlenecks might be on the search cluster or the GMS service. Trying to dig in to figure out which and as I'm hunting through the JMX metrics we push, would be great to get pointers from folks who might already know
  • h

    helpful-processor-71693

    06/19/2022, 10:17 AM
    I am facing issues setting up Kafka for DataHub. I have used the following configuration in my values.yaml file to connect to my AWS MSK cluster with SASL/SCRAM:
    kafka:
        bootstrap:
          server: "bootstrap1:9096,bootstrap2:9096,bootstrap3:9096"
        zookeeper:
          server: "zk1:2181,zk2:2181,zk3:2181"
    global:
    credentialsAndCertsSecrets:
        name: sasl-jass-config
        secureEnv:
          sasl.jaas.config: sasl_jaas_config
    
    
    springKafkaConfigurationOverrides:
       security.protocol: SASL_SSL
       sasl.mechanism: SCRAM-SHA-512
    My secrets file contains the jaas.config file content as follows:
    org.apache.kafka.common.security.scram.ScramLoginModule required username="xxxxxxxxxx" password="xxxxxxxxxxxx";
    I have verified that my MSK cluster is already configured with SASL/SCRAM, but datahub-kafka-setup-job is still failing with the following error:
    [main] ERROR io.confluent.admin.utils.cli.KafkaReadyCommand - Error while running kafka-ready.
    org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
    	at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:535)
    	at org.apache.kafka.clients.admin.Admin.create(Admin.java:75)
    	at org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:49)
    	at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:138)
    	at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
    Caused by: java.lang.IllegalArgumentException: Could not find a 'KafkaClient' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set
    	at org.apache.kafka.common.security.JaasContext.defaultContext(JaasContext.java:131)
    	at org.apache.kafka.common.security.JaasContext.load(JaasContext.java:96)
    	at org.apache.kafka.common.security.JaasContext.loadClientContext(JaasContext.java:82)
    	at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:134)
    	at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:73)
    	at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:105)
    	at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:508)
    Can someone please tell me if I am missing anything specific in the configuration?
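For comparison, the `credentialsAndCertsSecrets` block above appears to refer to a Kubernetes Secret named `sasl-jass-config` with a key `sasl_jaas_config`. A minimal sketch of what such a Secret manifest would contain, generated with the Python standard library; `secret_manifest` is a hypothetical helper, the credentials are the masked placeholders from the message, and values under `data` are base64-encoded as Kubernetes expects:

```python
import base64
import json


def secret_manifest(name: str, key: str, value: str) -> dict:
    """Build an Opaque Kubernetes Secret manifest holding a single key."""
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {key: encoded},
    }


# Placeholder credentials, copied from the masked value in the message above.
jaas = (
    "org.apache.kafka.common.security.scram.ScramLoginModule required "
    'username="xxxxxxxxxx" password="xxxxxxxxxxxx";'
)
manifest = secret_manifest("sasl-jass-config", "sasl_jaas_config", jaas)
# kubectl accepts JSON manifests as well as YAML ones.
print(json.dumps(manifest, indent=2))
```

Decoding the key from the Secret that is actually deployed and comparing it with the intended JAAS line is a quick way to rule out a mismatch between the Secret and the values file.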
  • p

    prehistoric-knife-90526

    06/19/2022, 7:47 PM
    Hi All 👋. Is there a way to set the logging level for acryl-datahub-actions deployed in K8s? Metadata ingestion through the UI is creating very verbose INFO logs and I'd like to decrease it to WARN.
  • b

    bland-easter-53873

    06/20/2022, 7:13 AM
    Hi all, I need help deploying DataHub in an AWS environment using both ECS and EKS, as they are meant for separate purposes. Any guide in this regard would be really helpful.
  • b

    bland-easter-53873

    06/20/2022, 7:49 AM
    @bumpy-needle-3184, is there a similar one for ECS ?
  • g

    gray-architect-29447

    06/20/2022, 9:38 AM
    Hi, quite a strange thing happened to me. This morning I had to add a new admin user, so I added it from the GUI. After a few minutes, all accounts, even admin users, became unable to access resources. Nothing changed except this privilege change. Could you please suggest where I need to start checking?