# all-things-deployment
  • n

    nice-yak-49999

    07/13/2021, 8:12 AM
    Does the `mysql-setup-job` container only run this SQL?
    ```sql
    -- create datahub database
    CREATE DATABASE IF NOT EXISTS DATAHUB_DB_NAME;
    USE DATAHUB_DB_NAME;

    -- create metadata aspect table
    create table if not exists metadata_aspect_v2 (
      urn                           varchar(500) not null,
      aspect                        varchar(200) not null,
      version                       bigint(20) not null,
      metadata                      longtext not null,
      systemmetadata                longtext,
      createdon                     datetime(6) not null,
      createdby                     varchar(255) not null,
      createdfor                    varchar(255),
      constraint pk_metadata_aspect_v2 primary key (urn,aspect,version)
    );

    -- create default records for datahub user if not exists
    CREATE TABLE temp_metadata_aspect_v2 LIKE metadata_aspect_v2;
    INSERT INTO temp_metadata_aspect_v2 (urn, aspect, version, metadata, createdon, createdby) VALUES(
      'urn:li:corpuser:datahub',
      'corpUserInfo',
      0,
      '{"displayName":"Data Hub","active":true,"fullName":"Data Hub","email":"datahub@linkedin.com"}',
      now(),
      'urn:li:principal:datahub'
    ), (
      'urn:li:corpuser:datahub',
      'corpUserEditableInfo',
      0,
      '{"skills":[],"teams":[],"pictureLink":"https://raw.githubusercontent.com/linkedin/datahub/master/datahub-web/packages/data-portal/public/assets/images/default_avatar.png"}',
      now(),
      'urn:li:principal:datahub'
    );
    -- only add default records if metadata_aspect is empty
    INSERT INTO metadata_aspect_v2
    SELECT * FROM temp_metadata_aspect_v2
    WHERE NOT EXISTS (SELECT * from metadata_aspect_v2);
    DROP TABLE temp_metadata_aspect_v2;

    -- create metadata index table
    CREATE TABLE IF NOT EXISTS metadata_index (
     `id` BIGINT NOT NULL AUTO_INCREMENT,
     `urn` VARCHAR(200) NOT NULL,
     `aspect` VARCHAR(150) NOT NULL,
     `path` VARCHAR(150) NOT NULL,
     `longVal` BIGINT,
     `stringVal` VARCHAR(200),
     `doubleVal` DOUBLE,
     CONSTRAINT id_pk PRIMARY KEY (id),
     INDEX longIndex (`urn`,`aspect`,`path`,`longVal`),
     INDEX stringIndex (`urn`,`aspect`,`path`,`stringVal`),
     INDEX doubleIndex (`urn`,`aspect`,`path`,`doubleVal`)
    );
    ```
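The "only add default records if the table is empty" step in the script above can be tried out in isolation. The following is a minimal sketch, not the setup job itself: it assumes Python's built-in `sqlite3` in place of MySQL, trims the table to the columns the inserts use, and the names `seed_if_empty` and `staged` are hypothetical.

```python
import sqlite3


def seed_if_empty(conn, rows):
    """Insert default rows only when metadata_aspect_v2 is empty; return its row count."""
    cur = conn.cursor()
    # Trimmed version of the table from the script (SQLite types, not MySQL).
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS metadata_aspect_v2 (
            urn       TEXT    NOT NULL,
            aspect    TEXT    NOT NULL,
            version   INTEGER NOT NULL,
            metadata  TEXT    NOT NULL,
            createdon TEXT    NOT NULL,
            createdby TEXT    NOT NULL,
            PRIMARY KEY (urn, aspect, version)
        )
        """
    )
    # Stage the defaults in a temp table with the same columns.
    # SQLite has no CREATE TABLE ... LIKE, so the shape is copied with WHERE 0.
    cur.execute("CREATE TEMP TABLE staged AS SELECT * FROM metadata_aspect_v2 WHERE 0")
    cur.executemany(
        "INSERT INTO staged VALUES (?, ?, ?, ?, datetime('now'), ?)", rows
    )
    # Copy the staged rows only if the target table has no rows at all.
    cur.execute(
        "INSERT INTO metadata_aspect_v2 "
        "SELECT * FROM staged "
        "WHERE NOT EXISTS (SELECT 1 FROM metadata_aspect_v2)"
    )
    cur.execute("DROP TABLE staged")
    conn.commit()
    return cur.execute("SELECT COUNT(*) FROM metadata_aspect_v2").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    defaults = [
        ("urn:li:corpuser:datahub", "corpUserInfo", 0, "{}", "urn:li:principal:datahub"),
        ("urn:li:corpuser:datahub", "corpUserEditableInfo", 0, "{}", "urn:li:principal:datahub"),
    ]
    print(seed_if_empty(conn, defaults))
    print(seed_if_empty(conn, defaults))
```

Running the function a second time leaves the two default rows in place without duplicating them, and running it against a table that already holds other rows inserts nothing.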
  • f

    fancy-helmet-32669

    07/13/2021, 1:01 PM
    Hi folks. Could you tell me whether it's possible to back up the state of DataHub deployed on AWS EKS (without AWS managed services for the storage layer) and restore it in an emergency? Or is using managed services for storage a better option?
  • n

    nice-yak-49999

    07/14/2021, 8:01 AM
    How can I apply OIDC on the k8s datahub-frontend? Do I just add the AUTH_OIDC env vars, as when applying OIDC with Docker?
  • b

    blue-holiday-20644

    07/14/2021, 2:33 PM
    I've managed to get AWS Kafka/MSK working as a replacement for the k8s quickstart config and I was wondering if it was possible to use the AWS Glue Schema Registry instead of the prerequisites cp-schema-registry service? Can the schemaregistry config use an AWS ARN instead of a URL?
  • r

    rapid-sundown-8805

    07/15/2021, 1:02 PM
    Hi community, we prefer postgres to mysql (aligns better with our stack) so I'd like to use postgres as backend for gms. However, the helm chart contains a lot of mysql-specific variables, and no mention of postgres. Does the helm chart support postgres? https://github.com/acryldata/datahub-helm/tree/master/charts/datahub
  • s

    square-activity-64562

    07/16/2021, 7:23 AM
    Is the version supposed to be unavailable?
    ```
    docker run -it --rm --entrypoint=bash --tty linkedin/datahub-ingestion:v0.8.6
    datahub@2523f447d4db:/$ datahub version
    DataHub CLI version: unavailable (installed editable via git)
    Python version: 3.8.11 (default, Jun 29 2021, 19:54:56)
    [GCC 8.3.0]
    datahub@2523f447d4db:/$
    ```
  • c

    crooked-leather-44416

    07/19/2021, 3:22 PM
    Is there a Docker image I can use for `./gradlew build` that has all necessary dependencies (Gradle, Python, etc.)?
  • s

    square-activity-64562

    07/21/2021, 7:22 PM
    What are the pros and cons of running the MAE and MCE jobs standalone?
  • r

    rapid-sundown-8805

    07/22/2021, 3:08 PM
    Hi community, I ran into an issue with the helm chart: it defaults to appending `:9200` (the port) to the Elasticsearch URI, although AWS Elasticsearch doesn't work that way (it's reached at just `https://elastic-host.domain.com`, without the port). This causes the Elasticsearch init job to fail with `2021/07/22 14:38:55 Waiting for: https://myuser:mypassword@search-mydomain.eu-central-1.es.amazonaws.com:9200`.
    I got past this step by setting the port to YAML null, `~`:
    ```yaml
    global:
        elasticsearch:
            port: ~
    ```
    However, gms does not heed the port variable and continues to use `:9200` (same log as above). Are there some env vars I can set on GMS to make it not append a port?
  • h

    hallowed-london-5429

    07/27/2021, 9:54 AM
    Hello team, is it possible to integrate Microsoft Active Directory authentication with DataHub?
  • c

    cool-iron-6335

    08/04/2021, 9:53 AM
    I've checked out the new version of DataHub deployed on K8s and got an error with the MAE component
  • s

    square-activity-64562

    08/04/2021, 10:36 AM
    @early-lamp-41924 Is this correct? I think it should be `with .Values.labels` https://github.com/acryldata/datahub-helm/blob/master/charts/datahub/subcharts/datahub-frontend/templates/deployment.yaml#L6 I am not 100% sure, so I wanted to ask. Similar includes are present in most of the files, and I don't think any of them are working. Neither `labels` nor `selectorLabels` are being applied to deployments.
  • r

    rapid-sundown-8805

    08/18/2021, 11:28 AM
    Hi Community! In DataHub v0.8.10, GMS throws this exception:
    ```
    org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'configEntityRegistry' defined in com.linkedin.gms.factory.entityregistry.ConfigEntityRegistryFactory: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [com.linkedin.metadata.models.registry.ConfigEntityRegistry]: Factory method 'getInstance' threw exception; nested exception is java.io.FileNotFoundException: ../../metadata-models/src/main/resources/entity-registry.yml (No such file or directory)
    ```
    I'm wondering if any of you have faced the same? Seems there is some yaml file that it is missing (or cannot find at that path).
  • b

    blue-holiday-20644

    08/26/2021, 1:03 PM
    Untitled
  • h

    handsome-football-66174

    09/02/2021, 5:38 PM
    Hi, I'm trying Dockerised DataHub in an AWS environment, as @blue-holiday-20644 suggested, using Postgres instead of MySQL. Unfortunately I'm getting this error. Any suggestions?
  • c

    cool-iron-6335

    09/06/2021, 8:49 AM
    I've deployed DataHub on k8s and I want to track user activity through the logs of pods like gms or frontend. However, the pod logs show only the time, not the date, and that's not enough. I want the date to appear alongside the time. Where can I edit the DataHub code to add the date to the INFO log lines?
  • m

    millions-jelly-76272

    09/13/2021, 5:33 AM
    Hi community. Am deploying to EKS and have just started seeing an error when applying prerequisite helm chart:
    ```
    kubectl get pods -n datahub
    NAME                                               READY   STATUS    RESTARTS   AGE
    elasticsearch-master-0                             0/1     Pending   0          16m
    elasticsearch-master-1                             0/1     Pending   0          16m
    elasticsearch-master-2                             0/1     Pending   0          16m
    prerequisites-cp-schema-registry-cf79bfccf-zxp2d   1/2     Error     7          16m
    prerequisites-kafka-0                              0/1     Pending   0          16m
    prerequisites-zookeeper-0                          0/1     Pending   0          16m
    ```
    Error coming from cp-schema-registry pod:
    ```
    kubectl logs -n datahub -f pod/prerequisites-cp-schema-registry-cf79bfccf-zxp2d
    error: a container name must be specified for pod prerequisites-cp-schema-registry-cf79bfccf-zxp2d, choose one of: [prometheus-jmx-exporter cp-schema-registry-server]
    ```
    Has anyone seen this before?
  • h

    handsome-football-66174

    09/13/2021, 7:35 PM
    General - I am deploying to EKS and getting the following error. Any guidance on this is greatly appreciated.
    ```
    helm install datahub datahub/datahub
    W0913 15:33:51.098210  9497 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
    W0913 15:33:51.230126  9497 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
    ```
  • b

    better-orange-49102

    09/14/2021, 9:13 AM
    I'm deploying the DataHub containers as individual deployments in k8s. Can there ever be more than one instance of the various containers (i.e. broker, ES, MySQL, GMS, frontend-react) running at the same time?
  • p

    prehistoric-grass-6413

    09/14/2021, 3:37 PM
    Wondering if anyone has experienced this pain: Kubernetes clusters not working when a database fails over
  • l

    little-address-54150

    09/15/2021, 2:25 PM
    Hey, I deployed DataHub 0.8.12 on our k8s cluster via the Helm chart 0.2.22, and it is working perfectly. I looked into monitoring via JMX Exporter, but couldn't figure out how to expose the metrics from datahub-gms in such a way that our existing Prometheus stack is able to pull them. I noticed that in the latest version (0.8.12) a monitoring stack via docker-compose is available. Is there something missing in the Helm chart deployment?
  • c

    careful-artist-3840

    09/16/2021, 2:02 AM
    Can someone pls look at - https://github.com/acryldata/datahub-helm/issues/24
  • h

    handsome-football-66174

    09/21/2021, 2:44 PM
    General - Using this guide https://datahubproject.io/docs/deploy/aws#use-aws-managed-services-for-the-storage-layer I am able to get the prerequisites up and running, but when I try to deploy DataHub, only the following services are coming up.
  • c

    clean-furniture-99495

    09/22/2021, 8:35 AM
    Hi there! Any plan on releasing the new helm for v0.8.14?
  • h

    handsome-football-66174

    09/22/2021, 5:24 PM
    General - Getting this when deploying DataHub as containers on EKS:
    ```
    0s     Warning  SyncLoadBalancerFailed  service/datahub-datahub-frontend  Error syncing load balancer: failed to ensure load balancer: could not find any suitable subnets for creating the ELB
    0s     Normal   EnsuringLoadBalancer    service/datahub-datahub-gms       Ensuring load balancer
    0s     Warning  SyncLoadBalancerFailed  service/datahub-datahub-gms       Error syncing load balancer: failed to ensure load balancer: could not find any suitable subnets for creating the ELB
    0s     Warning  FailedBuildModel        ingress/datahub-datahub-frontend  Failed build model due to couldn't auto-discover subnets: unable to discover at least one subnet
    0s     Normal   EnsuringLoadBalancer    service/datahub-datahub-frontend  Ensuring load balancer
    0s     Warning  SyncLoadBalancerFailed  service/datahub-datahub-frontend  Error syncing load balancer: failed to ensure load balancer: could not find any suitable subnets for creating the ELB
    0s     Normal   EnsuringLoadBalancer    service/datahub-datahub-gms       Ensuring load balancer
    0s     Warning  SyncLoadBalancerFailed  service/datahub-datahub-gms       Error syncing load balancer: failed to ensure load balancer: could not find any suitable subnets for creating the ELB
    0s     Warning  FailedBuildModel        ingress/datahub-datahub-frontend  Failed build model due to couldn't auto-discover subnets: unable to discover at least one subnet
    ```
    How do we resolve this?
  • b

    brief-insurance-68141

    09/23/2021, 12:18 AM
    Yes, use `datahub-datahub-gms:8080`; it works.
  • c

    chilly-barista-6524

    09/27/2021, 11:45 AM
    Hey everyone. We are stuck in quite a deadlock while trying to upgrade our DataHub from v0.6 to v0.8.14. Here are the steps we have performed:
    1. Took a backup of our existing MySQL db, launched a new MySQL container, and restored the dump into it.
    2. Used the helm charts at https://github.com/acryldata/datahub-helm to install the upgraded version. We installed all the prerequisites (except MySQL, because we are using the one we launched in step 1).
    3. All the prerequisites get installed properly. Then, when we try to install DataHub via the helm chart, everything runs fine except `datahub-gms` and the `datahubUpgrade` job.
    `datahub-gms` throws the following error:
    ```
    javax.persistence.PersistenceException: Query threw SQLException:Table 'datahub.metadata_aspect_v2' doesn't exist
    ```
    and `datahubUpgrade` throws the following error:
    ```
    ERROR: Cannot connect to GMSat host test-datahub-datahub-gms port 8080. Make sure GMS is on the latest version and is running at that host before starting the migration.
    ```
    The two errors seem to depend on each other. I was wondering whether we are missing a step in between, and whether the `metadata_aspect_v2` table needs to be created manually?
  • s

    some-cricket-23089

    09/30/2021, 9:23 AM
    Hi team, I was making some changes to the UI look and feel of DataHub. To verify those changes, I tried to build the Docker image of the datahub-frontend module by running the command below from the DataHub home folder:
    ```
    sudo docker build -t updated_datahub_frontend_react -f ./docker/datahub-frontend
    ```
    But this ends with the error below:
    ```
    ---> Running in 55db64e7685d
    /bin/sh: ./gradlew: not found
    The command '/bin/sh -c cd datahub-src && ./gradlew :datahub-frontend:dist -PenableEmber=${ENABLE_EMBER} -x test -x yarnTest -x yarnLint     && cp datahub-frontend/build/distributions/datahub-frontend.zip ../datahub-frontend.zip     && cd .. && rm -rf datahub-src && unzip datahub-frontend.zip' returned a non-zero code: 127
    ```
    Could anyone please help me resolve this issue?
  • s

    some-cricket-23089

    10/01/2021, 1:15 PM
    And I guess the error is because yarn is not able to download the cypress package
  • s

    some-cricket-23089

    10/04/2021, 5:43 AM
    Thank you so much once again for all your help and support. And sorry that such a silly issue took so much of your time to dive into.