# getting-started
  • b

    best-daybreak-64419

    01/13/2023, 3:38 PM
    Hello team! I have a question about the module names used in the DataHub project's GitHub source: metadata-io, metadata-service, metadata-ingestion, and datahub-GMS. Can someone give me a summary of what each module does and how they relate to one another?
    👀 1
  • m

    microscopic-scientist-28962

    01/14/2023, 6:40 PM
    I’m trying to launch a new instance on an M1 Pro. I’m getting the error that kafka-setup is still running, but when I look at the Docker Desktop logs it appears to have finished successfully:
    2023-01-14 13:14:46 2 done working
    2023-01-14 13:14:46 3 done working
    2023-01-14 13:14:46 1 done working
    2023-01-14 13:14:56 WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
    2023-01-14 13:14:58 4 done working
    2023-01-14 13:14:58 Topic Creation Complete.
    The broker service appears to be running correctly, but GMS says it’s still waiting on it. I’ve tried nuking, pruning, and starting over. I’ve removed all other containers and don’t believe there are any port conflicts. It appears to be a timing issue (I think). Is it possible to just start the various components one by one within Docker Desktop?
  • w

    worried-animal-81235

    01/16/2023, 2:29 PM
    We are looking at integrating data quality checks with DataHub. I know DataHub supports Great Expectations, but we are considering WhyLogs. Does DataHub have a plan to integrate with WhyLogs as well?
  • l

    lemon-daybreak-58504

    01/16/2023, 6:25 PM
    Hi everyone, I am having trouble configuring a BigQuery ingestion source. The log points to a problem with "include_usage_statistics". Does anyone know how to fix it?
    ✅ 1
  • g

    gray-nightfall-60513

    01/17/2023, 1:32 PM
    Hello, is it possible to customize the tabs I see in DataHub? For example, when I select a dataset, I would like to add a “Custom” tab that shows a large table with a data sample, and another tab that shows some charts.
    ✅ 2
  • g

    gifted-florist-55079

    01/17/2023, 7:10 PM
    Hey guys, hope you are all doing great. I have two questions:
    1. I just deployed DataHub in k8s using https://github.com/acryldata/datahub-helm and I was wondering where the login credentials are.
    2. How am I supposed to install libraries? (For instance, let’s say I would like to install this module: https://datahubproject.io/docs/generated/ingestion/sources/delta-lake/)
    Thanks in advance!
    ✅ 1
  • q

    quaint-ambulance-76706

    01/17/2023, 8:39 PM
    Hi all -- I am just getting started with DataHub. My use case is that I would like to map out a series of data sources / tables / schemas (and their lineages/dependencies), but none of these tables actually exist in real life. Is there a way for me to easily define these data sources/tables in a file or in the UI? I am only seeing ways to import actual data sources. Are there any good options here aside from creating fake empty tables in one of the supported sources?
    ✅ 1
  • l

    lively-dusk-19162

    01/17/2023, 9:16 PM
    Hello all, I am trying to set up DataHub on my system with v0.9.3 and v0.9.2. I ran the quickstart command a couple of times, but it keeps failing with this issue: Unable to run quickstart command: datahub-gms is still running; kafka-setup is still running.
    👀 1
  • l

    lively-dusk-19162

    01/17/2023, 9:16 PM
    Could someone help me out with how to resolve this issue?
  • s

    steep-train-75714

    01/17/2023, 9:33 PM
    Hi, I am getting this error when running the command datahub docker quickstart: "Docker doesn't seem to be running." I am running it on a Windows 10 64-bit machine.
    ✅ 1
  • s

    steep-train-75714

    01/17/2023, 9:36 PM
    image.png
  • s

    steep-train-75714

    01/17/2023, 9:39 PM
    My Docker instance is using the WSL 2 based engine.
  • g

    gentle-plastic-92802

    01/18/2023, 3:07 AM
    Hi all, new to DataHub. Does DataHub provide a rate limit config for API calls?
    ✅ 1
  • b

    brave-zoo-82336

    01/18/2023, 2:57 PM
    Hi DataHub, we are considering DataHub for our project. Is there a demo or sales contact who could show us its technical and business capabilities and implementation? Please reach out to Nagesh.Shivaramu@paccar.com with demo details.
    ✅ 1
  • g

    gifted-bird-57147

    01/18/2023, 3:06 PM
    Hi, I recently started defining our own data platform types using the Snapshot method, by ingesting a definition file. That method now seems to be deprecated (I upgraded to 0.9.6), and apparently it has been replaced by a definition in entity-registry.yml? https://datahubproject.io/docs/metadata-modeling/metadata-model#the-entity-registry However, from the documentation I can't quite figure out how to actually define our own data platform types. Is there any more documentation, or are there examples, on this?
    ✅ 1
    👀 1
  • c

    cold-application-21589

    01/18/2023, 9:11 PM
    Hi, I see that log4j was updated at the end of 2021, but when I install using the Docker quickstart (https://datahubproject.io/docs/quickstart) I still get the old log4j with the vulnerability. Do you know how I can fix it?
    👀 1
    ✅ 1
  • c

    colossal-autumn-78301

    01/19/2023, 1:59 PM
    Hi all, I am getting started with DataHub and am getting this error. Any insights? @green-football-43791 and others
    > Task :metadata-ingestion:lint
    + black --check --diff src/ tests/ examples/
    All done! ✨ 🍰 ✨
    580 files would be left unchanged.
    + isort --check --diff src/ tests/ examples/
    Skipped 1 files
    + flake8 --count --statistics src/ tests/ examples/
    0
    + mypy --show-traceback --show-error-codes src/ tests/ examples/
    src/datahub/ingestion/source_config/sql/snowflake.py:345: error: Dict entry 0 has incompatible type "str": "bytes"; expected "str": "int"  [dict-item]
    Found 1 error in 1 file (checked 636 source files)
    
    > Task :metadata-ingestion:lint FAILED
    
    FAILURE: Build failed with an exception.
    
    * What went wrong:
    Execution failed for task ':metadata-ingestion:lint'.
    > Process 'command 'bash'' finished with non-zero exit value 1
    
    * Try:
    Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
    
    * Get more help at https://help.gradle.org
    ✅ 2
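For context on the mypy failure above: the `dict-item` code is reported when a value in a dict literal does not match the expected value type. The following is a minimal, hypothetical reproduction and one possible fix. It is not the actual code at snowflake.py:345; the names `build_connect_args`, `private_key`, and `prefetch_threads` are assumptions made for illustration.

```python
from typing import Dict, Union

# Hypothetical reproduction: if this line were uncommented, mypy would report
#   Dict entry 0 has incompatible type "str": "bytes"; expected "str": "int"  [dict-item]
# bad: Dict[str, int] = {"private_key": b"..."}

# One possible fix: widen the value type so both ints and bytes are accepted.
ConnectArg = Union[int, bytes]


def build_connect_args(private_key: bytes, prefetch_threads: int) -> Dict[str, ConnectArg]:
    """Return a connection-argument dict whose values may be int or bytes."""
    return {
        "private_key": private_key,
        "prefetch_threads": prefetch_threads,
    }


connect_args = build_connect_args(b"secret", 4)
```

Widening the annotation is only one option; splitting the ints and the bytes into separate dicts would also satisfy the type checker.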
  • c

    cuddly-plumber-64837

    01/19/2023, 3:10 PM
    Hello everyone, I am working to bring in demo data into my instance. Currently using the recipe given here: https://datahubproject.io/docs/generated/ingestion/sources/demo-data#config-details. However when I try to ingest this recipe, I am met with a "Failed to set up framework context: Failed to instantiate a valid DataHub Graph instance" error. Can someone explain to me what I am missing?
    👀 1
    ✅ 1
  • f

    fresh-cricket-75926

    01/20/2023, 9:04 AM
    Hi community, we are looking for some kind of workflow mechanism to manage (approve/reject) the business glossary in DataHub. It would be helpful if anyone could point us to an available mechanism.
    ✅ 1
  • g

    gifted-bird-57147

    01/20/2023, 12:35 PM
    Hi All, We added a custom dataplatform in our configuration using the following specification:
    [
        {
          "auditHeader": null,
          "proposedSnapshot": {
            "com.linkedin.pegasus2avro.metadata.snapshot.DataPlatformSnapshot": {
              "urn": "urn:li:dataPlatform:CSW_Record",
              "aspects": [
                {
                  "com.linkedin.pegasus2avro.dataplatform.DataPlatformInfo": {
                    "datasetNameDelimiter": "/",
                    "name": "CSW_Record",
                    "type": "OTHER",
                    "logoUrl": "https://d1q6f0aelx0por.cloudfront.net/product-logos/library-geonetwork-logo.png"
                  }
                }
              ]
            }
          },
          "proposedDelta": null
        }
      ]
    Is there a way to have the name show up as well? (like with the standard platform types)
    ✅ 1
  • b

    brave-businessperson-3969

    01/20/2023, 3:41 PM
    Question about role-based access / multi-client capabilities: I just checked the options to restrict access to DataHub objects. If I understand it correctly, I can restrict access to statistics and table details, which is nice but solves only part of my problem. Ideally, I want to configure DataHub so that a user can see certain objects (tables, whole domains, glossary terms, etc.; essentially all DataHub objects) only if the user is a member of a certain DataHub group. If the user is not a member of a group that permits access, DataHub should act as if the object does not exist at all. The object should not show up in search results (most important), on the front page, as related/relates-to in glossary terms, in the lineage graph, or anywhere else. Does DataHub have multi-client capabilities as described above (and I just haven't found the feature)? If not, are there any plans to implement such functionality?
    👀 1
  • c

    colossal-van-35225

    01/20/2023, 10:59 PM
    What's the easiest way to start using DataHub to check its functionality and usability? I thought it would be the managed service at Acryl, but a week on I still don't have an account, and I don't think it's going to be free (I thought I read that somewhere, but can't find it now). I've never used Docker before; I'm hoping it will be a simple setup on Linode or AWS 🙂
    ✅ 1
  • r

    ripe-eye-60209

    01/21/2023, 12:36 PM
    Hello team, I am following the Airflow integration docs at https://datahubproject.io/docs/docker/airflow/local_airflow and I'm getting this error with the example lineage.
  • f

    few-tent-85021

    01/21/2023, 11:02 PM
    I'm doing well with Helm for DataHub, but I'm having a problem with MCE and MAE not coming up. There is no error when I deploy the chart, and I can create an ingestion source, but it is always stuck at pending. Help.
    ✅ 1
  • d

    delightful-quill-6776

    01/22/2023, 12:56 PM
    Hi team, I am new to DataHub and our team is going to install it on EKS, but I have one question: do we have to install ZooKeeper, Kafka, MySQL, and so on? Is this required?
    ✅ 1
  • m

    microscopic-machine-90437

    01/24/2023, 10:30 AM
    Hello team, how do we find the URN of a particular environment (for example, dbt)?
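Assuming the question above is about platform and dataset URNs, they follow a fixed string layout, so one can be composed by hand once the platform, dataset name, and environment are known. A minimal sketch follows; the helper functions are defined locally for illustration, and the dataset name `analytics.public.orders` is a made-up example.

```python
def make_data_platform_urn(platform: str) -> str:
    """Compose a data platform URN, e.g. urn:li:dataPlatform:dbt."""
    return f"urn:li:dataPlatform:{platform}"


def make_dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Compose a dataset URN from platform, dataset name, and environment."""
    return f"urn:li:dataset:({make_data_platform_urn(platform)},{name},{env})"


# Hypothetical dataset name, used only to show the resulting layout.
platform_urn = make_data_platform_urn("dbt")
dataset_urn = make_dataset_urn("dbt", "analytics.public.orders")
print(platform_urn)
print(dataset_urn)
```

For an entity that already exists, the URN also appears in the address bar of the entity's page in the DataHub UI.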
  • b

    blue-pilot-81295

    01/24/2023, 1:11 PM
    Hello, I would like to use Fargate for deployment. Is it possible to deploy the prerequisites on Fargate as well, or should they be deployed separately?
    ✅ 1
  • w

    witty-butcher-82399

    01/24/2023, 2:41 PM
    Hi team! I’m checking the existing privileges (https://github.com/datahub-project/datahub/blob/db968497cc736594d81b9000357a753abe[…]in/java/com/linkedin/metadata/authorization/PoliciesConfig.java). While I have found DELETE_ENTITY_PRIVILEGE, I haven’t found the corresponding CREATE_ENTITY_PRIVILEGE. Does that mean this privilege is given out of the box to all users? I was thinking we might restrict the create-entity privilege to specific users, or even restrict some users to creating entities (datasets) only for a specific platform or platform instance. So I'm just wondering whether this is possible. Thanks
    ✅ 1
    👀 1
  • b

    better-orange-49102

    01/24/2023, 3:05 PM
    For the datahub-actions container, the documentation in the action repo says that it can subscribe to Entity Change Events v1 and Metadata Change Log v1 (which I interpret as being able to read both Versioned and Timeseries logs). However, I am unable to get any events about DatasetProfiles timeseries aspects being ingested, only Versioned v1 events. My Kafka source config:
    source:
      type: "kafka"
      config:
        connection:
          bootstrap: ${KAFKA_BOOTSTRAP_SERVER:-localhost:9092}
          schema_registry_url: ${SCHEMA_REGISTRY_URL:-http://localhost:8081}
    Am I misunderstanding something, or did I miss a config option?
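The `${VAR:-default}` placeholders in the config above use shell-style default expansion: the environment variable is used if it is set and non-empty, otherwise the text after `:-` is used. The sketch below only illustrates that semantics in plain Python. It is not how the actions framework loads its config, and the `expand` helper is a hypothetical name.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} and ${NAME:-default}; the default may contain ':' and '/'.
_PLACEHOLDER = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


def expand(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Expand shell-style ${VAR:-default} placeholders in a string."""
    source = os.environ if env is None else env

    def _substitute(match: "re.Match[str]") -> str:
        current = source.get(match.group("name"))
        if current:  # ':-' falls back when the variable is unset or empty
            return current
        default = match.group("default")
        return default if default is not None else ""

    return _PLACEHOLDER.sub(_substitute, value)


# With no variable set, the default after ':-' is used.
unset_result = expand("${KAFKA_BOOTSTRAP_SERVER:-localhost:9092}", {})
# With the variable set, its value wins over the default.
set_result = expand(
    "${KAFKA_BOOTSTRAP_SERVER:-localhost:9092}",
    {"KAFKA_BOOTSTRAP_SERVER": "broker:29092"},
)
```

This makes it easy to check which bootstrap server and schema registry URL a given environment would resolve to before starting the container.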
  • m

    mammoth-insurance-91360

    01/25/2023, 10:22 AM
    Hi all, quick question: is it possible to use PAM authentication with DataHub (for user logins)? If so, are there any examples?
    ✅ 1