# getting-started
  • wonderful-egg-79350
    10/24/2022, 5:44 AM
    Hello All. How can I make it case-insensitive?
  • victorious-author-66803
    10/24/2022, 7:10 AM
    Hi everyone~ I am new to DataHub. Our team wants to log in to DataHub with LDAP. We've checked https://datahubproject.io/docs/datahub-frontend/#authentication, but the implementation still uses the DummyLoginModule, so anyone can log in with any username and password. We've added an LDAP login module in jaas.conf, mounted as a volume in datahub-frontend, and we've set AUTH_JAAS_ENABLED=true. Is there a way to log in with LDAP, or how long until this feature arrives?
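For reference, a minimal jaas.conf sketch that swaps the DummyLoginModule for the JDK's `com.sun.security.auth.module.LdapLoginModule` under the `WHZ-Authentication` section that datahub-frontend reads; the LDAP URL and user subtree below are placeholder assumptions to adapt to your directory:

```
WHZ-Authentication {
  com.sun.security.auth.module.LdapLoginModule sufficient
    userProvider="ldap://ldap.example.com:389/ou=people,dc=example,dc=com"
    authIdentity="{USERNAME}"
    useSSL=false
    debug=true;
};
```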
  • dazzling-alarm-64985
    10/24/2022, 8:09 AM
    Hi, where can I learn about where DataHub stores its data? I'm looking to migrate my DataHub instance, and I'm mostly interested in migrating our Glossary 🙂
  • average-dinner-25106
    10/24/2022, 8:28 AM
    One simple question: if an owner is assigned to a table, are non-owners prevented from editing the column/table descriptions? If not, how can I restrict edit rights to non-owners? Can I do this in the UI?
  • quiet-wolf-56299
    10/24/2022, 1:13 PM
    Does metadata service auth have to be enabled to allow for native authentication in the frontend?
  • cuddly-greece-84260
    10/24/2022, 2:06 PM
    👋 Hi everyone! Does anyone know if DataHub has a collaborative module (internal or 3rd party) that would help users tag, comment, and notify others on individual datasets?
  • wide-musician-39968
    10/25/2022, 5:32 AM
    Hi everyone! I'm having trouble with ingestion in a local Rancher Desktop environment.
    Environment:
    • M1 macOS 12.6
    • Rancher Desktop v1.5.1 (runtime: dockerd)
    • Kubernetes v1.24.6
    • Helm v3.9.1
    I followed this doc (https://datahubproject.io/docs/deploy/kubernetes/) for the install. What I did during install:
    • Ran `docker pull --platform linux/amd64 neo4j:4.2.4` before `helm install` because of an `ImagePullBackOff`
    • Ran `docker pull --platform linux/amd64 acryldata/datahub-actions:v0.0.7` before `helm install` because of an `ImagePullBackOff`
    • Edited the Elasticsearch config in the prerequisites because I run on a single-node cluster: `replicas: 1`, `minimumMasterNodes: 1`, `antiAffinity: "soft"`, `clusterHealthCheckParams: "wait_for_status=yellow&timeout=1s"`
    • Edited the datahub-gms config because I got `0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.` from the pod `svclb-dataiub-datahub-gms`. (Whenever I redeploy DataHub, either `svclb-dataiub-datahub-gms` or `svclb-datahub-datahub-frontend` gets this error.)
    Then I deployed PostgreSQL (https://artifacthub.io/packages/helm/bitnami/postgresql) to the same cluster and created one table. I created an ingestion from the UI, and below is my YAML setting.
    ```yaml
    source:
      type: postgres
      config:
        include_tables: true
        database: test
        password: '${secret}'
        profiling:
          enabled: false
        host_port: 'test-postgresql:5432'
        include_views: true
        stateful_ingestion:
          enabled: true
        username: postgres
    pipeline_name: 'urn:li:dataHubIngestionSource:21f75ec0-4034-401f-82bc-edf8f5d0e259'
    sink:
      type: datahub-rest
      config:
        server: 'http://datahub-datahub-gms:8080'
    ```
    And this is the failed ingestion's log:
    ```
    ~~~~ Execution Summary ~~~~

    RUN_INGEST - {'errors': [],
     'exec_id': '7a6c8d3f-43cb-4d7c-aae6-a568423d2518',
     'infos': ['2022-10-25 04:08:20.614216 [exec_id=7a6c8d3f-43cb-4d7c-aae6-a568423d2518] INFO: Starting execution for task with name=RUN_INGEST',
          '2022-10-25 04:08:41.826105 [exec_id=7a6c8d3f-43cb-4d7c-aae6-a568423d2518] INFO: stdout=venv setup time = 0\n'
          'This version of datahub supports report-to functionality\n'
          'datahub --debug ingest run -c /tmp/datahub/ingest/7a6c8d3f-43cb-4d7c-aae6-a568423d2518/recipe.yml --report-to '
          '/tmp/datahub/ingest/7a6c8d3f-43cb-4d7c-aae6-a568423d2518/ingestion_report.json\n'
          '[2022-10-25 04:08:29,691] DEBUG  {datahub.telemetry.telemetry:206} - Sending init Telemetry\n'
          '[2022-10-25 04:08:30,184] DEBUG  {datahub.telemetry.telemetry:239} - Sending Telemetry\n'
          '[2022-10-25 04:08:30,399] INFO   {datahub.cli.ingest_cli:177} - DataHub CLI version: 0.8.43.5\n'
          "[2022-10-25 04:08:30,412] DEBUG  {datahub.cli.ingest_cli:189} - Using config: {'pipeline_name': "
          "'urn:li:dataHubIngestionSource:21f75ec0-4034-401f-82bc-edf8f5d0e259', 'run_id': '7a6c8d3f-43cb-4d7c-aae6-a568423d2518', 'sink': "
          "{'config': {'server': '<http://10.42.0.143:8080>'}, 'type': 'datahub-rest'}, 'source': {'config': {'database': 'test', 'host_port': "
          "'10.42.0.148:5432', 'include_tables': True, 'include_views': True, 'password': 'Z2gUhyBPv2', 'profiling': {'enabled': False}, "
          "'stateful_ingestion': {'enabled': True}, 'username': 'postgres'}, 'type': 'postgres'}}\n"
          '[2022-10-25 04:08:30,479] DEBUG  {datahub.ingestion.sink.datahub_rest:125} - Setting env variables to override config\n'
          '[2022-10-25 04:08:30,479] DEBUG  {datahub.ingestion.sink.datahub_rest:127} - Setting gms config\n'
          '[2022-10-25 04:08:30,479] DEBUG  {datahub.ingestion.run.pipeline:162} - Sink type:datahub-rest,<class '
          "'datahub.ingestion.sink.datahub_rest.DatahubRestSink'> configured\n"
          '[2022-10-25 04:08:30,480] INFO   {datahub.ingestion.run.pipeline:163} - Sink configured successfully. DataHubRestEmitter: configured '
          'to talk to <http://10.42.0.143:8080>\n'
          '[2022-10-25 04:08:30,480] DEBUG  {datahub.ingestion.run.pipeline:253} - Reporter type:file,<class '
          "'datahub.ingestion.reporting.file_reporter.FileReporter'> configured.\n"
          '/usr/local/bin/run_ingest.sh: line 40:  452 Killed         ( datahub ${debug_option} ingest run -c "${recipe_file}" '
          '${report_option} )\n',
          "2022-10-25 04:08:41.826607 [exec_id=7a6c8d3f-43cb-4d7c-aae6-a568423d2518] INFO: Failed to execute 'datahub ingest'",
          '2022-10-25 04:08:41.830033 [exec_id=7a6c8d3f-43cb-4d7c-aae6-a568423d2518] INFO: Caught exception EXECUTING '
          'task_id=7a6c8d3f-43cb-4d7c-aae6-a568423d2518, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
          ' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 123, in execute_task\n'
          '  task_event_loop.run_until_complete(task_future)\n'
          ' File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete\n'
          '  return future.result()\n'
          ' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 168, in execute\n'
          '  raise TaskError("Failed to execute \'datahub ingest\'")\n'
          "acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"]}
    Execution finished with errors.
    ```
    I can't figure out what makes it fail. Do you know what's wrong? And is there a good way to debug this issue?
  • quiet-wolf-56299
    10/25/2022, 3:25 PM
    Is it possible to ingest a password when ingesting a user from JSON?
  • prehistoric-fireman-61692
    10/27/2022, 10:06 PM
    Hi all, I've set DataHub up via the Docker quickstart, but when trying to ingest metadata I get the following error: "ERROR {datahub.entrypoints:165} - Cannot open config file". This happens when ingesting via either the UI or the CLI. Any clue what the problem is, or how to fix it?
  • bitter-furniture-95993
    10/28/2022, 3:14 PM
    Hi! I have successfully loaded a customer glossary. I can see "CDMO 176" in the glossary view, so all items have been loaded. But when I try to open the glossary to view the details, the view starts loading and then I get a white screen.
  • microscopic-tailor-94417
    10/31/2022, 7:30 AM
    Hello team, I hope you're all doing fine. One of my customers wants to get Qlik Sense metadata into DataHub. Is it possible to create a Qlik Sense source for ingestion? Do I need to create a custom source? If so, how should I do it? Thanks in advance.
  • breezy-shoe-41523
    10/31/2022, 11:57 AM
    Hello team, can you give me an example GraphQL query or Python API call to list all views, data sources, dashboards, and charts of a Tableau workbook by URN? I can see it's possible via the GraphQL container, but I need some help listing all of them. I need those lists so I can assign domains to them.
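A hedged sketch of one way to do this in Python against the frontend's `/api/graphql` endpoint: search across entity types with a filter on the workbook container. The `searchAcrossEntities` query is part of DataHub's GraphQL API, but the `container` filter field name, the URN, and the token below are assumptions to adapt:

```python
import json
import urllib.request

# Hypothetical sketch: list charts, dashboards, and datasets that live under a
# Tableau workbook container. The filter field name ("container") and the URN
# below are assumptions to verify against your instance.
SEARCH_QUERY = """
query listWorkbookContents($urn: String!) {
  searchAcrossEntities(
    input: {
      types: [CHART, DASHBOARD, DATASET]
      query: "*"
      start: 0
      count: 100
      orFilters: [{ and: [{ field: "container", values: [$urn] }] }]
    }
  ) {
    total
    searchResults { entity { urn type } }
  }
}
"""

def build_graphql_payload(workbook_urn: str) -> dict:
    """Build the JSON body for a POST to <frontend>/api/graphql."""
    return {"query": SEARCH_QUERY, "variables": {"urn": workbook_urn}}

def post_graphql(frontend_url: str, token: str, payload: dict) -> dict:
    """POST the payload, authenticating with a personal access token (placeholder)."""
    req = urllib.request.Request(
        f"{frontend_url}/api/graphql",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Placeholder URN; run post_graphql(...) against a live instance for real results.
payload = build_graphql_payload("urn:li:container:someTableauWorkbookId")
```

The returned URNs could then be fed into a `setDomain` mutation or the REST API to assign domains.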
  • breezy-shoe-41523
    10/31/2022, 11:58 AM
    image.png
  • modern-garden-35830
    10/31/2022, 12:51 PM
    Hi all, can someone tell me if there is a plugin for DataGrip, or any other recommended query platform that I can plug into DataHub?
  • careful-france-26343
    11/01/2022, 12:41 AM
    Does the kubernetes version of DataHub require internet access to install?
  • nice-zebra-54018
    11/02/2022, 5:47 AM
    Hello. I installed DataHub on AWS EKS using the Helm chart and have ingested a sample dataset. I want to use curl for the search and filter functions, following https://datahubproject.io/docs/metadata-service. My DataHub version is 0.9.0. However, the docs only give an example for a chart:
    ```shell
    curl -X POST 'http://localhost:8080/entities?action=search' \
    --data '{ "input": "looker", "entity": "chart", "start": 0, "count": 10, "filter": { "criteria": [ { "field": "title", "value": "Baz Chart 1", "condition": "EQUAL" } ] } }'
    ```
    I want an example where the entity is a dataset and the filter target is a Platform Instance. I have two Platform Instances, toystore and moviestore. What if I want to filter on the toystore Platform Instance? Without a filter, I get a normal response:
    ```shell
    curl -X POST 'http//[my datahub gms url]8080/entities?action=search' \
    --data '{ "input": "id", "entity": "dataset", "start": 0, "count": 10 }'
    ```
    However, as soon as I add the filter part, I get a java.lang.NullPointerException:
    ```shell
    curl -X POST 'http//[my datahub gms url]8080/entities?action=search' \
    --data '{ "input": "id", "entity": "dataset", "start": 0, "count": 10, "filter": { "criteria": [ { "field": "platformInstance", "value": "toystore", "condition": "[EQUAL or CONTAIN]" } ] } }'
    ```
    It looks like I wrote the filter part wrong. How do I fix it? For reference, Platform Instance filtering works fine in the UI.
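One likely culprit in the failing request above is the literal `"[EQUAL or CONTAIN]"` condition string: `condition` must be a single enum value such as `EQUAL`. A hedged sketch of a well-formed body follows; whether the search index field is really named `platformInstance` is an assumption to verify against your instance:

```python
import json

# Hedged sketch of a well-formed /entities?action=search request body.
# `condition` must be one literal value (e.g. "EQUAL"), not a bracketed
# alternative like "[EQUAL or CONTAIN]". The field name "platformInstance"
# is an assumption to check against your search index.
def build_search_body(query: str, entity: str, field: str, value: str) -> str:
    body = {
        "input": query,
        "entity": entity,
        "start": 0,
        "count": 10,
        "filter": {
            "criteria": [
                {"field": field, "value": value, "condition": "EQUAL"}
            ]
        },
    }
    return json.dumps(body)

# The result can be passed straight to curl's --data flag.
body = build_search_body("id", "dataset", "platformInstance", "toystore")
```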
  • dazzling-alarm-64985
    11/02/2022, 8:56 AM
    Hello, I'm deploying DataHub on GCP GKE. It worked fine when using the prerequisites' MySQL; now I'm using Cloud SQL for MySQL, a managed MySQL, instead. Connectivity works and DataHub has successfully created a database there, but I get this error in the datahub-upgrade-job pod. Any help is welcome 🙂 The Cloud SQL MySQL instance is running version 8.0:
    ```
    ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.7.2
    ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.7.2
    ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.7.2
    ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.7.2
    ```
  • rich-pencil-57339
    11/02/2022, 12:18 PM
    Hi all, setting this up for the first time. I've got some ingestions up and running and it's fantastic. I'm wondering how I would go about linking lineages together. For example, my Tableau dashboard lineage ends with the PostgreSQL view from the data source, but that lineage is not connected to the PostgreSQL data source I also ingested. I assume I need something like the lineage emitter to run between the two to merge them into one, or am I missing something?
  • quiet-wolf-56299
    11/02/2022, 2:49 PM
    Is it possible to change the default user password if I'm using native auth rather than JAAS?
  • quiet-wolf-56299
    11/02/2022, 2:51 PM
    Obviously I'd rather not have the admin user's password just hanging around in a props file on the hard drive.
  • quiet-wolf-56299
    11/02/2022, 3:52 PM
    OK, better question: is it possible to disable the default datahub user altogether? Say, after the initial install, could I switch to a native user rather than the default one and remove the datahub user?
  • hallowed-lizard-92381
    11/03/2022, 3:02 AM
    Attempting to model a pipeline that doesn't take a dataset as an input but outputs one. The DataJobInputOutputClass appears to require both an input and an output. Can anyone think of a way around this? We basically want to set a DataJob as upstream of a Dataset.
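A hedged sketch of one workaround: the dataJobInputOutput aspect requires both arrays to be present, but `inputDatasets` can simply be left empty, so the DataJob only declares the output Dataset as downstream. The URNs below are placeholders, and whether your server version accepts an empty `inputDatasets` list is an assumption to verify:

```python
import json

# Placeholder URNs for illustration only.
job_urn = "urn:li:dataJob:(urn:li:dataFlow:(airflow,my_flow,prod),my_task)"
out_urn = "urn:li:dataset:(urn:li:dataPlatform:postgres,mydb.public.out,PROD)"

# dataJobInputOutput aspect with no inputs: the job is a pure upstream
# of the output dataset.
aspect = {"inputDatasets": [], "outputDatasets": [out_urn]}

# Shape of a metadata change proposal body, in the style used by the REST
# emitter's POST <gms>/aspects?action=ingestProposal endpoint.
proposal = {
    "proposal": {
        "entityType": "dataJob",
        "entityUrn": job_urn,
        "changeType": "UPSERT",
        "aspectName": "dataJobInputOutput",
        "aspect": {"contentType": "application/json",
                   "value": json.dumps(aspect)},
    }
}
```

The same shape can be emitted with the Python SDK's `DataJobInputOutputClass(inputDatasets=[], outputDatasets=[...])` if you prefer typed classes over raw JSON.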
  • few-air-34037
    11/03/2022, 9:04 AM
    Does gms's DATAHUB_SERVER_TYPE parameter actually do anything? I couldn't find any explanation in the documentation or the code.
  • faint-actor-78390
    11/03/2022, 10:46 AM
    Hi all, I'm installing DataHub with manual docker compose. In the end, 8 of 11 containers are running; the set-up containers failed, and login fails on port 9002. Any ideas? Thanks.
  • average-dinner-25106
    11/04/2022, 3:57 AM
    Hi everyone. Does DataHub's search engine provide fuzzy matching of keywords? For example, say I want to search for the keyword 'account' but type 'acount'. What I expect is that with 'acount', DataHub still gives me the column descriptions, tags, etc. related to 'account'.
  • ripe-eye-60209
    11/04/2022, 11:41 AM
    Hello team, 1. How can we delete the default datahub user (is there a command)? 2. How can we assign a default role to users who log in via Azure OIDC?
  • shy-garden-57011
    11/04/2022, 1:06 PM
    Does DataHub integrate with Istio/ServiceMesh?
  • green-intern-1667
    11/04/2022, 4:36 PM
    Trying to ingest Snowflake data via UI ingestion, but after hitting the `Test Connection` button I only see `Testing your connection` for several minutes. Any clue on that?
  • gifted-rocket-7960
    11/07/2022, 5:16 AM
    Hi everyone, is field-level lineage available to use yet? https://github.com/datahub-project/datahub/blob/master/docs/rfc/active/1841-lineage/field_level_lineage.md
  • colossal-laptop-87082
    11/07/2022, 6:10 AM
    Hello team!! I'm new to DataHub. I want to ingest a CSV and set up observability checks on it with the help of DataHub. Is this possible for: • Freshness • Volume • Schema