# troubleshoot
  • b

    brave-room-48783

    04/13/2023, 11:01 AM
    Hi, I'm getting these errors while trying to ingest Snowflake on DataHub. I have read access to tables and queries. Please advise on how to proceed.
    DataHub CLI version: 0.10.1.1
    Python version: 3.9.6 (default, Mar 10 2023, 20:16:38) [Clang 14.0.3 (clang-1403.0.22.14.1)]
    Deployment method: Docker on local machine
    ~~~~ Execution Summary - RUN_INGEST ~~~~
    Execution finished with errors.
    {'exec_id': '426ca545-5c66-4adb-98b0-7f2fffbfcd0f',
     'infos': ['2023-04-14 02:35:39.958717 INFO: Starting execution for task with name=RUN_INGEST',
               "2023-04-14 02:55:31.996839 INFO: Failed to execute 'datahub ingest'",
               '2023-04-14 02:55:32.010133 INFO: Caught exception EXECUTING task_id=426ca545-5c66-4adb-98b0-7f2fffbfcd0f, name=RUN_INGEST, '
               'stacktrace=Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 122, in execute_task\n'
               '    task_event_loop.run_until_complete(task_future)\n'
               '  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete\n'
               '    return future.result()\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 231, in execute\n'
               '    raise TaskError("Failed to execute \'datahub ingest\'")\n'
               "acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"],
     'errors': []}
  • b

    bumpy-engineer-7375

    04/13/2023, 12:39 PM
    Hi all! I'm new to DataHub. I just installed it on Azure AKS via helm charts. However, when I try to log in with the default authentication (user datahub, pwd datahub) the login fails.
  • e

    elegant-salesmen-99143

    04/13/2023, 7:54 PM
    Hi, it's me with API queries again. Below is my query, and it gives me this error:
    "Validation error (WrongType@[searchAcrossEntities]) : argument 'input.orFilters[0]' with value 'ObjectValue{objectFields=[ObjectField{name='query', value=StringValue{value='*'}}, ObjectField{name='orFilters', value=ArrayValue{values=[ObjectValue{objectFields=[ObjectField{name='field', value=StringValue{value='removed'}}, ObjectField{name='condition', value=StringValue{value='EQUAL'}}, ObjectField{name='values', value=ArrayValue{values=[StringValue{value='true'}]}}, ObjectField{name='negated', value=BooleanValue{value=false}}]}]}}]}' contains a field not in 'AndFilterInput': 'field'"
    But the field "field" is definitely possible in AndFilterInput. What is wrong here?
    {
      searchAcrossEntities (
      input: {query: "*",
        orFilters: [{field: "removed", condition: "EQUAL", values: ["true"], negated: false}]}
        )     {
        start
        count
        total
        searchResults {
          entity {
            type
            ... on Dataset {
              urn
              type
              description
              platform {
                name
              }
              name
            }
          }
        }
      }
    }
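The validation error above says that each entry of orFilters is an AndFilterInput, so the facet filter (field, condition, values, negated) probably needs to sit one level deeper, under an "and" key. A minimal sketch of that shape follows. It assumes the input type is named SearchAcrossEntitiesInput and that the request is sent as a standard GraphQL JSON payload with variables. build_search_payload is a hypothetical helper, and none of this is confirmed in the thread.

```python
import json

# GraphQL document using a variable, so enum values such as EQUAL can be
# passed as plain JSON strings. The type name is an assumption.
SEARCH_QUERY = """
query search($input: SearchAcrossEntitiesInput!) {
  searchAcrossEntities(input: $input) {
    start
    count
    total
    searchResults { entity { urn type } }
  }
}
"""


def build_search_payload(field, values, condition="EQUAL", negated=False):
    """Hypothetical helper: wrap one facet filter in the {"and": [...]} shape."""
    facet = {
        "field": field,
        "condition": condition,
        "values": list(values),
        "negated": negated,
    }
    search_input = {"query": "*", "orFilters": [{"and": [facet]}]}
    return {"query": SEARCH_QUERY, "variables": {"input": search_input}}


payload = build_search_payload("removed", ["true"])
print(json.dumps(payload["variables"], indent=2))
```

The resulting dictionary can be posted as JSON to the GraphQL endpoint with any HTTP client. Each entry in orFilters is a group of conditions that are ANDed together, and the groups themselves are ORed.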
  • s

    steep-alligator-93593

    04/14/2023, 1:53 AM
    Hey! Anyone know how I can go about removing the first couple of lines of the SQL in the datahub-mysql-setup:v0.10.1 image? I'm working with some tight permissions and need a workaround.
    -- create datahub database
    CREATE DATABASE IF NOT EXISTS <DB> CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
    USE <DB>;
  • i

    incalculable-stone-67607

    04/14/2023, 6:59 AM
    How do I fix the error SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder"? (First build on Windows.)
  • w

    wonderful-jordan-36532

    04/14/2023, 10:27 AM
    Hi, when trying to create users via CLI I get the following error:
    python3 -m datahub user upsert -f user.yaml
    Error: No such command 'user'.
  • f

    fierce-electrician-85924

    04/14/2023, 10:30 AM
    Hi team, I am getting this error while running datahub on version 0.10.2.
    io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8080
    I am using k8s with helm to set up a local instance. The default linkedin/datahub-gms image works fine, but if I build the image locally and then use it for the instance setup, it throws this error. (We haven't changed anything in datahub-gms.)
  • b

    best-wire-59738

    04/14/2023, 10:48 AM
    Hi Team, I would like to report an error we faced in the latest version of datahub, v0.10.0.6. We noticed there are 2 privileges that are not getting synced with datahub if we include them in our custom policies. Those privileges are View Dataset Usage and View Dataset Profile. For more details on the issue, please follow this thread: https://datahubspace.slack.com/archives/C029A3M079U/p1680054629679729
  • v

    victorious-monkey-86128

    04/14/2023, 3:38 PM
    Hi, I have a custom data source and I need to emit the data in said source with the Python SDK. I have emitted datasets and container entities, but they're separate... Currently, I have datasets such as container_name.dataset_name and containers with the name container_name. They exist independently of each other. I'd like to assign a set of datasets to the Container. How would I go about doing so? Thanks!
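One approach that may fit here, sketched under assumptions rather than taken from the thread: a dataset is linked to a container by emitting a container aspect on the dataset whose value is the container's URN. The sketch assumes the acryl-datahub package is installed and that GMS is reachable at the hypothetical address http://localhost:8080. The platform name my_source, the helper names, and the container URN placeholder are illustrative only.

```python
def make_dataset_urn(platform, name, env="PROD"):
    """Build a dataset URN string in the standard DataHub layout."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def assign_datasets_to_container(gms_server, dataset_urns, container_urn):
    """Emit a container aspect on each dataset, pointing at container_urn.

    The DataHub imports are kept inside the function so that the URN helper
    above stays usable without the SDK installed.
    """
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import ContainerClass

    emitter = DatahubRestEmitter(gms_server=gms_server)
    for dataset_urn in dataset_urns:
        emitter.emit(
            MetadataChangeProposalWrapper(
                entityUrn=dataset_urn,
                aspect=ContainerClass(container=container_urn),
            )
        )


if __name__ == "__main__":
    # Hypothetical names; replace them with the URNs that were actually emitted.
    urns = [make_dataset_urn("my_source", "container_name.dataset_name")]
    print(urns[0])
    # assign_datasets_to_container(
    #     "http://localhost:8080", urns, "urn:li:container:<container-guid>"
    # )
```

The container URN has to be the same one used when the container entity was emitted, otherwise the datasets will point at a container that does not exist.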
  • c

    creamy-ram-28134

    04/14/2023, 4:25 PM
    I was trying to set up datahub on k8s and I keep getting this issue:
    helm repo add datahub https://helm.datahubproject.io/
    Error: looks like "https://helm.datahubproject.io/" is not a valid chart repository or cannot be reached: Get "https://helm.datahubproject.io/index.yaml"
  • b

    bland-orange-13353

    04/14/2023, 9:00 PM
    This message was deleted.
  • s

    steep-alligator-93593

    04/15/2023, 5:06 AM
    Hey Team, I'm getting an error with the job datahub-system-update-job:
    2023-04-15 05:00:18.721 ERROR 1 --- [ main] i.c.k.s.client.rest.RestService : Failed to send HTTP request to endpoint: <http://prerequisites-cp-schema-registry:8081/subjects/DataHubUpgradeHistory_v1-value/versions>
    java.net.UnknownHostException: prerequisites-cp-schema-registry
    at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:229) ~[na:na]
    at java.base/java.net.Socket.connect(Socket.java:609) ~[na:na]

    Along with

    2023-04-15 05:00:18.697 ERROR 1 --- [ main] i.c.k.s.client.rest.RestService : Failed to send HTTP request to endpoint: <http://prerequisites-cp-schema-registry:8081/subjects/DataHubUpgradeHistory_v1-value/versions>
    java.net.UnknownHostException: prerequisites-cp-schema-registry
    I am running on Kubernetes. Any help would be greatly appreciated, thank you.
  • r

    red-painter-89141

    04/15/2023, 3:46 PM
    Hi everyone, new guy here trying to get DataHub up on an ubuntu machine. I followed the instructions to install docker and docker compose, verified it's running correctly, and was able to install DataHub. But I can't get the quickstart to run:
  • r

    red-painter-89141

    04/15/2023, 3:47 PM
    $ python3 -m datahub docker quickstart
    [2023-04-15 08:46:29,229] INFO     {datahub.cli.quickstart_versioning:144} - Saved quickstart config to /home/tim/.datahub/quickstart/quickstart_version_mapping.yaml.
    [2023-04-15 08:46:29,229] INFO     {datahub.cli.docker_cli:638} - Using quickstart plan: composefile_git_ref='master' docker_tag='head'
    Docker doesn't seem to be running. Did you start it?
    $ docker ps -a
    CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
    $ docker compose version
    Docker Compose version v2.17.2
  • r

    red-painter-89141

    04/16/2023, 2:50 PM
    Maybe it's a permission issue? I ran with --debug:
    $ datahub --debug docker quickstart
    [2023-04-16 07:48:08,882] DEBUG    {datahub.telemetry.telemetry:219} - Sending init Telemetry
    [2023-04-16 07:48:08,934] DEBUG    {datahub.upgrade.upgrade:134} - Failed to get a valid server: Cannot connect to host localhost:8080 ssl:default [Connect call failed ('127.0.0.1', 8080)]
    [2023-04-16 07:48:09,792] DEBUG    {datahub.telemetry.telemetry:248} - Sending telemetry for function-call
    [2023-04-16 07:48:11,036] INFO     {datahub.cli.quickstart_versioning:144} - Saved quickstart config to /home/tim/.datahub/quickstart/quickstart_version_mapping.yaml.
    [2023-04-16 07:48:11,036] INFO     {datahub.cli.docker_cli:638} - Using quickstart plan: composefile_git_ref='master' docker_tag='head'
    [2023-04-16 07:48:11,037] DEBUG    {datahub.telemetry.telemetry:248} - Sending telemetry for function-call
    [2023-04-16 07:48:11,196] DEBUG    {datahub.entrypoints:189} - Error: Docker doesn't seem to be running. Did you start it?
    Traceback (most recent call last):
      File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 699, in urlopen
        httplib_response = self._make_request(
      File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 394, in _make_request
        conn.request(method, url, **httplib_request_kw)
      File "/usr/lib/python3.10/http/client.py", line 1282, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/usr/lib/python3.10/http/client.py", line 1328, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/usr/lib/python3.10/http/client.py", line 1277, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/usr/lib/python3.10/http/client.py", line 1037, in _send_output
        self.send(msg)
      File "/usr/lib/python3.10/http/client.py", line 975, in send
        self.connect()
      File "/home/tim/.local/lib/python3.10/site-packages/docker/transport/unixconn.py", line 30, in connect
        sock.connect(self.unix_socket)
    PermissionError: [Errno 13] Permission denied
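The traceback ends with a PermissionError while connecting to the Docker Unix socket, so one possibility is that the current user is not allowed to read and write that socket. A minimal check is sketched below. It assumes the default socket path /var/run/docker.sock (a Docker Desktop or rootless setup may use a different path), and check_docker_socket is a hypothetical helper.

```python
import os
import stat


def check_docker_socket(path="/var/run/docker.sock"):
    """Return a short diagnosis of whether this user can use the socket."""
    if not os.path.exists(path):
        return "missing"
    if not stat.S_ISSOCK(os.stat(path).st_mode):
        return "not-a-socket"
    # The Docker client needs both read and write access to the socket.
    if os.access(path, os.R_OK | os.W_OK):
        return "ok"
    return "permission-denied"


if __name__ == "__main__":
    print(check_docker_socket())
```

If this reports permission-denied, the usual remedy on Linux is to add the user to the docker group (sudo usermod -aG docker $USER) and log in again. If it reports missing, the Docker CLI is probably talking to a different socket than the one the Python client tries.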
  • r

    rapid-zoo-88437

    04/17/2023, 7:07 AM
    Hi everyone, I ran a couple of tests on Spark lineage with Hive tables. Here are my results:
    • Hive lineage situation
    • spark 2.3.x
      ◦ orc, text format: both parent, child
      ◦ parquet format: only child
    • spark 2.4.x
      ◦ text format: both parent, child
      ◦ orc, parquet format: only child
    I wonder whether the orc and parquet formats aren't fully supported yet? Thanks!
  • c

    clever-magician-79463

    04/17/2023, 7:24 AM
    Execution finished with errors.
    {'exec_id': 'dc2c87c8-2590-40e6-88eb-c07e7e63adfa',
     'infos': ['2023-04-17 07:10:16.517630 INFO: Starting execution for task with name=RUN_INGEST',
               "2023-04-17 07:10:38.176173 INFO: Failed to execute 'datahub ingest'",
               '2023-04-17 07:10:38.181265 INFO: Caught exception EXECUTING task_id=dc2c87c8-2590-40e6-88eb-c07e7e63adfa, name=RUN_INGEST, '
               'stacktrace=Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 122, in execute_task\n'
               '    task_event_loop.run_until_complete(task_future)\n'
               '  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete\n'
               '    return future.result()\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 231, in execute\n'
               '    raise TaskError("Failed to execute \'datahub ingest\'")\n'
               "acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"],
     'errors': []}
    Hello, I am facing this issue while trying to ingest redshift data. Can anyone help with any fixes?
  • b

    bland-orange-13353

    04/17/2023, 7:43 AM
    This message was deleted.
  • g

    gentle-nest-73959

    04/17/2023, 8:26 AM
    Hi, I'm trying to provision self-hosted datahub on our kubernetes cluster. Currently I'm setting up authentication and authorization, and I managed to set up SSO and groups sync. However, I couldn't find how to assign a role to a group. How can I achieve this?
  • m

    microscopic-room-90690

    04/17/2023, 8:33 AM
    Hello team, can anyone help with this?
    This entity is not discoverable via search or lineage graph. Contact your DataHub admin for more information.
  • b

    brief-mechanic-70547

    04/17/2023, 11:01 AM
    Hi, I am trying to set up self-hosted datahub on my local M2 laptop. After installing as described in the official documentation, I am facing some issues.
    1- I start the docker container with
    datahub docker quickstart
    Even though DataHub is up and running, I got the warning below:
    ❗Client-Server Incompatible❗ Your client version 0.10.1.1 is older than your server version 0.10.2. Upgrading the cli to 0.10.2 is recommended.
    Upgrade via "pip install 'acryl-datahub==0.10.2'"
    2- When I follow the suggested upgrade, I get an error:
    ERROR: No matching distribution found for acryl-datahub==0.10.2
    Any suggestions?
  • f

    flat-engineer-75197

    04/17/2023, 11:05 AM
    👋 Is there a way to mass-delete owners from datasets?
  • b

    brief-mechanic-70547

    04/17/2023, 11:17 AM
    Hi, I am trying to connect to a Postgres instance running on my local machine. The instance is running and accessible through SQL editors etc. I am getting the error below;
    [2023-04-17 11:12:13,532] ERROR    {datahub.entrypoints:188} - Command failed: (psycopg2.OperationalError) could not connect to server: Connection refused
    	Is the server running on host "localhost" (127.0.0.1) and accepting
    	TCP/IP connections on port 5432?
    could not connect to server: Cannot assign requested address
    	Is the server running on host "localhost" (::1) and accepting
    	TCP/IP connections on port 5432?
    For the host and port I tried various alternatives, but none of them worked:
    • host.docker.internal:5432
    • localhost:5432
    • 127.0.0.1:5432
    Could you help me with this? Thanks
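A "Connection refused" on localhost often depends on where the ingestion process actually runs: inside a Docker container, localhost refers to the container itself and not to the machine hosting Postgres. A minimal reachability check is sketched below. It is meant to be run from the same environment as the ingestion, uses only the standard library, and can_connect is a hypothetical helper. The candidate hosts are the ones listed in the message.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    for candidate in ("host.docker.internal", "localhost", "127.0.0.1"):
        print(candidate, can_connect(candidate, 5432))
```

If none of the candidates is reachable from the ingestion environment, it is also worth checking which addresses Postgres listens on (listen_addresses in postgresql.conf), since a server bound only to the host's loopback interface is not reachable from a container.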
  • d

    dazzling-appointment-34954

    04/17/2023, 12:29 PM
    Hey experts, I have a problem with the lineage tab and hope someone can help. In the visual overview of lineage I can see all entities connected to my graph asset and, by clicking "+", can walk down the full graph. When I go to the same asset in the Lineage tab, I can see the list of all upstream lineage assets (in this case postgres datasets), but only for the 1st degree of dependency. When clicking on 2nd or 3rd, it shows an empty list. If I then go to one of these 1st-degree datasets, the 2nd and 3rd degree for them are shown properly, so it seems to be related to the dataset <-> graph relation (the others are dataset <-> dataset relations). We connected the graphs with datasets through a Python lineage emitter, by the way, so something might have gone wrong there? Can anyone please help, or does anyone have an idea what to do? Thank you in advance!
  • q

    quiet-rain-16785

    04/17/2023, 1:59 PM
    Hi guys, I am new to datahub. Can anyone help me integrate Airflow with DataHub? I have a rough idea from the docs, but it is not sufficient. Can anyone help me out, please?
  • b

    best-morning-7115

    04/17/2023, 2:07 PM
    Hello all! I already have one docker service running on port 3306, which is the port DataHub's mysql uses. To avoid the port conflict, I tried to change the mysql port from 3306 to 53306 and DATAHUB_MAPPED_GMS_PORT from 8080 to 58080. I am still getting the error after running the "datahub docker quickstart --quickstart-compose-file docker-compose-without-neo4j.quickstart.yml" command. Can anyone please let me know how I can resolve this issue? Thanks in advance 🙂 (I can see the DataHub login page on my localhost, but I am unable to log in.) My datahub version is 0.10.1.1, and I have attached the datahub-upgrade and datahub-logs files below. @gray-shoe-75895 @dazzling-judge-80093 can you please help me resolve this issue?
  • b

    brief-ability-41819

    04/17/2023, 2:15 PM
    Hello, we're running 0.9.6.1 on EKS with managed storage (RDS, OpenSearch, MSK). After an attempt to upgrade to 0.10.2, we had to perform a disaster recovery on the OpenSearch cluster, as the datahub-upgrade pod failed along with the datahub-gms pod and we couldn't restore functionality with a rollback. All upgrade jobs were set to true before applying the Helm chart. The chart itself was updated before the upgrade with helm repo update.
    Questions:
    • Is there any breaking change we're not aware of (like skipping too many major versions)? We did multiple upgrades in the past and this is the first time we're blocked.
    • Shall we perform a manual diff on values.yaml to mirror GitHub's chart to the letter?
  • w

    wide-afternoon-79955

    04/17/2023, 3:35 PM
    Hi All, we are running DataHub version 0.10.1 on AWS EKS and are facing very high latency (~20 seconds) in our DataHub searches.
    • AWS's Elasticsearch service, version 7.10, with 3 data nodes and no dedicated master node
    • Cache is not enabled on DataHub
    • We are using RDS for MySQL
    • We are not seeing high memory or CPU utilisation on any DataHub component, and the Elasticsearch data metrics look good
    • DataHub GMS config: 2 replicas with 16G each; the rest of the pods have 4G memory
    • We have 10-13k datasets
    Can someone please advise on how to tune DataHub search?
  • c

    clever-twilight-40247

    04/17/2023, 8:20 PM
    Hello - Running datahub quickstart version 0.10.1.1 on a Mac M1 and trying to ingest an Oracle data source via the datahub CLI (and UI), but it is failing due to lack of arm64 support for cx_oracle. Are there any alternatives? Error messages in 🧵
  • q

    quiet-rain-16785

    04/18/2023, 8:02 AM
    Hi guys!! Can anyone share with me a Python file that gives only the failed pipeline data using Kafka events? Please share the file.