# troubleshoot
  • b

    bland-orange-13353

    05/11/2023, 5:00 PM
    This message was deleted.
  • h

    high-twilight-23787

    05/11/2023, 9:38 PM
Hello everyone. 1/ I've installed the DataHub quickstart and it's OK (on a server with internet access). 2/ Now I want to start the DataHub quickstart on a 2nd server that doesn't have internet access. I've exported the containers (docker export ...) from the first server, so I've obtained images of the running containers. On the 2nd server (no internet access) the images have been imported, but I cannot use the command "datahub docker quickstart" since I don't have internet access (I get HTTPSConnection errors). I've created an ENV file with all environment variables. So, how can I start the containers without needing to connect to the internet? I've searched and read other messages about "offline install" but none was successful. Regards, Christophe
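One hedged sketch of an offline transfer (paths and tags below are illustrative, and this assumes the quickstart compose file is copied over too): note that `docker export` dumps a container's filesystem and drops the image name/tag metadata the compose file needs, whereas `docker save`/`docker load` preserve it.

```shell
# On the server WITH internet access: save the quickstart images with their tags
# (docker save keeps repository/tag metadata; docker export does not).
docker save -o datahub-images.tar $(docker ps --format '{{.Image}}' | sort -u)

# Copy datahub-images.tar and the quickstart compose file to the offline server, then:
docker load -i datahub-images.tar

# Point the CLI at the local compose file so it does not fetch one from the internet
# (the compose file path is illustrative).
datahub docker quickstart --quickstart-compose-file ./docker-compose.quickstart.yml
```

The CLI may still attempt an online version check; if so, a plain `docker compose -f docker-compose.quickstart.yml up -d` with the ENV file is a fallback that needs no network at all.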
  • s

    steep-soccer-91284

    05/12/2023, 3:29 AM
Hi, everyone. I’m trying to ingest Redash, but it raises the error below. How can I solve this? Best regards, Young
    Copy code
    PipelineInitError: Failed to find a registered source for type redash: 'str' object is not callable
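On the Redash error above: `Failed to find a registered source for type redash` generally means the Redash plugin either isn't installed in the environment running `datahub ingest` or failed to import (the trailing `'str' object is not callable` suggests the registry recorded an import error in place of the source class). A hedged first step:

```shell
# Install (or reinstall) the Redash source plugin into the same
# environment/venv that runs `datahub ingest`.
pip install --upgrade 'acryl-datahub[redash]'

# Check whether the plugin now registers cleanly; a broken plugin
# shows its import error in this listing.
datahub check plugins
```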
  • b

    brave-room-48783

    05/12/2023, 7:32 AM
Hi, I'm getting this error while ingesting Databricks. I tried reverting the urllib3 library to 1.26.0 as well as 1.24.3. Here are the logs:
    Copy code
    ~~~~ Execution Summary - RUN_INGEST ~~~~
    Execution finished with errors.
    {'exec_id': '7995ad9a-df8f-4ed3-85c1-f0032daa69de',
     'infos': ['2023-05-12 07:29:32.498479 INFO: Starting execution for task with name=RUN_INGEST',
               "2023-05-12 07:29:40.217073 INFO: Failed to execute 'datahub ingest'",
               '2023-05-12 07:29:40.217932 INFO: Caught exception EXECUTING task_id=7995ad9a-df8f-4ed3-85c1-f0032daa69de, name=RUN_INGEST, '
               'stacktrace=Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 122, in execute_task\n'
               '    task_event_loop.run_until_complete(task_future)\n'
               '  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete\n'
               '    return future.result()\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 231, in execute\n'
               '    raise TaskError("Failed to execute \'datahub ingest\'")\n'
               "acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"],
     'errors': []}
    
    ~~~~ Ingestion Logs ~~~~
    Obtaining venv creation lock...
    Acquired venv creation lock
    venv setup time = 0
    This version of datahub supports report-to functionality
    datahub  ingest run -c /tmp/datahub/ingest/7995ad9a-df8f-4ed3-85c1-f0032daa69de/recipe.yml --report-to /tmp/datahub/ingest/7995ad9a-df8f-4ed3-85c1-f0032daa69de/ingestion_report.json
    [2023-05-12 07:29:35,127] INFO     {datahub.cli.ingest_cli:173} - DataHub CLI version: 0.10.2
    [2023-05-12 07:29:35,658] INFO     {datahub.ingestion.run.pipeline:204} - Sink configured successfully. DataHubRestEmitter: configured to talk to <http://datahub-gms:8080>
    [2023-05-12 07:29:36,675] ERROR    {datahub.entrypoints:195} - Command failed: Failed to configure the source (unity-catalog): type object 'Retry' has no attribute 'DEFAULT_METHOD_WHITELIST'
    Traceback (most recent call last):
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 119, in _add_init_error_context
        yield
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 217, in __init__
        self.source = source_class.create(
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/ingestion/source/unity/source.py", line 172, in create
        return cls(ctx=ctx, config=config)
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/ingestion/source/unity/source.py", line 124, in __init__
        self.unity_catalog_api_proxy = proxy.UnityCatalogApiProxy(
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/ingestion/source/unity/proxy.py", line 125, in __init__
        ApiClient(
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/databricks_cli/sdk/api_client.py", line 106, in __init__
        method_whitelist=set({'POST'}) | set(Retry.DEFAULT_METHOD_WHITELIST),
    AttributeError: type object 'Retry' has no attribute 'DEFAULT_METHOD_WHITELIST'
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/entrypoints.py", line 182, in main
        sys.exit(datahub(standalone_mode=False, **kwargs))
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
        return self.main(*args, **kwargs)
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/click/core.py", line 1055, in main
        rv = self.invoke(ctx)
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/click/core.py", line 760, in invoke
        return __callback(*args, **kwargs)
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 379, in wrapper
        raise e
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 334, in wrapper
        res = func(*args, **kwargs)
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/utilities/memory_leak_detector.py", line 95, in wrapper
        return func(ctx, *args, **kwargs)
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 187, in run
        pipeline = Pipeline.create(
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 328, in create
        return cls(
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 216, in __init__
        with _add_init_error_context(f"configure the source ({source_type})"):
      File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
        self.gen.throw(typ, value, traceback)
      File "/tmp/datahub/ingest/venv-unity-catalog-0.10.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 121, in _add_init_error_context
        raise PipelineInitError(f"Failed to {step}: {e}") from e
    datahub.ingestion.run.pipeline.PipelineInitError: Failed to configure the source (unity-catalog): type object 'Retry' has no attribute 'DEFAULT_METHOD_WHITELIST'
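For the `DEFAULT_METHOD_WHITELIST` error above: urllib3 renamed `Retry.DEFAULT_METHOD_WHITELIST` to `DEFAULT_ALLOWED_METHODS` in 1.26 and removed the old name in 2.0, and the pinned `databricks-cli` still references the old name. Note that the traceback runs inside the executor's venv (`venv-unity-catalog-0.10.2`), so downgrading urllib3 in another environment has no effect; the `urllib3<2` pin has to land in that venv. A minimal compatibility shim, assuming urllib3 >= 1.26, would look like:

```python
# Hedged sketch: re-expose the pre-2.0 name for libraries (here databricks-cli)
# that still reference Retry.DEFAULT_METHOD_WHITELIST. urllib3 1.26 renamed it
# to DEFAULT_ALLOWED_METHODS; urllib3 2.0 removed the old name entirely.
from urllib3.util.retry import Retry

if not hasattr(Retry, "DEFAULT_METHOD_WHITELIST"):
    Retry.DEFAULT_METHOD_WHITELIST = Retry.DEFAULT_ALLOWED_METHODS
```

The shim has to run before `databricks_cli.sdk.api_client` is imported, so pinning `urllib3<2` inside the venv (or upgrading to a DataHub release that does so) is usually the cleaner fix.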
  • b

    bland-orange-13353

    05/12/2023, 8:15 AM
    This message was deleted.
  • m

    microscopic-lizard-81562

    05/12/2023, 8:17 AM
I have the problem that on Ubuntu the DataHub quickstart command can't see the running Docker daemon. The problem description is posted on Ask Ubuntu: https://askubuntu.com/questions/1467620/docker-doesnt-seem-to-be-running-did-you-start-it?noredirect=1#comment2571773_1467620 Does anyone know how to deal with this problem?
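On the quickstart-can't-see-Docker problem: the CLI talks to the Docker daemon through the local socket, so the usual suspects are the daemon not running, the user lacking socket permissions, or the socket living somewhere unexpected (snap or Docker Desktop installs). A hedged checklist, not a definitive fix:

```shell
# Can this user reach the daemon at all?
docker info

# Is the daemon actually running?
sudo systemctl status docker

# If only `sudo docker ...` works, grant the user socket access and re-login:
sudo usermod -aG docker "$USER"

# If Docker was installed via snap or Docker Desktop, the socket path may differ;
# pointing DOCKER_HOST at the right socket can help (path illustrative):
export DOCKER_HOST=unix:///var/run/docker.sock
```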
  • b

    bland-orange-13353

    05/12/2023, 8:39 AM
    This message was deleted.
  • a

    average-nail-72662

    05/12/2023, 12:50 PM
    Hi guys! I’m not able to run datahub
  • p

    proud-dusk-671

    05/12/2023, 1:26 PM
    Recently installed Datahub via K8s. Getting the following error on gms -
    Copy code
    2023-05-12 13:02:36,560 [ThreadPoolTaskExecutor-1] INFO  o.s.k.l.KafkaMessageListenerContainer:292 - generic-platform-event-job-client: partitions assigned: [PlatformEvent_v1-0]
    2023-05-12 13:02:40,944 [pool-10-thread-1] WARN  org.elasticsearch.client.RestClient:65 - request [POST <http://elasticsearch-master:9200/datahubpolicyindex_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true>] returned 2 warnings: [299 Elasticsearch-7.17.3-5ad023604c8d7416c9eb6c0eadb62b14e766caff "Elasticsearch built-in security features are not enabled. Without authentication, your cluster could be accessible to anyone. See <https://www.elastic.co/guide/en/elasticsearch/reference/7.17/security-minimal-setup.html> to enable security."],[299 Elasticsearch-7.17.3-5ad023604c8d7416c9eb6c0eadb62b14e766caff "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
    2023-05-12 13:03:16,631 [R2 Nio Event Loop-1-1] WARN  c.l.r.t.h.c.c.ChannelPoolLifecycle:139 - Failed to create channel, remote=localhost/127.0.0.1:8080
    io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8080
    Caused by: java.net.ConnectException: Connection refused
    	at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    	at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
    	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
    	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
    	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
    	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
    	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
    	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    	at java.base/java.lang.Thread.run(Thread.java:829)
  • i

    important-afternoon-19755

    05/12/2023, 2:56 PM
Hi, team. My DataHub version is v0.10.2. I'm trying to open the Queries tab using
DatasetUsageStatisticsClass
. I can see the Queries tab and it works well in testing (I emitted to about 30 URNs). But after emitting DatasetUsageStatisticsClass to about 4k URNs, when I click a data source, after about 10 seconds of loading I get the error "An unknown error occurred. (code 500)" and the page looks like the attached picture, even for data sources where I haven't opened the Queries tab. Is there a limit on how many Queries tabs I can have open? Also, I set the max length of each query I emit to the Queries tab to 10000; is there a limit on the length of each query?
  • q

    quiet-television-68466

    05/12/2023, 5:14 PM
Hello, we are on version v0.10.2. As we're testing out the new search autocomplete feature, we get the following error on every new search we make. The only functional issue is that autocomplete doesn't work, but the 500 error popping up on the screen all the time is really distracting! This doesn't occur on our dev environment, which is identical infra-wise (other than having fewer assets). Any help troubleshooting this would be appreciated!
Attachments: datahub search error.txt, datahub search error.mov
  • s

    shy-dog-84302

    05/13/2023, 3:38 AM
Hi! I’m running DataHub on K8s. Can someone help me with an error situation where I get a 500 from the backend and nothing gets displayed on the frontend when I try to access a dataset? The following is the error message in the DataHub GMS logs.
    Copy code
    org.elasticsearch.client.ResponseException: method [POST], host [<https://opensearch-service.com:11862>], URI [/dataset_datasetusagestatisticsaspect_v1/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 500 Internal Server Error]
    {"error":{"root_cause":[{"type":"exception","reason":"java.util.concurrent.ExecutionException: java.lang.IllegalStateException: unexpected docvalues type NONE for field 'topSqlQueries' (expected one of [SORTED, SORTED_SET]). Re-index with correct docvalues type."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"dataset_datasetusagestatisticsaspect_v1","node":"lSc8kpchRyGhN_lPDqNyvQ","reason":{"type":"exception","reason":"java.util.concurrent.ExecutionException: java.lang.IllegalStateException: unexpected docvalues type NONE for field 'topSqlQueries' (expected one of [SORTED, SORTED_SET]). Re-index with correct docvalues type.","caused_by":{"type":"execution_exception","reason":"java.lang.IllegalStateException: unexpected docvalues type NONE for field 'topSqlQueries' (expected one of [SORTED, SORTED_SET]). Re-index with correct docvalues type.","caused_by":{"type":"illegal_state_exception","reason":"unexpected docvalues type NONE for field 'topSqlQueries' (expected one of [SORTED, SORTED_SET]). Re-index with correct docvalues type."}}}}]},"status":500}
  • m

    mysterious-scooter-52411

    05/14/2023, 7:51 AM
I am trying to build DataHub with Gradle on EC2. It reports running out of memory on an instance with 16 GB of EBS space. What should the ideal storage be?
  • h

    hallowed-lock-74921

    05/14/2023, 4:26 PM
    Hi Team
  • h

    hallowed-lock-74921

    05/14/2023, 4:26 PM
I am getting the below error while executing ./gradlew quickstart
  • h

    hallowed-lock-74921

    05/14/2023, 4:26 PM
> Configure project :docker:mysql-setup
fullVersion=v0.10.2-147-g0fa983a.dirty cliMajorVersion=0.10.2 version=0.10.3-SNAPSHOT
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
> Task :docker:mysql-setup:docker FAILED
unknown flag: --load
See 'docker --help'.
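The `unknown flag: --load` failure above typically means the Docker CLI is too old to know the BuildKit/buildx `--load` flag that the Docker build tasks appear to pass; upgrading Docker (or installing the buildx plugin) is the usual remedy. A hedged check:

```shell
# --load belongs to buildx / BuildKit-era Docker; verify both are available:
docker --version
docker buildx version

# Illustrative: a buildx build that understands --load (it imports the built
# image into the local `docker images` store).
docker buildx build --load -t example/image:dev .
```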
  • h

    hallowed-lock-74921

    05/14/2023, 5:25 PM
FAILURE: Build completed with 3 failures.

1: Task failed with an exception.
-----------
* What went wrong:
Execution failed for task ':docker:kafka-setup:docker'.
> Process 'command 'docker'' finished with non-zero exit value 125
* Try: Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
==============================================================================
2: Task failed with an exception.
-----------
* What went wrong:
Execution failed for task ':docker:elasticsearch-setup:docker'.
> Process 'command 'docker'' finished with non-zero exit value 125
* Try: Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
==============================================================================
3: Task failed with an exception.
-----------
* What went wrong:
Execution failed for task ':docker:mysql-setup:docker'.
> Process 'command 'docker'' finished with non-zero exit value 125
* Try: Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
==============================================================================
* Get more help at https://help.gradle.org

Deprecated Gradle features were used in this build, making it incompatible with Gradle 7.0.
Use '--warning-mode all' to show the individual deprecation warnings.
See https://docs.gradle.org/6.9.2/userguide/command_line_interface.html#sec:command_line_warnings

BUILD FAILED in 1m 10s
88 actionable tasks: 52 executed, 36 up-to-date
(venv) Apples-MacBook-Pro-2:datahub apple$
  • b

    best-wire-59738

    05/15/2023, 4:41 AM
    Hi Team, We are getting
    SSLV3_ALERT_HANDSHAKE_FAILURE
error while connecting to MariaDB using an SSL account. Can you please help me overcome this issue?
  • r

    rich-market-9876

    05/15/2023, 9:42 AM
Hi, I have a question about domains: is it possible to know which domain an entity is associated with (using a query)? I thought this should be the solution for the "container" entity, for example, but it doesn't give me the domain's urn.
    Copy code
... on Container {
  domain {
    associatedUrn
  }
}
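On the domain query above: in the GraphQL schema, `domain` on an entity resolves to a `DomainAssociation`, whose `associatedUrn` is the urn of the entity the domain is attached to, not the domain itself; the domain's own urn is one level deeper. A sketch (field names follow the DataHub GraphQL schema, worth double-checking against your version):

```graphql
... on Container {
  domain {
    associatedUrn      # urn of the entity the domain is attached to
    domain {
      urn              # the domain's own urn
      properties {
        name
      }
    }
  }
}
```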
  • h

    helpful-computer-25834

    05/15/2023, 10:07 AM
Hello, please help me understand: I have a GX (Great Expectations) check, and for some reason each record of it in DataHub shows up as a new check in the UI, whereas I expected only one check to be displayed, with its launches shown on the graph.
  • m

    millions-solstice-5820

    05/15/2023, 11:54 AM
Hi all, we are now experimenting with using datahub-actions to propagate updates into an internal Teams channel, but are facing some issues. We are on version 0.10.2.3, deployed with quickstart using the most up-to-date compose yml file (adapted for our SSO integration). We followed the guideline for the prerequisites. We run:
    Copy code
    python3 -m pip install --upgrade pip wheel setuptools
    python3 -m pip install --upgrade acryl-datahub
    datahub --version
    Output:
    Copy code
    acryl-datahub, version 0.10.2.3
    Followed up by:
    Copy code
    python3 -m pip install --upgrade pip wheel setuptools
    python3 -m pip install --upgrade acryl-datahub-actions
    datahub actions version
    Output:
    Copy code
    DataHub Actions version: 0.0.12
    Python version: 3.9.10 (main, Feb  9 2022, 00:00:00)
    [GCC 11.2.1 20220127 (Red Hat 11.2.1-9)]
    We setup environment variables and restarted:
    Copy code
    export DATAHUB_ACTIONS_TEAMS_ENABLED=true
    export DATAHUB_ACTIONS_TEAMS_WEBHOOK_URL=<our webhook url>
    
    datahub docker quickstart --stop && datahub docker quickstart
    We setup a action-config yaml file for Teams as with the structure from the official setup page And tried to run:
    Copy code
    datahub actions -c teams-action.yml
    but getting the following error:
    Copy code
    [2023-05-15 13:32:08,288] INFO     {datahub_actions.cli.actions:76} - DataHub Actions version: 0.0.12
    [2023-05-15 13:32:08,367] ERROR    {datahub.entrypoints:195} - Command failed: Failed to instantiate Actions Pipeline using config datahub_teams_action: Caught exception while attempting to instantiate Action with type teams.
    Traceback (most recent call last):
      File "/home/eamadm/.local/lib/python3.9/site-packages/datahub/ingestion/api/registry.py", line 117, in _ensure_not_lazy
        plugin_class = import_path(path)
      File "/home/eamadm/.local/lib/python3.9/site-packages/datahub/ingestion/api/registry.py", line 48, in import_path
        item = importlib.import_module(module_name)
      File "/usr/lib64/python3.9/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
      File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 850, in exec_module
      File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
      File "/home/eamadm/.local/lib/python3.9/site-packages/datahub_actions/plugin/action/teams/teams.py", line 18, in <module>
        import pymsteams
    ModuleNotFoundError: No module named 'pymsteams'
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/home/eamadm/.local/lib/python3.9/site-packages/datahub_actions/pipeline/pipeline_util.py", line 130, in create_action
        action_class = action_registry.get(action_type)
      File "/home/eamadm/.local/lib/python3.9/site-packages/datahub/ingestion/api/registry.py", line 175, in get
        raise ConfigurationError(
    datahub.configuration.common.ConfigurationError: teams is disabled; try running: pip install 'acryl-datahub[teams]'
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/home/eamadm/.local/lib/python3.9/site-packages/datahub_actions/cli/actions.py", line 50, in pipeline_config_to_pipeline
    ...
When trying to execute pip install 'acryl-datahub[teams]', as indicated in the log, we get the following output containing this statement:
    Copy code
    WARNING: acryl-datahub 0.10.2.3 does not provide the extra 'teams'
and a full list of requirements already satisfied:
    Copy code
    Requirement already satisfied: mixpanel>=4.9.0 in /home/eamadm/.local/lib/python3.9/site-packages (from acryl-datahub[teams]) (4.10.0)
    Requirement already satisfied: avro-gen3==0.7.10 in /home/eamadm/.local/lib/python3.9/site-packages (from acryl-datahub[teams]) (0.7.10)
    Requirement already satisfied: ratelimiter in /home/eamadm/.local/lib/python3.9/site-packages (from acryl-datahub[teams]) (1.2.0.post0)
    Requirement already satisfied: avro<1.11,>=1.10.2 in /home/eamadm/.local/lib/python3.9/site-packages (from acryl-datahub[teams]) (1.10.2)
    Requirement already satisfied: cached-property in /home/eamadm/.local/lib/python3.9/site-packages (from acryl-datahub[teams]) (1.5.2)
    Requirement already satisfied: mypy-extensions>=0.4.3 in /home/eamadm/.local/lib/python3.9/site-packages (from acryl-datahub[teams]) (0.4.3)
    Requirement already satisfied: python-dateutil>=2.8.0 in /home/eamadm/.local/lib/python3.9/site-packages (from acryl-datahub[teams]) (2.8.2)
    Requirement already satisfied: click-default-group in /home/eamadm/.local/lib/python3.9/site-packages (from acryl-datahub[teams]) (1.2.2)
    ...
Thanks for any advice or pointers in the right direction!
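On the Teams action above: the extras hint in the log appears to be misleading; the `teams` action and its `pymsteams` dependency ship with the `acryl-datahub-actions` package, not with `acryl-datahub`, which is why `pip install 'acryl-datahub[teams]'` warns that the extra doesn't exist. A hedged fix:

```shell
# The teams plugin lives in acryl-datahub-actions, so install its extra...
python3 -m pip install --upgrade 'acryl-datahub-actions[teams]'

# ...or install the missing dependency directly:
python3 -m pip install pymsteams

# Then retry:
datahub actions -c teams-action.yml
```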
  • f

    fierce-electrician-85924

    05/15/2023, 12:34 PM
We tried to run the DataHub upgrade on our system with version v0.9.2, but we are facing the following issue. What could be the reason behind this?
  • m

    miniature-rain-19681

    05/15/2023, 2:23 PM
Good morning, everyone. We are trying to push data validations to DataHub. Could someone share the API we should use and an example JSON? These validations are for a Glue table.
  • i

    icy-kitchen-54364

    05/15/2023, 7:26 PM
Hello Team, I am new to DataHub and I am getting this error while running datahub docker quickstart. It was working previously; after a system restart we started the Docker services again, ran quickstart, and now get this error:
dependency failed to start: container for service "zookeeper" exited (1)
[+] Running 6/7
⠿ Container zookeeper Error 2.9s
⠿ Container mysql Healthy 0.5s
⠿ Container broker Created 0.0s
⠿ Container mysql-setup Started 1.1s
⠿ Container schema-registry Created 0.0s
⠿ Container elasticsearch Healthy 0.5s
⠿ Container elasticsearch-setup Started 1.0s
dependency failed to start: container for service "zookeeper" exited (1)
Unable to run quickstart - the following issues were detected:
- datahub-frontend-react is not running
- datahub-gms is not running
- kafka-setup is still running
- schema-registry is not running
- broker is not running
- zookeeper is not running
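When a quickstart container exits right after a host restart, its own logs usually say why; for ZooKeeper, state left behind by an unclean shutdown is a common culprit. A hedged triage sequence (the last step deletes all DataHub data):

```shell
# Why did zookeeper exit with code 1? Its logs usually say.
docker logs zookeeper

# Stop and restart the stack cleanly.
datahub docker quickstart --stop
datahub docker quickstart

# Last resort if state is corrupted: wipe the stack entirely
# (DESTROYS all metadata and volumes!) and start fresh.
datahub docker nuke
datahub docker quickstart
```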
  • m

    most-byte-90620

    05/15/2023, 9:44 PM
Hi team. I am getting this error while ingesting an Athena source through the DataHub UI. DataHub is running in Docker on an EC2 instance using this role, and the role has Athena full access. Any help is appreciated.
    Copy code
    '[2023-05-15 21:28:50,063] INFO     {botocore.credentials:1108} - Found credentials from IAM Role: AmazonSSMRoleForInstancesQuickSetup\n'
               '2023-05-15 21:28:50,168 INFO sqlalchemy.engine.base.Engine \n'
               '                SELECT schema_name\n'
               '                FROM information_schema.schemata\n'
               "                WHERE schema_name NOT IN ('information_schema')\n"
               '                \n'
               '[2023-05-15 21:28:50,168] INFO     {sqlalchemy.engine.base.Engine:110} - \n'
               '                SELECT schema_name\n'
               '                FROM information_schema.schemata\n'
               "                WHERE schema_name NOT IN ('information_schema')\n"
               '                \n'
               '2023-05-15 21:28:50,168 INFO sqlalchemy.engine.base.Engine {}\n'
               '[2023-05-15 21:28:50,168] INFO     {sqlalchemy.engine.base.Engine:110} - {}\n'
               '[2023-05-15 21:28:50,203] ERROR    {pyathena.common:420} - Failed to execute query.\n'
               'Traceback (most recent call last):\n'
               '  File "/tmp/datahub/ingest/venv-athena-0.9.1/lib/python3.10/site-packages/pyathena/common.py", line 413, in _execute\n'
               '    query_id = retry_api_call(\n'
               '  File "/tmp/datahub/ingest/venv-athena-0.9.1/lib/python3.10/site-packages/pyathena/util.py", line 84, in retry_api_call\n'
               '    return retry(func, *args, **kwargs)\n'
               '  File "/tmp/datahub/ingest/venv-athena-0.9.1/lib/python3.10/site-packages/tenacity/__init__.py", line 406, in __call__\n'
               '    do = self.iter(retry_state=retry_state)\n'
               '  File "/tmp/datahub/ingest/venv-athena-0.9.1/lib/python3.10/site-packages/tenacity/__init__.py", line 351, in iter\n'
               '    return fut.result()\n'
               '  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result\n'
               '    return self.__get_result()\n'
               '  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result\n'
               '    raise self._exception\n'
               '  File "/tmp/datahub/ingest/venv-athena-0.9.1/lib/python3.10/site-packages/tenacity/__init__.py", line 409, in __call__\n'
               '    result = fn(*args, **kwargs)\n'
               '  File "/tmp/datahub/ingest/venv-athena-0.9.1/lib/python3.10/site-packages/botocore/client.py", line 495, in _api_call\n'
               '    return self._make_api_call(operation_name, kwargs)\n'
               '  File "/tmp/datahub/ingest/venv-athena-0.9.1/lib/python3.10/site-packages/botocore/client.py", line 914, in _make_api_call\n'
               '    raise error_class(parsed_response, operation_name)\n'
               'botocore.exceptions.ClientError: An error occurred (AccessDeniedException) when calling the StartQueryExecution operation: You are not '
               'authorized to perform: athena:StartQueryExecution on the resource. After your AWS administrator or you have updated your permissions,
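On the Athena `AccessDeniedException` above: even with an "Athena full access" policy attached, `StartQueryExecution` can still be denied by a permissions boundary, a service control policy, or a workgroup/`Resource` restriction on the instance role, and the Athena source also needs Glue (schemas) and S3 (query results) access. A hedged minimal policy fragment; the account id, workgroup, and bucket names are illustrative:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "athena:StartQueryExecution",
        "athena:GetQueryExecution",
        "athena:GetQueryResults"
      ],
      "Resource": "arn:aws:athena:*:123456789012:workgroup/primary"
    },
    {
      "Effect": "Allow",
      "Action": ["glue:GetDatabases", "glue:GetTables", "glue:GetTable"],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:GetBucketLocation", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::example-athena-results-bucket",
        "arn:aws:s3:::example-athena-results-bucket/*"
      ]
    }
  ]
}
```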
  • s

    salmon-garden-47148

    05/16/2023, 7:17 AM
👋 Hello, I have some issues when I try to deploy DataHub with Helm (latest release, no custom charts) on Kubernetes (EKS). I use OpenSearch & RDS, but I prefer to use the Kafka pod for now. datahub-gms isn't running:
    Copy code
    Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8080
    Caused by: java.net.ConnectException: Connection refused
The readiness/liveness probes always fail:
    Copy code
    Warning  Unhealthy  4m31s (x12 over 8m1s)  kubelet            Readiness probe failed: Get "<http://10.18.64.5:8080/health>": dial tcp 10.18.64.5:8080: connect: connection refused
      Normal   Killing    4m31s                  kubelet            Container datahub-gms failed liveness probe, will be restarted
      Warning  Unhealthy  3m1s (x9 over 8m1s)    kubelet            Liveness probe failed: Get "<http://10.18.64.5:8080/health>": dial tcp 10.18.64.5:8080: connect: connec
and the nocode-migration job:
    Copy code
ERROR: Cannot connect to GMS at <http://host> datahub-datahub-gms port 8080. Make sure GMS is on the latest version and is running at that host before starting the migration.
    NAME READY STATUS RESTARTS AGE
    datahub-datahub-frontend-d7cfdfb96-sxd59 1/1 Running 0 4m58s
    datahub-datahub-gms-bb6d47fcf-7g7tc 0/1 Running 1 (27s ago) 4m58s
    datahub-elasticsearch-setup-job-bnnvr 0/1 Completed 0 6m20s
    datahub-kafka-setup-job-fcvqg 0/1 Completed 0 6m14s
    datahub-nocode-migration-job-56rrb 0/1 Error 0 72s
    datahub-nocode-migration-job-8vljj 0/1 Error 0 3m55s
    datahub-nocode-migration-job-h975w 0/1 Error 0 2m49s
    datahub-nocode-migration-job-zkclz 0/1 Error 0 4m55s
    datahub-postgresql-setup-job-4kl22 0/1 Completed 0 5m2s
    prerequisites-cp-schema-registry-cf79bfccf-f6mpx 2/2 Running 0 7m42s
    prerequisites-kafka-0 1/1 Running 0 7m42s
    prerequisites-zookeeper-0 1/1 Running 0 7m42s
I tried deleting the pod, redeploying in another namespace, and redeploying with the full pod set and no AWS services; I always get the same issue.
  • c

    clever-magician-79463

    05/16/2023, 1:33 PM
Hi all, I have a query. This is the scenario: I created 2 instances of DataHub on different machines, call them datahub_1 and datahub_2. The MySQL server is hosted on different DBs on RDS, call them datahub_primary and datahub_secondary for the respective instances. I ingested some data into datahub_1; this data goes into the metadata_aspect_v2 table of the datahub_primary DB. Now I want to point datahub_2 to the same metadata. Is just copying the metadata_aspect_v2 table from the primary to the secondary DB not enough? I tried doing it but I still don't see the data in datahub_2. Is there some other way of restoring the metadata? This test was done by deploying both instances through Docker, and I was trying to test the backup and restore functionality of DataHub. Can anyone please suggest why the above result was seen, and whether there is a better or official way of restoring metadata and indices? Happy to take this offline if needed. Thanks a lot! Edit: I have tried directly pointing the DB of datahub_2 at the same DB as datahub_1, but no use. I created different DBs and copied the table, but still no use.
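On the two-instance restore question above: copying `metadata_aspect_v2` only moves the primary (SQL) store, while the UI reads search and graph data from Elasticsearch, which the second instance has never built, so nothing shows up until the indices are restored from SQL. A hedged sketch for quickstart deployments:

```shell
# After copying metadata_aspect_v2 into datahub_2's database,
# rebuild the search and graph indices from the SQL store:
datahub docker quickstart --restore-indices
```

On non-quickstart deployments the equivalent is running the `datahub-upgrade` job with `-u RestoreIndices`.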
  • q

    quiet-television-68466

    05/16/2023, 1:59 PM
    Is anybody else having issues with the
    scrollAcrossEntities
GraphQL query? I have tried this from the docs here: https://datahubproject.io/docs/how/search/#searching-at-scale
    Copy code
    {
      scrollAcrossEntities(input: { types: [DATASET], query: "*", count: 10}) {
        nextScrollId
        count
        searchResults {
          entity {
            type
            ... on Dataset {
              urn
              name
            }
          }
        }
      }
    }
    and I get returned the following error:
    Copy code
    {
      "errors": [
        {
          "message": "An unknown error occurred.",
          "locations": [
            {
              "line": 2,
              "column": 3
            }
          ],
          "path": [
            "scrollAcrossEntities"
          ],
          "extensions": {
            "code": 500,
            "type": "SERVER_ERROR",
            "classification": "DataFetchingException"
          }
        }
      ],
      "data": {
        "scrollAcrossEntities": null
      },
      "extensions": {}
    }
Is this something wrong on our end, or is this an issue other teams are having? We’re currently on DataHub 0.10.2.
  • i

    incalculable-translator-39548

    05/16/2023, 2:29 PM
Hi DataHub team. May I know if you support AWS Glue ETL? If I follow this documentation, https://datahubproject.io/docs/metadata-integration/java/spark-lineage/, should I be able to use it? Thanks
  • p

    prehistoric-farmer-31305

    05/16/2023, 2:49 PM
    Hello, My datahub instance is running on GKE. I am getting started with dbt (not dbt-cloud). It seems that when running the ingestion, I am getting an issue connecting to GMS. Upon doing some troubleshooting, it seems that my machine is not able to connect to the GMS server.
    Copy code
    DataHub CLI version: 0.10.2.1
    Failed to set up framework context: Failed to connect to DataHub
When I look at the GMS logs (kubectl logs for the GMS server), I get the below error:
    Copy code
    [ThreadPoolTaskExecutor-1] WARN o.apache.kafka.clients.NetworkClient:969 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Error connecting to node prerequisites-kafka-0.prerequisites-kafka-headless.default.svc.cluster.local:9092 (id: 0 rack: null)
What is the next course of action? It seems like GMS is not able to communicate with Kafka.