# troubleshoot
  • r

    rhythmic-zoo-52859

    09/20/2022, 5:39 AM
    Hi, I would like to add a key to the ResponseHeaders after a successful login, but I could not find where I should add it.
  • j

    jolly-hospital-52505

    09/20/2022, 9:03 AM
    Getting the below error while creating a new ingestion source:
    Copy code
    ~~~~ Execution Summary ~~~~
    
    RUN_INGEST - {'errors': [],
     'exec_id': '04fe03c0-c639-4f9f-aa52-6d488a12738f',
     'infos': ['2022-09-20 09:01:36.448196 [exec_id=04fe03c0-c639-4f9f-aa52-6d488a12738f] INFO: Starting execution for task with name=RUN_INGEST',
               '2022-09-20 09:01:40.487800 [exec_id=04fe03c0-c639-4f9f-aa52-6d488a12738f] INFO: stdout=Elapsed seconds = 0\n'
               '  --report-to TEXT                Provide an destination to send a structured\n'
               'This version of datahub supports report-to functionality\n'
               'datahub  ingest run -c /tmp/datahub/ingest/04fe03c0-c639-4f9f-aa52-6d488a12738f/recipe.yml --report-to '
               '/tmp/datahub/ingest/04fe03c0-c639-4f9f-aa52-6d488a12738f/ingestion_report.json\n'
               '[2022-09-20 09:01:39,455] INFO     {datahub.cli.ingest_cli:179} - DataHub CLI version: 0.8.44\n'
               '[2022-09-20 09:01:40,097] ERROR    {datahub.entrypoints:192} - \n'
               'Traceback (most recent call last):\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/datahub/entrypoints.py", line 149, in main\n'
               '    sys.exit(datahub(standalone_mode=False, **kwargs))\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/click/core.py", line 1130, in __call__\n'
               '    return self.main(*args, **kwargs)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/click/core.py", line 1055, in main\n'
               '    rv = self.invoke(ctx)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/click/core.py", line 1657, in invoke\n'
               '    return _process_result(sub_ctx.command.invoke(sub_ctx))\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/click/core.py", line 1657, in invoke\n'
               '    return _process_result(sub_ctx.command.invoke(sub_ctx))\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/click/core.py", line 1404, in invoke\n'
               '    return ctx.invoke(self.callback, **ctx.params)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/click/core.py", line 760, in invoke\n'
               '    return __callback(*args, **kwargs)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func\n'
               '    return f(get_current_context(), *args, **kwargs)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/datahub/telemetry/telemetry.py", line 347, in wrapper\n'
               '    raise e\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/datahub/telemetry/telemetry.py", line 299, in wrapper\n'
               '    res = func(*args, **kwargs)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/datahub/utilities/memory_leak_detector.py", line 102, in '
               'wrapper\n'
               '    return func(*args, **kwargs)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/datahub/cli/ingest_cli.py", line 182, in run\n'
               '    pipeline_config = load_config_file(\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/datahub/configuration/config_loader.py", line 79, in '
               'load_config_file\n'
               '    config = resolve_env_variables(raw_config)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/datahub/configuration/config_loader.py", line 43, in '
               'resolve_env_variables\n'
               '    new_dict[k] = resolve_env_variables(v)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/datahub/configuration/config_loader.py", line 43, in '
               'resolve_env_variables\n'
               '    new_dict[k] = resolve_env_variables(v)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/datahub/configuration/config_loader.py", line 47, in '
               'resolve_env_variables\n'
               '    new_dict[k] = resolve_element(v)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/datahub/configuration/config_loader.py", line 15, in '
               'resolve_element\n'
               '    return expandvars(element, nounset=True)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/expandvars.py", line 475, in expandvars\n'
               '    return expand(vars_, nounset=nounset)\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/expandvars.py", line 434, in expand\n'
               '    return "".join(buff) + expand_var(\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/expandvars.py", line 171, in expand_var\n'
               '    return expand_modifier_var(\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/expandvars.py", line 216, in expand_modifier_var\n'
               '    return expand_advanced(\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/expandvars.py", line 294, in expand_advanced\n'
               '    return expand_offset(\n'
               '  File "/tmp/datahub/ingest/venv-postgres-0.8.44/lib/python3.9/site-packages/expandvars.py", line 344, in expand_offset\n'
               '    raise OperandExpected(var, offset_str)\n'
               "expandvars.OperandExpected: CMDB: operand expected (error token is 'Credential')\n"
               '[2022-09-20 09:01:40,098] ERROR    {datahub.entrypoints:195} - Command failed: \n'
               "\tCMDB: operand expected (error token is 'Credential').\n"
               '\tRun with --debug to get full stacktrace.\n'
               "\te.g. 'datahub --debug ingest run -c /tmp/datahub/ingest/04fe03c0-c639-4f9f-aa52-6d488a12738f/recipe.yml --report-to "
               "/tmp/datahub/ingest/04fe03c0-c639-4f9f-aa52-6d488a12738f/ingestion_report.json'\n",
               "2022-09-20 09:01:40.488106 [exec_id=04fe03c0-c639-4f9f-aa52-6d488a12738f] INFO: Failed to execute 'datahub ingest'",
               '2022-09-20 09:01:40.488406 [exec_id=04fe03c0-c639-4f9f-aa52-6d488a12738f] INFO: Caught exception EXECUTING '
               'task_id=04fe03c0-c639-4f9f-aa52-6d488a12738f, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.9/site-packages/acryl/executor/execution/default_executor.py", line 122, in execute_task\n'
               '    self.event_loop.run_until_complete(task_future)\n'
               '  File "/usr/local/lib/python3.9/site-packages/nest_asyncio.py", line 89, in run_until_complete\n'
               '    return f.result()\n'
               '  File "/usr/local/lib/python3.9/asyncio/futures.py", line 201, in result\n'
               '    raise self._exception\n'
               '  File "/usr/local/lib/python3.9/asyncio/tasks.py", line 256, in __step\n'
               '    result = coro.send(None)\n'
               '  File "/usr/local/lib/python3.9/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 142, in execute\n'
               '    raise TaskError("Failed to execute \'datahub ingest\'")\n'
               "acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"]}
    Execution finished with errors.
    d
    • 2
    • 6
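    A note on the error above: the traceback shows the recipe loader expanding ${...} expressions as environment variables (resolve_env_variables via expandvars), so a literal value such as "${CMDB:Credential ...}" in a credential field gets parsed as a variable expansion with an offset and fails with "operand expected". Below is a minimal, illustrative Postgres recipe sketch that avoids this by referencing a secret or environment variable instead; POSTGRES_PASSWORD, the host, database, and user names are hypothetical placeholders, not the poster's actual config.
    source:
      type: postgres
      config:
        host_port: "mydb.example.com:5432"   # hypothetical host
        database: mydb                       # hypothetical database name
        username: datahub_reader             # hypothetical user
        # Reference a secret / environment variable rather than embedding a
        # literal value containing '$', which the recipe loader tries to expand.
        password: "${POSTGRES_PASSWORD}"
    sink:
      type: datahub-rest
      config:
        server: "http://datahub-gms:8080"    # hypothetical GMS address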
  • b

    broad-fountain-13319

    09/20/2022, 10:34 AM
    Morning, we were about to release an upgrade to 0.8.34 to our live environment before realising the following:
    Copy code
    > [config_kafka-setup stage-1 4/8] RUN mkdir -p /opt   && mirror=$(curl --stderr /dev/null <https://www.apache.org/dyn/closer.cgi?as_json=1> | jq -r '.preferred')   && curl -sSL "${mirror}kafka/2.8.1/kafka_2.13-2.8.1.tgz"   | tar -xzf - -C /opt   && mv /opt/kafka_2.13-2.8.1 /opt/kafka   && adduser -DH -s /sbin/nologin kafka   && chown -R kafka: /opt/kafka   && echo "===> Installing python packages ..."    && pip install --no-cache-dir jinja2 requests   && pip install --prefer-binary --prefix=/usr/local --upgrade "git+<https://github.com/confluentinc/confluent-docker-utils@v0.0.49>"   && echo "===> Applying log4j log4shell fix based on <https://www.slf4j.org/log4shell.html> ..."   && zip -d /opt/kafka/libs/log4j-1.2.17.jar org/apache/log4j/net/JMSAppender.class   && rm -rf /tmp/*   && apk del --purge .build-deps:
    #0 0.853 tar: invalid magic
    #0 0.853 tar: short read
    https://dlcdn.apache.org/kafka/ It seems Kafka 2.8.1 has since been removed from the mirrors; a number of DataHub versions may be affected, as the version is hardcoded in the kafka-setup Dockerfile. Thought it was worth flagging.
    thankyou 1
    i
    • 2
    • 3
  • b

    bitter-forest-52650

    09/20/2022, 10:36 AM
    hi, I searched a lot but I could not find a solution to this problem: I am trying to follow these steps for deploying datahub on my computer. However, I am getting this error while running "datahub docker quickstart": Unable to run quickstart: - Docker doesn't seem to be running. Did you start it? Before you ask, yes, I started Docker and it is running. Could you help me?
    h
    • 2
    • 1
  • f

    fresh-cricket-75926

    09/20/2022, 11:29 AM
    Hi, during LDAP ingestion we are getting the below error for a few users. Can anyone please suggest what the issue or fix might be here?
    Copy code
    Unable to emit metadata to DataHub '
                          'GMS", "info": {"exceptionClass": "com.linkedin.restli.server.RestLiServiceException", "stackTrace": '
                          '"com.linkedin.restli.server.RestLiServiceException [HTTP Status:400]: Error: cannot provide an URN with leading or trailing '
                          'whitespace\\n\\tat com.linkedin.metadata.restli.RestliUtil.badRequestException(RestliUtil.java:84)\\n\\tat '
                          'com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:35)", "message": "Error: cannot provide an URN with leading or '
                          'trailing whitespace", "status": 400, "id": "urn:li:corpuser:cisweingest02 "}},
    w
    • 2
    • 3
  • f

    fast-ice-59096

    09/20/2022, 1:08 PM
    Hi everyone, I am trying to connect to my MySQL DB and I am getting the following error:
  • f

    fast-ice-59096

    09/20/2022, 1:09 PM
    Copy code
    ~~~~ Execution Summary ~~~~
    
    RUN_INGEST - {'errors': [],
     'exec_id': '1da5a23b-91f5-4f60-a723-2daf6003cb7b',
     'infos': ['2022-09-20 13:03:35.322273 [exec_id=1da5a23b-91f5-4f60-a723-2daf6003cb7b] INFO: Starting execution for task with name=RUN_INGEST',
               '2022-09-20 13:03:37.367155 [exec_id=1da5a23b-91f5-4f60-a723-2daf6003cb7b] INFO: stdout=venv setup time = 0\n'
               'This version of datahub supports report-to functionality\n'
               'datahub  ingest run -c /tmp/datahub/ingest/1da5a23b-91f5-4f60-a723-2daf6003cb7b/recipe.yml --report-to '
               '/tmp/datahub/ingest/1da5a23b-91f5-4f60-a723-2daf6003cb7b/ingestion_report.json\n'
               '[2022-09-20 13:03:36,887] INFO     {datahub.cli.ingest_cli:170} - DataHub CLI version: 0.8.42\n'
               "[2022-09-20 13:03:37,131] ERROR    {datahub.entrypoints:188} - Command failed with : operand expected (error token is 'password'). "
               'Run with --debug to get full trace\n'
               '[2022-09-20 13:03:37,131] INFO     {datahub.entrypoints:191} - DataHub CLI version: 0.8.42 at '
               '/tmp/datahub/ingest/venv-mysql-0.8.42/lib/python3.9/site-packages/datahub/__init__.py\n',
               "2022-09-20 13:03:37.367319 [exec_id=1da5a23b-91f5-4f60-a723-2daf6003cb7b] INFO: Failed to execute 'datahub ingest'",
               '2022-09-20 13:03:37.367441 [exec_id=1da5a23b-91f5-4f60-a723-2daf6003cb7b] INFO: Caught exception EXECUTING '
               'task_id=1da5a23b-91f5-4f60-a723-2daf6003cb7b, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.9/site-packages/acryl/executor/execution/default_executor.py", line 123, in execute_task\n'
               '    task_event_loop.run_until_complete(task_future)\n'
               '  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete\n'
               '    return future.result()\n'
               '  File "/usr/local/lib/python3.9/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 168, in execute\n'
               '    raise TaskError("Failed to execute \'datahub ingest\'")\n'
               "acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"]}
    Execution finished with errors.
  • f

    fast-ice-59096

    09/20/2022, 1:10 PM
    Does anyone have any idea what is happening?
    h
    • 2
    • 4
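    This looks like the same recipe-variable-expansion problem as the Postgres error further up: a value containing "$...password..." is being interpreted by the recipe loader as a ${VAR:...} expression. A minimal MySQL recipe sketch that sidesteps it by referencing a secret or environment variable; MYSQL_PASSWORD, the host, and the user are hypothetical placeholders.
    source:
      type: mysql
      config:
        host_port: "mysql.example.com:3306"  # hypothetical host
        username: datahub                    # hypothetical user
        password: "${MYSQL_PASSWORD}"        # reference a secret; avoid literal '$' values
    sink:
      type: datahub-rest
      config:
        server: "http://datahub-gms:8080"    # hypothetical GMS address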
  • b

    brave-cat-37084

    09/21/2022, 7:15 AM
    Hey, I am trying to describe data assets directly in SSMS (MSSQL) and ingest those descriptions into DataHub. Does anyone have experience with where I need to add the description of the asset in SSMS?
    h
    • 2
    • 1
  • m

    many-keyboard-47985

    09/21/2022, 7:43 AM
    Hi! I am looking into a MySQL error in the datahub-gms component's log. I have set up the DataHub components and am using DataHub; most of the functions are working. However, datahub-gms repeatedly outputs the error log below. Can you help me with this error log?
    Copy code
    16:36:52.931 [pool-7-thread-1] ERROR c.d.authorization.DataHubAuthorizer:229 - Failed to retrieve policy urns! Skipping updating policy cache until next refresh. start: 0, count: 30
    javax.persistence.PersistenceException: Query threw SQLException:vtgate: ${my mysql address} : code = Aborted desc = transaction 1663147082760249589: ended at 2022-09-21 16:36:10.929 KST (exceeded timeout: 1m0s) (CallerID: datahub admin db) Bind values:[urn:li:dataHubPolicy:15b29d13-53ad-44d0-a006-a33a9550ee77, dataHubPolicyInfo, 0, urn:li:dataHubPolicy:15b29d13-53ad-44d0-a006-a33a9550ee77, dataHubPolicyKey, 0] Query was:select urn, aspect, version, metadata, systemMetadata, createdOn, createdBy, createdFor FROM metadata_aspect_v2 WHERE urn = ? AND aspect = ? AND version = ? UNION ALL SELECT urn, aspect, version, metadata, systemMetadata, createdOn, createdBy, createdFor FROM metadata_aspect_v2 WHERE urn = ? AND aspect = ? AND version = ?
    	at io.ebean.config.dbplatform.SqlCodeTranslator.translate(SqlCodeTranslator.java:52)
    	at io.ebean.config.dbplatform.DatabasePlatform.translate(DatabasePlatform.java:219)
    	at io.ebeaninternal.server.query.CQueryEngine.translate(CQueryEngine.java:149)
    	at io.ebeaninternal.server.query.DefaultOrmQueryEngine.translate(DefaultOrmQueryEngine.java:43)
    	at io.ebeaninternal.server.core.OrmQueryRequest.translate(OrmQueryRequest.java:102)
    	at io.ebeaninternal.server.query.CQuery.createPersistenceException(CQuery.java:702)
    	at io.ebeaninternal.server.query.CQueryEngine.findMany(CQueryEngine.java:411)
    	at io.ebeaninternal.server.query.DefaultOrmQueryEngine.findMany(DefaultOrmQueryEngine.java:133)
    	at io.ebeaninternal.server.core.OrmQueryRequest.findList(OrmQueryRequest.java:459)
    	at io.ebeaninternal.server.core.DefaultServer.findList(DefaultServer.java:1596)
    	at io.ebeaninternal.server.core.DefaultServer.findList(DefaultServer.java:1574)
    	at io.ebeaninternal.server.querydefn.DefaultOrmQuery.findList(DefaultOrmQuery.java:1481)
    	at com.linkedin.metadata.entity.ebean.EbeanAspectDao.batchGetUnion(EbeanAspectDao.java:359)
    	at com.linkedin.metadata.entity.ebean.EbeanAspectDao.batchGet(EbeanAspectDao.java:279)
    	at com.linkedin.metadata.entity.ebean.EbeanAspectDao.batchGet(EbeanAspectDao.java:260)
    	at com.linkedin.metadata.entity.EntityService.exists(EntityService.java:1309)
    	at com.linkedin.metadata.resources.entity.ResourceUtils.lambda$validateSearchResult$0(ResourceUtils.java:52)
    	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
    	at java.util.Iterator.forEachRemaining(Iterator.java:116)
    	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
    	at com.linkedin.metadata.resources.entity.ResourceUtils.validateSearchResult(ResourceUtils.java:53)
    	at com.linkedin.entity.client.JavaEntityClient.search(JavaEntityClient.java:297)
    	at com.datahub.authorization.PolicyFetcher.fetchPolicies(PolicyFetcher.java:50)
    	at com.datahub.authorization.PolicyFetcher.fetchPolicies(PolicyFetcher.java:42)
    	at com.datahub.authorization.DataHubAuthorizer$PolicyRefreshRunnable.run(DataHubAuthorizer.java:222)
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:748)
    Caused by: com.mysql.cj.jdbc.exceptions.MySQLQueryInterruptedException: ${my mysql address} : rpc error: code = Aborted desc = transaction 1663147082760249589: ended at 2022-09-21 16:36:10.929 KST (exceeded timeout: 1m0s) (CallerID: datahub admin db)
    	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:126)
    	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
    	at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
    	at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953)
    	at com.mysql.cj.jdbc.ClientPreparedStatement.executeQuery(ClientPreparedStatement.java:1003)
    	at io.ebean.datasource.pool.ExtendedPreparedStatement.executeQuery(ExtendedPreparedStatement.java:136)
    	at io.ebeaninternal.server.query.CQuery.prepareResultSet(CQuery.java:376)
    	at io.ebeaninternal.server.query.CQuery.prepareBindExecuteQueryWithOption(CQuery.java:324)
    	at io.ebeaninternal.server.query.CQuery.prepareBindExecuteQuery(CQuery.java:319)
    	at io.ebeaninternal.server.query.CQueryEngine.findMany(CQueryEngine.java:384)
    	... 30 common frames omitted
    m
    • 2
    • 5
  • n

    numerous-account-62719

    09/21/2022, 11:24 AM
    Hi Team, I am trying to ingest Kafka data but am getting the following error: ImportError: datahub.ingestion.source.confluent_schema_registry.ConfluentSchemaRegistry. Below is the config that I am using:
    Copy code
    source:
      type: "kafka"
      config:
        platform_instance: "amqstreams-cluster"
        connection:
          bootstrap: "*****:9095"
          schema_registry_url: "*****:8080"
    sink:
      type: datahub-rest
      config:
        server: 'http://datahub-gms.telco-dataprocessing-mvp:8080'
        # Add a secret in secrets Tab
        token: null
    h
    • 2
    • 18
  • e

    enough-monitor-24292

    09/21/2022, 1:47 PM
    Hi Team, I'm getting an error deleting a glossary term with datahub delete --urn "urn:li:glossaryTerm:Re-marketing.New" --hard. It's not deleting, and because of this issue our search is failing. Can you please help with this?
    g
    • 2
    • 1
  • a

    adamant-rain-51672

    09/21/2022, 2:30 PM
    Hey, does anyone know how I can change the password for the datahub root user? (I deployed using EKS and Helm charts.)
    g
    i
    w
    • 4
    • 12
  • m

    most-nightfall-36645

    09/21/2022, 4:03 PM
    Hi, we're experiencing intermittent issues when querying our metadata using the search bar. The logs in my browser console report two errors:
    Copy code
    Could not fetch logged in user from cache. + Unexpected token < in JSON at position 0
    Has anyone else experienced this? I checked the gms and react containers; neither is reporting any error logs. I also checked Elasticsearch, Kafka, and our MySQL database; none are under contention, and they maintain reasonable response and read/write latencies.
    • 1
    • 1
  • s

    steep-advantage-66572

    09/21/2022, 5:28 PM
    Hi, I followed the instructions at https://datahubproject.io/docs/quickstart, but when I run datahub docker quickstart, I get:
    No Datahub Neo4j volume found, starting with elasticsearch as graph service.
    To use neo4j as a graph backend, run
    datahub docker quickstart --quickstart-compose-file ./docker/quickstart/docker-compose.quickstart.yml
    from the root of the datahub repo
    Fetching docker-compose file <https://raw.githubusercontent.com/datahub-project/datahub/master/docker/quickstart/docker-compose-without-neo4j.quickstart.yml> from GitHub
    Pulling docker images...
    unknown shorthand flag: 'f' in -f
    See 'docker --help'.
    Any help appreciated :-)
    • 1
    • 1
  • e

    eager-oil-39220

    09/21/2022, 6:00 PM
    Hi Team, we just updated our DataHub to the latest version (v0.8.44). I checked the UI and everything works fine, but the tag under the user profile shows "v0.8.22". Do you have any insights on how to solve this problem? Thanks.
    b
    • 2
    • 3
  • f

    few-carpenter-93837

    09/22/2022, 6:48 AM
    Hi all, is DataHub file-based lineage supported for Vertica? I tried it out and got the following error:
    Copy code
    mapping values are not allowed here
    Not sure if the error is in my recipe and lineage YAML or it's just a missing capability.
    h
    • 2
    • 6
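    For reference on the Vertica question above: "mapping values are not allowed here" is a YAML parser error, so it usually points at indentation or a stray colon in the recipe or lineage file rather than at a missing capability. Below is a rough, illustrative sketch of a file-based lineage setup; the table names are made up, and the field layout follows the file-based lineage source docs, so it should be checked against the CLI version in use.
    # lineage.yml (illustrative only)
    version: 1
    lineage:
      - entity:
          name: public.downstream_table      # hypothetical Vertica table
          type: dataset
          env: PROD
          platform: vertica
        upstream:
          - entity:
              name: public.upstream_table    # hypothetical Vertica table
              type: dataset
              env: PROD
              platform: vertica

    # recipe.yml (illustrative only)
    source:
      type: datahub-lineage-file
      config:
        file: ./lineage.yml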
  • l

    lemon-cat-72045

    09/22/2022, 6:57 AM
    Hi team, I am seeing this error when I try to view the lineage for a Looker Explore
    h
    • 2
    • 8
  • m

    microscopic-mechanic-13766

    09/22/2022, 12:46 PM
    Hello, I am trying to integrate DataHub with Apache Ranger. I have followed the guide in the documentation (this). I have configured both Ranger and DataHub, but when I redeploy the gms container with the Ranger options enabled I get the error in the attached file. Does anyone know why this is happening?
    Datahub_Ranger_error.txt
    g
    • 2
    • 41
  • m

    microscopic-mechanic-13766

    09/22/2022, 3:44 PM
    Apart from the previous errors, how can I tell Ranger the host/port of datahub-gms? In other words, what is the name of the config property that I need to create in order to be able to create the ranger_datahub service?
    g
    • 2
    • 3
  • n

    nice-country-99675

    09/22/2022, 8:28 PM
    👋 Hello Team! Has anyone tried to use DataHub 0.8.44 with Airflow 2.4? I'm getting a conflicting dependency on MarkupSafe. DataHub is requesting 2.0.1, but Flask 2.2.2 (which is an Airflow dependency) requires >= 2.1.1, which is the latest release of MarkupSafe (from March 2022)...
    p
    • 2
    • 6
  • b

    bland-sundown-49496

    09/22/2022, 10:58 PM
    Hello, I am installing DataHub and found the below error in the gms log. Any idea how to troubleshoot this?
    Copy code
    22:35:38.719 [pool-13-thread-1] ERROR o.s.s.s.TaskUtils$LoggingErrorHandler:95 - Unexpected error occurred in scheduled task
    java.lang.RuntimeException: Search query failed:
            at com.linkedin.datahub.graphql.analytics.service.AnalyticsService.executeAndExtract(AnalyticsService.java:265)
            at com.linkedin.datahub.graphql.analytics.service.AnalyticsService.getHighlights(AnalyticsService.java:236)
            at com.linkedin.gms.factory.telemetry.DailyReport.dailyReport(DailyReport.java:76)
            at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.base/java.lang.reflect.Method.invoke(Method.java:566)
            at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
            at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
            at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
            at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
            at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
            at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
            at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
            at java.base/java.lang.Thread.run(Thread.java:829)
    Caused by: org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=index_not_found_exception, reason=no such index [datahub_usage_event]]
            at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:187)
            at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1892)
            at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1869)
            at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1626)
            at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1583)
            at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1553)
            at org.elasticsearch.client.RestHighLevelClient.search(RestHighLevelClient.java:1069)
            at com.linkedin.datahub.graphql.analytics.service.AnalyticsService.executeAndExtract(AnalyticsService.java:260)
            ... 14 common frames omitted
            Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [<http://elasticsearch:9200>], URI [/datahub_usage_event/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 404 Not Found]
    {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index [datahub_usage_event]","resource.type":"index_or_alias","resource.id":"datahub_usage_event","index_uuid":"_na_","index":"datahub_usage_event"}],"type":"index_not_found_exception","reason":"no such index [datahub_usage_event]","resource.type":"index_or_alias","resource.id":"datahub_usage_event","index_uuid":"_na_","index":"datahub_usage_event"},"status":404}
                    at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:302)
                    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:272)
                    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:246)
                    at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1613)
    m
    • 2
    • 1
  • g

    gorgeous-dinner-4055

    09/22/2022, 11:13 PM
    👋🏽 Hi All! I am seeing weird behavior with lineage, and I'm wondering if it's a bug or something I'm misunderstanding. We have a couple of datasets with multiple versions of lineage that we have ingested over time. When looking at a URN's history, we see that there are 4 versions (details in thread), and the latest version is not equal to the largest version number. Is that perhaps a bug in how we're ingesting data?
    • 1
    • 7
  • l

    lemon-cat-72045

    09/23/2022, 3:11 AM
    Hi everyone, I'm facing an issue where DataHub cannot get the access token list. Does anyone know what's causing this problem? See the screenshot attached in the thread. Thanks in advance.
  • e

    enough-monitor-24292

    09/23/2022, 5:37 AM
    Hi Team, I'm getting the following error while opening the DataHub URL: Caused by: org.pac4j.core.exception.TechnicalException: com.nimbusds.oauth2.sdk.ParseException: The scope must include an "openid" value. Can anyone please help? Thanks
    h
    • 2
    • 2
  • m

    mammoth-air-95743

    09/23/2022, 8:32 AM
    Hi everyone! I am ingesting from an S3 bucket containing JSON files, and in the ingestion task's log I get the message that it is extracting the table schema, but nothing is actually there; it doesn't infer the schema. Here's the logger output:
    Copy code
    '[2022-09-20 09:41:44,078] INFO     {datahub.ingestion.source.s3.source:519} - Extracting table schema from file: '
               '<s3://path/to/file.json>\n'
    '[2022-09-20 09:41:44,078] INFO     {datahub.ingestion.source.s3.source:527} - Creating dataset urn with name: '
               'path/to/file.json\n'
    h
    • 2
    • 3
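    For the S3 question above, a rough sketch of the kind of path_spec config the source expects is shown below, mainly to compare against the real recipe; the bucket, layout, and region are made up, and the exact key name (path_spec vs. path_specs) varies by CLI version, so treat this as an assumption rather than a fix.
    source:
      type: s3
      config:
        path_specs:                          # may be 'path_spec' on older CLI versions
          - include: "s3://my-bucket/data/{table}/*.json"   # hypothetical bucket/layout
        aws_config:
          aws_region: us-east-1              # hypothetical region
        profiling:
          enabled: false
    sink:
      type: datahub-rest
      config:
        server: "http://datahub-gms:8080"    # hypothetical GMS address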
  • m

    mammoth-air-95743

    09/23/2022, 8:32 AM
    My second question is related to ingestion breaking for some Mongo collections and JSON files. The Mongo collection causes I found were related to column values containing some sort of encoded HTML or JSON, plus one case of a really big schema. For the few JSONs that failed, I imagine it's a similar case, but I haven't debugged it properly yet. My main issue is that there's no useful output anywhere to see why it failed. Is there anywhere I can look, in some pod's logs or something? Alternatively, can I add a blacklist to the ingestion recipe so it skips some collections/files?
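    On the blacklist part of the question above: the MongoDB source accepts allow/deny patterns, so problematic collections can be skipped from the recipe. A rough sketch with a made-up connection URI and collection names; the deny entries are regexes, and exactly which qualified name they match against should be confirmed in the source docs.
    source:
      type: mongodb
      config:
        connect_uri: "mongodb://mongo.example.com:27017"   # hypothetical
        collection_pattern:
          deny:
            - ".*big_schema_collection"      # hypothetical collection to skip
            - ".*html_payload_collection"    # hypothetical collection to skip
        enableSchemaInference: true
    sink:
      type: datahub-rest
      config:
        server: "http://datahub-gms:8080"    # hypothetical GMS address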
  • n

    narrow-toothbrush-13209

    09/23/2022, 8:52 AM
    Hi, can the DataHub Provider for Airflow (datahub_provider.lineage.datahub.DatahubLineageBackend) handle connection errors? Tasks are failing if the connection is not established.
    h
    b
    • 3
    • 2
  • f

    fresh-cricket-75926

    09/23/2022, 9:46 AM
    Hi all, I am trying to fetch metadata from BigQuery and save it to a file. The issue is that I can't read entities and tables from the source; in the shell console the connection simply times out. We are using Rancher here to deploy DataHub. Any suggestion would be helpful.
    h
    • 2
    • 2
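    On the "save it to a file" part of the BigQuery question above: writing to the file sink instead of datahub-rest also helps isolate whether the timeout is on the BigQuery side or the DataHub side. A rough sketch, assuming application-default credentials are available in the environment and using a made-up project id.
    source:
      type: bigquery
      config:
        project_id: my-gcp-project           # hypothetical project
    sink:
      type: file
      config:
        filename: ./bigquery_metadata.json   # metadata is written here for inspection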
  • g

    green-hamburger-3800

    09/23/2022, 10:08 AM
    Hey folks, how are you?! I'm having an issue trying to ingest data from Trino. We're using Starburst with a Glue catalog, and when trying to ingest data using the Trino source we're encountering the following error:
    Copy code
    'message="Table \'"schema".table$properties\' not found"
    It seems that table in fact doesn't exist, and that's not the way to query for properties in this case. Any ideas?
    h
    • 2
    • 2