# troubleshoot
  • numerous-account-62719

    12/08/2022, 5:48 AM
    Hi team, I'm facing the following issue while executing the ingestion pipeline through the UI:
    ~~~~ Execution Summary ~~~~
    
    RUN_INGEST - {'errors': [],
     'exec_id': 'bcaca377-5b8d-4957-bca5-74e68bf71e3d',
     'infos': ['2022-12-08 05:46:36.817335 [exec_id=bcaca377-5b8d-4957-bca5-74e68bf71e3d] INFO: Starting execution for task with name=RUN_INGEST',
               '2022-12-08 05:46:36.921081 [exec_id=bcaca377-5b8d-4957-bca5-74e68bf71e3d] INFO: Caught exception EXECUTING '
               'task_id=bcaca377-5b8d-4957-bca5-74e68bf71e3d, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 123, in execute_task\n'
               '    task_event_loop.run_until_complete(task_future)\n'
               '  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete\n'
               '    return future.result()\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 73, in execute\n'
               '    SubProcessTaskUtil._write_recipe_to_file(exec_out_dir, file_name, recipe)\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_task_common.py", line 105, in '
               '_write_recipe_to_file\n'
               '    os.makedirs(dir_path, mode = 0o777, exist_ok = True)\n'
               '  File "/usr/local/lib/python3.10/os.py", line 215, in makedirs\n'
               '    makedirs(head, exist_ok=exist_ok)\n'
               '  File "/usr/local/lib/python3.10/os.py", line 225, in makedirs\n'
               '    mkdir(name, mode)\n'
               "PermissionError: [Errno 13] Permission denied: '/tmp/datahub/ingest'\n"]}
    Execution finished with errors.
    Can someone please help me out?
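    A hedged workaround sketch: the UI-triggered run executes inside the actions container, and the executor needs a writable staging directory at /tmp/datahub/ingest. The container name below is the quickstart default, so adjust it for your deployment:

    # create the staging dir and make it writable by the ingestion user
    docker exec -u root datahub-actions mkdir -p /tmp/datahub/ingest
    docker exec -u root datahub-actions chmod -R 777 /tmp/datahub/ingest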
  • numerous-account-62719

    12/08/2022, 8:28 AM
    Facing the following error while using the CSV emitter:
    [2022-12-08 08:17:11,051] ERROR {datahub.entrypoints:195} - Command failed: Failed to configure source (csv-enricher) due to '1 validation error for CSVEnricherConfig: should_overwrite: extra fields not permitted (type=value_error.extra)'. Run with --debug to get the full stacktrace, e.g. 'datahub --debug ingest -c csvtest.yml'.
    Below is the CSV file that I am using:
    resource,subresource,glossary_terms,tags,owners,ownership_type,description,domain
    "urn:li:dataset:(urn:li:dataPlatform:mysql,rsnac.host,PROD)",id,[urn:li:glossaryTerm:b91d74c2-e5fa-4de9-9525-42ffbc818ba8],[urn:li:tag:dataset],[urn:li:corpuser:abhinav],"TECHNICAL_OWNER","new description",[urn:li:domain:Engineering]
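    The validation error says the installed CLI's CSVEnricherConfig doesn't accept a should_overwrite field, so the recipe is passing an option this version doesn't know. Recent docs use write_semantics instead; a minimal recipe sketch (the filename is a placeholder):

    source:
      type: csv-enricher
      config:
        filename: ./csvtest.csv
        # replaces the unrecognized `should_overwrite` option
        write_semantics: PATCH  # or OVERRIDE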
  • lively-action-8308

    12/08/2022, 11:59 AM
    Hi all, I'm using DataHub with OIDC authentication (Keycloak). Does anyone have an idea how I can deny access based on a specific role assigned to the user? For example, only a user with the role datahub_admin can access DataHub.
  • adamant-van-21355

    12/08/2022, 2:15 PM
    Hi everyone 👋 are there any plans to support column-level-lineage functionality when DBT metadata is involved? We are ingesting metadata from Snowflake, DBT and Looker (latest version), and currently it is not possible to use this for merged Snowflake & DBT entities. It would be really nice to unlock this feature for cases when DBT nodes are part of the lineage (which is probably most of the cases). Thanks 🙂
  • bitter-furniture-95993

    12/08/2022, 3:35 PM
    Hello, I am having trouble with local users created from an invite. The users are created correctly and have access to DataHub. However, the Users & Groups view is empty; I don't even see the datahub user (and can't reset the standard password). I tried to reinstall everything clean but still have the same issue. Any idea where I could start to solve this?
  • dazzling-appointment-34954

    12/08/2022, 5:43 PM
    Hey guys, I have a question regarding view policies. We want a group of users to only see the assets that are assigned to a specific domain. Meaning: if they log in, they will only see the one domain and its related assets in the UI (also on the overview page, in search, etc.). Is this possible?
  • ancient-apartment-23316

    12/08/2022, 6:04 PM
    Hello, data ingestion does not work for me. I checked the datahub-actions pod logs and there is this:
    2022/12/08 14:37:50 Waiting for: <http://datahub-datahub-gms:8080/health>
    2022/12/08 14:37:50 Received 200 from <http://datahub-datahub-gms:8080/health>
    No user action configurations found. Not starting user actions.
    [2022-12-08 14:37:50,955] INFO     {datahub_actions.cli.actions:68} - DataHub Actions version: unavailable (installed editable via git)
    [2022-12-08 14:37:51,012] INFO     {datahub_actions.cli.actions:98} - Action Pipeline with name 'ingestion_executor' is now running.
    Can you please help me figure out how to fix data ingestion? I've already seen https://datahubproject.io/docs/ui-ingestion/#i-see-na-when-i-try-to-run-ingestion-what-do-i-do
  • little-breakfast-38102

    12/08/2022, 9:26 PM
    @dazzling-judge-80093 / @gray-shoe-75895, we are running into an issue where our Glossary Terms are not showing up for datasets on the “Related Entities” page, and it throws a 500. Here are the details around ingestion: 1) ingested glossary terms through YAML, 2) ingested S3 objects from the CLI, 3) used the CSV enricher to add terms to columns. The term in question has a parent glossaryNode. Cc: @billowy-book-26360
  • brief-dream-8019

    12/08/2022, 10:32 PM
    Hi, I'm logged in as the datahub root user but I don't seem to have full admin privileges. Is there any way I can give the datahub root user full admin access?
  • quick-student-61408

    12/09/2022, 1:52 PM
    Hi all, I'm trying to create some domains with GraphQL; it doesn't work, even though I get a success message. I also tried to add domains via the UI and that doesn't work either. Any ideas? PS: if I try to set a domain via GraphQL it works, but the domain is not referenced in DataHub...
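    For reference, a hedged sketch of creating a domain directly against the GraphQL endpoint and capturing the returned URN (host/port assume a quickstart GMS; <token> is a personal access token placeholder):

    curl -s -X POST http://localhost:8080/api/graphql \
      -H 'Authorization: Bearer <token>' \
      -H 'Content-Type: application/json' \
      -d '{"query": "mutation { createDomain(input: { name: \"Engineering\" }) }"}'

    If the mutation returns a URN but the domain never shows up in the UI, that points at the search index rather than the write path.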
  • gentle-portugal-21014

    12/09/2022, 4:30 PM
    Hi all, I'd like to know if/how it's possible to track changes performed in the UI (e.g. changes in descriptions, new relationships, etc.) and display those changes in the UI as well (e.g. by "turning on" display of the lastModified field defined for some aspects, in order to see who performed a certain change and when). As far as I can see, there's limited support for displaying differences in dataset structures coming from ingestion (limited in that it discovers when certain columns are newly added, but a newly added column comment is not considered a change, and the comment is displayed as if it had been there from the beginning). Any ideas? I suspect that support for displaying those changes would need to be added in the UI (i.e. the functionality is not available at the moment), but even then it's important to understand whether the history is at least already tracked in the backend, or not (and if not, how to change that)...
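    One relevant piece here is the Timeline API, which surfaces per-category change history that the backend already records for aspects. A CLI sketch (the URN is an example):

    datahub timeline \
      --urn "urn:li:dataset:(urn:li:dataPlatform:mysql,db.table,PROD)" \
      --category DOCUMENTATION \
      --start 10daysago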
  • brainy-piano-85560

    12/11/2022, 11:12 AM
    Hi all, I deployed DataHub for the first time, started with the quickstart + ingestion of sample data, and afterwards ingested some real data. Now I've tried to delete all the sample data via the CLI. Most of it worked fine, but a few entities can't be deleted (the Airflow DAG + 5 Feast feature tables). For Airflow: "Failed to execute operation: java.lang.UnsupportedOperationException: Aspect and aspect name is required for create and update operations". For Feast: a very long stacktrace with a 500 code, "java.lang.RuntimeException: Unknown aspect status for entity dataPlatform". Tried both soft & hard deletion. Thanks for the help :)
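    For reference, a hedged sketch of the delete commands that usually apply here (the URN is an example, not the sample DAG's exact URN):

    # delete one entity by URN
    datahub delete --urn "urn:li:dataFlow:(airflow,dag_abc,PROD)" --hard
    # or sweep everything belonging to one platform
    datahub delete --platform feast --hard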
  • cool-translator-98249

    12/12/2022, 12:38 AM
    Hi, I'm running an ingestion process on Snowflake with the stateful ingestion option, and it's failing with this message in the log:
    "failures": {"Stateful Ingestion": ["Fail safe mode triggered, entity '
                          'difference percent:66.66666666666667 > fail_safe_threshold:{self.stateful_ingestion_config.fail_safe_threshold}"]},
    How can I troubleshoot this?
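    The fail-safe aborts stale-entity cleanup because this run saw roughly 67% fewer entities than the previous checkpoint. If the drop is expected (for example, filters were deliberately narrowed), the threshold can be raised in the recipe; a hedged sketch:

    source:
      type: snowflake
      config:
        # ...connection settings unchanged...
        stateful_ingestion:
          enabled: true
          # max percent difference tolerated before stale-entity removal is aborted
          fail_safe_threshold: 80.0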
  • acceptable-alarm-65116

    12/12/2022, 1:28 PM
    Hey! Is there any way of controlling what a user can view in the UI? I deactivated all the default policies for "READERS", so a user with the READER role can't access anything (Datasets, Domains, etc.), but they can still see all of those listed there...
  • gentle-camera-33498

    12/12/2022, 6:41 PM
    Hello everyone, I'm having problems with BigQuery lineage metadata ingestion. Can someone help me? In the company where I work, we have naming patterns for tables and views: tables are named following the snake case pattern, and views are named following the upper camel case pattern. The problem is that the lineage builder returns the table and view names in lowercase. With that, we have a consistency problem in our lineage visualization (as we have tables with the same names as views, just in different case patterns). The possible location of the problem: https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/source/bigquery_v2/lineage.py#L437 The 'get_table' method correctly returns the tables and views involved, but with all names converted to lowercase.
  • bland-lighter-26751

    12/12/2022, 7:29 PM
    Hello, can anyone help out with some Metabase lineage weirdness? In the screenshot below, the left-hand side is the actual table in DataHub that the BigQuery connector generated and manages. On the right is the lineage from the Metabase dashboard. Instead of linking to the BQ table, it's creating its own web_profile_clicks that doesn't actually exist anywhere, under 01_transform instead of 01_TRANSFORM. Any ideas?
  • melodic-telephone-26568

    12/13/2022, 7:57 AM
    Hello, after installing the latest version of DataHub and trying to ingest data from an Oracle DB, I realized that I don't have read access to the metadata tables (dba_*). After searching this forum, I found out that it used to be possible to ingest data without access to dba tables up until version 0.8.43, so I am now trying to run that version. I nuked the existing DataHub containers, created a new Python virtual environment in which I installed datahub 0.8.43, and ran the following commands:
    datahub docker nuke
    python -m pip install --upgrade pip wheel setuptools
    python -m pip install acryl-datahub==0.8.43
    datahub version
    DataHub CLI version: 0.8.43 Python version: 3.7.15 (default, Nov 24 2022, 18:44:54) [MSC v.1916 64 bit (AMD64)]
    datahub docker quickstart --version v0.8.43 --mysql-port 53306
    But it fails to start after a few minutes, telling me that datahub-gms is not running. The logs for the datahub-gms container are as follows. I don't know what the problem is (the latest version worked without any problem).
    2022-12-13 16:52:58 2022/12/13 07:52:58 Waiting for: <http://elasticsearch:9200>
    2022-12-13 16:52:58 2022/12/13 07:52:58 Waiting for: <tcp://mysql:3306>
    2022-12-13 16:52:58 2022/12/13 07:52:58 Waiting for: <tcp://broker:29092>
    2022-12-13 16:52:58 2022/12/13 07:52:58 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
    2022-12-13 16:52:58 2022/12/13 07:52:58 Connected to <tcp://mysql:3306>
    2022-12-13 16:52:58 2022/12/13 07:52:58 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
    2022-12-13 16:52:59 2022/12/13 07:52:59 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
    2022-12-13 16:52:59 2022/12/13 07:52:59 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:00 2022/12/13 07:53:00 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:00 2022/12/13 07:53:00 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:01 2022/12/13 07:53:01 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:01 2022/12/13 07:53:01 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:02 2022/12/13 07:53:02 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:02 2022/12/13 07:53:02 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:03 2022/12/13 07:53:03 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:03 2022/12/13 07:53:03 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:04 2022/12/13 07:53:04 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:04 2022/12/13 07:53:04 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:05 2022/12/13 07:53:05 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:05 2022/12/13 07:53:05 Connected to <tcp://broker:29092>
    2022-12-13 16:53:06 2022/12/13 07:53:06 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
    2022-12-13 16:53:07 2022/12/13 07:53:07 Received 200 from <http://elasticsearch:9200>
    2022-12-13 16:53:08 2022-12-13 07:53:08.265:INFO::main: Logging initialized @277ms to org.eclipse.jetty.util.log.StdErrLog
    2022-12-13 16:53:08 WARNING: jetty-runner is deprecated.
    2022-12-13 16:53:08          See Jetty Documentation for startup options
    2022-12-13 16:53:08          <https://www.eclipse.org/jetty/documentation/>
    2022-12-13 16:53:08 ERROR: No such classes directory file:///etc/datahub/plugins/auth/resources
    2022-12-13 16:53:08 Usage: java [-Djetty.home=dir] -jar jetty-runner.jar [--help|--version] [ server opts] [[ context opts] context ...]
    2022-12-13 16:53:08 Server opts:
    2022-12-13 16:53:08  --version                           - display version and exit
    2022-12-13 16:53:08  --log file                          - request log filename (with optional 'yyyy_mm_dd' wildcard
    2022-12-13 16:53:08  --out file                          - info/warn/debug log filename (with optional 'yyyy_mm_dd' wildcard
    2022-12-13 16:53:08  --host name|ip                      - interface to listen on (default is all interfaces)
    2022-12-13 16:53:08  --port n                            - port to listen on (default 8080)
    2022-12-13 16:53:08  --stop-port n                       - port to listen for stop command (or -DSTOP.PORT=n)
    2022-12-13 16:53:08  --stop-key n                        - security string for stop command (required if --stop-port is present) (or -DSTOP.KEY=n)
    2022-12-13 16:53:08  [--jar file]*n                      - each tuple specifies an extra jar to be added to the classloader
    2022-12-13 16:53:08  [--lib dir]*n                       - each tuple specifies an extra directory of jars to be added to the classloader
    2022-12-13 16:53:08  [--classes dir]*n                   - each tuple specifies an extra directory of classes to be added to the classloader
    2022-12-13 16:53:08  --stats [unsecure|realm.properties] - enable stats gathering servlet context
    2022-12-13 16:53:08  [--config file]*n                   - each tuple specifies the name of a jetty xml config file to apply (in the order defined)
    2022-12-13 16:53:08 Context opts:
    2022-12-13 16:53:08  [[--path /path] context]*n          - WAR file, web app dir or context xml file, optionally with a context path
    2022-12-13 16:53:08 2022/12/13 07:53:08 Command exited with error: exit status 1
    2022-12-13 16:53:27 + echo
    2022-12-13 16:53:27 + grep -q ://
    2022-12-13 16:53:27 + NEO4J_HOST=http://
    2022-12-13 16:53:27 + [[ ! -z '' ]]
    2022-12-13 16:53:27 + [[ -z '' ]]
    2022-12-13 16:53:27 + ELASTICSEARCH_AUTH_HEADER='Accept: */*'
    2022-12-13 16:53:27 + [[ '' == true ]]
    2022-12-13 16:53:27 + ELASTICSEARCH_PROTOCOL=http
    2022-12-13 16:53:27 + WAIT_FOR_EBEAN=
    2022-12-13 16:53:27 + [[ '' != true ]]
    2022-12-13 16:53:27 + [[ '' == ebean ]]
    2022-12-13 16:53:27 + [[ -z '' ]]
    2022-12-13 16:53:27 + WAIT_FOR_EBEAN=' -wait <tcp://mysql:3306> '
    2022-12-13 16:53:27 + WAIT_FOR_CASSANDRA=
    2022-12-13 16:53:27 + [[ '' == cassandra ]]
    2022-12-13 16:53:27 + WAIT_FOR_KAFKA=
    2022-12-13 16:53:27 + [[ '' != true ]]
    2022-12-13 16:53:27 ++ echo broker:29092
    2022-12-13 16:53:27 ++ sed 's/,/ -wait tcp:\/\//g'
    2022-12-13 16:53:27 + WAIT_FOR_KAFKA=' -wait <tcp://broker:29092> '
    2022-12-13 16:53:27 + WAIT_FOR_NEO4J=
    2022-12-13 16:53:27 + [[ elasticsearch != elasticsearch ]]
    2022-12-13 16:53:27 + OTEL_AGENT=
    2022-12-13 16:53:27 + [[ '' == true ]]
    2022-12-13 16:53:27 + PROMETHEUS_AGENT=
    2022-12-13 16:53:27 + [[ '' == true ]]
    2022-12-13 16:53:27 + auth_resource_dir=/etc/datahub/plugins/auth/resources
    2022-12-13 16:53:27 + COMMON='
    2022-12-13 16:53:27      -wait <tcp://mysql:3306>            -wait <tcp://broker:29092>           -timeout 240s     java -Xms1g -Xmx1g                -jar /jetty-runner.jar     --jar jetty-util.jar     --jar jetty-jmx.jar --classes /etc/datahub/plugins/auth/resources     --config /datahub/datahub-gms/scripts/jetty.xml     /datahub/datahub-gms/bin/war.war'
    2022-12-13 16:53:27 + [[ '' != true ]]
    2022-12-13 16:53:27 + exec dockerize -wait <http://elasticsearch:9200> -wait-http-header 'Accept: */*' -wait <tcp://mysql:3306> -wait <tcp://broker:29092> -timeout 240s java -Xms1g -Xmx1g -jar /jetty-runner.jar --jar jetty-util.jar --jar jetty-jmx.jar --classes /etc/datahub/plugins/auth/resources --config /datahub/datahub-gms/scripts/jetty.xml /datahub/datahub-gms/bin/war.war
    2022-12-13 16:53:27 2022/12/13 07:53:27 Waiting for: <http://elasticsearch:9200>
    2022-12-13 16:53:27 2022/12/13 07:53:27 Waiting for: <tcp://mysql:3306>
    2022-12-13 16:53:27 2022/12/13 07:53:27 Waiting for: <tcp://broker:29092>
    2022-12-13 16:53:27 2022/12/13 07:53:27 Connected to <tcp://mysql:3306>
    2022-12-13 16:53:27 2022/12/13 07:53:27 Connected to <tcp://broker:29092>
    2022-12-13 16:53:27 2022/12/13 07:53:27 Received 200 from <http://elasticsearch:9200>
    2022-12-13 16:53:27 2022-12-13 07:53:27.370:INFO::main: Logging initialized @224ms to org.eclipse.jetty.util.log.StdErrLog
    2022-12-13 16:53:27 WARNING: jetty-runner is deprecated.
    2022-12-13 16:53:27          See Jetty Documentation for startup options
    2022-12-13 16:53:27          <https://www.eclipse.org/jetty/documentation/>
    2022-12-13 16:53:27 ERROR: No such classes directory file:///etc/datahub/plugins/auth/resources
    2022-12-13 16:53:27 Usage: java [-Djetty.home=dir] -jar jetty-runner.jar [--help|--version] [ server opts] [[ context opts] context ...]
    2022-12-13 16:53:27 Server opts:
    2022-12-13 16:53:27  --version                           - display version and exit
    2022-12-13 16:53:27  --log file                          - request log filename (with optional 'yyyy_mm_dd' wildcard
    2022-12-13 16:53:27  --out file                          - info/warn/debug log filename (with optional 'yyyy_mm_dd' wildcard
    2022-12-13 16:53:27  --host name|ip                      - interface to listen on (default is all interfaces)
    2022-12-13 16:53:27  --port n                            - port to listen on (default 8080)
    2022-12-13 16:53:27  --stop-port n                       - port to listen for stop command (or -DSTOP.PORT=n)
    2022-12-13 16:53:27  --stop-key n                        - security string for stop command (required if --stop-port is present) (or -DSTOP.KEY=n)
    2022-12-13 16:53:27  [--jar file]*n                      - each tuple specifies an extra jar to be added to the classloader
    2022-12-13 16:53:27  [--lib dir]*n                       - each tuple specifies an extra directory of jars to be added to the classloader
    2022-12-13 16:53:27  [--classes dir]*n                   - each tuple specifies an extra directory of classes to be added to the classloader
    2022-12-13 16:53:27  --stats [unsecure|realm.properties] - enable stats gathering servlet context
    2022-12-13 16:53:27  [--config file]*n                   - each tuple specifies the name of a jetty xml config file to apply (in the order defined)
    2022-12-13 16:53:27 Context opts:
    2022-12-13 16:53:27  [[--path /path] context]*n          - WAR file, web app dir or context xml file, optionally with a context path
    2022-12-13 16:53:27 2022/12/13 07:53:27 Command exited with error: exit status 1
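    The fatal line is "ERROR: No such classes directory file:///etc/datahub/plugins/auth/resources": jetty-runner is started with --classes pointing at the plugins mount and exits when that directory is missing. Since the quickstart bind-mounts ~/.datahub/plugins into the GMS container, a hedged fix sketch is to create the expected path on the host and retry:

    mkdir -p ~/.datahub/plugins/auth/resources
    datahub docker quickstart --version v0.8.43 --mysql-port 53306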
  • melodic-dress-7431

    12/13/2022, 12:29 PM
    Hello, trying to build locally and hitting the following issue:
  • melodic-dress-7431

    12/13/2022, 12:29 PM
    error: error reading /opt/datahub/metadata-auth/auth-api/build/libs/auth-api-0.9.4-SNAPSHOT.jar; zip file is empty
  • melodic-dress-7431

    12/13/2022, 12:29 PM
    any suggestions?
  • melodic-dress-7431

    12/13/2022, 12:31 PM
    using gradle-6.9.2 and openjdk-11.0.17.0.8-2.el7_9
  • melodic-dress-7431

    12/13/2022, 12:33 PM
    I cloned the repo a few hours back.
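    An empty jar like this is often a stale or interrupted build artifact; a reasonable first step (a sketch, not a guaranteed fix) is to clean and rebuild just that module:

    ./gradlew :metadata-auth:auth-api:clean :metadata-auth:auth-api:build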
  • acoustic-rose-68681

    12/13/2022, 2:03 PM
    Hello, I tried adding a new custom platform following these instructions: https://datahubproject.io/docs/how/add-custom-data-platform/ Then I loaded some dataflow, datajob and lineage information associated with this platform. On the home page I cannot see the new platform in the platform list. Is this normal?
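    For reference, the linked guide registers a custom platform roughly like this (all values are placeholders from the docs, not the actual platform in question):

    datahub put platform \
      --name MyCustomDataPlatform \
      --display_name "My Custom Data Platform" \
      --logo "https://<your-logo-url>"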
  • microscopic-mechanic-13766

    12/13/2022, 3:04 PM
    Hello, so I am trying to ingest from Hive using some transformers (in order to save myself the time of making the respective additions by hand), and I think one of the transformers is interfering with correct ingestion of the metadata. As you can see in the pictures, the ingested datasets are named like <db>.<table> when they should be <table>. Furthermore, no additional metadata has been ingested for each dataset (like the info presented in Properties). The transformers used are the following:
    transformers:
      - type: "pattern_add_dataset_tags"
        config:
          tag_pattern:
            rules:
              '.*030902.*': ['urn:li:tag:030902']
              '.*050501.*': ['urn:li:tag:050501']
      - type: "pattern_add_dataset_domain"
        config:
          domain_pattern:
            rules:
              '.*libros.*': ['urn:li:domain:c4c94633-96cf-4a93-baa7-15562905f8f0']
              '.*050501.*': ['urn:li:domain:97faf5f3-4494-4620-abcf-a6a9eeea9fbe']
    I am using 0.9.0 for both GMS and frontend, and 0.0.8 for actions. Both the tags and domains do exist, and the ingestion executed successfully. I think the problem could be in pattern_add_dataset_tags (although I am not really sure), as I have used the latter transformer (pattern_add_dataset_domain) on previous occasions and didn't have any problem with it. Thanks in advance!
  • purple-printer-15193

    12/13/2022, 4:55 PM
    Hello, we have some column-level lineages that aren't being linked up between Snowflake and Looker (see screenshot). There are some column-level lineages between Snowflake and Looker that do link up, but we've observed that it only happens when the degree of dependency is one. What's the best way to debug this?
  • ancient-library-85500

    12/13/2022, 9:16 PM
    Hi everyone! I'm running into this error when I run `gradle build`:
    * What went wrong:
    Execution failed for task ':li-utils:compileMainGeneratedDataTemplateJava'.
    > Could not find tools.jar. Please check that /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.352.b08-2.el7_9.x86_64/jre contains a valid JDK installation.
    Running `java -version` outputs this:
    openjdk version "11.0.17" 2022-10-18 LTS
    OpenJDK Runtime Environment (Red_Hat-11.0.17.0.8-2.el7_9) (build 11.0.17+8-LTS)
    OpenJDK 64-Bit Server VM (Red_Hat-11.0.17.0.8-2.el7_9) (build 11.0.17+8-LTS, mixed mode, sharing)
    My JAVA_HOME variable is set to the Java 11 location. It seems that when I run the build, Gradle is picking up a different, older version of Java that I had been using previously.
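    It's worth checking which JVM Gradle itself resolved, since that can differ from what `java -version` reports, and then pinning it explicitly. A sketch (the JDK path is an example; adjust to your install):

    # show the JVM Gradle actually uses
    ./gradlew -version

    # pin Gradle to a specific JDK via gradle.properties
    echo "org.gradle.java.home=/usr/lib/jvm/java-11-openjdk" >> gradle.properties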
  • lemon-lock-92370

    12/14/2022, 3:50 AM
    Hi community! Thanks for this amazing platform! I understand that rebuilding and updating the docker containers for the backend (gms) and frontend is done as below.
    # Backend (gms)
    ./gradlew :metadata-service:war:build
    (cd docker && COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -p datahub -f docker-compose-without-neo4j.yml -f docker-compose-without-neo4j.override.yml -f docker-compose.dev.yml up -d --no-deps --force-recreate datahub-gms)
    
    # Frontend
    ./gradlew :datahub-frontend:dist -x yarnTest -x yarnLint
    (cd docker && COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -p datahub -f docker-compose-without-neo4j.yml -f docker-compose-without-neo4j.override.yml -f docker-compose.dev.yml up -d --no-deps --force-recreate datahub-frontend-react)
    Then how can I do the same for metadata-ingestion? 😮 I modified some code in the metadata-ingestion/src/datahub/ingestion/source/aws/glue.py file, and I want to build it and get it updated in my docker deployment so the code modification is applied. I tried to build it as below.
    ./gradlew :metadata-ingestion:build
    How can I get this modification into the existing docker container? Please help 🙏 Thank you 🙇
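    A hedged sketch, with the assumptions spelled out: UI-triggered ingestion runs inside the datahub-actions container, and this assumes your modified checkout is visible there (e.g. bind-mounted at /datahub-src, which is not part of the stock compose files):

    # install the locally modified package over the bundled one (paths are assumptions)
    docker exec datahub-actions pip install -e /datahub-src/metadata-ingestion
    docker restart datahub-actions

    If you instead run ingestion from your own CLI environment, a plain `pip install -e metadata-ingestion` in that virtualenv is enough to pick up the glue.py change.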
  • astonishing-cartoon-6079

    12/14/2022, 8:32 AM
    #troubleshoot Hi everyone, I'm trying to add a new Aspect to the Dataset entity in metadata-models-custom. The Aspect is:
    namespace com.mycompany.dq
    
    /**
     * Details about dataset Storage.
     */
    @Aspect = {
      "name": "storage",
      "autoRender": true,
      "renderSpec": {
        "displayType": "properties",
        "displayName": "Storage Info"
      }
    }
    record Storage {
      format: optional string
      compression: optional string
      sizeInBytes: optional long
      fileNum: optional long
    }
    I can insert the storage aspect successfully, but there is no Storage Info tab on the web page. Does anybody know how to solve this problem?
  • brainy-piano-85560

    12/14/2022, 8:46 AM
    Hi guys, I tried to ingest some metadata from postgres (+ table profiling, not column). The db has 7 schemas and 72 tables overall. After running for ~28h it failed on 'no disk space'. Can I estimate how much disk space an ingestion should take, and how much of it is the profiling part? I know we don't have exact numbers, but approximations will be fine so I'll be able to continue. Thank you 🙂
  • strong-kite-83354

    12/14/2022, 2:06 PM
    Hi @gentle-camera-33498, did you resolve this issue? I have a similar problem: I can get Assertions to appear in the Validations tab, but if I delete the parent dataset and try to add the Assertions again, I can't see them in the Validations tab.