# ingestion

    modern-laptop-12942

    06/03/2022, 1:39 PM
Hi guys. I can ingest metadata from Snowflake, but the Stats and Lineage tabs are disabled. Does anyone have an idea why?
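For reference, stats and lineage are opt-in in the Snowflake source; a minimal recipe sketch, assuming the config field names from the Snowflake source docs of that era (account, warehouse, and credentials are placeholders):

```yaml
source:
  type: snowflake
  config:
    host_port: my_account.snowflakecomputing.com  # placeholder account
    warehouse: COMPUTE_WH
    username: ${SNOWFLAKE_USER}
    password: ${SNOWFLAKE_PASS}
    include_table_lineage: true    # lineage; requires access to Snowflake's account usage views
    profiling:
      enabled: true                # populates the Stats tab
sink:
  type: datahub-rest
  config:
    server: http://localhost:8080
```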

    busy-furniture-10879

    06/03/2022, 4:42 PM
I'm running into a situation where DBT isn't picking up the full lineage for situations like:
{Snowflake Table A} -> {DBT Job B} -> {Snowflake Table C}
It picks up B->C but not A->B. I saw in the docs that this was resolved in 0.8.16.2, and I'm on 0.8.34.1. Any ideas as to what I'm doing wrong?

    high-ice-84066

    06/03/2022, 5:31 PM
Hey all! I'm working on ingesting stats data from Postgres. I've seen some historic examples of people having a similar issue. I can see the stats being sent when running the datahub CLI in debug mode; the dataset is created successfully, but in the UI the Stats tab remains greyed out. Any advice on where to look next, or how to resolve this issue? cc: @User. Will share recipe/logs/results. datahub: v0.8.36
    datahub --debug ingest -c recipe-postgres.yaml

    sparse-raincoat-42898

    06/06/2022, 4:53 AM
Hi team, I am trying to connect Airflow to the DataHub metadata service (GMS) but am getting the error below. I thought metadata service auth was disabled by default. Am I missing anything? How do I debug/resolve this error? Thanks.
***.configuration.common.OperationalError: ('Unable to emit metadata to DataHub GMS', {'message': '401 Client Error: Unauthorized for url: http://test/api/gms/aspects?action=ingestProposal'})
    [2022-06-06, 04:32:07 UTC] {local_task_job.py:273} INFO - 0 downstream tasks scheduled from follow-on schedule check
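If metadata service authentication is in fact enabled on GMS, a 401 here usually means the Airflow connection carries no token. A sketch of (re)creating the connection with a personal access token in the password field, per the DataHub Airflow integration docs (the connection id and host below are assumptions):

```shell
airflow connections add 'datahub_rest_default' \
    --conn-type 'datahub_rest' \
    --conn-host 'http://test/api/gms' \
    --conn-password '<your-datahub-access-token>'
```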

    stocky-midnight-78204

    06/06/2022, 7:29 AM
How do I ingest views from Vertica? I got the error below:

    stocky-midnight-78204

    06/06/2022, 7:29 AM
[2022-06-06 15:24:07,976] WARNING {datahub.ingestion.source.sql.sql_common:1141} - Unable to ingest view xxxxxx due to an exception.
Traceback (most recent call last):
  File "/usr/local/python3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/python3/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/python3/lib/python3.8/site-packages/vertica_python/vertica/cursor.py", line 235, in execute
    self._execute_simple_query(operation)
  File "/usr/local/python3/lib/python3.8/site-packages/vertica_python/vertica/cursor.py", line 632, in _execute_simple_query
    raise errors.QueryError.from_error_response(self._message, query)
vertica_python.errors.MissingRelation: Severity: ERROR, Message: Relation "pg_class" does not exist

    cold-traffic-81229

    06/06/2022, 8:31 AM
Hi team, how do I change the storage backend to Oracle?

    quick-pizza-8906

    06/06/2022, 2:08 PM
Hello, I have a question regarding the DBT connector. I see a platform_instance parameter was added recently, with 2 changes by @green-football-43791: https://github.com/datahub-project/datahub/pull/4926 https://github.com/datahub-project/datahub/pull/5028. I set the platform_instance parameter and then ran the DBT connector, first with the 5028 commit and then with that commit reverted - both ended up producing target datasets without a platform instance. How is this feature supposed to work?

    dry-zoo-35797

    06/06/2022, 6:53 PM
Hello, has anyone tried to connect to Netezza using any of the out-of-the-box plugins? I would appreciate it if you could share your experience. Thanks, Mahbub

    stocky-midnight-78204

    06/07/2022, 3:02 AM
Has anyone faced this issue: The Cluster ID CfdFs45iRnyj2CuDewpzOQ doesn't match the stored clusterId Some(oJr3IMIuRbmqznfRZFPyBg) in meta.properties. The broker is trying to join the wrong cluster. The configured zookeeper.connect may be wrong.

    few-address-59566

    06/07/2022, 4:06 AM
    Hi, does anyone know how to generate the Golang REST client from the Datahub OpenAPI spec by any chance?
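One route (an assumption on my part, not an officially documented flow) is to export the spec from a running GMS instance and feed it to OpenAPI Generator's go generator; the spec URL/path below is a guess at your deployment's layout:

```shell
# Fetch the OpenAPI spec from GMS (URL is an assumption; check your instance's /openapi docs)
curl -o datahub-openapi.json http://localhost:8080/openapi/v3/api-docs

# Generate a Go client package from the spec
openapi-generator-cli generate \
    -i datahub-openapi.json \
    -g go \
    -o ./datahub-go-client \
    --package-name datahubclient
```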

    cuddly-arm-8412

    06/07/2022, 6:00 AM
Hi team, when I run
cd metadata-ingestion
../gradlew :metadata-ingestion:installDev
it takes a long time and always prompts: "INFO: pip is looking at multiple versions of google-cloud-core to determine which version is compatible with other requirements. This could take a while." Is there any way to avoid this?

    polite-application-51650

    06/07/2022, 7:56 AM
Hi team, can anyone tell me how to add fields to a dataset/table created using DataHub's Java emitter?

    few-grass-66826

    06/07/2022, 11:40 AM
Hi guys, how do I use UI-created secrets from CLI ingestion? It throws an error.
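UI-created secrets are resolved by UI (managed) ingestion only; when running the same recipe from the CLI, a common workaround is plain environment-variable interpolation in the recipe, which the CLI expands from the shell environment. A sketch, with placeholder variable names:

```yaml
source:
  type: postgres
  config:
    host_port: localhost:5432
    database: mydb
    username: ${POSTGRES_USER}      # read from the shell environment by the CLI
    password: ${POSTGRES_PASSWORD}  # export this before running `datahub ingest`
```

e.g. `export POSTGRES_PASSWORD=...` before `datahub ingest -c recipe.yaml`.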

    brave-pencil-21289

    06/07/2022, 12:02 PM
We are facing an issue while running pip install for the Hive plugin. Can someone help with this?

    brave-pencil-21289

    06/07/2022, 12:35 PM
Do we have any recipe for Sybase or Sybase IQ ingestion?

    brave-pencil-21289

    06/07/2022, 12:55 PM
Do we have a Netezza ingestion plugin?

    few-air-56117

    06/07/2022, 1:12 PM
Hi folks, how can I delete all metadata? I tried deleting the data from MySQL, but I think it is cached in Elasticsearch. Thx 😄
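For anyone landing here: deleting rows from MySQL alone leaves the search index stale. The DataHub CLI has delete commands that also clean up Elasticsearch, and the docker quickstart has a full wipe. A sketch (the entity/env filters are examples):

```shell
# Hard-delete all datasets in one environment (example filter)
datahub delete --entity_type dataset --env PROD --hard

# Or, for the docker quickstart only: wipe everything, including volumes
datahub docker nuke
```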

    echoing-kangaroo-49844

    06/07/2022, 3:00 PM
Hello, I have a Python file that runs an Airflow DAG called mysql_sample_dag; I am doing data ingestion through Airflow. But it ingests all the databases on the server, and I only want the one specified in "database": "database".

    high-ice-84066

    06/07/2022, 5:47 PM
Does anyone know if it's possible to pass HTTP query parameters or HTTP headers into an OpenAPI ingestion recipe - perhaps in forced_examples or elsewhere? It doesn't look like it from the code. e.g. GET http://host/api/products?afterEpochMs=1654610458&beforeEpochMs=1654613465

    delightful-beard-43126

    06/07/2022, 9:30 PM
Hey, how can I connect to a Kafka Connect cluster over SSL? I've been searching through both the website and the GitHub documentation but didn't find any mention of it.

    hallowed-machine-2603

    06/08/2022, 12:06 AM
Hi guys, I succeeded in connecting a database to DataHub and ingesting. But I want to see dataset detail, even if it's only part of the dataset. How can I find detailed data or a data summary in DataHub?

    loud-rose-15723

    06/08/2022, 12:31 AM
Hello everyone! I am studying DataHub, and PULL ingestion is clear to me. However, I am struggling to understand how to implement PUSH-based ingestion by generating Metadata Change Events. I was wondering if I can implement an integration with GitHub Actions to automatically create a metadata change event based on a metadata YAML file. Do you know if we have any material about this? Is this possible to implement?
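A push can be as simple as POSTing a MetadataChangeProposal to GMS, e.g. from a GitHub Actions step that reads your YAML file. A minimal stdlib-only sketch of assembling such a proposal - the payload shape mirrors the `/aspects?action=ingestProposal` endpoint, but treat the field names here as assumptions and prefer the official Python emitter (`datahub.emitter.rest_emitter`) in practice:

```python
import json

def make_dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a dataset URN in DataHub's standard URN layout."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"

def make_ingest_proposal(urn: str, description: str) -> dict:
    """Assemble an ingestProposal body for a datasetProperties aspect
    (minimal sketch; see the GMS OpenAPI docs for the full schema)."""
    return {
        "proposal": {
            "entityType": "dataset",
            "entityUrn": urn,
            "changeType": "UPSERT",
            "aspectName": "datasetProperties",
            "aspect": {
                "contentType": "application/json",
                "value": json.dumps({"description": description}),
            },
        }
    }

urn = make_dataset_urn("postgres", "mydb.public.orders")
payload = make_ingest_proposal(urn, "Orders fact table, pushed from CI")
# POST this as JSON to http://<gms-host>:8080/aspects?action=ingestProposal
print(payload["proposal"]["entityUrn"])
```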

    rhythmic-flag-69887

    06/08/2022, 3:08 AM
Hello, I'm trying to learn how to ingest DBT into our DataHub. I saw the documentation for the DBT source, but I don't understand it. I'm looking at the quickstart recipe, and it says I should give it the paths to certain JSON files, but this doesn't really help. I found the manifest file in the tests folder, but the other .json files don't exist. Also, how do I connect the git repo to DataHub?
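For context, the dbt source reads the JSON artifacts that dbt itself writes into its `target/` directory after `dbt run` and `dbt docs generate` - not files shipped with DataHub. A recipe sketch, with placeholder paths and platform:

```yaml
source:
  type: dbt
  config:
    manifest_path: ./target/manifest.json  # written by every dbt run
    catalog_path: ./target/catalog.json    # written by `dbt docs generate`
    target_platform: snowflake             # the platform the dbt models run against
sink:
  type: datahub-rest
  config:
    server: http://localhost:8080
```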

    bright-cpu-56427

    06/08/2022, 3:19 AM
Hi guys, I am testing with datahub docker quickstart (I am using an AWS EC2 instance). I am trying to ingest a Glue dataset, but I get a connection refused error. What should I check?

    cuddly-arm-8412

    06/08/2022, 3:21 AM
Hi team, when I open the project it prompts: Could not find netty-transport-native-epoll-4.1.45.Final-linux-x86_64.jar (io.netty:netty-transport-native-epoll:4.1.45.Final). Searched in the following locations: file:maven/io/netty/netty-transport-native-epoll/4.1.45.Final/netty-transport-native-epoll-4.1.45.Final-linux-x86_64.jar. Possible solution: declare a repository providing the artifact; see the documentation at https://docs.gradle.org/current/userguide/declaring_repositories.html

    few-air-56117

    06/08/2022, 6:49 AM
Hi folks, does anyone know what those container URNs are and how I can delete them? Thx 😄

    few-air-56117

    06/08/2022, 9:16 AM
Hi folks, I deleted the metadata using the CLI with delete --hard, but the dataset remains in history/search. Is there a way to remove it? Thx 😄

    brave-pencil-21289

    06/08/2022, 9:37 AM
While ingesting Netezza into DataHub we are getting the failure below. Can someone help with this?

    brave-pencil-21289

    06/08/2022, 1:31 PM
While performing Sybase IQ ingestion I am facing the error below. Can someone help me with this?