# ingestion
  • w

    witty-butcher-82399

    03/20/2023, 12:00 PM
    Has anyone tested the DBT ingestor with version 1.4? One of our users is planning an upgrade from 1.2 to 1.4 and has raised the following concern because of some breaking changes:
    We are planning to upgrade DBT in our project from 1.2 to 1.4, and this comes with some breaking changes. I would like to check with you whether this upgrade would have any effect on DataHub.
    We are concerned that the changes below could pose a risk if your integration is not ready to handle them:
    ◦ Renamed raw_sql to raw_code
    ◦ Renamed compiled_sql to compiled_code
    ◦ Renamed macro dbt_utils.surrogate_key to dbt_utils.generate_surrogate_key
    Thanks
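A minimal sketch of how a consumer of dbt artifacts could tolerate both naming schemes from the list above. This is only an illustration, not DataHub's actual dbt source code; the function name `get_node_code` and the sample nodes are hypothetical.

```python
from typing import Any, Dict, Optional


def get_node_code(node: Dict[str, Any], compiled: bool = False) -> Optional[str]:
    """Return a dbt node's code, accepting both the old and the renamed keys."""
    # Renamed keys: raw_sql -> raw_code, compiled_sql -> compiled_code.
    new_key, old_key = (
        ("compiled_code", "compiled_sql") if compiled else ("raw_code", "raw_sql")
    )
    if new_key in node:
        return node[new_key]
    # Fall back to the pre-rename key; returns None if neither is present.
    return node.get(old_key)


# Hypothetical manifest nodes, one per naming scheme.
old_node = {"unique_id": "model.demo.orders", "raw_sql": "select 1", "compiled_sql": "select 1"}
new_node = {"unique_id": "model.demo.orders", "raw_code": "select 1", "compiled_code": "select 1"}

print(get_node_code(old_node), get_node_code(new_node, compiled=True))
```

Preferring the new key and falling back to the old one means the same reader works before and after the upgrade.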
    ✅ 1
  • s

    some-car-9623

    03/20/2023, 5:35 PM
    Hello everyone, I am working on ingestion of dashboards and charts in MDH, and I am able to ingest them successfully. My question: our dashboards have many tabs. How can we show those tabs in the Dashboard entity? Is there any way to represent the tabs under a dashboard? TIA. Thanks
    ✅ 1
  • s

    some-car-9623

    03/20/2023, 5:35 PM
    Geetha
  • a

    astonishing-pager-27015

    03/21/2023, 2:41 AM
    I've run a Great Expectations checkpoint with the datahub action included, and the debug logs look successful - the Dataset URN looks correct and an assertion URN is logged as well - but the table in the UI still has a greyed out Validation tab. Any thoughts? Edit: I was sending to the wrong address, ignore me lol
    ✅ 1
  • f

    freezing-account-90733

    03/21/2023, 3:08 AM
    Hi, how can I delete a platform using curl?
  • n

    numerous-account-62719

    03/21/2023, 4:47 AM
    Hi Team, I am not able to find what the issue is in the ingestion pipeline. Can you take a look at the error below?
  • n

    numerous-account-62719

    03/21/2023, 4:47 AM
    Untitled
  • b

    bitter-evening-61050

    03/21/2023, 6:11 AM
    Hi Team, can we export the permissions from DataHub to a file, using the API or any other method?
    ✅ 1
  • r

    rich-daybreak-77194

    03/21/2023, 8:58 AM
    I am trying to post some tags to DataHub through OpenAPI for a Tableau source, but I don't know what the platformSchema value should be.
  • b

    bitter-evening-61050

    03/21/2023, 10:51 AM
    Hi Team, how do I write a GraphQL query to get the role of a specific user?
  • g

    gray-angle-76914

    03/21/2023, 11:49 AM
    Hi! How can I change, in the recipe, the spaCy model used by the classification feature from en_core_web_sm to es_core_web_sm? Thanks!
    ✅ 1
  • d

    damp-battery-32786

    03/21/2023, 1:37 PM
    Hi, I am looking to use DataHub to do Oracle ingestion. The goal is to create a service account which can read all of the dictionary tables, but not the underlying data. We seem to be hitting issues: we granted access to the dba_* tables that are used, but it looks like SQLAlchemy is then leveraging all_*. The all_* tables will only show you the rows you have read access to. Is there any way to set up the user account so that it allows ingestion but stays locked down for security?
  • f

    fancy-shoe-14428

    03/21/2023, 2:45 PM
    Hello! Has anyone successfully connected datahub to amazon msk? I am not able to retrieve anything but topic names because the schema registry there is handled via glue and the config for the kafka connection only accepts a schema registry url
  • f

    flaky-dinner-67771

    03/21/2023, 2:48 PM
    Hello, I'm trying to ingest metadata from Oracle through the UI. However, DataHub does not compute profiling statistics for numeric fields, such as min, max, etc. For Postgres, these statistics are computed. Why does DataHub not compute them for Oracle, and how can I get it to do so?
    source:
        type: oracle
        config:
            host_port: 'host:port'
            database: null
            username: username
            password: password
            service_name: service_name
            schema_pattern:
                allow:
                    - schema_name
            profiling:
                enabled: true
            stateful_ingestion:
                enabled: true
  • f

    future-student-30987

    03/21/2023, 3:57 PM
    Hi guys, how's it going? Could someone confirm if this info is up to date? "Compatibility Metabase version v0.41.2" We are trying to configure an ingestion recipe for Metabase; however, we are running version 0.45.x
    ✅ 1
  • l

    lively-waiter-97452

    03/21/2023, 4:54 PM
    Hello, I'm running the quickstart version (docker compose handling the various containers) and just playing with the tool to discover it a bit more. I added an ingestion for an Oracle database I have (limited to a single schema, all tables and views), and I struggle to understand what a container with a urn:li:container:... name is and where it comes from. I did some guessing on how to set options in the ingestion definition to get a structure that makes sense, but that one is a mystery. Thinking it was because I was using service_name instead of database, I tried setting the sqlalchemy_uri instead, but that container still has a random name and I can't understand what it is supposed to represent. Attached is a picture of my guesses at the ingestion parameters and how they translated on screen. My database is a 19c, and I'm connecting to a PDB by using the service name. I saw a similar question in this same channel 10 months ago, but it doesn't really have a solution. Is this expected behavior? I see that the code of the ingestion oracle.py was changed a month ago, so I assumed that the old reports of this issue were fixed. I just don't know this tool well enough right now to figure out what that container is meant to represent and how I can rename it in the ingestion definition. Thanks
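One possible reading of the opaque urn:li:container:... identifier: as far as I understand, DataHub derives container ids by hashing a set of identifying fields, so the id is stable across runs but not human-readable, and the name shown in the UI comes from separate properties. The sketch below only illustrates that idea. The key fields, the function name `container_urn_from_key`, and the exact hashing are assumptions, not the Oracle source's actual code.

```python
import hashlib
import json
from typing import Dict


def container_urn_from_key(key: Dict[str, str]) -> str:
    """Hash a dict of identifying fields into a container-URN-shaped string."""
    # Serialize deterministically so the same key always yields the same id.
    serialized = json.dumps(key, separators=(",", ":"), sort_keys=True)
    guid = hashlib.md5(serialized.encode("utf-8")).hexdigest()
    return f"urn:li:container:{guid}"


# Hypothetical key fields; the real ones depend on the source and the recipe.
schema_key = {"platform": "oracle", "database": "my_pdb", "schema": "hr"}
print(container_urn_from_key(schema_key))
```

Under this assumption, changing a recipe option that feeds into the key (for example the database name) would produce a different container id rather than renaming the existing one.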
  • l

    lively-dusk-19162

    03/21/2023, 8:07 PM
    Hi, can anyone help me solve the error below?
    Could not run phased build action using connection to Gradle distribution 'https://services.gradle.org/distributions/gradle-6.9.2-bin.zip'.
    Build file '/Users/c7q9ff/Downloads/datahub-master-4/metadata-service/restli-servlet-impl/build.gradle' line: 80
    A problem occurred evaluating project ':metadata-service:restli-servlet-impl'.
    Could not resolve all dependencies for configuration ':metadata-service:restli-servlet-impl:dataModel'.
    Failed to calculate the value of task ':metadata-models:compileJava' property 'javaCompiler'.
    Unable to download toolchain matching these requirements: {languageVersion=8, vendor=any, implementation=vendor-specific}
    Could not GET 'https://api.adoptopenjdk.net/v3/binary/latest/8/ga/mac/aarch64/jdk/hotspot/normal/adoptopenjdk'.
    The server may not support the client's requested TLS protocol versions: (TLSv1.2, TLSv1.3). You may need to configure the client to allow other protocols to be used. See: https://docs.gradle.org/6.9.2/userguide/build_environment.html#gradle_system_properties
    PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    unable to find valid certification path to requested target
    ✅ 1
  • l

    lively-dusk-19162

    03/21/2023, 9:32 PM
    Hello all, is it possible to upgrade the Gradle version in the project?
    ✅ 1
  • m

    miniature-kitchen-67237

    03/22/2023, 11:08 AM
    Hello guys! I'm having a problem with ingesting large databases. When I ingest smaller ones it is OK, but when I try to load larger DBs the UI gets slower and slower and finally stops working. I guess this might be an overloading problem. Where can I change the capacity so I can load larger ingestions? It would be great if I could increase it without deleting the current state.
  • h

    hallowed-microphone-6899

    03/22/2023, 1:43 PM
    Hi all, I just followed the Airflow Integration guide, but the Airflow log stays pending (status in picture 1, Airflow connections in picture 2, code in picture 3). Environment: airflow = 2.5.2 (standalone), acryl_datahub_airflow_plugin = 0.10.0.6, Python 3.9.6. Can you tell me what to do?
  • a

    astonishing-article-48608

    03/22/2023, 2:34 PM
    Hello,
  • a

    astonishing-article-48608

    03/22/2023, 2:39 PM
    Hello, I am getting this error when executing the registry.py:
    packages/datahub/ingestion/api/registry.py", line 88, in _register
        raise KeyError(f"key already in use - {key}")
    KeyError: 'key already in use - console'
    The above exception was the direct cause of the following exception:
    Traceback (most recent call last):
      File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-d4492384-1b86-49fb-899a-0c1dbb2dd109/lib/python3.9/site-packages/datahub/entrypoints.py", line 175, in main
        sys.exit(datahub(standalone_mode=False, **kwargs))
      File
  • a

    astonishing-article-48608

    03/22/2023, 2:41 PM
    Is there a flag to set "override" to true, i.e. to override the check?
    def _register( self, key: str, tp: Union[str, Type[T], Exception], override: bool = False
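To make the error above concrete, here is a toy registry that rejects duplicate keys unless an override flag is passed. It is a simplified model written for illustration, not DataHub's registry.py; the class `ToyRegistry` and the sink names are hypothetical. It shows that the error means the same key ("console" in the traceback) was registered twice.

```python
from typing import Dict, Generic, TypeVar

T = TypeVar("T")


class ToyRegistry(Generic[T]):
    """A toy plugin registry that rejects duplicate keys unless told to override."""

    def __init__(self) -> None:
        self._mapping: Dict[str, T] = {}

    def register(self, key: str, value: T, override: bool = False) -> None:
        # Refuse to silently replace an existing entry.
        if not override and key in self._mapping:
            raise KeyError(f"key already in use - {key}")
        self._mapping[key] = value

    def get(self, key: str) -> T:
        return self._mapping[key]


registry: ToyRegistry[str] = ToyRegistry()
registry.register("console", "first-sink")
try:
    registry.register("console", "second-sink")  # duplicate key is rejected
except KeyError as err:
    print("rejected:", err)
registry.register("console", "second-sink", override=True)  # explicit override
print(registry.get("console"))
```

In this model the override flag hides the symptom. Finding out why the key is registered twice is usually the more useful question.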
  • b

    breezy-honey-91751

    03/22/2023, 3:43 PM
    Hi, I am facing the same issue. Can anyone help here?
  • m

    microscopic-room-90690

    03/23/2023, 2:59 AM
    Hi team, I find https://github.com/datahub-project/datahub/tree/master/metadata-ingestion/src/datahub very useful. However, when I use it, I cannot find where the module datahub.metadata.schema_classes is. Can anyone help?
    from datahub.metadata.schema_classes import (
        GlossaryTermAssociationClass as Term,
        KafkaAuditHeaderClass,
        OwnerClass as Owner,
        OwnershipTypeClass,
        SystemMetadataClass,
        TagAssociationClass as Tag,
        UpstreamClass as Upstream,
    )
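As far as I know, schema_classes is generated at build time rather than checked into the repository, which would explain why it does not appear in the GitHub tree even though the import works from an installed package. A small sketch for checking where the module lives in the current environment; the helper name `locate_module` is hypothetical, and it only uses the standard library.

```python
import importlib.util
from typing import Optional


def locate_module(dotted_name: str) -> Optional[str]:
    """Return the file path of an importable module, or None if it is not installed."""
    try:
        spec = importlib.util.find_spec(dotted_name)
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return None
    return spec.origin if spec is not None else None


# Prints a path when the acryl-datahub package is installed, otherwise None.
print(locate_module("datahub.metadata.schema_classes"))
```

If this prints a path inside site-packages, the module came from the installed package rather than from a source checkout.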
    ✅ 1
  • h

    handsome-advantage-37565

    03/23/2023, 8:19 AM
    Hi Team, I would appreciate it if you could point me to an example of how to ingest metadata from the file source. I found the documentation, but I am not sure what goes in the input directory. Does the input file need specifically formatted contents? I configured a new source using the UI, and the config looks like this:
    source:
        type: file
        config:
            filename: 'C:\temp\ingestion\sampleMetadata.json'
    But I am unable to find out whether DataHub needs this input file to contain metadata in a specific format, or whether it accepts anything. I could not find a full end-to-end example for the file ingestion source. Documentation - https://datahubproject.io/docs/generated/ingestion/sources/file/
  • s

    square-yak-42039

    03/23/2023, 9:07 AM
    How can I prevent datahub from updating column and model descriptions from dbt yml files?
    ✅ 1
  • m

    mysterious-advantage-78411

    03/23/2023, 11:08 AM
    Hi, is there a way to ingest metadata from files like *.json.gzip from S3?
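This says nothing about whether DataHub's S3 source reads gzipped JSON directly. As a fallback, the sketch below shows how such a file could be decompressed and parsed by hand with the Python standard library before further processing. The helper name `load_gzipped_json` and the sample record are hypothetical, and fetching the bytes from S3 is left out.

```python
import gzip
import json
from typing import Any


def load_gzipped_json(payload: bytes) -> Any:
    """Decompress gzip bytes and parse the result as JSON."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))


# Round-trip a hypothetical record to show the shape of the data flow.
sample = gzip.compress(json.dumps([{"id": 1, "name": "orders"}]).encode("utf-8"))
print(load_gzipped_json(sample))
```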
    ✅ 1
  • f

    future-student-30987

    03/23/2023, 1:06 PM
    Anyone?
    ✅ 1
  • a

    adventurous-area-49559

    03/23/2023, 2:01 PM
    I got powerbi ingested which is great, just wondering if there was lineage between snowflake and powerbi?