# ingestion
  • a

    alert-fall-82501

    01/11/2023, 11:55 AM
My Hive ingestion Airflow DAG job is failing. The ingestion itself runs fine, but it gets tagged as failed at the end. Who can help me here, please? Logs and config file are in the thread. TIA!
    a
    d
    • 3
    • 4
  • b

    boundless-nail-65912

    01/11/2023, 12:54 PM
    Hello Team,
  • b

    bulky-grass-52762

    01/11/2023, 2:50 PM
    Hey @green-football-43791! Sorry for pinging you from here, but just wanted to ask something about sibling entities. Is there a reason why you’ve added sibling aspects to only
    dataset
    types? More in 🧵
    g
    d
    • 3
    • 15
  • p

    proud-waitress-17589

    01/11/2023, 3:56 PM
    Hi All, I'm curious if anyone has any experience integrating Segment.io with Datahub. I know there isn't currently an out of the box connector, so looking for any tips or guidance on a good way to connect the two.
    ✅ 1
    👀 1
    a
    • 2
    • 1
  • b

    busy-furniture-10879

    01/11/2023, 5:34 PM
Using the ingestion GUI, is there a way to set the token? I think I need to, since I keep getting bounced with:
    Copy code
container-info-[[]]-urn:li:container:e943b835b9e091b23b44fa8446526916 with ('Unable to emit metadata to DataHub GMS', {'message': '401 Client Error: Unauthorized for url: http://datahub-datahub-gms:8080/aspects?action=ingestProposal', 'id': 
    👀 1
    ✅ 1
    a
    g
    +2
    • 5
    • 15
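A 401 Unauthorized from GMS usually means the request carried no valid personal access token. As a minimal sketch (assuming token authentication is enabled on GMS; the server URL and token are placeholders), a CLI recipe's sink section can carry the token like this:

```yaml
# Hypothetical recipe fragment -- server and token values are placeholders
sink:
    type: datahub-rest
    config:
        server: 'http://datahub-datahub-gms:8080'
        token: '<personal-access-token>'
```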
  • a

    alert-fall-82501

    01/11/2023, 6:51 PM
I want to import Airflow DAG jobs into DataHub. Can anybody help me with that?
    ✅ 1
    g
    • 2
    • 2
  • b

    bland-appointment-45659

    01/11/2023, 10:29 PM
Team, we are trying to ingest metadata from Snowflake using v0.9.2. We are able to get view-table lineage, but table-table lineage is not coming through. Any pointers on what to look at further?
    ✅ 1
    m
    d
    h
    • 4
    • 12
  • g

    gray-cpu-75769

    01/11/2023, 6:31 PM
Hi Team, I’m facing a few issues while ingesting metadata from MongoDB. Attached below is the config I’m passing; the current DataHub and CLI version is 0.9.3. Any help would be appreciated.
    Copy code
    source:
        type: mongodb
        config:
            connect_uri: '<mongodb://172.21.3.68:3463>'
            options:
                authSource: admin
                replicaSet: rs0
                readPreference: primary
            username: ********
            password: ********
    pipeline_name: 'urn:li:dataHubIngestionSource:4bc8b8bb-1716-4725-b2b6-7fa6b0abe32e'
    Untitled.py
    h
    • 2
    • 5
  • p

    polite-actor-701

    01/12/2023, 6:34 AM
I have a question about Tableau. When the source is Tableau, there is no error if the sink is set to datahub-rest. However, when I set the sink to datahub-kafka, I got an error saying that localhost could not be found. Of course, datahub-rest also used localhost, and when the source is Oracle there was no error using datahub-kafka. Can't datahub-kafka be used with Tableau? If so, why? If not, is this a bug? I am using v0.8.32, and I use datahub-kafka like this:
Copy code
sink:
    type: "datahub-kafka"
    config:
        connection:
            bootstrap: "localhost:9092"
            schema_registry_url: "http://localhost:8081"
    ✅ 1
    h
    • 2
    • 5
  • l

    late-lunch-35723

    01/12/2023, 7:17 AM
Hi team, I have a question. I'm trying to view Snowflake lineage between tables in a database cloned from another database (using the query CREATE DATABASE target_db CLONE source_db), but I can't get lineage. Does DataHub support Snowflake lineage between tables in a cloned database?
    ✅ 1
    h
    • 2
    • 4
  • b

    bright-receptionist-94235

    01/12/2023, 7:41 AM
Hey, when will the new Vertica connector be available?
    ✅ 1
    a
    • 2
    • 3
  • b

    best-umbrella-88325

    01/12/2023, 8:19 AM
Hi Community! I'm testing the stateful ingestion feature on a MySQL source. I ingested 3 tables into DataHub, then dropped 1 table on the source (MySQL) and ingested again, with stateful ingestion enabled. But the removed table still appears. Is this how stateful ingestion works, or is something wrong here? Any help appreciated. Thanks in advance. Ingestion recipe in thread.
    ✅ 1
    c
    • 2
    • 7
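For reference, stale-entity removal is usually governed by the stateful_ingestion block plus a stable pipeline name. A minimal sketch (connection details and pipeline name are placeholders; exact field support is per the stateful ingestion docs):

```yaml
source:
    type: mysql
    config:
        host_port: 'localhost:3306'
        stateful_ingestion:
            enabled: true
            remove_stale_metadata: true
# state is keyed by pipeline_name; it must stay the same across runs
pipeline_name: 'mysql_prod'
```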
  • s

    salmon-psychiatrist-4013

    01/12/2023, 9:23 AM
Hello Team, I am trying to pull data from Vertica and, for the time being, sink it into a file. Pulling data from Vertica fails with the FATAL error "Authentication failed". I am sure the user/password combination is correct. After some trials I realized that the connection to the Vertica DB is secured with SSL, so I'm wondering how to securely connect to Vertica using SSL, as it would need a certificate etc. PS: sorry for the multiple edits to the question; I was pressing Enter, which kept submitting it 😞 Rookie mistake.
    ✅ 1
    a
    h
    • 3
    • 3
  • p

    proud-memory-42381

    01/12/2023, 9:47 AM
Hi! I'm trying to use a GMS secret in a YAML recipe located outside the Docker containers. Is that possible? I'm being told this: expandvars.UnboundVariable: 'NETEZZA_PASSWORD: unbound variable'. I'm not very comfortable providing clear-text passwords in the CLI-based recipe files... Thanks!
    h
    • 2
    • 5
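The expandvars error suggests the recipe does go through ${VAR} substitution, but the variable isn't set in the environment the CLI runs in. A sketch, assuming the variable is exported in the shell before invoking the CLI (the source type and field here are illustrative placeholders):

```yaml
# recipe.yaml -- password resolved from the environment at ingest time,
# e.g. after:  export NETEZZA_PASSWORD=...  in the same shell
source:
    type: netezza    # hypothetical source type for illustration
    config:
        password: '${NETEZZA_PASSWORD}'
```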
  • v

    victorious-spoon-76468

    01/12/2023, 1:21 PM
Hey, all! I’m currently trying to programmatically associate previously registered glossary terms with datasets, but the term URNs are saved as random/hashed(?) strings, such as
    urn:li:glossaryTerm:744edc26-4a11-49e8-96cc-d9816223624d
    . In my use case I expect the user to inform the actual term name, like
    Active Client
    , and based on this name associate the term with a dataset. Is it possible to get the random string based on the term name? Tried using
    make_term_urn("Active Client")
    but it just returns
    urn:li:glossaryTerm:Active Client
    . Thanks in advance!
    ✅ 1
    a
    • 2
    • 1
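For context, make_term_urn does no lookup: it simply prefixes whatever id it is given, which is why a display name passed in comes straight back out. A standalone sketch of that behavior (reimplemented here for illustration; resolving a display name like Active Client to its generated id would instead require searching the glossary, e.g. via the GraphQL or search APIs):

```python
def make_term_urn(term: str) -> str:
    # Mirrors the behavior of datahub.emitter.mce_builder.make_term_urn:
    # a plain string prefix, with no resolution of display names to term ids.
    return f"urn:li:glossaryTerm:{term}"

print(make_term_urn("Active Client"))  # urn:li:glossaryTerm:Active Client
```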
  • m

    magnificent-lawyer-97772

    01/12/2023, 1:53 PM
    Hi folks, I am seeing the following error in our mce-consumer logs:
    Copy code
    c.l.m.k.MetadataChangeProposalsProcessor - Error while processing FMCP: FailedMetadataChangeProposal - {error=com.linkedin.r2.RemoteInvocationException: com.linkedin.data.template.RequiredFieldNotPresentException: Field "value" is required but it is not present
It seems to be caused by the Tableau connector. Has anyone seen this with the Tableau connector? I know this error can occur in other circumstances as well. I will leave the full trace in a thread. Using version 0.9.1.
    👀 1
    ✅ 1
    a
    • 2
    • 2
  • g

    gray-ocean-32209

    01/12/2023, 2:14 PM
    @bulky-soccer-26729 I’m still seeing this issue redshift ingestion recipe running ‘cli_version’: ‘0.9.5’, and ‘gms_version’: ‘v0.9.5’
    Copy code
    [2023-01-12 19:28:09,001] ERROR    {datahub.entrypoints:189} - Failed to configure the source (redshift): 1 validation error for RedshiftConfig
    include_view_lineage
      extra fields not permitted (type=value_error.extra)
    b
    • 2
    • 9
  • s

    some-car-9623

    01/12/2023, 3:34 PM
Hello Everyone, we are facing an issue where only partial lineage shows up in the UI. For example, a dataset has 4 upstreams, but the UI displays only 2; the other 2 are missing. In the SQL it shows 4 upstreams. Any idea what is causing this?
    ✅ 1
    a
    • 2
    • 3
  • a

    average-family-72780

    01/12/2023, 3:50 PM
Hello, using the bigquery_v2 ingestion source, is it possible to include the full user email in the URN to differentiate between john.doe@gmail.com and john.doe@hotmail.com? Right now those users are mapped as one in the data catalog. https://github.com/datahub-project/datahub/issues/7021
    a
    • 2
    • 15
  • g

    gray-cpu-75769

    01/12/2023, 7:00 PM
Hello Everyone, I’m unable to view the log directory under the datahub-frontend pod after the DataHub upgrade from v0.9.1 to v0.9.3. Does anyone know of, or has anyone faced, a similar issue?
    ✅ 1
    h
    • 2
    • 3
  • b

    best-umbrella-88325

    01/13/2023, 6:34 AM
Hi All. Is there any way I can provide the ingress URL of my DataHub cluster, instead of the GMS endpoint, to the Java emitter?
    h
    • 2
    • 4
  • r

    rapid-crowd-46218

    01/13/2023, 7:48 AM
Hello, I'm trying to ingest the Glue data catalog on a local VM. However, when I run the YAML file, I get the following error message (in the image), but I don't know the root cause. The pod statuses in minikube are as follows. Is there a problem with the DataHub installation? Thank you in advance.
    h
    s
    • 3
    • 9
  • c

    curved-planet-99787

    01/13/2023, 8:09 AM
Athena source: I'm currently examining the Athena source and playing around a bit with all the features available for it. In general it works really well and reliably 🙌 There are just a few minor aspects that have popped up: 1. As reported in this GitHub issue, lineage does not work for Athena views, and view definitions are also not available. 2. Profiling does not work for views either. So I just wanted to ask what the current status is here: are these aspects simply not yet implemented for the Athena source, or are there issues with Athena that make it hard/impossible to provide these functionalities?
    ✅ 2
    plus1 1
    h
    • 2
    • 5
  • s

    salmon-psychiatrist-4013

    01/13/2023, 8:53 AM
Team, quick question: is it possible to have a sample data set along with the metadata in DataHub? I have metadata and profiling data available in DataHub, but the customer would also like to browse and look at the actual data (if not the full data, then at least a sample of, say, a few 1000 rows). Is this possible?
    ✅ 1
    h
    • 2
    • 2
  • a

    alert-fall-82501

    01/13/2023, 10:42 AM
Hi Team - I am ingesting metadata from Hive. My Airflow DAG jobs are failing with this ingestion source. Can anybody advise on this? See the error log file in the thread.
    h
    • 2
    • 11
  • f

    few-needle-68678

    01/14/2023, 4:20 PM
Hi All. One more question: when I try to ingest some data from OpenAPI, I get empty records: total records, IDs, and other parameters from the record are empty. What should I do? Thank you.
    b
    • 2
    • 1
  • p

    polite-activity-25364

    01/16/2023, 2:04 AM
Hi Team, is there a way to clear all ingestion history? In the image above, the BigQuery ingestion status is pending, but it actually succeeded (other sources' histories look fine). If I delete the history from the UI and ingest again, it appears in the pending state again. The environment is EKS with the Helm chart.
    🫠 1
    ✅ 1
    🥹 1
    a
    • 2
    • 2
  • m

    mammoth-gigabyte-6392

    01/16/2023, 6:15 AM
Hello team, can we do data profiling for Excel files too? I am ingesting an Excel file as a dataset from S3 and want to know if profiling is possible for it. If not, can someone please let me know which file types it is supported for? Thanks in advance!
    ✅ 1
    h
    • 2
    • 1
  • s

    steep-vr-39297

    01/16/2023, 10:48 AM
Hi, team. I'm testing
datahub docker quickstart
locally. Has the
c.l.m.s.e.update.BulkListener
error been corrected?
    👀 1
    e
    • 2
    • 16
  • h

    happy-camera-26449

    01/16/2023, 12:48 PM
Hi, I want to ingest S3 into DataHub, but it can't be connected directly, so I've added a proxy through aws_proxy. It's still not connecting. Is there anything else I'll have to change?
    ✅ 1
    g
    • 2
    • 3