# ingestion
• victorious-evening-88418 (02/17/2023, 1:54 PM)
Hi @ripe-eye-60209, I had the same problem in the past. I solved the issue by upgrading to DataHub CLI version 0.10.0 and commenting out "workspace_id_pattern" in the recipe.
    thanks bear 1
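What that recipe change might look like, assuming the Power BI source (the thread doesn't name the source, but workspace_id_pattern is a Power BI recipe option); the auth values are placeholders:

source:
  type: "powerbi"
  config:
    tenant_id: "..."
    client_id: "..."
    client_secret: "..."
    # Commented out, per the fix above:
    # workspace_id_pattern:
    #   allow:
    #     - "my-workspace-id"
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"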
• lively-dusk-19162 (02/23/2023, 2:09 AM)
Hello, I am facing the below error when running the command ./gradlew build. Could you please help me with this error?
• lively-dusk-19162 (03/01/2023, 3:10 PM)
    Could anyone please help me on this?
• most-animal-32096 (03/06/2023, 5:06 PM)
So, as a follow-up, attached is an archive of a very minimal sample Gradle project to import next to the datahub-client one, to actually try metadata emission through REST and Kafka. (NB: the previously mentioned documentation misses the emitter.close() call and doesn't mention the required Gradle dependencies.)
datahub-client-sample.zip
    🩺 1
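The thread concerns the Java datahub-client, but the same emission flow in the Python SDK makes a useful reference; a sketch with a hypothetical server and dataset, ending with the explicit close() that the documentation reportedly omits on the Java side:

from datahub.emitter.mce_builder import make_dataset_urn
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import DatasetPropertiesClass

# Hypothetical GMS endpoint.
emitter = DatahubRestEmitter(gms_server="http://localhost:8080")
try:
    mcp = MetadataChangeProposalWrapper(
        entityUrn=make_dataset_urn(platform="hive", name="fct_users", env="PROD"),
        aspect=DatasetPropertiesClass(description="A sample dataset."),
    )
    emitter.emit(mcp)
finally:
    emitter.close()  # release the underlying HTTP session when done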
• numerous-scientist-83156 (03/08/2023, 10:31 AM)
I did some more digging around trying to find out why this was not working as expected. I found that if I changed the platform from adlsGen2 to adlsg2 (both name and id, and my class' local variable), it would work as expected, breadcrumbs and all (first picture). My coworker then mentioned that there are some predefined data platforms in data_platform.json; there I noticed that the delimiter for adlsGen2 is "/" instead of ".". So just for fun I changed the platform back to the predefined name adlsGen2 but added a line that replaces every "." in the dataset urn with "/", and this also works as expected. I've then looked through the code a bit more and found that the function create_from_ids, which is used by the make_dataset_urn_with_platform_instance function, is made to always use "." as the delimiter in the name. Is this working as intended? Is there another function I should be using to generate the dataset_urn when it comes from adlsGen2?
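A rough sketch of the workaround described above, using the Python SDK helper named in the message (the dataset name and instance are hypothetical):

from datahub.emitter.mce_builder import make_dataset_urn_with_platform_instance

# Hypothetical ADLS Gen2 path, expressed with the "." delimiter the helper uses.
name = "container.folder.file.csv"

urn = make_dataset_urn_with_platform_instance(
    platform="adlsGen2",
    name=name,
    platform_instance="prod",  # hypothetical instance
    env="PROD",
)

# The workaround from the message: data_platform.json declares "/" as the
# adlsGen2 delimiter, so swap the "." the helper hardcodes for "/".
urn = urn.replace(name, name.replace(".", "/"))
print(urn)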
• witty-butcher-82399 (03/20/2023, 12:03 PM)
    A quick search in the code shows the breaking changes have been implemented already https://github.com/datahub-project/datahub/blob/b526dc1ab6cd31cc235cd0edf87caacbba[…]metadata-ingestion/src/datahub/ingestion/source/dbt/dbt_core.py Question solved 😅
• wonderful-jordan-36532 (03/22/2023, 11:18 AM)
    Not a particular platform actually, but Databricks ml models or mlflow would work best for us. Otherwise AWS or just via plain documentation
• clean-scooter-32205 (03/23/2023, 11:59 AM)
Hi! When trying to ingest from Databricks Unity Catalog, I get a PERMISSION_DENIED: "Only account admin can list metastores". Is there a way to not require an account admin token? I would only be using a specific metastore id, and there's no way the account admin would be OK with having an associated token lying around.
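For context, a minimal Unity Catalog recipe looks roughly like this (workspace URL is a placeholder); the metastore listing that triggers the PERMISSION_DENIED happens with whatever token is supplied here:

source:
  type: "unity-catalog"
  config:
    workspace_url: "https://my-workspace.cloud.databricks.com"
    token: "${DATABRICKS_TOKEN}"  # the token whose admin rights are at issue
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"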
• brash-caravan-14114 (03/28/2023, 3:18 PM)
I am experiencing the same error. I configured according to the documentation. Kafka-setup-job, gms, and frontend were all deployed successfully, using only IAM permissions. The topics were created on the cluster, which proves the authentication works. However, when trying to run an ingestion from the UI, I see the same error as @best-napkin-60434 in the datahub-actions pod. I have also tried to configure executor.yaml as suggested here, and replaced the user/password configuration with the JAAS configuration described here. I received the following error:
    KafkaException: KafkaError{code=_INVALID_ARG,val=-186,str="Java JAAS configuration is not supported, see <https://github.com/edenhill/librdkafka/wiki/Using-SASL-with-librdkafka> for more information."}
Is it possible to use datahub-actions and authenticate using IAM? Attaching executor.yaml. Thanks!
    executor.yaml
    plus1 1
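The error comes from librdkafka, which takes flat properties rather than JAAS. A sketch of how the Kafka connection in executor.yaml might be expressed with librdkafka-style SASL settings instead (SCRAM is an assumption here; librdkafka has no drop-in equivalent of the MSK IAM JAAS module):

source:
  type: "kafka"
  config:
    connection:
      bootstrap: "b-1.my-msk-cluster.amazonaws.com:9096"  # hypothetical broker
      consumer_config:
        security.protocol: "SASL_SSL"
        sasl.mechanism: "SCRAM-SHA-512"
        sasl.username: "${KAFKA_USERNAME}"
        sasl.password: "${KAFKA_PASSWORD}"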
• witty-butcher-82399 (03/28/2023, 4:17 PM)
Beyond the concern about the missing entities in the map, we have found a scenario where an AssertionError is thrown when doing the _should_process validation. I have created a PR fixing this case: https://github.com/datahub-project/datahub/pull/7702
• proud-dusk-671 (03/30/2023, 11:23 AM)
Any updates on this? The docs are written with respect to Confluent, and no information about AWS MSK ingestion is provided.
• modern-france-82371 (04/05/2023, 10:26 AM)
Hi, I'm using version 0.10.0. I tried it both with the additional access policies configured and without, and got the same result either way. I think it's a bug.
• bland-barista-59197 (04/13/2023, 7:46 PM)
Hi Team, I'm getting the same error. Did you find any solution to address the 404 error?
• quiet-rain-16785 (04/17/2023, 2:05 PM)
Hi guys, any update on this? I am doing the same... but getting nothing in DataHub!! Can you help me, @dazzling-judge-80093 @quiet-television-68466?
    plus1 2
• agreeable-table-54007 (04/18/2023, 9:00 AM)
@modern-artist-55754 Oh okay, thanks for the info. So if I want to ingest data from Data Factory, I'd need to convert it into something else. But then, for CSV, are there only these columns (resource, subresource, glossary_terms, tags, owners, ownership_type, description, domain), or can we add more? And is the JSON schema useful for ingesting CSV files? Is this YAML structure correct for a CSV file?

source:
  type: "file"
  config:
    format: "csv"
    path: "/path/to/your/data.csv"
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"

Thanks.
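For reference, that column set matches the csv-enricher source, which reads the CSV directly rather than going through the generic file source; a sketch, with the path as a placeholder:

source:
  type: "csv-enricher"
  config:
    filename: "/path/to/your/data.csv"
    write_semantics: PATCH  # merge with existing metadata; OVERRIDE replaces it
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"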
• damp-lighter-99739 (04/18/2023, 2:28 PM)
Hi team, could someone help with this, please?
• wonderful-jordan-36532 (04/24/2023, 6:53 AM)
    How did you resolve bypassing the 2FA requirement for Tableau ingestion? @brave-france-7945
• quiet-television-68466 (04/25/2023, 11:49 AM)
    Really sorry to message this in the chat, but still looking for a bit more help if anyone has any ideas!
• adorable-magazine-49274 (04/25/2023, 12:01 PM)
Is there anyone who can help me?
• numerous-byte-87938 (05/01/2023, 5:29 PM)
    Gentle bump in case it was missed 😃
• fresh-dusk-60832 (05/02/2023, 2:15 PM)
    did anyone have this problem?
• flaky-refrigerator-97518 (05/09/2023, 2:51 AM)
0.10.2 quickstart (docker-compose). Logs already shared: org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=index_not_found_exception, reason=no such index]. The error occurs when I add a new custom entity.
• bland-barista-59197 (05/11/2023, 7:39 AM)
Hi @dazzling-judge-80093, I think this can be reproduced by:
1. Set project_on_behalf to a project other than the scanning project, e.g. bq-project-1.
2. Add two datasets in bq-project-1: one with a partition key and one without.
Solution: https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/source/ge_data_profiler.py#L923 should be something like:
bq_sql = f"SELECT * FROM {schema}.`{table}`"
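To illustrate the suggested fix with hypothetical names: a partition decorator (or any special character) in a table reference is only legal in BigQuery when the identifier is backtick-quoted.

schema = "my_project.dataset_a"  # hypothetical dataset
table = "events$20230511"        # hypothetical partitioned-table decorator

broken = f"SELECT * FROM {schema}.{table}"    # BigQuery rejects the bare "$"
fixed = f"SELECT * FROM {schema}.`{table}`"   # the quoting suggested above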
• ripe-helmet-49084 (05/31/2023, 9:47 AM)
Hi All, can someone advise on this, please?
• fierce-agent-11572 (05/31/2023, 2:12 PM)
    thank you very much 🙏
• astonishing-father-13229 (05/31/2023, 3:58 PM)
Thanks for all the help, it's working now. Thanks Steve and Bagwan 😎
    👍 1
• hundreds-airline-29192 (06/02/2023, 8:19 AM)
Quickstart with a pinned version solved my problem.
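For anyone hitting the same thing, pinning the quickstart version looks like this (the version number is only an example):

datahub docker quickstart --version v0.10.2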
• ripe-helmet-49084 (06/02/2023, 10:57 AM)
Hi @gentle-hamburger-31302, can you please help me get table-level lineage from MySQL to Redash? Not sure what I am missing here. I am using the recipe below.

source:
  type: "redash"
  config:
    connect_uri: "****"
    api_key: "****"
    dashboard_patterns:
      allow:
        - "test_dashboard"
    chart_patterns:
      allow:
        - "test_chart_*"
    parse_table_names_from_sql: true
sink:
  type: "datahub-rest"
  config:
    server: "http://****:8080"