# ingestion
  • i

    incalculable-ocean-74010

    03/05/2021, 7:14 PM
    Not at the moment, no; I'm simply implementing a CronJob resource in k8s
  • b

    brief-toothbrush-55766

    03/06/2021, 12:32 PM
    Did not recognize type 'geometry' of column 'geom'
      "Did not recognize type '%s' of column '%s'" % (attype, name)
  • c

    calm-lawyer-777

    03/17/2021, 11:02 AM
    Hi @chilly-spring-43918
  • m

    mammoth-bear-12532

    03/18/2021, 5:04 PM
    Just wanted to provide some status on the ingestion connectors: we now have support for the following systems (https://datahubproject.io/docs/metadata-ingestion#sources):
    • Kafka
    • MySQL
    • SQL Server
    • Hive
    • Postgres
    • Snowflake
    • BigQuery
    • Athena
    • Druid
    • LDAP
    Thanks for the contributions! Keep them coming! 🚀
    🎉 3
  • m

    mammoth-bear-12532

    03/23/2021, 5:15 AM
    @curved-crayon-1929: I just filed this (https://github.com/linkedin/datahub/issues/2280) for MongoDB. We'll get to it soon.
    👍 2
  • c

    curved-crayon-1929

    03/23/2021, 5:28 AM
    Thanks a lot @mammoth-bear-12532
  • l

    loud-island-88694

    04/05/2021, 2:49 PM
    welcome @wonderful-area-88986!
    :thankyou: 1
  • m

    mammoth-bear-12532

    04/06/2021, 12:20 AM
    @here: wanted to let people know that the much-requested AWS Glue support is now in. Please try it out and improve it as needed. Thanks to the folks at Depop and Klarna for working on it! https://datahubproject.io/docs/metadata-ingestion/#aws-glue-glue
  • i

    icy-easter-2378

    04/15/2021, 3:19 PM
    This is on Ubuntu, if that makes a difference
  • i

    icy-easter-2378

    04/15/2021, 3:20 PM
    vmadmin@datahub-poc:~$ pip --version
    pip 9.0.1 from /usr/lib/python2.7/dist-packages (python 2.7)
  • i

    icy-easter-2378

    04/15/2021, 3:20 PM
    Maybe I should be using pip3?
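
The `pip --version` output above shows that `pip` is bound to Python 2.7. A minimal sketch of a check for which interpreter a command actually uses, assuming the package being installed needs Python 3.6 or newer (that minimum is an assumption here; confirm it in the ingestion docs). The helper name `interpreter_ok` is hypothetical.

```python
import sys

# Assumed minimum version; check the metadata-ingestion docs for the real one.
MIN_VERSION = (3, 6)


def interpreter_ok(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the assumed minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if interpreter_ok():
    print("Interpreter %d.%d looks new enough" % tuple(sys.version_info[:2]))
else:
    print("Interpreter %d.%d is too old; try python3 -m pip instead of pip"
          % tuple(sys.version_info[:2]))
```

Running `python3 -m pip --version` instead of `pip --version` shows which interpreter that pip belongs to.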
  • m

    modern-nest-69826

    04/21/2021, 12:06 PM
    datahub_docker.sh ingest -c output.txt, datahub ingest -c output.txt
  • m

    modern-nest-69826

    04/21/2021, 12:08 PM
    Processing was started from datahub/metadata-ingestion folder
  • s

    steep-pizza-15641

    04/28/2021, 5:48 PM
    Hi, I'm a little confused about whether we refer to Postgres as 'postgres' or 'postgresql'. If I use Airflow native lineage, Datahub seems to prefer that I refer to Postgres as urn:li:dataPlatform:postgres (that way Datahub presents me with Postgres icons in the GUI), e.g. native lineage message:
    c.l.m.k.MetadataAuditEventsProcessor - {com.linkedin.metadata.snapshot.DatasetSnapshot={urn=urn:li:dataset:(urn:li:dataPlatform:postgres,myapp.public.target_table_b,PROD), aspects=[{com.linkedin.dataset.UpstreamLineage={upstreams=[{auditStamp={actor=urn:li:corpuser:datahub, time=1619631366909}, type=TRANSFORMED, dataset=urn:li:dataset:(urn:li:dataPlatform:postgres,myapp.public.source_table_c,PROD)}, {auditStamp={actor=urn:li:corpuser:datahub, time=1619631366909}, type=TRANSFORMED, dataset=urn:li:dataset:(urn:li:dataPlatform:postgres,myapp.public.source_table_d,PROD)}]}}]}}
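
The dataset URNs in the log line above all follow one pattern: a platform URN, a dataset name, and an environment. A minimal sketch of that pattern, using local helper functions written for illustration (`build_dataset_urn` and `platform_of` are hypothetical names, not DataHub APIs):

```python
def build_dataset_urn(platform, name, env="PROD"):
    """Assemble a dataset URN in the shape seen in the log line."""
    return "urn:li:dataset:(urn:li:dataPlatform:%s,%s,%s)" % (platform, name, env)


def platform_of(dataset_urn):
    """Extract the platform name from a dataset URN of that shape."""
    prefix = "urn:li:dataset:(urn:li:dataPlatform:"
    if not dataset_urn.startswith(prefix):
        raise ValueError("not a dataset URN: %r" % dataset_urn)
    # The platform name runs from the end of the prefix to the first comma.
    return dataset_urn[len(prefix):].split(",", 1)[0]


urn = build_dataset_urn("postgres", "myapp.public.target_table_b")
print(urn)
print(platform_of(urn))
```

With this shape, 'postgres' and 'postgresql' give two different platform URNs, so the spelling has to match across every source that emits lineage for the same tables.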
  • i

    icy-holiday-55016

    05/07/2021, 12:52 PM
    Hi, I'm trying to ingest lineage data following the recently updated guide here: https://github.com/linkedin/datahub/tree/master/metadata-ingestion#using-datahubs-airflow-lineage-backend-recommended
  • i

    icy-holiday-55016

    05/07/2021, 12:53 PM
    oops hit enter too quickly, rest of the message to follow...
  • i

    icy-holiday-55016

    05/07/2021, 3:54 PM
    Answered my own question above: it was due to me using Airflow 2.0.1 as opposed to 2.0.2 (it's in the docs but I missed it). Comms between Airflow and Datahub working now
    🙌 4
  • s

    square-greece-86505

    05/10/2021, 12:19 PM
    Hi all, I created a PR for ingesting lineage from Kafka Connect. It is currently limited to the Debezium source connector. Please have a look and test 🙂 https://github.com/linkedin/datahub/pull/2516
    🎉 5
  • c

    curved-sandwich-81699

    05/21/2021, 11:14 PM
    Hi everyone, I just opened a PR to fix some issues we were having with the lineage/urns generated after ingesting metadata from dbt. If anyone here could test and/or review, that would be great: https://github.com/linkedin/datahub/pull/2596
  • e

    enough-potato-17984

    05/26/2021, 2:54 AM
    Cool! How about lineage for MySQL and other sources? Are there other components to handle them?
  • p

    powerful-telephone-71997

    06/01/2021, 6:17 AM
    Hi folks, I tried ingesting metadata from Redshift, but I think the script gathers the table list at the beginning, and if a temp table is dropped in between, the script reports table not found… This should be a warning and not an error, in my opinion. I am using the REST API; should I try ingesting via Kafka instead? Will this issue go away?
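
The behaviour suggested above (warn and skip when a table vanishes between listing and inspection) can be sketched as follows. This is an illustration only, not the Redshift source's actual code: `TableNotFound`, `collect_schemas`, and `fake_describe` are hypothetical names, and the catalog is a stand-in dictionary.

```python
import logging

logger = logging.getLogger("ingestion_sketch")


class TableNotFound(Exception):
    """Hypothetical error raised when a listed table no longer exists."""


def collect_schemas(table_names, describe):
    """Describe each listed table; warn and skip any that has disappeared."""
    schemas = {}
    skipped = []
    for name in table_names:
        try:
            schemas[name] = describe(name)
        except TableNotFound:
            # The table was listed earlier but dropped before inspection.
            logger.warning("Table %s disappeared after listing; skipping", name)
            skipped.append(name)
    return schemas, skipped


# Stand-in catalog: "tmp_stage_1" was listed but has since been dropped.
_catalog = {"orders": ["id", "total"], "users": ["id", "email"]}


def fake_describe(name):
    if name not in _catalog:
        raise TableNotFound(name)
    return _catalog[name]


listed = ["orders", "tmp_stage_1", "users"]
schemas, skipped = collect_schemas(listed, fake_describe)
print(sorted(schemas), skipped)
```

Keeping a separate list of skipped tables lets the run finish while still reporting what was missed.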
  • p

    powerful-telephone-71997

    06/03/2021, 1:51 PM
    Don't know what the issue is; I get this during ingestion, right at the end:
  • p

    powerful-telephone-71997

    06/03/2021, 1:52 PM
    However the catalog shows the tables
  • l

    loud-island-88694

    06/03/2021, 4:16 PM
    @powerful-telephone-71997 can we set up a call to unblock you?
  • a

    astonishing-mechanic-42915

    06/04/2021, 10:44 AM
    Getting the following error while ingesting data from dbt to datahub GMS.
  • a

    astonishing-mechanic-42915

    06/04/2021, 10:44 AM
    Appreciate any help!
  • b

    better-orange-49102

    06/08/2021, 8:15 AM
    I have DataHub running in Docker containers in a VM, and I tried to ingest via the REST API at the host container, using the acryl-datahub package. I fed a JSON file to the REST API using
    datahub ingest -c <recipe name>
    I kept getting this message:
  • m

    miniature-airport-96424

    06/17/2021, 1:30 PM
    Hey, I don't know if I'm on the right channel; I'm working on deploying DataHub on k8s
  • m

    miniature-airport-96424

    06/17/2021, 1:30 PM
    Did I miss a component in my deployment?
    datahub-datahub-frontend-94d4cd6b5-nch6g                   1/1     Running   0          14h
    datahub-datahub-gms-748884b4db-69cg2                       0/1     Running   1          6m53s
    datahub-prerequisites-cp-schema-registry-dd6d4fb86-n55dz   2/2     Running   0          14h
    datahub-prerequisites-kafka-0                              1/1     Running   0          9m15s
    datahub-prerequisites-neo4j-community-0                    1/1     Running   0          14h
    datahub-prerequisites-zookeeper-0                          1/1     Running   0          14h
    elasticsearch-master-0                                     1/1     Running   0          14h
    elasticsearch-master-1                                     1/1     Running   0          14h
    elasticsearch-master-2                                     1/1     Running   0          14h
    flux-68bdf85d98-gmblt                                      1/1     Running   0          14h
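
In the listing above every pod is Running, but datahub-datahub-gms shows 0/1 in the second column, with one restart. A minimal sketch that picks such pods out of pasted output, assuming the second whitespace-separated field has the form ready/total (`find_unready_pods` is a hypothetical helper name):

```python
from typing import List, Tuple


def find_unready_pods(listing: str) -> List[Tuple[str, int, int]]:
    """Return (pod_name, ready, total) for pods with fewer ready containers than total."""
    unready = []
    for line in listing.strip().splitlines():
        fields = line.split()
        # Skip blank lines and any header row, whose second field is not "n/m".
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        ready_text, total_text = fields[1].split("/", 1)
        if not (ready_text.isdigit() and total_text.isdigit()):
            continue
        ready, total = int(ready_text), int(total_text)
        if ready < total:
            unready.append((fields[0], ready, total))
    return unready


# A few rows copied from the listing in the message above.
SAMPLE = """
datahub-datahub-frontend-94d4cd6b5-nch6g                   1/1     Running   0          14h
datahub-datahub-gms-748884b4db-69cg2                       0/1     Running   1          6m53s
datahub-prerequisites-kafka-0                              1/1     Running   0          9m15s
"""

for name, ready, total in find_unready_pods(SAMPLE):
    print("%s is %d/%d ready" % (name, ready, total))
```

A pod that is Running but not ready is usually worth inspecting with `kubectl describe pod` and `kubectl logs` before concluding that a component is missing.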