# ingestion
  • l

    late-bear-87552

    02/02/2022, 6:05 PM
    I want to deny tables whose names start with temp_. Can anyone help me with the YAML file?
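    For context, ingestion recipes take regex allow/deny patterns per source; a minimal sketch (the source type, connection details, and exact pattern scope are placeholders that vary by source) might look like:

    ```yaml
    source:
      type: mysql  # placeholder source type
      config:
        host_port: "localhost:3306"  # placeholder connection details
        table_pattern:
          deny:
            # many sources match patterns against "<database>.<table>"
            - ".*\\.temp_.*"
    sink:
      type: datahub-rest
      config:
        server: "http://localhost:8080"
    ```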
  • p

    plain-farmer-27314

    02/09/2022, 2:23 PM
    Hi, where can we learn more about:
    Data Quality — Metadata Model Support
    Data Quality test results are now supported in the DataHub backend metadata model!
    Would love to see how we might be able to integrate our data quality checks
  • i

    important-fireman-21788

    02/09/2022, 2:58 PM
    Hi, like @cool-iron-6335 in his message: https://datahubspace.slack.com/archives/CUMUWQU66/p1626083174047000, I would like to know if there is an ingestion source or recipe to ingest HDFS files (mostly CSV) into DataHub. I have the same question for HBase (namespace, table, possibly column family, ...). I saw that there is a Hive metadata ingestion source, but I would like to also add HDFS and HBase resources to DataHub. Thanks in advance for your knowledge.
  • g

    great-dusk-47152

    02/11/2022, 1:35 AM
    @helpful-optician-78938 Can't slide; clicking the grey button just shows this.
  • w

    witty-butcher-82399

    02/14/2022, 2:59 PM
    How do I set the commit policy for the `DatahubIngestionCheckpointingProvider`? Is that configurable somehow? https://github.com/linkedin/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/run/pipeline.py#L235-L243
  • m

    mysterious-kitchen-97015

    02/15/2022, 10:27 AM
    [removed, wrong channel]
  • m

    mysterious-nail-70388

    02/16/2022, 5:55 AM
    Hi guys, I am having a problem with UI metadata ingestion: the pip we use locally to install other plug-ins points at a mirror source, and connecting to the original index times out, causing the execution to fail. Are there any good solutions or optimizations in this area?
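    A common workaround, sketched here with an illustrative mirror URL, is to point pip at the reachable internal mirror via pip.conf so that plug-in installs during UI ingestion resolve against it:

    ```ini
    # ~/.config/pip/pip.conf (Linux); the index URL below is illustrative
    [global]
    index-url = https://mirrors.example.internal/pypi/simple
    timeout = 60
    ```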
  • h

    happy-twilight-40558

    02/17/2022, 9:19 AM
    Hi guys, one of the requirements we have is to ingest metadata from:
    - BI tools like SAP BO
    - ETL tools like Informatica/DataStage
    - RDBMSs like Oracle Exadata
    - csv/excel/txt files
    We wonder if these sources are in scope for plugin development in the coming roadmap (Q1 2022), or, better, whether someone has already developed them. Is it possible? Does anyone else have the same needs as us? Thanks, Mauro
  • w

    wide-army-23885

    02/17/2022, 10:42 PM
    I really appreciate any input
  • m

    mysterious-nail-70388

    02/21/2022, 8:47 AM
    Is there any requirement on the data source version for Elasticsearch metadata ingestion? For example, there are version differences between ES 5 and ES 7.
  • s

    stale-printer-44316

    02/21/2022, 4:24 PM
    Can anyone help me on the above please?
  • l

    lively-fall-12210

    02/23/2022, 10:20 AM
    Cross-posted as I am not sure where to put this question
  • l

    late-animal-78943

    02/24/2022, 11:17 AM
    👋 Hello, team!
  • d

    dazzling-judge-80093

    02/25/2022, 11:14 AM
    The error message complains about this as well, but not in a very user-friendly way ->
  • r

    rough-van-26693

    02/28/2022, 5:11 AM
    Screenshot 2022-02-28 at 1.10.52 PM.png
  • a

    ancient-knife-38383

    02/28/2022, 11:37 PM
    I have to say that an OpenLineage API endpoint would be really useful. We have implemented an OpenLineage-format emitter in our product as well, so this would be an awesome way to onboard more users to DataHub
    ➕ 3
  • f

    fierce-airplane-70308

    03/01/2022, 8:41 PM
    @square-solstice-69079 did you also install the Python cx_Oracle package?
  • b

    better-orange-49102

    03/02/2022, 3:39 AM
    For glossary terms, there is an option to introduce "inherit" and "contain" related terms. Right now, you need to specify the related terms manually for each and every term, right, since the respective terms do not know who inherits from them? Is there any depth limitation for glossary terms? Also, what's the functional difference between a glossary node and a browsepath "folder", since I don't see a placeholder for the glossaryNode description?
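    Illustrative only: DataHub's business glossary can also be ingested from a YAML file, and (assuming the `inherits`/`contains` keys of that format) related terms are declared per term, e.g.:

    ```yaml
    # Sketch of a business-glossary file; all names and descriptions are illustrative.
    version: 1
    source: DataHub
    owners:
      users:
        - datahub
    nodes:
      - name: Classification
        description: Data classification terms
        terms:
          - name: Sensitive
            description: Non-public data
          - name: Confidential
            description: Restricted data
            inherits:
              - Classification.Sensitive
    ```

    As the question notes, the relationship is declared on the inheriting term; the parent term does not list its children.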
  • n

    nutritious-bird-77396

    03/02/2022, 6:42 PM
    @shy-parrot-64120 Were you able to find a fix for this? I see the same issue, where the source dataset schema is not ingested and does not match the source ingested from the postgres plugin.
  • q

    quiet-kilobyte-82304

    03/02/2022, 8:42 PM
    Is there an easy way to bulk update the fabric from `dev` to `prod` across all urns?
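    For illustration only (this is not a built-in DataHub command): dataset URNs carry the fabric as the last tuple element, so assuming URNs of the usual `urn:li:dataset:(<platform>,<name>,<fabric>)` shape, a bulk rewrite can be sketched as a plain string transform:

    ```python
    def switch_fabric(urn: str, old: str = "DEV", new: str = "PROD") -> str:
        """Rewrite the trailing fabric element of a dataset URN, e.g. DEV -> PROD."""
        suffix = f",{old})"
        if urn.endswith(suffix):
            return urn[: -len(suffix)] + f",{new})"
        return urn  # leave non-matching URNs untouched

    urns = [
        "urn:li:dataset:(urn:li:dataPlatform:postgres,public.users,DEV)",
        "urn:li:dataset:(urn:li:dataPlatform:hive,db.events,PROD)",
    ]
    print([switch_fabric(u) for u in urns])
    ```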
  • f

    full-cartoon-72793

    03/03/2022, 6:28 PM
    Hello. I got a deployment of DataHub in AKS up and running using the default config and the helm install commands. Now I am looking to set up ingestion from Databricks via the Hive connector. The documentation says to “Ensure that databricks-dbapi is installed. If not, use `pip install databricks-dbapi` to install”. Any ideas on how I can ensure this is installed inside my AKS cluster?
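    As a quick, generic check (not DataHub-specific), you can verify from inside the running container, e.g. via `kubectl exec`, whether the package is importable:

    ```python
    import importlib.util

    def is_installed(module: str) -> bool:
        """True if `module` can be imported in the current environment."""
        return importlib.util.find_spec(module) is not None

    # the databricks-dbapi distribution installs the `databricks_dbapi` module
    print(is_installed("databricks_dbapi"))
    ```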
  • a

    astonishing-monkey-81508

    03/08/2022, 3:58 AM
    @stocky-midnight-78204 Seeing same things on my side
  • a

    astonishing-monkey-81508

    03/08/2022, 3:58 AM
    image.png
  • f

    fierce-waiter-13795

    03/08/2022, 9:23 AM
    Bumping this up, because I think this could be a bug
  • s

    stocky-midnight-78204

    03/09/2022, 12:30 PM
    When I submit one SQL job running four insert SQL queries, two of the insert tasks are executed successfully and the remaining two tasks fail due to the error below:
  • s

    stocky-midnight-78204

    03/09/2022, 12:30 PM
    Has anyone faced this issue?
  • b

    broad-thailand-41358

    03/09/2022, 2:37 PM
    Thanks @green-football-43791, it looks like updating worked, but now I'm getting a connection timeout. I believe it's because urllib3 doesn't read the http_proxy variables. Is there a way to pass proxy info via the config or the `datahub ingest` command?
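    A sketch, assuming the CLI's HTTP client honors the standard proxy environment variables (the proxy host below is illustrative):

    ```shell
    # Export both upper- and lower-case forms, since tools differ in which they read.
    export HTTP_PROXY="http://proxy.internal:3128"
    export http_proxy="http://proxy.internal:3128"
    export HTTPS_PROXY="http://proxy.internal:3128"
    export https_proxy="http://proxy.internal:3128"
    # then run the ingestion as usual:
    # datahub ingest -c recipe.yml
    ```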
  • k

    kind-baker-52130

    03/10/2022, 12:21 AM
    Hi All - I am setting up a new quickstart and am now stuck on this error while connecting to Snowflake. Has anybody seen this?
  • c

    curved-carpenter-44858

    03/10/2022, 10:54 AM
    Any update on the above? We also have the same problem. Currently we are using the open source version of Delta Lake along with a standalone Hive metastore service (no HiveServer). For the metadata ingestion, I tried using a Spark Thrift server but it did not work, so I am looking for other options. Is there any option currently available in DataHub to ingest the metadata of Delta Lake tables from the standalone Hive metastore service?
  • d

    damp-queen-61493

    03/10/2022, 4:50 PM
    Hello! Is there any transformer to set a dataset's domain?