# ingestion
  • b

    brash-sundown-77702

    05/19/2022, 2:34 PM
    I keep getting this error -
  • b

    brash-sundown-77702

    05/19/2022, 2:34 PM
    [2022-05-19 07:32:38,929] ERROR    {datahub.entrypoints:165} - mysql is disabled; try running: pip install 'acryl-datahub[mysql]'
  • b

    brash-sundown-77702

    05/19/2022, 2:35 PM
    I ran 'pip install 'acryl-datahub[mysql]'' successfully, but I am still getting the above error.
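A common cause of this symptom (the extra installs cleanly but the CLI still reports the plugin disabled) is that `pip` and the `datahub` CLI resolve to different Python environments. A minimal sketch to compare the two; the paths printed are whatever your system reports:

```python
import shutil
import sys

# Where does the `datahub` CLI script resolve on PATH? (None if not found)
cli_path = shutil.which("datahub")
print("datahub CLI:", cli_path)

# Which interpreter did pip just install the extra into?
print("this interpreter:", sys.executable)

# If the CLI's environment differs from sys.executable, reinstall the extra
# with the interpreter that owns the CLI, e.g.:
#   <that-python> -m pip install 'acryl-datahub[mysql]'
```

Running the ingest as `python3 -m datahub ingest -c …` with the same interpreter that did the install is another way to rule out an environment mismatch.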
  • b

    brash-sundown-77702

    05/19/2022, 2:35 PM
    CLI command used is datahub ingest -c mysql-datahub_localhost.yml
  • b

    brash-sundown-77702

    05/19/2022, 2:36 PM
    [root@localhost mysql]# cat mysql-datahub_localhost.yml
    source:
      type: "mysql"
      config:
        username: datahub
        password: datahub
        database: mediawiki
        host_port: :3306
    sink:
      type: "datahub-rest"
      config:
        server: "http://localhost:8080"
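For comparison, a minimal recipe sketch with an explicit hostname in `host_port` (the hostname below is an assumption; the pasted recipe has nothing before `:3306`, which the mysql source may not accept):

```yaml
# Sketch of a MySQL recipe, assuming the server runs on the same host as the CLI:
source:
  type: mysql
  config:
    host_port: "localhost:3306"   # host must precede the port
    username: datahub
    password: datahub
    database: mediawiki
sink:
  type: datahub-rest
  config:
    server: "http://localhost:8080"
```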
  • n

    nice-mechanic-83147

    05/20/2022, 6:53 AM
    Hi Team,
  • c

    chilly-gpu-46080

    05/23/2022, 4:16 AM
    does this need to be
    dbo.Base_Data
    for Great Expectations validations to work? Would greatly appreciate any help on this!
  • c

    chilly-gpu-46080

    05/23/2022, 7:37 AM
    Hello all! @hundreds-photographer-13496 was able to pinpoint the issue that was causing Great Expectations assertions to not link with DataHub. It seems it was to do with how I was ingesting from the MSSQL source. In my recipe I did not specify a
    database
    and hence the urn generated was not the same as the one from GE.
    urn in DataHub:
    … dbo.table_name …
    urn from the GE assertion:
    … database_name.dbo.table_name …
    Not specifying a
    database
    allowed me to ingest ALL tables from my MSSQL source.
    Current behaviour:
    • when a database is included in the recipe, only that one database is ingested, and the database name is included in the dataset urn
    • when no database is included in the recipe, all databases are ingested, but the database name is not included in the dataset urn
    Expected behaviour: ingest all tables with the correct urn.
    PS: is creating a separate recipe for each database the correct way to ingest?
  • b

    breezy-noon-83306

    05/24/2022, 2:37 PM
    Is it possible to ingest data sources from Azure or other clouds into DataHub?
  • c

    careful-insurance-60247

    05/24/2022, 7:16 PM
    Is there a feature that allows us to set a header to bypass some authentication we have set up on our edge? We use Cloudflare's Zero Trust and need to define a header to get around SSO.
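On the CLI side, the `datahub-rest` sink accepts a map of extra request headers in recent versions (treat the `extra_headers` option name as something to verify against your DataHub version); a sketch with placeholder Cloudflare Access service-token headers:

```yaml
# Hedged sketch: hostname and header values are placeholders.
sink:
  type: datahub-rest
  config:
    server: "https://datahub.example.com/gms"
    extra_headers:
      CF-Access-Client-Id: "<client-id>"
      CF-Access-Client-Secret: "<client-secret>"
```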
  • s

    stocky-midnight-78204

    05/31/2022, 7:53 AM
    Anyone facing this issue?
    2022/05/31 07:41:39 Connected to tcp://mysql:3306
    2022/05/31 07:41:39 Connected to tcp://broker:29092
    2022/05/31 07:42:09 Problem with request: Get "http://elasticsearch:9200": proxyconnect tcp: dial tcp 116.62.81.173:80: i/o timeout. Sleeping 1s
    2022/05/31 07:42:09 Problem with request: Get "http://neo4j:7474": proxyconnect tcp: dial tcp 116.62.81.173:80: i/o timeout. Sleeping 1s
    2022/05/31 07:42:40 Problem with request: Get "http://neo4j:7474": proxyconnect tcp: dial tcp 116.62.81.173:80: i/o timeout. Sleeping 1s
    2022/05/31 07:42:40 Problem with request: Get "http://elasticsearch:9200": proxyconnect tcp: dial tcp 116.62.81.173:80: i/o timeout. Sleeping 1s
  • s

    stocky-midnight-78204

    05/31/2022, 7:54 AM
    I don't know why it goes to 116.62.81.173:80
  • s

    stocky-midnight-78204

    05/31/2022, 7:54 AM
    Neo4j is up
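The `proxyconnect` lines in the log above suggest an HTTP proxy is configured in the container's environment (`http_proxy`/`https_proxy`), so requests to internal hostnames like `elasticsearch` and `neo4j` get routed to the proxy at 116.62.81.173:80 instead of the services themselves. A hedged sketch of a docker-compose override that exempts the internal hostnames (the service name here is an assumption):

```yaml
# Add the internal service names to the no-proxy list of whichever
# container is making the failing requests:
services:
  datahub-gms:
    environment:
      - NO_PROXY=localhost,elasticsearch,neo4j,broker,mysql
      - no_proxy=localhost,elasticsearch,neo4j,broker,mysql
```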
  • w

    worried-painting-70907

    06/01/2022, 7:54 PM
    image.png
  • b

    better-orange-49102

    06/02/2022, 1:07 AM
    bump^
  • c

    cool-actor-73767

    06/02/2022, 7:31 PM
    Hi team! I'm trying to use UI ingestion from Glue, but it doesn't work. When I execute it I don't receive any message. Attached is an image of the recipe config. Is it wrong? I'd be glad if someone could help me.
  • m

    many-keyboard-47985

    06/03/2022, 4:19 AM
    Hello. I'm trying to use DataHub as a discovery platform. Could you please help me answer some questions? I want to integrate Kafka Streams into DataHub, but I can't find a Kafka Streams metadata integration. So I created a custom data platform 'kafka-streams' (following this guide - https://datahubproject.io/docs/how/add-custom-data-platform). How can I add a custom dataset using the REST API or DataHub CLI recipes? I can't find a reference. Could you help with this?
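One hedged way to register such a dataset is to POST a metadata change proposal to GMS. The dataset name below is hypothetical, and the payload shape mirrors the `/aspects?action=ingestProposal` endpoint; verify both against your DataHub version:

```python
import json

# Dataset urn on the custom 'kafka-streams' platform from the question above;
# the topology name is a made-up example.
urn = "urn:li:dataset:(urn:li:dataPlatform:kafka-streams,orders-topology,PROD)"

proposal = {
    "proposal": {
        "entityType": "dataset",
        "entityUrn": urn,
        "changeType": "UPSERT",
        "aspectName": "datasetProperties",
        "aspect": {
            "contentType": "application/json",
            # The aspect value is itself JSON, serialized as a string.
            "value": json.dumps(
                {"name": "orders-topology",
                 "description": "Hypothetical Kafka Streams topology"}
            ),
        },
    }
}

body = json.dumps(proposal)
print(body[:72])
# POST this body to <gms-host>/aspects?action=ingestProposal
# with Content-Type: application/json.
```

The `acryl-datahub` Python package also ships a REST emitter that builds this payload for you; the raw shape is shown here only to make the wire format visible.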
  • s

    sparse-raincoat-42898

    06/03/2022, 11:12 AM
    image.png,image.png
  • n

    nutritious-bird-77396

    06/03/2022, 3:49 PM
    @big-carpet-38439
    KILL
    signals are being generated for the
    dataHubExecutionRequestSignal
    aspect, but it looks like the process doesn't respond. The alternative I found was to delete the whole ingestion job and create it again; let me know if there is a better approach?
  • a

    adorable-guitar-54244

    06/10/2022, 3:22 PM
    IMG_20220610_205112.jpg
  • s

    steep-midnight-37232

    06/16/2022, 3:54 PM
    image.png
  • l

    lemon-zoo-63387

    06/20/2022, 10:48 AM
    hello everyone, a magical thing happened. The following command was executed successfully:
    python3 -m datahub ingest -c oracle_recipe.yaml
    But ingestion from the UI always fails, even though the Oracle client is also installed:
    python3 -m pip install cx_Oracle --upgrade
    python3 -m pip install cx_Oracle --upgrade --user
    Thanks in advance for your help!
  • d

    dry-doctor-17275

    06/22/2022, 6:51 AM
    Hello, we deployed our DataHub project via the quickstart script, and our manager asked us whether the MySQL instance can be replaced with our company's MSSQL or Oracle (the red line). Thanks for the help!
  • b

    blue-beach-27940

    06/26/2022, 9:46 AM
    image.png
  • b

    blue-beach-27940

    06/26/2022, 9:47 AM
    This is my code:
  • c

    clean-piano-28976

    06/27/2022, 11:20 AM
    @mammoth-bear-12532 is there anything else you can suggest I check on my side?
  • a

    ambitious-pharmacist-14608

    06/28/2022, 12:27 AM
    Bump
  • b

    blue-beach-27940

    06/28/2022, 6:02 AM
    ohoh, my datahub-gms is down
  • r

    rich-policeman-92383

    06/29/2022, 11:57 AM
    Hello. While profiling a Hive table, the DataHub job fails with an out-of-memory error. What could be the cause, and what can be done to reduce the memory footprint?
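A few profiling options commonly reduce memory pressure during ingestion; the option names below follow DataHub's profiling config and should be checked against your version:

```yaml
# Hedged sketch: trim what the profiler computes rather than profiling
# every column of every row.
source:
  type: hive
  config:
    profiling:
      enabled: true
      profile_table_level_only: true           # row/column counts only, no per-column stats
      turn_off_expensive_profiling_metrics: true
      limit: 100000                            # cap rows considered per table
    profile_pattern:
      allow:
        - "db_name\\.important_table"          # profile only what you need
```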
  • g

    green-lion-58215

    06/29/2022, 3:31 PM
    Any help with this?