# getting-started
  • future-table-91845 (05/11/2023, 5:42 PM)
    I am trying DataHub for the first time and I do not see Azure Storage (Blob or ADLS Gen 2) as a source.
  • future-table-91845 (05/11/2023, 5:42 PM)
    Will that be available in the future?
  • big-postman-38407 (05/12/2023, 7:36 AM)
    Hello everyone! Quick question: is it possible to use only the business logic and build my own UX/UI for the tool?
  • broad-ghost-1006 (05/12/2023, 8:53 AM)
    Hi guys, I am new to DataHub. I have already tried the quickstart guide, but I need a customized way to deploy. Is there any tutorial on how to deploy DataHub to an on-prem machine using Docker Compose, NOT the quickstart method?
  • billions-baker-82097 (05/12/2023, 10:43 AM)
    Hi team, I want to create my own custom metadata model... is there any easy way to do so?
  • bland-orange-13353 (05/12/2023, 1:16 PM)
    This message was deleted.
  • salmon-exabyte-77928 (05/12/2023, 3:22 PM)
    Hi everyone! I'm trying to deploy DataHub using the helm chart, specifying a postgresql database in values.yaml:

        host: "deps-postgresql.deps.svc.cluster.local:5432"
        hostForpostgresqlClient: "deps-postgresql.deps.svc.cluster.local"
        port: "5432"
        url: "jdbc:postgresql://deps-postgresql.deps.svc.cluster.local:5432/dh02?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8&enabledTLSProtocols=TLSv1.2"
        driver: "org.postgresql.Driver"
        username: "postgres"
        password:
          secretRef: postgresql-secrets
          secretKey: postgres-password
        # --------------OR----------------
        #   value: password

    But the postgresql setup job creates the database `datahub` and ignores the database `dh02` specified in the connection string. Any ideas?

        2023/05/12 15:11:11 Waiting for: tcp://deps-postgresql.deps.svc.cluster.local:5432
        2023/05/12 15:11:11 Connected to tcp://deps-postgresql.deps.svc.cluster.local:5432
        CREATE DATABASE
        -- create metadata aspect table

    After the job finished, the database `datahub` appears on the postgresql instance with its tables present, but the db `dh02` does not.
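Editor's note: the connection string above really does name `dh02`, so the mismatch comes from the setup job's own configuration rather than from the URL. A minimal standalone sketch (not DataHub code) for pulling the database name out of a JDBC PostgreSQL URL, to sanity-check which database a given `url` value targets:

```python
from urllib.parse import urlparse

def jdbc_database(jdbc_url: str) -> str:
    """Extract the database name from a JDBC PostgreSQL URL."""
    # Strip the "jdbc:" prefix so urlparse sees an ordinary URL.
    parsed = urlparse(jdbc_url.removeprefix("jdbc:"))
    # The path component is "/<database>"; drop the leading slash.
    return parsed.path.lstrip("/")

url = ("jdbc:postgresql://deps-postgresql.deps.svc.cluster.local:5432/dh02"
       "?verifyServerCertificate=false&useSSL=true")
print(jdbc_database(url))  # -> dh02
```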
  • ancient-kitchen-28586 (05/14/2023, 8:24 AM)
    Hi guys,
  • ancient-kitchen-28586 (05/14/2023, 8:28 AM)
    I'm trying to run the quickstart using `datahub docker quickstart` from Windows. It starts fine, but then gets stuck repeating this message:

        [+] Running 0/0
         - Container mysql  Creating                                             0.0s
        Error response from daemon: invalid volume specification: 'C:\Users\janko\.datahub\mysql\init.sql:/docker-entrypoint-initdb.d/init.sql:rw': invalid mount config for type "bind": bind source path does not exist: c:\users\janko\.datahub\mysql\init.sql

    I don't have a mysql folder in .datahub. Any ideas why this could be?
  • hallowed-lock-74921 (05/14/2023, 4:23 PM)
    Hi guys, I am getting the below error while executing `./gradlew quickstart`:
  • hallowed-lock-74921 (05/14/2023, 4:23 PM)

        SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
        SLF4J: Defaulting to no-operation (NOP) logger implementation
        SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
  • hallowed-lock-74921 (05/14/2023, 4:28 PM)

        > Configure project :docker:mysql-setup
        fullVersion=v0.10.2-147-g0fa983a.dirty cliMajorVersion=0.10.2 version=0.10.3-SNAPSHOT
        SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
        SLF4J: Defaulting to no-operation (NOP) logger implementation
        SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

        > Task :docker:mysql-setup:docker FAILED
        unknown flag: --load
        See 'docker --help'.
  • hallowed-lock-74921 (05/14/2023, 5:25 PM)

        FAILURE: Build completed with 3 failures.

        1: Task failed with an exception.
        * What went wrong:
        Execution failed for task ':docker:kafka-setup:docker'.
        > Process 'command 'docker'' finished with non-zero exit value 125
        * Try:
        Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

        2: Task failed with an exception.
        * What went wrong:
        Execution failed for task ':docker:elasticsearch-setup:docker'.
        > Process 'command 'docker'' finished with non-zero exit value 125
        * Try:
        Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

        3: Task failed with an exception.
        * What went wrong:
        Execution failed for task ':docker:mysql-setup:docker'.
        > Process 'command 'docker'' finished with non-zero exit value 125
        * Try:
        Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

        * Get more help at https://help.gradle.org
        Deprecated Gradle features were used in this build, making it incompatible with Gradle 7.0.
        Use '--warning-mode all' to show the individual deprecation warnings.
        See https://docs.gradle.org/6.9.2/userguide/command_line_interface.html#sec:command_line_warnings

        BUILD FAILED in 1m 10s
        88 actionable tasks: 52 executed, 36 up-to-date
  • proud-dusk-671 (05/15/2023, 3:04 PM)
    Hi team, I'm looking to understand the "type of owners" concept in DataHub. I see four different kinds of owners that datasets can have (Technical Owner, Business Owner, Data Steward, and None). Can you tell me what impact adding each of these owner types has on the underlying asset? Will a user (with the Reader role) be able to edit datasets by virtue of being an owner of any of these datasets?
  • prehistoric-greece-5672 (05/15/2023, 3:21 PM)
    Hi there. What resources do you recommend for brand-new DataHub users (like me!) who want to learn how to use DataHub to find stuff? Everything I've found at datahubproject.io, the DataHub blog on medium.com, and the DataHubProject channel on YouTube is for people who already know how to use DataHub or for people who are setting up DataHub for others to use. It's discouraging.
  • billions-baker-82097 (05/15/2023, 4:46 PM)
    Hi, I am trying to build a custom metadata model as per the documentation, but I was confused: what should I specify in the extraVolumeMounts of the datahub-gms container?
  • breezy-balloon-32520 (05/15/2023, 5:04 PM)
    Hi there, I have upgraded to the latest version of DataHub (0.10.2.3), but I can no longer get the broker service to start. I get: ERROR Fatal error during KafkaServer startup... kafka.common.InconsistentClusterIdException: The Cluster ID ... doesn't match stored clusterId Some(...) in meta.properties. The broker is trying to join the wrong cluster. Configured zookeeper.connect may be wrong.
  • witty-butcher-82399 (05/16/2023, 3:45 AM)
    Hi! Having a look at DataHubSystemAuthenticator, the doc mentions that:
    "This authenticator also looks for a "delegated actor urn" which can be provided by system callers using the 'X-DataHub-Actor' header."
    However, the current logic does not match that. Why is that? https://github.com/datahub-project/datahub/blob/master/metadata-service/auth-impl/[…]ub/authentication/authenticator/DataHubSystemAuthenticator.java
  • proud-dusk-671 (05/16/2023, 6:06 AM)
    Hi team, a very naive question, but how can everything done via the DataHub UI be persisted in Git? Is it possible to back up everything to Git via YAML?
  • breezy-leather-30929 (05/16/2023, 6:45 AM)
    Hi all! Can I track how many users do search > click result > click link to dashboard? Can I track how many users actually use DataHub? It's like tracking the funnel of DataHub and monitoring whether it is really used, and whether it is really helpful.
  • powerful-finland-16210 (05/16/2023, 6:54 AM)
    Hi, is it possible to run a full production DataHub instance in Minikube, or do you advise against running production in Minikube? Best regards, Chris
  • rough-summer-14442 (05/16/2023, 6:55 AM)
    Hi all, I want to know how I can customize the logos and some parts of the React app. I have deployed DataHub via Docker and don't see the datahub-web-react component. Any advice on this is appreciated.
  • billions-baker-82097 (05/16/2023, 11:03 AM)
    Hi, can we add a new custom aspect to an entity through the YAML configuration file?
  • numerous-refrigerator-15664 (05/16/2023, 11:16 AM)
    Hi team, I have successfully ingested hive datasets from a hive metastore in mysql using the presto-on-hive recipe. I was able to ingest the few datasets that I wanted for testing using `database_pattern` and `table_pattern`. Additionally, I'm trying to do the following things, and I need some advice. Any help would be appreciated.
    1. Now I'm trying to ingest most of the datasets from the hive metastore, and I was wondering if there's a way to do pattern filtering for other items in the hive metastore as well:
    • DBS.DB_LOCATION_URI (e.g. allow only the pattern "hdfs://cluster1/dsc/.*")
    • DBS.OWNER_NAME (e.g. deny those with accounts ".*test")
    • If it is impossible via recipes, would there be any other possible ways?
    2. Our organization manages additional metadata for hive's DBs, tables, and columns in another mysql DB (say `our_meta`) which is separate from the hive metastore. For example, for a table named `customer.cust_mst` on hive, the table exists in the hive metastore, and the separate mysql DB also manages information about it. Given the situation, I'd like to ingest the metadata of `our_meta` into datahub. What would be the best way to do it?
    • Some of the managed items in `our_meta` (mainly technical meta) seem best managed as custom properties, and these should be synced in batch mode.
    • Some of the managed items in `our_meta` (mainly business meta) can be managed as business glossary terms or tags, and both batch sync and API calls should be possible.
    • I am looking into a custom ingestion source and a metadata ingestion transformer, but I am not sure how to approach this. I hope I can handle these without forking the source if possible. Thanks in advance.
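Editor's note: the `database_pattern` / `table_pattern` filtering mentioned above is plain allow/deny regex matching, so the same idea for `DBS.DB_LOCATION_URI` and `DBS.OWNER_NAME` can be prototyped outside the recipe. A hypothetical sketch (this mimics the recipe semantics; it is not the actual presto-on-hive source):

```python
import re
from dataclasses import dataclass, field

@dataclass
class AllowDenyPattern:
    # Deny rules win over allow rules; by default everything is allowed.
    allow: list = field(default_factory=lambda: [".*"])
    deny: list = field(default_factory=list)

    def allowed(self, value: str) -> bool:
        if any(re.match(p, value) for p in self.deny):
            return False
        return any(re.match(p, value) for p in self.allow)

# The two filters from the message above:
location = AllowDenyPattern(allow=[r"hdfs://cluster1/dsc/.*"])
owner = AllowDenyPattern(deny=[r".*test"])

print(location.allowed("hdfs://cluster1/dsc/sales"))  # True
print(owner.allowed("svc_test"))                      # False
```

A custom ingestion source could apply such filters to rows read from DBS before emitting datasets.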
  • eager-belgium-3850 (05/16/2023, 1:47 PM)
    Dear community, I am trying to start up DataHub through the terminal using the command "datahub docker quickstart", but the process continues to run endlessly with the following message being displayed (attached in the picture). What exactly causes this error message? Any help is greatly appreciated. Thanks.
  • handsome-flower-80167 (05/16/2023, 7:14 PM)
    Hi all, I am new to DataHub and trying to work on a use case to test Spark lineage with various data sources. I created a simple example that reads two files from HDFS, joins them, and saves the result back to HDFS. This works on my local setup of DataHub; it shows Spark lineage as shown in the image below. When I point this at the DataHub server deployed in the cloud, it doesn't show any upstream or downstream dataset lineage, as shown in the first image. Any idea where I am going wrong?
  • most-engine-69486 (05/16/2023, 7:14 PM)
    Hi all, could you please recommend a way to limit access for a user or a group to platform resources? For example, I have these resources: BigQuery, Metabase, SQL Server, Kafka.
    • User A must have access to BigQuery and Metabase
    • User B must have access to Metabase, SQL Server, and Kafka
  • dry-spring-48163 (05/17/2023, 5:10 AM)
    Hello, I'm wondering if the open-source version of DataHub supports any kind of single sign-on? I saw somewhere in the docs that only the cloud version supported OIDC, but other docs didn't mention this, so I wasn't sure.
  • handsome-cat-78137 (05/17/2023, 9:48 AM)
    Hi all! I am ingesting data from S3 using the datahub pipeline (`datahub.ingestion.run.pipeline`). Then, to update properties, tags, owners, etc.:
    1. I search for the uploaded dataset URN using the Rest.li API with the `/entities` endpoint, because I do not know the URN beforehand.
    2. I update the URN.
    The problem I face with this approach is that URN creation takes some time depending on the dataset, and it is usually not ready when I try searching for it. Is there some way to know when the URN is created after the pipeline is run? Or is there a better way to do this?
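Editor's note: one common workaround for the indexing delay described above is to poll the search endpoint with a bounded backoff until the URN appears. A generic, self-contained sketch (the `find_urn` helper in the usage comment is hypothetical and would wrap the actual `/entities` search call):

```python
import time

def wait_for(predicate, timeout=60.0, interval=1.0, backoff=2.0, max_interval=10.0):
    """Poll `predicate` until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = predicate()
        if result:
            return result
        time.sleep(interval)
        # Exponential backoff, capped so we keep checking reasonably often.
        interval = min(interval * backoff, max_interval)
    raise TimeoutError("condition not met within timeout")

# Hypothetical usage: find_urn would query the search endpoint and return
# the URN once the dataset is indexed, or None while it is not.
# urn = wait_for(lambda: find_urn("my_s3_dataset"), timeout=120)
```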
  • polite-cat-69516 (05/18/2023, 6:11 AM)
    Dear community, I can see lineage at the table level but not at the field level. Do I need additional configuration?