# getting-started
  • h

    high-hospital-85984

    02/03/2021, 7:06 PM
    Hi guys! I saw that we have “Data privacy management for datasets” on the roadmap. Do we have any concrete plans for this yet? We would be interested in helping with the implementation
  • n

    nutritious-bird-77396

    02/03/2021, 9:43 PM
    After the merge of https://github.com/linkedin/datahub/pull/2076 into master, when hitting the GraphQL endpoint at http://localhost:8091/graphql with:
    {
      dataset(urn: "urn:li:dataset:(urn:li:dataPlatform:kafka,bar,PROD)") {
        urn
        platform {
          urn
        }
      }
    }
    I get the error
    "message": "The field at path '/dataset/platform' was declared as a non null type, but the code involved in retrieving data has wrongly returned a null value.  The graphql specification requires that the parent field be set to null, or if that is non nullable that it bubble up null to its parent and so on. The non-nullable type is 'DataPlatform' within parent type 'Dataset'",
    even though `kafka` is available in the dataPlatforms:
    curl 'http://localhost:8080/dataPlatforms/kafka?aspects=List(com.linkedin.dataplatform.DataPlatformInfo)' -H 'X-RestLi-Protocol-Version:2.0.0' -s | jq
    {
      "name": "kafka",
      "dataPlatformInfo": {
        "name": "kafka",
        "type": "MESSAGE_BROKER",
        "datasetNameDelimiter": "."
      }
    }
    Am I missing something?
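    As an illustration (not part of the thread), a minimal Python sketch that posts the same query and prints any GraphQL errors; the endpoint URL and the absence of authentication are assumptions.
    # Sketch: reproduce the null-platform response against the local GraphQL endpoint.
    import json
    import requests

    QUERY = """
    {
      dataset(urn: "urn:li:dataset:(urn:li:dataPlatform:kafka,bar,PROD)") {
        urn
        platform {
          urn
        }
      }
    }
    """

    resp = requests.post("http://localhost:8091/graphql", json={"query": QUERY})
    resp.raise_for_status()
    body = resp.json()
    # The non-null 'platform' violation surfaces under "errors"; any partial data under "data".
    print(json.dumps(body.get("errors"), indent=2))
    print(json.dumps(body.get("data"), indent=2))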
  • m

    mammoth-bear-12532

    02/09/2021, 9:52 PM
    Congrats @some-crayon-90964 on your first contribution to the React app 🥳 here's to many more!
    🙌 4
  • b

    bland-garden-89536

    02/10/2021, 12:02 PM
    Hi folks, I work at Delivery Hero, Berlin. I am in the process of evaluating DataHub as our data catalog solution 🤟. Does DataHub support table previews or other queries (like counts) to get an idea about a table in its UI?
  • b

    bland-garden-89536

    02/10/2021, 2:49 PM
    One more question: can we change metadata in the UI and have it somehow propagate back to the backend database, e.g. BQ?
  • n

    nutritious-bird-77396

    02/11/2021, 1:31 AM
    Hi Team, I am trying to retrieve a new field from the GMS Client API. From what I understand from the Rest.li user guide, I have added the new field in the PDL in gms/api (https://github.com/linkedin/datahub/blob/master/gms/api/src/main/pegasus/com/linkedin/dataset/Dataset.pdl#L97). Are there any other changes needed for this field to be retrieved from the client (https://github.com/linkedin/datahub/blob/master/gms/client/src/main/java/com/linkedin/dataset/client/Datasets.java)? With only these changes I see the RequestBuilder classes contain the new field, but the client doesn't return it, so I suspect I am missing some more changes… Appreciate your help on this.
  • u

    user

    02/11/2021, 3:08 AM
    Which version of Elasticsearch are you running in production?
  • i

    incalculable-ocean-74010

    02/11/2021, 4:21 PM
    Hello, when onboarding new entities, do I have to manually write the restspec.json file for the new entity in the gms-api module?
  • b

    big-carpet-38439

    02/12/2021, 3:32 PM
    welcome @worried-nail-88622 🙂
  • a

    acceptable-architect-70237

    02/12/2021, 5:07 PM
    I understand (or at least I think I understand) that the version is populated by `gms` automatically, and I noticed that if I do add a version in the Kafka message, it won't be respected in `mce` because of `renamed/avro/com/linkedin/mxe/MetadataChangeEvent.avsc`. Here are two questions: 1. Is my understanding correct? 2. Can I add a version? We are trying to backfill an aspect for datasets and we need to handle each version. What's the best way to do so?
  • i

    incalculable-ocean-74010

    02/15/2021, 10:06 AM
    Hello, is step 4.6 of the entity onboarding process still applicable today or is it legacy?

    https://raw.githubusercontent.com/linkedin/datahub/master/docs/imgs/onboard-a-new-entity.png

    (modify `beans.xml` from the GMS-war module) I don't see anything specific to pre-existing entities.
  • p

    powerful-egg-69769

    02/15/2021, 3:44 PM
    hello, what authentication solutions are in use/planned for datahub?
  • i

    incalculable-ocean-74010

    02/15/2021, 4:18 PM
    How can I check how many instances of each snapshot a given datahub deployment has directly from the database?
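    As a sketch only (not from the thread): one way to aggregate rows per aspect directly in the backing MySQL store, as a proxy for counting snapshot instances; the metadata_aspect table/column names and the credentials are assumptions based on a default quickstart-style setup.
    # Sketch: count distinct entities and rows per aspect in the metadata store.
    import pymysql

    conn = pymysql.connect(host="localhost", user="datahub", password="datahub", database="datahub")
    try:
        with conn.cursor() as cur:
            # Assumed table: metadata_aspect with columns urn, aspect, version.
            cur.execute(
                """
                SELECT aspect, COUNT(DISTINCT urn) AS entities, COUNT(*) AS total_rows
                FROM metadata_aspect
                GROUP BY aspect
                ORDER BY entities DESC
                """
            )
            for aspect, entities, total_rows in cur.fetchall():
                print(f"{aspect}: {entities} entities, {total_rows} rows (all versions)")
    finally:
        conn.close()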
  • b

    big-carpet-38439

    02/15/2021, 11:06 PM
    Btw... Does anyone know what `needs.setup.outputs.tag` means in the GitHub Actions files?
  • c

    calm-motorcycle-16283

    02/16/2021, 4:56 AM
    Hi, actually I don't know anything about datahub yet. What documents should I start with? I ran Quickstart and added sample data.
  • l

    loud-island-88694

    02/16/2021, 8:53 PM
    Welcome @gorgeous-engine-526 ! What are the main use cases you are targeting to solve with DataHub?
  • b

    big-carpet-38439

    02/17/2021, 5:52 PM
    what's the easiest way to run datahub-gms locally in debug mode?
  • n

    narrow-painting-12219

    02/18/2021, 7:08 PM
    Hello, I just ran DataHub with docker-compose and opened the frontend. How do I learn how to "load data" into DataHub? It's still kind of foggy to me.
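    A hypothetical sketch of the recipe-plus-CLI path used elsewhere in this channel; the source/sink config keys and the GMS address are assumptions to adapt to your own setup.
    # Sketch: write a minimal ingestion recipe and run `datahub ingest -c <recipe>`.
    import pathlib
    import subprocess

    RECIPE = """\
    source:
      type: postgres
      config:
        host_port: localhost:5432
        database: mydb
        username: user
        password: pass
    sink:
      type: datahub-rest
      config:
        server: http://localhost:8080
    """

    recipe_path = pathlib.Path("my_recipe.yml")
    recipe_path.write_text(RECIPE)

    # Same as running: datahub ingest -c my_recipe.yml
    subprocess.run(["datahub", "ingest", "-c", str(recipe_path)], check=True)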
  • n

    nutritious-bird-77396

    02/18/2021, 11:53 PM
    I am trying to do a search on a specific field in the GraphQL API for Datasets… When I try to search on fields like `name`, `origin`, or `platform` it works, but for other fields it throws an exception… For example, the query below works, but not with fields like `description` or `nativeDataType`:
    {
      search(input: {type: DATASET, query: "*", filters: {field: "origin", value: "TEST" }}) {
        start
        count
        total
        entities {
          ... on Dataset {
            urn
            name
            description
            ownership {
              owners {
                owner {
                  urn
                }
              }
            }
          }
        }
        facets {
          field
          aggregations {
            value
            count
          }
        }
      }
    }
    What is needed for this to work?
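    A small illustrative sketch (not an answer from the thread) for comparing filter fields against the GraphQL endpoint mentioned earlier in the channel; the endpoint URL, the lack of auth, and the guess that only indexed facet fields are accepted as filters are assumptions.
    # Sketch: run the same search with different filter fields to compare behaviour.
    import requests

    SEARCH = """
    query compareFilters($field: String!, $value: String!) {
      search(input: {type: DATASET, query: "*", filters: {field: $field, value: $value}}) {
        total
        entities {
          ... on Dataset {
            urn
            name
          }
        }
      }
    }
    """

    def run_search(field, value):
        resp = requests.post(
            "http://localhost:8091/graphql",
            json={"query": SEARCH, "variables": {"field": field, "value": value}},
        )
        return resp.json()

    print(run_search("origin", "TEST"))       # reported to work above
    print(run_search("description", "test"))  # reported to throw an exception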
  • c

    calm-sunset-28996

    02/19/2021, 10:56 AM
    https://github.com/linkedin/datahub/blob/master/datahub-web/documentation/MAIN.md This part seems pretty stale; any chance I can find the working links?
  • n

    narrow-painting-12219

    02/19/2021, 12:18 PM
    Hello, I had a problem ingesting a pg DB with a geometry-type column: https://gist.github.com/carrbrpoa/b4c62917921d1e320aeb72f396ffe14a
  • h

    high-hospital-85984

    02/19/2021, 2:12 PM
    Btw, who owns the roadmap? It feels a bit outdated 😅
    👍 2
  • g

    gorgeous-journalist-2467

    02/19/2021, 2:19 PM
    Hello everyone. Thanks for your work supporting DataHub 🙂 I have a question regarding the cloud, more precisely Azure. Are there plans to make DataHub deployment a Managed Service?
  • a

    able-keyboard-68141

    02/19/2021, 7:49 PM
    Hey everyone - happy to join the community! Fantastic townhall demos this morning. Has the recording been posted anywhere yet?
  • b

    bulky-lunch-72217

    02/21/2021, 3:41 AM
    Hello, I get an error when executing /opt/datahub/datahub-master/metadata-ingestion/sql-etl/mysql_etl.py. The error is avro.io.AvroTypeException: The datum is not an example of the schema
  • c

    curved-magazine-23582

    02/23/2021, 4:34 PM
    For dashboards/charts, do they support upstream lineage back to datasets, etc.?
  • n

    narrow-painting-12219

    02/23/2021, 8:36 PM
    Another issue while trying to ingest in another Linux env.:
    (venv) carr@carr-VirtualBox:~/Projetos/datahub/metadata-ingestion$ datahub ingest -c ./examples/recipes/example_to_datahub_rest.yml
    Traceback (most recent call last):
      File "/home/carr/Projetos/datahub/venv/bin/datahub", line 11, in <module>
        load_entry_point('datahub', 'console_scripts', 'datahub')()
      File "/home/carr/Projetos/datahub/venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 480, in load_entry_point
        return get_distribution(dist).load_entry_point(group, name)
      File "/home/carr/Projetos/datahub/venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2693, in load_entry_point
        return ep.load()
      File "/home/carr/Projetos/datahub/venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2324, in load
        return self.resolve()
      File "/home/carr/Projetos/datahub/venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2330, in resolve
        module = __import__(self.module_name, fromlist=['__name__'], level=0)
      File "/home/carr/Projetos/datahub/metadata-ingestion/src/datahub/entrypoints.py", line 13, in <module>
        from datahub.ingestion.run.pipeline import Pipeline
      File "/home/carr/Projetos/datahub/metadata-ingestion/src/datahub/ingestion/run/pipeline.py", line 16, in <module>
        from datahub.ingestion.source import source_class_mapping
      File "/home/carr/Projetos/datahub/metadata-ingestion/src/datahub/ingestion/source/__init__.py", line 8, in <module>
        from .ldap import LDAPSource
      File "/home/carr/Projetos/datahub/metadata-ingestion/src/datahub/ingestion/source/ldap.py", line 4, in <module>
        import ldap
    ModuleNotFoundError: No module named 'ldap'
  • n

    narrow-painting-12219

    02/24/2021, 2:50 AM
    Is it possible to ingest a specific pg schema?
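    Purely illustrative (not from the thread): the SQL-based sources in metadata-ingestion expose allow/deny patterns that could restrict ingestion to one schema; in this sketch the schema_pattern key and regex syntax are assumptions to verify against the source's config docs.
    # Sketch: source config fragment limiting a Postgres ingestion to a single schema.
    SOURCE_CONFIG = {
        "type": "postgres",
        "config": {
            "host_port": "localhost:5432",
            "database": "mydb",
            "username": "user",
            "password": "pass",
            # Assumption: allow/deny regex lists select which schemas get ingested.
            "schema_pattern": {"allow": ["^my_schema$"]},
        },
    }
    # Drop this under `source:` in a recipe like the one sketched earlier and run `datahub ingest -c <recipe>`.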
  • n

    narrow-painting-12219

    02/24/2021, 3:39 AM
    Are there any docs or tips about DataHub and its handling in a production environment? I mean setting it up, filling it with data via ingestion, setting up Airflow, etc.
  • h

    high-hospital-85984

    02/24/2021, 1:16 PM
    It seems like the frontend `OwnerViewDao` only supports CorpUsers as owners, and not groups, even though the `Owner` model states that owners can be groups. Is this a bug or by design?