  • little-megabyte-1074 (2 years ago)
    Hi folks! My team is going to be building out a POC for DataHub in our upcoming sprint and I’m curious if anyone has suggestions/thoughts about the best way to collect usage analytics? I want to be able to quantify how folks are using the tool, what features have the highest adoption, etc. Our company uses Segment, but I’m not sure we want to invest too much time integrating with Segment for a POC.
    5 replies
  • billowy-eye-48149 (2 years ago)
    Hello team, why is the mysql container setup missing in the new docker-compose file?
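    For context, the kind of service definition being asked about looks roughly like this in docker-compose syntax (a minimal sketch; the image tag, credentials, and volume name are placeholders, not DataHub's actual settings):
      mysql:
        image: mysql:5.7
        environment:
          - MYSQL_DATABASE=datahub        # database used by the metadata service
          - MYSQL_USER=datahub
          - MYSQL_PASSWORD=datahub
          - MYSQL_ROOT_PASSWORD=datahub
        ports:
          - "3306:3306"
        volumes:
          - mysqldata:/var/lib/mysql      # assumes a top-level "mysqldata" volume is declared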
    12 replies
  • calm-minister-22324 (2 years ago)
    Hi everyone o/ I've been reading the documentation and want to make sure DataHub supports my use case and that I've understood it correctly.
    Use case: expose an API that returns the date of the latest available data in external tables on Redshift.
    Proposed solution: create a SQL ingestion that pushes the latest samples from those tables as datasets, then query those datasets to extract the latest available date.
    Would this be correct?
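    As an illustration of the proposed model (a hypothetical shape only, not an existing DataHub schema), each external table would be pushed as a dataset whose properties carry the latest available date for the API to read back:
      dataset:
        urn: "urn:li:dataset:(urn:li:dataPlatform:redshift,spectrum.events,PROD)"   # hypothetical external table
        properties:
          latestAvailableDate: "2021-03-04"   # the value the API would expose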
    6 replies
  • fast-exabyte-18411 (2 years ago)
    Hi all, working on a POC of DataHub. Running into some issues connecting to our external Kafka provider, as it uses SSL, and Basic Auth for the schema registry. I was hoping there would be a way to configure the consumers to use SSL via environment variables, but it looks like the Spring Kafka library doesn't support these specific configs. There's some Stack Overflow discussion implying it at least isn't supported for SSL: https://stackoverflow.com/questions/51316017/spring-boot-spring-kafka-ssl-configuration-by-environment-variables-impossible So it seems this configuration has to be done here: https://github.com/linkedin/datahub/blob/master/metadata-jobs/mae-consumer-job/src/main/resources/application.properties Has anybody else run into this? I'm thinking of testing out a fix locally and opening a PR; would love to know if I'm missing something.
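    For reference, the kind of client configuration involved looks roughly like this when expressed as Spring Boot pass-through properties (shown here in application.yml form as a sketch; the paths and credentials are placeholders, and the consumer job's actual application.properties may expose different keys):
      spring:
        kafka:
          properties:
            security.protocol: SSL                                   # SSL to the Kafka brokers
            ssl.truststore.location: /mnt/secrets/kafka.truststore.jks
            ssl.truststore.password: changeit
            basic.auth.credentials.source: USER_INFO                 # schema registry basic auth
            basic.auth.user.info: "registry-user:registry-password"
            schema.registry.url: https://schema-registry.internal:8081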
    42 replies
  • silly-apple-97303 (2 years ago)
    I'm trying to configure access to schema registry with basic auth enabled for the GMS. I was able to configure schema registry access for the MAE/MCE services with the following env variables:
    - name: SPRING_KAFKA_PROPERTIES_BASIC_AUTH_CREDENTIALS_SOURCE
      value: USER_INFO
    - name: SPRING_KAFKA_PROPERTIES_BASIC_AUTH_USER_INFO
      valueFrom:
        secretKeyRef:
          name: "kafka-schema-registry-credentials"
          key: "user-info"
    And the logs from both MAE/MCE look like:
    16:21:26.721 [main] INFO  i.c.k.s.KafkaAvroDeserializerConfig - KafkaAvroDeserializerConfig values: 
    	schema.registry.url = [redacted]
    	basic.auth.user.info = [hidden]
    	auto.register.schemas = true
    	max.schemas.per.subject = 1000
    	basic.auth.credentials.source = USER_INFO
    	schema.registry.basic.auth.user.info = [hidden]
    	specific.avro.reader = false
    	value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
    	key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
    
    16:21:26.857 [main] INFO  o.a.kafka.common.utils.AppInfoParser - Kafka version: 2.2.1-cp1
    However, doing the same for the GMS is not working. Specifically I get these warn log messages on startup and the configs are not attached to the serializer:
    16:20:23.481 [main] INFO  i.c.k.s.KafkaAvroSerializerConfig - KafkaAvroSerializerConfig values: 
    	schema.registry.url = [redacted]
    	max.schemas.per.subject = 1000
    
    16:20:24.213 [main] WARN  o.a.k.c.producer.ProducerConfig - The configuration 'basic.auth.user.info' was supplied but isn't a known config.
    16:20:24.215 [main] WARN  o.a.k.c.producer.ProducerConfig - The configuration 'basic.auth.credentials.source' was supplied but isn't a known config.
    
    16:20:24.217 [main] INFO  o.a.kafka.common.utils.AppInfoParser - Kafka version: 2.3.0
    When digging into this I noticed the MAE/MCE are using Kafka 2.2.1-cp1 (the Confluent Platform build) while the GMS is using 2.3.0 (the non-Confluent build). I'm thinking regular non-Confluent clients might not support the same set of schema registry configurations.
    13 replies
  • swift-account-97627 (2 years ago)
    I've noticed various references to a "top consumers" feature in the front-end code, but nowhere else. I'm guessing this is an internal feature? I don't see it in the roadmap. Is it anticipated to be open sourced at some point or is it internal forever?
    2 replies
  • swift-account-97627 (2 years ago)
    I have a number of related use cases for what I'd describe as "Data Profile" and/or "Data Quality" attributes, most of which are per-field. For example: completeness (~non-null percentage), distinct values, or histograms of distinct values in enumeration-like fields. I've been doing some quick-and-dirty prototyping by adding these attributes to the SchemaField model, but that feels wrong. It seems like "Data Profile" is really a separate aspect to "Schema", but they both contain per-field information, so I'm not sure how best to model this. I could add "DataProfile" to the set of dataset aspects, but then I'd have two aspects containing field-level information (SchemaMetadata.SchemaFields and something like DataProfile.FieldProfiles). If this is the correct model, what would be a good way to associate each particular FieldProfile with a particular SchemaField? Or is there a different model that would be better? More generally, it seems like there's a tension between two models for field-level aspects:
    1. Dataset has many Aspects, some of which have metadata for many Fields
    2. Dataset has many Fields, some of which have multiple Aspects
    I don't have an opinion on which of these models is more "correct", but the current implementation only seems to really support one aspect per field, and pushes any extensions to favour model (1) above. Is this a conscious design decision, or has this question just not come up yet?
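    One way to picture the association (purely illustrative; DataProfile is not an existing DataHub aspect): each FieldProfile repeats the fieldPath that SchemaMetadata uses for its SchemaFields, and that shared key is the join between the two aspects:
      dataProfile:
        fieldProfiles:
          - fieldPath: "user.email"          # same fieldPath as the corresponding SchemaField
            nonNullProportion: 0.97
            distinctValueCount: 10432
          - fieldPath: "user.country"
            nonNullProportion: 1.0
            valueFrequencies:                # histogram for an enumeration-like field
              US: 5100
              DE: 1200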
    9 replies
  • high-hospital-85984 (2 years ago)
    Hi all! Complete newbie question. After running quickstart.sh and ingestion/ingestion.sh I see items in the Upstream/Downstream tables in the "Relationship" tab for all dummy datasets. However, the lineage data is empty, as is the graph visualisation. Am I missing something?
    6 replies
  • able-garden-99963 (2 years ago)
    Hi DataHub team! A couple of quick questions:
    1. Is it possible to add constraints to entity fields (say, my CorpUser entity has an "email" field and I want it to be unique)?
    2. Is it possible to add custom queries based on entity fields (say, my CorpUser entity has an "email" field and I want to query CorpUsers by email)?
    Thanks!
    4 replies
  • high-hospital-85984 (2 years ago)
    Another question, from a newb trying to understand how DataHub works. In datahub/metadata-models/src/main/pegasus/com/linkedin/metadata/snapshot/ we define e.g. ChartSnapshot and MLModelSnapshot. However, in Snapshot.pdl we only list MLModelSnapshot, and not ChartSnapshot, in the union. Why is that? Similarly, in datahub/metadata-models/src/main/pegasus/com/linkedin/metadata/entity/ we define a ChartEntity but not an MLModelEntity, and the ChartEntity is not listed in the union in Entity.pdl. Why is that?
    12 replies