# advice-metadata-modeling
g
Hi there! I hope I am asking this in the right place. How could we model services in DataHub? Say, for example, we have a certain dataset in our lake, and there is a service that takes data from that dataset and turns it into other data, or sometimes just consumes it and does not produce any data. We would like to model which services are consuming data in our lake. Does anyone have any idea how to model that? Thank you 🙂
r
Hi David! Unfortunately DataHub lacks a first-class entity model that directly models the concept of a service, as up to this point the main focus has been on modeling the primitive "asset" types of a data ecosystem - tables, dashboards, data pipelines, etc. That being said, if you really want to model this, I'd suggest ingesting these services as "DataJobs" and "DataFlows", which are semantically the closest thing to what you're looking for. These entity types are supposed to model the processes that read, transform, and produce derived data. If you emit aspects for each of these entity types, such as the dataJobInputOutput aspect, you'll begin to see them appear in the lineage view relating tables to one another.
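To make the suggestion above a bit more concrete, here is a rough, stdlib-only sketch of what a service modeled as a DataJob with a dataJobInputOutput aspect might look like. It only builds the URNs and a proposal-shaped dict; it does not use the DataHub SDK or send anything to a server. The helper names, the "services" orchestrator label, and the example service and dataset names are all hypothetical, and the dict layout is an approximation of a metadata change proposal rather than the exact wire format.

```python
import json


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a dataset URN string (hypothetical helper, not the SDK's)."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def data_flow_urn(orchestrator: str, flow_id: str, cluster: str = "PROD") -> str:
    """Build a DataFlow URN; the flow groups the jobs of one service."""
    return f"urn:li:dataFlow:({orchestrator},{flow_id},{cluster})"


def data_job_urn(flow_urn: str, job_id: str) -> str:
    """Build a DataJob URN nested under its parent DataFlow URN."""
    return f"urn:li:dataJob:({flow_urn},{job_id})"


def service_as_data_job(service: str, inputs, outputs=()) -> dict:
    """Describe a service as a DataJob with its input and output datasets.

    "services" is an assumed orchestrator label chosen for illustration.
    A service that only consumes data simply has no output datasets.
    """
    flow = data_flow_urn("services", service)
    return {
        "entityType": "dataJob",
        "entityUrn": data_job_urn(flow, service),
        "changeType": "UPSERT",
        "aspectName": "dataJobInputOutput",
        "aspect": {
            "inputDatasets": list(inputs),
            "outputDatasets": list(outputs),
        },
    }


if __name__ == "__main__":
    orders = dataset_urn("s3", "lake/orders")
    # A consume-only service: reads from the lake, produces nothing.
    print(json.dumps(service_as_data_job("billing-service", [orders]), indent=2))
```

For a service that also produces data, pass the produced dataset URNs as `outputs`; with both sides filled in, the lineage view should be able to connect the input tables to the derived ones through the job.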
👍 1
🤘 1
g
Thanks a lot!
This was already quite helpful
How hard would it be for me to create a service entity that would be pretty much the same as DataJobs, but with a different name?
r
David - It would require a few steps. Not impossible by any means.
1. Modeling the new entity in entity-registry.yml and adding the aspect PDL models under metadata-models
2. Writing a GraphQL Type class and a Query for fetching by urn inside of datahub-graphql-core
3. Adding a new Entity Type in the React frontend code to display the profile and search views
ty 1