Hello I've got a question about the RBAC feature on the road DataHub #ingestion

wonderful-quill-11255

08/18/2021, 2:16 PM

Hello. I've got a question about the RBAC feature on the roadmap. Will this include a solution for preventing different teams from overwriting each other's metadata, which might otherwise happen in a well-federated metadata production landscape?

big-carpet-38439

08/18/2021, 2:45 PM

Can you elaborate with an example case? RBAC will help determine who can write what metadata through the UI and API at a granular level (asset level).

wonderful-quill-11255

08/18/2021, 4:43 PM

In my head I was thinking of some kind of namespace concept where permissions to create metadata would be granted on a namespace level.
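A minimal sketch of the namespace idea described above, assuming a namespace is simply a dot-separated prefix of an entity name. The `NamespacePolicy` class, the team names, and the entity names are all hypothetical illustrations and not part of DataHub:

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class NamespacePolicy:
    """Hypothetical policy mapping each team to the namespaces it may write to."""

    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, team: str, namespace: str) -> None:
        # Record that `team` may create or update metadata under `namespace`.
        self.grants.setdefault(team, set()).add(namespace)

    def can_write(self, team: str, entity_name: str) -> bool:
        # Assumption: an entity belongs to a namespace when its name
        # starts with "<namespace>.".
        return any(
            entity_name.startswith(namespace + ".")
            for namespace in self.grants.get(team, set())
        )


policy = NamespacePolicy()
policy.grant("team-a", "sales")
policy.grant("team-b", "marketing")

print(policy.can_write("team-a", "sales.orders"))  # True
print(policy.can_write("team-b", "sales.orders"))  # False
```

With a check like this in front of the write path, an accidental write by team-b to an entity under `sales` would be rejected instead of silently overwriting team-a's metadata.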

big-carpet-38439

08/18/2021, 4:52 PM

That's something we hope to achieve using "domains"

big-carpet-38439

08/18/2021, 4:53 PM

But it will likely be the case that conflicting writes cannot be prevented at ingest time via the "ingest" APIs, but only via the UI and GraphQL API

wonderful-quill-11255

08/18/2021, 4:54 PM

Hmm. Ok. Preventing it at ingest time would have been optimal, of course.

big-carpet-38439

08/18/2021, 5:01 PM

The default disposition is that batch ingest clients should be trusted to do the right thing. Is that not the case for you folks?

big-carpet-38439

08/18/2021, 5:01 PM

We are actively working to implement RBAC so this feedback is quite useful

wonderful-quill-11255

08/18/2021, 5:04 PM

I don't think people would actively misbehave, just that accidents might happen.

wonderful-quill-11255

08/18/2021, 5:07 PM

I don't think we can assume batch ingests. By batch do you mean actual batches or just the Kafka route?

big-carpet-38439

08/18/2021, 5:16 PM

Basically anything being reported from the metadata ingestion framework, which hits a dedicated set of "ingest" endpoints

big-carpet-38439

08/18/2021, 5:17 PM

In the future, we will likely consider these endpoints framework-internal, and our public API will be GraphQL

wonderful-quill-11255

08/18/2021, 5:26 PM

Ok. Thanks! Do you know if the GraphQL API includes/will include async endpoints?

big-carpet-38439

08/18/2021, 5:28 PM

By async you mean "fire and forget"?

wonderful-quill-11255

08/18/2021, 5:37 PM

yup

wonderful-quill-11255

09/13/2021, 12:17 PM

@big-carpet-38439 Regarding

The default disposition is that batch ingest clients should be trusted to do the right thing

I'd like to check again if there are any plans to require authentication and perhaps a permission model on top of the GMS REST API?

big-carpet-38439

09/13/2021, 3:30 PM

Authentication - yes. Permissions model, likely not

big-carpet-38439

09/13/2021, 3:31 PM

Unless there are strong use cases. What do you have in mind?

wonderful-quill-11255

09/13/2021, 6:02 PM

Some kind of namespacing of entities as a means of avoiding different teams accidentally overwriting each other's entities. But getting authentication would be a great feature on its own.

big-carpet-38439

09/13/2021, 6:14 PM

Hmm, got it... Where does the risk of conflict come from? Is it multiple teams ingesting metadata from the same SQL db?

big-carpet-38439

09/14/2021, 12:41 AM

We really want to understand if this will be common or not

big-carpet-38439

09/14/2021, 1:23 AM

The thing about the Rest.li ingestion APIs: they are fairly low-level and closely coupled to how we actually store the data. Going forward we are trying to reduce direct dependencies on storage so we have an easier time with public API model changes

square-activity-64562

09/14/2021, 5:00 AM

@big-carpet-38439 e.g. we have 2 country teams: one using MySQL with our company name as the database name, another using MariaDB with our company name as the database name. As I was ingesting it, I held on to ingesting tables and changed browse paths so there was no confusion. But if the country teams themselves were doing the ingestion and there were common table names, it would have caused problems
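A small sketch of why identically named databases can collide, assuming the dataset URN layout `urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)`. The `dataset_urn` helper and the `acme.customers` name are hypothetical and only illustrate the point:

```python
def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    # Hypothetical helper that builds a dataset URN from its three parts.
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


# Two teams ingest from physically different servers that happen to share
# a platform, a database name, and a table name.
team_a = dataset_urn("mysql", "acme.customers")
team_b = dataset_urn("mysql", "acme.customers")
print(team_a == team_b)  # True: both writes target the same entity

# A different platform name yields a different URN, so no collision.
team_c = dataset_urn("mariadb", "acme.customers")
print(team_a == team_c)  # False
```

Because nothing in the identifier distinguishes one physical server from another, the later write would overwrite the earlier one unless the names are disambiguated before ingestion.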

big-carpet-38439

09/14/2021, 5:02 AM

This to me is more a problem with how we identify individual data assets, which is going to improve soon enough

square-activity-64562

09/14/2021, 5:14 AM

yes data instances would be great to have

➕ 1

wonderful-quill-11255

09/14/2021, 6:28 AM

Is it multiple teams ingesting metadata from the same SQL db

Well, probably not from the same physical db, no. But potentially two different domains in an organisation could have named both their dbs the same thing without knowing. Maybe I'm grasping at straws here. Let's leave that part for now and I'll try to formulate more realistic examples if I can come up with them. Thanks for your patience. Regarding the auth, do you think that is something coming in Q3, Q4 or later?
