swift-account-97627
09/07/2020, 12:40 PM
SchemaMetadata.SchemaFields and something like DataProfile.FieldProfiles).
If this is the correct model, what would be a good way to associate each particular FieldProfile with a particular SchemaField? Or is there a different model that would be better?
More generally, it seems like there's a tension between two models for field-level aspects:
1. Dataset has many Aspects, some of which have metadata for many Fields
2. Dataset has many Fields, some of which have multiple Aspects
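A rough sketch of the two models above, for illustration only. All class, attribute, and function names here (`field_path`, `null_fraction`, `DatasetAspects`, `FieldEntity`, `to_field_entities`) are hypothetical and not taken from the actual schema; the sketch assumes each FieldProfile can be matched to a SchemaField through a shared field path.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SchemaField:
    field_path: str   # assumed join key, hypothetical name
    native_type: str


@dataclass
class FieldProfile:
    field_path: str   # same assumed key, pointing back at a SchemaField
    null_fraction: float


@dataclass
class DatasetAspects:
    """Model 1: the dataset owns aspects, and each aspect lists per-field metadata."""
    schema_fields: List[SchemaField] = field(default_factory=list)
    field_profiles: List[FieldProfile] = field(default_factory=list)

    def profile_for(self, field_path: str) -> Optional[FieldProfile]:
        # Associate a profile with a schema field by matching the shared key.
        return next((p for p in self.field_profiles if p.field_path == field_path), None)


@dataclass
class FieldEntity:
    """Model 2: the dataset owns fields, and each field carries its own aspects."""
    field_path: str
    aspects: Dict[str, Any] = field(default_factory=dict)


def to_field_entities(dataset: DatasetAspects) -> List[FieldEntity]:
    # Pivot model 1 into model 2 by grouping every aspect entry under its field.
    entities = {
        f.field_path: FieldEntity(f.field_path, {"schema": f})
        for f in dataset.schema_fields
    }
    for profile in dataset.field_profiles:
        entity = entities.get(profile.field_path)
        if entity is not None:
            entity.aspects["profile"] = profile
    return list(entities.values())
```

Under this assumption the two models hold the same information, and the difference is mainly which side owns the key: model (1) repeats the field path inside every aspect, while model (2) stores it once per field.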
I don't have an opinion on which of these models is more "correct", but the current implementation only seems to really support one aspect per field, and pushes any extensions to favour model (1) above. Is this a conscious design decision, or has this question just not come up yet?
bumpy-keyboard-50565
09/07/2020, 2:30 PM
DatasetFieldUrn that can be used as a pointer without fully materialized field entities. Fields will be created as nodes in the graph to answer questions like "Give me all the PII-containing fields of this dataset based on annotation supplied to its upstream datasets (aka fine-grained metadata propagation)"
swift-account-97627
09/07/2020, 3:45 PM
swift-account-97627
09/07/2020, 3:45 PM
swift-account-97627
09/07/2020, 3:45 PM
bumpy-keyboard-50565
09/07/2020, 3:48 PM
swift-account-97627
09/07/2020, 3:49 PM
swift-account-97627
09/07/2020, 3:49 PM
> to answer questions like "Give me all the PII-containing fields of this dataset based on annotation supplied to its upstream datasets (aka fine-grained metadata propagation)"
Incidentally, is there already a model for the single-dataset part of this (i.e. ignoring lineage, just directly annotating individual fields within a dataset as containing PII)? It looks like the PII flag is also dataset-level only at the moment.
bumpy-keyboard-50565
09/07/2020, 3:51 PM
swift-account-97627
09/07/2020, 8:30 PM