Thread
#getting-started

    high-hospital-85984

    1 year ago
    Hi guys! I saw that we have “Data privacy management for datasets” on the roadmap. Do we have any concrete plans for this yet? We would be interested in helping with the implementation
    And hopefully we could expand this to dataset fields

    green-football-43791

    1 year ago
    Out of curiosity, what would you be looking for specifically?

    mammoth-bear-12532

    1 year ago
    @high-hospital-85984: are you imagining data field level compliance tagging here?

    high-hospital-85984

    1 year ago
    To start with we want to add “sensitivity” tags to tables, more specifically to columns.

    mammoth-bear-12532

    1 year ago
    Yeah that is something we have built and use heavily at LinkedIn ..

    high-hospital-85984

    1 year ago
    Yep, super simple to start. Tags would preferably be quite customizable, as they might differ between companies (we use a colour-based scheme).

    mammoth-bear-12532

    1 year ago
    The way we did it was two-level ... first create a "taxonomy" of data types (e.g. email / phone-num / ... ) and then allow fields to be tagged with them ... and separately have a relationship from the data types to sensitivity (is this a sensitive type or not?)
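    A minimal sketch of the two-level idea described above, written in Python purely for illustration. Every name here (`DATA_TYPE_IS_SENSITIVE`, `FieldTags`, the `country` type) is hypothetical and not taken from LinkedIn's or the open source code; it only assumes that fields carry data-type tags and that sensitivity is looked up per data type rather than stored on the field.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Level 1: a taxonomy of data types, with a separate relationship
# data type -> "is this a sensitive type or not?".
# The entries are illustrative assumptions, not a real policy.
DATA_TYPE_IS_SENSITIVE: Dict[str, bool] = {
    "email": True,
    "phone-num": True,
    "country": False,
}


@dataclass
class FieldTags:
    """Level 2: a dataset field tagged with data types from the taxonomy."""

    dataset: str
    field_name: str
    data_types: Set[str] = field(default_factory=set)

    def tag(self, data_type: str) -> None:
        # Only allow tags that exist in the taxonomy.
        if data_type not in DATA_TYPE_IS_SENSITIVE:
            raise KeyError(f"unknown data type: {data_type}")
        self.data_types.add(data_type)

    def is_sensitive(self) -> bool:
        # Sensitivity is derived from the data types, never set on the field.
        return any(DATA_TYPE_IS_SENSITIVE[t] for t in self.data_types)


contact_email = FieldTags("users", "contact_email")
contact_email.tag("email")
home_country = FieldTags("users", "home_country")
home_country.tag("country")
```

    With this split, changing whether a data type counts as sensitive is a one-line change to the mapping and does not require re-tagging any field.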

    high-hospital-85984

    1 year ago
    Cool, any chance of getting that into the open source code? 😉

    mammoth-bear-12532

    1 year ago
    if you're offering help, I'll take you up on it 🙂

    high-hospital-85984

    1 year ago
    But we wouldn’t mind adding some functionality to tag dataset fields, if the community sees some value in it.
    If you can point us in the right direction, we can take it from there. BaseFieldMapping sort of looks promising, but seems to be tightly tied to transformations?

    green-football-43791

    1 year ago
    Out of curiosity, what set of sensitivity tags are you interested in? Would it be a binary of sensitive/non-sensitive or will you have a scale?
    I've been thinking about creating an alternative table view that would allow for different types of tags, this is a mock I'm working on:
    Does this presentation align w/ your vision?

    high-hospital-85984

    1 year ago
    Our scale (currently) is green, yellow, orange, and red.
    So what you’re showing seems quite aligned, yes. I’m assuming that the PII and Financial tags there are strings?
    I’m secretly hoping for a tag-propagation feature, like in Apache Atlas, so hopefully we can find a solution that takes us closer to that 🙂
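    A rough sketch of what tag propagation could look like with a colour scale like the one above. This is not how Apache Atlas implements it, and all names (`Level`, `propagate`, the column names) are hypothetical. It assumes the scale is ordered from green (least sensitive) to red (most sensitive) and that a downstream column inherits the highest level found among its upstream columns.

```python
from collections import deque
from enum import IntEnum
from typing import Dict, List


class Level(IntEnum):
    # Assumed ordering: green is least sensitive, red is most sensitive.
    GREEN = 0
    YELLOW = 1
    ORANGE = 2
    RED = 3


def propagate(
    lineage: Dict[str, List[str]], tags: Dict[str, Level]
) -> Dict[str, Level]:
    """Push each column's level to its downstream columns.

    lineage maps an upstream column to its downstream columns;
    tags holds the levels assigned directly by users.
    """
    effective = dict(tags)
    queue = deque(tags)
    while queue:
        column = queue.popleft()
        for child in lineage.get(column, []):
            # Untagged columns are treated as GREEN until something raises them.
            if effective.get(child, Level.GREEN) < effective[column]:
                effective[child] = effective[column]
                queue.append(child)
    return effective


lineage = {
    "raw.users.email": ["stg.users.email"],
    "stg.users.email": ["mart.contacts.email"],
}
tags = {
    "raw.users.email": Level.RED,
    "mart.contacts.email": Level.YELLOW,
}
effective = propagate(lineage, tags)
```

    In this example the RED tag on `raw.users.email` overrides the YELLOW tag on `mart.contacts.email`, because levels are only ever raised and never lowered. That also means the loop terminates even if the lineage contains cycles.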

    mammoth-bear-12532

    1 year ago
    Yeah tag-propagation would be the next step for sure. This is just making sure that the tags render well for human consumption.

    high-hospital-85984

    1 year ago
    But how do you suggest we get the ball rolling with this?

    mammoth-bear-12532

    1 year ago
    I would say, providing feedback on the metadata models ... what is working and what needs to change .. would be high leverage