# getting-started
t
hi everyone! can anyone share an example yml file for loading a glossary into DataHub via the CLI?
b
hey there! these docs will help you out for this: https://datahubproject.io/docs/generated/ingestion/sources/business-glossary/#module-datahub-business-glossary and then an example yaml file would look like this:
version: 1
source: DataHub
owners:
  users:
    - datahub
url: "https://github.com/datahub-project/datahub/"
nodes:
  - name: Classification
    description: A set of terms related to Data Classification
    terms:
      - name: Sensitive
        description: Sensitive Data
        custom_properties:
          is_confidential: false
      - name: Confidential
        description: Confidential Data
        custom_properties:
          is_confidential: true
      - name: HighlyConfidential
        description: Highly Confidential Data
        custom_properties:
          is_confidential: true
  - name: PersonalInformation
    description: All terms related to personal information
    owners:
      users:
        - datahub
    terms:
      - name: Email
        description: An individual's email address
        inherits:
          - Classification.Confidential
        owners:
          groups:
            - Trust and Safety
      - name: Address
        description: A physical address
      - name: Gender
        description: The gender identity of the individual
        inherits:
          - Classification.Sensitive
t
thanks a lot! while reading the docs page I clicked a link and got a 404 error.
@bulky-soccer-26729 could you please confirm whether I understood this example correctly: two nodes are defined, Classification and PersonalInformation, and each node contains several terms with some properties. Where can I find nodes in the DataHub GUI? Is a node the same as a domain?
b
oh goodness that must be a bad link in the docs - taking a note of that so we can fix it! thanks for calling that out.
also you have that correct! so "nodes" in this context are going to be "Term Groups" in the UI. Nodes are basically just folders containing Terms and other Nodes (Term Groups) in your business glossary
t
got it! thanks!
@bulky-soccer-26729 one more question: can I put a node inside a node? I mean, what if I need several levels of nesting (e.g. level 1: Customer identification, level 2: Customer identification documents, with lots of terms at that second level)? Can I create such a structure, or is there a limit on how deeply nodes can be nested?
b
yup you can certainly do that! you can have as many levels of nodes as you desire
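for illustration, here is a minimal sketch of that two-level structure. it assumes a node accepts its own nodes list alongside terms; the node and term names are made up from your example (Passport is a hypothetical term), so adjust them to your glossary:

```yaml
version: 1
source: DataHub
owners:
  users:
    - datahub
nodes:
  # Level 1 term group
  - name: CustomerIdentification
    description: Terms related to identifying a customer
    nodes:
      # Level 2 term group, nested inside the level 1 node (assumed syntax)
      - name: CustomerIdentificationDocuments
        description: Documents used to identify a customer
        terms:
          # Hypothetical term, shown only to illustrate where terms go
          - name: Passport
            description: A passport used as proof of identity
```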
t
great!
@bulky-soccer-26729, I tried to load the yml file you mentioned above and the following errors occurred:
message has been deleted
could you please explain what I did wrong?
b
hey! can you send me the file you call datahub ingest -c with? it should be slightly different from the one used for ingesting by file. it should have "type" set to "datahub-business-glossary" and "file" pointing to your glossary file, instead of "filename". so it should look like:
source:
  type: datahub-business-glossary
  config:
    file: "<path_to_yml_glossary_file>"
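a fuller recipe might look like the sketch below. the sink section is an assumption on my part: it uses the datahub-rest sink and assumes your DataHub instance is reachable at http://localhost:8080, so swap in your own server address. recipe.yml is just a hypothetical file name:

```yaml
# recipe.yml (hypothetical file name), passed to: datahub ingest -c recipe.yml
source:
  type: datahub-business-glossary
  config:
    # Path to the glossary yml file (the one with version/source/owners/nodes)
    file: "<path_to_yml_glossary_file>"
sink:
  type: datahub-rest
  config:
    # Assumed local endpoint; replace with your DataHub server address
    server: "http://localhost:8080"
```

the point is that the recipe and the glossary are two separate files: the recipe tells the CLI which source to use and where the glossary file lives, and the glossary file holds the nodes and terms themselves.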