# ingestion
s
hello again, I am trying to ingest the glossary of terms through the API and it is giving me this error:

```
"message": "com.linkedin.metadata.entity.ValidationException: Failed to validate record with class com.linkedin.entity.Entity:
ERROR :: /value/com.linkedin.metadata.snapshot.GlossaryTermSnapshot/glossaryTermInfo :: unrecognized field found but not allowed
ERROR :: /value/com.linkedin.metadata.snapshot.GlossaryTermSnapshot/urn :: field is required but not found and has no default value
ERROR :: /value/com.linkedin.metadata.snapshot.GlossaryTermSnapshot/aspects :: field is required but not found and has no default value",
"status": 422
```
```json
{
   "entity": {
      "value": {
         "com.linkedin.metadata.snapshot.GlossaryTermSnapshot": {
            "urn": "urn:li:glossaryTerm:instruments.FinancialInstrument_v1",
            "ownership": {
               "owners": [
                  {
                     "owner": "urn:li:corpuser:datahub",
                     "type": "DATAOWNER"
                  }
               ],
               "lastModified": {
                  "actor": "urn:li:corpuser:datahub",
                  "time": 1581407189000
               }
            },
            "glossaryTermInfo": {
               "definition": "written contract that gives rise to both a financial asset of one entity and a financial liability of another entity",
               "customProperties": {
                  "FQDN": "full"
               },
               "sourceRef": "FIBO",
               "sourceUrl": "https://spec.edmcouncil.org/fibo/ontology/FBC/FinancialInstruments/FinancialInstruments/FinancialInstrument",
               "termSource": "EXTERNAL"
            }
         }
      }
   }
}
```
This is the body that I send
c
If you are using the ingestProposal API, then the above format is not correct. You can use the Python or Java client to create an instance of MetadataChangeProposalWrapper (MCPW) and emit it to DataHub without worrying about the format. Please refer to the docs below: https://datahubproject.io/docs/metadata-ingestion/as-a-library or https://datahubproject.io/docs/metadata-integration/java/as-a-library
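For reference, the validation errors above say that a GlossaryTermSnapshot only allows the fields `urn` and `aspects`, so `ownership` and `glossaryTermInfo` must be wrapped inside an `aspects` array rather than sit at the snapshot level. A rough sketch of how the body would need to look for the legacy snapshot-based ingest endpoint (the fully qualified aspect names are assumptions based on the DataHub snapshot schema; double-check them against your version):

```json
{
   "entity": {
      "value": {
         "com.linkedin.metadata.snapshot.GlossaryTermSnapshot": {
            "urn": "urn:li:glossaryTerm:instruments.FinancialInstrument_v1",
            "aspects": [
               {
                  "com.linkedin.glossary.GlossaryTermInfo": {
                     "definition": "written contract that gives rise to both a financial asset of one entity and a financial liability of another entity",
                     "customProperties": { "FQDN": "full" },
                     "sourceRef": "FIBO",
                     "sourceUrl": "https://spec.edmcouncil.org/fibo/ontology/FBC/FinancialInstruments/FinancialInstruments/FinancialInstrument",
                     "termSource": "EXTERNAL"
                  }
               },
               {
                  "com.linkedin.common.Ownership": {
                     "owners": [
                        { "owner": "urn:li:corpuser:datahub", "type": "DATAOWNER" }
                     ],
                     "lastModified": {
                        "actor": "urn:li:corpuser:datahub",
                        "time": 1581407189000
                     }
                  }
               }
            ]
         }
      }
   }
}
```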
s
Do you have an example of how to ingest glossary terms?
c
Do you need an example in Java or Python?
s
Python, please 🙏🙏
c
Please refer to this:
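(The example file shared here was not preserved in the thread. Below is a minimal sketch of what such an example looks like with the DataHub Python emitter, reusing the term from the JSON above; the GMS address is an assumption, adjust it for your deployment.)

```python
# Minimal sketch: emit a glossary term with the DataHub Python SDK.
from datahub.emitter.mce_builder import make_term_urn
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import ChangeTypeClass, GlossaryTermInfoClass

# Point the emitter at your GMS endpoint (assumed here to be localhost:8080).
emitter = DatahubRestEmitter(gms_server="http://localhost:8080")

term_urn = make_term_urn("instruments.FinancialInstrument_v1")

term_info = GlossaryTermInfoClass(
    definition=(
        "written contract that gives rise to both a financial asset of one "
        "entity and a financial liability of another entity"
    ),
    termSource="EXTERNAL",
    sourceRef="FIBO",
    sourceUrl="https://spec.edmcouncil.org/fibo/ontology/FBC/FinancialInstruments/FinancialInstruments/FinancialInstrument",
    customProperties={"FQDN": "full"},
)

# Each aspect is emitted as its own MetadataChangeProposal.
emitter.emit_mcp(
    MetadataChangeProposalWrapper(
        entityType="glossaryTerm",
        changeType=ChangeTypeClass.UPSERT,
        entityUrn=term_urn,
        aspectName="glossaryTermInfo",
        aspect=term_info,
    )
)
```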
s
Sorry, I know it's late for you. Is there any documentation about the emitter? Which entities can I change, and which classes can I use for the aspects?
c
DataHub entities and their aspects are listed here: https://demo.datahubproject.io/browse/dataset/prod/datahub/entities
s
Hi @careful-pilot-86309, how can I ingest only the glossary terms using the emitter? For context: I have deployed DataHub in an EKS cluster, and doing this from the interface means placing the file on the path where my frontend lives. I want to be able to run that ingestion without having to go into the container and drop the file there.
With yesterday's tests I was able to add terms, but they are not indexed correctly, nor do they appear in the glossary terms tab.
c
Can you share your code?
Meanwhile, can you try the example below?
s
@careful-pilot-86309 Thank you very much, that file helped me a lot; I was able to ingest my glossary terms through the emitter. One more question: does this also allow you to delete?
c
@silly-beach-19296 you can use the below to soft-delete (the `systemMetadata` param is not mandatory):

```python
emitter.emit_mcp(
    MetadataChangeProposalWrapper(
        entityType=entity_type,
        changeType=ChangeTypeClass.UPSERT,
        entityUrn=urn,
        aspectName="status",
        aspect=StatusClass(removed=True),
        systemMetadata=SystemMetadataClass(
            runId=run_id,
            lastObserved=deletion_timestamp,
        ),
    )
)
```
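A note for anyone copying the snippet above: it assumes `emitter`, `entity_type`, `urn`, `run_id`, and `deletion_timestamp` are already defined, and needs imports along these lines (class names as in the Python SDK's `schema_classes` module):

```python
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.metadata.schema_classes import (
    ChangeTypeClass,
    StatusClass,
    SystemMetadataClass,
)
```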
Hard delete is currently not supported by the emitter, but you can use the API below:
```bash
# Hard delete
curl -X POST "http://localhost:8080/entities?action=delete" \
  -H "Content-Type: application/json" \
  -d '{"urn":"urn:li:glossaryTerm:SavingAccount"}'
```
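For completeness, the same hard-delete call can be made from Python; a sketch, assuming the same GMS endpoint as the curl example:

```python
# Sketch: hard-delete an entity via the GMS REST endpoint
# (assumes GMS at http://localhost:8080, as in the curl example above).
import requests

resp = requests.post(
    "http://localhost:8080/entities?action=delete",
    headers={"Content-Type": "application/json"},
    json={"urn": "urn:li:glossaryTerm:SavingAccount"},
)
print(resp.status_code, resp.text)
```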