# ingestion
l
Hello, everyone. I used a JSON file to import metadata. The first time it succeeded, and then I deleted the data. When I ran the ingestion again, the client reported success, but there was no data in the UI. I only modified the source. Why is that?
python3 -m datahub delete --env QA --entity_type dataset --platform hive
c
Your JSON had the env as PROD, whereas your delete command has the env as QA. Was the env also modified when you changed the source?
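A quick way to check this kind of mismatch is to pull the platform and env out of the dataset URN and compare them with the values passed to the delete command. This is only a rough sketch: `parse_dataset_urn` and `matches_delete_filter` are hypothetical helper names, not part of the datahub package, and it assumes the delete filter is a plain exact match on platform and env.

```python
import re

# Dataset URNs in this thread have the shape:
#   urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)
DATASET_URN = re.compile(
    r"^urn:li:dataset:\(urn:li:dataPlatform:(?P<platform>[^,]+),"
    r"(?P<name>.+),(?P<env>[^,)]+)\)$"
)


def parse_dataset_urn(urn: str) -> dict:
    """Split a dataset URN into platform, name, and env (hypothetical helper)."""
    match = DATASET_URN.match(urn)
    if match is None:
        raise ValueError(f"not a dataset URN: {urn!r}")
    return match.groupdict()


def matches_delete_filter(urn: str, platform: str, env: str) -> bool:
    """Assumption: the delete filter is an exact match on platform and env."""
    parts = parse_dataset_urn(urn)
    return parts["platform"] == platform and parts["env"] == env


if __name__ == "__main__":
    urn = "urn:li:dataset:(urn:li:dataPlatform:hive,schema-data-test,QA)"
    print(parse_dataset_urn(urn))
    print(matches_delete_filter(urn, platform="hive", env="QA"))
```

If the second call prints False for a URN from your file, the delete command and the ingested data are pointing at different environments.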
l
My screenshot shows the demo data from Git. I only changed the source and schema.
c
Can you share your recipe?
l
[
    {
        "auditHeader": null,
        "proposedSnapshot": {
            "com.linkedin.pegasus2avro.metadata.snapshot.DatasetSnapshot": {
                "urn": "urn:li:dataset:(urn:li:dataPlatform:hive,schema-data-nan,QA)",
                "aspects": [
                    {
                        "com.linkedin.pegasus2avro.schema.SchemaMetadata": {
                            "schemaName": "schema-data-nan",
                            "platform": "urn:li:dataPlatform:hive",
                            "version": 0,
                            "created": {
                                "time": 1621882982738,
                                "actor": "urn:li:corpuser:etl",
                                "impersonator": null
                            },
                            "lastModified": {
                                "time": 1621882982738,
                                "actor": "urn:li:corpuser:etl",
                                "impersonator": null
                            },
                            "deleted": null,
                            "dataset": null,
                            "cluster": null,
                            "hash": "",
                            "platformSchema": {
                                "com.linkedin.pegasus2avro.schema.MySqlDDL": {
                                    "tableSchema": ""
                                }
                            },
                            "fields": [
                                {
                                    "fieldPath": "county_id",
                                    "jsonPath": null,
                                    "nullable": true,
                                    "description": null,
                                    "type": {
                                        "type": {
                                            "com.linkedin.pegasus2avro.schema.NumberType": {}
                                        }
                                    },
                                    "nativeDataType": "Integer()",
                                    "recursive": false,
                                    "globalTags": null,
                                    "glossaryTerms": null
                                },
                                {
                                    "fieldPath": "county_name",
                                    "jsonPath": null,
                                    "nullable": true,
                                    "description": null,
                                    "type": {
                                        "type": {
                                            "com.linkedin.pegasus2avro.schema.StringType": {}
                                        }
                                    },
                                    "nativeDataType": "String()",
                                    "recursive": false,
                                    "globalTags": null,
                                    "glossaryTerms": null
                                },
                                {
                                    "fieldPath": "county_address",
                                    "jsonPath": null,
                                    "nullable": true,
                                    "description": null,
                                    "type": {
                                        "type": {
                                            "com.linkedin.pegasus2avro.schema.StringType": {}
                                        }
                                    },
                                    "nativeDataType": "String()",
                                    "recursive": false,
                                    "globalTags": null,
                                    "glossaryTerms": null
                                },
                                {
                                    "fieldPath": "num_infection_isolation_rooms",
                                    "jsonPath": null,
                                    "nullable": true,
                                    "description": null,
                                    "type": {
                                        "type": {
                                            "com.linkedin.pegasus2avro.schema.NumberType": {}
                                        }
                                    },
                                    "nativeDataType": "Integer()",
                                    "recursive": false,
                                    "globalTags": null,
                                    "glossaryTerms": null
                                }
                            ],
                            "primaryKeys": null,
                            "foreignKeysSpecs": null
                        }
                    }
                ]
            }
        },
        "proposedDelta": null
    },
    {
        "auditHeader": null,
        "proposedSnapshot": {
            "com.linkedin.pegasus2avro.metadata.snapshot.DatasetSnapshot": {
                "urn": "urn:li:dataset:(urn:li:dataPlatform:hive,schema-data-test,QA)",
                "aspects": [
                    {
                        "com.linkedin.pegasus2avro.schema.SchemaMetadata": {
                            "schemaName": "schema-data-test",
                            "platform": "urn:li:dataPlatform:hive",
                            "version": 0,
                            "created": {
                                "time": 1621882983026,
                                "actor": "urn:li:corpuser:etl",
                                "impersonator": null
                            },
                            "lastModified": {
                                "time": 1621882983026,
                                "actor": "urn:li:corpuser:etl",
                                "impersonator": null
                            },
                            "deleted": null,
                            "dataset": null,
                            "cluster": null,
                            "hash": "",
                            "platformSchema": {
                                "com.linkedin.pegasus2avro.schema.MySqlDDL": {
                                    "tableSchema": ""
                                }
                            },
                            "fields": [
                                {
                                    "fieldPath": "county_type",
                                    "jsonPath": null,
                                    "nullable": true,
                                    "description": null,
                                    "type": {
                                        "type": {
                                            "com.linkedin.pegasus2avro.schema.StringType": {}
                                        }
                                    },
                                    "nativeDataType": "String()",
                                    "recursive": false,
                                    "globalTags": null,
                                    "glossaryTerms": null
                                },
                                {
                                    "fieldPath": "county_name",
                                    "jsonPath": null,
                                    "nullable": true,
                                    "description": null,
                                    "type": {
                                        "type": {
                                            "com.linkedin.pegasus2avro.schema.StringType": {}
                                        }
                                    },
                                    "nativeDataType": "String()",
                                    "recursive": false,
                                    "globalTags": null,
                                    "glossaryTerms": null
                                },
                                {
                                    "fieldPath": "state_name",
                                    "jsonPath": null,
                                    "nullable": true,
                                    "description": null,
                                    "type": {
                                        "type": {
                                            "com.linkedin.pegasus2avro.schema.StringType": {}
                                        }
                                    },
                                    "nativeDataType": "String()",
                                    "recursive": false,
                                    "globalTags": null,
                                    "glossaryTerms": null
                                },
                                {
                                    "fieldPath": "personnel_code",
                                    "jsonPath": null,
                                    "nullable": true,
                                    "description": null,
                                    "type": {
                                        "type": {
                                            "com.linkedin.pegasus2avro.schema.NumberType": {}
                                        }
                                    },
                                    "nativeDataType": "Integer()",
                                    "recursive": false,
                                    "globalTags": null,
                                    "glossaryTerms": null
                                },
                                {
                                    "fieldPath": "total_personnel_count",
                                    "jsonPath": null,
                                    "nullable": true,
                                    "description": null,
                                    "type": {
                                        "type": {
                                            "com.linkedin.pegasus2avro.schema.NumberType": {}
                                        }
                                    },
                                    "nativeDataType": "Integer()",
                                    "recursive": false,
                                    "globalTags": null,
                                    "glossaryTerms": null
                                }
                            ],
                            "primaryKeys": null,
                            "foreignKeysSpecs": null
                        }
                    }
                ]
            }
        },
        "proposedDelta": null
    }
]
c
I just tried your recipe and JSON file, and everything works well for me. I can see the datasets in the UI as well as in the CLI. If I have understood correctly, you are not able to view them in the UI, right?
l
Yes. When I execute recipe.yaml after the delete command, the client reports that it finished successfully, but the UI shows no data.
There is no hive in the UI. It's so strange.
c
@lemon-zoo-63387 Can you try reading the entities by URN using the CLI? If you get something back, it will confirm that ingestion succeeded and that something on the UI side is broken, maybe Elasticsearch or something else. Check whether all containers are healthy.
Command to read entities:
datahub get --urn <urn>
b
status set as removed=True, maybe?
c
You can use --soft in the CLI.
l
Can you give me more details about the command? I'm a rookie😁😁
b
datahub get --urn="urn:li:dataset:(urn:li:dataPlatform:hive,schema-data-test,QA)"
c
It should be:
datahub get --urn "urn:li:dataset:(urn:li:dataPlatform:hive,schema-data-test,QA)"
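To avoid typing each URN by hand, something like the following could list every URN in the ingested JSON file and print a ready-to-paste `datahub get --urn` command for each. It is a sketch under assumptions: `urns_in_events` and `get_commands` are made-up helper names, and the file is assumed to have the same layout as the JSON posted earlier in this thread (a list of events, each with a `proposedSnapshot` holding a `urn`).

```python
import json
import shlex


def urns_in_events(events: list) -> list:
    """Collect the urn of every proposedSnapshot in a list of events."""
    urns = []
    for event in events:
        # proposedSnapshot maps a snapshot type name to the snapshot body.
        for snapshot in (event.get("proposedSnapshot") or {}).values():
            urns.append(snapshot["urn"])
    return urns


def get_commands(events: list) -> list:
    """Build one shell-quoted `datahub get --urn ...` command per URN."""
    return [f"datahub get --urn {shlex.quote(urn)}" for urn in urns_in_events(events)]


if __name__ == "__main__":
    import sys

    # Usage (script name is an example): python3 list_urns.py my_metadata.json
    with open(sys.argv[1]) as handle:
        for command in get_commands(json.load(handle)):
            print(command)
```

The quoting matters because the parentheses in a dataset URN are special characters in most shells.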
l
I ran it again. The status is:
"status":{
   "removed": true
}
b
you probably should do a hard delete instead of updating the status
datahub delete --env QA --entity_type dataset --platform hive --hard
then re-ingest
l
b
Oh, that doesn't work. Then the more tedious way is to append this to the aspects of each workunit in the JSON file:
{
    "com.linkedin.pegasus2avro.common.Status": {
        "removed": false
    }
},
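If the file has many workunits, the edit could be scripted instead of done by hand. The sketch below assumes the same file layout as the JSON posted earlier in this thread; `add_status_aspect` and `patch_file` are hypothetical names, and the output file name is just an example.

```python
import json

STATUS_KEY = "com.linkedin.pegasus2avro.common.Status"


def add_status_aspect(events: list) -> list:
    """Give every snapshot a Status aspect with removed=false (edits in place)."""
    for event in events:
        # proposedSnapshot maps a snapshot type name to the snapshot body.
        for snapshot in (event.get("proposedSnapshot") or {}).values():
            aspects = snapshot.setdefault("aspects", [])
            # Drop any Status aspect already present so reruns do not duplicate it.
            aspects[:] = [aspect for aspect in aspects if STATUS_KEY not in aspect]
            aspects.append({STATUS_KEY: {"removed": False}})
    return events


def patch_file(src_path: str, dst_path: str) -> None:
    """Read the events from src_path and write the patched copy to dst_path."""
    with open(src_path) as src:
        events = json.load(src)
    with open(dst_path, "w") as dst:
        json.dump(add_status_aspect(events), dst, indent=4)


if __name__ == "__main__":
    import sys

    # Usage (script name is an example): python3 add_status.py in.json out.json
    patch_file(sys.argv[1], sys.argv[2])
```

Point the recipe at the patched file and re-ingest; the original file is left untouched.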
c
@lemon-zoo-63387 Please correct me if I'm wrong, but you are not trying to delete the dataset; you are trying to ingest it.
l
Yes. If I don't delete it, the test succeeds, and when I change hive to MySQL, Postgres, and so on, everything is ingested successfully.