# troubleshoot
b
Hello, does the datahub repository provide some JSON validator? Let's say I have created some JSON files on my own and I'd like to upload them to DataHub in two different ways: using the curl command, and using a file as a source in a YAML recipe. I know at this point that the structures for the two differ and are not alike, even though they carry the same information to DataHub. I am asking this question because DataHub (which I set up locally on my server) starts to throw status 500 after some period of time, and sometimes there is an error visible in the logs stating that the JSON inside the MySQL database is erroneous, despite it having been ingested into DataHub beforehand.
i
Hello Pawel, how did you ingest the json files? Can you share the commands and recipe files?
b
I used this command: curl 'http://localhost:8080/entities?action=ingest' -X POST --data-raw "$(<testing_curl.json)" If you would like, I can simplify my JSON and share it too
i
If you would like I can simplify my json and share it too
Please do. DataHub does not support reading arbitrary custom JSON, so unless you are specifying metadata in DataHub's expected format, like: https://raw.githubusercontent.com/datahub-project/datahub/master/metadata-ingestion/examples/demo_data/demo_data.json then I don't expect the data can be ingested correctly.
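To make the expected shape concrete, here is a minimal sketch (in Python, for illustration) of one metadata change event in the style of the demo_data.json file linked above. The fully-qualified key names are taken from that example file; the dataset URN and description here are made-up placeholders, not values from this thread.

```python
import json

# One metadata change event, modeled on the demo_data.json example.
# The long com.linkedin.pegasus2avro.* keys come from that file;
# the urn and description are placeholder values.
mce = {
    "auditHeader": None,
    "proposedSnapshot": {
        "com.linkedin.pegasus2avro.metadata.snapshot.DatasetSnapshot": {
            "urn": "urn:li:dataset:(urn:li:dataPlatform:hive,my_table,PROD)",
            "aspects": [
                {
                    "com.linkedin.pegasus2avro.dataset.DatasetProperties": {
                        "description": "Example dataset",
                        "customProperties": {},
                    }
                }
            ],
        }
    },
}

# Serialize to the JSON payload a curl call or a file source would carry.
payload = json.dumps(mce, indent=2)
print(payload)
```

Anything that deviates from this envelope (extra top-level keys, a different snapshot wrapper) is likely what leads to the malformed-JSON errors in the database later on.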
b
Sure, here is an example
This can be uploaded via the command I shared with you previously:
curl 'http://localhost:8080/entities?action=ingest' -X POST --data-raw "$(<basic_example.json)"
DataHub will accept it and show it correctly. If this is not the way to do it, how can I then generate the ingestible JSON based on my CSV file?
i
how can I then generate the ingestible json based on my csv file?
Right now we don’t have a CSV source in our ingestion list; we would love a contribution though! Alternatively, you can take the information in your CSV and generate JSON files that can then be ingested via https://datahubproject.io/docs/metadata-ingestion/source_docs/file
b
Right now I have written a very simple CSV parser that generates the JSON I provided to you in this thread. So the question remains: if the JSON I provided to you is valid, can I use it in a curl command to send it to DataHub? Did I understand correctly that making my own JSON is okay as long as I use "File as a source" (https://datahubproject.io/docs/metadata-ingestion/source_docs/file/)?
i
if the json that I had provided to you is valid, can I use it in curl command to send it to datahub?
Yes, but file as source is preferred.
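For reference, a file-source recipe along the lines of the docs page linked earlier might look like the following sketch; the filename and server address are placeholders for a local setup, not values from this thread.

```yml
# Hedged sketch of a "file" source recipe for a local DataHub instance.
source:
  type: file
  config:
    filename: ./my_generated_mces.json
sink:
  type: datahub-rest
  config:
    server: http://localhost:8080
```

Such a recipe would typically be run with the DataHub CLI, e.g. `datahub ingest -c recipe.yml`.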
thank you