# ingestion
f
Hi guys, I've manually ingested a chart and dashboard with the REST API, but fail to find them in the GUI. They do appear in searches and lineage. Should I do something else?
b
Do you mind going directly to this URL to see if their profiles appear? http://localhost:9002/chart/<YOUR-URL-ENCODED-CHART-URN>
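For reference, a minimal Python sketch of URL-encoding a URN for that profile link. It assumes the frontend accepts a fully percent-encoded URN; the helper name chart_profile_url and the sample URN are illustrative only.

```python
from urllib.parse import quote


def chart_profile_url(urn: str, base: str = "http://localhost:9002") -> str:
    """Build the chart profile URL with the URN percent-encoded."""
    # safe="" also escapes '/', which quote() leaves untouched by default;
    # ':', '(', ',' and ')' are escaped either way.
    return f"{base}/chart/{quote(urn, safe='')}"


print(chart_profile_url("urn:li:chart:(tableau,baz1)"))
```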
f
It does show up.
But not in charts.
b
It seems that for some reason this entity is not in the search index. How are you deploying this? Helm or docker compose?
f
docker compose
This is the curl command
curl 'http://localhost:8080/dashboards?action=ingest' -X POST -H 'X-RestLi-Protocol-Version:2.0.0' --data '{
    "snapshot": {
        "aspects": [{
            "com.linkedin.dashboard.DashboardInfo": {
                "title": "Daily Summary",
                "description": "The dashboard shall include an overview of the MUACs Daily Summary of Traffic count and Delay of previous day and same weekday of previous year.",
                "charts": [],
                "dashboardUrl": "https://mtabl001.muac.corp.eurocontrol.int/#/views/DailySummary/DailySummary?:iid=2",
                "lastModified": {
                    "created": {
                        "time": 0,
                        "actor": "urn:li:corpuser:datahub"
                    },
                    "lastModified": {
                        "time": 0,
                        "actor": "urn:li:corpuser:datahub"
                    }
                }
            }
        }],
        "urn": "urn:li:dashboard:(looker,baz)"
    }
}'
b
Hmm okay. Do you mind sending me the zipped debug logs from the gms container? They reside under the /tmp/datahub/gms/logs directory, and the file name will be formatted as <date>.debug.log
Here's how to get them:
1. Find the docker container id for the "gms" container:
docker ps -a
and look for the linkedin/gms container at the desired version (or head for latest)
2. Copy the container id shown
3. Find the name of the debug log file:
docker exec --privileged <container-id from step 2> ls /tmp/datahub/gms/logs
4. Dump the logs to a local file:
touch gms-debug.log
docker exec --privileged <container-id from step 2> cat /tmp/datahub/gms/logs/<debug-log-file-from-step-3> > gms-debug.log
f
Here you are, just a little remark that it was in /tmp/datahub/logs and not /tmp/datahub/gms/logs
b
Ah if it was at that location you are likely working with an old version of GMS
Is that your expectation?
It makes sense, because these are not debug logs; these are standard error / info logs
f
What should be the good GMS version? 🙂
Hold on, so I've updated to v0.8.6
1bca72ea9dbd   linkedin/datahub-gms:v0.8.6                   "/bin/sh -c /datahub…"   3 minutes ago   Up 3 minutes (healthy)     0.0.0.0:8080->8080/tcp, :::8080->8080/tcp
But I got the same error.
sudo docker exec --privileged 1bca72ea9dbd ls /tmp/datahub/gms/logs

ls: /tmp/datahub/gms/logs: No such file or directory
b
I think it should be logs/gms
I may have misquoted above^ do you mind trying to ls /tmp/datahub?
f
I could have been more flexible in finding it, of course 🙂 Here you are.
b
thank you!
With the newer version of DH are you still seeing the issue? It may be that you need to reindex 😕 https://datahubproject.io/docs/how/restore-indices/ this should hopefully resynchronize the indices
f
Yes, new version still has the issue.
Tried reindexing, still charts or dashboards do not show up in their GUI sections. They do show up when searching.
b
I see - where are you ingesting dashboards and charts? To confirm, you're saying that the relationships between charts and dashboards are not being reflected?
f
I am using a REST call to add charts or dashboards, the relation between them is good. The problem is they do not display on screen after they are ingested.
Look at my initial screens.
Hey, discovered something that might help with your investigations. I said I can add a chart but it does not show up in Charts. But if I search for it and manually add an owner (maybe it works with other things too, like a tag), it shows up.
Here is the curl command again:
curl --location --request POST 'http://mdhprot601:8080/charts?action=ingest' \
--header 'X-RestLi-Protocol-Version: 2.0.0' \
--header 'Content-Type: text/plain' \
--data-raw '{
    "snapshot": {
        "aspects": [
            {
                "com.linkedin.chart.ChartInfo": {
                    "chartUrl": "https://mtabl001.muac.corp.eurocontrol.int/#/views/DailySummary/DailySummary",
                    "title": "Daily Summary",
                    "description": "The dashboard shall include an overview of the MUACs Daily Summary of Traffic count and Delay of previous day and same weekday of previous year.",
                    "inputs": [
                        {"string": "urn:li:dataset:(urn:li:dataPlatform:oracle,edw.delay_per_regulation_bb,PROD)"},
                        {"string": "urn:li:dataset:(urn:li:dataPlatform:oracle,edw.date_r,PROD)"},
                        {"string": "urn:li:dataset:(urn:li:dataPlatform:oracle,inf.date_d,PROD)"},
                        {"string": "urn:li:dataset:(urn:li:dataPlatform:oracle,edw.operational_sector_d,PROD)"},
                        {"string": "urn:li:dataset:(urn:li:dataPlatform:oracle,edw.time_d,PROD)"}
                    ],
                    "lastModified": {
                        "created": {
                            "time": 0,
                            "actor": "urn:li:corpuser:datahub"
                        },
                        "lastModified": {
                            "time": 0,
                            "actor": "urn:li:corpuser:datahub"
                        }
                    }
                }
            }
        ],
        "urn": "urn:li:chart:(tableau,baz1)"
    }
}'
All right, found a work-around: I am able to ingest from a generated file using ChartSnapshotClass. Charts do show up now.
Very soon I will be able to build full file-to-dashboard lineage, completely automatically.
b
Ah I see! So this may be a bug. If you've updated to the latest version, we recommend that folks start using the "entities" endpoint to ingest data.
Let me send you an example CURL
curl 'http://localhost:8080/entities?action=ingest' -X POST -H 'X-RestLi-Protocol-Version:2.0.0' --data '{
   "entity":{ 
      "value":{
         "com.linkedin.metadata.snapshot.DatasetSnapshot": {"aspects":[{"com.linkedin.common.Ownership":{"owners":[{"owner":"urn:li:corpuser:john","type":"DATAOWNER"}],"lastModified":{"time":0,"actor":"urn:li:corpuser:goose"}}},{"com.linkedin.dataset.UpstreamLineage":{"upstreams":[{"auditStamp":{"time":0,"actor":"urn:li:corpuser:fbar"},"dataset":"urn:li:dataset:(urn:li:dataPlatform:foo,barUp,PROD)","type":"TRANSFORMED"}]}},{"com.linkedin.common.InstitutionalMemory":{"elements":[{"url":"https://www.linkedin.com","description":"Sample doc","createStamp":{"time":0,"actor":"urn:li:corpuser:fbar"}}]}},{"com.linkedin.schema.SchemaMetadata":{"schemaName":"FooEvent","platform":"urn:li:dataPlatform:foo","version":0,"created":{"time":0,"actor":"urn:li:corpuser:fbar"},"lastModified":{"time":0,"actor":"urn:li:corpuser:fbar"},"hash":"","platformSchema":{"com.linkedin.schema.KafkaSchema":{"documentSchema":"{\"type\":\"record\",\"name\":\"MetadataChangeEvent\",\"namespace\":\"com.linkedin.mxe\",\"doc\":\"Kafka event for proposing a metadata change for an entity.\",\"fields\":[{\"name\":\"auditHeader\",\"type\":{\"type\":\"record\",\"name\":\"KafkaAuditHeader\",\"namespace\":\"com.linkedin.avro2pegasus.events\",\"doc\":\"Header\"}}]}"}},"fields":[{"fieldPath":"foo","description":"Bar","nativeDataType":"string","type":{"type":{"com.linkedin.schema.StringType":{}}}}]}}],"urn":"urn:li:dataset:(urn:li:dataPlatform:foo,bar,PROD)"}
      }
   }
}'
This curl updates a Dataset.
Basically you'd want to replace the DatasetSnapshot with a ChartSnapshot and its contents!
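A rough Python sketch of what that swap could look like, building only the request body. Assumptions: the snapshot type is named com.linkedin.metadata.snapshot.ChartSnapshot by analogy with DatasetSnapshot above, the ChartInfo fields are the ones from the earlier chart curl, and the helper name chart_snapshot_payload is made up for illustration. Nothing is sent over the network here.

```python
import json


def chart_snapshot_payload(urn, title, description, chart_url, input_urns,
                           actor="urn:li:corpuser:datahub"):
    """Wrap a ChartInfo aspect in a body for /entities?action=ingest."""
    stamp = {"time": 0, "actor": actor}
    chart_info = {
        "title": title,
        "description": description,
        "chartUrl": chart_url,
        # One {"string": urn} object per input dataset. JSON parsers
        # typically keep only one value when a key repeats in an object.
        "inputs": [{"string": u} for u in input_urns],
        "lastModified": {"created": stamp, "lastModified": stamp},
    }
    snapshot = {
        "urn": urn,
        "aspects": [{"com.linkedin.chart.ChartInfo": chart_info}],
    }
    # Assumed type name, mirroring DatasetSnapshot in the curl above.
    return {"entity": {"value": {
        "com.linkedin.metadata.snapshot.ChartSnapshot": snapshot}}}


payload = chart_snapshot_payload(
    urn="urn:li:chart:(tableau,baz1)",
    title="Daily Summary",
    description="Daily Summary of Traffic count and Delay",
    chart_url="https://mtabl001.muac.corp.eurocontrol.int/#/views/DailySummary/DailySummary",
    input_urns=[
        "urn:li:dataset:(urn:li:dataPlatform:oracle,edw.date_r,PROD)",
        "urn:li:dataset:(urn:li:dataPlatform:oracle,edw.time_d,PROD)",
    ],
)
print(json.dumps(payload, indent=2))
```

The printed JSON can then be passed as the --data argument of the same curl call (same URL and X-RestLi-Protocol-Version header as above).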