# troubleshoot
m
I'm experimenting with ingestion in a 'docker quickstart' setup. I've somehow managed to get the data into an unusable state, from a UI point of view:
• when refreshing the homepage, there are two toaster popups saying "an unknown error has occurred (error 500)"
• only one data platform is shown
• selecting a dataset that's part of my experimentation shows a red band on top, with the message "An unknown error occurred. An unknown error occurred." (sic)
• the Schema, Documentation, and Properties tabs are empty
How do I figure out what caused the 500s?
(using the obvious `docker logs`)
```
13:41:59.773 [ForkJoinPool.commonPool-worker-7] ERROR c.l.datahub.graphql.GmsGraphQLEngine:1222 - Failed to load Entities of type: DataJob, keys: [urn:li:dataJob:(urn:li:dataFlow:(flink,prod-lz-dsh.b2cbilling.flinkcluster,prod),cdr-ingest), urn:li:dataJob:(urn:li:dataFlow:(flink,prod-lz-dsh.b2cbilling.flinkcluster,prod),cdr-processor)] Failed to batch load Data Jobs
13:41:59.774 [ForkJoinPool.commonPool-worker-7] ERROR c.l.d.g.e.DataHubDataFetcherExceptionHandler:21 - Failed to execute DataFetcher
java.util.concurrent.CompletionException: java.lang.RuntimeException: Failed to retrieve entities of type DataJob
	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
	at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1596)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
Caused by: java.lang.RuntimeException: Failed to retrieve entities of type DataJob
	at com.linkedin.datahub.graphql.GmsGraphQLEngine.lambda$null$133(GmsGraphQLEngine.java:1223)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
	... 5 common frames omitted
Caused by: java.lang.RuntimeException: Failed to batch load Data Jobs
	at com.linkedin.datahub.graphql.types.datajob.DataJobType.batchLoad(DataJobType.java:118)
	at com.linkedin.datahub.graphql.GmsGraphQLEngine.lambda$null$133(GmsGraphQLEngine.java:1220)
	... 6 common frames omitted
Caused by: com.linkedin.data.template.TemplateOutputCastException: Invalid URN syntax: Invalid number of keys.: urn:li:dataset:(urn:li:dataPlatform:dsh,prod-lz-dsh.internal.month-aggr-cdr-data,PROD)urn:li:dataset:(urn:li:dataPlatform:dsh,prod-lz-dsh.internal.hour-aggr-cdr-voice,PROD)
```
Figured it out... a missing comma in a URN list in `DataJobInputOutput.outputDatasets`.
That being said: the UI should not break over a misconfigured entity.
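For reference, a minimal sketch (plain Python, outside of DataHub) of how the missing comma produces the malformed URN seen in the stack trace above: Python implicitly concatenates adjacent string literals, so the list ends up holding one fused string instead of two URNs.

```python
# The two dataset URNs from the TemplateOutputCastException above,
# with the comma between them missing, as in the broken aspect.
urns = [
    "urn:li:dataset:(urn:li:dataPlatform:dsh,prod-lz-dsh.internal.month-aggr-cdr-data,PROD)"
    "urn:li:dataset:(urn:li:dataPlatform:dsh,prod-lz-dsh.internal.hour-aggr-cdr-voice,PROD)"
]

print(len(urns))  # 1, not 2: the literals were silently concatenated
print(urns[0])    # both URNs fused together, exactly the string GMS rejects
```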
m
@many-guitar-67205 : thanks for reporting. Agree that the UI should be more robust to bad metadata (which should have been prevented from getting in in the first place). Can you share the fragment of the metadata event that caused this?
m
Here's an example:
• `datahub docker quickstart`
• `datahub docker ingest-sample-data`
• update of example script https://github.com/linkedin/datahub/blob/master/metadata-ingestion/examples/library/lineage_dataset_job_dataset.py :
```python
from typing import List

import datahub.emitter.mce_builder as builder
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.com.linkedin.pegasus2avro.datajob import DataJobInputOutputClass
from datahub.metadata.schema_classes import ChangeTypeClass


# Construct the DataJobInputOutput aspect.
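# NOTE: the comma between the two URN strings below is intentionally missing.
# Python concatenates adjacent string literals, so input_datasets ends up
# containing a single malformed URN -- this is what triggers the 500 errors.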
input_datasets: List[str] = [
    'urn:li:dataset:(urn:li:dataPlatform:mysql,librarydb.member,PROD)'
    'urn:li:dataset:(urn:li:dataPlatform:mysql,librarydb.checkout,PROD)'
]

output_datasets: List[str] = [
    builder.make_dataset_urn(
        platform="kafka", name="debezium.topics.librarydb.member_checkout", env="PROD"
    )
]

input_data_jobs: List[str] = [
    builder.make_data_job_urn(
        orchestrator="airflow", flow_id="flow1", job_id="job0", cluster="PROD"
    )
]

datajob_input_output = DataJobInputOutputClass(
    inputDatasets=input_datasets,
    outputDatasets=output_datasets,
    inputDatajobs=input_data_jobs,
)

# Construct a MetadataChangeProposalWrapper object.
# NOTE: This will overwrite all of the existing lineage information associated with this job.
datajob_input_output_mcp = MetadataChangeProposalWrapper(
    entityType="dataJob",
    changeType=ChangeTypeClass.UPSERT,
    entityUrn=builder.make_data_job_urn(
        orchestrator="airflow", flow_id="flow1", job_id="job1", cluster="PROD"
    ),
    aspectName="dataJobInputOutput",
    aspect=datajob_input_output,
)

# Create an emitter to the GMS REST API.
emitter = DatahubRestEmitter("http://localhost:8080")

# Emit metadata!
emitter.emit_mcp(datajob_input_output_mcp)
```
Note the missing comma in `input_datasets`.
When browsing the UI, go to http://localhost:9002/search?filter_platform=urn:li:dataPlatform:airflow
At the top, it mentions that an error has occurred. No Airflow entries are shown (there should be 3 from the sample dataset). (In my case, the home page was not complete either, but I can't seem to reproduce that with this simple example.)
The code that checks whether a URN is valid does not seem to care about any extra characters appended after a valid URN, so during ingestion the URNs are considered to be OK.
m
Thanks for the info @many-guitar-67205, it's a somewhat tricky validation scenario; we'll get back to you on how we want to handle this going forward. /cc @orange-night-91387 @big-carpet-38439
b
Yes, but overall I agree this needs fixing; it's just a matter of figuring out the ‘how’ from here.