Hello - I'm trying out the new (and in beta) data-...
# ingestion
s
Hello - I'm trying out the new (and in beta) data-lake ingestion tool on a file on my Windows laptop. As far as I can tell it very nearly works, but the push to the DataHub GMS fails with the error "Unable to emit metadata to DataHub GMS". Looking through the sizeable info message, the core of the problem seems to be: "Caused by: com.linkedin.data.template.TemplateOutputCastException: Invalid URN syntax: Invalid URN Parameter: 'No enum constant com.linkedin.common.FabricType.prod: urn:li:dataset:(urn:li:dataPlatform:local-data-lake,Indices-2021-03,prod)"
The ingestion recipe and the full error message are attached.
I can successfully run a Glue ingestion recipe.
s
Try changing env to be PROD instead
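For anyone hitting the same error: the last element of the dataset URN is matched against the FabricType enum, and the error message suggests the match is case-sensitive. Below is a minimal sketch of a pre-flight check one could run on the env value before ingesting. The function name `make_dataset_urn` and the `KNOWN_FABRICS` set are illustrative assumptions, not part of the DataHub API, and the set may not list every value the server accepts.

```python
# Assumed subset of FabricType values; the real enum lives on the GMS side
# and may contain more entries than listed here.
KNOWN_FABRICS = {"PROD", "DEV", "EI", "CORP"}


def make_dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a dataset URN string, rejecting env values GMS would likely refuse."""
    if env not in KNOWN_FABRICS:
        hint = ""
        if env.upper() in KNOWN_FABRICS:
            # Typical cause of "No enum constant ...FabricType.prod"
            hint = f" (did you mean {env.upper()!r}?)"
        raise ValueError(f"Unknown env {env!r}{hint}")
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


if __name__ == "__main__":
    print(make_dataset_urn("local-data-lake", "Indices-2021-03", "PROD"))
    try:
        make_dataset_urn("local-data-lake", "Indices-2021-03", "prod")
    except ValueError as err:
        print(err)
```

With env set to PROD the URN ends in ",PROD)", which is the form GMS accepted in this thread; the lower-case value is caught locally instead of surfacing as a TemplateOutputCastException.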
s
Excellent, thanks @square-activity-64562 - that worked! I picked up the lower case "prod" from the documentation here: https://datahubproject.io/docs/metadata-ingestion/source_docs/data_lake#setup
s
s
Thanks! I tried to provide a review to merge the PR but I don't think I have sufficient authority. I am now going to battle with my Spark versioning...
s
Thanks for the review. The merge requires someone from the team to review. We will get it merged later.
s
I got data lake profiling working once I switched SPARK_VERSION from 2.4.7 to 3.0.3. I had it set to 2.4.7 because I was using Pydeequ earlier last year for some work...
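A stale SPARK_VERSION left over from earlier work is easy to miss, so here is a small sketch of a check that could be run before profiling. The helper name `check_spark_version` and the assumption that a 3.x value is what is needed are taken only from the experience described in this thread, not from any documented requirement.

```python
import os


def check_spark_version(expected_major: int = 3, environ=os.environ):
    """Return a warning string if SPARK_VERSION looks wrong, else None."""
    raw = environ.get("SPARK_VERSION")
    if raw is None:
        return "SPARK_VERSION is not set"
    try:
        major = int(raw.split(".")[0])
    except ValueError:
        return f"SPARK_VERSION={raw!r} is not a dotted version number"
    if major != expected_major:
        # Assumption from this thread: 2.4.7 failed, 3.0.3 worked.
        return f"SPARK_VERSION={raw} but a {expected_major}.x release was expected"
    return None


if __name__ == "__main__":
    problem = check_spark_version()
    print(problem or "SPARK_VERSION looks fine")
```

Passing a plain dict as `environ` makes it possible to try the check without touching the real environment.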