damp-greece-27806
03/09/2022, 8:27 PM
pipeline.run() in a script vs. datahub ingest -c dbt.yml. Using the datahub cli, we can ingest our dbt stuff fine with some warnings generated. When calling pipeline.run(), we notice that it’s unable to emit data to GMS. This feels specific to DBT as we use the inline config for redshift -> datahub and it works fine
green-football-43791
03/09/2022, 8:30 PM
damp-greece-27806
03/09/2022, 8:33 PM
errors section of the JSON payload says:
SinkReport(records_written=685, warnings=[], failures=[{'error': 'Unable to emit metadata to DataHub GMS'
damp-greece-27806
03/09/2022, 8:33 PM
green-football-43791
03/09/2022, 8:33 PM
damp-greece-27806
03/09/2022, 8:33 PM
damp-greece-27806
03/09/2022, 8:47 PM
damp-greece-27806
03/09/2022, 8:49 PM
datahub.configuration.common.PipelineExecutionError: ('Sink reported errors', SinkReport(records_written=685, warnings=[], failures=[{'error': 'Unable to emit metadata to DataHub GMS', 'info': {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException', 'stackTrace': "com.linkedin.restli.server.RestLiServiceException [HTTP Status:500]: com.datahub.util.exception.RetryLimitReached: Failed to add after 3 retries\n\tat com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:42)\n\tat com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:50)\n\tat com.linkedin.metadata.resources.entity.EntityResource.ingest(EntityResource.java:178)\n\tat sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
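For readers following along: the PipelineExecutionError above is the kind of error a programmatic run surfaces when the sink reports failures. Below is a minimal sketch of what the script-based path might look like, assuming the acryl-datahub Python package is installed. The helper names, file paths, target platform, and GMS URL are all hypothetical placeholders, not values from this thread, and the dbt config field names should be checked against the installed version.

```python
def build_dbt_recipe(manifest_path, catalog_path, target_platform, gms_server):
    """Build a recipe dict with the same shape as a dbt.yml recipe file.

    All argument values are supplied by the caller; nothing here comes
    from the thread's actual configuration.
    """
    return {
        "source": {
            "type": "dbt",
            "config": {
                "manifest_path": manifest_path,
                "catalog_path": catalog_path,
                "target_platform": target_platform,
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": gms_server},
        },
    }


def run_recipe(recipe):
    """Run a recipe in-process, roughly what `datahub ingest -c` does."""
    # Imported lazily so build_dbt_recipe works without acryl-datahub installed.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    # Raises PipelineExecutionError if the source or sink reported failures.
    pipeline.raise_from_status()


if __name__ == "__main__":
    # Hypothetical paths and server URL, for illustration only.
    recipe = build_dbt_recipe(
        manifest_path="./target/manifest.json",
        catalog_path="./target/catalog.json",
        target_platform="redshift",
        gms_server="http://localhost:8080",
    )
    run_recipe(recipe)
```

Because the script and the CLI each use whichever acryl-datahub is installed in their own environment, the same recipe can behave differently between the two.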
damp-greece-27806
03/09/2022, 8:49 PM
green-football-43791
03/09/2022, 8:54 PM
green-football-43791
03/09/2022, 8:54 PM
damp-greece-27806
03/09/2022, 8:57 PM
green-football-43791
03/09/2022, 8:58 PM
damp-greece-27806
03/09/2022, 8:59 PM
damp-greece-27806
03/09/2022, 8:59 PM
Caused by: io.ebean.DuplicateKeyException: Error when batch flush on sql: insert into metadata_aspect_v2 (urn, aspect, version, metadata, createdOn, createdBy, createdFor, systemmetadata) values (?,?,?,?,?,?,?,?)
damp-greece-27806
03/09/2022, 8:59 PM
damp-greece-27806
03/09/2022, 9:00 PM
Caused by: java.sql.BatchUpdateException: Duplicate entry 'urn:li:dataset:(urn:li:dataPlatform:xxx' for key 'metadata_aspect_v2.PRIMARY'
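The two "Caused by:" lines above were picked out of a long Java stack trace by hand. As a rough sketch, a small helper could extract them from a trace string. The function name is hypothetical, and it assumes the trace may arrive with escaped newlines, as in the Python repr of the SinkReport.

```python
from typing import List


def caused_by_lines(stack_trace: str) -> List[str]:
    """Return the 'Caused by:' lines of a Java stack trace, outermost first."""
    # Traces embedded in a Python repr carry literal backslash-n / backslash-t
    # sequences, so turn those into real whitespace before splitting.
    normalized = stack_trace.replace("\\n", "\n").replace("\\t", "\t")
    return [
        line.strip()
        for line in normalized.splitlines()
        if line.strip().startswith("Caused by:")
    ]
```

The last element of the returned list is usually the root cause, which here would be the database-level duplicate key error.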
green-football-43791
03/09/2022, 9:00 PM
damp-greece-27806
03/09/2022, 9:05 PM
damp-greece-27806
03/09/2022, 9:05 PM
green-football-43791
03/09/2022, 9:06 PM
damp-greece-27806
03/09/2022, 9:07 PM
damp-greece-27806
03/09/2022, 9:07 PM
damp-greece-27806
03/09/2022, 9:07 PM
green-football-43791
03/09/2022, 9:07 PM
green-football-43791
03/09/2022, 9:07 PM
green-football-43791
03/09/2022, 9:07 PM
green-football-43791
03/09/2022, 9:08 PM
damp-greece-27806
03/09/2022, 9:08 PM
acryl-datahub 0.8.27.1
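Since a script and the CLI can run in different Python environments, it may help to check which acryl-datahub version the script itself sees. Here is a minimal sketch using only the standard library. The helper names are hypothetical, and the version parser assumes purely numeric dotted versions such as the one quoted above.

```python
from importlib import metadata
from typing import Optional, Tuple


def installed_version(package: str = "acryl-datahub") -> Optional[str]:
    """Return the installed version of `package` in this interpreter, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a numeric dotted version like '0.8.27.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


if __name__ == "__main__":
    found = installed_version()
    print("acryl-datahub in this environment:", found)
```

Running this with the same interpreter that executes the ingestion script, and comparing the result with what the CLI reports, shows whether the two paths are using the same client version.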
green-football-43791
03/09/2022, 9:08 PM
damp-greece-27806
03/09/2022, 9:08 PM
green-football-43791
03/09/2022, 9:09 PM
damp-greece-27806
03/09/2022, 9:09 PM
damp-greece-27806
03/09/2022, 9:09 PM
damp-greece-27806
03/09/2022, 9:09 PM
acryl-datahub is 0.8.17.1
damp-greece-27806
03/09/2022, 9:10 PM
damp-greece-27806
03/09/2022, 9:10 PM
green-football-43791
03/09/2022, 9:10 PM
green-football-43791
03/09/2022, 9:10 PM
green-football-43791
03/09/2022, 9:10 PM
damp-greece-27806
03/09/2022, 10:09 PM
docker-local-runner-1 | The conflict is caused by:
docker-local-runner-1 | apache-airflow[package-extra] 2.0.2 depends on markupsafe<2.0 and >=1.1.1
docker-local-runner-1 | acryl-datahub[redshift,redshift-usage] 0.8.27.1 depends on markupsafe==2.0.1
docker-local-runner-1 | The user requested (constraint) markupsafe==1.1.1
and we can’t easily upgrade airflow
green-football-43791
03/09/2022, 10:30 PM
green-football-43791
03/09/2022, 10:30 PM
damp-greece-27806
03/10/2022, 3:45 PM
MarkupSafe to 1.1.1
damp-greece-27806
03/10/2022, 8:47 PM
elegant-traffic-96321
03/10/2022, 8:47 PM
damp-greece-27806
03/10/2022, 8:48 PM
damp-greece-27806
03/11/2022, 5:56 PM
green-football-43791
03/11/2022, 5:57 PM
green-football-43791
03/11/2022, 5:57 PM
damp-greece-27806
03/11/2022, 5:57 PM
square-activity-64562
03/14/2022, 1:29 PM