I am experiencing error `duplicate key value viol...
# ingestion
b
I am experiencing the error `duplicate key value violates unique constraint "pk_metadata_aspect_v2"` for dbt ingestion. Recipe:
source:
  type: "dbt"
  config:
    manifest_path: "home/user/manifest.json"
    catalog_path: "/home/user/catalog.json"
    target_platform: "snowflake" 
    load_schemas: False
2022-02-04T16:05:00.3797694Z                                       '\tat com.linkedin.metadata.entity.EntityService.ingestSnapshotUnion(EntityService.java:685)\n'
2022-02-04T16:05:00.3798072Z                                       '\tat com.linkedin.metadata.entity.EntityService.ingestEntity(EntityService.java:588)\n'
2022-02-04T16:05:00.3798485Z                                       '\tat com.linkedin.metadata.resources.entity.EntityResource.lambda$ingest$4(EntityResource.java:179)\n'
2022-02-04T16:05:00.3798818Z                                       '\tat com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:30)\n'
2022-02-04T16:05:00.3798995Z                                       '\t... 84 more\n'
2022-02-04T16:05:00.3799362Z                                       'Caused by: io.ebean.DuplicateKeyException: Error when batch flush on sql: insert into metadata_aspect_v2 '
2022-02-04T16:05:00.3799717Z                                       '(urn, aspect, version, metadata, createdOn, createdBy, createdFor, systemmetadata) values (?,?,?,?,?,?,?,?)\n'
2022-02-04T16:05:00.3800094Z                                       '\tat io.ebean.config.dbplatform.SqlCodeTranslator.translate(SqlCodeTranslator.java:46)\n'
2022-02-04T16:05:00.3800474Z                                       '\tat io.ebean.config.dbplatform.DatabasePlatform.translate(DatabasePlatform.java:219)\n'
2022-02-04T16:05:00.3800903Z                                       '\tat io.ebeaninternal.server.transaction.TransactionManager.translate(TransactionManager.java:246)\n'
2022-02-04T16:05:00.3801310Z                                       '\tat io.ebeaninternal.server.transaction.JdbcTransaction.translate(JdbcTransaction.java:698)\n'
2022-02-04T16:05:00.3801719Z                                       '\tat io.ebeaninternal.server.transaction.JdbcTransaction.batchFlush(JdbcTransaction.java:680)\n'
2022-02-04T16:05:00.3802166Z                                       '\tat io.ebeaninternal.server.transaction.JdbcTransaction.internalBatchFlush(JdbcTransaction.java:796)\n'
2022-02-04T16:05:00.3802657Z                                       '\tat io.ebeaninternal.server.transaction.JdbcTransaction.flushCommitAndNotify(JdbcTransaction.java:1005)\n'
2022-02-04T16:05:00.3803084Z                                       '\tat io.ebeaninternal.server.transaction.JdbcTransaction.commit(JdbcTransaction.java:1057)\n'
2022-02-04T16:05:00.3803628Z                                       '\tat com.linkedin.metadata.entity.ebean.EbeanAspectDao.runInTransactionWithRetry(EbeanAspectDao.java:449)\n'
2022-02-04T16:05:00.3803810Z                                       '\t... 90 more\n'
2022-02-04T16:05:00.3804188Z                                       'Caused by: java.sql.BatchUpdateException: Batch entry 1 insert into metadata_aspect_v2 (urn, aspect, version, '
2022-02-04T16:05:00.3804479Z                                       'metadata, createdOn, createdBy, createdFor, systemmetadata) values '
2022-02-04T16:05:00.3805067Z                                       '(\'urn:li:dataset:(urn:li:dataPlatform:snowflake,table name,PROD)\',\'datasetKey\',0,\'{"origin":"PROD","name":"<table name>","platform":"urn:li:dataPlatform:snowflake"}\',\'2022-02-04 '
2022-02-04T16:05:00.3805431Z                                       '16:04:14.293+00\',\'urn:li:corpuser:UNKNOWN\',NULL,\'{"lastObserved":1643990269459,"runId":"dbt-2022_02_04-15_54_54"}\') '
2022-02-04T16:05:00.3805764Z                                       'was aborted: ERROR: duplicate key value violates unique constraint "pk_metadata_aspect_v2"\n'
2022-02-04T16:05:00.3805971Z                                       '  Detail: Key (urn, aspect, '
2022-02-04T16:05:00.3806338Z                                       'version)=(urn:li:dataset:(urn:li:dataPlatform:snowflake,<table name>,PROD), datasetKey, 0) '
2022-02-04T16:05:00.3806634Z                                       'already exists.  Call getNextException to see other errors in the batch.\n'
2022-02-04T16:05:00.3807014Z                                       '\tat org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:159)\n'
2022-02-04T16:05:00.3807410Z                                       '\tat org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2268)\n'
2022-02-04T16:05:00.3807781Z                                       '\tat org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:510)\n'
2022-02-04T16:05:00.3808144Z                                       '\tat org.postgresql.jdbc.PgStatement.internalExecuteBatch(PgStatement.java:851)\n'
2022-02-04T16:05:00.3808480Z                                       '\tat org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:874)\n'
2022-02-04T16:05:00.3808874Z                                       '\tat org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1563)\n'
2022-02-04T16:05:00.3809035Z                                       '\tat '
2022-02-04T16:05:00.3809499Z                                       'io.ebean.datasource.delegate.PreparedStatementDelegator.executeBatch(PreparedStatementDelegator.java:357)\n'
2022-02-04T16:05:00.3809931Z                                       '\tat io.ebeaninternal.server.persist.BatchedPstmt.executeAndCheckRowCounts(BatchedPstmt.java:130)\n'
2022-02-04T16:05:00.3810313Z                                       '\tat io.ebeaninternal.server.persist.BatchedPstmt.executeBatch(BatchedPstmt.java:97)\n'
2022-02-04T16:05:00.3810707Z                                       '\tat io.ebeaninternal.server.persist.BatchedPstmtHolder.flush(BatchedPstmtHolder.java:124)\n'
2022-02-04T16:05:00.3811102Z                                       '\tat io.ebeaninternal.server.persist.BatchControl.flushPstmtHolder(BatchControl.java:206)\n'
2022-02-04T16:05:00.3811470Z                                       '\tat io.ebeaninternal.server.persist.BatchControl.executeNow(BatchControl.java:220)\n'
2022-02-04T16:05:00.3811915Z                                       '\tat io.ebeaninternal.server.persist.BatchedBeanHolder.executeNow(BatchedBeanHolder.java:95)\n'
2022-02-04T16:05:00.3812293Z                                       '\tat io.ebeaninternal.server.persist.BatchControl.flush(BatchControl.java:271)\n'
2022-02-04T16:05:00.3812648Z                                       '\tat io.ebeaninternal.server.persist.BatchControl.flush(BatchControl.java:227)\n'
2022-02-04T16:05:00.3813051Z                                       '\tat io.ebeaninternal.server.transaction.JdbcTransaction.batchFlush(JdbcTransaction.java:678)\n'
2022-02-04T16:05:00.3813591Z                                       '\t... 94 more\n'
2022-02-04T16:05:00.3813970Z                                       'Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint '
2022-02-04T16:05:00.3814177Z                                       '"pk_metadata_aspect_v2"\n'
2022-02-04T16:05:00.3814376Z                                       '  Detail: Key (urn, aspect, '
2022-02-04T16:05:00.3814833Z                                       'version)=(urn:li:dataset:(urn:li:dataP
This error is mostly seen for `target platform` (snowflake in this case) datasets: `urn:li:dataPlatform:snowflake,<name>,PROD)`
q
If it could be any help: I saw this error a lot when using a version of the ingestion library that didn't match the version of datahub I had deployed.
b
Our DataHub is `0.8.24` and I tried ingestion libs (acryl-datahub) on `0.8.24.0`, `0.8.24.1`, and `0.8.24.2`. It's throwing the same error.
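For anyone wanting to rule out a client/server version mismatch quickly, here is a minimal sketch. It assumes that the leading components of the acryl-datahub version are meant to line up with the deployed DataHub version (so `0.8.24.x` pairs with `0.8.24`). The helper name `client_matches_server` is made up for illustration and is not part of any DataHub API.

```python
def client_matches_server(client_version, server_version):
    """Return True when the client's leading components equal the server version.

    Assumption: the client release number extends the server release number
    with extra trailing components (e.g. 0.8.24.1 for server 0.8.24).
    """
    server_parts = server_version.split(".")
    client_parts = client_version.split(".")
    return client_parts[:len(server_parts)] == server_parts


server = "0.8.24"
for client in ["0.8.24.0", "0.8.24.1", "0.8.24.2"]:
    print(client, client_matches_server(client, server))
```

Under that assumption all three client versions tried here pair with the server, so a version mismatch would not explain the error in this case.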
i
Hello Mayur, could you share your manifest & catalog files from dbt?
This issue looks to be a case-sensitivity mismatch in urns. I need a little more information about what you are trying to ingest to be able to help more.
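One way to test the case-mismatch hypothesis before sharing files is to look for urns that differ only by letter case. The snippet below is a rough sketch, not something taken from DataHub itself: `find_case_collisions` is a hypothetical helper and the table names are placeholders. It assumes you have already collected the dataset urns the run would emit into a list.

```python
from collections import defaultdict


def find_case_collisions(urns):
    """Group urns that are identical except for letter case."""
    groups = defaultdict(set)
    for urn in urns:
        groups[urn.lower()].add(urn)
    # Keep only groups that contain more than one distinct spelling.
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Hypothetical urns, for illustration only.
urns = [
    "urn:li:dataset:(urn:li:dataPlatform:snowflake,db.schema.orders,PROD)",
    "urn:li:dataset:(urn:li:dataPlatform:snowflake,DB.SCHEMA.ORDERS,PROD)",
    "urn:li:dataset:(urn:li:dataPlatform:snowflake,db.schema.customers,PROD)",
]
collisions = find_case_collisions(urns)
for variants in collisions.values():
    print(variants)
```

If this reports any groups, those table names are worth checking first in the manifest and catalog.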
b
The same issue has also been reported by someone else: https://github.com/linkedin/datahub/issues/4124
@incalculable-ocean-74010 I am going to isolate the issue in test dbt files and get back to you