nutritious-bird-77396
04/16/2022, 5:49 PM
i.e. 0 and 1
select aspect, max(version) from metadata_aspect_v2 where urn='urn:li:corpuser:<redacted>' and aspect ='groupMembership' group by urn, aspect, version
So, picking the first returned value (0), it computes the next version as 1 again, causing the primary key error.
Postgres databases are created using this script.
I suspect something in our Postgres setup, but it's not clear where the issue could be.
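Note that the diagnostic query above groups by version as well, so it returns one row per existing version rather than a single max per (urn, aspect). A small sqlite sketch (illustrative only; not the actual Postgres instance) of the difference:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadata_aspect_v2 ("
    "urn TEXT, aspect TEXT, version INTEGER, "
    "PRIMARY KEY (urn, aspect, version))"
)
conn.executemany(
    "INSERT INTO metadata_aspect_v2 VALUES (?, ?, ?)",
    [("urn:li:corpuser:x", "groupMembership", 0),
     ("urn:li:corpuser:x", "groupMembership", 1)],
)

# Grouping by version too yields one row per existing version...
per_version = conn.execute(
    "SELECT aspect, MAX(version) FROM metadata_aspect_v2 "
    "WHERE urn = ? GROUP BY urn, aspect, version",
    ("urn:li:corpuser:x",),
).fetchall()

# ...while grouping by (urn, aspect) alone yields the true max.
true_max = conn.execute(
    "SELECT aspect, MAX(version) FROM metadata_aspect_v2 "
    "WHERE urn = ? GROUP BY urn, aspect",
    ("urn:li:corpuser:x",),
).fetchall()

print(per_version)  # two rows, one per version
print(true_max)     # [('groupMembership', 1)]
```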
Let me know if you have any thoughts.
nutritious-bird-77396
04/18/2022, 4:22 PM
version in dbResults
• In my case, since version 1 comes first and then version 0, the list has groupMembership with version 0 as the latest.
• So when incrementing for the next version it gets 1 again, causing the primary key error.
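The failure mode in the bullets above can be sketched in a few lines (illustrative only; the real logic lives in DataHub's Java EbeanAspectDao, this just mirrors the described behavior):

```python
# If rows come back ordered (version 1, then version 0) and later rows
# overwrite earlier ones, the "latest" version recorded is 0, so the
# computed next version collides with the existing row at version 1.
rows = [
    ("groupMembership", 1),  # Postgres returns this first...
    ("groupMembership", 0),  # ...and this second
]

latest = {}
for aspect, version in rows:
    latest[aspect] = version  # last row wins

next_version = latest["groupMembership"] + 1
print(next_version)  # 1 -- already exists, hence the primary-key error
```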
Here are debug logs from the environment to prove my finding:
16:01:26 [qtp1025799482-17] INFO c.l.m.entity.ebean.EbeanAspectDao - From DB- Urn: urn:li:corpuser:<redacted>, Aspect: corpUserInfo, currVersion: 0
16:01:26 [qtp1025799482-17] INFO c.l.m.entity.ebean.EbeanAspectDao - From DB- Urn: urn:li:corpuser:<redacted>, Aspect: groupMembership, currVersion: 1
16:01:26 [qtp1025799482-17] INFO c.l.m.entity.ebean.EbeanAspectDao - From DB- Urn: urn:li:corpuser:<redacted>, Aspect: groupMembership, currVersion: 0
16:01:26 [qtp1025799482-17] INFO c.l.m.entity.ebean.EbeanAspectDao - Contains ASpect - Urn: urn:li:corpuser:<redacted>, Aspect: corpUserInfo, nextVersion: 1
16:01:26 [qtp1025799482-17] INFO c.l.m.entity.ebean.EbeanAspectDao - Added to result - Aspect: corpUserInfo, nextVersion: 1
16:01:26 [qtp1025799482-17] INFO c.l.m.entity.ebean.EbeanAspectDao - Contains ASpect - Urn: urn:li:corpuser:<redacted>, Aspect: groupMembership, nextVersion: 1
16:01:26 [qtp1025799482-17] INFO c.l.m.entity.ebean.EbeanAspectDao - Added to result - Aspect: groupMembership, nextVersion: 1
16:01:26 [qtp1025799482-17] INFO c.l.m.e.ebean.EbeanEntityService - Urn - corpUserInfo, AspectName- 1, nextVersion- {}
16:01:26 [qtp1025799482-17] INFO c.l.m.e.ebean.EbeanEntityService - Urn - groupMembership, AspectName- 1, nextVersion- {}
16:01:26 [qtp1025799482-17] INFO c.l.m.filter.RestliLoggingFilter - POST /entities?action=ingest - ingest - 500 - 47ms
16:01:26 [qtp1025799482-17] ERROR c.l.m.filter.RestliLoggingFilter - Rest.li error:
com.linkedin.restli.server.RestLiServiceException: com.datahub.util.exception.RetryLimitReached: Failed to add after 3 retries
Findings:
• It looks like this is basic functionality in DataHub, so I doubt this has to do with Postgres.
Any inputs on this would be helpful.
delightful-barista-90363
04/18/2022, 4:42 PM
I am having difficulty finding the definitions of MetadataChangeEvent and MetadataWorkUnit, and whether these objects are related to tags. I was wondering if anyone could give me some advice. Much appreciated in advance!
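For context, a hedged sketch of the globalTags aspect shape that tag attachment ultimately boils down to (the tag URN below is a made-up example, not from this thread):

```python
import json

# DataHub models tags on an entity as a globalTags aspect: a list of
# tag associations, each pointing at a tag URN.
global_tags = {
    "tags": [
        {"tag": "urn:li:tag:PII"},  # hypothetical example tag
    ]
}
print(json.dumps(global_tags))
```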
Edit: I think this example https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/examples/library/dataset_add_tag.py is what I was looking for.
clean-nightfall-92007
04/19/2022, 6:54 AM
/aspects?action=ingestProposal
'This string should be OK' is not a valid string representation of bytes
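A hedged guess at the cause: "... is not a valid string representation of bytes" on /aspects?action=ingestProposal often means the aspect value was sent as a raw object rather than a JSON-serialized string inside the GenericAspect envelope. A sketch (URN and aspect below are made-up examples):

```python
import json

# The `value` field must be a string the server can decode into bytes,
# i.e. the aspect serialized with json.dumps, not a nested dict.
aspect_payload = {"removed": False}

proposal = {
    "proposal": {
        "entityType": "dataset",
        "entityUrn": "urn:li:dataset:(urn:li:dataPlatform:hive,db.table,PROD)",
        "changeType": "UPSERT",
        "aspectName": "status",
        "aspect": {
            "contentType": "application/json",
            "value": json.dumps(aspect_payload),  # string, not a dict
        },
    }
}
print(isinstance(proposal["proposal"]["aspect"]["value"], str))  # True
```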
prehistoric-salesclerk-23462
04/27/2022, 1:12 PM
{'workunits_produced': 0,
 'workunit_ids': [],
 'warnings': {},
 'failures': {'version': ['Error: (snowflake.connector.errors.DatabaseError) 250001 (08001): Failed to connect to DB: '
                          'mycompay.eu-central-1.snowflakecomputing.com:443. Incorrect username or password was specified.\n'
                          '(Background on this error at: http://sqlalche.me/e/13/4xp6)']}}
nutritious-bird-77396
05/03/2022, 5:05 PM
[2022-05-03 16:53:06,994] DEBUG {datahub.cli.ingest_cli:94} - Using config: {'source': {'type': 'datahub-stream', 'config': {'auto_offset_reset': 'latest', 'connection': {'bootstrap': '<redacted>', 'schema_registry_url': '<redacted>', 'consumer_config': {'security.protocol': 'SASL_SSL'}}, 'actions': [{'type': 'executor', 'config': {'local_executor_enabled': True, 'remote_executor_enabled': 'False', 'remote_executor_type': 'acryl.executor.sqs.producer.sqs_producer.SqsRemoteExecutor', 'remote_executor_config': {'id': 'remote', 'aws_access_key_id': '""', 'aws_secret_access_key': '""', 'aws_session_token': '""', 'aws_command_queue_url': '""', 'aws_region': 'us-east-1'}}}], 'topic_routes': {'mae': 'MetadataAuditEvent_v4', 'mcl': 'MetadataChangeLog_Versioned_v1'}}}, 'sink': {'type': 'console'}, 'datahub_api': {'server': '<redacted>:8080', 'extra_headers': {'Authorization': 'Basic __datahub_system:JohnSnowKnowsNothing'}}}
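In that dump, consumer_config carries only security.protocol. A hedged sketch of what a complete SASL_SSL consumer section typically needs under librdkafka/confluent-kafka (the mechanism and credential values are assumptions, not taken from this environment):

```python
# For SASL_SSL, the consumer needs the SASL mechanism and credentials
# alongside security.protocol; if only security.protocol is forwarded,
# the mechanism never reaches the underlying Kafka consumer.
consumer_config = {
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",      # e.g. PLAIN or SCRAM-SHA-512, per cluster
    "sasl.username": "<redacted>",
    "sasl.password": "<redacted>",
}
print("sasl.mechanism" in consumer_config)  # True
```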
The sasl.mechanism that is passed is not set in the consumer configs...
worried-motherboard-80036
05/05/2022, 4:59 PM
source:
  type: "elasticsearch"
  config:
    # Coordinates
    host: 'https://internal_ip:9200'
    # Credentials
    username: the_user
    password: the_pass
sink:
  type: "console"
Because I am not passing any SSL config params, I am getting:
elasticsearch.exceptions.SSLError: ConnectionError([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131))
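A hedged sketch of the SSL-related kwargs an elasticsearch-py (7.x) client would need to verify a cert signed by an internal CA; the CA path is a placeholder, and the recipe above has no field that maps to these options:

```python
# Keyword arguments for elasticsearch.Elasticsearch; ca_certs points at
# the internal CA bundle so CERTIFICATE_VERIFY_FAILED can be resolved
# without disabling verification.
es_kwargs = {
    "hosts": ["https://internal_ip:9200"],
    "http_auth": ("the_user", "the_pass"),
    "ca_certs": "/path/to/internal_ca.pem",  # placeholder CA bundle path
    "verify_certs": True,
}
# from elasticsearch import Elasticsearch
# es = Elasticsearch(**es_kwargs)
print("ca_certs" in es_kwargs)  # True
```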
I dug into the source code a bit, and I see that the connection to Elasticsearch is set by passing only the host and the http_auth (basically username and password).
astonishing-dusk-99990
05/11/2022, 12:32 PM
'JSONDecodeError: Invalid control character at: line 1594 column 4096 (char 65977)\n'
However, when I run it locally I hit the same error; after modifying it locally with this solution (assigning the content of the JSON file to a variable) it works: https://stackoverflow.com/questions/9156417/valid-json-giving-jsondecodeerror-expecting-delimiter https://stackoverflow.com/questions/63107394/jsonload-jsondecodeerror-invalid-control-character-at
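The workaround in question can be demonstrated in a few lines: Python's json.loads rejects raw control characters inside strings by default, and strict=False accepts them.

```python
import json

# A raw control character (here a real tab) inside a JSON string raises
# "Invalid control character" with the default strict=True.
raw = '{"description": "line1\tline2"}'  # actual tab inside the quoted value

try:
    json.loads(raw)
    parsed_default = True
except json.JSONDecodeError:
    parsed_default = False  # default parser refuses the control character

obj = json.loads(raw, strict=False)  # strict=False allows it through
print(parsed_default)  # False
```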
But when I try to execute it in the DataHub UI it always errors. Does anyone know which file I can edit to add strict=False in json.loads?
millions-waiter-49836
05/16/2022, 7:37 PM
partition sits in dataset.datasetProfiles.partitionSpec. Does this mean we can only surface partition as part of datasetProfiles?
IMHO, shouldn't a dataset partition exist by itself, not depending on the dataset profiles?