microscopic-mechanic-13766
09/20/2022, 12:10 PM
max_workers
but haven't improved the times.
I don't think it is a problem with either the quantity of data (as I only have 4 tables, and the maximum number of rows in them is 30) or my deployment as it didn't use to be this slow.
Any tips on how to either improve the ingestion or determine the actual cause of the problem?
(Profiling for other sources runs normally, so this isn't a general ingestion or profiling problem, but one specific to Hive ingestion.)
hundreds-photographer-13496
09/20/2022, 12:45 PM
microscopic-mechanic-13766
09/20/2022, 1:50 PM
microscopic-mechanic-13766
09/20/2022, 1:54 PM
[2022-09-20 13:23:29,781] DEBUG {datahub.ingestion.run.pipeline:43} - sink wrote workunit profile-default.prueba
[2022-09-20 13:23:29,783] DEBUG {datahub.emitter.rest_emitter:235} - Attempting to emit to DataHub GMS; using curl equivalent to:
curl -X POST -H 'User-Agent: python-requests/2.28.0' -H 'Accept-Encoding: gzip, deflate' -H 'Accept: */*' -H 'Connection: keep-alive' -H 'X-RestLi-Protocol-Version: 2.0.0' -H 'Content-Type: application/json' -H 'Authorization: Basic __datahub_system:JohnSnowKnowsNothing' --data '{"proposal": {"entityType": "dataset", "entityUrn": "urn:li:dataset:(urn:li:dataPlatform:hive,default.prueba2,PROD)", "changeType": "UPSERT", "aspectName": "datasetProfile", "aspect": {"value": "{\"timestampMillis\": 1663671701130, \"partitionSpec\": {\"type\": \"FULL_TABLE\", \"partition\": \"FULL_TABLE_SNAPSHOT\"}, \"rowCount\": 0, \"columnCount\": 2, \"fieldProfiles\": [{\"fieldPath\": \"name\", \"uniqueCount\": 0, \"nullCount\": 0, \"sampleValues\": []}, {\"fieldPath\": \"age\", \"uniqueCount\": 0, \"nullCount\": 0, \"sampleValues\": []}]}", "contentType": "application/json"}, "systemMetadata": {"lastObserved": 1663673009688, "runId": "hive-2022_09_20-13_01_02"}}}' 'http://datahub-gms:8080/aspects?action=ingestProposal'
[2022-09-20 13:23:29,798] DEBUG {datahub.ingestion.run.pipeline:43} - sink wrote workunit profile-default.prueba2
I am guessing that for some reason it stopped executing or it got stuck at that point.
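One way to check where the run actually stalls, rather than guessing from the last lines, might be to scan the debug log for long pauses between consecutive timestamps. The sketch below is only an illustration and not part of DataHub: the helper name `find_gaps`, the 30-second threshold, and the file name `ingest.log` are all assumptions, and it relies only on the `[YYYY-MM-DD HH:MM:SS,mmm]` prefix visible in the log lines above.

```python
import re
from datetime import datetime

# Matches the leading "[YYYY-MM-DD HH:MM:SS,mmm]" timestamp seen in the debug log.
TS_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]")


def find_gaps(lines, min_gap_seconds=30.0):
    """Return (gap_seconds, line_before, line_after) for every long pause.

    `find_gaps` and `min_gap_seconds` are hypothetical names chosen for
    this sketch; they are not part of any DataHub API.
    """
    gaps = []
    prev_ts, prev_line = None, None
    for line in lines:
        match = TS_RE.match(line)
        if not match:
            # Continuation lines (e.g. the curl command) carry no timestamp.
            continue
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if prev_ts is not None:
            gap = (ts - prev_ts).total_seconds()
            if gap >= min_gap_seconds:
                gaps.append((gap, prev_line.rstrip(), line.rstrip()))
        prev_ts, prev_line = ts, line
    return gaps


# Example usage (the file name is an assumption):
# with open("ingest.log") as f:
#     for gap, before, after in find_gaps(f):
#         print(f"{gap:.0f}s pause after: {before[:120]}")
```

If the largest pauses all sit right before `profile-...` workunits, that would point at the profiling queries against Hive rather than at the sink. If the log simply ends, the process more likely hung or was killed at that point.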