wonderful-bear-5842
03/10/2024, 9:14 AM

stocky-plumber-3084
03/11/2024, 2:16 AM

billions-yacht-53533
03/11/2024, 7:18 AM

stocky-plumber-3084
03/11/2024, 7:33 AM

wonderful-rain-49084
03/11/2024, 8:12 AM
Caused by: java.sql.BatchUpdateException: Batch entry 1 update metadata_aspect_v2 set metadata='{"paths":["/prod/trino/output/sk_test_v1_abc"]}', createdOn='2024-03-08 17:17:44.062+00', createdBy='urn:li:corpuser:datahub', createdFor=NULL, systemmetadata='{"registryVersion":"0.0.0.0-dev","lastRunId":"no-run-id-provided","runId":"trino-2024_03_08-17_17_31","registryName":"unknownRegistry","lastObserved":1709918264042}' where urn='urn:li:dataset:(urn:li:dataPlatform:trino,output.sk_test_v1_abc.fitnesse_route_delta,PROD)' and aspect='browsePaths' and version=0 was aborted: ERROR: could not serialize access due to concurrent update Call getNextException to see other errors in the batch.
...
Caused by: org.postgresql.util.PSQLException: ERROR: could not serialize access due to concurrent update
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2675)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2365)
Any clue what could be wrong with our setup and how to fix it?

bland-orange-13353
03/11/2024, 11:29 AM

handsome-fireman-90345
03/11/2024, 1:24 PM

handsome-fireman-90345
03/11/2024, 1:24 PM

rapid-queen-98305
03/11/2024, 2:21 PM

lemon-airplane-7413
03/11/2024, 11:07 PM

adventurous-dawn-19232
03/12/2024, 6:35 AM

aloof-oil-31167
03/12/2024, 10:59 AM
I'm calling get_urns_by_filter to get all dataset URNs of a specific platform. The function's default batch_size is 10k; whenever I try to fetch more (e.g. 15k), it fails with this error:
{'code': 500, 'type': 'SERVER_ERROR', 'classification': 'DataFetchingException'}
Is there any way to page over those values? Ideally I want to query almost 20k results.
This is the code:
datahub_graph = DataHubGraph(DatahubClientConfig(server=DATAHUB_HOST,
token=os.getenv('DATAHUB_TOKEN')))
datasets_urns = datahub_graph.get_urns_by_filter(platform="snowflake", batch_size=15000)
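In recent versions of the Python SDK, get_urns_by_filter returns a lazy iterator that pages through results server-side, so batch_size controls the page size, not the total: keeping it at (or below) the default and simply iterating should let you collect well over 10k URNs without tripping the server limit. Below is a minimal sketch of that paging pattern with stand-in data; fetch_batch and iter_urns are hypothetical helpers, not SDK functions.

```python
# Hypothetical paging sketch: fetch fixed-size pages below the server
# limit instead of requesting one oversized batch.
def fetch_batch(start, count, total=20000):
    # Stand-in for a server call; returns up to `count` fake URNs.
    end = min(start + count, total)
    return [f"urn:li:dataset:{i}" for i in range(start, end)]

def iter_urns(batch_size=10000, total=20000):
    start = 0
    while True:
        batch = fetch_batch(start, batch_size, total)
        if not batch:
            return
        yield from batch
        start += len(batch)

urns = list(iter_urns(batch_size=5000))
print(len(urns))  # 20000
```

With the real SDK the equivalent move is simply `for urn in datahub_graph.get_urns_by_filter(platform="snowflake"): ...`, letting the client page internally.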
mysterious-advantage-78411
03/12/2024, 1:22 PM

important-electrician-22243
03/12/2024, 8:04 PM
[application-akka.actor.default-dispatcher-52] WARN p.api.mvc.DefaultJWTCookieDataCodec - decode: cookie has invalid signature! message = JWT signature does not match locally computed signature. JWT validity cannot be asserted and should not be trusted.
2024-03-12 19:11:54,317 [application-akka.actor.default-dispatcher-52] INFO p.api.mvc.DefaultJWTCookieDataCodec - The JWT signature in the cookie does not match the locally computed signature with the server. This usually indicates the browser has a leftover cookie from another Play application, so clearing cookies may resolve this error message.
2024-03-12 19:11:54,322 [application-akka.actor.default-dispatcher-52] ERROR controllers.SsoCallbackController - Caught exception while attempting to handle SSO callback! It's likely that SSO integration is mis-configured.
adventurous-dawn-19232
03/13/2024, 4:21 AM

cuddly-wall-60618
03/13/2024, 8:05 AM
datajob_urn = make_data_job_urn(orchestrator="airflow", flow_id="pal_example_datamart", job_id="params_eval")
info = DataJobInfoClass(name="hello", type="COMMAND", customProperties={"ingest_server": "true"})
mcp = MetadataChangeProposalWrapper(entityUrn=datajob_urn, aspect=info)  # wrap the aspect before emitting
emitter = DataHubRestEmitter(gms_server="http://localhost:8082", token="")
emitter.emit_mcp(mcp)
mammoth-apple-56011
03/13/2024, 11:58 AM
limits:
cpu: '4'
memory: 8Gi
Also, sometimes we get a 500 error in the DataHub UI.
The logs in the datahub-frontend Pod say this:
Caused by: java.util.concurrent.TimeoutException: Read timeout to datahub-gms-datahub-gms/10.234.71.18:8080 after 60000 ms
But there is no datahub-gms Pod with that address; the current Pod IPs are:
10.238.46.45
10.239.10.201
10.239.26.249
10.237.17.16
As far as I can see, the 500 error happens because datahub-frontend is using stale IP addresses for the GMS Pods.
How can I configure it to use the new IP addresses?

able-carpenter-97384
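One hedged suggestion: the frontend normally reaches GMS through the Kubernetes Service name (so endpoints update automatically when Pods are replaced), but the frontend JVM can hold on to previously resolved addresses. A sketch of pinning the Service DNS name via Helm values follows; the release name, namespace, and chart keys here are assumptions to adapt to your deployment.

```yaml
# Hypothetical values.yaml fragment -- adjust names to your release/namespace.
datahub-frontend:
  extraEnvs:
    - name: DATAHUB_GMS_HOST
      value: datahub-gms-datahub-gms.datahub.svc.cluster.local  # Service DNS, not a Pod IP
    - name: DATAHUB_GMS_PORT
      value: "8080"
```

Restarting the frontend Deployment (e.g. `kubectl rollout restart deployment/<frontend-deployment>`) after GMS Pods are rescheduled also forces it to drop cached connections and re-resolve.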
03/13/2024, 3:14 PM

witty-butcher-82399
03/13/2024, 3:50 PM
2024-03-13 15:32:45,667 [ForkJoinPool.commonPool-worker-25] ERROR c.l.d.g.r.browse.BrowseResolver:60 - Failed to execute browse: entity type: DATA_PRODUCT, path: [], filters: null, start: 0, count: 10 null
Which suggests that DataProductType is missing BrowseableEntityType
We are running 0.12.0; however, DataProductType is still missing the interface in master.

fast-area-764
03/13/2024, 5:57 PM
BusinessGlossaryFileSource does not inherit from StatefulIngestionSourceBase. Without stateful ingestion, every new ingestion run only adds terms to the glossary and never removes them.

agreeable-greece-66183
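Until the glossary source supports stateful ingestion, one workaround is to diff the term URNs declared in the glossary file against what is on the server and delete the leftovers yourself. A minimal sketch with hard-coded stand-in URN sets follows; in practice you would build them from the YAML file and from the server (e.g. via get_urns_by_filter), and delete with the CLI or SDK.

```python
# Hypothetical workaround sketch: with no stateful ingestion, stale terms
# must be removed manually. The two sets below are made-up placeholders.
file_terms = {"urn:li:glossaryTerm:Revenue", "urn:li:glossaryTerm:Churn"}
server_terms = {"urn:li:glossaryTerm:Revenue", "urn:li:glossaryTerm:Churn",
                "urn:li:glossaryTerm:Deprecated"}

# Terms present on the server but no longer declared in the file.
stale = server_terms - file_terms
for urn in sorted(stale):
    print("would delete", urn)  # e.g. `datahub delete --urn <urn>`
```

The set difference is the whole trick: anything on the server that the file no longer declares is a candidate for deletion.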
03/13/2024, 10:04 PM

some-zoo-21364
03/14/2024, 11:11 AM
extract-QUERY_SCAN => Error was {'S': 'ERROR', 'C': 'XX000', 'M': 'Result size exceeds LISTAGG limit', 'D': '\n -----------------------------------------------\n error: Result size exceeds LISTAGG limit\n code: 8001\n context: LISTAGG limit: 65535\n query: 234890373[child_sequence:3]\n location: string_ops.cpp:138\n process: query1_1990_234890378 [pid=25559]\n -----------------------------------------------\n', 'F': '../src/sys/xen_execute.cpp', 'L': '12414', 'R': 'pg_throw'}
blue-cartoon-10359
03/14/2024, 1:19 PM
I'm using DataHubGraph in Python to call:
dataset = graph.get_urns_by_filter(
    entity_types=["dataset"],
    env="DEV",
    platform="mssql",
    extraFilters=[{'field': 'domains', 'values': [domain_urn]}],
)
However, when I run this I encounter the issue below (which is not related to any access tokens being expired, etc.). Does anyone know the cause?
line 172, in _send_restli_request
raise OperationalError(
datahub.configuration.common.OperationalError: ('Unable to get metadata from DataHub', {'message': '401 Client Error: Unauthorized for url: http://___/api/graphql'})
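For what it's worth, a 401 on /api/graphql usually means GMS has metadata-service authentication enabled and the request arrived without a valid bearer token, for instance because the token environment variable resolved to an empty string. A tiny sketch of the header that must end up on the request; auth_header is a hypothetical helper, not an SDK function.

```python
# Hypothetical helper: the Authorization header a request to GMS needs.
def auth_header(token):
    """Return the bearer header for a personal access token, or {} if none."""
    return {"Authorization": f"Bearer {token}"} if token else {}

print(auth_header("abc123"))  # {'Authorization': 'Bearer abc123'}
```

When the token is None or empty, the request goes out anonymous and GMS answers 401, so it's worth checking that the variable is actually set in the environment where the script runs and that the token hasn't been revoked.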
creamy-machine-95935
03/14/2024, 1:48 PM
datahub delete --domain financial  # does not work
fresh-musician-87803
03/14/2024, 2:04 PM

rapid-night-88791
03/14/2024, 3:14 PM

fierce-coat-26780
03/14/2024, 3:42 PM
We are on v0.12.1 and I can't filter the data in my GraphQL query; neither the documentation, Google, nor LLMs could really help me.
I want to query all my datasets of type model on the dbt platform, and it should only show datasets that have a certain property: materialization = ephemeral.
It seems that I can't access the properties of the dataset with the orFilters.
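In case it helps, custom properties are indexed for search as key=value strings in some versions, so a filter on the customProperties field may work. The query below is an assumption to verify against your version's GraphQL schema, not a confirmed recipe.

```graphql
# Hypothetical query: filter dbt datasets by a custom property,
# assuming customProperties is a filterable field encoded as "key=value".
query dbtEphemeralModels {
  searchAcrossEntities(
    input: {
      types: [DATASET]
      query: "*"
      start: 0
      count: 100
      orFilters: [
        {
          and: [
            { field: "platform", values: ["urn:li:dataPlatform:dbt"] }
            { field: "customProperties", values: ["materialization=ephemeral"] }
          ]
        }
      ]
    }
  ) {
    searchResults {
      entity { urn }
    }
  }
}
```

If customProperties isn't filterable in your version, a fallback is to fetch the dbt datasets and filter on the property client-side.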
Details are in this thread.

bland-orange-13353
03/14/2024, 4:09 PM

damp-solstice-31196
03/14/2024, 7:04 PM

hundreds-arm-67649
03/14/2024, 10:55 PM