early-island-92859
03/29/2023, 11:11 PM
[2023-03-21, 10:34:11 UTC] {ge_data_profiler.py:917} ERROR - Encountered exception while profiling BIGQUERY_PROJECT.dataset1.table1
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/bigquery/dbapi/cursor.py", line 203, in _execute
self._query_job.result()
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/bigquery/job/query.py", line 1499, in result
do_get_result()
File "/home/airflow/.local/lib/python3.7/site-packages/google/api_core/retry.py", line 288, in retry_wrapped_func
on_error=on_error,
File "/home/airflow/.local/lib/python3.7/site-packages/google/api_core/retry.py", line 190, in retry_target
return target()
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/bigquery/job/query.py", line 1489, in do_get_result
super(QueryJob, self).result(retry=retry, timeout=timeout)
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/bigquery/job/base.py", line 728, in result
return super(_AsyncJob, self).result(timeout=timeout, **kwargs)
File "/home/airflow/.local/lib/python3.7/site-packages/google/api_core/future/polling.py", line 137, in result
raise self._exception
google.api_core.exceptions.NotFound: 404 Not found: Dataset EXTRACTOR_PROJECT:table1 was not found in location US
Location: US
Job ID: 11111111-1111-1111-1111-111111111111
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 912, in _generate_single_profile
cursor.execute(bq_sql)
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/bigquery/dbapi/_helpers.py", line 494, in with_closed_check
return method(self, *args, **kwargs)
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/bigquery/dbapi/cursor.py", line 167, in execute
formatted_operation, parameters, job_id, job_config, parameter_types
File "/home/airflow/.local/lib/python3.7/site-packages/google/cloud/bigquery/dbapi/cursor.py", line 205, in _execute
raise exceptions.DatabaseError(exc)
google.cloud.bigquery.dbapi.exceptions.DatabaseError: 404 Not found: Dataset EXTRACTOR_PROJECT:table1 was not found in location US
When I look at the failed job ID in EXTRACTOR_PROJECT, I see that there is no project ID in front of the dataset ID. So I believe BigQuery looks for the table in EXTRACTOR_PROJECT and returns 404 because that table is in BIGQUERY_PROJECT.
SELECT * FROM `dataset1.table1` LIMIT 10000
For comparison, when I look at the successful job IDs, I see that the project ID is added before the dataset ID:
SELECT
*
FROM
`BIGQUERY_PROJECT.dataset2.table2`
WHERE
DATE(`date`) BETWEEN DATE('2023-03-20 00:00:00') AND DATE('2023-03-21 00:00:00')
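To double-check my understanding, here is a minimal sketch (my own reproduction with the google-cloud-bigquery client, not DataHub code) of why the unqualified form fails: the client resolves dataset.table against its own default project, so dataset1.table1 is looked up in EXTRACTOR_PROJECT rather than BIGQUERY_PROJECT.

from google.cloud import bigquery

# Unqualified table references resolve against the client's default project.
client = bigquery.Client(project="EXTRACTOR_PROJECT")

# Resolved as EXTRACTOR_PROJECT.dataset1.table1 -> 404, because the dataset
# actually lives in BIGQUERY_PROJECT.
# client.query("SELECT * FROM `dataset1.table1` LIMIT 10").result()

# Fully qualified -> found, regardless of the client's default project.
rows = client.query(
    "SELECT * FROM `BIGQUERY_PROJECT.dataset1.table1` LIMIT 10"
).result()
print(rows.total_rows)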
Why is DataHub not adding BIGQUERY_PROJECT to all queries? Can someone help me resolve it?
witty-motorcycle-52108
03/29/2023, 11:41 PM
red-plumber-64268
03/30/2023, 7:53 AM
⏳ Pipeline running successfully so far; produced 157 events in 1 minute and 2 seconds.
[2023-03-30 07:48:19,731] ERROR {datahub.ingestion.source.bigquery_v2.bigquery:636} - Traceback (most recent call last):
File "/tmp/datahub/ingest/venv-bigquery-0.10.1/lib/python3.10/site-packages/datahub/ingestion/source/bigquery_v2/bigquery.py", line 626, in _process_project
yield from self._process_schema(
File "/tmp/datahub/ingest/venv-bigquery-0.10.1/lib/python3.10/site-packages/datahub/ingestion/source/bigquery_v2/bigquery.py", line 782, in _process_schema
yield from self._process_view(
File "/tmp/datahub/ingest/venv-bigquery-0.10.1/lib/python3.10/site-packages/datahub/ingestion/source/bigquery_v2/bigquery.py", line 882, in _process_view
yield from self.gen_view_dataset_workunits(
File "/tmp/datahub/ingest/venv-bigquery-0.10.1/lib/python3.10/site-packages/datahub/ingestion/source/bigquery_v2/bigquery.py", line 951, in gen_view_dataset_workunits
yield from self.gen_dataset_workunits(
File "/tmp/datahub/ingest/venv-bigquery-0.10.1/lib/python3.10/site-packages/datahub/ingestion/source/bigquery_v2/bigquery.py", line 1007, in gen_dataset_workunits
lastModified=TimeStamp(time=int(table.last_altered.timestamp() * 1000))
AttributeError: 'int' object has no attribute 'timestamp'
[2023-03-30 07:48:19,731] ERROR {datahub.ingestion.source.bigquery_v2.bigquery:637} - Unable to get tables for dataset dashboards in project annotell-com, skipping. Does your service account has bigquery.tables.list, bigquery.routines.get, bigquery.routines.list permission, bigquery.tables.getData permission? The error was: 'int' object has no attribute 'timestamp'
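For reference, a minimal sketch of the kind of defensive conversion I would expect around that call (assuming last_altered arrives here as epoch milliseconds instead of a datetime; this is not the actual DataHub code, just an illustration of the mismatch):

from datetime import datetime, timezone

def to_epoch_millis(last_altered) -> int:
    # Normalize a last-altered value to epoch milliseconds, accepting either
    # a datetime (what the code above expects) or a raw int/float (what the
    # traceback suggests is actually being passed here).
    if isinstance(last_altered, datetime):
        return int(last_altered.timestamp() * 1000)
    if isinstance(last_altered, (int, float)):
        return int(last_altered)  # assumed to already be epoch milliseconds
    raise TypeError(f"Unexpected last_altered type: {type(last_altered)}")

# Both forms map to the same value.
dt = datetime(2023, 3, 30, 7, 48, 19, tzinfo=timezone.utc)
assert to_epoch_millis(dt) == to_epoch_millis(dt.timestamp() * 1000)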
salmon-angle-92685
03/30/2023, 10:10 AM
Table -> View -> Application
track.
Is there any way of doing this?
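Concretely, something like the sketch below is the kind of lineage I am after (hypothetical URNs, using the Python REST emitter; the view -> application edge is the part I am unsure how to model):

from datahub.emitter.mce_builder import make_dataset_urn, make_lineage_mce
from datahub.emitter.rest_emitter import DatahubRestEmitter

emitter = DatahubRestEmitter(gms_server="http://localhost:8080")

table = make_dataset_urn(platform="snowflake", name="db.schema.my_table", env="PROD")
view = make_dataset_urn(platform="snowflake", name="db.schema.my_view", env="PROD")

# Table -> View: the view is downstream of the table.
emitter.emit_mce(make_lineage_mce(upstream_urns=[table], downstream_urn=view))

# View -> Application: which entity type should represent the application?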
Thank you so much for your help!
salmon-angle-92685
03/30/2023, 10:15 AM
boundless-nail-65912
03/30/2023, 11:35 AM
proud-dusk-671
03/30/2023, 11:52 AM
salmon-angle-92685
03/30/2023, 12:12 PM
enough-noon-12106
03/30/2023, 12:44 PM
Last synchronized *14 hours ago*
or lastModified using the Python emitter in push-based ingestion
cool-tiger-42613
03/30/2023, 1:07 PM
rich-state-73859
03/30/2023, 4:55 PM
failed to write record with workunit urn:li:assertion:c3675908211ca5988d94475197414b71-assertionInfo with ('Unable to emit metadata to DataHub GMS', {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException', 'stackTrace': 'com.linkedin.restli.server.RestLiServiceException [HTTP Status:500]: javax.persistence.PersistenceException: Error when batch flush on sql: insert into metadata_aspect_v2 (urn, aspect, version, metadata, createdOn, createdBy, createdFor, systemmetadata) values (?,?,?,?,?,?,?,?)\n\tat com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:42)\n\tat com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:50)', 'message': 'javax.persistence.PersistenceException: Error when batch flush on sql: insert into metadata_aspect_v2 (urn, aspect, version, metadata, createdOn, createdBy, createdFor, systemmetadata) values (?,?,?,?', 'status': 500, 'id': 'urn:li:assertion:c3675908211ca5988d94475197414b71'}) and info {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException', 'stackTrace': 'com.linkedin.restli.server.RestLiServiceException [HTTP Status:500]: javax.persistence.PersistenceException: Error when batch flush on sql: insert into metadata_aspect_v2 (urn, aspect, version, metadata, createdOn, createdBy, createdFor, systemmetadata) values (?,?,?,?,?,?,?,?)\n\tat com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:42)\n\tat com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:50)', 'message': 'javax.persistence.PersistenceException: Error when batch flush on sql: insert into metadata_aspect_v2 (urn, aspect, version, metadata, createdOn, createdBy, createdFor, systemmetadata) values (?,?,?,?', 'status': 500, 'id': 'urn:li:assertion:c3675908211ca5988d94475197414b71'}
lemon-scooter-69730
03/31/2023, 12:17 PM
lively-spring-5482
03/31/2023, 2:26 PM
resource,subresource,glossary_terms,tags,owners,ownership_type,description,domain
"urn:li:dataset:(urn:li:dataPlatform:snowflake,prd_dwh.test_schema.h_test,PROD)",,[],,,,,,
Exception thrown:
{'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException',
'stackTrace': 'com.linkedin.restli.server.RestLiServiceException [HTTP Status:422]: Failed to validate record with class '
'com.linkedin.common.GlossaryTerms: ERROR :: /terms/0/urn :: "Provided urn " is invalid\n'
'\n'
tall-vr-26334
03/31/2023, 2:48 PM
source:
  type: s3
  config:
    platform: s3
    path_specs:
      -
        include: 'my_bucket_name'
    aws_config:
      aws_region: my-region
      aws_role:
        -
          RoleArn: 'arn'
          ExternalId: <external_id>
Here is the error I'm getting
~~~~ Execution Summary - RUN_INGEST ~~~~
Execution finished with errors.
{'exec_id': '3ad02c48-58b6-4ff9-9a23-a7ea2268f308',
'infos': ['2023-03-31 14:42:26.606317 INFO: Starting execution for task with name=RUN_INGEST',
"2023-03-31 14:42:34.708565 INFO: Failed to execute 'datahub ingest'",
'2023-03-31 14:42:34.708757 INFO: Caught exception EXECUTING task_id=3ad02c48-58b6-4ff9-9a23-a7ea2268f308, name=RUN_INGEST, '
'stacktrace=Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 122, in execute_task\n'
' task_event_loop.run_until_complete(task_future)\n'
' File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete\n'
' return future.result()\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 231, in execute\n'
' raise TaskError("Failed to execute \'datahub ingest\'")\n'
"acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"],
'errors': []}
~~~~ Ingestion Report ~~~~
{
"cli": {
"cli_version": "0.10.1",
"cli_entry_location": "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/__init__.py",
"py_version": "3.10.10 (main, Mar 14 2023, 02:37:11) [GCC 10.2.1 20210110]",
"py_exec_path": "/tmp/datahub/ingest/venv-s3-0.10.1/bin/python3",
"os_details": "Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31",
"peak_memory_usage": "167.37 MB",
"mem_info": "167.37 MB"
},
"source": {
"type": "s3",
"report": {
"events_produced": 0,
"events_produced_per_sec": 0,
"entities": {},
"aspects": {},
"warnings": {},
"failures": {},
"filtered": [],
"start_time": "2023-03-31 14:42:29.926640 (2.33 seconds ago)",
"running_time": "2.33 seconds"
}
},
"sink": {
"type": "datahub-rest",
"report": {
"total_records_written": 0,
"records_written_per_second": 0,
"warnings": [],
"failures": [],
"start_time": "2023-03-31 14:42:29.031984 (3.22 seconds ago)",
"current_time": "2023-03-31 14:42:32.252517 (now)",
"total_duration_in_seconds": 3.22,
"gms_version": "v0.10.1",
"pending_requests": 0
}
}
}
~~~~ Ingestion Logs ~~~~
Obtaining venv creation lock...
Acquired venv creation lock
venv setup time = 0
This version of datahub supports report-to functionality
datahub ingest run -c /tmp/datahub/ingest/3ad02c48-58b6-4ff9-9a23-a7ea2268f308/recipe.yml --report-to /tmp/datahub/ingest/3ad02c48-58b6-4ff9-9a23-a7ea2268f308/ingestion_report.json
[2023-03-31 14:42:28,995] INFO {datahub.cli.ingest_cli:173} - DataHub CLI version: 0.10.1
[2023-03-31 14:42:29,035] INFO {datahub.ingestion.run.pipeline:184} - Sink configured successfully. DataHubRestEmitter: configured to talk to <http://datahub-gms:8080>
[2023-03-31 14:42:29,599] ERROR {logger:26} - Please set env variable SPARK_VERSION
[2023-03-31 14:42:29,600] INFO {logger:27} - Using deequ: com.amazon.deequ:deequ:1.2.2-spark-3.0
[2023-03-31 14:42:30,227] INFO {datahub.ingestion.run.pipeline:201} - Source configured successfully.
[2023-03-31 14:42:30,230] INFO {datahub.cli.ingest_cli:129} - Starting metadata ingestion
[2023-03-31 14:42:32,253] INFO {datahub.ingestion.reporting.file_reporter:52} - Wrote UNKNOWN report successfully to <_io.TextIOWrapper name='/tmp/datahub/ingest/3ad02c48-58b6-4ff9-9a23-a7ea2268f308/ingestion_report.json' mode='w' encoding='UTF-8'>
[2023-03-31 14:42:32,253] INFO {datahub.cli.ingest_cli:134} - Source (s3) report:
{'events_produced': 0,
'events_produced_per_sec': 0,
'entities': {},
'aspects': {},
'warnings': {},
'failures': {},
'filtered': [],
'start_time': '2023-03-31 14:42:29.926640 (2.33 seconds ago)',
'running_time': '2.33 seconds'}
[2023-03-31 14:42:32,254] INFO {datahub.cli.ingest_cli:137} - Sink (datahub-rest) report:
{'total_records_written': 0,
'records_written_per_second': 0,
'warnings': [],
'failures': [],
'start_time': '2023-03-31 14:42:29.031984 (3.22 seconds ago)',
'current_time': '2023-03-31 14:42:32.253876 (now)',
'total_duration_in_seconds': 3.22,
'gms_version': 'v0.10.1',
'pending_requests': 0}
[2023-03-31 14:42:32,543] ERROR {datahub.entrypoints:192} - Command failed: 'NoneType' object has no attribute 'access_key'
Traceback (most recent call last):
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/entrypoints.py", line 179, in main
sys.exit(datahub(standalone_mode=False, **kwargs))
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 379, in wrapper
raise e
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 334, in wrapper
res = func(*args, **kwargs)
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/utilities/memory_leak_detector.py", line 95, in wrapper
return func(ctx, *args, **kwargs)
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 198, in run
loop.run_until_complete(run_func_check_upgrade(pipeline))
File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 158, in run_func_check_upgrade
ret = await the_one_future
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 149, in run_pipeline_async
return await loop.run_in_executor(
File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 140, in run_pipeline_to_completion
raise e
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 132, in run_pipeline_to_completion
pipeline.run()
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 339, in run
for wu in itertools.islice(
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/ingestion/source/s3/source.py", line 744, in get_workunits
for file, timestamp, size in file_browser:
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/ingestion/source/s3/source.py", line 656, in s3_browser
s3 = self.source_config.aws_config.get_s3_resource(
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/ingestion/source/aws/aws_common.py", line 183, in get_s3_resource
resource = self.get_session().resource(
File "/tmp/datahub/ingest/venv-s3-0.10.1/lib/python3.10/site-packages/datahub/ingestion/source/aws/aws_common.py", line 139, in get_session
"AccessKeyId": current_credentials.access_key,
AttributeError: 'NoneType' object has no attribute 'access_key'
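For context, here is a minimal sketch of what I think is happening (plain boto3, not DataHub code): when no access keys, profile, or instance/role credentials can be resolved, the session's credentials object is None, which is exactly where this AttributeError comes from.

import boto3

# boto3 resolves credentials from env vars, shared config/credentials files,
# or an instance/task role. If none of those are available, get_credentials()
# returns None.
session = boto3.Session(region_name="my-region")
creds = session.get_credentials()

if creds is None:
    print("No AWS credentials resolved; reading creds.access_key would raise "
          "AttributeError: 'NoneType' object has no attribute 'access_key'")
else:
    frozen = creds.get_frozen_credentials()  # snapshot of key/secret/token
    print("Resolved access key id:", frozen.access_key)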
salmon-angle-92685
03/31/2023, 3:28 PM
loud-application-42754
03/31/2023, 5:49 PM
lively-dusk-19162
04/01/2023, 7:26 AM
mysterious-table-75773
04/02/2023, 8:54 PM
bitter-evening-61050
04/03/2023, 6:17 AM
lively-raincoat-33818
04/03/2023, 8:11 AM
gifted-diamond-19544
04/03/2023, 11:13 AM
lively-night-46534
04/03/2023, 11:23 AM
gray-angle-76914
04/03/2023, 1:36 PM
~~~~ Execution Summary ~~~~
RUN_INGEST - {'errors': [],
'exec_id': '6619ef27-9cba-4790-ac3f-538f8a4c3c08',
'infos': ['2023-04-03 11:07:52.379075 [exec_id=6619ef27-9cba-4790-ac3f-538f8a4c3c08] INFO: Starting execution for task with name=RUN_INGEST',
'2023-04-03 11:07:52.390782 [exec_id=6619ef27-9cba-4790-ac3f-538f8a4c3c08] INFO: Caught exception EXECUTING '
'task_id=6619ef27-9cba-4790-ac3f-538f8a4c3c08, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 123, in execute_task\n'
' task_event_loop.run_until_complete(task_future)\n'
' File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete\n'
' return future.result()\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 70, in execute\n'
' recipe: dict = SubProcessTaskUtil._resolve_recipe(validated_args.recipe, ctx, self.ctx)\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_task_common.py", line 107, in _resolve_recipe\n'
' json_recipe = json.loads(resolved_recipe)\n'
' File "/usr/local/lib/python3.10/json/__init__.py", line 346, in loads\n'
' return _default_decoder.decode(s)\n'
' File "/usr/local/lib/python3.10/json/decoder.py", line 337, in decode\n'
' obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n'
' File "/usr/local/lib/python3.10/json/decoder.py", line 353, in raw_decode\n'
' obj, end = self.scan_once(s, idx)\n'
'json.decoder.JSONDecodeError: Invalid control character at: line 1 column 1027 (char 1026)\n']}
Execution finished with errors.
I am using the following recipe:
source:
  type: snowflake
  config:
    stateful_ingestion:
      enabled: false
    env: DEV
    platform_instance: <platform>
    authentication_type: KEY_PAIR_AUTHENTICATOR
    private_key_password: '${snf_dev_pkp}'
    private_key: '${snf_dev_pk_str}'
    convert_urns_to_lowercase: true
    include_external_url: true
    database_pattern:
      ignoreCase: true
    include_technical_schema: true
    include_tables: false
    include_table_lineage: false
    include_table_location_lineage: true
    include_views: false
    include_view_linage: true
    include_column_lineage: true
    ignore_start_time_lineage: true
    include_usage_stats: true
    store_last_usage_extraction_timestamp: true
    top_n_queries: 10
    include_top_n_queries: true
    format_sql_queries: true
    include_operational_stats: true
    include_read_operational_stats: false
    apply_view_usage_to_tables: true
    email_domain: none
    store_last_profiling_timestamps: true
    profiling:
      enabled: false
      turn_off_expensive_profiling_metrics: false
      profile_table_level_only: true
      include_field_null_count: true
      include_field_distinct_count: true
      include_field_min_value: true
      include_field_max_value: true
      include_field_mean_value: true
      include_field_median_value: true
      include_field_stddev_value: true
      include_field_quantiles: true
      include_field_distinct_value_frequencies: true
      include_field_histogram: true
      include_field_sample_values: true
      field_sample_values_limit: 4
      query_combiner_enabled: true
sink:
  type: datahub-rest
  config:
    server: <server>
where private_key_password and private_key have been added as secrets. Any idea what could be the error?
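For what it's worth, a minimal sketch of my suspicion (assuming the secrets are substituted verbatim into a JSON form of the recipe before it is parsed): a multi-line private key contains raw newlines, and json.loads rejects raw control characters inside a string, which matches the "Invalid control character" error above.

import json

pem = "-----BEGIN PRIVATE KEY-----\nMIIEvQIBADANBg...\n-----END PRIVATE KEY-----\n"

# Raw newlines inside a JSON string literal are invalid control characters.
broken = '{"private_key": "' + pem + '"}'
try:
    json.loads(broken)
except json.JSONDecodeError as e:
    print("Fails as expected:", e)

# With the newlines escaped (what json.dumps does), parsing succeeds.
ok = json.dumps({"private_key": pem})
print(json.loads(ok)["private_key"] == pem)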
Thanks!
colossal-finland-28298
04/04/2023, 4:23 AM
source:
  type: postgres
  config:
    host_port: localhost:5432
    username: postgres
    # database:
    # password:
    include_tables: true
    include_views: true
    schema_pattern:
      deny:
        - information_schema
        - pg_catalog
    profiling:
      enabled: true
      include_field_distinct_count: false
      include_field_min_value: false
      include_field_median_value: false
      include_field_max_value: false
      include_field_mean_value: false
      include_field_stddev_value: false
      partition_profiling_enabled: false
      catch_exceptions: false
      query_combiner_enabled: false
      include_field_null_count: false
      field_sample_values_limit: 10
      include_field_distinct_value_frequencies: false
      include_field_histogram: false
      include_field_quantiles: false
      query_combiner_enabled: false
      profile_table_row_count_estimate_only: true
      turn_off_expensive_profiling_metrics: true
include_field_sample_values
defaults to true, so I did not put that option in the recipe.
I found in the ingestion log that it is still gathering distinct counts even though I turned off all profiling options except the one that collects sample values (data).
2023-03-31 18:34:32,666 INFO sqlalchemy.engine.Engine [cached since 0.3957s ago] {}
2023-03-31 18:34:32,668 INFO sqlalchemy.engine.Engine SELECT count(distinct(bid)) AS count_1
FROM public.pgbench_branches
How can I turn distinct count off in profiling mode?
Thank you in advance!
curved-planet-99787
04/04/2023, 6:53 AM
0.10.1).
After successfully logging in to Tableau and retrieving all projects, the ingestion starts querying the Metadata API, which ends with the following error:
2023-04-04 06:42:00,701 - [INFO] - [tableau.endpoint.datasources:73] - Querying all datasources on site
2023-04-04 06:42:00,795 - [INFO] - [tableau.endpoint.metadata:61] - Querying Metadata API
2023-04-04 06:42:00,868 - [ERROR] - [datahub.ingestion.run.pipeline:389] - Caught error
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 339, in run
for wu in itertools.islice(
File "/usr/local/lib/python3.10/site-packages/datahub/utilities/source_helpers.py", line 90, in auto_stale_entity_removal
for wu in stream:
File "/usr/local/lib/python3.10/site-packages/datahub/utilities/source_helpers.py", line 41, in auto_status_aspect
for wu in stream:
File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 2247, in get_workunits_internal
yield from self.emit_workbooks()
File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 708, in emit_workbooks
for workbook in self.get_connection_objects(
File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 687, in get_connection_objects
) = self.get_connection_object_page(
File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 650, in get_connection_object_page
raise RuntimeError(f"Query {connection_type} error: {errors}")
RuntimeError: Query workbooksConnection error: [{'message': "Validation error of type FieldUndefined: Field 'projectLuid' in type 'Workbook' is undefined @ 'workbooksConnection/nodes/projectLuid'", 'locations': [{'line': 9, 'column': 7, 'sourceName': None}], 'description': "Field 'projectLuid' in type 'Workbook' is undefined", 'validationErrorType': 'FieldUndefined', 'queryPath': ['workbooksConnection', 'nodes', 'projectLuid'], 'errorType': 'ValidationError', 'extensions': None, 'path': None}]
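Not sure if this helps narrow it down, but here is a rough sketch of how one could check whether this Tableau Server's Metadata API even exposes projectLuid on Workbook (assuming the standard /api/metadata/graphql endpoint and an already-obtained REST API auth token; which fields exist depends on the Tableau Server version):

import requests

TABLEAU_SERVER = "https://tableau.example.com"  # placeholder server URL
AUTH_TOKEN = "<X-Tableau-Auth token from a sign-in call>"

# GraphQL introspection: list the fields this server defines on Workbook.
query = '{ __type(name: "Workbook") { fields { name } } }'

resp = requests.post(
    f"{TABLEAU_SERVER}/api/metadata/graphql",
    json={"query": query},
    headers={"X-Tableau-Auth": AUTH_TOKEN},
)
resp.raise_for_status()
fields = {f["name"] for f in resp.json()["data"]["__type"]["fields"]}
print("projectLuid available:", "projectLuid" in fields)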
Can someone help me by pointing me to the potential root cause?
I can't really trace the problem here, but it looks like a DataHub internal issue to me.
purple-microphone-86243
04/04/2023, 7:58 AM
most-animal-32096
04/04/2023, 3:47 PM
metadata-ingestion layer and DataProcessInstance entity schema/aspects definition, which leads to a 500 error on the REST POST call to datahub-gms.
Feel free to comment with your opinion on any possible solution (as there are 2 possible sides to fix it).
green-lion-58215
04/04/2023, 9:46 PM
steep-waitress-15973
04/05/2023, 3:01 AM
acceptable-midnight-32657
04/05/2023, 8:29 AM