witty-butcher-82399
07/19/2022, 12:52 PM
PermissionDenied: 403 request failed: the user does not have 'bigquery.readsessions.create' permission for 'projects/XXXXXXXX'
According to the docs, that permission is required only for lineage.
So I tried disabling table lineage with: include_table_lineage: False
However, I'm still getting the same error. Is there any other config setting for disabling table lineage, or is this a bug in the config field?
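For reference, a minimal sketch of what such a recipe might look like when written as a Python dict rather than YAML. Only `include_table_lineage: False`, the `bigquery` source and the `datahub-kafka` sink come from this thread; the `project_id` value is a placeholder, the sink connection details are omitted, and the helper name `lineage_disabled` is hypothetical.

```python
# Sketch of an ingestion recipe as a plain dict (assumption: the real recipe
# was YAML; only the lineage flag, source type and sink type come from the thread).
recipe = {
    "source": {
        "type": "bigquery",
        "config": {
            "project_id": "XXXXXXXX",        # placeholder project
            "include_table_lineage": False,  # the setting tried above
        },
    },
    "sink": {
        "type": "datahub-kafka",
        "config": {},  # connection details intentionally left out
    },
}


def lineage_disabled(recipe_dict):
    """Return True when the source config explicitly turns table lineage off."""
    source_config = recipe_dict.get("source", {}).get("config", {})
    return source_config.get("include_table_lineage") is False


print(lineage_disabled(recipe))
```

The report further down the thread echoes `'include_table_lineage': False`, which suggests the flag was picked up and the error comes from somewhere else.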
witty-butcher-82399
07/19/2022, 12:54 PM
[2022-07-19 12:38:11,217] INFO {datahub.cli.ingest_cli:99} - DataHub CLI version: 0.8.40+0.1.0
/usr/local/lib/python3.9/site-packages/datahub/ingestion/transformer/add_dataset_browse_path.py:33: DeprecationWarning: Call to deprecated class DatasetTransformer. (Legacy transformer that supports transforming MCE-s using transform_one method. Use BaseTransformer directly and implement the transform_aspect method)
return cls(config, ctx)
/usr/local/lib/python3.9/site-packages/datahub/ingestion/transformer/add_dataset_ownership.py:174: DeprecationWarning: Call to deprecated class DatasetTransformer. (Legacy transformer that supports transforming MCE-s using transform_one method. Use BaseTransformer directly and implement the transform_aspect method)
return cls(config, ctx)
[2022-07-19 12:38:14,869] INFO {datahub.cli.ingest_cli:115} - Starting metadata ingestion
[2022-07-19 12:38:15,434] INFO {datahub.ingestion.run.pipeline:104} - sink wrote workunit container-info-mo-data-catalog-dev-rygq-urn:li:container:3cab3e00a5a582ac90f3aaae6264e914
[2022-07-19 12:38:15,506] INFO {datahub.ingestion.run.pipeline:104} - sink wrote workunit container-platforminstance-mo-data-catalog-dev-rygq-urn:li:container:3cab3e00a5a582ac90f3aaae6264e914
[2022-07-19 12:38:15,553] INFO {datahub.ingestion.run.pipeline:104} - sink wrote workunit container-subtypes-mo-data-catalog-dev-rygq-urn:li:container:3cab3e00a5a582ac90f3aaae6264e914
[2022-07-19 12:38:17,311] INFO {datahub.cli.ingest_cli:119} - Source (bigquery) report:
{'workunits_produced': 3,
'workunit_ids': ['container-info-XXXXXXXX-urn:li:container:3cab3e00a5a582ac90f3aaae6264e914',
'container-platforminstance-XXXXXXXX-urn:li:container:3cab3e00a5a582ac90f3aaae6264e914',
'container-subtypes-XXXXXXXX-urn:li:container:3cab3e00a5a582ac90f3aaae6264e914'],
'warnings': {},
'failures': {},
'cli_version': '0.8.40+0.1.0',
'cli_entry_location': '/usr/local/lib/python3.9/site-packages/datahub/__init__.py',
'py_version': '3.9.9 (main, Dec 21 2021, 10:03:34) \n[GCC 10.2.1 20210110]',
'py_exec_path': '/usr/local/bin/python',
'os_details': 'Linux-5.4.92-flatcar-x86_64-with-glibc2.31',
'tables_scanned': 0,
'views_scanned': 0,
'entities_profiled': 0,
'filtered': [],
'soft_deleted_stale_entities': [],
'query_combiner': None,
'num_total_lineage_entries': None,
'num_skipped_lineage_entries_missing_data': None,
'num_skipped_lineage_entries_not_allowed': None,
'num_skipped_lineage_entries_sql_parser_failure': None,
'num_skipped_lineage_entries_other': None,
'num_total_log_entries': None,
'num_parsed_log_entires': None,
'num_total_audit_entries': None,
'num_parsed_audit_entires': None,
'bigquery_audit_metadata_datasets_missing': None,
'lineage_metadata_entries': None,
'include_table_lineage': False,
'use_date_sharded_audit_log_tables': False,
'log_page_size': 1000,
'use_v2_audit_metadata': False,
'use_exported_bigquery_audit_metadata': False,
'start_time': datetime.datetime(2022, 7, 18, 0, 0, tzinfo=datetime.timezone.utc),
'end_time': datetime.datetime(2022, 7, 20, 0, 0, tzinfo=datetime.timezone.utc),
'log_entry_start_time': None,
'log_entry_end_time': None,
'audit_start_time': None,
'audit_end_time': None,
'upstream_lineage': {},
'partition_info': {}}
[2022-07-19 12:38:17,312] INFO {datahub.cli.ingest_cli:122} - Sink (datahub-kafka) report:
{'records_written': 3,
'warnings': [],
'failures': [],
'downstream_start_time': None,
'downstream_end_time': None,
'downstream_total_latency_in_seconds': None}
..................................................
File "/usr/local/lib/python3.9/site-packages/google/api_core/grpc_helpers.py", line 59, in error_remapped_callable
55 def error_remapped_callable(*args, **kwargs):
56 try:
57 return callable_(*args, **kwargs)
58 except grpc.RpcError as exc:
--> 59 raise exceptions.from_grpc_error(exc) from exc
..................................................
args = (
parent: "projects/XXXXXXXX"
read_session {
data_format: ARROW
table: "projects/XXXXXXXX/datasets/_f73150d601bf08c0a38c405e168e4a1391e5c632/tables/anonbade4036_44d1_476e_a120_33a3d83eac50"
read_options {
arrow_serialization_options {
buffer_compression: LZ4_FRAME
}
}
}
max_stream_count: 1
, )
kwargs = {'metadata': [(...), (...), ]}
callable_ = <grpc._channel._UnaryUnaryMultiCallable object at 0x7f7c2b569c70>
grpc.RpcError = <class 'grpc.RpcError'>
exceptions.from_grpc_error = <function 'from_grpc_error' exceptions.py:590>
..................................................
---- (full traceback above) ----
File "/usr/local/lib/python3.9/site-packages/datahub/entrypoints.py", line 149, in main
sys.exit(datahub(standalone_mode=False, **kwargs))
File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/datahub/upgrade/upgrade.py", line 333, in wrapper
res = func(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/datahub/telemetry/telemetry.py", line 338, in wrapper
raise e
File "/usr/local/lib/python3.9/site-packages/datahub/telemetry/telemetry.py", line 290, in wrapper
res = func(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/datahub/utilities/memory_leak_detector.py", line 102, in wrapper
res = func(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/datahub/cli/ingest_cli.py", line 131, in run
raise e
File "/usr/local/lib/python3.9/site-packages/datahub/cli/ingest_cli.py", line 117, in run
pipeline.run()
File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/run/pipeline.py", line 215, in run
for wu in itertools.islice(
File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/sql/bigquery.py", line 905, in get_workunits
for wu in super().get_workunits():
File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/sql/sql_common.py", line 725, in get_workunits
self.add_information_for_schema(inspector, schema)
File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/sql/bigquery.py", line 757, in add_information_for_schema
for row in result.fetchall():
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/result.py", line 1288, in fetchall
self.connection._handle_dbapi_exception(
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1514, in _handle_dbapi_exception
util.raise_(exc_info[1], with_traceback=exc_info[2])
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/result.py", line 1284, in fetchall
l = self.process_rows(self._fetchall_impl())
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/result.py", line 1230, in _fetchall_impl
return self.cursor.fetchall()
File "/usr/local/lib/python3.9/site-packages/google/cloud/bigquery/dbapi/_helpers.py", line 494, in with_closed_check
return method(self, *args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/google/cloud/bigquery/dbapi/cursor.py", line 382, in fetchall
self._try_fetch()
File "/usr/local/lib/python3.9/site-packages/google/cloud/bigquery/dbapi/cursor.py", line 256, in _try_fetch
rows_iterable = self._bqstorage_fetch(bqstorage_client)
File "/usr/local/lib/python3.9/site-packages/google/cloud/bigquery/dbapi/cursor.py", line 295, in _bqstorage_fetch
read_session = bqstorage_client.create_read_session(
File "/usr/local/lib/python3.9/site-packages/google/cloud/bigquery_storage_v1/services/big_query_read/client.py", line 615, in create_read_session
response = rpc(
File "/usr/local/lib/python3.9/site-packages/google/api_core/gapic_v1/method.py", line 154, in __call__
return wrapped_func(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/google/api_core/retry.py", line 283, in retry_wrapped_func
return retry_target(
File "/usr/local/lib/python3.9/site-packages/google/api_core/retry.py", line 190, in retry_target
return target()
File "/usr/local/lib/python3.9/site-packages/google/api_core/grpc_helpers.py", line 59, in error_remapped_callable
raise exceptions.from_grpc_error(exc) from exc
PermissionDenied: 403 request failed: the user does not have 'bigquery.readsessions.create' permission for 'projects/mo-data-catalog-dev-rygq'
[2022-07-19 12:38:17,784] INFO {datahub.entrypoints:176} - DataHub CLI version: 0.8.40+0.1.0 at /usr/local/lib/python3.9/site-packages/datahub/__init__.py
[2022-07-19 12:38:17,784] INFO {datahub.entrypoints:179} - Python version: 3.9.9 (main, Dec 21 2021, 10:03:34)
[GCC 10.2.1 20210110] at /usr/local/bin/python on Linux-5.4.92-flatcar-x86_64-with-glibc2.31
[2022-07-19 12:38:17,784] INFO {datahub.entrypoints:182} - GMS config {}
Stream closed EOF for datahighway-dev/demo-ingestion-bigquery-manual-ncj-ndnv5 (crawler)
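In the traceback above, the failing call is reached from add_information_for_schema through cursor.fetchall, _bqstorage_fetch and create_read_session, not through any lineage code path. When triaging logs like this, the missing permission and the resource can be pulled out of the final line. A small sketch, assuming the message always has the shape shown above; the function name `parse_permission_denied` is hypothetical.

```python
import re

# Matches messages shaped like:
#   ... the user does not have '<permission>' permission for '<resource>'
_DENIED = re.compile(
    r"does not have '(?P<permission>[^']+)' permission for '(?P<resource>[^']+)'"
)


def parse_permission_denied(message):
    """Return (permission, resource) from a 403 message, or None if no match."""
    match = _DENIED.search(message)
    if match is None:
        return None
    return match["permission"], match["resource"]


line = (
    "PermissionDenied: 403 request failed: the user does not have "
    "'bigquery.readsessions.create' permission for 'projects/XXXXXXXX'"
)
print(parse_permission_denied(line))
```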
square-activity-64562
07/22/2022, 12:55 PM
logging.logEntries.list
logging.privateLogEntries.list
witty-butcher-82399
07/22/2022, 1:00 PM
# basic requirements
bigquery.datasets.get
bigquery.datasets.getIamPolicy
bigquery.jobs.create
bigquery.jobs.list
bigquery.jobs.listAll
bigquery.models.getMetadata
bigquery.models.list
bigquery.routines.get
bigquery.routines.list
bigquery.tables.get
resourcemanager.projects.get
bigquery.readsessions.create
bigquery.readsessions.getData
# requirements if profiling enabled
bigquery.tables.create
bigquery.tables.getData
bigquery.tables.list
# requirements if table lineage enabled
logging.logEntries.list
logging.privateLogEntries.list
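The grouping above can be turned into a small checker that reports which permissions a role is still missing for a given set of enabled features. This is only a sketch: the three groups are copied from the list as posted, the function names are hypothetical, and the thread later describes the logging permissions as needed for lineage generation via GCP logging rather than for table lineage alone.

```python
# Permission groups copied from the list in this thread.
BASIC = [
    "bigquery.datasets.get",
    "bigquery.datasets.getIamPolicy",
    "bigquery.jobs.create",
    "bigquery.jobs.list",
    "bigquery.jobs.listAll",
    "bigquery.models.getMetadata",
    "bigquery.models.list",
    "bigquery.routines.get",
    "bigquery.routines.list",
    "bigquery.tables.get",
    "resourcemanager.projects.get",
    "bigquery.readsessions.create",
    "bigquery.readsessions.getData",
]
PROFILING = [
    "bigquery.tables.create",
    "bigquery.tables.getData",
    "bigquery.tables.list",
]
LINEAGE = [
    "logging.logEntries.list",
    "logging.privateLogEntries.list",
]


def required_permissions(profiling=False, lineage=False):
    """Return the permissions needed for the enabled features, in list order."""
    needed = list(BASIC)
    if profiling:
        needed += PROFILING
    if lineage:
        needed += LINEAGE
    return needed


def missing_permissions(granted, profiling=False, lineage=False):
    """Return required permissions that are absent from `granted`."""
    granted = set(granted)
    return [p for p in required_permissions(profiling, lineage) if p not in granted]


# A role with every basic permission except the read-session one still fails,
# even with lineage and profiling switched off.
role = [p for p in BASIC if p != "bigquery.readsessions.create"]
print(missing_permissions(role))
```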
square-activity-64562
07/22/2022, 1:01 PM
witty-butcher-82399
07/22/2022, 1:03 PM
bigquery.tables.getData. Is that correct?
witty-butcher-82399
07/22/2022, 1:04 PM
square-activity-64562
07/22/2022, 1:06 PM
witty-butcher-82399
07/22/2022, 1:11 PM
square-activity-64562
07/22/2022, 1:12 PM
via GCP logging
square-activity-64562
07/22/2022, 1:12 PM
witty-butcher-82399
07/22/2022, 1:12 PM
square-activity-64562
07/22/2022, 1:13 PM
needed for lineage generation via GCP logging
not just table lineage
square-activity-64562
07/22/2022, 1:14 PM
square-activity-64562
07/22/2022, 1:14 PM