plain-farmer-27314
02/22/2022, 4:40 PM

plain-farmer-27314
02/22/2022, 4:41 PM
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - Traceback (most recent call last):
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 813, in _generate_single_profile
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - pretty_name=pretty_name,
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 869, in _get_ge_dataset
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - **batch_kwargs,
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/site-packages/great_expectations/data_context/data_context.py", line 1645, in get_batch
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - batch_parameters=batch_parameters,
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/site-packages/great_expectations/data_context/data_context.py", line 1348, in _get_batch_v2
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - return validator.get_dataset()
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/site-packages/great_expectations/validator/validator.py", line 1942, in get_dataset
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - **self.batch.batch_kwargs.get("dataset_options", {}),
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/site-packages/great_expectations/dataset/sqlalchemy_dataset.py", line 641, in __init__
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - "No BigQuery dataset specified. Use bigquery_temp_table batch_kwarg or a specify a "
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - ValueError: No BigQuery dataset specified. Use bigquery_temp_table batch_kwarg or a specify a default dataset in engine url
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - Limit and offset parameters are ignored when using query-based batch_kwargs; consider adding limit and offset directly to the generated query.
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - Encountered exception while profiling discord-data-analytics-prd.dem.sketch_guild_num_users_with_voice_or_video_call_instance
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - Traceback (most recent call last):
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 813, in _generate_single_profile
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - pretty_name=pretty_name,
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 869, in _get_ge_dataset
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - **batch_kwargs,
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/site-packages/great_expectations/data_context/data_context.py", line 1645, in get_batch
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - batch_parameters=batch_parameters,
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/site-packages/great_expectations/data_context/data_context.py", line 1348, in _get_batch_v2
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - return validator.get_dataset()
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/site-packages/great_expectations/validator/validator.py", line 1942, in get_dataset
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - **self.batch.batch_kwargs.get("dataset_options", {}),
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/site-packages/great_expectations/dataset/sqlalchemy_dataset.py", line 641, in __init__
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - "No BigQuery dataset specified. Use bigquery_temp_table batch_kwarg or a specify a "
[2022-02-22, 16:21:22 UTC] {pod_launcher.py:149} INFO - ValueError: No BigQuery dataset specified. Use bigquery_temp_table batch_kwarg or a specify a default dataset in engine url
dazzling-judge-80093
02/22/2022, 4:46 PM
You need to set the bigquery_temp_table_schema property in the config.
To be able to profile partitioned datasets, Great Expectations (the framework we use for profiling under the hood) needs to create temporary tables, and for that we need a schema where it can create them.
These tables are purged at the end.
https://datahubproject.io/docs/metadata-ingestion/source_docs/bigquery#profiling
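For anyone finding this thread later, here is a minimal recipe sketch of where that property sits. The project, dataset, and sink endpoint names below are made up, and the exact field name, placement, and value format should be verified against the docs linked above:

source:
  type: bigquery
  config:
    project_id: my-gcp-project                    # hypothetical GCP project id
    profiling:
      enabled: true
    # Dataset (schema) the profiler may use to create its temporary tables when
    # profiling partitioned/sharded tables; value format assumed to be project.dataset.
    bigquery_temp_table_schema: my-gcp-project.datahub_profiling_temp
sink:
  type: datahub-rest
  config:
    server: http://localhost:8080                 # hypothetical DataHub GMS endpoint

As noted above, the temporary tables created in that dataset are dropped once profiling finishes.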
plain-farmer-27314
02/22/2022, 4:50 PM

plain-farmer-27314
02/22/2022, 4:54 PM

dazzling-judge-80093
02/22/2022, 5:07 PM

dazzling-judge-80093
02/22/2022, 5:07 PM

plain-farmer-27314
02/22/2022, 7:56 PM

plain-farmer-27314
02/22/2022, 7:57 PM

dazzling-judge-80093
02/23/2022, 10:11 AM

plain-farmer-27314
03/01/2022, 3:53 PM

dazzling-judge-80093
03/02/2022, 6:10 AM

dazzling-judge-80093
03/02/2022, 8:30 AM

plain-farmer-27314
03/02/2022, 2:10 PM