gray-spoon-5206
02/10/2022, 6:36 AM
dna-datahub-spike ➤ datahub ingest -c ./snowflake-ingestion.yml git:master*
[2022-02-10 17:33:13,614] ERROR {datahub.entrypoints:119} - Stackprinter failed while formatting <FrameInfo /opt/anaconda3/lib/python3.8/site-packages/datahub/ingestion/source/sql/sql_common.py, line 221, scope SQLAlchemyConfig>:
  File "/opt/anaconda3/lib/python3.8/site-packages/stackprinter/frame_formatting.py", line 224, in select_scope
    raise Exception("Picked an invalid source context: %s" % info)
Exception: Picked an invalid source context: [221], [192], dict_keys([192, 193])
So here is your original traceback at least:
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.8/site-packages/datahub/cli/ingest_cli.py", line 77, in run
    pipeline = Pipeline.create(pipeline_config, dry_run, preview)
  File "/opt/anaconda3/lib/python3.8/site-packages/datahub/ingestion/run/pipeline.py", line 175, in create
    return cls(config, dry_run=dry_run, preview_mode=preview_mode)
  File "/opt/anaconda3/lib/python3.8/site-packages/datahub/ingestion/run/pipeline.py", line 120, in __init__
    source_class = source_registry.get(source_type)
  File "/opt/anaconda3/lib/python3.8/site-packages/datahub/ingestion/api/registry.py", line 126, in get
    tp = self._ensure_not_lazy(key)
  File "/opt/anaconda3/lib/python3.8/site-packages/datahub/ingestion/api/registry.py", line 84, in _ensure_not_lazy
    plugin_class = import_path(path)
  File "/opt/anaconda3/lib/python3.8/site-packages/datahub/ingestion/api/registry.py", line 32, in import_path
    item = importlib.import_module(module_name)
  File "/opt/anaconda3/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/opt/anaconda3/lib/python3.8/site-packages/datahub/ingestion/source/sql/snowflake.py", line 28, in <module>
    from datahub.ingestion.source.sql.sql_common import (
  File "/opt/anaconda3/lib/python3.8/site-packages/datahub/ingestion/source/sql/sql_common.py", line 206, in <module>
    class SQLAlchemyConfig(StatefulIngestionConfigBase):
  File "/opt/anaconda3/lib/python3.8/site-packages/datahub/ingestion/source/sql/sql_common.py", line 221, in SQLAlchemyConfig
    from datahub.ingestion.source.ge_data_profiler import GEProfilingConfig
  File "/opt/anaconda3/lib/python3.8/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 27, in <module>
    from great_expectations.core.util import convert_to_json_serializable
  File "/opt/anaconda3/lib/python3.8/site-packages/great_expectations/__init__.py", line 7, in <module>
    from great_expectations.data_context import DataContext
  File "/opt/anaconda3/lib/python3.8/site-packages/great_expectations/data_context/__init__.py", line 1, in <module>
    from .data_context import BaseDataContext, DataContext, ExplorerDataContext
  File "/opt/anaconda3/lib/python3.8/site-packages/great_expectations/data_context/data_context.py", line 29, in <module>
    import great_expectations.checkpoint.toolkit as checkpoint_toolkit
  File "/opt/anaconda3/lib/python3.8/site-packages/great_expectations/checkpoint/__init__.py", line 1, in <module>
    from ..util import verify_dynamic_loading_support
  File "/opt/anaconda3/lib/python3.8/site-packages/great_expectations/util.py", line 35, in <module>
    from great_expectations.core.expectation_suite import (
  File "/opt/anaconda3/lib/python3.8/site-packages/great_expectations/core/__init__.py", line 3, in <module>
    from .expectation_suite import (
  File "/opt/anaconda3/lib/python3.8/site-packages/great_expectations/core/expectation_suite.py", line 10, in <module>
    from great_expectations.core.evaluation_parameters import (
  File "/opt/anaconda3/lib/python3.8/site-packages/great_expectations/core/evaluation_parameters.py", line 27, in <module>
    from great_expectations.core.util import convert_to_json_serializable
  File "/opt/anaconda3/lib/python3.8/site-packages/great_expectations/core/util.py", line 62, in <module>
    import pyspark
  File "/opt/anaconda3/lib/python3.8/site-packages/pyspark/__init__.py", line 46, in <module>
    from pyspark.context import SparkContext
  File "/opt/anaconda3/lib/python3.8/site-packages/pyspark/context.py", line 31, in <module>
    from pyspark import accumulators
  File "/opt/anaconda3/lib/python3.8/site-packages/pyspark/accumulators.py", line 97, in <module>
    from pyspark.cloudpickle import CloudPickler
  File "/opt/anaconda3/lib/python3.8/site-packages/pyspark/cloudpickle.py", line 146, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "/opt/anaconda3/lib/python3.8/site-packages/pyspark/cloudpickle.py", line 127, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)
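The last frames of the traceback are in `pyspark/cloudpickle.py`, calling `types.CodeType(...)` under Python 3.8. One known cause of this exact `TypeError` is a PySpark release older than 3.0 running on Python 3.8 or newer, where the `CodeType` constructor takes an additional positional argument. Nothing in this thread confirms that this is the cause here, so the following is only a minimal check under that assumption. The function name `pyspark_python_mismatch` is hypothetical.

```python
import sys
from importlib import metadata  # in the standard library from Python 3.8


def pyspark_python_mismatch(python_version, pyspark_version):
    """Return True for the assumed failure mode:
    Python 3.8+ paired with a PySpark release older than 3.0."""
    pyspark_major = int(pyspark_version.split(".")[0])
    return tuple(python_version[:2]) >= (3, 8) and pyspark_major < 3


if __name__ == "__main__":
    try:
        # Reads package metadata only; pyspark itself is never imported,
        # so this works even when "import pyspark" raises.
        installed = metadata.version("pyspark")
    except metadata.PackageNotFoundError:
        print("pyspark is not installed in this environment")
    else:
        print("python:", sys.version.split()[0], "pyspark:", installed)
        if pyspark_python_mismatch(sys.version_info, installed):
            print("possible mismatch: PySpark < 3.0 on Python >= 3.8")
```

If the check reports a possible mismatch, the recipe itself may not be the problem, since the failure happens while importing the source plugin, before any configuration is used.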
broad-tomato-45373
02/10/2022, 6:40 AM
Could you share your snowflake-ingestion.yml with the sensitive info masked? This will help in understanding the issue in detail.
gray-spoon-5206
02/10/2022, 6:42 AM
---
source:
  type: snowflake
  config:
    # Coordinates
    host_port: ${SNOWFLAKE_HOST}
    warehouse: "PLATFORM_WH"
    # Credentials
    username: USER_NAME
    password: ${PASSWORD}
    role: "DATA_CATALOG_READER"
    # TODO: This should use privatekey authentication, data catalog reader needs a key
    # authentication: KEY_PAIR_AUTHENTICATOR
    # private_key_path: ...
    database_pattern:
      allow:
        - "^BILLING\$"
        - "^OPERATIONS_ANALYTICS\$"
        - "^PURCHASING\$"
        - "^PRODUCT_ANALYTICS\$"
    schema_pattern:
      allow:
        - "^RAW\$"
        - "^TRANSFORMED_PROD\$"
        - "^PUBLISHED_PROD\$"
    include_tables: true
    include_views: true
    include_table_lineage: true
    # Disable profiling for local execution as it will eat all the credits
    profiling:
      enabled: false
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
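The `database_pattern` and `schema_pattern` entries in the recipe above are regular expressions. As a rough sketch of how anchored allow patterns of this kind behave, the snippet below uses plain `re`. It assumes the `\$` in the recipe is meant to end up as a `$` end-of-string anchor, and DataHub's own matcher may apply extra flags such as case-insensitivity. `is_allowed` and the name `BILLING_ARCHIVE` are hypothetical.

```python
import re


def is_allowed(name, allow_patterns):
    """Return True if name matches at least one allow pattern.

    re.match anchors at the start of the string; the trailing `$`
    in each pattern anchors the end.
    """
    return any(re.match(pattern, name) for pattern in allow_patterns)


# Two patterns from the recipe, written with a plain `$` anchor (assumption).
database_allow = [r"^BILLING$", r"^PURCHASING$"]

if __name__ == "__main__":
    for candidate in ["BILLING", "BILLING_ARCHIVE", "PURCHASING", "SALES"]:
        print(candidate, is_allowed(candidate, database_allow))
```

With both anchors in place, a pattern selects exactly one database name rather than every name that starts with it.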
gray-spoon-5206
02/10/2022, 6:48 AM
square-activity-64562
02/11/2022, 12:43 PM
python -c "import platform; print(platform.platform())"
python -c "import sys; print(sys.version); print(sys.executable); import datahub; print(datahub.__file__); print(datahub.__version__);"
python3 -c "import sys; print(sys.version); print(sys.executable); import datahub; print(datahub.__file__); print(datahub.__version__);"
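The one-liners above import `datahub` directly. Because the traceback earlier in the thread fails during an import, a variant that reads installed package metadata without importing anything may be useful. This is a sketch, not something from the thread: the helper names are hypothetical, and the distribution names `acryl-datahub`, `great-expectations`, and `pyspark` are assumptions about how the packages were installed.

```python
import platform
import sys
from importlib import metadata  # in the standard library from Python 3.8


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(dist_names):
    """Collect interpreter details plus the version of each distribution."""
    report = {
        "platform": platform.platform(),
        "python": sys.version.split()[0],
        "executable": sys.executable,
    }
    for name in dist_names:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    # Assumed PyPI distribution names for the packages seen in the traceback.
    names = ["acryl-datahub", "great-expectations", "pyspark"]
    for key, value in environment_report(names).items():
        print(f"{key}: {value}")
```

Running it with the same interpreter that runs `datahub` matters, since `python` and `python3` can point at different installations.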
square-activity-64562
02/11/2022, 12:43 PM
Could you remove the --- from the beginning of the file? I don't think we have that in any of our files.
square-activity-64562
02/11/2022, 12:44 PM
include_views: false
include_table_lineage: false
Trying to rule out possibilities here.
gray-spoon-5206
02/14/2022, 2:35 AM
sparse-monitor-9160
06/11/2022, 4:42 PM
source:
  type: "snowflake"
  config:
    account_id: "my_account.us-east-1"
    warehouse: "sor_wh"
    username: "my_username"
    password: "my_password"
    role: "my_role"
    include_views: false
    include_table_lineage: false
    table_pattern:
      allow:
        - "temp_1"
sink:
  type: "datahub-rest"
  config:
    server: 'http://localhost:8080'
Appreciate your help.
square-activity-64562
06/15/2022, 4:42 AM
sparse-monitor-9160
06/15/2022, 11:25 AM