sparse-monitor-9160
06/14/2022, 12:32 PM
[2022-06-14 08:27:04,506] INFO {datahub.cli.ingest_cli:99} - DataHub CLI version: 0.8.38
[2022-06-14 08:27:10,903] ERROR {datahub.entrypoints:167} - Stackprinter failed while formatting <FrameInfo /usr/local/lib/python3.9/site-packages/datahub/ingestion/source/sql/sql_common.py, line 270, scope SQLAlchemyConfig>:
File "/usr/local/lib/python3.9/site-packages/stackprinter/frame_formatting.py", line 225, in select_scope
raise Exception("Picked an invalid source context: %s" % info)
Exception: Picked an invalid source context: [270], [219], dict_keys([219, 220])
So here is your original traceback at least:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/datahub/cli/ingest_cli.py", line 106, in run
    pipeline = Pipeline.create(pipeline_config, dry_run, preview, preview_workunits)
  File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/run/pipeline.py", line 202, in create
    return cls(
  File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/run/pipeline.py", line 149, in __init__
    source_class = source_registry.get(source_type)
  File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/api/registry.py", line 126, in get
    tp = self._ensure_not_lazy(key)
  File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/api/registry.py", line 84, in _ensure_not_lazy
    plugin_class = import_path(path)
  File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/api/registry.py", line 32, in import_path
    item = importlib.import_module(module_name)
  File "/usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/sql/snowflake.py", line 29, in <module>
    from datahub.ingestion.source.sql.sql_common import (
  File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/sql/sql_common.py", line 236, in <module>
    class SQLAlchemyConfig(StatefulIngestionConfigBase):
  File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/sql/sql_common.py", line 270, in SQLAlchemyConfig
    from datahub.ingestion.source.ge_data_profiler import GEProfilingConfig
  File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 12, in <module>
    from great_expectations import __version__ as ge_version
  File "/usr/local/lib/python3.9/site-packages/great_expectations/__init__.py", line 7, in <module>
    from great_expectations.data_context import DataContext
  File "/usr/local/lib/python3.9/site-packages/great_expectations/data_context/__init__.py", line 1, in <module>
    from great_expectations.data_context.data_context import (
  File "/usr/local/lib/python3.9/site-packages/great_expectations/data_context/data_context/__init__.py", line 1, in <module>
    from great_expectations.data_context.data_context.base_data_context import (
  File "/usr/local/lib/python3.9/site-packages/great_expectations/data_context/data_context/base_data_context.py", line 20, in <module>
    from great_expectations.core.config_peer import ConfigPeer
  File "/usr/local/lib/python3.9/site-packages/great_expectations/core/__init__.py", line 3, in <module>
    from .expectation_suite import (
  File "/usr/local/lib/python3.9/site-packages/great_expectations/core/expectation_suite.py", line 10, in <module>
    from great_expectations.core.evaluation_parameters import (
  File "/usr/local/lib/python3.9/site-packages/great_expectations/core/evaluation_parameters.py", line 27, in <module>
    from great_expectations.core.util import convert_to_json_serializable
  File "/usr/local/lib/python3.9/site-packages/great_expectations/core/util.py", line 22, in <module>
    from great_expectations.types import SerializableDictDot
  File "/usr/local/lib/python3.9/site-packages/great_expectations/types/__init__.py", line 15, in <module>
    import pyspark
  File "/usr/local/lib/python3.9/site-packages/pyspark/__init__.py", line 51, in <module>
    from pyspark.context import SparkContext
  File "/usr/local/lib/python3.9/site-packages/pyspark/context.py", line 31, in <module>
    from pyspark import accumulators
  File "/usr/local/lib/python3.9/site-packages/pyspark/accumulators.py", line 97, in <module>
    from pyspark.serializers import read_int, PickleSerializer
  File "/usr/local/lib/python3.9/site-packages/pyspark/serializers.py", line 72, in <module>
    from pyspark import cloudpickle
  File "/usr/local/lib/python3.9/site-packages/pyspark/cloudpickle.py", line 145, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "/usr/local/lib/python3.9/site-packages/pyspark/cloudpickle.py", line 126, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)
[2022-06-14 08:27:10,904] INFO {datahub.entrypoints:176} - DataHub CLI version: 0.8.38 at /usr/local/lib/python3.9/site-packages/datahub/__init__.py
[2022-06-14 08:27:10,904] INFO {datahub.entrypoints:179} - Python version: 3.9.7 (default, Sep 3 2021, 12:36:14)
[Clang 11.0.0 (clang-1100.0.33.17)] at /usr/local/opt/python@3.9/bin/python3.9 on macOS-10.14.6-x86_64-i386-64bit
[2022-06-14 08:27:10,904] INFO {datahub.entrypoints:182} - GMS config {'models': {}, 'versions': {'linkedin/datahub': {'version': 'v0.8.38', 'commit': '38718b59b358fc6c564ee982752bf2023533b224'}}, 'managedIngestion': {'defaultCliVersion': '0.8.38', 'enabled': True}, 'statefulIngestionCapable': True, 'supportsImpactAnalysis': True, 'telemetry': {'enabledCli': True, 'enabledIngestion': False}, 'datasetUrnNameCasing': False, 'retention': 'true', 'datahub': {'serverType': 'quickstart'}, 'noCode': 'true'}
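A note on reading the traceback above: the failure is raised inside pyspark's bundled `cloudpickle.py` when it calls `types.CodeType(...)`, not inside DataHub itself. Python 3.8 added a `posonlyargcount` parameter to the `CodeType` constructor, so code written against the older positional signature ends up passing a bytes object where an integer is expected. That symptom is commonly seen when a pyspark release older than 3.0 is imported on Python 3.8 or newer. The thread does not state which pyspark version is installed, so the sketch below is only a way to check that assumption; the helper names and the "older than 3.0" cut-off are illustrative, not part of DataHub or pyspark.

```python
import re
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version):
    """Extract (major, minor) from a version string such as '2.4.8'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    return (int(match.group(1)), int(match.group(2))) if match else None


def pyspark_predates_python38(pyspark_version, python_version):
    """Heuristic (an assumption, not an official rule):
    flag pyspark < 3.0 running on Python >= 3.8."""
    if pyspark_version is None:
        return False
    parsed = major_minor(pyspark_version)
    if parsed is None:
        return False
    return parsed < (3, 0) and tuple(python_version) >= (3, 8)


if __name__ == "__main__":
    version = installed_version("pyspark")
    print("python :", sys.version_info[:2])
    print("pyspark:", version)
    if pyspark_predates_python38(version, sys.version_info[:2]):
        print("pyspark looks too old for this interpreter")
```

If the check fires, upgrading or removing the stray pyspark install in the same site-packages directory is a plausible next step, since great_expectations only imports pyspark when it happens to be present.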
Here is my YML (with sensitive data replaced by my_ placeholders):
source:
  type: "snowflake"
  config:
    account_id: "my_account.us-east-1"
    warehouse: "sor_wh"
    username: "my_username"
    password: "my_password"
    role: "my_role"
    include_views: false
    include_table_lineage: false
    table_pattern:
      allow:
        - "temp_1"
sink:
  type: "datahub-rest"
  config:
    server: 'http://localhost:8080'
Here is my environment:
$ python -c "import platform; print(platform.platform())"
Darwin-18.7.0-x86_64-i386-64bit
$ python -c "import sys; print(sys.version); print(sys.executable); import datahub; print(datahub.__file__); print(datahub.__version__);"
2.7.16 (default, Jan 27 2020, 04:46:15)
[GCC 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14)]
/usr/bin/python
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named datahub
$ python3 -c "import sys; print(sys.version); print(sys.executable); import datahub; print(datahub.__file__); print(datahub.__version__);"
3.9.7 (default, Sep 3 2021, 12:36:14)
[Clang 11.0.0 (clang-1100.0.33.17)]
/usr/local/opt/python@3.9/bin/python3.9
/usr/local/lib/python3.9/site-packages/datahub/__init__.py
0.8.38
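The environment output above shows two interpreters on the same machine (`python` is 2.7.16, `python3` is 3.9.7), and the traceback shows datahub, great_expectations and pyspark all loading from `/usr/local/lib/python3.9/site-packages`. A minimal sketch for confirming which files a given interpreter would pick up, without actually importing them (importing pyspark is what fails here), could look like this. The `locate` helper is a hypothetical name used only for illustration.

```python
import importlib.util
import sys


def locate(module_name):
    """Return the file a top-level module would be imported from, or None.

    find_spec only resolves the module; it does not execute it, so a
    package that crashes on import can still be located.
    """
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for name in ("datahub", "great_expectations", "pyspark"):
        print(f"{name:>20}: {locate(name)}")
```

Running it with `python3` rather than `python` matters on this setup, since only the 3.9 interpreter has datahub installed.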
bulky-soccer-26729
06/14/2022, 3:26 PM
bulky-soccer-26729
06/14/2022, 3:27 PM
sparse-monitor-9160
06/14/2022, 5:34 PM
bulky-soccer-26729
06/14/2022, 5:35 PM