magnificent-honey-40185
05/30/2023, 8:49 PM
Traceback (most recent call last):
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/datahub/ingestion/run/pipeline.py", line 120, in _add_init_error_context
yield
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/datahub/ingestion/run/pipeline.py", line 220, in __init__
source_class = source_registry.get(source_type)
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/datahub/ingestion/api/registry.py", line 183, in get
tp = self._ensure_not_lazy(key)
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/datahub/ingestion/api/registry.py", line 127, in _ensure_not_lazy
plugin_class = import_path(path)
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/datahub/ingestion/api/registry.py", line 57, in import_path
item = importlib.import_module(module_name)
File "/usr/bin/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/datahub/ingestion/source/redshift/redshift.py", line 41, in <module>
from datahub.ingestion.source.redshift.lineage import RedshiftLineageExtractor
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/datahub/ingestion/source/redshift/lineage.py", line 11, in <module>
from sqllineage.runner import LineageRunner
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/sqllineage/__init__.py", line 41, in <module>
_monkey_patch()
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/sqllineage/__init__.py", line 35, in _monkey_patch
_patch_updating_lateral_view_lexeme()
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/sqllineage/__init__.py", line 24, in _patch_updating_lateral_view_lexeme
if regex("LATERAL VIEW EXPLODE(col)"):
TypeError: 'str' object is not callable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/tmp/interpreter-input-ba4237f7-4df7-4ccb-8375-0d65c6af6170.tmp", line 4, in <module>
pipeline = Pipeline.create(
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/datahub/ingestion/run/pipeline.py", line 334, in create
return cls(
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/datahub/ingestion/run/pipeline.py", line 220, in __init__
source_class = source_registry.get(source_type)
File "/usr/bin/lib/python3.8/contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/share/tomcat8/.local/lib/python3.8/site-packages/datahub/ingestion/run/pipeline.py", line 122, in _add_init_error_context
raise PipelineInitError(f"Failed to {step}: {e}") from e
datahub.ingestion.run.pipeline.PipelineInitError: Failed to find a registered source for type redshift: 'str' object is not callable
Script failed with status: 1
Below is the code:
from datahub.ingestion.run.pipeline import Pipeline

# The pipeline configuration is similar to the recipe YAML files provided to the CLI tool.
pipeline = Pipeline.create(
    {
        "source": {
            "type": "redshift",
            "config": {
                "username": "username",
                "password": "password",
                "database": "db",
                "host_port": "host:5439",
                "default_schema": "schema",
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {
                "server": "http://host/api/gms",
                "token": "token",
            },
        },
    }
)

# Run the pipeline and report the results.
pipeline.run()
pipeline.pretty_print_summary()
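The `TypeError: 'str' object is not callable` is raised inside sqllineage's `_monkey_patch`, which calls the first element of a lexeme entry as if it were a compiled-regex matcher; in some sqlparse releases that element is a plain pattern string, so calling it fails. A minimal sketch of why the call blows up (the tuples below are illustrative stand-ins, not sqlparse's real tables); aligning the installed `sqllineage` and `sqlparse` versions is the usual way out:

```python
import re

# Illustrative stand-ins for lexeme entries (NOT sqlparse's actual data).
# Older-style entry: (compiled matcher, token type) -- the callable shape
# the monkey patch expects. Newer-style entry: a bare pattern string.
old_style = (re.compile(r"LATERAL\s+VIEW", re.IGNORECASE).match, "KEYWORD")
new_style = (r"LATERAL\s+VIEW", "KEYWORD")

assert callable(old_style[0])
assert old_style[0]("LATERAL VIEW EXPLODE(col)") is not None

# Calling a plain string reproduces the error from the traceback above.
try:
    new_style[0]("LATERAL VIEW EXPLODE(col)")
except TypeError as e:
    print(e)  # 'str' object is not callable
```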
rich-policeman-92383
05/31/2023, 11:57 AM
ERROR: missing chunk number 0 for toast value 734921 in pg_toast_83651
chilly-boots-22585
06/01/2023, 9:36 AM
source:
  type: starburst-trino-usage
  config:
    host_port: 'datamesh.conest.com:443'
    database: tpch
    username: ds-starburst
    include_views: true
    include_tables: true
    profiling:
      enabled: true
      profile_table_level_only: true
    stateful_ingestion:
      enabled: true
    password: '${starburst-trino-cred}'
sink:
  type: datahub-rest
  config:
    server: 'datahub-datahub-gms:8080'
I am receiving this error: datahub.ingestion.run.pipeline.PipelineInitError: Failed to set up framework context: Failed to instantiate a valid DataHub Graph instance
One more thing: I have GMS running behind an ALB endpoint, so what should the GMS value in the sink be? Should it be like the one above, "http://datahub-datahub-gms.svc.cluster.local:8080", or the ELB address from this output?
datahub-datahub-gms   LoadBalancer   10.100.85.45   a6af33dc651074b21-608523110.eu-west-1.elb.amazonaws.com   8080:30858/TCP,4318:30170/TCP   37h
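On the sink question above: the `datahub-rest` sink generally wants a full URL including the scheme, and which host works depends on where the recipe runs — inside the cluster the Service DNS name resolves, outside it only the ALB hostname does. A small sketch with a hypothetical helper (not part of datahub) that normalizes the common ways the endpoint gets written:

```python
# Hypothetical helper (not part of the datahub package) that turns the
# common shorthand forms into a full URL with a scheme, which is what
# the datahub-rest sink expects in its `server` field.
def normalize_gms_url(server: str) -> str:
    if not server.startswith(("http://", "https://")):
        server = "http://" + server  # assume plain HTTP unless told otherwise
    return server.rstrip("/")

# In-cluster Service name (works from pods in the same cluster):
print(normalize_gms_url("datahub-datahub-gms:8080"))
# Fully qualified in-cluster DNS ("default" namespace assumed here):
print(normalize_gms_url("datahub-datahub-gms.default.svc.cluster.local:8080"))
# External ALB hostname (works from outside the cluster, if the
# listener actually forwards to GMS):
print(normalize_gms_url("http://a6af33dc651074b21-608523110.eu-west-1.elb.amazonaws.com:8080/"))
```

The "Failed to instantiate a valid DataHub Graph instance" error typically just means the `server` value was not reachable from wherever the pipeline ran.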
mysterious-table-75773
06/04/2023, 7:46 PM
SELECT
• Are elasticsearch, kafka, postgres, system-update-job, no-code-migrations, the cleanup job, and the restore-indices job mandatory as well? Why?
• With version v0.10.3 I have removed schema-registry from the deployment and am using Kafka configs; is there something I should know about this move?
stocky-plumber-3084
06/06/2023, 4:47 AM
"JAVA_OPTS=-Xms512m -Xmx512m -Dhttp.port=9002 -Dhttp.proxyHost=<IP> -Dhttp.proxyPort=<port> -Dhttps.proxyHost=<IP> -Dhttps.proxyPort=<port> -Dhttp.nonProxyHosts=localhost -Dconfig.file=datahub-frontend/conf/application.conf -Djava.security.auth.login.config=datahub-frontend/conf/jaas.conf -Dlogback.configurationFile=datahub-frontend/conf/logback.xml -Dlogback.debug=false -Dpidfile.path=/dev/null"
BQ ingestion error:
The error was: Deadline of 600.0s exceeded while calling target function, last exception: HTTPSConnectionPool(host='oauth2.googleapis.com', port=443): Max retries exceeded with url: /token (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f3e9cbd7310>: Failed to establish a new connection: [Errno 101] Network is unreachable'))
Stacktrace:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/usr/local/lib/python3.10/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/usr/local/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
OSError: [Errno 101] Network is unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 363, in connect
    self.sock = conn = self._new_conn()
  File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f3e9cbd7310>: Failed to establish a new connection: [Errno 101] Network is unreachable
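`[Errno 101] Network is unreachable` means the ingestion environment has no route to oauth2.googleapis.com at all, so this is an egress problem, not a credentials one. If outbound traffic must go through a proxy, the requests/urllib3 stack used by the BigQuery auth flow honors the standard proxy environment variables; a minimal sketch, assuming a hypothetical proxy endpoint:

```python
import os

# Hypothetical egress proxy -- substitute your real proxy host and port.
proxy = "http://proxy.internal:3128"

# requests/urllib3 read these standard variables, so setting them before
# the pipeline starts routes the https://oauth2.googleapis.com/token
# call through the proxy instead of attempting direct egress.
os.environ["HTTPS_PROXY"] = proxy
os.environ["HTTP_PROXY"] = proxy
os.environ["NO_PROXY"] = "localhost,127.0.0.1"

print(os.environ["HTTPS_PROXY"])
```

If there is no proxy, the fix is on the network side (NAT gateway, egress firewall rule, or VPC routing) rather than in the recipe.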
calm-scientist-99377
06/06/2023, 6:53 AM
extraEnvs: # []
  # - name: MY_ENVIRONMENT_VAR
  #   value: the_value_goes_here
  - name: AUTH_OIDC_ENABLED
    value: "true"
  - name: AUTH_OIDC_CLIENT_ID
    value: "****"
  - name: AUTH_OIDC_CLIENT_SECRET
    value: "***"
  - name: AUTH_OIDC_DISCOVERY_URI
    value: "https://***/v1/.well-known/openid-configuration"
  - name: AUTH_OIDC_BASE_URL
    value: "https://localhost:9092"
    # your-datahub-url
  - name: AUTH_OIDC_SCOPE
    value: "openid profile email groups"
After login, I end up here:
https://localhost:9092/callback/oidc?code=rl2nxdhcoasxpbwdw4na2gqwb&state=c56dbb3953
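Landing on `https://localhost:9092/callback/oidc` after login suggests the redirect URI was built from `AUTH_OIDC_BASE_URL=https://localhost:9092`: the frontend derives its OIDC callback from that value, so it needs to be the externally reachable DataHub URL, not localhost. A sketch of the relevant env entry, with a hypothetical hostname (use the URL your users actually hit, and make sure `<that-url>/callback/oidc` is registered with the IdP):

```yaml
# Hypothetical hostname -- replace with your real DataHub URL.
- name: AUTH_OIDC_BASE_URL
  value: "https://datahub.example.com"
```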
fierce-agent-11572
06/07/2023, 9:00 AM
datahub docker quickstart --quickstart-compose-file quickstart/docker-compose.quickstart.yml
the frontend doesn't start
lemon-scooter-69730
06/07/2023, 1:00 PM
spec:
  ...
  concurrencyPolicy: Replace