microscopic-mechanic-13766
11/17/2022, 3:15 PM
hundreds-photographer-13496
11/17/2022, 5:57 PM
microscopic-mechanic-13766
11/18/2022, 8:41 AM
http or binary makes it take the default value, which is binary.
hundreds-photographer-13496
11/18/2022, 8:57 AM
microscopic-mechanic-13766
11/18/2022, 8:59 AM
hundreds-photographer-13496
11/18/2022, 9:00 AM
scheme: 'hive+http'
microscopic-mechanic-13766
11/18/2022, 9:32 AM
/tmp/datahub/ingest/venv-hive-0.9.0.4/lib/python3.10/site-packages/datahub/cli/ingest_cli.py:155> exception=ModuleNotFoundError("No module named 'kerberos'")>
run_pipeline_async = <function 'run.<locals>.run_pipeline_async' ingest_cli.py:155>
hundreds-photographer-13496
11/18/2022, 9:42 AM
microscopic-mechanic-13766
11/18/2022, 9:47 AM
microscopic-mechanic-13766
11/18/2022, 9:47 AM
hundreds-photographer-13496
11/18/2022, 10:23 AM
hundreds-photographer-13496
11/18/2022, 10:24 AM
microscopic-mechanic-13766
11/18/2022, 11:24 AM
scheme: hive+http )
If that property is not set, the realm from which it tries to obtain the ticket is wrong.
microscopic-mechanic-13766
11/18/2022, 11:24 AM
hundreds-photographer-13496
11/18/2022, 1:29 PM
pip install 'acryl-pyhive[kerberos]' on the CLI and confirm that it installs successfully. (We don't need the pip install kerberos that I mentioned earlier, so that package can be uninstalled.)
microscopic-mechanic-13766
11/21/2022, 8:53 AM
microscopic-mechanic-13766
11/21/2022, 9:07 AM
datahub@datahub-actions:/$ pip install 'acryl-pyhive[kerberos]'
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: acryl-pyhive[kerberos] in /usr/local/lib/python3.10/site-packages (0.6.13)
Requirement already satisfied: future in /usr/local/lib/python3.10/site-packages (from acryl-pyhive[kerberos]) (0.18.2)
Requirement already satisfied: python-dateutil in /usr/local/lib/python3.10/site-packages (from acryl-pyhive[kerberos]) (2.8.2)
Collecting requests-kerberos>=0.12.0
Downloading requests_kerberos-0.14.0-py2.py3-none-any.whl (11 kB)
Collecting pyspnego[kerberos]
Downloading pyspnego-0.6.3-py3-none-any.whl (124 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.9/124.9 kB 3.8 MB/s eta 0:00:00
Requirement already satisfied: requests>=1.1.0 in /usr/local/lib/python3.10/site-packages (from requests-kerberos>=0.12.0->acryl-pyhive[kerberos]) (2.28.0)
Requirement already satisfied: cryptography>=1.3 in /usr/local/lib/python3.10/site-packages (from requests-kerberos>=0.12.0->acryl-pyhive[kerberos]) (36.0.2)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/site-packages (from python-dateutil->acryl-pyhive[kerberos]) (1.16.0)
Requirement already satisfied: cffi>=1.12 in /usr/local/lib/python3.10/site-packages (from cryptography>=1.3->requests-kerberos>=0.12.0->acryl-pyhive[kerberos]) (1.15.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/site-packages (from requests>=1.1.0->requests-kerberos>=0.12.0->acryl-pyhive[kerberos]) (1.26.9)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/site-packages (from requests>=1.1.0->requests-kerberos>=0.12.0->acryl-pyhive[kerberos]) (2022.6.15)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.10/site-packages (from requests>=1.1.0->requests-kerberos>=0.12.0->acryl-pyhive[kerberos]) (2.0.12)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/site-packages (from requests>=1.1.0->requests-kerberos>=0.12.0->acryl-pyhive[kerberos]) (3.3)
Collecting krb5>=0.3.0
Downloading krb5-0.4.1.tar.gz (218 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 218.7/218.7 kB 13.7 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Collecting gssapi>=1.6.0
Downloading gssapi-1.8.2.tar.gz (94 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 94.3/94.3 kB 4.8 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: pycparser in /usr/local/lib/python3.10/site-packages (from cffi>=1.12->cryptography>=1.3->requests-kerberos>=0.12.0->acryl-pyhive[kerberos]) (2.21)
Requirement already satisfied: decorator in /usr/local/lib/python3.10/site-packages (from gssapi>=1.6.0->pyspnego[kerberos]->requests-kerberos>=0.12.0->acryl-pyhive[kerberos]) (5.1.1)
Building wheels for collected packages: gssapi, krb5
Building wheel for gssapi (pyproject.toml) ... done
Created wheel for gssapi: filename=gssapi-1.8.2-cp310-cp310-linux_x86_64.whl size=3335479 sha256=ec5cabb4d5f868a811524fa95e7a0ea238382601ad45c3bc223ebbc12adf16ee
Stored in directory: /home/datahub/.cache/pip/wheels/59/a8/83/5017e55a50e766ad6874c236b60fdace4f8552a00a1ebc9474
Building wheel for krb5 (pyproject.toml) ... done
Created wheel for krb5: filename=krb5-0.4.1-cp310-cp310-linux_x86_64.whl size=4405756 sha256=a3e3ac43c21d4cb7a4f27b896952b60df8363ace82db2e48aac8622f9c93a560
Stored in directory: /home/datahub/.cache/pip/wheels/04/07/80/b1e1c44fecd717bd7ef457b78dc92bf15eedc095ca0236e917
Successfully built gssapi krb5
Installing collected packages: krb5, gssapi, pyspnego, requests-kerberos
WARNING: The script pyspnego-parse is installed in '/home/datahub/.local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed gssapi-1.8.2 krb5-0.4.1 pyspnego-0.6.3 requests-kerberos-0.14.0
hundreds-photographer-13496
11/21/2022, 9:20 AM
microscopic-mechanic-13766
11/21/2022, 9:48 AM
acryl-pyhive[kerberos] and kerberos.
After that I was getting another error:
---- (full traceback above) ----
File "/usr/local/lib/python3.10/site-packages/datahub/entrypoints.py", line 149, in main
sys.exit(datahub(standalone_mode=False, **kwargs))
File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 343, in wrapper
raise e
File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 295, in wrapper
res = func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/datahub/utilities/memory_leak_detector.py", line 102, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 205, in run
loop.run_until_complete(run_func_check_upgrade(pipeline))
File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
return future.result()
File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 161, in run_func_check_upgrade
ret = await the_one_future
File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 152, in run_pipeline_async
return await loop.run_in_executor(
File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 143, in run_pipeline_to_completion
raise e
File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 129, in run_pipeline_to_completion
pipeline.run()
File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 334, in run
for wu in itertools.islice(
File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/sql/sql_common.py", line 728, in get_workunits
for inspector in self.get_inspectors():
File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/sql/sql_common.py", line 532, in get_inspectors
with engine.connect() as conn:
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2263, in connect
return self._connection_cls(self, **kwargs)
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 104, in __init__
else engine.raw_connection()
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2369, in raw_connection
return self._wrap_pool_connect(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2336, in _wrap_pool_connect
return fn()
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 304, in unique_connection
return _ConnectionFairy._checkout(self)
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 778, in _checkout
fairy = _ConnectionRecord.checkout(pool)
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 495, in checkout
rec = pool._do_get()
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/pool/impl.py", line 139, in _do_get
with util.safe_reraise():
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/pool/impl.py", line 137, in _do_get
return self._create_connection()
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 309, in _create_connection
return _ConnectionRecord(self)
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 440, in __init__
self.__connect(first_connect_check=True)
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 660, in __connect
with util.safe_reraise():
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 656, in __connect
connection = pool._invoke_creator(self)
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
return dialect.connect(*cargs, **cparams)
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 508, in connect
return self.dbapi.connect(*cargs, **cparams)
File "/usr/local/lib/python3.10/site-packages/pyhive/hive.py", line 126, in connect
return Connection(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/pyhive/hive.py", line 273, in __init__
response = self._client.OpenSession(open_session_req)
File "/usr/local/lib/python3.10/site-packages/TCLIService/TCLIService.py", line 186, in OpenSession
self.send_OpenSession(req)
File "/usr/local/lib/python3.10/site-packages/TCLIService/TCLIService.py", line 195, in send_OpenSession
self._oprot.trans.flush()
File "/usr/local/lib/python3.10/site-packages/pyhive/hive.py", line 81, in flush
super(TCookieHttpClient, self).flush()
File "/usr/local/lib/python3.10/site-packages/thrift/transport/THttpClient.py", line 191, in flush
self.__http.putheader('Cookie', self.headers['Set-Cookie'])
File "/usr/local/lib/python3.10/http/client.py", line 1244, in putheader
raise CannotSendHeader()
This was solved by downgrading thrift to 0.13.0, since version 0.16.0 has a bug in some of the methods that appear in the stack trace.
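A minimal sketch for checking both fixes in the same Python environment that runs the ingestion. It only reports what it finds and changes nothing. The helper names `missing_modules` and `installed_version` are made up for this illustration, and the module list is an assumption based on the packages and the error mentioned in this thread.

```python
from importlib import metadata, util

# Version this thread reports as working; 0.16.0 is the one reported as failing.
REPORTED_WORKING_THRIFT = "0.13.0"


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if util.find_spec(name) is None]


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumed import names: 'kerberos' comes from the ModuleNotFoundError above,
    # 'requests_kerberos' and 'pyhive' from the packages pip reported.
    print("missing modules:", missing_modules(["kerberos", "requests_kerberos", "pyhive"]))
    print("thrift installed:", installed_version("thrift"))
    print("thrift reported as working in this thread:", REPORTED_WORKING_THRIFT)
```

If the script lists a missing module or a thrift version other than the one that worked here, that points at which of the two steps still needs doing. Run it with the interpreter from the ingestion venv, since a user-level install may not be visible there.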
By doing these two things, I was able to successfully ingest from Hive with HTTP!
microscopic-mechanic-13766
11/21/2022, 10:14 AM
hundreds-photographer-13496
11/21/2022, 10:35 AM
important-electrician-22243
07/12/2023, 4:15 AM