salmon-cricket-21860
07/15/2021, 1:00 AM
File "/home/jovyan/conda-envs/catalog/lib/python3.8/site-packages/datahub/ingestion/source/sql_common.py", line 62, in make_sqlalchemy_uri
    40  def make_sqlalchemy_uri(
    41      scheme: str,
    42      username: Optional[str],
    43      password: Optional[str],
    44      at: Optional[str],
    45      db: Optional[str],
    46      uri_opts: Optional[Dict[str, Any]] = None,
    47  ) -> str:
 (...)
    58      if uri_opts is not None:
    59          if db is None:
    60              url += "/"
    61          params = "&".join(
--> 62              f"{key}={quote_plus(value)}" for (key, value) in uri_opts.items() if value
    63          )
AttributeError: 'DruidConfig' object has no attribute 'items'
better-orange-49102
07/15/2021, 1:32 AM
salmon-cricket-21860
07/15/2021, 1:37 AM
source:
  type: druid
  config:
    env: PROD
    host_port: druid-broker-endpoint:8082
    schema_pattern:
      deny:
        - "^(lookup|sys).*"
sink:
  type: "datahub-rest"
  config:
    server: "http://datahub-datahub-gms.catalog-production.svc.cluster.local:8080"
Hi, thanks for the answer @better-orange-49102. I just checked the recipe I used again, and it looks the same as the one in the documentation.
salmon-cricket-21860
07/15/2021, 1:39 AM
AttributeError: 'DruidConfig' object has no attribute 'items'
To me, this looks like a problem at the language or class (DruidConfig) level, since the error says the 'items' method could not be found on the 'DruidConfig' object.
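The traceback is consistent with `uri_opts` receiving an object that is not a dict. Here is a minimal Python sketch of that failure mode. It assumes a stand-in class `FakeDruidConfig` and two helpers, `join_params` and `try_join`, which are all hypothetical names for illustration; only the join expression is copied from lines 61-63 of the traceback, and this is not DataHub's actual code.

```python
from typing import Any, Dict, Optional
from urllib.parse import quote_plus


class FakeDruidConfig:
    """Hypothetical stand-in for a config object; it is not a dict."""

    def __init__(self, host_port: str) -> None:
        self.host_port = host_port


def join_params(uri_opts: Optional[Dict[str, Any]]) -> str:
    """Join truthy options as key=value pairs, as on lines 61-63 of the traceback."""
    if uri_opts is None:
        return ""
    return "&".join(
        f"{key}={quote_plus(value)}" for (key, value) in uri_opts.items() if value
    )


def try_join(uri_opts: Any) -> str:
    """Return the joined parameters, or the error text if uri_opts has no items()."""
    try:
        return join_params(uri_opts)
    except AttributeError as err:
        return f"AttributeError: {err}"


# A plain dict works; empty values are skipped by the "if value" filter.
print(try_join({"header": "true", "empty": ""}))
# A non-dict object fails the same way as in the traceback.
print(try_join(FakeDruidConfig("druid-broker-endpoint:8082")))
```

If this is what happened, the recipe itself would not be at fault, which matches the observation that it looks the same as the documented one.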
gray-shoe-75895
07/15/2021, 2:39 AM
salmon-cricket-21860
07/15/2021, 2:41 AM
gray-shoe-75895
07/15/2021, 3:47 AM
salmon-cricket-21860
07/15/2021, 3:48 AM
salmon-cricket-21860
07/15/2021, 2:32 PM
...
[2021-07-15 23:29:26,537] ERROR {datahub.entrypoints:106} - File "/home/jovyan/.local/lib/python3.8/site-packages/sqlalchemy/engine/result.py", line 1215, in _fetchone_impl
1213 def _fetchone_impl(self):
1214 try:
--> 1215 return self.cursor.fetchone()
1216 except AttributeError as err:
..................................................
self = <sqlalchemy.engine.result.ResultProxy object at 0x7fc7b0f612e0>
self.cursor.fetchone = # AttributeError
self.cursor = None
..................................................
AttributeError: 'NoneType' object has no attribute 'fetchone'
---- (full traceback above) ----
File "/home/jovyan/conda-envs/catalog/lib/python3.8/site-packages/datahub/entrypoints.py", line 98, in main
sys.exit(datahub(standalone_mode=False, **kwargs))
File "/home/jovyan/.local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/home/jovyan/.local/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/jovyan/.local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/jovyan/.local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/jovyan/.local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/home/jovyan/conda-envs/catalog/lib/python3.8/site-packages/datahub/entrypoints.py", line 85, in ingest
pipeline.run()
File "/home/jovyan/conda-envs/catalog/lib/python3.8/site-packages/datahub/ingestion/run/pipeline.py", line 108, in run
for wu in self.source.get_workunits():
File "/home/jovyan/conda-envs/catalog/lib/python3.8/site-packages/datahub/ingestion/source/sql_common.py", line 280, in get_workunits
yield from self.loop_tables(inspector, schema, sql_config)
File "/home/jovyan/conda-envs/catalog/lib/python3.8/site-packages/datahub/ingestion/source/sql_common.py", line 300, in loop_tables
columns = inspector.get_columns(table, schema)
File "/home/jovyan/.local/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 390, in get_columns
col_defs = self.dialect.get_columns(
File "/home/jovyan/conda-envs/catalog/lib/python3.8/site-packages/pydruid/db/sqlalchemy.py", line 178, in get_columns
return [
File "/home/jovyan/conda-envs/catalog/lib/python3.8/site-packages/pydruid/db/sqlalchemy.py", line 178, in <listcomp>
return [
File "/home/jovyan/.local/lib/python3.8/site-packages/sqlalchemy/engine/result.py", line 1010, in __iter__
row = self.fetchone()
File "/home/jovyan/.local/lib/python3.8/site-packages/sqlalchemy/engine/result.py", line 1343, in fetchone
self.connection._handle_dbapi_exception(
File "/home/jovyan/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1514, in _handle_dbapi_exception
util.raise_(exc_info[1], with_traceback=exc_info[2])
File "/home/jovyan/.local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/home/jovyan/.local/lib/python3.8/site-packages/sqlalchemy/engine/result.py", line 1336, in fetchone
row = self._fetchone_impl()
File "/home/jovyan/.local/lib/python3.8/site-packages/sqlalchemy/engine/result.py", line 1217, in _fetchone_impl
return self._non_result(None, err)
File "/home/jovyan/.local/lib/python3.8/site-packages/sqlalchemy/engine/result.py", line 1236, in _non_result
util.raise_(
File "/home/jovyan/.local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
ResourceClosedError: This result object does not return rows. It has been closed automatically.
salmon-cricket-21860
07/15/2021, 2:33 PM
gray-shoe-75895
07/15/2021, 4:32 PM
datahub --debug ingest ...
salmon-cricket-21860
07/16/2021, 12:19 AM
[2021-07-16 09:13:28,664] INFO {datahub.ingestion.run.pipeline:44} - sink wrote workunit prod_web_expr_event_stream_r0
2021-07-16 09:13:28,664 INFO sqlalchemy.engine.base.Engine
SELECT COLUMN_NAME,
DATA_TYPE,
IS_NULLABLE,
COLUMN_DEFAULT
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'test'
AND TABLE_SCHEMA = 'druid'
[2021-07-16 09:13:28,664] INFO {sqlalchemy.engine.base.Engine:110} -
SELECT COLUMN_NAME,
DATA_TYPE,
IS_NULLABLE,
COLUMN_DEFAULT
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'test'
AND TABLE_SCHEMA = 'druid'
2021-07-16 09:13:28,664 INFO sqlalchemy.engine.base.Engine {}
[2021-07-16 09:13:28,664] INFO {sqlalchemy.engine.base.Engine:110} - {}
It seems to be failing at this point, on the 'test' table. Our Druid cluster doesn't have any 'test' datasource.
• Maybe it was registered in the Druid RDS and removed previously.
Anyway, after modifying the recipe to deny the table pattern 'test', I was able to ingest the Druid tables 🙂 Thanks Harshal Sheth!
Source (druid) report:
{'failures': {},
'filtered': ['test', 'test2', 'lookup.*', 'sys.*'],
'tables_scanned': 13,
'views_scanned': 0,
'warnings': {},
'workunit_ids': ['prod_app_event_cancel_stream_r0',
'prod_app_event_order_stream_r0',
'prod_app_event_view_stream_r0',
'prod_app_experiment_event_stream_r0',
'prod_server_event_view_stream_r0',
'prod_server_inventory_payload_stream_r0',
'prod_web_event_cancel_stream_r0',
'prod_web_event_order_stream_r0',
'prod_web_event_view_stream_r0',
'prod_web_experiment_event_stream_r0',
'prod_web_expr_event_stream_r0'],
'workunits_produced': 11}
Sink (console) report:
{'failures': [], 'records_written': 11, 'warnings': []}
Pipeline finished successfully
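The modified recipe was not posted, so here is only a rough Python sketch of how a regex deny list could produce the 'filtered' entries in the report above. The pattern `^test.*` and the helper names are hypothetical, and matching from the start of the name with `re.match` is an assumption about how the deny patterns are applied.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical deny pattern for this sketch; the actual recipe change was not shown.
TABLE_DENY_PATTERNS = ["^test.*"]


def is_denied(name: str, deny_patterns: Iterable[str]) -> bool:
    """True if any deny regex matches the name from its first character."""
    return any(re.match(pattern, name) is not None for pattern in deny_patterns)


def split_tables(
    names: Iterable[str], deny_patterns: List[str]
) -> Tuple[List[str], List[str]]:
    """Split table names into (allowed, filtered), keeping the input order."""
    allowed: List[str] = []
    filtered: List[str] = []
    for name in names:
        (filtered if is_denied(name, deny_patterns) else allowed).append(name)
    return allowed, filtered


# Table names taken from the report above.
tables = ["test", "test2", "prod_web_event_view_stream_r0"]
allowed, filtered = split_tables(tables, TABLE_DENY_PATTERNS)
print("allowed:", allowed)
print("filtered:", filtered)
```

With this pattern both 'test' and 'test2' are skipped before their columns are queried, so the failing INFORMATION_SCHEMA.COLUMNS lookup is never issued for them.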
salmon-cricket-21860
07/16/2021, 12:21 AM
The test table doesn't have column information.
gray-shoe-75895
07/16/2021, 12:42 AM
gray-shoe-75895
07/16/2021, 12:43 AM
gray-shoe-75895
07/19/2021, 7:22 PM
salmon-cricket-21860
07/20/2021, 1:38 AM
modern-umbrella-58840
06/08/2023, 11:15 PM
Tables error: This result object does not return rows
In my case, the Druid SQL API (or DataHub) is only able to query system tables. The query below doesn't return any tables from the druid schema:
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'druid'
But these queries work fine and return results:
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'INFORMATION_SCHEMA'
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'sys'
Did you have to do any additional configuration or setup on Druid for DataHub to be able to query the druid schema tables?
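One way to narrow this down is to send the same query straight to the broker, outside DataHub, and see whether the druid schema comes back empty there too. Here is a minimal Python sketch, assuming the broker from the recipe earlier in the thread (druid-broker-endpoint:8082), plain HTTP without authentication, and Druid's SQL endpoint at /druid/v2/sql/. The helper names `build_sql_request` and `run_sql` are hypothetical.

```python
import json
import urllib.request
from typing import Any, List


def build_sql_request(broker: str, sql: str) -> urllib.request.Request:
    """Build a POST request carrying the SQL text as a JSON body."""
    url = f"http://{broker}/druid/v2/sql/"
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(broker: str, sql: str) -> List[Any]:
    """Send the query to the broker and return the decoded JSON rows."""
    with urllib.request.urlopen(build_sql_request(broker, sql)) as response:
        return json.loads(response.read())


TABLES_QUERY = (
    "SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES "
    "WHERE TABLE_SCHEMA = 'druid'"
)

# Example call (needs network access to the broker, so it is left commented out):
# rows = run_sql("druid-broker-endpoint:8082", TABLES_QUERY)
# print(rows)
```

If the broker also returns no rows here, the cause is more likely on the Druid side (for example permissions or no datasources loaded) than in the DataHub recipe.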