freezing-london-39671
12/08/2022, 5:10 PMcuddly-state-92920
12/08/2022, 5:39 PMlively-dusk-19162
12/08/2022, 6:21 PMfull-chef-85630
12/09/2022, 8:59 AMlemon-cat-72045
12/09/2022, 10:20 AMalert-fall-82501
12/09/2022, 12:56 PMsilly-boots-14314
12/09/2022, 1:00 PMcuddly-state-92920
12/09/2022, 1:27 PMworried-chef-87127
12/09/2022, 4:04 PMpackages/looker_sdk/rtl/api_methods.py", line 87, in _return\n'
' raise error.SDKError(response.value.decode(encoding=encoding))\n'
"looker_sdk.error.SDKError: HTTPSConnectionPool(host='looker.#####.com', port=443): Read timed out. (read timeout=120)\n"
'[2022-12-09 06:02:03,395] ERROR {datahub.entrypoints:195} - Command failed: \n'
"\tHTTPSConnectionPool(host='looker.#####.com', port=443): Read timed out. (read timeout=120).\n"
'\tRun with --debug to get full stacktrace.\n'
"\te.g. 'datahub --debug ingest run -c /tmp/datahub/ingest/dd7bf998-80b7-433c-add2-ec77e3103e8d/recipe.yml --report-to "
"/tmp/datahub/ingest/dd7bf998-80b7-433c-add2-ec77e3103e8d/ingestion_report.json'\n",
"2022-12-09 06:02:03.643273 [exec_id=dd7bf998-80b7-433c-add2-ec77e3103e8d] INFO: Failed to execute 'datahub ingest'",
'2022-12-09 06:02:03.643439 [exec_id=dd7bf998-80b7-433c-add2-ec77e3103e8d] INFO: Caught exception EXECUTING '
'task_id=dd7bf998-80b7-433c-add2-ec77e3103e8d, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 123, in execute_task\n'
' task_event_loop.run_until_complete(task_future)\n'
' File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete\n'
' return future.result()\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 168, in execute\n'
' raise TaskError("Failed to execute \'datahub ingest\'")\n'
cuddly-state-92920
12/09/2022, 8:24 PMgifted-knife-16120
12/10/2022, 8:37 PMimportant-night-50346
12/11/2022, 5:25 PM"domain": {
"urn:li:domain:test_domain": {
"allow": [
"test_database.test_schema.*",
]
}
}
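For context, a dataset-level domain transformer is typically configured roughly like this (a sketch, assuming the standard simple_add_dataset_domain transformer, which operates on dataset entities rather than containers; the domain URN is the same test one as above):
transformers:
  - type: "simple_add_dataset_domain"
    config:
      domains:
        - "urn:li:domain:test_domain"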
I tried to achieve the same with a transformer, and it seems that it also does not work at the container level. Could you please clarify this? My intention was to set the GlobalTags, Ownership, and Domain aspects for both containers and datasets…colossal-sandwich-50049
12/11/2022, 9:09 PM5.x? The reason I ask is that 5.x has quite a lot of beneficial features, like being able to configure retry behavior:
https://hc.apache.org/httpcomponents-client-5.2.x/index.html
cc: @great-toddler-2251microscopic-machine-90437
12/12/2022, 11:58 AM'failures': {'tableau-login': ["Unable to login (check your Tableau connection and credentials): Invalid version: 'Unknown'"]},
Attached is the complete error stack trace.few-tent-75240
12/12/2022, 1:30 PMcuddly-state-92920
12/12/2022, 2:25 PMpython3 -m pip install --upgrade datahub-classify
But after executing this command, it returns this message:
ERROR: Could not find a version that satisfies the requirement datahub-classify (from versions: none)
ERROR: No matching distribution found for datahub-classify
My datahub version is:
datahub --version
acryl-datahub, version 0.9.3.2
Does anyone know what is going on in this scenario?rhythmic-church-10210
12/12/2022, 6:12 PMproud-ice-24189
12/12/2022, 8:28 PMhappy-twilight-91685
12/13/2022, 1:30 AMsilly-butcher-31834
12/13/2022, 4:35 AMCan the manifest_path, catalog_path, sources_path, and test_results_path locations used to ingest dbt metadata into DataHub point to Google Cloud Storage (or somewhere else in a Google Cloud environment)? I ask because my DataHub instance and my dbt instance run on different machines. Thanks.brave-lunch-64773
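A minimal sketch of a dbt recipe for this situation, assuming the dbt artifact files are first copied from Google Cloud Storage onto the machine that runs the ingestion; the paths, bucket name, copy step, and target platform are all placeholders:
source:
  type: dbt
  config:
    # hypothetical local paths; e.g. copy the artifacts out of GCS first with
    # gsutil cp gs://my-dbt-artifacts/target/*.json /tmp/dbt/
    manifest_path: "/tmp/dbt/manifest.json"
    catalog_path: "/tmp/dbt/catalog.json"
    sources_path: "/tmp/dbt/sources.json"
    test_results_path: "/tmp/dbt/run_results.json"
    target_platform: snowflake   # placeholder; whichever warehouse dbt runs against
sink:
  type: datahub-rest
  config:
    server: "http://datahub-gms:8080"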
12/13/2022, 4:42 AMlate-ability-59580
12/13/2022, 9:10 AMdbt with Snowflake as the target_platform. While incremental dbt models (for example) are mapped perfectly to their underlying Snowflake tables, dbt sources are left separate.
Any idea how to overcome this?brave-pencil-21289
12/13/2022, 10:09 AMmagnificent-lock-58916
12/13/2022, 11:03 AMstateful_ingestion:
  enabled: true
  remove_stale_metadata: true
Frankly, we didn’t use ignore_old_state and ignore_new_state because we have trouble understanding what these options actually do. It’d be really nice if you could help us understand them too.
But yeah, the main question is: how do we set up the configuration so that our ingestion is actually stateful? Currently it’s not working as desired.rapid-city-92351
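For reference, a minimal sketch of a recipe in which stateful ingestion can take effect, assuming a Snowflake source and a datahub-rest sink; the pipeline_name, connection details, and server URL are placeholders. A stable pipeline_name is what lets consecutive runs share checkpoint state:
pipeline_name: snowflake_prod_ingestion    # placeholder; keep it identical across runs
source:
  type: snowflake
  config:
    # ... connection details ...
    stateful_ingestion:
      enabled: true
      remove_stale_metadata: true
sink:
  type: datahub-rest
  config:
    server: "http://datahub-gms:8080"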
12/13/2022, 11:18 AMpackages/datahub/ingestion/source/snowflake/snowflake_lineage.py", line 503, in _populate_view_downstream_lineage
    json.loads(db_row["DOWNSTREAM_TABLE_COLUMNS"]),
  File "/usr/local/lib/python3.10/json/__init__.py", line 339, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not NoneType
[2022-12-13 11:04:07,215] ERROR {datahub.entrypoints:195} - Command failed:
    the JSON object must be str, bytes or bytearray, not NoneType.
Now I have moved back to 0.9.1, and there it is also not working anymore. That doesn't make sense to me at all. Does someone have an idea and could help?
Thanks.tall-father-13753
12/13/2022, 2:34 PMTraceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/sql/sql_common.py", line 770, in loop_tables
yield from self._process_table(
File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/sql/sql_common.py", line 812, in _process_table
description, properties, location_urn = self.get_table_properties(
File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/sql/sql_common.py", line 910, in get_table_properties
table_info: dict = inspector.get_table_comment(table, schema) # type: ignore
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 558, in get_table_comment
return self.dialect.get_table_comment(
File "/usr/local/lib/python3.10/site-packages/pyhive/sqlalchemy_hive.py", line 376, in get_table_comment
rows = self._get_table_columns(connection, table_name, schema, extended=True)
File "/usr/local/lib/python3.10/site-packages/pyhive/sqlalchemy_hive.py", line 290, in _get_table_columns
rows = connection.execute('DESCRIBE{} {}'.format(extended, full_table)).fetchall()
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1003, in execute
return self._execute_text(object_, multiparams, params)
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1172, in _execute_text
ret = self._execute_context(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
self._handle_dbapi_exception(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
util.raise_(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
self.dialect.do_execute(
File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
File "/usr/local/lib/python3.10/site-packages/pyhive/hive.py", line 479, in execute
_check_status(response)
File "/usr/local/lib/python3.10/site-packages/pyhive/hive.py", line 609, in _check_status
raise OperationalError(response)
sqlalchemy.exc.OperationalError: (pyhive.exc.OperationalError) TExecuteStatementResp(status=TStatus(statusCode=3, infoMessages=['*org.apache.hive.service.cli.HiveSQLException:Error while compiling statement: FAILED: SemanticException [Error 10072]: Database does not exist: `test`:28:27', 'org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:380', 'org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:206', 'org.apache.hive.service.cli.operation.SQLOperation:runInternal:SQLOperation.java:290', 'org.apache.hive.service.cli.operation.Operation:run:Operation.java:320', 'org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:530', 'org.apache.hive.service.cli.session.HiveSessionImpl:executeStatement:HiveSessionImpl.java:506', 'sun.reflect.GeneratedMethodAccessor66:invoke::-1', 'sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43', 'java.lang.reflect.Method:invoke:Method.java:498', 'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78', 'org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36', 'org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63', 'java.security.AccessController:doPrivileged:AccessController.java:-2', 'javax.security.auth.Subject:doAs:Subject.java:422', 'org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1729', 'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59', 'com.sun.proxy.$Proxy36:executeStatement::-1', 'org.apache.hive.service.cli.CLIService:executeStatement:CLIService.java:280', 'org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:531', 'org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1437', 'org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1422', 'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39', 'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39', 'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56', 'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286', 'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149', 'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624', 'java.lang.Thread:run:Thread.java:748', '*org.apache.hadoop.hive.ql.parse.SemanticException:Database does not exist: `test`:34:7', 'org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer:validateDatabase:DDLSemanticAnalyzer.java:1954', 'org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer:analyzeDescribeTable:DDLSemanticAnalyzer.java:2013', 'org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer:analyzeInternal:DDLSemanticAnalyzer.java:343', 'org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer:analyze:BaseSemanticAnalyzer.java:258', 'org.apache.hadoop.hive.ql.Driver:compile:Driver.java:512', 'org.apache.hadoop.hive.ql.Driver:compileInternal:Driver.java:1317', 'org.apache.hadoop.hive.ql.Driver:compileAndRespond:Driver.java:1295', 'org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:204', '*org.apache.hadoop.hive.ql.parse.SemanticException:Database does not exist: `test`:34:0', 'org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer:validateDatabase:DDLSemanticAnalyzer.java:1951'], sqlState='42000', errorCode=10072, 
errorMessage='Error while compiling statement: FAILED: SemanticException [Error 10072]: Database does not exist: `test`'), operationHandle=None)
[SQL: DESCRIBE FORMATTED `test`.`test_es_entity`]
(Background on this error at: <http://sqlalche.me/e/13/e3q8>)
After some digging, it turned out that the problem is with... the backticks used in the DESCRIBE query. In our setup we have disabled support for quoted identifiers (https://issues.apache.org/jira/browse/HIVE-6013). So, is it possible to configure the ingestor in such a way that it won't use backticks in queries?proud-memory-42381
12/13/2022, 3:47 PMbest-umbrella-88325
12/13/2022, 4:11 PMpip install the required dialect packages yourself.".
We are curious to know where these need to be installed. I mean, if we install them on the machine that runs the CLI, does it mean we cannot ingest using SQLAlchemy from the UI?
Please let us know if we are on the right track. Any help would be appreciated.
Thanks in advance.cuddly-dinner-641
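For illustration, a sketch of a generic SQLAlchemy recipe, assuming the sqlalchemy source type with connect_uri/platform config; the dialect name, URI, and server are hypothetical. Whichever Python environment actually executes the recipe needs the dialect package pip-installed: the CLI virtual environment for CLI runs, or the ingestion executor (datahub-actions) container for UI-based ingestion:
source:
  type: sqlalchemy
  config:
    # hypothetical dialect; install its package (e.g. pip install exampledb-sqlalchemy)
    # into the same environment that runs this recipe
    connect_uri: "exampledb://user:password@exampledb-host:1234/mydb"
    platform: "exampledb"
sink:
  type: datahub-rest
  config:
    server: "http://datahub-gms:8080"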
12/13/2022, 7:26 PMwitty-motorcycle-52108
12/13/2022, 7:59 PM