# troubleshoot
b
Hi! Does anyone have an idea why I might be receiving this when running manual ingestion for snowflake-usage? The user has the accountadmin role and the query runs fine in Snowflake.
l
@square-activity-64562 ^
s
Please share the following:
• Are you using UI-based ingestion, the Python CLI from the command line, the Python SDK programmatically, or the Java emitter programmatically?
• The version of the CLI that you are using
• Full logs in text format (instead of screenshots) from the ingestion that fails. Please do not remove any parts of the log (mask any secrets that are shown)
• The recipe in text format (instead of screenshots)
b
• CLI command
datahub ingest -c ./dataplatform-usage.yml
fails with that error.
• acryl-datahub, version 0.8.29.2
Log
[josefbit@localhost datahub]$ datahub ingest -c ./dataplatform-usage.yml
[2022-03-17 14:31:45,616] INFO     {datahub.telemetry.telemetry:125} - Sending init Telemetry
[2022-03-17 14:31:45,930] INFO     {datahub.telemetry.telemetry:159} - Sending Telemetry
[2022-03-17 14:31:46,080] INFO     {datahub.cli.ingest_cli:75} - DataHub CLI version: 0.8.29.2
[2022-03-17 14:31:46,090] INFO     {datahub.ingestion.sink.datahub_rest:60} - Setting gms config
[2022-03-17 14:31:46,090] INFO     {datahub.telemetry.telemetry:58} - Updating telemetry config
[2022-03-17 14:31:47,464] INFO     {datahub.ingestion.source_config.sql.snowflake:114} - using authenticator type 'DEFAULT_AUTHENTICATOR'
[2022-03-17 14:31:47,464] INFO     {datahub.ingestion.source_config.sql.snowflake:86} - Cleaned Host port is evdp.europe-west4.gcp
[2022-03-17 14:31:47,464] INFO     {datahub.ingestion.source_config.usage.snowflake_usage:46} - snowflake usage tables are only accessible by role "accountadmin" by default; you set None
[2022-03-17 14:31:47,465] INFO     {datahub.cli.ingest_cli:91} - Starting metadata ingestion
[2022-03-17 14:31:47,466] INFO     {datahub.ingestion.source.usage.snowflake_usage:418} - Checking usage date ranges
[2022-03-17 14:31:48,585] INFO     {datahub.telemetry.telemetry:159} - Sending Telemetry
[2022-03-17 14:31:48,981] ERROR    {datahub.entrypoints:152} - Stackprinter failed while formatting <FrameInfo /home/josefbit/datahub/lib64/python3.8/site-packages/more_itertools/recipes.py, line 353, scope <genexpr>>:
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/stackprinter/frame_formatting.py", line 224, in select_scope
    raise Exception("Picked an invalid source context: %s" % info)
Exception: Picked an invalid source context: [353], [], dict_keys([332])

So here is your original traceback at least:

Traceback (most recent call last):
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/snowflake/connector/cursor.py", line 791, in execute
    Error.errorhandler_wrapper(
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/snowflake/connector/errors.py", line 272, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/snowflake/connector/errors.py", line 327, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/snowflake/connector/errors.py", line 206, in default_errorhandler
    raise error_class(
snowflake.connector.errors.ProgrammingError: 002003 (02000): SQL compilation error:
Database 'SNOWFLAKE' does not exist or not authorized.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/datahub/entrypoints.py", line 138, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/datahub/telemetry/telemetry.py", line 202, in wrapper
    raise e
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/datahub/telemetry/telemetry.py", line 194, in wrapper
    res = func(*args, **kwargs)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/datahub/utilities/memory_leak_detector.py", line 102, in wrapper
    res = func(*args, **kwargs)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/datahub/cli/ingest_cli.py", line 92, in run
    pipeline.run()
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/datahub/ingestion/run/pipeline.py", line 181, in run
    for wu in itertools.islice(
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/datahub/ingestion/source/usage/snowflake_usage.py", line 261, in get_workunits
    for wu in cast(Iterable[MetadataWorkUnit], operation_aspect_work_units_raw):
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/more_itertools/recipes.py", line 353, in <genexpr>
    (x for (cond, x) in t2 if cond),
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/more_itertools/recipes.py", line 349, in <genexpr>
    evaluations = ((pred(x), x) for x in iterable)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/datahub/ingestion/source/usage/snowflake_usage.py", line 475, in _aggregate_access_events
    for event in events:
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/datahub/ingestion/source/usage/snowflake_usage.py", line 419, in _get_snowflake_history
    self._check_usage_date_ranges(engine)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/datahub/ingestion/source/usage/snowflake_usage.py", line 302, in _check_usage_date_ranges
    for db_row in engine.execute(query):
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 2235, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 1003, in execute
    return self._execute_text(object_, multiparams, params)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 1172, in _execute_text
    ret = self._execute_context(
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
    self._handle_dbapi_exception(
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
    util.raise_(
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/snowflake/connector/cursor.py", line 791, in execute
    Error.errorhandler_wrapper(
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/snowflake/connector/errors.py", line 272, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/snowflake/connector/errors.py", line 327, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "/home/josefbit/datahub/lib64/python3.8/site-packages/snowflake/connector/errors.py", line 206, in default_errorhandler
    raise error_class(
sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 002003 (02000): SQL compilation error:
Database 'SNOWFLAKE' does not exist or not authorized.
[SQL:
            select
                min(query_start_time) as min_time,
                max(query_start_time) as max_time
            from snowflake.account_usage.access_history
            where ARRAY_SIZE(base_objects_accessed) > 0
        ]
(Background on this error at: http://sqlalche.me/e/13/f405)

[2022-03-17 14:31:48,981] INFO     {datahub.entrypoints:161} - DataHub CLI version: 0.8.29.2 at /home/josefbit/datahub/lib64/python3.8/site-packages/datahub/__init__.py
[2022-03-17 14:31:48,981] INFO     {datahub.entrypoints:164} - Python version: 3.8.8 (default, Nov  9 2021, 13:31:34)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-3)] at /home/josefbit/datahub/bin/python3 on Linux-4.18.0-348.12.2.el8_5.x86_64-x86_64-with-glibc2.2.5
[2022-03-17 14:31:48,981] INFO     {datahub.entrypoints:167} - GMS config {'models': {}, 'versions': {'linkedin/datahub': {'version': 'v0.8.29', 'commit': '2d82531a1d80be057d29ede455da7873e71ba35e'}}, 'managedIngestion': {'defaultCliVersion': '0.8.26.6', 'enabled': True}, 'statefulIngestionCapable': True, 'supportsImpactAnalysis': True, 'telemetry': {'enabledCli': True, 'enabledIngestion': False}, 'retention': 'true', 'noCode': 'true'}
Recipe
source:
    type: snowflake-usage
    config:
        host_port: removed.snowflakecomputing.com
        warehouse: DATA_WH
        username: username
        password: password

        email_domain: domain.com
        top_n_queries: 10
sink:
    type: datahub-rest
    config:
        server: "http://localhost:8080"
UI ingestion also fails.
That throws the sqlalchemy error "Account not specified error", which is the real cause I guess.
s
You are using Quickstart, right? Managed ingestion on Quickstart uses an older version of the CLI, which is why the error is different. Can you please add the role explicitly and try again?
role: accountadmin
Even if the user has the accountadmin role, the user's default role might be different. So by default the ingestion could be running with the PUBLIC role, which would not have the required access.
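For reference, a minimal sketch of the usage recipe with the role set explicitly. It assumes the same placeholder values as the recipe posted above; `role` is the only field added, and everything else should match whatever your own recipe uses.

```yaml
source:
    type: snowflake-usage
    config:
        host_port: removed.snowflakecomputing.com
        warehouse: DATA_WH
        username: username
        password: password
        # Set explicitly; otherwise the session uses the user's default role
        role: accountadmin

        email_domain: domain.com
        top_n_queries: 10
sink:
    type: datahub-rest
    config:
        server: "http://localhost:8080"
```

With the role set, the "you set None" message in the startup log should show the configured role instead.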
b
Ah yeah, OK, that worked. I didn't notice it was left out here since I had defined it in my main ingestion recipe. Thanks!
s
Glad to know it worked out. I will look at the connector to see if we can make this clearer for next time.
b
The documentation is correct in the example, so I must have deleted the field by mistake.