# troubleshoot
Hey everyone! I'm having issues ingesting dbt Cloud sources. dbt Cloud environment: single tenant, connecting from DH 10.1. Error in DataHub:
```
~~~~ Execution Summary - RUN_INGEST ~~~~
Execution finished with errors.
{'exec_id': '39e4de59-eb3e-4739-b024-c51ea5c76fbe',
 'infos': ['2023-05-16 01:00:00.066911 INFO: Starting execution for task with name=RUN_INGEST',
           '2023-05-16 01:00:00.067371 INFO: Caught exception EXECUTING task_id=39e4de59-eb3e-4739-b024-c51ea5c76fbe, name=RUN_INGEST, '
           'stacktrace=Traceback (most recent call last):\n'
           '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 112, in execute_task\n'
           '    task_event_loop = asyncio.new_event_loop()\n'
           '  File "/usr/local/lib/python3.10/asyncio/events.py", line 783, in new_event_loop\n'
           '    return get_event_loop_policy().new_event_loop()\n'
           '  File "/usr/local/lib/python3.10/asyncio/events.py", line 673, in new_event_loop\n'
           '    return self._loop_factory()\n'
           '  File "/usr/local/lib/python3.10/asyncio/unix_events.py", line 64, in __init__\n'
           '    super().__init__(selector)\n'
           '  File "/usr/local/lib/python3.10/asyncio/selector_events.py", line 53, in __init__\n'
           '    selector = selectors.DefaultSelector()\n'
           '  File "/usr/local/lib/python3.10/selectors.py", line 350, in __init__\n'
           'OSError: [Errno 24] Too many open files\n'],
 'errors': []}

~~~~ Ingestion Logs ~~~~
```
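As I understand it, `OSError: [Errno 24] Too many open files` means the actions container process is exhausting its open file-descriptor limit. Raising the limit only delays the leak rather than fixing it, but for reference, in a docker-compose deployment it would look roughly like this (a sketch; the `datahub-actions` service name is an assumption, and we run on k8s):

```yaml
# Sketch: raising the file-descriptor limit on the actions container
# in docker-compose. The service name "datahub-actions" is assumed.
# This only delays "Too many open files"; it does not fix the leak.
services:
  datahub-actions:
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
```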
My ingestion configuration:
```yaml
source:
    type: dbt-cloud
    config:
        max_threads: 1
        metadata_endpoint: 'https://my-metadata-cloud-.com/graphql'
        project_id: '3'
        job_id: '82'
        target_platform: snowflake
        stateful_ingestion:
            enabled: true
        account_id: '9999'
        token: MYDBTToken
```
This fails with:
```
Failed to configure the source (dbt-cloud): 1 validation error for DBTCloudConfig
max_threads
  extra fields not permitted (type=value_error.extra)
```
Whenever I don't specify max_threads, the ingestion works, but I keep hitting a file handle leak (the error shown above).
I added `max_threads: 1` to the source config.
However, this config does not work either. @gray-shoe-75895 @big-carpet-38439, we looked at this last week as a fix for on-prem dbt (essentially telling DataHub to run the dbt ingestion single-threaded), but it is not working for dbt Cloud. Any help would be appreciated! Thanks
@famous-waitress-64616 might be able to provide some insight here
Thanks @astonishing-answer-96712 and @famous-waitress-64616, I have already resolved this issue by adding a sink configuration specifying max_threads: 1, as well as restarting the actions container on k8s.
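For anyone hitting the same thing, the working recipe looks roughly like this (a sketch; the GMS server address is a placeholder for your own deployment):

```yaml
source:
    type: dbt-cloud
    config:
        metadata_endpoint: 'https://my-metadata-cloud-.com/graphql'
        project_id: '3'
        job_id: '82'
        target_platform: snowflake
        stateful_ingestion:
            enabled: true
        account_id: '9999'
        token: MYDBTToken
sink:
    type: datahub-rest
    config:
        # Placeholder GMS address; use your own deployment's endpoint.
        server: 'http://datahub-gms:8080'
        # max_threads is accepted here on the datahub-rest sink,
        # not on the dbt-cloud source config.
        max_threads: 1
```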