# ask-community-for-troubleshooting
n
Hi! 🙂 I'm having problems setting up airbyte + airflow with docker-compose. I can't get communication between airflow and airbyte working. I have followed the guide at https://github.com/airbytehq/airbyte/tree/master/resources/examples/airflow , including installing
apache-airflow-providers-airbyte
and setting up a connection in airflow pointing at
'<airbyte://host.docker.internal:8000>'
However, when trying to start my DAG in airflow, I get the following error:
requests.exceptions.HTTPError: 404 Client error: Not found for url: <http://host.docker.internal:8000/api/v1/connections/sync>
Host OS is Ubuntu 18.04. Does anyone have any ideas on what I should try? Thanks in advance!
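For reference, the URL in this error follows directly from the Airflow connection: the task log further down shows a connection with host `webapp` and port `80` producing a POST to `http://webapp:80/api/v1/connections/sync`. A minimal Python sketch of that mapping, inferred only from the messages in this thread and not taken from the provider's source (the helper name `sync_url_from_conn_uri` is hypothetical):

```python
from urllib.parse import urlsplit


def sync_url_from_conn_uri(conn_uri: str) -> str:
    """Return the HTTP URL a sync request would target for an airbyte:// URI.

    Assumption: the request goes to http://<host>:<port>/api/v1/connections/sync,
    as seen in the error message and task log in this thread.
    """
    parts = urlsplit(conn_uri)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"connection URI needs a host and a port: {conn_uri!r}")
    return f"http://{parts.hostname}:{parts.port}/api/v1/connections/sync"


if __name__ == "__main__":
    # The connection URI from the original question.
    print(sync_url_from_conn_uri("airbyte://host.docker.internal:8000"))
```

This makes it easier to see which host and port a given connection setting will actually hit before triggering the DAG.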
u
Hi @Niclas Grahm could you please share the output of the
docker ps
command. Could you try to replace
host.docker.internal:8000
with
webapp:80
?
n
Hi @[DEPRECATED] Augustin Lafanechere, thanks for taking the time! docker ps output:
a2791cf49753   apache/airflow:2.1.0             "/usr/bin/dumb-init …"   9 minutes ago    Up 9 minutes (healthy)     0.0.0.0:8085->8080/tcp                                                     airflow_webserver
f1e494fe4578   apache/airflow:2.1.0             "/usr/bin/dumb-init …"   9 minutes ago    Up 9 minutes (healthy)     0.0.0.0:5555->5555/tcp, 8080/tcp                                           airflow_flower_1
0fac7b4c82cc   apache/airflow:2.1.0             "/usr/bin/dumb-init …"   9 minutes ago    Up 9 minutes (healthy)     8080/tcp                                                                   airflow_airflow-worker_1
9914b4b2c499   apache/airflow:2.1.0             "/usr/bin/dumb-init …"   9 minutes ago    Up 9 minutes (healthy)     8080/tcp                                                                   airflow_airflow-scheduler_1
66e172a630ae   apache/superset:latest-dev       "/usr/bin/docker-ent…"   9 minutes ago    Up 9 minutes (healthy)     0.0.0.0:8088->8088/tcp                                                     superset_app
968f0af2eed6   apache/superset:latest-dev       "/usr/bin/docker-ent…"   9 minutes ago    Up 9 minutes (unhealthy)   8088/tcp                                                                   superset_worker_beat
62333a921688   apache/superset:latest-dev       "/usr/bin/docker-ent…"   9 minutes ago    Up 9 minutes (unhealthy)   8088/tcp                                                                   superset_worker
88e6ce4798c0   airbyte/temporal:0.32.8-alpha    "./update-and-start-…"   9 minutes ago    Up 9 minutes               6933-6935/tcp, 6939/tcp, 7234-7235/tcp, 7239/tcp, 0.0.0.0:7233->7233/tcp   airbyte-temporal
a007bc33d66e   airbyte/scheduler:0.32.8-alpha   "/bin/bash -c ${APPL…"   9 minutes ago    Up 9 minutes                                                                                          airbyte-scheduler
610eba7350be   airbyte/server:0.32.8-alpha      "/bin/bash -c ${APPL…"   9 minutes ago    Up 9 minutes               8000/tcp, 0.0.0.0:8001->8001/tcp                                           airbyte-server
b435ac56b251   postgres:10                      "docker-entrypoint.s…"   9 minutes ago    Up 9 minutes               5432/tcp                                                                   superset_db
00e156565a28   redis:latest                     "docker-entrypoint.s…"   9 minutes ago    Up 9 minutes               6379/tcp                                                                   superset_cache
fd1e3f5f4d7a   airbyte/webapp:0.32.8-alpha      "/docker-entrypoint.…"   9 minutes ago    Up 9 minutes               0.0.0.0:8000->80/tcp                                                       airbyte-webapp
0e64814fd7d0   airbyte/db:0.32.8-alpha          "docker-entrypoint.s…"   9 minutes ago    Up 9 minutes               5432/tcp                                                                   airbyte-db
3843c320d60a   airbyte/worker:0.32.8-alpha      "/bin/bash -c ${APPL…"   9 minutes ago    Up 9 minutes                                                                                          airbyte-worker
81a953e12e0a   postgres:13                      "docker-entrypoint.s…"   9 minutes ago    Up 9 minutes (healthy)     5432/tcp                                                                   airflow_postgres_1
6ab0079657c8   redis:latest                     "docker-entrypoint.s…"   9 minutes ago    Up 9 minutes (healthy)     0.0.0.0:6379->6379/tcp                                                     airflow_redis_1
f69a79319742   postgres                         "docker-entrypoint.s…"   10 minutes ago   Up 10 minutes              0.0.0.0:2000->5432/tcp                                                     airbyte-destination
I tried changing the connection to webapp:80, but I get similar results. Here's the log for the failed airflow task:
*** Reading local file: /opt/airflow/logs/trigger_airbyte_job_example/airbyte_example/2021-12-01T15:24:10.272424+00:00/1.log
[2021-12-01 15:24:10,914] {taskinstance.py:876} INFO - Dependencies all met for <TaskInstance: trigger_airbyte_job_example.airbyte_example 2021-12-01T15:24:10.272424+00:00 [queued]>
[2021-12-01 15:24:10,926] {taskinstance.py:876} INFO - Dependencies all met for <TaskInstance: trigger_airbyte_job_example.airbyte_example 2021-12-01T15:24:10.272424+00:00 [queued]>
[2021-12-01 15:24:10,926] {taskinstance.py:1067} INFO - 
--------------------------------------------------------------------------------
[2021-12-01 15:24:10,926] {taskinstance.py:1068} INFO - Starting attempt 1 of 1
[2021-12-01 15:24:10,928] {taskinstance.py:1069} INFO - 
--------------------------------------------------------------------------------
[2021-12-01 15:24:10,934] {taskinstance.py:1087} INFO - Executing <Task(AirbyteTriggerSyncOperator): airbyte_example> on 2021-12-01T15:24:10.272424+00:00
[2021-12-01 15:24:10,938] {standard_task_runner.py:52} INFO - Started process 1391 to run task
[2021-12-01 15:24:10,941] {standard_task_runner.py:76} INFO - Running: ['***', 'tasks', 'run', 'trigger_airbyte_job_example', 'airbyte_example', '2021-12-01T15:24:10.272424+00:00', '--job-id', '5', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/dag_airbyte_example.py', '--cfg-path', '/tmp/tmpt_wccg5y', '--error-file', '/tmp/tmphm42qy48']
[2021-12-01 15:24:10,941] {standard_task_runner.py:77} INFO - Job 5: Subtask airbyte_example
[2021-12-01 15:24:10,969] {logging_mixin.py:104} INFO - Running <TaskInstance: trigger_airbyte_job_example.airbyte_example 2021-12-01T15:24:10.272424+00:00 [running]> on host 0fac7b4c82cc
[2021-12-01 15:24:10,999] {taskinstance.py:1282} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=***
AIRFLOW_CTX_DAG_ID=trigger_airbyte_job_example
AIRFLOW_CTX_TASK_ID=airbyte_example
AIRFLOW_CTX_EXECUTION_DATE=2021-12-01T15:24:10.272424+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2021-12-01T15:24:10.272424+00:00
[2021-12-01 15:24:11,004] {base.py:78} INFO - Using connection to: id: airbyte_example. Host: webapp, Port: 80, Schema: , Login: , Password: None, extra: {}
[2021-12-01 15:24:11,009] {http.py:140} INFO - Sending 'POST' to url: <http://webapp:80/api/v1/connections/sync>
[2021-12-01 15:24:12,051] {http.py:195} WARNING - HTTPConnectionPool(host='webapp', port=80): Max retries exceeded with url: /api/v1/connections/sync (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f31992aacf8>: Failed to establish a new connection: [Errno -2] Name or service not known',)) Tenacity will retry to execute the operation
[2021-12-01 15:24:12,053] {taskinstance.py:1481} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/connection.py", line 160, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/util/connection.py", line 61, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/local/lib/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 677, in urlopen
    chunked=chunked,
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 392, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.6/http/client.py", line 1287, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1333, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1282, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1042, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.6/http/client.py", line 980, in send
    self.connect()
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/connection.py", line 187, in connect
    conn = self._new_conn()
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/connection.py", line 172, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f31992aacf8>: Failed to establish a new connection: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 727, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/util/retry.py", line 446, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='webapp', port=80): Max retries exceeded with url: /api/v1/connections/sync (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f31992aacf8>: Failed to establish a new connection: [Errno -2] Name or service not known',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1137, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1311, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1341, in _execute_task
    result = task_copy.execute(context=context)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/airbyte/operators/airbyte.py", line 74, in execute
    job_object = hook.submit_sync_connection(connection_id=self.connection_id)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/airbyte/hooks/airbyte.py", line 101, in submit_sync_connection
    headers={"accept": "application/json"},
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/http/hooks/http.py", line 141, in run
    return self.run_and_check(session, prepped_request, extra_options)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/http/hooks/http.py", line 196, in run_and_check
    raise ex
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/http/hooks/http.py", line 187, in run_and_check
    allow_redirects=extra_options.get("allow_redirects", True),
  File "/home/airflow/.local/lib/python3.6/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='webapp', port=80): Max retries exceeded with url: /api/v1/connections/sync (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f31992aacf8>: Failed to establish a new connection: [Errno -2] Name or service not known',))
[2021-12-01 15:24:12,056] {taskinstance.py:1531} INFO - Marking task as FAILED. dag_id=trigger_airbyte_job_example, task_id=airbyte_example, execution_date=20211201T152410, start_date=20211201T152410, end_date=20211201T152412
[2021-12-01 15:24:12,076] {local_task_job.py:151} INFO - Task exited with return code 1
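The `[Errno -2] Name or service not known` in the traceback is a DNS failure: the worker container cannot resolve the name `webapp` at all, so the request never reaches Airbyte. A small sketch for checking name resolution from inside the Airflow worker, using only the standard library (the helper name `can_resolve` is hypothetical, and the hostnames to try are the ones suggested in this thread):

```python
import socket


def can_resolve(hostname: str, port: int = 80) -> bool:
    """Return True if `hostname` resolves from the current machine or container."""
    try:
        # Same lookup that fails in the traceback (socket.getaddrinfo).
        socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Raised for "Name or service not known" and similar resolver errors.
        return False
    return True
```

Running this in a Python shell inside the worker container (for example via `docker exec -it airflow_airflow-worker_1 python`, assuming that container name from the `docker ps` output above) against `webapp`, `airbyte-webapp`, and `host.docker.internal` would show which names that container can see.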
u
What about
airbyte-webapp:80
😄 (Docker uses container names for internal DNS resolution)
n
Same result unfortunately.
Running
docker ps --format '{{.ID}}\t{{.Names}} {{.Networks}}'
, I notice that airflow and airbyte aren't on the same network (I think; I don't have much experience here):
a2791cf49753    airflow_webserver                       airflow_default
f1e494fe4578    airflow_flower_1                        airflow_default
0fac7b4c82cc    airflow_airflow-worker_1                        airflow_default
9914b4b2c499    airflow_airflow-scheduler_1                     airflow_default
66e172a630ae    superset_app                    superset_default
968f0af2eed6    superset_worker_beat                    superset_default
62333a921688    superset_worker                 superset_default
88e6ce4798c0    airbyte-temporal                        mds_airbyte_default
a007bc33d66e    airbyte-scheduler                       mds_airbyte_default
610eba7350be    airbyte-server                  mds_airbyte_default
b435ac56b251    superset_db                     superset_default
00e156565a28    superset_cache                  superset_default
fd1e3f5f4d7a    airbyte-webapp                  mds_airbyte_default
0e64814fd7d0    airbyte-db                      mds_airbyte_default
3843c320d60a    airbyte-worker                  mds_airbyte_default
81a953e12e0a    airflow_postgres_1                      airflow_default
6ab0079657c8    airflow_redis_1                 airflow_default
f69a79319742    airbyte-destination                     bridge
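The listing above can also be checked programmatically. The sketch below is a hypothetical helper, not part of Docker or Airbyte; it assumes the `docker ps --format '{{.ID}}\t{{.Names}} {{.Networks}}'` layout shown here (whitespace-separated ID, name, and networks) and that a container attached to several networks lists them comma-separated. It groups containers by network and reports whether two containers share one. On user-defined networks such as the ones docker-compose creates, containers can resolve each other by name only when they share a network.

```python
from collections import defaultdict
from typing import Dict, Set


def containers_by_network(ps_output: str) -> Dict[str, Set[str]]:
    """Group container names by network from `docker ps --format` output."""
    networks: Dict[str, Set[str]] = defaultdict(set)
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, nets = fields[1], fields[2]
        # Assumption: multiple networks appear as a comma-separated list.
        for net in nets.split(","):
            networks[net].add(name)
    return dict(networks)


def share_network(ps_output: str, first: str, second: str) -> bool:
    """Return True if both containers are attached to at least one common network."""
    return any(
        first in members and second in members
        for members in containers_by_network(ps_output).values()
    )


# A few rows copied from the listing in this thread.
SAMPLE = """\
0fac7b4c82cc    airflow_airflow-worker_1    airflow_default
fd1e3f5f4d7a    airbyte-webapp    mds_airbyte_default
610eba7350be    airbyte-server    mds_airbyte_default
"""

if __name__ == "__main__":
    print(share_network(SAMPLE, "airflow_airflow-worker_1", "airbyte-webapp"))
```

For the rows in this thread, the Airflow worker and `airbyte-webapp` have no network in common, which matches the name resolution failure in the task log.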
Again, I'm using the example in
resources/examples/airflow
in the airbyte GitHub repo.
I think it works now. I added
extra_hosts:
      - "host.docker.internal:host-gateway"
in the airflow docker-compose.yaml for the webapp service. For reference, I'm running Windows 11 with WSL2/Ubuntu 18.04.
u
Cool! Yes, sorry for my misleading suggestion; I should've double-checked whether they were on the same network. I think your struggle came from the fact that the config is a bit different for Docker on Windows.
n
I think so too, I've had issues with
host.docker.internal
in other scenarios on Windows as well. Thanks again for your time, Augustin!