# ask-community-for-troubleshooting
s
Maybe this needs to go to the forums, but seeing the activity here with channel renames/etc prompted me to update our install (docker images) to the latest version. That was a mistake. They start, but the server doesn't respond and I see messages in the airbyte-server about timeout waiting for namespace default to be initialized in temporal. The logs for airbyte-temporal show the namespace registration attempt and that it was already registered. Seems a docker connection issue or something? Strange.
u
@[DEPRECATED] Marcos Marx turned this message into Zendesk ticket 2501 to ensure timely resolution!
m
Are you using the latest version on master, @Steve Palm? Also, can you upload the server logs here?
s
I just pulled the latest from github. I'll admit, I haven't looked at this since I first installed it so I am not remembering all the pieces. 😞
I'll try to collect some meaningful logs and come back.
l
I had this kind of issue when I updated the docker-compose file correctly but not the .env file from GitHub's master, if this can help @Steve Palm 🙂
m
The latest version introduced some new `.env` variables, so it is necessary to update that file.
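One way to spot variables that were added upstream is to compare variable names rather than whole files. The sketch below assumes two env files on disk; `env_keys` and `missing_env_keys` are hypothetical helper names, not part of Airbyte.

```shell
# Print the variable names assigned in an env file, one per line.
# Comment lines and blank lines are skipped because they do not match NAME=.
env_keys() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" | cut -d= -f1 | sort -u
}

# Print names defined in the reference file ($1) but missing from the local file ($2).
missing_env_keys() {
  local_keys=$(mktemp)
  env_keys "$2" > "$local_keys"
  # -v inverts the match, -x matches whole lines, -F treats patterns as fixed strings.
  env_keys "$1" | grep -vxF -f "$local_keys" || true
  rm -f "$local_keys"
}
```

For example, `missing_env_keys .env.upstream .env` (placeholder file names) would print the names that exist in a freshly downloaded copy but not in the local file.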
s
@Léon Stefani That doesn't update with git pull? Sorry to be dense, how to update it?
m
s
Ok, I had notes from what I did before, I will look at those official docs. Sorry for not checking that first, I didn't think they would have changed. Silly me. 🙂 Thanks!!
Ok, I pulled it from the repo, so those docs say that 'git pull' is all that is needed, but it seems that is not the case.
m
> Ok, I pulled it from the repo, so those docs say that 'git pull' is all that is needed, but it seems that is not the case.
It should be; if it is not working, please try spinning up the docker containers again.
s
I downloaded the .env from github and compared it with mine, no difference. I have brought it down and up several times. I'll keep working on it. Seems the server cannot connect to temporal, the error is a timeout message.
2022-09-30 18:15:06 INFO i.a.c.t.TemporalUtils(getTemporalClientWhenConnected):221 - Waiting for temporal server...
2022-09-30 18:15:06 WARN i.a.c.t.TemporalUtils(getTemporalClientWhenConnected):232 - Waiting for namespace default to be initialized in temporal...
2022-09-30 18:15:13 WARN i.t.i.r.GrpcSyncRetryer(retry):56 - Retrying after failure
io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: Deadline exceeded after 4.987951421s. 
	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[grpc-stub-1.48.0.jar:1.48.0]
	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[grpc-stub-1.48.0.jar:1.48.0]
	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[grpc-stub-1.48.0.jar:1.48.0]
	at io.grpc.health.v1.HealthGrpc$HealthBlockingStub.check(HealthGrpc.java:252) ~[grpc-services-1.48.0.jar:1.48.0]
	at io.temporal.serviceclient.WorkflowServiceStubsImpl.lambda$checkHealth$2(WorkflowServiceStubsImpl.java:286) ~[temporal-serviceclient-1.8.1.jar:?]
	at io.temporal.internal.retryer.GrpcSyncRetryer.retry(GrpcSyncRetryer.java:61) ~[temporal-serviceclient-1.8.1.jar:?]
	at io.temporal.internal.retryer.GrpcRetryer.retryWithResult(GrpcRetryer.java:51) ~[temporal-serviceclient-1.8.1.jar:?]
	at io.temporal.serviceclient.WorkflowServiceStubsImpl.checkHealth(WorkflowServiceStubsImpl.java:279) ~[temporal-serviceclient-1.8.1.jar:?]
	at io.temporal.serviceclient.WorkflowServiceStubsImpl.<init>(WorkflowServiceStubsImpl.java:186) ~[temporal-serviceclient-1.8.1.jar:?]
	at io.temporal.serviceclient.WorkflowServiceStubs.newInstance(WorkflowServiceStubs.java:51) ~[temporal-serviceclient-1.8.1.jar:?]
	at io.temporal.serviceclient.WorkflowServiceStubs.newInstance(WorkflowServiceStubs.java:41) ~[temporal-serviceclient-1.8.1.jar:?]
	at io.airbyte.commons.temporal.TemporalUtils.lambda$createTemporalService$0(TemporalUtils.java:93) ~[io.airbyte-airbyte-commons-temporal-0.40.10.jar:?]
	at io.airbyte.commons.temporal.TemporalUtils.getTemporalClientWhenConnected(TemporalUtils.java:237) ~[io.airbyte-airbyte-commons-temporal-0.40.10.jar:?]
	at io.airbyte.commons.temporal.TemporalUtils.createTemporalService(TemporalUtils.java:89) ~[io.airbyte-airbyte-commons-temporal-0.40.10.jar:?]
	at io.airbyte.commons.temporal.TemporalUtils.createTemporalService(TemporalUtils.java:104) ~[io.airbyte-airbyte-commons-temporal-0.40.10.jar:?]
	at io.airbyte.commons.temporal.TemporalUtils.createTemporalService(TemporalUtils.java:108) ~[io.airbyte-airbyte-commons-temporal-0.40.10.jar:?]
	at io.airbyte.server.ServerApp.getServer(ServerApp.java:240) ~[io.airbyte-airbyte-server-0.40.10.jar:?]
	at io.airbyte.server.ServerApp.main(ServerApp.java:333) ~[io.airbyte-airbyte-server-0.40.10.jar:?]
m
Did you try to access localhost:8000 and see if it's working? Temporal sometimes takes a few seconds/minutes to start.
s
That error above repeated for over 15min
I hit localhost:8000 with lynx and get a 'You need to enable Javascript to run this app.' This is on a remote host, I didn't have that port forwarded over the ssh tunnel.
Hitting it with a web browser shows:
Oops! Something went wrong… Cannot reach server. The server may still be starting up.
m
What does `docker logs airbyte-temporal` give you?
s
airbyte-temporal-log.txt
This is on a cPanel host, and I had trouble getting it set up initially. I had to add a custom network segment to the .yaml file:
networks:
  default:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 172.18.0.1/16
Then it worked without problems until this update.
I see these in airbyte-webapp:
2022/09/30 18:31:00 [error] 39#39: *15 connect() failed (111: Connection refused) while connecting to upstream, client: 172.18.0.1, server: localhost, request: "POST /api/v1/workspaces/list HTTP/1.1", upstream: "http://172.18.0.5:8001/api/v1/workspaces/list", host: "localhost:8800", referrer: "http://localhost:8800/"
172.18.0.1 - - [30/Sep/2022:18:31:00 +0000] "POST /api/v1/workspaces/list HTTP/1.1" 502 497 "http://localhost:8800/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36 Edg/105.0.1343.53" "-"
airbyte-cron is also reporting errors about connecting to Temporal
and airbyte-worker
Seems it is a docker connection issue that is new for some reason.
m
you changed the default port to 8800?
s
webapp has: ports: - 8000:80
8800 is the local end of my ssh tunnel to 8000 on the other end
From the remote host cli I can use nc to connect to port 7233 (temporal). I get an @ back.
I got a cli in airbyte-server, but seems no commands are there that could be used to check connectivity to temporal.
Doing a `yum search netcat` gives me timeout errors, so it may be networking related? Something in docker changed maybe?
(inside airbyte-server that is)
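When a container has no `nc` or `telnet`, bash's built-in `/dev/tcp` redirection can stand in for a basic reachability check. A minimal sketch, assuming the image ships `bash` and the coreutils `timeout` command (not confirmed for the airbyte-server image); `check_tcp` is a hypothetical helper name.

```shell
# Succeeds (exit 0) if a TCP connection to host:port opens within the timeout.
# $1 = host, $2 = port, $3 = timeout in seconds (defaults to 5).
check_tcp() {
  # /dev/tcp/HOST/PORT is a bash redirection feature, not a real file,
  # so the connection attempt runs in an explicit bash child process.
  timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage from a shell inside the container (host and port taken from the server log):
#   check_tcp airbyte-temporal 7233 && echo reachable || echo unreachable
```

A successful connect only shows that the port accepts TCP connections; it does not prove the gRPC health check behind it would pass.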
m
Ohhh, are you on Red Hat rather than an Ubuntu system?
s
This is inside the docker container
It is redhat
or something
m
Oh, https://github.com/airbytehq/airbyte/blob/master/docs/archive/faq/deploying-on-other-os.md I'm not sure Airbyte runs out of the box on Red Hat/CentOS; you probably need some extra configuration steps to make it work. I strongly recommend using Ubuntu.
s
You miss that this was working perfectly fine until the update pull. So maybe something else changed. Were any ports changed/added/etc? That was the whole point of docker was to isolate from the host environment I thought. I can revisit firewall rules, but this was working before.
iptables -L for chain DOCKER shows d3306, 7233, 80, 8001 and 9000 as ACCEPT.
m
The most dramatic change was the introduction of the Micronaut framework in the latest version, but I don't see this as the problem. Could you pull the correct release version with `git pull v0.40.10` to ensure you're using the correct docker-compose and .env file?
s
Is there something missing from that command? I get
fatal: 'v0.40.10' does not appear to be a git repository
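That error is expected: `git pull` treats its first argument as a remote name, not a tag. A sketch of fetching and checking out a tagged release instead, assuming the remote is called `origin` and that a `v0.40.10` tag exists there; `checkout_release` is a hypothetical helper name.

```shell
# Fetch all tags from the remote, then switch the working tree to the given tag.
# $1 = tag name, for example v0.40.10
checkout_release() {
  git fetch --tags origin && git checkout "$1"
}

# Usage, run from inside the cloned repository:
#   checkout_release v0.40.10
```

Checking out a tag leaves the repository in a detached HEAD state, which is fine for running a fixed release but not for committing local changes.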
I came back to this after doing a lot of other chores/etc and saw this in the airbyte-server log:
2022-09-30 19:04:08 WARN i.a.c.t.TemporalUtils(getTemporalClientWhenConnected):232 - Waiting for namespace default to be initialized in temporal...
2022-09-30 19:04:15 WARN i.t.i.r.GrpcSyncRetryer(retry):56 - Retrying after failure
 .
 .
 .
2022-09-30 21:44:19 INFO i.t.s.WorkflowServiceStubsImpl(<init>):188 - Created GRPC client for channel: ManagedChannelOrphanWrapper{delegate=ManagedChannelImpl{logId=39, target=airbyte-temporal:7233}}
2022-09-30 21:44:24 INFO i.a.c.t.TemporalUtils(getTemporalClientWhenConnected):249 - Temporal namespace default initialized!
2022-09-30 21:44:24 INFO i.a.s.ServerApp(migrateExistingConnectionsToTemporalScheduler):295 - Migration to temporal scheduler has already been performed
2022-09-30 21:44:24 INFO i.a.s.ServerApp(getServer):266 - Starting server...
2022-09-30 21:44:24 INFO i.a.c.EnvConfigs(getEnvOrDefault):1072 - Using default value for environment variable WORKER_ENVIRONMENT: 'DOCKER'
2022-09-30 21:44:24 INFO o.e.j.u.l.Log(initialized):169 - Logging initialized @146392ms to org.eclipse.jetty.util.log.Slf4jLog
2022-09-30 21:44:24 INFO o.e.j.s.Server(doStart):360 - jetty-9.4.31.v20200723; built: 2020-07-23T17:57:36.812Z; git: 450ba27947e13e66baa8cd1ce7e85a4461cacc1d; jvm 17.0.4.1+9-LTS
Sep 30, 2022 9:44:24 PM org.glassfish.jersey.server.wadl.WadlFeature configure
WARNING: JAXBContext implementation could not be found. WADL feature is disabled.
2022-09-30 21:44:25 INFO o.h.v.i.u.Version(<clinit>):21 - HV000001: Hibernate Validator 6.1.2.Final
2022-09-30 21:44:25 INFO o.e.j.s.h.ContextHandler(doStart):860 - Started o.e.j.s.ServletContextHandler@7744195{/,null,AVAILABLE}
2022-09-30 21:44:25 INFO o.e.j.s.AbstractConnector(doStart):331 - Started ServerConnector@f559c74{HTTP/1.1, (http/1.1)}{0.0.0.0:8001}
2022-09-30 21:44:25 INFO o.e.j.s.Server(doStart):400 - Started @147758ms
2022-09-30 21:44:25 INFO i.a.s.ServerApp(start):139 - 
    ___    _      __          __
   /   |  (_)____/ /_  __  __/ /____
  / /| | / / ___/ __ \/ / / / __/ _ \
 / ___ |/ / /  / /_/ / /_/ / /_/  __/
/_/  |_/_/_/  /_.___/\__, /\__/\___/
                    /____/
--------------------------------------
 Now ready at http://localhost:8000/
--------------------------------------
Version: 0.40.10
So eventually it came up (around 2hr 45min later), but the web interface still gives an error that it cannot contact the server.
The server came up, and I eventually could log in to the web interface (just by waiting a long time). However, when it spins up the connectors they have the same problem, so syncs fail.
m
@Steve Palm sorry for the long delay. I'll return soon 🙂
s
@Marcos Marx (Airbyte) Looping back... After a lot of googling and testing, it seemed something was just strange inside Docker. 'systemctl restart docker' and container communication seems much better now and things are working. Sorry for the rabbit trails, but being a production server I couldn't just reboot to keep testing. 😉 A sync is running now.
m
Ohhhh this is amazing! Sorry for not getting back to you sooner, I'm OOO 😞
s
No worries, thanks for your contact. I hope I remember this next time. 🙂