# atlantis-community
j
The container starts, I can exec to it. I can curl it. When I try to reach it via browser I also receive a 502.
d
So you are getting 400s and 502s? It's not immediately clear to me what the problem is. Are the liveness and readiness probes still failing?
j
The probes are failing yes. I'm using the /healthz endpoint with a 5 second timeout.
Let's leave the 502 aside for now
I shouldn't have brought that up lol
😂 1
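(For context, the probe being described would look roughly like the stanza below. The /healthz path and 5-second timeout come from the thread; the port, scheme, and periodSeconds are assumptions based on Atlantis defaults, not the actual manifest.)

```yaml
# Sketch of the probes as described above. /healthz and timeoutSeconds: 5 are
# from the thread; port 4141 (Atlantis default) and periodSeconds are assumed.
livenessProbe:
  httpGet:
    path: /healthz
    port: 4141
    scheme: HTTP
  timeoutSeconds: 5
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /healthz
    port: 4141
    scheme: HTTP
  timeoutSeconds: 5
  periodSeconds: 10
```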
d
Okay, what do the logs say?
kubectl logs
j
```
{"level":"warn","ts":"2023-04-06T14:24:07.763Z","caller":"scheduled/executor_service.go:62","msg":"Received interrupt. Attempting to Shut down scheduled executor service","json":{},"stacktrace":"github.com/runatlantis/atlantis/server/scheduled.(*ExecutorService).Run\n\tgithub.com/runatlantis/atlantis/server/scheduled/executor_service.go:62"}
{"level":"warn","ts":"2023-04-06T14:24:07.763Z","caller":"scheduled/executor_service.go:89","msg":"Received interrupt, cancelling job","json":{},"stacktrace":"github.com/runatlantis/atlantis/server/scheduled.(*ExecutorService).runScheduledJob.func1\n\tgithub.com/runatlantis/atlantis/server/scheduled/executor_service.go:89"}
{"level":"warn","ts":"2023-04-06T14:24:07.763Z","caller":"scheduled/executor_service.go:67","msg":"All jobs completed, exiting.","json":{},"stacktrace":"github.com/runatlantis/atlantis/server/scheduled.(*ExecutorService).Run\n\tgithub.com/runatlantis/atlantis/server/scheduled/executor_service.go:67"}
{"level":"warn","ts":"2023-04-06T14:24:07.763Z","caller":"server/server.go:970","msg":"Received interrupt. Waiting for in-progress operations to complete","json":{},"stacktrace":"github.com/runatlantis/atlantis/server.(*Server).Start\n\tgithub.com/runatlantis/atlantis/server/server.go:970\ngithub.com/runatlantis/atlantis/cmd.(*ServerCmd).run\n\tgithub.com/runatlantis/atlantis/cmd/server.go:764\ngithub.com/runatlantis/atlantis/cmd.(*ServerCmd).Init.func2\n\tgithub.com/runatlantis/atlantis/cmd/server.go:640\ngithub.com/runatlantis/atlantis/cmd.(*ServerCmd).withErrPrint.func1\n\tgithub.com/runatlantis/atlantis/cmd/server.go:1108\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v1.6.1/command.go:916\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v1.6.1/command.go:1044\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v1.6.1/command.go:968\ngithub.com/runatlantis/atlantis/cmd.Execute\n\tgithub.com/runatlantis/atlantis/cmd/root.go:30\nmain.main\n\tgithub.com/runatlantis/atlantis/main.go:66\nruntime.main\n\truntime/proc.go:250"}
{"level":"info","ts":"2023-04-06T14:24:07.763Z","caller":"server/server.go:996","msg":"All in-progress operations complete, shutting down","json":{}}
Stream closed EOF for atlantis/atlantis-0 (atlantis)
```
messy
Basically the service is starting and then receiving an interrupt and dying
Here let me edit that and make it more readable
✅ 1
d
No worries, my suggestion is to maybe tweak the checks' initialDelaySeconds to give Atlantis time to come up before it's killed by the liveness check
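(A sketch of that tweak, assuming the same probe shape as above. The 120s value matches what j ends up trying later in the thread; failureThreshold is an extra assumption, not a recommendation made here.)

```yaml
# Give Atlantis startup headroom before the kubelet starts judging it.
livenessProbe:
  httpGet:
    path: /healthz
    port: 4141
  initialDelaySeconds: 120   # value j ends up trying below
  timeoutSeconds: 5
  failureThreshold: 5        # assumed; tolerates a few slow responses before restart
```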
j
Yea sure no problem - we have argocd deployed and the liveness/readiness timeout there is 1s with no issue. The liveness/readiness timeout in the runatlantis docs is 5s. Can you recommend a new threshold or just crank it to 11?
d
Just crank it for now
It sounds like the liveness check is killing the container too aggressively
Might be how GKE is vs EKS, who knows
j
Not I, said the fly
This was just the sort of advice I was hoping for - the kind that confirms my suspicions. Thank you
d
Not a problem, keep me posted
j
Careful what you wish for
😂 1
🙂
No dice
Cranked the initial delay to 120s and it's still failing
d
Interesting, what are the logs like now that the liveness probe isn't killing the container?
The last log batch was showing Atlantis shutting down
j
I think this might be an issue with GCP load balancer config as ingress
d
That might explain the difference, since it's a GCP-specific part
j
yea
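(If the GCP load balancer's own health check turns out to be the culprit, one way to pin it is a BackendConfig. A sketch assuming the /healthz path from earlier; the resource name and port are illustrative, not from the thread.)

```yaml
# GKE derives the load balancer health check from the readiness probe, but it
# can also be set explicitly. Name and port here are assumptions.
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: atlantis-backendconfig
spec:
  healthCheck:
    type: HTTP
    requestPath: /healthz
    port: 4141
```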
c
This is the sort of stuff that makes me prefer running Atlantis standalone 🙂
j
Oh it's so fun tho /s
Ok so the resolution was dumb - there was a TLS secret in our manifest that was constantly redirecting the HTTP liveness/readiness checks, and it was ancillary anyway. Dropped it and everything was fine. Now I need to build the ingress and service using Google CLB
💯 1
🎉
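(For that remaining step - fronting Atlantis with a Google Cloud load balancer - a rough sketch of a Service/Ingress pair for GKE. The names, namespace, host, and port 4141 are assumptions based on Atlantis defaults, not the actual manifests.)

```yaml
apiVersion: v1
kind: Service
metadata:
  name: atlantis
  namespace: atlantis
  annotations:
    cloud.google.com/neg: '{"ingress": true}'                                 # container-native load balancing
    cloud.google.com/backend-config: '{"default": "atlantis-backendconfig"}'  # ties in the BackendConfig sketched earlier
spec:
  selector:
    app: atlantis
  ports:
    - name: http
      port: 80
      targetPort: 4141   # Atlantis default listen port
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: atlantis
  namespace: atlantis
  annotations:
    kubernetes.io/ingress.class: "gce"   # external Google Cloud HTTP(S) load balancer
spec:
  rules:
    - host: atlantis.example.com         # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: atlantis
                port:
                  number: 80
```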
d
Great to hear that!