# atlantis-community

Slackbot

04/17/2023, 3:20 PM

This message was deleted.

PePe Amengual

04/17/2023, 3:27 PM

you upgraded the git client version ?

Justin S

04/17/2023, 3:27 PM

No.

PePe Amengual

04/17/2023, 3:27 PM

look at the issue

Justin S

04/17/2023, 3:27 PM

This is all in Kubernetes, no change in image, or helm chart

PePe Amengual

04/17/2023, 3:28 PM

this has been documented

Justin S

04/17/2023, 3:28 PM

i have seen the issue

Justin S

04/17/2023, 3:28 PM

What was posted there, doesnt appear to match up

PePe Amengual

04/17/2023, 3:29 PM

what about the permissions of the Atlantis data dir?

Justin S

04/17/2023, 3:29 PM

They wouldnt randomly change

PePe Amengual

04/17/2023, 3:29 PM

could that have changed ?

Justin S

04/17/2023, 3:29 PM

statefulSet:
  securityContext:
    fsGroup: 1000
    runAsUser: 100
    fsGroupChangePolicy: "OnRootMismatch"

Justin S

04/17/2023, 3:29 PM

no change.

PePe Amengual

04/17/2023, 3:31 PM

well something changed for sure and the Atlantis code does not do that, so if for whatever reason the filesystem permissions changed you could possibly get this error

Justin S

04/17/2023, 3:31 PM

i can not imagine a situation where out of 100+ PVC's, this one randomly changed

PePe Amengual

04/17/2023, 3:31 PM

can you run like a chown on the pod ?

Justin S

04/17/2023, 3:31 PM

I cant launch the pod

Justin S

04/17/2023, 3:32 PM

because it fails with the /nonexistent error

Justin S

04/17/2023, 3:33 PM

❯ kubectl logs pod/atlantis-0 -n atlantis
{"level":"info","ts":"2023-04-17T15:32:41.901Z","caller":"vcs/gitlab_client.go:110","msg":"determined GitLab is running version 15.10.0","json":{}}
Error: initializing server: writing generated .git-credentials file with user, token and hostname to /nonexistent/.git-credentials: open /nonexistent/.git-credentials: no such file or directory

Justin S

04/17/2023, 3:33 PM

just spun up a test ENV

Justin S

04/17/2023, 3:33 PM

same error

Justin S

04/17/2023, 3:43 PM

def. tied just to atlantis

Justin S

04/17/2023, 3:43 PM

did a re-deploy in the test ENV of everything using EFS

Justin S

04/17/2023, 3:43 PM

only atlantis breaks

PePe Amengual

04/17/2023, 3:48 PM

efs is nfs basically

Justin S

04/17/2023, 3:49 PM

Yes.

PePe Amengual

04/17/2023, 3:49 PM

could you try an EBS volume for Atlantis?

PePe Amengual

04/17/2023, 3:49 PM

get it out of efs?

Justin S

04/17/2023, 3:49 PM

one sec, im spinning up a 3rd test ENV

PePe Amengual

04/17/2023, 3:49 PM

in your test env

Justin S

04/17/2023, 3:49 PM

with EFS and no pod sec. templates

Justin S

04/17/2023, 3:49 PM

since a fresh install with the defaults fails

Justin S

04/17/2023, 3:54 PM

One thing thats interesting, is the github issue, and the helm chart all assume the atlantis user is 100:1000, but its 1000:1000 on the debian image

Justin S

04/17/2023, 3:54 PM

probably a good reason to specify the UID here

# Add atlantis user to Debian as well
RUN useradd --create-home --user-group --shell /bin/bash atlantis && \
    adduser atlantis root && \
    chown atlantis:root /home/atlantis/ && \
    chmod g=u /home/atlantis/ && \
    chmod g=u /etc/passwd

Justin S

04/17/2023, 3:57 PM

ok, its back to the original error

Justin S

04/17/2023, 3:57 PM

dubious permissions

PePe Amengual

04/17/2023, 3:59 PM

in a non efs volume ?

Justin S

04/17/2023, 4:00 PM

still EFS

Justin S

04/17/2023, 4:00 PM

i have to set up an EBS provisioner, since we dont use it anywhere

Justin S

04/17/2023, 4:00 PM

but, im at least back to atlantis being broken.

Justin S

04/17/2023, 4:00 PM

drwxrwxr-x   5 1001 1001 6.0K Apr 17 15:55  atlantis-data

Justin S

04/17/2023, 4:00 PM

thats interesting, since that uid/gid does not exist
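
One way to confirm this kind of mismatch from any container that mounts the volume is to compare the directory's owner with the uid the process runs as, and to check whether that uid resolves to a passwd entry. The following is a minimal Python sketch and not something posted in the thread: the helper names `describe_owner` and `owned_by_current_process` are hypothetical, and it assumes a POSIX system where the `pwd` and `grp` modules are available.

```python
import grp
import os
import pwd


def describe_owner(path):
    """Return (uid, gid, user_name, group_name) for path.

    user_name or group_name is None when the id has no passwd/group
    entry, which is the situation described for 1001:1001 above.
    """
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = None  # uid is not known inside this container
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = None  # gid is not known inside this container
    return st.st_uid, st.st_gid, user, group


def owned_by_current_process(path):
    """True when the path's owning uid is the uid this process runs as."""
    return os.stat(path).st_uid == os.getuid()
```

Calling `describe_owner` on the mounted data directory would be expected to return `None` for both names when the owner is an id that exists only on the storage side.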

PePe Amengual

04/17/2023, 4:01 PM

ohhhhh

Justin S

04/17/2023, 4:05 PM

I have no idea where that comes from TBH

Justin S

04/17/2023, 4:10 PM

and latest debian image is completely broken

Justin S

04/17/2023, 4:10 PM

Digest: sha256:5389ae79b49230b8e4c6a305230f69e43f65c0cd4312f3822684880e39fdb47c
Status: Downloaded newer image for ghcr.io/runatlantis/atlantis:v0.23.4-debian
error: failed switching to "atlantis": unable to find user atlantis: no matching entries in passwd file

PePe Amengual

04/17/2023, 4:15 PM

I think that @Dylan Page is fixing or fixed that

PePe Amengual

04/17/2023, 4:16 PM

we had a few issues with the Debian image

Justin S

04/17/2023, 4:16 PM

I dont understand why this is mounting the volume as 1001

Justin S

04/17/2023, 4:16 PM

i would expect it to mount it as root honestly

Dylan Page

04/17/2023, 4:16 PM

It’s fixed in the latest dev image

Justin S

04/17/2023, 4:17 PM

im still on the 23.3 image

Justin S

04/17/2023, 4:17 PM

which worked up until.. today

PePe Amengual

04/17/2023, 4:27 PM

the thing is that the permissions are done at the dockerfile level and after that atlantis runs

PePe Amengual

04/17/2023, 4:27 PM

the code just uses the filesystem so there is no way for atlantis to change that

Justin S

04/17/2023, 4:28 PM

we deploy via atlantis

Justin S

04/17/2023, 4:28 PM

i have no MR showing a change

Justin S

04/17/2023, 4:28 PM

we didnt change

Justin S

04/17/2023, 4:28 PM

we did a fresh deploy, same error

Justin S

04/17/2023, 4:28 PM

and the permissions are not done at the dockerfile level.

PePe Amengual

04/17/2023, 4:28 PM

did you try the non EFS install?

Justin S

04/17/2023, 4:28 PM

we only use EFS

Justin S

04/17/2023, 4:29 PM

in 20 clusters

Justin S

04/17/2023, 4:29 PM

and only atlantis is having this error

Justin S

04/17/2023, 4:29 PM

Getting EBS provisioners in place will take a while

Justin S

04/17/2023, 4:29 PM

to troubleshoot a broken app

PePe Amengual

04/17/2023, 4:31 PM

any information you have, add it to the issue you commented on, since this chat will disappear in a month

Justin S

04/17/2023, 4:31 PM

ill try to get it in place

Justin S

04/17/2023, 4:31 PM

currently have to back atlantis out and move things back into flux for the time being

Justin S

04/17/2023, 4:31 PM

so we can operate

PePe Amengual

04/17/2023, 4:31 PM

I do not run atlantis in K8s so I can’t help you much

Justin S

04/17/2023, 5:07 PM

so i found another instance of atlantis we have

Justin S

04/17/2023, 5:07 PM

deployed with the same code, same base image, only diff is that in this image we add the vault cli

Justin S

04/17/2023, 5:07 PM

its working

Justin S

04/17/2023, 5:08 PM

I can tell you that EFS creates a random UID/GID for the mount point, which is where the 10001 or 10002 etc stuff shows up from

Justin S

04/17/2023, 5:08 PM

im not sure why that would randomly be a problem now.

Justin S

04/17/2023, 5:09 PM

so a custom EFS storage class is created, setting uid/gid instead of using the range like it defaults to

Justin S

04/17/2023, 5:09 PM

drwxrwxr-x   4 atlantis atlantis 6.0K Apr 17 17:07  atlantis-data

Justin S

04/17/2023, 5:09 PM

so that looks more correct

PePe Amengual

04/17/2023, 5:12 PM

can you force the uid on the chart? I believe it is possible

Justin S

04/17/2023, 5:12 PM

we have been doing that

Justin S

04/17/2023, 5:12 PM

now ive forced EFS to match

Justin S

04/17/2023, 5:12 PM

testing now

Justin S

04/17/2023, 5:13 PM

That fixed it

Justin S

04/17/2023, 5:14 PM

im not sure why the OTHER env is working, because we are not specifying a UID/GID on EFS there, just on the chart.

PePe Amengual

04/17/2023, 5:14 PM

I worked with NFS for like 10 years, I always had to force uid and user

PePe Amengual

04/17/2023, 5:15 PM

that is why I wanted you to test EBS just to make sure

Justin S

04/17/2023, 5:15 PM

this is the only chart, where we force it

Justin S

04/17/2023, 5:15 PM

all using EFS

Justin S

04/17/2023, 5:15 PM

i think we have 20 clusters at this point.

Justin S

04/17/2023, 5:15 PM

The other atlantis deployment is working, without it being specified as well, which has me more confused

Justin S

04/17/2023, 5:16 PM

now it looks like atlantis has decided to use its own TF version

Justin S

04/17/2023, 5:16 PM

{
  "level": "info",
  "ts": "2023-04-17T17:10:47.192Z",
  "caller": "terraform/terraform_client.go:361",
  "msg": "Detected module requires version: 1.4.5",
  "json": {
    "repo": "sphinx/terraform-aws-infra",
    "pull": "75"
  }
}

Justin S

04/17/2023, 5:16 PM

ATLANTIS_DEFAULT_TF_VERSION=v1.3.7

Justin S

04/17/2023, 5:16 PM

DEFAULT_TERRAFORM_VERSION=1.4.2

Justin S

04/17/2023, 5:17 PM

not even sure where that second ENV var comes from now..

PePe Amengual

04/17/2023, 5:17 PM

in your workflow you have required_version? that version will be used

Justin S

04/17/2023, 5:18 PM

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
  required_version = "~> 1.3"
}

Justin S

04/17/2023, 5:18 PM

thats what is defined everywhere

Justin S

04/17/2023, 5:19 PM

environment:
      ATLANTIS_DEFAULT_TF_VERSION: v1.3.7

Justin S

04/17/2023, 5:20 PM

Ya, i dont actually see DEFAULT_TERRAFORM_VERSION even defined in the chart

PePe Amengual

04/17/2023, 5:23 PM

https://github.com/runatlantis/atlantis/pull/3153 https://github.com/runatlantis/atlantis/pull/2760

Justin S

04/17/2023, 5:25 PM

im not sure what im supposed to be seeing?

Justin S

04/17/2023, 5:26 PM

required_version = "~> 1.3" should not resolve to 1.4.5

Justin S

04/17/2023, 5:27 PM

ya, this doesnt make sense

Justin S

04/17/2023, 5:28 PM

NOTE: The Atlantis latest docker image tends to have recent versions of Terraform, but there may be a delay as new versions are released. The highest version of Terraform allowed in your code is the version specified by DEFAULT_TERRAFORM_VERSION in the image your server is running.

Justin S

04/17/2023, 5:28 PM

DEFAULT_TERRAFORM_VERSION=1.4.2

Justin S

04/17/2023, 5:28 PM

and yet it pulls 1.4.5

Justin S

04/17/2023, 5:28 PM

when we say required_version = "~> 1.3"

PePe Amengual

04/17/2023, 5:28 PM

atlantis can download TF for you no matter what the default TF version is

Justin S

04/17/2023, 5:29 PM

the docs imply it wouldnt use 1.4.5

Justin S

04/17/2023, 5:29 PM

since the image states DEFAULT_TERRAFORM_VERSION=1.4.2

Justin S

04/17/2023, 5:29 PM

either way though, we specify 1.3.

Justin S

04/17/2023, 5:31 PM

{
  "level": "info",
  "ts": "2023-04-17T17:13:23.792Z",
  "caller": "models/shell_command_runner.go:156",
  "msg": "successfully ran \"/atlantis-data/bin/terraform1.4.5 plan -input=false -refresh -out \\\"/atlantis-data/repos/sphinx/terraform-aws-infra/75/default/us-gov-west-1/qa/network/default.tfplan\\\"\" in \"/atlantis-data/repos/sphinx/terraform-aws-infra/75/default/us-gov-west-1/qa/network\"",
  "json": {
    "repo": "sphinx/terraform-aws-infra",
    "pull": "75"
  }
}

Justin S

04/17/2023, 5:31 PM

❯ cat us-gov-west-1/qa/network/provider.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
  required_version = "~> 1.3"
}

Justin S

04/17/2023, 5:33 PM

https://github.com/runatlantis/atlantis/issues/3201

Justin S

04/17/2023, 5:33 PM

The patch is relevant in ~> 1.3.0 to ensure it matches on only 1.3.x whereas ~> 1.3 may match to 1.4

Justin S

04/17/2023, 5:34 PM

why would it even match that

Justin S

04/17/2023, 5:35 PM

now im curious if its been using 1.4.5 this entire time

PePe Amengual

04/17/2023, 5:35 PM

you could check the state file

Justin S

04/17/2023, 5:35 PM

doing so now

Justin S

04/17/2023, 5:35 PM

def. feels wrong that 1.3 could match 1.4

Justin S

04/17/2023, 5:38 PM

"terraform_version": "1.3.9",

Justin S

04/17/2023, 5:41 PM

OK, im good now

PePe Amengual

04/17/2023, 5:42 PM

you fixed it yourself!!!!

PePe Amengual

04/17/2023, 5:42 PM

thanks to you

PePe Amengual

04/17/2023, 5:44 PM

were you using 1.4.5?

Dylan Page

04/17/2023, 6:21 PM

~= or ~> is essentially a wildcard on the lowest tier version.

Dylan Page

04/17/2023, 6:22 PM

It’s common in most package managers

Dylan Page

04/17/2023, 6:23 PM

You’re basically saying “any version at least 1.3, and less than 2.0”

Dylan Page

04/17/2023, 6:23 PM

If you do 1.3.0, it changes to “any version at least 1.3.0, and less than 1.4.0”
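
The behavior of `~>` discussed in this thread can be sketched in a few lines of Python. This is only an illustration of the rule that the rightmost component given in the constraint is the one allowed to increment; it is not Terraform's or Atlantis's actual implementation. The function names `parse`, `pessimistic_bounds`, and `satisfies` are hypothetical, and the sketch assumes plain numeric three-part release versions (no pre-release suffixes).

```python
def parse(version):
    """Turn '1.4.5' or 'v1.4.5' into a tuple of ints, e.g. (1, 4, 5)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def pessimistic_bounds(constraint_version):
    """Return (lower, upper) for '~> constraint_version'.

    lower is inclusive, upper is exclusive. The upper bound drops the
    last component and bumps the one before it.
    """
    lower = parse(constraint_version)
    if len(lower) < 2:
        raise ValueError("this sketch needs at least two components")
    upper = lower[:-2] + (lower[-2] + 1,)
    return lower, upper


def satisfies(version, constraint_version):
    """True when version falls inside the '~> constraint_version' range."""
    lower, upper = pessimistic_bounds(constraint_version)
    return lower <= parse(version) < upper
```

Under these assumptions, `satisfies("1.4.5", "1.3")` is true while `satisfies("1.4.5", "1.3.0")` is false, which matches what was seen above: `~> 1.3` allowed 1.4.5, and `~> 1.3.0` would have kept the run on 1.3.x.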
