GitHub
08/28/2023, 3:07 PM
[image]
GitHub
08/28/2023, 4:10 PM
plan via Atlantis does not skip it if there are no changes:
[screenshot 2023-08-28 17:42:45]
v1.5.5.
The plan workflow configuration:
workflows:
  terragrunt:
    plan:
      steps:
        - env:
            name: TERRAGRUNT_TFPATH
            command: 'echo "terraform$(bash /opt/terragrunt-tfversion.sh)"'
        # Reduce Terraform suggestion output
        - env:
            name: TF_IN_AUTOMATION
            value: 'true'
        - run:
            command: "/opt/terragrunt-plan.sh"
            output: strip_refreshing
The plan script:
> cat terragrunt-plan.sh
#!/bin/bash
set -e -o pipefail
export TERM="xterm"
# Load the project's direnv environment.
direnv allow .
eval "$(direnv export bash)"
export TF_CLI_ARGS="-no-color"
# Unescape and split COMMENT_ARGS, prefix every output line with repo/PR info,
# mirror the prefixed lines to the container's stdout (/proc/1/fd/1), then
# strip the first two fields again for the PR comment.
terragrunt plan $(echo "$COMMENT_ARGS" | sed 's/\\//g' | sed 's/,/ /g') -out="$PLANFILE" 2>&1 | awk -v owner="${BASE_REPO_OWNER}" -v repo="${BASE_REPO_NAME}" -v pr="${PULL_NUM}" '{ print "[INFO] " owner "/" repo "#" pr ":", $0; fflush(); }' | tee -a /proc/1/fd/1 | cut -d" " -f3-
terragrunt show -no-color -json "$PLANFILE" > "$SHOWFILE"
Logs
[DBUG] pre-hooks configured, running...
[DBUG] got workspace lock
[DBUG] Setting shell to default: ""
[DBUG] Setting shellArgs to default: ""
[INFO] successfully ran "sh -c terragrunt-atlantis-config generate --output atlantis.yaml --parallel --automerge --create-project-name --terraform-version=0.14.11 --filter=\"[eu|jp|us|global]*\" --create-workspace" in "/atlantis-data/repos/company/terraform/xxxx/default"
[DBUG] 1 files were modified in this pull request
[DBUG] got workspace lock
[INFO] successfully parsed atlantis.yaml file
[DBUG] moduleInfo for /atlantis-data/repos/company/terraform/xxxx/default (matching "") = map[]
[DBUG] found downstream projects for "toto/stg/test/terragrunt.hcl": []
...
[INFO] 1 projects are to be planned based on their when_modified config
[DBUG] determining config for project at dir: "toto/stg/test" workspace: "toto_stg_test"
[DBUG] MergeProjectCfg started
[DBUG] setting apply_requirements: [approved,mergeable,undiverged] from repos[1], id: github.com/company/test
[DBUG] setting workflow: "terragrunt" from repos[1], id: github.com/company/terraform
[DBUG] setting allow_custom_workflows: false from default server config
[DBUG] setting repo_locking: true from default server config
[DBUG] setting policy_check: false from default server config
[DBUG] setting plan_requirements: [] from default server config
[DBUG] setting import_requirements: [] from default server config
[DBUG] setting allowed_overrides: [] from default server config
[DBUG] setting delete_source_branch_on_merge: false from default server config
[DBUG] final settings: plan_requirements: [], apply_requirements: [approved,mergeable,undiverged], import_requirements: [], workflow: terragrunt
[DBUG] Building project command context for plan
[DBUG] deleting previous plans and locks
[INFO] Running plans in parallel
[INFO] acquired lock with id "company/terraform/toto/stg/test..."
[DBUG] acquired lock for project
[DBUG] starting "echo \"terraform$(bash /opt/terragrunt-tfversion.sh)\"" in "/atlantis-data/repos/company/terraform/xxxx/..."
[DBUG] starting "/opt/terragrunt-plan.sh" in "/atlantis-data/repos/company/terraform/xxxx/..."
[INFO] plan success. output available at: https://github.com/company/terraform/pull/xxxx
Environment details
• Atlantis version: v0.25.0
• Deployment method: tf module
• Atlantis flags: ATLANTIS_CHECKOUT_STRATEGY: merge
runatlantis/atlantis
JT
08/28/2023, 6:12 PM
GitHub
08/28/2023, 9:16 PM
You should avoid using the :latest tag when deploying containers in production, because this makes it hard to track which version of the image is running and hard to roll back. Instead, the tag of the specific release should be used (for example, v0.25.0).
Reproduction Steps
It appears this was introduced in 89ccb86, as part of #3049
Logs
N/A
Environment details
Any deployment using Kustomize from v0.23.0-pre.20230125 or later
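For anyone affected, pinning the tag in a Kustomize overlay is a small change; a minimal sketch, assuming the image name used by the upstream manifests:
# kustomization.yaml overlay (base resources omitted)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
images:
  - name: ghcr.io/runatlantis/atlantis
    newTag: v0.25.0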
runatlantis/atlantis
GitHub
08/29/2023, 8:39 AM
Ran Plan for 2 projects:
project: template dir: template workspace: default
project: awesome-project dir: infra/awesome-project workspace: default
1. project: template dir: template workspace: default
Plan Error
dir "template" does not exist
Reproduction Steps
1. Create a template as per the docs
2. Either
• modify any of the files specified in autoplan -> when_modified, or
• set --enable-regexp-cmd
and then comment atlantis plan -p.*
Logs
Environment details
• Atlantis version: 0.25.0
• Deployment method: helm chart v4.15.0
Repo atlantis.yaml file:
version: 3
parallel_plan: false
parallel_apply: false
projects:
  - &template
    name: template
    dir: template
    workflow: custom
    autoplan:
      enabled: true
      when_modified:
        - "./terraform/modules/**/*.tf"
        - "**/*.tf"
        - ".terraform.lock.hcl"
  - <<: *template
    name: awesome-project
    dir: ./infra/awesome-project
workflows:
  custom:
    plan:
    apply:
Additional Context
runatlantis/atlantis
GitHub
08/31/2023, 9:37 AM
Planning fails with signal: killed, while running the plan locally works as expected (signal: killed suggests the process was terminated externally, likely by the kernel's OOM killer). The module contains the flux_bootstrap_git resource, which creates an 8000-line YAML file and which was changed in the relevant PR due to an upgrade. I suspect it is relevant, because an earlier plan of the same module, without changes to the mentioned resource, worked without problems on the same Atlantis version.
Reproduction Steps
The only change was bumping the version of flux_bootstrap_git from v2.0.0 to v2.1.0, so I expect the issue should be reproducible this way.
Logs
"stacktrace":"<http://github.com/runatlantis/atlantis/server/events.RunAndEmitStats|github.com/runatlantis/atlantis/server/events.RunAndEmitStats>\n\tgithub.com/runatlantis/atlantis/server/events/instrumented_project_command_runner.go:78\ngithub.com/runatlantis/atlantis/server/events.(*InstrumentedProjectCommandRunner).Plan\n\tgithub.com/runatlantis/atlantis/server/events/instrumented_project_command_runner.go:38\ngithub.com/runatlantis/atlantis/server/events.runProjectCmds\n\tgithub.com/runatlantis/atlantis/server/events/project_command_pool_executor.go:48\ngithub.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).run\n\tgithub.com/runatlantis/atlantis/server/events/plan_command_runner.go:256\ngithub.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).Run\n\tgithub.com/runatlantis/atlantis/server/events/plan_command_runner.go:292\ngithub.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunCommentCommand\n\tgithub.com/runatlantis/atlantis/server/events/command_runner.go:301"
Plan output
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
time=2023-08-31T06:58:04Z level=error msg=Terraform invocation failed in /atlantis-data/repos/REPO_AND_MODULE/.terragrunt-cache/asdf prefix=[/atlantis-data/repos/REPO_AND_MODULE]
time=2023-08-31T06:58:04Z level=error msg=1 error occurred:
* [/atlantis-data/repos/REPO_AND_MODULE/.terragrunt-cache/asdf] signal: killed
Environment details
• Atlantis version: v0.25.0 (tried with v0.24.3 as well)
• Deployment method: helm chart
• Slightly extended Atlantis image (Terragrunt & AWS binaries): https://github.com/schueco/atlantis-docker-image/blob/v1.0.16/Dockerfile
Atlantis server-side config file:
repos:
  - id: /.*/
    workflow: terragrunt
    pre_workflow_hooks:
      - run: terragrunt-atlantis-config generate --autoplan --workflow terragrunt --parallel=false --output atlantis.yaml
    apply_requirements: [mergeable]
    allowed_overrides: [workflow]
    allowed_workflows: [terragrunt]
    allow_custom_workflows: true
workflows:
  terragrunt:
    plan:
      steps:
        - env:
            name: TERRAGRUNT_TFPATH
            command: 'echo "terraform${ATLANTIS_TERRAFORM_VERSION}"'
        - env:
            name: DESTROY_PARAMETER
            command: if [ "$COMMENT_ARGS" = '\-\d\e\s\t\r\o\y' ]; then echo "-destroy"; else echo ""; fi
        - run: terragrunt plan -no-color -out=$PLANFILE $DESTROY_PARAMETER
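For context on the DESTROY_PARAMETER hook above: a destroy plan would be requested with a comment like atlantis plan -- -destroy; Atlantis passes extra comment flags through COMMENT_ARGS comma-separated with each character backslash-escaped, which is what the '\-\d\e\s\t\r\o\y' comparison accounts for.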
Repo atlantis.yaml file:
automerge: false
parallel_apply: false
parallel_plan: false
projects: ...
Additional Context
• similar to #452
runatlantis/atlantis
GitHub
08/31/2023, 2:51 PM
Today, users must run cdktf synth locally and commit the generated files to the repo.
Describe the solution you'd like
The current CDKTF use case has the limitation that cdktf synth must be run locally and the generated files committed to the repo for Atlantis to plan and apply against them. This is because Atlantis only considers files to be modified if the relevant VCS reports them as modified in the PR/MR. We could add an additional server configuration flag, include-git-untracked-files, that would add any files dynamically generated on the server in the project working directory to the modified-file list by leveraging git ls-files. This would allow us to run cdktf synth in a pre-workflow hook to generate Terraform files which Atlantis would then plan and apply (see the sketch below).
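A minimal sketch of the intended setup, assuming the proposed flag name and an illustrative hook command:
# Server flag (proposed in this issue, not an existing option):
#   atlantis server --include-git-untracked-files
repos:
  - id: /.*/
    pre_workflow_hooks:
      - run: cdktf synth   # writes the generated *.tf.json into the working directory
# Untracked-but-not-ignored generated files could then be discovered with:
#   git ls-files --others --exclude-standard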
Describe the drawbacks of your solution
None
Describe alternatives you've considered
None
runatlantis/atlantis
GitHub
09/01/2023, 5:41 AM
GitHub
09/01/2023, 5:17 PM
I would like halt_on_failure to be added to pre-workflow hooks. I am happy to contribute if the maintainers and the community think there is value in having this feature.
Use case
We use https://github.com/transcend-io/terragrunt-atlantis-config to generate atlantis.yaml. In our teams, developers open PRs to our repos, and if there is an error in the Terragrunt config or HCL, the config generator fails too, resulting in no atlantis.yaml. Because of this, Atlantis falls back to the default workflow, which produces a confusing error because it cannot understand Terragrunt configs (a sketch of the proposed option follows the error output below):
running "/atlantis-data/bin/terraform1.0.0 plan -input=false -refresh -no-color -out \"/atlantis-data/repos/deliveryhero/pd-sre-terraform/34/default/squads/dark-stores/groceries-product-service-golden-signals/staging/default.tfplan\"" in "/atlantis-data/repos/deliveryhero/pd-sre-terraform/34/default/squads/dark-stores/groceries-product-service-golden-signals/staging": exit status 1
Error: No configuration files
Plan requires configuration to be present. Planning without a configuration
would mark everything for destruction, which is normally not what is desired.
If you would like to destroy everything, run plan with the -destroy option.
Otherwise, create a Terraform configuration file (.tf file) and try again.
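A sketch of what the requested option could look like (halt_on_failure is hypothetical; the hook command mirrors the use case above):
repos:
  - id: /.*/
    pre_workflow_hooks:
      - run: terragrunt-atlantis-config generate --output atlantis.yaml
        halt_on_failure: true  # proposed: stop handling the event if this hook fails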
runatlantis/atlantis
GitHub
09/04/2023, 1:47 AM
---
repos:
  - id: /.*/
    allowed_overrides: [workflow]
    allow_custom_workflows: true
    apply_requirements: [mergeable,approved,undiverged]
atlantis.yaml:
version: 3
projects:
  - name: internal_hub
    dir: ./terraform/internal_hub
    autoplan:
      when_modified: ["**/*.tf","**/*.hcl"]
      enabled: true
After pushing to the MR, I see the plan file in the Atlantis pod:
atlantis-0:/$ cd /atlantis-data/repos/
atlantis-0:/atlantis-data/repos$ find | grep tfplan
./__REPO__/753/default/terraform/internal_hub/internal_hub-default.tfplan
But when I try to discard the plan via the web UI, I get a 500 error:
deleting lock failed with: remove /atlantis-data/repos/__REPO__/753/default/terraform/internal_hub/default.tfplan: no such file or directory
Pod logs:
{"level":"info","ts":"2023-09-04T01:23:05.042Z","caller":"events/working_dir.go:382","msg":"Deleting plan: /atlantis-data/repos/__REPO__/753/default/terraform/internal_hub/default.tfplan","json":{}}
{"level":"warn","ts":"2023-09-04T01:23:05.042Z","caller":"events/delete_lock_command.go:41","msg":"Failed to delete plan: remove /atlantis-data/repos/__REPO__/753/default/terraform/internal_hub/default.tfplan: no such file or directory","json":{},"stacktrace":"<http://github.com/runatlantis/atlantis/server/events.(*DefaultDeleteLockCommand).DeleteLock|github.com/runatlantis/atlantis/server/events.(*DefaultDeleteLockCommand).DeleteLock>\n\tgithub.com/runatlantis/atlantis/server/events/delete_lock_command.go:41\ngithub.com/runatlantis/atlantis/server/controllers.(*LocksController).DeleteLock\n\tgithub.com/runatlantis/atlantis/server/controllers/locks_controller.go:114\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2136\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\tgithub.com/gorilla/mux@v1.8.0/mux.go:210\ngithub.com/urfave/negroni/v3.(*Negroni).UseHandler.Wrap.func1\n\tgithub.com/urfave/negroni/v3@v3.0.0/negroni.go:59\ngithub.com/urfave/negroni/v3.HandlerFunc.ServeHTTP\n\tgithub.com/urfave/negroni/v3@v3.0.0/negroni.go:33\ngithub.com/urfave/negroni/v3.middleware.ServeHTTP\n\tgithub.com/urfave/negroni/v3@v3.0.0/negroni.go:51\ngithub.com/runatlantis/atlantis/server.(*RequestLogger).ServeHTTP\n\tgithub.com/runatlantis/atlantis/server/middleware.go:70\ngithub.com/urfave/negroni/v3.middleware.ServeHTTP\n\tgithub.com/urfave/negroni/v3@v3.0.0/negroni.go:51\ngithub.com/urfave/negroni/v3.(*Recovery).ServeHTTP\n\tgithub.com/urfave/negroni/v3@v3.0.0/recovery.go:210\ngithub.com/urfave/negroni/v3.middleware.ServeHTTP\n\tgithub.com/urfave/negroni/v3@v3.0.0/negroni.go:51\ngithub.com/urfave/negroni/v3.(*Negroni).ServeHTTP\n\tgithub.com/urfave/negroni/v3@v3.0.0/negroni.go:111\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2938\nnet/http.(*conn).serve\n\tnet/http/server.go:2009"}
I think something is wrong with how the slash in the repo path is escaped.
Besides, there is one job in GitLab CI which will never finish.
[screenshot: screen_atlantis]
GitHub
09/05/2023, 3:42 PM
This appears to already be possible via --tf-download-url, but it would be nice to have it officially documented so people can start using opentf instead of terraform for future releases.
https://www.runatlantis.io/docs/server-configuration.html#tf-download-url
Describe the solution you'd like
See above
Describe the drawbacks of your solution
N/A
Describe alternatives you've considered
N/A
* * *
• Previous ticket #3727
• #1776
• https://github.com/warrensbox/terraform-switcher
• warrensbox/terraform-switcher#315
atlantis/server/core/terraform/terraform_client.go
Line 379 in 3ea2914
atlantis/server/core/terraform/terraform_client.go
Lines 555 to 562 in 3ea2914
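For reference, the existing flag can already point downloads at a different release server; a hedged example as a chart environment variable (the URL is illustrative, not an official OpenTofu endpoint):
- name: ATLANTIS_TF_DOWNLOAD_URL
  value: "https://my-mirror.example.com/terraform-releases"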
runatlantis/atlantis
GitHub
09/06/2023, 2:31 PM
version: 3
projects:
  - name: qa
    dir: qa_acct/qa_env
    terraform_version: v0.12.8
    autoplan:
      when_modified: ["../../projects/*", "*.tf*", "../../modules/*"]
      enabled: false
  - name: staging
    dir: prod_acct/staging_env
    terraform_version: v0.12.8
    autoplan:
      when_modified: ["../../projects/*", "*.tf*", "../../modules/*"]
      enabled: false
  - name: prod
    dir: prod_acct/prod_env
    terraform_version: v0.12.8
    autoplan:
      when_modified: ["../../projects/*", "*.tf*", "../../modules/*"]
      enabled: false
Plans are generated for all three projects as normal after commenting exactly atlantis plan. Immediately afterward, commenting atlantis apply attempts to apply all three environments, as expected. In this case, there was an apply error due to a misconfigured AWS IAM policy, and the plans were not successfully applied. A commit was pushed to fix this issue and another atlantis apply was submitted. Note that there was no further atlantis plan after the fix commit was pushed. Atlantis behaved as if it had forgotten about the failed applies and assumed they had succeeded when, in fact, they had not. I believe the expected behavior should be to reject the apply since new commits were made and to force another plan to be run, correct?
The result was the following:
Ran Apply for 0 projects:
Automatically merging because all plans have been successfully applied.
Locks and plans deleted for the projects and workspaces modified in this pull request:
* dir: `prod_acct/prod_env` workspace: `default`
* dir: `prod_acct/staging_env` workspace: `default`
* dir: `qa_acct/qa_env` workspace: `default`
runatlantis/atlantis
GitHub
09/06/2023, 9:05 PM
When apply is run against a request that needs but does not have approval, Atlantis comments this error:
Apply Failed: Pull request must be approved by at least one person other than the author before running apply.
Presumably from this line of code: https://github.com/runatlantis/atlantis/blob/main/server/events/command_requirement_handler.go#L25
However, at least on my setup (using Approval Rules in GitLab), what it says is not true: the person authoring the request can in fact approve it. Preventing that is a setting one could put in a GitLab Approval Rule, but it is not required.
Peeking at other VCSs (https://github.com/runatlantis/atlantis/tree/master/server/events/vcs), I don't see any code that makes sure the approver is different from the author. It seems like Atlantis just trusts whatever the underlying VCS calls "approved".
I'm not sure if it's best to fix this by making the message vaguer ("Pull request must be approved according to the project's approval rules") or by trying to figure out what those rules are per VCS; that's up to the Atlantis team.
Reproduction Steps
1. Open an MR in a context with Approval Rules and Approvals Required (in my case in gitlab, not sure how this affects other VCSs)
2. Type "atlantis apply" without approving, and this comment will come up
Environment details
• Atlantis version: v0.19.3
• Atlantis flags:
ATLANTIS_PARALLEL_POOL_SIZE = 5
ATLANTIS_LOG_LEVEL = "debug"
ATLANTIS_AUTOMERGE = "true"
ATLANTIS_REPO_WHITELIST = "gitlab.dev.tripadvisor.com/techops/terraform-aws,gitlab.dev.tripadvisor.com/techops/cloud/*"
ATLANTIS_GITLAB_HOSTNAME = "https://gitlab.dev.tripadvisor.com"
ATLANTIS_GITLAB_USER = "atlantis-svc"
ATLANTIS_WRITE_GIT_CREDS = "true"
Atlantis server-side config file:
repos:
  - id: /.*/
    workflow: terragrunt
    allowed_overrides: [workflow]
    apply_requirements: ["approved", "mergeable"]
workflows:
  terragrunt:
    plan:
      steps:
        - env:
            name: TERRAGRUNT_TFPATH
            command: 'echo "terraform${ATLANTIS_TERRAFORM_VERSION}"'
        - run: terragrunt plan -no-color -out=$PLANFILE
    apply:
      steps:
        - env:
            name: TERRAGRUNT_TFPATH
            command: 'echo "terraform${ATLANTIS_TERRAFORM_VERSION}"'
        - run: terragrunt apply -no-color $PLANFILE
Repo atlantis.yaml file:
version: 3
projects:
  - name: core-network
    workspace: core-network
    dir: .
    terraform_version: v1.1.7
  - dir: us-east-1
    workspace: core-network-us-east-1
    terraform_version: v1.1.7
runatlantis/atlantis
GitHub
09/07/2023, 5:20 AM
GitHub
09/07/2023, 1:22 PM
When running the atlantis unlock command on a PR, the whole working directory is deleted rather than just the plan files. This is an issue especially on PRs with a large number of plans, as the next plan on the PR must then go through the terraform init process again for each plan, playing Russian roulette as to whether the outstanding race condition with the Terraform provider cache will cause a failure, as discussed in hashicorp/terraform#31964.
Reproduction Steps
1. Create a PR that contains changes for at least one plan.
2. Comment atlantis plan on the PR.
3. On the Atlantis server, check that the working directory for the PR has been created, that the relevant repo branch has been cloned into it, and that the relevant tfplan files were created.
4. Comment atlantis unlock on the PR.
5. On the Atlantis server, see that the working directory for the PR is now empty.
Environment details
• Atlantis version: 0.25.0
Additional Context
I recommend changing the DeleteLocksByPull function here
atlantis/server/events/delete_lock_command.go
Line 62 in 70c9f17
to call WorkingDir.DeletePlan rather than deleteWorkingDir, which would mean just the plan files are deleted and the working directory remains, complete with any initialised Terraform folders.
runatlantis/atlantis
GitHub
09/08/2023, 3:08 PM
GitHub
09/12/2023, 7:30 AM
GitHub
09/13/2023, 3:22 PM
GitHub
09/14/2023, 3:52 AM
{"level":"warn","ts":"2023-09-14T03:42:14.057Z","caller":"events/plan_command_runner.go:332","msg":"unable to update commit status: POST https://api.github.com/repos/MYORG/MYREPO/statuses/4da1c56b7fb29a3f1398f4be1b158d623963dd04: 404 Not Found []","json":{"repo":"MYORG/MYREPO","pull":"150"},"stacktrace":"github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).updateCommitStatus
github.com/runatlantis/atlantis/server/events/plan_command_runner.go:332
github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).runAutoplan
github.com/runatlantis/atlantis/server/events/plan_command_runner.go:151
github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).Run
github.com/runatlantis/atlantis/server/events/plan_command_runner.go:290
github.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunAutoplanCommand
github.com/runatlantis/atlantis/server/events/command_runner.go:177"}
{"level":"error","ts":"2023-09-14T03:42:14.122Z","caller":"vcs/instrumented_client.go:231","msg":"Unable to update status at url: , error: POST https://api.github.com/repos/MYORG/MYREPO/statuses/4da1c56b7fb29a3f1398f4be1b158d623963dd04: 404 Not Found []","json":{"repository":"MYORG/MYREPO","pull-num":"150"},"stacktrace":"github.com/runatlantis/atlantis/server/events/vcs.(*InstrumentedClient).UpdateStatus
github.com/runatlantis/atlantis/server/events/vcs/instrumented_client.go:231
github.com/runatlantis/atlantis/server/events/vcs.(*ClientProxy).UpdateStatus
github.com/runatlantis/atlantis/server/events/vcs/proxy.go:84
github.com/runatlantis/atlantis/server/events.(*DefaultCommitStatusUpdater).UpdateCombinedCount
github.com/runatlantis/atlantis/server/events/commit_status_updater.go:81
github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).updateCommitStatus
github.com/runatlantis/atlantis/server/events/plan_command_runner.go:324
github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).runAutoplan
github.com/runatlantis/atlantis/server/events/plan_command_runner.go:152
github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).Run
github.com/runatlantis/atlantis/server/events/plan_command_runner.go:290
github.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunAutoplanCommand
github.com/runatlantis/atlantis/server/events/command_runner.go:177"}
{"level":"warn","ts":"2023-09-14T03:42:14.122Z","caller":"events/plan_command_runner.go:332","msg":"unable to update commit status: POST https://api.github.com/repos/MYORG/MYREPO/statuses/4da1c56b7fb29a3f1398f4be1b158d623963dd04: 404 Not Found []","json":{"repo":"MYORG/MYREPO","pull":"150"},"stacktrace":"github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).updateCommitStatus
github.com/runatlantis/atlantis/server/events/plan_command_runner.go:332
github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).runAutoplan
github.com/runatlantis/atlantis/server/events/plan_command_runner.go:152
github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).Run
github.com/runatlantis/atlantis/server/events/plan_command_runner.go:290
github.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunAutoplanCommand
github.com/runatlantis/atlantis/server/events/command_runner.go:177"}
runatlantis/atlantis
GitHub
09/18/2023, 9:15 PM
The container runs as root instead of atlantis.
Reproduction Steps
✗ docker run -it --rm --entrypoint bash ghcr.io/runatlantis/atlantis:v0.25-alpine -c 'whoami'
root
Logs
Environment details
Additional Context
The fix would be to add USER atlantis at the bottom, before the entrypoint (sketched below). This may require additional changes, such as chmod/chown access to the other directories that Atlantis commonly accesses, or even migrating to a proper home-specific directory structure.
This will most likely require changes to docker-entrypoint.sh and the subsequent removal of the vulnerable `gosu` (see its issues).
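A minimal sketch of that fix (the paths to chown are assumptions and would need auditing):
# Ensure runtime directories are writable by the unprivileged user.
RUN chown -R atlantis:atlantis /atlantis-data /home/atlantis
# Drop privileges so the process no longer runs as root.
USER atlantis
ENTRYPOINT ["docker-entrypoint.sh"]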
runatlantis/atlantis
GitHub
09/18/2023, 11:00 PM
sh -l or bash -l is not supported by gosu, so it will not run the .profile file. Thus, we need to softlink the binaries into /usr/local/bin for them to be available to all users (root and atlantis).
RUN apk add --no-cache \
build-base \
libffi-dev \
openssl-dev \
bzip2-dev \
zlib-dev \
readline-dev \
sqlite-dev
RUN cd $HOME \
&& git clone https://github.com/pyenv/pyenv.git $HOME/.pyenv \
&& cd $HOME/.pyenv \
&& git branch pyenv-2.3.27 v2.3.27 \
&& git checkout pyenv-2.3.27 \
&& cd $HOME \
&& git clone https://github.com/pyenv/pyenv-virtualenv.git $HOME/.pyenv/plugins/pyenv-virtualenv \
&& cd $HOME/.pyenv/plugins/pyenv-virtualenv \
&& git branch virtualenv-1.2.1 v1.2.1 \
&& git checkout virtualenv-1.2.1 \
&& cd $HOME \
&& echo 'export PYENV_ROOT="$HOME/.pyenv"' >> $HOME/.profile \
&& echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> $HOME/.profile \
&& echo 'eval "$(pyenv init -)"' >> $HOME/.profile \
&& echo 'eval "$(pyenv virtualenv-init -)"' >> $HOME/.profile \
&& source $HOME/.profile \
&& pyenv install \
3.8.18 \
3.9.18 \
3.10.13 \
3.11.5 \
&& ln -sf $HOME/.pyenv/versions/3.8.18/bin/python3.8 /usr/local/bin/ \
&& ln -sf $HOME/.pyenv/versions/3.9.18/bin/python3.9 /usr/local/bin/ \
&& ln -sf $HOME/.pyenv/versions/3.10.13/bin/python3.10 /usr/local/bin/ \
&& ln -sf $HOME/.pyenv/versions/3.11.5/bin/python3.11 /usr/local/bin/
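A quick way to verify the symlinked interpreters resolve for any user (the image name is illustrative):
docker run --rm --entrypoint python3.11 my-org/atlantis-python:latest --version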
runatlantis/atlantis
GitHub
09/19/2023, 9:56 AM
repos:
  - id: /.*/
    pre_workflow_hooks:
      - run: terraform init
      - run: ./script.sh
        description: Generating configs
    branch: /.*/
    workflow: run-plan-stg
    apply_requirements: [approved, mergeable]
    allowed_overrides: [apply_requirements, workflow, delete_source_branch_on_merge]
    allowed_workflows: [run-plan-stg]
    allow_custom_workflows: true
workflows:
  run-plan-stg:
    plan:
      steps:
        - run: echo "Run plan on $WORKSPACE"
        - init
        - plan
    apply:
      steps:
        - run: echo "In Terraform Apply"
        - apply
And what I'm getting is that every time the workflow is triggered, Atlantis tries to plan the default workspace:
[Screenshot 2023-09-19 at 11:25:00 AM]
repos:
  - id: /.*/
    pre_workflow_hooks:
      - run: terraform init
      - run: ./script.sh
        description: Generating configs
    branch: /.*/
  - workflow: run-plan
    ** workspace: stg **
    apply_requirements: [approved, mergeable]
    allowed_overrides: [apply_requirements, workflow, delete_source_branch_on_merge]
    allowed_workflows: [run-plan]
    allow_custom_workflows: true
  ** - workflow: run-plan
    workspace: stg-eu
    apply_requirements: [approved, mergeable]
    allowed_overrides: [apply_requirements, workflow, delete_source_branch_on_merge]
    allowed_workflows: [run-plan]
    allow_custom_workflows: true **
workflows:
  run-plan:
    plan:
      steps:
        - run: echo "Run plan on $WORKSPACE"
        - init
        - plan
    apply:
      steps:
        - run: echo "In Terraform Apply"
        - apply
And this is how I get something similar using atlantis.yaml:
[Screenshot 2023-09-19 at 11:24:07 AM]
GitHub
09/19/2023, 2:15 PM
[image]
GitHub
09/20/2023, 3:21 PM
When running atlantis plan, it intermittently throws the following error:
The default workspace at path <path to project> is currently locked by another command that is running for this pull request. Wait until the previous command is complete and try again.
This is not true for all projects; only specific projects show this behavior, even though the same default workflow is used for all of them.
Reproduction Steps
This is an intermittent issue, so I couldn't reliably reproduce the behavior.
Logs
Environment details
Additional Context
runatlantis/atlantis
GitHub
09/25/2023, 4:26 PM
✗ dockle ghcr.io/runatlantis/atlantis:v0.25.0-alpine
WARN - CIS-DI-0001: Create a user for the container
* Last user should not be root
INFO - CIS-DI-0005: Enable Content trust for Docker
* export DOCKER_CONTENT_TRUST=1 before docker pull/build
INFO - CIS-DI-0006: Add HEALTHCHECK instruction to the container image
* not found HEALTHCHECK statement
debian
✗ dockle ghcr.io/runatlantis/atlantis:v0.25.0-debian
WARN - CIS-DI-0001: Create a user for the container
* Last user should not be root
INFO - CIS-DI-0005: Enable Content trust for Docker
* export DOCKER_CONTENT_TRUST=1 before docker pull/build
INFO - CIS-DI-0006: Add HEALTHCHECK instruction to the container image
* not found HEALTHCHECK statement
INFO - CIS-DI-0008: Confirm safety of setuid/setgid files
* setgid file: grwxr-xr-x usr/bin/expiry
* setuid file: urwxr-xr-x usr/bin/chfn
* setuid file: urwxr-xr-x usr/bin/umount
* setgid file: grwxr-xr-x usr/bin/ssh-agent
* setgid file: grwxr-xr-x usr/bin/wall
* setuid file: urwxr-xr-x usr/lib/openssh/ssh-keysign
* setuid file: urwxr-xr-x usr/bin/passwd
* setuid file: urwxr-xr-x usr/bin/su
* setuid file: urwxr-xr-x usr/bin/mount
* setuid file: urwxr-xr-x usr/bin/newgrp
* setgid file: grwxr-xr-x usr/bin/chage
* setuid file: urwxr-xr-x usr/bin/gpasswd
* setuid file: urwxr-xr-x usr/bin/chsh
* setgid file: grwxr-xr-x usr/sbin/unix_chkpwd
Additional Context
• https://github.com/goodwithtech/dockle
• Related to #3777
runatlantis/atlantis
GitHub
09/25/2023, 7:17 PM
Set the checkout strategy to merge. Plan a PR. Merge changes into the base branch. Apply the PR.
Logs
The relevant log is shown above. Let me know if a more detailed log is needed.
Environment details
• Atlantis version: I've experienced this problem on 0.23.5 and 0.24.2
• Deployment method: docker container on AWS fargate
runatlantis/atlantis
GitHub
09/25/2023, 7:17 PM
GitHub
09/25/2023, 7:53 PM
During git clone (with depth ...), the clone fails with:
fatal: could not read Username for 'https://github.com': No such device or address
Reproduction Steps
To reproduce, set up Atlantis with:
- name: ATLANTIS_GH_APP_ID
  value: '376225'
- name: ATLANTIS_GH_APP_KEY
  valueFrom:
    secretKeyRef:
      name: atlantis-vcs
      key: ATLANTIS_GH_APP_KEY
- name: ATLANTIS_WRITE_GIT_CREDS
  value: "true"
- name: ATLANTIS_GH_WEBHOOK_SECRET
  valueFrom:
    secretKeyRef:
      name: atlantis-vcs
      key: ATLANTIS_GH_WEBHOOK_SECRET
- name: ATLANTIS_GH_ORG
  value: some-org
### Bitbucket Config ###
- name: ATLANTIS_BITBUCKET_USER
  value: GCP-Infra-Team-Bot
- name: ATLANTIS_BITBUCKET_TOKEN
  valueFrom:
    secretKeyRef:
      name: atlantis-vcs
      key: ATLANTIS_BITBUCKET_TOKEN
What will happen is that the server sees that a Bitbucket user and token are set and writes the authenticated URL for Bitbucket to the .git-credentials file, but then never writes the line for GitHub (illustrated below).
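For illustration, a healthy .git-credentials would contain one line per host, something like the following (a sketch; exact usernames and token formats vary by auth method):
https://x-access-token:<github-app-token>@github.com
https://GCP-Infra-Team-Bot:<bitbucket-token>@bitbucket.org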
I feel like this is a bug. I had a look at the code; I think it might be here: https://github.com/runatlantis/atlantis/blob/main/server/events/vcs/git_cred_writer.go#L40
Since the GitHub credentials are never written but the file already exists, it will never write the credential. If I delete the file while the server is running, it will eventually rewrite the file with the GitHub credential, but without the Bitbucket one.
Is this intended behavior or a bug?
runatlantis/atlantis
GitHub
09/25/2023, 8:50 PM
Looking at the CommandRunner code piece, I think it should be feasible to run that code in two modes:
• current (default) mode, where the event controller creates a command runner
• worker (new) mode, where the event controller spawns a k8s job which:
  • uses the same image or an image without the webserver component
  • executes the event handler logic using the arguments provided by the event controller
  • uses an RWM volume mount to save state such as generated plans
In values.yaml this could be something like:
remote-execution:
  enabled: true
  stateVolumeName: my-volume
  workerAffinity: {}
  workerTolerations: {}
  workerResources: {}
  workerLimits: {}
If enabled, this flag should also add a CRD to the namespace Atlantis is installed in:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: worker-job.atlantis.runatlantis.io
spec:
  group: atlantis.runatlantis.io
  scope: Namespaced
  names:
    plural: worker-jobs
    singular: worker-job
    kind: AtlantisWorker
    shortNames:
      - aw
  versions:
    - name: v1beta1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                metadata:
                  type: object
                  properties:
                    labels:
                      type: object
                      additionalProperties:
                        type: string
                      nullable: true
                    annotations:
                      type: object
                      additionalProperties:
                        type: string
                      nullable: true
                  description: Job metadata
                template:
                  type: object
                  x-kubernetes-embedded-resource: true
                  x-kubernetes-preserve-unknown-fields: true
This will allow storing whole k8s job template on the cluster. Now the event handling flow will require an adjustment, I tried to put this into a mermaid:
graph TD;
    A[event] --> B{remote execution enabled?};
    B -->|No| C[local runner];
    B -->|Yes| X;
    X --> D[Prepare EventContext];
    X --> E[resolve CommandType];
    X --> F[read job template into RemoteJob object];
    D --> G[set cmd and args in container spec];
    E --> G;
    F --> G;
    G --> H[Send generated RemoteJob to the cluster];
    H --> I[Wait for RemoteJob to complete];
    I -->|Receive HTTP POST from job| J[Job Completed];
    I -->|Check status on cluster| J;
    J --> K[Send webhook event data];
Note that with the proposed mode (adding a new CommandRunner) worker will be responsible for VCS communication as that class has VcsClient, so maybe actually checking the job result is not needed at all.
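To make the proposal concrete, here is a hypothetical AtlantisWorker instance matching the CRD sketch above (all names and args are invented for illustration):
apiVersion: atlantis.runatlantis.io/v1beta1
kind: AtlantisWorker
metadata:
  name: plan-pr-123
spec:
  metadata:
    labels:
      atlantis/command: plan
  template:
    apiVersion: batch/v1
    kind: Job
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: worker
              image: ghcr.io/runatlantis/atlantis:v0.25.0
              args: ["worker", "plan", "--pull-num", "123"]  # hypothetical CLI mode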
Describe the drawbacks of your solution
I do not see a lot of challenges with maintaining the k8s integration itself, as the Batch API has been very stable and has plenty of features that Atlantis can make use of in the future; suspend, for example, could be used for delayed apply. The only detail here is that every minor k8s release brings some exciting stuff to use, so if Atlantis uses it, we'll be forced to maintain a k8s feature compatibility matrix and decide how we support different k8s versions and what is available depending on the version. However, the current architecture is not very friendly towards supporting remote execution, and it would require some effort to add this feature so it can work alongside existing functionality, or work selectively based on the k8s version of the cluster Atlantis is deployed to.
Then, this adds a CRD to the chart, which comes with its own fun like migrations. This could be avoided by moving the base template into the app config instead.
However, running an external runner requires an image, and regardless of the route chosen this adds maintenance. If we go with a single image as now, we'll have to add support for a "cli" mode on top of the current "webserver" mode. If we go with two images, that adds a lot of chores to build and publish a second image. Plus, some people might want to run their own image, and they will open issues asking us to support that, so a bit of a Pandora's box here.
Last but not least, running an external process in an environment like k8s always comes with the cost of investing in bookkeeping. What happens if the job fails to execute the command? How do we handle exit 137 or other special exit codes when the container might not be able to communicate its status gracefully? Most likely we'll need some sort of "garbage collector" (where I work, we call them "maintainers"): another app instance that handles these edge cases. Note this is not about removing jobs, as the TTL controller handles that no problem, but rather about situations when somebody runs atlantis plan and gets silence in return because the launched job crashed due to app misconfiguration etc.
Overall, I think all of this is manageable, but no doubt it adds a new level of complexity to the app and will require more maintenance than before.
Describe alternatives you've considered
The alternatives will always revolve around either multithreading or running replicaCount: >1. I'd say the latter would be great, if it's possible and easier to implement compared to k8s jobs.
runatlantis/atlantis
GitHub
09/25/2023, 9:09 PM