This message was deleted Atlantis #atlantis-contributors

# atlantis-contributors

Slackbot

06/12/2023, 1:25 PM

This message was deleted.

Bruno Schaatsbergen

06/12/2023, 1:26 PM

cc @Chris ter Beke (tagging my colleague for visibility).

Bruno Schaatsbergen

06/12/2023, 1:28 PM

cc @RB, @PePe Amengual, @Dylan Page

Dylan Page

06/12/2023, 1:29 PM

It is, might I also suggest maybe proposing an ADR first for context and we can use the meeting to discuss the ADR?

Chris ter Beke

06/12/2023, 1:41 PM

Hi! Thanks for inviting and tagging me @Bruno Schaatsbergen. So first a little context: we were investigating using Atlantis at some customers at a scale that requires different teams to deploy their GCP infrastructure projects/repositories in a scalable but isolated way. For me this means that a central DevOps/CCoE team manages the Atlantis deployment, and client teams can request their repositories to be 'whitelisted' and automatically be able to deploy to just their GCP projects. Unfortunately this currently does not seem possible without changing the Atlantis server config, restarting Atlantis, and creating a lot of repeated config in the server config. Our preferred solution direction involves GCP Workload Identity Federation (which is based on OIDC), but this requires jobs to not be able to break out and access the OIDC private key that's somewhere on the filesystem. In the code we can see that Atlantis runs a series of `os.exec` commands, but these run as the same user/context as the main process, meaning a client team can always create a branch, run some escalating shell commands to gain access to the private key, generate any JWT they want, and gain access to other clients' GCP resources. Is there any way that a more isolated job model could fit into Atlantis' architecture (changing user IDs for jobs, changing chroot, running in containers, remote runners, etc.)?
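To make the concern concrete, here is a minimal Python sketch (not Atlantis code) of why a step spawned by the main process is not isolated: a child process inherits the parent's user ID, so it can read any file the parent can. The key file and the `run_step_as_child` helper are hypothetical stand-ins, and the sketch assumes a POSIX system.

```python
import os
import subprocess
import sys
import tempfile


def run_step_as_child(secret_path):
    """Spawn a child process (as a workflow step would be) and return
    the UID it runs as plus what it managed to read from secret_path."""
    child_code = (
        "import os, sys\n"
        "print(os.getuid())\n"
        "print(open(sys.argv[1]).read())\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code, secret_path],
        capture_output=True,
        text=True,
        check=True,
    )
    uid_line, secret_line = result.stdout.splitlines()[:2]
    return int(uid_line), secret_line


if __name__ == "__main__":
    # Hypothetical stand-in for the OIDC private key on the filesystem.
    with tempfile.NamedTemporaryFile("w", suffix=".pem", delete=False) as f:
        f.write("FAKE-PRIVATE-KEY")
    os.chmod(f.name, 0o600)  # readable by the owning user only

    child_uid, leaked = run_step_as_child(f.name)
    print("same user as parent:", child_uid == os.getuid())
    print("child could read the key:", leaked == "FAKE-PRIVATE-KEY")
    os.unlink(f.name)
```

File permissions alone do not help here, because the step and the server are the same user; the options listed in the message (separate user IDs, chroot, containers, remote runners) all work by breaking that shared identity.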

Dylan Page

06/12/2023, 2:16 PM

So right now with the current architecture, it's not possible to run a more isolated job model. We are definitely thinking about moving to a more "control/data plane cloud-native" model (I put it in quotes cuz buzzwords) than what we have currently, which would enable more job models, including the isolated one that you want to implement. My focus first has been mainly on the filesystem side of the workers due to problems with file locking when doing Git cloning. But happy to discuss and plan out a proposal on how to further break out certain subsystems of Atlantis to better fit the new model.

Chris ter Beke

06/13/2023, 10:01 AM

Thanks for the response! I would be a strong proponent of a cloud native model where the orchestration plane is separated from the runners. Always happy to brainstorm on that. I'm also keen to get it working nicely in Cloud Run so there's no VMs and VPC to manage 🙂

👍🏽 1

Bruno Schaatsbergen

07/28/2023, 11:51 AM

FYI on this (@PePe Amengual, @Dylan Page) - we found a neat solution in Google Cloud to isolate authentication at the account/project level for now: the identity attached to Atlantis is granted only 'impersonation/assume' rights on other service accounts (IAM roles). All the workflows are defined server-side, including a step per workflow to impersonate an identity specifically created for a project.
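A rough sketch of what such a setup could look like, generated from a single mapping so the per-project config does not have to be repeated by hand. This is an assumption about the shape, not the actual config used here: it follows my understanding of the Atlantis server-side repo config (`repos`, `workflows`, `env` steps) and relies on the `GOOGLE_IMPERSONATE_SERVICE_ACCOUNT` environment variable read by the Terraform Google provider. The repository IDs, service account emails, and the `build_repo_config` helper are hypothetical; check the key names against the Atlantis docs before use.

```python
import json

# Environment variable the Terraform Google provider reads to impersonate
# a service account (assumed to be the mechanism; not confirmed in the thread).
IMPERSONATION_ENV = "GOOGLE_IMPERSONATE_SERVICE_ACCOUNT"


def build_repo_config(projects):
    """Build a server-side repo config dict with one workflow per repository.

    projects maps a repository ID to the service account that repository's
    plans and applies should impersonate.
    """
    repos = []
    workflows = {}
    for repo_id, service_account in projects.items():
        workflow_name = repo_id.rsplit("/", 1)[-1]
        impersonate = {
            "env": {"name": IMPERSONATION_ENV, "value": service_account}
        }
        workflows[workflow_name] = {
            "plan": {"steps": [impersonate, "init", "plan"]},
            "apply": {"steps": [impersonate, "apply"]},
        }
        repos.append({
            "id": repo_id,
            "workflow": workflow_name,
            # Empty list: the repository cannot override its workflow.
            "allowed_overrides": [],
        })
    return {"repos": repos, "workflows": workflows}


if __name__ == "__main__":
    # Hypothetical repositories and service accounts.
    config = build_repo_config({
        "github.com/acme/team-a-infra":
            "atlantis-team-a@team-a-project.iam.gserviceaccount.com",
        "github.com/acme/team-b-infra":
            "atlantis-team-b@team-b-project.iam.gserviceaccount.com",
    })
    print(json.dumps(config, indent=2))
```

Because the workflows live in the server-side config and overrides are disabled, a client repository cannot swap in a different service account. The isolation still depends on the Atlantis identity holding impersonation rights only, and on steps not being able to run arbitrary commands under that identity.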

Bruno Schaatsbergen

07/28/2023, 11:51 AM

We are still keen on supporting OIDC, but currently we advise and help our clients (where we roll out Atlantis) to move to the above-mentioned setup first 🙂

Bruno Schaatsbergen

07/28/2023, 11:52 AM

I'll write something up on this in the near future, a blog post or w/e.

🙌 1

Dylan Page

07/28/2023, 1:17 PM

That is definitely ideal, just requires a bit of configuration with the workflows. I bet we could replicate with the other cloud providers too
