Pardeep Bhatt
04/05/2023, 5:54 AM
We are sharing ~/.atlantis using NFS across multiple nodes. We have this working: if a plan request lands on one node and the apply request lands on another, the apply runs fine because ~/.atlantis is synced between them. But this brings a new problem. When we make simultaneous plan requests and they land on the same node, we get this error:
Plan Error
The default workspace at path . is currently locked by another command that is running for this pull request.
Wait until the previous command is complete and try again.
which is fine and expected. But when the two requests land on different nodes, one of 3 scenarios happens:
1. The first plan passes and the second plan fails with a "failed to read plan/state file" error.
2. The first plan passes and the second plan fails with an "unable to get lock" error.
3. Both plans pass.
What we want is some way to return this error
Plan Error
The default workspace at path . is currently locked by another command that is running for this pull request.
Wait until the previous command is complete and try again.
to the user when multiple plan requests are made. That would be fine and we would be good to go ahead.
Any sort of help will be appreciated.
Thanks.
PePe Amengual
04/05/2023, 3:53 PM
Pardeep Bhatt
04/06/2023, 5:56 AM
PePe Amengual
04/06/2023, 6:01 AM
Pardeep Bhatt
04/06/2023, 6:01 AM
Plan Failed: This project is currently locked by an unapplied plan from pull !XX. To continue, delete the lock from !XX or apply that plan and merge the pull request.
Once the lock is released, comment atlantis plan here to re-plan.
So this is fine. But when multiple plan/apply requests are fired on the same MR without waiting for the response to the first one, there are two possible scenarios:
1. all requests land on the same node
2. the requests land on different nodes
For case 1) we expect to get the workspace lock error, i.e.
Plan Error
The default workspace at path . is currently locked by another command that is running for this pull request.
Wait until the previous command is complete and try again.
and this is happening as expected. In case 2) it is not happening, because from what we found in the code this information is stored in application memory and not in any database. To make it available to the other nodes, it needs to be stored in some common shared storage location; Redis could be used again here. This does add 2 I/O operations to every plan/apply request: first, a record that a plan for this path is about to start is written to Redis (like we are doing here, but written to Redis instead of application memory), and then that record is removed from Redis once the plan/apply operation has finished, e.g. inside the unlock fn.
What do you think of this approach? If you have something else in mind, please let me know.
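A minimal Go sketch of the idea above, under stated assumptions: `LockStore`, `SharedWorkingDirLocker`, `TryLock`, the key layout, and `defaultTTL` are hypothetical names for illustration, not Atlantis's actual API. The in-memory `memStore` only stands in for the shared store so the sketch runs on its own; with Redis, `SetNX` would correspond to `SET key value NX PX <ttl>` and `Del` to `DEL key`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// defaultTTL is an assumed upper bound on how long a plan/apply may hold the lock.
const defaultTTL = 10 * time.Minute

// ErrLocked mirrors the "workspace is currently locked" error quoted in the thread.
var ErrLocked = errors.New("workspace is currently locked by another command that is running for this pull request")

// LockStore is a hypothetical abstraction over storage shared by all nodes.
type LockStore interface {
	// SetNX stores key only if it is absent (or expired) and reports whether it was stored.
	SetNX(key, value string, ttl time.Duration) (bool, error)
	Del(key string) error
}

// memStore is an in-process stand-in for the shared store (for example Redis).
type memStore struct {
	mu      sync.Mutex
	entries map[string]time.Time // key -> expiry time
}

func newMemStore() *memStore { return &memStore{entries: map[string]time.Time{}} }

func (s *memStore) SetNX(key, _ string, ttl time.Duration) (bool, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if exp, ok := s.entries[key]; ok && time.Now().Before(exp) {
		return false, nil // someone else still holds the key
	}
	s.entries[key] = time.Now().Add(ttl)
	return true, nil
}

func (s *memStore) Del(key string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.entries, key)
	return nil
}

// SharedWorkingDirLocker is a hypothetical locker; every node builds one against the same store.
type SharedWorkingDirLocker struct {
	store LockStore
	ttl   time.Duration
}

// TryLock performs the first I/O (write the lock) and returns an unlock
// function that performs the second I/O (remove the lock).
func (l *SharedWorkingDirLocker) TryLock(repo string, pull int, workspace, path string) (func(), error) {
	key := fmt.Sprintf("atlantis:workdir:%s/%d/%s/%s", repo, pull, workspace, path)
	ok, err := l.store.SetNX(key, "locked", l.ttl)
	if err != nil {
		return nil, fmt.Errorf("acquiring shared lock: %w", err)
	}
	if !ok {
		return nil, fmt.Errorf("%w (workspace %q, path %q)", ErrLocked, workspace, path)
	}
	return func() { _ = l.store.Del(key) }, nil
}

func main() {
	store := newMemStore() // shared by both "nodes" below
	nodeA := &SharedWorkingDirLocker{store: store, ttl: defaultTTL}
	nodeB := &SharedWorkingDirLocker{store: store, ttl: defaultTTL}

	unlock, err := nodeA.TryLock("org/repo", 1, "default", ".")
	fmt.Println("node A plan:", err)

	_, err = nodeB.TryLock("org/repo", 1, "default", ".")
	fmt.Println("node B plan while A is running:", err)

	unlock() // node A finished its plan
	_, err = nodeB.TryLock("org/repo", 1, "default", ".")
	fmt.Println("node B plan after A finished:", err)
}
```

The TTL is there so that a node which crashes mid-plan does not hold the lock forever. One caveat with this sketch: a plain delete can remove a lock that another node acquired after the TTL expired, so a real implementation would likely store a per-holder token as the value and check it before deleting.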
We have a hard requirement to run Atlantis on multiple nodes: a single node can't handle the traffic we have, so we need a multi-node architecture.
PePe Amengual
04/12/2023, 12:29 PM
Nish Krishnan
04/12/2023, 3:51 PM
Pardeep Bhatt
04/21/2023, 5:52 AM
PePe Amengual
04/21/2023, 4:09 PM
Jon
04/21/2023, 5:55 PM