# give-feedback
w
We wanted to migrate the Airbyte application database out of its Docker container. In doing so, we realized that ALL of our API secrets and passwords (including personal passwords of various coworkers) were stored in plain text. Regardless of configuration routing (yes, we changed the Docker password), storing passwords in plain text is incredibly irresponsible because SOMEONE will have the Docker password. That doesn't mean that person should have easy access to every single password set up by others.
j
Many of us use a secrets manager, which moves these out of the Airbyte DB. But keep in mind that every time they're used they have to be decrypted to be sent to the target system, so for that moment they'll still exist in plain text. Overall this is due to the nature of this type of system: unless the target system supports something like OAuth, it has to have access to the raw secret (and even with OAuth, you're still storing tokens and other privileged values). The expectation is generally not to expose Airbyte to any unauthorized users. They do store secrets in a separate table, but it obviously isn't very hard to join the IDs from the JSON config to get there (though it does mean you could control access to that table differently than others). Sure, there are things like pgcrypto you can use for column-level encryption, but you still have to have the key in-system, and you can't use hashing since you need to send non-hashed values to the remote systems. Even with a secrets manager, anyone with access to it can still decrypt the secret, so really the only true protection is limiting access to the system and layering security. In our case, none of Airbyte's components are exposed to the web, and the admin UI is locked behind an identity-aware proxy (Google IAP, which restricts access to our trusted users). Further, we hide Airbyte in our app altogether and only interface with it via the API. So I agree that there are options out there, but currently those largely come from using a secrets manager. They could encrypt the values for storage, but since the key is present in-system, it's not much of a protection unless someone got a copy of a database backup without the corresponding application key.
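To make the "separate table" point concrete, here's a rough sketch of how little work the join is. Everything in it is an assumption for illustration: the table names (`connector_config`, `secret_store`), the `secret_id` reference key, and the sample values are hypothetical and are not Airbyte's actual schema. It uses an in-memory SQLite database so it runs on its own.

```python
import json
import sqlite3


def resolve_secrets(conn):
    """Follow secret references in each config back to the plaintext values.

    Assumes a hypothetical layout: connector_config.config is JSON in which a
    secret field looks like {"secret_id": "<id>"}, and secret_store maps that
    id to the stored value.
    """
    secrets = dict(conn.execute("SELECT id, value FROM secret_store"))
    resolved = {}
    for name, config_json in conn.execute("SELECT name, config FROM connector_config"):
        config = json.loads(config_json)
        resolved[name] = {
            field: secrets[ref["secret_id"]]
            for field, ref in config.items()
            if isinstance(ref, dict) and "secret_id" in ref
        }
    return resolved


def build_demo_db():
    """Create a throwaway database with one connector and one stored secret."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE connector_config (name TEXT, config TEXT)")
    conn.execute("CREATE TABLE secret_store (id TEXT PRIMARY KEY, value TEXT)")
    conn.execute("INSERT INTO secret_store VALUES (?, ?)", ("sec-1", "hunter2"))
    config = {"host": "db.internal", "password": {"secret_id": "sec-1"}}
    conn.execute(
        "INSERT INTO connector_config VALUES (?, ?)",
        ("pg-source", json.dumps(config)),
    )
    return conn


if __name__ == "__main__":
    print(resolve_secrets(build_demo_db()))
```

The indirection only helps if read access to the secrets table is actually restricted; anyone who can read both tables gets the values back in a few lines.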
w
I understand these limitations and don't disagree with any of them, and I understand that the application could not work without feeding plain text into connection details. However, the documentation makes absolutely zero mention of this potentially important lack of security and instead focuses on configuration management. I think it should at least be mentioned that this is how Airbyte functions (at least in OSS). This approach could violate a lot of company IT policies that explicitly prohibit storing passwords in plain text, and could be grounds for dismissal or reprimand if used in a business setting. If a bad actor truly wanted to get to the keys and had system access, there's not much to be done, sure. But an application key of some sort would still be better than nothing. At mature enterprises, DBAs and developers don't have a lot of overlap in access. Any encrypted key/password would require the DBA to:
1. know that there even is an application key
2. have access to the application code to find the key or the env file with the key
My pain point is not so much security against bad actors; it's that my fellow coworkers (or essentially whoever sets up the Docker user's password) shouldn't stumble upon everyone's personal passwords from setting up sources and destinations.
j
@Walker Philips Yeah, good points all around. I think it definitely makes sense to make it explicit in the docs. I know that secrets managers are a little newer to the mix, so maybe a warning that recommends using one would be a good stop-gap. And maybe a feature request around generating a key in the initial setup and using it for secrets in new installs would make sense too. If you're up for it, drop a PR for the docs fixes with some recommended language, and maybe a feature request for at-rest secret encryption by default as well (which I'm sure will be a bigger discussion). (I think there's a separate bug in that switching to a secrets manager breaks existing connections, when it should really just trigger a migration of the secrets, verify them, and then remove them from the internal storage.)
w
@Justin Beasley Yes, I will open one!
u
Thanks, Walker, for bringing up this discussion. I have shared your concerns with our engineering team. We agree that our documentation needs to be clear regarding this limitation in the default deployment. We also want to provide guidance on implementing external secrets for use in a production environment with Airbyte. I’ll work with our documentation team to include this information in our deployment instructions. Today Airbyte supports AWS, GCP, and Vault as external secret manager tools for production deployments.
w
Thanks for passing it along, Marcos. Let me know if I still need to make a pull request or if this is being handled already.