# advice-data-transformation
a
Hi Eugene, thank you for sharing this! Which version of Airbyte are you using? Could you also share the versions of the source-postgres and destination-bigquery connectors? Would you mind opening a topic on our forum about this problem? It would help centralize the discussion with our engineering team.
Feel free to share the reply from Google support too 🙏🏻
e
Airbyte version is 0.33.11-alpha. I can't find any information on connector versions in the interface. It also looks like Airbyte created all those VM instances. Is this technically possible? And if so, what's the reason behind it?
a
What is your deployment method? Docker or kubernetes?
e
docker
a
Airbyte does not trigger the creation of VMs on your cloud account, so I'm not sure this is an Airbyte-related problem. I'd suggest digging into the GCP audit logs to understand who created these instances.
e
Got it, I will. I just need to know it's impossible for Airbyte to create new VM instances so I can take it out of the equation. But all those instances were created on exactly the day I launched a bunch of syncs through Airbyte, so it was an obvious place to look.
a
it's impossible for Airbyte to create new VM instances so I can take it out of the equation
Airbyte itself does not provision any infrastructure. This is why I asked for your deployment method: a badly configured K8s cluster could lead to unlimited node creation, but again, this is not something managed by Airbyte.
Is the service account associated with your Airbyte VM permitted to create GCP instances?
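One way to answer that question is to look at the roles bound to the service account in the project's IAM policy. Below is a minimal sketch, assuming the policy has been exported as JSON (for example with `gcloud projects get-iam-policy PROJECT_ID --format=json`). The role list is a non-exhaustive assumption covering a few predefined roles that include `compute.instances.create`; custom roles, conditional bindings, and roles inherited from a folder or organization are not covered.

```python
import json

# Assumption: a non-exhaustive set of predefined roles that include the
# compute.instances.create permission. Custom roles are not detected.
ROLES_ALLOWING_INSTANCE_CREATE = {
    "roles/owner",
    "roles/editor",
    "roles/compute.admin",
    "roles/compute.instanceAdmin.v1",
}


def roles_for_member(policy: dict, member: str) -> list:
    """Return the sorted roles bound directly to `member` in an IAM policy."""
    return sorted(
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    )


def can_probably_create_instances(policy: dict, member: str) -> bool:
    """True if any role bound to `member` is in the known-risky set above."""
    return any(
        role in ROLES_ALLOWING_INSTANCE_CREATE
        for role in roles_for_member(policy, member)
    )


def load_policy(path: str) -> dict:
    """Load an IAM policy that was exported to a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Members are written with their type prefix, so a service account is looked up as `serviceAccount:NAME@PROJECT_ID.iam.gserviceaccount.com`. A sync-only account would normally need BigQuery roles and nothing from the set above.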
e
Yes. I guess it is
but can't be 100% sure. I deleted the service account cause we were getting charged even though all syncs were on pause since Friday
a
Did you have the opportunity to connect to these instances and get a glimpse of what they were running?
e
judging by the activity name it's somehow connected with data insertion
a
Are you sure that you are not running on Kubernetes? I see `pod` occurrences in your screenshot.
e
We are running Airbyte on our own servers and we deployed it using Docker. We synchronise tables in our MongoDB and MySQL databases with BigQuery using BigQuery connectors. We don't use Kubernetes as far as I know.
We followed these instructions when deploying it, didn't change anything apart from what is required in the instructions: https://docs.airbyte.com/deploying-airbyte/local-deployment
a
For the costs you mentioned, are they BigQuery costs or compute instance costs?
We are running Airbyte on our own servers and we deployed it using Docker.
Yes, sorry, the pod probably refers to an internal Google namespace.
OK, you need to find which service account created the VMs then.
e
Our own syncs don't utilize this from what I can see; we simply upload CSV files.
that's the only operation I see being performed on all those instances
a
I'll triple-check with our technical team, but I can't find an explicit reference to this kind of operation in our repo. It would help if you could get the IP address or any more details about these operations.
e
And it looks like we haven't had any activity of this type since we shut down Airbyte. But we still got charged; I assume it was because the instances had been created and hadn't been shut down (my lame explanation).
a
judging by the activity name it's somehow connected with data insertion
The activity you see in the screenshot is not data insertion but bulk creation of GCP instances. As far as I know, it's not something Airbyte requires, neither the Airbyte platform itself nor the source and destination connectors you are using. I wrote to our technical team to make sure that my assertions are correct. My impression is that the GCP instance on which you run Airbyte got compromised and someone was able to run this bulk creation of instances for malicious activity. I can't be 💯 sure of this unless you get more details about what is (or was) running on the new GCP instances.
e
Thanks. I'll try to get info from the Google Cloud team. I'm not able to dig any deeper, since after disabling billing I'm no longer allowed into the Compute Engine dashboards.
a
I don't have any context about your organization's IAM management, but I'm under the impression that the `sendpulse-backend` service account is not a default service account that GCP provisions for a new VM. Did you assign this existing `sendpulse-backend` service account to your Airbyte VM?
e
yes. I've created a JSON key for this account and specified it in Airbyte.
This one was created manually
m
If you open the activity, are you able to see the IP that generated the request?
e
I can only see IPs of VM instances
a
You need to consume the audit logs from GCP to get more details about what happened on your account.
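As a rough sketch of what consuming those logs could look like: Admin Activity audit entries carry the caller's identity in `protoPayload.authenticationInfo.principalEmail` and the source address in `protoPayload.requestMetadata.callerIp`. The snippet below assumes the entries were exported to JSON (for example with `gcloud logging read "$FILTER" --format=json --freshness=30d`) and counts instance-creation calls per principal and caller IP. The filter string and the method-name matching rule are assumptions, not something confirmed in this thread.

```python
import json
from collections import Counter

# Assumed Cloud Logging filter: Admin Activity audit entries for Compute Engine.
AUDIT_FILTER = (
    'logName:"cloudaudit.googleapis.com%2Factivity" '
    'AND protoPayload.serviceName="compute.googleapis.com"'
)


def is_instance_creation(entry: dict) -> bool:
    """Heuristic: method names such as '...instances.insert' or '...bulkInsert'."""
    method = entry.get("protoPayload", {}).get("methodName", "").lower()
    return "instances" in method and method.endswith("insert")


def summarize_creators(entries: list) -> Counter:
    """Count instance-creation log entries per (principal, caller IP)."""
    counts = Counter()
    for entry in entries:
        if not is_instance_creation(entry):
            continue
        payload = entry["protoPayload"]
        principal = payload.get("authenticationInfo", {}).get(
            "principalEmail", "unknown"
        )
        caller_ip = payload.get("requestMetadata", {}).get("callerIp", "unknown")
        counts[(principal, caller_ip)] += 1
    return counts


def load_entries(path: str) -> list:
    """Load log entries exported as a JSON array."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

The counts are log entries rather than instances, since one operation may be logged more than once. A principal matching the Airbyte service account combined with a caller IP that is not one of your servers would point to a leaked key rather than to Airbyte itself.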
👏🏻 for figuring this out, and I'm sorry you were targeted. The one thing you can be fairly sure of is that your service account is compromised, and you should renew it ASAP.
e
I've deleted it earlier today. Should be more careful about this in the future. Hopefully I will be able to arrange a refund with Google Cloud
a
I would suggest you reconnect to your original Airbyte instance VM, check which commands were run, and look through the authentication logs to see whether someone from outside the organization got console access to this VM.
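For the authentication-log part, a minimal sketch could scan the SSH daemon's log for successful logins from public addresses. It assumes OpenSSH's usual `Accepted <method> for <user> from <ip> port <n>` line format and a log file such as `/var/log/auth.log` (Debian/Ubuntu) or `/var/log/secure` (RHEL-like); both are assumptions about the VM's distribution.

```python
import ipaddress
import re

# Assumed OpenSSH success line, e.g.
# "... sshd[123]: Accepted publickey for deploy from 10.8.0.5 port 51234 ssh2"
ACCEPTED = re.compile(
    r"Accepted (?P<method>\S+) for (?P<user>\S+) from (?P<ip>\S+) port \d+"
)


def external_logins(lines) -> list:
    """Return (user, ip, method) for successful logins from public addresses."""
    hits = []
    for line in lines:
        match = ACCEPTED.search(line)
        if not match:
            continue
        try:
            ip = ipaddress.ip_address(match["ip"])
        except ValueError:
            continue  # not a parseable IP address, skip the line
        if not ip.is_global:
            continue  # private, loopback, or otherwise non-routable source
        hits.append((match["user"], str(ip), match["method"]))
    return hits
```

Passing an open file works, since the function only iterates over lines: `external_logins(open("/var/log/auth.log"))`. An empty result does not prove the VM is clean, because an attacker with root access can edit or rotate these logs.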
Even though it's not accessible to the outside world and runs locally on our servers with access through a private VPN
If you are confident about your networking setup, you should double-check that the service account file did not leak. Might you have committed it to a public repo?
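A quick first pass at that check is to scan a local checkout for files that look like service account keys. The sketch below relies on the fact that GCP JSON key files contain `"type": "service_account"` together with a `private_key` field; the function names are illustrative. It only inspects `.json` files in the working tree, so a key that was committed and later deleted would still need a search through the Git history.

```python
import json
from pathlib import Path


def looks_like_service_account_key(text: str) -> bool:
    """True if `text` parses as JSON shaped like a GCP service account key."""
    try:
        data = json.loads(text)
    except ValueError:
        return False
    return (
        isinstance(data, dict)
        and data.get("type") == "service_account"
        and "private_key" in data
    )


def find_key_files(root: str) -> list:
    """Return sorted paths of .json files under `root` that look like keys."""
    found = []
    for path in Path(root).rglob("*.json"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file, skip it
        if looks_like_service_account_key(text):
            found.append(path)
    return sorted(found)
```

Running `find_key_files(".")` from the root of each repository that touches the Airbyte deployment gives a list of files to review.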
I'd suggest removing all your organization-related information from this thread too (all your screenshots