Hi, we want to use google cloud storage as a plugg...
# general
e
Hi, we want to use google cloud storage as a pluggable storage, does anyone have any example configs or advice on how to do that? cc: @User @User
x
Do you mean use gs buckets?
e
Hi, yes, that's what I mean:)
x
I suggest using a persistent disk for the Pinot server
GCS could be used as the controller deep store
e
For the temp directory or are you saying we shouldn't use gcs buckets at all?
Ah, makes sense - so how would we use gcs for controller deep store?
x
The thing I’ve tried out is to set up an NFS on top of GCS
e
Nice! How do you do that? Or is there a pull request or code you can point me to?
x
Then mount the NFS to all controller pods
There is an open source repo for gcsfuse
e
Ah cool, I just got it
So you would use the gcsfuse utility in the helm chart (init job?) to mount the bucket to the path?
Ah I see - you add it to the image and then run gcsfuse as part of the start up?
Thanks for the advice!
x
e
Nice, I'll try this out. This is better than using the HadoopPinotFS with gcs?
x
There are two ways. If you can build your own image, you can have a pod that mounts GCS during pod init
e
Ok, and that's for the controller pods, right?
x
The other is to maintain an NFS service and deployment that mounts GCS, so your Pinot controller pods just need to mount the NFS
e
Ah ok
I'm not sure I know how to do that one, the link you shared describes how to do that?
The second option I mean
x
For the server we suggest mounting a persistent disk
Yes
👍 1
In Helm, you can change the storage class for the Pinot server to get this
e
Nice. We have pvc's working right now - but for the controller to use the gcsfuse mount what config parameters would I need to set?
x
Also note that this is one GCE disk per server
Yes
e
Yep
x
You need to change the volume
e
Is that the `controller.data.dir` parameter?
x
Yes
This should be a subdirectory under your GCS mount
👍 1
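For example, a rough sketch of the controller config, assuming the bucket is mounted at /mnt/gcs inside the controller pod. That path and the subdirectory name are made up for illustration, so use whatever mount point you actually chose:

```properties
# Assumption: the GCS bucket is mounted at /mnt/gcs (hypothetical path).
# controller.data.dir points at a subdirectory under that mount,
# so every controller pod reads and writes segments in the same place.
controller.data.dir=/mnt/gcs/pinot-controller-data
```

Whatever path you use has to be the same on every controller pod, otherwise they will not see each other's segments.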
e
Thanks a lot! Sounds like this should be enough to get started.
And just one more n00b question (still new to pinot) - what's the advantage for using gcs for controller deep storage?
x
Because for the Pinot controller to serve segments, you want a volume mounted to all the pods, which means ReadWriteMany
e
Got it, thanks
x
On GCP this access mode is only available via NFS
👍 1
e
Nice, thanks for the info!