Ali Atıl
11/02/2021, 7:38 PM
Mayank
You can think of the deep store as the persistent store that keeps a backup copy of the data ingested into Pinot.
- Serving nodes periodically flush data to "local" disk, but that is only their local copy. The data goes through a "commit" protocol, which saves a copy in the deep store, before it is considered committed into Pinot.
- The local disk attached to Pinot servers is not viewed as a persistent store. However, that is where a Pinot server first looks when loading data (the data the Controller asks it to load in IdealState). Only if it doesn't find the data locally does it download it from the deep store.
- As mentioned above, it is used as the persistent copy of the data ingested into Pinot: for servers to download (note that new servers may join the cluster), for disaster recovery, etc.
- Yes, in a production setup it is recommended to use storage that is shared across the controllers. It could be something like NFS, or something like S3, ADLS, GCS, etc.
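
To make the "local first, deep store as fallback" behavior concrete, here is a rough Python sketch. It is not Pinot code: `load_segment`, `local_dir`, and `deep_store_dir` are hypothetical names, and the deep store is modeled as a plain directory (roughly what an NFS mount would look like) rather than S3, ADLS, or GCS.

```python
import shutil
from pathlib import Path


def load_segment(segment_name: str, local_dir: Path, deep_store_dir: Path) -> str:
    """Illustrative only: return "local" or "deep-store" depending on
    where the segment copy was found. All names here are hypothetical."""
    local_copy = local_dir / segment_name

    # The server looks at its own (non-persistent) local disk first.
    if local_copy.exists():
        return "local"

    # Only if the segment is missing locally does it go to the deep store.
    deep_copy = deep_store_dir / segment_name
    if not deep_copy.exists():
        raise FileNotFoundError(f"{segment_name} not found locally or in deep store")

    # Download (here: a file copy) so later loads are served from local disk.
    local_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(deep_copy, local_copy)
    return "deep-store"
```

A brand-new server with an empty local disk would take the deep-store branch for every segment it is assigned, and the local branch on later loads.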
Ali Atıl
11/02/2021, 9:40 PM