Hello, are there any thoughts on using Pinot as an...
# getting-started
m
Hello, are there any thoughts on using Pinot as an online feature store?
m
Glad you asked. It is already being used as an online feature store at LinkedIn
m
Awesome. I have been doing my own research into what's out there for feature stores, and Pinot just seemed like a brilliant fit, aside from the OLAP use case.
It is powering critical use cases like LinkedIn feed recommendations.
m
I am looking for design patterns for loading real-time data into Pinot and keeping it consistent with the offline store.
1. Even though Pinot is very good at ad hoc queries, is the best pattern to pre-aggregate/transform streaming data before loading it into Pinot?
2. Offline uploads into Pinot can probably support syncing offline and online data, but what is the best way to structure the data so that the two sources don't conflict with each other? For example, if the offline data source is available only up to yesterday but the real-time data is up to the minute.
m
1. Pre-aggregation and transformation are optimizations you can perform if you need to meet critical latency/throughput requirements (they are not always needed).
2. The two can (and must) overlap, and Pinot will internally ensure there is no double counting due to the overlap.
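To make the overlap point concrete, here is a rough sketch of the idea. It is a toy model in plain Python, not Pinot code. It assumes daily offline pushes and a time boundary set one day before the latest offline day, which is how I understand hybrid tables to split a query. The function names, the `day` and `clicks` fields, and the sample rows are all made up for illustration.

```python
from datetime import date, timedelta


def time_boundary(offline_max_day: date) -> date:
    """Assumed rule: the boundary sits one day before the latest offline day."""
    return offline_max_day - timedelta(days=1)


def hybrid_sum(offline_rows, realtime_rows, offline_max_day: date) -> int:
    """Sum clicks across both sources without counting overlapping days twice."""
    boundary = time_boundary(offline_max_day)
    # The offline side answers for days up to and including the boundary.
    from_offline = [r for r in offline_rows if r["day"] <= boundary]
    # The real-time side answers for days strictly after the boundary.
    from_realtime = [r for r in realtime_rows if r["day"] > boundary]
    return sum(r["clicks"] for r in from_offline + from_realtime)


# Hypothetical data: offline covers Jan 1-3, real-time covers Jan 2-4,
# so Jan 2 and Jan 3 are present in both sources.
offline = [{"day": date(2024, 1, d), "clicks": 10} for d in (1, 2, 3)]
realtime = [{"day": date(2024, 1, d), "clicks": 10} for d in (2, 3, 4)]

total = hybrid_sum(offline, realtime, offline_max_day=date(2024, 1, 3))
naive = sum(r["clicks"] for r in offline + realtime)
print(total, naive)
```

In this toy data the naive sum counts Jan 2 and Jan 3 twice, while the boundary split counts each day once. It also shows why the overlap is needed: if real-time data started only after the last offline day, the day between the boundary and the end of the offline data would have no source answering for it.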