Apache Pinot

Hi team, we are currently evaluating a solution that uses a Pinot hybrid table to produce a dataset from both S3 offline historical data and Kafka real-time data. Is there documentation describing what a hybrid table setup does and doesn't support with regard to, e.g., ingestion, query, and retention? Thanks.

In Pinot you can configure the deep store on S3 and create a hybrid table that ingests data from both a batch data source (S3) and a real-time data source (Kafka).

<https://docs.pinot.apache.org/users/tutorials/use-s3-as-deep-store-for-pinot>

<https://docs.pinot.apache.org/basics/components/table#hybrid-table>

Thanks. So far we have only used offline tables. Other than the table configuration setup, are there any known functional differences, limitations, or constraints when using a hybrid table compared to an offline table, in terms of query, retention, and ingestion?

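The real-time table has its own retention, separate from the offline table's. Ingestion-wise, the real-time part comes from Kafka. Each query is split into two queries, one against the offline table and one against the real-time table, based on a time boundary. Please check the docs for details.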
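
To make the time-boundary split a bit more concrete, here is a rough sketch in Python. It is not Pinot's broker code. The function `split_hybrid_query`, the `SplitQuery` type, and the example table and column names are all hypothetical, and the "latest offline end time minus one time unit" rule is an assumption about how the boundary is chosen, so please verify it against the docs for your version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SplitQuery:
    offline_sql: str
    realtime_sql: str


def split_hybrid_query(table: str, time_column: str,
                       max_offline_end_time: int, one_unit: int,
                       where: Optional[str] = None) -> SplitQuery:
    """Illustrative only: split one logical query on a time boundary."""
    # Assumption: the boundary sits one time unit before the latest
    # offline end time, so a partially pushed last period is served
    # from the real-time side instead.
    boundary = max_offline_end_time - one_unit

    def build(physical_table: str, time_filter: str) -> str:
        sql = f"SELECT COUNT(*) FROM {physical_table} WHERE {time_filter}"
        if where:
            sql += f" AND ({where})"
        return sql

    # Offline side answers up to and including the boundary;
    # real-time side answers strictly after it, so no row is counted twice.
    return SplitQuery(
        offline_sql=build(f"{table}_OFFLINE", f"{time_column} <= {boundary}"),
        realtime_sql=build(f"{table}_REALTIME", f"{time_column} > {boundary}"),
    )


if __name__ == "__main__":
    q = split_hybrid_query("events", "daysSinceEpoch", 19000, 1)
    print(q.offline_sql)
    print(q.realtime_sql)
```

The point of the two complementary filters is that overlapping data in the offline and real-time tables is not double-counted when the two partial results are merged.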