# getting-started
a
Hello 👋 Does anyone have insight on how much of a bottleneck Presto can become when plugged on top of Pinot to unlock full SQL syntax? i.e. how much will Presto hurt the latency and high-load resistance that make Pinot a suitable solution for user-facing analytics?
m
In general, the connector will push down as much as Pinot can execute. What QPS and latency are you targeting? Typically, you wouldn’t want to serve an external user-facing workload with read-time joins and other complex queries.
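To make the pushdown point concrete, here is a minimal, purely illustrative sketch (not the actual connector code) of how a Presto-style connector conceptually splits a query plan: single-table operations Pinot can execute get pushed down, while everything else (notably joins) runs in Presto, which is where the extra latency comes from. The operation names are hypothetical.

```python
# Hedged sketch of connector pushdown, assuming a simplified plan model.
# Operations Pinot could execute itself (illustrative set, not exhaustive):
PUSHABLE = {"filter", "projection", "aggregation", "limit"}

def split_plan(ops):
    """Split plan operations into those pushed down to Pinot
    and those Presto must execute locally (e.g. joins)."""
    pushed = [op for op in ops if op in PUSHABLE]
    local = [op for op in ops if op not in PUSHABLE]
    return pushed, local

pushed, local = split_plan(["filter", "aggregation", "join"])
# The filter and aggregation go to Pinot; the join stays in Presto.
```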
a
We don't have a very high load at the moment, around 90k requests per day, which is roughly 1 request per second, but we are expecting a big increase in the coming months. A side goal for us is to relieve Postgres of this pressure and use adequate tools for customer-facing realtime analytics. We are trying to find the sweet spot between precomputing everything (flat data using Flink + Kafka) and relying on some joins, because the data can come from multiple tables in the source database, and it can grow very complex to flatten everything into a single Kafka topic while guaranteeing 100% data consistency.
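The "precompute everything" option described above amounts to denormalizing at ingestion time. A minimal sketch of what such a flattening job would do conceptually, simulated here with plain dicts instead of Flink/Kafka (the table and field names are hypothetical):

```python
# Hedged sketch: denormalize orders by folding in customer fields,
# producing the flat record that would land in a single Kafka topic.
# In a real pipeline, a Flink job would do this join on the stream.

customers = {101: {"name": "Acme", "region": "EU"}}  # lookup "table"
orders = [{"order_id": 1, "customer_id": 101, "amount": 250.0}]

def flatten(order, customers):
    """Produce one flat, join-free record per order."""
    cust = customers[order["customer_id"]]
    return {
        **order,
        "customer_name": cust["name"],
        "customer_region": cust["region"],
    }

flat_records = [flatten(o, customers) for o in orders]
```

The tradeoff is exactly the one raised above: queries against the flat topic need no read-time joins, but keeping the denormalized records consistent with the source tables becomes the hard part.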
m
In my experience, Pinot + Presto/Trino works great for internal-dashboard-like use cases. Once you have external users, and depending on the complexity of the joins as well as the SLA you want to guarantee, it is probably better to stay away from complex read-time joins.
n
If you are using Presto/Trino mainly for joins, one option is the upcoming native join support in Pinot (aka the v2 engine, currently in Beta), see: https://apache-pinot.slack.com/archives/CDRCA57FC/p1659989733533859