Apache Pinot #troubleshooting

Diogo Baeder

04/07/2022, 9:13 PM

Hi folks! This could probably be a question more geared towards @User, but I'll ask broadly anyway: is there any documentation available about how to implement ad-hoc segment replacement, in terms of what this flow would be? I'll follow up in this thread.

Diogo Baeder

04/07/2022, 9:17 PM

What I want to have is a single table that holds data for multiple regions and sectors within these regions. I also want to be able to partition the data by region and sector. The problem is that with the daily ingestion I plan to do, I would end up with far too many segments, and they would be too small, most of them holding less than 1 MB of data. So I thought about using merge rollups, which some here recommended to me. However, that would probably just merge everything together for each bucket, thus defeating my partitioning per region and sector. Then I thought I could just implement the rolling up of these segments myself. The problem, though, is that I have no idea how this works. How do I "build a segment"? Do I just create a batch job for each rolled-up segment, and then delete the old tiny ones? What's the recommended way to approach this?

Mayank

04/07/2022, 9:32 PM

https://docs.pinot.apache.org/operators/operating-pinot/pinot-managed-offline-flows

Diogo Baeder

04/08/2022, 12:47 PM

Thanks man!

Diogo Baeder

04/08/2022, 12:50 PM

That doesn't solve my problem, however, since it just creates segments by buckets, and it seems likely that it will kill whatever partitioning I do. And it's from realtime to offline, not rollups.

Ken Krugler

04/08/2022, 5:55 PM

I’m not sure about specific documentation on segment replacement. We basically build segments with specific names (e.g. us-202201), save them in HDFS, and then do a metadata push to force a reload. But @User could likely suggest a better approach, e.g. when doing a replacement (versus the initial push of segments for a new month) we could use the REST API to trigger a reload?
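A minimal sketch of the naming and reload steps described above, assuming segments are named `<region>-<yyyymm>` as in the `us-202201` example and that the controller exposes a per-segment reload endpoint at `/segments/{tableName}/{segmentName}/reload`. The controller address and the table name `myTable_OFFLINE` are hypothetical, and the sketch only builds the URL; it does not send the request.

```python
from urllib.parse import quote


def segment_name(region: str, year: int, month: int) -> str:
    """Build a deterministic name so a rebuilt month reuses the same segment name."""
    return f"{region}-{year:04d}{month:02d}"


def reload_url(controller: str, table: str, segment: str) -> str:
    """Build the URL that would be POSTed to ask the controller to reload one segment.

    Assumption: the endpoint path is /segments/{table}/{segment}/reload.
    """
    base = controller.rstrip("/")
    return f"{base}/segments/{quote(table)}/{quote(segment)}/reload"


# Hypothetical controller address and table name, for illustration only.
name = segment_name("us", 2022, 1)
url = reload_url("http://localhost:9000", "myTable_OFFLINE", name)
print(name)
print(url)
```

Keeping the name deterministic is the design point here: a rebuilt monthly segment can then be pushed under the name of the one it replaces instead of accumulating alongside it.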

Diogo Baeder

04/08/2022, 5:59 PM

Hmmm... got it. Alright, I'll sort out the details about that then, for my case. Thanks man!

Neha Pawar

04/08/2022, 10:27 PM

@User the merge roll-up minion task should honor the partitioning config from the table. Adding @User (the author of that) to confirm
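For context, the merge roll-up task is enabled per table in the task section of the table config. The sketch below assumes the key names `MergeRollupTask`, `<level>.mergeType`, `<level>.bucketTimePeriod`, and `<level>.bufferTimePeriod`; the level name `1month` and the `30d`/`1d` values are illustrative and do not come from this thread. It shows only the config shape and says nothing about whether partitioning is preserved, which is the point awaiting confirmation above.

```python
import json


def merge_rollup_task_config(level: str, bucket: str, buffer: str) -> dict:
    """Return the 'task' fragment of a table config enabling one merge level.

    Assumption: keys follow the '<level>.<property>' pattern.
    """
    return {
        "task": {
            "taskTypeConfigsMap": {
                "MergeRollupTask": {
                    f"{level}.mergeType": "concat",        # concatenate rows, no aggregation
                    f"{level}.bucketTimePeriod": bucket,   # time window merged into one bucket
                    f"{level}.bufferTimePeriod": buffer,   # wait this long before merging a bucket
                }
            }
        }
    }


# Illustrative values: roughly monthly buckets with a one-day buffer.
rollup_cfg = merge_rollup_task_config("1month", "30d", "1d")
print(json.dumps(rollup_cfg, indent=2))
```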

Diogo Baeder

04/08/2022, 10:28 PM

Ah, really? If so, then my problem is solved! 😊

Jackie

04/08/2022, 10:38 PM

I want to understand more details of the requirements. Why do you want to partition the data by region and sector? Do you use some primary key to partition the data, or just generate one segment per region per sector per day?

Diogo Baeder

04/08/2022, 10:50 PM

I'd like to do the latter: generate one segment per region/sector/day. Then, I'd like to roll them up into monthly segments.

Mayank

04/09/2022, 3:16 PM

How big would that one segment be, partitioned per region/sector/day?

Diogo Baeder

04/09/2022, 7:28 PM

I haven't brought up precise numbers yet, but at most something like 6k rows with 6 columns each

Ken Krugler

04/09/2022, 10:03 PM

A segment with only 6K rows sounds like it would be too small for good performance.

➕ 1

Mayank

04/09/2022, 11:44 PM

Yep. If the goal for partitioning was to improve performance, this size is going in the opposite direction. You can get much better performance simply by using a sorted and/or inverted index.
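As a rough illustration of the sorted and inverted index suggestion, both are declared under `tableIndexConfig` in the table config. The column names `region` and `sector` are assumptions, since the thread does not give the schema, and which column to sort on is a judgment call, not something stated here.

```python
import json


def index_config(sorted_column: str, inverted_columns: list) -> dict:
    """Return a tableIndexConfig fragment with one sorted column and inverted indexes."""
    return {
        "tableIndexConfig": {
            "sortedColumn": [sorted_column],               # rows within a segment are ordered by this column
            "invertedIndexColumns": list(inverted_columns),  # value -> matching rows lookup
        }
    }


# Hypothetical column names for illustration.
index_cfg = index_config("region", ["sector"])
print(json.dumps(index_cfg, indent=2))
```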

Diogo Baeder

04/10/2022, 8:17 PM

But that is for each day. This is why I want to roll up by month, perhaps by even longer periods for older data. The thing is, we don't usually fetch data for everything in the same query. Our queries usually focus on sectors, or at most on regions, and very rarely span multiple regions. It's more important for us to have larger windows of data from a single sector, for example, or even a single region, than to have day-by-day segments with everything jumbled together.

Mayank

04/11/2022, 1:19 AM

Unless your data size is in the TB range and your latency requirement is in the ms range, sorted + inverted index can do the trick. Basically there is a cost to what you are trying to do (partitioning, roll-ups, etc.), so it would be really good to ensure that the benefit will outweigh the cost before I’d suggest going that route.

Diogo Baeder

04/11/2022, 1:05 PM

There's a chance we get to the TB data size (I don't know yet), but latency does have to be super low, in the ms range, because that's to serve data for our end-users. We're mostly able to serve in <1s today with our current technology, except when there's a lot of data being queried at the same time, and we'd like to at the very least keep that performance. This is why it's so important to partition the data, in our case. If we don't partition, a lot of data will be stored together even though it is rarely used together, like data from regions that are only accessed when other regions are not, e.g. data from Asia together with data from the Americas.

Mayank

04/11/2022, 1:09 PM

Data not used will never be loaded in memory. Sorted and inverted indexes will avoid reading unnecessary data; partitioning is not the only way. Also, your data will be time partitioned, so that will help too.

Mayank

04/11/2022, 1:10 PM

But once you are convinced that there is RoI, we can help with suggestions on meeting SLA.

Diogo Baeder

04/11/2022, 1:22 PM

Got it

Diogo Baeder

04/11/2022, 1:26 PM

Alright, let me not worry about partitioning for now, then. The time partitioning is done automatically, without me having to define it, right?

Diogo Baeder

04/11/2022, 1:26 PM

(Having a timeColumn, that is)

Mayank

04/11/2022, 1:27 PM

Assuming you push daily data or your real-time events are in chronological order, yes

Diogo Baeder

04/11/2022, 1:28 PM

Awesome. Thanks!

Diogo Baeder

04/11/2022, 1:59 PM

By the way, we will be ingesting batches for separate regions anyway (because of the way we will have this scheduled to happen), so in the end it might just make sense to at least keep the region partitioning. Or not?

Mayank

04/11/2022, 2:30 PM

For Pinot to prune segments, you need to explicitly specify the partition function and ensure that region X is always in partition k, as seen by your ingestion pipeline as well as by Pinot.
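A sketch of what specifying the partition function could look like in the table config, assuming the `segmentPartitionConfig` / `columnPartitionMap` keys and the `partition` segment pruner type. The column name `region`, the `Murmur` function choice, and `numPartitions = 4` are illustrative values only. The ingestion pipeline would have to apply the same function and partition count when building segments, which this sketch does not cover.

```python
import json


def partition_config(column: str, function_name: str, num_partitions: int) -> dict:
    """Return table config fragments declaring a partition column and enabling partition pruning."""
    return {
        "tableIndexConfig": {
            "segmentPartitionConfig": {
                "columnPartitionMap": {
                    column: {
                        "functionName": function_name,    # must match what ingestion uses
                        "numPartitions": num_partitions,  # must match what ingestion uses
                    }
                }
            }
        },
        "routing": {
            "segmentPrunerTypes": ["partition"],  # prune segments by partition before querying servers
        },
    }


# Illustrative values; the thread does not specify any of them.
partition_cfg = partition_config("region", "Murmur", 4)
print(json.dumps(partition_cfg, indent=2))
```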

Mayank

04/11/2022, 2:31 PM

If you don’t specify that, Pinot will not prune the segment upfront, but the server side will still prune it based on metadata.

Mayank

04/11/2022, 2:32 PM

So these optimizations are things to tune at ms latency and > 1000 qps. Below that the RoI may not be there, as you’d still get the performance you need without partitioning setup

Diogo Baeder

04/11/2022, 2:35 PM

Got it. Thanks man! 🙂
