# general
c
Hi team, we have a use case for retrieving records by scanning ~10B records and filtering by date range and id (1M ids in total), without doing any aggregations, at ~100 QPS at peak time. Is it possible to scale and configure Pinot to support this kind of use case, and if so, what's the best latency we can expect?
m
You want to fetch raw records based on date range? What’s the upper bound of records one query will fetch and what’s the avg?
Also, avg record/row length?
c
We can set the upper bound if necessary; currently there isn't one. On average, the fetched record count could be ~100k.
m
It should be fine. Can you give me a sample query structure?
For example if the query is something like
select * from <table> where id = xxx and date between (x, y)
then you can sort and partition on id (since it has an equality predicate), add a range index on date, and also use replica-groups. This will give you great performance and scalability.
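The combination above (sort + partition on id, range index on date, replica-group routing) maps to the table config roughly like this. This is a sketch, not a drop-in config: the column names `id` and `date` come from the example query, and `numPartitions: 16` is an illustrative value you'd tune for your cluster.

```json
{
  "tableIndexConfig": {
    "sortedColumn": ["id"],
    "rangeIndexColumns": ["date"],
    "segmentPartitionConfig": {
      "columnPartitionMap": {
        "id": { "functionName": "Murmur", "numPartitions": 16 }
      }
    }
  },
  "routing": {
    "segmentPrunerTypes": ["partition"],
    "instanceSelectorType": "replicaGroup"
  }
}
```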
c
Select * from table where date between '2022-09-01' and '2022-10-01' and id in (1,2,3) and filter1='something' order by date
m
Ok, will filter1 always have equality predicate, and is medium to high cardinality? If so, sort/partition on it.
c
that extra filter1 only has fewer than 5 possible values.
m
Ok, then sort on id for sure. You may not need to partition, given that read QPS is ~100.
c
By sorting you mean at ingestion time or query time?
m
ingestion time
c
Is having id in the invertedIndexColumns enough?
m
Yeah, that works too. Sorted index is better than inverted index for overall performance, as it brings in locality.
c
The thing is we would need to filter by two ids. Looks like sortedColumn only supports one column.
m
It helps with locality, which matters if you run at high throughput (> 1000 QPS), but it's ok for your case.
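Since `sortedColumn` accepts only a single column, a common pattern here is to sort on the more selective id and put an inverted index on the other. A minimal sketch, assuming hypothetical column names `id1` and `id2` for the two id columns:

```json
{
  "tableIndexConfig": {
    "sortedColumn": ["id1"],
    "invertedIndexColumns": ["id2"]
  }
}
```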
c
We have an inverted index on the id columns, but fetching 500k records still exceeded the 10s timeout.
m
How long does count(*) take for that query? Also, what's your cluster setup like (instance type, JVM conf, etc.)?
c
Count(*) took less than 500ms
m
500ms seems large.
What's the size of each record? Also, does your query have an order by?
Do you have a range index on time? What's the time granularity?
c
No order by, just select.
We don't have any columns in rangeIndexColumns; the average query latency on this table is less than 100ms.
m
What happens if you use LIMIT to get a smaller number? When does it start to break? I suspect your rows are large and that may be creating memory pressure.
c
You mean too many columns in a row?
m
Total size of a row (sum of all column values for a row)
c
It's a denormalized table with about 80 columns, a mix of string and int metrics.
It starts to break at a 10k limit.
m
What's the size of each row in bytes? How many servers, and what's their CPU/mem/JVM conf? I feel your cluster may be under-resourced for the payload.
c
Is there some Pinot query to get size of row in bytes?
m
You can query one row, and get its size outside of Pinot
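One rough way to do that: fetch a single row (e.g. `SELECT * FROM <table> LIMIT 1` via the Pinot REST API), serialize it to text, and measure the byte length. The sketch below assumes a hypothetical row payload just for illustration; the field names are made up, not from any real table.

```python
import json

# Hypothetical row, standing in for one record returned by a
# "SELECT * FROM <table> LIMIT 1" query against the Pinot broker.
row = {"id": 123, "date": "2022-09-01", "filter1": "something", "metric1": 42}

# Serialize the row to JSON text and count the UTF-8 bytes as a
# rough proxy for the uncompressed per-row payload size.
row_bytes = len(json.dumps(row).encode("utf-8"))
print(f"approx row size: {row_bytes} bytes")
```

Multiplying this per-row size by the row limit gives a ballpark for the payload the brokers must assemble per query.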
c
It's a four-broker cluster; CPU usage is less than 5% and memory is at 80%. Performance has been decent with our traffic.
m
servers?
Not the usage of CPU/mem, but what's actually available on the servers
c
4 servers in the cluster. CPU 10G and memory 100G
m
100GB memory? That seems more than enough. What's the JVM heap size?
c
It’s at 24G
m
Then your per-row size must be really big; otherwise I don't see any other reason for the issue.
c
It's about 0.5 KB per row on average, based on the table size and row count.
a
What level of latency are you looking for? P75? P90?
c
I was looking for something like p75: 1s, p95: 5s, or so. But now it's always timing out after 10 seconds, even when selecting just 5 columns with 100k rows.
m
Table size is the Pinot segment size, which is highly compressed. Can you get one row from Pinot, save it as text, and give its file size (as a proxy)?