Hi Team While executing a query on the query console `select Apache Pinot #troubleshooting

Hi Team, While executing a query on the query con...

Awadesh Kumar

09/14/2022, 9:23 AM

Hi Team, While executing a query on the query console -

select * from packages_test limit 10

, it's giving different results on different execution i.e. 1 record and 10 records alternatively. Could anyone please help with this issue? Thanks

Navina

09/14/2022, 9:37 AM

Hi @Awadesh Kumar: if you have replication enabled on your table, the query may get routed to a different replica each time. Ideally, all replicas should be consistent and you should see the same query results. But there may be a difference of a few records now and then, depending on how fast each server is consuming.

Navina

09/14/2022, 9:37 AM

What is your replication factor for your table? What source are you consuming from?

Awadesh Kumar

09/14/2022, 9:41 AM

We are consuming from Kafka and replication is 2

Copy code

"segmentsConfig": {
      "schemaName": "packages_test",
      "replication": "2",
      "timeColumnName": "updated_at",
      "replicasPerPartition": "2"
    }

Any possible reason for replica inconsistency?

Navina

09/14/2022, 9:42 AM

hmm kafka is typically more stable consumption-wise. do you have upserts enabled? also, please double check that IdealState (IS) and External view (EV) match for this table.

Awadesh Kumar

09/14/2022, 9:46 AM

yes, upsert is enabled.

Awadesh Kumar

09/14/2022, 9:53 AM

Also, both IS and EV are the same for this table except for one field -

"BUCKET_SIZE": "0"

is present in EV but not in IS. not sure if it can cause any issues.

Navina

09/14/2022, 9:59 AM

I don't think BUCKET_SIZE is going to cause the issue. which version of pinot are you using? I know that there were some upsert related issues that were recently fixed, that could lead to inconsistent query results. can you try your query with `option(skipUpsert=true)`s and see if you get consistent results?

Awadesh Kumar

09/14/2022, 10:12 AM

Getting the same issue with this query as well -

select * from packages_test limit 10 option(skipUpsert=true)

we are using pinot version - 0.10.0

Navina

09/14/2022, 10:13 AM

oh that's weird. are both of the replicas ingesting data ? or is one of them stuck ?

Awadesh Kumar

09/14/2022, 10:34 AM

@Navina we have two server pods that are up and running. How can we check replicas ingesting/stuck?

Navina

09/14/2022, 10:49 AM

you can check the graphs to see if the metric for document count is increasing similarly on both replicas. I think there is also a metric called LLC_PARTITION_CONSUMING. If that metric is 0 , then ingestion may not be happening as expected

Mayank

09/14/2022, 1:12 PM

select * without any predicates or order by will return the first N rows it reads, which may not be deterministic. If you want consistent results you will need to either have some predicates or apply ordering. Your current query is asking for any 10 rows in the table.

Navina

09/14/2022, 4:49 PM

@Mayank even if the row contents are not same, the result should have at least 10 rows, right? if one replica can return it, then shouldn't the other return any 10 rows as well?

Mayank

09/14/2022, 4:51 PM

Yeah, both should return 10 rows (assuming there are at least 10 rows).

Awadesh Kumar

09/15/2022, 5:24 AM

Hi @Mayank @Navina After some time it started working fine and we started getting 10 rows. Can replication lag cause this issue?

Navina

09/15/2022, 5:32 AM

pinot itself doesn't have a replication layer. The replicas independently consume data from the source topic. so, you would have to look at the replica server logs to see what happened during the time the query returned different results. Again, the document count metric per server can tell you the time-window to look into.

👍 1

Awadesh Kumar

09/21/2022, 11:02 AM

Hi @Navina, Again facing a strange issue while running the query on the console-

select * from test_table where id=454715266949122

gives no records found

select id, reference_number from test_table where id=454715266949122

gives the data Can you please help, how to debug this?

Navina

09/21/2022, 12:21 PM

@Awadesh Kumar assuming this is still running with upserts, can you try these queries with

option(skipUpsert=true)

and see if the behavior remains the same?

Awadesh Kumar

09/21/2022, 2:32 PM

it didn't work with

option(skipUpsert=true)

as well. After a few hours, we started getting results... not able to figure out the root cause.

Navina

09/21/2022, 4:11 PM

Can you elaborate on what you mean by "it didn't work"? If you can provide the steps to reproduce I can try on my end. So far, we haven't encountered such an issue where the data is available for query after hours of ingest

Open in Slack

Previous Next