
Slackbot

03/02/2021, 10:25 PM
This message was deleted.

Xiang Fu

03/02/2021, 10:27 PM
what are the select * query results?
there should be one doc per Kafka event
if there is only one doc, then it means Pinot only consumed 1 record from Kafka
also, are there any exceptions in the Pinot server log?

Josh Highley

03/02/2021, 10:31 PM
select * has no query result but query response stats has totalDocs = 1, so it seems the data is there, just not being returned
and, yes, I only sent 1 record to Kafka for the table
no errors in the server or broker logs

Xiang Fu

03/02/2021, 11:05 PM
Then I suspect the Groovy function might not be correct. @Neha Pawar is there a way we can ingest using consumption/index time as the time value?

Neha Pawar

03/02/2021, 11:12 PM
you could use now() to set current time?
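
A minimal sketch of what that suggestion could look like in a table config, assuming the time column is set through `ingestionConfig.transformConfigs`. The column name `eventTimeMillis` and table name `myTable` are hypothetical (the thread does not name them), and the helper `with_ingestion_time` exists only for this illustration.

```python
import json

# Hypothetical column name; the thread does not name the table's time column.
TIME_COLUMN = "eventTimeMillis"


def with_ingestion_time(table_config: dict, time_column: str) -> dict:
    """Return a copy of table_config with a now() ingestion transform added."""
    updated = json.loads(json.dumps(table_config))  # deep copy via JSON round trip
    ingestion = updated.setdefault("ingestionConfig", {})
    transforms = ingestion.setdefault("transformConfigs", [])
    # Each transform config names a destination column and a transform function.
    transforms.append({"columnName": time_column, "transformFunction": "now()"})
    return updated


# Hypothetical, heavily trimmed table config used only for this illustration.
table_config = {"tableName": "myTable", "tableType": "REALTIME"}
updated_config = with_ingestion_time(table_config, TIME_COLUMN)
print(json.dumps(updated_config, indent=2))
```

The printed JSON is only the fragment relevant here; a real table config needs the remaining sections (segments config, stream config, and so on).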

Xiang Fu

03/02/2021, 11:21 PM

Josh Highley

03/02/2021, 11:34 PM
those transforms are on the Select. I'm running the transform on the ingestion
my actual transform is more complex, but I've boiled it down to the test case. putting just a long in the Groovy script doesn't work, but putting System.currentTimeMillis() does work. When it doesn't work, the record still shows up in the query stats but not in the results
if I use a different date field, defined the same way but not acting as the table time column, then its value is populated by the transform as I expect
I've found that when I destroy the table, Pinot still seems to remember the most recent table date-time. If I create the table again, same name and schema, then even though the table starts empty, Pinot queries won't return records with a date-time earlier than the latest one in the table when it was destroyed.
if I create the table with a different table name but same transformation, then the records will be returned by the query
I've been able to simplify the issue somewhat, I'm going to start a new message thread

Xiang Fu

03/03/2021, 1:04 AM
Sure, I think that transform function can also be used as an ingestion transform function
for table destroy, did you also delete the schema?
then it seems to indicate that Pinot caches some transform functions per table name

Josh Highley

03/03/2021, 1:08 AM
I found that the transform doesn't matter. I can specify the column time in the data and have the same issue
when I re-create the table, if I insert the same primary key with an earlier column time, the record won't be returned by a query
it's still using the most recent column time for the record that existed in the deleted table

Xiang Fu

03/03/2021, 1:11 AM
Is this for upsert?

Josh Highley

03/03/2021, 1:11 AM
yes

Xiang Fu

03/03/2021, 1:11 AM
I think upsert uses the timestamp to figure out the latest record
so only the newest version of the record is counted

Josh Highley

03/03/2021, 1:12 AM
right, but it's using the timestamp from records in the deleted table when I re-create the table with the same name
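
The behavior described above can be illustrated with a toy model. This is a sketch of the hypothesis discussed in this thread, not Pinot's actual implementation: the class `ToyUpsertTable` and all of its members are hypothetical. It assumes that upsert keeps, per primary key, the newest comparison time seen, and that this map survives a delete and re-create of a table with the same name.

```python
class ToyUpsertTable:
    """Toy upsert table: only the newest record per primary key is queryable."""

    def __init__(self, stale_latest=None):
        self.docs = []              # every ingested record, valid or not
        self.valid_doc_ids = set()  # doc ids that queries may return
        # primary key -> (comparison time, doc id, or None if the doc
        # belonged to a previously deleted table)
        self.latest = dict(stale_latest or {})

    def ingest(self, key, time_ms, payload):
        doc_id = len(self.docs)
        self.docs.append((key, time_ms, payload))
        previous = self.latest.get(key)
        if previous is not None and time_ms < previous[0]:
            return  # older than the known winner: stored but never valid
        if previous is not None and previous[1] is not None:
            self.valid_doc_ids.discard(previous[1])  # old winner is replaced
        self.latest[key] = (time_ms, doc_id)
        self.valid_doc_ids.add(doc_id)

    @property
    def total_docs(self):
        return len(self.docs)

    def select_all(self):
        return [self.docs[i] for i in sorted(self.valid_doc_ids)]


# Original table sees key "a" at time 2000, then is deleted.
original = ToyUpsertTable()
original.ingest("a", 2000, "v1")
leftover = {k: (t, None) for k, (t, _) in original.latest.items()}

# Re-created table that (hypothetically) inherits the leftover metadata.
recreated_stale = ToyUpsertTable(stale_latest=leftover)
recreated_stale.ingest("a", 1000, "v0")

# Re-created table with fully cleaned-up state.
recreated_clean = ToyUpsertTable()
recreated_clean.ingest("a", 1000, "v0")

print(recreated_stale.total_docs, recreated_stale.select_all())
print(recreated_clean.total_docs, recreated_clean.select_all())
```

In the stale case the record counts toward the document total but is never returned, which matches the totalDocs = 1 with empty results symptom reported earlier in the thread.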

Xiang Fu

03/03/2021, 1:13 AM
I see, how long did you wait between deleting the table and recreating it? We have observed some issues when a table is recreated very shortly after being deleted
if it was a short wait, then it’s very likely that some intermediate state was not cleaned up

Josh Highley

03/03/2021, 1:17 AM
several seconds between deleting and re-creating. There are only a few rows in the tables (testing)

Neha Pawar

03/03/2021, 2:41 AM
now() should work on ingest and query