#general

Arpit

10/20/2021, 1:59 PM
Hi, I have created a realtime table on a Pinot 0.8.0 cluster. Data is getting into Pinot, but I see this log message for one segment: "Stopping consumption due to row limit nRows=100000 numRowsIndexed=10000 numRowsConsumed=100000". I also checked the debug endpoint in Swagger, and it shows the result below for the segment.

Kishore G

10/20/2021, 2:28 PM
That's a valid log statement; it gets printed before the segment is flushed to disk. After that, a new consuming segment should start consuming messages again.

Arpit

10/20/2021, 2:30 PM
Where is it picking up the value 100000 from?

Kishore G

10/20/2021, 2:33 PM
from table config
Sorry, I did not see the debug output
looks like the segment is not getting built
any exception in the log?

Arpit

10/20/2021, 2:39 PM
Also, I spotted an error when it tries to build the segment after that log message, and I can see the same error in the debug endpoint. So it looks like something is wrong with our data.
We have not specified the value 100000 anywhere in the config.

Kishore G

10/20/2021, 2:40 PM
Right. Can you paste the error here?
I think that's the default.

Mayank

10/20/2021, 2:48 PM
Yeah, please paste the error. The 100k value seems to indicate the initial value of segment auto sizing.

Arpit

10/20/2021, 3:04 PM
So it is trying to flush the segment after reading 100k records, which is a default value for some property. I am specifying flush thresholds (size and time) in the config, but it seems they are not getting picked up.
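For reference, a minimal sketch of the flush-related stream settings under discussion, written as a Python dict so it can be dumped to JSON. The property names are recalled from the Pinot stream ingestion documentation and should be checked against the version in use; the values `6h` and `200M` are purely illustrative. As far as I understand it, when the row threshold is set to 0 so that the size threshold applies, Pinot starts the first segment with an initial row count (100,000 by default) and tunes it from there, which would match the 100k seen in the log.

```python
import json

# Assumed property names (verify against your Pinot version's docs).
# Values are illustrative, not recommendations.
stream_configs = {
    # "0" disables the fixed row threshold so the size threshold is used.
    "realtime.segment.flush.threshold.rows": "0",
    # Flush after this much time even if the size target is not reached.
    "realtime.segment.flush.threshold.time": "6h",
    # Desired size of a completed segment.
    "realtime.segment.flush.threshold.segment.size": "200M",
}

# This fragment would live under tableIndexConfig in the table config.
fragment = {"tableIndexConfig": {"streamConfigs": stream_configs}}
print(json.dumps(fragment, indent=2))
```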

Subbu Subramaniam

10/20/2021, 4:34 PM

Arpit

10/20/2021, 8:45 PM
The exception in the log while creating a segment is caused by a datetime field. I have declared a datetime field of string type with the format 1:MILLISECONDS:SIMPLE_DATE_FORMAT:YYYY-MM-DD'T'HH:MM:SS.SSSZ. The exception says: Could not parse "2021-10-09T18:42:54.985Z": value 42 for monthOfYear must be in the range [1,12]. According to the format I defined, 42 is the minutes, but it is being read as the month.

Kishore G

10/20/2021, 9:44 PM
can you please file an issue?

Arpit

10/20/2021, 10:11 PM
I got past the above issue. The reason was that the format specifiers are case sensitive, so I had to use M for month and m for minutes.
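To illustrate the case sensitivity, here is a small sketch that maps each letter run in a Joda-Time style pattern to the field it stands for. It assumes the SIMPLE_DATE_FORMAT pattern is interpreted with Joda-Time pattern letters, which the `monthOfYear` wording of the error suggests; `describe_pattern` and `JODA_FIELDS` are hypothetical helpers written for this note, not part of Pinot or Joda-Time.

```python
# Joda-Time pattern letters and the fields they denote (per the Joda-Time
# DateTimeFormat documentation). Hypothetical helper table for illustration.
JODA_FIELDS = {
    "G": "era", "C": "century of era", "Y": "year of era", "x": "weekyear",
    "w": "week of weekyear", "e": "day of week (number)",
    "E": "day of week (text)", "y": "year", "D": "day of year",
    "M": "month of year", "d": "day of month", "a": "halfday of day",
    "K": "hour of halfday (0-11)", "h": "clockhour of halfday (1-12)",
    "H": "hour of day (0-23)", "k": "clockhour of day (1-24)",
    "m": "minute of hour", "s": "second of minute",
    "S": "fraction of second", "z": "time zone (text)",
    "Z": "time zone offset/id",
}


def describe_pattern(pattern):
    """Return (letter_run, field_name) pairs, skipping quoted literals."""
    fields = []
    i = 0
    in_literal = False
    while i < len(pattern):
        ch = pattern[i]
        if ch == "'":
            # Single quotes open and close literal text such as 'T'.
            in_literal = not in_literal
            i += 1
            continue
        if in_literal or not ch.isalpha():
            i += 1
            continue
        # Group a run of the same letter, e.g. "MM" or "SSS".
        j = i
        while j < len(pattern) and pattern[j] == ch:
            j += 1
        fields.append((pattern[i:j], JODA_FIELDS.get(ch, "unknown")))
        i = j
    return fields


original = describe_pattern("YYYY-MM-DD'T'HH:MM:SS.SSSZ")
corrected = describe_pattern("yyyy-MM-dd'T'HH:mm:ss.SSSZ")
for before, after in zip(original, corrected):
    print(before, "->", after)
```

In the original pattern the run after `HH` is `MM`, which is again month of year, so the 42 from `18:42:54` is checked against the 1 to 12 range. The corrected pattern shown here is one plausible fix, not necessarily the exact string used in the thread.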
However, now I get another error for the same column while building the dictionary at segment creation time. The log says "created dictionary for String column: InsertedTime with cardinality: 149, max length in bytes: 24, range: 2021-10-09T18:42:54.985Z to null", and then later there is an error with IllegalArgumentException: invalid format: "null".
To my understanding, it scans all the values for the given column to build a range, and in this case it gets a null and a valid value. But null is obviously not a valid format for the given field, so it fails.
I set a default value (1800-01-01T00:00:00.000Z) in the schema for the given field, but I still get the same error.
How should I fix this?

Jackie

10/21/2021, 11:54 PM
@User Can you please put a larger default value? Pinot doesn't support time values older than 1971-01-01, to prevent users from putting wrong time units.
Oh, actually Pinot does not support default time values by default (This PR adds the support of using machine time as the default). Any reason why you got null time values?
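One way to find such records before they reach Pinot is a pre-ingestion check like the sketch below. It is only an illustration: `find_bad_time_values` is a hypothetical helper, the field name `InsertedTime` is taken from the log quoted above, the `strptime` format is a Python equivalent of the timestamp shape seen in the thread, and the 1971-01-01 cutoff follows the limit mentioned in this message.

```python
from datetime import datetime

# Python strptime equivalent of timestamps like 2021-10-09T18:42:54.985Z.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"
# Lower bound mentioned in the thread for supported time values.
MIN_SUPPORTED = datetime(1971, 1, 1)


def find_bad_time_values(records, field="InsertedTime"):
    """Return (index, reason) for records whose time value would be rejected."""
    problems = []
    for index, record in enumerate(records):
        raw = record.get(field)
        if raw is None or raw == "null":
            problems.append((index, "missing"))
            continue
        try:
            parsed = datetime.strptime(raw, TIME_FORMAT)
        except (TypeError, ValueError):
            problems.append((index, "unparseable"))
            continue
        if parsed < MIN_SUPPORTED:
            problems.append((index, "before 1971-01-01"))
    return problems


sample = [
    {"InsertedTime": "2021-10-09T18:42:54.985Z"},
    {"InsertedTime": None},
    {"InsertedTime": "1800-01-01T00:00:00.000Z"},
    {},
    {"InsertedTime": "2021-10-09 18:42:54"},
]
print(find_bad_time_values(sample))
```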

Arpit

10/22/2021, 8:35 AM
I fixed the null values and now it is ingesting smoothly🙂