Elon (12/04/2020, 1:54 AM)

João Comini (12/04/2020, 3:10 PM)
[BaseBrokerRequestHandler] [jersey-server-managed-async-executor-1] Failed to find time boundary info for hybrid table: transaction
When I try to run a query, I get a timeout. Server log:
Timed out while polling results block, numBlocksMerged: 0 (query: QueryContext{_tableName='transaction_REALTIME', _selectExpressions=[count(*)], _aliasMap={}, _filter=transactionDate > '1606971455132', _groupByExpressions=null, _havingFilter=null, _orderByExpressions=null, _limit=10, _offset=0, _queryOptions={responseFormat=sql, groupByMode=sql, timeoutMs=9999}, _debugOptions=null, _brokerRequest=BrokerRequest(querySource:QuerySource(tableName:transaction_REALTIME), filterQuery:FilterQuery(id:0, column:transactionDate, value:[(1606971455132 *)], operator:RANGE, nestedFilterQueryIds:[]), aggregationsInfo:[AggregationInfo(aggregationType:COUNT, aggregationParams:{column=*}, isInSelectList:true, expressions:[*])], filterSubQueryMap:FilterQueryMap(filterQueryMap:{0=FilterQuery(id:0, column:transactionDate, value:[(1606971455132 *)], operator:RANGE, nestedFilterQueryIds:[])}), queryOptions:{responseFormat=sql, groupByMode=sql, timeoutMs=9999}, pinotQuery:PinotQuery(dataSource:DataSource(tableName:transaction_REALTIME), selectList:[Expression(type:FUNCTION, functionCall:Function(operator:COUNT, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:*))]))], filterExpression:Expression(type:FUNCTION, functionCall:Function(operator:GREATER_THAN, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:transactionDate)), Expression(type:LITERAL, literal:<Literal longValue:1606971455132>)]))), limit:10)})
If I try to use tracing, I get an NPE on the offline servers:
ERROR [QueryScheduler] [pqr-0] Encountered exception while processing requestId 83 from broker Broker_pinot-broker-0.pinot-broker-headless.pinot.svc.cluster.local_8099
java.lang.NullPointerException: null
at org.apache.pinot.core.util.trace.TraceContext.getTraceInfo(TraceContext.java:188) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:235) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
at org.apache.pinot.core.query.executor.QueryExecutor.processQuery(QueryExecutor.java:60) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
at org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:155) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
I'm running Pinot 0.6.0 btw.

Tanmay Movva (12/07/2020, 5:06 PM)
latest tag. I tried using both the stream.kafka and stream.kafka.consumer.prop prefixes; neither worked.

Elon (12/07/2020, 5:33 PM)

Xiang Fu

lâm nguyễn hoàng (12/08/2020, 5:46 PM)

Derek (12/09/2020, 8:55 PM)
"stream.kafka.consumer.prop.auto.isolation.level": "read_committed",
in our realtime table, but it also seems like it is processing uncommitted messages.
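A side note on the config above: if Pinot simply strips the stream.kafka.consumer.prop. prefix before handing properties to the Kafka consumer (an assumption, consistent with the pass-through discussion later in this thread, not a quote of Pinot's code), the key above would yield auto.isolation.level, whereas the property the Kafka consumer actually reads is isolation.level (default read_uncommitted). A sketch of that mapping:

```java
// Sketch (not Pinot's actual code): recovering the native Kafka consumer
// property name by stripping the Pinot stream-config prefix.
public class StreamPropPrefix {
    static final String PREFIX = "stream.kafka.consumer.prop.";

    // Returns the native Kafka property name, or null if the key is not prefixed.
    static String toKafkaProperty(String pinotKey) {
        return pinotKey.startsWith(PREFIX) ? pinotKey.substring(PREFIX.length()) : null;
    }

    public static void main(String[] args) {
        // The key from the message above maps to "auto.isolation.level",
        // which Kafka does not recognize; "isolation.level" is the real config.
        System.out.println(toKafkaProperty("stream.kafka.consumer.prop.auto.isolation.level"));
        System.out.println(toKafkaProperty("stream.kafka.consumer.prop.isolation.level"));
    }
}
```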
Ken Krugler (12/09/2020, 11:07 PM)

Ken Krugler (12/10/2020, 7:34 PM)

Tanmay Movva (12/11/2020, 5:13 AM)

Ken Krugler (12/11/2020, 8:29 PM)

Playsted (12/12/2020, 5:04 PM)

Tanmay Movva (12/14/2020, 2:41 PM)

Ken Krugler (12/15/2020, 1:32 AM)
inputDirURI: 'hdfs://<clustername>/user/hadoop/pinot-input/'
includeFileNamePattern: 'glob:**/us_*.gz'
outputDirURI: 'hdfs://<clustername>/user/hadoop/pinot-segments/'
When I run the job, segments are generated, but then each segment fails with something like:
Failed to generate Pinot segment for file - hdfs:/user/hadoop/pinot-input/us_2020-03_03.gz
java.lang.IllegalStateException: Unable to extract out the relative path based on base input path: hdfs://<clustername>/user/hadoop/pinot-input/
So it looks like the input file URI is getting the authority (<clustername>) stripped out, which is why the baseInputDir.relativize(inputFile) call fails to generate appropriate results in SegmentGenerationUtils.getRelativeOutputPath. Or is there something else I need to be doing here to get this to work properly? I'm able to read the files, so the inputDirURI is set up properly (along with HDFS jars).
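This diagnosis can be reproduced with plain java.net.URI, independent of Pinot: relativize() returns the child URI unchanged whenever the scheme and authority of the two URIs are not identical, so a file URI that lost its authority can never be relativized against the authority-qualified base. In this sketch, clustername stands in for the <clustername> placeholder:

```java
import java.net.URI;

// Demonstrates the JDK behavior behind the relativize failure:
// URI.relativize() is a no-op unless scheme AND authority match, so
// "hdfs:/..." (authority stripped) cannot be relativized against
// "hdfs://clustername/...".
public class RelativizeDemo {
    public static void main(String[] args) {
        URI base = URI.create("hdfs://clustername/user/hadoop/pinot-input/");

        // Authority lost: relativize returns the child unchanged (still absolute).
        URI stripped = URI.create("hdfs:/user/hadoop/pinot-input/us_2020-03_03.gz");
        System.out.println(base.relativize(stripped));

        // Authority intact: relativize yields the expected relative path.
        URI intact = URI.create("hdfs://clustername/user/hadoop/pinot-input/us_2020-03_03.gz");
        System.out.println(base.relativize(intact)); // us_2020-03_03.gz
    }
}
```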
Elon (12/17/2020, 8:13 PM)

Taran Rishit (12/18/2020, 4:46 PM)

dhurandar (12/18/2020, 5:51 PM)

balci (12/18/2020, 9:12 PM)
Error: COMPILATION ERROR :
[INFO] -------------------------------------------------------------
Error: /home/runner/work/incubator-pinot/incubator-pinot/pinot-core/src/test/java/org/apache/pinot/core/util/TableConfigUtilsTest.java:[450,45] cannot find symbol
symbol: variable BATCH_TYPE
location: class org.apache.pinot.spi.ingestion.batch.BatchConfigProperties
Error: /home/runner/work/incubator-pinot/incubator-pinot/pinot-core/src/test/java/org/apache/pinot/core/util/TableConfigUtilsTest.java:[452,35] cannot find symbol
symbol: method constructBatchProperty(java.lang.String,java.lang.String)
location: class org.apache.pinot.spi.ingestion.batch.BatchConfigProperties
...
It is interesting because the test I added is almost a copy of an existing test case using the same symbols (ingestionBatchConfigTest). Does anyone have any insight into what might have gone wrong? @Neha Pawar I noticed you added the test 'ingestionBatchConfigsTest' recently, curious if you had a similar issue. Thanks.

Punish Garg (12/21/2020, 11:28 AM)
EndOfStreamException: Unable to read additional data from client sessionid 0x17685043a2e001d, likely client has closed socket
Elon (12/21/2020, 9:40 PM)
stream.kafka.consumer.prop.isolation.level or group_id, client_id, etc. - it looks like only a specific list of properties is honored, like stream.kafka.topic.name, stream.kafka.decoder.class.name ... - I can create a github issue, lmk.

Yash Agarwal (12/22/2020, 1:40 PM)

Laxman Ch (12/23/2020, 1:49 PM)

Elon (12/24/2020, 6:01 AM)

lâm nguyễn hoàng (12/28/2020, 7:09 PM)

lâm nguyễn hoàng (12/28/2020, 7:09 PM)

Jackie (12/28/2020, 7:16 PM)

Daniel Lavoie (12/30/2020, 4:51 PM)
ps aux so we get confirmation of the exact arguments that were provided to the JVM process?
Elon (01/04/2021, 10:48 PM)
if (queryLogRateLimiter.tryAcquire() || forceLog(schedulerWaitMs, numDocsScanned)) {
  LOGGER.info("Processed requestId={},table={},segments(queried/processed/matched/consuming)={}/{}/{}/{},"
      + "schedulerWaitMs={},reqDeserMs={},totalExecMs={},resSerMs={},totalTimeMs={},minConsumingFreshnessMs={},broker={},"
      + "numDocsScanned={},scanInFilter={},scanPostFilter={},sched={}", requestId, tableNameWithType,
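The snippet gates per-query logging behind a rate limiter, with forceLog as an escape hatch so slow or heavy queries are always logged even when the limiter rejects. A minimal sketch of that gate (the thresholds are hypothetical, and this is not Pinot's implementation, which appears to use a Guava-style RateLimiter):

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch of the "tryAcquire() || forceLog(...)" pattern above:
// log at most N lines per second, but never drop slow or heavy queries.
public class QueryLogGate {
    private final long intervalNanos;
    private final AtomicLong nextFreeNanos = new AtomicLong(Long.MIN_VALUE);

    public QueryLogGate(double permitsPerSecond) {
        this.intervalNanos = (long) (1_000_000_000L / permitsPerSecond);
    }

    // Non-blocking: succeeds only if the previous permit's interval has elapsed.
    public boolean tryAcquire() {
        long now = System.nanoTime();
        long next = nextFreeNanos.get();
        return now >= next && nextFreeNanos.compareAndSet(next, now + intervalNanos);
    }

    // Escape hatch: always log queries that waited long or scanned many docs
    // (threshold values are illustrative only).
    public static boolean forceLog(long schedulerWaitMs, long numDocsScanned) {
        return schedulerWaitMs > 100 || numDocsScanned > 1_000_000;
    }
}
```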
Ken Krugler (01/08/2021, 1:24 AM)
where mvfield in ('a', 'b') group by mvfield, and mvfield is a multi-valued field, I get a result with groups for values from mvfield that aren't in my where clause. I assume I'm getting groups for every value found in mvfield from rows where mvfield contains a match for my filter, but it seems wrong... am I missing something?
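That assumption matches how multi-value columns generally behave in Pinot: the predicate selects rows (a row matches if any of its values matches), and GROUP BY then emits a group for every value present in the selected rows. A small simulation of those semantics (plain Java, not Pinot code):

```java
import java.util.*;

// Simulates the observed semantics: the filter selects ROWS whose multi-value
// column contains 'a' or 'b', but GROUP BY then produces a group for EVERY
// value present in those rows, including values outside the IN list.
public class MvGroupByDemo {
    static Map<String, Long> countGroups(List<List<String>> rows, Set<String> filterValues) {
        Map<String, Long> groups = new TreeMap<>();
        for (List<String> mvfield : rows) {
            // Row matches if ANY of its values is in the filter set.
            boolean rowMatches = mvfield.stream().anyMatch(filterValues::contains);
            if (rowMatches) {
                // Every value of the matching row contributes a group.
                for (String value : mvfield) {
                    groups.merge(value, 1L, Long::sum);
                }
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("a", "c"),   // matches via 'a'; contributes groups a and c
                List.of("b"),        // matches via 'b'
                List.of("d"));       // filtered out entirely
        System.out.println(countGroups(rows, Set.of("a", "b")));
        // {a=1, b=1, c=1} -- note the group for 'c', outside the IN list
    }
}
```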
Yash Agarwal (01/08/2021, 5:04 AM)