Alice
07/04/2022, 5:14 AM
Ehsan Irshad
07/04/2022, 7:51 AM
Ayush Kumar Jha
07/04/2022, 9:43 AM
Ilya Yatsishin
07/04/2022, 1:36 PM
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
Initializing PinotFS for scheme file, classname org.apache.pinot.spi.filesystem.LocalPinotFS
Creating an executor service with 10 threads(Job parallelism: 10, available cores: 16.)
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner
Initializing PinotFS for scheme file, classname org.apache.pinot.spi.filesystem.LocalPinotFS
Start pushing segments: []... to locations: [org.apache.pinot.spi.ingestion.batch.spec.PinotClusterSpec@73ab3aac] for table hits
Mohamed Emad
07/04/2022, 11:31 PM
Anish Nair
07/05/2022, 7:53 AM
Amit Bisht
07/05/2022, 9:42 AM
An error occurred while communicating with Other Databases (JDBC)
Bad Connection: Tableau could not connect to the data source.
Error Code: FAB9A2C5
org/apache/commons/configuration/Configuration
Generic JDBC connection error
org/apache/commons/configuration/Configuration
I checked through Postman that the broker and server endpoints are working. Does anyone here know how to fix or troubleshoot this?
harnoor
07/05/2022, 3:33 PM
text_match(backend_name,'/perf/')
and REGEXP_LIKE(backend_name,'perf')
What should be the equivalent text_match filter for REGEXP_LIKE(backend_name,'perf')?
Thanks
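For reference, a possible equivalent, assuming the text index uses the default Lucene analyzer: Lucene regex queries match whole terms, so an unanchored substring match needs explicit wildcards (mytable is a placeholder):
SELECT * FROM mytable
WHERE TEXT_MATCH(backend_name, '/.*perf.*/')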
Ilya Yatsishin
07/05/2022, 3:39 PM
RecordReader initialized will read a total of 99997497 records.
But then it tries to push data after 43172732 - not sure if it was planning to process the rest or failed earlier.
3. There is still an empty list of segments that it is trying to push. I don’t understand if something is wrong or not. Nothing is written.
4. It looks like the job finished and no data can be seen. I can’t see an error.
at row 43172732. reading next block
block read in memory in 1682 ms. row count = 262144
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner
Initializing PinotFS for scheme file, classname org.apache.pinot.spi.filesystem.LocalPinotFS
Start pushing segments: []... to locations: [org.apache.pinot.spi.ingestion.batch.spec.PinotClusterSpec@7b7b3edb] for table hits
Full log: https://pastila.nl/?017b92b4/361513a635141c533150a5b79f6a4848
Rohan Kaushal
07/05/2022, 6:00 PM
Abdullah Jaffer
07/06/2022, 12:05 AM
select sum(col1) as sum from table group by col2
result:
sum col2
1 1
2 2
3 3
and then average the result: (1 + 2 + 3)/3 = 2
I need a subquery for this; I think subqueries are not supported in Pinot. Can this be accomplished in the Trino connector? If so, how efficient is it? I don't want to average the results in application code, since that is not scalable given the unpredictable number of results in the group by.
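For illustration, in Trino this would be a plain subquery; a sketch assuming the table is exposed through the Pinot connector (column names follow the example above; mytable is a placeholder):
SELECT AVG(s)
FROM (
  SELECT SUM(col1) AS s
  FROM mytable
  GROUP BY col2
) t
How efficient it is depends on whether the connector can push the inner aggregation down to Pinot or has to pull the grouped rows into Trino first.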
07/06/2022, 2:46 AM"taskTypeConfigsMap": {
"RealtimeToOfflineSegmentsTask": {
"bucketTimePeriod": "1h",
"bufferTimePeriod": "6h",
"schedule": "0 0 0/2 * * ?",
"roundBucketTimePeriod": "1m",
"mergeType": "rollup",
"value.aggregationType": "max",
"maxNumRecordsPerSegment": "2000000"
}
}
Harsha Dandu
07/06/2022, 6:56 AM
eywek
07/06/2022, 8:43 AM
When I run
SELECT * FROM datasource_6298afcc7527000300387fdf
only my consuming segment is queried and the query returns 0 docs.
And if I do
SELECT * FROM datasource_6298afcc7527000300387fdf_OFFLINE
I get 1k docs
It seems that the OFFLINE table/segment is ignored (as if the table weren’t hybrid anymore?). Do you have any idea how I can troubleshoot this, or any tips to fix it?
I cannot see anything useful in the logs:
2022/07/06 08:21:52.843 INFO [QueryScheduler] [pqr-1] Processed requestId=16284217,table=datasource_6298afcc7527000300387fdf_REALTIME,segments(queried/processed/matched/consuming)=1/0/0/1,schedulerWaitMs=0,reqDeserMs=0,totalExecMs=0,resSerMs=0,totalTimeMs=0,minConsumingFreshnessMs=9223372036854775807,broker=Broker_<ip>_8099,numDocsScanned=0,scanInFilter=0,scanPostFilter=0,sched=FCFS,threadCpuTimeNs(total/thread/sysActivity/resSer)=0/0/0/0
2022/07/06 08:21:52.844 INFO [BaseBrokerRequestHandler] [jersey-server-managed-async-executor-13771] requestId=16284217,table=datasource_6298afcc7527000300387fdf,timeMs=2,docs=0/0,entries=0/0,segments(queried/processed/matched/consuming/unavailable):1/0/0/1/0,consumingFreshnessTimeMs=9223372036854775807,servers=1/1,groupLimitReached=false,brokerReduceTimeMs=0,exceptions=0,serverStats=(Server=SubmitDelayMs,ResponseDelayMs,ResponseSize,DeserializationTimeMs,RequestSentDelayMs);<ip>_R=0,1,927,0,-1,offlineThreadCpuTimeNs(total/thread/sysActivity/resSer):0/0/0/0,realtimeThreadCpuTimeNs(total/thread/sysActivity/resSer):0/0/0/0,query=SELECT * FROM datasource_6298afcc7527000300387fdf
Thank you
A_Phil
07/06/2022, 9:06 AM
My data has the fields `timestamp`, `id`, and `value`; the `id` describes the `value` being ingested; here `value` can hold both INT and STRING for my use case.
I wanted to know which of these options are feasible:
1. Create columns `value_int` and `value_string`, and use a filtering function in Pinot that can save records in `value_int` if `value` is INT, and vice versa for STRING values of `value`. I tried this, but the filter function as shown in the docs does not allow this.
2. Store all values as STRING and use a Pinot-specific CAST or CONVERT function to convert values to INT to do aggregations. But I could not find a cast/convert function in Pinot, so I am not able to do `sum` operations on the data.
I would welcome any ideas or workarounds for this.
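On option 2: recent Pinot versions list CAST among the supported transform functions, so a sketch, assuming a version that includes it (mytable is a placeholder; `value` is the column from the message):
SELECT SUM(CAST(value AS LONG)) FROM mytable
If CAST is not available in your version, the conversion could instead happen once at ingestion time via a transform function in the table config, so queries aggregate a native numeric column.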
Cheguri Vinay Goud
07/06/2022, 9:13 AM
pinot.properties: |
  connector.name=pinot
  pinot.controller-urls=http://pinot-controller-headless:9000
David Gregory
07/06/2022, 12:07 PM
docker exec -i pinot ./pinot/bin/pinot-admin.sh AddTable -schemaFile testschema.json –tableConfigFile testtable.json -exec
• From what I can tell, this is exactly what it should be. The schema is already successfully added. Kafka is successfully running with messages in a topic. However, when I run this line, it returns the error: MissingParameterException: Missing required option: ‘-tableConfigFile=<_tableConfigFile>’
• Then, when I look into the Pinot UI, there is no table.
• I then try to add the table via the Pinot UI and discover that the table is actually there, but I cannot see it in the UI.
◦ I initially see this message when saving the Realtime table via the Pinot UI: “TimeoutException: Timeout expired while fetching topic metadata”
◦ I try saving again, and at that point I receive the message that the table already exists.
• Apparently, I am able to delete the table using the “Delete tableConfig” in the Swagger API… The “Delete tableName” would not work.
• Attached are the schema and table config…
Alice
07/06/2022, 1:15 PM
Harsha Dandu
07/06/2022, 4:17 PM
Grace Walkuski
07/06/2022, 8:00 PM
Tommaso Garuglieri
07/06/2022, 9:18 PM
We are evaluating Pinot (7c15bc) to replace Trino for low-latency geospatial aggregation queries.
We have a table with about 10^8 records; using star-tree indexes we are able to run our queries orders of magnitude faster than Trino 🚀.
But when using geospatial filtering with ST_Contains, we experience performance degradation (from <200ms to over 10 seconds), even though we are using H3 geospatial indexes with different resolutions (which are triggered as expected) and the geometries are not complex.
Our queries are pretty straightforward:
select sum(x) from <table> where ST_Contains(ST_GeogFromText('...'),location_st_point) = 1 and ...
Is there any way we can improve latencies for geospatial aggregations?
Should we just avoid geospatial filters with this number of records?
Weixiang Sun
07/07/2022, 12:25 AM
SELECT mv_column
FROM enriched_customer_orders_v1_17_1
GROUP BY mv_column
LIMIT 10
Here is the exception:
"message": "QueryExecutionError:\njava.lang.UnsupportedOperationException\n\tat org.apache.pinot.segment.spi.index.reader.MutableForwardIndex.readDictIds(MutableForwardIndex.java:71)\n\tat org.apache.pinot.segment.spi.index.reader.MutableForwardIndex.readDictIds(MutableForwardIndex.java:76)\n\tat org.apache.pinot.core.common.DataFetcher$ColumnValueReader.readDictIds(DataFetcher.java:278)\n\tat org.apache.pinot.core.common.DataFetcher.fetchDictIds(DataFetcher.java:88)\n\tat org.apache.pinot.core.common.DataBlockCache.getDictIdsForSVColumn(DataBlockCache.java:99)\n\tat org.apache.pinot.core.operator.docvalsets.ProjectionBlockValSet.getDictionaryIdsSV(ProjectionBlockValSet.java:69)\n\tat org.apache.pinot.core.query.distinct.dictionary.DictionaryBasedSingleColumnDistinctOnlyExecutor.process(DictionaryBasedSingleColumnDistinctOnlyExecutor.java:42)\n\tat org.apache.pinot.core.operator.query.DistinctOperator.getNextBlock(DistinctOperator.java:61)\n\tat org.apache.pinot.core.operator.query.DistinctOperator.getNextBlock(DistinctOperator.java:38)\n\tat org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49)\n\tat org.apache.pinot.core.operator.combine.BaseCombineOperator.processSegments(BaseCombineOperator.java:150)\n\tat org.apache.pinot.core.operator.combine.BaseCombineOperator$1.runJob(BaseCombineOperator.java:105)\n\tat org.apache.pinot.core.util.trace.TraceRunnable.run(TraceRunnable.java:40)\n\tat java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)",
"errorCode": 200
Diogo Baeder
07/07/2022, 1:27 AM
I'm moving from a Deployment to a StatefulSet and facing some issues. More in this thread.
Peter Pringle
07/07/2022, 7:23 AM
MergeResponseError: responses for table: myTable from servers [d123-1_R] got dropped due to data schema inconsistency
What is the correct way to fix this? I tried reloading all segments in the realtime table and also restarted all processes, but it didn't seem to fix the issue.
Dan DC
07/07/2022, 10:13 AM
Ilya Yatsishin
07/07/2022, 3:56 PM
SELECT UserID, minute(EventTime) AS m, SearchPhrase, COUNT(*) FROM hits GROUP BY UserID, m, SearchPhrase ORDER BY COUNT(*) DESC LIMIT 10
java.lang.NullPointerException: null
Caught exception while merging results blocks (query: QueryContext{_tableName='hits_OFFLINE', _selectExpressions=[UserID, minute(EventTime), SearchPhrase, count(*)], _aliasList=[null, m, null, null], _filter=null, _groupByExpressions=[UserID, minute(EventTime), SearchPhrase], _havingFilter=null, _orderByExpressions=[count(*) DESC], _limit=10, _offset=0, _queryOptions={responseFormat=sql, groupByMode=sql, timeoutMs=10000000}, _debugOptions=null, _brokerRequest=BrokerRequest(querySource:QuerySource(tableName:hits_OFFLINE), pinotQuery:PinotQuery(dataSource:DataSource(tableName:hits_OFFLINE), selectList:[Expression(type:IDENTIFIER, identifier:Identifier(name:UserID)), Expression(type:FUNCTION, functionCall:Function(operator:AS, operands:[Expression(type:FUNCTION, functionCall:Function(operator:MINUTE, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:EventTime))])), Expression(type:IDENTIFIER, identifier:Identifier(name:m))])), Expression(type:IDENTIFIER, identifier:Identifier(name:SearchPhrase)), Expression(type:FUNCTION, functionCall:Function(operator:COUNT, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:*))]))], groupByList:[Expression(type:IDENTIFIER, identifier:Identifier(name:UserID)), Expression(type:FUNCTION, functionCall:Function(operator:MINUTE, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:EventTime))])), Expression(type:IDENTIFIER, identifier:Identifier(name:SearchPhrase))], orderByList:[Expression(type:FUNCTION, functionCall:Function(operator:DESC, operands:[Expression(type:FUNCTION, functionCall:Function(operator:COUNT, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:*))]))]))], limit:10, queryOptions:{responseFormat=sql, groupByMode=sql, timeoutMs=10000000}))})
java.lang.NullPointerException: null
at org.apache.pinot.core.operator.combine.GroupByOrderByCombineOperator.mergeResults(GroupByOrderByCombineOperator.java:236) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.operator.combine.BaseCombineOperator.getNextBlock(BaseCombineOperator.java:119) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.operator.combine.BaseCombineOperator.getNextBlock(BaseCombineOperator.java:50) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.operator.InstanceResponseOperator.getCombinedResults(InstanceResponseOperator.java:113) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.operator.InstanceResponseOperator.getNextBlock(InstanceResponseOperator.java:106) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.operator.InstanceResponseOperator.getNextBlock(InstanceResponseOperator.java:34) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.plan.GlobalPlanImplV0.execute(GlobalPlanImplV0.java:53) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:304) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:203) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.query.executor.QueryExecutor.processQuery(QueryExecutor.java:60) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:151) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:137) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111) [pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at shaded.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58) [pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75) [pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
Greg P
07/07/2022, 4:42 PM
@@ -158,7 +158,10 @@ public abstract class BaseSingleSegmentConversionExecutor extends BaseTaskExecut
new BasicNameValuePair(FileUploadDownloadClient.QueryParameters.ENABLE_PARALLEL_PUSH_PROTECTION, "true");
NameValuePair tableNameParameter = new BasicNameValuePair(FileUploadDownloadClient.QueryParameters.TABLE_NAME,
TableNameBuilder.extractRawTableName(tableNameWithType));
- List<NameValuePair> parameters = Arrays.asList(enableParallelPushProtectionParameter, tableNameParameter);
+ NameValuePair tableTypeParameter = new BasicNameValuePair(FileUploadDownloadClient.QueryParameters.TABLE_TYPE,
+ TableNameBuilder.getTableTypeFromTableName(tableNameWithType).toString());
+ List<NameValuePair> parameters = Arrays.asList(enableParallelPushProtectionParameter, tableNameParameter,
+ tableTypeParameter);
Greg P
07/07/2022, 4:43 PM
Diogo Baeder
07/07/2022, 7:53 PM
When an aggregation like SUM() is computed, is it done in the Server, the Broker, or both? If I run queries with lots of SUMs going on, should I expect the heavier work to be done by the Server or the Broker?
Abdullah Jaffer
07/07/2022, 9:29 PM