Tymm
12/21/2020, 8:19 AM

Tymm
12/24/2020, 6:32 AM

Mark.Tang
12/28/2020, 1:49 AM

Chundong Wang
12/29/2020, 6:01 PM
SELECT facility_name as key_col, COUNT(*) as val_col
FROM enriched_station_orders_v1_OFFLINE
WHERE created_at_seconds BETWEEN 1606756268 AND 1609175468
AND (facility_organization_id <> 'ac56d23b-a6a2-4c49-8412-a0a0949fb5ef')
GROUP BY key_col
ORDER BY val_col DESC
LIMIT 5
We get exceptions on the pinot-server like the following (the index number seems to vary):
Caught exception while processing and combining group-by order-by for index: 1
However, if we change facility_organization_id <> 'ac56d23b-a6a2-4c49-8412-a0a0949fb5ef' to facility_organization_id = 'ac56d23b-a6a2-4c49-8412-a0a0949fb5ef', there is no such exception. Likewise, if we group by facility_id instead of facility_name, no exception is thrown either.
Have you seen this issue before?
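A quick way to pull the full error detail behind that combine exception is to POST the failing SQL to the broker's REST endpoint and inspect the exceptions array in the response (the full stack trace also lands in the pinot-server log); broker host and port below are placeholders:

cat > /tmp/query.json <<'EOF'
{"sql": "SELECT facility_name AS key_col, COUNT(*) AS val_col FROM enriched_station_orders_v1_OFFLINE WHERE created_at_seconds BETWEEN 1606756268 AND 1609175468 AND facility_organization_id <> 'ac56d23b-a6a2-4c49-8412-a0a0949fb5ef' GROUP BY key_col ORDER BY val_col DESC LIMIT 5"}
EOF
curl -s -X POST -H 'Content-Type: application/json' -d @/tmp/query.json http://<broker-host>:8099/query/sql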
Will Briggs
12/30/2020, 10:07 PM
My records carry a millisecond-precision event timestamp (eventTimestamp). I would like to maintain this when querying / filtering my records at the individual event level. However, I would also like to define an hourly derived timestamp to be used for pre-aggregating with a star-tree index.
My segments config looks like this:
"segmentsConfig": {
"timeColumnName": "eventTimestamp",
"timeType": "MILLISECONDS",
"retentionTimeUnit": "HOURS",
"retentionTimeValue": "48",
"segmentPushType": "APPEND",
"segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
"schemaName": "mySchema",
"replication": "1",
"replicasPerPartition": "1"
},
My star tree index looks like this:
"starTreeIndexConfigs": [{
"dimensionsSplitOrder": [
"dimension1",
"dimension2"
],
"skipStarNodeCreationForDimensions": [
],
"functionColumnPairs": [
"SUM__metric1",
"SUM__metric2",
"SUM__metric3",
"DISTINCT_COUNT_HLL__dimension3",
"DISTINCT_COUNT_HLL__dimension4"
],
"maxLeafRecords": 10000
}],
And my dateTimeFieldSpecs:
"dateTimeFieldSpecs": [
{
"name": "eventTimestamp",
"dataType": "LONG",
"format": "1:MILLISECONDS:EPOCH",
"granularity": "1:HOUR",
"dateTimeType": "PRIMARY"
}
],
Can anyone confirm that this is the correct approach? Or should I be using an ingestion transformation of toEpochHoursRounded instead, specifying that as a DERIVED dateTimeField in the dateTimeFieldSpecs configuration, and manually adding it to the dimensionsSplitOrder of my star-tree index?
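For reference, the derived-column approach described above would look roughly like the sketch below; eventHour is a placeholder name, and toEpochHours is the plain (non-rounding) conversion, assuming whole-hour buckets are what the star-tree should aggregate on. In the schema:

"dateTimeFieldSpecs": [
  {
    "name": "eventTimestamp",
    "dataType": "LONG",
    "format": "1:MILLISECONDS:EPOCH",
    "granularity": "1:MILLISECONDS"
  },
  {
    "name": "eventHour",
    "dataType": "LONG",
    "format": "1:HOURS:EPOCH",
    "granularity": "1:HOURS"
  }
],

And in the table config, a transform to populate it at ingestion time, with "eventHour" then added to the star-tree's dimensionsSplitOrder:

"ingestionConfig": {
  "transformConfigs": [
    {
      "columnName": "eventHour",
      "transformFunction": "toEpochHours(eventTimestamp)"
    }
  ]
},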
Chethan UK
01/04/2021, 1:30 PM

Jinwei Zhu
01/04/2021, 7:25 PM

Kishore G
Mayank
Will Briggs
01/05/2021, 4:46 AM

Mark.Tang
01/06/2021, 2:13 AM

Oguzhan Mangir
01/06/2021, 12:05 PM

Mahesh Yeole
01/06/2021, 9:45 PM

Jinwei Zhu
01/06/2021, 10:21 PM

Mark.Tang
01/07/2021, 6:24 AM

Mark.Tang
01/07/2021, 9:58 AM

Jinwei Zhu
01/11/2021, 11:01 PM

Jackie
01/13/2021, 7:24 PM
You can add enableDynamicStarTreeCreation to your index config; see https://docs.pinot.apache.org/configuration-reference/table#table-index-config for more details.
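For context, that flag goes in tableIndexConfig alongside the star-tree definitions, roughly as in this sketch (star-tree fields abbreviated from the earlier message); once enabled, star-trees can be added or changed and rebuilt via a segment reload instead of re-pushing segments:

"tableIndexConfig": {
  "enableDynamicStarTreeCreation": true,
  "starTreeIndexConfigs": [{
    "dimensionsSplitOrder": ["dimension1", "dimension2"],
    "functionColumnPairs": ["SUM__metric1", "SUM__metric2"],
    "maxLeafRecords": 10000
  }]
}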
Neha Pawar

Amit Chopra
01/13/2021, 11:39 PM

Yupeng Fu
01/14/2021, 1:17 AM

Mahesh Yeole
01/14/2021, 3:33 AM

Sean Chen
01/14/2021, 9:13 AM

Sean Chen
01/15/2021, 4:39 AM
What does exclude.sequence.id do? Is it used just for naming the segment? If I create 3 segments, each with a unique name but the same time range, can I set exclude.sequence.id to true all the time?
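For reference, that option lives under segmentNameGeneratorSpec in the batch ingestion job spec; a minimal sketch (the type and prefix values are illustrative):

segmentNameGeneratorSpec:
  type: normalizedDate
  configs:
    segment.name.prefix: 'myTable_batch'
    exclude.sequence.id: true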
Sean Chen
01/15/2021, 11:49 AM

Amit Chopra
01/15/2021, 4:53 PM
select device, count(device) as aggreg from metrics where eventTime > 26835599 and eventTime < 26835626 group by device order by aggreg desc limit 10
I see:
• numServersQueried = 2
• numServersResponded = 2
• numSegmentsQueried = 4
• numSegmentsProcessed = 1
• numSegmentsMatched = 1
Questions:
1. Given the above query, the eventTime range falls entirely within a single segment - metrics_OFFLINE_26835599_26835666_3. So I was expecting numServersQueried to be 1 (instead of 2). Do I need to set something up for broker pruning to take effect?
2. Similarly, I was expecting numSegmentsQueried to be 1 (instead of 4).
3. I always see numSegmentsProcessed and numSegmentsMatched with the same value. What is the difference between the two? I looked at https://docs.pinot.apache.org/users/api/querying-pinot-using-standard-sql/response-format, but it wasn't super clear to me from reading there.
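On question 1: in recent Pinot versions, time-range pruning at the broker is opt-in via the routing section of the table config; a minimal sketch, assuming eventTime is the table's primary time column:

"routing": {
  "segmentPrunerTypes": ["time"]
}

With that, the broker skips segments whose time range cannot overlap the filter, which should lower numSegmentsQueried. On question 3: numSegmentsProcessed counts segments the servers actually processed after server-side pruning, while numSegmentsMatched counts segments that contained at least one matching document, so the two coincide whenever every processed segment yields matches.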
Ken Krugler
01/15/2021, 4:59 PM

troywinter
01/18/2021, 6:33 AM
[
{
"errorCode": 200,
"message": "QueryExecutionError:\norg.apache.pinot.core.query.exception.BadQueryRequestException: Caught exception while initializing transform function: lookup\n\tat org.apache.pinot.core.operator.transform.function.TransformFunctionFactory.get(TransformFunctionFactory.java:207)\n\tat org.apache.pinot.core.operator.transform.TransformOperator.<init>(TransformOperator.java:56)\n\tat org.apache.pinot.core.plan.TransformPlanNode.run(TransformPlanNode.java:52)\n\tat org.apache.pinot.core.plan.SelectionPlanNode.run(SelectionPlanNode.java:83)\n\tat org.apache.pinot.core.plan.CombinePlanNode.run(CombinePlanNode.java:100)\n\tat org.apache.pinot.core.plan.InstanceResponsePlanNode.run(InstanceResponsePlanNode.java:33)\n\tat org.apache.pinot.core.plan.GlobalPlanImplV0.execute(GlobalPlanImplV0.java:45)\n\tat org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:294)\n\tat org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:215)\n\tat org.apache.pinot.core.query.executor.QueryExecutor.processQuery(QueryExecutor.java:60)\n\tat org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:157)\n\tat org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:141)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)"
}
]
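For context, this error is raised while initializing the lookup transform. Its expected shape is lookup('dimTableName', 'valueColumn', 'joinKeyColumn', joinKeyValue), and the referenced table must be an offline dimension table (isDimTable set, with primaryKeyColumns declared in its schema). A hypothetical example, with all table and column names invented:

SELECT orderId,
       lookup('customers', 'customerName', 'customerId', customerId) AS customerName
FROM orders
LIMIT 10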
troywinter
01/18/2021, 3:18 PM
2021/01/18 10:26:32.704 INFO [ControllerStarter] [main] Initializing PinotFSFactory
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
at java.lang.Class.getConstructor0(Class.java:3075)
at java.lang.Class.getConstructor(Class.java:1825)
at org.apache.pinot.spi.plugin.PluginManager.createInstance(PluginManager.java:295)
at org.apache.pinot.spi.plugin.PluginManager.createInstance(PluginManager.java:264)
at org.apache.pinot.spi.plugin.PluginManager.createInstance(PluginManager.java:245)
at org.apache.pinot.spi.filesystem.PinotFSFactory.register(PinotFSFactory.java:53)
at org.apache.pinot.spi.filesystem.PinotFSFactory.init(PinotFSFactory.java:74)
at org.apache.pinot.controller.ControllerStarter.initPinotFSFactory(ControllerStarter.java:481)
at org.apache.pinot.controller.ControllerStarter.setUpPinotController(ControllerStarter.java:329)
at org.apache.pinot.controller.ControllerStarter.start(ControllerStarter.java:287)
at org.apache.pinot.tools.service.PinotServiceManager.startController(PinotServiceManager.java:116)
at org.apache.pinot.tools.service.PinotServiceManager.startRole(PinotServiceManager.java:91)
at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.lambda$startBootstrapServices$0(StartServiceManagerCommand.java:234)
at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startPinotService(StartServiceManagerCommand.java:286)
at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startBootstrapServices(StartServiceManagerCommand.java:233)
at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.execute(StartServiceManagerCommand.java:183)
at org.apache.pinot.tools.admin.command.StartControllerCommand.execute(StartControllerCommand.java:130)
at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:162)
at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:182)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 21 more
And below are the startup opts:
JAVA_OPTS -Xms256M -Xmx1G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -Xloggc:/opt/pinot/gc-pinot-controller.log -Dlog4j2.configurationFile=/opt/pinot/conf/pinot-controller-log4j2.xml -Dplugins.dir=/opt/pinot/plugins -Dplugins.include=pinot-hdfs -classpath /opt/hadoop-lib/hadoop-common-3.1.1.3.1.0.0-78.jar:/opt/hadoop-lib/hadoop-client-3.1.1.3.1.0.0-78.jar:/opt/hadoop-lib/hadoop-hdfs-3.1.1.3.1.0.0-78.jar:/opt/hadoop-lib/hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar
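The NoClassDefFoundError suggests the Hadoop classes are not visible to the classloader that instantiates the pinot-hdfs plugin. One approach from the Pinot HDFS deep-storage setup is to prepend the Hadoop jars via the CLASSPATH_PREFIX environment variable (picked up by pinot-admin.sh) rather than overriding -classpath inside JAVA_OPTS; a sketch using the jar paths from the message above:

export HADOOP_LIB=/opt/hadoop-lib
export CLASSPATH_PREFIX="${HADOOP_LIB}/hadoop-common-3.1.1.3.1.0.0-78.jar:${HADOOP_LIB}/hadoop-client-3.1.1.3.1.0.0-78.jar:${HADOOP_LIB}/hadoop-hdfs-3.1.1.3.1.0.0-78.jar:${HADOOP_LIB}/hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar"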
Davide Berdin
01/18/2021, 9:54 PM