# general
  • a

    Alex

    11/20/2019, 1:08 AM
    I was asking about a different field:
    Copy code
    aggregateMetrics - Switch for the aggregate metrics feature. This feature will aggregate realtime stream data as it is consumed, where applicable, in order to reduce segment sizes. We sum the metric column values of all rows that have the same value for dimension columns and create one row in a realtime segment for all such rows. This feature is only available on REALTIME tables.
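    A minimal sketch of where this switch might sit in a realtime table config; the table name and the exact placement under tableIndexConfig are assumptions, so verify against the table config reference for your Pinot version:

        {
          "tableName": "myEvents_REALTIME",
          "tableType": "REALTIME",
          "tableIndexConfig": {
            "loadMode": "MMAP",
            "aggregateMetrics": true
          }
        }

    With the flag on, rows that share the same dimension values are collapsed into a single row in the consuming segment, with their metric values summed.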
  • a

    Alex

    11/20/2019, 1:18 AM
    @User which class is responsible for aggregations?
  • n

    Neha Pawar

    11/20/2019, 1:37 AM
    https://github.com/apache/incubator-pinot/blob/master/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java#L668
  • n

    Neha Pawar

    11/20/2019, 1:38 AM
    It gets set in the RealtimeSegmentConfig when initializing the segment
  • s

    Subbu Subramaniam

    11/20/2019, 2:02 AM
    Btw, I have a pull request following up on your recent comments on the documentation and the discussion: https://github.com/apache/incubator-pinot/pull/4839
  • k

    Kishore G

    11/20/2019, 4:24 AM
    Screen Shot 2019-11-19 at 8.23.53 PM.png
  • s

    Seunghyun

    11/20/2019, 6:42 AM
    star-tree will only be generated for flushed segments (immutable segments). aggregateMetrics is a different feature that aggregates values for consuming segments, and it only supports sum as of now. On the other hand, star-tree supports multiple aggregation functions.
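    For comparison, a hedged sketch of a star-tree definition in tableIndexConfig; the column names are hypothetical and the exact shape of starTreeIndexConfigs should be checked against the docs for your Pinot version:

        "starTreeIndexConfigs": [
          {
            "dimensionsSplitOrder": ["country", "browser"],
            "skipStarNodeCreationForDimensions": [],
            "functionColumnPairs": ["SUM__clicks", "MAX__latencyMillis"],
            "maxLeafRecords": 10000
          }
        ]

    Unlike aggregateMetrics, which sums while consuming, this index is built when a segment is committed and can pre-aggregate with several functions.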
  • s

    Sidd

    11/20/2019, 9:02 PM
    @here, design document for the Text Search feature. Everyone should be able to access it.
    👍 2
  • s

    Sidd

    11/20/2019, 9:02 PM
    https://docs.google.com/document/d/19uLti7wwl7nPlDuy6cUVnLOll2C8u3YtUITbNj0TT5o/edit?usp=sharing
  • e

    Elon

    11/21/2019, 12:12 AM
    We have 270M records in a Pinot table. When we do a select ... order by ... limit 5, the query returns in 10 seconds with no data (from the controller UI), but if we remove the order by, it returns instantly with the expected 5 rows from the limit clause.
  • s

    Seunghyun

    11/21/2019, 12:37 AM
    @User Do you have any filter on your query? If you run order by without any filter, you’re basically sorting the entire table, which is very expensive.
  • e

    Elon

    11/21/2019, 12:38 AM
    We did not :) Is there a way we can speed it up with an index?
  • e

    Elon

    11/21/2019, 12:38 AM
    Mostly just testing things out right now and experimenting with PQL.
  • s

    Seunghyun

    11/21/2019, 12:43 AM
    If it’s just for testing, you can add a filter to reduce the number of rows to be sorted.
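    For example, a bounded version of the query (the table name is taken from the logs later in this thread; the columns and the time filter are hypothetical):

        SELECT orderId, amount
        FROM flattened_orders_hours
        WHERE hoursSinceEpoch >= 437000
        ORDER BY amount DESC
        LIMIT 5

    Only the rows passing the WHERE clause are sorted, instead of all 270M.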
  • a

    Alex

    11/21/2019, 12:49 AM
    interestingly, we also had high latency when executing: select * from bla limit 10
  • a

    Alex

    11/21/2019, 12:49 AM
    10 seconds of latency
  • a

    Alex

    11/21/2019, 12:50 AM
    and the JSON response was saying only 2 out of 3 servers replied. Couldn’t find any fishy logs on the brokers.
  • s

    Seunghyun

    11/21/2019, 12:53 AM
    The broker log contains a requestId, and you will be able to find the relevant log on the server side using the broker host + requestId pair.
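    A hedged example of that correlation, assuming the default log file names and that the request id appears verbatim in both logs (adjust the grep patterns to whatever your broker actually prints):

        # on the broker: find the slow query and note its request id
        grep -i "requestid" pinotBroker.log | grep -i "order by"

        # on each server: pull the processing log for that request id (12345 is a placeholder)
        grep "12345" pinotServer.log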
  • a

    Alex

    11/21/2019, 12:55 AM
    let me check
  • a

    Alex

    11/21/2019, 12:59 AM
    see:
  • a

    Alex

    11/21/2019, 12:59 AM
    Copy code
    2019/11/21 00:01:34.700 ERROR [CombineOperator] [pqr-0] Caught ExecutionException.
    java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out while polling result from first thread
    	at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_232]
    	at java.util.concurrent.FutureTask.get(FutureTask.java:206) ~[?:1.8.0_232]
    	at org.apache.pinot.core.operator.CombineOperator.getNextBlock(CombineOperator.java:158) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.operator.CombineOperator.getNextBlock(CombineOperator.java:44) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:48) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.operator.InstanceResponseOperator.getNextBlock(InstanceResponseOperator.java:37) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.operator.InstanceResponseOperator.getNextBlock(InstanceResponseOperator.java:26) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:48) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.plan.GlobalPlanImplV0.execute(GlobalPlanImplV0.java:48) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:213) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:152) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:136) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_232]
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_232]
    	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111) ~[guava-20.0.jar:?]
    	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58) ~[guava-20.0.jar:?]
    	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75) ~[guava-20.0.jar:?]
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_232]
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_232]
    	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
    Caused by: java.util.concurrent.TimeoutException: Timed out while polling result from first thread
    	at org.apache.pinot.core.operator.CombineOperator$2.callJob(CombineOperator.java:133) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.operator.CombineOperator$2.callJob(CombineOperator.java:126) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.util.trace.TraceCallable.call(TraceCallable.java:44) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	... 8 more
    2019/11/21 00:01:34.704 ERROR [CombineOperator] [pqw-1] Caught exception while executing query.
    java.lang.RuntimeException: Thread has been interrupted
    	at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:37) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.operator.query.SelectionOrderByOperator.getNextBlock(SelectionOrderByOperator.java:145) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.operator.query.SelectionOrderByOperator.getNextBlock(SelectionOrderByOperator.java:44) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:48) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.operator.CombineOperator$1.runJob(CombineOperator.java:104) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at org.apache.pinot.core.util.trace.TraceRunnable.run(TraceRunnable.java:40) ~[pinot-core-0.2.0-SNAPSHOT.jar:0.2.0-SNAPSHOT-eb45b438c5053f5caaf289614f386706a472947e]
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_232]
    	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_232]
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_232]
    	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111) ~[guava-20.0.jar:?]
    	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58) ~[guava-20.0.jar:?]
    	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75) ~[guava-20.0.jar:?]
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_232]
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_232]
    	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
  • a

    Alex

    11/21/2019, 1:00 AM
    no data is coming in, but the servers are constantly logging:
  • a

    Alex

    11/21/2019, 1:00 AM
    Copy code
    2019/11/21 00:59:54.031 WARN [ConsumerConfig] [flattened_orders_hours__75__6__20191120T0115Z] The configuration 'stream.kafka.consumer.prop.group.id' was supplied but isn't a known config.
    2019/11/21 00:59:54.031 WARN [ConsumerConfig] [flattened_orders_hours__75__6__20191120T0115Z] The configuration 'stream.kafka.decoder.class.name' was supplied but isn't a known config.
    2019/11/21 00:59:54.031 WARN [ConsumerConfig] [flattened_orders_hours__75__6__20191120T0115Z] The configuration 'streamType' was supplied but isn't a known config.
    2019/11/21 00:59:54.031 WARN [ConsumerConfig] [flattened_orders_hours__75__6__20191120T0115Z] The configuration 'stream.kafka.consumer.type' was supplied but isn't a known config.
    2019/11/21 00:59:54.031 WARN [ConsumerConfig] [flattened_orders_hours__75__6__20191120T0115Z] The configuration 'stream.kafka.zk.broker.url' was supplied but isn't a known config.
    2019/11/21 00:59:54.031 WARN [ConsumerConfig] [flattened_orders_hours__75__6__20191120T0115Z] The configuration 'stream.kafka.broker.list' was supplied but isn't a known config.
    2019/11/21 00:59:54.031 WARN [ConsumerConfig] [flattened_orders_hours__75__6__20191120T0115Z] The configuration 'realtime.segment.flush.threshold.time' was supplied but isn't a known config.
    2019/11/21 00:59:54.031 WARN [ConsumerConfig] [flattened_orders_hours__75__6__20191120T0115Z] The configuration 'stream.kafka.consumer.prop.auto.offset.reset' was supplied but isn't a known config.
    2019/11/21 00:59:54.031 WARN [ConsumerConfig] [flattened_orders_hours__75__6__20191120T0115Z] The configuration 'stream.kafka.consumer.factory.class.name' was supplied but isn't a known config.
    2019/11/21 00:59:54.031 WARN [ConsumerConfig] [flattened_orders_hours__75__6__20191120T0115Z] The configuration 'stream.kafka.topic.name' was supplied but isn't a known config.
  • k

    Kishore G

    11/21/2019, 1:26 AM
    I think we saw this the other day as well. I have an idea on why this might be happening.
  • k

    Kishore G

    11/21/2019, 1:27 AM
    Basically, whenever there is a bad query, full table scan, etc., there is a full GC
  • k

    Kishore G

    11/21/2019, 1:27 AM
    and Kafka consumption does not recover once the GC is done
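    One way to check whether full GC pauses line up with the consumption stall, assuming you can run the JDK tools on the server host (the pid lookup is illustrative):

        # find the Pinot server JVM pid
        jps -l | grep -i pinot

        # print GC utilization every second; jumps in the FGC/FGCT columns mark full GCs
        jstat -gcutil <pid> 1000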
  • k

    Kishore G

    11/21/2019, 1:27 AM
    can you grep for zookeeper state in the server log?
  • n

    Neha Pawar

    11/21/2019, 2:36 AM
    when did you start seeing this issue? We are seeing similar behavior in a release we rolled out recently, only for realtime servers.
  • k

    Kishore G

    11/21/2019, 2:38 AM
    Don’t recall the exact date/release
  • k

    Kishore G

    11/21/2019, 2:39 AM
    @User can you please grep for zookeeper in the server logs and share it?
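    A hedged version of that grep; the log path and patterns are illustrative, so adjust them to your deployment:

        # look for ZooKeeper/Helix session state changes around the time consumption stopped
        grep -iE "zookeeper|expired|disconnect" pinotServer.log | tail -n 50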