# troubleshooting

    Alice

    07/04/2022, 5:14 AM
Hi team, I’m testing the dedup feature for a Pinot realtime table. After uploading the table config successfully, I found the table status is bad. Server logs show the following error: 2022/07/04 05:06:48.156 WARN [TableStateUtils] [HelixTaskExecutor-message_handle_thread] Failed to find current state for instance: Server_pinot-server-20.pinot-server-headless.pinot-trial.svc.cluster.local_8098, sessionId: 2000280dc650274, table: table_name_REALTIME. The test is based on Pinot 0.11.

    Ehsan Irshad

    07/04/2022, 7:51 AM
Hi, I am trying to add the dependency for the Pinot Flink connector to my sbt file, but it seems the module is not available: https://repo1.maven.org/maven2/org/apache/pinot/pinot-connectors/0.10.0/pinot-connectors-0.10.0.pom
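A minimal sbt sketch, assuming the Flink connector is published as a separate artifact named pinot-flink-connector (it is not under the pinot-connectors 0.10.0 aggregator pom linked above; check Maven Central for the version that actually ships it):
Copy code
// hedged sketch: artifact id and version are assumptions, verify on Maven Central
libraryDependencies += "org.apache.pinot" % "pinot-flink-connector" % "0.11.0"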

    Ayush Kumar Jha

    07/04/2022, 9:43 AM
Hi everyone, I need some help here. We are moving all our infrastructure, including Pinot, to another data center. I want to use the segments stored in blob storage as a backup. How can I add these segments to the new cluster?
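A hedged sketch of one approach with the admin tool, assuming the backed-up segment tars have been downloaded from blob storage to a local directory and the new cluster's controller is reachable (host, port, and paths are placeholders):
Copy code
# hypothetical host/paths: UploadSegment pushes local segment tars to the controller,
# which then distributes them to servers in the new cluster
bin/pinot-admin.sh UploadSegment \
  -controllerHost new-controller.example.com \
  -controllerPort 9000 \
  -segmentDir /backup/segments/myTable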

    Ilya Yatsishin

    07/04/2022, 1:36 PM
Hi! I’m trying to ingest some data from a CSV file. I created the schema and table and started the ingestion job, but the output below is everything I see. I can’t find any logs, and there is nothing in the UI about jobs. This is my second time trying to use Pinot, but there are no diagnostics and I’m stuck.
    Copy code
    Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
    Initializing PinotFS for scheme file, classname org.apache.pinot.spi.filesystem.LocalPinotFS
    Creating an executor service with 10 threads(Job parallelism: 10, available cores: 16.)
    Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner
    Initializing PinotFS for scheme file, classname org.apache.pinot.spi.filesystem.LocalPinotFS
    Start pushing segments: []... to locations: [org.apache.pinot.spi.ingestion.batch.spec.PinotClusterSpec@73ab3aac] for table hits

    Mohamed Emad

    07/04/2022, 11:31 PM
Hi! I have a realtime table that suddenly stopped consuming from Kafka. When I debug the table I find "errorMessage": "Could not build segment".

    Anish Nair

    07/05/2022, 7:53 AM
Hey team, do we have any way to get segment metadata like Druid provides (https://druid.apache.org/docs/latest/querying/segmentmetadataquery.html)? I'm looking to derive info like column cardinality, etc.
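For comparison, Pinot's controller exposes per-segment metadata over REST; a hedged sketch, where the table and segment names are placeholders and the exact path may vary by version:
Copy code
# hypothetical table/segment names; returns the segment's metadata as JSON,
# with per-column details depending on version
curl "http://localhost:9000/segments/myTable_OFFLINE/myTable_OFFLINE_0/metadata"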

    Amit Bisht

    07/05/2022, 9:42 AM
Hi, I am trying to connect Tableau Desktop with the Pinot server using these steps (https://docs.pinot.apache.org/integrations/tableau), but Tableau gives me the following error.
    Copy code
    An error occurred while communicating with Other Databases (JDBC)
    Bad Connection: Tableau could not connect to the data source.
    Error Code: FAB9A2C5
    org/apache/commons/configuration/Configuration
    Generic JDBC connection error
    org/apache/commons/configuration/Configuration
I checked through Postman that the broker and server endpoints are working. Does anyone here know how to fix or troubleshoot this?

    harnoor

    07/05/2022, 3:33 PM
Hi. Can someone tell me the difference between the two filters below?
text_match(backend_name,'/perf/')
and
REGEXP_LIKE(backend_name,'perf')
What would be the equivalent text_match filter for REGEXP_LIKE(backend_name,'perf')? Thanks
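For context, a hedged sketch of the usual equivalence: a Lucene regex (used by text_match) must match the whole token, while REGEXP_LIKE does a substring match, so an explicit wildcard wrapper is needed. This assumes a Lucene text index exists on backend_name:
Copy code
-- hedged sketch: substring semantics of REGEXP_LIKE(backend_name, 'perf')
-- expressed as a Lucene regex in text_match
SELECT * FROM mytable
WHERE TEXT_MATCH(backend_name, '/.*perf.*/')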

    Ilya Yatsishin

    07/05/2022, 3:39 PM
Continuation of 🧵 https://apache-pinot.slack.com/archives/C011C9JHN7R/p1656941781724149 I got the same dataset in Parquet and tried to import it without the CSV parser. 1. It again writes something about AVRO. 2. It sees this number of records: "RecordReader initialized will read a total of 99997497 records." But then it tries to push data after row 43172732; I'm not sure if it was planning to process the rest or failed earlier. 3. The list of segments it tries to push is still empty. I don't understand whether something is wrong or not; nothing is written. 4. It looks like the job finished, and no data can be seen. I can't see any error:
    Copy code
    at row 43172732. reading next block
    block read in memory in 1682 ms. row count = 262144
    Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner
    Initializing PinotFS for scheme file, classname org.apache.pinot.spi.filesystem.LocalPinotFS
    Start pushing segments: []... to locations: [org.apache.pinot.spi.ingestion.batch.spec.PinotClusterSpec@7b7b3edb] for table hits
    Full log https://pastila.nl/?017b92b4/361513a635141c533150a5b79f6a4848

    Rohan Kaushal

    07/05/2022, 6:00 PM
Hi, I'm trying to query the broker API at http://localhost:8099/query/sql but I get the following error response: No 'Access-Control-Allow-Origin' header is present on the requested resource. Any help would be much appreciated.
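For what it's worth, the 'Access-Control-Allow-Origin' check is enforced by browsers; a sketch of the same request from curl, where CORS does not apply (query and table name are placeholders):
Copy code
# POST the SQL as a JSON body to the broker's query endpoint
curl -X POST http://localhost:8099/query/sql \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT COUNT(*) FROM myTable"}'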

    Abdullah Jaffer

    07/06/2022, 12:05 AM
I have a use case where I need to perform a group by based on some filters and then average out the result of the group by, like so:
Copy code
select sum(col1) as sum from table group by col2
result:
sum  col2
1    1
2    2
3    3
and then average the result: (1 + 2 + 3)/3 = 2. I need a subquery for this, and I think subqueries are not supported in Pinot. Can this be accomplished through the Trino connector? If so, how efficient is it? I don't want to average the result in application code, since that is not scalable given the unpredictable number of groups. See the sketch below.
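A hedged sketch of what this could look like through the Trino Pinot connector, with catalog/schema/table names as placeholders; the inner GROUP BY may be pushed down to Pinot while the outer AVG runs in Trino:
Copy code
-- outer aggregation over a subquery, which Pinot itself does not support
SELECT AVG(s) AS avg_sum
FROM (
  SELECT SUM(col1) AS s
  FROM pinot.default.mytable
  GROUP BY col2
) t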

    Alice

    07/06/2022, 2:46 AM
Hi team, I have a question about the RealtimeToOfflineSegmentsTask (r2o). To make my question clear, I'll show my table config here. The following config means the r2o task runs every 2 hours. It takes all realtime segments that contain rows with timestamps earlier than the current time minus 6h, and creates new offline segments. Every new segment holds 1h of data with no more than 2,000,000 rows; it splits into more segments if the row count within that hour exceeds 2,000,000. Could you help me confirm whether I'm understanding this task config correctly? Thanks.
    Copy code
    "taskTypeConfigsMap": {
          "RealtimeToOfflineSegmentsTask": {
            "bucketTimePeriod": "1h",
            "bufferTimePeriod": "6h",
            "schedule": "0 0 0/2 * * ?",
            "roundBucketTimePeriod": "1m",
            "mergeType": "rollup",
            "value.aggregationType": "max",
            "maxNumRecordsPerSegment": "2000000"
          }
        }

    Harsha Dandu

    07/06/2022, 6:56 AM
Hi team, I'm trying the quickstart in Docker, but it seems to be hung and I can't access localhost:9000 either. I am trying this on macOS Monterey. There are no "failed to start Pinot" errors, but it hangs at this point in the terminal.

    eywek

    07/06/2022, 8:43 AM
Hello, I’m using a hybrid table. I have an OFFLINE segment containing 1k docs created July 1st 2022, 20733, and a REALTIME segment containing 0 docs created July 2nd 2022, 11937. But when I query my table
    Copy code
    SELECT * FROM datasource_6298afcc7527000300387fdf
only my consuming segment is queried and the query returns 0 docs. And if I do
    Copy code
    SELECT * FROM datasource_6298afcc7527000300387fdf_OFFLINE
I get 1k docs. It seems the OFFLINE table/segment is ignored (as if the table weren't hybrid anymore?). Do you have any idea how I can troubleshoot this, or any tips to fix it? I cannot see anything useful in the logs:
    Copy code
    2022/07/06 08:21:52.843 INFO [QueryScheduler] [pqr-1] Processed requestId=16284217,table=datasource_6298afcc7527000300387fdf_REALTIME,segments(queried/processed/matched/consuming)=1/0/0/1,schedulerWaitMs=0,reqDeserMs=0,totalExecMs=0,resSerMs=0,totalTimeMs=0,minConsumingFreshnessMs=9223372036854775807,broker=Broker_<ip>_8099,numDocsScanned=0,scanInFilter=0,scanPostFilter=0,sched=FCFS,threadCpuTimeNs(total/thread/sysActivity/resSer)=0/0/0/0
    2022/07/06 08:21:52.844 INFO [BaseBrokerRequestHandler] [jersey-server-managed-async-executor-13771] requestId=16284217,table=datasource_6298afcc7527000300387fdf,timeMs=2,docs=0/0,entries=0/0,segments(queried/processed/matched/consuming/unavailable):1/0/0/1/0,consumingFreshnessTimeMs=9223372036854775807,servers=1/1,groupLimitReached=false,brokerReduceTimeMs=0,exceptions=0,serverStats=(Server=SubmitDelayMs,ResponseDelayMs,ResponseSize,DeserializationTimeMs,RequestSentDelayMs);<ip>_R=0,1,927,0,-1,offlineThreadCpuTimeNs(total/thread/sysActivity/resSer):0/0/0/0,realtimeThreadCpuTimeNs(total/thread/sysActivity/resSer):0/0/0/0,query=SELECT * FROM datasource_6298afcc7527000300387fdf
    Thank you

    A_Phil

    07/06/2022, 9:06 AM
My requirement: I want to create a single table where I can do aggregations like sum on the integer values of a column, but also hold string values in the same column. What I am dealing with: I have a realtime table where I am ingesting data with fields timestamp, id and value; the id describes the value being ingested, and value can hold both INT and STRING for my use case. I wanted to know which of these options are feasible: 1. Create columns value_int and value_string and use a filtering function in Pinot that saves records into value_int if value is an INT, and vice-versa for STRING values. I tried this, but the filter function as shown in the docs does not allow it. 2. Store all values as STRING and use a Pinot-specific CAST or CONVERT function to convert values to INT for aggregations. But I could not find a cast/convert function in Pinot, so I am not able to do sum operations on the data. I would welcome any ideas/workarounds.
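For option 2, a hedged sketch of what such a query could look like, assuming the Pinot version in use supports a CAST transform function (worth verifying against the docs for that release; the column and table names are hypothetical):
Copy code
-- hedged sketch: assumes CAST is available as a transform function
SELECT id, SUM(CAST(value AS INT)) AS total
FROM mytable
GROUP BY id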

    Cheguri Vinay Goud

    07/06/2022, 9:13 AM
Hello, I'm running both Trino and Pinot in minikube (in the same namespace), and when I try to query Pinot using Trino I'm facing the issue below. Can someone please help me with this? I don't understand why the request is forwarded to port 8090 (refer to the attached error logs). This port number is neither present in the Pinot helm chart nor in pinot.properties in the Trino configmap. Below is the Pinot config in the Trino configmap.
    Copy code
    pinot.properties: |
        connector.name=pinot
    pinot.controller-urls=http://pinot-controller-headless:9000
    trino-error-logs.txt
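One avenue worth checking, stated as an assumption rather than a confirmed diagnosis: 8090 is the default gRPC port for Pinot servers, and newer Trino Pinot connector versions query servers over gRPC by default, which would explain a port that appears in neither config. A sketch of the experiment:
Copy code
pinot.properties: |
    connector.name=pinot
    pinot.controller-urls=http://pinot-controller-headless:9000
    # hypothetical experiment: disable server gRPC queries if your connector
    # version supports this flag, or expose port 8090 on the Pinot servers
    pinot.grpc.enabled=false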

    David Gregory

    07/06/2022, 12:07 PM
Hey all, I’m new to Pinot and this Slack. I’m stuck on a couple of issues and hoping there are answers here. 1. Realtime tables: I am trying to add a very simple realtime table via the command line. My Pinot deployment is inside a Docker container. Here is the line I am running:
    docker exec -i pinot ./pinot/bin/pinot-admin.sh AddTable -schemaFile testschema.json –tableConfigFile testtable.json -exec
• From what I can tell, this is exactly what it should be. The schema is already successfully added, and Kafka is running with messages in a topic. However, when I run this line, it returns the error: MissingParameterException: Missing required option: ‘-tableConfigFile=<_tableConfigFile>’
• Then, when I look in the Pinot UI, there is no table.
• I then try to add the table via the Pinot UI and discover that the table is actually there, but I cannot see it in the UI.
◦ I initially see this message when saving the realtime table via the Pinot UI: “TimeoutException: Timeout expired while fetching topic metadata”
◦ I try the save again, and at that point I receive a message that the table already exists.
• Apparently, I am able to delete the table using the “Delete tableConfig” endpoint in the Swagger API; “Delete tableName” would not work.
• Attached are the schema and table config.
testschema.json, testtable.json

    Alice

    07/06/2022, 1:15 PM
Hi, another question about the RealtimeToOfflineSegmentsTask. We found that a small bucket (1h) of data can take more than 1 hour to process, especially in the mapper phase. Is that normal? Are there some task configs I'm missing here?

    Harsha Dandu

    07/06/2022, 4:17 PM
<!channel> I have started a test Pinot cluster on Kubernetes (AWS EKS), and I'm new to Kubernetes. How do I run the commands below via kubectl to do a sample ingestion of offline data (say I have some dumps in S3 buckets to test with)? bin/pinot-admin.sh AddTable -schemaFile examples/batch/airlineStats/airlineStats_schema.json -tableConfigFile examples/batch/airlineStats/airlineStats_offline_table_config.json -exec and bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile examples/batch/airlineStats/ingestionJobSpec.yaml
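A minimal sketch of running those commands inside a controller pod with kubectl exec; the pod name and namespace are assumptions, adjust to your helm release:
Copy code
# find the actual pod names first: kubectl get pods -n pinot
kubectl exec -it pinot-controller-0 -n pinot -- \
  bin/pinot-admin.sh AddTable \
    -schemaFile examples/batch/airlineStats/airlineStats_schema.json \
    -tableConfigFile examples/batch/airlineStats/airlineStats_offline_table_config.json \
    -exec

kubectl exec -it pinot-controller-0 -n pinot -- \
  bin/pinot-admin.sh LaunchDataIngestionJob \
    -jobSpecFile examples/batch/airlineStats/ingestionJobSpec.yaml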

    Grace Walkuski

    07/06/2022, 8:00 PM
    Hi! Is this still accurate? https://docs.pinot.apache.org/basics/getting-started/frequent-questions/query-faq#does-pagination-work-in-group-by-queries

    Tommaso Garuglieri

    07/06/2022, 9:18 PM
Hi! We are evaluating Pinot (commit 7c15bc) to replace Trino for low-latency geospatial aggregation queries. We have a table with about 10^8 records, and using star-tree indexes we are able to run our queries orders of magnitude faster than Trino 🚀. But when using geospatial filtering with ST_Contains, we see performance degradation (from <200ms to over 10 seconds), even though we are using H3 geospatial indexes with different resolutions (which are triggered as expected) and the geometries are not complex. Our queries are pretty straightforward:
    Copy code
    select sum(x) from <table> where ST_Contains(ST_GeogFromText('...'),location_st_point) = 1 and ...
Is there any way we can improve latencies for geospatial aggregations? Should we just avoid geospatial filters at this number of records?

    Weixiang Sun

    07/07/2022, 12:25 AM
Quick question: does Pinot 0.8 support group by on MV (multi-value) columns? Here is a sample query:
    Copy code
    SELECT mv_column
    FROM enriched_customer_orders_v1_17_1 
    GROUP BY mv_column
    LIMIT 10
    Here is the exception:
    Copy code
    "message": "QueryExecutionError:\njava.lang.UnsupportedOperationException\n\tat org.apache.pinot.segment.spi.index.reader.MutableForwardIndex.readDictIds(MutableForwardIndex.java:71)\n\tat org.apache.pinot.segment.spi.index.reader.MutableForwardIndex.readDictIds(MutableForwardIndex.java:76)\n\tat org.apache.pinot.core.common.DataFetcher$ColumnValueReader.readDictIds(DataFetcher.java:278)\n\tat org.apache.pinot.core.common.DataFetcher.fetchDictIds(DataFetcher.java:88)\n\tat org.apache.pinot.core.common.DataBlockCache.getDictIdsForSVColumn(DataBlockCache.java:99)\n\tat org.apache.pinot.core.operator.docvalsets.ProjectionBlockValSet.getDictionaryIdsSV(ProjectionBlockValSet.java:69)\n\tat org.apache.pinot.core.query.distinct.dictionary.DictionaryBasedSingleColumnDistinctOnlyExecutor.process(DictionaryBasedSingleColumnDistinctOnlyExecutor.java:42)\n\tat org.apache.pinot.core.operator.query.DistinctOperator.getNextBlock(DistinctOperator.java:61)\n\tat org.apache.pinot.core.operator.query.DistinctOperator.getNextBlock(DistinctOperator.java:38)\n\tat org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49)\n\tat org.apache.pinot.core.operator.combine.BaseCombineOperator.processSegments(BaseCombineOperator.java:150)\n\tat org.apache.pinot.core.operator.combine.BaseCombineOperator$1.runJob(BaseCombineOperator.java:105)\n\tat org.apache.pinot.core.util.trace.TraceRunnable.run(TraceRunnable.java:40)\n\tat java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)",
        "errorCode": 200

    Diogo Baeder

    07/07/2022, 1:27 AM
Hi folks! I'd like to ask for some help running Pinot locally on a k3s cluster (a tiny k8s cluster for development). I'm migrating my Pinot instance manifests from Deployment to StatefulSet and facing some issues. More in this thread.

    Peter Pringle

    07/07/2022, 7:23 AM
I have added a couple of new columns to my existing table schema. Offline tables seem fine, but when querying the REALTIME table with select * I end up with an error:
MergeResponseError: responses for table: myTable from servers [d123-1_R] got dropped due to data schema inconsistency
What is the correct way to fix this? I tried reloading all segments in the realtime table and also restarted all processes, but that didn't seem to fix the issue.

    Dan DC

    07/07/2022, 10:13 AM
Hey all, we have been experiencing an issue with our k8s deployment. We have a node group to which we deploy ZooKeeper and all Pinot nodes, and this node group is scaled down to 0 every n weeks. When the nodes come back up, we noticed that some realtime tables report error code 305 (segments unavailable) when queried. We have to rebalance these tables to get them back to normal. I know scaling the node group down to 0 may not be a good policy, but I doubt we can change this. My question is whether this behaviour is expected or can be considered an issue to be fixed. Please let me know if I should raise a GitHub issue for this.

    Ilya Yatsishin

    07/07/2022, 3:56 PM
    Copy code
    SELECT UserID, minute(EventTime) AS m, SearchPhrase, COUNT(*) FROM hits GROUP BY UserID, m, SearchPhrase ORDER BY COUNT(*) DESC LIMIT 10
    java.lang.NullPointerException: null
    Copy code
    Caught exception while merging results blocks (query: QueryContext{_tableName='hits_OFFLINE', _selectExpressions=[UserID, minute(EventTime), SearchPhrase, count(*)], _aliasList=[null, m, null, null], _filter=null, _groupByExpressions=[UserID, minute(EventTime), SearchPhrase], _havingFilter=null, _orderByExpressions=[count(*) DESC], _limit=10, _offset=0, _queryOptions={responseFormat=sql, groupByMode=sql, timeoutMs=10000000}, _debugOptions=null, _brokerRequest=BrokerRequest(querySource:QuerySource(tableName:hits_OFFLINE), pinotQuery:PinotQuery(dataSource:DataSource(tableName:hits_OFFLINE), selectList:[Expression(type:IDENTIFIER, identifier:Identifier(name:UserID)), Expression(type:FUNCTION, functionCall:Function(operator:AS, operands:[Expression(type:FUNCTION, functionCall:Function(operator:MINUTE, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:EventTime))])), Expression(type:IDENTIFIER, identifier:Identifier(name:m))])), Expression(type:IDENTIFIER, identifier:Identifier(name:SearchPhrase)), Expression(type:FUNCTION, functionCall:Function(operator:COUNT, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:*))]))], groupByList:[Expression(type:IDENTIFIER, identifier:Identifier(name:UserID)), Expression(type:FUNCTION, functionCall:Function(operator:MINUTE, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:EventTime))])), Expression(type:IDENTIFIER, identifier:Identifier(name:SearchPhrase))], orderByList:[Expression(type:FUNCTION, functionCall:Function(operator:DESC, operands:[Expression(type:FUNCTION, functionCall:Function(operator:COUNT, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:*))]))]))], limit:10, queryOptions:{responseFormat=sql, groupByMode=sql, timeoutMs=10000000}))})
    java.lang.NullPointerException: null
            at org.apache.pinot.core.operator.combine.GroupByOrderByCombineOperator.mergeResults(GroupByOrderByCombineOperator.java:236) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.operator.combine.BaseCombineOperator.getNextBlock(BaseCombineOperator.java:119) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.operator.combine.BaseCombineOperator.getNextBlock(BaseCombineOperator.java:50) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.operator.InstanceResponseOperator.getCombinedResults(InstanceResponseOperator.java:113) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.operator.InstanceResponseOperator.getNextBlock(InstanceResponseOperator.java:106) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.operator.InstanceResponseOperator.getNextBlock(InstanceResponseOperator.java:34) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.plan.GlobalPlanImplV0.execute(GlobalPlanImplV0.java:53) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:304) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:203) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.query.executor.QueryExecutor.processQuery(QueryExecutor.java:60) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:151) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:137) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
            at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111) [pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at shaded.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58) [pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75) [pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
            at java.lang.Thread.run(Thread.java:829) [?:?]

    Greg P

    07/07/2022, 4:42 PM
Hi all, hopefully someone with Minion task experience can chime in. I have been trying to implement a custom task for purging records. I took the existing PurgeTask classes as a base and managed to get it working, but I hit one issue: REALTIME segments cannot be updated even though the API supports it. It appears the table type is lost and it defaults to OFFLINE. I made this change:
    Copy code
    @@ -158,7 +158,10 @@ public abstract class BaseSingleSegmentConversionExecutor extends BaseTaskExecut
               new BasicNameValuePair(FileUploadDownloadClient.QueryParameters.ENABLE_PARALLEL_PUSH_PROTECTION, "true");
           NameValuePair tableNameParameter = new BasicNameValuePair(FileUploadDownloadClient.QueryParameters.TABLE_NAME,
               TableNameBuilder.extractRawTableName(tableNameWithType));
    -      List<NameValuePair> parameters = Arrays.asList(enableParallelPushProtectionParameter, tableNameParameter);
    +      NameValuePair tableTypeParameter = new BasicNameValuePair(FileUploadDownloadClient.QueryParameters.TABLE_TYPE,
    +              TableNameBuilder.getTableTypeFromTableName(tableNameWithType).toString());
    +      List<NameValuePair> parameters = Arrays.asList(enableParallelPushProtectionParameter, tableNameParameter,
    +              tableTypeParameter);

    Greg P

    07/07/2022, 4:43 PM
Basically, this adds the tableType param. I hope someone can validate that this is the right way to go and that there isn't something fundamentally wrong with REALTIME here. My plugin seems to work, but I have only tested it locally on a small sample.

    Diogo Baeder

    07/07/2022, 7:53 PM
Hi guys, quick question about the behavior of aggregations in Pinot: in the case of SUM(), is it done in the server, the broker, or both? If I run queries with lots of SUMs going on, should I expect the heavier work to be done by the server or the broker?

    Abdullah Jaffer

    07/07/2022, 9:29 PM
Hi guys, just a question: does field order matter during ingestion? Let's say I have a CSV file with fields in the order col1, col2, col3. Do the fields in my schema definition need to follow the same order as in the CSV file?