# troubleshooting
  • Diogo Baeder

    11/18/2021, 1:31 PM
    I don't really know, that part was not written by me, but I'll try to find out. By the way, your trick with the env vars works like a charm 🙂
  • Priyank Bagrecha

    11/18/2021, 6:53 PM
    Ha, that explains why I can't get 0.9.0 working via launcher scripts. Thank you. Can we please update the documentation for it?
  • Ali Atıl

    11/18/2021, 9:51 PM
    Hello everyone, is there any way to do a join operation on real-time tables?
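    (As of this era of Pinot there is no general join support; the closest built-in is the lookUp() transform, which joins a fact table against an OFFLINE dimension table marked "isDimTable": true. A hedged sketch with hypothetical table and column names:)
    Copy code
    -- joins each fact row to a dimension-table row by primary key;
    -- 'dimTeams' must be an offline dimension table keyed on 'teamID'
    SELECT playerName,
           lookUp('dimTeams', 'teamName', 'teamID', teamID) AS teamName
    FROM fact_games
    LIMIT 10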
  • Map

    11/19/2021, 10:41 PM
    Any idea how to troubleshoot this error message?
    2021/11/19 22:36:05.496 INFO [CurrentStateComputationStage] [HelixController-pipeline-task-pinot-prod-(aa26cf97_TASK)] Event aa26cf97_TASK : Ignore a pending message ee7f9ef0-1de9-4737-b0b4-db4a4e1b9073 for a non-exist resource table0_REALTIME and partition table0__0__0__20211119T2150Z
  • Mahesh babu

    11/22/2021, 12:15 PM
    Hi Team, I'm trying to set up Pinot in Docker and load a table, but I'm facing issues while loading data into the table: ERROR: java.lang.RuntimeException: Failed to read from Schema URI - 'http://localhost:9000/tables/transcript/schema'. Can you please help me fix this issue? I'm using this YAML file:
    Copy code
    executionFrameworkSpec:
      name: 'standalone'
      segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
      segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
      segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
    jobType: SegmentCreationAndTarPush
    inputDirURI: '/tmp/pinot-quick-start/rawdata/'
    includeFileNamePattern: 'glob:**/*.csv'
    outputDirURI: '/tmp/pinot-quick-start/segments/'
    overwriteOutput: true
    pinotFSSpecs:
      - scheme: file
        className: org.apache.pinot.spi.filesystem.LocalPinotFS
    recordReaderSpec:
      dataFormat: 'csv'
      className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
      configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
    tableSpec:
      tableName: 'transcript'
      schemaURI: 'http://localhost:9000/tables/transcript/schema'
      tableConfigURI: 'http://localhost:9000/tables/transcript'
    pinotClusterSpecs:
      - controllerURI: 'http://localhost:9000'
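    (A hedged guess at the cause: inside the ingestion-job container, localhost is the container itself, not the controller. Assuming the controller container is named pinot-controller on the pinot-demo network, pointing the URIs at it should fix the schema read:)
    Copy code
    # only the URI hosts change; the rest of the job spec stays as-is
    tableSpec:
      tableName: 'transcript'
      schemaURI: 'http://pinot-controller:9000/tables/transcript/schema'
      tableConfigURI: 'http://pinot-controller:9000/tables/transcript'
    pinotClusterSpecs:
      - controllerURI: 'http://pinot-controller:9000'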
  • Mahesh babu

    11/22/2021, 2:58 PM
    Copy code
    docker run --rm -ti \
      --network=pinot-demo \
      -v /tmp/pinot-quick-start:/tmp/pinot-quick-start \
      --name pinot-data-ingestion-job \
      apachepinot/pinot:latest LaunchDataIngestionJob \
      -jobSpecFile /tmp/pinot-quick-start/docker-job-spec.yml
  • Priyank Bagrecha

    11/22/2021, 7:30 PM
    I am running into this stack trace in the logs within a second of adding a real-time table:
    Copy code
    2021/11/20 00:18:41.296 ERROR [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread] Exception while executing a state transition task km_mp_play_startree__103__0__20211120T0018Z
    java.lang.reflect.InvocationTargetException: null
            at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
            at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
            at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
            at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
            at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331) [pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
            at java.lang.Thread.run(Thread.java:829) [?:?]
    Caused by: java.lang.OutOfMemoryError: Direct buffer memory
            at java.nio.Bits.reserveMemory(Bits.java:175) ~[?:?]
            at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118) ~[?:?]
            at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317) ~[?:?]
            at org.apache.pinot.segment.spi.memory.PinotByteBuffer.allocateDirect(PinotByteBuffer.java:38) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.pinot.segment.spi.memory.PinotDataBuffer.allocateDirect(PinotDataBuffer.java:115) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.pinot.segment.local.io.writer.impl.DirectMemoryManager.allocateInternal(DirectMemoryManager.java:53) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.pinot.segment.local.io.readerwriter.RealtimeIndexOffHeapMemoryManager.allocate(RealtimeIndexOffHeapMemoryManager.java:80) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.pinot.segment.local.realtime.impl.forward.FixedByteSVMutableForwardIndex.addBuffer(FixedByteSVMutableForwardIndex.java:208) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.pinot.segment.local.realtime.impl.forward.FixedByteSVMutableForwardIndex.<init>(FixedByteSVMutableForwardIndex.java:77) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.pinot.segment.local.indexsegment.mutable.MutableSegmentImpl.<init>(MutableSegmentImpl.java:308) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.<init>(LLRealtimeSegmentDataManager.java:1364) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.addSegment(RealtimeTableDataManager.java:344) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.pinot.server.starter.helix.HelixInstanceDataManager.addRealtimeSegment(HelixInstanceDataManager.java:162) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:164) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeConsumingFromOffline(SegmentOnlineOfflineStateModelFactory.java:86) ~[pinot-all-0.9.0-jar-with-dependencies.jar:0.9.0-cf8b84e8b0d6ab62374048de586ce7da21132906]
            ... 12 more
    I have tried increasing the heap size (right now at 16G) and I am still running into this issue. I am using 5 servers to consume from a topic with 128 partitions, with an event rate of about 7M events per minute. I see 26 segments in Bad state on 3 servers and 25 on the other 2.
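    (The failing allocation is direct, i.e. off-heap, memory, which the JVM caps separately from the heap, so raising -Xmx alone cannot help. A minimal sketch, assuming the launcher scripts pick up JAVA_OPTS; the sizes are illustrative, not a recommendation:)
    Copy code
    # -XX:MaxDirectMemorySize bounds java.nio direct buffers, which is what
    # consuming segments allocate for off-heap indexes
    export JAVA_OPTS="-Xms8G -Xmx16G -XX:MaxDirectMemorySize=32G"
    bin/pinot-admin.sh StartServer -zkAddress localhost:2181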
  • Yeongju Kang

    11/23/2021, 8:06 AM
    Hi, I have some questions related to indices.
    1. Is the forward index applied by default to all columns of a table that have no index specified?
    2. Is there a way to see query execution plans, including index usage? I tried explain, explain (type distributed), and explain (type io) from Presto but failed to find useful information there.
    3. Some index files don't seem to be purged after table deletion. Should I delete those myself if I have to create a table with the same name? (I haven't tried reproducing the behavior.)
    Thank you for your effort on such a nice piece of software!
  • Yeongju Kang

    11/23/2021, 10:31 AM
    Hi, I have 5 servers (v0.9) in my cluster and one of them turned to a dead state. The server's process never printed a crash log, and the last lines of the server log suggest its state is okay, but the table state in ZooKeeper turned OFFLINE and I cannot see my node under LIVEINSTANCES. I am running my servers on EKS and the pod never restarted. Is there anything I can do before restarting the pod?
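    (One hedged way to confirm what Helix sees, using ZooKeeper's stock CLI; the znode layout below is standard Helix, with the cluster name assumed to be 'pinot':)
    Copy code
    # a live server keeps an ephemeral node here; if it is missing, the
    # server's ZooKeeper session was lost even though the process is up
    bin/zkCli.sh -server zookeeper:2181 ls /pinot/LIVEINSTANCES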
  • Ali Atıl

    11/23/2021, 2:12 PM
    Hello everyone, is it possible to change the H3 index resolution after table creation?
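    (The H3 resolution lives in the table config's fieldConfigList, so it can be edited after creation, but existing segments presumably need a reload (POST /segments/{tableName}/reload on the controller) before the index is rebuilt. A hedged sketch with a hypothetical column name:)
    Copy code
    "fieldConfigList": [
      {
        "name": "location_st_point",
        "encodingType": "RAW",
        "indexType": "H3",
        "properties": {
          "resolutions": "5"
        }
      }
    ]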
  • Deepak Mishra

    11/23/2021, 6:57 PM
    Hello everyone, I am not able to start ZooKeeper using pinot-0.9.0 with the command bin/pinot-admin.sh StartZookeeper. Please help.
  • Mahesh babu

    11/24/2021, 5:17 AM
    Hi Team, I'm trying to load data from MinIO into Pinot but I'm facing issues while running the YAML files. ERROR: expected '<document start>', but found BlockMappingStart in 'string', line 6, column 1: jobType: SegmentCreationAndMetad ...
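    (A hedged reading of that error: the parser reached 'jobType:' at line 6 while still expecting the start of the document, which usually means an earlier line broke the top-level mapping, e.g. a tab character, a mis-indented key, or stray text. All top-level job-spec keys should start at column 1; a sketch with a hypothetical input path:)
    Copy code
    executionFrameworkSpec:
      name: 'standalone'
    jobType: SegmentCreationAndMetadataPush
    inputDirURI: 's3://my-bucket/rawdata/'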
  • Ayush Kumar Jha

    11/24/2021, 6:20 PM
    Hi everyone, recently I tried to upgrade Pinot from version 0.7.1. I am ingesting files stored in Azure Blob Storage, but I am getting this error
    Copy code
    java.lang.IllegalStateException: Unable to extract out the relative path for input file file path "file path"
    in 0.8.0 and 0.9.0, while it works fine in 0.7.1.
  • Vibhor Jain

    11/25/2021, 6:18 AM
    Hi Team, we are facing an issue where one of our column values contains a single quote, and the query throws a CalciteSqlToPinotQuery exception. For example, one of the values is L'hôp Test. Since it contains a single quote, how do we handle this and query the data for this value?
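    (Pinot parses SQL with Calcite, so the standard SQL escape, doubling the quote inside the literal, should get this value through; the table and column names below are hypothetical:)
    Copy code
    -- 'L''hôp Test' is the literal for the value L'hôp Test
    SELECT * FROM myTable WHERE myCol = 'L''hôp Test'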
  • Mahesh babu

    11/25/2021, 11:33 AM
    Hi Team, I'm not able to run the Pinot controller and server in Docker with a config file. I have to read data from S3, so I need to run the server and controller with config files. I'm trying to run the controller with this command:
    Copy code
    sudo docker run --rm -ti \
      --network=pinot-demo \
      --name pinot-controller \
      -p 9010:9010 \
      -e JAVA_OPTS="-Dplugins.dir=/opt/pinot/plugins -Xms1G -Xmx4G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xloggc:gc-pinot-controller.log" \
      -d apachepinot/pinot:latest StartController \
      -configFileName "/home/mahesh/Documents/pinot/s3-pinot/controller.conf" \
      -zkAddress pinot-zookeeper:2181
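    (A hedged guess at the failure: the -configFileName path is on the host, so the container cannot see it. Mounting the directory and pointing Pinot at the in-container path should help; '/config' is an arbitrary mount point:)
    Copy code
    sudo docker run --rm -ti \
      --network=pinot-demo \
      --name pinot-controller \
      -p 9010:9010 \
      -v /home/mahesh/Documents/pinot/s3-pinot:/config \
      -d apachepinot/pinot:latest StartController \
      -configFileName /config/controller.conf \
      -zkAddress pinot-zookeeper:2181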
  • Ali Atıl

    11/25/2021, 2:39 PM
    Hi, with the upsert functionality enabled, would it update the records in the segments of the offline table for hybrid tables?
  • Prashanth Rao

    11/26/2021, 10:58 AM
    Hi Team, we are using Apache Pinot for running OLAP queries, and right now one of the tables has been stuck in a consumer rebalance state (while pointing to a Kafka topic) for the last 12 hours. I tried restarting the Pinot servers, which didn't help. Can someone please suggest any steps here? These messages came in repeatedly:
    Copy code
    [Consumer clientId=consumer-2, groupId=event_template_mapping_REALTIME_1627646764788_0] Group coordinator <*> (id: 795314267 rack: null) is unavailable or invalid, will attempt rediscovery
    [Consumer clientId=consumer-2, groupId=event_template_mapping_REALTIME_1627646764788_0] Discovered group coordinator <*> (id: 795314267 rack: null)
    [Consumer clientId=consumer-2, groupId=event_template_mapping_REALTIME_1627646764788_0] (Re-)joining group
    And finally, after 6-7 hours, we saw this message, which basically didn't assign any partitions:
    Copy code
    [Consumer clientId=consumer-4, groupId=event_template_mapping_REALTIME_1627646764788_0] Successfully joined group with generation 6
    [Consumer clientId=consumer-4, groupId=event_template_mapping_REALTIME_1627646764788_0] Setting newly assigned partitions []
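    (The group-coordinator and rebalance lines come from Kafka's consumer-group machinery, which Pinot only exercises for the deprecated high-level consumer; the partition-level consumer tracks offsets itself and is not subject to group rebalances. If the table's streamConfigs say 'highlevel', switching is the usual advice; a hedged sketch with a hypothetical topic and broker:)
    Copy code
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.consumer.type": "lowlevel",
      "stream.kafka.topic.name": "event_template_mapping",
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.broker.list": "kafka-broker:9092"
    }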
  • Map

    11/29/2021, 5:23 PM
    We run Pinot 0.8.0. When ingesting a table in FULL upsert mode, we notice the number of rows returned for the same query varies across time, but it is supposed to remain consistent. For example, there are 1000 unique values keyed on column A, which we use as the primary key for the Pinot table table1. A query like select count(1) from table1 can return values such as 1567 or 789, in addition to 1000. In the case of 2000, you can find duplicated rows with different timestamps such as
    Copy code
    | A | currenttime |
    | - | ------------ |
    | a | 1:00:00 |
    | a | 1:00:01 |
    | b | 1:00:00 |
    | b | 1:00:03 |
    ...
    In the case of 789, many rows are simply missing… We suspect this is related to the process of updating the index for the upserted table. Has anyone seen this before?
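    (One hedged thing to rule out: upsert correctness requires all segments of a Kafka partition to be served from the same replica, which is why upsert tables must route with strictReplicaGroup; without it, counts can vary from query to query exactly like this:)
    Copy code
    "routing": {
      "instanceSelectorType": "strictReplicaGroup"
    }
    (And, assuming this Pinot version supports the option, skipUpsert returns the raw non-deduplicated rows for comparison:)
    Copy code
    SELECT COUNT(*) FROM table1 OPTION(skipUpsert=true)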
  • Anusha

    11/30/2021, 3:01 AM
    Hello Team, I see that the new version 0.9.0 has been released. I am trying to enable authentication but have been unable to. Could someone please guide me? Is there any documentation available for that? Thanks in advance.
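    (0.9.0 ships HTTP basic auth. A minimal controller-config sketch, with property names as documented at the time and placeholder credentials:)
    Copy code
    controller.admin.access.control.factory.class=org.apache.pinot.controller.api.access.BasicAuthAccessControlFactory
    controller.admin.access.control.principals=admin,user
    controller.admin.access.control.principals.admin.password=verysecret
    controller.admin.access.control.principals.user.password=secret
    controller.admin.access.control.principals.user.permissions=READ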
  • yelim yu

    11/30/2021, 5:08 AM
    Hello team, my team wants to make a table with 3 timestamp columns and a few other string columns. While constructing this table schema, when we added two of the timestamp columns (unix time in millis) in the dimension spec, the topic couldn't consume events. Could you please give us the reason why?
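    (A hedged guess: epoch-millis columns usually belong in dateTimeFieldSpecs with an explicit format rather than in dimensionFieldSpecs, where a TIMESTAMP value that doesn't match the expected representation can fail decoding. A sketch with a hypothetical column name:)
    Copy code
    "dateTimeFieldSpecs": [
      {
        "name": "eventTimeMs",
        "dataType": "TIMESTAMP",
        "format": "1:MILLISECONDS:EPOCH",
        "granularity": "1:MILLISECONDS"
      }
    ]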
  • eywek

    11/30/2021, 9:24 AM
    Hello, I was wondering whether it is planned to add the LIKE operator to JSON_MATCH? I'm currently using
    Copy code
    REGEXP_LIKE(JSONEXTRACTSCALAR("labels", '$.demande_intention', 'STRING'), 'terminal')
    but it's very slow, even with a small number of scanned documents (21). Maybe having it directly in JSON_MATCH could speed up this operation?
    Copy code
    JSON_MATCH("labels", 'demande_intention LIKE ''terminal''')
    Thank you
  • Anish Nair

    11/30/2021, 3:20 PM
    Hi Team, I was trying out the "comparisonColumn" setting of upsertConfig, and it seems the table config is not accepting it: after updating the table config, the config is still the same as below. "upsertConfig": { "mode": "FULL" }. I also tried pushing a record with an older transaction date-time into the real-time table, and the table got updated with that new record, which shouldn't have happened. Can someone please help?
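    (A hedged note: "comparisonColumn" is only honored from the release that introduced it; on an older build the controller silently drops unknown upsert fields, which would match both symptoms. Where supported, the shape is as below, with a hypothetical column that must be one of the table's time columns:)
    Copy code
    "upsertConfig": {
      "mode": "FULL",
      "comparisonColumn": "txn_time"
    }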
  • Mahesh babu

    12/01/2021, 7:08 AM
    Hi Team, I'm trying to connect MinIO through Apache Pinot. When I run the YAML files, the job fails with this error:
    Copy code
    ERROR [LaunchDataIngestionJobCommand] [main] Got exception to kick off standalone data ingestion job -
    java.lang.RuntimeException: software.amazon.awssdk.core.exception.SdkClientException: Configured region (localhost%3A9010) resulted in an invalid URI: https://s3.localhost%3A9010.amazonaws.com Valid region examples:
    I started the controller and server with controller.conf and server.conf, and my YAML file is:
    Copy code
    executionFrameworkSpec:
      name: 'standalone'
      segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
      segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
      segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
    jobType: SegmentCreationAndUriPush
    inputDirURI: 'http://localhost:39391/buckets/test/'
    includeFileNamePattern: 'glob:**/*.csv'
    outputDirURI: 'http://localhost:39391/buckets/pinot-output/'
    overwriteOutput: true
    pinotFSSpecs:
      - scheme: http
        className: org.apache.pinot.plugin.filesystem.S3PinotFS
        configs:
          region: 'localhost:9010'
    recordReaderSpec:
      dataFormat: 'csv'
      className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
      configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
    tableSpec:
      tableName: 'transcript'
      schemaURI: 'http://localhost:9000/tables/transcript/schema'
      tableConfigURI: 'http://localhost:9000/tables/transcript/'
    pinotClusterSpecs:
      - controllerURI: 'http://localhost:9000'
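    (A hedged reading of the error: S3PinotFS treats 'region' as an AWS region name and builds https://s3.<region>.amazonaws.com from it, so 'localhost:9010' cannot work. For an S3-compatible store like MinIO, the region should be any real AWS region and the custom endpoint given separately; the 'endpoint' key and the s3:// URIs below are assumptions to verify against the S3 plugin docs:)
    Copy code
    pinotFSSpecs:
      - scheme: s3
        className: org.apache.pinot.plugin.filesystem.S3PinotFS
        configs:
          region: 'us-east-1'
          endpoint: 'http://localhost:39391'
          accessKey: 'minio-access-key'
          secretKey: 'minio-secret-key'
    # input/output URIs should then use the s3 scheme, e.g. s3://test/rawdata/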
  • Syed Akram

    12/01/2021, 11:26 AM
    Hi, is there any way to set a query timeout parameter from the JDBC/Java client, if a query takes more than 10 seconds?
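    (In this era of Pinot a per-query timeout can ride along with the SQL itself, which works from any client that just sends query text; a hedged sketch with a hypothetical table:)
    Copy code
    SELECT COUNT(*) FROM myTable OPTION(timeoutMs=10000)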
  • Map

    12/01/2021, 10:25 PM
    When running the RealtimeProvisioningHelper, we got a bunch of NAs. Any idea how to troubleshoot this?
    Copy code
    RealtimeProvisioningHelper -tableConfigFile <tableConfig> -numPartitions 1 -pushFrequency null -numHosts 1,2,3,4 -numHours 1,2,3,4,56,12,18,24 -sampleCompletedSegmentDir <path-to-segment> -ingestionRate 1000 -maxUsableHostMemory 5120G -retentionHours 1
    Note:
    
    * Table retention and push frequency ignored for determining retentionHours since it is specified in command
    * See <https://docs.pinot.apache.org/operators/operating-pinot/tuning/realtime>
    Memory used per host (Active/Mapped)
    
    numHosts --> 1 |2 |3 |4 |
    numHours
    1 --------> 8.1G/295.67G |4.05G/147.83G |4.05G/147.83G |4.05G/147.83G |
    2 --------> NA |NA |NA |NA |
    3 --------> NA |NA |NA |NA |
    4 --------> NA |NA |NA |NA |
    12 --------> NA |NA |NA |NA |
    18 --------> NA |NA |NA |NA |
    24 --------> NA |NA |NA |NA |
    56 --------> NA |NA |NA |NA |
    
    Optimal segment size
    
    numHosts --> 1 |2 |3 |4 |
    numHours
    1 --------> 1.51G |1.51G |1.51G |1.51G |
    2 --------> NA |NA |NA |NA |
    3 --------> NA |NA |NA |NA |
    4 --------> NA |NA |NA |NA |
    12 --------> NA |NA |NA |NA |
    18 --------> NA |NA |NA |NA |
    24 --------> NA |NA |NA |NA |
    56 --------> NA |NA |NA |NA |
    
    Consuming memory
    
    numHosts --> 1 |2 |3 |4 |
    numHours
    1 --------> 8.1G |4.05G |4.05G |4.05G |
    2 --------> NA |NA |NA |NA |
    3 --------> NA |NA |NA |NA |
    4 --------> NA |NA |NA |NA |
    12 --------> NA |NA |NA |NA |
    18 --------> NA |NA |NA |NA |
    24 --------> NA |NA |NA |NA |
    56 --------> NA |NA |NA |NA |
    
    Total number of segments queried per host (for all partitions)
    numHosts --> 1 |2 |3 |4 |
    numHours
    1 --------> 2 |1 |1 |1 |
    2 --------> NA |NA |NA |NA |
    3 --------> NA |NA |NA |NA |
    4 --------> NA |NA |NA |NA |
    12 --------> NA |NA |NA |NA |
    18 --------> NA |NA |NA |NA |
    24 --------> NA |NA |NA |NA |
    56 --------> NA |NA |NA |NA |
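    (A hedged guess at the NAs: the only populated row is the one where numHours does not exceed the -retentionHours 1 passed on the command line, so windows longer than the stated retention may simply be infeasible for the tool. Rerunning with a retention that covers the largest window would confirm:)
    Copy code
    RealtimeProvisioningHelper -tableConfigFile <tableConfig> -numPartitions 1 -pushFrequency null -numHosts 1,2,3,4 -numHours 1,2,3,4,56,12,18,24 -sampleCompletedSegmentDir <path-to-segment> -ingestionRate 1000 -maxUsableHostMemory 5120G -retentionHours 56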
  • Map

    12/02/2021, 3:20 AM
    When querying Pinot via Trino (362), the avg() function doesn't seem to work correctly. It always returns no data…
  • Syed Akram

    12/02/2021, 7:01 AM
    Is it possible to create the segment file name with a date in the filename, instead of time in millis (long)? For example, testtable_OFFLINE_1637625600000_1637712000000_1469.tar.gz → testtable_OFFLINE_2021-11-01_2021-11-05_1.tar.gz
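    (The batch ingestion job spec accepts a segment-name generator; the 'normalizedDate' type produces date-formatted segment names for APPEND tables. A minimal sketch:)
    Copy code
    segmentNameGeneratorSpec:
      type: normalizedDate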
  • Yeongju Kang

    12/02/2021, 7:35 AM
    Hello folks, I am struggling with a hybrid table and cannot make it work. My configs are below. Only the offline table data is displayed; the streaming data is not blended in. I can find logs of Kafka events being consumed, but no broker, controller, or server errors. My hybrid-table test is running on minikube, with Pinot 0.9.0. The process I followed was creating the offline table, then the realtime one:
    • bin/pinot-admin.sh AddTable -tableConfigFile hybrid_realtime.json -schemaFile hybrid_schema.json -exec
    • bin/pinot-admin.sh AddTable -tableConfigFile hybrid_offline.json -schemaFile hybrid_schema.json -exec
    1. hybrid_schema.json
    Copy code
    {
      "schemaName": "transcript",
      "dimensionFieldSpecs": [
        {
          "name": "studentID",
          "dataType": "INT"
        },
        {
          "name": "firstName",
          "dataType": "STRING"
        },
        {
          "name": "lastName",
          "dataType": "STRING"
        },
        {
          "name": "gender",
          "dataType": "STRING"
        },
        {
          "name": "subject",
          "dataType": "STRING"
        },
        {
          "name": "doNotFailPlease",
          "dataType": "STRING",
          "defaultNullValue": ""
        },
        {
          "name": "ts2",
          "dataType": "TIMESTAMP"
        }
      ],
      "metricFieldSpecs": [
        {
          "name": "score",
          "dataType": "FLOAT"
        }
      ],
      "dateTimeFieldSpecs": [
        {
          "name": "ts",
          "dataType": "TIMESTAMP",
          "format": "1:SECONDS:EPOCH",
          "granularity": "1:SECONDS"
        }
      ],
      "primaryKeyColumns": [
        "studentID"
      ]
    }
    2. hybrid_offline.json
    Copy code
    {
        "tableName": "transcript_hybrid",
        "tableType": "OFFLINE",
        "segmentsConfig": {
            "replication": 1,
            "timeColumnName": "ts",
            "timeType": "SECONDS"
        },
        "tenants": {
            "broker": "DefaultTenant",
            "server": "DefaultTenant"
        },
        "tableIndexConfig": {
            "loadMode": "MMAP"
        },
        "metadata": {}
    3. hybrid_realtime.json
    Copy code
    {
      "tableName": "transcript_hybrid",
      "tableType": "REALTIME",
      "segmentsConfig": {
        "timeColumnName": "ts",
        "timeType": "SECONDS",
        "schemaName": "transcript",
        "replicasPerPartition": "1"
      },
      "tenants": {},
      "tableIndexConfig": {
        "loadMode": "MMAP",
        "streamConfigs": {
          "streamType": "kafka",
          "stream.kafka.consumer.type": "lowlevel",
          "stream.kafka.topic.name": "transcript",
          "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
          "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
          "stream.kafka.broker.list": "kafka.local-pinot.svc.cluster.local:9092",
          "realtime.segment.flush.threshold.time": "6h"
        }
      },
      "metadata": {
        "customConfigs": {}
      },
      "routing": {
        "instanceSelectorType": "strictReplicaGroup"
      },
      "upsertConfig": {
        "mode": "FULL"
      }
    }
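    (Two hedged observations on these configs: the schema is named transcript while the tables are transcript_hybrid, and only the REALTIME side declares "schemaName", so adding it to the OFFLINE segmentsConfig as below is worth ruling out; also, upsert is documented as a realtime-only feature, so the "upsertConfig" on one half of a hybrid table may itself be the problem:)
    Copy code
    "segmentsConfig": {
        "replication": 1,
        "timeColumnName": "ts",
        "timeType": "SECONDS",
        "schemaName": "transcript"
    }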
  • Deepak Mishra

    12/02/2021, 9:46 AM
    Hello everyone , I am working backfill data using spark batch ingestion , can we handle duplicate data while backfill , so that it won’t get duplicated in OFFLINE table
  • Elon

    12/02/2021, 4:41 PM
    Question about replica groups and pools for a realtime table: If we set the replicas per partition to 1 in the segment config, and in the instance config set numReplicaGroups to 1 but have 3 pools, do the segments in the table end up having 3 replicas? i.e. 1 per pool?