# troubleshooting
  • d

    Dan Hill

    07/04/2020, 11:26 PM
    Are there guides / advice for how we should iterate on Pinot tables once they are serving traffic?
    - What sort of schema changes are safe to do live?
    - How do teams usually roll out breaking changes? A separate Pinot table? How can we roll this out incrementally? For my case, we're using Presto to join, so we'll probably have to modify our Presto query and do an extra join.
    - Any helpful tools for rolling out changes incrementally? E.g. the system populating Pinot will likely have a canary setup. Is there something that helps verify that the deployed canary events match the non-canary events? Is this something separate from Pinot?
    - How do teams experiment with different Pinot setups to evaluate latencies? A whole separate Pinot stack? How do teams experiment with different indexes?
  • s

    Somanshu Jindal

    07/06/2020, 8:56 AM
    Hi, if I want to use a ZooKeeper cluster for a production setup, can I specify all the ZooKeeper hosts when starting the various Pinot components like the controller, broker, etc.?
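    A minimal sketch, assuming the stock pinot-admin.sh launcher: a ZooKeeper quorum is normally passed as a single comma-separated connect string, so every component can be pointed at all the hosts via -zkAddress (host names here are hypothetical):
    Copy code
    # Each Pinot component takes the same comma-separated ZK quorum string.
    bin/pinot-admin.sh StartController -zkAddress zk1:2181,zk2:2181,zk3:2181 -clusterName PinotCluster
    bin/pinot-admin.sh StartBroker     -zkAddress zk1:2181,zk2:2181,zk3:2181 -clusterName PinotCluster
    bin/pinot-admin.sh StartServer     -zkAddress zk1:2181,zk2:2181,zk3:2181 -clusterName PinotCluster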
  • y

    Yash Agarwal

    07/06/2020, 11:14 AM
    Is it possible to use multiple buckets with S3PinotFS? We have limits on the amount of data we can store in a single bucket.
  • p

    Pradeep

    07/06/2020, 7:52 PM
    QQ: wondering how difficult it would be to include timestampNanos as part of the time column in Pinot? (Is it just a matter of Pinot parsing and understanding that the timestamp is in nanos, or are there more assumptions around it?) I believe currently only up to `millis` is supported. Context: we have system-level events (think stream of syscalls) and want to be able to store the nanos timestamp to fix the order among them; it's also used by other systems in our infrastructure. Currently I am storing nanos as a separate column and created a `millis` column to serve as the time column, thinking I could avoid storing the duplicate info if the feature is simple enough to add.
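    A minimal sketch of the duplicate-column workaround described above, assuming hypothetical names (syscall_events, event_time_nanos, event_time_millis) and the legacy timeFieldSpec schema format:
    Copy code
    {
      "schemaName": "syscall_events",
      "dimensionFieldSpecs": [
        {"name": "event_time_nanos", "dataType": "LONG"}
      ],
      "timeFieldSpec": {
        "incomingGranularitySpec": {
          "name": "event_time_millis",
          "dataType": "LONG",
          "timeType": "MILLISECONDS"
        }
      }
    }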
  • k

    Kishore G

    07/06/2020, 7:59 PM
    IMO, nanos cannot be used as a timestamp
  • k

    Kishore G

    07/06/2020, 7:59 PM
    irrespective of Pinot supporting that datatype
  • k

    Kishore G

    07/06/2020, 8:00 PM
    nanos is mainly used to measure relative times
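    Kishore's point in a minimal Java illustration: System.nanoTime() has an arbitrary, per-JVM origin, so it is only meaningful for measuring elapsed time, unlike the epoch-anchored System.currentTimeMillis():
    Copy code
    public class NanoVsMillis {
      public static void main(String[] args) throws InterruptedException {
        long wallClockMillis = System.currentTimeMillis(); // millis since the Unix epoch; comparable across machines
        long start = System.nanoTime();                    // arbitrary origin; the absolute value is meaningless
        Thread.sleep(10);
        long elapsedNanos = System.nanoTime() - start;     // only differences are meaningful
        System.out.println("epoch millis  = " + wallClockMillis);
        System.out.println("elapsed nanos = " + elapsedNanos);
      }
    }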
  • e

    Elon

    07/06/2020, 11:38 PM
    FYI, we have a table which already exists and I wanted to add a sorted-column index, but I'm getting "400 Bad Request". Nothing in the controller logs. Can you see what's wrong with the following?
  • e

    Elon

    07/06/2020, 11:38 PM
    Copy code
    curl -f -k -X POST --header 'Content-Type: application/json' -d '@realtime.json' ${CONTROLLER}/tables
  • e

    Elon

    07/06/2020, 11:39 PM
    Copy code
    {
      "tableName": "oas_integration_operation_event",
      "tableType": "REALTIME",
      "segmentsConfig": {
        "timeColumnName": "operation_ts",
        "timeType": "SECONDS",
        "retentionTimeUnit": "DAYS",
        "retentionTimeValue": "7",
        "segmentPushType": "APPEND",
        "segmentPushFrequency": "daily",
        "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
        "schemaName": "oas_integration_operation_event",
        "replicasPerPartition": "3"
      },
      "tenants": {
        "broker": "DefaultTenant",
        "server": "DefaultTenant"
      },
      "tableIndexConfig": {
        "loadMode": "MMAP",
        "invertedIndexColumns": [
          "service_slug",
          "operation_type",
          "operation_result",
          "store_id"
        ],
        "sortedColumn": [
          "operation_ts"
        ],
        "noDictionaryColumns": [],
        "aggregateMetrics": "false",
        "streamConfigs": {
          "streamType": "kafka",
          "stream.kafka.consumer.type": "LowLevel",
          "stream.kafka.topic.name": "oas-integration-operation-completion-avro",
          "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.avro.confluent.KafkaConfluentSchemaRegistryAvroMessageDecoder",
          "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
          "stream.kafka.decoder.prop.schema.registry.rest.url": "<http://XXXX:8081>",
          "stream.kafka.zk.broker.url": "XXXX/",
          "stream.kafka.broker.list": "XXXX:9092",
          "realtime.segment.flush.threshold.time": "6h",
          "realtime.segment.flush.threshold.size": "0",
          "realtime.segment.flush.desired.size": "200M",
          "stream.kafka.consumer.prop.auto.isolation.level": "read_committed",
          "stream.kafka.consumer.prop.auto.offset.reset": "smallest",
          "stream.kafka.consumer.prop.group.id": "oas_integration_operation_event-load-pinot-llprb",
          "stream.kafka.consumer.prop.client.id": "XXXX"
        },
        "starTreeIndexConfigs": [
          {
            "dimensionsSplitOrder": [
              "service_slug",
              "store_id",
              "operation_type",
              "operation_result"
            ],
            "functionColumnPairs": [
              "PERCENTILEEST__operation_latency_ms",
              "AVG__operation_latency_ms",
              "DISTINCTCOUNT__store_id",
              "COUNT__store_id",
              "COUNT__operation_type"
            ]
          },
          {
            "dimensionsSplitOrder": [
              "service_slug",
              "store_id"
            ],
            "functionColumnPairs": [
              "COUNT__store_id",
              "COUNT__operation_type"
            ]
          }
        ]
      },
      "metadata": {
        "customConfigs": {}
      }
    }
  • m

    Mayank

    07/06/2020, 11:39 PM
    IIRC, uploading segments to realtime tables was not possible (a while back, but not sure if it continues to be the case).
    👍 1
  • m

    Mayank

    07/06/2020, 11:40 PM
    can you try swagger?
  • e

    Elon

    07/06/2020, 11:41 PM
    Sure
  • e

    Elon

    07/06/2020, 11:42 PM
    Oh, thanks! Looks like I can't change the time type for the time column, i.e. segmentsConfig.timeType
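    For reference, POST on ${CONTROLLER}/tables creates a new table; updating the config of an existing table goes through PUT on the table resource. A sketch using the same variables as above:
    Copy code
    # Update (rather than create) the existing table's config; the timeType itself still cannot be changed.
    curl -f -k -X PUT --header 'Content-Type: application/json' -d '@realtime.json' ${CONTROLLER}/tables/oas_integration_operation_event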
  • p

    Pradeep

    07/08/2020, 10:36 PM
    Hi, I am trying to test the following change (https://github.com/apache/incubator-pinot/pull/5661) on my cluster, so I pulled code from master, but I am seeing the exception below. Wondering if there's any change you know of? I only see this change (https://github.com/apache/incubator-pinot/pull/5608), which says that existing behavior shouldn't change. The exception seems to come from trying to fetch the S3 region from configuration:
    Copy code
    java.lang.IllegalArgumentException: null
            at shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:108) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-2ec7dee1597021742f68f0ae8b279f7560e55894]
            at org.apache.pinot.plugin.filesystem.S3PinotFS.init(S3PinotFS.java:80) ~[pinot-s3-0.5.0-SNAPSHOT-shaded.jar:0.5.0-SNAPSHOT-2ec7dee1597021742f68f0ae8b279f7560e55894]
            at org.apache.pinot.spi.filesystem.PinotFSFactory.register(PinotFSFactory.java:55) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-2ec7dee1597021742f68f0ae8b279f7560e55894]
            at org.apache.pinot.spi.filesystem.PinotFSFactory.init(PinotFSFactory.java:75) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-2ec7dee1597021742f68f0ae8b279f7560e55894]
  • p

    Pradeep

    07/08/2020, 10:38 PM
    I already have this config:
    Copy code
    pinot.server.storage.factory.s3.region=us-east-2
  • p

    Pradeep

    07/08/2020, 10:39 PM
    and this was working fine with earlier version
  • k

    Kishore G

    07/08/2020, 10:39 PM
    @Daniel Lavoie ^^
  • m

    Mayank

    07/08/2020, 10:40 PM
    Yeah, #5608 seems like one PR that is related.
  • d

    Daniel Lavoie

    07/08/2020, 10:41 PM
    Definitely sounds related, I'll investigate tomorrow morning!
  • m

    Mayank

    07/08/2020, 10:46 PM
    My guess is subsetting of config is broken.
  • m

    Mayank

    07/08/2020, 10:48 PM
    Copy code
    PinotConfiguration schemesConfiguration = fsConfig.subset(CLASS);
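    Roughly, the factory carves per-scheme settings out of the server config by prefix; a sketch of the intended subsetting, using the key names from Pradeep's config (factory internals approximated, not the exact Pinot source):
    Copy code
    // Given: pinot.server.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
    //        pinot.server.storage.factory.s3.region=us-east-2
    PinotConfiguration fsConfig = serverConfig.subset("pinot.server.storage.factory");
    // subset("class") should map scheme -> implementation: {s3 -> ...S3PinotFS}
    PinotConfiguration schemesConfiguration = fsConfig.subset("class");
    // subset("s3") should carry that scheme's settings: {region -> us-east-2, ...}.
    // If this subset comes back empty, S3PinotFS.init() sees a null region and
    // Preconditions.checkArgument(...) throws the IllegalArgumentException above.
    PinotConfiguration s3Config = fsConfig.subset("s3");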
  • d

    Daniel Lavoie

    07/08/2020, 10:48 PM
    Yeah, that would explain the config object being null.
  • m

    Mayank

    07/08/2020, 10:48 PM
    Is `class` new?
  • m

    Mayank

    07/08/2020, 10:48 PM
    If so, this is a backward-incompatible change?
  • d

    Daniel Lavoie

    07/08/2020, 10:50 PM
    We have tests around fsConfig subsetting; if that is broken, it's definitely not intended. I'm not home right now.
  • k

    Kishore G

    07/08/2020, 10:50 PM
    @Pradeep can you paste the configuration
  • k

    Kishore G

    07/08/2020, 10:50 PM
    entire file
  • p

    Pradeep

    07/08/2020, 10:51 PM
    Copy code
    pinot.server.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
    pinot.server.storage.factory.s3.accessKey=
    pinot.server.storage.factory.s3.secretKey=
    pinot.server.storage.factory.s3.region=
    pinot.server.segment.fetcher.protocols=file,http,s3
    pinot.server.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
    pinot.server.instance.dataDir=/home/ubuntu/pinot/data
    pinot.server.instance.segmentTarDir=/home/ubuntu/pinot/segments
  • p

    Pradeep

    07/08/2020, 10:51 PM
    This is the server config