# troubleshooting
  • k

    Kishore G

    08/18/2020, 12:26 AM
    Sorry 😐
  • k

    Kishore G

    08/18/2020, 12:26 AM
    Schema
  • c

    Christian Acuna

    08/18/2020, 12:27 AM
    yup, the schema is used in the realtime table so I know it exists
  • k

    Kishore G

    08/18/2020, 12:27 AM
    Paste the offline table config?
  • c

    Christian Acuna

    08/18/2020, 12:28 AM
    {
      "OFFLINE": {
        "tableName": "DnsForwarderServiceStatus_OFFLINE",
        "tableType": "OFFLINE",
        "segmentsConfig": {
          "timeType": "MILLISECONDS",
          "schemaName": "olap_enriched_dns_forwarder_service_status",
          "timeColumnName": "cloud_timestamp_ms",
          "replication": "1",
          "replicasPerPartition": "1",
          "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
        },
        "tenants": {
          "broker": "DefaultTenant",
          "server": "DefaultTenant"
        },
        "tableIndexConfig": {
          "createInvertedIndexDuringSegmentGeneration": false,
          "noDictionaryColumns": [],
          "enableDefaultStarTree": false,
          "enableDynamicStarTreeCreation": false,
          "aggregateMetrics": true,
          "nullHandlingEnabled": true,
          "loadMode": "MMAP",
          "invertedIndexColumns": [],
          "autoGeneratedInvertedIndex": false
        },
        "metadata": {
          "customConfigs": {}
        }
      },
      "REALTIME": {
        "tableName": "DnsForwarderServiceStatus_REALTIME",
        "tableType": "REALTIME",
        "segmentsConfig": {
          "timeType": "MILLISECONDS",
          "schemaName": "olap_enriched_dns_forwarder_service_status",
          "timeColumnName": "cloud_timestamp_ms",
          "retentionTimeUnit": "DAYS",
          "retentionTimeValue": "30",
          "replication": "1",
          "replicasPerPartition": "1",
          "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
        },
        "tenants": {
          "broker": "DefaultTenant",
          "server": "DefaultTenant"
        },
        "tableIndexConfig": {
          "createInvertedIndexDuringSegmentGeneration": false,
          "noDictionaryColumns": [],
          "enableDefaultStarTree": false,
          "enableDynamicStarTreeCreation": false,
          "aggregateMetrics": true,
          "nullHandlingEnabled": true,
          "streamConfigs": {
            "streamType": "kafka",
            "stream.kafka.consumer.type": "lowlevel",
            "stream.kafka.topic.name": "olap-enriched-dns-forwarder-service-status",
            "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
            "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
            "stream.kafka.broker.list": "KAFKA_URL",
            "realtime.segment.flush.threshold.time": "30m",
            "realtime.segment.flush.threshold.size": "100000",
            "stream.kafka.consumer.prop.auto.offset.reset": "largest",
            "stream.kafka.zk.broker.url": "ZK_URL"
          },
          "loadMode": "MMAP",
          "invertedIndexColumns": [],
          "autoGeneratedInvertedIndex": false
        },
        "metadata": {
          "customConfigs": {}
        }
      }
    }
  • k

    Kishore G

    08/18/2020, 12:30 AM
    Looks right to me
  • c

    Christian Acuna

    08/18/2020, 12:30 AM
    hmm
  • n

    Neha Pawar

    08/18/2020, 12:36 AM
    i think there was some recent change where schema name should match the table name.
  • c

    Christian Acuna

    08/18/2020, 12:37 AM
    image.png
  • n

    Neha Pawar

    08/18/2020, 12:38 AM
    Schema schema = ZKMetadataProvider.getTableSchema(_propertyStore, _offlineTableName);
    Preconditions.checkState(schema != null, "Failed to find schema for table: %s", _offlineTableName);
    this is throwing the exception. it needs schema with name = DnsForwarderServiceStatus
  • c

    Christian Acuna

    08/18/2020, 12:38 AM
    ah! thank you so much!
  • n

    Neha Pawar

    08/18/2020, 12:39 AM
    upload same schema with DnsForwarderServiceStatus name, and also, update the schema name in tableConfigs
    👍 1
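
The fix described above (schema name equal to the table name without its `_OFFLINE` / `_REALTIME` suffix) can be sketched as a small check over the pasted table config. This is only an illustration, not Pinot code: the helper names `raw_table_name` and `align_schema_names` are hypothetical, and the schema itself still has to be uploaded under the new name separately.

```python
import copy

# Table type suffixes as they appear in the table names pasted above.
TYPE_SUFFIXES = ("_OFFLINE", "_REALTIME")


def raw_table_name(table_name: str) -> str:
    """Strip the table type suffix, e.g. 'Foo_OFFLINE' -> 'Foo'."""
    for suffix in TYPE_SUFFIXES:
        if table_name.endswith(suffix):
            return table_name[: -len(suffix)]
    return table_name


def align_schema_names(table_configs: dict) -> dict:
    """Return a copy of the configs with schemaName set to the raw table name."""
    fixed = copy.deepcopy(table_configs)
    for config in fixed.values():
        config["segmentsConfig"]["schemaName"] = raw_table_name(config["tableName"])
    return fixed


# Trimmed-down version of the config from this thread.
example = {
    "OFFLINE": {
        "tableName": "DnsForwarderServiceStatus_OFFLINE",
        "segmentsConfig": {"schemaName": "olap_enriched_dns_forwarder_service_status"},
    },
    "REALTIME": {
        "tableName": "DnsForwarderServiceStatus_REALTIME",
        "segmentsConfig": {"schemaName": "olap_enriched_dns_forwarder_service_status"},
    },
}

fixed = align_schema_names(example)
print(fixed["OFFLINE"]["segmentsConfig"]["schemaName"])
```

The input dict is left untouched, so the original and the aligned config can be compared before anything is pushed to the controller.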
  • c

    Christian Acuna

    08/18/2020, 12:46 AM
    that fixed it!
    👌 1
  • c

    Christian Acuna

    08/18/2020, 12:46 AM
    thanks so much
  • k

    Kishore G

    08/20/2020, 4:23 PM
    @Yash Agarwal yes, @Xiang Fu let's enable allow-multiple-aggregations by default in presto pinot connector
  • x

    Xiang Fu

    08/20/2020, 5:00 PM
    will do
  • b

    Buchi Reddy

    08/20/2020, 10:10 PM
    Hey everyone,
    DISTINCTCOUNT queries on raw data from realtime tables seem to be very slow. Tried the HLL approximation but that didn’t help. If we were to be okay with approximated results, would you recommend Theta Sketches? Is that generally faster than the HLL?
  • m

    Mayank

    08/20/2020, 10:10 PM
    HLL is faster than T/S
  • m

    Mayank

    08/20/2020, 10:10 PM
    T/S is better if you want to do set operations like intersect/union/difference
  • m

    Mayank

    08/20/2020, 10:11 PM
    HLL or T/S helps if you pre-aggregate, which you can't do for RT
  • b

    Buchi Reddy

    08/20/2020, 10:12 PM
    Hmm.. For most of our queries, both DISTINCTCOUNT and HLL are equally slow. Are there any optimizations that we can do to improve the latencies?
  • m

    Mayank

    08/20/2020, 10:15 PM
    A good feature ask: aggregating HLL and T/S derived columns during consumption (cc: @Jackie)
  • j

    Jackie

    08/20/2020, 10:18 PM
    We should support all the aggregations available in ValueAggregator for aggregation during consumption
  • j

    Jackie

    08/20/2020, 10:19 PM
    FYI, those are the aggregations supported by star-tree
  • e

    Elon

    08/21/2020, 4:53 PM
    Is the call to http://<CONTROLLER HOST>:<PORT>/tables/<TABLENAME>/instances deprecated?
  • e

    Elon

    08/21/2020, 7:32 PM
    Hi, we are exploring using pinot for a logging backend and running into an issue where the controller http threads are all blocked on LLCSegmentCompletionHandlers.segmentCommit
  • y

    Yash Agarwal

    08/24/2020, 5:12 PM
    What segment config and naming strategy should I use to support the following data ingest?
    Data size: 3 Years
    Daily Raw Orc Size: 4GB
    Daily Segment Counts: 11
    Data Full Refresh: Weekly
    Data Append: Daily
  • p

    Pradeep

    08/26/2020, 1:09 AM
    QQ, I am using the following ingestionConfig for my offline table, but I don’t see epochMinutes getting populated. Is ingestionConfig only applicable to realtime tables?
    "ingestionConfig": {
      "transformConfigs": [
        {
          "columnName": "epochMinutes",
          "transformFunction": "toEpochMinutes(timestampMillis)"
        }
      ]
    }
  • p

    Pradeep

    08/26/2020, 1:11 AM
    I am using spark executor to generate and publish the segments into pinot
  • x

    Xiang Fu

    08/26/2020, 1:34 AM
    I think it only works for realtime right now @Neha Pawar ^^
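
If the transform is indeed not applied on the offline path, one possible workaround is to compute the column upstream, before segment generation. The sketch below only shows the arithmetic that `toEpochMinutes(timestampMillis)` is expected to perform (milliseconds since epoch divided down to whole minutes); the helper names and the sample record are hypothetical, and wiring this into the Spark job is left out.

```python
MILLIS_PER_MINUTE = 60_000


def to_epoch_minutes(timestamp_millis: int) -> int:
    """Whole minutes since epoch; assumes a non-negative millisecond timestamp."""
    return timestamp_millis // MILLIS_PER_MINUTE


def add_epoch_minutes(record: dict) -> dict:
    """Return a copy of the record with the derived epochMinutes column filled in."""
    enriched = dict(record)
    enriched["epochMinutes"] = to_epoch_minutes(record["timestampMillis"])
    return enriched


# Hypothetical input row; only timestampMillis is taken from the thread.
row = add_epoch_minutes({"timestampMillis": 1598404140000})
print(row)
```

With the column already present in the input data, the offline segments would not depend on the transform being evaluated at ingestion time.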