# general
d
Hello again! 😅 We are trying some production environment setups and I'm having trouble identifying the optimal configuration. Can you point me to some resources? I also need to find out how much storage I need to set up for the Controller, but I couldn't see anything related to that in the docs. I tried running with 1G (the default value) and 10G, but it wasn't enough. Segments are uploaded to Controller storage, right? In the thread: my schema, table config and Helm chart config.
Table schema:
{
  "schemaName": "responseCount",
  "primaryKeyColumns": [
    "responseId"
  ],
  "dimensionFieldSpecs": [
    {
      "name": "responseId",
      "dataType": "STRING"
    },
    {
      "name": "formId",
      "dataType": "STRING"
    },
    {
      "name": "channelId",
      "dataType": "STRING"
    },
    {
      "name": "channelPlatform",
      "dataType": "STRING"
    },
    {
      "name": "companyId",
      "dataType": "STRING"
    },
    {
      "name": "submitted",
      "dataType": "BOOLEAN"
    },
    {
      "name": "deleted",
      "dataType": "BOOLEAN"
    }
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "recordTimestamp",
      "dataType": "STRING",
      "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSSZ",
      "granularity": "1:MILLISECONDS"
    },
    {
      "name": "createdAt",
      "dataType": "STRING",
      "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSSZ",
      "granularity": "1:MILLISECONDS"
    },
    {
      "name": "deletedAt",
      "dataType": "STRING",
      "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSSZ",
      "granularity": "1:MILLISECONDS"
    }
  ]
}
Table configs:
{
  "tableName": "responseCount",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "schemaName": "responseCount",
    "timeColumnName": "recordTimestamp",
    "replication": "1",
    "replicasPerPartition": "2"
  },
  "upsertConfig": {
    "mode": "PARTIAL",
    "partialUpsertStrategies": {
      "deleted": "OVERWRITE",
      "formId": "OVERWRITE",
      "recordTimestamp": "OVERWRITE",
      "channelId": "OVERWRITE",
      "channelPlatform": "OVERWRITE",
      "companyId": "OVERWRITE",
      "submitted": "OVERWRITE",
      "createdAt": "OVERWRITE",
      "deletedAt": "OVERWRITE"
    }
  },
  "routing": {
    "instanceSelectorType": "strictReplicaGroup"
  },
  "tableIndexConfig": {
    "loadMode": "MMAP",
    "nullHandlingEnabled": true,
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.topic.name": "[redacted]",
      "stream.kafka.broker.list": "[redacted]",
      "stream.kafka.consumer.type": "lowlevel",
      "stream.kafka.consumer.prop.auto.offset.reset": "smallest",
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
      "realtime.segment.flush.threshold.rows": "0",
      "realtime.segment.flush.threshold.time": "24h",
      "realtime.segment.flush.segment.size": "100M"
    }
  },
  "tenants": {},
  "metadata": {}
}
Helm chart config:
controller:
  jvmOpts: "-Xms1G -Xmx4G -javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent-0.12.0.jar=5556:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml"
  persistence:
    size: 100G
  ingress:
    v1beta1:
      enabled: true
    annotations:
      kubernetes.io/ingress.class: internal
    tls: { }
    path: /
    hosts:
      - pinot-controller.prod.k8s.usabilla.net
  resources:
    limits:
      cpu: 1
      memory: 4G
    requests:
      cpu: 1
      memory: 4G

broker:
  jvmOpts: "-Xms1G -Xmx4G -javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent-0.12.0.jar=5556:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml"
  ingress:
    v1beta1:
      enabled: true
    annotations:
      kubernetes.io/ingress.class: internal
    tls: { }
    path: /
    hosts:
      - pinot-broker.prod.k8s.usabilla.net
  resources:
    limits:
      cpu: 1
      memory: 4G
    requests:
      cpu: 1
      memory: 4G

server:
  replicaCount: 3
  jvmOpts: "-Xms2G -Xmx7G -javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent-0.12.0.jar=5556:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml"
  resources:
    limits:
      cpu: 4
      memory: 16G
    requests:
      cpu: 4
      memory: 16G
d
It will probably depend a lot on the amount of data you have, but I can share what we started with at the company I work for, for ~100M rows where each has ~5KB of data:
• 1 node for Kafka, 8GB RAM, 100GB storage space
• 1 node for ZooKeeper, 8GB RAM, 100GB storage space
• 2 nodes for Pinot Controllers, 8GB RAM, 20GB storage space
• 1 node for Pinot Broker, 16GB RAM, 20GB storage space
• 2 nodes for Pinot Server, 16GB RAM, 1TB storage space
d
Do you have deepstore configured with controllers?
Not having deep store means the controller will use the local FS to store segments. If you have multiple controllers, segments will be stored inconsistently across the different controllers; local FS with multiple controllers only works if you are pointing at something like NFS. The recommended approach is to go with one of the supported deep stores. This offloads all long-term storage outside of the controller, and you shouldn't observe massive disk usage on them.
If you are running a single controller, then local FS is good enough if you can afford the local storage and the non-HA constraint.
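As an example, with S3 as the deep store the controller config ends up looking roughly like this - the bucket name and region are placeholders, and the exact property names can vary slightly between Pinot versions, so double-check against the docs for yours:
# pinot-controller.conf (placeholders: <your-bucket>, <your-region>)
controller.data.dir=s3://<your-bucket>/pinot-segments
controller.local.temp.dir=/tmp/pinot-tmp-data
pinot.controller.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
pinot.controller.storage.factory.s3.region=<your-region>
pinot.controller.segment.fetcher.protocols=file,http,s3
pinot.controller.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher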
d
Agreed, and I learned that the hard way 😄
d
We will have ~800M rows with ~300 bytes of data each. We don't have deep storage, but we have only 1 Controller. Maybe, given the amount of data, we need to set up deep storage.
d
The Controller will have to store everything that is stored by the servers.
It's supposed to be the cold copy of the servers' data in case a server disk is lost.
d
Of course, that makes sense 🤦‍♀️ I need to bump up the servers' storage
d
I would not recommend going into production without deep store configured for the servers.
It's your safety net.
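On the server side the matching piece is the PinotFS / segment fetcher config, so servers can pull committed segments back from the deep store if a local disk is lost. Again just a rough sketch assuming S3, with a placeholder region:
# pinot-server.conf (placeholder: <your-region>)
pinot.server.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
pinot.server.storage.factory.s3.region=<your-region>
pinot.server.segment.fetcher.protocols=file,http,s3
pinot.server.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher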
d
Yes, I will set those up
m
You would also need replication for controller/broker/server in production; in the Helm values that's roughly the sketch below.
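The counts here are only illustrative - pick whatever fits your HA requirements, and note that with multiple controllers you really do need a shared deep store, as discussed above:
controller:
  replicaCount: 2
broker:
  replicaCount: 2
server:
  replicaCount: 3  # already 3 in the config above
On the table side, for a REALTIME table it's replicasPerPartition (already "2" in the table config above) that drives segment replication, as far as I know.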
d
@User where do aggregations run, after the raw data is fetched? Is it on the brokers?
d
I've already set up replication. Will the deep storage be used by a realtime table in the same way it happens for offline ones? We only have a realtime table and we didn't set up a realtime-to-offline job (to be honest, we don't know if we need this)
d
For realtime tables there's always at least one "consuming" segment (which is open for writes) and then the "committed" segments, which are flushed to the deep storage. Kinda "hot" and "cold" segments, in a sense. Once a consuming segment reaches a defined threshold (by table configuration), it gets committed and flushed to the deep storage.
One mistake I made when my company migrated our Pinot cluster to another AWS region was that I didn't stop and commit the consuming segments, and I ended up losing the "hot" data - I still kept the data from the deep storage though.
d
That's good advice indeed 🤔 How am I supposed to stop a table and commit the consuming segments? Would it happen automatically if I disable a table?
m
@User For a query, servers do the aggregation on the data they host, and the broker does a final merge.
@User Are you trying to migrate table from one AWS region to another?
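To make the aggregation point concrete: with the responseCount table above, a query like the one below has each server compute partial counts over the segments it hosts, and the broker then merges those partial results into the final answer (just an illustrative query, not from Diana's workload):
-- each server aggregates its own segments; the broker merges the per-server results
SELECT companyId, COUNT(*) AS responses
FROM responseCount
GROUP BY companyId
LIMIT 10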
d
@User nice, thanks! 🙂
d
No, we are setting up our first prod deployment
x
To my knowledge, disabling a table stops the consumption, but doesn't force the consuming segments to commit. But if the data is still in Kafka, re-enabling the table should resume consumption from the last committed offset (i.e. the end of the last committed segments).
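If you ever do need the consuming segments committed on demand (e.g. before a migration), I believe newer Pinot versions expose a force-commit endpoint on the controller - something along these lines, but please verify the exact path and parameters in the Swagger UI of your version (the host and port below are placeholders):
# placeholder host/port; endpoint availability depends on your Pinot version
curl -X POST "http://pinot-controller:9000/tables/responseCount/forceCommit"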
m
Ok, in the case of a first prod deployment, why do you need to stop consumption @User? In Diogo's case, it seems like there was a migration involved.
d
I think there's some noise going on, probably; I mentioned the migration just because we had to do one at our company, as an example of what happens to segments that are still being consumed or have already been committed. Diana's case doesn't seem to involve any of that at the moment, but I think she asked just out of curiosity, in case she falls into the same situation. Am I correct, @User?
And sorry for mentioning our migration, you can blame this confusion on me 😄
m
All good @User, thanks for your help, as always.
d
np 😊