# general
d
Hello again! 😅 We are trying some production environment setups and I'm having trouble identifying the optimal configuration. Can you point me to some resources? I also need to find out how much storage I need to set up for the Controller, but I couldn't see anything related to that in the docs. I tried running with 1G (the default value) and 10G, but it wasn't enough. Segments are uploaded to Controller storage, right? In the thread: my schema, table config and Helm chart config.
Table schema:
{
  "schemaName": "responseCount",
  "primaryKeyColumns": [
    "responseId"
  ],
  "dimensionFieldSpecs": [
    {
      "name": "responseId",
      "dataType": "STRING"
    },
    {
      "name": "formId",
      "dataType": "STRING"
    },
    {
      "name": "channelId",
      "dataType": "STRING"
    },
    {
      "name": "channelPlatform",
      "dataType": "STRING"
    },
    {
      "name": "companyId",
      "dataType": "STRING"
    },
    {
      "name": "submitted",
      "dataType": "BOOLEAN"
    },
    {
      "name": "deleted",
      "dataType": "BOOLEAN"
    }
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "recordTimestamp",
      "dataType": "STRING",
      "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSSZ",
      "granularity": "1:MILLISECONDS"
    },
    {
      "name": "createdAt",
      "dataType": "STRING",
      "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSSZ",
      "granularity": "1:MILLISECONDS"
    },
    {
      "name": "deletedAt",
      "dataType": "STRING",
      "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSSZ",
      "granularity": "1:MILLISECONDS"
    }
  ]
}
Table configs:
{
  "tableName": "responseCount",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "schemaName": "responseCount",
    "timeColumnName": "recordTimestamp",
    "replication": "1",
    "replicasPerPartition": "2"
  },
  "upsertConfig": {
    "mode": "PARTIAL",
    "partialUpsertStrategies": {
      "deleted": "OVERWRITE",
      "formId": "OVERWRITE",
      "recordTimestamp": "OVERWRITE",
      "channelId": "OVERWRITE",
      "channelPlatform": "OVERWRITE",
      "companyId": "OVERWRITE",
      "submitted": "OVERWRITE",
      "createdAt": "OVERWRITE",
      "deletedAt": "OVERWRITE"
    }
  },
  "routing": {
    "instanceSelectorType": "strictReplicaGroup"
  },
  "tableIndexConfig": {
    "loadMode": "MMAP",
    "nullHandlingEnabled": true,
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.topic.name": "[redacted]",
      "stream.kafka.broker.list": "[redacted]",
      "stream.kafka.consumer.type": "lowlevel",
      "stream.kafka.consumer.prop.auto.offset.reset": "smallest",
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
      "realtime.segment.flush.threshold.rows": "0",
      "realtime.segment.flush.threshold.time": "24h",
      "realtime.segment.flush.segment.size": "100M"
    }
  },
  "tenants": {},
  "metadata": {}
}
Helm chart config:
controller:
  jvmOpts: "-Xms1G -Xmx4G -javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent-0.12.0.jar=5556:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml"
  persistence:
    size: 100G
  ingress:
    v1beta1:
      enabled: true
    annotations:
      kubernetes.io/ingress.class: internal
    tls: { }
    path: /
    hosts:
      - pinot-controller.prod.k8s.usabilla.net
  resources:
    limits:
      cpu: 1
      memory: 4G
    requests:
      cpu: 1
      memory: 4G

broker:
  jvmOpts: "-Xms1G -Xmx4G -javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent-0.12.0.jar=5556:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml"
  ingress:
    v1beta1:
      enabled: true
    annotations:
      kubernetes.io/ingress.class: internal
    tls: { }
    path: /
    hosts:
      - pinot-broker.prod.k8s.usabilla.net
  resources:
    limits:
      cpu: 1
      memory: 4G
    requests:
      cpu: 1
      memory: 4G

server:
  replicaCount: 3
  jvmOpts: "-Xms2G -Xmx7G -javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent-0.12.0.jar=5556:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml"
  resources:
    limits:
      cpu: 4
      memory: 16G
    requests:
      cpu: 4
      memory: 16G
d
It will probably depend a lot on the amount of data you have, but I can share what we started with at the company I work for, for ~100M rows where each has ~5KB of data:
• 1 node for Kafka, 8GB RAM, 100GB storage space
• 1 node for ZooKeeper, 8GB RAM, 100GB storage space
• 2 nodes for Pinot Controllers, 8GB RAM, 20GB storage space
• 1 node for Pinot Broker, 16GB RAM, 20GB storage space
• 2 nodes for Pinot Server, 16GB RAM, 1TB storage space
d
Do you have deepstore configured with controllers?
Not having deep store means the controller will use the local FS to store segments. If you have multiple controllers, segments will be stored inconsistently across the different controllers; local FS with multiple controllers only works if you are pointing at something like NFS. The recommended approach is to go with one of the supported deep stores. This offloads all long-term storage outside of the controller, and you shouldn't observe massive disk usage on them.
If you are running a single controller, then local FS is good enough if you can afford the local storage and the non-HA constraint.
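As an example, with S3 as the deep store the controller config ends up looking roughly like this - the bucket name and region are placeholders, and the exact property names can vary slightly between Pinot versions, so double-check against the docs for yours:
# pinot-controller.conf (placeholders: <your-bucket>, <your-region>)
controller.data.dir=s3://<your-bucket>/pinot-segments
controller.local.temp.dir=/tmp/pinot-tmp-data
pinot.controller.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
pinot.controller.storage.factory.s3.region=<your-region>
pinot.controller.segment.fetcher.protocols=file,http,s3
pinot.controller.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher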
d
Agreed, and I learned that the hard way 😄
d
We will have ~800M rows with ~300 bytes of data each. We don't have deep storage, but we have only 1 Controller. Maybe, given the amount of data, we need to set up deep storage.
d
The Controller will have to store everything that is stored by the servers.
It's supposed to be the cold copy of the servers' data in case a server disk is lost.
d
Of course, that makes sense 🤦‍♀️ I need to bump up the servers' storage
d
I would not recommend going into production without deep store configured for the servers.
It's your safety net.
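On the server side the matching piece is the PinotFS / segment fetcher config, so servers can pull committed segments back from the deep store if a local disk is lost. Again just a rough sketch assuming S3, with a placeholder region:
# pinot-server.conf (placeholder: <your-region>)
pinot.server.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
pinot.server.storage.factory.s3.region=<your-region>
pinot.server.segment.fetcher.protocols=file,http,s3
pinot.server.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher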
d
Yes, I will set those up
m
You would also need replication for controller/broker/server in production; in the Helm values that's roughly the sketch below.
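The counts here are only illustrative - pick whatever fits your HA requirements, and note that with multiple controllers you really do need a shared deep store, as discussed above:
controller:
  replicaCount: 2
broker:
  replicaCount: 2
server:
  replicaCount: 3  # already 3 in the config above
On the table side, for a REALTIME table it's replicasPerPartition (already "2" in the table config above) that drives segment replication, as far as I know.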
d
@User where do aggregations run, after the raw data is fetched? Is it on the brokers?
d
I've already set up replication. Will the deep storage be used by a realtime table in the same way it happens for offline ones? We only have a realtime table and we didn't set up a realtime-to-offline job (to be honest, we don't know if we need this)
d
For realtime tables there's always at least one "consuming" segment (which is open for writes) and then the "committed" segments, which are flushed to the deep storage. Kinda "hot" and "cold" segments, in a sense. Once a consuming segment reaches a defined threshold (by table configuration), it gets committed and flushed to the deep storage.
One mistake I made when my company migrated our Pinot cluster to another AWS region was that I didn't stop and commit the consuming segments, and I ended up losing the "hot" data - I still kept the data from the deep storage though.
d
That's good advice indeed 🤔 How am I supposed to stop a table and commit the consuming segments? Would it happen automatically if I disable a table?
m
@User For a query, servers do the aggregation on the data they host, and the broker does a final merge.
@User Are you trying to migrate table from one AWS region to another?
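To make the aggregation point concrete: with the responseCount table above, a query like the one below has each server compute partial counts over the segments it hosts, and the broker then merges those partial results into the final answer (just an illustrative query, not from Diana's workload):
-- each server aggregates its own segments; the broker merges the per-server results
SELECT companyId, COUNT(*) AS responses
FROM responseCount
GROUP BY companyId
LIMIT 10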
d
@User nice, thanks! 🙂
d
No, we are setting up our first prod deployment
x
To my knowledge, disabling a table stops the consumption, but doesn't force the consuming segments to commit. But if the data is still in Kafka, re-enabling the table should resume consumption from the last committed offset (i.e. the end of the last committed segments).
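If you ever do need the consuming segments committed on demand (e.g. before a migration), I believe newer Pinot versions expose a force-commit endpoint on the controller - something along these lines, but please verify the exact path and parameters in the Swagger UI of your version (the host and port below are placeholders):
# placeholder host/port; endpoint availability depends on your Pinot version
curl -X POST "http://pinot-controller:9000/tables/responseCount/forceCommit"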
m
Ok, in the case of a first prod deployment, why do you need to stop consumption @User? In Diogo's case, it seems like there was a migration involved.
d
I think there's some noise going on, probably; I mentioned the migration just because we had to do one at our company, as an example of what happens to segments that are still being consumed or have already been committed. Diana's case doesn't seem to involve any of that at the moment, but I think she asked just out of curiosity, in case she falls into the same situation. Am I correct, @User?
And sorry for mentioning our migration, you can blame this confusion on me 😄
m
All good @User, thanks for your help, as always.
d
np 😊