João Comini
11/30/2020, 9:08 PM
I have some questions about RealtimeProvisioningHelper, could you help me?
These are my doubts:
• Why do we need a numHours parameter? What's the impact of having a consuming segment for a certain amount of time (pros/cons)?
• And what does Mapped mean in the Memory used per host result? Is it about the segments on disk?
These are the results that I got:
RealtimeProvisioningHelper -tableConfigFile /tmp/transaction-table.json -numPartitions 20 -pushFrequency null -numHosts 4,8,12,16,20 -numHours 24,48,72,96 -sampleCompletedSegmentDir /tmp/out/transaction_1606528528_1606614928_0 -ingestionRate 4 -maxUsableHostMemory 16G -retentionHours 768
Note:
* Table retention and push frequency ignored for determining retentionHours since it is specified in command
* See <https://docs.pinot.apache.org/operators/operating-pinot/tuning/realtime>
Memory used per host (Active/Mapped)
numHosts --> 4            |8            |12           |16           |20           |
numHours
24 --------> 6.8G/71.9G   |3.4G/35.95G  |2.72G/28.76G |2.04G/21.57G |1.36G/14.38G |
48 --------> 7.33G/72.62G |3.66G/36.31G |2.93G/29.05G |2.2G/21.79G  |1.47G/14.52G |
72 --------> 8.01G/73.11G |4.01G/36.55G |3.2G/29.24G  |2.4G/21.93G  |1.6G/14.62G  |
96 --------> 8.39G/74.08G |4.2G/37.04G  |3.36G/29.63G |2.52G/22.22G |1.68G/14.82G |
Optimal segment size
numHosts --> 4      |8      |12     |16     |20     |
numHours
24 --------> 20.02M |20.02M |20.02M |20.02M |20.02M |
48 --------> 40.04M |40.04M |40.04M |40.04M |40.04M |
72 --------> 60.05M |60.05M |60.05M |60.05M |60.05M |
96 --------> 80.07M |80.07M |80.07M |80.07M |80.07M |
Consuming memory
numHosts --> 4       |8       |12      |16      |20      |
numHours
24 --------> 756.05M |378.02M |302.42M |226.81M |151.21M |
48 --------> 1.47G   |750.11M |600.09M |450.07M |300.04M |
72 --------> 2.15G   |1.07G   |878.76M |659.07M |439.38M |
96 --------> 2.92G   |1.46G   |1.17G   |896.57M |597.71M |
Total number of segments queried per host (for all partitions)
numHosts --> 4   |8   |12  |16  |20  |
numHours
24 --------> 320 |160 |128 |96  |64  |
48 --------> 160 |80  |64  |48  |32  |
72 --------> 110 |55  |44  |33  |22  |
96 --------> 80  |40  |32  |24  |16  |
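The numHours question can be made concrete with the last table. What follows is a rough sketch, not taken from Pinot's source: it assumes each partition completes one segment every numHours, keeps it for retentionHours, and that partition replicas are spread evenly across hosts. The replication factor of 2 is an assumption (the table config is not shown in this thread); it is simply the value that makes the numbers line up with the output above. The function name is hypothetical.

```python
import math


def segments_queried_per_host(num_partitions, replication, num_hosts,
                              retention_hours, num_hours):
    """Estimate the segments one host serves (inferred from the output above)."""
    # Partition replicas are spread over the hosts; round up because a host
    # cannot hold a fraction of a partition.
    partitions_per_host = math.ceil(num_partitions * replication / num_hosts)
    # One segment per partition every num_hours, retained for retention_hours.
    segments_per_partition = math.ceil(retention_hours / num_hours)
    return partitions_per_host * segments_per_partition


# Values from the command above: -numPartitions 20 -retentionHours 768.
# replication=2 is an assumption, not something stated in the thread.
for num_hours in (24, 48, 72, 96):
    row = [segments_queried_per_host(20, 2, hosts, 768, num_hours)
           for hosts in (4, 8, 12, 16, 20)]
    print(num_hours, row)
```

Under these assumptions the loop reproduces every cell of the "Total number of segments queried per host" table. Read together with the "Consuming memory" table, it shows the trade-off in the numbers: going from 24 to 96 hours cuts the segments per host on 4 hosts from 320 to 80, while consuming memory on the same hosts grows from 756.05M to 2.92G.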
Neha Pawar
João Comini
11/30/2020, 9:53 PM
Neha Pawar
João Comini
11/30/2020, 9:57 PM
João Comini
11/30/2020, 9:58 PM
Subbu Subramaniam
11/30/2020, 9:59 PM
Neha Pawar
Subbu Subramaniam
11/30/2020, 10:00 PM
Neha Pawar
Subbu Subramaniam
11/30/2020, 10:04 PM
João Comini
11/30/2020, 10:05 PM
Neha Pawar
João Comini
11/30/2020, 10:17 PM
By mapped, you mean that the segments aren't in memory, right? The segments are on the server's disk and mapped into memory. Am I missing something?
Subbu Subramaniam
11/30/2020, 10:28 PM
mmap
João Comini
11/30/2020, 10:30 PM
Neha Pawar
Xiang Fu
Neha Pawar
João Comini
11/30/2020, 11:09 PM
Xiang Fu
João Comini
11/30/2020, 11:21 PM
Xiang Fu
João Comini
12/01/2020, 1:25 PM
Xiang Fu
João Comini
12/09/2020, 1:51 PM