
Mayank

05/22/2021, 3:19 PM
Do you have an inverted index or on-heap dictionary specified?
In the table config

kauts shukla

05/22/2021, 3:23 PM
@Mayank It looks like more than a GC issue; in the logs it keeps losing the connection and reconnecting to ZooKeeper

Mayank

05/22/2021, 3:23 PM
That is because of the GC pause; it times out sending the heartbeat to ZK

kauts shukla

05/22/2021, 3:24 PM
@Mayank Does it create multiple connections? Could that hit ZooKeeper IOPS with a call for every segment check?

Mayank

05/22/2021, 3:24 PM
No, it should be a single session per server

kauts shukla

05/22/2021, 3:24 PM
@Mayank
"invertedIndexColumns": [
        "userid",
        "sessionid",
        "eventlabel",
        "dp_created_at",
        "timestampist"
      ]
"sortedColumn": [
        "dp_created_at",
        "timestampist"
      ],

Mayank

05/22/2021, 3:24 PM
Any on-heap dictionary?

kauts shukla

05/22/2021, 3:25 PM
"autoGeneratedInvertedIndex": true,
      "createInvertedIndexDuringSegmentGeneration": true,
      "enableDefaultStarTree": true,
      "enableDynamicStarTreeCreation": true,
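For context, these index-related flags normally sit together under tableIndexConfig in the Pinot table config. A minimal consolidated sketch of the settings quoted above (the table name is hypothetical, and exact fields can vary by Pinot version):

```json
{
  "tableName": "events_REALTIME",
  "tableType": "REALTIME",
  "tableIndexConfig": {
    "invertedIndexColumns": ["userid", "sessionid", "eventlabel", "dp_created_at", "timestampist"],
    "sortedColumn": ["dp_created_at", "timestampist"],
    "autoGeneratedInvertedIndex": true,
    "createInvertedIndexDuringSegmentGeneration": true,
    "enableDefaultStarTree": true,
    "enableDynamicStarTreeCreation": true,
    "loadMode": "MMAP"
  }
}
```

Note that none of these flags create an on-heap dictionary; if memory serves, that is controlled by a separate onHeapDictionaryColumns list in tableIndexConfig, which is what the on-heap question here is probing for.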

Mayank

05/22/2021, 3:25 PM
If not, I can’t think of what is occupying the heap. Metadata cannot take 64GB
Oh segment generation takes heap
Are too many segments being generated in parallel?

kauts shukla

05/22/2021, 3:26 PM
50 segments in parallel

Mayank

05/22/2021, 3:27 PM
There you go

kauts shukla

05/22/2021, 3:28 PM
Is this the culprit?
"createInvertedIndexDuringSegmentGeneration": true,

Mayank

05/22/2021, 3:28 PM
No
Periodically, segments consumed in memory are flushed to disk, which goes through some heap usage. If 50 partitions go through that at once, it will run out of heap
How did you specify 50?

kauts shukla

05/22/2021, 3:29 PM
The Kafka topic has 50 partitions
I haven’t specified it; it already existed

Mayank

05/22/2021, 3:30 PM
Not talking about consumption. Periodically the consuming segment needs to be flushed to disk, and this uses some heap. Typically, if all partitions flush to disk at the same time, there will be heap pressure
There is a way to specify the max parallel segment generation

kauts shukla

05/22/2021, 3:31 PM
@Mayank: how do I specify it?

Mayank

05/22/2021, 3:32 PM
I’ll find it. In the meanwhile, can you grep the log for segment generation?

kauts shukla

05/22/2021, 3:34 PM
what do I have to grep?

Mayank

05/22/2021, 3:35 PM
Try something like grep -i "created segment"

kauts shukla

05/22/2021, 3:35 PM
After MessageLatencyMonitor it always throws: ERROR [SegmentBuildTimeLeaseExtender] [pool-5-thread-1] Failed to send lease extension

Mayank

05/22/2021, 3:36 PM
Yeah, then it is likely segment generation
grep -i "Driver, indexing time :"

kauts shukla

05/22/2021, 3:37 PM
no log with this

Mayank

05/22/2021, 3:37 PM
In the server's data dir, do ls -l and see if the segments have timestamps that are near each other
that will tell how many were generated at the same time
There should have been logs
grep -i "Trying to build segment"
or
grep -i "Successfully built segment"
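As a sketch, the greps above can be combined; the log lines below are fabricated for illustration (real Pinot server messages may be worded or formatted differently), and a burst of near-simultaneous matches would confirm many partitions building segments at once:

```shell
# Fabricated sample log; real Pinot server log wording/format may differ.
cat > /tmp/pinot-server.log <<'EOF'
2021/05/22 14:08:01 INFO Trying to build segment events__0__12__20210522T1408Z
2021/05/22 14:08:02 INFO Trying to build segment events__1__12__20210522T1408Z
2021/05/22 14:08:09 INFO Successfully built segment events__0__12__20210522T1408Z
EOF

# Count how many builds started; timestamps close together mean parallel builds.
grep -c "Trying to build segment" /tmp/pinot-server.log   # -> 2

# Cross-check with segment directory mtimes (path is illustrative):
# ls -lt /var/pinot/server/dataDir/events_REALTIME | head
```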

kauts shukla

05/22/2021, 3:41 PM
last segment created at May 22 14:08 UTC

Mayank

05/22/2021, 3:41 PM
How many around that time?

kauts shukla

05/22/2021, 3:42 PM
no luck with grep -i "Successfully built segment"
no logs on either server with this grep
7:38 pm IST

Mayank

05/22/2021, 3:43 PM
Hmm, what is your logging level? These are INFO messages and should be there for sure.
Also config
realtime.max.parallel.segment.builds
to specify how many segment builds run in parallel
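A sketch of where that property would go; it is a server-level setting, so it belongs in the server configuration file passed at startup (the file name and neighboring key are illustrative):

```properties
# pinot-server.conf (illustrative)
pinot.server.instance.dataDir=/var/pinot/server/data

# Cap concurrent realtime segment builds so all 50 partitions
# cannot flush (and eat heap) at the same time.
realtime.max.parallel.segment.builds=4
```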

kauts shukla

05/22/2021, 3:43 PM
logging level is INFO only
realtime.max.parallel.segment.builds? Where should I specify this?
Table config?

Mayank

05/22/2021, 3:44 PM
In server config
Although, I'd think that the default should not be unlimited, so I'm still unsure if this is the root cause
Can you try setting it to a small value like 4?
If your current segment size is 1.8GB, reducing it to 112M would increase the number of segments too much. Maybe 300MB or 500MB
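For reference, the realtime segment size target is set in the table's streamConfigs. A hedged sketch (property names have changed across Pinot versions — older releases used keys like realtime.segment.flush.threshold.size and realtime.segment.flush.desired.size, so check the docs for your version):

```json
"streamConfigs": {
  "realtime.segment.flush.threshold.rows": "0",
  "realtime.segment.flush.threshold.segment.size": "500M"
}
```

Setting the row threshold to 0 lets the size-based threshold drive flushes, matching the 300MB–500MB suggestion above.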