Getting this very frequently for all the tables.
# troubleshooting
p
Getting this very frequently for all the tables.
All the tables are in a bad state.
Running v0.11.0.
I tried to reload a segment manually. I got a job ID, and the status of the job was this:
{
  "metadata": {
    "jobId": "f1e3cbcd-de42-4470-8a9f-ad546e3697b8",
    "messageCount": "2",
    "submissionTimeMs": "1664366425214",
    "jobType": "RELOAD_SEGMENT",
    "segmentName": "packages__0__0__20220908T0929Z",
    "tableName": "packages_REALTIME"
  },
  "estimatedTimeRemainingInMinutes": -1,
  "timeElapsedInMinutes": 1.2516833333333333,
  "totalServersQueried": 2,
  "totalServerCallsFailed": 0,
  "successCount": 0,
  "totalSegmentCount": 2
}
I wonder why estimatedTimeRemainingInMinutes is -1.
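A rough sketch of how the job status above could be read programmatically. The field names come from the response itself (metadata trimmed to the fields used); `summarize_reload_job` and the one-minute stall threshold are hypothetical, and treating a negative `estimatedTimeRemainingInMinutes` as "no estimate available" is an assumption, not something confirmed in this thread.

```python
import json

# Job status copied (with metadata trimmed) from the response above.
STATUS_JSON = """
{
  "metadata": {
    "jobId": "f1e3cbcd-de42-4470-8a9f-ad546e3697b8",
    "jobType": "RELOAD_SEGMENT",
    "segmentName": "packages__0__0__20220908T0929Z",
    "tableName": "packages_REALTIME"
  },
  "estimatedTimeRemainingInMinutes": -1,
  "timeElapsedInMinutes": 1.2516833333333333,
  "totalServersQueried": 2,
  "totalServerCallsFailed": 0,
  "successCount": 0,
  "totalSegmentCount": 2
}
"""


def summarize_reload_job(status, stall_after_minutes=1.0):
    """Condense a reload job status into a few progress indicators."""
    total = status["totalSegmentCount"]
    done = status["successCount"]
    elapsed = status["timeElapsedInMinutes"]
    return {
        "job_id": status["metadata"]["jobId"],
        "fraction_done": done / total if total else 0.0,
        # Assumption: a negative value means no estimate is available.
        "has_estimate": status["estimatedTimeRemainingInMinutes"] >= 0,
        # Hypothetical heuristic: nothing has succeeded after the threshold.
        "looks_stalled": done == 0 and elapsed >= stall_after_minutes,
    }


summary = summarize_reload_job(json.loads(STATUS_JSON))
print(summary)
```

For the status above, no segment has reported success after more than a minute, so the sketch flags the job as possibly stalled. That only narrows down where to look; it does not explain the cause.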
m
Does the debug table API show any issues? If there are no errors in the server log, the one you pointed to is not the cause.
p
[
	{
		"tableName": "packages_REALTIME",
		"numSegments": 79,
		"numServers": 2,
		"numBrokers": 2,
		"segmentDebugInfos": [],
		"serverDebugInfos": [],
		"brokerDebugInfos": [],
		"tableSize": {
			"reportedSize": "5 MB",
			"estimatedSize": "5 MB"
		},
		"ingestionStatus": {
			"ingestionState": "UNHEALTHY",
			"errorMessage": "Did not get any response from servers for segment: packages__0__9__20220927T1248Z"
		}
	}
]
The debug tables API returned this. I wonder why the segments have stopped consuming data from Kafka.
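Since every table is reportedly affected, it may help to scan the debug output for all tables at once. This is a minimal sketch that assumes each entry has the same shape as the response above; `unhealthy_tables` is a hypothetical helper, and it only matches the `UNHEALTHY` state seen here.

```python
import json

# Debug output copied (with unused fields trimmed) from the response above.
DEBUG_JSON = """
[
  {
    "tableName": "packages_REALTIME",
    "numSegments": 79,
    "numServers": 2,
    "numBrokers": 2,
    "ingestionStatus": {
      "ingestionState": "UNHEALTHY",
      "errorMessage": "Did not get any response from servers for segment: packages__0__9__20220927T1248Z"
    }
  }
]
"""


def unhealthy_tables(debug_infos):
    """Return (tableName, errorMessage) for entries whose ingestion is UNHEALTHY."""
    results = []
    for info in debug_infos:
        # Entries without an ingestionStatus block are skipped.
        status = info.get("ingestionStatus") or {}
        if status.get("ingestionState") == "UNHEALTHY":
            results.append((info["tableName"], status.get("errorMessage", "")))
    return results


for table, message in unhealthy_tables(json.loads(DEBUG_JSON)):
    print(f"{table}: {message}")
```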
n
@Piyush Chauhan: For this log, is there a more detailed stack trace for the exception? @Kartik Khare
p
@Kartik Khare @Navina These are the detailed logs
k
2022-09-30T14:06:36+05:30	org.apache.helix.HelixException: HelixManager is not connected within retry timeout for cluster pinot-shipment-dev

2022-09-30T14:06:36+05:30	Exception while logging status update

2022-09-30T14:06:36+05:30	zkClient to shipment-dev-zookeeper:2181 is not connected, wait for 10000ms.

2022-09-30T14:06:36+05:30	zkClient is not connected after waiting 10000ms., clusterName: pinot-shipment-dev, zkAddress: shipment-dev-zookeeper:2181

2022-09-30T14:06:36+05:30	zkClient is not connected after waiting 10000ms., clusterName: pinot-shipment-dev, zkAddress: shipment-dev-zookeeper:2181
Seems like your ZooKeeper is down.
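One quick way to confirm this is ZooKeeper's `ruok` four-letter command, which a running server answers with `imok`. The sketch below is a minimal probe; `zk_ruok` is a hypothetical helper name, and the host and port in the usage note are taken from the logs above. Note that in ZooKeeper 3.5.3 and later, four-letter commands must be enabled through `4lw.commands.whitelist`, so a `False` result can mean either that the server is unreachable or that `ruok` is not whitelisted.

```python
import socket


def zk_ruok(host, port=2181, timeout=5.0):
    """Return True if the server replies 'imok' to the 'ruok' command."""
    reply = b""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            # The server closes the connection after replying,
            # so read until recv() returns an empty chunk.
            while True:
                chunk = sock.recv(16)
                if not chunk:
                    break
                reply += chunk
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False
    return reply == b"imok"
```

Calling `zk_ruok("shipment-dev-zookeeper", 2181)` from a host that can resolve that name would show whether the server responds at all.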
p
Seems like ZooKeeper memory usage spiked. How much memory should ideally be allotted to ZooKeeper? @Kartik Khare