# troubleshooting
t
This alert means that the Coordinator has been wanting to add a replica of a segment for at least `replicantLifetime` runs (by default: 15 runs) and has not been able to. By default, Coordinator runs occur once a minute, so that's 15 minutes. However, this can be overridden by setting `druid.coordinator.period` to something else. If an environment does have a more aggressive setting for `druid.coordinator.period`, then I would suggest either restoring the default setting or increasing `replicantLifetime` to make the alerting less sensitive. Search on this page for `replicantLifetime` and `druid.coordinator.period` for more details: https://druid.apache.org/docs/latest/configuration/index.html Also, check whether there are any issues with the data nodes: are they often going down?
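To make the arithmetic above concrete, here is a minimal Python sketch of how long the Coordinator keeps trying before the alert fires. The function name and parameters are hypothetical and not part of Druid; it assumes the alert window is simply the number of runs multiplied by the run period, as described above.

```python
def alert_window_seconds(replicant_lifetime_runs: int, coordinator_period_seconds: int) -> int:
    """Approximate time the Coordinator waits on a replica before alerting.

    Hypothetical helper: assumes window = replicantLifetime * druid.coordinator.period.
    """
    if replicant_lifetime_runs <= 0 or coordinator_period_seconds <= 0:
        raise ValueError("both values must be positive")
    return replicant_lifetime_runs * coordinator_period_seconds


# Defaults described above: 15 runs, one run per minute.
default_window = alert_window_seconds(15, 60)
print(default_window / 60, "minutes")

# A lower replicantLifetime shortens the window, so the alert fires sooner.
short_window = alert_window_seconds(3, 60)
print(short_window / 60, "minutes")
```

Under these assumptions, lowering `replicantLifetime` makes the alert more sensitive, and raising it makes the alert less sensitive.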
v
The data nodes are not going down. We are using the default for `druid.coordinator.period`, which is 1 minute. I decreased `replicantLifetime` to 3 to see if this would decrease the number of errors. We are also experiencing this issue: https://apachedruidworkspace.slack.com/archives/C0309C9L90D/p1686513388513769 This is the distribution of errors.
I just noticed that my Coordinator heap is much smaller. I'm going to increase its heap to see if it helps.
You can set the Coordinator heap to the same size as your Broker heap, or slightly smaller: both services have to process cluster-wide state and answer API requests about this state.
Well, I don't think that worked 🙂