Apache Druid #troubleshooting

Slackbot

06/11/2023, 7:56 PM

This message was deleted.

Amatya Avadhanula

06/12/2023, 1:53 AM

Could you please check the historical logs corresponding to the stuck load queue? You could try restarting the coordinator to see if it helps clear the bad segment from the load queue. (Also, if you are using curator load queues, please switch to http load queues before restarting the coordinator.)

victor regalado

06/12/2023, 5:07 AM

Unannouncing segment[<segment>] at path[..]
org.apache.druid.server.coordination.SegmentLoadDropHandler - Completely removing <segment> in [30,000] millis
org.apache.druid.server.SegmentManager - Attempting to close segment <segment>

victor regalado

06/12/2023, 5:07 AM

only these 3 log lines

Amatya Avadhanula

06/12/2023, 5:09 AM

Could you please share the replication factor of your segments and also run the following sys table query from the Druid console?

select num_replicas, count(*) from sys.segments group by 1
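
If the console is not handy, the same query can be sent to Druid's SQL API (POST /druid/v2/sql). The following is only a minimal sketch: the helper name build_sql_request and the base URL passed to it are illustrative assumptions, not something from this thread, and the request is built without being sent.

```python
import json
import urllib.request

# The query from the message above, unchanged.
REPLICA_QUERY = "select num_replicas, count(*) from sys.segments group by 1"


def build_sql_request(base_url, query=REPLICA_QUERY):
    """Build (but do not send) a POST request for Druid's SQL endpoint.

    base_url is an assumption: point it at your own Router or Broker.
    """
    # "array" asks Druid to return each row as a JSON array.
    body = json.dumps({"query": query, "resultFormat": "array"}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/druid/v2/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen and decoding the response with json.load should yield one [num_replicas, count] row per replication level, assuming the cluster does not require authentication.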

victor regalado

06/12/2023, 5:09 AM

Replication factor is 2

Amatya Avadhanula

06/12/2023, 5:10 AM

The above query should give the count of segments with various replication levels. It's possible that the cluster is overreplicated and segments are being dropped

victor regalado

06/12/2023, 5:11 AM

1, 503
2, 103229

Amatya Avadhanula

06/12/2023, 5:11 AM

Oh, overreplication is not the problem here then
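
The reasoning behind that conclusion can be sketched in a few lines of Python: compare each num_replicas bucket with the configured replication factor and total the segments below, at, and above it. The function name summarize_replication is hypothetical; the figures are the ones reported in this thread.

```python
def summarize_replication(rows, target_replicas):
    """Split segment counts into under-, at-target, and over-replicated buckets.

    rows: (num_replicas, segment_count) pairs from the sys.segments query.
    target_replicas: the replication factor configured for the segments.
    """
    summary = {"under": 0, "at_target": 0, "over": 0}
    for num_replicas, segment_count in rows:
        if num_replicas < target_replicas:
            summary["under"] += segment_count
        elif num_replicas == target_replicas:
            summary["at_target"] += segment_count
        else:
            summary["over"] += segment_count
    return summary


# Figures reported in the thread, with a replication factor of 2.
rows = [(1, 503), (2, 103229)]
print(summarize_replication(rows, target_replicas=2))
# prints {'under': 503, 'at_target': 103229, 'over': 0}
```

No segment has more than 2 replicas, so there is nothing over-replicated for the coordinator to drop; 503 segments are below the target.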

Amatya Avadhanula

06/12/2023, 5:14 AM

Regarding "only these 3 log lines": are these the only 3 log lines in total? Earlier I interpreted it as these 3 log lines being repeated continuously on the historical.

victor regalado

06/12/2023, 5:17 AM

I mean they are repeated per segment

victor regalado

06/12/2023, 4:06 PM

Do you have any insight on org.apache.druid.server.coordinator.ReplicationThrottler errors? Why do they occur?

victor regalado

06/12/2023, 10:36 PM

Replicant create queue stuck after 3 runs

I increased it from 10 to 100 to see if it can reduce the # of errors.
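
The thread does not say which setting was raised from 10 to 100. Assuming it was the coordinator dynamic configuration field replicationThrottleLimit (a guess, not confirmed above), a change like this could be submitted to the coordinator's dynamic config endpoint, /druid/coordinator/v1/config. The sketch below only builds the request; the helper name build_throttle_update and the coordinator URL passed to it are illustrative.

```python
import json
import urllib.request

CONFIG_PATH = "/druid/coordinator/v1/config"


def build_throttle_update(coordinator_url, current_config, new_limit):
    """Build (but do not send) a POST that changes only replicationThrottleLimit.

    current_config should be the JSON object returned by a GET on the same
    path, so every other dynamic setting is sent back unchanged.
    Assumption: replicationThrottleLimit is the setting meant in the thread.
    """
    updated = dict(current_config)  # copy so the caller's dict is not mutated
    updated["replicationThrottleLimit"] = new_limit
    return urllib.request.Request(
        url=coordinator_url.rstrip("/") + CONFIG_PATH,
        data=json.dumps(updated).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Starting from the current config rather than posting a single field is a cautious choice, so that the remaining settings are not left to whatever the coordinator would apply for omitted fields.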
