Slackbot
01/11/2023, 2:07 AM
Cory Johannsen
01/11/2023, 2:13 AM
java.lang.NullPointerException: null
at org.apache.druid.client.HttpServerInventoryView$2.toDruidServer(HttpServerInventoryView.java:181)
at org.apache.druid.client.HttpServerInventoryView$2.lambda$nodesRemoved$1(HttpServerInventoryView.java:163)
comes from
private DruidServer toDruidServer(DiscoveryDruidNode node)
{
  return new DruidServer(
      node.getDruidNode().getHostAndPortToUse(),
      node.getDruidNode().getHostAndPort(),
      node.getDruidNode().getHostAndTlsPort(),
      ((DataNodeService) node.getServices().get(DataNodeService.DISCOVERY_SERVICE_KEY)).getMaxSize(),
      ((DataNodeService) node.getServices().get(DataNodeService.DISCOVERY_SERVICE_KEY)).getServerType(),
      ((DataNodeService) node.getServices().get(DataNodeService.DISCOVERY_SERVICE_KEY)).getTier(),
      ((DataNodeService) node.getServices().get(DataNodeService.DISCOVERY_SERVICE_KEY)).getPriority()
  );
}
Specifically, ((DataNodeService) node.getServices().get(DataNodeService.DISCOVERY_SERVICE_KEY)) returns null for the PEON, as its services map is empty.
Cory Johannsen
01/11/2023, 2:15 AM
PEON event is causing the problem.
Cory Johannsen
01/11/2023, 10:08 PM
I have determined that the underlying issue with the k8s extension is that the PEON and INDEXER services that are announced by an indexer pod when it starts up are both on the same host/port. This results in a collision in the DruidNodeDiscoveryProvider serviceDiscoveryMap, which uses the host and port string as a key. I patched the k8s extension to prevent the notification of the downstream listeners if a pod deletion arrives for a Druid node with no services. That prevents the issue, but makes me wonder if this PEON should even be getting announced. It has no services declared, so it can't be assigned any work. If this PEON announcement is the issue, then the real fix is to prevent it.
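To make the collision concrete, here is a minimal sketch. It does not use Druid's classes: `Announced`, `keyByHostAndPort`, and `keyByRoleAndHostAndPort` are hypothetical stand-ins, and the host and port are the ones that show up in the announcement logs in this thread. It only illustrates what happens when two announcements share a host:port key, and how a key that also includes the node role would keep them apart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: plain stand-ins, not Druid's DiscoveryDruidNode or its provider.
final class DiscoveryKeyCollisionSketch {

    // Hypothetical stand-in for one announcement: role, host:port, service names.
    static final class Announced {
        final String nodeRole;
        final String hostAndPort;
        final List<String> serviceNames;

        Announced(String nodeRole, String hostAndPort, List<String> serviceNames) {
            this.nodeRole = nodeRole;
            this.hostAndPort = hostAndPort;
            this.serviceNames = serviceNames;
        }
    }

    // The two announcements made by a single indexer pod at startup.
    static List<Announced> sample() {
        List<Announced> nodes = new ArrayList<>();
        nodes.add(new Announced("INDEXER", "10.0.96.19:8091",
                Arrays.asList("dataNodeService", "workerNodeService", "lookupNodeService")));
        nodes.add(new Announced("PEON", "10.0.96.19:8091", Collections.<String>emptyList()));
        return nodes;
    }

    // Keyed by host:port only: the later announcement replaces the earlier one.
    static Map<String, Announced> keyByHostAndPort(List<Announced> nodes) {
        Map<String, Announced> byKey = new LinkedHashMap<>();
        for (Announced n : nodes) {
            byKey.put(n.hostAndPort, n);
        }
        return byKey;
    }

    // Keyed by role plus host:port: both announcements are retained.
    static Map<String, Announced> keyByRoleAndHostAndPort(List<Announced> nodes) {
        Map<String, Announced> byKey = new LinkedHashMap<>();
        for (Announced n : nodes) {
            byKey.put(n.nodeRole + "@" + n.hostAndPort, n);
        }
        return byKey;
    }

    public static void main(String[] args) {
        List<Announced> nodes = sample();
        System.out.println("host:port keys      -> " + keyByHostAndPort(nodes).keySet());
        System.out.println("role@host:port keys -> " + keyByRoleAndHostAndPort(nodes).keySet());
    }
}
```

With the host:port key, whichever role is announced last is the only one left in the map, so a later removal event for that key may be applied to the wrong entry. Whether a role-qualified key is appropriate for the real extension is an open question here, not a claim about how Druid does it.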
Is it expected that the indexer announces a peon with no services at startup? If yes, what is the purpose of this peon? If no, then I can determine why the k8s extension is making this announcement.
Cory Johannsen
01/11/2023, 10:13 PM
2023-01-11T22:02:02.433753502Z [K8sDruidNodeDiscoveryProvider-ListenerExecutor] Node[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/indexer', host='10.0.183.191', bindOnHost=false, port=-1, plaintextPort=8091, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeRole='PEON', services={}}] discovered but doesn't have service[lookupNodeService]. Ignored.
2023-01-11T22:02:02.434313196Z [K8sDruidNodeDiscoveryProvider-ListenerExecutor] Node[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/indexer', host='10.4.151.172', bindOnHost=false, port=-1, plaintextPort=8091, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeRole='PEON', services={}}] discovered but doesn't have service[lookupNodeService]. Ignored.
These log lines indicate that the peon that has been announced is essentially invalid. This leads me to believe that either the peon announcement should not occur or that it is missing information.
Cory Johannsen
01/11/2023, 10:24 PM
[main] Announcing DiscoveryDruidNode[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/indexer', host='10.0.96.19', bindOnHost=false, port=-1, plaintextPort=8091, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeRole='INDEXER', services={dataNodeService=DataNodeService{tier='_default_tier', maxSize=0, serverType=indexer-executor, priority=0}, workerNodeService=WorkerNodeService{ip='storage--druid-indexer-577957494-zhrhs', capacity=128, version='0', category='_default_worker_category'}, lookupNodeService=LookupNodeService{lookupTier='__default'}}}]
[main] Announced DiscoveryDruidNode[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/indexer', host='10.0.96.19', bindOnHost=false, port=-1, plaintextPort=8091, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeRole='INDEXER', services={dataNodeService=DataNodeService{tier='_default_tier', maxSize=0, serverType=indexer-executor, priority=0}, workerNodeService=WorkerNodeService{ip='storage--druid-indexer-577957494-zhrhs', capacity=128, version='0', category='_default_worker_category'}, lookupNodeService=LookupNodeService{lookupTier='__default'}}}]
[main] Announcing DiscoveryDruidNode[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/indexer', host='10.0.96.19', bindOnHost=false, port=-1, plaintextPort=8091, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeRole='PEON', services={}}]
[main] Announced DiscoveryDruidNode[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/indexer', host='10.0.96.19', bindOnHost=false, port=-1, plaintextPort=8091, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeRole='PEON', services={}}]
Above are the announcements inside the indexer. Note that the PEON announcements have services={}.
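Since the PEON announcement carries services={}, any lookup of the data node service on it yields null, which is what the unguarded cast in toDruidServer trips over. A minimal sketch of a guarded conversion follows. It is not Druid's code and not the patch mentioned earlier: `DataNode`, `DATA_NODE_KEY`, and `describeServer` are hypothetical stand-ins, with the key name taken from the services map printed in the logs. The idea is simply to look the service up once and skip nodes that do not have it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch only: stand-in types, not Druid's DataNodeService or DruidServer.
final class NullSafeServerSketch {

    // Key name as it appears in the announced services map in the logs.
    static final String DATA_NODE_KEY = "dataNodeService";

    // Hypothetical stand-in for the data node service payload.
    static final class DataNode {
        final String tier;
        final long maxSize;

        DataNode(String tier, long maxSize) {
            this.tier = tier;
            this.maxSize = maxSize;
        }
    }

    // Returns a description of the server, or empty when the node announced
    // no data node service (as the PEON announcement with services={} does).
    static Optional<String> describeServer(String hostAndPort, Map<String, Object> services) {
        Object service = services.get(DATA_NODE_KEY); // single lookup instead of four
        if (!(service instanceof DataNode)) {
            return Optional.empty(); // covers both a missing entry and a wrong type
        }
        DataNode dataNode = (DataNode) service;
        return Optional.of(hostAndPort + " tier=" + dataNode.tier + " maxSize=" + dataNode.maxSize);
    }

    public static void main(String[] args) {
        Map<String, Object> indexerServices = new HashMap<>();
        indexerServices.put(DATA_NODE_KEY, new DataNode("_default_tier", 0L));

        // INDEXER-like announcement: the service is present.
        System.out.println(describeServer("10.0.96.19:8091", indexerServices));
        // PEON-like announcement: empty services map, so the node is skipped.
        System.out.println(describeServer("10.0.96.19:8091", Collections.<String, Object>emptyMap()));
    }
}
```

A caller handling a removal event could then ignore nodes for which the result is empty, rather than dereferencing a null service. That only hides the symptom, though; it does not answer whether the PEON announcement should exist in the first place.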
Sergio Ferragut
01/11/2023, 10:30 PM
druid.indexer.runner.startPort and druid.indexer.runner.endPort. I'm not sure what the Peon announcement is for.
On a side note, have you looked at the new MM-less K8s extension `druid-kubernetes-overlord-extensions` to avoid the need for MMs completely? It's new and experimental, but sounds like it would fit your deployment well.
Cory Johannsen
01/11/2023, 10:40 PM
8100-8103 but the announced peon still uses 8091, which is why I'm really suspicious of it.
Cory Johannsen
01/11/2023, 10:41 PM
Cory Johannsen
01/12/2023, 7:32 PM
Abhishek Agarwal
01/16/2023, 9:12 AM
Abhishek Agarwal
01/16/2023, 9:44 AM
"I have determined that the underlying issue with the k8s extension is that the PEON and INDEXER services that are announced by an indexer pod when it starts up are both on the same host/port."
Why does this happen, though? Two services can't start on the same host and port.
Cory Johannsen
01/17/2023, 5:07 PM
Cory Johannsen
01/17/2023, 6:21 PM
[main] Announcing DiscoveryDruidNode[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/indexer', host='10.0.206.35', bindOnHost=false, port=-1, plaintextPort=8091, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeRole='INDEXER', services={dataNodeService=DataNodeService{tier='_default_tier', maxSize=0, serverType=indexer-executor, priority=0}, workerNodeService=WorkerNodeService{ip='storage--druid-indexer-76c796f796-q4mlf', capacity=128, version='0', category='_default_worker_category'}, lookupNodeService=LookupNodeService{lookupTier='__default'}}}]
[main] Announced DiscoveryDruidNode[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/indexer', host='10.0.206.35', bindOnHost=false, port=-1, plaintextPort=8091, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeRole='INDEXER', services={dataNodeService=DataNodeService{tier='_default_tier', maxSize=0, serverType=indexer-executor, priority=0}, workerNodeService=WorkerNodeService{ip='storage--druid-indexer-76c796f796-q4mlf', capacity=128, version='0', category='_default_worker_category'}, lookupNodeService=LookupNodeService{lookupTier='__default'}}}]
[main] Announcing DiscoveryDruidNode[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/indexer', host='10.0.206.35', bindOnHost=false, port=-1, plaintextPort=8091, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeRole='PEON', services={}}]
[main] Announced DiscoveryDruidNode[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/indexer', host='10.0.206.35', bindOnHost=false, port=-1, plaintextPort=8091, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeRole='PEON', services={}}]
Note that I have the peon port explicitly set to 8100
druid_indexer_runner_startPort: "8100"
druid_indexer_runner_endPort: "8100"
Cory Johannsen
01/17/2023, 7:50 PM
Abhishek Agarwal
01/19/2023, 6:54 AM
Abhishek Agarwal
01/19/2023, 10:42 AM
peon. I tried with curator node discovery, but that shouldn't matter. What kind of indexing task are you running?
Cory Johannsen
01/19/2023, 5:03 PM
Cory Johannsen
01/19/2023, 5:04 PM
Abhishek Agarwal
03/15/2023, 11:23 AM
Abhishek Agarwal
03/15/2023, 11:24 AM