# dev
c
if (httpResponseStatus.getCode() == 400) {
  // don't bother retrying if it's a bad request
  throw new IAE(exceptionMessage.toString());
} else {
  throw new IOE(exceptionMessage.toString());
}
When using HTTP discovery, the status code is 400 when the request fails with "no route to host" because a pod was removed.
Since a pod can be lost spontaneously at any time, this is a normal condition. I'm considering handling it as an exception I can propagate that retains the information about the failure. Two questions: 1. Is this the correct way to handle this? 2. Would it make sense to have another exception, in addition to NoTaskLocationException and TaskNotRunnableException, that represents this situation? Or does TaskNotRunnableException already cover it? I'm not clear on the exact intent of TaskNotRunnableException.
Oh, it looks like NoRouteToHost bubbles up as an IOException that I can trap in my task client. Perhaps this will work as-is.
Confirmed. I can handle the NoRouteToHost in my supervisor.
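For reference, a minimal sketch of what such a check might look like, assuming the failure reaches the caller with a java.net.NoRouteToHostException somewhere in its cause chain. The class and method names are hypothetical and are not Druid APIs; the exceptions built in main are stand-ins for what the HTTP client would throw.

```java
import java.io.IOException;
import java.net.NoRouteToHostException;

// Hypothetical helper, not part of Druid.
final class NoRouteToHostCheck {

    /** Returns true if t, or any exception in its cause chain, is a NoRouteToHostException. */
    static boolean isNoRouteToHost(Throwable t) {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++) {
            if (t instanceof NoRouteToHostException) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for a client exception wrapping the low-level connect failure.
        Throwable wrapped = new IOException(
            "Faulty channel in resource pool",
            new NoRouteToHostException("Host is unreachable"));

        System.out.println(isNoRouteToHost(wrapped));                              // true
        System.out.println(isNoRouteToHost(new IOException("connection reset"))); // false
    }
}
```

A supervisor could use a check like this to treat the lost pod as an expected condition while still letting other I/O failures go through the normal retry path.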
g
makes sense that it can be detected this way: it's not really an HTTP protocol-level error — it's something lower level
a
curious - which component here is producing 400 for "No route to host" since that isn't technically a 400.
c
Here is the full exception from the logs
{
  "level": "ERROR",
  "@timestamp": "2022-10-04T16:18:56.598Z",
  "thread": "HttpServerInventoryView-3",
  "message": "failed to get sync response from [http://10.4.154.126:8091/_1664898749884]. Return code [0], Reason: [null]",
  "exception": {
    "exception_class": "io.netty.channel.ChannelException",
    "exception_message": "Faulty channel in resource pool",
    "stacktrace": "io.netty.channel.ChannelException: Faulty channel in resource pool\n\tat org.apache.druid.java.util.http.client.NettyHttpClient.go(NettyHttpClient.java:125)\n\tat org.apache.druid.server.coordination.ChangeRequestHttpSyncer.sync(ChangeRequestHttpSyncer.java:218)\n\tat java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)\n\tat java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)\n\tat java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:829)\nCaused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: Host is unreachable: /10.4.154.126:8091\nCaused by: java.net.NoRouteToHostException: Host is unreachable\n\tat java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)\n\tat java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)\n\tat io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)\n\tat io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:710)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)\n\tat io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:995)\n\tat io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:829)\n"
  },
  "hostName": "storage--druid-coordinator-589db54d45-x4ndp"
}
ChangeRequestHttpSyncer is the Druid class making the call.