# ask-for-help
j
Here is the traceback:
```
2023-02-16 14:38:54,050 - bentoml._internal.server.http_app - ERROR - Exception on /logo_image_classifier/predict [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/bentoml/_internal/server/http_app.py", line 336, in api_func
    output = await run_in_threadpool(api.func, input_data)
  File "/usr/local/lib/python3.7/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/usr/local/lib/python3.7/site-packages/anyio/to_thread.py", line 32, in run_sync
    func, *args, cancellable=cancellable, limiter=limiter
  File "/usr/local/lib/python3.7/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.7/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/bentoml/bento/src/service.py", line 86, in logo_image_classifier_predict
    return _invoke_runner(models["logo_image_classifier"], "run", input)
  File "/home/bentoml/bento/src/service.py", line 66, in _invoke_runner
    result = getattr(runner, name).run(*input_npack)
  File "/usr/local/lib/python3.7/site-packages/bentoml/_internal/runner/runner.py", line 52, in run
    return self.runner._runner_handle.run_method(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 266, in run_method
    *args,
  File "/usr/local/lib/python3.7/site-packages/anyio/from_thread.py", line 49, in run
    return asynclib.run_async_from_thread(func, *args)
  File "/usr/local/lib/python3.7/site-packages/anyio/_backends/_asyncio.py", line 970, in run_async_from_thread
    return f.result()
  File "/usr/local/lib/python3.7/concurrent/futures/_base.py", line 435, in result
    return self.__get_result()
  File "/usr/local/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.7/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 186, in async_run_method
    "Yatai-Bento-Deployment-Namespace": component_context.yatai_bento_deployment_namespace,
  File "/usr/local/lib/python3.7/site-packages/aiohttp/client.py", line 1141, in __aenter__
    self._resp = await self._coro
  File "/usr/local/lib/python3.7/site-packages/aiohttp/client.py", line 560, in _request
    await resp.start(conn)
  File "/usr/local/lib/python3.7/site-packages/aiohttp/client_reqrep.py", line 899, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
  File "/usr/local/lib/python3.7/site-packages/aiohttp/streams.py", line 616, in read
    await self._waiter
aiohttp.client_exceptions.ServerDisconnectedError: Server disconnected
```
Btw, it only happens when I limit the container to 4GiB of memory (to simulate what I have in the cloud). If I increase the memory limit, it seems to work fine.
However, the error is confusing, as it does not indicate anything about memory issues…
cc @Benjamin Tan
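For reference, a minimal sketch of how the 4GiB cap can be simulated locally with the Docker SDK for Python; the image tag and port mapping here are assumptions, not taken from the conversation:
```
# Hypothetical repro setup: run the containerized bento with a 4GiB memory cap.
# "logo_image_classifier:latest" and port 3000 are assumed values; adjust to your build.
import docker

client = docker.from_env()
container = client.containers.run(
    "logo_image_classifier:latest",   # assumed bento image tag
    detach=True,
    mem_limit="4g",                   # cap memory to mirror the cloud limit
    ports={"3000/tcp": 3000},         # BentoML's default serving port
)
print(container.logs(tail=20).decode())
```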
s
I think this issue generally happens when the async event loop is heavily blocked, so the runner clients take too long to read data. Maybe in your case that's being caused by GC or swapping?
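To illustrate the failure mode described above, here is a minimal, generic sketch (not BentoML internals): a synchronous, CPU-bound call executed directly inside an async handler blocks the event loop, so every other coroutine waiting on it (such as a client reading a response) stalls and may time out or see a disconnect. Offloading the call to a worker thread keeps the loop responsive.
```
import asyncio
import time

def heavy_inference(x):
    time.sleep(2)  # stand-in for slow model inference, GC pause, or swapping
    return x * 2

async def blocking_handler(x):
    # Runs the heavy call on the event loop itself: everything else stalls for 2s.
    return heavy_inference(x)

async def offloaded_handler(x):
    # Runs the heavy call in a worker thread: the loop stays free to serve others.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, heavy_inference, x)

async def main():
    for handler in (blocking_handler, offloaded_handler):
        start = time.perf_counter()
        await asyncio.gather(*(handler(i) for i in range(4)))
        print(handler.__name__, f"{time.perf_counter() - start:.1f}s")

asyncio.run(main())
# blocking_handler  ~8s (requests serialized; peers can time out / disconnect)
# offloaded_handler ~2s (requests overlap in worker threads)
```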
j
The first part of the explanation sounds reasonable. I'm not sure about the specific causes, but it's now pretty reliably reproducible. Yes, it also depends on the size of the data I'm sending.
This does not seem to be an issue isolated to me; several people have already complained about the same problem in this chat. Is there anything I can do about it?
For example, I tried reducing the number of api-workers; it does not help. I also tried reducing the value of the --backlog option; no change. Every time I send more than 2 concurrent requests, the service fails with ServerDisconnectedError and cannot recover (i.e. all subsequent calls fail).
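A small sketch of the kind of concurrent-request test described above; the port and payload are assumptions (the endpoint path comes from the traceback), so adjust them to your deployment:
```
# Hypothetical load sketch: send 4 concurrent requests to the predict endpoint.
import concurrent.futures
import requests

URL = "http://localhost:3000/logo_image_classifier/predict"  # port 3000 assumed

def call(i: int) -> int:
    with open("sample_logo.png", "rb") as f:  # assumed test image
        resp = requests.post(
            URL,
            data=f.read(),
            headers={"Content-Type": "image/png"},
        )
    return resp.status_code

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    print(list(pool.map(call, range(4))))
```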
s
I'm facing exactly the same issue as above. Did anyone find a fix for this?
j
Nope
OK, I created an issue: https://github.com/bentoml/BentoML/issues/3669. @Sudeep Ghimire @Benjamin Tan, feel free to add your observations there.
👍 1