
Dajana Muho

05/24/2023, 11:21 PM
Hi, we are trying to deploy a project that grabs a dataset from BigQuery and runs predictions using our framework built on BentoML. The deployment is done with bentoctl, using Terraform as the provisioner to deploy to Google Cloud Run. Short requests work fine, but longer ones hit a timeout at 300s. We have already increased the timeout on the Google Cloud Run service and added the BENTOML__APISERVER__DEFAULT_TIMEOUT environment variable, but we still hit the same blocker. I'm attaching a snippet of the YAML spec for our Google Cloud Run deployment. If someone has run into the same error, please let me know the steps you followed to solve it. I want to mention that it works fine from my local machine, where the request finishes successfully, but it fails in the cloud.
spec:
  containerConcurrency: 80
  timeoutSeconds: 1800
  serviceAccountName: xx
  containers:
  - name: x-anomaly-detection-1
    image: xxx
    ports:
    - name: http1
      containerPort: 3000
    env:
    - name: BENTOML_PORT
      value: '3000'
    - name: BENTOML__APISERVER__DEFAULT_TIMEOUT
      value: '1800'
    resources:
      limits:
        cpu: 4000m
        memory: 16Gi
    startupProbe:
      timeoutSeconds: 240
      periodSeconds: 240
      failureThreshold: 1
      tcpSocket:
        port: 3000

Chaoyu

05/26/2023, 2:19 AM
Hi @Dajana Muho - sorry about the confusion, the BENTOML__APISERVER__DEFAULT_TIMEOUT env var is a legacy option only available in BentoML < 1.0. For 1.0 and above, you can use a config file to configure the timeout options.
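For reference, a minimal config file along these lines might look like the following; this is a sketch assuming BentoML 1.x config keys, so please double-check against the docs for your exact version:

```yaml
# bentoml_configuration.yaml -- sketch, assuming BentoML 1.x config keys
api_server:
  timeout: 1800   # API server request timeout, in seconds
runners:
  timeout: 1800   # timeout for internal runner calls, in seconds
```

You would then point BentoML at it via the BENTOML_CONFIG env var, set to the file's path inside the container.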
You can also set config options directly via an env var instead of a config file, e.g.:
BENTOML_CONFIG_OPTIONS='api_server.timeout=300'

Dajana Muho

05/26/2023, 4:55 PM
Hey @Chaoyu, thank you very much for taking the time to check this issue. We are still facing the same problem even after applying the change you suggested. It crashes in the deployment environment with this error:
textPayload: "Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/bentoml/_internal/server/http_app.py", line 336, in api_func
    output = await run_in_threadpool(api.func, input_data)
  File "/usr/local/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/bentoml/bento/src/src/services/bento_service.py", line 90, in forecast_target_stats
    return time_series_runner.predict.run('target_stats', input)
  File "/usr/local/lib/python3.10/site-packages/bentoml/_internal/runner/runner.py", line 52, in run
    return self.runner._runner_handle.run_method(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 291, in run_method
    anyio.from_thread.run(
  File "/usr/local/lib/python3.10/site-packages/anyio/from_thread.py", line 49, in run
    return asynclib.run_async_from_thread(func, *args)
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 970, in run_async_from_thread
    return f.result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 220, in async_run_method
    async with self._client.post(
  File "/usr/local/lib/python3.10/site-packages/aiohttp/client.py", line 1141, in __aenter__
    self._resp = await self._coro
  File "/usr/local/lib/python3.10/site-packages/aiohttp/client.py", line 560, in _request
    await resp.start(conn)
  File "/usr/local/lib/python3.10/site-packages/aiohttp/client_reqrep.py", line 894, in start
    with self._timer:
  File "/usr/local/lib/python3.10/site-packages/aiohttp/helpers.py", line 721, in __exit__
    raise asyncio.TimeoutError from None"
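The asyncio.TimeoutError at the bottom is aiohttp's client-side request timer firing on the internal call from the API server to the runner. A minimal, BentoML-free sketch of how that exception type arises when an awaited call outlives its timeout (using only the standard library, not the actual aiohttp machinery):

```python
import asyncio

# Illustration only (not BentoML/aiohttp code): a coroutine that takes
# longer than the configured timeout raises asyncio.TimeoutError, which
# is the same exception type seen at the bottom of the traceback above.
async def slow_call():
    await asyncio.sleep(0.2)  # stands in for a long-running runner call
    return "done"

def call_with_timeout(timeout: float) -> str:
    try:
        return asyncio.run(asyncio.wait_for(slow_call(), timeout=timeout))
    except asyncio.TimeoutError:
        return "timed out"

print(call_with_timeout(0.05))  # timeout shorter than the call: "timed out"
print(call_with_timeout(1.0))   # generous timeout: "done"
```

This is why raising only api_server.timeout is not enough: the runner call has its own, separate timeout.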
I'm wondering if there is anything I can do on the code side to improve this. The runner is declared like:
class timeseriesforecasting(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    @bentoml.Runnable.method(batchable=False)
Does it look good to you? Should I add any other config here? It's a bit strange that it works on my local machine but fails on Google Cloud Run 😕

Chaoyu

05/30/2023, 4:55 PM
Hi @Dajana Muho, from the error message it looks like the timeout is happening on the Runner call, so could you try setting a timeout for the runner as well? e.g.:
BENTOML_CONFIG_OPTIONS="api_server.timeout=1800 runners.timeout=1800"
I'm not sure how the Google Cloud Run UI handles this, but you may not need the quotes around the env var value 🤔 if it doesn't work as-is, I'd try it without them.
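Applied to the Cloud Run spec shared earlier in the thread, the env section would look something like this (a sketch; the BENTOML_CONFIG_OPTIONS keys assume BentoML 1.x):

```yaml
# excerpt of the Cloud Run container spec from above, with the
# suggested timeout settings added -- sketch, verify keys for your version
env:
- name: BENTOML_PORT
  value: '3000'
- name: BENTOML_CONFIG_OPTIONS
  value: 'api_server.timeout=1800 runners.timeout=1800'
```

Keeping timeoutSeconds: 1800 on the Cloud Run service itself so the platform-level timeout matches the BentoML-level ones.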