Hi, I am trying to understand how useful async cal...
# ask-for-help
Hi, I am trying to understand how useful async calls are with Bento. I have the structure shown below. I just want to know whether this is a good way to parallelize the calls and speed up the API call, whether there is a better way, or whether I should just make it sequential. As added context, the models are detection models that each take around 1 to 2 seconds on a low-end CPU.
class BigCodeForRunning:
   ...
   async def predict(self, img_path: str, runner: bentoml.Runner):
       # do something to image such as convert to numpy, resize, etc.
       img = do something ...
       
       # run predict
       output = runner.run.run(img)
       
       # do some post processing to output
       output = post_process(output)
       return output

# Bunch of runners i want to parallelize

async def...
  runner1 = ...to_runner()
  runner2 = ...to_runner()
  runner3 = ...to_runner()

  bigcodeblock = BigCodeForRunning()
  
  result = await asyncio.gather(
     asyncio.create_task(bigcodeblock.predict(img_path, runner1)),
     asyncio.create_task(bigcodeblock.predict(img_path, runner2)),
     asyncio.create_task(bigcodeblock.predict(img_path, runner3)),
  )
  
  ... do other stuff
Notice that I used run instead of async_run, since async_run gives me some problems.
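One thing that may matter here: as far as I understand asyncio, gather only overlaps coroutines that actually yield at an await. A blocking call inside an async def holds the event loop, so the three predict calls above would likely run one after another. Below is a minimal sketch of that effect using only the standard library. blocking_predict is a hypothetical stand-in for a synchronous runner call (it is not BentoML code), the 0.2 s delay is arbitrary, and asyncio.to_thread needs Python 3.9+.

```python
import asyncio
import time


def blocking_predict(delay: float) -> str:
    # Hypothetical stand-in for a synchronous model call (assumption, not BentoML).
    time.sleep(delay)
    return "done"


async def predict_blocking(delay: float) -> str:
    # The blocking call runs directly on the event loop, so nothing else
    # can make progress until it returns.
    return blocking_predict(delay)


async def predict_offloaded(delay: float) -> str:
    # The blocking call runs in a worker thread; awaiting it lets the
    # event loop start the other coroutines in the meantime.
    return await asyncio.to_thread(blocking_predict, delay)


async def timed(coro_fn, delay: float, n: int = 3):
    # Run n copies with gather and measure the wall-clock time.
    start = time.perf_counter()
    results = await asyncio.gather(*(coro_fn(delay) for _ in range(n)))
    return results, time.perf_counter() - start


async def main():
    _, t_block = await timed(predict_blocking, 0.2)
    _, t_off = await timed(predict_offloaded, 0.2)
    print(f"blocking inside async def: {t_block:.2f}s")
    print(f"offloaded with to_thread:  {t_off:.2f}s")
    return t_block, t_off


if __name__ == "__main__":
    asyncio.run(main())
```

In this sketch the first variant takes roughly the sum of the three delays, while the second takes roughly one delay. For the real runners, the awaitable counterpart of run is presumably async_run (awaited inside predict), but I have not checked that against the problem mentioned above.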