# announcements
Shaohong Bai
@Sean this looks brilliant, thank you so much for the new feature! Some feedback and questions:
• Internally we see that it still calls localhost as the API base. Why can someone no longer access “classify” as a plain Python function, like the old service in 0.13? What is the design idea behind this?
• The reason we are curious: we are also looking at a similar but vanilla version (a non-Spark batch run), or Dask/Ray. Maybe we could have something like bentoml.batch.run(bento, XXX, df, backend='spark/ray/dask/none')? Maybe we missed it in your documentation if you already have a non-Spark run; do let us know.
Just so we don't confuse you: we do see needs and use cases for Spark in our current production; we are just thinking about extra use cases.
Sean
@Shaohong Bai thanks for the feedback. One of the main differences between 1.0 and 0.13 is the multi-process API and runner architecture: an optimal number of API and runner server processes are spawned to run the inference logic at runtime, so some form of IPC is needed. Exposing the API as a plain Python function would likely do a disservice to helping users understand the intent of the architecture. That being said, calling the API from Python is still possible through the Bento Client API. You can dynamically generate a Python client from the service URL and call the endpoints in a Pythonic way.
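For example, something along these lines (a minimal sketch; it assumes the service is already running at http://localhost:3000 and exposes a classify endpoint that takes a NumPy array):
```python
import numpy as np
from bentoml.client import Client

# Dynamically generate a client from the running service's URL; the client
# discovers the service's endpoints and exposes them as Python methods.
client = Client.from_url("http://localhost:3000")

# Call the "classify" endpoint as if it were a local Python function.
result = client.classify(np.array([[5.9, 3.0, 5.1, 1.8]]))
print(result)
```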
👍 1
Shaohong Bai
@Sean good explanation, thank you for the clarification.
Sean
Similarly, the Bento Client works well with the Dask DataFrame API.
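Roughly like this (a sketch; it assumes the same service is running at http://localhost:3000 and its classify endpoint accepts and returns pandas DataFrames):
```python
import dask.dataframe as dd
import pandas as pd
from bentoml.client import Client

def classify_partition(partition: pd.DataFrame) -> pd.DataFrame:
    # Build the client inside the task so each Dask worker opens its own
    # HTTP connection instead of serializing one from the driver.
    client = Client.from_url("http://localhost:3000")
    return client.classify(partition)

ddf = dd.read_csv("data/*.csv")
# Note: without an explicit meta argument, Dask calls the function once on an
# empty frame to infer the output schema, which will hit the service.
predictions = ddf.map_partitions(classify_partition).compute()
```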
Shaohong Bai
We were discussing this internally; we think it is a much lighter and better solution than passing model objects across nodes/threads. It is a good move for sure for all these big data frameworks.
Sean
Definitely. Both options are possible, with trade-offs:
• run_in_spark (sketch below) brings the models closer to the data, with an upfront setup cost. The lifecycle of the Bento is managed by Spark.
• Bento Client does not require initial setup; the Bento service lifecycle is explicitly managed.
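For the first option, usage looks roughly like this (a sketch; exact arguments may differ, and it assumes a bento tagged iris_classifier:latest whose service exposes a classify API):
```python
import bentoml
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.read.csv("data/*.csv", header=True, inferSchema=True)

# Load the bento and run batch inference inside the Spark job; Spark handles
# distributing the bento and running it on the executors.
bento = bentoml.bentos.get("iris_classifier:latest")
results = bentoml.batch.run_in_spark(bento, df, spark, api_name="classify")
results.show()
```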
Amit Gelber
Hi, question! How is the Python virtual env handled? Each Bento will have a different requirements.txt. Will this feature create and distribute a virtual env across the Spark executors?
Sean
@Amit Gelber the current implementation requires the environment to be set up on the workers beforehand. We are working on setting up the environment automatically using Conda. cc: @sauyon
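In the meantime, one common way to pre-provision the workers (a sketch using PySpark's standard package-management mechanism, not a BentoML API) is to pack the Bento's dependencies with conda-pack and ship the archive to the executors:
```python
import os
from pyspark.sql import SparkSession

# Tell the executors to use the Python interpreter inside the unpacked archive.
os.environ["PYSPARK_PYTHON"] = "./environment/bin/python"

spark = (
    SparkSession.builder
    # environment.tar.gz was created beforehand with `conda pack` from an env
    # containing the Bento's requirements.txt dependencies.
    .config("spark.archives", "environment.tar.gz#environment")
    .getOrCreate()
)
```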