# ask-for-help
s
Hi Shaohong, the short answer is yes. BentoML 1.0 does not have parity with some of the frameworks supported in 0.13, including fasttext. Using a custom runner is the correct approach and shouldn't be too difficult.
👍 1
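For reference, a minimal custom runner for fasttext under BentoML 1.0 could look roughly like the sketch below; the model tag and file name are placeholders, not details from this thread.
```python
# Hedged sketch of a custom fasttext runner for BentoML 1.0.
# "fasttext_model:latest" and "model.bin" are placeholder names.
import bentoml
import fasttext


class FastTextRunnable(bentoml.Runnable):
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        # Load the raw fasttext file stored inside the BentoML model directory.
        model_ref = bentoml.models.get("fasttext_model:latest")
        self.model = fasttext.load_model(model_ref.path_of("model.bin"))

    @bentoml.Runnable.method(batchable=False)
    def predict(self, text: str):
        return self.model.predict(text)


fasttext_runner = bentoml.Runner(FastTextRunnable, name="fasttext_runner")
```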
s
@Sean thanks, we will check it out. Regarding the other side, pushing models to Yatai: do we still have a Python client to push to Yatai and pull from another instance, and can it work with multiple Yatai instances with different URLs? I understand the new Yatai is mainly a K8s use case, but we would still like to keep the old workflow where someone can pull directly from the corresponding Yatai URL (dev/prod) and switch between them in Python depending on the situation, so we would like to see how to migrate svc.save(yatai_url=XXXX).
Particularly Python-based rather than CLI-based bento build and bentoml push; I do know we have bentoml yatai login --api-token {YOUR_TOKEN_GOES_HERE} --endpoint http://yatai.127.0.0.1.sslip.io for this.
Our current thinking is to do something similar to your models.py (exposing what is under Click), using model_store = BentoMLContainer.model_store.get() to manipulate the store like before, then:
```python
yatai_client.push_model(model_obj, force=force, threads=threads)
```
Do you see any easy way to pass yatai_url on the fly, or does a context need to be initialized every time?
s
Hi Shaohong, the Python YataiClient you mentioned is supported; however, pushing and pulling from multiple Yatai instances is not supported in this YataiClient. Communicating with multiple Yatai instances is possible through the lower-level API YataiRESTApiClient, but you will need to write high-level code similar to YataiClient to invoke the lower-level APIs.
We are looking to invest more into the Python YataiClient this quarter. Could you please share more of your multi-Yatai-instance use cases with us? @Tim Liu
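As a rough illustration of the current state described above, the snippet below sketches the single-endpoint high-level client; the import path of the yatai_client singleton is a BentoML 1.0 internal and an assumption here, not a documented public API.
```python
# Rough sketch only: these import paths are BentoML internals and may change.
import bentoml
from bentoml._internal.yatai_client import yatai_client

# The built-in high-level client talks to the one Yatai endpoint configured
# via `bentoml yatai login`; switching endpoints on the fly is not supported.
model_obj = bentoml.models.get("my_model:latest")  # placeholder tag
yatai_client.push_model(model_obj, force=False, threads=4)

# Talking to a second Yatai (e.g. dev vs prod) would require a similar wrapper
# written on top of the lower-level YataiRESTApiClient, as described above.
```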
s
That is fine, I can dive into your code. From your existing knowledge, in the new version, can you think of any way we could do get_yatai_client(yatai_url=yatai_url).repository.load(model_version) like before?
👀 1
Also, there is something we would like to confirm for the migration: before, it seems we packed the whole service, but now we can only pack the model. If there is logic implemented in a custom runner, does someone who pulls the model to use it in Python need to reimplement that logic on their side? Of course, if it is deployed as an API the logic is included, so this is a change from pushing/pulling the whole service to only the model.
s
The custom runner code should also be packaged into the bento, through the include field in the bentofile.yaml. The simplest way is to include the custom runner class definition in the service.py where the API is defined, but the custom runner class can also be defined elsewhere and packaged into the bento.
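As an illustration of that suggestion, a service.py might look roughly like the sketch below; the service name, module layout, and the bentofile.yaml include entry are assumptions, not details from this thread.
```python
# service.py -- hedged sketch: the custom runner is packaged with the API by
# listing this file (and any module it imports) under the `include` field of
# bentofile.yaml, e.g. include: ["service.py", "runners.py"].
import bentoml
from bentoml.io import JSON, Text

from runners import FastTextRunnable  # hypothetical module; could also be defined here

fasttext_runner = bentoml.Runner(FastTextRunnable, name="fasttext_runner")

svc = bentoml.Service("fasttext_service", runners=[fasttext_runner])


@svc.api(input=Text(), output=JSON())
def classify(text: str) -> dict:
    # Runnable methods are exposed on the runner as runner.<method>.run(...)
    labels, scores = fasttext_runner.predict.run(text)
    return {"labels": list(labels), "scores": [float(s) for s in scores]}
```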
s
So someone can call classify (e.g. from the tutorial) as a plain Python function, even though it serves as an API endpoint? We do not want to make a request call; we would like to do model.predict in Python. The thing is, I cannot see anywhere in the new version where someone can pull the model, load it with model = bentoml.models.load(XXX), and then directly do model.predict as a production solution.
```python
runner.init_local()
runner.run(MODEL_INPUT)
```
E.g., you suggested the above for development and testing; what would the production practice be? People do not want to use the API, they just want to do pre-computation as usual.
s
Regarding having the YataiClient work with multiple Yatai instances, it turns out that we don't support context switching today. The good news is that I don't think it's too far-fetched to add that support to the YataiClient. I put together a PR that hopefully can be merged before the next release. You can see examples of the CLI and YataiClient API in the PR description. https://github.com/bentoml/BentoML/pull/3448
👀 1
runner.init_local is correct for invoking the model as a function in the API.
s
even for production?
s
What it means is that the runner will run in the same process as the API server, so you will lose runner capabilities like adaptive batching. This approach makes sense if the runner workload is trivial enough that it does not warrant another interprocess communication hop.
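For illustration, an offline pre-compute job along those lines might look roughly like the sketch below, reusing the hypothetical FastTextRunnable from the earlier sketches; tags and inputs are placeholders.
```python
# Hedged sketch of in-process scoring with init_local: no API server and no
# adaptive batching, which is acceptable for offline / pre-compute workloads.
import bentoml

from runners import FastTextRunnable  # hypothetical module from the earlier sketch

runner = bentoml.Runner(FastTextRunnable, name="fasttext_runner")
runner.init_local()  # instantiate the runnable in the current process

texts = ["first example", "second example"]
predictions = [runner.predict.run(t) for t in texts]
```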
s
I see, that makes sense. Yeah, that is OK for us for now, as we are doing pre-computation in a Spark cluster, so we basically lose the batching feature but scale in another way. This is aligned, and very interesting then.