# ask-for-help
It looks like this user’s question is similar to mine. Bo’s answer to his question also covers a lot of my questions. I don’t see this explanation documented anywhere on the site. The custom runner is what I will need to solve my problem, because I need to preprocess the dataframe.
On the adaptive batching architecture, the docs say:
> The order of the requests in a batch is not guaranteed

If I need to post-process the batch predictions returned by the model, what is the impact, or what do I need to pay attention to?
@Tim Liu or @Bo bumping this message to get some help with my BentoML upgrade questions: 1. What type should I use for the runner to accept a list of JSON objects as input in batch mode? 2. How do I return a list of JSON objects in batch mode, like in the BentoML 0.13 code above? 3. Can you help answer my question above about adaptive batching? Thanks!
Appreciate the ping. Sometimes I get lost in the messages. I want to understand your 0.13 code a little more before I can answer your questions. In your self.prediction_provider.predict, does the order matter? Does the order of the requests change the outcome of the result? If the order does not matter when you run inference with your model, then I think we can try out this code for 1.0:
```python
@api(input=JSON(), output=JSON())
def predict(input_json_req):
    domain_request = Mappings.to_domain(input_json_req)
    domain_df = pd.DataFrame(domain_request)
    result = prediction_runner.predict.run(domain_df)
    return JsonSerializer.to_json(result)
```
happy to hop on a quick call today @Shihgian Lee
Hi @Bo Thank you for your reply! I think you might have answered my question. The answer to your question is that the order does not matter. Please let me know if I got this right: we predict the requests and post-process the predictions in the order given to us by the runner. For example, requests 1, 2, 3, 4 come into the runner as a batch. We pass them to XGBoost for predictions, post-process them in the same order XGBoost returns them, and then return a list of JSON objects to the calling method. Following the example you provided, my refactored service API looks like the following:
```python
input_spec = JSON(pydantic_model=InputFeatures)

my_runner = bentoml.Runner(MyRunnable, name='my_runnable')
svc = bentoml.Service('my_service', runners=[my_runner])


@svc.api(input=input_spec, output=NumpyNdarray())
def predict(input_data: InputFeatures):
    input_df = pd.DataFrame([input_data.dict()])
    return my_runner.predict.run(input_df)
```
Is my understanding correct?
`my_runner.predict.run(input_df)`
returns a list of dataclass instances (serialized to JSON objects). Will that work?
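In case a concrete picture helps, here is a minimal, BentoML-free sketch of that order-preserving contract: as long as the model returns one prediction per input row, in input order, zipping requests with predictions keeps each post-processed result aligned with its original request. All names here are illustrative; `model_predict` stands in for the real XGBoost call.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prediction:
    request_id: int
    score: float


def model_predict(batch: List[dict]) -> List[float]:
    # Placeholder for the real model: one score per input row,
    # returned in the same order as the input batch.
    return [row["feature"] * 2.0 for row in batch]


def predict_batch(requests: List[dict]) -> List[Prediction]:
    # Because scores[i] corresponds to requests[i], zipping keeps
    # each post-processed result aligned with its original request,
    # even if the batcher assembled the batch in an arbitrary order.
    scores = model_predict(requests)
    return [Prediction(req["id"], score) for req, score in zip(requests, scores)]


batch = [{"id": 1, "feature": 1.0}, {"id": 2, "feature": 2.0}]
results = predict_batch(batch)
# results[0].request_id == 1, results[1].request_id == 2
```

The point is that "order is not guaranteed" across the batch as a whole is harmless for post-processing, provided every step preserves the within-batch index alignment.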
Hi @Bo Bump 👆 🙂 I have meetings this afternoon. If a quick call is easier for you, I can do it tomorrow, because I only have one meeting 🙂 Thank you!