# ask-for-help
t
For reference, here is the stack trace. I think the arguments might not be getting passed correctly:
Copy code
2022-10-04T124827-0400 [ERROR] [dev_api_server:lstm_bento] Exception on /classify [POST] (trace=a981d2df3b33509cb395e12c02630622,span=bc05c5af262d41b8,sampled=0)
Traceback (most recent call last):
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/frameworks/tensorflow_v2.py", line 291, in _run_method
    res = raw_method(*args, **kwargs)
  File "/Users/shivacharan/.pyenv/versions/3.9.1/envs/BentoML/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/shivacharan/.pyenv/versions/3.9.1/envs/BentoML/lib/python3.9/site-packages/tensorflow/python/saved_model/function_deserialization.py", line 295, in restored_function_body
    raise ValueError(
ValueError: Could not find matching concrete function to call loaded from the SavedModel. Got:
  Positional arguments (3 total):
    * <_VariantDataset element_spec={'nontemporal_features': TensorSpec(shape=(1, 500), dtype=tf.float64, name=None), 'temporal_features': TensorSpec(shape=(1, 404, 500), dtype=tf.float64, name=None)}>
    * False
    * None
  Keyword arguments: {}

Expected these arguments to match one of the following 4 option(s):

Option 1:
  Positional arguments (3 total):
    * [TensorSpec(shape=(None, 500), dtype=tf.float32, name='nontemporal_features'), TensorSpec(shape=(None, None, 500), dtype=tf.float32, name='temporal_features')]
    * False
    * None
  Keyword arguments: {}

Option 2:
  Positional arguments (3 total):
    * [TensorSpec(shape=(None, 500), dtype=tf.float32, name='inputs/0'), TensorSpec(shape=(None, None, 500), dtype=tf.float32, name='inputs/1')]
    * False
    * None
  Keyword arguments: {}

Option 3:
  Positional arguments (3 total):
    * [TensorSpec(shape=(None, 500), dtype=tf.float32, name='inputs/0'), TensorSpec(shape=(None, None, 500), dtype=tf.float32, name='inputs/1')]
    * True
    * None
  Keyword arguments: {}

Option 4:
  Positional arguments (3 total):
    * [TensorSpec(shape=(None, 500), dtype=tf.float32, name='nontemporal_features'), TensorSpec(shape=(None, None, 500), dtype=tf.float32, name='temporal_features')]
    * True
    * None
  Keyword arguments: {}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/server/http_app.py", line 323, in api_func
    output = await run_in_threadpool(api.func, input_data)
  File "/Users/shivacharan/.pyenv/versions/3.9.1/envs/BentoML/lib/python3.9/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/Users/shivacharan/.pyenv/versions/3.9.1/envs/BentoML/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/Users/shivacharan/.pyenv/versions/3.9.1/envs/BentoML/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/Users/shivacharan/.pyenv/versions/3.9.1/envs/BentoML/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/Users/shivacharan/Documents/repos/BentoML/examples/quickstart/service.py", line 21, in classify
    result = lstm_bento_runner.run(input)
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/runner/runner.py", line 44, in run
    return self.runner._runner_handle.run_method(  # type: ignore
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/runner/runner_handle/local.py", line 46, in run_method
    return getattr(self._runnable, __bentoml_method.name)(*args, **kwargs)
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/runner/runnable.py", line 139, in method
    return self.func(obj, *args, **kwargs)
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/frameworks/tensorflow_v2.py", line 323, in run_method
    return _run_method(runnable_self, *args, **kwargs)
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/frameworks/tensorflow_v2.py", line 301, in _run_method
    casted_args = cast_py_args_to_tf_function_args(
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/utils/tensorflow.py", line 186, in cast_py_args_to_tf_function_args
    parameters = [
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/utils/tensorflow.py", line 188, in <listcomp>
    name=s.name,
AttributeError: 'list' object has no attribute 'name'
@Shiva Charan Velichala Can you post the shape of the data which you pass to the original model's predict method?
s
Copy code
import numpy as np
import tensorflow as tf

nontemporal_feature_dimension = 500
temporal_feature_dimension = 500

dataset = tf.data.Dataset.range(1)
dataset = dataset.map(lambda x: {"nontemporal_features": np.random.rand(1, nontemporal_feature_dimension),
                                 "temporal_features": np.random.rand(1, np.random.randint(400, high=1000),
                                                                     temporal_feature_dimension)})
I am building the dataset like this and passing it to the function.
hey @Tim Liu, any update on this?
s
@Shiva Charan Velichala how are you saving the model? What’s the signature you provided?
s
Copy code
saved_model = bentoml.tensorflow.save_model("lstm_bento", model)
print(f"Model saved: {saved_model}")
@Sean, let me know if it's easier to jump on a call and troubleshoot.
s
I think what's missing is passing the model signatures when saving the model. Could you please get familiar with model signatures? We can jump on a call if you still have questions. https://docs.bentoml.org/en/latest/frameworks/tensorflow.html
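For example, a minimal sketch of passing a signature at save time could look like the following (the "__call__" method name and the batchable setting here are assumptions; check the docs above for what fits your model):
Copy code
import bentoml

# Sketch only: declare which method will be called at inference time
# and whether its inputs can be batched by BentoML.
saved_model = bentoml.tensorflow.save_model(
    "lstm_bento",
    model,
    signatures={"__call__": {"batchable": False}},
)
print(f"Model saved: {saved_model}")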
s
Got it, I will check this out and let you know if I can't figure it out. Thanks for your response.
Hey @Sean, I tried it and am still facing issues. Can we hop on a call when you have some time?
hi @Sean
s
HI @Shiva Charan Velichala, what time of day is best for you to meet?
s
I am available all day today.
Whenever you're ready, ping me and we can jump on a call.
@Sean
s
Maybe around 2:30pm PST. I will ping you ahead of time.
s
Sure, that works. Can you please share your email? I will send a meeting invite for the same.
Hi @Sean, let me know when you're ready and we can hop on a call.
s
@Aaron Pham please see training and service code here.
@Shiva Charan Velichala are you free to chat now?
Hi @Shiva Charan Velichala, I think we got it to work. In train.py, we made two changes: 1. Use bentoml.keras instead of the bentoml.tensorflow module, since the model is a Keras model. 2. Comment out the metrics argument in model.compile, since the custom metrics don't get saved and loaded well.
Copy code
import timeit
start_time = timeit.default_timer()
import tensorflow as tf
import numpy as np
import bentoml

nontemporal_feature_dimension = 500
temporal_feature_dimension = 500


class PatientAUC(tf.keras.metrics.AUC):
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true = tf.reduce_any(y_true, axis=-1)
        y_pred = tf.reduce_max(y_pred, axis=-2)
        super().update_state(y_true, y_pred, sample_weight=None)


def build_model(nontemporal_feature_dimension, temporal_feature_dimension, hidden_size=64):
    nontemporal_features = tf.keras.Input(shape=(nontemporal_feature_dimension,), name="nontemporal_features")
    temporal_features = tf.keras.Input(shape=(None, temporal_feature_dimension), name="temporal_features")

    if nontemporal_feature_dimension == 0:
        x = tf.keras.layers.LSTM(hidden_size, return_sequences=True)(temporal_features)
    else:
        nt_embed_h = tf.keras.layers.Dense(hidden_size)(nontemporal_features)
        nt_embed_c = tf.keras.layers.Dense(hidden_size)(nontemporal_features)
        initial_state = [nt_embed_h, nt_embed_c]
        x = tf.keras.layers.LSTM(hidden_size, return_sequences=True)(temporal_features, initial_state=initial_state)

    y = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1, activation='sigmoid'))(x)
    model = tf.keras.Model([nontemporal_features, temporal_features], y)
    return model


model = build_model(nontemporal_feature_dimension, temporal_feature_dimension)

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    # metrics=[
    #     tf.keras.metrics.AUC(curve='ROC', name='AUROC'),
    #     tf.keras.metrics.AUC(curve='PR', name='AUPRC'),
    #     PatientAUC(curve='ROC', name='PatientAUROC'),
    #     PatientAUC(curve='PR', name='PatientAUPRC')
    # ],
)

dataset = tf.data.Dataset.range(50)
dataset = dataset.map(lambda x: {"nontemporal_features": np.random.rand(1, nontemporal_feature_dimension),
                                 "temporal_features": np.random.rand(1, np.random.randint(400, high=1000),
                                                                     temporal_feature_dimension)})

start_time = timeit.default_timer()

# predictions = model.predict(dataset, batch_size=1, workers=64, use_multiprocessing=True)
# print(predictions)
elapsed = timeit.default_timer() - start_time
print(elapsed)

bentoml.keras.save_model(name="lstm_keras_test4", model=model, signatures={"predict": {"batchable": False}} )

# start_time = timeit.default_timer()
# for patient in feature_list:
#     predictions = model.predict(dataset, batch_size=1, workers=1, use_multiprocessing=False)
On the service.py side, we have to make two changes: 1. Change the runner initialization to use bentoml.keras instead of bentoml.tensorflow. 2. Convert the input to a Tensor type instead of the current example input.
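Roughly, the updated service.py could look something like this (a sketch only: the model tag, service name, endpoint name, and JSON payload layout below are assumptions, not your actual code):
Copy code
import numpy as np
import tensorflow as tf
import bentoml
from bentoml.io import JSON

# Assumption: the model was saved as "lstm_keras_test4" with bentoml.keras.save_model
lstm_runner = bentoml.keras.get("lstm_keras_test4:latest").to_runner()
svc = bentoml.Service("lstm_keras_service", runners=[lstm_runner])

@svc.api(input=JSON(), output=JSON())
def classify(payload: dict) -> dict:
    # Convert the JSON payload into float32 Tensors matching the model's two inputs
    nontemporal = tf.convert_to_tensor(np.array(payload["nontemporal_features"]), dtype=tf.float32)
    temporal = tf.convert_to_tensor(np.array(payload["temporal_features"]), dtype=tf.float32)
    result = lstm_runner.predict.run([nontemporal, temporal])
    return {"predictions": np.asarray(result).tolist()}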
s
When you say convert the input, this is what I am doing: I am creating a tensor inside service.py and passing it to the model, but I am still getting an error.
Let me know if I missed something.
s
• What is the error you see?
• The example code above did not convert the dataset into a tensor.
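As an illustration of what that conversion might look like (the shapes mirror the earlier example and are assumptions):
Copy code
import numpy as np
import tensorflow as tf

# One sample in the same layout as the earlier dataset example
sample = {"nontemporal_features": np.random.rand(1, 500),
          "temporal_features": np.random.rand(1, 404, 500)}

# Cast to float32 tensors so they match the model's input signature
inputs = [tf.convert_to_tensor(sample["nontemporal_features"], dtype=tf.float32),
          tf.convert_to_tensor(sample["temporal_features"], dtype=tf.float32)]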
Hi Shiva, any update on your side? Were you able to set up the Keras service?
s
Hi @Sean, I was meaning to message you. I am unable to make the service work with the input as JSON. I need some help figuring this out.
Hi @Sean, I was able to figure out the dataset conversion to JSON, and the service runs fine now. However, I have done some load testing using Locust and it's way slower than we expected. Do you have a few minutes to jump on a call to see if we are doing something wrong in testing? The model runs in 20 ms locally.
hi @Sean
t
@Shiva Charan Velichala Did you use the --production flag?
s
yep, bentoml serve --production
t
A couple of other things:
• You want to add the "async" option as part of the endpoint method definition
• You also want to use "await runner.predict.async_run"
you'll also want to enable batching by saving the model correctly: https://docs.bentoml.org/en/latest/guides/batching.html
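For the async change, the endpoint from the earlier sketch would roughly become (again an illustration using the same assumed names, not your actual service):
Copy code
import numpy as np
import tensorflow as tf
import bentoml
from bentoml.io import JSON

lstm_runner = bentoml.keras.get("lstm_keras_test4:latest").to_runner()
svc = bentoml.Service("lstm_keras_service", runners=[lstm_runner])

@svc.api(input=JSON(), output=JSON())
async def classify(payload: dict) -> dict:
    nontemporal = tf.convert_to_tensor(np.array(payload["nontemporal_features"]), dtype=tf.float32)
    temporal = tf.convert_to_tensor(np.array(payload["temporal_features"]), dtype=tf.float32)
    # async_run lets the event loop handle other requests while this inference runs
    result = await lstm_runner.predict.async_run([nontemporal, temporal])
    return {"predictions": np.asarray(result).tolist()}
For adaptive batching, the model would also need to be saved with a batchable signature, e.g. signatures={"predict": {"batchable": True, "batch_dim": 0}} (adjust batch_dim to your input layout), as described in the batching guide above.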
s
Also, another thing I noticed: I compiled my model before saving it, but I still see the "compile manually" message.
t
Haven't seen that before... But here's a generic checklist of performance optimizations: https://docs.bentoml.org/en/latest/guides/performance.html
s
async await
That's throwing an error: AttributeError: '_thread._local' object has no attribute 'current_async_module'
t
also need to change run() to async_run()
s
Gotcha, I changed that and it works now. But I don't see a difference in performance. 😟
t
Huh... that's a little weird... do you see any difference at all? I'd also look into adaptive batching; there should be a pretty significant increase in performance from that. What does the end-to-end latency look like?
s
The average time per request is 1700 ms before async and 1500 ms with async.
t
How fast does the model run per request in your previous test where you were calling the predict directly?
s
1700ms
t
Oh, I meant without BentoML at all. How long does the model take to predict in a normal training environment?
s
You mean calling predict just like a Python function?
I used timeit to calculate the elapsed time and it's about 30 ms.
t
Cool, yeah, it seems like there's an issue. There are two other things I'd try:
Copy code
# Load the model back into memory and call predict directly
# ("framework" stands for the framework module you saved with, e.g. bentoml.keras)
model = bentoml.framework.load_model(model_tag)
model.predict(data)
Copy code
# Create a runner but run it in-process, without a separate runner process
runner = bentoml.framework.get(model_tag).to_runner()
runner.init_local()
runner.predict.run(data)
Try timing these two methods; if one or the other isn't about 30 ms, I think we can narrow down where the issue is.
load_model() brings the model back into memory exactly as it was when it was trained; if this one takes a long time, it has to do with how we're loading the model back. That "compile manually" warning may have been a clue. init_local() creates a runner that you can run without a separate process; if this one is not slow, then the issue could have to do with inter-process communication.
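A quick sketch of timing both paths (the model tag and the dummy input below are placeholders; substitute your actual tag and data):
Copy code
import timeit
import numpy as np
import tensorflow as tf
import bentoml

tag = "lstm_keras_test4:latest"  # placeholder tag
data = [tf.convert_to_tensor(np.random.rand(1, 500), dtype=tf.float32),
        tf.convert_to_tensor(np.random.rand(1, 404, 500), dtype=tf.float32)]

# Path 1: load the model directly and call predict
model = bentoml.keras.load_model(tag)
t0 = timeit.default_timer()
model.predict(data)
print("direct predict:", timeit.default_timer() - t0)

# Path 2: run the runner in-process (no separate runner process, no IPC)
runner = bentoml.keras.get(tag).to_runner()
runner.init_local()
t0 = timeit.default_timer()
runner.predict.run(data)
print("in-process runner:", timeit.default_timer() - t0)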
s
Tried runner.init_local() and I see the message below.
And load_model is slower too.
It's taking 3000 ms.
@Tim Liu
I am loading the model using bentoml.keras.load_model and passing it to the service, but I'm getting the above error.
t
Ah, OK, yes, this seems to indicate that there's something going on with the model loading itself. You're probably going to have to give us a little more information on the model. We'll likely have to look through the Keras documentation to see how it's supposed to be saved and loaded in your particular case.