# ask-for-help
t
For reference, here is the stack trace. I think the arguments might not be getting passed correctly:
Copy code
2022-10-04T124827-0400 [ERROR] [dev_api_server:lstm_bento] Exception on /classify [POST] (trace=a981d2df3b33509cb395e12c02630622,span=bc05c5af262d41b8,sampled=0)
Traceback (most recent call last):
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/frameworks/tensorflow_v2.py", line 291, in _run_method
    res = raw_method(*args, **kwargs)
  File "/Users/shivacharan/.pyenv/versions/3.9.1/envs/BentoML/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/shivacharan/.pyenv/versions/3.9.1/envs/BentoML/lib/python3.9/site-packages/tensorflow/python/saved_model/function_deserialization.py", line 295, in restored_function_body
    raise ValueError(
ValueError: Could not find matching concrete function to call loaded from the SavedModel. Got:
  Positional arguments (3 total):
    * <_VariantDataset element_spec={'nontemporal_features': TensorSpec(shape=(1, 500), dtype=tf.float64, name=None), 'temporal_features': TensorSpec(shape=(1, 404, 500), dtype=tf.float64, name=None)}>
    * False
    * None
  Keyword arguments: {}

Expected these arguments to match one of the following 4 option(s):

Option 1:
  Positional arguments (3 total):
    * [TensorSpec(shape=(None, 500), dtype=tf.float32, name='nontemporal_features'), TensorSpec(shape=(None, None, 500), dtype=tf.float32, name='temporal_features')]
    * False
    * None
  Keyword arguments: {}

Option 2:
  Positional arguments (3 total):
    * [TensorSpec(shape=(None, 500), dtype=tf.float32, name='inputs/0'), TensorSpec(shape=(None, None, 500), dtype=tf.float32, name='inputs/1')]
    * False
    * None
  Keyword arguments: {}

Option 3:
  Positional arguments (3 total):
    * [TensorSpec(shape=(None, 500), dtype=tf.float32, name='inputs/0'), TensorSpec(shape=(None, None, 500), dtype=tf.float32, name='inputs/1')]
    * True
    * None
  Keyword arguments: {}

Option 4:
  Positional arguments (3 total):
    * [TensorSpec(shape=(None, 500), dtype=tf.float32, name='nontemporal_features'), TensorSpec(shape=(None, None, 500), dtype=tf.float32, name='temporal_features')]
    * True
    * None
  Keyword arguments: {}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/server/http_app.py", line 323, in api_func
    output = await run_in_threadpool(api.func, input_data)
  File "/Users/shivacharan/.pyenv/versions/3.9.1/envs/BentoML/lib/python3.9/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/Users/shivacharan/.pyenv/versions/3.9.1/envs/BentoML/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/Users/shivacharan/.pyenv/versions/3.9.1/envs/BentoML/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/Users/shivacharan/.pyenv/versions/3.9.1/envs/BentoML/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/Users/shivacharan/Documents/repos/BentoML/examples/quickstart/service.py", line 21, in classify
    result = lstm_bento_runner.run(input)
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/runner/runner.py", line 44, in run
    return self.runner._runner_handle.run_method(  # type: ignore
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/runner/runner_handle/local.py", line 46, in run_method
    return getattr(self._runnable, __bentoml_method.name)(*args, **kwargs)
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/runner/runnable.py", line 139, in method
    return self.func(obj, *args, **kwargs)
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/frameworks/tensorflow_v2.py", line 323, in run_method
    return _run_method(runnable_self, *args, **kwargs)
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/frameworks/tensorflow_v2.py", line 301, in _run_method
    casted_args = cast_py_args_to_tf_function_args(
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/utils/tensorflow.py", line 186, in cast_py_args_to_tf_function_args
    parameters = [
  File "/Users/shivacharan/Documents/repos/BentoML/bentoml/_internal/utils/tensorflow.py", line 188, in <listcomp>
    name=s.name,
AttributeError: 'list' object has no attribute 'name'
@Shiva Charan Velichala Can you post the shape of the data which you pass to the original model's predict method?
s
Copy code
import numpy as np
import tensorflow as tf

nontemporal_feature_dimension = 500
temporal_feature_dimension = 500

dataset = tf.data.Dataset.range(1)
dataset = dataset.map(lambda x: {"nontemporal_features": np.random.rand(1, nontemporal_feature_dimension),
                                 "temporal_features": np.random.rand(1, np.random.randint(400, high=1000),
                                                                     temporal_feature_dimension)})
I am building the dataset like this and passing it to the function.
hey @Tim Liu, any update on this?
s
@Shiva Charan Velichala how are you saving the model? What’s the signature you provided?
s
Copy code
saved_model = bentoml.tensorflow.save_model("lstm_bento", model)
print(f"Model saved: {saved_model}")
@Sean, let me know if it's easier to jump on a call and troubleshoot.
s
I think what's missing is passing the model signatures when saving the model. Could you please get familiar with model signatures? We can jump on a call if you still have questions. https://docs.bentoml.org/en/latest/frameworks/tensorflow.html
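For example, a minimal sketch of passing a signature at save time could look like the following (the "__call__" method name and the batchable setting here are assumptions; check the docs above for what fits your model):
Copy code
import bentoml

# Sketch only: declare which method will be called at inference time
# and whether its inputs can be batched by BentoML.
saved_model = bentoml.tensorflow.save_model(
    "lstm_bento",
    model,
    signatures={"__call__": {"batchable": False}},
)
print(f"Model saved: {saved_model}")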
s
Got it, I will check this out and let you know if I can't figure it out. Thanks for your response.
Hey @Sean, I tried it and am still facing issues. Can we hop on a call when you have some time?
hi @Sean
s
HI @Shiva Charan Velichala, what time of day is best for you to meet?
s
I am available all day today.
Whenever you're ready, ping me and we can jump on a call.
@Sean
s
Maybe around 2:30pm PST. I will ping you ahead of time.
s
Sure, that works. Can you please share your email? I will send a meeting invite for the same.
Hi @Sean, let me know when you're ready and we can hop on a call.
s
@Aaron Pham please see training and service code here.
@Shiva Charan Velichala are you free to chat now?
Hi @Shiva Charan Velichala, I think we got it to work. In train.py, we made two changes: 1. Use bentoml.keras instead of the bentoml.tensorflow module, since the model is a Keras model. 2. Comment out the metrics argument in model.compile, since the custom metrics don't get saved and loaded well.
Copy code
import timeit
start_time = timeit.default_timer()
import tensorflow as tf
import numpy as np
import bentoml

nontemporal_feature_dimension = 500
temporal_feature_dimension = 500


class PatientAUC(tf.keras.metrics.AUC):
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true = tf.reduce_any(y_true, axis=-1)
        y_pred = tf.reduce_max(y_pred, axis=-2)
        super().update_state(y_true, y_pred, sample_weight=None)


def build_model(nontemporal_feature_dimension, temporal_feature_dimension, hidden_size=64):
    nontemporal_features = tf.keras.Input(shape=(nontemporal_feature_dimension,), name="nontemporal_features")
    temporal_features = tf.keras.Input(shape=(None, temporal_feature_dimension), name="temporal_features")

    if nontemporal_feature_dimension == 0:
        x = tf.keras.layers.LSTM(hidden_size, return_sequences=True)(temporal_features)
    else:
        nt_embed_h = tf.keras.layers.Dense(hidden_size)(nontemporal_features)
        nt_embed_c = tf.keras.layers.Dense(hidden_size)(nontemporal_features)
        initial_state = [nt_embed_h, nt_embed_c]
        x = tf.keras.layers.LSTM(hidden_size, return_sequences=True)(temporal_features, initial_state=initial_state)

    y = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1, activation='sigmoid'))(x)
    model = tf.keras.Model([nontemporal_features, temporal_features], y)
    return model


model = build_model(nontemporal_feature_dimension, temporal_feature_dimension)

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    # metrics=[
    #     tf.keras.metrics.AUC(curve='ROC', name='AUROC'),
    #     tf.keras.metrics.AUC(curve='PR', name='AUPRC'),
    #     PatientAUC(curve='ROC', name='PatientAUROC'),
    #     PatientAUC(curve='PR', name='PatientAUPRC')
    # ],
)

dataset = tf.data.Dataset.range(50)
dataset = dataset.map(lambda x: {"nontemporal_features": np.random.rand(1, nontemporal_feature_dimension),
                                 "temporal_features": np.random.rand(1, np.random.randint(400, high=1000),
                                                                     temporal_feature_dimension)})

start_time = timeit.default_timer()

# predictions = model.predict(dataset, batch_size=1, workers=64, use_multiprocessing=True)
# print(predictions)
elapsed = timeit.default_timer() - start_time
print(elapsed)

bentoml.keras.save_model(name="lstm_keras_test4", model=model, signatures={"predict": {"batchable": False}} )

# start_time = timeit.default_timer()
# for patient in feature_list:
#     predictions = model.predict(dataset, batch_size=1, workers=1, use_multiprocessing=False)
On the service.py side, we have to make two changes: 1. Change the runner initialization to use bentoml.keras instead of bentoml.tensorflow. 2. Convert the input to a Tensor type instead of the current example input.
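Roughly, the updated service.py could look something like this (a sketch only: the model tag, service name, endpoint name, and JSON payload layout below are assumptions, not your actual code):
Copy code
import numpy as np
import tensorflow as tf
import bentoml
from bentoml.io import JSON

# Assumption: the model was saved as "lstm_keras_test4" with bentoml.keras.save_model
lstm_runner = bentoml.keras.get("lstm_keras_test4:latest").to_runner()
svc = bentoml.Service("lstm_keras_service", runners=[lstm_runner])

@svc.api(input=JSON(), output=JSON())
def classify(payload: dict) -> dict:
    # Convert the JSON payload into float32 Tensors matching the model's two inputs
    nontemporal = tf.convert_to_tensor(np.array(payload["nontemporal_features"]), dtype=tf.float32)
    temporal = tf.convert_to_tensor(np.array(payload["temporal_features"]), dtype=tf.float32)
    result = lstm_runner.predict.run([nontemporal, temporal])
    return {"predictions": np.asarray(result).tolist()}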
s
When you say convert the input, this is what I am doing: I am creating a tensor inside service.py and passing it to the model, but I am still getting an error.
Let me know if I missed something.
s
• What is the error you see?
• The example code above did not convert the dataset into a tensor.
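As an illustration of what that conversion might look like (the shapes mirror the earlier example and are assumptions):
Copy code
import numpy as np
import tensorflow as tf

# One sample in the same layout as the earlier dataset example
sample = {"nontemporal_features": np.random.rand(1, 500),
          "temporal_features": np.random.rand(1, 404, 500)}

# Cast to float32 tensors so they match the model's input signature
inputs = [tf.convert_to_tensor(sample["nontemporal_features"], dtype=tf.float32),
          tf.convert_to_tensor(sample["temporal_features"], dtype=tf.float32)]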
Hi Shiva, any update on your side? Were you able to set up the Keras service?
s
Hi @Sean, I was meaning to message you. I am unable to make the service work with the input as JSON. I need some help figuring this out.
Hi @Sean, I was able to figure out the dataset conversion to JSON, and the service runs fine now. However, I have done some load testing using Locust and it's way slower than we expected. Do you have a few minutes to jump on a call to see if we are doing something wrong in testing? The model runs in 20 ms locally.
hi @Sean
t
@Shiva Charan Velichala Did you use the --production flag?
s
yep, bentoml serve --production
t
A couple of other things:
• You want to add the "async" option as part of the endpoint method definition
• You also want to use "await runner.predict.async_run"
you'll also want to enable batching by saving the model correctly: https://docs.bentoml.org/en/latest/guides/batching.html
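For the async change, the endpoint from the earlier sketch would roughly become (again an illustration using the same assumed names, not your actual service):
Copy code
import numpy as np
import tensorflow as tf
import bentoml
from bentoml.io import JSON

lstm_runner = bentoml.keras.get("lstm_keras_test4:latest").to_runner()
svc = bentoml.Service("lstm_keras_service", runners=[lstm_runner])

@svc.api(input=JSON(), output=JSON())
async def classify(payload: dict) -> dict:
    nontemporal = tf.convert_to_tensor(np.array(payload["nontemporal_features"]), dtype=tf.float32)
    temporal = tf.convert_to_tensor(np.array(payload["temporal_features"]), dtype=tf.float32)
    # async_run lets the event loop handle other requests while this inference runs
    result = await lstm_runner.predict.async_run([nontemporal, temporal])
    return {"predictions": np.asarray(result).tolist()}
For adaptive batching, the model would also need to be saved with a batchable signature, e.g. signatures={"predict": {"batchable": True, "batch_dim": 0}} (adjust batch_dim to your input layout), as described in the batching guide above.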
s
Also, another thing I noticed: I compiled my model before saving it, but I still see the "compile manually" message.
t
Haven't seen that before... But here's a generic checklist of performance optimizations: https://docs.bentoml.org/en/latest/guides/performance.html
s
async await
That's throwing an error: AttributeError: '_thread._local' object has no attribute 'current_async_module'
t
also need to change run() to async_run()
s
Gotcha, I changed that and it works now. But I don't see a difference in performance. 😟
t
Huh... that's a little weird... do you see any difference at all? I'd also look into adaptive batching; there should be a pretty significant increase in performance from that. What does the end-to-end latency look like?
s
The average time per request is 1700 ms before async and 1500 ms with async.
t
How fast does the model run per request in your previous test where you were calling the predict directly?
s
1700ms
t
Oh, I meant without BentoML at all. How long does the model take to predict in a normal training environment?
s
You mean calling predict just like a Python function?
I used timeit to calculate the elapsed time and it's about 30 ms.
t
Cool, yeah, it seems like there's an issue. There are two other things I'd try:
Copy code
# Load the model back into memory and call predict directly
# ("framework" stands for the framework module you saved with, e.g. bentoml.keras)
model = bentoml.framework.load_model(model_tag)
model.predict(data)
Copy code
# Create a runner but run it in-process, without a separate runner process
runner = bentoml.framework.get(model_tag).to_runner()
runner.init_local()
runner.predict.run(data)
Try timing these two methods; if one or the other isn't about 30 ms, I think we can narrow down where the issue is.
load_model() brings the model back into memory exactly as it was when it was trained; if this one takes a long time, it has to do with how we're loading the model back. That "compile manually" warning may have been a clue. init_local() creates a runner that you can run without a separate process; if this one is not slow, then the issue could have to do with inter-process communication.
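A quick sketch of timing both paths (the model tag and the dummy input below are placeholders; substitute your actual tag and data):
Copy code
import timeit
import numpy as np
import tensorflow as tf
import bentoml

tag = "lstm_keras_test4:latest"  # placeholder tag
data = [tf.convert_to_tensor(np.random.rand(1, 500), dtype=tf.float32),
        tf.convert_to_tensor(np.random.rand(1, 404, 500), dtype=tf.float32)]

# Path 1: load the model directly and call predict
model = bentoml.keras.load_model(tag)
t0 = timeit.default_timer()
model.predict(data)
print("direct predict:", timeit.default_timer() - t0)

# Path 2: run the runner in-process (no separate runner process, no IPC)
runner = bentoml.keras.get(tag).to_runner()
runner.init_local()
t0 = timeit.default_timer()
runner.predict.run(data)
print("in-process runner:", timeit.default_timer() - t0)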
s
Tried runner.init_local() and I see the message below.
And load_model is slower too.
It's taking 3000 ms.
@Tim Liu
I am loading the model using bentoml.keras.load_model and passing it to the service, but I'm getting the above error.
t
Ah, OK, yes, this seems to indicate that there's something going on with the model loading itself. You're probably going to have to give us a little more information on the model. We'll likely have to look through the Keras documentation to see how it's supposed to be saved and loaded in your particular case.