Matthieu Vanhoutte
02/09/2023, 7:23 PM
serve take some memory?
I defined the following bentoml_configuration.yml file:
api_server:
  workers: 2
  metrics:
    enabled: false
    namespace: bentoml_api_server
  logging:
    access:
      enabled: false
      request_content_length: true
      request_content_type: true
      response_content_length: true
      response_content_type: true
      format:
        trace_id: 032x
        span_id: 016x
runners:
  onnx_mrr_runner:
    batching:
      enabled: true
      max_batch_size: 5000
      max_latency_ms: 10000
    logging:
      access:
        enabled: false
        request_content_length: true
        request_content_type: true
        response_content_length: true
        response_content_type: true
    metrics:
      enabled: false
      namespace: bentoml_runner
  onnx_mfr_runner:
    batching:
      enabled: true
      max_batch_size: 5000
      max_latency_ms: 10000
    logging:
      access:
        enabled: false
        request_content_length: true
        request_content_type: true
        response_content_length: true
        response_content_type: true
    metrics:
      enabled: false
      namespace: bentoml_runner
  onnx_mfsa_runner:
    batching:
      enabled: true
      max_batch_size: 5000
      max_latency_ms: 10000
    logging:
      access:
        enabled: false
        request_content_length: true
        request_content_type: true
        response_content_length: true
        response_content_type: true
    metrics:
      enabled: false
      namespace: bentoml_runner
  onnx_mrrr_runner:
    batching:
      enabled: true
      max_batch_size: 1000
      max_latency_ms: 10000
    logging:
      access:
        enabled: false
        request_content_length: true
        request_content_type: true
        response_content_length: true
        response_content_type: true
    metrics:
      enabled: false
      namespace: bentoml_runner
  onnx_err_runner:
    batching:
      enabled: true
      max_batch_size: 5000
      max_latency_ms: 10000
    logging:
      access:
        enabled: false
        request_content_length: true
        request_content_type: true
        response_content_length: true
        response_content_type: true
    metrics:
      enabled: false
      namespace: bentoml_runner
  onnx_errr_runner:
    batching:
      enabled: true
      max_batch_size: 25000
      max_latency_ms: 10000
    logging:
      access:
        enabled: false
        request_content_length: true
        request_content_type: true
        response_content_length: true
        response_content_type: true
    metrics:
      enabled: false
      namespace: bentoml_runner
  ct2_fp16_fr_en_runner:
    batching:
      enabled: true
      max_batch_size: 1000
      max_latency_ms: 10000
    logging:
      access:
        enabled: false
        request_content_length: true
        request_content_type: true
        response_content_length: true
        response_content_type: true
    metrics:
      enabled: false
      namespace: bentoml_runner
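The seven runner entries above differ only in max_batch_size, so a file like this could be generated instead of written by hand. The following is a minimal sketch under that assumption; the helper names runner_block and to_yaml are hypothetical, and the tiny YAML writer only handles the nested keys and scalar values used in this configuration.

```python
# Batch sizes copied from the configuration above; all other values are shared.
RUNNER_BATCH_SIZES = {
    "onnx_mrr_runner": 5000,
    "onnx_mfr_runner": 5000,
    "onnx_mfsa_runner": 5000,
    "onnx_mrrr_runner": 1000,
    "onnx_err_runner": 5000,
    "onnx_errr_runner": 25000,
    "ct2_fp16_fr_en_runner": 1000,
}


def runner_block(max_batch_size):
    """Build the settings shared by every runner (hypothetical helper)."""
    return {
        "batching": {
            "enabled": True,
            "max_batch_size": max_batch_size,
            "max_latency_ms": 10000,
        },
        "logging": {
            "access": {
                "enabled": False,
                "request_content_length": True,
                "request_content_type": True,
                "response_content_length": True,
                "response_content_type": True,
            }
        },
        "metrics": {"enabled": False, "namespace": "bentoml_runner"},
    }


def to_yaml(node, indent=0):
    """Render nested dicts of scalars as YAML lines (hypothetical helper)."""
    lines = []
    pad = " " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 2))
        elif isinstance(value, bool):
            # YAML booleans are lowercase, unlike Python's True/False.
            lines.append(f"{pad}{key}: {str(value).lower()}")
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


runners = {name: runner_block(size) for name, size in RUNNER_BATCH_SIZES.items()}
runners_yaml = "\n".join(to_yaml({"runners": runners}))
print(runners_yaml)
```

Changing a shared value such as max_latency_ms then means editing one line in runner_block rather than seven places in the file.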
1. So, can I enable every logging:access:enabled setting without using disk space on the cloud instance?
2. How can I specify that I only want logs at the warning level and above on the console?
3. Does metrics:enabled:true take memory space?
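For question 2, one possible approach is sketched below. It assumes that BentoML emits its messages through the standard-library logger named "bentoml", which should be checked against the installed version. The in-memory stream and the handler are only there to make the effect visible; they are not part of any BentoML API.

```python
import io
import logging

# Assumption: BentoML logs through the standard-library logger named "bentoml".
bentoml_logger = logging.getLogger("bentoml")
bentoml_logger.setLevel(logging.WARNING)  # drop DEBUG and INFO records

# Demonstration only: capture output in memory instead of the real console.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
bentoml_logger.addHandler(handler)
bentoml_logger.propagate = False  # avoid duplicate output via the root logger

bentoml_logger.info("this record is filtered out")
bentoml_logger.warning("this record is kept")

captured = stream.getvalue()
print(captured, end="")
```

Only the warning record reaches the handler, because the logger's level filters records before any handler sees them.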
From a thread in #support