Slackbot - 04/03/2023, 10:33 PM

Eric Riddoch - 04/04/2023, 5:13 PM

Tim Liu - 04/07/2023, 3:55 PM

Eric Riddoch - 04/07/2023, 5:02 PM

Tim Liu - 04/07/2023, 5:14 PM

Sean - 04/08/2023, 10:14 AM

Sean - 04/08/2023, 11:38 AM
With the aiohttp client, you can start a trace as shown here: https://opentelemetry-python-kinvolk.readthedocs.io/en/latest/instrumentation/aiohttp_client/aiohttp_client.html

Sean - 04/08/2023, 11:43 AM
The BentoML client uses aiohttp underneath. You can start a trace with the following example.
import numpy as np
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from bentoml.client import Client
from opentelemetry.instrumentation.aiohttp_client import AioHttpClientInstrumentor
# Enable instrumentation
AioHttpClientInstrumentor().instrument()
# Set the tracer provider
trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)
# Create a new root span, set it as the current span in context
with tracer.start_as_current_span("parent") as span:
    client = Client.from_url("http://localhost:3000")
    res = client.classify(np.array([[5, 4, 3, 2]]))
    print(res, span)
Within the span, you have access to both the trace id and span id.
[1.] _Span(name="parent", context=SpanContext(trace_id=0xa0a6d729fa810529ca039f4f74daa245, span_id=0xd2b688f5ec252e34, trace_flags=0x01, trace_state=[], is_remote=False))
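If you want the ids in the same hex form that appears in the access log, the span context exposes them as integers; a minimal sketch continuing the example above:
# Continuing inside the `with tracer.start_as_current_span("parent") as span:` block:
# the span context holds trace_id and span_id as integers; hex-format them to
# match the trace=/span= fields in BentoML's access log.
span_ctx = span.get_span_context()
trace_id_hex = format(span_ctx.trace_id, "032x")  # 32 hex chars, e.g. a0a6d729fa810529ca039f4f74daa245
span_id_hex = format(span_ctx.span_id, "016x")    # 16 hex chars
print(trace_id_hex, span_id_hex)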
On the server side, the request has the same trace id in both the api_server and runner log lines.
2023-04-08T04:39:39-0700 [INFO] [runner:iris_clf:1] _ (scheme=http,method=POST,path=/predict,type=application/octet-stream,length=32) (status=200,type=application/vnd.bentoml.NdarrayContainer,length=8) 0.827ms (trace=a0a6d729fa810529ca039f4f74daa245,span=f6636b5e84fae525,sampled=1)
2023-04-08T04:39:39-0700 [INFO] [api_server:iris_classifier:7] 127.0.0.1:50801 (scheme=http,method=POST,path=/classify,type=application/json,length=22) (status=200,type=application/json,length=5) 2.845ms (trace=a0a6d729fa810529ca039f4f74daa245,span=4a0f7029f04a41e3,sampled=1)
Over the wire, the trace context is sent as HTTP headers using the W3C Trace Context propagation format that OpenTelemetry uses; see the traceparent header below.
Hypertext Transfer Protocol
POST /classify HTTP/1.1\r\n
Host: localhost:3000\r\n
content-type: application/json\r\n
traceparent: 00-90dfec16e4879db3ddae2241fa29fb16-da2a8c6f736a0282-01\r\n
Accept: */*\r\n
Accept-Encoding: gzip, deflate\r\n
User-Agent: Python/3.10 aiohttp/3.8.3\r\n
Content-Length: 22\r\n
[Content length: 22]
\r\n
[Full request URI: http://localhost:3000/classify]
[HTTP request 1/1]
[Response in frame: 91]
File Data: 22 bytes
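For reference, that traceparent value follows the W3C Trace Context format (version-trace_id-parent_span_id-trace_flags); a small sketch of splitting a raw header value (the helper is just illustrative):
# Hypothetical helper: split a W3C traceparent header value into its fields.
# Format: <version>-<trace-id>-<parent-id>-<trace-flags>, all lowercase hex.
def parse_traceparent(value: str) -> dict:
    version, trace_id, parent_id, flags = value.split("-")
    return {
        "version": version,      # "00"
        "trace_id": trace_id,    # 32 hex chars, matches trace= in the access log
        "parent_id": parent_id,  # 16 hex chars
        "sampled": int(flags, 16) & 0x01 == 1,
    }

print(parse_traceparent("00-90dfec16e4879db3ddae2241fa29fb16-da2a8c6f736a0282-01"))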
With this, you know the exact trace id for every request you decide to trace.

Sean - 04/08/2023, 11:44 AM

Eric Riddoch - 04/08/2023, 11:39 PM

Sean - 04/09/2023, 10:24 AM
A Request ID should already be returned in the response today; see below for an example HTTP 200 response with the x-bentoml-request-id header. The Trace ID is currently not automatically returned in the response. The Request ID is not logged in the access log.
Hypertext Transfer Protocol
HTTP/1.1 200 OK\r\n
date: Sat, 08 Apr 2023 11:39:39 GMT\r\n
server: uvicorn\r\n
content-length: 5\r\n
[Content length: 5]
content-type: application/json\r\n
x-bentoml-request-id: 9273968296152751308\r\n
\r\n
[HTTP response 1/1]
[Time since request: 0.003232000 seconds]
[Request in frame: 151]
[Request URI: http://localhost:3000/classify]
File Data: 5 bytes
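If the caller wants to capture that id, it can be read straight off the response headers; a minimal sketch using requests against the same /classify endpoint (assumes the service from the earlier example is running on localhost:3000):
import requests

# Call the running service and read the request id BentoML attaches to the response.
resp = requests.post(
    "http://localhost:3000/classify",
    json=[[5.0, 4.0, 3.0, 2.0]],
)
print(resp.status_code, resp.headers.get("x-bentoml-request-id"))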
However, returning a Request or Trace ID from the response is not the most reliable approach. It is also technically difficult to cover all paths, because the server would have to instrument every path that could potentially return a response, even those reached before a trace context is established. If feasible, you can add a header in your API service code to include the Trace ID manually. Though trace_context is an internal API, it should be stable.
import numpy as np

import bentoml
from bentoml.io import NumpyNdarray
from bentoml._internal.context import trace_context

iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()

svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

@svc.api(
    input=NumpyNdarray.from_sample(
        np.array([[4.9, 3.0, 1.4, 0.2]], dtype=np.double), enforce_shape=False
    ),
    output=NumpyNdarray.from_sample(np.array([0.0], dtype=np.double)),
)
async def classify(input_series: np.ndarray, ctx) -> np.ndarray:
    # Expose the current trace id to the caller via a response header.
    ctx.response.headers["x-bentoml-trace-id"] = str(trace_context.trace_id)
    return await iris_clf_runner.predict.async_run(input_series)
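With that in place, the caller can read the trace id off the response in the same way as the request id above, e.g. resp.headers.get("x-bentoml-trace-id") in the requests sketch, and use it to search the server logs.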
Trace instrumentation is best started from the caller side. OpenTelemetry is the most powerful when the entire call graph is propagating trace context. OpenTelemetry should support most popular frameworks, Python or not, out-of-the-box. https://opentelemetry.io/docs/instrumentation/
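For example, a Python caller built on FastAPI can be auto-instrumented with the stock OpenTelemetry instrumentation (a sketch; assumes the opentelemetry-instrumentation-fastapi package, which is not part of the thread above):
from fastapi import FastAPI
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

app = FastAPI()

@app.get("/health")
async def health():
    return {"ok": True}

# Every request handled by this app now runs inside a server span, and the
# trace context is propagated to downstream calls that are also instrumented.
FastAPIInstrumentor.instrument_app(app)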
Eric Riddoch - 04/09/2023, 6:54 PM
The ctx parameter. That's great!
"Trace instrumentation is best started from the caller side. OpenTelemetry is the most powerful when the entire call graph is propagating trace context. OpenTelemetry should support most popular frameworks, Python or not, out-of-the-box."
I'm aware that OpenTelemetry has instrumentation for many languages/frameworks. And internally (within the DE/AI team), we can certainly instrument our code, although we prefer the NewRelic instrumentation whenever it's supported because it actually injects more context than OTel does by default. That aside, if we're looking for a "request ID" (I really mean a "trace ID" when I say this, but customers don't necessarily know the difference), it must be generated and returned whether the customer (client) is instrumented or not.
"A Request ID should already be returned in the response today. See below for an example HTTP 200 response with the x-bentoml-request-id header."
I'm also aware that BentoML returns a request ID (although I don't believe it does when a 500 error is returned, which makes it even less useful than it already is).
Rant about error handling in BentoML / Could BentoML support middlewares / custom error handling?
BentoML is built off of the Starlette framework, isn't it? Does Starlette not support some sort of middleware mechanism where you can add headers to all requests whether they succeed or not? It's really the error case that we care about. I don't imagine we'd be auditing success cases nearly as much, although it's definitely good to have that capability. With FastAPI, we're able to use middlewares to achieve this, and even set up a "global error handler" to make sure error responses are much more informative than the vague:
500 Internal Server Error... something went wrong
We regard returning errors that vague without context as a sin of REST API design, so it was frustrating to discover that BentoML doesn't give the user any control over that logic, effectively forcing you to always return useless error messages when runtime errors occur.
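For concreteness, the FastAPI middleware/global-error-handler pattern I mean looks roughly like this (a sketch of the caller-side approach, not BentoML code; the header name is illustrative):
import uuid

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()

@app.middleware("http")
async def add_request_id(request: Request, call_next):
    # Attach an id to every response, including responses produced by unhandled errors.
    request_id = str(uuid.uuid4())
    try:
        response = await call_next(request)
    except Exception:
        # "Global error handler": return a structured body instead of a bare 500.
        response = JSONResponse(
            status_code=500,
            content={"error": "internal error", "request_id": request_id},
        )
    response.headers["x-request-id"] = request_id
    return response

@app.get("/boom")
async def boom():
    raise RuntimeError("something broke")  # the response still carries x-request-id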
Question about your code example
I don't love that this approach requires data scientists to know about tracing when they write their services (we mostly have them own their own services).
It is cool that you can get the trace ID like this though!
In this example: would the trace ID make it into the response headers if the request failed due to a runtime error?
Also, if an error happens during deserialization of the parameters, I'm assuming we wouldn't get the trace ID and, correspondingly, couldn't look up the error logs to diagnose that that's what happened.