# troubleshooting
Hey, what would be the correct approach to track async call duration in a Flink application? We're using Prometheus to observe cluster metrics. I found that Flink exposes metrics (https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/ops/metrics/) and that it's possible to register my own, but I'm not entirely sure how histograms work. Normally I'd declare latency buckets for a histogram, but the docs suggest using Codahale/Dropwizard histograms, which seem to work slightly differently. I'm looking for something similar to what I'm doing in Rust, e.g.:
use prometheus::{register_histogram_vec, HistogramVec};

/// HTTP request latency histogram buckets
const HTTP_REQUESTS_HISTOGRAM_BUCKETS_SECONDS: &[f64; 18] = &[
    0.010, 0.020, 0.030, 0.050, 0.060, 0.070, 0.080, 0.090, //
    0.100, 0.150, 0.200, 0.250, 0.300, 0.350, 0.400, 0.450, //
    0.500, // Timeout limit
    0.600, // Catch all
];

lazy_static::lazy_static! {
    /// Tracks request histogram
    pub static ref HTTP_REQ_HISTOGRAM: HistogramVec = register_histogram_vec!(
        "http_request_duration_seconds",
        "The HTTP request latencies in seconds.",
        &[HANDLER_LABEL, PORTAL_LABEL, QUERY_LABEL],
        HTTP_REQUESTS_HISTOGRAM_BUCKETS_SECONDS.to_vec()
    )
    .expect("valid histogram vec metric");
}
🙏
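To make the "works slightly differently" part concrete, here is a minimal, stdlib-only sketch of the two styles as I understand them. It is an illustration only, not Flink's, Dropwizard's, or the `prometheus` crate's API: `BucketHistogram` and `WindowHistogram` are hypothetical names, and the sliding window is a simplified stand-in for a Dropwizard reservoir. The first type counts observations into fixed latency buckets, as in the Rust snippet above; the second keeps recent samples and computes quantiles when read.

```rust
use std::collections::VecDeque;

/// Bucket-style histogram (hypothetical type): fixed upper bounds, cumulative counts.
struct BucketHistogram {
    bounds: Vec<f64>, // upper bounds in seconds, ascending
    counts: Vec<u64>, // counts[i] = observations <= bounds[i]; last slot is the catch-all
    sum: f64,         // running sum of all observed values
}

impl BucketHistogram {
    fn new(bounds: &[f64]) -> Self {
        Self { bounds: bounds.to_vec(), counts: vec![0; bounds.len() + 1], sum: 0.0 }
    }

    fn observe(&mut self, v: f64) {
        self.sum += v;
        // Cumulative semantics: bump every bucket whose bound is >= v.
        for (i, b) in self.bounds.iter().enumerate() {
            if v <= *b {
                self.counts[i] += 1;
            }
        }
        // The catch-all bucket counts every observation.
        let last = self.counts.len() - 1;
        self.counts[last] += 1;
    }
}

/// Reservoir-style histogram (hypothetical type): keeps the last `capacity`
/// samples and derives quantiles from them on demand.
struct WindowHistogram {
    window: VecDeque<f64>,
    capacity: usize,
}

impl WindowHistogram {
    fn new(capacity: usize) -> Self {
        Self { window: VecDeque::with_capacity(capacity), capacity }
    }

    fn update(&mut self, v: f64) {
        // Evict the oldest sample once the window is full.
        if self.window.len() == self.capacity {
            self.window.pop_front();
        }
        self.window.push_back(v);
    }

    /// Nearest-rank style quantile over the current window; None if empty.
    fn quantile(&self, q: f64) -> Option<f64> {
        if self.window.is_empty() {
            return None;
        }
        let mut sorted: Vec<f64> = self.window.iter().copied().collect();
        sorted.sort_by(|a, b| a.total_cmp(b));
        let idx = ((sorted.len() - 1) as f64 * q.clamp(0.0, 1.0)).round() as usize;
        Some(sorted[idx])
    }
}

fn main() {
    let samples = [0.05, 0.2, 0.3, 0.7];

    let mut buckets = BucketHistogram::new(&[0.1, 0.25, 0.5]);
    let mut window = WindowHistogram::new(3);
    for v in samples {
        buckets.observe(v);
        window.update(v);
    }

    println!("bucket counts: {:?}, sum: {}", buckets.counts, buckets.sum);
    println!("window median: {:?}", window.quantile(0.5));
}
```

The practical difference, if this reading is right: bucket counts can be aggregated across instances and turned into quantiles at query time, whereas window quantiles are computed inside the process and only reflect the samples still in the window.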