What do fellow SaaS people use for metrics? I’ve been pondering various metrics tiers, like:
1. ‘Throw-away’ volatile metrics for your systems monitoring. They are usually short-lived unless you’re a metrics brat and willing to spend your time on federated retention with downsampling (i.e. keep 24h or even a week’s worth of raw data, then downsample the important series for historical purposes). This sounds like Prometheus territory.
2. Persistent (not throw-away) multi-tenant metrics for your business data and your clients’ business data. E.g. analytics or dashboard metrics that are important enough to persist, but not important enough to keep in your operational database. There are too many solutions here, but it can be summed up as ‘persistent Prometheus’.
3. Mission-critical data, e.g. counters and gauges you use for billing. This is important enough to use your operational database or TimescaleDB to get ACID guarantees. (I seem to be mixing general-use time series with metrics that don’t need granularity that fine, but there are still aggregate labels like your customer id, your customer’s project or entity category, or whatever other dimensions you might need, and those add up in overall volume.)
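To make the downsampling idea from tier 1 concrete, here is a rough sketch of what I mean: raw samples get collapsed into fixed-size buckets and only a few aggregates survive. The `downsample` function, the one-hour bucket size and the sample data are all made up for illustration; this is not how Prometheus or any particular long-term store does it internally.

```python
from collections import defaultdict
from statistics import mean


def downsample(samples, bucket_seconds=3600):
    """Collapse (unix_ts, value) samples into per-bucket min/max/avg rollups.

    Hypothetical helper: bucket size and aggregate choice are assumptions.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return {
        start: {
            "min": min(values),
            "max": max(values),
            "avg": mean(values),
            "count": len(values),
        }
        for start, values in sorted(buckets.items())
    }


# Four raw samples spread over two hours become two rollup rows.
raw = [(0, 1.0), (1800, 3.0), (3600, 10.0), (7100, 20.0)]
rollups = downsample(raw)
```

The raw points could then be dropped after the retention window, keeping only the rollups for historical dashboards.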
With all the trade-offs to choose from for long-term storage, there’s also the question of how you collect and ‘throw’ your metrics. Are your metrics important enough that you’d dedicate Kafka or Redis to receiving them? Are you willing to lose some due to an outage? If they are important, are you willing to return an error response to your consumer because your metrics pipeline failed? Or are you willing to increase your data path complexity and store your metrics in the same transactional scope as your main domain? Even if they’re then processed via CDC, that still adds to your complexity budget. I’m careful about considering ELT, because I can’t just add dbt to my Airflow and then have it run on Airbyte with Temporal that needs Cassandra. Not that those tools are bad, but I’d rather just insert metrics data into a Postgres table.
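For what it’s worth, here is roughly what I mean by ‘same transactional scope’. It’s a minimal sketch that uses Python’s built-in `sqlite3` as a stand-in for Postgres so it runs anywhere; the `orders` and `usage_events` tables, the `record_order` function and the metric name are all hypothetical. The point is only that the domain write and the billing metric either both commit or both roll back.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id TEXT NOT NULL,
        amount_cents INTEGER NOT NULL
    );
    CREATE TABLE usage_events (
        id INTEGER PRIMARY KEY,
        customer_id TEXT NOT NULL,
        metric TEXT NOT NULL,
        value INTEGER NOT NULL
    );
""")


def record_order(conn, customer_id, amount_cents, billable_units):
    """Write the domain row and its billing metric in one transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "INSERT INTO orders (customer_id, amount_cents) VALUES (?, ?)",
            (customer_id, amount_cents),
        )
        conn.execute(
            "INSERT INTO usage_events (customer_id, metric, value) VALUES (?, ?, ?)",
            (customer_id, "billable_units", billable_units),
        )


record_order(conn, "acme", 1500, 1)

# A failing metric insert (NULL value) takes the order down with it.
try:
    record_order(conn, "acme", 2500, None)
except sqlite3.IntegrityError:
    pass

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
usage_total = conn.execute("SELECT SUM(value) FROM usage_events").fetchone()[0]
```

The trade-off is the one above: the consumer sees an error when the metric write fails, in exchange for billing data that never drifts from the domain data.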
I would love to hear how people handle these needs.