# general
m
The Pulsar Metadata Store abstraction introduced in PIP 45 doesn't really provide a way to paginate, so this kind of change would probably be large. Just mentioning the feature in case someone is interested in working on it.
👀 1
k
@Michael Marshall With the current implementation, at roughly how many topics in a namespace do you expect to start running into problems?
m
I don't have any numbers. It depends on several factors, like the topic name length and the number of topics. My main point in posting is that without pagination it takes longer to get the topics, and that makes page loads more expensive than they would be if you could request only the content you want to display.
k
Not a perfect test, but for ~8000 topics with `topic-nn` names it's about 200 ms and 500 KB (not gzipped). Can’t say it’s too slow for that many topics, but of course it can be improved, as you said. :)
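A rough way to sanity-check the payload figure above is to serialize a generated topic list and measure it. This is only a sketch: the tenant and namespace in the prefix are made up, and the real response size depends on the fully qualified topic names the broker returns.

```python
import json


def topic_list_payload_bytes(topic_count: int, prefix: str) -> int:
    """Return the uncompressed size in bytes of a JSON array of topic names."""
    names = [f"{prefix}{i}" for i in range(topic_count)]
    return len(json.dumps(names).encode("utf-8"))


# Hypothetical prefix; only the topic count (~8000) comes from the thread.
size = topic_list_payload_bytes(8000, "persistent://my-tenant/my-namespace/topic-")
print(f"{size / 1024:.0f} KiB for 8000 topic names")
```

Longer tenant, namespace, or topic names scale the result roughly linearly, which is why name length was mentioned as a factor.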
m
Nice. I haven't looked at this in a while, but the other problem is likely when trying to get topic stats, which has a lot more data.
k
If we need to display the data without sorting, we can make a limited number of requests (10-20) to load stats only for the topics visible in the list. If we need sorting in the UI, it makes more sense to periodically poll Prometheus metrics from all brokers and preprocess them on the server side. Unfortunately, it looks like not all metrics can be calculated correctly this way, and I haven’t found a good solution yet. Actually, with the OTel PIP by @Asaf Mesika it could probably be implemented in a better way. I need to dive into it a bit. If someone could help with a few other questions, we could get a good Pulsar UI pretty fast. 😃
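The "load stats only for visible topics" idea could look roughly like the sketch below. The `fetch_stats` callable is a stand-in for whatever performs the per-topic stats request, and `load_visible_stats`, the worker limit, and the `msgRateIn` value in the demo are illustrative assumptions, not part of any existing UI.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def load_visible_stats(
    topics: List[str],
    first_visible: int,
    visible_count: int,
    fetch_stats: Callable[[str], dict],
    max_workers: int = 10,
) -> Dict[str, dict]:
    """Fetch stats only for the slice of topics currently shown in the list."""
    visible = topics[first_visible:first_visible + visible_count]
    if not visible:
        return {}
    # Bound the parallelism so one page load issues at most max_workers
    # concurrent requests.
    with ThreadPoolExecutor(max_workers=min(max_workers, len(visible))) as pool:
        results = list(pool.map(fetch_stats, visible))
    return dict(zip(visible, results))


def fake_fetch(topic: str) -> dict:
    """Stand-in for a real per-topic stats request."""
    return {"msgRateIn": float(len(topic))}


all_topics = [f"topic-{i}" for i in range(8000)]
page = load_visible_stats(all_topics, first_visible=40, visible_count=20,
                          fetch_stats=fake_fetch)
print(len(page))  # 20 topics, regardless of how many exist in the namespace
```

This only works while the list order does not depend on the stats themselves, which is the sorting limitation described above.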
a
For now, server-side pre-processing is the only way to go. One option is to add a batch call that gets certain metrics for many topics at once, so you get everything you need in one call.
k
@Asaf Mesika do you mean implementing a Pulsar broker admin API endpoint for this use case? If your intuition tells you that it may work fast enough and that such a contribution may be accepted, I can try. It should be a good Java and Pulsar codebase exercise for me.
a
For the batch stats - yes. The idea is that the topic stats can be requested not only for a single topic, and you can specify a filter, listing only the metrics you care about. I’m pretty sure all the people that scrape those to compose their own special dashboards would appreciate it.
This endpoint can delegate calls to other brokers for you.
👍 1
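To make the batch-with-filter idea concrete, here is a minimal sketch of the shape such a call might take. No such endpoint is confirmed to exist in the thread: `batch_topic_stats` and its parameters are hypothetical, and `fetch_stats` stands in for whatever returns the full stats of one topic.

```python
from typing import Callable, Dict, Iterable


def batch_topic_stats(
    topics: Iterable[str],
    metrics: Iterable[str],
    fetch_stats: Callable[[str], dict],
) -> Dict[str, dict]:
    """Return only the requested metrics for each requested topic."""
    wanted = set(metrics)
    result = {}
    for topic in topics:
        stats = fetch_stats(topic)
        # Drop every field the caller did not ask for to keep the response small.
        result[topic] = {name: value for name, value in stats.items()
                         if name in wanted}
    return result


# Stand-in data; the field names and values are placeholders for illustration.
store = {
    "topic-1": {"msgRateIn": 5.0, "msgRateOut": 4.0, "storageSize": 1024},
    "topic-2": {"msgRateIn": 0.5, "msgRateOut": 0.5, "storageSize": 2048},
}
summary = batch_topic_stats(["topic-1", "topic-2"], ["msgRateIn", "storageSize"],
                            store.__getitem__)
print(summary)
```

A real endpoint would also need to decide how to handle topics owned by other brokers, which is the delegation question raised in the thread.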
k
Ok. Actually great idea! Will take a look in a few days. Most likely will have some questions, be ready. 🙂
m
> This endpoint can delegate calls to other brokers for you.
I don't remember, do we do this already? This kind of operation seems like it could get expensive in large clusters.
k
Would a cluster of 100 brokers be considered large enough? I actually don’t see a problem with 100 parallel HTTP requests if the new endpoint isn’t called too frequently. I should probably run some tests on a smaller scale first to better understand the data volume and latency, then extrapolate to get approximate results.
a
I’ve done something like this in Go at work by hooking into both Prometheus and the `pulsarctl` libs. It’s not too difficult, but it would be a decent amount of work to flesh out all the calls and to add a caching layer and such. Another option is to use Prometheus for coarse stats and make API calls as necessary when drilling down.
👀 1
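The caching layer mentioned above could start as a small time-to-live cache in front of the drill-down calls. This is a sketch under assumptions: `TtlCache`, the 30-second TTL, and the `slow_stats` loader are illustrative and not taken from any existing tool.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Cache loader results per key and reload them after ttl_seconds."""

    def __init__(self, loader: Callable[[str], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> dict:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh, skip the expensive call
        value = self._loader(key)
        self._entries[key] = (now, value)
        return value


calls = []


def slow_stats(topic: str) -> dict:
    """Stand-in for an expensive per-topic admin API call."""
    calls.append(topic)
    return {"topic": topic, "fetch": len(calls)}


fake_now = [0.0]  # controllable clock so the demo does not need to sleep
cache = TtlCache(slow_stats, ttl_seconds=30.0, clock=lambda: fake_now[0])
cache.get("topic-1")
cache.get("topic-1")   # served from the cache
fake_now[0] = 31.0
cache.get("topic-1")   # entry expired, loader runs again
print(len(calls))      # prints 2
```

The injectable clock is only there to make the expiry behaviour easy to test; in normal use the default `time.monotonic` applies.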