This message was deleted pulsar #general

# general

Slackbot

05/23/2023, 5:04 PM

This message was deleted.

👀 1

👍 1

Michael Marshall

05/23/2023, 5:05 PM

The Pulsar Metadata Store abstraction introduced in PIP 45 doesn't really provide a way to paginate, so this kind of change would probably be large. Just mentioning the feature in case someone is interested in working on it.

👀 1

Kiryl Valkovich

05/23/2023, 5:27 PM

@Michael Marshall With the current implementation, at roughly how many topics in a namespace do you expect to start seeing problems?

Michael Marshall

05/23/2023, 5:34 PM

I don't have any numbers. It'll depend on several factors, like the topic name length and the number of topics. My main point in posting is that the lack of pagination can make it take longer to get the topics, and that makes page loads more expensive than they would be if you could request only the content you wanted to display.
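To illustrate the point about requesting only what is displayed, here is a minimal sketch of page-based slicing over a topic list. `paginate` is a hypothetical helper operating on an in-memory list; it is not part of Pulsar or the Metadata Store, which (as noted above) has no pagination support.

```python
from typing import List, Sequence, Tuple


def paginate(topics: Sequence[str], page: int, page_size: int) -> Tuple[List[str], int]:
    """Return one page of sorted topic names plus the total page count.

    Hypothetical helper for illustration only; pages are 1-based.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    ordered = sorted(topics)
    # Ceiling division, with at least one (possibly empty) page.
    total_pages = max(1, -(-len(ordered) // page_size))
    start = (page - 1) * page_size
    return ordered[start:start + page_size], total_pages


# Made-up topic names, zero-padded so that sorting is stable and predictable.
topics = [f"persistent://public/default/topic-{i:04d}" for i in range(8000)]
first_page, total_pages = paginate(topics, page=1, page_size=50)
last_page, _ = paginate(topics, page=total_pages, page_size=50)
```

With server-side support, only the 50 names on the requested page would cross the wire instead of all 8000.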

Kiryl Valkovich

05/23/2023, 5:54 PM

Not a perfect test, but for ~8000 topics with `topic-nn` names it's about 200 ms and 500 KB (not gzipped). I can't say it's too slow for that many topics, but of course it can be improved, as you said. :)

Screen Recording 2023-05-23 at 7.51.05 PM.mov

Michael Marshall

05/23/2023, 6:18 PM

Nice. I haven't looked at this in a while, but the other problem is likely when trying to get topic stats, which has a lot more data.

Kiryl Valkovich

05/23/2023, 6:37 PM

If we need to display the data without sorting, we can make a limited number of requests (10-20) to load stats only for the topics visible in the list. If we need sorting in the UI, it makes more sense to periodically poll Prometheus metrics from all brokers and preprocess them on the server side. Unfortunately, it looks like not all metrics can be calculated correctly this way, and I haven't found a good solution yet. Actually, with the OTel PIP by @Asaf Mesika it could probably be implemented in a better way. I need to dive into it a bit. If someone could help with a few other questions, we could get a good Pulsar UI pretty fast. 😃
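The "stats only for visible topics" idea could look roughly like the sketch below. `load_visible_stats` and `fake_fetch_stats` are hypothetical names; in a real UI backend the fetcher would call the broker admin API's per-topic stats endpoint, which is stubbed out here so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Sequence


def load_visible_stats(
    topics: Sequence[str],
    first_visible: int,
    visible_count: int,
    fetch_stats: Callable[[str], dict],
    max_workers: int = 10,
) -> Dict[str, dict]:
    """Fetch stats only for the slice of topics currently shown in the list."""
    visible = list(topics[first_visible:first_visible + visible_count])
    if not visible:
        return {}
    # Bounded concurrency: at most max_workers requests are in flight at once.
    with ThreadPoolExecutor(max_workers=min(max_workers, len(visible))) as pool:
        results = list(pool.map(fetch_stats, visible))
    return dict(zip(visible, results))


def fake_fetch_stats(topic: str) -> dict:
    """Stand-in for a real HTTP call (assumption: one request per topic)."""
    return {"topic": topic, "msgRateIn": 0.0}


all_topics = [f"topic-{i}" for i in range(8000)]
visible_stats = load_visible_stats(
    all_topics, first_visible=40, visible_count=20, fetch_stats=fake_fetch_stats
)
```

Only 20 requests are issued here regardless of how many topics the namespace holds, which is why this approach stops working once the UI has to sort by a stat.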

Asaf Mesika

05/23/2023, 6:53 PM

For now, server side pre-processing is the only way to go. One option is to add a batch call to get certain metrics for many topics at once, so everything you need in one call.

Kiryl Valkovich

05/23/2023, 7:01 PM

@Asaf Mesika you mean to implement a Pulsar broker admin API endpoint for this use case? If your intuition tells you that it may work fast enough and such a contribution may be accepted, I can try. Should be a good Java and Pulsar codebase exercise for me.

Asaf Mesika

05/23/2023, 7:03 PM

For the batch stats - yes. The idea is that the topic stats can be requested not only for a single topic, and you can specify a filter, listing only the metrics you care about. I’m pretty sure all the people that scrape those to compose their own special dashboards would appreciate it.
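A rough sketch of the request/response shape such a batch call might have: several topics in, only the listed metrics out. No such endpoint exists in this thread's context, so `batch_topic_stats` and its signature are assumptions, and the metric names in the sample data are only illustrative.

```python
from typing import Dict, Iterable, Mapping


def batch_topic_stats(
    stats_by_topic: Mapping[str, Mapping[str, float]],
    topics: Iterable[str],
    metrics: Iterable[str],
) -> Dict[str, Dict[str, float]]:
    """Return only the requested metrics for the requested topics.

    Hypothetical: stats_by_topic stands in for whatever the broker
    already knows about each topic.
    """
    wanted = set(metrics)
    result: Dict[str, Dict[str, float]] = {}
    for topic in topics:
        stats = stats_by_topic.get(topic)
        if stats is None:
            continue  # Unknown topics are skipped rather than failing the batch.
        result[topic] = {name: value for name, value in stats.items() if name in wanted}
    return result


# Illustrative sample data, not real broker output.
sample = {
    "topic-1": {"msgRateIn": 12.5, "msgRateOut": 10.0, "storageSize": 2048},
    "topic-2": {"msgRateIn": 0.0, "msgRateOut": 0.0, "storageSize": 0},
}
batch = batch_topic_stats(sample, ["topic-1", "topic-3"], ["msgRateIn", "storageSize"])
```

Filtering on the server keeps the response proportional to the metrics a dashboard actually uses instead of the full stats document per topic.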

Asaf Mesika

05/23/2023, 7:03 PM

This endpoint can delegate calls to other brokers for you.

👍 1

Kiryl Valkovich

05/23/2023, 7:05 PM

Ok. Actually great idea! Will take a look in a few days. Most likely will have some questions, be ready. 🙂

Michael Marshall

05/23/2023, 7:08 PM

> This endpoint can delegate calls to other brokers for you.

I don't remember, do we do this already? This kind of operation seems like it could get expensive in large clusters.

Kiryl Valkovich

05/23/2023, 7:26 PM

Would a cluster of 100 brokers be considered large enough? I don't actually see a problem with 100 parallel HTTP requests if the new endpoint isn't called too frequently. I should probably run some tests on a smaller scale first to better understand the data volume and latency, then multiply to get approximate results.
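The "measure small, then multiply" estimate amounts to a linear extrapolation. The sketch below uses the topic-list measurement from earlier in the thread (about 500 KB for about 8000 topics) purely as sample input; `extrapolate_payload` is a hypothetical helper, and linear scaling is an assumption that real stats payloads may not follow.

```python
def extrapolate_payload(sample_bytes: float, sample_topics: int, target_topics: int) -> float:
    """Linearly scale a measured payload size to a larger topic count."""
    if sample_topics <= 0:
        raise ValueError("sample_topics must be positive")
    bytes_per_topic = sample_bytes / sample_topics
    return bytes_per_topic * target_topics


# Sample input: ~500 KB for ~8000 topic names, as measured above.
estimated_bytes = extrapolate_payload(500_000, 8000, 100_000)
```

Latency is left out on purpose: with parallel requests it is bounded by the slowest broker rather than by the sum, so it does not scale the same way as payload size.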

Andy Walker

05/24/2023, 12:58 PM

I’ve done some work on this in Go at my job by hooking into both Prometheus and the `pulsarctl` libs. It’s not too difficult, but it would be a decent amount of work to flesh out all the calls and to add a caching layer and such. Another option is to use Prometheus for coarse stats, making API calls as necessary when drilling down.

👀 1
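A caching layer of the kind mentioned above could be as small as a time-to-live cache in front of the expensive call. This is a generic sketch, not the Go implementation described in the message; `TTLCache` and the loader are hypothetical, and the clock is injectable so the behaviour can be checked without sleeping.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache loader results per key and reload them after ttl_seconds."""

    def __init__(
        self,
        loader: Callable[[Hashable], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # Still fresh: no call to the loader.
        value = self._loader(key)
        self._entries[key] = (now, value)
        return value


calls = []
fake_now = [0.0]


def load_stats(topic: str) -> dict:
    """Stand-in for an admin API or Prometheus query (assumption)."""
    calls.append(topic)
    return {"topic": topic, "load_number": len(calls)}


cache = TTLCache(load_stats, ttl_seconds=30, clock=lambda: fake_now[0])
first = cache.get("topic-1")   # miss: calls the loader
second = cache.get("topic-1")  # hit: served from the cache
fake_now[0] = 31.0
third = cache.get("topic-1")   # expired: calls the loader again
```

A short TTL keeps coarse list views cheap, while drill-down views can bypass the cache and call the API directly.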
