# troubleshooting
s
u
Like, is there any manager for Kafka offsets?
v
do you mean the kafka supervisor? you should be able to see it on the ingestion tab of the web console. click on whatever supervisor you want and select details
u
Ah thanks! I'm just curious how Druid manages offsets internally at the code level. So I'm looking at the Druid source code and wondering if there is a Kafka offset manager or something that handles it. Like, when a new Peon task starts, how does it know which offset to start from?
s
The offset is stored in the druid_dataSources table.
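(For reference, a minimal sketch, not from the thread, of inspecting that committed metadata, assuming a MySQL metadata store, the default druid_dataSources table, and a hypothetical datasource named my_datasource; the exact JSON shape of the payload depends on the Druid version.)
```python
# Hypothetical inspection script: read the Kafka offsets the supervisor last
# committed for a datasource. Assumes a MySQL metadata store and that
# commit_metadata_payload is a UTF-8 JSON blob; verify against your deployment.
import json
import pymysql  # assumed client library; any MySQL driver works

conn = pymysql.connect(host="localhost", user="druid", password="***", database="druid")
try:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT commit_metadata_payload FROM druid_dataSources WHERE dataSource = %s",
            ("my_datasource",),  # hypothetical datasource name
        )
        row = cur.fetchone()
        if row is None:
            print("no committed metadata for this datasource yet")
        else:
            # The payload is the serialized DataSourceMetadata; for Kafka it
            # contains the per-partition offsets the supervisor has committed.
            print(json.dumps(json.loads(row[0]), indent=2))
finally:
    conn.close()
```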
u
Oh I see, thanks @Saydul Bashar! Could you recommend which class is responsible for that? Thank you so much, I really want to understand how Druid manages offsets.
s
Hi @추호관 you are welcome. I am glad that I could help. I will have a look at the code; I haven't had to look into it for a while.
r
i don't think it will help to look at the table directly, because it's saved as binary. I don't think it's all in one class, but the main logic must be in the segment publisher, not in the supervisor, I think
it's maybe there, but I don't think it's 100% there. I think some parts of those classes are mixed with https://github.com/apache/druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/input/DruidInputSource.java
(not this file, but this 'tree')
u
Thanks, I will look into that! And if there are any updates, it would be a pleasure to hear about them.
d
It would be cool if users could manipulate the offsets from the UI, @Vadim
g
we need an API for that first 🙂
Maybe we can extend the `reset` API to allow people to provide specific offsets?
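(Sketch of that idea: the plain reset call below is the supervisor API that exists today, while the variant carrying explicit offsets is purely hypothetical, assuming an Overlord at localhost:8090 and a hypothetical supervisor id my_kafka_datasource.)
```python
# Sketch only. The plain reset endpoint below is a real Druid API; the
# offset-carrying call is hypothetical and just illustrates what an
# extended API could look like.
import requests  # assumed HTTP client

OVERLORD = "http://localhost:8090"        # assumed Overlord address
SUPERVISOR_ID = "my_kafka_datasource"     # hypothetical supervisor id

# Existing behaviour: clear the stored offsets so ingestion falls back to
# earliest/latest according to useEarliestOffset in the supervisor spec.
resp = requests.post(f"{OVERLORD}/druid/indexer/v1/supervisor/{SUPERVISOR_ID}/reset")
resp.raise_for_status()

# Hypothetical extension: reset to caller-provided offsets per partition.
# Neither this request body nor its handling exists in Druid as-is.
resp = requests.post(
    f"{OVERLORD}/druid/indexer/v1/supervisor/{SUPERVISOR_ID}/reset",
    json={"partitionOffsetMap": {"0": 1200, "1": 987}},
)
resp.raise_for_status()
```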
v
do people really think in terms of offsets? What is the use case here? Just curious
g
one (niche but perhaps meaningful) use case would be having a second druid cluster pick up ingestion exactly where a first druid cluster left off
another might be that you want some history but you don't want to start all the way from the beginning of the topic
so neither "earliest" nor "latest" is really what you want
u
I agree with that. I scale out and scale up Druid clusters sometimes (removing hardware, adding hardware, or moving to another cluster), so sometimes I want to move offsets manually. There is also the situation of switching Kafka ingestion over to Druid (like moving from Flink or Kafka Streams to Druid).
s
Moving offsets manually can be done in the db by updating the druid_dataSources table where the offset is stored. An API will be easier though.
u
Ahhh, I have seen it in the metadata storage (in my case MySQL)
s
Yea, MySQL is my favourite metadata storage as well. So you can just bump it up there if you want. I have done it in the past on production systems; works like a charm. If you know which tables hold the key information in the metadata store, backups, recovery, and moving offsets all become a lot easier.
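(A hedged sketch of "bumping it up" in the metadata store: rewrite the committed partition offsets for one datasource, assuming MySQL, the default table name, a suspended or terminated supervisor, and hypothetical names my_datasource and NEW_OFFSETS; the payload nesting and the role of the sha1 column vary by Druid version, so inspect your own payload first.)
```python
# Hedged sketch of editing committed offsets directly in the metadata store.
# Assumes MySQL, the default druid_dataSources table, and that the supervisor
# is suspended/terminated while you edit so it does not race with you.
# Test on a non-production cluster first.
import hashlib
import json
import pymysql  # assumed client library

DATASOURCE = "my_datasource"          # hypothetical datasource name
NEW_OFFSETS = {"0": 1200, "1": 987}   # hypothetical partition -> offset map

conn = pymysql.connect(host="localhost", user="druid", password="***", database="druid")
try:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT commit_metadata_payload FROM druid_dataSources WHERE dataSource = %s",
            (DATASOURCE,),
        )
        payload = json.loads(cur.fetchone()[0])
        # The exact nesting of the payload differs between Druid versions;
        # print it first and adjust this path accordingly.
        payload["partitions"]["partitionOffsetMap"] = NEW_OFFSETS
        new_bytes = json.dumps(payload).encode("utf-8")
        # druid_dataSources also carries a SHA-1 of the payload; keeping it
        # consistent with the edited payload is the safest assumption here.
        new_sha1 = hashlib.sha1(new_bytes).hexdigest().upper()
        cur.execute(
            "UPDATE druid_dataSources SET commit_metadata_payload = %s, "
            "commit_metadata_sha1 = %s WHERE dataSource = %s",
            (new_bytes, new_sha1, DATASOURCE),
        )
    conn.commit()
finally:
    conn.close()
```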
d
yup yup, it's in the db but an API would make life easier. Any time there's a “gap” in the stream, you want to manipulate the offsets yourself. Agree with what Gian said 100%. There are some situations where earliest and latest are not what users want.
s
Agreed