# troubleshooting
s
u
Like, is there any manager for Kafka offsets?
v
do you mean the kafka supervisor? you should be able to see it on the ingestion tab of the web console. click on whatever supervisor you want and select details
u
Ah thanks! I'm just curious how Druid manages offsets internally at the code level. So I'm looking at the Druid source code and wondering if there is a Kafka offset manager or something that handles it. Like, when a new Peon task starts, how does it know which offset to start from?
s
The offset is stored in the druid_dataSources table.
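(For reference, a minimal sketch, not from the thread, of inspecting that committed metadata, assuming a MySQL metadata store, the default druid_dataSources table, and a hypothetical datasource named my_datasource; the exact JSON shape of the payload depends on the Druid version.)
```python
# Hypothetical inspection script: read the Kafka offsets the supervisor last
# committed for a datasource. Assumes a MySQL metadata store and that
# commit_metadata_payload is a UTF-8 JSON blob; verify against your deployment.
import json
import pymysql  # assumed client library; any MySQL driver works

conn = pymysql.connect(host="localhost", user="druid", password="***", database="druid")
try:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT commit_metadata_payload FROM druid_dataSources WHERE dataSource = %s",
            ("my_datasource",),  # hypothetical datasource name
        )
        row = cur.fetchone()
        if row is None:
            print("no committed metadata for this datasource yet")
        else:
            # The payload is the serialized DataSourceMetadata; for Kafka it
            # contains the per-partition offsets the supervisor has committed.
            print(json.dumps(json.loads(row[0]), indent=2))
finally:
    conn.close()
```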
u
Oh I see, thanks @Saydul Bashar! Could you recommend which class is responsible for that? Thank you so much, I really want to understand how Druid manages offsets.
s
Hi @추호관 you are welcome. I am glad that I could help. I will have a look at the code; I haven't had to look into it for a while.
r
i don't think it will help to look at the table directly, because it's saved as binary. I don't think it's all in one class, but the main logic must be in the segment publisher, not in the supervisor, I think
it's maybe there, but I don't think it's 100% there. I think some parts of those classes are mixed with https://github.com/apache/druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/input/DruidInputSource.java
(not this file, but this 'tree')
u
Thanks, I will look into that! And if there are any updates, it would be a pleasure to hear about them.
d
It would be cool if users could manipulate the offsets from the UI, @Vadim
g
we need an API for that first 🙂
Maybe we can extend the `reset` API to allow people to provide specific offsets?
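(Sketch of that idea: the plain reset call below is the supervisor API that exists today, while the variant carrying explicit offsets is purely hypothetical, assuming an Overlord at localhost:8090 and a hypothetical supervisor id my_kafka_datasource.)
```python
# Sketch only. The plain reset endpoint below is a real Druid API; the
# offset-carrying call is hypothetical and just illustrates what an
# extended API could look like.
import requests  # assumed HTTP client

OVERLORD = "http://localhost:8090"        # assumed Overlord address
SUPERVISOR_ID = "my_kafka_datasource"     # hypothetical supervisor id

# Existing behaviour: clear the stored offsets so ingestion falls back to
# earliest/latest according to useEarliestOffset in the supervisor spec.
resp = requests.post(f"{OVERLORD}/druid/indexer/v1/supervisor/{SUPERVISOR_ID}/reset")
resp.raise_for_status()

# Hypothetical extension: reset to caller-provided offsets per partition.
# Neither this request body nor its handling exists in Druid as-is.
resp = requests.post(
    f"{OVERLORD}/druid/indexer/v1/supervisor/{SUPERVISOR_ID}/reset",
    json={"partitionOffsetMap": {"0": 1200, "1": 987}},
)
resp.raise_for_status()
```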
v
do people really think in terms of offsets? What is the use case here? Just curious
g
one (niche but perhaps meaningful) use case would be having a second druid cluster pick up ingestion exactly where a first druid cluster left off
another might be that you want some history but you don't want to start all the way from the beginning of the topic
so neither "earliest" nor "latest" is really what you want
u
I agree with that. I scale out and scale up Druid clusters sometimes (removing hardware, adding hardware, or moving to another cluster), so sometimes I want to move offsets manually. There is also the situation of switching Kafka ingestion over to Druid (like moving from Flink or Kafka Streams to Druid).
s
Moving offsets manually can be done in the db by updating the druid_dataSources table where the offset is stored. An API will be easier though.
u
Ahhh, I have seen it in the metadata storage (in my case MySQL)
s
Yea, MySQL is my favourite metadata storage as well. So you can just bump it up there if you want. I have done it in the past on production systems; works like a charm. If you know which tables hold the key information in the metadata store, backups, recovery, and moving offsets all become a lot easier.
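(A hedged sketch of "bumping it up" in the metadata store: rewrite the committed partition offsets for one datasource, assuming MySQL, the default table name, a suspended or terminated supervisor, and hypothetical names my_datasource and NEW_OFFSETS; the payload nesting and the role of the sha1 column vary by Druid version, so inspect your own payload first.)
```python
# Hedged sketch of editing committed offsets directly in the metadata store.
# Assumes MySQL, the default druid_dataSources table, and that the supervisor
# is suspended/terminated while you edit so it does not race with you.
# Test on a non-production cluster first.
import hashlib
import json
import pymysql  # assumed client library

DATASOURCE = "my_datasource"          # hypothetical datasource name
NEW_OFFSETS = {"0": 1200, "1": 987}   # hypothetical partition -> offset map

conn = pymysql.connect(host="localhost", user="druid", password="***", database="druid")
try:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT commit_metadata_payload FROM druid_dataSources WHERE dataSource = %s",
            (DATASOURCE,),
        )
        payload = json.loads(cur.fetchone()[0])
        # The exact nesting of the payload differs between Druid versions;
        # print it first and adjust this path accordingly.
        payload["partitions"]["partitionOffsetMap"] = NEW_OFFSETS
        new_bytes = json.dumps(payload).encode("utf-8")
        # druid_dataSources also carries a SHA-1 of the payload; keeping it
        # consistent with the edited payload is the safest assumption here.
        new_sha1 = hashlib.sha1(new_bytes).hexdigest().upper()
        cur.execute(
            "UPDATE druid_dataSources SET commit_metadata_payload = %s, "
            "commit_metadata_sha1 = %s WHERE dataSource = %s",
            (new_bytes, new_sha1, DATASOURCE),
        )
    conn.commit()
finally:
    conn.close()
```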
d
yup yup, it's in the db but an API would make life easier. Any time there's a “gap” in the stream, you want to manipulate the offsets yourself. Agree with what Gian said 100%. There are some situations where earliest and latest are not what users want.
s
Agreed