https://pinot.apache.org/ logo
p

Pradeep

02/09/2021, 7:07 PM
Hi, I am having trouble getting the reload segments to work (I am running latest master code) Seeing below logs on pinot-server. Looks like
SegmentMessageHandlerFactory
is not getting registered for some reason. When I restart the server I don’t see Logs from this function beyond this point. (https://sourcegraph.com/github.com/apache/incubator-pinot/-/blob/pinot-server/src/ma[…]org/apache/pinot/server/starter/helix/HelixServerStarter.java)
Copy code
Subscribing changes listener to path: /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES, type: CALLBACK, listener: org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff
Subscribing child change listener to path:/PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES
Subscribing to path:/PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES took:0
21 START:INVOKE /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES listener:org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff type: CALLBACK
Resubscribe change listener to path: /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES, for listener: org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff, watchChild: false
Subscribing changes listener to path: /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES, type: CALLBACK, listener: org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff
Subscribing child change listener to path:/PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES
Subscribing to path:/PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES took:1
The latency of message dbb39d17-cece-4f3c-bd88-29fb61a93863 is 922470 ms
Fail to find message handler factory for type: USER_DEFINE_MSG msgId: dbb39d17-cece-4f3c-bd88-29fb61a93863
The latency of message 4c8dffc3-68cf-417e-8cd2-a572ce4cdcb4 is 42 ms
Fail to find message handler factory for type: USER_DEFINE_MSG msgId: 4c8dffc3-68cf-417e-8cd2-a572ce4cdcb4
21 END:INVOKE /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES listener:org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff type: CALLBACK Took: 7ms
x

Xiang Fu

02/09/2021, 8:56 PM
do you have more logs show any error logs
are you running on docker or baremetal
for reloading segments, do you mean you did call the controller reload endpoint or just restart server
p

Pradeep

02/09/2021, 8:57 PM
I didn't find any more error logs, next set of logs are segment state transition logs. I was trying controller endpoint then I tried restarting to see debug it
It's running on docker
x

Xiang Fu

02/09/2021, 8:58 PM
after you restart the pinot server, is ip address same?
can you check the idealstates for the table and see if segment assigned instances with same ip
p

Pradeep

02/09/2021, 8:59 PM
Oh ok let me check that and get back
x

Xiang Fu

02/09/2021, 9:03 PM
you can set this config in to pinot server config to use hostname not ip to register pinot:
Copy code
pinot.set.instance.id.to.hostname=true
p

Pradeep

02/09/2021, 9:08 PM
thanks, let me try t hat
So I currently have only one pinot server and all the segments are always attached to that. But I see how this is better
Copy code
pinot.set.instance.id.to.hostname=true
x

Xiang Fu

02/09/2021, 9:17 PM
got it, there should be a pinotServer.log file in your docker container
if possible you can enter into the container and check that file
p

Pradeep

02/09/2021, 9:24 PM
hmm, it’s empty We created our own docker image but it uses pinot-admin.sh
x

Xiang Fu

02/09/2021, 9:26 PM
can you check if this file is there :
/opt/pinot/conf/pinot-server-log4j2.xml
p

Pradeep

02/09/2021, 9:27 PM
ah no, let me fix that and try
x

Xiang Fu

02/09/2021, 9:27 PM
if so you can append
-Dlog4j2.configurationFile=/opt/pinot/conf/pinot-server-log4j2.xml
to the existing
JAVA_OPTS
(I think you must already configured it ) when you start up pinot server
p

Pradeep

02/09/2021, 9:31 PM
Copy code
-Dlog4j2.configurationFile=conf/pinot-admin-log4j2.xml
yup it’s setup
i see it as an option with which the process is running too
x

Xiang Fu

02/09/2021, 9:37 PM
ic
so
conf/pinot-admin-log4j2.xml
should tell where the log files are located
p

Pradeep

02/09/2021, 9:42 PM
yeah for some reason i don’t see it appending to the pinotServer.log but i see console outputs as part of
sudo docker logs <container_id>
x

Xiang Fu

02/09/2021, 9:47 PM
ic, that comes from pinot-admin-log4j2.xml
you can set console log to INFO
so it will print more info
p

Pradeep

02/09/2021, 9:52 PM
it’s already set to INFO, i do see the info logs
are you looking for any specific log entry?
x

Xiang Fu

02/09/2021, 9:54 PM
not really
cause you say segments are not reloading
so the first thing is to see if there is any error that prevent the reloading
if restart doesn’t work, then we need to check if idealstates matches the instance id
p

Pradeep

02/09/2021, 9:56 PM
so segments are in healthy state, I am trying to reload some indicies
and trying to do that using reload. Assignment is correct, all segments are healthy and I am able to query them
x

Xiang Fu

02/09/2021, 9:57 PM
ic
p

Pradeep

02/09/2021, 9:57 PM
Initialized with retryCount: 3, retryWaitMs: 100, retryDelayScaleFactor: 5 NoOpPinotCrypter not configured, adding as default Register state model factory for state model SegmentOnlineOfflineStateModel using factory name DEFAULT with org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory@6 bb2e932 Adding preconnect callback: org.apache.pinot.server.starter.helix.HelixServerStarter$$Lambda$142/653079016@1c09a18 Connecting Helix manager ClusterManager.connect() Starting ZkClient event thread. Client environment:zookeeper.version=3.4.11-37e277162d567b55a07d1755f0b31c32e93c01a0, built on 11/01/2017 18:06 GMT Client environment:host.name=<NA> Client environment:java.version=1.8.0_275 Client environment:java.vendor=Private Build Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre Client environmentjava.class.path=/opt/pinot/apache pinot incubating 0.7.0 SNAPSHOT bin/lib/pinot all 0.7.0 SNAPSHOT jar with dependencies.jarjmx_prometheus_javaagent-0.12.0.jar Client environmentjava.library.path=/usr/java/packages/lib/amd64/usr/lib/x86_64-linux-gnu/jni/lib/x86 64 linux gnu/usr/lib/x86_64-linux-gnu/usr/lib/jni/lib:/usr/lib Client environment:java.io.tmpdir=/tmp Client environment:java.compiler=<NA> Client environment:os.name=Linux Client environment:os.arch=amd64 Client environment:os.version=5.4.0-1030-aws Client environment:user.name=root Client environment:user.home=/root Client environment:user.dir=/opt/pinot/apache-pinot-incubating-0.7.0-SNAPSHOT-bin
x

Xiang Fu

02/09/2021, 9:57 PM
then there should be log entry with that segment id
p

Pradeep

02/09/2021, 9:58 PM
let me try that on a single segment
x

Xiang Fu

02/09/2021, 9:58 PM
if there is any indices, it should also tell it’s creating indices for that segment
yes
you can do /reload with force option
p

Pradeep

02/09/2021, 10:05 PM
nope I see this on the server side when I trigger a reload on a segment. How do I a force reload, this is triggered from controller endpoint
/segments/{tablename}/{segment}/reload
Copy code
Subscribing changes listener to path: /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES, type: CALLBACK, listener: org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff
Subscribing child change listener to path:/PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES
Subscribing to path:/PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES took:7
21 START:INVOKE /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES listener:org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff type: CALLBACK
Resubscribe change listener to path: /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES, for listener: org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff, watchChild:
false
Subscribing changes listener to path: /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES, type: CALLBACK, listener: org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff
Subscribing child change listener to path:/PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES
Consumed 0 events from (rate:0.0/s), currentOffset=4037, numRowsConsumedSoFar=0, numRowsIndexedSoFar=0
Subscribing to path:/PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES took:3
The latency of message 2334f0d8-e251-4f75-85f0-c4db72623b31 is 136141 ms
Fail to find message handler factory for type: USER_DEFINE_MSG msgId: 2334f0d8-e251-4f75-85f0-c4db72623b31
The latency of message f41b6769-5f43-4982-bd40-5c8e43dbd9ac is 60 ms
Fail to find message handler factory for type: USER_DEFINE_MSG msgId: f41b6769-5f43-4982-bd40-5c8e43dbd9ac
21 END:INVOKE /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES listener:org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff type: CALLBACK Took: 9ms
only thing i can think of is that SegmentMessageHandlerFactory not getting registered for some reason which is weird since I don’t see any exception/errors
x

Xiang Fu

02/09/2021, 10:07 PM
I don’t think that’s the issue
if it’s not registered then segment shouldn’t be loaded
so there is no segment got reloaded in server side ?
I think consuming segment cannot be reloaded
you can reload an online segment that is already persisted
p

Pradeep

02/09/2021, 10:09 PM
yeah this is an online segment
x

Xiang Fu

02/09/2021, 10:10 PM
ok
then there is no logs on server side for this segment?
p

Pradeep

02/09/2021, 10:10 PM
I see that segments offline to online is managed by SegmentOnlineOfflineStateModelFactory rite? where as SegmentMessageHandlerFactory seems to be handling reloading/refreshing messages
no I don’t see any logs on the server side for that segment when I did a reload
ok let me get back to debugging this a bit later, but let me know if you have any other ideas/suggestions. Thanks a lot for the help
x

Xiang Fu

02/09/2021, 10:12 PM
this SegmentMessageHandlerFactory is on pinot server side only
Copy code
// Register message handler factory
    SegmentMessageHandlerFactory messageHandlerFactory =
        new SegmentMessageHandlerFactory(fetcherAndLoader, instanceDataManager, serverMetrics);
    _helixManager.getMessagingService()
        .registerMessageHandlerFactory(Message.MessageType.USER_DEFINE_MSG.toString(), messageHandlerFactory);
this is registered always
p

Pradeep

02/09/2021, 10:15 PM
yeah that makes sense which is why I am a bit confused on what’s going on, anyways let me get back to this when I get some time in the evening
x

Xiang Fu

02/09/2021, 10:18 PM
sure, also check if restarting server makes segment reloading happen with new indexing
p

Pradeep

02/09/2021, 10:19 PM
yeah restarting server is creating appropriates indicies
x

Xiang Fu

02/09/2021, 10:24 PM
ok