Laxman Ch
08/03/2021, 10:01 PM
Kishore G
Laxman Ch
08/03/2021, 10:59 PM
Laxman Ch
08/03/2021, 11:01 PM
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(java.base@11.0.11/Native Method)
at java.net.SocketInputStream.socketRead(java.base@11.0.11/Unknown Source)
at java.net.SocketInputStream.read(java.base@11.0.11/Unknown Source)
at java.net.SocketInputStream.read(java.base@11.0.11/Unknown Source)
at shaded.org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
at shaded.org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
at shaded.org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
at shaded.org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
at shaded.org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at shaded.org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at shaded.org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at shaded.org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
at shaded.org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at shaded.org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at shaded.org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at shaded.org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at shaded.org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at shaded.org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at shaded.org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at shaded.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at shaded.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
at org.apache.pinot.common.utils.FileUploadDownloadClient.sendRequest(FileUploadDownloadClient.java:383)
at org.apache.pinot.common.utils.FileUploadDownloadClient.uploadSegmentMetadataFiles(FileUploadDownloadClient.java:508)
at org.apache.pinot.server.realtime.ServerSegmentCompletionProtocolHandler.sendCommitEndWithMetadataFiles(ServerSegmentCompletionProtocolHandler.java:231)
at org.apache.pinot.server.realtime.ServerSegmentCompletionProtocolHandler.segmentCommitEndWithMetadata(ServerSegmentCompletionProtocolHandler.java:138)
at org.apache.pinot.core.data.manager.realtime.SplitSegmentCommitter.commit(SplitSegmentCommitter.java:66)
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.commit(LLRealtimeSegmentDataManager.java:878)
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.commitSegment(LLRealtimeSegmentDataManager.java:848)
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:615)
at java.lang.Thread.run(java.base@11.0.11/Unknown Source)
"RMI TCP Connection(idle)" #83020 daemon prio=5 os_prio=0 cpu=3998.34ms elapsed=520.00s tid=0x00007f571c58b800 nid=0x1618c waiting on condition [0x00007f5581501000]
java.lang.Thread.State: TIMED_WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.11/Native Method)
- parking to wait for <0x0000000681a08ad8> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.11/Unknown Source)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.base@11.0.11/Unknown Source)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.base@11.0.11/Unknown Source)
at java.util.concurrent.SynchronousQueue.poll(java.base@11.0.11/Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.11/Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.11/Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.11/Unknown Source)
at java.lang.Thread.run(java.base@11.0.11/Unknown Source)
Laxman Ch
08/03/2021, 11:04 PM
"grizzly-http-server-29" #154 prio=5 os_prio=0 cpu=1273160.39ms elapsed=93474.47s tid=0x00007f651d4bf000 nid=0xe0 runnable [0x00007f6314fd3000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(java.base@11.0.11/Native Method)
at java.net.SocketInputStream.socketRead(java.base@11.0.11/Unknown Source)
at java.net.SocketInputStream.read(java.base@11.0.11/Unknown Source)
at java.net.SocketInputStream.read(java.base@11.0.11/Unknown Source)
at sun.security.ssl.SSLSocketInputRecord.read(java.base@11.0.11/Unknown Source)
at sun.security.ssl.SSLSocketInputRecord.readHeader(java.base@11.0.11/Unknown Source)
at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(java.base@11.0.11/Unknown Source)
at sun.security.ssl.SSLSocketImpl.readApplicationRecord(java.base@11.0.11/Unknown Source)
at sun.security.ssl.SSLSocketImpl$AppInputStream.read(java.base@11.0.11/Unknown Source)
at java.io.BufferedInputStream.fill(java.base@11.0.11/Unknown Source)
at java.io.BufferedInputStream.read1(java.base@11.0.11/Unknown Source)
at java.io.BufferedInputStream.read(java.base@11.0.11/Unknown Source)
- locked <0x00000007ffcddf98> (a java.io.BufferedInputStream)
at sun.net.www.http.HttpClient.parseHTTPHeader(java.base@11.0.11/Unknown Source)
at sun.net.www.http.HttpClient.parseHTTP(java.base@11.0.11/Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(java.base@11.0.11/Unknown Source)
- locked <0x00000007ffe82d30> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(java.base@11.0.11/Unknown Source)
- locked <0x00000007ffe82d30> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
at java.net.HttpURLConnection.getResponseCode(java.base@11.0.11/Unknown Source)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(java.base@11.0.11/Unknown Source)
at com.google.api.client.http.javanet.NetHttpResponse.<init>(NetHttpResponse.java:36)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:144)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:79)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:996)
at com.google.cloud.storage.spi.v1.HttpStorageRpc.write(HttpStorageRpc.java:753)
at com.google.cloud.storage.BlobWriteChannel$1.run(BlobWriteChannel.java:60)
at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.11/Unknown Source)
at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
at com.google.cloud.storage.BlobWriteChannel.flushBuffer(BlobWriteChannel.java:53)
at com.google.cloud.BaseWriteChannel.flush(BaseWriteChannel.java:112)
at com.google.cloud.BaseWriteChannel.write(BaseWriteChannel.java:139)
at org.apache.pinot.plugin.filesystem.GcsPinotFS.copyFromLocalFile(GcsPinotFS.java:353)
at org.apache.pinot.controller.api.resources.LLCSegmentCompletionHandlers.segmentUpload(LLCSegmentCompletionHandlers.java:367)
at jdk.internal.reflect.GeneratedMethodAccessor181.invoke(Unknown Source)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.11/Unknown Source)
at java.lang.reflect.Method.invoke(java.base@11.0.11/Unknown Source)
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$$Lambda$380/0x00000008405c8040.invoke(Unknown Source)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:124)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:167)
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:219)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:469)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:391)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:80)
at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:253)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:232)
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:679)
at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:353)
at org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:200)
at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:569)
at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:549)
at java.lang.Thread.run(java.base@11.0.11/Unknown Source)
Kishore G
Kishore G
Kishore G
Laxman Ch
08/04/2021, 12:22 AM
grizzly-http-server-*
This is a perf test environment, so the number of segments is high.
Subbu Subramaniam
08/04/2021, 12:27 AM
Laxman Ch
08/04/2021, 12:28 AM
grizzly-http-server-*
)
Number of Kafka partitions: 24
Number of segments: 8 segments per minute (1 segment per partition every 3 minutes)
Segment size before compression (Pinot server on-disk size): 700 MB to 1 GB
Segment size after compression (in deep store): 50 MB
Laxman Ch
08/04/2021, 12:37 AM
Have you tested the write latency of GCS?
Nope. I can give it a try. But from what I have observed so far, that’s not the issue.
How long does it take to write a segment?
Is there a Pinot metric for this? Average segment upload time?
What does your perf test do? Are you testing query performance or segment completion performance?
We are testing our product, not Pinot specifically. Our queries are infrequent and mostly cover the last hour.
Laxman Ch
08/04/2021, 8:03 PM
Kishore G
Kishore G
Kishore G
Kishore G
Kishore G
Kishore G
Laxman Ch
08/04/2021, 9:04 PM
Laxman Ch
08/04/2021, 9:05 PM
Laxman Ch
08/04/2021, 9:06 PM
Laxman Ch
08/04/2021, 9:09 PM
Number of Kafka partitions: 24
Number of segments: 8 segments per minute (1 segment per partition every 3 minutes * 24 partitions)
Segment size before compression (Pinot server on-disk size): 700 MB to 1 GB
Segment size after compression (in deep store): 50 MB
3 segments per minute * 50 MB per segment = 150 MB per minute = 2.5 MB/s = 20 Mbps
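The bandwidth estimate above uses 3 segments per minute, while the stated commit rate (24 partitions, one segment per partition every 3 minutes) works out to 8 segments per minute. A minimal sketch of the same arithmetic for both rates follows, assuming the 50 MB compressed segment size quoted in the thread. The class and method names are hypothetical and not part of Pinot.

```java
public class DeepStoreThroughput {
    // Sustained upload rate in megabytes per second for a given commit rate.
    public static double megabytesPerSecond(double segmentsPerMinute, double segmentSizeMb) {
        return segmentsPerMinute * segmentSizeMb / 60.0;
    }

    // Convert megabytes per second to megabits per second (1 byte = 8 bits).
    public static double megabitsPerSecond(double megabytesPerSecond) {
        return megabytesPerSecond * 8.0;
    }

    public static void main(String[] args) {
        double segmentSizeMb = 50.0; // compressed segment size quoted in the thread

        // 3.0 is the rate used in the estimate above; 24 / 3 = 8 is the rate
        // implied by 24 partitions each committing once every 3 minutes.
        double[] ratesPerMinute = {3.0, 24.0 / 3.0};

        for (double rate : ratesPerMinute) {
            double mbPerSec = megabytesPerSecond(rate, segmentSizeMb);
            System.out.printf("%.0f segments/min -> %.2f MB/s -> %.1f Mbps%n",
                    rate, mbPerSec, megabitsPerSecond(mbPerSec));
        }
    }
}
```

With these inputs the sustained rate is 20 Mbps at 3 segments per minute and roughly 53 Mbps at 8 segments per minute. These are averages and say nothing about bursts when several partitions commit at the same moment.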
Laxman Ch
08/04/2021, 9:11 PM
Laxman Ch
08/04/2021, 9:12 PM
Laxman Ch
08/04/2021, 9:17 PM
peer download policy further reduces overhead on the controller.
• Is my understanding correct?
• Is this peer download feature stable and recommended for production setups?
Kishore G
Kishore G
Laxman Ch
08/05/2021, 4:06 AM
Subbu Subramaniam
08/05/2021, 4:10 PM
Subbu Subramaniam
08/05/2021, 4:18 PM
grep for any segment name in the server logs.
Subbu Subramaniam
08/05/2021, 4:18 PM
Laxman Ch
08/05/2021, 9:49 PM
How many servers do you have?
12 servers
Laxman Ch
08/05/2021, 10:21 PM
controller-server bandwidth. Is it too low?
Controllers and servers are running in the same subnet (Google Cloud/k8s). I don’t see any throttling in the network. Thread dumps on the controller clearly indicate they are waiting on GCS, as posted in this thread.
Laxman Ch
08/05/2021, 10:26 PM
peer download policy) a few days ago.
The following snippet of code from org.apache.pinot.core.data.manager.realtime.SegmentCommitterFactory#createSegmentCommitter held me back from trying it. The upload timeout is hardcoded to 10 seconds in this policy. Not sure why we are assuming 10 seconds is sufficient for a segment of any size to be uploaded to any deep store.
Laxman Ch
08/05/2021, 10:26 PM
Subbu Subramaniam
08/05/2021, 10:34 PM
Subbu Subramaniam
08/05/2021, 10:34 PM
Laxman Ch
08/06/2021, 6:19 AM
Laxman Ch
08/06/2021, 6:21 AM
peer download policy but we are seeing high latencies and low throughput from controller to GCS in our setup. So I suspect the same happens on the Pinot server as well.
Subbu Subramaniam
08/06/2021, 3:53 PM