Hi, anyone facing issues with Segment purging with...
# troubleshooting
l
Hi, anyone facing issues with Segment purging with GCS as deep store.
We are facing the following issue
Copy code
2020/12/22 02:35:25.329 ERROR [SegmentDeletionManager] [pool-6-thread-1] Had trouble deleting directories: <gs://test-bucket-dev/pinot-data/Deleted_Segments>
java.io.IOException: java.io.IOException: java.net.URISyntaxException: Expected scheme-specific part at index 5: file:
	at org.apache.pinot.plugin.filesystem.GcsPinotFS.listFiles(GcsPinotFS.java:322) ~[pinot-gcs-0.6.0-shaded.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at org.apache.pinot.controller.helix.core.SegmentDeletionManager.removeAgedDeletedSegments(SegmentDeletionManager.java:233) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at org.apache.pinot.controller.helix.core.retention.RetentionManager.postprocess(RetentionManager.java:89) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.postprocess(ControllerPeriodicTask.java:131) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.processTables(ControllerPeriodicTask.java:97) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.runTask(ControllerPeriodicTask.java:68) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at org.apache.pinot.core.periodictask.BasePeriodicTask.run(BasePeriodicTask.java:120) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at org.apache.pinot.core.periodictask.PeriodicTaskScheduler.lambda$start$0(PeriodicTaskScheduler.java:73) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.io.IOException: java.net.URISyntaxException: Expected scheme-specific part at index 5: file:
	at org.apache.pinot.plugin.filesystem.GcsPinotFS.getBase(GcsPinotFS.java:135) ~[pinot-gcs-0.6.0-shaded.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at org.apache.pinot.plugin.filesystem.GcsPinotFS.normalizeToDirectoryPrefix(GcsPinotFS.java:105) ~[pinot-gcs-0.6.0-shaded.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at org.apache.pinot.plugin.filesystem.GcsPinotFS.listFiles(GcsPinotFS.java:307) ~[pinot-gcs-0.6.0-shaded.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	... 13 more
Caused by: java.net.URISyntaxException: Expected scheme-specific part at index 5: file:
	at java.net.URI$Parser.fail(URI.java:2913) ~[?:?]
	at java.net.URI$Parser.failExpecting(URI.java:2919) ~[?:?]
	at java.net.URI$Parser.parse(URI.java:3119) ~[?:?]
	at java.net.URI.<init>(URI.java:685) ~[?:?]
	at java.net.URI.<init>(URI.java:786) ~[?:?]
	at org.apache.pinot.plugin.filesystem.GcsPinotFS.getBase(GcsPinotFS.java:133) ~[pinot-gcs-0.6.0-shaded.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at org.apache.pinot.plugin.filesystem.GcsPinotFS.normalizeToDirectoryPrefix(GcsPinotFS.java:105) ~[pinot-gcs-0.6.0-shaded.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at org.apache.pinot.plugin.filesystem.GcsPinotFS.listFiles(GcsPinotFS.java:307) ~[pinot-gcs-0.6.0-shaded.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	... 13 more
Had gone through this codepath multiple times but I couldn’t see any issue with config or code.
d
The exceptions complains of a malformed URI
Copy code
java.net.URISyntaxException: Expected scheme-specific part at index 5: file:
Can you share your non sensitive gcs configuration?
l
Yeah. GCS path is also there in the above stacktrace.
Copy code
ERROR [SegmentDeletionManager] [pool-6-thread-1] Had trouble deleting directories: <gs://test-bucket-dev/pinot-data/Deleted_Segments>
Segments are getting uploaded up to gcs cleanly. Also, they are moved to Deleted_Segments too in gcs. But, they are not getting deleted from there after the retention period is over
controller config here (masked some data)
Copy code
root@pinot-controller-0:/opt/pinot# cat /var/pinot/controller/config/pinot-controller.conf
controller.helix.cluster.name=my-views
controller.port=9000
controller.zk.str=zookeeper.host.svc.cluster.local:2181/pinot
pinot.set.instance.id.to.hostname=true
controller.data.dir=<gs://test-bucket-dev/pinot-data>
controller.enable.split.commit=true
controller.local.temp.dir=/var/pinot/controller/data/temp
<http://pinot.controller.storage.factory.class.gs|pinot.controller.storage.factory.class.gs>=org.apache.pinot.plugin.filesystem.GcsPinotFS
pinot.controller.storage.factory.gs.projectId=XXXXXXXXX
pinot.controller.storage.factory.gs.gcpKey=XXXXXXXX.json
pinot.controller.segment.fetcher.protocols=file,http,gs
pinot.controller.segment.fetcher.gs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
x
@Elon have you seen similar issue or any resolution is recommended?
e
Checking, I think I know why, will update shortly
Working on a fix, will update shortly
thanks for catching this @Laxman Ch! - we are affected by this as well
x
Thanks Elon!
l
Thanks for the update @Elon. Curious to know where this bug is in the code. Also, is there workaround?
e
Happy new year everyone! I was on break but will begin to work on this today. cc @Laxman Ch
x
We had some findings @Laxman Ch may fill it up
image.png
image.png
so this tableNameURI should be
gs://
however they are actually using local file system
l
@Elon: I did some debugging on this issue last week. As posted in above screenshots, the bugs is in the implementation of org.apache.pinot.plugin.filesystem.GcsPinotFS#listFiles. GCS implementation of listFiles API is returning relative path. Whereas, other FS implementations (S3, Local) are returning full path including scheme.
We need to fix GcsPinotFS.listFiles and also other we need to fix the callers of this API that are explicitly expecting a relative path.
e
Yep, that's what I see to, working on the fix.
👍 1
Ok, I think I got it - only question: I have unit tests using our gcs file auth. Is there a projectid, bucket that we can use for pinot?
l
I was also having the same question when I noticed there are no unit tests for GcsPinotFS.
e
I have local unit tests but definitely cannot share our gcp key 🙂
l
com.google.cloud.storage.contrib.nio.testing.LocalStorageHelper
💡 1
d
I guess unit test should mock that. Integration tests however should have a chance to configure private gcp key and cover that integration scenario.
l
Check this once.
e
iirc that one doesn't allow for all the fs operations
nice find though! will take a look at that. Just testing the fix locally for now
l
yeah. s3 has unit test libraries which I used in past. I was searching for something similar for GCS too.
I came across this one.
e
Yep, s3 has better testing libraries. Looks like gcp wants $$ :)
l
As per my understanding LocalStorageHelper should be sufficient for our unit testing
Copy code
*   <li>Unsupported operations
 *       <ul>
 *         <li>bucket create
 *         <li>bucket get
 *         <li>bucket delete
 *         <li>list all buckets
 *         <li>generations
 *         <li>file attributes
 *         <li>patch
 *         <li>continueRewrite
 *         <li>createBatch
 *         <li>checksums, etags
 *         <li>IAM operations
 *       </ul>
👍 1
e
Yep, saw the same thing, I'll try it out - will see what required functionality is missing if anything.
👍 1
Either way, right now I'm working on the fix and testing with a real bucket - will create a pr and use that library.
https://github.com/apache/incubator-pinot/pull/6426 - will add the tests tomorrow, converting from using a real gcs bucket to local storage tests.
@Laxman Ch ^ when you have some time it would be great if you can take a look
@Laxman Ch - unit tests are done, pushing changes after work, busy today:)
l
@Elon: Change looks good to me. Can you please update the PR with latest changes. Also, I can verify these changes in our cluster.
e
Actually - we have it working on production also, it's good.
The issue is that in order to test I would have to change the code since everything is based on getBucket() but that is not supported by the local storage.
I think that should be done in a separate pr.
wdyt?
l
@Elon - I think its fine to add these tests as a followup PR. However, my vote is a non-binding vote. @Xiang Fu/@Jackie has some comments on the PR. I think we should clarify/fix and proceed with merge.
e
I updated yesterday and addressed the comments, if @Xiang Fu and @Jackie are ok with it I'm fine to merge. This has been running in production for more than a week with no issues.
👍 2