Slackbot
06/26/2023, 9:27 PM

Nick M
09/11/2023, 9:37 PM

Karan Kumar
09/12/2023, 12:54 AM

Karan Kumar
09/12/2023, 12:55 AM

Nick M
09/12/2023, 11:30 AM
2023-09-11T09:17:06,188 INFO [forking-task-runner-0] org.apache.druid.indexing.overlord.ForkingTaskRunner - Exception caught during execution
java.lang.NullPointerException: null
at org.apache.hadoop.io.erasurecode.CodecUtil.createRawEncoderWithFallback(CodecUtil.java:174) ~[?:?]
at org.apache.hadoop.io.erasurecode.CodecUtil.createRawEncoder(CodecUtil.java:131) ~[?:?]
at org.apache.hadoop.hdfs.DFSStripedOutputStream.<init>(DFSStripedOutputStream.java:312) ~[?:?]
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:315) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1271) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1250) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1232) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1170) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:569) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:566) ~[?:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:580) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:507) ~[?:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1233) ~[?:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1210) ~[?:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1091) ~[?:?]
at org.apache.druid.storage.hdfs.tasklog.HdfsTaskLogs.pushTaskFile(HdfsTaskLogs.java:92) ~[?:?]
at org.apache.druid.storage.hdfs.tasklog.HdfsTaskLogs.pushTaskLog(HdfsTaskLogs.java:65) ~[?:?]
at org.apache.druid.indexing.overlord.ForkingTaskRunner.waitForTaskProcessToComplete(ForkingTaskRunner.java:517) ~[druid-indexing-service-27.0.0.jar:27.0.0]
at org.apache.druid.indexing.overlord.ForkingTaskRunner$1.call(ForkingTaskRunner.java:404) ~[druid-indexing-service-27.0.0.jar:27.0.0]
at org.apache.druid.indexing.overlord.ForkingTaskRunner$1.call(ForkingTaskRunner.java:171) ~[druid-indexing-service-27.0.0.jar:27.0.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:829) ~[?:?]

Karan Kumar
09/13/2023, 5:47 AM
The codec implementations for Reed-Solomon and XOR can be configured with the following client and DataNode configuration keys:
- io.erasurecode.codec.rs.rawcoders for the default RS codec
- io.erasurecode.codec.rs-legacy.rawcoders for the legacy RS codec
- io.erasurecode.codec.xor.rawcoders for the XOR codec
You can also configure a self-defined codec with a key of the form io.erasurecode.codec.self-defined-codec.rawcoders. The value for each key is a list of coder names with a fallback mechanism: the coder factories are loaded in the order given by the configuration value until one loads successfully. All of these codecs have pure-Java implementations. The default RS and XOR codecs also have native implementations that leverage the Intel ISA-L library for better performance, and their default configuration prefers the native implementation over the pure-Java one. RS-LEGACY has no native implementation, so its default is pure Java only. Please refer to the "Enable Intel ISA-L" section for more detail.
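The fallback behavior described in that quote can be sketched in isolation. This is a simplified illustration, not Hadoop's actual API: the class names, the `Optional`-based factory interface, and the coder names here are all hypothetical.

```java
import java.util.List;
import java.util.Optional;

// Simplified sketch of HDFS's raw-coder fallback: candidate factories are
// tried in the configured order until one loads successfully.
public class CoderFallbackDemo {
    // A factory either yields a coder name or fails (empty), e.g. when the
    // native ISA-L library is not available on this node.
    public interface RawCoderFactory {
        Optional<String> create();
    }

    public static RawCoderFactory nativeCoder(boolean isalAvailable) {
        return () -> isalAvailable ? Optional.of("native-rs") : Optional.empty();
    }

    public static RawCoderFactory javaCoder() {
        return () -> Optional.of("java-rs");
    }

    // Walk the configured list and return the first coder that loads.
    // If every candidate fails, the result is null.
    public static String createWithFallback(List<RawCoderFactory> factories) {
        for (RawCoderFactory f : factories) {
            Optional<String> coder = f.create();
            if (coder.isPresent()) {
                return coder.get();
            }
        }
        return null; // nothing loaded -- a later dereference would NPE
    }

    public static void main(String[] args) {
        // Default-style config: prefer native, fall back to pure Java
        System.out.println(createWithFallback(List.of(nativeCoder(false), javaCoder())));
    }
}
```

If the configured coder list is broken or every candidate fails to load, the lookup ends up with null, which would be consistent with the NullPointerException coming out of CodecUtil.createRawEncoderWithFallback in the trace above (though the exact line that dereferences null inside CodecUtil is an assumption here).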
I suggest creating a simple HDFS client that pushes files to HDFS from the Druid node. That will help you debug further.

Piotr Dołowy
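A minimal probe along those lines might look like the following. It assumes the Hadoop client jars and the cluster's core-site.xml/hdfs-site.xml are on the classpath, and the target path is only an example; it writes a file via FileSystem.create, the same call HdfsTaskLogs.pushTaskFile uses, so the codec error can be reproduced outside Druid.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical standalone debug client: writes a small file to HDFS so the
// EC/raw-coder failure can be reproduced without going through Druid.
public class HdfsWriteProbe {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Example path; point this at the directory Druid writes task logs to
        Path target = new Path(args.length > 0 ? args[0] : "/druid/indexing-logs/probe.txt");

        try (FSDataOutputStream out = fs.create(target, true)) {
            out.write("hello from probe".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("wrote " + target);
    }
}
```

Running this against an erasure-coded directory on the node that fails should reproduce the same NullPointerException if the raw coder cannot be created there.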
09/19/2023, 9:46 AM
<https://github.com/apache/druid/blob/master/extensions-core/hdfs-storage/src/main/java/org/apache/druid/storage/hdfs/HdfsDataSegmentPusher.java>:
Path tmpIndexFile = new Path(StringUtils.format(
"%s/%s/%s/%s_index.zip",
fullyQualifiedStorageDirectory.get(),
segment.getDataSource(),
UUIDUtils.generateUuid(),
segment.getShardSpec().getPartitionNum()
));
but because you can't append to files with EC enabled, that was throwing an error.
My solution was simply to change the tmpIndexFile path to somewhere without EC. But there are probably more elegant solutions: overriding the EC policy for tmp files, or compressing on the local node first.
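For the "override the EC policy" option, HDFS lets you pin a directory to plain replication so files written under it are not erasure-coded. A sketch with the `hdfs ec` subcommands (the /druid paths are just examples; these commands need a running cluster, so treat this as an untested fragment):

```shell
# See which EC policy the directory Druid writes to inherits
hdfs ec -getPolicy -path /druid

# Force plain replication on a scratch directory so tmp files
# written there are not erasure-coded
hdfs ec -setPolicy -path /druid/tmp -replicate

# Or remove an explicitly set policy so the directory falls back
# to its parent's policy
hdfs ec -unsetPolicy -path /druid/tmp
```

That keeps the final segment/log files on the EC-backed directories while the intermediate writes go through a replicated path.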