Ken Krugler
01/27/2021, 9:24 PM
PinotFS.listFiles() implementations not returning the protocol with the path. So you get back /user/hadoop/blah, not hdfs:///user/hadoop/blah. When those paths get used later, without knowledge of the file system, then you run into problems. Does anyone know why listFiles() (and maybe other methods in a PinotFS implementation) don’t include the protocol?
Xiang Fu
Ken Krugler
01/28/2021, 12:38 AM
Xiang Fu
Ting Chen
01/28/2021, 12:49 AM
Ken Krugler
01/28/2021, 12:50 AM
Ting Chen
01/28/2021, 12:57 AM
if (_hadoopFS.exists(path)) {
  // _hadoopFS.listFiles(path, false) will not return directories as files, thus use listStatus(path) here.
  List<FileStatus> files = listStatus(path, recursive);
  for (FileStatus file : files) {
    filePathStrings.add(file.getPath().toUri().getRawPath());
  }
} else {
I think getRawPath() is the issue
Ting Chen
01/28/2021, 1:02 AM
Ken Krugler
01/28/2021, 1:03 AM
/user/hadoop/blah paths, without the scheme, so the mapper then doesn’t know how to process them (since in theory they could be file:///user/hadoop/blah paths)
Ken Krugler
01/28/2021, 1:05 AM
hdfs:///user/hadoop/blah, not the required hdfs://server.com:9000/user/hadoop/blah
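The getRawPath() behavior discussed above can be reproduced with plain java.net.URI, without any Hadoop or Pinot dependency. The following is only a minimal sketch: the class name RawPathDemo and the helper qualify() are hypothetical, not part of PinotFS, and it assumes the caller still has the URI of the directory that was listed, so the scheme and authority can be borrowed from it.

```java
import java.net.URI;

public class RawPathDemo {

    /**
     * Hypothetical helper (not a PinotFS method): re-attaches the scheme and
     * authority of the listed directory to a scheme-less absolute path.
     */
    static URI qualify(URI listedDir, String rawPath) {
        // resolve() keeps the base URI's scheme and authority when the
        // argument is an absolute path with no scheme of its own.
        return listedDir.resolve(rawPath);
    }

    public static void main(String[] args) {
        URI full = URI.create("hdfs://server.com:9000/user/hadoop/blah");

        // getRawPath() returns only the path component, so the scheme and
        // the host:port authority are dropped.
        String raw = full.getRawPath();
        System.out.println(raw); // prints /user/hadoop/blah

        // Borrow scheme and authority from the directory that was listed.
        URI listedDir = URI.create("hdfs://server.com:9000/user/hadoop");
        System.out.println(qualify(listedDir, raw));
        // prints hdfs://server.com:9000/user/hadoop/blah
    }
}
```

This only helps when the listed directory URI carries a full authority. If the caller passed hdfs:///user/hadoop, there is no server.com:9000 to borrow, which matches the second problem described in the thread.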
Ken Krugler
01/28/2021, 1:06 AM
file:///user/blah, or just /user/blah?
Ting Chen
01/28/2021, 1:43 AM
Ting Chen
01/28/2021, 1:43 AM
/**
 * Lists all the files and directories at the location provided.
 * Lists recursively if {@code recursive} is set to true.
 * Throws IOException if this abstract pathname is not valid, or if an I/O error occurs.
 * @param fileUri location of file
 * @param recursive if we want to list files recursively
 * @return an array of strings that contains file paths
 * @throws IOException on IO failure. See specific implementation
 */
public abstract String[] listFiles(URI fileUri, boolean recursive)
    throws IOException;
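The Javadoc above says only that the method returns "file paths", which leaves open whether the scheme is included. As an illustration of the alternative raised in the thread, here is a rough sketch of a listing method with the same signature that returns fully qualified URIs. It is not Pinot's implementation: the class LocalListingSketch is hypothetical, it uses only java.nio, and it assumes fileUri is a file: URI pointing at an existing local directory.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class LocalListingSketch {

    /**
     * Same shape as the listFiles signature quoted above, but every returned
     * string keeps its scheme (file:) instead of being a bare path.
     */
    public static String[] listFiles(URI fileUri, boolean recursive) throws IOException {
        Path root = Paths.get(fileUri); // assumes a file: URI
        // Files.walk descends into subdirectories, Files.list does not.
        try (Stream<Path> entries = recursive ? Files.walk(root) : Files.list(root)) {
            return entries
                .filter(p -> !p.equals(root))   // Files.walk also yields the root itself
                .map(p -> p.toUri().toString()) // toUri() keeps the file: scheme
                .toArray(String[]::new);
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small throwaway tree: a.txt, sub/, sub/b.txt
        Path dir = Files.createTempDirectory("listing-sketch");
        Files.createFile(dir.resolve("a.txt"));
        Files.createDirectory(dir.resolve("sub"));
        Files.createFile(dir.resolve("sub").resolve("b.txt"));

        for (String entry : listFiles(dir.toUri(), true)) {
            System.out.println(entry);
        }
    }
}
```

Returning qualified URIs means a downstream consumer, such as the mapper mentioned earlier, does not have to guess which file system a path belongs to.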
Ting Chen
01/28/2021, 1:44 AM
Ting Chen
01/28/2021, 1:46 AM
Ken Krugler
01/29/2021, 11:55 PM
Ting Chen
01/30/2021, 12:44 AM