hi i try to use hdfs with pinot follow document -...
# general
b
Hi, I'm trying to use HDFS with Pinot following this document --> https://docs.pinot.apache.org/basics/getting-started/hdfs-as-deepstorage but it is not working.

###my config##
####controller config##
pinot.service.role=CONTROLLER
pinot.cluster.name=pinot-uat
controller.host=pinot-uat01
controller.data.dir=hdfs://path/in/hdfs/for/controller/segment
controller.local.temp.dir=/tmp/pinot/data/controller
controller.zk.str=172.19.131.116:2181,172.19.131.117:2181,172.19.131.118:2181
controller.enable.split.commit=true
controller.access.protocols.http.port=9000
controller.helix.cluster.name=pinot-uat
pinot.controller.storage.factory.class.hdfs=org.apache.pinot.plugin.filesystem.HadoopPinotFS
pinot.controller.storage.factory.hdfs.hadoop.conf.path=/etc/hadoop/conf
pinot.controller.segment.fetcher.protocols=file,http,hdfs
pinot.controller.segment.fetcher.hdfs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
pinot.controller.segment.fetcher.hdfs.hadoop.kerberos.principle=hdptest@TRUE.CARE
pinot.controller.segment.fetcher.hdfs.hadoop.kerberos.keytab=/data/apache-pinot/keytab/hdptest.keytab
controller.vip.host=pinotuat.true.care
controller.vip.port=9000
controller.port=9000
pinot.set.instance.id.to.hostname=true
pinot.server.grpc.enable=true

########Executable
export HADOOP_HOME=/usr/lib/hadoop
export HADOOP_VERSION=2.6.0-cdh5.16.2
export HADOOP_GUAVA_VERSION=11.0.2
export HADOOP_GSON_VERSION=2.2.4
export GC_LOG_LOCATION=/data/apache-pinot/logs/
export PINOT_VERSION=0.8.0
export PINOT_DISTRIBUTION_DIR=/data/apache-pinot
export SERVER_CONF_DIR=/data/apache-pinot/conf
export ZOOKEEPER_ADDRESS=172.19.131.116:2181,172.19.131.117:2181,172.19.131.118:2181
export CLASSPATH_PREFIX="${HADOOP_HOME}/client/hadoop-hdfs-${HADOOP_VERSION}.jar:${HADOOP_HOME}/client/hadoop-annotations-${HADOOP_VERSION}.jar:${HADOOP_HOME}/client/hadoop-auth-${HADOOP_VERSION}.jar:${HADOOP_HOME}/client/hadoop-common-${HADOOP_VERSION}.jar:${HADOOP_HOME}/client/guava-${HADOOP_GUAVA_VERSION}.jar:${HADOOP_HOME}/client/gson-${HADOOP_GSON_VERSION}.jar"
export JAVA_OPTS="-Xms8G -Xmx12G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xloggc:${GC_LOG_LOCATION}/gc-pinot-controller.log"
${PINOT_DISTRIBUTION_DIR}/bin/start-controller.sh -configFileName ${SERVER_CONF_DIR}/pinot-controller.conf

###########error log####
2022/03/10 12:06:41.771 INFO [StartControllerCommand] [main] Executing command: StartController -configFileName /data/apache-pinot/conf/pinot-controller.conf
2022/03/10 12:06:41.843 INFO [StartServiceManagerCommand] [main] Executing command: StartServiceManager -clusterName pinot-uat -zkAddress 172.19.131.116:2181,172.19.131.117:2181,172.19.131.118:2181 -port -1 -bootstrapServices []
2022/03/10 12:06:41.843 INFO [StartServiceManagerCommand] [main] Starting a Pinot [SERVICE_MANAGER] at 0.012s since launch
2022/03/10 12:06:41.847 INFO [StartServiceManagerCommand] [main] Started Pinot [SERVICE_MANAGER] instance [ServiceManager_poc-pinot01_-1] at 0.016s since launch
2022/03/10 12:06:41.848 INFO [StartServiceManagerCommand] [main] Starting a Pinot [CONTROLLER] at 0.016s since launch
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/usr/lib/hadoop/hadoop-auth-2.6.0-cdh5.16.2.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/htrace/core/Tracer$Builder
    at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2803)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2853)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2835)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:186)
    at org.apache.pinot.plugin.filesystem.HadoopPinotFS.init(HadoopPinotFS.java:65)
    at org.apache.pinot.spi.filesystem.PinotFSFactory.register(PinotFSFactory.java:52)
    at org.apache.pinot.spi.filesystem.PinotFSFactory.init(PinotFSFactory.java:72)
    at org.apache.pinot.controller.BaseControllerStarter.initPinotFSFactory(BaseControllerStarter.java:518)
    at org.apache.pinot.controller.BaseControllerStarter.setUpPinotController(BaseControllerStarter.java:358)
    at org.apache.pinot.controller.BaseControllerStarter.start(BaseControllerStarter.java:308)
    at org.apache.pinot.tools.service.PinotServiceManager.startController(PinotServiceManager.java:123)
    at org.apache.pinot.tools.service.PinotServiceManager.startRole(PinotServiceManager.java:93)
    at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.lambda$startBootstrapServices$0(StartServiceManagerCommand.java:233)
    at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startPinotService(StartServiceManagerCommand.java:285)
    at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startBootstrapServices(StartServiceManagerCommand.java:232)
    at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.execute(StartServiceManagerCommand.java:182)
    at org.apache.pinot.tools.admin.command.StartControllerCommand.execute(StartControllerCommand.java:149)
    at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:166)
    at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:186)
    at org.apache.pinot.tools.admin.PinotController.main(PinotController.java:35)
Caused by: java.lang.ClassNotFoundException: org.apache.htrace.core.Tracer$Builder
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
m
What version of Hadoop and Java? Cc: @User
b
Hadoop: export HADOOP_VERSION=2.6.0-cdh5.16.2. Java: java-11-openjdk.x86_64 (I use Java 11 because Pinot 0.8.0 needs a newer version). I first tested with Pinot 0.9.3 and it failed too.
Do you have a recommendation for which version works with HDFS? (I'm currently testing against HDFS on CDH 5.16.2.)
For the Hadoop client I use the Cloudera client.
Please help.
x
Can you try to put the pinot-hdfs shaded jar into your classpath?
b
you mean export CLASSPATH_PREFIX=/data/apache-pinot/plugins/pinot-file-system/pinot-hdfs/pinot-hdfs-0.8.0-shaded.jar ?
[root@poc-pinot01 pinot-hdfs]# export CLASSPATH_PREFIX="${HADOOP_HOME}/client/hadoop-hdfs-${HADOOP_VERSION}.jar:${HADOOP_HOME}/client/hadoop-annotations-${HADOOP_VERSION}.jar:${HADOOP_HOME}/client/hadoop-auth-${HADOOP_VERSION}.jar:${HADOOP_HOME}/client/hadoop-common-${HADOOP_VERSION}.jar:${HADOOP_HOME}/client/guava-${HADOOP_GUAVA_VERSION}.jar:${HADOOP_HOME}/client/gson-${HADOOP_GSON_VERSION}.jar:/data/apache-pinot/plugins/pinot-file-system/pinot-hdfs/pinot-hdfs-0.8.0-shaded.jar"
[root@poc-pinot01 pinot-hdfs]# ${PINOT_DISTRIBUTION_DIR}/bin/start-controller.sh -configFileName ${SERVER_CONF_DIR}/pinot-controller.conf
[0.001s][warning][gc] -Xloggc is deprecated. Will use -Xlog:gc:/data/apache-pinot/logs//gc-pinot-controller.log instead.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/apache-pinot/lib/pinot-all-0.8.0-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/apache-pinot/plugins/pinot-environment/pinot-azure/pinot-azure-0.8.0-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/apache-pinot/plugins/pinot-file-system/pinot-s3/pinot-s3-0.8.0-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/apache-pinot/plugins/pinot-input-format/pinot-parquet/pinot-parquet-0.8.0-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/apache-pinot/plugins/pinot-metrics/pinot-yammer/pinot-yammer-0.8.0-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2022/03/10 15:37:50.007 INFO [StartControllerCommand] [main] Executing command: StartController -configFileName /data/apache-pinot/conf/pinot-controller.conf
2022/03/10 15:37:50.086 INFO [StartServiceManagerCommand] [main] Executing command: StartServiceManager -clusterName pinot-uat -zkAddress 172.19.131.116:2181,172.19.131.117:2181,172.19.131.118:2181 -port -1 -bootstrapServices []
2022/03/10 15:37:50.087 INFO [StartServiceManagerCommand] [main] Starting a Pinot [SERVICE_MANAGER] at 0.013s since launch
2022/03/10 15:37:50.091 INFO [StartServiceManagerCommand] [main] Started Pinot [SERVICE_MANAGER] instance [ServiceManager_poc-pinot01_-1] at 0.017s since launch
2022/03/10 15:37:50.091 INFO [StartServiceManagerCommand] [main] Starting a Pinot [CONTROLLER] at 0.018s since launch
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/usr/lib/hadoop/hadoop-auth-2.6.0-cdh5.16.2.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/htrace/core/Tracer$Builder
    at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2803)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2853)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2835)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:186)
    at org.apache.pinot.plugin.filesystem.HadoopPinotFS.init(HadoopPinotFS.java:65)
    at org.apache.pinot.spi.filesystem.PinotFSFactory.register(PinotFSFactory.java:52)
    at org.apache.pinot.spi.filesystem.PinotFSFactory.init(PinotFSFactory.java:72)
    at org.apache.pinot.controller.BaseControllerStarter.initPinotFSFactory(BaseControllerStarter.java:518)
    at org.apache.pinot.controller.BaseControllerStarter.setUpPinotController(BaseControllerStarter.java:358)
    at org.apache.pinot.controller.BaseControllerStarter.start(BaseControllerStarter.java:308)
    at org.apache.pinot.tools.service.PinotServiceManager.startController(PinotServiceManager.java:123)
    at org.apache.pinot.tools.service.PinotServiceManager.startRole(PinotServiceManager.java:93)
    at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.lambda$startBootstrapServices$0(StartServiceManagerCommand.java:233)
    at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startPinotService(StartServiceManagerCommand.java:285)
    at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startBootstrapServices(StartServiceManagerCommand.java:232)
    at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.execute(StartServiceManagerCommand.java:182)
    at org.apache.pinot.tools.admin.command.StartControllerCommand.execute(StartControllerCommand.java:149)
    at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:166)
    at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:186)
    at org.apache.pinot.tools.admin.PinotController.main(PinotController.java:35)
Caused by: java.lang.ClassNotFoundException: org.apache.htrace.core.Tracer$Builder
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
    ... 23 more
#############
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: java.lang.RuntimeException: Could not initialize HadoopPinotFS
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.plugin.filesystem.HadoopPinotFS.init(HadoopPinotFS.java:69) ~[pinot-hdfs-0.8.0-shaded.jar:0.8.0-9a0f41bc24243ff74315723b0153b534c2596e30]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.spi.filesystem.PinotFSFactory.register(PinotFSFactory.java:52) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.spi.filesystem.PinotFSFactory.init(PinotFSFactory.java:72) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.controller.BaseControllerStarter.initPinotFSFactory(BaseControllerStarter.java:518) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.controller.BaseControllerStarter.setUpPinotController(BaseControllerStarter.java:358) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.controller.BaseControllerStarter.start(BaseControllerStarter.java:308) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.tools.service.PinotServiceManager.startController(PinotServiceManager.java:123) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.tools.service.PinotServiceManager.startRole(PinotServiceManager.java:93) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.lambda$startBootstrapServices$0(StartServiceManagerCommand.java:233) ~[pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startPinotService(StartServiceManagerCommand.java:285) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startBootstrapServices(StartServiceManagerCommand.java:232) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.execute(StartServiceManagerCommand.java:182) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.tools.admin.command.StartControllerCommand.execute(StartControllerCommand.java:149) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:166) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:186) [pinot-all-0.8.0-jar-with-dependencies.jar:0.8.0-c4ceff06d21fc1c1b88469a8dbae742a4b609808]
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: Caused by: java.io.IOException: No FileSystem for scheme: hdfs
Mar 10 16:36:34 poc-pinot01 pinot-admin.sh: at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2644) ~[pinot-orc-0.8.0-shaded.jar:0.8.0-9a0f41bc24243ff74315723b0153b534c2596e30]
error--> ERROR [PinotFSFactory] [main] Could not instantiate file system for class org.apache.pinot.plugin.filesystem.HadoopPinotFS with scheme hdfs
x
Hmm, interesting. It seems a lib is still missing.
Can you try to drop the htrace jar into your classpath as well?
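For reference, a minimal sketch of adding the htrace jar to the classpath prefix. It assumes the missing class `org.apache.htrace.core.Tracer$Builder` is provided by an `htrace-core4` jar somewhere under `HADOOP_HOME`; the jar name pattern, the `/usr/lib/hadoop` default, and the `find_htrace_jar` helper are illustrative assumptions, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Sketch: locate an htrace-core4 jar and append it to CLASSPATH_PREFIX.
# Assumption: the jar sits under HADOOP_HOME and matches htrace-core4*.jar.

# Print the first matching jar under the given directory (prints nothing if none).
# -L follows symlinks, since distribution "client" directories often hold symlinked jars.
find_htrace_jar() {
  find -L "$1" -name 'htrace-core4*.jar' -type f 2>/dev/null | sort | head -n 1
}

HADOOP_HOME="${HADOOP_HOME:-/usr/lib/hadoop}"
HTRACE_JAR="$(find_htrace_jar "${HADOOP_HOME}")"

if [ -n "${HTRACE_JAR}" ]; then
  # Keep whatever is already on the prefix and add the htrace jar at the end.
  export CLASSPATH_PREFIX="${CLASSPATH_PREFIX:+${CLASSPATH_PREFIX}:}${HTRACE_JAR}"
  echo "added ${HTRACE_JAR}"
else
  echo "no htrace-core4 jar found under ${HADOOP_HOME}" >&2
fi
```

Run this in the same shell session before start-controller.sh so the exported `CLASSPATH_PREFIX` is visible to the start script.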
b
In the Hadoop lib?
Mar 11 11:30:39 poc-pinot01 pinot-admin.sh: java.lang.RuntimeException: Could not initialize HadoopPinotFS
Mar 11 11:30:39 poc-pinot01 pinot-admin.sh: at org.apache.pinot.plugin.filesystem.HadoopPinotFS.init(HadoopPinotFS.java:70) ~[pinot-hdfs-0.9.3-shaded.jar:0.9.3-e23f213cf0d16b1e9e086174d734a4db868542cb]
It still fails; the log is above.
I'm still doing a Pinot POC. We don't have enough space to keep segments, so we need to use HDFS to store the Pinot data.
Now I'm back to testing with the newer version, 0.9.3. Still the same problem with HDFS. If I use NFS it works fine.
Mar 11 11:44:48 poc-pinot01 start-controller.sh: 2022/03/11 11:44:48.720 ERROR [PinotFSFactory] [main] Could not instantiate file system for class org.apache.pinot.plugin.filesystem.HadoopPinotFS with scheme hdfs
Mar 11 11:44:48 poc-pinot01 start-controller.sh: java.lang.RuntimeException: Could not initialize HadoopPinotFS
@User I added the htrace jar to my classpath. It still fails.
x
Hmm, still the same error?
Which Hadoop version are you using?
b
Hadoop 2.6.0-cdh5.16.2
@User Hadoop 2.6.0-cdh5.16.2, authenticating with Kerberos.
I use Cloudera.
x
Sure. Meanwhile, you can try the standalone job for segment ingestion.
b
How does Pinot normally keep segment data?
Is it huge? If it is very big, we need to keep it on HDFS.
Does the controller keep some of the data or not?
My problem now is that I cannot start the controller or server when using the HDFS plugin.
If you have an example or a link with more information than the Pinot documentation (I already configured it following the Pinot docs), please share it with me @User. Next week I will go back to NFS to POC another feature: a table with Kafka SASL_SSL.
please help 😄
thank you.
m
Pinot stores a copy of data on deep store (hdfs/s3/etc). But Pinot servers keep a local copy for serving.
HDFS-based segment generation is different from using HDFS as deep store; you don't have the above issue when using it as deep store.
b
So deep store doesn't need to be set in the controller config?
m
Yes, deep store needs to be configured on the controller and server. But that is separate from the segment generation job on HDFS.
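As a rough sketch of the server side, the snippet below writes the HDFS-related part of a server config. It assumes the server keys mirror the controller keys posted earlier in this thread with a `pinot.server` prefix, as the linked HDFS deep-storage page lists them; check the key names against the docs for your Pinot version. The output file name and the `/etc/hadoop/conf` path are placeholders, and Kerberos settings are left out.

```shell
#!/usr/bin/env bash
# Sketch: write the HDFS deep-store section of a Pinot server config.
# Assumption: server keys mirror the controller keys with a pinot.server prefix.
# The file location and the Hadoop conf path are placeholders.
SERVER_CONF="${SERVER_CONF:-${TMPDIR:-/tmp}/pinot-server-hdfs.conf}"

cat > "${SERVER_CONF}" <<'EOF'
pinot.server.storage.factory.class.hdfs=org.apache.pinot.plugin.filesystem.HadoopPinotFS
pinot.server.storage.factory.hdfs.hadoop.conf.path=/etc/hadoop/conf
pinot.server.segment.fetcher.protocols=file,http,hdfs
pinot.server.segment.fetcher.hdfs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
EOF

echo "wrote ${SERVER_CONF}"
```

These lines would be merged into the existing server config file rather than replacing it.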
b
Oh, so it means we don't need to use deep storage (HDFS) if we use the standalone job. How about a realtime table (Kafka)?
Where is the data stored if we configure a realtime table?
m
No, I am saying that they are unrelated to each other. Controller data is stored in deep store (for real-time also)
b
thank you.
Still cannot use HDFS 😞
Can Pinot create a realtime table with Kafka SASL_SSL? @User @User