# troubleshooting
  • s

    Sidd

    01/12/2021, 3:45 AM
    And that was leading to incorrect results. I had suggested disabling it.
  • s

    Sidd

    01/12/2021, 3:47 AM
    Also, you need to enclose the search string in quotes so that it is treated as a phrase.
  • s

    Sidd

    01/12/2021, 3:47 AM
    This was another issue with your queries as they were matching incorrect documents.
  • s

    Sidd

    01/12/2021, 3:48 AM
    If you don't use a phrase, each search string gets tokenized around the hyphens
  • s

    Sidd

    01/12/2021, 3:48 AM
    And it will be an OR-based term query
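To make the point about hyphens concrete, here is a minimal sketch of the difference between an OR-based term query and a phrase query. The tokenizer is a toy that splits on any non-alphanumeric character; it is an assumption for illustration and not the analyzer Pinot's text index actually uses. The function names and the sample documents are hypothetical.

```python
import re

def tokenize(text):
    """Toy tokenizer: split on non-alphanumeric characters and lowercase."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]

def or_term_match(doc, query):
    """True if ANY query token occurs in the document (OR-based term query)."""
    doc_tokens = set(tokenize(doc))
    return any(t in doc_tokens for t in tokenize(query))

def phrase_match(doc, query):
    """True only if the query tokens occur consecutively and in order."""
    d, q = tokenize(doc), tokenize(query)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

target = "0D82F520-62C8-9914-14B8-4C2331E54075"
# Hypothetical log lines: the first shares only the leading id fragment.
other_doc = "request id 0D82F520-AAAA-BBBB-CCCC-DDDDDDDDDDDD failed"
right_doc = "request id 0D82F520-62C8-9914-14B8-4C2331E54075 ok"

print(or_term_match(other_doc, target))  # True: one shared fragment is enough
print(phrase_match(other_doc, target))   # False: fragments are not all adjacent
print(phrase_match(right_doc, target))   # True
```

Without the quotes, the unrelated document matches because a single fragment of the id is enough; with the phrase form only the full id in order matches.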
  • m

    Matt

    01/12/2021, 3:48 AM
    That's correct, but this issue is different
  • m

    Matt

    01/12/2021, 3:48 AM
    I have the cache disabled and am also searching with quotes
  • m

    Matt

    01/12/2021, 3:48 AM
    like
  • m

    Matt

    01/12/2021, 3:49 AM
    select * from mytable where regexp_like(log, '\"0D82F520-62C8-9914-14B8-4C2331E54075\"')
  • m

    Matt

    01/12/2021, 3:50 AM
    The issue is that for some ids nothing is returned. It seems like they are not in the text index at all
  • m

    Matt

    01/12/2021, 3:43 PM
    After a bit more analysis, it looks like the query is fine; however, with the text index the results only start to appear after a while. It seems the text index is skipping segments with status CONSUMING/IN-PROGRESS.
  • m

    Matt

    01/12/2021, 3:44 PM
    Wondering whether this is a bug or I am missing some setting to enable near-real-time searches
  • k

    Kishore G

    01/12/2021, 4:09 PM
    That’s a bug
  • v

    vmarchaud

    01/12/2021, 4:28 PM
    Hey, quick question: we wrote our own plugin for realtime ingestion with Google Pub/Sub, and in our tests we always get one realtime segment per server, even though we configured 1 replica per partition (the stream is high level). Does anyone have an idea? Our ideal setup would be to have only one segment (so no replica)
  • t

    troywinter

    01/19/2021, 6:39 AM
    The directory really does exist in HDFS. I looked at the source code, and FileStatus seems to always return false for isDir? setPath doesn't have any effect on isDir
  • t

    troywinter

    01/19/2021, 6:39 AM
    Has anyone had experience setting up Pinot deep storage with HDFS? I have Pinot connected to HDFS, but it fails with an error saying "Data dir is not a directory". Below is the stack trace for this error:
  • n

    Neer Shay

    01/19/2021, 2:14 PM
    java.lang.RuntimeException: Caught exception during running - org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
    	at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:144) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.runIngestionJob(IngestionJobLauncher.java:113) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.execute(LaunchDataIngestionJobCommand.java:123) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:164) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:184) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    Caused by: java.lang.IllegalStateException: PinotFS for scheme: http has not been initialized
    	at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:518) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.spi.filesystem.PinotFSFactory.create(PinotFSFactory.java:80) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.run(SegmentGenerationJobRunner.java:125) ~[pinot-batch-ingestion-standalone-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:142) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	... 4 more
    Exception caught:
    java.lang.RuntimeException: Caught exception during running - org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
    	at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:144) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.runIngestionJob(IngestionJobLauncher.java:113) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.execute(LaunchDataIngestionJobCommand.java:123) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:164) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:184) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    Caused by: java.lang.IllegalStateException: PinotFS for scheme: http has not been initialized
    	at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:518) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.spi.filesystem.PinotFSFactory.create(PinotFSFactory.java:80) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.run(SegmentGenerationJobRunner.java:125) ~[pinot-batch-ingestion-standalone-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:142) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    	... 4 more
  • n

    Neer Shay

    01/19/2021, 2:14 PM
    Hi, I am having issues ingesting data and would appreciate some assistance. I have a K8s setup and the data I am trying to ingest is on a different machine in the cluster (no virtual volume unfortunately). Here is what my ingestion spec looks like:
    executionFrameworkSpec:
      name: 'standalone'
      segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
      segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
      segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
    jobType: SegmentCreationAndTarPush
    inputDirURI: 'http://my-machine-with-data/path-to-data/part-00016-f20fa20d-1c6e-49f0-ac93-6220ebc59bba.c000.snappy.parquet'
    includeFileNamePattern: 'glob:**/*.parquet'
    outputDirURI: '/tmp/pinot-quick-start/segments/'
    overwriteOutput: true
    pinotFSSpecs:
      - scheme: file
        className: org.apache.pinot.spi.filesystem.LocalPinotFS
    recordReaderSpec:
      dataFormat: 'parquet'
      className: 'org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReader'
      configClassName: 'org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReaderConfig'
    tableSpec:
      tableName: 'table_name'
      schemaURI: 'http://k8s.uri/tables/table_name/schema'
      tableConfigURI: 'http://k8s.uri/tables/table_name'
    pinotClusterSpecs:
      - controllerURI: 'http://k8s.uri'
    After trying to run the ingestion (./pinot-admin.sh LaunchDataIngestionJob -jobSpecFile ~/spec.yml), I get the following exception:
  • n

    Neer Shay

    01/19/2021, 2:14 PM
    Is it even possible to ingest via this setup? Thanks in advance for the help!
  • k

    Kishore G

    01/19/2021, 3:38 PM
    java.lang.IllegalStateException: PinotFS for scheme: http has not been initialized
  • k

    Kishore G

    01/19/2021, 3:39 PM
    inputDirURI: 'http://my-machine-with-data/path-to-data/part-00016-f20fa20d-1c6e-49f0-ac93-6220ebc59bba.c000.snappy.parquet'
  • k

    Kishore G

    01/19/2021, 3:39 PM
    can that file be accessed via http?
  • k

    Kishore G

    01/19/2021, 3:40 PM
    we don't have an httpFS
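As a rough illustration of why the job fails with "PinotFS for scheme: http has not been initialized", here is a toy scheme-to-filesystem registry. It is a sketch of the general pattern only: `FsRegistry`, its methods, and the `"local-fs"` placeholder are hypothetical names, not Pinot's `PinotFSFactory` API, and treating a scheme-less path as `file` is an assumption made for the example.

```python
from urllib.parse import urlparse

class FsRegistry:
    """Toy scheme -> filesystem registry (hypothetical, not Pinot's API)."""

    def __init__(self):
        self._by_scheme = {}

    def register(self, scheme, fs):
        self._by_scheme[scheme] = fs

    def create(self, uri):
        # Assumption for this sketch: a path without a scheme means local file.
        scheme = urlparse(uri).scheme or "file"
        if scheme not in self._by_scheme:
            raise LookupError(f"FS for scheme: {scheme} has not been initialized")
        return self._by_scheme[scheme]

registry = FsRegistry()
# Mirrors a job spec whose pinotFSSpecs lists only `scheme: file`.
registry.register("file", "local-fs")

print(registry.create("/tmp/pinot-quick-start/segments/"))  # local-fs
try:
    registry.create("http://my-machine-with-data/path-to-data/")
except LookupError as err:
    print(err)  # FS for scheme: http has not been initialized
```

The lookup is keyed by the URI scheme, so an `http://` input URI has nothing to resolve to when only `file` is registered.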
  • x

    Xiang Fu

    01/19/2021, 5:44 PM
    I think you can use wget to download the file to a local directory, point the input URI to that local directory, and then delete the raw file afterwards @Neer Shay
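The download step can also be scripted instead of running wget by hand. Below is a minimal sketch using only the Python standard library; `fetch_to_local_dir` is a hypothetical helper name, and the URL and directory in the usage note are placeholders, not values confirmed by the thread.

```python
import os
import shutil
import urllib.parse
import urllib.request

def fetch_to_local_dir(url, local_dir):
    """Download `url` into `local_dir` and return the path of the local copy."""
    os.makedirs(local_dir, exist_ok=True)
    # Use the last path component of the URL as the local file name.
    name = os.path.basename(urllib.parse.urlparse(url).path)
    if not name:
        raise ValueError(f"URL has no file name: {url}")
    dest = os.path.join(local_dir, name)
    # Stream the response to disk so large files are not held in memory.
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

Usage would be along the lines of `path = fetch_to_local_dir("http://my-machine-with-data/path-to-data/<file>.parquet", "/tmp/pinot-input")`, then setting `inputDirURI` to the local directory in the job spec, running `LaunchDataIngestionJob`, and finally calling `os.remove(path)` to delete the raw file.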
  • k

    Kishore G

    01/19/2021, 5:45 PM
    We should probably have an httpFS implementation
  • x

    Xiang Fu

    01/19/2021, 6:11 PM
    hmm, if we treat http as fs, then we may only implement the copy operation, which is probably fine
  • x

    Xiang Fu

    01/19/2021, 6:11 PM
    we can make it a default fs as well
  • k

    Kishore G

    01/19/2021, 6:12 PM
    Yes, copy to local is the only thing we can support for now
  • t

    troywinter

    01/20/2021, 3:49 AM
    WARNING: HK2 service reification failed for [javax.servlet.ServletConfig] with an exception:
    MultiException stack 1 of 2
    java.lang.NoSuchMethodException: Could not find a suitable constructor in javax.servlet.ServletConfig class.
    	at org.glassfish.jersey.inject.hk2.JerseyClassAnalyzer.getConstructor(JerseyClassAnalyzer.java:168)
    	at org.jvnet.hk2.internal.Utilities.getConstructor(Utilities.java:156)
    	at org.jvnet.hk2.internal.ClazzCreator.initialize(ClazzCreator.java:105)
    	at org.jvnet.hk2.internal.ClazzCreator.initialize(ClazzCreator.java:156)
    	at org.jvnet.hk2.internal.SystemDescriptor.internalReify(SystemDescriptor.java:716)
    	at org.jvnet.hk2.internal.SystemDescriptor.reify(SystemDescriptor.java:670)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.reifyDescriptor(ServiceLocatorImpl.java:441)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.narrow(ServiceLocatorImpl.java:2287)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.igdCacheCompute(ServiceLocatorImpl.java:1163)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.access$400(ServiceLocatorImpl.java:105)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl$8.compute(ServiceLocatorImpl.java:1157)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl$8.compute(ServiceLocatorImpl.java:1154)
    	at org.glassfish.hk2.utilities.cache.internal.WeakCARCacheImpl.compute(WeakCARCacheImpl.java:105)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.internalGetDescriptor(ServiceLocatorImpl.java:1237)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.internalGetInjecteeDescriptor(ServiceLocatorImpl.java:558)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.getInjecteeDescriptor(ServiceLocatorImpl.java:567)
    	at org.glassfish.jersey.inject.hk2.ContextInjectionResolverImpl.lambda$new$0(ContextInjectionResolverImpl.java:81)
    	at org.glassfish.jersey.internal.util.collection.Cache$OriginThreadAwareFuture.lambda$new$0(Cache.java:169)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    	at org.glassfish.jersey.internal.util.collection.Cache$OriginThreadAwareFuture.run(Cache.java:225)
    	at org.glassfish.jersey.internal.util.collection.Cache.apply(Cache.java:77)
    	at org.glassfish.jersey.inject.hk2.ContextInjectionResolverImpl.resolve(ContextInjectionResolverImpl.java:95)
    	at org.glassfish.jersey.inject.hk2.ContextInjectionResolverImpl.resolve(ContextInjectionResolverImpl.java:121)
    	at org.glassfish.jersey.server.internal.inject.DelegatedInjectionValueParamProvider.lambda$getValueProvider$0(DelegatedInjectionValueParamProvider.java:67)
    	at org.glassfish.jersey.server.spi.internal.ParamValueFactoryWithSource.apply(ParamValueFactoryWithSource.java:50)
    	at org.glassfish.jersey.server.spi.internal.ParameterValueHelper.getParameterValues(ParameterValueHelper.java:64)
    	at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$AbstractMethodParamInvoker.getParamValues(JavaResourceMethodDispatcherProvider.java:109)
    	at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:176)
    	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79)
    	at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:469)
    	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:391)
    	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:80)
    	at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:253)
    	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
    	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
    	at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
    	at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
    	at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
    	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
    	at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:232)
    	at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:679)
    	at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:353)
    	at org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:200)
    	at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:569)
    	at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:549)
    	at java.lang.Thread.run(Thread.java:748)
    MultiException stack 2 of 2
    java.lang.IllegalArgumentException: Errors were discovered while reifying SystemDescriptor(
    	implementation=javax.servlet.ServletConfig
    	contracts={javax.servlet.ServletConfig}
    	scope=org.glassfish.jersey.process.internal.RequestScoped
    	qualifiers={}
    	descriptorType=CLASS
    	descriptorVisibility=NORMAL
    	metadata=
    	rank=0
    	loader=null
    	proxiable=null
    	proxyForSameScope=null
    	analysisName=null
    	id=178
    	locatorId=0
    	identityHashCode=308687884
    	reified=false)
    	at org.jvnet.hk2.internal.SystemDescriptor.reify(SystemDescriptor.java:681)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.reifyDescriptor(ServiceLocatorImpl.java:441)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.narrow(ServiceLocatorImpl.java:2287)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.igdCacheCompute(ServiceLocatorImpl.java:1163)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.access$400(ServiceLocatorImpl.java:105)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl$8.compute(ServiceLocatorImpl.java:1157)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl$8.compute(ServiceLocatorImpl.java:1154)
    	at org.glassfish.hk2.utilities.cache.internal.WeakCARCacheImpl.compute(WeakCARCacheImpl.java:105)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.internalGetDescriptor(ServiceLocatorImpl.java:1237)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.internalGetInjecteeDescriptor(ServiceLocatorImpl.java:558)
    	at org.jvnet.hk2.internal.ServiceLocatorImpl.getInjecteeDescriptor(ServiceLocatorImpl.java:567)
    	at org.glassfish.jersey.inject.hk2.ContextInjectionResolverImpl.lambda$new$0(ContextInjectionResolverImpl.java:81)
    	at org.glassfish.jersey.internal.util.collection.Cache$OriginThreadAwareFuture.lambda$new$0(Cache.java:169)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    	at org.glassfish.jersey.internal.util.collection.Cache$OriginThreadAwareFuture.run(Cache.java:225)
    	at org.glassfish.jersey.internal.util.collection.Cache.apply(Cache.java:77)
    	at org.glassfish.jersey.inject.hk2.ContextInjectionResolverImpl.resolve(ContextInjectionResolverImpl.java:95)
    	at org.glassfish.jersey.inject.hk2.ContextInjectionResolverImpl.resolve(ContextInjectionResolverImpl.java:121)
    	at org.glassfish.jersey.server.internal.inject.DelegatedInjectionValueParamProvider.lambda$getValueProvider$0(DelegatedInjectionValueParamProvider.java:67)
    	at org.glassfish.jersey.server.spi.internal.ParamValueFactoryWithSource.apply(ParamValueFactoryWithSource.java:50)
    	at org.glassfish.jersey.server.spi.internal.ParameterValueHelper.getParameterValues(ParameterValueHelper.java:64)
    	at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$AbstractMethodParamInvoker.getParamValues(JavaResourceMethodDispatcherProvider.java:109)
    	at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:176)
    	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79)
    	at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:469)
    	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:391)
    	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:80)
    	at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:253)
    	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
    	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
    	at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
    	at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
    	at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
    	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
    	at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:232)
    	at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:679)
    	at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:353)
    	at org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:200)
    	at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:569)
    	at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:549)
    	at java.lang.Thread.run(Thread.java:748)