# troubleshooting
  • xtrntr
    08/16/2021, 2:15 AM
    I encounter this issue when trying the Spark ingestion:
    Exception in thread "main" java.lang.ExceptionInInitializerError
            at org.apache.spark.storage.BlockManagerMasterEndpoint.<init>(BlockManagerMasterEndpoint.scala:93)
            at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:370)
            at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:311)
            at org.apache.spark.SparkEnv$.create(SparkEnv.scala:359)
            at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:189)
            at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:272)
            at org.apache.spark.SparkContext.<init>(SparkContext.scala:448)
            at org.apache.spark.SparkContext.<init>(SparkContext.scala:125)
            at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2611)
            at org.apache.spark.SparkContext.getOrCreate(SparkContext.scala)
            at org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentGenerationJobRunner.run(SparkSegmentGenerationJobRunner.java:198)
            at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:142)
            at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.runIngestionJob(IngestionJobLauncher.java:113)
            at org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.execute(LaunchDataIngestionJobCommand.java:132)
            at org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.main(LaunchDataIngestionJobCommand.java:67)
            at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
            at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
            at java.base/java.lang.reflect.Method.invoke(Unknown Source)
            at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
            at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
            at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
            at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
            at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
            at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
            at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
            at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.NullPointerException
            at org.apache.commons.lang3.SystemUtils.isJavaVersionAtLeast(SystemUtils.java:1626)
            at org.apache.spark.storage.StorageUtils$.<init>(StorageUtils.scala:207)
            at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala)
            ... 27 more
  • Roberto Díaz
    08/16/2021, 10:11 PM
    hi!! I’m trying to add authentication to my pinot instance and it seems that after adding authentication I’m not able to perform queries from the controller UI because of a 403. Is there any way to add authentication using the UI?
  • Filip Gep
    08/17/2021, 2:10 PM
    Hey everyone! I’m trying to build a streaming architecture that consists of Kafka and Pinot. Here’s my `docker-compose`:
    services:
      web:
        image: apachepinot/pinot:latest
        command: QuickStart -type hybrid
        container_name: "pinot"
        volumes:
          - ./config:/config
        ports:
          - "9000:9000"  # Controller
          - "9001:8000"  # Broker
          - "9002:7000"  # Server
    Is kafka included as a part of this image? If so, how can I reach it (which port)? Finally, how can I create a topic and publish/subscribe to it? Thanks for your help!
  • Syed Akram
    08/17/2021, 4:28 PM
    and it shows proper metadata and other things
  • Syed Akram
    08/17/2021, 4:35 PM
    Table - gitHubEvents
  • Jai Patel
    08/18/2021, 12:38 AM
    We’re currently on Pinot 0.6 and we’re getting a ton of warnings when ingesting a new Pinot Upsert table. I haven’t seen this particular error before. Is there something I should be looking for?
    2021/08/18 00:35:45.093 WARN [LLRealtimeSegmentDataManager_enriched_station_orders_v1_14_rt_upsert_v2_1__7__213__20210818T0025Z] [enriched_customer_orders_v1_15_1_upsert__6__26__20210817T0050Z] Commit failed with response {"offset":-1,"streamPartitionMsgOffset":null,"status":"FAILED","isSplitCommitType":false,"buildTimeSec":-1}
    2021/08/18 00:35:46.258 WARN [LLRealtimeSegmentDataManager_enriched_station_orders_v1_14_rt_upsert_v2_1__7__213__20210818T0025Z] [enriched_station_orders_v1_14_rt_upsert_v2_1__0__219__20210816T2149Z] Commit failed with response {"offset":-1,"streamPartitionMsgOffset":null,"status":"FAILED","isSplitCommitType":false,"buildTimeSec":-1}
    2021/08/18 00:35:46.311 WARN [LLRealtimeSegmentDataManager_enriched_station_orders_v1_14_rt_upsert_v2_1__7__213__20210818T0025Z] [enriched_station_orders_v1_14_rt_upsert_v2_1__10__218__20210817T0010Z] Commit failed with response {"offset":-1,"streamPartitionMsgOffset":null,"status":"FAILED","isSplitCommitType":false,"buildTimeSec":-1}
    enriched_station_orders_v1_15_1_upsert is a new table, but it seems to be impacting older tables as well.
  • Charles
    08/18/2021, 8:19 AM
    Could not load content for webpack:///js/main.js (HTTP error: status code 404, net::ERR_UNKNOWN_URL_SCHEME)
  • eywek
    08/18/2021, 10:20 AM
    Hello, I’m having some bad performance with an order by statement. When I’m doing this query:
    select * from datasource_607ec7451360000300516e33 where REGEXP_LIKE(url, '^.*tv.*$') limit 10
    only 240 docs are scanned and I get a reply in 20ms. When I add an order by:
    select * from datasource_607ec7451360000300516e33 where REGEXP_LIKE(url, '^.*tv.*$') order by "timestamp" desc limit 10
    547,212 docs are scanned and I get a reply in >1.5s. Do you have any ideas/tips to improve this?
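    One knob that sometimes helps here (a sketch under the assumption that `timestamp` is the table's time column, which the thread does not confirm): declaring it as a sorted and/or range-indexed column in `tableIndexConfig`, so the engine can prune segments instead of examining every matching doc:

    ```json
    {
      "tableIndexConfig": {
        "sortedColumn": ["timestamp"],
        "rangeIndexColumns": ["timestamp"]
      }
    }
    ```

    The scan-count jump in the numbers above is expected: without the `order by`, Pinot can stop after collecting 10 matching docs; with it, every matching doc must be seen before the top 10 by `timestamp` can be chosen.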
  • Deepak Mishra
    08/19/2021, 7:10 AM
    Hi, I am testing the retention period in realtime as well as offline tables, with "retentionTimeValue": 1 and "retentionTimeUnit": "HOURS". In that case, data should be deleted from the table after 1 hour, but I can still see the data after 2 hours.
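    A likely explanation (hedged, not confirmed in this thread): retention is enforced by the controller's periodic retention task, not at the instant a segment expires, so segments can linger until the next run. If the default period is too coarse for testing, the controller config has a knob for it; verify the property name against your version:

    ```properties
    # Illustrative controller.conf snippet -- run the retention task hourly
    # instead of the default multi-hour period (property name is an assumption).
    controller.retention.frequencyInSeconds=3600
    ```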
  • Arpita Bajpai
    08/19/2021, 9:23 AM
    Hi Everyone, I am also trying the retention period in a realtime table, with a retention period of 1 hour, but I can still see the segments 1 hour after creation. In the screenshot below, the segment got created at 11:30 am today but it is still present.
  • Syed Akram
    08/19/2021, 12:07 PM
    Is there any option to reload segments in parallel?
  • Chris F
    08/19/2021, 2:12 PM
    Hello - it appears that Pinot uses an empty Kafka consumer group id (low level consumer). However, I think this is being deprecated on the Kafka side, as I see this in the Kafka log... will this be a problem?
    Support for using the empty group id by consumers is deprecated and will be removed in the next major release
  • Anusha
    08/19/2021, 6:43 PM
    Hi, I am trying to rotate the logs in Pinot. This is the log4j file which I am using, but the logs are not rotating. Could someone please help me out with it?
    log4j
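    Without seeing the attached file, here is a hedged sketch of what a rotating setup looks like in log4j2 (which recent Pinot builds use); the appender name, path, and pattern are illustrative, and the file must actually be picked up via `-Dlog4j2.configurationFile=...` for rotation to take effect:

    ```xml
    <!-- Illustrative log4j2 RollingFile appender: rolls daily or at 100 MB, keeps 10 files. -->
    <Appenders>
      <RollingFile name="pinotServerLog"
                   fileName="logs/pinotServer.log"
                   filePattern="logs/pinotServer.log.%d{yyyy-MM-dd}-%i.gz">
        <PatternLayout pattern="%d{yyyy/MM/dd HH:mm:ss.SSS} %p [%c{1}] [%t] %m%n"/>
        <Policies>
          <TimeBasedTriggeringPolicy/>
          <SizeBasedTriggeringPolicy size="100 MB"/>
        </Policies>
        <DefaultRolloverStrategy max="10"/>
      </RollingFile>
    </Appenders>
    ```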
  • Qianbo Wang
    08/19/2021, 11:24 PM
    Hi, on this doc it suggests checking server logs for reasons for a “table being in bad state”. Can anyone help and specify which server logs I should look into (i.e. broker, controller, etc.)? And is searching for the table name sufficient to find the error, or would any pattern work? Thanks in advance.
  • Charles
    08/20/2021, 12:44 AM
    Hi. I found there are many long-suffix segments in HDFS (as deep storage in Pinot). Does that mean those segments failed in deep storage? thx
  • Sadim Nadeem
    08/20/2021, 6:18 AM
    Can we apply inverted indexing or range indexing to existing old table columns with tens of millions of records? What impact will it have on table performance, and should I expect query latency to improve for old data queried as well?
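    For context, a hedged sketch of general Pinot behavior (version specifics vary): indexes can be added to an existing table by editing `tableIndexConfig` and reloading segments, which rebuilds the index from data already on the servers. Column names below are placeholders:

    ```json
    {
      "tableIndexConfig": {
        "invertedIndexColumns": ["some_existing_column"],
        "rangeIndexColumns": ["some_numeric_column"]
      }
    }
    ```

    After updating the config, a reload along the lines of `curl -X POST "http://<controller>:9000/segments/<tableName>/reload"` would apply it to existing segments, so queries over old data can benefit too.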
  • Filip Gep
    08/20/2021, 11:36 AM
    Hey everyone! I’m working on creating a realtime Pinot table based on messages published on a Kafka topic. My goal is to see messages pushed to Kafka in the Pinot table as fast as possible. Here’s my current table config:
    "tableIndexConfig": {
        "loadMode": "MMAP",
        "streamConfigs": {
          "streamType": "kafka",
          "stream.kafka.consumer.type": "simple",
          "stream.kafka.topic.name": "bb8_logs",
          "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
          "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
          "stream.kafka.zk.broker.url": "zookeeper:2181/kafka",
          "stream.kafka.broker.list": "kafka:9092",
          "realtime.segment.flush.threshold.rows": "0",
          "realtime.segment.flush.threshold.time": "1h",
          "realtime.segment.flush.desired.size": "50M",
          "stream.kafka.consumer.prop.auto.offset.reset": "smallest"
        }
      }
    Wanted to ask you: what should be changed to see Kafka topic messages in the Pinot table as fast as possible?
  • Carl
    08/20/2021, 1:34 PM
    Hi team, we are trying to do aggregation on a string field, e.g. select name, max(url) from table group by name, but it results in a number-formatting exception. Is there any other way we can get one aggregated string value from a group? Thanks
  • Nisheet
    08/20/2021, 3:18 PM
    Hi, I am trying to load parquet files using a Spark ingestion task. I had built the jars for Java 8 using the command
    mvn install package -DskipTests -DskipITs -Pbin-dist -Drat.ignoreErrors=true -Djdk.version=8 -Dspark.version=2.4.5
    While running the task I am getting the error
    21/08/20 15:11:24 ERROR LaunchDataIngestionJobCommand: Exception caught: 
    Can't construct a java object for tag:yaml.org,2002:org.apache.pinot.spi.ingestion.batch.spec.SegmentGenerationJobSpec; exception=Class not found: org.apache.pinot.spi.ingestion.batch.spec.SegmentGenerationJobSpec
     in 'string', line 1, column 1:
        executionFrameworkSpec:
        ^
    
    	at org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.construct(Constructor.java:349)
    	at org.yaml.snakeyaml.constructor.BaseConstructor.constructObject(BaseConstructor.java:182)
    	at org.yaml.snakeyaml.constructor.BaseConstructor.constructDocument(BaseConstructor.java:141)
    	at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:127)
    	at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:450)
    	at org.yaml.snakeyaml.Yaml.loadAs(Yaml.java:427)
    	at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.getSegmentGenerationJobSpec(IngestionJobLauncher.java:94)
    I have loaded all the jars in spark class path. Any idea how to resolve this??
  • beerus
    08/20/2021, 5:30 PM
    requests.exceptions.ConnectionError: HTTPConnectionPool(host='pinot2-controller-external.data2.svc', port=9000): Max retries exceeded with url: /tables/requests2/schema (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x105fc4280>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
  • xtrntr
    08/22/2021, 3:27 AM
    When I downscale the number of instances in my k8s helm chart for Pinot components, I have instances that are not removed?
  • Deepak Mishra
    08/22/2021, 5:19 AM
    Hello everyone, after setting a retention time of 1 day there is no realtime segment available, while I still see offline segments there after setting the same retention period of 1 day.
  • Peter Pringle
    08/23/2021, 1:18 AM
    Anyone else find that the Pinot server port is not closed when the server process exits? So upon restart the port is in use and it cannot start. Will do some more digging today. RHEL 7, Java 11, Pinot 0.7.1.
  • Peter Pringle
    08/23/2021, 1:21 AM
    For Kafka, are people using group.id? Trying to work out how the replication works. E.g. say 2 replicas of a segment: is this 2 servers ingesting the same Kafka messages (in which case I guess setting group.id won't work), or one server ingesting from Kafka and then copying the segments to another server for replication?
  • beerus
    08/23/2021, 5:29 AM
    2021/08/23 05:26:07.465 ERROR [PinotIngestionRestletResource] [jersey-server-managed-async-executor-0] Caught exception when ingesting file into table: networks3_REALTIME. Cannot ingest file into REALTIME table: networks3_REALTIME
    java.lang.IllegalStateException: Cannot ingest file into REALTIME table: networks3_REALTIME
    	at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:518) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.apache.pinot.controller.api.resources.PinotIngestionRestletResource.ingestData(PinotIngestionRestletResource.java:184) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.apache.pinot.controller.api.resources.PinotIngestionRestletResource.ingestFromURI(PinotIngestionRestletResource.java:170) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_282]
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_282]
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_282]
    	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_282]
    	at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:124) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:167) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$VoidOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:159) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:469) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.server.model.ResourceMethodInvoker.lambda$apply$0(ResourceMethodInvoker.java:381) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.server.ServerRuntime$AsyncResponder$2$1.run(ServerRuntime.java:819) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.internal.Errors.process(Errors.java:292) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.internal.Errors.process(Errors.java:274) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.internal.Errors.process(Errors.java:244) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at org.glassfish.jersey.server.ServerRuntime$AsyncResponder$2.run(ServerRuntime.java:814) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-7bcbee5058c6a56d02b2d34e173185009ac35146]
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_282]
    	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_282]
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_282]
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_282]
    	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
  • Sadim Nadeem
    08/23/2021, 11:18 AM
    What should be the disk allotment for a Pinot dev installation, i.e. for controller, broker, and minion? The disk space allotted for the server is 200 GB * 3 pods = 600 GB total, so what should the controller, broker, and minion disk requirements be in proportion to that, for 3 pods each on k8s? @Mayank @Xiang Fu @Kishore G @Subbu Subramaniam @Jackie
  • Sadim Nadeem
    08/23/2021, 11:20 AM
    Also, is SSD recommended only for the Pinot server, or for everything? And if we are not loading offline segments, do we really need 3 pods of minion? If RAM is 8 GB * 3 = 24 GB for the Pinot server, what should the RAM allotment be for broker, controller, and minion? Same goes for CPU cores: if the server gets 2.5 cores per pod * 3 = 7.5 cores, how much core allotment will be prudent for controller, broker, and minion?
  • Lovenish Goyal
    08/24/2021, 5:27 AM
    Hi All, can anyone help me connect Tableau to Pinot? We are getting the below error
  • Syed Akram
    08/24/2021, 5:02 PM
    https://docs.pinot.apache.org/basics/getting-started/hdfs-as-deepstorage
  • Will Gan
    08/24/2021, 9:14 PM
    Hi, if I want to have a time column that's essentially another column (say epoch_minutes) but bucketed (say every 5 minutes), my understanding was that I can create a column with format 1:MINUTES:EPOCH and granularity 5:MINUTES, and that Pinot would handle it for me. I don't think that's the case though, i.e. I have to write an ingestion transform, right?
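    That reading matches general Pinot behavior: the granularity in the time field spec is metadata and does not rewrite values, so a bucketed column needs an ingestion transform. A hedged sketch, where `ts_millis` (a raw epoch-millis source column) and `epoch_5min_bucket` are hypothetical names:

    ```json
    {
      "ingestionConfig": {
        "transformConfigs": [
          {
            "columnName": "epoch_5min_bucket",
            "transformFunction": "toEpochMinutesBucket(ts_millis, 5)"
          }
        ]
      }
    }
    ```

    The destination column's format would then be declared as 5:MINUTES:EPOCH, since each stored value counts 5-minute buckets since epoch.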