# troubleshooting
  • k

    Kishore G

    07/08/2020, 10:52 PM
    tx
  • k

    Kishore G

    07/09/2020, 12:44 AM
    found the issue: https://github.com/apache/incubator-pinot/pull/5669
  • p

    Pradeep

    07/09/2020, 12:48 AM
    tested the fix, it’s working now thanks @Kishore G
  • m

    Mayank

    07/09/2020, 12:49 AM
    @Kishore G Checked the callers of `init`. Going by variable names, some are passing `config` and others are passing `fsConfig`.
  • d

    Daniel Lavoie

    07/09/2020, 1:08 AM
    I'm back home, I'll review the findings and provide more context if needed.
  • m

    Mayank

    07/09/2020, 1:13 AM
    Thanks @Daniel Lavoie. One thing this failure brings up is that we lack test coverage here. Perhaps we should also use this opportunity to improve on that.
    💯 1
  • s

    Suraj

    07/21/2020, 1:10 AM
    Hello - we are noticing slow queries and wanted to check if there is a way to log the execution plan of the queries?
  • k

    Kishore G

    07/21/2020, 1:35 AM
    @Suraj you should know what’s taking time by looking at the response stats
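For readers following along, the response stats mentioned above can be pulled out of a broker response with a few lines of Python. This is a minimal sketch, not Pinot tooling: the field names are the ones the broker response is understood to carry (check them against your Pinot version), and the sample payload and its values are made up for illustration.

```python
import json

# Hypothetical broker response, trimmed to its stats fields.
# The values are invented for illustration only.
SAMPLE_RESPONSE = json.loads("""
{
  "numDocsScanned": 1200000,
  "numEntriesScannedInFilter": 45000000,
  "numEntriesScannedPostFilter": 2400000,
  "numSegmentsQueried": 10000,
  "numSegmentsProcessed": 9500,
  "numSegmentsMatched": 800,
  "totalDocs": 500000000,
  "timeUsedMs": 3200
}
""")

# Stats fields assumed to be present in a broker response.
STAT_KEYS = (
    "timeUsedMs",
    "totalDocs",
    "numDocsScanned",
    "numEntriesScannedInFilter",
    "numEntriesScannedPostFilter",
    "numSegmentsQueried",
    "numSegmentsProcessed",
    "numSegmentsMatched",
)


def summarize_stats(response):
    """Return only the stats fields that are present in the response."""
    return {key: response[key] for key in STAT_KEYS if key in response}


def format_stats(response):
    """Render the stats as a single key=value line suitable for a log."""
    return " ".join(f"{k}={v}" for k, v in summarize_stats(response).items())


if __name__ == "__main__":
    print(format_stats(SAMPLE_RESPONSE))
```

Logging this line next to each slow query makes it easier to see whether the time goes to scanning many entries in the filter or to touching a large number of segments.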
  • e

    Elon

    07/21/2020, 8:10 PM
    We are about to upgrade to pinot-0.4.0 - do you recommend going to head or just cutting it at the 0.4.0 release commit?
  • e

    Elon

    07/21/2020, 8:11 PM
    Any notable config changes, or k8s changes we should be aware of? We're on pinot-0.3.0 now
  • d

    Damiano

    07/21/2020, 8:30 PM
    Nooooo I have just upgraded my custom aggregation function 😄 did you change the API?
  • d

    Damiano

    07/21/2020, 8:30 PM
    😂
  • k

    Kishore G

    07/21/2020, 8:49 PM
    @Elon I would go with 0.4.0 unless you need any feature in master
    👍 1
  • d

    Dan Hill

    07/22/2020, 12:19 AM
    Sorry, I think I've asked before (I lost my slack history). Is there an easy way to have Pinot take the realtime inputs and automatically run data ingestion jobs to populate the offline tables? Mostly checking to see if I can shortcut some work for a v1 deliverable. I assume there is probably a simple setup to output the kafka topic for 1 day, split the data and run batch ingestion jobs.
  • k

    Kishore G

    07/22/2020, 12:31 AM
    Yes, it’s doable but there is no such tool
  • k

    Kishore G

    07/22/2020, 12:33 AM
    You can download the real-time segments, use the Pinot segment reader to read multiple segments, generate a new offline segment, and push it.
  • m

    Mayank

    07/22/2020, 5:11 AM
    @Buchi Reddy are there specific types of queries that are failing and passing?
  • b

    Buchi Reddy

    07/22/2020, 5:13 AM
    We’re seeing slowness of some random queries with Pinot. So far here are our observations:
    • We didn’t tune the segment sizes, so we have smaller segments. Some are ~100MB, though in one table they went up to 770MB per segment.
    • On one of the tables, we noticed 10K segments. Queries to this table sometimes fail with the exception that I posted in #CDRCA57FC channel.
    • If we try the queries from the Pinot console, the response times are always better than what our service, which uses the Java Pinot client, is seeing.
  • k

    Kishore G

    07/22/2020, 4:23 PM
    @Yash Agarwal were you able to rebuild the jar?
  • y

    Yash Agarwal

    07/22/2020, 4:50 PM
    Yes. I built the jar with the updated spark and scala versions.
  • k

    Kishore G

    07/22/2020, 4:55 PM
    i see
  • k

    Kishore G

    07/22/2020, 4:56 PM
    @Xiang Fu you had another command to build the spark job jar directly?
  • x

    Xiang Fu

    07/22/2020, 4:56 PM
    pinot-spark jar?
  • x

    Xiang Fu

    07/22/2020, 4:57 PM
    I’m also using that pinot-all jar
  • y

    Yash Agarwal

    07/24/2020, 5:00 AM
    I am getting
    java.lang.IllegalStateException: Unable to extract out the relative path based on base input path: hdfs://bigredns/apps/hive/warehouse/dev_phx_chargers.db/guest_sdr_gst_data_sgl
    	at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:444)
    	at org.apache.pinot.plugin.ingestion.batch.common.SegmentGenerationUtils.getRelativeOutputPath(SegmentGenerationUtils.java:144)
    	at org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentGenerationJobRunner$1.call(SparkSegmentGenerationJobRunner.java:292)
    	at org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentGenerationJobRunner$1.call(SparkSegmentGenerationJobRunner.java:214)
    the job config is
    inputDirURI: 'hdfs://bigredns/apps/hive/warehouse/dev_phx_chargers.db/guest_sdr_gst_data_sgl'
    outputDirURI: 'hdfs://bigredns/apps/hive/warehouse/dev_phx_chargers.db/guest_sdr_gst_data_sgl_segments'
  • x

    Xiang Fu

    07/24/2020, 5:16 AM
    we get the input file like
    hdfs://bigredns/apps/hive/warehouse/dev_phx_chargers.db/guest_sdr_gst_data_sgl/a/b/c.avro
  • x

    Xiang Fu

    07/24/2020, 5:16 AM
    then output segment path should be
    hdfs://bigredns/apps/hive/warehouse/dev_phx_chargers.db/guest_sdr_gst_data_sgl_segments/a/b/c.tar.gz
  • x

    Xiang Fu

    07/24/2020, 5:17 AM
    we try to extract the relative path of
    a/b/c.avro
  • x

    Xiang Fu

    07/24/2020, 5:17 AM
    so wanna check the input file path
  • y

    Yash Agarwal

    07/24/2020, 5:22 AM
    the input path is something like
    hdfs://bigredns/apps/hive/warehouse/dev_phx_chargers.db/guest_sdr_gst_data_sgl/partition_d=2020-05-17/00000_0
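The mapping described in this thread (take the input file's path relative to the input directory, then place the segment at the same relative location under the output directory) can be sketched as below. This is an illustration only and does not reproduce Pinot's `SegmentGenerationUtils` logic or explain the exception above. The function names are hypothetical, and the `.tar.gz` naming simply follows the example given in the messages.

```python
import posixpath
from urllib.parse import urlparse


def relative_input_path(base_dir_uri, input_file_uri):
    """Return the input file's path relative to the base input directory.

    Hypothetical helper: raises ValueError when the file is not located
    under the base directory (different scheme, authority, or path prefix).
    """
    base = urlparse(base_dir_uri)
    target = urlparse(input_file_uri)
    if (base.scheme, base.netloc) != (target.scheme, target.netloc):
        raise ValueError("input file is on a different scheme or authority")
    # Normalize to exactly one trailing slash so that a sibling directory
    # such as ".../data_segments" does not match the prefix ".../data".
    base_path = base.path.rstrip("/") + "/"
    if not target.path.startswith(base_path):
        raise ValueError("input file is not under the base input path")
    return target.path[len(base_path):]


def output_segment_uri(output_dir_uri, relative_path):
    """Mirror the relative path under the output dir with a .tar.gz suffix."""
    stem, _extension = posixpath.splitext(relative_path)
    return output_dir_uri.rstrip("/") + "/" + stem + ".tar.gz"


if __name__ == "__main__":
    base = "hdfs://bigredns/apps/hive/warehouse/dev_phx_chargers.db/guest_sdr_gst_data_sgl"
    out = base + "_segments"
    for name in ("a/b/c.avro", "partition_d=2020-05-17/00000_0"):
        rel = relative_input_path(base, base + "/" + name)
        print(rel, "->", output_segment_uri(out, rel))
```

Running it against both the `a/b/c.avro` example and the partitioned path from the thread shows what relative path and output location each input would map to under these assumptions.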