# troubleshooting
  • j

    jose farfan

    02/13/2021, 3:08 AM
    This is the error
  • j

    jose farfan

    02/13/2021, 3:08 AM
    But if I run the query with a different value for LIMIT, everything is OK.
  • j

    jose farfan

    02/13/2021, 3:09 AM
    For example: "SELECT player_nr, processTime, id FROM transaction_line_REALTIME LIMIT 21474836"
  • k

    Kishore G

    02/13/2021, 3:15 AM
    You are pulling a lot of data from Pinot
  • k

    Kishore G

    02/13/2021, 3:16 AM
    Pinot is not meant to be used to pull all the data out
  • j

    jose farfan

    02/13/2021, 3:18 AM
    Hi, there are only 3000 rows in the table right now.
  • j

    jose farfan

    02/13/2021, 3:19 AM
    It was working for a few days and then stopped working.
  • k

    Kishore G

    02/13/2021, 3:20 AM
    That should have worked... What's the JVM memory?
  • j

    jose farfan

    02/13/2021, 3:20 AM
    I have updated now to this configuration: - JAVA_OPTS=-Xms4G -Xmx16G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -Xloggc:gc-pinot-server.log
  • j

    jose farfan

    02/13/2021, 3:21 AM
    This applies to the broker, server, and controller.
  • j

    jose farfan

    02/13/2021, 3:38 AM
    If I run the query: select * from transaction_line limit 2147483647, it does not work.
  • y

    Yurii B

    02/15/2021, 4:15 PM
    Hi, I was playing with Pinot and noticed something strange with GROUP BY on a certain column (some results are missing).
  • y

    Yurii B

    02/15/2021, 4:15 PM
    select visitor_id, count(*) from visitor_ids
    group by visitor_id
    order by count(*) desc
  • y

    Yurii B

    02/15/2021, 4:17 PM
    gives:
    visitor_id	count(*)
    78144a788f39b7075bb37758	44
    979ba2f45c229ab908bdd829	26
    db5e6a7b6b48b64d3df838a7	22
    3402d34497b08b5d0172128f	21
    4f212877c876191c7db40a11	20
    But if I filter on a missing visitor_id, it is there:
    select visitor_id, count(*) from visitor_ids
    where visitor_id = '6d797e84d6d19d5825ef7060'
    group by visitor_id
    order by count(*) desc
    gives:
    visitor_id	count(*)
    6d797e84d6d19d5825ef7060	66
  • k

    Kishore G

    02/15/2021, 4:18 PM
    What's the cardinality of visitor_id?
  • y

    Yurii B

    02/15/2021, 4:19 PM
    column.visitor_id.cardinality = 148594
    column.visitor_id.totalDocs = 190000
  • k

    Kishore G

    02/15/2021, 4:19 PM
    I think we limit group-by to 100k unique groups (here, unique visitors) within a segment.
  • k

    Kishore G

    02/15/2021, 4:20 PM
    If that visitor ID is not among the first 100k unique IDs we see in a segment, it won't show up.
  • k

    Kishore G

    02/15/2021, 4:20 PM
    You can change this config:
  • k

    Kishore G

    02/15/2021, 4:22 PM
    num.groups.limit
  • k

    Kishore G

    02/15/2021, 4:23 PM
    @Mayank can you confirm the above behaviour?
  • m

    Mayank

    02/15/2021, 4:25 PM
    Yes, the default limit on the number of groups is 100k.
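
The behaviour described above can be illustrated with a small, simplified model. This is only a sketch of the idea (new group keys are no longer admitted once a per-segment cap is reached), not Pinot's actual implementation; the function name `group_by_count` and its `num_groups_limit` parameter are hypothetical and chosen for illustration.

```python
def group_by_count(values, num_groups_limit):
    """Count rows per key, admitting new keys only until the limit is reached."""
    counts = {}
    for value in values:
        if value in counts:
            # Key already has a group: keep counting it.
            counts[value] += 1
        elif len(counts) < num_groups_limit:
            # Still room for a new group.
            counts[value] = 1
        # Otherwise the key arrived after the cap was hit and its rows are skipped.
    return counts


# "d" is the most frequent key, but it first appears after three other keys.
rows = ["a", "b", "c", "d", "d", "d"]

limited = group_by_count(rows, num_groups_limit=3)
full = group_by_count(rows, num_groups_limit=100)

print(limited)  # "d" is absent although it has the highest count
print(full)     # "d" is present once the limit is large enough
```

In this model, adding a filter on the missing key (as in the WHERE query above) leaves only one group, so the cap is never reached and the key shows up with its full count.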
  • y

    Yurii B

    02/15/2021, 4:40 PM
    @Kishore G Thank you, works as expected.
  • k

    Kishore G

    02/15/2021, 4:42 PM
    We plan to make this a bit smarter, but there will still be a limit.
  • n

    Nick Bowles

    02/17/2021, 1:45 AM
    Any tips on speeding up a LaunchDataIngestionJob? I've tried upping Xmx, and my pods have plenty of CPU and memory available. From the logs it looks like it's only using one thread, but I haven't seen any settings in the documentation to give it more threads.
  • n

    Nick Bowles

    02/17/2021, 1:56 AM
    Also wondering how to distribute the ingestion load across server instances. I've been ssh'ing into the pod and running LaunchDataIngestionJob, but this only runs it on the server instance I'm logged into.
  • d

    Daniel Lavoie

    02/17/2021, 1:59 AM
    The new minion task-based ingestion, with the new SegmentGenerationAndPushTask task, will allow you to run scalable ingestion jobs across multiple minion instances.
  • d

    Daniel Lavoie

    02/17/2021, 2:00 AM
    Sadly the documentation is not ready yet 😞
  • n

    Nick Bowles

    02/17/2021, 2:29 AM
    That's interesting, thanks for sharing! I'm fine with using a single server instance right now; I'm just trying to figure out how to speed it up.
  • d

    Daniel Lavoie

    02/17/2021, 2:31 AM
    Using this new mechanism, you can have one worker per ingested file, each generating one segment in parallel.
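
The one-worker-per-file idea can be sketched generically as follows. This is not the Pinot minion API; `build_segment` and `ingest` are hypothetical stand-ins, and the file names are made up, purely to show input files being turned into segments concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


def build_segment(input_file):
    # Hypothetical stand-in for real segment generation:
    # one input file yields exactly one segment.
    return f"segment_for_{input_file}"


def ingest(files, max_workers=4):
    # Each file is handed to a worker; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(build_segment, files))


segments = ingest(["part-0.csv", "part-1.csv", "part-2.csv"])
print(segments)
```

With this shape, throughput scales with the number of workers as long as the input is split into enough files; a single large file would still be processed by one worker.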