# general
  • k

    Kenny Bastani

    05/19/2020, 8:10 PM
    set the channel topic: Community-wide announcements. Please use #C011C9JHN7R for work-based matters.
  • k

    Kishore G

    05/19/2020, 8:50 PM
    I created #C013WKLT5T7 to discuss Pinot PRs and issues. Please join the channel if you are a committer/contributor or want to learn more about Pinot.
  • n

    Nitin Jain

    05/20/2020, 7:06 AM
    @here: I am quite new to Pinot and have a couple of questions. 1. While querying data, I noticed that queries are case sensitive. Is that by design, or am I missing something? Most query languages are case insensitive. 2. While defining a schema, I got an error when using STRING under metricFieldSpecs, although the documentation says that STRING is supported.
  • s

    Seunghyun

    05/20/2020, 7:11 AM
    @User 1. Table names and column names are case sensitive in Pinot. This decision was made long ago, so I don't have the full context on why we chose case sensitive over case insensitive. Maybe @User @User have the context on this decision. 2. Why do you want to use STRING for a metric column? There is an exceptional case where we use STRING for a metric column (for example, storing a serialized HyperLogLog object, which is used for computing an approximate distinct count).
  • n

    Nitin Jain

    05/20/2020, 7:21 AM
    Thanks for the prompt reply @User. As I mentioned, I am quite new and was exploring a few things. I defined a metric field as STRING, but it did not work, so I thought of posting here. So far I don't have a use case for STRING. Another question: is there any option in queries for NOT IN / NOT EXISTS?
  • n

    Nitin Jain

    05/20/2020, 7:31 AM
    For example, I am trying the following: select ORDER_NO from my_test where ORDER_NO not in (SELECT ORDER_NO FROM my_test where status = '1100')
  • s

    Seunghyun

    05/20/2020, 7:45 AM
    we do have support for NOT IN
  • s

    Seunghyun

    05/20/2020, 7:46 AM
    but only for the list of simple values
    e.g. ORDER_NO NOT IN (111,222,333)
  • s

    Seunghyun

    05/20/2020, 7:46 AM
    we don’t support nested query
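Since nested queries are not supported, one client-side workaround is to run the inner query first and inline its results as a literal NOT IN list. The following is a minimal Python sketch of that idea. `build_not_in_query` is a hypothetical helper (not part of any Pinot client), the values in `inner_result` are placeholders, and the table and column names are taken from the example query above.

```python
def build_not_in_query(table, column, excluded_values):
    """Build a query that filters `column` with a literal NOT IN list.

    Hypothetical helper for illustration only. Values are assumed to be
    integers (as in the ORDER_NO example), so they are inlined unquoted.
    """
    if not excluded_values:
        # With nothing to exclude, skip the filter rather than emit an empty list.
        return f"SELECT {column} FROM {table}"
    literals = ", ".join(str(int(v)) for v in excluded_values)
    return f"SELECT {column} FROM {table} WHERE {column} NOT IN ({literals})"


# Step 1 (not shown): run the inner query on its own, e.g.
#   SELECT ORDER_NO FROM my_test WHERE status = '1100'
# and collect the ORDER_NO values on the client.
inner_result = [111, 222, 333]  # placeholder values standing in for step 1

# Step 2: inline those values as a simple NOT IN list.
query = build_not_in_query("my_test", "ORDER_NO", inner_result)
print(query)
```

This only works well when the inner query returns a small number of values; the literal list grows with the size of that result.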
  • k

    Kishore G

    05/20/2020, 10:41 AM
    @User case-insensitive querying is supported if you have a schema and you set the cluster config enable.case.insensitive.pql=true
  • k

    Kishore G

    05/20/2020, 10:43 AM
    For nested queries and joins, use the Presto Pinot connector: https://docs.pinot.apache.org/integrations/presto
  • n

    Nitin Jain

    05/20/2020, 11:06 AM
    Thanks @User
  • e

    Elon

    05/20/2020, 5:10 PM
    Question about deploying Pinot: if we use the GCS filesystem for the controller deep storage, is there any need to have persistence enabled and attach a PVC to the controller?
  • k

    Kishore G

    05/20/2020, 5:13 PM
    No
    👍 1
  • e

    Elon

    05/20/2020, 5:23 PM
    Also, we noticed with realtime ingestion that if we set realtime.segment.flush.threshold.time to 24h and realtime.segment.flush.desired.size to 150M, for small data sets it never seems to flush; it just gets the data from Kafka again. We didn't see this behavior in pinot-0.2.0. Did something change? Are the 2 options AND'd together or OR'd together? We're using pinot-0.3.0 vanilla.
  • k

    Kishore G

    05/20/2020, 5:31 PM
    OR'd
    👍 1
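To make the "OR'd" answer concrete, here is a minimal Python sketch of OR semantics for the two thresholds. It illustrates the logic only and is not Pinot's actual implementation; `should_flush` is a hypothetical name, and reading 150M as 150 * 1024 * 1024 bytes is an assumption.

```python
from datetime import timedelta


def should_flush(elapsed, segment_size_bytes, threshold_time, desired_size_bytes):
    """Return True when either threshold has been reached (OR semantics)."""
    return elapsed >= threshold_time or segment_size_bytes >= desired_size_bytes


MB = 1024 * 1024
threshold_time = timedelta(hours=24)  # realtime.segment.flush.threshold.time = 24h
desired_size = 150 * MB               # realtime.segment.flush.desired.size = 150M (assumed bytes)

# Small data set, 2 hours in: neither threshold reached yet.
small_early = should_flush(timedelta(hours=2), 10 * MB, threshold_time, desired_size)

# Small data set, 25 hours in: the time threshold alone triggers a flush.
small_late = should_flush(timedelta(hours=25), 10 * MB, threshold_time, desired_size)

# Large data set, 2 hours in: the size threshold alone triggers a flush.
large_early = should_flush(timedelta(hours=2), 200 * MB, threshold_time, desired_size)

print(small_early, small_late, large_early)
```

Under this reading, a small data set would still be expected to flush once the 24h time threshold passes, even if it never approaches the desired size.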
  • o

    Oguzhan Mangir

    05/25/2020, 4:28 PM
    Is there any way to create a Pinot cluster in memory for testing purposes?
  • k

    Kishore G

    05/25/2020, 4:51 PM
    As part of a JUnit/integration test, or standalone?
  • k

    Kishore G

    05/25/2020, 4:51 PM
    See what we do in ClusterTest
    👍 1
  • e

    Elon

    05/26/2020, 11:50 PM
    Does Pinot have a now() or current_timestamp function?
  • d

    Dan Hill

    05/30/2020, 6:15 PM
    What's the best way with Pinot to make sure my offline segment writes are idempotent? E.g. If I have a job that writes old data, how do I make sure I replace the old segments? I see jobType has upload only. What's the identity for a segment? How should I split my input data?
  • k

    Kishore G

    05/30/2020, 6:17 PM
    if the segment name is the same, it will replace the old data
  • k

    Kishore G

    05/30/2020, 6:18 PM
    split your input data by time
  • d

    Dan Hill

    05/30/2020, 6:19 PM
    Got it. I see segmentNameGeneratorSpec in the docs.
  • k

    Kishore G

    05/30/2020, 6:19 PM
    yep
  • d

    Dan Hill

    05/30/2020, 6:19 PM
    Cool, thanks!
  • k

    Kishore G

    05/30/2020, 6:21 PM
    default is startDate_endDate_<sequence_id>
  • k

    Kishore G

    05/30/2020, 6:21 PM
    we derive startDate and endDate automatically from the time column in the data
  • d

    Dan Hill

    05/30/2020, 6:23 PM
    What's sequence_id?
  • k

    Kishore G

    05/30/2020, 6:24 PM
    it allows you to have multiple segments per day
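The naming scheme discussed above is what makes re-uploads idempotent: if a re-run produces the same segment names, the new segments replace the old ones. Below is a minimal Python sketch of that idea. The `segment_name` and `upload` functions and the in-memory `store` dict are hypothetical stand-ins, not Pinot APIs, and the ISO date format in the name is an assumption.

```python
from datetime import date


def segment_name(start_date, end_date, sequence_id):
    """Mimic the startDate_endDate_<sequence_id> pattern (date format assumed)."""
    return f"{start_date.isoformat()}_{end_date.isoformat()}_{sequence_id}"


def upload(store, name, rows):
    """Toy stand-in for a segment upload: the same name overwrites the old entry."""
    store[name] = rows


store = {}
day = date(2020, 5, 30)

# First run: one day of data split into two segments via the sequence id.
upload(store, segment_name(day, day, 0), ["a", "b"])
upload(store, segment_name(day, day, 1), ["c"])

# Re-run for the same day: the names match, so data is replaced, not duplicated.
upload(store, segment_name(day, day, 0), ["a", "b2"])
upload(store, segment_name(day, day, 1), ["c"])

print(sorted(store))
```

For this to hold, the job has to split its input the same way on every run (for example, by day), so that each time range and sequence id maps to the same name.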