# general
  • i

    Ignacy Krasicki

    03/21/2020, 3:32 PM
    thanks, this might be useful in many use cases (it seems druid also introduced "lastInt" and "lastString" aggregations). second question is regarding encryption at rest. we are currently using an RDBMS with transparent encryption, and our requirement would be similar functionality. I am more familiar with druid: it has cold storage (which can be encrypted, as on s3 and hdfs) and a local cache which is not encrypted, so it does not meet our requirements. in pinot i found pinotencrypter, but my question is: can everything that is written to disk be encrypted in pinot?
  • k

    Kishore G

    03/21/2020, 3:36 PM
    Pinot also decrypts it when the segment gets to local disk. Performance would be bad if we decrypted it on the fly for each query.
  • k

    Kishore G

    03/21/2020, 3:38 PM
    If you want to maintain a separate view for last point, you can do that in Pinot
  • d

    Dan Hill

    03/22/2020, 8:55 PM
    I'm working on a system that uses Presto to query Pinot. I saw there is a gitbook page for Presto integration. Presto (prestodb's version) has a Presto integration built in. Is there a difference in integrations between the two approaches?
  • d

    Dan Hill

    03/22/2020, 8:56 PM
    Also, when I run a query using Presto, I can only aggregate one metric at a time. I filed a bug against prestodb.
  • d

    Dan Hill

    03/22/2020, 8:56 PM
    https://github.com/prestodb/presto/issues/14277
  • k

    Kishore G

    03/22/2020, 8:59 PM
    that's because it's using an older version of Pinot. There is a property allowMultipleAggregations in the presto-pinot-connector config. It's false by default; you can set it to true. @User will need your help to move to the new Pinot SQL API here, which allows multiple aggregations
  • d

    Dan Hill

    03/22/2020, 8:59 PM
    Ah, okay. Should I follow these instructions?
  • d

    Dan Hill

    03/22/2020, 8:59 PM
    https://apache-pinot.gitbook.io/apache-pinot-cookbook/integrations/presto
  • d

    Dan Hill

    03/22/2020, 9:10 PM
    Cool, I found the property.
  • d

    Dan Hill

    03/22/2020, 9:10 PM
    https://github.com/prestodb/presto/blob/a3f9aa3566675f4b5fea33a96abc58fddbf56a21/presto-pinot-toolkit/src/main/java/com/facebook/presto/pinot/PinotConfig.java
  • n

    Neha Pawar

    03/26/2020, 3:42 AM
    would really appreciate if you can watch it, follow along and try it out
  • k

    Kishore G

    03/27/2020, 5:11 PM
    any experts on readthedocs here?
  • k

    Kishore G

    03/27/2020, 5:12 PM
    we need help adding a banner to old docs https://readthedocs.org/projects/pinot/ and add a reference to new docs https://apache-pinot.gitbook.io/apache-pinot-docs/
  • l

    lsabi

    03/27/2020, 9:10 PM
    What about copying it from these docs?
  • l

    lsabi

    03/27/2020, 9:10 PM
    https://omnia-docs-g2.readthedocs.io/en/latest/blocks/banner/
  • l

    lsabi

    03/27/2020, 9:10 PM
    Source code
  • l

    lsabi

    03/27/2020, 9:10 PM
    https://raw.githubusercontent.com/preciofishbone/OmniaDocsG2/master/blocks/banner/index.rst
  • x

    Xiang Fu

    03/28/2020, 12:18 AM
    @here Hello community,
    We are pleased to announce that Apache Pinot (incubating) 0.3.0 is released!
    Apache Pinot (incubating) is a distributed columnar storage engine that can ingest data in realtime and serve analytical queries at low latency.
    The release can be downloaded at: https://pinot.apache.org/download
    The release note is available at: https://docs.pinot.apache.org/releases/0.3.0
    Additional resources
    - Project website: https://pinot.apache.org
    - Getting started: https://docs.pinot.apache.org/getting-started
    - Mailing list: dev@pinot.apache.org
    - Slack channel: https://communityinviter.com/apps/apache-pinot/apache-pinot
    - Twitter: https://twitter.com/ApachePinot
    Best Regards,
    Apache Pinot (incubating) Team
    🎉 12
    👍 9
  • j

    Joey Pereira

    03/30/2020, 8:52 PM
    👋 I had a random question about the query approximation, mentioned on https://pinot.readthedocs.io/en/latest/pql_examples.html
    Results of aggregations with large amounts of group keys (>1M) are approximated
    I wasn't able to find any other details about the approximations referenced in docs, code, or issues. Is there somewhere I can read up on further details about the approximations?
  • k

    Kishore G

    03/30/2020, 9:01 PM
    I am editing the docs to add more details. But here is the gist:
      - In every node, we keep a max limit on the hashmap <GroupByKey, Metric> for group by
      - When we hit this limit, new keys will be dropped, but for existing keys the Metric will still be updated
  • k

    Kishore G

    03/30/2020, 9:02 PM
    this is for group by without ordering
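
The behaviour described above can be illustrated with a minimal sketch. This is not Pinot's code: the function name `capped_group_by_sum`, the `max_groups` parameter, and the sample rows are all hypothetical, and SUM stands in for whatever metric the query computes.

```python
from typing import Dict, Iterable, Tuple


def capped_group_by_sum(rows: Iterable[Tuple[str, int]], max_groups: int) -> Dict[str, int]:
    """Sum values per key, but never track more than max_groups distinct keys."""
    totals: Dict[str, int] = {}
    for key, value in rows:
        if key in totals:
            # Existing keys keep being updated after the limit is hit.
            totals[key] += value
        elif len(totals) < max_groups:
            # There is still room in the map, so admit the new key.
            totals[key] = value
        # Otherwise the limit is reached and the new key is dropped.
    return totals


# Hypothetical data: key "c" first appears after the map is full.
rows = [("a", 1), ("b", 2), ("c", 3), ("a", 4), ("c", 5)]
print(capped_group_by_sum(rows, max_groups=2))  # {'a': 5, 'b': 2}
```

The totals for the keys that were admitted stay exact for that node; the approximation comes from keys that never got into the map.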
  • s

    Sidd

    03/30/2020, 9:21 PM
    @User I had recently put together this explanation for another similar question. I hope this will help as well.
    GROUP BY execution happens in 3 stages:
    
    (1) At each Pinot server, we execute the query on a segment -- here by default we don't consider more than 100k unique groups as an attempt to restrict memory usage and prevent OOMs.
    
    (2) At each Pinot server, we combine/merge the results from multiple segments -- this is where we make a best effort at ensuring accuracy by returning max(5*topN, 5000) unique groups from each server to the broker.
    
    (3) Reduce the results from all servers at the broker, sort them, return TOP N
    
    By the time server level merge begins in (2), it is very likely that some groups were not considered because of two reasons:
    -- They came later in the scan while the records were being iterated upon and we had already exhausted the limit of 100k per segment
    -- Step 2 is multi-threaded: multiple threads (each handling one or more segments) combine the results across all segments into a single data structure. Here, what makes it into the list depends on the execution order/scheduling of the threads.
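
The three stages can be sketched as a small single-threaded simulation. Only the 100k per-segment limit and the max(5*topN, 5000) trim size come from the explanation above; everything else is an assumption made for illustration: the function names, the use of SUM as the metric, and keeping the largest groups when trimming. Thread scheduling effects are not modelled.

```python
import heapq
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, int]  # (group key, metric value)

SEGMENT_GROUP_LIMIT = 100_000  # per-segment default quoted in the thread


def segment_group_by(rows: Iterable[Row], limit: int = SEGMENT_GROUP_LIMIT) -> Dict[str, int]:
    """Stage 1: aggregate one segment, ignoring new groups past the limit."""
    totals: Dict[str, int] = {}
    for key, value in rows:
        if key in totals:
            totals[key] += value
        elif len(totals) < limit:
            totals[key] = value
    return totals


def server_trim_size(top_n: int) -> int:
    """Number of groups a server returns to the broker."""
    return max(5 * top_n, 5000)


def server_merge(segment_results: Iterable[Dict[str, int]], top_n: int) -> Dict[str, int]:
    """Stage 2: merge segment results, then trim (assumed: keep the largest sums)."""
    merged: Counter = Counter()
    for result in segment_results:
        merged.update(result)  # adds values for keys already present
    kept = heapq.nlargest(server_trim_size(top_n), merged.items(), key=lambda kv: kv[1])
    return dict(kept)


def broker_reduce(server_results: Iterable[Dict[str, int]], top_n: int) -> List[Row]:
    """Stage 3: merge results from all servers, sort, return the top N."""
    merged: Counter = Counter()
    for result in server_results:
        merged.update(result)
    return heapq.nlargest(top_n, merged.items(), key=lambda kv: kv[1])


# Hypothetical data with a tiny segment limit so the effect is visible.
server_1 = server_merge(
    [
        segment_group_by([("a", 10), ("b", 1), ("c", 100)], limit=2),  # "c" is dropped
        segment_group_by([("c", 7), ("a", 5)], limit=2),
    ],
    top_n=1,
)
server_2 = server_merge([segment_group_by([("b", 20)], limit=2)], top_n=1)
top = broker_reduce([server_1, server_2], top_n=1)
print(top)  # [('b', 21)]
```

With exact aggregation, "c" would win with a sum of 107. Because its largest contribution arrived after the first segment had already hit its group limit, the sketch reports "b" instead, which is the kind of inaccuracy described above.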
  • j

    Joey Pereira

    03/30/2020, 10:26 PM
    Ah, thanks for the clarification! At first my concern was about accuracy based on cardinality of keys pre-aggregate, but that makes a lot more sense (:
  • d

    Dan Hill

    04/01/2020, 3:33 AM
    Does Pinot's PQL support having clauses? When I enter a query with a having clause into the web Pinot Data Explorer, it seems like the having clause is ignored and the query is still allowed.
    select platform_id, sum(cost_usd_micros) from events_testing where platform_id = 1 group by platform_id having (sum(cost_usd_micros) < 10000)
  • d

    Dan Hill

    04/01/2020, 3:34 AM
    Removing having does not impact the results. I'll still have rows that do not match the having clause.
  • m

    Mayank

    04/01/2020, 3:34 AM
    Not at the moment, we have a plan to add that support
  • d

    Dan Hill

    04/01/2020, 3:34 AM
    Ah, okay.
  • d

    Dan Hill

    04/01/2020, 3:36 AM
    Yea, that seems very useful for my use case. I'd also want Presto to be able to forward the having clause to Pinot.
  • m

    Mayank

    04/01/2020, 3:42 AM
    Ack
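
Until HAVING is supported, one possible workaround is to apply the predicate to the grouped rows on the client after the query returns. The sketch below is an assumption for illustration, not something proposed in the thread: `apply_having` is a hypothetical helper, and the result rows are invented sample data shaped like (platform_id, sum(cost_usd_micros)).

```python
from typing import Callable, List, Tuple

GroupRow = Tuple[int, int]  # (platform_id, sum(cost_usd_micros))


def apply_having(rows: List[GroupRow], predicate: Callable[[GroupRow], bool]) -> List[GroupRow]:
    """Keep only the grouped rows that satisfy the HAVING-style predicate."""
    return [row for row in rows if predicate(row)]


# Hypothetical grouped results, as if returned by the query without HAVING.
grouped = [(1, 7_500), (2, 25_000), (3, 9_999)]

# Equivalent of: having (sum(cost_usd_micros) < 10000)
filtered = apply_having(grouped, lambda row: row[1] < 10_000)
print(filtered)  # [(1, 7500), (3, 9999)]
```

Note that filtering after the fact only sees the groups the query actually returned, so it inherits any trimming or TOP N limit applied before the results reach the client.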