# general
  • t

    Taran Rishit

    11/30/2020, 1:46 PM
    Hello, I'm unable to start Pinot due to the error below:
    Administrator@EC2AMAZ-6IDI4LG /cygdrive/c/users/Administrator/documents/apache-inot-incubating-0.6.0-bin/apache-pinot-incubating-0.6.0-bin
    $ bin/pinot-admin.sh StartController -zkAddress localhost:2191 -controlerPort 9000
    Unrecognized VM option 'PrintGCDateStamps'
    Error: Could not create the Java Virtual Machine.
    Error: A fatal exception has occurred. Program will exit.
    I'm stuck at the second step of https://docs.pinot.apache.org/basics/getting-started/advanced-pinot-setup. Can somebody help me?
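For context on the error text above: `PrintGCDateStamps` is a Java 8 era GC-logging flag that was removed in JDK 9, and a newer JVM refuses to start when it is passed. The sketch below is one way to check which major Java version is on the PATH. The helper names are hypothetical, and the idea that the start script passes this flag is an assumption drawn only from the error message.

```python
import re
import subprocess


def parse_java_major(version_output: str) -> int:
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a version string")
    first, second = int(match.group(1)), match.group(2)
    # Java 8 and earlier report themselves as "1.8.0_271", "1.7.0_80", ...
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major() -> int:
    """Run `java -version` (it prints to stderr) and parse the result."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr or result.stdout)


def accepts_print_gc_date_stamps(major: int) -> bool:
    # The flag was removed in JDK 9 together with the old GC logging options.
    return major <= 8
```

If `installed_java_major()` reports 9 or higher, the error in the message would be expected for any script that still passes the flag.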
    x
    • 2
    • 10
  • a

    Amit Chopra

    11/30/2020, 2:53 PM
    Hi, I saw there is an open ticket for supporting Kinesis (https://github.com/apache/incubator-pinot/issues/5648). Wanted to check when is it slated to be supported?
    k
    • 2
    • 4
  • j

    Joice Jacob

    11/30/2020, 2:54 PM
    Is there any configuration to increase the LIMIT from 10 to a higher number?
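For readers following along: the row count can be set per query with an explicit LIMIT clause, independent of any cluster-wide setting. The sketch below builds such a query and the JSON body for a broker SQL endpoint. The table name `myTable`, the broker address `localhost:8099`, and the helper names are assumptions for illustration, and the request is only constructed, not sent.

```python
import json
import urllib.request


def with_limit(sql: str, limit: int) -> str:
    """Append an explicit LIMIT clause to a query that has none."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    stripped = sql.strip().rstrip(";")
    # Crude check: assumes "limit" is not used as a column or table name.
    if " limit " in f" {stripped.lower()} ":
        raise ValueError("query already has a LIMIT clause")
    return f"{stripped} LIMIT {limit}"


def build_query_request(sql: str, broker_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the SQL as JSON."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{broker_url}/query/sql",
        data=body,
        headers={"Content-Type": "application/json"},
    )


# Hypothetical table and broker address, for illustration only.
request = build_query_request(
    with_limit("SELECT * FROM myTable", 1000), "http://localhost:8099"
)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` against a running broker.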
    m
    • 2
    • 1
  • j

    Joice Jacob

    11/30/2020, 4:25 PM
    I was working on star-tree indexing. While loading the data, I got the following issue. The table config and schema are attached to this thread. I was trying to load 30 records. One of the star-tree index columns is MSISDN, with cardinality 10; another is TARIFF_PLAN, with cardinality 5.
    instant_table.json, instant_schema.json
    error.txt
    n
    k
    • 3
    • 2
  • g

    Graham Plata

    11/30/2020, 5:07 PM
    Hello all, I am encountering an error when trying to run the sample batch job located here https://docs.pinot.apache.org/basics/getting-started/pushing-your-data-to-pinot in k8s after following this guide https://docs.pinot.apache.org/basics/data-import/pinot-file-system/import-from-gcp with the pinot-gcs plugin. I am assuming I have a config issue somewhere but could use some expertise.
    job.yml, controller-run.txt, controller-logs.txt
    x
    • 2
    • 13
  • p

    Paul Baumgart

    11/30/2020, 6:42 PM
    Q: is it possible to query non-integer percentiles? I'm interested in calculating P99.9, but looking at the supported aggregations doc, it looks like those functions only support up to P99.
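To make the question concrete, here is what a non-integer percentile such as P99.9 means when computed exactly with the nearest-rank definition. This is a plain Python sketch of the arithmetic only; it is not how Pinot computes percentiles, and the function name is hypothetical.

```python
import math
from fractions import Fraction


def nearest_rank_percentile(values, p):
    """Return the nearest-rank percentile p (0 < p <= 100) of values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    # Use exact rational arithmetic so 99.9 does not suffer float rounding.
    rank = math.ceil(Fraction(str(p)) * len(ordered) / 100)
    return ordered[rank - 1]


latencies_ms = range(1, 10001)  # 10,000 synthetic samples: 1, 2, ..., 10000
p999 = nearest_rank_percentile(latencies_ms, 99.9)
```

With 10,000 samples, P99.9 is the 9,990th smallest value, which an integer-only percentile function (P99 at rank 9,900, P100 at rank 10,000) cannot express.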
    j
    m
    +2
    • 5
    • 7
  • k

    Ken Krugler

    11/30/2020, 8:32 PM
    Question about the batch import job. When running a LaunchDataIngestionJob, I see the S3-based file(s) being ingested are copied first to a temp directory on my local machine. Assuming I’ve set up a k8s-based cluster via EKS, is there a way to ingest directly from S3? I seem to recall some option to do this, which would be much more efficient.
    x
    • 2
    • 6
  • j

    Joice Jacob

    12/01/2020, 6:31 AM
    I am working on a load test using JMeter against Pinot tables. Currently I am getting an exception related to connection pooling.
    Response message: java.sql.SQLException: Cannot create PoolableConnectionFactory (null)
    java.sql.SQLException: Cannot create PoolableConnectionFactory (null)
        at org.apache.commons.dbcp2.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:669) ~[commons-dbcp2-2.7.0.jar:2.7.0]
        at org.apache.commons.dbcp2.BasicDataSource.createDataSource(BasicDataSource.java:544) ~[commons-dbcp2-2.7.0.jar:2.7.0]
        at org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:753) ~[commons-dbcp2-2.7.0.jar:2.7.0]
        at org.apache.jmeter.protocol.jdbc.config.DataSourceElement.initPool(DataSourceElement.java:308) [ApacheJMeter_jdbc.jar:5.3]
        at org.apache.jmeter.protocol.jdbc.config.DataSourceElement.testStarted(DataSourceElement.java:127) [ApacheJMeter_jdbc.jar:5.3]
        at org.apache.jmeter.engine.StandardJMeterEngine.notifyTestListenersOfStart(StandardJMeterEngine.java:205) [ApacheJMeter_core.jar:5.3]
        at org.apache.jmeter.engine.StandardJMeterEngine.run(StandardJMeterEngine.java:380) [ApacheJMeter_core.jar:5.3]
        at java.lang.Thread.run(Unknown Source) [?:1.8.0_271]
    x
    k
    • 3
    • 22
  • g

    Guillaume Loetscher

    12/01/2020, 11:02 AM
    Hey everyone! I’m currently investigating Apache Pinot, and after reading a good chunk of the documentation, I have a couple of questions.
    • On this page, it’s said that if you lose all your controllers, your cluster will still be able to answer read queries (but not write queries, obviously). Then, if a new controller is started, it says that the cluster will recover and will be available again for write queries. That supposes that all cluster state is stored somewhere. I suppose that “somewhere” is Zookeeper?
    • Offline servers are responsible for hosting segments. Let’s say we have only one replica for a given segment, and the offline server hosting it dies. Will Helix discover that and ask another offline server to download the same segment, in order to make it available again to the brokers?
    • Where can I find some information about the resource requirements (mainly CPU / memory) for controllers / brokers / realtime servers / offline servers?
    Thanks for your help!
    t
    k
    • 3
    • 16
  • s

    Sri Surya

    12/01/2020, 2:36 PM
    I tried to execute the Pinot start controller command and got the following error. Could you please help me with this?
    t
    x
    • 3
    • 2
  • m

    Mahesh Yeole

    12/01/2020, 6:48 PM
    @here I am trying to run Pinot in Kubernetes but am seeing the following error. Any suggestions?
    helm install -n pinot-quickstart kafka incubator/kafka --set replicas=1
    Error: failed to download "incubator/kafka" (hint: running `helm repo update` may help)
    x
    j
    • 3
    • 9
  • m

    Matt

    12/02/2020, 10:50 PM
    Hi, I have a text index up and running and it looks like it is working. However, I noticed that some results are not correct. For example, when I search for 40F916FD-F2A7-2255-FEFB-B43050D8A5EE, I get results for 81753586-72E1-8DC1-FEFB-08DB16E6A793 & 40F916FD-F2A7-2255-FEFB-B43050D8A5E. I'm trying to understand why that is. Also, if I try to search for XML tags like </ns1:requestControlID>, it throws an error. Is there any setting I can enable to make these searches work?
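One possible reading of both symptoms, sketched in Python. It assumes the text index tokenizes on punctuation such as hyphens (so an ID becomes several independent terms) and that the search string is parsed with Lucene's classic query syntax, where characters like `/` and `:` are operators. Neither assumption is confirmed in this thread, and the tokenizer below is a deliberately crude stand-in for a real analyzer.

```python
import re

# Characters that Lucene's classic query parser treats as syntax.
LUCENE_SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')


def escape_lucene(term: str) -> str:
    """Backslash-escape query-syntax characters so they match literally."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL_CHARS else ch for ch in term)


def simple_tokens(text: str) -> set:
    """Rough stand-in for an analyzer: split on non-alphanumerics, lowercase."""
    return {tok.lower() for tok in re.split(r"[^0-9A-Za-z]+", text) if tok}


searched = "40F916FD-F2A7-2255-FEFB-B43050D8A5EE"
unexpected_hit = "81753586-72E1-8DC1-FEFB-08DB16E6A793"

# The two IDs share one token, which is enough for an OR-style term match.
shared = simple_tokens(searched) & simple_tokens(unexpected_hit)

# The XML tag with its syntax characters escaped.
escaped_tag = escape_lucene("</ns1:requestControlID>")
```

Under these assumptions the unexpected hit is explained by the shared `FEFB` group, and wrapping the full ID in double quotes (a phrase query) or escaping the special characters would be the first things to try.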
    m
    s
    • 3
    • 34
  • d

    Dovydas Sabonis

    12/04/2020, 8:55 PM
    Hello! does pinot support importing gzipped data? We have gzipped JSON files in GCS bucket - can those be imported directly to pinot or do we have to serve uncompressed files in GCS?
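If direct import of gzipped files turns out not to be available, a fallback is to decompress before ingestion. A minimal sketch with the Python standard library, assuming the files are newline-delimited JSON (the message does not say which JSON layout is used) and using hypothetical local paths; copying to and from GCS is left out.

```python
import gzip
import json
from pathlib import Path


def gunzip_json_lines(src: Path, dst: Path) -> int:
    """Decompress a gzipped JSON-lines file and return the record count.

    Each non-empty line is parsed once so malformed records fail early.
    """
    count = 0
    with gzip.open(src, "rt", encoding="utf-8") as fin, \
            open(dst, "w", encoding="utf-8") as fout:
        for line in fin:
            line = line.strip()
            if not line:
                continue
            json.loads(line)  # raises ValueError on invalid JSON
            fout.write(line + "\n")
            count += 1
    return count
```

Calling `gunzip_json_lines(Path("events.json.gz"), Path("events.json"))` would write the uncompressed copy next to the original.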
    m
    k
    • 3
    • 12
  • t

    Tan Huynh

    12/07/2020, 7:27 PM
    Hello, is there any advice for Pinot schema design? Should I create multiple tables, one for each entity and the metrics that I want to query, or should I define one big table with all dimensions and metrics?
    x
    • 2
    • 2
  • k

    Kishore G

    12/07/2020, 8:57 PM
    I'm excited to announce the last Pinot meetup for the year 2020! The Pinot community has grown from 100 to 800 members this year. We want to take this opportunity to thank the entire Pinot community and get your inputs on our 2021 roadmap. In this fireside chat, I will go over all the things we have accomplished together in 2020 and talk about all the fantastic indexing techniques available in Pinot. Afterward, we'll open up for questions and discussions about Pinot and its roadmap. We will share a link tomorrow for everyone to post their questions/topics in advance. We are looking forward to seeing you there! Sign up here - https://www.meetup.com/apache-pinot/events/274700293/
    🎉 3
    👍 6
    🍷 6
    k
    • 2
    • 1
  • n

    Neer Shay

    12/08/2020, 6:58 PM
    Hi! I'm trying to get a little more information on ThirdEye and how it stacks up compared to Sherlock/Druid, so I have a few questions:
    1. What is going on behind the scenes? Is there some sort of model running which trains on historical data and learns what an anomaly is?
    2. How configurable is this? Can I specify which dimensions/multi-dimensions to run on, or does it automatically run on everything?
    3. How often does it run? Is this configurable?
    4. Where does it store its metadata?
    Thanks in advance for the assistance!
    k
    • 2
    • 4
  • m

    Matt

    12/08/2020, 10:53 PM
    Is there any doc detailing comparison of Pinot with ElasticSearch somewhere?
    x
    k
    +4
    • 7
    • 16
  • g

    Guillaume Loetscher

    12/09/2020, 5:11 PM
    Hey folks 👋 Got a quick question about Pinot + Docker: the documentation mentions Docker images several times. Are they supposed to be used in production or just for testing purposes?
    k
    x
    d
    • 4
    • 7
  • p

    Playsted

    12/09/2020, 9:37 PM
    Are there any guidelines / estimates for Pinot's resource requirements, particularly the memory / local node storage needed? E.g. for a 10 TB table in deep storage, you need X TB locally at the nodes and Y memory for reasonable performance? Is it designed so that the entire table should be cached locally and MMAP'd, or can parts go cold and be pulled on demand from deep storage with the extra latency?
    x
    g
    • 3
    • 22
  • p

    Priti Parkar

    12/10/2020, 11:55 PM
    I want to enable 'pre-aggregation' during realtime ingestion. https://docs.pinot.apache.org/basics/components/table#pre-aggregation 1. Can someone please point to documentation (mainly looking for schema or configs) ?  2. Also where are the pre-aggregated data stored? (in the same table/segment or different)
    m
    • 2
    • 5
  • e

    eunbin lee

    12/14/2020, 9:31 AM
    Hi, I'm looking into Minion's GDPR support. I read in the documentation that the Minion framework can be used to meet GDPR compliance requirements. However, the detailed description is "coming soon", so I'm confused. Is the ability to use the Minion framework to delete records under certain conditions in the background not yet available? Or is it just the documentation that hasn't been written yet?
    In addition, I have some questions about audit, authorization, and DR.
    1. Audit at the query level. I need to know not only the table config and schema change log, but also who requested which queries and when (including target tables and conditions). Does Pinot offer auditing? Or is it possible to use Minion to monitor queries in the background and log them?
    2. Is Pinot planning to provide authentication and authorization modules? Druid provides a built-in Kerberos authenticator and provides authorization through the Ranger extension. Are there any similar plans?
    3. I want to configure replication between two data centers (not using cloud). Ideally, if data center 1 fails, we want to fail over to data center 2 and fail back when data center 1 is back to normal. Suppose I have configured deep storage (HDFS) and a Pinot cluster on k8s in each of the two data centers. Deep storage replication is possible. But what happens to real-time data? I understand that real-time data is held in memory and segments are periodically flushed to disk. If a cluster goes down, will real-time data that has not yet been flushed be lost? I'm not sure how to configure DR on Pinot. Is there an approach you would recommend?
    I'm in the process of getting to know Pinot. Thanks in advance for the help.
    x
    m
    +3
    • 6
    • 7
  • p

    Playsted

    12/14/2020, 4:04 PM
    How are text_match regexes performed? I'm looking for string-contains type queries (e.g. text_match(column, '/.*partial_term.*/')). Normally I would look to Lucene ngram tokenization for this, but I see Pinot isn't using it. How are the partial-term regexes evaluated? Is this essentially a raw regex run against all tokens?
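A toy model of the "raw regex against all tokens" reading in the question: each indexed term is tested against the pattern as a whole, with no ngram structure to narrow the candidates. This only illustrates that cost model in Python. It is an assumption about how such queries behave, not a description of Pinot's internals, and Python's `re` syntax differs from Lucene's regexp syntax.

```python
import re


def regex_term_matches(terms, pattern):
    """Return the terms that match the pattern in full, scanning every term."""
    compiled = re.compile(pattern)
    return [term for term in terms if compiled.fullmatch(term)]


# Hypothetical term dictionary for one column.
indexed_terms = ["error", "partial_term_x", "xpartial_termy", "timeout"]
hits = regex_term_matches(indexed_terms, ".*partial_term.*")
```

In this model a leading `.*` forces a scan of every term, which is the cost that ngram tokenization is normally used to avoid.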
    m
    • 2
    • 4
  • d

    Dharak Kharod

    12/14/2020, 11:00 PM
    Hi, while testing offline table ingestion with Pinot from GitHub, I found that the overwrite mode is now called refresh, and I got an error when using overwrite as the segment push type. Is the overwrite keyword not valid anymore?
    n
    • 2
    • 5
  • d

    Darshan

    12/16/2020, 7:27 PM
    Thanks, Kishore and co, for the fantastic knowledge session :) Would it be possible to share the slides, performance stats and design documents? I ask because these references help to bolster our internal notes/documents.
    k
    • 2
    • 1
  • k

    Karin Wolok

    12/16/2020, 7:27 PM
    This is a great blog post written by @User at Confluera! Very well structured and really walked through their problem / ideas / requirements and challenges. https://medium.com/confluera-engineering/real-time-security-insights-apache-pinot-at-confluera-a6e5f401ff02
    👏 7
    p
    • 2
    • 1
  • w

    Will Briggs

    12/18/2020, 3:22 PM
    Has anyone configured realtime Kafka ingest with SASL / jaas auth (as in how Confluent handles auth for their managed clusters)?
    k
    e
    • 3
    • 31
  • d

    dhurandar

    12/18/2020, 5:18 PM
    A question regarding Apache Pinot: what's the typical OLAP cube size one can host in Apache Pinot? We have a cube which is almost 50 TB. It has some dimensions with very high cardinality, but since the raw data is more than 5 petabytes, 50-100 TB is still a reasonable aggregation. We want interactive performance from our OLAP layer, since it would power important dashboards and drill-downs. So we want to know how much data we can push into Apache Pinot.
    m
    • 2
    • 8
  • w

    Will Briggs

    12/18/2020, 7:15 PM
    Essentially, I can’t figure out how to specify a time literal that is compatible with my timestamp column for doing math on it
    m
    x
    a
    • 4
    • 24
  • a

    Amit Chopra

    12/18/2020, 7:51 PM
    Quick question - has anyone set up AWS Athena with Pinot, given that Athena is essentially Presto underneath?
    x
    • 2
    • 3