# general
  • y

    Yarden Rokach

    09/15/2022, 6:05 PM
    The Community is Growing💪🏻 3000+ Slack members! Another incredible milestone in the Apache Pinot community's growth! Thanks to all of our Slack members for keeping the ideas coming & conversations flowing! Cheers to more than 3000 members!
    🍷 8
    ❤️ 12
    🚀 9
    j
    • 2
    • 2
  • a

    amit mahadik

    09/16/2022, 9:53 AM
    Hi all, can someone please suggest an article/tutorial on minions?
    m
    • 2
    • 2
  • h

    Huaqiang He

    09/19/2022, 1:10 AM
    Hi team, regarding the merge rollup task can I 1) specify multiple aggregationTypes on the same metric column, for example, I want sum & min & max (percentile too) for the latency column after rollup. 2) ignore some dimensional columns during and after the rollup/aggregation. We need to ignore some dimensional columns because before rollup the table stores raw signals which all have a uuid column. Retaining the uuid column, there will be no actual rollup.
  • a

    amit mahadik

    09/19/2022, 5:05 PM
    Hi All, what is granularity and bucketTimePeriod and are they interlinked?
    m
    • 2
    • 1
  • c

    Corneliu Creanga

    09/19/2022, 11:06 PM
    Hello, we have a very large Kafka topic (it gets about 10-25 million rows/second) and we would like to use real-time ingestion to fill in some tables. For each table we have a custom decoder that knows how to extract the proper data from each message or skip the message. I'm curious how the ingestion works: will Pinot stream the data independently for each table, or can we ingest once and apply each decoder to the ingested rows? Thanks a lot :)
    m
    • 2
    • 2
  • a

    abhinav wagle

    09/19/2022, 11:24 PM
    Hello, is the Pinot open source project an ASF project? https://www.apache.org/licenses/contributor-agreements.html
    m
    • 2
    • 3
  • c

    Chengxuan Wang

    09/20/2022, 2:57 AM
    hey, wondering if there is a way to use TEXT_MATCH for a UUID prefix search. This is what I tried:
    TEXT_MATCH(order_id, '"ae006b22-b5f"')
    but it doesn't return any data. If I try
    TEXT_MATCH(order_id, 'ae006b22-b5f0*')
    it returns more rows than just those starting with ae006b22-b5f0. Wondering what the correct way to do it is. Thanks.
    m
    s
    • 3
    • 9
  • p

    Prakhar Pande

    09/20/2022, 2:29 PM
    Hi everyone! I have very recently started exploring Pinot. I am facing a few problems while ingesting data from a Kafka topic. 1. When I ingest data into a table with only default indexing, totalDocs as shown in the query console is around 123 million. However, when I ingest into a table with star-tree indexing from the same Kafka topic, totalDocs is only 70 million (after I have stopped pushing more data into Kafka). 2. Is there a way I can delete data from the Pinot cluster but keep it in the deep store? 3. If my cluster suddenly goes down, from which state will the deep store restore my data? Thanks in advance.
    t
    • 2
    • 1
  • y

    Yarden Rokach

    09/20/2022, 6:06 PM
    9 days left to apply for the StarTree All-Stars program! ⚡📣 Our All-Stars are provided with access to product discussions, and exclusive events, and will be the first to know about any major product developments, features, and updates! They also are provided with limited-edition Pinot and StarTree swag! 👕 😎 Check out the full program>> Not sure if you should apply? send me a message to discuss it!
  • s

    Sukesh Boggavarapu

    09/23/2022, 9:37 PM
    Can Pinot infer a partitioning field based on the input path? For example, if I have an S3 path like "s3://my-bucket/dt=2022-09-01" and a table with dt in the schema (but my actual data in S3 doesn't contain a dt column), and I run an ingestion job through Spark, can it infer that the partition is dt=2022-09-01, create a partition on that, and also populate the dt value?
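Editor's sketch for the question above: this is not a Pinot or Spark feature, only a minimal illustration of what "inferring the partition from the path" would mean if done in a preprocessing step before ingestion. The helper names `partition_values` and `add_partition_columns` are hypothetical, and the sketch assumes Hive-style `key=value` path components as in the example path.

```python
from urllib.parse import urlparse


def partition_values(path: str) -> dict:
    """Collect Hive-style key=value components from a path or URI."""
    values = {}
    # urlparse puts the bucket in netloc, so only the object path is scanned.
    for part in urlparse(path).path.split("/"):
        if "=" in part:
            key, _, value = part.partition("=")
            values[key] = value
    return values


def add_partition_columns(row: dict, path: str) -> dict:
    """Return a copy of row with partition values filled in for missing columns."""
    enriched = dict(row)
    for key, value in partition_values(path).items():
        # Keep a value already present in the data; only fill the gap.
        enriched.setdefault(key, value)
    return enriched
```

Under these assumptions, a row read from "s3://my-bucket/dt=2022-09-01" would gain a dt field before it is handed to the ingestion job.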
    k
    h
    • 3
    • 3
  • d

    deepuak01

    09/24/2022, 3:58 PM
    👋 Hi everyone! I am new to Apache Pinot and am looking to use Pinot as an OLAP datastore for my organization
    🦜 1
  • d

    deepuak01

    09/24/2022, 3:59 PM
    Can anyone suggest a link or an online document describing how to set up Apache Pinot on AWS?
    g
    m
    • 3
    • 23
  • e

    Ehsan Irshad

    09/27/2022, 8:54 AM
    Hi folks, I don't see a channel for the spark-pinot connector. Was wondering if there is a plan to support the connector with Spark 3 in upcoming releases?
    k
    g
    • 3
    • 7
  • c

    coco

    09/27/2022, 11:11 AM
    How does Pinot's partitioning work when the number of partitions in a Kafka topic increases? Is there any problem? 'stream ingestion with upsert' https://docs.pinot.apache.org/basics/data-import/upsert#use-strictreplicagroup-for-routing 'routing partitioning' https://docs.pinot.apache.org/operators/operating-pinot/tuning/routing#partitioning
    n
    • 2
    • 3
  • a

    Alex

    09/27/2022, 1:56 PM
    hi everyone! does anyone know of any good slice-and-dice UI for Pinot? I'm thinking of something like Imply (formerly Pivot) for Druid. It should be open source. The idea -> give analysts an easy way to look at a single table (drill down, slice, …) without any SQL
    d
    h
    • 3
    • 6
  • m

    Machhindra

    09/27/2022, 6:09 PM
    Hi everyone! I am trying to store 'metrics' in time-series format in a Pinot real-time table. I am not sure how to design the table config to transform the incoming JSON from the Kafka topic into the Pinot table as shown in the picture. Basically, I need to match each 'label-name' to a Pinot column and insert the 'label-value' as that column's value, from a JSON array. I would have put the entire set of labels into a single column, but I want to allow users to query like "select … from mytable where ZosSystem='Blah'".
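Editor's sketch for the question above: the picture is not part of this archive, so the message shape here is an assumption. The sketch supposes each Kafka message carries a "labels" array of objects with 'label-name' and 'label-value' keys, and shows only the reshaping itself (one top-level field per label), not a Pinot table config. The function name `flatten_labels` and the "labels" field name are hypothetical.

```python
import json


def flatten_labels(message: str,
                   labels_key: str = "labels",
                   name_key: str = "label-name",
                   value_key: str = "label-value") -> dict:
    """Turn an array of name/value label objects into top-level fields."""
    record = json.loads(message)
    # Remove the array and promote each label to its own field.
    for label in record.pop(labels_key, []):
        record[label[name_key]] = label[value_key]
    return record
```

With records reshaped this way, a label such as ZosSystem becomes an ordinary field that can map to a column and be filtered in a WHERE clause.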
    m
    • 2
    • 4
  • y

    Yarden Rokach

    09/28/2022, 10:52 AM
    #RTASummit is happening TODAY! Make sure to register, and join us for 3 hours of deep dives into top-tier companies' use cases, data flows, and real-time analytics! Jay Kreps (CEO of Confluent) will be there, will you? https://www.linkedin.com/posts/startreedata_trailer-for-jay-kreps-confluent-at-real-t[…]316037263360-Zqw3?utm_source=share&utm_medium=member_desktop
    🔥 3
    i
    • 2
    • 1
  • t

    Tim Berglund

    09/28/2022, 3:09 PM
    Yes! Today!
  • t

    Tim Berglund

    09/28/2022, 3:10 PM
    rtasummit.com. Do what must be done. See you in 50 minutes. 🙂
    🔥 4
  • t

    Tim Berglund

    09/28/2022, 3:42 PM
    A super-secret view of the StarTree studios, where the Real-Time Analytics Summit is being broadcast.
    🍷 11
    l
    • 2
    • 1
  • y

    Yarden Rokach

    09/28/2022, 6:46 PM
    Last call to submit your nomination for the StarTree All Stars! The nomination will be closing tomorrow. 🌟 https://community.startree.ai/all-stars
  • k

    Karin Wolok

    09/29/2022, 3:30 PM
    Just to add to what Yarden posted above - There will be no submission considerations for All Stars 2023 after this day, so please submit ASAP if you haven't already!
  • e

    Edgaras Kryževičius

    09/29/2022, 3:37 PM
    Hey! When I run a Spark ingestion job on my local system, I can see that pinot-plugins-dir-x (where x is an int) directories are being created. What are they? Where would they be created if I ran a spark-submit job on k8s? Would they be created on the executor pod?
    h
    k
    • 3
    • 2
  • j

    Jinny Cho

    10/04/2022, 2:17 PM
    👋 Can I ask one question? I'm looking into making ZooKeeper more resilient. How would you prepare for the case where all of the ZooKeeper nodes are down? I'm considering some kind of backup for ZooKeeper and am curious if there's any recommendation, especially for ZooKeeper in a Pinot environment.
    j
    k
    m
    • 4
    • 8
  • a

    Ashish Kumar

    10/05/2022, 12:39 PM
    Hi Team, I was looking at the Pinot Go client https://docs.pinot.apache.org/users/clients/golang and it seems like it connects via a ZooKeeper path. Wondering if that's a good practice from a security perspective, because it'll require exposing ZooKeeper to clients?
    m
    s
    • 3
    • 2
  • p

    piby

    10/06/2022, 11:14 AM
    Hi, I am just exploring this project and have a question on Pinot S3 data ingestion. At our company we have new data coming in as JSON/CSV files every minute/hour. We are currently using Postgres, which is hard to scale, so we are looking for a performant, horizontally scalable OLAP solution, ideally one that runs on Kubernetes. My question is whether it is possible to sync an S3 bucket with Pinot. So, if we add new CSV/JSON files to the bucket, Pinot should automatically ingest (only) the new files into its segment store without any duplicates. I expect this is doable using S3 events, but I couldn't find whether something like this is already in place. If not, then we have to cook up our own solution using S3 events or set up a Kafka cluster to stream data to Pinot. Thanks!
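Editor's sketch for the question above: a minimal illustration of the "only new files, no duplicates" bookkeeping that a home-grown sync would need, independent of how the bucket listing or S3 events are obtained. It is not an existing Pinot feature; the function names `select_new_files` and `mark_ingested` are hypothetical, and persisting the ingested-key set is left out.

```python
def select_new_files(listed_keys, ingested_keys, suffixes=(".csv", ".json")):
    """Return de-duplicated, sorted keys with a matching suffix that are not yet ingested."""
    candidates = {key for key in listed_keys if key.endswith(suffixes)}
    return sorted(candidates - set(ingested_keys))


def mark_ingested(ingested_keys, keys):
    """Return a new set that also records the given keys as ingested."""
    return set(ingested_keys) | set(keys)
```

Each sync round would call select_new_files on the current listing, trigger an ingestion job for the result, and then store the output of mark_ingested so the same objects are skipped next time.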
    s
    m
    • 3
    • 2
  • l

    Lab Nems

    10/06/2022, 10:36 PM
    Hi, I started working with Pinot, and in my tests I want to connect Tableau Server to Pinot via JDBC, but I have run into a difficulty. The connection to Pinot is established fine and I can see the Pinot tables, but I cannot see the contents of the tables, and there is no error in the Pinot logs. I encounter this problem only with the containerized version of Pinot. Is there an option I need to set to connect to Pinot with JDBC when Pinot is running under Docker? Thanks
    r
    m
    c
    • 4
    • 6
  • s

    Steven Hall

    10/06/2022, 11:51 PM
    Hi Team. First, this is a cool project. Excited to be looking into it and learning more. Newbie question… I have looked at the architecture and I see some older docs that show that the controller consists of two components: ZooKeeper and Helix. If we choose a Kubernetes deployment, it seems Kubernetes does the same things that Helix does. Am I correct in assuming the Kubernetes deployment does not include Helix?
    a
    r
    • 3
    • 4
  • m

    Matthew Kerian

    10/07/2022, 8:08 PM
    Hello. We were wondering what the preferred way of creating tables/schemas is. Is there any reason not to just use the web page?
    j
    m
    • 3
    • 3
  • m

    Michael Latta

    10/08/2022, 6:36 AM
    Is it possible, and a good idea, to use offline tables and create segments directly in Flink, or is it better to write the data from Flink to Kafka and use a real-time table? We generate the data in Flink, but given the size, writing directly to the segment store might have advantages. We could use a short-ish retention period as well.
    m
    • 2
    • 5