# general
  • c

    coco

    07/06/2022, 6:20 AM
    Hi team!! Can I save the pid file to another directory? Do you have a config? https://github.com/apache/pinot/blob/de16a0a35d93f3fe0393df563015e7dd1298ceb9/pino[…]va/org/apache/pinot/tools/admin/command/StartServerCommand.java
  • d

    DV Kumar

    07/06/2022, 7:37 AM
    👋 Hi everyone! I am new to Apache Pinot. I am looking for an end-to-end architecture for real-time analytics with PowerBI visualizations.
    🍷 2
  • a

    A_Phil

    07/06/2022, 8:46 AM
    My requirement: I want to create a single table where I can do aggregations like sum on integer values of a column, but also hold string values in the same column. What I am dealing with: I have a realtime table where I am ingesting data with fields `timestamp`, `id` and `value`; the `id` describes the `value` being ingested; here `value` can hold both `INT` and `STRING` for my use case. I wanted to know which of these options are feasible: 1. Create columns `value_int` and `value_string` and use a filtering function in Pinot that can save records in `value_int` if `value` is `INT`, and vice-versa for `STRING` values of `value`. I tried this, but the filter function as shown in the docs does not allow this. 2. Store all values as `STRING` and use a Pinot-specific `CAST` or `CONVERT` function to do aggregations. But I could not find a cast/convert function in Pinot, so I am not able to do `sum` operations on the data. I would welcome any ideas/workarounds for the same.
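One way to picture option 1 from the message above is to split the field before the records reach Pinot, for example in whatever process writes to the stream. The following is only a minimal Python sketch under that assumption: `split_value` and `to_row` are hypothetical helper names (not Pinot functions), and the column names `value_int` and `value_string` are the ones proposed in the message.

```python
import re
from typing import Optional, Tuple

# Plain ASCII integers only, with an optional sign (assumed definition of "INT").
_INT_PATTERN = re.compile(r"[+-]?[0-9]+")


def split_value(raw: str) -> Tuple[Optional[int], Optional[str]]:
    """Return (value_int, value_string); exactly one of the two is set."""
    text = raw.strip()
    if _INT_PATTERN.fullmatch(text):
        return int(text), None
    return None, raw


def to_row(timestamp: int, record_id: str, raw_value: str) -> dict:
    """Build one record with the two typed columns instead of a mixed `value`."""
    value_int, value_string = split_value(raw_value)
    return {
        "timestamp": timestamp,
        "id": record_id,
        "value_int": value_int,
        "value_string": value_string,
    }
```

With rows shaped like this, a sum only has to look at `value_int`, and the string values stay queryable in `value_string`.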
  • d

    David Gregory

    07/06/2022, 11:34 AM
    👋 Hi everyone!
    👋 3
  • j

    John Peter S

    07/06/2022, 12:13 PM
    Question: I am trying out pinot on top of docker. How do I create a set of unassigned instances so that I can add it as a server / broker later on under different tenants using the AddTenant command? Currently the StartServer / StartBroker command initiates a single instance under the default tenant. I can't seem to find the proper config to do this.
  • a

    Amanda Robson

    07/06/2022, 2:57 PM
    Hi all 🙂 Sharing an awesome podcast recording we just did with Pinot co-creator @Kishore G 🙂 https://anchor.fm/ossstartuppodcast/episodes/E41-Real-time-Analytics-Powered-by-Startree--Apache-Pinot-e1krv6q
    🔥 1
    🌟 1
    🆒 8
  • k

    Kevin Xu

    07/07/2022, 1:17 AM
    Hi all. Could someone tell me how to use presto-pinot-driver in an external client, or how to build presto-pinot-driver?
  • d

    Diogo Baeder

    07/07/2022, 2:04 AM
    Something that I don't see other users mentioning about Pinot, but I think is relevant, is how much fun one can have with this database. I've been having a lot of it, while developing an experimental project with Pinot, and one aspect of it that really makes me happy is how well it works for creating custom segments, which is a big necessity for my project. Like, the fact that I can define a specific name for my segments, and then upload updated segments later on and just see the data there, available for querying, is amazing.
    🍷 1
  • c

    coco

    07/07/2022, 5:01 AM
    Hi team!! https://docs.google.com/document/d/1X32OMT6lC4pCveQVzK6OvRlaW0kE9HZ2vn_EHzesM1w/edit#heading=h.orpvmm6ldpup '1. Server stops sending heartbeat to Helix (1 heartbeat per 10 seconds)' How can I set the heartbeat interval of the Pinot Server? I want to test reducing the heartbeat interval in order to reduce the query failure time. Hi @Jackie, would you please check this question?
  • a

    abhinav wagle

    07/07/2022, 5:51 PM
    Hello all, wanted to check which tools/frameworks are being used for automation/CI-CD of Pinot table/schema creation?
  • p

    Prashant Pandey

    07/10/2022, 5:29 PM
    Hi team, is there a way to “look” inside a segment on disk? Want to do some analysis on a bunch of columns.
  • r

    Rohan Pednekar

    07/11/2022, 5:49 PM
    Hey 👋 Apache Pinot community - Rohan here from the Presto community.👋 Happy to join this community! I hope it’s ok I post in this channel. I wanted to share a free open source community conference coming up - PrestoCon Day 2022. This is a great event to learn more about Presto, one of the most popular open-source SQL query engines. Meta, Uber, Bytedance, Tencent, Apache Hudi and many more will be sharing how they’re using Presto for next-gen data architecture. PrestoCon Day is fully virtual and free, check out the details and reg at https://events.linuxfoundation.org/prestocon-day/ Hope to see you there!
    👋 2
  • y

    Yarden Rokach

    07/11/2022, 7:15 PM
    Happy Monday it is, Pinot stars!🌟 We're excited to announce StarTree Data Manager, a No-Code, Self-Service Tool for Apache Pinot™. Learn more on the blog post>> https://www.startree.ai/blog/announcing-startree-data-manager Let us know here in the comments if you have any comments!
    👏 2
    🍷 5
  • d

    Dan DC

    07/12/2022, 1:08 PM
    Please, can someone clarify when a realtime segment becomes "OFFLINE"? Is there any way to change it back to CONSUMING or ONLINE?
  • s

    Scott deRegt

    07/12/2022, 6:55 PM
    I'm trying to optimize a query that currently contains aggregation functions on top of `CASE ...` statements. I found this thread that case statements will not work with star tree - since the thread is a couple of years old just wanted to double-check if this is still the case?
  • t

    Tony Zhang

    07/12/2022, 11:03 PM
    I am reducing the replica number from 2 to 1 for a realtime table, but it does not seem to have worked. The old segments' status is 1/2 or bad. Any suggestions? Thx
  • h

    Huaqiang He

    07/13/2022, 9:08 AM
    Hi team, in a single table config can I consume one Kafka topic and write into multiple Pinot tables?
  • s

    Sergii Balganbaiev

    07/13/2022, 2:22 PM
    Hi team, I am currently doing some tests/research on the upsert functionality and understood that data with high-cardinality primary keys can use a lot of heap space for storing upsert metadata. My question is about storing this metadata off-heap (on disk, in a KV store or somewhere else). I found different threads saying that this work is currently in progress and want to know what stage it is at. Are there any estimates of when it will be done? And is there a design document about it? I will be grateful for any information 🙂
  • s

    Scott deRegt

    07/14/2022, 12:08 AM
    Hi 👋, I have some confusion about offline partition-based segment pruning. My understanding is: 1. In order to use partition-based segmentPruner for offline segment data, the data source that is ingested by Pinot must already be partitioned by the desired partition column, e.g. `s3://bucket/metrics/country=US/files.parquet`. 2. From this thread, it seems there is no way currently to make Pinot aware of columns associated with partition filepath metadata, i.e. in the above example, the Pinot table cannot contain a `country` column. Am I understanding that correctly? If so, how does partition-based segment pruning help in this case if the partition column cannot be part of the query issued to Pinot?
  • d

    Deepika Eswar

    07/14/2022, 11:07 AM
    hello all
  • d

    Deepika Eswar

    07/14/2022, 11:08 AM
    I am new to Apache Pinot. I want to know how to perform ETL on an offline table. Can anyone help?
  • d

    Deepika Eswar

    07/14/2022, 11:08 AM
    I have ingested the data from a NiFi server into Pinot through a batch ingestion job. I want to perform operations from one table to another, like in other databases. Does Pinot support this? P.S. I am using Windows. Since most of the documentation is for Linux and Mac, I couldn't dive deep into how to use Pinot locally on Windows.
  • s

    Slackbot

    07/14/2022, 12:57 PM
    This message was deleted.
  • y

    Yarden Rokach

    07/14/2022, 7:13 PM
    Hello Apache Pinot Stars! TLDR: Our Swag Store is Live! 👕 😎 We want to encourage you to share knowledge, consult, and keep contributing to the Apache Pinot community, as we believe this is what a community’s all about! Contributors? Get your Swag! • Click here and you’ll be redirected to your brand new swag store • Login with your work email and a code will be generated and sent to your email • Fill in the code on the site and enjoy your SWAG! But that's not all! Contributors, we want to make sure you do it in style • Check the earning table below! _(Or on our Community page)_ • Share the action with us as specified • We’ll reach out with the coins you earned • Get your swag! Contribute > Make an impact > Get your swag > Repeat ❤️ https://stree.ai/swag
    ❤️ 2
    🙌 1
  • d

    Deepika Eswar

    07/15/2022, 6:42 AM
    How do I read an offline table in Pinot using Python?
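One possible way to read a table from Python is to send SQL to the broker's HTTP query endpoint (`POST /query/sql` with a JSON body) using only the standard library. This is a rough sketch, not an official client: the broker address and the table name are placeholders, the helper names are hypothetical, and the response fields used (`resultTable`, `dataSchema`, `columnNames`, `rows`) reflect the broker's JSON response as I understand it.

```python
import json
import urllib.request


def build_query_request(broker_url: str, sql: str) -> urllib.request.Request:
    """Prepare a POST request carrying the SQL text as a JSON body."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        broker_url.rstrip("/") + "/query/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rows_as_dicts(response: dict) -> list:
    """Turn the result table of a broker response into a list of row dicts."""
    table = response.get("resultTable") or {}
    columns = table.get("dataSchema", {}).get("columnNames", [])
    return [dict(zip(columns, row)) for row in table.get("rows", [])]


def query_pinot(broker_url: str, sql: str) -> list:
    """Send the query to the broker and return its rows (needs a running broker)."""
    with urllib.request.urlopen(build_query_request(broker_url, sql)) as resp:
        return rows_as_dicts(json.load(resp))
```

A call would then look like `query_pinot("http://localhost:8099", "SELECT * FROM myTable LIMIT 10")`, where the address and `myTable` are assumptions to be replaced with the real broker and table.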
  • a

    Abhishek Gupta

    07/15/2022, 10:22 AM
    Hey all! I was going through the "Real-Time data flow" documented here. I had two questions: 1. If there are n replicas of a segment, is it the correct understanding that all n servers consume the events from Kafka and only 1 of them "wins" to commit to deep storage? 2. In a rare event that "all" n servers for a segment go down before the segment could get completed (and committed to deep storage from memory), does this in-memory data get lost forever or do the new servers manage to consume these "lost" events again?
  • c

    chandarasekaran m

    07/17/2022, 2:55 AM
    Hi team, how can I parse a Kafka header (in bytes) and filter based on a specific field? Any code samples?
  • a

    Abdullah Jaffer

    07/17/2022, 5:14 AM
    Hi everyone, for security purposes we have Swagger access disabled when working with Pinot. This hasn't been a problem, but now I need to create a new tenant, so the question is: is there a way to do that in the UI?
  • j

    John Peter S

    07/18/2022, 7:17 AM
    Hi, I have some doubts related to partitioning. From the docs I am able to grasp how partitioning works when importing data through Kafka, but with respect to offline importing I am not very clear. My current understanding and questions are as follows. 1. If my input folder has 10 files, these individual files will be saved as individual segments following the replication config mentioned in the table config. 2. If I want to partition these files as they are imported, this doc says the input files themselves should already be partitioned. How should this be configured in the table config? Is it similar to the partition config we set for realtime import, i.e. giving a columnPartitionMap with a partition function and number of partitions? If so, I am a bit confused, because in offline import the segments created have a 1-to-1 mapping with the input files, and if the input files are already partitioned, then what is the purpose of the partition config? Is it used for query routing alone?
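The "query routing" idea raised in the message above can be pictured with a toy model: if each segment records which partition ids its rows fall into, an equality filter on the partition column only needs the segments that contain the matching partition. The sketch below assumes a simple modulo partition function and one segment per input file; all names are hypothetical and this is not Pinot's code.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


def modulo_partition(value: int, num_partitions: int) -> int:
    """Assumed partition function: the value modulo the number of partitions."""
    return value % num_partitions


@dataclass
class Segment:
    name: str
    partitions: Set[int]  # partition ids observed in this segment's rows


def build_segments(files: Dict[str, List[int]], num_partitions: int) -> List[Segment]:
    """One segment per input file, recording the partition ids it contains."""
    return [
        Segment(name, {modulo_partition(v, num_partitions) for v in values})
        for name, values in files.items()
    ]


def prune(segments: List[Segment], filter_value: int, num_partitions: int) -> List[str]:
    """Names of the segments that can contain rows where column == filter_value."""
    target = modulo_partition(filter_value, num_partitions)
    return [s.name for s in segments if target in s.partitions]
```

In this model a file whose rows all share one partition id is skipped for most lookups, while a file that mixes several partition ids has to be scanned more often, which is one way to see why the input files are expected to be partitioned already.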
  • a

    Abhishek Gupta

    07/18/2022, 8:59 AM
    We have a system which stores transactional data in MongoDB, where updates/deletes are pretty common. To build an analytical reporting system on this data, we were considering leveraging Pinot, by streaming data into it from Mongo. While I see Pinot supports streaming ingestion with "upserts", I had the following questions: 1. The doc mentions in a few places that Pinot is designed for "immutable" data, which seems to contradict the upsert feature. How do these two concepts hold together? 2. "Upsert table maintains an in-memory map from the primary key to the record location" - The "record location" could be either in-memory or in the segment store, so does this map maintain both kinds of locations? By storing all primary keys, will this map keep growing indefinitely in memory and require vertical scaling of servers at some point? 3. If a record in a segment is updated, all servers need to reload it, I guess. Does it make updates expensive? 4. Overall, is our use-case well suited for Pinot (where data updates/deletes of a record are pretty common)?
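The quoted sentence about a map from primary key to record location can be modelled with a small toy class. This is only an illustration of the data structure being asked about, not Pinot's implementation: the class and field names are hypothetical, and resolving conflicts by a comparison value (such as an event timestamp) is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Optional


@dataclass(frozen=True)
class RecordLocation:
    segment: str           # segment holding the latest version of the record
    doc_id: int            # position of the record inside that segment
    comparison_value: int  # e.g. event timestamp; assumed tie-breaker


class UpsertIndex:
    """Toy map from primary key to the location of the newest record."""

    def __init__(self) -> None:
        self._latest: Dict[Hashable, RecordLocation] = {}

    def add(self, primary_key: Hashable, location: RecordLocation) -> bool:
        """Register a record; return True if it becomes the visible version."""
        current = self._latest.get(primary_key)
        if current is not None and location.comparison_value < current.comparison_value:
            return False  # older than what is already tracked, so ignored
        self._latest[primary_key] = location
        return True

    def lookup(self, primary_key: Hashable) -> Optional[RecordLocation]:
        return self._latest.get(primary_key)

    def __len__(self) -> int:
        return len(self._latest)
```

In this model the map holds one entry per distinct primary key, so its size follows the key cardinality rather than the number of updates applied to each key.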