# general
  • m

    Mannoj

    04/21/2025, 9:12 AM
    What training material can Apache Pinot recommend for SREs or DevOps, and is there any certification for Apache Pinot to go through?
  • p

    Peter Corless

    04/21/2025, 6:33 PM
    Folks sound off in this thread to let me know you're registered for rtasummit.com, and what sparked you to register (a specific talk, speaker, networking opportunity, etc.)! Feel free to chime in!
  • p

    Peter Corless

    04/22/2025, 8:13 PM
    Hi folks! We are going to be announcing the premier StarTree “Real-Time Revolutionaries” Awards at RTA Summit 2025. 4,500+ attendees have already registered, yet there will be only a dozen or so companies recognized in this year's inaugural awards. As one of the people behind organizing the awards, one of the things that limited me greatly was simply knowing about a real-time analytics use case. For StarTree customers, we have more direct communication and insights. Yet for many great Apache Pinot OSS stories, I haven't heard of what you're up to! So for next year, please avail yourselves of telling us what you're up to! You might just be a winner in 2026!
    • Apache Pinot User Story
    Meanwhile, remember to register for RTA Summit! Coming up this 14 May. rtasummit.com
  • j

    jamangstangs

    04/23/2025, 7:54 AM
    Hi, may I ask when Apache Pinot 1.4.0 is expected to be released?
  • s

    San Kumar

    04/25/2025, 4:57 AM
    Hello, is it recommended to set the properties below to enable multi-tenancy for tables in a Pinot cluster? cluster.tenant.isolation.enable=false and pinot.set.instance.id.to.hostname=true, as I see in the documentation: https://docs.pinot.apache.org/basics/concepts/components/cluster/tenant As far as I know, cluster.tenant.isolation.enable=false is not recommended.
  • s

    San Kumar

    04/25/2025, 4:58 AM
    Can you please provide guidance on these properties for a production setting?
  • s

    San Kumar

    04/25/2025, 4:59 AM
    Or what is the best approach to enable multiple tenants?
  • p

    Peter Corless

    04/25/2025, 8:27 PM
    Have you folks found the deepwiki for Apache Pinot yet? I'd love folks' feedback on the quality and accuracy: https://deepwiki.com/apache/pinot
  • y

    Yarden Rokach

    04/29/2025, 8:54 AM
    🚀 How are Uber, Netflix, and Spotify scaling real-time analytics- for themselves and their users? Join us at #RTASummit on May 14th — the must-attend (and free!) event for data architects and engineers. 💡 One highlight you won’t want to miss: Building Real-Time GenAI Pipelines with Apache Pinot and AWS Discover how to build a real-time social media analysis pipeline using Amazon Managed Service for Apache Flink, Amazon Bedrock, and Apache Pinot as a vector database. See how this architecture powers real-time RAG (Retrieval Augmented Generation), enabling instant insight into social media trends and unlocking next-gen GenAI search and analysis capabilities. Perfect for teams looking to harness GenAI + streaming data for real-time, data-driven decisions. RSVP>
  • s

    San Kumar

    04/29/2025, 9:52 AM
    Hello Team, is the Pinot Java client API a wrapper around the HTTP REST API?
  • y

    Yarden Rokach

    05/02/2025, 11:23 AM
    CONGRATS @Gonzalo Ortiz and @Sonam Mandal 💥
  • m

    mohan

    05/13/2025, 6:50 AM
    Hi, we have a Spring Boot application integrated with the Apache Pinot JDBC driver. We are using JdbcTemplate, which in turn uses a DataSource (with org.apache.pinot.client.PinotDriver) over HTTPS. We are seeing a lot of threads (running forever) at org.apache.pinot.common.utils.tls.RenewableTlsUtils.reloadSslFactoryWhenFileStoreChanges(RenewableTlsUtils.java:239); one is created for every connection. To resolve this, we are creating an SSLContext and injecting it via new PinotDriver(sslContext). This resolved the issue, but that PinotDriver constructor is declared as follows:
    @VisibleForTesting
    public PinotDriver(SSLContext sslContext) {
      _sslContext = sslContext;
    }
    We want to know whether we can use the above constructor, given that it is annotated @VisibleForTesting. If not, what is the best way to use the Pinot Java client to avoid the thread issue? Waiting for your response.
  • y

    Yarden Rokach

    05/13/2025, 2:32 PM
    HAPPENING TOMORROW⏳ Last Call to Join the Real-Time Analytics Summit 🚀 Why attend? • Learn how to scale insights instantly for end users • Cut costs without compromising speed • See how real-time analytics powers top companies Register Here>>
  • p

    Peter Corless

    05/13/2025, 9:20 PM
    Tomorrow, the Real-Time Analytics Summit 2025 is going to be awesome! On top of the amazing roster of keynotes and speakers we have lined up, we'll also be hosting the inaugural Real-Time Revolutionaries awards. These will be given to the organizations and developers who are changing their companies, whole industries and the world through the use of Apache Pinot. Today we're rehearsing. Tomorrow, the live presentations! (Tux and ballgowns are completely optional.) Register for free at https://rtasummit.com
  • y

    Yarden Rokach

    05/14/2025, 5:28 PM
    We’re LIVE at the Real-Time Analytics Summit! It’s not too late to join us- tune in now and catch what’s next: 🎤 Coming up: Netflix, Starbucks, and Uber share how they amplify real-time analytics for their teams- and millions of end users, every day, every second. Sign up here to join the action>
  • y

    Yarden Rokach

    05/14/2025, 5:49 PM
    Good stuff folks! Join us here
  • p

    Peter Corless

    05/15/2025, 9:02 PM
    ICYMI: https://startree.ai/resources/overheard-at-real-time-analytics-summit-2025 Also, you can still register to watch on-demand: https://rtasummit.com
  • c

    Chirag Varshney

    05/19/2025, 9:57 AM
    Hey, we are currently implementing Apache Pinot in our fintech company to enhance our analytics capabilities. Previously, we used MySQL as our OLTP system. As part of this transition, we’ve introduced a data pipeline that captures CDC (Change Data Capture) events from MySQL, pushes them to Kafka, and then loads them into Pinot’s real-time tables. We’ve also loaded historical data into offline tables.
    However, we’re encountering an issue: when we update a historical row in MySQL, the change is not reflected in Pinot, even after flushing the real-time data to disk and running the RealtimeToOfflineSegmentsTask. We’re still seeing the old data in queries.
    Steps Taken So Far:
    1. Verified that CDC events are correctly propagated to Kafka.
    2. Confirmed that real-time tables are receiving new updates.
    3. Flushed real-time segments and executed the RealtimeToOfflineSegmentsTask to merge data.
    Questions:
    • Are there any known gaps in this workflow that could cause stale data in offline segments?
    • Could there be a misconfiguration in the real-time-to-offline task or retention settings?
    • Are there additional steps required to ensure updates to historical data are properly reflected?
    • How can we eliminate duplicate rows in the offline table?
    • What is the correct bucket time to use when dealing with years of historical data for the dedup task in offline tables?
    We’d appreciate any insights or troubleshooting suggestions to resolve this issue. Thanks in advance for your help!
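    The duplicate-row question above boils down to keeping only the newest CDC event per primary key. A minimal sketch of that idea in plain Python, outside Pinot, assuming each event is a dict with hypothetical `id` and `updated_at` fields (neither name comes from the thread). It does not describe what any Pinot task does internally.

```python
from typing import Dict, Iterable, List


def latest_per_key(events: Iterable[dict], key: str = "id",
                   version: str = "updated_at") -> List[dict]:
    """Keep only the newest event for each primary key.

    `key` and `version` are hypothetical field names chosen for this sketch.
    """
    newest: Dict[object, dict] = {}
    for event in events:
        current = newest.get(event[key])
        # Replace the stored row when this event is at least as new;
        # on a tie, the later arrival wins.
        if current is None or event[version] >= current[version]:
            newest[event[key]] = event
    return list(newest.values())


# Historical rows plus one CDC update for id 1 (illustrative data only).
historical = [{"id": 1, "amount": 100, "updated_at": 10},
              {"id": 2, "amount": 50, "updated_at": 11}]
cdc_updates = [{"id": 1, "amount": 120, "updated_at": 20}]

deduped = latest_per_key(historical + cdc_updates)
```

    Here `deduped` holds one row per `id`, and the row for `id` 1 carries the updated amount. Whether this resolution should run at ingestion time, in a batch job, or at query time depends on the table setup.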
  • s

    Suresh PERUML

    05/20/2025, 1:40 AM
    Hi all, I would like to remove or delete certain records from an Apache Pinot schema or table. Can you share the procedure for deleting them? The schema has columns named ID, Timestamp, and values, where ID is a number. I would like to delete data using ID as the input, for example: delete from demotable where id='1234' or id in (1234, 145,345,133)
  • s

    Suresh PERUML

    05/20/2025, 12:23 PM
    Hi all, I am working on a backup and restore task. I took a backup of the segments, altered the value of one of the columns inside the segments, and recreated the segment from a CSV file. After that, I uploaded the segments via the REST API -> ../segments?tableName=data_record&tableType=OFFLINE .... The API returns a "SUCCESS" message. Then, in the Apache Pinot GUI, when I select the table, it shows totaldocs with the proper value. After this verification, when I run select * from table_name or select * from table_name limit 100, I get no records found. Can you correct me or share the exact steps to ensure that the data import is correct, there is no data loss, and I am able to view the data?
  • z

    Zhuangda Z

    05/22/2025, 6:42 PM
    Hi team, is there a WBL concept in Pinot after an event is ingested? We would want to detect this signal and apply rollup in another data system. One option is to have a cron job regularly pulling the table but wonder if there is something out of the box
  • f

    Felipe

    05/26/2025, 4:44 PM
    Hi all, quick question: is there a way to aggregate a column's values as a list? For example, all IDs matching a group-by criterion should be shown as a list of IDs.
  • a

    Angel Ruiz

    05/27/2025, 5:24 PM
    Hi everyone! I'm Ruiz from Philippines — a software engineer with 7+ years of experience. I am specialized in JS/TS/Python/PHP, React (MERN, Next), Django (Wagtail CMS), Headless CMS (Strapi, Payload CMS), and Supabase in software development, AI Solutions such as AI Voice Agent Development (AI/ML, LLM, RAG, Retell/VAPI), Chatbot Development and also Business processing automation using n8n, make.com. Recently, I’ve been diving into cloud technologies and microservices to develop some really exciting projects. I’m currently seeking new opportunities and would love to connect with anyone who has openings that might be a good fit or could use my skills. Don’t hesitate to reach out if you need an extra pair of hands for your company. Thank you! #CDRCA57FC
  • e

    Emilio Duran

    05/28/2025, 2:24 PM
    Hi #CDRCA57FC, I’m Emilio Duran from Philippines, an AI-focused Full Stack Developer with 7+ years of experience building intelligent, end-to-end applications across web and mobile platforms. I specialize in merging modern frontend technologies with powerful backend and AI/ML systems to deliver real-time, data-driven, and user-centric products. My work spans from intuitive UI/UX interfaces to deploying deep learning models and LLM-powered services.
    🔍 What I Do:
    - AI Integration: Seamlessly embed GPT-4, LangChain, Claude, and custom ML models into apps.
    - Web & Mobile Development: Build responsive, high-performance applications using React, Next.js, React Native, and Node.js.
    - Backends for AI: Design scalable APIs and pipelines with FastAPI, Flask, and Django.
    - Chatbots & Voice Agents: Create advanced chatbot systems integrated with Slack, Telegram, and custom APIs.
    - RAG Systems & NLP: Architect intelligent search, Q&A, and summarization tools with vector databases like Pinecone and FAISS.
    - Computer Vision: Deliver real-time object detection and visual intelligence with OpenCV, PyTorch, and edge devices.
    I’ve partnered with clients from the US, Canada, Germany, and India—successfully delivering 30+ smart solutions for industries like e-commerce, healthcare, and finance.
    💬 Let’s connect via DM if you're looking for a developer who can turn AI into real-world solutions, and deliver production-grade web and mobile applications, I’d love to collaborate.
  • s

    SP

    06/02/2025, 5:02 PM
    Hi everyone, we are optimizing our Apache Pinot deployment on AWS EKS and have a few questions about best practices for Availability Zone (AZ)-local query routing, to minimize cross-AZ costs. Our plan is to tag Pinot servers according to their AZs:
    - AZ-A servers: tagged as AZ_A_OFFLINE, grouped within the pool AZ_A_OFFLINE
    - AZ-B servers: tagged as AZ_B_OFFLINE, grouped within the pool AZ_B_OFFLINE
    These pools will be added to the table configuration, ensuring that communication between servers remains within the same AZ.
    Broker Query Routing
    • Currently, Pinot brokers distribute queries in a round-robin fashion, and there appears to be no built-in way to enforce AZ-local routing. Ideally, a broker located in AZ-A should query servers from the AZ_A_OFFLINE pool, but we have not found a direct method to control this behavior.
    Any suggestions or improvements to this approach are appreciated.
  • u

    Utsav Jain

    06/03/2025, 9:39 AM
    Hi everyone, we have enabled partial upserts on a realtime table in Pinot, but due to the nature of our business we had to set the metadataTTL expiry on upserts to 12 hours. As a result, we are now getting duplicate entries in Pinot when an update to an event arrives after the 12-hour window. This causes our reports that use count(*) to show wrong results, and distinct is not working because the data set has grown too large. To solve this, we are planning to use a segment compaction job running on a minion. Has anyone implemented such a segment compaction task in Java that we could use as a reference? If so, please help.
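    The duplicates described above can be reproduced with a toy model of TTL-bounded upsert metadata. The sketch below is a simplified simulation in plain Python, not Pinot's implementation: it assumes key metadata is dropped once the key's latest row falls more than `ttl` time units behind the largest event time seen so far. The function name, the `order-1` key, and the time units are all made up for illustration.

```python
from typing import Dict, List, Tuple


def ingest(events: List[Tuple[str, int]], ttl: int) -> List[Tuple[str, int]]:
    """Simulate upserts whose key metadata expires after `ttl` time units.

    Each event is (primary_key, event_time). Returns the rows that stay
    visible to queries. Simplified model only.
    """
    live: Dict[str, int] = {}          # primary key -> index of its visible row
    rows: List[Tuple[str, int]] = []   # every row ever written
    visible: List[bool] = []           # visibility flag per written row
    watermark = 0                      # largest event time seen so far
    for key, ts in events:
        watermark = max(watermark, ts)
        # Forget keys whose latest row is older than the TTL window.
        expired = [k for k, i in live.items() if rows[i][1] < watermark - ttl]
        for k in expired:
            del live[k]
        if key in live:
            visible[live[key]] = False  # normal upsert: hide the previous row
        rows.append((key, ts))
        visible.append(True)
        live[key] = len(rows) - 1
    return [row for row, keep in zip(rows, visible) if keep]


# Update arrives 6 hours later: the old row is replaced.
within_ttl = ingest([("order-1", 0), ("order-1", 6)], ttl=12)
# Update arrives 13 hours later: the key was forgotten, so both rows stay.
after_ttl = ingest([("order-1", 0), ("order-1", 13)], ttl=12)
```

    In this model `within_ttl` contains a single row, while `after_ttl` contains two rows for the same key, which is the kind of duplicate that inflates count(*).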
  • s

    San Kumar

    06/03/2025, 7:05 PM
    Hello Team, we are doing batch ingestion into an offline table based on the event_time column. We receive updated or new records for a particular event_time and push them to the offline table via a job spec. Is the configuration below really required for the offline table?
    "ingestionConfig": {
      "batchIngestionConfig": {
        "segmentIngestionType": "APPEND",
        "segmentIngestionFrequency": "DAILY"
      }
    },
    Segments are pushed even without this. What is the use of the above property?
  • m

    Mannoj

    06/04/2025, 12:58 PM
    A quick question: does the cluster config setting or the client setting take priority in Pinot?
    For example, I am setting pinot.broker.timeoutMs = 60000 in the broker and server config files, but if some clients want to override this setting, can they set a new value at the client level, like pinot.broker.timeoutMs = 1200000, and get the query executed?
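    One way a client can request a different timeout is a per-query option rather than a broker property. The sketch below only builds the request text and is a hedged illustration: it assumes the `SET timeoutMs = ...;` query-option prefix and a JSON body with a `sql` field posted to the broker's `/query/sql` endpoint, both of which should be checked against the documentation for the Pinot version in use. `myTable` and the helper names are hypothetical. Whether the per-query value may exceed cluster-side limits is not settled by this sketch.

```python
import json


def with_timeout(sql: str, timeout_ms: int) -> str:
    """Prefix a query with a per-query timeout option (assumed syntax)."""
    if timeout_ms <= 0:
        raise ValueError("timeout_ms must be positive")
    return f"SET timeoutMs = {timeout_ms}; {sql}"


def broker_payload(sql: str, timeout_ms: int) -> str:
    """Build the JSON body for a POST to the broker SQL endpoint (assumed: /query/sql)."""
    return json.dumps({"sql": with_timeout(sql, timeout_ms)})


# myTable is a hypothetical table name; 1200000 is the value from the question.
payload = broker_payload("SELECT COUNT(*) FROM myTable", 1200000)
```

    The resulting `payload` string can be sent with any HTTP client; sending is left out so the sketch stays free of network assumptions.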
  • a

    Alexander Maniates

    06/04/2025, 3:53 PM
    Hello, I am looking to understand a bit more about how the combination of segmentPrunerTypes works when pruning segments for a query: 🧵
  • g

    guru

    06/04/2025, 5:32 PM
    Super excited to share the MCP Server for Apache Pinot, and we open sourced it!!! Give it a try and share any feedback you may have. Have fun querying Pinot with MCP 🙂 https://startree.ai/resources/startree-mcp-server-for-apache-pinot