# general
  • a

    Alex Gartner

    06/15/2022, 6:30 PM
    Hi all. Does anyone want to share their Pinot setups for external customers? Has anyone built web apps or HTTP proxies on top of it to expose to customers? If so, what frameworks did you use, what Pinot libraries/APIs, etc.?
    k
    • 2
    • 1
  • f

    Fernando Barbosa

    06/15/2022, 8:32 PM
    [Question] Hi everyone!! I am not sure this is the right channel for this, but here we go. I need to query for all transactions in the past five minutes from now. I have a timestamp column in a realtime table. However, I do not know how to express this in the `where` part of the statement. Has anyone tried that already? I want something more or less like this:
    where timecolumn <= (current_timestamp - interval '5' minute) as BIGINT * 1000
    m
    • 2
    • 7
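One way to express the five-minute window from the question above is to compute the cutoff on the client and pass it as a literal. This is a minimal sketch, assuming `timecolumn` stores epoch milliseconds as a LONG. The table name `transactions` and the helper `recent_rows_query` are hypothetical and not part of any Pinot API. Note that the comparison is `>=`, since rows from the past five minutes have timestamps at or after the cutoff. Pinot also documents an `ago()` scalar function that takes an ISO-8601 duration (for example `ago('PT5M')`), which may let the broker compute the cutoff instead if your version supports it.

```python
import time

FIVE_MINUTES_MS = 5 * 60 * 1000


def recent_rows_query(table, time_column, window_ms=FIVE_MINUTES_MS, now_ms=None):
    """Build a query for rows whose epoch-millis timestamp falls in the last window_ms.

    Assumption: time_column holds epoch milliseconds. Names are illustrative only.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    cutoff_ms = now_ms - window_ms
    # ">=" keeps rows at or after the cutoff, i.e. the most recent window.
    return f"SELECT * FROM {table} WHERE {time_column} >= {cutoff_ms}"


if __name__ == "__main__":
    print(recent_rows_query("transactions", "timecolumn"))
```

Passing `now_ms` explicitly makes the helper deterministic, which is convenient when comparing results across repeated runs.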
  • r

    Roberto

    06/16/2022, 4:02 PM
    Hi everyone! I'm trying to write a query that filters results with the expression "SELECT * from TABLE WHERE TEXT_MATCH(city, '/^Lon/')". I want to obtain the cities that start with "Lon". I don't know if that's possible in Pinot. Probably my regex is wrong, but I can't find a solution. Thanks!!!
    k
    a
    z
    • 4
    • 27
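A hedged sketch for the prefix question above. Lucene-style regular expressions are matched against whole indexed terms, so an anchor such as `^` is generally neither needed nor supported inside `TEXT_MATCH`. Two alternatives are a Lucene prefix query (`Lon*`) on a text-indexed column, or `REGEXP_LIKE` on the raw column value. The helper name `prefix_queries` and the table name `myTable` are hypothetical. Because a text index tokenizes values, the `TEXT_MATCH` form may also match rows where a later word starts with the prefix (for example "New London").

```python
def prefix_queries(table, column, prefix):
    """Return two candidate queries for 'value starts with prefix'.

    Assumption: prefix is plain alphanumeric text; regex metacharacters
    are not escaped in this sketch.
    """
    # Double any single quotes so the prefix is safe inside a SQL string literal.
    safe = prefix.replace("'", "''")
    return {
        # Lucene prefix query against the text index (matches indexed terms).
        "text_match_prefix": f"SELECT * FROM {table} WHERE TEXT_MATCH({column}, '{safe}*')",
        # Regex applied to the whole column value.
        "regexp_like": f"SELECT * FROM {table} WHERE REGEXP_LIKE({column}, '^{safe}.*')",
    }


if __name__ == "__main__":
    for name, sql in prefix_queries("myTable", "city", "Lon").items():
        print(name, "->", sql)
```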
  • f

    Fernando Barbosa

    06/17/2022, 3:35 PM
    [Question] Hi everyone, I am back here with another question. I am running the following query, but the `max` value returned ignores the `where` clause. Could you please help me with that:
    select max(myColumn) max_myColumn
    from myTable
    where myDateTime >= (now() - cast((9499687)+(1000*60*60) as long))
    k
    • 2
    • 4
  • t

    Tim Berglund

    06/17/2022, 6:37 PM
    Folks. I just tweeted this, but I’m also curious whether we find it more fun to discuss things like this in Slack. So without creating a special channel, I’m just gonna cross-post the Twitter thread here in #general like a baus. 😆 (One tweet per line for easier threading!)
    🙌 2
  • t

    Tim Berglund

    06/17/2022, 6:37 PM
    This account of LinkedIn’s migration from OpenJDK 8 to OpenJDK 11 is about more than Apache Pinot™, but the Pinot story alone is remarkable. There’s a nearly 3x increase in throughput, but it’s the tail latencies where the 🤯 really happens: https://engineering.linkedin.com/blog/2022/linkedin-s-journey-to-java-11
  • t

    Tim Berglund

    06/17/2022, 6:37 PM
    Everybody is friends at the 90th percentile, but it’s at P95 and P99 that things like GC pauses start to get you. Those have been flattened in a dramatic way in JDK11.
  • t

    Tim Berglund

    06/17/2022, 6:37 PM
    At first blush, one would think this is a G1GC love story, and I suspect the truth really is that simple. I’d be interested to hear if anyone (Pinot committers, LinkedIn folks) has more color on that.
  • t

    Tim Berglund

    06/17/2022, 6:37 PM
    It’s also a nice time to point out the extent to which Pinot tries to stay off-heap. There’s a huge amount of sequential I/O going on internally, and also large mmap’d regions hanging around being core to what Pinot servers do. Much of this is off-heap, as it should be.
  • t

    Tim Berglund

    06/17/2022, 6:37 PM
    Of course, you can only play that game so long, since Pinot is, well, a Java program. Actual query processing is still gonna beat the heap good and hard, which is where we see the glories of G1GC delivering as they do.
  • t

    Tim Berglund

    06/17/2022, 6:37 PM
    Really it’s just nice to see JVM innovation trucking along the way it is. This platform remains a healthy place for data infrastructure to live and grow.
    m
    • 2
    • 1
  • n

    Norman he

    06/17/2022, 8:17 PM
    How do I know the timestamp at which my data was ingested into a realtime table?
  • n

    Norman he

    06/17/2022, 8:18 PM
    Is there a hidden timestamp I can access? If not, what is the best way to track the time at which a record becomes available in a Pinot realtime table?
    m
    n
    • 3
    • 2
  • c

    chandarasekaran m

    06/18/2022, 1:52 PM
    👋 Hi everyone! I am new to this community. Currently I am looking for an alternative to the stream processing stack (Beam + Flink) that I am using. My use case:
    1. Consume data from an event bus and transform it on the fly.
    2. Store the transformed data in Pinot.
    3. Notify a downstream event bus of the changes (updated data).
    Based on the documentation, I set up my local environment, consumed messages from Kafka, stored them in Pinot, and ran queries from the query console. I have not yet explored how to do the transformation on the fly while consuming data from Kafka, or how to notify the downstream event bus. Is this possible in the current Pinot architecture, or is there a workaround? Can anyone help me?
    🖐️ 1
    m
    • 2
    • 1
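On the "transformation on the fly" part of the question above: Pinot table configs accept ingestion-time transforms under `ingestionConfig.transformConfigs`, where each entry names a destination column and a transform function. The sketch below builds such a fragment as JSON. The column names `eventTimeMillis` and `eventDay` and the helper `ingestion_transform_fragment` are hypothetical, and `toEpochDays` is used on the assumption that it is available as a built-in function in your Pinot version. It does not address notifying a downstream event bus.

```python
import json


def ingestion_transform_fragment(source_column, derived_column):
    """Return a tableConfig fragment that derives a day bucket from an epoch-millis column.

    Assumption: source_column holds epoch milliseconds and derived_column
    is declared in the table schema. Names are illustrative only.
    """
    return {
        "ingestionConfig": {
            "transformConfigs": [
                {
                    "columnName": derived_column,
                    "transformFunction": f"toEpochDays({source_column})",
                }
            ]
        }
    }


if __name__ == "__main__":
    fragment = ingestion_transform_fragment("eventTimeMillis", "eventDay")
    print(json.dumps(fragment, indent=2))
```

The fragment would be merged into the rest of the table config before the table is created or updated.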
  • t

    Tanay Karmarkar

    06/18/2022, 2:25 PM
    Hi everyone, I am new to this community as well! Do you have a set of best practices for running real-time queries? I went through one of the videos on YouTube explaining the architecture of Pinot and its use cases. In the example, the presenter uses a Kafka topic as the source and runs queries on the real-time table. I am wondering whether idempotency is built into these tables, or whether we need to write the queries in a way that takes into account idempotency, accuracy of the metrics, quality checks, etc. What happens if I reset the offset for the consumer, or the source publishes events twice or thrice?
    k
    y
    h
    • 4
    • 4
  • t

    Tim Berglund

    06/20/2022, 3:41 PM
    So glad you’re here, Yarden! 💥
    🙏 1
  • m

    Mitchell H

    06/20/2022, 3:44 PM
    Welcome @Yarden Rokach
    😃 1
  • h

    Himanshu Rathore

    06/21/2022, 5:26 AM
    Are there plans to introduce gRPC / Arrow Flight query support for brokers as well?
    k
    m
    • 3
    • 3
  • m

    Manishbatheja

    06/21/2022, 6:52 AM
    Hello everyone, I am new here... I am working as a DevOps engineer. Love to learn new things...
    m
    y
    • 3
    • 3
  • v

    Vuppala Suresh Kumar

    06/21/2022, 8:45 AM
    Hi, when I try to update values to null in the table via an API request body attribute, the change is not reflected in the Pinot DB, e.g. { "xyz": null }. What is the difference between mode FULL and PARTIAL?
    "upsertConfig": { "mode": "FULL",
    "upsertConfig": { "mode": "PARTIAL",
    m
    • 2
    • 8
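The difference asked about above can be pictured with a simplified model. This is not Pinot's implementation, only a sketch under stated assumptions: in FULL mode the newest record for a primary key replaces the stored row entirely, while in PARTIAL mode with an overwrite-style strategy a null in the incoming record leaves the stored value untouched. If that assumption holds, it would be consistent with `{ "xyz": null }` not being reflected under PARTIAL. The function names are hypothetical.

```python
def upsert_full(stored, incoming):
    """FULL (modelled): the incoming record replaces the stored row entirely."""
    return dict(incoming)


def upsert_partial_overwrite(stored, incoming):
    """PARTIAL with an overwrite strategy (modelled).

    Non-null incoming values replace stored values; null incoming values
    keep whatever was already stored. This null behaviour is an assumption.
    """
    merged = dict(stored)
    for column, value in incoming.items():
        if value is not None:
            merged[column] = value
    return merged


if __name__ == "__main__":
    stored = {"id": 1, "xyz": "old", "abc": 10}
    incoming = {"id": 1, "xyz": None, "abc": 20}
    print("FULL:   ", upsert_full(stored, incoming))
    print("PARTIAL:", upsert_partial_overwrite(stored, incoming))
```

In this model only FULL carries the null through. Whether a stored null then surfaces as null or as a column default depends on the table's null-handling settings, which the sketch does not cover.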
  • y

    Yarden Rokach

    06/21/2022, 2:52 PM
    Happy Tuesday everyone! ☄️ As there are manyyyy new community members here (including me 😛), I just want to be sure we're all aware of the Real-Time Analytics Summit hosted by StarTree this August in SFO! 🌉 This is a great opportunity to learn from the best, explore practical use cases, deep dive into building real-time analytics systems, and network with our community! We'll be revealing a new speaker every day, and believe me, the agenda is on 🔥. Have a glimpse of our first speaker, Eric Sammer, CEO of Decodableco, here: https://twitter.com/startreedata https://www.startree.ai/real-time-analytics-summit Our community has an exclusive promo code! Write to me in the comments or DM me if you are interested, and I'll share it.
    🎉 3
    l
    • 2
    • 2
  • m

    Mugdha Goel

    06/21/2022, 4:52 PM
    Hello team, quick question: I need to store my timestamp column in dateTimeFieldSpecs with precision up to nanoseconds since the epoch. Is that possible in Pinot? I am looking at the docs and I see only milliseconds. I could be missing the doc for this, but just wanted to check with the team.
    "dateTimeFieldSpecs": [{
        "name": "consensus_timestamp",
        "dataType": "LONG",
        "format" : "1:MILLISECONDS:EPOCH",
        "granularity": "1:NANOSECONDS"
      }]
    Is something like this possible?
    l
    • 2
    • 3
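On the storage side of the nanosecond question above, a signed 64-bit LONG has room for nanoseconds since the Unix epoch until the year 2262, so the data type itself is not the limiting factor. Whether the `format` string accepts a nanosecond unit is a separate question that the sketch below does not answer. The helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

INT64_MAX = 2**63 - 1
NANOS_PER_SECOND = 1_000_000_000
NANOS_PER_MILLI = 1_000_000


def max_nanos_timestamp():
    """Latest instant representable as int64 nanoseconds since the epoch (to the second)."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=INT64_MAX // NANOS_PER_SECOND)


def nanos_to_millis(nanos):
    """Truncate an epoch-nanoseconds value to epoch milliseconds."""
    return nanos // NANOS_PER_MILLI


if __name__ == "__main__":
    print("int64 nanoseconds overflow after:", max_nanos_timestamp().isoformat())
    print("example millis:", nanos_to_millis(1_655_830_320_123_456_789))
```

If only millisecond formats turn out to be usable for the time column, one fallback is to keep the raw nanosecond value in a separate LONG column and store the truncated millisecond value in the time column.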
  • a

    Alice

    06/22/2022, 12:26 AM
    Hi team. If I create several tenants to achieve server resource isolation, is it recommended to assign a broker to each tenant?
    ➕ 1
    n
    • 2
    • 1
  • y

    Yarden Rokach

    06/22/2022, 5:10 PM
    Next up on our amazing list of speakers for the #RTASummit 💫 is @mgrygles, Streaming Developer Advocate at @DataStax. Mary is the president of @CJUG and will talk about Event Messaging and Streaming with Apache Pulsar: https://twitter.com/startreedata/status/1539358158212190208?s=20&t=TckxATINmO4QteVrLULHkQ Save your seat now! A promo code is available for our community, LMK if you're interested!
    🦜 2
  • m

    Michael Latta

    06/22/2022, 10:31 PM
    Question: would you expect text indexes to work with OR? I have tried TEXT_MATCH(...) OR TEXT_MATCH(...) on the same field, and also placing the OR inside the text query expression, and it consistently returns results with only one word matching the text match, as if it picks a single word from the index and then just returns records with that one word in them.
  • m

    Michael Latta

    06/22/2022, 10:32 PM
    In my case TEXT_MATCH(name,'Bob') returns 20 records, and TEXT_MATCH(name,'B*') returns 30, but using OR I get one or the other.
  • m

    Michael Latta

    06/22/2022, 10:48 PM
    Problem solved. The Pinot UI defaults to LIMIT 10 and I was looking at documents scanned, not result rows. Giving it an explicit limit and ordering the results allowed me to paginate through the results and see the various text matches.
    🙌 1
    👍 1
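The fix described in the thread above (an explicit limit plus an ordering, then paging through results) can be sketched as follows. This assumes the `LIMIT offset, count` pagination form is accepted by your Pinot version. The helper `text_match_page` and the table name `people` are hypothetical.

```python
def text_match_page(table, column, expression, order_by, page, page_size=10):
    """Build one page of a TEXT_MATCH query with a stable ordering.

    Assumption: 'LIMIT offset, count' pagination is supported; page is zero-based.
    """
    # Double any single quotes so the expression is safe inside a SQL string literal.
    safe = expression.replace("'", "''")
    offset = page * page_size
    return (
        f"SELECT * FROM {table} "
        f"WHERE TEXT_MATCH({column}, '{safe}') "
        f"ORDER BY {order_by} "
        f"LIMIT {offset}, {page_size}"
    )


if __name__ == "__main__":
    for page in range(3):
        print(text_match_page("people", "name", "Bob OR B*", "name", page))
```

Ordering matters here: without it, consecutive pages are not guaranteed to be disjoint, and ties in the ordering column can still shuffle rows between pages.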
  • l

    Laxman Ch

    06/23/2022, 4:14 AM
    Hi Devs, I am trying to follow this doc and enable the partition pruner for our hybrid tables: https://docs.pinot.apache.org/operators/operating-pinot/tuning/routing#querying-all-segments I am noticing the following issue with `routingConfig`. When the partition pruner is enabled, queries return partial results, especially for REALTIME. As per my understanding, results for CONSUMING segments are not being returned. I wanted to check whether this partition pruner can be used for REALTIME tables. Partial results with:
    "routing": {
          "segmentPrunerTypes": [
            "time",
            "partition"
          ]
        },
    Full results with:
    "routing": {
          "segmentPrunerTypes": [
            "time"
          ]
        },
  • y

    Yarden Rokach

    06/23/2022, 2:23 PM
    Bom Dia everyone! Our community is growing! 🍷 We have a bunch of new folks and that's exciting! I would warmly recommend that anyone who's not in our #C03D8NNB12M channel visit it shortly and meet our new joiners! Veteran members? You are more than welcome to introduce yourselves too! The value of knowing each other, and the synergy it creates... I could write a 4-page blog post about it... (that's an idea......🤔) See you all at the #C03D8NNB12M channel ❤️
  • r

    Rajan Garg

    06/23/2022, 2:39 PM
    Hi all, I am facing an issue while deploying Pinot on K8s. Can someone help me out? I am trying to install Pinot on a K8s cluster using the latest released Helm chart, `0.9.3` - https://github.com/apache/pinot/blob/master/kubernetes/helm/README.md The only change I have made is the `nodeSelector` in the values.yaml file. I am getting errors when using this command to install the Helm chart in the K8s cluster:
    helm install pinot . -f values.yaml -n pinot --set cluster.name=pinot --set server.replicaCount=2
    Here are the zookeeper logs:
    + /config-scripts/run
    mkdir: cannot create directory '/data/log': No space left on device
    /config-scripts/run: line 44: echo: write error: No space left on device
    + exec java -cp '/apache-zookeeper-3.5.5-bin/lib/*:/apache-zookeeper-3.5.5-bin/*jar:/conf:' -Xmx256M -Xms256M org.apache.zookeeper.server.quorum.QuorumPeerMain /conf/zoo.cfg
    2022-06-23 09:56:06,224 [myid:] - INFO  [main:QuorumPeerConfig@133] - Reading configuration from: /conf/zoo.cfg
    2022-06-23 09:56:06,244 [myid:] - INFO  [main:QuorumPeerConfig@385] - clientPortAddress is 0.0.0.0/0.0.0.0:2181
    2022-06-23 09:56:06,244 [myid:] - INFO  [main:QuorumPeerConfig@389] - secureClientPort is not set
    2022-06-23 09:56:06,294 [myid:] - ERROR [main:QuorumPeerConfig@645] - Invalid configuration, only one server specified (ignoring)
    2022-06-23 09:56:06,307 [myid:] - ERROR [main:QuorumPeerMain@89] - Invalid config, exiting abnormally
    org.apache.zookeeper.server.quorum.QuorumPeerConfig$ConfigException: Error processing /conf/zoo.cfg
            at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:154)
            at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:113)
            at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
    Caused by: java.lang.IllegalArgumentException: serverid null is not a number
            at org.apache.zookeeper.server.quorum.QuorumPeerConfig.setupMyId(QuorumPeerConfig.java:690)
            at org.apache.zookeeper.server.quorum.QuorumPeerConfig.setupQuorumPeerConfig(QuorumPeerConfig.java:602)
            at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parseProperties(QuorumPeerConfig.java:420)
            at org.apache.zookeeper.server.quorum.QuorumPeerConfig.parse(QuorumPeerConfig.java:150)
            ... 2 more
    Invalid config, exiting abnormally
    The disk space I have is 25 GB.
    y
    m
    • 3
    • 6