# general
  • a

    Adrian Cole

    08/12/2020, 5:43 AM
    waves to @User and @User
    👋 1
  • j

    Joey Pereira

    08/12/2020, 9:54 PM
    🤔 Question about the documentation. On https://docs.pinot.apache.org/developers/advanced/data-ingestion, the docs say:
    The servers download the segments (as a cached local copy to serve queries) and load them into local memory. All segment data is maintained in memory as long as the server hosts that segment.
    But Pinot supports an mmap mode. Is it safe to say that not all segment data has to be held in physical memory, meaning your cluster memory size does not need to be >= your segment data size?
  • k

    Kishore G

    08/12/2020, 9:56 PM
    good catch, fixing it
    👍 1
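
For readers unfamiliar with memory mapping, the general idea behind the question above can be sketched in a few lines. This is not Pinot code: it uses Python's standard `mmap` module on a throwaway file, and the helper name `read_slice_via_mmap` and the file contents are made up for illustration. With a mapping, the operating system pages data in on access, so reading a small slice does not require loading the whole file into the process's memory.

```python
import mmap
import os
import tempfile


def read_slice_via_mmap(path, offset, length):
    """Return `length` bytes starting at `offset` from a memory-mapped file."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


# Build a small stand-in "segment" file (hypothetical contents).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * 4096 + b"PINOT" + b"B" * 4096)
    segment_path = tmp.name

# Only the pages covering this slice need to be resident in memory.
chunk = read_slice_via_mmap(segment_path, 4096, 5)
os.remove(segment_path)
```

How much of a mapped file stays resident is decided by the OS page cache, which is why a memory-mapped data set can be larger than physical memory.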
  • m

    Mayank

    08/13/2020, 10:00 PM
    Hello, LinkedIn is hosting a virtual meetup for Pinot on Sept 2, 6pm. Please join the meetup page https://www.meetup.com/apache-pinot/events/272495311/. The registration link is in the Details section.
    🥂 2
    🍻 2
    🎉 8
    🥃 3
    🍷 5
  • m

    Mayank

    08/13/2020, 10:55 PM
    Note, in order to get the zoom link, please register at https://linkedinpinotmeetup.splashthat.com
  • n

    Neha Pawar

    08/17/2020, 2:51 AM
    Hi all, we’re planning to add a feature to move segments from a realtime table to an offline table. The motivation is to eliminate the need for users to set up their own offline flows for a hybrid table, and let Pinot manage that. Here’s the design doc, called Pinot managed offline flows, if anyone wants to take a look: https://docs.google.com/document/d/1-e_9aHQB4HXS38ONtofdxNvMsGmAoYfSnc2LP88MbIc/edit?usp=sharing
    👍 9
  • k

    Kishore G

    08/18/2020, 9:20 PM
    Great article by @User on ingesting parquet files in S3 into Pinot via Spark. https://medium.com/apache-pinot-developer-blog/leverage-plugins-to-ingest-parquet-files-from-s3-in-pinot-decb12e4d09d
    🎉 9
  • r

    Ravikiran Katneni

    08/20/2020, 4:21 AM
    Pinot is taking a long time to import data when the data size is huge. I am using the "standalone" data load job, trying with 80GB of TPCH Lineitem data split into 600 files (each file is around 130MB). Creating the segment files takes around 3 hours on a 4 CPU, 64GB RAM machine. Is this expected behavior?
  • m

    Mayank

    08/20/2020, 4:22 AM
    How many controllers do you have? Are you pushing files sequentially?
  • m

    Mayank

    08/20/2020, 4:23 AM
    Using deep-store with segment-uri push will help reduce the time, by avoiding the need to push the actual payload
  • r

    Ravikiran Katneni

    08/20/2020, 4:23 AM
    Two controllers
  • r

    Ravikiran Katneni

    08/20/2020, 4:24 AM
    Can you help me find the documentation for "deep-store with segment-uri push"?
  • m

    Mayank

    08/20/2020, 4:25 AM
    Here's a sample: https://docs.pinot.apache.org/basics/data-import/pinot-file-system/import-from-hdfs#push-hdfs-segment-to-pinot-controller
  • r

    Ravikiran Katneni

    08/20/2020, 4:28 AM
    Is there an option other than using HDFS?
  • m

    Mayank

    08/20/2020, 4:28 AM
    yeah, you can also use gcs, s3
  • r

    Ravikiran Katneni

    08/20/2020, 4:28 AM
    Ok, thanks
  • m

    Mayank

    08/20/2020, 4:30 AM
    May I ask what you are trying to achieve? Is this a benchmark?
  • m

    Mayank

    08/20/2020, 4:35 AM
    Actually, looks like even when using deep-store, the controller may still need to download the segments (metadata push may not be supported yet)
  • m

    Mayank

    08/20/2020, 4:47 AM
    Ok, then we need to understand where the time is spent. Is it on index generation or actual push?
  • r

    Ravikiran Katneni

    08/20/2020, 4:52 AM
    Indexing is taking time. Push is relatively faster.
  • m

    Mayank

    08/20/2020, 4:53 AM
    Hmm, then it should be easy to make the job multi-threaded, if it isn’t already
  • k

    Kishore G

    08/20/2020, 6:38 AM
    There is a parallelism setting
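
The idea discussed above, building several segments at once instead of one file at a time, can be sketched generically as follows. This is not Pinot's ingestion job: `build_segment` is a hypothetical stand-in for per-file index generation, the file names are invented, and the `parallelism` argument only mirrors the kind of setting mentioned in the thread.

```python
from concurrent.futures import ThreadPoolExecutor


def build_segment(input_file):
    """Hypothetical stand-in for index generation on one input file."""
    return input_file.rsplit(".", 1)[0] + ".segment"


def build_all(input_files, parallelism):
    """Build one segment per input file, at most `parallelism` at a time."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(build_segment, input_files))


files = [f"lineitem_{i}.csv" for i in range(6)]
segments = build_all(files, parallelism=4)
```

Since the thread identifies index generation rather than the push as the slow step, a worker count close to the number of available cores (4 on the machine described above) would be a reasonable starting point.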
  • m

    Mayank

    08/24/2020, 3:45 PM
    Hi Apache Pinot Community, friendly reminder to sign up for the Apache Pinot virtual meetup on Sept 2. Hope to see you there: https://linkedinpinotmeetup.splashthat.com/
  • b

    Buchi Reddy

    08/24/2020, 7:48 PM
    Hi, do we publish the latest jars from master to maven or only released ones? Needn’t be maven central but any other repo.
  • k

    Kishore G

    08/24/2020, 8:12 PM
    only release versions are published to maven
  • k

    Kishore G

    08/24/2020, 8:14 PM
    you can always publish other versions to your internal artifactory
  • k

    Kenny Bastani

    08/25/2020, 6:51 PM
    Hi all! The one and only @User is presenting on Pinot and Kafka at the virtual Kafka Summit conference. If you signed up for free and can access the talk live, please tune in and support one of our own. https://twitter.com/ApachePinot/status/1298331477315391488?s=20
    🚀 1
    🙂 3
    🎉 6
    🍷 7
  • r

    respergu

    08/25/2020, 8:21 PM
    hello
    👋 2
  • k

    Kishore G

    08/25/2020, 8:26 PM
    Hi @User
  • a

    Adi Polak

    08/25/2020, 9:05 PM
    Hi Everyone 🙂
    👋 6