# s3-multiple-buckets
  • k

    Kartik Khare

    07/06/2020, 5:51 PM
    @User Is there support for multiple directories in the FS? If yes, we can extend that to multiple buckets.
  • k

    Kartik Khare

    07/06/2020, 5:51 PM
    @User How do you want to split data across buckets?
  • k

    Kishore G

    07/06/2020, 5:58 PM
    @User No, I was thinking if users can provide a list of subFolders/s3buckets, we can pick one randomly or hash it based on segment name
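Editorial sketch of the two selection strategies mentioned in the message above: pick a bucket at random, or hash the segment name so the same segment always maps to the same bucket. This is only an illustration, not Pinot code. The function names, bucket URIs, and the `<bucket>/<table>/<segment>` layout are all hypothetical assumptions.

```python
import hashlib
import random
from typing import Optional, Sequence


def pick_bucket_by_hash(segment_name: str, buckets: Sequence[str]) -> str:
    """Deterministically map a segment name to one of the configured buckets."""
    if not buckets:
        raise ValueError("buckets must not be empty")
    # Use a stable digest rather than Python's built-in hash(), which is
    # salted per process and would give different answers across runs.
    digest = hashlib.sha256(segment_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(buckets)
    return buckets[index]


def pick_bucket_randomly(buckets: Sequence[str],
                         rng: Optional[random.Random] = None) -> str:
    """Pick any configured bucket; the choice is not reproducible by design."""
    if not buckets:
        raise ValueError("buckets must not be empty")
    return (rng or random).choice(buckets)


def segment_uri(bucket_uri: str, table: str, segment_name: str) -> str:
    """Build the URI that would be recorded for the segment (assumed layout)."""
    return f"{bucket_uri.rstrip('/')}/{table}/{segment_name}"


if __name__ == "__main__":
    # Hypothetical bucket list and names, for illustration only.
    buckets = ["s3://bucket-a/segments", "s3://bucket-b/segments"]
    chosen = pick_bucket_by_hash("myTable_2020-07-06_0", buckets)
    print(segment_uri(chosen, "myTable", "myTable_2020-07-06_0"))
```

With hashing, the location can be recomputed from the segment name alone as long as the bucket list does not change. With random selection, the chosen URI has to be recorded somewhere at creation time.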
  • k

    Kartik Khare

    07/06/2020, 5:59 PM
    Randomly at the time of creating the segments?
  • k

    Kartik Khare

    07/06/2020, 5:59 PM
    Wouldn't that disrupt the query execution?
  • k

    Kishore G

    07/06/2020, 6:00 PM
    no, we just store the URI along with the segment metadata in ZK
  • k

    Kishore G

    07/06/2020, 6:00 PM
    it can point to anything
  • k

    Kishore G

    07/06/2020, 6:01 PM
    actually, this is a problem only with real-time where we create the URI
  • k

    Kishore G

    07/06/2020, 6:01 PM
    with batch ingestion, user can provide any URI
  • y

    Yash Agarwal

    07/06/2020, 6:05 PM
    We don’t have any specific requirement around how to split data across buckets.
  • k

    Kartik Khare

    07/06/2020, 6:08 PM
    Ok. Then I believe the change needs to be made in the handling of the ingestion config, and then in picking a random directory while creating segments. The S3 filesystem implementation won't need any change unless the buckets are located in different regions.
  • y

    Yash Agarwal

    07/06/2020, 6:09 PM
    All the buckets are co-located.
  • k

    Kishore G

    07/06/2020, 6:12 PM
    Yash, is this realtime or offline?
  • y

    Yash Agarwal

    07/06/2020, 6:15 PM
    Right now it is only offline.
  • k

    Kishore G

    07/06/2020, 6:48 PM
    then you don't need anything for now
  • k

    Kishore G

    07/06/2020, 6:48 PM
    I am guessing you will use the ingestion-job to generate the segments
  • y

    Yash Agarwal

    07/06/2020, 7:12 PM
    Yeah I realised that too. I am very new to this so sorry for any troubles 🙂
  • k

    Kishore G

    07/06/2020, 7:12 PM
    no worries, this is a good feature to have. If you don't mind, can you create an issue?
  • a

    aj

    03/22/2023, 6:00 PM
    Hi, this sounds like the right channel for my question, but please redirect me if not. I believe my question is a little different from what I see in the history. I am trying to use org.apache.pinot.plugin.filesystem.S3PinotFS for ingesting data, but I need to read with read-only credentials from one account and write with separate credentials to a bucket in a second account. Something like this:
    inputDirURI: 's3://bucket_with_read_only_credentials/data/'
    outputDirURI: 's3://bucket_with_separate_read_write_credentials/pinot/segments'
    Is it possible to configure separate read and write credentials?
  • k

    Korede Owolabi

    10/27/2023, 9:53 PM
    @aj Did you ever get an answer to this question? I am having the same problem.