# metadata-push-api
  • m

    Mayank

    10/22/2020, 8:42 PM
    And use SSD
  • s

    Sidd

    10/22/2020, 8:42 PM
    exactly
  • m

    Mayank

    10/22/2020, 8:42 PM
    Will that solve our problem?
  • s

    Sidd

    10/22/2020, 8:42 PM
    SSD for sure, if we are not able to get rid of the untarring business
  • x

    Xiang Fu

    10/22/2020, 11:59 PM
    I somehow feel the bottleneck is the controller download; if we can bypass this path, the controller overhead will be reduced a lot
  • m

    Mayank

    10/23/2020, 12:00 AM
    Yes controller bypassing will help a lot
  • m

    Mayank

    10/23/2020, 12:00 AM
    Unfortunately, at LinkedIn we don’t have a reliable deepstore
  • x

    Xiang Fu

    10/23/2020, 12:00 AM
    we can add an option to convert the metadata.properties and creation.meta files to JSON and let the controller handle that as well
  • x

    Xiang Fu

    10/23/2020, 12:01 AM
    basically there is no need for controller to download the data part
  • x

    Xiang Fu

    10/23/2020, 12:01 AM
    even the current metadata-only tar file helps a lot on the controller side
  • m

    Mayank

    10/23/2020, 12:02 AM
    But we don’t have a deepstore that is reliable and can be used in prod
  • m

    Mayank

    10/23/2020, 12:02 AM
    So we have no way to bypass the controller
  • x

    Xiang Fu

    10/23/2020, 12:03 AM
    ic
  • x

    Xiang Fu

    10/23/2020, 12:03 AM
    not even hdfs?
  • m

    Mayank

    10/23/2020, 12:03 AM
    Availability is not good enough for production
  • x

    Xiang Fu

    10/23/2020, 12:03 AM
    Pinot segments are generated and stored there anyway, right?
  • x

    Xiang Fu

    10/23/2020, 12:03 AM
    ic
  • x

    Xiang Fu

    10/23/2020, 12:04 AM
    even offline push is not good?
  • m

    Mayank

    10/23/2020, 12:04 AM
    Offline push is good
  • x

    Xiang Fu

    10/23/2020, 12:04 AM
    for realtime side
  • x

    Xiang Fu

    10/23/2020, 12:04 AM
    Uber has implemented the p2p download
  • x

    Xiang Fu

    10/23/2020, 12:05 AM
    maybe you want to try it out
  • m

    Mayank

    10/23/2020, 12:05 AM
    So we are trying out ideas for deepstore
  • x

    Xiang Fu

    10/23/2020, 12:05 AM
    so servers can download from each other
  • x

    Xiang Fu

    10/23/2020, 12:05 AM
    got it
  • m

    Mayank

    10/23/2020, 12:05 AM
    But in case none of those work, we still want to optimize with the controller in the path
  • x

    Xiang Fu

    10/23/2020, 12:06 AM
    maybe 2 hdfs 🙂
  • x

    Xiang Fu

    10/23/2020, 12:06 AM
    hdfs1:// and hdfs2://
  • x

    Xiang Fu

    10/23/2020, 12:07 AM
    I remember at Uber they used to use two hdfs to satisfy the availability requirements 😛
  • m

    Mayank

    10/23/2020, 12:08 AM
    😃
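
Editor's sketch of the idea raised earlier in the thread: converting a segment's metadata.properties and creation.meta into a single JSON payload, so the controller would only need the metadata and not the data part. This is a rough illustration, not Pinot's implementation. The function names and the JSON shape are hypothetical, the properties parser only handles plain `key = value` lines, and the creation.meta layout (two big-endian 64-bit integers, CRC then creation time) is an assumption that should be checked against the Pinot source.

```python
import json
import struct


def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            # Not a key/value line; ignore it in this sketch.
            continue
        props[key.strip()] = value.strip()
    return props


def parse_creation_meta(data):
    """Decode creation.meta under the ASSUMED layout: two big-endian signed
    64-bit integers, CRC first and creation time second."""
    crc, creation_time = struct.unpack(">qq", data[:16])
    return {"crc": crc, "creationTime": creation_time}


def segment_metadata_to_json(properties_text, creation_meta_bytes):
    """Bundle both metadata files into one JSON string (hypothetical shape)."""
    payload = {
        "metadata": parse_properties(properties_text),
        "creation": parse_creation_meta(creation_meta_bytes),
    }
    return json.dumps(payload, sort_keys=True)
```

A push client could send this JSON together with the segment's deep-store URI, leaving the segment data itself untouched by the controller. That only helps when a reliable deep store exists, which is the constraint Mayank describes above.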