# r2
  • v

    Vitali

    05/05/2022, 12:50 PM
    I really don't know where this issue is coming from but I don't think it's R2
  • a

    albert

    05/05/2022, 1:22 PM
    Will it be possible to do multipart uploads using R2 bindings?
  • v

    Vitali

    05/05/2022, 2:01 PM
    I had a proposal for that but we're not sure what the UX on it would look like
  • n

    ncw

    05/05/2022, 2:25 PM
> Is it supposed to be an embedded \r or escaped as an entity?
    That depends on the value of `encoding-type` in the URL: see https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html#AmazonS3-ListObjectsV2-request-querystring-EncodingType S3 didn't always have `encoding-type`; it was introduced in November 2013 precisely to fix this problem. If I try the same tests with S3 I get the same result - without `encoding-type` I get an embedded `\r` which doesn't parse right; with `encoding-type=url` it works fine. So R2 is 100% compatible with AWS S3 as far as control characters in ListObjects goes.
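    For illustration, a minimal sketch of the `encoding-type=url` behaviour using the AWS SDK v3 (the endpoint, credentials, and bucket name below are placeholders):
    ```mjs
    import { S3Client, ListObjectsV2Command } from '@aws-sdk/client-s3';

    const s3 = new S3Client({
      region: 'auto',
      endpoint: 'https://<ACCOUNT_ID>.r2.cloudflarestorage.com',
      credentials: { accessKeyId: '<KEY>', secretAccessKey: '<SECRET>' },
    });

    // Without EncodingType, keys containing control characters like \r come
    // back raw and can break XML parsing; with EncodingType: 'url' each key
    // comes back percent-encoded and the client decodes it.
    const res = await s3.send(new ListObjectsV2Command({
      Bucket: 'my-bucket',
      EncodingType: 'url',
    }));
    for (const obj of res.Contents ?? []) {
      console.log(decodeURIComponent(obj.Key));
    }
    ```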
  • a

    albert

    05/05/2022, 2:39 PM
Perhaps something like this?
    ```mjs
    // Create multipart upload
    const multipart = await env.R2.createMultipart('some_key', {customMetadata: {'Hello': 'World'}}) // Same options as a normal PUT
    const uploadId = multipart.uploadId
    return new Response(uploadId)

    // Upload part
    const multipart = env.R2.getMultipart(uploadId)
    await multipart.uploadPart(1, request.body)

    // Cancel (abort the multipart upload)
    const multipart = env.R2.getMultipart(uploadId)
    await multipart.cancel()

    // List parts
    const multipart = env.R2.getMultipart(uploadId)
    const parts = await multipart.listParts()
    return new Response(JSON.stringify(parts))

    // Complete multipart
    const multipart = env.R2.getMultipart(uploadId)
    const metadata = await multipart.finish() // Returns the same info as a normal PUT would
    return new Response(JSON.stringify(metadata))
    ```
  • v

    Vitali

    05/05/2022, 2:48 PM
    Yeah. That's very similar to what my proposal looked like but it was deemed too confusing at the time & I didn't press it
  • v

    Vitali

    05/05/2022, 2:49 PM
    The question was whether multipart was useful vs "just use the S3 API for that use-case". We're going to be driven by user demand here a bit more I think
  • a

    albert

    05/05/2022, 2:54 PM
    Gotcha. I think it'd be very useful considering there's a cap on request body size for Workers.
  • v

    Vitali

    05/05/2022, 2:55 PM
    No disagreement here
  • i

    Isaac McFadyen | YYZ01

    05/05/2022, 2:56 PM
    Agreed; plus, I'd really love to be able to implement my own auth in a Worker (integrating into what I already have) rather than dealing with SigV4 for multipart uploads.
  • a

    albert

    05/05/2022, 2:56 PM
Currently you have to use the S3-compatible API for anything above 100, 200, or 500 MB (the Workers request body limit, depending on plan). And that requires extra code (or an SDK) on the client side to handle authentication.
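    For example, one way around shipping SigV4 logic to the browser is to presign URLs server-side with the AWS SDK v3 (a sketch; the endpoint, credentials, bucket, and key are placeholders):
    ```mjs
    import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
    import { getSignedUrl } from '@aws-sdk/s3-request-presigner';

    const s3 = new S3Client({
      region: 'auto',
      endpoint: 'https://<ACCOUNT_ID>.r2.cloudflarestorage.com',
      credentials: { accessKeyId: '<KEY>', secretAccessKey: '<SECRET>' },
    });

    // The client can PUT directly to this URL without knowing SigV4.
    const url = await getSignedUrl(
      s3,
      new PutObjectCommand({ Bucket: 'my-bucket', Key: 'big-file.bin' }),
      { expiresIn: 3600 },
    );
    ```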
  • v

    Vitali

    05/05/2022, 2:57 PM
    https://tenor.com/view/preach-gif-9201620
  • v

    Vitali

    05/05/2022, 3:02 PM
@ncw the `/` issue should be fixed now.
  • n

    ncw

    05/05/2022, 3:25 PM
> the / issue should be fixed now.
    Confirmed - all fixed 🙂 I've just been digging into why I sometimes get checksum failures on multipart uploads. The problem appears to be that R2 is returning ETags which look like this for multipart uploads:
    Etag: "d41a85febcd14e479493bc112efc2b57"
    whereas AWS S3 would return one like this, where the `-3` on the end indicates a multipart upload of 3 parts:
    Etag: "d41a85febcd14e479493bc112efc2b57-3"
    Rclone is perfectly happy with ETags like R2 is returning, but they must be the MD5SUM of the object if they are returned in that form. It seems from my tests with multipart uploads that the `ETag` isn't always the MD5SUM of the object (or maybe never is - not quite sure). Rclone uploads the parts of multipart uploads in parallel, which makes calculating the MD5SUM for the entire object difficult (which is why AWS don't bother). So I think R2 should either make the `ETag` match the MD5SUM of the object for multipart uploads or make it match what AWS does: [here is the bit of code in rclone which calculates an AWS multipart ETag for checking the upload](https://github.com/rclone/rclone/blob/a446106041fb9b5b57bdbb5e279bb8b64079f100/backend/s3/s3.go#L3969-L3970)
  • v

    Vitali

    05/05/2022, 3:26 PM
    Hmm.... does GCS follow this convention?
  • v

    Vitali

    05/05/2022, 3:26 PM
    The multipart spec makes no mention of the format the ETag must have
  • v

    Vitali

    05/05/2022, 3:27 PM
    However, the content is expected to be the hash of part ETags? The ETag for R2 multipart upload is random
  • v

    Vitali

    05/05/2022, 3:28 PM
    The uploadpart etag is totally random
  • n

    ncw

    05/05/2022, 3:33 PM
> Hmm.... does GCS follow this convention?
    In their S3 interface? I don't know - rclone supports the native interface, which always has hashes on objects.
    > The multipart spec makes no mention of the format the ETag must have
    It mentions somewhere that they are the MD5SUM of the object unless encryption is active.
    > The uploadpart etag is totally random
    All the S3 providers that I've tested follow AWS's lead for multipart uploads exactly, except Alibaba, who use some algorithm I haven't figured out. A random ETag would be totally fine, just don't make it exactly 32 hex characters - ideally 32 hex characters followed by a dash, followed by the number of parts in the multipart upload in decimal. It's pretty easy to calculate the AWS multipart ETag though, because all the part hashes are sent in the CompleteMultipartUpload message and they just need to be hashed; a sketch follows below. Here is the inevitable Stack Overflow answer on [how to calculate the multipart ETag](https://stackoverflow.com/a/19896823/164234)
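    For reference, a sketch of that calculation (a hypothetical `multipartETag` helper; it assumes, as on AWS, that each part's ETag is the plain MD5 of that part):
    ```mjs
    import { createHash } from 'node:crypto';

    // AWS-style multipart ETag: MD5 of the concatenated binary MD5 digests
    // of the parts, with '-' and the part count appended.
    function multipartETag(partETags) {
      const binDigests = partETags.map(
        (etag) => Buffer.from(etag.replace(/"/g, ''), 'hex'),
      );
      const combined = createHash('md5')
        .update(Buffer.concat(binDigests))
        .digest('hex');
      return `"${combined}-${partETags.length}"`;
    }
    ```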
  • v

    Vitali

    05/05/2022, 3:35 PM
    > unless encryption is active R2 always has encryption active
  • n

    ncw

    05/05/2022, 3:36 PM
Here is what AWS say about ETags - search for ETag on this page: https://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html
  • v

    Vitali

    05/05/2022, 3:36 PM
    But yeah. I think it's doable with some changes to how we do uploads. I won't get to fixing the digest for a while - there's a bunch there that needs to be fixed (e.g. we don't generate stable hashes for uploaded parts which is a problem for ListParts)
  • v

    Vitali

    05/05/2022, 3:38 PM
> Objects created by either the Multipart Upload or Part Copy operation have ETags that are not MD5 digests, regardless of the method of encryption.
    Soo.... but anyway. Not arguing. I can add the dash & number of parts, but fixing the ETag to be the hash of hashes will be more involved
  • n

    ncw

    05/05/2022, 3:39 PM
The multipart ETag lets rclone check that the returned ETag is as expected, which gives a little assurance that the objects weren't corrupted. I'm sure it isn't the only tool that does that! A dash + number will fix the immediate problem of rclone thinking those ETags are MD5SUMs; it's easy to disable multipart ETag checking for the R2 provider, and I can turn it back on when it's working.
  • v

    Vitali

    05/05/2022, 3:40 PM
    Should be able to fix that today
  • a

    adaptive

    05/05/2022, 3:43 PM
    is there a binding problem with R2?
  • e

    Erisa | Support Engineer

    05/05/2022, 4:02 PM
They don't work in the preview/playground
  • a

    adaptive

    05/05/2022, 4:04 PM
Thanks, but this example is failing: https://developers.cloudflare.com/r2/get-started/ I wonder how a GET will know the MIME type
  • j

    James

    05/05/2022, 4:09 PM
The getting started example is pretty basic. The info is returned in `httpMetadata` (assuming it was available when you `put`), so you can do stuff like this: https://github.com/Cherry/ShareX-R2-Cloudflare-Workers/blob/5934cc77dc24e29e677dd3ff8b9edfebfe83bd93/src/routes.mjs#L122
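    In other words, something along these lines (a minimal sketch; `MY_BUCKET` is a hypothetical binding name):
    ```mjs
    export default {
      async fetch(request, env) {
        const key = new URL(request.url).pathname.slice(1);
        const object = await env.MY_BUCKET.get(key);
        if (object === null) return new Response('Object not found', { status: 404 });

        // writeHttpMetadata copies the httpMetadata saved at put() time
        // (contentType, cacheControl, ...) onto the response headers.
        const headers = new Headers();
        object.writeHttpMetadata(headers);
        headers.set('etag', object.httpEtag);
        return new Response(object.body, { headers });
      },
    };
    ```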
  • e

    Erisa | Support Engineer

    05/05/2022, 4:12 PM
    Honestly I found James' code above to be much more educational than the docs example