# r2
  • v

    Vitali

    05/05/2022, 11:23 AM
    > I can work around this in rclone if necessary - it isn't a big deal.

    I blame @john.spurlock for this one. Took it on faith instead of double-checking what S3 supports.
  • v

    Vitali

    05/05/2022, 11:24 AM
    > Streaming uploads are multipart uploads where you don't know the size of the file in advance. All the ones I tried manually worked fine though.

    Hmm... I was under the impression that content-length is required for all uploads. That's only true for PutObject?
  • n

    ncw

    05/05/2022, 11:29 AM
    > Hmm... I was under the impression that content-length is required for all uploads. That's only true for PutObject?

    You can fudge it with multipart uploads - note how Content-Length isn't required here: https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html So you upload as many parts as you can until you get EOF from the source, then call that the last part. I'll capture a failing HTTP transaction from the test suite so we can look at exactly what is going on.
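    (Aside: a minimal sketch of that streaming pattern with the AWS SDK v3 for JavaScript - the bucket, key, and 5 MiB part size are placeholders, and this is not rclone's actual Go implementation:)

    ```ts
    import {
      S3Client,
      CreateMultipartUploadCommand,
      UploadPartCommand,
      CompleteMultipartUploadCommand,
    } from "@aws-sdk/client-s3";
    import { Readable } from "stream";

    // Upload an object of unknown length: send fixed-size parts until the
    // source hits EOF, then complete with however many parts were sent.
    async function streamingUpload(
      s3: S3Client,
      bucket: string,
      key: string,
      source: Readable,
      partSize = 5 * 1024 * 1024, // placeholder part size
    ) {
      const { UploadId } = await s3.send(
        new CreateMultipartUploadCommand({ Bucket: bucket, Key: key }),
      );
      const parts: { ETag?: string; PartNumber: number }[] = [];
      let partNumber = 1;
      let pending = Buffer.alloc(0);

      const sendPart = async (chunk: Buffer) => {
        const { ETag } = await s3.send(
          new UploadPartCommand({
            Bucket: bucket, Key: key, UploadId,
            PartNumber: partNumber, Body: chunk,
          }),
        );
        parts.push({ ETag, PartNumber: partNumber++ });
      };

      for await (const data of source) {
        pending = Buffer.concat([pending, data]);
        while (pending.length >= partSize) {
          await sendPart(pending.subarray(0, partSize)); // full-size parts
          pending = pending.subarray(partSize);
        }
      }
      // The final part may be shorter; an empty source ends up as a 0-length
      // part here, which is the case R2 rejects in the trace below.
      if (pending.length > 0 || parts.length === 0) await sendPart(pending);

      await s3.send(
        new CompleteMultipartUploadCommand({
          Bucket: bucket, Key: key, UploadId,
          MultipartUpload: { Parts: parts },
        }),
      );
    }
    ```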
  • v

    Vitali

    05/05/2022, 11:30 AM
    I don't know if that will work well with R2 anyway. All parts (except the last) are required to have the same size.
  • v

    Vitali

    05/05/2022, 11:31 AM
    But yeah, for now no plans to implement indefinite length part uploads
  • i

    itsmatteomanf

    05/05/2022, 11:41 AM
    Just out of curiosity, the bucket's DO is per bucket, right? So a new bucket could solve the issue?
  • v

    Vitali

    05/05/2022, 11:41 AM
    Only one way to find out
  • n

    ncw

    05/05/2022, 11:41 AM
    > I don't know if that will work well with R2 anyway. All parts (except the last) are required to have the same size.

    It should work fine with R2 multipart upload as it uses the protocol exactly like that. All parts are the same size, except the last. Here is rclone trying to upload the first (and last, in this case) part. In this case the part is empty and I suspect R2 isn't expecting a 0-length multipart upload, which is causing the problem. Non-zero-length uploads work fine.
    Copy code
    2022/05/05 12:30:35 DEBUG : PUT /rclone-test-bagagot9kijixab3tayizak6/piped%20data.txt?partNumber=1&uploadId=ADcXbLAJwhj8rHbn9xncuLYfjE7zacAiUJ9Qf%2BiapLwrvCJMAlxGNP%2BJAbFrG4RwmPY1Ja%2BLTjyEh7EssqGRQcPzqdoAHW6CPa1U5NOL1SPvKFriZIFrpOkn9ymIDZOYZRprAx6C6Yyob%2FhbbeAjbQ48WTybgRJT0wJDFPkGJcVJp%2B2Ga3zdRE1IH2N4gpsJ9UciY9Hd4t23i3oZ8JUcVWxk3ux1Wk0KRRjJIL4bqhABrUWa%2Fh%2F%2BeQvREQrbmgxd3v0moBV75KcsWbXqvvtkFFJBCWQX8iMsz5vHy%2F5tzLNWIcorxPrXRTUZ5pMVK%2FRDUA%3D%3D HTTP/1.1
    Host: 14aad7c9ed489151b51557e321b246cf.r2.cloudflarestorage.com
    User-Agent: rclone/v1.59.0-DEV
    Content-Length: 0
    Authorization: XXXX
    Content-Md5: 1B2M2Y8AsgTpgAmY7PhCfg==
    X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
    X-Amz-Date: 20220505T113035Z
    Accept-Encoding: gzip
    Copy code
    2022/05/05 12:30:35 DEBUG : HTTP RESPONSE (req 0xc0005ac200)
    2022/05/05 12:30:35 DEBUG : HTTP/2.0 500 Internal Server Error
    Content-Length: 111
    Cf-Ray: 7069251d391c8895-LHR
    Content-Type: text/plain;charset=UTF-8
    Date: Thu, 05 May 2022 11:30:35 GMT
    Expect-Ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
    Server: cloudflare
    Vary: Accept-Encoding
    
    <Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message></Error>
    Replicate with
    Copy code
    touch emptyfile ; rclone rcat -vv --dump bodies --low-level-retries 1 --retries 1 --streaming-upload-cutoff 0 r2:rclone/6M < emptyfile 2>&1 | tee r2-streaming2.log
  • v

    Vitali

    05/05/2022, 11:44 AM
    Yeah - the last part is required to have at least length 1
  • v

    Vitali

    05/05/2022, 11:45 AM
    (minimum length for any part really)
  • v

    Vitali

    05/05/2022, 11:56 AM
    Copy code
    <Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message></Error>
    We do need to do a better job here though
  • v

    Vitali

    05/05/2022, 11:59 AM
    Would you mind sharing the request for the CR in the filename? We don't really process the object key in any way so it's a bit surprising.
  • n

    ncw

    05/05/2022, 12:04 PM
    > Yeah - the last part is required to have at least length 1

    AWS doesn't have that limitation. It is a minor thing, but it will mean, for example, that people can't upload 0-length files via rclone mount, so it would be nice to fix.
  • v

    Vitali

    05/05/2022, 12:10 PM
    0-length files should be uploaded via PutObject - otherwise you're paying for 3 requests instead of one. In fact, really all files below some threshold should be uploaded via PutObject. PutObject is also extra optimized for 0-length files and you'll have better performance.
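    (A minimal sketch of that size cutoff, reusing the streamingUpload sketch above; the module path and the 5 MiB threshold are illustrative, not rclone's or R2's actual values:)

    ```ts
    import { Readable } from "stream";
    import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
    import { streamingUpload } from "./streaming-upload"; // hypothetical module holding the sketch above

    // Illustrative cutoff: below this, one PutObject beats three multipart
    // requests (create + upload part + complete), and it also covers 0 bytes.
    const MULTIPART_CUTOFF = 5 * 1024 * 1024;

    async function upload(s3: S3Client, bucket: string, key: string, body: Buffer) {
      if (body.length < MULTIPART_CUTOFF) {
        // Single-request path; also the way to store a 0-byte object while
        // R2 rejects 0-length parts.
        await s3.send(new PutObjectCommand({ Bucket: bucket, Key: key, Body: body }));
      } else {
        await streamingUpload(s3, bucket, key, Readable.from([body]));
      }
    }
    ```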
  • v

    Vitali

    05/05/2022, 12:10 PM
    But yeah, I can add it to the backlog to fix 0-length part uploads
  • n

    ncw

    05/05/2022, 12:11 PM
    > Would you mind sharing the request for the CR in the filename? We don't really process the object key in any way so it's a bit surprising.
    Copy code
    2022/05/05 13:03:50 DEBUG : HTTP REQUEST (req 0xc0006f3400)
    2022/05/05 13:03:50 DEBUG : PUT /rclone/%0Dleading%20CR HTTP/1.1
    Host: 14aad7c9ed489151b51557e321b246cf.r2.cloudflarestorage.com
    User-Agent: rclone/v1.59.0-beta.6116.781bff280.fix-5422-s3-putobject
    Content-Length: 6
    Authorization: XXXX
    Content-Md5: sZRqySSS0jR8YjW00mERhA==
    Content-Type: application/octet-stream
    X-Amz-Acl: private
    X-Amz-Content-Sha256: UNSIGNED-PAYLOAD
    X-Amz-Date: 20220505T120350Z
    X-Amz-Meta-Mtime: 1651752229.551114329
    Accept-Encoding: gzip
    
    hello
    The file uploads correctly
    Copy code
    2022/05/05 13:03:52 DEBUG : HTTP RESPONSE (req 0xc0006f3400)
    2022/05/05 13:03:52 DEBUG : HTTP/2.0 200 OK
    Content-Length: 0
    Cf-Ray: 706955cfd99b749d-LHR
    Content-Type: text/plain;charset=UTF-8
    Date: Thu, 05 May 2022 12:03:52 GMT
    Etag: "b1946ac92492d2347c6235b4d2611184"
    Expect-Ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
    Server: cloudflare
    Vary: Accept-Encoding
    X-Amz-Version-Id: 4276894ab4cf4d1c8312669ee30fd329
    However it returns as a leading LF in the directory listing! And looking at the HTTP trace it is now obvious why - the CR isn't encoded in the XML, so it is getting treated as a line ending (CR/CRLF normalized to LF) somewhere in the chain.
    Copy code
    00000820  65 3c 2f 4e 61 6d 65 3e  3c 43 6f 6e 74 65 6e 74  |e</Name><Content|
    00000830  73 3e 3c 4b 65 79 3e 0d  6c 65 61 64 69 6e 67 20  |s><Key>.leading |
    00000840  43 52 3c 2f 4b 65 79 3e  3c 53 69 7a 65 3e 36 3c  |CR</Key><Size>6<|
    00000850  2f 53 69 7a 65 3e 3c 4c  61 73 74 4d 6f 64 69 66  |/Size><LastModif|
    00000860  69 65 64 3e 32 30 32 32  2d 30 35 2d 30 35 54 31  |ied>2022-05-05T1|
    If I set --s3-list-url-encode then it works fine, so setting list_url_encode = true will be necessary. This will be something that gets encoded into the R2 provider when it is made.
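    (Roughly what an encoded listing looks like from the S3 JS client side - a sketch, not rclone's Go implementation; real clients handle the decoding details more carefully:)

    ```ts
    import { S3Client, ListObjectsV2Command } from "@aws-sdk/client-s3";

    // With EncodingType "url" the server percent-encodes keys in the listing
    // XML, so control characters like the leading CR survive the response
    // intact and the client decodes them back.
    async function listKeys(s3: S3Client, bucket: string): Promise<string[]> {
      const res = await s3.send(
        new ListObjectsV2Command({ Bucket: bucket, EncodingType: "url" }),
      );
      return (res.Contents ?? []).map((o) => decodeURIComponent(o.Key ?? ""));
    }
    ```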
  • n

    ncw

    05/05/2022, 12:13 PM
    > 0-length files should be uploaded via PutObject - otherwise you're paying for 3 requests instead of one. In fact, really all files below some threshold should be uploaded via PutObject. PutObject is also extra optimized for 0-length files and you'll have better performance.

    Actually, thinking about it, rclone will normally buffer small files in RAM anyway and upload them with PutObject, so I think 0-length multipart uploads are mostly academic!
  • e

    Erisa | Support Engineer

    05/05/2022, 12:14 PM
    It strikes me as something that you would do purely because you can, rather than having any practical purpose
  • v

    Vitali

    05/05/2022, 12:16 PM
    > If I set --s3-list-url-encode then it works fine, so setting list_url_encode = true will be necessary. This will be something that gets encoded into the R2 provider when it is made.

    Hmm.... I bet you this is fast-xml-parser doing the conversion. I wonder if it's mandated somewhere in the XML spec.
  • n

    ncw

    05/05/2022, 12:19 PM
    Using encoded listings is the modern S3 thing to do - I just set the parameters to the most conservative to start with, but setting list_url_encode = true is what the AWS provider does, so I wouldn't worry about it any further.
  • v

    Vitali

    05/05/2022, 12:20 PM
    I mean:
    Copy code
    toXML({
        Object: {
          LeadingCr: '\rsome text',
          TrailingCr: 'some text\r',
        }
      })
    results in
    Copy code
    </TrailingCr></Object>"railingCr>some text
    So I'm pretty unhappy with the unencoded behavior
  • v

    Vitali

    05/05/2022, 12:23 PM
    Is it supposed to be an embedded \r or escaped as an entity?
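    (XML 1.0 end-of-line handling is the relevant part of the spec here: a conforming parser normalizes a literal CR or CRLF in content to LF, so the only way a CR survives a round trip is as a character reference. A minimal escaping sketch with a hypothetical helper name:)

    ```ts
    // Escape element text so a CR survives an XML round trip: a literal \r
    // gets normalized to \n by the parser, but the &#xD; reference does not.
    function escapeXmlText(s: string): string {
      return s
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/\r/g, "&#xD;");
    }

    console.log(escapeXmlText("\rleading CR")); // => "&#xD;leading CR"
    ```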
  • i

    Isaac McFadyen | YYZ01

    05/05/2022, 12:25 PM
    Interesting... they don't just "work", they actually alias to "auto" 👀
  • i

    Isaac McFadyen | YYZ01

    05/05/2022, 12:25 PM
    Wonder what other options are then.... "us" and "eu" maybe?
  • v

    Vitali

    05/05/2022, 12:26 PM
    Don't bother reading into the region parameter. It's more of a "reserved for potential future use". I have ideas but they're longer term depending on DO implementing certain features.
  • v

    Vitali

    05/05/2022, 12:26 PM
    People complained about us-east-1 and blank strings from a back compat perspective and this is the compromise.
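    (For context, this is why S3 clients pointed at R2 typically just set the region to "auto" - the account ID and credential env vars below are placeholders:)

    ```ts
    import { S3Client } from "@aws-sdk/client-s3";

    // R2 accepts "auto" as the region; a few AWS-style region names alias to
    // it for back-compat, per the discussion above.
    const s3 = new S3Client({
      region: "auto",
      endpoint: `https://${process.env.R2_ACCOUNT_ID}.r2.cloudflarestorage.com`,
      credentials: {
        accessKeyId: process.env.R2_ACCESS_KEY_ID ?? "",
        secretAccessKey: process.env.R2_SECRET_ACCESS_KEY ?? "",
      },
    });
    ```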
  • i

    itsmatteomanf

    05/05/2022, 12:27 PM
    And Synology just completed a backup due to this being added 🙂 Thanks! 😄
  • v

    Vitali

    05/05/2022, 12:27 PM
    np
  • v

    Vitali

    05/05/2022, 12:48 PM
    Hmmm.... using the S3 JS client
    Copy code
    putObject({
          Bucket: bucketName,
          Key: '\rleading CR',
        }),
    results in this being received in the worker:
    /e2e-leading-cr/%0Dleading%20CR
    which then lists correctly using the S3 client (without encoding):
    Copy code
    console.error
        {
          IsTruncated: false,
          Contents: [
            {
              Key: '\rleading CR',
              LastModified: 2022-05-05T12:45:42.930Z,
              ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
              ChecksumAlgorithm: [],
              Size: 0,
              Owner: [Object]
            }
          ],
          Name: 'e2e-leading-cr',
          MaxKeys: 1000,
          CommonPrefixes: [],
          KeyCount: 1
        }
    I wonder if this is some difference with respect to Miniflare & the actual runtime
  • v

    Vitali

    05/05/2022, 12:49 PM
    Ran the test against a deployed worker. I'm not seeing the issue...
    Copy code
    console.error
          {
            IsTruncated: false,
            Contents: [
              {
                Key: '\rleading CR',
                LastModified: 2022-05-05T12:49:11.123Z,
                ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
                ChecksumAlgorithm: [],
                Size: 0,
                Owner: [Object]
              }
            ],
            Name: 's3-beelzabub-97-leading-cr',
            MaxKeys: 1000,
            CommonPrefixes: [],
            KeyCount: 1
          }