# workers-discussions
  • f

    Fabian

    03/14/2023, 3:09 PM
    I was wondering if it is possible to pipe an (image) stream into FormData? I currently have this setup:
    ts
    const { readable, writable } = new TransformStream();
    req.body.pipeTo(writable);
    
    const formData = new FormData();
    formData.append(
      "file",
      readable,
      "test.jpg"
    );
    
    const response = await fetch(
      "...",
      {
        method: "POST",
        body: formData
      }
    );
    Which does not work - at least when uploading the image to Cloudflare Images, it says that the image has an incorrect mime type. If I read the whole body and append it into the form data, it does work as expected - but since that obviously isn't recommended because of the memory limit, I don't like that approach. I couldn't find anything on the interwebz
  • f

    Fabian

    03/14/2023, 4:17 PM
    Ok so apparently this is a feature in many HTTP libraries, but as far as I can see right now it's not possible to use a stream as a FormData value in workers because of the worker implementation. Which is kind of a bummer regarding the memory limit, but eh, I'll read the whole body into memory as a workaround for now
  • z

    zszszsz

    03/14/2023, 4:33 PM
    I cannot imagine what it should look like when multipart/form-data and chunked encoding work together
  • z

    zszszsz

    03/14/2023, 4:35 PM
    Or you can manually construct the raw stream if you know what it should look like
  • f

    Fabian

    03/14/2023, 4:43 PM
    I looked into it and I found a FormData implementation that outputs the raw stream and works with "sub streams", so maybe I'll consider that down the line. In my case it works to just load the whole file into memory (since there is no heavy traffic), but imagine 10 people in the same region wanting to upload a file >12MB with a worker. If I didn't overlook something, it's currently not possible, since it would exceed the 128MB limit 🤔
  • s

    Skye

    03/14/2023, 4:45 PM
    While two requests going to the same colo hitting the same worker instance is certainly possible, it's not guaranteed
  • s

    Skye

    03/14/2023, 4:46 PM
    Especially in the case of colos like LHR, which are made up of several machines - they're probably going to end up reaching different machines
  • s

    Skye

    03/14/2023, 4:46 PM
    I'd say, until (if ever) you're actually hitting those limits, you're not going to need to worry about them for files that small
  • d

    dave

    03/14/2023, 4:57 PM
    if you're doing big files, then I would recommend using a presigned S3 URL, if that works for your use-case.
  • t

    Tom Sherman

    03/14/2023, 4:59 PM
    this problem is something i've been attempting to solve with https://github.com/tom-sherman/response-multipart (altho this library doesn't stream yet, but the API is designed with that in mind)
  • t

    Tom Sherman

    03/14/2023, 5:04 PM
    and also i haven't worked out the API for creating multipart bodies yet, but it'd be something like this
    js
    const request = new MultipartRequest(url)
    request.append({
      headers: {
        "content-disposition": `form-data; name="file"; filename="test.jpg"`
      },
      body: req.body
    })
    
    const response = await fetch(request);
  • t

    Tom Sherman

    03/14/2023, 5:05 PM
    the point being tho: this use case is 100% possible to implement in userland, it just requires manual construction of the multipart/form-data body stream and headers
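    A minimal sketch of the manual construction described above, assuming a single file part. `multipartStream`, its parameters and the boundary format are hypothetical names made up for illustration, not part of any library mentioned in this thread. It also assumes the upstream API accepts a request body of unknown length, which nobody here has confirmed.

```typescript
// Sketch: wrap a byte stream in a one-part multipart/form-data body
// without buffering the file. All names here are hypothetical.
function multipartStream(
  file: ReadableStream<Uint8Array>,
  fieldName: string,
  filename: string,
  contentType: string,
): { body: ReadableStream<Uint8Array>; contentTypeHeader: string } {
  // Random boundary; it must not occur inside the file bytes.
  const boundary = "----worker" + Math.random().toString(36).slice(2);
  const encoder = new TextEncoder();

  // Part headers. Note: fieldName and filename are not escaped in this sketch.
  const head = encoder.encode(
    `--${boundary}\r\n` +
      `Content-Disposition: form-data; name="${fieldName}"; filename="${filename}"\r\n` +
      `Content-Type: ${contentType}\r\n\r\n`,
  );
  // Closing boundary, sent after the last file chunk.
  const tail = encoder.encode(`\r\n--${boundary}--\r\n`);

  const reader = file.getReader();
  let headSent = false;

  const body = new ReadableStream<Uint8Array>({
    async pull(controller) {
      if (!headSent) {
        headSent = true;
        controller.enqueue(head);
        return;
      }
      const result = await reader.read();
      if (result.done) {
        controller.enqueue(tail);
        controller.close();
        return;
      }
      // Forward file chunks as they arrive.
      controller.enqueue(result.value);
    },
    cancel(reason) {
      return reader.cancel(reason);
    },
  });

  return { body, contentTypeHeader: `multipart/form-data; boundary=${boundary}` };
}
```

    In a Worker this might be used as `fetch(url, { method: "POST", headers: { "content-type": contentTypeHeader }, body })`, with any server-side metadata fields written as extra parts before the closing boundary.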
  • p

    plutoniumm

    03/14/2023, 5:47 PM
    Hi, I've been thinking: what's the feasibility of using workers to return extremely specific JS for a particular browser, to avoid bloat? Let's say the parsed User-Agent string is:
    macOS Safari 14.1
    Can we bundle custom JS on the fly and send only the polyfills/optimisations needed for Safari 14.1 specifically and nothing else? (Assuming browser-specific tree shaking is possible, which is a mountain to climb in itself.)
  • p

    plutoniumm

    03/14/2023, 5:48 PM
    (I know that's not a real user agent string, it's just for demo; I didn't want to do the whole KHTML version thing)
  • d

    Dani Foldi

    03/14/2023, 5:49 PM
    you absolutely can do that, you can run for example esbuild in a worker, and import necessary core-js polyfills into your bundle, before returning it - the first time you see a user agent your response time will be much slower though, so you might want to think about pre-generating the most common ones (can be saved to #981314061268578304 and counted, sorted)
  • p

    plutoniumm

    03/14/2023, 5:51 PM
    oh yeahhhh that makes a lot of sense. If I prebundle the last few Chromiums and a couple of Safaris, that's already massive savings, and beyond that I think various caches will kick in for frequent visitors anyway. Thanks!!!
  • d

    Dani Foldi

    03/14/2023, 6:02 PM
    (Bear in mind that workers responses are not cached at the edge by default, so you'll have to use the cache API yourself to speed it up - and your worker/function will get called for every request, so if you're using Pages, you could build those before publishing so they're static and free)
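    A rough sketch of the caching idea above, assuming bundles are keyed by a coarse browser bucket rather than the full User-Agent. `BundleCache`, `browserBucket`, `serveBundle`, the `/bundles/` path and the `build` callback are all hypothetical; in a Worker, `caches.default` would be passed as the cache and `build` would wrap whatever bundler is used.

```typescript
// The subset of the Cache interface this sketch relies on.
interface BundleCache {
  match(key: string): Promise<Response | undefined>;
  put(key: string, response: Response): Promise<void>;
}

// Hypothetical, deliberately crude bucketing: family plus major version.
function browserBucket(userAgent: string): string {
  const chrome = /Chrome\/(\d+)/.exec(userAgent);
  if (chrome) return `chrome-${chrome[1]}`;
  const safari = /Version\/(\d+).*Safari\//.exec(userAgent);
  if (safari) return `safari-${safari[1]}`;
  return "generic";
}

async function serveBundle(
  request: Request,
  cache: BundleCache,
  build: (bucket: string) => Promise<string>,
): Promise<Response> {
  const bucket = browserBucket(request.headers.get("user-agent") ?? "");
  // One cache entry per bucket, not per raw User-Agent string.
  const key = new URL(`/bundles/${bucket}.js`, request.url).toString();

  const hit = await cache.match(key);
  if (hit) return hit;

  // Slow path: build once, then store a copy for later requests.
  const js = await build(bucket);
  const response = new Response(js, {
    headers: {
      "content-type": "text/javascript",
      "cache-control": "public, max-age=86400",
    },
  });
  await cache.put(key, response.clone());
  return response;
}
```

    Bucketing by major version keeps the number of distinct bundles small, which also makes pre-generating the most common ones practical.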
  • z

    zszszsz

    03/14/2023, 6:18 PM
    Isn't upload a different thing? It is about parsing the request stream, constructing a stream from it, and piping it to the upstream.
  • z

    zszszsz

    03/14/2023, 6:19 PM
    Or could it be a URL if it's a download
  • z

    zszszsz

    03/14/2023, 6:21 PM
    But would it be beneficial to do that? Wouldn't it be better to ship everything and cache aggressively?
  • z

    zszszsz

    03/14/2023, 6:22 PM
    Or maybe dynamic importing at client side
  • f

    Fabian

    03/14/2023, 7:20 PM
    User uploads to worker, worker uploads to API (in this case CF Images), but the API only supports FormData. Streaming the data is easy, you only need to pipe the body. The implementation I talked about allows a ReadableStream to be set as the value of a FormData field. I didn't look into how exactly it does that, though.
  • f

    Fabian

    03/14/2023, 7:21 PM
    Thought so, since there are NodeJS libraries that do that
  • t

    Tom Sherman

    03/14/2023, 7:23 PM
    You can't pass a stream to FormData tho, and that's not a Cloudflare limitation
  • f

    Fabian

    03/14/2023, 7:23 PM
    Yeah I know, the spec doesn't include that
  • f

    Fabian

    03/14/2023, 7:28 PM
    So it's rather "no possibility to do that with standard browser libraries since browsers normally don't care about memory" (obviously they don't need to since they don't have to handle multiple requests at the same time)
  • z

    zszszsz

    03/14/2023, 7:29 PM
    So what about constructing the FormData on the client side and piping it directly to the upstream?
  • f

    Fabian

    03/14/2023, 7:31 PM
    Impossible, since metadata needs to be injected which can't be user controlled (clients could calculate it themselves, but a client could also write bs, and the worker can't check it without reading the body, which defeats the purpose)
  • f

    Fabian

    03/14/2023, 7:32 PM
    Plus the client shouldn't need to know about implementation specific details
  • d

    dave

    03/14/2023, 7:32 PM
    yessssss GPT-4 now has scraped Durable Object docs it seems