What is the optimal way to create Bytes from a file?

I am referring to the type Bytes from the crate bytes.

My first idea was to read the file using tokio-fs (the docs should say something about requiring an Arc<Path> as input for the read function...) and then put it into a cache for future requests, but you can't get a weak pointer to Bytes, so that won't work.

I am also planning to break large files into chunks and only keep a certain number of them in memory to avoid OOM.
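To make the chunking idea concrete, here is a minimal sketch that reads a file through one reused buffer, so peak memory stays near a single chunk. It uses blocking std I/O rather than tokio-fs to stay self-contained, and the names for_each_chunk and CHUNK_SIZE are made up for illustration.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical chunk size; tune it for the workload.
const CHUNK_SIZE: usize = 64 * 1024;

/// Reads `path` in pieces of at most `chunk_size` bytes and passes each
/// piece to `on_chunk`. Returns the total number of bytes read.
/// Assumption: blocking std I/O stands in for the async file API here.
fn for_each_chunk<F>(path: &Path, chunk_size: usize, mut on_chunk: F) -> io::Result<u64>
where
    F: FnMut(&[u8]),
{
    let mut file = File::open(path)?;
    // One buffer is reused for every chunk, so memory use stays bounded.
    let mut buf = vec![0u8; chunk_size];
    let mut total = 0u64;
    loop {
        // `read` may return fewer bytes than the buffer holds.
        let n = file.read(&mut buf)?;
        if n == 0 {
            break; // end of file
        }
        on_chunk(&buf[..n]);
        total += n as u64;
    }
    Ok(total)
}
```

In a server, the callback would be where each chunk is copied or handed to the body sender. The sketch only shows the bounded-buffer loop.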

I am currently streaming chunks to hyper with Body::channel() from that crate.

Basically the idea of the weak pointer to the bytes (of the chunks I guess) was that during multiple downloads of the file memory would be used optimally, but I now know you can't do that.

Also, you could implement HTTP range support quite well (i.e. without using too much RAM) with that approach.
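For the range case, a rough sketch of reading only the requested bytes could look like the following. It assumes an inclusive start/end pair, as in an HTTP "bytes=start-end" range. The function name read_range is hypothetical, and returning an empty Vec for an unsatisfiable range is a simplification (a real server would answer with an error status instead).

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Reads the inclusive byte range `start..=end` from `path`, clamping `end`
/// to the last byte of the file. Returns an empty Vec when the range does
/// not overlap the file (a simplification for this sketch).
fn read_range(path: &Path, start: u64, end: u64) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    let len = file.metadata()?.len();
    if start >= len || end < start {
        return Ok(Vec::new());
    }
    let last = end.min(len - 1);
    let want = last - start + 1;
    // Skip straight to the first requested byte instead of reading from 0.
    file.seek(SeekFrom::Start(start))?;
    let mut out = Vec::with_capacity(want as usize);
    // `take` limits the reader so that only `want` bytes are loaded.
    file.take(want).read_to_end(&mut out)?;
    Ok(out)
}
```

Only the requested slice is ever held in memory, which is the property the range idea depends on.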


Can you elaborate more on what you're trying to produce? I see in the docs

Bytes is an efficient container for storing and operating on contiguous slices of memory.

So I assumed you wanted a single Bytes with the whole thing (which you could get with fs::read + Bytes::from(Vec<u8>)), but then you were talking about chunks...

If you assume the file is quite small, we don't need chunks (or rather there is only one, so it does not matter). Basically, what I want is to have only one copy of the file (or chunk) in the program at a time (or quite close to it, since it is slightly racy): it is loaded once and then cached, preferably in a weak form so that the cache does not increment the reference counter. Also, to clarify, I send the file path (as an Arc<Path>, generated earlier in my static site generation code) and the "BodySender" (called just Sender in the docs, but that name clashes with the tokio MPSC Sender). I write the chunks into the "BodySender", but I don't want to allocate more memory per request (or per chunk per request, though we are assuming there is only one chunk), only on the first request for that file (or chunk).
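To illustrate the "weak" cache described here, below is a minimal sketch that swaps Bytes for a plain std Arc<[u8]>, since Arc can be downgraded to a Weak handle. WeakFileCache and get_or_load are hypothetical names, and how the resulting Arc<[u8]> would be handed to hyper is left out.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex, Weak};

/// Hypothetical cache that stores only weak handles, so the file data is
/// freed as soon as the last in-flight download drops its Arc.
#[derive(Default)]
struct WeakFileCache {
    entries: Mutex<HashMap<PathBuf, Weak<[u8]>>>,
}

impl WeakFileCache {
    /// Returns the shared buffer for `path`, loading the file only if no
    /// live copy exists.
    fn get_or_load(&self, path: &Path) -> io::Result<Arc<[u8]>> {
        // The lock is held across the read so that two concurrent requests
        // cannot both load the file; this trades latency for a single copy.
        let mut entries = self.entries.lock().unwrap();
        if let Some(data) = entries.get(path).and_then(|weak| weak.upgrade()) {
            return Ok(data); // another request still holds the data
        }
        // Converting Vec<u8> into Arc<[u8]> copies the bytes once at load time.
        let data: Arc<[u8]> = fs::read(path)?.into();
        entries.insert(path.to_path_buf(), Arc::downgrade(&data));
        Ok(data)
    }
}
```

Concurrent requests for the same path share one allocation, and the cache itself never keeps a file alive. Dead Weak entries stay in the map until they are overwritten, so a long-running server would likely want to prune them.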

I guess this is too confusing. The reason I can't just read into a Vec<u8> and then convert it into Bytes is that it is too slow and takes up too much RAM.

I don't think I'm prematurely optimizing, because the actual tokio part of the program (the part that runs on the tokio threadpool) is basically done, but it is quite a bit slower than nginx.
