I'm a new programmer. I started learning to code about 3-4 months ago, and I'm building a full-stack website for hands-on experience. I've written the backend server in Rust (no frameworks, as I wanted to develop a more thorough understanding of how backend servers work). Currently my server only serves static HTML, CSS and image files. Naturally, I was curious what the best format for sending images on the web is, so I looked it up on Google and ChatGPT and found that WebP is the best format for sharing image files. So I adjusted my server to serve my images as WebP instead of JPG. The problem is that the image still takes time to load in the browser. Can anyone help me with this?
My GitHub repo: Stargardts/tcp
And here's my website: https://stargardts.me
Sorry, I just get a timeout while loading your website
Hello.
I am an experienced programmer, but happen to be new to Rust.
I think you are doing a great thing by trying to get a real understanding of how things work.
However, I would argue Rust itself is rather nasty to learn as your first language mostly due to the borrow checker. There are other languages out there which will give you enough space to learn without making you think about something like this. While it is hard to say which language would be the best overall, I would argue Go is a great first pick.
This being a Rust site I may get lambasted for the following, but here it goes: there is no denying the docs are atrocious and following them will actively hinder your progress. For example, you took code samples from "Turning Our Single-Threaded Server into a Multithreaded Server" in The Rust Programming Language, or from a resource derived from it. Multithreading as employed there is actively detrimental to performance and is not how things are done in real programs, at least on Unix systems (like Linux, where I presume you are trying to run this). Moreover, you can have a high-performance, scalable httpd server without threads (but with multiple processes instead).
All that said, whatever language you decide to stick with, I would suggest finding production quality libraries for what you are trying to do and consulting them when stumped.
Good luck.
This is a topic that comes up frequently. I'm not convinced it is as nasty as some make out. I say that as someone introduced to programming with BASIC and assembler pretty much concurrently, then Algol68. I don't believe Rust can be much harder as a first language, and that was manageable. Yes, the borrow checker is perhaps an extra twist for a beginner, but surely not so terrible.
I'd love for our OP, Stargardts, to persist with Rust and report back on the experience. We need such bold beginners to point out where the difficulties are. I guess the real problem is that the Rust Book is not written with raw beginners to programming in mind. Sorry, I don't know of any materials that are. "Rust In Action" by Tim McNamara might help.
I'm curious as to why you say "Multithreading as employed there is actively detrimental to performance". It seems quite reasonable when I look at it.
I'm even more curious why you say using processes instead of threads would be even more performant. That is contrary to almost everything I have ever read about the cost of threads vs processes.
Anyway, I'm going to assume that a beginner is not expected to aim for the greatest, balls-to-the-wall performance.
I started with some BASIC on Atari and then moved to C. Tons of UB and segfaults later I emerged victorious and I'm pretty sure Rust + borrow checker is the easier path. Yet I don't think either is a good one to pick if it can be helped.
You are partially discouraging Rust yourself by having trouble coming up with good introductory material for people who are new to programming to begin with.
I'm curious as to why you say "Multithreading as employed there is actively detrimental to performance". It seems quite reasonable when I look at it.
Should you strace any real server (say, nginx), you will find that all of the workers share the listening socket and each accepts new connections on its own -- this is load balanced by the kernel. So in particular, if you have n workers and n cores, you can accept all the new connections at about the same time. In contrast, the code at hand can only accept one at a time and forces context switches -- if you have n CPUs, n workers and 1 extra thread to dispatch, someone has to go off CPU.
One thread producing work and a pool of workers consuming is a known and legitimate pattern, it's just not fit for this particular purpose.
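To make the shared-listening-socket idea concrete, here is a minimal sketch using only the standard library. It is an assumption-laden illustration, not how nginx is written: it uses threads rather than worker processes to keep it short, and `spawn_accepting_workers` and `handle` are hypothetical names. The point is only that each worker calls accept on its own handle to the same socket, so there is no dispatcher thread in the middle.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Binds one listening socket and spawns `workers` threads that each accept
/// connections on their own clone of it. Returns the bound address.
/// Hypothetical helper: threads stand in for nginx-style worker processes.
fn spawn_accepting_workers(addr: &str, workers: usize) -> std::io::Result<SocketAddr> {
    let listener = TcpListener::bind(addr)?;
    let local = listener.local_addr()?;
    for id in 0..workers {
        // try_clone gives a second handle to the same underlying socket.
        let listener = listener.try_clone()?;
        thread::spawn(move || {
            // Every worker blocks in accept; the kernel hands each new
            // connection to one of them.
            for stream in listener.incoming() {
                if let Ok(mut stream) = stream {
                    let _ = handle(&mut stream, id);
                }
            }
        });
    }
    Ok(local)
}

/// Reads the request (ignoring its contents) and sends a tiny fixed response.
fn handle(stream: &mut TcpStream, worker_id: usize) -> std::io::Result<()> {
    let mut buf = [0u8; 1024];
    let _ = stream.read(&mut buf)?;
    let body = format!("served by worker {}\n", worker_id);
    let response = format!(
        "HTTP/1.1 200 OK\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        body.len(),
        body
    );
    stream.write_all(response.as_bytes())
}
```

Calling `spawn_accepting_workers("127.0.0.1:0", 4)` starts four workers on an OS-chosen port. Which worker serves a given connection is up to the kernel, so the worker id in the response will vary.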
I'm even more curious why you say using processes instead of threads would be even more performant. That is contrary to almost everything I have ever read about the cost of threads vs processes.
I don't know what you have been reading.
Say you have a program which spawns a thread and the new thread goes off cpu indefinitely, not disturbing any work on its own. This situation alone decreases performance of several things for the main thread.
Both the kernel and glibc now have to make concessions stemming from the mere existence of the thread; some of it is thread cancellation support and some of it is synchronisation just in case.
Take malloc, for example. In the malloc2 test ("malloc/free of 1kB") from antonblanchard/will-it-scale on GitHub, merely having a thread drops the alloc/free rate from ~144797791 to ~134990045, a haircut of about 7% (see the end for full results).
For a kernel example, syscalls which take a fd as an argument do an internal fd -> file translation without any atomic ops -- there is nobody to mess with the file, so they can afford to merely find it and use it until done. In contrast, if there are other threads which share the file descriptor table, the kernel now has to take a reference to the found file object using an atomic op (expensive) and then release it again using atomics (again expensive, even on modern CPUs). You may also notice that opening and closing file descriptors now has to be synchronized against other threads, and it so happens that this is a known bottleneck for heavily multithreaded programs with a high fd churn rate.
There are numerous other warts, for example in mmap.
Not that threads are useless, but there is a price to pay and chances are decent you don't even need them. They are definitely not a hard requirement for parallel processing for the vast majority of programs, HTTP servers being a great example.
# taskset --cpu-list 2 ./malloc2_processes -n
testcase:malloc/free of 1kB
warmup
min:144905075 max:144905075 total:144905075
min:144837421 max:144837421 total:144837421
min:144749375 max:144749375 total:144749375
min:144783953 max:144783953 total:144783953
min:144803352 max:144803352 total:144803352
min:144580176 max:144580176 total:144580176
measurement
min:144582280 max:144582280 total:144582280
min:144797791 max:144797791 total:144797791
# taskset --cpu-list 2 ./malloc2_threads -n
testcase:malloc/free of 1kB
warmup
min:134601245 max:134601245 total:134601245
min:134868131 max:134868131 total:134868131
min:134864794 max:134864794 total:134864794
min:135008028 max:135008028 total:135008028
min:134831842 max:134831842 total:134831842
min:134764404 max:134764404 total:134764404
measurement
min:134805886 max:134805886 total:134805886
min:134990045 max:134990045 total:134990045
You're absolutely right, especially about the documentation. Along with Rust, I've also been learning Python. Having gone through both languages' docs, I find the Rust documentation extremely cryptic. Especially for someone like me, who hasn't had any formal education in CS, understanding it is very difficult. Which is why, more often than not, I have to rely on articles and blog posts for guidance.
I discourage Python as it is too lax on types, but that's my take. Go feels like an OK middle ground to me.
What most people will rightfully point out, though, is that learning 2 languages at the same time will only lead to confusion. I strongly recommend you stick to just 1 (even if it happens to be Rust or Python) until you become proficient in it.
Just noting first that you asked about how to optimize streaming of WebP files to the client from your server, not which programming language to use.
By what you said I'm guessing you're using WebP because the file size should be smaller, but when you test this, the time to download the WebP is no shorter than the time to download the jpg. Is that right?
This is not something I've done, but I took a look at your repo and I only see a jpg file and its path "/images/pic.jpg" in files.rs. So I would check to be sure that:
- you're streaming a WebP file
- mime_guess::from_path is returning the correct mime type for WebP: image/webp
- the client (browser) is sending an Accept: image/webp header to indicate it can handle WebP files
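The last two checks can be sketched with the standard library alone. This is only an illustration under assumptions: `content_type_for` is a hypothetical stand-in for the value you would expect mime_guess::from_path to give for these extensions, `accepts_webp` is a hypothetical helper that looks only for an explicit image/webp entry (it ignores wildcards such as image/*), and "/images/pic.webp" is an assumed path, not one from the repo.

```rust
use std::path::Path;

/// Maps a file extension to the Content-Type the server should send.
/// Hypothetical stand-in for the result expected from `mime_guess::from_path`.
fn content_type_for(path: &str) -> &'static str {
    let ext = Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match ext.as_deref() {
        Some("webp") => "image/webp",
        Some("jpg") | Some("jpeg") => "image/jpeg",
        Some("png") => "image/png",
        Some("html") => "text/html",
        Some("css") => "text/css",
        _ => "application/octet-stream",
    }
}

/// Returns true if the request head has an Accept header that explicitly
/// lists image/webp. Wildcards like image/* are deliberately not counted.
fn accepts_webp(request_head: &str) -> bool {
    request_head
        .lines()
        .skip(1) // skip the request line, e.g. "GET /images/pic.webp HTTP/1.1"
        .take_while(|line| !line.is_empty()) // headers end at the first blank line
        .filter_map(|line| line.split_once(':'))
        .filter(|(name, _)| name.trim().eq_ignore_ascii_case("accept"))
        .any(|(_, value)| {
            value.split(',').any(|item| {
                // Drop parameters such as ";q=0.8" before comparing.
                let media = item.split(';').next().unwrap_or("").trim();
                media.eq_ignore_ascii_case("image/webp")
            })
        })
}
```

Logging both values for each image request on the server side is a quick way to confirm what is actually being negotiated. The browser's network tab shows the same headers from the client side.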
Of course, also check that the file size of the WebP file is actually smaller than the JPG file; if it is not significantly smaller, you may not notice a difference in download speed.
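A quick way to compare the two files on disk, as a minimal sketch: `size_ratio` is a hypothetical helper, and the paths you pass it are whatever your WebP and JPG files happen to be called.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the size of `candidate` as a fraction of `baseline`
/// (for example, 0.6 means the candidate is 40% smaller).
fn size_ratio(candidate: &Path, baseline: &Path) -> io::Result<f64> {
    let candidate_len = fs::metadata(candidate)?.len();
    let baseline_len = fs::metadata(baseline)?.len();
    if baseline_len == 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "baseline file is empty",
        ));
    }
    Ok(candidate_len as f64 / baseline_len as f64)
}
```

A ratio close to 1.0 would suggest the conversion saved very little, in which case the load time is probably dominated by something other than the image format.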
How are you measuring download speed?