Peter Shirley's Ray Tracing In One Weekend implementation in Rust

I implemented Peter Shirley's Ray Tracing In One Weekend in Rust for learning purposes.
Here is the GitHub repo; commits are done chapter by chapter so that anyone can follow along.

Since I did this to learn Rust, the code might be off, so please feel free to give constructive criticism.

Here is the final result (Width: 1200, Height: 800, Sampling: 10), which took ~14 mins to render.
Edit: The above render took ~14 mins because it ran in debug mode; it takes only 3.5 mins when run with --release.

Hello, I did the same exercise at about the same time as you did...
Here is my implementation:

https://github.com/fralken/ray-tracing-in-one-weekend

Cheers,

Neat! Thanks for the book recommendation, that was a lot of fun.

I worked through the exercises in Rust, and then did some optimization. The same scene at 1200x800 with x10 sampling takes 8.5 seconds on my 4-core laptop. (I used Rayon to get nearly-free support for multiple cores; with parallelism disabled it takes 31s.)

https://github.com/cbiffle/rtiow-rust/
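
For anyone wondering why the multi-core speedup is "nearly free": every pixel is independent, so the image can be split into disjoint pieces with no locking. Rayon does this for you with parallel iterators; the sketch below is not Rayon and not the code from the repo above, just a standard-library approximation using scoped threads. The names shade and render_parallel are made up for illustration, and shade is a placeholder gradient rather than a real ray tracer.

```rust
use std::thread;

// Hypothetical per-pixel function; a real renderer would trace rays here.
fn shade(x: usize, y: usize, width: usize, height: usize) -> [u8; 3] {
    let r = (255.0 * x as f32 / width as f32) as u8;
    let g = (255.0 * y as f32 / height as f32) as u8;
    [r, g, 64]
}

// Render the image by giving each worker thread a contiguous band of rows.
fn render_parallel(width: usize, height: usize, workers: usize) -> Vec<[u8; 3]> {
    let workers = workers.max(1);
    let mut pixels = vec![[0u8; 3]; width * height];
    let rows_per_band = (height + workers - 1) / workers; // ceiling division
    let band_len = (rows_per_band * width).max(1); // chunk size must be non-zero

    thread::scope(|scope| {
        // chunks_mut hands out disjoint mutable slices, so no locking is needed.
        for (band, chunk) in pixels.chunks_mut(band_len).enumerate() {
            scope.spawn(move || {
                for (i, pixel) in chunk.iter_mut().enumerate() {
                    let y = band * rows_per_band + i / width;
                    let x = i % width;
                    *pixel = shade(x, y, width, height);
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
    pixels
}

fn main() {
    let image = render_parallel(1200, 800, 4);
    println!("rendered {} pixels", image.len());
}
```

Fixed bands are the simplest split, but they can leave some threads idle when one part of the image is more expensive than the rest. Rayon's work stealing handles that case, which is one reason to prefer it in a real renderer.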

Edit: @BlackGoku36, I read through your implementation, and I don't see a smoking gun to explain the performance difference, except for one thing: I notice that in sphere.rs you clone a Box<Material> inside of hit. That routine is probably the hottest piece of code in the whole program, so it might help to avoid allocating memory there. If you use Rc in place of Box you can return a new pointer to the same Material, avoiding a new allocation.
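
To make the Rc suggestion concrete, here is a minimal sketch. The Material, Lambertian, and Sphere types and the material_for_hit method are simplified stand-ins invented for this example, not the code from either repository. The point is only that Rc::clone bumps a reference count and returns a pointer to the same allocation.

```rust
use std::rc::Rc;

// Hypothetical stand-in for the renderer's Material trait.
trait Material {
    fn albedo(&self) -> f32;
}

struct Lambertian {
    albedo: f32,
}

impl Material for Lambertian {
    fn albedo(&self) -> f32 {
        self.albedo
    }
}

// Hypothetical sphere that shares its material through Rc instead of Box.
struct Sphere {
    material: Rc<dyn Material>,
}

impl Sphere {
    // Stand-in for the material hand-off inside `hit`: Rc::clone only
    // increments a reference count; it does not allocate a new Material.
    fn material_for_hit(&self) -> Rc<dyn Material> {
        Rc::clone(&self.material)
    }
}

fn main() {
    let sphere = Sphere {
        material: Rc::new(Lambertian { albedo: 0.5 }),
    };
    let a = sphere.material_for_hit();
    let b = sphere.material_for_hit();
    // Both handles point at the same allocation.
    assert!(Rc::ptr_eq(&a, &b));
    println!("albedo = {}", a.albedo());
}
```

Note that Rc is not thread-safe; if the renderer is multi-threaded, Arc is the drop-in counterpart, at the cost of atomic reference counting.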

Also, make sure you're running it with --release. :wink:

@cbiffle Thank you for your advice.
Yes, I first ran it without --release (I didn't know about it before :sweat_smile:). After trying --release (which I had seen in another forum post), the rendering time for that scene (1200x800 with x10 sampling) went down to just 3.5 mins. Now I have implemented multi-core rendering with Rayon, and with the same scene and settings the rendering time went down to just 1.2 mins (my CPU is an Intel Core i5-7400 3.00GHz, 4 cores and 4 threads).
This is the speed I have wanted for a long time.

Yay! Glad to hear it! :grin:

If you can manage to remove the clone call, I bet it will get even faster.

Happy hacking!

Thanks @BlackGoku36 and @cbiffle for sharing your implementations. I'm going to apply some of @cbiffle's good practices in my second project:

https://github.com/fralken/ray-tracing-the-next-week

Btw, @BlackGoku36, you can easily fix your commit messages with git rebase -i <commit number> and then git push -f to force the new history to GitHub.

Thanks @fralken, I was finally able to add Image Texture to it :grinning:

Thanks, I will look into it!

Hm. I noticed earlier this week that my Rust implementation is significantly faster than the original C++ code (like up to 2x even on one thread). I'm not entirely sure why yet. But I like puzzles.

Thanks @fralken, I was finally able to add Image Texture to it :grinning:

Glad to hear that!

My implementation is about 30x slower than @cbiffle's :frowning: (90 secs vs 3 secs for the final image of RTTNW). I have to figure out what I'm doing wrong. I guess it is somewhere in the BVH implementation.

Edit: indeed there was an error in the axis-aligned bounding box hit function, now fixed. Now it takes about 8 secs. There's still room for improvement in the BVH implementation.
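
The post doesn't say what the error was, so this is not the fix itself. For reference, here is a minimal sketch of the slab test commonly used for ray/AABB intersection, which is the kind of routine a BVH calls on every node. The Ray and Aabb types are simplified placeholders using plain arrays, not the types from the repository.

```rust
// Hypothetical minimal types; a real renderer would use its own Vec3.
struct Ray {
    origin: [f32; 3],
    direction: [f32; 3],
}

struct Aabb {
    min: [f32; 3],
    max: [f32; 3],
}

impl Aabb {
    // Returns true if the ray overlaps the box somewhere in (t_min, t_max).
    fn hit(&self, ray: &Ray, mut t_min: f32, mut t_max: f32) -> bool {
        for axis in 0..3 {
            let inv_d = 1.0 / ray.direction[axis];
            let mut t0 = (self.min[axis] - ray.origin[axis]) * inv_d;
            let mut t1 = (self.max[axis] - ray.origin[axis]) * inv_d;
            // A negative direction component reverses the entry/exit order.
            if inv_d < 0.0 {
                std::mem::swap(&mut t0, &mut t1);
            }
            // Shrink the interval to the overlap with this axis's slab.
            t_min = t_min.max(t0);
            t_max = t_max.min(t1);
            if t_max <= t_min {
                return false;
            }
        }
        true
    }
}

fn main() {
    let unit = Aabb { min: [0.0; 3], max: [1.0; 3] };
    let ray = Ray { origin: [-1.0, 0.5, 0.5], direction: [1.0, 0.0, 0.0] };
    println!("hit: {}", unit.hit(&ray, 0.001, f32::INFINITY));
}
```

A box test that wrongly reports hits still renders the correct image, because the primitives inside are tested anyway; it just makes the BVH prune almost nothing. That is consistent with a slowdown and no visible artifacts, though it is only a guess about what happened here.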

It could be a CPU difference too.
Comparing @cbiffle's Intel i7-8550U (4 cores / 8 threads) and my Intel Core i7-4770HQ (4 cores / 8 threads) (spoiler: @cbiffle's CPU is really faster than mine):

The RTTNW final image (300x300, 100 samples) took around 13.5 sec (me) vs 5 sec (@cbiffle)
(of course, the material clone call is still there :sweat_smile:)

Edit: I tested @cbiffle's implementation on my machine: it is 13.5 sec (mine) vs 6.5 sec (@cbiffle's), so you can say mine is around 2x slower (it might be the material clone call :sweat_smile:)

Here are some exercises that might improve the performance of your renderer. As always, it is useful to get acquainted with your operating system's profiler.

Happy hacking!

  • Avoid allocating or cloning an Arc in any routine called from color.

  • Replace recursion with iteration in color.

  • Evaluate improving locality by storing copies of things directly instead of indirectly through Box or sharing them through Arc. The Material in a Hitable is the case that brought me the biggest speedups.

  • Use a faster, lower-quality random number generator. rand::thread_rng is seeded from system entropy and intended to be cryptographically secure. I switched to rand::rngs::SmallRng. The downside of this is that you have to pass it around.

  • Globally, try to reduce dynamic dispatch.

    • Try making "modifier" types into generic types. For example, convert FlipNormals into FlipNormals<O> where O is some kind of object. Implement Hitable for Box<dyn Hitable> and then you can write FlipNormals<Box<dyn Hitable>> if you really need the old behavior, or (say) FlipNormals<AARect> if you want compile-time specialization.

    • HitableList is an excellent candidate for that technique, btw. If you profile your program, you're likely to see it near the top.

    • If you do both of those, you might observe that a cube can be described as two pieces: a HitableList<AARect> and a HitableList<FlipNormals<AARect>>. In the book's final scene there are a lot of cubes, so removing the dynamic dispatch inside each cube's intersection routine will save cycles.
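
The generic "modifier" idea from the last few bullets can be sketched as follows. This is a simplified illustration under my own assumptions, not code from any of the repositories: HitRecord carries only t and a normal, and XyPlane is an unbounded plane standing in for AARect.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct HitRecord {
    t: f32,
    normal: [f32; 3],
}

struct Ray {
    origin: [f32; 3],
    direction: [f32; 3],
}

trait Hitable {
    fn hit(&self, ray: &Ray, t_min: f32, t_max: f32) -> Option<HitRecord>;
}

// Hypothetical stand-in for AARect: the unbounded plane z = k.
struct XyPlane {
    k: f32,
}

impl Hitable for XyPlane {
    fn hit(&self, ray: &Ray, t_min: f32, t_max: f32) -> Option<HitRecord> {
        let t = (self.k - ray.origin[2]) / ray.direction[2];
        // Written this way so that a NaN t is also rejected.
        if !(t >= t_min && t <= t_max) {
            return None;
        }
        Some(HitRecord { t, normal: [0.0, 0.0, 1.0] })
    }
}

// Generic modifier: the wrapped type is known at compile time.
struct FlipNormals<O>(O);

impl<O: Hitable> Hitable for FlipNormals<O> {
    fn hit(&self, ray: &Ray, t_min: f32, t_max: f32) -> Option<HitRecord> {
        self.0.hit(ray, t_min, t_max).map(|mut rec| {
            rec.normal = rec.normal.map(|c| -c);
            rec
        })
    }
}

// Lets FlipNormals<Box<dyn Hitable>> recover the old dynamic behavior.
impl Hitable for Box<dyn Hitable> {
    fn hit(&self, ray: &Ray, t_min: f32, t_max: f32) -> Option<HitRecord> {
        (**self).hit(ray, t_min, t_max)
    }
}

// Homogeneous list: no virtual call per element when T is a concrete type.
struct HitableList<T>(Vec<T>);

impl<T: Hitable> Hitable for HitableList<T> {
    fn hit(&self, ray: &Ray, t_min: f32, t_max: f32) -> Option<HitRecord> {
        let mut closest = t_max;
        let mut result = None;
        for object in &self.0 {
            if let Some(rec) = object.hit(ray, t_min, closest) {
                closest = rec.t;
                result = Some(rec);
            }
        }
        result
    }
}

fn main() {
    let ray = Ray { origin: [0.0; 3], direction: [0.0, 0.0, 1.0] };
    // Static dispatch: HitableList<FlipNormals<XyPlane>>.
    let fixed = HitableList(vec![FlipNormals(XyPlane { k: 2.0 }), FlipNormals(XyPlane { k: 1.0 })]);
    // Dynamic dispatch: FlipNormals<Box<dyn Hitable>>.
    let boxed: Box<dyn Hitable> = Box::new(XyPlane { k: 1.0 });
    let dynamic = FlipNormals(boxed);
    println!("{:?}", fixed.hit(&ray, 0.001, f32::MAX));
    println!("{:?}", dynamic.hit(&ray, 0.001, f32::MAX));
}
```

Both versions return the same hits; the difference is that the compiler can inline the calls in the first one. Whether that pays off in a given renderer is something to confirm with a profiler.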

Cool!
Any tips or resources for implementing .obj objects in this ray tracer?

@BlackGoku36 This project is based on Ray Tracing in One Weekend and has a really clean way of implementing .obj files. You may also want to look into the tobj crate, which points to another example ray tracer that does it. This is definitely my next step too.

Hey guys! Thought I'd also throw my version in here. It's my first real Rust project and I found it to be a great way to get comfortable with the language.

Thanks for those links. I tried implementing it based on Scratchapixel and Bheisler's Pathtracer, but the result was really slow: it took a whole minute to render a low-poly Stanford bunny. With that crate and an implementation based on that project, rendering now takes only 2-3 secs :grinning:
I decided to keep the tutorial (Peter Shirley's minibooks) separate, so it is here now: https://github.com/BlackGoku36/RRayTracer

I want to try adding real-time previewing with windowing (rust-sdl2, winit, etc.), but I have no idea how to do this.

Hello there, for completeness I also implemented the third book, The Rest of Your Life :slightly_smiling_face:
