The 'least unsafe' way to provide mutating access to a buffer?

So the question is in the context of reddit discussions on https://www.reddit.com/r/rust/comments/fr50vl/r2_a_router_in_rust/ - where [BigHandLittleSlap] pointed out that the way I let applications have mutating access to packet buffers is terrible - https://github.com/gopakumarce/R2/blob/4457cb0c4fdb14f8aad3d2da46cd5e5524f29902/packet/src/lib.rs#L57

And obviously i'll be more than happy to correct it, can someone tell me whats the right way to do it ? This is what I responded there


And this was my thought process behind it. There are two ways to give applications a way to write into the packet

  1. Supply their own data buffer which gets copied into the packet
  2. Give them access to the packet but with bounds

I initially started off with #1 - but having copies of hundreds of bytes of data for every packet just wont cut it for performance. My "assumption" was that since I am providing a slice to the apps, theres no way they can overrun it/overwrite it and hence that should be acceptable.

I can be very wrong, and will be very glad to accept feedbacks on whats the right mechanism, and ill rework it


I actually don't see a problem with that particular function, as long as the raw buffer is owned by Particle and nobody else has access to it.
however… Particle::new() needs to be an unsafe fn, as it takes a raw pointer and dereferences it safe methods.

How do you ensure the a Particle doesn't outlive the buffer? I don't see Particle implementing drop and freeing the buffer, so it is actually a &mut [u8] which would allow rustc to check things for you.

2 Likes

Why don't you just use the bytes crate? It's a de facto standard buffer of async Rust. This crate supports to split ownership of preallocated large buffer into disjoint views, either writable with unique access or shared but read-only. It also provides traits to read/write values from/into byte buffer.

2 Likes

The concern he had is this, which is a valid concern - that I am making it mut without worrying about ownership etc.. I need to rethink on that yes, looking for the proper way to do that


Mutable means mutable. Never try to magically convert anything even vaguely "const" or not "mut" into mut. This is asking for trouble. More accurately, "mut" means unique ownership. If you have a method that provides mutation, then during that time it must have (temporary) unique ownership and/or the equivalent guarantee. Some sort of lease-release, lock-unlock, or other mechanism to guarantee that no other unique copy of the reference is kept. Typically this is implemented with RAII or something similar, but it's got to be enforced by the Rust type system, not doc comments.

The linked method doesn't do this. It just gleefully hands out a mutable buffer reference it got from a const reference using an unsafe block.

Elsewhere in your code you use Arc, which provides this kind of guarantee at runtime, and does so in a thread-safe way as well. However, you mention in several places that ideally your packet processing code runs as a single thread per core. In that case you can use Rc, which isn't thread safe but lower overhead than Arc. However, again, you've marked the various packet buffer structs as Send and Sync, which is a dubious decision at best, because I don't believe these are thread-safe as-is.

Ideally, if you really want to embrace the Rust model, have thread-local buffer pools that are not thread safe. That is, not Send or Sync. This would let consumers safely use your APIs without being able to accidentally send packet buffers across threads. Then instead of Arc, you can use Rc, which is efficient enough to use even in high performance network code to implement safe mutable buffer access using dynamic ownership checks. No unsafe code needed.

Encode your design in the type system, and then utilise the guarantees this affords to optimise it by eliminating the need for locks or interlocked data access. Use the guarantees to allow safety without overhead. Don't cheat by adding unsafe utility methods.


1 Like

Thanks .. I will look at the bytes crate. I am not sure if it will work only with Box allocated heap mem or can it work with memory allocated by custom allocators (like the Packet memory in my case comes from a custom allocator). Anyways, ill look at the bytes crate, thanks for the info

bytes comes with a fair bit of complexity and overhead that you may or may not want depending on the situation. The upside is that you can hold on to slices into it for pretty much arbitrary amounts of time without ownership issues. The downsides are atomic reference counting and non-contiguous backing storage.

As far as I can tell the main issue with this code is the cast from *const u8 to *mut u8, which is undefined behavior in Rust, and it doesn't matter if mutation actually happens or not.

So why do you store *const u8 in the struct in first place?

Also, why pointers and not a slice? I don't see any lifetime information attach to the struct (as a slice would necessitate), so if the backing storage is ever freed this could easily lead to a use-after-free.

2 Likes

The downsides are atomic reference counting and non-contiguous backing storage.

Thanks for the response. Yes atomic ref count will be a dampener, right now in my packet path there is a single atomic reference per packet, when I initially get packets from a lockfree pool, and I would want to keep the number of atomics as less as possible

So why do you store *const u8 in the struct in first place?

I dont think there was any particular reason for that, I can make it *mut u8, thats fine.

Also, why pointers and not a slice? I don't see any lifetime information attach to the struct (as a slice would necessitate), so if the backing storage is ever freed this could easily lead to a use-after-free.

You mean in the Particle instead of raw: *mut u8, have a raw: &mut [u8] and hence the & will force a lifetime ? I will have to play around with that, let me check on it, thanks!

Rgds,
Gopa.

Yup, exactly. You can also attach a lifetime to a struct with a raw pointer using PhantomData, but that's more complicated and likely not needed in this case.

You can get more prompt responses to these questions in #black-magic on Rust community discord.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.