Request for review: rsor (Reusable Slice of References)

mgeier · November 29, 2020, 2:41pm

I hope the documentation is clear enough about what this crate is supposed to do and how to use it. If not, please let me know!

The main task of this crate is to manipulate lifetimes with some unsafe code, so it would be great to know whether that unsafe code is sound or not.

I would also like to know whether my argument regarding the Send implementation makes sense: rsor::Slice - Rust.

Furthermore, I don't know whether/how exactly I should implement Sync, see https://github.com/mgeier/rsor/issues/7.

Apart from that, I'd love to hear any other kind of feedback, whether it is about coding style, API conventions, naming, documentation, packaging or whatever else!

8573 · November 30, 2020, 7:26am

You might also be interested in https://github.com/rust-secure-code/safety-dance, which is maintained by the Secure Code Working Group as a place to ask for unsafe Rust to be audited. On one hand, I wouldn't guarantee that a new crate without dependents would get much priority in their backlog; on the other hand, that it's a small crate and you've thought about soundness may make it more tractable and appealing to audit.

mgeier · November 30, 2020, 3:57pm

Thanks for the link, this looks like a very good institution!
However, they seem to be talking about crates with hundreds or even thousands of daily downloads, and I don't think asking for my little crate to be reviewed there would be appropriate.

I still hope to get some more responses from the community here, though!

In the meantime, I did a little bit of refactoring: Use NonNull::cast() + a small refactoring by mgeier · Pull Request #8 · mgeier/rsor · GitHub.
This doesn't reduce the amount of unsafe, but hopefully makes it a little clearer what's going on.

I've also tried to extract the main "questionable" things that I'm doing in the implementation and reducing it to a few lines of code:

pub fn create_and_destroy_vec<T>(some_ref: &T)
where
    T: ?Sized,
{
    let mut v = std::mem::ManuallyDrop::new(Vec::<&T>::new());
    v.push(some_ref);
    let ptr = v.as_mut_ptr();
    let ptr = ptr as *mut &mut T;
    let _ = unsafe { Vec::from_raw_parts(ptr, 0, v.capacity()) };
}

Note that there are (at least?) two "dubious" things going on:

A reference is pushed to the Vec and never removed, but later the Vec is assumed to have a length of 0.
A Vec<&T> is created, but a Vec<&mut T> is dropped.

Are those things "allowed"?

mbrubeck · November 30, 2020, 4:37pm

That's fine. Leaking a value is always sound, and in the case of references (which have no destructors), it has no effect at all.

This is equivalent to calling set_len before dropping the vector, so the Vec::set_len documentation is relevant.

This appears to meet all the safety requirements for Vec::from_raw_parts, since &T and &mut T have the same size and alignment.

Your Send impl also seems correct to me at first glance, but I’d need to study the crate and think longer before I could be confident in this.

mgeier · December 1, 2020, 2:51pm

Thanks a lot @mbrubeck, I'm relieved that my assumptions were not totally wrong.

Thanks also for your help with the Sync implementation: https://github.com/mgeier/rsor/pull/11.

krdln · December 1, 2020, 7:40pm

Perhaps related – there was a PR recently which added specialisation of collect() that allows memory reuse in some cases: https://github.com/rust-lang/rust/pull/70793. I was mind-blown when I first seen this PR, and it looks like it made to the stable channel on 1.48!

Using that mechanism there's possible to achieve something similar to Slice. Playground. Haven't looked too deep into rsor though, so not sure 100% if this is a similar usecase or not. Also, not sure if it's good idea to rely on this specialisation, of course

Also, custom allocators are coming, so I guess all such hacks for memory reuse may become obsolete.

mgeier · December 2, 2020, 10:47am

Wow, I didn't think this was possible without unsafe!

This is great! Your API is somewhat similar to the vecstorage crate, which solves the same problem as rsor::Slice.

I've learned a lot from analyzing your code, thanks for taking the time to write it!

I've somehow gotten the (wrong) impression that raw pointers can only be used for Sized types. I guess I misread some compiler error some time ago and never again tried to create a fat raw pointer. Thanks to your code I know now that this is possible after all. I've also double-checked in the reference: Type layout - The Rust Reference. Raw pointers can be just as fat as references! And both have the same layout.

This new-found knowledge allowed me to improve the implementation of rsor::Slice: Store a NonNull<*const T> instead of NonNull<()> by mgeier · Pull Request #12 · mgeier/rsor · GitHub. I think it is now safer and simpler, which is great!

Yes, it definitely solves the given problem. However, I don't think the API is as nice.

It works very well for the initial code example, but I'm more interested in the situations which I've mentioned in the Common Use Cases section. I guess in this case slice_borrow.extend(...) can be used with iterators, but it's slightly more convenient to have a one-liner like reusable_slice.from_iter_mut(...).

I'm not sure whether an abstraction similar to from_refs() could be made?

Another shortcoming of the fact that the Vec is returned to the user is that the user has to be more careful about where the scope of it ends.

I first thought you added the second scope in the example just for symmetry, but it turns out that it is actually necessary, because without it the compiler complains:

mutable borrow might be used here, when `strings` is dropped and runs the `Drop` code for type `SliceBorrow`

This limitation doesn't exist for rsor::Slice.

Why? Because the language doesn't guarantee this specialization to be present?

I guess this will depend on how easy those allocators will be to use. I'm looking forward to them!
But maybe some additional custom abstractions will still be helpful ...

krdln · December 2, 2020, 10:27pm

Didn't know about that crate, thanks! The API indeed looks really similar (perhaps even the same?)

I think something almost like it should be possible. I guess the main difference would be that you couldn't just return a slice, but rather something that derefs to a slice (because we need to have custom drop to return the mem).

Ha! you got me there Yes, drop needs to be explicit (I guess drop(strings) should also work). Your original ("scopeless") code seems to rely on NLL, right?

Take a look the comment on PR itself (in Concerns section):

vec.into_iter().filter(|_| false).collect() will no longer return a vec with 0 capacity, instead it will return its original allocation. This avoids the cost of doing any allocation or deallocation but could lead to large allocations living longer than expected.
If that's not acceptable some resizing policy at the end of the attempted in-place collect would be necessary, which in the worst case could result in one more memcopy than the non-specialized case.

…which is exactly what I've relied on. So this may change in the future. Also, I'm not 100% sure if compiler is required to use the specialized version in all cases (most cases specialization are just optimizations). It would be weird if it wasn't, though.

mgeier · December 3, 2020, 6:02pm

It's not quite the same. I think the main difference is that VecStorage isn't limited to references. It can store arbitrary Sized types and if you want to store references, you have to use a reference as type parameter, e.g. VecStorage<&[f32]> in contrast to Slice<[f32]>.

This flexibility has the downside that there are a few situations where VecStorage panics.

And I think type inference doesn't really work, because VecStorage has a different type parameter than the one used in the returned Vec (the returned helper struct has actually two type parameters: VecGuard<T, TGuard>).

Yes, I think so. NLL are great!

This is also one of the reasons why I implemented rsor::Slice instead of simply using vecstorage::VecStorage.

Yeah, it sounds a bit brittle to iterate over an empty Vec and to rely on the collect()ed result having the same capacity.

It is indeed a bit strange that you can collect() into a "new" Vec and end up with a probably unexpectedly huge capacity.

In the end, I'd like to make use of NLL and I'd like to be able to guarantee that no new memory is allocated as long as the capacity is sufficient, so I'll stick with my rsor::Slice API. But your code is still great and it has really extended my understanding!

system · March 3, 2021, 6:03pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
RStats - my first crates.io contribution code review	11	694	December 16, 2020
Secure and Efficient Open-Source Blockchain Identity System in Rust community	2	312	February 10, 2024
New crate: dumpster, a concurrent garbage collector announcements	5	634	November 12, 2023
Unsafe code review request for cycle collector code review	2	277	July 13, 2023
First project review code review	6	584	April 1, 2023

Request for review: rsor (Reusable Slice of References)

Related Topics