New library for lifetime-erased borrowing across threads

bgr360 · December 15, 2022, 6:28am

Hey y'all. I threw together this crate that I'm calling lender-loan which is based on a similar mechanism that we have at work for our C code. Basically, it's an Arc that guarantees that the value is dropped on the same thread that it was constructed on by blocking until all other references have dropped.

The version I threw together is for the specific use case of lending a lifetime-erased & reference to other threads. Here's what it looks like:

use lender_loan::{Lender, Loan};

fn use_loan(loan: Loan<Vec<i32>>) {
    assert_eq!(*loan, vec![1, 2, 3]);
}

// Create a value that will be lent to other threads.
let mut value = vec![1, 2, 3];
Lender::with(&value, |lender: &Lender<'_, Vec<i32>>| {
    for _ in 0..100 {
        let loan = lender.lend();
        std::thread::spawn(move || use_loan(loan));
    }
});

// It is safe to modify the value again; `Lender::with` blocks until all loans are dropped.
value.push(4);

I'm seeking:

Thoughts on the idea or possible use cases I didn't think of.
A quick double-check on my unsafe code. I like to think I know what I'm doing, but ya never know.

Also, has anyone seen this sort of thing before? Is there some other name for this concurrency / smart pointer primitive? I only know it by the name "lender/loan" so I didn't really know what to search for. It feels like a close cousin to yoke, but it's definitely not trying to solve the same problem.

I think that I could extend it to work with other types of "loan principals" (heh). Like Cow<'a, str> instead of &'a T for example. Probably with a very similar trait mechanism to that of yoke.

H2CO3 · December 15, 2022, 7:26am

I guess this is thread::scope()?
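For comparison, here is a rough sketch of the opening example written with `std::thread::scope` (stable since Rust 1.63) instead of `lender_loan`. The function name `scoped_demo` is made up for illustration. The scope joins every spawned thread before it returns, which is stricter than waiting only for the borrows to end.

```rust
use std::thread;

/// Hypothetical helper: borrows `value` from 100 scoped threads, then mutates it.
pub fn scoped_demo() -> Vec<i32> {
    let mut value = vec![1, 2, 3];

    thread::scope(|scope| {
        for _ in 0..100 {
            // Scoped threads may borrow `value` directly; no `'static` bound is needed.
            scope.spawn(|| assert_eq!(value, vec![1, 2, 3]));
        }
    }); // Every scoped thread has been joined by the time `scope` returns.

    // The shared borrows have ended, so mutation is allowed again.
    value.push(4);
    value
}
```

Calling `scoped_demo()` should hand back the vector with the extra element appended.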

Coding-Badly · December 15, 2022, 7:27am

If you pass ownership of value to Lender::with then return value from that same call I think you can completely eliminate the lifetime. That should eliminate the unsafe code. Though I think you'd end up with two Arcs.
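One way this suggestion might look, as a sketch only: the value is moved into an `Arc`, the closure hands out clones, and the caller gets the value back once it is the sole owner again. The names `with_owned` and `owned_demo` are hypothetical and not part of `lender_loan`, and the busy-wait below stands in for the proper blocking that a second `Arc` with a `Condvar` would provide.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical safe variant: takes ownership, shares via `Arc`, returns the value.
pub fn with_owned<T, R>(value: T, op: impl FnOnce(&Arc<T>) -> R) -> (T, R) {
    let mut shared = Arc::new(value);
    let result = op(&shared);
    loop {
        // `try_unwrap` succeeds only once every other clone has been dropped.
        match Arc::try_unwrap(shared) {
            Ok(value) => return (value, result),
            Err(still_shared) => {
                shared = still_shared;
                // Crude spin; a real implementation would block on a condition variable.
                thread::yield_now();
            }
        }
    }
}

/// Hypothetical usage mirroring the opening example.
pub fn owned_demo() -> Vec<i32> {
    let (mut value, ()) = with_owned(vec![1, 2, 3], |shared| {
        for _ in 0..8 {
            let loan = Arc::clone(shared);
            thread::spawn(move || assert_eq!(*loan, vec![1, 2, 3]));
        }
    });
    // All clones are gone, so the value is exclusively owned again.
    value.push(4);
    value
}
```

No `unsafe` is needed here, at the cost of moving the value in and out and of requiring `T: 'static` for anything sent to an unscoped thread.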

LegionMammal978 · December 15, 2022, 2:42pm

A couple issues with the unsafe code:

LoanInner<'a, T>: Send should only be satisfied when T: Sync, since it's logically a shared reference to T. This doesn't make a difference to the soundness of the crate, since it only releases Arc<LoanInner<'a, T>>s, and Arc requires both Send and Sync for the value to be transferred between threads.

Lender::with() fails to wait for the loans to be dropped if op() performs an unwinding panic. This can allow the original thread to invalidate the reference while it is still being used:

use lender_loan::Lender;
use std::{
    panic::{self, AssertUnwindSafe},
    sync::mpsc,
    thread,
};

fn main() {
    let (sender, receiver) = mpsc::sync_channel(0);
    let s = "Hello, world!".to_owned();
    panic::catch_unwind(AssertUnwindSafe(|| {
        Lender::with(&s[..], |lender| {
            let loan = lender.lend();
            thread::spawn(move || {
                receiver.recv().unwrap();
                println!("{loan}"); // use-after-free
            });
            panic!()
        });
    }))
    .unwrap_err();
    drop(s);
    sender.send(()).unwrap();
}

So it might be useful to protect the wait() behind a drop guard pattern, either through a third-party crate for that purpose (e.g., scopeguard) or a manually implemented guard type with a Drop impl (explained, e.g., here).
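A minimal sketch of the manually implemented guard, under the assumption that outstanding loans are tracked by a counter plus a condition variable. `Outstanding`, `WaitGuard`, and `guard_demo` are illustrative names, not the crate's actual internals. Because the waiting happens in `Drop`, it also runs while a panic unwinds through the frame that holds the guard.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical bookkeeping: how many loans are still alive.
#[derive(Default)]
pub struct Outstanding {
    count: Mutex<usize>,
    all_returned: Condvar,
}

/// Guard whose destructor blocks until every loan has been returned.
pub struct WaitGuard<'a>(pub &'a Outstanding);

impl Drop for WaitGuard<'_> {
    fn drop(&mut self) {
        // Tolerate poisoning: panicking again inside `drop` during an unwind would abort.
        let mut count = self.0.count.lock().unwrap_or_else(|e| e.into_inner());
        while *count > 0 {
            count = self
                .0
                .all_returned
                .wait(count)
                .unwrap_or_else(|e| e.into_inner());
        }
    }
}

/// Returns true if the guard waited for the worker even though the closure panicked.
pub fn guard_demo() -> bool {
    let state = Arc::new(Outstanding::default());
    *state.count.lock().unwrap() = 1; // Pretend one loan has been handed out.
    let returned = Arc::new(AtomicBool::new(false));

    let worker = {
        let (state, returned) = (Arc::clone(&state), Arc::clone(&returned));
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(50));
            returned.store(true, Ordering::SeqCst);
            *state.count.lock().unwrap() -= 1; // "Drop" the loan.
            state.all_returned.notify_all();
        })
    };

    let outcome = panic::catch_unwind(AssertUnwindSafe(|| {
        let _guard = WaitGuard(&*state);
        panic!("op() panicked");
    }));

    // The unwind could not get past `_guard` until the loan was returned.
    let waited = returned.load(Ordering::SeqCst);
    worker.join().unwrap();
    outcome.is_err() && waited
}
```

The `scopeguard` crate mentioned above packages the same idea, so a hand-written type like this is only needed when avoiding the dependency.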

bgr360 · December 15, 2022, 6:00pm

Thanks for taking a look @LegionMammal978 , I had totally forgotten about unwinds.

@H2CO3 this is quite similar to scoped threads, but not the same. For one, it doesn't block waiting for all spawned threads, it only blocks long enough to wait for all Loans to be dropped. Spawned threads can continue running after the Lender has proceeded. Secondly, sharing references between scoped threads requires getting the lifetimes right; this erases the lifetimes to make the code easier to write.

This makes it possible to use with, say, a thread that was spawned earlier in the program that you communicate with via a channel. You can send a Loan through the channel to that longer-lived thread, and the Lender will wait until the thread is done borrowing the value. It basically takes all the hassle out of plumbing thread scopes and lifetimes through your multi-threaded code.

EdmundsEcho · December 23, 2022, 8:11pm

I could be way off, but yoke accomplishes lifetime erasure: crate.

Succinctly, this allows one to “erase” static lifetimes and turn them into dynamic ones, similarly to how dyn allows one to “erase” static types and turn them into dynamic ones.

The motivations are different but may otherwise overlap.

bgr360 · December 23, 2022, 9:54pm

Indeed, you'll notice I mentioned yoke in my initial post

I didn't spend a whole lot of time thinking about if I could use yoke to accomplish this. Now that I'm thinking about it again, perhaps I could.

steffahn · December 24, 2022, 12:36am

Fascinating. How come I've never seen yoke before? I love reporting all the soundness issues I can find in such crates

steffahn · December 24, 2022, 2:04am

I almost thought they got away without issues after the first potential problem I thought I saw didn’t turn out to be a problem AFAICT… but no, there are (almost) always soundness issues when lifetimes are involved

github.com/unicode-org/icu4x

`yoke` soundness issue with non-`'static` and contravariant cart, using `Yoke::attach_to_cart` + `Yoke::get`.

opened 02:02AM - 24 Dec 22 UTC

steffahn

Here’s a repro, in small steps with useful type annotations

```rs
use yoke::Yoke;

type F<'a> = fn(&'a ());

fn main() {
    let n = String::from("Hello World!");
    let x: Yoke<&'static str, Box<F<'_>>> = Yoke::attach_to_cart(Box::new((|_| {}) as _), |_| &n[..]);
    let y: Yoke<&'static str, Box<F<'static>>> = x;
    let s: &'static Yoke<&'static str, Box<F<'static>>> = Box::leak(Box::new(y));
    let r: &'static str = s.get();

    println!("pre-drop: {r}");
    drop(n);
    println!("post-drop: {r}");
}
```

[Run Online in Rust Explorer](https://www.rustexplorer.com/b#%2F*%0A%5Bdependencies%5D%0Ayoke%20%3D%20%220.6.2%22%0A*%2F%0A%0Ause%20yoke%3A%3AYoke%3B%0A%0Atype%20F%3C'a%3E%20%3D%20fn(%26'a%20())%3B%0A%0Afn%20main()%20%7B%0A%20%20%20%20let%20n%20%3D%20String%3A%3Afrom(%22Hello%20World!%22)%3B%0A%20%20%20%20let%20x%3A%20Yoke%3C%26'static%20str%2C%20Box%3CF%3C'_%3E%3E%3E%20%3D%20Yoke%3A%3Aattach_to_cart(Box%3A%3Anew((%7C_%7C%20%7B%7D)%20as%20_)%2C%20%7C_%7C%20%26n%5B..%5D)%3B%0A%20%20%20%20let%20y%3A%20Yoke%3C%26'static%20str%2C%20Box%3CF%3C'static%3E%3E%3E%20%3D%20x%3B%0A%20%20%20%20let%20s%3A%20%26'static%20Yoke%3C%26'static%20str%2C%20Box%3CF%3C'static%3E%3E%3E%20%3D%20Box%3A%3Aleak(Box%3A%3Anew(y))%3B%0A%20%20%20%20let%20r%3A%20%26'static%20str%20%3D%20s.get()%3B%0A%0A%20%20%20%20println!(%22pre-drop%3A%20%7Br%7D%22)%3B%0A%20%20%20%20drop(n)%3B%0A%20%20%20%20println!(%22post-drop%3A%20%7Br%7D%22)%3B%0A%7D%0A)

```
pre-drop: Hello World!
post-drop: �[YU�Es
```

Edit: Aaaand… here’s a second issue, though as far as I can tell so far, it’s only exploitable on nightly.

github.com/unicode-org/icu4x

soundness issue in `yoke` using `#[feature(coerce_unsized)]` on `nightly`: The `Yokeable` derive (in `prove_covariance_manually` mode) does not do sufficient checks.

opened 05:17AM - 24 Dec 22 UTC

steffahn

Repro:

```rs
#![feature(coerce_unsized)]

use std::{cell::Cell, ops::CoerceUnsized};

use yoke::*;

#[derive(Yokeable)]
struct S1<T>(T);

impl<T, U> CoerceUnsized<S1<U>> for S1<T> where T: CoerceUnsized<U> {}

#[derive(Yokeable)]
#[yoke(prove_covariance_manually)]
struct S2<'a>(S1<Cell<Box<dyn AsRef<str> + 'a>>>);

fn main() {
    let x: S2<'static> = S2(S1(Cell::new(Box::new(""))));
    let x_transformed: &S2<'_> = x.transform();
    let r: &Cell<Box<dyn AsRef<str> + '_>> = &x_transformed.0 .0;
    let s = String::from("Hello World!");
    r.set(Box::new(&s));
    let contents: Box<dyn AsRef<str> + 'static> = x.0 .0.replace(Box::new(""));
    let s_ref: &str = (*contents).as_ref();
    println!("A string: {s_ref}");
    drop(s);
    println!("No more string: {s_ref}");
}
```

[run online in Rust Explorer](https://www.rustexplorer.com/b#%2F*%0A%5Bdependencies%5D%0Ayoke%20%3D%20%7B%20version%20%3D%20%220.6.2%22%2C%20features%20%3D%20%5B%22derive%22%5D%20%7D%0A*%2F%0A%0A%0A%23!%5Bfeature(coerce_unsized)%5D%0A%0Ause%20std%3A%3A%7Bcell%3A%3ACell%2C%20ops%3A%3ACoerceUnsized%7D%3B%0A%0Ause%20yoke%3A%3A*%3B%0A%0A%23%5Bderive(Yokeable)%5D%0Astruct%20S1%3CT%3E(T)%3B%0A%0Aimpl%3CT%2C%20U%3E%20CoerceUnsized%3CS1%3CU%3E%3E%20for%20S1%3CT%3E%20where%20T%3A%20CoerceUnsized%3CU%3E%20%7B%7D%0A%0A%23%5Bderive(Yokeable)%5D%0A%23%5Byoke(prove_covariance_manually)%5D%0Astruct%20S2%3C'a%3E(S1%3CCell%3CBox%3Cdyn%20AsRef%3Cstr%3E%20%2B%20'a%3E%3E%3E)%3B%0A%0Afn%20main()%20%7B%0A%20%20%20%20let%20x%3A%20S2%3C'static%3E%20%3D%20S2(S1(Cell%3A%3Anew(Box%3A%3Anew(%22%22))))%3B%0A%20%20%20%20let%20x_transformed%3A%20%26S2%3C'_%3E%20%3D%20x.transform()%3B%0A%20%20%20%20let%20r%3A%20%26Cell%3CBox%3Cdyn%20AsRef%3Cstr%3E%20%2B%20'_%3E%3E%20%3D%20%26x_transformed.0%20.0%3B%0A%20%20%20%20let%20s%20%3D%20String%3A%3Afrom(%22Hello%20World!%22)%3B%0A%20%20%20%20r.set(Box%3A%3Anew(%26s))%3B%0A%20%20%20%20let%20contents%3A%20Box%3Cdyn%20AsRef%3Cstr%3E%20%2B%20'static%3E%20%3D%20x.0%20.0.replace(Box%3A%3Anew(%22%22))%3B%0A%20%20%20%20let%20s_ref%3A%20%26str%20%3D%20(*contents).as_ref()%3B%0A%20%20%20%20println!(%22A%20string%3A%20%7Bs_ref%7D%22)%3B%0A%20%20%20%20drop(s)%3B%0A%20%20%20%20println!(%22No%20more%20string%3A%20%7Bs_ref%7D%22)%3B%0A%7D%0A)

```
A string: Hello World!
No more string: KT�_�Y�p
```

The relevant problem is in the generated code for the derive on `S2`:

```rs
unsafe impl<'a> yoke::Yokeable<'a> for S2<'static> {
    type Output = S2<'a>;
    #[inline]
    fn transform(&'a self) -> &'a Self::Output {
        unsafe { ::core::mem::transmute(self) }
    }
    #[inline]
    fn transform_owned(self) -> Self::Output {
        match self {
            S2(__binding_0) => S2(
                <S1<Cell<Box<dyn AsRef<str> + 'static>>> as yoke::Yokeable<'a>>::transform_owned(
                    __binding_0,
                ),
            ),
        }
    }
    #[inline]
    unsafe fn make(this: Self::Output) -> Self {
        use core::{mem, ptr};
        debug_assert!(mem::size_of::<Self::Output>() == mem::size_of::<Self>());
        let ptr: *const Self = (&this as *const Self::Output).cast();
        #[allow(clippy::forget_copy)]
        mem::forget(this);
        ptr::read(ptr)
    }
    #[inline]
    fn transform_mut<F>(&'a mut self, f: F)
    where
        F: 'static + for<'b> FnOnce(&'b mut Self::Output),
    {
        unsafe {
            f(core::mem::transmute::<&'a mut Self, &'a mut Self::Output>(
                self,
            ))
        }
    }
}
```

Here, the sanity check for soundness lies in the

```rs
S2(__binding_0) => S2(
    <S1<Cell<Box<dyn AsRef<str> + 'static>>> as yoke::Yokeable<'a>>::transform_owned(
        __binding_0,
    ),
),
```

however: The type

```rs
<S1<Cell<Box<dyn AsRef<str> + 'static>>> as yoke::Yokeable<'a>>::Output
```

is

```
S1<Cell<Box<dyn AsRef<str> + 'static>>>
```

whereas the field we want to populate is

```rs
S1<Cell<Box<dyn AsRef<str> + 'a>>>
```

The code compiles nonetheless because of the unsize coercion from `S1<Cell<Box<dyn AsRef<str> + 'static>>>` to `S1<Cell<Box<dyn AsRef<str> + 'a>>>`. However, this transformation is only sound for _owned_ `Cell<…>` values, and _unsound_ behind a shared reference, which is what the implemented `transform` method now offers.

EdmundsEcho · December 24, 2022, 3:44am

Indeed you did. My bad. I only read the first 3/4 of your intro post before reading answers…

Notwithstanding, I look forward to reading how @steffahn found an issue. I was thinking that, just maybe, they found a way to admit safe code, without issue, in the circumstance where they “erase” the lifetime of data borrowed from a read used to instantiate the app. So, life starts before anything else. I suspect when they need to mutate it they need to make it seem invariant with… or make it seem like the borrow no longer exists when they need to mutate the data?…

steffahn · December 24, 2022, 5:37am

Looking at lender-loan now, not a soundness issue, but I’m noticing the Loan<T>: Clone implementation having unnecessarily restrictive bounds (i.e. requiring T: Clone).
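To illustrate where such a bound usually comes from, here is a sketch with a hypothetical `Handle<T>` wrapper around an `Arc` (not the crate's actual `Loan` type). `#[derive(Clone)]` would generate `impl<T: Clone> Clone for Handle<T>`, even though cloning the handle only bumps a reference count; a manual impl can drop the bound.

```rust
use std::sync::Arc;

/// Hypothetical reference-counted handle, standing in for a loan-like type.
pub struct Handle<T>(Arc<T>);

// Written by hand so that no `T: Clone` bound is required.
impl<T> Clone for Handle<T> {
    fn clone(&self) -> Self {
        // Clones the pointer, not the pointee.
        Handle(Arc::clone(&self.0))
    }
}

/// A payload that deliberately does not implement `Clone`.
pub struct NotClone {
    pub id: u32,
}

/// Returns the payload's id seen through the clone, and the strong count.
pub fn clone_demo() -> (u32, usize) {
    let first = Handle(Arc::new(NotClone { id: 7 }));
    let second = first.clone();
    (second.0.id, Arc::strong_count(&first.0))
}
```

With the derived impl, `first.clone()` would not compile here because `NotClone` is not `Clone`.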

H2CO3 · December 26, 2022, 3:51am

<rant>
I am always highly suspicious when a crate is created specifically with the purpose of writing non-idiomatic code that would otherwise not pass the borrow checker, and successful compilation is achieved via unsafe instead. It is my perception that people increasingly assume the borrow checker to be a nuisance rather than the necessary condition of soundness it most frequently is.

We often say that it might be too strict (to remain politically correct, I guess), but my experience shows that even in the corner cases when this is technically true, one would still be better off trying to refactor into a style blessed by borrowck, rather than unsafely working it around, either manually or by means of a crate.

I've just seen way too many soundness holes even in older crates of such origin, and I'm starting to think that their prevalence, along with the sentiment of borrowck being an obstacle rather than a quality standard, hurts the language and its credibility in the long run.
</rant>

steffahn · December 26, 2022, 4:45am

As long as such crates are reasonably small, well maintained, and care about fixing all soundness issues being reported, I'd consider the possibility that such crates can be created in Rust a strength, not a weakness. In my view, they serve as extensions to the borrow checker, whilst heavily relying on the borrow checker, too, since their APIs commonly contain closure arguments with HRTB bounds that are necessary for soundness.
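A tiny sketch of what such a closure argument with a higher-ranked bound does, using a made-up function `with_temp`. The closure must accept a reference of any lifetime, and the return type `R` is chosen outside the `for<'a>`, so the borrowed data cannot be smuggled out of the call.

```rust
/// Hypothetical scoped API: lends a temporary string to `op` for the call only.
pub fn with_temp<R>(op: impl for<'a> FnOnce(&'a str) -> R) -> R {
    let temp = String::from("scoped");
    // `op` must work for every lifetime `'a`, so it cannot assume that
    // the reference outlives this function body.
    op(&temp)
}

/// Derived, owned results may leave the closure.
pub fn temp_len() -> usize {
    with_temp(|s| s.len())
}

// By contrast, `with_temp(|s| s)` is rejected by the compiler: `R` would have
// to be `&'a str` for every `'a`, which no single type can satisfy.
```

`std::thread::scope` relies on a similar shape for its closure parameter, which is part of why scoped threads can be offered as a safe API.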

The power of such crates is that they help strengthen the argument against individual users "defeating" the borrow checker on a case-by-case and ad-hoc (and usually not well reviewed) basis and instead encapsulate the unsafety necessary for certain constructs, in this case (yet another) form of self-referencing datatype, behind an open source library that can be reviewed and improved, etc...

Of course, you could argue: Just avoid this pattern! I think "if you really need it, please still stick to (probably) sound safe API by using such and such crate" can be a more effective argument to help get inexperienced Rust users to stop trying to defeat the borrow checker themselves (a task that would much more likely be bound to lead to disaster).

I'm much more unhappy with unmaintained instances of such crates though. In particular owning_ref which has an incredible number of known soundness holes, no active maintainers, and is wayyyyy too popular a crate. (AFAIR, a clone of it is even appearing in the source of rustc or at least some tool in the repo; I should probably look back into whether they are still reproducing all those soundness holes there, too, but that's an unrelated discussion).

steffahn · December 26, 2022, 5:00am

For example, for the lender-loan library that this thread is about, I appreciate that the API is tiny (yet still useful), and unlike some other libraries discussed above, there are no macros involved in the API either.

bgr360 · December 26, 2022, 6:18pm

I'm happy to see this discussion happening, and I welcome any and all criticism.

I'll note my motivation for this library: I'm tinkering with designing my own programming language aimed at being a "Smaller Rust" as described by withoutboats in this blog post.

One idea I had for how to implement lifetimeless, Rust-like ownership and borrowing was to use this borrowing scheme where threads are blocked until all borrows are finished. So I wrote this library with the intention of trying to use it for the runtime of my language.

I have no particular goals for this in terms of the wider Rust ecosystem or coding patterns. I see @H2CO3 's point about offering beginners ill-advised "outs" from the borrow checker. But I think I agree more with @steffahn when they say that the ability to design these constructs is a strength of Rust.

My other primary goal here is to learn. Please continue pointing out other "borrowck-sidestepping" libraries that you know of! TIL about owning_ref.

bgr360 · December 26, 2022, 6:45pm

So, ^ those are the key benefits I see. The key drawback that I see is that it's possible (easy, even) to block your thread indefinitely by extending the lifetime of the loan:

use lazy_static::lazy_static;
use std::sync::Mutex;

type Loan = lender_loan::Loan<String>;

lazy_static! {
    static ref LOAN: Mutex<Option<Loan>> = Mutex::new(None);
}

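fn main() {
    let s = String::from("lend me");
    lender_loan::Lender::with(&s, |lender| {
        // `lend()` is the method name used in the first example above.
        let loan = lender.lend();
        // Stashing the loan in a static keeps it alive for the rest of the program.
        *LOAN.lock().unwrap() = Some(loan);
    }); // <---- blocks forever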
}

Disclaimer: I haven't run this code yet, I'm just assuming it works as I've described

VorfeedCanal · December 26, 2022, 6:46pm

Can we mark such crates on lib.rs and docs.rs? Just as every attempt to prevent UB in C eventually hits the “but it works for me” wall, some people will always write unsound code in Rust. But most people just honestly look for a crate that solves their problem, and they have no idea that the crate is unsound. They don't dig deeply, because that would kind of defeat the purpose of not writing an ad-hoc implementation.

There is also the question of soundness and “responsiveness”… what about crates like indoc? It's a fairly popular crate, yet a soundness issue is being ignored (probably because it's not an important issue, since it can only crash the compiler and probably couldn't do anything more)… it's hard to say what the proper handling of all that is, I guess.

H2CO3 · December 26, 2022, 8:29pm

An ICE is not a soundness issue. A soundness issue is when you can cause UB using safe code. An ICE is a compiler bug: the compiler should either compile the code or generate an error, so it's a problem in the compiler, not in a library's code. The compiler is never supposed to crash, even on the wildest and most unsafe code.
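As a generic illustration of that definition (unrelated to indoc or lender-loan), here is a sketch of two made-up functions with safe signatures. The first is unsound because a caller who writes no `unsafe` can still trigger UB by passing an out-of-bounds index; the second keeps the same interface sound by checking.

```rust
/// Unsound: the signature is safe, but an out-of-bounds `index` causes UB.
pub fn get_unsound(items: &[i32], index: usize) -> i32 {
    // No bounds check, so the caller's safe code decides whether this is UB.
    unsafe { *items.get_unchecked(index) }
}

/// Sound: every possible input from safe code has defined behavior.
pub fn get_sound(items: &[i32], index: usize) -> Option<i32> {
    items.get(index).copied()
}
```

Marking the first function `unsafe fn` and documenting the in-bounds requirement would also make it sound, since the obligation would then sit visibly with the caller.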

VorfeedCanal · December 26, 2022, 8:44pm

That's what indoc does there, if I'm not mistaken. It produces an invalid String (puts invalid UTF-8 inside) and then the compiler explodes. Note that it's supposed to produce (according to the manual) the exact same string as if you had removed the indent yourself, but in that case there is no crash and everything works just fine.

I can't be 100% sure (because I haven't investigated further), but I'm 90% certain it's a soundness issue which in this particular case hasn't led to UB, you are correct.

Does that mean it's OK for a proc macro to create dangling pointers and broken Strings? I was under the impression that the compiler's contract with regard to accepting the wildest and most unsafe code doesn't go quite that far.

H2CO3 · December 26, 2022, 9:12pm

That's definitely not OK, but it should not crash the compiler nevertheless. Even if there is a soundness issue in a crate, this should not cause the compiler to crash.
