Requiring T: !Sync

(Originally (erroneously) posted on internals here, but I actually intended it to be posted here.)


I'm currently trying to get per-thread-object to work for my program, which works with a scoped function like .with(FnOnce(&T) -> R) -> R, allowing &T to "not escape" the thread bounds.

However, I require &T to live for longer than that, see the following simple bit of code;

    fn iter<'a>(&'a self) -> Box<dyn Iterator<Item = TupleOfBytes> + 'a> {
        let guard: &'a Connection = self.engine.iter_reader();

        self.iter_with_guard(guard)
    }

This is part of implementing a trait, so rewriting it isn't feasable.

However, we'd still like thread-locals, to conserve resources on initiating and maintaining multiple connections;

pub struct Engine {
    writer: Mutex<Connection>,

    // rusqlite::Connection: Send + !Sync
    read_conn_tls: ThreadLocal<Connection>,
    read_iterator_conn_tls: ThreadLocal<Connection>,

    ...

}

std::thread_local and thread-local have already been considered and rejected for different reasons (std doesn't free all objects on program exit, which is required for some durability. And thread-local never frees TLS objects after their threads die.), so that brings me to per-thread-object, with a following small modification;

// This borrows a TLS with a lifetime only as long as the TLS itself,
// this assures it does not outlive the TLS (or the structure its borrowed from),
//
// SAFETY: ThreadLocal assures that the value is never dropped as long as it or the thread lives.
//
// However, that means that T needs to be !Sync, there is no stable way of conveying this,
// so extra caution needs to be taken on the caller's side to signal this.
unsafe fn robbin_hood_tls<'tls, T: Send, I: FnOnce() -> T>(
    tls: &'tls ThreadLocal<T>,
    init: I,
) -> &'tls T {
    tls.with_or(
        |b| unsafe {
            let borrow: &'tls T = std::mem::transmute(b);

            return borrow;
        },
        init,
    )
}

This unsafe function has a massive "gotcha"; manually checking if T: !Sync, else the borrowed value can have a use-after-free (if it gets sent across threads, and the original thread dies).

It is only used in the following bit of code in Engine;


    fn iter_reader<'a>(&'a self) -> &'a Connection {
        let init = || Self::prepare_conn(&self.path, self.cache_size_per_thread).unwrap();

        // This is required to coerce a "good" lifetime for iterators,
        // which need a borrow to the connection to stay alive as long as they are.
        //
        // Unfortunately, we're giving away ownership of those iterators to the caller,
        // and so with a normal ThreadLocal we can't scope with `with()`.
        //
        // SAFETY: Connection is !Sync, so the borrow is !Send,
        // which assures it stays in the same thread, preventing use-after-free.
        unsafe { robbin_hood_tls(&self.read_iterator_conn_tls, init) }
    }

...but, understandably, i'd be more happier if this didn't require so many hazmat gloves, and i could let the compiler check itself if T: !Sync for this required "contract".


So my question is; Is there any way, existing in stable, existing in nightly, or up-and-coming, that could let me require that T: !Sync on robbin_hood_tls?

Alas, this is wrong. A value if Sync when all of its constituents are. Thus a value cannot be (proven to be) Sync if any of its constituents cannot be proven to be Sync. Take type T = (Cell<u8>, i32);, for instance:

  • T : Sync does not hold,

  • and yet, out of a &'lt T, I can get a &'lt i32, which is Sync!

So Sync (or lack thereof) is not the right marker for your property.

  • This is quite reminiscent of threads about upper-bounding the lifetime parameters appearing in a type (having "'lt : T clauses"), in that the : 'lt property has auto-trait-like (structural / infectious propagation) properties. So (&'lt (), &'static i32) would be upper-bounded by 'lt, and yet we'd be able to extract a &'static i32 out of it).

The way to ensure that a return value has a certain property (or lack thereof), such as not being 'static or not being Sync, is by using a helper wrapper with a restricted API within which to wrap the return value.

  • The restricted API:

    trait ScopedDeref {
        type Target : ?Sized;
    
        fn with_ref<R> (&self, _: impl FnOnce(&Self::Target) -> R)
          -> R
        ;
    }
    
    /// A nice blanket impl
    impl<T : ?Sized> ScopedDeref for &'_ T {
        type Target = T;
    
        fn with_ref<R> (&self, scope: impl FnOnce(&T) -> R)
          -> R
        {
            scope(&**self)
        }
    }
    
  • The wrapper:

    pub
    struct ThreadLocalRef<'lt, Referee : ?Sized> (
        &'lt Referee,
        PhantomData<*const ()>, // !Send + !Sync
    );
    impl Clone, Copy…
    
    impl<'lt, Referee : ?Sized> From<&'lt Referee>
        for ThreadLocalRef<'lt, Referee>
    { … }
    
    impl<Referee : ?Sized> ScopedDeref
        for ThreadLocalRef<'_, Referee>
    {
        type Target = Referee;
        …
    }
    

thus yielding:

fn robbin_hood_tls<'tls, T: Send>(
    tls: &'tls ThreadLocal<T>,
    init: impl FnOnce() -> T,
) -> ThreadLocalRef<'tls, T> // both `'tls`-upper-bounded and thread-tied.

And so, you'd have to adapt self.iter_with_guard() to take an impl 'a + ScopedDeref<Target = Connection> rather than a &'a Connection.

I don't know if that would be possible for you, but that's the only general and truly robust way.


That being said, I'm still struggling to wrap my head around the 'tls lifetime not being tied to that of the thread local storage: could you show a pattern whereby the thread storing the value could die before the ThreadLocal<T> owned value does?

As @Yandros already said, the absence of a trait implementation conveys no information.

It would be totally correct to have a copy of AtomicBool that is !Sync, and then use unsafe code to share this across threads. T: Sync requires proving certain things; not proving those thing just means they have not been proven, but it doesn't meant they do not hold.

4 Likes

What i'm looking for is asserting that T has been tried to prove for Sync, and that it has failed, that T does not hold Sync, not not proving Sync.

I think what @RalfJung is trying to say is that T: !Sync doesn't mean it's unsound for it to be shared between threads, and thus guaranteeing this will never happen, just that it's unsafe, meaning the user can still share the &'tls T with another thread by using unsafe and manually guaranteeing the absence of data races. Thus even if you could require T: !Sync your API would still be unsound.

1 Like

First of all, there is no trait that would express this. !Sync certainly doesn't.

Secondly, I don't think this is really what you are looking for. "adding Sync would not be sound" is a negated statement, and those are mostly useless to downstream users of T. To give some examples:

  • While adding Sync to Cell is unsound, it is still perfectly fine to share references to the same cell across multiple threads if all threads are careful to only perform read accesses. Unsafe code can do things safe code could not do.
  • While adding Sync to Option<Cell> would be unsound, it is still totally fine to create an &None at that type and share that across threads with arbitrary safe code. The None value is Sync even if the entire type is not.

Basically, "adding Sync is unsound" means that in general, sharing values of that type across thread boundaries is unsound, but there can be a long list of special cases where this is still perfectly fine.

2 Likes

It is impossible to guarantee that a value is dropped on program exit. std::process::exit(0) only runs atexit handlers, while std::process::abort() immediately exits without giving any code the chance to run. (except for the SIGABRT signal handler on UNIX. I am not sure if a failfast exception as raised by abort() can be catched on Windows.) Would it be possible to run all your code inside threads and ensure that all threads exit before returning from main? Then you can use libstd thread locals just fine, right?

There isn't even a need to go to a theoretical layer or to talk about unsafe code:

1 Like

Ah thank you, that part did not register to me properly, it properly illustrates how this is a more tricky situation than I had imagined.

1 Like

Also, you haven't answered to my question about the meaning of 'tls,

And the more I think about it, the more I'm convinced the whole design is ill-suited or over-restricted:

  1. Let's call 'ThreadLocal the "true lifetime" of the ThreadLocal instance, that is, a region that ends exactly when the instance is dropped or moved. By design, we have 'Threadlocal ⊇ 'tls, since 'tls represents the region within which it is borrowed (in practice, since we have leeway w.r.t. picking 'tls (it can be made as big as possible), then we can assume that 'tls = 'ThreadLocal).

    Let's also call 'thread the "true lifetime" of the thread itself, backing the storage of the value T symbolically wrapped within the ThreadLocal<T>.

  2. If 'thread ⊇ 'ThreadLocal, then having something be 'tls-bounded is enough: you don't need to think about thread shenanigans or whatnot.

  3. If 'thread ⊉ 'ThreadLocal, then there exists a region of code within which the ThreadLocal instance may live, but its backing thread may have died!

    Within that region, no matter where you are (even in your theoretical "fully !Sync" scenario), borrowing the inner data would be UAF!

Now, in practice, I don't believe this last case is possible: I suspect the ThreadLocal wrapper type is !Send, and created within the thread that backs its storage. This ensures we must be within the case 2.. So all is fine, except this extra Sync bound you are talking about is now redundant: heck, even ThreadLocal itself, if it doesn't use shared mutability, could be Sync itself (in a way this ThreadLocal thing is kind of similar to an stdlib MutexGuard<'T, T>: it's "pinned" to a thread (!Send), and yet other threads can still access the MutexGuard in a shared fashion, across threads (it's Sync)).


The only situation where you could want to start designing more advanced "thread-pinning" abstractions (e.g., something along the lines of my disregarded ThreadLocalRef), would be if you freed 'tls from the shackles of ThreadLocal / if you had an unbounded lifetime!

- fn robbin_hood_tls<'tls,             T: Send>(
+ fn robbin_hood_tls<'tls, 'unbounded, T: Send>(
      tls: &'tls ThreadLocal<T>,
      init: impl FnOnce() -> T,
- ) -> ThreadLocalRef<      'tls, T> // both `'tls`-upper-bounded and thread-tied.
+ ) -> ThreadLocalRef<'unbounded, T> // only thread-tied.

Now this would be the way to actually take advantage of the thread-pinning / thread-bounding in order to get rid of the redundant and slightly over-constraining 'tls bound.

With T : 'static (e.g., T = String), we could pick 'unbounded = 'static, for instance.

But this whole thing only works if either ThreadLocalRef were to be able to panic when calling .with_ref(), or if the backing T is never dropped:

  • before the thread dies, it can't be dropped, else it would be possible for some drop glue at thread cleanup to use a ThreadLocalRef that would be dangling (unless there were to be some guard-like logic between the drop and the ThreadLocalRef);

  • after the thread dies, it can't be dropped either, even if T : Send and the drop were to be scheduled to be run in another thread, since there would be no more backing storage.


So, actually, an unbounded lifetime may be a bit too loose to allow normal use.

Hence thread_local!'s API, which directly offers an allowed-to-panic! with_ref! kind of API, dubbed with().


There is an interesting thing, which is that, at first glance, it looks like it would be possible to offer a macro API for thread_local!s, which would only borrow from an unnameable local:

// can panic if called within some thread-local drop glue
thread_local_access! {
    let it: &/* 'local */ Cell<bool> = &SOME_THREAD_LOCAL;
}

which would unsugar to:

let local = PhantomNotSend(());
let it: &Cell<bool> = unsafe {
    SOME_THREAD_LOCAL.access_unchecked(&local) // may panic 
};

This would yield an it which would be bound to the lifetime of a local, and thus correctly doomed never to escape its "stack" frame :slightly_smiling_face:.

Or so it initially seems. Alas, because of async unsugaring, local frames don't have to be stack frames, and thus I believe that the compiler-generated async unsugaring breaks this API:

async fn exploit() {
    let local = PhantomNotSend(());
    let s: &String = unsafe {
        SOME_THREAD_LOCAL.access_unchecked(&local) // may panic 
    };
    let _await_point = yield_once().await;
    println!("{}", **s);
}

let mut fut = Box::pin(exploit());
// Poll it once, so as to obtain the reference in a non-panicking context.
some_executor::with_cx(|cx| fut.as_mut().poll(cx));
SOME_OTHER_THREAD_LOCAL.with(|drop_glue| drop_glue.set(Box::new(|| {
    // dereference that potentially stale reference in a no-longer-checked context.
    some_executor::block_on(fut);
})));

:weary:

So, all in all, correctly-lifetime-bound APIs are simple and quite fool-proof, athough a bit over-constrained. But the moment we forgo such lifetimes and only use phony / unbounded lifetimes, while also calling the drop glue of the thread-local-stored values when the thread is about to die, we run into a lot of trouble, and only a callback-based API (and its higher-order lifetime) such as .with(|it| { … }) gets to be sound.

3 Likes

Apologies, i had a bit of a rough and busy week, and I did not have enough spoons to properly meet this thread.


Thanks for the comprehensive reply, the reason i would like for T: !Sync is so that &T is !Send, this pins the borrow to the thread, and makes the effective borrow of &T within 'ThreadLocal and 'thread.

'tls is effectively tied to the ThreadLocal, by also making it a immutable borrow of the wrapping struct(s).

Here is the code in context of the program.

I can't use something along the lines of ThreadLocalRef, as that's a struct i would need to store somewhere, which i cant do here, as then it becomes a self-referential struct. So i need a "clean" borrow, a &Connection, so i can simply use that within that function, see the following function signature;

    pub fn iter_with_guard<'a>(
        &'a self,
        guard: &'a Connection,
    ) -> Box<dyn Iterator<Item = TupleOfBytes> + 'a> {

My thought process here was "well, the iterator needs to be only alive for 'a, which in this case is Engine, which contains the ThreadLocal, so 'tls == 'engine, that takes care of the UAF when Engine is dropped. Then i need only the iterator to not escape the thread, so &'a Connection needs to not be Send and Sync, pinning it to the thread, dying with it."

The reason why we would want to have Connection (inside the TLS) die both with the Thread and with the wrapping Engine, is because we had problems with std::thread_local dropping all Connections when the program gracefully exits, which would then close the connections gracefully, ensuring resiliency while it cleaned up the associated resources (on disk).

thread_local::ThreadLocal also wasn't doing it for us, as that doesn't drop the value after the thread exits (but still provides a &T, which was sneakily always valid as the TLS never dropped the value after the thread dies, so it could be Sync and Send).

per_thread_object::ThreadLocal then was perfect... if not for the fact it did only (correctly) offer a .wrap API, ensuring the value never escaped its stack frame.

TLDR; We required a lifetime that was smaller than 'tls and 'thread, we could only bind to the former, and "cheat" by letting the borrow not escape the thread (with !Sync and !Send), however, with your previous reply, i see that is simply not possible, and hazmat gloves would be required.


It is not, however, it shouldnt matter as it only fronts a pool of values that can never be borrowed longer than .wrap(). It does still hold these values as long as it or the thread lives, though.


Apologies for my rambly, or otherwise evasive-looking answers, I'm simply trying to wrap my head around all of these concepts, but learned a lot in the process.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.