Why do my distinct values resolve to the same pointer?

Hi folks,

I'm trying to put together a bounded deque that protects the enqueue / dequeue ends with mutual exclusion. I'll discuss pieces of the library inline below, but you can find the whole listing in the playground.

The basic idea is that I have a contiguous block of memory -- a Vec turned into raw parts -- with head and tail offsets chasing each other around the structure for doing enq/deq operations. Typical ring buffer setup, I guess. Two mutexes hold these offsets, plus there's a condvar so that the enqueuers can wake up the dequeuer. The primary type Queue<T> is defined like:

struct Queue<T>
where
    T: ::std::fmt::Debug,
{
    inner: *mut InnerQueue<T>,
}

with clone being:

impl<T> Clone for Queue<T>
where
    T: ::std::fmt::Debug,
{
    fn clone(&self) -> Queue<T> {
        Queue { inner: self.inner }
    }
}

InnerQueue<T> is heap allocated and is defined like:

struct InnerQueue<T>
where
    T: ::std::fmt::Debug,
{
    capacity: usize,
    data: *mut (*const T),
    size: AtomicUsize,
    enq_lock: Mutex<isize>,
    deq_lock: Mutex<isize>,
    not_empty: Condvar,
}
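
For context, here's a hypothetical constructor matching that shape -- the actual allocation code is in the playground, the names here are my guess, and I've dropped the Debug bound for brevity:

```rust
use std::mem;
use std::ptr;
use std::sync::atomic::AtomicUsize;
use std::sync::{Condvar, Mutex};

struct InnerQueue<T> {
    capacity: usize,
    data: *mut *const T,
    size: AtomicUsize,
    enq_lock: Mutex<isize>,
    deq_lock: Mutex<isize>,
    not_empty: Condvar,
}

impl<T> InnerQueue<T> {
    // Hypothetical: allocate the backing block as a Vec of null slots,
    // then forget the Vec so the queue owns the raw allocation from here on.
    fn with_capacity(capacity: usize) -> InnerQueue<T> {
        let mut slots: Vec<*const T> = vec![ptr::null(); capacity];
        let data = slots.as_mut_ptr();
        mem::forget(slots); // the queue now owns this allocation
        InnerQueue {
            capacity,
            data,
            size: AtomicUsize::new(0),
            enq_lock: Mutex::new(0),
            deq_lock: Mutex::new(0),
            not_empty: Condvar::new(),
        }
    }
}

fn main() {
    let q: InnerQueue<u64> = InnerQueue::with_capacity(8);
    // Every slot starts null, which is what enq's is_null() check relies on.
    for i in 0..8 {
        assert!(unsafe { (*q.data.offset(i)).is_null() });
    }
}
```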

So far so good. Where I run into trouble is enq:

pub unsafe fn enq(&mut self, elem: T) -> Result<(), Error> {
    let mut must_wake_dequeuers = false;
    let mut guard = self.enq_lock.lock().expect("enq guard poisoned");
    if !(*self.data.offset(*guard)).is_null() {
        println!("[ENQ] WOULD BLOCK: {:?}", elem);
        return Err(Error::WouldBlock);
    } else {
        let raw_elem = &elem as *const T;
        println!(
            "[ENQ] PTR[{:0>2}]: {:?} <- {:0>2?} |{:?}|",
            *guard,
            self.data.offset(*guard),
            elem,
            raw_elem,
        );
        mem::forget(elem);
        *self.data.offset(*guard) = raw_elem;
        *guard += 1;
        *guard %= self.capacity as isize;
        if self.size.fetch_add(1, Ordering::Relaxed) == 0 {
            must_wake_dequeuers = true;
        };
    }
    drop(guard);
    if must_wake_dequeuers {
        // TODO not sure if must lock this or just makes failure less likely
        let guard = self.deq_lock.lock().expect("deq guard poisoned");
        self.not_empty.notify_all();
        drop(guard);
    }
    return Ok(());
}

I take the elem: T, grab a raw pointer to it, and then leak elem: T, being careful to store its pointer in my contiguous memory -- *self.data.offset(*guard) = raw_elem -- before doing some bookkeeping work. deq is similar, as you can see in the playground link above. Where I run into trouble is that raw_elem always seems to be the same pointer, which means I catastrophically do not understand something here. Let me show you. This is the output I get on my x86 machine:

[ENQ] PTR[00]: 0x106c25000 <- 00 |0x700003f5c758|
                                                      [DEQ] LOOP FOR NON-NULL
[ENQ] PTR[01]: 0x106c25008 <- 01 |0x700003f5c758|
                                                      [DEQ] PTR[00]: 0x106c25000 -> 01 |0x700003f5c758|
[ENQ] PTR[02]: 0x106c25010 <- 02 |0x700003f5c758|
[ENQ] PTR[03]: 0x106c25018 <- 03 |0x700003f5c758|
[ENQ] PTR[04]: 0x106c25020 <- 04 |0x700003f5c758|
[ENQ] PTR[05]: 0x106c25028 <- 05 |0x700003f5c758|
[ENQ] PTR[06]: 0x106c25030 <- 06 |0x700003f5c758|
[ENQ] PTR[07]: 0x106c25038 <- 07 |0x700003f5c758|
[ENQ] PTR[08]: 0x106c25040 <- 08 |0x700003f5c758|
[ENQ] PTR[09]: 0x106c25048 <- 09 |0x700003f5c758|
[ENQ] PTR[10]: 0x106c25050 <- 10 |0x700003f5c758|
[ENQ] PTR[11]: 0x106c25058 <- 11 |0x700003f5c758|
[ENQ] PTR[12]: 0x106c25060 <- 12 |0x700003f5c758|
[ENQ] PTR[13]: 0x106c25068 <- 13 |0x700003f5c758|
[ENQ] PTR[14]: 0x106c25070 <- 14 |0x700003f5c758|
                                                      [DEQ] PTR[01]: 0x106c25008 -> 13 |0x700003f5c758|
[ENQ] PTR[15]: 0x106c25078 <- 15 |0x700003f5c758|
                                                      [DEQ] PTR[02]: 0x106c25010 -> 15 |0x700003f5c758|
[ENQ] PTR[16]: 0x106c25080 <- 16 |0x700003f5c758|
                                                      [DEQ] PTR[03]: 0x106c25018 -> 16 |0x700003f5c758|
[ENQ] PTR[17]: 0x106c25088 <- 17 |0x700003f5c758|
                                                      [DEQ] PTR[04]: 0x106c25020 -> 17 |0x700003f5c758|
[ENQ] PTR[18]: 0x106c25090 <- 18 |0x700003f5c758|
[ENQ] PTR[19]: 0x106c25098 <- 19 |0x700003f5c758|
                                                      [DEQ] PTR[05]: 0x106c25028 -> 123145368751136 |0x700003f5c758|
                                                      [DEQ] PTR[06]: 0x106c25030 -> 4642804436 |0x700003f5c758|
                                                      [DEQ] PTR[07]: 0x106c25038 -> 4642804436 |0x700003f5c758|
                                                      [DEQ] PTR[08]: 0x106c25040 -> 4642804436 |0x700003f5c758|
[1]    88494 segmentation fault

In this particular run raw_elem is always 0x700003f5c758, but that, uh, that shouldn't be, yeah? I figured raw_elem would advance as the input vector was cycled through, but that's not what I'm seeing here.

Could anyone point out where I've goofed and/or stumbled into undefined behaviour?

enq takes elem: T by value, moving it into the function call, and forget only skips running its destructor. The elem pointer you captured points into the function's stack frame, and that memory will certainly get reused.

To actually leak it, you need some indirection, like a Box that puts it on the heap -- then forget will consume the Box without freeing it. Or even better, use Box::into_raw().
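
The difference shows up in a few lines: taking the address of a by-value local yields a stack address, while Box::into_raw hands back a fresh heap allocation each time. A standalone sketch, not the poster's queue:

```rust
fn main() {
    let mut stack_ptrs = Vec::new();
    let mut heap_ptrs = Vec::new();

    for elem in 0u64..4 {
        // Address of a loop local: a stack slot, typically reused
        // on every iteration.
        stack_ptrs.push(&elem as *const u64 as usize);
        // Box::into_raw: ownership of a distinct heap allocation.
        heap_ptrs.push(Box::into_raw(Box::new(elem)) as usize);
    }

    println!("stack: {:x?}", stack_ptrs);
    println!("heap:  {:x?}", heap_ptrs);

    // All four boxes are live at once, so their addresses must differ.
    for i in 0..heap_ptrs.len() {
        for j in (i + 1)..heap_ptrs.len() {
            assert_ne!(heap_ptrs[i], heap_ptrs[j]);
        }
    }

    // Reclaim the leaked boxes so the sketch doesn't actually leak.
    for p in heap_ptrs {
        unsafe { drop(Box::from_raw(p as *mut u64)) };
    }
}
```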

Ah dang, of course. Thanks much!

FYI, you should be using a release/acquire atomic pair to ensure that the data written to the queue is actually visible to the dequeueing thread. You could switch *const T to AtomicPtr<T> and do the release-acquire on that, or you could change the dequeue loop to load-acquire self.size and check that rather than testing for null (and change the store of size to be a release).

This is mainly for the sake of portability to architectures with weaker memory ordering guarantees than x86, although in theory it's required for correctness on x86 too, since otherwise your code could potentially be broken by compiler optimizations (although this is unlikely to occur in practice).
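
The AtomicPtr variant can be sketched for a single slot: a Release store pairs with an Acquire load so the pointee, not just the pointer value, is visible to the other thread. Names here are illustrative, not taken from the poster's code:

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};
use std::thread;

// One slot of the ring, sketched with AtomicPtr<T> instead of *const T.
static SLOT: AtomicPtr<u64> = AtomicPtr::new(ptr::null_mut());

fn main() {
    let producer = thread::spawn(|| {
        let boxed = Box::into_raw(Box::new(42u64));
        // Release: initialization of the boxed value happens-before
        // any Acquire load that observes this pointer.
        SLOT.store(boxed, Ordering::Release);
    });

    let consumer = thread::spawn(|| loop {
        // Acquire pairs with the Release store above, making the
        // pointed-to data visible, not just the pointer bits.
        let p = SLOT.load(Ordering::Acquire);
        if !p.is_null() {
            let val = unsafe { *Box::from_raw(p) };
            assert_eq!(val, 42);
            break;
        }
        std::hint::spin_loop();
    });

    producer.join().unwrap();
    consumer.join().unwrap();
}
```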

Indeed! Thank you for that. I knew this implementation was only going to function incidentally on x86 and ought to have mentioned that. I fully intend to incorporate this feedback.