Data Structures appear to have an extra level of difficulty in Rust, how would you teach them?

I think this is the main point. Of course you can write doubly-linked lists with pointers in Rust; you just need to implement everything with unsafe, which is what you are doing in C anyway, there it's just implicit.

The fact that it requires unsafe points out that it's hard to get right. So I would rather flip the OP statement: doubly-linked lists (etc.) are just as hard to get right in C as in Rust. Singly-linked lists are easier in Rust than in C, because the compiler helps you.
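
For illustration, a minimal sketch of a singly-linked list in safe Rust (the List/Node names are made up for this example). Each node owns the next one through Option<Box<...>>, and the borrow checker verifies push/pop for you:

// Minimal singly-linked stack in safe Rust: each node owns the next one.
struct Node<T> {
    value: T,
    next: Option<Box<Node<T>>>,
}

struct List<T> {
    head: Option<Box<Node<T>>>,
}

impl<T> List<T> {
    fn new() -> Self {
        List { head: None }
    }

    fn push(&mut self, value: T) {
        // take() moves the old head out, so no node is ever aliased.
        let old_head = self.head.take();
        self.head = Some(Box::new(Node { value, next: old_head }));
    }

    fn pop(&mut self) -> Option<T> {
        self.head.take().map(|node| {
            self.head = node.next;
            node.value
        })
    }
}

fn main() {
    let mut list = List::new();
    list.push(1);
    list.push(2);
    assert_eq!(list.pop(), Some(2));
    assert_eq!(list.pop(), Some(1));
    assert_eq!(list.pop(), None);
}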

You can use "weak", but the interface is clunky and you have to get away from single ownership.

Could this proposed feature, "derive owner()", work?

 
#[derive(Owner<Bar>)] // generates a function owner() that returns the owner of the object.
struct Foo {
    s1: String
}

struct Bar {
    my_foo: &Foo,
    s2: String
}

fn main() {
    let f1 = Foo { s1: "Hello".to_string() }; // initialize Foo
    let b1 = Bar { my_foo: &f1, s2: "World".to_string() };  // b1 now owns f1
    println!("{}", b1.my_foo.s1); // prints "Hello"
    println!("{}", f1.owner().unwrap().s2); // prints "World"
}

The idea here is that deriving an "owner" function does two things. First, it puts a hidden field in the struct that points back to the owner. Here, that hidden field would have type Option<_>, wrapping some kind of pointer back to the owning Bar. Ownership by anything other than a Bar puts a None in the Option.
Second, it adds code to anything that takes ownership of that struct to update the hidden field.

Deriving "owner" for an object would of course limit it to single ownership, since otherwise the owner is ambiguous.

This allows making doubly linked lists and simple trees without reference counts.

OK, what's wrong with this proposal?

Derive macros can't modify their input declaration. You'd need another kind of proc-macro for that, e.g. an attribute.
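
To make that concrete, a rough sketch of the two proc-macro kinds (derive_owner and owner are placeholder names, and this would have to live in a separate crate with proc-macro = true). A derive macro's output is appended next to the item, so it can add an impl but not a hidden field; an attribute macro's output replaces the item, so it could rewrite the struct:

use proc_macro::TokenStream;

// A derive macro only appends new items (impls, etc.); it cannot touch the
// struct definition itself, so it cannot add a hidden back-pointer field.
#[proc_macro_derive(Owner)]
pub fn derive_owner(_input: TokenStream) -> TokenStream {
    TokenStream::new()
}

// An attribute macro receives the item and returns whatever should replace it,
// so it could rewrite the struct to include a hidden field.
#[proc_macro_attribute]
pub fn owner(_attr: TokenStream, item: TokenStream) -> TokenStream {
    item // a real implementation would transform `item` here
}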

How would that work? It looks like this would need "move constructors", which (I hope) basically won't happen.

As defined, types Foo and Bar are extremely limited and can't work. Bar is supposed to own Foo, but it is a view struct that actually needs a lifetime annotation:

struct Bar<'foo> {
    my_foo: &'foo Foo,
    s2: String
}

But that ties the lifetime of a Bar instance to the lifetime of a Foo instance, instead of the other way around (which is, I think, always the case for an owner).

In addition, what exactly should Foo's pointer to its owner look like? If it's a raw pointer, that's unsafe and will likely lead to UB and segfaults for some code using it. If it's a borrow, that will fail to compile, because Bar already holds a borrow of a Foo instance. And something like Rc or Arc means that Foo ends up owning Bar (in a shared fashion) rather than Bar being the owner of Foo.

I can't prove it atm, but currently I strongly believe that any proposal to add a link back to the owner is fatally flawed at the conceptual level.

You may be right. But if a back pointer to an owner were impossible, doing this with a weak pointer wouldn't work either. You'd get the double-borrow panic if you tried to go back to the previous item in a list.

Weak works here because the node is managed by Rc's system of shared ownership. While you're working with a back pointer, both the child and the grandparent share ownership of the parent node. This shared ownership prevents you from making any changes (outside of Cell / Mutex).
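
A small sketch of that point, with a hypothetical node type: upgrading the back pointer temporarily bumps the strong count, so ownership of the parent really is shared, and mutation has to go through RefCell, which panics on overlapping borrows instead of allowing UB:

use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    next: RefCell<Option<Rc<Node>>>,
    prev: RefCell<Weak<Node>>,
}

fn main() {
    let parent = Rc::new(Node { next: RefCell::new(None), prev: RefCell::new(Weak::new()) });
    let child = Rc::new(Node { next: RefCell::new(None), prev: RefCell::new(Rc::downgrade(&parent)) });
    *parent.next.borrow_mut() = Some(Rc::clone(&child));

    assert_eq!(Rc::strong_count(&parent), 1);
    let also_parent = child.prev.borrow().upgrade().unwrap();
    // While we hold the upgraded pointer, ownership of `parent` is shared.
    assert_eq!(Rc::strong_count(&parent), 2);

    // Mutation goes through RefCell; overlapping borrows panic at runtime.
    let _reading = parent.next.borrow();
    // parent.next.borrow_mut(); // would panic: "already borrowed"
    drop(also_parent);
}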

What is your take on BAPC 2010 I Keylogger, which you can submit here?

Scan the input and maintain two stacks: characters left of the cursor (top of stack is right-most) and characters right of the cursor (top of stack is left-most). The operations are

match input {
    '<' => pop_left_stack_and_then_push_it_onto_right_stack(),
    '>' => pop_right_stack_and_then_push_it_onto_left_stack(),
    '-' => pop_left_stack_and_then_drop_it(),
    _ => push_left_stack(input),
}

When done, read off the stacks (left stack: bottom to top, right stack: top to bottom).

Nice. This passes in 295ms:

// BAPC 2010 I Keylogger
// as suggested by https://users.rust-lang.org/t/data-structures-appear-to-have-an-extra-level-of-difficulty-in-rust-how-would-you-teach-them/61264/28
//
// @author godmar
use std::io::{self, BufRead};
 
fn main() {
    let stdin = io::stdin();
    for s in stdin.lock().lines().map(|l| l.unwrap()).skip(1) {
        let mut left : Vec<char> = vec![];
        let mut right : Vec<char> = vec![];
        for c in s.chars() {
            match c {
            '<' => if let Some(c) = left.pop() { right.push(c); },
            '>' => if let Some(c) = right.pop() { left.push(c); },
            '-' => { left.pop(); () },
            _ => left.push(c),
            }
        }
        for c in left.iter() {
            print!("{}", c);
        }
        for c in right.iter().rev() {
            print!("{}", c);
        }
        println!("");
    }
}

I had suggested this problem because it was the only problem we've ever encountered that would actually make use of Java's java.util.LinkedList and its removing iterators in a useful way.

Overall, upon reflection, it perhaps shouldn't come as a surprise that competitive problems can be solved in Rust; after all, the most commonly used paradigm for solving these problems is C++ without pointers.

Yes. If you do that with lists, you're chasing all over memory, getting cache misses, and doing way too much allocation and de-allocation for the amount of work done for each character.

It's more expensive, but not "too much" under the given problem bounds. It passes in about 0.75s (including JVM overhead and "slow I/O" using java.util.Scanner; though I haven't looked at Rust's I/O, so I don't know whether stdin.lock().lines() is the right comparison).

What counts is how quickly you get the insight and can implement it. If you don't have the stack insight, you can implement it with a linked list straightforwardly.

PS: Using a faster I/O template for Rust suggested by a CF competitor, along with using u8 instead of char, drops Rust's time from 295ms to 30ms. Java's time with fast I/O dropped only from 748ms to 623ms.

PPS: Cache misses are not likely a problem here. The code has locality, since elements are added at the cursor and each operation moves the cursor by only one element. The link elements are likely allocated with spatial locality in a contiguous area. What really kills performance is the per-character overhead (likely 24 bytes or so).

What happens if you also lock stdout explicitly, instead of locking it implicitly on every write? That seems to me like a simple change with perhaps a big impact.

// BAPC 2010 I Keylogger
// as suggested by https://users.rust-lang.org/t/data-structures-appear-to-have-an-extra-level-of-difficulty-in-rust-how-would-you-teach-them/61264/28
//
// @author godmar
use std::io::{self, BufRead, Write};
 
fn main() {
    let stdin = io::stdin().lock();
    let mut stdout = io::stdout().lock();
    for s in stdin.lines().map(|l| l.unwrap()).skip(1) {
        let mut left : Vec<char> = vec![];
        let mut right : Vec<char> = vec![];
        for c in s.chars() {
            match c {
            '<' => if let Some(c) = left.pop() { right.push(c); },
            '>' => if let Some(c) = right.pop() { left.push(c); },
            '-' => { left.pop(); () },
            _ => left.push(c),
            }
        }
        for c in left.iter() {
            write!(&mut stdout, "{}", c);
        }
        for c in right.iter().rev() {
            write!(&mut stdout, "{}", c);
        }
       writeln!(&mut stdout, "");
    }
}

That doesn't surprise me. I didn't look at that template beyond a glance, but the things I thought of from your example were

  • Lock stdin and stdout at the top
  • Use u8, not String or char
    • You can then also use std::io::Write::write_all for the left buffer
  • Don't use lines, reuse a buffer with std::io::BufRead::read_until
    • Or maybe just read byte by byte
  • Also reuse your stack Vecs (create them outside the loop, clear them within)

The optimizer probably did some of these for you.

Do you mean something like this:

use std::io::{self, BufRead, Write};

fn main() {
    let stdin = io::stdin();
    let stdout = io::stdout();
    let mut stdout = stdout.lock();
    for s in stdin.lock().lines().map(|l| l.unwrap()).skip(1) {
        let mut left : Vec<char> = vec![];
        let mut right : Vec<char> = vec![];
        for c in s.chars() {
            match c {
            '<' => if let Some(c) = left.pop() { right.push(c); },
            '>' => if let Some(c) = right.pop() { left.push(c); },
            '-' => { left.pop(); () },
            _ => left.push(c),
            }
        }
        for c in left.iter() {
            write!(&mut stdout, "{}", c);
        }
        for c in right.iter().rev() {
            write!(&mut stdout, "{}", c);
        }
        writeln!(&mut stdout, "");
    }
}

That's 233ms vs. 295ms.

In my experience, if competitive-programming code is unreasonably slower than expected, wrapping stdout in a BufWriter usually solves it. BTW, you should open another thread if you want to talk about it more.
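
For reference, a hedged sketch combining those suggestions (read stdin once as bytes, reuse the two stacks, write through a BufWriter around the locked stdout). It has not been run against the judge, so the timings above may not carry over:

use std::io::{self, Read, Write};

fn main() {
    // Read all of stdin once as raw bytes.
    let mut input = Vec::new();
    io::stdin().lock().read_to_end(&mut input).unwrap();
    let stdout = io::stdout();
    let mut out = io::BufWriter::new(stdout.lock());

    // Reuse the two stacks across test cases.
    let mut left: Vec<u8> = Vec::new();
    let mut right: Vec<u8> = Vec::new();
    for line in input.split(|&b| b == b'\n').skip(1) {
        if line.is_empty() {
            continue; // trailing newline at end of input
        }
        left.clear();
        right.clear();
        for &b in line {
            match b {
                b'<' => if let Some(b) = left.pop() { right.push(b); },
                b'>' => if let Some(b) = right.pop() { left.push(b); },
                b'-' => { left.pop(); },
                b'\r' => {} // tolerate CRLF line endings
                _ => left.push(b),
            }
        }
        out.write_all(&left).unwrap();
        right.reverse();
        out.write_all(&right).unwrap();
        out.write_all(b"\n").unwrap();
    }
}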
