Accepting `Trait` vs `&mut Trait`

In the code below, I created a Serialize trait whose implementors serialize themselves into a given std::io::Write. Since Vec<u8> implements std::io::Write, I can use &mut Vec<u8> as the writer. The problem is that at line 12 the writer type becomes &mut &mut &mut Vec<u8> instead of just &mut Vec<u8>. With a deeply nested data structure where each child implements Serialize, we end up with &mut &mut &mut ... &mut Vec<u8>, and I think we could end up with infinite recursion if we have a circular data structure.

I understand this happens because I pass &mut writer at lines 27 and 45, which adds another &mut layer on every nested call. I think I can fix the issue by changing the Serialize trait to accept &mut std::io::Write instead of std::io::Write, but I see that people usually accept the trait by value rather than &mut Trait, as in the serde library (to_writer in serde_json - Rust). Am I on the right track in changing it to &mut std::io::Write, or am I missing something?

trait Serialize {
    fn serialize<W>(&self, writer: W)
    where
        W: std::io::Write;
}

impl Serialize for u32 {
    fn serialize<W>(&self, mut writer: W)
    where
        W: std::io::Write,
    {
        dbg!(std::any::type_name::<W>());
        writer.write_all(&self.to_le_bytes()).unwrap();
    }
}

impl<T> Serialize for Option<T>
where
    T: Serialize,
{
    fn serialize<W>(&self, mut writer: W)
    where
        W: std::io::Write,
    {
        dbg!(std::any::type_name::<W>());
        if let Some(ref val) = self {
            val.serialize(&mut writer);
            writer.write_all(&[0x01]).unwrap();
        } else {
            writer.write_all(&[0x00]).unwrap();
        }
    }
}

impl<T> Serialize for Vec<T>
where
    T: Serialize,
{
    fn serialize<W>(&self, mut writer: W)
    where
        W: std::io::Write,
    {
        dbg!(std::any::type_name::<W>());
        for item in self.iter() {
            item.serialize(&mut writer);
        }
    }
}

fn main() {
    let x = Some(vec![10u32, 11u32]);
    let mut buff = Vec::<u8>::default();
    x.serialize(&mut buff);
    buff.push(0x12);
    println!("{:?}", &buff);
}

(Playground)

Output:

[10, 0, 0, 0, 11, 0, 0, 0, 1, 18]

Errors:

   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 0.87s
     Running `target/debug/playground`
[src/main.rs:25] std::any::type_name::<W>() = "&mut alloc::vec::Vec<u8>"
[src/main.rs:43] std::any::type_name::<W>() = "&mut &mut alloc::vec::Vec<u8>"
[src/main.rs:12] std::any::type_name::<W>() = "&mut &mut &mut alloc::vec::Vec<u8>"
[src/main.rs:12] std::any::type_name::<W>() = "&mut &mut &mut alloc::vec::Vec<u8>"

But what is the issue?

The problem is that, in a very deeply nested struct, we can hit Rust's recursion limit because of the very long &mut &mut &mut &mut ... chain. I think a recursive data structure will definitely hit the recursion limit. Let me see whether I can trigger it using a recursive data structure.

Here is what I meant: Rust Playground

trait Serialize {
    fn serialize<W>(&self, writer: W)
    where
        W: std::io::Write;
}

struct A(Vec<B>);
impl Serialize for A {
    fn serialize<W>(&self, mut writer: W)
    where
        W: std::io::Write,
    {
        dbg!(std::any::type_name::<W>());
        for item in &self.0 {
            item.serialize(&mut writer);
        }
    }
}

struct B(Vec<A>);
impl Serialize for B {
    fn serialize<W>(&self, mut writer: W)
    where
        W: std::io::Write,
    {
        dbg!(std::any::type_name::<W>());
        for item in &self.0 {
            item.serialize(&mut writer);
        }
    }
}

fn main() {
    let x = A(vec![B(vec![])]);
    let mut buff = Vec::<u8>::default();
    x.serialize(&mut buff);
    buff.push(0x12);
    println!("{:?}", &buff);
}

Errors:

   Compiling playground v0.0.1 (/playground)
error: reached the recursion limit while instantiating `<A as Serialize>::serialize::<&mut &mut &mut &mut ...>`
  --> src/main.rs:28:13
   |
28 |             item.serialize(&mut writer);
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
note: `<A as Serialize>::serialize` defined here
  --> src/main.rs:9:5
   |
9  | /     fn serialize<W>(&self, mut writer: W)
10 | |     where
11 | |         W: std::io::Write,
   | |__________________________^
   = note: the full type name has been written to '/playground/target/debug/deps/playground-f2d9b096a48657b4.long-type.txt'

error: could not compile `playground` (bin "playground") due to previous error

Somehow, I need to use two structs to make it recurse infinitely. Using something like:

enum List {
   Item(u32, Box<List>),
   Nothing,
}

won't trigger the infinite recursion. Not sure why.

I don't think the signature of to_writer() should be taken as indicating what should always be done when taking a writer. It performs serialisation, so it wraps the writer in an instance of Serializer, all of whose methods take self by value (since a value gets serialised to exactly one serde data type, and it applies a typestate pattern to handle composite types). It thus makes sense for to_writer() to take its writer by value.

For a counterexample to your statement, consider Formatter, also from serde_json, whose methods all take a &mut W where W: Write + ?Sized. Looking at the standard library, most operations on writers are done via &mut self methods on the Write trait, but std::io::copy() is an example of a function that accepts a writer, and it takes it via a reference.
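As a small illustration of the std::io::copy() point, here is a minimal sketch. The byte-slice reader and the Vec<u8> writer are choices made only for this example; the point is that both ends are passed as &mut borrows, so the writer is still usable after the call.

```rust
use std::io::{self, Write};

fn main() -> io::Result<()> {
    // `&[u8]` implements `Read`, and `Vec<u8>` implements `Write`.
    let mut reader: &[u8] = b"hello";
    let mut buff = Vec::<u8>::new();

    // `io::copy` borrows the reader and the writer; it does not consume them.
    let copied = io::copy(&mut reader, &mut buff)?;
    assert_eq!(copied, 5);

    // The writer was only borrowed, so we can keep writing to it.
    buff.write_all(b" world")?;
    println!("{}", String::from_utf8_lossy(&buff));
    Ok(())
}
```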

Considering all of this, I would say that if you need a borrowed writer, which it looks like you do if you are going to be reborrowing it, then you should just take a borrowed writer from the start. You should thus be fine to have writer as &mut W where W: Write + ?Sized, following the example of the functions I referenced above.
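A minimal sketch of what that suggestion could look like when applied to the trait from the question. Returning io::Result instead of calling unwrap() is an assumption added here, not something the original code does. Because the parameter is already a &mut W, nested calls reborrow it and W stays the same type at every depth.

```rust
use std::io::{self, Write};

trait Serialize {
    // The writer is borrowed from the start, so nested calls reborrow it
    // instead of wrapping it in another `&mut` layer.
    fn serialize<W>(&self, writer: &mut W) -> io::Result<()>
    where
        W: Write + ?Sized;
}

impl Serialize for u32 {
    fn serialize<W>(&self, writer: &mut W) -> io::Result<()>
    where
        W: Write + ?Sized,
    {
        writer.write_all(&self.to_le_bytes())
    }
}

impl<T: Serialize> Serialize for Option<T> {
    fn serialize<W>(&self, writer: &mut W) -> io::Result<()>
    where
        W: Write + ?Sized,
    {
        match self {
            Some(val) => {
                // Implicit reborrow: `W` is unchanged in the nested call.
                val.serialize(writer)?;
                writer.write_all(&[0x01])
            }
            None => writer.write_all(&[0x00]),
        }
    }
}

impl<T: Serialize> Serialize for Vec<T> {
    fn serialize<W>(&self, writer: &mut W) -> io::Result<()>
    where
        W: Write + ?Sized,
    {
        for item in self {
            item.serialize(writer)?;
        }
        Ok(())
    }
}

fn main() -> io::Result<()> {
    let x = Some(vec![10u32, 11u32]);
    let mut buff = Vec::<u8>::new();
    x.serialize(&mut buff)?;
    buff.push(0x12);
    println!("{:?}", buff);
    Ok(())
}
```

With the ?Sized bound, a caller can also pass a &mut dyn Write, in which case W is the trait object type itself.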

If you were writing a recursive function, you would find that you have to use a &mut for the input, or you will get an always-infinite recursion independent of the intended recursion depth:

use std::io;
fn foo<W: io::Write>(i: usize, mut w: W) -> io::Result<()> {
    w.write_all("hello".as_bytes())?;
    if i > 0 {
        foo(i - 1, &mut w)?;
    }
    w.write_all("goodbye".as_bytes())?;
    Ok(())
}

fn main() {
    foo(10, io::stdout()).unwrap();
}

error: reached the recursion limit while instantiating `foo::<&mut &mut &mut &mut &mut ...>`
 --> src/main.rs:5:9
  |
5 |         foo(i - 1, &mut w)?;
  |         ^^^^^^^^^^^^^^^^^^
  |
note: `foo` defined here
 --> src/main.rs:2:1
  |
2 | fn foo<W: io::Write>(i: usize, mut w: W) -> io::Result<()> {
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This version with a required &mut input compiles, because it's reborrowing the input &mut instead of adding a new layer:

use std::io;
fn foo<W: io::Write>(i: usize, w: &mut W) -> io::Result<()> {
    w.write_all("hello".as_bytes())?;
    if i > 0 {
        foo(i - 1, w)?;
    }
    w.write_all("goodbye".as_bytes())?;
    Ok(())
}

fn main() {
    foo(10, &mut io::stdout()).unwrap();
}

Similarly, I think one should consider it appropriate to require &mut in an implicitly-recursive trait method.
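For the mutually recursive A and B structs from earlier in the thread, the same change might look like the sketch below. The b"A" and b"B" tag bytes and the io::Result return type are assumptions added so the output is observable; they are not part of the original code. With writer: &mut W, every nested call uses the same W, so the compiler has only one instantiation to generate.

```rust
use std::io::{self, Write};

trait Serialize {
    fn serialize<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()>;
}

struct A(Vec<B>);
struct B(Vec<A>);

impl Serialize for A {
    fn serialize<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()> {
        writer.write_all(b"A")?; // hypothetical tag byte, for illustration only
        for item in &self.0 {
            item.serialize(writer)?; // reborrow, no extra `&mut` layer
        }
        Ok(())
    }
}

impl Serialize for B {
    fn serialize<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()> {
        writer.write_all(b"B")?; // hypothetical tag byte, for illustration only
        for item in &self.0 {
            item.serialize(writer)?;
        }
        Ok(())
    }
}

fn main() -> io::Result<()> {
    let x = A(vec![B(vec![A(vec![])]), B(vec![])]);
    let mut buff = Vec::<u8>::new();
    x.serialize(&mut buff)?;
    println!("{}", String::from_utf8_lossy(&buff));
    Ok(())
}
```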

I see. Yeah, that makes sense.
But it feels weird that a specific implementation requires us to change the function signature, when technically I could make the implementation non-recursive by simulating the call stack with a Vec.
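To make the stack-based idea concrete, here is a rough sketch. The Node type and the serialize_iter function are hypothetical names invented for this illustration. The writer is taken by value, and since the function never calls itself, W is instantiated only once. Note that this relies on the whole traversal living in one function, which is not the case when each child type has its own trait implementation.

```rust
use std::io::{self, Write};

// Hypothetical tree type, used only for this illustration.
enum Node {
    Leaf(u32),
    Branch(Vec<Node>),
}

// Takes the writer by value. There is no recursive call, so no
// `&mut &mut ...` chain is ever built.
fn serialize_iter<W: Write>(root: &Node, mut writer: W) -> io::Result<()> {
    // Explicit stack of nodes still to be visited.
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        match node {
            Node::Leaf(v) => writer.write_all(&v.to_le_bytes())?,
            Node::Branch(children) => {
                // Push in reverse so children are popped in their original order.
                stack.extend(children.iter().rev());
            }
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let tree = Node::Branch(vec![
        Node::Leaf(1),
        Node::Branch(vec![Node::Leaf(2)]),
        Node::Leaf(3),
    ]);
    let mut buff = Vec::<u8>::new();
    serialize_iter(&tree, &mut buff)?;
    println!("{:?}", buff);
    Ok(())
}
```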

Well, that's reality. The signature absolutely does limit what you can do in the implementation. That's the whole point of types. If I were to ask you to implement a function with the signature

fn foo(x: u32) -> u32;

then it would automatically rule out any implementation that would accept a String or return a Vec<u32> or anything like that.

And in Rust, we make signature distinctions all the time that are about implementation characteristics, not application semantics. If we didn't want to avoid needless allocations, then -> Vec<i32> would be an improvement on -> &[i32], because it doesn't impose the requirement of an existing allocated slice.
