Why can't I create a vec of byte strings?

Dear mighty Rustaceans,

I'm a newbie in the Rust world and am having trouble joining byte strings.

Here's my code snippet that I expected to work.

fn main() {
    assert_eq!(["next", "gen"].join(" "), "next gen");   // work
    assert_eq!(vec![b"this", b"work"].as_slice().join(b" "), b"this work"); // this doesn't complain for creating vec

    assert_eq!(vec![&b"next", &b"gen"].as_slice().join(&b" "), b"next gen"); // bad
    assert_eq!(vec![b"next", b"gen"].as_slice().join(b" "), b"next gen"); // bad
    assert_eq!([b"next", b"gen"].join(&b" "), b"next gen"); // bad
}

And I'm seeing the following compile errors.

   Compiling playground v0.0.1 (/playground)
error[E0599]: the method `join` exists for reference `&[&[u8; 4]]`, but its trait bounds were not satisfied
 --> src/main.rs:3:50
  |
3 |     assert_eq!(vec![b"this", b"work"].as_slice().join(b" "), b"this work"); // this doesn't complain for creating vec
  |                                                  ^^^^ method cannot be called on `&[&[u8; 4]]` due to unsatisfied trait bounds
  |
  = note: the following trait bounds were not satisfied:
          `[&[u8; 4]]: Join<_>`

error[E0308]: mismatched types
 --> src/main.rs:5:31
  |
5 |     assert_eq!(vec![&b"next", &b"gen"].as_slice().join(&b" "), b"next gen"); // bad
  |                               ^^^^^^^ expected an array with a fixed size of 4 elements, found one with 3 elements

error[E0599]: the method `join` exists for reference `&[&&[u8; 4]]`, but its trait bounds were not satisfied
 --> src/main.rs:5:51
  |
5 |     assert_eq!(vec![&b"next", &b"gen"].as_slice().join(&b" "), b"next gen"); // bad
  |                                                   ^^^^ method cannot be called on `&[&&[u8; 4]]` due to unsatisfied trait bounds
  |
  = note: the following trait bounds were not satisfied:
          `[&&[u8; 4]]: Join<_>`

error[E0308]: mismatched types
 --> src/main.rs:6:30
  |
6 |     assert_eq!(vec![b"next", b"gen"].as_slice().join(b" "), b"next gen"); // bad
  |                              ^^^^^^ expected an array with a fixed size of 4 elements, found one with 3 elements

error[E0599]: the method `join` exists for reference `&[&[u8; 4]]`, but its trait bounds were not satisfied
 --> src/main.rs:6:49
  |
6 |     assert_eq!(vec![b"next", b"gen"].as_slice().join(b" "), b"next gen"); // bad
  |                                                 ^^^^ method cannot be called on `&[&[u8; 4]]` due to unsatisfied trait bounds
  |
  = note: the following trait bounds were not satisfied:
          `[&[u8; 4]]: Join<_>`

error[E0308]: mismatched types
 --> src/main.rs:7:26
  |
7 |     assert_eq!([b"next", b"gen"].join(&b" "), b"next gen"); // bad
  |                          ^^^^^^ expected an array with a fixed size of 4 elements, found one with 3 elements

Some errors have detailed explanations: E0308, E0599.
For more information about an error, try `rustc --explain E0308`.
error: could not compile `playground` due to 6 previous errors

The string version of my code works, but I can't make it work with byte strings.
First of all, I don't understand why the vec expects byte strings of the same size.
Aren't byte strings slices?
How can I instantiate a vector with variable-sized byte strings, without extra allocation?

Secondly, why does join complain about trait bounds?
I read the documentation of slices and join was there: slice - Rust
Isn't being a [u8] enough for a byte string to support join?

It seems that I'm misunderstanding something basic, and I'm asking for help.
Can someone help me? It would be a great help.

Sincerely,

Note that a byte string literal has the type &'static [u8; N] instead of &'static [u8].

And due to this implementation:

impl<T, V> Join<&[T]> for [V]
where
    T: Clone,
    V: Borrow<[T]>,

this will work

fn main() {
    assert_eq!(vec!["this".as_bytes(), b"work"].as_slice().join(" ".as_bytes()), b"this work");
    
    let v: Vec<&[u8]> = vec![b"this", b"work"]; // coercion: &[u8; N] -> &[u8]
    assert_eq!(v[..].join(" ".as_bytes()), b"this work");
    // Self: v[..] is the type [&[u8]]
    // Separator: str.as_bytes() is the type &[u8]
    // Self::Output: Vec<T>
    // T = u8, V = &[u8], and &[u8]: Borrow<[u8]> (i.e. V: Borrow<[T]> is satisfied)
    // desugar: <[&[u8]] as Join<&[u8]>>::join(&Self, Separator) -> Self::Output
}

// impl<T> Borrow<T> for &T
// where
//     T: ?Sized,
// i.e. &[u8]: Borrow<[u8]>
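
To see which element types satisfy the V: Borrow<[T]> bound, here is a minimal sketch. The helper borrowed_len is a hypothetical name introduced only for illustration; it carries the same bound as the Join impl above, with T fixed to u8.

```rust
use std::borrow::Borrow;

// Hypothetical helper: accepts the same element types that the
// `Join<&[u8]> for [V]` impl accepts, i.e. anything that is `Borrow<[u8]>`.
fn borrowed_len<V: Borrow<[u8]>>(v: V) -> usize {
    let bytes: &[u8] = v.borrow();
    bytes.len()
}

fn main() {
    // A byte string literal is a reference to a fixed-size array.
    let lit: &'static [u8; 4] = b"this";

    // &[u8] satisfies the bound via `impl<T: ?Sized> Borrow<T> for &T`.
    assert_eq!(borrowed_len(&lit[..]), 4);

    // [u8; 4] by value satisfies it via `impl Borrow<[T]> for [T; N]`.
    assert_eq!(borrowed_len(*lit), 4);

    // &[u8; 4] does not: it only implements Borrow<[u8; 4]>, and a generic
    // parameter is not coerced to a slice, so this line is rejected:
    // borrowed_len(lit);
}
```

This is the same reason join is rejected on a [&[u8; 4]]: the element type is a reference to an array, not a reference to a slice.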

If you only join with a single byte, then due to another implementation

impl<T, V> Join<&T> for [V]
where
    T: Clone,
    V: Borrow<[T]>,
type Output = Vec<T, Global>

you can also do this

fn main() {
    assert_eq!(vec!["this".as_bytes(), b"work"].as_slice().join(&b' '), b"this work");
}

though the two styles emit the same code under optimization.

That's because Rust doesn't even have any byte strings. It only has arrays of bytes and slices of bytes.

It's one place where an attempt to make the language more “convenient” backfires.

I personally have never fallen into that trap but can see why it would be easy to hit it.

What @vague wrote is absolutely correct, but probably flew right over your head.

String literals (like "next" or "gen") are not strings but references to str: &str.

These references are similar to slices of arrays, not arrays themselves.

But byte literals like b"this" or b"gen" are not references, they are just arrays.

You can turn them into slices like this: b"this"[..] and then references to slices: &b"this"[..].

And then everything works:

fn main() {
    assert_eq!(["next", "gen"].join(" "), "next gen");   // work
    assert_eq!(vec![&b"next"[..], &b"gen"[..]]
               .as_slice()
               .join(&b" "[..]), &b"next gen"[..]); // work
}

Or you can use as_slice(), too:

fn main() {
    assert_eq!(["next", "gen"].join(" "), "next gen");   // work
    assert_eq!(vec![b"next".as_slice(), b"gen".as_slice()]
               .as_slice()
               .join(b" ".as_slice()), b"next gen".as_slice()); // work
}

Rust can automatically convert arrays into slices in certain circumstances. Your vec with b"this" and b"work" was accepted for a simpler reason: you got a vector of arrays (not slices), which works because b"this" and b"work" have the same length. No conversion to slices happens there, which is why join on that vector is still rejected (the first E0599 above).

But this falls apart when you try to create a vector from b"next" and b"gen"! They have different sizes and cannot live in one vector!

If you stick one slice in there, it works again:

pub fn main() {
    assert_eq!(["next", "gen"].join(" "), "next gen");   // work
    assert_eq!(vec![b"next".as_slice(), b"gen"]
               .as_slice()
               .join(b" ".as_slice()), b"next gen"); // work
}
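
On the original question about avoiding an extra allocation for the container: one option, sketched here under the assumption that the set of literals is fixed, is an array with an explicit slice element type instead of a Vec. The helper name join_bytes is hypothetical. Note that join itself still allocates the resulting Vec<u8>.

```rust
// Hypothetical helper: joins byte slices with a separator slice.
fn join_bytes(parts: &[&[u8]], sep: &[u8]) -> Vec<u8> {
    parts.join(sep)
}

fn main() {
    // The annotation makes each &[u8; N] literal coerce to &[u8],
    // so literals of different lengths share one element type.
    let parts: [&[u8]; 2] = [b"next", b"gen"];
    let sep: &[u8] = b" ";
    assert_eq!(join_bytes(&parts, sep), b"next gen");
}
```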

Not your fault. It's a place where language inconsistency and ergonomic features “got you”.

If both "this" and b"that" had similar types, you would have had no trouble.

If arrays were automatically converted to slices in all cases where that would make the code compilable, you would never have found out that "this" and b"that" have dissimilar types.

But the combination of automatic conversions that work sometimes but not always, and the difference in types between "this" and b"that", caused the confusion.

I have never seen anyone stumble here, but I guess that's one reason why most tutorials never talk about b"literals": if you find out about them after you have finished learning Rust and have some experience, then it's easy to understand what happens here.

But for a newbie… that's quite a puzzle.

They are references to arrays. For instance, the following program:

fn main() {
    let a: () = b"abc";
}

Will show that b"abc" is &[u8; 3] (not [u8; 3]).

error[E0308]: mismatched types
 --> src/main.rs:2:17
  |
2 |     let a: () = b"abc";
  |            --   ^^^^^^ expected `()`, found `&[u8; 3]`
  |            |
  |            expected due to this

I stand corrected.

Coercions are handy but make it hard to find out what type a certain expression has.

The important thing here is that they are not slices and not references to slices. Thus b"this" and b"that" have the same type but b"foo" and b"oops" have different types.
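
A small sketch of that point, using a const generic parameter to read the length out of the literal's type. The helper array_len is a hypothetical name, not something from the standard library.

```rust
// Hypothetical helper: N comes from the argument's type, not from its value.
fn array_len<const N: usize>(_: &[u8; N]) -> usize {
    N
}

fn main() {
    // Both literals are &[u8; 4], so they fit into one array type.
    let same: [&[u8; 4]; 2] = [b"this", b"that"];
    assert_eq!(array_len(same[0]), 4);

    // b"foo" is &[u8; 3] and b"oops" is &[u8; 4]: two distinct types.
    assert_eq!(array_len(b"foo"), 3);
    assert_eq!(array_len(b"oops"), 4);
}
```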

See also: B in bstr - Rust
