I like the crate and will likely use it in the future, thanks for your work!
One thing that stood out to me was that you don't (at least I couldn't find it) mention the term invariant anywhere in the docs or the README. Marking new_unchecked as unsafe implies that the constraints put on the inner value by validation form a safety invariant, which you should then be allowed to soundly exploit in unsafe code. (One example that comes to mind is #[validate(max_len=255)] and then serializing the string length as a u8).
Is this left unspecified on purpose or am I not allowed to rely on this invariant in unsafe code?
I've read through your message a few times, but failed to understand the question fully. I haven't yet read the article about validity invariants, so maybe it will fill the gap.
Am I allowed to rely on the validation provided by this crate for soundness?
Let's say I write the following code using nutype (I can't think of a better example right now):
#![feature(vec_into_raw_parts)]
use nutype::nutype;
#[nutype(validate(max_len = 255))]
#[derive(*)]
struct ShortString(String);
impl ShortString {
    pub fn rebuild(self) -> Self {
        let rebuilt = unsafe {
            // into_inner() hands back the wrapped String so it can be taken apart
            let (ptr, len, cap) = self.into_inner().into_raw_parts();
            String::from_raw_parts(
                ptr,
                len as u8 as usize, // imagine it was serialized as a u8
                cap,
            ) // this is UB if len > 255
            // (not because of the memory leak, but the string potentially not being utf-8)
        };
        // SAFETY: len comes from a u8, making it impossible for this to be Err
        unsafe { ShortString::new(rebuilt).unwrap_unchecked() }
    }

    pub fn nonsense(&self) {
        if self.len() > 255 {
            // UB if the validation invariant does not actually hold
            unsafe { core::hint::unreachable_unchecked() }
        }
    }
}
Both functions, rebuild and nonsense, are UB if the struct has a string with more than 255 bytes, one is just more subtle than the other. Are these functions sound?
I just noticed I accidentally mixed up validity and safety invariants, I edited the previous post and added a link that makes the difference clearer (imo).
It's a tricky example that I've never thought about.
Short answer: nutype will not repair the string if it's constructed with unsafe and has potential UB.
ShortString is mostly just a wrapper around String.
I'd like to note that your example will not compile, because instantiating a value like
ShortString(rebuilt)
is not possible (that's one of the main points of nutype).
Instead you'd need to do
ShortString::new(rebuilt).unwrap()
Note that ShortString::new(rebuilt) will return Err if rebuilt.len() > 255.
However, in this example it's not gonna happen, because len is obtained through u8.
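To make the behaviour of new concrete, here is a minimal hand-written stand-in for the kind of checked constructor being described. It is only a sketch and not nutype's actual generated code: the error type ShortStringError, the MAX_LEN constant, and the choice to measure the limit in bytes (as the example above assumes) are all assumptions made for illustration.

```rust
// Hypothetical stand-in for a validated newtype; NOT nutype's generated code.
#[derive(Debug, PartialEq)]
pub struct ShortString(String);

// Hypothetical error type, named only for this sketch.
#[derive(Debug, PartialEq)]
pub enum ShortStringError {
    TooLong,
}

impl ShortString {
    // Assumption: the limit is counted in bytes, as in the example above.
    pub const MAX_LEN: usize = 255;

    /// The only way to build a value: the length check always runs here.
    pub fn new(raw: impl Into<String>) -> Result<Self, ShortStringError> {
        let raw = raw.into();
        if raw.len() > Self::MAX_LEN {
            return Err(ShortStringError::TooLong);
        }
        Ok(ShortString(raw))
    }

    /// Gives the inner String back, dropping the guarantee.
    pub fn into_inner(self) -> String {
        self.0
    }
}
```

Because the tuple field is private, code outside the defining module cannot write ShortString(rebuilt) directly and has to go through new, which is why the length check cannot be skipped from safe code.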
Let me know if this answers your question and if I can help you with anything else.
That's clear to me, as a string with invalid UTF-8 is already UB. My question is whether I can rely on nutype's guarantees in unsafe code, such as invoking UB if the validation didn't turn out correct. Perhaps this also deserves a new thread.
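For comparison, the u8 length-prefix idea from the first post can also be written so that soundness does not depend on the invariant at all. This is only a sketch: encode_short and decode_short are hypothetical helper names, and the byte layout (one length byte followed by the UTF-8 bytes) is an assumption, not anything nutype provides.

```rust
/// Hypothetical helper: one length byte followed by the string's bytes.
/// Returns None instead of truncating when the length does not fit in a u8.
fn encode_short(s: &str) -> Option<Vec<u8>> {
    // Checked conversion: a string longer than 255 bytes is rejected here.
    let len = u8::try_from(s.len()).ok()?;
    let mut out = Vec::with_capacity(1 + s.len());
    out.push(len);
    out.extend_from_slice(s.as_bytes());
    Some(out)
}

/// Hypothetical helper: reads the format written by encode_short.
/// Every step is checked, so malformed input yields None rather than UB.
fn decode_short(bytes: &[u8]) -> Option<&str> {
    let (&len, rest) = bytes.split_first()?;
    let body = rest.get(..usize::from(len))?;
    std::str::from_utf8(body).ok()
}
```

The cost is one comparison per call. The open question in this thread is whether that check may be replaced by an unchecked cast on the strength of the crate's validation.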