I have recently started to learn Rust and want to understand if I have implemented this zero-cost newtype correctly. You can imagine that Reg represents a register index in a hypothetical emulator. The idea is that when a function takes a Reg, it's guaranteed to hold a valid index, and that we don't pay any runtime cost for Reg.
I have done my best to make it fully featured, and to stay idiomatic, but since I'm so new to Rust I would like some feedback on the end result.
Is there something I'm doing wrong here? An improvement I could make? Some things that are overkill? Does it make sense to use #[inline(always)] in all those functions, should it be #[inline], or should I not have the attribute at all?
Thank you very much in advance!
use thiserror::Error;
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Reg(usize);
impl Reg {
pub const MAX_VALUE: usize = 31;
/// Creates a new `Reg`.
///
/// # Panics
///
/// Will panic if index > MAX_VALUE.
#[inline(always)]
pub fn new(index: usize) -> Self {
Self::try_from(index).expect("register index out of range")
}
/// Returns the underlying register index.
#[inline(always)]
pub fn as_usize(self) -> usize {
self.0
}
}
impl TryFrom<usize> for Reg {
type Error = RegError;
#[inline(always)]
fn try_from(index: usize) -> Result<Self, Self::Error> {
if index <= Self::MAX_VALUE {
Ok(Reg(index))
} else {
Err(RegError::OutOfRange(index))
}
}
}
impl From<Reg> for usize {
#[inline(always)]
fn from(reg: Reg) -> Self {
reg.as_usize()
}
}
#[derive(Error, Debug, Clone, Copy, PartialEq, Eq)]
pub enum RegError {
#[error("register index {0} is out of range (max {max})", max = Reg::MAX_VALUE)]
OutOfRange(usize),
}
IMO:
inline/derive attributes: the Rust compiler is fairly smart, so a release build can often make these optimisation decisions on its own (though with an explicit inline attribute you have more control).
expect(): it will crash your program on failure, so consider avoiding it in released code.
I sometimes fairly randomly put an inline attribute in, but mostly the compiler will figure it out for you.
I do use #[cold] more, for code I know is not on the “hot” path.
In theory, you can use benchmarking to measure what works best, and I have done that as well, but I don’t remember very well if it was valuable in practice. It could vary depending on your hardware.
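To illustrate the #[cold] point above, here is a minimal sketch (function names are made up for the example): the attribute hints to the compiler that a call is rarely taken, so it can optimise layout for the hot path.

```rust
// A hypothetical error path marked #[cold]: the compiler treats calls to it
// as unlikely, keeping the hot path compact.
#[cold]
#[inline(never)]
fn report_invalid_register(index: usize) -> ! {
    panic!("invalid register index: {index}");
}

fn lookup(index: usize) -> usize {
    if index > 31 {
        // Cold path: rarely taken.
        report_invalid_register(index);
    }
    // Hot path: the common case.
    index
}

fn main() {
    assert_eq!(lookup(2), 2);
}
```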
Instead of `pub const MAX_VALUE: usize = 31;`, `Reg` should have a
const MAX_VALUE: usize = 31;
#[expect(unsafe_code)]
// SAFETY: We call `new_unchecked` with the maximum allowed value as defined above.
pub const MAX: Self = unsafe { Self::new_unchecked(Self::MAX_VALUE) };
/// Create a new `Reg` without checking for valid bounds.
///
/// # Safety
///
/// The caller must ensure that the passed-in value is
/// within the valid range of values for `Reg`.
#[must_use]
pub const unsafe fn new_unchecked(value: usize) -> Self {
Self(value)
}
This is because MAX_VALUE is an implementation detail, and the exposed constant should be of the type itself.
You may also want to use #[must_use] on your factory methods.
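As a sketch of the #[must_use] suggestion (the type here mirrors the question's Reg, simplified to drop validation): if a caller writes `Reg::new(3);` and discards the result, the compiler emits an `unused_must_use` warning.

```rust
// Hypothetical, simplified version of the question's Reg to show #[must_use].
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Reg(usize);

impl Reg {
    /// Factory method; the returned value should not be silently dropped.
    #[must_use]
    pub const fn new(index: usize) -> Self {
        Self(index) // validation omitted in this sketch
    }

    #[must_use]
    pub const fn as_usize(self) -> usize {
        self.0
    }
}

fn main() {
    let r = Reg::new(3);
    assert_eq!(r.as_usize(), 3);
}
```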
As already stated, don't panic in new(). If you have to check for invariants, implement a try_new() instead and return a Result. Alternatively, drop new() altogether and only provide a TryFrom implementation, if you don't need a const-factory.
Whether MAX_VALUE is an implementation detail or not is a design choice.
This, too, is a design choice. There are cases where it is perfectly legitimate to panic in new(), and there are cases where panicking in new() is annoying and your suggested try_new() is the way to go.
For such typical wrapper types, I always choose the opposite:
Provide const fn constructor methods by default (new(), try_new(), new_with_xyz(), etc.) and enrich the type with From/TryFrom implementations (that calls the const constructors) when required. This gives users the freedom to use the type in const-contexts, while the conversion traits shine in generic type usage.
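A sketch of the pattern described above, using the question's Reg as an example: a const try_new() as the canonical constructor, with TryFrom delegating to it. The error type name here is made up for the sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Reg(usize);

// Hypothetical error type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct OutOfRange(pub usize);

impl Reg {
    pub const MAX_VALUE: usize = 31;

    /// Const fallible constructor: usable in const contexts,
    /// unlike a trait method.
    pub const fn try_new(index: usize) -> Result<Self, OutOfRange> {
        if index <= Self::MAX_VALUE {
            Ok(Self(index))
        } else {
            Err(OutOfRange(index))
        }
    }
}

impl TryFrom<usize> for Reg {
    type Error = OutOfRange;

    fn try_from(index: usize) -> Result<Self, Self::Error> {
        // The conversion trait just delegates to the const constructor.
        Self::try_new(index)
    }
}

// Works at compile time; TryFrom::try_from currently cannot do this.
const R0: Reg = match Reg::try_new(0) {
    Ok(r) => r,
    Err(_) => panic!("0 is always in range"),
};

fn main() {
    assert!(Reg::try_new(31).is_ok());
    assert!(Reg::try_from(99).is_err());
    assert_eq!(R0, Reg::try_new(0).unwrap());
}
```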
Everything is a design choice. This is tagged as a review and I suggested other ones.
Using Self() in the associated const is less code, but it does not convey that invariants were upheld and that we made a conscious decision about them. A clean unsafe interface with a // SAFETY: comment does exactly that.
Isn’t a TryFrom implementation preferable to a try_new()?
I like having a new() method that returns the object directly; for example, for constant values that shouldn’t fail. If there’s a need to create the object that depends on the user’s input, then it’s better to avoid panicking, of course.
That depends and it is a design choice (scnr). TryFrom::try_from cannot currently be used in const contexts, since it's a trait method.
I also like new to return the object itself by contract and usually use try_new if I need a fallible const factory, though there are counterexamples in the stdlib, like NonZero::new(), which returns an Option.
I think when creating a new object can naturally fail, new() should return a Result. Think about types like pub struct ValidatedString(String) that enforce some invariants.
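A sketch of the ValidatedString idea above, with a made-up invariant (non-empty ASCII) just for illustration: construction can naturally fail, so new() returns a Result.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct ValidatedString(String);

// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ValidationError {
    Empty,
    NonAscii,
}

impl ValidatedString {
    /// Construction enforces the invariants, so the constructor is fallible.
    pub fn new(s: String) -> Result<Self, ValidationError> {
        if s.is_empty() {
            Err(ValidationError::Empty)
        } else if !s.is_ascii() {
            Err(ValidationError::NonAscii)
        } else {
            Ok(Self(s))
        }
    }
}

fn main() {
    assert!(ValidatedString::new("reg".to_string()).is_ok());
    assert_eq!(
        ValidatedString::new(String::new()),
        Err(ValidationError::Empty)
    );
}
```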
Then there are types that rarely fail on construction, maybe because of out-of-memory errors. Here I'm fine with a new() that panics on OOM. For environments that need to handle OOM situations, a corresponding try_new() makes sense.
Returning Option is acceptable when the error condition is obvious from context. NonZero and NonNull are good examples: the type name itself describes the purpose and all valid values, and there is only one contradicting value that results in failure.
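The NonZero case mentioned above can be seen directly in the standard library: the type name already tells you the one failing input, so Option carries enough information.

```rust
use std::num::NonZeroUsize;

fn main() {
    // Zero is the single contradicting value, so None is self-explanatory.
    assert!(NonZeroUsize::new(0).is_none());
    assert_eq!(NonZeroUsize::new(31).map(NonZeroUsize::get), Some(31));
}
```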
const fn reg<const N: usize>() -> Self {
const { if N > Self::MAX_VALUE { panic!("index out of range"); } }
Self(N)
}
which allows you to construct it from compile-time constants and only error at compile time (not runtime) if they are out of range; i.e. Reg::reg::<2>() works fine but Reg::reg::<{ usize::MAX }>() errors at compile time.
There are ways to handle OOM errors for some types. I do think OOM is considered "niche" enough that a lot of methods will just panic instead, but one can always rely on something like Vec::try_reserve:
fn main() {
assert!(
Vec::<u8>::new().try_reserve(usize::MAX).is_err(),
"OOM without panic"
);
}
This can be quite painful, though, since you have to explicitly allocate before attempting to add items (e.g., via Vec::push). Additionally, systems that rely on memory overcommitment (e.g., default Linux kernels) may have the OOM killer terminate your process. For example, on my laptop, the program below is killed by Linux:
fn main() {
// 1 MiB array of 1s.
// On my system, if this is an all-0 array, the program
// isn't killed by Linux.
const BUF: &[u8; 0x10_0000] = &[1; 0x10_0000];
let mut vec = Vec::new();
// On my system an allocation failure will occur
// if this is one greater.
vec.try_reserve_exact(42_213_442_328).expect("allocation success");
// Before we fill our "allocated" `Vec`,
// Linux will kill the process.
while vec.len() < 40_000_000_000 {
vec.extend_from_slice(BUF);
}
}