Using unit-structs to type-restrict function parameters

Ok, I want to define a foo() and a bar(), which each take a usize (let's say it's an index), but I want callers of foo() to know that the index is distinct, and for a particular application, separate from the callers of bar(). fn foo(i: usize) and fn bar(i: usize) is what I don't want, even if it would "work". Rather, say I used tuple-structs:

pub struct A(usize);
pub struct B(usize);
fn foo(i: A) { ... }
fn bar(i: B) { ... }

This is what I want. It's likely, too, that impl B {...} will have certain things that impl A {...} does not, and vice-versa, as well, but, for the callers of foo() and bar(), I simply want the extra clarity that A is really the required type for foo, and B for bar, even if, underneath, they're just both usize integers.

Now, I also want the Deref functionality:

use std::ops::Deref;

But I don't want to duplicate the "impl Deref for ..." for both struct A and struct B; since, in reality, they're just 1-tuple-structs (common pattern) of usize, I'd like to implement this just once (DRY). I came up with this scheme, which does work, but I'd like to know if it's the best approach, or if there's a more Rustic way to do this... or a better way to see the problem or solution. Perhaps I'd need to provide more info on my use-cases, but, if imagination plus what I show here are sufficient, perhaps you can weigh in...?

pub struct A;
pub struct B;

pub struct Idx<T>(usize, T);
impl<T> Deref for Idx<T> {
	type Target = usize;
	fn deref(&self) -> &Self::Target {
		&self.0
	}
}

fn foo(i: &Idx<A>) {
	println!("foo a: {}", **i);
}

fn bar(i: &Idx<B>) {
	println!("bar b: {}", **i);
}

fn main() {
	let ai = Idx(3, A);
	let bi = Idx(5, B);
	foo(&ai);
	//bar(&ai); // <- this would not compile, and rightly so - this is the way I want it - that bar() cannot be called with an A
	//foo(&bi); // <- likewise, this correctly fails to compile, because foo will not take a B
	bar(&bi);
}

Finally... I wish I didn't have to write **i to dereference all the way to the usize, and from what I read at the bottom of Treating Smart Pointers Like Regular References with the Deref Trait - The Rust Programming Language, it seems like I should get an automatic deref... that at least *i should work, but it doesn't. I get "Idx<A> cannot be formatted with the default formatter". This isn't a show-stopper, but I am curious about it.

Thanks in advance.

The Deref trait is best saved for pointer-like things, but nobody's going to stop you if you want to learn the lesson on your own.
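
On the **i question, here is a minimal sketch, assuming the Idx<T> from the question. Inside foo, i is a &Idx<A>, so *i is an Idx<A>, which has no Display impl; the formatting macros do not apply deref coercion to their arguments, hence the error. The second * goes through Deref to reach the usize. Deref coercion does apply where a &usize is explicitly expected, such as a typed let or a function argument. The takes_usize helper is hypothetical, and foo returns a value here only so the result can be checked.

```rust
use std::ops::Deref;

pub struct A;

pub struct Idx<T>(usize, T);

impl<T> Deref for Idx<T> {
    type Target = usize;
    fn deref(&self) -> &usize {
        &self.0
    }
}

// Hypothetical helper: a parameter typed `&usize` is a coercion site.
fn takes_usize(n: &usize) -> usize {
    *n
}

fn foo(i: &Idx<A>) -> usize {
    // `*i` has type `Idx<A>` (no Display impl); `**i` is the `usize`.
    let explicit: usize = **i;
    // Deref coercion: `&Idx<A>` becomes `&usize` because the target type is stated.
    let coerced: &usize = i;
    assert_eq!(explicit, *coerced);
    // The same coercion happens when passing `i` to a `&usize` parameter.
    takes_usize(i)
}

fn main() {
    let ai = Idx(3, A);
    println!("foo a: {}", foo(&ai));
}
```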

My approach is to have an Id::index(self) function that returns the usize. You could also implement From<Id> for usize, I just prefer the explicit method.
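
A minimal sketch of the explicit-method approach, assuming a plain newtype. The Id type and the names array are illustrative, not from any real crate.

```rust
// Hypothetical newtype standing in for the `Id` mentioned above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Id(usize);

impl Id {
    /// Explicit accessor: the one place the wrapper hands out its `usize`.
    pub fn index(self) -> usize {
        self.0
    }
}

fn main() {
    let names = ["zero", "one", "two"];
    let id = Id(2);
    // The call site shows exactly where the wrapper is unwrapped.
    println!("{:?} -> {}", id, names[id.index()]);
}
```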

Another note is that rather than having to use a zero-sized type, you can use a std::marker::PhantomData, which will have size 0 regardless of T.
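
A rough sketch of the PhantomData variant, assuming the A and B tag types from the question. The field names, the new constructor, and the index method are my own choices for illustration. With PhantomData the tag never has to be constructed or passed around.

```rust
use std::marker::PhantomData;

// Tag types: used only at the type level, never instantiated.
pub struct A;
pub struct B;

pub struct Idx<T> {
    value: usize,
    _tag: PhantomData<T>, // zero-sized, whatever `T` is
}

impl<T> Idx<T> {
    pub fn new(value: usize) -> Self {
        Idx { value, _tag: PhantomData }
    }

    pub fn index(&self) -> usize {
        self.value
    }
}

fn foo(i: &Idx<A>) -> usize {
    i.index()
}

fn bar(i: &Idx<B>) -> usize {
    i.index()
}

fn main() {
    let ai: Idx<A> = Idx::new(3);
    let bi = Idx::<B>::new(5);
    println!("foo a: {}", foo(&ai));
    println!("bar b: {}", bar(&bi));
    // bar(&ai); // still a type error: expected `&Idx<B>`, found `&Idx<A>`
    // The wrapper is no bigger than the `usize` it holds.
    assert_eq!(std::mem::size_of::<Idx<A>>(), std::mem::size_of::<usize>());
}
```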

Another option is to just make the usize fields public (if you don't also want to restrict the values they can take on).

If they are different, unrelated types, then adding a separate trivial conversion function to both of them doesn't violate DRY. DRY is not about pure surface syntax; instead, it's about not repeating common logic for related use cases when that logic could be abstracted away at some level. It seems like trying to artificially unify the two types using complicated type-tagging would make matters more complicated than necessary, and more complicated than just adding the methods separately.

Thank you all for the great input! Well, no, the two types really are quite the same, especially wrt/ wanting to "dereference" to the underlying usize. Consider them both to be indices, say, into arrays (or even typically into the same array, for that matter, if the illustration helps). And the user would really like to just think of this as a simple wrapper for the index, even using it as such:

   array[i]

rather than

   array[i.0]

BUT, there might be a function foo that does something that only applies to one type. E.g., perhaps it "halves" the value of the index:

fn half(i: &Idx<A>) -> Idx<A> { // I do NOT want a "B" type index to be allowed to do this!
   Idx(i.0 / 2, A)
}
...
my_array[half(z)] = "done!";

As long as z, above, was a "type A index", then half() is a function available to it. Notice that this is the "wishful thinking" code - it would actually have to look like this, unless I can conjure up a way to do it magically:

my_array[half(z).0] = "done!";

Yes, one could simply have half() return a usize, but again, I want the type-wrapped index, as I want it to retain that identity which allows it to associate only with certain functions and not others. There are several ways to do this, but this is the "feel" I'm after. I realize that this interest in the "implicit" irritates the typical "explicit is better" philosophy; that may be the end of the story for most of you. I appreciate that. The comment frsrblch made about "You could also implement From<Id> for usize" is on track -- indeed, an approach like this would be better than the "dereference" idea I had. If, in other words, the Idx<A> could be "cast" (to use the wrong term) to a usize, implicitly, that would be ideal (albeit, perhaps not Rustic). But I actually can't quite figure out how to get From<usize> for Idx or Into<usize> to do this, as they seem more appropriate for the "other direction" - "casting" a usize into the custom type, rather than the other way 'round.
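
For the "other direction", a minimal sketch assuming the Idx<T> and half() from this thread: the impl is written as From<Idx<T>> for usize, that is, the trait is implemented on the target type, and the matching Into<usize> for Idx<T> then comes for free from the standard library's blanket impl. Note that the conversion is still explicit at the call site (.into() or usize::from); Rust does not insert it implicitly.

```rust
pub struct A;

pub struct Idx<T>(usize, T);

// Implemented once for every tag `T`; also provides `Into<usize> for Idx<T>`.
impl<T> From<Idx<T>> for usize {
    fn from(i: Idx<T>) -> usize {
        i.0
    }
}

// Only available to "type A" indices.
fn half(i: &Idx<A>) -> Idx<A> {
    Idx(i.0 / 2, A)
}

fn main() {
    let mut my_array = ["todo"; 8];
    let z = Idx(6, A);

    // The annotation tells `.into()` which target type is wanted.
    let slot: usize = half(&z).into();
    my_array[slot] = "done!";

    // Equivalent, with the target named at the call site.
    my_array[usize::from(z)] = "done!";

    println!("{:?}", my_array);
}
```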

It sounds like you're looking to use the newtype idiom, which really just involves the tuple struct setup you already have, plus adding .0 internally to get the usize value.

You can get rid of the duplication using macros, and you can get rid of the double-star by deriving Copy and passing by value:

macro_rules! newindex {
    ($A:ident) => {
        #[derive(Debug, Copy, Clone, PartialEq, Eq)]
        pub struct $A(usize);
        impl $A {
            pub fn idx(&self) -> usize { self.0 }
        }
        impl std::ops::Deref for $A {
            type Target = usize;
            fn deref(&self) -> &Self::Target {
                &self.0
            }
        }
    }
}

newindex!(A);
newindex!(B);

// impl A { ... more stuff if you need ... }

fn foo(i: A) { println!("{:?}: {}", i, i.idx()); }
fn bar(i: B) { println!("{:?}: {}", i, *i); }

fn main() {
    foo(A(3));
    bar(B(6));
}

@PFaas -- that's right; it's newtype all the way... but for two types that want to be explicitly distinct, and yet share fundamental functionality (originally, deref, but I've been convinced that a kind of "usize cast", as frsrblch suggested, is better). Either way... it's shared behavior.

@duelafn -- ah, macros! Totally uncharted territory for me yet. I've got to take the time to get over my reticence to embrace macros as a real option rather than an ugly cheat. :slight_smile: Your proposal does look pretty straightforward, in many ways... I'll give you that.

I had to edit my last post because I didn't properly escape some bits... then it disappeared on me for approval.

Just because you mentioned indexing arrays with that type, but haven't shown how you do that, you can overload indexing for arrays to take a crate-local type:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c320d7f61bce2a2cedca9bdd8e570b10

#[repr(u8)]
#[derive(Debug, Copy, Clone)]
enum U2 {
    N0 = 0,
    N1,
    N2,
    N3,
}

impl<T> std::ops::Index<U2> for [T; 4] {
    type Output = T;
    fn index(&self, index: U2) -> &T {
        &self[index as usize]
    }
}

fn main() {
    let array = [3, 5, 8, 13];

    for i in [U2::N0, U2::N1, U2::N2, U2::N3] {
        println!("array[{i:?}] == {}", array[i]);
    }
}

this prints

array[N0] == 3
array[N1] == 5
array[N2] == 8
array[N3] == 13

This example intentionally only implements indexing for [T; 4] since that makes the indexing infallible.
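
The same idea can be sketched for the usize newtype discussed earlier in the thread, here assuming a Vec<T> as the container. The A type and its half method are illustrative. Unlike the [T; 4] case above, this indexing can panic when the wrapped value is out of bounds.

```rust
use std::ops::{Index, IndexMut};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct A(usize);

impl A {
    // Only "type A" indices can be halved.
    pub fn half(self) -> A {
        A(self.0 / 2)
    }
}

// Allowed by the orphan rules because `A` is a local type.
impl<T> Index<A> for Vec<T> {
    type Output = T;
    fn index(&self, i: A) -> &T {
        // Delegates to ordinary slice indexing; panics if out of bounds.
        &self.as_slice()[i.0]
    }
}

impl<T> IndexMut<A> for Vec<T> {
    fn index_mut(&mut self, i: A) -> &mut T {
        &mut self.as_mut_slice()[i.0]
    }
}

fn main() {
    let mut my_array = vec!["todo"; 4];
    let z = A(2);
    // No `.0` needed at the call site.
    my_array[z.half()] = "done!";
    println!("{:?}", my_array);
}
```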

Thank you all for the great input. This community is fantastic, and seeing this many angles is super.

I decided to "keep it simple", and keep it "explicit", as we're often inclined to do. So, I just have my two newtype 1-tuple structs, each of which simply wraps a usize, and when it's time to get the usize, I just use the .0. No deref or cast or index implementations. No need, therefore, for shared functionality. Just simple. Usage then requires that you simply know how to use the newtype idiom, and appreciate the reason it's used, here, and this is what the documentation says, too. All buttoned up.
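
Roughly, the shape I ended up with looks like the sketch below. The pub fields and the return values of foo and bar are assumptions made for the sake of a compact, checkable example.

```rust
// Two distinct newtypes over `usize`; the field is public so callers can use `.0`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct A(pub usize);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct B(pub usize);

fn foo(i: A) -> usize {
    i.0
}

fn bar(i: B) -> usize {
    i.0
}

fn main() {
    let array = [3, 5, 8, 13];
    let a = A(1);
    let b = B(3);

    println!("foo: {}", array[foo(a)]);
    println!("bar: {}", array[bar(b)]);
    // foo(b); // does not compile: expected `A`, found `B`

    // Plain `.0` whenever the raw index is needed.
    println!("direct: {}", array[a.0]);
}
```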
