How to get the type name from `TypeId`?

It's unsound in the case of collisions, thus not mitigating such collisions is a compiler bug.

The fix will break code bases making assumptions about TypeId (e.g. transmutability).


I'll come in with some advice from the "I have no idea what you guys are talking about" side.

How many types do you have possible to check for? All types? Or is there some subset, ten or twenty types you might have to test for?
Maybe you can make an enum of the types, "MyTypeEnum", and a vector of tuples (MyTypeEnum, type_name, TypeId)?
Hardcode a list of the types and push a tuple into the vector for each type.

Then later, when you need to find the type_name from a TypeId, you search your vector for the tuple with the matching TypeId.
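A minimal sketch of that lookup table, assuming only three known types. The names `entry`, `registry` and `name_of`, and the variants of `MyTypeEnum`, are made up for illustration:

```rust
use std::any::{type_name, TypeId};

/// Hypothetical enum listing the handful of types we expect to see.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MyTypeEnum {
    Int,
    Text,
    Flag,
}

/// Builds one registry row for `T`.
fn entry<T: 'static>(kind: MyTypeEnum) -> (MyTypeEnum, &'static str, TypeId) {
    (kind, type_name::<T>(), TypeId::of::<T>())
}

/// Hardcoded list of the known types.
fn registry() -> Vec<(MyTypeEnum, &'static str, TypeId)> {
    vec![
        entry::<i32>(MyTypeEnum::Int),
        entry::<String>(MyTypeEnum::Text),
        entry::<bool>(MyTypeEnum::Flag),
    ]
}

/// Linear search for the row with a matching `TypeId`.
fn name_of(id: TypeId) -> Option<&'static str> {
    registry()
        .into_iter()
        .find(|(_, _, known)| *known == id)
        .map(|(_, name, _)| name)
}

fn main() {
    println!("{:?}", name_of(TypeId::of::<bool>()));
    // A type that was never registered yields `None`.
    println!("{:?}", name_of(TypeId::of::<f64>()));
}
```

This only works for types listed up front; anything else falls through to `None`.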


Ok... looks like I didn't describe my use case clearly enough.

I'm writing a function accepting a Box<dyn Any>. It checks whether it's the expected type. If not, i.e., the downcast failed, it'll return an error. I want the error message to be useful to human readers.

So I should probably store the type name before it becomes a Box<dyn Any>. Thanks for your advice!
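One way to store the name before erasure could look like the sketch below. `NamedBox` and `expect` are hypothetical names, and the error is a plain `String` only to keep the example short:

```rust
use std::any::{type_name, Any};

/// Hypothetical wrapper: a type-erased value plus the name recorded at erasure time.
struct NamedBox {
    name: &'static str,
    value: Box<dyn Any>,
}

impl NamedBox {
    fn new<T: Any>(value: T) -> Self {
        // `T` is still known here, so its name can be captured.
        NamedBox { name: type_name::<T>(), value: Box::new(value) }
    }
}

/// Downcasts to `T`, or reports both type names in the error.
fn expect<T: Any>(input: NamedBox) -> Result<Box<T>, String> {
    let NamedBox { name, value } = input;
    value
        .downcast::<T>()
        .map_err(|_| format!("expected `{}`, got `{}`", type_name::<T>(), name))
}

fn main() {
    match expect::<String>(NamedBox::new(42i32)) {
        Ok(s) => println!("got {s}"),
        Err(e) => println!("error: {e}"),
    }
}
```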

Great links! From your link, there are already two issues for this in rust-lang/rust: "Intrinsic for `type_name_of_id` to power a better `impl Debug for TypeId`?" (Issue #61533) and "Expose `type_name()` method on Any." (Issue #68379). I should've searched more thoroughly. :sweat_smile:

I solved this by making a trait NamedAny and making my API accept Box<dyn NamedAny>.

use std::any::Any;

trait NamedAny: Any {
    fn type_name(&self) -> &'static str;
}

impl<T: Any> NamedAny for T {
    fn type_name(&self) -> &'static str {
        std::any::type_name::<T>()
    }
}
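To show how an API taking Box<dyn NamedAny> might build its error message, here is a sketch that extends the trait with an `into_any` method. That method and the `take_u32` function are assumptions added for illustration, not part of the code above:

```rust
use std::any::Any;

trait NamedAny: Any {
    fn type_name(&self) -> &'static str;
    /// Hypothetical helper: converts back to `Box<dyn Any>` so it can be downcast.
    fn into_any(self: Box<Self>) -> Box<dyn Any>;
}

impl<T: Any> NamedAny for T {
    fn type_name(&self) -> &'static str {
        std::any::type_name::<T>()
    }
    fn into_any(self: Box<Self>) -> Box<dyn Any> {
        self
    }
}

/// Hypothetical API entry point that expects a `u32`.
fn take_u32(input: Box<dyn NamedAny>) -> Result<u32, String> {
    // Deref through the Box so the name is that of the boxed value, not of the Box.
    let actual = (*input).type_name();
    input
        .into_any()
        .downcast::<u32>()
        .map(|b| *b)
        .map_err(|_| format!("expected `u32`, got `{actual}`"))
}

fn main() {
    println!("{:?}", take_u32(Box::new(7u32)));
    println!("{:?}", take_u32(Box::new(false)));
}
```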


Also try these two compromises. Don't be rude on the Rust forum; if you are not satisfied with Rust, you can go to another language. I believe you will face these problems there too: either allocate memory dynamically or specify the size of the memory manually.

fn copy<T: Copy>(input: &T) -> Vec<u8> {
    let vec = Vec::with_capacity(std::mem::size_of::<T>());
    ...
    vec
}

fn copy2<T: Copy>(input: &T, buffer: &mut [u8]) {
    assert_eq!(buffer.len(), std::mem::size_of::<T>());
    ...
}

Granted, TypeId is repr(Rust), so this isn't a sound assumption to make. It's even specifically documented to be opaque:

Each TypeId is an opaque object which does not allow inspection of what’s inside but does allow basic operations such as cloning, comparison, printing, and showing.

Considering that, I find it a bit strange that it derives Ord and Debug.

Edit: Why I find deriving Debug strange

No, you haven't :thinking:

fn main() {
    let vec: Vec<Box<dyn NamedAny>> = vec![Box::new(0i32), Box::new(false), Box::new(Some(""))];
    println!("{}", vec[0].type_name());
}

alloc::boxed::Box<dyn playground::NamedAny>

It's impossible to have Rust automatically infer the name at runtime. type_name works at compile time, somewhat like a macro.

Stating facts is not being rude.
Writing efficient code (especially const fn code) requires type manipulation. Lacking such features makes the language weak, and that should be stated plainly. Using nice words or inefficient workarounds doesn't change reality.

I know that my complaints will change nothing, but that doesn't mean I cannot state obvious weak points of Rust.

Good type manipulation can, for example, let you implement a simple type id yourself in a much simpler way than rustc probably does it:

fn inner_type_id<T>(input: &T) -> usize {
    input as *const _ as usize
}

fn type_id<T>() -> usize {
    inner_type_id::<T> as *const () as usize
}

P.S. It can obviously break due to optimizations, but only in very rare cases (ZSTs).

Technically, I agree with you.
Templates are limited, and I often find myself in a limited position.
(But, often, my target solution is also way too complex)

But: Please try to sound nice to humans.

Meaning

  • you can say rust is weak
  • you cannot say Somebody is weak

I think you need (*vec[0]).type_name().

fn main() {
    let vec: Vec<Box<dyn NamedAny>> = vec![Box::new(0i32), Box::new(false), Box::new(Some(""))];
    println!("{}", (*vec[0]).type_name());
}

i32
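The difference comes from method resolution: the blanket impl covers every `T: Any`, and Box<dyn NamedAny> is itself such a type, so calling the method on the Box by auto-ref picks the impl for the Box. A sketch that makes both paths explicit (the helper `name_of` is a made-up name):

```rust
use std::any::Any;

trait NamedAny: Any {
    fn type_name(&self) -> &'static str;
}

impl<T: Any> NamedAny for T {
    fn type_name(&self) -> &'static str {
        std::any::type_name::<T>()
    }
}

/// Taking `&dyn NamedAny` forces dynamic dispatch to the boxed value.
fn name_of(value: &dyn NamedAny) -> &'static str {
    value.type_name()
}

fn main() {
    let boxed: Box<dyn NamedAny> = Box::new(0i32);
    // Auto-ref selects the blanket impl for `Box<dyn NamedAny>` itself.
    println!("{}", boxed.type_name());
    // An explicit deref reaches the `i32` inside.
    println!("{}", name_of(&*boxed));
}
```

Passing `&*boxed` to a function that takes `&dyn NamedAny` avoids having to remember the `(*x)` at every call site.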


Interesting, my fault. :thinking:
Indeed, your problem may already be settled.

I'm struggling to see the use-case. A failed downcast seems like a bug to be reported to the developer rather than an error to be reported to the user. Do you mind sharing?

Maybe I have misunderstood what you meant, but being rude is not limited to using aggressive words; it also includes a contemptuous manner. I hope everyone will just discuss the problem itself when they face it, rather than judging others' efforts or whether a language is good or not.
Your requirement is certainly worth supporting, but it takes time for Rust, which is still under development, and there are many more vital things to be implemented. You should not evaluate it from the perspective of a completely mature language. If a feature is not implemented, just say so, but do not judge the language by it.

In addition, your example can keep type ids from conflicting, but it has overhead. Certainly, you can implement it yourself using traits.


Of course. I'm writing an async task dependency management framework.

Framework motivation

The motivation for this framework is that our async task dependencies often form a directed acyclic graph (DAG), with the nodes being the tasks and the edges pointing from the depended task to the dependent task.

The DAG is often generated "top-down", meaning that we're most interested in some final outputs, but by constructing the Futures that'll yield these final outputs, we recursively discover many layers of interconnected dependent Futures. To me, this cannot be easily expressed using .await and friends (join!, select!, etc.).

A primary motivation and use case for this framework is loading the glTF 3D model format.

The task API looks like this:

use std::any::Any;
use std::future::Future;

/// This is an async task and the `node` of the dependency graph.
trait Task {
    type Fut: Future<Output = Box<dyn Any>>;
    fn set_input(&mut self, index: u8, value: Box<dyn Any>);
    fn call(self) -> Self::Fut;
}

Other APIs that use Task:

/// This is implemented for *FnOnce*.
trait IntoTask {
    type Task: Task;
    fn into_task(self) -> Self::Task;
}

/// This is a DAG describing task dependencies.
struct Graph { ... }

/// The `Graph` API
impl Graph {
    /// `NodeIndex` is the identifier for the task.
    pub fn add_task(&mut self, task: impl IntoTask) -> NodeIndex { ... }
    /// This method connects the output of `parent` to the `index`th input of `child`.
    pub fn add_dependency(
        &mut self,
        parent: NodeIndex,
        child: NodeIndex,
        index: u8
    ) -> Result<(), TypeMismatchError> { ... }
    /// The scheduling algorithm.
    /// It concurrently runs all tasks that have all their inputs ready.
    /// When any one of the tasks is done (`select_all` style),
    /// it checks whether any dependent task has become ready and runs it if so.
    pub fn run(&mut self) { ... }
}

The Task trait cannot be generic because Box<dyn Task>s will be stored in a container, so it has to use Box<dyn Any> for its inputs and output.

Users of this framework are responsible for ensuring the dependencies are set correctly in the sense that their types match. If not, the API will return an error with the details of the mismatched types.
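The type check behind `add_dependency` could be sketched roughly as below. Only the name `TypeMismatchError` comes from the API above; its fields, `PortType` and `check_edge` are assumptions made for illustration:

```rust
use std::any::{type_name, TypeId};
use std::fmt;

/// Hypothetical record of a task port's type, captured while `T` is still known.
#[derive(Debug, Clone, Copy)]
struct PortType {
    id: TypeId,
    name: &'static str,
}

impl PortType {
    fn of<T: 'static>() -> Self {
        PortType { id: TypeId::of::<T>(), name: type_name::<T>() }
    }
}

/// Error naming both sides of a mismatched edge (the fields are assumptions).
#[derive(Debug)]
struct TypeMismatchError {
    expected: &'static str,
    actual: &'static str,
}

impl fmt::Display for TypeMismatchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "child input expects `{}`, but parent outputs `{}`",
            self.expected, self.actual
        )
    }
}

impl std::error::Error for TypeMismatchError {}

/// Compares by `TypeId`, but reports by name.
fn check_edge(parent_output: PortType, child_input: PortType) -> Result<(), TypeMismatchError> {
    if parent_output.id == child_input.id {
        Ok(())
    } else {
        Err(TypeMismatchError { expected: child_input.name, actual: parent_output.name })
    }
}

fn main() {
    if let Err(e) = check_edge(PortType::of::<u32>(), PortType::of::<String>()) {
        println!("{e}");
    }
}
```

Comparing `TypeId`s rather than names keeps the check independent of how `type_name` happens to format a type.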


Yeah, I think the breakage is fine. But it does slow the fix down.

As for the implementations, they allow things like

  • using it as a hash key
  • deriving debug in structs that hold them
  • comparing types by eye when printing them out

Note that debug output is also explicitly unstable.
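As a small illustration of the first two points, here is a sketch with made-up names (`Slot`, `known_names`). It deliberately does not rely on what the Debug output of a TypeId looks like:

```rust
use std::any::TypeId;
use std::collections::HashMap;

/// `#[derive(Debug)]` only compiles here because `TypeId` implements `Debug`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Slot {
    index: u8,
    ty: TypeId,
}

/// `TypeId` implements `Hash + Eq`, so it can key a map directly.
fn known_names() -> HashMap<TypeId, &'static str> {
    let mut names = HashMap::new();
    names.insert(TypeId::of::<u32>(), "u32");
    names.insert(TypeId::of::<String>(), "String");
    names
}

fn main() {
    let names = known_names();
    let slot = Slot { index: 0, ty: TypeId::of::<u32>() };
    // The exact Debug text of the `TypeId` field is unspecified.
    println!("{:?} holds {:?}", slot, names.get(&slot.ty));
}
```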


This ("only in rare cases", let alone that ZST aren't rare in Rust) isn't even true. In the face of multiple compilation units (e.g. multiple crates), you can easily have multiple distinct inner_type_id::<T>. Furthermore, function deduplication means that all T with the same size can get the same address.

This isn't just a limitation of Rust either; the behavior would be the same with C++ templates. And as noted in the threads around TypeId, even C++ toolchains implement C++ RTTI by embedding the type name and comparing that, rather than some more clever scheme using the linker to guarantee both uniqueness between types and unification within a single type.

When a problem seems much simpler than a group of people are claiming it is, the first question should be how your definition of the problem differs from theirs. Sometimes there truly is an overlooked solution. More often, they've considered your simple solution and discarded it because it doesn't fulfill some restrictions you haven't yet considered. And sometimes your simple solution isn't so simple after all, once you dig into the gritty details of actually implementing it.


This ("only in rare cases", let alone that ZST aren't rare in Rust) isn't even true. In the face of multiple compilation units (e.g. multiple crates), you can easily have multiple distinct inner_type_id::<T>.

OK, you got me there. I didn't really investigate it much; it is just something I came up with when I wanted a const fn type id. But obviously a function pointer address isn't the best way to go about it, as it is not actually a regular pointer.
In any case, I wanted to illustrate that the ability to manipulate types gives you a great toolset and is a must-have feature for metaprogramming.

Last time I checked, a simple hack like that worked just fine: [C++] gcc 12.1.0 - Wandbox

https://github.com/rust-lang/rust/pull/95845#issuecomment-1094190368

What about the case when two different crates use the TypeId of a common std type, like TypeId::of::<String>()? If we cannot guarantee deduplication, the hash will be equal, the pointers will be different, but the pointer contents will be equal.

Right, that's what this part of the description is about:

However, to get global deduplication (across all crates/CGUs), we'd have to start using certain linker features (discussed in e.g. #75923), which I've left out of this PR.

I wanted to separate the correctness aspect from the "linker features that we have to be careful about" part - after all, I'm not aware of the exact thing we need being supported anywhere (anonymous symbol with contents-based deduplication).

So we would have to find a way to (hopefully) avoid mangling the same type twice, to get both the contents and the symbol name, and then also have a bunch of decision logic for which exact LLVM options to turn on for those constant globals (e.g. linkonce_odr).

I'd be more inclined to look into this sooner if I saw some non-artificial benchmarks showing a measurable impact in the scope of a larger program. I'm not saying there can't be a slowdown without the deduplication, but I can't think of a scenario where it's disastrous.

Frankly, the precedents in the C++ world (i.e. their RTTI impls) tell me two things:

  • relying entirely on a hash is a gamble, one they weren't willing to play at
  • they had decades to fix platform toolchains to get guaranteed deduplication but haven't

It's important that any solution baked into the language work on all targets, not just mainstream ones.

Anyway, I don't have anything more to add here.

What are you trying to get done, what do you need this function for? The function as you describe it would be UB, because of padding bytes.
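To make the padding concern concrete, here is a sketch that only measures sizes and never reads the padding itself. `Padded` and `padding_bytes` are hypothetical, and the numbers in the comments assume a typical target where `u32` is 4-byte aligned:

```rust
use std::mem::{align_of, size_of};

/// Hypothetical struct; `repr(C)` makes the layout predictable for this demo.
#[repr(C)]
#[derive(Clone, Copy)]
struct Padded {
    tag: u8,
    value: u32,
}

/// Bytes of `T` not accounted for by the given field sizes.
fn padding_bytes<T>(field_sizes: &[usize]) -> usize {
    size_of::<T>() - field_sizes.iter().sum::<usize>()
}

fn main() {
    // 1 + 4 bytes of fields, but the struct is rounded up to its alignment.
    let pad = padding_bytes::<Padded>(&[size_of::<u8>(), size_of::<u32>()]);
    println!(
        "size = {}, align = {}, padding = {}",
        size_of::<Padded>(),
        align_of::<Padded>(),
        pad
    );
}
```

Those leftover bytes are uninitialized, so viewing the whole value as initialized `u8`s is what makes a generic byte copy for any `T: Copy` undefined behavior.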
