Why do LLMs stubbornly insist {:p} doesn't display metadata for slices?

I have the code `fn main() { println!("{:p}", "hi"); }`. It produced the output `Pointer { addr: 0xad0415aa148, metadata: 2 }`. But LLMs, such as Gemini 3 or ChatGPT 5.1, stubbornly insist that metadata won't be displayed even when we are dealing with fat pointers. Gemini 3, in particular, keeps insisting this remains true even with the latest compiler versions, such as rustc 1.91.x, 1.89, etc. How can I prove to various LLMs that nowadays, metadata is indeed displayed for fat pointers in the code I just showed you?

nobody knows why llms do anything

20 Likes

That phrase makes absolutely no sense, sorry. LLMs are stochastic parrots; you can't ever prove anything to them… what do you really want to achieve?

11 Likes

This is a fairly recent (v1.87, May 2025) change, so I'm not surprised that LLMs haven't been trained on this. Maybe point your LLM to the implementing PR?

Out of curiosity though, why does it matter what the LLM says, if you know the truth? I.e. why argue with it?
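If you just want the old, address-only output regardless of compiler version, one option is to print the thin data pointer instead of the fat reference. A small sketch (the exact fat-pointer output depends on your rustc version, so only the second line is stable across versions):

```rust
fn main() {
    let s: &str = "hi";

    // Fat pointer (&str = address + length). On rustc >= 1.87 this may
    // print something like `Pointer { addr: 0x..., metadata: 2 }`.
    println!("{:p}", s);

    // Thin data pointer: prints just the address on any compiler version.
    println!("{:p}", s.as_ptr());
}
```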

11 Likes

That's not really true. LLMs are T9 on serious steroids. They just predict what you probably want to read. You can affect their predictions in various, pretty useful ways, but you can't actually teach them anything.

The process by which an LLM is actually "taught" anything is called "training"; it's very expensive, costs millions or billions of dollars, and mere mortals currently can't do it.

3 Likes

Sounds like you need to steer the LLM conversation away from "do you expect it to do this?" (given its training cutoff date, no, it does not expect it) toward a knowledge-sharing preface to your main question: "I ran this code and it output this…; help me solve problem XYZ with that in mind."

Some tools let you set up context for all conversations, so adding something like this there would help reduce repetition in each chat.

Of course, your mileage will vary. Not all LLMs are equal, and even conversations with the same model can vary wildly in some cases.

2 Likes

I thought the Debug thing had to do with `{:?}`, not with `{:p}`. Where does the PR mention that both `{:?}` and `{:p}` were affected for printing purposes?

The implementation for `{:p}` is right there:

Almost all other `Pointer` implementations (`Pointer` being the trait that gets called when the `{:p}` formatter is encountered), e.g. those for `&T`, `&mut T`, or `Box<T>`, forward to this implementation by casting themselves to `*const T`.
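A small sketch of that forwarding, assuming a recent stable rustc: since the `Pointer` impls for `&T` and `Box<T>` cast down to `*const T`, the reference below renders identically to the raw pointer it points with.

```rust
fn main() {
    let x = 5u8;
    let r: &u8 = &x;
    let raw: *const u8 = r;

    // &T's Pointer impl casts to *const T, so both render identically.
    assert_eq!(format!("{:p}", r), format!("{:p}", raw));

    // Box<T> likewise prints the address of its heap allocation.
    let b: Box<u8> = Box::new(7);
    println!("{:p}", b);
}
```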

2 Likes

That would require retraining the model. More likely, it can only be done by employees of the companies mentioned.

Okay.

Hi, isn't this highlighting the difference between state machines and LLMs? "State in AI agents refers to persistent and transient data that facilitates meaningful interactions"; further, "Context, a crucial aspect of state, represents the immediate and historical information relevant." To which: "Context can be inferred from multiple sources such as metadata." As neural networks in humans are 8-dimensional, this may be a 4-dimensional context, the current user issue.

I dunno, ask the vendor of your LLM?

It seems to me that the title of the PR is a bit misleading. It says "Debug" in the title, but the code that was added as a result of the PR is:

```rust
impl<T: ?Sized> Pointer for *const T {
    fn fmt(&self, f: &mut Formatter<'_>) -> Result {
        if <<T as core::ptr::Pointee>::Metadata as core::unit::IsUnit>::is_unit() {
            pointer_fmt_inner(self.expose_provenance(), f)
        } else {
            f.debug_struct("Pointer")
                .field_with("addr", |f| pointer_fmt_inner(self.expose_provenance(), f))
                .field("metadata", &core::ptr::metadata(*self))
```

`impl Pointer` and `impl Debug` are not the same thing.

No, but `Pointer` underlies the `Debug` implementation for raw pointers, which is what the PR author was concerned about. It just so happened that this PR also changed the `{:p}` behaviour of other pointer types.
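You can check that delegation directly: `Debug` for raw pointers calls through to `Pointer::fmt`, so `{:?}` and `{:p}` agree for them on any compiler version (old or new output format alike). A quick sketch:

```rust
fn main() {
    let s: &str = "hi";
    let raw: *const str = s;

    // Debug for raw pointers delegates to Pointer, so these two match
    // whether or not the metadata-printing change is in your toolchain.
    assert_eq!(format!("{:?}", raw), format!("{:p}", raw));
    println!("{:?}", raw);
}
```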

LLMs don't understand source code; they have no internal model, internal state, nor any induction engine. They just regurgitate what has been merged into their neural network, trying to predict the next symbols of the answer based on that network and what has already been written so far. They're not meant to generate source code that requires any amount of thinking.

In this case, chances are there weren't enough examples of that specific use case for the different compiler versions in the training material, and the weights related to the metadata text were too low for it to be part of the answer.

Don't rely on LLM chatbots to write code.

6 Likes

Those are contradictory. To predict something better than a uniformly random generator, you need some model. The model may be almost trivial (as in Markov chains) or severely limited, but it's still a model.
And considering that lots of commercial CoT models don't show their internal reasoning, that can be considered internal state. And there's stuff like thinking tokens, too.

So while there are lots of negative things that can be said about LLMs or how they get used, drawing caricatures does not make a convincing argument.

Most ironically, the phrase "stochastic parrot" does get repeated... with high probability in these conversations. :wink:
More seriously, the RL applied on top of base models means they're optimized towards goals beyond bare next-token prediction.

4 Likes

Of course they do have a model! It's right there, in the name: large language model.

They know how a "professor" talks, how a "bad guy" talks, they know how emotions sound… they can perfectly imitate the cues that criminalists use to glean an emotional state from your text.

What LLMs don't have is a world model. They cannot connect facts together; they can't follow the rules of even a trivial game (like tic-tac-toe), even when they can recite those same facts and those same rules.

Many of them can show you "the reasoning"… except what they think and what they show you about what they think are different things (that's true for humans, too, so it's not an accusation, just an explanation of why "reasoning models" don't improve things radically: they are still language models, not world models).

They are still optimized toward that, just with more diverse rules about what “nice” output would be.

Well… "T9 on steroids" is also apt. The core thing: there is no world model that can be altered, like humans have. It's not uncommon for humans to open the box of a board game, read the rules once, and then play that game. With 3-4 people there will be regular mistakes, but people notice and fix them… LLMs utterly fail to do that. They even fail with open-information games like chess; but ask them to show a poker game, and you'll see every player knowing every card, because something a human can do easily (limiting the knowledge of a few imaginary players) is fantastically hard for an LLM: it's all one sequence of tokens to an LLM, with no separation between them.

1 Like

Interestingly, one cannot request that ChatGPT show its chain of thought, even though, as they show there, it can be useful for giving more context to its errors. It seems some models do; I have not tried any yet (I would assume Anthropic does, after that research).


Oh, that's what @the8472 said as well.

Though I now think I misinterpreted the link. It shows models hiding info in the CoT.

PS: I would still argue they should show this optionally.

You can steer the LLM-generated output, not a "conversation". There is no conversation. It's like saying you have a conversation with water when you use a stick to draw a path through it. To have a conversation, you need understanding and a desire to communicate something; an LLM has neither.

4 Likes

That's not entirely true. Fiction books definitely contain conversations, even if the only "understanding" and "wanting" exist in the head of the book's author.

The same with LLMs: they can produce conversations, even if the only "understanding" and "wanting" exist in the prompts they receive. IOW: you can't talk with an LLM, but you may still have a dialogue generated with the help of an LLM.