Plenty of questions about the Reference/Language

Would love to get some more clarity on them!

  1. Path patterns
Question

Qualified path patterns can only refer to associated constants.

I can put ::A in a match - doesn't it count as a qualified path pattern?
As stated here - Paths - The Rust Reference

QualifiedPathType :
< Type (as TypePath)? >

"as" is optional here

  2. Enumerated types
Question

Enum types cannot be denoted structurally as types, but must be denoted by named reference to an enum item.

Does that mean there's no single "enum" type, but rather every defined enum is different? Isn't that the case with structs? I'm not sure if I understand it.

  3. Function pointer types
Question

Function pointer types, written using the fn keyword, refer to a function whose identity is not necessarily known at compile-time. They can be created via a coercion from both function items and non-capturing closures.

The type of fn items and non-capturing closures is typically known at compile time, right? Is there any example where this isn't the case? (Maybe receiving such a closure through FFI - is that possible?)

  4. The no_mangle attribute
Question

Additionally, the item will be publicly exported from the produced library or object file, similar to the used attribute

Will the "no_mangle" and "used" attributes publicly export a private item?

  5. Special types and traits
Question

A trait may be implemented for Box<T> in the same crate as T, which the orphan rules prevent for other generic types.

Shouldn't this be documented for other types (like Pin<T>) too?
As stated here - Glossary - The Rust Reference
Any time a type T is considered local, &T, &mut T, Box<T>, and Pin<T> are also considered local.

  6. Type Layout - Primitive Representation of Enums With Fields
Question

Note: This representation is unchanged if the tag is given its own member in the union, should that make manipulation more clear for you

I get what primitive repr of enums with fields looks like, but I don't understand this note.

  7. Trait and lifetime bounds
Question

Bounds that don't use the item's parameters or higher-ranked lifetimes are checked when the item is defined. It is an error for such a bound to be false

Can you provide an example where a higher-ranked lifetime bound is false?

  8. Subtyping and Variance
Question

If variance in T means "we can initialize &'short T with &'long T where 'long: 'short"
then what does variance in 'a (lifetime, not type) mean? (I'm referring to the variance table)

  9. Behavior considered undefined
Question

When a reference (but not a Box!) is passed to a function, it is live at least as long as that function call, again except if the &T contains an UnsafeCell<U>.

9a. Not a Box, because it can be dropped inside the function, right? Also, why not in the case of &UnsafeCell? All we can do with it is get a &mut (or *mut), so it should behave the same as a mutable reference.

All this also applies when values of these types are passed in a (nested) field of a compound type, but not behind pointer indirections

9b. Passed to a function or passed wherever?

primitive operation

9c. What is a primitive operation? Does arithmetic operation count as a primitive one?

If the size is 0, then the pointer must either point inside of a live allocation (including pointing just after the last byte of the allocation), or it must be directly constructed from a non-zero integer literal

9d. Does the manually constructed one have to point inside of a live allocation too? Also, why a non-zero integer? Is the 0 memory address always invalid?

  10. Destructors - Temporary scopes
Question

The second operand of a lazy boolean expression.

Can't the first operand be a temporary scope too? Like in this example:
(PrintOnDrop("first operand").0 == "" || PrintOnDrop("second operand").0 == "");

It means that

enum Foo {
    Int(i64),
    Float(f64),
}

and the same enum except called Bar would be different types. In other words, enums are not union types.

Structs are also nominally-typed, and not structural: two structs resulting from distinct type definitions but having the same fields (w.r.t. names and types) are not the same type.

IOW, all user-defined types are nominal and not structural. I'm not sure what the problem is – why couldn't this be true for enums and structs simultaneously?
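To make the nominal-typing point concrete, here is a minimal sketch. The second enum `Bar` and the helper `foo_to_bar` are hypothetical names introduced only for illustration. Even though both enums have identical variants, a value of one cannot be used as the other without an explicit conversion.

```rust
#[derive(Debug, PartialEq)]
enum Foo {
    Int(i64),
    Float(f64),
}

// Same variants as `Foo`, but a distinct type because it is a distinct definition.
#[derive(Debug, PartialEq)]
enum Bar {
    Int(i64),
    Float(f64),
}

// An explicit, variant-by-variant conversion is needed.
// Writing `let bar: Bar = Foo::Int(1);` would be rejected with a
// "mismatched types" error.
fn foo_to_bar(foo: Foo) -> Bar {
    match foo {
        Foo::Int(i) => Bar::Int(i),
        Foo::Float(f) => Bar::Float(f),
    }
}
```

The same would hold for two structs with identical fields.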

It doesn't say the "type" is not known at compile-time. It says the identity is not known at compile time (i.e., which one of a set of identically-typed functions you are calling).

There's no way to have values of which the type is not known at compile-time; Rust is statically typed.
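A small sketch of the difference between type and identity, assuming hypothetical helpers `double`, `pick`, and `apply`. The type `fn(i32) -> i32` is fixed at compile time, but which function a given pointer refers to depends on a runtime value.

```rust
fn double(x: i32) -> i32 {
    x * 2
}

// The return type is statically known: `fn(i32) -> i32`.
// The identity of the returned function depends on `flag` at runtime.
fn pick(flag: bool) -> fn(i32) -> i32 {
    if flag {
        double // function item, coerced to a function pointer
    } else {
        |x: i32| x + 1 // non-capturing closure, coerced to a function pointer
    }
}

// `apply` cannot know at compile time which function `f` points to.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

// Example: `apply(pick(true), 5)` evaluates to 10, `apply(pick(false), 5)` to 6.
```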

It states the obvious: in that example, one could have added a separate field in the union MyEnumRepr for the tag, which, by definition of a repr(C) union, doesn't change the layout:

#[repr(C)]
union MyEnumRepr {
    tag: u8,
    A: MyVariantA,
    B: MyVariantB,
    C: MyVariantC,
    D: MyVariantD,
}
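As a rough, self-contained check of that claim, the sketch below uses hypothetical stand-ins (`Tag`, `VariantA`, `VariantB`) rather than the Reference's actual types. Each variant struct already starts with the tag, so adding a separate `tag` member to the `repr(C)` union should leave size and alignment unchanged.

```rust
use std::mem::{align_of, size_of};

#[repr(u8)]
#[derive(Clone, Copy)]
enum Tag {
    A,
    B,
}

// Each variant struct begins with the tag, as in the Reference's layout.
#[repr(C)]
#[derive(Clone, Copy)]
struct VariantA {
    tag: Tag,
    value: u32,
}

#[repr(C)]
#[derive(Clone, Copy)]
struct VariantB {
    tag: Tag,
    x: u8,
    y: u8,
}

#[repr(C)]
union WithoutTag {
    a: VariantA,
    b: VariantB,
}

// Same union, plus a member that exposes the tag on its own.
#[repr(C)]
union WithTag {
    tag: Tag,
    a: VariantA,
    b: VariantB,
}

fn layouts_match() -> bool {
    size_of::<WithoutTag>() == size_of::<WithTag>()
        && align_of::<WithoutTag>() == align_of::<WithTag>()
}
```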

Playground

trait Bar<'a> {}

struct Foo;

struct Qux
where
    Foo: for<'a> Bar<'a>
{
    
}

Variance is only applicable to lifetimes, since lifetimes are the only thing that subtyping applies to. Variance "in a type" really means variance in the lifetime(s) of the type:

  • When we say "a shared reference is covariant in the referent type", then it means &'a T<'long> is a subtype of &'a T<'short>.
  • When we say "a shared reference is covariant in its lifetime", we mean &'long U is a subtype of &'short U.

No, the whole point of UnsafeCell is that it's an interior mutability primitive. You can't have two (independent, non-reborrowed) &mut references to the same place, but you can have two independent (shared or raw) pointers to an UnsafeCell, and use both to mutate the same place. It's exactly the act of wrapping into an UnsafeCell that tells the compiler not to optimize based on the assumption of no mutable aliasing.

It doesn't matter. The point of that paragraph is that if you produce or even look at an invalid value, in absolutely any way whatsoever, that's instant UB.

"Passing" is terminology specific to function calls, what else do you have in mind?

No. It can't point to any allocation, exactly because it has zero size. Memory regions (and "contains"/"overlaps" semantics) in Rust aren't defined in terms of the address only, but the whole region (address + length).

Address 0 is the null pointer. It's valid for a raw pointer to be null, it's just not valid for references and boxes. I assume the intention of the documentation's author was to show you how to manually construct a valid 0-sized reference.

I'm not sure what you are asking here. It might be that you are confusing "temporary scope" with "place where it's legal to create temporaries"?

A qualified path pattern is a pattern that starts with <..>::.
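For the case the Reference sentence clearly covers, here is a minimal sketch with a hypothetical type `Limits` and function `classify`: patterns that start with `<Type>::` and name associated constants.

```rust
struct Limits;

impl Limits {
    const MIN: i32 = 0;
    const MAX: i32 = 100;
}

fn classify(value: i32) -> &'static str {
    match value {
        // Qualified path patterns referring to associated constants.
        <Limits>::MIN => "min",
        <Limits>::MAX => "max",
        _ => "other",
    }
}

// Example: `classify(0)` is "min", `classify(100)` is "max", `classify(7)` is "other".
```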

Those are compiler defined types, so yes.

It is documented, you just linked it. The other page could be updated too, sure.
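A short sketch of what that rule permits, using a hypothetical local type `Meters` and the foreign trait `std::ops::Neg` (chosen here only because the standard library has no blanket impl of it for `Box`):

```rust
use std::ops::Neg;

// A type local to this crate.
struct Meters(i32);

// Allowed: `Neg` is a foreign trait, but `Box<Meters>` counts as local
// because `Meters` is local.
impl Neg for Box<Meters> {
    type Output = Box<Meters>;

    fn neg(self) -> Self::Output {
        Box::new(Meters(-self.0))
    }
}

// Not allowed: `Vec<Meters>` is not considered local, so the impl below
// would be rejected by the orphan rules.
// impl Neg for Vec<Meters> { /* ... */ }
```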

// Compiles on its own
//             vvvvvvv higher-ranked trait bound (HRTB)
fn foo() where for<'a> String: Copy {}

fn main() {
    // error to call it (this is where the HRTB is checked)
    foo();
}

If 'long: 'short, then if 'x in type T<... 'x ...> is

  • covariant, T<... 'long ...> coerces to T<... 'short ...>
  • contravariant, T<... 'short ...> coerces to T<... 'long ...>
  • invariant, T<... 'x ...> can only coerce to T<... 'x ...>

(In all cases "coerce" is referring to supertype coercion.)

So for example, &'a mut T is covariant in 'a but invariant in T. So &'a mut &'b U is covariant in 'a but invariant in 'b.
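The rules above can be sketched in code. The function names below are hypothetical. The first two functions compile because of covariance in the lifetime; the commented-out one is the kind of conversion that invariance in `T` rules out.

```rust
// Covariant in the lifetime: `&'long str` is accepted where `&'short str` is expected.
fn shorten<'short, 'long: 'short>(s: &'long str) -> &'short str {
    s
}

// `&'a mut T` is also covariant in `'a`: the outer borrow may be shortened.
fn shorten_mut<'short, 'long: 'short>(r: &'long mut i32) -> &'short mut i32 {
    r
}

// `&'a mut T` is invariant in `T`, so the inner lifetime cannot be shortened.
// Uncommenting this is expected to fail with a lifetime error:
// fn shorten_inner<'a, 'short, 'long: 'short>(
//     r: &'a mut &'long str,
// ) -> &'a mut &'short str {
//     r
// }
```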

Alternatively see this introduction to variance.

Yes.

Those are very different![1] You cannot get a &mut T from a &UnsafeCell<T> without unsafe. unsafe means the programmer is responsible, not the compiler. In safe Rust you can only get the *mut.

(You can get a &mut T from a &mut UnsafeCell<T>, but that wasn't the type under discussion. &mut U and &U are also very different.)
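To illustrate how a `&UnsafeCell<T>` differs from a `&mut T`, here is a minimal single-threaded sketch with a hypothetical function name. Two shared references to the same cell coexist, and both are used to mutate the contents through the raw pointer returned by the safe `get()` method.

```rust
use std::cell::UnsafeCell;

fn bump_through_two_shared_refs(cell: &UnsafeCell<i32>) -> i32 {
    // Two independent shared references to the same cell may coexist.
    let first: &UnsafeCell<i32> = cell;
    let second: &UnsafeCell<i32> = cell;

    // `get()` is safe and returns `*mut i32`; dereferencing it needs `unsafe`.
    // SAFETY: single-threaded, and no `&i32` or `&mut i32` to the contents is
    // alive while we write through the raw pointers.
    unsafe {
        *first.get() += 1;
        *second.get() += 1;
        *cell.get()
    }
}
```

Doing the same with two `&mut i32` to one place is not possible in safe code, and creating them with `unsafe` would be undefined behavior.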

AFAICT they mean operations that don't correspond to a function. That would include addition of primitive number types.

If I attempted to answer this I'd just be rewording that section.

Yes, null (the 0 memory address) is always dangling.


  1. And "can get a &mut" is not something to be dismissed lightly in this context.

Specific note for this one:

This actually was refined recently by rust-lang/rust#117329 such that zero-sized (and zero-offset) pointer operations are defined in more cases, including on the null pointer. Specifically:

  • Zero-sized reads and writes are allowed on all sufficiently aligned pointers, including the null pointer
  • Inbounds-offset-by-zero is allowed on all pointers, including the null pointer
  • offset_from on two pointers derived from the same allocation is always allowed when they have the same address

(I am a member of T-opsem. I do not speak for the team, but I am linking to what is intended to be a normative guarantee eventually.)

The tracking issue is rust-lang/rust#117945, which indicates that documentation still needs to be updated to reflect this new guarantee.

The behavior which T-opsem is intending is that pointer read/write/offset operations that cover zero bytes[1] should be a complete no-op, which includes not having any prerequisites which would throw UB if violated. This is a much stronger property than the previously used intuition that there essentially exists a zero-sized allocation at each non-null address.

There are multiple reasons for reformulating how we handle pointers to ZST, but the big two are that C++ allows nullptr + 0 (but we didn't until this PR) and that the new formulation means that the validity of references to ZST doesn't rely on the nondeterminism of exposed provenance anymore. (It also helps justify that NonNull::dangling can be used for ZST access, despite describing the pointer as "dangling.")
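A small sketch of the `NonNull::dangling` use mentioned above, with a hypothetical zero-sized type `Marker`. It only relies on the long-standing rule that a non-null, well-aligned pointer is enough for a reference to a zero-sized type; it does not exercise the newer null-pointer cases.

```rust
use std::ptr::NonNull;

// A zero-sized type.
struct Marker;

fn zst_reference() -> &'static Marker {
    let ptr: NonNull<Marker> = NonNull::dangling();
    // SAFETY: `ptr` is non-null and aligned, and `Marker` is zero-sized,
    // so no bytes are ever read through the reference.
    unsafe { ptr.as_ref() }
}

// Null is a valid value for a raw pointer (but never for a reference).
fn null_raw_pointer_is_fine() -> bool {
    let p: *const u8 = std::ptr::null();
    p.is_null()
}
```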

The relevant term is "provenance monotonicity" IIRC.


  1. offset_from is somewhat special in that it still has its "same allocation" requirement; offset_from between two separate allocations is undefined even if the compared addresses appear to be equal, due to complicated reasons that are required if we want to validly justify ever optimizing out the allocations.

Thanks for the pointer.[1]

Just to clarify, the "can be null and valid in some sense" change is for pointers (to ZSTs), not references. (The quoted section talks about "reference/pointer"s both.)


  1. get it?

Yes, references are still always required to be non-null as a part of their value representation. Any attempt to produce such is UB. (As such, you could argue for dereferencing a null reference to be non-behavior instead of UB.)

@quinedot Thank you for clarifying things!

A qualified path pattern is a pattern that starts with <..>::.

I meant to say: "I can put <Enum>::A in a match - doesn't it count as a qualified path pattern?", but it somehow got lost in editing. For example:

fn main() {
    match Test::A {
        <Test>::A => {}
    }
}

enum Test {
    A
}

example where higher-ranked lifetime bound is false

Your example compiles when I comment out the usage of foo(). Reference states that:

Bounds that don't use the item's parameters or higher-ranked lifetimes are checked when the item is defined. It is an error for such a bound to be false.

and if I define bounds that don't use the item's parameters like this:

struct A where i32: Iterator
{}

fn main() {

}

it doesn't compile, even without any usage. Following the reference strictly, a false higher-ranked lifetime bound shouldn't compile without usage either. (Or maybe I misunderstand something?)


You cannot get a &mut T from a &UnsafeCell<T> without unsafe. unsafe means the programmer is responsible, not the compiler. In safe Rust you can only get the *mut.

That's right, but what exactly prevents a &UnsafeCell from being live for the duration of the function call, as stated here:

  • When a reference (but not a Box!) is passed to a function, it is live at least as long as that function call, again except if the &T contains an UnsafeCell<U>.

Potential usage of unsafe?


@paramagnetic Also thank you for clarification!

I'm not sure what you are asking here. It might be that you are confusing "temporary scope" with "place where it's legal to create temporaries"?

Yes, I probably am, but I can't wrap my head around the difference.

It's legal to create temporaries basically anywhere. In contrast, a temporary scope is the scope that the created temporaries will live for.

The LHS of a lazy boolean operation isn't a temporary scope because it doesn't affect where the temporaries created inside it will be dropped.
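One way to observe what "temporary scope" means in practice is to log when temporaries are dropped. The sketch below uses hypothetical helpers (`Noisy`, `log`, `marker`, `both`, `drop_order`). It compares a temporary in an ordinary argument position, whose scope is the whole statement, with a temporary in the RHS of `||`, whose scope is just that operand.

```rust
use std::cell::RefCell;

thread_local! {
    static LOG: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

fn log(msg: &str) {
    LOG.with(|l| l.borrow_mut().push(msg.to_string()));
}

struct Noisy(&'static str);

impl Noisy {
    fn is_empty(&self) -> bool {
        self.0.is_empty()
    }
}

impl Drop for Noisy {
    fn drop(&mut self) {
        log(&format!("drop {}", self.0));
    }
}

fn marker() -> bool {
    log("marker");
    true
}

fn both(a: bool, b: bool) -> bool {
    a && b
}

fn drop_order() -> Vec<String> {
    LOG.with(|l| l.borrow_mut().clear());

    // Temporary in a plain argument: it lives until the end of the statement,
    // so it is dropped after `marker()` has run.
    let _ = both(Noisy("plain").is_empty(), marker());
    log("end of statement 1");

    // Temporary in the RHS of `||`: the RHS is its own temporary scope,
    // so it is dropped before `marker()` runs.
    let _ = both(false || Noisy("rhs").is_empty(), marker());
    log("end of statement 2");

    LOG.with(|l| l.borrow().clone())
}
```

Calling `drop_order()` should record "marker" before "drop plain" in the first statement, and "drop rhs" before "marker" in the second.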

The form of this post basically guaranteed a dumpster fire of a discussion, so apologies if I've missed places where these are already settled.

  1. I think I disagree slightly with @paramagnetic. I gather from context that a "qualified path" in this sentence is not the same thing as a QualifiedPathInExpression or QualifiedPathInType but rather any path that is qualified, i.e. contains a "namespace qualifier (::)". (See also "Path qualifiers".) The reference would benefit from a single clear definition of "qualified path".

    Anyway, the sentence you quoted states a restriction in addition to the grammar. It is not merely restating the grammar. ::A syntactically matches the grammar, but since it doesn't refer to an associated constant, you'll get a compiler error.

    <Enum>::A is allowed and refers to an enum variant (or "constructor"). I think it must be that for the purpose of this sentence, that's considered a constant even though it wasn't defined with const. But you're right, I don't see anywhere that is spelled out and the reference should say what it means.

  2. Another badly worded bit of the reference. The distinction is between structural types and nominal types. Rust has a nominal type system.

  3. Another badly worded bit of the reference. All this means is that a function pointer type has many different possible values, all the possible functions and non-capturing closures matching that signature. The type of each individual fn item and closure is known at compile time; but in code that takes a function pointer as an argument, the identity of the function is not known at compile time. Any function with that signature could be passed in.

  4. The reference seems to say so. If that's wrong, it's worth filing an issue against the documentation.

  5. Yes.

Please consider filing issues about these. Perhaps individual issues instead of one big issue.

I didn't reply to the path-related question of OP.

Yes, that's a qualified path pattern. I believe it works for fieldless variants but not others, yet.

The reference isn't normative.

The point of the playground was to demonstrate that the higher-ranked bound wasn't checked at the definition. Incidentally, as far as I know it's still the plan to allow more trivially unmeetable bounds.

It's trying to say that &UnsafeCell<T> has fewer validity requirements than &T. It's more akin to a *mut T.

It's legal to create temporaries basically anywhere. In contrast, a temporary scope is the scope that the created temporaries will live for.

The LHS of a lazy boolean operation isn't a temporary scope because it doesn't affect where the temporaries created inside it will be dropped.

Could you please give me an example? How is the LHS practically different from the RHS in the code below?

let x = PrintOnDrop("first operand").0 == "" || PrintOnDrop("second operand").0 == "";

It drops the following way: first operand -> second operand

I'm not sure what exactly you are asking here. I'm just saying that the reference asserts only the RHS is a temporary scope, and that it's not the same as "a scope where temporaries can be created". I don't know the exact rules of all temporary dropping by heart, because they are largely uninteresting most of the time.
