Does Rust have a way similar to prvalue in C++?

In C++

struct A{ int i;};
void fun(A a){}
int main(){
   fun(A{10});
}

The argument A{10} is a prvalue, which is used to directly initialize the parameter object a; that is, no temporary object is created here. Does Rust do this by default? For example

fn fun(v:A){}
struct A{ i:i32}
fn main(){
   fun(A{i:10}); // #1
}

What is the behavior at #1, does it create a temporary object and then move it to v of fun? Or, does it directly use A{i:10} to initialize v? The "create temporary object" way would look like:

let temp = A{i:10};
let v: A;
memcpy(&v, &temp);  // pseudocode: shallow (bitwise) copy of temp into v
fun(v);             // fun takes A by value, not by reference

Which way does Rust use? Moreover, consider this case

fn get_a()->A{
   A{i:0}
}
fn main(){
   let r = get_a(); 
}

In this case, does get_a() create a temporary object and then move it to r? Or does it directly initialize r with A{i:0}?

Rust has no equivalent of C++'s move constructor. Every move is always a memcpy. There's no need for a concept like prvalue.

Whether the memcpy happens is up to the optimizer, and there's no guarantee for now.

This creates a temporary object that is then passed by value into the function. The call to fun may be optimized out via inlining or constant propagation.

Yes, this version is creating a temporary and then returning it from the function (although the call itself will likely be optimized out in this case).

As mentioned earlier, all moves of non-Copy types are equivalent to a memcpy that leaves the source logically uninitialized. Copy types are still a memcpy, they just leave the source unchanged.
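
A minimal sketch of that distinction. The types Meters and Label and the helper functions are hypothetical names chosen for illustration; they do not come from the thread.

```rust
// `Meters` opts in to Copy; `Label` cannot, because it owns a String.
#[derive(Clone, Copy)]
struct Meters(i32);

struct Label(String);

fn take_meters(m: Meters) -> i32 {
    m.0
}

fn take_label(l: Label) -> usize {
    l.0.len()
}

fn demo() -> (i32, i32, usize) {
    let m = Meters(10);
    let a = take_meters(m); // bitwise copy; `m` remains usable
    let b = m.0; // still fine, the source was left unchanged

    let l = Label(String::from("hi"));
    let c = take_label(l); // also a bitwise copy, but `l` is now logically uninitialized
    // let d = l.0.len(); // error[E0382]: borrow of moved value: `l`
    (a, b, c)
}

fn main() {
    println!("{:?}", demo());
}
```

In both calls the same bytes are copied into the callee; the only difference is whether the compiler still lets you name the source afterwards.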

There are no move constructors in Rust. You can customize move semantics only by opting in to Copy semantics if the type supports it, or by Pinning the object and forbidding moves entirely.

I don't think the guarantee that a prvalue directly initializes its result object has anything to do with move constructors. The idea is simply that A{i:0} should work for compound types the same way 0 works in let i = 0;. In other words, we can avoid a redundant initialization and memcpy.

Well, pasting your code into Compiler Explorer we get the following generated code:

Rust version:

example::main:
        xor     eax, eax
        ret

C++ version:

fun(A):
        ret
main:
        xor     eax, eax
        ret

Hmm... both the same. I see no temporaries or copies going on there!

Of course the devil is in the optimisers. Your code ultimately does nothing so most of it gets optimised away.

Perhaps you have a more realistic example?

As far as I know, Rust doesn't guarantee any particular calling convention for native functions.

You can use extern "C" for the platform-specific C convention (with various caveats).

@quinedot

I didn't mean "calling convention". I asked whether Rust guarantees, for example in let c = A{i:0};, that A{i:0} directly initializes c, or whether the compiler creates a temporary object to hold the value A{i:0} and then moves that temporary object into c.

Since there's no guarantee that c will ever be at a specific location (it might be split into registers, for example, if there are no references to it anywhere), is there ever any difference?
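
One way to probe this is to take the address of the value on both sides of a call. This is only a sketch: the #[inline(never)] attribute and the address printing are additions for illustration, and the language does not specify whether the two addresses match, so the result can change with the compiler version, target, and optimization level.

```rust
struct A {
    i: i32,
}

// Returns the address at which the callee sees its parameter, plus the field.
#[inline(never)]
fn fun(v: A) -> (usize, i32) {
    (&v as *const A as usize, v.i)
}

fn main() {
    let c = A { i: 10 };
    let in_caller = &c as *const A as usize;
    let (in_callee, i) = fun(c);
    // Equal or different addresses are both valid outcomes.
    println!("caller: {in_caller:#x}, callee: {in_callee:#x}, i = {i}");
}
```

Taking the address is itself what forces c into memory; without it the value may never have a location at all.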

If c is an object that we use elsewhere, there is a difference. A prvalue avoids creating a temporary object and saves one move.

Rust does not have in-place construction.

What does that imply? Does A{i:0} always construct a temporary object?

Yes.

While Alice's answer is correct, "no" would also be correct because there is no observational difference between the options you are considering. (In C++ parlance they satisfy the "as-if" rule.) The right mental model is that temporaries are always created and then most of them are optimized away. The only reason C++ needs to make an explicit distinction between the different value categories is that moves are observable (you can put a print statement inside the move constructor), so compilers are not allowed to optimize moves away unless the spec explicitly provides for it.

Rust does actually have a value category distinction, but it is much simpler and tracks C more than C++: there are "places" and "values" corresponding to C/C++ "lvalue" and "(p)rvalue". These distinctions exist primarily for syntactic reasons: it's the only way you can really make sense of what operation x.field performs in the expression x.field = 2;. The C++ "xvalue" category is only needed for C++ style non-destructive move; in Rust "places" actually act more like C++ xvalues than lvalues because they are moved rather than copied when used in a value context. (This is why auto x = y; copies in C++ but let x = y; moves in Rust: y is a place expression used in a value context and so causes a move, while in C++ y is an lvalue and causes a copy, and auto x = std::move(y) does a move because std::move(y) is an xvalue.)
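
A small sketch of the same place expression used in both contexts. The Pair type and its fields are hypothetical names introduced only for this example.

```rust
struct Pair {
    name: String,
    count: i32,
}

fn demo() -> (String, i32) {
    let mut p = Pair {
        name: String::from("a"),
        count: 1,
    };
    p.count = 2; // `p.count` left of `=` is a place: it names where to store
    let n = p.count; // same place in a value context: i32 is Copy, so it is copied
    let s = p.name; // value context again: String is not Copy, so this moves out
    // let t = p.name; // error[E0382]: use of moved value: `p.name`
    (s, n)
}

fn main() {
    let (s, n) = demo();
    println!("{s} {n}");
}
```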

I would say it matters whether "not creating a temporary object" is guaranteed by the core language or is merely an implementation optimization; the two are different. Obviously, a "prvalue" in C++ is guaranteed by the core language not to create any temporary object.

It's not really true that in C++ you are promised not to create any temporaries any more than it is true in Rust. The only thing the C++ spec promises is that you get something as-if no temporaries were created. It may very well involve extra memcpys as long as the user-visible effects, that is, the move constructor bodies, are executed on sensible values as prescribed by the spec. If the move constructor is pure, then the situation in C++ and Rust is exactly the same: the compiler is free to extract temporaries or remove them as it likes.

Because temporary creation is not observable, this cannot possibly be specced any other way: if the Rust spec had a "guaranteed no temporary" clause, it would be a meaningless clause, because if the compiler decided to make a temporary anyway it would still be as-if it had not created one. Really, the only reason C++ has to spec temporary removal, guaranteed copy elision, NRVO, etc. is that move constructors can have side effects.
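
To make the "not observable" point concrete, here is a sketch in which a value is moved several times and the only user code that ever runs is its destructor, once. Tracked, pass_along, and count_drops are hypothetical names for illustration.

```rust
use std::cell::Cell;

// Counts how many times the destructor runs.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
    id: u32,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Takes the value by move and hands it straight back.
fn pass_along(t: Tracked<'_>) -> Tracked<'_> {
    t
}

fn count_drops() -> u32 {
    let drops = Cell::new(0);
    {
        let t = Tracked { drops: &drops, id: 7 };
        let t = pass_along(pass_along(t)); // two calls, no hook runs on either move
        let moved = t; // another move, still nothing to observe
        assert_eq!(moved.id, 7);
    } // `moved` is dropped here
    drops.get()
}

fn main() {
    println!("drops: {}", count_drops());
}
```

There is no trait or callback a type can implement to be notified of a move, which is why the compiler may add or remove the underlying copies freely.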

Not quite: the reason is more that constructors are in-place, and A{10} calls a user-defined constructor. Even if C++ moves weren't user-defined, the address at which an object is constructed is observable. And since the default mode for C++'s compilation model is separate compilation that doesn't know what's in the constructor, that address is as good as exposed for much of the optimization pipeline.

The rule that A a = A{10}; does direct initialization was even more important before move semantics were introduced, because this would have done a (user-defined) copy (read: clone) of the value after (user-defined) construction of a temporary.

The C++ specification doesn't care about memcpys of podish data, because those are, for the most part, unobservable. What it does care about, very much so, is the address of objects.

Unlike C++, Rust doesn't care about the address of objects as much. Well, it does, since while a reference is loaned the object can't move. But the closest thing we have to in-place initialization in the base language and std is Rc::new_cyclic.
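
For reference, a minimal sketch of Rc::new_cyclic. The closure receives a Weak that already points at the final allocation before the value exists, which is what lets a value store a handle to its own location. Node and make_node are hypothetical names for this example.

```rust
use std::rc::{Rc, Weak};

struct Node {
    me: Weak<Node>, // weak handle to the allocation this node lives in
    value: i32,
}

fn make_node(value: i32) -> Rc<Node> {
    // `me` cannot be upgraded inside the closure, only stored.
    Rc::new_cyclic(|me| Node {
        me: me.clone(),
        value,
    })
}

fn main() {
    let node = make_node(5);
    let again = node.me.upgrade().expect("node is still alive");
    assert!(Rc::ptr_eq(&node, &again));
    println!("{}", again.value);
}
```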

Rust code will, on average, end up memcpying more stack data than an equivalent program written in C++. But for that, it will also tend to keep more data on the stack in the first place; a portion of that data not being memcpyd around is because it was put on the heap in C++ to avoid constantly calling the copy/move constructor. The other thing which Rust doesn't have here which C++ does is the concept of a "continued move;" with C++ you can pass an rvalue reference around for a while before calling the move constructor once at the end, whereas the equivalent in Rust would move/memcpy the value each time it's passed as an argument.

Removing redundant memcpys is one of the things the optimizer is best at, though. For anything reasonably sized, you don't need to worry about redundant memcpys. Write the code for the correct ownership flow, then go back and optimize if and only if stack usage and/or memcpys become a problem. (They almost certainly won't.)

What I mean is, a compiler is within its rights to memcpy a structure somewhere else (crucially without using the move/copy constructor), and perform modifications on that copy as long as, as far as the language semantics is concerned, it still has the original address. Granted there usually isn't much reason to do so, but as long as there are no escaping pointers to the data (including method calls that pass this to code outside the compilation unit) or anything observing the pointer identity, and the value itself is put back in the right location before anyone asks (and the destructor is not called on the bad location), it's no harm no foul as far as the standard is concerned. This is just another instance of the as-if rule.

This is somewhat contrived for C++ (except for POD structs with limited uses of pointers) but is the default situation in Rust (although you still have to be careful with references because the addresses of references are observable). For a pure rvalue being passed to a value context there is no possibility of taking a reference so the temporaries so constructed are almost always free to be optimized away.

Yeah, this is a bit of a shame. You would probably be best off boxing the value if you wanted to do that in Rust. But a pointer-like type that calls drop_in_place on the pointee at scope end would be great. &move T references, anyone?
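
A sketch of the boxing approach, with hypothetical names Big, pass, and addresses. Moving a Box copies only the pointer, so the payload keeps its heap address however many times ownership is handed on.

```rust
struct Big {
    data: [u8; 1024],
}

fn address_of(b: &Big) -> usize {
    b as *const Big as usize
}

// Ownership passes through by value, but only a pointer is copied.
fn pass(b: Box<Big>) -> Box<Big> {
    b
}

fn addresses() -> (usize, usize) {
    // Note: Box::new itself takes its argument by value.
    let boxed = Box::new(Big { data: [0; 1024] });
    let before = address_of(&boxed);
    let boxed = pass(pass(boxed));
    let after = address_of(&boxed);
    assert_eq!(boxed.data.len(), 1024);
    (before, after)
}

fn main() {
    let (before, after) = addresses();
    println!("same address: {}", before == after);
}
```

The cost is a heap allocation, which is exactly the trade the earlier post describes C++ code making to avoid copy and move constructors.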

That would be great.
Plus maybe a switch in the compiler to tell you when such a move couldn't happen automatically.

Can anyone explain what the semantics of the proposed extension would be? You can do that in C++ because the compiler doesn't track moves; that's the developer's responsibility. Where the Rust compiler can track moves, these &move T references would be pointless, and where it can't, they would be unsafe.

And for unsafe code, raw pointers already offer the full solution, which raises the question: what would &move T references offer on top of what Rust already has?

The point is that &move T has the ownership semantics of T (i.e. drop_in_place is called on the value at the end of the scope) but the size and ABI of a pointer (it is actually an indirection). It is like Box<T>, but Box<T> also owns the backing allocation (i.e. it is "on the heap") while &move T would not (and in interesting cases it would generally be a pointer to a stack local). This would not require any unsafe code, only integration with the initialization check, borrow check, and drop check.
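
Since &move T does not exist in Rust, the idea can only be approximated in library code today. The following is a rough sketch under stated assumptions: MoveRef, new_in, Noisy, consume, and demo are all hypothetical names, and unlike the proposed language feature this emulation needs unsafe internally and caller-provided MaybeUninit storage.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::ops::Deref;
use std::ptr;

/// Hypothetical stand-in for `&move T`: pointer-sized, owns the value
/// (drops it in place) but not the storage it lives in.
struct MoveRef<'a, T> {
    slot: &'a mut MaybeUninit<T>,
}

impl<'a, T> MoveRef<'a, T> {
    /// Writes `value` into caller-provided storage and takes ownership of it.
    fn new_in(slot: &'a mut MaybeUninit<T>, value: T) -> Self {
        slot.write(value);
        MoveRef { slot }
    }
}

impl<T> Deref for MoveRef<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: `new_in` initialized the slot; it stays initialized until drop.
        unsafe { self.slot.assume_init_ref() }
    }
}

impl<T> Drop for MoveRef<'_, T> {
    fn drop(&mut self) {
        // SAFETY: the slot is initialized and is not read again through `self`.
        unsafe { ptr::drop_in_place(self.slot.as_mut_ptr()) }
    }
}

// Test type that records how often its destructor runs.
struct Noisy<'c> {
    drops: &'c Cell<u32>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Takes ownership through the pointer; the value is dropped when `r` goes away.
fn consume(r: MoveRef<'_, Noisy<'_>>) -> usize {
    &*r as *const Noisy<'_> as usize
}

// Returns whether the callee saw the value at the caller's storage address,
// and how many times the destructor ran.
fn demo() -> (bool, u32) {
    let drops = Cell::new(0);
    let mut storage: MaybeUninit<Noisy<'_>> = MaybeUninit::uninit();
    let storage_addr = storage.as_ptr() as usize;
    let r = MoveRef::new_in(&mut storage, Noisy { drops: &drops });
    let seen = consume(r); // only a pointer is passed
    (seen == storage_addr, drops.get())
}

fn main() {
    println!("{:?}", demo());
}
```

The value stays in the caller's stack slot while ownership travels as a pointer, which is the "continued move" shape discussed earlier; a real &move T would additionally let the compiler check initialization and drops without any unsafe code.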
