How to mess up this Rust code as I did in my C/C++ code?

cpmech · July 1, 2022, 12:26am

Hello,

I'm wondering if it's possible to make the Rust code below "seg fault", as my C/C++ version (further below) did?

EDIT: Thanks for the replies! Note for future readers: the issue is not with indices being out of bounds, but with the lambda function in C++.

seg_fault_does_not_happen.rs

const NNODE: usize = 3;

fn print_x<'a, F>(num_triangle: usize, get_x: F)
where
    F: Fn(usize, usize) -> &'a [f64],
{
    for t in 0..num_triangle {
        for m in 0..NNODE {
            let x = get_x(t, m);
            println!("triangle # {}: x = {:?}", t, x);
        }
    }
}

fn main() {
    // [num_triangle][nnode=3][ndim=2]
    let triangles = vec![
        vec![vec![0.0, 0.0], vec![1.0, 0.0], vec![0.0, 1.0]],
        vec![vec![1.0, 0.0], vec![1.2, 1.5], vec![0.0, 1.0]],
    ];

    // closure that returns the coordinates of node m of triangle t
    let get_x = |t: usize, m: usize| &triangles[t][m][..];

    // print data
    print_x(triangles.len(), get_x);
}

A bit of context: I created the Rust code first (as part of a larger library) and then a friend of mine asked me if I could translate this piece of code to C/C++ so he could use it in his project. So, I had to "re-learn" C/C++ and tried to use the new (to me) lambda functionality. My first try (below), when run, caused a segmentation fault! (with no warnings from the compiler).

seg_fault_happens.cpp

#include <functional>
#include <iostream>
#include <vector>

using namespace std;

const size_t NNODE = 3;

void print_x(size_t num_triangle, function<vector<double> const &(size_t, size_t)> get_x) {
    for (size_t t = 0; t < num_triangle; t++) {
        for (size_t m = 0; m < NNODE; m++) {
            auto x = get_x(t, m);
            cout << "triangle # " << t << ": x" << m << " = ";
            cout << x[0] << "," << x[1] << endl;
        }
    }
}

int main() {
    try {
        // [num_triangle][nnode=3][ndim=2]
        vector<vector<vector<double>>> triangles = {
            {{0.0, 0.0}, {1.0, 0.0}, {0.0, 1.0}},
            {{1.0, 0.0}, {1.0, 1.0}, {0.0, 1.0}}};

        // lambda function that returns the coordinates of cell's point i
        auto get_x = [&triangles](size_t t, size_t i) {
            return triangles[t][i];
        };

        // print data
        print_x(triangles.size(), get_x);

    } catch (char const *msg) {
        cout << "ERROR: " << msg << endl;
    } catch (...) {
        cout << "some error occurred" << endl;
    }
    return 0;
}

Because my work currently is mainly in Rust, I thought that the "move" in the get_x lambda wasn't right. So, I've made a version with pointers (and not C/C++ references). Now it works:

seg_fault_does_not_happen.cpp

#include <functional>
#include <iostream>
#include <vector>

using namespace std;

const size_t NNODE = 3;

void print_x(size_t num_triangle, function<vector<double> const *(size_t, size_t)> get_x) {
    for (size_t t = 0; t < num_triangle; t++) {
        for (size_t m = 0; m < NNODE; m++) {
            auto x = get_x(t, m);
            cout << "triangle # " << t << ": x" << m << " = ";
            cout << (*x)[0] << "," << (*x)[1] << endl;
        }
    }
}

int main() {
    try {
        // [num_triangle][nnode=3][ndim=2]
        vector<vector<vector<double>>> triangles = {
            {{0.0, 0.0}, {1.0, 0.0}, {0.0, 1.0}},
            {{1.0, 0.0}, {1.0, 1.0}, {0.0, 1.0}}};

        // lambda function that returns the coordinates of cell's point i
        auto get_x = [&triangles](size_t t, size_t i) {
            return &triangles[t][i];
        };

        // print data
        print_x(triangles.size(), get_x);

    } catch (char const *msg) {
        cout << "ERROR: " << msg << endl;
    } catch (...) {
        cout << "some error occurred" << endl;
    }
    return 0;
}

The key difference is return &triangles[t][i]; inside the get_x lambda.

So, I'm wondering if I could replicate this segfault problem easily in Rust. I hope the answer is NO. The point is that I could tell my friend to "just use Rust and your life will be easier."

quinedot · July 1, 2022, 12:48am

You have no unsafe code, so your Rust program is memory safe. [1] It cannot segfault.

You easily could cause a panic at runtime by trying to access outside the bounds of a Vec (for example), but not a segfault (without introducing unsafe).
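To make the panic-versus-segfault distinction concrete, here is a minimal sketch. The helper `checked_x` is a hypothetical name, not part of the original code; it uses `slice::get` to turn a bad index into `None`, while plain indexing panics. The `catch_unwind` call is only there to observe the panic and assumes the default unwinding panic strategy.

```rust
/// Hypothetical helper (not in the original post): returns the coordinates of
/// node `m` of triangle `t`, or `None` if either index is out of range.
fn checked_x(triangles: &[Vec<Vec<f64>>], t: usize, m: usize) -> Option<&[f64]> {
    triangles.get(t)?.get(m).map(|x| x.as_slice())
}

fn main() {
    // [num_triangle][nnode=3][ndim=2], same data as in the question
    let triangles = vec![
        vec![vec![0.0, 0.0], vec![1.0, 0.0], vec![0.0, 1.0]],
        vec![vec![1.0, 0.0], vec![1.2, 1.5], vec![0.0, 1.0]],
    ];

    // In-range access yields Some(slice).
    println!("{:?}", checked_x(&triangles, 1, 1));

    // Out-of-range access yields None instead of touching invalid memory.
    println!("{:?}", checked_x(&triangles, 5, 0));

    // Plain indexing out of range panics (a controlled abort of the
    // operation), which we observe here via catch_unwind.
    let result = std::panic::catch_unwind(|| triangles[5][0][0]);
    println!("out-of-bounds indexing panicked: {}", result.is_err());
}
```

Either way the bounds check runs before any memory is read, which is why the failure mode is a panic or a `None`, not a segmentation fault.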

[1] The only other avenues are buggy dependencies with their own unsafe, or compiler bugs.

cuviper · July 1, 2022, 12:51am

It also works if you force this to return a reference:

auto get_x = [&triangles](size_t t, size_t i) -> vector<double> const & {
    return triangles[t][i];
};

I guess otherwise it returns the vector by value, and then the call to print_x(.., get_x) converts the function type to one that returns a reference? I don't know how that could ever work without dangling though, so I hope I'm missing something there. I'm too rusty on my C++ anymore...

Anyway, to your original question, no it should not be possible to fail this in safe Rust.
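For comparison, here is a rough Rust sketch of the same by-value versus by-reference distinction. The helper `first_x` is a hypothetical name that reuses the `Fn(usize, usize) -> &'a [f64]` bound from `print_x` in the question. The point is that a closure returning an owned `Vec<f64>` does not satisfy that bound, so the mismatch that compiled silently in the C++ version is a type error here.

```rust
/// Hypothetical helper with the same closure bound as `print_x` in the
/// question: it only accepts closures that hand out borrows valid for 'a.
fn first_x<'a, F>(get_x: F) -> &'a [f64]
where
    F: Fn(usize, usize) -> &'a [f64],
{
    get_x(0, 0)
}

fn main() {
    let triangles = vec![vec![vec![0.0, 0.0], vec![1.0, 0.0], vec![0.0, 1.0]]];

    // Borrowing closure: returns a slice that points into `triangles`.
    let by_ref = |t: usize, m: usize| &triangles[t][m][..];

    // Owning closure: returns a fresh copy of the coordinates.
    let by_value = |t: usize, m: usize| triangles[t][m].clone();

    let x = first_x(by_ref);
    let same = std::ptr::eq(x.as_ptr(), triangles[0][0].as_ptr());
    println!("borrowed {:?}, shares the original buffer: {}", x, same);

    let copy = by_value(0, 0);
    let same = std::ptr::eq(copy.as_ptr(), triangles[0][0].as_ptr());
    println!("copied {:?}, shares the original buffer: {}", copy, same);

    // first_x(by_value);
    // ^ rejected at compile time: the closure returns Vec<f64>, not &[f64],
    //   so no reference to a temporary can be created behind the scenes.
}
```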

cpmech · July 1, 2022, 1:40am

Thanks for the replies!

jwakely · July 1, 2022, 11:41am

This code has been bitten by the problem described in GCC Bug 70692 ("No warning when std::function<const int&(...)> binds a reference to a temporary") and C++ Standard Library Issue 2813 ("std::function should not return dangling references"), i.e. a known footgun with std::function.

(I'm not allowed to add more than two links, because I'm a new user here, so I'll add the rest of this answer as another comment ...)

jwakely · July 1, 2022, 11:42am

There's a new feature coming in C++23 which will allow us to fix this problem in std::function so that this code no longer compiles. See P2255R2: A type trait to detect reference binding to temporary for the details. I will ensure that GCC's std::function uses the new feature all the way back to C++11 mode, not just for C++23, and I expect the libc++ and MSVC library teams will do the same.

It might not be as safe-by-construction as Rust, but at least we can disarm this particular trap in C++.

cuviper · July 1, 2022, 3:31pm

Awesome, thanks @jwakely!

8573 · July 2, 2022, 8:19am

On the possibly off-topic topic of making life easier, I suspect you don't need most of those vec!s. If your points always have two coordinates each and triangles are always triangles (having three vertices), you can use plain arrays instead:

     let triangles = vec![
-        vec![vec![0.0, 0.0], vec![1.0, 0.0], vec![0.0, 1.0]],
-        vec![vec![1.0, 0.0], vec![1.2, 1.5], vec![0.0, 1.0]],
+        [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
+        [[1.0, 0.0], [1.2, 1.5], [0.0, 1.0]],
     ];

I leave the outermost vec! intact in case you would want to add more triangles at runtime.

(Of course, in practical code, it would be more idiomatic to use named struct types for the points and triangles.)
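One possible shape for such named types is sketched below. The names `Point`, `Triangle`, `Triangle::new`, and `Triangle::node` are illustrative assumptions, not taken from the original library.

```rust
/// Illustrative 2D point (hypothetical type, not from the original code).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

/// Illustrative triangle with exactly three nodes, enforced by the array type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Triangle {
    nodes: [Point; 3],
}

impl Triangle {
    /// Builds a triangle from three `[x, y]` coordinate pairs.
    fn new(coords: [[f64; 2]; 3]) -> Self {
        Triangle {
            nodes: coords.map(|[x, y]| Point { x, y }),
        }
    }

    /// Returns node `m`; panics if `m >= 3`, like any array index.
    fn node(&self, m: usize) -> &Point {
        &self.nodes[m]
    }
}

fn main() {
    // The outer Vec stays so triangles can still be added at runtime.
    let triangles = vec![
        Triangle::new([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),
        Triangle::new([[1.0, 0.0], [1.2, 1.5], [0.0, 1.0]]),
    ];

    for (t, triangle) in triangles.iter().enumerate() {
        for m in 0..triangle.nodes.len() {
            let p = triangle.node(m);
            println!("triangle # {}: x{} = ({}, {})", t, m, p.x, p.y);
        }
    }
}
```

With fixed-size arrays inside the structs, the number of nodes and coordinates is checked by the type system rather than by convention.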

jwakely · September 28, 2022, 11:45pm

For the record, that's now implemented on GCC's HEAD:
https://gcc.gnu.org/g:fa9bda3ea4315a7285edbc99323e3fa7885cbbb8

system · December 27, 2022, 11:45pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
