How to mess up this Rust code as I did in my C/C++ code?

Hello,

I'm wondering if it's possible to make the Rust code below segfault, as happened with my C/C++ version (further below)?

EDIT: Thanks for the replies! Note for future readers: the issue is not with indices being out of bounds, but with the lambda function in C++.

seg_fault_does_not_happen.rs

const NNODE: usize = 3;

fn print_x<'a, F>(num_triangle: usize, get_x: F)
where
    F: Fn(usize, usize) -> &'a [f64],
{
    for t in 0..num_triangle {
        for m in 0..NNODE {
            let x = get_x(t, m);
            println!("triangle # {}: x = {:?}", t, x);
        }
    }
}

fn main() {
    // [num_triangle][nnode=3][ndim=2]
    let triangles = vec![
        vec![vec![0.0, 0.0], vec![1.0, 0.0], vec![0.0, 1.0]],
        vec![vec![1.0, 0.0], vec![1.2, 1.5], vec![0.0, 1.0]],
    ];

    // closure that returns the coordinates of node m of triangle t
    let get_x = |t: usize, m: usize| &triangles[t][m][..];

    // print data
    print_x(triangles.len(), get_x);
}

A bit of context: I wrote the Rust code first (as part of a larger library), and then a friend asked me to translate this piece to C/C++ so he could use it in his project. So I had to "re-learn" C/C++ and tried the lambda functionality, which was new to me. My first attempt (below) caused a segmentation fault when run, with no warnings from the compiler!

seg_fault_happens.cpp

#include <functional>
#include <iostream>
#include <vector>

using namespace std;

const size_t NNODE = 3;

void print_x(size_t num_triangle, function<vector<double> const &(size_t, size_t)> get_x) {
    for (size_t t = 0; t < num_triangle; t++) {
        for (size_t m = 0; m < NNODE; m++) {
            auto x = get_x(t, m);
            cout << "triangle # " << t << ": x" << m << " = ";
            cout << x[0] << "," << x[1] << endl;
        }
    }
}

int main() {
    try {
        // [num_triangle][nnode=3][ndim=2]
        vector<vector<vector<double>>> triangles = {
            {{0.0, 0.0}, {1.0, 0.0}, {0.0, 1.0}},
            {{1.0, 0.0}, {1.0, 1.0}, {0.0, 1.0}}};

        // lambda function that returns the coordinates of cell's point i
        auto get_x = [&triangles](size_t t, size_t i) {
            return triangles[t][i];
        };

        // print data
        print_x(triangles.size(), get_x);

    } catch (char const *msg) {
        cout << "ERROR: " << msg << endl;
    } catch (...) {
        cout << "some error occurred" << endl;
    }
    return 0;
}

Because my work currently is mainly in Rust, I thought that the "move" in the get_x lambda wasn't right. So, I've made a version with pointers (and not C/C++ references). Now it works:

seg_fault_does_not_happen.cpp

#include <functional>
#include <iostream>
#include <vector>

using namespace std;

const size_t NNODE = 3;

void print_x(size_t num_triangle, function<vector<double> const *(size_t, size_t)> get_x) {
    for (size_t t = 0; t < num_triangle; t++) {
        for (size_t m = 0; m < NNODE; m++) {
            auto x = get_x(t, m);
            cout << "triangle # " << t << ": x" << m << " = ";
            cout << (*x)[0] << "," << (*x)[1] << endl;
        }
    }
}

int main() {
    try {
        // [num_triangle][nnode=3][ndim=2]
        vector<vector<vector<double>>> triangles = {
            {{0.0, 0.0}, {1.0, 0.0}, {0.0, 1.0}},
            {{1.0, 0.0}, {1.0, 1.0}, {0.0, 1.0}}};

        // lambda function that returns the coordinates of cell's point i
        auto get_x = [&triangles](size_t t, size_t i) {
            return &triangles[t][i];
        };

        // print data
        print_x(triangles.size(), get_x);

    } catch (char const *msg) {
        cout << "ERROR: " << msg << endl;
    } catch (...) {
        cout << "some error occurred" << endl;
    }
    return 0;
}

The key difference is return &triangles[t][i]; inside the get_x lambda.

So, I'm wondering if I could replicate this segfault problem easily in Rust. I hope the answer is NO. The point is that I could tell my friend to "just use Rust and your life will be easier."

You have no unsafe code, so your Rust program is memory safe. [1] It cannot segfault.

You could easily cause a panic at runtime, for example by indexing outside the bounds of a Vec, but not a segfault (without introducing unsafe).


  1. The only other avenues are buggy dependencies with their own unsafe, or compiler bugs.
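
To make the panic-versus-segfault distinction concrete, here is a minimal sketch. The helper name index_panics and the sample data are made up for illustration, and it assumes the default panic = "unwind" setting (with panic = "abort" the process would simply stop). Out-of-bounds indexing is checked at runtime and panics, while get returns an Option instead.

```rust
use std::panic;

/// Hypothetical helper for this sketch: returns true if `v[index]` panics.
fn index_panics(v: &[f64], index: usize) -> bool {
    // catch_unwind turns an unwinding panic into an Err value.
    panic::catch_unwind(|| v[index]).is_err()
}

fn main() {
    let coords = vec![0.0, 1.0];

    // In-bounds indexing works as usual.
    assert!(!index_panics(&coords, 1));

    // Out-of-bounds indexing is bounds-checked: it panics (a message is
    // printed to stderr) instead of reading invalid memory.
    assert!(index_panics(&coords, 5));

    // `get` is the non-panicking alternative: None when out of bounds.
    assert_eq!(coords.get(5), None);
    assert_eq!(coords.get(1), Some(&1.0));
}
```

A panic is a controlled failure with a message, so it is a very different thing from the undefined behavior behind a segfault.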

It also works if you force this to return a reference:

auto get_x = [&triangles](size_t t, size_t i) -> vector<double> const & {
    return triangles[t][i];
};

I guess otherwise the lambda returns the vector by value, and then the call to print_x(.., get_x) converts the function type to one that returns a reference? I don't know how that could ever work without dangling though, so I hope I'm missing something there. I'm too rusty on my C++ these days...

Anyway, to your original question, no it should not be possible to fail this in safe Rust.
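
As a rough sketch of the closest Rust analogue to the C++ mistake, here is a closure that returns the coordinates by value. The names collect_x_owned and get_x_owned are hypothetical and not from the original post. The closure's return type is Vec<f64>, so it only fits a function whose bound asks for an owned Vec; Rust does not convert it into something that returns a reference.

```rust
const NNODE: usize = 3;

// Hypothetical by-value counterpart of print_x: the closure must return an
// owned Vec<f64>, and the results are collected instead of printed.
fn collect_x_owned<F>(num_triangle: usize, get_x: F) -> Vec<Vec<f64>>
where
    F: Fn(usize, usize) -> Vec<f64>,
{
    let mut out = Vec::new();
    for t in 0..num_triangle {
        for m in 0..NNODE {
            out.push(get_x(t, m));
        }
    }
    out
}

fn main() {
    let triangles = vec![
        vec![vec![0.0, 0.0], vec![1.0, 0.0], vec![0.0, 1.0]],
        vec![vec![1.0, 0.0], vec![1.2, 1.5], vec![0.0, 1.0]],
    ];

    // Returns a copy of the coordinates (an owned Vec<f64>), not a reference.
    let get_x_owned = |t: usize, m: usize| triangles[t][m].clone();

    let all = collect_x_owned(triangles.len(), get_x_owned);
    println!("{:?}", all);

    // Passing `get_x_owned` to the original `print_x`, whose bound is
    // `Fn(usize, usize) -> &'a [f64]`, is rejected by the compiler with a
    // type mismatch, so there is no dangling reference to worry about.
}
```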

Thanks for the replies!

This code has been bitten by the problem described in GCC Bug 70692 ("No warning when std::function<const int&(...)> binds a reference to a temporary") and C++ Standard Library Issue 2813 ("std::function should not return dangling references"), i.e. a known footgun with std::function.

(I'm not allowed to add more than two links, because I'm a new user here, so I'll add the rest of this answer as another comment ...)

There's a new feature coming in C++23 which will allow us to fix this problem in std::function so that this code no longer compiles. See P2255R2: A type trait to detect reference binding to temporary for the details. I will ensure that GCC's std::function uses the new feature all the way back to C++11 mode, not just for C++23, and I expect the libc++ and MSVC library teams will do the same.

It might not be as safe-by-construction as Rust, but at least we can disarm this particular trap in C++.

Awesome, thanks @jwakely!

On the possibly off-topic topic of making life easier, I suspect you don't need most of those vec!s. If your points always have two coordinates each and triangles are always triangles (having three vertices), you can use plain arrays instead:

     let triangles = vec![
-        vec![vec![0.0, 0.0], vec![1.0, 0.0], vec![0.0, 1.0]],
-        vec![vec![1.0, 0.0], vec![1.2, 1.5], vec![0.0, 1.0]],
+        [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
+        [[1.0, 0.0], [1.2, 1.5], [0.0, 1.0]],
     ];

I leave the outermost vec! intact in case you would want to add more triangles at runtime.
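
For completeness, here is a sketch of the whole program with that change applied. The explicit type annotation on triangles is my addition, and the output line also prints the node index m; otherwise print_x and the closure are the same as in the original post, since slicing a fixed-size array with [..] also yields a &[f64].

```rust
const NNODE: usize = 3;

// Same generic function as in the original post.
fn print_x<'a, F>(num_triangle: usize, get_x: F)
where
    F: Fn(usize, usize) -> &'a [f64],
{
    for t in 0..num_triangle {
        for m in 0..NNODE {
            let x = get_x(t, m);
            println!("triangle # {}: x{} = {:?}", t, m, x);
        }
    }
}

fn main() {
    // [num_triangle][nnode=3][ndim=2]; only the outer level is growable.
    let triangles: Vec<[[f64; 2]; 3]> = vec![
        [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
        [[1.0, 0.0], [1.2, 1.5], [0.0, 1.0]],
    ];

    // The closure is unchanged: `[..]` borrows the inner array as a slice.
    let get_x = |t: usize, m: usize| &triangles[t][m][..];

    print_x(triangles.len(), get_x);
}
```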

(Of course, in practical code, it would be more idiomatic to use named struct types for the points and triangles.)
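
A minimal sketch of what that could look like is below. The type names Point and Triangle, the constructor Triangle::new, and the helper describe are all hypothetical choices for illustration, not taken from any existing library.

```rust
/// A 2D point (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

/// A triangle with exactly three vertices.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Triangle {
    nodes: [Point; 3],
}

impl Triangle {
    /// Builds a triangle from raw [x, y] coordinate pairs.
    fn new(coords: [[f64; 2]; 3]) -> Self {
        Triangle {
            nodes: coords.map(|[x, y]| Point { x, y }),
        }
    }
}

/// Formats one line per node, similar in spirit to print_x.
fn describe(triangles: &[Triangle]) -> Vec<String> {
    let mut lines = Vec::new();
    for (t, triangle) in triangles.iter().enumerate() {
        for (m, p) in triangle.nodes.iter().enumerate() {
            lines.push(format!("triangle # {}: x{} = ({}, {})", t, m, p.x, p.y));
        }
    }
    lines
}

fn main() {
    // The outer Vec can still grow at runtime.
    let triangles = vec![
        Triangle::new([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),
        Triangle::new([[1.0, 0.0], [1.2, 1.5], [0.0, 1.0]]),
    ];

    for line in describe(&triangles) {
        println!("{}", line);
    }
}
```

With named fields there are no bare indices for the node or the dimension, so the closure and the NNODE constant are not needed in this variant.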

For the record, that's now implemented on GCC's HEAD:
https://gcc.gnu.org/g:fa9bda3ea4315a7285edbc99323e3fa7885cbbb8
