Need help fixing a memory leak in my matrix implementation

I have my own Matrix implementation, and it uses a huge amount of memory during the .dot() operation:

use std::fmt::Debug;
use std::io::{Error, ErrorKind};
use std::ops::{Add, AddAssign, Div, Mul, Neg, Sub, SubAssign};
use rand::{Rng, thread_rng};
use rand::distributions::uniform::SampleUniform;

trait Numeric:
    Div<Output = Self>
    + Add<Output = Self>
    + Sub<Output = Self>
    + Mul<Output = Self>
    + Default
    + Debug
    + PartialOrd
    + SampleUniform
    + Clone
    + Copy
    + Neg<Output = Self>
    + AddAssign
    + SubAssign {}

impl<T> Numeric for T where T:
    Div<Output = Self>
    + Add<Output = Self>
    + Sub<Output = Self>
    + Mul<Output = Self>
    + Default
    + Debug
    + PartialOrd
    + SampleUniform
    + Clone
    + Copy
    + Neg<Output = Self>
    + AddAssign
    + SubAssign {}

macro_rules! impl_matrix_ops_for_scalar {
    ($($T:ty),*) => {
        $(
            impl Add<Matrix<$T>> for $T {
                type Output = Matrix<$T>;

                fn add(self, mut matrix: Matrix<$T>) -> Self::Output {
                    matrix.assign(matrix.data.iter().map(|&item| item + self).collect()).unwrap();

                    matrix
                }
            }

            impl Sub<Matrix<$T>> for $T {
                type Output = Matrix<$T>;

                fn sub(self, mut matrix: Matrix<$T>) -> Self::Output {
                    matrix.assign(matrix.data.iter().map(|&item| self - item).collect()).unwrap();

                    matrix
                }
            }

            impl<'a> Sub<&'a mut Matrix<$T>> for $T {
                type Output = Matrix<$T>;

                fn sub(self, matrix: &'a mut Matrix<$T>) -> Self::Output {
                    matrix.assign(matrix.data.iter().map(|&item| self - item).collect()).unwrap();

                    matrix.clone()
                }
            }

            impl Mul<Matrix<$T>> for $T {
                type Output = Matrix<$T>;

                fn mul(self, mut matrix: Matrix<$T>) -> Self::Output {
                    matrix.assign(matrix.data.iter().map(|&item| item * self).collect()).unwrap();

                    matrix
                }
            }

            impl Div<Matrix<$T>> for $T {
                type Output = Matrix<$T>;

                fn div(self, mut matrix: Matrix<$T>) -> Self::Output {
                    matrix.assign(matrix.data.iter().map(|&item| self / item).collect()).unwrap();

                    matrix
                }
            }
        )*
    };
}

impl_matrix_ops_for_scalar!(f64, f32, i8, i16, i32, i64, i128, u8, u16, u32, u64, u128, usize);

#[derive(Debug, Clone, Default)]
pub struct Shape {
    pub rows: usize,
    pub cols: usize,
}

#[derive(Debug, Clone, Default)]
pub struct Matrix<N> {
    pub shape: Shape,
    data: Vec<N>,
}

impl<N> Matrix<N> where N: Default + Copy + Clone + Debug {
    pub fn new(rows: usize, cols: usize) -> Self {
        Self {
            shape: Shape { rows, cols },
            data: (0..rows * cols).map(|_| N::default()).collect(),
        }
    }

    pub fn new_empty(rows: usize, cols: usize) -> Self {
        Self {
            shape: Shape { rows, cols },
            data: vec![],
        }
    }

    pub fn assign(&mut self, data: Vec<N>) -> Result<(), Error> {
        if data.len() != self.shape.rows * self.shape.cols {
            return Err(Error::new(ErrorKind::Other, "Dataset size is different from matrix size"));
        }

        self.data = data;

        Ok(())
    }

    pub fn get_item(&mut self, row: usize, col: usize) -> Option<&mut N> {
        self.data.get_mut(row * self.shape.cols + col)
    }

    pub fn get_row(&mut self, row: usize) -> &mut [N] {
        &mut self.data[row * self.shape.cols .. row * self.shape.cols + self.shape.cols]
    }
    
    pub fn transpose(&mut self) -> Self {
        let mut output = Matrix::new(self.shape.cols, self.shape.rows);
        for i in 0..self.shape.rows {
            for j in 0..self.shape.cols {
                *output.get_item(j, i).unwrap() = self.get_item(i, j).unwrap().clone()
            }
        }
        output
    }
}

impl<N> Matrix<N> where N: Numeric {
    pub fn randomize(mut self, min: N, max: N) -> Self {
        let mut rng = thread_rng();
        self.data = (0..self.shape.rows * self.shape.cols).map(|_| rng.gen_range(min..max)).collect();

        self
    }

    pub fn dot(&mut self, other: &mut Self) -> Self {
        assert_eq!(
            self.shape.cols, other.shape.rows,
            "Cannot multiply matrices A ({:?}) and B ({:?}), \
            please check first matrix cols amount equals to second matrix rows amount",
            self.shape, other.shape
        );

        let mut output = Matrix::new_empty(self.shape.rows, other.shape.cols);

        for i in 0..self.shape.rows {
            for j in 0..output.shape.cols {
                let mut item = N::default();
                for k in 0..other.shape.rows {
                    item += self.get_item(i, k).unwrap().clone() * other.get_item(k, j).unwrap().clone();
                }
                output.data.push(item);
            }
        }

        output
    }
}

I put together a minimal example that reproduces the problem.

The error I get is:

memory allocation of 42949672960 bytes failed

To provide more context: I use this matrix in my own neural network implementation.

Could somebody help me figure out what I did wrong?

P.S. I know I can use ndarray and the like, but I am interested in how this works under the hood (and also in how to fix memory leak issues).

This is likely not a memory leak.

The error message tells you that a single allocation of size 42949672960 bytes failed. Now that is exactly 40 GB. If something is trying to allocate 40 GB in a single chunk, that code is broken with overwhelming probability.

Incidentally, this amount is about 1000 times your matrices' total size (2 × 500 × 5000 × 8 is only 40 million bytes, not 40 billion). This might be something you want to look into.
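Just to spell out that arithmetic (a standalone sketch, not from the thread; the 500 and 5000 dimensions are the ones quoted from the linked example, and `size_check` is a made-up name):

```rust
// Hypothetical sanity check of the sizes involved. Returns the matrices'
// total size in bytes and how many times larger the failed allocation is.
fn size_check() -> (u64, u64) {
    // Two 500 x 5000 matrices of f64 (8 bytes per element).
    let matrices_total: u64 = 2 * 500 * 5000 * 8;
    // The single allocation reported in the error message.
    let failed_alloc: u64 = 42_949_672_960;
    (matrices_total, failed_alloc / matrices_total)
}

fn main() {
    let (total, ratio) = size_check();
    println!("matrices take {} bytes; the failed allocation is {}x that", total, ratio);
}
```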

By the way, there are no (obvious) memory leaks in your code. The ways Rust programs usually leak memory by accident are:

  • explicit mem::forget()
  • explicit ManuallyDrop or MaybeUninit
  • reference cycles using Rc or Arc
  • unsafe code

but none of these is present in your code. The only place you ever grow a vector is the .push() in Matrix::dot(), which always pushes into a freshly created vector, so it is not the case either that your collections are growing indefinitely.
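For illustration (this is not from the code in the question), the Rc reference-cycle case from that list looks like the sketch below; `Node` and `leak_a_cycle` are made-up names:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Two nodes that point at each other through Rc never reach a strong
// count of zero, so their destructors never run and the memory is
// leaked, all in safe code.
struct Node {
    other: RefCell<Option<Rc<Node>>>,
}

fn leak_a_cycle() -> usize {
    let a = Rc::new(Node { other: RefCell::new(None) });
    let b = Rc::new(Node { other: RefCell::new(Some(Rc::clone(&a))) });
    *a.other.borrow_mut() = Some(Rc::clone(&b));
    // `a` is kept alive both by this binding and by `b`; when both
    // bindings go out of scope the strong counts only drop to 1.
    Rc::strong_count(&a)
}
```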

The problem will probably be somewhere else.


I don't know where the bug is, but I have some suspicions:

  1. Your linked example is going to produce a 5000 by 5000 matrix.
  2. As @H2CO3 noted, in the loop for dot you push entries onto output.data one by one.
  3. In other places you use collect. I think the FromIterator implementation doesn't use the size hint, but I could be wrong.
  4. The allocation that fails has a very suspicious size. It's exactly 2^33 * 5 bytes.

It's possible that you're hitting some weird allocator behavior that keeps doubling the size of an allocation, or worse. You should use with_capacity and reserve for large vectors (or really any vector whose final size you know up front), so that you allocate no more than necessary. You can also use the fallible versions of these functions, which will tell you exactly which allocation fails.
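A minimal sketch of that advice, using plain row-major slices instead of the Matrix type from the question (`dot_preallocated` is a hypothetical name, not the author's API):

```rust
// Matrix product that sizes the output buffer up front. `try_reserve`
// reports a failed allocation as a Result instead of aborting the
// whole process the way a plain allocation failure does.
fn dot_preallocated(
    a: &[f64], a_rows: usize, a_cols: usize,
    b: &[f64], b_cols: usize,
) -> Result<Vec<f64>, std::collections::TryReserveError> {
    let a_cols_b_rows = a_cols; // inner dimension shared by both matrices
    let mut out: Vec<f64> = Vec::new();
    out.try_reserve(a_rows * b_cols)?; // exactly one allocation, checked
    for i in 0..a_rows {
        for j in 0..b_cols {
            let mut item = 0.0;
            for k in 0..a_cols_b_rows {
                item += a[i * a_cols + k] * b[k * b_cols + j];
            }
            out.push(item); // never reallocates: capacity was reserved above
        }
    }
    Ok(out)
}
```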


On my machine, the program finishes with about 500 MB of memory used.


Finally, I found the issue. It was in my own code, in my "broadcasting" implementation. Now my neural network takes about 30 MB of memory.

Thank you very much to everyone for your hints and your time!

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.