I am trying to run K-means clustering in Rust, following this tutorial:
https://rust-ml.github.io/book/3_kmeans.html
The tutorial looks thorough, so I assumed this would be straightforward. Instead, my code fails to compile with an error I can't understand for such simple code.
1) Cargo.toml (Okay)
rand = "0.8" # Check for the latest version
ndarray = "0.16.1" # needed for concatenate
linfa = "0.7.0"
linfa-nn = "0.7.0"
linfa-clustering = "0.7.0" # https://crates.io/crates/linfa-clustering
2) "Use" (Okay)
use rand::Rng;
use ndarray::{Array2, ArrayView1, ArrayView2, concatenate};
use std::str::FromStr;
use linfa::prelude::*;
use linfa_clustering::KMeans;
use linfa_nn::distance::L2Dist; // Euclidean (L2) distance: sqrt(dx^2 + dy^2)
use ndarray::prelude::*;
use rand::prelude::*;
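For reference, the `L2Dist` imported above is the Euclidean distance. A minimal std-only sketch of that formula follows. The function name `l2_dist` is made up for illustration and is not part of linfa's API.

```rust
/// Euclidean (L2) distance between two points of equal dimension.
/// `l2_dist` is a hypothetical helper for illustration, not part of linfa_nn.
fn l2_dist(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "points must have the same dimension");
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y).powi(2)) // squared difference per coordinate
        .sum::<f32>()                  // sum of squared differences
        .sqrt()                        // square root gives the L2 distance
}
```

For example, the distance between `[0.0, 0.0]` and `[3.0, 4.0]` is `5.0`.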
3) Random Noise Squares (Okay)
// Random-noise function generated by ChatGPT and then modified, since the tutorial doesn't provide one.
fn create_square(center: [f32; 2], half_width: f32, num_points: usize) -> Array2<f32> {
    let mut rng = rand::thread_rng();
    let mut points = Vec::with_capacity(num_points);
    for _ in 0..num_points {
        // gen_range(start..end) returns a random value in [start, end): start inclusive, end exclusive.
        let x = rng.gen_range((center[0] - half_width)..(center[0] + half_width));
        let y = rng.gen_range((center[1] - half_width)..(center[1] + half_width));
        points.push(vec![x, y]);
    }
    // Flatten the (x, y) pairs into one Vec and reshape it to num_points rows by 2 columns.
    let arr: Array2<f32> = Array2::from_shape_vec((num_points, 2), points.into_iter().flatten().collect()).unwrap();
    return arr;
}
4) Concatenate Random Noise Squares (Okay)
// Copy of the tutorial's code for building a random distribution of points
fn get_random_points() -> Array2<f32> {
    let square_1: Array2<f32> = create_square([7.0, 5.0], 1.0, 150); // Cluster 1
    let square_2: Array2<f32> = create_square([2.0, 2.0], 2.0, 150); // Cluster 2
    let square_3: Array2<f32> = create_square([3.0, 8.0], 1.0, 150); // Cluster 3
    let square_4: Array2<f32> = create_square([5.0, 5.0], 9.0, 300); // A bunch of noise across them all
    let data: Array2<f32> = ndarray::concatenate(
        Axis(0),
        &[
            square_1.view(),
            square_2.view(),
            square_3.view(),
            square_4.view(),
        ],
    )
    .expect("An error occurred while stacking the dataset");
    return data;
}
5) Run the Model (FAILS)
// Run the model (this is the function that fails to compile)
fn run_model(data: Array2<f32>) {
    let dataset = DatasetBase::from(data);
    let rng = thread_rng(); // Random number generator
    let n_clusters = 3;
    let model = KMeans::params_with(n_clusters, rng, L2Dist)
        .max_n_iterations(200)
        .tolerance(1e-5)
        .fit(&dataset)
        .expect("Error while fitting KMeans to the dataset");
    let dataset = model.predict(dataset);
}
ERROR
I don't know what I have done wrong, but once I add the final run_model function the code no longer compiles. The compiler reports:
error[E0277]: the trait bound `linfa::DatasetBase<_, _>: From<ArrayBase<OwnedRepr<f32>, Dim<[usize; 2]>>>` is not satisfied
   --> src/lib.rs:130:19
    |
130 |     let dataset = DatasetBase::from(data);
    |                   ^^^^^^^^^^^ the trait `From<ArrayBase<OwnedRepr<f32>, Dim<[usize; 2]>>>` is not implemented for `linfa::DatasetBase<_, _>`
    |
    = help: the following other types implement trait `From<T>`:
              `linfa::DatasetBase<ndarray::ArrayBase<D, I>, ndarray::ArrayBase<ndarray::data_repr::OwnedRepr<()>, ndarray::dimension::dim::Dim<[usize; 1]>>>` implements `From<ndarray::ArrayBase<D, I>>`
              `linfa::DatasetBase<ndarray::ArrayBase<D, ndarray::dimension::dim::Dim<[usize; 2]>>, ndarray::ArrayBase<S, I>>` implements `From<(ndarray::ArrayBase<D, ndarray::dimension::dim::Dim<[usize; 2]>>, ndarray::ArrayBase<S, I>)>`
As far as I can tell, I have followed the tutorial exactly, so I don't know what the problem is. Is the tutorial out of date, or am I missing something obvious?
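One possible reading of the error, stated as an assumption rather than a confirmed diagnosis: the help text lists a `From<ndarray::ArrayBase<D, I>>` impl that looks identical to the one reported missing, which is what the compiler tends to print when two different versions of the same crate are in the build. Cargo.toml above requests ndarray 0.16.1, while linfa 0.7.0 may have been built against an older ndarray release (I believe 0.15, but that needs verifying). The std-only sketch below mimics that situation. The module names `ndarray_v15`, `ndarray_v16`, and `linfa_like`, and the helpers `to_v15` and `build_dataset`, are all hypothetical stand-ins and not real APIs.

```rust
// Stand-ins for two releases of the same crate. The compiler treats them
// as unrelated types even though their definitions are identical.
mod ndarray_v15 {
    pub struct Array2 {
        pub rows: usize,
        pub cols: usize,
        pub data: Vec<f32>,
    }
}

mod ndarray_v16 {
    pub struct Array2 {
        pub rows: usize,
        pub cols: usize,
        pub data: Vec<f32>,
    }
}

// Stand-in for a library that was compiled against the older release only.
mod linfa_like {
    use super::ndarray_v15::Array2;

    pub struct DatasetBase {
        pub records: Array2,
    }

    // `From` exists for the v15 array and for nothing else.
    impl From<Array2> for DatasetBase {
        fn from(records: Array2) -> Self {
            DatasetBase { records }
        }
    }
}

/// Hypothetical manual bridge from the "new" array type to the "old" one.
fn to_v15(a: ndarray_v16::Array2) -> ndarray_v15::Array2 {
    ndarray_v15::Array2 {
        rows: a.rows,
        cols: a.cols,
        data: a.data,
    }
}

/// Builds a dataset from a v16 array.
fn build_dataset(a: ndarray_v16::Array2) -> linfa_like::DatasetBase {
    // Writing `linfa_like::DatasetBase::from(a)` here would be rejected with
    // E0277, because `From<ndarray_v16::Array2>` is not implemented.
    linfa_like::DatasetBase::from(to_v15(a))
}
```

If this is what is happening in the real project, `cargo tree -d` should list `ndarray` more than once, and aligning the `ndarray` version in Cargo.toml with the one linfa pulls in would be the first thing to try.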
Rust is a bizarre language to me with endless obscure type, reference, mut, borrowing, and trait complaints. I can't make sense of it any time something breaks. It is very hard for me to understand what the error is trying to tell me. I am just staring at it.
Thanks for any help. It is appreciated.