Hello Rust Community,
I'm currently working on a project involving the linfa-clustering crate in Rust, specifically using the KMeans algorithm to cluster data. I'm encountering an issue with the centroids() method, which is returning centroids in an unexpected format.
According to the documentation and examples I've seen, centroids() should return the centroids as an array with two columns (for 2D data). However, when I apply it to a basic Iris dataset, for example, I get centroids like this:
[[63.49999999870944, 6.034615384556769, 2.784615384593053, 4.315384615375798],
[13.00000000161835, 5.027999999996514, 3.479999999940242, 1.4600000000174267],
[114.50000128197149, 6.645833287602458, 2.9333333196899436, 5.6624999115742085],
[38.00000000162139, 4.9839999999982885, 3.3560000000479704, 1.4679999999829425],
[138.5000012823333, 6.575000041839055, 3.0125000236583857, 5.441666738200124],
[89.49999999870744, 5.846153846225979, 2.7730769230986314, 4.303846153855346]]
Instead of the expected format:
[[ 6.25559436e+01 -1.97357959e-02],
[-6.30286820e+01 -5.90974586e-01],
[-1.35102612e+01 -7.83418521e-02],
[ 1.19636167e+01 1.08604407e+00],
[-3.85579122e+01 5.89060663e-01],
[ 3.75776032e+01 -1.05313324e+00]]
I'm unsure why the centroids are returned in four columns instead of two. I've double-checked my dataset and clustering parameters, but I can't seem to find the issue.
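One possibility I'm considering, which I have not confirmed: a centroid is the column-wise mean of the rows assigned to a cluster, so it would have one value per input column. If my dataset has four columns (the first centroid column looks like an averaged row id), four-column centroids would follow, and the two-column output might come from data that was reduced to 2D before clustering. Below is a minimal sketch of that reasoning in plain Rust. It does not use linfa, and the `centroid` helper and the sample rows are made up for illustration.

```rust
/// Hypothetical helper (not part of linfa): the centroid of a cluster,
/// computed as the column-wise mean of its rows.
fn centroid(rows: &[Vec<f64>]) -> Vec<f64> {
    // The centroid has as many entries as each row has columns.
    let n_cols = rows.first().map_or(0, |r| r.len());
    let mut mean = vec![0.0; n_cols];
    for row in rows {
        for (m, v) in mean.iter_mut().zip(row) {
            *m += v;
        }
    }
    let n = rows.len() as f64;
    mean.iter_mut().for_each(|m| *m /= n);
    mean
}

fn main() {
    // Made-up rows with four columns: an id-like column plus three measurements.
    let cluster = vec![
        vec![1.0, 5.1, 3.5, 1.4],
        vec![2.0, 4.9, 3.0, 1.4],
    ];
    let c = centroid(&cluster);
    // The number of centroid columns equals the number of input columns.
    println!("{} columns: {:?}", c.len(), c);
}
```

If this is right, I would need to drop the id column and project the features down to two dimensions before fitting KMeans to get two-column centroids.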
If anyone has experience with linfa-clustering or the KMeans algorithm in Rust and can provide insights or suggestions on how to correctly retrieve centroids in the expected format, I would greatly appreciate your help.
Here is my source code:
Thank you in advance!
Best regards
Victor Rodriguez