Struct format for many numeric features

Hi!

I am developing a tool to extract features from songs, and I can't seem to find a proper way to store them and expose them to the user. I have a Song struct that stores all kinds of information about a song: its title, album, path, etc. However, I am unsure how to handle numeric features.

Basically, I extract a bunch of features as a Vec (see https://github.com/Polochon-street/bliss-rs/blob/master/src/song.rs#L204-L210), which is stored in Song.analysis.

This has the advantage of being straightforward, because what most users will want to do with this Vec is turn it into an ndarray arr1 and compute the distance between two analyses to check how different two songs are, see https://github.com/Polochon-street/bliss-rs/blob/master/src/library.rs#L49-L52.

Though, in some cases, it would be nice to see which feature is which, without having to read the comment that says "vec[0] is the tempo, vec[1] is the zero-crossing rate", etc.
I've toyed with the idea of having an Analysis struct that has every field, like so:

struct Analysis {
   tempo: f32,
   zcr: f32,
   ...,
   chroma_10: f32
}

with a to_vec that would just return vec![self.tempo, self.zcr, ...].
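As a minimal sketch of this first option, here is the struct cut down to three features (the real one would have many more fields). The field order in to_vec is an assumption and would have to be documented, since callers depend on it:

```rust
/// Hypothetical three-feature version of the struct above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Analysis {
    pub tempo: f32,
    pub zcr: f32,
    pub chroma_10: f32,
}

impl Analysis {
    /// Flatten the named fields into a Vec, in declaration order.
    pub fn to_vec(&self) -> Vec<f32> {
        vec![self.tempo, self.zcr, self.chroma_10]
    }
}
```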

However, it has the disadvantage of having to set each field individually in the analysis step, which can make it quite verbose (especially since I have 10 unnamed chroma features on top of the rest, and there will probably be a lot more features in the future).

I could also keep the Vec and have a NamedAnalysis struct that I can just build from a Vec by implementing From<Vec<f32>> which would just return NamedAnalysis { tempo: vec[0], zcr: vec[1], ...}, so that only people actually needing to see which field is which will use that; the rest will stick to a Vec.
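A rough sketch of that second option, again with only three features. The length check is my own addition for illustration: a plain From cannot report an error, so this version panics on a Vec of the wrong size (TryFrom would be the non-panicking alternative):

```rust
/// Hypothetical named view over the raw feature Vec.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NamedAnalysis {
    pub tempo: f32,
    pub zcr: f32,
    pub chroma_10: f32,
}

impl From<Vec<f32>> for NamedAnalysis {
    /// Panics if `v` does not hold exactly three values.
    fn from(v: Vec<f32>) -> Self {
        assert_eq!(v.len(), 3, "expected exactly 3 features");
        NamedAnalysis {
            tempo: v[0],
            zcr: v[1],
            chroma_10: v[2],
        }
    }
}
```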

I don't really know which solution is best (and to be honest, there are probably better ones), so any kind of help will be greatly appreciated :slight_smile:

I have no idea if this approach has any shortcomings in practice, but you could use an enum together with a convenience impl of the Index trait:

use core::ops::IndexMut;
use core::ops::Index;

enum AnalysisIndex {
    Tempo,
    Zcr,
    // …
    Chroma10,
}

impl<T> Index<AnalysisIndex> for Vec<T>  {
    type Output = T;

    fn index(&self, index: AnalysisIndex) -> &Self::Output {
        &self[index as usize]
    }
}
impl<T> IndexMut<AnalysisIndex> for Vec<T> {
    fn index_mut(&mut self, index: AnalysisIndex) -> &mut Self::Output {
        &mut self[index as usize]
    }
}


fn main() {
    use AnalysisIndex::*;
    let x = vec![1,2,3];
    println!("{}", x[Tempo]);
}


It has the advantage of being straightforward, because the thing most of the users will want to do with this Vec is to make it an ndarray's arr1, and compute distance between two analysis to check how different songs are

That sounds less intuitive than the alternative. A Vec is not just a group of elements, it's a collection that can grow and shrink. If the length is always the same, I would use an array. If the elements have unique names, I would use a struct and name them. If the user is always going to want something in a certain format, I would provide it to them in that format (and maybe put it behind a feature if it uses a non-std dependency).

From what I can tell the only reason that Vec is being used is because it allows the user to turn it into this type from ndarray. A custom type that refers to these things directly by their name and has a From implementation to integrate with that library sounds perfect. It might mean writing some code that's a little boring but undeniably explicit, and if that also means less head-scratching for your user, that sounds like good code to me.

You can use a repr(C) struct and cast it as an array reference:

#[repr(C)]
struct Analysis {
   tempo: f32,
   zcr: f32,
   chroma_10: f32
}
type AnalysisArr = [f32; 3];

impl AsRef<AnalysisArr> for Analysis {
  fn as_ref(&self) -> &AnalysisArr {
    assert_eq!(std::mem::size_of::<Self>(), std::mem::size_of::<AnalysisArr>());
    unsafe {
        &*(self as *const Self as *const AnalysisArr)
    }
  }
}

I'd definitely go with the struct, and provide a distance measure. There should be a specific metric for computing the distance between two analyses, and users most likely won't be able to figure it out on their own.
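To illustrate, a minimal sketch of a struct that ships its own distance method. The plain Euclidean metric and the three fields are assumptions picked for the example, not necessarily the metric the library should use:

```rust
/// Hypothetical three-feature analysis.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Analysis {
    pub tempo: f32,
    pub zcr: f32,
    pub chroma_10: f32,
}

impl Analysis {
    fn to_array(&self) -> [f32; 3] {
        [self.tempo, self.zcr, self.chroma_10]
    }

    /// Euclidean distance between two analyses.
    pub fn distance(&self, other: &Self) -> f32 {
        let a = self.to_array();
        let b = other.to_array();
        a.iter()
            .zip(b.iter())
            .map(|(x, y)| (x - y).powi(2))
            .sum::<f32>()
            .sqrt()
    }
}
```

Users then call a.distance(&b) and never need to know how the features are laid out.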

Or even better, you could use an opaque type that is an array internally and provide methods to extract useful data. This allows you to change the format and even the analysis itself later.
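Something along these lines, as a sketch only. The accessor names and the three-element layout are made up for the example:

```rust
const NUM_FEATURES: usize = 3;

/// Opaque wrapper: the array layout is private and can change later.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Analysis {
    features: [f32; NUM_FEATURES],
}

impl Analysis {
    pub fn new(features: [f32; NUM_FEATURES]) -> Self {
        Analysis { features }
    }

    /// Named accessors hide which slot holds which feature.
    pub fn tempo(&self) -> f32 {
        self.features[0]
    }

    pub fn zcr(&self) -> f32 {
        self.features[1]
    }

    /// Read-only view of the raw values for advanced users.
    pub fn as_slice(&self) -> &[f32] {
        &self.features
    }
}
```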

Thanks to everyone for their super helpful answers.

I'm leaning towards an opaque struct with an internal array that exposes a distance metric (like I currently do), plus maybe a to_array() / to_arr1() method, with the docs stating that it should only be used by "expert" users.

However, I'm still unsure how to give the user a full {tempo: 3., zcr: 2.1}, etc. result.
Would the enum steffahn suggested pair well with it?
I'm thinking that no one will actually want to manipulate such an Analysis struct field by field, and that it would only be used when one just wants to see the values. So perhaps it could simply be the Display implementation of my opaque struct, which would "iterate" through AnalysisIndex and write a string like "Analysis { Tempo: x, Zcr: y }", etc.?
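A rough sketch of what I have in mind, using only std. The hand-written ALL list is a hypothetical stand-in for whatever would enumerate the variants, and the enum is cut down to three entries:

```rust
use std::fmt;

#[derive(Debug, Clone, Copy)]
pub enum AnalysisIndex {
    Tempo,
    Zcr,
    Chroma10,
}

impl AnalysisIndex {
    /// Hypothetical hand-maintained list of every variant, in index order.
    const ALL: [AnalysisIndex; 3] = [
        AnalysisIndex::Tempo,
        AnalysisIndex::Zcr,
        AnalysisIndex::Chroma10,
    ];
}

pub struct Analysis {
    features: [f32; 3],
}

impl fmt::Display for Analysis {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Analysis {{ ")?;
        for (i, feature) in AnalysisIndex::ALL.iter().enumerate() {
            if i > 0 {
                write!(f, ", ")?;
            }
            // The variant's Debug name labels the value stored at its index.
            write!(f, "{:?}: {}", feature, self.features[*feature as usize])?;
        }
        write!(f, " }}")
    }
}
```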

What do you guys think?

I made a tentative PR incorporating some of the feedback from this thread: Change `analysis` from Vec<f32> to `Analysis` (Pull Request #7, Polochon-street/bliss-rs).

However, I am still wondering how to display the specifics of the analysis to the user. I could implement the suggested Index trait, but that would still not make it possible to "display everything at once".

Would a fully named struct, with my to_arr / to_vec changed to just return vec![self.tempo, ...], be better?

[EDIT] Just a clarification: the field names would be irrelevant for 90% of the users.
Basically, I would expect most people to just use the Analysis struct as is, using provided distance functions etc without caring about what's happening in the background.

But I would also expect some users (and me) to want to tweak for example the distance metric, or debug why two songs that were shown "close" don't sound alike. That's when you would want the meaning / name of each feature. But otherwise, they would be irrelevant to you.

That's what makes me uneasy at the idea of showing all the fields to the user directly - I'm afraid that would just confuse the majority.

If the fields have different/additional semantics other than just a bunch of numbers, then definitely do create a struct with named fields.

If the number of dimensions grows and you don't want to maintain it manually, write a derive proc-macro that automatically enumerates all fields and adds them to a Vec or an array. While you are at it, you can write another macro for performing the inverse operation as well.
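A derive proc-macro needs its own crate, so as a lighter illustration of the same idea, here is a sketch using a declarative macro_rules! macro instead. This is a swapped-in technique, not a proc-macro, and the macro name feature_struct is made up. It generates the struct, a field count, and both conversions from a single field list:

```rust
macro_rules! feature_struct {
    ($name:ident { $($field:ident),+ $(,)? }) => {
        #[derive(Debug, Clone, Copy, PartialEq)]
        pub struct $name {
            $(pub $field: f32),+
        }

        impl $name {
            /// Number of fields, derived from the field list itself.
            pub const COUNT: usize = [$(stringify!($field)),+].len();

            /// Struct to array, in declaration order.
            pub fn to_array(&self) -> [f32; $name::COUNT] {
                [$(self.$field),+]
            }

            /// Inverse operation: array back to named fields.
            pub fn from_array(values: [f32; $name::COUNT]) -> Self {
                let [$($field),+] = values;
                $name { $($field: $field),+ }
            }
        }
    };
}

// Adding a feature means adding one name to this list.
feature_struct!(Analysis { tempo, zcr, chroma_10 });
```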

I've changed it to something like this (see Change `analysis` from Vec<f32> to `Analysis`, Pull Request #7, Polochon-street/bliss-rs):

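// Imports assumed from the crates this snippet relies on (ndarray, strum, strum_macros).
use std::fmt;
use std::ops::Index;

use ndarray::{arr1, Array, Array1};
use strum::IntoEnumIterator;
use strum_macros::EnumIter;

// TODO is there a way to do that better?
const NUMBER_FEATURES: usize = 20;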

#[derive(Default, PartialEq, Clone, Copy)]
pub struct Analysis {
    internal_analysis: [f32; NUMBER_FEATURES],
}

#[derive(Debug, EnumIter)]
pub enum AnalysisIndex {
    Tempo,
    Zcr,
    MeanSpectralCentroid,
    StdDeviationSpectralCentroid,
    MeanSpectralRolloff,
    StdDeviationSpectralRolloff,
    MeanSpectralFlatness,
    StdDeviationSpectralFlatness,
    MeanLoudness,
    StdDeviationLoudness,
    Chroma1,
    ...,
    Chroma10,
}

impl Index<AnalysisIndex> for Analysis {
    type Output = f32;

    fn index(&self, index: AnalysisIndex) -> &f32 {
        &self.internal_analysis[index as usize]
    }
}

impl fmt::Debug for Analysis {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut f = f.debug_struct("Analysis");
        for feature in AnalysisIndex::iter() {
            f.field(&format!("{:?}", feature), &self[feature]);
        }
        f.finish()
    }
}

impl Analysis {
    pub(crate) fn new(analysis: [f32; NUMBER_FEATURES]) -> Analysis {
        Analysis {
            internal_analysis: analysis,
        }
    }

    /// Return an ndarray `arr1`.
    ///
    /// Particularly useful if you want to make a custom distance metric.
    pub fn to_arr1(&self) -> Array1<f32> {
        arr1(&self.internal_analysis)
    }

    #[allow(dead_code)]
    pub(crate) fn to_vec(&self) -> Vec<f32> {
        self.internal_analysis.to_vec()
    }

    pub fn distance(&self, other: &Self) -> f32 {
        let a1 = self.to_arr1();
        let a2 = other.to_arr1();
        let m = Array::eye(NUMBER_FEATURES);

        // Reuse a1 instead of converting self a second time.
        (&a1 - &a2).dot(&m).dot(&(&a1 - &a2)).sqrt()
    }
}

It does exactly what I want, but I'm not 100% satisfied with it.
I'm a bit skeptical about the NUMBER_FEATURES constant, for example, which I didn't really manage to tie to AnalysisIndex.
I'm also not sure how easy it will be to add or remove features.

But it's certainly way better than just a simple Vec<f32> :grinning_face_with_smiling_eyes:

Perhaps it could be improved even further?

One way to do it is with the help of some derive macros. For example the crate strum offers EnumCount.

Apparently you’re already using strum. Try adding EnumCount to the list of derives. Then use AnalysisIndex::COUNT instead of (or as the definition of) NUMBER_FEATURES.
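To show the shape of the result without pulling in strum, here is a std-only sketch where COUNT is written by hand as a stand-in for the constant the derive provides. The hand-written version relies on Chroma10 staying the last variant, which the derive would not need. The enum is shortened to three variants:

```rust
use std::ops::Index;

#[derive(Debug, Clone, Copy)]
pub enum AnalysisIndex {
    Tempo,
    Zcr,
    Chroma10,
}

impl AnalysisIndex {
    /// Hand-written stand-in for the derived constant.
    /// Assumes `Chroma10` remains the last variant.
    pub const COUNT: usize = AnalysisIndex::Chroma10 as usize + 1;
}

pub struct Analysis {
    // The array length now follows the enum instead of a separate constant.
    internal_analysis: [f32; AnalysisIndex::COUNT],
}

impl Index<AnalysisIndex> for Analysis {
    type Output = f32;

    fn index(&self, index: AnalysisIndex) -> &f32 {
        &self.internal_analysis[index as usize]
    }
}
```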

Oh, didn't know about that one. It should make everything a bit clearer.

The struct should be pretty okay now. Thanks! :slight_smile:
