Proper signature for a function that makes a playlist

Hi,

I am refactoring a library that makes playlists from lists of songs, but I am unable to come up with a good signature for the function that actually makes the playlist (I already have one, but it's not great), so I thought maybe people had some good ideas.

The very high level description is that this function will make a "smart" playlist from a song or a list of initial songs, choosing among a larger pool of songs by measuring the distance between songs.
So, a naive signature would be fn make_playlist(initial_songs: &[Song], song_pool: &[Song]) -> Vec<Song>

However, there a few parameters one might want to customize:

  • First, there are several ways to "measure distance between songs", so we would like to be able to provide custom distance metrics to evaluate that distance. That would be an extra parameter distance: F: Fn(&[Song], &Song) -> f32.
  • Second, there are several ways to order songs to make a playlist. We might want to take the songs which are closest to first song(s), and then put them in order. Or we might want to take the song closest to the first song(s), let's call it s1, then the song closest to s1, etc. So we would need an extra function as a parameter sort, and this is where I'm not sure what the signature should be.
  • Finally, users usually don't want to order their entire song pool, because some have libraries of 300k+ songs, so it would be a bit overkill to sort everything when you just want a playlist of 300 songs. We would then add a parameter number_songs: usize.

That gives us a signature like so:

fn make_playlist(
   initial_songs: &[Song],
   song_pool: &[Song],
   distance: Fn(&[Song], &Song) -> f32
   sort: ???
   number_songs: usize
) -> ???

What I can't wrap my head around is:

  • What should be the return value of this function? Should it be a Vec? An iterator over Songs? Should it sort song_pool in place? Something else entirely? (And should it conceptually return initial_songs too at the beginning? These will most likely be in song_pool, so I think it would make sense, but it's debatable)
  • What should be the type of sort? I would say Fn(&[Song], &[Song], distance) -> Iter<Song>, but again, it's debatable.

I hope all of this make sense, it's been bugging me for some days, but I haven't been able to find a good solution for it. Cheers!

If you can always construct the playlist one-by-one, then the return type should be an iterator.

The "sorting" function is not really a sorting function but a strategy for selecting the next item. I don't really see its role here, because if it returns an iterator over all "best" songs in order, then it does all of the work, but if it doesn't, then it's useless — it doesn't really have any more information (or job to do) than the distance function.

I'd maybe make a builder interface to select options and make the result of that builder an iterator producing the songs. (And maybe that iterator type could also have methods to get back the options you set previously.) An iterator could just take advantage of the methods already defined for it. Such as take for example.

But, if you're doing side effectul things to get the next song, I wouldn't implement Iterator then. I'd implement a next method returning a Result, the caller would have to call in a loop.

The playlist can always be constructed one by one - I'm thinking of some edge cases like playlists of albums, where you'd return chunks of songs, but even in that case I guess you could keep a state and return the songs one by one?

About the sorting function - it represents different strategies of sorting, because there is no best song order. The information it contains is "do we want to make a playlist where songs are the closest possible to the first song, or do we want to make a playlist where songs are closest to each other, forming a chain?". I'm thinking that in the future we'll probably find other methods, so that's why making it a function sounds like a good idea - what do you think?

I'd maybe make a builder interface to select options and make the result of that builder an iterator producing the songs. (And maybe that iterator type could also have methods to get back the options you set previously.) An iterator could just take advantage of the methods already defined for it. Such as take for example.

@erelde This is the pattern you are talking about, right? Builder - Rust Design Patterns. I don't think generating a new song for the playlist has any side effect - it would just keep a state of the song pool and the initial songs (poping songs from there as the playlist is generated), but that's it.

It seems the consensus is on an Iterator then? :smiley:

1 Like

Hi!

Sorry for the double post (and tell me if I should create a new post in the "review" section?), but I just finished implementing iterators for my problem, and it would be nice to have some feedback, because some of it looks a bit silly. It is implemented here Tweak the behavior of `Library::playlist_from_custom` by Polochon-street · Pull Request #76 · Polochon-street/bliss-rs · GitHub and Tweak the behavior of `Library::playlist_from_custom` by Polochon-street · Pull Request #76 · Polochon-street/bliss-rs · GitHub here.

The nice thing is that for my "song_to_song" sorting, it takes advantage of the Iterator trait nicely, splitting expensive calculation in each call to "next". However, the implementation of "closest_to_songs" is basically doing

let mut playlist = song_pool.to_vec();
playlist.sort_by_cached_key(...);
return playlist.to_vec().into_iter();

Which seems a bit silly. The difference with the song_to_song function is that the closest_to_songs is way faster, since it only needs to sort one array once, so there is no need (and no possibility even?) to split it over several next calls.
I don't think there is a way around the first .to_vec(), but clippy tells me to use playlist.iter().cloned() on the one in return. However, upon doing this, I get ("playlist" is "candidate_songs"):

error[E0515]: cannot return value referencing local variable `candidate_songs`
   --> src/playlist.rs:128:5
    |
128 |     candidate_songs.iter().cloned()
    |     ---------------^^^^^^^^^^^^^^^^
    |     |
    |     returns a value referencing data owned by the current function
    |     `candidate_songs` is borrowed here
    |
    = help: use `.collect()` to allocate the iterator

Not sure what's happening there.

All in all I think it looks better than the previous Vec version, but let me know what you think :slight_smile:
Thanks!

.iter() returns an iterator that contains a reference to the vector/slice, in order to be able to yield references to its items. That's very obviously wrong (and impossible thanks to borrow checking) if the vector is a local variable, which is dropped at the end of the function body.

If you already have a Vec, why are you calling .to_vec() on it again? You should just call .into_iter().

1 Like

Oh, write, thanks :man_facepalming: it went through so many iterations, I completely overlooked that. Thanks!

Now I'm a bit unsure what type these sorting functions should have. The first iteration has

pub fn song_to_song<'a, T:...>(...) -> impl Iterator<Item = T> + 'a

But then on the actual code that uses this function, people can choose between the two (and probably more in the future) functions at some point, and the resulting code is

let sort = |x: &[LibrarySong<()>],
                        y: &[LibrarySong<()>],
                        z|
             -> Box<dyn Iterator<Item = LibrarySong<()>>> {
                match sub_m.is_present("seed") {
                    false => Box::new(closest_to_songs(x, y, z)),
                    true => Box::new(song_to_song(x, y, z)),
                }
            };

which doesn't look very pretty. Would it maybe be better to have
pub fn song_to_song<'a, T:...>(...) -> Box<dyn Iterator<Item = T> + 'a>
?
It is easier to manipulate on the code that actually uses these functions, but maybe a bit more annoying for people who want to create their own sorting functions. Is there a version more standard than the other?

Generally, I'd be upset if an API forced a heap allocation and dynamic dispatch where I don't need it. Box<dyn Trait> is impl Trait if the impl is provided for the Box, and for Iterator, it is, so there's nothing you can do with a Box<dyn Iterator> return type that you couldn't do with an impl Iterator, but the former unconditionally allocates and forces indirection.

It will always be a possibility that someone calling two -> impl Trait-returning functions of your library may want to use them in a uniform manner. It should be their responsibility to decide where the dynamic dispatch happens; you shouldn't make that decision on behalf of the user.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.