Filter/Reorder a list of language identifiers

I’m working on a crate called fluent-locale which is meant to provide language negotiation capabilities.

I’m looking for advice from the community on how to design an API for this capability.

In the simplest form, language negotiation takes a list of language identifiers, filters and reorders them according to some criteria.
For example, it may look like this:

let available = ["en-US", "de-DE", "es-AR"];
let requested = ["de", "fr", "en"];
let supported = negotiate_languages(available, requested);

The supported list is derived from available: its entries are filtered and reordered based on the data in requested.
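A minimal sketch of that filter-and-reorder behaviour, working directly on string slices. The prefix-matching rule here is an assumption made purely for illustration; it is not the matching that fluent-locale performs:

```rust
/// Returns the entries of `available` that match some entry of `requested`,
/// ordered by the position of the request they match.
///
/// Assumption for illustration: an available tag matches a requested tag when
/// it is identical to it or extends it with further `-`-separated subtags.
fn negotiate_languages<'a>(available: &[&'a str], requested: &[&str]) -> Vec<&'a str> {
    let mut supported: Vec<&'a str> = Vec::new();
    for req in requested {
        for av in available {
            let is_match = av
                .strip_prefix(*req)
                .map_or(false, |rest| rest.is_empty() || rest.starts_with('-'));
            // Skip entries that an earlier request already selected.
            if is_match && !supported.contains(av) {
                supported.push(*av);
            }
        }
    }
    supported
}
```

With the lists from the example, this returns "de-DE" followed by "en-US": "fr" matches nothing, and "es-AR" is never requested.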

The first issue is that since my functionality is really similar to drain or filter, I’m wondering if it should be some trait on Iterator, Array or Vec instead. In particular, I’d like to avoid cloning, but I’m not sure how to do it.

The second issue is that behind the scenes I won’t operate on strings. I’ll operate on structs called LanguageIdentifier (from unic-langid), which implement TryFrom<&str>.

If my API accepts &str, then I have to decide what to do if TryFrom fails - in other words, if someone passes a list of locales, and one of the locales is not valid, I’ll need to decide how to handle that scenario within my API.

So it’s tempting to instead ask users to create lists of LanguageIdentifier and then negotiate them against lists of other LanguageIdentifier.
It would be a bit less convenient to work with, because in most cases apps will receive lists of strings, either from the user or from some settings. But letting the user decide how to handle invalid strings may make it worth not accepting strings in my API. What do you think?
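To illustrate what "letting the user decide" could look like on the caller's side, here is a sketch with two policies: fail on the first invalid string, or silently skip invalid ones. The LanguageIdentifier below is a simplified stand-in with a made-up validity rule, not the unic-langid type, and parse_all / parse_valid are hypothetical helper names:

```rust
use std::convert::TryFrom;

// Stand-in for the real type; the validation rule is an assumption.
#[derive(Debug, Clone, PartialEq, Eq)]
struct LanguageIdentifier(String);

#[derive(Debug, PartialEq, Eq)]
struct ParseError(String);

impl TryFrom<&str> for LanguageIdentifier {
    type Error = ParseError;

    fn try_from(s: &str) -> Result<Self, Self::Error> {
        // Accept non-empty ASCII-alphanumeric subtags separated by '-'.
        let valid = s
            .split('-')
            .all(|part| !part.is_empty() && part.chars().all(|c| c.is_ascii_alphanumeric()));
        if valid {
            Ok(LanguageIdentifier(s.to_string()))
        } else {
            Err(ParseError(s.to_string()))
        }
    }
}

/// Policy 1: stop at the first invalid string and report it.
fn parse_all(inputs: &[&str]) -> Result<Vec<LanguageIdentifier>, ParseError> {
    inputs.iter().map(|s| LanguageIdentifier::try_from(*s)).collect()
}

/// Policy 2: drop invalid strings and keep the rest.
fn parse_valid(inputs: &[&str]) -> Vec<LanguageIdentifier> {
    inputs
        .iter()
        .filter_map(|s| LanguageIdentifier::try_from(*s).ok())
        .collect()
}
```

Either way, the negotiation function itself only ever sees already-parsed identifiers and has no error case of its own.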

If I’m right, then the API could be a trait on Vec<LanguageIdentifier>, and the user would do something like:

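// Assuming req_strings and av_strings are slices of &str.
let requested: Vec<LanguageIdentifier> = req_strings.iter().map(|s| LanguageIdentifier::try_from(*s).unwrap()).collect();
let available: Vec<LanguageIdentifier> = av_strings.iter().map(|s| LanguageIdentifier::try_from(*s).unwrap()).collect();
let supported = available.negotiate_against(requested);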

Does that sound reasonable? What’s the idiomatic way to avoid cloning when it isn’t necessary? Should I consume available? Or should I operate on &LanguageIdentifier, copy the references into supported, and leave available intact?
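The two options in these questions can be sketched side by side. Neither variant clones anything: one returns references into available, the other moves the matching elements out of it. The type is a stand-in, matching by plain equality is an assumption for brevity, and both function names are hypothetical:

```rust
// Stand-in type; real matching would be more involved than `==`.
#[derive(Debug, Clone, PartialEq, Eq)]
struct LanguageIdentifier(String);

/// Borrowing variant: `available` stays usable afterwards and the result
/// holds references into it.
fn negotiate_borrowed<'a>(
    available: &'a [LanguageIdentifier],
    requested: &[LanguageIdentifier],
) -> Vec<&'a LanguageIdentifier> {
    requested
        .iter()
        .filter_map(|req| available.iter().find(|av| *av == req))
        .collect()
}

/// Consuming variant: matching elements are moved out of `available`,
/// and whatever did not match is dropped when the function returns.
fn negotiate_owned(
    mut available: Vec<LanguageIdentifier>,
    requested: &[LanguageIdentifier],
) -> Vec<LanguageIdentifier> {
    let mut supported = Vec::new();
    for req in requested {
        if let Some(pos) = available.iter().position(|av| av == req) {
            supported.push(available.remove(pos));
        }
    }
    supported
}
```

The borrowing variant is usually the less surprising default, since a caller who needs owned values can still clone the few results they keep.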

Any ideas on how to design such an API are appreciated!

Hope my brainstorm questions make sense :)

Taking IntoIterator<Item=TryInto<LanguageIdentifier>> would be most generic for callers, as they could use anything from a slice of strings, to an iterator that already has LanguageIdentifiers.

fn negotiate<E>(t: impl IntoIterator<Item=impl TryInto<LanguageIdentifier, Error=E>>) -> Result<Vec<LanguageIdentifier>, E> {
}

I’m not sure about adding traits to Vec. The task has nothing to do with a vec, specifically.
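A sketch of how the generic signature above could be filled in for the conversion step alone. The LanguageIdentifier is a stand-in with a made-up validity rule, and parse_list is a hypothetical name:

```rust
use std::convert::{TryFrom, TryInto};

// Stand-in for the real type; the validation rule is an assumption.
#[derive(Debug, Clone, PartialEq, Eq)]
struct LanguageIdentifier(String);

#[derive(Debug, PartialEq, Eq)]
struct ParseError(String);

impl TryFrom<&str> for LanguageIdentifier {
    type Error = ParseError;

    fn try_from(s: &str) -> Result<Self, Self::Error> {
        if !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '-') {
            Ok(LanguageIdentifier(s.to_string()))
        } else {
            Err(ParseError(s.to_string()))
        }
    }
}

/// Accepts anything iterable whose items convert into a LanguageIdentifier,
/// and stops at the first conversion error.
fn parse_list<T, E>(items: impl IntoIterator<Item = T>) -> Result<Vec<LanguageIdentifier>, E>
where
    T: TryInto<LanguageIdentifier, Error = E>,
{
    items.into_iter().map(|item| item.try_into()).collect()
}
```

Two details are worth knowing. Passing already-parsed LanguageIdentifier values works through the standard library's blanket impl, with std::convert::Infallible as the error type. Passing a borrowed slice such as &["en", "fr"] does not work as-is, because it iterates over &&str, for which no TryInto impl exists; the caller would need something like .iter().copied().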

So, this always returns LanguageIdentifier. I’d like to return what I was given.

I forgot to mention, there’s another struct, Locale, which has more data than LanguageIdentifier and implements Into<LanguageIdentifier>.

So, I’d like to be able to:

let supported = negotiate(&["en", "fr", "de"], &["de", "pl-PL", "de-AT"]);
assert_eq!(&supported, &["fr", "de"]);

let supported = negotiate(&[LanguageIdentifier::try_from("pl"), LanguageIdentifier::try_from("de")], &[LanguageIdentifier::try_from("en")]);
assert_eq!(&supported, &[LanguageIdentifier::try_from("en")]);

let supported = negotiate(&[Locale::try_from("pl"), Locale::try_from("de")], &[Locale::try_from("en")]);
assert_eq!(&supported, &[Locale::try_from("en")]);

Is that possible?

I’m not sure what the LanguageIdentifier vs Locale distinction gives you in this case. Do they compare differently? Do you just want to support both as input?

from/into consume the source object, so to return the unconverted type you’d have to require it to be cloneable.

You can let the user choose the type used for negotiation, if there’s a trait that is implemented for both Locale and LanguageIdentifier.

It starts looking like generics astronautics:

fn negotiate<L, I, E>(t: impl IntoIterator<Item = I>) -> Result<Vec<I>, E>
    where L: LanguageLike, I: TryInto<L, Error = E> + Clone {
}
negotiate::<LanguageIdentifier, _, _>(&["en", "fr", "de"]) == &["en"]

Hmm, I see. Thanks!

I realized that accepting a list of TryInto items (such as &str) forces me to handle conversion failures inside the negotiation function. So I think I’m OK with asking the user to build the list themselves (handling any errors) and pass it to me.

This means that I only need to support anything that can infallibly cast into LanguageIdentifier, and I only need a reference to it.

I tried that:

use std::collections::HashMap;

#[derive(PartialEq, Eq, Hash, Debug)]
pub struct LanguageIdentifier {}

impl LanguageIdentifier {
    fn try_from(_input: &str) -> Result<Self, ()> { Ok(LanguageIdentifier {}) }
}

#[derive(PartialEq, Eq, Hash, Debug)]
pub struct Locale {
    langid: LanguageIdentifier,
}

impl Locale {
    fn try_from(_input: &str) -> Result<Self, ()> { Ok(Locale {
        langid: LanguageIdentifier {}
    }) }
}

impl From<Locale> for LanguageIdentifier {
    fn from(input: Locale) -> Self {
        input.langid
    }
}

impl<'a> From<&'a Locale> for &'a LanguageIdentifier {
    fn from(input: &'a Locale) -> Self {
        &input.langid
    }
}

pub fn negotiate_languages<
    'a,
    A: Into<&'a LanguageIdentifier> +  PartialEq + Clone + 'a,
    R: Into<&'a LanguageIdentifier> + PartialEq + Clone + 'a,
>(
    _requested: impl IntoIterator<Item = &'a R>,
    available: impl IntoIterator<Item = &'a A>,
    default: Option<A>,
) -> Vec<A> {

    let mut result = vec![];
    let mut _av_map: HashMap<&LanguageIdentifier, &A> = HashMap::new();

    for _av in available.into_iter() {
        //let langid: &LanguageIdentifier = av.into();
        //av_map.insert(langid, av);
    }
    
    if let Some(default) = default {
        result.push(default);
    }
    

    return result;
}

pub fn main() {
    let langid_en = LanguageIdentifier::try_from("en").unwrap();
    let langid_pl = LanguageIdentifier::try_from("pl").unwrap();
    let langid_de = LanguageIdentifier::try_from("de").unwrap();
    let available = &[&langid_en, &langid_pl];
    let requested = &[&langid_de];

    let y = negotiate_languages(requested, available, Some(&langid_en));
    assert_eq!(y, &[&langid_en]);

    let loc_en = Locale::try_from("en").unwrap();
    let loc_de = Locale::try_from("de").unwrap();

    let available = &[&loc_en];
    let requested = &[&loc_de];

    let y = negotiate_languages(requested, available, Some(&loc_en));
    assert_eq!(y, &[&loc_en]);
}

It kind of works - https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8ef408037a747fbe0a2a66ecec145c00 - but I can’t find a way to get the map to work.

Does it seem sane to you @kornel?

There’s AsRef<T> for cases where something can be referenced as another type.

I’m not sure if Into can be used for references. It’s intended for owned types, and the trait has no place to declare that output depends on the input (as references would).
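A sketch of the AsRef approach, assuming both types can hand out a &LanguageIdentifier. The types are simplified stand-ins (a real Locale would carry more data), and matching by plain equality is an assumption for brevity:

```rust
#[derive(Debug, PartialEq, Eq)]
struct LanguageIdentifier(String);

// Stand-in: the real Locale holds additional data next to the langid.
#[derive(Debug, PartialEq, Eq)]
struct Locale {
    langid: LanguageIdentifier,
}

impl AsRef<LanguageIdentifier> for LanguageIdentifier {
    fn as_ref(&self) -> &LanguageIdentifier {
        self
    }
}

impl AsRef<LanguageIdentifier> for Locale {
    fn as_ref(&self) -> &LanguageIdentifier {
        &self.langid
    }
}

/// Returns references to the elements of `available` whose language
/// identifier equals one of the requested ones, in request order.
fn negotiate_languages<'a, R, A>(requested: &[R], available: &'a [A]) -> Vec<&'a A>
where
    R: AsRef<LanguageIdentifier>,
    A: AsRef<LanguageIdentifier>,
{
    let mut supported = Vec::new();
    for req in requested {
        let wanted: &LanguageIdentifier = req.as_ref();
        for av in available {
            let candidate: &LanguageIdentifier = av.as_ref();
            if candidate == wanted {
                supported.push(av);
            }
        }
    }
    supported
}
```

Because R and A are separate type parameters, the caller can request with LanguageIdentifier values and get back &Locale references, and nothing is cloned or consumed.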

Ok, tried that now - https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=ce837370696f6e5e2d14935a9c00180a

It seems closer to what I’m looking for, but still doesn’t compile :( Am I trying to do something that is inherently impossible in Rust?

Am I trying to do something that is inherently impossible in Rust?

Yes, because the same code would crash in C/C++:

let langid: &LanguageIdentifier = av.as_ref();
av_map.insert(langid, av);
  1. You’re taking a reference to av
  2. You’re moving av to a different address (in the hashmap)
  3. You expect langid that points to the old temporary location (variable in the loop) to still make sense

If LanguageIdentifier is copyable/cloneable, use HashMap<LanguageIdentifier, …> instead of HashMap<&LanguageIdentifier, …>.

Otherwise av.as_ref(), just like &av, is valid only as long as av doesn’t move (assignment, returning, and insertion into a container are all moves).
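Both ways out can be sketched as follows. The types are stand-ins, and index_borrowed / index_owned are hypothetical names. In the first function the elements are only borrowed from a slice that outlives the map, so nothing moves while the references exist; in the second the key is cloned before the value is moved into the map:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct LanguageIdentifier(String);

#[derive(Debug, PartialEq, Eq)]
struct Locale {
    langid: LanguageIdentifier,
}

impl AsRef<LanguageIdentifier> for Locale {
    fn as_ref(&self) -> &LanguageIdentifier {
        &self.langid
    }
}

/// Option 1: key and value both borrow from `available` for `'a`.
/// The elements stay where they are, so the references remain valid.
fn index_borrowed<'a, A>(available: &'a [A]) -> HashMap<&'a LanguageIdentifier, &'a A>
where
    A: AsRef<LanguageIdentifier>,
{
    let mut map = HashMap::new();
    for av in available {
        let langid: &'a LanguageIdentifier = av.as_ref();
        map.insert(langid, av);
    }
    map
}

/// Option 2: the map owns its values, so the key has to be an owned
/// clone; a reference into `av` would dangle once `av` is moved.
fn index_owned<A>(available: Vec<A>) -> HashMap<LanguageIdentifier, A>
where
    A: AsRef<LanguageIdentifier>,
{
    let mut map = HashMap::new();
    for av in available {
        let key: LanguageIdentifier = av.as_ref().clone();
        map.insert(key, av);
    }
    map
}
```

Option 1 costs no clones but ties the map's lifetime to the input slice; option 2 pays one clone per key in exchange for a self-contained map.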