Ported collection framework from Java

Hello,
I ported Eclipse Collections to Rust. It is a large Java collection framework with several collection types, parallel iterators, filtering, etc. About 500 Rust files in total. It passes its tests, has good performance, and is very close to release. I would like someone with good Rust skills to take it for a spin before the public release; I am mostly a JVM guy.

Thanks

Could you give a link to the code?

So the same unreviewed huge code base you also did for Zig?
Why would one spend time helping you if you haven't even spent time on it yourself?

I’m honestly curious what your motivation is here :slight_smile:

It’s not every day that someone decides to just port a huge collection framework from Java into multiple other programming languages (languages that they don’t even really know how to use) and wants to publicly release the result.

If nothing else, this sounds almost impossible to handle from a maintenance standpoint.


Moreover, you’re asking for help here – though not by asking any concrete questions publicly, but merely stating that you’re looking for a person – on the Zig thread you’re also offering communication via DM – which means you might be either looking to hire someone, or looking for free personal tutoring or consulting or something like that? (Job offerings are not something we host on this forum though; and private tutoring or consulting for free doesn’t actually exist.)

Please note that new users joining with requests like “looking for someone skilled to (privately) talk to about XYZ” aren’t particularly close to the main purpose of this forum. This forum is hosting public discussions relating to Rust, and people help others for free here because it’s fun, they know and like the environment of this forum, and anyone else can enjoy the stimulation of reading about Rust-related things by reading along with other people’s discussions. If an interesting question comes up and gets an answer here, it may also help others with similar problems in the future; perhaps taken here by a Google search some day.

I am a beginner in Rust and wanted some feedback on what collections should look like in Rust before I fully open it. I am not trying to hire anyone or get tutoring.

I will post a link to the code repo in a few days.

Then why did you produce 500 files before understanding what the result should look like?

But why do you want that though? You're willing to share the code with some random stranger on the internet but you're not willing to make it public? What do you expect from that random stranger? Surely they won't look at all 500 files.

(quote from the Readme)

Each primitive collection type is specialized per primitive type (i8, i16, i32, i64, f32, f64, bool, char) — no boxing, no trait objects, contiguous memory layout. Generic object collections (ArrayList<T>, HashSet<T>, etc.) complement the primitive types for general-purpose use.

This indicates very strong Java-isms to me. Rust does not share Java’s property of having 8 specific “primitive datatypes” and “objects” beyond that. AFAIK not even languages much closer to Java, e.g. C#, have this property.

Hence, APIs that look like this

are extremely unidiomatic, and it’s even weirder that this merely wraps a more generic API internally, anyway, e.g. I see things like

#[derive(Debug, Clone)]
pub struct F32F32HashMap {
    inner: OpenHashMap<f32, f32>,
}
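For comparison, here is a minimal sketch of how this usually looks with plain generics. Rust monomorphises generic code, so `Vec<f32>` and the standard `HashMap<K, V>` already store their elements inline without boxing. The function names `sum_squares` and `count_occurrences` are made up for illustration and are not part of the crate under discussion.

```rust
use std::collections::HashMap;

/// Hypothetical helper: operates on any contiguous run of `f32`,
/// e.g. a borrowed `Vec<f32>`; no dedicated `F32ArrayList` type is involved.
pub fn sum_squares(values: &[f32]) -> f32 {
    values.iter().map(|v| v * v).sum()
}

/// Hypothetical helper: the one generic std `HashMap` is instantiated
/// here for `i32` keys and `usize` values, stored unboxed.
pub fn count_occurrences(values: &[i32]) -> HashMap<i32, usize> {
    let mut counts = HashMap::new();
    for &v in values {
        *counts.entry(v).or_insert(0) += 1;
    }
    counts
}
```

The same two functions could be written over `i64`, `u8`, or a user-defined type by changing only the type parameters, which is why a separate struct per element type is rarely seen in Rust APIs.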

Taking a quick peek around some places in the API, I’m noticing it defines a trait

/// Trait for primitive types that can be used as hash table keys.
pub trait PrimitiveKey: Copy + Default {
    fn hash_code(&self) -> u64;
    fn key_eq(&self, other: &Self) -> bool;
}

which is highly unidiomatic, since the standard library already provides relevant traits (Hash and Eq) which would generally just be re-used for better compatibility with user types, instead of rolling your own one.
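To make that concrete, here is a rough sketch of the std-trait approach. Since `f32` itself implements neither `Hash` nor `Eq`, the sketch assumes a small newtype that compares and hashes by bit pattern; `F32Bits` and `lookup` are hypothetical names, not anything from the crate.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical newtype: an `f32` compared and hashed by its bit pattern.
#[derive(Debug, Clone, Copy)]
pub struct F32Bits(pub f32);

impl PartialEq for F32Bits {
    fn eq(&self, other: &Self) -> bool {
        self.0.to_bits() == other.0.to_bits()
    }
}

// Bitwise equality is reflexive (even for NaN), so `Eq` is sound here.
impl Eq for F32Bits {}

impl Hash for F32Bits {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the same bits that `eq` compares, keeping Hash and Eq consistent.
        self.0.to_bits().hash(state);
    }
}

/// Hypothetical helper: any `K: Hash + Eq` works as a key,
/// whether it is a std type or a user-defined one.
pub fn lookup<K: Hash + Eq, V: Copy>(map: &HashMap<K, V>, key: &K) -> Option<V> {
    map.get(key).copied()
}
```

With this, user types that already derive `Hash` and `Eq` work as keys without implementing any crate-specific trait.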

Finally, I was intrigued by the existence of an immutable module, and looked at ImmutableF32HashSet out of curiosity… but it looks like it’s completely fake [or at least terribly misnamed] (i.e. not at all a hash set)!

/// Immutable, cheaply cloneable set of `f32` values.
#[derive(Debug, Clone)]
pub struct ImmutableF32HashSet {
    items: Arc<[f32]>,
}

impl ImmutableF32HashSet {
    pub fn from_mutable(set: &F32HashSet) -> Self {
        ImmutableF32HashSet {
            items: Arc::from(set.to_vec().into_boxed_slice()),
        }
    }
    pub fn of(values: &[f32]) -> Self {
        let mut s = F32HashSet::new();
        for &v in values {
            s.add(v);
        }
        Self::from_mutable(&s)
    }
    pub fn contains(&self, value: f32) -> bool {
        self.items.iter().any(|&v| v.to_bits() == value.to_bits())
    }
    pub fn len(&self) -> usize {
        self.items.len()
    }
    pub fn is_empty(&self) -> bool {
        self.items.is_empty()
    }
    pub fn iter(&self) -> impl Iterator<Item = f32> + '_ {
        self.items.iter().copied()
    }
    …

And this Immutable…HashSet kind of type, too, with about 30 or so identically structured, monomorphised versions in separate modules, features an outright insane amount of redundant code duplication. Admittedly, these also contain a note that they’re generated code

// Code generated by mapdb-codegen. DO NOT EDIT.

but I seem unable to find the actual codegen tool/template from this description (and such codegen isn’t needed for this use-case in Rust [nor is the resulting API surface desirable], anyway!!)
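As an illustration of what a single generic version might look like, here is a sketch that keeps an actual hash set behind the `Arc`, so `contains` is a hash lookup rather than a linear scan. `ImmutableSet` is a hypothetical name and this is not the crate's design; for `f32` elements one would additionally need a key type implementing `Hash` and `Eq` (for example a newtype over the bit pattern).

```rust
use std::collections::HashSet;
use std::hash::Hash;
use std::sync::Arc;

/// Hypothetical generic stand-in for the per-type immutable set structs.
#[derive(Debug)]
pub struct ImmutableSet<T> {
    items: Arc<HashSet<T>>,
}

// Manual impl: cloning only bumps the reference count,
// so no `T: Clone` bound is required.
impl<T> Clone for ImmutableSet<T> {
    fn clone(&self) -> Self {
        ImmutableSet { items: Arc::clone(&self.items) }
    }
}

impl<T: Hash + Eq> ImmutableSet<T> {
    /// Builds the set once; there are no mutating methods afterwards.
    pub fn of(values: impl IntoIterator<Item = T>) -> Self {
        ImmutableSet { items: Arc::new(values.into_iter().collect()) }
    }

    /// Hash-based membership test.
    pub fn contains(&self, value: &T) -> bool {
        self.items.contains(value)
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }

    pub fn is_empty(&self) -> bool {
        self.items.is_empty()
    }

    pub fn iter(&self) -> impl Iterator<Item = &T> + '_ {
        self.items.iter()
    }
}
```

One generic definition like this would cover every element type, so no code generator or per-type module would be needed.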


This is just me looking at 2 or 3 random things; I have not tried to understand or check the hashmap implementation at all, and not looked at most of the crate API either, but maybe it helps as a starting point for some of the most notable things from a Rust user’s point of view :slight_smile:

I read F64ArrayList and saw at least 7 methods that can be delegated to something else (e.g. len by Deref to Vec and for_each by self.items.iter().for_each(f)).
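A small sketch of that kind of delegation, assuming the wrapper holds a `Vec<f64>` in a field called `items`. The struct below is a simplified stand-in, not the crate's actual definition, and whether `Deref` to `Vec` is the right design for a public type is a separate judgment call.

```rust
use std::ops::{Deref, DerefMut};

/// Simplified stand-in for the crate's `F64ArrayList` (assumed layout).
#[derive(Debug, Clone, Default)]
pub struct F64ArrayList {
    items: Vec<f64>,
}

// With `Deref`/`DerefMut`, `len`, `is_empty`, `first`, `push`, `iter`, ...
// are all inherited from `Vec<f64>` instead of being re-implemented.
impl Deref for F64ArrayList {
    type Target = Vec<f64>;
    fn deref(&self) -> &Vec<f64> {
        &self.items
    }
}

impl DerefMut for F64ArrayList {
    fn deref_mut(&mut self) -> &mut Vec<f64> {
        &mut self.items
    }
}

impl F64ArrayList {
    /// `for_each` simply forwards to the iterator adapter of the same name.
    pub fn for_each(&self, f: impl FnMut(&f64)) {
        self.items.iter().for_each(f);
    }
}
```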

I find it interesting that you mix a.to_bits() == b.to_bits() and a.total_cmp(b). Is there any reason behind this?
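For reference, a quick sketch of how the two behave. As far as I can tell they agree on equality (both distinguish `0.0` from `-0.0` and treat a NaN as equal to itself), and both differ from the ordinary `==` operator. The helper names `bits_eq` and `total_eq` are made up for this example.

```rust
use std::cmp::Ordering;

/// Equality by raw bit pattern.
pub fn bits_eq(a: f32, b: f32) -> bool {
    a.to_bits() == b.to_bits()
}

/// Equality according to the total order used by `f32::total_cmp`.
pub fn total_eq(a: f32, b: f32) -> bool {
    a.total_cmp(&b) == Ordering::Equal
}
```

So for equality checks the two could be used interchangeably; `total_cmp` additionally gives an ordering, which `to_bits` on its own does not in any meaningful numeric sense.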

I don't see much reason to make a separate XXXArrayList for each type either. I do see a benefit in having min/max on f32/f64 (which usually don't have them because they're not Ord), thanks to the methods mentioned in my last post. But for this I think I would prefer to use Vec<totally_ordered::TotallyOrdered<f32|f64>> (if I want total_cmp behavior). I thought the bool one would have bool-specific features, but I was wrong.
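Without any extra crate, the same min/max can be had on a plain `Vec<f64>` or slice with the standard iterator adapters. This is only a sketch; `max_total` and `min_total` are hypothetical names.

```rust
/// Largest element under the `total_cmp` ordering, or `None` if empty.
pub fn max_total(values: &[f64]) -> Option<f64> {
    values.iter().copied().max_by(|a, b| a.total_cmp(b))
}

/// Smallest element under the `total_cmp` ordering, or `None` if empty.
pub fn min_total(values: &[f64]) -> Option<f64> {
    values.iter().copied().min_by(|a, b| a.total_cmp(b))
}
```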