I am writing an algorithm (poorly) which has somehow incurred extreme memory consumption and slowness.

The algorithm started out in JavaScript which was then ported to Dart. I have also tried porting it to Swift (but was slower than Dart). I finally ported it to Rust which is having severe issues.

The algorithm is setup as stages. The stage which has issues is `compute_matches_b`

. I have a descending iteration counter implemented for that stage so the iterations can be viewed in the terminal. There is only one variable to change: players. Players has to be set at intervals of 4 (4, 8, 12, etc). The problem is for 20 players there is 11,884 iterations with each having a varying workload.

The implementation for Dart and Rust are almost identical. With Rust, I split the matches stage into matches_a and matches_b due to lifetime issues (I am still learning the language).

When running in Dart with 16 players, compute_matches:

• takes around 1.5s

When running in Rust with 16 players, compute_matches_a + compute_matches_b:

• takes around 1.4s

When running in Dart with 20 players, compute_matches:

• takes around two hours

• memory consumption maxes out at 20GB around the last quarter (< 1,000 mark)

• gets down to 5,000 in a short amount of time

When running in Rust with 20 players, compute_matches_b:

• takes around ?? hours or ?? days

• memory consumption maxes out at around 70GB around the 7,500 mark

• gets down to 7,440 after a long time and exits (signal: 9)

I ran them both side by side and Dart absolutely trounces Rust during this stage for 20 players. By the time Rust gets down to 11,500, Dart is already at around 5,500. For <= 16 players – and all other stages of the algorithm – it is the opposite.

I am using version `1.1.0`

on OS X 10.10, but it also exhibits that same behaviour with version `1.0.0`

.

As you can see, this behaviour definitely seems odd.

OS X has memory compression which allows such massive amounts of memory beyond the hardware’s limits (16GB physical memory) to to be used without resorting to a swapfile. This still may be the cause of the slowness as OS X may not be able to keep up with the demand, but there is still the problem of the excessive memory consumption which the Dart version does not have.

Below is the terminal output for 20 players (ellipses denote omission for brevity)

```
cargo run --release --verbose
Fresh rustc-serialize v0.3.14
Fresh gcc v0.3.5
Fresh libc v0.1.8
Compiling rust v0.1.0 ...
Running 'rustc src/main.rs --crate-name rust --crate-type bin -C opt-level=3...'
Fresh time v0.1.25
Running 'target/release/rust'
11884
...
7440
Process didn't exit successfully: 'target/release/rust' (signal: 9)
```

And below is the Rust code (apologies for the quality of the code being substandard). Scroll down for the problem `compute_matches_b`

function:

```
use std::collections::*;
fn has_all_ranks(round: &Vec<&Vec<&Vec<i8>>>, rank_tally: &mut Vec<bool>) -> bool {
for teams in round.iter() {
for team in *teams {
for rank in *team {
if rank_tally[(*rank - 1) as usize] {
for mark in rank_tally.iter_mut() {
*mark = false;
}
return false;
}
else {
rank_tally[(*rank - 1) as usize] = true;
}
}
}
}
for mark in rank_tally.iter_mut() {
*mark = false;
}
true
}
fn compute_pairs(ranks: &Vec<i8>) -> Vec<Vec<i8>> {
let count = ranks.len();
let mut pairs: Vec<Vec<i8>> = Vec::with_capacity(count * count / 2 - count / 2);
for outer in 0..count {
for inner in outer + 1..count {
pairs.push(vec![ranks[outer], ranks[inner]]);
}
}
pairs
}
fn compute_evens(ranks: &Vec<i8>) -> HashSet<Vec<i32>> {
let largest_even: i32 = ranks[ranks.len() - 4 .. ranks.len()]
.iter()
.fold(0, |total, &value| total + value as i32);
let total: i32 = ranks
.iter()
.fold(0, |total, &value| total + value as i32);
let matches = ranks.len() as i32 / 4;
let minimum = 10;
let maximum = total / 2;
let mut result = vec![0; matches as usize];
let mut results: HashSet<Vec<i32>> = HashSet::new();
fn recurse(remainder: i32, minimum: i32, maximum: i32, matches: i32, largest_even: i32,
depth: i32, result: &mut Vec<i32>, results: &mut HashSet<Vec<i32>>) {
if depth + 1 == matches {
if remainder < minimum || remainder > largest_even {
return;
}
result[depth as usize] = remainder;
if result.iter().fold(0, |x, &y| x + y) / 2 != maximum {
return;
}
let mut copy = result.clone();
copy.sort();
if results.contains(©) {
return;
}
results.insert(copy);
return;
}
let mut number = minimum;
while number <= maximum {
result[depth as usize] = number;
// minimum as row number from previous player quantity
recurse(remainder - number, minimum + 8, maximum, matches, largest_even, depth + 1, result, results);
number += 2;
}
}
recurse(total, minimum, maximum, matches, largest_even, 0, &mut result, &mut results);
results
}
fn compute_teams<'a>(pairs: &'a Vec<Vec<i8>>, evens: &HashSet<Vec<i32>>) -> Vec<Vec<Vec<&'a Vec<i8>>>> {
let mut results: Vec<Vec<Vec<&Vec<i8>>>> = Vec::with_capacity(evens.len());
for even in evens.iter() {
let mut current = Vec::with_capacity(even.len());
for match_rank in even.iter() {
let team_rank = match_rank / 2;
let teams: Vec<&Vec<i8>> = pairs.iter().filter(|pair| ((pair[0] + pair[1]) as i32) == team_rank).collect();
current.push(teams);
}
results.push(current);
}
results
}
fn compute_matches_a<'a>(teams : &Vec<Vec<Vec<&'a Vec<i8>>>>) -> Vec<Vec<Vec<Vec<&'a Vec<i8>>>>> {
let mut to_return : Vec<Vec<Vec<Vec<&Vec<i8>>>>> = Vec::with_capacity(teams.len());
for matches in teams {
let transitive_a = {
let mut temp : Vec<Vec<Vec<&Vec<i8>>>> = Vec::with_capacity(matches.len());
for teams in matches {
let count = teams.len();
let mut results: Vec<Vec<&Vec<i8>>> = Vec::with_capacity(count * count / 2 - count / 2);
for outer in 0..count {
for inner in outer + 1..count {
results.push(vec![teams[outer], teams[inner]]);
}
}
temp.push(results);
}
temp
};
to_return.push(transitive_a);
}
to_return
}
// NOTE: This is the problem function
fn compute_matches_b<'a>(matches_a: &'a Vec<Vec<Vec<Vec<&'a Vec<i8>>>>>, players: i8) -> Vec<HashSet<Vec<&'a Vec<&'a Vec<i8>>>>> {
let mut count = matches_a.len();
let mut rank_tally = vec![false; players as usize];
let mut to_return : Vec<HashSet<Vec<&Vec<&Vec<i8>>>>> = Vec::with_capacity(matches_a.len());
// Finds the cartesian product
for matches in matches_a {
let length = matches.len();
let solutions = {
let mut solutions = 1;
for array in matches {
solutions *= array.len();
}
solutions
};
let mut results = HashSet::with_capacity(solutions);
let mut current_solution = 0;
while current_solution < solutions {
let mut j = 1;
let mut array = 0;
let mut solution: Vec<&Vec<&Vec<i8>>> = Vec::with_capacity(length);
while array < length {
let length = matches[array].len();
solution.push(&matches[array][(current_solution / j) % length]);
j *= length;
array += 1;
}
// Only add to results if all ranks are present
if has_all_ranks(&solution, &mut rank_tally) {
// Sort so duplicates can be distinguished by set
solution.sort_by(|left, right| {
let ordering = left[0][0] - right[0][0];
if ordering < 0 {
std::cmp::Ordering::Less
}
else if ordering > 0 {
std::cmp::Ordering::Greater
}
else {
std::cmp::Ordering::Equal
}
});
results.insert(solution);
}
current_solution += 1;
}
if !results.is_empty() {
to_return.push(results);
}
count = count - 1;
println!("{}", count);
}
to_return
}
fn main() {
let players = 16;
let ranks = (1..players + 1).collect::<Vec<i8>>();
let pairs = compute_pairs(&ranks);
let evens = compute_evens(&ranks);
let teams = compute_teams(&pairs, &evens);
let matches_a = compute_matches_a(&teams);
let matches_b = compute_matches_b(&matches_a, players);
}
```

I am using nested vectors as the structure helps me reason about the algorithm during development. I know I could probably vastly improve performance by dealing with a flat vector, but doing that may be beyond my abilities (I am not a CS kind of developer). I am also just generally curious about what is going on.

Any thoughts will greatly appreciated. Thanks!