Return string without new allocations

Hello,
I'm trying to write a function that will return a string based on a random value.
Now this seems like an easy thing to do but unfortunately the owners of the strings are slightly different and this is causing me some problems.

The process will be like this:

  • Generate a random value;
  • If the value is greater than 0, return a string from a read-only vec;
  • Otherwise, take a string from a pool, compute its contents, and return that.

Here is an example

use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    // Vec with a fixed length whose strings never change.
    // This represents a list of values that are returned as they are.
    let static_vec_values = vec!["test 1".to_owned()];
    // Vec with a fixed length whose strings are mutated.
    // This represents a pool of preallocated strings, so I can use them
    // during execution without the overhead of allocating new strings.
    let dynamic_vec_values = vec![Rc::new(RefCell::new("test 1".to_owned()))];
    let value = get_value(1, &static_vec_values, &dynamic_vec_values);
}
struct Item<'a> {
    name: &'a str,
}

// Returns a value based on a random value
fn get_value<'a>(
    random_value: u32,
    static_vec_values: &'a [String],
    dynamic_vec_values: &'a [Rc<RefCell<String>>],
) -> Item<'a> {
    if random_value > 0 {
        Item {
            name: &static_vec_values[0],
        }
    } else {
        Item {
            // A step is missing here that changes the string before returning it.
            // I tried to keep the example as simple as possible.
            name: dynamic_vec_values[0].clone(),
        }
    }
}

Obviously the line name: dynamic_vec_values[0].clone(), doesn't allow the code to compile because the return type is not compatible.
I tried to use a Cow, but apparently that is not possible (or my understanding is wrong).

Do you have any idea on how I can solve this without allocating more strings than those that I already have?

Thank you :)

You can't return a reference to something that is behind a RefCell like that because the RefCell must be able to count the number of active references, which it does by wrapping the reference in the Ref type, which has a destructor that decrements the counter. The compiler makes sure that no reference can outlive the Ref it was created from.

To fix this, you must return the actual Ref object so the destructor doesn't run too early. You could do this with an enum that is either a reference or a Ref.
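
The borrow counting described above can be observed directly. This is only a minimal sketch: the helper name mutable_borrow_allowed is made up for illustration, while try_borrow_mut is the standard non-panicking counterpart of borrow_mut.

```rust
use std::cell::RefCell;

// Hypothetical helper: reports whether the cell would currently allow a mutable borrow.
fn mutable_borrow_allowed(cell: &RefCell<String>) -> bool {
    cell.try_borrow_mut().is_ok()
}

fn main() {
    let cell = RefCell::new(String::from("pooled"));

    {
        // `guard` is a `Ref<String>`; while it lives, the cell counts one shared borrow.
        let guard = cell.borrow();
        assert_eq!(guard.as_str(), "pooled");
        // A mutable borrow is refused while the `Ref` is alive.
        assert!(!mutable_borrow_allowed(&cell));
    } // `guard` is dropped here, and its destructor decrements the counter.

    // With no outstanding `Ref`, a mutable borrow is allowed again.
    assert!(mutable_borrow_allowed(&cell));
}
```

A plain &str taken from the guard has no destructor, so nothing would keep the counter accurate if it could outlive the guard. That is why the guard itself has to be returned.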

I think you could use Cow here:

use std::borrow::Cow;

fn get_names(random: i32, names: &[&'static str], dynamic: String) -> Cow<'static, str> {
    if random > 0 {
        names[0].into()
    } else {
        dynamic.into()
    }
}

fn main() {
    let vec_static = ["lkjasdflkjasd", "laksdjflkasdj", "lkajsdfl"];
    let name1 = get_names(10, &vec_static, "nooooo".to_string());
    let name2 = get_names(-10, &vec_static, "nooooo".to_string());
    println!("name: {:?}", name1);
    println!("name: {:?}", name2);
}

Right, you could use a Cow if you are willing to clone the string to handle that case.

However, OP explicitly sought to do this without allocating.

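Something like this, I imagine (I haven't checked whether it compiles as-is):

use std::cell::Ref;
use std::ops::Deref;

struct Item<'a> {
    name: DerefStr<'a>,
}

// Either a plain borrowed string or a live RefCell borrow of a pooled String.
enum DerefStr<'a> {
    Str(&'a str),
    Ref(Ref<'a, String>),
}

impl<'a> Deref for DerefStr<'a> {
    type Target = str;

    fn deref(&self) -> &str {
        match self {
            DerefStr::Str(a) => a,
            // Go through the Ref guard to reach the String's contents.
            DerefStr::Ref(a) => a.as_str(),
        }
    }
}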

Thanks for your help. Unfortunately I cannot clone the string because of some constraints I have.

If I use the enum, and if I understand correctly, won't it take up the size of its largest variant?

Is there a way to get the size of a type without doing the math manually, so I can at least minimize the allocations?

A DerefStr will take up more space than a &str (24 vs 16 bytes on a typical 64-bit target), but it would not involve allocating memory on the heap. If you are worried about memory on the stack, that's a separate story and requires separate considerations.
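
As for getting the size without doing the math by hand, std::mem::size_of reports it at compile time. A small sketch, reusing the DerefStr enum from the earlier reply; the numbers depend on the target, so they are printed rather than assumed:

```rust
use std::cell::Ref;
use std::mem::size_of;

// Same shape as the enum suggested earlier in the thread.
#[allow(dead_code)]
enum DerefStr<'a> {
    Str(&'a str),
    Ref(Ref<'a, String>),
}

fn main() {
    // All of these are stack sizes; none of these types allocates on the heap by itself.
    println!("&str:        {} bytes", size_of::<&str>());
    println!("Ref<String>: {} bytes", size_of::<Ref<String>>());
    println!("DerefStr:    {} bytes", size_of::<DerefStr>());
}
```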

Thank you for all the suggestions. In the end I used the enum; it seems the best fit for my use case.

I didn't read thoroughly, but can't this be handled by a Cow?

No. You need an enum with the following cases:

  • &str
  • std::cell::Ref<str>

However Cow behaves as an enum with the following cases:

  • &str
  • String

Yes, I cannot use Cow because when I want to change the String it is going to clone it, and I cannot have another allocation.

I tried many times but unfortunately I still have an issue.

I had to put more code (sorry) to make my use case clearer

use std::cell::{Ref, RefCell};
use std::ops::Deref;
use std::rc::Rc;

pub trait Clear {
    fn clear(&mut self);
}

impl Clear for String {
    fn clear(&mut self) {
        self.clear();
    }
}

#[derive(Debug)]
pub struct Pool<T>
where
    T: Clear + PoolDefault,
{
    values: Vec<Rc<RefCell<T>>>,
}

pub trait PoolDefault {
    fn default() -> Self;
}

impl PoolDefault for String {
    fn default() -> Self {
        String::with_capacity(20)
    }
}

impl<T> Pool<T>
where
    T: Clear + PoolDefault,
{
    pub fn with_capacity(capacity: usize) -> Self {
        let mut values = Vec::with_capacity(capacity);
        for _ in 0..capacity {
            values.push(Rc::new(RefCell::new(T::default())));
        }

        Self { values }
    }

    pub fn borrow_mut(&self) -> Option<Rc<RefCell<T>>> {
        let value = self.values.iter().find(|v| Rc::strong_count(v) < 2);
        if let Some(v) = value {
            v.borrow_mut().clear();
            return Some(v.clone());
        }
        None
    }
}

fn main() {
    // Vec with a fixed length whose strings never change.
    // This represents a list of values that are returned as they are.
    let static_vec_values = vec!["test 1".to_owned()];
    // Pool with a fixed length whose strings are mutated.
    // This represents a pool of preallocated strings, so I can use them
    // during execution without the overhead of allocating new strings.
    let dynamic_vec_values = Pool::with_capacity(1);
    let _value = get_value(1, &static_vec_values, &dynamic_vec_values);
}
struct Item<'a> {
    name: DerefStr<'a>,
}

enum DerefStr<'a> {
    Str(&'a str),
    Ref(Ref<'a, String>),
}

impl<'a> Deref for DerefStr<'a> {
    type Target = str;

    fn deref(&self) -> &str {
        match self {
            Self::Str(a) => a,
            Self::Ref(a) => a.as_str(),
        }
    }
}

// Returns a value based on a random value
fn get_value<'a>(
    random_value: u32,
    static_vec_values: &'a [String],
    dynamic_vec_values: &'a Pool<String>,
) -> Item<'a> {
    if random_value > 0 {
        Item {
            name: DerefStr::Str(&static_vec_values[0]),
        }
    } else {
        let value = dynamic_vec_values.borrow_mut();
        Item {
            // A step is missing here that changes the string before returning it.
            // I tried to keep the example as simple as possible.
            name: DerefStr::Ref(value.unwrap().borrow()),
        }
    }
}

As you can see, the line name: DerefStr::Ref(value.unwrap().borrow()), is causing the problem, because the borrowed value goes out of scope when the function returns.

Because of the return type from the pool, I tried to put the cloned Rc inside the enum instead of just the Ref, but that makes the Deref impossible to implement because of the same problem I have here (the Ref goes out of scope).

I can remove the Deref from the DerefStr enum, but the code becomes really ugly wherever I use the Item struct.

I'm not sure how to proceed for now. Do you think there is another way to implement the Pool?

Thank you :)

This time the error happened due to the clone of the Rc in Pool::borrow_mut. The Ref can only exist if it knows for sure that the RefCell it points at remains alive, but when you clone the Rc like that, it can't tell whether the RefCell stays alive, because it's accessing it through a clone of the Rc that is destroyed when you return from the function.

Do this instead:

pub fn borrow_mut(&self) -> Option<&Rc<RefCell<T>>> {
    let value = self.values.iter().find(|v| Rc::strong_count(v) < 2);
    if let Some(v) = value {
        v.borrow_mut().clear();
        return Some(v);
    }
    None
}
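
For reference, here is a sketch of how that change fits into the rest of the example. It assumes a simplified, String-only Pool in place of the generic one, and the push_str call is only a placeholder for the "calculate the value" step.

```rust
use std::cell::{Ref, RefCell};
use std::ops::Deref;
use std::rc::Rc;

// Simplified, String-only stand-in for the generic Pool in the thread.
struct Pool {
    values: Vec<Rc<RefCell<String>>>,
}

impl Pool {
    fn with_capacity(capacity: usize) -> Self {
        let values = (0..capacity)
            .map(|_| Rc::new(RefCell::new(String::with_capacity(20))))
            .collect();
        Self { values }
    }

    // Hands out a reference tied to `&self` instead of a cloned Rc.
    // Nothing is cloned, so the strong count stays at 1 and the first slot always matches.
    fn borrow_mut(&self) -> Option<&Rc<RefCell<String>>> {
        let slot = self.values.iter().find(|v| Rc::strong_count(*v) < 2)?;
        slot.borrow_mut().clear();
        Some(slot)
    }
}

enum DerefStr<'a> {
    Str(&'a str),
    Ref(Ref<'a, String>),
}

impl<'a> Deref for DerefStr<'a> {
    type Target = str;

    fn deref(&self) -> &str {
        match self {
            Self::Str(s) => s,
            Self::Ref(r) => r.as_str(),
        }
    }
}

struct Item<'a> {
    name: DerefStr<'a>,
}

fn get_value<'a>(random_value: u32, static_vec_values: &'a [String], pool: &'a Pool) -> Item<'a> {
    if random_value > 0 {
        Item { name: DerefStr::Str(&static_vec_values[0]) }
    } else {
        let slot = pool.borrow_mut().expect("pool exhausted");
        // Placeholder computation; it fits in the preallocated capacity of 20 bytes.
        slot.borrow_mut().push_str("computed");
        // `slot` borrows from the pool for 'a, so the Ref may live for 'a as well.
        Item { name: DerefStr::Ref(slot.borrow()) }
    }
}

fn main() {
    let static_vec_values = vec!["test 1".to_owned()];
    let pool = Pool::with_capacity(1);
    let a = get_value(1, &static_vec_values, &pool);
    let b = get_value(0, &static_vec_values, &pool);
    println!("{} / {}", &*a.name, &*b.name);
}
```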

If I do that, does the Rc still increase the strong count? If not, that is going to be a problem, because I could essentially give the same reference to another piece of code.

I was using Rc::clone because that way, as soon as the object is serialized into JSON, the struct goes out of scope and is dropped; the strong count is decreased, which essentially lets the pool reclaim the string.

Sorry I'm probably not very clear on what I'm trying to achieve.

It won't increase the strong count, but this isn't a problem unless you wanted to remove the string from the pool while the Item still exists.

Yes this is exactly what I'm trying to achieve, essentially only one struct should be able to modify and access that value until the struct itself is dropped and the string is reclaimed.

Then you have to store the Rc in the enum rather than the Ref. You won't be able to implement Deref if you do this, because the borrow/borrow_mut method must be called after returning it.
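
One way this could look, as a rough sketch rather than a definitive design: the names PooledStr, with_str, and acquire are made up for illustration, and the pool is again simplified to String. Instead of Deref, the string is reached through a closure, so the RefCell borrow happens only after the value has been returned and lasts only for the call.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical enum: a borrowed fixed string, or a pooled string kept alive by a cloned Rc.
enum PooledStr<'a> {
    Str(&'a str),
    Pooled(Rc<RefCell<String>>),
}

impl<'a> PooledStr<'a> {
    // Runs `f` on the string; the RefCell is borrowed only while `f` runs.
    fn with_str<R>(&self, f: impl FnOnce(&str) -> R) -> R {
        match self {
            PooledStr::Str(s) => f(s),
            PooledStr::Pooled(rc) => f(rc.borrow().as_str()),
        }
    }
}

// Simplified, String-only stand-in for the generic Pool in the thread.
struct Pool {
    values: Vec<Rc<RefCell<String>>>,
}

impl Pool {
    fn with_capacity(capacity: usize) -> Self {
        let values = (0..capacity)
            .map(|_| Rc::new(RefCell::new(String::with_capacity(20))))
            .collect();
        Self { values }
    }

    // Returns a clone of the first Rc that nobody else holds (strong count of 1).
    // Cloning an Rc bumps a counter; it does not allocate a new String.
    fn acquire(&self) -> Option<Rc<RefCell<String>>> {
        let slot = self.values.iter().find(|v| Rc::strong_count(*v) == 1)?;
        slot.borrow_mut().clear();
        Some(Rc::clone(slot))
    }
}

fn main() {
    let pool = Pool::with_capacity(1);
    let fixed = PooledStr::Str("test 1");
    let item = PooledStr::Pooled(pool.acquire().expect("pool exhausted"));
    if let PooledStr::Pooled(rc) = &item {
        // Placeholder computation; it fits in the preallocated capacity.
        rc.borrow_mut().push_str("computed");
    }

    // While `item` is alive the slot counts as taken.
    assert!(pool.acquire().is_none());
    fixed.with_str(|s| println!("{s}"));
    item.with_str(|s| println!("{s}"));

    // Dropping `item` lowers the strong count, so the pool can hand the slot out again.
    drop(item);
    assert!(pool.acquire().is_some());
}
```

The cost is the call-site ergonomics the original poster mentioned: every use goes through with_str rather than through an automatic deref.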
