Formatting integers: allocation?

Hello there,

Does format_args! allocate when formatting variables (especially integers)?

let not_const = 12; // Comes from somewhere the compiler cannot know the value
format_args!("{}", not_const); // How is 12 converted to "12"?

Is this the same as numtoa?

Can someone explain why termion does this:

// It uses `numtoa` to make a `String`!
impl From<Up> for String {
    fn from(this: Up) -> String {
        let mut buf = [0u8; 20];
        ["\x1B[", this.0.numtoa_str(10, &mut buf), "A"].concat()
    }
}

// Does this allocate?
impl fmt::Display for Up {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "\x1B[{}A", self.0)
    }
}

Thank you!

Most formatting implementations don't allocate, as they are defined in core, which doesn't have allocation. Formatting is just defined in terms of writing strings and characters, see fmt::Write.

The formatter output itself may allocate, for example impl fmt::Write for String has to obviously allocate.

Your first example doesn't allocate at all, as it doesn't format anything, it just specifies how to format something.
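To make the point about fmt::Write concrete, here is a minimal sketch of a sink that lives entirely on the stack. `StackBuf` and `cursor_up` are hypothetical names invented for this illustration (they are not part of std or termion), and the 32-byte capacity is an arbitrary assumption:

```rust
use std::fmt::{self, Write};

/// Hypothetical fixed-capacity sink: formatted text lands in a stack array.
struct StackBuf {
    bytes: [u8; 32],
    len: usize,
}

impl StackBuf {
    fn new() -> Self {
        StackBuf { bytes: [0; 32], len: 0 }
    }

    fn as_str(&self) -> &str {
        // Only whole `&str` chunks are ever copied in, so this is valid UTF-8.
        std::str::from_utf8(&self.bytes[..self.len]).unwrap()
    }
}

impl Write for StackBuf {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len + s.len();
        // Report an error instead of growing: there is no heap to grow into.
        let dst = self.bytes.get_mut(self.len..end).ok_or(fmt::Error)?;
        dst.copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Formats an escape sequence shaped like the `Up` example into the stack sink.
fn cursor_up(n: u16) -> StackBuf {
    let mut out = StackBuf::new();
    write!(out, "\x1B[{}A", n).expect("32 bytes is enough for a u16");
    out
}

fn main() {
    let seq = cursor_up(12);
    assert_eq!(seq.as_str(), "\x1B[12A");
}
```

The integer is turned into digits by its Display impl and handed to `write_str` piece by piece; whether anything allocates depends only on what the sink does with those pieces.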


Okay, thanks!

Yes, I missed that the std link points to core.

Why numtoa then? Can't we get the slice from that?

That slice is a view into an on-stack buffer inside the function, so it's impossible to return it (doing so would make it a dangling pointer).

However, it is possible to write a formatted string to a buffer, which can even be a mutable slice:

write!(&mut buf, "{}", num);

(although io::Write on slices is a bit weird, since it shortens the slice by the length written)
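A small sketch of that shrinking behaviour, and of how the written length can be recovered from it. `fmt_u32_into` is a hypothetical helper name, not something provided by std or numtoa:

```rust
use std::io::Write;

/// Writes `num` in decimal into `buf` and returns the written prefix.
/// (Hypothetical helper, for illustration only.)
fn fmt_u32_into(num: u32, buf: &mut [u8]) -> &str {
    let capacity = buf.len();
    // `impl Write for &mut [u8]` advances the slice it writes through, so
    // write via a reborrowed cursor and keep `buf` pointing at the start.
    let mut cursor: &mut [u8] = &mut *buf;
    write!(&mut cursor, "{}", num).expect("buffer too small");
    // What is left in `cursor` is the unwritten tail.
    let written = capacity - cursor.len();
    std::str::from_utf8(&buf[..written]).expect("decimal digits are ASCII")
}

fn main() {
    let mut buf = [0u8; 10]; // u32::MAX has 10 decimal digits
    assert_eq!(fmt_u32_into(12, &mut buf), "12");
}
```

The returned `&str` borrows from the caller's buffer, which is why it can outlive the function call without dangling.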


As stated before in this thread, a write!(…) call does not (necessarily) allocate; more specifically, writing into a byte buffer / slice ([u8]) won't allocate. The question is mainly how such a buffer is obtained / allocated.

Also note that, technically, stack locals are still a form of allocation: they are just stack allocations, thus very fast and, in the case of Rust, with lengths known at compile time (vs. dynamic lengths). Because of the latter, formatting an integer into a stack-allocated buffer often over-allocates a bit, since the buffer has to be sized by the upper bound on the length.
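As a rough illustration of those upper bounds, the sketch below counts decimal digits without allocating and checks the worst cases. `decimal_len` and the two constant names are hypothetical and exist only for this example:

```rust
/// Number of decimal digits of `n`, computed without allocating.
/// (Hypothetical helper, for illustration only.)
fn decimal_len(mut n: u64) -> usize {
    let mut len = 1;
    while n >= 10 {
        n /= 10;
        len += 1;
    }
    len
}

/// Worst-case buffer sizes (hypothetical constant names).
const U64_MAX_LEN: usize = 20; // "18446744073709551615"
const I32_MAX_LEN: usize = 11; // "-2147483648": 10 digits plus the sign

fn main() {
    // The bounds are tight for the extreme values...
    assert_eq!(decimal_len(u64::MAX), U64_MAX_LEN);
    assert_eq!(decimal_len(i32::MIN.unsigned_abs() as u64) + 1, I32_MAX_LEN);
    // ...but a small value such as 42 uses only 2 of those bytes.
    assert_eq!(decimal_len(42), 2);
}
```

A 20-byte buffer, like the `[0u8; 20]` in the termion snippet earlier in the thread, is therefore large enough for any `u64`.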

The two most ergonomic ways to format an integer into a stack-allocated buffer, that is, ways that do not require the (human) caller to know the actual length involved, are the following:

1 - API taking a caller-provided buffer

  • Something like:

    fn fmt_into<'buf> (…, buf: &'buf mut _) -> &'buf str
    

The naive and thus most straightforward version uses [u8] as _, but then the caller needs to provide a big-enough buffer.

To avoid this head-scratching / mental burden / cognitive overhead, here is a version that just requires that the caller provide a &mut Default::default(), in an opaque manner.

Demo:

fn main ()
{
    let mut stack_storage = Default::default();
    let s = 42.fmt_into(&mut stack_storage);
    assert_eq!("42", s); // s is usable while `stack_storage` is in scope.
}
  • Implementation

    impl FmtInto for i32 {
        // Opaque length for reduced semver constraints
        // (`impl Trait` in an associated type needs a nightly feature at the time of writing)
        type Storage = impl Default + AsMut<[u8]>;
    
        fn fmt_into<'storage> (&self, storage: &'storage mut Self::Storage)
          -> &'storage str
        {
            fn _def_storage () -> <i32 as FmtInto>::Storage { [0_u8; 11] }
    
            let buf: &mut [u8] = storage.as_mut();
            let remaining = {
                use ::std::io::Write;
                let mut cursor = &mut *buf;
                write!(&mut cursor, "{}", *self).unwrap();
                cursor.len()
            };
            ::core::str::from_utf8(&buf[.. buf.len() - remaining])
                .unwrap()
        }
    }
    

This pattern still requires that the caller create and provide that stack_storage; it is thus not as pretty as the heap-allocating format!("{}", 42) / 42.to_string().

This can be soothed, as with any ergonomics problem, using a macro that hides this local, something like:

stack_fmt!(42 => let s);
  • Sketch of the implementation
    macro_rules! stack_fmt {(
        $e:expr => $($binding:tt)*
    ) => ( // note the lack of braces.
        let mut stack_storage = Default::default(); // `stack_storage` is hygienic 👌
        $($binding)* = $e.fmt_into(&mut stack_storage);
    )}
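
For readers who want to try the pattern on a stable compiler, here is a minimal self-contained sketch. It is an assumption-laden variant, not the code above: the opaque associated type is swapped for a named wrapper struct (`I32Storage`, a hypothetical name), `fmt_into` is a free function rather than a trait method, and the macro only accepts the `let name` form:

```rust
use std::io::Write;

/// Hypothetical named storage type, standing in for the opaque one.
#[derive(Default)]
struct I32Storage([u8; 11]); // "-2147483648" is 11 bytes

/// Formats `num` into the caller-provided storage and returns the text.
fn fmt_into(num: i32, storage: &mut I32Storage) -> &str {
    let buf: &mut [u8] = &mut storage.0;
    let remaining = {
        let mut cursor = &mut *buf;
        write!(&mut cursor, "{}", num).unwrap();
        cursor.len()
    };
    std::str::from_utf8(&buf[..buf.len() - remaining]).unwrap()
}

/// Declares a hidden storage local and binds the formatted string to `$name`.
macro_rules! stack_fmt {
    ($e:expr => let $name:ident) => {
        let mut stack_storage = I32Storage::default(); // hygienic local
        let $name = fmt_into($e, &mut stack_storage);
    };
}

fn main() {
    stack_fmt!(-7 => let s);
    assert_eq!(s, "-7");
}
```

The trade-off of the named struct is that its size becomes part of the public API, which is what the opaque type in the original was meant to avoid.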
    

The other solution is the following:

2 - Callback-based API (CPS)

The idea is further detailed in the documentation (and upcoming guide / book) of:

Demo:

fn main ()
{
    42.with_str(|s| {
        assert_eq!("42", s); // s can be used within this (closure) scope
    });
}
  • Implementation

    fn with_str<R> (self: &'_ i32, with: impl FnOnce(&str) -> R)
      -> R
    {
        // Calling the callback, within this body, must be viewed as "returning" a value;
        let return_ = with;
    
        let mut stack_storage = [0_u8; 11];
        let buf: &mut [u8] = &mut stack_storage[..];
        let remaining = {
            use ::std::io::Write;
            let mut cursor = &mut *buf;
            write!(&mut cursor, "{}", *self).unwrap();
            cursor.len()
        };
        return_(
            ::core::str::from_utf8(&buf[.. buf.len() - remaining])
                .unwrap()
        )
    }
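
The same idea, written as a trait method so that it compiles as a standalone program. This is a sketch under assumptions: the `WithStr` trait shape below is made up for the example and is not necessarily how any crate defines it:

```rust
use std::io::Write;

/// Hypothetical trait wrapping the callback-based pattern.
trait WithStr {
    fn with_str<R>(&self, with: impl FnOnce(&str) -> R) -> R;
}

impl WithStr for i32 {
    fn with_str<R>(&self, with: impl FnOnce(&str) -> R) -> R {
        let mut stack_storage = [0_u8; 11];
        let remaining = {
            let mut cursor = &mut stack_storage[..];
            write!(&mut cursor, "{}", *self).unwrap();
            cursor.len()
        };
        let len = stack_storage.len() - remaining;
        // The borrow of `stack_storage` ends when the callback returns,
        // so nothing referencing the local escapes this function.
        with(std::str::from_utf8(&stack_storage[..len]).unwrap())
    }
}

fn main() {
    let doubled_len = 42_i32.with_str(|s| s.len() * 2);
    assert_eq!(doubled_len, 4);
}
```

Anything computed from the string (a length, a parsed value, an owned copy) can be returned through `R`; only the borrowed `&str` itself is confined to the closure.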
    

For what it's worth, here is how the above is written using #[with] sugar:
#[with]
fn main ()
{
    let s: &'ref str = 42.str();
    assert_eq!("42", s); // `s` can be used in the scope where `.str()` is called, but not outside it.
}

trait WithStr {
    #[with]
    fn str (self: &'_ Self) -> &'ref str;
}

impl WithStr for i32 {
    #[with]
    fn str (self: &'_ i32)
      -> &'ref str
    {
        let mut stack_storage = [0_u8; 11];
        let buf: &mut [u8] = &mut stack_storage[..];
        … same as before …
        // Looks like "returning" a value referencing a local!
        ::core::str::from_utf8(&buf[.. buf.len() - remaining])
            .unwrap()
    }
}
  • No unsafe code involved
