Announcing Scroll

I'm pleased to announce scroll, a parallel byte parsing library I've been working off and on for a bit. The tl;dr is that it allows:

  1. parallel reads of a byte buffer (self is an immutable reference, not mutable like Read requires)
  2. un-suffixed type readers, e.g., no read_u32, read_str, read_f64, just pread and pread_slice and type ascription or the beloved turbofish::<> (or none at all if you're reading into a struct field or other known type)
  3. big endian/little endian parsing or host machine default endian parsing
  4. custom datatype reading via implementing extended conversion traits, which then gets you pread and pread_slice for free
  5. it's really fast :tm: - seriously. Without parallelism it competes with byteorder. With a naive parallel implementation I've seen benches up to 2x faster than byteorder, with no work at all, just drop in a rayon parallel iterator... this is the way it should be! I began work on a parallel buffer implementation automating almost all this work, but it wasn't a high priority, but eventually something as simple as buffer.par_iter(| offset | { pread::<MyType>(offset)? }).collect() should be possible
  6. finally, there are writer duals of the various traits as well (mutable self of course), which similarly write at an offset, etc. this is very useful for a binary writer, which consists of many specific offset dependent writing
  7. macro 1.1 derives https://github.com/m4b/scroll_derive

Here is the documentation.

And here's a quick taste:

use scroll::{ctx, Pread};
let bytes: [u8; 4] = [0xde, 0xad, 0xbe, 0xef];

// we can use the Buffer type that scroll provides, or use it on regular byte slices (or anything that impl's `AsRef<[u8]>`)
//let buffer = scroll::Buffer::new(bytes);
let b = &bytes[..];

// reads a u32 out of `b` with Big Endian byte order, at offset 0
let i: u32 = b.pread_with(0, scroll::BE).unwrap();
// or a u16 - specify the type either on the variable or with the beloved turbofish
let i2 = b.pread_with::<u16>(2, scroll::BE).unwrap();

// We can also skip the ctx by calling `pread`.
// for the primitive numbers, this will default to the host machine endianness (technically it is whatever default `Ctx` the target type is impl'd for)
let byte: u8 = b.pread(0).unwrap();
let i3: u32 = b.pread(0).unwrap();

// this will have the type `scroll::Error::BadOffset` because it tried to read beyond the bound
let byte: scroll::Result<i64> = b.pread(0);

// we can also get str and byte references from the underlying buffer/bytes using `pread_slice`
let slice = b.pread_slice::<str>(0, 2).unwrap();
let byte_slice: &[u8] = b.pread_slice(0, 2).unwrap();

// here is an example of parsing a uleb128 custom datatype, which
// uses the `ctx::DefaultCtx`
let leb128_bytes: [u8; 5] = [0xde | 128, 0xad | 128, 0xbe | 128, 0xef | 128, 0x1];
// parses a uleb128 (variable length encoded integer) from the above bytes
let uleb128: u64 = leb128_bytes.pread::<scroll::Uleb128>(0).unwrap().into();
assert_eq!(uleb128, 0x01def96deu64);

// finally, we can also parse out custom datatypes, or types with lifetimes
// if they implement the conversion trait `TryFromCtx`; here we parse a C-style \0 delimited &str (safely)
let hello: &[u8] = b"hello_world\0more words";
let hello_world: &str = hello.pread(0).unwrap();
assert_eq!("hello_world", hello_world);

// ... and this parses the string if its space separated!
let spaces: &[u8] = b"hello world some junk";
let world: &str = spaces.pread_with(6, ctx::SPACE).unwrap();
assert_eq!("world", world);

Motivations

So this started off as an immutable byte parsing library because I wanted to read a very large amount of sized values from an array in parallel (more on this later), and similarly for another project, and the mutable Read trait did not (easily) allow this. I'm lazy. They're just byte slices, why couldn't I do this generically, immutably, in parallel, without locking?

It has since become something slightly more after I tinkered with it for a while, after I went over the cliff with generics :wink:

Nevertheless, at its heart is the Pread trait:

pub trait Pread<Ctx = Endian, E = error::Error, I = usize, TryCtx = (I, Ctx), SliceCtx = (I, I, Ctx) >
 where E: Debug,
       Ctx: Copy + Default + Debug,
       I: Copy + Debug,
       TryCtx: Copy + Default + Debug,
       SliceCtx: Copy + Default + Debug,

Don't worry! It's scarier than it looks, and with sensible defaults in function generics you can usually just write fn foo<Pread>.

So basically Pread just exposes two methods: pread_with and pread_slice, which return values of type N, which must implement the modified versions of stdlib conversion traits, which add an additional parameter that I'm calling the parsing context (its sort of like a contextual continuation).

This has several benefits I've noticed across the crates I've ported, primarily:

  1. Less allocations
  2. Easy parallelism

The Read trait and its associated methods are in my opinion to blame for this, as they require a mutable self (although for many applications, like reading out of a stream, unstructured data, etc., this is likely the approach one should use).

I like the following "before and after" example to illustrate how using scroll allows one to easily reduce allocations, and to get an "embarassingly" parallel reader/parser. Below is a function for extracting members from a static archive: the old version version allocates, is not parallel without changes, whereas the new version does not allocate and is easily parallelizable out of the box:

Old:

/// Returns a vector of the raw bytes for the given `member` in the readable `cursor`
pub fn extract<R: Read + Seek> (&self, member: &str, cursor: &mut R) -> io::Result<Vec<u8>> {
    if let Some(member) = self.get(member) {
        let mut bytes = vec![0u8; member.size()];
        try!(cursor.seek(Start(member.offset)));
        try!(cursor.read_exact(&mut bytes));
        Ok(bytes)
    } else {
        io_error!(format!("Cannot extract member {}, not found", member))
    }
}

New:

/// Returns a slice of the raw bytes for the given `member` in the scrollable `buffer`
pub fn extract<'a, R: scroll::Pread> (&self, member: &str, buffer: &'a R) -> Result<&'a [u8]> {
    if let Some(member) = self.get(member) {
        let bytes = buffer.pread_slice(member.offset as usize, member.size())?;
        Ok(bytes)
    } else {
        Err(format!("Cannot extract member {}, not found", member).into())
    }
}

Not only is the newer version cleaner and prettier imho, it can extract members from a static archive both without allocating, and in parallel without any change to the function (of course I suppose we could lock in order to seek + read, but this is stupid, we're just reading bytes, not modifying them, and it requires changes to the function).

Parsing Contexts

On the way to a parallel reader, several patterns kept popping up over and over again; I discuss this in the ctx module documentation, but the primary pattern I kept seeing was:

  1. Parsing was similar to TryFrom (from one thing to another, and can fail), but sometimes required one or two extra pieces of information to complete the parse
  2. Parsing at an offset and parsing at an offset plus a size were almost identical (actually just a special case of 1).

Eventually, I decided to introduce the extra ctx parameter in these modified conversion traits, and thus the idea of a parsing context.

This last notion became so useful, I condensed several different traits down into two, which correspond to the bounds on pread_with and pread_slice - TryFromCtx and TryRefFromCtx, respectively.

What is useful about these conversion traits is that they allow downstream consumers of scroll to implement them, and then automatically gain the immutable Pread methods.

Here is an example which uses a custom datatype, and a custom parsing context, but they get Pread for free without doing anything else:

use scroll::{ctx, Pread, BE};

#[derive(Debug)]
struct Data {
    name: String,
    id: u32,
}

#[derive(Debug, Clone, Copy, Default)]
struct DataCtx {
    pub size: usize,
    pub endian: scroll::Endian
}

impl ctx::TryFromCtx<(usize, DataCtx)> for Data {
    type Error = scroll::Error;
    fn try_from_ctx (src: &[u8], (offset, DataCtx {size, endian}): (usize, DataCtx))
                     -> Result<Self, Self::Error> {
        let name = src.pread_slice::<str>(offset, size)?.to_string();
        let id = src.pread(offset+size, endian)?;
        Ok(Data { name: name, id: id })
    }
}

fn main() {
    let bytes = scroll::Buffer::new(b"UserName\x01\x02\x03\x04");
    let data = bytes.pread::<Data>(0, DataCtx { size: 8, endian: BE }).unwrap();
    assert_eq!(data.id, 0x01020304);
    assert_eq!(data.name, "UserName".to_string());
    println!("Data: {:?}", &data);
}

This has already been used to great effect in the upcoming release of goblin 0.7.0 to allow bytes.pread::<Elf>(0) or bytes.pwrite(elf64::header::Header::default(), 0), and etc.

More examples are in the documentation, and in the examples directory in repo.

Lastly, I have to give credit where credit is due: this crate is heavily inspired by the famous byteorder, the never-released byteio, libc's pread, and the core libraries TryFrom, FromStr traits and the str::parse method, and of course, the #rust irc channel :slight_smile:

Anyway, I'm done rambling, thanks for reading (or not), have fun and rust on!

9 Likes