After more than three years of development, Jetscii will soon be usable on stable Rust thanks to the stabilization of SIMD!
Since docs.rs cannot build these docs at the moment, allow me to paste them in here to give an overview.
Happy to answer any questions you might have!
Jetscii
A tiny library to efficiently search strings for sets of ASCII characters or byte slices for sets of bytes.
Examples
Searching for a set of ASCII characters
#[macro_use]
extern crate jetscii;

fn main() {
    let part_number = "86-J52:rev1";
    let first = ascii_chars!('-', ':').find(part_number);
    assert_eq!(first, Some(2));
}
Searching for a set of bytes
#[macro_use]
extern crate jetscii;

fn main() {
    let raw_data = [0x00, 0x01, 0x10, 0xFF, 0x42];
    let first = bytes!(0x01, 0x10).find(&raw_data);
    assert_eq!(first, Some(1));
}
Using the pattern API
If this crate is compiled with the unstable `pattern` feature flag, `AsciiChars` will implement the `Pattern` trait, allowing it to be used with many traditional methods.
#[macro_use]
extern crate jetscii;

fn main() {
    let part_number = "86-J52:rev1";
    let parts: Vec<_> = part_number.split(ascii_chars!('-', ':')).collect();
    assert_eq!(&parts, &["86", "J52", "rev1"]);
}
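For comparison, the standard library on stable Rust already accepts a slice of `char`s as a pattern, so the same split can be written without Jetscii or the unstable feature flag. The following is only a sketch of that std-only equivalent; the helper name `split_part_number` is made up for illustration, and this version does not get the SIMD speedup.

```rust
/// Splits a part number on '-' and ':' using only the standard library.
/// `split_part_number` is a hypothetical helper, not part of Jetscii.
fn split_part_number(part_number: &str) -> Vec<&str> {
    // A `&[char]` slice works as a pattern on stable Rust.
    part_number.split(&['-', ':'][..]).collect()
}

fn main() {
    let parts = split_part_number("86-J52:rev1");
    assert_eq!(parts, ["86", "J52", "rev1"]);
}
```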
What's so special about this library?
We use a particular set of x86-64 SSE 4.2 instructions (`PCMPESTRI` and `PCMPESTRM`) to gain great speedups. This method stays fast even when searching for a byte in a set of up to 16 choices.
When the `PCMPxSTRx` instructions are not available, we fall back to reasonably fast but universally-supported methods.
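To illustrate the idea of falling back when SSE 4.2 is missing, here is a rough sketch using only the standard library. It is not Jetscii's actual code: `find_any_fallback`, `has_sse42`, `pick_strategy`, and the `Strategy` enum are names invented for this example, and the scalar search is just one plausible fallback. The only real API relied on is the standard `is_x86_feature_detected!` macro.

```rust
/// Which search routine a library could pick (hypothetical, for illustration).
#[derive(Debug, PartialEq)]
enum Strategy {
    Sse42,
    Fallback,
}

/// Scalar fallback: index of the first byte of `haystack` that is in `needles`.
/// An illustrative stand-in, not Jetscii's real fallback.
fn find_any_fallback(haystack: &[u8], needles: &[u8]) -> Option<usize> {
    haystack.iter().position(|b| needles.contains(b))
}

/// Runtime check for SSE 4.2 on x86 and x86-64.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn has_sse42() -> bool {
    is_x86_feature_detected!("sse4.2")
}

/// Other architectures never have the PCMPxSTRx instructions.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn has_sse42() -> bool {
    false
}

/// Chooses a strategy based on what the running CPU supports.
fn pick_strategy() -> Strategy {
    if has_sse42() {
        Strategy::Sse42
    } else {
        Strategy::Fallback
    }
}

fn main() {
    println!("strategy on this machine: {:?}", pick_strategy());

    // The fallback gives the same answer as the bytes example above.
    let raw_data = [0x00, 0x01, 0x10, 0xFF, 0x42];
    assert_eq!(find_any_fallback(&raw_data, &[0x01, 0x10]), Some(1));
}
```

Whichever strategy is chosen, callers should see the same result; only the speed differs.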
Benchmarks
Single character
Searching a 5MiB string of `a`s with a single space at the end for a space:
Method | Speed |
---|---|
`ascii_chars!(' ').find(s)` | 5882 MB/s |
`s.as_bytes().iter().position(\|&c\| c == b' ')` | 1514 MB/s |
`s.find(" ")` | 644 MB/s |
`s.find(&[' '][..])` | 630 MB/s |
`s.find(' ')` | 10330 MB/s |
`s.find(\|c\| c == ' ')` | 786 MB/s |
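The standard-library rows above all compute the same index and differ only in speed. A small sketch to check that, using a short haystack rather than the 5MiB one. The function name `std_space_searches` is made up here, and the closure parameter is annotated as `char` to keep type inference unambiguous.

```rust
/// Runs each standard-library search from the table on `s`
/// and returns the results in table order.
fn std_space_searches(s: &str) -> [Option<usize>; 5] {
    [
        s.as_bytes().iter().position(|&c| c == b' '), // byte iterator
        s.find(" "),                                  // &str pattern
        s.find(&[' '][..]),                           // &[char] pattern
        s.find(' '),                                  // char pattern
        s.find(|c: char| c == ' '),                   // closure pattern
    ]
}

fn main() {
    let results = std_space_searches("aaaa ");
    // Every method should report the space at byte index 4.
    assert!(results.iter().all(|r| *r == Some(4)));
}
```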
Set of 3 characters
Searching a 5MiB string of `a`s with a single ampersand at the end for `<`, `>`, and `&`:
Method | Speed |
---|---|
`ascii_chars!(/* ... */).find(s)` | 6238 MB/s |
`s.as_bytes().iter().position(\|&c\| /* ... */)` | 1158 MB/s |
`s.find(&[/* ... */][..])` | 348 MB/s |
`s.find(\|c\| /* ... */)` | 620 MB/s |
Set of 5 characters
Searching a 5MiB string of `a`s with a single ampersand at the end for `<`, `>`, `&`, `'`, and `"`:
Method | Speed |
---|---|
`ascii_chars!(/* ... */).find(s)` | 6303 MB/s |
`s.as_bytes().iter().position(\|&c\| /* ... */)` | 485 MB/s |
`s.find(&[/* ... */][..])` | 282 MB/s |
`s.find(\|c\| /* ... */)` | 785 MB/s |
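For anyone who wants a rough feel for numbers like these on their own machine, here is a minimal timing sketch using only the standard library. It is not the harness that produced the figures above: `make_haystack` and `measure` are hypothetical helpers, it times a single run with `std::time::Instant`, and it only exercises one of the standard-library methods. Build with optimizations, and expect results to vary by hardware and compiler version.

```rust
use std::time::Instant;

/// Builds a haystack of `len` bytes (`len` >= 1):
/// `len - 1` copies of 'a' followed by the ASCII character `last`.
fn make_haystack(len: usize, last: char) -> String {
    let mut s = "a".repeat(len.saturating_sub(1));
    s.push(last);
    s
}

/// Times one run of `search` over `haystack`.
/// Returns the search result and the throughput in MB/s.
fn measure<F>(haystack: &str, search: F) -> (Option<usize>, f64)
where
    F: Fn(&str) -> Option<usize>,
{
    let start = Instant::now();
    let found = search(haystack);
    let secs = start.elapsed().as_secs_f64();
    let megabytes = haystack.len() as f64 / 1_000_000.0;
    (found, megabytes / secs)
}

fn main() {
    // 5MiB of 'a's with a single ampersand at the end.
    let haystack = make_haystack(5 * 1024 * 1024, '&');

    // Byte-iterator search for the set of 5 characters.
    let (found, speed) = measure(&haystack, |s: &str| {
        s.as_bytes()
            .iter()
            .position(|&c| matches!(c, b'<' | b'>' | b'&' | b'\'' | b'"'))
    });

    assert_eq!(found, Some(haystack.len() - 1));
    println!("iter().position(): {:.0} MB/s", speed);
}
```

A single timed run is noisy; a proper benchmark harness that repeats the measurement would give more trustworthy figures.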