Feedback is welcome on the issue tracker: RFC: review API before 1.0 release · Issue #108 · BurntSushi/aho-corasick · GitHub
Even if you don't know what the heck "Aho-Corasick" means, I would still love to hear any feedback you might have.
Feedback is welcome on the issue tracker: RFC: review API before 1.0 release · Issue #108 · BurntSushi/aho-corasick · GitHub
Even if you don't know what the heck "Aho-Corasick" means, I would still love to hear any feedback you might have.
I wrote this brief changelog summary in the issue thread, but I'll cross-post it here for visibility.
std
, has been added and is enabled by default. This makes the standard library optional and permits using aho-corasick
in core/alloc-only contexts. But you do need alloc
.AhoCorasick
will mostly behave the same as before. AhoCorasick::new
is now fallible. The type is no longer generic. (In 0.7, you can choose the size of the state ID to use, but I did away with this and now just use u32
everywhere.)AhoCorasick::find
now accept an impl Into<Input>
, where an Input
provides a way to configure a search. There is an impl From<H: AsRef<[u8]>> for Input
, so you can still call find
with types like &str
and &[u8]
. But Input
lets you set the bounds of the search, run an anchored search and so on. In 0.7, for example, in order to run an anchored search you had to configure it as a build-time parameter of the AhoCorasick
automaton.AhoCorasick
type can now be configured to support unanchored and anchored searches simultaneously using StartKind
. It defaults to just unanchored searches.supports_overlapping
and supports_stream
routines were removed in favor of try_*
search APIs. I could add the predicates back if needed, but they were principally useful to be able to guarantee that panics didn't happen when you called a search API with an unsupported configuration. But now there are fallible APIs for that.automaton
, dfa
and nfa
. These provide low level access to different types of Aho-Corasick automata, and are not expected to be used for most use cases. The key thing they unlock, by implementing the automaton::Automaton
trait, is the ability to walk the automaton one state at a time. For example, you can now write your own search routines. This might unlock more interesting streaming use cases.packed
submodule remains largely unchanged.In terms of implementation, things largely remain the same with one very (IMO) awesome addition: there is a new type of Aho-Corasick automaton called a "contiguous NFA." The old NFA has been rebraned as a non-contiguous NFA. (And the old DFA is still called a DFA.) The contiguous NFA uses a lot less memory and is typically quite a bit faster than the non-contiguous NFA. It represents a very nice middle point between the slower old NFA and the piggish DFA. Indeed, most AhoCorasick
configurations in the 1.0 release will use this new contiguous NFA.