Generic find - am I doing this right?


#1

I’ve coaxed my boss and coworker into letting me use Rust to build a few bioinformatics tools that a colleague of my boss needs. Yay!

I have an enum Base { A, T, G, C, N } and a struct Bases(Vec<Base>), and I’d like to be able to find sub-sequences within a Bases sequence. I looked around but couldn’t find anything, only StrExt::find_str, so I wrote the generic find code in the playpen link below to use for now (writing a specialized Bases::find just felt icky). I feel like my solution is overcomplicated. Does the stdlib give me a better option?

http://is.gd/ARbVJq
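
For anyone who can't open the playpen link: this is not the code behind it, just a minimal sketch of what a generic find over slices could look like, built on the standard slice::windows and Iterator::position. The names find_subslice and Bases::find are hypothetical and chosen only for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Base { A, T, G, C, N }

struct Bases(Vec<Base>);

/// Hypothetical helper: index of the first occurrence of `needle` in
/// `haystack`, or None if it does not occur. Works for any element type
/// that can be compared for equality.
fn find_subslice<E: PartialEq>(haystack: &[E], needle: &[E]) -> Option<usize> {
    // windows(0) panics, so treat the empty needle as matching at index 0.
    if needle.is_empty() {
        return Some(0);
    }
    haystack.windows(needle.len()).position(|w| w == needle)
}

impl Bases {
    /// Thin wrapper so callers can search a Bases value directly.
    fn find(&self, needle: &[Base]) -> Option<usize> {
        find_subslice(&self.0, needle)
    }
}

fn main() {
    let seq = Bases(vec![
        Base::A, Base::T, Base::G, Base::C, Base::N, Base::G, Base::C,
    ]);
    println!("{:?}", seq.find(&[Base::G, Base::C])); // Some(2)
    println!("{:?}", seq.find(&[Base::C, Base::A])); // None
}
```

This is a naive scan that compares every window against the needle, which is probably fine for short motifs but is not a tuned string-search algorithm.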


#2

Hmm, can you implement the new Pattern traits?

http://doc.rust-lang.org/std/str/trait.Pattern.html

http://doc.rust-lang.org/std/str/trait.Searcher.html

http://doc.rust-lang.org/std/str/trait.ReverseSearcher.html


#3

Neat! I didn’t know about these, so thanks. These traits would be perfect, but they seem to only be meant for strings. Perhaps I should just represent nucleotide sequences as strings like everyone does in every other language, but it’s fun to exercise Rust’s type system.
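
One possible middle ground, sketched here under my own assumptions, is to accept ordinary strings at the edges and convert them into the typed representation. The FromStr impl below is hypothetical: it accepts only the uppercase letters A, T, G, C and N, and reports the first unrecognised character as the error.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Base { A, T, G, C, N }

#[derive(Debug, PartialEq, Eq)]
struct Bases(Vec<Base>);

impl FromStr for Bases {
    // Assumption for this sketch: the error is simply the offending character.
    type Err = char;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        s.chars()
            .map(|c| match c {
                'A' => Ok(Base::A),
                'T' => Ok(Base::T),
                'G' => Ok(Base::G),
                'C' => Ok(Base::C),
                'N' => Ok(Base::N),
                other => Err(other),
            })
            // Collecting into a Result stops at the first Err.
            .collect::<Result<Vec<Base>, char>>()
            .map(Bases)
    }
}

fn main() {
    let seq: Result<Bases, char> = "ATGN".parse();
    println!("{:?}", seq);
    println!("{:?}", "ATX".parse::<Bases>());
}
```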


#4

Incidentally, I was a computational biologist (well, grad student) in a former life, and you’d probably get a lot more mileage if you kept sequences as normal strings.
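
To make that concrete, here is a rough sketch of the plain-string route using only the standard str::find and str::match_indices methods. The helper name find_all is made up for this example.

```rust
/// Hypothetical helper: byte offsets of every non-overlapping occurrence
/// of `motif` in `seq`.
fn find_all(seq: &str, motif: &str) -> Vec<usize> {
    seq.match_indices(motif).map(|(i, _)| i).collect()
}

fn main() {
    let seq = "ATGCNGC";
    // str::find gives the byte offset of the first match, if any.
    println!("{:?}", seq.find("GC")); // Some(2)
    println!("{:?}", find_all(seq, "GC")); // [2, 5]
}
```

Note that match_indices reports non-overlapping matches only, so overlapping motif hits would need a different scan. For ASCII sequences, byte offsets and base positions coincide.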

On the other hand, if you’re able to experiment, representing them as an enum is certainly a very nice thing to try!