Making TRust-DNS faster than BIND9


#1

I just posted this: https://www.reddit.com/r/rust/comments/7my0m0/making_trustdns_faster_than_bind9/

During my quest to get TRust-DNS faster than BIND, I was trying to make better use of Cow. I ran into a problem, I’ll just quote my post:

So I had created a new Name type, DecodedName<'r> which shared a lifetime with the request buffer backing the BinDecoder<'r>. This meant that if the buffer was proper utf8 and in addition lowercase, we wouldn’t have needed to allocate anything for query names. But I ran into an issue: all of the RecordSets are stored in a BTreeMap keyed off of RrKey which is just a wrapper type of Name and RecordType. BTree::get has this interface:

fn get<Q>(&self, key: &Q) -> Option<&V> where
    K: Borrow<Q>,
    Q: Ord + ?Sized, 

I could not for the life of me figure out how get a Name implement a Borrow<Q> where Q would have been DecodeName<'r> (or RrKey to DecodedRrKey<'r> which is what would have been needed). After working through a lot of ideas, I gave up on this approach. Maybe there’s a way to do it with unsafe and std::mem::transmute, but I’m trying to keep all unsafe usage out of the TRust-DNS libraries (and have been successful up to this point). So I leave that as a challenge to someone who knows Rust better than I.

If it helps, I can post the exact types involved to make the question better…


Copyless lookup in BTreeMap
#2

It would help. Better yet is a stripped down playground sample.

Edit: I took a peek at your repo and I don’t think there’s a way to implement Borrow<DecodedName<'a>> for Name, at least without resorting to unsafe hacks.

Here are the type definitions that I found (in case someone can think of something):

pub struct Name {
    is_fqdn: bool,
    labels: Vec<Rc<String>>,
}

pub struct BinDecoder<'a> {
    buffer: &'a [u8],
    index: usize,
}


pub struct RrKey {
    /// Matches the name in the Record of this key
    pub name: Name,
    /// Matches the type of the Record of this key
    pub record_type: RecordType,
}

pub enum RecordType {
    // some variants here - not important for the question, I think
}

I didn’t find the DecodedName<'_> struct, which I assume is because you abandoned that approach. But presumably it holds a reference to the same &[u8] buffer as BinDecoder.

I suppose in retrospect the easier way would’ve been to have Name defined as:

pub struct Name<'a> {
    is_fqdn: bool,
    labels: Vec<Rc<Cow<'a, str>>>,
}

But making that change now would appear to ripple through a large swath of code.

Completely tangential, but you can consider storing labels as Vec<Rc<str>> instead of Vec<Rc<String>>. Rc<str> has the Rc components (strong and weak counts) and the data allocated in a single allocation. This reduces indirection: Rc<String> contains a pointer to a heap allocated String which itself contains a pointer to heap allocated storage, whereas Rc<str> will have a single heap allocation. Rc<str> also doesn’t have the explicit capacity field that the String has.


#3

This seems like a HKT-y sort of problem… Ideally there would be a variant of Borrow that doesn’t have to return a reference, but any type parameterized by the input lifetime. Should be doable with generic associated types, sort of:

pub trait Borrow2<Dummy>{
    type Borrowed<'a>;
    type borrow<'a>(&'a self) -> Self::Borrowed<'a>;
}

(The tricky part is what to use as the type parameter.)

But that would have to be actually added as part of HashMap’s API.

I agree with @vitalyd’s “easier way”.


#4

How does one create a Rc<str>? Apparently you can create an Rc<str> as of Rust 1.21: https://doc.rust-lang.org/stable/std/rc/struct.Rc.html#impl-From<String>


#5

Thank you for the great replies!

I have a local branch where I preserved it, yes. This is the impl:

pub struct DecodedName<'a> {
    labels: Vec<Cow<'a, str>>,
}

Yeah, the issue with that is then associating a lifetime to the Authority and then that would add lifetimes to a lot of the structures that contain that. I suppose it’s what is actually needed in order to accomplish this task, but I don’t right now know all the implications it will have in dealing with a lot of these structures.

These are great recommendations about Rc<str>. I was unaware of Rc<str> I assume you mean that I should use this conversion to create an Rc<str>

impl From<String> for Rc<str>

That seems like it would be better.

Yes, I actually tried to go in that direction. The problem I ran into is that the structure stored in the BTreeMap is owned data, and I can’t pollute the type hierarchy with lifetimes for that (without a lot of other work). When I have some time, maybe I’ll explore Vitaly’s suggestion.


#6

After having fought similarly for quite a while, I eventually decided to use Bytes for all octet sequence types. It is basically an Rc<[u8]> on steroids and allows things like taking the slice out of another Bytes without copying. A little more expensive than dealing with &'a [u8] directly but no need to have types be generic over anything AsRef<[u8]> and sprinkle lifetimes all over the code, so a win in my book.


#7

I just looked through it. Thanks for the pointer. It seems like it would potentially simplify some of the code.