TRust-DNS Resolver 0.9 and Client/Server 0.14

I've finally started a resolver for TRust-DNS. This is very much a work-in-progress, so it might be a little while before it lands. Like with the client, it will have both a basic synchronous version for ease of use and also a Futures/Tokio implementation to offer a fully asynchronous option.

https://github.com/bluejekyll/trust-dns/pull/136

I know some people out there currently use the client, but I suspect most really want a resolver. What is the difference between a resolver and a client? The client is responsible for sending messages to and receiving messages from a DNS server or recursive resolver. The client implements the protocol. The resolver is a more intelligent beast, responsible for interpreting the messages returned from the server into something people can generally use to retrieve IP addresses and such. For example: following a CNAME chain back to its actual address is the responsibility of the resolver, not the client.
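To make the CNAME example concrete, here is a minimal sketch of the chain-following step, run against an in-memory table instead of real DNS messages. The `Record` type, the `resolve` and `sample_records` functions, and the zone data are all hypothetical and are not part of the TRust-DNS API.

```rust
use std::collections::HashMap;
use std::net::{IpAddr, Ipv4Addr};

/// A toy record: either an alias to another name or a final address.
#[derive(Clone, Debug, PartialEq)]
pub enum Record {
    Cname(String),
    Addr(IpAddr),
}

/// Follow CNAME records starting at `name` until an address is found.
/// `max_hops` bounds the number of aliases followed, which guards against loops.
pub fn resolve(records: &HashMap<String, Record>, name: &str, max_hops: usize) -> Option<IpAddr> {
    let mut current = name.to_string();
    for _ in 0..=max_hops {
        match records.get(&current)? {
            Record::Addr(ip) => return Some(*ip),
            Record::Cname(target) => current = target.clone(),
        }
    }
    None // gave up: too many aliases, most likely a loop
}

/// Hypothetical zone data used only for this illustration.
pub fn sample_records() -> HashMap<String, Record> {
    let mut m = HashMap::new();
    m.insert("www.example.com.".to_string(), Record::Cname("cdn.example.net.".to_string()));
    m.insert("cdn.example.net.".to_string(), Record::Cname("edge.example.net.".to_string()));
    m.insert(
        "edge.example.net.".to_string(),
        Record::Addr(IpAddr::V4(Ipv4Addr::new(192, 0, 2, 10))),
    );
    m.insert("loop-a.example.com.".to_string(), Record::Cname("loop-b.example.com.".to_string()));
    m.insert("loop-b.example.com.".to_string(), Record::Cname("loop-a.example.com.".to_string()));
    m
}
```

A client in this picture would only fetch one record per request; the loop over aliases is the part that belongs to the resolver.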

Initially I'll support this interface:

pub fn lookup(&mut self, host: &str) -> Box<Future<Item = Vec<IpAddr>, Error = io::Error>>;

as well as the synchronous version:

pub fn lookup(&mut self, host: &str) -> io::Result<Vec<IpAddr>>;

I also plan to support the system /etc/resolv.conf for configuration (in addition to the programmatic interface). As I have time I'll start adding other functions. Off the top of my head I can think of supporting these other functions:

// Reverse lookups, i.e. PTR records
pub fn reverse_lookup(&mut self, ip: IpAddr) -> io::Result<String>; // or return Name?

// SRV records, may return other info like weight and priority
pub fn lookup_server(&mut self, service: &str, protocol: &str, name: &str) -> io::Result<Vec<SocketAddr>>;
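For the reverse lookup, the part that does not depend on any API decision is how the PTR query name is derived from the address: reversed octets under in-addr.arpa for IPv4, reversed nibbles under ip6.arpa for IPv6. A small sketch, where `reverse_name` is a hypothetical helper and not a proposed signature:

```rust
use std::net::IpAddr;

/// Build the fully qualified name that a PTR query for `ip` would ask about.
pub fn reverse_name(ip: IpAddr) -> String {
    match ip {
        IpAddr::V4(v4) => {
            // IPv4: octets in reverse order under in-addr.arpa.
            let o = v4.octets();
            format!("{}.{}.{}.{}.in-addr.arpa.", o[3], o[2], o[1], o[0])
        }
        IpAddr::V6(v6) => {
            // IPv6: hex nibbles in reverse order under ip6.arpa.
            let mut name = String::new();
            for byte in v6.octets().iter().rev() {
                // Within each byte the low nibble comes first.
                name.push_str(&format!("{:x}.{:x}.", byte & 0x0f, byte >> 4));
            }
            name.push_str("ip6.arpa.");
            name
        }
    }
}
```

The resolver would then send a PTR query for that name and hand back the target from the answer.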

One thing I've realized over time is that I probably misplaced the DNSSEC validation in the client library; it should probably live in the resolver library instead. Anyway, I'd love feedback, feature requests, etc.

BTW, right now I'm thinking of this as only a client resolver, that will eventually have features like caching etc. This is not intended to be a full recursive resolver. I'm thinking of a separate recursive resolver crate for that... though I don't know when I'll actually get to work on that.

10 Likes

This sounds pretty great! The lack of an asynchronous name resolution API in the standard library is a pain (admittedly one shared with most other languages' standard libraries by way of POSIX). As I was looking at the getaddrinfo man page to see if there were any features there I remembered finding useful I did see this note:

The sorting function used within getaddrinfo() is defined in RFC 3484

...which you might want to consider if you haven't already! Will you have IDN support in the resolver? Additionally, do you plan to support the system configuration on Windows/OS X where it's not /etc/resolv.conf?

Reading that man page and comparing to your proposed API made me once again happy with how nice Rust APIs are compared to C. :slight_smile:

1 Like

if there were any features there I remembered finding useful I did see this note:

The sorting function used within getaddrinfo() is defined in RFC 3484

Yeah, I've been thinking about where to support sorting, load balancing, etc. There are a few record types that have specific requirements around this. I know that the CIDR based rules look useful, and I can support that as a default option. I can also just allow a custom function to be passed in for sorting via the programmatic API. This is good feedback, though. Thanks!
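The "custom function passed in for sorting" option could look roughly like the sketch below. It is not an implementation of the RFC 3484 rules; `sort_addresses` and `prefer_ipv6` are hypothetical names, and the IPv6-first policy is just one example of a comparator a caller might supply.

```rust
use std::cmp::Ordering;
use std::net::IpAddr;

/// Sort resolved addresses with a caller-supplied comparison function.
/// `sort_by` is stable, so addresses that compare equal keep the server's order.
pub fn sort_addresses<F>(mut addrs: Vec<IpAddr>, compare: F) -> Vec<IpAddr>
where
    F: FnMut(&IpAddr, &IpAddr) -> Ordering,
{
    addrs.sort_by(compare);
    addrs
}

/// Example policy: IPv6 addresses first, otherwise keep the original order.
pub fn prefer_ipv6(a: &IpAddr, b: &IpAddr) -> Ordering {
    // `true` sorts after `false`, so compare in reverse to put IPv6 first.
    b.is_ipv6().cmp(&a.is_ipv6())
}
```

A default comparator implementing the prefix-based rules could be plugged into the same slot.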

Will you have IDN support in the resolver?

For IDN I never bothered with Punycode. It seemed like such a hack. Right now I support UTF-8 directly encoded in the name. Obviously this is imperfect, as it's not compatible with the resolvers/servers out there that don't just store UTF-8 and instead use Punycode. I have some concerns with my implementation here, as the RFCs in this area are all over the place. Specifically, I think there are some UTF-8 characters that I should be stripping that I am not at the moment. I'd love a second pair of eyes on all this; if you have any interest, I can point you to the code.

Additionally, do you plan to support the system configuration on Windows/OS X where it's not /etc/resolv.conf?

Yes, though as I don't use Windows or have access to it, I won't be able to test it easily. I'll at least make sure it functions with the Windows location on AppVeyor. After that, if there are any bugs there I'll have to rely on the community. It shouldn't be hard to have a default location for the resolv.conf per OS.

2 Likes

Nice! I think it's cool that someone is writing critical internet foundation in Rust :heart:

One question about the API (caveat: as a non-user, tell me if I'm going way off-base!):
Is it intentional to emit Vec<IpAddr>? Since this sets the actual type in stone, people will always need an (allocating) Vec. Have you considered returning a more generic type? (maybe a [IpAddr]?), or does that give Ownership problems?

Background: I've seen people do cool things like SmallVec, a Vec that keeps a small bit of stack space, and only starts allocating after it outgrows this.
This has gotten me thinking of supporting things like microcontrollers in libraries like this one. (I'm still fumbling though, so tell me if it's dumb :wink:)

I sort of enjoy reading RFCs so I can probably find some time to take a look at this, especially as I have a project I could use a DNS resolver in :slight_smile:

1 Like

To return an arbitrary number of IP addresses there's no way around allocating memory. [T] can't be returned from a function. Box<[T]> can, but it's also heap-allocated.

If that ever was a problem, there could be another function added that either returns only 1 IP, or takes &mut [IpAddr] as an input (and then you hope the response doesn't contain more elements than the pre-allocated space).
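A sketch of the second variant, writing into caller-provided storage. `fill_addresses` is a hypothetical name, and returning the total count alongside the number written is one possible way to let the caller detect truncation.

```rust
use std::net::IpAddr;

/// Copy as many addresses from `response` as fit into `buf`.
/// Returns (number written, number available in the response).
pub fn fill_addresses(response: &[IpAddr], buf: &mut [IpAddr]) -> (usize, usize) {
    let n = response.len().min(buf.len());
    buf[..n].copy_from_slice(&response[..n]);
    (n, response.len())
}
```

If the second number is larger than the first, the response held more addresses than the buffer could take, and the caller can decide whether that matters.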

1 Like

I was already expecting reasons in that direction (hence my "please tell me if it's dumb" hedges).
Thanks for confirming!

Well...since the API is already using futures, it could return a Box<Stream<Item = IpAddr>> instead of a Box<Future<Item = Vec<IpAddr>>>, like:

pub fn lookup(&mut self, host: &str) -> Box<Stream<Item = IpAddr, Error = io::Error>>;

While I'm looking at it, @bluejekyll, it might be nice to have it return a concrete type (I know types around Future are a pain to write, someday we'll get impl trait!). I notice you're already using error_chain, so I assume you'd be returning a custom error type and not io::Error, as well?

2 Likes

I sort of enjoy reading RFCs so I can probably find some time to take a look at this, especially as I have a project I could use a DNS resolver in :slight_smile:

Awesome. Here you go: IETF Standards Written by ISC Contributors - ISC

And here is the parser: https://github.com/bluejekyll/trust-dns/blob/master/client/src/rr/domain.rs#L244

See the emit_* methods below that for all the serialization logic. (Yes, it would be cool to implement serde based serialization routines, but this is some of my older code, and when I started it serde was still in a lot of flux).

As others have responded, I don't need to add much here, but I do need to pass ownership of the IpAddr result. At best I might be able to return a COW like object I think. Probably just implement iterator over that (Once I introduce a cache, that will potentially not allocate).

Yeah, SmallVec would fit this use very well, though I don't think I want to expose that to the API.

This might be strange in this context. DNS is basically a single packet system, so the entire set of IpAddrs will be known in the response. I "could" put a stream type on that, but I feel like that would make the interface more complex. If others think differently, I'm happy to try and support that, but I don't think it really buys much as it's not actually a stream.

Yes, I want to return concrete types instead. But I have a lot of Boxes to fix in the client library as well for that too. That's a bunch of technical debt... At some point I want to go back and clean all that up to remove all those heap allocations.

I am using error_chain, but to make the interface easier to consume, I was planning on returning io::Error. I suppose that's not necessary?

2 Likes

You could start by returning a concrete type simply wrapping a Box, that way when you cleanup everything that's currently Boxed in the future it won't break clients.
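A sketch of that suggestion follows. Note the assumptions: it uses today's `std::future::Future` trait so that it compiles without external crates, whereas the thread is about the futures crate of the time; `LookupFuture`, `lookup_stub`, and `poll_once` are hypothetical names. The point is only that callers see a named type while the box stays a private detail that can be removed later.

```rust
use std::future::Future;
use std::io;
use std::net::{IpAddr, Ipv4Addr};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Public, nameable return type. The boxed future is a private field,
/// so it can be swapped for a concrete future later without breaking callers.
pub struct LookupFuture {
    inner: Pin<Box<dyn Future<Output = io::Result<Vec<IpAddr>>>>>,
}

impl LookupFuture {
    pub fn new<F>(fut: F) -> Self
    where
        F: Future<Output = io::Result<Vec<IpAddr>>> + 'static,
    {
        LookupFuture { inner: Box::pin(fut) }
    }
}

impl Future for LookupFuture {
    type Output = io::Result<Vec<IpAddr>>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Delegate to the boxed future.
        self.inner.as_mut().poll(cx)
    }
}

/// Stand-in for a real lookup: resolves immediately from a fixed answer.
pub fn lookup_stub(host: &str) -> LookupFuture {
    let result = if host == "localhost." {
        Ok(vec![IpAddr::V4(Ipv4Addr::LOCALHOST)])
    } else {
        Err(io::Error::new(io::ErrorKind::NotFound, "no such host"))
    };
    LookupFuture::new(std::future::ready(result))
}

/// Tiny helper that polls a future once with a waker that does nothing,
/// enough to drive the stub without pulling in an executor.
pub fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    struct NoopWake;
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

Since `lookup_stub` is declared as returning `LookupFuture`, replacing the `Box` with a hand-written state machine later would not change any caller's code.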

2 Likes

That's a great idea. Thanks!

1 Like

trust-dns-resolver 0.1.0 released

I just published the initial version of the TRust-DNS Resolver: trust_dns_resolver - Rust

A 100% pure Rust DNS resolver based on the trust-dns client. I will make no claims about zero overhead, it being the fastest, or even faster than the system resolver. One feature that might be interesting for some: this supports async-io via the Tokio framework. There are synchronous and asynchronous interfaces, where the synchronous interface is the easiest to use. The Resolver uses an internal NameServerPool for measuring the performance of each name server, using the one with the fewest failures over others (in the future this will also measure latency and rank the name servers based on that as well).
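The "fewest failures wins" selection can be pictured with the sketch below. It is not the actual NameServerPool code; `Pool`, `ServerStats`, and the tie-breaking rule (earliest configured server wins) are assumptions made for illustration.

```rust
use std::net::SocketAddr;

/// Per-server bookkeeping (hypothetical; the real pool may track more).
#[derive(Clone, Debug)]
pub struct ServerStats {
    pub addr: SocketAddr,
    pub failures: u32,
}

pub struct Pool {
    servers: Vec<ServerStats>,
}

impl Pool {
    pub fn new(addrs: &[SocketAddr]) -> Self {
        Pool {
            servers: addrs
                .iter()
                .map(|&addr| ServerStats { addr, failures: 0 })
                .collect(),
        }
    }

    /// Pick the server with the fewest recorded failures.
    /// `min_by_key` returns the first minimum, so ties go to the earliest entry.
    pub fn best(&self) -> Option<SocketAddr> {
        self.servers.iter().min_by_key(|s| s.failures).map(|s| s.addr)
    }

    /// Record a failed query against `addr`.
    pub fn record_failure(&mut self, addr: SocketAddr) {
        if let Some(s) = self.servers.iter_mut().find(|s| s.addr == addr) {
            s.failures += 1;
        }
    }
}
```

Ranking by latency, as mentioned for the future, would mean changing the key passed to `min_by_key` to something like a (failures, latency) pair.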

Some notes on this initial release

Right now only A (IPv4) requests are issued. I'll be adding AAAA (IPv6) soon. Also note that this release currently does not perform recursive lookups, e.g. following CNAME chains, though many upstream resolvers will do this by default, so it shouldn't matter in most cases. The lookup_ip interface is the only supported option at the moment. I will be adding others over time. It also does not read the system /etc/resolv.conf; again, a future feature. Search paths are also not yet supported, so only a FQDN, with the final ., will work.

Example:

use std::net::*;
use trust_dns_resolver::Resolver;
use trust_dns_resolver::config::*;

let mut resolver = Resolver::new(ResolverConfig::default(), ResolverOpts::default()).unwrap();
 
// NOTE: do not forget the final dot, as the resolver does not yet support search paths.
let mut response = resolver.lookup_ip("www.example.com.").unwrap();

let address = response.next().expect("no addresses returned!");
assert_eq!(address, IpAddr::V4(Ipv4Addr::new(93, 184, 216, 34))); 

I hope people find this easier to use than the client, thanks!

6 Likes

In addition to IPv6 being supported, there are a few IPv4 and IPv6 lookup strategies. The default is to perform parallel lookups of IPv4 and IPv6. The others are to first query IPv4 and on failure/no results query for IPv6. Then the inverse of that, IPv6 then IPv4. After that there are the final two options for only querying IPv6 or only IPv4. There's pretty decent coverage on all of these.
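One way to picture the five strategies is as a query plan: groups of record types, where types in the same group are queried together and a later group is only tried if the earlier one produced no addresses. The sketch below uses hypothetical names (`Strategy`, `RecordType`, `plan`, `run`) and runs the "parallel" group sequentially for simplicity; it is not the resolver's actual code.

```rust
use std::net::IpAddr;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum RecordType {
    A,    // IPv4
    Aaaa, // IPv6
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Strategy {
    Ipv4AndIpv6,
    Ipv4ThenIpv6,
    Ipv6ThenIpv4,
    Ipv4Only,
    Ipv6Only,
}

/// Turn a strategy into ordered groups of record types to query.
pub fn plan(strategy: Strategy) -> Vec<Vec<RecordType>> {
    match strategy {
        Strategy::Ipv4AndIpv6 => vec![vec![RecordType::A, RecordType::Aaaa]],
        Strategy::Ipv4ThenIpv6 => vec![vec![RecordType::A], vec![RecordType::Aaaa]],
        Strategy::Ipv6ThenIpv4 => vec![vec![RecordType::Aaaa], vec![RecordType::A]],
        Strategy::Ipv4Only => vec![vec![RecordType::A]],
        Strategy::Ipv6Only => vec![vec![RecordType::Aaaa]],
    }
}

/// Execute a plan with a caller-supplied query function.
/// Falls through to the next group only when a group returns nothing.
pub fn run<F>(strategy: Strategy, mut query: F) -> Vec<IpAddr>
where
    F: FnMut(RecordType) -> Vec<IpAddr>,
{
    for group in plan(strategy) {
        let mut found = Vec::new();
        for rtype in group {
            found.extend(query(rtype));
        }
        if !found.is_empty() {
            return found;
        }
    }
    Vec::new()
}
```

With a host that only has AAAA records, `Ipv4ThenIpv6` falls through to the second group, while `Ipv4Only` returns nothing.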

Next I'll start in on reading the system's resolv.conf.

0.2.0 Resolver

Added

  • ipv6 parallel lookup
  • multiple ipv4 and ipv6 lookup strategies
  • library documentation examples
  • test coverage for resolver

0.10.5 Client and Server

Added

  • Library documentation, examples for client query and update

Changed

  • ServerFuture now Accepts generic RequestHandler (@Antti)

2 Likes

TRust-DNS Resolver now has support for reading the resolv.conf. Specifically these settings are supported (these happen to be the defaults):

options ndots:1 timeout:5 attempts:2

## default domain is root zone, aka `.`
domain .

## example domain:
# domain example.com.

## There are no default search domains
# search other.example.com. sub.example.com.

nameserver 2001:4860:4860::8888
nameserver 2001:4860:4860::8844
nameserver 8.8.8.8
nameserver 8.8.4.4
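For readers who have not looked at the format before, a stripped-down parser for just these directives could look like the sketch below. It is an illustration only: `ResolvConf` and `parse_resolv_conf` are hypothetical names, the starting values mirror the defaults shown above, and unknown or malformed lines are silently ignored, which the real parser may well handle differently.

```rust
use std::net::IpAddr;

#[derive(Debug, PartialEq)]
pub struct ResolvConf {
    pub nameservers: Vec<IpAddr>,
    pub domain: Option<String>,
    pub search: Vec<String>,
    pub ndots: u32,
    pub timeout: u32,
    pub attempts: u32,
}

/// Parse the nameserver, domain, search, and options directives.
pub fn parse_resolv_conf(text: &str) -> ResolvConf {
    let mut conf = ResolvConf {
        nameservers: Vec::new(),
        domain: None,
        search: Vec::new(),
        ndots: 1,
        timeout: 5,
        attempts: 2,
    };
    for line in text.lines() {
        let line = line.trim();
        // Skip blank lines and comments.
        if line.is_empty() || line.starts_with('#') || line.starts_with(';') {
            continue;
        }
        let mut words = line.split_whitespace();
        match words.next() {
            Some("nameserver") => {
                if let Some(ip) = words.next().and_then(|w| w.parse().ok()) {
                    conf.nameservers.push(ip);
                }
            }
            Some("domain") => conf.domain = words.next().map(str::to_owned),
            Some("search") => conf.search = words.map(str::to_owned).collect(),
            Some("options") => {
                for opt in words {
                    // Options look like `key:value`, e.g. `ndots:1`.
                    let mut kv = opt.splitn(2, ':');
                    let key = kv.next().unwrap_or("");
                    let value = kv.next().and_then(|v| v.parse::<u32>().ok());
                    match (key, value) {
                        ("ndots", Some(v)) => conf.ndots = v,
                        ("timeout", Some(v)) => conf.timeout = v,
                        ("attempts", Some(v)) => conf.attempts = v,
                        _ => {} // unknown option or missing value: ignored here
                    }
                }
            }
            _ => {} // unsupported directive: ignored here
        }
    }
    conf
}
```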

The domain and search options will be regarded as fully qualified domains, with or without the final .. This required a minor parsing change to Name::from_str. Attempts, retry, and ndots support are new as well. The lookup is performed in this way: if the Name passed into Resolver::lookup_ip is a FQDN, then there will only be one lookup attempted. FQDNs end in a final ., e.g. www.example.com.. If the name is not a FQDN then the lookup order is this:

if ndots > 1; then try name as FQDN;
then try name+domain;
then try for search in searches; name+search;

The query lookups will end on the first successful IPs returned. The Client had some stuff cleaned up specifically around the Name implementations, full release notes are here.
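The lookup order can be sketched as a function that produces the list of candidate names to try, in order. One assumption to flag: the first step is implemented here using the usual resolv.conf convention, where a name containing at least ndots dots is tried as-is first; that is a reading of the pseudocode above, not something confirmed from the resolver's source. `candidates` and `join` are hypothetical names.

```rust
/// Append `suffix` to `name`, producing a fully qualified name.
/// An empty or root (".") suffix just adds the final dot.
fn join(name: &str, suffix: &str) -> String {
    let suffix = suffix.trim_matches('.');
    if suffix.is_empty() {
        format!("{}.", name)
    } else {
        format!("{}.{}.", name, suffix)
    }
}

/// List the names to query, in order, for a lookup of `name`.
pub fn candidates(name: &str, ndots: usize, domain: Option<&str>, search: &[&str]) -> Vec<String> {
    // A FQDN (final dot) is looked up exactly once, as-is.
    if name.ends_with('.') {
        return vec![name.to_string()];
    }
    let mut raw = Vec::new();
    // Assumption: enough dots means "try it as a FQDN first".
    if name.matches('.').count() >= ndots {
        raw.push(join(name, ""));
    }
    if let Some(d) = domain {
        raw.push(join(name, d));
    }
    for s in search {
        raw.push(join(name, s));
    }
    // Drop duplicates while keeping the first occurrence's position.
    let mut out: Vec<String> = Vec::new();
    for c in raw {
        if !out.contains(&c) {
            out.push(c);
        }
    }
    out
}
```

The resolver would walk this list and stop at the first candidate that returns addresses.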

0.3.0

Added

  • Added options attempts:N aka ResolverOpts::attempts support, aka retries
  • Added Google IPv6 nameservers as defaults for ResolverConfig::default
  • Added support for domain name search in ResolverConfig and LookupIpFuture
  • Added support for search names in ResolverConfig and LookupIpFuture
  • Added LookupIpFuture type alias to InnerLookupIpFuture<NameServerPool> compatibility

Changed

  • breaking LookupIpFuture renamed to InnerLookupIpFuture
  • breaking InnerLookupIpFuture now takes a generic parameter, generally <NameServerPool>

edit: made defaults more accurate...

5 Likes

Awesome tool!

How much did you think about this?

These kinds of defaults can really hurt by accident, because "internet-scale" is really big. To explain:
Assume for a second that the entire internet community says "awesome! pure rust DNS! Let's use only that!" (Or you get integrated into NPM, which thinks rust is pretty awesome, whichever scenario you like more).
Suddenly, millions of additional queries go to Google's DNS. That is quite an amount of traffic to be tossing at someone else's (free) service!

The pool.ntp.org community had this problem in December. Snapchat started using them for time-sync, but on iOS misused a library in its default settings due to a config mixup.
Suddenly, millions of iOS Snapchat users were sending dozens of NTP requests every time they opened the app. This effectively DDoS'ed the entire pool. To their credit, Snapchat handled the issue admirably, deploying a fix within 24h, but I think everyone would have been happier if it never happened at all.

I don't mean to criticize your work, I admire you for doing this hard foundation-of-the-internet stuff. I just see you being so wildly successful that this issue may come up :slight_smile:

The careful thing to do would be to try and get the system default DNS, which is probably annoying, platform-specific, and requires extra dependencies.
The easy way out is probably to just not have a Default impl if you can spare it, or use a non-routable IP if you really must have it. Network config rarely has sensible defaults anyway!

I'm pretty sure Google doesn't mind tons of valuable data being pushed their way for free. I'd rather go the other way around and say: Why would you choose the kraken as the default? I'd rather have no default than Google. It's annoying enough that it's obviously en vogue for everyone and their mother to throw Google DNS into their software for no good reason...

1 Like

Also a valid point. Every Google service is a data collector for their advertisement-flinging. (.. he wrote, from an account linked to his Gmail address...)

Thanks for the continued interest in the project!

I thought about this to the extent of whether or not I thought Google's PublicDNS services were capable of handling significant traffic or not. From their docs:

Mitigations

In Google Public DNS, we have implemented several approaches to speeding up DNS lookup times. Some of these approaches are fairly standard; others are experimental:

  • Provisioning servers adequately to handle the load from client traffic, including malicious traffic.
  • Preventing DoS and amplification attacks. Although this is mostly a security issue, and affects closed resolvers less than open ones, preventing DoS attacks also has a benefit for performance by eliminating the extra traffic burden placed on DNS servers. For information on the approaches we are using to minimize the chance of attacks, see the page on security benefits.
  • Load-balancing for shared caching, to improve the aggregated cache hit rate across the serving cluster.
  • Providing global coverage for proximity to all users.

Based on this and the other things in their docs, I do believe they are capable and have established enough capacity to deal with any load the Rust community could throw at it. This is definitely a valid concern, though.

Criticize away :wink:, seriously, I appreciate the feedback. I think you have raised a very valid concern. If this becomes wildly successful, that would be pretty amazing. Honestly, I just wanted to scratch an itch, successful or not, it's been a great way to become more familiar with Rust.

Yes, I did consider this. To be clear I have no affiliation with Google or any interest in giving away private data for their data mining desires. That being said, I'm open to other ideas here to make it more explicit that you use the Google name servers. For reference, here is their privacy statement for the service:

What we log

Google Public DNS stores two sets of logs: temporary and permanent. The temporary logs store the full IP address of the machine you're using. We have to do this so that we can spot potentially bad things like DDoS attacks and so we can fix problems, such as particular domains not showing up for specific users.

We delete these temporary logs within 24 to 48 hours.

In the permanent logs, we don't keep personally identifiable information or IP information. We do keep some location information (at the city/metro level) so that we can conduct debugging, analyze abuse phenomena. After keeping this data for two weeks, we randomly sample a small subset for permanent storage.

We don't correlate or combine information from our temporary or permanent logs with any personal information that you have provided Google for other services.

Finally, if you're interested in knowing what else we log when you use Google Public DNS, here is the full list of items that are included in our permanent logs:

  • Request domain name, e.g. www.google.com
  • Request type, e.g. A (which stands for IPv4 record), AAAA (IPv6 record), NS, MX, TXT, etc.
  • Transport protocol on which the request arrived, i.e. TCP, UDP, or HTTPS
  • Client's AS (autonomous system or ISP), e.g. AS15169
  • User's geolocation information: i.e. geocode, region ID, city ID, and metro code
  • Response code sent, e.g. SUCCESS, SERVFAIL, NXDOMAIN, etc.
  • Whether the request hit our frontend cache
  • Whether the request hit a cache elsewhere in the system (but not in the frontend)
  • Absolute arrival time in seconds
  • Total time taken to process the request end-to-end, in seconds
  • Name of the Google machine that processed this request, e.g. machine101
  • Google target IP to which this request was addressed, e.g. one of our anycast IP addresses (no relation to the user's IP)

I have this documented here; This uses the default configuration. Currently this sets the google resolvers as the upstream resolvers. I've just updated this to say:

This uses the default configuration, which sets the Google Public DNS as the upstream resolvers. Please see their privacy statement for important information about what they track, many ISP's track similar information in DNS.

On Default::default I had documented this similarly with: Creates a default configuration, using 8.8.8.8, 8.8.4.4 and 2001:4860:4860::8888, 2001:4860:4860::8844 (thank you, Google). I've updated this to include:

Please see Google's privacy statement for important information about what they track, many ISP's track similar information in DNS. To use the system configuration see: Resolver::from_system_conf and ResolverFuture::from_system_conf

My reasoning for all of this is that I wanted an out-of-the-box solution that would work for most people. All of it can be overridden and changed. I'm open to making the system's resolv.conf default, but my hesitation there is that I currently don't have support for reading the information out of the Windows Registry (or access to any machine to build that) and potentially other systems. See this issue for Windows: [Resolver] Add support for the Windows Registry for DNS configuration · Issue #171 · hickory-dns/hickory-dns · GitHub. I'd love help on all Windows related configuration and builds as my only tool for supporting that at the moment is AppVeyor (I don't run Windows in any capacity).

Again, I'm very open to changes where people see them as necessary. At the moment though, my preference is to make the library as easy to use as possible, and removing or changing the Default implementations may make that harder.

2 Likes

Ah, very thorough! You've honestly put a lot more thought into this than I had expected. :heart:

I agree that Google has slightly more resources than the NTP-pool volunteer community, so capacity won't be the problem.
The privacy aspects seem to be rather less-bad than I thought, especially for Google. The improved documentation covers my concerns in that respect (particularly nice that you reference the from_system_conf now. Also nice how from_system_conf has the #[cfg(not(unix))] to generate errors on Windows).

I thought about suggesting OpenDNS as an alternative, but it seems to have been acquired by Cisco since I last looked it up, and DynDNS belongs to Oracle now.
It seems my "there exist open, volunteer-run alternatives"-idea is a bit out-of-date...

1 Like