[getdns-api] async comments (0.268)

Phillip Hallam-Baker hallam
Wed Feb 6 16:36:13 CET 2013


On Tue, Feb 5, 2013 at 11:41 PM, William Chan (???)
<willchan at chromium.org> wrote:

> On Wed, Feb 6, 2013 at 1:27 PM, Dan Winship <dan.winship at gmail.com> wrote:
> > On 02/05/2013 08:59 PM, Paul Hoffman wrote:
> >>> getdns() implementation authors (which, in the long run really means
> >>> "libc/libresolv maintainers")
> >>
> >> Stop right there. There is *no* assumption that this will be part of
> >> libc or libresolv any time soon.
> >
> > "It is important that the implementation should try to replicate as best
> > as possible the logic of a local getaddrinfo() when creating a new
> > context."
> >
> > In the short term, you might be able to hardcode the behavior of
> > getaddrinfo() on specific releases of specific OSes that you care about,
> > but that doesn't seem like a plausible long-term solution. (None of the
> > other async DNS libraries have managed to achieve this, why would getdns
> > be any different?)
>
> FWIW, Chromium's async DNS library successfully (as far as we can
> tell) replicates the behavior from getaddrinfo() on Mac and Linux and
> we're wrapping up Windows support (there's some trickery in detecting
> dual stack support in the same way Windows does...we're working on
> it).
>
> >
> >> That's exactly right: they don't want to, and can't. If there was only
> >> one async library that I had to worry about, you would be correct, but it
> >> is very clear different people want to use different async libraries. This
> >> leaves four choices:
> >> a) I pick one and ignore the users of all other async libraries
> >> b) I make a generic hole for those libraries and try to shoehorn every
> >>    possible library's calls into that hole
> >> c) I don't do async
> >> d) I leave it up to the implementer, who will certainly hear from the
> >>    application developers about which libraries they want supported
> >> I chose (d).
> >
> >> It sounds like you want (b) that slouches into (a).
> >
> > I want exactly (b). (And I don't think (d) would actually work, at all.)
> >
> >>> On unixy platforms, all getdns() implementations are going to be based
> >>> on sockets and timeouts, and all event loops are going to be based on
> >>> poll() or something equivalent.
> >>
> >> Err, no. There are many more choices than that. In fact, today, I
> >> suspect that many more applications use { libevent | libev } instead of
> >> polling.
> >
> > I wasn't saying the applications were based on poll, I was saying the
> > event loops are. Sure, libevent can use poll or epoll or kqueue or
> > whatever, but those are all just variations on the theme of "let me know
> > when one of these file descriptors is ready". So if there's an API that
> > lets getdns tell the local event loop what fds it cares about, then that
> > would let it integrate with libevent, libuv, GLib, Qt, Twisted, etc.
>

Looking at the docs, I think Chromium is doing something closer to what I
suggested:

http://www.chromium.org/developers/design-documents/dns-prefetching

Since some DNS resolutions can take a long time, it is paramount that such
delays in one resolution should not cause delays in other resolutions.
Toward this end (on Windows, where there is no native support for
asynchronous DNS resolution), Chromium currently employs 8 completely
asynchronous worker threads to do nothing but perform DNS prefetch
resolution. Each worker thread simply waits on a queue, gets the next
requested domain name, and then blocks on a synchronous Windows resolution
function. Eventually the operating system responds with a DNS resolution,
the thread then discards it (leaving the OS cache warmed!), and waits for
the next prefetch request. With 8 threads, it is rare that more than one
or two threads will block extensively, and most resolutions proceed rather
quickly (or as quickly as DNS can service them!). On Debug builds, the
"about:histograms/DNS.PrefetchQueue" has current stats on the queueing
delay.


I interpret that to mean that the basic building block here is a blocking
call to the DNS resolver. Imagine something like the following:

void dns_lookup (dns_work_item *work) {
    /* 1. construct the request from the work item */
    /* 2. blocking UDP send to the resolver */
    /* 3. blocking UDP wait for the response, with a timeout */
    /* 4. parse the response back into the work item */
}

Then there is a separate worker routine, which is essentially a dispatch
loop around dns_lookup:

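/* dispatch(), wait() and lock are placeholders for whatever threading
   primitives the platform provides. */
void farm_tasks (void (*delegate)(void *work),
                 void **work_items, int number_items, int max_threads) {
    int i, threads = 0;
    /* create lock here */
    for (i = 0; i < number_items; i++) {
        dispatch (delegate, work_items[i], lock);  /* start one worker */
        if (++threads >= max_threads) {
            wait (lock);   /* block until some worker finishes */
            threads--;
        }
    }
    /* wait for the remaining workers before returning */
}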

None of the code in the farm_tasks routine is DNS specific, and most of it
would be useful in a wide range of farmed-task applications, so it does not
look to me like something you would need or want in a DNS library. It
probably belongs in a companion library. I would not personally find it
useful in the DNS library itself, because my DNS lookup routines are going
to involve more than just the common DNS code: there is going to be a
(small) amount of code to put the returned results into the data structures
I use in my applications.
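
For illustration, here is a minimal sketch of what such a companion routine
could look like, assuming POSIX threads. The names task_fn, struct farm,
struct farm_job, farm_worker and square_in_place are made up for this
example, and this farm_tasks returns a status code, unlike the pseudocode
above. It starts one thread per work item and joins them all at the end,
which keeps the sketch short but is not how a production pool would be
built.

```c
#include <pthread.h>
#include <stdlib.h>

typedef void (*task_fn)(void *work);

/* Shared state: how many workers are running right now. */
struct farm {
    pthread_mutex_t lock;
    pthread_cond_t slot_free;
    int running;
};

struct farm_job {
    struct farm *farm;
    task_fn delegate;
    void *work;
};

static void *farm_worker(void *arg) {
    struct farm_job *job = arg;
    job->delegate(job->work);               /* the blocking call */
    pthread_mutex_lock(&job->farm->lock);
    job->farm->running--;                   /* give the slot back */
    pthread_cond_signal(&job->farm->slot_free);
    pthread_mutex_unlock(&job->farm->lock);
    return NULL;
}

/* Runs delegate on every item, at most max_threads at a time.
   Returns 0 on success, -1 on bad arguments or resource failure. */
int farm_tasks(task_fn delegate, void **work_items,
               int number_items, int max_threads) {
    struct farm farm;
    pthread_t *threads;
    struct farm_job *jobs;
    int i, started = 0, rc = 0;

    if (number_items <= 0) return 0;
    if (max_threads < 1) return -1;
    threads = malloc((size_t)number_items * sizeof *threads);
    jobs = malloc((size_t)number_items * sizeof *jobs);
    if (!threads || !jobs) { free(threads); free(jobs); return -1; }
    pthread_mutex_init(&farm.lock, NULL);
    pthread_cond_init(&farm.slot_free, NULL);
    farm.running = 0;

    for (i = 0; i < number_items; i++) {
        pthread_mutex_lock(&farm.lock);
        while (farm.running >= max_threads)  /* wait for a free slot */
            pthread_cond_wait(&farm.slot_free, &farm.lock);
        farm.running++;
        pthread_mutex_unlock(&farm.lock);

        jobs[i].farm = &farm;
        jobs[i].delegate = delegate;
        jobs[i].work = work_items[i];
        if (pthread_create(&threads[i], NULL, farm_worker, &jobs[i]) != 0) {
            rc = -1;                         /* stop dispatching on failure */
            break;
        }
        started++;
    }
    for (i = 0; i < started; i++)            /* wait for every worker */
        pthread_join(threads[i], NULL);

    pthread_cond_destroy(&farm.slot_free);
    pthread_mutex_destroy(&farm.lock);
    free(threads);
    free(jobs);
    return rc;
}

/* Illustrative delegate: squares the int that work points to. */
void square_in_place(void *work) {
    int *n = work;
    *n *= *n;
}
```

Nothing here knows about DNS: a caller would pass dns_lookup (or a wrapper
that also fills in application data structures) as the delegate.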

In general the rule on async programming should probably be that a service
API like a DNS API should neither create nor terminate threads.

If you have an environment like .NET where the UDP calls already have async
versions, you might want to splice the 'handle DNS response' work into the
callback initiated by the UDP call.


I think this discussion is demonstrating the real reason that writing a
DNS API is hard. The problem is that there are many styles of async
programming and the C programming environment does not force a particular
choice on the programmer. When selecting an API to use inside a program,
one of the chief design considerations is going to be whether the async
approach used by the API is compatible with the style the programmer is
already using elsewhere.

This is one of the main reasons the programming world in general has moved
on from C and pretty much decided to barf on C++. It is perfectly easy to
write a C library to handle structures like linked lists. But there isn't a
linked list implementation in the base language. So every C API that needs
linked lists has to fashion its own, each library takes a different
approach, and when you combine a dozen APIs you end up with a dozen
different ways of handling linked lists.


I don't think that finding the perfect C async API is possible here. But it
probably isn't necessary either. The only parts of the API that are async
are the I/O calls, which are actually quite easy to code. The hard parts of
the API are the calls to encode and decode messages. I therefore suggest
the following as an 80:20 solution:

Core blocks:

dns_get_host_resolver - return a structure with the IP addresses of the
host DNS services
dns_encode_message - convert a dns_message structure to an array of bytes
dns_decode_message - convert an array of bytes to a dns_message structure
dns_decode_record - convert the RDATA portion of a DNS record to an
appropriate C structure

dns_query_blocking - make a single DNS query and block on the response or a
timeout, with no internal caching


The core blocks are required to build a higher level interface regardless
of what that interface might look like. So we might as well expose them to
the programmer who might want to use them. I often find it less difficult
to roll my own code than to try to work out which particular style of
async another API uses.
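
To give a feel for the encoding side, here is a rough sketch of a
much-reduced dns_encode_message that handles only a query with a single
question, following the RFC 1035 wire format (12-byte header, then the name
as length-prefixed labels, then type and class). The struct dns_query type
and the names dns_encode_query and put_u16 are invented for this example
and stand in for whatever the real dns_message structure would be.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for dns_message: a query with one question. */
struct dns_query {
    uint16_t id;        /* transaction ID chosen by the caller */
    const char *name;   /* e.g. "example.com" (trailing dot optional) */
    uint16_t qtype;     /* 1 = A, 28 = AAAA */
    uint16_t qclass;    /* 1 = IN */
};

/* Store a 16-bit value in network (big-endian) byte order. */
static void put_u16(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

/* Encodes q into buf. Returns the number of bytes written,
   or 0 if the name is malformed or buf is too small. */
size_t dns_encode_query(const struct dns_query *q,
                        uint8_t *buf, size_t buflen) {
    size_t pos = 12;                 /* the header is always 12 bytes */
    const char *label = q->name;

    if (buflen < 12) return 0;
    memset(buf, 0, 12);
    put_u16(buf, q->id);
    put_u16(buf + 2, 0x0100);        /* standard query, recursion desired */
    put_u16(buf + 4, 1);             /* QDCOUNT = 1 */

    while (*label != '\0') {         /* one length-prefixed label per pass */
        const char *dot = strchr(label, '.');
        size_t len = dot ? (size_t)(dot - label) : strlen(label);
        if (len == 0 || len > 63 || pos + 1 + len > buflen) return 0;
        buf[pos++] = (uint8_t)len;
        memcpy(buf + pos, label, len);
        pos += len;
        label += len;
        if (*label == '.') label++;
    }
    if (pos - 12 > 254) return 0;    /* encoded name is limited to 255 bytes */
    if (pos + 5 > buflen) return 0;  /* root label + QTYPE + QCLASS */
    buf[pos++] = 0;                  /* root label terminates the name */
    put_u16(buf + pos, q->qtype);
    put_u16(buf + pos + 2, q->qclass);
    return pos + 4;
}
```

The resulting buffer is what a blocking or async I/O layer would send over
UDP; the encoder itself does no I/O at all, which is why it fits any async
style.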

One higher level interface might look something like:

dns_client_init - initialize a DNS client
dns_client_query - blocking client query
dns_client_query_start - initiate an async client query (non blocking)
dns_client_query_end - complete an async client query (blocking)
dns_client_query_prefetch - queue up a domain anticipating a future query
(non blocking)


Adding prefetching to the model has some interesting consequences. It means
that schemes like Certificate Transparency make a lot more sense. It might
also explain why Chrome has a habit of freezing on certain web pages from
my home network. I think that what is happening is that the simultaneous
prefetches are overwhelming the DNS resolver.

-- 
Website: http://hallambaker.com/