[getdns-api] async comments (0.268)

Dan Winship dan.winship
Mon Feb 4 16:04:21 CET 2013


Some comments from the point of view of thinking about reimplementing
GLib's getaddrinfo()-in-threads-based resolver
(http://developer.gnome.org/gio/stable/GResolver.html) with one based on
getdns()...


> The callback function might be called at any time, even before
> getdns() has returned.

Our experience in GNOME has been that this tends to lead to bugs: the
code after the getdns() call has to deal with two possible states of
the world (e.g., the userarg data may or may not have been freed), and
so code like:

  status = getdns (context, name, GETDNS_RRTYPE_A, NULL,
                   myuserdata, &transaction_id, mycallback);
  myuserdata->id = transaction_id;

would be wrong, but not obviously so (and it might actually work 100%
of the time with one getdns implementation, but fail sporadically with
others). It's not much harder for the getdns() implementation to just
guarantee that it won't invoke the callback until after you return to
the event loop, and then you protect the caller from that class of
bugs.
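
To make that guarantee concrete, here is a minimal sketch of a lookup
function that always queues its callback, even when the answer is
already available. Everything in it (fake_lookup, loop_iterate, the
fixed-size queue) is a made-up stand-in for illustration, not part of
the getdns spec or of GLib.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the getdns spec. */
typedef void (*lookup_callback_t)(void *userarg, unsigned long transaction_id);

struct pending {
    lookup_callback_t callback;
    void *userarg;
    unsigned long transaction_id;
};

#define MAX_PENDING 16
static struct pending queue[MAX_PENDING];
static size_t queue_len = 0;
static unsigned long next_id = 1;

/* Starts a "lookup" whose answer is already known (say, from a cache).
 * Rather than invoking the callback right away, it queues the callback
 * so that it only runs once the caller is back in the event loop. */
static int fake_lookup(lookup_callback_t callback, void *userarg,
                       unsigned long *transaction_id)
{
    assert(callback != NULL && transaction_id != NULL);
    if (queue_len == MAX_PENDING)
        return -1;
    *transaction_id = next_id++;
    queue[queue_len].callback = callback;
    queue[queue_len].userarg = userarg;
    queue[queue_len].transaction_id = *transaction_id;
    queue_len++;
    return 0;
}

/* One iteration of a toy event loop: dispatch every queued callback
 * and return how many were dispatched. */
static size_t loop_iterate(void)
{
    size_t dispatched = 0;
    while (dispatched < queue_len) {
        struct pending *p = &queue[dispatched++];
        p->callback(p->userarg, p->transaction_id);
    }
    queue_len = 0;
    return dispatched;
}

/* Caller-side data, mirroring the myuserdata->id pattern above. */
struct my_userdata {
    unsigned long id;
    int callback_saw_matching_id;
};

static void my_callback(void *userarg, unsigned long transaction_id)
{
    struct my_userdata *data = userarg;
    data->callback_saw_matching_id = (data->id == transaction_id);
}
```

With this shape, the caller can store the transaction id after the
call returns and still be sure the callback will see it, because the
callback cannot run before loop_iterate() does.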


> getdns_cancel_callback() may return immediately, even before the
> callback finishes its work and returns.

As above, the "may" makes things messy; it should either always call
the callback itself before returning, or always just schedule the
callback to be called upon returning to the event loop.



> Each implementation of the DNS API will specify an extension function
> that tells the DNS context which event base is being used.

This seems inconvenient for everyone involved except the API
specification author. :-)

getdns() implementation authors (which, in the long run, really means
"libc/libresolv maintainers") don't want to have to know about every
possible event loop implementation. (And they can't anyway, and even
if they did, they'd have no good way to integrate with non-C-based
ones.)

Event loop implementation authors don't want to have to worry about
getting every getdns() implementation to support them, and don't want
to have to write N different integration thingies for N different
getdns() implementations.

getdns() users don't want to have to write:

  #if defined (HAVE_GETDNS_EXTENSION_SET_LIBEVENT_BASE)
      getdns_extension_set_libevent_base (context);
  #elif defined (HAVE_GETDNS_EXTENSION_SET_EVENTBASE_FOR_LIBEVENT)
      getdns_extension_set_eventbase_for_libevent (context);
  #else
  #error "Don't know how to set up getdns() on this platform"
  #endif

There needs to just be a standard part of the API that can be used to
register any event loop with any getdns() implementation. (Or at
least, there needs to be an API that any unix event loop
implementation can use, and an API that any Windows event loop
implementation can use, etc.)


On unixy platforms, all getdns() implementations are going to be based
on sockets and timeouts, and all event loops are going to be based on
poll() or something equivalent. So something like this would work:

typedef void (*getdns_event_loop_add_fd_t)(
  getdns_context_t       context,
  void                   *looparg,
  getdns_transaction_t   transaction_id,
  struct pollfd          pollfd,
  int                    timeout
);

typedef void (*getdns_event_loop_remove_fd_t)(
  getdns_context_t       context,
  void                   *looparg,
  getdns_transaction_t   transaction_id
);

void getdns_context_set_event_loop(
  getdns_context_t               context,
  getdns_event_loop_add_fd_t     add_fd_function,
  getdns_event_loop_remove_fd_t  remove_fd_function,
  void                           *looparg
);

The app would call getdns_context_set_event_loop() (either directly,
or via some helper function provided by the event loop library), and
then getdns would call the add_fd_function and remove_fd_function as
it needed, to change the set of sockets to poll. When one of the fds
was ready, or the timeout expired, the event loop would call something
like

void getdns_context_process_event(
  getdns_context_t       context,
  getdns_transaction_t   transaction_id,
  int                    fd,
  bool                   timed_out
);
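
For illustration, the event loop's side of that proposal could look
roughly like the sketch below. It is a toy, not a real integration:
getdns_context_t and getdns_transaction_t are given placeholder
definitions, getdns_context_process_event() is stubbed out to record
its arguments (a real one would live inside the getdns implementation),
and toy_loop, toy_add_fd, toy_remove_fd and toy_loop_iterate are
hypothetical names. Timeouts are handled naively: each iteration waits
for the shortest registered timeout and does not track elapsed time
across iterations.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Placeholder types; the real definitions would come from the getdns API. */
typedef struct getdns_context *getdns_context_t;
typedef unsigned long getdns_transaction_t;

#define MAX_WATCHES 8
struct watch {
    bool used;
    getdns_transaction_t id;
    struct pollfd pfd;
    int timeout;               /* milliseconds, negative for none */
};
struct toy_loop {
    getdns_context_t context;
    struct watch watches[MAX_WATCHES];
};

/* Stub for the library side: just records what the loop reported. */
static getdns_transaction_t last_transaction;
static int last_fd = -1;
static bool last_timed_out;
static void getdns_context_process_event(getdns_context_t context,
    getdns_transaction_t transaction_id, int fd, bool timed_out)
{
    (void)context;
    last_transaction = transaction_id;
    last_fd = fd;
    last_timed_out = timed_out;
}

/* add_fd_function: remember the fd in the first free slot. */
static void toy_add_fd(getdns_context_t context, void *looparg,
    getdns_transaction_t transaction_id, struct pollfd pollfd, int timeout)
{
    struct toy_loop *loop = looparg;
    loop->context = context;
    for (int i = 0; i < MAX_WATCHES; i++) {
        if (!loop->watches[i].used) {
            loop->watches[i] = (struct watch){ .used = true,
                .id = transaction_id, .pfd = pollfd, .timeout = timeout };
            return;
        }
    }
    /* Table full: silently dropped in this sketch. */
}

/* remove_fd_function: forget every fd belonging to the transaction. */
static void toy_remove_fd(getdns_context_t context, void *looparg,
    getdns_transaction_t transaction_id)
{
    struct toy_loop *loop = looparg;
    (void)context;
    for (int i = 0; i < MAX_WATCHES; i++)
        if (loop->watches[i].used && loop->watches[i].id == transaction_id)
            loop->watches[i].used = false;
}

/* One loop iteration: poll() all watched fds, then report ready fds and
 * expired timeouts. Returns the number of events reported, or -1. */
static int toy_loop_iterate(struct toy_loop *loop)
{
    struct pollfd fds[MAX_WATCHES];
    int slot[MAX_WATCHES];
    nfds_t n = 0;
    int timeout = -1;

    for (int i = 0; i < MAX_WATCHES; i++) {
        struct watch *w = &loop->watches[i];
        if (!w->used)
            continue;
        fds[n] = w->pfd;
        fds[n].revents = 0;
        slot[n++] = i;
        if (w->timeout >= 0 && (timeout < 0 || w->timeout < timeout))
            timeout = w->timeout;
    }
    if (n == 0)
        return 0;
    int ready = poll(fds, n, timeout);
    if (ready < 0)
        return -1;

    int dispatched = 0;
    for (nfds_t k = 0; k < n; k++) {
        struct watch *w = &loop->watches[slot[k]];
        bool fired = fds[k].revents != 0;
        bool expired = (ready == 0 && w->timeout == timeout);
        if (!w->used || (!fired && !expired))
            continue;
        getdns_context_process_event(loop->context, w->id, w->pfd.fd, !fired);
        dispatched++;
    }
    return dispatched;
}
```

An event loop library would pass toy_add_fd and toy_remove_fd, plus a
pointer to its loop state as looparg, to
getdns_context_set_event_loop(), and would run something like
toy_loop_iterate() as part of its normal dispatch cycle.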


I don't know how people expect async stuff to work on Windows, so I'm
not sure what the API would have to look like there. It might involve
replacing all the fd and pollfd args with HANDLEs.



> 1.5 Calling the API Synchronously (Without Events)

I think the getdns_sync_request() API probably makes sense the way it
is, and would be useful to lots of people, but just for the record, the
fact that it isn't cancellable means we wouldn't be able to use it to
implement sync lookups in GLib. But making it cancellable would imply
making it thread-safe too (since another thread would be the only
place you could be cancelling from), so you probably don't want to go
there.

(And anyway, we can fake a cancellable synchronous lookup by doing an
asynchronous lookup in a temporary event loop.)
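
A rough sketch of that trick, with the async machinery faked so the
example stands alone. None of these names (fake_request,
temp_loop_iterate, lookup_sync) exist in getdns or GLib; the "request"
simply completes with the value 42 after a given number of loop
iterations, and the cancel flag stands in for whatever cancellation
object the caller's framework provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*done_callback_t)(void *userarg, int result);

/* A fake asynchronous request that finishes after a fixed number of
 * iterations of the temporary loop. */
struct fake_request {
    bool active;
    int iterations_left;
    done_callback_t callback;
    void *userarg;
};

static void fake_async_start(struct fake_request *req, int iterations,
                             done_callback_t callback, void *userarg)
{
    assert(iterations > 0 && callback != NULL);
    req->active = true;
    req->iterations_left = iterations;
    req->callback = callback;
    req->userarg = userarg;
}

static void fake_async_cancel(struct fake_request *req)
{
    req->active = false;
}

/* One iteration of the temporary event loop. */
static void temp_loop_iterate(struct fake_request *req)
{
    if (req->active && --req->iterations_left == 0) {
        req->active = false;
        req->callback(req->userarg, 42);   /* 42 is a dummy result */
    }
}

struct sync_state {
    bool done;
    int result;
};

static void on_done(void *userarg, int result)
{
    struct sync_state *state = userarg;
    state->result = result;
    state->done = true;
}

/* Blocks until the request completes or *cancelled becomes true (which
 * another thread may set). Returns 0 and stores the result on success,
 * or -1 if the lookup was cancelled. */
static int lookup_sync(int iterations, atomic_bool *cancelled, int *result)
{
    struct fake_request req;
    struct sync_state state = { false, 0 };

    fake_async_start(&req, iterations, on_done, &state);
    while (!state.done) {
        if (cancelled != NULL && atomic_load(cancelled)) {
            fake_async_cancel(&req);
            return -1;
        }
        temp_loop_iterate(&req);
    }
    *result = state.result;
    return 0;
}
```

The point is only the control flow: the synchronous wrapper owns a
private loop, so it can check for cancellation between iterations
without the underlying synchronous API having to be thread-safe.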


