[getdns-api] UDP failover improvements

Willem Toorop willem at nlnetlabs.nl
Wed Feb 28 12:20:53 CET 2018


On 28-02-18 at 11:56, Robert Groenenberg wrote:
> Hi Willem, Sara,

Hi Robert,

> To improve (in our view) getdns with respect to the failover/retry
> behaviour towards UDP upstreams, we've made 1 fix and 2 enhancements:
> 
> 1) restrict the back_off value of an upstream to a configurable maximum.
> This prevents the back_off value (doubled at each timeout for an
> upstream) from growing until it rolls over. We didn't want the
> interval for retrying an upstream to grow to values like 2^16 or bigger
> when that upstream had an outage. Note that the retry interval is still
> counted in 'query attempts'; perhaps we want to make that time-based at
> some point.

Yes, the quick fix would be limiting that number.  In the longer term,
time-based backoffs are probably the way to go.  That would also be more
consistent with how stateful transports are currently handled.
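
The capping described in 1) can be sketched roughly as follows.  This is
only an illustration in Python, not getdns code: the Upstream class, the
on_timeout() helper and the MAX_BACK_OFF value are hypothetical names
chosen for the example.  Only the doubling on each timeout and the
configurable maximum come from the description above.

```python
from dataclasses import dataclass

# Hypothetical configurable maximum; the value is an assumption.
MAX_BACK_OFF = 32


@dataclass
class Upstream:
    name: str
    # Retry interval, counted in query attempts (not in time).
    back_off: int = 1


def on_timeout(up: Upstream, max_back_off: int = MAX_BACK_OFF) -> int:
    """Double the back-off on a timeout, but never exceed the maximum."""
    up.back_off = min(up.back_off * 2, max_back_off)
    return up.back_off


up = Upstream("upstream-a")
history = [on_timeout(up) for _ in range(8)]
# history grows 2, 4, 8, ... and then stays at MAX_BACK_OFF
```

With the cap in place, a long outage leaves back_off at the maximum
instead of letting it grow without bound.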

> 2) when an upstream has been unavailable and is found to be OK at some
> point, its back_off value is not reset. So on a subsequent timeout the
> back_off continues with the value from the previous failure. We consider
> this a bug.

Acknowledged!
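
To make the behaviour in 2) concrete, here is a small sketch of resetting
the back-off once an upstream answers again.  It is written in Python for
illustration and does not mirror the getdns sources: Upstream,
on_timeout(), on_success() and the initial value of 1 are assumptions.

```python
from dataclasses import dataclass

INITIAL_BACK_OFF = 1  # assumed starting value, for illustration only


@dataclass
class Upstream:
    back_off: int = INITIAL_BACK_OFF


def on_timeout(up: Upstream, max_back_off: int = 32) -> None:
    """Double the back-off on each timeout, up to a maximum."""
    up.back_off = min(up.back_off * 2, max_back_off)


def on_success(up: Upstream) -> None:
    """A reply arrived: forget the back-off from the previous failure."""
    up.back_off = INITIAL_BACK_OFF


up = Upstream()
for _ in range(3):
    on_timeout(up)       # outage: back_off becomes 2, 4, 8
on_success(up)           # upstream is OK again: back to the initial value
on_timeout(up)           # a later timeout starts doubling from scratch
```

Without the reset in on_success(), the last timeout would continue from
the value left over by the earlier outage.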

> 3) when all configured upstreams of a context are unavailable, in our
> view it makes more sense to retry these in a round-robin fashion instead
> of sticking to the back_off values (especially when one becomes
> unavailable earlier than another). The original backoff mechanism may
> lead to one unavailable upstream being tried hundreds or thousands of
> times before another one is given a try, while the latter may be
> available again. Switching to round-robin when all are unavailable for a
> number of attempts will lead to faster recovery.

Yes, that sounds good too.
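
The round-robin fallback in 3) could look roughly like the sketch below.
It is a simplified Python illustration, not the proposed patch: Context,
Upstream, pick_upstream() and the next_rr cursor are hypothetical, and
the "for a number of attempts" threshold mentioned above is left out to
keep the example short.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Upstream:
    name: str
    available: bool = True


@dataclass
class Context:
    upstreams: List[Upstream]
    next_rr: int = 0  # hypothetical round-robin cursor


def pick_upstream(ctx: Context) -> Upstream:
    """Prefer an available upstream; if none is, rotate through all."""
    for up in ctx.upstreams:
        if up.available:
            return up
    # All upstreams are unavailable: give each one a try in turn
    # instead of waiting out the individual back-off values.
    up = ctx.upstreams[ctx.next_rr % len(ctx.upstreams)]
    ctx.next_rr = (ctx.next_rr + 1) % len(ctx.upstreams)
    return up


ctx = Context([Upstream("a", False), Upstream("b", False), Upstream("c", False)])
tried = [pick_upstream(ctx).name for _ in range(4)]  # cycles a, b, c, a
```

Because every upstream gets a turn, one that has recovered is found
within at most one full cycle.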

> I have these changes available on top of the latest 'develop' branch.
> Shall I create pull-requests for them?
> (Credits also go to my colleague Shikha Sharma)

Yes please.

I am currently in the process of reorganizing upstream management, so
your changes may not remain exactly as provided, but they will
nevertheless be a good starting point for re-evaluating backoff handling
for stateless upstreams.

Thanks and cheers!
-- Willem
> 
> Cheers,
> Robert