[getdns-users] bindata string encoding?

Daniel Kahn Gillmor dkg at fifthhorseman.net
Fri Jul 10 20:02:19 CEST 2015


On Fri 2015-07-10 13:09:04 -0400, Robert Edmonds wrote:
> Related: porting this code to Python 3 brings up a choice between
> treating these fields as the Python 3 'str' type, or the Python 3
> 'bytes' type.  'str' is a little bit easier to work with if dealing with
> textual data (IMO), but it requires selecting an explicit character
> encoding.  If you're going to explicitly allow these fields to contain
> C-style string termination, there's not much harm in also explicitly
> defining that they must be encoded with a specific character encoding,
> say UTF-8 or 7-bit US-ASCII.  That would make language bindings authors
> happier :-)

Python 3's distinction between string data and raw bytes is really
useful.  For things like DNS labels (and maybe all dictionary labels?)
maybe we can make an explicit statement that they must be UTF-8-encoded
text (UTF-8 is a superset of 7-bit US-ASCII).  But for response data
from the DNS in general, i don't think we're able to make such a
categorical assumption, so i think it'll need to be converted to raw
bytes objects.
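
A rough sketch of what that split might look like in a Python 3
binding.  The helper names are hypothetical (they are not part of
getdns or its Python bindings), and stripping a single trailing NUL is
only an assumption about how label-like bindatas might be terminated:

```python
def label_to_str(raw: bytes) -> str:
    """Decode a label-like bindata as UTF-8 text (hypothetical helper).

    Assumption: the field may carry one C-style trailing NUL, which is
    dropped before decoding.
    """
    if raw.endswith(b"\x00"):
        raw = raw[:-1]
    # Strict decoding: raises UnicodeDecodeError if the input is not UTF-8.
    return raw.decode("utf-8")


def rdata_to_bytes(raw) -> bytes:
    """Keep general DNS response data as opaque bytes (hypothetical helper)."""
    return bytes(raw)


if __name__ == "__main__":
    print(label_to_str(b"example.com.\x00"))
    print(rdata_to_bytes(bytearray(b"\xc0\xa8\x00\x01")))
```

With strict decoding, a label that is not valid UTF-8 fails loudly
instead of silently turning into mangled text, which is the point of
stating the encoding explicitly.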

It would be nice if the getdns API itself were to distinguish between
these types, but i think it's generic enough to not want to do that.

      --dkg

