On Tue, 5 Sep 2000, Dave C. wrote:
> Just one note - speaking as someone quite familiar with DNS, I am
> fairly certain that the DNS system permits only letters, digits and
> dash.
Here's what RFC 1034 actually says. Firstly:
Each node has a label, which is zero to 63 octets in length. ...
Internally, programs that manipulate domain names should represent them
as sequences of labels, where each label is a length octet followed by
an octet string.
That passage makes no mention of any restriction on what the octets may
be, but then we have:
By convention, domain names can be stored with arbitrary case, but
domain name comparisons for all present domain functions are done in a
case-insensitive manner, assuming an ASCII character set, and a high
order zero bit. This means that you are free to create a node with
label "A" or a node with label "a", but not both as brothers; you could
refer to either using "a" or "A". When you receive a domain name or
label, you should preserve its case. The rationale for this choice is
that we may someday need to add full binary domain names for new
services; existing services would not be changed.
Notice that last bit: "full binary domain names". The internal structure
of the DNS imposes no restrictions on what the names may be. Programs
which manipulate the data are a different matter. Later there is this
section:
3.5. Preferred name syntax
Note: "preferred", not mandatory.
The DNS specifications attempt to be as general as possible in the rules
for constructing domain names. The idea is that the name of any
existing object can be expressed as a domain name with minimal changes.
However, when assigning a domain name for an object, the prudent user
will select a name which satisfies both the rules of the domain system
and any existing rules for the object, whether these rules are published
or implied by existing programs.
I think that's pretty general, isn't it? It then goes on:
For example, when naming a mail domain, the user should satisfy both the
rules of this memo and those in RFC-822. When creating a new host name,
the old rules for HOSTS.TXT should be followed. This avoids problems
when old software is converted to use domain names.
The following syntax will result in fewer problems with many
applications that use domain names (e.g., mail, TELNET).
    <domain> ::= <subdomain> | " "
    <subdomain> ::= <label> | <subdomain> "." <label>
    <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
    <ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>
    <let-dig-hyp> ::= <let-dig> | "-"
    <let-dig> ::= <letter> | <digit>
    <letter> ::= any one of the 52 alphabetic characters A through Z in
                 upper case and a through z in lower case
    <digit> ::= any one of the ten digits 0 through 9
I read that as a recommendation, not a restriction on the data that the
DNS can hold.
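To make the grammar concrete, here is a minimal Python sketch of a checker for the preferred syntax. The function names are invented for illustration, and the 63-character limit is taken from the label length stated in the RFC. Passing or failing this check says only whether a name follows the recommendation, not whether the DNS can hold it.

```python
import re

# <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
# A letter, optionally followed by letters/digits/hyphens that end in a
# letter or digit.
LABEL_RE = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_preferred_label(label: str) -> bool:
    """True if the label matches the preferred syntax and is at most 63 long."""
    return len(label) <= 63 and LABEL_RE.fullmatch(label) is not None


def is_preferred_name(name: str) -> bool:
    """True if every dot-separated label matches the preferred syntax."""
    # <domain> ::= <subdomain> | " "   (the single space alternative)
    if name == " ":
        return True
    return all(is_preferred_label(label) for label in name.split("."))
```

With this sketch, `is_preferred_name("example.com")` is true, while a name containing an underscore, or a label that starts with a digit or ends with a hyphen, is rejected.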
Note that while upper and lower case letters are allowed in domain
names, no significance is attached to the case. That is, two names with
the same spelling but different case are to be treated as if identical.
The labels must follow the rules for ARPANET host names. They must
start with a letter, end with a letter or digit, and have as interior
characters only letters, digits, and hyphen. There are also some
restrictions on the length. Labels must be 63 characters or less.
Now that's a curious sentence, given the generality of the comments
above. I took it to be an explanation of the example syntax above rather
than a general statement. (After all, the DNS *does* store domain names
other than in that syntax - the name that started this thread is one
such.)
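The distinction between what the DNS stores and what the preferred syntax recommends can be illustrated with a short Python sketch of the internal representation quoted earlier: each label is a length octet followed by an octet string, and comparison folds only ASCII letters. The function names and the sample label `foo_bar` are hypothetical, and the terminating zero octet assumes the usual convention that the root is the zero-length label.

```python
from typing import Iterable


def encode_name(labels: Iterable[bytes]) -> bytes:
    """Encode labels as <length octet><octets>..., ending with the root label."""
    out = bytearray()
    for label in labels:
        # Only the length is constrained here; the octets themselves are not.
        if not 1 <= len(label) <= 63:
            raise ValueError("label must be 1 to 63 octets long")
        out.append(len(label))
        out += label
    out.append(0)  # assumed: zero-length root label terminates the name
    return bytes(out)


def ascii_lower(label: bytes) -> bytes:
    """Fold ASCII A-Z to a-z and leave every other octet untouched."""
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in label)


def labels_equal(a: bytes, b: bytes) -> bool:
    """Case-insensitive comparison, assuming an ASCII character set."""
    return ascii_lower(a) == ascii_lower(b)
```

Here `encode_name([b"foo_bar", b"example", b"com"])` succeeds even though `foo_bar` is outside the preferred syntax, and `labels_equal(b"A", b"a")` is true, so the two could not coexist as brothers.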
--
Philip Hazel University of Cambridge Computing Service,
ph10@??? Cambridge, England. Phone: +44 1223 334714.