Can (Domain Name) Subdomains Have an Underscore "_" in Them?


Answer:

Most answers given here are false. It is perfectly legal to have an underscore in a domain name. Let me quote the standard, RFC 2181, section 11, "Name syntax":

The DNS itself places only one restriction on the particular labels that can be used to identify resource records. That one restriction relates to the length of the label and the full name. [...] Implementations of the DNS protocols must not place any restrictions on the labels that can be used. In particular, DNS servers must not refuse to serve a zone because it contains labels that might not be acceptable to some DNS client programs.
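The part elided from the quote above gives the actual numbers: a label is between 1 and 63 octets, and a full domain name is at most 255 octets including separators. A minimal sketch of a check that enforces only those limits could look like this. The function name is made up for illustration, and it assumes the name is plain ASCII in dotted form with no escaped characters:

```python
MAX_LABEL_OCTETS = 63   # per-label limit described in RFC 2181, section 11
MAX_NAME_OCTETS = 255   # full-name limit, counted in wire format


def fits_dns_length_limits(name: str) -> bool:
    """Return True if `name` respects the DNS length limits.

    Hypothetical helper: it deliberately checks nothing about which
    characters appear in the labels, only how long they are.
    """
    # A single trailing dot just marks the name as fully qualified.
    if name.endswith("."):
        name = name[:-1]
    if name == "":
        return True  # the root name

    # Assumption: ASCII only; a non-ASCII name raises UnicodeEncodeError.
    labels = [label.encode("ascii") for label in name.split(".")]

    # Every label must be 1 to 63 octets long (no empty labels).
    if any(not 1 <= len(label) <= MAX_LABEL_OCTETS for label in labels):
        return False

    # On the wire each label carries one length octet, and the name
    # ends with a zero-length root label.
    wire_length = sum(len(label) + 1 for label in labels) + 1
    return wire_length <= MAX_NAME_OCTETS
```

With this check, `fits_dns_length_limits("_jabber._tcp.gmail.com")` is true: the underscore plays no role at the DNS level.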

See also the original DNS specification, RFC 1034, section 3.5 "Preferred name syntax" but read it carefully.

Domains with underscores are very common in the wild. Check _jabber._tcp.gmail.com or _sip._udp.apnic.net.

Other RFCs mentioned here deal with different things. The original question was about domain names. If the question is about host names (or about URLs, which include a host name), then the answer is different: the relevant standard is RFC 1123, section 2.1 "Host Names and Numbers", which limits host names to letters, digits, and hyphens.


A note on terminology, following up on Bortzmeyer's answer

One should be clear about definitions. As used here:

  • domain name is the identifier of a resource in a DNS database
  • label is the part of a domain name in between dots
  • hostname is a special type of domain name which identifies Internet hosts

The hostname is subject to the restrictions of RFC 952 and the slight relaxation of RFC 1123.

RFC 2181 makes clear that there is a difference between a domain name and a hostname:

...[the fact that] any binary label can have an MX record does not imply that any binary name can be used as the host part of an e-mail address...

So underscores in hostnames are a no-no, underscores in domain names are a-ok.

In practice, one may well see hostnames with underscores. As the Robustness Principle says: "Be conservative in what you send, liberal in what you accept".

A note on encoding

In the 21st century, it turns out that hostnames as well as domain names may be internationalized! This means resorting to encodings for labels that contain characters outside the allowed set.

In particular, it allows one to encode the _ in hostnames (Update 2017-07: This is doubtful, see comments. The _ still cannot be used in hostnames. Indeed, it cannot even be used in internationalized labels.)

The first RFC for internationalization was RFC 3490 of March 2003, "Internationalizing Domain Names in Applications (IDNA)". Today, we have:

  • RFC 5890 "IDNA: Definitions and Document Framework"
  • RFC 5891 "IDNA: Protocol"
  • RFC 5892 "The Unicode Code Points and IDNA"
  • RFC 5893 "Right-to-Left Scripts for IDNA"
  • RFC 5894 "IDNA: Background, Explanation, and Rationale"
  • RFC 5895 "Mapping Characters for IDNA 2008"

You may also want to check the Wikipedia entry.
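To get a feel for what such an encoding looks like, here is a small sketch using Python's built-in "idna" codec. Note the assumption: that codec implements the older RFC 3490 (IDNA 2003), not the IDNA 2008 documents listed above, so results can differ for some characters. The helper names and the example.invalid-style test name are made up for illustration:

```python
def to_ascii(name: str) -> str:
    """Encode a Unicode domain name to its ASCII form.

    Uses Python's standard "idna" codec, which follows RFC 3490
    (IDNA 2003), not the newer IDNA 2008 RFCs.
    """
    return name.encode("idna").decode("ascii")


def to_unicode(name: str) -> str:
    """Decode an ASCII ("xn--") domain name back to Unicode."""
    return name.encode("ascii").decode("idna")


if __name__ == "__main__":
    original = "b\u00fccher.example"  # "bücher.example"
    encoded = to_ascii(original)
    print(encoded)                    # xn--bcher-kva.example
    print(to_unicode(encoded) == original)
```

The encoded label uses only letters, digits, and hyphens, which is why it can pass through software that only accepts classic hostnames.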

RFC 5890 introduces the term LDH (Letter-Digit-Hyphen) label for labels used in hostnames and says:

This is the classical label form used, albeit with some additional restrictions, in hostnames (RFC 952). Its syntax is identical to that described as the "preferred name syntax" in Section 3.5 of RFC 1034 as modified by RFC 1123. Briefly, it is a string consisting of ASCII letters, digits, and the hyphen with the further restriction that the hyphen cannot appear at the beginning or end of the string. Like all DNS labels, its total length must not exceed 63 octets.
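The description quoted above translates fairly directly into a regular expression. The following is a sketch only, with hypothetical function names. It covers just the LDH label syntax as summarized in that paragraph, not any further restrictions an application may impose on hostnames:

```python
import re

# ASCII letters, digits, and hyphen; no hyphen at the start or end;
# 1 to 63 characters in total (one octet each, since this is ASCII).
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_label(label: str) -> bool:
    """Return True if `label` matches the LDH label syntax."""
    return LDH_LABEL.fullmatch(label) is not None


def all_labels_ldh(name: str) -> bool:
    """Return True if every dot-separated label in `name` is an LDH label."""
    return all(is_ldh_label(label) for label in name.split("."))


if __name__ == "__main__":
    print(all_labels_ldh("www.example.com"))      # True
    print(all_labels_ldh("_sip._udp.apnic.net"))  # False: underscores
```

So a name like `_sip._udp.apnic.net` is a fine domain name, but it is not made of LDH labels and therefore does not qualify as a hostname.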

Going back to simpler times, this Internet draft is an early proposal for hostname internationalization. Hostnames with international characters may be encoded using, for example, 'RACE' encoding.

The author of the 'RACE encoding' proposal notes:

According to RFC 1035, host parts must be case-insensitive, start and end with a letter or digit, and contain only letters, digits, and the hyphen character ("-"). This, of course, excludes any internationalized characters, as well as many other characters in the ASCII character repertoire. Further, domain name parts must be 63 octets or shorter in length.... All post-converted name parts that contain internationalized characters begin with the string "bq--". (...) The string "bq--" was chosen because it is extremely unlikely to exist in host parts before this specification was produced.


There is one additional thing you may need to know: if the host or subdomain part of the URL contains an underscore, IE9 (I have not tested other versions) cannot write cookies.

So be careful about that. :-)

