

 The Hosts File on Linux


What is it for... How to work with it

The hosts file on any Linux or UNIX platform is an essential system file, mapping IP addresses to machine or domain names. It also includes an optional, but handy, third column which denotes aliases or "nicknames" for a machine.

Although it is widely used as a "database" by the OS for a variety of machine lookups (e.g., by telnet or ftp), I find the hosts file a convenient reference for storing names that I use in my routing table. Names are simply easier to remember than their associated numerical IP addresses. You can even grep this file yourself, or from a script, when searching for IPs. (See below for examples.)

For many network applications, the hosts file is "consulted" first when you are online and need a name resolved. (The order is configurable; on most Linux systems it is set by the "hosts" line in /etc/nsswitch.conf, and "files" normally comes before "dns" there.) If the resolver cannot find a match on your machine, it then refers to your listed domain name servers, as specified in the resolv.conf file, and places a query to one of them. So as fast as the network domain name servers are, the hosts file is usually faster since it is on your local platform!

You may have any number of entries in this file. As you can see, I just have a few here. On a large server, there could be hundreds of names! This can be an empty file, but at a minimum, you should have 127.0.0.1 mapped to localhost, and probably your own IP and hostname. (Your system "installers" will usually do this for you.)

Another interesting aside: although it is rare, if a machine or domain name changes its IP, you won't have to track down all the scripts that use the name, since it will be resolved in the hosts file before any network action is taken. You only need to update the one entry. So it is worth maintaining this file! (This is more likely to happen on a local area network than on the internet.)



My Hosts File, Some Example Excerpts and Applications

/etc/hosts

#
# hosts         This file describes a number of hostname-to-address
#               mappings for the TCP/IP subsystem.  It is mostly
#               used at boot time, when no name servers are running.
#               On small systems, this file can be used instead of a
#               "named" name server.  Just add the names, addresses
#               and any aliases to this file...
#
# By the way, Arnt Gulbrandsen says that 127.0.0.1
# should NEVER be named with the name of the machine.  It causes problems
# for some (stupid) programs, irc and reputedly talk. :^)
#
#
# For loopbacking.
127.0.0.1       localhost
#
129.6.15.28     time-a.nist.gov         synctime
44.56.26.10     ka1fsb.ampr.org         ka1fsb
44.56.26.11     ka1fsb-3.ampr.org       ka1fsb-3
44.56.26.12     ka1fsb-12.ampr.org      ka1fsb-12
44.56.26.16     laptop.ka1fsb.ampr.org  ka1fsb-16
192.168.1.1     vpn1.ka1fsb.ampr.org    vpn1	
192.168.2.1     vpn2.bambi.ampr.org     vpn2
192.168.1.11    vpn1.ka1fsb-3.ampr.org  vpn1jnos	
44.56.26.14     bambi.ampr.org          bambi
44.56.26.15     godzilla.ampr.org       godzilla
44.56.26.17     vpn1.ka1fsb-12.ampr.org ka1fsb-17
44.52.9.48      derry.ka1tuk.ampr.org   larry
129.242.28.57   geo.phys.uit.no         kindex
#
# End of hosts.

You may grep the /etc/hosts file for a specific lookup. I wanted a listing of all IPs and hostnames that had the "key" ka1fsb in the record/line. Here is the command to do this:

  • grep ka1fsb /etc/hosts
ka1fsb:~#
44.56.26.10     ka1fsb.ampr.org         ka1fsb
44.56.26.11     ka1fsb-3.ampr.org       ka1fsb-3
44.56.26.12     ka1fsb-12.ampr.org      ka1fsb-12
44.56.26.16     laptop.ka1fsb.ampr.org  ka1fsb-16
192.168.1.1     vpn1.ka1fsb.ampr.org    vpn1	
192.168.1.11    vpn1.ka1fsb-3.ampr.org  vpn1jnos	
44.56.26.17     vpn1.ka1fsb-12.ampr.org ka1fsb-17

As you can see, there were seven (7) entries returned from the file. (If you needed further refinement, you could always grep it again or use a compound piped command, such as "grep ka1fsb /etc/hosts | grep vpn". This would "pull" only those records with vpn in the name.)
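
The compound command can be tried safely against a scratch copy instead of the live file. The sketch below assumes a temporary file (created with mktemp) that holds a few of the entries listed above; the variable names are only for illustration.

```shell
#!/bin/sh
# Scratch copy holding a few entries from the listing above
# (a stand-in for /etc/hosts; the file name is whatever mktemp returns).
hosts_copy=$(mktemp)
cat > "$hosts_copy" <<'EOF'
127.0.0.1       localhost
44.56.26.10     ka1fsb.ampr.org         ka1fsb
44.56.26.11     ka1fsb-3.ampr.org       ka1fsb-3
192.168.1.1     vpn1.ka1fsb.ampr.org    vpn1
192.168.2.1     vpn2.bambi.ampr.org     vpn2
44.56.26.14     bambi.ampr.org          bambi
EOF

# First pass: every line containing the string ka1fsb.
first_pass=$(grep ka1fsb "$hosts_copy")

# Second pass: of those lines, keep only the ones that also contain vpn.
refined=$(grep ka1fsb "$hosts_copy" | grep vpn)

printf '%s\n' "$refined"
rm -f "$hosts_copy"
```

With this sample, the first pass keeps three lines and the second pass narrows them to the single vpn1.ka1fsb.ampr.org record; vpn2 is dropped because its line never contained ka1fsb in the first place.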

What is going on here with grep? How does it work? Grep reads files line by line, with the newline character as the line terminator. Once a line is in its search buffer, grep scans it character by character, left to right, until it finds the search "key," the string to be found. (Strictly speaking, the key is a regular expression, but a plain word like ka1fsb simply matches itself.) The first match is enough: even though there might be more in the string, grep stops there and returns the entire line. For example, in the first line above, with the search key ka1fsb, it would stop in the second column with its match. It doesn't matter that there is also another "find" in the third column...
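
A small experiment makes the "one line per match" behavior visible. It is only a sketch: the sample record is taken from the listing above, and the -o option assumes a grep that supports it (GNU and BSD grep do; it is not required by POSIX).

```shell
#!/bin/sh
# One record in which the search key appears twice (columns 2 and 3).
record='44.56.26.10     ka1fsb.ampr.org         ka1fsb'

# -c counts matching lines: the line is reported once,
# however many times the key occurs in it.
matching_lines=$(printf '%s\n' "$record" | grep -c ka1fsb)

# -o prints each match on its own line, so counting them shows both hits.
match_count=$(printf '%s\n' "$record" | grep -o ka1fsb | wc -l | tr -d ' ')

echo "lines matched: $matching_lines, occurrences: $match_count"
```

Here the line count is 1 while the occurrence count is 2, which is why a plain grep prints the record only once.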

As another example, you may use names listed in the hosts file even when you do a simple telnet over your local area network. Here is an example that uses the alias column in that file:

  • telnet ka1fsb-3

Using the hosts file, this alias is resolved to 44.56.26.11.

As another common example of the extensive system dependence on this file, the route command, given without any switch options, will always try to present its table using the names of machines as resolved in the hosts file. If you use the -n switch, it will not bother to look up the names and will show numerical addresses instead. (When online, it is usually a good practice to use the -n switch, since any entry in the routing table that cannot be resolved locally will spawn a lookup request to a domain name server, which may either take a long time or stall the routing display altogether.)

What about the structure of the hosts file? Does the column order of the data matter? The hosts file is a "flat" file. This means that it uses no special marks or fixed field offsets into the data body. It is not an indexed sequential structure, which means that you cannot demarcate data fields by indexing them as column 1 or column 2, as you might in a typical database structure. So, there is no "companion" mapping or index file. Neither are there any special embedded "marks" which could be counted so as to arrive at a specified column or field location. (Have you ever heard of Pick? :)

The hosts file uses a very simple data structure. The first "column" is reserved for the IP, and anything else (data items) may follow, separated by what is commonly known as "white space!" Convention has "dictated" that the second column is usually reserved for the domain name associated with its leading IP. A third column may be used for an alias or an abbreviated name. Strictly speaking, for looking up an IP by name the order of the names is irrelevant. (It does matter in the other direction: when an IP is looked up, the first name on the line is the one returned as the machine's official name.) The only requirement is that the numerical IP be positioned first with no leading white space, no tabs, and no control characters in front of it, followed by one or more names made up of alphanumeric characters (plus hyphens and periods), i.e. word(s), that identify the IP in plain English.
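
The layout just described (address first, then one or more names separated by white space) can be illustrated with a short awk sketch. The sample file and the helper name list_entries are assumptions made for this example, not part of any system tool; comments and blank lines are skipped.

```shell
#!/bin/sh
# Hypothetical helper: print "address -> name1, name2, ..." for each entry
# of the hosts-style file given as the first argument.
list_entries() {
    awk '
        { sub(/#.*/, "") }          # drop comments (whole-line or trailing)
        NF < 2 { next }             # skip blank lines and lines with no name
        {
            names = $2
            for (i = 3; i <= NF; i++) names = names ", " $i
            print $1 " -> " names
        }
    ' "$1"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
# For loopbacking.
127.0.0.1       localhost
129.6.15.28     time-a.nist.gov         synctime
44.56.26.10   ka1fsb.ampr.org ka1fsb    # any amount of white space will do
EOF

entries=$(list_entries "$sample")
printf '%s\n' "$entries"
rm -f "$sample"
```

Because awk splits on runs of white space by default, the uneven spacing in the last record makes no difference to what is printed.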

The lookup itself is not done by the kernel but by the resolver routines in the system's C library, and the search is similar to the grep example presented above, with one difference: names are compared as whole words rather than as arbitrary substrings, so a request for ka1fsb will not be satisfied by ka1fsb-3. When a match is found "anywhere" among the names on a line, the process parses out the target IP at the beginning of that line and returns that value. So what we are searching on, any generally non-numeric item, could be found anywhere in the line after the IP. However, as noted, convention has it that the domain name immediately follows the IP, although you could probably break with convention with little harm done. So there is no reason why you couldn't put aliases in the second "column." You could even interchange the named items as a test! But the other issue here is maintenance and consistency! Most sysops like to do things in standardized ways, and this holds across the many Linux distributions. Follow the conventional file layout found in existing examples in your distribution for the most reliable results. However, if you feel like exploring the possibilities with an occasional "experiment," by all means see what your system will return. :)
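
A name lookup can be imitated with awk to see what a given name would return, without touching the real file. This is a rough sketch, not the resolver's actual code: the helper name lookup and the sample file are made up for the example, each name is compared as a whole word (which is how the C library resolver is understood to behave), and the comparison is case-sensitive for simplicity.

```shell
#!/bin/sh
# Hypothetical helper: print the address of the first entry in file $2
# whose name or alias fields include $1 as a whole word.
lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                  # ignore comments
        {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "$2"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
44.56.26.10     ka1fsb.ampr.org         ka1fsb
44.56.26.11     ka1fsb-3                ka1fsb-3.ampr.org
44.56.26.14     bambi.ampr.org          bambi
EOF

by_alias=$(lookup ka1fsb "$sample")             # alias in the third column
swapped=$(lookup ka1fsb-3.ampr.org "$sample")   # names interchanged on purpose
missing=$(lookup godzilla "$sample")            # not in the sample: empty

echo "$by_alias $swapped [$missing]"
rm -f "$sample"
```

In this sample the alias ka1fsb returns 44.56.26.10 only, even though the string also occurs inside ka1fsb-3, and the record with its names interchanged still returns 44.56.26.11.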


(Courtesy KBNorton Computer Services)