Mapping URLs to Filesystem Locations – Apache HTTP Server
This document explains how Apache uses the URL of a request to determine the filesystem location from which to serve a file.
In deciding what file to serve for a given request, Apache’s default behavior is to take the URL-Path for the request (the part of the URL following the first single slash) and add it to the end of the DocumentRoot specified in your configuration files. Therefore, the files and directories underneath the DocumentRoot make up the basic document tree which will be visible from the web.
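As a minimal sketch of this default mapping, assuming a hypothetical document root of /var/www/html (the path and the sample URL are illustrative, not taken from any particular installation):

```apache
# Hypothetical document root; substitute the directory used on your system.
DocumentRoot "/var/www/html"

# With this setting, a request for
#   http://www.example.com/dir/file.html
# has the URL-Path /dir/file.html, which is appended to the DocumentRoot,
# so the file is looked up at
#   /var/www/html/dir/file.html
```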
Apache is also capable of Virtual Hosting, where the server receives requests for more than one host. In this case, a different DocumentRoot can be specified for each virtual host, or alternatively, the directives provided by the module mod_vhost_alias can be used to dynamically determine the appropriate place from which to serve content based on the requested IP address or hostname.
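The two approaches might look roughly like the following. This is only a sketch: the host names and directories are hypothetical, and the VirtualHost syntax shown assumes name-based virtual hosts as configured in Apache 2.4 (earlier versions also need a NameVirtualHost directive).

```apache
# One DocumentRoot per virtual host (hypothetical names and paths).
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot "/var/www/example-com"
</VirtualHost>

<VirtualHost *:80>
    ServerName www.example.org
    DocumentRoot "/var/www/example-org"
</VirtualHost>

# Alternative using mod_vhost_alias: derive the document root from the
# requested hostname. %0 is replaced by the full hostname, so a request
# for www.example.com is served from /var/www/vhosts/www.example.com.
# UseCanonicalName Off
# VirtualDocumentRoot "/var/www/vhosts/%0"
```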
Files Outside the DocumentRoot
There are frequently circumstances where it is necessary to allow web access to parts of the filesystem which are not strictly underneath the DocumentRoot. Apache offers several different ways to accomplish this. On Unix systems, symbolic links can be used to bring other parts of the filesystem under the DocumentRoot. For security reasons, symbolic links will only be followed if the Options setting for the relevant directory includes FollowSymLinks or SymLinksIfOwnerMatch.
Alternatively, the Alias directive can be used to map any part of the filesystem into the web space. For example, with
Alias /docs /var/web
the URL http://www.example.com/docs/dir/file.html will be served from /var/web/dir/file.html. The ScriptAlias directive works the same way, with the additional effect that all content located at the target path is treated as CGI scripts.
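The symbolic-link and alias approaches can be sketched together as follows. The directory paths other than /var/web are hypothetical, and the Require directive assumes Apache 2.4 access control (older releases use Order and Allow instead).

```apache
# Permit symbolic links below the document root to be followed.
# (SymLinksIfOwnerMatch is the stricter alternative.)
<Directory "/var/www/html">
    Options FollowSymLinks
</Directory>

# Map the URL prefix /docs to a directory outside the DocumentRoot.
Alias /docs /var/web

# An aliased directory usually needs its own access rules, because it is
# not covered by the <Directory> section for the DocumentRoot.
<Directory "/var/web">
    Require all granted
</Directory>

# Same mapping idea, but everything under the target is run as a CGI script.
# The target path here is a hypothetical example.
ScriptAlias /cgi-bin/ /usr/local/apache2/cgi-bin/
```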
For situations where additional flexibility is required, the AliasMatch and ScriptAliasMatch directives can do powerful regular-expression based matching and substitution. For example,
ScriptAliasMatch ^/~([^/]*)/cgi-bin/(.*) /home/$1/cgi-bin/$2
will map a request for http://example.com/~user/cgi-bin/script.cgi to the path /home/user/cgi-bin/script.cgi and will treat the resulting file as a CGI script.
Traditionally on Unix systems, the home directory of a particular user can be referred to as ~user/. The module mod_userdir extends this idea to the web by allowing files under each user’s home directory to be accessed using URLs such as the following.
http://www.example.com/~user/file.html
For security reasons, it would be inappropriate to give direct access to a user’s home directory from the web. Therefore, the UserDir directive is used to specify a directory underneath the user’s home directory where web files will be located. Using the default setting of UserDir public_html, a URL such as http://www.example.com/~user/file.html would look for the file at a path like /home/user/public_html/file.html, where /home/user/ is the user’s home directory as specified in /etc/passwd.
There are also several other forms of the UserDir directive which can be used on systems where /etc/passwd cannot be used to find the location of the home directory.
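The main forms of the UserDir directive can be sketched as below. Only one form would normally be active at a time; the directories /var/web and /home/*/www are hypothetical examples.

```apache
# Relative form: resolved against the user's home directory.
#   /~user/file.html  ->  <home of user>/public_html/file.html
UserDir public_html

# Absolute form: the user name is appended to the given directory, so the
# home directory does not need to be looked up.
#   /~user/file.html  ->  /var/web/user/file.html
# UserDir /var/web

# Pattern form: the * is replaced by the user name.
#   /~user/file.html  ->  /home/user/www/file.html
# UserDir /home/*/www
```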
Some people find the “~” symbol (which is often encoded on the web as %7e) to be awkward and prefer to use an alternate string to represent user directories. This functionality is not supported by mod_userdir. However, if users’ home directories are structured in a regular way, then it is possible to use the AliasMatch directive to achieve the desired effect. For example, to make http://www.example.com/upages/user/file.html map to /home/user/public_html/file.html, the following AliasMatch directive can be used.
AliasMatch ^/upages/([^/]*)/?(.*) /home/$1/public_html/$2
The configuration directives discussed in the above sections are used to tell Apache to get content from a specific place in the filesystem and return it to the client. Sometimes, it is desirable instead to inform the client that the content being requested is located at a different URL, and instruct the client to make a new request with the new URL. This is referred to as redirection and is implemented by the Redirect directive. For example, if the contents of the directory /foo/ under the DocumentRoot have been moved to the new directory /bar/, clients can be instructed to request the content at the new location as follows.
Redirect permanent /foo/ http://www.example.com/bar/
This will redirect any URL-Path starting with /foo/ to the same URL path on the www.example.com server with /bar/ substituted for /foo/. Note that clients can be redirected to any server, not only the origin server.
Apache also provides a RedirectMatch directive which can be used for more complicated rewriting problems. For example, to redirect requests for the site home page to a different site, but leave all other requests alone, the following configuration can be used.
RedirectMatch permanent ^/$ http://www.example.com/startpage.html
Alternatively, to temporarily redirect all pages on a site to one particular page, the following configuration is useful.
RedirectMatch temp .* http://www.example.com/startpage.html
When even more powerful substitution is required, the rewriting engine provided by mod_rewrite can be useful. The directives provided by this module can use characteristics of the request such as browser type or source IP address in deciding from where to serve content. In addition, mod_rewrite can use external database files or programs to determine how to handle a request. Many practical examples employing mod_rewrite are discussed in the URL Rewriting Guide.
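As a rough illustration of selecting content by request characteristics, the following sketch assumes the rules are placed in the main server or a virtual host configuration (where patterns match the URL-Path with its leading slash). The target file names and the address range are hypothetical.

```apache
RewriteEngine On

# Browser type: serve a text-oriented home page to clients whose
# User-Agent header starts with "Lynx".
RewriteCond %{HTTP_USER_AGENT} ^Lynx
RewriteRule ^/$ /homepage.text.html [L]

# Source IP address: serve a different page for /status to clients
# whose address starts with 10.
RewriteCond %{REMOTE_ADDR} ^10\.
RewriteRule ^/status$ /status-internal.html [L]
```

Each RewriteCond applies only to the RewriteRule that immediately follows it, so the two rules above are independent of each other.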
File Not Found
Inevitably, URLs will be requested for which no matching file can be found in the filesystem. This can happen for several reasons. In some cases, it can be a result of moving documents from one location to another. In this case, it is best to use URL redirection to inform clients of the new location of the resource. In this way, you can ensure that old bookmarks and links will continue to work, even though the resource is at a new location.
Another common cause of “File Not Found” errors is accidental mistyping of URLs, either directly in the browser, or in HTML links. Apache provides the module mod_speling (sic) to help with this problem. When this module is activated, it will intercept “File Not Found” errors and look for a resource with a similar filename. If one such file is found, mod_speling will send an HTTP redirect to the client informing it of the correct location. If several “close” files are found, a list of available alternatives will be presented to the client.
An especially useful feature of mod_speling is that it will compare filenames without respect to case. This can be useful for systems where users are unaware of the case-sensitive nature of URLs and the Unix filesystem. However, using mod_speling for anything more than the occasional URL correction can lead to additional load on the server, since each “incorrect” request is followed by a URL redirection and a new request from the client.
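Activating the module might look like the sketch below. The LoadModule path is an assumption that depends on how the server was built and installed, and the CheckCaseOnly directive is only available in more recent releases.

```apache
# Load the module if it is not compiled in (the path is installation-specific).
LoadModule speling_module modules/mod_speling.so

# Enable spelling correction for failed lookups.
CheckSpelling On

# Optionally restrict corrections to differences in upper and lower case.
# CheckCaseOnly On
```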
If all attempts to locate the content fail, Apache returns an error page with HTTP status code 404 (file not found). The appearance of this page is controlled with the ErrorDocument directive and can be customized in a flexible manner as discussed in the Custom error responses and International Server Error Responses documents.
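A minimal sketch of customizing the 404 response follows; the document path and the message text are hypothetical.

```apache
# Serve a local document (a URL-Path on this server) for 404 errors.
ErrorDocument 404 /errors/not_found.html

# Alternative: return a short text message instead of a document.
# ErrorDocument 404 "Sorry, the requested page could not be found."
```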