Just What Is .htaccess?

February 21, 2015     Section214    

One of these files is not like the others…

If you've ever spent any real time working with WordPress, there's a fairly good chance you've run across a strange file in the root (or topmost) directory. Whereas the vast majority of the files in the WordPress hierarchy consist of a filename and an extension, such as the crucial wp-config.php, one solitary file seems a bit out of place. That file is the aptly named .htaccess file. So just what is the .htaccess file?

First, a bit of history. The name itself is fairly self-explanatory. The ht in the name stands for “hypertext”, and access… well, that part shouldn't have to be explained. The original purpose of the .htaccess file was to provide directory-level control over access to content on an Apache server. And what about that weird dot? Well, the Apache web server was originally built for POSIX-based systems (UNIX, Linux and similar). On those systems, prefixing a filename with a dot marks it as a “hidden” file, which means it is left out of ordinary directory listings. As the .htaccess file was created as an advanced configuration file, its presence didn't need to be known to end users… Accidentally deleting it could potentially cause all sorts of issues!

Taking a step forward in time, today the .htaccess file is read by several web servers including Apache (duh), the Sun Java System web server, and, with the help of add-on modules, even Microsoft's IIS. However, not all servers are created equal. One of the more popular servers in use today is NGINX, which does not support .htaccess files. Additionally, while .htaccess was originally intended for use on POSIX-based systems, it certainly isn't restricted to them… after all, we just mentioned that it can be used with IIS on Windows!

Oh yeah… and today, the .htaccess file isn't confined to simple directory access. These days, it supports a subset of the global server configuration (using the Apache ruleset, regardless of the actual server in use). You can do everything from restricting access to files or directories, to preventing image hotlinking, and even setting up advanced redirects.
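To make those three uses a little more concrete, here is a minimal sketch of what each might look like. It assumes Apache 2.4 with mod_rewrite and mod_alias available and a host that allows these overrides. The domain example.com and the /old-page/ and /new-page/ paths are placeholders, not anything from a real site.

```apache
# Restrict access: refuse all web requests for wp-config.php (Apache 2.4 syntax).
<Files "wp-config.php">
Require all denied
</Files>

# Hotlink prevention: refuse image requests whose Referer header is set
# but does not point at our own (placeholder) domain, example.com.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(/|$) [NC]
RewriteRule \.(jpe?g|png|gif)$ - [F,NC]
</IfModule>

# Redirect: permanently send a placeholder old URL to its new home (mod_alias).
Redirect 301 /old-page/ https://example.com/new-page/
```

If you try rules like these on a WordPress site, keep them outside the BEGIN WordPress and END WordPress markers, since WordPress may rewrite whatever sits between those two comments.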

But what's going on with the WordPress .htaccess file?

When you initially set up WordPress, a basic .htaccess file is created for you that looks something like this:

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress

Well now… that looks confusing… But it's really not. The first and last lines (aside from the comments) perform a quick check to ensure that the mod_rewrite module is enabled on the server. If it isn't, none of the included rules are run. The included rewrite rules take a bit more explaining.

RewriteEngine On

This line is simple enough. It literally turns the rewrite engine on, allowing processing of rewrite rules at runtime.

RewriteBase /

Sets the base URL path for the rules in this .htaccess file. Since these files can (and often do) exist in multiple directories, we need to inform the server which URL path this set of rules relates to. In this case, that's the root of the website.
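As a rough illustration, suppose WordPress lived in a hypothetical /blog/ subdirectory instead of the site root. The same block would then typically carry that prefix in two places. The /blog/ path is purely an example.

```apache
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
# The rules now relate to /blog/ rather than the site root.
RewriteBase /blog/
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Requests are handed to the index.php inside /blog/.
RewriteRule . /blog/index.php [L]
</IfModule>
# END WordPress
```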

RewriteRule ^index\.php$ - [L]

Wat… ok, that's a bit more complex. The first part of the rewrite rule, ^index\.php$, is a regular expression. The complexities of regex are outside the scope of this writeup, but the actual code in use here can be broken down like so:

The caret (or ^) character anchors the match to the beginning of the string. In a .htaccess file, that string is the requested path relative to the directory the file lives in, so here we're looking for a path that begins with index. The backslash after “index” is an escape character. It simply tells the server that the following period should be treated as a literal period, as opposed to the regular expression wildcard that matches any single character. The dollar sign anchors the match to the end of the string. Put together, the pattern matches index.php exactly and nothing else.

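So far so good, right? Not as hard as it sounds? Moving on. The hyphen tells the server to leave the URL exactly as it is (no substitution takes place), and the [L] flag indicates that rule processing should stop here. In other words, a request that is already headed for index.php is passed straight through. Phew!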

RewriteCond %{REQUEST_FILENAME} !-f

Again, simpler than it looks. This adds a condition to the rewrite rule that follows it. It's also remarkably self-explanatory… It simply says that the requested file name does not refer to a file which exists on the server. In other words, if we visit http://example.com/file.php, and file.php exists as a physical file, load it… If not, assume it's intended to be handled by WordPress.

RewriteCond %{REQUEST_FILENAME} !-d

No need to go into a lot of detail here. This is effectively the same as the previous condition, but applies to directories instead of files. See the “d” instead of “f”?

RewriteRule . /index.php [L]

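This final rule is what makes the aforementioned magic happen. The lone dot is a regular expression that matches any request containing at least one character. The rule simply tells the server that if both of the listed conditions are true, load index.php and let WordPress handle it. If either one of them evaluates to false, the rule is skipped and the physical file or directory is served instead.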
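Putting the pieces back together, here is the same default block again with a comment above each line summarizing the walkthrough. The comments are our own annotations, not something WordPress writes, and they are meant for reading rather than pasting: WordPress may overwrite anything between its BEGIN and END markers.

```apache
# BEGIN WordPress
# Only apply the rules below if mod_rewrite is available.
<IfModule mod_rewrite.c>
# Switch on rewrite processing.
RewriteEngine On
# These rules relate to the root of the site.
RewriteBase /
# A request that is already for index.php is left untouched; stop here.
RewriteRule ^index\.php$ - [L]
# Carry on only if the request does not match an existing file...
RewriteCond %{REQUEST_FILENAME} !-f
# ...and does not match an existing directory.
RewriteCond %{REQUEST_FILENAME} !-d
# Everything else is handed to index.php, and WordPress takes it from there.
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```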

See? Not so difficult!

So what else can you do with these rewrite rules? Quite a bit! So much so, in fact, that we aren't going to cover them all here. For the moment, we're going to leave you with a few useful resources which can give you a head start to understanding .htaccess and regular expressions, and an awesome article on tweaking the WordPress .htaccess file. Enjoy, and happy learning!

Categories: Tutorials
