Evolving Framework, Step 1: mod_rewrite + PHP = Clean URLs

Using clean URLs with Apache is fairly simple with mod_rewrite’s help. I’d be the first to admit that rewrite rules can be as ridiculously complicated as they are powerful, but plenty of people have already admitted this. Luckily, what I’m trying to accomplish is not very difficult, mod_rewrite-wise. Taking a forum as an example, you might see URLs like:

  • http://example.com/ – site’s main page
  • http://example.com/forum – list of forums
  • http://example.com/forum/general – list all threads in general forum
  • http://example.com/forum/general/some_arbitrary_topic – list all posts in some_arbitrary_topic thread

To handle URLs like these, I set up the virtual host container to route every request through a single PHP script, which I’ll lovingly call index.php. To start off, it just looks like this:
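
What follows is a minimal sketch rather than the original listing. It assumes PHP 7 or later and that the script does nothing more than print $_SERVER in print_r format, which is consistent with the output shown further down. The function name dump_environment is made up for illustration.

```php
<?php
// Hypothetical helper: renders a $_SERVER-style array as readable text.
function dump_environment(array $env): string
{
    // print_r(..., true) returns the dump instead of echoing it.
    // htmlspecialchars() keeps header values (e.g. the user agent)
    // from being interpreted as markup by the browser.
    return '<pre>' . htmlspecialchars(print_r($env, true)) . '</pre>';
}

// index.php: show everything PHP knows about the request and the server.
echo dump_environment($_SERVER);
```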


Pretty basic and useless. It just dumps out everything about the request and the server environment in an easy-to-read manner. But we have to tell Apache to use index.php for everything, so the virtual host container in Apache’s config file looks like this:

<VirtualHost *:80>
    ServerName      dev.cholmon.com
    DocumentRoot    /www/vhosts/cholmon.com/dev/
    RewriteEngine   On
    # Leave static assets alone ("-" means no substitution)
    RewriteRule     /*\.(css|js|gif|png|jpe?g)$ - [NC,L]
    # Send every other request to index.php
    RewriteRule     ^/* /index.php
    <Directory "/www/vhosts/cholmon.com/dev/">
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>

(To read up on configuring virtual hosts and figuring out mod_rewrite, check out http://httpd.apache.org/docs/2.2/)

Those three rewrite directives accomplish the following:

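  1. RewriteEngine On: switches on the rewrite engine for this virtual host, so the rules that follow actually get applied
  2. RewriteRule /*\.(css|js|gif|png|jpe?g)$ - [NC,L]: don’t rewrite images, stylesheets, or JavaScript files. The lone - as the substitution means “leave the URL alone”, NC makes the match case-insensitive, and L stops rule processing for that request.
  3. RewriteRule ^/* /index.php: any other request should just run index.php (the pattern ^/* matches every path)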
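
To make the effect of the two rules concrete, here is a rough sketch that mimics them in PHP with preg_match. It is only an illustration of the pattern matching, not how Apache evaluates rules internally; the function name route_target is hypothetical, and it assumes PHP 7 or later and that it is handed the URL path without the query string.

```php
<?php
// Hypothetical helper: returns the path Apache would end up serving
// for a given request path under the two RewriteRule patterns.
function route_target(string $path): string
{
    // Rule 1: static assets pass through untouched.
    // The "i" modifier plays the role of the NC flag.
    if (preg_match('#/*\.(css|js|gif|png|jpe?g)$#i', $path)) {
        return $path;
    }

    // Rule 2: ^/* matches any path, so everything else runs index.php.
    return '/index.php';
}
```

With this sketch, route_target('/css/site.CSS') gives back the path unchanged, while route_target('/forum/general') gives /index.php.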

So, for instance, if you type http://example.com/some/made/up/path?this=that&foor=bar into your browser’s address bar, the request would get sent into index.php and you’d see the following (pay particular attention to the QUERY_STRING and SCRIPT_NAME lines):

    [SCRIPT_URL] => /some/made/up/path 
    [SCRIPT_URI] => http://example.com/some/made/up/path 
    [HTTP_ACCEPT] => image/gif, image/x-xbitmap, image/jpeg, */* 
    [HTTP_ACCEPT_LANGUAGE] => en-us 
    [HTTP_UA_CPU] => x86 
    [HTTP_ACCEPT_ENCODING] => gzip, deflate 
    [HTTP_USER_AGENT] => Mozilla/4.0 (compatible; MSIE 7.0) 
    [HTTP_HOST] => dev.cholmon.com 
    [HTTP_CONNECTION] => Keep-Alive 
    [PATH] => /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin 
    [SERVER_SOFTWARE] => Apache/2.2.0 (Unix) PHP/5.2.0  
    [SERVER_NAME] => dev.cholmon.com 
    [SERVER_ADDR] => 
    [SERVER_PORT] => 80 
    [REMOTE_ADDR] => 
    [DOCUMENT_ROOT] => /www/vhosts/cholmon.com/dev/ 
    [SERVER_ADMIN] => [no address given] 
    [SCRIPT_FILENAME] => /www/vhosts/cholmon.com/dev/index.php 
    [REMOTE_PORT] => 57503 
    [QUERY_STRING] => this=that&foor=bar 
    [REQUEST_URI] => /some/made/up/path?this=that&foor=bar 
    [SCRIPT_NAME] => /some/made/up/path
    [PHP_SELF] => /some/made/up/path 
    [REQUEST_TIME] => 1169876370 

At this point, the main parts of the request that I’m interested in are:

  • The method (GET)
  • The script name (/some/made/up/path)
  • The query string (this=that&foor=bar)
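
As a rough idea of how those three pieces could be pulled out of $_SERVER, here is a small sketch. It is not necessarily what Step 2 will do: the function name request_parts and the shape of the returned array are assumptions, and it assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: extracts the method, the path segments, and the
// query parameters from a $_SERVER-style array.
function request_parts(array $server): array
{
    // parse_str() decodes "this=that&foor=bar" into an associative array.
    parse_str($server['QUERY_STRING'] ?? '', $query);

    // Split "/some/made/up/path" on slashes and drop the empty pieces.
    $segments = array_values(array_filter(
        explode('/', $server['SCRIPT_NAME'] ?? '/'),
        function ($segment) {
            return $segment !== '';
        }
    ));

    return [
        // Falling back to GET is an arbitrary choice made for this sketch.
        'method'   => $server['REQUEST_METHOD'] ?? 'GET',
        'segments' => $segments,
        'query'    => $query,
    ];
}
```

Calling request_parts($_SERVER) for the request above would give the segments some, made, up, and path, plus the two query parameters.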

Step 2 will be up here in the next day or so. In it, I’ll modify index.php so that it parses those three pieces of information and decides what to do with the request. OMG STAY TUNED LOL!!!