Tag Archives: Apache

Clean URLs – Basic mod_rewrite rules

I’ve been using Comb with a few more projects recently. The main point of Comb is to structure the code in such a way that the request handlers get accessed directly via Apache, as opposed to rewriting the entire URL, and making PHP dynamically include or instantiate a request handling class (i.e. Action, Command, etc).

For example, you might have a registration form that is accessed like this (pseudo-HTTP):

# display the registration page...
GET  /register.php
 
# registration form submits to...
POST /proc_register.php

I personally hate that naming convention because it reminds me of how I wrote code 5 years ago, and it’s just plain ugly.

Here’s a cleaner, friendlier approach that I’d rather use:
Continue reading

Comb: auto_prepend_file, output buffering, and mod_rewrite

I’ve decided to give a name to this barebones framework that I’ve been playing with for the past couple of months. I’m calling it Comb. Unfortunately it takes a long time to fully explain the thinking behind the name. Similar to Camber, however, the jist of it is simplicity; think “get the tangles out of my code”. Plus comb is a lot faster to say and type than MVCish-framework-based-on-autoprependfile-and-output-buffering.

My plan is to build the simplest web app framework that I can. I’m by no means an expert, and I don’t anticipate that this will end up being the best/fastest/most-convenient framework that will lay waste to the Zend Framework, or Symfony, or Kohana (or the mystical Redline! no link, it’s too mystical!). I just want to start from the ground-up by making very few assumptions, and without ripping “inspiration” from the latest and greatest RAD framework. I want to end up with a simple base of code that I don’t have to learn how to love, and that can evolve with me without getting in the way. Basically this is all about me, but you can feel free to use it as well if it makes enough sense.

I’ve got a few projects in mind that I’m planning on being powered by Comb. The first is my Code Examples site, which will serve two purposes: it’ll be a place to host demos of basic code for the blog entries I write, and it’ll be a testbed for integrating new features into the framework. And yes, the code that runs the examples site is Comb itself. The other projects will be revealed in time.

Object-oriented controllers bug me

I thought I might take a little step back here from my previous entry regarding my simplistic, not-all-that-original MVC using auto_prepend_file. I realized that I didn’t really explain why I occasionally get in that “there’s-got-to-be-a-simpler-way” frame of mind.

As I was building Camber, I made it a point to stick with an object-oriented design as closely as possible. The bootstrap script (i.e., index.php) simply sets up the basic environment, creates the main objects that are needed for the application, and starts things flowing. To re-cap, the whole life-cycle for a request in a Camber application is:
Continue reading

Poor man’s MVC: PHP, auto_prepend_file, and mod_rewrite

I often get the impression that there’s too much unnecessary complexity in the approaches to MVC that are currently in use in PHP, primarily with regard to request handlers and controllers. I’m even guilty of it myself with my ironic attempt at object-oriented simplicity, Camber. So when the mood strikes me, I’ll poke around in forums and articles and look for novel approaches to these problems.

A few years ago, Harry Fuecks posted an entry on his wiki-in-progress regarding PHP and the Front Controller design pattern. I’ve stumbled across this entry a few times before, but I always left it for dead, thinking it was just too simplistic. But recently I decided to try picking up where he left off at the end of his entry to see for myself how far the technique could be taken.
Continue reading

Evolving Framework, Step 3: Cleaning up the server-side

Once I had my virtual host set up to route all requests into a single script, I then had to figure out what to do with each request. I knew that the essential request information was available in the $_SERVER autoglobal, but I wanted a cleaner way of dealing with the request. So I put together a very simple Request class, which I saved in the document root as Request.php:

<?php 
class Request 
{ 
    public $method; 
    public $url; 
    public $parameters = array();  
    public function __construct() 
    { 
        $this->method = $_SERVER['REQUEST_METHOD']; 
        $this->url = $_SERVER['SCRIPT_URI']; 
        $this->parameters = $_REQUEST; 
    } 
} 

Continue reading

Evolving Framework, Step 2: Make PHP understand the clean URL

So why am I bothering with “clean” URLs? For several reasons really. For one thing, they look nicer than un-clean URLs. I would much rather get rid of something like, this:

http://www.example.com/forumDisplay.php?fid=17

and instead use something like this:

http://www.example.com/forum/general

But aesthetics aside, why should I bother? My reason is inspired by Roy Fielding’s REST. I won’t try to get into the guts of it here, partly because there’s alot to it, and partly because my grasp of it is far from complete. Regardless, what I’ve taken from my brief introduction to it is a resolution to build applications from a resource-centric point of view, as opposed to a functional, process-centric point of view. Instead of building a collection of scripts that expect data to be passed in so that they can spit data out, I want to try to build a system of resources that can be manipulated.

In order to manipulate a resource, three essential pieces of information are required:

  1. The location of the resource in question (the URL)
  2. The type of manipulation to be performed on the resource (HTTP method; PUT, GET, POST, DELETE)
  3. The data needed to perform the manipulation (request parameters)

One thing I noticed at this point was that the request method could essentially take the place of the action parameter I’ve seen being passed around in so many frameworks (and which I’m guilty of having used in the past). With any resource in mind (e.g., user object, news article, forum thread, photo album, etc), PUT, GET, POST, and DELETE can do pretty much anything you need, or at least imply your intention to do pretty much anything.

Also, those four major request methods map pretty conveniently to CRUD:

  • PUT == CREATE
  • GET == READ
  • POST == UPDATE
  • DELETE == DELETE

I figured a good place to start trying to implement something like this would be structuring the URLs in such a way that they look like nouns. So an old school login form may have looked like:

<form action="/login.php" method="post">
  Username <input type="text" name="username" /><br />
  Password <input type="password" name="password" /><br /> 
  <input type="submit" value="Login..." />
</form>

The problem with that code (RESTfully clean URLs in mind) is the action and the method. The form is POSTing the username and password to login.php. But is login.php really the resource that I want to POST to? If I replace the word POST with the word UPDATE, it makes a little bit less sense. I’m not updating login.php itself. That script doesn’t represent a resource of any kind; it’s just a procedure, a machine that takes data in one side and spits data out the other.

So I took a step back and thought about what I was actually trying to accomplish. I wanted to authenticate a user; check their username and password against the database, and if it matched a valid user record then stick their information in the session for use throughout their visit.

In other words, my goal was to create an authenticated user session. Since “create” maps to the PUT method, I decided to change the form around to look like this:

<form action="/session" method="put">
  Username <input type="text" name="username" /><br />
  Password <input type="password" name="password" /><br /> 
  <input type="submit" value="Login..." /> 
</form>

My intention was to provide the following information to the web server:

  • Location of the resource: /session
  • Operation to be performed on the resource: PUT (i.e., create)
  • Additional information needed to operate on the resource: username and password
    • It seemed a good bit more logical this way. Instead of telling the web server, “here’s a username and password that I want the login script to authenticate“, I’m telling it, “I want to create a new session based on this username and password“.

      The tricky part was actually writing the code to interpret the request in such a way that this actually happened like I wanted. Step 3 is coming up next. In it I’ll explain my initial approach to solving my RESTlessness, and some unexpected problems I ran into along the way.

Evolving Framework, Step 1: mod_rewrite + PHP = Clean URLs

Using clean URLs with Apache is fairly simple with mod_rewrite’s help. I’d be the first to admit that rewrite rules can be as

ridiculously complicated as they are powerful, but plenty of people have already admitted this. Luckily, what I’m trying to accomplish is not very difficult, mod_rewrite-wise. Taking a forum as an example, you might see URLs like:

  • http://example.com/ – site’s main page
  • http://example.com/forum – list of forums
  • http://example.com/forum/general – list all threads in general forum
  • http://example.com/forum/general/some_arbitrary_topic – list all posts in some_arbitrary_topic thread

To handle URLs similar to this, I set up the virtual host container to simply rewrite every request through a single PHP script, which I’ll lovingly call index.php. To start off, it just looks like this:

<?php
    print_r($_SERVER);

Pretty basic and useless. It just dumps out everything about the request and the server environment in an easy-to-read manner. But we have to tell Apache to use index.php for everything, so the virtual host container in Apache’s config file looks like this:

<VirtualHost *:80>
    ServerName      dev.cholmon.com 
    DocumentRoot    /www/vhosts/cholmon.com/dev/
 
    RewriteEngine   On 
    RewriteRule     /*\.(css|js|gif|png|jpe?g)$ - [NC,L] 
    RewriteRule     ^/* /index.php
 
    <Directory "/www/vhosts/cholmon.com/dev/">
                AllowOverride None 
                Order allow,deny 
                Allow from all 
     </Directory>
</VirtualHost>

(To read up on configuring virtual hosts and figuring out mod_rewrite, check out http://httpd.apache.org/docs/2.2/)

Those three rewrite directives accomplish the following:

  1. RewriteEngine On: tells Apache to expect some rewrite rules
  2. RewriteRule /*\.(css|js|gif|png|jpe?g)$ – [NC,L]: don’t rewrite images, stylesheets, or javascripts.
  3. RewriteRule ^/* /index.php: any other request should just run index.php

So, for instance, if you type http://example.com/some/made/up/path?this=that&foo=bar into your browser’s address bar, the request would get sent into index.php and you’d see the following (pay particular attention to the highlighted lines):

Array 
( 
    [SCRIPT_URL] => /some/made/up/path 
    [SCRIPT_URI] => http://example.com/some/made/up/path 
    [HTTP_ACCEPT] => image/gif, image/x-xbitmap, image/jpeg, */* 
    [HTTP_ACCEPT_LANGUAGE] => en-us 
    [HTTP_UA_CPU] => x86 
    [HTTP_ACCEPT_ENCODING] => gzip, deflate 
    [HTTP_USER_AGENT] => Mozilla/4.0 (compatible; MSIE 7.0) 
    [HTTP_HOST] => dev.cholmon.com 
    [HTTP_CONNECTION] => Keep-Alive 
    [PATH] => /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin 
    [SERVER_SIGNATURE] =>
    [SERVER_SOFTWARE] => Apache/2.2.0 (Unix) PHP/5.2.0  
    [SERVER_NAME] => dev.cholmon.com 
    [SERVER_ADDR] => 65.111.167.214 
    [SERVER_PORT] => 80 
    [REMOTE_ADDR] => 65.4.76.89 
    [DOCUMENT_ROOT] => /www/vhosts/cholmon.com/dev/ 
    [SERVER_ADMIN] => [no address given] 
    [SCRIPT_FILENAME] => /www/vhosts/cholmon.com/dev/index.php 
    [REMOTE_PORT] => 57503 
    [GATEWAY_INTERFACE] => CGI/1.1 
    [SERVER_PROTOCOL] => HTTP/1.1 
    [REQUEST_METHOD] => GET 
    [QUERY_STRING] => this=that&foor=bar 
    [REQUEST_URI] => /some/made/up/path?this=that&foor=bar 
    [SCRIPT_NAME] => /some/made/up/path
    [PHP_SELF] => /some/made/up/path 
    [REQUEST_TIME] => 1169876370 
)

At this point, the main parts of the request that I’m interested in are:

  • The method (GET)
  • The script name (/some/made/up/path)
  • The query string (this=that&foor=bar)

Step 2 will be up here in the next day or so. In it, I’ll modify index.php so that it parses those three pieces of information and decides what to do with the request. OMG STAY TUNED LOL!!!