marcus welz

Short URLs with Zend Framework

Posted on September 2, 2010

First up, what's a short URL? A short URL is just that: a URL that is as short as it can possibly be, so that it takes up as few characters as possible when used in a Twitter message. Twitter messages are limited to 140 characters, which is probably the main reason short URLs are so popular. Each character counts.

Technically, short URLs consist of a short domain name and a simple identifier, usually the numeric primary key in a database table of whatever item the page is supposed to be for. And to make that number even shorter it's typically base 62 encoded.

The digits are represented using the numbers 0-9, lowercase a-z and uppercase A-Z. And although PHP offers a base_convert() function, it's unfortunately useless here, as it only supports bases up to 36 and loses precision on large numbers (it uses floating point math internally). So a replacement is needed.
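To make the arithmetic concrete, here is a minimal sketch of the digit decomposition behind base 62 encoding. The function name base62Digits() is made up for this illustration, and the code assumes PHP 7+ (for intdiv() and scalar type hints) and a non-negative input.

```php
<?php
// Decompose a non-negative integer into base-62 digit values,
// most significant digit first. Hypothetical helper, for illustration only.
function base62Digits(int $n): array
{
    $digits = [];
    do {
        $digits[] = $n % 62;   // the remainder is the next least significant digit
        $n = intdiv($n, 62);   // integer division, so no floating point is involved
    } while ($n > 0);

    return array_reverse($digits);
}

$alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

$digits  = base62Digits(1000000);   // digit values 4, 12, 9, 2
$encoded = '';
foreach ($digits as $d) {
    $encoded .= $alphabet[$d];      // look each digit value up in the alphabet
}

echo $encoded, "\n";                // 4c92
```

Each digit value is simply an index into the 62-character alphabet, which is why 12 shows up as "c".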

There are all kinds of base62 encoding and decoding functions out there already. One is bc_base_convert, which requires the bcmath extension. Another one, which is a bit more fleshed out and cleaner looking, I found on pastie while browsing reddit. I've reproduced it here for easy reference:

/**
 * @class Integer
 * @author Julien Garand (Go On Web)
 *
 * Can encode and decode integers to/from a string, using a custom alphabet
 */
class Integer
{
	// Default alphabet for a "normal" base 62 encoding
	static protected $alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
	static protected $base = 62;

	/**
	 * Define your custom alphabet here
	 */
	static public function setAlphabet( $alphabet )
	{
		// only strings are allowed
		if ( !is_string($alphabet) )
		{
			throw new Exception('Given alphabet is not a string !');
		}

		// We check if alphabet doesn't have doubled characters
		if ( strlen( count_chars( $alphabet, 3 ) ) != strlen( $alphabet ) )
		{
			throw new Exception('The following alphabet has doubled characters : '.$alphabet);
		}

		// Only change state once the alphabet has been validated
		self::$alphabet = $alphabet;       // store it
		self::$base = strlen( $alphabet ); // Our base will be the length of the given alphabet
	}

	/**
	 * Basic accessors
	 */
	static public function getAlphabet() { return self::$alphabet; }
	static public function getBase() { return self::$base; }

	/**
	 * Encode an integer according to the defined alphabet
	 *
	 * @param integer : Unsigned integer to be encoded
	 * @return string (or false if failed)
	 */
	static public function encode( $integer )
	{
		$integer = (int)$integer; // Be sure to have an integer

		// We only accept unsigned integers
		if ( $integer < 0 )
		{
			return false; // or throw new Exception( "($integer) is less than 0 and cannot be converted" );
		}

		$string = ''; // our encoded integer

		// encode at least one digit, so that 0 becomes the first char of the alphabet instead of ''
		do
		{
			$pos = $integer % self::$base;                // get the remainder of the Euclidean division...
			$string .= self::$alphabet[ $pos ];           // that's the position of the char in the alphabet
			$integer = ( $integer - $pos ) / self::$base; // and divide integer (minus just encoded char) by the base
		}
		while( $integer );

		return strrev( $string ); // As we started by the unit of our base ( $base ^ 0 ), we have to reverse the string
	}

	/**
	 * Decode a string to an integer according to the defined alphabet
	 *
	 * @param string : String to be decoded
	 * @return integer (or false if failed)
	 */
	static public function decode( $string )
	{
		$string = (string)$string; // be sure to have a string;

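		// check that our string only has chars that are in the alphabet
		// (strspn gives the length of the leading run of alphabet chars; it must cover the whole string)
		if ( strspn( $string, self::$alphabet ) != strlen( $string ) )
		{
			return false; // or throw new Exception( "($string) is not a string or contains characters that are not in alphabet" );
		}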

		$integer = 0; // our integer to find
		$unit = 1;    // we start by $base^0

		// foreach chars, starting at the end
		for( $i = strlen( $string ) -1; $i >= 0; $i -- )
		{
			$pos = strpos( self::$alphabet, $string[$i] ); // we find its position in alphabet
			$integer += $pos * $unit;                      // it's our number to add, multiplied by the current unit
			$unit = $unit * self::$base;                   // and go to next unit in our base
		}

		return $integer;
	}
}

So now we can convert our simple numbers into the more cryptic looking short URL identifiers simply using Integer::encode(). Here are some example conversions:

1       => 1
10      => a
100     => 1C
255     => 47
1000    => g8
10000   => 2Bi
65535   => h31
100000  => q0U
1000000 => 4c92

Instead of http://example.com/1000000, you could end up with http://example.com/4c92. There, three characters saved. That makes a difference, particularly with really short domain names, such as Twitter's own URL shortener: http://t.co.

Working with Routes

So, in Zend Framework the actual page logic starts in controllers and actions, which are essentially classes and methods, respectively. To reach a controller and action, request URLs are routed using the router.

The default route is sufficient for most applications. It conveniently maps the first two path segments to controller and action. So http://example.com/photo/view maps to the PhotoController::viewAction(). Also, the action is optional, and if omitted will default to index. Therefore, http://example.com/photo will map to PhotoController::indexAction(). There are other scenarios that are helpful to be familiar with.

Now the easiest way to support short URLs is to add a route that will match any alphanumeric characters and route that to the desired destination. That could look something like this:

$router = Zend_Controller_Front::getInstance()->getRouter();
$router->addRoute('photo', new Zend_Controller_Router_Route(':shortid', array(
    'controller'    => 'photo',
    'action'        => 'view'
), array(
    'shortid'       => '[0-9a-zA-Z]+'
)));

The side effect, however, is that there's no longer a distinction between a short URL such as "http://example.com/1hF" and "http://example.com/photo", since "photo" could be a base62 encoded number (in fact, it would be the number 373,554,054). This can be worked around if you make all other URLs specify both the controller and action explicitly, so you'd use "http://example.com/photo/index" to ensure that the short URL route doesn't match.
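The ambiguity can be checked with a few lines of plain PHP. This is only a sketch: the pattern is the route requirement from above with anchors added for a standalone test, and the decoding loop is a compact stand-in for Integer::decode() that assumes the default alphabet.

```php
<?php
$alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// The route's requirement pattern, anchored here so it can be tested on its own.
$pattern = '/^[0-9a-zA-Z]+$/';

$decoded = array();
foreach (array('1hF', 'photo') as $segment) {
    $value = 0;
    foreach (str_split($segment) as $char) {
        // Horner's scheme: shift the digits seen so far up one place, add the new one
        $value = $value * 62 + strpos($alphabet, $char);
    }
    $decoded[$segment] = $value;

    printf("%-5s matches: %s, decodes to %d\n",
        $segment,
        preg_match($pattern, $segment) ? 'yes' : 'no',
        $value
    );
}
// 1hF   matches: yes, decodes to 4939
// photo matches: yes, decodes to 373554054
```

Both segments pass the requirement, so the router has no way of telling a controller name from an encoded ID.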

Then, in your controller's action, you'd handle the request using the short URL:

class PhotoController extends Zend_Controller_Action
{
    public function viewAction()
    {
        if ($shortId = $this->_getParam('shortid')) {
            $id = Integer::decode($shortId);
        }

        // rest of the logic to view the photo here, using $id.
    }
}

This technique may not always apply, however, since you might already have a larger application that has all kinds of links that you can't just change to make this short URL thing work.

Another technique is to modify the short URLs a bit so they're more easily recognizable as such. I did that for one application by sacrificing one extra character. I just prefixed all the short IDs with an upper case "S". So you'd have a URL such as http://example.com/S4c92. This works since normally the URLs are all lower-case anyway:

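$routes['twitter-pics'] = new Zend_Controller_Router_Route_Regex(
    '(?-i)S([0-9a-zA-Z]+)',             // \w would also let underscores through
    array('controller' => 'photos',
          'action'     => 'view'),
    array('shortid' => 1),
    'S%s'                               // reverse pattern; keeps the "S" prefix when assembling
);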

Note that this is a regular expression based route. The (?-i) turns off case insensitivity. I still wasn't happy with this approach, because the action still needs to explicitly handle that 'shortid' variable.
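The effect of (?-i) can be tried outside the framework. The sketch below assumes that the regex route wraps its pattern as '#^' . $pattern . '$#i' before matching, which is what makes the inline modifier necessary in the first place; the sample paths are made up.

```php
<?php
// Assumption: the route matches with the pattern wrapped like this,
// i.e. anchored and case-insensitive by default.
$regex = '#^' . '(?-i)S([0-9a-zA-Z]+)' . '$#i';

$paths   = array('S4c92', 's4c92', 'search', 'Search');
$results = array();
foreach ($paths as $path) {
    // Store the captured short ID, or null when the route would not match
    $results[$path] = preg_match($regex, $path, $m) ? $m[1] : null;
}

var_dump($results);
// 'S4c92'  => '4c92'
// 's4c92'  => null
// 'search' => null
// 'Search' => 'earch'
```

The last case shows the limit of the trick: any other URL that happens to start with an upper case "S" is also treated as a short URL, so it really does depend on the rest of the site sticking to lower-case paths.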

Using a Custom Route

I wanted everything encapsulated in the route, so I wrote a custom route class.

The interface that Zend provides is rather straightforward:

interface Zend_Controller_Router_Route_Interface {
    public function match($path);
    public function assemble($data = array(), $reset = false, $encode = false);
    public static function getInstance(Zend_Config $config);
}
  • match checks whether the route matches the path of the request
  • assemble is used to build a URL based on the parameters
  • getInstance is supposed to accept a configuration and return a new instance of the route. I don't even care about that at the moment.

Here's the finished class:

/**
 * Short Route
 *
 * Provides short URLs
 *
 * @author Marcus Welz
 *
 */
class Td_Controller_Router_ShortRoute implements Zend_Controller_Router_Route_Interface
{

    /**
     * @var string The URL prefix
     */
    protected $_urlPrefix = 'S';

    /**
     * @var array The parameter as passed to the request
     */
    protected $_params = array();

    /**
     *
     * @param string $urlPrefix    The prefix of the URL
     * @param array  $params       The parameters as passed to the request
     */
    public function __construct($urlPrefix, $params = array())
    {
        $this->_urlPrefix = $urlPrefix;
        $this->_params = $params;
    }

    /**
     * @param string $path The URL such as "/P3"
     * @return array|false returns parameters including the id on success, false if no match
     */
    public function match($path)
    {
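        // Quote the prefix for use inside a /-delimited pattern
        $prefix = preg_quote($this->_urlPrefix, '/');
        // A-Za-z0-9 rather than A-z0-9: the range A-z would also match [ \ ] ^ _ and `
        if (preg_match('/\/' . $prefix . '([A-Za-z0-9]+)$/', $path, $matches)) {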
            $params = $this->_params;
            $params['id'] = Integer::decode($matches[1]);
            return $params;
        }
        return false;
    }

    /**
     * Assemble a URL using the ID
     *
     *
     * @param array $data       'id' is the only used parameter in the array
     * @param bool  $reset      unused / ignored
     * @param bool  $encode     unused / ignored
     */
    public function assemble($data = array(), $reset = false, $encode = false)
    {
        return $this->_urlPrefix . Integer::encode($data['id']);
    }

    public static function getInstance(Zend_Config $config)
    {
        throw new Exception('not implemented');
    }
}
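One detail worth double-checking in any pattern that is meant to accept base 62 identifiers is the character class. The range A-z is easy to reach for, but it spans more than the letters. A small sketch, with made-up sample IDs:

```php
<?php
$loose  = '/^[A-z0-9]+$/';     // A-z spans ASCII 65 to 122, including [ \ ] ^ _ and `
$strict = '/^[A-Za-z0-9]+$/';  // only the 62 characters of the default alphabet

$candidates = array('4c92', 'a_b', 'x^y');
$report = array();
foreach ($candidates as $id) {
    $report[$id] = array(
        'loose'  => preg_match($loose, $id),
        'strict' => preg_match($strict, $id),
    );
    printf("%-5s loose: %d  strict: %d\n", $id, $report[$id]['loose'], $report[$id]['strict']);
}
// 4c92  loose: 1  strict: 1
// a_b   loose: 1  strict: 0
// x^y   loose: 1  strict: 0
```

With the loose class, an ID such as "a_b" would match the route and then be handed to the decoder, which has no such character in its alphabet.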

Using it is straightforward. First, add it to the router:

Zend_Controller_Front::getInstance()->getRouter()
    ->addRoute('photo', new Td_Controller_Router_ShortRoute('S', array(
        'controller' => 'photo',
        'action'     => 'view'
    )));

Since the conversion between base 62 and base 10 is happening inside the class, the action doesn't have to decode it itself and is thus blissfully unaware of it. Encapsulation successful. And to generate a URL in a view, you'd use the url() view helper:

$this->url(array('id'=> $photo['id']), 'photo')

Good enough for me.


Using Zend_Acl with your model

Posted on May 26, 2009

Zend_Acl is an excellent component that provides Access Control List (ACL) functionality. In most cases the goal is to manage user access to resources. In a nutshell, a role is granted or denied access to any kind of resource. But unfortunately it doesn't quite live up to its full potential just yet, due to a few implementation details as outlined in tickets ZF-1722 and ZF-4460. The latter of the two also has comments that include a few examples for a workaround.

I'm using the following class which gives any custom Assert object access to the actual Resource passed to it.

/**
 * The current Zend_Acl design does not allow for
 * using a custom Role and Resource objects and expect that they'll make it through
 * to custom assertions.
 * See http://framework.zend.com/issues/browse/ZF-1722
 * and http://framework.zend.com/issues/browse/ZF-4460
 */
class My_Acl extends Zend_Acl
{

    /**
     * Returns the identified Resource
     *
     * The $resource parameter can either be a Resource or a Resource identifier.
     *
     * @param  Zend_Acl_Resource_Interface|string $resource
     * @throws Zend_Acl_Exception
     * @return Zend_Acl_Resource_Interface
     */
    public function get($resource)
    {
        if (!$this->has($resource)) {
            require_once 'Zend/Acl/Exception.php';
            throw new Zend_Acl_Exception("Resource '$resource' not found");
        }

        if ($resource instanceof Zend_Acl_Resource_Interface) {
            return $resource;
        }

        return $this->_resources[$resource]['instance'];
    }

}

Unfortunately, this doesn't fix the issue of the Role making it through to an assertion, but in most of my cases that's the acting user anyway, so I don't even try to grab the passed in $role and instead use the identity straight from Zend_Auth.

/**
 * Ensure the photo is owned by the user with $role
 *
 */
class PhotoOwnerAssertion implements Zend_Acl_Assert_Interface
{
    public function assert(Zend_Acl $acl,
                           Zend_Acl_Role_Interface $role = null,
                           Zend_Acl_Resource_Interface $resource = null,
                           $privilege = null)
    {
        if (!$resource instanceof Photos_Row) {
            return false;
        }
        /* @var $resource Photos_Row */

        /*
         * Workaround; the current Zend_Acl design does not allow for
         * using a custom Role interface and expect that it'll make it through.
         * See http://framework.zend.com/issues/browse/ZF-1722 and
         * http://framework.zend.com/issues/browse/ZF-4460
         */
        $role = Zend_Auth::getInstance()->getIdentity();

        return $resource->getOwnerId() == $role;
    }

}

When setting up the ACL, I provide an instance of the custom assertion which will then provide the proper access control. It's fairly well encapsulated (other than the bug workarounds).

/**
 * Only allow owners to view, edit, and delete their photos
 */
$acl->allow('member', 'Photo', array('view', 'edit', 'delete'), new PhotoOwnerAssertion());

In this case, the Photos_Row class must also provide a getOwnerId() method.

class Photos_Row extends Zend_Db_Table_Row_Abstract
                 implements Zend_Acl_Resource_Interface
{

    /**
     * Resource type (for use with ACL)
     *
     * @see Zend_Acl_Resource_Interface
     *
     * @return string
     */
    public function getResourceId()
    {
        return 'Photo';
    }

    /**
     * Return the photo owner's UUID
     *
     * @see UgcItem
     *
     * @return string
     */
    public function getOwnerId()
    {
        return $this->avataruuid;
    }

}

A little more abstraction and the custom assertion can be used for models other than photos.
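As a rough idea of what that abstraction could look like, here is a framework-free sketch. The interface My_Ownable, the stand-in class Example_Photo and the function isOwnedBy() are all hypothetical names invented for this example; in a real application the check would live inside the assertion's assert() method and the row classes would implement the interface.

```php
<?php
// Hypothetical interface: anything that can report who owns it.
interface My_Ownable
{
    public function getOwnerId();
}

// Stand-in model for illustration only (not a Zend_Db_Table_Row).
class Example_Photo implements My_Ownable
{
    private $ownerId;

    public function __construct($ownerId)
    {
        $this->ownerId = $ownerId;
    }

    public function getOwnerId()
    {
        return $this->ownerId;
    }
}

// The ownership check that a generic assertion could delegate to.
function isOwnedBy($resource, $identity)
{
    if (!$resource instanceof My_Ownable) {
        return false; // resources that cannot name an owner are never "owned"
    }

    return $resource->getOwnerId() == $identity;
}

var_dump(isOwnedBy(new Example_Photo('abc-123'), 'abc-123')); // bool(true)
var_dump(isOwnedBy(new stdClass(), 'abc-123'));               // bool(false)
```

Checking against an interface instead of Photos_Row means the same assertion instance could be passed to allow() for any resource type whose rows implement it.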


Dirty Rows and Audit Trails with Zend_Db_Table

Posted on September 27, 2008

There are various ways to update rows in a database table using the Zend_Db_Table components. You can use Zend_Db_Table::update(), like so:

$table = new My_Table();
$table->update(array('age' => 22), 'id = 1');

or retrieve the row, and update it:

$table = new My_Table();
$row = $table->find(1)->current();
$row->age = 22;
$row->save();

The big difference between the two approaches is that by first retrieving the row and then updating it, you're actually using three queries: the first one to find the row, the second one to save it, and a third, which is used internally by Zend_Db_Table_Row_Abstract to refresh data that might have gotten changed due to TIMESTAMP columns, triggers, etc.

If you dig into Zend/Db/Table/Row/Abstract.php, you can see that the class already tracks which columns were changed, so if you only change the value of a single column like the age in the example, not all columns of that row are updated in the database — only those that were actually modified. That's what the protected $_modifiedFields property is for; it records which properties on the Row object were set and only writes those fields to the database. It doesn't, however, check whether the new value is different from the old value.

There's also another protected property, called $_cleanData, which contains the row data as it is currently stored in the database. With that in mind, it is pretty simple to add additional logic to take advantage of that fact.

For instance, we can take it to the next level and only update the record if the column data differs from its previous data. Or perhaps we have a separate audit trail log that needs to capture any column data that was modified.

<?php

require_once 'Zend/Db/Table/Row/Abstract.php';

abstract class My_Db_Table_Row_Abstract extends Zend_Db_Table_Row_Abstract
{

    /**
     * Returns the values that have *actually* been changed
     *
     * @return array
     */
    public function getDirty()
    {
        return array_diff_assoc($this->_data, $this->_cleanData);
    }

    /**
     * Whether the record has been modified
     *
     * @return bool
     */
    public function isDirty()
    {
        return (bool) count($this->getDirty());
    }

    /**
     * Saves the properties to the database.
     *
     * This performs an intelligent insert/update, and reloads the
     * properties with fresh data from the table on success.
     *
     * Saving will only occur if any column values have been modified
     *
     * @return mixed The primary key value(s), as an associative array if the
     *     key is compound, or a scalar if the key is single-column; null if
     *     nothing was modified and no save took place.
     */
    public function save()
    {
        if ($this->isDirty()) {
            return parent::save();
        }

        return null; // nothing changed, so no query is issued
    }
}

I built a feature based on this to record when a row was modified, exactly which columns were updated, when, and by whom, in order to provide a rock-solid audit trail for a web application in a corporate environment.
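The core of such an audit entry can be sketched with plain arrays, without the row class. The helper name describeChanges() and the sample data are made up for this illustration; the two arrays stand in for the row's $_data and $_cleanData.

```php
<?php
// Hypothetical helper: given current and clean (as-stored) row data,
// describe each changed column with its old and new value.
function describeChanges(array $data, array $cleanData)
{
    $changes = array();
    foreach (array_diff_assoc($data, $cleanData) as $column => $newValue) {
        $changes[$column] = array(
            'old' => array_key_exists($column, $cleanData) ? $cleanData[$column] : null,
            'new' => $newValue,
        );
    }

    return $changes;
}

$clean = array('id' => 1, 'name' => 'Alice', 'age' => '21');
$data  = array('id' => 1, 'name' => 'Alice', 'age' => 22);

print_r(describeChanges($data, $clean));
// only 'age' is reported, with old '21' and new 22

// Caveat: array_diff_assoc() compares values as strings, so 22 and '22'
// count as equal, and so do null and ''.
var_dump(describeChanges(
    array('age' => 22,   'note' => null),
    array('age' => '22', 'note' => '')
));
// empty array: nothing is considered dirty
```

The string comparison is convenient for values that come back from the database as strings, but it also means a change from an empty string to NULL would go unnoticed by this kind of check.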


Zend_Db: Setting MySQL's timezone per connection

Posted on September 16, 2008

I have a Linux server with a system timezone of ET (US/Eastern). But I also have a web application that needs to run in a timezone of PT (US/Pacific). Of course that's not a problem at all. I just set the timezone in my web application's bootstrap:

date_default_timezone_set('America/Los_Angeles'); // Pacific timezone

Now I have another problem: the database. Sometimes I use PHP to generate dates such as date('Y-m-d H:i:s', strtotime('-2 minutes')). Other times I insert records and use new Zend_Db_Expr('NOW()'). But because MySQL isn't aware that I'd like to use Pacific time, dates end up being inconsistent and are off by three hours.

It's a fairly easy fix, though, with a bit of logic added to a custom MySQL database adapter:

<?php

/**
 * @see Zend_Db_Adapter_Pdo_Mysql
 */
require_once 'Zend/Db/Adapter/Pdo/Mysql.php';

/**
 * MySQL PDO adapter extended to set the timezone
 */
class My_Db_Adapter_Pdo_Mysql extends Zend_Db_Adapter_Pdo_Mysql
{
    /**
     * @var bool
     */
    protected $_initialized = false;

    /**
     * Connects to the database.
     *
     */
    protected function _connect()
    {
        parent::_connect();

        if (!$this->_initialized) {
            $this->_initialized = true;

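            if (!empty($this->_config['timezone'])) {

                // Requires PHP 5.2+
                $dtz = new DateTimeZone($this->_config['timezone']);
                $offset = $dtz->getOffset(new DateTime('now')); // seconds east of UTC

                // MySQL expects a signed [H]H:MM offset, e.g. '-08:00' or '+05:30'
                $sign = $offset < 0 ? '-' : '+';
                $hours = floor(abs($offset) / 3600);
                $minutes = (abs($offset) % 3600) / 60;

                $this->query(sprintf("SET time_zone = '%s%02d:%02d'", $sign, $hours, $minutes));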
            }
        }
    }
}
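The offset arithmetic is easy to try on its own. The sketch below uses a hypothetical helper, mysqlOffset(), and fixed dates so the result does not depend on when it is run; it assumes PHP 7+ for intdiv() and formats the offset as the signed hours-and-minutes string that MySQL's time_zone variable accepts.

```php
<?php
// Hypothetical helper: format a zone's UTC offset at a given instant
// as a signed HH:MM string such as '-08:00'.
function mysqlOffset($timezone, DateTime $at)
{
    $zone    = new DateTimeZone($timezone);
    $seconds = $zone->getOffset($at);   // seconds east of UTC, may be negative
    $sign    = $seconds < 0 ? '-' : '+';
    $seconds = abs($seconds);

    return sprintf('%s%02d:%02d', $sign, intdiv($seconds, 3600), intdiv($seconds % 3600, 60));
}

$winter = new DateTime('2010-01-15 12:00:00', new DateTimeZone('UTC'));
$summer = new DateTime('2010-07-15 12:00:00', new DateTimeZone('UTC'));

echo mysqlOffset('America/Los_Angeles', $winter), "\n"; // -08:00
echo mysqlOffset('America/Los_Angeles', $summer), "\n"; // -07:00
echo mysqlOffset('Asia/Kolkata', $winter), "\n";        // +05:30
```

Because the offset is a fixed number computed at one instant, a connection that stays open across a daylight saving change would keep using the old offset until it reconnects.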

To kick this all off my bootstrap contains:

$config = array();
$config['host'] = 'localhost';
$config['username'] = 'username';
$config['password'] = 'password';
$config['dbname'] = 'mydatabase';
$config['timezone'] = 'America/Los_Angeles';
$config['adapterNamespace'] = 'My_Db_Adapter';

$db = Zend_Db::factory('PDO_MYSQL', $config);
Zend_Db_Table::setDefaultAdapter($db);

date_default_timezone_set('America/Los_Angeles');

And now I am free to continue my habit of inconsistency when specifying dates.


The WSDL Blower: The state of SOAP in Zend Framework 1.6

Posted on September 13, 2008

There are all kinds of ways to expose APIs as web services. SOAP, XML-RPC, REST, JSON-RPC. Out of all of these, SOAP is arguably the most complex, but also one of the oldest ways to expose an API (I remember preliminary SOAP and WSDL support in Delphi 6, circa 2001).

Exposing an API as a web service in Zend Framework is fairly straightforward. In fact, it is (or should be) as easy as one, two, three:

  1. Pick your favorite style (Zend_Soap_Server, Zend_XmlRpc_Server, Zend_Rest_Server, Zend_Json_Server).
  2. Define a properly documented class with the methods and business logic that you want to expose.
  3. Let your Zend_*_Server::handle() take care of everything else.

That's pretty much it. Of course each Zend_*_Server has its own settings and options that you can (and sometimes must) configure. XML-RPC is also the only server that supports namespaces.
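As an illustration of step 2, here is a sketch of what a "properly documented" class might look like. The class Example_Calculator is hypothetical; the point is that parameter and return types are spelled out in the docblocks, since PHP method signatures of this era carry no scalar type information for a server to discover.

```php
<?php
/**
 * Hypothetical service class, used only for illustration.
 */
class Example_Calculator
{
    /**
     * Add two integers.
     *
     * @param  int $a First operand
     * @param  int $b Second operand
     * @return int    The sum
     */
    public function add($a, $b)
    {
        return (int) $a + (int) $b;
    }
}

// With Zend Framework 1 on the include path, the wiring for SOAP might
// look like this (not executed here; $wsdlUri is a placeholder):
//
//   $server = new Zend_Soap_Server($wsdlUri);
//   $server->setClass('Example_Calculator');
//   $server->handle();

$calculator = new Example_Calculator();
echo $calculator->add(2, 3), "\n"; // 5
```

The class itself knows nothing about SOAP, which is what keeps it reusable behind one of the other server styles.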

If you decide on using SOAP, there are a few things to watch out for.

  1. The WSDL generator doesn't let you set the TargetNamespace attribute explicitly. This becomes an issue when you're working with different environments (development, testing, staging, production, etc.) since the URL of the service determines the namespace. And that makes automatic code generation based on the WSDL problematic. I've contributed a patch to address this, as part of ZF-4117.
  2. Zend_Soap_Client contains a bug that prevents it from properly proxying method calls. Instead, it'll end up recursing infinitely, or at least 100 times before PHP detects the issue and kills it. The quick fix of removing a single underscore from a method call is outlined in ZF-4152.
  3. Zend_Soap_Server doesn't properly turn Exceptions into SoapFaults (due to lack of typecasting) as described in ZF-3958. You can still throw SoapFaults explicitly, but that's just bad form since your class shouldn't be SOAP specific. After all, you may decide to also expose it as XML-RPC or what have you, and then the SoapFault, while it might still work properly, is semantically incorrect.
  4. While Zend_Soap_Server and Zend_Soap_Client are mostly wrappers for the native PHP SoapServer and SoapClient classes, PHP itself lacks WSDL generation, which Zend_Soap_AutoDiscover provides in conjunction with Zend_Soap_Wsdl. Unfortunately the notion of "array of datatype" is not supported, since arrays in PHP are simply declared as "array" and can contain anything. Zend Studio's WSDL generator supports the syntax of "string[]" to specify an array of strings, and this works with complex types as well: "User[]" is an array of User objects. The issue is outlined in ZF-3900, to which I provided not one but two patches, neither of which actually does what I thought it would do (go me!). However, I have put together an embarrassing hack (Zend_Soap_Wsdl arrayOfType patch) that finally does work, at least for me. So far I haven't worked up the courage to submit it to JIRA. Not one of my finest moments.

I doubt most of these issues will hang around for long, but if you're developing something SOAPy with ZF 1.6, you'll likely encounter at least one of them.
