marcus welz

Short URLs with Zend Framework

Posted on September 2, 2010

First up, what's a short URL? It's just that: a URL that is as short as it can possibly be, so that it takes up as few characters as possible when it is used in a Twitter message. Tweets are limited to 140 characters, which is probably the main reason short URLs are so popular. Each character counts.

Technically, short URLs consist of a short domain name and a simple identifier, usually the numeric primary key in a database table of whatever item the page is supposed to be for. And to make that number even shorter it's typically base 62 encoded.

The digits are represented using the numbers 0-9, lowercase a-z and uppercase A-Z. And although PHP offers a base_convert() function, it's unfortunately useless here, as it only supports bases up to 36 and loses precision on large numbers (it uses floating point math internally). So a replacement is needed.

There are all kinds of base62 encoding and decoding functions out there already. One is bc_base_convert, which requires the bcmath extension. Another one, which I found on pastie while browsing reddit, is a bit more fleshed out and cleaner looking. I've reproduced it here for easy reference:

/**
 * @class Integer
 * @author Julien Garand (Go On Web)
 *
 * Can encode and decode integers to/from a string, using a custom alphabet
 */
class Integer
{
	// Default alphabet for a "normal" base 62 encoding
	static protected $alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
	static protected $base = 62;

	/**
	 * Define your custom alphabet here
	 */
	static public function setAlphabet( $alphabet )
	{
		// only strings are allowed
		if ( !is_string($alphabet) )
		{
			throw new Exception('Given alphabet is not a string !');
		}

		self::$base = strlen( $alphabet ); // Our base will be the length of the given alphabet

		// We check if alphabet doesn't have doubled characters
		if ( strlen( count_chars( $alphabet, 3 ) ) != self::$base )
		{
			throw new Exception('The following alphabet has doubled characters : '.$alphabet);
		}

		self::$alphabet = $alphabet; // store it
	}

	/**
	 * Basic accessors
	 */
	static public function getAlphabet() { return self::$alphabet; }
	static public function getBase() { return self::$base; }

	/**
	 * Encode an integer according to the defined alphabet
	 *
	 * @param integer : Unsigned integer to be encoded
	 * @return string (or false if failed)
	 */
	static public function encode( $integer )
	{
		$integer = (int)$integer; // Be sure to have an integer

		// We only accept unsigned integers
		if ( $integer < 0 )
		{
			return false; // or throw new Exception( "($integer) is less than 0 and cannot be converted" );
		}

		// zero needs special treatment, otherwise it would encode to an empty string
		if ( $integer === 0 )
		{
			return self::$alphabet[0];
		}

		$string = ''; // our encoded integer

		// while we have to encode
		while( $integer )
		{
			$pos = $integer % self::$base;                // get the remainder of the Euclidean division...
			$string .= self::$alphabet[ $pos ];           // that's the position of the char in the alphabet
			$integer = ( $integer - $pos ) / self::$base; // and divide integer (minus just encoded char) by the base
		}

		return strrev( $string ); // As we started by the unit of our base ( $base ^ 0 ), we have to reverse the string
	}

	/**
	 * Decode a string to an integer according to the defined alphabet
	 *
	 * @param string : String to be decoded
	 * @return integer (or false if failed)
	 */
	static public function decode( $string )
	{
		$string = (string)$string; // be sure to have a string;

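		// check that our string only has chars that are in the alphabet
		// (strspn gives the length of the leading run of valid chars, which must cover the whole string)
		if ( strspn( $string, self::$alphabet ) != strlen( $string ) )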
		{
			return false; // or throw new Exception( "($string) is not a string or contains characters that are not in alphabet" );
		}

		$integer = 0; // our integer to find
		$unit = 1;    // we start by $base^0

		// foreach chars, starting at the end
		for( $i = strlen( $string ) -1; $i >= 0; $i -- )
		{
			$pos = strpos( self::$alphabet, $string[$i] ); // we find its position in the alphabet
			$integer += $pos * $unit;                      // it's our number to add, multiplied by the current unit
			$unit = $unit * self::$base;                   // and go to next unit in our base
		}

		return $integer;
	}
}

So now we can convert our simple numbers into the more cryptic looking short URL identifiers simply by using Integer::encode(). Here are some example conversions:

1       => 1
10      => a
100     => 1C
255     => 47
1000    => g8
10000   => 2Bi
65535   => h31
100000  => q0U
1000000 => 4c92

Instead of http://example.com/1000000, you could end up with http://example.com/4c92. There, three characters saved. That makes a difference, particularly with really short domain names, such as Twitter's own URL shortener: http://t.co.
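The conversions listed above are easy to double-check. Here's a minimal standalone sketch that round-trips a few of them. The function names base62_encode() and base62_decode() are my own placeholders (they are not part of the Integer class), and the code assumes PHP 7 or newer for intdiv() and the type declarations:

```php
<?php
// Same alphabet as the Integer class: 0-9, then a-z, then A-Z.
const BASE62_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Hypothetical helper: encode a non-negative integer as a base 62 string.
function base62_encode(int $number): string
{
    if ($number < 0) {
        throw new InvalidArgumentException('Only non-negative integers can be encoded.');
    }
    $encoded = '';
    do {
        // Prepend the least significant digit, then drop it from the number.
        $encoded = BASE62_ALPHABET[$number % 62] . $encoded;
        $number = intdiv($number, 62);
    } while ($number > 0);

    return $encoded;
}

// Hypothetical helper: decode a base 62 string back into an integer.
function base62_decode(string $encoded): int
{
    if ($encoded === '' || strspn($encoded, BASE62_ALPHABET) !== strlen($encoded)) {
        throw new InvalidArgumentException('Not a base 62 string: ' . $encoded);
    }
    $number = 0;
    foreach (str_split($encoded) as $char) {
        // Shift what we have so far by one base 62 digit and add the next one.
        $number = $number * 62 + strpos(BASE62_ALPHABET, $char);
    }

    return $number;
}

// Print "number => encoded => decoded again" for a few table entries.
foreach ([255, 1000, 65535, 1000000] as $n) {
    $encoded = base62_encode($n);
    printf("%7d => %s => %d\n", $n, $encoded, base62_decode($encoded));
}
```

The do/while loop means that zero encodes to "0" rather than to an empty string, which matters as soon as an ID of 0 is possible.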

Working with Routes

So, in Zend Framework the actual page logic starts in controllers and actions, which are essentially classes and methods, respectively. In order to reach a controller and action, request URLs are routed using the router.

The default route is sufficient for most applications. It conveniently maps the first two path segments to controller and action. So http://example.com/photo/view maps to the PhotoController::viewAction(). Also, the action is optional, and if omitted will default to index. Therefore, http://example.com/photo will map to PhotoController::indexAction(). There are other scenarios that are helpful to be familiar with.
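To make that mapping concrete, here's a rough, framework-free sketch of the idea. The function resolve_default_route() is purely hypothetical, and it ignores modules, extra parameters and the name formatting that the real router and dispatcher take care of:

```php
<?php
// Hypothetical helper: map a request path to a controller class and action method,
// falling back to "index" for whatever segment is missing.
function resolve_default_route(string $path): array
{
    // Split the path and throw away empty segments (leading or trailing slashes).
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));

    $controller = $segments[0] ?? 'index';
    $action     = $segments[1] ?? 'index';

    return [ucfirst($controller) . 'Controller', $action . 'Action'];
}

var_dump(resolve_default_route('/photo/view')); // PhotoController, viewAction
var_dump(resolve_default_route('/photo'));      // PhotoController, indexAction
```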

Now the easiest way to support short URLs is to add a route that will match any alphanumeric characters and route that to the desired destination. That could look something like this:

$router = Zend_Controller_Front::getInstance()->getRouter();
$router->addRoute('photo', new Zend_Controller_Router_Route(':shortid', array(
    'controller'    => 'photo',
    'action'        => 'view'
), array(
    'shortid'       => '[0-9a-zA-Z]+'
)));

The side effect, however, is that there's no longer a distinction between a short URL such as "http://example.com/1hF" and "http://example.com/photo", since "photo" could be a base62 encoded number (in fact, it would be the number 373,554,054). This can be worked around if you make all other URLs specify both the controller and action explicitly, so you'd use "http://example.com/photo/index" to ensure that the short URL route doesn't match.
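The ambiguity is easy to reproduce outside the framework. This is only a sketch: base62_to_int() is a made-up helper name, it assumes its input has already been validated, and the pattern is the route requirement from above with anchors added:

```php
<?php
const SHORT_ID_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Hypothetical helper: decode a (valid) base 62 string, most significant digit first.
function base62_to_int(string $encoded): int
{
    $number = 0;
    foreach (str_split($encoded) as $char) {
        $number = $number * 62 + strpos(SHORT_ID_ALPHABET, $char);
    }

    return $number;
}

// "photo" satisfies the short id requirement just like "1hF" does...
$looksLikeShortId = preg_match('/^[0-9a-zA-Z]+$/', 'photo') === 1;

var_dump($looksLikeShortId);      // bool(true)
// ...and it decodes to a perfectly plausible ID.
var_dump(base62_to_int('photo')); // int(373554054)
```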

Then, in your controller's action, you'd handle the request using the short URL:

class PhotoController extends Zend_Controller_Action
{
    public function viewAction()
    {
        if ($shortId = $this->_getParam('shortid')) {
            $id = Integer::decode($shortId);
        }

        // rest of the logic to view the photo here, using $id.
    }
}

This technique may not always apply, however, since you might already have a larger application that has all kinds of links that you can't just change to make this short URL thing work.

Another technique is to modify the short URLs a bit so they're more easily recognizable as such. I did that for one application by sacrificing one extra character. I just prefixed all the short IDs with an upper case "S". So you'd have a URL such as http://example.com/S4c92. This works since normally the URLs are all lower-case anyway:

$routes['twitter-pics'] = new Zend_Controller_Router_Route_Regex(
    '(?-i)S([\w\d]+)',
    array('controller' => 'photos',
          'action'     => 'view'),
    array('shortid' => 1),
    'S%s'
);

Note that this is a regular expression based route. The (?-i) turns off case insensitivity. I still wasn't happy with this approach, because the action still needs to explicitly handle that 'shortid' variable.
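Here's a small sketch of what the (?-i) buys you. It assumes the regex route trims the slashes off the path and wraps the pattern as '#^' . $pattern . '$#i' before matching, which is how I understand the ZF1 implementation to work; matches_as_route() is a made-up helper that imitates just that step:

```php
<?php
// Hypothetical helper imitating how the regex route is assumed to apply its pattern:
// anchored at both ends, with the case-insensitive flag switched on.
function matches_as_route(string $routePattern, string $path): bool
{
    return preg_match('#^' . $routePattern . '$#i', trim($path, '/')) === 1;
}

// With (?-i) the upper case "S" prefix is taken literally.
var_dump(matches_as_route('(?-i)S([\w\d]+)', '/S4c92'));  // bool(true)
var_dump(matches_as_route('(?-i)S([\w\d]+)', '/search')); // bool(false)

// Without it, a lower-case URL such as "/search" would be swallowed by the short URL route.
var_dump(matches_as_route('S([\w\d]+)', '/search'));      // bool(true)
```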

Using a Custom Route

I wanted everything encapsulated in the route, so I wrote a custom route class.

The interface that Zend provides is rather straightforward:

interface Zend_Controller_Router_Route_Interface {
    public function match($path);
    public function assemble($data = array(), $reset = false, $encode = false);
    public static function getInstance(Zend_Config $config);
}

  • match checks whether the route matches the path of the request
  • assemble is used to build a URL based on the parameters
  • getInstance is supposed to accept a configuration and return a new instance of the route. I don't even care about that at the moment.

Here's the finished class:

/**
 * Short Route
 *
 * Provides short URLs
 *
 * @author Marcus Welz
 *
 */
class Td_Controller_Router_ShortRoute implements Zend_Controller_Router_Route_Interface
{

    /**
     * @var string The URL prefix
     */
    protected $_urlPrefix = 'S';

    /**
     * @var array The parameter as passed to the request
     */
    protected $_params = array();

    /**
     *
     * @param string $urlPrefix    The prefix of the URL
     * @param array  $params       The parameters as passed to the request
     */
    public function __construct($urlPrefix, $params = array())
    {
        $this->_urlPrefix = $urlPrefix;
        $this->_params = $params;
    }

    /**
     * @param string $path The URL such as "/P3"
     * @return array|false returns parameters including the id on success, false if no match
     */
    public function match($path)
    {
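        // quote the prefix for use inside a "/"-delimited pattern
        $prefix = preg_quote($this->_urlPrefix, '/');
        // [A-Za-z0-9] rather than [A-z0-9]: the range A-z also covers [ \ ] ^ _ and `
        if (preg_match('/\/' . $prefix . '([A-Za-z0-9]+)$/', $path, $matches)) {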
            $params = $this->_params;
            $params['id'] = Integer::decode($matches[1]);
            return $params;
        }
        return false;
    }

    /**
     * Assemble a URL using the ID
     *
     *
     * @param array $data       'id' is the only used parameter in the array
     * @param bool  $reset      unused / ignored
     * @param bool  $encode     unused / ignored
     */
    public function assemble($data = array(), $reset = false, $encode = false)
    {
        return $this->_urlPrefix . Integer::encode($data['id']);
    }

    public static function getInstance(Zend_Config $config)
    {
        throw new Exception('not implemented');
    }
}

Using it is straightforward. First, add it to the router:

Zend_Controller_Front::getInstance()->getRouter()
    ->addRoute('photo', new Td_Controller_Router_ShortRoute('S', array(
        'controller' => 'photo',
        'action'     => 'view'
    )));

Since the conversion between base 62 and base 10 is happening inside the class, the action doesn't have to decode it itself and is thus blissfully unaware of it. Encapsulation successful. And to generate a URL in a view, you'd use the url() view helper:

$this->url(array('id'=> $photo['id']), 'photo')
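To get a feel for which paths such a route would claim, the matching step can be tried on its own. This is a sketch rather than the route itself: extract_short_id() is a hypothetical helper, it returns the still-encoded identifier instead of decoding it, and it anchors the pattern at both ends, which is an assumption on my part:

```php
<?php
// Hypothetical helper: return the encoded identifier if the path is a short URL
// with the given prefix, or null if it is not.
function extract_short_id(string $prefix, string $path): ?string
{
    // Quote the prefix for a "/"-delimited pattern and allow only base 62 characters after it.
    $pattern = '/^\/' . preg_quote($prefix, '/') . '([A-Za-z0-9]+)$/';

    return preg_match($pattern, $path, $matches) === 1 ? $matches[1] : null;
}

var_dump(extract_short_id('S', '/S4c92'));  // string(4) "4c92"
var_dump(extract_short_id('S', '/photo'));  // NULL, no "S" prefix
var_dump(extract_short_id('S', '/S4c_92')); // NULL, "_" is not a base 62 character
```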

Good enough for me.


Proper Twitter Integration with Zend Framework

Posted on May 5, 2009

Twitter is all the rage these days. Every site out there has some kind of "Tweet This" link or "Follow us on Twitter" button. Some sites have even deeper integration and tweet events on your behalf. In most cases, those sites are asking you for your Twitter username and password. What? Even scarier, many people enter their credentials without thinking twice. It's crazy. When did it become acceptable to enter the credentials for your online accounts (which often make you choose passwords of six or more characters) into some random third-party site? Well, the answer, I suppose, is ever since social networking sites began asking for email account access to rummage through your contact list. Still, it's a rather unacceptable way for a self-respecting web site to operate, especially since Twitter supports the OAuth protocol, which is designed to tackle this exact problem.

If you're familiar with how Flickr allows third-party applications and websites access to your account, then you know how it works. A web site requests access to your account, you are prompted to allow or deny access, and that's it. There are no passwords involved. And if you decide that you don't like what that website is doing with your account, you can revoke access at any time.

I will assume that you're already familiar with the Zend Framework. If that is not the case, and you're a PHP developer, you should really consider starting to use it. It is a very well designed and powerful collection of classes that complement each other and, after the initial ramp-up time and learning curve, will pay off in terms of both development speed and maintainability. Check out the Quick Start.

In fact, Zend Framework (1.8) ships with a Zend_Service_Twitter class, which provides all the Twitter functionality. The problem is that this class only supports Basic Authentication using your Twitter account username and password. But fear not, we can bend this class to do our bidding.

See, underneath the hood, Zend_Service_Twitter is actually a Zend_Rest_Client, which is powered by Zend_Http_Client. Let's just remember that for now.

Let's take a look at this OAuth thing. Zend Framework has some preliminary support for it in the incubator. The client portion of it is functional, although kind of buggy, still.

The proposal for Zend_Oauth can be found here http://framework.zend.com/wiki/pages/viewpage.action?pageId=37957, complete with a ma.gnolia.com example use case.

Let me summarize how this works real quick:

1. Your configured Zend_Oauth_Consumer fetches a request token, which is used to prompt the user of the service to allow access.
2. Once access is allowed, your application receives an access token.
3. You can ask the access token object to hand you an HTTP client. It's a Zend_Oauth_Client, which extends Zend_Http_Client and automagically handles the signing, so you can treat it like a regular Zend_Http_Client and perform all the GETs and POSTs you want. Nifty!
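In code, the three steps could look roughly like this. Treat it as a sketch under a few assumptions: the method names follow the Zend_Oauth API as I know it from the later official release (the incubator version may differ), the function names and session key are my own inventions, and every option value is a placeholder:

```php
<?php
// Placeholder configuration; none of these values are real.
$oauthOptions = array(
    'callbackUrl'    => 'http://example.com/twitter/callback',
    'siteUrl'        => 'http://twitter.com/oauth',
    'consumerKey'    => 'YOUR-CONSUMER-KEY',
    'consumerSecret' => 'YOUR-CONSUMER-SECRET',
);

// Step 1 (hypothetical helper): fetch a request token, remember it,
// and send the user off to Twitter to allow or deny access.
function start_twitter_authorization(array $oauthOptions)
{
    $consumer = new Zend_Oauth_Consumer($oauthOptions);
    $_SESSION['TWITTER_REQUEST_TOKEN'] = serialize($consumer->getRequestToken());
    $consumer->redirect();
}

// Step 2 (hypothetical helper): on the callback URL, trade the stored
// request token plus the query parameters for an access token.
function finish_twitter_authorization(array $oauthOptions, array $queryData)
{
    $consumer     = new Zend_Oauth_Consumer($oauthOptions);
    $requestToken = unserialize($_SESSION['TWITTER_REQUEST_TOKEN']);

    return $consumer->getAccessToken($queryData, $requestToken);
}

// Step 3 (hypothetical helper): ask the access token for an HTTP client
// that signs every request it sends.
function twitter_http_client(Zend_Oauth_Token_Access $accessToken, array $oauthOptions)
{
    return $accessToken->getHttpClient($oauthOptions);
}
```

Nothing here runs without Zend Framework and a started session; the functions only show which object is responsible for which step.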

Now let's go back to the Zend_Service_Twitter. Remember how it uses a Zend_Http_Client? All we have to do now is remove the basic (username/password) authentication mechanism and replace it with the OAuth-based version. To achieve that, we'll simply extend Zend_Service_Twitter as My_Service_Twitter and make the following changes:

class My_Service_Twitter extends Zend_Service_Twitter
{
    /**
     * @var array
     */
    protected $_oauthOptions;

    /**
     * @var Zend_Oauth_Token_Access
     */
    protected $_accessToken;

    /**
     * Initialize Oauth
     */
    protected function _init()
    {
        if (!$this->_authInitialized) {

            $client = $this->_accessToken->getHttpClient($this->_oauthOptions);
            $client->setHeaders('Accept-Charset', 'ISO-8859-1,utf-8');
            self::setHttpClient($client);
            $this->_authInitialized = true;

        }
        $client = self::getHttpClient();
        $client->resetParameters();
    }

    /**
     * @param array $oauthOptions
     * @return My_Service_Twitter provides fluent interface
     */
    public function setOauthOptions(array $oauthOptions)
    {
        $this->_oauthOptions = $oauthOptions;
        return $this;
    }

    /**
     * @return array
     */
    public function getOauthOptions()
    {
        return $this->_oauthOptions;
    }

    /**
     * @param Zend_Oauth_Token_Access $token
     * @return My_Service_Twitter provides fluent interface
     */
    public function setToken(Zend_Oauth_Token_Access $token)
    {
        $this->_accessToken = $token;
        return $this;
    }

    /**
     * @return Zend_Oauth_Token_Access
     */
    public function getToken()
    {
        return $this->_accessToken;
    }
}

And it's ready to be used. Instantiate the class, set the OAuth options via setOauthOptions() and the access token via setToken(), and then use the class the same way as before.
