Short URLs with Zend Framework
First up, what's a short URL? A short URL is just that: a URL that is as short as it can possibly be, so that it takes up as few characters as possible when used in a Twitter message. Twitter messages are limited to 140 characters, which is probably the main reason short URLs are so popular. Each character counts.
Technically, short URLs consist of a short domain name and a simple identifier, usually the numeric primary key in a database table of whatever item the page is supposed to be for. And to make that number even shorter it's typically base 62 encoded.
The digits are represented using the numbers 0-9, lowercase a-z and uppercase A-Z. And although PHP offers a base_convert() function, it's unfortunately useless here, as it only supports bases up to 36 and loses precision on large numbers (it uses floating point math internally). So a replacement is needed.
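To make the arithmetic concrete, here is a minimal sketch of the encoding step on its own. The function name base62_encode() is a hypothetical helper chosen for illustration, not something PHP or Zend Framework provides, and it assumes the number fits into a native PHP integer.

```php
<?php
// Hypothetical helper for illustration; not part of PHP or Zend Framework.
function base62_encode($number)
{
    $alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $number = (int) $number;
    if ($number < 0) {
        throw new InvalidArgumentException('Only non-negative integers can be encoded');
    }
    $encoded = '';
    do {
        // The remainder selects the next digit, least significant first
        $remainder = $number % 62;
        $encoded = $alphabet[$remainder] . $encoded;
        // Subtracting the remainder first keeps the division exact (and integral)
        $number = ($number - $remainder) / 62;
    } while ($number > 0);
    return $encoded;
}

echo base62_encode(1000), "\n";    // g8, because 1000 = 16 * 62 + 8 and digit 16 is "g"
echo base62_encode(1000000), "\n"; // 4c92
```

Prepending each digit avoids the final string reversal, at the cost of a few extra string copies, which hardly matters for identifiers this short.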
There are all kinds of base62 encoding and decoding functions out there already. One is bc_base_convert, which requires the bcmath extension. Another one, which I found on pastie while browsing reddit, is a bit more fleshed out and cleaner looking. I've reproduced it here for easy reference:
/**
* @class Integer
* @author Julien Garand (Go On Web)
*
* Can encode and decode integers to/from a string, using a custom alphabet
*/
class Integer
{
// Default alphabet for a "normal" base 62 encoding
static protected $alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
static protected $base = 62;
/**
* Define your custom alphabet here
*/
static public function setAlphabet( $alphabet )
{
// only strings are allowed
if ( !is_string($alphabet) )
{
throw new Exception('Given alphabet is not a string !');
}
self::$base = strlen( $alphabet ); // Our base will be the length of the given alphabet
// We check if alphabet doesn't have doubled characters
if ( strlen( count_chars( $alphabet, 3 ) ) != self::$base )
{
throw new Exception('The following alphabet has doubled characters : '.$alphabet);
}
self::$alphabet = $alphabet; // store it
}
/**
* Basic accessors
*/
static public function getAlphabet() { return self::$alphabet; }
static public function getBase() { return self::$base; }
/**
* Encode an integer according to the defined alphabet
*
* @param integer : Unsigned integer to be encoded
* @return string (or false if failed)
*/
static public function encode( $integer )
{
$integer = (int)$integer; // Be sure to have an integer
// We only accept unsigned integers
if ( $integer < 0 )
{
return false; // or throw new Exception( "($integer) is less than 0 and cannot be converted" );
}
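// zero never enters the loop below, so return the alphabet's zero digit directly
if ( $integer === 0 )
{
return self::$alphabet[0];
}
$string = ''; // our encoded integer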
// while we have to encode
while( $integer )
{
$pos = $integer % self::$base; // get the rest of euclidian division...
$string .= self::$alphabet[ $pos ]; // thats the position of the char in alphabet
$integer = ( $integer - $pos ) / self::$base; // and divide integer (minus just encoded char) by the base
}
return strrev( $string ); // As we started by the unit of our base ( $base ^ 0 ), we have to reverse the string
}
/**
* Decode a string to an integer according to the defined alphabet
*
* @param string : String to be decoded
* @return integer (or false if failed)
*/
static public function decode( $string )
{
$string = (string)$string; // be sure to have a string;
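// check that our string only has chars that are in the alphabet
// (strspn counts the leading chars that ARE in the alphabet; it must cover the whole string)
if ( strspn( $string, self::$alphabet ) !== strlen( $string ) )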
{
return false; // or throw new Exception( "($string) is not a string or contains characters that are not in alphabet" );
}
$integer = 0; // our integer to find
$unit = 1; // we start by $base^0
// foreach chars, starting at the end
for( $i = strlen( $string ) -1; $i >= 0; $i -- )
{
$pos = strpos( self::$alphabet, $string[$i] ); // we find it's position in alphabet
$integer += $pos * $unit; // its our number to add, multiplied by the current unit
$unit = $unit * self::$base; // and go to next unit in our base
}
return $integer;
}
}
So now we can convert our simple numbers into the more cryptic looking short URL identifiers simply using Integer::encode(). Here are some example conversions:
1 => 1
10 => a
100 => 1C
255 => 47
1000 => g8
10000 => 2Bi
65535 => h31
100000 => q0U
1000000 => 4c92
Instead of http://example.com/1000000, you could end up with http://example.com/4c92. There, three characters saved. That makes a difference, particularly with really short domain names, such as Twitter's own URL shortener: http://t.co.
Working with Routes
So, in Zend Framework the actual page logic starts in controllers and actions, which are essentially classes and methods, respectively. In order to reach a controller and action, request URLs are routed using the router.
The default route is sufficient for most applications. It conveniently maps the first two path segments to controller and action. So http://example.com/photo/view maps to the PhotoController::viewAction(). Also, the action is optional, and if omitted will default to index. Therefore, http://example.com/photo will map to PhotoController::indexAction(). There are other scenarios that are helpful to be familiar with.
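The naming convention described here can be sketched in a few lines of plain PHP. This is only an illustration of the rule, not Zend Framework's router; default_route_target() is a hypothetical name, and modules and extra parameters are ignored.

```php
<?php
// Hypothetical illustration of the default route's naming convention.
// This is not Zend Framework's router, only the rule described in the text.
function default_route_target($path)
{
    // Split "/photo/view" into its non-empty segments
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));
    // A missing segment falls back to "index"
    $controller = isset($segments[0]) ? $segments[0] : 'index';
    $action     = isset($segments[1]) ? $segments[1] : 'index';
    return ucfirst($controller) . 'Controller::' . $action . 'Action()';
}

echo default_route_target('/photo/view'), "\n"; // PhotoController::viewAction()
echo default_route_target('/photo'), "\n";      // PhotoController::indexAction()
```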
Now the easiest way to support short URLs is to add a route that will match any alphanumeric characters and route that to the desired destination. That could look something like this:
$router = Zend_Controller_Front::getInstance()->getRouter();
$router->addRoute('photo', new Zend_Controller_Router_Route(':shortid', array(
'controller' => 'photo',
'action' => 'view'
), array(
'shortid' => '[0-9a-zA-Z]+'
)));
The side effect, however, is that there's no longer a distinction between a short URL such as "http://example.com/1hF" and "http://example.com/photo", since "photo" could be a base62 encoded number (in fact, it would be the number 373,554,054). This can be worked around if you make all other URLs specify both the controller and action explicitly, so you'd use "http://example.com/photo/index" to ensure that the short URL route doesn't match.
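The ambiguity is easy to reproduce outside the framework. The sketch below uses a hypothetical base62_decode() helper with the same alphabet as the Integer class, together with the route's requirement pattern wrapped in anchors for preg_match().

```php
<?php
// Hypothetical helper for illustration; same alphabet as the Integer class.
function base62_decode($string)
{
    $alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $number = 0;
    for ($i = 0, $length = strlen($string); $i < $length; $i++) {
        $digit = strpos($alphabet, $string[$i]);
        if ($digit === false) {
            return false; // character outside the alphabet
        }
        // Shift the digits read so far one place to the left, then add the new one
        $number = $number * 62 + $digit;
    }
    return $number;
}

// The requirement pattern accepts a controller name just as happily as a short ID
var_dump(preg_match('/^[0-9a-zA-Z]+$/', 'photo')); // int(1)
var_dump(base62_decode('photo'));                  // int(373554054)
var_dump(base62_decode('1hF'));                    // int(4939)
```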
Then, in your controller's action, you'd handle the request using the short URL:
class PhotoController extends Zend_Controller_Action
{
public function viewAction()
{
if ($shortId = $this->_getParam('shortid')) {
$id = Integer::decode($shortId);
}
// rest of the logic to view the photo here, using $id.
}
}
This technique may not always apply, however, since you might already have a larger application that has all kinds of links that you can't just change to make this short URL thing work.
Another technique is to modify short URLs a bit so they're more easily recognizable as such. I did that for one application by sacrificing one extra character. I just prefixed all the short IDs with an upper case "S". So you'd have a URL such as http://example.com/S4c92. This works since normally the URLs are all lower-case anyway:
$routes['twitter-pics'] = new Zend_Controller_Router_Route_Regex(
'(?-i)S([\w\d]+)',
array('controller' => 'photos',
'action' => 'view'),
array('shortid' => 1),
'/%s'
);
Note that this is a regular expression based route. The (?-i) turns off case insensitivity. I still wasn't happy with this approach, because the action still needs to explicitly handle that 'shortid' variable.
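The effect of (?-i) can be checked with plain preg_match(). The sketch assumes the route ends up matching with the case-insensitive "i" flag (which is why the flag has to be switched off again inside the pattern); the delimiters and anchors are added here for the stand-alone test.

```php
<?php
// Assumption: the pattern is evaluated with the case-insensitive "i" flag,
// so (?-i) is needed to make the leading "S" case-sensitive again.
$pattern = '#^(?-i)S([\w\d]+)$#i';

var_dump(preg_match($pattern, 'S4c92', $m)); // int(1)
var_dump($m[1]);                             // string(4) "4c92"
var_dump(preg_match($pattern, 's4c92'));     // int(0): a lowercase "s" is rejected

// Without (?-i) the lowercase variant would match as well
var_dump(preg_match('#^S([\w\d]+)$#i', 's4c92')); // int(1)
```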
Using a Custom Route
I wanted everything encapsulated in the route, so I wrote a custom route class.
The interface that Zend provides is rather straightforward:
interface Zend_Controller_Router_Route_Interface {
public function match($path);
public function assemble($data = array(), $reset = false, $encode = false);
public static function getInstance(Zend_Config $config);
}
- match checks whether the route matches the path of the request.
- assemble is used to build a URL based on the parameters.
- getInstance is supposed to accept a configuration and return a new instance of the route. I don't even care about that at the moment.
Here's the finished class:
/**
* Short Route
*
* Provides short URLs
*
* @author Marcus Welz
*
*/
class Td_Controller_Router_ShortRoute implements Zend_Controller_Router_Route_Interface
{
/**
* @var string The URL prefix
*/
protected $_urlPrefix = 'S';
/**
* @var array The parameter as passed to the request
*/
protected $_params = array();
/**
*
* @param string $urlPrefix The prefix of the URL
* @param array $params The parameters as passed to the request
*/
public function __construct($urlPrefix, $params = array())
{
$this->_urlPrefix = $urlPrefix;
$this->_params = $params;
}
/**
* @param string $path The URL such as "/P3"
* @return array|false returns parameters including the id on success, false if no match
*/
public function match($path)
{
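$prefix = preg_quote($this->_urlPrefix, '/'); // also escape the regex delimiter
// [A-z] would also match the punctuation between "Z" and "a", so spell out both ranges
if (preg_match('/\/' . $prefix . '([A-Za-z0-9]+)$/', $path, $matches)) {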
$params = $this->_params;
$params['id'] = Integer::decode($matches[1]);
return $params;
}
return false;
}
/**
* Assemble a URL using the ID
*
*
* @param array $data 'id' is the only used parameter in the array
* @param bool $reset unused / ignored
* @param bool $encode unused / ignored
*/
public function assemble($data = array(), $reset = false, $encode = false)
{
return $this->_urlPrefix . Integer::encode($data['id']);
}
public static function getInstance(Zend_Config $config)
{
throw new Exception('not implemented');
}
}
Using it is straightforward. First, add it to the router:
Zend_Controller_Front::getInstance()->getRouter()
->addRoute('photo', new Td_Controller_Router_ShortRoute('S', array(
'controller' => 'photo',
'action' => 'view'
)));
Since the conversion between base 62 and base 10 is happening inside the class, the action doesn't have to decode it itself and is thus blissfully unaware of it. Encapsulation successful. And to generate a URL in a view, you'd use the url() view helper:
$this->url(array('id'=> $photo['id']), 'photo')
Good enough for me.
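For experimenting with the matching step without a running application, it can be pulled out into a plain function. This is a sketch under a few assumptions: match_short_path() is a hypothetical name, the pattern is anchored at both ends (stricter than the route class), and decoding is left out so that only the prefix handling is shown.

```php
<?php
// Hypothetical stand-alone version of the matching step, for experimentation.
function match_short_path($path, $prefix)
{
    // Pass the delimiter to preg_quote() so a "/" in the prefix is escaped too
    $pattern = '/^\/' . preg_quote($prefix, '/') . '([A-Za-z0-9]+)$/';
    if (preg_match($pattern, $path, $matches)) {
        return $matches[1]; // the still-encoded identifier
    }
    return false;
}

var_dump(match_short_path('/S4c92', 'S')); // string(4) "4c92"
var_dump(match_short_path('/photo', 'S')); // bool(false)
var_dump(match_short_path('/S4_c', 'S'));  // bool(false): "_" is not a base62 digit
```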
Dirty Rows and Audit Trails with Zend_Db_Table
There are various ways to update rows in a database table using the Zend_Db_Table components. You can use Zend_Db_Table::update(), like so:
$table = new My_Table();
$table->update(array('age' => 22), 'id = 1');
or retrieve the row, and update it:
$table = new My_Table();
$row = $table->find(1)->current();
$row->age = 22;
$row->save();
The big difference between the two approaches is that by first retrieving the row, and then updating it, you're actually using three queries. The first one to find the row, the second one to save it, and a third, which is used internally by Zend_Db_Table_Row_Abstract to refresh data that might have changed due to TIMESTAMP columns, triggers, etc.
If you dig into Zend/Db/Table/Row/Abstract.php, you can see that the class already tracks which columns were changed, so if you only change the value of a single column like the age in the example, not all columns of that row are updated in the database — only those that were actually modified. That's what the protected $_modifiedFields property is for; it records which properties on the Row object were set and only writes those fields to the database. It doesn't, however, check whether the new value is different from the old value.
There's also another protected property, called $_cleanData, which contains the row data as it is currently stored in the database. With that in mind, it is pretty simple to add additional logic to take advantage of that fact.
For instance, we can take it to the next level and only update the record if the column data differs from its previous data. Or perhaps we have a separate audit trail log that needs to capture any column data that was modified.
<?php
require_once 'Zend/Db/Table/Row/Abstract.php';
abstract class My_Db_Table_Row_Abstract extends Zend_Db_Table_Row_Abstract
{
/**
* Returns the values that have *actually* been changed
*
* @return array
*/
public function getDirty()
{
return array_diff_assoc($this->_data, $this->_cleanData);
}
/**
* Whether the record has been modified
*
* @return bool
*/
public function isDirty()
{
return (bool) count($this->getDirty());
}
/**
* Saves the properties to the database.
*
* This performs an intelligent insert/update, and reloads the
* properties with fresh data from the table on success.
*
* Saving will only occur if any column values have been modified
*
* @return mixed The primary key value(s), as an associative array if the
* key is compound, or a scalar if the key is single-column.
*/
public function save()
{
if ($this->isDirty()) {
return parent::save();
}
}
}
I built a feature based on this to record when a row was modified, exactly which columns were updated, when, and by whom, in order to provide a rock-solid audit trail for a web application in a corporate environment.
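What such an audit record could look like is sketched below in plain PHP. This is not the feature mentioned above, only an illustration of the same array_diff_assoc() idea; the function name, the entry layout, the user name and the timestamp are all hypothetical.

```php
<?php
// Hypothetical helper: turns the difference between the stored ("clean") and the
// current row data into audit entries. Names and layout are illustrative only.
function build_audit_entries(array $cleanData, array $data, $user, $timestamp)
{
    $entries = array();
    // array_diff_assoc() compares values as strings, so 22 and '22' count as equal
    foreach (array_diff_assoc($data, $cleanData) as $column => $newValue) {
        $entries[] = array(
            'column'     => $column,
            'old_value'  => isset($cleanData[$column]) ? $cleanData[$column] : null,
            'new_value'  => $newValue,
            'changed_by' => $user,
            'changed_at' => $timestamp,
        );
    }
    return $entries;
}

$clean   = array('id' => '1', 'name' => 'Alice', 'age' => '21');
$current = array('id' => '1', 'name' => 'Alice', 'age' => 22);

// Only the "age" column differs, so a single entry is produced
print_r(build_audit_entries($clean, $current, 'jdoe', '2008-09-01 12:00:00'));
```

Because the comparison is string based, setting a column to the value it already holds (even with a different PHP type) produces no entry.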
Zend_Db: Setting MySQL's timezone per connection
I have a Linux server with a system timezone of ET (US/Eastern). But I also have a web application that needs to run in a timezone of PT (US/Pacific). Of course that's not a problem at all. I just set the timezone in my web application's bootstrap:
date_default_timezone_set('America/Los_Angeles'); // Pacific timezone
Now I have another problem: the database. Sometimes I use PHP to generate dates such as date('Y-m-d H:i:s', strtotime('-2 minutes')). Other times I insert records and use new Zend_Db_Expr('NOW()'). But because MySQL isn't aware that I'd like to use Pacific time, dates end up being inconsistent and are off by three hours.
It's a fairly easy fix, though, with a bit of logic added to a custom MySQL database adapter:
<?php
/**
* @see Zend_Db_Adapter_Pdo_Mysql
*/
require_once 'Zend/Db/Adapter/Pdo/Mysql.php';
/**
* MySQL PDO adapter extended to set the timezone
*/
class My_Db_Adapter_Pdo_Mysql extends Zend_Db_Adapter_Pdo_Mysql
{
/**
* @var bool
*/
protected $_initialized = false;
/**
* Connects to the database.
*
*/
protected function _connect()
{
parent::_connect();
if (!$this->_initialized) {
$this->_initialized = true;
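if (!empty($this->_config['timezone'])) {
// Requires PHP 5.2+
$dtz = new DateTimeZone($this->_config['timezone']);
$offset = $dtz->getOffset(new DateTime('now')); // seconds from UTC
// MySQL expects a signed offset such as '-8:00' or '+5:30'
$this->query(sprintf("SET time_zone = '%s%d:%02d'",
$offset < 0 ? '-' : '+',
floor(abs($offset) / 3600),
floor(abs($offset) % 3600 / 60)));
}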
}
}
}
To kick this all off my bootstrap contains:
$config = array();
$config['host'] = 'localhost';
$config['username'] = 'username';
$config['password'] = 'password';
$config['dbname'] = 'mydatabase';
$config['timezone'] = 'America/Los_Angeles';
$config['adapterNamespace'] = 'My_Db_Adapter';
$db = Zend_Db::factory('PDO_MYSQL', $config);
Zend_Db_Table::setDefaultAdapter($db);
date_default_timezone_set('America/Los_Angeles');
And now I am free to continue my habit of inconsistency when specifying dates.
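The offset calculation can be tried in isolation before wiring it into an adapter. The following is a minimal sketch, assuming a hypothetical helper named mysql_utc_offset(); it formats the offset with an explicit sign and minutes, which also covers zones east of UTC and zones with half-hour offsets.

```php
<?php
// Hypothetical helper: formats a zone's UTC offset as sign, hours and minutes,
// the form MySQL's time_zone variable accepts for numeric offsets.
function mysql_utc_offset($timezone, DateTime $at)
{
    $zone    = new DateTimeZone($timezone);
    $seconds = $zone->getOffset($at); // offset from UTC in seconds at that instant
    $sign    = $seconds < 0 ? '-' : '+';
    $seconds = abs($seconds);
    $hours   = (int) floor($seconds / 3600);
    $minutes = (int) floor(($seconds % 3600) / 60);
    return sprintf('%s%02d:%02d', $sign, $hours, $minutes);
}

$utc = new DateTimeZone('UTC');
echo mysql_utc_offset('America/Los_Angeles', new DateTime('2008-01-15 12:00:00', $utc)), "\n"; // -08:00
echo mysql_utc_offset('America/Los_Angeles', new DateTime('2008-07-15 12:00:00', $utc)), "\n"; // -07:00 (daylight saving)
echo mysql_utc_offset('Asia/Kolkata', new DateTime('2008-01-15 12:00:00', $utc)), "\n";        // +05:30
```

Since the offset is computed once per connection, a long-running connection that spans a daylight saving change would keep the old offset.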
The WSDL Blower: The state of SOAP in Zend Framework 1.6
There are all kinds of ways to expose APIs as web services. SOAP, XML-RPC, REST, JSON-RPC. Out of all of these, SOAP is arguably the most complex, but also one of the oldest ways to expose an API (I remember preliminary SOAP and WSDL support in Delphi 6, circa 2001).
Exposing an API as a web service in Zend Framework is fairly straightforward. In fact, it is (or should be) as easy as one, two, three:
- Pick your favorite style (Zend_Soap_Server, Zend_XmlRpc_Server, Zend_Rest_Server, Zend_Json_Server).
- Define a properly documented class with the methods and business logic that you want to expose.
- Let your Zend_*_Server::handle() everything else.
That's pretty much it. Of course each Zend_*_Server has its own settings and options that you can (and sometimes must) configure. XML-RPC is also the only server that supports namespaces.
If you decide on using SOAP, there are a few things to watch out for.
- The WSDL generator doesn't let you set the TargetNamespace attribute explicitly. This becomes an issue when you're working with different environments (development, testing, staging, production, etc.) since the URL of the service determines the namespace. And that makes automatic code generation based on the WSDL problematic. I've contributed a patch to address this, as part of ZF-4117.
- Zend_Soap_Client contains a bug that prevents it from properly proxying method calls. Instead, it'll end up recursing infinitely, or at least 100 times before PHP detects the issue and kills it. The quick fix of removing a single underscore from a method call is outlined in ZF-4152.
- Zend_Soap_Server doesn't properly turn Exceptions into SoapFaults (due to lack of typecasting) as described in ZF-3958. You can still throw SoapFaults explicitly, but that's just bad form since your class shouldn't be SOAP specific. After all, you may decide to also expose it as XML-RPC or what have you, and then the SoapFault, while it might still work properly, is semantically incorrect.
- While Zend_Soap_Server and Zend_Soap_Client are mostly wrappers for the native PHP SoapServer and SoapClient classes, PHP itself lacks WSDL generation, which Zend_Soap_AutoDiscover provides in conjunction with Zend_Soap_Wsdl. Unfortunately the notion of "array of datatype" is not supported, since arrays in PHP are simply declared as "array" and can contain anything. Zend Studio's WSDL generator supports the syntax of "string[]" to specify an array of strings, and this works with complex types as well: "User[]" is an array of User objects. The issue is outlined in ZF-3900, to which I provided not one, but two patches, neither of which actually does what I thought it would do (go me!). However, I have put together an embarrassing hack (Zend_Soap_Wsdl arrayOfType patch) that finally does work, at least for me. So far I haven't worked up the courage to submit it to JIRA. Not one of my finest moments.
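To illustrate the "type[]" notation from the last point, here is a minimal sketch of how a generator might recognise it. This is neither Zend Framework's nor Zend Studio's implementation; parse_doc_type() and its return layout are hypothetical.

```php
<?php
// Hypothetical parser for the "type[]" docblock notation; only an illustration
// of the convention, not how any existing WSDL generator is implemented.
function parse_doc_type($type)
{
    $depth = 0;
    // Strip one "[]" suffix per array dimension
    while (substr($type, -2) === '[]') {
        $type = substr($type, 0, -2);
        $depth++;
    }
    return array('type' => $type, 'arrayDepth' => $depth);
}

print_r(parse_doc_type('string'));   // type "string", arrayDepth 0
print_r(parse_doc_type('string[]')); // type "string", arrayDepth 1
print_r(parse_doc_type('User[]'));   // type "User", arrayDepth 1
```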
I doubt most of these issues will hang around for long, but if you're developing something SOAPy with ZF 1.6, you'll likely encounter at least one of them.