Zend Framework and File Locking Pitfalls
Earlier today, while reading through the Zend Framework 1.6 RC1 release notes, I came across an interesting bug that has been fixed: [ZF-3382] Zend_Cache_Backend_File problems under very high load.
There are a lot of things to say about this issue. The obvious ones first:
1. Under typical operation such as opening, locking, reading/writing, and then closing a file you should never need to unlock a file explicitly since fclose() implicitly unlocks a file. Calling flock(LOCK_UN) explicitly merely introduces a potential race condition between the lock release and file close. And it'll be (very) hard to debug.
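2. PHP's fopen() is a problem when using "wb" (overwrite in binary mode): the file is truncated the moment it is opened, before any lock is held. Instead, opening with "ab+" (append in binary mode), acquiring the lock, and then calling fseek($fp, 0) and ftruncate($fp, 0) eliminates another potential race condition. Roof Top Solutions has an article on this.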
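The two points above can be put together in a short sketch. This is not the Zend_Cache_Backend_File patch itself; cache_write_locked() is a hypothetical helper name, and the code relies, as described above, on fclose() releasing the lock.

```php
<?php
// Sketch only: cache_write_locked() is a hypothetical helper, not Zend Framework code.
function cache_write_locked($path, $data)
{
    // "ab+" creates the file if needed but never truncates it on open,
    // so the old contents stay intact until we actually hold the lock.
    $fp = fopen($path, 'ab+');
    if ($fp === false) {
        return false;
    }
    if (!flock($fp, LOCK_EX)) {
        fclose($fp);
        return false;
    }
    // Only now, with the exclusive lock held, discard the old contents.
    fseek($fp, 0);
    ftruncate($fp, 0);
    $ok = (fwrite($fp, $data) === strlen($data));
    fflush($fp);
    // No explicit flock($fp, LOCK_UN): closing the handle releases the lock.
    fclose($fp);
    return $ok;
}
```

Readers would take a shared lock (LOCK_SH) on the same file before reading, so they wait while a writer holds the exclusive lock.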
Honestly, these types of bugs are the worst kind, precisely because they're so hard to identify. The buggy code runs and seems fine, but fails when you need it most: under high load, such as when you're featured on Digg, Slashdot, or any other large site that sends massive amounts of traffic your way. Suddenly, instead of helping your site scale, the cache causes your web application to thrash while the cache files are getting clobbered and trampled on.
Now imagine this is an issue with an in-house component that you have to solve by yourself. First it would have to be isolated and identified, because the big picture is just that your application crumbles under high load. "Even with all the caching that it's doing. Not sure what's going on" might be the first thought. I know that it would take me quite some time to narrow it down to the caching layer, create stress tests, and experiment with various backends to see if, say, using memcache instead of files would fix the issue. And that's just identifying the problem. When it comes to fixing it, not explicitly unlocking isn't too hard to figure out, but the issue with fopen() in "wb" mode? That would have taken a while. Case in point: according to the issue ticket, it was created June 4 and resolved on June 26, and judging by the notes, that was largely due to the efforts of Cody Pisto, who spent his afternoon on June 25 identifying the problems and creating a patch for this tricky issue.
Furthermore, this is a great example of the benefits gained from the Zend Framework (and other open source frameworks and components). In buzzwordy marketing lingo: It's a time and battle tested feature rich platform of loosely coupled components that you can mix and match as you please, and it only gets better as its adoption rate increases.
Zend_Db_Select Woes
It's no secret, I'm a fan of the Zend Framework, which I've been using since version 0.15. A lot of components have been added since then, and many of the initial components have been refactored and enhanced, and have matured. And that includes the Zend_Db_Select component, which has been evolving quite nicely, and even Zend_Db_Table_Abstract based classes make use of Zend_Db_Table_Select, which extends Zend_Db_Select.
But ultimately, there's still something lacking: Support for vendor specific SQL extensions, such as MySQL's SQL_CALC_FOUND_ROWS. Generally speaking, keeping Zend_Db_Select ANSI SQL compliant is a Good Thing™, as it forces (mostly) standards compliant queries (with the exception of the LIMIT clause, which invokes the database adapter to generate the SQL snippet), which may help with portability. Still, full support for features offered would be nice, and often is the reason why a particular database was chosen to begin with (among other reasons such as performance, budget, corporate policy, developer experience, etc).
Up until now I've always patched my local copy of the Zend/Db/Select.php file with support for that particular extension. That way I could call $select->calcFoundRows(true)->from([..]); and not have to give up using Zend_Db_Select. Of course doing it this way is not exactly best practice — a vendor specific extension shouldn't be implemented like that in a more universal component. For my purposes that's fine, since I just work with MySQL and SQLite, but ultimately there needs to be better support for these extensions.
A cleaner way to implement the extra functionality is to extend Zend_Db_Select and add the extension support there. With ZF 1.6 RC1 that means overriding the protected static $_partsInit and adding the corresponding _render*() method that gets called in Zend_Db_Select::assemble().
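The mechanism can be sketched without the framework. The classes below are hypothetical stand-ins, not Zend_Db_Select itself; they only mirror the idea of a parts list that drives _render*() calls in assemble(). The use of static:: is a choice of this sketch and requires PHP 5.3 or later.

```php
<?php
// Hypothetical stand-in for a select builder; NOT Zend_Db_Select.
class Sketch_Select
{
    // Ordered list of parts; assemble() calls _render<Part>() for each key.
    protected static $_partsInit = array('columns' => array(), 'from' => null);
    protected $_parts;

    public function __construct()
    {
        // static:: so that a list redeclared in a subclass is picked up.
        $this->_parts = static::$_partsInit;
    }

    public function from($table, array $columns = array('*'))
    {
        $this->_parts['from'] = $table;
        $this->_parts['columns'] = $columns;
        return $this;
    }

    public function assemble()
    {
        $sql = 'SELECT';
        foreach (array_keys(static::$_partsInit) as $part) {
            $method = '_render' . ucfirst($part);
            if (method_exists($this, $method)) {
                $sql = $this->$method($sql);
            }
        }
        return $sql;
    }

    protected function _renderColumns($sql)
    {
        return $sql . ' ' . implode(', ', $this->_parts['columns']);
    }

    protected function _renderFrom($sql)
    {
        return $sql . ' FROM ' . $this->_parts['from'];
    }
}

// MySQL-specific subclass: adds one part and its render method.
class Sketch_Select_Mysql extends Sketch_Select
{
    protected static $_partsInit = array(
        'calcFoundRows' => false,   // listed first so it renders right after SELECT
        'columns'       => array(),
        'from'          => null,
    );

    public function calcFoundRows($flag = true)
    {
        $this->_parts['calcFoundRows'] = (bool) $flag;
        return $this;
    }

    protected function _renderCalcFoundRows($sql)
    {
        return $this->_parts['calcFoundRows'] ? $sql . ' SQL_CALC_FOUND_ROWS' : $sql;
    }
}

$select = new Sketch_Select_Mysql();
echo $select->calcFoundRows(true)->from('posts', array('id', 'title'))->assemble(), "\n";
```

The position of the new key in the parts list matters, because the parts are rendered in list order.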
There are discussions about this on the Zend Framework mailing list, and they provide interim solutions. It's especially useful in conjunction with the proposed Zend_Paginator, which is part of ZF 1.6 RC1.
However, getting a select object with $dbAdapter->select(); is still an issue. Instead, one would have to manually call $select = new My_Db_Select($dbAdapter);, and that's something I'd like to avoid. It would also trigger a chain reaction and require one to extend Zend_Db_Table_Abstract with My_Db_Table_Abstract, which calls My_Db_Table_Select instead of Zend_Db_Table_Select, which in turn extends My_Db_Select instead of Zend_Db_Select. Not pretty. And it's definitely a good example of why I, like many other OO developers, favor "composition over inheritance."
I'd almost like to see PHP getting proper support for mixins. I could see Zend_Db_Select emulating mixins or supporting plugins: plugin classes would be registered by the individual database adapters (something like Zend_Db_Select_Plugin_Mysql.php), and __call() would iterate over those classes to add functionality to the core. Honestly, that seems a bit overkill, and a tad too unclean for my taste, but perhaps that's the way to go. It's how Doctrine implements Table plugins, and who knows, maybe it paves the road to what could one day be supported natively by PHP 6.3 or 7. After all, Perl, Python, Ruby, JavaScript and many other dynamic languages support it.
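To make the __call() idea concrete, here is a minimal sketch. None of these classes exist in Zend Framework; Sketch_Query and Sketch_Query_Plugin_Mysql are hypothetical names, and the public $modifiers array merely stands in for whatever internal state a real select object would keep.

```php
<?php
// Hypothetical sketch of "mixins via __call()"; not Zend Framework code.
class Sketch_Query
{
    public $modifiers = array();   // e.g. SQL keywords placed after SELECT
    private $plugins = array();

    public function registerPlugin($plugin)
    {
        $this->plugins[] = $plugin;
        return $this;
    }

    // Invoked by PHP for any method this class does not define itself.
    public function __call($name, $args)
    {
        foreach ($this->plugins as $plugin) {
            if (method_exists($plugin, $name)) {
                // Pass the query as first argument so the plugin can modify it.
                array_unshift($args, $this);
                call_user_func_array(array($plugin, $name), $args);
                return $this;      // keep the fluent interface
            }
        }
        throw new BadMethodCallException("Unknown method: $name");
    }
}

// A vendor-specific plugin that a MySQL adapter could register.
class Sketch_Query_Plugin_Mysql
{
    public function calcFoundRows($query, $flag = true)
    {
        if ($flag) {
            $query->modifiers[] = 'SQL_CALC_FOUND_ROWS';
        }
    }
}

$query = new Sketch_Query();
$query->registerPlugin(new Sketch_Query_Plugin_Mysql());
$query->calcFoundRows(true);       // dispatched to the plugin through __call()
```

The first registered plugin that defines the method wins, which is one reason this approach can feel unclean: name clashes between plugins are resolved silently by registration order.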
Paginating Zend_Search_Lucene results
This short entry was inspired by a snippet of inefficient code I encountered, which involved iterating over an array with a loop and breaking out of it once enough results were fetched.
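Zend_Search_Lucene does not paginate results. It simply returns an array. While it does allow you to return only the first N results (using Zend_Search_Lucene::setResultSetLimit($limit)), this is not all that useful for pagination, since it only caps the result set and does not let you skip ahead to a later page. Instead, fetch the hits and slice out the page you need: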
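$lucene = Zend_Search_Lucene::open('index');
$hits = $lucene->find('author:"mark twain"');
$page = 1;      // 1-based page number
$perpage = 10;  // number of results per page
// Skip the hits of all previous pages, then take one page worth of results.
return array_slice($hits, ($page - 1) * $perpage, $perpage);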
The key element here, of course, is the use of array_slice(), which can be used with any array. So this isn't specific to Zend_Search_Lucene in any way.
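Since the technique works on any array, it can be wrapped in a small helper. paginate() is a hypothetical function name, and the range() call merely stands in for the array of hits returned by find().

```php
<?php
// paginate() is a hypothetical helper name; it works on any array.
function paginate(array $items, $page, $perPage)
{
    $page = max(1, (int) $page);        // pages are 1-based; clamp bad input
    $perPage = max(1, (int) $perPage);
    $offset = ($page - 1) * $perPage;   // items on all previous pages
    return array_slice($items, $offset, $perPage);
}

$hits = range(1, 25);                   // stand-in for the search results
$pageCount = (int) ceil(count($hits) / 10);
$lastPage = paginate($hits, $pageCount, 10);
```

Requesting a page past the end simply yields an empty array, so the caller does not need a separate bounds check.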