Finally bought a Solid State Drive
As an early Christmas present to myself, I took the plunge and finally bought two small 2.5" 60GB solid state drives. They were quite a bit smaller and lighter than I imagined. But in terms of performance, there's quite a difference.
I normally run everything RAID-1, but since everything important is now stored on RAID-1 backed NAS if not Amazon's S3, I figured I could take a little risk for the desktop and run the SSDs in RAID-0. I did some benchmarks to compare. An old 300GB ATA drive did around 50MB/s. My previous 1TB SATA RAID-1 setup performed somewhat better at around 70MB/s. The RAID-0 SSD setup wins by a long shot, with around 320MB/s.
This setup reduces my desktop drive space quite a bit, but that's a good thing. Perhaps it'll be more of an incentive to keep everything neatly on the NAS devices.
Print This Post
Finding Duplicate or Similar Images with Perceptual Hashing in PHP
A while ago on Reddit someone asked how Tineye works. It's pretty fascinating; you upload a photo (or point it at a URL of an image) and it'll find other locations with similar images — if they've been indexed. Even if those images are in different sizes, or have had minor changes made to them, be it due to compression or because someone added or removed some text. So in a way, it's a fuzzy image based search engine.
Although I'm sure tineye has it's own set of algorithms and custom applications to drive all this, something similar, if crude in implementation, can be achieved with available software.
At the center of all this is an image hashing algorithm. Usually (cryptographic) hashes are designed to detect even the slightest modifications and return a completely different hash. We're looking for the opposite, and libphash delivers:
The phash library implements a "perceptual hashing" algorithm. From their site:
A perceptual hash is a fingerprint of a multimedia file derived from various features from its content. Unlike cryptographic hash functions which rely on the avalanche effect of small changes in input leading to drastic changes in the output, perceptual hashes are "close" to one another if the features are similar.
So the idea is that you feed it an image, and it'll return a hash, and the similar two hashes are to one another, the more identical the images are. The comparison of similarity is done by calculating the hamming distance, which in a way is the bit-level version of the levenstein() function that PHP developers may already be familiar with.
Even better, the phash library comes with PHP bindings, and provides a few functions to get you started. For instance, there's the ph_image_hash() function. Simply give it a filename of an image, and it'll return, well, a resource. Now this really puts a damper on the usefulness of that function, since resources are fairly opaque and hard if not impossible to work with.
Fear not, I've made a few changes so that ph_image_hash() returns a plain string with the hexadecimal representation of the hash, which can then be stored in a database, for instance. You can grab phash from my github repository.
Alright, you've got a way to get to those hashes, now how does one index them and look them up in a speedy way? Well, this is where it gets interesting, and unfortunately a little bit theoretical.
Ideally, you'll store these hashes (and other meta data) in a database, but not a SQL database. You really want a vantage point tree, or better, multiple vantage point tree. Essentially these are binary trees that build up based on the distance of hashes. The idea is that hashes that are similar are close together. So you traverse the tree in order to get close to your match, and then just "look around" in that area and you'll likely find similar results.
The MVP tree area gets pretty academic, and from what I've looked at so far, most of it is theoretical, presented as limited in applicability, but at the same time seems to be exactly what such a fuzzy image search engine would need to be based on. I'm fairly certain companies working on various aspects of image recognition and augmented reality and what not are all messing with this sort of thing, so there's likely very little incentive for them to publicize or advertise their algorithms.
The phash library does include a "MVPTree" library with basic examples of how data is stored. Having something like this built out into a scalable data store with an HTTP interface a la SOLR would be fantastic.
I'd immediately work on a PHP application to index my photos, detect duplicates, etc.
Print This Post
HTML5 Game - Connecting to the server
I'm prototyping a multiplayer HTML5 game. The client is the HTML5 concoction of JavaScript, CSS, and whatever one might lump into the "HTML5" technology stack, while the server is the "concurrency oriented" Erlang and mochiweb stack.
The browser will then communicate with the web server using AJAX calls. Eventually this could all be done using WebSockets, but we're not there yet. WebSockets have just experienced a set back due to a problem with the protocol.
With simple AJAX calls (which aren't really AJAX anyway, since JSON is so much nicer to work with) there are two ways to get data from the server.
One is polling. Often. Every second or so. This is rather wasteful, but it's easy to implement. In PHP, for instance, polling often is pretty much the only option since the connection to the server ties up an Apache child process. The PHP portions needs to be as quickly as possible in order to free up the resources for the next person to be able to bang away at the server.
The other option is long-polling (AKA COMET). In this case the browser connects to the server, which then holds onto that connection until data is ready to be sent back. This eats the socket, and whatever additional resources are required by the webserver. And since Erlang appears to be really good at this, I'll do that.
So for the purpose of communicating with the server, I'll use two types of connections. One is the "event listener", waiting for data coming from the web server, that's my long-polling COMET call. Whenever there's data the server wants me to have, it'll send it and then close the connection. On the client side I'll process it, and then immediately fire off the next COMET call, waiting for more data. Mochiweb actually comes with an example of long-polling.
The other connection type is the "action connection". It'll fire for every command that the client is sending to the server. So when I click something in the browser, that'll result in an action call to the server. For that I'll use the regular jQuery .ajax() call.
So essentially to receive data from the server I have my listener AJAX call, and to send data to the server I use my action AJAX call.
Erlang Processes
I'll need an Erlang process that represents my client. It stores my session data, but a bit more than that, actually. I'll basically store all relevant client view state and interact with the rest of the server on the browser's behalf.
So for each client there's a session process. This process gets created when the client makes a login call:
Once this Session process exists, the listener will connect to the server. The server will hold onto the connection, until data is available. The data is then returned to the browser:
Finally, the action AJAX call. It just fires off an action and closes the connection. The Listener Pid here is the hibernating mochiweb process from the previous sequence diagram:
It's rather quick, too. The first feature I implemented was the "ping". JavaScript sends a ping command, along with the current time.
metapoly.net =
{
action: function(action, params) {
$.post('/action/' + action, params);
},
};
metapoly.net.action("ping", {time: (new Date()).getTime()});
The AJAX call will connect to the mochiweb server, which creates a new process to handle the request. The process will then look up which "client view" process is responsible for handling the web browser commands, this is stored in ets. The "client view" process is then sent the ping message. The client view process pattern matches it, and tries to send a response "pong" to the listener process, which is the COMET connection.
% actions such as mouse clicks, text events, etc.
handle_action(Payload, State) ->
case Payload of
{"ping", Params} ->
Pong = {struct, [{'_t', mpu:now()}, {"c", "pong"}|Params]},
io:format("received ping, queueing pong reply: ~p ~p~n", [Params, Pong]),
NewState = State#state{messages = [Pong | State#state.messages]}
end,
?MODULE:loop(NewState).
As soon as the listener has the command, it'll return it to the client. The JavaScript client will invoke the callback, get the new current time, do a diff with the time the ping was sent and displays the results in a div tag.
metapoly.net.e('pong', function(msg) {
var diff = (new Date()).getTime() - msg.time;
$('#ms').text(diff + "ms lag");
});
On localhost, and using Chrome, the fastest I've seen was a 6ms response time. If I hammer the button, it'll come back somewhere between 8 - 20ms. On Firefox, with Firebug open, it'll bounce around between 30ms and 150ms.
With that, the consider the first conversation with the server successful.
Print This Post
HTML5 Large Random Map test
This time it's building random patches. And I've added a layer for sprites. There's two of them and they're static, so again it's not too exciting. I did notice, however, that I'm running into serious performance issues when building these patches. So this technique isn't going to fly.
Maybe it's faster to farm out the work to web worker threads, but they don't have access to the DOM and can't use any HTML elements. So I'd have to pass everything as an array (from getImageData()) , let the web worker assemble the patch, which is then handed back as an array. I haven't tried that yet.
The other possibility is to have the server generate the patches. Google maps style. The immediate downside is that I'm no longer saving bandwidth, since the same tiles that were previously displayed on the client are now generated by the server. Maybe patches could be hashed and different areas of the map could be served by the same image, if the hash matches. That would also involve some Erlang. So far it's all just been JavaScript.
Print This Post



