Sunday, May 25, 2008

Framework Darwinism on the bleeding edge of technology

I was recently looking for information on how to handle session-related data on Google App Engine. Without too much effort, I found three different approaches to the problem:
  • Appengine Utilities, a simple solution that targeted the simple use case I was looking for
  • Django App Engine Utilities, an extension for the very successful Django framework (it requires upgrading to a newer Django version than the one that ships with App Engine, though)
  • Beaker, a relatively sophisticated caching library that supposedly ties into multiple frameworks
I was very happy that I would be able to use an existing solution. I was even happier that enough people cared about App Engine to create useful code and put it out for everyone to use. Alas, it also made me a little worried: how would I be able to choose the right solution in the long run? Should I go with something that
  • just works for my simple use case?
  • works for the framework I currently use but requires me to upgrade?
  • does more than I currently need, just because it "might" be useful in the future?
The problem with a relatively young technology is that there is an explosion of libraries, and it is anyone's guess which of them will work out best in the long term. We have all been in similar situations before. In early 2000, for example, my employer at the time was building a complex Java-based system and had to decide which logging mechanism to use. At that point, the official logging standard was still in its drafting stage, and the existing frameworks were only partially compatible with it. Eventually, we decided to go with a framework from IBM alphaWorks, but to wrap it in a layer that was closer to the draft standard of that time. Did we make the right decision? Hard to say in retrospect -- especially since we had quite a few do-overs in the next couple of years. Other frameworks, especially log4j, emerged that offered far more features and better performance than what we had. At the same time, we tried to play catch-up with changes in the official standard (some of them significant) without having to rework all of our existing logging code. In the end, we were left with a home-brewed wrapper, not unlike Commons Logging, that did not quite match the standard but did its job for us.

So what's my point? Forget about the open source projects out there and reinvent the wheel? Not quite: if there is something well established that does the job, use it by all means. However, for a young technology like App Engine, things can be expected to stay in flux for a while. The definitive tools of the trade are yet to emerge -- whatever third-party libraries you pick, you can bet that a certain percentage of them will become obsolete during the lifetime of your project. Unless they are very easy to rip out and replace, always consider a thin layer of isolation between them and your own code. At least, that's what I will do for my session cache.
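
To make the idea of a thin isolation layer more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `SessionStore`, `InMemorySessionStore` and `record_visit` are hypothetical names, and the in-memory backend merely stands in for whichever session library ends up being chosen. It does not use the API of any of the three libraries mentioned above.

```python
import time
from abc import ABC, abstractmethod


class SessionStore(ABC):
    """Hypothetical interface: the only session API the application sees."""

    @abstractmethod
    def load(self, session_id):
        """Return the session's data as a dict ({} if unknown or expired)."""

    @abstractmethod
    def save(self, session_id, data):
        """Store the given dict under this session id."""

    @abstractmethod
    def delete(self, session_id):
        """Forget the session; deleting an unknown id is not an error."""


class InMemorySessionStore(SessionStore):
    """Stand-in backend. A real adapter would delegate to a session library."""

    def __init__(self, ttl_seconds=3600, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # session_id -> (expires_at, data)

    def load(self, session_id):
        entry = self._entries.get(session_id)
        if entry is None:
            return {}
        expires_at, data = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and behave as if it never existed.
            del self._entries[session_id]
            return {}
        return dict(data)        # copy, so callers cannot mutate the store

    def save(self, session_id, data):
        self._entries[session_id] = (self._clock() + self._ttl, dict(data))

    def delete(self, session_id):
        self._entries.pop(session_id, None)


def record_visit(store, session_id):
    """Application code: depends on SessionStore only, never on a backend."""
    session = store.load(session_id)
    session["visits"] = session.get("visits", 0) + 1
    store.save(session_id, session)
    return session["visits"]
```

With this shape, replacing an obsolete library would mean writing one new `SessionStore` subclass and changing the single place where the store is constructed; functions like `record_visit` would stay untouched. The price is that the application can only use the features the interface exposes.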

1 comment:

Joe Bowman said...

Hi, I was googling my appengine-utilities project and found it mentioned on your blog. I thought I'd let you know I'm working on building the session class for the project to support everything the default Django session class supports, and then on building a wrapper for it so you can use it as MiddleWare. The point is, if you wanted to take your application off of GAE, you'd just have to change your MiddleWare back to the default Django version in your settings file to have session support. I have a strong interest in making sure that the application I'm working on, which is the reason I started the open source project, can come off of GAE very easily.

I'm currently getting it to play nice with memcache in GAE before wrapping up the MiddleWare. The version in my svn trunk already supports object storage; the version you would have downloaded only supports strings.

I'll have a release once I finish wrapping up the issues on the project page.