Sunday, February 22, 2009

Higher level hooks

I recently got a mail from a fellow App Engine enthusiast regarding my series on using hooks in Google App Engine. He had played with the low-level hooks, and they had worked for him in some situations. However, he also needed model-specific hooks; higher-level functions that he could use only on a specific class of models. Common use cases would be additional validation before a "foo"-Model is saved, or updating counters specific to only one Model class with special properties. While this was achievable on the protocol buffer level, it was a little bit cumbersome and not overly developer friendly (my own interpretation here, he did not write it like that).

I fully agree that there has to be a better way, so let's talk about it in this post. My fellow developer is right: lower-level hooks are a tool for advanced hacking and should only be used when everything else fails. For most use cases, working with the higher-level APIs that the App Engine team gave us should be just fine. If it is mostly about checking a particular property, providing a custom validator for that field should be enough (see this previous article from my blog). In this post, let's take a shot at a different approach. First, we build a very simple script that introduces a new class, HookedModel, and a simple concrete example that uses it:

# Setup code, not needed in a real App Engine app
from google.appengine.api import apiproxy_stub_map
from google.appengine.api import datastore_file_stub
import os
os.environ['APPLICATION_ID'] = 'test'
stub = datastore_file_stub.DatastoreFileStub('test', None, None)
apiproxy_stub_map.apiproxy.RegisterStub('datastore_v3', stub)

# A Model with pre-write/post-read hooks
from google.appengine.ext import db

class HookedModel(db.Model):
  """A subclass of model that provides hooks for extra checks."""

  def pre_write(self):
    """Called before a model is written to the store."""
    pass

  def post_read(self):
    """Called after a model is read from the store."""
    pass


class TestModel(HookedModel):
  """A model that uses hooks:
  Upon save/load, it prints out the content of a property.
  """
  text = db.StringProperty(default='some text')

  def pre_write(self):
    print 'Writing %s' % self.text

  def post_read(self):
    print 'Reading %s' % self.text


save_this = TestModel()
key = save_this.put()

load_this = TestModel.get(key)
print 'Done :-)'


Our new Model class has two hooks, pre_write and post_read. Our subclass TestModel provides actual implementations. Right now, nothing calls these hooks, so the only output we see on the screen is this:

Done :-)


So, how can we change that? As mentioned before, the Model framework is actually the highest layer on a stack of APIs. When saving a model to the store, the Model class has to actually translate the object into the lower-level data format. This operation is done in a method called _populate_internal_entity, which we will override in our HookedModel:

  def _populate_internal_entity(self, *args, **kwds):
    """Introduces hooks into the entity storing process."""
    self.pre_write()
    return db.Model._populate_internal_entity(self, *args, **kwds)


If we run our script again, we see that we have made some progress:

Writing some text
Done :-)


Now for the second hook. My first instinct was to replace the class method from_entity, but that proved to be beyond my Python skills. The problem was in a check that the original method did:

  @classmethod
  def from_entity(cls, entity):
    if cls.kind() != entity.kind():
      raise KindError('Class %s cannot handle kind \'%s\'' %
                      (repr(cls), entity.kind()))

    entity_values = cls._load_entity_values(entity)
    instance = cls(None, _from_entity=True, **entity_values)
    instance._entity = entity
    del instance._key_name
    return instance


Since I did not want to replicate code, my replacement method would simply call Model.from_entity(entity), which would result in a KindError being thrown: called that way, cls is Model itself, so cls.kind() no longer matches the kind of the entity. Fortunately, a second look at the implementation revealed that there was an easier way: when constructing the new model instance, from_entity sets an argument called _from_entity to True when calling the constructor. _from_entity is "intentionally undocumented" in the Model constructor, but if I had to take an educated guess, I would assume that it is set to True whenever a Model is constructed from the lower-level Entity object in the API stack. Based on this assumption, we should be able to create our own custom constructor of HookedModel:

  def __init__(self, *args, **kwds):
    """Introduces hooks into the entity loading process."""
    db.Model.__init__(self, *args, **kwds)
    if kwds.get('_from_entity', False):
      self.post_read()


If we run our test another time, we finally get the intended result:

Writing some text
Reading some text
Done :-)
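As an aside on the from_entity route: the KindError comes from calling the method on Model directly, which binds cls to Model. A possible way around that, which I have not tried against the real SDK, is to delegate with super() so that cls stays bound to the concrete subclass. The sketch below only illustrates the Python mechanics. Model, KindError and Note are simplified stand-ins invented for this example (entities are plain dicts), not the actual db classes.

```python
class KindError(Exception):
    """Stand-in for db.KindError (assumption, not the SDK class)."""


class Model(object):
    """Hypothetical, heavily simplified stand-in for db.Model."""

    def __init__(self, **values):
        self.values = values

    @classmethod
    def kind(cls):
        return cls.__name__

    @classmethod
    def from_entity(cls, entity):
        # Same kind of check as in the original method.
        if cls.kind() != entity['kind']:
            raise KindError('Class %r cannot handle kind %r'
                            % (cls, entity['kind']))
        return cls(**entity['values'])


class HookedModel(Model):
    def post_read(self):
        """Called after a model is read from the store."""
        pass

    @classmethod
    def from_entity(cls, entity):
        # super() keeps cls bound to the subclass, so the kind check passes.
        instance = super(HookedModel, cls).from_entity(entity)
        instance.post_read()
        return instance


class Note(HookedModel):
    def post_read(self):
        self.was_read = True


entity = {'kind': 'Note', 'values': {'text': 'some text'}}

# Calling the base class directly binds cls to Model: the check fails.
try:
    Model.from_entity(entity)
    direct_call_failed = False
except KindError:
    direct_call_failed = True

# Going through the subclass and super() works and fires the hook.
loaded = Note.from_entity(entity)
```

Whether this holds up against the real from_entity is an open question. The constructor-based approach in this post has the advantage of not touching from_entity at all.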


Let's take a quick look at the final HookedModel:

class HookedModel(db.Model):
  """A subclass of model that provides hooks for extra checks."""

  def pre_write(self):
    """Called before a model is written to the store."""
    pass

  def post_read(self):
    """Called after a model is read from the store."""
    pass

  def _populate_internal_entity(self, *args, **kwds):
    """Introduces hooks into the entity storing process."""
    self.pre_write()
    return db.Model._populate_internal_entity(self, *args, **kwds)

  def __init__(self, *args, **kwds):
    """Introduces hooks into the entity loading process."""
    db.Model.__init__(self, *args, **kwds)
    if kwds.get('_from_entity', False):
      self.post_read()
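To connect this back to the original request (extra validation before a "foo" model is saved), here is a minimal sketch of a pre_write hook that rejects bad data by raising an exception. So that it runs without the SDK, it uses FakeModel, a made-up stand-in that stores values in a dict. Only the override pattern is the same as above. FakeModel, Foo and the ValueError are assumptions for illustration.

```python
class FakeModel(object):
    """Hypothetical stand-in for db.Model; keeps saved values in a dict."""
    _store = {}

    def __init__(self, **values):
        for name, value in values.items():
            setattr(self, name, value)

    def _populate_internal_entity(self):
        # The real method builds a datastore entity; here we copy the fields.
        return dict(vars(self))

    def put(self):
        key = len(FakeModel._store) + 1
        FakeModel._store[key] = self._populate_internal_entity()
        return key


class HookedModel(FakeModel):
    def pre_write(self):
        """Called before a model is written to the store."""
        pass

    def _populate_internal_entity(self, *args, **kwds):
        # Run the hook first; an exception here aborts the save.
        self.pre_write()
        return FakeModel._populate_internal_entity(self, *args, **kwds)


class Foo(HookedModel):
    def pre_write(self):
        # Model-specific validation: refuse to save an empty text.
        if not self.text.strip():
            raise ValueError('text must not be empty')


good_key = Foo(text='some text').put()

try:
    Foo(text='   ').put()
    rejected = False
except ValueError:
    rejected = True
```

Because the hook runs before anything is handed to the store, raising from pre_write leaves the stored data untouched in this sketch. With the real HookedModel, I would expect the exception to propagate out of put() in the same way, but that is untested here.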


Not only have we added a generic way of executing code before writing to (or after reading from) the store, we have done it using standard object-oriented techniques. No need to hack or monkeypatch! This shows again what a terrific job the App Engine team has done in providing an extensible set of base classes that we as end users can customize and make useful in ways we see fit. Doing this is important but not always easy (see this talk by Joshua Bloch for some of the finer philosophical details). It is well worth it though, and I think it will help keep the user community happy and inspire us to come up with many new cool ways of putting App Engine to use.

2 comments:

Bill said...

Thanks a lot for the article, Jens. Very helpful.

Kristaps Rāts said...

This might work, but a method name starting with an underscore is convention for class internals. By using those you might find yourself in a world of hurt someday since it's not public API and may change without notice, take care.