Tuesday, June 16, 2009

Making non-sharded counters perform

Sometimes, design decisions of the past come back to haunt me when I least expect it. For example, imagine you have a persistent model with a property that needs to be increased and decreased in certain situations:

class SomeModelWithCounter(db.Model):
name = db.StringProperty()
count = db.IntegerProperty(default=0)
other_details = db.StringProperty()


Normally, my gut reaction would be to yell out "shard your counters", but there are certain things that do not work overly well with sharding. Some examples:


  • If a sharded counter is not present in memcache, it will require several datastore reads to get. In this particular app, I have tens of thousands (or beyond) of these models and need to display random subsets of them in tabular form. Chances are that most of these counters are not cached, so I will have to get them one by one from the datastore. Let's assume that I have to display 50 model instances with an average of 20 shards per count. Compiling the table for display will require loading roughly an additional thousand objects from the store.

  • Assumed that I would need to use the counters for pagination (think of actions like "show me all instances with a count greater than 500 that start with the letter a"). Having the count being explicitely part of the model makes such a thing very easy to do.



When I started toying with App Engine a bit more than a year ago, I built an application that was a very similar use case -- so I decided against sharding. I modelled my data in a similar fashion as shown above and built a relatively straightforward increment-function:

def increment_counter(model_id, delta):
assert delta >= 0
model = SomeModelWithCounter.get_by_key_name(model_id)
model.count += delta
model.put()


Since I did not want data to get overwritten, I then made sure that the method would always be called in a transactional way, such as

db.run_in_transaction(increment_counter, 'foo', 1)


While this worked out well for quite some time, it was certainly not ideal. In particular, as the application attracted more users, I noticed an increase in issues that were related to resource contention. The error rate increased, requests took longer to execute, and the application therefore became less user-friendly. For a while, I contemplated adding up in mecache and only saving occasionally (see fastpageviews for an implementation of this approach). The problem was that that would only work if a particular counter was hit often enough (otherwise, the counts would just get evicted from memcache eventually before they could get written). Also, it mean that, while most of my users would not see the overhead of writing to the store, some requests would still take significantly longer than others. Wasn't there an alternative, so that the user would never have to wait?

After a couple of tries, I decided to decouple the increase of the counter from the user-facing action. I am using cron jobs for now, but I assume that this pattern should be easily ported when offline processing becomes available. Here is what I did:

The idea is to use the atomic increase/decrease to accumulate counts in memcache and have a cronjob apply them outside of the user request. I started out with an utility function to make memcache access easy:

def change_memcache_counter(counter_name, delta):
"""Increases or decreases a memcache counter atomically."""

# use memcache.incr or memcache.decr?
func = memcache.incr
if delta < 0:
func = memcache.decr
delta = -delta

# If the counter exists, just apply the change
result = func(counter_name, delta)

# If it does not exist, create it
if not result:
memcache.add(counter_name, 0)
result = func(counter_name, delta)

# Done
return result


Using this helper, I can write a replacement for my old increment_counter. Counts are getting accumulated under memcache-ids of /c/datastore_key. In addition, I put two additional values into memcache:


  • I maintain a separate counter /todo of how many count operations have been done in memcache so far.

  • I store the id of the model I incremented under the current value of the /todo count.


TODO_COUNTER = '/todo'
COUNTER_PREFIX = '/c/%s'


def increment_counter_later(model_id, delta):

# Increment the particular counter
change_memcache_counter(COUNTER_PREFIX % model_id, delta)

# Store the name of the counter in a "bucket" with increasing number
bucket_id = change_memcache_counter(TODO_COUNTER, 1)
memcache.set(str(bucket_id), model_id)


Storing this sequence of "buckets" creates a log of operations, which I can now replay in an independent cron job. The following utility method walks along the /todo and finds me an operation I have to apply to the data store:

DONE_COUNTER = '/done'


def get_task(last_bucket):
# Loop until we find something or run out of buckets
while True:

# Is there a bucket left?
current_bucket = counter.change_memcache_counter(
DONE_COUNTER, 1)
if current_bucket > last_bucket:
memcache.set(DONE_COUNTER, last_bucket)
return None

# Is there anything in the current bucket?
# If so, take it out and return it
task = memcache.get(str(current_bucket))
if task:
memcache.delete(str(current_bucket))
count = memcache.get(counter.COUNTER_PREFIX % task)
if count:
counter.change_memcache_counter(
counter.COUNTER_PREFIX % task, -count)
return (task, count)


If we arrive at a situation where current_bucket>last_bucket, we know that we have caught up with the log. By deleting applied buckets and counts from memcache, we also make sure that the cron job may be hit in parallel (for example by creating several entries in cron.yaml for higher throughput) without applying the same count twice. As a nice side-effect, this also means that a single counter may be increased many times between cronjob executions, but it will only result in a single datastore write (afterwards, the count will be zero, so all following buckets with the same id can be skipped).

With this helper method complete, the actual implementation becomes quite straightforward:

TASKS_PER_REQUEST = 5


def main():

# Print some output for easy debugging in the browser
print 'Content-Type: text/html'
print ''

# What's the last bucket that might contain data?
last_bucket = counter.change_memcache_counter(
counter.TODO_COUNTER, 0)
print 'Running up to bucket #%s' % last_bucket

# We will try to increase up to N counters
for i in range(TASKS_PER_REQUEST):
task_count = get_task(last_bucket)
if not task_count:
break
db.run_in_transaction(
counter.increment_counter,
task_count[0],
task_count[1])
print 'Increased %s by %s' % task_count
print 'Done'

if __name__ == "__main__":
main()


Once offline processing becomes available, I should be able to simply replace the process of creating the log in memcache and hitting it from a cronjob by putting the log entries into task queues. The rest of the concept (accumulating in memcache and applying the counters asynchronously) should still apply.

6 comments:

mrgoldenbrown said...

I just want to thank you for sharing your experiences with app engine. I just used your posting on unit testing to get my jdo testing infrastructure set up, and just got a green bar. Welcome to my rss feed, for what it's worth :)

The App Engine Fan said...

Thanks, I'm glad it was of help to you :-)

Søren Bjerregaard Vrist said...

Would you consider sketching this with the task queue api instead of cron ?

http://googleappengine.blogspot.com/2009/06/new-task-queue-api-on-google-app-engine.html

The App Engine Fan said...

Yes, I'll do a blog post about this at some point. I have migrated the app from this article towards tasks a while ago, and I think I ought to do a post on this in the next few weeks. There are a few interesting gotchas if you want to avoid running out of quota, and I was struggling with that initially.

Are you just generally interested, or do you have a similar problem you'd like to solve? I'm mostly asking whether I should hurry ;-)

Søren Bjerregaard Vrist said...

I "descoped" the functionallity for now but I got the rewrite bug so mostly interest :)

The App Engine Fan said...

Makes sense. Check the blog again in 2-4 weeks :-) Thanks for your interest.

Cheers,

Jens