Saturday, July 4, 2009

Faux Sockets for rich Java clients

It's been a while, so I figured I should be posting something at some point ;-) As previously mentioned, my current pet project is porting an existing Java application (the JOGRE Game server) onto the App Engine platform. As it turns out, not only is doing such a thing tricky, blogging about it isn't much easier either. This post addresses one aspect of such a conversion: socket-based communication, but on a slightly simplified example. Instead of the full game server, let's try to port this Multi-User Chat application that I found on the interwebs.

The application behaves similarly to the examples from The Java Tutorial (see http://java.sun.com/docs/books/tutorial/networking/sockets/clientServer.html). A server-side ServerSocket listens for incoming connections. A client connects using a Socket and uses the readLine method of a BufferedReader to read data from the socket. It uses the println method of a PrintWriter to push data back to the server.

Converting this kind of application can be tricky: the server here is stateful, while App Engine is not. Also, sockets assume that a client-server connection is always online. Clients can do blocking reads, which simply put them into a "waiting" state until new data from the server arrives. This kind of behavior can be a bit tough to reproduce in a webservice-based environment that expects a client/server interaction to be done within a couple of seconds or less. How can we do it? Let's take a look.

(PS: Since the source code of the chat application does not really tell me what license it is under, I am not going to reprint it here in much detail. Instead, I am going to show how to port the existing client, and then simply reimplement the server side from scratch.)

The client


As mentioned before, the two methods mostly used in this example (and many others, for what it's worth), are println and readLine. As long as we successfully substitute these two methods, we should be fine.

To avoid reinventing the wheel, I built a little tool class called ClientSocketSubstitute. This class provides the very same methods mentioned before. It caches outgoing communication in a queue and sends it out in batches (as JSON arrays) to the server. Likewise, it receives batches from the server in those same polling requests. The first thing I do is to replace our incoming and outgoing streams with that substitute:

// BufferedReader in;
// PrintWriter out;
ClientSocketSubstitute in;
ClientSocketSubstitute out;


Since my helper class provides println and readLine implementations, my code mostly compiles. The only area that shows problems is where the Socket would get initialized. I therefore replace the initialization code accordingly:

// Make connection and initialize streams
// String serverAddress = getServerAddress();
// Socket socket = new Socket(serverAddress, 9001);
// in = new BufferedReader(new InputStreamReader(
//     socket.getInputStream()));
// out = new PrintWriter(socket.getOutputStream(), true);
in = new ClientSocketSubstitute(
    new URL(getServerAddress()), // Connection URL
    1000,                        // ping frequency
    5                            // max. messages per ping
);
out = in;
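
To make the batching idea more concrete, here is a rough sketch of what such a substitute does. It is written in Python purely for brevity and is not the toolkit's Java code: the class name PollingLineSocket, the ping and read_line methods, and the echo_transport stand-in for the server are all made up for illustration. Unlike the real readLine, which blocks, this sketch simply returns None when nothing has arrived yet.

```python
import json
from collections import deque


class PollingLineSocket:
    """Hypothetical stand-in for a socket that talks via polling requests."""

    def __init__(self, transport, max_per_ping=5):
        self._transport = transport      # callable: JSON text in, JSON text out
        self._max_per_ping = max_per_ping
        self._outbox = deque()           # lines waiting to be sent
        self._inbox = deque()            # lines received but not yet read

    def println(self, line):
        """Queue a line; nothing goes over the wire until the next ping."""
        self._outbox.append(line)

    def ping(self):
        """Send one batch of queued lines and collect whatever comes back."""
        count = min(self._max_per_ping, len(self._outbox))
        batch = [self._outbox.popleft() for _ in range(count)]
        reply = self._transport(json.dumps(batch))
        self._inbox.extend(json.loads(reply))

    def read_line(self):
        """Return the next received line, or None if nothing has arrived."""
        return self._inbox.popleft() if self._inbox else None


def echo_transport(request_body):
    """Pretend server: answers each line with an upper-cased copy."""
    return json.dumps([line.upper() for line in json.loads(request_body)])


sock = PollingLineSocket(echo_transport, max_per_ping=2)
for text in ("a", "b", "c"):
    sock.println(text)
sock.ping()                       # sends ["a", "b"]; "c" stays queued
first_batch = [sock.read_line(), sock.read_line(), sock.read_line()]
sock.ping()                       # sends ["c"]
second_batch = [sock.read_line()]
```

In a real client, ping would run on a timer (once per second with the settings above) and the transport would be an HTTP POST to the server URL.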


That's it -- let's continue with the server.

The server


The client-server protocol of this example is fortunately quite simple:

  • Once the socket communication is established, the server sends the text SUBMITNAME to the client.

  • The client responds with a line of text that contains the nickname of the participant in the chat. The server responds with NAMEACCEPTED.

  • Any text sent from the client afterwards is a chat message. Anything sent from the server afterwards is a chat message and starts with MESSAGE.



Like with the client, I built a server-side socket replacement that makes it easy to implement such a protocol. Outgoing connections are replaced with a ServerEndpoint class that has a send method. Outgoing messages are buffered in the data store and sent out the next time the polling client connects. In addition, an endpoint has a bag of properties that the server can use to remember certain connection-specific attributes. Incoming messages arrive through a Receiver interface that our web service can implement. The overall communication is controlled by a tool class called WebConnectionServer.

Let's look at the actual implementation. We base our implementation on the servlet API and initialize an internal WebConnectionServer instance:

package com.appenginefan.sample.chat;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.*;

import com.appenginefan.toolkit.common.ServerEndpoint;
import com.appenginefan.toolkit.common.WebConnectionServer;
import com.appenginefan.toolkit.persistence.DatastorePersistence;
import com.appenginefan.toolkit.persistence.Persistence;

@SuppressWarnings("serial")
public class ChatserverServlet
    extends HttpServlet
    implements WebConnectionServer.Receiver {

  private Persistence<byte[]> data;
  private WebConnectionServer server;

  /**
   * Set up the local persistence and the server
   * implementation.
   */
  @Override
  public void init() throws ServletException {
    super.init();

    // Persist communication data in the store
    // under a particular socket name
    data = new DatastorePersistence("Socket");

    // Build a connection server around that server
    server = WebConnectionServer.fromPeristence(data);
  }


Note that the underlying storage uses the Persistence interface that I introduced a little while ago. The concrete implementation currently uses the data store, but there is nothing preventing me from switching to something else (like memcache or a combination of those two) if I needed better performance at some point. The WebConnectionServer uses a protocol buffer based schema internally to hold on to the buffered messages.

As for the servlet itself, its job is mostly to delegate all the parsing logic to the WebConnectionServer:

  /**
   * Plugs an incoming request into the server.
   */
  public void doPost(HttpServletRequest req,
      HttpServletResponse resp) throws IOException {
    if (!server.dispatch(this, req, resp)) {
      resp.sendError(404);
    }
  }


The more interesting part is our implementation of the Receiver callbacks. Let's start with a new connection happening. We have not received any data yet; the server is expected to send a SUBMITNAME. Notice how the callback uses a property in the faux socket to determine whether or not to send out that information:

  @Override
  public void onEmptyPayload(WebConnectionServer server,
      ServerEndpoint socket, HttpServletRequest req) {

    // This is the first communication, so we ask for a name once
    if (socket.getProperty("namerequested", null) == null) {
      socket.setProperty("namerequested", "true");
      socket.send("SUBMITNAME");
    }
  }


Now that the SUBMITNAME is out, we can sit back and wait for the messages to come in. The first message is going to be the user's name (which we store in another property). Anything else is incoming data that should be broadcast to every participant we know.

  @Override
  public void receive(
      WebConnectionServer theServer,
      ServerEndpoint socket,
      String theMessage,
      HttpServletRequest request) {

    final String name = socket.getProperty("name", null);

    // Are we looking for a name?
    if (name == null) {
      socket.setProperty("name", theMessage.trim());
      socket.send("NAMEACCEPTED");
    }

    // Otherwise, this must be an incoming message.
    // Forward it to everybody
    else {
      for (String handle :
          data.keyScan("", "" + Character.MAX_VALUE, 100)) {
        theServer.fromHandle(request, handle).send(
            "MESSAGE " + name + ": " + theMessage);
      }
    }
  }


For the broadcast, we use the keyScan method of our Persistence interface to get a list of all sockets known to our server. It should be noted that a broadcast like this is highly inefficient in production code and should be avoided. This kind of socket replacement is intended for a small number of recipients (in the case of the JOGRE game server, it is assumed that board games will usually have fewer than a dozen players per game).
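
To see the whole exchange in one place, here is a small in-memory model of the protocol. It is a Python sketch and not the toolkit itself: the Endpoint and ChatServer classes and the poll method are hypothetical, endpoints live in a plain dict instead of the data store, and one call to poll stands for one polling request from a client.

```python
class Endpoint:
    """Hypothetical faux socket: an outbox plus a bag of properties."""

    def __init__(self):
        self.properties = {}
        self.outbox = []

    def send(self, message):
        self.outbox.append(message)  # buffered until the client polls again


class ChatServer:
    """Keeps one Endpoint per connection handle (in memory, for illustration)."""

    def __init__(self):
        self.endpoints = {}

    def poll(self, handle, incoming):
        """Handle one polling request and return the buffered replies."""
        endpoint = self.endpoints.setdefault(handle, Endpoint())
        if not incoming:
            self.on_empty_payload(endpoint)
        for message in incoming:
            self.receive(endpoint, message)
        replies, endpoint.outbox = endpoint.outbox, []
        return replies

    def on_empty_payload(self, endpoint):
        # Ask for a name exactly once per connection
        if "namerequested" not in endpoint.properties:
            endpoint.properties["namerequested"] = "true"
            endpoint.send("SUBMITNAME")

    def receive(self, endpoint, message):
        name = endpoint.properties.get("name")
        if name is None:
            # The first line from the client is the nickname
            endpoint.properties["name"] = message.strip()
            endpoint.send("NAMEACCEPTED")
        else:
            # Everything after that is chat, forwarded to every endpoint
            for other in self.endpoints.values():
                other.send("MESSAGE %s: %s" % (name, message))


server = ChatServer()
greeting = server.poll("alice", [])            # server asks for a name
accepted = server.poll("alice", ["Alice"])     # name is stored
server.poll("bob", [])
server.poll("bob", ["Bob"])
alice_sees = server.poll("alice", ["hello"])   # own message comes back too
bob_sees = server.poll("bob", [])              # Bob picks it up on his next poll
```

Note that Bob only receives the message when he polls again, which is exactly the latency a polling-based faux socket trades for running without a persistent connection.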

(PS: You can find the tool classes used in this post under Apache license at this open source project)

Tuesday, June 16, 2009

Making non-sharded counters perform

Sometimes, design decisions of the past come back to haunt me when I least expect it. For example, imagine you have a persistent model with a property that needs to be increased and decreased in certain situations:

from google.appengine.ext import db


class SomeModelWithCounter(db.Model):
    name = db.StringProperty()
    count = db.IntegerProperty(default=0)
    other_details = db.StringProperty()


Normally, my gut reaction would be to yell out "shard your counters", but there are certain things that do not work overly well with sharding. Some examples:


  • If a sharded counter is not present in memcache, it will require several datastore reads to get. In this particular app, I have tens of thousands (or beyond) of these models and need to display random subsets of them in tabular form. Chances are that most of these counters are not cached, so I will have to get them one by one from the datastore. Let's assume that I have to display 50 model instances with an average of 20 shards per count. Compiling the table for display will require loading roughly an additional thousand objects from the store.

  • Assume that I need to use the counters for pagination (think of actions like "show me all instances with a count greater than 500 that start with the letter a"). Having the count explicitly be part of the model makes such a thing very easy to do.
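
The read cost from the first point can be illustrated with a toy model. This is only a sketch: CountingStore is a made-up in-memory stand-in for the datastore that counts reads, and the shard layout is an assumption about how a sharded counter might be stored.

```python
import random


class CountingStore:
    """Hypothetical in-memory datastore that counts how many reads it serves."""

    def __init__(self):
        self.entities = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.entities.get(key, 0)


def read_sharded_total(store, name, num_shards):
    """A sharded counter needs one read per shard to compute its total."""
    return sum(store.get((name, shard)) for shard in range(num_shards))


store = CountingStore()
NUM_MODELS, NUM_SHARDS = 50, 20
for model in range(NUM_MODELS):
    for shard in range(NUM_SHARDS):
        store.entities[("model-%d" % model, shard)] = random.randint(0, 5)

# Building one table of 50 rows touches every shard of every counter
totals = [read_sharded_total(store, "model-%d" % m, NUM_SHARDS)
          for m in range(NUM_MODELS)]
print(store.reads)  # 1000
```

With the count stored directly on the model, the same table needs no extra reads at all, because the value arrives with each of the 50 entities.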



When I started toying with App Engine a bit more than a year ago, I built an application that was a very similar use case -- so I decided against sharding. I modelled my data in a similar fashion as shown above and built a relatively straightforward increment-function:

def increment_counter(model_id, delta):
    assert delta >= 0
    model = SomeModelWithCounter.get_by_key_name(model_id)
    model.count += delta
    model.put()


Since I did not want data to get overwritten, I then made sure that the method would always be called in a transactional way, such as

db.run_in_transaction(increment_counter, 'foo', 1)


While this worked out well for quite some time, it was certainly not ideal. In particular, as the application attracted more users, I noticed an increase in issues that were related to resource contention. The error rate increased, requests took longer to execute, and the application therefore became less user-friendly. For a while, I contemplated accumulating counts in memcache and only saving occasionally (see fastpageviews for an implementation of this approach). The problem was that this would only work if a particular counter was hit often enough (otherwise, the counts would eventually get evicted from memcache before they could be written). Also, it meant that, while most of my users would not see the overhead of writing to the store, some requests would still take significantly longer than others. Wasn't there an alternative, so that the user would never have to wait?

After a couple of tries, I decided to decouple the increase of the counter from the user-facing action. I am using cron jobs for now, but I assume that this pattern should be easily ported when offline processing becomes available. Here is what I did:

The idea is to use the atomic increase/decrease to accumulate counts in memcache and have a cronjob apply them outside of the user request. I started out with a utility function to make memcache access easy:

from google.appengine.api import memcache


def change_memcache_counter(counter_name, delta):
    """Increases or decreases a memcache counter atomically."""

    # use memcache.incr or memcache.decr?
    func = memcache.incr
    if delta < 0:
        func = memcache.decr
        delta = -delta

    # If the counter exists, just apply the change
    result = func(counter_name, delta)

    # If it does not exist, create it. Checking for None keeps a
    # legitimate result of 0 from triggering a second round trip.
    if result is None:
        memcache.add(counter_name, 0)
        result = func(counter_name, delta)

    # Done
    return result
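
To try the helper without an App Engine environment, it can be run against a small fake. The FakeMemcache class below is made up for this illustration and only mimics the three calls the helper needs. It assumes, to the best of my understanding of the memcache API, that incr and decr return None for a missing key and that decr never goes below zero. The copy of the helper in this sketch tests for None explicitly, so that a result of 0 is not mistaken for a missing counter.

```python
class FakeMemcache:
    """Tiny in-memory stand-in that mimics the memcache calls used here."""

    def __init__(self):
        self.values = {}

    def add(self, key, value):
        # Like memcache.add: only stores the value if the key is new
        if key in self.values:
            return False
        self.values[key] = value
        return True

    def incr(self, key, delta=1):
        if key not in self.values:
            return None              # missing key: nothing to increment
        self.values[key] += delta
        return self.values[key]

    def decr(self, key, delta=1):
        if key not in self.values:
            return None
        self.values[key] = max(0, self.values[key] - delta)  # never below 0
        return self.values[key]


memcache = FakeMemcache()


def change_memcache_counter(counter_name, delta):
    """Same logic as the helper above, running against the fake."""
    func = memcache.incr
    if delta < 0:
        func = memcache.decr
        delta = -delta
    result = func(counter_name, delta)
    if result is None:               # counter did not exist yet
        memcache.add(counter_name, 0)
        result = func(counter_name, delta)
    return result


created = change_memcache_counter('/c/foo', 3)    # creates the counter: 3
raised = change_memcache_counter('/c/foo', 2)     # 5
lowered = change_memcache_counter('/c/foo', -4)   # 1
floored = change_memcache_counter('/c/foo', -10)  # capped at 0
```

The last call shows a limitation worth keeping in mind: because decrements stop at zero, a counter in memcache cannot accumulate a net negative change.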


Using this helper, I can write a replacement for my old increment_counter. Counts are getting accumulated under memcache-ids of /c/datastore_key. In addition, I put two additional values into memcache:


  • I maintain a separate counter /todo of how many count operations have been done in memcache so far.

  • I store the id of the model I incremented under the current value of the /todo count.


TODO_COUNTER = '/todo'
COUNTER_PREFIX = '/c/%s'


def increment_counter_later(model_id, delta):

    # Increment the particular counter
    change_memcache_counter(COUNTER_PREFIX % model_id, delta)

    # Store the name of the counter in a "bucket" with increasing number
    bucket_id = change_memcache_counter(TODO_COUNTER, 1)
    memcache.set(str(bucket_id), model_id)


Storing this sequence of "buckets" creates a log of operations, which I can now replay in an independent cron job. The following utility method walks along the /todo and finds me an operation I have to apply to the data store:

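# Assumed imports: the App Engine APIs plus the module (called
# "counter" here) that holds the helper functions shown earlier.
from google.appengine.api import memcache
from google.appengine.ext import db

import counter

DONE_COUNTER = '/done'


def get_task(last_bucket):
    # Loop until we find something or run out of buckets
    while True:

        # Is there a bucket left?
        current_bucket = counter.change_memcache_counter(
            DONE_COUNTER, 1)
        if current_bucket > last_bucket:
            memcache.set(DONE_COUNTER, last_bucket)
            return None

        # Is there anything in the current bucket?
        # If so, take it out and return it
        task = memcache.get(str(current_bucket))
        if task:
            memcache.delete(str(current_bucket))
            count = memcache.get(counter.COUNTER_PREFIX % task)
            if count:
                counter.change_memcache_counter(
                    counter.COUNTER_PREFIX % task, -count)
                return (task, count)
            # A zero count means an earlier bucket already covered
            # this counter, so keep looking.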


If we arrive at a situation where current_bucket>last_bucket, we know that we have caught up with the log. By deleting applied buckets and counts from memcache, we also make sure that the cron job may be hit in parallel (for example by creating several entries in cron.yaml for higher throughput) without applying the same count twice. As a nice side-effect, this also means that a single counter may be increased many times between cronjob executions, but it will only result in a single datastore write (afterwards, the count will be zero, so all following buckets with the same id can be skipped).
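
The replay logic can be exercised end to end with an in-memory fake. This is a sketch under assumptions rather than the deployed code: FakeMemcache is a made-up stand-in that imitates only the calls used here, everything lives in one module instead of a separate counter module, and the final loop plays the role of the cron job by collecting the writes it would apply.

```python
class FakeMemcache:
    """Made-up in-memory stand-in for the memcache calls used in this post."""

    def __init__(self):
        self.values = {}

    def get(self, key):
        return self.values.get(key)

    def set(self, key, value):
        self.values[key] = value

    def add(self, key, value):
        if key not in self.values:
            self.values[key] = value

    def delete(self, key):
        self.values.pop(key, None)

    def incr(self, key, delta=1):
        if key not in self.values:
            return None
        self.values[key] += delta
        return self.values[key]

    def decr(self, key, delta=1):
        if key not in self.values:
            return None
        self.values[key] = max(0, self.values[key] - delta)
        return self.values[key]


memcache = FakeMemcache()
TODO_COUNTER, DONE_COUNTER, COUNTER_PREFIX = '/todo', '/done', '/c/%s'


def change_memcache_counter(name, delta):
    func = memcache.incr if delta >= 0 else memcache.decr
    result = func(name, abs(delta))
    if result is None:               # counter did not exist yet
        memcache.add(name, 0)
        result = func(name, abs(delta))
    return result


def increment_counter_later(model_id, delta):
    change_memcache_counter(COUNTER_PREFIX % model_id, delta)
    bucket_id = change_memcache_counter(TODO_COUNTER, 1)
    memcache.set(str(bucket_id), model_id)


def get_task(last_bucket):
    while True:
        current_bucket = change_memcache_counter(DONE_COUNTER, 1)
        if current_bucket > last_bucket:
            memcache.set(DONE_COUNTER, last_bucket)
            return None
        task = memcache.get(str(current_bucket))
        if task:
            memcache.delete(str(current_bucket))
            count = memcache.get(COUNTER_PREFIX % task)
            if count:
                change_memcache_counter(COUNTER_PREFIX % task, -count)
                return (task, count)


# Four user-facing increments, three of them on the same counter
for model_id in ('foo', 'bar', 'foo', 'foo'):
    increment_counter_later(model_id, 1)

# The "cron job": replay the log and collect the datastore writes
last_bucket = change_memcache_counter(TODO_COUNTER, 0)
writes = []
while True:
    task = get_task(last_bucket)
    if task is None:
        break
    writes.append(task)
print(writes)  # [('foo', 3), ('bar', 1)]
```

Four increments collapse into two writes: the first 'foo' bucket drains the whole accumulated count, and the later 'foo' buckets find a zero and are skipped.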

With this helper method complete, the actual implementation becomes quite straightforward:

TASKS_PER_REQUEST = 5


def main():

    # Print some output for easy debugging in the browser
    print 'Content-Type: text/html'
    print ''

    # What's the last bucket that might contain data?
    last_bucket = counter.change_memcache_counter(
        counter.TODO_COUNTER, 0)
    print 'Running up to bucket #%s' % last_bucket

    # We will try to increase up to N counters
    for i in range(TASKS_PER_REQUEST):
        task_count = get_task(last_bucket)
        if not task_count:
            break
        db.run_in_transaction(
            counter.increment_counter,
            task_count[0],
            task_count[1])
        print 'Increased %s by %s' % task_count
    print 'Done'

if __name__ == "__main__":
    main()


Once offline processing becomes available, I should be able to simply replace the process of creating the log in memcache and hitting it from a cronjob by putting the log entries into task queues. The rest of the concept (accumulating in memcache and applying the counters asynchronously) should still apply.

Monday, June 8, 2009

Ping

I thought I'd give a quick update on how my work on Jogre is going. I have to admit that I am not putting a lot of effort into it (the weather is just too nice outside), but I have made some nice little progress: today, I was able for the first time to play a game of connect 4 through an http based protocol. The previous version of Jogre used sockets, which is a more efficient way to communicate but does not work on App Engine. Thus, I replaced it with a polling-based mechanism that would pool game messages and exchange them about once a second.

Obviously, I am still far from really running the backend on Google App Engine -- there are many things left to address, such as not storing things in the file system, making sure no threads are created, preventing objects from storing anything in memory. I will have to look into them one by one over time. Still, it is good to see that my initial refactorings on the connection-threads actually worked :-). Gives me hope for the rest of the road ahead.

If anyone is interested in the details of the http protocol, let me know and I'll post something. I do not think that it is all that relevant, except for maybe some nice patterns like the use of an Environment interface to make threading and messaging testable. I'll keep dabbling with the code base until something a bit more interesting (like putting games into the datastore) comes up.

Sunday, May 31, 2009

Random thoughts after Google I/O

Just like the year before, I was in the fortunate position of being able to attend Google I/O. When I was still living in Virginia, such a trip would have been only wishful thinking, so I am very grateful for such opportunities. The meetings were amazing, and smarter and more eloquent people than myself have already reported more than enough about it. I will try not to bore anyone by yet another rehash of the Wave demo, or all the other cool things that have been announced. In other words, I will keep this short ;-)

One of the talks I attended was called Writing Real-Time Games for Android, and it turned out to be one of my favourite sessions. I will not pretend to have understood everything (Chris Pruett certainly lost me when he started talking about the different graphics features on the device), but it was inspiring to see the quality of output that could be achieved on such a device. And it was all in Java, a language that I actually understood.

When I first blogged about my new hobby-project of making the JOGRE gaming engine work with an App Engine backend, I got asked why I was going after a framework that used a Java client, when Javascript and HTML canvas seemed the new way to go. My response was relatively bland. Basically, I did not really care about building a state-of-the-art client; I cared more about porting an existing backend. Now after this talk I start wondering: could I achieve both? JOGRE is Java -- Android is Java!

JOGRE builds clients on a regular Java framework with a Swing or AWT based frontend. I am already working on making that client work "on the web", by replacing socket-based messaging with an http protocol. Beyond that, how hard would it be to refactor the client to work on Android devices, rather than regular PCs? Would it be possible to "mobilize" the games with little effort? Imagine having a standard gaming platform for multiplayer games, backed by App Engine and running on your cell phone. How easy would that make writing new online games, and pushing them to the Android marketplace?

So... I guess I've got to ask: any "Android fan" reading this blog who could audit the client side of JOGRE and give me some feedback?

Thursday, May 21, 2009

An initial refactoring

This is the first technical post in my new JOGRE series. As mentioned in the previous article, the goal behind this project is to explore how App Engine can efficiently host online games that scale to many users. Since I do not want to reinvent the wheel, I am basing the work on an existing Java-based platform (http://jogre.sourceforge.net/main.htm) that already happens to come with a lot of prewritten games. How hard is it going to be to retrofit that base to the App Engine platform? Over the next couple of weeks, I will hopefully find out.

The post is a little bit longer than usual, so I'll do my best to put in a few meaningful headlines in between. This way, readers can skip to the parts that are of particular interest to them.


A hobbyist's code of conduct to refactoring



A lot of my activities over the next couple of weeks will be about refactoring code in JOGRE, so I'd like to start out by summarizing the guidelines that I will try to follow in this work. Note that the "you" in the following paragraphs does not address the reader but myself. I know -- talking to myself is a bad habit, but it makes it easier to write the rules down. I will try my best to abide by them in this and other refactoring posts.

The cardinal rule: Don't be a jerk


Life's too short to get angry, so why would anyone want to work with someone he or she does not like? This is even more important to consider when there is no transfer of money involved: people work on this kind of software because they have a passion for it and it is fun. So, try to avoid spoiling the project for anyone else!

Corollary #1: If it ain't broke, don't fix it.


You might have an extensive background in methodology xyz. You might have read the GOF book, Martin Fowler's Refactoring, and you might have quoted from Effective Java on occasion. You may think you know how to spot "code smells", or that you have read and written a ton of code. Good for you, but do yourself a favor: forget all of this right now! Unless you are a glutton for punishment, you have chosen to work on this project in your spare time because you like it, it works well 90% of the time, and you'd like to make it even better. This means that if you run into something that does not comply with what you might consider quasi-standards of good coding, it is probably like that for one of the following reasons:
  • The code has evolved historically towards the way it currently is, and it has never been too much of a problem.
  • The project team has a different philosophy or architectural view towards development than you. Happens all the time, and that doesn't make it bad code. Feel free to ask the team why it is the way it is, but do not expect them to change it for you.
  • You ran into a bug or an area with "TODOs". If it's an issue and the team agrees, feel free to improve it -- as long as you do it with the consent of the others.
Bear in mind that long before you joined the project, other people have spent a lot of their spare time contributing and building the foundation of what you are using right now. Criticizing the code or calling any of it "smelly" could be considered impolite or disrespectful to other contributors.

Corollary #2: Honor the code's spirit and document the intentions of changes you make.


It's funny 'cause it's true: nobody should be looking at your changes and asking, "why the heck did he do that?!?" There are several things that can significantly increase the WTFs per minute ratio:
  • changes that would make sense -- if the author had only documented why he or she is doing them
  • too many concurrent modifications in one iteration
  • combining a refactoring with writing new features
  • incomplete or speculative refactorings (in other words, starting a change because "it might come in handy later" but never following through on that)
  • major changes to the contract of a core class that require code changes throughout the project
  • random changes just for the heck of it, or because one does not like the way a particular piece of code looks (see corollary #1). While it sometimes makes sense to clean up something small while you're in a particular class anyway, there is a fine line between cleanup and major modifications.
Refactorings can be like solving a Rubik's cube: while the overall change might be complex, it can usually be broken down into several smaller transformations, each of them simple and easy to understand. It often makes sense to document these smaller steps by checking them into revision control independently (for example, today's refactoring steps can be traced from revisions 5 through 16 of the open source project).


Corollary #3: Contributorship does not imply ownership


Your open source work is branched off an existing project (Note: I contacted Bob Marks before I started any of this. We agreed to keep my work separate for now, but if I am successful and he finds the changes useful and beneficial, we will merge them back into the main project). The overall amount of changes you are going to make will most likely affect not even ten percent of the overall codebase. The new features you add will probably be even fewer. Therefore, do not imply that you "built" this platform, or forget to give credit where credit is due. If you build a new feature, feel free to put your name into the author tag -- but if all you are doing is pulling code from one class into another class, that doesn't make you the author. Give credit where credit is due! Also, do not expect that your changes will definitely make it back verbatim into the main branch (in other words, don't get defensive if the other coders require some additional modifications first). Do not remove any branding that the original authors might have put in and replace it with your own -- it might be ok to do that in customized deploys, but not in the open source project itself. That being said -- of course you are free to take some liberties with the project, as long as you comply with its open source license (GPL v2 or higher in case of JOGRE). Just don't forget about the cardinal rule.

Corollary #4: Failure IS an option


Sometimes, as you get deeper into the code base, you might discover that the project does not really do what you need it to do. It might be designed in a way that you cannot easily adapt, there could be fundamental disagreements between you and the other team members -- or you simply happen to run out of time and need to focus on something else. It happens on occasion, so move on if you need to move on. However, if you choose to do so, communicate this clearly and in a nice manner. Do not promise to implement a particular feature and just never do it. Do not write a scathing goodbye post that explains why project xyz is so much better. Sometimes, the real problem actually lies somewhere between the keyboard and the seat.


The overall goal of my refactorings


Simply put, the goal of my initial work is to rework any code I might find in JOGRE that would prevent it from running on App Engine. Since I am just starting out in this effort, I do not have a very good idea yet what that might be. However, I am aware of a few things that work in a regular Java program but would be challenging in App Engine, such as:
  • spawning threads
  • network communication through anything but http
  • storing data in files
  • keeping data in a static field and expecting it to stick around (the next request might hit a completely different instance of my App Engine app)
  • connecting via JDBC to a database
  • longer living background processes
  • anything that requires a "restart" of the system
  • anything equivalent to a "global scan" (will the client ever need to get the list of all users?)
  • anything with near-realtime requirements that relies on the accuracy of timestamps
  • anything that uses native code or expects certain operating system commands to be available
  • code that uses any class that may not be whitelisted
There's probably more stuff to look out for, but those are the first things that come to my mind. Some of these limitations will only be triggered (if ever) as I take the first tentative steps towards the port to App Engine, but others might be obvious from looking at the code. In the first couple of weeks, I will do my best to identify those more obvious areas and refactor them. I will do so in a way that is backward compatible (it should be possible to merge those changes back into the main project, and older deployments of JOGRE should still work) and does not break the existing unit tests. If I break a unit test, I will revert my change and accomplish the goal in another way (minor changes to the test to cover API modifications are ok, though).

Today's refactoring in pictures


After having successfully built the server and played a game of Connect4 with myself, I started looking into the main method of the server code. From the documentation, I knew that the client-server protocol was xml based, which is good. I did not know, however, how those messages were exchanged. Many gaming engines use TCP directly (or, for better throughput, even UDP), since it provides a lot of nice properties (like establishing and keeping up a connection) that are more than sufficient for their needs.

I took a peek and found my expectations confirmed: JOGRE was using a well-established pattern that could have been straight from The Java Tutorial (see http://java.sun.com/docs/books/tutorial/networking/sockets/clientServer.html):
  • a main class (JogreServer) creates a ServerSocket and listens in a loop to incoming connections
  • for each incoming connection, it starts a new Thread (ServerConnectionThread) that then handles all the gaming logic.
Here is what the connection thread looks like:



ServerConnectionThread is a subclass of AbstractConnectionThread, which contains all the logic of how to fetch data from the connection, maintain the lifecycle of the socket, update connection state, and remember the name of the current user that owns the connection. AbstractConnectionThread, a subclass of java.lang.Thread, also had a couple of other subclasses that shared the connection handling logic but did other things with the arriving data. How could I start squeezing http support in here without breaking anything?

There is a very good chapter in Working Effectively with Legacy Code that introduces a concept that is called the Single-Responsibility Principle:
"Every class should have a single responsibility: It should have a single purpose in the system, and there should be only one reason to change it."
Using that as the foundation of my first refactoring, I decided to break parts of the AbstractConnectionThread out into a new class called SocketBasedMessageBus:



The refactoring itself was pretty straightforward: I took the content of the thread's run()-method, plus everything that used the socket object, and copied it into the new class. The thread's constructor would simply wrap the original Socket into the new object, and its start/stop/run-methods would delegate calls to their equivalents in the new object.

I compiled the code and ran the unit tests -- everything still worked :-). The next step was to simply kick out those delegating, hollowed-out methods. I added a "getMessageBus" method to the base class and had all subclasses use that to get access to the new MessageBus object, and called the moved methods in the subclasses:



Now that the SocketBasedMessageBus was doing all the socket-based work, AbstractConnectionThread no longer needed to subclass the Thread class. This enabled me to get rid of the run() method, and encapsulate the use of Threads in the message bus:



While looking at the new class, I realized that the names of the methods I had moved into SocketBasedMessageBus were focused around threading and loops -- but that was not necessarily what the class was responsible for. As the name said, SocketBasedMessageBus isolated the exchange of messages via a Java socket -- so its public API should reflect that. I decided to rename the methods accordingly:



While probably still not single responsibility, I now had a situation where the AbstractConnectionThread class no longer had any particular knowledge that it was using a socket -- except for its constructor. That was good enough for me (after all, if it ain't broke...), so I decided to wrap it up. One final cleanup step remained: the constructor of our base class should not have to know about sockets. Nor would our MessageBus really have to care about the user name of the connection thread, as long as there was a parse- and a cleanup method. I therefore chose to extract those aspects of the class into interfaces and have the base classes depend on those rather than the concrete implementations:



At the end of the day, the refactoring of AbstractConnectionThread resulted in the following modifications:
  • instead of managing sockets and threads, an AbstractConnectionThread is connected to a generic MessageBus object, which may choose any transport protocol (sockets, udp, http) it likes. The class no longer has any dependencies on sockets.
  • AbstractConnectionThread has two remaining responsibilities: to manage the name of the user that the connection belongs to, and to provide a generalization for how to react to incoming messages from the MessageBus. The latter responsibility is represented by the MessageParser interface, which is what the MessageBus interfaces with.
  • concrete subclasses like ServerConnectionThread still accept Sockets in their constructor (thus remaining compatible with the rest of the codebase), but they wrap the socket in a MessageBus before pushing it into the base constructor. I might choose to refactor that in a later stage, but if I do so, I can do it on an individual class basis, without breaking any of the other peer classes.
Hardly a spectacular change, but that was never the goal in the first place ;-). The main target was to make the socket communication replaceable without major rewrites of the system. Now that that's done, I can keep scanning the code for other things that might be tough to do on App Engine.


The refactoring in code


For those amongst us who'd rather just read code, here are the classes affected by this refactoring:

The original class
http://code.google.com/p/gae-ogre/source/browse/trunk/api/src/org/jogre/common/AbstractConnectionThread.java?r=2

Main refactoring targets
Affected classes (minor changes)

Sunday, May 17, 2009

My summer project: let the games begin

I finally got my Schluesselmeister app to a point where I feel comfortable moving away from my desktop app and into the cloud (the final missing piece was sharing, so that my wife and I could use the same key database). Now that that's done, I need another hobby project :-) The search criteria were as follows:
  • It should be as far away from my day job as it could possibly be, yet still involve Google App Engine.
  • It should force me to look at the tools from a different angle and broaden my perspective.
  • It should be about something I can "occasionally" mention in this blog and that people will hopefully enjoy reading.
  • It should be something that I find fun to do.
At first, I thought along the lines of building an enterprise application. Ages ago (at least in software years ;-), when I was at a previous job writing software for public transportation, I had claimed that:
Even if Google happens to stay out of the market, its products will help lower the entry-level for this industry even more. By using its products, a new breed of systems, based on Internet technologies and open standards will reach the market. Greater competition will improve the overall quality of the solutions offered, and the end user is going to benefit.
I was briefly contemplating putting that statement to the test and writing such an application. After all, App Engine seemed like the perfect backend for moving such software into the cloud. The problem with it: it seemed more like work than fun. Also, I wasn't sure if I could squeeze any good articles out of such a project ;-)

After some more soul searching, I decided to focus on something else instead: online gaming. A recent lightning talk I saw on Buddypoke inspired me: obviously, App Engine is a great backend to scale fun applications to millions of users. But how? What works, and what doesn't? What are best practices to build such a fun and massively scalable application?

Naturally, I did not want to reinvent the wheel. I searched around a little and found JOGRE, a Java-based gaming engine that is open source, seems to have decent test coverage and comes with a ton of pre-implemented games. How hard would it be to run this backend on App Engine? I am going to find out over the next months (yes, months -- I have no idea how hard it is going to be; and the weather is way too beautiful outside to be coding all weekend ;-). I am starting with the 0.3 source snapshot, which I uploaded to http://code.google.com/p/gae-ogre/. I doubt I will have anything runnable for quite a while, but I hope the journey of getting there will yield some interesting posts. Wish me luck :-)

Thursday, May 14, 2009

Notes from the last meetup

[updated 3/18/09: added reference to webvnc]

Yesterday, I was at the last developers meetup in Palo Alto. As always, it was a very interesting time, as people gave lightning talks about many different subjects. Check out this link in a couple of days, as the organizer is going to add more links and details from the different presenters.

A couple of things that I found particularly interesting (in the order in which they were presented):
  • A member of the team behind BuddyPoke shared some of the lessons they learned while scaling to millions of users. Unfortunately, I do not remember them all, but two of them were to use sharding techniques for things like counters and to avoid queries wherever possible. According to the presenter, BuddyPoke gets its data from the store almost exclusively through loading by primary key, and they avoid putting indexes on data wherever possible (<shameless-plug>if you'd like to use similar techniques in Java, check out my previous posts on a simple, key-based datastore api and efficient global counters. You can also download the source code directly from this open source project.</shameless-plug>).
  • Another developer is working on jiql, a JDBC driver for App Engine. In the long term, projects like this could be essential for easing the porting of SQL-based applications to App Engine, so I think it's worth checking out. I'd especially love to hear if anyone is using it to port things like ActiveRecord in Rails or PHP-based stuff...
  • There was a very nice demo on patching django-based applications to use the Jinja2 templating system, including a real life example on how it simplified life in a web application that the developer was building.
  • A demo of a prototype that can broadcast browser content to a set of viewers (useful for broadcasting demos and training sessions): http://www.webnc.net
If I didn't mention some of the other presenters, it was mostly because I did not take notes while I was there and am typing only what I still remember ;-). Again: check out this link in a couple of days, as the organizer is going to add more links and details from the different presenters.