Monday, September 21, 2009

App Engine -- my personal administrative assistant

I should be glad I'm not a systems administrator, because I would be terrible at my job. I recently decided to make a couple of "enhancements" to my home office and move my Linux box to a faster machine. Since that is an effort I'd rather not repeat, I chose to virtualize the PC. While I was at it, I also moved part of my home folder (in particular my App Engine projects) to a NAS. The result is a machine that is significantly faster but only works 95 percent as well -- running a Python dev appserver is unfortunately one of the victims. I had originally planned to write about porting my non-sharded counters to task queues (a continuation of this article that had been suggested by a commenter), but I guess that will have to wait until I fix my PC. I hope you folks like the stand-in...

We live in a time where it is no longer enough to build a good web app that is self-contained. Applications are expected to be open, not only for accessing and migrating data, but also for extensibility. Developers provide APIs, from passive polling-based JSON formats to pushing data through mechanisms like pubsubhubbub. App Engine turns out to be a great medium for building and hosting these extensions, be it just for personal use or as a generally available service.

When I started this blog, I defined its purpose as being "dedicated to the Google App Engine and ways of putting it to use". Over the course of the last 17 months, I wrote about building commercial and personal applications, using Python or Java, with the datastore, memcache, urlfetch, and other services. I have posted about tricks for unit testing, and about how App Engine can be extended through low-level features, such as hooks or monkeypatches. Today, I'd like to write about another use: extending other, non-App Engine web services, App Engine style. Not the oh-so-fashionable mashup kind of extension; just an everyday hack to make life easier.

This particular post is going to focus on the Meetup API. Meetup is a great service for coordinating events for groups of people, such as the SF Bay Area Google App Engine Developers group. A meetup organizer can create new events, manage capacity constraints (how many people may join), and collect feedback on how a particular event went. One of the meetups I am personally hooked on is a boardgame group in my area. It is a ton of fun and I really enjoy it, but "getting in" can sometimes be tricky. Unfortunately, many other people have also discovered the same group, and space for the events is quite limited. New events may pop up at any time, and whoever sees one first on the calendar and RSVPs gets to go.

Since I am not very good at watching out for these kinds of things, I would like somebody else to do the job for me. The following script checks all my meetup groups for new events and notifies me by email when they come up (note: it would be just as easy to "snipe" the events by automatically RSVPing, but let's give the others a fair chance, too ;-):

# Assumes this runs in App Engine:
# Make this a "text" request, so that all output
# appears in the browser window
print 'Content-Type: text/plain'
print ''

# Do the necessary imports
import meetup_api_client as api
import datetime
from google.appengine.api import mail
import logging
from google.appengine.ext import db

# Some data for the meetup API (anonymized, of course ;-)
MAIL_ADDRESS = 'my-email@example.com'
API_KEY = 'get-this-from-meetup.com'
MEMBER_ID = 42 # get this from meetup.com

def pr(s):
  """Print both to console and log."""
  logging.info(s)
  print s


class Event(db.Model):
  """Represent an event we already sent out."""
  name = db.StringProperty()
  time = db.StringProperty()


# Create a new meetup api instance
mu = api.Meetup(API_KEY)

# Go through all groups of this user
for group in mu.get_groups(member_id=MEMBER_ID).results:
  group_id = group.id

  # Get all future events
  for event in mu.get_events(
      group_id=group_id,
      after=datetime.datetime.now().strftime(
          '%m%d%Y')).results:

    # Ignore any event we have already RSVPed
    if event.myrsvp == 'none':
      key_name = ':%s' % event.id

      # Avoid sending duplicate mail
      entity = Event.get_by_key_name(key_name)
      if not entity:

        # Send out email
        mail.send_mail(
            MAIL_ADDRESS,
            MAIL_ADDRESS,
            'New event for %s' % group.name,
            'Consider RSVPing for %s at %s' % (
                event.name, event.time))

        # Remember the event so it is only announced once
        Event(key_name=key_name,
              name=event.name, time=event.time).put()
        pr('Sent out mail for %s at %s' % (
            event.name, event.time))
    else:
      pr('Ignoring event %s at %s' % (
          event.name, event.time))


This simple script will send me one, and only one, email for every new event that gets added to any group I am a member of. I can combine this with an App Engine cron job like the following, which fires the script every minute:

cron:
- description: Find new meetup events
  url: /findEvents
  schedule: every 1 minutes


This way, within a minute of its creation, I get notified about any event that may be out there. Since I get incoming emails directly on my phone, I can quickly make up my mind whether to attend or not. Thanks to the meetup service being extensible via an API, I can customize the site's behavior, which will in turn make me use the service more -- and happy users hopefully mean more revenue for a site in the long run.
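
The "one, and only one, email" behavior of the script above comes down to two checks: skip events that already have an RSVP, and skip events whose key was stored on an earlier run. Here is a minimal sketch of just that logic, with no App Engine or Meetup dependencies. The function name `select_new_events`, the dict-based events, and the `already_sent` set (standing in for the datastore) are assumptions made for illustration, not part of the real script or the Meetup client library.

```python
def select_new_events(events, already_sent):
    """Return the events that still need a notification.

    events: iterable of dicts with 'id', 'name', 'time' and 'myrsvp' keys
        (a hypothetical stand-in for the Meetup client's event objects).
    already_sent: set of key names, standing in for the datastore.
    """
    new_events = []
    for event in events:
        # Skip events that already have an RSVP.
        if event['myrsvp'] != 'none':
            continue
        key_name = ':%s' % event['id']
        # Skip events that were announced on an earlier run.
        if key_name in already_sent:
            continue
        already_sent.add(key_name)
        new_events.append(event)
    return new_events


sent = set()
events = [
    {'id': 1, 'name': 'Game night', 'time': 'Fri 7pm', 'myrsvp': 'none'},
    {'id': 2, 'name': 'Strategy day', 'time': 'Sat 1pm', 'myrsvp': 'yes'},
]
first_run = select_new_events(events, sent)
second_run = select_new_events(events, sent)
print([e['name'] for e in first_run])   # ['Game night']
print([e['name'] for e in second_run])  # []
```

Because the key is recorded as soon as an event is selected, running the check every minute does not produce repeat notifications for the same event.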

While the listing above looks like an oversimplified example, it happens to be what I actually use for my meetups. A lot of the complexity is hidden by the meetup client library, and of course by the fact that Python eliminates a lot of the overhead that other languages might bring to the table. It turns out that App Engine is a pretty nice host for these kinds of personal hacks and scripts -- it certainly beats having a Linux box up and running all the time. As mentioned before, I would do a terrible job as a system administrator, and thus probably miss a lot of game nights ;-)

Thursday, September 10, 2009

My first attempt at XMPP in Java App Engine

One of the things I love about App Engine is that the team keeps releasing new and exciting features. With the latest SDK, Google provided us with XMPP support, so let's give that a shot. During the course of this post, I am going to build a simple XMPP-chatbot version of ELIZA. You can give it a try by pointing your Google Talk client to j-eliza@appspot.com and inviting her to a chat. Naturally, you can also download the source code and try to run it yourself :-)

Eliza, please talk to me!


XMPP is one of the first features where I get to choose between Python and Java from the very beginning. Since my last experiment (on cron) was Python-based, I am going to give Java a try this time.

The first step was to create a project in Eclipse and build a very simple echo-bot (a chat bot that always repeats whatever message I send to it). The following code (full source available here) accomplishes that:

  @Override
  public void doPost(HttpServletRequest req,
      HttpServletResponse resp) throws IOException {

    // Parse incoming message
    XMPPService xmpp = XMPPServiceFactory.getXMPPService();
    Message msg = xmpp.parseMessage(req);
    JID jid = msg.getFromJid();
    String body = msg.getBody();
    LOG.info(jid.getId() + " --> JEliza: " + body);

    // Get a response from Eliza
    String response = "echo: " + body;
    LOG.info(jid.getId() + " <-- JEliza: " + response);

    // Send out response
    msg = new MessageBuilder()
        .withRecipientJids(jid)
        .withBody(response)
        .build();
    xmpp.sendMessage(msg);

  }


Basically, all I do is convert the incoming HttpServletRequest object into a Message, which contains a message body (whatever the sender typed into Google Talk) and the sender's id (JID). I then take that information, create a new Message using the MessageBuilder, and send it out through the XMPPService.

In addition to this Java code, some setup is required. I have to bind my servlet in the web.xml under a "magic" path,

  <servlet>
    <servlet-name>xmppreceiver</servlet-name>
    <servlet-class>com.appenginefan.xmpptest.XmpptestServlet</servlet-class>
  </servlet>
  <servlet-mapping>
    <servlet-name>xmppreceiver</servlet-name>
    <url-pattern>/_ah/xmpp/message/chat/</url-pattern>
  </servlet-mapping>


and I also need to enable messaging in the appengine-web.xml.

  <inbound-services>
    <service>xmpp_message</service>
  </inbound-services>


This was pretty much explained in the introduction, so I just had to be lazy and copy-and-paste that into the right files in my project. As easy as that sounds, it still requires a certain attention to detail. A lack of that attention turned what should have been a 20-minute project into a one-hour debugging session:

After uploading the first version, I connected to the chatbot and entered Hello World. Unfortunately, Eliza would not answer. Had I forgotten something? I re-read the article and checked my configuration: everything was fine. I tried to fiddle with the settings and switched the path in my web.xml to wildcards -- alas, Eliza remained silent. In my despair, I pinged a couple of fellow App Engine enthusiasts. They also checked my files but did not find any problems. Had I run into a bug?

The next morning, I checked my inbox and found that somebody had tracked it down: XMPP servlets expect POSTs! While the introduction's sample code clearly uses doPost, the Eclipse template I had filled in used the doGet method. Had I started with an empty servlet and just followed the documentation, I would never have run into that problem. Oops :-(

Switching from doGet to doPost did the trick. Whatever I typed to my bot was immediately echoed back. Eliza was finally talking to me.

Eliza, why don't you remember me?


The next step was to make Eliza "intelligent". A quick Google search for "Eliza" and "Java" pointed me to a simple open source version. Without going into too much detail, the API of the class looked something like this (visibility of inner fields changed by me):

public class ElizaParse {
  public String lastline;
  public boolean exit;
  public void PRINT(String s) {
    ...
  }
  public void handleLine(String s) {
    ...
  }
}


When a new Eliza parser was instantiated, it would print a quick greeting (HI! I'M ELIZA...) and then wait for input. Using handleLine, a new line of input could be fed to the bot. The bot would then store that line in lastline and return an answer using PRINT. If the user wanted to end the conversation (by typing in something like "shut up"), the bot would set the exit field to true.

To make the parser work for our bot, we simply need to feed it the incoming message and retrieve the answer by overriding the PRINT method. The following code does that (full source is here):

  @Override
  public void doPost(HttpServletRequest req,
      HttpServletResponse resp) throws IOException {

    // Parse incoming message
    // ...

    // Get a response from Eliza
    final StringBuilder response = new StringBuilder();
    final ElizaParse parser = new ElizaParse() {

      @Override
      public void PRINT(String s) {

        // Skip the annoying intro
        if (s.startsWith("HI! I'M ELIZA")) {
          return;
        }

        // Write all output into the StringBuilder
        response.append(s);
        response.append('\n');
      }
    };
    parser.handleLine(body);
    body = response.toString();
    LOG.info(jid.getId() + " <-- JEliza: " + body);

    // Send out response
    // ...
  }


I uploaded the code and gave it a try. It worked pretty well, but wasn't perfect. Eliza has a little feature that detects when I type in the same line twice and prints "PLEASE DON'T REPEAT YOURSELF!", but that never fired. Also, while I was now cutting out the greeting every time, I did want it to appear at the beginning of a conversation.

Both issues came back to the fact that I was throwing away my parser at the end of the request. Eliza had no recollection of the last thing I had typed in, so it could not know if the conversation was new or whether I was repeating myself. In other words, I needed some persistence.
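
To make the problem concrete, here is a small sketch of it in Python. `ToyEliza` is a hypothetical stand-in that only mimics the shape described above (a lastline field, an overridable PRINT, and the repeat check); it is not the real ElizaParse, and the "TELL ME MORE" reply is a placeholder I made up. The subclass captures output by overriding PRINT, and `handle_request` creates a fresh parser per call, the way the servlet does.

```python
class ToyEliza(object):
    """Hypothetical stand-in for ElizaParse; not the real implementation."""

    def __init__(self):
        self.lastline = ''
        self.exit = False

    def PRINT(self, s):
        print(s)

    def handle_line(self, s):
        # The repeat check depends on state left over from the previous call.
        if s == self.lastline:
            self.PRINT("PLEASE DON'T REPEAT YOURSELF!")
        else:
            self.PRINT('TELL ME MORE ABOUT: ' + s)  # placeholder reply
        self.lastline = s


class CapturingEliza(ToyEliza):
    """Collects output in a list instead of printing it."""

    def __init__(self):
        ToyEliza.__init__(self)
        self.output = []

    def PRINT(self, s):
        self.output.append(s)


def handle_request(body):
    # A fresh parser per request: lastline always starts out empty.
    parser = CapturingEliza()
    parser.handle_line(body)
    return '\n'.join(parser.output)


# Two separate "requests" with the same text: the repeat goes unnoticed.
stateless = [handle_request('hello'), handle_request('hello')]

# One long-lived parser: the second 'hello' is detected as a repeat.
shared = CapturingEliza()
shared.handle_line('hello')
shared.handle_line('hello')
```

A long-lived parser is not an option on App Engine, where any request may land on a different instance, so the state has to travel through some kind of storage instead.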

Making Eliza less forgetful


Storing the last line of a conversation could be done in many ways. I could just dump it in memcache, since I only need to remember it temporarily. Or I could create a JDO model class and use the datastore persistence, but that's a lot of code.

In the end, I chose something in the middle. Using the lower-level datastore APIs, I directly wrote entities into the store (using the chat's JID as key). The following code checks the datastore to see if we had a previous conversation:

    Key key = KeyFactory.createKey("chatData", ":"
        + jid.getId());
    Entity lastLineEntity = null;
    try {
      lastLineEntity =
          DatastoreServiceFactory.getDatastoreService()
              .get(key);

    } catch (EntityNotFoundException e) {
      lastLineEntity = new Entity(key);
      lastLineEntity.setProperty(LINE, "");
    }
    final String lastLine =
        (String) lastLineEntity.getProperty(LINE);

    // ...

    parser.lastline = lastLine;


At the end of my request, I can extract the new last line of conversation and write that back into the store (full source is here):

    if (parser.exit) {
      lastLineEntity.setProperty(LINE, "");
    } else {
      lastLineEntity.setProperty(LINE, parser.lastline);
    }
    DatastoreServiceFactory.getDatastoreService().put(
        lastLineEntity);
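
The load-and-save cycle from the two snippets above can be summarized in a few lines of Python. This is only a sketch of the pattern: the plain dict stands in for the datastore, and the helper names `load_last_line` and `save_last_line` are made up for illustration.

```python
def load_last_line(store, jid):
    """Return the remembered line for jid, or '' for a new conversation."""
    return store.get(':' + jid, '')


def save_last_line(store, jid, lastline, exit_requested):
    """Remember the latest line, or clear it when the chat has ended."""
    store[':' + jid] = '' if exit_requested else lastline


store = {}  # stand-in for the datastore, keyed by the sender's JID
jid = 'user@example.com'

# First request: nothing stored yet, so the conversation is new.
before_first = load_last_line(store, jid)
save_last_line(store, jid, 'hello', False)

# Second request: the previous line is available again.
before_second = load_last_line(store, jid)

# The user ends the chat: the stored line is reset.
save_last_line(store, jid, 'shut up', True)
after_exit = load_last_line(store, jid)
```

An empty stored line doubles as the "new conversation" marker, which is why the exit case writes an empty string instead of deleting the entry.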


Final thoughts


Not counting some stupidity on my part during setup, using XMPP in App Engine was super-simple. The API could not have been much easier, and I am very excited to see what interesting things people are going to do with it.