Take a look at the following example application. Ignore the lack of transactions or that I hardcode my html and focus on the security aspect. Do you see anything wrong?
import wsgiref.handlers
from google.appengine.api import users
from google.appengine.ext import db
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import login_required
# This app is an easy way for users to store their
# social security number online, in case they need to
# look it up
# I'm too lazy to use Django templates ;-)
SHOW_DATA = '<html><body>Your SSN: %s</body></html>'
NOT_FOUND = '<html><body>No data found</body></html>'
DATA_SAVED = """<html><body>
Your data was saved. You can get to it through <a
href='/?user=%s'>this link</a>.</body></html>"""
FORM = """<html><body><form method='GET'>
Enter your Social Security Number:
<input name="ssn"/><br/><input type="submit"/>
</form></body></html>"""
# Data model
class SSN(db.Model):
user = db.UserProperty(required=True)
ssn = db.StringProperty(required=True)
# Handler
class MainPage(webapp.RequestHandler):
# Get method (login required for security)
@login_required
def get(self):
html = ''
user = users.get_current_user()
# Case: show data
if self.request.get('user'):
user = users.User(self.request.get('user'))
data = SSN.gql('WHERE user = :1', user).get()
if data:
html = SHOW_DATA % data.ssn
else:
html = NOT_FOUND
# Case: modify data
else:
if self.request.get('ssn'):
data = SSN.gql('WHERE user = :1', user).get()
if data:
data.ssn = self.request.get('ssn')
data.put()
else:
data = SSN(user=user, ssn=self.request.get('ssn'))
data.put()
html = DATA_SAVED % user.email()
else:
html = FORM
# Render response
self.response.out.write(html)
# Main
def main():
application = webapp.WSGIApplication(
[('/.*', MainPage)], debug=True)
wsgiref.handlers.CGIHandler().run(application)
if __name__ == "__main__":
main()
The author of this code (which is me, but I'd rather not point fingers ;-) tried to build a small database that people could use to store their social security number online. A user would log in and post its information. In return, he would get a unique URL back that pointed to his information. To make things secure, the programmer decided to require a login for data access. The problem is that the author neglected to check that the current user matched the owner of the user data. In other words, if Mitch Millionaire used the application and I somehow knew his email address, then I could get to his SSN by simply going to his URL and logging into my own gmail account.
On first sight, this example might seem a little bit contrived. Who the heck would be so stupid and forget that simple check? The reality however is that it could happen to anybody. We are all human, and real life applications are rarely as simple as what would fit into a single blog post. We usually have a hierarchy of data entities, all connected to each other. As an application traverses the entity relationships, it could easily happen that it stumbles upon information it is not intended to have access to. Unit tests may help to prevent some of these cases, but with growing complexity, it is not possible to foresee every turn of events. If we agree on the premise that we all make mistakes and that protecting our users is more important than protecting our pride, then this leads me to a simple conclusion: I'd rather have my application explode than hand out confidential data to the wrong person! My wounded pride will heal faster than the user's trust.
With that in mind, could we have prevented this bug from happening? Take a look at the following
addition to my original code:
class CurrentUserProperty(db.UserProperty):
def checkCurrentUser(self, value):
if value != users.get_current_user():
raise db.BadValueError(
'Property %s must be the current user' % self.name)
return value
def __init__(self, verbose_name=None, name=None):
super(CurrentUserProperty, self).__init__(
verbose_name, name, required=True,
validator=self.checkCurrentUser)
The class CurrentUserProperty is like a regular UserProperty, with a small addition: it assumes that only the current user is a valid input. This was achieved by creating a custom validator-function (checkCurrentUser) and passing it along with the constructor. The property's validation function is called whenever data is loaded from or stored in the database, so besides its original validation code (we set
required=True, so validation will fail if no user was assigned) we will now also make sure that the user equals the person logged in. Since we did not modify anything else in the code, the new Property is a drop-in replacement in our model class:class SSN(db.Model):
user = CurrentUserProperty()
ssn = db.StringProperty(required=True)
Did this fix our original bug? Absolutely not! Did it prevent somebody else having access to Mitch Millionaire's social? Absolutely, assumed that nobody hacks into his gmail account! The application will raise an exception and the data will not get served.
What does this mean for the application developer? Well, whenever a crash like this happens, the application console will log the exception and make it available in the logs. The developer can access this data in the dashboard, inspect the stack trace and detect where the data integrity violation came from. He or she can write a unit test, fix the leak and deploy an improved version.
Interested in more tips and tricks? Like I said before, there are other nifty things one can do to keep data secure. Why not posting some of them in the comments section? Go ahead and share your ideas!
2 comments:
I enjoy your blog ...
Based on the appengine docs, why not use the 'validator' class method, or is that a different thing?
I am looking for help in just this area, and I con't find any examples of the 'validator' approach.
Thanks for the feedback :-)
There are (at least) two ways to do validation in a property class: one is to pass along a custom validator to the constructor, like I did:
super(CurrentUserProperty, self).__init__(
verbose_name, name, required=True,
validator=self.checkCurrentUser)
By defining a validator, one can instruct the base class to perform additional validation logic on top of the regular code.
The other alternative is to subclass a Property and override the validate()-method (be sure to call the base class validate from your code though). Both ways work well, and as long as one stays away from this association-over-inheritance debate, they do pretty much the job.
If you really feel there should be an example on the validate()-approach, please ping this thread again -- I can surely post a quick snippet.
Post a Comment