As I was thinking a bit more about how I was going to do the queries, how to translate the query protocol buffer into an SQL statement, I realized two things that I should really clean up:
- The methods entityToDict and dictToEntity contained a lot of logic that translates protocol buffers into sql rows and vice versa, but they were relatively specialized for the SQLite case. Since the overall goal is to make the classes work with other RDBMS as well, that is certainly not ideal.
- The same methods might prove useful for my work on queries, but they do not hand out all the information I need. For example, entityToDict gives me a nice flat list of column keys and values, but it does not tell me which key corresponds to which original property.
As a result of this realization, I decided to wrap the original method into a new class called PRMHelper. I also made entityToDict a bit more customizable (but in a downwards compatible way):
class PRMHelper(object):
"""ProtocolBuffer-Relational-Mapping tool.
Encapsulates helpers that deal with the mapping between
protocol buffer and a relational database. Currently, this
means SQLite, but it could be generalized in the future.
"""
def entityToDict(self, pb,
populate_dict=None,
get_list=lambda x:x.property_list(),
unwrap_properties=lambda x:[x]):
"""Converts an entity protocol buffer to a dictionary.
This method will only deal with the properties of the entity,
not its primary key.
Args:
pb: the entity protocol buffer
populate_dict: a method that takes a dictionary,
a property object(or whatever else get_list returns to
iterate on), a column key and a value as parameters
and decides how to modify the dictionary. If not given, the
default behavior is to ignore the property and put the
value into the map, using the column key as key.
Implementation detail: if a property value maps to more
than one row (like a geo-coordinate that could be split
up in latitude and longitude), the method might be called
more than once with the same property object
get_list: a way to extract a list of properties from pb
(or anything else iterable that returns something that
the get_property method can turn into a property;
default behavior is to call "property_list")
unwrap_properties: takes an element from the get_list
result and turns it into list of property pbs.
Default is lambda x:[x].
Returns:
a dictionary with appropriate kev/value pairs that can
be stored in a SQLite database.
"""
result = {}
#TODO: what about raw properties?
for item in get_list(pb):
for property in unwrap_properties(item):
#TODO: what about multiple properties?
assert not (property.has_multiple() and property.multiple())
if not property.has_value():
continue
property_name = property.name()
value_pb = property.value()
col_key = None
property_value = None
if value_pb.has_int64value():
col_key = 'int64_' + property_name
property_value = value_pb.int64value()
elif value_pb.has_stringvalue():
col_key = 'string_' + property_name
property_value = value_pb.stringvalue()
else:
raise 'Not supported yet: %s' % value
if not populate_dict:
result[col_key] = property_value
else:
populate_dict(result, item, col_key, property_value)
return result
def dictToEntity(self, values, pb):
"""Transfers values from a dictionary into a protocol buffer
This method will only deal with the properties of the entity,
not its primary key.
Args:
values: a dctionary of values
pb: the entity protocol buffer
"""
for key, value in values.items():
# Split up the key (like int64_name) into segments
i = key.find('_')
if i < 1:
continue
p1 = key[0:i]
p2 = key[i+1:]
# Case: type integer
if p1 == 'int64':
prop = pb.add_property()
prop.set_name(p2)
prop.set_multiple(False)
prop.mutable_value().set_int64value(long(value))
# Case: type string
elif p1 == 'string':
prop = pb.add_property()
prop.set_name(p2)
prop.set_multiple(False)
prop.mutable_value().set_stringvalue(value)
In the constructor of the datastore, a PRMHelper object is
passed as a parameter and stored in a private field called
self.prm. After a few modifications to previously written
methods, I was
able to make sure that all unit tests still passed.
So, how am I using the modifications that were introduced?
Take a look at the following snippet, an incomplete first pass
at _Dynamic_RunQuery:
_Operator_NAMES = {
1: "<",
2: "<=",
3: ">",
4: ">=",
5: "=",
#6: "IN",
#7: "EXISTS",
}
def _Dynamic_RunQuery(self, query, query_result):
# Turn the filter-objects into a set of conditions
def build_query_conditions(dictionary, filter, col_key, value):
operator = filter.op()
names = DatastoreSqliteStub._Operator_NAMES
if operator in names:
key = '%s %s' % (col_key, names[operator])
assert not key in dictionary,\
'Multiple conditions not supported yet: %s' % key
dictionary[key] = value
elif operator == 7:
key = '%s NOT NULL' % col_key
assert not key in dictionary,\
'Multiple conditions not supported yet: %s' % key
dictionary[key] = None
else:
assert False, 'Unsupported operator: %s' % operator
conditions = self.prm.entityToDict(
pb=query,
populate_dict=build_query_conditions,
get_list=lambda x:x.filter_list(),
unwrap_properties=lambda x:x.property_list())
# Concatenate the conditions in a statement
query = 'SELECT * FROM %s' % query.kind()
params = {}
count = 0
for key, value in conditions.items():
if count == 0:
query = '%s WHERE %s' % (query, key)
else:
query = '%s AND %s' % (query, key)
if value != None:
params[str(count)] = value
query = '%s :%s' % (query, count)
count += 1
#TODO: consider sort order
#TODO: execute query
By slightly tweaking the way property objects are extracted from the protocol buffer (using small lambda functions), I can use the pre-existing logic in entityToDict to execute a helper-method called build_query_conditions. This method populates a dictionary with conditional snippets (like column_name < 'foo'). I can then concatenate these snippets to build an SQL query.
How does the resulting query look like? Consider our unit test from the last post:
def testSimpleQuery(self):
helpers.create_tables([TestModel()], self.connection)
model = TestModel(text='t1', number=13)
model.put()
TestModel.gql('WHERE text=:1 and number=:2 order by text desc',
't1', 13).fetch(5)
With our current code, this will result in the following query string:
SELECT * FROM TestModel WHERE string_text = :0 AND int64_number = :1
What's next?
Thanks to the reuse of our existing logic in entityToDict, we are one step closer to being able to execute arbitray GQL queries on the database (well, assumed we restrict ourselves to the few types we know so far). Even better, since all logic outside the PRMHelper is agnostic with regards to the specific SQL dialect, new database types should (almost) be a matter of just writing a new PRMHelper.
So, how close are we to getting the queries done? I would say, probably about 40 per cent. There is still the small matter of actually executing against the datastore. Furthermore, we need to return results to the user -- possibly within multiple fetch operations. There is definitely still a lot to be done...
0 comments:
Post a Comment