Sunday, January 11, 2009

Sharding woes

This week, an interesting discussion ran through the blogs. It started with an article called Mr.Moore gets to punt in sharding. That post spawned a lot of discussion, from the discussion thread on the blog to other pieces like Quick thought on sharding and Don't Bet on Moore Saving Your Ass. It seems like an interesting read to me, if one can be detached enough to stay on the sidelines in case another flame war breaks out ;-).

Personally, I think designing with scalability in mind is important -- however, implementing it from the beginning may be hard. There are many factors that influence how much effort one can put into this field at the beginning of a product lifecycle, such as

  • Time to market. (Can I afford the extra effort in the beginning, or will spending time on that put me out of business?)

  • Anticipated growth. (Not every software developer in the world needs to write apps that have to schale to millions of users -- think of niche tools that make money through subscriptions and licenses, not advertising)

  • Skill level of the team. (Sharding is not easy. Having inexperienced people trying to pull it off might result in worse performance than just using a standard three-tier architecture with an RDBMS)

  • Availability of tools. (Early releases often undergo lots of changes that will require data migration or fixing inconsistencies due to bugs. How can I (or my support team) accomplish this? Will I have to reinvent the wheel? See also "Time to market")

  • Contractual restrictions (My contract might require me to use a particular database, which could make sharding even more difficult)



I have not always worked at my current job, and I have been in situations in the past where tradeoffs both for and against sharding have been made. I have been saved by faster machines as well, but I am fully aware that this will not work for many situations. I have seen issues that could not be addressed, no matter how much memory one threw at the problem. For example, I have been in teams where a single architect was overseeing multiple projects at once, which resulted in projects where Java novices and graduates on their first job were making most decisions on data structure. I learned a lot at that time -- as Thomas J Watson said, "So go ahead and make mistakes. Make all you can. Because, remember that's where you'll find success.". Still, those mistakes were made on my employer's dime, and it would have been great if that could have been avoided. We got the system out in time, but we had to pay the price for it later, when we struggled to keep the system running smoothly while gradually introducing a more performant infrastructure.

With that in mind, I sometimes look at the restrictions within App Engine's datastore API, and I wonder if one shouldn't apply the same restraint on an RDBMS based project. It would be possible; for example if enough people helped getting the gae-sqlite to a point where it works on any rdbms and has feature parity with the real datastore. Developers could work on an API that encompasses all the best practices of Google's highly scalable architecture but still have the backend tools necessary to deal with the early woes of bugfixes and schema evolution. Then, once the system is mature enough and scalability becomes a concern, simply write a migration script to upload the data into the cloud and benefit from App Engine's scalability and reliability. Unless of course purchasing a faster server works well enough, and Mr. Moore really gets to punt on sharding...

0 comments: