Building a new web application today usually starts in the same way: set up a skeleton of an application by running your favorite framework’s generator script, cloning a repository, or copying a custom-built starter project. All of these options are great for getting your application up and running quickly. There are few greater joys than being able to go from an idea to a functioning local web server with your new project’s name in less than thirty seconds.
But building applications this way can make us blind to important design decisions. These decisions get made regardless of whether we consider them thoroughly, as our starter scripts will do their best to pick a reasonable default. Even if this does not cause headaches today, it might cause significant problems later on.
In particular, Ruby on Rails (which is my preferred framework) and other like-minded server-side frameworks usually make some strong assumptions about your back end. With Rails, we default to SQL databases.
As evidence, consider that while the ‘Rails’ gem has 38 million downloads on RubyGems.org, the ‘pg’ gem required for PostgreSQL has 10.8 million and the ‘mysql2’ gem for MySQL and MariaDB has 12 million. SQLite 3, which is the actual Rails default for development, has 10.8 million downloads. The most popular NoSQL backend gem, Mongo, has just 4 million.
There is not a perfect correlation between downloading the Rails gem, downloading a database driver, and creating an application. I suspect that of production Rails applications, very few are running on SQLite, and a relatively higher proportion on Mongo and PostgreSQL than the download numbers would suggest. That said, I believe it is clear from the data that for most developers, ‘Rails’ implies ‘SQL’ - probably without careful thought.
Is every application best off with an SQL backend? Trivially no, as evidenced by the existence of Mongo. A more interesting question is, ‘is every application best off with a database?’. I think the answer is still no.
There are plenty of reasons to use a standalone database server as your back end. Most developers are familiar with database systems, libraries and documentation are widely available, performance is good, and since the mid-2000s, libre and free database systems can match the performance of high-priced proprietary systems like Oracle Database. In addition, for applications that scale out, using a standard database package can help isolate and conceptualize concurrency problems by moving all persistence to a single, standalone subsystem.
As an exploration of no-database web applications, I wrote James C Gibson as a Service without a database. The web server keeps all of the business data in memory, and serializes all updates to flat files. When the server is started, the logs are read and objects read back into memory.
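The general shape of such a store can be sketched in a few lines of Ruby. This is only a minimal illustration under my own assumptions, not the actual JCGaaS code: the class name FlatFileStore, the one-JSON-object-per-line log format, and the method names are all hypothetical.

```ruby
require "json"
require "tmpdir"

# Hypothetical in-memory store: every update is kept in a Hash and also
# appended to a flat file as one JSON line. On startup the file is replayed
# to rebuild the Hash.
class FlatFileStore
  def initialize(path)
    @path = path
    @records = {}
    replay if File.exist?(@path)
  end

  # Update memory, then append the change to the log file.
  def put(id, attributes)
    @records[id] = attributes
    File.open(@path, "a") do |f|
      f.puts(JSON.generate("id" => id, "attributes" => attributes))
    end
    attributes
  end

  def get(id)
    @records[id]
  end

  def size
    @records.size
  end

  private

  # Read the log top to bottom; later entries for an id overwrite earlier ones.
  def replay
    File.foreach(@path) do |line|
      next if line.strip.empty?
      entry = JSON.parse(line)
      @records[entry["id"]] = entry["attributes"]
    end
  end
end

# Usage: write a nested record, then simulate a restart by opening the
# same file with a fresh store.
path = File.join(Dir.mktmpdir, "records.log")
store = FlatFileStore.new(path)
store.put("1", "tags" => ["a", "b"], "meta" => { "paid" => true })
restarted = FlatFileStore.new(path)
restarted.get("1")
```

The sketch uses string keys throughout because JSON.parse returns string keys by default, so a record looks the same before and after a restart.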
All of the persistence logic is hidden behind a set of Serializer classes and data manager classes, which together hide the JSON serialization and provide a set of ActiveRecord-style finder methods. Instead of Foo.find_by_field(), I have
There are several benefits to this setup.
First, my design is not constrained by database schema. Some of the objects I serialize have nested arrays and hashes that would be somewhat difficult to arrange in a normal ActiveRecord/SQL setup - with this design, I don’t need to think about tables when I am designing business objects. I do not have any evidence to suggest that this freedom improves design, but all else equal, I believe that removing constraints from developers will probably do more good than harm.
Second, my ‘business logic’ objects can be tested in perfect isolation, since they do not have any outside dependencies at all - as such, I can test all of the non-persistence logic for the entire application, which is a few dozen tests, in less than 150ms. This is a speed I could only dream of on my current major Rails project - the test suite for UpsideOS takes a few minutes to run. The only untested code is that dedicated to reading and writing the flat storage files, which is only a few lines, and as such is easy to verify by hand.
Third, it will be very easy to move to a different storage engine if that becomes necessary. Since all of my persistence code is isolated in a few well-defined, single-job methods, switching to, say, ActiveRecord would be a simple matter of creating the appropriate ActiveRecord classes and then setting my own finder methods up to call the appropriate AR methods. It would be equally easy to switch to Mongo, or even a nontraditional database like ElasticSearch. Certainly, defining my own finder methods at the beginning took a little longer than using AR from the start would have, but being able to switch to any storage back end at all with a fixed, small amount of effort is worth the initial investment.
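To make the idea concrete, here is a rough sketch of finder methods that delegate to a replaceable back end. The names MemoryBackend and WidgetManager, and the all/insert interface between them, are hypothetical and chosen only for illustration. They are not the classes from the real application.

```ruby
# Hypothetical storage back end that keeps rows in a plain Ruby array.
# Any object that responds to `all` and `insert` could take its place.
class MemoryBackend
  def initialize(rows = [])
    @rows = rows
  end

  def all
    @rows
  end

  def insert(row)
    @rows << row
    row
  end
end

# Hypothetical data manager. The rest of the application only calls these
# finder methods, so replacing the back end does not touch business code.
class WidgetManager
  def initialize(backend)
    @backend = backend
  end

  def create(attributes)
    @backend.insert(attributes)
  end

  # Returns the first matching row, or nil if there is none.
  def find_by_name(name)
    @backend.all.find { |row| row[:name] == name }
  end

  # Returns every matching row as an array.
  def where_color(color)
    @backend.all.select { |row| row[:color] == color }
  end
end

widgets = WidgetManager.new(MemoryBackend.new)
widgets.create(name: "sprocket", color: "red")
widgets.find_by_name("sprocket")
```

Under this arrangement, moving to another engine would mean writing one new back end class with the same two methods, or rewriting the bodies of the finder methods to call the new library, while callers stay unchanged.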
Finally, I also avoid the operations overhead of maintaining database servers. The current JCGaaS website is a single Ruby process running on a single server. Setting the server up was as simple as installing Ruby, cloning a repository, and adding the right script to the startup list - so quick and easy that I can automate it with a shell script, instead of relying on something like Puppet or Chef. Backing up the system is equally easy, as all I need to do is synchronize a single file system - currently, I have a cron job to run rsync to another server every hour, but in theory I could even do something as easy as just mounting the right folder to Dropbox.
I don’t think my current JSON code will scale well to millions of requests, but by avoiding the extra overhead and pushing all HTML rendering onto the client, I have been able to achieve a throughput of 1000 requests per second on benchmarks, which is more than enough for a minimum viable product in nearly any field. And, if the time ever comes to scale the application, it will be easy to switch out the persistence engine.
Building an application without a database was an instructive and refreshing exercise - I highly recommend it as a tinkering project for all web developers. And, consider alternate storage systems next time you find yourself running