Supercharge Your App With Materialized Views

rails

Materialized views a very powerful PostgreSQL feature that can allow you to dramatically speed up complex reads. Since they aren’t often used in Rails, here are some notes on when to use Materialized views and how to effectively use them

On Views & Materialized Views

A database view is a result of a stored query. You can read from a view just like one would read from a table, and perform additional filtering on the results of the view. They’re particularly useful when the stored query is very complex, for example, when the query is the result of several complex JOINs. Evey time that a view is read from, the query that defines it is performed.

A materialized view is a view that has been executed and has its results stored on disk, effectively caching the results of the query. Because the results of the query are written to disk, reading from and querying a materialized view is just as fast as doing a straight read on a single table - scary fast. For complex database operations, this can be a significant speedup.

The cost of that increase in read performance is that the materialized view does not automatically change with changes underlying data. A materialized view must be periodically “refreshed” to bring it up to date - essentially, performing the query that defines the view again, and storing the results. Depending on the complexity of the query that defines the view, it could be practical to refresh the view frequently, even upon every write to the underlying table, but it is not done automatically.

A short example

Here’s a quick example. Let’s say that we have a website that allows customers to book cruises.

What makes up a cruise? Well, a cruise is a specific customer renting a specific cabin, which has a type, during a specific sailing of a ship, which itself is from a specific port. Maybe our schema looks something like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
cabins
  - id
  - cabin_number
  - ship_id
  - cabin_type_id

cabin_types
  - id
  - price_per_day

sailings
  - id
  - ship_id
  - departure_port_id
  - embark_date
  - return_date
  - price_adjustment_factor

ships
  - id
  - name

ports
  - id
  - name
  - port_fee

cabin_sailings
  - cabin_id
  - sailing_id
  - booked

We want to show the user CabinSailings that are not booked, but allow them to search by criteria that’s stored through a lot of JOINs. To allow the user to search by the name of the port, for example, we’re going to need to do something like:

1
2
3
SELECT * FROM cabin_sailings
  JOINS sailings ON sailings.id = cabin_sailings.sailing_id
  JOINS ports ON ports.id = sailings.departure_port_id

Not too bad, but then we’re going to need to select by the ship as well -

1
2
3
4
5
SELECT * FROM cabin_sailings
  JOINS sailings ON sailings.id = cabin_sailings.sailing_id
  JOINS ports ON ports.id = sailings.departure_port_id
  JOINS cabins ON cabins.id = cabin_sailings.cabin_id
  JOINS ships ON cabins.ship_id = ships.id

Oh, and while we’re at it, let the user select only, say, four day cruises, and sort by price:

1
2
3
4
5
6
7
8
9
10
11
12
SELECT
  cabin_sailings.*,
  (sailings.return_date - sailings.embark_date) AS duration,
  (duration *  cabin_type.price_per_day) * sailings.price_adjustment_factor + ports.port_fee AS price,
FROM cabin_sailings
  JOINS sailings ON sailings.id = cabin_sailings.sailing_id
  JOINS ports ON ports.id = sailings.departure_port_id
  JOINS cabins ON cabins.id = cabin_sailings.cabin_id
  JOINS ships ON cabins.ship_id = ships.id
  JOINS cabin_types ON cabins.cabin_type_id = cabin_types.id
WHERE duration = ?
ORDER BY price DESC

That’s not too bad, but you can quickly see how this is getting a little out of hand. Putting this much SQL into any traditionally define model is going to be unwieldy.

And as our application gets more complex, and our data set grows, doing these very complex JOINs will rapidly consume significant database power. Soon, we’ll be taking ten seconds to display results to the user - every time they search.

This is the perfect situation for a (Materialized) view. We can keep the complexity of the SQL that defines the concept we’re interested in - let’s call it a “Listing” for now - out of the application, and in the database where it belongs by using a view. And by using a materialized view, we can keep the queries super speedy, at the cost of having to put a bit more work into keeping them up to date.

Next Up

Next up: Some tips and tricks for using materialized views with Ruby on Rails.


I'm looking for better ways to build software businesses. Find out if I find something.