malevolent design weblog

This blog is now defunct, but you can find more stuff over at my personal site

Must… Avoid… Cache/Caching Pun…

Uncontroversially, I reckon most dynamic web sites/apps need some form of caching to reuse work done by the server. Without it, the LOLinator wouldn’t have coped with its peak traffic (it’s on dirt-cheap shared hosting and barely flinched), and most blogs need it to avoid keeling over at the first sniff of a link from a popular site.

You can introduce caches in numerous ways:

Opcode Caching
If you’re using PHP it’s worth installing something like APC or XCache to avoid scripts being compiled from scratch for every request. They work transparently with only a small memory overhead and occasional stability issues.
Query Caching
You can store the results of popular queries either by using the database server’s built-in features or via the application/framework’s database layer. The problem is, the app still has to process and display the data, and the cache has to expire when there are updates, so it may not always make a noticeable difference.
Object Caching
At a slightly higher level, you can cache the data structures used by your application rather than just the raw query results. An object might be assembled from several queries so it can make sense to avoid repeating that process.
Page Caching
Reusing the entire web page gives the biggest performance increase and can even be handled by a dedicated layer (e.g. via Apache’s mod_cache or a Squid server) so that your app isn’t even executed for most requests. However, if you need fine control over expiry or the ability to exclude certain content then indiscriminate caching may not be viable.
Selective Page Caching
By giving the application some control you can let it decide which pages get stored and for how long. For example, ZenMagick won’t cache a page if the customer is logged in or has something in their basket. You lose performance by having to run some application logic, but still get big gains from caching popular pages.
Page Fragment Caching
If part of each page differs between requests (e.g. to display the user’s own details), but other areas remain the same and involve significant processing, then it can make sense to cache selected portions of markup. The app will still need to sort out the non-cached areas and assemble the various pieces, and expiry may be more complex, but it can reduce querying and processing enough to prove worthwhile.
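
To make the selective approach a bit more concrete, here is a minimal sketch in Python. It is not taken from ZenMagick or any particular framework: `PageCache`, `is_cacheable` and `serve` are hypothetical names, the request is assumed to be a plain dict, and the store is an in-memory dict with a per-entry time-to-live rather than files or memcached.

```python
import time


class PageCache:
    """In-memory cache with a per-entry time-to-live (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}      # key -> (expires_at, html)
        self._clock = clock   # injectable so expiry can be tested

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, html = entry
        if self._clock() >= expires_at:
            del self._store[key]   # expired: drop it and report a miss
            return None
        return html

    def set(self, key, html, ttl):
        self._store[key] = (self._clock() + ttl, html)


def is_cacheable(request):
    """Only cache anonymous GET requests with an empty basket.

    Assumes the request is a dict with optional 'logged_in' and
    'basket' entries; a real app would inspect its session instead.
    """
    return (
        request.get("method") == "GET"
        and not request.get("logged_in")
        and not request.get("basket")
    )


def serve(request, cache, render, ttl=300):
    """Return (html, from_cache); `render` is the expensive page builder."""
    if not is_cacheable(request):
        return render(request), False
    key = request["path"]
    html = cache.get(key)
    if html is not None:
        return html, True
    html = render(request)
    cache.set(key, html, ttl)
    return html, False
```

The same store could hold page fragments instead of whole pages: key each entry by fragment name (say, a sidebar) and let the app stitch cached and freshly rendered pieces together on each request.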

Frameworks such as Rails, Django and ASP.NET offer most or all of the above as options, and you can often choose between using files or memory. In my own framework I’ve implemented selective page caching with support for conditional GET, plus fragment caching for extra flexibility. It’s only file-based at the moment, but adding memcached support would be easy (apart from wildcard expiry).
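
For anyone unfamiliar with conditional GET, the idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code from my framework: `make_etag` and `conditional_get` are hypothetical names, request headers are assumed to arrive as a dict keyed by the exact string `If-None-Match`, and weak validators (the `W/` prefix) are ignored for brevity.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from a hash of the response body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def conditional_get(request_headers: dict, body: bytes):
    """Return (status, headers, body), answering 304 when the client copy is current."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    # If-None-Match may carry several comma-separated tags, or "*".
    client_tags = [
        tag.strip()
        for tag in request_headers.get("If-None-Match", "").split(",")
    ]
    if etag in client_tags or "*" in client_tags:
        return 304, headers, b""   # nothing to resend
    return 200, headers, body
```

Combined with a page cache, the body being hashed would usually be the cached copy, so a repeat visitor costs little more than a lookup and a short 304 response.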

I know many developers are wary of spending time on these things and would rather someone quietly threw more hardware at any performance problems, but caching should be seen as a core feature of any site that may attract non-trivial traffic. For existing sites, start by applying measures to the home page and work from there to target other popular or slow pages; it’s usually quite easy to pick off a few problem areas.

