Posts tagged with cache

Caching websites with Django and Memcached...

Published at July 12, 2011 | Tagged with: , , , , , , ,

... memcached installation, django configuration and how to use it for your website

After Caching web sites and web applications and When to use caching for web sites its time for a little sample. This sample shows the usage of Memcached for server-side application cache. The installation part is taken from my Ubuntu so it may differ depending from your OS/distribution.

What is Memcached: Memcached is a tool that allows you to store key-value pairs in you memory. The keys are limited to 250 Bytes and for better performance the value size is limited to 1MB(more details) but this size is fair enough for web usage.

Memcached installation:

apt-get install memcached
apt-get install python-memcache
The first line installs Memcached and the second one install Python API for communication between your application and Memcached daemon. After this the Memcached daemon is up and running. With default configuration it runs on port 11211 on localhost( If you want to modify this the configuration file(in my case) is situated in /etc/memcached.conf Django configuration: This one depends from the Django version that you use. For 1.2.5 and prior the next code should by added in your settings file(
CACHE_BACKEND = 'memcached://'
For 1.3 and development version add:
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '',
In both cases if you use different port and/or IP you have to replace them above. More info about cache backend configuration you can find in Django documentation docs. So now you have Memcached running and Django configured. If you have doubts about is this suitable/usable in you case take a look at the posts mentioned above or just add comment with your case and I will be happy to give you an advice. Now it is time to start using it. Cache usage(part I) - how to cache on Python level: If you have some heavy calculations in your view you can cache the result from this and use the calculated one to lower the load. Example:
from django.core.cache import cache

def heavy_view(request):
    cache_key = 'my_heavy_view_cache_key'
    cache_time = 1800 # time to live in seconds
    result = cache.get(cache_key)
    if not result:
        result = # some calculations here
        cache.set(cache_key, result, cache_time)
    return result
The process is simple, you ask the cache for a value corresponding to a given key(line 4). If the result is None you execute the code that generates it(line 8 ) and store it in the cache(line 9). My advice is to declare the key and time as variables cause this will ease their future changes. Cache usage(part II) - how to cache on template level: This is suitable for the cases when you have some heavy processing in the template(as regroup) or you want to cache only part of the template(as latest news section). Example:
{% load cache %}
 ... non cached content here ...
{% cache 1800 latest_news %}
    ... here are latest news - cached ...
{% endcache %}
The basic usage usage is {% cache time_in_seconds key %} ... {% endcache %} You can also cache code fragments based on dynamic properties, for example - current user recent conversations, just pass a 3rd param the uniquely identifies the code to be cached.
{% load cache %}
{% cache 300 recent_conversations %}
    ... current user recent conversations - cached ...
{% endcache %}

Final words: as you see from the examples above using Django and Memcached is really easy. Using it correctly will speed up your website and respectively improve your user experience(UX) and SEO. Using it wrong will provide negative results. Just take a moment and think what can be cached, how long can it be cached and is there a reason to be cached. Try to avoid double caching - there is no need to use caching in templates and then cache the rendered template in the view too.

When to use caching for web sites ...

Published at July 12, 2011 | Tagged with: , , , , , , ,

... five major question to ask yourself before using cache

After we learned about Why, Where, What and How of caching web sites now it is time to see when to use it.
The application cache is mainly used for speeding up and/or decreasing the load of frequently executed and(but not necessary) heavy resource using methods. So the first question is:

1) If you have method the consumes lots of CPU time and/or memory can it be optimised?

If you can optimize your code and make the method run faster and consume less resource than do this first and then reconsider whether you still need to cache it..

2) Will caching save you load/wait?

Have in mind that accessing cache has its own load. So caching the result of relatively light operations is pointless. Try to find where your biggest load/wait came from and use cache there.

3) For how long the cache should be valid?

This depend from how often the data is changed. We can split it in 3 major cases.

Case 1: The data changes on equal amounts of time, for example it is updated by cron. In this case you can set the expire time equal to the time interval in which the cron is runned. Example - you are reading a feed with the news from the last hour.

Case 2: The data is changed on random intervals of time - but not real time(i.e. from minutes to hours). In this case you should choose an average amount of time for which you think the data is persistent or even if it is changed you can serve the outdated one to the client. In this case the best way is to be able to invalidate cache when data changes. This is usable if the data is from the same type, for example - news list. Just add a line that invalidates cache on every news affecting operation(add/edit/delete). If you have composite data, for example mix of news, weather cast and currency rates for the day you'll just have to wait the cache to expire by it self.

Case 3: The data is updated real-time(every second, sometimes more than once in a second). If you are watching/playing on the market you really need real-time info for the current rates. But if don`t need the real-time date, for example you are displaying just informative graph with the daily changes you can cache and update it rarely(for example on every few minutes) then it changes.

So you have determined how long your cache should stay valid and the next question is:

4) Will the data be accessed more than once for the cache period?

If not then don't cache it or reconsider to increase cache validity time(question 2).

5) How big is the data to be cached?

Memcached has 1MB limit per cached item. For web(no matter is it site or application) this limit is fair enough for most/all of the cases. If you plan to store something bigger thing again and be sure you are not doing something wrong. If you really need to cache bigger amount of data consider to use another cache storage - database or file system.

If you are an advanced developer you will subconsciously know whether to use cache or not. But this questions may be really helpful if you are a novice.
I am open to hear what you ask yourself before using cache.


One more case when it is mandatory to use caching(if you can) is when you have frequent calls to an outer API. For example Retrieving Google Analytics Data.

Caching web sites and web applications...

Published at May 25, 2011 | Tagged with: , , , , , , , , , , ,

...Why, Where, What and How of caching web sites

Basics of web page load: Every time when you open an webpage this results in a series of requests and responses between the client(your browser) and the web server that hosts the requested sites(most of the times this includes more the one servers). Each request tells the server that the client wants specific resource(CSS/JavaScript/image/etc.) and each response carries the resource content.


The main purposes of caching is to decrease bandwidth and to improve website performance.

Where are resources cached: According to the image above there are 2 places where content can be cached: on the client(browser cache) and on the server(server cache). The first one(browser cache) is more suitable when the client often requests single recourse again and again. As an example for this can be pointed CSS-files/JavaScripts/Images and other resources that form the page layout(these are requested on every page load). The second one(server cache) is hold on the server. It also can be split in two major pieces web server cache(when entire resource is cached) and application cache(when data chunks inside the application are cached).

How browser cache works: In order to tell the browser to cache such resources you have to supply each response with the following headers:

  • Last-Modified - datetime indication of when the content was last modified/updated on the server
  • Expires - the datetime after which the response will be considered "out of date"
  • Cache-Control - a directives about whether the resource must be cached and its maximal age of validity
  • Date - origin servers datetime

When the client request the resource for a first time the response contains both the resource data and the headers mentioned above. This tells the browser to store the response body(data) into its own cache. On the next request for the same resource the browser sends "If-Modified-Since" header with datetime value identical to the "Last-Modified" received from the server on the previous request. If the cache is still valid the web server returns HTTP Status code "304 Not Modified" which tell the browser to retrieve the content from its cache. Otherwise if the resource is modified between the two requests the response contains the new data along with the corresponding date headers.
This one is extremely useful to decrease websites bandwidth and speed up their loading because the common resources are read directly from the client and not pulled again and again from the server over the network.
For static resources this can be set in the webserver configuration. For dynamic ones(generated images or resources pulled from the database) these headers must be provided and checked from the application itself.

How server cache works: As I stated above the server cache can can be split into web server cache and application cache. Both are used when a single resource is requested by multiple of clients. For example the pages HTML is requested by every site visitor. If the content is the same for all of them then the HTML computation(for dynamically generated pages) can be done only for the first request and the latter ones can get the precomputed one from cache. For this you will need a caching proxy. The most common applications that are used for this purpose are Squid, Varnish and NginX

The process is similar to the one on the first figure but here the client communicates with the proxy instead of directly to the web server. If the proxy does not has a copy of the requested resource it gets if from the web server and then serves it to the client. Otherwise it pulls the content from its cache and return it directly.

Application cache: Sometimes caching full page is not an option. For example if you have a news website with two columns: one that is same for all users and one that is customizable by user preferences you can not serve every one with the same page. But you can pre-compute first column HTML, store it in the application cache and retrieve it when need instead of computing it for every user. The most common tool for this is Memcached. Its implementation is application specific but in pseudo code it works the following way:

key = 'cached_item_unique_key'
result = cache.get(key)
if not result:
   # do some computation here
   result = computed_result
   cache.set(key, result, is_valid_period)
return result

Each cached item is represented in the cache with unique key. When requested if there is such data in the cache it is pulled directly from there and server. Otherwise the date is computed, stored in the cache and then served. Server cache will decrease server load and allow your pages to be computed faster.

Final words: Caching is a double-edge razor. Using if carefully and with comprehension will make your life easier. Playing with it without knowing what you do may ruin your application. Do not cache rarely requested content. Do not try to cache everything, cache also has its limits. Use it wise.
If there is something not well explained, or missed, or wrong feel free to ask/comment. Your participation will be appreciated. Also if you find this article helpful share it and help other too.