CDN strategies

A global website needs relevant content for its multicultural, sometimes worldwide, audience. This content needs to reach each visitor as quickly as possible to give them a great experience.

One solution is to deploy and mirror your application to several datacentres across the world.

A much more cost-effective solution is to use a CDN (Content Delivery Network) to serve your content from a geographical location close to your visitor.

A CDN will automatically determine which server is closest to a visitor and serve the content from there, instead of from your central datacentre. This improves performance by decreasing page load times.

A CDN can be set up in two different modes: “Push” and “Origin Pull”.

CDN Push

In “Push” mode a CDN is similar to having distributed storage space. You have to manage the CDN cache yourself, by uploading the files to the CDN storage server and referencing them in your website, effectively “pushing” the content into the CDN.

This approach ensures the assets are only published to the general public when you want them to be, because you explicitly define when the content is uploaded and when it expires.

The disadvantage of this approach is that the integration effort between your website and the CDN provider is high. The upload flow via your CMS or DAM must be integrated with the CDN provider. Alternative approaches are to upload assets manually to the CDN or use synchronization processes between the systems.

Either of these approaches will involve a significant amount of time to implement.
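
As an illustration of the synchronization approach mentioned above, here is a minimal sketch that decides which files would need to be pushed to, or removed from, the CDN storage. It assumes the local files and a manifest of content hashes for the CDN side are already available as dictionaries; the function name `plan_sync` and both data shapes are hypothetical, and the actual upload step depends entirely on your CDN provider.

```python
import hashlib


def plan_sync(local_files: dict[str, bytes], remote_manifest: dict[str, str]):
    """Return (paths to upload, paths to delete) for a push-mode CDN.

    local_files maps a path to the file's bytes. remote_manifest maps a path
    to the SHA-256 hex digest of what the CDN storage currently holds.
    Both shapes are assumptions made for this sketch.
    """
    local_digests = {
        path: hashlib.sha256(data).hexdigest() for path, data in local_files.items()
    }
    # Upload anything that is new or whose content differs from the CDN copy.
    to_upload = sorted(
        path
        for path, digest in local_digests.items()
        if remote_manifest.get(path) != digest
    )
    # Delete anything the CDN still holds that no longer exists locally.
    to_delete = sorted(path for path in remote_manifest if path not in local_digests)
    return to_upload, to_delete
```

Comparing content hashes rather than timestamps keeps the plan stable even when files are re-exported without changes.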

CDN Pull

In “Pull” mode a CDN acts as a transparent caching layer on top of your website. All requests to your website go to the CDN provider, which checks its local cache for a cached version of the requested asset and serves it if found. If there is no cached content, or if it has expired, the CDN requests it from your server and then serves it to the visitor. This approach is called “origin pull” as it “pulls” the content from the origin server.
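
The request flow described above can be sketched as a toy in-memory cache. This is only an illustration, not how any particular CDN is implemented: the class name `PullCache`, the fixed time-to-live and the injectable clock are all assumptions made for the example.

```python
import time
from typing import Callable


class PullCache:
    """Toy edge cache: serve from cache when fresh, otherwise pull from origin."""

    def __init__(
        self,
        fetch_origin: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch_origin = fetch_origin
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps a path to (expiry time, cached body).
        self._store: dict[str, tuple[float, bytes]] = {}

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the origin is not contacted
        # Cache miss or expired entry: pull from the origin and cache the result.
        body = self._fetch_origin(path)
        self._store[path] = (now + self._ttl, body)
        return body
```

The first call to `get` for a given path reaches the origin; later calls within the time-to-live are answered from the cache.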

In this mode, the CDN controls the cache contents and their expiration by inspecting the HTTP headers of the cached content. Typically it follows the “Cache-Control”, “Expires” and “Last-Modified” headers to determine whether an asset has expired. The cache is populated when a visitor first requests an asset, so there will be a small performance hit for the visitor who accesses an asset for the first time or just after it has expired.
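
To make the header inspection more concrete, here is a simplified sketch of how a shared cache might work out how long a response stays fresh. It is an assumption-laden illustration rather than any provider's actual logic: the function name `seconds_until_stale` is hypothetical, header names are expected in lowercase, and the “Age” header and “Last-Modified” based heuristics are deliberately ignored.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def seconds_until_stale(headers: dict[str, str], now: datetime) -> float:
    """Return how many seconds a shared cache may keep serving a response.

    `now` must be a timezone-aware datetime. Malformed header values are not
    handled in this sketch and will raise an exception.
    """
    directives: dict[str, str] = {}
    for part in headers.get("cache-control", "").split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    if "private" in directives or "no-store" in directives:
        return 0.0  # a shared cache must not store this response
    for key in ("s-maxage", "max-age"):  # s-maxage takes priority for shared caches
        if key in directives:
            return max(0.0, float(directives[key]))
    if "expires" in headers:
        expires = parsedate_to_datetime(headers["expires"])
        return max(0.0, (expires - now).total_seconds())
    return 0.0  # no explicit lifetime: treat the response as already stale
```

When both are present, the “max-age” directive of “Cache-Control” overrides “Expires”, which is why the sketch checks it first.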

A dynamic website needs mechanisms to invalidate content, as content can change at any time and changes cannot be predicted. This can be done manually, with a content editor using the CDN administration interface to remove the content from the cache. A more integrated approach is for the origin website to call the CDN API and invalidate the cache for content that has been modified.
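
As a rough sketch of the integrated approach, the origin website could build a purge request whenever content is modified. Invalidation APIs differ between providers, so everything specific here is hypothetical: the `/purge` endpoint, the JSON payload shape, the bearer token and the function name `build_purge_request` are assumptions for illustration only. The request is built but not sent.

```python
import json
import urllib.request


def build_purge_request(api_base: str, token: str, urls: list[str]) -> urllib.request.Request:
    """Build (but do not send) a purge request for a hypothetical CDN API."""
    payload = json.dumps({"urls": urls}).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_base}/purge",  # hypothetical endpoint
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # hypothetical auth scheme
            "Content-Type": "application/json",
        },
    )
```

A content-published hook in the CMS could then pass the URLs of the modified pages and send the request with `urllib.request.urlopen`, adapted to whatever your provider actually documents.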

You may find that the CDN does not have an API for content invalidation, or that your system does not give you the facility to call said API. In that scenario you can follow the rules below to automate content invalidation.

  1. Page content is never cached. This can be achieved by setting the “Cache-Control” HTTP header to “private”.

    A website that has personalisation features should follow this rule as page content is specific to each individual visitor. Caching a personalised page will have the effect of serving the same content to all visitors, defeating the purpose of personalisation.

    On non-personalised sites “Cache-Control” can be set to “public” with the “Expires” header set to a date/time in the very near future. This can be anything from a few minutes to a day, depending on how quickly your website needs to reflect changes without compromising on performance. Consider setting “Cache-Control” to “private” if you want the content to always be served directly from your server, so visitors always get the latest content.

    This is important to ensure your visitors receive the most up-to-date and most relevant content.

  2. Site resources, like CSS and JS files, should be versioned and cached with a long expiration date. This is achieved by setting the “Cache-Control” header to “public” and setting the “Expires” header to 1 year from now. The resource file name should carry a version parameter, e.g. style.css?v=1.1, so that whenever the file changes the version number is bumped, causing the CDN to load the new file. This can be controlled manually by a developer or automatically via the build server.

    In EPiServer you can use the CDN4EPiServer module.

  3. Site assets, like image and video files, should also be cached with a long expiration date. Again, this is achieved by setting the “Cache-Control” header to “public” and setting the “Expires” header to 1 year from now. On systems that use a DAM (digital asset management) system and where the CDN provider has an API, the DAM can be extended to notify the CDN of any file changes, invalidating the cache for that particular asset. If this is not feasible, versioning or URL rewriting strategies can be used to remove the asset from the cache. The URL is rewritten to include a version number or the last-modified timestamp of the file, so that a changed file gets a new URL and the stale copy is no longer requested.
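
The three rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names, the content kinds (“personalised”, “page”, “resource”, “asset”) and the five-minute default page lifetime are all hypothetical. For resources, a content hash stands in for the manually bumped version number, which is one way a build server could automate versioning.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(
    kind: str, now: datetime, page_ttl: timedelta = timedelta(minutes=5)
) -> dict[str, str]:
    """Return response headers for 'personalised', 'page', 'resource' or 'asset'.

    `now` must be a timezone-aware UTC datetime.
    """
    if kind == "personalised":
        return {"Cache-Control": "private"}  # rule 1: never cached by the CDN
    # Rule 1 (non-personalised pages): short lifetime. Rules 2 and 3: one year.
    lifetime = page_ttl if kind == "page" else timedelta(days=365)
    return {
        "Cache-Control": "public",
        "Expires": format_datetime(now + lifetime, usegmt=True),
    }


def versioned_url(path: str, content: bytes) -> str:
    """Rule 2: append a version parameter derived from the file content."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    return f"{path}?v={digest}"


def timestamped_url(path: str, last_modified: datetime) -> str:
    """Rule 3: append the file's last-modified time as a Unix timestamp."""
    return f"{path}?v={int(last_modified.timestamp())}"
```

Because the URL changes whenever the file changes, the CDN treats the new version as a different asset and fetches it from the origin, while the old entry simply ages out of the cache.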

China CDN

To better serve your visitors in China, it is recommended that you use a CDN provider that has edge servers in mainland China. ChinaCache is a strong option for reaching your visitors in China.

In order to have a presence in China, your site needs an ICP (Internet Content Provider) license. This is a permit issued by the Chinese Ministry of Industry and Information Technology (MIIT) that allows you to operate a website in the country.

The process is described here, and after your application is accepted your license number has to be displayed on your homepage at all times.