Practical AppCache Strategies
The Application Cache spec provides us with a very useful tool for enabling offline access to assets. This allows us to build web apps that can function whether or not the browser has an HTTP connection. AppCache is not a complete solution for offline apps, and it’s received some criticism for the fact that some effort is involved in getting it to work reliably. The goal of this post is to explore some concepts for using AppCache to create robust and reliable offline web apps. I’m going to assume that you are already familiar with the AppCache spec. If you need more background or a refresher, take a few minutes to read through these articles:
Recommended Reading
A Beginners Guide to Using The Application Cache
MDN: Using the Application Cache
Application Cache is a douchebag
Appcache, not so much a douchebag as a complete pain in the #$%^
Jake Archibald’s (in)famous ALA article “Application Cache is a douchebag” includes a great explanation of the pros and cons of AppCache, and of what types of apps are good candidates for its use. I recently had a project that was a perfect fit for AppCache: a system that manages and displays touch-screen navigable, media-rich product catalogs that sales reps can walk buyers through at trade shows, meetings, etc. One of the requirements was that the system should allow the creation of any number of presentations, and each presentation should be capable of having its own look and feel. Creating a new native app for each presentation would be cumbersome. I wanted the flexibility, rapid development, and quick iterations that web apps allow, but needed the presentations to be available offline. After briefly experimenting with local storage, saving binary assets base64 encoded, I decided to take a closer look at AppCache.
What would we like to improve/work around?
There are a couple of challenges that I see as the major problems that must be solved in order to implement a robust AppCache solution in a data-driven web app.
- Populating the manifest with both your static assets (css, js, static images that make up your client), and your dynamic assets (uploaded by back end users, potentially changing quite often)
- The “feature” of AppCache that invalidates a cache if the manifest changes while the cache is being refreshed, leaving the user without a viable cache (will not revert to previous known-good cache). This is the Achilles heel of AppCache, IMO.
This is how I solved both of these problems.
Our setup
I’m going to show simplified code from my app, which uses the Zend Framework on the back end and Backbone/MarionetteJS on the front end to deliver a single-page app. None of the critical features are specific to those technologies, so it should be easy enough to apply these principles to the stack of your choosing.
First we need a controller and action to render our SPA. Nothing fancy here: we’re just resolving the thing we want to display and stuffing its data into the view in an array format. The view will render the data as JSON strings.
class IndexController extends Zend_Controller_Action
{
public function displayAction()
{
try
{
$slug = $this->_getParam('slug');
try
{
$presentation = Presentation::fetchBySlug($slug);
}
catch(Exception $e)
{
$this->_forward('pagenotfound', 'error', null, null);
return true;
}
/*
Fetch our data in a simple array format that we can pass to the view
*/
$presentationData = $presentation->clientDetail();
$productData = Product::fetchClientDetailsForPresentation($presentation);
$this->view->presentation = $presentation;
$this->view->presentationData = $presentationData;
$this->view->productData = $productData;
}
catch(Exception $e)
{
$this->_handleException($e);
}
}
[...]
}
And here is our view template. Pretty standard stuff here, but note the manifest attribute in the html tag. Since I’m serving a manifest for each of my presentations, I have a method on my presentation class to generate the path to it, but you can do whatever you want to get the relative path to the manifest.
<!DOCTYPE html>
<html lang="en" manifest="{$presentation->getAppcacheManifestRelativePath()}">
<head>
<meta charset="utf-8">
<meta http-equiv="Content-type" content="text/html;charset=UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="msapplication-tap-highlight" content="no">
<title>{$presentation->getTitle()}</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="robots" content="noindex,nofollow">
<link href="/client/css/main.css" rel="stylesheet">
<link href="/client/css/nprogress.css" rel="stylesheet">
</head>
<body role="document">
<div id="presentation"></div>
<script type="text/javascript">
var presentation = {$presentationData|json_encode};
var products = {$productData|json_encode};
</script>
<script type="text/javascript" data-main="/client/js/built.js" src="/client/bower_components/requirejs/require.js"></script>
</body>
</html>
Populating the manifest
Now to address our first problem, we need a way to populate our manifest with our static and dynamic assets as painlessly as possible. We will maintain a text file in our client that lists our static assets that should be included in the manifest. Since it would be tedious and error-prone to have to maintain an explicit list of all of the files we want included, let’s use some wildcards that our server code can use to find all of the individual files for us. Also note the comment – these will be included by the back end code, so we can bump the version here to force the browsers to refresh their cache if we make a change to the static assets (doesn’t have to be a version, any text change in the manifest will trigger a refresh in the browser). I call this file appcache.mf, and place it in the root of the directory that my client code lives in.
# Client v1.0.39
bower_components/requirejs/require.js
js/built.js
img/*
css/*
fonts/*
Nice and concise, and unless I add something major, I never have to worry about updating it when my static files change.
And now we need to do the real work to generate the manifest. This is an abridged version of a method on my presentation class. First, it gathers all of the objects that are related to my presentation, and gets the relative paths to all of their assets that should be included in the manifest. Next, it parses the text file that we maintain in the client, which lists our static assets, and appends their paths to the manifest. Note that I am including a timestamp in a comment so that the browser will know that it needs to refresh even if nothing else in the manifest has changed. Doing this allows us to tell the browsers to update their cache when the actual data for our app, which is sent in our HTML page, has been updated.
public function getLatestManifest()
{
$output = array(
'CACHE MANIFEST',
'# '.$this->getTitle(),
'# '.$this->getLastUpdateTime(),
'CACHE:'
);
$products = $this->getProducts();
foreach ($products as $currProduct)
{
$output[] = $currProduct->getImageFileRelativePath();
}
$clientRelativePath = '/client/';
$clientAbsolutePath = HTDOCS_DIR.$clientRelativePath;
$handle = fopen($clientAbsolutePath.'/appcache.mf', "r");
if ($handle)
{
while (($buffer = fgets($handle, 4096)) !== false)
{
$currLine = trim($buffer);
/*
If the current line is not a comment, but ends in a directory wildcard,
go fetch the list of files to be included
*/
if($currLine[0]!='#' && substr($currLine, -2)=='/*')
{
/*
If our client path has a trailing slash, make sure our file entries do not start with one.
*/
if(substr($clientAbsolutePath, -1)=='/')
{
$currLine = (substr($currLine, 0, 1)=='/') ? substr($currLine, 1):$currLine;
}
/*
Strip wildcard and trailing slash
*/
$currSearchPath = $clientAbsolutePath.substr($currLine, 0, -2);
/*
Use regex to exclude anything we don't want added to the manifest,
for example OSX .DS_Store files. Better to prevent these from being deployed in the first place,
but let's give ourselves the capability in case we need it.
*/
$currIncludeList = $this->dirlist($currSearchPath, '/^.+\.(?!DS_Store$)/i', HTDOCS_DIR);
$output = array_merge($output, $currIncludeList);
}
else
{
/*
Format explicit file entries
*/
if($currLine[0]!='#')
{
/*
If our client path has a trailing slash, make sure our file entries do not start with one.
*/
if(substr($clientRelativePath, -1)=='/')
{
$currLine = (substr($currLine, 0, 1)=='/') ? substr($currLine, 1):$currLine;
}
$currLine = $clientRelativePath.$currLine;
}
$output[] = $currLine;
}
}
fclose($handle);
}
/*
Send everything else to the network. In our app, that should only be the ajax requests to the manifest safety actions.
*/
$output[] = 'NETWORK:';
$output[] = '*';
return implode("\n", $output);
}
public function dirlist($path, $regex=null, $excludePath=null)
{
$output = array();
$path = realpath($path);
$dirIterator = new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS);
$mainIterator = new RecursiveIteratorIterator($dirIterator);
if(!empty($regex))
{
$objects = new RegexIterator($mainIterator, $regex, RecursiveRegexIterator::GET_MATCH);
}
else
{
$objects = $mainIterator;
}
foreach($objects as $name => $object)
{
$output[] = (empty($excludePath)) ? $name:substr($name, strlen($excludePath));
}
return $output;
}
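To make the shape of the generated manifest easier to see, here is a minimal JavaScript sketch of the same assembly steps. It is not the code from my app: buildManifest, expandDir, and all of the paths, titles, and timestamps are made up for illustration, and the directory lookup is passed in as a function so the example can run without touching the file system.

```javascript
// Sketch only: buildManifest and expandDir are hypothetical names,
// and every path below is invented for the example.
function buildManifest(options) {
  var lines = [
    'CACHE MANIFEST',
    '# ' + options.title,
    '# ' + options.updatedAt, // any change to this comment triggers a refresh
    'CACHE:'
  ];
  // Dynamic assets (for example product images) are listed first.
  options.dynamicAssets.forEach(function (path) {
    lines.push(path);
  });
  // Then walk the static asset list, one entry per line.
  options.staticEntries.forEach(function (raw) {
    var entry = raw.trim();
    if (entry === '') {
      return; // ignore blank lines
    }
    if (entry.charAt(0) === '#') {
      lines.push(entry); // comments (the client version) pass through unchanged
    } else if (entry.slice(-2) === '/*') {
      // Directory wildcard: ask the caller for the files in that directory.
      options.expandDir(entry.slice(0, -2)).forEach(function (path) {
        lines.push(path);
      });
    } else {
      // Explicit file: prefix with the client path, avoiding a double slash.
      lines.push('/client/' + entry.replace(/^\//, ''));
    }
  });
  lines.push('NETWORK:', '*');
  return lines.join('\n');
}

var manifest = buildManifest({
  title: 'Demo presentation',
  updatedAt: '2014-01-01 12:00:00',
  dynamicAssets: ['/uploads/products/1.jpg'],
  staticEntries: ['# Client v1.0.39', 'js/built.js', 'css/*'],
  expandDir: function (dir) {
    return dir === 'css' ? ['/client/css/main.css', '/client/css/nprogress.css'] : [];
  }
});
console.log(manifest);
```

With those inputs, the output starts with the CACHE MANIFEST header and the two comments, lists the product image and the static files under CACHE:, and ends with the NETWORK: wildcard.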
We’ll need to add an action to our controller to serve the manifest. Here is my routing for the presentation display and manifest.
<?xml version="1.0"?>
<configdata>
<production>
<routes>
<presentationDisplay>
<type>Zend_Controller_Router_Route_Regex</type>
<route>p/([^/]*)/?</route>
<defaults>
<module>presentations</module>
<controller>index</controller>
<action>display</action>
</defaults>
<map>
<slug>1</slug>
</map>
</presentationDisplay>
<presentationManifest>
<type>Zend_Controller_Router_Route_Regex</type>
<route>p/([^\.]*)\.appcache</route>
<defaults>
<module>presentations</module>
<controller>index</controller>
<action>manifest</action>
</defaults>
<map>
<slug>1</slug>
</map>
</presentationManifest>
</routes>
</production>
<staging extends="production">
</staging>
<test extends="production">
</test>
<dev extends="production">
</dev>
</configdata>
And here is a simple action to serve the manifest. Note the headers: it’s critical that the manifest not be cached, and it must be served with the text/cache-manifest MIME type.
public function manifestAction()
{
try
{
$this->_helper->viewRenderer->setNoRender(true);
$slug = $this->_getParam('slug');
try
{
$presentation = Presentation::fetchBySlug($slug);
$manifest = $presentation->getLatestManifest();
header('Cache-Control: no-cache, must-revalidate');
header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header('Content-type: text/cache-manifest');
echo $manifest;
}
catch(Exception $e)
{
$this->getResponse()->setRawHeader('HTTP/1.1 404 Not Found');
}
}
catch(Exception $e)
{
$this->_handleException($e);
}
}
At this point, we should have a fully functioning AppCache setup that works with both our static and dynamic assets, and frees you from having to think about maintaining the cache manifest.
Preventing cache invalidation when manifest changes during sync
So now on to our other major problem, preventing failed updates when changes are made to the manifest during download. The theory here is pretty simple. What we’re going to do is send an XHR to the server as soon as the browser starts downloading the latest version of the cache. When the server receives that request, it will cache the manifest in the user’s session, and the manifest action will check the session for a cached manifest until we send another XHR telling the server to clear the cache, which we will do after either a successful cache refresh, or if there is an error downloading the cache.
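Before looking at the server code, it may help to see the idea in isolation. The following is a small JavaScript model of the pinning logic, assuming an in-memory object stands in for the user’s session. The name createManifestPins and its methods are invented for this sketch and do not appear in the app.

```javascript
// Sketch only: an in-memory stand-in for the per-session manifest store.
function createManifestPins() {
  var pins = {};
  return {
    // Called when the browser reports that a cache download has started.
    start: function (id, latestManifest) {
      pins[id] = latestManifest;
    },
    // Called after a successful refresh or after an error.
    end: function (id) {
      delete pins[id];
    },
    // Called for every manifest request: the pinned copy wins if there is one.
    serve: function (id, latestManifest) {
      return Object.prototype.hasOwnProperty.call(pins, id) ? pins[id] : latestManifest;
    }
  };
}

var pins = createManifestPins();
pins.start(7, 'manifest v1');
console.log(pins.serve(7, 'manifest v2')); // still the pinned copy while downloading
pins.end(7);
console.log(pins.serve(7, 'manifest v2')); // back to the latest manifest
```

While a download is in progress, the second manifest check sees the same text as the first, even if someone has changed the presentation in the meantime.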
First, let’s update our manifest action to check the session for a cached version.
public function manifestAction()
{
try
{
$this->_helper->viewRenderer->setNoRender(true);
$slug = $this->_getParam('slug');
try
{
/*
The manifest is checked twice during an update, once at the start and once after all cached files have been updated.
If the manifest has changed during the update, it's possible the browser fetched some files from one version,
and other files from another version, so it doesn't apply the cache and retries later.
*/
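$session = new Zend_Session_Namespace('appcache');
$presentation = Presentation::fetchBySlug($slug);
/*
The start/end download actions store the manifest under the primary key, so look it up by the same key here.
*/
$key = $presentation->getPrimaryKey();
if (isset($session->manifestList) && array_key_exists($key, $session->manifestList))
{
$manifest = $session->manifestList[$key];
}
else
{
$manifest = $presentation->getLatestManifest();
}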
header('Cache-Control: no-cache, must-revalidate');
header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header("Content-type: text/cache-manifest");
echo $manifest;
}
catch(Exception $e)
{
$this->getResponse()->setRawHeader('HTTP/1.1 404 Not Found');
}
}
catch(Exception $e)
{
$this->_handleException($e);
}
}
Now, we will add an action to handle the XHR indicating that a cache download is underway.
public function startappcachedownloadAction()
{
try
{
$this->_helper->viewRenderer->setNoRender(true);
$id = $this->_getParam('id');
try
{
$session = new Zend_Session_Namespace('appcache');
$presentation = Presentation::fetchById($id);
$manifest = $presentation->getLatestManifest();
if(!isset($session->manifestList))
{
$session->manifestList = array();
}
$session->manifestList[$presentation->getPrimaryKey()] = $manifest;
$response = array('status'=>'success');
header('Content-Type: application/json');
echo Zend_Json::encode($response);
return;
}
catch(Exception $e)
{
$this->getResponse()->setRawHeader('HTTP/1.1 404 Not Found');
}
}
catch(Exception $e)
{
$this->_handleException($e);
}
}
And the corresponding action to clear the cached manifest.
public function endappcachedownloadAction()
{
try
{
$this->_helper->viewRenderer->setNoRender(true);
/*
Best to use a primary key rather than a volatile field, so changes to that field during sync
won't trip us up.
*/
$id = $this->_getParam('id');
try
{
$session = new Zend_Session_Namespace('appcache');
$presentation = Presentation::fetchById($id, true);
$manifest = $presentation->getLatestManifest();
if(isset($session->manifestList))
{
unset($session->manifestList[$presentation->getPrimaryKey()]);
}
$response = array('status'=>'success');
header('Content-Type: application/json');
echo Zend_Json::encode($response);
return;
}
catch(Exception $e)
{
$this->getResponse()->setRawHeader('HTTP/1.1 404 Not Found');
}
}
catch(Exception $e)
{
$this->_handleException($e);
}
}
On the client side, I have a Marionette controller that handles all of the appcache functionality. Now would be a good time to review the applicationCache events, if you haven’t already.
define([
'underscore',
'backbone',
'marionette',
'vent',
'cmd',
'reqres',
'nprogress',
'views/AppcacheConfirmModal'
], function(_, Backbone, Marionette, vent, cmd, reqres, NProgress, AppcacheConfirmModalView)
{
return Backbone.Marionette.Controller.extend(
{
appCache: window.applicationCache,
startDownloadSent: false,
updateDeclined: false,
initialize: function()
{
var ctx = this;
// Fired after the first cache of the manifest.
this.appCache.addEventListener('cached', function(e)
{
ctx.handleCachedEvent(e);
}.bind(this), false);
// Checking for an update. Always the first event fired in the sequence.
this.appCache.addEventListener('checking', function(e)
{
ctx.handleCheckingEvent(e);
}.bind(this), false);
// An update was found. The browser is fetching resources.
this.appCache.addEventListener('downloading', function(e)
{
ctx.handleDownloadingEvent(e);
}.bind(this), false);
// The manifest returns 404 or 410, the download failed,
// or the manifest changed while the download was in progress.
this.appCache.addEventListener('error', function(e)
{
ctx.handleCacheError(e);
}.bind(this), false);
// Fired when the update check finds that the manifest has not changed.
this.appCache.addEventListener('noupdate', function(e)
{
ctx.handleNoupdateEvent(e);
}.bind(this), false);
// Fired if the manifest file returns a 404 or 410.
// This results in the application cache being deleted.
this.appCache.addEventListener('obsolete', function(e)
{
ctx.handleObsoleteEvent(e);
}.bind(this), false);
// Fired for each resource listed in the manifest as it is being fetched.
this.appCache.addEventListener('progress', function(e)
{
ctx.handleProgressEvent(e);
}.bind(this), false);
// Fired when the manifest resources have been newly redownloaded.
this.appCache.addEventListener('updateready', function(e)
{
ctx.handleUpdateReady(e);
}.bind(this), false);
cmd.setHandler('appcache.checkForUpdate', function()
{
ctx.checkForUpdate();
});
cmd.setHandler('appcache.doReload', function()
{
ctx.doReload();
});
cmd.setHandler('appcache.declineReload', function()
{
ctx.declineReload();
});
/*
We'll be using actual progress info to update NProgress, so turn the trickle off.
*/
NProgress.configure({ trickle: false });
},
handleCheckingEvent: function(e)
{
vent.trigger('appcache.checking');
},
handleDownloadingEvent: function(e)
{
vent.trigger('appcache.downloading');
/*
Tell the server to cache the current version of the manifest until our
download is complete
*/
this.sendStartDownloadMessage();
},
handleCachedEvent: function(e)
{
vent.trigger('appcache.cached');
/*
We're done,
tell the server to clear the cached manifest in our session
*/
this.sendEndDownloadMessage();
},
handleCacheError: function(e)
{
vent.trigger('appcache.cacheError');
/*
Something bad happened,
Kill NProgress and tell the server to clear the cached manifest in our session
*/
NProgress.done();
this.sendEndDownloadMessage();
},
handleNoupdateEvent: function(e)
{
vent.trigger('appcache.noUpdate');
},
handleObsoleteEvent: function(e)
{
vent.trigger('appcache.obsolete');
/*
We should never get here, since we protect against this state, but let's be safe.
Kill NProgress and tell the server to clear the cached manifest in our session
*/
NProgress.done();
this.sendEndDownloadMessage();
},
handleProgressEvent: function(e)
{
/*
The browser will initiate the download before our scripts have set up,
so as soon as we start receiving download events, check to see if the start message
has been sent, and if not, fire it off.
Also initialize or update NProgress with our current progress, then broadcast the state
in case something else in our app wants to know about it.
*/
if(!this.startDownloadSent)
{
this.sendStartDownloadMessage();
}
NProgress.set(e.loaded/e.total);
vent.trigger('appcache.progress', e);
},
handleUpdateReady: function(e)
{
/*
Our cache has been refreshed and is ready to reload.
Broadcast a message to let the rest of our app know,
tell the server to clear the cached manifest in our session,
prompt the user to refresh
*/
vent.trigger('appcache.updateReady');
this.sendEndDownloadMessage();
this.promptForRefresh();
},
getStatus: function()
{
var appCache = this.appCache;
switch (appCache.status)
{
case appCache.UNCACHED: // UNCACHED == 0
return 'UNCACHED';
case appCache.IDLE: // IDLE == 1
return 'IDLE';
case appCache.CHECKING: // CHECKING == 2
return 'CHECKING';
case appCache.DOWNLOADING: // DOWNLOADING == 3
return 'DOWNLOADING';
case appCache.UPDATEREADY: // UPDATEREADY == 4
return 'UPDATEREADY';
case appCache.OBSOLETE: // OBSOLETE == 5
return 'OBSOLETE';
default:
return 'UNKNOWN CACHE STATUS';
}
},
checkForUpdate: function()
{
/*
If we have an update ready, prompt for the refresh.
If not, check for updates
*/
if (this.appCache.status == this.appCache.UPDATEREADY)
{
this.promptForRefresh();
}
else
{
this.appCache.update();
}
},
promptForRefresh: function()
{
/*
If the user has not already declined a reload, prompt them to do so
*/
if(!this.updateDeclined)
{
var modalView = new AppcacheConfirmModalView();
Presentation.app.topRegion.currentView.modal.show(modalView);
}
},
doReload: function()
{
/*
Simply reload the document, browser will use the new cache automatically
*/
window.location.reload();
},
declineReload: function()
{
/*
Let's set a flag that we can reference later so we don't pester the user with repeated prompts
*/
this.updateDeclined = true;
},
sendStartDownloadMessage: function()
{
/*
Tell the server to cache our manifest
*/
$.ajax({
url: "/presentations/index/startappcachedownload",
data: {
'id': Presentation.app.presentation.get('id'),
},
cache: false,
dataType: 'json',
success: function(data)
{
Presentation.debug('startappcachedownload success');
},
error: function(xhr, textStatus, errorThrown)
{
Presentation.debug('startappcachedownload error');
}
});
this.startDownloadSent = true;
},
sendEndDownloadMessage: function()
{
/*
Tell the server to clear our cached manifest
*/
$.ajax({
url: "/presentations/index/endappcachedownload",
data: {
'id': Presentation.app.presentation.get('id'),
},
cache: false,
dataType: 'json',
success: function(data)
{
Presentation.debug('endappcachedownload success');
},
error: function(xhr, textStatus, errorThrown)
{
Presentation.debug('endappcachedownload error');
}
});
this.startDownloadSent = false;
}
});
});
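If you are not using Marionette, the part of this controller that matters for the manifest safety net is small. Here is a rough framework-free sketch of it. The names wireAppCacheGuard and api are hypothetical, and the start/end callbacks stand in for the two ajax calls above. Unlike my controller, this version also checks the flag on the downloading event, so the start message is sent at most once per download.

```javascript
// Sketch only: wireAppCacheGuard and the api object are invented for this example.
// appCache is expected to behave like window.applicationCache.
function wireAppCacheGuard(appCache, api) {
  var startSent = false;

  function start() {
    // The download may begin before our scripts are set up, so any of the
    // download events can be the first one we see. Only notify once.
    if (!startSent) {
      startSent = true;
      api.start();
    }
  }

  function end() {
    startSent = false;
    api.end();
  }

  // Events that mean a download is underway.
  ['downloading', 'progress'].forEach(function (name) {
    appCache.addEventListener(name, start, false);
  });
  // Events that mean the download finished or failed.
  ['cached', 'updateready', 'error', 'obsolete'].forEach(function (name) {
    appCache.addEventListener(name, end, false);
  });
}
```

In the browser you would call it as wireAppCacheGuard(window.applicationCache, { start: ..., end: ... }), with the two callbacks sending the start and end requests to the server.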
And in our application file, we simply bootstrap our models, controllers, and other stuff, including the Appcache controller.
/*global $*/
define(
[
'jquery',
'backbone',
'marionette',
'jquery.bootstrap',
'vent',
'cmd',
'routers/router',
'controllers/View',
'controllers/Appcache',
'views/AppLayout',
'collections/Products',
'models/Presentation'
],
function(
$,
Backbone,
marionette,
bootstrap,
vent,
cmd,
Router,
ViewController,
AppcacheController,
AppLayout,
ProductCollection,
PresentationModel
)
{
"use strict";
var app = new marionette.Application();
/*
Set up main region and render the app layout
*/
app.addRegions({
topRegion: "#presentation"
});
var layoutView = new AppLayout();
app.topRegion.show(layoutView);
app.addInitializer(function()
{
/*
Initialize our global controllers
*/
this.appcacheController = new AppcacheController();
this.viewController = new ViewController();
/*
Set global models/collections populated with bootstrapped data
*/
app.presentation = new PresentationModel(presentation);
app.productCollection = new ProductCollection(products);
/*
We're using Marionette.AppRouter, pass in our ViewController, which contains the methods referenced in the routes
*/
var routerOptions = {
'controller':this.viewController
};
this.router = new Router(routerOptions);
Backbone.history.start();
});
return app;
});
And that’s all there is to it, just a handful of simple techniques to take AppCache from douchebag to dependable workhorse.
Keep in mind that if your application could be thrown into an inconsistent state by loading a cache with files that span two different versions of the manifest, you will want to ensure that the browser downloads the files that are valid for the manifest that is cached. I minimize the risk by bundling up my JavaScript with the RequireJS Optimizer, so it is unlikely that any changes to the assets will cause problems for a client that is updating its cache while new assets are being deployed.