
An interesting web application infrastructure issue


Many web applications require cache control for their pages, especially if they involve user logons or time-dependent data.

Usually this is achieved with HTTP headers - something like (in JSP):

response.setHeader("Cache-Control", "no-cache");
response.setHeader("Expires", "Fri, 30 May 1980 14:00:41 GMT");

An alternative, which usually works well, is to require your site to be run under HTTPS. In theory, this seems ideal: intermediate caches cannot read encrypted traffic, so you get security as well as cache control.

However, beware of the impact of things like reverse proxies. Many companies are installing reverse proxies in front of their web hosting machines to filter requests, which provides some protection against SQL injection and XSS attacks on their websites. This is a really good idea, but there can be some unexpected impacts.

One I didn’t expect was the impact on caching. Because the proxy needs to inspect the request, it decrypts it, then forwards it to the server as a plain HTTP request. Many vendors list this as a feature, because it offloads some processing from the webserver to the proxy box. The catch comes if you don’t have explicit cache control in your pages AND the reverse proxy is a caching reverse proxy. In this case the proxy may return the cached content to the user, which is NOT WHAT YOU WANT!
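One way to catch this before a proxy does is to check which pages go out without explicit headers. The following is only a rough sketch in plain Java: the class name CacheHeaderAudit and its missing method are made up for illustration, and it only tests whether the two headers are present, not whether a real cache would actually store the response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library.
final class CacheHeaderAudit {
    // The two headers this post recommends always sending explicitly.
    private static final String[] REQUIRED = {"Cache-Control", "Expires"};

    /** Returns the required header names that are absent or blank in the response headers. */
    static List<String> missing(Map<String, String> responseHeaders) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!isPresent(responseHeaders, name)) {
                result.add(name);
            }
        }
        return result;
    }

    // HTTP header names are case-insensitive, so compare ignoring case.
    private static boolean isPresent(Map<String, String> headers, String name) {
        for (Map.Entry<String, String> e : headers.entrySet()) {
            String value = e.getValue();
            if (name.equalsIgnoreCase(e.getKey()) && value != null && !value.trim().isEmpty()) {
                return true;
            }
        }
        return false;
    }
}
```

A page whose headers come back with a non-empty list from CacheHeaderAudit.missing(...) is relying on HTTPS (or luck) for its cache behaviour.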

Another complicating factor is that some reverse proxies forward the original HTTP 1.1 requests to the server as HTTP 1.0, and seem to ignore HTTP 1.1 headers that are returned. This can bite you if you only use the “Cache-Control” (HTTP 1.1 only) header.

Lesson: Always provide explicit cache control AND expiry headers, and never rely on HTTPS to control caching for you.
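As a minimal sketch of that lesson, the helper below collects one set of headers meant to be understood by both HTTP 1.1 and HTTP 1.0 intermediaries. The class name NoCacheHeaders and its build method are hypothetical. The stricter "no-store, must-revalidate" directives and the "Pragma" header are my additions beyond the snippet earlier in this post, so treat them as assumptions to adapt, not as the one right answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library.
final class NoCacheHeaders {
    /** Builds headers that ask both HTTP 1.1 and HTTP 1.0 caches not to reuse the response. */
    static Map<String, String> build() {
        Map<String, String> headers = new LinkedHashMap<>();
        // HTTP 1.1 caches: do not store the response, and revalidate before any reuse.
        headers.put("Cache-Control", "no-cache, no-store, must-revalidate");
        // HTTP 1.0 caches: Pragma is commonly sent for older intermediaries (assumption: yours honours it).
        headers.put("Pragma", "no-cache");
        // Both versions: an expiry date in the past marks the response as already stale.
        headers.put("Expires", "Thu, 01 Jan 1970 00:00:00 GMT");
        return headers;
    }
}
```

In a JSP or servlet you would loop over NoCacheHeaders.build().entrySet() and call response.setHeader(entry.getKey(), entry.getValue()) for each entry before writing any output.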