The Task of Optimizing Assets for Production Websites

  • A nicely designed blog loads 5-10 JavaScript files and 2-5 stylesheets on average, and sometimes more than 20 of each.
  • We want a fast-loading site.
  • We don't want to do repetitive tasks (we're developers!).
  • Compressing assets takes a lot of time and effort (or at least learning how to do it correctly the first time does).

The goal is to minimize having to do the same thing twice. My dad always said:

Never handle a document twice

The reason: every day you'll get more documents and have less time to go back to the previous ones. Handle the documents well the first time. Put thought into doing whatever you need to do with them now, and you'll be thankful later that you spent the time.

Rules for Website Asset Optimization

Here are the top rules of website performance optimization as they relate to static assets (with quotes from YSlow):

  1. Minify Assets (remove whitespace) > Minification removes unnecessary characters from a file to reduce its size, thereby improving load times. When a file is minified, comments and unneeded white space characters (space, newline, and tab) are removed. This improves response time since the size of the download files is reduced.
  2. Compress Assets (gzip) > Compression reduces response times by reducing the size of the HTTP response. Gzip is the most popular and effective compression method currently available and generally reduces the response size by about 70%. Approximately 90% of today's Internet traffic travels through browsers that claim to support gzip.
  3. Minimize HTTP Requests > Decreasing the number of components on a page reduces the number of HTTP requests required to render the page, resulting in faster page loads. Some ways to reduce the number of components include: combine files, combine multiple scripts into one script, combine multiple CSS files into one style sheet, and use CSS Sprites and image maps.
  4. Use Cache Headers
  5. Reduce DNS lookups (2-4 max) > The Domain Name System (DNS) maps hostnames to IP addresses, just like phonebooks map people's names to their phone numbers. When you type URL www.yahoo.com into the browser, the browser contacts a DNS resolver that returns the server's IP address. DNS has a cost; typically it takes 20 to 120 milliseconds for it to look up the IP address for a hostname. The browser cannot download anything from the host until the lookup completes.
  6. Keep Components under 25K > This restriction is related to the fact that iPhone won't cache components bigger than 25K. Note that this is the uncompressed size. This is where minification is important because gzip alone may not be sufficient.
  7. Use a Content Delivery Network
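
To get a feel for how rules 1, 2, and 6 interact, here's a minimal Python sketch that measures one asset before and after minifying and gzipping. The `naive_minify` function and the sample stylesheet are made up for this illustration; a real project would use a proper minifier, since this one only strips `/* ... */` comments and surrounding whitespace.

```python
import gzip
import re

# Rule 6 above: components should stay under 25K uncompressed.
MAX_UNCOMPRESSED_BYTES = 25 * 1024


def naive_minify(source):
    """Rough stand-in for a real minifier (illustration only).

    Removes /* ... */ comments, leading/trailing whitespace on each line,
    and blank lines. It does not understand strings or JavaScript syntax.
    """
    without_comments = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    stripped = (line.strip() for line in without_comments.splitlines())
    return "\n".join(line for line in stripped if line)


def size_report(source):
    """Return byte sizes for the original, minified, and gzipped forms."""
    minified = naive_minify(source).encode("utf-8")
    return {
        "original": len(source.encode("utf-8")),
        "minified": len(minified),
        "minified_gzipped": len(gzip.compress(minified)),
        "under_25k": len(minified) <= MAX_UNCOMPRESSED_BYTES,
    }


# Hypothetical stylesheet, repeated so gzip has something to work with.
SAMPLE_CSS = """
/* Hypothetical rule used only for this demonstration */
.post {
    margin: 0 auto;
    padding: 10px 20px;
}
""" * 50

report = size_report(SAMPLE_CSS)
print(report)
```

The exact numbers depend on the input, so treat the report as a way to compare your own files rather than as a benchmark.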

Github Pages Solves the Asset Optimization Problem

Github Pages is a really neat project. This site is run on Github Pages. It allows you to simply git push to deploy.

Github uses Nginx 0.7.61 (Nginx 0.8.4 is out at the time of this writing) to quickly serve up their (your) web pages, and gzips the content automatically. Gzipping can shrink the response by as much as 70%!

Here's what a typical response header looks like from a Github Pages page:

Cache-Control:max-age=86400
Content-Encoding:gzip
Content-Type:text/html
Date:Thu, 29 Jul 2010 19:47:17 GMT
Expires:Fri, 30 Jul 2010 19:47:17 GMT
Last-Modified:Tue, 27 Jul 2010 06:05:54 GMT
Server:nginx/0.7.61

What that means is every time you deploy a change to your Github Pages repository, it updates the cache and gzips your content. That saves you a good deal of thought and work.
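
If you want to see how long a browser may keep a cached copy, here's a small Python sketch that reads the cache lifetime out of the sample headers above. The helper names are my own, and the header values are copied from the sample response; nothing here talks to Github.

```python
from email.utils import parsedate_to_datetime

# Header values copied from the sample response above.
headers = {
    "Cache-Control": "max-age=86400",
    "Date": "Thu, 29 Jul 2010 19:47:17 GMT",
    "Expires": "Fri, 30 Jul 2010 19:47:17 GMT",
}


def max_age_seconds(cache_control):
    """Return the max-age directive as an int, or None if it is absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def expires_lifetime_seconds(response_headers):
    """Return the number of seconds between the Date and Expires headers."""
    served_at = parsedate_to_datetime(response_headers["Date"])
    expires_at = parsedate_to_datetime(response_headers["Expires"])
    return int((expires_at - served_at).total_seconds())


max_age = max_age_seconds(headers["Cache-Control"])
lifetime = expires_lifetime_seconds(headers)
print(max_age / 3600, "hours via max-age")
print(lifetime / 3600, "hours via Expires")
```

Both headers say the same thing here: the page may be cached for 24 hours.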

Store Your Assets on Github Pages

Let's set up a quick project for your assets. First create a new Github repository named my-name-cache or something, then:

mkdir my-name-cache
cd my-name-cache
git init
touch index.html
touch test.js
echo "alert('cached');" >> test.js
git add .
git commit -a -m 'test github pages'
git remote add origin git@github.com:your-username/my-name-cache.git
git push origin master
git branch gh-pages
git push origin gh-pages

Go to http://your-username.github.com/my-name-cache/test.js and you'll see the script being served (include it in a page to see the alert). Use Google Chrome or Firefox to check the response headers to confirm gzip compression if you'd like.
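
If you'd rather check from a script than from the browser, here's a rough Python sketch of the same check. The function names are hypothetical, and `check_url` makes a real network request, so it's defined here but not called; pass it your own URL when you try it.

```python
import urllib.request


def build_gzip_request(url):
    """Build a request that tells the server we accept gzip responses."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def is_gzip_encoded(response_headers):
    """Return True if the Content-Encoding header lists gzip."""
    encoding = response_headers.get("Content-Encoding", "")
    tokens = [token.strip().lower() for token in encoding.split(",")]
    return "gzip" in tokens


def check_url(url):
    """Fetch the URL and report whether the response came back gzipped.

    This performs a network request, so call it manually.
    """
    with urllib.request.urlopen(build_gzip_request(url)) as response:
        return is_gzip_encoded(response.headers)


# Offline demonstration using the sample header from earlier in the post.
print(is_gzip_encoded({"Content-Encoding": "gzip"}))
```

Sending `Accept-Encoding: gzip` matters: a server is free to skip compression when the client doesn't ask for it.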

Now you can set up your commonly used JavaScript files, icons, stylesheets, etc. however you want. I like the Rails convention because it's simple, clear, and semantic:

/javascripts/**/*
/stylesheets/**/*
/images/**/*

How About Minimizing HTTP Requests?

Once you get all of the asset libraries you ever use up onto a Github Pages project, you can create another Github Pages project, specific to you, that combines all the JavaScript files for each app into one. The other option is to deploy the assets with your app, but then you have to worry about gzipping, which isn't always done by default. So you could have something like this:

/javascripts/my-first-rails-app.min.js
/javascripts/a-sinatra-test.min.js
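
One way to build a combined file like those is sketched below in Python. This is only an illustration under my own assumptions: `combine_scripts` and the two tiny input files are invented for the demo, and the sketch only concatenates, so minification would still be a separate step.

```python
import tempfile
from pathlib import Path


def combine_scripts(sources, destination):
    """Concatenate script files into one file and return the combined text.

    A lone semicolon is placed between files so that a file ending without
    one cannot run into the start of the next file.
    """
    chunks = [Path(source).read_text(encoding="utf-8").strip() for source in sources]
    combined = "\n;\n".join(chunks) + "\n"
    destination = Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    destination.write_text(combined, encoding="utf-8")
    return combined


# Demo in a throwaway directory with two hypothetical library files.
with tempfile.TemporaryDirectory() as workdir:
    root = Path(workdir)
    (root / "a.js").write_text("var a = 1", encoding="utf-8")
    (root / "b.js").write_text("var b = 2;\n", encoding="utf-8")
    bundle = combine_scripts(
        [root / "a.js", root / "b.js"],
        root / "javascripts" / "my-first-rails-app.min.js",
    )
    print(bundle)
```

The order of `sources` is the load order, so list dependencies (jQuery, say) before the plugins that need them.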

Gotchas

I've noticed quite a few sites linking to the raw file directly from a non-Github Pages repository branch like this:

http://github.com/viatropos/cached-commons/raw/master/javascripts/jquery/jquery-1.4.2-min.js

This seems like a good idea but it's not. When you request the raw file like that, you're not directly accessing the file from the filesystem! You're going through layers of application code too, which will definitely slow your site down. Don't do that. Instead, create a gh-pages branch, and load it from there like this:

http://viatropos.github.com/cached-commons/javascripts/jquery/jquery-1.4.2-min.js

That reads directly from the file system.
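
If you have a bunch of raw links to fix, the rewrite is mechanical. Here's a minimal Python sketch, assuming the two URL shapes shown above and assuming the file lives at the same path on the gh-pages branch; `raw_to_pages_url` is a name I made up for this example.

```python
from urllib.parse import urlparse


def raw_to_pages_url(raw_url):
    """Turn a github.com raw-file URL into the matching Github Pages URL.

    Expects the path layout user/repo/raw/branch/path/to/file and drops the
    branch, since Github Pages always serves the gh-pages branch.
    """
    parsed = urlparse(raw_url)
    parts = parsed.path.strip("/").split("/")
    if parsed.netloc != "github.com" or len(parts) < 5 or parts[2] != "raw":
        raise ValueError("not a github.com raw file URL: " + raw_url)
    user, repo, _raw, _branch, *file_path = parts
    return f"{parsed.scheme}://{user}.github.com/{repo}/{'/'.join(file_path)}"


pages_url = raw_to_pages_url(
    "http://github.com/viatropos/cached-commons/raw/master/javascripts/jquery/jquery-1.4.2-min.js"
)
print(pages_url)
```

The function only rewrites the string. It can't tell you whether the file was actually pushed to gh-pages, so check the new link before you ship it.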

I also don't recommend using Github Pages to store binary files or any large documents. Github is a very useful service but we don't want to abuse it. Plus, having a small repository makes pushing and pulling a lot faster. For all the other assets you want to store, some free services I use are:

Keep the community DRY

The beauty of Github (and open source) is that one person or a small team can work intensely to solve a very specific problem for the benefit of themselves and the community. The programming world changes sooo fast that it's useless to a) keep your work a secret and b) duplicate efforts. Instead, solve the problem once, as well as you can, or find someone else who has and contribute. You'll save yourself and everyone else lots of time and help the community move at a much more rapid pace.

The same goes for this asset compression system. I messed around a bit and set up Cached Commons on Github Pages to collect common JavaScript and CSS libraries so I could minify and gzip them once and not have to clutter every project with the same files over and over again. If you have a better idea, libraries you want to add, or thoughts on how to make this better, I'd love to know. Anything that saves time and effort in the long run, that's DRY, is a big win for me.

Cheers.

Resources