August 15, 2007
I first saw Yahoo's 13 Simple Rules for Speeding Up Your Web Site referenced in a post on Rich Skrenta's blog in May. It looks like there were originally 14 rules; one must have fallen off the list somewhere along the way.
- Make Fewer HTTP Requests
- Use a Content Delivery Network
- Add an Expires Header
- Gzip Components
- Put CSS at the Top
- Move Scripts to the Bottom
- Avoid CSS Expressions
- Make JavaScript and CSS External
- Reduce DNS Lookups
- Minify JavaScript
- Avoid Redirects
- Remove Duplicate Scripts
- Configure ETags
It's solid advice culled from the excellent Yahoo User Interface blog, which will soon be packaged into a similarly excellent book. It's also available as a PowerPoint presentation delivered at the Web 2.0 conference. I've also covered similar ground in my post, Reducing Your Website's Bandwidth Usage.
But before you run off and implement all of Yahoo's solid advice, consider the audience. These are rules from Yahoo, which according to Alexa is one of the top three web properties in the world. And Rich's company, Topix, is no slouch either-- they're in the top 2,000. It's only natural that Rich would be keenly interested in Yahoo's advice on how to scale a website to millions of unique users per day.
To help others implement the rules, Yahoo created a Firebug plugin, YSlow. This plugin evaluates the current page using the 13 rules and provides specific guidance on how to fix any problems it finds. And best of all, the tool rates the page with a score-- a score! There's nothing we love more than boiling down pages and pages of complicated advice to a simple, numeric score. Here's my report card score for yesterday's post.
To understand the scoring, you have to dissect the weighting of the individual rules, as Simone Chiaretta did.
My YSlow score of 73 is respectable, but I've already made some changes to accommodate its myriad demands. To get an idea of how some common websites score, Simone ran YSlow on a number of blogs and recorded the results:
- Google: A (99)
- Yahoo Developer Network blog: D (66)
- Yahoo! User Interface Blog: D (65)
- Scott Watermasysk: D (62)
- Apple: D (61)
- Dave Shea's mezzoblue: D (60)
- A List Apart: F (58)
- Steve Harman: F (54)
- Coding Horror: F (52)
- Haacked by Phil: F (36)
- Scott Hanselman's Computer Zen: F (29)
YSlow is a convenient tool, but either the web is full of terribly inefficient web pages, or there's something wrong with its scoring. I'll get to that in a moment.
The Stats tab contains a summary of the total size of your downloaded page, along with its footprint with and without browser caching. One of the key findings from Yahoo is that 40 to 60 percent of daily visitors have an empty cache. So it behooves you to optimize the size of everything and not rely on client browser caching to save you in the common case.
YSlow also breaks down the statistics in much more detail via the Components tab.
Here you can see a few key judgment criteria for every resource on your page...
- Does this resource have an explicit expiration date?
- Is this resource compressed?
- Does this resource have an ETag?
... along with the absolute sizes.
YSlow is a useful tool, but it can be dangerous in the wrong hands.
Software developers love optimization. Sometimes too much.
There's some good advice here, but there's also a lot of advice that only makes sense if you run a website that gets millions of unique users per day. Do you run a website like that? If so, what are you doing reading this instead of flying your private jet to a Bermuda vacation with your trophy wife? The rest of us ought to be a little more selective about the advice we follow. Avoid the temptation to blindly apply these "top (x) ways to (y)" lists that are so popular on Digg and other social networking sites. Instead, read the advice critically and think about the consequences of implementing that advice.
If you fail to read the Yahoo advice critically, you might make your site slower, as Phil Haack unfortunately found out. While many of these rules are bread-and-butter HTTP optimization scenarios, it's unfortunate that a few of the highest-weighted rules on Yahoo's list are downright dangerous, if not flat-out wrong, for smaller web sites. And when you define "smaller" as "smaller than Yahoo", that's.. well, almost everybody. So let's take a critical look at the most problematic heavily weighted advice on Yahoo's list.
Use a Content Delivery Network (Weight: 10)
If you have to ask how much a formal Content Delivery Network will cost, you can't afford it. It's more effective to think of this as outsourcing the "heavy lifting" on your website-- e.g., any large chunks of media or images you serve up-- to external sites that are much better equipped to deal with it. This is one of the most important bits of advice I provided in Reducing Your Website's Bandwidth Usage. And using a CDN, below a reasonably Yahoo-esque traffic volume, can even slow your site down.
ETags (Weight: 11)
ETags are a checksum field served up with each server file so the client can tell if the server resource is different from the cached version the client holds locally. Yahoo recommends turning ETags off because they cause problems on server farms due to the way they are generated with machine-specific markers. So unless you run a server farm, you should ignore this guidance. It'll only make your site perform worse because the client will have a more difficult time determining if its cache is stale or fresh. It is possible for the client to use the existing last-modified date fields to determine whether the cache is stale, but last-modified is a weak validator, whereas the Entity Tag (ETag) is a strong validator. Why trade strength for weakness?
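For the curious, here's a rough sketch of what that validation traffic looks like. This is illustrative Python using the standard library's http.server, not how any particular production server implements it, and the file name is made up:

```python
# Illustrative sketch of ETag validation: the client sends back the tag
# it cached (If-None-Match), and the server answers 304 with no body
# when the resource hasn't changed. Real servers do this for you.
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer

class ETagHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        with open("style.css", "rb") as f:      # hypothetical static file
            body = f.read()
        etag = '"%s"' % hashlib.md5(body).hexdigest()   # strong validator
        if self.headers.get("If-None-Match") == etag:
            self.send_response(304)              # cached copy is still fresh
            self.send_header("ETag", etag)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("ETag", etag)
        self.send_header("Content-Type", "text/css")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), ETagHandler).serve_forever()
```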
Add an Expires Header (Weight: 11)
This isn't bad advice, per se, but it can cause huge problems if you get it wrong. In Microsoft's IIS, for example, the Expires header is turned off by default, probably for that very reason. By setting an Expires header on HTTP resources, you're telling the client to never check for new versions of that resource-- at least not until the expiration date on the Expires header. When I say never, I mean it-- the browser won't even ask for a new version; it'll just assume its cached version is good to go until the client clears the cache, or the cache reaches the expiration date. Yahoo notes that they change the filename of these resources when they need them refreshed.
All you're really saving here is the cost of the client pinging the server for a new version and getting a 304 Not Modified header back in the common case that the resource hasn't changed. That's not much overhead.. unless you're Yahoo. Sure, if you have a set of images or scripts that almost never change, definitely exploit client caching and turn on the Cache-Control header. Caching is critical to browser performance; every web developer should have a deep understanding of how HTTP caching works. But only use it in a surgical, limited way for those specific folders or files that can benefit. For anything else, the risk outweighs the benefit. It's certainly not something you want turned on as a blanket default for your entire website.. unless you like changing filenames every time the content changes.
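To show what "surgical" might look like in practice, here's a minimal sketch. The paths, lifetimes, and function name are my own assumptions, not something from Yahoo's list; on IIS you'd accomplish the same thing per-folder via the HTTP Headers tab.

```python
# Minimal sketch: far-future caching only for a versioned static folder,
# normal revalidation everywhere else. Paths and lifetimes are assumptions.
ONE_YEAR = 365 * 24 * 60 * 60

def cache_headers_for(path):
    if path.startswith("/static/"):       # e.g. /static/logo-v2.png
        return {"Cache-Control": "public, max-age=%d" % ONE_YEAR}
    # Dynamic pages and anything unversioned: make the browser revalidate.
    return {"Cache-Control": "private, must-revalidate"}

print(cache_headers_for("/static/logo-v2.png"))
print(cache_headers_for("/blog/archive.html"))
```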
I don't mean to take anything away from Yahoo's excellent guidance. Yahoo's 13 Simple Rules for Speeding Up Your Web Site and the companion Firebug plugin, YSlow, are outstanding resources for the entire internet. By all means, read it. Benefit from it. Implement it. I've been banging away on the benefits of GZip compression for years.
But also realize that Yahoo's problems aren't necessarily your problems. There is no such thing as one-size-fits-all guidance. Strive to understand the advice first, then implement the advice that makes sense for your specific situation.
Posted by Jeff Atwood
If you don't take rule #1 seriously, "pinging the server for a new version and getting a 304 not modified header back in response" can become quite costly, even if you are not Yahoo.
One roundtrip can be more than 1KB in traffic; five such requests and you will have 5KB of lost traffic. If the content and markup of your page take 10-20KB, that's quite a substantial part of the traffic.
If you're experimenting with Cache-Control headers (set in IIS via the file/folder properties dialog, HTTP Headers tab, "Enable Content Expiration" checkbox), there's a great summary here:
Also, remember that F5 forces the browser to re-check all files, even those that would *normally* be cached. Not that I made that mistake or anything.. :P
"the browser won't even ask for a new version"
IE 5 on the Mac used to always go to the cache regardless of the expires date. We had to pass a random number in the querystring to force it to GET the page from the server. Sucked. Thank God that browser tanked.
re: F5 - F5 does force the browser to re-check the files, but in some cases changes to static and some dynamic pages may not show up. Ctrl-f5 (cmd-r on a Mac) forces the browser to by-pass the cache entirely.
This is a fantastic critique of the methodology used by Yahoo. Thanks.
The reason I advocate people opt in to setting cache control headers and the like for "real" projects (I must take the time to fix up jcooney.net too) is not the network traffic - it's the latency all those extra requests add to your page load.
Great post! You've done a wonderful job on this site - it renders very speedily on an iPhone over EDGE, as well.
Good stuff. I've used YSlow to optimize my blog and to compare it with other blogs, including yours. I've come to realize that this is pretty apples-and-oranges: to really see a close result, the two blogs being compared would have to have nearly the same content and the same number of posts on one page. So yes, it is nice to know how optimized a blog really is, but using it to compare blogs is just wrong IMHO! By the way, I totally agree with your analysis that YSlow is targeting high-traffic sites that can be optimized with all of their assets at their disposal. Anyway, I will post my own YSlow analysis later on my blog as well, not to compare with others, but to show how it can optimize a blog based on its guidelines, some of which I doubt I can even tweak on the Blogger platform... lol! Anyway, good post Jeff, and your blog really has blown my socks off with how fast it is now!
ETags aren't necessarily checksums. You can put any kind of value in them: an ID, a cryptographic hash (e.g., MD5 or SHA-1), an actual datetime like the modified timestamp in your database table, the file size, a compound of any of the above, or anything else.
What is important when using ETags is that you generate the same ETag value for unmodified resources and a different ETag when they've been modified.
As for Expires, I can't confirm at the moment, but I believe that the "must-revalidate" property of Cache-Control will force the browser to check with the server despite the presence of the Expires header.
I prefer to use the max-age property of Cache-Control rather than Expires. It is simpler to just say "max-age=86400" than "Expires: Fri, 18 Aug 2007 02:55:13 GMT".
I personally use the expires and cache-control settings, and even with on the order of 10K visitors a day I think it makes a pretty big difference. We do it because our price comparison pages are pretty heavy on the server, and we only set it for about an hour.
Other than that I totally agree-- CDN for most people means offload your images.
Thanks for the ETags explanation; I hadn't seen much on that previously.
Far-future Expires headers are great for the fact that once the client has the resource (img, js, or css) it doesn't need to check again. Almost every build process at Yahoo! uses date-stamped files for these types of resources, so that when a new build is pushed out to production, the filename itself changes. This way, instead of having the browser check for new versions of these files, it can just passively go get them when it's told to. Removing the need for those roundtrips to the server does make a very measurable difference for sites that get a lot of use.
I prefer to use the max-age property of the Cache-Control rather than Expires.
Expires is an HTTP 1.0 header; Cache-Control is an HTTP 1.1 header. Most modern web servers use the Cache-Control header, which does the same thing Expires did, and much more..
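To make the equivalence concrete, here's a small illustrative Python sketch (the function name is mine) that emits both forms for a given lifetime: max-age for HTTP 1.1 clients plus a matching Expires date for older HTTP 1.0 clients and proxies.

```python
# Illustrative sketch: emit Cache-Control: max-age alongside an
# equivalent Expires date for older HTTP 1.0 clients and proxies.
import time
from email.utils import formatdate

def freshness_headers(max_age_seconds):
    return {
        "Cache-Control": "max-age=%d" % max_age_seconds,
        "Expires": formatdate(time.time() + max_age_seconds, usegmt=True),
    }

# One day of freshness, expressed both ways.
print(freshness_headers(86400))
```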
what we need is a bittorrent'esque webbrowser.
There may be special cases where CDNs are worthwhile even for small sites. If you're using YUI, for example, the only good reason not to use the Yahoo-hosted versions is if high security is required. It saves you bandwidth, and it saves your visitors time because it dramatically increases the percentage of cache hits.
"All you're really saving here is the cost of the client pinging the server for a new version and getting a 304 not modified header back in the common case that the resource hasn't changed. That's not much overhead.. unless you're Yahoo."
That's not exactly true. Part of the savings here is realized not by the person hosting the content but the person requesting it. If you're an Indian visitor to a US web site, even a conditional GET is quite expensive. You're making the request over a high latency, often flaky connection. We've seen this in Yahoo! Mail and it has forced us to squeeze requests and responses into as few packets as possible. The damage caused to user perceived performance by a single dropped packet on these networks, where round-trip latency is high, is quite noticeable to the end user, less so to Yahoo!.
Jeff, as some already did, I strongly disagree with your Expires header part for a simple reason: latency.
And the fact that most browsers only handle 2 parallel connections by default. This means that every time a browser has to check the cache headers of the server for two files at the same time, everything is blocked until the requests come back, a small but significant fraction of a second later, especially over high-latency connections (e.g. 3G or -- worse -- EDGE).
By using Expires on everything static (images, CSS, JS), you ensure that these requests won't even happen and the browsing experience will be much smoother.
I don't exactly use Yahoo's rules on the subject, though; they suggest putting the Expires in the far future for everything and using version numbers in the names. I usually prefer (unless I have automated the build of the website, which also happens) a far-future Expires for images (which usually don't need to be debugged once they're in prod) and a nearer future for resources that may need evolution/debugging (CSS and JS).
You saved me!! I was just realizing something like what you wrote in this great post :D
The problem isn't that a CDN will slow your site down, it's that Coral Cache isn't a CDN.
Coral Cache is a p2p caching system.
I agree that using a CDN is overkill for all but the largest websites, and shouldn't really be on Yahoo's yslow.. but saying that it'll slow down your website is no different from changing all your links to Google cache and claiming that you're using a CDN.
Just wanted to let you know that when I tried to print the Aug 15th article in Firefox, it tried to print 78009 pages. That was with the article and comments selected, and printing the selection. Might be on this end, but I'd rather not try to duplicate this bug on my way out the door :D
"I refuse to take Yahoo's recommendations seriously until they stop embedding a 43.5KiB stylesheet into their main page."
Well, duh. If you'd actually bothered to read their recommendations, you would have realized why *they* do that.
One of the websites I tested (not mine!) had 109 http requests.
Assume I have a 50ms ping to the server (which is fairly average), then that's 5.4 seconds added to the page load in latency alone.
I believe one of the biggest wins with Expires headers is that a fair number of ISPs use invisible caching proxies on their network. This reduces their bandwidth bills as well as improving response time for their customers. For the site, it means you send out one copy and a few thousand people grab it from someone else's proxy.
I tend to agree with pretty much most of what you said. Interestingly enough I manage to get an A (90) on my site somehow despite not using a CDN or expires headers. (Though I will probably add expire headers to certain content when I get the chance to research it more)
I would just like to point out that you can change these weights. By typing in about:config into the address bar and then filtering by the value 'yslow' you can see all the options for yslow. If you are only concerned with the weighted point values, filter by 'yslow.points'.
Like you said, Yahoo's problems are not your problems. Perhaps someone should set up a few sets of weights to change these to: i.e., if you are a small blogger use these values, if you get x number of hits use these.
And as for your comment about inefficient web pages, I do feel there is a large number of inefficient web pages out there, even on high profile sites. Especially corporate sites. Of course taking my experiences from where I work, I can completely understand how it happens.
I refuse to take Yahoo's recommendations seriously until they stop embedding a 43.5KiB stylesheet into their main page.
(Side note: This figure was checked on June 5, 2007)
unless you like changing filenames every time the content changes.
The default way to output links for resources like CSS in Rails adds a parameter with the modification time automatically. If you modify the file then the link is generated with a different parameter and the browser retrieves the file again. This lets you have the far future expires without having to change the filename each modification.
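The same trick is easy to replicate outside Rails; here's a quick illustrative sketch (the helper name and paths are made up) that stamps an asset URL with the file's modification time:

```python
# Sketch of Rails-style cache busting: append the file's modification
# time as a query parameter, so a far-future Expires never serves a
# stale copy after the file changes. Helper name and paths are made up.
import os

def stamped_url(url_path, file_path):
    mtime = int(os.path.getmtime(file_path))
    return "%s?%d" % (url_path, mtime)

# Produces something like /stylesheets/site.css?1187222400
print(stamped_url("/stylesheets/site.css", "public/stylesheets/site.css"))
```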
Good post. Kinda reminds me back in '99 when everyone was trying to create architectures to handle Amazon and ebay-like loads.
You know what would be nice....
If the tool could provide an estimate of savings for a specific page, before one goes through the motions of making real changes.
For instance, if you implement gzip compression... your estimated savings is XYZ.
... or better.... allowing people to add checks beyond the original 13 items... to customize the entire plugin.
E. David Zotter
Once again, great post Jeff.
quoth apeinago : "what we need is a bittorrent'esque webbrowser."
Yes. And then a supernode gets infected and someone finds a way to splice information in the packets going to other people (just add the right padding to end up with the same checksum). Result: you can hack the site without hacking the site and hit-and-run your botnet together, because if someone else checks, the perp will be gone; or the packet will come from another source, so it won't show.
I'll trust BT-esque rather for content that is large (so overhead is relatively small) and isn't expected to change every minute (unlike a well-visited news site).
Great to see the posts back to development instead of hardware-based topics!
A lot of this advice is very good, but I believe a lot of you are missing out on what the advice is actually telling you. Yahoo! gave a talk at the Web 2.0 Expo this year in which they gave their 14 rules, and explained briefly why each rule was there, and what it would accomplish.
I'd recommend reading this to see where the true benefits come. Some suggestions, such as the CDN, aren't needed by most people, but things like far future expires headers, and avoiding redirects are VERY helpful and can easily be applied to any user.
Using CSS sprites was also discussed at the Expo during the talk and is another GREAT optimization to cut back on requests to the web servers, which I guarantee you is where most of the user's time is spent.
""I refuse to take Yahoo's recommendations seriously until they stop embedding a 43.5KiB stylesheet into their main page."
Well, duh. If you'd actually bothered to read their recommendations, you would have realized why *they* do that."
Their reasoning is that merging them into the same file results in faster end-user response times. I found their reasoning flawed. It only applies in two cases:
1. It's the first time I visited their site.
2. I have caching turned off, or the external stylesheet has expired from the cache.
For a repeat visit, my browser would cache that stylesheet, so the next time I visited, it would send a conditional GET (If-None-Match if ETags are being used, If-Modified-Since if they aren't), be passed back a 304 Not Modified header, then load the cached copy.
Skipping sending 43.5KiB is going to speed up page loading on all but the fastest of connections.
Powerlord, Yahoo found that 40-60% of visits were NOT repeat visits based on their data.
40-60% of Yahoo!’s users have an empty cache experience and ~20% of all page views are done with an empty cache. To my knowledge, there’s no other research that shows this kind of information. And I don’t know about you, but these results came to us as a big surprise. It says that even if your assets are optimized for maximum caching, there are a significant number of users that will always have an empty cache. This goes back to the earlier point that reducing the number of HTTP requests has the biggest impact on reducing response time. The percentage of users with an empty cache for different web pages may vary, especially for pages with a high number of active (daily) users. However, we found in our study that regardless of usage patterns, the percentage of page views with an empty cache is always ~20%.
Doesn't mean we should all do it, but food for thought.
Great article. As a webdev at Yahoo, I can speak from experience that these rules definitely *can* make a huge impact on our performance; also, depending on what site you happen to be working on, a rule might be more or less relevant. For example, if you are expecting 1-2 page views per session, then externalizing your CSS and JS might not be worth the extra HTTP requests, even to a CDN. If you get 10-20, then it's a huge win, and you should absolutely lean on the browser cache heavily. Others, such as concatenating and minifying scripts and CSS, are almost always beneficial. You're right to point out that there's no substitute for a thinking developer.
As a huge web company, we have a lot of different kinds of sites, and they have different requirements. YSlow is not intended to be anything but a lint checker that summarizes our Exceptional Performance Team's findings. And it is *very* useful in that regard. It was an internal tool back before it was a Firebug plugin, and I believe it was only recently shared with the public.
The tool doesn't include a configuration screen, but if you enter about:config into the address bar, and then filter for "yslow", you can adjust the weights that are assigned to each rule. Handy when you know that the tool is wrong about your particular situation.
I agree that using a CDN is overkill for all but the largest websites, and shouldn't really be on Yahoo's yslow..
This speaks to the title of this post, but, on a high-volume website, not using a CDN for static content is a recipe for disaster. Serving static content from a good CDN can be much faster, since they're optimized for speed and cacheability rather than hosting an application. When you multiply the number of requests by 3 (html, css, js), you can bring down a bank of servers in the first few million impressions. (Of course, I'll probably have grandkids before my little podunk blog *gets* a million impressions, so what matters for Yahoo might not matter for you.) Since Yahoo routinely plugs its own content on The Most Trafficked Site On The Web, we have to build pages that can scale to support massive spikes in traffic. I've seen a 2% click-through rate from the home page cause servers to die and flop around with rigor mortis. It's a MASSIVE firehose of traffic that we deal with.
In typical "open-source good guys who know where their bread is buttered" fashion, Yahoo is simply sharing what we do to optimize pages in situations where optimization counts. They want the internet to be faster. A faster internet means more people will use it more of the time, and that means more people using Yahoo.
Btw, 43.5 KB of CSS embedded in a GZipped HTML document is only about 14.5KB on the wire. It's absolutely worthwhile for the homepage to embed this information in an inline style tag.
The comment about maintainability only highlights the need for a good build process. Develop with many files, and then concatenate and minify them all as part of the build-and-deploy process.
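To illustrate the concatenation half of such a build step (minification is best left to a dedicated tool like YUI Compressor), here's a trivial sketch with made-up file names:

```python
# Trivial sketch of a build step that concatenates script files so the
# page makes one HTTP request instead of several. File names are made up;
# run a real minifier over the combined output afterwards.
SOURCES = ["js/menu.js", "js/forms.js", "js/tracking.js"]

with open("site-combined.js", "w") as out:
    for name in SOURCES:
        with open(name) as src:
            out.write("/* %s */\n" % name)   # keep a marker per source file
            out.write(src.read() + "\n")
```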
Hmmm... now I see why it looked like you hadn't posted for several days. You've been messing around with caching evidently. I figured that you were just on vacation or something. I took the chance and clicked my refresh button and found that you had actually been posting, but I hadn't been getting the changes.
If you've been messing around with caching your page content then you need to do some more testing. Some of us thought that you had given up posting or gone on vacation!
This is Steve Souders, YSlow author and Chief Performance Yahoo!.
Jeff - This article really fills a missing need. I just finished reviewing an article written by a member of the performance team on how these rules apply (or don't apply) to smaller sites. But you've addressed a lot of that here.
Alexander Kirk and I emailed about this blog. He suggested that Rule 13 (ETags) is also not applicable to smaller sites, since they primarily run on one server. He thought Rule 3 (Expires header) was not applicable to smaller sites, because the cost of revving filenames could be a challenge to smaller sites. I think the benefit of browser caching is huge, and feel that the development burden isn't bad and could be lessened. I show some PHP code in the book to make this easier, and hope that could be published some day.
Rule 2 (Use a CDN) is hard for smaller sites to adopt. I recommend several free CDN services, but don't have any data on how good/bad they are. Feedback there would be great, such as the info on CoralCDN above. I knew this rule would cause smaller sites to lose points, so I added the config option to add your own CDN hostnames, basically disabling the rule (http://developer.yahoo.com/yslow/faq.html#faq_cdn).
Powerlord - One cool technique described in the book is "dynamic inlining". The first time a user arrives on your page the server inlines the CSS. In the onload event the page downloads the external .css file and sets a cookie. The next time the user goes to the page, the server sees the cookie and inserts a LINK tag to the external file, instead of inlining the CSS. This works for JS, too. This is the best of both worlds - a faster page load on the first (empty cache) page view, and a smaller, faster HTML document on subsequent (primed cache) page views. The cookie doesn't reflect the state of the cache 100% accurately, but it's pretty close and can be tightened/loosened by tweaking the expiration date of the cookie. I thought FP was doing this. I'll go back and ask them.
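Here's a loose Python sketch of the decision Steve describes; the cookie name, file handling, and markup are my assumptions, not code from the book:

```python
# Loose sketch of "dynamic inlining": inline the CSS for first-time
# (empty cache) visitors, emit a LINK tag for repeat visitors whose
# cookie says the external file is probably already in their cache.
# Cookie name and markup are illustrative assumptions.
def css_fragment(request_cookies):
    if request_cookies.get("css_cached") == "1":
        # Repeat visit: lean on the browser cache for the external file.
        return '<link rel="stylesheet" href="/site.css">'
    # First visit: inline the rules; the page's onload handler should then
    # fetch /site.css in the background and set the css_cached cookie.
    with open("site.css") as f:
        return "<style>%s</style>" % f.read()

# Usage: css_fragment({}) inlines; css_fragment({"css_cached": "1"}) links.
```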
E. David Zotter - All great ideas. I'm putting them on the list!
I'd prefer if people not alter the YSlow grading system using the about:config settings (other than the one for CDN hosts). This will create a situation, even within the same company, of YSlow grades being apples and oranges. We'll work on ideas for making YSlow more applicable across different types of web sites.
I agree that the web is full of terribly inefficient web pages. I'm working on a paper that translates these performance improvements to power savings; think of the number of MW that would be saved if everyone used a future Expires header and avoided those unnecessary 304 Not Modified validation requests. Each web site has to weigh the costs/benefits before deciding to address these rules, but for the most part the fixes are fairly easy and the benefits are noticeable. As all of us improve our development practices, all our users reap the benefits. I appreciate the huge amount of discussion around this topic and look forward to a faster and more efficient Web.
Hi Steve-- thanks for your comments, it's always great to have the source of the article stop by! YSlow is a fantastic tool, and it reflects very well on Yahoo! to release something so helpful to the community. Anything that gets this many people talking about ways to improve web performance is a net public good.
However, as noted by Matt's comment directly above yours, caching is something that you have to be very careful with. I accidentally turned on the Expires/Cache-Control header for ALL my content for about an hour (whoops!) before I realized what I had done. Thus, everyone who was unfortunate enough to visit in that window of time won't see any changes on the homepage until the cache expires, 7 days from now.
Totally my fault, of course, but I do think this is exactly why IIS defaults to never setting an Expires/Cache-Control field on any content.
Re: The ETags rule and how it's applicable to smaller sites.
I believe the rule applies to smaller sites. Although they are served from one server, they often move.
A smaller site is usually hosted on a low cost shared server. I can tell from bad experience that often you're wrong about your choice of hosting provider. Either tricked by a "review" from the provider's affiliate, or by the promise for unlimited something (bandwidth, space). So after a while, maybe not even a full year, you move your site to a different provider, meaning on a different server, meaning different ETags. Also even if you stick with the same provider, sometimes they decide to move you to a new server because of their internal restructuring or any other reason.
Given how extremely simple it is to configure ETags, why skip this rule?
I've decided to tackle the speed problem in a "social" way (oh no, another web 2.0 hyper here!).
Here's my reasoning:
Many websites use the same JS libraries (say prototype, jquery...).
But when every website stores its own version on its own server, there's a huge loss of caching benefit. Furthermore, some website owners are unaware of how, or unable, to tweak their server for full performance.
Imagine a JS distribution site-- all versions of popular JS libraries and scripts remotely hosted on a fast, tuned server (a JS-specific CDN, you could say).
This could be a win-win situation:
1. Regular surfers will just surf faster-- not noticeable at first, but as a mass of websites use the aforementioned service, the chance of already having a cached version of a JS library will rise significantly.
2. Website owners will save money on bandwidth costs by serving JS files from a remote, fast, optimized server. Their website usability will get better due to the speed improvement.
(*this does not apply to YUI of course.)
Of course, there are drawbacks: control, reliability and security.
I'm doing everything I can to make the service as reliable as possible, including backup servers on different hosts etc...
Well, there it is. The service should be launched in less than a week. I believe ads and sponsoring will cover the hosting costs, and everybody should be happy.
Click on the name to be directed to a sign-up page so you can be notified when we launch (or to flame me for self-promotion on a respectable blog).
A weak validator? WTF. Only if you use it as one. You can put any date you like in there. I think that page you link to is talking nonsense. Most sites would change the last-modified date whenever anything on the page changed.
Stoyan, the ETag Yahoo is talking about is not for small servers; it is for web farms, where the server you make your first request from isn't necessarily the same server you go to on the second request.
Also, it is generally not a good idea to turn off ETags if your content is only hosted on one server, like a blog. However, if you are running your blog from an IIS server, you should be aware of this ETag bug in IIS.
The quick way of getting rid of ETag in IIS is to go to the HTTP Headers tab in the properties of your website and Add a new HTTP header. Name = "ETag" Value = "" This will remove the ETag from the header.
Of course it's a vast oversimplification to simply give a score for a website, but presumably the people who use YSlow have enough common sense to realise this.
I actually tend to use a similar tool at http://linuxbox.co.uk/website_performance_test.php as well as YSlow - both give slightly different advice
I always wondered: /Why are they outright telling me that I must turn off ETags on account of server farm problems which I simply don't have?/
I knew it was bad... But I turned them off anyway on a particular site, as at the time we were kind of competing to see who could make the most dramatic score improvement on any existing site (with YSlow).
The article, though, is awesome.
Thanks for this great article. I'm in the middle of shrinking the footprint of uxbooth.com. Like many, I'll be following along with the advice from Yahoo's blog. I like that you analyze each point they make and explain them from a practical point of view. Indeed, we don't all have top-tier sites.
I will be linking to this article from my blog-post. You just got a new reader :)
Great read. Since I also implemented Yahoo's 13 rules, I am beginning to wonder if I am somehow hurting my site.
I have increased speed dramatically (which is good), but have lost a significant amount of traffic ever since. I don't understand and cannot pinpoint the problem.
Anyone care to take a look at what could be the culprit? Head over to http://www.geekberry.net/ - It's a wireless technology site. At one point I was doing 360,000+ visits a month. Now I hardly do half! :(
@Giancarlo: is that supposed to be a troll? Not very creative way to get readers for your site :) Though granted, it might appear you actually read the post.
Now, the culprit could of course be that you are measuring 'visits' wrongly. Oftentimes 'hits' are confused with 'visits', and some log analyzers will simply fail to spot that multiple requests form a single visit.
So reducing requests (html,js,css,whatnot) will - ideally - vastly reduce the number of hits (a good thing: as in avoiding taking a hit). That doesn't mean you get fewer visitors. I'd check your log numbers to see whether you might have gotten *more* visitors, only serving them in fewer request/responses?
I wish more people would evaluate advice from authorities such as Yahoo and Google. My boss decided to implement every one of those rules, but in the case of the cache header he did a blanket +10 years expiration date for the WHOLE site-- html and all-- and the site gets updated at least once a week if not more often. It has been corrected, but unfortunately the htaccess file was up for a whole month. And come to find out, not everyone clears their cache on a regular basis. So here we are, months later, still walking people through how to clear their cache so they don't see a site in a time warp.
wow, dude you suck. every little byte and http request saved is important regardless of how heavy your site traffic is. a grade of 70-something on yslow is very bad even for "small" sites that you use as an example, you can get at least a grade of 90 even with an "E" for cdn. you should go back to school to learn this stuff or get a new career.
For more info on decreasing latency, Aaron Hopkins from die.net has an interesting article up showing the effect that keep-alives and multiple hostnames (among other things) have on page load time:
The distinction between weak and strong validators is not relevant to caching of entire files. Where it is relevant is in resuming broken downloads of very large files (notice how browsers will do this if you break off after a few megs of a many-meg file). In this case it's important to know that the file you are resuming the download of is byte-for-byte identical to the file you started downloading.
Weak ETags (with W/ before the quotes) and last-modified don't make that promise to the client; strong ETags do.
ETags for web farms can be implemented by generating the ETag yourself: if each server (or each server process in a farm of web gardens) produces the same ETag for the same entity, this can work perfectly.
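As a minimal sketch of that idea (illustrative Python, not tied to any particular server), deriving the tag from the entity bytes themselves gives every box in the farm the same value:

```python
# Sketch: derive the ETag from the entity bytes, so every server in a
# farm emits the same tag for the same file (unlike defaults that mix
# in machine-specific values). The weak flag just adds the W/ prefix.
import hashlib

def content_etag(body_bytes, weak=False):
    tag = '"%s"' % hashlib.md5(body_bytes).hexdigest()
    return "W/" + tag if weak else tag

print(content_etag(b"body { margin: 0 }"))             # strong validator
print(content_etag(b"body { margin: 0 }", weak=True))  # weak validator
```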