If..Else Log

Perils of Prefetching

Coldforged.org was recently taken down by what appears to be a pre-fetcher gone bad. Whilst there's not enough information to confirm that this was the cause, or to identify the program responsible, the episode illustrates the havoc that sloppy programming can cause. However, even had the pre-fetcher been coded with some manners, I wonder whether there is any need for such technology in this day and age.

Inherently selfish

The main problem with prefetching is that, by definition, it's a selfish act. Prefetching is the automatic retrieval of pages linked from the currently active webpage. The idea is that the user will spend a bit of time reading the current page, so why not use this "idle time" to fetch the probable next set of pages? Then, when the user does move off the page, the next page is already fully loaded.

Increase in network traffic

The problem is that this is almost a textbook example of the tragedy of the commons in practice. If you're the only person doing this, then the gains might be worthwhile. However, if a lot of people start doing this, the resulting increase in network traffic will make web browsing slower than before, which in turn will tempt more people to use pre-fetchers.

Widespread use of pre-fetchers is Pareto-inefficient. The only way you can see an improvement is by being more aggressive than everyone else.

Hurts the content provider

The other problem, and the one which Coldforged saw first hand, is that pre-fetching hurts the content provider. Whilst in his case the main cause was the program's ignorance of good practice and HTTP return codes, the fact remains that pre-fetching consumes more bandwidth.

No program is going to be able to accurately predict what the reader wants to do next. Each time a pre-loaded resource is not used, that is a waste; it's a waste of bandwidth and a waste of server resources, both of which could be better used. And to cope with the increased demand, the provider would need more (powerful) servers, more bandwidth and more backend magic (load-balancing etc.), the net result of which is more money.
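To put rough numbers on that waste, here is a back-of-the-envelope sketch. Every figure in it (the number of links prefetched per page, the page size, and the assumption that the reader follows at most one link) is made up for illustration, not measured from any real site.

```python
def prefetch_waste(links_prefetched, avg_page_kb, links_followed=1):
    """Return (wasted_kb, wasted_fraction) for a single page view.

    Assumes every prefetched page costs avg_page_kb and that only
    links_followed of them are ever actually read.
    """
    if links_prefetched <= 0:
        return 0.0, 0.0
    used = min(links_followed, links_prefetched)
    wasted_pages = links_prefetched - used
    wasted_kb = wasted_pages * avg_page_kb
    return wasted_kb, wasted_pages / links_prefetched


# Hypothetical figures: 10 links prefetched, 50 KB each, one link followed.
wasted_kb, fraction = prefetch_waste(links_prefetched=10, avg_page_kb=50)
print(f"{wasted_kb:.0f} KB wasted per page view ({fraction:.0%} of prefetched data)")
```

Under those assumptions, nine of the ten prefetched pages are thrown away, and the provider pays for all ten.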

Distorts statistics

Another problem is that pre-fetching distorts statistics needlessly. It is difficult to measure and separate out pre-fetched requests from actual requests.

Is demand going up, or is that the effect of pre-fetching? What browsers/OS are most readers using? OK, we had this problem before pre-fetching, so let's isolate the unique user (whether by IP, session, cookie or otherwise). Now can we deduce whether readers find the overall site content appealing (do they enjoy other content as well, which pages are the most popular, general site stickiness), or is this once again a distortion induced by pre-fetching?

And what about sites dependent on advertising income? How does prefetching affect that?
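One partial handle on the separation problem: some prefetchers label their own requests. Mozilla's link prefetching, at the time of writing, sends an `X-moz: prefetch` request header, so a server that can see request headers can at least count those requests separately. The sketch below assumes headers arrive as a plain dict; the function names and sample data are hypothetical, and a prefetcher that does not announce itself will still slip through.

```python
def is_declared_prefetch(headers):
    """True if the request announces itself as a prefetch via X-moz."""
    for name, value in headers.items():
        # Header names are case-insensitive, so normalise before comparing.
        if name.lower() == "x-moz" and value.strip().lower() == "prefetch":
            return True
    return False


def split_requests(requests):
    """Split a list of header dicts into (real, prefetched) lists."""
    real, prefetched = [], []
    for headers in requests:
        if is_declared_prefetch(headers):
            prefetched.append(headers)
        else:
            real.append(headers)
    return real, prefetched


# Made-up sample: one declared prefetch, one ordinary request.
sample = [
    {"User-Agent": "ExampleBrowser", "X-moz": "prefetch"},
    {"User-Agent": "ExampleBrowser"},
]
real, prefetched = split_requests(sample)
print(len(real), len(prefetched))
```

The same check could be used to refuse declared prefetches outright, but it only helps with clients that are honest enough to send the header.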

Helpfulness?

Despite all of these concerns, does pre-fetching actually benefit anyone? There are a couple of provisos here, and they're not small ones either.

Browsing has changed

This is not 1994. The majority of people have moved on from 14.4k modems. Heck, I'm sure most people should have moved on from dial-up (and if you haven't, what are you waiting for?).

In addition, are you using a proper browser? You should be able to open a webpage in a new tab to load in the background. Technology has changed and the need for pre-fetchers has gone the way of the telegraph.

Pre-fetch prediction hasn't

Yes, we have rel="prefetch" now. This obviously requires both the content publisher to implement this and the pre-fetcher to properly acknowledge it. But because the first isn't always the case, this means that pure pre-fetchers can't solely depend on it… so they cheat.
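To make the contrast concrete, here is a minimal sketch of the difference between honouring only the publisher's hints and treating every link as fair game. The class name and the sample page are made up for illustration; it uses Python's standard `html.parser` and fetches nothing.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect publisher-hinted prefetch targets and ordinary links."""

    def __init__(self):
        super().__init__()
        self.hinted = []     # <link rel="prefetch" href="...">
        self.ordinary = []   # every <a href="...">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        # rel may hold several space-separated tokens, or be absent.
        rel_tokens = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "prefetch" in rel_tokens:
            self.hinted.append(href)
        elif tag == "a":
            self.ordinary.append(href)


# Hypothetical page: one hinted target, three ordinary links.
page = """
<html><head>
  <link rel="prefetch" href="/page2.html">
  <link rel="stylesheet" href="/style.css">
</head><body>
  <a href="/page2.html">Next</a>
  <a href="/archive.html">Archive</a>
  <a href="/admin/delete?id=7">Delete</a>
</body></html>
"""

collector = LinkCollector()
collector.feed(page)
print(collector.hinted)    # ['/page2.html']
print(collector.ordinary)  # ['/page2.html', '/archive.html', '/admin/delete?id=7']
```

A prefetcher that sticks to the hinted list requests one page here. One that works from the ordinary list requests three, including a link that was never meant to be followed automatically.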

There are a number of tactics they can use, but the end result is that a pre-fetcher cannot guarantee retrieval of the next destination without bandwidth abuse. And even then, the pre-fetcher can't predict that you may decide to just type another URL into the address bar.

All this work, all this pain, and all for nothing.

Just say no

Pre-fetching/pre-caching, however you want to term it, is a technology that should have died out. I shouldn't need to write this post. Sadly, just because something stinks doesn't mean that it no longer exists.

-30-

8 Responses to “Perils of Prefetching”

  1. ColdForged July 14th, 2005 7:43 pm

    That’s stated so well it’s almost scary. Nice write-up and — if it isn’t obvious — I agree with everything you’ve said :).

  2. Dave July 14th, 2005 8:32 pm

    Yeah I agree; loading pages that may not be needed is stupid and with web standards we have small loading times anyway.

    It is just a gimmick for out of touch business men who still think the internet is what it was like in 1994 - slow ugly and not used.

  3. Henrik Lied July 14th, 2005 10:25 pm

    "Heck, I’m sure most people should have moved on from dial-up (and if you haven’t, what are you waiting for?)."

    For it to become 27th of August - aka. movingday :)

  4. Bryan Veloso July 18th, 2005 5:26 pm

    I wish I could say something on top of this… but you’ve already covered everything.

  5. felipe.lavin July 23rd, 2005 9:28 am

    I totally agree with your point of view. I was checking out Deer Park Alpha 2 (a sort of release candidate for Firefox 1.1) the other day and one of its features is pre-fetching, which is pretty cool if you are on dial-up, but awful if you are paying to host your own site.
    Your post is very clear on this issue: pre-fetching might have nice intentions, but it simply does more harm than good.

  6. Ryan July 24th, 2005 4:09 pm

    If we can have rel="prefetch" why not rel="no-prefetch"? A no-prefetch property really doesn’t seem to specify a relationship between pages, but then again neither does prefetch. It might not be semantically beautiful, but it would make things easier for content providers. Imagine just having to worry about placing rel="no-prefetch" on your "delete" links.

    Nevertheless, I tend to agree that the whole idea of prefetching is flawed.

  7. Jon August 5th, 2005 7:08 pm

    I wonder, is there any way to disable prefetching on Apache or through PHP or something, deny them the favor and give them a message telling them not to _ever_ use prefetching again? That would be decently helpful if it were ever possible…

  8. Pranab November 4th, 2005 8:07 pm

    Well I’m from India, and here, and I’m sure in many such parts of the wide world, prefetching makes real sense.

    Surfing on a dialup, especially rural connections is really unpleasant and does not make for a coherent user experience. So when used wisely, say a limit for prefetching one text page per document, prefetching sounds good.

    Images should not be prefetched, good connection or bad. Because it won’t mean anything for folks with broadband, and will not really help people on dialup.