Internet Marketing Journal UK

May 16, 2008

Get Google to Cache your Pages More Frequently

Filed under: Blogging, Google, IMJUK, Links — Mercury Thread @ 2:37 pm

Getting your pages cached by Google can be a nightmare at times. You’ve updated a web page and then you sit back and wait, wait some more, and after more waiting GoogleBot decides to index and cache your updated web page. It’s a problem that Dunkin’ Donuts had recently. They had a free iced coffee day in their branches and put it on their website, but it was invisible in their search rankings as the page was not cached. GrokDotCom have posted on this recently.

Traditionally Google would index, and cache, your website based upon the number of links and the resulting PageRank of your webpages. Pages with more links and/or PageRank would be reindexed and cached more frequently than pages with fewer links and/or less PageRank.

It is possible to make changes to your website to help Google cache your website more often.

Stage 1 - Send out the correct HTTP headers where possible

The first step is to make sure your web server is sending out the correct headers. Last-Modified dates in your HTTP headers let Google calculate effectively whether they should reindex your page: has it been updated since the last time they requested it? If you imagine every page in Google’s index being requested every time GoogleBot visited a website, you can see why they have a system to try to reduce their crawling bandwidth where possible. If they didn’t, their business costs would go up, reducing profitability and making their shareholders less happy.
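As a rough illustration of that exchange, here is a minimal Python sketch of how a page handler might answer a conditional request. The function name `conditional_response` and its arguments are made up for this example and aren't taken from any particular CMS or web server; it assumes you already know when the page content last changed.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(last_modified, if_modified_since=None):
    """Return (status, headers) for a page last changed at `last_modified`.

    `last_modified` must be a timezone-aware UTC datetime (an assumption
    of this sketch). `if_modified_since` is the raw request header value,
    or None if the crawler did not send one.
    """
    # HTTP dates only have one-second resolution, so drop microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall back to a full response
        # Only compare when the parsed date carries a timezone.
        if since is not None and since.tzinfo is not None:
            if last_modified <= since:
                # Nothing changed since the last visit: no body needed.
                return 304, headers

    # First visit, or the page changed: send the full page.
    return 200, headers
```

A crawler that already holds a copy sends the date it last saw in `If-Modified-Since`; if the page hasn't changed, the answer is a 304 with no page body, which is where the bandwidth saving comes from.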

If you have a CMS-driven website it is likely that this header isn’t being sent. IIS servers will send out Last-Modified data but it’s often wrong/falsified (as the Last-Modified header will have the value for the exact second your page is requested), and many PHP-based CMSs don’t even send Last-Modified headers out. (As an aside and a plug, Andy @ Oyster Web developed a CMS that does send these out, and this has helped increase the frequency with which Google spiders the most recently updated pages on a website.) I know this blog doesn’t send these HTTP headers out at present. We are developing a WordPress plugin to send out Last-Modified dates on pages, but it’s quite a bit down my list of priorities.
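If you want to check which of these situations your own site is in, a small script can look at the response headers for you. The sketch below is only an illustration: `audit_last_modified` is a hypothetical helper name, and it assumes you have already fetched the headers into a plain dictionary. It flags a missing header, and also the case described above where Last-Modified is simply the moment the page was requested (by comparing it with the Date header).

```python
from email.utils import parsedate_to_datetime


def audit_last_modified(headers):
    """Classify the Last-Modified header in a response-header mapping.

    Returns one of: 'missing', 'unparseable',
    'same-as-request-time', 'ok'.
    """
    # Header names are case-insensitive, so normalise the keys first.
    lookup = {name.lower(): value for name, value in headers.items()}

    raw = lookup.get("last-modified")
    if raw is None:
        return "missing"

    try:
        modified = parsedate_to_datetime(raw)
        served = parsedate_to_datetime(lookup["date"]) if "date" in lookup else None
    except (TypeError, ValueError):
        return "unparseable"

    # A Last-Modified equal to the response Date suggests the server is
    # just stamping the current time on every request.
    if served is not None and modified == served:
        return "same-as-request-time"
    return "ok"
```

A single matching pair doesn't prove the header is falsified (the page may genuinely have just changed), so it is worth requesting the page twice a few seconds apart before drawing conclusions.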

Stage 2 - Get a blog and get it pinging

One of the great things about blogs is the ‘Blog and Ping’ functionality: instead of waiting for Google to work out that your content has updated, you can tell them. When you update your website/blog content you send a message to Google Blog Search that your site has updated, and this should trigger their spider to visit the relevant pages almost instantly. GoogleBot normally visits shortly after, and your updated page will appear in the index shortly after that. This system hasn’t replaced the normal spidering that Google does of websites; it is more of an added extra.

For this reason it is important to make sure that your blog is:

  1. Pinging effectively
  2. Pinging Google Blog Search

I use WordPress for almost every blog I set up, and for WordPress there is a great little plugin that shows you whether your pings have been working. Download the plugin to test your pings here.

Make sure you’re pinging Google Blog Search: check that you’re pinging http://blogsearch.google.com/ping. If you are, you’re well on your way to getting your website spidered and cached more often by Google.
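For anyone curious what a ping actually looks like, here is a minimal Python sketch. It assumes the ping service speaks the conventional `weblogUpdates.ping` XML-RPC call (blog name, then blog URL) that blog ping services generally use; whether a given endpoint accepts that call at a particular URL is an assumption you should check against the service's own documentation. The function names are made up for this example.

```python
import xmlrpc.client


def build_ping_payload(blog_name, blog_url):
    """Build the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps(
        (blog_name, blog_url),
        methodname="weblogUpdates.ping",
    )


def send_ping(endpoint, blog_name, blog_url):
    """Send the ping to `endpoint` and return the service's reply.

    `endpoint` is whichever XML-RPC URL your ping service documents;
    it is deliberately not hard-coded here.
    """
    proxy = xmlrpc.client.ServerProxy(endpoint)
    return proxy.weblogUpdates.ping(blog_name, blog_url)
```

Calling `build_ping_payload("My Blog", "http://example.com/")` shows you the XML that would be posted without touching the network, which is handy when you are trying to work out why a ping is being rejected.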

2 Comments

  1. Great Article. I am still trying to figure all this seo stuff out. I have Seoquake from firefox but I don’t know what all the statistics mean.

    Comment by Bonus — June 3, 2008 @ 3:34 am

  2. This is a really helpful tip and I appreciate the link to the Wordpress Plugin ping tester. A few of our websites are using Wordpress and I’ll certainly be adding that plugin.

    Comment by Lee — July 29, 2008 @ 6:55 pm
