GoogleBot Gets Smarter: Crawling AJAX Comments

November 17, 2011

For years, Google has not crawled a significant part of websites and blogs: the comments.  There were two reasons for this:

  • Google always aims to provide the best webpages in its results. Historically, many comment sections have been filled with spam, or have otherwise been irrelevant or unhelpful to the content of the post.
  • Comment add-on integrations from Disqus, Facebook, and other popular commenting plugins and tools rely on AJAX and require JavaScript to be executed, which Googlebot wasn’t doing.

Regarding #1, Panda updates have done a great job of removing spam sites with bad comments.  Also, most websites have now established a platform and community in which the comments are helpful and relevant. Sometimes the comments are even more helpful than the post itself, which is a frequent occurrence on blogs like SEOmoz.

And regarding #2, on November 1st the search community discovered that Googlebot is now able to crawl and index certain AJAX and JavaScript content.  This was later confirmed by Googler Matt Cutts. The confirmation noted that this applies specifically to the Disqus and Facebook comment platforms, which are the two biggest providers on the web.

So What Does Comment Indexation Mean To You?

It’s simple.  Remember point #1 above?  Google wants to deliver relevant, non-spam sites to its users.  A primary signal of relevancy is recency.  Content that is updated regularly does not appear stale to Google.  Indexed comments show that there is fresh content on the site: fresh, engaging content that changes each time Google visits. At the beginning of the month Google launched a “freshness” update that solidifies its commitment to fresh content. The two announcements are clearly connected and send a single message: keep your content updated often.

Bear in mind that having new comments crawled doesn’t mean your site will be boosted to the #1 position.  Comments are only one part of your website.  If the comments are the only thing staying fresh and the rest of your site remains unchanged for months or years at a time, Google is smart enough to recognize this.

So What Should You Do About It?

If you have Disqus or Facebook comments enabled on your blog, check Google to see if your comments are being indexed.  Find a comment and copy it directly into the Google search box surrounded by quotes (to get an exact match).  If the comment appears in the results, dig a little deeper and look at how it is displayed so you are aware of the context surrounding it.
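
If you want to check more than a handful of comments, the exact-match search described above can be scripted. The following is only a rough sketch in Python that builds the search URL for you to open in a browser; it does not contact Google. The function name exact_match_search_url and the optional site parameter (which adds Google's site: operator to limit results to one domain) are illustrative additions, not something from the announcement.

```python
from urllib.parse import urlencode


def exact_match_search_url(comment_text, site=None):
    """Build a Google search URL that looks for comment_text as an exact phrase.

    Hypothetical helper for illustration only.
    """
    # Collapse runs of whitespace so a comment copied across line breaks
    # is searched as a single phrase.
    phrase = " ".join(comment_text.split())
    # Drop double quotes inside the comment; they would end the
    # exact-match phrase early.
    phrase = phrase.replace('"', "")
    # Surrounding the phrase with quotes asks Google for an exact match.
    query = f'"{phrase}"'
    if site:
        # Assumption: restricting to your own domain with the site: operator
        # is an optional extra, not part of the original advice.
        query = f"site:{site} {query}"
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # example.com is a placeholder domain.
    print(exact_match_search_url("Great post, thanks for sharing!", site="example.com"))
```

Open the printed URL in a browser and look for your page in the results. Shorter, distinctive snippets from a comment tend to work better than pasting a very long comment in full.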

Also (and this is the hard one), be actively engaged with users of your website.  This means responding to complaints, providing help to those who need it, and sharing news often. Comment indexation also means that negative comments can show up in search results, so being responsive is even more important.

The Bigger Picture

While indexing new content is great in the short term, the ability to crawl AJAX and execute JavaScript matters much more for the future.  As more and more sites have been built with dynamic content that relies on these technologies, Google has been unable to keep up with the trend.  It took YEARS before Google was able to index Flash, and even then, it was far from perfect.  This response to AJAX is a promising sign that Google has embraced the state of the web as it stands now and is building a platform it can grow on in the future.