SEO, "look Ma, no hash!" and .htaccess tricks

AngularJS and web analytics part 3

Websites using AngularJS are often terrible when it comes to SEO. You will have a master page which contains all the top navigation and footer links and an empty placeholder. When the page loads, JavaScript executes and AngularJS fetches an HTML fragment that is injected into the empty placeholder. For a long time the search engine bots could not execute JavaScript, and although some now do, you cannot rely on it for every bot. You will need to feed them AngularJS-free versions of the pages your visitors see.

You can choose whether or not to keep the hashbang (#!) visible in the URL. Search engine bots can crawl your website either way, but each option requires specific preparation. In both cases the Google, Bing and Yandex bots will crawl your AngularJS pages if you support the _escaped_fragment_ URL scheme. Back in October 2015 Google announced that the Google bot would no longer support crawling URLs containing _escaped_fragment_, but it seems that it still does. Baidu does not seem to support this feature.

With hashbang

The hashbang refers to the hash followed by an exclamation mark in the URL, i.e. #!. The hashbang is followed by an identifier for the HTML fragment that was loaded into the empty placeholder. The Google, Bing and Yandex bots will replace the hashbang with ?_escaped_fragment_= in the URL and index the content served at the resulting URL.

If your URL is http://www.test.com/#!page1, these fine bots will convert it to http://www.test.com/?_escaped_fragment_=page1 and index the content served at that URL instead.
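To make the conversion concrete, here is a minimal sketch of what the bots do with a hashbang URL. The function name toEscapedFragmentUrl is my own invention, and the list of characters that get percent-encoded is my reading of the crawling scheme, so treat it as an illustration rather than a reference implementation.

```javascript
// Sketch of the crawler-side rewrite: '#!fragment' becomes '?_escaped_fragment_=fragment'.
// toEscapedFragmentUrl is a hypothetical helper, not part of AngularJS or any bot.
function toEscapedFragmentUrl(url) {
  const hashbangIndex = url.indexOf('#!');
  if (hashbangIndex === -1) {
    return url; // no hashbang, nothing to convert
  }
  const base = url.slice(0, hashbangIndex);
  const fragment = url.slice(hashbangIndex + 2);

  // Assumption: only control characters, space, '#', '%', '&', '+' and
  // non-ASCII characters are percent-encoded; '=' and '/' are left alone.
  // (A lone surrogate in the fragment would make encodeURIComponent throw.)
  const escaped = fragment.replace(
    /[\u0000-\u0020#%&+\u007f-\u{10ffff}]/gu,
    (character) => encodeURIComponent(character)
  );

  // Append to an existing query string if there is one.
  const separator = base.includes('?') ? '&' : '?';
  return base + separator + '_escaped_fragment_=' + escaped;
}

console.log(toEscapedFragmentUrl('http://www.test.com/#!page1'));
// → http://www.test.com/?_escaped_fragment_=page1
```

You never run this yourself, the bots do it for you, but it helps to know exactly which URL your server will be asked for.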

You will need server-side code that will spot when something is requesting the converted URL and serve an AngularJS-free version of the original page with hashbang.

Without hashbang

Your URL will have nothing to convert, but you will need to tell the Google, Bing and Yandex bots that they need to convert the URL anyway. You do that by adding the following tag to the <head> of your page:

<meta name="fragment" content="!">

The bots will convert this URL http://www.test.com/page1 into http://www.test.com/page1?_escaped_fragment_=. Here again you will need server-side code that will serve an AngularJS-free page when the converted URL is requested.
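The server-side check is the same in both cases: look for the _escaped_fragment_ parameter and work out which page the bot is really after. My own server code is PHP, so the JavaScript below is only a rough sketch of the logic, and originalUrlFor is a made-up name.

```javascript
// Sketch: map a bot request back to the URL a visitor would see.
// originalUrlFor is a hypothetical helper; it returns null for ordinary visitors.
function originalUrlFor(requestUrl) {
  const url = new URL(requestUrl);
  if (!url.searchParams.has('_escaped_fragment_')) {
    return null; // not a converted URL, serve the normal AngularJS page
  }

  const fragment = url.searchParams.get('_escaped_fragment_');
  url.searchParams.delete('_escaped_fragment_');
  const base = url.origin + url.pathname + url.search;

  // Empty value: the hashbang-free case, the path itself identifies the page.
  // Non-empty value: the hashbang case, put the fragment back after '#!'.
  return fragment === '' ? base : base + '#!' + fragment;
}

console.log(originalUrlFor('http://www.test.com/?_escaped_fragment_=page1'));
// → http://www.test.com/#!page1
console.log(originalUrlFor('http://www.test.com/page1?_escaped_fragment_='));
// → http://www.test.com/page1
```

Whenever the function returns something other than null, the server should respond with the AngularJS-free version of that page.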

Is that it? Not so fast. You will need to rewrite URLs on your server to handle the hashbang-free URLs. I use PHP so I have added a .htaccess file, which I copied from elsewhere. It is only 5 lines of code.

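# Turn on URL rewriting for this folder
RewriteEngine On
# Requests for real files and folders are served as they are
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
# Every other URL is handed to the page that loads AngularJS
RewriteRule ^ demo4.php [L]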

You will also need to enable the HTML5 mode in your AngularJS code so that it uses the HTML5 History API to store your navigation history instead of the hashbang. Please check out the source code of the example below: ngRoute, HTML5 mode and .htaccess enabled
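For reference, enabling HTML5 mode with ngRoute looks roughly like the sketch below. The module name demoApp, the /page1 route and the template path are placeholders of mine, not the names used in the example. AngularJS also expects a <base href="/"> tag in the <head> when HTML5 mode is switched on this way.

```javascript
// Hypothetical route configuration; 'demoApp' and the template path are placeholders.
function configureRoutes($routeProvider, $locationProvider) {
  $routeProvider
    .when('/page1', { templateUrl: 'partials/page1.html' })
    .otherwise({ redirectTo: '/' });

  // Use the HTML5 History API instead of hash-based URLs.
  $locationProvider.html5Mode(true);
}

// Register the configuration only when AngularJS is loaded, i.e. in the browser
// with angular.js and angular-route.js included on the page.
if (typeof angular !== 'undefined') {
  angular
    .module('demoApp', ['ngRoute'])
    .config(['$routeProvider', '$locationProvider', configureRoutes]);
}
```

With this in place, a link to /page1 updates the address bar without a hashbang, and the .htaccess rules make sure a full page reload of that URL still lands on the page that loads AngularJS.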

sitemap.xml and robots.txt

The next thing you will need is to generate your sitemap.xml and write your robots.txt files. I have put these files in the web root folder. I have used the Screaming Frog SEO crawler to generate the sitemap. If your site has fewer than 500 pages you can use Screaming Frog for free. The sitemap.xml file will tell Google how often your content changes and which pages are more important than others. Once your sitemap.xml file is ready you can log in to Google Webmaster Tools.
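If you would rather see what goes into the file than rely on a crawler, a sitemap is simple enough to build by hand. This is only a sketch and not how Screaming Frog works: buildSitemap and escapeXml are names I made up, while changefreq and priority are the sitemap elements that carry the "how often" and "how important" hints.

```javascript
// Sketch: build a sitemap.xml string from a list of pages.
// buildSitemap and escapeXml are hypothetical helpers written for this example.
function buildSitemap(pages) {
  const entries = pages.map((page) => [
    '  <url>',
    `    <loc>${escapeXml(page.loc)}</loc>`,
    `    <changefreq>${page.changefreq}</changefreq>`, // e.g. daily, weekly, monthly
    `    <priority>${page.priority.toFixed(1)}</priority>`, // 0.0 to 1.0
    '  </url>',
  ].join('\n'));

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</urlset>',
  ].join('\n');
}

// URLs in a sitemap must have XML special characters escaped.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

console.log(buildSitemap([
  { loc: 'http://www.test.com/', changefreq: 'weekly', priority: 1 },
  { loc: 'http://www.test.com/page1', changefreq: 'monthly', priority: 0.8 },
]));
```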

The robots.txt file will tell the search engine bots where your sitemap.xml file is and which folders are out of bounds. Here's a robots.txt file example:

Sitemap: http://www.albangerome.com/sitemap.xml
User-agent:*
