How Canonical URL Redirects Work for Static Sites

Posted on: Saturday, 2 Mar, 2019

John Hashim

Hey! I’m a junior web developer at PROTECHIG and founder of Unwrote Blog.

The traditional way of making sure every visitor ends up on a single canonical URL is with server-side redirects: wherever a user may land, the server issues an HTTP 301 Moved Permanently redirect to the canonical URL.
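As a rough illustration of that server-side decision, here is a minimal sketch in JavaScript. The origin https://www.example.com and the names CANONICAL_ORIGIN and canonicalRedirect are assumptions made up for this example; a real server would feed in the scheme, Host header and path of each incoming request.

```javascript
// Hypothetical canonical origin, used only for illustration.
const CANONICAL_ORIGIN = "https://www.example.com";

// Decide whether a request needs a 301, given the scheme, Host header and
// path it arrived with. Returns the redirect to send, or null if none is needed.
function canonicalRedirect(scheme, host, path) {
  const canonical = new URL(CANONICAL_ORIGIN);
  const sameScheme = scheme + ":" === canonical.protocol;
  const sameHost = host.toLowerCase() === canonical.host;
  if (sameScheme && sameHost) {
    return null; // already canonical, serve the page normally
  }
  return {
    statusCode: 301,
    headers: { Location: canonical.origin + path },
  };
}

// A request for the bare domain over plain HTTP gets redirected...
console.log(canonicalRedirect("http", "example.com", "/about/?ref=1"));
// ...while a request that is already canonical does not.
console.log(canonicalRedirect("https", "www.example.com", "/about/"));
```

The path and query string are carried over unchanged, so a visitor lands on the same page, just under the canonical protocol and host.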

You can see this in action on Google when requesting http://google.com and looking at the HTTP response headers below, where the browser is redirected to http://www.google.com/. Observe that Google doesn’t redirect to SSL! (A missed opportunity?)

```http
HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Tue, 18 Apr 2017 18:29:33 GMT
Expires: Thu, 18 May 2017 18:29:33 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
```
Since the proliferation of static site generators like Hugo and Jekyll, free static web hosts like GitLab Pages and GitHub Pages have also become popular. In fact, this blog is generated with Hugo and hosted on GitLab Pages, as they support custom domains and SSL. These static web hosts, however, generally don’t support anything more exotic than serving files off disk.

JavaScript redirects

So how do we handle the redirects to the canonical URL without server-side support? With client-side JavaScript, of course!

Start with a function that, given a canonical URL, will redirect the browser based on the current location given by location.href:

```javascript
function redirectPageIfNeeded(canonicalURL) {
  // Extract protocol and host from the canonical URL
  var regexp = new RegExp("^(https?:)//([^/]+)", "i");
  var matches = regexp.exec(canonicalURL);
  if (!matches) {
    return; // not an absolute http(s) URL, so there is nothing to compare against
  }
  var canonicalProto = matches[1].toLowerCase();
  var canonicalHost = matches[2].toLowerCase();

  // Current browser URL
  var href = location.href;

  // Track whether we need to redirect the browser
  var hrefRedirect = false;

  // Perform protocol redirect?
  if (location.protocol.toLowerCase() !== canonicalProto) {
    href =
      canonicalProto + // new protocol
      href.substring(location.protocol.length); // host + path
    hrefRedirect = true;
  }

  // Perform hostname redirect?
  if (location.host.toLowerCase() !== canonicalHost) {
    var pos = href.indexOf(location.host);
    href =
      href.substring(0, pos) + // protocol
      canonicalHost + // new host
      href.substring(pos + location.host.length); // path
    hrefRedirect = true;
  }

  // Perform protocol and/or host redirect as required.
  // replace() keeps the non-canonical URL out of the session history,
  // so the Back button does not bounce the user into the redirect again.
  if (hrefRedirect) {
    location.replace(href);
  }
}
```
Ensure the above function is called from every page on your site, passing the canonical site URL as the argument:
```html
<script type="text/javascript">
  function redirectPageIfNeeded(canonicalURL){
    // ... as above
  }
  redirectPageIfNeeded("https://www.unwrote.com/");
</script>
```
We can also compact the function execution into an IIFE, shown below.
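As a rough sketch of the IIFE form, the snippet below defines and calls the redirect logic in a single expression, so no global function name is left behind. It is not the exact function from above: it uses the standard URL API for brevity, and it takes the location object as a second argument so that it can also run outside a browser. The fakeLocation object is a hypothetical stand-in; in a real page you would pass window.location instead.

```javascript
// Hypothetical stand-in for window.location, so this also runs under Node.
var fakeLocation = {
  href: "http://unwrote.com/posts/?page=2#top",
  replace: function (url) {
    this.redirectedTo = url; // a real browser would navigate here
  }
};

// Define and call the redirect logic in one expression (an IIFE).
(function (canonicalURL, loc) {
  var canonical = new URL(canonicalURL);
  var current = new URL(loc.href);

  // Redirect only when the protocol or host differs from the canonical one,
  // keeping the path, query string and fragment of the current page.
  if (current.protocol !== canonical.protocol || current.host !== canonical.host) {
    loc.replace(canonical.origin + current.pathname + current.search + current.hash);
  }
})("https://www.unwrote.com/", fakeLocation);

console.log(fakeLocation.redirectedTo);
// https://www.unwrote.com/posts/?page=2#top
```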

This script block is best placed at the top of the <head> section of each page to ensure that the redirect is processed immediately on page load. If the script is instead placed at the bottom of the <body>, the browser may load additional resources (e.g. CSS, JavaScript, images) before the redirect is executed. If those resources are linked via a relative URL, the browser will attempt to download them again following the redirect because they now appear on a different domain, wasting bandwidth and polluting the browser cache.
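To make the duplicate download concrete, the sketch below resolves the same relative path against a non-canonical and a canonical page address using the standard URL constructor. The page addresses and the css/site.css path are invented for illustration.

```javascript
// Hypothetical relative stylesheet reference, as it might appear in a page.
var relativePath = "css/site.css";

// The same page reached via a non-canonical and the canonical address.
var beforeRedirect = new URL(relativePath, "http://unwrote.com/posts/");
var afterRedirect = new URL(relativePath, "https://www.unwrote.com/posts/");

console.log(beforeRedirect.href); // http://unwrote.com/posts/css/site.css
console.log(afterRedirect.href); // https://www.unwrote.com/posts/css/site.css

// Different origins, so the browser treats these as two separate resources.
console.log(beforeRedirect.origin === afterRedirect.origin); // false
```

Because the two absolute URLs differ, anything fetched before the redirect cannot be reused afterwards.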

Hugo tips and tricks

For Hugo, calling the redirect function becomes easy through a partial, where we can call the redirect with “{{ .Site.BaseURL }}” to provide the canonical URL. Nice!

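Unfortunately, this also breaks the local server preview, or “watch”, feature that we get with hugo -w: a page served from a local address does not match the canonical URL, so the script immediately redirects it to the live site.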

The solution is to make the call to redirectPageIfNeeded conditional on an environment variable, and to set that variable in an npm run command that generates the final site for upload to the static web host. When the environment variable is not set (as when running hugo -w during local preview), the redirect code is left out of the page; it is only enabled when the final site is generated.

The package.json run command is defined as:

```json
{ ...
  "scripts": {
    "final": "rm -rf public && NODE_ENV=production hugo",
    ...
  }
}
```
If you build on Windows, you should additionally use cross-env, because the inline NODE_ENV=production assignment relies on POSIX shell syntax.
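Another possible route for Windows is a small Node script that performs the same two steps without relying on a shell. This is only a sketch under stated assumptions: the file name final-build.js, the --run flag and the function names are made up for this example, the hugo binary is presumed to be on the PATH, and fs.rmSync needs Node 14.14 or later.

```javascript
const fs = require("fs");
const { spawnSync } = require("child_process");

// Copy the current environment and add NODE_ENV=production, which is what
// the inline `NODE_ENV=production hugo` assignment does in a POSIX shell.
function productionEnv(baseEnv) {
  return Object.assign({}, baseEnv, { NODE_ENV: "production" });
}

// Rough equivalent of `rm -rf public && NODE_ENV=production hugo`.
function finalBuild() {
  fs.rmSync("public", { recursive: true, force: true });
  return spawnSync("hugo", [], {
    stdio: "inherit",
    env: productionEnv(process.env),
  });
}

// Only build when explicitly asked to, e.g. `node final-build.js --run`.
if (process.argv.includes("--run")) {
  const result = finalBuild();
  process.exitCode = result.status === 0 ? 0 : 1;
}
```

With that in place, the "final" script could be pointed at node final-build.js --run instead of the shell one-liner.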

The partial becomes:

```html
<script type="text/javascript">
  {{ $environment := (getenv "NODE_ENV") }}
  {{ if (eq $environment "production") }}
    (function(canonicalURL){
      // redirectPageIfNeeded() code as above
    })("{{ .Site.BaseURL }}");
  {{ end }}
</script>
```
Now we can run hugo -w and the redirects are disabled, but when we run npm run final to generate the static files for our site, the redirect is baked in.