Content Delivery Network (CDN) using Linode VPS

One of the neat things I’ve done this month was to set up a small content delivery network (CDN) for speedy downloading of files across the globe. For one reason and another (mostly the difficulty in doing this purely with DNS and the desire not to use AWS), I opted to do this using my favourite VPS provider, Linode. All in all (and give or take DNS propagation time) I reckon it’s possible to deploy a multi-site CDN in under 30 minutes given a bit of practice. Not too shabby!

For this recipe you will need:

  1. Linode account
  2. A domain name and DNS management

What you’ll end up with:

  1. 3x Ubuntu 12.04 LTS VPS, one each in London, Tokyo and California
  2. 3x NodeBalancers, one each in London, Tokyo and California
  3. 1x user-facing general web address
  4. 3x continent-facing web addresses

I’m going to use “mycdn.com” wherever I refer to my DNS / domain. You should substitute your domain name wherever you see it.

So, firstly log in to Linode.

Create three new Linode 1024 small VPSes (or whatever size you think you’ll need). I set mine up as Ubuntu 12.04 LTS with 512MB swap but otherwise nothing special. Set one each to be in London, Tokyo and Fremont. Set the root password on each. Under “Settings”, give each VPS a label. I called mine vps-<city>-01. Under “Remote Settings”, give each a private IP and note them down together with the VPS/data centre they’re in.

At this point it’s also useful (but not strictly necessary) to give each node a DNS A record for its external IP address, just so you can log in to them easily by name later.

Boot all three machines and check you can log in to them. I find it useful at this point to update everything:

apt-get update ; apt-get dist-upgrade

You can also now install Apache and mod_geoip on each node:

apt-get install apache2 libapache2-mod-geoip
a2enmod include
a2enmod rewrite

You should now be able to point a web browser at each VPS in turn (by public IP or hostname) and see the default Apache “It works!” page.

Ok.. still with me? Next we’ll ask Linode to fire up three NodeBalancers, again one in each of the data centres, one for each VPS. I labelled mine cdn-lb-<city>-01. Configure each one with a single port, 80, leaving the default settings for now. Add a host to each NodeBalancer using the private IP of the VPS in the same data centre, and the port, e.g. 192.168.128.123:80. Note that the VPSes haven’t yet been configured to listen on their private interfaces, so each NodeBalancer won’t yet recognise its host as being up.

Ok. Let’s fix those private interfaces. SSH into each VPS using the root account and the password you set earlier. Edit /etc/network/interfaces and add:

auto eth0:1
iface eth0:1 inet static
	address <VPS private address here>
	netmask <VPS private netmask here>

Note that your private netmask is probably not 255.255.255.0 like the one on your home network, and yes, this does make a difference. Once that configuration is in, you can bring the interface up:

ifup eth0:1
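
To see why the netmask matters, here is a minimal sketch in plain POSIX shell that checks whether two addresses land on the same network under a given mask. The addresses and the 255.255.128.0 mask are assumptions for illustration only (use whatever Linode shows for your VPS), and ip_to_int and same_subnet are names I've made up:

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address on dots into $1..$4
    IFS=$old_ifs
    echo "$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))"
}

# Succeed (exit 0) when both addresses share a network under the mask.
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ "$(( a & m ))" -eq "$(( b & m ))" ]
}
```

With a hypothetical VPS at 192.168.128.123 and a hypothetical peer at 192.168.255.10, same_subnet succeeds with the assumed mask 255.255.128.0 but fails with 255.255.255.0. With the wrong mask the VPS would treat the peer as being off-link, which is roughly how a bad netmask can leave the NodeBalancer unable to see its host.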

Now we can add DNS records for each NodeBalancer. Take the public IP for each NodeBalancer over to your DNS manager and add an A record with a meaningful name for each one. I used continental regions americas, apac, europe, but you might prefer to be more specific than that (e.g. us-west, eu-west, …). Once the DNS propagates you should be able to see each of your Apache “It works!” pages again in your browser, but this time the traffic is running through the NodeBalancer (you might need to wait a few seconds before the NodeBalancer notices the VPS is now up).

Ok so let’s take stock. We have three VPS, each with a NodeBalancer and each running a web server. We could stop here and just present a homepage to each user telling them to manually select their local mirror – and some sites do that, but we can do a bit better.

Earlier we installed libapache2-mod-geoip. This includes a (free) database from MaxMind which maps IP address blocks to the continents they’re allocated to (via the ISP who’s bought them). The Apache module takes the database and sets a series of environment variables for each and every visitor IP. We can use this to have a good guess at roughly where a visitor is and bounce them out to the nearest of our NodeBalancers – magic!

So, let’s poke the Apache configuration a bit. First, rm /etc/apache2/sites-enabled/000-default to disable the default site. Then create a new file /etc/apache2/sites-available/mirror.mycdn.com and give it the following contents:

<VirtualHost *:80>
	ServerName mirror.mycdn.com
	ServerAlias *.mycdn.com
	ServerAdmin webmaster@mycdn.com

	DocumentRoot /mirror/htdocs

	DirectoryIndex index.shtml index.html

	GeoIPEnable     On
	GeoIPScanProxyHeaders     On

	RewriteEngine     On

	RewriteCond %{HTTP_HOST} !americas.mycdn.com
	RewriteCond %{ENV:GEOIP_CONTINENT_CODE} NA|SA
	RewriteRule (.*) http://americas.mycdn.com$1 [R=permanent,L]

	RewriteCond %{HTTP_HOST} !apac.mycdn.com
	RewriteCond %{ENV:GEOIP_CONTINENT_CODE} AS|OC
	RewriteRule (.*) http://apac.mycdn.com$1 [R=permanent,L]

	RewriteCond %{HTTP_HOST} !europe.mycdn.com
	RewriteCond %{ENV:GEOIP_CONTINENT_CODE} EU|AF
	RewriteRule (.*) http://europe.mycdn.com$1 [R=permanent,L]

	<Directory />
		Order deny,allow
		Deny from all
		Options None
	</Directory>

	<Directory /mirror/htdocs>
		Order allow,deny
		Allow from all
		Options IncludesNoExec
	</Directory>
</VirtualHost>
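
The three rewrite blocks boil down to a small lookup from continent code to mirror. As a rough sketch of that mapping in shell (mirror_for_continent is a made-up helper name, not something Apache or mod_geoip provides):

```shell
#!/bin/sh
# Map a GeoIP continent code to the mirror hostname used in the
# rewrite rules above. Returns non-zero for codes with no rule.
mirror_for_continent() {
    case "$1" in
        NA|SA) echo "americas.mycdn.com" ;;
        AS|OC) echo "apac.mycdn.com" ;;
        EU|AF) echo "europe.mycdn.com" ;;
        *)     return 1 ;;   # unknown or missing code: no redirect
    esac
}
```

For example, mirror_for_continent OC prints apac.mycdn.com. A visitor whose continent can't be determined matches none of the rules and is simply served by whichever server they landed on.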

Now enable the new site with ln -s /etc/apache2/sites-available/mirror.mycdn.com /etc/apache2/sites-enabled/

Run mkdir -p /mirror/htdocs to make your new document root and add a file called index.shtml there. The contents should look something like:

<html>
 <body>
  <h1>MyCDN Test Page</h1>
  <h2><!--#echo var="HTTP_HOST" --></h2>
<!--#set var="mirror_eu"       value="http://europe.mycdn.com/" -->
<!--#set var="mirror_apac"     value="http://apac.mycdn.com/" -->
<!--#set var="mirror_americas" value="http://americas.mycdn.com/" -->

<!--#if expr="${GEOIP_CONTINENT_CODE} == AF"-->
 <!--#set var="continent" value="Africa"-->
 <!--#set var="mirror" value="${mirror_eu}"-->

<!--#elif expr="${GEOIP_CONTINENT_CODE} == AS"-->
 <!--#set var="continent" value="Asia"-->
 <!--#set var="mirror" value="${mirror_apac}"-->

<!--#elif expr="${GEOIP_CONTINENT_CODE} == EU"-->
 <!--#set var="continent" value="Europe"-->
 <!--#set var="mirror" value="${mirror_eu}"-->

<!--#elif expr="${GEOIP_CONTINENT_CODE} == NA"-->
 <!--#set var="continent" value="North America"-->
 <!--#set var="mirror" value="${mirror_americas}"-->

<!--#elif expr="${GEOIP_CONTINENT_CODE} == OC"-->
 <!--#set var="continent" value="Oceania"-->
 <!--#set var="mirror" value="${mirror_apac}"-->

<!--#elif expr="${GEOIP_CONTINENT_CODE} == SA"-->
 <!--#set var="continent" value="South America"-->
 <!--#set var="mirror" value="${mirror_americas}"-->
<!--#endif -->
<!--#if expr="${GEOIP_CONTINENT_CODE}"-->
 <p>
  You appear to be in <!--#echo var="continent"-->.
  Your nearest mirror is <a href="<!--#echo var="mirror" -->"><!--#echo var="mirror" --></a>.
 </p>
 <p>
  Or choose from one of the following:
 </p>
<!--#else -->
 <p>
  Please choose your nearest mirror:
 </p>
<!--#endif -->

<ul>
 <li><a href="<!--#echo var="mirror_eu"       -->"><!--#echo var="mirror_eu"        --></a> Europe (London)</li>
 <li><a href="<!--#echo var="mirror_apac"     -->"><!--#echo var="mirror_apac"      --></a> Asia/Pacific (Tokyo)</li>
 <li><a href="<!--#echo var="mirror_americas" -->"><!--#echo var="mirror_americas"  --></a> USA (Fremont, CA)</li>
</ul>

<pre style="color:#ccc;font-size:smaller">
http-x-forwarded-for=<!--#echo var="HTTP_X_FORWARDED_FOR" -->
GEOIP_CONTINENT_CODE=<!--#echo var="GEOIP_CONTINENT_CODE" -->
</pre>
 </body>
</html>

Then run apachectl restart to pick up the new virtual host and visit each of your NodeBalancer hostnames in turn. The ones which aren’t local to you should redirect you to your nearest server.

Pretty neat! The last step is to add a user-facing DNS name (I used mirror.mycdn.com) and set it up to DNS-RR (Round-Robin) across the addresses of the three NodeBalancers, i.e. one A record per NodeBalancer. Now set up a cron job to rsync your content to the three target VPSes, or a script to push content on demand. Job done!
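
A push script for that last step might look something like the sketch below. The hostnames are hypothetical (they assume you gave each VPS a DNS name matching its label), it assumes root SSH access with keys already set up, and the --delete flag is my own choice to keep the mirrors identical to the source:

```shell
#!/bin/sh
# Hypothetical VPS hostnames; substitute your own.
MIRRORS="vps-london-01.mycdn.com vps-tokyo-01.mycdn.com vps-fremont-01.mycdn.com"
SRC="/mirror/htdocs/"
DEST="/mirror/htdocs/"

# Push the document root to every mirror in turn.
# Set DRY_RUN=1 to print the rsync commands instead of running them.
push_content() {
    for host in $MIRRORS; do
        ${DRY_RUN:+echo} rsync -az --delete "$SRC" "root@${host}:${DEST}"
    done
}
```

Calling push_content at the end of the script and running it from cron covers the scheduled case; running it by hand covers the on-demand one. Try it with DRY_RUN=1 first, because --delete removes anything on the mirrors that isn't in the source directory.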

For extra points:

  1. Clone another VPS behind each NodeBalancer so that each continent is fault tolerant, meaning you can reboot one VPS in each pair without losing continental service.
  2. Explore whether it’s safe to add the public IP of one NodeBalancer to the Host configuration of a NodeBalancer on another continent, effectively making a resilient loop.

Using the iPod Nano 6th gen with Ubuntu

Today I spent 3 hours wrestling with a secondhand iPod Nano, 6th gen (the “6” is the killer) for a friend, trying to make it work happily with Ubuntu.

Having never actually owned an iPod myself, only iPhone and iPad, it was a vaguely educational experience too. I found nearly no useful information on dozens of fora – all of them only reporting either “it works” without checking the generation, or “it doesn’t work” with no resolution, or “it should work” with no evidence. Yay Linux!

There were two issues to address – firstly making the iPod block storage device visible to Linux and secondly finding something to manage the unconventional media database on the iPod itself.

It turned out that most iPods, certainly early generations, work well with Linux but this one happened not to. Most iPods are supported via libgpod, whether you’re using Banshee, Rhythmbox, even Amarok (I think) and others. I had no luck with Rhythmbox, Banshee, gtkpod, or simple block storage access for synchronising music.

It also turns out that Spotify, one of my other favourite music players, doesn’t use libgpod, which looked very promising.

So the procedure I used to get this one to work went something like this:

  1. Restore and/or initialise the iPod using the standard procedure with iTunes (I used iTunes v10 and the latest iPod firmware, 1.2) on a Windows PC. Do not use iTunes on OSX: that results in the iPod being formatted with a filesystem that isn’t well supported under Linux (hfsplus with journalling), whereas using Windows results in a FAT filesystem (mounted as vfat under Linux). Having said that, I did have some success making the OSX-initialised device visible to Linux, but it required editing fstab and adding:
    /dev/sdb2 /media/ipod hfsplus user,rw,noauto,force 0 0

    which is pretty stinky. FAT-based filesystems have been well supported for a long time – best to stick with that. Rhythmbox, the player I was trying at the time, also didn’t support the new media database. It appeared to copy files on but failed every time, complaining about unsupported/invalid database checksums. According to various fora the hashes need reverse engineering.

  2. Install the Ubuntu Spotify Preview using the Ubuntu deb (not the Wine version). I used the instructions here.
  3. I have a free Spotify account, which I’ve had for ages; it might not be possible to create one any more. I was worried that without a premium or unlimited account I wouldn’t be able to use the iPod sync, but in the end it worked fine. The iPod was seen and available in Spotify straight away and allowed synchronisation of specific playlists or all “Local Files”. In the end, as long as Spotify was running and the iPod connected, I could just copy files directly into my ~/Music/ folder and Spotify would sync them onto the iPod immediately.

Superb, job done! (I didn’t try syncing any pictures)

 

XBMC on Ubuntu 11.10 Oneiric Ocelot

Yesterday I upgraded my XBMC media centre, an Acer (bleugh!) Revo 3610, from Ubuntu 10.10 to 11.10 (Oneiric Ocelot).

The upgrade itself went fine but (re)installed a few things I’d previously removed, things I didn’t want and things which break a few XBMC features. This is what I had to do to reset things:

  1. Reset the xbmc user’s login session to ‘custom session’ using the gear icon on the top-right of the login window
  2. Add a .xsession file containing
    #!/bin/sh
    exec xbmc

    and chmod +x .xsession

  3. apt-get remove nautilus ubufox xul-ext-ubufox network-manager* pulseaudio
  4. Reset network settings (e.g. /etc/resolv.conf) if you made the mistake of logging in, resulting in NetworkManager resetting everything
  5. Check your xbmc user is still in the ‘audio’ group
  6. apt-add-repository ppa:ubuntu-x-swat/x-updates
  7. apt-get update
  8. apt-get install nvidia-current # if you hadn’t previously done this
  9. apt-get dist-upgrade
  10. apt-get autoremove
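
Step 5 is easy to check from a shell. A minimal sketch, assuming your media centre user really is called xbmc (in_group is just a helper name I've made up):

```shell
#!/bin/sh
# Succeed (exit 0) if user $1 is a member of group $2.
in_group() {
    # id -nG prints the user's group names separated by spaces;
    # put one per line and look for an exact match.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -Fqx -- "$2"
}
```

Something like in_group xbmc audio || echo "xbmc is not in the audio group" then tells you whether the upgrade dropped the membership. If it did, adduser xbmc audio (as root) should put it back.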

It’s probably worth saying I use plain stereo output from the headphone jack, and a Grand Hand III VGA adapter rather than HDMI because my TV is about 9 years old.