Automated basic Xen snapshots

With a little bit of shell-scripting I made this. It runs on a dom0 and scans through virtual machines running on a cluster, requesting snapshots of each one. You may want to use vm-snapshot-with-quiesce if it’s supported by xen-tools on your domU machines.

for i in `xe vm-list params=name-label | grep name | awk '{print $NF}' | xargs echo`; do echo $i; xe vm-snapshot vm=$i new-name-label="$i-`date +'%Y-%m-%dT%H:%M:%S'`"; done

If you cron the above script every day then after a few days you may want to start deleting the oldest snapshots:

for i in `xe vm-list params=name-label | grep name | awk '{print $NF}'`; do snapshot=`xe vm-list name-label=$i params=snapshots | awk '{print $NF}'`; echo "$i earliest snapshot is $snapshot"; if [ "$snapshot" ]; then xe vm-uninstall vm=$snapshot force=true; fi; done

What does Technology Monoculture really cost for SME?

http://www.flickr.com/photos/ndrwfgg/140859675/sizes/m/

Open, open, open. Yes, I sound like a stuck record but every time I hit this one it makes me really angry.

I regularly source equipment and software for small-medium enterprises, SMEs. Usually these are charities and obviously they want to save as much money as they can with their hardware and software costs. Second-hand hardware is usually order of the day. PCs around 3-years old are pretty easy to obtain and will usually run most current software.

But what about that software? On the surface the answer seems simple: To lower costs use free or Open Source software (OSS). The argument for Linux, OpenOffice and other groupware applications is pretty compelling. So what does it really mean on the ground?

Let’s take our example office:
Three PCs called “office1”, “office2” and “finance” connected together using powerline networking. There’s an ADSL broadband router which provides wireless for three laptops and also a small NAS with RAID1 for backups and shared files.

Okay, now the fun starts. The office has grown “organically” over the last 10 years. The current state is that Office1 runs XP 64-bit; Office2 runs Vista Ultimate and the once-per-week-use “finance” runs Windows 2000 for Sage and a Gift Aid returns package. All three use Windows Backup weekly to the NAS. Office1 & Office2 use Microsoft Office 2007. Office1 uses Exchange for mail and calendars, Office2 uses Windows Mail and Palm Desktop. Both RDP and VNC are also used to manage all machines.

So, what happens now is that the Gift Aid package is retired and the upgrade is to use web access but can’t run on MSIE 6. Okay. Upgrade to MSIE 8. Nope – won’t run on Win2k. How about MSIE 7? Nope, can’t download that any more (good!). Right, then an operating system upgrade is in order.

What do I use? Ubuntu of course. Well, is it that easy? I need to support the (probably antique) version of Sage Accounts on there. So how about Windows XP? Hmm – XP is looking a bit long in the tooth now. Vista? You must be joking – train-wreck! So Windows 7 is the only option. Can’t use Home Premium because it doesn’t support RDP without hacking it. So I’m forced to use Win 7 Pro. That’s £105 for the OEM version or £150 for the “full” version. All that and I’ll probably still have to upgrade Sage, AND the finance machine is only used once a week. What the hell?

Back to the drawing-board.

What else provides RDP? Most virtualisation systems do – Xen, virtualbox and the like. I use Virtualbox quite a lot and it comes with a great RDP service built in for whatever virtual machine is running. Cool – so I can virtualise the win2k instance using something like the VMWare P2V converter and upgrade the hardware and it’ll run everything, just faster (assuming the P2V works ok)…

No, wait – that still doesn’t upgrade the browser for the Gift Aid access. Ok, I could create a new WinXP virtual machine – that’s more recent than Win2k and bound to be cheaper – because Virtualbox gives me RDP I don’t need the professional version, “xp home” would do, as much as it makes me cringe. How much does that cost? Hell, about £75 for the OEM version. What??? For an O/S that’ll be retired in a couple of years? You have to be kidding! And I repeat, Vista is not an option, it’s a bad joke.

I’m fed up with this crap!

Okay, options, options, I need options. Virtualise the existing Win2k machine for Sage and leave the Ubuntu Firefox web browser installation for the updated Gift Aid. Reckon that’ll work? It’ll leave the poor techno-weenie guy who does the finances with a faster PC which is technically capable of doing everything he needs but with an unfamiliar interface.

If I were feeling particularly clever I could put Firefox on the Win2k VM, make the VM start on boot using VBoxHeadless; configure Ubuntu to auto-login and add a Win2k-VM-RDP session as a startup item for the auto-login user. Not a bad solution but pretty hacky, even for my standards (plus it would need to shut-down the domain0 host when the VM shuts down).

All this and it’s still only for one of the PCs. You know what I’d like to do? Virtualise everything and stick them all on a central server. Then replace all the desktop machines with thin clients and auto-login-RDP settings. There’s a lot to be said for that – centralised backups, VM snapshotting, simplified (one-off-cost) hardware investment, but again there’s a caveat – I don’t think that I’d want to do that over powerline networking. I’d say a minimum requirement of 100MBps Ethernet, so networking infrastructure required, together with the new server. *sigh*.

I bet you’re thinking what has all this got to do with technology monoculture? Well, imagine the same setup without any Microsoft involved.

All the same existing hardware, Ubuntu on each, OpenOffice, Evolution Mail & Calendar or something like Egroupware perhaps or even Google Apps (docs/calendar/mail etc. – though that’s another rant for another day). No need for much in the way of hardware upgrades. No need for anything special in the way of networking. Virtualise anything which absolutely has to be kept, e.g. Sage, without enforcing a change to the Linux version.

I don’t know what the answer is. What I do know is that I don’t want to spend up to £450 (or whatever it adds up to for upgrade or OEM versions) just to move three PCs to Windows 7. Then again with Windows 8, 9, 10, 2020 FOREVER. It turns out you simply cannot do Microsoft on a shoestring. Once you buy in you’re stuck and people like Microsoft (and they’re not the only ones) have a license to print money, straight out of your pocket into their coffers.

Of course that’s not news to me, and it’s probably not news to you, but if you’re in a SME office like this and willing to embrace a change to OSS you can save hundreds if not thousands of pounds for pointless, unnecessary software. Obviously the bigger your working environment is, the quicker these costs escalate. The sooner you make the change, the sooner you start reducing costs.

Remind me to write about the state of IT in the UK education system some time. It’s like lighting a vast bonfire made of cash, only worse side-effects.

Shared Development Environments; why they suck and what you can do about it

http://www.flickr.com/photos/doctorow/2698336843/sizes/l/
http://www.flickr.com/photos/doctorow/2698336843/sizes/l/

I’ve wanted to write this post for a long time but only recently have I been made frustrated enough to do so.

So.. some background.

When I worked at the Sanger Institute I ran the web team there. This was a team with three main roles –

  1. Make sure the website was up
  2. Internal software development for projects without dedicated informatics support
  3. Support for projects with dedicated informatics

When I started, back in 1999 things were pretty disorganised but in terms of user-requirements actually a little easier – projects had the odd CGI script but most data were shipped out using file dumps on the FTP site. You see back then and for the few years’ previous, it was the dawning of the world-wide-web and web-users were much happier being faced with an FTP/gopher file-listing of .gz (or more likely, uncompressed .fasta) files to download.

Back then we had a couple of small DEC servers which ran the external- and internal- (intranet) websites. Fine. Well, fine that is, until you want to make a change.

Revision Control: Manual

Ok. You want to make a change. You take your nph-blast_server.cgi and make a copy nph-blast_server2.cgi . You make your changes and test them on the external website. Great! It works! You mail a collaborator across the pond to try it out for bugs. Fab! Nothing found. Ok, so copy it back over nph-blast_server.cgi and everyone’s happy.

What’s wrong with this picture? Well, you remember that development copy? Firstly, it’s still there. You just multiplied your attack-vectors by two (assuming there are bugs in the script capable of being exploited). Secondly, and this is more harmful to long-term maintenance, that development copy is the URL you mailed your collaborator. It’s also the URL your collaborator mailed around to his 20-strong informatics team and they posted on bulletin boards and USENET groups for the rest of the world.

Luckily you have a dedicated and talented web-team who sort out this chaos using a pile of server redirects. Phew! Saved.

Now multiply this problem by the 150-or-so dedicated informatics developers on campus serving content through the core servers. Take that number and multiply it by the number of CGI scripts each developer produces a month.

That is then the number of server redirects which every incoming web request has to be checked against before it reaches its target page. Things can become pretty slow.

Enter the development (staging) service

What happens next is that the web support guys do something radical. They persuade all the web developers on site by hook or by crook that they shouldn’t be editing content on the live, production, public servers. Instead they should use an internal (and for special cases, IP-restricted-external-access) development service, test their content before pushing it live, then use a special command, let’s call it webpublish, to push everything live.

Now to the enlightened developer of today that doesn’t sound radical, it just sounds like common sense. You should have heard the wailing and gnashing of teeth!

Shared development

At this point I could, should go into the whys and wherefores of using revision control, but I’ll save that for another post. Instead I want to focus on the drawbacks of sharing. My feeling is that the scenario above is a fairly common one where there are many authors working on the same site. It works really well for static content, even when a CMS is used. Unfortunately it’s not so great for software development. The simple fact is that requirements diverge – both for the project and for the software stack. These disparate teams only converge in that they’re running on the same hardware, so why should the support team expect their software requirements to converge also?

Allow me to illustrate one of the problems.

Projects A and B are hosted on the same server. They use the same centrally-supported library L. A, B and L each have a version. They all work happily together at version A1B1L1. Now B needs a new feature, but to add it requires an upgrade to L2. Unfortunately the L2 upgrade breaks A1. Project A therefore is obliged to undertake additional (usually unforeseen) work just to retain current functionality.

Another situation is less subtle and involves shared-user access. For developers this is most likely the root superuser although in my opinion any shared account is equally bad. When using a common user it’s very difficult to know who made a change in the past, let alone who’s making a change right now. I observed a situation recently where two developers were simultaneously trying to build RPMs with rpmbuild which, by default, builds in a system location like /usr/share . Simultaneously trying to access the same folders leads to very unpredictable, unrepeatable results. Arguably the worst situation is when no errors are thrown during the build and neither developer notices!

Naturally a lot of the same arguments against shared development go for shared production too. The support matrix simply explodes with a few tens of applications each with different prerequisites.

Other options

Back in the day there were fewer options – one was left with always having to use relative paths and often having to discard all but the core system prerequisites in fear of them changing unexpectedly over time. Using relative paths is still a fairly inexpensive way to do things but sometimes it’s just too restrictive. There is another way…

Virtualisation is now commonplace. You probably cross-paths with a virtual machine every day without knowing it. They’re ubiquitous because they’re really, really useful. For our development purposes one core support member can build a standard, supported virtual machine image and post it on the intranet somewhere. All the other developers can take it, start their own instances of it and do all of their own development on their own hardware without fighting for common resources. Upgrades can be tested independently of one another. Machines can be restarted from scratch and so on. Once development is complete and given sufficient core resources, each developer can even bundle up their working image and ship it into production as is. No further core support required!

What tools can you use to do this? Parallels? Too commercial. VMWare? A bit lardy. Xen? Probably a bit too hard-core. KVM? Not quite mature enough yet. No, my current favourite in the virtualisation stakes is VirtualBox. Cross platform and free. Works great with Ubuntu inside. A killer combination capable of solving many of these sorts of problems.