Pseudo-VPN stuff with SSH

Firstly, there are *lots* of ways to do this. This is one way.

Secondly, poking holes in your corporate network is occasionally frowned upon and may contravene your workplace Acceptable Use Policy or equivalent. If you have a VPN solution (HTTPS, L2TP, or whatever) which works on everything you need, then I shouldn’t need to tell you to use that instead.

Anyway…

At home, on the end of my DSL line I have a PC running Linux.

At work I have lots of PCs running Linux.

Sometimes I’m using a random machine and/or a platform unsupported by my corporate VPN and I want to connect to work without using the (recommended) HTTPS VPN or (complicated) L2TP. So I turn to a trusty source of cool networky stuff: SSH.

Importantly, SSH understands how to be a SOCKS server. This allows applications which understand SOCKS (most Windows stuff for example) to redirect all their traffic over SSH without the addition of a proxy server like Squid on the corporate end.

So, how do you set it up? It’s fairly easy:

1. Set up the work-to-home connection:

user@work:~$ while true; do ssh -NR20000:localhost:22 user@home.net; sleep 5; done

2. Set up the laptop-to-home connection:

user@laptop:~$ ssh -L15000:localhost:20000 user@home.net

3. Set up the laptop-to-work connection:

user@laptop:~$ ssh -D15001 localhost -p 15000
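With all three hops up you can sanity-check the SOCKS tunnel from the laptop before pointing real applications at it. A quick sketch using curl (the intranet hostname is just a placeholder; --socks5-hostname makes curl resolve names at the work end of the tunnel):

user@laptop:~$ curl --socks5-hostname localhost:15001 http://intranet.example.com/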

If you’re at home and your “other” machine is on the same network as your home server you can be a bit more adventurous and do the following:

1. Set GatewayPorts yes in the sshd_config on your home server (and restart sshd)

2. Set up the work-to-home connection, where home_ip is the IP of your home server on your internal network:

user@work:~$ while true; do ssh -NRhome_ip:15000:localhost:22 user@home.net; sleep 5; done

3. Set up the laptop-to-work connection:

user@laptop:~$ ssh -D15001 home_ip -p 15000

Passwordless authentication can be configured by setting up SSH public-key authentication – generate a key pair with ssh-keygen and add the public keys to the relevant authorized_keys files.

In both scenarios above, SOCKS-aware applications can be configured with server as “localhost” and port as “15001”. For non-SOCKS-aware applications, you can generally get away with using tsocks.
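For reference, a minimal /etc/tsocks.conf pointing at the tunnel might look something like this (the local network range is just an example – adjust to your own LAN), after which you run things as "tsocks somecommand":

# traffic for the local LAN goes direct
local = 192.168.1.0/255.255.255.0
# everything else goes via the SSH dynamic forward
server = 127.0.0.1
server_port = 15001
server_type = 5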

You’ll also notice that step (1) needs bootstrapping while you’re on the corporate network. Persuade someone to su to you, or do it while you’re in the office one day.

Generally you also want to reduce the possibility of your work-to-home connection failing, so run it in screen, or in a nohup script or use something like autossh or rstunnel to bring it back up for you.
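For example, the work-to-home leg from step (1) could be wrapped in autossh instead of the bare while loop – a sketch, assuming autossh is installed and keepalive probing is acceptable on your network:

user@work:~$ autossh -M 0 -N -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" -R20000:localhost:22 user@home.net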

Don’t forget you’ll also need to open appropriate holes in your home firewall, generally some sort of NAT, PAT, or DMZ settings to allow incoming SSH (TCP, port 22) to be forwarded to your home server.

Update 2010-06-30 17:57
It’s worth mentioning that if you don’t have a static IP on your home DSL line that you’ll need to use a dynamic DNS service (like DynDNS) to keep a static name for your dynamic IP. Personally I do other stuff with Linode so I’ve set something cool up using their web-service API.
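If you're not doing anything clever with a hosting provider's API, any of the stock dynamic DNS clients will do the trick – here's a rough ddclient sketch for a DynDNS-style service (account details and hostname are obviously placeholders, not my setup):

# /etc/ddclient.conf
protocol=dyndns2
use=web, web=checkip.dyndns.org
server=members.dyndns.org
login=your-login
password='your-password'
home.example.dyndns.org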

Generating MSCACHE & NTLM hashes using Perl

I’ve been doing a lot of tinkering recently whilst working on the revised rainbowcracklimited.com website. Naturally it uses Perl on the back end so I’ve had to find out how to make Windows-style hashes of various types using largely non-native means.

On the whole I’ve been able to make good use of the wealth of CPAN modules – Digest::MD4, Digest::MD5, Digest::SHA and Authen::Passphrase but for one reason and another I’ve wanted to find out how to make NTLM and MSCACHE hashes “by-hand”. It turns out this is pretty easy:

NTLM is just an MD4 digest of the password encoded as UTF-16LE (2-byte little-endian characters, plus surrogate pairs where needed):

perl -M"Unicode::String latin1" -M"Digest::MD4 md4_hex" -e 'print md4_hex(latin1("cheese")->utf16le),"\n"'

MSCACHE is a little bit more fiddly: it's an MD4 digest of the NTLM hash (above) concatenated with the lowercased username, again encoded as UTF-16LE:

perl -M"Unicode::String latin1" -M"Digest::MD4 md4 md4_hex" -e 'print md4_hex(md4(latin1("cheese")->utf16le) . latin1(lc "Administrator")->utf16le),"\n"'

Active Directory + Linux account integration

Firstly, a note of warning. I've done this mostly using CentOS, but there's no reason it shouldn't work just as well on other distributions. I've gleaned a lot of this information by scouring other resources around the internet – FAQs, newsgroups etc. – but as far as I can remember I wasn't able to find a coherent article describing all of the required pieces of the puzzle.

Secondly, the objective of this article is to have unified account management across Windows & Linux, or at least as close as possible. We're going to use Microsoft Active Directory, Kerberos, Samba, Winbind, PAM and NSS (nsswitch). We'll also end up with consistent UIDs and GIDs across multiple Linux clients.

/etc/samba/smb.conf

[global]
	workgroup = PSYPHI
	realm = PSYPHI.LOCAL
	security = ADS
	allow trusted domains = No
	use kerberos keytab = Yes
	log level = 3
	log file = /var/log/samba/%m
	max log size = 50
	printcap name = cups
	idmap backend = idmap_rid:PSYPHI=600-20000
	idmap uid = 600-20000
	idmap gid = 600-20000
	template shell = /bin/bash
	winbind enum users = Yes
	winbind enum groups = Yes
	winbind use default domain = Yes
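It's worth running Samba's testparm at this point to check the file parses cleanly and the options are recognised by your Samba version:

testparm -s /etc/samba/smb.conf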

/etc/krb5.conf

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 default_realm = PSYPHI.LOCAL
 dns_lookup_realm = true
 dns_lookup_kdc = true
 ticket_lifetime = 24h
 forwardable = yes

[realms]
 PSYPHI.LOCAL = {
 }

[domain_realm]
 psyphi.local = PSYPHI.LOCAL
 .psyphi.local = PSYPHI.LOCAL

[appdefaults]
 pam = {
   debug = false
   ticket_lifetime = 36000
   renew_lifetime = 36000
   forwardable = true
   krb4_convert = false
 }

Next we join the machine to the AD domain – it’s necessary to specify a user with the right privileges. It also prompts for a password.

net ads join -U administrator
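If the join succeeds, a quick sanity check of the machine account is:

net ads testjoin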

We can check things are working so far by trying to create a kerberos ticket using an existing username. Again it prompts us for a password.

kinit (username)

Then klist gives us output something like this:

Ticket cache: FILE:/tmp/krb5cc_0
Default principal: username@PSYPHI.LOCAL

Valid starting     Expires            Service principal
04/28/10 10:57:32  04/28/10 20:57:34  krbtgt/PSYPHI.LOCAL@PSYPHI.LOCAL
	renew until 04/29/10 10:57:32


Kerberos 4 ticket cache: /tmp/tkt0
klist: You have no tickets cached

Cool, so we have a machine joined to the domain and able to use kerberos tickets. Now we can tell our system to use winbind for fetching account information:

/etc/pam.d/system-auth-ac

auth        required      pam_env.so
auth        sufficient    pam_unix.so nullok try_first_pass
auth        requisite     pam_succeed_if.so uid >= 500 quiet
auth        sufficient    pam_krb5.so use_first_pass
auth        required      pam_deny.so

account     required      pam_unix.so broken_shadow
account     sufficient    pam_localuser.so
account     sufficient    pam_succeed_if.so uid < 500 quiet
account     [default=bad success=ok user_unknown=ignore] pam_krb5.so
account     required      pam_permit.so

password    requisite     pam_cracklib.so try_first_pass retry=3
password    sufficient    pam_unix.so md5 shadow nullok try_first_pass use_authtok
password    sufficient    pam_krb5.so use_authtok
password    required      pam_deny.so

session     optional      pam_keyinit.so revoke
session     required      pam_limits.so
session     [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid
session     required      /lib/security/pam_mkhomedir.so 
session     required      pam_unix.so
session     optional      pam_krb5.so

If we're on a 64-bit distribution we'll find that references to /lib need to be switched for /lib64, e.g. /lib64/security/pam_mkhomedir.so. The pam_mkhomedir module creates home directories for users if they're not present when they first log in.

/etc/nsswitch.conf

passwd:     files winbind
shadow:     files winbind
group:      files winbind

hosts:      files dns

bootparams: nisplus [NOTFOUND=return] files

ethers:     files
netmasks:   files
networks:   files
protocols:  files
rpc:        files
services:   files

netgroup:   nisplus

publickey:  nisplus

automount:  files nisplus
aliases:    files nisplus

Now we need to tell a few services to start on boot

chkconfig smb on
chkconfig winbind on

and start a few services now

service smb start
service winbind start
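Once winbind is up, wbinfo is handy for checking that the domain trust and user/group enumeration work before touching PAM-based logins:

wbinfo -t    # check the trust secret with the domain controller
wbinfo -u    # list domain users
wbinfo -g    # list domain groups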

The Winbind+pam configuration can sometimes take a few minutes to settle down – I occasionally find it’s necessary to wait 5 or 10 minutes before accounts are available. YMMV.

getent passwd

This should now list local accounts (which take precedence) followed by domain accounts. SSHing to the box as a domain user should create a new home directory under /home/PSYPHI/username. If you decide to migrate existing home directories from /home, make sure you change the uid and gid of the files to the new domain values for that user and remove the old local account; an example is sketched below.
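Migrating one existing local home directory might look something like this (the username is purely hypothetical; the local account is dropped first so that id picks up the domain uid/gid):

userdel jsmith                                  # drop the old local account; files are left in place
id jsmith                                       # now resolves via winbind – note the new uid/gid
mv /home/jsmith /home/PSYPHI/jsmith
chown -R jsmith:"domain users" /home/PSYPHI/jsmith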

There are a handful of limitations of this approach –

  1. Though usernames and group names map OK, Linux UIDs still don't correspond to the UIDs/SIDs used on the Windows side, so permissions don't quite work across SMB/CIFS mounts
  2. The standard Linux tools for user & group modification don't work for domain accounts (adduser/usermod/groupadd/… etc.)
  3. Winbind seems unstable. On a lot of systems I've resorted to cronning a service winbind restart every 15 minutes, which seriously sucks (see the sketch after this list)
  4. … and probably others too
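For what it's worth, the winbind-restart workaround from (3) is just a cron entry along these lines (blunt but effective):

# /etc/cron.d/winbind-restart
*/15 * * * * root /sbin/service winbind restart >/dev/null 2>&1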

For debugging /var/log/secure is very useful, as are the samba logs in /var/log/samba/.

What does Technology Monoculture really cost for SME?

http://www.flickr.com/photos/ndrwfgg/140859675/sizes/m/

Open, open, open. Yes, I sound like a stuck record but every time I hit this one it makes me really angry.

I regularly source equipment and software for small-to-medium enterprises (SMEs). Usually these are charities, and obviously they want to save as much money as they can on their hardware and software costs. Second-hand hardware is usually the order of the day. PCs around three years old are pretty easy to obtain and will usually run most current software.

But what about that software? On the surface the answer seems simple: To lower costs use free or Open Source software (OSS). The argument for Linux, OpenOffice and other groupware applications is pretty compelling. So what does it really mean on the ground?

Let’s take our example office:
Three PCs called “office1”, “office2” and “finance” connected together using powerline networking. There’s an ADSL broadband router which provides wireless for three laptops and also a small NAS with RAID1 for backups and shared files.

Okay, now the fun starts. The office has grown “organically” over the last 10 years. The current state is that Office1 runs XP 64-bit; Office2 runs Vista Ultimate and the once-per-week-use “finance” runs Windows 2000 for Sage and a Gift Aid returns package. All three use Windows Backup weekly to the NAS. Office1 & Office2 use Microsoft Office 2007. Office1 uses Exchange for mail and calendars, Office2 uses Windows Mail and Palm Desktop. Both RDP and VNC are also used to manage all machines.

So, what happens now is that the Gift Aid package is retired and the upgrade is to use web access but can’t run on MSIE 6. Okay. Upgrade to MSIE 8. Nope – won’t run on Win2k. How about MSIE 7? Nope, can’t download that any more (good!). Right, then an operating system upgrade is in order.

What do I use? Ubuntu of course. Well, is it that easy? I need to support the (probably antique) version of Sage Accounts on there. So how about Windows XP? Hmm – XP is looking a bit long in the tooth now. Vista? You must be joking – train-wreck! So Windows 7 is the only option. Can’t use Home Premium because it doesn’t support RDP without hacking it. So I’m forced to use Win 7 Pro. That’s £105 for the OEM version or £150 for the “full” version. All that and I’ll probably still have to upgrade Sage, AND the finance machine is only used once a week. What the hell?

Back to the drawing-board.

What else provides RDP? Most virtualisation systems do – Xen, virtualbox and the like. I use Virtualbox quite a lot and it comes with a great RDP service built in for whatever virtual machine is running. Cool – so I can virtualise the win2k instance using something like the VMWare P2V converter and upgrade the hardware and it’ll run everything, just faster (assuming the P2V works ok)…

No, wait – that still doesn’t upgrade the browser for the Gift Aid access. Ok, I could create a new WinXP virtual machine – that’s more recent than Win2k and bound to be cheaper – because Virtualbox gives me RDP I don’t need the professional version, “xp home” would do, as much as it makes me cringe. How much does that cost? Hell, about £75 for the OEM version. What??? For an O/S that’ll be retired in a couple of years? You have to be kidding! And I repeat, Vista is not an option, it’s a bad joke.

I’m fed up with this crap!

Okay, options, options, I need options. Virtualise the existing Win2k machine for Sage and leave the Ubuntu Firefox web browser installation for the updated Gift Aid. Reckon that’ll work? It’ll leave the poor techno-weenie guy who does the finances with a faster PC which is technically capable of doing everything he needs but with an unfamiliar interface.

If I were feeling particularly clever I could put Firefox on the Win2k VM, make the VM start on boot using VBoxHeadless; configure Ubuntu to auto-login and add a Win2k-VM-RDP session as a startup item for the auto-login user. Not a bad solution but pretty hacky, even for my standards (plus it would need to shut-down the domain0 host when the VM shuts down).

All this and it’s still only for one of the PCs. You know what I’d like to do? Virtualise everything and stick them all on a central server. Then replace all the desktop machines with thin clients and auto-login-RDP settings. There’s a lot to be said for that – centralised backups, VM snapshotting, simplified (one-off-cost) hardware investment, but again there’s a caveat – I don’t think that I’d want to do that over powerline networking. I’d say a minimum requirement of 100MBps Ethernet, so networking infrastructure required, together with the new server. *sigh*.

I bet you’re thinking what has all this got to do with technology monoculture? Well, imagine the same setup without any Microsoft involved.

All the same existing hardware, Ubuntu on each, OpenOffice, Evolution Mail & Calendar or something like Egroupware perhaps or even Google Apps (docs/calendar/mail etc. – though that’s another rant for another day). No need for much in the way of hardware upgrades. No need for anything special in the way of networking. Virtualise anything which absolutely has to be kept, e.g. Sage, without enforcing a change to the Linux version.

I don’t know what the answer is. What I do know is that I don’t want to spend up to £450 (or whatever it adds up to for upgrade or OEM versions) just to move three PCs to Windows 7. Then again with Windows 8, 9, 10, 2020 FOREVER. It turns out you simply cannot do Microsoft on a shoestring. Once you buy in you’re stuck and people like Microsoft (and they’re not the only ones) have a license to print money, straight out of your pocket into their coffers.

Of course that’s not news to me, and it’s probably not news to you, but if you’re in a SME office like this and willing to embrace a change to OSS you can save hundreds if not thousands of pounds for pointless, unnecessary software. Obviously the bigger your working environment is, the quicker these costs escalate. The sooner you make the change, the sooner you start reducing costs.

Remind me to write about the state of IT in the UK education system some time. It’s like lighting a vast bonfire made of cash, only worse side-effects.

Configuring Hudson Continuous Integration Slaves

Roughly this time last year I set up Hudson at the office to do Oxford Nanopore’s continuous integration builds. It’s been a pleasure to roll-out the automated-testing work-ethic for developers and (perhaps surprisingly) there’s been less resistance than I expected. Hudson is a great weapon in the arsenal and is dead easy to set up as a master server. This week though I’ve had to do something new – configure one of our product simulators as my first Hudson slave server. Again this proved to be a doddle – thanks Hudson!

My slave server is called “device2” – Here’s what I needed to set up on the master (ci.example.net).

“Manage Hudson” => “Manage Nodes” => “New Node” => node name = “device2”, type is “dumb slave” => number of executors = 1, remote fs root = “/home/hudson/ci”, leave for tied jobs only, launch by JNLP.

Then on device2:

adduser hudson
su hudson
cd
mkdir ci
cd ci
wget http://ci.example.net/jnlpJars/slave.jar
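At this point it's worth checking the slave can actually reach the master by running the agent once in the foreground (Ctrl-C it once the node shows as connected on the master):

java -jar slave.jar -jnlpUrl http://ci.example.net/computer/device2/slave-agent.jnlp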

Then I made a new file /etc/init.d/hudson-slave with these contents:

#!/bin/bash
#
# Init file for hudson server daemon
#
# chkconfig: 2345 65 15
# description: Hudson slave

. /etc/rc.d/init.d/functions

RETVAL=0
NODE_NAME="device2"
HUDSON_HOST=ci.example.net
USER=hudson
LOG=/home/hudson/ci/hudson.log
SLAVE_JAR=/home/hudson/ci/slave.jar

pid_of_hudson() {
    ps auxwww | grep java | grep hudson | grep -v grep | awk '{ print $2 }'
}

start() {
    echo -n $"Starting hudson: "
    COMMAND="java -jar $SLAVE_JAR -jnlpUrl \
        http://${HUDSON_HOST}/computer/${NODE_NAME}/slave-agent.jnlp \
        2>$LOG.err \
        >$LOG.out"
    su ${USER} -c "$COMMAND" &
    sleep 1
    pid_of_hudson > /dev/null
    RETVAL=$?
    [ $RETVAL = 0 ] && success || failure
    echo
}

stop() {
    echo -n "Stopping hudson slave: "
    pid=`pid_of_hudson`
    [ -n "$pid" ] && kill $pid
    RETVAL=$?
    cnt=10
    while [ $RETVAL = 0 -a $cnt -gt 0 ] &&
        { pid_of_hudson > /dev/null ; } ; do
        sleep 1
        ((cnt--))
    done

    [ $RETVAL = 0 ] && success || failure
    echo
}

status() {
    pid=`pid_of_hudson`
    if [ -n "$pid" ]; then
        echo "hudson (pid $pid) is running..."
        return 0
    fi
    echo "hudson is stopped"
    return 3
}

# Dispatch based on how we were called
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    status)
        status
        ;;
    restart)
        stop
        start
        ;;
    *)
        echo $"Usage: $0 (start|stop|restart|status}"
        exit 1
esac

exit $RETVAL
Then register the init script and start the service:

chkconfig --add hudson-slave
chkconfig hudson-slave on
service hudson-slave start

and that’s pretty much all there was to it – refreshing the node-status page on the hudson master showed the slave had registered itself, then reconfiguring one of the existing jobs to be tied to the “device2” slave immediately started assigning jobs to it. Tremendous!

Exa-, Peta-, Tera-scale Informatics: Are *YOU* in the cloud yet?

http://www.flickr.com/photos/pagedooley/2511369048/

One of the aspects of my job over the last few years, both at Sanger and now at Oxford Nanopore Technologies has been the management of tera-, verging on peta- scale data on a daily basis.

Various methods of handling filesystems this large have been around for a while now and I won’t go into them here. Building these filesystems is actually fairly straightforward as most of them are implemented as regular, repeatable units – great for horizontal scale-out.

No, what makes this a difficult problem isn’t the sheer volume of data, it’s the amount of churn. Churn can be defined as the rate at which new files are added and old files are removed.

To illustrate – when I left Sanger, if memory serves, we were generally recording around a terabyte of new data a day. The staging area there was around 0.5 Petabytes (using the Lustre filesystem) but didn’t balance correctly across the many disks. This meant we had to keep the utilised space below around 90% for fear of filling up an individual storage unit (and leading to unexpected errors). Ok, so that’s 450TB. That left 45 days of storage – one and a half months assuming no slack.

Fair enough. Sort of. Collect the data onto the staging area, analyse it there and shift it off. Well, that's easier said than done – you can shift it off onto slower, cheaper storage, but that's generally archival space, so ideally you only keep the raw data there. If the raw data are too big then you keep the primary analysis and ditch the raw. But there are problems with that:

  • Lots of clever people want to squeeze as much interesting stuff out of the raw data as possible using new algorithms.
  • They also keep finding things wrong with the primary analyses and so want to go back and reanalyse.
  • Added to that, there are often problems with the primary analysis pipeline (bleeding-edge software bugs etc.).
  • And that's not mentioning the fact that nobody ever wants to delete anything.

As there’s little or no slack in the system, very often people are too busy to look at their own data as soon as it’s analysed so it might sit there broken for a week or four. What happens then is there’s a scrum for compute-resources so they can analyse everything before the remaining 2-weeks of staging storage is up. Then even if there are problems found it can be too late to go back and reanalyse because there’s a shortage of space for new runs and stopping the instruments running because you’re out of space is a definite no-no!

What the heck? Organisationally this isn’t cool at all. Situations like this are only going to worsen! The technologies are improving all the time – run-times are increasing, read-lengths are increasing, base-quality is increasing, analysis is becoming better and more instruments are becoming available to more people who are using them for more things. That’s a many, many-fold increase in storage requirements.

So how to fix it? Well, I can think of at least one pretty good way. Don't invest in on-site long-term staging or scratch storage. If you're worried, by all means sort out an awesome backup system, but push it to nearline or offline storage – a decent tape archive or something – and absolutely do not allow user access. Instead of long-term staging storage, buy your company the fattest Internet pipe it can handle. Invest in connectivity, then simply invest in cloud storage. There are enough providers out there now to make this a competitive and interesting marketplace with opportunities for economies of scale.

What does this give you? Well, many benefits – here are a few:

  • virtually unlimited storage
  • only pay for what you use
  • accountable costs – know exactly how much each project needs to invest
  • managed by storage experts
  • flexible computing attached to storage on-demand
  • no additional power overheads
  • no additional space overheads

Most of those I more-or-less take for granted these days. The one I find interesting at the moment is the costing issue. It can be pretty hard to hold one centralised storage area accountable for different groups – they’ll often pitch in for proportion of the whole based on their estimated use compared to everyone else. With accountable storage offered by the cloud each group can manage and pay for their own space. The costs are transparent to them and the responsibility has been delegated away from central management. I think that’s an extremely attractive prospect!

The biggest argument I hear against cloud storage & computing is that your top secret, private data is in someone else’s hands. Aside from my general dislike of secret data, these days I still don’t believe this is a good argument. There are enough methods for handling encryption and private networking that this pretty-much becomes a non-issue. Encrypt the data on-site, store the keys in your own internal database, ship the data to the cloud and when you need to run analysis fetch the appropriate keys over an encrypted link, decode the data on demand, re-encrypt the results and ship them back. Sure the encryption overheads add expense to the operation but I think the costs are far outweighed.
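As a very rough sketch of that round-trip (tool and bucket names are mine, not a recommendation – any symmetric cipher plus any S3-style client would do):

# encrypt a run folder locally with a per-run key, then push it to cloud storage
tar -cf run1234.tar run1234/
openssl enc -aes-256-cbc -salt -in run1234.tar -out run1234.tar.enc -pass file:/secure/keys/run1234.key
s3cmd put run1234.tar.enc s3://example-archive/run1234.tar.enc

# later: pull it back and decrypt on demand, fetching the key over an encrypted link
s3cmd get s3://example-archive/run1234.tar.enc
openssl enc -d -aes-256-cbc -in run1234.tar.enc -out run1234.tar -pass file:/secure/keys/run1234.key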

Shared Development Environments; why they suck and what you can do about it

http://www.flickr.com/photos/doctorow/2698336843/sizes/l/

I’ve wanted to write this post for a long time but only recently have I been made frustrated enough to do so.

So.. some background.

When I worked at the Sanger Institute I ran the web team there. This was a team with three main roles –

  1. Make sure the website was up
  2. Internal software development for projects without dedicated informatics support
  3. Support for projects with dedicated informatics

When I started, back in 1999, things were pretty disorganised but in terms of user requirements actually a little easier – projects had the odd CGI script, but most data were shipped out as file dumps on the FTP site. You see, back then and for the few years previous it was the dawning of the world-wide web, and web users were much happier being faced with an FTP/gopher file listing of .gz (or more likely, uncompressed .fasta) files to download.

Back then we had a couple of small DEC servers which ran the external- and internal- (intranet) websites. Fine. Well, fine that is, until you want to make a change.

Revision Control: Manual

Ok. You want to make a change. You take your nph-blast_server.cgi and make a copy, nph-blast_server2.cgi. You make your changes and test them on the external website. Great! It works! You mail a collaborator across the pond to try it out for bugs. Fab! Nothing found. Ok, so copy it back over nph-blast_server.cgi and everyone's happy.

What’s wrong with this picture? Well, you remember that development copy? Firstly, it’s still there. You just multiplied your attack-vectors by two (assuming there are bugs in the script capable of being exploited). Secondly, and this is more harmful to long-term maintenance, that development copy is the URL you mailed your collaborator. It’s also the URL your collaborator mailed around to his 20-strong informatics team and they posted on bulletin boards and USENET groups for the rest of the world.

Luckily you have a dedicated and talented web-team who sort out this chaos using a pile of server redirects. Phew! Saved.

Now multiply this problem by the 150-or-so dedicated informatics developers on campus serving content through the core servers. Take that number and multiply it by the number of CGI scripts each developer produces a month.

That is then the number of server redirects which every incoming web request has to be checked against before it reaches its target page. Things can become pretty slow.

Enter the development (staging) service

What happens next is that the web support guys do something radical. They persuade all the web developers on site by hook or by crook that they shouldn’t be editing content on the live, production, public servers. Instead they should use an internal (and for special cases, IP-restricted-external-access) development service, test their content before pushing it live, then use a special command, let’s call it webpublish, to push everything live.

Now to the enlightened developer of today that doesn’t sound radical, it just sounds like common sense. You should have heard the wailing and gnashing of teeth!

Shared development

At this point I could, and probably should, go into the whys and wherefores of using revision control, but I'll save that for another post. Instead I want to focus on the drawbacks of sharing. My feeling is that the scenario above is a fairly common one where there are many authors working on the same site. It works really well for static content, even when a CMS is used. Unfortunately it's not so great for software development. The simple fact is that requirements diverge – both for the project and for the software stack. These disparate teams only converge in that they're running on the same hardware, so why should the support team expect their software requirements to converge also?

Allow me to illustrate one of the problems.

Projects A and B are hosted on the same server. They use the same centrally-supported library L. A, B and L each have a version. They all work happily together at version A1B1L1. Now B needs a new feature, but to add it requires an upgrade to L2. Unfortunately the L2 upgrade breaks A1. Project A therefore is obliged to undertake additional (usually unforeseen) work just to retain current functionality.

Another situation is less subtle and involves shared-user access. For developers this is most likely the root superuser, although in my opinion any shared account is equally bad. When using a common user it's very difficult to know who made a change in the past, let alone who's making a change right now. I observed a situation recently where two developers were simultaneously trying to build RPMs with rpmbuild which, by default, builds in a shared system location (/usr/src/redhat on Red Hat-style systems of that era). Simultaneously trying to access the same folders leads to very unpredictable, unrepeatable results. Arguably the worst situation is when no errors are thrown during the build and neither developer notices!

Naturally a lot of the same arguments against shared development go for shared production too. The support matrix simply explodes with a few tens of applications each with different prerequisites.

Other options

Back in the day there were fewer options – one was left with always having to use relative paths and often having to discard all but the core system prerequisites in fear of them changing unexpectedly over time. Using relative paths is still a fairly inexpensive way to do things but sometimes it’s just too restrictive. There is another way…

Virtualisation is now commonplace. You probably cross-paths with a virtual machine every day without knowing it. They’re ubiquitous because they’re really, really useful. For our development purposes one core support member can build a standard, supported virtual machine image and post it on the intranet somewhere. All the other developers can take it, start their own instances of it and do all of their own development on their own hardware without fighting for common resources. Upgrades can be tested independently of one another. Machines can be restarted from scratch and so on. Once development is complete and given sufficient core resources, each developer can even bundle up their working image and ship it into production as is. No further core support required!

What tools can you use to do this? Parallels? Too commercial. VMWare? A bit lardy. Xen? Probably a bit too hard-core. KVM? Not quite mature enough yet. No, my current favourite in the virtualisation stakes is VirtualBox. Cross platform and free. Works great with Ubuntu inside. A killer combination capable of solving many of these sorts of problems.
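As a sketch of that workflow with VirtualBox's command-line tools (machine names here are purely illustrative):

# the core support member builds and exports a standard image once...
VBoxManage export dev-baseline -o dev-baseline.ova

# ...and each developer imports a private copy and runs it headless on their own hardware
VBoxManage import dev-baseline.ova --vsys 0 --vmname dev-mine
VBoxManage startvm dev-mine --type headless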

Switching to WordPress

So… sorry as I am to say it, I've decided to take the plunge and switch this blog from the home-grown system over to WordPress, which is hard to beat for support, features & plugins.

With any luck it'll mean this thing is updated much more frequently, but I can't say I'm happy using a PHP-based solution. Still – best tool for the job and all that. It's a pity neither Krang nor MT was as easy & simple to set up.

I still have all the old entries deep in the bowels of my database but I need to spend a little time to port them over to WP. I figure it’s better to have the new system up and running first rather than hold it back until everything’s ready – iterative agile development and all that. Yum!

Apache Forward-Proxy REMOTE_ADDR propagation

I had an interesting problem this morning with the Apache forward-proxy supporting the WTSI sequencing farm.

It would be useful for the intranet run-tracking service to know which (GA2) sequencer is requesting pages, but because the sequencers are on a dedicated subnet they have to use a forward proxy to fetch pages (and then only from intranet services).

Now I’m very familiar using the X-Forwarded-For header and HTTP_X_FORWARDED_FOR environment variable (and their friends) which do something very similar for reverse-proxies but forward-proxies usually want to disguise the fact there’s an arbitrary number of clients behind them, usually with irrelevant RFC1918 private IP addresses too.

So what I want to do is slightly unusual – take the remote_addr of the client and stuff it into a different header. I could use X-Forwarded-For but it doesn't feel right. Proxy-Via isn't right here either, as that's really for the proxy servers themselves. So, I figured mod_headers on the proxy would allow me to add additional headers to the request, even though it's forwarded on. Following a tip I saw using my favourite mod_rewrite, and after a bit of fiddling, I came up with this:

#########
# copy remote addr to an internal variable
#
RewriteEngine  On
RewriteCond  %{REMOTE_ADDR}  (.*)
RewriteRule   .*  -  [E=SEQ_ADDR:%1]

#########
# set X-Sequencer header from the internal variable
#
RequestHeader  set  X-Sequencer  %{SEQ_ADDR}

These rules sit in the container managing my proxy, after ProxyRequests and ProxyVia and before a small set of ProxyMatch restrictions.

The RewriteCond traps the contents of the REMOTE_ADDR environment variable (it’s not an HTTP header – it comes from the end of the network socket as determined by the server). The RewriteRule unconditionally copies the last RewriteCond match %1 into a new environment variable SEQ_ADDR. After this mod_headers sets the X-Sequencer request header (for the proxied request) to the value of the SEQ_ADDR environment variable.

This works very nicely though I’d have hoped a more elegant solution would be this:

RequestHeader set X-Sequencer %{REMOTE_ADDR}

but this doesn't seem to work – I suspect because RequestHeader only interpolates environment variables (the %{VAR}e syntax), not connection properties like REMOTE_ADDR. Anyway, by comparing $ENV{HTTP_X_SEQUENCER} to a shared lookup table, the sequencing apps running on the intranet can now track which sequencer is making requests. Yay!
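A quick way to eyeball it end-to-end is to make a request from one of the sequencers through the proxy and watch the intranet app's logs for the header – something like the following, with the proxy hostname/port and intranet URL as placeholders for whatever is actually in use:

# from a GA2 host, via the forward proxy
http_proxy=http://webcache.example.net:3128 curl -s http://intranet.example.net/cgi-bin/run-tracker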

The ISP Hosting blues

Initial work on the sign-on is done. It's a shame Namesco don't support the SSL virtualhost in the same way that Simply did. I really must finish the sign-on code and make it use session keys instead. Maybe some sort of simple client-side JavaScript hashing for obfuscation would help. Hmm.

I can safely say that, after a good few years of it, running an (admittedly el-cheapo, basic) webserver at a remote ISP is a pain in the proverbials. I'm really looking forward to setting up John's parish server & services on his DSL connection.

So my ISP renewal is up soon. I’m sure I’m paying too much but it’s *so* painful switching services over – finding out what’s supported and what isn’t; finding out that all your CGIs have to be renamed .cgi and can live anywhere, or vice-versa – named anything but all live in cgi-bin (well, that’s the way I do it).

Anyway… this will be the first entry in my hastily lashed-up weblog system. At least it only depends on my code plus XML::RSS. Goodness knows (well, OK, I should really look into it) what Movable Type depends on.