code - psyphi.net blog

Remote Power Management using Arduino

2016-03-07 Update: Git Repo available

Recently I’ve been involved with building a hardware device consisting of a cluster of low-powerÂ PC servers. The boards chosen for this particular project aren’t enterprise or embedded -style boards with specialist features like out of band (power) management (like Dell’s iDRAC or Intel’s AMT) so I started thinking about how toÂ approximate something similar.

It’s also a little reminiscent of STONITH (Shoot The Other Node In The Head), used for aspects of the Linux-HA (High Availability) services.

I dug around in a box of goodies and found a couple of handy parts:

Arduino Duemilanove
Seeedstudio Arduino Relay Shield v3

The relays are rated for switching up to 35V at 8A – easily handling the 19V @ 2A for the mini server boards I’m remote managing.

The other handy thing to notice is that the Arduino by its nature is serial-enabled, meaning you can control itÂ very simply using a USB connection to the management system without needing any more shields or adapters.

Lastly it’s worth mentioning that the relays are effectively SPDT switches so have connections for circuit open and closed.Â In my case this is useful as most of the time I don’t want the relays to be energised, saving power and prolonging the life of the relay.

The example Arduino code below opens a serial port and collects characters in a string variable until a carriage-return (0x0D) before acting, accepting commands “on”, “off” and “reset”. When a command is completed, the code clears the command buffer and flips voltages on the digital pins controlling the relays. Works a treat – all I need to do now is splice the power cables for the cluster compute units and run them through the right connectors on the relay boards. With the draw the cluster nodes pull being well within the specs of the relays it might even be possible to happily run two nodesÂ through each relay.

There’s no reason why this sort of thing couldn’t be used for many other purposes too – home automation or other types of remote management, and could obviously be activated over ethernet, wifi or bluetooth instead of serial – goes without saying for a relay board -duh!

int MotorControl1 = 4;
int MotorControl2 = 5;
int MotorControl3 = 6;
int MotorControl4 = 7;
int incomingByte = 0; // for incoming serial data
String input = ""; // for command message

void action (String cmd) {
  if(cmd == "off") {
    digitalWrite(MotorControl1, HIGH); // NO1 + COM1
    digitalWrite(MotorControl2, HIGH); // NO2 + COM2
    digitalWrite(MotorControl3, HIGH); // NO3 + COM3
    digitalWrite(MotorControl4, HIGH); // NO4 + COM4
    return;
  }

  if(cmd == "on") {
    digitalWrite(MotorControl1, LOW); // NC1 + COM1
    digitalWrite(MotorControl2, LOW); // NC2 + COM2
    digitalWrite(MotorControl3, LOW); // NC3 + COM3
    digitalWrite(MotorControl4, LOW); // NC4 + COM4
    return;
  }

  if(cmd == "reset") {
    action("off");
    delay(1000);
    action("on");
    return;
  }

  Serial.println("unknown action");
}

// the setup routine runs once when you press reset:
void setup() {
  pinMode(MotorControl1, OUTPUT);
  pinMode(MotorControl2, OUTPUT);
  pinMode(MotorControl3, OUTPUT);
  pinMode(MotorControl4, OUTPUT);
  Serial.begin(9600); // opens serial port, sets data rate to 9600 bps
  Serial.println("relay controller v0.1 rmp@psyphi.net actions are on|off|reset");
  input = "";
} 

// the loop routine runs over and over again forever:
void loop() {
  if (Serial.available() > 0) {
    incomingByte = Serial.read();

    if(incomingByte == 0x0D) {
      Serial.println("action:" + input);
      action(input);
      input = "";
    } else {
      input.concat(char(incomingByte));
    }
  } else {
    delay(1000); // no need to go crazy
  }
}

Bookmarks for December 2nd through January 12th

These are my links for December 2nd through January 12th:

Firebase – Build Realtime Apps –
The last cloud storage API you’ll ever need | Kloudless –
Introduction – Google Caja — Google Developers – detaint third-party javascript/html
Brackets – A modern, open source code editor that understands web design. –
How to get started with RequireJS –

Bookmarks for February 8th through April 23rd

These are my links for February 8th through April 23rd:

Bookmarks for December 16th through January 11th

These are my links for December 16th through January 11th:

Toxiclibsjs v0.1.0 | labs.hapticdata.com – Toxiclibs
toxiclibs –
fuse-ext2 | Free System Administration software downloads at SourceForge.net – ext2 & ext3 FUSE driver for OSX
HSMM-MESH –
RISC OS Open: Welcome –

3 sorts of sort

I’ve been fiddling around recently with some stuff which I’m sure I covered in my CS degree 16 (Gah! Really?) years ago but have had to re-educate myself about. Namely a few different implementations of sort. I implemented three types in Perl with some reacquaintance of the mechanisms via Wikipedia. I found a few existing examples of sort algorithms in Perl but they generally looked a bit unpleasant so like all programmers I decided to write my own (complete with errors, as an exercise to the reader). Here I’ve also added some notes, mostly to myself, which are largely unsubstantiated because I haven’t measured memory, speed or recursion depth, for example (though these are well-documented elsewhere).

1. Bubble Sort

#!/usr/bin/perl -w
use strict;
use warnings;

my $set    = [map { int rand() * 99 } (0..40)];
print "in:  @{$set}\n";

my $sorted = bubblesort($set);
print "out: @{$sorted}\n";

sub bubblesort {
  my ($in)     = @_;
  my $out      = [@{$in}];
  my $length   = scalar @{$in};
  my $modified = 1;

  while($modified) {
    $modified = 0;
    for my $i (0..$length-2) {
      if($out->[$i] > $out->[$i+1]) {
	($out->[$i], $out->[$i+1]) = ($out->[$i+1], $out->[$i]);
	$modified = 1;
      }
    }
  }

  return $out;
}

Bubblesort iterates through each element of the list up to the last but one, comparing to the next element in the list. If it’s greater the values are swapped. The process repeats until no modifications are made to the list.

Pros: doesn’t use much memory – values are swapped in situ; doesn’t perform deep recursion; is easy to read

Cons: It’s pretty slow. The worst-case complexity is O(n²) passes (for each value in the list each value in the list is processed once).

2. Merge Sort

#!/usr/bin/perl
use strict;
use warnings;

my $set    = [map { int rand() * 99 } (0..40)];
print "in:  @{$set}\n";

my $sorted = mergesort($set);
print "out: @{$sorted}\n";

sub mergesort {
  my ($in) = @_;

  my $length = scalar @{$in};
  if($length < = 1) {
    return $in;
  }

  my $partition = $length / 2;
  my $left      = [@{$in}[0..$partition-1]];
  my $right     = [@{$in}[$partition..$length-1]];

  return merge(mergesort($left), mergesort($right));
}

sub merge {
  my ($left, $right) = @_;
  my $merge = [];

  while(scalar @{$left} || scalar @{$right}) {
    if(scalar @{$left} && scalar @{$right}) {
      if($left->[0] < $right->[0]) {
	push @{$merge}, shift @{$left};
      } else {
	push @{$merge}, shift @{$right};
      }
    } elsif(scalar @{$left}) {
      push @{$merge}, shift @{$left};
    } elsif(scalar @{$right}) {
      push @{$merge}, shift @{$right};
    }
  }
  return $merge;
}

Mergesort recurses through the list, in each iteration breaking the remaining list in half. Once broken down to individual elements, each pair of elements/lists at each depth of recursion is reconstituted into a new ordered list and returned.

Pros: generally quicker than bubblesort; O(n log n) complexity.

Cons: quite difficult to read

3. Quicksort

#!/usr/bin/perl
use strict;
use warnings;

my $set    = [map { int rand() * 99 } (0..40)];
print "in:  @{$set}\n";

my $sorted = quicksort($set);
print "out: @{$sorted}\n";

sub quicksort {
  my ($in) = @_;

  my $length = scalar @{$in};
  if($length < = 1) {
    return $in;
  }

  my $pivot = splice @{$in}, $length / 2, 1;
  my $left  = [];
  my $right = [];

  for my $v (@{$in}) {
    if($v < $pivot) {
      push @{$left}, $v;
    } else {
      push @{$right}, $v;
    }
  }

  return [@{quicksort($left)}, $pivot, @{quicksort($right)}];
}

Quicksort is probably the best known of all the sort algorithms out there. It’s easier to read than Mergesort, though arguably still not as easy as Bubblesort, but it’s a common pattern and its speed makes up for anything lacking in readability. At each iteration a pivot is selected and removed from the list. The remaining list is scanned and for element lower than the pivot is put in a new “left” list and each greater element is put into a new “right” list. The returned result is a merged recursive quicksort of the left list, the pivot and the right list.

In this example I’m picking the middle element of the list as the pivot. I’m sure there are entire branches of discrete mathematics dealing with how to choose the pivot based on the type of input data.

Pros: (One of?) the fastest sort algorithm(s) around; Reasonably efficient memory usage and recursion depth. Average O(n log n) complexity again (worst is O(n²)).

Perhaps it’s worth noting that in 25-odd years of programming computers I’ve only ever had to examine the inner workings of sort routines as part of my degree – never before, nor after, but it’s certainly brought back a few memories.

Bookmarks for November 25th through December 6th

These are my links for November 25th through December 6th:

gource – Project Hosting on Google Code –
500 Internal Server Error – 500 Internal Server Error
500 Internal Server Error – 500 Internal Server Error
The British Shooting & Countryman Show – possibly worth a visit
Futurelab – Innovation in education –

An Interview Question

I’d like to share a basic interview question I’ve used in the past. I’ve used this in a number of different guises over the years, both at Sanger and at ONT but the (very small!) core remains the same. It still seems to be able to trip up a lot of people who sell themselves as senior developers on their CVs and demand Â£35k+ salaries.

You have a list of characters.

Remove duplicates

The time taken for the interviewee to scratch their head determines whether they’re a Perl programmer, or at least think like one – this is an idomatic question in Perl. It’s a fairly standard solution to anyone who uses hashes, maps or associative arrays in any language. It’s certainly a lot harder without them.

The answer I would expect to see would run something like this:

#########
# pass in an array ref of characters, e.g.
# remove_dupes([qw(a e r o i g n o s e w f e r g e r i g e o n k)]);
#
sub remove_dupes {
  my $chars_in  = shift;
  my $seen      = {};
  my $chars_out = [];

  for my $char (@{$chars_in}) {
    if(!$seen->{$char}++) {
      push @{$chars_out}, $char;
    }
  }

  return $chars_out;
}

Or for the more adventurous, using a string rather than an array:

#########
# pass in a string of characters, e.g.
# remove_dupes(q[uyavubnopwemgnisudhjopwenfbuihrpgbwogpnskbjugisjb]);
#
sub remove_dupes {
  my $str  = shift;
  my $seen = {};
  $str     =~ s/(.)/( !$seen->{$1}++ ) ? $1 : q[]/smegx;
  return $str;
}

The natural progression from Q1 then follows. It should be immediately obvious to the interviewee if they answered Q1 inappropriately.

List duplicates

#########
# pass in an array ref of characters, e.g.
# list_dupes([qw(a e r o i g n o s e w f e r g e r i g e o n k)]);
#
sub list_dupes {
  my $chars_in  = shift;
  my $seen      = {};
  my $chars_out = [];

  for my $char (@{$chars_in}) {
    $seen->{$char}++;
  }

  return [ grep { $seen->{$_} > 1 } keys %{$seen} ];
}

and with a string

#########
# pass in a string of characters, e.g.
# list_dupes(q[uyavubnopwemgnisudhjopwenfbuihrpgbwogpnskbjugisjb]);
#
sub list_dupes {
  my $str  = shift;
  my $seen = {};
  $str     =~ s/(.)/( $seen->{$1}++ > 1) ? $1 : q[]/smegx;
  return $str;
}

The standard follow-up is then “Given more time, what would you do to improve this?”. Well? What would you do? I know what I would do before I even started – WRITE SOME TESTS!

It’s pretty safe to assume that any communicative, personable candidate who starts off writing a test on the board will probably be head and shoulders above any other.

If I’m interviewing you tomorrow and you’re reading this now, it’s also safe to mention it. Interest in the subject and a working knowledge of the intertubes generally comes in handy for a web developer. I’m hiring you as an independent thinker!

Great pieces of code

A lot of what I do day-to-day is related to optimisation. Be it Perl code, SQL queries, Javascript or HTML there are usually at least a couple of cracking examples I find every week. On Friday I came across this:

SELECT cycle FROM goldcrest WHERE id_run = ?

This query is being used to find the number of the latest cycles (between 1 and 37 for each id_run) in a near-real-time tracking system and is used several times whenever a run report is viewed.

EXPLAIN SELECT cycle FROM goldcrest WHERE id_run = 231;
  
+----+-------------+-----------+------+---------------+---------+---------+-------+--------+-------------+
| id | select_type | table Â  Â  | type | possible_keys | key Â  Â  | key_len | ref Â  | rows Â  | Extra Â  Â  Â  |
+----+-------------+-----------+------+---------------+---------+---------+-------+--------+-------------+
| Â 1 | SIMPLE Â  Â  Â | goldcrest | ref Â | g_idrun Â  Â  Â  | g_idrun | Â  Â  Â  8 | const | 262792 | Using where |
+----+-------------+-----------+------+---------------+---------+---------+-------+--------+-------------+

In itself this would be fine but the goldcrest table in this instance contains several thousand rows for each id_run. So, for id_run, let’s say, 231 this query happens to return approximately 588,000 rows to determine that the latest cycle for run 231 is the number 34.

To clean this up we first try something like this:

SELECT MIN(cycle),MAX(cycle) FROM goldcrest WHERE id_run = ?

which still scans the 588000 rows (keyed on id_run incidentally) but doesn’t actually return them to the user, only one row containing both values we’re interested in. Fair enough, the CPU and disk access penalties are similar but the data transfer penalty is significantly improved.

Next I try adding an index against the id_run and cycle columns:

ALTER TABLE goldcrest ADD INDEX(id_run,cycle);
Query OK, 37589514 rows affected (23 min 6.17 sec)
Records: 37589514 Â Duplicates: 0 Â Warnings: 0

Now this of course takes a long time and, because the tuples are fairly redundant, creates a relatively inefficient index, also penalising future INSERTs. However, casually ignoring those facts, our query performance is now radically different:

EXPLAIN SELECT MIN(cycle),MAX(cycle) FROM goldcrest WHERE id_run = 231;
  
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key Â | key_len | ref Â | rows | Extra Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| Â 1 | SIMPLE Â  Â  Â | NULL Â | NULL | NULL Â  Â  Â  Â  Â | NULL | Â  Â NULL | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
  
SELECT MIN(cycle),MAX(cycle) FROM goldcrest WHERE id_run = 231;
+------------+------------+
| MIN(cycle) | MAX(cycle) |
+------------+------------+
| Â  Â  Â  Â  Â 1 | Â  Â  Â  Â  37 |
+------------+------------+
  
1 row in set (0.01 sec)

That looks a lot better to me now!

Generally I try to steer clear of the mysterious internal workings of database engines, but with much greater frequency come across examples like this:

sub clone_type {
  my ($self, $clone_type, $clone) = @_;
  my %clone_type;

  if($clone and $clone_type) {
    $clone_type{$clone} = $clone_type;
    return $clone_type{$clone};
  }

  return;
}

Thankfully this one’s pretty quick to figure out – they’re usually *much* more convoluted, but still.. Huh??

Pass in a clone_type scalar, create a local hash with the same name (Argh!), store the clone_type scalar in the hash keyed at position $clone, then return the same value we just stored.

I don’t get it… maybe a global hash or something else would make sense, but this works out the same:

sub clone_type {
  my ($self, $clone_type, $clone) = @_;

  if($clone and $clone_type) {
    return $clone_type;
  }
  return;
}

and I’m still not sure why you’d want to do that if you have the values on the way in already.

Programmers really need to think around the problem, not just through it. Thinking through may result in functionality but thinking around results in both function and performance which means a whole lot more in my book, and incidentally, why it seems so hard to hire good programmers.

7 utilities for improving application quality in Perl

I’d like to share with you a list of what are probably my top utilities for improving code quality (style, documentation, testing) with a largely Perl flavour. In loosely important-but-dull to exciting-and-weird order…

Test::More. Billed as yet another framework for writing test scripts Test::More extends Test::Simple and provides a bunch of more useful methods beyond Simple’s ok(). The ones I use most being use_ok() for testing compilation, is() for testing equality and like() for testing similarity with regexes.

ExtUtils::MakeMaker. Another one of Mike Schwern’s babies, MakeMaker is used to set up a folder structure and associated ‘make’ paraphernalia when first embarking on writing a module or application. Although developers these days tend to favour Module::Build over MakeMaker I prefer it for some reason (probably fear of change) and still make regular mileage using it.

Test::Pod::Coverage – what a great module! Check how good your documentation coverage is with respect to the code. No just a subroutine header won’t do! I tend to use Test::Pod::Coverage as part of…

Test::Distribution . Automatically run a battery of standard tests including pod coverage, manifest integrity, straight compilation and a load of other important things.

perlcritic, Test::Perl::Critic . The Perl::Critic set of tools is amazing. It’s built on PPI and implements the Perl Best Practices book by Damien Conway. Now I realise that not everyone agrees with a lot of what Damien says but the point is that it represents a standard to work to (and it’s not that bad once you’re used to it). Since I discovered perlcritic I’ve been developing all my code as close to perlcritic -1 (the most severe) as I can. It’s almost instantly made my applications more readable through systematic appearance and made faults easier to spot even before Test::Perl::Critic comes in.

Devel::Cover. I’m almost ashamed to say I only discovered this last week after dipping into Ian Langworthy and chromatic’s book ‘Perl Testing’. Devel::Cover gives code exercise metrics, i.e. how much of your module or application was actually executed by that test. It collates stats from all modules matching a user-specified pattern and dumps them out in a natty coloured table, very suitable for tying into your CI system.

Selenium . Ok, not strictly speaking a tool I’m using right this minute but it’s next on my list of integration tools. Selenium is a non-interactive, automated, browser-testing framework written in Javascript. This tool definitely has legs and it seems to have come a long way since I first found it in the middle of 2006. I’m hoping to have automated interface testing up and running before the end of the year as part of the Perl CI system I’m planning on putting together for the new sequencing pipeline.