Bookmarks for November 19th through December 2nd

These are my links for November 19th through December 2nd:

Bookmarks for August 29th through November 12th

These are my links for August 29th through November 12th:

Bookmarks for June 30th through July 2nd

These are my links for June 30th through July 2nd:

Bookmarks for May 6th through May 22nd

These are my links for May 6th through May 22nd:

Bookmarks for December 12th through February 18th

These are my links for December 12th through February 18th:

Bookmarks for November 6th through November 26th

These are my links for November 6th through November 26th:

Bookmarks for June 1st through November 3rd

These are my links for June 1st through November 3rd:

Bookmarks for February 8th through April 23rd

These are my links for February 8th through April 23rd:

Bookmarks for December 16th through January 11th

These are my links for December 16th through January 11th:

3 sorts of sort

I’ve been fiddling around recently with some stuff which I’m sure I covered in my CS degree 16 (Gah! Really?) years ago but have had to re-educate myself about. Namely a few different implementations of sort. I implemented three types in Perl with some reacquaintance of the mechanisms via Wikipedia. I found a few existing examples of sort algorithms in Perl but they generally looked a bit unpleasant so like all programmers I decided to write my own (complete with errors, as an exercise to the reader). Here I’ve also added some notes, mostly to myself, which are largely unsubstantiated because I haven’t measured memory, speed or recursion depth, for example (though these are well-documented elsewhere).

1. Bubble Sort

#!/usr/bin/perl -w
use strict;
use warnings;

my $set    = [map { int rand() * 99 } (0..40)];
print "in:  @{$set}\n";

my $sorted = bubblesort($set);
print "out: @{$sorted}\n";

sub bubblesort {
  my ($in)     = @_;
  my $out      = [@{$in}];
  my $length   = scalar @{$in};
  my $modified = 1;

  while($modified) {
    $modified = 0;
    for my $i (0..$length-2) {
      if($out->[$i] > $out->[$i+1]) {
	($out->[$i], $out->[$i+1]) = ($out->[$i+1], $out->[$i]);
	$modified = 1;
      }
    }
  }

  return $out;
}

Bubblesort iterates through each element of the list up to the last but one, comparing to the next element in the list. If it’s greater the values are swapped. The process repeats until no modifications are made to the list.

Pros: doesn’t use much memory – values are swapped in situ; doesn’t perform deep recursion; is easy to read

Cons: It’s pretty slow. The worst-case complexity is O(n2) passes (for each value in the list each value in the list is processed once).

2. Merge Sort

#!/usr/bin/perl
use strict;
use warnings;

my $set    = [map { int rand() * 99 } (0..40)];
print "in:  @{$set}\n";

my $sorted = mergesort($set);
print "out: @{$sorted}\n";

sub mergesort {
  my ($in) = @_;

  my $length = scalar @{$in};
  if($length < = 1) {
    return $in;
  }

  my $partition = $length / 2;
  my $left      = [@{$in}[0..$partition-1]];
  my $right     = [@{$in}[$partition..$length-1]];

  return merge(mergesort($left), mergesort($right));
}

sub merge {
  my ($left, $right) = @_;
  my $merge = [];

  while(scalar @{$left} || scalar @{$right}) {
    if(scalar @{$left} && scalar @{$right}) {
      if($left->[0] < $right->[0]) {
	push @{$merge}, shift @{$left};
      } else {
	push @{$merge}, shift @{$right};
      }
    } elsif(scalar @{$left}) {
      push @{$merge}, shift @{$left};
    } elsif(scalar @{$right}) {
      push @{$merge}, shift @{$right};
    }
  }
  return $merge;
}

Mergesort recurses through the list, in each iteration breaking the remaining list in half. Once broken down to individual elements, each pair of elements/lists at each depth of recursion is reconstituted into a new ordered list and returned.

Pros: generally quicker than bubblesort; O(n log n) complexity.

Cons: quite difficult to read

3. Quicksort

#!/usr/bin/perl
use strict;
use warnings;

my $set    = [map { int rand() * 99 } (0..40)];
print "in:  @{$set}\n";

my $sorted = quicksort($set);
print "out: @{$sorted}\n";

sub quicksort {
  my ($in) = @_;

  my $length = scalar @{$in};
  if($length < = 1) {
    return $in;
  }

  my $pivot = splice @{$in}, $length / 2, 1;
  my $left  = [];
  my $right = [];

  for my $v (@{$in}) {
    if($v < $pivot) {
      push @{$left}, $v;
    } else {
      push @{$right}, $v;
    }
  }

  return [@{quicksort($left)}, $pivot, @{quicksort($right)}];
}

Quicksort is probably the best known of all the sort algorithms out there. It’s easier to read than Mergesort, though arguably still not as easy as Bubblesort, but it’s a common pattern and its speed makes up for anything lacking in readability. At each iteration a pivot is selected and removed from the list. The remaining list is scanned and for element lower than the pivot is put in a new “left” list and each greater element is put into a new “right” list. The returned result is a merged recursive quicksort of the left list, the pivot and the right list.

In this example I’m picking the middle element of the list as the pivot. I’m sure there are entire branches of discrete mathematics dealing with how to choose the pivot based on the type of input data.

Pros: (One of?) the fastest sort algorithm(s) around; Reasonably efficient memory usage and recursion depth. Average O(n log n) complexity again (worst is O(n2)).

Perhaps it’s worth noting that in 25-odd years of programming computers I’ve only ever had to examine the inner workings of sort routines as part of my degree – never before, nor after, but it’s certainly brought back a few memories.