Here’s a handy one(ish)-liner to mutate an input sequence using Perl’s RegEx engine:
epiphyte:~ rmp$ perl -e '$seq="ACTAGCTACGACTAGCATCGACT"; $mutants = [qw(A C T G)];
print "$seq\n";
$seq =~ s{([ATGC])}{ rand() < 0.5 ? $mutants->[int rand 4] : $1 }smiexg;
print "$seq\n";'
ACTAGCTACGACTAGCATCGACT
ACAATCGCGGACCAGAATCTCTT
This gives each base in $seq a 50% chance (rand() < 0.5) of being replaced by a random pick from $mutants. Because the original base is itself in the $mutants array, a quarter of those replacements pick the same base again, so the chance of a visible change is really 0.5 × 0.75 = 37.5%. If you wanted to improve it by excluding the original base for each mutation you might do something like:
epiphyte:~ rmp$ perl -e '$seq="ACTAGCTACGACTAGCATCGACT"; $mutants = [qw(A C T G)];
$mutsize=scalar @{$mutants}; print "$seq\n";
$seq =~ s{([ATGC])}{ rand() < 0.5 ? [grep { $_ ne $1 } @{$mutants}]->[int rand $mutsize-1] : $1 }smiexg;
print "$seq\n";'
ACTAGCTACGACTAGCATCGACT
TGTAGATAATGTGATACGAGACT
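This (quite inefficiently) builds, at every position, an array of all available options from $mutants excluding $1, the matched base, and then picks one of the remaining $mutsize-1 entries at random. Note that the grep comparison is case-sensitive, so although the i flag lets the pattern match lowercase bases, a lowercase base would not be excluded from its own list of options.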
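As there are only four bases, the per-position grep can be avoided by building the list of alternatives for each base once, up front. A minimal sketch, assuming a lookup hash called %alternatives (a hypothetical name, not something from the one-liner above):

```perl
use strict;
use warnings;

my @bases = qw(A C T G);

# Hypothetical lookup table: for each base, the three bases it may mutate to.
my %alternatives;
for my $base (@bases) {
    $alternatives{$base} = [ grep { $_ ne $base } @bases ];
}

my $seq = 'ACTAGCTACGACTAGCATCGACT';
print "$seq\n";

# Same 50% chance per base, but the candidates are looked up rather than rebuilt.
$seq =~ s{([ATGC])}{
    rand() < 0.5 ? $alternatives{$1}[ int rand 3 ] : $1
}exg;
print "$seq\n";
```

The i flag is dropped in this sketch so that every matched base is guaranteed to be a key in the table.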
Unrolled and tidied up a little for readability, the grep-based one-liner looks like this:
my $seq = 'ACTAGCTACGACTAGCATCGACT';
my $mutants = [qw(A C T G)];
my $mutsize = scalar @{$mutants};
print "$seq\n";
$seq =~ s{([ATGC])}{
    rand() < 0.5
        ? [grep { $_ ne $1 } @{$mutants}]->[int rand $mutsize-1]
        : $1
}smiexg;
print "$seq\n";
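If the mutation is needed in more than one place, the same substitution can be wrapped in a subroutine with the mutation rate as a parameter. This is only a sketch: mutate() and its $rate argument are hypothetical names introduced here for illustration.

```perl
use strict;
use warnings;

my @bases = qw(A C T G);

# mutate() is a hypothetical helper: each base in $seq is replaced, with
# probability $rate, by one of the other three bases chosen uniformly.
sub mutate {
    my ($seq, $rate) = @_;
    $seq =~ s{([ATGC])}{
        my $orig   = $1;
        my @others = grep { $_ ne $orig } @bases;
        rand() < $rate ? $others[ int rand @others ] : $orig;
    }exg;
    return $seq;
}

my $seq     = 'ACTAGCTACGACTAGCATCGACT';
my $mutated = mutate($seq, 0.5);

# Count the positions that differ between the original and the mutated copy.
my $changed = grep { substr($seq, $_, 1) ne substr($mutated, $_, 1) }
              0 .. length($seq) - 1;

print "$seq\n$mutated\n";
print "$changed of ", length($seq), " bases changed\n";
```

Because the original base is excluded from the candidates, a rate of 0 leaves the sequence untouched and a rate of 1 changes every position, which makes the helper easy to sanity-check.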