This article has been moved from «perl6.eu» and updated to reflect the language rename in 2019.
See also: The Introduction | Part 1: Base 36 | Part 2: Base 400.
Very long messages are vulnerable to Letter Frequency Analysis, even when we use Base 400 encryption, as the 10 possible codes for each character will be evenly distributed for large selections. We can avoid that (or at least making it considerably harder) by changing the key after encoding part of the message.
I'll describe the decryption part first:
If we specify the wrong key, a modified string, or the secret keys don't match, we'll get gibberish in return. As the key is also specified as two base 40 characters the program will happily go on. The decryption algorithm gives valid output regardless of what we have given it, so an attacker will not get any help when trying to guess the keys. And will have to try all the possibilites and look for recognizable text in them. E.g:
$ raku decrypt AA 149N6WI1IUI6ODASEEJRO
YSYI44FE6YUUOH8W
$ raku decrypt AB 149N6WI1IUI6ODASEEJRO
4E0DPEVQ8.1AZN Y
The code:
File: lib/Cryptoish.pm6 (partial)
my %base40values = @base40.map( { $_ => $++ } );
our sub multidecrypt (Alphabet $key is copy, Str $value, :$secret)
{
my &get-secret = load-it($secret);
sub char-from-key($key, $index)
{
return @( &get-secret($key) ).substr($index, 1);
}
my $number = $value.parse-base($wrapper); # [1]
my @numbers; # [1]
while $number # [1]
{
@numbers.push($number % $base); # [1]
$number = $number div $base; # [1]
}
my $result = "";
while (@numbers)
{
my $count = %base40values{ char-from-key($key, @numbers.shift) }; # [2][3]
if $count
{
my @partial = @numbers.splice(0, $count); # [4]
$result ~= char-from-key($key, $_) for @partial; # [4]
}
if @numbers > 1 # [5] [7]
{
my ($a, $b) = @numbers.splice(0, 2); # [5]
my $new-key = char-from-key($key, $a) ~ char-from-key($key, $b); # [5]
$key = $new-key; # [5]
}
}
return $result;
}
[7] Note the >1
, as we want at least two elemens
so that «splice» (which removes the two first elements from the list) doesn't
choke. A single element will be left for the next iteration («while @numbers») takes
it as a count, but ends up reading zero values as there are none left. That is ok.
This is done so that decrypting messages with the wrong key cannot make the program
crash, as that would have told an attacker that the key was wrong. An attacker can
of course modify the program to discover the fact, and use it to discard
the solution. (And I can have fun by adding a single dummy base 400 character to
the end of the encrypted string in the encryption program to invalidate this
assumption, without messing up the decrypted message.)
The encryption part:
The code:
File: lib/Cryptoish.pm6 (partial)
our sub multiencrypt (Alphabet $key is copy, Alphabet $text is copy, :$secret)
{ # [1]
my &get-secret = load-it($secret);
sub random_key
{
return (0 .. 9, "A" .. "Z").flat.roll(2).join;
}
my @chars = $text.comb; # [2]
my @return;
while @chars # [3] [9]
{
my $idx = 0; # [4]
my %values; # [4]
%values{$_}.push($idx++) for &get-secret($key).comb; # [4]
my $count = (0 .. 39).pick; # [5]
@return.push(%values{@base40[$count]}.pick); # [6]
if $count # [7]
{
my @partial = @chars.splice(0, $count); # [7]
@return.push(%values{$_}.pick) for @partial; # [7]
}
if @chars # [8]
{
my $new-key = random_key; # [8]
@return.push(%values{$new-key.substr(0,1)}.pick); # [8]
@return.push(%values{$new-key.substr(1,1)}.pick); # [8]
$key = $new-key; # [8]
}
}
return @return.map( { $_ * $base ** $++ } ).sum.base($wrapper);
# [10] ######################### # [11] #######
}
The scripts:
File: multiencrypt
#! /usr/bin/env raku
use lib "lib";
use Cryptoish;
sub MAIN (Cryptoish::Alphabet $key, Str $text, :$secret)
{
say Cryptoish::multiencrypt($key, $text, :$secret);
}
File: multidecrypt
#! /usr/bin/env raku
use lib "lib";
use Cryptoish;
sub MAIN (Cryptoish::Alphabet $key, Str $value, :$secret)
{
say Cryptoish::multidecrypt($key, $value, :$secret);
}
See the «Verbose» section of Part 4: Postscript for a step by step walk through.
But as we use this for encryption, we don't really need to make it a correct base 40 (or base 400) encoding. So I dropped them both. This change results in different encrypted texts, but that doesn't matter as both the encrypter and decrypter agree on the algorithm.
$ raku encrypt AA "THIS IS A TEST. YES."
MPWCOONJ21TNCP7D6JVNGT8YFUZ5L0LAH
$ raku decrypt AA MPWCOONJ21TNCP7D6JVNGT8YFUZ5L0LAH
THIS IS A TEST. YES.
$ raku multidecrypt AA MPWCOONJ21TNCP7D6JVNGT8YFUZ5L0LAH
SEY .TSET A SI SIHT
The last one shows that the string has been reversed. The first letter in the encrypted string is taken as the count. The character is a «.» (a period), with index 37 (in the base 40 character set). That means that the algorithm reads up to 37 values, and as the message is shorter, we get everything. The period is not part of the message, and is missing.
We can add «flip» as a command line option:
$ raku multiencrypt --flip AA "ARNE SOMMER"
FU8WIQ53PP76WL6GACRCXV0XP
$ raku multidecrypt AA FU8WIQ53PP76WL6GACRCXV0XP
6QCEMMOS ENRAA
Note that the text is reversed, and somewhat similar. The changes in the next section will further reduce the similarities, so that «flip» actually makes the message unreadable.
Only the changes to the module are shown here, like
this . See the next section for the complete code, including the scripts (or scroll
up, and imagine «:$flip
» added in the same way as «:$secret
» in
the programs).
our sub multiencrypt (Alphabet $key is copy, Alphabet $text is copy,
:$secret, Bool :$flip = False) # [1]
return @return.reverse.map( { $_ * $base ** $++ } ).sum.base($wrapper)
if $flip; # [2]
return @return.map( { $_ * $base ** $++ } ).sum.base($wrapper); # [3]
our sub multidecrypt (Alphabet $key is copy, Str $value is copy,
:$secret, Bool :$flip = False) # [4]
@numbers = @numbers.reverse if $flip; # [5]
[1] Adding the optional named Boolean parameter «flip».
[2] Note the «reverse», if we have enabled «flip». This line is added just before [3]
[3] Without «reverse» otherwise.
[4] As [1].
[5] We reverse the list, if we have enabled «flip».
The easiest way of reducing the possibility for that to happen is sending short messages only. That isn't very user friendly, though.
The problem is that we we use «pick» on a list of ten values. We can prevent the possibility of repetition using «grab» instead, as it removes the value from the list (where as «pick» doesn't change the list). Then we'd have to regenerate the list for the character again, when all ten have been used. This will reduce the possibility for two adjacant identical letters in the original message from ending up with the same value in the encrypted version. (It will not remove the possibility altogether, as the first one can be the tenth, and the second one is then taken from a new random list.)
Or we could switch the key, and go back and update the count before this part of the encrypted message. That is harder.
See docs.raku.org/routine/grab for more information about «grab».
I have not programmed this, as the next section gives a solution that is much better - and much easier to implement.
unit module Cryptoish;
constant @base40 := (0 .. 9, "A" .. "Z", " ", ".", "?", "!").flat;
constant $base := 400;
constant $wrapper := 36;
our subset Alphabet of Str where { /^@base40+$/ };
our subset Modulo of Int where { 2 <= $_ <= 40 }; # [1]
sub load-it ($secret = "")
{
my $name = $secret ?? "Cryptoish::Secret::$secret" !! "Cryptoish::Secret";
require ::($name) '&get-secret';
# say "[Loading $name]";
return &get-secret;
}
our sub encrypt (Alphabet $key, Alphabet $text, :$secret)
{
my &get-secret = load-it($secret);
my $idx = 0;
my %values;
%values{$_}.push($++) for &get-secret($key).comb;
return $text.flip.comb.map( { %(%values).{$_}.pick * $base ** $++ } )
.sum.base($wrapper);
}
our sub decrypt (Alphabet $key, Str $value, :$secret)
{
my &get-secret = load-it($secret);
my $result = "";
my $number = $value.parse-base($wrapper);
while $number
{
$result ~= @( &get-secret($key) ).substr($number % $base, 1);
$number = $number div $base;
}
return $result.flip;
}
our sub multiencrypt (Alphabet $key is copy, Alphabet $text is copy,
Modulo :$modulo = 40, Bool :$flip = False,
:$secret) # [2]
{
my &get-secret = load-it($secret);
sub random_key
{
return (0 .. 9, "A" .. "Z").flat.roll(2).join;
}
my @chars = $text.comb;
my @return;
while @chars
{
my $idx = 0;
my %values;
%values{$_}.push($idx++) for &get-secret($key).comb;
my $count = (0 .. 39).pick;
$count = $count % $modulo; # [3]
if $count
{
my @partial = @chars.splice(0, $count);
@return.push(%values{$_}.pick) for @partial;
}
if @chars
{
my $new-key = random_key;
@return.push(%values{$new-key.substr(0,1)}.pick);
@return.push(%values{$new-key.substr(1,1)}.pick);
$key = $new-key;
}
}
return @return.reverse.map( { $_ * $base ** $++ } ).sum.base($wrapper)
if $flip;
return @return.map( { $_ * $base ** $++ } ).sum.base($wrapper);
}
my %base40values = @base40.map( { $_ => $++ } );
our sub multidecrypt (Alphabet $key is copy, Str $value, Modulo :$modulo = 40,
Bool :$flip = False, :$secret, :$verbose = False) # [4]
{
my &get-secret = load-it($secret);
sub char-from-key($key, $index)
{
say "K: $key I: $index" if $verbose;
return @( &get-secret($key) ).substr($index, 1);
}
my $number = $value.parse-base($wrapper);
my @numbers;
while $number
{
@numbers.push($number % $base);
$number = $number div $base;
}
@numbers = @numbers.reverse if $flip;
my $result = "";
while @numbers
{
my $count = %base40values{ char-from-key($key, @numbers.shift) };
if $count
{
{
$count = $count % $modulo; # [5]
my @partial = @numbers.splice(0, $count);
$result ~= char-from-key($key, $_) for @partial;
}
if @numbers.elems > 1
{
my ($a, $b) = @numbers.splice(0, 2);
my $new-key = char-from-key($key, $a) ~ char-from-key($key, $b);
$key = $new-key;
}
}
return $result;
}
[1] I have added a «subset» to be used as type constraint on the modulo value. Values 2..40 only are allowed.
[2] The modulo argument, as an optional named parameter. The default value is 40.
[3] Applying modulo on the value, after adding the original one to the output.
[4] The same as #2.
[5] Applying modulo on the value, before using it.
And finally the scripts, with the changes for both modulo and flip highlighted:
File: multiencrypt
#! /usr/bin/env raku
use lib "lib";
use Cryptoish;
sub MAIN (Cryptoish::Alphabet $key, Str $text :$secret,
Cryptoish::Modulo :$modulo = 40, Bool :$flip = False)
{
say Cryptoish::multiencrypt($key, $text :$secret, :$modulo, :$flip);
}
File: multidecrypt
#! /usr/bin/env raku
use lib "lib";
use Cryptoish;
sub MAIN (Cryptoish::Alphabet $key, Str $value :$secret,
Cryptoish::Modulo :$modulo = 40, Bool :$flip = False)
{
say Cryptoish::multidecrypt($key, $value :$secret, :$modulo, :$flip);
}
Testing «modulo»:
$ raku multiencrypt --modulo=4 AA "THIS IS A TEST."
6MHTT7KYGD2XV59W84QUZR0QHCTYOLGEKHP1BC22KCNPD2GIN8A5CI39HL4LB85BOAJLK7TR5LRQZ
$ raku multidecrypt --modulo=4 AA 6MHTT7KYGD2XV59W84QUZR0QHCTYOLGEKHP1BC22KCNPD2GIN8A5CI39HL4LB85BOAJLK7TR5LRQZ
THIS IS A TEST.
$ raku multidecrypt --modulo=10 AA 6MHTT7KYGD2XV59W84QUZR0QHCTYOLGEKHP1BC22KCNPD2GIN8A5CI39HL4LB85BOAJLK7TR5LRQZ
THICKT.J7U1?27DNVIQN78.95TL 5A
$ raku multidecrypt AA 6MHTT7KYGD2XV59W84QUZR0QHCTYOLGEKHP1BC22KCNPD2GIN8A5CI39HL4LB85BOAJLK7TR5LRQZ
THICKT.J7U1?0N2I9QBU S8 7?EVND8PTYNPJVD
The first three characters («THI») are decrypted correctly if we don't apply the user specified modulo value (of 4). Note that the program picks a random number, so we get from 0 to 3 characters before switching keys. In this initial case we got the maximum (3), which is pure luck (or unluck).
Testing «flip»:
$ raku multiencrypt --modulo=10 AA ABCDEFGHIJKLMNOPQRSTUVW
KISA5EIEWO113MR1NJ4WYYBJ3ZJF60PLW28VVPFTNTQRL7YUAPCGSRQ6NQ25
$ raku multidecrypt --modulo=10 AA KISA5EIEWO113MR1NJ4WYYBJ3ZJF60PLW28VVPFTNTQRL7YUAPCGSRQ6NQ25
ABCDEFGHIJKLMNOPQRSTUVW
$ raku multidecrypt --modulo=10 --flip AA KISA5EIEWO113MR1NJ4WYYBJ3ZJF60PLW28VVPFTNTQRL7YUAPCGSRQ6NQ25
HM4JDL98.615ZIBZL
$ raku multiencrypt --modulo=10 --flip AA ABCDEFGHIJKLMNOPQRSTUVW
1AOIHFGSFLN7VB6MV64TN8U8HFV5HIJUDUO5JO9WYQF7052FQ5RN0RND
$ raku multidecrypt --modulo=10 --flip AA 1AOIHFGSFLN7VB6MV64TN8U8HFV5HIJUDUO5JO9WYQF7052FQ5RN0RND
ABCDEFGHIJKLMNOPQRSTUVW
$ raku multidecrypt --modulo=10 AA 1AOIHFGSFLN7VB6MV64TN8U8HFV5HIJUDUO5JO9WYQF7052FQ5RN0RND
.9PEWCC1KM.J73M8BW5AUF
Using «modulo=2», the lowest possible value, gives the longest encrypted result (as it encrypts 0 or 1 value with each key, before changing the key). The result should be very hard to crack without the correct Secret File, as any repetition of the base 400 numbers doesn't mean anything whatsoever.
Changing keys can result in the same encrypted value beeing used several times, but from a different key. So they can actually represent different base 40 characters. This makes it possible for an attacker to see a pattern (a repetition) that really isn't there. And as I haven't reduced the possibility of an actual repetition with the same key (as described in the Repetition is a Bad Thing section above), he cannot know if the repetition is real or not. I am rather satisfied with that conundrum.
See the next and final part; Part 4: Postscript.