[ Index | Permutations | Canonical | Multigrams | The Program | On My Wig ]
We have already dealt with multiple words in the source of the anagram (e.g. translated «West Virginia» to «westvirginia»). But what about multiple words in the result?
Let us illustrate the difficulty, starting with a three letter word (e.g. «ABC»). Each of the boxes represents a set of characters (or a potential word on canonical form, i.e. sorted). The rows (after the arrows) represent the possibilites. The number of combinations (5) is not too bad. |
Then let us take a look at a four letter word (e.g. «ABCD»). Now we get 15 possible combinations. |
The number of combinations will increase pretty fast. How fast is not something that I care to compute. Suffice to say that it does not scale well when we increase the length of the word (the subject).
We are going to try to solve this with recursion.
#! /usr/bin/env raku
unit sub MAIN ($string, :v(:$verbose));
pick-partial($string.comb.sort.join).map({ "@_[0];@_[1]".say }); # [1]
sub pick-partial ($string) # [2]
{
my %seen; # [3]
return gather # [4]
{
for 1 .. $string.chars -> $c # [5]
{
_pick_partial("", $string, $c); # [6]
}
}
sub _pick_partial ($done, $todo, $length) # [7]
{
if $length == 0 # [8]
{
take ($done, $todo) unless %seen{$done}; # [9]
%seen{$done} = True; # [10]
return; # [11]
}
my @todo = $todo.comb; # [12]
for ^@todo.elems -> $index # [13]
{
my @copy = @todo.clone; # [14]
my $add = @copy.splice($index, 1); # [15]
my $new = $done ~ $add; # [16]
return if $new.comb.sort.join ne $new; # [17]
_pick_partial($new, @copy.join, $length -1); # [18]
}
}
}
[1] Get all the two word combinations, and print each one on a separate
line with a semicolon between the words (using map
).
[2] The procedure doing the work.
[3] A list of already seen words.
[4] I have chosen to use
gather
/take
to collect (return) the values.
See my Raku Gather,
I Take article or
docs.raku.org/syntax/gather take for more information about
gather
/take
.
[5] Iterate over the length of the input string; thus giving the number of characters to place in the first word.
[6] Calling a recursive helper procedure that does the actual work. The first argument is the already computed word (initially ""). The second is the leftover letters (i.e. unused in the first word) The third is the number of characters to use when setting up the next words.
[7] The recursive helper function.
[8] No more characters to select?
[9] If so, return (with take
) the words, unless seen already.
This kicks in if we have repeating characters - as e.g. «AA» is the same as
«AA».
[10] Enure that we do not get more instances of this sentence later on.
[11] We are done with this recursion (as there are no more letters left).
[12] The remaining letters, as a list.
[13] Iterate over the indices.
[14] Get a copy of the list from [12], so that we do not clobber up the original
(with splice
in [15]).
[15] Remove one character from the array (and remember that
we iterate over all of them), with splice
.
See
docs.raku.org/routine/splice
for more information about splice
.
[16] Add the new letter to the word.
[17] Return from this recursion if the word is on non-canonical form (i.e. «BA» instead of «AB») as they are identical.
[18] Then we use recursion to get the remaining letters.
Running it:
$ ./pick-partial abc
a;bc
b;ac
c;ab
ab;c
ac;b
abc;
$ ./pick-partial cba
a;bc
b;ac
c;ab
ab;c
ac;b
abc;
$ ./pick-partial abcd
a;bcd
b;acd
c;abd
d;abc
ab;cd
ac;bd
ad;bc
abc;d
abd;c
abcd;
$ ./pick-partial aabb
a;abb
b;aab
aa;bb
ab;ab
aab;b
aabb;
$ ./pick-partial aaaa
a;aaa
aa;aa
aaa;a
aaaa;
How can we handle the rest of the daunting challenge? The trick is recursion. Well, even more recursion. Let us consider the two word solution for «ABCD»:
The idea is to regard the second word (shown in gray) as a remainder instead of a second word. Then we can do a recursive call on those remainders - with the first word (in yellow) at the front of a list of resulting words. When we run out of recursive parts to add, we have a list of words that form a (potential) sentence.
We can show this by filling in the missing part (or a part of it) of the illustration above:
Instead of returning the gray boxes (in the first illustration), the program recursively
replaces them with the result shown in the second illustration. The green «CD» will also
be replaced, but this will not lead to duplicates as they start out with different permutations:
(AB CD
vs A B CD
).
#! /usr/bin/env raku
unit sub MAIN ($string, :v(:$verbose));
pick-partial($string.comb.sort.join)>>.say;
sub pick-partial ($string)
{
my %seen;
return gather
{
for 1 .. $string.chars -> $c
{
_pick_partial( (), "", $string, $c); # [1]
}
}
sub _pick_partial (@done, $done is copy, $todo, $length) # [1]
{
if $length == 0
{
if $todo eq ""
{
my @new = @done.clone;
@new.push: $done;
my $new = @new.join(" ");
unless %seen{$new}
{
take $new if $new eq @new.sort.join(" ");
%seen{$new} = True;
}
}
else
{
my @done2 = @done.clone;
@done2.push: $done;
for 1 .. $todo.chars -> $c
{
_pick_partial(@done2, "", $todo, $c); # [1]
}
}
}
my @todo = $todo.comb;
for ^@todo.elems -> $index
{
my @copy = @todo.clone;
my $add = @copy.splice($index, 1);
my $new = $done ~ $add;
return if $new.comb.sort.join ne $new;
_pick_partial(@done, $new, @copy.join, $length -1); # [1]
}
}
}
[1] The extra argument @done
is the list of words we have picked
so far, and $done
is the current one.
The rest of the program should be straight forward.
Running it:
$ ./pick-even-more-partial abc
a b c
a bc
ab c
ac b
abc
$ ./pick-even-more-partial abcd
a b c d
a b cd
a bc d
a bd c
a bcd
ab c d
ab cd
ac b d
ac bd
ad b c
ad bc
abc d
abd c
abcd
$ ./pick-even-more-partial aaa
a a a
a aa
aaa
Note that all the word candidates are on canonical form, so it is easy to check them against the (also canonical) dictionary.
We can (and indeed should) do the dictionary lookup for each word that we add to
@done
, so that we can abort computing the rest of the words if we
have a non-word up front.
We'll do just that, in the next part.
[ Index | Permutations | Canonical | Multigrams | The Program | On My Wig ]