[ Index | Permutations | Canonical | Multigrams | The Program | On My Wig ]
I had a hunch (ouch) that this program would be extremely slow on long words (and yes - it sure is), so chose to skip the automatic lookup of US states this time. Specify the word(s) manually on the command line, lean back and wait for the result. The upside is that you can specify anything, not just a US state...
#! /usr/bin/env raku
unit sub MAIN
(
$string,
:d(:$dictionary) where $dictionary.IO.r = 'dict-US.txt',
:v(:$verbose)
);
my %anagram;
for $dictionary.IO.lines -> $word
{
%anagram{$word.lc.comb.sort.join}.push: $word;
}
my $length = $string.chars;
my @sentences = pick-all($string);
for @sentences -> $sentence
{
say $sentence.words.map({ %anagram{$_}.join("|") }).join(" ");
}
sub pick-all ($string)
{
my $canonical = $string.lc.comb.grep( * ~~ /<[a .. z]>/ ).sort.join;
say ":Canonical: $string -> $canonical" if $verbose;
my %seen;
return gather
{
for 1 .. $canonical.chars -> $c
{
_pick_partial( (), "", $canonical, $c);
}
}
sub _pick_partial (@done, $done is copy, $todo, $length)
{
if $length == 0
{
if $todo eq ""
{
if %anagram{$done}
{
my @new = @done.clone;
@new.push: $done;
my $new = @new.join(" ");
unless %seen{$new}
{
take $new if $new eq @new.sort.join(" ");
%seen{$new} = True;
}
}
}
else
{
my @done2 = @done.clone;
if %anagram{$done}
{
@done2.push: $done;
for 1 .. $todo.chars -> $c
{
_pick_partial(@done2, "", $todo, $c);
}
}
}
}
my @todo = $todo.comb;
for ^@todo.elems -> $index
{
my @copy = @todo.clone;
my $add = @copy.splice($index, 1);
my $new = $done ~ $add;
next if $new.comb.sort.join ne $new; # AB -> BA
_pick_partial(@done, $new, @copy.join, $length -1);
}
}
}
This program should not need any explanation, as long as you have read the prior parts of thsi article.
Let us run it, even though that is the topic of the next part.
I have chosen to show only some of the permutations (with a proposed sentence highlighted in white). The first one gave 61 suggestions, and the second 130:
$ ./multigrams "arne sommer"
man remorse # -> remorse man
armor semen # -> armor semen
earner|nearer moms # -> nearer moms
manors|ransom mere # -> mere ransom
$ ./multigrams "raku and perl"
a par|rap den|end lurk
a lard punker # -> a punker lard
a lark pruned # -> a pruned lark
a paler|pearl drunk # -> a paler drunk
a plank ruder # -> a ruder plank
a prank lured|ruled # -> ruled a prank
and pare|pear|rape|reap lurk # -> lurk and rape
parked lunar # -> parked lunar
If the canonical form only gives one real word, the program prints that. If we get several (alternate) words, they are printed with a vertical bar between them.
The actual number of suggestions will depend on the dictionary and the words you have chosen to ignore (added to the «ignore list»).
Note that duplicate words in the output is ok:
$ ./multigrams toto
to to
toot
I have not forgotten the US states. We'll look into them in the next and final part.
[ Index | Permutations | Canonical | Multigrams | The Program | On My Wig ]