This is my response to the Perl Weekly Challenge #092.
$A
and $B
.
Write a script to check if the given strings are Isomorphic. Print 1 if they are otherwise 0.
Example 1:
Input: $A = "abc"; $B = "xyz"
Output: 1
Example 2:
Input: $A = "abb"; $B = "xyy"
Output: 1
Example 3:
Input: $A = "sum"; $B = "add"
Output: 0
The point is a one-to-one mapping. So if we set up a hash, with the characters
from $A
as keys, and the characters from $B
as the
values. Duplicate keys (from $A
) are ok, as long as they lead to
the same value (from $B
); e.g. $A = 'ABA'; $B = '1*1'
is ok. Duplicates on the value side (from $B
) are ok in a hash,
but not for us here. We can get rid of them like this: %hash.values.unique
.
#! /usr/bin/env raku
unit sub MAIN (Str $A, Str $B, :v($verbose)); # [1]
if $A.chars != $B.chars # [2]
{
say ": Different length" if $verbose;
say 0;
exit;
}
my %A2B = ($A.comb Z $B.comb)>>.hash; # [3]
say ": Hash: { %A2B.raku }" if $verbose;
say %A2B.keys.elems == %A2B.values.unique.elems # [4]
?? 1 # [4a]
!! 0; # [4b]
[1] The Str
type does not actually restrict the input, so we could drop
it.
[2] Check that the two strings have the same length. If not, say so and exit.
[3]
The Z
infix operator (which has a prefix version
zip
) takes one element from each list at a time and gives a list back
(as a zipper) with the tuples as sublists. We want a hash, and a two-dimentional list
cannot be coerced to one, so we have to coerce each pair of values to a hash before
the assignment to the hash to make this work.
[4] Ensure a one-to-one mapping, and print 1 on success [4a] and 0 if not [4b].
See
docs.raku.org/routine/Z for
more information about Z
.
See
docs.raku.org/routine/zip for
more information about zip
.
The reason
Z
resturns a list of lists (and not a list of pair objects,
that we could have turned into a hash with a simple assignment) is that
is is a spceial case of of the zip
function, that can take
two or more lists. Three elements do not easily fit in Pair objects...
See
docs.raku.org/type/Pair for
more information about the Pair
type.
Running it:
$ ./isomorphic-strings-zip abc xyz
1
$ ./isomorphic-strings-zip -v abc xyz
: Hash: {:a("x"), :b("y"), :c("z")}
1
$ ./isomorphic-strings-zip abb xyy
1
$ ./isomorphic-strings-zip -v abb xyy
: Hash: {:a("x"), :b("y")}
1
$ ./isomorphic-strings-zip sum add
0
$ ./isomorphic-strings-zip -v sum add
: Hash: {:m("d"), :s("a"), :u("d")}
0
So far so good...
We can use a traditional loop instead of the handy zip operator(s). Here
it is with an implicit loop with map
insted if the [3]-line above:
my %A2B = (^$A.chars).map({ ( $A.substr($_,1) => $B.substr($_, 1) ) });
The zip version is shorter, and easier to understand (in my opinion).
The last example had a duplicate value («d»), and the program detected it. But what happens if we change the order of the strings?
$ ./isomorphic-strings-zip -v add sum
: Hash: {:a("s"), :d("m")}
1
Oops.
The problem is that dupliate keys are lost in a hash.
We can in fact use a hash, but must build it up manually:
File: isomorphic-strings-zip-again
#! /usr/bin/env raku
unit sub MAIN (Str $A, Str $B, :v($verbose));
if $A.chars != $B.chars
{
say ": Different length" if $verbose;
say 0;
exit;
}
my @A2B = ($A.comb Z $B.comb);
say ": Array: { @A2B.raku }" if $verbose;
my %A;
for @A2B -> @pair
{
say ": Pair: @pair[0] -> @pair[1]" if $verbose;
if %A{@pair[0]}.defined # [1]
{
if %A{@pair[0]} eq @pair[1] # [2]
{
say ": Duplicate of @pair[0] (value: @pair[1])";
next;
}
# [3]
say ": Redeclaration of @pair[0] (values: %A{@pair[0]} and @pair[1])";
say 0;
exit;
}
%A{@pair[0]} = @pair[1];
}
say %A.keys.elems == %A.values.unique.elems # [4]
?? 1
!! 0;
[1] This one handles duplicate value in $A
.
[2] Does if have the same value in $B
, if so we are good.
[3] If not, we say so (say 0
) and are done.
[4] This one handles duplicate values in $B
.
Running it:
$ ./isomorphic-strings-zip-again -v abc xyz
: Array: [("a", "x"), ("b", "y"), ("c", "z")]
: Pair: a -> x
: Pair: b -> y
: Pair: c -> z
1
$ ./isomorphic-strings-zip-again -v abb xyy
: Array: [("a", "x"), ("b", "y"), ("b", "y")]
: Pair: a -> x
: Pair: b -> y
: Pair: b -> y
: Duplicate of b (value: y)
1
$ ./isomorphic-strings-zip-again -v sum add
: Array: [("s", "a"), ("u", "d"), ("m", "d")]
: Pair: s -> a
: Pair: u -> d
: Pair: m -> d
0
The duplicate detection is done by the last line it the program, and it does not have verbose output (as it probably should). I'll get back to that later..
It we swap the strings, we get the verbose explanation as to why it fails:
$ ./isomorphic-strings-zip-again -v add sum
: Array: [("a", "s"), ("d", "u"), ("d", "m")]
: Pair: a -> s
: Pair: d -> u
: Pair: d -> m
: Redeclaration of d (values: u and m)
0
We can get rid of the test for string length with a little care (and the use of
roundrobin
):
#! /usr/bin/env raku
unit sub MAIN (Str $A, Str $B, :v($verbose));
my @A2B = (roundrobin($A.comb, $B.comb)); # [1]
say ": Array: { @A2B.raku }" if $verbose;
my %A;
for @A2B -> @pair
{
unless @pair[1].defined # [2]
{
say ": Different length" if $verbose;
say 0;
exit;
}
say ": Pair: @pair[0] -> @pair[1]" if $verbose;
if %A{@pair[0]}.defined # [3]
{
if %A{@pair[0]} eq @pair[1] # [3a]
{
say ": Duplicate of @pair[0] (value: @pair[1])";
next;
}
# [3b]
say ": Redeclaration of @pair[0] (values: %A{@pair[0]} and @pair[1])";
say 0;
exit;
}
%A{@pair[0]} = @pair[1];
}
if %A.keys.elems == %A.values.unique.elems
{
say 1;
}
else
{
say ': Redeclaration of value (in $B).'; # [4]
say 0;
}
[1] Using roundrobin
instead of the infix Z
operator gives trailing single elemens when the input lists have diffrent length (whereas
Z
(and zip
) gives up when the shortest list has been exhausted.
Coercing the sublists to hashes will fail if they contain single values, as we get if one
of thestrings is shorter than the other one.
[2] This block handles the situation where the strimgs have different lenght. This approach has just as much code as the previous program, so the change probably isn't worth the effort.
[3] Do have duplicate keys? If they have the same value, that is ok [3a]. If not, say so and exit [3b].
[4] Duplicate values (from $B
) that came from different keys (from
$A
).
See
docs.raku.org/routine/roundrobin for
more information about roundrobin
.
Running it gives the same output as above, except when we have duplicate values in
$B
(where we now get verbose output):
$ ./isomorphic-strings-roundrobin -v sum add
: Array: [("s", "a"), ("u", "d"), ("m", "d")]
: Pair: s -> a
: Pair: u -> d
: Pair: m -> d
: Redeclaration of value (in $B).
0
#! /usr/bin/env perl
use strict;
use feature 'say';
use List::Util 'uniq';
use Getopt::Long;
my $verbose = 0;
GetOptions("verbose" => \$verbose);
my $A = shift(@ARGV) // die 'Please specify $A';
my $B = shift(@ARGV) // die 'Please specify $B';
if (length($A) != length($B))
{
say ": Different length" if $verbose;
say 0;
exit;
}
my %A;
for my $index (0 .. length $A -1)
{
my $pair0 = substr($A, $index, 1);
my $pair1 = substr($B, $index, 1);
say ": Pair: $pair0 -> $pair1" if $verbose;
if (defined %A{$pair0})
{
if ($A{$pair0} eq $pair1)
{
say ": Duplicate of $pair0 (value: $pair1)" if $verbose;
next;
}
say ": Redeclaration of $pair0 (values: " . $A{$pair0} . " and "
. $pair1 . ")" if $verbose;
say 0;
exit;
}
$A{$pair0} = $pair1;
}
if (scalar keys %A == scalar uniq values %A)
{
say 1;
}
else
{
say ': Redeclaration of value (in $B).' if $verbose;
say 0;
}
Running it gives the same result as the final Raku version («isomorphic-strings-roundrobin»):
$ ./isomorphic-strings-perl abc xyz
1
$ ./isomorphic-strings-perl abb xyy
1
$ ./isomorphic-strings-perl sum add
0
$ ./isomorphic-strings-perl add sum
0
Input $S = (1,4), (8,10); $N = (2,6)
Output: (1,6), (8,10)
Example 2:
Input $S = (1,2), (3,7), (8,10); $N = (5,8)
Output: (1,2), (3,10)
Example 3:
Input $S = (1,5), (7,9); $N = (10,11)
Output: (1,5), (7,9), (10,11)
I have chosen to set up a custom class for the intervals, so that I could use class methods for checking if two intervals are mergable - and then merge them.
File: insert-interval
#! /usr/bin/env raku
unit sub MAIN (:$S = "(1,4),(8,10)", :$N ="(2,6)", :v(:$verbose)); # [1]
my $s = $S.EVAL; # [1a]
my $n = $N.EVAL; # [1a]
class interval # [2]
{
has $.start; # [2a]
has $.stop; # [2b]
method inside ($value) # [3]
{
return $.start <= $value <= $.stop;
}
method mergable (interval $new) # [4]
{
return True if $.start -1 < $new.start < $.stop +1;
return True if $.start -1 < $new.stop < $.stop +1;
return False;
}
method merge (interval $new) # [5]
{
return interval.new(start => min($.start, $new.start),
stop => max($.stop, $new.stop));
}
method Str # [6]
{
return "({ $.start },{ $.stop })";
}
}
my @all = @$s.map({ interval.new(start => $_[0], stop => $_[1]) }); # [7]
my $m = interval.new(start => $n[0], stop => $n[1]); # [8]
@all.push($m); # [9]
@all = @all.sort: { $^a.start <=> $^b.start || $^a.stop <=> $^b.stop };
# [10]
my @result; # [11]
my $first = @all.shift; # [12]
loop # [13]
{
last unless @all.elems; # [14]
my $second = @all.shift; # [15]
if $first.mergable($second) # [16]
{
my $new = $first.merge($second); # [17]
$first = $new; # [17a]
next; # [17b]
}
elsif $first.stop < $second.start + 1 # [18]
{
@result.push($first); # [18a]
$first = $second; # [18b]
next; # [18c]
}
else # [19]
{
@result.append($first, $second, @all); # [19a]
$first = Any; # [19b]
last; # [19c]
}
}
@result.push($first) if $first; # [20]
say @result.join(", "); # [21]
[1] Specifying a data structure on the command line is difficult, but a
string (the quotes are there to prevent the shell from playing havoc with the parens) in
combination with EVAL
[1a] does the trick. (Note that EVAL
does not run code when used like this, and trying to do so will cause an error.)
[2] The «interval» class, with the two class variables for the start and stop values.
[3] A method that tells us if a given value is inside the interval. (It is not used, but here it is anyway.)
[4] A method that tells us if two intervals can be merged (are overlapping).
[5] A method that merges two intervals. Do check if they are mergable first.
[6] Stringification of the objects, used by [21].
[7] Get a list of «interval» objects from the input.
[8] Ditto for the one interval to add,
[9] and add it to the list.
[10] Sort the list, with the lowest start value first. If several, get the one with the highest stop value first.
[11] We are going to store the result (as a sorted list of «interval» objects here.
[12] Get the first element.
[13] An eternal loop,
[14] • until we have emptied the array.
[15] Get the next element.
[16] Can we merge the two elements?
[17] • if so, merge them.
[18] Is the first one fully before the secons one? If so add it to the list [18a], and prepare for the nect iteration [18a,b,c].
[19] Otherwise (we are done, as they are not overlapping), add them both to the list [19a] and we are done.
[20] Add the current interval, if any.
[21] Stringify the objects (see [6]) and print them.
See
docs.raku.org/routine/EVAL for
more information about EVAL
.
Running it:
$ ./insert-interval -S="(1,4), (8,10)" -N="(2,6)"
(1,6), (8,10)
$ ./insert-interval -S="(1,2), (3,7), (8,10)" -N="(5,8)"
(1,2), (3,10)
$ ./insert-interval -S="(1,5), (7,9)" -N="(10,11)"
(1,5), (7,9), (10,11)
We got the same result as the page linked to in the challenge.
It can be argued that adjacent integer intervals, as we have in the second and third examples, should be merged. The challenge does not, so I'll support it with a command line option «--integer» - where we simply change the offsets used in «mergeable»:
File: insert-interval2 (partial)
unit sub MAIN (:$S = "(1,4),(8,10)", :$N ="(2,6)", :i($integer), :v(:$verbose));
my $limit = $integer ?? 2 !! 1;
method mergable (interval $new)
{
return True if $.start -$limit < $new.start < $.stop +$limit;
return True if $.start -$limit < $new.stop < $.stop +$limit;
return False;
}
Running it:
$ ./insert-interval2 -S="(1,4), (8,10)" -N="(2,6)"
(1,6), (8,10)
$ ./insert-interval2 -S="(1,4), (8,10)" -N="(2,6)" -i
(1,6), (8,10)
$ ./insert-interval2 -S="(1,2), (3,7), (8,10)" -N="(5,8)"
(1,2), (3,10)
$ ./insert-interval2 -S="(1,2), (3,7), (8,10)" -N="(5,8)" -i
(1,10)
$ ./insert-interval2 -S="(1,5), (7,9)" -N="(10,11)" -i
(1,5), (7,11)
$ ./insert-interval2 -S="(1,5), (7,9)" -N="(10,11)"
(1,5), (7,9), (10,11)
Verbose mode is available as a command line option, but is missing from the program due to lack of time to implement it fully.
And that's it.