This is my response to The Weekly Challenge #229.
Input: @str = ("abc", "bce", "cae")
Output: 1
In the given array "cae" is the only element which is not
lexicographically sorted.
Example 2:
Input: @str = ("yxz", "cba", "mon")
Output: 2
In the given array "yxz" and "mon" are not lexicographically sorted.
The only sane way of doing this is to use the Unicode sorting order (similar to Isolatin and even Ascii).
File: lexi-order
#! /usr/bin/env raku
unit sub MAIN (*@str where @str.elems > 0, :v(:$verbose)); # [1]
sub is-lexi-sorted ($string) # [2]
{
return True if [<=] $string.ords; # [3]
return [>=] $string.ords; # [4]
}
say ": OK: { @str.grep(*.&is-lexi-sorted).map('"' ~ * ~ '"').join(", ") }"
if $verbose; # [7]
my @not-ok = @str.grep( ! *.&is-lexi-sorted ); # [5]
say ": Not OK: { @not-ok.map('"' ~ * ~ '"').join(", ") }" if $verbose;
say @not-ok.elems; # [6]
[1] Ensure at least one string.
[2] The procedure deciding if the given string is lexicographically sorted.
[3]
We use the Reduction Metaoperator []
together
with the «numerically less than or equal» operator <=
to tell us
if the assertion (less than or equal) applies to every neighbourly values in
the list of Unicode codepoints. The codepoints are supplied by ords
,
which works on strings, returning the values for each character in that string.
We return True
if this assertion holds.
See
docs.raku.org/language/operators#Reduction_metaoperators for more
information about the Reduction Metaoperator []
.
See
docs.raku.org/routine/ords
for more information about ords
.
[4] The «greather than or equal» assertion. This returns true
or False
.
[5] Get the offending strings (i.e. those not lexicographically sorted), by negating the procedure in [2].
[6] Print the number of offending strings.
[7] Note that this grep
expression gives us the explicit result;
a list where we have deleted the offending strings. This list is not actually
used anywhere, except by verbose mode, but here it is.
Running it:
$ ./lexi-order "abc" "bce" "cae"
1
$ ./lexi-order "yxz" "cba" "mon"
2
Looking good.
With verbose mode:
$ ./lexi-order -v "abc" "bce" "cae"
: OK: "abc", "bce"
: Not OK: "cae"
1
$ ./lexi-order -v "yxz" "cba" "mon"
: OK: "cba"
: Not OK: "yxz", "mon"
2
So far, so good.
But what about upper case letters, or even non-letters, in the strings (in addition to the prescribed lower case letters). This is a case (pun intended) of sorting order blues...
$ ./lexi-order -v "abc" "Abc" "aBc" "abC"
: OK: "abc", "Abc"
: Not OK: "aBc", "abC"
2
In Computer Science the uppercase letters precede the lower case variants, as shown here. In the real world (i.e. a dictionary) they are considered identical.
The challenge specifies strings and not words, so it is reasonable to postulate that the result we got above is indeed correct. Especially if we add non-letters to the mix:
$ ./lexi-order -v "abc12" "abc21" "12abc" "21abc"
: OK: "12abc"
: Not OK: "abc12", "abc21", "21abc"
3
A librarian will almost certainly disagree, though. Here is a librarian-friendly version of the program:
File: lexi-order-librarian
#! /usr/bin/env raku
unit sub MAIN (*@str where @str.elems > 0
&& all(@str) ~~ /^<[a..z A..Z]>+$/, # [1]
:v(:$verbose));
sub is-lexi-sorted ($string)
{
return True if [<=] $string.lc.ords; # [2]
return [>=] $string.lc.ords; # [2]
}
say ": OK: { @str.grep(*.&is-lexi-sorted).map('"' ~ * ~ '"').join(", ") }"
if $verbose;
my @not-ok = @str.grep( ! *.&is-lexi-sorted );
say ": Not OK: { @not-ok.map('"' ~ * ~ '"').join(", ") }" if $verbose;
say @not-ok.elems;
[1] Ensure lower and uppercase Ascii letters (A to Z) only in the strings.
[2] Coerce the string to lowercase before getting the codepoints.
Running it gives the expected result:
$ ./lexi-order-librarian -v "abc" "Abc" "aBc" "abC"
: OK: "abc", "Abc", "aBc", "abC"
: Not OK:
0
Note that non-ascii letters (as well as non-letters) are illegal, courtesy of the explicit regex in [1]. That is intentional, as non-ascii characters come with a code point that is somewhat random. If we were to do this correctly, we would have to take the current language into account - as the sorting order is context sensitive. Let us not descend into that rabbit hole...
The notion of "current language" is also a problem, as it must take the input strings into account, and not rely on the computer running the program (the language locale setting).
Also note that the Germans would have a field day of my use of .lc
,
as the lowercase "ß" is translated to "SS" in uppercase. This does not round
trip:
> say "straße".uc; # -> STRASSE
> say "STRASSE".lc; # -> strasse
That was awfully close to the rabbit hole, so let us stop there...
Input: @array1 = (1, 1, 2, 4)
@array2 = (2, 4)
@array3 = (4)
Ouput: (2, 4)
Example 2:
Input: @array1 = (4, 1)
@array2 = (2, 4)
@array3 = (1, 2)
Ouput: (1, 2, 4)
File: two-out-of-tree
#! /usr/bin/env raku
unit sub MAIN ($one, $two, $three, :v(:$verbose)); # [1]
my @array1 = $one.words; # [2]
my @array2 = $two.words; # [2a]
my @array3 = $three.words; # [2b]
my $bag = (@array1.unique, @array2.unique, @array3.unique).Bag; # [3]
my $ok = $bag.grep( *.value >= 2 ); # [4]
say ":All: { $bag.raku }" if $verbose;
say ":Ok: { $ok.raku }" if $verbose;
say $ok.elems; # [5]
[1] Specify the three strings with quotes (thus ensuring that the spaces work as value separators, within the strings).
[2] Use words
to get the list of integers.
[3] This is very easy, really. Get rid of duplicates (if any) in
each array (with unique
), add the duplicate-free arrays together
(here with the comma operator), turn the result (which is an array with three
elements, each one an array) into a Bag
, a hash like structure
where the values in the list turns out as keys, and the values are the number of
times they occured.
See
docs.raku.org/type/Bag
for more information about the Bag
type.
[4] Then we use grep
to get those values (the value
method gives us the frequency, and the key
method would give the
original value) that occured two or more times.
[5] Print the number of matches, i.e. the number of Pair objects (key, value) in the Bag.
Note the missing check on the input values for integerness. One could be forgiven for trying something like this, preferable between lines [2b] and [3]:
die "array1 must contain integers only" unless all(@array1) ~~ Int;
die "array2 must contain integers only" unless all(@array2) ~~ Int;
die "array3 must contain integers only" unless all(@array3) ~~ Int;
But this will not work, as words
returns strings. Use something
like this instead, if you absolutely feel the need:
die "array1 must contain integers only" unless all(@array1) ~~ /^<[0..9]>+$/;
Beware that it does not allow for negative values, and nonsense values as «000» and «007» are allowed.
The challenge does indeed specify integers, but it really does not matter. As long
as we use one or more spaces between the values to make words
happy.
Running it:
$ ./two-out-of-tree "1 1 2 4" "2 4" "4"
2
$ ./two-out-of-tree "1 1 2 -4" "2 -4" "4"
2
$ ./two-out-of-tree "a b c" "d e a" "b"
2
Looking good.
With verbose mode:
$ ./two-out-of-tree -v "1 1 2 4" "2 4" "4"
:All: ("1"=>1,"4"=>3,"2"=>2).Bag
:Ok: $(("4" => 3, "2" => 2).Seq)
2
$ ./two-out-of-tree -v "1 1 2 -4" "2 -4" "4"
:All: ("-4"=>2,"4"=>1,"2"=>2,"1"=>1).Bag
:Ok: $(("-4" => 2, "2" => 2).Seq)
2
$ ./two-out-of-tree -v "a b c" "d e a" "b"
:All: ("c"=>1,"b"=>2,"d"=>1,"e"=>1,"a"=>2).Bag
:Ok: $((:b(2), :a(2)).Seq)
2
Negative values does not quite work out:
$ ./two-out-of-tree "-1 2 3" "4 5 6" "1 001"
Usage:
./two-out-of-tree [-v|--verbose[=Any]] <one> <two> <three>
$ ./two-out-of-tree "1 -2 3" "4 5 6" "1 001"
1
So words
may not be that great a choice. Let us do it the hard way:
my @array1 = $one.split(/\s+/);
my @array2 = $two.split(/\s+/);
my @array3 = $three.split(/\s+/);
This does not help, giving the exact same result as above. So it is the shell that is the problem; not the program. (Using single quotes instead of double quotes does not help.)
The solution: Do not specify a negative value as the first value in the first string. (This is ok as the order does not matter, but it will cause problems if we were to use only negative values.)
Oh, well. Another rabbit hole...
And that's it.