Words of a Sort
with Raku

by Arne Sommer

Words of a Sort with Raku

[253] Published 10. September 2023.

This is my response to The Weekly Challenge #233.

Challenge #233.1: Similar Words

You are given an array of words made up of alphabets only.

Write a script to find the number of pairs of similar words. Two words are similar if they consist of the same characters.

Example 1:
Input: @words = ("aba", "aabb", "abcd", "bac", "aabc")
Output: 2

Pair 1: similar words ("aba", "aabb")
Pair 2: similar words ("bac", "aabc")
Example 2:
Input: @words = ("aabb", "ab", "ba")
Output: 3

Pair 1: similar words ("aabb", "ab")
Pair 2: similar words ("aabb", "ba")
Pair 3: similar words ("ab", "ba")
Example 3:
Input: @words = ("nba", "cba", "dba")
Output: 0

This requires two loops; we compare each word (loop 1) with the remaining words (those to the right of them) (loop 2).

File: similar-words
#! /usr/bin/env raku

unit sub MAIN (*@words where @words.elems > 0,             # [1]
               :w(:$very-verbose),                         # [2]
               :v(:$verbose) = $very-verbose);             # [2a]

my @result;                                                # [3]

while @words                                               # [4]
{
  my $first           = @words.shift;                      # [5]
  my $first-canonical = $first.comb.sort.unique.join;      # [6]

  say ":First $first -> $first-canonical" if $very-verbose;

  for @words -> $second                                    # [7]
  {
    my $second-canonical = $second.comb.sort.unique.join;  # [8]

    say ":- Second $second -> $second-canonical" if $very-verbose;

    if $first-canonical eq $second-canonical               # [9]
    {
      @result.push( ($first, $second) );                   # [10]
      say ":Pair {@result.elems }: similar words (\"$first\", \"$second\")"
        if $verbose;
    }
  }
}

say @result.elems;                                         # [11]

[1] Ensure at least one word.

[2] Note the default value for «verbose» [2a], so that «very-verbose» enables both.

[3] The result will end up here (even if we do not really need them; only the count).

[4] As long as the array is non-empty (the first loop).

[5] Get the first word (with shift; thus removing it from the array).

[6] The canonical form of the first word; a sorted string without duplicate letters.

[7] Iterate over the remaining words (the second loop).

[8] The canonical form of the second word.

[9] Are they equal (canonically-wise)?

[10] if so, add the words (the original words, not the canonical versions) to the result.

[11] Print the number of pairs.

Running it:

$ ./similar-words aba aabb abcd bac aabc
2

$ ./similar-words aabb ab ba
3

$ ./similar-words nba cba dba
0

Looking good.

With verbose mode:

$ ./similar-words -v aba aabb abcd bac aabc
:Pair 1: similar words ("aba", "aabb")
:Pair 2: similar words ("bac", "aabc")
2

$ ./similar-words -v aabb ab ba
:Pair 1: similar words ("aabb", "ab")
:Pair 2: similar words ("aabb", "ba")
:Pair 3: similar words ("ab", "ba")
3

$ ./similar-words -v nba cba dba
0

And with «very verbose» mode:

$ ./similar-words -w aba aabb abcd bac aabc
:First aba -> ab
:- Second aabb -> ab
:Pair 1: similar words ("aba", "aabb")
:- Second abcd -> abcd
:- Second bac -> abc
:- Second aabc -> abc
:First aabb -> ab
:- Second abcd -> abcd
:- Second bac -> abc
:- Second aabc -> abc
:First abcd -> abcd
:- Second bac -> abc
:- Second aabc -> abc
:First bac -> abc
:- Second aabc -> abc
:Pair 2: similar words ("bac", "aabc")
:First aabc -> abc
2

$ ./similar-words -w aabb ab ba
:First aabb -> ab
:- Second ab -> ab
:Pair 1: similar words ("aabb", "ab")
:- Second ba -> ab
:Pair 2: similar words ("aabb", "ba")
:First ab -> ab
:- Second ba -> ab
:Pair 3: similar words ("ab", "ba")
:First ba -> ab
3

$ ./similar-words -w nba cba dba
:First nba -> abn
:- Second cba -> abc
:- Second dba -> abd
:First cba -> abc
:- Second dba -> abd
:First dba -> abd
0

The First and Second labled verbose rows show the word themselves (before the ->), and the canonical version after it.

Challenge #233.2: Frequency Sort

You are given an array of integers.

Write a script to sort the given array in increasing order based on the frequency of the values. If multiple values have the same frequency then sort them in decreasing order.

Example 1:
Input: @ints = (1,1,2,2,2,3)
Ouput: (3,1,1,2,2,2)

'3' has a frequency of 1
'1' has a frequency of 2
'2' has a frequency of 3
Example 2:
Input: @ints = (2,3,1,3,2)
Ouput: (1,3,3,2,2)

'2' and '3' both have a frequency of 2, so they are sorted in decreasing order.
Example 3:
Input: @ints = (-1,1,-6,4,5,-6,1,4,1)
Ouput: (5,-1,4,4,-6,-6,1,1,1)

We can use the Bag type to get the frequency for free:

> say (1, 1, 2, 2, 2, 3).Bag.raku;  # -> (2=>3,3=>1,1=>2).Bag
> say (1, 1, 2, 2, 2, 3).Bag.raku;  # -> (1=>2,2=>3,3=>1).Bag

The key (the part before the =>) is the value (integer), and the value is the frequency. The order is random, as shown by the second example.

See docs.raku.org/type/Bag for more information about the Bag type.

File: frequency-sort
#! /usr/bin/env raku

unit sub MAIN (*@ints where @ints.elems > 1 && all(@ints) ~~ Int);      # [1]

my $bag = @ints.Bag;                                                    # [2]

say "(", $bag.sort({ $^a.value <=> $^b.value || $^b.key <=> $^a.key })  # [3]
	     .map({ $_.key xx $_.value })                               # [4]
	     .flat                                                      # [5]
	     .join(","), ")";                                           # [6]

[1] Ensure that we have at least one element, and all of them must be integers.

[2] Use the Bag type.

[3] Sort the bag on the values (the first comparison), and then the keys (the second comparison) if the values are the same. Note that $^a is aliased to the first argument, and $^b to the second argument in the comparisons. In the second comparison we have swapped the order of the variables, so the sort will be the other way round.

[4] The result of the sort is a list. We use map to convert each pair (key, value) to a list with the number of integers of the given type, with the list repetition xx operator.

See docs.raku.org/routine/xx for more information about the list repetition operator xx.

[5] The result of the above is a list of lists, so we flatten it to a single list.

[6] Pretty print the result.

Running it:

$ ./frequency-sort 1 1 2 2 2 3
(3,1,1,2,2,2)

$ ./frequency-sort 2 3 1 3 2
(1,3,3,2,2)

$ ./frequency-sort 1 -1 -6 4 5 -6 1 4 1
(5,-1,4,4,-6,-6,1,1,1)

Looking good.

Note that I swapped the first and second values on the third example, as a minus sign on the first argument will not go down nicely with the program (as it is taken as a command line option, and an illegal one at that).

We can avoid the problem by inserting a double dash where we want the command line option passing to halt, like this:

$ ./frequency-sort -- -1 1 -6 4 5 -6 1 4 1
(5,-1,4,4,-6,-6,1,1,1)

What about kxxv?

Converting a Bag to a list is something that is quite common, so Raku has a special operator kxxv for just that.

See docs.raku.org/routine/kxxv for more information about kxxv.

But we cannot use it in our porgram, instead of [4] and [5], as kxxv must be applied to a Bag and not a list. And we have a list, courtesy of the sort [3].

And that's it.