Excelled Find with Raku

by Arne Sommer

Excelled Find with Raku

[73] Published 13. May 2020.

This is my response to the Perl Weekly Challenge #60.

Challenge #60.1: Excel Column

Write a script that accepts a number and returns the Excel Column Name it represents and vice-versa.

Excel columns start at A and increase lexicographically using the 26 letters of the English alphabet, A..Z. After Z, the columns pick up an extra “digit”, going from AA, AB, etc., which could (in theory) continue to an arbitrary number of digits. In practice, Excel sheets are limited to 16,384 columns.

Example

Input Number: 28 Output: AB

Input Column Name: AD Output: 30

The problem with the Excel numbering system is that it doesn't have a zero value. In our decimal system we have 10 digits (0..9), and the first is the zero value. When we reach the end (9), we add a new digit in front (1) and set the last one to the zero value (0).

We can solve this by pretending that 'A' is the zero value. Then we can use an array with the letters (where 'A' has index 0) to get the them.

File: excel (partial)
constant A := ord('A');                                 # [1]

sub to-excel (Int $number is copy)                      # [2]
{
  $number--;                                            # [3]

  my $div = $number mod 26;                             # [4]
  my $exp = $number div 26;                             # [5]

  return to-excel($exp) ~ ('A' .. 'Z')[$div] if $exp;   # [6]

  return ('A' .. 'Z')[$div];                            # [7]
}

[1] The ascii (and unicode) location of the letter 'A', as a starting point for the letters.

[2] Translate a (column) number to an Excel column ID. I have added is copy so that we can change the value (in [3]).

[3] 'A' is at offset 0 in the array (and is considered the zero value). Compensate for the input starting with 1 as the lowest value.

[4] How many times can we divide the number by 26 (integer division).

[5] and the remainer, after the above (integer division).

[6] If the value was greater than 26 (i.e. more letters), do a recursive call where we add on additional letters to the left of whatever we have gotten so far.

[7] Else (less than 26), return the corresponding character.

See docs.raku.org/routine/ord for more information about ord.

See docs.raku.org/routine/mod for more information about the integer modulo operatormod.

See docs.raku.org/routine/div for more information about the integer division operator.div.

Using alphabet arrays may not bee that efficient. We can replace them with a combination of ord and chr:

sub to-excel (Int $number is copy)
{
  $number--;

  my $div = $number mod 26;
  my $exp = $number div 26;

  return to-excel($exp) ~ chr(A + $div) if $exp;   # [1]

  return chr(A + $div);                            # [1]
}

[1] 'A' had index 0 in the array, and this behaves the same way (as chr(A + 0) gives 'A'). chr gives us the character with the given position in the Unicode character set.

The downside is that the need for $number-- is not obvious anymore.

See docs.raku.org/routine/chr for more information about chr.

The other direction:

File: excel (partial)
sub from-excel ($str)                      # [1]
{
  my @elems = $str.comb;                   # [2]

  my $sum = 0;                             # [3]

  for @elems                               # [4]
  {
    $sum = $sum * 26 + ord($_) - A + 1;    # [5]
  }
  return $sum;                             # [3a]
}

[1] Translate an Excel column ID to a (column) number.

[2] Split the string into separate characters.

[3] The final number, which we return in [3a].

[4] Iterate over the letters, from the left.

[5] • multiply any existing value with 26 (to compensate for the additional digit we are adding), and add the value of the new digit.

File: excel (the rest)
multi MAIN (Int $int)                              # [1]
{
  say to-excel($int);
}

multi MAIN (Str $str where $str ~~ /^<[A..Z]>+$/)  # [2]
{
  say from-excel($str);
}

[1] Two multi MAIN candidates. This one takes an integer (and translates it to an Excel column ID).

[1] This one takes a string consisting of of one or more uppercase English letters. The program will fail without the where clause, as the input (for both) is of the IntStr type and the program cannot decide which to choose.

Running it with the values given in the challenge:

$ raku excel 28
AB

$ raku excel AD
30

The upper limit (and beyond):

$ raku excel 16384 
XFD

$ raku excel XFD
16384

$ raku excel 999999999999
DTMCHRXUM

$ raku excel PERLWEEKLYCHALLENGE
478147880915977152353483899

Challenge #60.2: Find Numbers

Write a script that accepts list of positive numbers (@L) and two positive numbers $X and $Y.

The script should print all possible numbers made by concatenating the numbers from @L, whose length is exactly $X but value is less than $Y.

Example

Input:

@L = (0, 1, 2, 5);
$X = 2;
$Y = 21;

Output:

10, 11, 12, 15, 20

There are several issues here:

  • Positive Numbers do not include zero, but it is included in the example. So we should allow them.
  • Positive Numbers are not limited to integers, but non-integers doesn't make sense here, and we should not allow them.
  • Are the numbers (now assumed to be integers) limited to one digit, or do we allow values greater than 9? I'll go for the latter.

The numbers are either «integers > 0» ($X and $X) or «integers >= 0» (@L). We can use custom types to enfore that:

subset PositiveIntZero of Int where * >= 0;
subset PositiveInt     of Int where * >  0;

They are borrowed from my Ackerman, URL and Raku article, which was my response to the Perl Weekly Challenge #17, and are described there.

Enforcing the types can be done like this:

unit sub MAIN (
    PositiveInt $X,                                       # [1]
    PositiveInt $Y,                                       # [1]
    *@L where @L.elems > 1 && all(@L) ~~ PositiveIntZero  # [2]
              # 3 ########     # 4 #####################
);

[1] The two first arguments, with types.

[2] The slurpy argument (which gobbles up the rest of the arguments) must be placed last in the argument list (of non-named arguments), so here it is. The challenge implied that it should come first, but that is not possible. A slurpy argument cannot be type constrained in the normal way, but we can do it with a where clause (or clauses, as done here).

[3] A slurpy argument allows zero values in the array, so we do this to ensure at least one value.

[4] Using an all junction with smartmatch against the required type ensures that all the elemens are of the given type.

See docs.raku.org/routine/all for more information about all, and docs.raku.org/type/Junction for more information about Junctions.

The actual program can be done in a single line of code (shown here with a lot of newlines to make it easier to add comments):

(@L xx $X)                          # [1]
.flat                               # [2]
.combinations($X)                   # [3]
>>.join                             # [4]
>>.Int                              # [5]
.unique                             # [6]
.sort                               # [7]
.grep({ .chars == $X && $_ < $Y })  # [8]
.join(", ")                         # [9]
.say;                               # [10]

[1] We use the List Repetition Operator xx to duplicate all the elements as many times as the length of the output string, as we are allowed to use duplicates (and combinations doesn't).

[2] get one array out of the above.

[3] Get all the combinations of $X elements, as a list of lists.

[4] The hyper method call operator >>. joins the inner lists (each combination) to a string, leaving a list of strings.

[5] This one removes leading zeros, if any, as e.g. «02» is not a valid two digit number.

[6] Remove duplicates (as we added a lot of possibilities for that in [1]).

[7] Sort the list (so that the output is nicer). The list is sorted numerically, because of [5]. If not we would have gotten string sorting.

[8] Only allow numbers with the correct number of digits (the first part), and below the upper limit (the second part).

[9] As the challenge wanted a comma separated list.

[10] And display it.

See docs.raku.org/routine/xx for more information about the List Repetition Operator xx.

See docs.raku.org/routine/combinations for more information about combinations.

See docs.raku.org/language/operators#index-entry-methodop_>>. for more information about the hyper method call operator.

Running it, with the values given in the challenge, and then some more:

$ raku find-numbers 2 21 0 1 2 5
10, 11, 12, 15, 20

$ raku find-numbers 3 200 1 2 3 4
111, 112, 113, 114, 121, 122, 123, 124, 131, 132, 133, 134, 141, 142, 143, 144

$ raku find-numbers 3 200 1 22 3 4
111, 113, 114, 131, 133, 134, 141, 143, 144

That clearly should have included «122».

This one should give a single answer, «200», which it does not:

$ raku find-numbers 3 200 111 7 8 9

The problem is that we require the same number of elemenets from the combinations as we have digits (the value of $X), and that makes any combination of «111» to a 5 digit number. Which is too long for the 3 digit limitation.

The solution is to allow combinations with fewer elements as well:

File: find-numbers-fixed (changes only)
.combinations(1..$X)  # [1]

[1] Allow any combination of 1 and up to (and including) $X elements from the array.

Runnin it:

$ raku find-numbers-fixed 2 21 0 1 2 5
10, 11, 12, 15, 20

$ raku find-numbers-fixed 3 200 1 22 3 4
111, 113, 114, 122, 131, 133, 134, 141, 143, 144

$ raku find-numbers-fixed 3 200 111 7 8 9
111

We could have insisted on single digit number in @L instead, but that would only have made a less capable program.

And that's it.