Diff Sum by Raku

by Arne Sommer

Diff Sum by Raku

[68] Published 18. April 2020.

This is my response to the Perl Weekly Challenge #56.

Challenge #56.1: Diff-K

You are given an array @N of positive integers (sorted) and another non negative integer k.

Write a script to find if there exists 2 indices i and j such that A[i] - A[j] = k and i != j.

It should print the pairs of indices, if any such pairs exist.

Example:

    @N = (2, 7, 9)
    $k = 2

Output : 2,1

Beware of the wording: «non negative integer» includes zero, whereas «positive integer» does not.

File: diff-k:
subset PositiveInt0 of Int where * >= 0;  # [1]
subset PositiveInt  of Int where * >= 1;  # [2]

unit sub MAIN
  (PositiveInt0 $k where $k >= 0,                                 # [1]
   *@array where @array.elems > 1 && all(@array) ~~ PositiveInt,  # [3]
   :$verbose);

my $last = @array.end;                                            # [4]

die "Array @array[] is not sorted." unless [<=] @array;           # [5]

say ": Diff: $k" if $verbose;

for 0 .. $last -> $j                                              # [6]
{
  for $j +1 .. $last -> $i                                        # [7]
  {
    say ": I:$i (@array[$i]) -> J:$j (@array[$j])" if $verbose;
    say "$i, $j" if @array[$i] - @array[$j] == $k;                # [8]
  }
}

[1] A custom type for Positive Integers.

[2] Another custom type, for Positive Integers and zero.

[3] We cannot add a type constraint on a slurpy array, but we can smartmatch all the values with an all junction. A slurpy array allows an empty array, so we use elems to avoid that.

[4] .last gives us the index of the last element in the array. It is the same as .elems - 1, but is shorter.

[5] The array is sorted, according to the challenge, but we can check, and exit if it isn't. The Reduction Metaoperator, given with [], is computed between each element in the list. We specify the operator <), and the expression will fail (and the program dies) if one of the values has a lower value than the one before it (i.e. to the left).

[6] Iterate over all the indicis.

[7] • iterate over the remaining indicis (after the current one in [6]).

[8] Report a match if the expression pans out.

Running it:

$ raku diff-k 2 2 7 9
2, 1

$ raku diff-k --verbose 2 2 7 9
: Diff: 2
: I:1 (7) -> J:0 (2)
: I:2 (9) -> J:0 (2)
: I:2 (9) -> J:1 (7)
2, 1

$ raku diff-k 0 1 1 1
1, 0
2, 0
2, 1

$ raku diff-k --verbose 0 1 1 1
: Diff: 0
: I:1 (1) -> J:0 (1)
1, 0
: I:2 (1) -> J:0 (1)
2, 0
: I:2 (1) -> J:1 (1)
2, 1

This is fine for small arrays, but the second for loop will continue looking even if the difference is higher than $k. The difference will only increase as we go on, as the array is sorted. This isn't a problem in practice for small arrays, as the one given in the challenge, but we should fix it anyway:

File: diff-k2 (changes only)
  for $j +1 .. $last -> $i
  {
    say ": I:$i (@array[$i]) -> J:$j (@array[$j])" if $verbose;
    say "$i, $j" if @array[$i] - @array[$j] == $k;
    last if @array[$i] - @array[$j] > $k;
  }

We can see the number of computations (sort of) this save us:

$ raku diff-k 2 2 7 9 10 10 10 10 10 10 10 10 10 11 12 13 14 15 16 17 18|wc
     17      34     109

$ raku diff-k --verbose 2 2 7 9 10 10 10 10 10 10 10 10 10 11 12 13 14 15 16 17 18|wc
    208    1177    4622

$ raku diff-k2 --verbose 2 2 7 9 10 10 10 10 10 10 10 10 10 11 12 13 14 15 16 17 18|wc
    113     607    2377

This gives ut 17 matches (the first command). Verbose mode has those matches as well, and the $k value. That gives us 190 computations for «diff-k» and 95 for «diff-k2», which is about half. That is pretty good.

Note that the extra line we added is a computation on its own, so it will add some overhead. For small arrays this extra line can actually make the program slower, but it should be fast enough anyway.

Challenge #56.2: Path Sum

You are given a binary tree and a sum, write a script to find if the tree has a path such that adding up all the values along the path equals the given sum. Only complete paths (from root to leaf node) may be considered for a sum.

Example
Given the below binary tree and sum = 22,

          5
         / \
        4   8
       /   / \
      11  13  9
     /  \      \
    7    2      1

For the given binary tree, the partial path sum 5 → 8 → 9 = 22 is not valid.

The script should return the path 5 → 4 → 11 → 2 whose sum is 22.

I have chosen to model the binary tree as node objects:

File: path-sum (partial)
unit sub MAIN (:$verbose);

class BinaryNode
{
  has Int        $.value;   # [1]
  has BinaryNode $.left;    # [2]
  has BinaryNode $.right;   # [3]
}

[1] Each node has a value (an Integer),

[2] a left child node (which can be undefined),

[3] and a right child node (which also can be undefined).

«You are given a binary tree» is a wonderful statement, as it glosses over the very big issue of how it should be given. We can do it like this, in one large (and ugly) declaration:

File: path-sum (partial)
my BinaryNode $btree
 = BinaryNode.new
   (
     value => 5,
     left  => BinaryNode.new
              (
                value => 4,
                left  => BinaryNode.new
                         (
                           value => 11,
                           left  => BinaryNode.new(value => 7),
                           right => BinaryNode.new(value => 2)
                         )
              ),
     right => BinaryNode.new
              (
                value => 8,
                left  => BinaryNode.new(value => 13),
                right => BinaryNode.new
                         (
                           value => 9,
                           right => BinaryNode.new(value => 1)
                         )
              )
   );

The indentation and newlines have been added to make it more readable (or perhaps, slightly less incomprehensible). Feel free to compare it with the illustration from the challenge, ensuring that I got it right. (I did.)

Then the rest of the program, done as a recursive procedure traversing the tree:

File: path-sum (the rest)
say $btree, "\n" if $verbose;

traverse($btree, 22, ());                             # [4]

sub traverse ($current, $target-sum, @path is copy)   # [5]
{
  if ($current.left or $current.right)                # [6]
  {
    @path.push: $current;                             # [7]

    traverse($current.left,  $target-sum, @path) if $current.left;   # [8]
    traverse($current.right, $target-sum, @path) if $current.right;  # [9]
  }
  else                                               # [10]
  {
    @path.push: $current;                            # [11]
    say ": " ~ @path>>.value.join(" -> ") if $verbose;
    say @path>>.value.join(" -> ") if @path>>.value.sum == $target-sum; # [12]
    return;                                          # [13]
  }
}

[4] A recursive procedure. It is called for each and every child node. The important parameter is path that holds the path down to the node we are to embark upon. We start if off at the top.

[5] Note is copy on the path, so that we can change a local copy of it. (Recursive procedures require local variables to work.)

[6] If the node has one or more children,

[7] • add it to the path, and

[8] • recurse down the left child, if given.

[9] • recurse down the right child, if given.

[10] Else (no children), and it is a leaf node that is a possible answer,

[11] Add it to the path (as we use the path to compute the value).

[12] Print the path if it has the target value. We have the objects in @path, so we apply .value on every one to get the values. The hyper method call operator >>. does that for us.

[13] This one actually doesn't do anything, as it would have returned anyway. But it can be nice to point out that we are stepping out of a recursive descent.

Running it:

$ raku path-sum
5 -> 4 -> 11 -> 2

See docs.raku.org/language/operators#index-entry-methodop_>>. for more information about the hyper method call operator.

I have chosen to push the nodes themselves onto the array. But since we are only interested in the values, we could push them instead:

File: path-sum2 (changes only)
sub traverse ($current, $target-sum, @path is copy)
{
  if ($current.left or $current.right)
  {
    @path.push: $current.value;

    traverse($current.left,  $target-sum, @path) if $current.left;
    traverse($current.right, $target-sum, @path) if $current.right;
  }
  else
  {
    @path.push: $current.value;
    say ": " ~ @path.join(" -> ") if $verbose;
    say @path.join(" -> ") if @path.sum == $target-sum;
    return;
  }
}

This version of the program gives the same result.

But what if we want to specify a different sum, and a different binary tree? The sum is trivial (adding a command line argument), but the tree is not.

Let us consider the tree given in the challenge. I have added stop nodes (marked with «*» where the parent has no child, as we need some way of telling that there are no child. (The nodes in the bottom row have no children, as there isn't any more rows.)

Now we can code the tree like this, an array with arrays, one for each row:

my @btree = ( (5), (4,8), (11, *, 13, 9), (7,2, *, *, *, 1) );

Turning this into a tree is doable:

File: path-sum-conf
unit sub MAIN (Int $sum = 22, :$verbose);  # [1]

class BinaryNode
{
  has Int        $.value;
  has BinaryNode $.left;
  has BinaryNode $.right;
}

my @btree = ( 5, (4, 8), (11, *, 13, 9), (7, 2, *, *, *, 1) );  # [2]

my @old-nodes;                            # [3]
my @new-nodes;                            # [4]

for @btree.reverse -> $row                # [5]
{
  my @current = @$row;                    # [6]
  @old-nodes  = @new-nodes;               # [7]
  @new-nodes  = ();                       # [8]
  
  for @current -> $value                  # [9]
  {
    if $value eq "*"                      # [10]
    {
      @new-nodes.push("*");               # [10a]
      next;                               # [10b]
    }

    my $left  = @old-nodes.shift // "*"; $left  = Nil if $left  eq "*"; # [11]
    my $right = @old-nodes.shift // "*"; $right = Nil if $right eq "*"; # [12]
    
    @new-nodes.push(BinaryNode.new(value => $value,                     # [13]
                                   left  => $left,
                                   right => $right)); 
  }
}

my $btree = @new-nodes[0];                # [14]

say $btree, "\n" if $verbose;

traverse($btree, $sum, ());               # [15]

sub traverse ($current, $target-sum, @path is copy)                     # [16]
{
...
}

The procedure starts with the bottom row in the array, in the for @btree.reverse -> $row loop; (7, 2, *, *, *, 1), and generates nodes for the integer values. The @old-nodes array is empty, so none of them has any children. The nodes, with the strings «*» in between, are placed in «@new-nodes».

The second iteration uses the row above; (11, *, 13, 9). The result from the first iteration (a list of objects) are moved to @old-nodes- For each of values in the list that is an integer, it creates a new node, with the children taken from the @old-nodes array (with shift). If the object is a string «*» we use the Any (undefined) value to indicate that there is no child.

This goes on, until we reach the top row and finishes. The result is an array with a single element. This element is the whole tree.

[1] Note the new argument giving the sum, and the default value of 22.

[2] The binary tree, on the syntax as described above.

[3] The old nodes, i.e. the ones in the row below the current one.

[4] The new nodes, which we create as we parse the current row.

[5] For each row in the three, starting at the bottom.

[6] Get the list of elements from the tree row.

[7] Put the nodes from the row below (the iteration before the current one) in the old nodes.

[8] Clear the new nodes (as we are staring with a new row).

[9] Iterate over the values in the current row.

[10] If the value is «*», we simply push the value to the new nodes list [10a] and continue with the next value [10b].

[11] Get the left child from the old nodes array. If the value is undefined (detected with the defined-or operator //), use "*" instead. Then set it to Any (which is the undefined value), if it is «*». If we skip the // part, the comparison will fail if the value is undefined.

[12] As [11], for the right child.

[13] Create the new node, with the children (if any).

[14] When the recursion has finished, we are left with an array with a single element. That is the binary tree (or the top node).

[15] Traverse the binary tree.

[16] The recursiveprocedure. See «path-sum2» for the procedure body.

Use verbose mode to see that this program creates the same tree as the first one:

$ raku path-sum2 --verbose
BinaryNode.new(value => 5, left => BinaryNode.new(value => 4, left =>
BinaryNode.new(value => 11, left => BinaryNode.new(value => 7, left =>
BinaryNode, right => BinaryNode), right => BinaryNode.new(value => 2,
left => BinaryNode, right => BinaryNode)), right => BinaryNode),
right => BinaryNode.new(value => 8, left => BinaryNode.new(value => 13,
left => BinaryNode, right => BinaryNode), right => BinaryNode.new(value
=> 9, left => BinaryNode, right => BinaryNode.new(value => 1,
left => BinaryNode, right => BinaryNode))))

: 5 -> 4 -> 11 -> 7
: 5 -> 4 -> 11 -> 2
5 -> 4 -> 11 -> 2
: 5 -> 8 -> 13
: 5 -> 8 -> 9 -> 1

$ raku path-sum-conf --verbose
BinaryNode.new(value => 5, left => BinaryNode.new(value => 4, left =>
BinaryNode.new(value => 11, left => BinaryNode.new(value => 7, left =>
BinaryNode, right => BinaryNode), right => BinaryNode.new(value => 2,
left => BinaryNode, right => BinaryNode)), right => BinaryNode),
right => BinaryNode.new(value => 8, left => BinaryNode.new(value => 13,
left => BinaryNode, right => BinaryNode), right => BinaryNode.new(value
=> 9, left => BinaryNode, right => BinaryNode.new(value => 1,
left => BinaryNode, right => BinaryNode))))

: 5 -> 4 -> 11 -> 7
: 5 -> 4 -> 11 -> 2
5 -> 4 -> 11 -> 2
: 5 -> 8 -> 13
: 5 -> 8 -> 9 -> 1

(It does. I have added newlines to make it easier to read the output.)

There are only 4 paths, and we get them all:

$ raku path-sum-conf 5
$ raku path-sum-conf 9
$ raku path-sum-conf 13
$ raku path-sum-conf 22
5 -> 4 -> 11 -> 2

$ raku path-sum-conf 20
$ raku path-sum-conf 21
$ raku path-sum-conf 23
5 -> 8 -> 9 -> 1

$ raku path-sum-conf 24
$ raku path-sum-conf 25
$ raku path-sum-conf 26
5 -> 8 -> 13

$ raku path-sum-conf 27
5 -> 4 -> 11 -> 7

$ raku path-sum-conf 28
$ raku path-sum-conf 29
$ raku path-sum-conf 30

That was fun. But this is kind of pointless unless we add the posibility to specify the tree on the command line. So let us do just that.

REPL is excellent for experimenting with this. I ended up with this line of code:

> "5 | 4 8 | 11 * 13 9 | 7 2 * * * 1".split("|")>>.words
((5) (4 8)) (11 * 13 9) (7 2 * * * 1))

Except that it isn't entirely the same:

> my @btree = ( (5), (4,8), (11, *, 13, 9), (7,2, *, *, *, 1) );
[5 (4 8) (11 * 13 9) (7 2 * * * 1)]

> @btree.raku
[5, (4, 8), (11, *, 13, 9), (7, 2, *, *, *, 1)]

> "5 | 4 8 | 11 * 13 9 | 7 2 * * * 1".split("|")>>.words.raku
(("5",).Seq, ("4", "8").Seq, ("11", "*", "13", "9").Seq,
 ("7", "2", "*", "*", "*", "1").Seq)

The Seq (as in «Seqence») part doesn't really matter, but we can get rid of them if it bothers anybody:

> "5 | 4 8 | 11 * 13 9 | 7 2 * * * 1".split("|")>>.words>>.list.raku
(("5",), ("4", "8"), ("11", "*", "13", "9"), ("7", "2", "*", "*", "*", "1"))

The problem is that we are given strings, and not numbers, and the program will abort as the object value has been declared as an Int. We can fix that when we construct the objects.

Also note that the first element is a (sub)list with a single element in it, and not just the plain value. But that doesn't matter.

File: path-sum-conf2 (changes only)
unit sub MAIN (Int :$sum = 22,
               Str :$tree = "5 | 4 8 | 11 * 13 9 | 7 2 * * * 1",
               :$verbose);
    @new-nodes.push(BinaryNode.new(value => $value.Int,

I have chosen named arguments this time.

Let us try with some different trees:

$ raku path-sum-conf2 --sum=20 --tree="7 | 6 18 | 6 * 12 4 | * 1 * 5 * 1"
7 -> 6 -> 6 -> 1

$ raku path-sum-conf2 --sum=20 --tree="7 | 6 18 | 6 * 12 4 | 1 1 * 5 * 1"
7 -> 6 -> 6 -> 1
7 -> 6 -> 6 -> 1

The last one gives two different paths, which happen to have the same values. If we had kept the nodes (as in «path-sum»), we could have inspected them and found out that leaf nodes («1» and «1») are different objects. But it doesn't really matter.

And that's it.