Subsequencially Good
with Raku

by Arne Sommer

Subsequencially Good with Raku

[241] Published 16. June 2023.

This is my response to The Weekly Challenge #221.

Challenge #221.1: Good Strings

You are given a list of @words and a string $chars.

A string is good if it can be formed by characters from $chars, each character can be used only once.

Write a script to return the sum of lengths of all good strings in words.

Example 1:
Input: @words = ("cat", "bt", "hat", "tree")
       $chars = "atach"
Output: 6

The good strings that can be formed are "cat" and "hat" so the answer
  is 3 + 3 = 6.
Example 2:
Input: @words = ("hello", "world", "challenge")
       $chars = "welldonehopper"
Output: 10

The expression «each character can be used only once» can be interpreted strictly, so that characters used in one word cannot be reused in subsequent words. The examples makes it abolutely clear that this interpretation is wrong (as will be shown later), but I have implemented it, even so; but as an option - called «single use».

File: good-string
#! /usr/bin/env raku

unit sub MAIN ($chars,
               *@words where @words.elems > 0,  # [1]
               :s(:$single-use),                # [2]
               :v(:$verbose));

my $chars-bag = $chars.comb.Bag;                # [3]
my @matches;                                    # [4]

for @words -> $word                             # [5]
{
  my $word-bag = $word.comb.Bag;                # [6]

  if $word-bag (<=) $chars-bag                  # [7]
  {
    @matches.push: $word;                       # [8]
    $chars-bag (-)= $word-bag if $single-use;   # [9]

    say ": Good word: $word" if $verbose && !$single-use;

    say ": Good word: $word (chars left: { $chars-bag.raku })"
      if $verbose && $single-use;
  }
  elsif $verbose
  {
    say ": Non-good word: $word";
  }
}

say ": Matches: @matches[]" if $verbose;

say @matches>>.chars.sum;                       # [10]

[1] A list with at least one element.

[2] The «single use» option, where letters used in one word cannot be reused.

[3] Get the individual characters (with comb), and turn that list into a Bag (with Bag); a hash like structure with the characters as keys, and the weight (count) as the values.

See docs.raku.org/type/Bag for more information about the Bag type.

[4] The matching words will end up here.

[5] Iterate over the words.

[6] Turn the current word into a Bag, as we did with the characters in [3].

[7] Is the current word a subset (or equal) to the characters?

See docs.raku.org/language/operators infix (<=) for more information about the subset or equal to operator (<=).

[8] If so, add the word to the list of matches.

[9] And remove the characters just used from the character list (with the set difference operator (-)), if «single use» mode has been requested. The result is assigned back to the left hand side, courtesy of the = operator prefix.

See https://docs.raku.org/routine/(-), infix(-) for more information about the set difference operator (-).

[10] Print the result; the sum of all the word lengths.

Running it:

$ ./good-string atach cat bi hat tree
6

$ ./good-string -v welldonehopper hello world challenge
10

Looking good.

With verbose mode:

$ ./good-string -v atach cat bi hat tree
: Good word: cat
: Non-good word: bi
: Good word: hat
: Non-good word: tree
: Matches: cat hat
6

$ ./good-string -v welldonehopper hello world challenge
: Good word: hello
: Good word: world
: Non-good word: challenge
: Matches: hello world
10

With «single use» enabled:

$ ./good-string -v -s atach cat bi hat tree
: Good word: cat (chars left: ("h"=>1,"a"=>1).Bag)
: Non-good word: bi
: Non-good word: hat
: Non-good word: tree
: Matches: cat
3

$ ./good-string -v -s welldonehopper hello world challenge
: Good word: hello (chars left: ("d"=>1,"o"=>1,"e"=>2,"n"=>1,"r"=>1,"w"=>1,
                                 "p"=>2).Bag)
: Non-good word: world
: Non-good word: challenge
: Matches: hello
5

«Single use» mode gives a different result if we swap the order of the words, as the program does a single pass, left to right, only:

$ ./good-string -v -s welldonehopper challenge world hello
: Non-good word: challenge
: Good word: world (chars left: ("l"=>1,"e"=>3,"h"=>1,"o"=>1,"n"=>1,
                                 "p"=>2).Bag)
: Non-good word: hello
: Matches: world
5

This illustrates the problem fully:

$ ./good-string -v -s welldonehopper challenge world hoo
: Non-good word: challenge
: Good word: world (chars left: ("p"=>2,"e"=>3,"l"=>1,"n"=>1,"o"=>1,
                                 "h"=>1).Bag)
: Non-good word: hoo
: Matches: world
5

$ ./good-string -v -s welldonehopper challenge hoo world
: Non-good word: challenge
: Good word: hoo (chars left: ("r"=>1,"d"=>1,"p"=>2,"e"=>3,"l"=>2,"w"=>1,
                               "n"=>1).Bag)
: Non-good word: world
: Matches: hoo
3

The first one (5) is the best (highest) answer, but the second one (3) is derailed by the shorter «hoo». Order matters.

Challenge 221.2: Arithmetic Subsequence

You are given an array of integers, @ints.

Write a script to find the length of the longest Arithmetic Subsequence in the given array.

A subsequence is an array that can be derived from another array by deleting some or none elements without changing the order of the remaining elements.

A subsquence is arithmetic if ints[i + 1] - ints[i] are all the same value (for 0 <= i < ints.length - 1).

Example 1:
Input: @ints = (9, 4, 7, 2, 10)
Output: 3

The longest Arithmetic Subsequence (4, 7, 10) can be derived by deleting
9 and 2.
Example 2:
Input: @ints = (3, 6, 9, 12)
Output: 4

No need to remove any elements, it is already an Arithmetic Subsequence.
Example 3:
Input: @ints = (20, 1, 15, 3, 10, 5, 8)
Output: 4

The longest Arithmetic Subsequence (20, 15, 10, 5) can be derived by
deleting 1, 3 and 8.
File: arithmetic-subsequence
#! /usr/bin/env raku

unit sub MAIN (*@ints where @ints.elems > 1 && all(@ints) ~~ Int,  # [1]
	       :v(:$verbose));

my $last  = @ints.end;                                             # [2]
my $score = 0;                                                     # [3]
my @seq;                                                           # [4]

for 0 .. $last -1 -> $first                                        # [5]
{
  for 1 .. $last -> $second                                        # [6]
  {
    my $diff    = @ints[$second] - @ints[$first];                  # [7]
    my $current = @ints[$first];                                   # [8]
    my $count   = 1;                                               # [9]
    my @sub     = ($current,);                                     # [10]

    for @ints[$first +1 .. *] -> $int                              # [11]
    {
      if $int == $current + $diff                                  # [12]
      {
        $count++;                                                  # [12a]
        @sub.push: $int;                                           # [12b]
        $current += $diff;                                         # [12c]
      }
    }

    if $count > $score                                             # [13]
    {
      @seq   = @sub;                                               # [13a]
      $score = $count;                                             # [13b]

      say ": New Longest Sequence: ({ @seq.join(",") }) with size $score"
        if $verbose;
    }
  }
}

say ": Sequence: ({@seq.join(",") })" if $verbose;

say $score;                                                        # [14]

[1] This time we require at least 2 elements in the sequence, for obvious reasons. (The difference between values is hard without at least two values.) They must all be integers.

[2] The index of the last element, gotten by the end method. This will save us some typing later on. (This is the same as @ints.elems -1, if you want to type more.)

See docs.raku.org/routine/end for more information about end.

[3] The current score (length of the sequence), as we go along looking for the longest arithmetic subsequence we can find.

[4] The currently best (longest) sequence ends up here. This is used by verbose mode only, but we could have used it to dynamically replace the $count variblee (set up in [3]).

Now we come to the important point; how to fint the sequences. The idea is to get the difference between any two values in the input, and use that to generate a sequence. These sequenceses will have at least two elements; the two values themselves.

[5] [6] We start with the first element in the [5] loop, all the way to the last but one. In the [6] loop we start with the element to the right of the start value and contnue until the very end. This gives us all the possible pairs, and thus the difference (the arithmetic part of sequence). When these two values are not adjacent, the skipped values are indeed skipped (in the sequence).

[7] The arithmetic difference between the first values in the two loops.

[8] The current value, initially the one at the first (as in the [5] loop) index.

[9] The number of values in the current sequence, initially 1.

[10] The current sequence, with the first value.

[11] Then we iterate over the indices of the rest of the array (starting one higher than the first value).

[12] If the new value has the correct arithmetic difference, we increase the count of values in the sequence [12a], add that value to the sequence [12b], and adjust the current value {12c] ready for the next iteration.

[13] We have now done all the values. If the new sequence is better (longer) than the old one, save the new sequence [13a] and the length [13b].

[14] Print the score; the length of the longest sequence.

Running it:

$ ./arithmetic-subsequence 9 4 7 2 10
3

$ ./arithmetic-subsequence 3 6 9 12
4

$ ./arithmetic-subsequence 20 1 15 3 10 5 8
4

We got the expected result.

With verbose mode:

$ ./arithmetic-subsequence -v 9 4 7 2 10
: New Longest Sequence: (9,4) with size 2
: New Longest Sequence: (4,7,10) with size 3
: Sequence: (4,7,10)
3

$ ./arithmetic-subsequence -v 3 6 9 12
: New Longest Sequence: (3,6,9,12) with size 4
: Sequence: (3,6,9,12)
4

$ ./arithmetic-subsequence -v 20 1 15 3 10 5 8
: New Longest Sequence: (20,1) with size 2
: New Longest Sequence: (20,15,10,5) with size 4
: Sequence: (20,15,10,5)
4

That was easy, but we used brute force. A lot of the candidate sequences are duds, especially the later ones with an arithmetic difference we have already tried. (This is so, as we in this case inspects a shorter version of the input list than we did the first time.) This will speed up the program, especially when we burden it with a long input list.

File: arithmetic-subsequence-faster
#! /usr/bin/env raku

unit sub MAIN (*@ints where @ints.elems > 1 && all(@ints) ~~ Int,
               :v(:$verbose));

my $last  = @ints.end;
my $score = 0;
my @seq;
my %computed;                     # [1]

for 0 .. $last -1 -> $first
{
  for 1 .. $last -> $second
  {
    my $diff    = @ints[$second] - @ints[$first];

    next if %computed{$diff}++;   # [2]

    my $current = @ints[$first];
    my $count   = 1;
    my @sub     = ($current,);

    for @ints[$first +1 .. *] -> $int
    {
      if $int == $current + $diff
      {
        $count++;
        @sub.push: $int;
        $current += $diff;
      }
    }

    if $count > $score
    {
      @seq   = @sub;
      $score = $count;

      say ": New Longest Sequence: ({ @seq.join(",") }) with size $score"
        if $verbose;
    }
  }
}

say ": Sequence: ({@seq.join(",") })" if $verbose;

say $score;

[1] The already used arithmetic differences will end up here.

[2] Skip this second value (the inner loop) if we have already had a go at this difference. Note the postfix update ++, setting the value after the comparison.

The speed gain on the supplied examples are minimal, though.

And that's it.