The Euclid Path to Raku

by Arne Sommer

The Euclid Path to Raku

[17] Published 11. June 2019.

Perl 6 → Raku

This article has been moved from «perl6.eu» and updated to reflect the language rename in 2019.

This is my response to the Perl Weekly Challenge #12.

Challenge #12.1

The numbers formed by adding one to the products of the smallest primes are called the Euclid Numbers (see wiki). Write a script that finds the smallest Euclid Number that is not prime. This challenge was proposed by Laurent Rosenfeld.

The builtin function is-prime can tell us if a number is prime, and we can generate a list of primes like this:

my $primes := (1 .. *).grep(*.is-prime);

We start with an infinite list of integers ((1 .. *)), use grep to only select the primes(.grep(*.is-prime)), and assign the result with binding (:=) to make absolute sure that we have a lazy data structure.

See docs.raku.org/routine/is-prime for more information about «is-prime.

Note that the documentation states that «is-prime» Returns True if this Int is known to be a prime, or is likely to be a prime based on a probabilistic Miller-Rabin test. This probabilistic approach shouldn't be a problem in practice, but look it up at e.g. wikipedia if you want to know more.

See docs.raku.org/routine/grep for more information about «grep.

File: euclid-nonprime
my $primes := (1 .. *).grep(*.is-prime);  # [1]

for 1 .. *                                # [2]
{
  my $sum =  1 + [*] $primes[^$_];        # [3]

  unless $sum.is-prime                    # [4]
  {
    say "Smallest non-prime Euclid Number: $sum"; # [4a]
    last;                                         # [4b]
  }
}

[1] The list (actually a sequence) of prime numbers.

[2] An infinite loop, from 1 and up.

[3] Calculate the Xth Euclid Number, where X is an integer from 1 to Infinity. $primes[^$_] is an array slice, that gives us a list of the $_ first prime numbers. ^$_ means «up to, but not including, so we get the correct number og primes (as the index starts with zero). Then we apply the Reduction metaoperator [ ] on the list, with multiplication as the operator. This multiplies all the elements together. And finaly we add 1 (logically, as I placed it first on the line).

[4] If the Euclid Number isn't a prime, print it [4a] and exit [4b].

Running it:

$ raku euclid-nonprime
Smallest non-prime Euclid Number: 30031

And the answer is correct, according to the wikipedia page given in the challenge.

Euclid Number Bonus

It is easy to set up a Sequence (with gather/take) that gives us the Euclid Numbers:

File: euclid-numbers
unit sub MAIN ($count = 10);

my $primes := (1 .. *).grep(*.is-prime);

my $euclid-numbers := gather
{
  for 1 .. *
  {
    take 1 + [*] $primes[^$_];
  }
}

say "{ ($_ + 1).fmt('%' ~ $count.chars ~ 'd') }: $euclid-numbers[$_]"
 for ^$count;

Specify the number of numbers as argument, or get the default 10:

$ raku euclid-numbers 30
 1: 3
 2: 7
 3: 31
 4: 211
 5: 2311
 6: 30031
 7: 510511
 8: 9699691
 9: 223092871
10: 6469693231
11: 200560490131
12: 7420738134811
13: 304250263527211
14: 13082761331670031
15: 614889782588491411
16: 32589158477190044731
17: 1922760350154212639071
18: 117288381359406970983271
19: 7858321551080267055879091
20: 557940830126698960967415391
21: 40729680599249024150621323471
22: 3217644767340672907899084554131
23: 267064515689275851355624017992791
24: 23768741896345550770650537601358311
25: 2305567963945518424753102147331756071
26: 232862364358497360900063316880507363071
27: 23984823528925228172706521638692258396211
28: 2566376117594999414479597815340071648394471
29: 279734996817854936178276161872067809674997231
30: 31610054640417607788145206291543662493274686991

It is possible to both write it shorter and get rid of gather/take, if we use «lazy» instead:

File: euclid-numbers-lazy (partial)
my $euclid-numbers := lazy for 1 .. * { 1 + [*] $primes[^$_] };

The «lazy» statement prefix ensures that the statement following it is only executed when the values are actually used.

See docs.raku.org/syntax/lazy for more information about «lazy».

We can rewrite «euclid-nonprime» using this sequence:

File: euclid-nonprime-lazy
my $primes := (1 .. *).grep(*.is-prime);

my $euclid-numbers := lazy for 1 .. * { 1 + [*] $primes[^$_] };

for ^Inf
{
  unless $euclid-numbers[$_].is-prime
  {
    say "Smallest non-prime Euclid Number: $euclid-numbers[$_]";
    last;
  }
}

It looks much clearer than the first version, as the variable name used for the second sequence tells the reader what is going on.

Challenge #12.2

Write a script that finds the common directory path, given a collection of paths and directory separator. For example, if the following paths are supplied.
/a/b/c/d
/a/b/cd
/a/b/cc
/a/b/c/d/e
and the path separator is /. Your script should return /a/b as common directory path.

The challenge doesn't say how we are supposed to get the values, so I have chosen to get them on the command line. The directory separator can be set with the named argument «separator», and defaults to «/». This is just the framework, to get the input and make sure that it works as intended.

File: common-dir-path-dummy
unit sub MAIN (:$separator = "/", *@paths);

say "Separator: $separator";

say $_ for @paths;

Running it:

$ raku common-dir-path --separator=§ /a/b/c/d /a/b/cd /a/b/cc /a/b/c/d/e
Separator: §
/a/b/c/d
/a/b/cd
/a/b/cc
/a/b/c/d/e

That was the easy part. Now the hard part:

It isn't clear if the last part of a path is a file or a directory. None of the paths given in the challenge ends with the directory separator (in which case the last part would have been a directory), so we have to assume that the last part can be a file. And not part of a common path.

File: common-dir-path
unit sub MAIN (:$separator = "/", *@paths is copy); # [1]

loop                                                # [2]
{
  @paths = remove-last-part(@paths);                # [2a]

  unless @paths.elems                               # [6]
  {
    say "No common Directory Path.";                # [6a]
    last;                                           # [6b]
  }

  if all(@paths) eq @paths[0]                       # [9]
  {
    @paths[0] eq $separator                         # [10]
      ?? say "Common Directory Path: $separator"    # [10a]
      !! say "Common Directory Path: "              # [10b]
               ~ @paths[0].substr(0, @paths[0].chars - $separator.chars);
    last;                                           # [9b]
  } 
}

sub remove-last-part(@paths)               # [3]
{
  my @new; for @paths                      # [4]
  {
    return () unless /$separator/;         # [5]
    /(.*$separator)/;                      # [7]
    push @new: $0.Str;                     # [7a]
  }
  
  my $min = @new>>.chars.min;              # [8]

  my @return; for @new { @return.push($_.substr(0, $min)); }  # [8a]
  return @return;                                             # [8b]
}

[1] Parameteters to a function are read only be default, but is copy gives us a copy that we can change.

[2] An eternal loop, where we start by calling remove-last-part [2a].

[3] remove-last-part gets the current path values, does something with them [6a and 9], and returns the modified versions [9a].

[4] Loop through the paths, one at a time.

[5] • return an empty list if the separator isn't present in the path, as that means that there isn't any common elements.

[6] • if we get an empty list, report the failure [6a] and end the program [6b].

[7] This regex match removes anything after the last directory separator, for all the paths. The result is added to the @new array [7a]. Note $0.Str to coerce the match object $0 to a string.

[8] This gives us the size (length) of the shortest path, and goes through the values and takes that many characters from each of them and saves them to @return [8a]. And finally returns that array [8b]

[9] Are all the paths equal? If yes, print the path [10] and end the program [9b]

[10] If the value is the same as the directory separator, print it [10a]. Otherwise remove the trailing directory separator before printing the value [10b].

The all junction in [9] gives us a very compact way of checking if all the values in the list (given to all) are equal to the right hand side.

See docs.raku.org/routine/all and docs.raku.org/type/Junction for more information about Junctions.

Running it:

$ raku common-dir-path /a/b/c/d /a/b/cd /a/b/cc /a/b/c/d/e 
Common Directory Path: /a/b
[0][7][8][9]
/a/b/c/d
/a/b/cd
/a/b/cc
/a/b/c/d/e
    /a/b/c/
/a/b/
/a/b/
/a/b/c/d/
    /a/b/
/a/b/
/a/b/
/a/b/
    /a/b

$ raku common-dir-path /a/b/c/d /a/b/cd /a/b/cc /a/b/c/d/e /a
Common Directory Path: /

$ raku common-dir-path /a/b/c/d /a/b/cd /a/b/cc /a/b/c/d/e /a/
Common Directory Path: /a

$ raku common-dir-path /a/b/c/d /a/b/cd /a/b/cc /a/b/c/d/e /a/b
Common Directory Path: /a

$ raku common-dir-path /a/b/c/d /a/b/cd /a/b/cc /a/b/c/d/e /a/b/
Common Directory Path: /a/b

$ raku common-dir-path /a/b/c/d /a/b/cd /a/b/cc /a/b/c/d/e Q
No common Directory Path.

$ raku common-dir-path /a/b/c/d /a/b/cd /a/b/cc /a/b/c/d/e /Q
Common Directory Path: /

$ raku common-dir-path 12 A Q
No common Directory Path.

$ raku common-dir-path 12/A 12
No common Directory Path.

$ raku common-dir-path 12/A 12/C
Common Directory Path: 12
$ raku common-dir-path /a/b /a/bbb/x
Common Directory Path: /a
[0][7][8][9]
/a/b
/a/bbb/x
    /a/
/a/bbb/
    /a/
/a/
    /a

But this one hangs the program (eternal loop):

$ raku common-dir-path /a/b/ /a/c/
[0][7][8][7]
/a/b/
/a/c/
    /a/b/
/a/c/
    /a/b/
/a/c/
    /a/b/
/a/c/

This will go on forever, repeating step 7 and 8.

The problem is that the procedure doesn't remove anything, when the paths have the same length, and end with the directory separator.

So I'l add code to handle that special case:

File: common-dir-path2 (partial)
loop
{
  my $size = @paths>>.chars.sum; # [1]
  
  @paths = remove-last-part(@paths);

  unless @paths.elems
  {
    say "No common Directory Path.";
    last;
  }

  if all(@paths) eq @paths[0]
  {
    @paths[0] eq $separator
      ?? say "Common Directory Path: $separator"
      !! say "Common Directory Path: "
           ~ @paths[0].substr(0, @paths[0].chars - $separator.chars);
    last;
  }
  
  if (@paths>>.chars.sum == $size)                                       # [2]
  {
    @paths[0] = @paths[0].substr(0, @paths[0].chars - $separator.chars); # [3]
  }
}

[1] Get the length of all the paths, and add them together.

[2] If the total length remains the same after «remove-last-part» has been called,

[3] • remove the last characters (the same number as in the directory separator; and actually the whole directory separator) from the first path. The next iteration will then remove more.

Some Observations about Paths

The directory separator is «/» on Unix- and Unix-like systems, and Windows uses «\».

The program works fine with Windows paths (and volumes):

$ raku common-dir-path --separator='\' 'A:\b' 'A:\r'
Common Directory Path: A:

$ raku common-dir-path --separator='\' 'A:\b\r' 'A:\b\r'
Common Directory Path: A:\b

It doesn't matter which operating system you run the program under, as long as you specify the correct directory separator for the given paths.

Raku has the dir-sep method that can be used to get the directory separator, but it must be invoked on an object representing the operating system we want the value for:

> IO::Spec::Unix.dir-sep.say;   # -> /
> IO::Spec::Win32.dir-sep.say;  # -> \

We can use this code snippet to tell us if the program is running under Windows:

> say $*DISTRO.is-win;  # -> False

(I am running it under Linux. You'll obviously get another result if done under Windows.)

If you want the separator, dynamically depending on the operating system, something like this will do the trick:

my $sep = $*DISTRO.is-win
    ?? IO::Spec::Win32.dir-sep
    !! IO::Spec::Unix.dir-sep;

But you should use «IO::Path» objects instead. The «add» method takes care of the separator for us:

> my $a = $*HOME;        # -> "/home/raku/".IO
> $a.add("readme.txt");  # -> "/home/raku/readme.txt".IO

This makes it possible to write programs that accesses files in different directories, regardless of the operating system. With a little care...

It is also possible to generate «IO::Path» objects manually, but you'll lock the program to a specific operating system.

> say IO::Path.new("/home/raku/");  # -> "/home/raku/".IO
> say "/home/raku/".IO;             # -> "/home/raku/".IO

See docs.raku.org/type/IO::Path for more informations about «IO::Path». The class has a lot of methods that can be useful when working with paths.