This is my response to the Perl Weekly Challenge #132.
Dave Cross
has built cool site
that does something similar.
Input: 2021/09/18
Output: 2021/09/14, 2021/09/26
On the date you were born, someone who was your current age, would have
been born on 2021/09/14. Someone born today will be your current age on
2021/09/26.
Example 2:
Input: 1975/10/10
Output: 1929/10/27, 2067/09/05
On the date you were born, someone who was your current age, would have
been born on 1929/10/27. Someone born today will be your current age on
2067/09/05.
Example 3:
Input: 1967/02/14
Output: 1912/07/08, 2076/04/30
On the date you were born, someone who was your current age, would have
been born on 1912/07/08. Someone born today will be your current age on
2076/04/30.
The notion of the day you were born is a red herring. The only useful information to be had from that herring is that the specified date must be before the current date, as you would not have been born yet otherwise.
Let is use an illustration: We start with the current date (in black) and the user specified date (in red). We get the difference (in days), and subtract it from the user specified date to get the lower date (in green), and add it to the current date to get the upper date (in blue):
Then the program:
File: mirror-dates
#! /usr/bin/env raku
subset SlashDate where * ~~ /^\d\d\d\d\/\d\d\/\d\d$/; # [1]
my $fmt = { sprintf "%04d/%02d/%02d", .year, .month, .day }; # [2]
unit sub MAIN (SlashDate $date = "2021/09/18", # [3]
SlashDate :$today = Date.new(DateTime.now,
formatter => $fmt).Str, # [4]
:v($verbose));
my $date-iso = $date.subst("/", "-", :g); # [5]
my $earlier = Date.new($date-iso, formatter => $fmt); # [6]
my $now = Date.new($today.subst("/", "-", :g), formatter => $fmt); # [7]
die "The date ($earlier) should be earlier than today ($now)"
unless $earlier < $now; # [8]
my $diff = $now - $earlier; # [9]
say ": Today: $now" if $verbose;
say ": Difference: $diff days" if $verbose;
my $first = $earlier.earlier(day => $diff); # [10]
my $last = $now.later( day => $diff); # [11]
say "$first, $last"; # [12]
[1] A custom type, set up subset
, for the peculiar
(non-iso) date format specified in the challenge.
[2] Anonymous procedure used when printing a Date object, using the reqired format.
[3] The specified date. Note the custom type (from [1]) handling compliance issues.
[4] Use this optionally (named) argument to specify the current date, as the program is difficult to test otherwise. Default to the currenr date. Note the formatter specification, so that simply printing the object will give us what the challenge requested.
[5] Convert the date string to iso standard, as the Date object requires it.
[6] Get the date object for the specified date.
[7] Get today's date.
[8] The specified date should be in the past.
[9] The number of days between the two dates. Subtraction on two Date
objects does the trick.
[10] Get the first date; in the past.
[11] Get the second date, in the future.
[12] Print the dates.
See
docs.raku.org/language/typesystem#index-entry-subset-subset
for more information about subset
.
Running it:
$ ./mirror-dates 2021/09/18
2021/09/05, 2021/10/14
Not what you expected? Verbose mode to the rescue:
$ ./mirror-dates -v 2021/09/18
: Today: 2021/10/01
: Difference: 13 days
2021/09/05, 2021/10/14
The problem is the notion of today. But the program can overcome that problem, removing the element of entropy. Let us try again, using the same current date as the challenge:
$ ./mirror-dates -today=2021/09/22 -v 2021/09/18
: Today: 2021/09/22
: Difference: 4 days
2021/09/14, 2021/09/26
$ ./mirror-dates -today=2021/09/22 -v 1975/10/10
: Today: 2021/09/22
: Difference: 16784 days
1929/10/27, 2067/09/05
$ ./mirror-dates -today=2021/09/22 -v 1967/02/14
: Today: 2021/09/22
: Difference: 19944 days
1912/07/08, 2076/04/30
Looking good.
Input:
@player_ages = (
[20, "Alex" ],
[28, "Joe" ],
[38, "Mike" ],
[18, "Alex" ],
[25, "David" ],
[18, "Simon" ],
);
@player_names = (
["Alex", "Stewart"],
["Joe", "Root" ],
["Mike", "Gatting"],
["Joe", "Blog" ],
["Alex", "Jones" ],
["Simon","Duane" ],
);
Output:
Based on index = 1 of @players_age and index = 0 of @players_name.
20, "Alex", "Stewart"
20, "Alex", "Jones"
18, "Alex", "Stewart"
18, "Alex", "Jones"
28, "Joe", "Root"
28, "Joe", "Blog"
38, "Mike", "Gatting"
18, "Simon", "Duane"
I have ignored the «in-memory size», as it does not make (much) sense with the amount of the memory used in modern computers.
File: hash-join
#! /usr/bin/env raku
unit sub MAIN (:v(:$verbose));
my @player_ages = ( # [1]
[20, "Alex" ],
[28, "Joe" ],
[38, "Mike" ],
[18, "Alex" ],
[25, "David" ],
[18, "Simon" ],
);
my @player_names = ( # [2]
["Alex", "Stewart"],
["Joe", "Root" ],
["Mike", "Gatting"],
["Joe", "Blog" ],
["Alex", "Jones" ],
["Simon","Duane" ],
);
hash-join(@player_ages, 1, @player_names, 0); # [3]
sub hash-join (@table1 is copy, $index1 is copy,
@table2 is copy, $index2 is copy) # [4]
{
if @table1.elems > @table2.elems # [5]
{
(@table1, @table2) = (@table2, @table1); # [5a}
($index1, $index2) = ($index2, $index1); # [5b]
}
my %table1; # [6]
for @table1 -> $row # [7]
{
my @data = @$row; # [7a]
my $id = @data.splice($index1, 1); # [8]
say ": Table 1 Row (add): $id -> { @data.join(", ") }" if $verbose;
%table1{$id}.push: @data; # [9]
}
my %result; # [10]
for @table2 -> $row # [11]
{
my @data = @$row; # [11a]
my $id = @data.splice($index2, 1); # [12]
if $verbose
{
say ": Table 1 Row (scan): $id -> { @(%table1{$id}).join(", ") }";
say ": Table 2 Row (scan): $id -> { @data.join(", ") }";
}
for @(%table1{$id}) -> $row # [13]
{
%result{$id}.push: (@$row, @data).flat; # [14]
say ": Result Table Row: $id -> { @$row.join(", ") } + \
{ @data.join(", ") }" if $verbose;
}
}
for %result.keys.sort -> $id # [15]
{
my @rows = @(%result{$id}); # [16]
for @rows -> $row # [17]
{
say nicify(|($id, @$row)); # [18]
}
}
}
sub nicify (*@data) # [19}
{
return @data.map({ $_ ~~ Numeric ?? $_ !! '"' ~ $_ ~ '"' }).join(", ");
}
[1] The first table.
[2] The second table.
[3] Join (and print) the two specified tables.
[4] Note the use of is copy
so that we
can change the variables. The default behaviour is a read-only value.
See
docs.raku.org/type/Signature#index-entry-trait_is_copy
more information about is copy
.
[5] The wikipedia article advises starting with the smallest relation. The procedure will start with the table with the lowest number of rows, by swapping them [5a, 5b] if the second one is smaller.
[6] The first table, as a hash, will end up here.
[7] Iterate over the first table, row by row. Get the row content as an array [7a].
[8] Extract (remove) the ID value from the list.
See
docs.raku.org/routine/splice
more information about splice
.
[9] Add the data (sans ID) to the hash, with the ID as key. Note the
value, @data
. This is a list, but lists are fine as values.
Also note the push
which will set up the has value to be a
list (if not already a list), with the given list as the first entry.
This makes it possible to have several values, which we need.
[10] The result of joining the two tables will end up here.
[11] Iterate over the second table.
[12] Extract the ID.
[13] Iterate over the values (rows) from the first table with the current ID.
[14] • Add the two lists to the new has, with the ID
as key. Note the flat
to merge the two lists (into a single list).
See
docs.raku.org/routine/flat
for more information about flat
.
[15] Iterate over the results.
[16] Get the row(s) (the values).
[17] For each row,
[18] • pass it on to «nicify» to format it. Note the prefix |
that flattens the content of the parens.
[19] map
is a fancy loop, quite useful for showing off.
The idea is just to quote non numeric numbers - as in the example.
See
docs.raku.org/routine/map for more
information about map
.
Running it:
$ ./hash-join
"Alex", 20, "Stewart"
"Alex", 18, "Stewart"
"Alex", 20, "Jones"
"Alex", 18, "Jones"
"Joe", 28, "Root"
"Joe", 28, "Blog"
"Mike", 38, "Gatting"
"Simon", 18, "Duane"
The order of the columns is not quite right, but that is fixable. But before tackling that, let us have a go at verbose mode:
$ ./hash-join -v
: Table 1 Row (add): Alex -> 20
: Table 1 Row (add): Joe -> 28
: Table 1 Row (add): Mike -> 38
: Table 1 Row (add): Alex -> 18
: Table 1 Row (add): David -> 25
: Table 1 Row (add): Simon -> 18
: Table 1 Row (scan): Alex -> 20, 18
: Table 2 Row (scan): Alex -> Stewart
: Result Table Row: Alex -> 20 + Stewart
: Result Table Row: Alex -> 18 + Stewart
: Table 1 Row (scan): Joe -> 28
: Table 2 Row (scan): Joe -> Root
: Result Table Row: Joe -> 28 + Root
: Table 1 Row (scan): Mike -> 38
: Table 2 Row (scan): Mike -> Gatting
: Result Table Row: Mike -> 38 + Gatting
: Table 1 Row (scan): Joe -> 28
: Table 2 Row (scan): Joe -> Blog
: Result Table Row: Joe -> 28 + Blog
: Table 1 Row (scan): Alex -> 20, 18
: Table 2 Row (scan): Alex -> Jones
: Result Table Row: Alex -> 20 + Jones
: Result Table Row: Alex -> 18 + Jones
: Table 1 Row (scan): Simon -> 18
: Table 2 Row (scan): Simon -> Duane
: Result Table Row: Simon -> 18 + Duane
"Alex", 20, "Stewart"
"Alex", 18, "Stewart"
"Alex", 20, "Jones"
"Alex", 18, "Jones"
"Joe", 28, "Root"
"Joe", 28, "Blog"
"Mike", 38, "Gatting"
"Simon", 18, "Duane"
Then we can have a go at the column order. The problem is that we print the ID first, regardless of the original order. That is easy to fix:
File: hash-join-sorted
#! /usr/bin/env raku
unit sub MAIN (:v(:$verbose));
my @player_ages = (
[20, "Alex" ],
[28, "Joe" ],
[38, "Mike" ],
[18, "Alex" ],
[25, "David" ],
[18, "Simon" ],
);
my @player_names = (
["Alex", "Stewart"],
["Joe", "Root" ],
["Mike", "Gatting"],
["Joe", "Blog" ],
["Alex", "Jones" ],
["Simon","Duane" ],
);
hash-join-sorted(@player_ages, 1, @player_names, 0);
sub hash-join-sorted (@table1, $index1, @table2, $index2) # [1]
{
my %table1;
for @table1 -> $row
{
my @data = @$row;
my $id = @data[$index1]; # [2]
say ": Table 1 Row (add): $id -> { @data.join(", ") }" if $verbose;
%table1{$id}.push: @data;
}
my %result;
for @table2 -> $row
{
my @data = @$row;
my $id = @data.splice($index2, 1);
if $verbose
{
say ": Table 1 Row (scan): $id -> { @(%table1{$id}).join(", ") }";
say ": Table 2 Row (scan): $id -> { @data.join(", ") }";
}
for @(%table1{$id}) -> $row
{
%result{$id}.push: (@$row, @data).flat;
say ": Result Table Row: $id -> { @$row.join(", ") } + \
{ @data.join(", ") }" if $verbose;
}
}
for %result.keys.sort -> $id
{
my @rows = @(%result{$id});
for @rows -> $row
{
say nicify(@$row); # [3]
}
}
}
sub nicify (*@data)
{
return @data.map({ $_ ~~ Numeric ?? $_ !! '"' ~ $_ ~ '"' }).join(", ");
}
[1] The is copy
and the swapping of the arrays has gone, as it
did not really give that much. Except the possibility to have the column order
royally screwed up.
[2] Get the ID, but leave it intact in the array.
[3] The ID is now part of the first table (@$row
), so we skip
adding it.
Running it:
$ ./hash-join-sorted
20, "Alex", "Stewart"
18, "Alex", "Stewart"
20, "Alex", "Jones"
18, "Alex", "Jones"
28, "Joe", "Root"
28, "Joe", "Blog"
38, "Mike", "Gatting"
18, "Simon", "Duane"
That's better.
Note that the order of the rows differ from the one given in the challenge. I do not think that it matters very much, but I'll look into it later anyway.
With verbose mode:
$ ./hash-join-sorted -v
: Table 1 Row (add): Alex -> 20, Alex
: Table 1 Row (add): Joe -> 28, Joe
: Table 1 Row (add): Mike -> 38, Mike
: Table 1 Row (add): Alex -> 18, Alex
: Table 1 Row (add): David -> 25, David
: Table 1 Row (add): Simon -> 18, Simon
: Table 1 Row (scan): Alex -> 20 Alex, 18 Alex
: Table 2 Row (scan): Alex -> Stewart
: Result Table Row: Alex -> 20, Alex + Stewart
: Result Table Row: Alex -> 18, Alex + Stewart
: Table 1 Row (scan): Joe -> 28 Joe
: Table 2 Row (scan): Joe -> Root
: Result Table Row: Joe -> 28, Joe + Root
: Table 1 Row (scan): Mike -> 38 Mike
: Table 2 Row (scan): Mike -> Gatting
: Result Table Row: Mike -> 38, Mike + Gatting
: Table 1 Row (scan): Joe -> 28 Joe
: Table 2 Row (scan): Joe -> Blog
: Result Table Row: Joe -> 28, Joe + Blog
: Table 1 Row (scan): Alex -> 20 Alex, 18 Alex
: Table 2 Row (scan): Alex -> Jones
: Result Table Row: Alex -> 20, Alex + Jones
: Result Table Row: Alex -> 18, Alex + Jones
: Table 1 Row (scan): Simon -> 18 Simon
: Table 2 Row (scan): Simon -> Duane
: Result Table Row: Simon -> 18, Simon + Duane
20, "Alex", "Stewart"
18, "Alex", "Stewart"
20, "Alex", "Jones"
18, "Alex", "Jones"
28, "Joe", "Root"
28, "Joe", "Blog"
38, "Mike", "Gatting"
18, "Simon", "Duane"
Placing the procedures in a module is a good idea:
File: lib/HashJoin.rakumod
unit module HashJoin;
our sub hash-join-sorted (@table1, $index1, @table2, $index2, :$verbose)
{
…
}
our sub hash-join (@table1 is copy, $index1 is copy,
@table2 is copy, $index2 is copy, :$verbose)
{
…
}
sub nicify (*@data)
{
–
}
You have seen the procedure content before.
Then the program, doing the same as «hash-join-sorted» (so nothing really exciting):
File: hash-join-module
#! /usr/bin/env raku
use lib "lib";
use HashJoin;
unit sub MAIN (:v(:$verbose));
my @player_ages = (
[20, "Alex" ],
[28, "Joe" ],
[38, "Mike" ],
[18, "Alex" ],
[25, "David" ],
[18, "Simon" ],
);
my @player_names = (
["Alex", "Stewart"],
["Joe", "Root" ],
["Mike", "Gatting"],
["Joe", "Blog" ],
["Alex", "Jones" ],
["Simon","Duane" ],
);
HashJoin::hash-join-sorted(@player_ages, 1, @player_names, 0, :$verbose);
Running it gives the expected result. Feel free to take my word for it.
The versions so far print the result, which on the face of it is ok. But it makes it impossible to do something with the result, as in merge a third table. So let us do a slight rewrite, supporting that.
And have a go at the order of the rows at the same time...
The row order rewrite actually makes the task easier, as we can skip the first step and do them both in one fell swoop.
File: lib/HashJoin.rakumod (changes only)
our sub merge (@table1, $index1, @table2, $index2, :$verbose)
{
my %table1;
my @ids; # [3]
my %ids;
my %result;
for @table1 -> $row1 # [1]
{
my @data1 = @$row1;
my $id1 = @data1[$index1];
@ids.push: $id1 unless any(@ids) eq $id1; # [3a]
say ": Table 1 Row (scan): $id1 -> { @data1.join(", ") }" if $verbose;
for @table2 -> $row2 # [2]
{
my @data2 = @$row2;
my $id2 = @data2.splice($index2, 1);
next unless $id1 eq $id2; # [2a]
%ids{$id2} = True; # [3b]
%result{$id1}.push: (@data1, @data2).flat; # [4]
say ": Result Table Row: $id1 -> { @data1.join(", ") } + \
{ @data2.join(", ") }" if $verbose;
}
}
my @result;
for @ids -> $id # [5]
{
next unless %ids{$id}; # [5b]
@result.append: @(%result{$id});
}
return @result; # [6]
}
our sub print-it(*@data) # [7]
{
for @data -> $row
{
say nicify(@$row);
}
}
[1] Iterate over the first table.
[2] Iterate over the second table, skipping rows that does not match (the same ID) [2a].
[3] A list of unique IDs, in the order we find them in the first table [3a]. The hash variant holds the IDs we get from joining the two tables [3b].
[4] Add the combined row to the result, as an array element in the hash with the specified ID.
[5] Iterate over the IDs from the first table (i.e. the correct order), and skip those without a joined row [5a].
[6] Return the data, instead of printing it.
[7] The printing part, as a separate procedure.
The program:
File: hash-join-module2
#! /usr/bin/env raku
use lib "lib";
use HashJoin;
unit sub MAIN (:v(:$verbose));
my @player_ages = (
[20, "Alex" ],
[28, "Joe" ],
[38, "Mike" ],
[18, "Alex" ],
[25, "David" ],
[18, "Simon" ],
);
my @player_names = (
["Alex", "Stewart"],
["Joe", "Root" ],
["Mike", "Gatting"],
["Joe", "Blog" ],
["Alex", "Jones" ],
["Simon","Duane" ],
);
my @joined = HashJoin::merge(@player_ages, 1, @player_names, 0, :$verbose);
HashJoin::print-it(@joined);
Running it:
$ ./hash-join-module2 -v
20, "Alex", "Stewart"
20, "Alex", "Jones"
18, "Alex", "Stewart"
18, "Alex", "Jones"
28, "Joe", "Root"
28, "Joe", "Blog"
38, "Mike", "Gatting"
18, "Simon", "Duane"
The order is correct now.
With verbose mode:
$ ./hash-join-module2 -v
: Table 1 Row (scan): Alex -> 20, Alex
: Result Table Row: Alex -> 20, Alex + Stewart
: Result Table Row: Alex -> 20, Alex + Jones
: Table 1 Row (scan): Joe -> 28, Joe
: Result Table Row: Joe -> 28, Joe + Root
: Result Table Row: Joe -> 28, Joe + Blog
: Table 1 Row (scan): Mike -> 38, Mike
: Result Table Row: Mike -> 38, Mike + Gatting
: Table 1 Row (scan): Alex -> 18, Alex
: Result Table Row: Alex -> 18, Alex + Stewart
: Result Table Row: Alex -> 18, Alex + Jones
: Table 1 Row (scan): David -> 25, David
: Table 1 Row (scan): Simon -> 18, Simon
: Result Table Row: Simon -> 18, Simon + Duane
20, "Alex", "Stewart"
20, "Alex", "Jones"
18, "Alex", "Stewart"
18, "Alex", "Jones"
28, "Joe", "Root"
28, "Joe", "Blog"
38, "Mike", "Gatting"
18, "Simon", "Duane"
Let us join three tables, two at a time. I have added additional columns to the tables as well, just to show off.
File: hash-join-module3
#! /usr/bin/env raku
use lib "lib";
use HashJoin;
unit sub MAIN (:v(:$verbose));
my @player_ages = (
[20, "Alex", "May", 29 ],
[28, "Joe", "August", 2 ],
[38, "Mike", "June", 12 ],
[18, "Alex", "January", 29 ],
[25, "David", "March", 14 ],
[18, "Simon", "June", 10 ],
);
my @player_names = (
["Alex", "Stewart", "Oslo" ],
["Joe", "Root", "Berlin" ],
["Mike", "Gatting", "London" ],
["Joe", "Blog", "Bern" ],
["Alex", "Jones", "Zürich" ],
["Simon","Duane", "Oxford" ],
);
my @countries = (
["Norway", "Oslo" ],
["Germany", "Berlin" ],
["England", "London" ],
["Switzerland", "Bern" ],
["Switzerland", "Zürich" ],
["England", "Oxford" ],
);
my @joined = HashJoin::merge(@player_ages, 1, @player_names, 0, :$verbose);
my @joined2 = HashJoin::merge(@joined, 5, @countries, 1);
HashJoin::print-it(@joined2);
Running it:
$ ./hash-join-module3
20, "Alex", "May", 29, "Stewart", "Oslo", "Norway"
18, "Alex", "January", 29, "Stewart", "Oslo", "Norway"
20, "Alex", "May", 29, "Jones", "Zürich", "Switzerland"
18, "Alex", "January", 29, "Jones", "Zürich", "Switzerland"
28, "Joe", "August", 2, "Root", "Berlin", "Germany"
28, "Joe", "August", 2, "Blog", "Bern", "Switzerland"
38, "Mike", "June", 12, "Gatting", "London", "England"
18, "Simon", "June", 10, "Duane", "Oxford", "England"
Let us try with a second duplicate (Birmingham):
File: hash-join-module4
#! /usr/bin/env raku
use lib "lib";
use HashJoin;
unit sub MAIN (:v(:$verbose));
my @player_ages = (
[20, "Alex", "May", 29 ],
[28, "Joe", "August", 2 ],
[38, "Mike", "June", 12 ],
[18, "Alex", "January", 29 ],
[25, "David", "March", 14 ],
[18, "Simon", "June", 10 ],
);
my @player_names = (
["Alex", "Stewart", "Oslo" ],
["Joe", "Root", "Berlin" ],
["Mike", "Gatting", "London" ],
["Joe", "Blog", "Bern" ],
["Alex", "Jones", "Zürich" ],
["Simon","Duane", "Birmingham" ],
);
my @countries = (
["Norway", "Oslo" ],
["Germany", "Berlin" ],
["England", "London" ],
["Switzerland", "Bern" ],
["Switzerland", "Zürich" ],
["England", "Birmingham" ],
["USA", "Birmingham" ],
);
my @joined = HashJoin::merge(@player_ages, 1, @player_names, 0, :$verbose);
my @joined2 = HashJoin::merge(@joined, 5, @countries, 1);
HashJoin::print-it(@joined2);
Running it:
$ ./hash-join-module4
20, "Alex", "May", 29, "Stewart", "Oslo", "Norway"
18, "Alex", "January", 29, "Stewart", "Oslo", "Norway"
20, "Alex", "May", 29, "Jones", "Zürich", "Switzerland"
18, "Alex", "January", 29, "Jones", "Zürich", "Switzerland"
28, "Joe", "August", 2, "Root", "Berlin", "Germany"
28, "Joe", "August", 2, "Blog", "Bern", "Switzerland"
38, "Mike", "June", 12, "Gatting", "London", "England"
18, "Simon", "June", 10, "Duane", "Birmingham", "England"
18, "Simon", "June", 10, "Duane", "Birmingham", "USA"
And that's it.