Vin Knapsack Meets Raku

by Arne Sommer

Vin Knapsack Meets Raku

[44] Published 1. December 2019

This is my response to the Perl Weekly Challenge #36.

Challenge 36.1

Write a program to validate given Vehicle Identification Number (VIN). For more information, please checkout wikipedia.

There are some problems with the wikipedia article. I'll describe and deal with them when I come to the relevant part of the program.

Note that the wikipedia article calls the individual characters in a VIN number for digits, even if they can be a letters. I'll do the same.

First a very simple program, setting up the framework and doing very basic validation:

File: vin-zero
my regex VINCHAR { A | B | C | D | E | F | G | H | J | K | L | M | N |
                   P | R | S | T | U | V | W | X | Y | Z | 1 | 2 | 3 |
                   4 | 5 | 6 | 7 | 8 | 9 | 0 };                # [3]

subset VIN of Str where * ~~ /^ <VINCHAR> ** 17 $/;            # [4]
subset WMI of Str where * ~~ /^ <VINCHAR> **  3 $/;            # [5]

multi sub MAIN (VIN $vin)                                      # [1]
{
  say "Looks like a legal VIN. Checking if it is valid..";

  my $wmi = $vin.substr(0,3); # World Manufacturer Identifier  # [6]
  my $vds = $vin.substr(3,6); # Vehicle Descriptor Section
  my $vis = $vin.substr(9,8); # Vehicle Identifier Section
	
  say "WMI: $wmi";
  say "VDS: $vds";
  say "VIS: $vis";
}

multi sub MAIN (Str $vin)                                      # [2]
{
  say "Not a legal VIN";
  say " - contains illegal character(s)"
    unless $vin ~~ /^ <VINCHAR> + $/;                          # [7]

  say " - wrong length ({ $vin.chars } instead of 17)"
    unless $vin.chars == 17;                                   # [8]
}

[1] I have set up two «multi MAIN»s to do the initial verification. This first one is executed if the VIN number has the correct length, and contains valid digits only.

[2] And this one is triggered if the VIN is illegal.

[3] A custom regex matching one VIN digit. Note that (the letters) «I«, «O» and «Q» are not used.

[4] A custom type matching a legal VIN.

[5] Another one, this time matching the three-digit WMI part. It isn't used in this program, but we will need it later on.

[6] All it does (in this version of the program) it split the VIN number in its three parts, printing them.

[7] This regex complains if we have illegal digits.

[8] Complain if the length is wrong.

Running it:

$ raku vin-zero 1111111111111111q
Not a legal VIN
 - contains illegal character(s)

$ raku vin-zero 1111111111111111
Not a legal VIN
 - wrong length (16 instead of 17)

$ raku vin-zero 11111111111111111q
Not a legal VIN
 - contains illegal character(s)
 - wrong length (18 instead of 17)

$ raku vin-zero 111111111111111111
Not a legal VIN
 - wrong length (18 instead of 17)

$ raku vin-zero 11111111111111111
Looks like a legal VIN. Checking if it is valid..
WMI: 111
VDS: 111111
VIS: 11111111

Note that the «multi MAIN»s could have been replaced by an if-construct, using the «VIN» regex like this:

sub MAIN (Str $vin)
{
  if $vin ~~ VIN
  {
    say "Legal";
  }
  else
  {
    say "Not a legal VIN";
  }
}

But «multi MAIN»s are cooler.

The Country

I'll do the full program now, part for part, starting with the country.

The country part of the WMI is either one or two digits, and I use different hashes for each type:

File: vin (partial)
my @vinchar =
  <A B C D E F G H J K L M N P R S T U V W X Y Z 1 2 3 4 5 6 7 8 9 0>;  # [6]

my %country1 = ( J => "Japan",     L => "China",       1 => "USA",      # [2]
                 2 => "Canada",    4 => "USA",         5 => "USA",
                 6 => "Australia", 7 => "New Zealand", W => "Germany");

my %country2;                                                           # [3]

add-country("AA", "AH", "South Africa");                                # [4]
add-country("AJ", "AN", "Cote d'Ivoire");
add-country("BA", "BE", "Angola");
add-country("BF", "BK", "Kenya");
add-country("BL", "BR", "Tanzania");
add-country("CA", "CE", "Benin");
add-country("CF", "CK", "Madagascar");
add-country("CL", "CR", "Tunisia");
add-country("DA", "DE", "Egypt");
add-country("DF", "DK", "Morocco");
add-country("DL", "DR", "Zambia");
add-country("EA", "EE", "Ethiopia");
add-country("EF", "EK", "Mozambique");
add-country("FA", "FE", "Ghana");
add-country("FF", "FK", "Nigeria");
add-country("KA", "KE", "Sri Lanka");
add-country("KF", "KK", "Israel");
add-country("KL", "KR", "Korea (South)");
add-country("KS", "K0", "Kazakhstan");
add-country("MA", "ME", "India");
add-country("MF", "MK", "Indonesia");
add-country("ML", "MR", "Thailand");
add-country("MS", "M0", "Myanmar");
add-country("NA", "NE", "Iran");
add-country("NF", "NK", "Pakistan");
add-country("NL", "NR", "Turkey");
add-country("PA", "PE", "Philippines");
add-country("PF", "PK", "Singapore");
add-country("PL", "PR", "Malaysia");
add-country("RA", "RE", "United Arab Emirates");
add-country("RF", "RK", "Taiwan");
add-country("RL", "RR", "Vietnam");
add-country("RS", "R0", "Saudi Arabia");
add-country("SA", "SM", "United Kingdom");
add-country("SN", "ST", "East Germany");
add-country("SU", "SZ", "Poland");
add-country("S1", "S4", "Latvia");
add-country("TA", "TH", "Switzerland");
add-country("TJ", "TP", "Czech Republic");
add-country("TR", "TV", "Hungary");
add-country("TW", "T1", "Portugal");
add-country("UH", "UM", "Denmark");
add-country("UN", "UT", "Ireland");
add-country("UU", "UZ", "Romania");
add-country("U5", "U7", "Slovakia");
add-country("VA", "VE", "Austria");
add-country("VF", "VR", "France");
add-country("VS", "VW", "Spain");
add-country("VX", "V2", "Serbia");
add-country("V3", "V5", "Croatia");
add-country("V6", "V0", "Estonia");
add-country("XA", "XE", "Bulgaria");
add-country("XF", "XK", "Greece");
add-country("XL", "XR", "Netherlands");
add-country("XS", "XW", "Russia");
add-country("XX", "X2", "Luxembourg");
add-country("X3", "X0", "Russia");
add-country("YA", "YE", "Belgium");
add-country("YF", "YK", "Finland");
add-country("YL", "YR", "Malta");
add-country("YS", "YW", "Sweden");
add-country("YX", "Y2", "Norway");
add-country("Y3", "Y5", "Belarus");
add-country("Y6", "Y0", "Ukraine");
add-country("ZA", "ZR", "Italy");
add-country("ZX", "Z2", "Slovenia");
add-country("Z3", "Z5", "Lithuania");
add-country("3A", "3W", "Mexico");
add-country("3X", "37", "Costa Rica");
add-country("38", "39", "Cayman Islands");
add-country("8A", "8E", "Argentina");
add-country("8F", "8K", "Chile");
add-country("8L", "8R", "Ecuador");
add-country("8S", "8W", "Peru");
add-country("8X", "82", "Venezuela");
add-country("9A", "9E", "Brazil");
add-country("9F", "9K", "Colombia");
add-country("9L", "9R", "Paraguay");
add-country("9S", "9W", "Uruguay");
add-country("9X", "92", "Trinidad & Tobago");
add-country("93", "99", "Brazil");

sub add-country($from, $to, $name)            # [5]
{
  my ($first, $second) = $from.comb;

  loop
  {
    %country2{$first ~ $second} = $name;

    last if "$first$second" eq $to;

    if $second eq "Z"
    {
      $second = "1";
    }
    elsif $second eq "9"
    {
      $second = "0";
    }
    elsif $second eq "0"
    {
      die "Not possible to increment past { $first }0. Set up two rules.";
    }
    else
    {
      repeat { $second.=succ } until $second eq any @vinchar; # [7]
    }
  }
}

sub wmi2country (WMI $wmi)         # [1]
{
  return %country1{$wmi.substr(0,1)} if $wmi.substr(0,1) eq any %country1.keys;
  return %country2{$wmi.substr(0,2)} if $wmi.substr(0,2) eq any %country2.keys;
  return;
}

multi sub MAIN (VIN $vin)
{
  say "Looks like a legal VIN. Checking if it is valid..";

  my $wmi = $vin.substr(0,3); # World Manufacturer Identifier
  my $vds = $vin.substr(3,6); # Vehicle Descriptor Section
  my $vis = $vin.substr(9,8); # Vehicle Identifier Section

  say "WMI: $wmi";

  my $country = wmi2country($wmi); # [1]

  unless $country                  # [1a]
  {
    say "- Not a valid country";
    exit;
  }
  
  say "- Country: $country";       # [1b]
}  

[1] Get the country. If not defined, complain and exit (1a). If defined, print it (1b)

[2] The one digit contries.

[3] The two digit countries are stored here.

[4] The ranges given in the wikipedia articles must be resolved by code (in (5)).

[5] Resolve the ranges. This procedure skips illegal digits (anything not in (6). Note the order, as given in (6). The next value complies with this order (after «Z« we get «1», and after «9» we get «0». After «0» we get an error, as I do not want to do carrying. All the ranges in the article increase the second digit only, so this is ok.

[6] The legal digits.

[7] The «succ« method returns the next value, and I use «.=» to assign it back to the original variable. (There is also a «pred» method, working backwards.)

See docs.raku.org/routine/succ for more information about the «succ» method.

The complex «add-country» logic is there to handle the weird digit order in VIN numbers. Raku sequences (or ranges) isn' an option as they follow ascii rules:

> "AF" ... "A3"
(AF AE AD AC AB AA A@ A? A> A= A< A; A: A9 A8 A7 A6 A5 A4 A3)

This one has chosen to count down, as the digits come before the characters in the ascii table.

The Manufacturer

Here we have the same logic as with the country, as some manufacturers are identified by two digits, and some by all three digits of the WMI. I have added all the two digit ones, but only some of the three digit manufacturers (as the list is too long).

File: vin (partial)
sub wmi2manufacturer( WMI $wmi)
{
  my %manufacturer2 = (  JA  => "Isuzu",
                         JF  => "Fuji Heavy Industries",
  		         JN  => "Nissan",
                         JS  => "Suzuki",
  		         JT  => "Toyota",
                         JY  => "Yamaha",
                         KL  => "Daewoo/GM Korea",
  		         KN  => "Kia",
  		         UU  => "Dacia",
  		        '1B' => "Dodge",
 		        '1C' => "Chrysler",
 		        '1F' => "Ford",
 		        '1G' => "General Motors",
 		        '1J' => "Jeep",
 		        '1L' => "Lincoln",
 		        '1M' => "Mercury",
 		        '1N' => "Nissan",
		        '2F' => "Ford",
		        '2M' => "Mercury",
		        '2T' => "Toyota",
		        '3F' => "Ford",
		        '3G' => "General Motors",
		        '3N' => "Nissan",
		        '4F' => "Mazda",
		        '4J' => "Mercedes-Benz",
		        '4M' => "Mercury",
		        '4T' => "Toyota",
		        '5L' => "Lincoln",
		        '5T' => "Toyota",
		        '5U' => "BMW",
		        '5X' => "Hyundai/Kia",
		        '55' => "Mercedes-Benz",
		        '6F' => "Ford",
		        '6G' => "General Motors",
		        '6H' => "Holden",
  );

  my %manufacturer3 = (  AAV  => "Volkswagen",
                         AHT  => "Toyota",
                         AFA  => "Ford",
			'1G1' => "Chevrolet",
			'1G3' => "Oldsmobile",
			'1G4' => "Buick",
			'1G9' => "Google",
			'1GB' => "Chevrolet incomplete vehicles",
			'1GC' => "Chevrolet",
			'1GD' => "GMC incomplete vehicles",
			'1GM' => "Pontiac",
			'1HG' => "Honda",
  );

  return %manufacturer3{$wmi} // return %manufacturer2{$wmi.substr(0,2)} // "";
}

multi sub MAIN (VIN $vin)
{
  ...

  say "- Country: $country";

  my $manufacturer = wmi2manufacturer($wmi) || "Not implemented";

  say "- Manufacturer: $manufacturer";
}

Note that we have to quote the keys that doesn't start with a character. Also note the «1G» series, which is assigned to «General Motors», except that some of the values are given to other manufacurers. So we must do the lookup in the three digit table before the two digit one.

The country list is probably complete. The manufacturer list is definitely not complete. The complete WMI list is available, at a cost (see e.g. https://www.iso.org/standard/45844.html or www.sae.org/standards/content/j272_200808/. (If you have more money to spare, consider this one on the VIN system as well: www.sae.org/standards/content/j1044_201207/.)

The Checksum

Cars sold in the USA must adhere to the North American version, with a checksum digit (at position 9; or index 8). That would imply that cars imported from Europe must have compliant VINs. I don't know if the cars for doemstic European usage also are compliant (as that would be easier). Chinese cars also adhere to this standard. I have chosen to make the check mandatory on North American and Chinese cars (and have the program fail if it doesn't match), and just print the check on other cars.

File: vin (partial)
sub verify-checksum(VIN $vin)                       # [1]
{
  my $check  = $vin.substr(8, 1);                   # [1a]
  my $string = $vin.substr(0, 8) ~ $vin.substr(9);  # [1b]

  my %trans =                                       # [2]
    A => 1, B => 2, C => 3, D => 4, E => 5, F => 6, G => 7, H => 8,
    J => 1, K => 2, L => 3, M => 4, N => 5,         P => 7,         Q => 9,
    R => 1, S => 2, T => 3, U => 4, V => 5, W => 6, X => 7, Y => 8, Z => 9,
    1 => 1, 2 => 2, 3 => 3, 4 => 4, 5 => 5, 6 => 6, 7 => 7, 8 => 8, 9 => 9,
    0 => 0
  );

  my @weight = <8 7 6 5 4 3 2 10 9 8 7 6 5 4 3 2>;  # [4]  

  my $sum = 0;

  for ^$string.chars -> $index                                    # [3]
  {
    my $trans = %trans{$string.substr($index,1)} // return False; # [2]
    my $value = $trans * @weight[$index];
    
    $sum += $value;                                               # [5]
  }

  my $got = $sum mod 11;                                          # [6]

  $got = "X" if $got == 10;                                       # [6]

  return $check eq $got;                                          # [7]
}

multi sub MAIN (VIN $vin)
{
  ...

  say "- Manufacturer: $manufacturer";
  say "VDS: $vds";

  if $country eq any <China USA Canada Mexico&ft;
  {
    if verify-checksum($vin)
    {
      say "- North American/China Checksum: OK.";
    }
    else
    {
      say "- Checksum: Failure.";
      exit;
    }
  }
  else
  {
    if verify-checksum($vin)
    {
      say "- Non-Mandatory Checksum verified.";
    }
    else
    {
      say "- Non-Mandatory Checksum failure.";
    }
  }

  say "VIS: $vis";
}

[1] «verify-checksum« gets the VIN number, extracts the checksum digit (with index 8), computes the checksum on the rest of the VIN (after the checksum digit has been removed), and compares the result with the original checksum digit.

[2] The mapping between digits and values for the ckecksum. I have added values for the (numeric) digits to make the lookup easier.

[3] Iterate over the digits, or rather the indices.

[4] The weight of the digits.

[5] Add the value to the total.

[6] Get the value down to a single digit.

[7] Is it ok?

The Year

According to the illustration in the wikipedia article, only North American manufacturers use a year character (at position 10; index 9), but the text says that it applies to all. So I have chosen to trust the text.

File: vin (partial)
sub vin2year(VIN $vin)
{
  my %year =
  (
    A => "1980,2010", B => "1981,2011", C => "1982,2012", D => "1983,2013",
    E => "1984,2014", F => "1985,2015", G => "1986,2016", H => "1987,2017",
    J => "1988,2018", K => "1989,2019", L => "1990,2020", M => "1991,2021",
    N => "1992,2022", P => "1993,2023", R => "1994,2024", S => "1995,2025",
    T => "1996,2026", U => "1997,2027", V => "1998,2028", W => "1999,2029",
    1 => "2001,2031", 2 => "2002,2032", 3 => "2003,2033", 4 => "2004,2034",
    5 => "2005,2035", 6 => "2006,2036", 7 => "2007,2037", 8 => "2008,2038",
    9 => "2009,2039", 0 => "1980"
  );

  return %year{$vin.substr(9,1)} // "";  # [1]
}

multi sub MAIN (VIN $vin)
{
  ...

  say "VIS: $vis";

  my $year = vin2year($vin);
  
  unless $year                           # [1b]
  {
    say "Not a valid year";
    exit;
  }

  say "- Year: $year";
}

[1] All there is to this is extracting the 10th digit (with index 9), looking it up in the year table (hash) to get the year. Illegal characters are handled by returning nothing. The program then writes a warning and aborts (1a).

Note the duplicate year values. This implies that a car from e.g. 1980 and 2010 can have exactly the same VIN number. The North American standard has tried to remedy this, but only for certain vehicle types. I have just ignored it, presenting both years.

The complete program

I have not modified the lines to fit the article width.

File: vin
my regex VINCHAR { A | B | C | D | E | F | G | H | J | K | L | M | N | P | R | S | 
                   T | U | V | W | X | Y | Z | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 0 };

my @vinchar = <A B C D E F G H J K L M N P R S T U V W X Y Z 1 2 3 4 5 6 7 8 9 0>;

subset VIN of Str where * ~~ /^ <VINCHAR> ** 17 $/;
subset WMI of Str where * ~~ /^ <VINCHAR> **  3 $/;

my %country1 = ( J => "Japan", L => "China", 1 => "USA",       2 => "Canada",
                 4 => "USA",   5 => "USA",   6 => "Australia", 7 => "New Zealand",
		 W => "Germany");

my %country2;

add-country("AA", "AH", "South Africa");
add-country("AJ", "AN", "Cote d'Ivoire");
add-country("BA", "BE", "Angola");
add-country("BF", "BK", "Kenya");
add-country("BL", "BR", "Tanzania");
add-country("CA", "CE", "Benin");
add-country("CF", "CK", "Madagascar");
add-country("CL", "CR", "Tunisia");
add-country("DA", "DE", "Egypt");
add-country("DF", "DK", "Morocco");
add-country("DL", "DR", "Zambia");
add-country("EA", "EE", "Ethiopia");
add-country("EF", "EK", "Mozambique");
add-country("FA", "FE", "Ghana");
add-country("FF", "FK", "Nigeria");
add-country("KA", "KE", "Sri Lanka");
add-country("KF", "KK", "Israel");
add-country("KL", "KR", "Korea (South)");
add-country("KS", "K0", "Kazakhstan");
add-country("MA", "ME", "India");
add-country("MF", "MK", "Indonesia");
add-country("ML", "MR", "Thailand");
add-country("MS", "M0", "Myanmar");
add-country("NA", "NE", "Iran");
add-country("NF", "NK", "Pakistan");
add-country("NL", "NR", "Turkey");
add-country("PA", "PE", "Philippines");
add-country("PF", "PK", "Singapore");
add-country("PL", "PR", "Malaysia");
add-country("RA", "RE", "United Arab Emirates");
add-country("RF", "RK", "Taiwan");
add-country("RL", "RR", "Vietnam");
add-country("RS", "R0", "Saudi Arabia");
add-country("SA", "SM", "United Kingdom");
add-country("SN", "ST", "East Germany");
add-country("SU", "SZ", "Poland");
add-country("S1", "S4", "Latvia");
add-country("TA", "TH", "Switzerland");
add-country("TJ", "TP", "Czech Republic");
add-country("TR", "TV", "Hungary");
add-country("TW", "T1", "Portugal");
add-country("UH", "UM", "Denmark");
add-country("UN", "UT", "Ireland");
add-country("UU", "UZ", "Romania");
add-country("U5", "U7", "Slovakia");
add-country("VA", "VE", "Austria");
add-country("VF", "VR", "France");
add-country("VS", "VW", "Spain");
add-country("VX", "V2", "Serbia");
add-country("V3", "V5", "Croatia");
add-country("V6", "V0", "Estonia");
add-country("XA", "XE", "Bulgaria");
add-country("XF", "XK", "Greece");
add-country("XL", "XR", "Netherlands");
add-country("XS", "XW", "Russia");
add-country("XX", "X2", "Luxembourg");
add-country("X3", "X0", "Russia");
add-country("YA", "YE", "Belgium");
add-country("YF", "YK", "Finland");
add-country("YL", "YR", "Malta");
add-country("YS", "YW", "Sweden");
add-country("YX", "Y2", "Norway");
add-country("Y3", "Y5", "Belarus");
add-country("Y6", "Y0", "Ukraine");
add-country("ZA", "ZR", "Italy");
add-country("ZX", "Z2", "Slovenia");
add-country("Z3", "Z5", "Lithuania");
add-country("3A", "3W", "Mexico");
add-country("3X", "37", "Costa Rica");
add-country("38", "39", "Cayman Islands");
add-country("8A", "8E", "Argentina");
add-country("8F", "8K", "Chile");
add-country("8L", "8R", "Ecuador");
add-country("8S", "8W", "Peru");
add-country("8X", "82", "Venezuela");
add-country("9A", "9E", "Brazil");
add-country("9F", "9K", "Colombia");
add-country("9L", "9R", "Paraguay");
add-country("9S", "9W", "Uruguay");
add-country("9X", "92", "Trinidad & Tobago");
add-country("93", "99", "Brazil");

sub add-country($from, $to, $name)
{
  my ($first, $second) = $from.comb;

  loop
  {
    %country2{$first ~ $second} = $name;

    last if "$first$second" eq $to;

    if $second eq "Z"
    {
      $second = "1";
    }
    elsif $second eq "9"
    {
      $second = "0";
    }
    elsif $second eq "0"
    {
      die "Not possible to increment past { $first }0. Set up two rules.";
    }
    else
    {
      repeat { $second.=succ } until $second eq any @vinchar;
    }
  }
}

sub wmi2manufacturer( WMI $wmi)
{
  my %manufacturer2 = (  JA  => "Isuzu",
                         JF  => "Fuji Heavy Industries",
  		         JN  => "Nissan",
                         JS  => "Suzuki",
  		         JT  => "Toyota",
                         JY  => "Yamaha",
                         KL  => "Daewoo/GM Korea",
  		         KN  => "Kia",
  		         UU  => "Dacia",
  		        '1B' => "Dodge",
 		        '1C' => "Chrysler",
 		        '1F' => "Ford",
 		        '1G' => "General Motors",
 		        '1J' => "Jeep",
 		        '1L' => "Lincoln",
 		        '1M' => "Mercury",
 		        '1N' => "Nissan",
		        '2F' => "Ford",
		        '2M' => "Mercury",
		        '2T' => "Toyota",
		        '3F' => "Ford",
		        '3G' => "General Motors",
		        '3N' => "Nissan",
		        '4F' => "Mazda",
		        '4J' => "Mercedes-Benz",
		        '4M' => "Mercury",
		        '4T' => "Toyota",
		        '5L' => "Lincoln",
		        '5T' => "Toyota",
		        '5U' => "BMW",
		        '5X' => "Hyundai/Kia",
		        '55' => "Mercedes-Benz",
		        '6F' => "Ford",
		        '6G' => "General Motors",
		        '6H' => "Holden",
  );

  my %manufacturer3 = (  AAV  => "Volkswagen",
                         AHT  => "Toyota",
                         AFA  => "Ford",
			'1G1' => "Chevrolet",
			'1G3' => "Oldsmobile",
			'1G4' => "Buick",
			'1G9' => "Google",
			'1GB' => "Chevrolet incomplete vehicles",
			'1GC' => "Chevrolet",
			'1GD' => "GMC incomplete vehicles",
			'1GM' => "Pontiac",
			'1HG' => "Honda",
  );

  return %manufacturer3{$wmi} // return %manufacturer2{$wmi.substr(0,2)} // "";
}

sub vin2year(VIN $vin)
{
  my %year = ( A => "1980,2010", B => "1981,2011", C => "1982,2012", D => "1983,2013",
               E => "1984,2014", F => "1985,2015", G => "1986,2016", H => "1987,2017",
	       J => "1988,2018", K => "1989,2019", L => "1990,2020", M => "1991,2021",
	       N => "1992,2022", P => "1993,2023", R => "1994,2024", S => "1995,2025",
	       T => "1996,2026", U => "1997,2027", V => "1998,2028", W => "1999,2029",
	       1 => "2001,2031", 2 => "2002,2032", 3 => "2003,2033", 4 => "2004,2034",
	       5 => "2005,2035", 6 => "2006,2036", 7 => "2007,2037", 8 => "2008,2038",
	       9 => "2009,2039", 0 => "1980");

  return %year{$vin.substr(9,1)} // "";
}

sub wmi2country (WMI $wmi)
{
  return %country1{$wmi.substr(0,1)} if $wmi.substr(0,1) eq any %country1.keys;
  return %country2{$wmi.substr(0,2)} if $wmi.substr(0,2) eq any %country2.keys;
  return;
}

sub verify-checksum(VIN $vin)
{
  my $check  = $vin.substr(8, 1);
  my $string = $vin.substr(0, 8) ~ $vin.substr(9);

  my %trans = ( A => 1, B => 2, C => 3, D => 4, E => 5, F => 6, G => 7, H => 8,
                J => 1, K => 2, L => 3, M => 4, N => 5,         P => 7,         Q => 9,
                R => 1, S => 2, T => 3, U => 4, V => 5, W => 6, X => 7, Y => 8, Z => 9,
                1 => 1, 2 => 2, 3 => 3, 4 => 4, 5 => 5, 6 => 6, 7 => 7, 8 => 8, 9 => 9, 0 => 0);

  my @weight = <8 7 6 5 4 3 2 10 9 8 7 6 5 4 3 2>;

  my $sum = 0;

  for ^$string.chars -> $index
  {
    my $trans = %trans{$string.substr($index,1)} // return False;
    my $value = $trans * @weight[$index];
    
    $sum += $value;
  }

  my $got = $sum mod 11;

  $got = "X" if $got == 10;

  return $check eq $got;  
}


multi sub MAIN (VIN $vin)
{
  say "Looks like a legal VIN. Checking if it is valid..";

  my $wmi = $vin.substr(0,3); # World Manufacturer Identifier
  my $vds = $vin.substr(3,6); # Vehicle Descriptor Section
  my $vis = $vin.substr(9,8); # Vehicle Identifier Section

  say "WMI: $wmi";

  my $country = wmi2country($wmi);

  unless $country
  {
    say "- Not a valid country";
    exit;
  }
  
  say "- Country: $country";
  
  my $manufacturer = wmi2manufacturer($wmi) || "Not implemented";

  say "- Manufacturer: $manufacturer";
  say "VDS: $vds";

  if $country eq any <China USA Canada Mexico>
  {
    if verify-checksum($vin)
    {
      say "- North American/China Checksum: OK.";
    }
    else
    {
      say "- Checksum: Failure.";
      exit;
    }
  }
  else
  {
    if verify-checksum($vin)
    {
      say "- Non-Mandatory Checksum verified.";
    }
    else
    {
      say "- Non-Mandatory Checksum failure.";
    }
  }

  say "VIS: $vis";
  
  my $year = vin2year($vin);
  
  unless $year
  {
    say "Not a valid year";
    exit;
  }

  say "- Year: $year";
}

multi sub MAIN (Str $vin)
{
  say "Not a legal VIN";
  say " - contains illegal character(s)" unless $vin ~~ /^ <VINCHAR> + $/;
  say " - wrong length ({ $vin.chars } instead of 17)" unless $vin.chars == 17;
}   

Challenge 36.2

Write a program to solve Knapsack Problem.

There are 5 color coded boxes with varying weights and amounts in GBP. Which boxes should be choosen to maximize the amount of money while still keeping the overall weight under or equal to 15 kgs?

R: (weight = 1 kg, amount = £1)
B: (weight = 1 kg, amount = £2)
G: (weight = 2 kg, amount = £2)
Y: (weight = 12 kg, amount = £4)
P: (weight = 4 kg, amount = £10)

Bonus task, what if you were allowed to pick only 2 boxes or 3 boxes or 4 boxes? Find out which combination of boxes is the most optimal?

There is a Wikipedia article about this problem, but it doesn't add anything not given in the challenge (except presenting several variations of this problme, and naming this one as the «0-1 knapsack problem» (as each box can only be present 0 or 1 time(s).

Also note that the colour codes are not explained. «RGB» does look familiar, so we can deduce that R=Red, B=Blue, G=Green. From «CMYK» we can deduce that the Y=Yellow. I can only guess at P; either Pink or Purple. The Wikipedia article has a colour coded illustration, but it doesn't help us at it has both a Gray and a Green box. (But the actual colour doesn't really matter.)

This seems more like a logical problem than a programming challenge, so I'll have a go at it.

a/ The Weight

The outlier is the Y box, with a very high weight and a not so high value. Either it is part of the solution (in), or it isn't (out). Let us take a closer look:

Y is In: Y weights 12 kg, so we have place for an additional 3 kg. We get at 3 kg by choosing G (at 2 kg) and either R or B (both at 1 kg). B has the highest value (£2 as opposed to £1 for R), so we choose that. This gives the content Y,G,B with weight 15 kg and value £ 8.

Y is Out: The total weight of the rest (R,B,G,P) is 8 kg, so we can include them all. The value is £ 15. That is way more than £ 8, so this is the solution.

b/ The Value

We could have looked at the values instead. The value of P is £ 10, which is more than the sum of all the others (which is £ 9). So P must be in the result. (And the weight is lesser than the total, so it fits.) That gives us 4 kg, with place for an additional 11 kg. Y is too heavy (at 12kg), but the rest (R,G,B) fits at 4 kg in total. That gives the same result as above; R,B,G,Y at 8 kg and value £ 15.

Conclusion

Choose boxes R,B,G,Y at 8 kg and value £ 15.

Bonus Tasks

The highest value will always include box P. Box Y is not part of the solution, as the total weight would exceed 15 kg (as stated in "The Value" section). This bonus task is (also) a logical problem, and a table is a suitable answer:

CountBoxesWeight   Value   Comment
1P4 kg£ 10
2P,B5 kg£ 12[1]
3P,G,B7 kg£ 14
4P,G,B,R   8 kg£ 15[2]
5---[3]

[1] We could have chosen G instead of B; same value, higher weight. I presume that lower weight is a good thing.

[2] The same as the main challenge.

[3] Not possible.

A Programming Session

That did feel like cheating, so I&apos:ll write a program anyway...

File: knapsack-simple (partial)
unit sub MAIN (:$verbose);                                # [1]

my %weight = (R => 1, B => 1, G => 2, Y => 12, P =>  4);  # [2]
my %value  = (R => 1, B => 2, G => 2, Y =>  4, P => 10);  # [3]

constant $maxweight = 15;                                 # [4]

my @boxes = %weight.keys.sort;                            # [5]

say @boxes.combinations if $verbose;                      # [6]

[1] Debug (or verbose) output is a good idea, and this flag ensures that it can live on in production code.

[2] We store the wights (in kg) in this hash,

[3] and the values (in £) in this one.

[4] The max weight. I have declared it as a constant, as it is constant.

[5] The boxes (as a single character), in sorted order.

[6] «combinations» does the trick, hopefully...

Running it:

$ raku knapsack-simple --verbose
(()
(B) (G) (P) (R) (Y) (B G)
(B P) (B R) (B Y) (G P) (G R) (G Y) (P R) (P Y) (R Y)
(B G P) (B G R) (B G Y) (B P R) (B P Y) (B R Y) (G P R) (G P Y) (G R Y) (P R Y)
(B G P R) (B G P Y) (B G R Y) (B P R Y) (G P R Y)
(B G P R Y))

I have added newlines to make the output fit the screen. «combinations» is indeed doing what we want, but the first sublist is empty as «combinations» considers that a valid answer. We'll get rid of that shortly.

See docs.raku.org/routine/combinations for more information about «combinations».

I coud have made a «KnapsackBox» class, with weight and value attributes, but the two hashes work just fine in this program.

File: knapsack-simple (partial)
my %w;  # [1]
my %v;  # [2]

for @boxes.combinations.grep(*.elems) -> @list         # [3]
{
  my $weight = @list.map({ %weight{$_} }).sum;         # [4]
  my $value  = @list.map({ %value{$_}  }).sum;         # [5]
  my $key    = @list.join;                             # [6]

  if $weight <= $maxweight                             # [7]
  {
    %w{$key} = $weight;                                # [7]
    %v{$key} = $value;                                 # [7]
    
    say "{ @list } -> $weight kg -> £ $value" if $verbose;                 # [8]
  }
  elsif $verbose
  {
    say "{ @list } -> $weight kg -> £ $value (> $maxweight kg; ignored)";  # [9]
  }
}

[1] We keep the weight (in kg) of the candiates here.

[2] We keep the value (in £) of the candidates here.

[3] Iterate over the items (or sublists), but get rid of the empty list (by only selecting non-empty sublists). Note that we could have used an array slice here instead of «grep», as we know that it is the very first item that should go away; «@boxes.combinations[1 .. *] -> @list». I'll keep the «grep», as it better explains what is going on (and it fits in with a later extension).

[4] The weight of the current sublist is the sum of the weight of all the itmes.

[5] The value of the current sublist is the sum of the values of all the itmes.

[6] Get the key by joining the values.

[7] If the weight is within the limit (15 kg in this case), save the weight and value.

[8] Verbose output for values within the limit,

[9] and for values exceeding the limit.

Running it:

$ raku knapsack-simple --verbose
B -> 1 kg -> £ 2
G -> 2 kg -> £ 2
P -> 4 kg -> £ 10
R -> 1 kg -> £ 1
Y -> 12 kg -> £ 4
B G -> 3 kg -> £ 4
B P -> 5 kg -> £ 12
B R -> 2 kg -> £ 3
B Y -> 13 kg -> £ 6
G P -> 6 kg -> £ 12
G R -> 3 kg -> £ 3
G Y -> 14 kg -> £ 6
P R -> 5 kg -> £ 11
P Y -> 16 kg -> £ 14 (> 15 kg; ignored) 
R Y -> 13 kg -> £ 5
B G P -> 7 kg -> £ 14
B G R -> 4 kg -> £ 5
B G Y -> 15 kg -> £ 8
B P R -> 6 kg -> £ 13
B P Y -> 17 kg -> £ 16 (> 15 kg; ignored) 
B R Y -> 14 kg -> £ 7
G P R -> 7 kg -> £ 13
G P Y -> 18 kg -> £ 16 (> 15 kg; ignored) 
G R Y -> 15 kg -> £ 7
P R Y -> 17 kg -> £ 15 (> 15 kg; ignored) 
B G P R -> 8 kg -> £ 15
B G P Y -> 19 kg -> £ 18 (> 15 kg; ignored) 
B G R Y -> 16 kg -> £ 9 (> 15 kg; ignored) 
B P R Y -> 18 kg -> £ 17 (> 15 kg; ignored) 
G P R Y -> 19 kg -> £ 17 (> 15 kg; ignored) 
B G P R Y -> 20 kg -> £ 19 (> 15 kg; ignored) 

Now we have all the candidates (as those that exceed the weight limit has been discarded), with the weight (in «%w») and value (in «%v») File: knapsack-simple (partial)

my $max = %v.values.max;                             # [1]

say "Highest value: £ $max" if $verbose;

my @solutions = %w.keys.grep( { %v{$_} == $max } );  # [2]

[1] Get the highest value (in £).

[2] Select the boxes with exactly that value. Note that there can be more than one.

Running it:

$ raku knapsack-simple --verbose
Highest value: £ 15
File: knapsack-simple (partial)
my $min = @solutions.map( { %w{$_} } ).min;  # [1]

say "Lowest weight: $min kg" if $verbose;

[1] We can have more than one solutions, with the same or different weight. I assume that a lighter knapsack is a good thing, so choose the lowest weight.

Running it:

$ raku knapsack-simple --verbose
Lowest weight: 8 kg
File: knapsack-simple (partial)
for @solutions -> $solution                                 # [1]
{
  say "{ $solution.comb.join(",") }: { %w{$solution} } kg "
    . "at £ { %v{$solution} }." if %w{$solution} == $min;   # [2]
}

[1] We start with all the possible solutions,

[2] and print the current one if it has the lowest weight.

Running it:

$ raku knapsack-simple
B,G,P,R: 8 kg at £ 15.

Bonus tasks

Adding a limit of the number of boxes to use is quite easy:

File: knapsack-limit (changes only)
subset Positive of Int where * > 0;                              # [2]

unit sub MAIN (:$verbose, Positive :$boxcount = 1000000);        # [1]

for @boxes.combinations.grep(0 < *.elems <= $maxcount) -> @list  # [3]

[1] I have added a named parameter to override the maximum number of boxes. Note the default value, an integer that is larger than the number of boxes specified. (Even if we were to add a lot of additional boxes to the program.)

[2] A have set up a custom type for this value; a positive integer not including zero.

[3] Note the double comparison inside the «grep»; both a lower and an upper limit.

See docs.raku.org/language/typesystem#index-entry-subset-subset for more information about «subset».

Running it:

$ raku knapsack-limit --maxbox=5
B,G,P,R: 8 kg at £ 15.

$ raku knapsack-limit --maxbox=4
B,G,P,R: 8 kg at £ 15.

$ raku knapsack-limit --maxbox=3
B,G,P: 7 kg at £ 14.

$ raku knapsack-limit --maxbox=2
B,P: 5 kg at £ 12.

$ raku knapsack-limit --maxbox=1
P: 4 kg at £ 10.

Why Stop There?

We can add support for a user specified weight limit:

File: knapsack-weight (changes only)
unit sub MAIN (:$verbose, Positive
               :$boxcount = 1000000, Positive :$maxweight = 15); # [1]

[1] I have kept 15 as the default value, so that the program behaves accoring to the challenge when we run it without any arguments.

Remove the «constant $boxcount» line.

Running it:

$ raku knapsack-weight --maxweight=1
B: 1 kg at £ 2.

$ raku knapsack-weight --maxweight=2
B,R: 2 kg at £ 3.

$ raku knapsack-weight --maxweight=3
B,G: 3 kg at £ 4.

$ raku knapsack-weight --maxweight=4
P: 4 kg at £ 10.

$ raku knapsack-weight --maxweight=5
B,P: 5 kg at £ 12.

$ raku knapsack-weight --maxweight=6
B,P,R: 6 kg at £ 13.

$ raku knapsack-weight --maxweight=7
B,G,P: 7 kg at £ 14.

$ raku knapsack-weight --maxweight=8
B,G,P,R: 8 kg at £ 15.

$ raku knapsack-weight --maxweight=9
B,G,P,R: 8 kg at £ 15.

$ raku knapsack-weight --maxweight=10
B,G,P,R: 8 kg at £ 15.

$ raku knapsack-weight --maxweight=11
B,G,P,R: 8 kg at £ 15.

$ raku knapsack-weight --maxweight=12
B,G,P,R: 8 kg at £ 15.

$ raku knapsack-weight --maxweight=13
B,G,P,R: 8 kg at £ 15.

$ raku knapsack-weight --maxweight=14
B,G,P,R: 8 kg at £ 15.

$ raku knapsack-weight --maxweight=15
B,G,P,R: 8 kg at £ 15.

$ raku knapsack-weight --maxweight=16
B,G,P,R: 8 kg at £ 15.

$ raku knapsack-weight --maxweight=17
B,P,Y: 17 kg at £ 16.

$ raku knapsack-weight --maxweight=18
B,P,R,Y: 18 kg at £ 17.

$ raku knapsack-weight --maxweight=19
B,G,P,Y: 19 kg at £ 18.

$ raku knapsack-weight --maxweight=20
B,G,P,R,Y: 20 kg at £ 19.

We can also add support for additional boxes (or redefining existing ones):

File: knapsack-turbo (changes only)
unit sub MAIN (*@custom, Bool :$clear, :$verbose,  # [1]
               Positive :$boxcount = 1000000, Positive :$maxweight = 15);

my %weight; %weight = (R => 1, B => 1, G => 2, Y => 12, P =>  4) unless $clear;
my %value;  %value  = (R => 1, B => 2, G => 2, Y =>  4, P => 10) unless $clear;
     # [2]

if @custom   # [3]
{
  for @custom -> $current
  {
    if $current ~~ /^(<upper>)w(\d+)v(\d+)$/
    {
      %weight{$0} = $1.Int;
      %value{$0}  = $2.Int;
      say "Added box $0 with weight $1 kg and value £ $2" if $verbose;
    }
  }
}

my @boxes = %weight.keys.sort;

[1] Add new (or custom) boxes on the command line, on the form «Xw1v2» (where «X» is a single uppercase letter, «w1» is the weight (in this case 1 kg), and «v2» is the value (in this case £ 2).) Use the «--clear» command line option if you don't want to keep the default boxes.

[2] The default weights and values, unless «--clear» has been specified.

[3] Iterate throught the new boxes, and add them to the data structures.

Note that the weight must come before the value. (It is easy to add support for the reverse order as well, and I'll get back to that later.)

Running it (with an additional box Q with weight 5 kg and value £ 16):

$ raku knapsack-turbo Qw5v16
B,G,P,Q,R: 13 kg at £ 31.
$ raku knapsack-turbo Qw5v16 Zw2v10
B,G,P,Q,R,Z: 15 kg at £ 41.

We can redefine an existing box (changing the Q box to weight 5 kg and value £ 16):

$ raku knapsack-turbo Qw5v16
B,G,P,Q,R: 13 kg at £ 31.

And finally, to show that the program supports more than one solution:

$ raku knapsack-turbo Sw4v10 Ww4v10 Zw4v10
B,G,P,S,W: 15 kg at £ 34.
B,G,P,S,Z: 15 kg at £ 34.
B,G,P,W,Z: 15 kg at £ 34.
B,G,S,W,Z: 15 kg at £ 34.

Four choises that are equally good.

With verbose mode:

$ raku knapsack-turbo --clear --verbose Sw4v10 Ww4v10 Zw4v10
Added box S with weight 4 kg and value £ 10
Added box W with weight 4 kg and value £ 10
Added box Z with weight 4 kg and value £ 10
(() (S) (W) (Z) (S W) (S Z) (W Z) (S W Z))
S -> 4 kg -> £ 10
W -> 4 kg -> £ 10
Z -> 4 kg -> £ 10
S W -> 8 kg -> £ 20
S Z -> 8 kg -> £ 20
W Z -> 8 kg -> £ 20
S W Z -> 12 kg -> £ 30
Highest value: £ 30
Lowest weight: 12 kg
S,W,Z: 12 kg at £ 30.

With a lower weight limit (10 kg):

$ raku knapsack-turbo --clear --verbose --maxweight=10 Sw4v10 Ww4v10 Zw4v10
Added box S with weight 4 kg and value £ 10
Added box W with weight 4 kg and value £ 10
Added box Z with weight 4 kg and value £ 10
(() (S) (W) (Z) (S W) (S Z) (W Z) (S W Z))
S -> 4 kg -> £ 10
W -> 4 kg -> £ 10
Z -> 4 kg -> £ 10
S W -> 8 kg -> £ 20
S Z -> 8 kg -> £ 20
W Z -> 8 kg -> £ 20
S W Z -> 12 kg -> £ 30 (> 10 kg; ignored)
Highest value: £ 20
Lowest weight: 8 kg
S,Z: 8 kg at £ 20.
S,W: 8 kg at £ 20.
W,Z: 8 kg at £ 20.

Wikipedia Bonus

The aforementioned Wikipedia article has one big example with 10 boxes. It doesn't provide an answer, but our program can do that:

$ raku knapsack-turbo --clear --maxweight=67 Aw23v505 Bw26v352 Cw20v458 \
  Dw18v220 Ew32v354 Fw27v498 Gw29v434 Hw26v545 Iw30v473 Jw27v543
A,D,H: 67 kg at £ 1270.

Wikipedia Bonus, Part 2

I have solved the «0-1 knapsack problem», but the Wikipedia article also presents «The bounded knapsack problem» (BKP), where we can have multiple copies of each box. The number of duplicates is specified as an upper limit. And «the unbounded knapsack problem» (UKP), with no upper limit of duplicates.

Implementing this with «combinations» is quite easy, for the bounded version. The unbounded is not doable this way. So here it is, with duplicate support:

File: knapsack (with the changes highlighted)
subset Positive of Int where * > 0;

unit sub MAIN (*@custom, Bool :$clear, :$verbose, Positive :$boxcount = 1000000,
  Positive :$maxweight = 15, Positive :$duplicates = 1); # [1]

my %weight; %weight = (R => 1, B => 1, G => 2, Y => 12, P =>  4) unless $clear;
my %value;  %value  = (R => 1, B => 2, G => 2, Y =>  4, P => 10) unless $clear;

if @custom
{
  for @custom -> $current
  {
    if $current ~~ /^ (<upper>) [ w $<w> = (\d+) v $<v> = (\d+) |  # [2]
                                  v $<v> = (\d+) w $<w> = (\d+) ] $/
    {
      %weight{$0} =  $<w>.Int;
      %value{$0}  =  $<v>.Int;
      say "Added box $0 with weight  $<w> kg and value £  $<v>" if $verbose;
    }
  }
}

my @boxes = %weight.keys.sort;

@boxes = (@boxes xx $duplicates).flat.sort if $duplicates > 1;  # [3]

say @boxes.combinations if $verbose;

my %w;
my %v;

for @boxes.combinations.grep(0 < *.elems <= $boxcount) -> @list
{
  my $key    = @list.join;

  next if %w{$key}.defined; # [4]

  my $weight = @list.map({ %weight{$_} }).sum;
  my $value  = @list.map({ %value{$_}  }).sum;

  if $weight <= $maxweight
  {
    %w{$key} = $weight; 
    %v{$key} = $value;
    
    say "{ @list } -> $weight kg -> £ $value" if $verbose;
  }
  elsif $verbose
  {
    say "{ @list } -> $weight kg -> £ $value (> $maxweight kg; ignored)";
  }
}

my $max = %v.values.max;

say "Highest value: £ $max" if $verbose;

my @solutions = %w.keys.grep( { %v{$_} == $max } );

my $min = @solutions.map( { %w{$_} } ).min;

say "Lowest weight: $min kg" if $verbose;

for @solutions -> $solution
{
  say "{ $solution.comb.join(",") }: { %w{$solution} } kg "
    . "at £ { %v{$solution} }." if %w{$solution} == $min;
}

[1] A new command line argument to set the number of duplicates, with «1» as the default value. Note that the way I use «duplicate» may be misleading, as I mean the the total number of identical boxes. (So «1» means no duplicates.)

[3] I promised to add support for the reverse order of weight and value, and here it is. Note the alteration (with «|) so that the regex matches both variants; e.g. «Qw5v10» and «Qv10w5». The matching order is reversed, so I have used named captures instead.

[3] If we have requested copies of the boxes, copy the list accordingly. The «xx» operator copies the list (on the left) the given number of times (on the right)

[4] Do not compute values we already have. «combinations» does not give duplicates, as shown by the verbose output above, if the elements are unique. Here we have duplicate elements, and the result is duplicate lists. (As «combinatons» works with the positions, and not the actual values. E.g:

> (1,2,3).combinations
(() (1) (2) (3) (1 2) (1 3) (2 3) (1 2 3))

> (1,1,3).combinations
(() (1) (1) (3) (1 1) (1 3) (1 3) (1 1 3))

Note the duplicates.

See docs.raku.org/routine/xx for more information about the «xx» operator.

See docs.raku.org/language/regexes#Named_captures for more information about Named Captures.

Running it:

$ raku knapsack --duplicates=1
B,G,P,R: 8 kg at £ 15.

$ raku knapsack --duplicates=2
B,B,G,G,P,P,R: 15 kg at £ 29.

$ raku knapsack --duplicates=3
B,B,B,P,P,P: 15 kg at £ 36.

$ raku knapsack --duplicates=4
B,B,B,P,P,P: 15 kg at £ 36.

The first one doesn't make sense. The last one took quite some time to finish (about half a minute on my computer), so this does not scale well.

We can count the combinations:

> (<R B G Y P> xx 2).flat.combinations.elems
1024
       
> (<R B G Y P> xx 3).flat.combinations.elems
32768

> (<R B G Y P> xx 4).flat.combinations.elems
1048576

So yes, the increase in time usage makes sense.

We should test that swapping the weight and value works:

$ raku knapsack --verbose Qw10v25 | head -n 1
Added box Q with weight 10 kg and value £ 25

$ raku knapsack --verbose Qv25w10 | head -n 1
Added box Q with weight 10 kg and value £ 25

(It does.)

And that's it.