by Arne Sommer

# Unicode Zip with Raku and Perl

[206] Published 16. October 2022.

This is my response to The Weekly Challenge #186.

## Challenge #186.1: Zip List

You are given two lists `@a` and `@b` of the same size.

Create a subroutine `sub zip(@a, @b)` that merges the two lists as shown in the example below.

Example:

```
Input:  @a = qw/1 2 3/; @b = qw/a b c/;
Output: zip(@a, @b) should return qw/1 a 2 b 3 c/;
        zip(@b, @a) should return qw/a 1 b 2 c 3/;
```

### A Perl Version

Note that Perl flattens arrays passed as arguments into a single list, so I pass array references instead.

File: zip-list-perl

```
#! /usr/bin/env perl

use strict;
use warnings;
use feature 'say';
use feature 'signatures';
no warnings 'experimental::signatures';

my @a = qw/1 2 3/;
my @b = qw/a b c/;

my @zip1 = zip(\@a, \@b);                 # [1]
my @zip2 = zip(\@b, \@a);

say 'qw/' . join(" ", @zip1) . '/;';      # [2]
say 'qw/' . join(" ", @zip2) . '/;';      # [2]

sub zip ($a, $b)                          # [3]
{
  my @return;

  for my $index (0 .. @$a -1)             # [4]
  {
    push(@return, $a->[$index]);          # [5]
    push(@return, $b->[$index]);
  }

  return @return;
}
```

[1] Pass two array references as argument.

[2] The challenge wanted this output, so here it is.

[3] The references are scalar variables.

[4] Iterate over the indices of the first array.

[5] Copy the array element with the current index.

Running it gives exactly the same output as in the example:

```
$ ./zip-list-perl
qw/1 a 2 b 3 c/;
qw/a 1 b 2 c 3/;
```
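The explicit index loop in the Perl version can also be written with `map`, which returns a pair of elements for each index and flattens the pairs into one list. This is only a minimal sketch of an alternative, not a replacement for the program above; the name `zip_map` is made up for the illustration, and it assumes two array references of equal length.

```perl
#! /usr/bin/env perl

use strict;
use warnings;
use feature 'say';

# Hypothetical helper name; expects two array references of equal length.
sub zip_map
{
  my ($first, $second) = @_;

  # For each index, emit one element from each array; map flattens the pairs.
  return map { ($first->[$_], $second->[$_]) } 0 .. $#$first;
}

my @a = qw/1 2 3/;
my @b = qw/a b c/;

say join(" ", zip_map(\@a, \@b));
say join(" ", zip_map(\@b, \@a));
```

The range `0 .. $#$first` is empty for an empty array, so the sketch returns an empty list in that case.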

### The Raku version

Raku has a built-in `zip` function, so we do not actually have to program it:

```
> my @a = <1 2 3>
> my @b = <a b c>
> say zip(@a, @b);  # -> ((1 a) (2 b) (3 c))
```

The result is a list of lists (with one element from each of the input lists), so we have to flatten it to get the desired result:

```
> say zip(@a, @b).flat;  # -> (1 a 2 b 3 c)
```

See docs.raku.org/routine/zip for more information about `zip`.

But the task was to implement it ourselves, so let us have a go at that.

File: zip-list

```
#! /usr/bin/env raku

my @a = qw/1 2 3/;
my @b = qw/a b c/;

my @zip1 = my-zip(@a, @b);
my @zip2 = my-zip(@b, @a);

say "qw/@zip1[]/;";
say "qw/@zip2[]/;";

sub my-zip (@a is copy, @b is copy)       # [1]
{
  my @return;

  while @a
  {
    @return.push: @a.shift;
    @return.push: @b.shift;
  }

  return @return;
}
```

[1] Note the use of `is copy` so that the `shift`s work on a local copy, and not the global variable - which would result in an empty list in «@zip2».

Running it:

```
$ ./zip-list
qw/1 a 2 b 3 c/;
qw/a 1 b 2 c 3/;
```

Looking good.
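The pitfall that `is copy` guards against has a counterpart in Perl, where the arrays arrive as references: shifting through a reference empties the caller's array. The following sketch contrasts the two behaviours. Both helper names (`zip_destructive` and `zip_copy`) are hypothetical and exist only for this illustration.

```perl
#! /usr/bin/env perl

use strict;
use warnings;
use feature 'say';

# Hypothetical helper: shifts directly through the references,
# so the caller's arrays are emptied as a side effect.
sub zip_destructive
{
  my ($first, $second) = @_;
  my @return;

  while (@$first)
  {
    push(@return, shift(@$first), shift(@$second));
  }

  return @return;
}

# Hypothetical helper: copies both arrays first, leaving the caller's data alone.
sub zip_copy
{
  my ($first, $second) = @_;
  my @first  = @$first;
  my @second = @$second;

  return zip_destructive(\@first, \@second);
}

my @a = qw/1 2 3/;
my @b = qw/a b c/;

say join(" ", zip_copy(\@a, \@b));         # @a and @b are untouched
say join(" ", zip_destructive(\@a, \@b));  # @a and @b are emptied by this call
say scalar(@a);                            # number of elements left in @a
```

The local copies in `zip_copy` play the same role as `is copy` in the Raku signature.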

## Challenge #186.2: Unicode Makeover

You are given a string with possible Unicode characters.

Create a subroutine `sub makeover($str)` that replaces the Unicode characters with their ASCII equivalents. For this task, let us assume it only contains alphabetic characters.

Example 1:

```
Input: $str = 'ÃÊÍÒÙ';
Output: 'AEIOU'
```

Example 2:

```
Input: $str = 'âÊíÒÙ';
Output: 'aEiOU'
```

Raku has built-in support for this, with the `samemark` method:

```
> say 'ÃÊÍÒÙ'.samemark('a')
AEIOU
> say 'ÃÊaqøæÍÒÙ'.samemark('a')
AEaqøæIOU
```

See docs.raku.org/routine/samemark for more information about `samemark`.

The argument can either be a string or a single character, as used here. The mark/accent on this character is applied to (a copy of) the string. As we use «a», which does not have any marks, the result is a string devoid of them.

Wrapped up as a program:

File: unicode-makeover

```
#! /usr/bin/env raku

say makeover('ÃÊÍÒÙ');
say makeover('âÊíÒÙ');

sub makeover ($str)
{
  return $str.samemark('a');
}
```

Running it:

```
$ ./unicode-makeover
AEIOU
aEiOU
```

Looking good.

### Perl

Perl does not support this sort of trickery out of the box, but the module Unicode::Normalize is helpful.

File: unicode-makeover-perl

```
#! /usr/bin/env perl

use strict;
use warnings;
use utf8;
use feature 'say';
use feature 'unicode_strings';
use feature 'signatures';
no warnings 'experimental::signatures';

use Unicode::Normalize;

say makeover('ÃÊÍÒÙ');
say makeover('âÊíÒÙ');

sub makeover ($str)
{
  my $nfkd = NFKD($str);                  # [1]
  $nfkd =~ s/\p{NonspacingMark}//g;       # [2]
  return $nfkd;
}
```

[1] Split the characters and marks/accents into separate codepoints.

[2] Remove all the mark/accent codepoints.

Running it gives the same result as the Raku version:

```
$ ./unicode-makeover-perl
AEIOU
aEiOU
```
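To see what the decomposition step in the Perl program does, it can help to list the code points before and after `NFKD`. The sketch below does that for «Ã» (U+00C3); the helper name `codepoints` is made up for the illustration.

```perl
#! /usr/bin/env perl

use strict;
use warnings;
use feature 'say';

use Unicode::Normalize;

# Hypothetical helper: list the code points of a string as U+XXXX labels.
sub codepoints
{
  my ($str) = @_;
  return map { sprintf("U+%04X", ord($_)) } split(//, $str);
}

my $a_tilde = "\N{U+00C3}";                # the precomposed letter Ã

say join(" ", codepoints($a_tilde));       # U+00C3
say join(" ", codepoints(NFKD($a_tilde))); # U+0041 U+0303
```

After decomposition the base letter «A» (U+0041) is followed by the combining tilde (U+0303), which is the code point the `\p{NonspacingMark}` substitution removes.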

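Note that letters without a decomposition, such as «ß», «Ø» and «Æ», pass through unchanged. As the program does not set an output encoding on STDOUT, these non-ASCII characters are written as single Latin-1 bytes, which a UTF-8 terminal shows as unprintable characters: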

```
say makeover('ßØÆÅøæåÖ');  # -> ���A��aO
```
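One possible way to handle such letters is sketched below, under two assumptions that go beyond the challenge: the output is explicitly encoded as UTF-8 with `binmode`, and a small hand-written table supplies ASCII replacements for letters that `NFKD` leaves alone. The table and the name `makeover_fallback` are illustrative only, and the table is far from complete.

```perl
#! /usr/bin/env perl

use strict;
use warnings;
use utf8;
use feature 'say';

use Unicode::Normalize;

# Encode output as UTF-8 so that any leftover non-ASCII characters print correctly.
binmode(STDOUT, ':encoding(UTF-8)');

# Illustrative, incomplete fallback table (an assumption, not part of the challenge).
my %fallback = (
  'ß' => 'ss',
  'Ø' => 'O',
  'ø' => 'o',
  'Æ' => 'AE',
  'æ' => 'ae',
);

# Hypothetical helper name.
sub makeover_fallback
{
  my ($str) = @_;

  my $nfkd = NFKD($str);
  $nfkd =~ s/\p{NonspacingMark}//g;

  # Replace remaining non-ASCII characters if the table knows them;
  # unknown characters are kept as they are.
  $nfkd =~ s{([^\x00-\x7F])}{ $fallback{$1} // $1 }ge;

  return $nfkd;
}

say makeover_fallback('ßØÆÅøæåÖ');  # -> ssOAEAoaeaO
```

Which replacements are appropriate is a language-dependent choice (for instance «Ø» could map to «O» or «OE»), so the table should be treated as an example rather than a standard mapping.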