From 834317151987193ed0470a58dbedc6c0090cadf1 Mon Sep 17 00:00:00 2001 From: Marshall Lochbaum Date: Wed, 23 Sep 2020 23:36:52 -0400 Subject: Add second tutorial, on lists --- docs/tutorial/expression.html | 4 +- docs/tutorial/index.html | 1 + docs/tutorial/list.html | 291 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 294 insertions(+), 2 deletions(-) create mode 100644 docs/tutorial/list.html (limited to 'docs/tutorial') diff --git a/docs/tutorial/expression.html b/docs/tutorial/expression.html index a3e8e3f0..6098e7a4 100644 --- a/docs/tutorial/expression.html +++ b/docs/tutorial/expression.html @@ -127,9 +127,9 @@

Addition and subtraction with affine characters have all the same algebraic properties that they do with numbers. One way to see this is to think of values as a combination of "characterness" (0 for numbers and 1 for characters) and either numeric value or code point. Addition and subtraction are done element-wise on these pairs of numbers, and are allowed if the result is a valid value, that is, its characterness is 0 or 1 and its value is a valid code point if the characterness is 1. However, because the space of values is no longer closed under addition and subtraction, certain rearrangements of valid computations might not be valid, if one of the values produced in the middle isn't legal.

Modifiers

Functions are nice and all, but to really bring us into the space age BQN has a second level of function called modifiers (the space age in this case is when operators were introduced to APL in the early 60s—hey, did you know the second APL conference was held at Goddard Space Flight Center?). While functions apply to subjects, modifiers can apply to functions or subjects, and return functions. For example, the 1-modifier ˜ modifies one function by swapping the arguments before calling it (Swap), or copying the right argument to the left if there's only one (Self).

-↗️
    2 -˜ 'd'  # Subtract from
+↗️
    2 -˜ 'd'  # Subtract from
 'b'
-    +˜ 3  # Add to itself
+    +˜ 3      # Add to itself
 6
 

This gives us two nice ways to square a value:

diff --git a/docs/tutorial/index.html b/docs/tutorial/index.html index b79aafbb..5e66548e 100644 --- a/docs/tutorial/index.html +++ b/docs/tutorial/index.html @@ -10,5 +10,6 @@

The tutorials available so far:

diff --git a/docs/tutorial/list.html b/docs/tutorial/list.html new file mode 100644 index 00000000..bfeb2221 --- /dev/null +++ b/docs/tutorial/list.html @@ -0,0 +1,291 @@ + + + + BQN: Tutorial: Working with lists + + +

Tutorial: Working with lists

+

Enough with all these preliminaries like learning how to read basic expressions. Let's get into what makes BQN special.

+↗️
    1, 2, 3
+⟨ 1 2 3 ⟩
+
+

This is a list. Wait for it…

+↗️
    1, 2, 3 + 1
+⟨ 2 3 4 ⟩
+
+

There we go. Now in BQN arrays are not just lists, which are a 1-dimensional data structure, but can have any number of dimensions. In this tutorial we're going to discuss lists only, leaving the 5-dimensional stuff for later. So we're really only seeing the power of K, an APL-family language that only uses lists (and dictionaries, which BQN doesn't have). K was powerful enough for Arthur Whitney to found two companies and make millions and millions of dollars, and BQN's compiler also runs almost entirely on lists, so this is probably enough power for one webpage.

+

List notation

+

There are three kinds of list notation in BQN. Every one has a subject role, even though expressions used inside it might have other roles. First, a string is a list of characters, and is written by placing those characters in double quotes.

+
"Text!"
+
+

Only one character needs to be escaped to place it in a string: the double quote, which is escaped by writing it twice. Any other character, including a newline, can be placed directly in a string.

+

Second, list notation uses angle brackets ⟨⟩. The elements in the list are kept apart with one of the three separator characters: ,, , and newline. Anything can be used as an element, even a function, or a modifier like . Here's a list containing a number, a 2-modifier, a string, and a non-string list:

+
 π, , "element"  'l',1,5,'t' 
+
+

The two characters , and are completely interchangeable, and newline is nearly the same but has the additional function of ending a comment (I guess now's as good a time as any to officially point out that comments start with # and end with a newline, and have no effect on the program but may in rare cases have an effect on the reader). You can use extra separators anywhere in a list: leading, trailing, and repeated separators are ignored.

+

+  "putting"         # This is a comment
+  "a",              # That , wasn't needed
+  "list"
+                    # An extra newline won't hurt anything
+  "on","multiple"   # Two elements here
+  "lines"
+
+
+

Finally, strand notation is a shortcut for simple lists like a few numbers. It's written with the ligature , which has a higher precedence than either functions or operators. A sequence of values joined with ligatures becomes a list, so that for example the following two expressions are equivalent:

+
2,+,-
+2+-
+
+

Strand notation is shorter and looks less cluttered in this example. As with lists, anything goes in a strand, but if it's the result of a function or operator, or another strand, then it has to be put in parentheses. With one set of parentheses, a strand will be just as long as the equivalent bracketed list, and with two you're better off using the list.

+

A ligature is a kind of notation and doesn't do something specific like a function does. It's the sequence of ligatures that makes whatever they join together into a list. So if we parenthesize either ligature below, we get a different result! Ligatures aren't right-associative or left-associative.

+↗️
    012
+⟨ 0 1 2 ⟩
+    (01)2
+⟨ ⟨ 0 1 ⟩ 2 ⟩
+    0(12)
+⟨ 0 ⟨ 1 2 ⟩ ⟩
+
+

BQN types

+

Now that all six BQN types have been introduced, let's make a table:

+ + + + + + + + + + + + + + + + + + + + + +
DataOperation
NumberFunction
Character1-modifier
Array2-modifier
+

Lists are just one-dimensional arrays. Types are divided into data types, which tend to have a subject role, and operation types, which tend to have a role matching their type. Also, any value that's not an array, such as everything we used in the last tutorial, is called an atom.

+

Arithmetic on lists

+

Arithmetic functions automatically apply to each element of a list argument. If both arguments are lists, they have to have the same length, and they're matched up one element at a time.

+↗️
    ÷ 2,3,4
+⟨ 0.5 0.333333333333333 0.25 ⟩
+
+    "APL" + 1
+"BQM"
+
+    "31415" - '0'
+⟨ 3 1 4 1 5 ⟩
+
+    4321  1234
+⟨ 4 9 8 1 ⟩
+
+

This list application works recursively, so that lists of lists (and so on) are handled as well. We say that arithmetic functions are pervasive. They dig into their arguments until reaching the atoms.

+↗️
    2 × 02  135
+⟨ ⟨ 0 4 ⟩ ⟨ 2 6 10 ⟩ ⟩
+
+     10, 2030  +  12, 3 
+⟨ ⟨ 11 12 ⟩ ⟨ 23 33 ⟩ ⟩
+
+

Some list functions

+

Let's introduce a few primitives to work with lists.

+

Make one or two atom arguments into a list with , pronounced Solo in the one-argument case and Couple in the two-argument case. This might not seem to merit a symbol but there's more to come. Don't call it on lists and ponder the results, igniting a hunger for ever more dimensions.

+↗️
     4
+⟨ 4 ⟩
+
+    2  4
+⟨ 2 4 ⟩
+
+

Concatenate lists with Join To (). The little chain link symbol—technically "inverted lazy S"—is my favorite in BQN. Hook those lists together!

+↗️
    1,2,3  "abc"
+⟨ 1 2 3 'a' 'b' 'c' ⟩
+
+    0  1,2,3
+⟨ 0 1 2 3 ⟩
+
+    "plural"  's'
+"plurals"
+
+

The last two examples show that you can join a list to an atom, making it the first or last element of the result. This is a little suspect because if you decide the data being stored is more complicated and start using a list instead of an atom, then it will no longer be used as a single element but rather a subsection of the result. So I would only use that shortcut for something like a numeric literal that's clearly an atom and will stay that way, and otherwise wrap those atomic arguments in some ⟨⟩ brackets. Join will even work with two atoms, but in that case I'd say it makes more sense to use Couple instead.

+

Reverse () puts the list back to front.

+↗️
     "drawer"
+"reward"
+
+

With a left argument means Rotate instead, and shifts values over by the specified amount, wrapping those that go off the end to the other side. A positive value rotates to the left, and a negative one rotates right.

+↗️
    2  0,1,2,3,4
+⟨ 2 3 4 0 1 ⟩
+    ¯1  "bcdea"
+"abcde"
+
+

…and modifiers

+

The 1-modifier Each (¨) applies its operand to every element of a list argument: it's the same as map in a functional programming language. With two list arguments (which have to have the same length), Each pairs the corresponding elements from each, a bit like a zip function. If one argument is a list and one's an atom, the atom is reused every time instead.

+↗️
    ¨ "abcd""ABCDEF""01"
+⟨ "dcba" "FEDCBA" "10" ⟩
+
+    "string""list""array" ¨ 's'
+⟨ "strings" "lists" "arrays" ⟩
+
+    "abc" ¨  "abc"
+⟨ "ac" "bb" "ca" ⟩
+
+

Fold (´) is the higher-order function also known as reduce or accumulate. It applies its operand function between each pair of elements in a list argument. For example, +´ gives the sum of a list and ×´ gives its product.

+↗️
    +´ 234
+9
+    ×´ 234
+24
+
+

To match the order of BQN evaluation, Fold moves over its argument array from right to left. You'd get the same result by writing the operand function in between each element of the argument list, but you'd also write the function a lot of times.

+↗️
    -´ 12345
+3
+    1-2-3-4-5
+3
+
+

With this evaluation order, -´ gives the alternating sum of its argument. Think of it this way: the left argument of each - is a single number, while the right argument is made up of all the numbers to the right subtracted together. So each - flips the sign of every number to its right, and every number is negated by all the -s to its left. The first number (1 above) never gets negated, the second is negated once, the third is negated twice, returning it to its original value… the signs alternate.

+

Hey, isn't it dissonant that the first, second, and third numbers are negated zero, one, and two times? If they were the zeroth, first, and second it wouldn't be…

+

You can fold with the Join To function to join several lists together:

+↗️
    ´  "con", "cat", "enat", "e" 
+"concatenate"
+
+

But you shouldn't! Just will do the job for you—with no left argument it's just called "Join" (it's like Javascript's .join(), but with no separator and not specific to strings). And it could do more jobs if you had more dimensions. But I'm sure that's the furthest thing from your mind.

+↗️
      "con", "cat", "enat", "e" 
+"concatenate"
+
+

Example: base decoding

+

Some people like to imagine that robots or other techno-beings speak entirely in binary-encoded ASCII, like for instance "01001110 01100101 01110010 01100100 00100001". This is dumb for a lot of reasons, and the encoded text probably just says something inane, but you're a slave to curiosity and can't ignore it. Are one and a half tutorials of BQN enough to clear your conscience?

+

Almost. It's really close. There are just two things missing, so I'll cover those and can we agree one and three-quarters is pretty good? First is Range (), which is called on a number to give all the natural numbers less than it:

+↗️
     8
+⟨ 0 1 2 3 4 5 6 7 ⟩
+
+

Natural numbers in BQN start at 0. I'll get to the second function in a moment, but first let's consider how we'd decode just one number in binary. I'll pick a smaller one: 9 is 1001 in binary. Like the first 1 in decimal 1001 counts for one thousand or 103, the first one in binary 1001 counts for 8, which is 23. We can put each number next to its place value like this:

+↗️
    8421 ¨ 1001
+⟨ ⟨ 8 1 ⟩ ⟨ 4 0 ⟩ ⟨ 2 0 ⟩ ⟨ 1 1 ⟩ ⟩
+
+

To get the value we multiply each number by its place value and then add them up.

+↗️
    +´ 8421 × 1001
+9
+
+

Now we'd like to generate that list 8421 instead of writing it out, particularly because it needs to be twice as long to decode eight-bit ASCII characters (where the first bit is always zero come on robots would never use such an inefficient format). It's the first four powers of two, or two to the power of the first four natural numbers, in reverse. While we're at it, let's get 1001 from "1001" by subtracting '0'. Nice.

+↗️
    2  4
+⟨ 1 2 4 8 ⟩
+
+    2⋆↕4
+⟨ 8 4 2 1 ⟩
+
+    (2⋆↕4) × "1001"-'0'
+⟨ 8 0 0 1 ⟩
+
+    +´ (2⋆↕4) × "1001"-'0'
+9
+
+

Lot of functions up there. Notice how I need to use parentheses for the left argument of a function if it's compound, but never for the right argument, and consequently never with a one-argument function.

+

Representing our ASCII statement as a list of lists, we convert each digit to a number as before:

+↗️
    '0' -˜ "01001110""01100101""01110010""01100100""00100001"
+⟨ ⟨ 0 1 0 0 1 1 1 0 ⟩ ⟨ 0 1 1 0 0 1 0 1 ⟩ ⟨ 0 1 1 1 0 0 1 0 ⟩ ⟨ 0 1 1 0 0 1 0 0 ⟩ ⟨ 0 0 1 0 0 0 0 1 ⟩ ⟩
+
+

Now we need to multiply each digit by the right place value, and add them up. The adding part is easy, just requiring an Each.

+↗️
    +´¨ '0' -˜ "01001110""01100101""01110010""01100100""00100001"
+⟨ 4 4 4 3 2 ⟩
+
+

Multiplication is harder, and if we try to multiply by the place value list directly it doesn't go so well.

+↗️
    (2⋆↕8) × '0' -˜ "01001110""01100101""01110010""01100100""00100001"
+ERROR
+

This is because the list on the left has length 8 while the list on the right has length 5. The elements of the list on the right have length 8, but BQN can't be expected to know you want to connect the two arguments in that particular way. Especially considering that if you happen to have 8 characters then the right argument will have length 8!

+

There are a few ways to handle this. What we'll do is bind the place values to × using the 2-modifier . This modifier attaches a left argument to a function.

+↗️
    "ab" ¨  "cd", "ut" 
+⟨ "acd" "but" ⟩
+
+    "ab"¨  "cd", "ut" 
+⟨ "abcd" "abut" ⟩
+
+

In the first bit of code above, ¨ matches up its left and right arguments. In the second, we bind "ab" to first—remember that modifiers associate from left to right, so that "ab"¨ and ("ab")¨ are the same—and Each only sees one argument. "ab", packed inside Each's operand, is reused each time. The same principle applies to our binary problem:

+↗️
    +´¨ (2⋆↕8)ר '0' -˜ "01001110""01100101""01110010""01100100""00100001"
+⟨ 78 101 114 100 33 ⟩
+
+

To wrap things up, we just convert from numbers to characters by adding the null character.

+↗️
    @ + +´¨ (2⋆↕8)ר '0' -˜ "01001110""01100101""01110010""01100100""00100001"
+"Nerd!"
+
+

Was it as anticlimactic as you'd hoped? In fact there's a simpler way to do the base decoding as well, using 's mirror image in a different way. We'll discuss that in the next tutorial!

+↗️
    +´ (2⋆↕4) × "1001"-'0'
+9
+    +(+˜)´  "1001"-'0'
+9
+
+

Summary

+

There are three types of syntax that create lists: the "string literal" for lists of characters and either enclosing angle brackets ⟨⟩ with , or or newline characters or connecting ligatures for lists with arbitrary elements. The ligature has a higher precedence than functions or modifiers, so we should add to to our precedence table:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
PrecedenceRoleAssociativity
0()⟨⟩(none)
1Non-binary
2ModifierLeft-to-right
3FunctionRight-to-left
+

The elements of a list can have any syntactic role; it's ignored and the list as a whole is a subject.

+

We introduced a few new primitives. The links below go to the full documentation pages for them.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Glyph1 arg2 args
JoinJoin To
SoloCouple
ReverseRotate
¨EachEach
´Fold
+

Additionally, we saw that the arithmetic functions work naturally on lists, automatically applying to every element of a single list argument or pairing up the elements of two list arguments.

+

Even this small amount of list functionality is enough to solve some small problems. We haven't even introduced a function notation yet!

+ -- cgit v1.2.3