Diffstat (limited to 'doc')
| -rw-r--r-- | doc/README.md | 1 |
| -rw-r--r-- | doc/based.md | 2 |
| -rw-r--r-- | doc/enclose.md | 102 |
| -rw-r--r-- | doc/primitive.md | 2 |
| -rw-r--r-- | doc/shape.md | 2 |
5 files changed, 106 insertions, 3 deletions
diff --git a/doc/README.md b/doc/README.md index 7b3217f1..ab347e99 100644 --- a/doc/README.md +++ b/doc/README.md @@ -33,6 +33,7 @@ Primitives: - [Array dimensions](shape.md) (`≢=≠`) - [Assert](assert.md) (`!`) - [Deshape and Reshape](reshape.md) (`⥊`) +- [Enclose](enclose.md) (`<`) - [Fold and Insert](fold.md) (`´˝`) - [Group](group.md) (`⊔`) - [Identity functions](identity.md) (`⊢⊣`) diff --git a/doc/based.md b/doc/based.md index 83fe74e8..38139032 100644 --- a/doc/based.md +++ b/doc/based.md @@ -12,7 +12,7 @@ If you're an array programmer then I have bad news for you. My thesis here is th APL tends to define its data by starting with the array and then looking downwards in depth at what it contains. The based array model, as the name suggests, starts at the foundations, which in BQN are called "atoms". There are five types of atom, which together with the array type give the six types a value can have in BQN. Based means being yourself, and an atom's *not* an array. -An atom has [depth](depth.md) 0, and doesn't inherently have a shape. However, primitives that expect an array promote atoms by enclosing them to get a rank-0, or *unit*, array that contains the atom (any value can be enclosed in this way, giving a unit array with higher depth, but it only happens automatically for atoms). Rank and shape both do this, so an atom can be considered to have the same dimensions as a unit array: rank 0 and shape `⟨⟩`. An atom is also considered a kind of unit, but it's not a unit array. +An atom has [depth](depth.md) 0, and doesn't inherently have a shape. However, primitives that expect an array promote atoms by [enclosing](enclose.md) them to get a rank-0, or *unit*, array that contains the atom (any value can be enclosed in this way, giving a unit array with higher depth, but it only happens automatically for atoms). Rank and shape both do this, so an atom can be considered to have the same dimensions as a unit array: rank 0 and shape `⟨⟩`. 
An atom is also considered a kind of unit, but it's not a unit array. Atoms are displayed as plain values, while enclosed atoms, that is, depth-1 unit arrays, are shown with an array display. diff --git a/doc/enclose.md b/doc/enclose.md new file mode 100644 index 00000000..6e54e12f --- /dev/null +++ b/doc/enclose.md @@ -0,0 +1,102 @@ +*View this file with results and syntax highlighting [here](https://mlochbaum.github.io/BQN/doc/enclose.html).* + +# Enclose + +The function enclose creates a unit array whose only element is `𝕩`. + + < "xyz" + +If you understand the concept of a unit array, then that definition almost certainly made sense to you. Therefore the remainder of this document will explain what a unit array is, what it isn't, and why you would use it. + +If you're familiar with the Enclose or Box function from APL or J (but particularly APL), then it's possible you understand the concept of a unit array wrongly, or at least, not in the same way BQN uses it. A difference from APL is that `<𝕩` is never the same as `𝕩`. I recommend reading about [based array theory](based.md) if you haven't already. + +## What's a unit? + +A **unit array** is an array with no axes: that is, it has rank 0 and its shape is the empty list. The array itself isn't empty though. The number of elements is the product of the shape, which is 1. + + ≢ <"anything" # empty shape + ×´ ≢ <"anything" # and one element + +If there are no axes, what use is an array? Rank 0 certainly qualifies as an edge case, as there's no rank -1 below it. Most often when a unit array is used it's because there *are* relevant axes, but we want an array that doesn't include them (sound cryptic? Just keep reading…). + +This contrasts with an atom like `137`, which is considered a unit but not a unit *array*. An atom has no axes just because it doesn't have axes. But because it has no axes, it has the same shape `⟨⟩` as a unit array, by convention. 
+ +Some unit arrays are made by removing an axis from an existing array. First Cell (`⊏`) or [Insert](fold.md) (`˝`) might do this: + + l ← 2‿7‿1‿8‿2‿8 + ⊏ l + +˝ l + +Usually this is unwanted. You'd prefer to use `⊑` or `+´` in order to get an atom result. But consider the following function to sum the rows of a table: + + +˝˘ 3‿4⥊↕12 + +In this case each call to `+˝` returns a cell of the result. The result is a list, so its cells are units! Here, Cells (`˘`) "hides" one axis from its operand, and the operand `+˝` reduces out an axis, leaving zero axes—until Cells assembles the results, putting its axis back. In this case, `+´` would also be tolerated. But it's wrong, because each result really should be a zero-axis array. We can reveal this by making an array whose elements aren't atoms. + + +´˘ ⟨↕2,"ab"⟩≍⟨↕3,"ABC"⟩ + +˝˘ ⟨↕2,"ab"⟩≍⟨↕3,"ABC"⟩ + +The function `+´˘` tries to mix together the result elements into one big array, causing an error because they have different lengths, but `+˝˘` keeps them as elements. + +One strained example probably isn't all that compelling. And it doesn't explain why you'd use Enclose, which doesn't remove an axis from an existing array but creates a whole new unit array. So… + +## Why create a unit? + +Why indeed? + +### Table of combinations + +Let's take a look at the following program, which uses [Table](map.md#table) (`⌜`) to create an array of combinations—every possibility from three sets of choices. It uses Enclose not once but twice. + + (<⟨⟩) <⊸∾⌜´ ⟨""‿"anti", "red"‿"blue"‿"green", "up"‿"down"⟩ + +One use is in the function `<⊸∾`, which encloses the left argument before (`⊸`) [joining](join.md) (`∾`) it to the right argument. This is different from Join on its own because it treats the left argument as a single element. + + "start" ∾ "middle"‿"end" + + "start" <⊸∾ "middle"‿"end" + +For this purpose `{⟨𝕩⟩}⊸∾`, which turns the left argument into a 1-element list, also works. 
But maybe it doesn't really capture the intended meaning: it makes `𝕨` into a whole new list to be added when all that's needed is to add one cell. This cell will be placed along the first axis, but it doesn't have an axis of its own. A similar example, showing how units are used as part of a computation, is to join each row of a matrix to the corresponding item of a list. + + (=⌜˜↕4) ∾˘ ↕4 + +Now Cells (`˘`) splits both arguments into cells. For the `𝕨`, a rank-2 array, these cells are lists; for the list `𝕩` they have to be units. Treating them as elements would work in this case, because `∾` would automatically enclose them, but would fail if `𝕩` contained non-atom elements such as strings. + +The other use of `<` in the original example is `(<⟨⟩)`, which is the left argument to the function `<⊸∾⌜´`. Let's break that function down. We said `<⊸∾` joins `𝕨` as an element to the front of `𝕩`. With [Table](map.md#table) we have `<⊸∾⌜`, which takes two array arguments and does this for every pair of elements from them. + + "red"‿"blue"‿"green" <⊸∾⌜ ⟨"up"⟩‿⟨"down"⟩ + +[Fold](fold.md) (`´`) changes this from a function of two arrays to a function of any number of arrays. And `<⟨⟩`, the enclosed empty array, is the initial value for the fold. Why do we need an initial value? To start with, consider applying to only one input array. With no initial value Fold just returns it without modification. + + <⊸∾⌜´ ⟨"up"‿"down"⟩ + +But this is only an array of strings, and not an array of lists of strings: the right result is `⟨⟨"up"⟩,⟨"down"⟩⟩`. And that's not the extent of our troubles: without an initial value we'll get the wrong result on longer arguments too, because the elements of the rightmost array get joined to the result lists as lists, not as elements. + + <⊸∾⌜´ ⟨"red"‿"blue"‿"green", "up"‿"down"⟩ + +To make things right, we need an array of lists for an initial value. Since it shouldn't add anything to the result, any lists it contains need to be empty. 
But what should its shape be? The result shape from Table is always the argument shapes joined together (`𝕨∾○≢𝕩`). The initial value shouldn't contribute to the result shape, so it needs to have empty shape, or rank 0! We use Enclose to create the array `<⟨⟩` with no axes, because the result *will* have axes but the initial element needs to start without any. All the axes come from the list of choices. + +It goes deeper! The following (pretty tough) example uses arrays with various ranks in the argument, and they're handled quite well. The last one isn't really a choice, so it has no axes. If it were a one-element list then the result would have a meaningless length-1 axis. But not enclosing it would cause each character to be treated as an option, with unpleasant results. + + flavor ← ⍉ ∘‿2 ⥊ "up"‿"down"‿"charm"‿"strange"‿"top"‿"bottom" + (<⟨⟩) <⊸∾⌜´ ⟨"red"‿"blue"‿"green", flavor, <"quark"⟩ + +### Broadcasting + +Table isn't the only mapping function that gets along well with units. Here's an example with Each (`¨`). + + =‿≠‿≡‿≢ {𝕎𝕩}¨ < 3‿2⥊"abcdef" + +The function `{𝕎𝕩}` applies its left argument as a function to its right; we want to apply the four functions Rank, Length, [Depth](depth.md), and [Shape](shape.md) to a single array. Each normally matches up elements from its two arguments, but it will also copy the elements of a lower-rank argument to fill in any missing trailing axes and match the higher-rank argument's shape. To copy a single argument for every function call, it should have no axes, so we enclose it into a unit. + +This example would work just as well with Table (`⌜`), although maybe the interpretation is a little different. The reason it matters that Each accepts unit arrays is that arithmetic primitives (as well as the Depth modifier `⚇`) use Each to match their arguments up. Want to add a point (two numbers) to each point in an array? Just enclose it first. 
+ + (<10‿¯10) + ⟨2‿3,1‿7⟩≍⟨4‿1,5‿4⟩ + +## Coda + +Perhaps you feel bludgeoned rather than convinced at this point. Unit arrays are useful, sure, but aren't they ugly? Aren't they a hack? + +The practical answer is that I think you should use them anyway. You'll probably come to appreciate the use of Enclose and how it can help you produce working, reliable code, making you a more effective BQN programmer. + +On the theoretical side, it's important to realize that units are just a consequence of having multidimensional arrays. Array languages come with units by default, so that "adding" them is not really a complication, it's a simplification. It's natural to not feel quite right around these sorts of non-things, because zero is a pretty special number—being among other things the only number of paddles you can have and still not be able to go anywhere in your canoe. In my opinion the right response is to understand why they are special but also why they fit in as part of the system, so you can be in control instead of worrying. 
diff --git a/doc/primitive.md b/doc/primitive.md index 2be91bd4..531ffb13 100644 --- a/doc/primitive.md +++ b/doc/primitive.md @@ -27,7 +27,7 @@ Functions that have significant differences from APL functions are marked with a | `¬` | [Not](logic.md)* | [Span](logic.md)* | `\|` | [Absolute Value](arithmetic.md#additional-arithmetic)| [Modulus](arithmetic.md#additional-arithmetic) | `≤` | | [Less Than or Equal to](https://aplwiki.com/wiki/Less_than_or_Equal_to) -| `<` | [Enclose](https://aplwiki.com/wiki/Enclose) | [Less Than](https://aplwiki.com/wiki/Less_than) +| `<` | [Enclose](enclose.md) | [Less Than](https://aplwiki.com/wiki/Less_than) | `>` | [Merge](couple.md)* | [Greater Than](https://aplwiki.com/wiki/Greater_than) | `≥` | | [Greater Than or Equal to](https://aplwiki.com/wiki/Greater_than_or_Equal_to) | `=` | [Rank](shape.md)* | [Equals](https://aplwiki.com/wiki/Equal_to) diff --git a/doc/shape.md b/doc/shape.md index 90a71cdd..1cb98d46 100644 --- a/doc/shape.md +++ b/doc/shape.md @@ -32,7 +32,7 @@ Applying Shape and the other two functions to an atom shows a shape of `⟨⟩`, ## Units -A unit is an atom, or an array with no axes—rank 0. Since it doesn't have any axes, its shape should have no elements. It should be the empty list `⟨⟩` (with a fill of `0`, like all shapes). As there's no first element in the shape, it's not obvious what the length should be, and a stricter language would just give an error. However, there are some good reasons to use a length of `1`. First, the total number of elements is 1, meaning that if the length divides this number evenly (as it does for non-unit arrays) then the only possible natural number it can be is 1. Second, many functions that take a list for a particular argument also accept a unit, and treat it as a length-1 array. For example, `5⥊a` and `⟨5⟩⥊a` are identical. Defining `≠5` to be `1` means that `=s⥊a` is always `≠s`. +A unit is an atom, or an array with no axes—rank 0. 
(See [Enclose](enclose.md) for more about unit arrays). Since it doesn't have any axes, its shape should have no elements. It should be the empty list `⟨⟩` (with a fill of `0`, like all shapes). As there's no first element in the shape, it's not obvious what the length should be, and a stricter language would just give an error. However, there are some good reasons to use a length of `1`. First, the total number of elements is 1, meaning that if the length divides this number evenly (as it does for non-unit arrays) then the only possible natural number it can be is 1. Second, many functions that take a list for a particular argument also accept a unit, and treat it as a length-1 array. For example, `5⥊a` and `⟨5⟩⥊a` are identical. Defining `≠5` to be `1` means that `=s⥊a` is always `≠s`. Despite this last point, it's important to remember that a unit isn't the same as a 1-element list. For example, the length-1 string `"a"` doesn't match `<'a'` but instead `⟨'a'⟩`. And also bear in mind that having an empty *shape* doesn't make a unit an empty *array*. That would mean it has no elements, not one! |
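The shape arithmetic in the Units paragraph above can be checked with a small model outside BQN. This is only an illustrative sketch in Python, not BQN's implementation: shapes are modeled as plain tuples, and the helper names `rank`, `element_count`, and `length` are hypothetical, chosen here to mirror Rank, the product of the shape, and Length.

```python
from math import prod


def rank(shape):
    """Number of axes: the length of the shape tuple."""
    return len(shape)


def element_count(shape):
    """Total number of elements: the product of the axis lengths.

    The product of an empty shape is 1, which is why a unit holds
    exactly one element rather than none.
    """
    return prod(shape)


def length(shape):
    """Length along the first axis, using the convention of 1 for a unit."""
    return shape[0] if shape else 1


unit = ()                # shape of a unit: no axes at all
one_element_list = (1,)  # shape of a 1-element list: one axis of length 1
empty_list = (0,)        # shape of an empty list: one axis of length 0

for name, shape in [("unit", unit),
                    ("one-element list", one_element_list),
                    ("empty list", empty_list)]:
    print(name, rank(shape), element_count(shape), length(shape))
```

In this model a unit and a 1-element list agree on element count and length but differ in rank, while an empty shape and an empty array are different things. Those are the two distinctions the paragraph draws.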
