11 files changed, 1205 insertions, 0 deletions
diff --git a/docsrc/context.md b/docsrc/context.md
new file mode 100644
index 00000000..fcd51bde
--- /dev/null
+++ b/docsrc/context.md
@@ -0,0 +1,82 @@
+# BQN's context-free grammar
+
+APL has a problem. To illustrate, let's look at an APL expression:
+
+    a b c d e
+
+It is impossible to say anything about this sentence! Is `c` a dyadic operator being applied to `b` and `d`, or are `b` and `d` two dyadic functions being applied to arrays? In contrast, expressions in C-like or Lisp-like languages show their structure of application:
+
+    b(a, d(c)(e))
+    (b a ((d c) e))
+
+In each case, some values are used as inputs to functions while others are the functions being applied. The result of a function can be used either as an input or as a function again. These expressions correspond to the APL expression where `a` and `e` are arrays, `b` and `c` are functions, and `d` is a monadic operator. However, these syntactic classes have to be known to see what the APL expression is doing—they are a form of context that is required for a reader to know the grammatical structure of the expression. In a context-free grammar like that of simple C or Lisp expressions, a value's grammatical role is part of the expression itself, indicated with parentheses: they come after the function in C and before it in Lisp. Of course, a consequence of using parentheses in this way is having a lot of parentheses. BQN uses a different method to annotate grammatical role:
+
+    a B C _d e
+
+Here, the lowercase spelling indicates that `a` and `e` are to be treated as values ("arrays" in APL), while the uppercase spelling of `B` and `C` indicates that they are used as functions, and the leading underscore marks `_d` as a modifier ("monadic operator"). Like parentheses for function application, the spelling is not inherent to the variable values used, but instead indicates their grammatical role in this particular expression. A variable has no inherent spelling and can be used in any role, so the names `a`, `A`, `_a`, and `_a_` all refer to the exact same variable, but in different roles; typically we use the lowercase name to refer to the variable in isolation. While we still don't know anything about what values `a`, `b`, `c`, and so on have, we know how they interact in the line of code above.
+
+## Is grammatical context really a problem?
+
+Yes, in the sense of [problems with BQN](../problems.md). A grammar that uses context is harder for humans to read and machines to execute. A particular difficulty is that parts of an expression you don't yet understand can interfere with parts you do, making it difficult to work through an unknown codebase.
+
+One difficulty beginners to APL will encounter is that code in APL at first appears like a string of undifferentiated symbols. For example, a tacit Unique Mask implementation `⍳⍨=⍳∘≢` consists of six largely unfamiliar characters with little to distinguish them (in fact, the one obvious bit of structure, the repeated `⍳`, is misleading as it means different things in each case!). Simply placing parentheses into the expression, like `(⍳⍨)=(⍳∘≢)`, can be a great help to a beginner, and part of learning APL is to naturally see where the parentheses should go. The equivalent BQN expression, `⊐˜=↕∘≠`, will likely appear equally intimidating at first, but the path to learning which things apply to which is much shorter: rather than learning the entire list of APL primitives, a beginner just needs to know that superscript characters like `˜` are modifiers and characters like `∘` with unbroken circles are compositions before beginning to learn the BQN grammar that will explain how to tie the various parts together.
+
+This sounds like a distant concern to a master of APL or a computer that has no difficulty memorizing a few dozen glyphs. Quite the opposite: the same concern applies to variables whenever you begin work with an unfamiliar codebase! Many APL programmers even enforce variable name conventions to ensure they know the class of a variable. By having such a system built in, BQN keeps you from having to rely on programmers following a style guide, and also allows greater flexibility, including [functional programming](functional.md), as we'll see later.
+
+Shouldn't a codebase define all the variables it uses, so we can see their class from the definition? Not always: consider that in a language with libraries, code might be imported from dependencies. Many APLs also have some dynamic features that can allow a variable to have more than one class, such as the `⍺←⊢` pattern in a dfn that makes `⍺` an array in the dyadic case but a function in the monadic case. Regardless, searching for a definition somewhere in the code is certainly a lot more work than knowing the class right away! One final difficulty is that even one unknown can delay understanding of an entire expression. Suppose in `A B c`, `B` is a function and `c` is an array, and both values are known to be constant. If `A` is known to be a function (even if its value is not yet known), its right argument `B c` can be evaluated ahead of time. But if `A`'s type isn't known, it's impossible to know if this optimization is worth it, because if it is an array, `B` will instead be called dyadically.
+
+## BQN's spelling system
+
+BQN's expression grammar is a simplified version of the typical APL, removing some oddities like niladic functions and the two-glyph Outer Product operator. Every value can be used in any of four syntactic roles:
+
+| BQN         | APL              | J
+|-------------|------------------|------
+| Value       | Array            | Noun
+| Function    | Function         | Verb
+| Modifier    | Monadic operator | Adverb
+| Composition | Dyadic operator  | Conjunction
+
+Unlike variables, BQN primitives have only one spelling, and a fixed role (but their values can be used in a different role by storing them in variables). Superscript glyphs `` ˜¨˘⁼⌜´` `` are used for modifiers, and glyphs `∘○⊸⟜⌾⊘◶⚇⎉⍟` with an unbroken circle are compositions. Other primitives are functions. String and numeric literals are values.
+
+BQN's variables use another system, where the spelling indicates how the variable's value is used. A variable spelled with a lowercase first letter, like `var`, is a value. Spelled with an uppercase first letter, like `Var`, it is a function. Underscores are placed where operands apply to indicate a modifier `_var` or composition `_var_`. Other than the first letter or underscore, variables are case-insensitive.
+
+The associations between spelling and syntactic role are considered part of BQN's [token formation rules](../spec/token.md).
+
+One rule for typing is also best considered to be a pre-parsing rule like the spelling system: the role of a brace construct `{}` with no header is determined by which special arguments it uses: it's a value if there are none, but a `𝕨` or `𝕩` makes it at least a function, an `𝔽` makes it a modifier or composition, and a `𝔾` always makes it a composition.
+
+## BQN's grammar
+
+A formal treatment is included in [the spec](../spec/grammar.md). BQN's grammar—the ways syntactic roles interact—follows the original APL model (plus trains) closely, with allowances for new features like list notation. In order to keep BQN's syntax context-free, the syntactic role of any expression must be known from its contents, just like tokens.
+
+Here is a table of the APL-derived operator and function application rules:
+
+| left  | main  | right | output   | name
+|-------|-------|-------|----------|------
+|       |  `F`  |  `x`  | Value    | Monadic function
+|  `w`  |  `F`  |  `x`  | Value    | Dyadic function
+|       |  `F`  |  `G`  | Function | 2-train
+|  `F*` |  `G`  |  `H`  | Function | 3-train
+|  `F*` | `_m`  |       | Function | Modifier
+|  `F*` | `_c_` |  `G*` | Function | Composition
+|       | `_c_` |  `G*` | Modifier | Partial application
+|  `F*` | `_c_` |       | Modifier | Partial application
+
+A function with an asterisk indicates that a value can also be used: in these positions there is no difference between function and value spellings. Operator applications bind more tightly than functions, and associate left-to-right while functions associate right-to-left.
+
+BQN lists can be written with angle brackets `⟨elt0,elt1,…⟩` or ligatures `elt0‿elt1‿…`. In either case the elements can have any type, and the result is a value.
+
+The statements in a brace block, function, or operator can also have any role, including the return value at the end. These roles have no effect: outside of braces, a function always returns an array, a modifier always returns a function, and so on, regardless of how these objects were defined.
+
+## Mixing roles
+
+BQN's basic types align closely with its syntactic roles: functions, modifiers, and compositions are all basic types, while values are split into numbers, characters, and arrays. This is no accident, and usually values will be used in roles that match their underlying type. However, the ability to use a role that doesn't match the type is very useful.
+
+Any type can be passed as an argument to a function, or as an operand, by treating it as a value. This means that BQN fully supports Lisp-style [functional programming](functional.md), where functions can be used as values.
+
+It can also be useful to treat a value type as a function, in which case it applies as a constant function. This rule is useful with most built-in operators. For example, `F⎉1` uses a constant for the rank even though in general a function can be given, and `a⌾(b⊸/)` inserts the values in `a` into the positions selected by `b`, ignoring the old values rather than applying a function to them.
+
+Other mixes of roles are generally not useful. While a combination such as treating a function as a modifier is allowed, attempting to apply it to an operand will fail. Only a modifier can be applied as a modifier and only a composition can be applied as a composition. Only a function or value can be applied as a function.
+
+It's also worth noting that something that appears to be a value may actually be a function! For example, the result of `𝕨˜𝕩` may not always be `𝕨`. `𝕨˜𝕩` is exactly identical to `𝕎˜𝕩`, which gives `𝕩𝕎𝕩`. If `𝕎` is a number, character, or array, that's the same as `𝕨`, but if it is a function, then it will be applied.
+
+The primary way to change the role of a value in BQN is to use a name, including one of the arguments to a brace function. In particular, you can use `{𝔽}` to convert a value operand into a function. Converting a function to a value is more difficult. Often an array of functions is wanted, in which case they can be stranded together; otherwise it's probably best to give the function a name. Picking a function out of a list, for example with `⊑⟨+⟩`, will also give it as a value.
diff --git a/docsrc/depth.md b/docsrc/depth.md
new file mode 100644
index 00000000..2bb369ef
--- /dev/null
+++ b/docsrc/depth.md
@@ -0,0 +1,142 @@
+# Depth
+
+The depth of an array is the greatest level of array nesting it attains, or, put another way, the greatest number of times you can pick an element starting from the original array before reaching a non-array. The monadic function Depth (`≡`) returns the depth of its argument, while the composition Depth (`⚇`) can control the way its left operand is applied based on the depth of its arguments. Several primitive functions also use the depth of the left argument to decide whether it applies to a single axis of the right argument or to several axes.
+
+## The Depth function
+
+To find the depth of an array, use Depth (`≡`). For example, the depth of a list of numbers or characters is 1:
+
+        ≡ 2‿3‿4
+    1
+        ≡ "a string is a list of characters"
+    1
+
+Depth is somewhat analogous to an array's rank `≠≢𝕩`, and in fact rank can be "converted" to depth by splitting rows with `<⎉1`, reducing the rank by 1 and increasing the depth. Unlike rank, Depth doesn't care at all about its argument's shape:
+
+        ≡ 3‿4⥊"characters"
+    1
+        ≡ (1+↕10)⥊"characters"
+    1
+
+Also unlike rank, Depth *does* care about the elements of its argument: in fact, to find the depth of an array, every element must be inspected.
+
+        ≡ ⟨2,3,4,5⟩
+    1
+        ≡ ⟨2,<3,4,5⟩
+    2
+        ≡ ⟨2,<3,4,<<<5⟩
+    4
+
+As the above expressions suggest, the depth of an array is the maximum of its elements' depths, plus one. The base case, a non-array (including a function, modifier, or composition), has depth 0.
+
+        ≡'c'
+    0
+        F←+⋄≡f
+    0
+        ≡⟨'c',f,2⟩
+    1
+        ≡⟨5,⟨'c',f,2⟩⟩
+    2
+
+If the function `IsArray` indicates whether its argument is an array, then we can write a recursive definition of Depth using the Choose composition.
+
+    Depth←IsArray◶0‿{1+0⌈´Depth¨⥊𝕩}
+
+The minimum element depth of 0 implies that an empty array's depth is 1.
+
+        ≡⟨⟩
+    1
+        ≡2‿0‿3⥊0
+    1
+
+## Testing depth for multiple-axis primitives
+
+Several primitive functions use the left argument to manipulate the right argument along one or more axes: see [leading.md](leading.md).
+
+| Single-axis depth | Functions
+|-------------------|----------
+| 0                 | `↑↓↕⌽⍉`
+| 1                 | `/⊏⊔`
+
+Functions such as Take and Drop use a single number per axis. When the left argument is a list of numbers, they apply to initial axes. But for convenience, a single number is also accepted, and applied to the first axis only. This is equivalent to ravelling the left argument before applying the function.
+
+        ≢2↑7‿7‿7‿7⥊"abc"
+    [ 2 7 7 7 ]
+        ≢2‿1‿1↑7‿7‿7‿7⥊"abc"
+    [ 2 1 1 7 ]
+
+In these cases the flexibility seems trivial because the left argument has depth 1 or 0: it is an array or isn't, and it's obvious what a plain number should do. But for the second row in the table, the left argument is always an array. The general case is that the left argument is a vector and its elements correspond to right argument axes:
+
+        ⟨3‿2,1‿4‿1⟩ ⊏ ↕6‿7
+    ┌
+      [ 3 1 ] [ 3 4 ] [ 3 1 ]
+      [ 2 1 ] [ 2 4 ] [ 2 1 ]
+                              ┘
+
+This means the left argument is homogeneous of depth 2. What should an argument of depth 1, or an argument that contains non-arrays, do? One option is to continue to require the left argument to be a vector, and convert any non-array argument into an array by boxing it:
+
+        ⟨3‿2,1⟩ <⍟(0=≡)¨⊸⊏ ↕6‿7
+    [ [ 3 1 ] [ 2 1 ] ]
+
+While very consistent, this extension represents a small convenience and makes it difficult to act on a single axis, which for Replicate and [Group](group.md) is probably the most common way the primitive is used:
+
+        3‿2‿1‿2‿3 / "abcde"
+    [ aaabbcddeee ]
+
+With the extension above, every case like this would have to use `<⊸/` instead of just `/`. BQN avoids this difficulty by testing the left argument's depth. A depth-1 argument applies to the first axis only, giving the behavior above.
+
+For Select, the depth-1 case is still quite useful, but it may also be desirable to choose a single cell using a list of numbers. In this case the left argument depth can be increased from the bottom using `<¨`.
+
+        2‿1‿4 <¨⊸⊏ ↕3‿4‿5‿2
+    [ [ 2 1 4 0 ] [ 2 1 4 1 ] ]
+
+## The Depth composition
+
+The Depth composition (`⚇`) is a generalization of Each that allows diving deeper into an array. To illustrate it we'll use a shape `4‿3` array of lists of lists.
+
+        ⊢ n ← <⎉1⍟2 4‿3‿2‿2⥊↕48
+    ┌
+      [ [ 0 1 ] [ 2 3 ] ]     [ [ 4 5 ] [ 6 7 ] ]     [ [ 8 9 ] [ 10 11 ] ]
+      [ [ 12 13 ] [ 14 15 ] ] [ [ 16 17 ] [ 18 19 ] ] [ [ 20 21 ] [ 22 23 ] ]
+      [ [ 24 25 ] [ 26 27 ] ] [ [ 28 29 ] [ 30 31 ] ] [ [ 32 33 ] [ 34 35 ] ]
+      [ [ 36 37 ] [ 38 39 ] ] [ [ 40 41 ] [ 42 43 ] ] [ [ 44 45 ] [ 46 47 ] ]
+                                                                              ┘
+        ≡ n
+    3
+
+Reversing `n` swaps all the rows:
+
+        ⌽ n
+    ┌
+      [ [ 36 37 ] [ 38 39 ] ] [ [ 40 41 ] [ 42 43 ] ] [ [ 44 45 ] [ 46 47 ] ]
+      [ [ 24 25 ] [ 26 27 ] ] [ [ 28 29 ] [ 30 31 ] ] [ [ 32 33 ] [ 34 35 ] ]
+      [ [ 12 13 ] [ 14 15 ] ] [ [ 16 17 ] [ 18 19 ] ] [ [ 20 21 ] [ 22 23 ] ]
+      [ [ 0 1 ] [ 2 3 ] ]     [ [ 4 5 ] [ 6 7 ] ]     [ [ 8 9 ] [ 10 11 ] ]
+                                                                              ┘
+
+Depth `¯1` is equivalent to Each, and reverses the larger vectors, while depth `¯2` applies Each twice to reverse the smaller vectors:
+
+        ⌽⚇¯1 n
+    ┌
+      [ [ 2 3 ] [ 0 1 ] ]     [ [ 6 7 ] [ 4 5 ] ]     [ [ 10 11 ] [ 8 9 ] ]
+      [ [ 14 15 ] [ 12 13 ] ] [ [ 18 19 ] [ 16 17 ] ] [ [ 22 23 ] [ 20 21 ] ]
+      [ [ 26 27 ] [ 24 25 ] ] [ [ 30 31 ] [ 28 29 ] ] [ [ 34 35 ] [ 32 33 ] ]
+      [ [ 38 39 ] [ 36 37 ] ] [ [ 42 43 ] [ 40 41 ] ] [ [ 46 47 ] [ 44 45 ] ]
+                                                                              ┘
+        ⌽⚇¯2 n
+    ┌
+      [ [ 1 0 ] [ 3 2 ] ]     [ [ 5 4 ] [ 7 6 ] ]     [ [ 9 8 ] [ 11 10 ] ]
+      [ [ 13 12 ] [ 15 14 ] ] [ [ 17 16 ] [ 19 18 ] ] [ [ 21 20 ] [ 23 22 ] ]
+      [ [ 25 24 ] [ 27 26 ] ] [ [ 29 28 ] [ 31 30 ] ] [ [ 33 32 ] [ 35 34 ] ]
+      [ [ 37 36 ] [ 39 38 ] ] [ [ 41 40 ] [ 43 42 ] ] [ [ 45 44 ] [ 47 46 ] ]
+                                                                              ┘
+
+While a negative depth tells how many levels to go down, a non-negative depth gives the maximum depth of the argument before applying the left operand. On a depth-3 array like above, depth `2` is equivalent to `¯1` and depth `1` is equivalent to `¯2`. A depth of `0` means to loop until non-arrays are reached, that is, apply [pervasively](https://aplwiki.com/wiki/Pervasion), like a scalar function.
+
+        ⟨'a',"bc"⟩ ≍⚇0 ⟨2‿3,4⟩
+    [ [ [ a 2 ] [ a 3 ] ] [ [ b 4 ] [ c 4 ] ] ]
+
+With a positive operand, Depth doesn't have to use the same depth everywhere. Here, Length is applied as soon as the depth for a particular element is 1 or less, including if the argument has depth 0. For example, it maps over `⟨2,⟨3,4⟩⟩`, but not over `⟨11,12⟩`, even though these are elements of the same array.
+
+        ≠⚇1 ⟨1,⟨2,⟨3,4⟩⟩,⟨5,⟨6,7⟩,⟨8,9,10⟩⟩,⟨11,12⟩⟩
+    [ 1 [ 1 2 ] [ 1 2 3 ] 2 ]
diff --git a/docsrc/functional.md b/docsrc/functional.md
new file mode 100644
index 00000000..edc20c5f
--- /dev/null
+++ b/docsrc/functional.md
@@ -0,0 +1,98 @@
+# Functional programming
+
+BQN boasts of its functional capabilities, including first-class functions. What sort of functional support does it have, and how can a BQN programmer exercise these and out themself as a Schemer at heart?
+
+First, let's be clear about what the terms we're using mean. A language has *first-class functions* when functions (however they are defined) can be used in all the same ways as "ordinary" values like numbers and so on, such as being passed as an argument or placed in a list. Lisp and JavaScript have first-class functions, C has unsafe first-class functions via function pointers, and Java and APL don't have them as functions can't be placed in lists or used as arguments. This doesn't mean every operation is supported on functions: for instance, numbers can be added, compared, and sorted; while functions could perhaps be added to give a train, comparing or sorting them as functions (not representations) isn't computable, and BQN doesn't support any of the three operations when passing functions as arguments.
+
+Traditionally APL has worked around its lack of first-class functions with operators or second-order functions. Arrays in APL are first class while functions are second class and operators are third class, and each class can act on the ones before it. However, the three-tier system has some obvious limitations that we'll discuss, and BQN removes these by making every type first class.
+
+The term *functional programming* is more contentious, and has many meanings, some of which can be vague. Here I use it for what might be called *first-class functional programming*, programming that makes significant use of first-class functions; in this usage, Scheme is probably the archetypal functional programming language. However, several other definitions are also worth mentioning. APL is often called a functional programming language on the grounds that functions can be assigned and manipulated, and called recursively, all characteristics it shares with Lisp. I prefer the term *function-level programming* for this usage. A newer usage, which I call *pure functional programming*, restricts the term "function" to mathematical functions, which have no side effects, so that functional programming is programming with no side effects, often using monads to accumulate effects as part of arguments and results instead. Finally, *typed functional programming* is closely associated with pure functional programming and refers to statically-typed functional languages such as Haskell, F#, and Idris (the last of which even supports *dependently-typed functional programming*, but I already said "finally" so we'll stop there). Of these, BQN supports first-class functional and function-level programming, allows but doesn't encourage pure functional programming, and does not support typed functional programming, as it is dynamically and not statically typed.
+
+Another topic we are interested in is *lexical scoping* and *closures*. Lexical scoping means that the realm in which a variable exists is determined by its containing context (in BQN, the surrounding set of curly braces `{}`, if any) within the source code. A closure is really an implementation mechanism, but it's often used to refer to a property of lexical scoping that appears when functions defined in a particular block can be accessed after the block finishes execution. For example, they might be returned from a function or assigned to a variable outside of that function's scope. In this case the functions can still access variables in the original scope. I consider this property to be a requirement for a correct lexical scoping implementation, but it's traditionally not a part of APL: an implementation might not have lexical scoping (for example, J and, I believe, A+ use static scoping where functions can't access variables in containing scopes) or might cut off the scope once execution ends, leading to value errors that one wouldn't predict from the rules of lexical scoping.
+
+## Functions in APL
+
+This seems like a good place for a brief and entirely optional discussion of how APL handles functions and why it does it this way. As mentioned above, APL's functions are second class rather than first class. However, it's worth noting that the barriers to making functions first-class objects have been entirely syntactic and conceptual, not technical. In fact, the J language has for a long time had [a bug](http://www.jsoftware.com/pipermail/programming/2013-January/031260.html) that allows an array containing a function to be created: by selecting from the array, the function itself can even be passed through tacit functions as an argument!
+
+The primary reason why APL doesn't allow functions to be passed as arguments is probably syntax: in particular, there's no way to say that a function should be used as the left argument to another function, as an expression like `F G x` with functions `F` and `G` and an array `x` will simply be evaluated as two monadic function applications. However, there's no syntactic rule that prevents a function from returning a function, and Dyalog APL, for example, allows this (so `⍎'+'` returns the function `+`). Dyalog's `⎕OR` is another interesting phenomenon in this context: it creates an array from a function or operator, which can then be used as an element or argument like any array. The mechanism is essentially the same as BQN's first-class functions, and in fact `⎕OR`s even share a form of BQN's [syntactic type erasure](../problems.md#syntactic-type-erasure), as a `⎕OR` of a function passed as an operand magically becomes a function again. But outside of this property, it's cumbersome and slow to convert functions to and from `⎕OR`s, so they don't work very well as a first-class function mechanism.
+
+Another reason for APL's reluctance to adopt first-class functions is that Iverson and others seemed to believe that functions fundamentally are not a kind of data, because it's impossible to uniquely represent, compare, and order them. One effect of this viewpoint is J's gerund mechanism, which converts a function to an array representation, primarily so that lists of gerunds can be created. Gerunds are nested arrays containing character vectors at the leaves, so they are arrays as Iverson thought of them. However, I consider this conversion of functions to arrays, intended to avoid arrays that contain "black box" functions, to be a mistake: while it doesn't compromise the purity of arrays, it gives the illusion that a function corresponds to a particular array, which is not true from the mathematical perspective of functions as mappings from an arbitrary input to an output. I also think the experience of countless languages with first-class functions shows that there is no practical issue with arrays that contain functions. While having all arrays be concrete entities with a unique canonical representation seems desirable, I don't find the existence of arrays without this property to detract from working with arrays that do have it.
+
+## Functional programming in BQN
+
+*Reminder: I am discussing only first-class functional programming here, and not other concepts like pure or typed functional programming!*
+
+What does functional programming in BQN look like? How is it different from the typical APL style of manipulating functions with operators?
+
+### Working with roles
+
+First, let's look at the basics: a small program that takes a function as its argument and result. The function `Lin` below gives a linear approximation to its function argument based on the values at 0 and 1. To find these two values, we call the argument as a function by using its uppercase spelling, `𝕏`.
+
+    Lin ← {
+      v0 ← 𝕏 0
+      v0 + ((𝕏 1) - v0) × ⊢
+    }
+
+We can pass it the exponential function as an argument by giving it the name `Exp` and then referring to it in lowercase (that is, in a value role). The result is a train that adds 1 to *e*-1 times the argument.
+
+        Exp ← ⋆
+        Lin exp
+    (1 + (1.71828182845905 × ⊢))
+
+As with all functions, the result of `Lin` has a value role. To use it as a function, we give it a name and then use that name with an uppercase spelling.
+
+        expLin ← Lin exp
+        ExpLin 5
+    9.59140914229523
+
+A trickier but more compact method is to use the modifier `{𝔽}`, as the input to a modifier can have a value or function role but its output always has a function role.
+
+        (Lin exp){𝔽} 5
+    9.59140914229523
+
+Not the most accurate approximation, though.
+
+        Exp 5
+    148.413159102577
+
+Note also in this case that we could have used a modifier with a very similar definition to `Lin`. The modifier is identical in definition except that `𝕏` is replaced with `𝔽`.
+
+    _lin ← {
+      v0 ← 𝔽 0
+      v0 + ((𝔽 1) - v0) × ⊢
+    }
+
+Its call syntax is simpler as well. In other cases, however, the function version might be preferable, for example when dealing with arrays of functions or many arguments including a function.
+
+        Exp _lin 5
+    9.59140914229523
+
+### Arrays of functions
+
+It's very convenient to put a function in an array, which is fortunate because this is one of the most important uses of functions as values. Here's an example of an array of functions with a reduction applied to it, composing them together.
+
+        {𝕎∘𝕏}´ ⋆‿-‿(×˜)
+    ⋆∘(-∘(×˜))
+
+Like any function, this one can be given a name and then called. A quirk of this way of defining a function is that it has a value role (it's the result of the function `{𝕎∘𝕏}´`) and so must be defined with a lowercase name.
+
+        gauss ← {𝕎∘𝕏}´ ⋆‿-‿(×˜)
+        Gauss 2
+    0.0183156388887342
+
+Another, and probably more common, use of arrays of functions is to apply several different functions to one or more arguments. Here we apply three different functions to the number 9:
+
+        ⟨√, 2⊸∾, ⊢-⋆⟩ {𝕎𝕩}¨ 9
+    [ 3 [ 2 9 ] ¯8094.083927575384 ]
+
+The composition Choose (`◶`) relies on arrays of functions to… function. It's very closely related to Pick `⊑`, and in fact when the left operand and the elements of the right operand are all value types there's no real difference: Choose returns the constant function `𝕗⊑𝕘`.
+
+        2◶"abcdef" "arg"
+    c
+
+When the operands contain functions, however, the potential of Choose as a ternary-or-more operator opens up. Here's a function for a step in the Collatz sequence, which halves an even input but multiplies an odd input by 3 and adds 1. To get the sequence for a number, we can apply the same function many times. It's an open problem whether the sequence always ends with the repetition 4, 2, 1, but it can take a surprisingly long time to get there—try 27 as an argument.
+
+        (2⊸|)◶⟨÷⟜2,1+3×⊢⟩¨ 6‿7
+    [ 3 22 ]
+        (2⊸|)◶⟨÷⟜2,1+3×⊢⟩⍟(↕10) 6
+    [ 6 3 10 5 16 8 4 2 1 4 ]
diff --git a/docsrc/gen b/docsrc/gen
new file mode 100755
index 00000000..8733a52f
--- /dev/null
+++ b/docsrc/gen
@@ -0,0 +1,3 @@
+#! /usr/bin/env bash
+
+for f in *.md; do ../spec/dzref md.bqn "•←ConvertFile \"$PWD/$f\"" > "../doc/${f%md}html"; done
diff --git a/docsrc/group.md b/docsrc/group.md
new file mode 100644
index 00000000..b561cd7d
--- /dev/null
+++ b/docsrc/group.md
@@ -0,0 +1,181 @@
+# Group
+
+BQN replaces the [Key](https://aplwiki.com/wiki/Key) operator from J or Dyalog APL, and [many forms of partitioning](https://aplwiki.com/wiki/Partition_representations), with a single (ambivalent) Group function `⊔`. This function is somewhat related to the K function `=` of the same name, but results in an array rather than a dictionary.
+
+The BQN prototype does not implement this function: instead it uses `⊔` for a Group/Key function very similar to `{⊂⍵}⌸` in Dyalog APL, and also has a Cut function `\`. The new BQN Group on numeric arguments (equivalently, rank-1 results) can be defined like this:
+
+    ⊔↩((↕1+(>⌈´))=¨<)∘⊣ /¨⟜< ↕∘≠⍠⊢
+
+Once defined, the old BQN Key (dyadic) is `⍷⊸⊐⊸⊔` and Group (monadic) is `⍷⊸⊐⊔↕∘≠` using the Deduplicate or Unique Cells function `⍷` (BQN2NGN spells it `∪`). Cut on matching-length arguments is `` +`⊸⊔ ``.
+
+## Definition
+
+Group operates on a numeric list of indices and a value array, treated as a list of its major cells, to produce a list of groups, each of which is a selection from the values. The indices and values have the same length, and each value cell is paired with the index at the same position. That index indicates the result group the value should go into, with an "index" of ¯1 indicating that it should be dropped and not appear in the result.
+
+        0‿1‿2‿0‿1 ≍ "abcde"  # Corresponding indices and values
+    ┌
+      0 1 2 0 1
+      a b c d e
+                ┘
+        0‿1‿2‿0‿1 ⊔ "abcde"  # Values grouped by index
+    [ [ ad ] [ be ] [ c ] ]
+
+For example, we might choose to group a list of words by length. Within each group, values maintain the ordering they had in the list originally.
+
+        phrase ← "BQN"‿"uses"‿"notation"‿"as"‿"a"‿"tool"‿"of"‿"thought"
+        ⥊˘ ≠¨⊸⊔ phrase
+    ┌
+      []
+      [ [ a ] ]
+      [ [ as ] [ of ] ]
+      [ [ BQN ] ]
+      [ [ uses ] [ tool ] ]
+      []
+      []
+      [ [ thought ] ]
+      [ [ notation ] ]
+                            ┘
+
+(Could we define `phrase` more easily? See [below](#partitioning).)
+
+If we'd like to ignore words of 0 letters, or more than 5, we can set all word lengths greater than 5 to 0, then reduce the lengths by 1. Two words end up with left argument values of ¯1 and are omitted from the result.
+
+        1-˜≤⟜5⊸×≠¨phrase
+    [ 2 3 ¯1 1 0 3 1 ¯1 ]
+        ⥊˘ {1-˜≤⟜5⊸×≠¨𝕩}⊸⊔ phrase
+    ┌
+      [ [ a ] ]
+      [ [ as ] [ of ] ]
+      [ [ BQN ] ]
+      [ [ uses ] [ tool ] ]
+                            ┘
+
+Note that the length of the result is determined by the largest index. So the result never includes trailing empty groups. A reader of the above code might expect 5 groups (lengths 1 through 5), but there are no words of length 5, so the last group isn't there.
+
+When Group is called dyadically, the left argument is used for the indices and the right is used for values, as seen above. When it is called monadically, the right argument gives the indices and the values grouped are the right argument's indices, that is, `↕≠𝕩`.
+
+        ⥊˘ ⊔ 2‿3‿¯1‿2
+    ┌
+      []
+      []
+      [ [ 0 ] [ 3 ] ]
+      [ [ 1 ] ]
+                      ┘
+
+Here, the index 2 appears at indices 0 and 3 while the index 3 appears at index 1.
+
+### Multidimensional grouping
+
+Dyadic Group allows the right argument to be grouped along multiple axes by using a nested left argument. In this case, the left argument must be a vector of numeric vectors, and the result has rank `≠𝕨` while its elements—as always—have the same rank as `𝕩`. The result shape is `1+⌈´¨𝕨`, while the shape of element `i⊑𝕨⊔𝕩` is `i+´∘=¨𝕨`. If every element of `𝕨` is sorted ascending and contains only non-negative numbers, we have `𝕩≡∾𝕨⊔𝕩`, that is, Join is the inverse of Partition.
+
+Here we split up a rank-2 array into a rank-2 array of rank-2 arrays. Along the first axis we simply separate the first pair and second pair of rows—a partition. Along the second axis we separate odd from even indices.
+
+        ⟨0‿0‿1‿1,0‿1‿0‿1‿0‿1‿0⟩⊔(10×↕4)+⌜↕7
+    ┌
+      ┌               ┌
+         0  2  4  6      1  3  5
+        10 12 14 16     11 13 15
+                    ┘            ┘
+      ┌               ┌
+        20 22 24 26     21 23 25
+        30 32 34 36     31 33 35
+                    ┘            ┘
+                                   ┘
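+
+As a quick check of the two shape formulas given earlier, here they are applied to the left argument of this example. The name `w` is introduced only for this sketch, and the outputs were worked out by hand rather than captured from a session.
+
+        w ← ⟨0‿0‿1‿1,0‿1‿0‿1‿0‿1‿0⟩
+        1+⌈´¨w  # Result shape
+    [ 2 2 ]
+        0‿1 +´∘=¨ w  # Shape of the group at index 0‿1
+    [ 2 3 ]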
+
+Each group `i⊑𝕨⊔𝕩` is composed of the cells `j<¨⊸⊏𝕩` such that `i≡j⊑¨𝕨`. The groups retain their array structure and ordering along each argument axis. Using multidimensional Replicate we can say that `i⊑𝕨⊔𝕩` is `(i=𝕨)/𝕩`.
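+
+In the one-dimensional case the Replicate form can be seen directly with the arguments from the first example (output worked out by hand):
+
+        (1=0‿1‿2‿0‿1)/"abcde"  # Same cells as group 1 of 0‿1‿2‿0‿1⊔"abcde"
+    [ be ]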
+
+The monadic case works similarly: Group Indices always satisfies `⊔𝕩 ←→ 𝕩⊔↕≠⚇1 𝕩`. As with `↕`, the depth of the result of Group Indices is always one greater than that of its argument. A depth-0 argument is not allowed.
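+
+For a list argument `≠⚇1` is just `≠`, so the identity can be spot-checked with a train. This is a sketch using the argument from the earlier monadic example, with the result worked out by hand:
+
+        (⊔ ≡ ⊢⊔↕∘≠) 2‿3‿¯1‿2
+    1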
+
+## Properties
+
+Group is closely related to the inverse of Indices, `/⁼`. In fact, inverse Indices called on the index argument gives the length of each group:
+
+        ≠¨⊔ 2‿3‿1‿2
+    [ 0 1 2 1 ]
+        /⁼∧ 2‿3‿1‿2
+    [ 0 1 2 1 ]
+
+A related fact is that calling Indices on the result of Group sorts all the indices passed to Group (removing any ¯1s). This is a kind of counting sort.
+
+        /≠¨⊔ 2‿3‿1‿¯1‿2
+    [ 1 2 2 3 ]
+
+Called dyadically, Group sorts the right argument according to the left and adds some extra structure. If this structure is removed with Join, Group can be thought of as a kind of sorting.
+
+        ∾ 2‿3‿1‿¯1‿2 ⊔ "abcde"
+    [ caeb ]
+        2‿3‿1‿¯1‿2 {F←(0≤𝕨)⊸/ ⋄ 𝕨⍋⊸⊏○F𝕩} "abcde"
+    [ caeb ]
+
+Group can even be implemented with the same techniques as a bucket sort, which can be branchless and fast.
+
+## Applications
+
+The obvious application of Group is to group some values according to a known or computed property. If this property isn't an integer, it can be turned into one using Unique and Index Of (the combination `⍷⊸⊐` has been called "self-classify").
+
+        ln ← "Phelps"‿"Latynina"‿"Bjørgen"‿"Andrianov"‿"Bjørndalen"
+        co ← "US"    ‿"SU"      ‿"NO"     ‿"SU"       ‿"NO"
+        ⥊˘ co ⍷⊸⊐⊸⊔ ln
+    ┌
+      [ [ Phelps ] ]
+      [ [ Latynina ] [ Andrianov ] ]
+      [ [ Bjørgen ] [ Bjørndalen ] ]
+                                     ┘
+
+If we would like a particular index to key correspondence, we can use a fixed left argument to Index Of.
+
+        countries ← "IT"‿"JP"‿"NO"‿"SU"‿"US"
+        countries ∾˘ co countries⊸⊐⊸⊔ ln
+    ┌
+      [ IT ] []
+      [ JP ] []
+      [ NO ] [ [ Bjørgen ] [ Bjørndalen ] ]
+      [ SU ] [ [ Latynina ] [ Andrianov ] ]
+      [ US ] [ [ Phelps ] ]
+                                            ┘
+
+However, this solution will fail if there are trailing keys with no values. To force the result to have a particular length you can append that length as a dummy index to each argument, then remove the last group after grouping.
+
+        countries ↩ "IT"‿"JP"‿"NO"‿"SU"‿"US"‿"ZW"
+        countries ∾˘ co countries{𝕗⊸⊐⊸(¯1↓⊔○(∾⟜(≠𝕗)))} ln
+    ┌
+      [ IT ] []
+      [ JP ] []
+      [ NO ] [ [ Bjørgen ] [ Bjørndalen ] ]
+      [ SU ] [ [ Latynina ] [ Andrianov ] ]
+      [ US ] [ [ Phelps ] ]
+      [ ZW ] []
+                                            ┘
+
+### Partitioning
+
+In examples we have been using a list of strings stranded together. Often it's more convenient to write the string with spaces, and split it up as part of the code. In this case, the index corresponding to each word (that is, each letter in the word) is the number of spaces before it. We can get this number of spaces from a prefix sum on the boolean list which is 1 at each space.
+
+        ' '(+`∘=⊔⊢)"BQN uses notation as a tool of thought"
+    [ [ BQN ] [  uses ] [  notation ] [  as ] [  a ] [  tool ] [  of ] [  thought ] ]
+
+To avoid including spaces in the result, we should change the result index at each space to ¯1. Here is one way to do that:
+
+        ' '((⊢-˜¬×+`)∘=⊔⊢)"BQN uses notation as a tool of thought"
+    [ [ BQN ] [ uses ] [ notation ] [ as ] [ a ] [ tool ] [ of ] [ thought ] ]
+
+A function with structural Under, such as `` {¯1¨⌾(𝕩⊸/)+`𝕩} ``, would also work.
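+
+Spelled out as a sketch, with the block dropped into the same position as the train above (the output is what the earlier result leads us to expect, not a captured run):
+
+        ' '({¯1¨⌾(𝕩⊸/)+`𝕩}∘=⊔⊢)"BQN uses notation as a tool of thought"
+    [ [ BQN ] [ uses ] [ notation ] [ as ] [ a ] [ tool ] [ of ] [ thought ] ]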
+
+In other cases, we might want to split on spaces, so that words are separated by any number of spaces, and extra spaces don't affect the output. Currently our function makes a new word with each space:
+
+        ' '((⊢-˜¬×+`)∘=⊔⊢)"  string with  spaces   "
+    [ [] [] [ string ] [ with ] [] [ spaces ] ]
+
+However, trailing spaces are ignored because Group never produces trailing empty groups (to get them back we would use a dummy final character in the string). To avoid empty words, we should increase the word index only once per group of spaces. We can do this by taking the prefix sum of a list that is 1 only for a space with no space before it. To make such a list, we can use the [Windows](windows.md) function. We will extend our list with an initial 1 so that leading spaces will be ignored. Then we take windows of the same length as the original list: the first includes the dummy argument followed by a shifted copy of the list, and the second is the original list. These represent whether the previous and current characters are spaces; we want positions where the previous wasn't a space and the current is.
+
+        ≍⟜((<´<˘)≠↕1∾⊢) ' '="  string with  spaces   "  # All, then filtered, spaces
+    ┌
+      1 1 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 1 1 1
+      0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0
+                                                      ┘
+        ≍⟜(⊢-˜¬×+`∘((<´<˘)≠↕1∾⊢))' '="  string with  spaces   "  # More processing
+    ┌
+       1  1 0 0 0 0 0 0  1 0 0 0 0  1  1 0 0 0 0 0 0  1  1  1
+      ¯1 ¯1 0 0 0 0 0 0 ¯1 1 1 1 1 ¯1 ¯1 2 2 2 2 2 2 ¯1 ¯1 ¯1
+                                                              ┘
+        ' '((⊢-˜¬×+`∘((<´<˘)≠↕1∾⊢))∘=⊔⊢)"  string with  spaces   "  # Final result
+    [ [ string ] [ with ] [ spaces ] ]
diff --git a/docsrc/indices.md b/docsrc/indices.md
new file mode 100644
index 00000000..382f7f80
--- /dev/null
+++ b/docsrc/indices.md
@@ -0,0 +1,53 @@
+# Indices
+
+One-dimensional arrays such as K lists or Python lists have only one kind of index, a single number that refers to an element. For multidimensional arrays using the leading axis theory, there are several types of indexing that can be useful. Historically, nested APL designs have equivocated between these, which I believe can lead to subtle errors when programming. BQN focuses on single-number (depth 0) indices, which can refer to vector elements or array major cells (or more generally indexing along any particular axis). With this kind of element index, arrays are required to be vectors. Only two functions allow the use of vector element indices: Range (`↕`), which can accept a vector argument, and Pick (`⊑`), which uses the depth-1 arrays in its left argument as index scalars or vectors.
+
+The following functions take or return indices. Except where marked, the indices are in the result; this is by far the most common type of index use. `⊔` is given two rows as it falls into both cases. Note that in the result case, there is usually no possibility for the programmer to select the format of indices. Instead, the language should be carefully designed to make sure that the return type of indices is as useful as possible.
+
+| Monad | Dyad | Where   | How
+|-------|------|---------|--------------------------
+|  `↕`  |      |         | Element scalar or vector
+|  `/`  |      |         | Element scalar
+|  `⊔`  |      |         | Element scalar
+|  `⊔`  | `⊔`  | `𝕨`/`𝕩` | Along-axis scalar
+|       | `⊑`  | `𝕨`     | Element vector
+|  `⍋`  | `⍋`  |         | Major cell scalar
+|  `⍒`  | `⍒`  |         | Major cell scalar
+|       | `⊐`  |         | Major cell scalar
+|       | `⊒`  |         | Major cell scalar
+|       | `⊏`  | `𝕨`     | Major cell or along-axis scalar
+|       | `⍉`  | `𝕨`     | Axis scalar
+
+Dyadic Transpose (`⍉`) takes an index into the right argument axes as its left argument, but since array shape is 1-dimensional, there is only one sensible choice for this, a single number.
+
+## Element indices
+
+In general, the index of an element of an array is a vector whose length matches the array rank. It is also possible to use a number for an index into a vector, as the vector index is a singleton, but this must be kept consistent with the rest of the language. NARS-family APLs make the Index Generator (`↕` in BQN) return a numeric vector when the argument has length 1 but a nested array otherwise. This means that the depth of the result depends on the shape of the argument, inverting the typical hierarchy. BQN shouldn't have such an inconsistency.
+
+Functions `↕`, `/`, `⊔`, and `⊑` naturally deal with element indices. Each of these can be defined to use vector indices. However, this usually rules out the possibility of using scalar indices, which makes these functions harder to use both with generic array manipulation and with the major cell indices discussed in the next section. For this reason BQN restricts `⊔` and monadic `/` to use depth-0 indices, which comes with the requirement that the arguments to monadic `/` and `⊔`, and the result of monadic `⊔`, must be vectors. For dyadic `⊔` the depth-1 elements of the left argument are vectors of indices along axes of the result; see [the documentation](group.md#multidimensional-grouping). The restriction that comes from using single-number indices is that all axes must be treated independently, so that for example it isn't possible to group elements along diagonals without preprocessing. However, this restriction also prevents Group from having to use an ordering on vector indices.
+
+Unlike `/` and `⊔`, `↕` and `⊑` do use vector element indices. For `↕` this is because the output format can be controlled by the argument format: if passed a single number, the result uses single-number indices (so it's a numeric vector); if passed a vector, it uses vector indices and the result has depth 2 (the result depth is always one greater than the argument depth). For `⊑`, vector indices are chosen because `⊏` handles scalar indices well already. When selecting multiple elements from a vector, they would typically have to be placed in an array, which is equivalent to `⊏` with a numeric vector left argument. A single scalar index to `⊑` is converted to a vector, so it can be used to select a single element if only one is wanted. To select multiple elements, `⊑` uses each depth-1 array in the left argument as an index and replaces it with that element from the right argument. Because this uses elements as elements (not cells), it is impossible to have conformability errors where elements do not fit together. Ill-formed index errors are of course still possible, and the requirements on indices are quite strict. They must exactly match the structure of the right argument's shape, with no scalars or higher-rank arrays allowed. Single numbers also cannot be used in this context, as it would create ambiguity: is a one-element vector an index, or does it contain an index?
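+
+A small sketch of both points, assuming monadic `≡` gives depth; the outputs are worked out by hand, not captured from a session:
+
+        ≡ ↕3  # A number gives a numeric vector
+    1
+        ≡ ↕2‿3  # A vector gives an array of vector indices
+    2
+        1‿2 ⊑ (10×↕4)+⌜↕7  # One vector index selects one element
+    12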
+
+## Major cell indices
+
+One of the successes of the [leading axis model](https://aplwiki.com/wiki/Leading_axis_theory) is to introduce a kind of index for multidimensional arrays that is easier to work with than vector indices. The model introduces [cells](https://aplwiki.com/wiki/Cell), where a cell index is a vector of any length up to the containing array's rank. General cell indices are discussed in the next section; first we introduce a special case, indices into major cells or ¯1-cells. These cells naturally form a list, so the index of a major cell is a single number. These indices can also be considered indices along the first axis, since an index along any axis is a single number.
+
+Ordering-based functions `⍋`, `⍒`, `⊐`, and `⊒` only really make sense with major cell indices: while it's possible to order other indices as ravel indices, this probably isn't useful from a programming standpoint. Note that `⊐` only uses the ordering in an incidental way, because it's defined to return the *first* index where a right argument cell is found. A mathematician would be more interested in a "pre-image" function that returns the set of all indices where a particular value appears. However, programming usefulness and consistency with the other search functions makes searching for the first index a reasonable choice.
+
+Only one other function—but an important one!—deals with cells rather than elements: `⊏`, cell selection. Like dyadic `↑↓↕⌽⍉` (depth 0) and `/⊔` (depth 1), Select allows either a simple first-axis case where the left argument has depth 1 or less (a depth-0 argument is automatically enclosed), or a multi-axis case where it is a vector of depth-1 elements. In each case the depth-1 arrays index into a single axis.
+
+## General cell indices
+
+BQN does not use general cell indices directly, but it is useful to consider how they might work, and how a programmer might implement functions that use them in BQN if needed. The functions `/`, `⊔`, and `⊏` are the ones that can work with indices for multidimensional arrays but don't already. Here we will examine how multidimensional versions would work.
+
+A cell index into an array of rank `r` is a numeric vector of length `l≤r`, which then refers to a cell of rank `r-l`. In BQN, the cell at index `i` of array `a` is `i<¨⊸⊏a`.
+
+Because the shape of a cell index relates to the shape of the indexed array, it makes sense not to enclose cell indices, instead treating them as rows of an index array. A definition for `⊏` for depth-1 left arguments of rank at least 1 follows: replace each row of the left argument with the indexed cell of the right, yielding a result with the same depth as the right argument and shape `𝕨((¯1↓⊣)∾(¯1↑⊣)⊸↓)○≢𝕩`.
+
+To match this format, Range (`↕`) could be changed to return a flat array when given a shape—what is now `>↕`. Following this pattern, Indices (`/`) would also return a flat array, where the indices are rows: using the modified Range, `⥊/↕∘≢`. Here the result cannot retain the argument's array structure; it is always a rank-2 list of rows.
+
+The most interesting feature would be that `⊏` could still allow a nested left argument. In this case each element of the left argument would be an array with row indices as before. However, each row can now index along multiple axes, allowing some adjacent axes to be dependent while others remain independent. This nicely unifies scatter-point and per-axis selection, and allows a mix of the two. However, it doesn't allow total freedom, as non-adjacent axes can't be combined except by also mixing in all axes in between.
+
+Group (`⊔`) could accept the same index format for its index argument. Each depth-1 array in the left argument would correspond to multiple axes in the outer result array, but only a single axis in the argument and inner arrays. Because the ravel ordering of indices must be used to order cells of inner arrays, this modification is not quite as clean as the change to Select. It's also not so clearly useful, as the same results can be obtained by using numeric indices and reshaping the result.
+
+Overall it seems to me that the main use of cell indices of the type discussed here is for the Select primitive, and the other cases are somewhat contrived and awkward. So I've chosen not to support them in BQN at all.
diff --git a/docsrc/join.md b/docsrc/join.md
new file mode 100644
index 00000000..cb940699
--- /dev/null
+++ b/docsrc/join.md
@@ -0,0 +1,43 @@
+# Join
+
+Join (`∾`) is an extension of the monadic function [Raze](https://aplwiki.com/wiki/Raze) from A+ and J to arbitrary argument ranks. It has the same relationship to Join to, the dyadic function sharing the same glyph, as Unbox (`>`) does to Couple (`≍`): `a≍b` is `>a‿b` and `a∾b` is `∾a‿b`. While Unbox and Couple combine arrays (the elements of Unbox's argument, or the arguments themselves for Couple) along a new leading axis, Join and Join to combine them along the existing leading axis. Both Unbox and Join can also be called on a higher-rank array, causing Unbox to add multiple leading axes while Join combines elements along multiple existing axes.
+
+Join can be used to combine several strings into a single string, like `array.join()` in JavaScript (but it doesn't force the result to be a string).
+
+        ∾"time"‿"to"‿"join"‿"some"‿"words"
+    [ timetojoinsomewords ]
+
+To join with a separator in between, we might prepend the separator to each string, then remove the leading separator after joining. Another approach would be to insert the separator array as an element between each pair of array elements before calling Join.
+
+        1↓∾' '∾¨"time"‿"to"‿"join"‿"some"‿"words"
+    [ time to join some words ]
+
+Join requires each element of its argument to be an array, and their ranks to match exactly. No rank extension is performed.
+
+        ∾"abc"‿'d'‿"ef"  # Includes a non-array
+    RANK ERROR
+        ∾"abc"‿(<'d')‿"ef"  # Includes a scalar
+    RANK ERROR
+
+However, Join has higher-dimensional uses as well. Given a rank-`m` array of rank-`n` arrays (requiring `m≤n`), it will merge arrays along their first `m` axes. For example, if the argument is a matrix of matrices representing a [block matrix](https://en.wikipedia.org/wiki/Block_matrix), Join will give the corresponding unblocked matrix as its result.
+
+        ⊢ m ← (3‿1∾⌜4‿2‿5) ⥊¨ 2‿3⥊↕6
+    ┌
+      ┌           ┌       ┌
+        0 0 0 0     1 1     2 2 2 2 2
+        0 0 0 0     1 1     2 2 2 2 2
+        0 0 0 0     1 1     2 2 2 2 2
+                ┘       ┘             ┘
+      [ 3 3 3 3 ] [ 4 4 ] [ 5 5 5 5 5 ]
+                                        ┘
+        ∾ m  # Join all that together
+    ┌
+      0 0 0 0 1 1 2 2 2 2 2
+      0 0 0 0 1 1 2 2 2 2 2
+      0 0 0 0 1 1 2 2 2 2 2
+      3 3 3 3 4 4 5 5 5 5 5
+                            ┘
+
+Join has fairly strict requirements on the shapes of its argument elements—although less strict than those of Unbox, which requires they all have identical shape. Suppose the argument to Join has rank `m`. Each of its elements must have the same rank, `n`, which is at least `m`. The trailing shapes `m↓⟜≢¨𝕩` must all be identical (the trailing shape `m↓≢∾𝕩` of the result will match these shapes as well). The other entries in the leading shapes need not be the same, but the shape of an element along a particular axis must depend only on the location of the element along that axis in the full array. For a vector argument this imposes no restriction, since the one leading shape element is allowed to depend on position along the only axis. But for higher ranks the structure quickly becomes more rigid.
+
+To state this requirement more formally in BQN, we say that there is some vector `s` of vectors of lengths, so that `(≠¨s)≡≢𝕩`. We require element `i⊑𝕩` to have shape `i⊑¨s`. Then the lengths of the first `m` axes of the result are `+´¨s`.
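+
+For the block matrix `m` defined earlier, `s` would be `⟨3‿1,4‿2‿5⟩`: the row heights and column widths of the blocks. A sketch of the check, with outputs worked out by hand:
+
+        +´¨⟨3‿1,4‿2‿5⟩
+    [ 4 11 ]
+        ≢∾m
+    [ 4 11 ]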
diff --git a/docsrc/logic.md b/docsrc/logic.md
new file mode 100644
index 00000000..0e38fb1d
--- /dev/null
+++ b/docsrc/logic.md
@@ -0,0 +1,56 @@
+# Logic functions: And, Or, Not (also Span)
+
+BQN retains the APL symbols `∧` and `∨` for logical *and* and *or*, and changes APL's `~` to `¬` for *not*, since `~` looks too much like `˜` and `¬` is more common in mathematics today. Like J, BQN extends Not to the linear function `1⊸-`. However, it discards [GCD](https://aplwiki.com/wiki/GCD) and [LCM](https://aplwiki.com/wiki/LCM) as extensions of And and Or, and instead uses bilinear extensions: And is identical to Times (`×`), while Or is `×⌾¬`, following De Morgan's laws (other ways of obtaining a function for Or give an equivalent result—there is only one bilinear extension).
+
+If the arguments are probabilities of independent events, then an extended function gives the probability of the boolean function on their outcomes (for example, if *A* occurs with probability `a` and *B* with probability `b` independent of *A*, then *A* or *B* occurs with probability `a∨b`). These extensions have also been used in complexity theory, because they allow mathematicians to transfer a logical circuit from the discrete to the continuous domain in order to use calculus on it.
+
+Both valences of `¬` are equivalent to the fork `1+-`. The dyadic valence, called "Span", computes the number of integers in the range from `𝕩` to `𝕨`, inclusive, when both arguments are integers and `𝕩≤𝕨` (note the reversed order, which is used for consistency with subtraction). This function has many uses, and in particular is relevant to the [Windows](windows.md) function.
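+
+For example, with the outputs worked out from the fork `1+-`:
+
+        5 ¬ 3  # The integers 3, 4, and 5
+    3
+        ¬ 0.25  # Monadic case: 1-0.25
+    0.75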
+
+## Definitions
+
+We define:
+
+    Not ← 1+-  # also Span
+    And ← ×
+    Or  ← ×⌾¬
+
+Note that `¬⁼ ←→ ¬`, since the first added 1 will be negated but the second won't; the two 1s cancel leaving two subtractions, and `-⁼ ←→ -`. An alternate definition of Or that matches the typical formula from probability theory is
+
+    Or  ← +-×
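+
+Both definitions can be compared on a non-boolean pair. This is only a sketch; the names `Or1` and `Or2` are introduced here for illustration, and the outputs are worked out by hand.
+
+        Or1 ← ×⌾¬
+        Or2 ← +-×
+        0.5 Or1 0.25
+    0.625
+        0.5 Or2 0.25
+    0.625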
+
+## Examples
+
+We can form truth tables including the non-integer value one-half:
+
+        ¬ 0‿0.5‿1
+    [ 1 0.5 0 ]
+
+        ∧⌜˜ 0‿0.5‿1
+    ┌
+      0    0   0
+      0 0.25 0.5
+      0  0.5   1
+                 ┘
+
+        ∨⌜˜ 0‿0.5‿1
+    ┌
+        0  0.5 1
+      0.5 0.75 1
+        1    1 1
+                 ┘
+
+As with logical And and Or, any value and 0 is 0, while any value or 1 is 1. The other boolean values give the identity elements for the two functions: 1 and any value gives that value, as does 0 or the value.
+
+## Why not GCD and LCM?
+
+The main reason for omitting these functions is that they are complicated and, when applied to real or complex numbers, require a significant number of design decisions where there is no obvious choice (for example, whether to use comparison tolerance). On the other hand, these functions are fairly easy to implement, which allows the programmer to control the details, and also add functionality such as the extended GCD.
+
+A secondary reason is that the GCD falls short as an extension of Or, because its identity element 0 is not total. `0∨x`, for a real number `x`, is actually equal to `|x` and not `x`: for example, `0∨¯2` is `2` in APL. This means the identity `0∨x ←→ x` isn't reliable in APL.
+
+## Identity elements
+
+It's common to apply `∧´` or `∨´` to a list (checking whether all elements are true and whether any are true, respectively), and so it's important for extensions to And and Or to share their identity element. Minimum and Maximum do match And and Or when restricted to booleans, but they have different identity elements. It would be dangerous to use Maximum to check whether any element of a list is true because `>⌈´⟨⟩` yields `¯∞` instead of `0`—a bug waiting to happen. Always using `0` as a left argument to `⌈´` fixes this problem but requires more work from the programmer, making errors more likely.
+
+It is easy to prove that the bilinear extensions have the identity elements we want. Of course `1∧x` is `1×x`, or `x`, and `0∨x` is `0×⌾¬x`, or `¬1×¬x`, giving `¬¬x` or `x` again. Both functions are commutative, so these identities are double-sided.
+
+Other logical identities do not necessarily hold. For example, in boolean logic And distributes over Or and vice-versa: `a∧b∨c ←→ (a∧b)∨(a∧c)`. But substituting `×` for `∧` and `+-×` for `∨` we find that the left hand side is `(a×b)+(a×c)-(a×b×c)` while the right gives `(a×b)+(a×c)-(a×b×a×c)`. These are equivalent for arbitrary `b` and `c` only if `a=a×a`, that is, `a` is 0 or 1. In terms of probabilities the difference when `a` is not boolean is caused by failure of independence. On the left hand side, the two arguments of every logical function are independent. On the right hand side, each pair of arguments to `∧` is independent, but the two arguments to `∨`, `a∧b` and `a∧c`, are not. The relationship between these arguments means that logical equivalences no longer apply.
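+
+A concrete instance, with values worked out by hand from the definitions above: taking `a` to be 0.5 and `b` and `c` both 1,
+
+        0.5∧1∨1
+    0.5
+        (0.5∧1)∨(0.5∧1)
+    0.75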
diff --git a/docsrc/md.bqn b/docsrc/md.bqn
new file mode 100644
index 00000000..acb264ba
--- /dev/null
+++ b/docsrc/md.bqn
@@ -0,0 +1,324 @@
+# The Markdown function is a markdown to html converter for a "good
+# enough" subset of Github-flavored markdown, as specified at
+# https://github.github.com/gfm/ .
+#
+# Additionally, it highlights code sections as BQN, and executes
+# sections that are doubly indented (eight spaces), placing their
+# results below them.
+
+# Not supported:
+# - Thematic breaks like *** or ---
+# - Setext headings (underlined with ==== or ----)
+# - Fenced code blocks (marked off with ``` or ~~~)
+# - HTML blocks
+# - Link reference definitions (who uses these?)
+# - Block quotes (start with >)
+# - Task lists
+
+# Here, a markdown file is represented as a list of its lines, which are
+# strings (they don't include any line ending character).
+# The html file is constructed directly as a string, using Html.
+
+################################
+# Utilities
+
+# 𝕨 is a list of lists. Find the first of these lists each cell of 𝕩
+# belongs to.
+FindGroup ← {
+  i ← (∾𝕨) ⊐ 𝕩  # Index in all cells of 𝕨
+  e ← +`≠¨𝕨     # Index past the end of each group of 𝕨
+  e ⍋ i         # How many end-indices does each element pass?
+}
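+# For instance, working through the three steps by hand (not captured
+# from a run): "ab"‿"cde" FindGroup "adz" gives 0‿1‿2. The result 2,
+# equal to ≠𝕨, marks a cell found in none of the lists.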
+
+# 𝕨 is a list of possible expression start indices in any order and 𝕩 is
+# the corresponding endpoints. The expressions are mutually exclusive
+# and do not nest, and are enabled in index order. Return a shape ·‿2
+# array where the rows give the start and end of each enabled expression
+# in index order.
+Trace ← {
+  Se←{(⊏˜𝕨)Se 1¨⌾((𝕩/𝕨)⊸⊏)𝕩}⍟{0=⊑⌽𝕩}
+  g←⍋𝕨 ⋄ s←g⊏𝕨 ⋄ e←g⊏𝕩
+  st←¯1↓Se⟜(1↑˜≠)∾⟜≠s⍋e
+  st/s≍˘e
+}
+
+# Join lines with newline characters. Include the trailing newline.
+JoinLines ← ∾ ∾⟜lf¨
+
+# Create an html node from a tag name and interior text
+Html ← {
+  ∾ ⟨"<",𝕨,">" , 𝕩 , "</",(⊑⊐⟜" ")⊸↑𝕨,">"⟩
+}
+
+################################
+Markdown ← {
+  ######
+  # Utilities
+
+  # Index of first zero, or number of leading 1s
+  Lead ← ⊑ ⊐⟜0
+
+  # Shift cells 𝕨 into array 𝕩, maintaining its total length
+  Shl ←   ≠∘⊢ ↑ ∾   # From the left
+  Shr ← -∘≠∘⊢ ↑ ∾˜  # From the right
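+  # For instance (worked by hand, not captured from a run):
+  #   0 Shl 1‿2‿3  gives  0‿1‿2
+  #   0 Shr 1‿2‿3  gives  2‿3‿0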
+
+  # Find whether 𝕨 was true at the last index where 𝕩 was false, in each
+  # position.
+  PrecedesGroup ← {
+    # We prepend a 0 to 𝕨, so that 0 is the "before start" index, with a
+    # false value, and normal indices are increased by 1.
+    𝕨 ∾˜↩ 0
+    inds ← 1 + ↕≠𝕩
+    # Zero out indices where 𝕩 was true, and find the greatest index so
+    # far at each position.
+    last ← ⌈` inds × ¬𝕩
+    last ⊏ 𝕨
+  }
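+  # A worked example, traced by hand from the definition above rather
+  # than taken from a test run:
+  #   1‿0‿0‿0 PrecedesGroup 0‿1‿1‿0  gives  1‿1‿1‿0
+  # Positions 1 and 2 look back to position 0, the nearest place where 𝕩
+  # is 0, and take the value 𝕨 has there.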
+
+  # Remove leading and trailing spaces
+  Trim ← { 𝕩 /˜ ¬ (∧` ∨ ∧`⌾⌽) ' '=𝕩 }
+
+  ######
+  # First we classify each line based on the type of block it can start.
+  ClassifyLine ← (0<≠)◶(0‿0)‿{
+    ind ← ⊑ lineChars FindGroup ⊏𝕩
+    getLen ← ind ⊑ lineClas∾⟨0˜⟩
+    l ← GetLen 𝕩
+    ⟨ind ∧ l>0 ⋄ l⟩
+  }
+
+  # Non-empty lines in code blocks have 4 leading spaces
+  IsCode ← 4 (≤⟜≠)◶⟨0,∧´' '=↑⟩ ⊢
+  ProcCode ← {
+    lines ← JoinLines 4 ↓¨ 𝕩
+    Esc ← (∾⥊¨) ("<>"⊸⊐ ⊑⟜⟨"&lt;","&gt;"⟩⍟(2>⊣)¨ ⊢)
+    "pre" Html doHighlight◶⟨"code"Html Esc,Highlight⟩ lines
+  }
+
+  # Headings start with #, and require 1-6 #s followed by a space.
+  # Any trailing #s are ignored.
+  LenHeading ← {
+    n ← Lead 𝕩='#'
+    l ← (0<n) ∧ (6≥n)
+    s ← n (<⟜≠)◶⟨1,' '=⊑⟩ 𝕩 # Character after hashes must be a space, if any
+    n × l ∧ s
+  }
+  ProcHeading ← {
+    tag ← "h" ∾ 𝕨⊏•d        # h3 for 3 hashes, etc.
+    𝕩 ↓˜↩ 𝕨+1
+    trsp ← ∧`⌾⌽ 𝕩=' '
+    tail ← ∧`⌾⌽ trsp∨𝕩='#'  # Mask of trailing hashes
+    f ← tail < 0 Shr tail   # Character before trailing hashes
+    𝕩 /˜↩ ¬ f (⊑⟨"\"," ",""⟩⊐<f/𝕩)◶⟨⊣,⊢,⊢,0¨⊢⟩ tail
+    # Add an id: lowercase the header, replacing non-•a with hyphens
+    Slugify ← {
+      ch ← •UCS "-Aa"
+      bounds ← ⥊ (1↓ch) +⌜ 0‿26  # Of the upper and lowercase alphabet
+      (bounds⊸⍋ {(⊑ch)¨⌾((¬2|𝕨)⊸/)𝕩+32×1=𝕨} ⊢)⌾•UCS 𝕩
+    }
+    tag ∾↩ " id="∾""""(∾∾⊣) Slugify 𝕩
+    tag Html ProcInline Trim 𝕩
+  }⟜⊑
+
+  # List items start with a bullet (unordered) or number (ordered).
+  LenBullet ← 2 × 1 (<⟜≠)◶⟨0,' '=⊑⟩ ⊢
+  LenListNum ← {
+    n ← Lead 𝕩∊•d
+    l ← (1≤n) ∧ (9≥n)
+    ' ' = n ↓ 𝕩
+    t ← n↓(n+2)↑𝕩
+    l ∧ (" " ≡ 1↓t) ∧ ⊑(")." ∊˜ 1↑t)
+  }
+
+  # Any line that starts with a | is a table, at least in my lazy version
+  IsTable ← 1˜
+  ProcTable ← {
+    rows ← (Trim¨ ((1-˜¬×+`)'|'⊸=)⊸⊔)¨ 𝕩
+    inc ← ¬ rule ← ∧´∘∾¨'-'=rows
+    rows ↩ ProcInline¨¨⌾(inc⊸/) rows
+    rowType ← inc / +` rule  # Head or body
+    DoRow ← { lf ∾ JoinLines 𝕨⊸Html¨ 𝕩 }
+    rows ↩ (rowType ⊏ "th"‿"td") DoRow¨ inc/rows
+    rowGroups ← ¯1 ↓ rowType ⊔○(∾⟜2) "tr"⊸Html¨ rows
+    sections ← "thead"‿"tbody" Html⟜(lf ∾ JoinLines)¨ rowGroups
+    "table" Html lf ∾ JoinLines (0 < ≠¨rowGroups) / sections
+  }
+
+  # Paragraphs
+  ProcParagraph ← {
+    Trsp ← { m←∧`⌾⌽𝕩=' ' ⋄ (m¬⊸/𝕩)∾(𝕨<∨´m)/"<br />" }
+    𝕩 ↩ (/(≠𝕩)(-∾⊢)1) Trsp¨ 𝕩
+    "p" Html ProcInline ¯1 ↓ JoinLines ((Lead ' '⊸=)+"\#"≡2⊸↑)⊸↓¨ 𝕩
+  }
+
+  lineChars‿lineClas‿procFns ← <˘⍉>⟨
+    ""    ‿ (!∘0)       ‿ ProcParagraph
+    "#"   ‿ LenHeading  ‿ ProcHeading
+    " "   ‿ IsCode      ‿ ProcCode
+  # "-+*" ‿ LenBullet   ‿ ProcBullet
+  # •d    ‿ LenListNum  ‿ ProcListNum
+    "|"   ‿ IsTable     ‿ ProcTable
+  ⟩
+
+  ######
+  # Inline elements
+  ProcInline ← {
+    puncChars ← "!""#$%&'()*+,-./:;<=>?@[\]^_`{|}~"
+    I2M ← (≠𝕩)↑/⁼  # Index to mask
+
+    # Code spans
+    ProcCodeSpan ← {
+      𝕩 ↩ ' '¨⌾((𝕩=lf)⊸/) 𝕩
+      𝕩 ↩ (1↓¯1↓⊢)⍟((⊢<○(∧´)⊑∾⊑∘⌽) ' '⊸=) 𝕩
+      "code" Html Highlight⍟doHighlight 𝕩
+    }
+    tick ← 𝕩='`'
+    tend ← / (⊢ > 0⊸Shr) tick
+    tcount ← (1+↕∘≠)⊸(⊣-⌈`∘×) ¬ tick
+    tlen ← tend ⊏ tcount
+    c ← Trace´ tlen {m←(⊢=0⊸Shl)𝕨⋄(⌽⟜m/𝕩˜)¨1‿0}○((⍋tlen)⊸⊏) tend
+    cl ← (⊏˘c) ⊏ tcount
+    ctInds ← ⥊˘ 1 + c -⌜˘ cl×⌜1‿0
+    include ← ¬ ≠` I2M ⥊ 0‿3⊸⊏˘ ctInds
+    codeStart ← I2M 1 ⊏˘ ctInds
+    codeGroup ← 1 -˜ codeStart (⊣×>)○(+`) I2M 2 ⊏˘ ctInds
+    code ← ProcCodeSpan¨ codeGroup ⊔ 𝕩
+
+    # Links
+    ReplaceMDSub ← { ¯2 (↓∾"html"˜)⍟(("md"≡↑)∧'/'∧´∘≠⊢) 𝕩 }
+    ReplaceMD ← { ReplaceMDSub⌾((⊑𝕩⊐"#")⊸↑) 𝕩 }
+    ProcLink ← { ∾⟨"<a href=""",(ReplaceMD 𝕩),""">",𝕨,"</a>"⟩ }
+    brak ← /∘(include ∧ 𝕩⊸=)¨ "]()["
+    link ← (∊/⊣)´ 0‿¯1 + 2 ↑ brak
+    chains ← (⍋˜ ⊏ ⊢∾(≠𝕩)˜)` ¯1 ⌽ (<link) ∾ 2 ↓ brak
+    chains ↩ > (∧´ (∊ ∧ <⟜(≠𝕩))¨ 1 ↓ chains)⊸/¨ chains
+    linkStart ← I2M 0 ⊏ chains
+    lInds ← 1‿0‿2‿0⊸+˘ (⥊2⊸↕)˘ ⍉ chains
+    include ∧↩ ¬ ≠` I2M ⥊ (¯1‿1+0‿3⊸⊏)˘ lInds
+    linkGroup ← 1 -˜ (1‿0⥊˜≢)⊸(/ (⊣×>)○(+`I2M) ¬⊸/) ⥊lInds
+    links ← <∘ProcLink´˘ 2⊸(÷˜⟜≠∾⊣)⊸⥊ linkGroup ⊔ 𝕩
+
+    # Emphasis (still rudimentary)
+    eMasks ← (include ∧ 𝕩⊸=)¨ "*_"
+    eInds ← (⊢-2|⊢)∘≠⊸↑∘/¨ eMasks
+    include ∧↩ ¬∨´eMasks
+    eTags ← ∾ eInds ≠⊸⥊¨ <"<em>"‿"</em>"
+
+    new ← ∾⟨eTags,code,links⟩           # Text to be added
+    inds← ∾eInds∾/¨codeStart‿linkStart  # Where to add it
+    ((/include)∾(≠¨new)/inds) ⍋⊸⊏ (include/𝕩)∾∾new
+  }
+
+  ######
+  # Create the block structure using line classifications.
+  lengths ← ≠¨ 𝕩                   # Length of each line
+  blanks ← (Lead ' '⊸=)¨ 𝕩         # Number of leading blanks
+  nonEmptyMask ← blanks < lengths  # Empty ←→ all leading blanks
+
+  # Get line classifications: type of line, and data to be passed into
+  # the line processor. Note that leading blanks aren't passed in.
+  lineType‿lineDat ← <˘⍉ > ClassifyLine¨ blanks ↓¨ 𝕩
+  # Empty lines have type ¯1.
+  lineType ↩ ¯1¨⌾((¬nonEmptyMask)⊸/) lineType
+
+  # Lines that could be included in code blocks (will be refined)
+  codeMask ← nonEmptyMask ∧ blanks ≥ 4
+  paragraphMask ← 0 = lineType
+  # A header can't have 4 spaces of indentation. If it doesn't become
+  # part of a code block, it will be included in a paragraph.
+  lineType -↩ codeMask ∧ 1 = lineType
+
+  # Code blocks consist of indented lines, possibly with blank lines
+  # in between. They must be separated from paragraphs by blank lines.
+  codeMask ∧↩ ¬ paragraphMask PrecedesGroup codeMask
+  codeMask ∨↩ codeMask (⊢ ∧ PrecedesGroup ∧ PrecedesGroup⌾⌽) lineType < 0
+  lineType ↩ 2¨⌾(codeMask⊸/) lineType
+
+  # Lines continue blocks if they are part of the same multi-line
+  # type as the previous line, and otherwise start new ones.
+  # Headers (type 1) always start new blocks.
+  blockStart ← nonEmptyMask ∧ (1 = lineType) ∨ ¯1⊸Shl⊸≠ lineType
+  # Headers and paragraphs ignore leading blanks.
+  drop ← blanks × lineType < 2
+  # Group blocks based on blockStart, with type ¯1 lines excluded.
+  blocks ← (1 -˜ (lineType ≥ 0) × +`blockStart) ⊔ drop ↓¨ 𝕩
+
+  # To process a block, pick the appropriate function from procFns.
+  ProcBlock ← {t‿l G b: f←t⊑procFns ⋄ l F ⊑b }
+  JoinLines (blockStart / lineType≍˘lineDat) <∘ProcBlock˘ blocks
+}
+
+################################
+# Testing
+# Uses the test cases at https://spec.commonmark.org/0.29/spec.json
+# since GitHub doesn't seem to have published theirs
+TestSections ← {
+  doHighlight ↩ 0
+  tests ← ¯2 ↓˘ 8⊸(÷˜⟜≠∾⊣)⊸⥊2↓•LNS •path∾"../spec.json"
+  tests ↩ ((⊑2+⊐⟜':')¨∘⊏ ((-','=¯1⊑⊢)↓↓)¨⎉1 ⊢) tests
+  testSection ← (1↓¯1↓⊢)¨ 5⊏˘tests
+  UnEsc ← {
+    esc ← (2 | (1+↕∘≠) (⊣-⌈`∘×) '\'≠⊢) 𝕩
+    esc ¬⊸/ (("\"""∾•UCS 9‿10)⊏˜"\""tn"⊐⊢)⌾((¯1⌽esc)⊸/) 𝕩
+  }
+  RunTest ← {
+    in‿exp ← UnEsc∘(1↓¯1↓⊢)¨2↑𝕩
+    out ← Markdown (•UCS 10) ((⊢-˜¬×+`)∘=⊔⊢) in
+    ⟨exp≡out,in,exp,out,2⊑𝕩⟩
+  }
+
+  ignore ← (2 ⊏˘ tests) ∊ ⟨"47","85"⟩
+  res ← 1 ↓˘ (¬⊏˘)⊸/ RunTest˘ tests /˜ ignore < testSection ∊ 𝕩
+  doHighlight ↩ 1
+  res
+}
+
+################################
+# Syntax highlighting
+doHighlight ← 1
+Highlight ← {
+  idChars ← ⟨
+    •d∾"¯.π∞"
+    ' '+⌾•UCS•a
+    •a
+    "_"
+  ⟩
+  classes‿chars ← <˘ ⍉ 2⊸(÷˜⟜≠∾⊣)⊸⥊⟨
+    0             , " "∾•UCS 9‿10
+    "Value"       , ¯1⊏˘5‿2⥊"𝕨𝕩𝕗𝕘𝕤"
+    "Function"    , "+-×÷⋆√⌊⌈|¬∧∨<>≠=≤≥≡≢⊣⊢⥊∾≍↑↓↕⌽⍉/⍋⍒⊏⊑⊐⊒∊⍷⊔!"∾¯1⊏˘5‿2⥊"𝕎𝕏𝔽𝔾𝕊"
+    "Modifier"    , "˜˘¨⌜⁼´`"
+    "Composition" , "∘○⊸⟜⌾⊘◶⎉⚇⍟"
+    "Number"      , ∾idChars
+    "Gets"        , "←↩→"
+    "Paren"       , "()"
+    "Bracket"     , "⟨⟩"
+    "Brace"       , "{}"
+    "Ligature"    , "‿"
+    "Nothing"     , "·"
+    "Separator"   , "⋄,"
+    "Comment"     , "#"
+    "String"      , "'"""
+  ⟩
+  classTag ← ""‿""∾>{⟨"<span class='"∾𝕩∾"'>","</span>"⟩}¨1↓classes
+
+  r←𝕩='#'⋄s←/(≠↑2⊸↓)⊸∧𝕩='''⋄d←/𝕩='"'
+  b←⟨s⋄¯1↓d⋄/r⟩ Trace○∾ ⟨2+s⋄1↓d⋄(⊢-¯1↓0∾⊢)∘⊏⟜(0∾+`r)⊸//(𝕩=lf)∾1⟩
+  sc←+´(1‿2-˜≠classes)×(≠`∨⊢)∘((≠𝕩)↑/⁼∘∾)¨2↑((⊏˘b)⊏r)⊔○(∾⟜2)<˘b
+  col←sc⌈14|chars FindGroup 𝕩
+
+  w←(≠↑0∾⊢)⊸<id←col=5
+  idc←1+5|1-˜(idChars FindGroup w/𝕩)+'_'=((1↓∾⟜0)⊸<id)/𝕩
+  col↩((id/+`w)⊏0∾idc)⌾(id⊸/)col
+
+  col↩(1⌽col)⊣⌾((𝕩=⊑"𝕩")⊸/)col
+
+  bd←(≠↑¯1∾⊢)⊸≠col
+  f←0<bd/col
+  tags←⥊f/(bd/col)⊏classTag
+  pos←⥊f/2↕/bd∾1
+  ((↕≠𝕩)∾˜(≠¨tags)/pos) ⍋⊸⊏ 𝕩∾˜∾tags
+}
+
+head ← "<head><link href=""style.css"" rel=""stylesheet""/></head>"∾lf
+ConvertFile ← head ∾ Markdown∘•LNS
diff --git a/docsrc/transpose.md b/docsrc/transpose.md
new file mode 100644
index 00000000..006f0de6
--- /dev/null
+++ b/docsrc/transpose.md
@@ -0,0 +1,111 @@
+# Transpose
+
+As in APL, Transpose (`⍉`) is a tool for rearranging the axes of an array. BQN's version is tweaked to align better with the leading axis model and make common operations easier.
+
+## Monadic Transpose
+
+Transposing a matrix exchanges its axes, mirroring it across the diagonal. APL extends the function to any rank by reversing all axes, but this generalization isn't very natural and is almost never used. The main reason for it is to maintain the equivalence `a MP b ←→ a MP⌾⍉ b`, where `MP ← (+´<˘)∘×⎉1‿∞` is the generalized matrix product. But even here APL's Transpose is suspect. It does much more work than it needs to, as we'll see.
+
+BQN's transpose takes the first axis of its argument and moves it to the end.
+
+        ≢ a23456 ← ↕2‿3‿4‿5‿6
+    [ 2 3 4 5 6 ]
+        ≢ ⍉ a23456
+    [ 3 4 5 6 2 ]
+
+On the argument's ravel, this looks like a simple 2-dimensional transpose: one axis is exchanged with a compound axis made up of the other axes. Here we transpose a rank 3 array:
+
+        a322 ← 3‿2‿2⥊↕12
+        ≍○<⟜⍉ a322
+    ┌
+      ┌         ┌
+         0  1     0 4  8
+         2  3     1 5  9
+
+         4  5     2 6 10
+         6  7     3 7 11
+                         ┘
+         8  9
+        10 11
+              ┘
+                           ┘
+
+But, reading in ravel order, the argument and result have exactly the same element ordering as for the rank 2 matrix `⥊˘ a322`:
+
+        ≍○<⟜⍉ ⥊˘ a322
+    ┌
+      ┌             ┌
+        0 1  2  3     0 4  8
+        4 5  6  7     1 5  9
+        8 9 10 11     2 6 10
+                  ┘   3 7 11
+                             ┘
+                               ┘
+
+To exchange multiple axes, use the Power operator. Like Rotate, a negative power will move axes in the other direction. In particular, to move the last axis to the front, use Inverse (as you might expect, this exactly inverts `⍉`).
+
+        ≢ ⍉⍟3 a23456
+    [ 5 6 2 3 4 ]
+        ≢ ⍉⁼ a23456
+    [ 6 2 3 4 5 ]
+
+In fact, we have `≢⍉⍟k a ←→ k⌽≢a` for any integer `k` and array `a`.
+
+To move axes other than the first, use the Rank operator in order to leave initial axes untouched. A rank of `k>0` transposes only the last `k` axes while a rank of `k<0` ignores the first `|k` axes.
+
+        ≢ ⍉⎉3 a23456
+    [ 2 3 5 6 4 ]
+
+And of course, Rank and Power can be combined to do more complicated transpositions: move a set of contiguous axes with any starting point and length to the end.
+
+        ≢ ⍉⁼⎉¯1 a23456
+    [ 2 6 3 4 5 ]
+
+Using these forms, we can state BQN's generalized matrix product swapping rule:
+
+    a MP b  ←→  ⍉⍟(≠≢a) a ⍉⁼⊸MP⟜⍉ b
+
+Certainly not as concise as APL's version, but not a horror either. BQN's rule is actually more parsimonious in that it only performs the axis exchanges necessary for the computation: it moves the two axes that will be paired with the matrix product into place before the product, and directly exchanges all axes afterwards. Each of these steps is equivalent in terms of data movement to a matrix transpose, the simplest nontrivial transpose to perform. Also remember that for two-dimensional matrices both kinds of transposition are the same, and APL's rule holds in BQN.
+
+Axis permutations of the types we've shown generate the complete permutation group on any number of axes, so you could produce any transposition you want with the right sequence of monadic transpositions with Rank. However, this can be unintuitive and tedious. What if you want to transpose the first three axes, leaving the rest alone? With monadic Transpose you have to send some axes to the end, then bring them back to the beginning. For example [following four or five failed tries]:
+
+        ≢ ⍉⁼⎉¯2 ⍉ a23456  # Restrict Transpose to the first three axes
+    [ 3 4 2 5 6 ]
+
+In a case like this, BQN's dyadic Transpose is much easier.
+
+## Dyadic Transpose
+
+Transpose also allows a left argument that specifies a permutation of the right argument's axes. For each index `p←i⊏𝕨` in the left argument, axis `i` of the argument is used for axis `p` of the result. Multiple argument axes can be sent to the same result axis, in which case that axis goes along a diagonal of the argument array, and the result will have a lower rank than the argument.
+
+        ≢ 1‿3‿2‿0‿4 ⍉ a23456
+    [ 5 2 4 3 6 ]
+        ≢ 1‿2‿2‿0‿0 ⍉ a23456  # Don't worry too much about this case though
+    [ 5 2 3 ]
+
+Since this kind of rearrangement can be counterintuitive, it's often easier to use `⍉⁼` when specifying all axes. If `p≡○≠≢a`, then we have `≢p⍉⁼a ←→ p⊏≢a`.
+
+        ≢ 1‿3‿2‿0‿4 ⍉⁼ a23456
+    [ 3 5 4 2 6 ]
+
+So far, all like APL. BQN makes one little extension, which is to allow only some axes to be specified. The left argument will be matched up with leading axes of the right argument. Those axes are moved according to the left argument, and remaining axes are placed in order into the gaps between them.
+
+        ≢ 0‿2‿4 ⍉ a23456
+    [ 2 5 3 6 4 ]
+
+In particular, the case with only one axis specified is interesting. Here, the first axis ends up at the given location. This gives us a much better solution to the problem at the end of the last section.
+
+        ≢ 2 ⍉ a23456  # Restrict Transpose to the first three axes
+    [ 3 4 2 5 6 ]
+
+Finally, it's worth noting that, as monadic Transpose moves the first axis to the end, it's equivalent to dyadic Transpose with a "default" left argument: `(≠∘≢-1˜)⊸⍉`.
+
+## Definitions
+
+Here we define the two valences of Transpose more precisely.
+
+A non-array right argument to Transpose is always boxed to get a scalar array before doing anything else.
+
+Monadic Transpose is identical to `(≠∘≢-1˜)⊸⍉`, except that for scalar arguments it returns the array unchanged rather than giving an error.
+
+In dyadic Transpose, the left argument is a number or numeric array of rank 1 or less, and `𝕨≤○≠≢𝕩`. Define the result rank `r←(≠≢𝕩)-+´¬∊𝕨` to be the argument rank minus the number of duplicate entries in the left argument. We require `∧´𝕨<r`. Bring `𝕨` to full length by appending the missing indices: `𝕨∾↩𝕨(¬∘∊˜/⊢)↕r`. Now the result shape is defined to be `⌊´¨𝕨⊔≢𝕩`. Element `i⊑z` of the result `z` is element `(𝕨⊏i)⊑𝕩` of the argument.
diff --git a/docsrc/windows.md b/docsrc/windows.md
new file mode 100644
index 00000000..d4165d62
--- /dev/null
+++ b/docsrc/windows.md
@@ -0,0 +1,112 @@
+# Windows
+
+In BQN, it's strongly preferred to use functions, and not operators (modifiers and compositions), for array manipulation. Functions are simpler as they have fewer moving parts. They are more concrete, since the array results can always be viewed right away. They are easier to implement with reasonable performance as well, since there is no need to recognize many possible function operands as special cases.
+
+The Windows function replaces APL's Windowed Reduction, J's more general Infix operator, and Dyalog's Stencil, which is adapted from one case of J's Cut operator.
+
+## Definition
+
+We'll start with the one-axis case. Here the left argument of Windows is a number between `0` and `1+≠𝕩`. The result is composed of slices of `𝕩` (contiguous sections of major cells) with length `𝕨`, starting at each possible index in order.
+
+        5↕"abcdefg"
+    ┌
+      abcde
+      bcdef
+      cdefg
+            ┘
+
+There are `1+(≠𝕩)-𝕨`, or `(≠𝕩)¬𝕨` using Span, of these slices, because the starting index must be at least `0` and at most `(≠𝕩)-𝕨`. Another way to find this result is to look at the number of cells in or before a given slice: there are always `𝕨` in the slice and there are only `≠𝕩` in total, so the number of slices is the range spanned by these two endpoints.
+
+You can take a slice of an array `𝕩` that has length `l` and starts at index `i` using `l↑i↓𝕩` or `l↑i⌽𝕩`. The [Prefixes](prefixes.md) function returns all the slices that start at the beginning (`i=0`), and Suffixes gives the slices that end at the end of the array (`(≠𝕩)=i+l`). Windows gives yet another collection of slices: the ones that have a fixed length `l=𝕨`. Selecting one cell from its result gives you the slice starting at that cell's index:
+
+        2⊏5↕"abcdefg"
+    [ cdefg ]
+        5↑2↓"abcdefg"
+    [ cdefg ]
+
+Windows differs from Prefixes and Suffixes in that it doesn't add a layer of nesting (it doesn't box each slice). This is possible because the slices have a fixed size.
+
+### Multiple dimensions
+
+The above description also applies to a higher-rank right argument. As an example, we'll look at two-row slices of a shape `3‿4` array. For convenience, we will box each slice. Note that slices always have the same rank as the argument array.
+
+        <⎉2 2↕"0123"∾"abcd"≍"ABCD"
+    ┌
+      ┌        ┌
+        0123     abcd
+        abcd     ABCD
+             ┘        ┘
+                        ┘
+
+Passing a list as the left argument to Windows takes slices along any number of leading axes. Here are all the shape `2‿2` slices:
+
+        <⎉2 2‿2↕"0123"∾"abcd"≍"ABCD"
+    ┌
+      ┌      ┌      ┌
+        01     12     23
+        ab     bc     cd
+           ┘      ┘      ┘
+      ┌      ┌      ┌
+        ab     bc     cd
+        AB     BC     CD
+           ┘      ┘      ┘
+                           ┘
+
+The slices are naturally arranged along multiple dimensions according to their starting index. Once again the equivalence `i⊏l↕x` ←→ `l↑i↓x` holds, provided `i` and `l` have the same length.
+
+If the left argument has length `0`, then the argument is not sliced along any dimensions. The only slice that results—the entire argument—is then arranged along an additional zero dimensions. In the end, the result is the same as the argument.
+
+### More formally
+
+`𝕩` is an array. `𝕨` is a number, or a numeric list or scalar array, with `𝕨≤○≠≢𝕩`. The result `z` has shape `(𝕨¬˜(≠𝕨)↑≢𝕩)∾𝕨∾(≠𝕨)↓≢𝕩`, and element `i⊑z` is `𝕩⊑˜(≠𝕨)(↑+⌾((≠𝕨)⊸↑)↓)i`.
+
+Using [Group](group.md) we could also write `i⊑z` ←→ `𝕩⊑˜(𝕨∾○(↕∘≠)≢𝕩) +´¨∘⊔ i`.
+
+## Symmetry
+
+Let's look at an earlier example, along with its transpose.
+
+        {⟨𝕩,⍉𝕩⟩}5↕"abcdefg"
+    ┌
+      ┌         ┌
+        abcde     abc
+        bcdef     bcd
+        cdefg     cde
+              ┘   def
+                  efg
+                      ┘
+                        ┘
+
+Although the two arrays have different shapes, they are identical where they overlap.
+
+        ≡○(3‿3⊸↑)⟜⍉5↕"abcdefg"
+    1
+
+In other words, the i'th element of slice j is the same as the j'th element of slice i: it is the `i+j`'th element of the argument. So transposing still gives a possible result of Windows, but with a different slice length.
+
+        {(5↕𝕩)≡⍉(3↕𝕩)}"abcdefg"
+    1
+
+In general, we need a more complicated transpose—swapping the first set of `≠𝕨` axes with the second set. Note again the use of Span, our slice-length to slice-number converter.
+
+        {((5‿6¬2‿2)↕𝕩) ≡ 2‿3⍉(2‿2↕𝕩)} ↕5‿6‿7
+    1
+
+## Applications
+
+Windows can be followed up with a reduction on each slice to give a windowed reduction. Here we take running sums of 3 values.
+
+        +´˘3↕ ⟨2,6,0,1,4,3⟩
+    [ 8 7 5 8 ]
+
+A common task is to pair elements, with an initial or final element so the total length stays the same. This can also be done with a pairwise reduction, but another good way (and more performant without special support in the interpreter) is to add the element and then use windows matching the original length. Here both methods are used to invert `` +` ``, which requires we take pairwise differences starting at initial value 0.
+
+        -˜´˘2↕0∾ +` 3‿2‿1‿1
+    [ 3 2 1 1 ]
+        ((-˜´<˘)≠↕0∾⊢) +` 3‿2‿1‿1
+    [ 3 2 1 1 ]
+
+This method extends to any number of initial elements. We can modify the running sum above to keep the length constant by starting with two zeros.
+
+        ((+´<˘)≠↕(2⥊0)⊸∾) ⟨2,6,0,1,4,3⟩
+    [ 2 8 8 7 5 8 ]