Vaguely adequate spec for index-related primitives

author: Marshall Lochbaum <mwlochbaum@gmail.com> 2021-02-01 22:21:26 -0500
committer: Marshall Lochbaum <mwlochbaum@gmail.com> 2021-02-01 22:21:26 -0500
commit: 36037d57b46811084d076f20e79b87894e139671 (patch)
tree: 9b6f14e8f859c6b704910ca9d0ec75aabc5b5959 /spec
parent: fba3d225e7687f87f923dca6e7647bb30aa89025 (diff)
1 files changed, 18 insertions, 2 deletions
diff --git a/spec/primitive.md b/spec/primitive.md
index 6131fea7..5fc15001 100644
--- a/spec/primitive.md
+++ b/spec/primitive.md
@@ -145,6 +145,22 @@ Modifiers for iteration are defined in layers 1, 2, and 4. Two 2-modifiers, `⚇
 
 **Transpose** (`⍉`) reorders axes of its argument to place the first axis last; if the argument has one or fewer axes then it's returned unchanged. **Reorder Axes** (`⍉`) requires the left argument to be a list or unit of natural numbers, with length at most the rank of the right argument. This list is extended to match the right argument rank exactly by repeatedly appending the least unused natural number (for example, given `1‿3‿0‿0`, `2` is appended). After extension, it specifies a result axis for each axis of the right argument. There must be no gaps in the list: that is, with the result rank equal to one plus the greatest value present, every result axis must appear at least once. Now each argument axis is "sent to" the specified result axis: in terms of indices, `i⊑𝕨⍉𝕩` is `(𝕨⊏i)⊑𝕩` if `𝕨` is complete. If multiple argument axes correspond to the same result axis, then a diagonal is taken, and it's as long as the shortest of those argument axes. While Transpose does not enclose an atom right argument, Reorder Axes does, so that its result is always an array.
 
+### Indices and selection
+
+Each element in an array `s⥊e` is associated with an *index*, which is a list of natural numbers `i` such that `∧´i<s`. The list of all indices, which corresponds to the element list `e`, contains all such lists `i` in lexicographic order. That is, index `i` comes before `j` exactly when the two indices are not the same, and `i` has the smaller value at the first position where they are unequal. The index of an element along a particular axis `a` is the value `a⊑i`.
+
+**Pick** (`⊑`) is extended to array left arguments. In this case, it requires every depth-1 array in the nested structure of `𝕨` to be a valid index list for `𝕩`, and every atom to be contained in one of these lists. The result is `𝕨` with each index list replaced by the element of `𝕩` at that index. In the simple case where `𝕨` itself is an index list, the result is the element of `𝕩` at index `𝕨`.
+
+**First** (`⊑`) simply takes the first element of its argument in index order, or the fill element if `𝕩` is empty.
+
+For **Select** (`⊏`), `𝕨` is an array of natural numbers, or a list of such arrays; if it's an empty list, it's interpreted as the former. The given arrays are matched with leading axes of `𝕩` and used to select from those axes. Their shape is retained, so that the final shape is the combined shapes of each array of natural numbers in `𝕨` in order, followed by the trailing (unmatched) shape of `𝕩`. This means that a single axis in `𝕩` can correspond to any number of axes in `𝕨⊏𝕩`, depending on the rank of that portion of `𝕨`. More precisely, the value of the result at an index `j` is obtained by splitting `j` into one index into each array of `𝕨` followed by a partial index into `𝕩`. An index `i` for `𝕩` comes from selecting from each array of `𝕨` and appending the results to the partial index from `j`, and the value `i⊑𝕩` is `j⊑𝕨⊏𝕩`.
+
+**First Cell** (`⊏`) selects the initial major cell of `𝕩`, giving an error if `𝕩` has rank 0 or length 0.
+
+**Group** (`⊔`) performs an opposite operation to Select, so that `𝕨` specifies not the argument index that result values come from, but the result index that argument values go to. The result rank is `≠𝕨`, and each result element has rank `1+(=𝕩)-+´=¨𝕨`. The result element contains the minimal list of cells so that each cell of `𝕩` appears in the corresponding element of `𝕨⊏𝕨⊔𝕩`, in index order. **Group Indices** treats its argument `𝕩` as a left argument for Group and uses a right argument made up of indices, with the definition `⊔⟜(↕≠⚇1)`.
+
+**Range** (`↕`) is extended to apply to a list of natural numbers, in addition to the provided case of a single natural number (an enclosed natural number `𝕩` should still result in an error). For a list `𝕩`, the result is an array of shape `𝕩` in which the value at a given index is that index, as a list of natural numbers. That is, `i≡i⊑↕𝕩` for any list of natural numbers `i` with `∧´i<𝕩`.
+
 ### Searching
 
 **Match** (`≡`) indicates whether two values are considered equivalent. It always returns 0 or 1, and never causes an error. If both arguments are atoms then it is identical to `=`, and if one is an atom and the other an array then it returns 0. If both arguments are arrays then it returns 1 only if they have the same shape and all pairs of corresponding elements match. Fill elements aren't taken into account, so that arrays that match might still differ in behavior. **Not Match** simply returns the complement of Match, `¬≡`.
@@ -169,7 +185,7 @@ Sorting functions are those that depend on BQN's array ordering. There are three
 
 **Sort** (`∧∨`) reorders the major cells of its argument so that a major cell with a lower index comes earlier in the ordering than a major cell with a higher index, or matches it.
 
-**Grade** (`⍋⍒`) returns a permutation describing the way the argument array would be sorted. For this reason the reference implementations simply define Sort to be selection by the grade. One way to define Grade is as a sorted version of the index list `↕≠𝕩`. An index `i` is ordered according to the corresponding major cell `i⊏𝕩`. However, ties in the ordering are broken by ordering the index values themselves, so that no two indices are ever considered equal, and the result of sorting is well-defined (for Sort this is not an issue—matching cells are truly interchangeable). This property means that a stable sorting algorithm must be used to implement Grade functions. While cells might be ordered ascending or descending, indices are always ordered ascending, so that for example index `i` is placed before index `j` if either `i⊏𝕩` comes earlier in the ordering than `j⊏𝕩`, or if `i<j`.
+**Grade** (`⍋⍒`) returns a permutation describing the way the argument array would be sorted. For this reason the reference implementations simply define Sort to be selection by the grade. One way to define Grade is as a sorted version of the index list `↕≠𝕩`. An index `i` is ordered according to the corresponding major cell `i⊏𝕩`. However, ties in the ordering are broken by ordering the index values themselves, so that no two indices are ever considered equal, and the result of sorting is well-defined (for Sort this is not an issue—matching cells are truly interchangeable). This property means that a stable sorting algorithm must be used to implement Grade functions. While cells might be ordered ascending or descending, indices are always ordered ascending, so that for example index `i` is placed before index `j` if either `i⊏𝕩` comes earlier in the ordering than `j⊏𝕩`, or if they match and `i<j`.
 
 **Bins** (`⍋⍒`) requires the `𝕨` to be ordered in the sense of Sort (with the same direction). Like a dyadic search function, it then works on cells of `𝕩` with the same rank as major cells of `𝕨`: the rank of `𝕩` cannot be less than `(=𝕨)-1`. For each of these, it identifies where in the ordering given by `𝕨` the cell belongs, that is, the index of the first cell in `𝕨` that is ordered later than it, or `≠𝕨` if no such cell exists. An equivalent formulation is that the result value for a cell of `𝕩` is the number of major cells in `𝕨` that match or precede it.
 
@@ -179,6 +195,6 @@ Here we define the array ordering using the terms "smaller" and "larger". For fu
 
 To compare two arrays, BQN first attempts to compare elements at corresponding indices, where two indices are considered to correspond if one is a suffix of the other. Elements are accessed in ravel order, that is, beginning at the all-zero index and first increasing the final number in the index, then the second-to-last, and so on. They are compared, using array comparison if necessary, until a non-matching pair of elements is found—in this case the ordering of this pair determines the ordering of the arrays—or one array has an index with no corresponding index in the other array. For example, comparing `4‿3‿2⥊1` with `2‿5⥊1` stops when index `0‿2` in `2‿5⥊1` is reached, because the corresponding index `0‿0‿2` is out of range. The index `0‿2‿0` in the other array also has no corresponding index, but comes later in the index ordering. In this case, the array that lacks the index in question is considered smaller.
 
-If two arrays have the same shape (ignoring leading `1`s) and all matching element, or if they are both empty, then the element-by-element comparison will not find any differences. In this case, the arrays are compared first by rank, with the higher-rank array considered larger, and then by shape, beginning with the leading axes.
+If two arrays have the same shape (ignoring leading `1`s) and all matching elements, or if they are both empty, then the element-by-element comparison will not find any differences. In this case, the arrays are compared first by rank, with the higher-rank array considered larger, and then by shape, beginning with the leading axes.
 
 To compare two atoms, array ordering uses `≤`: if `𝕨≤𝕩` then `𝕨` matches `𝕩` if `𝕩≤𝕨` and otherwise is smaller than `𝕩` (and `𝕩` is larger than `𝕨`). To compare an atom to an array, the atom is promoted to an array by enclosing it; however, if the enclosed atom matches the array then the atom is considered smaller.
author	Marshall Lochbaum <mwlochbaum@gmail.com>	2021-02-01 22:21:26 -0500
committer	Marshall Lochbaum <mwlochbaum@gmail.com>	2021-02-01 22:21:26 -0500
commit	36037d57b46811084d076f20e79b87894e139671 (patch)
tree	9b6f14e8f859c6b704910ca9d0ec75aabc5b5959 /spec
parent	fba3d225e7687f87f923dca6e7647bb30aa89025 (diff)