diff options
| author | Marshall Lochbaum <mwlochbaum@gmail.com> | 2021-02-01 22:21:26 -0500 |
|---|---|---|
| committer | Marshall Lochbaum <mwlochbaum@gmail.com> | 2021-02-01 22:21:26 -0500 |
| commit | 36037d57b46811084d076f20e79b87894e139671 (patch) | |
| tree | 9b6f14e8f859c6b704910ca9d0ec75aabc5b5959 /docs/spec | |
| parent | fba3d225e7687f87f923dca6e7647bb30aa89025 (diff) | |
Vaguely adequate spec for index-related primitives
Diffstat (limited to 'docs/spec')
| -rw-r--r-- | docs/spec/primitive.html | 12 |
1 files changed, 10 insertions, 2 deletions
diff --git a/docs/spec/primitive.html b/docs/spec/primitive.html index 61b190b3..db1b7e02 100644 --- a/docs/spec/primitive.html +++ b/docs/spec/primitive.html @@ -119,6 +119,14 @@ <p><strong>Join To</strong> (<code><span class='Function'>∾</span></code>) combines its two arguments along an existing initial axis, unless both arguments are units, in which case it creates an axis and is identical to Couple (<code><span class='Function'>≍</span></code>). The arguments must differ in rank by at most 1, and the result rank is equal to the maximum of 1 and the higher argument rank. Each argument with rank less than the result, and each major cell of an argument with rank equal to it, becomes a major cell of the result, with cells from the left argument placed before those from the right. <strong>Join</strong> (<code><span class='Function'>∾</span></code>) generalizes the equal-rank subset of this behavior to an array of values instead of just two. The argument must be an array (unlike Merge), and its elements must all the same rank, which is at least the argument rank. Atom elements are treated as unit arrays. Then "outer" argument axes are matched up with leading "inner" element axes, and elements are joined along these axes. In order to allow this, the length of an element along a particular axis must depend only on the position along the corresponding axis in the argument. An empty argument to Join is return unchanged, as though the element rank is equal to the argument rank.</p> <p><strong>Deshape</strong> (<code><span class='Function'>⥊</span></code>) differs from the provided function (which returns the element list of an array) only in that it accepts an atom, returning a one-element list containing it. <strong>Reshape</strong> (<code><span class='Function'>⥊</span></code>) is extended in numerous ways. It accepts any list of natural numbers (including as a unit array or atom) for the left argument and any right argument; <code><span class='Value'>𝕩</span></code> is deshaped first so that it is treated as a list of elements. These elements are repeated cyclically to fill the result array in ravel order. If <code><span class='Value'>𝕩</span></code> is empty but the result is not, then the result consists of fill elements for <code><span class='Value'>𝕩</span></code>. Furthermore, at most one element of <code><span class='Value'>𝕨</span></code> can be a "length code": one of the primitives <code><span class='Modifier2'>∘</span><span class='Function'>⌊⌽↑</span></code>. In this case, a target length is computed from the number of elements in <code><span class='Value'>𝕩</span></code> divided by the product of the other elements of <code><span class='Value'>𝕨</span></code> (which must not be zero). If the target length is an integer then it is used directly for the length code. Otherwise, an error is given if the length code is <code><span class='Modifier2'>∘</span></code>, and the target length is rounded down if the code is <code><span class='Function'>⌊</span></code> and up if it's <code><span class='Function'>⌽</span></code> or <code><span class='Function'>↑</span></code>. With code <code><span class='Function'>⌽</span></code>, elements are repeated cyclically as usual, but with code <code><span class='Function'>↑</span></code>, the extra elements after each argument element is used are fill values for <code><span class='Value'>𝕩</span></code>.</p> <p><strong>Transpose</strong> (<code><span class='Function'>⍉</span></code>) reorders axes of its argument to place the first axis last; if the argument has one or fewer axes then it's returned unchanged. <strong>Reorder Axes</strong> (<code><span class='Function'>⍉</span></code>) requires the left argument to be a list or unit of natural numbers, with length at most the rank of the right argument. This list is extended to match the right argument rank exactly by repeatedly appending the least unused natural number (for example, given <code><span class='Number'>1</span><span class='Ligature'>‿</span><span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>0</span></code>, <code><span class='Number'>2</span></code> is appended). After extension, it specifies a result axis for each axis of the right argument. There must be no gaps in the list: that is, with the result rank equal to one plus the greatest value present, every result axis must appear at least once. Now each argument axis is "sent to" the specified result axis: in terms of indices, <code><span class='Value'>i</span><span class='Function'>⊑</span><span class='Value'>𝕨</span><span class='Function'>⍉</span><span class='Value'>𝕩</span></code> is <code><span class='Paren'>(</span><span class='Value'>𝕨</span><span class='Function'>⊏</span><span class='Value'>i</span><span class='Paren'>)</span><span class='Function'>⊑</span><span class='Value'>𝕩</span></code> if <code><span class='Value'>𝕨</span></code> is complete. If multiple argument axes correspond to the same result axis, then a diagonal is taken, and it's as long as the shortest of those argument axes. While Transpose does not enclose an atom right argument, Reorder Axes does, so that its result is always an array.</p> +<h3 id="indices-and-selection">Indices and selection</h3> +<p>Each element in an array <code><span class='Value'>s</span><span class='Function'>⥊</span><span class='Value'>e</span></code> is associated with an <em>index</em>, which is a list of natural numbers <code><span class='Value'>i</span></code> such that <code><span class='Function'>∧</span><span class='Modifier'>´</span><span class='Value'>i</span><span class='Function'><</span><span class='Value'>s</span></code>. The list of all indices, which corresponds to the element list <code><span class='Value'>e</span></code>, contains all such lists <code><span class='Value'>i</span></code> in lexicographic order. That is, index <code><span class='Value'>i</span></code> comes before <code><span class='Value'>j</span></code> exactly when the two indices are not the same, and <code><span class='Value'>i</span></code> has the smaller value at the first position where they are unequal. The index of an element along a particular axis <code><span class='Value'>a</span></code> is the value <code><span class='Value'>a</span><span class='Function'>⊑</span><span class='Value'>i</span></code>.</p> +<p><strong>Pick</strong> (<code><span class='Function'>⊑</span></code>) is extended to array left arguments. In this case, it requires every depth-1 array in the nested structure of <code><span class='Value'>𝕨</span></code> to be a valid index list for <code><span class='Value'>𝕩</span></code>, and every atom to be contained in one of these lists. The result is <code><span class='Value'>𝕨</span></code> with each index list replaced by the element of <code><span class='Value'>𝕩</span></code> at that index. In the simple case where <code><span class='Value'>𝕨</span></code> itself is an index list, the result is the element of <code><span class='Value'>𝕩</span></code> at index <code><span class='Value'>𝕨</span></code>.</p> +<p><strong>First</strong> (<code><span class='Function'>⊑</span></code>) simply takes the first element of its argument in index order, or the fill element if <code><span class='Value'>𝕩</span></code> is empty.</p> +<p>For <strong>Select</strong> (<code><span class='Function'>⊏</span></code>), <code><span class='Value'>𝕨</span></code> is an array of natural numbers, or a list of such arrays; if it's an empty list, it's interpreted as the former. The given arrays are matched with leading axes of <code><span class='Value'>𝕩</span></code> and used to select from those axes. Their shape is retained, so that the final shape is the combined shapes of each array of natural numbers in <code><span class='Value'>𝕨</span></code> in order, followed by the trailing (unmatched) shape of <code><span class='Value'>𝕩</span></code>. This means that a single axis in <code><span class='Value'>𝕩</span></code> can correspond to any number of axes in <code><span class='Value'>𝕨</span><span class='Function'>⊏</span><span class='Value'>𝕩</span></code>, depending on the rank of that portion of <code><span class='Value'>𝕨</span></code>. More precisely, the value of the result at an index <code><span class='Value'>j</span></code> is obtained by splitting <code><span class='Value'>j</span></code> into one index into each array of <code><span class='Value'>𝕨</span></code> followed by a partial index into <code><span class='Value'>𝕩</span></code>. An index <code><span class='Value'>i</span></code> for <code><span class='Value'>𝕩</span></code> comes from selecting from each array of <code><span class='Value'>𝕨</span></code> and appending the results to the partial index from <code><span class='Value'>j</span></code>, and the value <code><span class='Value'>i</span><span class='Function'>⊑</span><span class='Value'>𝕩</span></code> is <code><span class='Value'>j</span><span class='Function'>⊑</span><span class='Value'>𝕨</span><span class='Function'>⊏</span><span class='Value'>𝕩</span></code>.</p> +<p><strong>First Cell</strong> (<code><span class='Function'>⊏</span></code>) selects the initial major cell of <code><span class='Value'>𝕩</span></code>, giving an error if <code><span class='Value'>𝕩</span></code> has rank 0 or length 0.</p> +<p><strong>Group</strong> (<code><span class='Function'>⊔</span></code>) performs an opposite operation to Select, so that <code><span class='Value'>𝕨</span></code> specifies not the argument index that result values come from, but the result index that argument values go to. The result rank is <code><span class='Function'>≠</span><span class='Value'>𝕨</span></code>, and each result element has rank <code><span class='Number'>1</span><span class='Function'>+</span><span class='Paren'>(</span><span class='Function'>=</span><span class='Value'>𝕩</span><span class='Paren'>)</span><span class='Function'>-+</span><span class='Modifier'>´</span><span class='Function'>=</span><span class='Modifier'>¨</span><span class='Value'>𝕨</span></code>. The result element contains the minimal list of cells so that each cell of <code><span class='Value'>𝕩</span></code> appears in the corresponding element of <code><span class='Value'>𝕨</span><span class='Function'>⊏</span><span class='Value'>𝕨</span><span class='Function'>⊔</span><span class='Value'>𝕩</span></code>, in index order. <strong>Group Indices</strong> treats its argument <code><span class='Value'>𝕩</span></code> as a left argument for Group and uses a right argument made up of indices, with the definition <code><span class='Function'>⊔</span><span class='Modifier2'>⟜</span><span class='Paren'>(</span><span class='Function'>↕≠</span><span class='Modifier2'>⚇</span><span class='Number'>1</span><span class='Paren'>)</span></code>.</p> +<p><strong>Range</strong> (<code><span class='Function'>↕</span></code>) is extended to apply to a list of natural numbers, in addition to the provided case of a single natural number (an enclosed natural number <code><span class='Value'>𝕩</span></code> should still result in an error). For a list <code><span class='Value'>𝕩</span></code>, the result is an array of shape <code><span class='Value'>𝕩</span></code> in which the value at a given index is that index, as a list of natural numbers. That is, <code><span class='Value'>i</span><span class='Function'>≡</span><span class='Value'>i</span><span class='Function'>⊑↕</span><span class='Value'>𝕩</span></code> for any list of natural numbers <code><span class='Value'>i</span></code> with <code><span class='Function'>∧</span><span class='Modifier'>´</span><span class='Value'>i</span><span class='Function'><</span><span class='Value'>𝕩</span></code>.</p> <h3 id="searching">Searching</h3> <p><strong>Match</strong> (<code><span class='Function'>≡</span></code>) indicates whether two values are considered equivalent. It always returns 0 or 1, and never causes an error. If both arguments are atoms then it is identical to <code><span class='Function'>=</span></code>, and if one is an atom and the other an array then it returns 0. If both arguments are arrays then it returns 1 only if they have the same shape and all pairs of corresponding elements match. Fill elements aren't taken into account, so that arrays that match might still differ in behavior. <strong>Not Match</strong> simply returns the complement of Match, <code><span class='Function'>¬≡</span></code>.</p> <p>Monadic search functions compare the major cells of <code><span class='Value'>𝕩</span></code> to each other. <code><span class='Value'>𝕩</span></code> must have rank at least 1. Except for Unique (<code><span class='Function'>⍷</span></code>), the result is a list of numbers with the same length as <code><span class='Value'>𝕩</span></code>.</p> @@ -138,10 +146,10 @@ <h3 id="sorting">Sorting</h3> <p>Sorting functions are those that depend on BQN's array ordering. There are three kinds of sorting function, with two functions of each kind: one with an upward-pointing glyph that uses an ascending ordering (these function names are suffixed with "Up"), and one with a downward-pointing glyph and the reverse, descending, ordering ("Down"). Below, these three kinds of function are described, then the ordering rules. Except for the right argument of Bins, all arguments must have rank at least 1.</p> <p><strong>Sort</strong> (<code><span class='Function'>∧∨</span></code>) reorders the major cells of its argument so that a major cell with a lower index comes earlier in the ordering than a major cell with a higher index, or matches it.</p> -<p><strong>Grade</strong> (<code><span class='Function'>⍋⍒</span></code>) returns a permutation describing the way the argument array would be sorted. For this reason the reference implementations simply define Sort to be selection by the grade. One way to define Grade is as a sorted version of the index list <code><span class='Function'>↕≠</span><span class='Value'>𝕩</span></code>. An index <code><span class='Value'>i</span></code> is ordered according to the corresponding major cell <code><span class='Value'>i</span><span class='Function'>⊏</span><span class='Value'>𝕩</span></code>. However, ties in the ordering are broken by ordering the index values themselves, so that no two indices are ever considered equal, and the result of sorting is well-defined (for Sort this is not an issue—matching cells are truly interchangeable). This property means that a stable sorting algorithm must be used to implement Grade functions. While cells might be ordered ascending or descending, indices are always ordered ascending, so that for example index <code><span class='Value'>i</span></code> is placed before index <code><span class='Value'>j</span></code> if either <code><span class='Value'>i</span><span class='Function'>⊏</span><span class='Value'>𝕩</span></code> comes earlier in the ordering than <code><span class='Value'>j</span><span class='Function'>⊏</span><span class='Value'>𝕩</span></code>, or if <code><span class='Value'>i</span><span class='Function'><</span><span class='Value'>j</span></code>.</p> +<p><strong>Grade</strong> (<code><span class='Function'>⍋⍒</span></code>) returns a permutation describing the way the argument array would be sorted. For this reason the reference implementations simply define Sort to be selection by the grade. One way to define Grade is as a sorted version of the index list <code><span class='Function'>↕≠</span><span class='Value'>𝕩</span></code>. An index <code><span class='Value'>i</span></code> is ordered according to the corresponding major cell <code><span class='Value'>i</span><span class='Function'>⊏</span><span class='Value'>𝕩</span></code>. However, ties in the ordering are broken by ordering the index values themselves, so that no two indices are ever considered equal, and the result of sorting is well-defined (for Sort this is not an issue—matching cells are truly interchangeable). This property means that a stable sorting algorithm must be used to implement Grade functions. While cells might be ordered ascending or descending, indices are always ordered ascending, so that for example index <code><span class='Value'>i</span></code> is placed before index <code><span class='Value'>j</span></code> if either <code><span class='Value'>i</span><span class='Function'>⊏</span><span class='Value'>𝕩</span></code> comes earlier in the ordering than <code><span class='Value'>j</span><span class='Function'>⊏</span><span class='Value'>𝕩</span></code>, or if they match and <code><span class='Value'>i</span><span class='Function'><</span><span class='Value'>j</span></code>.</p> <p><strong>Bins</strong> (<code><span class='Function'>⍋⍒</span></code>) requires the <code><span class='Value'>𝕨</span></code> to be ordered in the sense of Sort (with the same direction). Like a dyadic search function, it then works on cells of <code><span class='Value'>𝕩</span></code> with the same rank as major cells of <code><span class='Value'>𝕨</span></code>: the rank of <code><span class='Value'>𝕩</span></code> cannot be less than <code><span class='Paren'>(</span><span class='Function'>=</span><span class='Value'>𝕨</span><span class='Paren'>)</span><span class='Function'>-</span><span class='Number'>1</span></code>. For each of these, it identifies where in the ordering given by <code><span class='Value'>𝕨</span></code> the cell belongs, that is, the index of the first cell in <code><span class='Value'>𝕨</span></code> that is ordered later than it, or <code><span class='Function'>≠</span><span class='Value'>𝕨</span></code> if no such cell exists. An equivalent formulation is that the result value for a cell of <code><span class='Value'>𝕩</span></code> is the number of major cells in <code><span class='Value'>𝕨</span></code> that match or precede it.</p> <p>BQN's <em>array ordering</em> is an extension of the number and character ordering given by <code><span class='Function'>≤</span></code> to arrays. In this system, any two arrays consisting of only numbers and characters for atoms can be compared with each other. Furthermore, some arrays that contain incomparable atoms (operations) might be comparable, if the result of the comparison can be decided before reaching these atoms. Array ordering does not depend on the fill elements for the two arguments.</p> <p>Here we define the array ordering using the terms "smaller" and "larger". For functions <code><span class='Function'>∧⍋</span></code>, "earlier" means "smaller" and "later" means "larger", while <code><span class='Function'>∨⍒</span></code> use the opposite definition, reversing the ordering.</p> <p>To compare two arrays, BQN first attempts to compare elements at corresponding indices, where two indices are considered to correspond if one is a suffix of the other. Elements are accessed in ravel order, that is, beginning at the all-zero index and first increasing the final number in the index, then the second-to-last, and so on. They are compared, using array comparison if necessary, until a non-matching pair of elements is found—in this case the ordering of this pair determines the ordering of the arrays—or one array has an index with no corresponding index in the other array. For example, comparing <code><span class='Number'>4</span><span class='Ligature'>‿</span><span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Function'>⥊</span><span class='Number'>1</span></code> with <code><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>5</span><span class='Function'>⥊</span><span class='Number'>1</span></code> stops when index <code><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>2</span></code> in <code><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>5</span><span class='Function'>⥊</span><span class='Number'>1</span></code> is reached, because the corresponding index <code><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>2</span></code> is out of range. The index <code><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>0</span></code> in the other array also has no corresponding index, but comes later in the index ordering. In this case, the array that lacks the index in question is considered smaller.</p> -<p>If two arrays have the same shape (ignoring leading <code><span class='Number'>1</span></code>s) and all matching element, or if they are both empty, then the element-by-element comparison will not find any differences. In this case, the arrays are compared first by rank, with the higher-rank array considered larger, and then by shape, beginning with the leading axes.</p> +<p>If two arrays have the same shape (ignoring leading <code><span class='Number'>1</span></code>s) and all matching elements, or if they are both empty, then the element-by-element comparison will not find any differences. In this case, the arrays are compared first by rank, with the higher-rank array considered larger, and then by shape, beginning with the leading axes.</p> <p>To compare two atoms, array ordering uses <code><span class='Function'>≤</span></code>: if <code><span class='Value'>𝕨</span><span class='Function'>≤</span><span class='Value'>𝕩</span></code> then <code><span class='Value'>𝕨</span></code> matches <code><span class='Value'>𝕩</span></code> if <code><span class='Value'>𝕩</span><span class='Function'>≤</span><span class='Value'>𝕨</span></code> and otherwise is smaller than <code><span class='Value'>𝕩</span></code> (and <code><span class='Value'>𝕩</span></code> is larger than <code><span class='Value'>𝕨</span></code>). To compare an atom to an array, the atom is promoted to an array by enclosing it; however, if the enclosed atom matches the array then the atom is considered smaller.</p> |
