aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorMarshall Lochbaum <mwlochbaum@gmail.com>2021-07-14 22:25:01 -0400
committerMarshall Lochbaum <mwlochbaum@gmail.com>2021-07-14 22:25:01 -0400
commit8d58eafa341b5a65bed1a267bc34653e46bbc6e8 (patch)
treebeca1336e8b5ce493a27bbe9cebcc26149f3c7a8
parentf113d9f57bdae219c87887c5e0781a5c824dc8e4 (diff)
Document high-rank search function behavior
-rw-r--r--doc/leading.md12
-rw-r--r--doc/search.md52
-rw-r--r--docs/doc/leading.html8
-rw-r--r--docs/doc/search.html69
-rwxr-xr-xdzref2
5 files changed, 132 insertions, 11 deletions
diff --git a/doc/leading.md b/doc/leading.md
index 467890df..2f75ee3d 100644
--- a/doc/leading.md
+++ b/doc/leading.md
@@ -96,11 +96,11 @@ With leading axis agreement, there are `k+1` shapes for arrays that can be added
### Search functions
-The [search functions](search.md) Bins (`⍋⍒`), Index of (`⊐`), Progressive Index of (`⊒`), and Member of (`∊`) look through cells of one argument to find cells of the other. Find (`⍷`) also does a search, but a slightly different one: it tries to find *slices* of cells of `𝕩` that match `𝕨`.
+The [search functions](search.md), Index of (`⊐`), Progressive Index of (`⊒`), and Member of (`∊`), and also [Bins](order.md#bins) (`⍋⍒`), look through cells of one argument to find cells of the other. Find (`⍷`) also does a search, but a slightly different one: it tries to find *slices* of cells of `𝕩` that match `𝕨`.
-| Searching through | Look for | Functions
-|-------------------|----------|----------
-| `𝕨` | `𝕩` | `⍋⍒⊐⊒`
-| `𝕩` | `𝕨` | `∊⍷`
+| Search in | Search for | Functions
+|-----------|------------|----------
+| `𝕨` | `𝕩` | `⍋⍒⊐⊒`
+| `𝕩` | `𝕨` | `∊⍷`
-For all of these functions but Find, the argument to search through is treated as a list of its major cells. It is the rank of these major cells—let's call this rank `c`—that determines how the other argument is treated. That argument must have rank at least `c`, and it is treated as an array of `c`-cells. For example, if the left argument to `⍋` is a matrix, then each 1-cell or row of `𝕩` is treated independently, and each one yields one number in the result: a 0-cell. The result rank of `⍋` is always `𝕨¬○=𝕩`.
+For all of these functions but Find, the searched-in argument is treated as a list of its major cells, and the searched-for argument is considered a collection of cells with the same rank. See the [search function documentation](search.md#higher-ranks).
diff --git a/doc/search.md b/doc/search.md
index d186e434..159ba2f6 100644
--- a/doc/search.md
+++ b/doc/search.md
@@ -114,3 +114,55 @@ This primitive gives an interesting way to implement the [ordinals](order.md#ord
Here's a goofy code golf tip: if the two arguments to Progressive Index of are the same, then every cell will be matched to itself, because all the previous indices are taken but the current one does match. So `⊒˜` is the same as `↕∘≠`.
⊒˜ "anything at all"
+
+## Single search
+
+Search functions are designed to search for multiple elements at once, and return an array of results. This is the array-oriented way to do it, and can allow faster algorithms to be used for the computation.
+
+ stuff ← "tacks"‿"paper"‿"string"‿"tape"
+
+ stuff ⊐ "tacks"‿"string"
+
+The first thing you might try to search for just one element does not go so well (and yes, this [is a bad thing](../commentary/problems.md#search-function-depth)).
+
+ stuff ⊐ "string"
+
+Instead of interpreting `𝕩` as a single element, Index of treats it as a list, and `𝕨` doesn't even contain characters! Well, [Enclose](enclose.md) (`<`) makes an array from a single element…
+
+ stuff ⊐< "string"
+
+Just as bad, this result has the right information, but is enclosed and could break the program later on. Remember that the result of a search function is *always* an array. We really want the [first](pick.md#first) element.
+
+ stuff ⊑∘⊐⟜< "string"
+
+If `𝕨` is fixed, then the version I prefer is to use Under to enclose the argument and then un-enclose the result. It requires `𝕨` to be bound to `⊐` because otherwise Under would enclose `𝕨` as well, since it applies `𝔾` to both arguments.
+
+ stuff⊸⊐⌾< "string"
+
+For Member of, the equivalent is `∊⟜stuff⌾<`.
+
+## Higher ranks
+
+So far we've shown set functions acting on lists. Well, and one example with a unit array slipped into the last section. In fact, if the searched-in array is a list, then the searched-for argument can have any rank.
+
+ ("high"≍"rank") ∊ "list arg"
+
+Member of and Index of compute each result number independently, so only the shape is different. Progressive Index of depends on the way entries in `𝕩` are ordered: it searches them in index order, so that (using [Deshape](reshape.md)) `⥊𝕨⊒𝕩` is `𝕨⊒⥊𝕩`.
+
+ 4‿4‿4 ⊒ 3‿2⥊4
+
+But the seached-in argument doesn't have to be a list either! It can also be an array of higher rank. Rank 0 isn't allowed: if you want to "search" a unit, you're probably just looking for [match](match.md).
+
+The searched-in argument is treated as a list of its major cells. It's the rank of these major cells—let's call this rank `c`—that determines how the searched-for argument is treated. That argument must have rank `c` or more, and it's treated as an array of `c`-cells. For example, if the left argument to `⊐` is a rank-2 table, then each 1-cell (row) of `𝕩` is searched for independently, yielding one number in the result: a 0-cell.
+
+ ⊢ rows ← >"row"‿"rho"‿"row"‿"rue"
+
+ rows ⊐ >"row"‿"row"‿"col"≍"rho"‿"cow"‿"col"
+
+So the result rank of `⊐` is always `𝕨¬○=𝕩`, with a result shape `(1-˜=𝕨)↓≢𝕩`, and `𝕨⊐𝕩` fails if `1>=𝕩` or the result rank would be negative. In the list case, we have `1==𝕩` (so the first condition holds), and the result rank resolves to `=𝕨` (which can't be negative, so the second holds as well). The cell rank of `𝕩` is 0, and the fact that a 0-cell of `𝕩` gives a 0-cell of the result is what causes the shape arithmetic to be so simple.
+
+For Member of, the arguments are reversed relative to Index of, but otherwise everything's the same. This differs from APL, where entries are always elements, not cells. Many APL designers consider the APL definition to be a failure of foresight and would prefer BQN's definition—or rather A+'s or J's definition, as these languages were actually the first to use it. The rank-aware version is more flexible, as it allows both searching for elements and searching for rows. APL would return the first result in both cases.
+
+ (2‿1≍3‿1) ∊ 3‿1‿4‿3
+
+ (2‿1≍3‿1) ∊ 3‿1≍4‿3
diff --git a/docs/doc/leading.html b/docs/doc/leading.html
index 72ce1450..69f04d77 100644
--- a/docs/doc/leading.html
+++ b/docs/doc/leading.html
@@ -201,12 +201,12 @@
<p>If one argument is a unit, that is, it has no axes, then leading axis agreement reduces to APL's &quot;scalar extension&quot; (where &quot;scalar&quot; is equivalent to BQN's &quot;unit&quot;), where a single unit is matched with an entire array by repeating it at every application. A unit always agrees with any other array under leading axis agreement because it has no axes whose lengths would need to be checked.</p>
<p>With leading axis agreement, there are <code><span class='Value'>k</span><span class='Function'>+</span><span class='Number'>1</span></code> shapes for arrays that can be added (or any other function with Each) to a given array <code><span class='Value'>x</span></code> without changing its rank. These are precisely the prefixes of <code><span class='Function'>≢</span><span class='Value'>x</span></code>, with ranks from <code><span class='Number'>0</span></code> to <code><span class='Value'>k</span></code> inclusive. Arrays with larger rank can also be used as the other argument, but then the result shape will match that argument and not <code><span class='Value'>x</span></code>.</p>
<h3 id="search-functions">Search functions</h3>
-<p>The <a href="search.html">search functions</a> Bins (<code><span class='Function'>⍋⍒</span></code>), Index of (<code><span class='Function'>⊐</span></code>), Progressive Index of (<code><span class='Function'>⊒</span></code>), and Member of (<code><span class='Function'>∊</span></code>) look through cells of one argument to find cells of the other. Find (<code><span class='Function'>⍷</span></code>) also does a search, but a slightly different one: it tries to find <em>slices</em> of cells of <code><span class='Value'>𝕩</span></code> that match <code><span class='Value'>𝕨</span></code>.</p>
+<p>The <a href="search.html">search functions</a>, Index of (<code><span class='Function'>⊐</span></code>), Progressive Index of (<code><span class='Function'>⊒</span></code>), and Member of (<code><span class='Function'>∊</span></code>), and also <a href="order.html#bins">Bins</a> (<code><span class='Function'>⍋⍒</span></code>), look through cells of one argument to find cells of the other. Find (<code><span class='Function'>⍷</span></code>) also does a search, but a slightly different one: it tries to find <em>slices</em> of cells of <code><span class='Value'>𝕩</span></code> that match <code><span class='Value'>𝕨</span></code>.</p>
<table>
<thead>
<tr>
-<th>Searching through</th>
-<th>Look for</th>
+<th>Search in</th>
+<th>Search for</th>
<th>Functions</th>
</tr>
</thead>
@@ -223,4 +223,4 @@
</tr>
</tbody>
</table>
-<p>For all of these functions but Find, the argument to search through is treated as a list of its major cells. It is the rank of these major cells—let's call this rank <code><span class='Value'>c</span></code>—that determines how the other argument is treated. That argument must have rank at least <code><span class='Value'>c</span></code>, and it is treated as an array of <code><span class='Value'>c</span></code>-cells. For example, if the left argument to <code><span class='Function'>⍋</span></code> is a matrix, then each 1-cell or row of <code><span class='Value'>𝕩</span></code> is treated independently, and each one yields one number in the result: a 0-cell. The result rank of <code><span class='Function'>⍋</span></code> is always <code><span class='Value'>𝕨</span><span class='Function'>¬</span><span class='Modifier2'>○</span><span class='Function'>=</span><span class='Value'>𝕩</span></code>.</p>
+<p>For all of these functions but Find, the searched-in argument is treated as a list of its major cells, and the searched-for argument is considered a collection of cells with the same rank. See the <a href="search.html#higher-ranks">search function documentation</a>.</p>
diff --git a/docs/doc/search.html b/docs/doc/search.html
index 285f8b46..e24a3b3c 100644
--- a/docs/doc/search.html
+++ b/docs/doc/search.html
@@ -192,3 +192,72 @@
<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=4oqSy5wgImFueXRoaW5nIGF0IGFsbCI=">↗️</a><pre> <span class='Function'>⊒</span><span class='Modifier'>˜</span> <span class='String'>&quot;anything at all&quot;</span>
⟨ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ⟩
</pre>
+<h2 id="single-search">Single search</h2>
+<p>Search functions are designed to search for multiple elements at once, and return an array of results. This is the array-oriented way to do it, and can allow faster algorithms to be used for the computation.</p>
+<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=c3R1ZmYg4oaQICJ0YWNrcyLigL8icGFwZXIi4oC/InN0cmluZyLigL8idGFwZSIKCnN0dWZmIOKKkCAidGFja3Mi4oC/InN0cmluZyI=">↗️</a><pre> <span class='Value'>stuff</span> <span class='Gets'>←</span> <span class='String'>&quot;tacks&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;paper&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;string&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;tape&quot;</span>
+
+ <span class='Value'>stuff</span> <span class='Function'>⊐</span> <span class='String'>&quot;tacks&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;string&quot;</span>
+⟨ 0 2 ⟩
+</pre>
+<p>The first thing you might try to search for just one element does not go so well (and yes, this <a href="../commentary/problems.html#search-function-depth">is a bad thing</a>).</p>
+<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=c3R1ZmYg4oqQICJzdHJpbmci">↗️</a><pre> <span class='Value'>stuff</span> <span class='Function'>⊐</span> <span class='String'>&quot;string&quot;</span>
+⟨ 4 4 4 4 4 4 ⟩
+</pre>
+<p>Instead of interpreting <code><span class='Value'>𝕩</span></code> as a single element, Index of treats it as a list, and <code><span class='Value'>𝕨</span></code> doesn't even contain characters! Well, <a href="enclose.html">Enclose</a> (<code><span class='Function'>&lt;</span></code>) makes an array from a single element…</p>
+<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=c3R1ZmYg4oqQPCAic3RyaW5nIg==">↗️</a><pre> <span class='Value'>stuff</span> <span class='Function'>⊐&lt;</span> <span class='String'>&quot;string&quot;</span>
+┌·
+· 2
+ ┘
+</pre>
+<p>Just as bad, this result has the right information, but is enclosed and could break the program later on. Remember that the result of a search function is <em>always</em> an array. We really want the <a href="pick.html#first">first</a> element.</p>
+<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=c3R1ZmYg4oqR4oiY4oqQ4p+cPCAic3RyaW5nIg==">↗️</a><pre> <span class='Value'>stuff</span> <span class='Function'>⊑</span><span class='Modifier2'>∘</span><span class='Function'>⊐</span><span class='Modifier2'>⟜</span><span class='Function'>&lt;</span> <span class='String'>&quot;string&quot;</span>
+2
+</pre>
+<p>If <code><span class='Value'>𝕨</span></code> is fixed, then the version I prefer is to use Under to enclose the argument and then un-enclose the result. It requires <code><span class='Value'>𝕨</span></code> to be bound to <code><span class='Function'>⊐</span></code> because otherwise Under would enclose <code><span class='Value'>𝕨</span></code> as well, since it applies <code><span class='Function'>𝔾</span></code> to both arguments.</p>
+<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=c3R1ZmbiirjiipDijL48ICJzdHJpbmci">↗️</a><pre> <span class='Value'>stuff</span><span class='Modifier2'>⊸</span><span class='Function'>⊐</span><span class='Modifier2'>⌾</span><span class='Function'>&lt;</span> <span class='String'>&quot;string&quot;</span>
+2
+</pre>
+<p>For Member of, the equivalent is <code><span class='Function'>∊</span><span class='Modifier2'>⟜</span><span class='Value'>stuff</span><span class='Modifier2'>⌾</span><span class='Function'>&lt;</span></code>.</p>
+<h2 id="higher-ranks">Higher ranks</h2>
+<p>So far we've shown set functions acting on lists. Well, and one example with a unit array slipped into the last section. In fact, if the searched-in array is a list, then the searched-for argument can have any rank.</p>
+<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=KCJoaWdoIuKJjSJyYW5rIikg4oiKICJsaXN0IGFyZyI=">↗️</a><pre> <span class='Paren'>(</span><span class='String'>&quot;high&quot;</span><span class='Function'>≍</span><span class='String'>&quot;rank&quot;</span><span class='Paren'>)</span> <span class='Function'>∊</span> <span class='String'>&quot;list arg&quot;</span>
+┌─
+╵ 0 1 1 0
+ 1 1 0 0
+ ┘
+</pre>
+<p>Member of and Index of compute each result number independently, so only the shape is different. Progressive Index of depends on the way entries in <code><span class='Value'>𝕩</span></code> are ordered: it searches them in index order, so that (using <a href="reshape.html">Deshape</a>) <code><span class='Function'>⥊</span><span class='Value'>𝕨</span><span class='Function'>⊒</span><span class='Value'>𝕩</span></code> is <code><span class='Value'>𝕨</span><span class='Function'>⊒⥊</span><span class='Value'>𝕩</span></code>.</p>
+<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=NOKAvzTigL80IOKKkiAz4oC/MuKlijQ=">↗️</a><pre> <span class='Number'>4</span><span class='Ligature'>‿</span><span class='Number'>4</span><span class='Ligature'>‿</span><span class='Number'>4</span> <span class='Function'>⊒</span> <span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Function'>⥊</span><span class='Number'>4</span>
+┌─
+╵ 0 1
+ 2 3
+ 3 3
+ ┘
+</pre>
+<p>But the seached-in argument doesn't have to be a list either! It can also be an array of higher rank. Rank 0 isn't allowed: if you want to &quot;search&quot; a unit, you're probably just looking for <a href="match.html">match</a>.</p>
+<p>The searched-in argument is treated as a list of its major cells. It's the rank of these major cells—let's call this rank <code><span class='Value'>c</span></code>—that determines how the searched-for argument is treated. That argument must have rank <code><span class='Value'>c</span></code> or more, and it's treated as an array of <code><span class='Value'>c</span></code>-cells. For example, if the left argument to <code><span class='Function'>⊐</span></code> is a rank-2 table, then each 1-cell (row) of <code><span class='Value'>𝕩</span></code> is searched for independently, yielding one number in the result: a 0-cell.</p>
+<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=4oqiIHJvd3Mg4oaQID4icm93IuKAvyJyaG8i4oC/InJvdyLigL8icnVlIgoKcm93cyDiipAgPiJyb3ci4oC/InJvdyLigL8iY29sIuKJjSJyaG8i4oC/ImNvdyLigL8iY29sIg==">↗️</a><pre> <span class='Function'>⊢</span> <span class='Value'>rows</span> <span class='Gets'>←</span> <span class='Function'>&gt;</span><span class='String'>&quot;row&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;rho&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;row&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;rue&quot;</span>
+┌─
+╵"row
+ rho
+ row
+ rue"
+ ┘
+
+ <span class='Value'>rows</span> <span class='Function'>⊐</span> <span class='Function'>&gt;</span><span class='String'>&quot;row&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;row&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;col&quot;</span><span class='Function'>≍</span><span class='String'>&quot;rho&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;cow&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;col&quot;</span>
+┌─
+╵ 0 0 4
+ 1 4 4
+ ┘
+</pre>
+<p>So the result rank of <code><span class='Function'>⊐</span></code> is always <code><span class='Value'>𝕨</span><span class='Function'>¬</span><span class='Modifier2'>○</span><span class='Function'>=</span><span class='Value'>𝕩</span></code>, with a result shape <code><span class='Paren'>(</span><span class='Number'>1</span><span class='Function'>-</span><span class='Modifier'>˜</span><span class='Function'>=</span><span class='Value'>𝕨</span><span class='Paren'>)</span><span class='Function'>↓≢</span><span class='Value'>𝕩</span></code>, and <code><span class='Value'>𝕨</span><span class='Function'>⊐</span><span class='Value'>𝕩</span></code> fails if <code><span class='Number'>1</span><span class='Function'>&gt;=</span><span class='Value'>𝕩</span></code> or the result rank would be negative. In the list case, we have <code><span class='Number'>1</span><span class='Function'>==</span><span class='Value'>𝕩</span></code> (so the first condition holds), and the result rank resolves to <code><span class='Function'>=</span><span class='Value'>𝕨</span></code> (which can't be negative, so the second holds as well). The cell rank of <code><span class='Value'>𝕩</span></code> is 0, and the fact that a 0-cell of <code><span class='Value'>𝕩</span></code> gives a 0-cell of the result is what causes the shape arithmetic to be so simple.</p>
+<p>For Member of, the arguments are reversed relative to Index of, but otherwise everything's the same. This differs from APL, where entries are always elements, not cells. Many APL designers consider the APL definition to be a failure of foresight and would prefer BQN's definition—or rather A+'s or J's definition, as these languages were actually the first to use it. The rank-aware version is more flexible, as it allows both searching for elements and searching for rows. APL would return the first result in both cases.</p>
+<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=KDLigL8x4omNM+KAvzEpIOKIiiAz4oC/MeKAvzTigL8zCgooMuKAvzHiiY0z4oC/MSkg4oiKIDPigL8x4omNNOKAvzM=">↗️</a><pre> <span class='Paren'>(</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>1</span><span class='Function'>≍</span><span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>1</span><span class='Paren'>)</span> <span class='Function'>∊</span> <span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>1</span><span class='Ligature'>‿</span><span class='Number'>4</span><span class='Ligature'>‿</span><span class='Number'>3</span>
+┌─
+╵ 0 1
+ 1 1
+ ┘
+
+ <span class='Paren'>(</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>1</span><span class='Function'>≍</span><span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>1</span><span class='Paren'>)</span> <span class='Function'>∊</span> <span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>1</span><span class='Function'>≍</span><span class='Number'>4</span><span class='Ligature'>‿</span><span class='Number'>3</span>
+⟨ 0 1 ⟩
+</pre>
diff --git a/dzref b/dzref
index e7d92832..f60e0365 100755
--- a/dzref
+++ b/dzref
@@ -64,7 +64,7 @@ Find←{
⍷ ← ∊⊸/ ⊘ Find
OccurrenceCount ← ⊐˜(⊢-⊏)⍋∘⍋
-ProgressiveIndexOf ← {𝕨⊐○(≍˘⟜OccurrenceCount𝕨⊸⊐)𝕩}
+ProgressiveIndexOf ← {𝕨⊐○(((≢∾2˙)⥊≍˘⟜OccurrenceCount∘⥊)𝕨⊸⊐)𝕩}
⊒ ← OccurrenceCount⊘ ProgressiveIndexOf
"