From 8d58eafa341b5a65bed1a267bc34653e46bbc6e8 Mon Sep 17 00:00:00 2001
From: Marshall Lochbaum
Date: Wed, 14 Jul 2021 22:25:01 -0400
Subject: Document high-rank search function behavior

---
 docs/doc/leading.html | 8 +++---
 docs/doc/search.html | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/docs/doc/leading.html b/docs/doc/leading.html
index 72ce1450..69f04d77 100644
--- a/docs/doc/leading.html
+++ b/docs/doc/leading.html
@@ -201,12 +201,12 @@

If one argument is a unit, that is, it has no axes, then leading axis agreement reduces to APL's "scalar extension" (where "scalar" is equivalent to BQN's "unit"), where a single unit is matched with an entire array by repeating it at every application. A unit always agrees with any other array under leading axis agreement because it has no axes whose lengths would need to be checked.

With leading axis agreement, there are k+1 shapes for arrays that can be added (or any other function with Each) to a given array x without changing its rank. These are precisely the prefixes of x, with ranks from 0 to k inclusive. Arrays with larger rank can also be used as the other argument, but then the result shape will match that argument and not x.

Search functions

-

The search functions Bins (⍋⍒), Index of (⊐), Progressive Index of (⊒), and Member of (∊) look through cells of one argument to find cells of the other. Find (⍷) also does a search, but a slightly different one: it tries to find slices of cells of 𝕩 that match 𝕨.

+

The search functions, Index of (⊐), Progressive Index of (⊒), and Member of (∊), and also Bins (⍋⍒), look through cells of one argument to find cells of the other. Find (⍷) also does a search, but a slightly different one: it tries to find slices of cells of 𝕩 that match 𝕨.

-Searching through
-Look for
+Search in
+Search for
 Functions
@@ -223,4 +223,4 @@
-

For all of these functions but Find, the argument to search through is treated as a list of its major cells. It is the rank of these major cells (let's call this rank c) that determines how the other argument is treated. That argument must have rank at least c, and it is treated as an array of c-cells. For example, if the left argument to ⊐ is a matrix, then each 1-cell or row of 𝕩 is treated independently, and each one yields one number in the result: a 0-cell. The result rank of ⊐ is always 𝕨¬˜○=𝕩.

+

For all of these functions but Find, the searched-in argument is treated as a list of its major cells, and the searched-for argument is considered a collection of cells with the same rank. See the search function documentation.

diff --git a/docs/doc/search.html b/docs/doc/search.html
index 285f8b46..e24a3b3c 100644
--- a/docs/doc/search.html
+++ b/docs/doc/search.html
@@ -192,3 +192,72 @@
 ↗️
    ⊒˜ "anything at all"
 ⟨ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ⟩
 
+ +

Search functions are designed to search for multiple elements at once, and return an array of results. This is the array-oriented way to do it, and can allow faster algorithms to be used for the computation.

+↗️
    stuff ← "tacks"‿"paper"‿"string"‿"tape"
+
+    stuff ⊐ "tacks"‿"string"
+⟨ 0 2 ⟩
+
+

The first thing you might try to search for just one element does not go so well (and yes, this is a bad thing).

+↗️
    stuff ⊐ "string"
+⟨ 4 4 4 4 4 4 ⟩
+
+

Instead of interpreting 𝕩 as a single element, Index of treats it as a list, and 𝕨 doesn't even contain characters! Well, Enclose (<) makes an array from a single element…

+↗️
    stuff ⊐< "string"
+┌·   
+· 2  
+    ┘
+
+

Just as bad, this result has the right information, but is enclosed and could break the program later on. Remember that the result of a search function is always an array. We really want the first element.

+↗️
    stuff ⊑∘⊐< "string"
+2
+
+

If 𝕨 is fixed, then the version I prefer is to use Under to enclose the argument and then un-enclose the result. It requires 𝕨 to be bound to ⊐ because otherwise Under would enclose 𝕨 as well, since it applies 𝔾 to both arguments.

+↗️
    stuff⊸⊐⌾< "string"
+2
+
+

For Member of, the equivalent is ∊⟜stuff⌾<.

+

Higher ranks

+

So far we've shown search functions acting on lists. Well, and one example with a unit array slipped into the last section. In fact, if the searched-in array is a list, then the searched-for argument can have any rank.

+↗️
    ("high"≍"rank") ∊ "list arg"
+┌─         
+╵ 0 1 1 0  
+  1 1 0 0  
+          ┘
+
+

Member of and Index of compute each result number independently, so only the shape is different. Progressive Index of depends on the way entries in 𝕩 are ordered: it searches them in index order, so that (using Deshape) ⥊𝕨⊒𝕩 is 𝕨⊒⥊𝕩.

+↗️
    4‿4‿4 ⊒ 3‿2⥊4
+┌─     
+╵ 0 1  
+  2 3  
+  3 3  
+      ┘
+
+

But the searched-in argument doesn't have to be a list either! It can also be an array of higher rank. Rank 0 isn't allowed: if you want to "search" a unit, you're probably just looking for match.

+

The searched-in argument is treated as a list of its major cells. It's the rank of these major cells (let's call this rank c) that determines how the searched-for argument is treated. That argument must have rank c or more, and it's treated as an array of c-cells. For example, if the left argument to ⊐ is a rank-2 table, then each 1-cell (row) of 𝕩 is searched for independently, yielding one number in the result: a 0-cell.

+↗️
    ⊢ rows ← >"row"‿"rho"‿"row"‿"rue"
+┌─     
+╵"row  
+  rho  
+  row  
+  rue" 
+      ┘
+
+    rows ⊐ >"row"‿"row"‿"col"≍"rho"‿"cow"‿"col"
+┌─       
+╵ 0 0 4  
+  1 4 4  
+        ┘
+
+

So the result rank of ⊐ is always 𝕨¬˜○=𝕩 (that is, one plus the rank of 𝕩 minus the rank of 𝕨), with a result shape (1-=𝕨)↓≢𝕩, and 𝕨⊐𝕩 fails if 1>=𝕨 or the result rank would be negative. In the list case, we have 1==𝕨 (so the first condition holds), and the result rank resolves to =𝕩 (which can't be negative, so the second holds as well). The cell rank is 0, and the fact that a 0-cell of 𝕩 gives a 0-cell of the result is what causes the shape arithmetic to be so simple.

+

For Member of, the arguments are reversed relative to Index of, but otherwise everything's the same. This differs from APL, where entries are always elements, not cells. Many APL designers consider the APL definition to be a failure of foresight and would prefer BQN's definition—or rather A+'s or J's definition, as these languages were actually the first to use it. The rank-aware version is more flexible, as it allows both searching for elements and searching for rows. APL would return the first result in both cases.

+↗️
    (2‿1≍3‿1) ∊ 3‿1‿4‿3
+┌─     
+╵ 0 1  
+  1 1  
+      ┘
+
+    (2‿1≍3‿1) ∊ 3‿1≍4‿3
+⟨ 0 1 ⟩
+