| -rw-r--r-- | doc/group.md | 14 |
| -rw-r--r-- | docs/doc/group.html | 19 |
2 files changed, 22 insertions, 11 deletions
diff --git a/doc/group.md b/doc/group.md
index 7029e8f5..aceea8ed 100644
--- a/doc/group.md
+++ b/doc/group.md
@@ -53,16 +53,19 @@ b ← (0.4⌈0.2+≠¨zf) {∾"M vhv"∾¨FmtNum (0‿1‿1‿0‿1⊏d)×(⟨
 Group operates on a list of atomic-number [indices](indices.md) `𝕨` and an array `𝕩`, treated as a list of its major cells, to produce a list of groups, each containing some of the cells from `𝕩`. The two arguments have the same length, and each cell in `𝕩` is paired with the index in `𝕨` at the same position, which indicates which result group should include that cell.

 0‿1‿2‿0‿1 ≍ "abcde" # Corresponding indices and values
+
 0‿1‿2‿0‿1 ⊔ "abcde" # Values grouped by index

-A few extra options can be useful in some circumstances. First, an "index" of `¯1` in `𝕨` indicates that the corresponding cell should be dropped and not appear in the result. Second, `𝕨` is allowed to have an extra element after the end, which gives a minimum length for the result: otherwise, the result will be just long enough to accomodate the highest index in `𝕨`.
+A few extra options can be useful in some circumstances. First, an "index" of `¯1` in `𝕨` indicates that the corresponding cell should be dropped and not appear in the result. Second, `𝕨` is allowed to have an extra element after the end, which gives a minimum length for the result: otherwise, the result will be just long enough to accomodate the highest index in `𝕨` (it might seem like the last element should be treated like an index, making the minimum length one higher, but the length version usually leads to simpler arithmetic).

 0‿¯1‿2‿2‿¯1 ⊔ "abcde" # Drop c and e
+
 0‿1‿2‿2‿1‿6 ⊔ "abcde" # Length-6 result

 A third extension is that `𝕨` doesn't really have to be a list: if not, then it groups `-=𝕨`-cells of `𝕩` instead of just `¯1`-cells. These cells are placed in index order. This extension isn't compatible with the second option from above, because it's usually not possible to add just one extra element to a non-list array. One usage is to group the diagonals of a table. See if you can figure out how the code below does this.

 ⊢ a ← 'a'+⥊⟜(↕×´)3‿5
+
 (+⌜´·↕¨≢)⊸⊔ a

 For a concrete example, we might choose to group a list of words by length. Within each group, cells maintain the ordering they had in the list originally.
@@ -75,18 +78,23 @@ For a concrete example, we might choose to group a list of words by length. With
 If we'd like to ignore words of 0 letters, or more than 5, we can set all word lengths greater than 5 to 0, then reduce the lengths by 1. Two words end up with left argument values of ¯1 and are omitted from the result.

 1 -˜ ≤⟜5⊸× ≠¨ phrase
+
 ≍˘ {1-˜≤⟜5⊸×≠¨𝕩}⊸⊔ phrase

 Note that the length of the result is determined by the largest index. So the result never includes trailing empty groups. A reader of the above code might expect 5 groups (lengths 1 through 5), but there are no words of length 5, so the last group isn't there. To ensure the result always has 5 groups, we can add a `5` at the end of the left argument.

 ≍˘ {5∾˜1-˜≤⟜5⊸×≠¨𝕩}⊸⊔ phrase

-When Group is called dyadically, the left argument is used for the indices and the right is used for values, as seen above. When it is called monadically, the right argument, which must be a list, gives the indices and the values grouped are the right argument's indices, that is, `↕≠𝕩`.
+### Group Indices
+
+Above, Group has two arguments, and `𝕨` gives the indices and `𝕩` is the values to be grouped. In the one-argument case, `𝕩` now gives the result indices, and the values grouped are indices related to `𝕩`. For a numeric list, `⊔𝕩` is `𝕩⊔↕≠𝕩`.

 ≍˘ ⊔ 2‿3‿¯1‿2

 Here, the index 2 appears at indices 0 and 3 while the index 3 appears at index 1.

+But `𝕩` can also be a list of numeric arrays. In this case the indices `↕∾≢¨𝕩` will be grouped by `𝕩` according to the multidimensional grouping documented in the next section. Since the argument to [Range](range.md) (`↕`) is now a list, each index to be grouped is a list instead of a number. As with `↕`, the depth of the result of Group Indices is always one greater than that of its argument. One consequence is that for an array `a` of any rank, `⊔⋈a` groups the indices `↕≢a`.
+
 ### Multidimensional grouping

 Dyadic Group allows the right argument to be grouped along multiple axes by using a nested left argument. In this case, the left argument must be a list of numeric lists, and the result has rank `≠𝕨` while its elements—as always—have the same rank as `𝕩`. The result shape is `1+⌈´¨𝕨`, while the shape of element `i⊑𝕨⊔𝕩` is `i+´∘=¨𝕨`. If every element of `𝕨` is sorted ascending and contains only non-negative numbers, we have `𝕩≡∾𝕨⊔𝕩`, that is, [Join](join.md#join) is the inverse of Partition.
@@ -97,8 +105,6 @@ Here we split up a rank-2 array into a rank-2 array of rank-2 arrays. Along the

 Each group `i⊑𝕨⊔𝕩` is composed of the cells `j<¨⊸⊏𝕩` such that `i≢j⊑¨𝕨`. The groups retain their array structure and ordering along each argument axis. Using multidimensional Replicate we can say that `i⊑𝕨⊔𝕩` is `(i=𝕨)/𝕩`.

-The monadic case works similarly: Group Indices always satisfies `⊔𝕩 ←→ 𝕩⊔↕≠⚇1𝕩`. As with `↕`, the depth of the result of Group Indices is always one greater than that of its argument. A depth-0 argument is not allowed.
-
 ## Properties

 Group is closely related to the [inverse of Indices](replicate.md#inverse), `/⁼`.
In fact, inverse Indices called on the index argument gives the length of each group: diff --git a/docs/doc/group.html b/docs/doc/group.html index 03357675..8184f09f 100644 --- a/docs/doc/group.html +++ b/docs/doc/group.html @@ -31,27 +31,30 @@ <h2 id="definition"><a class="header" href="#definition">Definition</a></h2> <p>Group operates on a list of atomic-number <a href="indices.html">indices</a> <code><span class='Value'>𝕨</span></code> and an array <code><span class='Value'>𝕩</span></code>, treated as a list of its major cells, to produce a list of groups, each containing some of the cells from <code><span class='Value'>𝕩</span></code>. The two arguments have the same length, and each cell in <code><span class='Value'>𝕩</span></code> is paired with the index in <code><span class='Value'>𝕨</span></code> at the same position, which indicates which result group should include that cell.</p> -<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=MOKAvzHigL8y4oC/MOKAvzEg4omNICJhYmNkZSIgICMgQ29ycmVzcG9uZGluZyBpbmRpY2VzIGFuZCB2YWx1ZXMKMOKAvzHigL8y4oC/MOKAvzEg4oqUICJhYmNkZSIgICMgVmFsdWVzIGdyb3VwZWQgYnkgaW5kZXg=">↗️</a><pre> <span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>1</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>1</span> <span class='Function'>≍</span> <span class='String'>"abcde"</span> <span class='Comment'># Corresponding indices and values +<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=MOKAvzHigL8y4oC/MOKAvzEg4omNICJhYmNkZSIgICMgQ29ycmVzcG9uZGluZyBpbmRpY2VzIGFuZCB2YWx1ZXMKCjDigL8x4oC/MuKAvzDigL8xIOKKlCAiYWJjZGUiICAjIFZhbHVlcyBncm91cGVkIGJ5IGluZGV4">↗️</a><pre> <span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>1</span><span 
class='Ligature'>‿</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>1</span> <span class='Function'>≍</span> <span class='String'>"abcde"</span> <span class='Comment'># Corresponding indices and values </span>┌─ ╵ 0 1 2 0 1 'a' 'b' 'c' 'd' 'e' ┘ + <span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>1</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>1</span> <span class='Function'>⊔</span> <span class='String'>"abcde"</span> <span class='Comment'># Values grouped by index </span>⟨ "ad" "be" "c" ⟩ </pre> -<p>A few extra options can be useful in some circumstances. First, an "index" of <code><span class='Number'>¯1</span></code> in <code><span class='Value'>𝕨</span></code> indicates that the corresponding cell should be dropped and not appear in the result. Second, <code><span class='Value'>𝕨</span></code> is allowed to have an extra element after the end, which gives a minimum length for the result: otherwise, the result will be just long enough to accomodate the highest index in <code><span class='Value'>𝕨</span></code>.</p> -<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=MOKAv8KvMeKAvzLigL8y4oC/wq8xIOKKlCAiYWJjZGUiICAjIERyb3AgYyBhbmQgZQow4oC/MeKAvzLigL8y4oC/MeKAvzYg4oqUICJhYmNkZSIgICMgTGVuZ3RoLTYgcmVzdWx0">↗️</a><pre> <span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>¯1</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>¯1</span> <span class='Function'>⊔</span> <span class='String'>"abcde"</span> <span class='Comment'># Drop c and e +<p>A few extra options can be useful in some circumstances. 
First, an "index" of <code><span class='Number'>¯1</span></code> in <code><span class='Value'>𝕨</span></code> indicates that the corresponding cell should be dropped and not appear in the result. Second, <code><span class='Value'>𝕨</span></code> is allowed to have an extra element after the end, which gives a minimum length for the result: otherwise, the result will be just long enough to accomodate the highest index in <code><span class='Value'>𝕨</span></code> (it might seem like the last element should be treated like an index, making the minimum length one higher, but the length version usually leads to simpler arithmetic).</p> +<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=MOKAv8KvMeKAvzLigL8y4oC/wq8xIOKKlCAiYWJjZGUiICAjIERyb3AgYyBhbmQgZQoKMOKAvzHigL8y4oC/MuKAvzHigL82IOKKlCAiYWJjZGUiICAjIExlbmd0aC02IHJlc3VsdA==">↗️</a><pre> <span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>¯1</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>¯1</span> <span class='Function'>⊔</span> <span class='String'>"abcde"</span> <span class='Comment'># Drop c and e </span>⟨ "a" ⟨⟩ "cd" ⟩ + <span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>1</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>1</span><span class='Ligature'>‿</span><span class='Number'>6</span> <span class='Function'>⊔</span> <span class='String'>"abcde"</span> <span class='Comment'># Length-6 result </span>⟨ "a" "be" "cd" ⟨⟩ ⟨⟩ ⟨⟩ ⟩ </pre> <p>A third extension is that <code><span class='Value'>𝕨</span></code> doesn't really have to be a list: if not, then it groups <code><span class='Function'>-=</span><span class='Value'>𝕨</span></code>-cells of 
<code><span class='Value'>𝕩</span></code> instead of just <code><span class='Number'>¯1</span></code>-cells. These cells are placed in index order. This extension isn't compatible with the second option from above, because it's usually not possible to add just one extra element to a non-list array. One usage is to group the diagonals of a table. See if you can figure out how the code below does this.</p> -<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=4oqiIGEg4oaQICdhJyvipYrin5wo4oaVw5fCtCkz4oC/NQooK+KMnMK0wrfihpXCqOKJoiniirjiipQgYQ==">↗️</a><pre> <span class='Function'>⊢</span> <span class='Value'>a</span> <span class='Gets'>←</span> <span class='String'>'a'</span><span class='Function'>+⥊</span><span class='Modifier2'>⟜</span><span class='Paren'>(</span><span class='Function'>↕×</span><span class='Modifier'>´</span><span class='Paren'>)</span><span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>5</span> +<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=4oqiIGEg4oaQICdhJyvipYrin5wo4oaVw5fCtCkz4oC/NQoKKCvijJzCtMK34oaVwqjiiaIp4oq44oqUIGE=">↗️</a><pre> <span class='Function'>⊢</span> <span class='Value'>a</span> <span class='Gets'>←</span> <span class='String'>'a'</span><span class='Function'>+⥊</span><span class='Modifier2'>⟜</span><span class='Paren'>(</span><span class='Function'>↕×</span><span class='Modifier'>´</span><span class='Paren'>)</span><span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>5</span> ┌─ ╵"abcde fghij klmno" ┘ + <span class='Paren'>(</span><span class='Function'>+</span><span class='Modifier'>⌜´</span><span class='Nothing'>·</span><span class='Function'>↕</span><span class='Modifier'>¨</span><span class='Function'>≢</span><span class='Paren'>)</span><span class='Modifier2'>⊸</span><span class='Function'>⊔</span> <span class='Value'>a</span> ⟨ "a" "bf" "cgk" 
"dhl" "eim" "jn" "o" ⟩ </pre> @@ -72,8 +75,9 @@ </pre> <p>(Could we define <code><span class='Value'>phrase</span></code> more easily? See <a href="#partitioning">below</a>.)</p> <p>If we'd like to ignore words of 0 letters, or more than 5, we can set all word lengths greater than 5 to 0, then reduce the lengths by 1. Two words end up with left argument values of ¯1 and are omitted from the result.</p> -<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=MSAty5wg4omk4p+cNeKKuMOXIOKJoMKoIHBocmFzZQriiY3LmCB7MS3LnOKJpOKfnDXiirjDl+KJoMKo8J2VqX3iirjiipQgcGhyYXNl">↗️</a><pre> <span class='Number'>1</span> <span class='Function'>-</span><span class='Modifier'>˜</span> <span class='Function'>≤</span><span class='Modifier2'>⟜</span><span class='Number'>5</span><span class='Modifier2'>⊸</span><span class='Function'>×</span> <span class='Function'>≠</span><span class='Modifier'>¨</span> <span class='Value'>phrase</span> +<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=MSAty5wg4omk4p+cNeKKuMOXIOKJoMKoIHBocmFzZQoK4omNy5ggezEty5ziiaTin5w14oq4w5fiiaDCqPCdlal94oq44oqUIHBocmFzZQ==">↗️</a><pre> <span class='Number'>1</span> <span class='Function'>-</span><span class='Modifier'>˜</span> <span class='Function'>≤</span><span class='Modifier2'>⟜</span><span class='Number'>5</span><span class='Modifier2'>⊸</span><span class='Function'>×</span> <span class='Function'>≠</span><span class='Modifier'>¨</span> <span class='Value'>phrase</span> ⟨ 2 3 ¯1 1 0 3 1 ¯1 ⟩ + <span class='Function'>≍</span><span class='Modifier'>˘</span> <span class='Brace'>{</span><span class='Number'>1</span><span class='Function'>-</span><span class='Modifier'>˜</span><span class='Function'>≤</span><span class='Modifier2'>⟜</span><span class='Number'>5</span><span class='Modifier2'>⊸</span><span class='Function'>×≠</span><span class='Modifier'>¨</span><span class='Value'>𝕩</span><span 
class='Brace'>}</span><span class='Modifier2'>⊸</span><span class='Function'>⊔</span> <span class='Value'>phrase</span> ┌─ ╵ ⟨ "a" ⟩ @@ -92,7 +96,8 @@ ⟨⟩ ┘ </pre> -<p>When Group is called dyadically, the left argument is used for the indices and the right is used for values, as seen above. When it is called monadically, the right argument, which must be a list, gives the indices and the values grouped are the right argument's indices, that is, <code><span class='Function'>↕≠</span><span class='Value'>𝕩</span></code>.</p> +<h3 id="group-indices"><a class="header" href="#group-indices">Group Indices</a></h3> +<p>Above, Group has two arguments, and <code><span class='Value'>𝕨</span></code> gives the indices and <code><span class='Value'>𝕩</span></code> is the values to be grouped. In the one-argument case, <code><span class='Value'>𝕩</span></code> now gives the result indices, and the values grouped are indices related to <code><span class='Value'>𝕩</span></code>. For a numeric list, <code><span class='Function'>⊔</span><span class='Value'>𝕩</span></code> is <code><span class='Value'>𝕩</span><span class='Function'>⊔↕≠</span><span class='Value'>𝕩</span></code>.</p> <a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=4omNy5gg4oqUIDLigL8z4oC/wq8x4oC/Mg==">↗️</a><pre> <span class='Function'>≍</span><span class='Modifier'>˘</span> <span class='Function'>⊔</span> <span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>¯1</span><span class='Ligature'>‿</span><span class='Number'>2</span> ┌─ ╵ ⟨⟩ @@ -102,6 +107,7 @@ ┘ </pre> <p>Here, the index 2 appears at indices 0 and 3 while the index 3 appears at index 1.</p> +<p>But <code><span class='Value'>𝕩</span></code> can also be a list of numeric arrays. 
In this case the indices <code><span class='Function'>↕∾≢</span><span class='Modifier'>¨</span><span class='Value'>𝕩</span></code> will be grouped by <code><span class='Value'>𝕩</span></code> according to the multidimensional grouping documented in the next section. Since the argument to <a href="range.html">Range</a> (<code><span class='Function'>↕</span></code>) is now a list, each index to be grouped is a list instead of a number. As with <code><span class='Function'>↕</span></code>, the depth of the result of Group Indices is always one greater than that of its argument. One consequence is that for an array <code><span class='Value'>a</span></code> of any rank, <code><span class='Function'>⊔⋈</span><span class='Value'>a</span></code> groups the indices <code><span class='Function'>↕≢</span><span class='Value'>a</span></code>.</p> <h3 id="multidimensional-grouping"><a class="header" href="#multidimensional-grouping">Multidimensional grouping</a></h3> <p>Dyadic Group allows the right argument to be grouped along multiple axes by using a nested left argument. In this case, the left argument must be a list of numeric lists, and the result has rank <code><span class='Function'>≠</span><span class='Value'>𝕨</span></code> while its elements—as always—have the same rank as <code><span class='Value'>𝕩</span></code>. The result shape is <code><span class='Number'>1</span><span class='Function'>+⌈</span><span class='Modifier'>´¨</span><span class='Value'>𝕨</span></code>, while the shape of element <code><span class='Value'>i</span><span class='Function'>⊑</span><span class='Value'>𝕨</span><span class='Function'>⊔</span><span class='Value'>𝕩</span></code> is <code><span class='Value'>i</span><span class='Function'>+</span><span class='Modifier'>´</span><span class='Modifier2'>∘</span><span class='Function'>=</span><span class='Modifier'>¨</span><span class='Value'>𝕨</span></code>. 
If every element of <code><span class='Value'>𝕨</span></code> is sorted ascending and contains only non-negative numbers, we have <code><span class='Value'>𝕩</span><span class='Function'>≡∾</span><span class='Value'>𝕨</span><span class='Function'>⊔</span><span class='Value'>𝕩</span></code>, that is, <a href="join.html#join">Join</a> is the inverse of Partition.</p> <p>Here we split up a rank-2 array into a rank-2 array of rank-2 arrays. Along the first axis we simply separate the first pair and second pair of rows—a partition. Along the second axis we separate odd from even indices.</p> @@ -118,7 +124,6 @@ ┘ </pre> <p>Each group <code><span class='Value'>i</span><span class='Function'>⊑</span><span class='Value'>𝕨</span><span class='Function'>⊔</span><span class='Value'>𝕩</span></code> is composed of the cells <code><span class='Value'>j</span><span class='Function'><</span><span class='Modifier'>¨</span><span class='Modifier2'>⊸</span><span class='Function'>⊏</span><span class='Value'>𝕩</span></code> such that <code><span class='Value'>i</span><span class='Function'>≢</span><span class='Value'>j</span><span class='Function'>⊑</span><span class='Modifier'>¨</span><span class='Value'>𝕨</span></code>. The groups retain their array structure and ordering along each argument axis. 
Using multidimensional Replicate we can say that <code><span class='Value'>i</span><span class='Function'>⊑</span><span class='Value'>𝕨</span><span class='Function'>⊔</span><span class='Value'>𝕩</span></code> is <code><span class='Paren'>(</span><span class='Value'>i</span><span class='Function'>=</span><span class='Value'>𝕨</span><span class='Paren'>)</span><span class='Function'>/</span><span class='Value'>𝕩</span></code>.</p> -<p>The monadic case works similarly: Group Indices always satisfies <code><span class='Function'>⊔</span><span class='Value'>𝕩</span> <span class='Gets'>←→</span> <span class='Value'>𝕩</span><span class='Function'>⊔↕≠</span><span class='Modifier2'>⚇</span><span class='Number'>1</span><span class='Value'>𝕩</span></code>. As with <code><span class='Function'>↕</span></code>, the depth of the result of Group Indices is always one greater than that of its argument. A depth-0 argument is not allowed.</p> <h2 id="properties"><a class="header" href="#properties">Properties</a></h2> <p>Group is closely related to the <a href="replicate.html#inverse">inverse of Indices</a>, <code><span class='Function'>/</span><span class='Modifier'>⁼</span></code>. In fact, inverse Indices called on the index argument gives the length of each group:</p> <a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=4omgwqjiipQgMuKAvzPigL8x4oC/Mgov4oG84oinIDLigL8z4oC/MeKAvzI=">↗️</a><pre> <span class='Function'>≠</span><span class='Modifier'>¨</span><span class='Function'>⊔</span> <span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>1</span><span class='Ligature'>‿</span><span class='Number'>2</span> |
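The list-only behaviour described in the changed documentation text (an index of `¯1` drops a cell, an optional extra trailing element of `𝕨` sets a minimum result length, and one-argument Group on a numeric list is `𝕩⊔↕≠𝕩`) can be modelled outside BQN. The sketch below is a Python model written for illustration only: the function names `group` and `group_indices` are hypothetical, it is not the BQN implementation, and it deliberately ignores the higher-rank and multidimensional cases.

```python
def group(indices, values):
    """Hypothetical model of two-argument Group for a flat list of integer indices.

    An index of -1 drops the paired value. If `indices` has exactly one more
    element than `values`, that last element is a minimum result length.
    """
    indices = list(indices)
    n = len(values)
    if len(indices) == n + 1:
        min_len = indices.pop()      # extra trailing element: minimum length
    elif len(indices) == n:
        min_len = 0
    else:
        raise ValueError("indices must have the length of values, or one more")
    if any(i < -1 for i in indices):
        raise ValueError("indices must be -1 or non-negative")

    # Result is just long enough for the highest index, or min_len if larger.
    length = max([min_len] + [i + 1 for i in indices])
    groups = [[] for _ in range(length)]
    for i, v in zip(indices, values):
        if i != -1:                  # -1 means "omit this cell"
            groups[i].append(v)      # cells keep their original order
    return groups


def group_indices(indices):
    """Hypothetical model of one-argument Group on a numeric list: x grouped with range(len(x))."""
    return group(indices, list(range(len(indices))))


if __name__ == "__main__":
    print(group([0, 1, 2, 0, 1], "abcde"))
    print(group([0, 1, 2, 2, 1, 6], "abcde"))
    print(group_indices([2, 3, -1, 2]))
```

Modelling the trailing element as a length rather than as an index keeps the arithmetic to a single `max`, which matches the rationale given in the added parenthetical.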
