From 2010e8b2897a5741e211980c9f8ec9177299c939 Mon Sep 17 00:00:00 2001
From: Marshall Lochbaum
Date: Fri, 16 Jul 2021 18:23:52 -0400
Subject: Finish links and editing documentation pass

---
 docs/doc/transpose.html | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

(limited to 'docs/doc/transpose.html')

diff --git a/docs/doc/transpose.html b/docs/doc/transpose.html
index f86029a7..ee3e926c 100644
--- a/docs/doc/transpose.html
+++ b/docs/doc/transpose.html
@@ -5,7 +5,7 @@

Transpose

-

As in APL, Transpose (⍉) is a tool for rearranging the axes of an array. BQN's version is tweaked to align better with the leading axis model and make common operations easier.

+

Transpose (⍉) is a tool for rearranging the axes of an array. BQN's version is tweaked relative to APL to align better with the leading axis model and make common operations easier.

Transpose basics

The name for the primitive ⍉ comes from the Transpose operation on matrices. Given a matrix as an array of rank 2, ⍉ will transpose it:

↗️
    ⊢ mat ← 2‿3 ⥊ ↕6
@@ -29,13 +29,14 @@
 

With two axes the only interesting operation of this sort is to swap them (and with one or zero axes there's nothing interesting to do, and ⍉ just returns the argument array). But a BQN programmer may well want to work with higher-rank arrays (although such a programmer might call them "tensors"), and this means there are many more ways to rearrange the axes. Transpose extends to high-rank arrays to allow some useful special cases as well as completely general axis rearrangement, as described below.

Monadic Transpose

APL extends matrix transposition to any rank by reversing all axes for its monadic ⍉, but this generalization isn't very natural and is almost never used. The main reason for it is to maintain the equivalence a MP b ←→ b MP⌾⍉ a, where MP ← +˝∘×⎉1‿∞ is the generalized matrix product. But even here APL's Transpose is suspect. It does much more work than it needs to, as we'll see.

-

BQN's transpose takes the first axis of its argument and moves it to the end.

-↗️
    ≢ a23456 ← ↕2‿3‿4‿5‿6
+

BQN's transpose takes the first axis of 𝕩 and moves it to the end.

+↗️
    ≢ a23456 ← ↕2‿3‿4‿5‿6
 ⟨ 2 3 4 5 6 ⟩
+
     ≢ ⍉ a23456
 ⟨ 3 4 5 6 2 ⟩
 
-

In terms of the argument data as given by Deshape (⥊), this looks like a simple 2-dimensional transpose: one axis is exchanged with a compound axis made up of the other axes. Here we transpose a rank-3 array:

+

In terms of the argument data as given by Deshape (⥊), this looks like a simple 2-dimensional transpose: one axis is exchanged with a compound axis made up of the other axes. Here we transpose a rank-3 array:

↗️
    a322 ← 3‿2‿2⥊↕12
     ≍○<⟜⍉ a322
 ┌─                      
@@ -62,9 +63,10 @@
                          ┘  
                            ┘
 
-

To exchange multiple axes, use the Power modifier. A negative power moves axes in the other direction, just like how Rotate handles negative left arguments. In particular, to move the last axis to the front, use Undo (as you might expect, this exactly inverts ⍉).

-↗️
    ≢ ⍉⍟3 a23456
+

To exchange multiple axes, use the Repeat modifier. A negative power moves axes in the other direction, just like how Rotate handles negative left arguments. In particular, to move the last axis to the front, use Undo (as you might expect, this exactly inverts ⍉).

+↗️
    ≢ ⍉⍟3 a23456
 ⟨ 5 6 2 3 4 ⟩
+
     ≢ ⍉⁼ a23456
 ⟨ 6 2 3 4 5 ⟩
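
The shapes printed above follow a simple pattern: each application of monadic Transpose rotates the shape list left by one, so repeating it k times rotates by k, and Undo rotates right by one. The Python sketch below models only that shape bookkeeping, not the element data; `transpose_shape` is a hypothetical helper name and not part of BQN.

```python
def transpose_shape(shape, k=1):
    """Shape after applying monadic Transpose k times (k = -1 models Undo).

    Shape-only model: moving the first axis to the end k times
    rotates the shape list left by k positions.
    """
    shape = list(shape)
    if not shape:           # a unit has no axes to move
        return shape
    k %= len(shape)         # Python's % maps a negative k into range
    return shape[k:] + shape[:k]


a23456 = [2, 3, 4, 5, 6]
print(transpose_shape(a23456))       # [3, 4, 5, 6, 2]
print(transpose_shape(a23456, 3))    # [5, 6, 2, 3, 4]
print(transpose_shape(a23456, -1))   # [6, 2, 3, 4, 5]
```

The three printed shapes match the outputs shown for the single, repeated, and undone Transpose of a23456.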
 
@@ -73,11 +75,11 @@ ↗️
    ≢ ⍉⎉3 a23456
 ⟨ 2 3 5 6 4 ⟩
 
-

And of course, Rank and Power can be combined to do more complicated transpositions: move a set of contiguous axes with any starting point and length to the end.

+

And of course, Rank and Repeat can be combined to do more complicated transpositions: move a set of contiguous axes with any starting point and length to the end.

↗️
    ≢ ⍉⁼⎉¯1 a23456
 ⟨ 2 6 3 4 5 ⟩
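
One way to read the two Rank examples is that Rank limits Transpose to a trailing block of axes: a positive rank k touches only the last k axes, and a negative rank k skips the first |k| axes. The Python sketch below models shapes only under that reading; `rotate` and `ranked_transpose_shape` are hypothetical helpers, not BQN primitives.

```python
def rotate(shape, k):
    """Rotate a shape list left by k positions (right if k is negative)."""
    shape = list(shape)
    if not shape:
        return shape
    k %= len(shape)
    return shape[k:] + shape[:k]


def ranked_transpose_shape(shape, rank, times=1):
    """Shape after Transpose, repeated `times`, applied at the given rank.

    Assumption: rank >= 0 selects the last `rank` axes, and rank < 0
    leaves the first |rank| axes untouched. times = -1 models Undo.
    """
    shape = list(shape)
    n = len(shape)
    cell = min(rank, n) if rank >= 0 else max(n + rank, 0)
    frame, cell_shape = shape[:n - cell], shape[n - cell:]
    return frame + rotate(cell_shape, times)


a23456 = [2, 3, 4, 5, 6]
print(ranked_transpose_shape(a23456, 3))             # [2, 3, 5, 6, 4]
print(ranked_transpose_shape(a23456, -1, times=-1))  # [2, 6, 3, 4, 5]
```

Combining a rank with a repeat count in this way moves any contiguous run of trailing-block axes to the end, which is the general case the text describes.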
 
-

Using these forms, we can state BQN's generalized matrix product swapping rule:

+

Using these forms (and the Rank function), we can state BQN's generalized matrix product swapping rule:

a MP b  ←→  ⍉⍟(1-=a) (⍉b) MP (⍉⁼a)
 

Certainly not as concise as APL's version, but not a horror either. BQN's rule is actually more parsimonious in that it only performs the axis exchanges necessary for the computation: it moves the two axes that will be paired with the matrix product into place before the product, and directly exchanges all axes afterwards. Each of these steps is equivalent in terms of data movement to a matrix transpose, the simplest nontrivial transpose to perform. Also remember that for two-dimensional matrices both kinds of transposition are the same, so that APL's simpler rule MP ≡ MP⌾⍉˜ holds in BQN.
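
The swapping rule can be sanity-checked on shapes alone: transpose b, undo-transpose a, take the product, then repeat Transpose 1 minus the rank of a times. In the Python sketch below, `mp_shape` is an assumed shape model of the generalized matrix product (it pairs the last axis of the left argument with the first axis of the right and keeps the remaining axes in order). All names are hypothetical, and both arguments are assumed to have rank at least 1.

```python
def rotate(shape, k):
    """Rotate a shape list left by k positions (right if k is negative)."""
    shape = list(shape)
    if not shape:
        return shape
    k %= len(shape)
    return shape[k:] + shape[:k]


def mp_shape(left, right):
    """Shape of a generalized matrix product (assumed shape-only model)."""
    if left[-1] != right[0]:
        raise ValueError("paired axes must have equal length")
    return list(left[:-1]) + list(right[1:])


def swapped_mp_shape(a, b):
    """Shape of the right-hand side of the swapping rule."""
    # (Transpose b) MP (Undo-Transpose a)
    product = mp_shape(rotate(b, 1), rotate(a, -1))
    # Repeat Transpose (1 - rank of a) times to restore axis order.
    return rotate(product, 1 - len(a))


a, b = [2, 3, 4], [4, 5, 6, 7]
print(mp_shape(a, b))          # [2, 3, 5, 6, 7]
print(swapped_mp_shape(a, b))  # [2, 3, 5, 6, 7]
```

Each `rotate` call here stands for one of the three axis moves in the rule, which is why the rule costs no more than three matrix-style transposes.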

@@ -87,9 +89,10 @@

In a case like this BQN's Dyadic transpose is much easier.

Dyadic Transpose

-

Transpose also allows a left argument that specifies a permutation of the right argument's axes. For each index p←i⊑𝕨 in the left argument, axis i of the argument is used for axis p of the result. Multiple argument axes can be sent to the same result axis, in which case that axis goes along a diagonal of the argument array, and the result will have a lower rank than the argument.

-↗️
    ≢ 1‿3‿2‿0‿4 ⍉ a23456
+

Transpose also allows a left argument that specifies a permutation of 𝕩's axes. For each index p←i⊑𝕨 in the left argument, axis i of 𝕩 is used for axis p of the result. Multiple argument axes can be sent to the same result axis, in which case that axis goes along a diagonal of 𝕩, and the result will have a lower rank than 𝕩.

+↗️
    ≢ 1‿3‿2‿0‿4 ⍉ a23456
 ⟨ 5 2 4 3 6 ⟩
+
     ≢ 1‿2‿2‿0‿0 ⍉ a23456  # Don't worry too much about this case though
 ⟨ 5 2 3 ⟩
 
@@ -97,17 +100,17 @@ ↗️
    ≢ 1‿3‿2‿0‿4 ⍉⁼ a23456
 ⟨ 3 5 4 2 6 ⟩
 
-

So far, all like APL. BQN makes one little extension, which is to allow only some axes to be specified. The left argument will be matched up with leading axes of the right argument. Those axes are moved according to the left argument, and remaining axes are placed in order into the gaps between them.

+

So far, all like APL. BQN makes one little extension, which is to allow only some axes to be specified. Then 𝕨 will be matched up with leading axes of 𝕩. Those axes are moved according to 𝕨, and remaining axes are placed in order into the gaps between them.

↗️
    ≢ 0‿2‿4 ⍉ a23456
 ⟨ 2 5 3 6 4 ⟩
 
-

In particular, the case with only one argument specified is interesting. Here, the first axis ends up at the given location. This gives us a much better solution to the problem at the end of the last section.

+

In particular, the case with only one axis specified is interesting. Here, the first axis ends up at the given location. This gives us a much better solution to the problem at the end of the last section.

↗️
    ≢ 2 ⍉ a23456  # Restrict Transpose to the first three axes
 ⟨ 3 4 2 5 6 ⟩
 

Finally, it's worth noting that, as monadic Transpose moves the first axis to the end, it's equivalent to dyadic Transpose with a "default" left argument: (=-1˙)⊸⍉.

Definitions

Here we define the two valences of Transpose more precisely.

-

Monadic transpose is identical to (=-1˙)⊸⍉, except that if the argument is a unit it is returned unchanged rather than giving an error.

-

An atom right argument to dyadic Transpose is always enclosed to get an array before doing anything else.

-

In dyadic Transpose, the left argument is a number or numeric array of rank 1 or less, and 𝕨≤○≠≢𝕩. Define the result rank r←(=𝕩)-+´¬∊𝕨 to be the argument rank minus the number of duplicate entries in the left argument. We require ∧´𝕨<r. Bring 𝕨 to full length by appending the missing indices: 𝕨∾↩𝕨(¬∘∊˜/⊢)↕r. Now the result shape is defined to be ⌊´¨𝕨⊔≢𝕩. Element i⊑z of the result z is element (𝕨⊏i)⊑𝕩 of the argument.

+

An atom right argument to either valence of Transpose is always enclosed to get an array before doing anything else.

+

Monadic transpose is identical to (=-1˙)⊸⍉, except that if 𝕩 is a unit it is returned unchanged (after enclosing, if it's an atom) rather than giving an error.

+

In dyadic Transpose, 𝕨 is a number or numeric array of rank 1 or less, and 𝕨≤○≠≢𝕩. Define the result rank r←(=𝕩)-+´¬∊𝕨 to be the right argument rank minus the number of duplicate entries in the left argument. We require ∧´𝕨<r. Bring 𝕨 to full length by appending the missing indices: 𝕨∾↩𝕨(¬∘∊˜/⊢)↕r. Now the result shape is defined to be ⌊´¨𝕨⊔≢𝕩. Element i⊑z of the result z is element (𝕨⊏i)⊑𝕩 of the argument.
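
The dyadic definition can be followed step by step in a small model. The Python sketch below is an illustration under stated assumptions, not BQN's implementation: the array is represented by its shape plus a lookup function `cell`, the result is returned as a dict keyed by index tuples, and the names `dyadic_transpose_shape` and `dyadic_transpose` are hypothetical.

```python
from itertools import product


def dyadic_transpose_shape(w, shape):
    """Return (full_w, result_shape) for left argument w and argument shape."""
    w = [w] if isinstance(w, int) else list(w)
    if len(w) > len(shape):
        raise ValueError("w is longer than the rank of x")
    duplicates = len(w) - len(set(w))
    r = len(shape) - duplicates                      # result rank
    if not all(0 <= p < r for p in w):
        raise ValueError("every entry of w must be less than the result rank")
    # Bring w to full length by appending the missing result axes in order.
    full = w + [p for p in range(r) if p not in w]
    # Each result axis is as long as the shortest argument axis sent to it.
    result = [min(n for n, p in zip(shape, full) if p == q) for q in range(r)]
    return full, result


def dyadic_transpose(w, shape, cell):
    """Map each result index i to the element of x at i selected by full w."""
    full, result_shape = dyadic_transpose_shape(w, shape)
    return {
        i: cell(tuple(i[p] for p in full))
        for i in product(*(range(n) for n in result_shape))
    }


s = [2, 3, 4, 5, 6]
print(dyadic_transpose_shape([1, 3, 2, 0, 4], s)[1])  # [5, 2, 4, 3, 6]
print(dyadic_transpose_shape([1, 2, 2, 0, 0], s)[1])  # [5, 2, 3]
print(dyadic_transpose_shape([0, 2, 4], s)[1])        # [2, 5, 3, 6, 4]
print(dyadic_transpose_shape(2, s)[1])                # [3, 4, 2, 5, 6]

mat = [[0, 1, 2], [3, 4, 5]]
diag = dyadic_transpose([0, 0], [2, 3], lambda j: mat[j[0]][j[1]])
print(diag)                                           # {(0,): 0, (1,): 4}
```

The four shapes agree with the dyadic examples earlier in the page, and the last call shows a repeated entry in the left argument selecting the diagonal of a 2×3 matrix.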

-- cgit v1.2.3