From 7b81224a5929c919815eb0a770cac93fa1325174 Mon Sep 17 00:00:00 2001 From: Marshall Lochbaum Date: Mon, 14 Jun 2021 22:06:01 -0400 Subject: Documentation for Indices and Replicate --- docs/doc/index.html | 1 + docs/doc/primitive.html | 4 +- docs/doc/replicate.html | 233 ++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 236 insertions(+), 2 deletions(-) create mode 100644 docs/doc/replicate.html (limited to 'docs/doc') diff --git a/docs/doc/index.html b/docs/doc/index.html index 0f114bed..6d0a3dad 100644 --- a/docs/doc/index.html +++ b/docs/doc/index.html @@ -43,6 +43,7 @@
  • Fold and Insert (´˝)
  • Group ()
  • Identity functions (⊢⊣)
  • +
  • Indices and Replicate (/)
  • Join and Join To ()
  • Logical functions (∧∨¬)
  • Match (≡≢)
  • diff --git a/docs/doc/primitive.html b/docs/doc/primitive.html index 618769e9..79792f24 100644 --- a/docs/doc/primitive.html +++ b/docs/doc/primitive.html @@ -181,8 +181,8 @@ / -Indices -Replicate +Indices +Replicate diff --git a/docs/doc/replicate.html b/docs/doc/replicate.html new file mode 100644 index 00000000..752fc7e1 --- /dev/null +++ b/docs/doc/replicate.html @@ -0,0 +1,233 @@ + + + + BQN: Indices and Replicate + + +

    Indices and Replicate

    + + + + + + + + + 𝕨 + 𝕩 + 𝕨/𝕩 + /𝕨 + + 'r' + 'e' + 'p' + 'l' + 'i' + 'c' + 'a' + 't' + 'e' + 0 + 1 + 1 + 0 + 3 + 2 + 0 + 0 + 0 + 'e' + 'p' + 'i' + 'i' + 'i' + 'c' + 'c' + 1 + 2 + 4 + 4 + 4 + 5 + 5 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

    The functions Indices and Replicate are used to copy or filter data. They might be described as transforming a run-length encoding into unencoded form. On the other hand, Indices might be described as giving a sparse representation of 𝕩, which is smaller if 𝕩 mostly consists of zeros.

    +

    BQN doesn't have any of the various features used in APL to add fills to the result of Replicate, like negative numbers in 𝕨 or an Expand (\) primitive. An alternative to Expand is to use Replicate with structural Under () to insert values into an array of fills.

    +

    Replicate

    +

    Given a list of natural numbers 𝕨, Replicate repeats each major cell in 𝕩 the corresponding number of times. That is, 𝕨 and 𝕩 must have the same length, and the result includes i𝕨 copies of each cell i𝕩, in order.

    +↗️
        2102 / "abcd"
    +"aabdd"
    +
    +     a  >"aa0""bb1""cc2""dd3"
    +┌─     
    +╵"aa0  
    +  bb1  
    +  cc2  
    +  dd3" 
    +      ┘
    +
    +    2102 / a
    +┌─     
    +╵"aa0  
    +  aa0  
    +  bb1  
    +  dd3  
    +  dd3" 
    +      ┘
    +
    +

    It's also allowed for 𝕨 to be a single number (or enclosed number: it just needs to be a unit and not a list). In this case every cell of 𝕩 is repeated that number of times.

    +↗️
        3 / "copy"
    +"cccooopppyyy"
    +
    +

    When 𝕨 is a list of booleans, a cell is never repeated more than once, meaning that each cell of 𝕩 is either left out (0), or kept in (1). If Fn is a function with a boolean result, Fn¨/ filters elements of a list according to Fn.

    +↗️
        110010 / "filter"
    +"fie"
    +
    +    'i' "filter"
    +⟨ 1 1 0 0 1 0 ⟩
    +
    +    'i'/ "filter"
    +"fie"
    +
    +

    Here 'i' is a pervasive function, so there's no need to add ¨. Similarly, to filter major cells of an array, Fn˘/ could be used, applying Fn to one major cell at a time.

    +

    A similar pattern applies to Replicate as well. The function below tests which input characters are double quotes, but by adding one it changes the result to 1 for each non-quote character and 2 for quotes (but source code and display also double quotes here, so the input string has only two "s and the output has four).

    +↗️
        {1+'"'=𝕩}/ "for ""escaping"" quotes"
    +"for """"escaping"""" quotes"
    +
    +

    Compound Replicate

    +

    If 𝕨 has depth two, then its elements give the amounts to copy along each leading axis of 𝕩.

    +↗️
         b  25  10
    +┌─           
    +╵ 0 1 2 3 4  
    +  5 6 7 8 9  
    +            ┘
    +
    +    20, 10011 / b
    +┌─       
    +╵ 0 3 4  
    +  0 3 4  
    +        ┘
    +
    +    20 / 10011/˘ b
    +┌─       
    +╵ 0 3 4  
    +  0 3 4  
    +        ┘
    +
    +

    Here the 20 indicates that the first row of b is copied twice and the second is ignored, while 10011 picks out three entries from that row. 𝕩 can also have more axes than elements of 𝕨, and the trailing ones aren't changed, just like the simpler case. However, 𝕨 has to have at least as many elements as 𝕩 has axes (so (𝕨)≥=𝕩), and each element has to have the same length as the corresponding axis in 𝕩—or it can be a unit, as shown below.

    +↗️
        <2,<3 / b
    +┌─                               
    +╵ 0 0 0 1 1 1 2 2 2 3 3 3 4 4 4  
    +  0 0 0 1 1 1 2 2 2 3 3 3 4 4 4  
    +  5 5 5 6 6 6 7 7 7 8 8 8 9 9 9  
    +  5 5 5 6 6 6 7 7 7 8 8 8 9 9 9  
    +                                ┘
    +
    +

    Above, both elements of 𝕨 are enclosed numbers. An individual element doesn't have to be enclosed, but I don't recommend this, since if none of them are enclosed, then 𝕨 will have depth 1 and it will be interpreted as replicating along the first axis only.

    +↗️
        2,3 / b
    +┌─           
    +╵ 0 1 2 3 4  
    +  0 1 2 3 4  
    +  5 6 7 8 9  
    +  5 6 7 8 9  
    +  5 6 7 8 9  
    +            ┘
    +
    +

    The example above has a different result from the previous one! The function <¨/ could be used in place of / to replicate each axis by a different number.

    +

    If 𝕨 is ⟨⟩, then it has depth 1, but is handled with the multidimensional case anyway, giving a result of 𝕩. The one-dimensional case could also only ever return 𝕩, but it would be required to have length 0, so this convention still only extends the simple case.

    +↗️
        b  ⟨⟩ / b
    +1
    +
    +

    Indices

    +

    The monadic form of / is much simpler than the dyadic one, with no multidimensional case or mismatched argument ranks. 𝕩 must be a list of natural numbers, and /𝕩 is the list 𝕩/↕≠𝕩. Its elements are the indices for 𝕩, with index i repeated i𝕩 times.

    +↗️
        / 3012
    +⟨ 0 0 0 2 3 3 ⟩
    +
    +

    A unit argument isn't allowed, and isn't very useful: for example, /6 might indicate an array of six zeros, but this can be written /⥊6 or 60 with hardly any extra effort.

    +

    When 𝕨 has rank 1, 𝕨/𝕩 is equivalent to 𝕨/𝕩. Of course, this isn't the only use of Indices. It also gets along well with Group: for example, / groups 𝕩 according to a list of lengths 𝕨.

    +↗️
        2501 / "ABCDEFGH"
    +⟨ "AB" "CDEFG" ⟨⟩ "H" ⟩
    +
    +

    This function will fail to include trailing empty arrays; the modification (/∾1) fixes this and ensures the result always has as many elements as 𝕨.

    +

    If 𝕩 is boolean then /𝕩 contains all the indices where a 1 appears in 𝕩. Applying -» to the result gives the distance from each 1 to the previous, or to the start of the list, another potentially useful function.

    +↗️
        / 0101000010
    +⟨ 1 3 8 ⟩
    +
    +    -» / 0101000010
    +⟨ 1 2 5 ⟩
    +
    +

    With more effort we can also use / to analyze groups of 1s in the argument (and of course all these methods can be applied to 0s instead, by first flipping the values with ¬). First we highlight the start and end of each group by comparing the list with a shifted copy of itself. Or rather, we'll first place a 0 at the front and then at the end, in order to detect when a group starts at the beginning of the list or ends at the end (there's also a shift-based version, «0𝕩).

    +↗️
        0 (∾≍∾˜) 01110010110
    +┌─                         
    +╵ 0 0 1 1 1 0 0 1 0 1 1 0  
    +  0 1 1 1 0 0 1 0 1 1 0 0  
    +                          ┘
    +
    +    0 (∾≠∾˜) 01110010110
    +⟨ 0 1 0 0 1 0 1 1 1 0 1 0 ⟩
    +
    +    / 0(∾≠∾˜) 01110010110
    +⟨ 1 4 6 7 8 10 ⟩
    +
    +

    So now we have the indices of each transition from 0 to 1 or 1 to 0, in an extended list with 0 added at the beginning and end. The first index has to be for a 0 to 1 transition, because we forced the first value to be a 0, and then the next can only be 1 to 0, then 0 to 1, and so on until the last, which must be 1 to 0 because the last value is also 0.

    +↗️
        -˜`˘ 2⥊/ 0(∾≠∾˜) 01110010110
    +┌─     
    +╵ 1 3  
    +  6 1  
    +  8 2  
    +      ┘
    +
    +

    This means the transitions can be grouped exactly in pairs, the beginning and end of each group. Reshape with a computed length 2 groups these pairs, and then a scan -˜`˘ can be used to convert the start/end format to start/length if wanted.

    +

    Inverse

    +

    The result of Indices /n is an ordered list of natural numbers, where the number i appears in times. Given an ordered list of natural numbers k, the inverse of indices returns a corresponding n: one where the value in is the number of times i appears in k.

    +↗️
        / 321
    +⟨ 0 0 0 1 1 2 ⟩
    +
    +    / 000112
    +⟨ 3 2 1 ⟩
    +
    +

    Finding how many times each index appears in a list of indices is often a useful thing to do, and there are a few ways to do it:

    +↗️
        +˝˘ (5) = 224120  # Inefficient
    +⟨ 1 1 3 0 1 ⟩
    +
    +    ¨ 224120
    +⟨ 1 1 3 0 1 ⟩
    +
    +    / 224120
    +⟨ 1 1 3 0 1 ⟩
    +
    +

    For / to work, the argument has to be sorted: otherwise it won't be a valid result of /. But sorting with is no problem, and / will probably be faster than ¨ in the absence of special handling for either combination.

    -- cgit v1.2.3