From 2afb23928e1984d475cc460e1672e8f6fa0e4dbe Mon Sep 17 00:00:00 2001 From: Marshall Lochbaum Date: Wed, 11 Aug 2021 17:21:31 -0400 Subject: Allow clicking on header to get fragment link --- docs/doc/arithmetic.html | 12 ++++++------ docs/doc/array.html | 14 +++++++------- docs/doc/arrayrepr.html | 26 +++++++++++++------------- docs/doc/assert.html | 4 ++-- docs/doc/based.html | 10 +++++----- docs/doc/block.html | 24 ++++++++++++------------ docs/doc/context.html | 10 +++++----- docs/doc/control.html | 24 ++++++++++++------------ docs/doc/couple.html | 8 ++++---- docs/doc/depth.html | 8 ++++---- docs/doc/embed.html | 8 ++++---- docs/doc/enclose.html | 12 ++++++------ docs/doc/expression.html | 10 +++++----- docs/doc/fill.html | 6 +++--- docs/doc/find.html | 4 ++-- docs/doc/fold.html | 14 +++++++------- docs/doc/fromDyalog.html | 14 +++++++------- docs/doc/fromJ.html | 14 +++++++------- docs/doc/functional.html | 10 +++++----- docs/doc/glossary.html | 18 +++++++++--------- docs/doc/group.html | 12 ++++++------ docs/doc/identity.html | 10 +++++----- docs/doc/index.html | 2 +- docs/doc/indices.html | 8 ++++---- docs/doc/join.html | 6 +++--- docs/doc/leading.html | 18 +++++++++--------- docs/doc/lexical.html | 14 +++++++------- docs/doc/logic.html | 10 +++++----- docs/doc/map.html | 8 ++++---- docs/doc/match.html | 6 +++--- docs/doc/namespace.html | 8 ++++---- docs/doc/oop.html | 14 +++++++------- docs/doc/order.html | 14 +++++++------- docs/doc/paradigms.html | 6 +++--- docs/doc/pick.html | 8 ++++---- docs/doc/prefixes.html | 8 ++++---- docs/doc/primitive.html | 6 +++--- docs/doc/range.html | 6 +++--- docs/doc/repeat.html | 10 +++++----- docs/doc/replicate.html | 10 +++++----- docs/doc/reshape.html | 12 ++++++------ docs/doc/reverse.html | 8 ++++---- docs/doc/scan.html | 10 +++++----- docs/doc/search.html | 12 ++++++------ docs/doc/select.html | 8 ++++---- docs/doc/selfcmp.html | 12 ++++++------ docs/doc/shape.html | 6 +++--- docs/doc/shift.html | 12 ++++++------ docs/doc/syntax.html | 22 +++++++++++----------- docs/doc/take.html | 8 ++++---- docs/doc/train.html | 10 +++++----- docs/doc/transpose.html | 10 +++++----- docs/doc/types.html | 18 +++++++++--------- docs/doc/undo.html | 8 ++++---- docs/doc/windows.html | 12 ++++++------ 55 files changed, 301 insertions(+), 301 deletions(-) (limited to 'docs/doc') diff --git a/docs/doc/arithmetic.html b/docs/doc/arithmetic.html index 0be70506..22dc233b 100644 --- a/docs/doc/arithmetic.html +++ b/docs/doc/arithmetic.html @@ -4,7 +4,7 @@ BQN: Arithmetic functions -

Arithmetic functions

+

Arithmetic functions

BQN's arithmetic functions use mostly the same symbols as APL, and the functionality is actually defined by the language implementation's number system and not the specification, so there's not too much to say about them.

Summary of differences for APLers:

-

Basic arithmetic

+

Basic arithmetic

These functions are also introduced in the first BQN tutorial.

BQN of course supports the elementary functions taught in schools everywhere:

@@ -104,7 +104,7 @@ ׯ2¯004 ⟨ 1 ¯1 0 0 1 ⟩ -

Character arithmetic

+

Character arithmetic

The Add and Subtract functions can be applied to characters as well as numbers. While any two numbers (finite ones, at least) can be added or subtracted, character arithmetic has more restrictions.

@@ -149,7 +149,7 @@ ↗️
    'a' - @
 97
 
-

Additional arithmetic

+

Additional arithmetic

@@ -202,7 +202,7 @@ 1

Unlike in APL, a left argument of 0 fails or returns a not-a-number result. Set 𝕨 to to keep 𝕩 intact, but do note that if 𝕩<0 this will return .

-

Comparisons

+

Comparisons

BQN uses the six standard comparison functions of mathematics. For each pair of atoms the result is 1 if the comparison is true and 0 if it's false. These functions do the obvious thing with numeric arguments, but are extended to other types as well.

@@ -283,7 +283,7 @@ 'b'"abacba" ⟨ 1 0 1 1 0 1 ⟩ -

Pervasion

+

Pervasion

Arithmetic primitives act as though they are given depth 0, so that with array arguments they treat each atom independently. While the examples above use only numbers or lists of them, arithmetic applies to nested and high-rank arrays just as easily.

↗️
    × ˘¯8,¯9⟨⟨2,0,4,5
 ┌─                 
diff --git a/docs/doc/array.html b/docs/doc/array.html
index 001a9f70..1eeb5ee6 100644
--- a/docs/doc/array.html
+++ b/docs/doc/array.html
@@ -4,7 +4,7 @@
   BQN: The array
 
 
-

The array

+

The array

As BQN is an array language, it's often helpful to understand what an array is when writing BQN programs. Fully describing the concept is sometimes held to be tricky; here we'll see definitions, examples, and metaphors.

In BQN, as in APL, arrays are multidimensional, instead of strictly linear. Languages like Python, Javascript, or Haskell offer only one-dimensional arrays with [] syntax, and typically represent multidimensional data with nested arrays. Multidimensional arrays have fundamental differences relative to this model.

BQN's arrays are immutable, meaning that an array is entirely defined by its attributes, and there is no way to modify an existing array, only to produce another array that has changes relative to it. As a result, an array can never contain itself, and arrays form an inductive type. BQN's mutable types are operations and namespaces.

@@ -85,7 +85,7 @@ -

Rectangles

+

Rectangles

A BQN array is a multidimensional arrangement of data. The word "array" descends from words meaning "order", and the data in an array is ordered indeed. Below are examples of arrays with zero, one, and two dimensions.

↗️
    <5
 ┌·   
@@ -105,23 +105,23 @@
 

Each dimension, or axis, has some finite number of positions, with an element at every combination of positions. For example, if a group of friends shop at several different stores, the amount they spend in a week could be placed in a two-dimensional array, with people along one axis and stores along another. An element of the array would indicate how much one person spent at one store, so that summing across stores gives each person's expenditures and summing across people gives each store's income.

The axes of an array must be independent, that is, the positions present in one axis are fixed for the entire array and don't depend on other axes. This is a difference relative to a nested list model. When storing data in nested lists, the outer axis comes first and later axes are subordinate to it. The length of the second axis depends completely on the position in the first. A programmer might choose the lengths so it doesn't in a particular case, but in a BQN array differing lengths simply aren't representable.

The array also needs to be complete. Every element—every combination of positions—must have a value. This value could be a placeholder like @, but it has to be something (in the spending example, everyone spends some amount at each store, even if it's zero). And of course, there are no extra elements that don't fit into the positioning system—the fill isn't really part of the array, but extra information about it.

-

Ordering and indices

+

Ordering and indices

To finish this definition of an array we also need to nail down the idea of a position. The positions along one dimension can't be labelled in any way, but they have a linear ordering (mathematically speaking, a total order: out of any two different positions one comes earlier and the other later). BQN keeps track of this order: for example, when we join two arrays it places positions in 𝕨 before those of 𝕩 and otherwise maintains the original ordering.

↗️
    "before"  "after"
 "beforeafter"
 

It's only the ordering that allows positions to be distinguished. BQN labels them with natural numbers called indices that can be derived from the order: the earliest position is called 0, the next 1, and so on. The axes of an array are also ordered, and they're indexed starting at 0 as well.

These kinds of index are one-dimensional, but there's also a multidimensional kind of array index, that identifies an element. An element index consists of one index along each axis. Because the axis are ordered, it can be represented as a list l of numbers, where il is the index along axis i. It's important to distinguish an element from its value: for example, there's only one value (3) contained in the array 3,3,3, but it still has three elements, identified by indices 0, 1, and 2.

-

Dimensions

+

Dimensions

The number of axes in an array is called its rank. The number of positions along an axis is called its length, and the length of an array means its length along the first axis, or 1 if there are no axes. The list of the length along each axis is the array's shape, and describes the possible element locations completely. In BQN they're exposed as the functions Rank (=), Length (), and Shape ().

The total number of elements in an array is its bound, and can be found using Deshape with , is then the product of all the lengths in the shape. An array of rank 0, which always contains exactly one element, is called a unit, while an array of rank 1 is called a list and an array of rank 2 is called a table.

-

Elements

+

Elements

Any BQN value can be used as an array element, including another array (BQN, as a dynamically-typed language, doesn't restrict the types that can be used in one context without a good reason). However, BQN arrays are restricted relative to another array model. Frameworks like NumPy or Julia have mutable arrays, so that the value of an element can be changed after the array is created. This allows an array to be its own element, by creating an array and then inserting it into itself. This would be unnatural in BQN, where an array can only be formed from elements that already exist. In BQN only operations and namespaces are mutable.

-

Properties

+

Properties

Summarizing, the values needed to define an array are its rank (the number of axes), its shape (the number of positions along each axis), and the value of each element (that is, at each combination of positions). Two arrays match when all these values match.

If the rank is considered to be part of the shape, as it is when the shape is a BQN list, then the array is defined by its shape and element list—from deshape.

Here's a somewhat informal mathematical take. Given a set of possible element values T, a list of T of length l is a map from natural numbers less than l to T. An array is a rank r, along with a list s of natural numbers of length r, and a map from lists of natural numbers i that satisfy i(j) < s(j) for all natural numbers j<r to BQN values. Arrays are an inductive type, so that an array can only be defined using elements that already exist. As a result an array's elements are always values of lesser complexity and selecting one element of an array, then an element of that element, and so on, must eventually reach a non-array.

-

Why arrays?

+

Why arrays?

The multidimensional array is a fairly simple structure, but there are simpler ones like pairs, lists, sets, and dictionaries. Why does BQN choose the array for its central type? I don't think arrays are always the best data structure (or that BQN is always the best language), but I do think they're one of several good choices and have unique advantages.

Arrays offer a lot of flexibility since they generalize lists. This also means that they can be used to represent pairs or sets. Two lists, or an array with a length-2 axis, can represent a map, although it could be hard to use with good performance.

But arrays are less flexible than nested lists (which in turn are less flexible than nested arrays). This is also in some sense a strength. The axes of an array are inherently independent. Lots of things in real life are independent! Regardless of which main you choose in your Cook Out tray you have the same options for sides. A term in a multivariate polynomial can have any power of x and any power of y. An array language lets you encode this independence in your data, and use operations that take advantage of it.

diff --git a/docs/doc/arrayrepr.html b/docs/doc/arrayrepr.html index 55598a6a..7c27fa07 100644 --- a/docs/doc/arrayrepr.html +++ b/docs/doc/arrayrepr.html @@ -4,10 +4,10 @@ BQN: Array notation and display -

Array notation and display

+

Array notation and display

This page documents ways arrays are represented in BQN: the notation you can use to write them and the way the REPL displays them.

Array display is a feature of a BQN environment such as a REPL. You can also access it with •Fmt, which takes a value and returns a string indicating how it would be formatted. Array notation is of course part of BQN source code, but you can also go from an array to one possible source code for it using the similar system function •Repr.

-

Array display

+

Array display

Although it's really part of the language environment and not BQN itself, let's look at display first so it's clear what arrays we're talking about later on. The BQN REPL prints arrays in a way that's meant to unambiguously show the structure and data, but doesn't correspond to BQN source code. A few examples are given below; of course, displays like this appear all over the documentation.

↗️
     34             # Array of lists
 ┌─                                 
@@ -31,7 +31,7 @@
 

There are several different ways to show arrays: as a string "", with brackets ⟨⟩, or with corners and . We'll start with the most general, the corners. These show arrays of any rank while the other two ways are special cases for lists.

Array displays show only the array shape and elements. The fill is an inferred property and the display never indicates or depends on it.

-

Corners

+

Corners

Those top-left and bottom-right corners are a distinctive part of BQN's display, as other systems almost always completely enclose the contents. BQN could add the other two corners, naturally; it just doesn't. Within the corners, elements are separated by whitespace only, and generally aligned to the top left.

↗️
    2,"xy"22"abcd",4  # Nested 2×2 array
 ┌─             
@@ -43,7 +43,7 @@
               ┘
 

The lack of extra separation is to make it clear that the corners enclose the array rather than any of its elements (elements are still distinguishable becase an individual element won't contain whitespace except maybe between quotes). Now every set of corners indicates one array. This is a good fit for the based array model, where data doesn't have to be in an array.

-

Rank indicator

+

Rank indicator

The top left corner indicates the rank of an array. Here's a neat way using Fold (´) and Prefixes () to nest ranks 0 through 6 together:

↗️
    0 <´ 61
 ┌·                           
@@ -67,7 +67,7 @@
 ·    ┌─    ┌─    ┌─    ┌─    ┌─    6    7  
 ·     ·                                 
 
-

High-rank layout

+

High-rank layout

We've seen already that elements of a list are placed side by side, while the rows of a table (rank-2 array) are stacked on top of each other.

↗️
    <¨ 5        # A list of units
 ┌─                               
@@ -120,7 +120,7 @@
  ·lmnopqrst" 
             ┘
 
-

Empty arrays

+

Empty arrays

The top-left corner can show the rank of an array but not its shape; the shape must be seen from the data. An empty array has no data, and it's hard to tell shape from a bunch of blank space. In general, an empty array is printed as shape. An empty list is shown using brackets ⟨⟩, which are discussed in the next section.

↗️
    ¨ 04, 301, 200, 0
 ⟨ ↕0‿4 ↕3‿0‿1 ↕2‿0‿0 ⟨⟩ ⟩
@@ -135,7 +135,7 @@
             ┘  
               ┘
 
-

Simple lists

+

Simple lists

In two cases BQN might use a different format to display a list on one line. The first is for a string (that is, a list of just characters), which is displayed using the exact source code that would generate it. This is different from the array display, which doesn't escape quotes, and substitutes control characters to make sure things stay horizontal.

↗️
    "tab(	)+quote("")"
 "tab(	)+quote("")"
@@ -158,10 +158,10 @@
 ⟨⟩
 

This case also covers empty lists, which are shown as ⟨⟩. This includes an empty string, as the only difference between an empty string and any other empty list is its fill element and array displays don't depend on the fill.

-

List literals

+

List literals

The tutorial section here also covers this topic.

There are three kinds literal notation for lists: strings, list notation, and stranding. Strings indicate character lists (with space for the fill) and the other two can combine any sequence of elements.

-

Strings

+

Strings

A string consists of a sequence of characters surrounded by double quotes "". The only rule for the characters inside is that any double quote must be escaped by repeating it twice; otherwise the string ends at that point.

↗️
    "-'×%""*"
 "-'×%""*"
@@ -170,7 +170,7 @@
 ERROR
 

Even special characters like a newline can appear in a string literal, so that string literals are automatically multi-line.

-

Brackets

+

Brackets

List notation uses angle brackets ⟨⟩. The contents are structurally identical to those of a block, that is, a list of expressions separated by , or or newlines. Unlike a block, a list doesn't need to have any expressions: ⟨⟩ or or ,,⋄, will create an empty list. Other differences are that a list doesn't introduce a new scope and all of the expressions have to result in a value, not Nothing (·).

Entries in a list are evaluated in source order, and the value will be the list of those results. The list has a subject role, even if it contains expressions with other roles. Any value can be an element.

↗️
    @, ˘, "abc"
@@ -192,7 +192,7 @@
   "e6"
 
 
-

Strands

+

Strands

Strand notation is another way to write lists of length two or more. The elements are connected with the ligature character . It has a precedence lower than the namespace dot but higher than anything else other than paired brackets (), {}, and ⟨⟩, so compound elements generally need to be placed in parentheses. Expressions joined by ligatures behave exactly the same as those in list notation: they are evaluated in order and placed in a list.

↗️
    +´×
 ⟨ + ´ ∘ × ⟩
@@ -201,7 +201,7 @@
 1
 

Strand notation is mainly useful for simple elements that don't require parentheses. A strand with one set of parentheses is no shorter than using list notation (but could look nicer), and one with more parentheses will be longer.

-

Why not whitespace?

+

Why not whitespace?

In APL two or more arrays that are next to each other in the code are combined into a list, a convention known as stranding. So 2 3 5 + 1 adds a list to a number. This looks substantially cleaner than a BQN list, so it's reasonable to ask: why give it up? I admit I've been jealous of that clean look at times. But I'm also finding I view it with a certain unease: what's hiding in that space?

This feeling comes because the language is doing something I didn't ask it to, and it's justified. Consider the BQN expression a +˝×1 b for a matrix product. If we remove the space then we have 1 b. There's no good rule to say which of the three subjects 1, , and b to strand together. For modifiers like Rank and Depth we'd like stranding to bind more tightly than modifier application, but in order to actually use arguments for these modifiers the modifier application should take precedence. Similar but simpler cases show up more often when binding an argument to a function. The difference between the following two statements is obvious in BQN, but with space-for-stranding one of them would require a complicating parenthesis.

↗️
    3 1+× 5
@@ -213,7 +213,7 @@
 

Explicit stranding is also more general, because it applies equally to elements of any role. 2+3 is a perfectly fine list in BQN—maybe it's part of an AST—while 2 + 3 is clearly not a list. J and K restrict their stranding even further, to numbers only. It does mean that issues with stranding show up in fewer cases, but it also means that changing one element of a list from a constant to a variable requires rewriting the whole list.

Why can't the more explicit list notation a,b,c drop the separators? This is also largely for reasons of generality, which is even more important given that ⟨⟩ is the more general-purpose list notation. Writing ÷,-,4 without the , won't go well. For something like 2×c,b-1, maybe the interpreter could sort it out but it would be pretty confusing. Pretty soon you're going through the list character by character trying to figure out which space is actually a separator. And cursing, probably.

Fortunately, I find that after a reasonable period of adjustment typing ligatures instead of spaces doesn't feel strange, and reading code is improved overall by the more explicit notation. A minor note is that lists of literal numbers, where APL-style stranding is best, tend to show up more in the snippets that beginners write to test out the language than in programs even in the tens of lines. So this issue sticks out in first experiences with BQN, but will probably come up less later on.

-

Array notation?

+

Array notation?

BQN has literal notation for lists only right now. To get an array with rank other than 1, either reshape a list, or merge a list of arrays:

↗️
    2  2,3, 4,1, 0,5
 ┌─     
diff --git a/docs/doc/assert.html b/docs/doc/assert.html
index 7630f925..2e112283 100644
--- a/docs/doc/assert.html
+++ b/docs/doc/assert.html
@@ -4,7 +4,7 @@
   BQN: Assert
 
 
-

Assert

+

Assert

BQN takes the position that errors exist to indicate exceptional conditions that the developer of a given program didn't expect. However, the types of errors that BQN naturally checks for, such as mismatched shapes in Couple (), aren't always enough to detect exceptional conditions. Issues like numeric values that don't make physical sense will slip right through. BQN makes it easy for a programmer to check for these sorts of problems by building in the primitive Assert, written !. This function checks whether 𝕩 matches 1: if it does, then it does nothing and returns 𝕩, and otherwise it gives an error.

↗️
    ! 2=2  # Passed
 1
@@ -23,7 +23,7 @@ ERROR
     ,"abc",˜ ! '0'  # Okay this is not a very helpful printout
 ERROR
 
-

Computing the error message on demand

+

Computing the error message on demand

Because the left argument to a function is always computed before the function is called, Assert doesn't let you compute the error message only if there's an error. This might be a problem if the error message computation is slow or has side effects. There are a few ways to work around the issue:

  • Handle errors with ordinary if-then logic (perhaps using control structures). This is probably the best path for user-facing applications where displaying an error goes through the user interface.
  • diff --git a/docs/doc/based.html b/docs/doc/based.html index 6fce0526..7974f1d5 100644 --- a/docs/doc/based.html +++ b/docs/doc/based.html @@ -4,12 +4,12 @@ BQN: Based array theory -

    Based array theory

    +

    Based array theory

    "Like a normal programming language"

    This page explains how BQN's array model (christened "based" in 1981) differs from the models used by existing APL dialects, and why the choice was made to discard APL's "everything is an array" dictum. If you're not wondering what the difference is, and don't think everything should be an array, then you can probably just read about BQN's type system instead.

    If you're an array programmer then I have bad news for you. My thesis here is that APL took a wrong turn around 1981 when it extrapolated the excellent, but limited, flat array model of APL\360 to the ill-founded nested array model and the rigorous but clumsy boxed array model. Make that two wrong turns, I guess. Simultaneously. Anyway, if you've been brought up in either of these array models, then the best thing to do when starting BQN is to throw out your existing ideas about array depth and nesting (but don't worry too much: the fundamental concept of an array as a rectangular collection of data still holds!). If you'd like to ponder the relationship of BQN to APL later, that's great, but trying to initially understand BQN in terms of APL or J will just cause confusion.

    -

    Starting from atoms

    +

    Starting from atoms

    APL tends to define its data by starting with the array and then looking downwards in depth at what it contains. The based array model, as the name suggests, starts at the foundations, which in BQN are called "atoms". There are six types of atom, which together with the array type give the seven types a value can have in BQN. Based means being yourself, and an atom's not an array.

    An atom has depth 0, and doesn't inherently have a shape. However, primitives that expect an array promote atoms by enclosing them to get a rank-0, or unit, array that contains the atom (any value can be enclosed in this way, giving a unit array with higher depth, but it only happens automatically for atoms). Rank and shape both do this, so an atom can be considered to have the same dimensions as a unit array: rank 0 and shape ⟨⟩. An atom is also considered a kind of unit, but it's not a unit array.

    Atoms are displayed as plain values, while enclosed atoms, that is, depth-1 unit arrays, are shown with an array display.

    @@ -39,14 +39,14 @@ · 3 ┘
-

Building arrays

+

Building arrays

Arrays in BQN, like nearly all data structures in modern programming languages, are an inductive type. That means that an array can be constructed from existing values, but can't contain itself (including recursively: an array always has finite depth). To construct the type of all BQN values inductively, we would say that atoms form the base case, and arrays are an inductive case: an array is a shaped collection of existing BQN values. For an array programmer, this is of course the easy part.

-

Versus the nested array model

+

Versus the nested array model

The nested array model of NARS, APL2, Dyalog, and GNU APL can be constructed from the based model by adding a rule: a unit (or "scalar" in APL) array containing an atom is equivalent to that atom. The equivalents of atoms in nested array theory are thus called "simple scalars", and they are considered arrays but share the characteristics of BQN atoms. Nested arrays don't form an inductive type, because simple scalars contain themselves.

Nested array theory can seem simpler to use, because the programmer never has to worry about simple scalars being enclosed the wrong number of times: all these encloses have been identified with each other. For example, 'abcd'[2] returns a character while BQN's 2"abcd" returns an array containing a character. However, these issues usually still appear with more complex arrays: 'ab' 1 'ef'[2] (here spaces are used for stranding) is not a string but an enclosed string!

A property that might warn about dangerous issues like this is that nested array theory tends to create inversions where the depth of a particular array depends on its rank (reversing the normal hierarchy of depth→rank→shape). A 1-character string has depth 1, but when its rank is reduced to 0, its depth is reduced as well.

In some cases nested array theory can remove a depth issue entirely, and not just partially. Most notable is the search function result depth issue, in which it's impossible for a search function in BQN to return an atomic number because it always returns an array. Nested array theory doesn't have this issue since a scalar number is "just a number", and more complicated arrays can't cause problems because a search function's result is always a numeric array. The other half of the problem, about the non-principal argument depth, is only partly hidden, and causes problems for example when searching for a single string out of a list of strings.

-

Versus the boxed array model

+

Versus the boxed array model

The boxed array model of SHARP APL, A+, and J is an inductive system like BQN's. But this model uses arrays as the base case: numeric and character arrays are the simplest kind of data allowed, and "a number" means a rank-0 numeric array. The inductive step is the array of boxes; as with numbers "a box" is simply a rank-0 array of boxes.

Numeric and character arrays in this system have depth 0. In general these correspond to arrays of depth 1 in BQN, but because there's no lower depth they are also used where BQN atoms would appear. For example, both Shape ($) and Length (#) return depth-0 results in J. For an array a with rank at least 1, the length #a is exactly [/ $ a, while the identical BQN code ˝ a returns not a but < a. Like the nested model, the boxed model can hide depth issues that occur at lower depths but generally reveals them at higher depths.

The boundary at depth 0 will tend to cause inconsistencies and confusion in any array language, and boxed array languages push this boundary up a level. This leads to the programmer spending more effort managing boxes: for example, to reverse each list in a list of lists, the programmer can use reverse under open, |. &. >. But to find the lengths of each of these lists, # &. > would yield a boxed list, which is usually not wanted, so # @ > is needed instead. BQN shows that a system that doesn't require these distinctions is possible, as a BQN programmer would use ¨ and ¨.

diff --git a/docs/doc/block.html b/docs/doc/block.html index 82547a15..a3fa4397 100644 --- a/docs/doc/block.html +++ b/docs/doc/block.html @@ -4,7 +4,7 @@ BQN: Blocks -

Blocks

+

Blocks

In BQN, a block is any piece of code surrounded with curly braces {}. Blocks can be used simply to group statements, or can define functions or modifiers. They are the sole large-scale structure used to organize programs.

Blocks are most commonly used to define functions by including one of the special names for arguments, 𝕨 or 𝕩. With the operands 𝔽 or 𝔾, they can also define 1-modifiers or 2-modifiers.

↗️
    {𝕩+1} 3
@@ -17,7 +17,7 @@
     { a"inner"  ab }
 ⟨ "inner" "outer" ⟩
 
-

Headerless blocks

+

Headerless blocks

In the simplest case a block is just a list of statements, which are executed to evaluate the block. A block with no special names like 𝕨 or 𝕩 is called an immediate block, and is evaluated as soon as it is reached. The only think such a block does is group some statements, and create a scope for them so that definitions made there are discarded when the block finishes. Even this small amount of functionality could be useful; as an example the following program can build up an array from named components without polluting the rest of the program with those names.

↗️
    updown  { up5  downup  updown }
     updown
@@ -66,7 +66,7 @@
 
 

Of these, 𝕣 is sort of a "more special" character, as we'll discuss below. Except for 𝕣, every special name is a single character and can't have underscores added to spell it as a modifier, allowing a modifier to be applied to a special name with no spacing as in 𝕗_m, something that can't be done with ordinary names.

-

Arguments

+

Arguments

The names 𝕨 and 𝕩, and their uppercase spellings, represent function arguments. As the argument to a function is typically data, it's more common to use the lowercase forms for these. Either of these names will turn an immediate block into a function (or an immediate modifier into a deferred one; see the next section). Instead of being evaluated as soon as it appears in the source, a function is evaluated when it's called, with the special names set to appropriate values. Unlike in Dyalog APL's dfns, their values can be changed like ordinary variables.

↗️
    {'c'=𝕩} "abcd"
 ⟨ 0 0 1 0 ⟩
@@ -97,7 +97,7 @@
 143.4131591025766
 

Called dyadically, this function will expand to (𝕨)-𝕩, so we might expect the monadic result to be -𝕩. This sort of expansion isn't right with · on the left. - taken as a whole is a function, so · - 𝕩 is just - 𝕩, or (𝕩)-𝕩, giving the large result seen above.

-

Operands

+

Operands

The special names 𝔽 and 𝔾, and their lowercase forms, represent operands. Since operands are more often functions, they're typically shown with the uppercase spelling. If 𝔽 is present in a block then it defines a 1-modifier or 2-modifier depending on whether 𝔾 is present; if 𝔾 is there it's always a 2-modifier.

↗️
    4 {ט𝕗}
 16
@@ -116,7 +116,7 @@
 ⟨ ⟨ 2 ⟩ ¯5 ⟩
 

The distinction between an immediate and deferred modifier only matters inside the braces. Once defined, the object is simply a modifier that can be called on operands to return a result. For a deferred modifier this result will always be a function; for an immediate modifier it could be anything.

-

Self-reference

+

Self-reference

If a block is assigned a name after it is created, this name can be used for recursion:

↗️
    Fact  { 𝕩 × (0<)1Fact 𝕩-1 }
     Fact 7
@@ -134,7 +134,7 @@
 5040
 

Because 𝕣 only ever refers to a 1-modifier or 2-modifer, it can never make sense to refer to it as a function, and the uppercase letter is not recognized by BQN. In order to allow 𝕣 to be spelled as a 1-modifier _𝕣 or 2-modifier _𝕣_, it is treated as an ordinary identifier character, so it must be separated from letters or numbers by spaces.

-

Block headers

+

Block headers

As a program becomes larger, it often becomes necessary to name inputs to blocks rather than just using special names. It can also become difficult to identify what kind of block is being defined, as it requires scanning through the block for special names. A block header, which is separated from the body of a block by a colon :, specifies the kind of block and can declare names for the block and its inputs. Its syntax mirrors an application of the block. As suggested by the positioning, the names given in a header apply only inside the block.

# A dyadic function called Func
 { l Func r:
@@ -160,16 +160,16 @@
   
 

Unlike these assignments, the header also constrains what inputs the block can take: a monadic 1-modifier like the one above can't take a right operand or left argument, and consequently its body can't contain 𝔾 or 𝕨. Calling it with a left argument, or a right argument that isn't a two-element list, will result in an error.

-

Destructuring

+

Destructuring

Arguments, but not operands, allow destructuring like assignment does. While assignment only tolerates lists of variables, header destructuring also allows constants. The argument must match the given structure, including the constants where they appear, or an error results.

↗️
    Destruct  { 𝕊 a1b,2: ab }
     Destruct       517,2
 ⟨ 5 7 ⟩
 
-

Special names in headers

+

Special names in headers

Any element of a function or modifier header can be left nameless by using the corresponding special name in that position, instead of an identifier. For example, the header 𝕨 𝔽_𝕣_𝔾 𝕩: incorporates as much vagueness as possible. It indicates a deferred 2-modifier, but provides no other information.

The name 𝕨 in this context can refer to either a left argument or no left argument, allowing a header with arguments to be used even for an ambiguous function. Recall that 𝕨 is the only token other than · that can have no value. If an identifier or list is given as the left argument, then the function must be called with a left argument.

-

Short headers

+

Short headers

A header does not need to include all inputs, as shown by the F _op_ val: header above. The simplest case, when no inputs are given, is called a label. While it doesn't restrict the inputs, a label specifies the type of the block and gives an internal name that can be used to refer to it.

{ b:   # Block
 { 𝕊:   # Function
@@ -177,7 +177,7 @@
 { _𝕣_: # 2-Modifier
 

For immediate blocks, this is the only type of header possible, and it must use an identifier as there is no applicable special name. However, this name can't be used, except for returns: it doesn't make sense to refer to a value while it is still being computed!

-

Multiple bodies

+

Multiple bodies

Blocks that define functions and deferred modifiers can include more than one body, separated by semicolons ;. The body used for a particular evaluation is chosen based on the arguments the the block. One special case applies when there are exactly two bodies either without headers or with labels only: in this case, the first applies when there is one argument and the second when there are two.

↗️
    Ambiv  { 1,𝕩 ; 2,𝕨,𝕩 }
     Ambiv 'a'
@@ -198,7 +198,7 @@
 ↗️
    3 CaseAdd 3
 ERROR
 
-

Case headers

+

Case headers

A special rule allows for convenient case-matching syntax for one-argument functions. In any function header with one argument, the function name can be omitted as long as the argument is not a plain identifier—it must be 𝕩 or a compound value like a list to distinguish it from an immediate block label.

Test  {
   "abc": "string"
@@ -208,6 +208,6 @@ ERROR
 }
 

These case-style headers function exactly the same as if they were preceded by 𝕊, and can be mixed with other kinds of headers.

-

Returns

+

Returns

This feature is not yet included in any BQN implementation.

The glyph indicates an early return from a block. It must be preceded either by one of the self-reference special names 𝕊 or 𝕣 or by an internal name for a containing block. The combination of name and return token—like F, let's say—is a function that returns from the current instance of the indicated block. If that instance has already returned, then it instead results in an error.

diff --git a/docs/doc/context.html b/docs/doc/context.html index 35705a7e..b907956f 100644 --- a/docs/doc/context.html +++ b/docs/doc/context.html @@ -4,7 +4,7 @@ BQN's context-free grammar -

BQN's context-free grammar

+

BQN's context-free grammar

APL has a problem. To illustrate, let's look at an APL expression:

a b c d e
 
@@ -16,12 +16,12 @@
a B C _d e
 

Here, the lowercase spelling indicates that a and e are to be treated as subjects ("arrays" in APL) while the uppercase spelling of variables B and C are used as functions and _d is a 1-modifier ("monadic operator"). Like parentheses for function application, the spelling is not inherent to the variable values used, but instead indicates their grammatical role in this particular expression. A variable has no inherent spelling and can be used in any role, so the names a, A, _a, and _a_ all refer to exact same variable, but in different roles; typically we use the lowercase name to refer to the variable in isolation—all values are nouns when speaking about them in English. While we still don't know anything about what values a, b, c, and so on have, we know how they interact in the line of code above.

-

Is grammatical context really a problem?

+

Is grammatical context really a problem?

Yes, in the sense of problems with BQN. A grammar that uses context is harder for humans to read and machines to execute. A particular difficulty is that parts of an expression you don't yet understand can interfere with parts you do, making it difficult to work through an unknown codebase.

One difficulty beginners to APL will encounter is that code in APL at first appears like a string of undifferentiated symbols. For example, a tacit Unique Mask implementation ⍳⍨= consists of six largely unfamiliar characters with little to distinguish them (in fact, the one obvious bit of structure, the repeated , is misleading as it means different things in each case!). Simply placing parentheses into the expression, like (⍳⍨)=(), can be a great help to a beginner, and part of learning APL is to naturally see where the parentheses should go. The equivalent BQN expression, ˜=↕, will likely appear equally intimidating at first, but the path to learning which things apply to which is much shorter: rather than learning the entire list of APL primitives, a beginner just needs to know that superscript characters like ˜ are 1-modifiers and characters like with unbroken circles are 2-modifiers before beginning to learn the BQN grammar that will explain how to tie the various parts together.

This sounds like a distant concern to a master of APL or a computer that has no difficulty memorizing a few dozen glyphs. Quite the opposite: the same concern applies to variables whenever you begin work with an unfamiliar codebase! Many APL programmers even enforce variable name conventions to ensure they know the class of a variable. By having such a system built in, BQN keeps you from having to rely on programmers following a style guide, and also allows greater flexibility, including functional programming, as we'll see later.

Shouldn't a codebase define all the variables it uses, so we can see their class from the definition? Not always: consider that in a language with libraries, code might be imported from dependencies. Many APLs also have some dynamic features that can allow a variable to have more than one class, such as the pattern in a dfn that makes an array in the dyadic case but a function in the monadic case. Regardless, searching for a definition somewhere in the code is certainly a lot more work than knowing the class just from looking! One final difficulty is that even one unknown can delay understanding of an entire expression. Suppose in A B c, B is a function and c is an array, and both values are known to be constant. If A is known to be a function (even if its value is not yet known), its right argument B c can be evaluated ahead of time. But if A's type isn't known, it's impossible to know if this optimization is worth it, because if it is an array, B will instead be called dyadically.

-

BQN's spelling system

+

BQN's spelling system

BQN's expression grammar is a simplified version of the typical APL, removing some oddities like niladic functions and the two-glyph Outer Product operator. Every value can be used in any of four syntactic roles:

@@ -58,7 +58,7 @@

BQN's variables use another system, where the spelling indicates how the variable's value is used. A variable spelled with a lowercase first letter, like var, is a subject. Spelled with an uppercase first letter, like Var, it is a function. Underscores are placed where operands apply to indicate a 1-modifier _var or 2-modifier _var_. Other than the first letter or underscore, variables are case-insensitive.

The associations between spelling and syntactic role are considered part of BQN's token formation rules.

One rule for typing is also best considered to be a pre-parsing rule like the spelling system: the role of a brace construct {} with no header is determined by which special arguments it uses: it's a subject if there are none, but a 𝕨 or 𝕩 makes it at least a function, an 𝔽 makes it a 1- or 2-modifier, and a 𝔾 always makes it a 2-modifier.

-

BQN's grammar

+

BQN's grammar

A formal treatment is included in the spec. BQN's grammar—the ways syntactic roles interact—follows the original APL model (plus trains) closely, with allowances for new features like list notation. In order to keep BQN's syntax context-free, the syntactic role of any expression must be known from its contents, just like tokens.

Here is a table of the APL-derived modifier and function application rules:

@@ -133,7 +133,7 @@

A function with an asterisk indicates that a subject can also be used: in these positions there is no difference between function and subject spellings. Modifier applications bind more tightly than functions, and associate left-to-right while functions associate right-to-left.

BQN lists can be written with angle brackets elt0,elt1, or ligatures elt0elt1. In either case the elements can have any type, and the result is a subject.

The statements in a block can also be any role, including the return value at the end. These roles have no effect: outside of braces, an immediate block is a subject, a function always returns a subject, and a modifier always returns a function, regardless of how these objects were defined.

-

Mixing roles

+

Mixing roles

BQN's value types align closely with its syntactic roles: functions, 1-modifiers, and 2-modifiers are all types (operation types) as well as roles, while the other types (data types) are split into numbers, characters, and arrays. This is no accident, and usually values will be used in roles that correspond to their underlying type. However, the ability to use a role that doesn't match the type is also useful.

Any type can be passed as an argument to a function, or as an operand, by treating it as a subject. This means that BQN fully supports Lisp-style functional programming, where functions can be used as first-class entities.

It can also be useful to treat a value of a data type as a function, in which case it applies as a constant function. This rule is useful with most built-in modifiers. For example, F1 uses a constant for the rank even though in general a function can be given, and if a is an array then a(b/) inserts the values in a into the positions selected by b, ignoring the old values rather than applying a function to them.

diff --git a/docs/doc/control.html b/docs/doc/control.html index ce33ddb6..10a9fc7a 100644 --- a/docs/doc/control.html +++ b/docs/doc/control.html @@ -4,7 +4,7 @@ Control flow in BQN -

Control flow in BQN

+

Control flow in BQN

BQN does not have ALGOL-style control structures. Instead, functional techniques can be used to control when code is evaluated. This page describes how BQN functionality can be used to emulate something more familiar to an imperative programmer.

Control structures here are always functions that act on lists of functions, although alternatives might be presented. This is because stranded functions can be formatted in a very similar way to blocks in curly-brace languages. However, there are many ways to write control flow, including simple operators and a mix of operators and more control-structure-like code. Implementing a control structure rarely takes much code with any method, so there are usually several simple ways to implement a given flow or a variation of it.

The surfeit of ways to write control structures could be a bit of an issue for reading BQN. My hope is that the community can eventually settle on a smaller set of standard forms to recommend so that you won't have to recognize all the variants given here. On the other hand, the cost of using specialized control structures is lower in a large project without too many contributors. In this case BQN's flexibility allows developers to adapt to the project's particular demands (for example, some programs use switch/case statements heavily but most do not).

@@ -21,7 +21,7 @@ Switch{c𝕩ma<˘21𝕩(a⊐C)m@}Test{fn{CA𝕊e:CAE}´𝕩Fn@} -

Blocks and functions

+

Blocks and functions

Control structures are generally defined to work with blocks of code, which they might skip, or execute one or more times. This might sound like a BQN immediate block, which also consists of a sequence of code to execute, but immediate blocks are always executed as soon as they are encountered and can't be manipulated the way that blocks in imperative languages can. They're intended to be used with lexical scoping as a tool for encapsulation. Instead, the main tool we will use to get control structures is the block function.

Using functions as blocks is a little outside their intended purpose, and the fact that they have to be passed an argument and are expected to use it will be a minor annoyance. The following conventions signal a function that ignores its argument and is called purely for the side effects:

@@ -151,7 +151,7 @@ ↗️
    214 <¨ 3452
 ⟨ ⟨ 2 1 4 0 ⟩ ⟨ 2 1 4 1 ⟩ ⟩
 
-

The Depth modifier

+

The Depth modifier

The Depth 2-modifier () is a generalization of Each that allows diving deeper into an array. To illustrate it we'll use a shape 43 array of lists of lists.

↗️
     n  <12 4322⥊↕48
 ┌─                                                                         
diff --git a/docs/doc/embed.html b/docs/doc/embed.html
index e0f48dc2..bdd03679 100644
--- a/docs/doc/embed.html
+++ b/docs/doc/embed.html
@@ -4,10 +4,10 @@
   Using embedded BQN
 
 
-

Using embedded BQN

+

Using embedded BQN

The Javascript implementation of BQN, docs/bqn.js, can be used as a standalone interpreter, but it can also be called from JS, which in combination with BQN's first-class functions allows the two language to interoperate well. Similar functionality will most likely be brought to other host languages in the future. Languages that (like JS) allow functions and arrays to be tagged with extra properties can host a full BQN implementation with good interoperability. Other languages would either require functions and arrays to be stored in specialized data structures, making interoperability a little harder, or would miss out on some inferred properties like function inverses and array fills.

There is only one mechanism to interface between the host language and BQN: the function bqn evaluates a string containing a BQN program and returns the result. Doesn't sound like much, especially considering these programs can't share any state such as global variables (BQN doesn't have those). But taking first-class functions and closures into account, it's all you could ever need!

-

Passing closures

+

Passing closures

Probably you can figure out the easy things like calling bqn("×´1+↕6") to compute six factorial. But how do you get JS and BQN to talk to each other, for example to compute the factorial of a number n? Constructing a source string with bqn("×´1+↕"+n) isn't the best way—in fact I would recommend you never use this strategy.

Instead, return a function from BQN and call it: bqn("{×´1+↕𝕩}")(n). This strategy also has the advantage that you can store the function, so that it will only be compiled once. Define let fact = bqn("{×´1+↕𝕩}"); at the top of your program and use it as a function elsewhere.

BQN can also call JS functions, to use functionality that isn't native to BQN or interact with a program written in JS. For example, bqn("{𝕏'a'+↕26}")(alert) calls the argument alert from within BQN. The displayed output isn't quite right here, because a BQN string is stored as a JS array, not a string. See the next section for more information.

@@ -32,12 +32,12 @@

When defining closures for their side effects like this, make sure they are actually functions! For example, since flip ignores its argument (you can call it with flip(), because a right argument of undefined isn't valid but will just be ignored), it needs an extra 𝕤 in the definition to be a function instead of an immediate block.

You can also use an array to pass multiple functions or other values from JS into BQN all at once. However, a JS array can't be used directly in BQN because its shape isn't known. The function list() converts a JS array into a BQN list by using its length for the shape; the next section has a few more details.

-

JS encodings

+

JS encodings

In the programs above we've used numbers and functions of one argument, which mean the same thing in BQN and JS. This isn't the case for all types: although every BQN value is stored as some JS value, the way it's represented may not be obvious and there are many JS values that don't represent any BQN value and could cause errors. BQN operations don't verify that their inputs are valid BQN values (this would have a large performance cost), so it's up to the JS programmer to make sure that values passed in are valid. To do this, you need to know the encodings for each of the seven BQN types you're going to use.

The two atomic data values are simple: numbers are just JS numbers, and characters are strings containing a single code point. Arrays are JS arrays, but with some extra information. Since JS arrays are 1-dimensional, a BQN array a is stored as the element list a. Its shape a, a list of numbers, is a.sh in JS (the shape isn't necessarily a BQN array so it doesn't have to have a sh property). Optionally, its fill element is a.fill. Note that a BQN string is not a JS string, but instead an array of BQN characters, or JS strings. To convert it to a JS string you can use str.join("").

There are two utilities for converting from JS to BQN data: list([…]) converts a JS array to a BQN list, and str("JS string") converts a string.

Operations are all stored as JS functions, with one or two arguments for the inputs. The type is determined by the .m property, which is 1 for a 1-modifier and 2 for a 2-modifier, and undefined or falsy for a function. Functions might be called with one or two arguments. In either case, 𝕩 is the first argument; 𝕨, if present, is the second. Note that F(x,w) in JS corresponds to w F x in BQN, reversing the visual ordering of the arguments! For modifiers there's no such reversal, as 𝕗 is always the first argument, and for 2-modifiers 𝕘 is the second argument. As in BQN, a modifier may or may not return a function.

Operations may have some extra properties set that aren't terribly important for the JS programmer: for each primitive p, p.glyph gives its glyph, and for a compound operation o such as a train, or a modifier with bound operands, o.repr() decomposes it into its parts. It wouldn't make sense to define either of these properties for a function created in JS.

-

Other functionality

+

Other functionality

The BQN script also contains the function fmt, which takes a BQN value for its argument and returns a string displaying it.

The expression diagram builder used for the REPL's "Explain" button isn't included in the main BQN script: it's kept in docs/repl.js instead. It's a little tricky to use as it takes both the source string and the bytecode obtained from compiling the source: see the surrounding Javascript code for an example. The result is the source code for an svg, as a BQN string.

diff --git a/docs/doc/enclose.html b/docs/doc/enclose.html index 2be9bddb..aac48d68 100644 --- a/docs/doc/enclose.html +++ b/docs/doc/enclose.html @@ -4,7 +4,7 @@ BQN: Enclose -

Enclose

+

Enclose

The function enclose creates a unit array whose only element is 𝕩.

↗️
    < "xyz"
 ┌·       
@@ -13,7 +13,7 @@
 

If you understand the concept of a unit array, then that definition almost certainly made sense to you. Therefore the remainder of this document will explain what a unit array is, what it isn't, and why you would use it.

If you're familiar with the Enclose or Box function from APL or J (but particularly APL), then it's possible you understand the concept of a unit array wrongly, or at least, not in the same way BQN uses it. A difference from APL is that <𝕩 is never the same as 𝕩. I recommend reading about based array theory if you haven't already.

-

What's a unit?

+

What's a unit?

A unit array is an array with no axes: that is, it has rank 0 and its shape is the empty list. The array itself isn't empty though. The number of elements is the product of the shape, which is 1.

↗️
        <"anything"   # empty shape
 ⟨⟩
@@ -45,9 +45,9 @@ ERROR
 

The function +´˘ tries to mix together the result elements into one big array, causing an error because they have different lengths, but +˝˘ keeps them as elements.

One strained example probably isn't all that compelling. And it doesn't explain why you'd use Enclose, which doesn't remove an axis from an existing array but creates a whole new unit array. So…

-

Why create a unit?

+

Why create a unit?

Why indeed?

-

Table of combinations

+

Table of combinations

Let's take a look at the following program, which uses Table () to create an array of combinations—every possibility from three sets of choices. It uses Enclose not once but twice.

↗️
    (<⟨⟩) <⌜´ """anti", "red""blue""green", "up""down"
 ┌─                                                   
@@ -112,7 +112,7 @@ ERROR
   ⟨ "green" "down" "quark" ⟩ ⟨ "green" "strange" "quark" ⟩ ⟨ "green" "bottom" "quark" ⟩  
                                                                                         ┘
 
-

Broadcasting

+

Broadcasting

Table isn't the only mapping function that gets along well with units. Here's an example with Each (¨).

↗️
    = {𝕎𝕩}¨ < 32"abcdef"
 ⟨ 2 3 1 ⟨ 3 2 ⟩ ⟩
@@ -125,7 +125,7 @@ ERROR
   ⟨ 14 ¯9 ⟩ ⟨ 15 ¯6 ⟩  
                       ┘
 
-

Coda

+

Coda

Perhaps you feel bludgeoned rather than convinced at this point. Unit arrays are useful, sure, but aren't they ugly? Aren't they a hack?

The practical answer is that I think you should use them anyway. You'll probably come to appreciate the use of Enclose and how it can help you produce working, reliable code, making you a more effective BQN programmer.

On the theoretical side, it's important to realize that units are just a consequence of having multidimensional arrays. Array languages come with units be default, so that "adding" them is not really a complication, it's a simplification. It's natural to not feel quite right around these sorts of non-things, because zero is a pretty special number—being among other things the only number of paddles you can have and still not be able to go anywhere in your canoe. In my opinion the right response is to understand why they are special but also why they fit in as part of the system, so you can be in control instead of worrying.

diff --git a/docs/doc/expression.html b/docs/doc/expression.html index 20eb896e..614052fa 100644 --- a/docs/doc/expression.html +++ b/docs/doc/expression.html @@ -4,10 +4,10 @@ BQN: Expression syntax -

Expression syntax

+

Expression syntax

BQN expressions are the part of syntax that describes computations to perform. Programs are mainly made up of expressions with a little organizing material like blocks and namespaces around them. This page explains how functions, modifiers, and assignment combine with their inputs. It doesn't describe constant and array literals, which each form a single subject for grammatical purposes.

The first tutorial also covers how to build and read BQN expressions.

-

Overview

+

Overview

BQN expressions consist of subjects, functions, and modifiers arranged in sequence, with parentheses to group parts into subexpressions. Assignment arrows and can also be present and mostly behave similar to functions. Functions can be applied to subjects or grouped into trains, while modifiers can be applied to subjects or functions. The most important kinds of application are:

@@ -57,11 +57,11 @@

The four roles (subject, function, two kinds of modifier) describe expressions, not values. When an expression is evaluated, the value's type doesn't have to correspond to its role, and can even change from one evaluation to another. An expression's role is determined entirely by its source code, so it's fixed.

If you're comfortable reading BNF and want to understand things in more detail than described below, you might check the grammar specification as well.

-

Syntactic role

+

Syntactic role

This issue is approached from a different angle in Context free grammar.

In APL, the way one part of an expression interacts with others is determined by its value. That means that to parse an expression, in general you would have to evaluate that part, get a value, check its type, and then figure out how it fits in with the rest of the expression. This is a lot of work. BQN changes things so that you can determine how to parse an expression just by looking at its source code. But because it still needs to support expressions that can evaluate to more than one possible type, BQN has to introduce a new and independent concept, called syntactic role, in order to support APL-like expressions.

Syntactic role is a property of an expression, not its value. To describe it in terms of English grammar, you might say "I like BQN", using "BQN" as an object, or "BQN scares me", using it as a subject. BQN itself isn't a subject or object, it's a programming language. Similarly you might write F g, placing f in a function role to apply it to g, or G f to use f as an argument. Maybe even in the same program, although it's unlikely.

-

Role spellings

+

Role spellings

The four roles are subject, function, 1-modifier, and 2-modifier, as shown in the table below. Each type has an associated role (with non-operation types all corresponding to subjects), and the value of an expression will often have a matching type, but it doesn't have to.

@@ -98,7 +98,7 @@

Variable names can be written in any case and with underscores added, and these changes don't affect what identifier the name refers to. ab, aB, AB, and _a_B_ are all the same variable. However, the spelling—specifically the first and last characters—determine the variable's role. A lowercase first letter indicates a subject, and an uppercase first letter makes it a function. A leading underscore (regardless of the following character) indicates a 1-modifier, and both leading and trailing underscores makes a 2-modifier.

Besides these, character, string, and list literals always have a subject role, and the role of a block is determined by its type, which depends either on the header it has or which special variables it uses.

The role of a compound expression, formed by applying an operation to some inputs, depends on the operation applied. This system is discussed in the remaining sections below.

-

Kinds of application

+

Kinds of application

Here is a table of the modifier and function application rules:

diff --git a/docs/doc/fill.html b/docs/doc/fill.html index b4ac35ef..e350fa6e 100644 --- a/docs/doc/fill.html +++ b/docs/doc/fill.html @@ -4,10 +4,10 @@ BQN: Fill elements -

Fill elements

+

Fill elements

A few array operations need an array element to use when no existing element applies. BQN tries to maintain a "default" element for every array, known as a fill element, for this purpose. If it's known, the fill element is a nested array structure where each atom is either 0 or ' '. If no fill is known, a function that requests it results in an error.

Fills are used by Take () when a value in 𝕨 is larger than the corresponding length in 𝕩, by the two Nudge functions (»«) when 𝕩 is non-empty, by Merge (>) when 𝕩 is empty, and by Reshape () when 𝕨 contains . Except for these specific cases, the fill value an array has can't affect the program. The result of Match () doesn't depend on fills, and any attempt to compute a fill can't cause side effects.

-

Using fills

+

Using fills

For the examples in this section we'll use the fact that an all-number array usually has 0 as a fill while a string has ' ' (BQN maintains fills alongside array values rather than deriving them from arrays, so it's possible to construct arrays where this isn't true, but this probably wouldn't happen in ordinary code).

Take () and Nudge (»«) in either direction use the fill for padding, to extend the array past its boundary. For example, 𝕨𝕩 will add elements to one side when a number in |𝕨 is larger than the corresponding length in 𝕩.

↗️
    ¯7  43     # Fill with 0
@@ -38,7 +38,7 @@
 ↗️
    ⊑»1↑⥊"string"
 ' '
 
-

How fills are computed

+

How fills are computed

For the exact requirements placed on fill, see the specification (particularly "required functions"). This section loosely describes behavior in existing BQN implementations, and includes some parts that aren't required in the specification.

A fill element should encompass something that's necessarily true for all elements of an array. If the way an array is computed implies it's all numbers, the fill should be 0. If every element is a list of two numbers, then the fill should be 0,0. If every element is a list but the lengths might vary, ⟨⟩ is probably a reasonable fill element.

For arithmetic primitives, the fill is found by the rules of pervasion, applying the function to both argument fills. Generally this means it consists of 0, but character arithmetic also allows space fills.

diff --git a/docs/doc/find.html b/docs/doc/find.html index 689d8415..85f26297 100644 --- a/docs/doc/find.html +++ b/docs/doc/find.html @@ -4,7 +4,7 @@ BQN: Find -

Find

+

Find

Find () searches for occurrences of an array 𝕨 within 𝕩. The result contains a boolean for each possible location, which is 1 if 𝕨 was found there and 0 if not.

↗️
    "xx"  "xxbdxxxcx"
 ⟨ 1 0 0 0 1 1 0 0 ⟩
@@ -43,7 +43,7 @@ ERROR
 0
 

This pattern also works in the high-rank case discussed below, testing whether 𝕨 is a multi-dimensional prefix starting at the lowest-index corner of 𝕩.

-

Higher ranks

+

Higher ranks

If 𝕨 and 𝕩 are two-dimensional then Find does a two-dimensional search. The cells used are also found in 𝕨𝕩. For example, the bottom-right corner of 𝕩 below matches 𝕨, so there's a 1 in the bottom-right corner of the result.

↗️
     a  7 (4|⋆˜) 9   # Array with patterns
 ┌─                   
diff --git a/docs/doc/fold.html b/docs/doc/fold.html
index 2890eb4d..b9dddf8b 100644
--- a/docs/doc/fold.html
+++ b/docs/doc/fold.html
@@ -4,7 +4,7 @@
   BQN: Fold and Insert
 
 
-

Fold and Insert

+

Fold and Insert

@@ -43,7 +43,7 @@

The closely related 1-modifiers Fold (´) and Insert (˝) apply a dyadic operand function 𝔽 repeatedly between elements or major cells of 𝕩. Neither is quite like the APL2-style Reduce operator (/ or in APL), although I sometimes use the term "reduction" to mean either Fold or Insert. There are a bunch of other names like "accumulate" and "aggregate" for this class of calculations—I don't know which is best but I know "catamorphism" is worst.

A distinguishing feature of APL-family reductions is that they don't use an initial value, and try to derive an "identity element" from the operand if the argument array is empty. BQN retains this capability but also allows the programmer to supply an initial value as 𝕨.

-

Fold

+

Fold

As its glyph suggests, Fold is slightly simpler than Insert. The argument 𝕩 must always be a list, and Fold applies 𝔽 between elements—always two at a time—of the list to yield a single result value. In this sense, 𝔽´ removes a layer of depth from 𝕩, although it's not necessarily true that the depth of 𝔽´𝕩 is less than that of 𝕩 because the function 𝔽 might increase depth.

↗️
    +´ 2431
 10
@@ -64,7 +64,7 @@
     ´ 110
 1
 
-

Identity values

+

Identity values

Folding over a list of length 1 never calls the operand function: it returns the lone element unchanged.

↗️
    !´ 
 ⎊
@@ -132,7 +132,7 @@
 
 
 
-

Right-to-left

+

Right-to-left

The functions we've shown so far are associative (ignoring floating point imprecision), meaning it's equally valid to combine elements of the argument list in any order. But it can be useful to fold using a non-associative function. In this case you must know that Fold performs a right fold, starting from the end of the array and working towards the beginning.

↗️
    <´ "abcd"
 ⟨ 'a' ⟨ 'b' "cd" ⟩ ⟩
@@ -153,7 +153,7 @@
 ↗️
    +÷´ 21211411
 2.7183098591549295
 
-

Initial element

+

Initial element

When the operand isn't just an arithmetic primitive, folding with no initial element can be dangerous. Even if you know 𝕩 isn't empty, saving you from an "Identity not found" error, the case with only one element can easily violate expectations. Here's a somewhat silly example of a function meant to merge elements of the argument into a single list (∾⥊¨ is a much better way to do this):

↗️
    ´ 2468,"abcd",0
 ⟨ 2 4 6 8 'a' 'b' 'c' 'd' 0 ⟩
@@ -180,7 +180,7 @@
 ↗️
    "STOP" ´ "ABCDE""012""abcd"
 "EDCBA210dcbaSTOP"
 
-

Insert

+

Insert

Fold only works on lists. What if you want to, say, sum the columns of a table?

↗️
     tab  (2+↕5) | 9+↕3
 ┌─       
@@ -241,7 +241,7 @@
 ⟨ 0 4 ⟩
 

As a historical note, Insert is named after J's adverb /, which comes from SHARP APL's , reduce-down. In the original APL, only arithmetic reductions were defined, and nested arrays didn't exist—arrays were either all characters or all numbers. SHARP extended them by splitting the array into cells as we've shown. However, there's another interpretation, which is what you'll find in mainstream APLs today…

-

APL2 reduction?

+

APL2 reduction?

If you try an expression like ⍪⌿ in Dyalog APL, you'll get results very different from BQN's ˝. Instead of combining the cells like we see above, APL applies the function on pairs of elements much like Fold. The difference is that, because reduction happens only along one axis but an array might have other axes, there can be multiple values in the result, so that it will always be an array like the argument. BQN can perform this operation as well: ⍪⌿ is written ¨˝ in BQN.

↗️
    ¨˝ tab
 ⟨ ⟨ 1 0 1 4 3 ⟩ ⟨ 0 1 2 0 4 ⟩ ⟨ 1 2 3 1 5 ⟩ ⟩
diff --git a/docs/doc/fromDyalog.html b/docs/doc/fromDyalog.html
index 1662a543..84928cca 100644
--- a/docs/doc/fromDyalog.html
+++ b/docs/doc/fromDyalog.html
@@ -4,10 +4,10 @@
   BQN–Dyalog APL dictionary
 
 
-

BQN–Dyalog APL dictionary

+

BQN–Dyalog APL dictionary

A few tables to help users of Dyalog APL (or similar) get started quickly on BQN. Here we assume ML is 1 for Dyalog.

-

Terminology

-

Array model

+

Terminology

+

Array model

BQN uses the based array model, so that a Dyalog simple scalar corresponds to many BQN values: an atom, its enclose, and so on.

@@ -37,7 +37,7 @@

BQN shares the terms "cell" and "major cell" with Dyalog, and uses "element" (which may mean different things to different Dyalog users) not for a 0-cell but for the value it contains.

-

Roles

+

Roles

Dyalog uses value types (array, function, and so on) to determine syntax while BQN uses a separate concept called syntactic roles. See context-free grammar.

@@ -69,7 +69,7 @@
-

Syntax

+

Syntax

BQN comments are written with #, not . BQN strings use double quotes "" while single quotes '' enclose a character.

BQN's functions use {}, like Dyalog's dfns. The names for inputs and self-reference are different:

@@ -109,7 +109,7 @@

BQN doesn't have guards: it uses modifiers or control structures instead. However, BQN function and modifier blocks have headers that allow pattern matching. See the block documentation.

The assignment arrow defines a new variable in a block, while modifies an existing one.

BQN uses the ligature character for stranding, instead of plain juxtaposition. It also has a list notation using ⟨⟩.

-

For reading

+

For reading

Here are some closest equivalents in Dyalog APL for the BQN functions that don't use the same glyphs as APL. Correspondence can be approximate, and is just used as a decorator to mean "reverse some things".

@@ -272,7 +272,7 @@

In BQN is Rank and is Atop. Dyalog's Atop () and Over () were added in version 18.0.

-

For writing

+

For writing

The tables below give approximate implementations of Dyalog primitives for the ones that aren't the same. First- and last-axis pairs are also mostly omitted. BQN just has the first-axis form, and you can get the last-axis form with 1.

The form FG (Power with a function right operand; Power limit) must be implemented with recursion instead of primitives because it performs unbounded iteration. The modifier _while_ {𝔽𝔾𝔽_𝕣_𝔾𝔽𝔾𝕩} provides similar functionality without risk of stack overflow. It's discussed here and called as Fn _while_ Cond arg.

diff --git a/docs/doc/fromJ.html b/docs/doc/fromJ.html index 46d312be..10dff0d7 100644 --- a/docs/doc/fromJ.html +++ b/docs/doc/fromJ.html @@ -4,15 +4,15 @@ BQN–J dictionary -

BQN–J dictionary

+

BQN–J dictionary

A guide to help users of J get up to speed with BQN quickly.

-

Terminology

-

Array model

+

Terminology

+

Array model

BQN uses the based array model, which is fundamentally different from J's flat array model. BQN uses non-array values such as characters and numbers, called "atoms", while in J every noun is an array. A BQN array can contain any values in any mixture, while a J array must be uniformly numbers, characters, or boxes (BQN doesn't use boxes).

The J terms "atom" and "element" are used to mean different things by different authors. In BQN, an atom or rank-0 array is called a "unit", and the values contained in an array—which may or may not be arrays—are called "elements". Each element is contained in a 0-cell, or rank-0 subarray. BQN uses the term "major cell" for what J calls an "item" of an array: a cell with rank one less than that array. BQN shares the terms "list" and "table" for rank-1 and rank-2 arrays with J.

BQN uses "depth" rather than "boxing level". BQN gives atoms depth 0, so that the depth of a BQN array is one higher than the boxing level of the corresponding J array.

-

Roles

+

Roles

In J, the part of speech is an inherent property of a value, while in BQN it is determined by how the value is used in a particular expression, and can be different from the value's type. See context-free grammar.

@@ -40,7 +40,7 @@
-

Syntax

+

Syntax

@@ -108,7 +108,7 @@

BQN's explicit functions and modifiers are called "blocks", and have a more sophisticated syntax than J; see the documentation. BQN uses lexical scope, and has no global variables. BQN also has a list notation using ⟨⟩.

-

For reading

+

For reading

J analogues of BQN primitive functions are given below. They are not always the same; usually this is because BQN has extra functionality relative to J, although in some cases it has less or different functionality.

Functions + - | < > are the same in both languages.

@@ -329,7 +329,7 @@
-

For writing

+

For writing

J's primitive nouns are easily defined in BQN.

diff --git a/docs/doc/functional.html b/docs/doc/functional.html index 1595e5ae..d48b78d1 100644 --- a/docs/doc/functional.html +++ b/docs/doc/functional.html @@ -4,7 +4,7 @@ BQN: Functional programming -

Functional programming

+

Functional programming

BQN boasts of its functional capabilities, including first-class functions. What sort of functional support does it have, and how can a BQN programmer exercise these and out themself as a Schemer at heart?

First, let's be clear about what the terms we're using mean. A language has first-class functions when functions (however they are defined) can be used in all the same ways as "ordinary" values like numbers and so on, such as being passed as an argument or placed in a list. Lisp and JavaScript have first-class functions, C has unsafe first-class functions via function pointers, and Java 7 and APL don't have them as functions can't be placed in lists or used as arguments. This doesn't mean every operation is supported on functions: for instance, numbers can be added, compared, and sorted; while functions could perhaps be added to give a train, comparing or sorting them as functions (not representations) isn't computable, and BQN doesn't support any of the three operations when passing functions as arguments.

Traditionally, APL has worked around its lack of first-class functions with operators, that is, second-order functions. Arrays in APL are first class while functions are second class and operators are third class, and each class can act on the ones before it. However, the three-tier system has some obvious limitations that we'll discuss, and BQN removes these by making every type first class.

@@ -61,14 +61,14 @@

The term functional programming is more contentious, and has many meanings some of which can be vague. Here I use it for what might be called first-class functional programming, programming that makes significant use of first-class functions; in this usage, Scheme is probably the archetypal functional programming language. However, other definitions are also worth mentioning. APL is often called a functional programming language on the grounds that functions can be assigned and manipulated, and called recursively, all characteristics it shares with Lisp. I prefer the term function-level programming for this usage. A newer usage, which I call pure functional programming, restricts the term "function" to mathematical functions, which have no side effects, so that functional programming is programming with no side effects, often using monads to accumulate effects as part of arguments and results instead. Finally, typed functional programming is closely associated with pure functional programming and refers to languages influenced by type theory such as Haskell, F#, and Idris (the last of which even supports dependently-typed functional programming, but I already said "finally" so we'll stop there). Of these, BQN supports first-class functional and function-level programming, allows but doesn't encourage pure functional programming, and does not support typed functional programming, as it's dynamically and not statically typed.

Another topic we are interested in is lexical scoping and closures. Lexical scoping means that the realm in which a variable exists is determined by its containing context (in BQN, the surrounding set of curly braces {}, if any) within the source code. A closure is really an implementation mechanism, but it's often used to refer to a property of lexical scoping that appears when functions defined in a particular block can be accessed after the block finishes execution. For example, they might be returned from a function or assigned to a variable outside of that function's scope. In this case the functions can still access variables in the original scope. I consider this property to be a requirement for a correct lexical scoping implementation, but it's traditionally not a part of APL: implementation might not have lexical scoping (for example, J and I believe A+ use static scoping where functions can't access variables in containing scopes) or might cut off the scope once execution ends, leading to value errors that one wouldn't predict from the rules of lexical scoping.

-

Functions in APL

+

Functions in APL

This seems like a good place for a brief and entirely optional discussion of how APL handles functions and why it does it this way. As mentioned above, APL's functions are second class rather than first class. But the barriers to making functions first-class objects have been entirely syntactic and conceptual, not technical. In fact, the J language has for a long time had a bug that allows an array containing a function to be created: by selecting from the array, the function itself can even be passed through tacit functions as an argument!

The primary reason why APL doesn't allow functions to be passed as arguments is probably syntax: in particular, there's no way to say that a function should be used as the left argument to another function, as an expression like F G x with functions F and G and an array x will simply be evaluated as two monadic function applications. However, there's no syntactic rule that prevents a function from returning a function, and Dyalog APL for example allows this (so '+' returns the function +). Dyalog's OR is another interesting phenomenon in this context: it creates an array from a function or operator, which can then be used as an element or argument like any array. The mechanism is essentially the same as BQN's first class functions, and in fact ORs even share a form of BQN's syntactic type erasure, as a OR of a function passed as an operand magically becomes a function again. But outside of this property, it's cumbersome and slow to convert functions to and from ORs, so they don't work very well as a first-class function mechanism.

Another reason for APL's reluctance to adopt first-class functions is that Iverson and others seemed to believe that functions fundamentally are not a kind of data, because it's impossible to uniquely represent, compare, and order them. One effect of this viewpoint is J's gerund mechanism, which converts a function to an array representation, primarily so that lists of gerunds can be created. Gerunds are nested arrays containing character vectors at the leaves, so they are arrays as Iverson thought of them. However, I consider this conversion of functions to arrays, intended to avoid arrays that contain "black box" functions, to be a mistake: while it doesn't compromise the purity of arrays, it gives the illusion that a function corresponds to a particular array, which is not true from the mathematical perspective of functions as mappings from an arbitrary input to an output. I also think the experience of countless languages with first-class functions shows that there is no practical issue with arrays that contain functions. While having all arrays be concrete entities with a unique canonical representation seems desirable, I don't find the existence of arrays without this property to be detract from working with arrays that do have it.

-

Functional programming in BQN

+

Functional programming in BQN

Reminder: I am discussing only first-class functional programming here, and not other concepts like pure or typed functional programming!

What does functional programming in BQN look like? How is it different from the typical APL style of manipulating functions with operators?

-

Working with roles

+

Working with roles

First, let's look at the basics: a small program that has functions as its argument and result. The function Lin below gives a linear approximation to its function argument based on the values at 0 and 1. To find these two values, we call the argument as a function by using its uppercase spelling, 𝕏.

Lin  {
   v0  𝕏 0
@@ -105,7 +105,7 @@
     Exp _lin 5
 9.591409142295225
 
-

Arrays of functions

+

Arrays of functions

It's very convenient to put a function in an array, which is fortunate because this is one of the most important uses of functions as subjects. Here's an example of an array of functions with a fold applied to it, composing them together.

↗️
    {𝕎𝕏}´ -(ט)
 ⋆∘(-∘(ט))
diff --git a/docs/doc/glossary.html b/docs/doc/glossary.html
index b24ce012..8aeb819a 100644
--- a/docs/doc/glossary.html
+++ b/docs/doc/glossary.html
@@ -4,9 +4,9 @@
   BQN glossary
 
 
-

BQN glossary

+

BQN glossary

Below are short, and sometimes vague, definitions of terms used to describe BQN code.

-

Types

+

Types

  • Value: Something that can be stored in variables and manipulated by a BQN programmer.
  • Type: One of seven possible kinds of value.
  • @@ -37,7 +37,7 @@
  • Real number (more accurately, approximate doubly-extended real number): A number with no complex part.
  • Complex number: A real number plus i (one of the square roots of -1) times another real number.
-

Roles

+

Roles

  • Syntactic role: One of four possible types for an expression, which are determined by the expression itself and not outside context and describe how it interacts with other parts of the syntax.
@@ -48,7 +48,7 @@
  • 1-modifier: Can be called on one subject or function.
  • 2-modifier: Can be called on two subjects or functions.
  • -

    Arrays

    +

    Arrays

    • Array: A multidimensional collection of values.
    • Element: One of the values contained in an array.
    • @@ -77,7 +77,7 @@
      • Index: One of a variety of ways to select an element, cell, axis, or position along an axis of an array.
      -

      Operations

      +

      Operations

      • Operation: A value that is called on inputs to perform computation and return a result or cause an error.
      • Call: Submit inputs to an operation and receive any result.
      • @@ -99,7 +99,7 @@
      • Error: A condition that stops compilation or execution.
      • Inferred property: A property of a value that is derived by BQN based on constraints. If it cannot be derived then the value will not have the property.
      -

      Tokens

      +

      Tokens

      • Token formation or tokenization: Splitting source code into a sequence of tokens.
      • Token: A name, literal, primitive, or other syntactic element.
      • @@ -112,7 +112,7 @@
      • Character literal: A literal written with single quotes '', indicating a string.
      • Null literal: The literal @, indicating the null character (code point 0).
      -

      Parsing

      +

      Parsing

      • Parsing: Analysis of the tokens of a program, which determines which actions will be taken to evaluate it.
      • Expression: A piece of code that defines a (not necessarily constant) variable.
      • @@ -121,7 +121,7 @@
      • Ligature: The character .
      • List notation: The angle brackets ⟨⟩ or ligatures used to indicate a list.
      -

      Assignment and scoping

      +

      Assignment and scoping

      • Assignment: An operation that sets a variable's value. Definition () or a change of definition ().
      • Assignment arrow: or , used to denote assignment.
      • @@ -131,7 +131,7 @@
      • Scope: An environment where variables are defined and manipulated, which is created before evaluating a body.
      • Identifier: An instance of a name in a program, with two identifiers considered the same if they correspond to the same definition.
      -

      Blocks

      +

      Blocks

      • Block: A syntactic element surrounded in curly braces {}, which encapsulates code.
      • Immediate block: A block that is evaluated and returns a value immediately; it has a subject role.
      • diff --git a/docs/doc/group.html b/docs/doc/group.html index 1e3afd03..03357675 100644 --- a/docs/doc/group.html +++ b/docs/doc/group.html @@ -4,7 +4,7 @@ BQN: Group -

        Group

        +

        Group

        BQN replaces the Key operator from J or Dyalog APL, and many forms of partitioning, with a single (ambivalent) Group function . This function is somewhat related to the K function = of the same name, but results in an array rather than a dictionary.

        @@ -29,7 +29,7 @@ -

        Definition

        +

        Definition

        Group operates on a list of atomic-number indices 𝕨 and an array 𝕩, treated as a list of its major cells, to produce a list of groups, each containing some of the cells from 𝕩. The two arguments have the same length, and each cell in 𝕩 is paired with the index in 𝕨 at the same position, which indicates which result group should include that cell.

        ↗️
            01201  "abcde"  # Corresponding indices and values
         ┌─                     
        @@ -102,7 +102,7 @@
                   ┘
         

        Here, the index 2 appears at indices 0 and 3 while the index 3 appears at index 1.

        -

        Multidimensional grouping

        +

        Multidimensional grouping

        Dyadic Group allows the right argument to be grouped along multiple axes by using a nested left argument. In this case, the left argument must be a list of numeric lists, and the result has rank 𝕨 while its elements—as always—have the same rank as 𝕩. The result shape is 1+⌈´¨𝕨, while the shape of element i𝕨𝕩 is i+´=¨𝕨. If every element of 𝕨 is sorted ascending and contains only non-negative numbers, we have 𝕩≡∾𝕨𝕩, that is, Join is the inverse of Partition.

        Here we split up a rank-2 array into a rank-2 array of rank-2 arrays. Along the first axis we simply separate the first pair and second pair of rows—a partition. Along the second axis we separate odd from even indices.

        ↗️
            0011,0101010  (10×↕4)+7
        @@ -119,7 +119,7 @@
         

        Each group i𝕨𝕩 is composed of the cells j<¨𝕩 such that ij¨𝕨. The groups retain their array structure and ordering along each argument axis. Using multidimensional Replicate we can say that i𝕨𝕩 is (i=𝕨)/𝕩.

        The monadic case works similarly: Group Indices always satisfies 𝕩 ←→ 𝕩⊔↕≠1𝕩. As with , the depth of the result of Group Indices is always one greater than that of its argument. A depth-0 argument is not allowed.

        -

        Properties

        +

        Properties

        Group is closely related to the inverse of Indices, /. In fact, inverse Indices called on the index argument gives the length of each group:

        ↗️
            ¨ 2312
         ⟨ 0 1 2 1 ⟩
        @@ -137,7 +137,7 @@
         "caeb"
         

        Group can even be implemented with the same techniques as a bucket sort, making it branchless and fast.

        -

        Applications

        +

        Applications

        The obvious application of Group is to group some values according to a known or computed property. If this property isn't a natural number, it can be turned into one using Classify (), which numbers the unique values in its argument by first occurrence.

        ↗️
            ln  "Phelps""Latynina""Bjørgen""Andrianov""Bjørndalen"
             co  "US"    "SU"      "NO"     "SU"       "NO"
        @@ -171,7 +171,7 @@
           "ZW" ⟨⟩                          
                                           ┘
         
        -

        Partitioning

        +

        Partitioning

        In examples we have been using a list of strings stranded together. Often it's more convenient to write the string with spaces, and split it up as part of the code. In this case, the index corresponding to each word (that is, each letter in the word) is the number of spaces before it. We can get this number of spaces from a Plus-Scan on the boolean list which is 1 at each space.

        ↗️
            ' '(+`=⊔⊢)"BQN uses notation as a tool of thought"
         ⟨ "BQN" " uses" " notation" " as" " a" " tool" " of" " thought" ⟩
        diff --git a/docs/doc/identity.html b/docs/doc/identity.html
        index 36795e61..5b710809 100644
        --- a/docs/doc/identity.html
        +++ b/docs/doc/identity.html
        @@ -4,7 +4,7 @@
           BQN: Identity functions
         
         
        -

        Identity functions

        +

        Identity functions

        Here are the simplest functions in BQN: Right () always returns its right argument, and Left () returns its left argument if called with two arguments, and the right argument otherwise.

        ↗️
             "only"
         "only"
        @@ -20,7 +20,7 @@
         

        Depending on your past experiences, this could cause some confusion: built-in support for functions that do nothing? Documentation should say why a feature's there and how to use it, not just what it does, so we'll try to address this below. The most important single use is for tacit programming, but there are a variety of other uses as well.

        Of course, it's easy to write block functions {𝕩} and {𝕨} that return particular arguments. While I would already make and primitives just because they are common and important, there are also specific disadvantages to using blocks. They fail to indicate that there are no side effects, as primitives would, and they also need special casing for the interpreter to manipulate them when applying Undo () or making other inferences.

        -

        Filling arrays

        +

        Filling arrays

        What's the easiest way to create a matrix with 0 on the first row, 1 on the second, and so on? Probably this one, with table:

        ↗️
            (4)  5
         ┌─           
        @@ -49,7 +49,7 @@
                                           ┘
         

        This method can replace even values nested deeply in arrays, as long as you can write the function to get at them. The parts that aren't accessed don't even need to have matching shapes!

        -

        As a variable

        +

        As a variable

        Suppose you want a list of a matrix, its transpose, and its negation. One way to do this is to put together a list of functions for each of these values: the first one is an identity.

        ↗️
            - {𝕎𝕩}¨< 0¯110
         ┌─                             
        @@ -60,7 +60,7 @@
                                       ┘
         

        Here ends up being used as 𝕎. A similar case might be a function or program with a caller-specified processing step. For example, a function to write some kind of file, with a parameter function to encrypt data before writing. To use no encryption, you'd pass a parameter . Or it might happen that you write a Choose () expression where one of the cases should do nothing , or return the left argument .

        -

        In tacit functions

        +

        In tacit functions

        In a tacit context, is roughly equivalent to 𝕨 and to 𝕩. In some (not too common) cases, it's even possible to translate a block function to tacit code directly by replacing the variables in this way.

        ↗️
            3 {𝕩-𝕨÷1+𝕩} 5
         4.5
        @@ -68,7 +68,7 @@
         4.5
         

        A larger class of block functions can be translated just by adding parentheses and ˙ (there's a discussion of this technique in APL here). It's helpful when writing tacit code to know that Fn applies Fn to the left argument only and Fn applies it to the right argument—these can be read "Fn of left" and "Fn of right".

        -

        Syntax tricks

        +

        Syntax tricks

        You've probably seen used in documentation to display the value of a variable being assigned. This is a hack, and in most contexts •Show should be used to display values.

        ↗️
             a  "show this"
         "show this"
        diff --git a/docs/doc/index.html b/docs/doc/index.html
        index ec0e46eb..3f4e2d2c 100644
        --- a/docs/doc/index.html
        +++ b/docs/doc/index.html
        @@ -4,7 +4,7 @@
           BQN documentation
         
         
        -

        BQN documentation

        +

        BQN documentation

        BQN's documentation describes what features it has, how to use them (with examples), and why they were chosen. For a linear introduction to the language, see the tutorials. While the core language specification is complete, the documentation still has minor gaps.

        Overview:

          diff --git a/docs/doc/indices.html b/docs/doc/indices.html index fe67492d..8ed4317e 100644 --- a/docs/doc/indices.html +++ b/docs/doc/indices.html @@ -4,7 +4,7 @@ BQN: Indices -

          Indices

          +

          Indices

          One-dimensional arrays such as K lists or Python arrays have only one kind of index, a single number that refers to an element. For multidimensional arrays using the leading axis theory, there are several types of indexing that can be useful. Historically, nested APL designs have equivocated between these, which I believe can lead to subtle errors when programming. BQN focuses on single-number (atomic) indices, which can refer to list elements or array major cells (or more generally indexing along any particular axis). When using atomic indices to select elements, the indexed array has to be a list. In contrast, elements of any array can be indicated by list indices with length equal to that array's rank. Only two BQN primitives use these list indices: Range (), which returns an array of them if given a list argument, and Pick (), where the depth-1 components of an array left argument are list indices.

          The following functions take or return indices. Except where marked, the indices are in the result; this is by far the most common type of index use. Group () is given two rows as it falls into both cases. Note that in the result case, there is usually no possibility for the programmer to select the format of indices. Instead, the language should be carefully designed to make sure that the kind of index returned is as useful as possible.

    @@ -86,15 +86,15 @@

    In Dyadic Transpose (), 𝕨 is made up of indices into axes of 𝕩. Since array shape is 1-dimensional, there is only one sensible choice for these elements, a single number each.

    -

    Element indices

    +

    Element indices

    In general, the index of an element of an array is a list whose length matches the array rank. It is also possible to use a number for an index into a list, as the list index is a singleton, but this must be kept consistent with the rest of the language. NARS-family APLs make the Index Generator ( in BQN) return a numeric list when the argument has length 1 but a nested array otherwise. This means that the depth of the result depends on the shape of the argument, inverting the typical hierarchy. BQN shouldn't have such an inconsistency.

    Functions Range (), Indices (/), Group (), and Pick () naturally deal with element indices. Each of these can be defined to use list indices. However, this usually rules out the possibility of using atomic indices, which makes these functions harder to use both with generic array manipulation and with the major cell indices discussed in the next section. For this reason BQN restricts and / to use atomic indices, which comes with the requirement that the arguments to Group and Indices, and the result of Group Indices, must be lists. For dyadic Group the depth-1 elements of 𝕨 are arrays of indices along axes of the result (multi-axis documentation). This means each axis of 𝕩 can only be related to one axis of the result.

    Unlike / and , and do use list element indices. For this is because the output format can be controlled by the argument format: if passed a single number, the result uses atomic indices (so it's a numeric list); if passed a list, it uses list indices and the result has depth 2 (the result depth is always one greater than the argument depth). For , list indices are chosen because Select () handles atomic indices well already. When selecting multiple elements from a list, they would typically have to be placed in an array, which is equivalent to with a numeric list 𝕨. An atom 𝕨 in Pick is converted to a list, so it can be used to select a single element if only one is wanted. To select multiple elements, uses each depth-1 array in 𝕨 as an index and replaces it with that element from the right argument. Because this uses elements as elements (not cells), it is impossible to have conformability errors where elements do not fit together (invalid index errors are of course still possible). Atoms also cannot be used in this context, as it would create ambiguity: is a one-element list an index, or does it contain an index?

    -

    Major cell indices

    +

    Major cell indices

    One of the successes of the leading axis model is to introduce a kind of index for multidimensional arrays that is easier to work with than list indices. The model introduces cells, where a cell index is a list of any length up to the containing array's rank. General cell indices are discussed in the next section; first we introduce a special case, indices into major cells or ¯1-cells. These cells naturally form a list, so the index of a major cell is a single number. Such an index can also be considered to select along the first axis, since an index along any axis is a single number.

    Ordering functions ⍋⍒ and search/self-search functions ⊐⊒ that depend on cell ordering only really make sense with major cell indices: while other indices have an ordering, it's not very natural. Note that only uses the ordering in an incidental way, because it's defined to return the first index where a cell in 𝕩 is found. A mathematician would be more interested in a "pre-image" function that returns the set of all indices where a particular value appears. However, programming usefulness and consistency with the other search functions makes searching for the first index a reasonable choice.

    Only one other function—but an important one!—deals with cells rather than elements: Select (). Select allows either a simple first-axis case where 𝕨 has depth 1 or less (a depth-0 argument is automatically enclosed), and a multi-axis case where it is a list of depth-1 elements. In each case the depth-1 arrays index along a single axis.

    -

    General cell indices

    +

    General cell indices

    BQN does not use general cell indices directly, but it is useful to consider how they might work, and how a programmer might implement functions that use them in BQN if needed. The functions /, , and are the ones that can work with indices for multidimensional arrays but don't already. Here we will examine how multidimensional versions would work.

    A cell index into an array of rank r is a numeric list of length lr, which then refers to a cell of rank r-l. In BQN, the cell at index i of array a is i<¨a.

    Because the shape of a cell index relates to the shape of the indexed array, it makes sense not to enclose cell indices, instead treating them as rows of an index array. A definition for for depth-1 left arguments of rank at least 1 follows: replace each row of the left argument with the indexed cell of the right, yielding a result with the same depth as the right argument and shape 𝕨((¯1↓⊣)(¯1↑⊣))𝕩.

    diff --git a/docs/doc/join.html b/docs/doc/join.html index 414f19b0..220c6631 100644 --- a/docs/doc/join.html +++ b/docs/doc/join.html @@ -4,9 +4,9 @@ BQN: Join and Join To -

    Join and Join To

    +

    Join and Join To

    The glyph combines arrays along an existing axis, a concept that other languages might call "concatenation" or "catenation" but BQN names "Join". The one-argument form Join and two-argument form Join To are parallel to the functions that combine arrays along a new axis, Merge (>) and Couple ().

    -

    Join To

    +

    Join To

    Join To connects its two arguments together, for example to join two strings:

    ↗️
        "abcd"  "EFG"
     "abcdEFG"
    @@ -50,7 +50,7 @@
     ⟨ 3 'c' ⟩
     

    This case is unusual because the rank of the result is higher than that of either argument. It's also identical to Couple (); Couple should be preferred because it doesn't require a special case for this situation. See coupling units.

    -

    Join

    +

    Join

    The monadic form of , called simply Join, is more complicated than Join To because it really takes not just one argument but an entire array of them. Join is an extension of the monadic function Raze from A+ and J to arbitrary argument ranks. It has the same relationship to Join to, the dyadic function sharing the same glyph, as Merge (>) does to Couple (): ab is >ab and ab is ab. While Merge and Couple combine arrays (the elements of Merge's argument, or the arguments themselves for Couple) along a new leading axis, Join and Join to combine them along the existing leading axis. Both Merge and Join can also be called on a higher-rank array, causing Merge to add multiple leading axes while Join combines elements along multiple existing axes.

    Join can be used to combine several strings into a single string, like array.join() in Javascript (but it doesn't force the result to be a string).

    ↗️
        "time""to""join""some""words"
    diff --git a/docs/doc/leading.html b/docs/doc/leading.html
    index ac7d2d49..55b72a40 100644
    --- a/docs/doc/leading.html
    +++ b/docs/doc/leading.html
    @@ -4,10 +4,10 @@
       BQN: The leading axis convention
     
     
    -

    The leading axis convention

    +

    The leading axis convention

    Several primitive functions manipulate the right argument, or sometimes both arguments, of an array along one or more axes. According to the leading axis model, it's best to make the primitives operate on initial axes, because the Rank modifier then allows it to apply to later axes as well. Here we'll see how this pattern works in BQN.

    -

    Monadic functions

    -

    Manipulating cells

    +

    Monadic functions

    +

    Manipulating cells

    Most non-arithmetic monadic functions work only on the first axis of the argument—that is, they treat it as a list of its major cells. The function Length () counts these major cells, while Prefixes (), Suffixes (), Reverse (), and First Cell () move them around. The Insert (˝) and Scan (`) modifiers also yield functions that work along the first axis; in contrast, Fold (´) requires 𝕩 to be a list, as it works on elements.

    ↗️
         a  32  "abcdef"  # An array with three major cells
     ┌─    
    @@ -80,7 +80,7 @@
          0 a               # …or deeper still.
     ⟨ 3 2 1 ⟩
     
    -

    Comparing cells

    +

    Comparing cells

    The functions in the last section manipulate cells in the same way regardless of what data they contain. Other functions compare cells to each other, either testing whether they match or how they are ordered relative to one another. The two Grade functions ⍋⍒, and the self-search functions Classify (), Mark Firsts (), and Occurrence Count (), each give a list result, with one number for each cell. We can see below that Occurrence Count returns the same results even as we make the argument cells more complicated, because the changes made preserve the matching of cells.

    ↗️
        s  "abracadabra"
          s
    @@ -106,13 +106,13 @@
       3 3 2 1 0  
                 ┘
     
    -

    Other monadic functions

    +

    Other monadic functions

    Not all functions work on the first axis in a straightforward manner. Transpose moves the first axis to the end, so while it focuses on the first one, it shifts every axis of 𝕩. Join also works on every axis of its argument, and applies to the leading axes of 𝕩's elements instead: these leading inner axes are matched up with the outer axes, and trailing inner axes are allowed but the elements must have rank at least as high as the argument array.

    The other two monadic functions that work on high-rank arguments are Deshape () and First (). These treat 𝕩 as one long list, ordered by its element indices. This ordering privileges leading axes (in fact, it is the reason for the choice of leading axes in the leading axis convention), but these functions can't really be said to work on leading axes: they apply to all axes.

    The Each (¨) and Table () modifiers return functions which are the same in the monadic case. These functions simply go through all elements of the argument array without regard for its multi-dimensional structure (the operand is applied to elements in index order, matching Deshape; this matters if it has side effects). Similarly, monadic arithmetic functions do not have any sort of leading axis dependence.

    -

    Dyadic functions

    +

    Dyadic functions

    For dyadic functions the pattern of working on only one argument axis is not so common. Only two functions can be said to follow it roughly: Join to () combines two arrays along one axis, using the first axis of both arguments if they have the same rank and of the higher-rank argument if they differ by one. Couple (), like Solo, does not manipulate the argument axes but adds a result axis. There are also some functions that can't be limited to leading axes: Reshape () treats 𝕩 as one long list, and Pick () requires each index to be as long as 𝕩's rank, because it selects elements and not cells from 𝕩.

    -

    Multiple axes

    +

    Multiple axes

    Instead of always working on a single axis, many dyadic functions work on one axis by default, but also allow a left argument with multiple elements corresponding to leading axes of 𝕩. To decide which of the two possibilities applies, these functions test the depth of 𝕨, a convention that is discussed in the depth documentation. A left argument that applies to one axis has a particular depth; 𝕨 can also be a list of such arguments.

    @@ -145,7 +145,7 @@ ⟨ 4 5 7 7 ⟩

    Functions with single-axis depth 1 tend to be more complicated; see for example Group.

    -

    Leading axis agreement

    +

    Leading axis agreement

    Arithmetic functions, and the Each (¨) and Depth () modifiers, use leading axis agreement to match their arguments together. All axes of the lower-rank argument are matched with the leading axes of the higher-rank one, and axes matched together must have the same length. After pairing axes in this way, a single element of the lower-rank argument might correspond to any number of elements of the higher-rank one. It's reused for each of those corresponding elements.

    ↗️
         x  324  60     # A rank-3 array
     ┌─             
    @@ -200,7 +200,7 @@
     

    If one argument is a unit, that is, it has no axes, then leading axis agreement reduces to APL's "scalar extension" (where "scalar" is equivalent to BQN's "unit"), where a single unit is matched with an entire array by repeating it at every application. A unit always agrees with any other array under leading axis agreement because it has no axes whose lengths would need to be checked.

    With leading axis agreement, there are k+1 shapes for arrays that can be added (or any other function with Each) to a given array x without changing its rank. These are precisely the prefixes of x, with ranks from 0 to k inclusive. Arrays with larger rank can also be used as the other argument, but then the result shape will match that argument and not x.

    -

    Search functions

    +

    Search functions

    The search functions, Index of (), Progressive Index of (), and Member of (), and also Bins (⍋⍒), look through cells of one argument to find cells of the other. Find () also does a search, but a slightly different one: it tries to find slices of cells of 𝕩 that match 𝕨.

    diff --git a/docs/doc/lexical.html b/docs/doc/lexical.html index 2ce2cfc9..ec3f6567 100644 --- a/docs/doc/lexical.html +++ b/docs/doc/lexical.html @@ -4,10 +4,10 @@ BQN: Lexical scoping -

    Lexical scoping

    +

    Lexical scoping

    BQN uses lexical scope, like most modern functional programming languages including Javascript, Scheme, and Julia, and like Dyalog APL's dfns (tradfns are dynamically scoped). This document describes how lexical scoping works, and a few small details relevant to BQN's version of it.

    In short, every block is a separate scope that can refer to identifiers in containing scopes. When evaluated, the block makes a variable for each identifier defined in it (including arguments and operands). The blocks that it contains will now access these variables. In the first level of a block, variables must be defined before they can be used, but in child blocks, a variable can be used regardless of where it's defined, as long as the definition is evaluated before the child block is.

    -

    Scopes

    +

    Scopes

    Scoping is a mechanism that allows the same variable name to refer to different variables depending on program context. For example, the following code uses the name a in two ways: once for a value at the top level, and once locally in a function. With scoping, once you write {} to create a block, you can define any name you want inside without worrying whether it's taken.

    ↗️
        a  6
         F  { a × 1 + a  𝕩 }
    @@ -35,7 +35,7 @@
     

    Each call creates the list of indices i, then calls itself using 𝕊 on each element of 𝕩 if it's a list, then couples i to the result. This requires i to be unaffected by other calls to the function, which works because i is scoped not only to the source code location but also to the particular evaluation of the block that creates it.

    These examples probably work like you expect—they're meant to highlight the features that scoping should have, in order to help show how less intuitive cases work later on.

    -

    Visibility

    +

    Visibility

    A scope can view and modify (with ) variables in other scopes that contain it. We say these variables are visible in the inner scopes. Variables at the top level of a program are visible to all the code in that program, so that we might call them "global". That would be a little misleading though, because for example each file is an entire program, so if one file is imported from another then it can't read the first file's variables.

    ↗️
        counter  0
         inc  6
    @@ -77,7 +77,7 @@ ERROR
     42
     

    The function C3_7 uses the versions of counter and inc created in _makeCount, even though it's called not from inside _makeCount, but from the top-level program. This is what it means for BQN's scoping to be lexical rather than dynamic. Which identifiers are visible is determined by where the code containing them is located in the source code, not how it's called at runtime. The static nature of lexical scoping makes it much easier to keep track of how variables are used (for compilers, this means optimization opportunities), and for this reason dynamic scoping is very rare in programming languages today.

    -

    Post-definition

    +

    Post-definition

    In the top level a block, a variable can only be used after it's defined, and the compiler rejects any code that tries to use an undefined variable. But blocks contained in that block see variables it defines regardless of the positioning. Below, PlusC references the variable c that's defined after it (but when code is executed one line at a time like it is here, I still have to put both definitions on the same line so they are compiled together).

    ↗️
        PlusC  { 𝕩+c }  c¯1
         PlusC 7
    @@ -88,7 +88,7 @@ ERROR
     ERROR
     

    Why define things this way? It's easier to see if you imagine the variable used is also a function. It's normal for a function to call other functions defined at the top level, of course. And it would be pretty unpleasant for BQN to enforce a specific ordering for them. It would also make recursive functions impossible except by using 𝕊, and mutually recursive ones completely impossible. A simple rule that makes all these things just work smoothly seems much better than any alternative.

    -

    Closures

    +

    Closures

    Let's run _makeCount from above a few more times.

    ↗️
        C4_2  42 _makeCount  # Start at 4; increment by 2
         C1_4  14 _makeCount  #          1;              4
    @@ -121,12 +121,12 @@ ERROR
     

    The variable Mean is visible only within the outer immediate block. The only way it can be accessed is by code in this block: the two calls in the returned function, which will later be renamed stdDev. Nothing in the block modifies it, so its value is constant. It's just a little utility that exists only for code in the block. Making it visible elsewhere is as simple as moving it out of the block, but it's best not to do this without reason. Keeping a variable in the smallest possible scope makes it easier to understand the program, because it reduces the amount of information needed to understand scopes where that variable doesn't apply.

    Neither the specification nor a typical implementation keep track of what is and isn't a closure, although an advanced interpreter will probably work with some related properties. The existence of closures is an ordinary feature of lexical scoping and not a special case. However, it's sometimes a useful term for discussing the operation of a program. We might define a closure as a block that can be run and access variables from a parent scope even after the block that created that scope finishes execution.

    -

    Environments form a tree

    +

    Environments form a tree

    So a block has access to every environment that it might need a variable from, for as long as it needs. This idea is a little fuzzy, so let's clarify by describing how an implementation would support figure out what can access where.

    The mechanism is that each environment can have a parent environment (the topmost environment, which corresponds to the entire program, has no parent). When a variable is accessed, it might be in the current environment, or its parent, or that environment's parent, and so on. Every environment corresponds to one block in the source code, and its parent corresponds to the parent block, so a compiler can figure out how many levels up it will have to go based on the source code.

    We've seen that one block can create many environments. An environment can have only one parent, but many children, so environments form a tree. A forest to be precise, as one execution of BQN can involve multiple programs.

    How does an environment know which of the many environments corresponding to the parent scope is its parent? This information is saved when the block is reached in the program and a block instance is created. Unless it's an immediate block, the block instance won't be run right away: a block instance isn't the same as a block evaluation. But each block evaluation starts with a block instance, and that's where it gets the parent environment. Unlike block evaluation, which can happen anywhere, a block instance is created only during evaluation of the parent block. So the saved parent environment is simply the current environment.

    -

    Mutation

    +

    Mutation

    The value of a variable can be modified with . It's similar to definition in that it sets the value of the variable, but the way it interacts with scoping is completely different. Defining creates a new variable in the current scope, and modifying refers to an existing variable in the current scope or a parent. In scoping terms, modifying is more like an ordinary variable reference than a definition.

    When a variable's modified, functions with access to it see the new value. They have access to the variable, not any particular value that it has.

    ↗️
        factor  3
    diff --git a/docs/doc/logic.html b/docs/doc/logic.html
    index 2635814c..8fc23c44 100644
    --- a/docs/doc/logic.html
    +++ b/docs/doc/logic.html
    @@ -4,12 +4,12 @@
       BQN Logic functions: And, Or, Not (also Span)
     
     
    -

    Logic functions: And, Or, Not (also Span)

    +

    Logic functions: And, Or, Not (also Span)

    BQN retains the APL symbols and for logical and and or, and changed APL's ~ to ¬ for not, since ~ looks too much like ˜ and ¬ is more common in mathematics today. Like J, BQN extends Not to the linear function 1-. However, it discards GCD and LCM as extensions of And and Or, and instead uses bilinear extensions: And is identical to Times (×), while Or is ׬, following De Morgan's laws (other ways of obtaining a function for Or give an equivalent result—there is only one bilinear extension).

    If the arguments are probabilities of independent events, then an extended function gives the probability of the boolean function on their outcomes (for example, if A occurs with probability a and B with probability b independent of A, then A or B occurs with probability ab). These extensions have also been used in complexity theory, because they allow mathematicians to transfer a logical circuit from the discrete to the continuous domain in order to use calculus on it.

    Both valences of ¬ are equivalent to the fork 1+-. The dyadic valence, called "Span", computes the number of integers in the range from 𝕩 to 𝕨, inclusive, when both arguments are integers and 𝕩𝕨 (note the reversed order, which is used for consistency with subtraction). This function has many uses, and in particular is relevant to the Windows function.

    These functions are considered arithmetic functions and thus are pervasive.

    -

    Definitions

    +

    Definitions

    We define:

    Not  1+-  # also Span
     And  ×
    @@ -18,7 +18,7 @@
     

    Note that ¬ ←→ ¬, since when applying ¬ twice the first added 1 will be negated but the second won't; the two 1s cancel leaving two subtractions, and - ←→ -. An alternate definition of Or that matches the typical formula from probability theory is

    Or   +-×
     
    -

    Examples

    +

    Examples

    We can form truth tables including the non-integer value one-half:

    ↗️
        ¬ 00.51
     ⟨ 1 0.5 0 ⟩
    @@ -38,10 +38,10 @@
                  ┘
     

    As with logical And and Or, any value and 0 is 0, while any value or 1 is 1. The other boolean values give the identity values for the two functions: 1 and any value gives that value, as does 0 or the value.

    -

    Why not GCD and LCM?

    +

    Why not GCD and LCM?

    The main reason for omitting these functions is that they are complicated and, when applied to real or complex numbers, require a significant number of design decisions where there is no obvious choice (for example, whether to use comparison tolerance). On the other hand, these functions are fairly easy to implement, which allows the programmer to control the details, and also add functionality such as the extended GCD.

    A secondary reason is that the GCD falls short as an extension of Or, because its identity value 0 is not total. 0x, for a real number x, is actually equal to |x and not x: for example, 0¯2 is 2 in APL. This means the identity 0x ←→ x isn't reliable in APL.

    -

    Identity values

    +

    Identity values

    It's common to apply a fold ´ or ´ to a list (checking whether all elements are true and whether any are true, respectively), and so it's important for extensions to And and Or to share their identity value. Minimum and Maximum do match And and Or when restricted to booleans, but they have different identity values. It would be dangerous to use Maximum to check whether any element of a list is true because ´⟨⟩ yields ¯∞ instead of 0—a bug waiting to happen. To avoid this the programmer would have to use an initial value 𝕨 of 0, which is easy to forget.

    It's not hard to prove that the bilinear extensions have the identity values we want. Of course 1x is 1×x, or x, and 0x is 0׬x, or ¬1׬x, giving ¬¬x or x again. Both functions are commutative, so these values are identities on the right as well.

    Other logical identities do not necessarily hold. For example, in boolean logic And distributes over Or and vice-versa: abc ←→ (ab)(ac). But substituting × for and +-× for we find that the left hand side is (a×b)+(a×c)+(a×b×c) while the right gives (a×b)+(a×c)+(a×b×a×c). These are equivalent for arbitrary b and c only if a=a×a, that is, a is 0 or 1. In terms of probabilities the difference when a is not boolean is caused by failure of independence. On the left hand side, the two arguments of every logical function are independent. On the right hand side, each pair of arguments to are independent, but the two arguments to , ab and ac, are not. The relationship between these arguments means that logical equivalences no longer apply.

    diff --git a/docs/doc/map.html b/docs/doc/map.html index 91c27caf..b7c4afe5 100644 --- a/docs/doc/map.html +++ b/docs/doc/map.html @@ -4,10 +4,10 @@ BQN: Mapping modifiers -

    Mapping modifiers

    +

    Mapping modifiers

    Mapping a function over an array means to call it on each element of that array, creating an array of results. It's also possible to map over two arrays, applying the function to various choices of one element from each, but there's no longer a single correct way to iterate over these elements.

    BQN has two 1-modifiers to map over arrays: Each (¨) and Table (). On two arguments, Table applies its operand to all combinations of elements while Each creates a one-to-one or one-to-many matching. Since they apply to elements, these modifiers are different from Cells (˘) or its generalization Rank (), which apply the function to array cells. The modifier Depth () is a generalization of Each, so that ¨ is ¯1; however, it can't be used to implement Table without some additional array operations.

    -

    One-argument mapping

    +

    One-argument mapping

    @@ -66,7 +66,7 @@ "indexorder"

    When an array is displayed, index order is the same as the top-to-bottom, left-to-right reading order of English. It's also the same as the ordering of Deshape's result, so that here o ends up being 𝕩. The dyadic cases described in the following sections will also have a defined evaluation order, but it's not as easy to describe it in terms of the arguments: instead, the result elements are produced in index order.

    -

    Table

    +

    Table

    @@ -158,7 +158,7 @@ ⟨ 2 2 3 ⟩

    Except for the more sophisticated shape, this result is exactly what you'd get if you deshaped each argument to a list. In each case, every element of 𝕨 is visited in turn, and each time the element is paired with every element of 𝕩.

    -

    Each

    +

    Each

    diff --git a/docs/doc/match.html b/docs/doc/match.html index 74fc83a3..b162bd3b 100644 --- a/docs/doc/match.html +++ b/docs/doc/match.html @@ -4,7 +4,7 @@ BQN: Match -

    Match

    +

    Match

    The primitive Match () tests whether its two argument arrays are considered equivalent in BQN, returning 1 if so and 0 otherwise. Not Match () is the opposite, returning 1 if the two arrays aren't equivalent and 0 if they are.

    ↗️
        "abc"  'a''b''c'
     1
    @@ -24,7 +24,7 @@
     

    Match compares arrays based on their fundamental properties—shape and elements—and not the fill element, which is an inferred property. Since it can be computed differently in different implementations, using the fill element in Match could lead to some confusing results. Even if the implementation doesn't define a fill for 'a''b''c', it should still be considered to match "abc".

    To give a precise definition, two arrays are considered to match if they have the same shape and all corresponding elements from the two arrays match. Every array has a finite depth so this recursive definition always ends up comparing non-arrays, or atoms. An array never matches an atom, so the result if only one argument is an atom is 0. The interesting case is when both arguments are atoms, discussed below.

    -

    Atomic equality

    +

    Atomic equality

    Atoms in BQN have six possible types: number, character, function, 1-modifier, 2-modifier, and namespace. Equality is not allowed to fail for any two arguments, so it needs to be defined on all of these types.

    Starting with the easiest rules, values with different types are never equal to each other.

    ↗️
        'a', +, 3 = -», '+', 3˙
    @@ -54,7 +54,7 @@
     ⟨ 1 ⟩
     

    This approach can't tell you whether two functions are mathematically different—that is, whether they ever return different results given the same arguments (this is an undecidable problem, and also gets confusing since "different" is included in its own definition). However, if two functions compare equal, then they will always return the same results.

    -

    Block equality

    +

    Block equality

    The final point above about block instances is subtler. An instance of a block function or modifier is mutable, meaning that its behavior can change over the course of a program. Consider the following two functions:

    ↗️
        FG  { a10  {a+𝕩}{a𝕩} }
     ⟨ *function* *function* ⟩
    diff --git a/docs/doc/namespace.html b/docs/doc/namespace.html
    index 667e323d..a0c06d75 100644
    --- a/docs/doc/namespace.html
    +++ b/docs/doc/namespace.html
    @@ -4,7 +4,7 @@
       BQN: Namespaces
     
     
    -

    Namespaces

    +

    Namespaces

    A namespace is a type of value that groups together several values (fields) from the same scope. A block or file returns a namespace if it contains any export arrows at the top level, and fields from namespaces can be accessed with either dot syntax or destructuring assignment. A namespace can be mutable only if any of the code in it uses to change the value of a field.

    The following quick example shows a few ways to use a namespace returned by •Import:

    ns  •Import "file.bqn"
    @@ -17,12 +17,12 @@
     _something  {𝕗}          # Separate definition
     DoThing  "TODO"!
     
    -

    Uses

    +

    Uses

    The features of namespaces that make them useful in BQN programming are encapsulation and mutability. But these are exactly the same features that closures provide! In fact a namespace is not much more than a closure with a name lookup system. Consequently namespaces don't really expand the basic functionality of the language, but just make it easier to use.

    Namespaces improve encapsulation by allowing many values to be exported at once. With only one way to call them, functions and modifiers aren't such a good way to define a large part of a program. With a namespace you can define lots of things and expose exactly the ones you want to the rest of the world. For example, it's typical for files to define namespaces. A reader can see the exported values just by searching for , and if you're nice, you might declare them all at the beginning of the file. Careful use of exports can guarantee that potentially dangerous functions are used correctly: if it's only valid to call function B after function A has been called, export AB{A𝕩B𝕩} and don't export B.

    Mutability means that the behavior of one namespace can change over the course of the program. Mutability is often a liability, so make sure you really need it before leaning too heavily on this property. While there's no way to tell from the outside that a particular namespace is mutable, you can tell it isn't if the source code doesn't contain , as this is the only way it can modify the variables it contains.

    A namespace that makes use of mutability is essentially an object: a collection of state along with operations that act on it. Object-oriented programming is the other major use of namespaces. Contrary to the name, there's never a need to orient your programming around objects, and it's perfectly fine to use an object here or there when you need to, for instance to build a mutable queue of values.

    -

    Exports

    +

    Exports

    The double arrow is used to export variables from a block or file, making the result a namespace instead of the result of the last line. There are two ways to export variables. First, in the variable definition can be replaced with to export the variable as it's defined. Second, an export statement consisting of an assignment target followed by , with nothing to the right, exports the variables in the target and does nothing else. These export statements can be placed anywhere in the relevant program or body, including before declaration or on the last line, and a given variable can be exported any number of times. The block in the example below has two statements that export variables, exporting a, b, and c.

    example  {
       bc   # Non-definition exports can go anywhere
    @@ -31,7 +31,7 @@
       cb"str"
     }
     
    -

    Imports

    +

    Imports

    There are also two ways to get values out of a namespace, such as example defined above. First, it might be used in a destructuring assignment like the one below. This assignment's target looks like a list, where each entry specifies one of the names exported by the block and what it should be assigned to. The element can be either a single name, like b, which gives both, or an aliasing expression like b2b. In this case, the value b from the namespace is used, but it's given the name b2 instead of b. Imported names can be repeated—but the variables defined can't—and all the names can be spelled with any role (the role is ignored).

    aliasa, b, c0c1c, b2b  example
     
    diff --git a/docs/doc/oop.html b/docs/doc/oop.html index 916a4eef..ce3afd66 100644 --- a/docs/doc/oop.html +++ b/docs/doc/oop.html @@ -4,7 +4,7 @@ Object-oriented programming in BQN -

    Object-oriented programming in BQN

    +

    Object-oriented programming in BQN

    BQN's namespaces can be used to support a simple form of object-oriented programming (OOP) without type checking or inheritance. It's suitable for some program architectures but not others: making OOP work as a solution to every problem isn't a BQN design goal. In fact, BQN was never designed to support OOP at all! I added namespaces or modules as a way to structure programs. The techniques we're going to discuss are all just ways to use namespaces, and if it ever starts seeming like confusing magic it might help to go back to this model. However, thinking of namespaces as objects can be quite powerful in the right circumstances, and on this page I'm going to frame things in OOP terms. The following table shows which well-known aspects of OOP are supported in BQN:

    @@ -64,7 +64,7 @@
    -

    Objects

    +

    Objects

    An object in BQN is simply a namespace: its fields and methods are variables in the namespace, and one of these can be accessed outside of the namespace with dot syntax if it's exported with . Unexported variables are instance-private in OOP parlance, meaning that only they're only visible to the object containing them. They could be utilities, or hold state for the object. As an example, the object below implements the Tower of Hanoi puzzle with five disks. You can view the state (a list of disks occupying each of the three rods) with towerOfHanoi.View, or move the top disk from one rod to another with towerOfHanoi.Move.

    towerOfHanoi  {
       l  ¨500
    @@ -100,7 +100,7 @@
         t.Move 21
       2 3 4   0 1  ⟨⟩ 
     
    -

    Classes

    +

    Classes

    The object above is a singleton: there's just one of it, at least in the scope it occupies. It's often more useful to have a class that can be used to create objects. What we'll call a "class" is a namespace function, that is, a function that contains and so returns a namespace. It's very easy to convert a singleton object to a class: just add a no-op 𝕤 line to force it to be a function, and call it with @ when needed.

    MakeStack  {𝕤
       st@
    @@ -117,7 +117,7 @@
     }
     

    A stack is a particularly simple class to make because its state can be represented efficiently as a BQN value. Other data structures don't allow this, and will often require an extra Node class when they are implemented—see MakeQueue below.

    -

    Mutability

    +

    Mutability

    An object is one way to transform variable mutation into mutable data. These are two different concepts: changes which value is attached to a name in a scope, while mutable data means that the behavior of a particular value can change. But if a value is linked to a scope (for an object, the scope that contains its fields), then variable mutation in that scope can change the value's behavior. In fact, in BQN this is the only way to create mutable data. Which doesn't mean it's rare: functions, modifiers, and namespaces are all potentially mutable. The difference between objects and the operations is just a matter of syntax. Mutability in operations can only be observed by calling them. For instance F 10 or -_m could return a different result even if the variables involved don't change value. Mutability in an object can only be observed by accessing a member, meaning that obj.field or fieldobj can yield different values over the course of a program even if obj is still the same object.

    Let's look at how mutability plays out in an example class for a single-ended queue. This queue works by linking new nodes to the tail t of the queue, and detaching nodes from the head h when requested (a detached node will still point to h, but nothing in the queue points to it, so it's unreachable and will eventually be garbage collected). Each node has some data v and a single node reference n directed tailwards; in a double-ended queue or more complicated structure it would have more references. You can find every mutable variable in the queue by searching for , which shows that t and h in the queue, and n in a node, may be mutated. It's impossible for the other variables to change value once they're assigned.

    MakeQueue  {𝕤
    @@ -128,7 +128,7 @@
     }
     

    Unlike a stack, a node's successor isn't known when it's created, and it has to be set. You might be inclined to make n settable directly, but we'll get more mileage out of a setter function SetN. This allows us to create a pseudo-node e (for "empty") indicating there are no values in the queue. Because it has no .v field, if h is e then Pop gives an error (but in a real implementation you'd want to test explicitly instead in order to give an appropriate error message). In fact it doesn't have an n field, and essentially uses the queue head h instead. With this empty "node", the queue definition is straightforward. The only tricky part to remember is that if Pop removes the last node, resulting in e=h, then the tail has to be set to e as well, or it will keep pointing to the removed node and cause bugs.

    -

    Composition

    +

    Composition

    BQN classes don't support inheritance because there's no way to extend an existing class with new fields. But a lot of OOP enthusiasts these days are promoting composition over inheritance, and here BQN does pretty well. Forwarding methods from another class is as simple as a multiple assignment, like View below (in a real program undoableTowerOfHanoi should almost certainly be a class, but I introduced towerOfHanoi before classes, and I'm not about to write it again just to add an 𝕤).

    undoableTowerOfHanoi  {
       PushPop  MakeStack@     # Copy methods as private
    @@ -139,7 +139,7 @@
     

    This class composes a Tower of Hanoi with an undo stack that stores previous moves. To undo a move from a to b, it moves from b to a, although if you felt really fancy you might define Move in towerOfHanoi instead with 𝕊𝕩: 𝕊⌽𝕩.

    It's also possible to copy several variables and only export some of them, with an export statement. For example, if I wasn't going to make another method called Move, I might have written ViewMove towerOfHanoi and then View. In fact, depending on your personal style and how complicated your classes are, you might prefer to avoid inline exports entirely, and declare all the exports at the top.

    -

    Self-reference

    +

    Self-reference

    An object's class is given by 𝕊. Remember, a class is an ordinary BQN function! It might be useful for an object to produce another object of the same class (particularly if it's immutable), and an object might also expose a field class𝕤 to test whether an object o belongs to a class c with o.class = c.

    It's not currently possible for an object to know its own value without some outside help, such as a special constructor:

    IntrospectiveClass  {
    @@ -151,7 +151,7 @@
     }
     

    This is a pretty clunky solution, and exports a useless method SetThis (which gives an error if it's ever called). It would be possible for BQN to define a system value •this that just gets the namespace's value. It would work only at the top level, so it would have to be assigned (this•this) in order to use it in functions. This means it's always used before the namespace is done being defined, so a drawback is that it introduces the possibility that an object used in a program has undefined fields. The reason this isn't possible for objects without •this is that BQN's blocks don't have any sort of control flow, so that they always execute every statement in order. The namespace becomes accessible as a value once the block finishes, and at this point every statement has been executed and every field is initialized.

    -

    Class members

    +

    Class members

    As with this, giving a class variables that belong to it is a do-it-yourself sort of thing (or more positively, not at all magic (funny how programmer jargon goes the opposite way to ordinary English)). It's an easy one though, as this is exactly what lexical scoping does:

    staticClass  {
       counter  0
    diff --git a/docs/doc/order.html b/docs/doc/order.html
    index 1737ebd1..bd13f64a 100644
    --- a/docs/doc/order.html
    +++ b/docs/doc/order.html
    @@ -4,7 +4,7 @@
       BQN: Ordering functions
     
     
    -

    Ordering functions

    +

    Ordering functions

    BQN has six functions that order arrays as part of their operation (the comparison functions ≤<>≥ only order atoms, so they aren't included). These come in three pairs, where one of each pair uses an ascending ordering and the other uses a descending ordering.

    • ∨∧, Sort, rearranges the argument to order it
    • @@ -13,7 +13,7 @@

    The array ordering shared by all six is described last. For lists it's "dictionary ordering": two lists are compared one element at a time until one runs out, and the shorter one comes first in case of a tie. Operation values aren't ordered, so if an argument to an ordering function has a function or modifier somewhere in it then it will fail unless all the orderings can be decided without checking that value.

    You can't provide a custom ordering function to Sort. The function would have to be called on one pair of cells at a time, which is contrary to the idea of array programming, and passing in a function with side effects could lead to implementation-specific behavior. Instead, build another array that will sort in the order you want (for example, by selecting or deriving the property you want to sort on). Then Grade it, and use the result to select from the original array.

    -

    Sort

    +

    Sort

    You've probably seen it before. Sort Up () reorders the major cells of its argument to place them in ascending order, and Sort Down () puts them in descending order. Every ordering function follows this naming convention—there's an "Up" version pointing up and a "Down" version going the other way.

    ↗️
         "delta""alpha""beta""gamma"
     ⟨ "alpha" "beta" "delta" "gamma" ⟩
    @@ -22,7 +22,7 @@
     "δγβα"
     

    Sort Down always matches Sort Up reversed, . The reason for this is that BQN's array ordering is a total order, meaning that if one array doesn't come earlier or later that another array in the ordering then the two arrays match. Since any two non-matching argument cells are strictly ordered, they will have one ordering in and the opposite ordering in . With the reverse, any pair of non-matching cells are ordered the same way in and . Since these two results have the same major cells in the same order, they match. However, note that the results will not always behave identically because Match doesn't take fill elements into account (if you're curious, take a look at ¨0,"" versus ¨0,"").

    -

    Grade

    +

    Grade

    See the APL Wiki page for a few more examples. BQN only has the monadic form.

    Grade is more abstract than Sort. Rather than rearranging the argument's cells immediately, it returns a list of indices (more precisely, a permutation) giving the ordering that would sort them.

    ↗️
         l  "planet""moon""star""asteroid"
    @@ -38,7 +38,7 @@
     ↗️
        (l)  l
     ⟨ "asteroid" "moon" "planet" "star" ⟩
     
    -

    Ordinals

    +

    Ordinals

    So the elements of the Grade of an array correspond to the cells of that array after it's sorted. It's tempting if you don't have the sorted list handy to try to match them up with major cells of the original array, but this never makes sense—there's no relationship. However, applying Grade twice gives us a list that does correspond to the original argument quite usefully: it says, for each major cell of that argument, what rank it has relative to the others (smallest is 0, next is 1, and so on, breaking ties in favor of which cell comes earlier in the argument). Experienced APL programmers call this pattern the "ordinals" idiom.

    ↗️
        l  ⍋⍋ l
     ┌─                                   
    @@ -49,7 +49,7 @@
     

    How does it work? First, let's note that l is a permutation: it contains exactly the numbers ↕≠l, possibly in a different order. In other words, ∧⍋l is ↕≠l. Permuting an array rearranges the cells but doesn't remove or duplicate any. This implies it's always invertible: given a permutation p, some other permutation q will have 𝕩 qp𝕩 for every 𝕩 of the right length. This would mean that while l transforms l to l, the inverse of l transforms l back into l. That's what we want: for each cell of l, the corresponding number in the inverse of l is what index that cell has after sorting.

    But what's the inverse q of a permutation p? Our requirement is that 𝕩 qp𝕩 for any 𝕩 with the same length as p. Setting 𝕩 to ↕≠p (the identity permutation), we have (↕≠p) qp, because p⊏↕≠p is just p. But if p is a permutation then p is ↕≠p, so our requirement could also be written (p) qp. Now it's all coming back around again. We know exactly how to get q! Defining qp, we have qp (p)p p ↕≠p, and qp𝕩 (qp)𝕩 (↕≠p)𝕩 𝕩.

    The fact that Grade Up inverts a permutation is useful in itself. Note that this applies to Grade Up specifically, and not Grade Down. This is because the identity permutation is ordered in ascending order. Grade Down would actually invert the reverse of a permutation, which is unlikely to be useful. So the ordinals idiom that goes in the opposite direction is actually not ⍒⍒ but ⍋⍒. The initial grade is different, but the way to invert it is the same.

    -

    Stability

    +

    Stability

    When sorting an array, we usually don't care how matching cells are ordered relative to each other (although it's possible to detect it by using fill elements carefully. They maintain their ordering). Grading is a different matter, because often the grade of one array is used to order another one.

    ↗️
         t  > "dog"4, "ant"6, "pigeon"2, "pig"4 
     ┌─            
    @@ -83,7 +83,7 @@
     ↗️
        (⌽⍒/345)  "012abcdABCDE"
     "210dcbaEDCBA"
     
    -

    Bins

    +

    Bins

    There's also an APL Wiki page on this function, but be careful as the Dyalog version has subtle differences.

    The two Bins functions are written with the same symbols and as Grade, but take two arguments instead of one. More complicated? A little, but once you understand Bins you'll find that it's a basic concept that shows up in the real world all the time.

    Bins behaves like a search function with respect to rank: it looks up cells from 𝕩 relative to major cells of 𝕨. However, there's an extra requirement: the left argument to Bins is already sorted according to whichever ordering is used. If it isn't, you'll get an error.

    @@ -100,7 +100,7 @@ ERROR ⟨ 3 5 0 1 ⟩

    A score of 565e7 sits between 578e7 and 553e7 at rank 3, 322e7 wouldn't make the list, 788e7 would beat everyone, and 627e7 would tie the high score but not beat it. The same principles apply to less spring-loaded things like character indices and line numbers (𝕨 is the index of the start of each line), or percentage scores and letter grades on a test (𝕨 is the minimum score possible for each grade). In each case, it's better to think of Bins not as a counting exercise but as finding "what bin" something fits into.

    -

    Array ordering

    +

    Array ordering

    Most of the time you won't need to worry about the details of how BQN arrays are ordered. It's documented here because, well, that's what documentation does.

    The array ordering defines some arrays to be smaller or larger than others. All of the "Up" ordering functions use this ordering directly, so that smaller arrays come earlier, and the "Down" ones use the opposite ordering, with larger arrays coming earlier. For arrays consisting only of characters and numbers, with arbitrary nesting, the ordering is always defined. If an array contains an operation, trying to order it relative to another array might give an error. If comparing two arrays succeeds, there are three possibilities: the first array is smaller, the second is smaller, or the two arrays match.

    Comparing two atoms is defined to work the same way as the comparison functions ≤<>≥. Numbers come earlier than characters and otherwise these two types are ordered in the obvious way. To compare an atom to an array, the atom enclosing and then compared with the array ordering defined below. The result of this comparison is used except when the two arrays match: in that case, the atom is considered smaller.

    diff --git a/docs/doc/paradigms.html b/docs/doc/paradigms.html index 73fe750c..a0305aa9 100644 --- a/docs/doc/paradigms.html +++ b/docs/doc/paradigms.html @@ -4,15 +4,15 @@ BQN in programming paradigms -

    BQN in programming paradigms

    +

    BQN in programming paradigms

    It hangs onto weakly positive connotations somehow, but the term "multi-paradigm" shouldn't impress you. Let's dig into exactly which paradigms BQN supports and how.

    This information doesn't tell you what tasks BQN is good for: after all, it turns out you can write an efficient compiler entirely using array programming, something many people assumed was impossible. Instead, it tells you what approaches you can take to writing programs, and how comfortable you'll find it to start using BQN—or how much you can use it to stretch your brain in new directions.

    When programming in BQN, I almost always use array, tacit, and (slightly impure) functional styles, and encapsulate code in medium or large projects using namespaces. I sometimes use object-oriented or imperative programming in addition to these.

    -

    Typing

    +

    Typing

    BQN is a dynamically typed language with a coarse type system that only distinguishes types when the difference is blindingly obvious. There is a single numeric type and a single unicode character type. A fast implementation such as CBQN will check to see when it can represent the data with a smaller type than the one offered by the language. BQN usually avoids implicit type conversion, with the exception that many primitives automatically convert atoms to unit arrays. The fact that a data value can be applied as a function to return itself could also be considered an implicit conversion.

    BQN has no "pointer" or "reference" type, and uses automatic memory management. Its data types are immutable while operations and namespaces are mutable; mutable data can create reference loops, which the implementation must account for in garbage collection but the programmer doesn't have to worry about.

    Dynamic types and garbage collection introduce overhead relative to a statically-typed or manually managed language. The impact of this overhead can be greatly reduced with array programming, because an array of numbers or characters can be stored as a single unit of memory and processed with functions specialized to its element type.

    -

    Styles

    +

    Styles

    BQN is designed for array programming. The array is its only built-in collection type and it has many primitives designed to work with arrays.

    BQN is okay for imperative programming. Blocks are lists of statements. Variables can be modified with , and while there are no truly global variables, lexical scoping allows variables at the top level of a file, which are similar (•Import with no left argument saves and reuses results, so that data can be shared between files by loading the same namespace-defining file in each). BQN doesn't directly support structured programming (which refers to a particular way to structure programs; it also doesn't have a Goto statement, the "unstructured" alternative when the term was coined). However, its first-class functions allow a reasonably similar imitation of control structures.

    Functional programming is a term with many meanings. Using the terms defined in the functional programming document, BQN supports first-class functions and function-level programming, allows but doesn't encourage pure functional programming, and does not support typed functional programming. BQN uses lexical scope and has full support for closures. In this way BQN is very similar to Lisp, although it lacks Lisp's macro system.

    diff --git a/docs/doc/pick.html b/docs/doc/pick.html index 76f9bfe9..d3166012 100644 --- a/docs/doc/pick.html +++ b/docs/doc/pick.html @@ -4,11 +4,11 @@ BQN: Pick -

    Pick

    +

    Pick

    Pick () chooses elements from 𝕩 based on index lists from 𝕨. 𝕨 can be a plain list, or even one number if 𝕩 is a list, in order to get one element from 𝕩. It can also be an array of index lists, or have deeper array structure: each index list will be replaced with the element of 𝕩 at that index, effectively applying to 𝕨 at depth 1.

    With no 𝕨, monadic 𝕩 takes the first element of 𝕩 in index order, with an error if 𝕩 is empty.

    While sometimes "scatter-point" indexing is necessary, using Pick to select multiple elements from 𝕩 is less array-oriented than Select (), and probably slower. Consider rearranging your data so that you can select along axes instead of picking out elements.

    -

    One element

    +

    One element

    When the left argument is a number, Pick gets an element from a list:

    ↗️
        2  01234
     2
    @@ -46,7 +46,7 @@
         ⟨⟩  'a'
     'a'
     
    -

    First

    +

    First

    With no left argument, is called First, and performs a slight generalization of Pick with a default left argument 0¨𝕩. For a non-empty array it returns the first element in index order.

    ↗️
         <'a'
     'a'
    @@ -69,7 +69,7 @@ ERROR
         ´ "last"
     't'
     
    -

    Many elements

    +

    Many elements

    Pick also accepts a list of indices:

    ↗️
        a  # Defined above
     ┌─       
    diff --git a/docs/doc/prefixes.html b/docs/doc/prefixes.html
    index 59e94b21..dd27b18c 100644
    --- a/docs/doc/prefixes.html
    +++ b/docs/doc/prefixes.html
    @@ -4,7 +4,7 @@
       BQN: Prefixes and Suffixes
     
     
    -

    Prefixes and Suffixes

    +

    Prefixes and Suffixes

    The Prefixes () function gives a list of all prefixes of its argument array along the first axis, and Suffixes () gives a similar list for suffixes. Because the result can be much larger than the argument, these functions may not be used often in high-performance code, but they are a powerful conceptual tool and can make sense for algorithms that are inherently quadratic.

    ↗️
         "abcde"
     ⟨ ⟨⟩ "a" "ab" "abc" "abcd" "abcde" ⟩
    @@ -26,9 +26,9 @@
     ⟨ "abcde" "abcde" "abcde" "abcde" "abcde" "abcde" ⟩
     

    Joining corresponding elements of 𝕩 and 𝕩 gives 𝕩 again. This is because i(↑∾¨)𝕩 is (i⊑↑𝕩)(i⊑↓𝕩), or, using the Take and Drop correspondences, (i𝕩)(i𝕩), which is 𝕩. Element-wise, we are combining the first i cells of 𝕩 with all but the first i. Looking at the entire result, we now know that (↑∾¨)𝕩 is (1+≠𝕩)⥊<𝕩. The total number of cells in this combined array is therefore (1+≠𝕩)×≠𝕩, or 1+×≠𝕩. Each of Prefixes and Suffixes must have the same total number of cells (in fact, 𝕩 is ¨𝕩, and Reverse doesn't change an array's shape). So the total number in either one is 2÷˜1+×≠𝕩. With n𝕩, it is 2÷˜n×1+n.

    -

    Definition

    +

    Definition

    Knowing the length and the elements, it's easy to define functions for Prefixes and Suffixes: is equivalent to (1+≠)¨< while is (1+≠)¨<. Each primitive is defined only on arrays with at least one axis.

    -

    Working with pairs

    +

    Working with pairs

    Sometimes it's useful to apply an operation to every unordered pair of elements from a list. For example, consider all possible products of numbers between 1 and 6:

    ↗️
        ×⌜˜ 1+↕6
     ┌─                  
    @@ -70,7 +70,7 @@
         (<˘ ¨¨   ) "abc"
     ⟨ ⟨⟩ ⟨ "ba" ⟩ ⟨ "ca" "cb" ⟩ ⟩
     
    -

    Slices

    +

    Slices

    Prefixes and Suffixes give certain restricted slices of the argument array, where a slice is a contiguous selection of cells. If we want all slices along the first axis, we can take the suffixes of each prefix, or vice-versa:

    ↗️
        ¨ "abc"
     ┌─                                                         
    diff --git a/docs/doc/primitive.html b/docs/doc/primitive.html
    index ef7f8d86..ebb09c09 100644
    --- a/docs/doc/primitive.html
    +++ b/docs/doc/primitive.html
    @@ -4,11 +4,11 @@
       BQN primitives
     
     
    -

    BQN primitives

    +

    BQN primitives

    Primitives are the basic functions and modifiers built into the language, written with individual glyphs (more about the concept here). The role of a primitive when written always matches its type (but you can use its value in other roles by assigning it, or other methods).

    Primitives have no side effects other than errors, and can't perform infinite computations, except when a primitive modifier calls an operand function that does one of these things (this can only happen when arguments are passed, as primitive modifiers are always deferred). Side effects here include both writing state such as variables or printed output, and reading any outside state, so that a function without them always returns the same result if passed the same arguments. Since trains and list notation have the same nice properties, tacit code written entirely with primitives, trains, and lists always describes finite, self-contained computations.

    Recursion is the primary way to perform potentially infinite computations in BQN, and it can be packaged into control structures like While for ease of use. A given BQN implementation might also provide system values for "impure" tasks like file access or other I/O.

    -

    Functions

    +

    Functions

    Functions that have significant differences from APL equivalents or don't appear in APL are marked with an asterisk.

    @@ -236,7 +236,7 @@
    -

    Modifiers

    +

    Modifiers

    diff --git a/docs/doc/range.html b/docs/doc/range.html index 8d73bf0b..6a8409ac 100644 --- a/docs/doc/range.html +++ b/docs/doc/range.html @@ -4,7 +4,7 @@ BQN: Range -

    Range

    +

    Range

    Range () is a monadic function that creates arrays of indices (like APL's famous iota function). Each element in the result is its own index.

    ↗️
         6
     ⟨ 0 1 2 3 4 5 ⟩
    @@ -43,7 +43,7 @@
       ⟨ 3 0 ⟩ ⟨ 3 1 ⟩  
                       ┘
     
    -

    Number range

    +

    Number range

    Calling on an atom, which must be a natural number, is the more common case. This gives us the list of natural numbers less than 𝕩 (and starting at 0, the first natural number as BQN defines it). You can also get the first b integers starting at a with a+↕b, or the natural numbers from a to b with a↓↕b.

    ↗️
        4
     ⟨ 0 1 2 3 ⟩
    @@ -96,7 +96,7 @@
     ↗️
        ` b × 1 + ↕≠b
     ⟨ 0 2 3 3 3 3 7 7 ⟩
     
    -

    List range

    +

    List range

    When the argument is a list of numbers, the result is an array of lists.

    ↗️
         234
     ┌─                                         
    diff --git a/docs/doc/repeat.html b/docs/doc/repeat.html
    index aaa68219..b5e4ba06 100644
    --- a/docs/doc/repeat.html
    +++ b/docs/doc/repeat.html
    @@ -4,7 +4,7 @@
       BQN: The Repeat modifier
     
     
    -

    The Repeat modifier

    +

    The Repeat modifier

    Repeat () is a 2-modifier that applies its operand function 𝔽 multiple times.

    ↗️
        »»» "ABCDE"
     "   AB"
    @@ -18,7 +18,7 @@
     

    F0 repeats F zero times, that is, does nothing. Like n0 gives the multiplicative identity 1, F0 is the compositional identity, . Since F1 applies F and F0 doesn't, Repeat might be pronounced "if" or "conditional" when 𝔾 is boolean.

    BQN's Repeat modifier has some extra functionality relative to the mathematical version. It allows a left argument, and some extensions to the right operand 𝔾. As usual for 2-modifiers, 𝔾 is actually a function that applies to the arguments to give a result. The result can be a natural number as shown above, or a negative number to Undo () 𝔽, or an array of values.

    -

    Left argument

    +

    Left argument

    If 𝕨 is given, it's passed as the left argument to 𝔽 for every invocation.

    ↗️
        3 +2 7
     13
    @@ -26,7 +26,7 @@
     13
     

    This kind of composition can't be represented by anymore (you'd need a train), but it's similar in spirit. 𝕨 𝔽n 𝕩 is always equivalent to 𝕨𝔽n 𝕩, provided n is a constant—not a function, as discussed in the next section.

    -

    Dynamic repetition count

    +

    Dynamic repetition count

    In the general case, 𝔾 is a function, which is applied to all arguments to get the repetition count. That is, the actual count is 𝕨𝔾𝕩.

    ↗️
        1 4
     ⟨ 4 1 1 1 1 ⟩
    @@ -42,13 +42,13 @@
     ↗️
        3 <¨ 246  # Left if less, i.e. minimum
     ⟨ 2 3 3 ⟩
     
    -

    Negative repetition

    +

    Negative repetition

    What does it mean to repeat a function a negative number of times? For a negative integer -n, BQN defines F(-n) to be Fn. In particular, F¯1 simply undoes F.

    ↗️
        1 ¯1 "abcde"  # Rotate backwards
     "eabcd"
     

    Because BQN's Undo is a little looser than a strict mathematical inverse, this is an extension of the function inverse written f⁻¹ in mathematics. As a result, it doesn't have all the same properties. For natural numbers, Repeat follows the rule that Fm Fn 𝕩 is F(m+n) 𝕩. For integers, we have 𝕩 Fn F(-n) 𝕩, but not necessarily 𝕩 F(-n) Fn 𝕩.

    -

    Array of repetition counts

    +

    Array of repetition counts

    The value of 𝕨𝔾𝕩 might also be an array, whose elements are any valid repetition values—integers, or other arrays. Each integer in the nested structure is replaced with the result of repeating 𝔽 that many times.

    ↗️
        2×2,4,¯2,1⟩⟩ 1
     ⟨ 4 ⟨ 16 0.25 2 ⟩ ⟩
    diff --git a/docs/doc/replicate.html b/docs/doc/replicate.html
    index 844b67c9..4e8aba5d 100644
    --- a/docs/doc/replicate.html
    +++ b/docs/doc/replicate.html
    @@ -4,7 +4,7 @@
       BQN: Indices and Replicate
     
     
    -

    Indices and Replicate

    +

    Indices and Replicate

    @@ -87,7 +87,7 @@

    The functions Indices and Replicate are used to copy or filter data. They might be described as transforming a run-length encoding into unencoded form. On the other hand, Indices might be described as giving a sparse representation of 𝕩, which is smaller if 𝕩 mostly consists of zeros.

    BQN doesn't have any of the various features used in APL to add fills to the result of Replicate, like negative numbers in 𝕨 or an Expand (\) primitive. An alternative to Expand is to use Replicate with structural Under () to insert values into an array of fills.

    -

    Replicate

    +

    Replicate

    Given a list of natural numbers 𝕨, Replicate repeats each major cell in 𝕩 the corresponding number of times. That is, 𝕨 and 𝕩 must have the same length, and the result includes i𝕨 copies of each cell i𝕩, in order.

    ↗️
        2102 / "abcd"
     "aabdd"
    @@ -128,7 +128,7 @@
     ↗️
        {1+'"'=𝕩}/ "for ""escaping"" quotes"
     "for """"escaping"""" quotes"
     
    -

    Compound Replicate

    +

    Compound Replicate

    If 𝕨 has depth two, then its elements give the amounts to copy along each leading axis of 𝕩.

    ↗️
         b  25  10
     ┌─           
    @@ -172,7 +172,7 @@
     ↗️
        b  ⟨⟩ / b
     1
     
    -

    Indices

    +

    Indices

    The monadic form of / is much simpler than the dyadic one, with no multidimensional case or mismatched argument ranks. 𝕩 must be a list of natural numbers, and /𝕩 is the list 𝕩/↕≠𝕩. Its elements are the indices for 𝕩, with index i repeated i𝕩 times.

    ↗️
        / 3012
     ⟨ 0 0 0 2 3 3 ⟩
    @@ -212,7 +212,7 @@
           ┘
     

    This means the transitions can be grouped exactly in pairs, the beginning and end of each group. Reshape with a computed length 2 groups these pairs, and then a scan -˜`˘ can be used to convert the start/end format to start/length if wanted.

    -

    Inverse

    +

    Inverse

    The result of Indices /n is an ordered list of natural numbers, where the number i appears in times. Given an ordered list of natural numbers k, the inverse of indices returns a corresponding n: one where the value in is the number of times i appears in k.

    ↗️
        / 321
     ⟨ 0 0 0 1 1 2 ⟩
    diff --git a/docs/doc/reshape.html b/docs/doc/reshape.html
    index 58b7dece..e9523103 100644
    --- a/docs/doc/reshape.html
    +++ b/docs/doc/reshape.html
    @@ -4,7 +4,7 @@
       BQN: Deshape and Reshape
     
     
    -

    Deshape and Reshape

    +

    Deshape and Reshape

    @@ -60,7 +60,7 @@

    The glyph indicates BQN's facilities to reflow the data in an array, giving it a different shape. Its monadic form, Deshape, simply removes all shape information, returning a list of all the elements from the array in reading order. With a left argument, is called Reshape and is a more versatile tool for rearranging the data in an array into the desired shape.

    Because of its dependence on the reading order of an array, Reshape is less fundamental than other array operations. Using Reshape in the central computations of a program can be a sign of imperfect usage of arrays. For example, it may be useful to use Reshape to create a constant array or repeat a sequence of values several times, but the same task might also be accomplished more simply with Table , or by taking advantage of leading axis agreement in arithmetic primitives.

    -

    Deshape

    +

    Deshape

    The result of Deshape is a list containing the same elements as the argument.

    ↗️
         a  +⌜´ 100200, 3040, 567
     ┌─             
    @@ -94,10 +94,10 @@
          2
     ⟨ 2 ⟩
     
    -

    Reshape

    +

    Reshape

    While Deshape removes all shape information from its argument array, Reshape adds shape information back based on the left argument. Reshape ignores the shape of its original argument, treating it like a list of elements as though it were deshaped initially.

    The left argument of Reshape gives the shape of the result, except that one entry can be left unspecified for BQN to fill in. We'll look at the cases where a full shape is given first.

    -

    Matching lengths

    +

    Matching lengths

    If the number of elements implied by the given shape—that is, ×´𝕨—is equal to the number of elements in 𝕩, then 𝕩 is simply rearranged to match that shape. The element list is kept the same, so that the deshaped result matches the deshaped right argument.

    ↗️
        a
     ┌─             
    @@ -133,7 +133,7 @@
       7 8 9 10 11 12 13  
                         ┘
     
    -

    Non-matching lengths

    +

    Non-matching lengths

    If 𝕨 implies a smaller number of elements than are present initially, then only the initial elements of 𝕩 are used. Here the result stops at 237, three-quarters of the way through a, because at that point the result is filled up.

    ↗️
        33  a
     ┌─             
    @@ -160,7 +160,7 @@ ERROR
         5  < "string"
     ⟨ "string" "string" "string" "string" "string" ⟩
     
    -

    Computed lengths

    +

    Computed lengths

    What if you want to reshape an array into, say, rows of length 2, but don't want to write out the number of rows?

    ↗️
        2  "aAeEiIoOuU"
     ┌─    
    diff --git a/docs/doc/reverse.html b/docs/doc/reverse.html
    index 104a7e44..d1a9ccec 100644
    --- a/docs/doc/reverse.html
    +++ b/docs/doc/reverse.html
    @@ -4,10 +4,10 @@
       BQN: Reverse and Rotate
     
     
    -

    Reverse and Rotate

    +

    Reverse and Rotate

    The symbol indicates two different array transformations: with no left argument, it reverses the major cells of the array, but with a left argument, it rotates or cycles them around. These two possibilities, first put together in very early versions of APL, can't be considered restrictions or different views of some unifying function, but there are connections between them. Each returns an array with the same shape and all the same elements as 𝕩, possibly in a different arrangement. And elements that start out next to each other in 𝕩 generally stay next to each other—always, if we consider an element on one edge to be next to the one opposite to it. One might think of them as isometries preserving a discrete subgroup of the torus, if one were inclined to think such things. On major cells, the two functions decompose the dihedral group okay I'll stop.

    Many uses of Rotate in APL are better handled by shift functions in BQN. If there's no reason to treat the data as cyclic or periodic, it's best to avoid Rotate.

    -

    Reverse

    +

    Reverse

    There's not too much to say about Reverse. It puts the elements of a list the other way around, or more generally the major cells of an array.

    ↗️
         "abcdefg"
     "gfedcba"
    @@ -45,7 +45,7 @@
         ` 0010010
     ⟨ 1 1 1 1 1 1 0 ⟩
     
    -

    Rotate

    +

    Rotate

    Rotate moves elements in a list around cyclically. It can also rotate any number of axes of the argument array by different amounts at once. That's discussed in the next section; for now we'll stick to a single number for 𝕨. It has to be an integer, and 𝕩 has to be an array with at least one axis.

    ↗️
        2  "rotate"
     "tatero"
    @@ -73,7 +73,7 @@
     ↗️
        ¯2  "rotate"
     "terota"
     
    -

    Multiple axes

    +

    Multiple axes

    The easiest way to rotate a later array axis is usually to use the Cells (˘) or Rank () modifier.

    ↗️
         tab  34"abcdABCD0123"
     ┌─      
    diff --git a/docs/doc/scan.html b/docs/doc/scan.html
    index 4b168b89..9d881cba 100644
    --- a/docs/doc/scan.html
    +++ b/docs/doc/scan.html
    @@ -4,7 +4,7 @@
       BQN: Scan
     
     
    -

    Scan

    +

    Scan

    @@ -61,7 +61,7 @@

    The 1-modifier Scan (`) moves along the first axis of the array 𝕩, building up an array of results by applying 𝔽 repeatedly beginning with 𝕨 or 𝕩. It's related to the fold modifiers, and most closely resembles the APL2-style reduction ¨˝, but it traverses the array in forward rather than reverse index order, and includes all intermediate results of 𝔽 in its output instead of just the final one.

    BQN's Scan is ordered differently from Scan in APL. Both include one result for each non-empty prefix of 𝕩. In BQN this is a left-to-right fold, so that each new result requires one application of 𝔽. APL uses right-to-left folds, which matches with reduction, but requires starting over at the end for each new prefix, except in special cases. If needed, this definition can be obtained with a fold on each prefix except the first (which is empty). In the particular case of -, that nested solution isn't needed: negate odd-indexed elements and then apply +`.

    Scan also differs from Fold or Insert in that it never depends on 𝔽's identity value, because scanning over an empty array simply returns that array.

    -

    Lists

    +

    Lists

    The best-known use of Scan is the prefix sum of a list, in which each element of the result is the sum of that element and all the ones before it. With a shift this can be modified to sum the previous elements only.

    ↗️
        +` 2431
     ⟨ 2 6 9 10 ⟩
    @@ -108,7 +108,7 @@
         {¬<`'\'=𝕩}/ "ab\\\rs\\\\"
     "ab\rs\\"
     
    -

    Reverse scan

    +

    Reverse scan

    We've discussed how the scan moves forward along 𝕩, so that each time 𝔽 takes an old result as 𝕨 and a new value as 𝕩. This means that results correspond to prefixes and go left to right on each one. Since the most important scans have associative, commutative operands, the left-to-right ordering often doesn't make a difference. But sometimes a suffix rather than prefix scan is wanted. For these cases, Scan Under Reverse (`) does the trick.

    ↗️
        `   0010010
     ⟨ 0 0 1 1 1 1 1 ⟩
    @@ -124,7 +124,7 @@
     ↗️
        {"("𝕨")𝔽"𝕩}˜` "a""b""c""d"
     ⟨ "(a)𝔽(b)𝔽(c)𝔽d" "(b)𝔽(c)𝔽d" "(c)𝔽d" "d" ⟩
     
    -

    Higher ranks

    +

    Higher ranks

    Scan moves along the leading axis of 𝕩: vertically, for a table. To apply a scan to later axes, use ˘ or . Since a scan returns an array with the same shape as its argument, this can't cause an error from differing result cell shapes, unlike Fold or Insert.

    ↗️
         a  ¯20.25'a'  34¯101
     ┌─                
    @@ -152,6 +152,6 @@
                    ┘
     

    Results are produced in index order. This means that instead of moving along each column in turn, a scan produces the first result cell one element at a time, then the next, and so on. Something like a breadth-first as opposed to depth-first ordering.

    -

    Definition

    +

    Definition

    Scan admits a simple recursive definition. 𝕩 is an array of rank one or more and 𝕨, if given, is an atom or array with shape 1↓≢𝕩. The result z𝕨𝔽`𝕩 is an array with the same shape as 𝕩. If it has length at least one, z is 𝕩 if 𝕨 isn't given and 𝕨𝔽¨𝕩 if it is. For 0i, (i+1)z is (iz)𝔽¨(i+1)𝕩.

    The ordering of 𝔽 application is the natural one for this definition: cells are computed in turn, and each instance of 𝔽¨ goes in index order.

    diff --git a/docs/doc/search.html b/docs/doc/search.html index accd54fa..7834a06d 100644 --- a/docs/doc/search.html +++ b/docs/doc/search.html @@ -4,7 +4,7 @@ BQN: Search functions -

    Search functions

    +

    Search functions

    @@ -133,7 +133,7 @@

    The searched-for argument is 𝕩 in Index-of functions (⊐⊒) and 𝕨 in Member of (). Bins Up and Down (⍋⍒) are ordering functions but follow the same pattern as Index-of. It's split into cells, but not necessarily major cells: instead, the cells used match the rank of a major cell of the other (searched-in) argument. In the most common case, when the searched-in argument is a list, 0-cells are used for the search (we might also say elements, as it gives the same result).

    The result is always an array containing one number for each searched-for cell. For Index of and Member of, every result is computed independently; for Progressive Index of the result for a cell can depend on earlier cells, in index order.

    -

    Member of

    +

    Member of

    The simplest of the search functions, Member of () returns 1 if an entry in 𝕨 matches some entry in 𝕩, and 0 if it doesn't.

    ↗️
        "green""bricks""cow""blue"  "red""green""blue"
     ⟨ 1 0 0 1 ⟩
    @@ -147,7 +147,7 @@
     "tal st"
     

    These are the APL functions Intersect () and Without (~). Really, only 𝕩 is treated like a set, while the ordering and multiplicity of elements of 𝕨 are maintained. I think the explicit implementations show this well, since 𝕩 is only used as the right argument to , and prefer this clarity to the brevity of a single symbol.

    -

    Index of

    +

    Index of

    Index of () returns the index of the first occurrence of each entry in 𝕨, or 𝕨 if an entry doesn't appear in 𝕨 at all.

    ↗️
        "zero""one""two""three"  "one""eight""two"
     ⟨ 1 4 2 ⟩
    @@ -160,7 +160,7 @@
         "letters" (</˘⌜˜) "let"  # Many to many
     ⟨ ⟨ 0 ⟩ ⟨ 1 4 ⟩ ⟨ 2 3 ⟩ ⟩
     
    -

    Progressive Index of

    +

    Progressive Index of

    Progressive Index of (), as the name and glyph suggest, is a more sophisticated variant of Index of. Like Index of, it returns either 𝕨 or an index of a cell from 𝕨 that matches the given cell of 𝕩. Unlike Index of, no index except 𝕨 can ever be repeated. Progressive Index of returns the index of the first unused match, provided there's still one left.

    ↗️
        "aaa"  "aaaaa"
     ⟨ 0 1 2 3 3 ⟩
    @@ -192,7 +192,7 @@
     ↗️
        ˜ "anything at all"
     ⟨ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ⟩
     
    - +

    Search functions are designed to search for multiple elements at once, and return an array of results. This is the array-oriented way to do it, and can allow faster algorithms to be used for the computation.

    ↗️
        stuff  "tacks""paper""string""tape"
     
    @@ -218,7 +218,7 @@
     2
     

    For Member of, the equivalent is stuff<.

    -

    Higher ranks

    +

    Higher ranks

    So far we've shown search functions acting on lists. Well, and one example with a unit array slipped into the last section. In fact, if the searched-in array is a list, then the searched-for argument can have any rank.

    ↗️
        ("high""rank")  "list arg"
     ┌─         
    diff --git a/docs/doc/select.html b/docs/doc/select.html
    index 63a55e1e..213731bf 100644
    --- a/docs/doc/select.html
    +++ b/docs/doc/select.html
    @@ -4,7 +4,7 @@
       BQN: Select
     
     
    -

    Select

    +

    Select

    @@ -50,7 +50,7 @@

    The function Select () reorganizes the array 𝕩 along one or more axes based on indices given by 𝕨. The result has the same depth as 𝕩, since its elements are always elements of 𝕩. This means it differs from Pick (), which takes elements from 𝕩 but can arrange them in any nested structure, including returning an element directly.

    The monadic form First Cell () gets the major cell with index 0, so that 𝕩 is identical to 0𝕩.

    -

    Single selection

    +

    Single selection

    Each axis of a BQN array is numbered starting at zero. Major cells are arranged along the first axis; in accordance with the leading axis principle, Select returns a major cell of 𝕩 when 𝕨 is an atom.

    ↗️
        2  "abcdef"  # An enclosed element
     ┌·   
    @@ -91,7 +91,7 @@ ERROR
          'a'
     ERROR
     
    -

    First-axis selection

    +

    First-axis selection

    If 𝕨 is an array of numbers (including any empty array), then each number indicates a major cell of 𝕩. In the simplest case, a list of numbers gives a result with the same rank as 𝕩 but maybe not the same length.

    ↗️
        233041  "OlZEt"
     "ZEEOtl"
    @@ -161,7 +161,7 @@ ERROR
       0123" 
            ┘
     
    -

    Multi-axis selection

    +

    Multi-axis selection

    Select also allows 𝕨 to apply to multiple axes of 𝕩 simultaneously. For this case, 𝕨 must be a non-empty list (or unit array) where every element is an array of indices.

    ↗️
        21, 300  34
     ┌─                         
    diff --git a/docs/doc/selfcmp.html b/docs/doc/selfcmp.html
    index 432cd7c4..90438b04 100644
    --- a/docs/doc/selfcmp.html
    +++ b/docs/doc/selfcmp.html
    @@ -4,7 +4,7 @@
       BQN: Self-search functions
     
     
    -

    Self-search functions

    +

    Self-search functions

    @@ -148,7 +148,7 @@ ⟨ 1 1 0 1 0 ⟩

    The result has one number for each major cell, or in other words is a list with the same length as its argument. Three self-search functions follow this pattern, but Deduplicate () is different: it returns an array of the same rank but possibly a shorter length than the argument.

    -

    Classify

    +

    Classify

    Classify is the universal self-search function, in that it preserves all the self-search information in its argument. It gives each different cell value a natural number, ordered by first appearance.

    ↗️
         562251
     ⟨ 0 1 2 2 0 3 ⟩
    @@ -169,7 +169,7 @@
         {(𝕏≡𝕏)"dbaedcbcecbcd"}¨ 
     ⟨ 1 1 1 0 ⟩
     
    -

    Classify and Deduplicate

    +

    Classify and Deduplicate

    Classify is also related to Deduplicate. In a way they are complements: applying both in sequence always gives a completely uninteresting result!

    ↗️
         c  >"yellow""orange""yellow""purple""orange""yellow"
     ┌─        
    @@ -206,7 +206,7 @@
     

    One way to view this relationship is to consider an idea from linear algebra, where an idempotent transformation is called a "projection". That means that the argument might be any value but the result is part of a smaller class of values, and any argument from that smaller class is left the same. What arrays do the two functions project to? The result of Deduplicate is an array with no repeated major cells. The result of Classify is a list of natural numbers, but it also has an additional property: each number in the list is at most one higher than the previous numbers, and the first number is zero. This comes from the way Classify numbers the cells of its argument. When it finds a cell that hasn't appeared before (at a lower index), it always chooses the next higher number for it.

    Applying both Classify and Deduplicate gives an array that has both properties (this isn't the case for all pairs of projections—we need to know that Classify maintains the uniqueness property for Deduplicate and vice-versa). It has no duplicate major cells, and it's a list of natural numbers that starts with 0 and never goes up by more than one. Taken together, these are a tight constraint! The first element of the argument has to be 0. The next can't be 0 because it's already appeared, but it can't be more than one higher—it has to be 1. The next can't be 0 or 1, and has to be 2. And so on. So the result is always n for some n. In fact it's possible to determine the length as well, by noting that each function preserves the number of unique major cells in its argument. Classify does this because distinct numbers in the output correspond exactly to distinct major cells in the input; Deduplicate does this because it only removes duplicate cells, not distinct ones. So the final result is n, where n is the number of unique major cells in the argument.

    -

    Mark Firsts

    +

    Mark Firsts

    See the APL Wiki page on this function as well.

    Mark Firsts () is the simplest self-search function: it returns 0 for any major cell of the argument that is a duplicate of an earlier cell and 1 for a major cell that's the first with its value. To implement Deduplicate in terms of Mark Firsts, just filter out the duplicates with /.

    ↗️
           314159265
    @@ -221,7 +221,7 @@
     ⟨ 0 0 1 0 1 ⟩
     

    Remember that you don't have to apply the result of Mark Firsts to the same array you got it from! For example, it might be useful in a database application to find unique values in a particular column but use these to filter the entire table, or one other column.

    -

    Occurrence Count

    +

    Occurrence Count

    Occurrence Count () is a somewhat more sophisticated take on the idea behind Mark Firsts: instead of just testing whether a cell is a duplicate, it returns a number indicating how many previous cells match it. This means that Mark Firsts can be implemented with 0=⊒.

    ↗️
           27181718284
     ⟨ 0 0 0 0 1 1 2 1 1 2 0 ⟩
    @@ -248,7 +248,7 @@
     ⟨ 0 1 0 1 2 0 1 2 3 ⟩
     

    A more efficient way when doesn't have a fast implementation is /(¯1⊑↕-⊏»)+`, but that's clearly quite a bit more complicated.

    -

    Deduplicate

    +

    Deduplicate

    There's also an APL Wiki page on this function.

    Deduplicate removes every major cell from the argument that matches an earlier cell, resulting in an array with the same rank but possibly a shorter length. It might also be described as returning the unique major cells of the argument, ordered by first occurrence. Deduplicate Under Reverse () orders by last occurrence instead.

    ↗️
         >"take""drop""drop""pick""take""take"
    diff --git a/docs/doc/shape.html b/docs/doc/shape.html
    index 99a4260e..80cfb45e 100644
    --- a/docs/doc/shape.html
    +++ b/docs/doc/shape.html
    @@ -4,10 +4,10 @@
       BQN: Array dimensions
     
     
    -

    Array dimensions

    +

    Array dimensions

    The function Shape () returns an array's shape, and Rank (=) and Length () return properties that can be derived from the shape. BQN's arrays are multidimensional, so that the shape is a list of natural numbers (the length along each axis), while the rank (length of the shape) and length (of the first axis) are numbers. In these functions, an atom is treated as a unit array, which has rank 0 and empty shape. A unit has no first axis, but its length is defined to be 1.

    Rank can be defined as while Length can be defined with a fold to be 1´.

    -

    Examples

    +

    Examples

    The function Reshape () always returns an array of shape 𝕨, so we use it to make an array of shape 1326 in the example below (Take () shares this property if (𝕨)≤=𝕩).

    ↗️
         arr  1326  '0'+↕10
     ┌─        
    @@ -44,7 +44,7 @@
         (=  ) <↕10
     ⟨ 0 1 ⟩
     
    -

    Units

    +

    Units

    A unit is an atom, or an array with no axes—rank 0. (See Enclose for more about unit arrays). Since it doesn't have any axes, its shape should have no elements. It should be the empty list ⟨⟩ (with a fill of 0, like all shapes). As there's no first element in the shape, it's not obvious what the length should be, and a stricter language would just give an error. However, there are some good reasons to use a length of 1. First, the total number of elements is 1, meaning that if the length divides this number evenly (as it does for non-unit arrays) then the only possible natural number it can be is 1. Second, many functions that take a list for a particular argument also accept a unit, and treat it as a length-1 array. For example, 5a and 5a are identical. Defining 5 to be 1 means that =sa is always s.

    Despite this last point, it's important to remember that a unit isn't the same as a 1-element list. For example, the length-1 string "a" doesn't match <'a' but instead 'a'. And also bear in mind that having an empty shape doesn't make a unit an empty array. That would mean it has no elements, not one!

    diff --git a/docs/doc/shift.html b/docs/doc/shift.html index 012dfb0d..5b72e865 100644 --- a/docs/doc/shift.html +++ b/docs/doc/shift.html @@ -4,7 +4,7 @@ BQN: Shift functions -

    Shift functions

    +

    Shift functions

    The shift functions « and » add major cells to one side an array, displacing cells on the opposite side and moving those in between. Shifts resemble but are more general than the bit-based shift operations used in low-level languages. They replace the APL pattern of a 2-wise reduction after appending or prepending an element (APL's 2≠/0,v translates to »v), one of the more common uses of 2-wise reduction.

    The result of a shift function always has the same shape as 𝕩. The function adds major cells to the beginning (») or end («) of 𝕩, moving the cells already in 𝕩 to accomodate them. Some cells on the opposite side from those added will "fall off" and not be included in the result.

    ↗️
        00 » 321             # Shift Before
    @@ -19,7 +19,7 @@
     ⟨ 2 3 0 ⟩
     

    If 𝕨 is longer than 𝕩, some cells from 𝕨 will be discarded, as well as all of 𝕩. In this case 𝕨»𝕩 is (𝕩)𝕨 and 𝕨«𝕩 is (-≠𝕩)𝕨. For similar reasons, nudging an array of length 0 returns it unchanged.

    -

    Sequence processing with shifts

    +

    Sequence processing with shifts

    When working with a sequence of data such as text, daily measurements, or audio data, shift functions are generally the best way to handle the concept of "next" or "previous". In the following example s is shown alongside the shifted-right data »s, and each element is compared to the previous with -», which we see is the inverse of Plus Scan +`.

    ↗️
        s  1224356
         s  »s
    @@ -58,7 +58,7 @@
     ⟨ ¯0.5 ¯0.5 ¯1 ¯0.5 ¯0.5 ¯1.5 ¯0.5 ⟩
     

    A feature these examples all share is that they maintain the length of s. This is a common condition in sequence processing, particularly when the processed sequence needs to be combined or compared with the original in some way. However, it's not always the case. In some instances, for example when searching s to see if there is any value less than the previous, the list should get shorter during processing. In these cases, Windows () is usually a better choice.

    -

    Arithmetic and logical shifts

    +

    Arithmetic and logical shifts

    The glyphs « and », suggesting movement, were chosen for the same reasons as the digraphs << and >>, and can be used to implement the same operations on boolean lists.

    ↗️
         i  "10011011"-'0'
     ⟨ 1 0 0 1 1 0 1 1 ⟩
    @@ -76,7 +76,7 @@
         3 (⊏»⊢) i  # Arithmetic right shift
     ⟨ 1 1 1 1 0 0 1 1 ⟩
     
    -

    Other examples

    +

    Other examples

    In Take (), there's no way to specify the fill element when the result is longer than the argument. To take along the first axis with a specified, constant fill value, you can use Shift Before instead, where the right argument is an array of fills with the desired final shape.

    ↗️
        "abc" » 5'F'
     "abcFF"
    @@ -95,7 +95,7 @@
     ↗️
        (×`1»⊢) 5243
     ⟨ 24 12 3 1 ⟩
     
    -

    Higher rank

    +

    Higher rank

    Shifting always works on the first axis of 𝕩 (which must have rank 1 or more), and shifts in major cells. A left argument can have rank equal to 𝕩, or one less than it, in which case it becomes a single cell of the result. With no left argument, a cell of fills 10𝕩 is nudged in.

    ↗️
         a  (↕×´) 43
     ┌─         
    @@ -129,7 +129,7 @@
       'c' 'e' 'l'  
                   ┘
     
    -

    Definition

    +

    Definition

    In any instance of » or «, 𝕩 must have rank at least 1.

    For a dyadic shift function, 𝕨 must be Join-compatible with 𝕩 (that is, 𝕨𝕩 completes without error) and cannot have greater rank than 𝕩. Then Shift Before (») is {(𝕩)𝕨𝕩} and Shift After («) is {(-≠𝕩)𝕩𝕨}

    When called monadically, the default argument is a cell of fills 10𝕩. That is, Nudge (») is (10↑⊢)» and Nudge Back («) is (10↑⊢)«. This default argument always satisfies the compatibility requirement above and so the only conditions for nudge are that 𝕩 has rank at least 1 and has a fill element.

    diff --git a/docs/doc/syntax.html b/docs/doc/syntax.html index 98e09225..54271ab0 100644 --- a/docs/doc/syntax.html +++ b/docs/doc/syntax.html @@ -4,9 +4,9 @@ BQN: Syntax overview -

    Syntax overview

    +

    Syntax overview

    BQN syntax consists of expressions where computation is done with a little organizing structure around them like assignment, functions, and list notation. Expressions are where the programmer is in control so the design tries to do as much as possible with them before introducing special syntax.

    -

    Special glyphs

    +

    Special glyphs

    The following glyphs are used for BQN syntax. Primitives (built-in functions and modifiers) are not listed in this table, and have their own page. Digits, characters, and the underscore _ are used for numbers, and identifiers or variable names.

    @@ -102,9 +102,9 @@
    -

    Comments

    +

    Comments

    A comment starts with # that is not part of a string and continues to the end of the line.

    -

    Constants

    +

    Constants

    BQN has single-token notation for numbers, strings, and characters.

    Numbers allow the typical decimal notation with ¯ for the negative sign (because - is a function) and e for scientific notation (or E, as numeric notation is case-insensitive). and π may be used as special numeric values. If complex numbers are supported, then they can be written with the components separated by i. However, no BQN to date supports complex numbers.

    ↗️
         ¯π  0.5  5e¯1  1.5E3      # A list of numbers
    @@ -118,7 +118,7 @@
     ⟨ 1 0 ⟩
     

    The null character (code point 0) has a dedicated literal representation @. This character can be used to directly convert between characters and numeric code points, which among many other uses allows tricky characters to be entered by code point: for example, a non-breaking space is @+160. The character can also be entered as a character literal, but this will display differently in various editors and some tools may have trouble with a file directly containing a null, so it is best to use @ instead.

    -

    Expressions

    +

    Expressions

    More discussion

    Like APL, BQN uses four syntactic roles for values in expressions:

      @@ -136,7 +136,7 @@ ┘

    BQN's built-in operations also have patterns to indicate the syntactic role: 1-modifiers (˜¨˘⁼⌜´`) are all superscript characters, and 2-modifiers (∘○⊸⟜⌾⊘◶⚇⎉⍟) all have an unbroken circle (two functions ⌽⍉ have broken circles with lines through them). Every other built-in constant is a function, although the special symbols ¯, , and π are used as part of numeric literal notation.

    -

    Assignment

    +

    Assignment

    Another element that can be included in expressions is assignment, which is written with to define (also called "declare" in many other languages) a variable and to change its definition. A variable can only be defined once within a scope, and can only be changed if it has already been defined. However, it can be shadowed, meaning that it is defined again in an inner scope even though it has a definition in an outer scope already.

    ↗️
        x1  {x2  x3  x}
     3
    @@ -149,7 +149,7 @@
         a
     ¯3
     
    -

    Exports

    +

    Exports

    The double arrow is used to export variables from a block or program, causing the result to be a namespace. There are two ways to export variables. First, in the variable definition can be replaced with to export the variable as it's defined. Second, an export statement consisting of an assignment target followed by with nothing to the right exports the variables in the assignment target and does nothing else. Export statements can be placed anywhere in the relevant program or body, including before declaration or on the last line, and a given variable can be exported any number of times.

    aliasa, b, c0c1c, b2b{
       bc   # Non-definition exports can go anywhere
    @@ -159,13 +159,13 @@
     }
     

    Fields of the resulting namespace can be accessed either directly using namespace.field syntax, or with a destructuring assignment as shown above. This assignment's target is a list where each element specifies one of the names exported by the block and what it should be assigned to. The element can be either a single name (such as b above), which gives both, or a combination of the assignment target, then , then a name. If is never used, the names can be given as a strand with . To use for aliases, bracket syntax ⟨⟩ is needed. Imported names can be repeated and can be spelled with any role (the role is ignored).

    -

    Lists and blocks

    -

    Separators

    +

    Lists and blocks

    +

    Separators

    The characters and , and newline are completely interchangeable and are used to separate expressions. An expression might be an element in a list or a line in a function. Empty sections—those that consist only of whitespace—are ignored. This means that any number of separators can be used between expressions, and that leading and trailing separators are also allowed. The expressions are evaluated in text order: left to right and top to bottom.

    -

    List notation

    +

    List notation

    Full documentation

    Lists (1-dimensional arrays) are enclosed in angle brackets ⟨⟩, with the results of the expressions in between being the list's elements. Lists of two elements or more can also be written with the ligature character . This character has higher binding strength than any part of an expression. If one of the elements is a compound expression, then it will need to be enclosed in parentheses.

    -

    Blocks

    +

    Blocks

    Full documentation

    Blocks are written with curly braces {} and can be used to group expressions or define functions and modifiers. The contents are simply a sequence of expressions, where each is evaluated and the result of the last is returned in order to evaluate the block. This result can have any value, and its syntactic role in the calling context is determined by the normal rules: functions return subjects and modifiers return functions. Blocks have lexical scope.

    The special names 𝕨 and 𝕩, which stand for arguments, and 𝕗 and 𝕘, which stand for operands, are available inside curly braces. Like ordinary names, the lowercase forms indicate subjects and the uppercase forms 𝕎𝕏𝔽𝔾 indicate functions. The type and syntactic role of the block is determined by its contents: a 2-modifier contains 𝕘, a 1-modifier contains 𝕗 but not 𝕘, and a function contains neither but does have one of 𝕨𝕩𝕤𝕎𝕏𝕊. If no special names are present the block is an immediate block and is evaluated as soon as it appears, with the result having a subject role.

    diff --git a/docs/doc/take.html b/docs/doc/take.html index 0dcbaa39..1dd895ff 100644 --- a/docs/doc/take.html +++ b/docs/doc/take.html @@ -4,7 +4,7 @@ BQN: Take and Drop -

    Take and Drop

    +

    Take and Drop

    @@ -43,7 +43,7 @@

    These extensions can be combined as well, so there are a lot of possibilities. A good picture to have in mind is cutting out a corner of the array 𝕩. This is because the result 𝕨𝕩 or 𝕨𝕩 always aligns with one side of 𝕩 along each axis, so it aligns with the corner where those sides meet.

    The result d𝕩 is always the same as t𝕩 for some other argument t, but computing t wouldn't be too convenient. The reverse isn't true: only Take can insert fills, so results that include them can't come from Drop.

    -

    One axis

    +

    One axis

    Let's start with a natural number 𝕨. Take gives the first 𝕨 major cells of 𝕩 (or elements of a list), while Drop gives all but the first 𝕨.

    ↗️
        4  "take and drop"
     "take"
    @@ -76,7 +76,7 @@
         3  <"element"
     ⟨⟩
     
    -

    Negative argument

    +

    Negative argument

    If 𝕨 is negative then wraps around the other side to take or drop from the end of 𝕩. It's a lot like negative indices in Select (), but while negative indices are asymmetric—0 is the first entry but ¯1 is the last—this case is symmetric. It's because the place to cut is always before the index 𝕨, cancelling out the negative index asymmetry.

    ↗️
        3  "abcdeEDCBA"
     "abc"
    @@ -98,7 +98,7 @@
     ↗️
        ¯6  "xy"
     "    xy"
     
    -

    Multiple axes

    +

    Multiple axes

    In the general case 𝕨 is a list of integers. They're matched with the leading axes of 𝕩, so that each affects one axis independently from the others.

    ↗️
         m  (10×↕5) + 7
     ┌─                      
    diff --git a/docs/doc/train.html b/docs/doc/train.html
    index b9c2f398..9675922c 100644
    --- a/docs/doc/train.html
    +++ b/docs/doc/train.html
    @@ -4,10 +4,10 @@
       BQN: Function trains
     
     
    -

    Function trains

    +

    Function trains

    Trains are an important aspect of BQN's tacit programming capabilities. In fact, a crucial one: with trains and the identity functions Left () and Right (), a fully tacit program can express any explicit function whose body is a statement with 𝕨 and 𝕩 used only as arguments (that is, there are no assignments and 𝕨 and 𝕩 are not used in operands or lists. Functions with assignments may have too many variables active at once to be directly translated but can be emulated by constructing lists. But it's probably a bad idea). Without trains it isn't possible to have two different functions that each use both arguments to a dyadic function. With trains it's perfectly natural.

    BQN's trains are the same as those of Dyalog APL, except that Dyalog is missing the minor convenience of BQN's Nothing (·). There are many Dyalog-based documents and videos on trains you can view on the APL Wiki.

    -

    2-train, 3-train

    +

    2-train, 3-train

    Trains are an adaptation of the mathematical convention that, for example, two functions F and G can be added to get a new function F+G that applies as (F+G)(x) = F(x)+G(x). With a little change to the syntax, we can do exactly this in BQN:

    ↗️
        (⊢+⌽) 5
     ⟨ 4 4 4 4 4 ⟩
    @@ -25,7 +25,7 @@
     "fcdeab"
     

    The three functions ∾⌽, ·∾⌽, and are completely identical: Join of Reverse. Why might we want three different ways to write the same thing? If we only want to define a function, there's hardly any difference. However, these three forms have different syntax, and might be easier or harder to use in different contexts. As we'll see, we can use inside a train without parenthesizing it, and string ·∾⌽ but not ∾⌽ together with other trains. Let's look at how the train syntax extends to longer expressions.

    -

    Longer trains

    +

    Longer trains

    Function application in trains, as in other contexts, shares the lowest precedence level with assignment. Modifiers and strands (with ) have higher precedence, so they are applied before forming any trains. Once this is done, an expression is a subject expression if it ends with a subject and a function expression if it ends with a function (there are also modifier expressions, which aren't relevant here). A train is any function expression with multiple functions or subjects in it: while we've seen examples with two or three functions, any number are allowed.

    Subject expressions are the domain of "old-school" APL, and just apply one function after another to a subject, possibly assigning some of the results (that's the top-level picture—anything can still happen within parentheses). Subjects other than the first appear only as left arguments to functions, which means that two subjects can't appear next to each other because the one on the left would have no corresponding function. Here's an example from the compiler (at one point), with functions and assignments numbered in the order they are applied and their arguments marked with «», and a fully-parenthesized version shown below.

    cnpilt/𝕩civi+nv
    @@ -40,7 +40,7 @@
     ⊢>(¯1»⌈`)
     

    In a train, arguments alternate strictly with combining functions between them. Arguments can be either functions or subjects, except for the rightmost one, which has to be a function to indicate that the expression is a train. Trains tend to be shorter than subject expressions partly because to keep track of this alternation in a train of all functions, you need to know where each function is relative to the end of the train (subjects like the ¯1 above only occur as left arguments, so they can also serve as anchors).

    -

    Practice training

    +

    Practice training

    The train ⊢>¯1»⌈` is actually a nice trick to get the result of Mark Firsts 𝕩 given the result of Classify 𝕩, without doing another search. Let's take a closer look, first by applying it mechanically. To do this, we apply each "argument" to the train's argument, and then combine them with the combining functions.

    ( > ¯1 » `) 𝕩
     (𝕩) > (¯1) » (`𝕩)
    @@ -70,7 +70,7 @@
         (⊢>¯1»⌈`) sc
     ⟨ 1 1 1 1 0 0 1 0 0 1 1 ⟩
     
    -

    Composing trains

    +

    Composing trains

    The example above uses a train with five functions: an odd number. Trains with an odd length are always composed of length-3 trains, and they themselves are composed the same way as subject expressions: an odd-length train can be placed in the last position of another train without parentheses, but it needs parentheses to go in any other position.

    But we also saw the length-2 train ∾⌽ above. Even-length trains consist of a single function () applied to a function or odd-length train (); another perspective is that an even-length train is an odd-length train where the left argument of the final (leftmost) function is left out, so it's called with only a right argument. An even-length train always needs parentheses if it's used as one of the functions in another train. However, it can also be turned into an odd-length train by placing · at the left, making the implicit missing argument explicit. After this it can be used at the end of an odd-length train without parentheses. To get some intuition for even-length trains, let's look at an example of three functions used together: the unique () sorted () absolute values (|) of an argument list.

    ↗️
        ⍷∧| 34¯3¯20
    diff --git a/docs/doc/transpose.html b/docs/doc/transpose.html
    index 42d3dd71..b2319106 100644
    --- a/docs/doc/transpose.html
    +++ b/docs/doc/transpose.html
    @@ -4,9 +4,9 @@
       BQN: Transpose
     
     
    -

    Transpose

    +

    Transpose

    Transpose () is a tool for rearranging the axes of an array. BQN's version is tweaked relative to APL to align better with the leading axis model and make common operations easier.

    -

    Transpose basics

    +

    Transpose basics

    The name for the primitive comes from the Transpose operation on matrices. Given a matrix as an array of rank 2, will transpose it:

    ↗️
         mat  23  6
     ┌─       
    @@ -27,7 +27,7 @@
     3
     

    With two axes the only interesting operation of this sort is to swap them (and with one or zero axes there's nothing interesting to do, and just returns the argument array). But a BQN programmer may well want to work with higher-rank arrays—although such a programmer might call them "tensors"—and this means there are many more ways to rearrange the axes. Transpose extends to high-rank arrays to allow some useful special cases as well as completely general axis rearrangement, as described below.

    -

    Monadic Transpose

    +

    Monadic Transpose

    APL extends matrix transposition to any rank by reversing all axes for its monadic , but this generalization isn't very natural and is almost never used. The main reason for it is to maintain the equivalence a MP b ←→ b MP a, where MP +˝×1 is the generalized matrix product. But even here APL's Transpose is suspect. It does much more work than it needs to, as we'll see.

    BQN's transpose takes the first axis of 𝕩 and moves it to the end.

    ↗️
         a23456  23456
    @@ -88,7 +88,7 @@
     ⟨ 3 4 2 5 6 ⟩
     

    In a case like this BQN's Dyadic transpose is much easier.

    -

    Dyadic Transpose

    +

    Dyadic Transpose

    Transpose also allows a left argument that specifies a permutation of 𝕩's axes. For each index pi𝕨 in the left argument, axis i of 𝕩 is used for axis p of the result. Multiple argument axes can be sent to the same result axis, in which case that axis goes along a diagonal of 𝕩, and the result will have a lower rank than 𝕩.

    ↗️
         13204  a23456
     ⟨ 5 2 4 3 6 ⟩
    @@ -109,7 +109,7 @@
     ⟨ 3 4 2 5 6 ⟩
     

    Finally, it's worth noting that, as monadic Transpose moves the first axis to the end, it's equivalent to dyadic Transpose with a "default" left argument: (=-1˙).

    -

    Definitions

    +

    Definitions

    Here we define the two valences of Transpose more precisely.

    An atom right argument to either valence of Transpose is always enclosed to get an array before doing anything else.

    Monadic transpose is identical to (=-1˙), except that if 𝕩 is a unit it is returned unchanged (after enclosing, if it's an atom) rather than giving an error.

    diff --git a/docs/doc/types.html b/docs/doc/types.html index 1fffb7d1..075a0878 100644 --- a/docs/doc/types.html +++ b/docs/doc/types.html @@ -4,7 +4,7 @@ BQN: Types -

    Types

    +

    Types

    BQN supports the following fundamental types:

    • Number
    • @@ -44,11 +44,11 @@

      The reason operations and namespaces are called "mutable" is that the values obtained from them—by calling an operation on particular arguments or reading a field from a namespace—may change over the course of the program. This property is caused by variable modification , which can directly change a namespace field, or change the behavior of an operation that uses the modified variable. This means that a program that doesn't include won't have such changes in behavior. However, there will still be an observable difference between immutable data and the mutable types: code that creates a mutable value (for example, a block function {𝕩}) creates a different one each time, so that two different instances don't match () each other. Data values created at different times may match, but mutable values never will.

      An array is considered immutable because its shape, and what elements it contains, cannot change. An array has no identity outside these properties (and possibly its fill element), so an array with a different shape or different elements would simply be a different array. However, any element of an array could be mutable, in which case the behavior of the array would change with respect to the operation of selecting that element and calling it or accessing a field.

      -

      Data types

      +

      Data types

      Data types—numbers, characters, and arrays—are more like "things" than "actions". If called as a function, a value of one of these types simply returns itself. Data can be uniquely represented, compared for equality, and ordered using BQN's array ordering; in contrast, determining whether two functions always return the same result can be undecidable. For arrays, these properties apply only if there are no operations inside. We might say that "data" in BQN refers to numbers, characters, and arrays of data.

      -

      Numbers

      +

      Numbers

      The BQN spec allows for different numeric models to be used, but requires there to be only one numeric type from the programmer's perspective: while programs can often be executed faster by using limited-range integer types, there is no need to expose these details to the programmer. Existing BQN implementations are based on double-precision floats, like Javascript or Lua.

      -

      Characters

      +

      Characters

      A character in BQN is always a Unicode code point. BQN does not use encodings such as UTF-8 or UTF-16 for characters, although it would be possible to store arrays of integers or characters that correspond to data in these encodings. Because every code point corresponds to a single unit in UTF-32, BQN characters can be thought of as UTF-32 encoded.

      Addition and subtraction treat characters as an affine space relative to the linear space of numbers. This means that:

        @@ -56,18 +56,18 @@
      • Two characters can be subtracted to get the distance between them—a number.

      Other linear combinations such as adding two characters or negating a character are not allowed. You can check whether an application of + or - on numbers and characters is allowed by applying the same function to the "characterness" of each value: 0 for a number and 1 for a character. The result will be a number if this application gives 0 and a character if this gives 1, and otherwise the operation is not allowed.

      -

      Arrays

      +

      Arrays

      Full documentation here.

      A BQN array is a multidimensional arrangement of data. This means it has a certain shape, which is a finite list of natural numbers giving the length along each axis, and it contains an element for each possible index, which is a choice of one natural number that's less than each axis length in the shape. The total number of elements, or bound, is then the product of all the lengths in the shape. The shape may have any length including zero, and this shape is known as the array's rank. An array of rank 0, which always contains exactly one element, is called a unit, while an array of rank 1 is called a list and an array of rank 2 is called a table.

      Each array—empty or nonempty—has an inferred property called a fill. The fill either indicates what element should be used to pad an array, or that such an element is not known and an error should result. Fills can be used by Take (), the two Nudge functions (»«), Merge (>), and Reshape ().

      Arrays are value types (or immutable), so that there is no way to "change" the shape or elements of an array. An array with different properties is a different array. As a consequence, arrays are an inductive type, and it's not possible for an array to contain itself, or contain an array that contains itself, and so on. However, it is possible for an array to contain a function or other operation that has access to the array through a variable, and in this sense an array can "know about" itself.

      Different elements of an array should not influence each other. While some APLs force numbers placed in the same array to a common representation, which may have different precision properties, BQN values must not change behavior when placed in an array. However, this doesn't preclude changing the storage type of an array for better performance: for example, in a BQN implementation using 64-bit floats, an array whose elements are all integers that fit in 32-bit int range might be represented as an array of 32-bit ints.

      -

      Operation types

      +

      Operation types

      An operation is either a function or modifier, and can be applied to inputs—which are called arguments for functions and operands for modifiers—to obtain a result. During this application an operation might also change variables within its scope and call other operations, or cause an error, in which case it doesn't return a result. There is one type of call for each of the three operation types, and an operation will give an error if it is called in a way that doesn't match its type.

      In BQN syntax the result of a function has a subject role and the result of a modifier has a function role. However, the result can be any value at all: roles take place at the syntactic level, which has no bearing on types and values in the semantic level. This distinction is discussed further in Mixing roles.

      -

      Functions

      +

      Functions

      A function is called with one or two arguments. A data value (number, character, or array) can also be called the same way, but only a function takes any action when passed arguments, as data just returns itself. Both the one-argument and two-argument calls are considered function calls, and it's common for a function to allow both. A function that always errors in one case or the other might be called a one-argument or two-argument function, depending on which case is allowed.

      -

      Modifiers

      +

      Modifiers

      A 1-modifier is called with one operand, while a 2-modifier is called with two. In contrast to functions, these are distinct types, and it is impossible to have a value that can be called with either one or two operands. Also in contrast to functions, data values cannot be called as modifiers: they will cause an error if called this way.

      -

      Namespaces

      +

      Namespaces

      Functions and modifiers have internal scopes which they can manipulate (by defining and modifying variables) to save and update information. Namespaces let the programmer to expose this state more directly: identifiers in a namespace may be exported, allowing code outside the namespace to read their values. They are described in detail here.

      diff --git a/docs/doc/undo.html b/docs/doc/undo.html index dc3f9d37..5e7d6276 100644 --- a/docs/doc/undo.html +++ b/docs/doc/undo.html @@ -4,7 +4,7 @@ BQN: Undo -

      Undo

      +

      Undo

      Oh no, you've deleted a function after spending half an hour writing it! Well… hopefully your editor can help with that. But maybe you'll be interested to hear that BQN can reverse the action of some functions—ones that don't lose information. This capability is also used by Repeat () to repeat a negative number of times.

      ↗️
          2  "abcde"
       "cdeab"
      @@ -20,7 +20,7 @@
       "BQN"
       

      Here it undoes a function to decrement the last character by incrementing that character. In part this is enabled by the clean design of BQN primitives, because better-behaved functions like those using structural Under are easier to invert.

      -

      The rules

      +

      The rules

      If 𝔽 can be inverted exactly, then Undo just does that. However, there are also some other functions that BQN inverts. For example, the squaring function ט has both a positive and a negative inverse, and yet:

      ↗️
          ט ¯3
       9
      @@ -32,7 +32,7 @@
       ↗️
          6 - 6
       8.881784197001252e¯16
       
      -

      What's supported?

      +

      What's supported?

      For the full list, see the specification. An individual implementation might support a lot more functionality than is required, so if you're not concerned about portability just try out whatever function you're interested in.

      Arithmetic and simple combinators are usually invertible. A compound function that refers to its argument just once, like 6+⌽, can typically be undone, but one that uses the argument in two different ways, such as ⊢+⋆, probably can't.

      A few notable inverses are the logarithm , un-Transpose , and Indices inverse /. Enclose inverse, <, is an alternative to First that requires its argument to be a unit.

      @@ -42,7 +42,7 @@ ERROR 3 3 3
      -

      Undo headers

      +

      Undo headers

      Undo headers are currently supported only by dzaima/BQN.

      Of course BQN will never be able to invert all the functions you could write (if it could you could earn a lot of bitcoins, among other feats). But it does recognize some header forms that you can use to specify the inverse of a block function. BQN will trust you and won't verify the results your specified inverse gives.

      {
      diff --git a/docs/doc/windows.html b/docs/doc/windows.html
      index bc2aa263..77c40f19 100644
      --- a/docs/doc/windows.html
      +++ b/docs/doc/windows.html
      @@ -4,10 +4,10 @@
         BQN: Windows
       
       
      -

      Windows

      +

      Windows

      In BQN, it's strongly preferred to use functions, and not modifiers, for array manipulation. Functions are simpler as they have fewer moving parts. They are more concrete, since the array results can always be viewed right away. They are easier to implement with reasonable performance as well, since there is no need to recognize many possible function operands as special cases.

      The Window function replaces APL's Windowed Reduction, J's more general Infix operator, and Dyalog's Stencil, which is adapted from one case of J's Cut operator.

      -

      Definition

      +

      Definition

      We'll start with the one-axis case. Here Window's left argument is a number between 0 and 1+≠𝕩. The result is composed of slices of 𝕩 (contiguous sections of major cells) with length 𝕨, starting at each possible index in order.

      ↗️
          5"abcdefg"
       ┌─       
      @@ -24,7 +24,7 @@
       "cdefg"
       

      Windows differs from Prefixes and Suffixes in that it doesn't add a layer of nesting (it doesn't enclose each slice). This is possible because the slices have a fixed size.

      -

      Multiple dimensions

      +

      Multiple dimensions

      The above description applies to a higher-rank right argument. As an example, we'll look at two-row slices of a shape 34 array. For convenience, we will enclose each slice. Note that slices always have the same rank as the argument array.

      ↗️
          <2 2"0123""abcd""ABCD"
       ┌─                   
      @@ -49,10 +49,10 @@
       

      The slices are naturally arranged along multiple dimensions according to their starting index. Once again the equivalence ilx ←→ lix holds, provided i and l have the same length.

      If 𝕨 has length 0, then 𝕩 is not sliced along any dimensions. The only slice that results—the entire argument—is then arranged along an additional zero dimensions. In the end, the result is 𝕩, unchanged.

      -

      More formally

      +

      More formally

      𝕩 is an array. 𝕨 is a number, or numeric list or unit, with 𝕨≠≢𝕩. The result z has shape 𝕨∾¬𝕨((𝕨))𝕩, and element iz is 𝕩˜(𝕨)(↑+((𝕨)))i.

      Using Group we could also write iz ←→ 𝕩˜(𝕨()𝕩) +´¨ i.

      -

      Symmetry

      +

      Symmetry

      Let's look at an earlier example, along with its Transpose ().

      ↗️
          {𝕩,𝕩}5"abcdefg"
       ┌─                   
      @@ -77,7 +77,7 @@
       ↗️
          {((56¬22)𝕩)  23(22𝕩)} 567
       1
       
      -

      Applications

      +

      Applications

      Windows can be followed up with a reduction on each slice to give a windowed reduction. Here we take running sums of 3 values.

      ↗️
          +˝˘3 2,6,0,1,4,3
       ⟨ 8 7 5 8 ⟩
      -- 
      cgit v1.2.3