Documentation about arrays (bland)

author: Marshall Lochbaum <mwlochbaum@gmail.com> 2021-07-25 22:22:19 -0400
committer: Marshall Lochbaum <mwlochbaum@gmail.com> 2021-07-25 22:28:16 -0400
commit: 8533da8eaab9712485ed838dcb2eff90cd8fc5b2 (patch)
tree: 5965602a7685281e73b6c571203efe3449d20694 /docs
parent: a17782ce2ec31709ce30edb3d96fe2f3a9a6ed1f (diff)
5 files changed, 56 insertions, 2 deletions
diff --git a/docs/doc/array.html b/docs/doc/array.html
new file mode 100644
index 00000000..752dcd39
--- /dev/null
+++ b/docs/doc/array.html
@@ -0,0 +1,52 @@
+<head>
+  <link href="../favicon.ico" rel="shortcut icon" type="image/x-icon"/>
+  <link href="../style.css" rel="stylesheet"/>
+  <title>BQN: The array</title>
+</head>
+<div class="nav"><a href="https://github.com/mlochbaum/BQN">BQN</a> / <a href="../index.html">main</a> / <a href="index.html">doc</a></div>
+<h1 id="the-array">The array</h1>
+<p>As BQN is an array language, it's often helpful to understand what an array is when writing BQN programs. Fully describing the concept is sometimes <a href="https://www.jsoftware.com/papers/array.htm">held to be tricky</a>; here we'll see definitions, examples, and metaphors.</p>
+<p>In BQN, as in APL, arrays are multidimensional, instead of strictly linear. Languages like Python, Javascript, or Haskell offer only one-dimensional arrays with <code><span class='Value'>[]</span></code> syntax, and typically represent multidimensional data with nested arrays. Multidimensional arrays have fundamental differences relative to this model.</p>
+<p>BQN's arrays are immutable, meaning that an array is entirely defined by its attributes, and there is no way to modify an existing array, only to produce another array that has changes relative to it. As a result, an array can never contain itself, and arrays form an inductive type. BQN's <a href="lexical.html#mutation">mutable</a> types are operations and namespaces.</p>
+<p>An array might also have a <a href="fill.html">fill element</a> that captures some structural information about its elements and is used by a few operations. The fill, as an inferred property, isn't considered to truly be part of the array but is instead some information about the array that the interpreter keeps track of. So it's out of scope here.</p>
+<h2 id="rectangles">Rectangles</h2>
+<p>A BQN <strong>array</strong> is a multidimensional arrangement of data. The word &quot;array&quot; descends from words meaning &quot;order&quot;, and the data in an array is ordered indeed. Below are examples of arrays with zero, one, and two dimensions.</p>
+<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=PDUKCuKfqDMsJ3gnLDHin6kKCjLigL8z4oC/NCDDl+KMnCAx4oC/NeKAvzjigL8xMQ==">↗️</a><pre>    <span class='Function'>&lt;</span><span class='Number'>5</span>
+┌·   
+· 5  
+    ┘
+
+    <span class='Bracket'>⟨</span><span class='Number'>3</span><span class='Separator'>,</span><span class='String'>'x'</span><span class='Separator'>,</span><span class='Number'>1</span><span class='Bracket'>⟩</span>
+⟨ 3 'x' 1 ⟩
+
+    <span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>4</span> <span class='Function'>×</span><span class='Modifier'>⌜</span> <span class='Number'>1</span><span class='Ligature'>‿</span><span class='Number'>5</span><span class='Ligature'>‿</span><span class='Number'>8</span><span class='Ligature'>‿</span><span class='Number'>11</span>
+┌─            
+╵ 2 10 16 22  
+  3 15 24 33  
+  4 20 32 44  
+             ┘
+</pre>
+<p>Each dimension, or <strong>axis</strong>, has some finite number of positions, with an <strong>element</strong> at every <em>combination</em> of positions. For example, if a group of friends shop at several different stores, the amount they spend in a week could be placed in a two-dimensional array, with people along one axis and stores along another. An element of the array would indicate how much one person spent at one store, so that summing across stores gives each person's expenditures and summing across people gives each store's income.</p>
+<p>The axes of an array must be independent, that is, the positions present in one axis are fixed for the entire array and don't depend on other axes. This is a difference relative to a nested list model. When storing data in nested lists, the outer axis comes first and later axes are subordinate to it. The length of the second axis depends completely on the position in the first. A programmer might choose the lengths so it doesn't in a particular case, but in a BQN array differing lengths simply aren't representable.</p>
+<p>The array also needs to be complete. Every element—every combination of positions—must have a value. This value could be a placeholder like <code><span class='String'>@</span></code>, but it has to be <em>something</em> (in the spending example, everyone spends some amount at each store, even if it's zero). And of course, there are no extra elements that don't fit into the positioning system—the <a href="fill.html">fill</a> isn't really part of the array, but extra information about it.</p>
+<h2 id="ordering-and-indices">Ordering and indices</h2>
+<p>To finish this definition of an array we also need to nail down the idea of a position. The positions along one dimension can't be labelled in any way, but they have a linear ordering (mathematically speaking, a <a href="https://en.wikipedia.org/wiki/Total_order">total order</a>: out of any two different positions one comes earlier and the other later). BQN keeps track of this order: for example, when we <a href="join.html">join</a> two arrays it places positions in <code><span class='Value'>𝕨</span></code> before those of <code><span class='Value'>𝕩</span></code> and otherwise maintains the original ordering.</p>
+<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=ImJlZm9yZSIg4oi+ICJhZnRlciI=">↗️</a><pre>    <span class='String'>&quot;before&quot;</span> <span class='Function'>∾</span> <span class='String'>&quot;after&quot;</span>
+"beforeafter"
+</pre>
+<p>It's only the ordering that allows positions to be distinguished. BQN labels them with natural numbers called <strong>indices</strong> that can be derived from the order: the earliest position is called <code><span class='Number'>0</span></code>, the next <code><span class='Number'>1</span></code>, and so on. The axes of an array are also ordered, and they're indexed starting at <code><span class='Number'>0</span></code> as well.</p>
+<p>These kinds of index are one-dimensional, but there's also a multidimensional kind of array <a href="indices.html">index</a>, that identifies an element. An element index consists of one index along each axis. Because the axis are ordered, it can be represented as a list <code><span class='Value'>l</span></code> of numbers, where <code><span class='Value'>i</span><span class='Function'>⊑</span><span class='Value'>l</span></code> is the index along axis <code><span class='Value'>i</span></code>. It's important to distinguish an element from its value: for example, there's only one value (<code><span class='Number'>3</span></code>) contained in the array <code><span class='Bracket'>⟨</span><span class='Number'>3</span><span class='Separator'>,</span><span class='Number'>3</span><span class='Separator'>,</span><span class='Number'>3</span><span class='Bracket'>⟩</span></code>, but it still has three elements, identified by indices <code><span class='Bracket'>⟨</span><span class='Number'>0</span><span class='Bracket'>⟩</span></code>, <code><span class='Bracket'>⟨</span><span class='Number'>1</span><span class='Bracket'>⟩</span></code>, and <code><span class='Bracket'>⟨</span><span class='Number'>2</span><span class='Bracket'>⟩</span></code>.</p>
+<h2 id="dimensions">Dimensions</h2>
+<p>The number of axes in an array is called its <strong>rank</strong>. The number of positions along an axis is called its <strong>length</strong>, and the length of an array means its length along the first axis, or <code><span class='Number'>1</span></code> if there are no axes. The list of the length along each axis is the array's <strong>shape</strong>, and describes the possible element locations completely. In BQN they're exposed as the <a href="shape.html">functions</a> Rank (<code><span class='Function'>=</span></code>), Length (<code><span class='Function'>≠</span></code>), and Shape (<code><span class='Function'>≢</span></code>).</p>
+<p>The total number of elements in an array is its <strong>bound</strong>, and can be found using <a href="reshape.html">Deshape</a> with <code><span class='Function'>≠</span><span class='Modifier2'>∘</span><span class='Function'>⥊</span></code>, is then the product of all the lengths in the shape. An array of rank 0, which always contains exactly one element, is called a <a href="enclose.html#whats-a-unit"><strong>unit</strong></a>, while an array of rank 1 is called a <strong>list</strong> and an array of rank 2 is called a <strong>table</strong>.</p>
+<h2 id="elements">Elements</h2>
+<p>Any BQN value can be used as an array element, including another array (BQN, as a dynamically-typed language, doesn't restrict the types that can be used in one context without a good reason). However, BQN arrays are restricted relative to another array model. Frameworks like NumPy or Julia have mutable arrays, so that the value of an element can be changed after the array is created. This allows an array to be its own element, by creating an array and then inserting it into itself. This would be unnatural in BQN, where an array can only be formed from elements that already exist. In BQN only operations and namespaces are <a href="lexical.html#mutation">mutable</a>.</p>
+<h2 id="properties">Properties</h2>
+<p>Summarizing, the values needed to define an array are its rank (the number of axes), its shape (the number of positions along each axis), and the value of each element (that is, at each combination of positions). Two arrays <a href="match.html">match</a> when all these values match.</p>
+<p>If the rank is considered to be part of the shape, as it is when the shape is a BQN list, then the array is defined by its shape and element list—from <a href="reshape.html">deshape</a>.</p>
+<p>Here's a somewhat informal mathematical take. Given a set of possible element values <code><span class='Function'>T</span></code>, a <em>list</em> of <code><span class='Function'>T</span></code> of length <code><span class='Value'>l</span></code> is a map from natural numbers less than <code><span class='Value'>l</span></code> to <code><span class='Function'>T</span></code>. An array is a rank <code><span class='Value'>r</span></code>, along with a list <code><span class='Value'>s</span></code> of natural numbers of length <code><span class='Value'>r</span></code>, and a map from lists of natural numbers <code><span class='Value'>i</span></code> that satisfy <code><span class='Value'>i</span><span class='Paren'>(</span><span class='Value'>j</span><span class='Paren'>)</span> <span class='Function'>&lt;</span> <span class='Value'>s</span><span class='Paren'>(</span><span class='Value'>j</span><span class='Paren'>)</span></code> for all natural numbers <code><span class='Value'>j</span><span class='Function'>&lt;</span><span class='Value'>r</span></code> to BQN values. Arrays are an inductive type, so that an array can only be defined using elements that already exist. As a result an array's elements are always values of lesser complexity and selecting one element of an array, then an element of that element, and so on, must eventually reach a non-array.</p>
+<h2 id="why-arrays">Why arrays?</h2>
+<p>The multidimensional array is a fairly simple structure, but there are simpler ones like pairs, lists, sets, and dictionaries. Why does BQN choose the array for its central type? I don't think arrays are always the best data structure (or that BQN is always the best language), but I do think they're one of several good choices and have unique advantages.</p>
+<p>Arrays offer a lot of flexibility since they generalize lists. This also means that they can be used to represent pairs or sets. Two lists, or an array with a length-2 axis, can represent a map, although it could be hard to use with good performance.</p>
+<p>But arrays are less flexible than <em>nested</em> lists (which in turn are less flexible than nested arrays). This is also in some sense a strength. The axes of an array are inherently independent. Lots of things in real life are independent! Regardless of which main you choose in your Cook Out tray you have the same options for sides. A term in a multivariate polynomial can have any power of <code><span class='Value'>x</span></code> and any power of <code><span class='Value'>y</span></code>. An array language lets you encode this independence in your data, and use operations that take advantage of it.</p>
+<p>The rigidity of arrays is also great for performance. Nested lists might have a complicated structure in memory. An array can always be packed flat: the shape and the elements. This strided representation makes branchless and cache-friendly primitive algorithms much easier.</p>
diff --git a/docs/doc/index.html b/docs/doc/index.html
index e7ab568f..1c68347a 100644
--- a/docs/doc/index.html
+++ b/docs/doc/index.html
@@ -22,6 +22,7 @@
 <p>Concepts:</p>
 <ul>
 <li><a href="context.html">Context-free grammar</a></li>
+<li><a href="array.html">Arrays</a></li>
 <li><a href="based.html">Based array theory</a></li>
 <li><a href="arrayrepr.html">Array notation and display</a></li>
 <li><a href="indices.html">Array indices</a></li>
diff --git a/docs/doc/indices.html b/docs/doc/indices.html
index 7f342cca..5a0836d9 100644
--- a/docs/doc/indices.html
+++ b/docs/doc/indices.html
@@ -5,7 +5,7 @@
 </head>
 <div class="nav"><a href="https://github.com/mlochbaum/BQN">BQN</a> / <a href="../index.html">main</a> / <a href="index.html">doc</a></div>
 <h1 id="indices">Indices</h1>
-<p>One-dimensional arrays such as K lists or Python arrays have only one kind of index, a single number that refers to an element. For multidimensional arrays using the <a href="leading.html">leading axis theory</a>, there are several types of indexing that can be useful. Historically, nested APL designs have equivocated between these, which I believe can lead to subtle errors when programming. BQN focuses on single-number (atomic) indices, which can refer to list elements or array major cells (or more generally indexing along any particular axis). When using atomic indices to select elements, the indexed array has to be a list. In contrast, elements of any array can be indicated by list indices with length equal to that array's rank. Only two BQN primitives use these list indices: <a href="range.html">Range</a> (<code><span class='Function'>↕</span></code>), which returns an array of them if given a list argument, and <a href="pick.html">Pick</a> (<code><span class='Function'>⊑</span></code>), where the depth-1 components of an array left argument are list indices.</p>
+<p>One-dimensional arrays such as K lists or Python arrays have only one kind of index, a single number that refers to an element. For <a href="array.html">multidimensional arrays</a> using the <a href="leading.html">leading axis theory</a>, there are several types of indexing that can be useful. Historically, nested APL designs have equivocated between these, which I believe can lead to subtle errors when programming. BQN focuses on single-number (atomic) indices, which can refer to list elements or array major cells (or more generally indexing along any particular axis). When using atomic indices to select elements, the indexed array has to be a list. In contrast, elements of any array can be indicated by list indices with length equal to that array's rank. Only two BQN primitives use these list indices: <a href="range.html">Range</a> (<code><span class='Function'>↕</span></code>), which returns an array of them if given a list argument, and <a href="pick.html">Pick</a> (<code><span class='Function'>⊑</span></code>), where the depth-1 components of an array left argument are list indices.</p>
 <p>The following functions take or return indices. Except where marked, the indices are in the result; this is by far the most common type of index use. <a href="group.html">Group</a> (<code><span class='Function'>⊔</span></code>) is given two rows as it falls into both cases. Note that in the result case, there is usually no possibility for the programmer to select the format of indices. Instead, the language should be carefully designed to make sure that the kind of index returned is as useful as possible.</p>
 <table>
 <thead>
diff --git a/docs/doc/leading.html b/docs/doc/leading.html
index 11e7a567..51f29e38 100644
--- a/docs/doc/leading.html
+++ b/docs/doc/leading.html
@@ -5,7 +5,7 @@
 </head>
 <div class="nav"><a href="https://github.com/mlochbaum/BQN">BQN</a> / <a href="../index.html">main</a> / <a href="index.html">doc</a></div>
 <h1 id="the-leading-axis-convention">The leading axis convention</h1>
-<p>Several primitive functions manipulate the right argument, or sometimes both arguments, along one or more axes. According to the <a href="https://aplwiki.com/wiki/Leading_axis_theory">leading axis model</a>, it's best to make the primitives operate on initial axes, because the Rank modifier then allows it to apply to later axes as well. Here we'll see how this pattern works in BQN.</p>
+<p>Several primitive functions manipulate the right argument, or sometimes both arguments, of an <a href="array.html">array</a> along one or more axes. According to the <a href="https://aplwiki.com/wiki/Leading_axis_theory">leading axis model</a>, it's best to make the primitives operate on initial axes, because the Rank modifier then allows it to apply to later axes as well. Here we'll see how this pattern works in BQN.</p>
 <h2 id="monadic-functions">Monadic functions</h2>
 <h3 id="manipulating-cells">Manipulating cells</h3>
 <p>Most non-arithmetic monadic functions work only on the first axis of the argument—that is, they treat it as a list of its major cells. The function <a href="shape.html">Length</a> (<code><span class='Function'>≠</span></code>) counts these major cells, while <a href="prefixes.html">Prefixes</a> (<code><span class='Function'>↑</span></code>), Suffixes (<code><span class='Function'>↓</span></code>), <a href="reverse.html">Reverse</a> (<code><span class='Function'>⌽</span></code>), and <a href="select.html">First Cell</a> (<code><span class='Function'>⊏</span></code>) move them around. The <a href="fold.html#insert">Insert</a> (<code><span class='Modifier'>˝</span></code>) and <a href="scan.html">Scan</a> (<code><span class='Modifier'>`</span></code>) modifiers also yield functions that work along the first axis; in contrast, <a href="fold.html">Fold</a> (<code><span class='Modifier'>´</span></code>) requires <code><span class='Value'>𝕩</span></code> to be a list, as it works on elements.</p>
diff --git a/docs/doc/types.html b/docs/doc/types.html
index cef41290..46356766 100644
--- a/docs/doc/types.html
+++ b/docs/doc/types.html
@@ -57,6 +57,7 @@
 </ul>
 <p>Other linear combinations such as adding two characters or negating a character are not allowed. You can check whether an application of <code><span class='Function'>+</span></code> or <code><span class='Function'>-</span></code> on numbers and characters is allowed by applying the same function to the &quot;characterness&quot; of each value: <code><span class='Number'>0</span></code> for a number and <code><span class='Number'>1</span></code> for a character. The result will be a number if this application gives <code><span class='Number'>0</span></code> and a character if this gives <code><span class='Number'>1</span></code>, and otherwise the operation is not allowed.</p>
 <h3 id="arrays">Arrays</h3>
+<p><em>Full documentation <a href="array.html">here</a>.</em></p>
 <p>A BQN array is a multidimensional arrangement of data. This means it has a certain <a href="shape.html"><em>shape</em></a>, which is a finite list of natural numbers giving the length along each axis, and it contains an <em>element</em> for each possible <a href="indices.html"><em>index</em></a>, which is a choice of one natural number that's less than each axis length in the shape. The total number of elements, or <em>bound</em>, is then the product of all the lengths in the shape. The shape may have any length including zero, and this shape is known as the array's <em>rank</em>. An array of rank 0, which always contains exactly one element, is called a <em>unit</em>, while an array of rank 1 is called a <em>list</em> and an array of rank 2 is called a <em>table</em>.</p>
 <p>Each array—empty or nonempty—has an inferred property called a <em>fill</em>. The fill either indicates what element should be used to pad an array, or that such an element is not known and an error should result. Fills can be used by <a href="take.html">Take</a> (<code><span class='Function'>↑</span></code>), the two <a href="shift.html">Nudge</a> functions (<code><span class='Function'>»«</span></code>), <a href="pick.html">First</a> (<code><span class='Function'>⊑</span></code>), and <a href="reshape.html">Reshape</a> (<code><span class='Function'>⥊</span></code>).</p>
 <p>Arrays are value types (or immutable), so that there is no way to &quot;change&quot; the shape or elements of an array. An array with different properties is a different array. As a consequence, arrays are an inductive type, and it's not possible for an array to contain itself, or contain an array that contains itself, and so on. However, it is possible for an array to contain a function or other operation that has access to the array through a variable, and in this sense an array can &quot;know about&quot; itself.</p>
author	Marshall Lochbaum <mwlochbaum@gmail.com>	2021-07-25 22:22:19 -0400
committer	Marshall Lochbaum <mwlochbaum@gmail.com>	2021-07-25 22:28:16 -0400
commit	8533da8eaab9712485ed838dcb2eff90cd8fc5b2 (patch)
tree	5965602a7685281e73b6c571203efe3449d20694 /docs
parent	a17782ce2ec31709ce30edb3d96fe2f3a9a6ed1f (diff)