From 89cbad6fc9abadb36103745ab650ed213f68da88 Mon Sep 17 00:00:00 2001
From: Marshall Lochbaum <mwlochbaum@gmail.com>
Date: Thu, 10 Dec 2020 21:10:07 -0500
Subject: Define Under, with proof that structural Under is well-defined

---
 docs/spec/inferred.html | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

(limited to 'docs')
diff --git a/docs/spec/inferred.html b/docs/spec/inferred.html
index cf8520bd..f2c068af 100644
--- a/docs/spec/inferred.html
+++ b/docs/spec/inferred.html
@@ -8,7 +8,7 @@
 <p>BQN includes some simple deductive capabilities: detecting the type of empty array elements, and the Undo (<code><span class='Modifier'>⁼</span></code>) and Under (<code><span class='Modifier2'>⌾</span></code>) modifiers. These tasks are a kind of proof-based or constraint programming, and can never be solved completely (some instances will be undecidable) but can be solved in more instances by ever-more sophisticated algorithms. To allow implementers to develop more advanced implementations while offering some stability and portability to programmers, two kinds of specification are given here. First, constraints are given on the behavior of inferred properties. These are not exact and require some judgment on the part of the implementer. Second, behavior for common or useful cases is specified more precisely. Non-normative suggestions are also given as a reference for implementers.</p>
 <p>For the specified cases, the given functions and modifiers refer to those particular representations. It is not necessary to detect equivalent representations, for example to reduce <code><span class='Paren'>(</span><span class='Function'>+-×</span><span class='Paren'>)</span><span class='Modifier'>⁼</span></code> to <code><span class='Function'>∨</span><span class='Modifier'>⁼</span></code>. However, it is necessary to identify computed functions and modifiers: for example <code><span class='Function'>F</span><span class='Modifier'>⁼</span></code> when the value of <code><span class='Function'>F</span></code> in the expression is <code><span class='Function'>∨</span></code>, or <code><span class='Paren'>(</span><span class='Number'>1</span><span class='Function'>⊑∧</span><span class='Ligature'>‿</span><span class='Function'>∨</span><span class='Paren'>)</span><span class='Modifier'>⁼</span></code>.</p>
 <h2 id="undo">Undo</h2>
-<p>The Undo modifier <code><span class='Modifier'>⁼</span></code>, given an operand <code><span class='Function'>𝔽</span></code> and argument <code><span class='Value'>𝕩</span></code>, and possibly a left argument <code><span class='Value'>𝕨</span></code>, finds a value <code><span class='Value'>y</span></code> such that <code><span class='Value'>𝕩</span><span class='Function'>≡</span><span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Value'>y</span></code>, that is, an element of the pre-image of <code><span class='Value'>𝕩</span></code> under <code><span class='Function'>𝔽</span></code> or <code><span class='Value'>𝕨</span><span class='Function'>𝔽⊢</span></code>. Thus it satisfies the constraint <code><span class='Value'>𝕩</span> <span class='Function'>≡</span> <span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Modifier'>⁼</span><span class='Value'>𝕩</span></code> (<code><span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Modifier'>⁼</span><span class='Function'>⊢</span></code> is a <em>right inverse</em> of <code><span class='Value'>𝕨</span><span class='Function'>𝔽⊢</span></code>) provided <code><span class='Function'>𝔽</span><span class='Modifier'>⁼</span></code> and <code><span class='Function'>𝔽</span></code> both complete without error. <code><span class='Function'>𝔽</span><span class='Modifier'>⁼</span></code> should of course give an error if no inverse element exists, and can also fail if no inverse can be found. It is also preferred for <code><span class='Function'>𝔽</span><span class='Modifier'>⁼</span></code> to give an error if there are many choices of inverse with no clear way to choose one of them: for example, <code><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>0</span><span class='Function'>⍉</span><span class='Value'>m</span></code> returns the diagonal of matrix <code><span class='Value'>m</span></code>; <code><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>0</span><span class='Function'>⍉</span><span class='Modifier'>⁼</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>3</span></code> requires values to be chosen for the off-diagonal elements in its result. It is better to give an error, encouraging the programmer to use a fully-specified approach like <code><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>3</span><span class='Modifier2'>⌾</span><span class='Paren'>(</span><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>0</span><span class='Modifier2'>⊸</span><span class='Function'>⍉</span><span class='Paren'>)</span></code> applied to a matrix of initial elements, than to return a result that could be very different from other implementations.</p>
+<p>The Undo 1-modifier <code><span class='Modifier'>⁼</span></code>, given an operand <code><span class='Function'>𝔽</span></code> and argument <code><span class='Value'>𝕩</span></code>, and possibly a left argument <code><span class='Value'>𝕨</span></code>, finds a value <code><span class='Value'>y</span></code> such that <code><span class='Value'>𝕩</span><span class='Function'>≡</span><span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Value'>y</span></code>, that is, an element of the pre-image of <code><span class='Value'>𝕩</span></code> under <code><span class='Function'>𝔽</span></code> or <code><span class='Value'>𝕨</span><span class='Function'>𝔽⊢</span></code>. Thus it satisfies the constraint <code><span class='Value'>𝕩</span> <span class='Function'>≡</span> <span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Modifier'>⁼</span><span class='Value'>𝕩</span></code> (<code><span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Modifier'>⁼</span><span class='Function'>⊢</span></code> is a <em>right inverse</em> of <code><span class='Value'>𝕨</span><span class='Function'>𝔽⊢</span></code>) provided <code><span class='Function'>𝔽</span><span class='Modifier'>⁼</span></code> and <code><span class='Function'>𝔽</span></code> both complete without error. <code><span class='Function'>𝔽</span><span class='Modifier'>⁼</span></code> should of course give an error if no inverse element exists, and can also fail if no inverse can be found. It is also preferred for <code><span class='Function'>𝔽</span><span class='Modifier'>⁼</span></code> to give an error if there are many choices of inverse with no clear way to choose one of them: for example, <code><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>0</span><span class='Function'>⍉</span><span class='Value'>m</span></code> returns the diagonal of matrix <code><span class='Value'>m</span></code>; <code><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>0</span><span class='Function'>⍉</span><span class='Modifier'>⁼</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>3</span></code> requires values to be chosen for the off-diagonal elements in its result. It is better to give an error, encouraging the programmer to use a fully-specified approach like <code><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>3</span><span class='Modifier2'>⌾</span><span class='Paren'>(</span><span class='Number'>0</span><span class='Ligature'>‿</span><span class='Number'>0</span><span class='Modifier2'>⊸</span><span class='Function'>⍉</span><span class='Paren'>)</span></code> applied to a matrix of initial elements, than to return a result that could be very different from other implementations.</p>
 <p>When working with limited-precision numbers, it may be difficult or impossible to exactly invert the operand function. Instead, it is generally acceptable to perform a computation that, if done with unlimited precision, would exactly invert <code><span class='Function'>𝔽</span></code> computed with unlimited precision. This principle is the basis for the numeric inverses specified below. It is also acceptable to find an inverse by numeric methods, provided that the error in the inverse value found relative to an unlimited-precision inverse can be kept close to the inherent error in the implementation's number format.</p>
 <h3 id="required-functions">Required functions</h3>
 <p>Function inverses are given for one or two arguments, with cases where inverse support is not required left blank.</p>
@@ -134,7 +134,7 @@
 </tbody>
 </table>
 <h3 id="optional-functions">Optional functions</h3>
-<p>Several primitives are easily undone, but doing so is not important for BQN programming. These primitives are listed below along with suggested algorithms to undo them. Unlike the implementations above, these functions are not valid in all cases, and the inputs must be validated or the results checked in order to use them.</p>
+<p>Several primitives are easily and uniquely undone, but doing so is not important for BQN programming. These primitives are listed below along with suggested algorithms to undo them. Unlike the implementations above, these functions are not valid in all cases, and the inputs must be validated or the results checked in order to use them.</p>
 <table>
 <thead>
 <tr>
@@ -336,3 +336,28 @@
 </tr>
 </tbody>
 </table>
+<h2 id="under">Under</h2>
+<p>The Under 2-modifier <code><span class='Modifier2'>⌾</span></code> conceptually applies its left operand under the action of its right operand. Setting <code><span class='Value'>z</span><span class='Gets'>←</span><span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Modifier2'>⌾</span><span class='Function'>𝔾</span><span class='Value'>𝕩</span></code>, it satisfies <code><span class='Paren'>(</span><span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Modifier2'>○</span><span class='Function'>𝔾</span><span class='Value'>𝕩</span><span class='Paren'>)</span> <span class='Function'>≡</span> <span class='Function'>𝔾</span><span class='Value'>z</span></code>. We might say that <code><span class='Function'>𝔾</span></code> transforms values to a new domain, and <code><span class='Modifier2'>⌾</span><span class='Function'>𝔾</span></code> lifts actions <code><span class='Function'>𝔽</span></code> performed in this domain to the original domain of values. For example, addition in the logarithmic domain corresponds to multiplication in the linear domain: <code><span class='Function'>+</span><span class='Modifier2'>⌾</span><span class='Paren'>(</span><span class='Function'>⋆</span><span class='Modifier'>⁼</span><span class='Paren'>)</span></code> is <code><span class='Function'>×</span></code> (but less precise if computed in floating point).</p>
+<p>Let <code><span class='Value'>v</span><span class='Gets'>←</span><span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Modifier2'>○</span><span class='Function'>𝔾</span><span class='Value'>𝕩</span></code>, so that <code><span class='Value'>v</span><span class='Function'>≡𝔾</span><span class='Value'>z</span></code>. <code><span class='Value'>v</span></code> is of course well-defined, so the inference step is to find <code><span class='Value'>z</span></code> based on <code><span class='Value'>v</span></code> and possibly the original inputs. We distinguish three cases for Under:</p>
+<ul>
+<li><em>Invertible</em> Under: If <code><span class='Function'>𝔾</span></code> is uniquely invertible on <code><span class='Value'>v</span></code>, that is, <code><span class='Value'>v</span><span class='Function'>≡𝔾</span><span class='Value'>z</span></code> has a unique solution for <code><span class='Value'>z</span></code>, then the result of Under is that solution.</li>
+<li><em>Structural</em> Under: If <code><span class='Function'>𝔾</span></code> is a structural function (to be defined below) and <code><span class='Value'>v</span></code> is compatible with <code><span class='Function'>𝔾</span></code> on <code><span class='Value'>𝕩</span></code>, then the result is obtained by inserting <code><span class='Value'>v</span></code> back into <code><span class='Value'>𝕩</span></code>.</li>
+<li><em>Computational</em> Under: If <code><span class='Function'>𝔾</span></code> is provably not a structural function, then the result is <code><span class='Function'>𝔾</span><span class='Modifier'>⁼</span><span class='Value'>v</span></code> if it is defined.</li>
+</ul>
+<p>When implementing, there is no need to implement invertable Under specially: it can be handled as part of the structural and computation cases.</p>
+<h3 id="mathematical-definition-of-structural-under">Mathematical definition of structural Under</h3>
+<p>In general, structural Under requires information from the original right argument to be computed. Here we will define the <em>structural inverse of</em> structural function <code><span class='Function'>𝔾</span></code> <em>on</em> <code><span class='Value'>v</span></code> <em>into</em> <code><span class='Value'>𝕩</span></code>, where <code><span class='Value'>𝕩</span></code> gives this information. The value <code><span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Modifier2'>⌾</span><span class='Function'>𝔾</span><span class='Value'>𝕩</span></code> is then the structural inverse of <code><span class='Function'>𝔾</span></code> on <code><span class='Value'>𝕨</span><span class='Function'>𝔽</span><span class='Modifier2'>○</span><span class='Function'>𝔾</span><span class='Value'>𝕩</span></code> into <code><span class='Value'>𝕩</span></code>.</p>
+<p>We define a <em>structure</em> to be either the value <code><span class='Nothing'>·</span></code> or an array of structures (substitute <code><span class='Number'>0</span></code> or any other specific value for <code><span class='Nothing'>·</span></code> if you'd like structures to be a subset of BQN arrays; the value is irrelevant). A given structure <code><span class='Value'>s</span></code> is a <em>captures</em> a BQN value or structure <code><span class='Value'>𝕩</span></code> if it is <code><span class='Nothing'>·</span></code>, or if <code><span class='Value'>s</span></code> and <code><span class='Value'>𝕩</span></code> are arrays of the same shape, and each element of <code><span class='Value'>s</span></code> captures the corresponding element of <code><span class='Value'>𝕩</span></code>. Thus a structure shares some or all of the structural information in arrays it captures, but none of the data.</p>
+<p>A <em>structure transformation</em> consists of an initial structure <code><span class='Value'>s</span></code> and a result structure <code><span class='Value'>t</span></code>, as well as a relation between the two: each instance of <code><span class='Nothing'>·</span></code> in <code><span class='Value'>t</span></code> is assigned the location of an instance of <code><span class='Nothing'>·</span></code> in <code><span class='Value'>s</span></code>. If <code><span class='Value'>s</span></code> captures a value <code><span class='Value'>𝕩</span></code>, we say that the structural transformation captures <code><span class='Value'>𝕩</span></code> as well. Given such a value <code><span class='Value'>𝕩</span></code>, the transformation is applied to <code><span class='Value'>𝕩</span></code> by replacing each <code><span class='Nothing'>·</span></code> in <code><span class='Value'>t</span></code> with the corresponding value from <code><span class='Value'>𝕩</span></code>, found by taking the same location in <code><span class='Value'>𝕩</span></code> as the one in <code><span class='Value'>s</span></code> given by the transformation.</p>
+<p>Given a structure transformation <code><span class='Function'>G</span></code> and values <code><span class='Value'>𝕩</span></code> and <code><span class='Value'>v</span></code>, the <em>structural inverse</em> <code><span class='Value'>z</span></code> of <code><span class='Function'>G</span></code> on <code><span class='Value'>v</span></code> into <code><span class='Value'>𝕩</span></code>, if it exists, is the value such that <code><span class='Value'>v</span><span class='Function'>≡G</span> <span class='Value'>z</span></code>, and <code><span class='Value'>𝕩</span> <span class='Function'>≡</span><span class='Modifier2'>○</span><span class='Function'>F</span> <span class='Value'>z</span></code> for every structure transformation <code><span class='Function'>F</span></code> as possible given the previous constraint. If <code><span class='Function'>G</span></code> has initial structure <code><span class='Value'>s</span></code> and final structure <code><span class='Value'>t</span></code>, we know that <code><span class='Value'>s</span></code> captures <code><span class='Value'>𝕩</span></code> and <code><span class='Value'>z</span></code> (it's required in order to apply <code><span class='Function'>G</span></code> at all) while <code><span class='Value'>t</span></code> captures <code><span class='Value'>v</span></code>. For each instance of <code><span class='Nothing'>·</span></code> in <code><span class='Value'>s</span></code>, there are three possibilities:</p>
+<ul>
+<li>No result location in <code><span class='Value'>t</span></code> is assigned this location. This component of <code><span class='Value'>z</span></code> must match <code><span class='Value'>𝕩</span></code>, or <code><span class='Value'>z</span></code> could be improved without breaking any constraints by replacing it.</li>
+<li>Exactly one result location is assigned this location. The requirement <code><span class='Value'>v</span><span class='Function'>≡G</span> <span class='Value'>z</span></code> implies <code><span class='Value'>z</span></code>'s value here is exactly <code><span class='Value'>v</span></code>'s value at that result location.</li>
+<li>More than one result location is assigned this location. Now <code><span class='Value'>z</span></code>'s value there must match <code><span class='Value'>v</span></code>'s value at each of these result leaves. If <code><span class='Value'>v</span></code> has different values at the different leaves, there is no inverse.</li>
+</ul>
+<p>Following this analysis, <code><span class='Value'>z</span></code> can be constructed by replacing each instance of <code><span class='Nothing'>·</span></code> in <code><span class='Value'>s</span></code> with the component of <code><span class='Value'>𝕩</span></code> or <code><span class='Value'>v</span></code> indicated, and it follows that <code><span class='Value'>z</span></code> is well-defined if it exists—and it exists if and only if <code><span class='Value'>t</span></code> captures <code><span class='Value'>v</span></code> and values in <code><span class='Value'>v</span></code> that correspond to the same position in <code><span class='Value'>s</span></code> have the same value.</p>
+<p>A <em>structural function decomposition</em> is a possibly infinite family of structure transformations such that any possible BQN value is captured by at most one of these transformations. It can be applied to any value: if some transformation captures the value, then apply that transformation, and otherwise give an error. A function is a <em>structural function</em> if there is a structural function decomposition that matches it: that is, for any input either both functions give an error or the results match.</p>
+<p>For a structural function <code><span class='Function'>𝔾</span></code>, the <em>structural inverse</em> of <code><span class='Function'>𝔾</span></code> on <code><span class='Value'>v</span></code> into <code><span class='Value'>𝕩</span></code> is the inverse of <code><span class='Function'>G</span></code> on <code><span class='Value'>v</span></code> into <code><span class='Value'>𝕩</span></code>, where <code><span class='Function'>G</span></code> is the structure transformation that captures <code><span class='Value'>𝕩</span></code> from some structural function decomposition <code><span class='Function'>Gd</span></code> matching <code><span class='Function'>𝔾</span></code>. If no decomposition has an initial structural matching <code><span class='Value'>𝕩</span></code> then the structural inverse does not exist.</p>
+<h4 id="well-definedness">Well-definedness</h4>
+<p>In order to show that the structural inverse of a structural function is well-defined, we must show that it does not depend on the choice of structural function decomposition. That is, for a given <code><span class='Value'>𝕩</span></code>, if <code><span class='Function'>G</span></code> and <code><span class='Function'>H</span></code> are structure transformations from different decompositions of <code><span class='Function'>𝔾</span></code> both capturing <code><span class='Value'>𝕩</span></code>, then the structural inverse of <code><span class='Function'>G</span></code> on <code><span class='Value'>v</span></code> into <code><span class='Value'>𝕩</span></code> matches that of <code><span class='Function'>H</span></code> on <code><span class='Value'>v</span></code> into <code><span class='Value'>𝕩</span></code>. Call these inverses <code><span class='Value'>y</span></code> and <code><span class='Value'>z</span></code>. Now begin by supposing that <code><span class='Function'>H</span></code> captures <code><span class='Value'>y</span></code> and <code><span class='Function'>G</span></code> captures <code><span class='Value'>z</span></code>; we will show this later. From the definition of a structural inverse, <code><span class='Value'>v</span><span class='Function'>≡G</span> <span class='Value'>y</span></code>, so that <code><span class='Value'>v</span><span class='Function'>≡𝔾</span> <span class='Value'>y</span></code>, and because <code><span class='Function'>H</span></code> captures <code><span class='Value'>y</span></code> we have <code><span class='Value'>v</span><span class='Function'>≡H</span> <span class='Value'>y</span></code> as well. Let <code><span class='Function'>S</span> <span class='Value'>w</span></code> indicate the set of all functions <code><span class='Function'>F</span></code> such that <code><span class='Value'>w</span> <span class='Function'>≡</span><span class='Modifier2'>○</span><span class='Function'>F</span> <span class='Value'>𝕩</span></code> (this is not a BQN value, both because it is a set and because it's usually infinite): from the definition of <code><span class='Value'>z</span></code> we know that <code><span class='Function'>S</span> <span class='Value'>z</span></code> is a strict superset of <code><span class='Function'>S</span> <span class='Value'>w</span></code> for any <code><span class='Value'>w</span></code> other than <code><span class='Value'>z</span></code> with <code><span class='Value'>v</span><span class='Function'>≡H</span> <span class='Value'>w</span></code>. It follows that either <code><span class='Value'>y</span><span class='Function'>≡</span><span class='Value'>z</span></code> or <code><span class='Function'>S</span> <span class='Value'>y</span></code> is a strict subset of <code><span class='Function'>S</span> <span class='Value'>z</span></code>. By symmetry the same relation holds exchanging <code><span class='Value'>y</span></code> and <code><span class='Value'>z</span></code>, but it's not possible for <code><span class='Function'>S</span> <span class='Value'>y</span></code> to be a strict subset of <code><span class='Function'>S</span> <span class='Value'>z</span></code> and vice-versa. The only remaining possibility is that <code><span class='Value'>y</span><span class='Function'>≡</span><span class='Value'>z</span></code>.</p>
+<p>We now need to show that <code><span class='Function'>H</span></code> captures <code><span class='Value'>y</span></code> (the proof that <code><span class='Function'>G</span></code> captures <code><span class='Value'>z</span></code> is of course the same as <code><span class='Function'>H</span></code> and <code><span class='Function'>G</span></code> are symmetric). To do this we must show that any array in the initial structure of <code><span class='Function'>H</span></code> corresponds to a matching array in <code><span class='Value'>y</span></code>. Choose the position of an array in <code><span class='Function'>H</span></code>, and assume by induction that each array containing it already has the desired property; this implies that this position exists in <code><span class='Value'>y</span></code> as well although we know nothing about its contents. <code><span class='Function'>G</span></code> captures <code><span class='Value'>y</span></code>, so <code><span class='Function'>G</span></code>'s initial structure is <code><span class='Nothing'>·</span></code> at this position or some parent position. There are now two cases: either <code><span class='Function'>G</span></code> makes use of this postion—at least one position in its final structure corresponds to it—or it doesn't. If not, then the contents of <code><span class='Value'>y</span></code> at this position are the same as those of <code><span class='Value'>𝕩</span></code>. Since <code><span class='Function'>H</span></code> captures <code><span class='Value'>𝕩</span></code>, its initial structure matches <code><span class='Value'>𝕩</span></code> and hence <code><span class='Value'>y</span></code> as well at this position. If it does, then instead <code><span class='Value'>y</span></code> matches a part of <code><span class='Value'>v</span></code>.</p>
-- 
cgit v1.2.3