<head>
  <link href="../favicon.ico" rel="shortcut icon" type="image/x-icon"/>
  <link href="../style.css" rel="stylesheet"/>
  <title>BQN: Function trains</title>
</head>
<div class="nav"><a href="https://github.com/mlochbaum/BQN">BQN</a> / <a href="../index.html">main</a> / <a href="index.html">doc</a></div>
<h1 id="function-trains">Function trains</h1>
<p>Trains are an important aspect of BQN's <a href="tacit.html">tacit</a> programming capabilities. In fact, a crucial one: with trains and the identity functions Left (<code><span class='Function'>⊣</span></code>) and Right (<code><span class='Function'>⊢</span></code>), a fully tacit program can express any explicit function whose body is a statement with <code><span class='Value'>𝕨</span></code> and <code><span class='Value'>𝕩</span></code> used only as arguments (that is, there are no assignments, and <code><span class='Value'>𝕨</span></code> and <code><span class='Value'>𝕩</span></code> are not used in operands or lists). Functions with assignments may have too many variables active at once to be directly translated, but can be emulated by constructing lists, although that's probably a bad idea. Without trains it isn't possible to have two different functions that each use both arguments to a dyadic function. With trains it's perfectly natural.</p>
<p>BQN's trains are the same as those of Dyalog APL, except that Dyalog is missing the minor convenience of BQN's Nothing (<code><span class='Nothing'>·</span></code>). There are many Dyalog-based documents and videos on trains you can view on the <a href="https://aplwiki.com/wiki/Train">APL Wiki</a>.</p>
<h2 id="2-train-3-train">2-train, 3-train</h2>
<p>Trains are an adaptation of the mathematical convention that, for example, two functions <code><span class='Function'>F</span></code> and <code><span class='Function'>G</span></code> can be added to get a new function <code><span class='Function'>F+G</span></code> that applies as <code><span class='Paren'>(</span><span class='Function'>F+G</span><span class='Paren'>)(</span><span class='Value'>x</span><span class='Paren'>)</span> <span class='Function'>=</span> <span class='Function'>F</span><span class='Paren'>(</span><span class='Value'>x</span><span class='Paren'>)</span><span class='Function'>+G</span><span class='Paren'>(</span><span class='Value'>x</span><span class='Paren'>)</span></code>. With a little change to the syntax, we can do exactly this in BQN:</p>
<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=KOKKoivijL0pIOKGlTU=">↗️</a><pre>    <span class='Paren'>(</span><span class='Function'>⊢+⌽</span><span class='Paren'>)</span> <span class='Function'>↕</span><span class='Number'>5</span>
⟨ 4 4 4 4 4 ⟩
</pre>
<p>So given a list of the first few natural numbers, that <em>same</em> list <em>plus</em> its <em>reverse</em> gives a list of just one number repeated many times. I'm sure if I were <a href="https://en.wikipedia.org/wiki/Carl_Friedrich_Gauss#Anecdotes">Gauss</a> I'd be able to find some clever use for that fact. The mathematical convention extends to any central operator and any number of function arguments, which in BQN means we use any three functions, and call the train with a left argument as well—the only numbers of arguments BQN syntax allows are 1 and 2.</p>
<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=NyAoK+KJjS0pIDI=">↗️</a><pre>    <span class='Number'>7</span> <span class='Paren'>(</span><span class='Function'>+≍-</span><span class='Paren'>)</span> <span class='Number'>2</span>
⟨ 9 5 ⟩
</pre>
<p>Here <a href="couple.html">Couple</a> (<code><span class='Function'>≍</span></code>) is used to combine two units into a list, so we get seven plus and minus two. It's also possible to leave out the leftmost function of a train, or replace it with <code><span class='Nothing'>·</span></code>. In this case the function on the right is called, then the other function is called on its result: it's identical to the mathematical composition <code><span class='Modifier2'>∘</span></code>, which is also part of BQN.</p>
<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=KOKIvuKMvSkgImFiIuKAvyJjZGUi4oC/ImYiCijCt+KIvuKMvSkgImFiIuKAvyJjZGUi4oC/ImYiCuKIvuKImOKMvSAiYWIi4oC/ImNkZSLigL8iZiI=">↗️</a><pre>    <span class='Paren'>(</span><span class='Function'>∾⌽</span><span class='Paren'>)</span> <span class='String'>&quot;ab&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;cde&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;f&quot;</span>
"fcdeab"
    <span class='Paren'>(</span><span class='Nothing'>·</span><span class='Function'>∾⌽</span><span class='Paren'>)</span> <span class='String'>&quot;ab&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;cde&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;f&quot;</span>
"fcdeab"
    <span class='Function'>∾</span><span class='Modifier2'>∘</span><span class='Function'>⌽</span> <span class='String'>&quot;ab&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;cde&quot;</span><span class='Ligature'>‿</span><span class='String'>&quot;f&quot;</span>
"fcdeab"
</pre>
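<p>The examples above all use one argument. As a sketch of the two-argument case (the functions here are chosen only for illustration and don't come from the original examples), a 2-train called with a left argument passes both arguments to the function on the right, while the function on the left is still called with one argument, the result. Below, Subtract receives both arguments and Absolute Value is applied to the difference; the block function on the second line spells out the same computation.</p>
<pre>    <span class='Number'>3</span> <span class='Paren'>(</span><span class='Function'>|-</span><span class='Paren'>)</span> <span class='Number'>7</span>
4
    <span class='Number'>3</span> <span class='Brace'>{</span><span class='Function'>|</span><span class='Value'>𝕨</span><span class='Function'>-</span><span class='Value'>𝕩</span><span class='Brace'>}</span> <span class='Number'>7</span>
4
</pre>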
<p>The three functions <code><span class='Function'>∾⌽</span></code>, <code><span class='Nothing'>·</span><span class='Function'>∾⌽</span></code>, and <code><span class='Function'>∾</span><span class='Modifier2'>∘</span><span class='Function'>⌽</span></code> are completely identical. Why might we want <strong>three</strong> different ways to write the same thing? If we only want to define a function, there's hardly any difference. However, these three forms have different syntax, and might be easier or harder to use in different contexts. As we'll see, we can use <code><span class='Function'>∾</span><span class='Modifier2'>∘</span><span class='Function'>⌽</span></code> inside a train without parenthesizing it, and string <code><span class='Nothing'>·</span><span class='Function'>∾⌽</span></code> but not <code><span class='Function'>∾⌽</span></code> together with other trains. Let's look at how the train syntax extends to longer expressions.</p>
<h2 id="longer-trains">Longer trains</h2>
<p>Function application in trains, as in other contexts, shares the lowest precedence level with assignment. Modifiers and strands (with <code><span class='Ligature'>‿</span></code>) have higher precedence, so they are applied before forming any trains. Once this is done, an expression is a <em>subject expression</em> if it ends with a subject and a <em>function expression</em> if it ends with a function (there are also modifier expressions, which aren't relevant here). A train is any function expression with multiple functions or subjects in it: while we've seen examples with two or three functions, any number are allowed.</p>
<p>Subject expressions are the domain of &quot;old-school&quot; APL, and just apply one function after another to a subject, possibly assigning some of the results (that's the top-level picture—anything can still happen within parentheses). Subjects other than the first appear only as left arguments to functions, which means that two subjects can't appear next to each other because the one on the left would have no corresponding function. Here's an example from the compiler (at one point), with functions and assignments numbered in the order they are applied and their arguments marked with <code><span class='Function'>«»</span></code>, and a fully-parenthesized version shown below.</p>
<pre><span class='Value'>cn</span><span class='Gets'>←</span><span class='Value'>pi</span><span class='Function'>∊</span><span class='Value'>lt</span><span class='Gets'>←</span><span class='Function'>/</span><span class='Value'>𝕩</span><span class='Function'>≥</span><span class='Value'>ci</span><span class='Gets'>←</span><span class='Value'>vi</span><span class='Function'>+</span><span class='Value'>nv</span>
 <span class='Function'>«</span><span class='Number'>6</span> <span class='Function'>«</span><span class='Number'>5</span> <span class='Function'>«</span><span class='Number'>43</span><span class='Function'>«</span><span class='Number'>2</span> <span class='Function'>«</span><span class='Number'>1</span> <span class='Function'>«</span><span class='Number'>0</span><span class='Function'>»</span>

<span class='Value'>cn</span><span class='Gets'>←</span><span class='Paren'>(</span><span class='Value'>pi</span><span class='Function'>∊</span><span class='Paren'>(</span><span class='Value'>lt</span><span class='Gets'>←</span><span class='Paren'>(</span><span class='Function'>/</span><span class='Paren'>(</span><span class='Value'>𝕩</span><span class='Function'>≥</span><span class='Paren'>(</span><span class='Value'>ci</span><span class='Gets'>←</span><span class='Paren'>(</span><span class='Value'>vi</span><span class='Function'>+</span><span class='Value'>nv</span><span class='Paren'>))))))</span>
</pre>
<p>Function expressions have related but different rules, driven by the central principle that functions can be used as &quot;arguments&quot;. Because roles can no longer be used to distinguish functions from their arguments, every function is assumed to have two arguments unless there's nothing to the left of it, or an assignment. In trains, assignments can't appear in the middle, only at the left side after all the functions have been applied. Here's another example from the compiler. Remember that for our purposes <code><span class='Function'>⌈</span><span class='Modifier'>`</span></code> behaves as a single component.</p>
<pre><span class='Function'>⊢&gt;</span><span class='Number'>¯1</span><span class='Function'>»⌈</span><span class='Modifier'>`</span>
<span class='Function'>«</span><span class='Number'>1</span> <span class='Function'>«</span><span class='Number'>0</span><span class='Function'>»</span>

<span class='Function'>⊢&gt;</span><span class='Paren'>(</span><span class='Number'>¯1</span><span class='Function'>»⌈</span><span class='Modifier'>`</span><span class='Paren'>)</span>
</pre>
<p>In a train, arguments alternate strictly with combining functions between them. Arguments can be either functions or subjects, except for the rightmost one, which has to be a function to indicate that the expression is a train. Trains tend to be shorter than subject expressions partly because to keep track of this alternation in a train of all functions, you need to know where each function is relative to the end of the train (subjects like the <code><span class='Number'>¯1</span></code> above only occur as left arguments, so they can also serve as anchors).</p>
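<p>As a small illustration of a subject in an argument position (an example made up for this purpose, not taken from the compiler), the number <code><span class='Number'>1</span></code> below acts like a constant: it ignores the train's arguments and is simply passed as the left argument of <code><span class='Function'>+</span></code>. The rightmost component is still a function, so each expression in parentheses is a train.</p>
<pre>    <span class='Paren'>(</span><span class='Number'>1</span><span class='Function'>+⊢</span><span class='Paren'>)</span> <span class='Number'>5</span>
6
    <span class='Number'>3</span> <span class='Paren'>(</span><span class='Number'>1</span><span class='Function'>+×</span><span class='Paren'>)</span> <span class='Number'>4</span>
13
</pre>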
<h2 id="practice-training">Practice training</h2>
<p>The train <code><span class='Function'>⊢&gt;</span><span class='Number'>¯1</span><span class='Function'>»⌈</span><span class='Modifier'>`</span></code> is actually a nice trick for getting the unique mask <code><span class='Function'>∊</span><span class='Value'>𝕩</span></code> from the self-classify <code><span class='Function'>⊐</span><span class='Value'>𝕩</span></code> without doing another search. Let's take a closer look, first by applying it mechanically. To do this, we apply each &quot;argument&quot; to the train's argument, and then combine them with the combining functions.</p>
<pre><span class='Paren'>(</span><span class='Function'>⊢</span> <span class='Function'>&gt;</span> <span class='Number'>¯1</span> <span class='Function'>»</span> <span class='Function'>⌈</span><span class='Modifier'>`</span><span class='Paren'>)</span> <span class='Value'>𝕩</span>
<span class='Paren'>(</span><span class='Function'>⊢</span><span class='Value'>𝕩</span><span class='Paren'>)</span> <span class='Function'>&gt;</span> <span class='Paren'>(</span><span class='Number'>¯1</span><span class='Paren'>)</span> <span class='Function'>»</span> <span class='Paren'>(</span><span class='Function'>⌈</span><span class='Modifier'>`</span><span class='Value'>𝕩</span><span class='Paren'>)</span>
<span class='Value'>𝕩</span> <span class='Function'>&gt;</span> <span class='Number'>¯1</span> <span class='Function'>»</span> <span class='Function'>⌈</span><span class='Modifier'>`</span><span class='Value'>𝕩</span>
</pre>
<p>So, although not all trains simplify so much, this confusing train is just <code><span class='Brace'>{</span><span class='Value'>𝕩</span><span class='Function'>&gt;</span><span class='Number'>¯1</span><span class='Function'>»⌈</span><span class='Modifier'>`</span><span class='Value'>𝕩</span><span class='Brace'>}</span></code>! Why would I write it in such an obtuse way? To someone used to working with trains, the function <code><span class='Paren'>(</span><span class='Function'>⊢&gt;</span><span class='Number'>¯1</span><span class='Function'>»⌈</span><span class='Modifier'>`</span><span class='Paren'>)</span></code> isn't any more complicated to read: <code><span class='Function'>⊢</span></code> in an argument position of a train just means <code><span class='Value'>𝕩</span></code> while <code><span class='Function'>⌈</span><span class='Modifier'>`</span></code> will be applied to the arguments. Using the train just means slightly shorter code and two fewer <code><span class='Value'>𝕩</span></code>s to trip over.</p>
<p>This function's argument is the self-classify <code><span class='Function'>⊐</span></code> of some list (in fact this technique also works on the self-indices <code><span class='Value'>𝕩</span><span class='Function'>⊐</span><span class='Value'>𝕩</span></code>). Self-classify moves along its argument, giving each major cell a number: the first unused natural number if that value hasn't been seen yet, and otherwise the number chosen when it was first seen. It can be implemented as <code><span class='Function'>⍷⊐⊢</span></code>, another train!</p>
<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=4oqiIHNjIOKGkCDiipAgInRhY2l0dHJhaW5zIg==">↗️</a><pre>    <span class='Function'>⊢</span> <span class='Value'>sc</span> <span class='Gets'>←</span> <span class='Function'>⊐</span> <span class='String'>&quot;tacittrains&quot;</span>
⟨ 0 1 2 3 0 0 4 1 3 5 6 ⟩
</pre>
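<p>As a quick check, self-classify can also be spelled out as a train: the index of each cell of the argument (<code><span class='Function'>⊢</span></code>) among (<code><span class='Function'>⊐</span></code>) its unique cells (<code><span class='Function'>⍷</span></code>). This sketch assumes that the unique cells are returned in order of first appearance, which is what makes the numbering below match.</p>
<pre>    <span class='Paren'>(</span><span class='Function'>⍷⊐⊢</span><span class='Paren'>)</span> <span class='String'>&quot;tacittrains&quot;</span>
⟨ 0 1 2 3 0 0 4 1 3 5 6 ⟩
</pre>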
<p>Each <code><span class='String'>'t'</span></code> is <code><span class='Number'>0</span></code>, each <code><span class='String'>'a'</span></code> is <code><span class='Number'>1</span></code>, and so on. We'd like to discard some of the information in the self-classify, to just find whether each major cell had a new value. Here are the input and desired result:</p>
<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=c2Mg4omNIOKIiiAidGFjaXR0cmFpbnMi">↗️</a><pre>    <span class='Value'>sc</span> <span class='Function'>≍</span> <span class='Function'>∊</span> <span class='String'>&quot;tacittrains&quot;</span>
┌─                       
╵ 0 1 2 3 0 0 4 1 3 5 6  
  1 1 1 1 0 0 1 0 0 1 1  
                        ┘
</pre>
<p>The result should be <code><span class='Number'>1</span></code> when a new number appears, higher than all the previous numbers. To do this, we first find the highest previous number by taking the maximum-scan <code><span class='Function'>⌈</span><span class='Modifier'>`</span></code> of the argument, then <a href="shift.html">shifting</a> to move the previous maximum to the current position. The first cell is always new, so we shift in a <code><span class='Number'>¯1</span></code>, which is less than any element of the argument.</p>
<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=wq8xIMK7IOKMiGBzYwoowq8xwrvijIhgKSBzYw==">↗️</a><pre>    <span class='Number'>¯1</span> <span class='Function'>»</span> <span class='Function'>⌈</span><span class='Modifier'>`</span><span class='Value'>sc</span>
⟨ ¯1 0 1 2 3 3 3 4 4 4 5 ⟩
    <span class='Paren'>(</span><span class='Number'>¯1</span><span class='Function'>»⌈</span><span class='Modifier'>`</span><span class='Paren'>)</span> <span class='Value'>sc</span>
⟨ ¯1 0 1 2 3 3 3 4 4 4 5 ⟩
</pre>
<p>Now we compare the original list with the list of previous-maximums.</p>
<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=c2MgPiDCrzHCu+KMiGBzYwoo4oqiPsKvMcK74oyIYCkgc2M=">↗️</a><pre>    <span class='Value'>sc</span> <span class='Function'>&gt;</span> <span class='Number'>¯1</span><span class='Function'>»⌈</span><span class='Modifier'>`</span><span class='Value'>sc</span>
⟨ 1 1 1 1 0 0 1 0 0 1 1 ⟩
    <span class='Paren'>(</span><span class='Function'>⊢&gt;</span><span class='Number'>¯1</span><span class='Function'>»⌈</span><span class='Modifier'>`</span><span class='Paren'>)</span> <span class='Value'>sc</span>
⟨ 1 1 1 1 0 0 1 0 0 1 1 ⟩
</pre>
<h2 id="composing-trains">Composing trains</h2>
<p>The example above uses a train with five functions: an odd number. Trains with an odd length are always composed of length-3 trains, and they themselves are composed the same way as subject expressions: an odd-length train can be placed in the last position of another train without parentheses, but it needs parentheses to go in any other position.</p>
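<p>To illustrate with a train that isn't used elsewhere on this page, here is a sketch of a five-component train that subtracts the mean of a list from each element. The last three components <code><span class='Function'>+</span><span class='Modifier'>´</span><span class='Function'>÷≠</span></code> form a 3-train for the mean, and because it sits in the last position, the result is the same with or without parentheses around it.</p>
<pre>    <span class='Paren'>(</span><span class='Function'>⊢-+</span><span class='Modifier'>´</span><span class='Function'>÷≠</span><span class='Paren'>)</span> <span class='Number'>1</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>6</span>
⟨ ¯2 ¯1 0 3 ⟩
    <span class='Paren'>(</span><span class='Function'>⊢-</span><span class='Paren'>(</span><span class='Function'>+</span><span class='Modifier'>´</span><span class='Function'>÷≠</span><span class='Paren'>))</span> <span class='Number'>1</span><span class='Ligature'>‿</span><span class='Number'>2</span><span class='Ligature'>‿</span><span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>6</span>
⟨ ¯2 ¯1 0 3 ⟩
</pre>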
<p>But we also saw the length-2 train <code><span class='Function'>∾⌽</span></code> above. Even-length trains consist of a single function (<code><span class='Function'>∾</span></code>) applied to a function or odd-length train (<code><span class='Function'>⌽</span></code>); another perspective is that an even-length train is an odd-length train where the left argument of the final (leftmost) function is left out, so it's called with only a right argument. An even-length train <em>always</em> needs parentheses if it's used as one of the functions in another train. However, it can also be turned into an odd-length train by placing <code><span class='Nothing'>·</span></code> at the left, making the implicit missing argument explicit. After this it can be used at the end of an odd-length train without parentheses. To get some intuition for even-length trains, let's look at an example of three functions used together: the unique (<code><span class='Function'>⍷</span></code>) sorted (<code><span class='Function'>∧</span></code>) absolute values (<code><span class='Function'>|</span></code>) of an argument list.</p>
<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=4o234oinfCAz4oC/NOKAv8KvM+KAv8KvMuKAvzA=">↗️</a><pre>    <span class='Function'>⍷∧|</span> <span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>4</span><span class='Ligature'>‿</span><span class='Number'>¯3</span><span class='Ligature'>‿</span><span class='Number'>¯2</span><span class='Ligature'>‿</span><span class='Number'>0</span>
⟨ 0 2 3 4 ⟩
</pre>
<p>If it doesn't have to be a function, it's easiest to write it all out! Let's assume we want a tacit function instead. With three one-argument functions, we can't use a 3-train, as the middle function in a 3-train always has two arguments. Instead, we will compose the functions with 2-trains. Composition is associative, meaning that this can be done starting at either the left or the right.</p>
<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=KCjijbfiiKcpfCkgM+KAvzTigL/CrzPigL/CrzLigL8wCijijbco4oinfCkpIDPigL804oC/wq8z4oC/wq8y4oC/MA==">↗️</a><pre>    <span class='Paren'>((</span><span class='Function'>⍷∧</span><span class='Paren'>)</span><span class='Function'>|</span><span class='Paren'>)</span> <span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>4</span><span class='Ligature'>‿</span><span class='Number'>¯3</span><span class='Ligature'>‿</span><span class='Number'>¯2</span><span class='Ligature'>‿</span><span class='Number'>0</span>
⟨ 0 2 3 4 ⟩
    <span class='Paren'>(</span><span class='Function'>⍷</span><span class='Paren'>(</span><span class='Function'>∧|</span><span class='Paren'>))</span> <span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>4</span><span class='Ligature'>‿</span><span class='Number'>¯3</span><span class='Ligature'>‿</span><span class='Number'>¯2</span><span class='Ligature'>‿</span><span class='Number'>0</span>
⟨ 0 2 3 4 ⟩
</pre>
<p>We might make the first train above easier to read by using Atop (<code><span class='Modifier2'>∘</span></code>) instead of a 2-train. Atop is a 2-modifier, so it doesn't need parentheses when used in a train. The second train can also be changed to <code><span class='Function'>⍷∧</span><span class='Modifier2'>∘</span><span class='Function'>|</span></code> in the same way, but there is another option: the rightmost train <code><span class='Function'>∧|</span></code> can be expanded to <code><span class='Nothing'>·</span><span class='Function'>∧|</span></code>. After this it's an odd-length train in the last position, and doesn't need parentheses anymore.</p>
<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=KOKNt+KImOKIp3wpIDPigL804oC/wq8z4oC/wq8y4oC/MAoo4o23wrfiiKd8KSAz4oC/NOKAv8KvM+KAv8KvMuKAvzA=">↗️</a><pre>    <span class='Paren'>(</span><span class='Function'>⍷</span><span class='Modifier2'>∘</span><span class='Function'>∧|</span><span class='Paren'>)</span> <span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>4</span><span class='Ligature'>‿</span><span class='Number'>¯3</span><span class='Ligature'>‿</span><span class='Number'>¯2</span><span class='Ligature'>‿</span><span class='Number'>0</span>
⟨ 0 2 3 4 ⟩
    <span class='Paren'>(</span><span class='Function'>⍷</span><span class='Nothing'>·</span><span class='Function'>∧|</span><span class='Paren'>)</span> <span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>4</span><span class='Ligature'>‿</span><span class='Number'>¯3</span><span class='Ligature'>‿</span><span class='Number'>¯2</span><span class='Ligature'>‿</span><span class='Number'>0</span>
⟨ 0 2 3 4 ⟩
</pre>
<p>These two forms have a different emphasis, because the first breaks into subfunctions <code><span class='Function'>⍷</span><span class='Modifier2'>∘</span><span class='Function'>∧</span></code> and <code><span class='Function'>|</span></code> and the second into <code><span class='Function'>⍷</span></code> and <code><span class='Function'>∧|</span></code>. It's more common to use <code><span class='Function'>⍷</span><span class='Modifier2'>∘</span><span class='Function'>∧</span></code> as a unit than <code><span class='Function'>∧|</span></code>, so in this case <code><span class='Function'>⍷</span><span class='Modifier2'>∘</span><span class='Function'>∧|</span></code> is probably the better train.</p>
<p>Many one-argument functions strung together is <a href="../commentary/problems.html#trains-dont-like-monads">a major weakness</a> for train syntax. If there are many such functions it's probably best to stick with a block function instead!</p>
<a class="replLink" title="Open in the REPL" target="_blank" href="https://mlochbaum.github.io/BQN/try.html#code=e+KNt+KIp3zwnZWpfSAz4oC/NOKAv8KvM+KAv8KvMuKAvzA=">↗️</a><pre>    <span class='Brace'>{</span><span class='Function'>⍷∧|</span><span class='Value'>𝕩</span><span class='Brace'>}</span> <span class='Number'>3</span><span class='Ligature'>‿</span><span class='Number'>4</span><span class='Ligature'>‿</span><span class='Number'>¯3</span><span class='Ligature'>‿</span><span class='Number'>¯2</span><span class='Ligature'>‿</span><span class='Number'>0</span>
⟨ 0 2 3 4 ⟩
</pre>
<p>In our example, there aren't enough of these functions to really be cumbersome. If <code><span class='Function'>⍷</span><span class='Modifier2'>∘</span><span class='Function'>∧</span></code> is a common combination in a particular program, then the train <code><span class='Function'>⍷</span><span class='Modifier2'>∘</span><span class='Function'>∧|</span></code> will be more visually consistent and make it easier to use a utility function for <code><span class='Function'>⍷</span><span class='Modifier2'>∘</span><span class='Function'>∧</span></code> if that's wanted in the future.</p>