From 0c716e4c6b7c2c44bbfd02b6503cae66af7b7480 Mon Sep 17 00:00:00 2001
From: Marshall Lochbaum
Date: Fri, 28 Jan 2022 16:34:41 -0500
Subject: Separate syntax highlighting category for header/body characters ;:?

---
 docs/commentary/why.html | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

(limited to 'docs/commentary/why.html')

diff --git a/docs/commentary/why.html b/docs/commentary/why.html
index e7c80e61..158dc679 100644
--- a/docs/commentary/why.html
+++ b/docs/commentary/why.html
@@ -32,32 +32,32 @@

BQN's Power modifier allows an array operand to specify multiple results, for example Fn⍟(↕4) to get 0 up to 3 iterations. Intermediate results are saved, so the number of calls only depends on the highest iteration number present. On the other hand, BQN has no direct equivalent of Power Limit ⍣≡, requiring it to be implemented manually.

An APL selective assignment arr[2 3]+←1 should usually be written with Under in BQN: 1⊸+⌾(2‿3⊸⊏)arr (but the correspondence might not always be so direct). You can think of this as a very fancy At (@) operator, that lets you pull out an arbitrary part of an array.

Dfns are adjusted in a few ways that make them more useful for general-purpose programming. A BQN block always runs to the last statement, so a block like {Update 𝕩 ⋄ 1+x} won't return early. Writing modification with ↩ makes it clearer which variable's which. Dfns also do a weird shadowing thing where a←1⋄a←2 makes two different variables; in BQN this is an error because the second should use ↩. Tradfns are removed entirely, along with control structures.

-

BQN doesn't have an exact replacement for dfn guards, although the predicate ? can look similar: {2|⍵ : 1+3×⍵ ⋄ ⍵÷2} is equivalent to {2|𝕩 ? 1+3×𝕩 ; 𝕩÷2}. But note that where APL uses the statement separator ⋄, BQN uses the body separator ;. This means that the if-true branch in BQN can consist of multiple statements (including additional predicates), but also that the if-false branch can't access variables defined in or before the condition. In both cases the "better" behavior can be obtained with an extra set of braces and possibly assigning names to arguments ⍵/𝕩. I think guards end up being cleaner when they work, and predicates are more versatile.

+

BQN doesn't have an exact replacement for dfn guards, although the predicate ? can look similar: {2|⍵ : 1+3×⍵ ⋄ ⍵÷2} is equivalent to {2|𝕩 ? 1+3×𝕩 ; 𝕩÷2}. But note that where APL uses the statement separator ⋄, BQN uses the body separator ;. This means that the if-true branch in BQN can consist of multiple statements (including additional predicates), but also that the if-false branch can't access variables defined in or before the condition. In both cases the "better" behavior can be obtained with an extra set of braces and possibly assigning names to arguments ⍵/𝕩. I think guards end up being cleaner when they work, and predicates are more versatile.

BQN's namespaces have a dedicated syntax, are much easier to create than Dyalog namespaces, and have better performance. I use them all the time, and they feel like a natural part of the language.

J

See also the BQN-J dictionary. J is under development again and a moving target. I stopped using it completely shortly after starting work on BQN in 2020, and while I try to keep up to date on language changes, some remarks here might not fit with the experience you'd get starting with J today.

To me building with J feels like making a tower out of wood and nails by hand: J itself is reliable but I soon don't trust what I'm standing on. J projects start to feel hacky when I have multiple files, locales, or a bit of global state. With BQN I begin to worry about maintainability only when I have enough functions that I can't remember what arguments they expect, and with lexically-scoped variables I simply don't use global state. If you don't reach this scale (in particular, if you use J as a calculator or spreadsheet substitute) you won't feel these concerns, and will have less to gain by moving to BQN. And if you go beyond, you'd need to augment your programs with rigorous documentation and testing in either language.

-

The biggest difference could be in file loading. If you write a script that depends on other files, and want it to work regardless of the directory it's called from, you need to deal with this. In J, >{:4!:3 '' gives the name of the most recently loaded script (the current one, if you put it before any imports), but to make it into a utility you need this glob of what's-going-on:

-
cur_script =: {{(4!:3$0) {::~ 4!:4<'y'}}
+

The biggest difference could be in file loading. If you write a script that depends on other files, and want it to work regardless of the directory it's called from, you need to deal with this. In J, >{:4!:3 '' gives the name of the most recently loaded script (the current one, if you put it before any imports), but to make it into a utility you need this glob of what's-going-on:

+
cur_script =: {{(4!:3$0) {::~ 4!:4<'y'}}
 

In BQN it's •path. And usually you don't need it because •Import resolves paths relative to the file containing it.

-

J uses numeric codes; BQN uses mostly names. So J's 1&o. is BQN's •math.Sin, and 6!:9 corresponds to BQN's •MonoTime.

-

J uses bytestrings by default, making Unicode handling a significant difficulty (see u:). BQN strings are lists of codepoints, so you don't have to worry about how they're encoded or fight to avoid splitting up UTF-8 bytes that need to go together.

+

J uses numeric codes; BQN uses mostly names. So J's 1&o. is BQN's •math.Sin, and 6!:9 corresponds to BQN's •MonoTime.

+

J uses bytestrings by default, making Unicode handling a significant difficulty (see u:). BQN strings are lists of codepoints, so you don't have to worry about how they're encoded or fight to avoid splitting up UTF-8 bytes that need to go together.

But J has its type advantages as well. I miss complex number support in BQN, as it's an optional extension that we haven't yet implemented. And BQN has a hard rule that only one numeric type is exposed to the programmer, which means high-precision integers and rationals aren't allowed at all for a float-based implementation. I think this rule is worth it because J's implicit type conversion is hard to predict and an unexpected numeric type can cause sporadic or subtle program errors.

BQN uses a modifier ⟜ for J's hook, adding ⊸ for a reversed version (which I use nearly twice as often). This frees up the 2-train, which is made equivalent to Atop (∘). It's the system Roger Hui came to advocate, since he argued in favor of a hook conjunction here and made 2-train an Atop when he brought it to Dyalog APL. As an example, the J hook (#~0&<:) to remove negative numbers becomes 0⊸≤⊸/ in BQN. Hooks are also the topic of Array Cast episode 14, where the panel points out that in J, adding a verb at the far left of a dyadic train changes the rest of the train from dyadic to monadic or vice-versa, an effect that doesn't happen in BQN.

J locales are not first-class values, and BQN namespaces are. I think BQN's namespaces are a lot more convenient to construct, although it is lacking an inheritance mechanism (but J's path system can become confusing quickly). More importantly, BQN namespaces (and closures) are garbage collected. J locales leak unless manually freed by the programmer. More generally, J has no mutable data at all, and to simulate it properly you'd have to write your own tracing garbage collection as the J interpreter doesn't have any. I discussed this issue some in this J forum thread.

-

In J, each function has a built-in rank attribute: for example the ranks of + are 0 0 0. This rank is accessed by the "close" compositions @, &, and &.. Choosing the shorter form for the close compositions—for example @ rather than @:—is often considered a mistake within the J community. And function ranks are unreliable: consider that the ranks of ]@:+, a function that always has the same result as +, are _ _ _. In BQN there aren't any close compositions at all, and no function ranks. J's &.> is simply ¨, and other close compositions, in my opinion, just aren't needed.

+

In J, each function has a built-in rank attribute: for example the ranks of + are 0 0 0. This rank is accessed by the "close" compositions @, &, and &.. Choosing the shorter form for the close compositions—for example @ rather than @:—is often considered a mistake within the J community. And function ranks are unreliable: consider that the ranks of ]@:+, a function that always has the same result as +, are _ _ _. In BQN there aren't any close compositions at all, and no function ranks. J's &.> is simply ¨, and other close compositions, in my opinion, just aren't needed.

J has several adverbs (key, prefix, infix, outfix…) to slice up an argument in various ways and apply a verb to those parts. In BQN, I rejected this approach: there are 1-modifiers for basic iteration patterns, and functions such as Group (⊔) that do the slicing but don't apply anything. So </.~a is ⊐⊸⊔a, but fn/.~a is >Fn¨⊐⊸⊔a (I also reject J's implicit merge except for the Rank modifier, as I don't think function results should be homogeneous by default). BQN's approach composes better, and is more predictable from a performance perspective.

Gerunds are J's answer to BQN's first-class functions. For example J's (%&2)`(1+3*])@.(2&|) would be written 2⊸|◶⟨÷⟜2,1+3×⊢⟩ with a list of functions. I think lists of functions are a big improvement, since there's no need to convert between gerund and function, and no worries about arrays that just happen to be valid gerunds (worried about losing the ability to construct gerunds? Constructing tacit functions in BQN is much easier). The usability gap widens because passing J functions around either as values or gerunds presents some highly idiosyncratic challenges, discussed below.

Named functions

Its impact on the programmer is smaller than a lot of the issues above, but this section describes a behavior that I find pretty hard to justify. What does the identifier fn indicate in a J expression? The value of fn in the current scope, one might suppose. Nope—only if the value is a noun. Let's make it a function.

-
   fn =: -
+
   fn =: -
    fn`-
 ┌──┬─┐
 │fn│-│
 └──┴─┘
 
-

The tie adverb ` makes gerund representations of both operands and places them in a list. It returns 'fn';,'-' here: two different strings for what we'd think of as the same function. But it's just being honest. The value of fn really is more like a name than the primitive -. To see this we can pass it in to an adverb that defines its own local, totally separate copy of fn.

+

The tie adverb ` makes gerund representations of both operands and places them in a list. It returns 'fn';,'-' here: two different strings for what we'd think of as the same function. But it's just being honest. The value of fn really is more like a name than the primitive -. To see this we can pass it in to an adverb that defines its own local, totally separate copy of fn.

   fn{{u 3}}
 _3
    fn{{
@@ -67,27 +67,27 @@
 0.333333
 

That's right, it is not safe to use fn as an operand! Instead you're expected to write fn f., where f. (fix) is a primitive that recursively expands all the names. Okay, but if you didn't have these weird name wrappers everywhere you wouldn't have to expand them. Why?

-
   a =: 3 + b
-   b =: a^:(10&<) @: -:
+
   a =: 3 + b
+   b =: a^:(10&<) @: -:
    b 100
 15.25
 

This feature allows tacit recursion and mutual recursion. You can't do this in BQN, because A ← 3 + B with no B defined is a reference to an undefined identifier. You have to use {B𝕩} instead. So this is actually kind of nice. 'Cept it's broken:

   b f.  NB. impossible to fix all the way
-(3 + b)^:(10&<)@:-:
+(3 + b)^:(10&<)@:-:
 
    b f.{{
      b =. 2
      u 100
    }}
-|domain error: b
+|domain error: b
 |       u 100
 

A tacit-recursive function can't be called unless its definition is visible, period. We gained the ability to do this cool tacit recursion thing, and all it cost us was… the ability to reliably use functions as values at all, which should be one of the things tacit programming is good for.

It gets worse.

-
   g =: -
-   f =: g
-   g =: |.
+
   g =: -
+   f =: g
+   g =: |.
    f i. 3
 2 1 0
    <@f i. 3
@@ -96,4 +96,4 @@
 └─┴─┴─┘
 

This should not be possible. f here doesn't behave like -, or quite like |.: in fact there is no function that does what f does. The result of f depends on the entire argument, but <@f encloses rank 0 components! How long would it take you to debug an issue like this? It's rare, but I've run into it in my own code and seen similar reports on the forums.

-

The cause is that the value of f here—a named g function—is not just a name, but also comes with a function rank. The function rank is set by the assignment f =: g, and doesn't change along with g. Calling f doesn't rely on the rank, but @ does, so <@f effectively becomes <@|."-, mixing the two versions of g. The only explanation I have for this one is implementation convenience.

+

The cause is that the value of f here—a named g function—is not just a name, but also comes with a function rank. The function rank is set by the assignment f =: g, and doesn't change along with g. Calling f doesn't rely on the rank, but @ does, so <@f effectively becomes <@|."-, mixing the two versions of g. The only explanation I have for this one is implementation convenience.

-- cgit v1.2.3