From 0c716e4c6b7c2c44bbfd02b6503cae66af7b7480 Mon Sep 17 00:00:00 2001
From: Marshall Lochbaum
Date: Fri, 28 Jan 2022 16:34:41 -0500
Subject: Separate syntax highlighting category for header/body characters ;:?

---
 docs/commentary/history.html | 8 +--
 docs/commentary/problems.html | 4 +-
 docs/commentary/why.html | 32 +++++-----
 docs/doc/block.html | 58 +++++++++----------
 docs/doc/control.html | 56 +++++++++---------
 docs/doc/couple.html | 2 +-
 docs/doc/embed.html | 12 ++--
 docs/doc/expression.html | 6 +-
 docs/doc/fromDyalog.html | 2 +-
 docs/doc/fromJ.html | 122 +++++++++++++++++++--------------------
 docs/doc/glossary.html | 4 +-
 docs/doc/oop.html | 4 +-
 docs/doc/pair.html | 2 +-
 docs/doc/primitive.html | 2 +-
 docs/doc/rebqn.html | 2 +-
 docs/doc/syntax.html | 10 ++--
 docs/doc/undo.html | 4 +-
 docs/editors/index.html | 10 ++--
 docs/help/currentfunction.html | 2 +-
 docs/help/nothing.html | 2 +-
 docs/implementation/kclaims.html | 48 +++++++--------
 docs/implementation/vm.html | 6 +-
 docs/spec/complex.html | 2 +-
 docs/spec/evaluate.html | 8 +--
 docs/spec/grammar.html | 44 +++++++-------
 docs/spec/inferred.html | 2 +-
 docs/spec/literal.html | 6 +-
 docs/spec/primitive.html | 2 +-
 docs/style.css | 2 +
 md.bqn | 1 +
 30 files changed, 234 insertions(+), 231 deletions(-)

diff --git a/docs/commentary/history.html b/docs/commentary/history.html
index 77ebec7f..cb296237 100644
--- a/docs/commentary/history.html
+++ b/docs/commentary/history.html
@@ -140,7 +140,7 @@
 -05 ngn/apl Nikolov
-Multiple function bodies ;
+Multiple function bodies ;
@@ -161,14 +161,14 @@
 dzaima
-Inverse headers 𝕊⁼:
+Inverse headers 𝕊⁼:
 0, 1 w/ dzaima
-Headers :
+Headers :
 0, 1
@@ -213,7 +213,7 @@

Double-struck special names

There was a lot of discussion about names for arguments at YAG (no one liked alpha and omega); I think Nathan Rogers suggested using Unicode's mathematical variants of Latin letters and I picked out the double-struck ones. My impression is that we were approaching a general consensus that "w" and "x" were the best of several bad choices of argument letters, but that I was the first to commit to them.

Assert primitive

-

Nathan Rogers suggested that assertion should be made a primitive to elevate it to a basic part of the language. I used J's assert often enough for this idea to make sense immediately, but I think it was new to me. He suggested the dagger character; I changed this to the somewhat similar-looking !. The error-trapping modifier ⎊ is identical to J's ::, but J only has the function [: to unconditionally throw an error, with no way to set a message.

+

Nathan Rogers suggested that assertion should be made a primitive to elevate it to a basic part of the language. I used J's assert often enough for this idea to make sense immediately, but I think it was new to me. He suggested the dagger character; I changed this to the somewhat similar-looking !. The error-trapping modifier ⎊ is identical to J's ::, but J only has the function [: to unconditionally throw an error, with no way to set a message.

Context-free grammar

In YAG meetings, I suggested adopting APL\iv's convention that variable case must match variable type in order to achieve a context-free grammar. Adám, a proponent of case-insensitive names, pointed out that the case might indicate the type the programmer wanted to use instead of the value's type, creating cross roles. Although I considered swapping subjects and functions, I ended up using exactly the conventions of his APL style guide.

Headers

diff --git a/docs/commentary/problems.html b/docs/commentary/problems.html
index 8b9f0955..4199ddaa 100644
--- a/docs/commentary/problems.html
+++ b/docs/commentary/problems.html
@@ -56,7 +56,7 @@

Can't mix define and modify in multiple assignment

Say a is a pair and h isn't defined yet; how would you set h to the first element of a and change a to be just the second? h‿a↩a doesn't work because h isn't defined, so the best I have is h←@⋄h‿a↩a. A heavier assignment syntax wouldn't break down; BQN could allow ⟨h←,a⟩↩a but I don't think this merits special syntax.

Trains don't like monads

-

If you have the normal mix of monads and dyads you'll need a lot of parentheses and might end up abusing ⟜. Largely solved with the Nothing syntax ·, which acts like J's Cap ([:) in a train, but still a minor frustration.

+

If you have the normal mix of monads and dyads you'll need a lot of parentheses and might end up abusing ⟜. Largely solved with the Nothing syntax ·, which acts like J's Cap ([:) in a train, but still a minor frustration.

Under/bind combination is awkward

It's most common to use Under with dyadic structural functions in the form …⌾(i⊸F), for example where F is one of / or ↑. This is frustrating for two reasons: it requires parentheses, and it doesn't allow i to be computed tacitly. If there's no left argument then the modifier {𝔽⌾(𝕨⊸𝔾)𝕩} can be more useful, but it doesn't cover some useful cases such as mask a ⊣⌾(u⊸/) b.

Axis ordering is big-endian

@@ -155,7 +155,7 @@

Fixed by adding block returns such as label← to jump out of a block with header name label. Hopefully these don't cause too many new problems.

This was an issue with using functions as control flow. For example, when looping through an array with Each, you can't decide to exit early. In a curly-brace language you would just use a for loop and a return. In BQN, we need… longjmp? Maybe not as crazy as it sounds, and potentially worth it in exchange for replacing control structures.

Ambivalent explicit functions

-

Fixed with multiple bodies: if there are two bodies with no headers such as {2×𝕩;𝕨-𝕩}, they are the monadic and dyadic case.

+

Fixed with multiple bodies: if there are two bodies with no headers such as {2×𝕩;𝕨-𝕩}, they are the monadic and dyadic case.

How to choose a partitioning function?

Fixed with Group, which I found May 2020. Group serves as a much improved Partition. Later extended to multiple axes as well to get all the functionality.

Key doesn't do what you want

diff --git a/docs/commentary/why.html b/docs/commentary/why.html
index e7c80e61..158dc679 100644
--- a/docs/commentary/why.html
+++ b/docs/commentary/why.html
@@ -32,32 +32,32 @@

BQN's Power modifier ⍟ allows an array operand to specify multiple results, for example Fn⍟(↕4) to get 0 up to 3 iterations. Intermediate results are saved, so the number of calls only depends on the highest iteration number present. On the other hand, BQN has no direct equivalent of Power Limit ⍣≡, requiring it to be implemented manually.
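As a small illustration of the array-operand form, here is a sketch using a doubling function. The name Double and the starting value 1 are my own choices for the example, not something from the text above.

```bqn
    Double ← 2⊸×        # hypothetical example function
    Double⍟(↕4) 1       # results after 0, 1, 2, and 3 applications
⟨ 1 2 4 8 ⟩
```

The result has the same shape as the operand ↕4, with each element holding the value after that many applications.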

An APL selective assignment arr[2 3]+←1 should usually be written with Under in BQN: 1⊸+⌾(2‿3⊸⊏)arr (but the correspondence might not always be so direct). You can think of this as a very fancy At (@) operator, that lets you pull out an arbitrary part of an array.
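To make the Under form concrete, a minimal sketch follows. The contents of arr are an assumption chosen so the changed positions are easy to spot.

```bqn
    arr ← 10×↕5             # hypothetical example array: 0 10 20 30 40
    1⊸+⌾(2‿3⊸⊏) arr         # add 1 at indices 2 and 3 only
⟨ 0 10 21 31 40 ⟩
```

Unlike the APL assignment, this returns a new array and leaves arr itself unchanged; use arr ↩ … to store the result.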

Dfns are adjusted in a few ways that make them more useful for general-purpose programming. A BQN block always runs to the last statement, so a block like {Update 𝕩 ⋄ 1+x} won't return early. Writing modification with ↩ makes it clearer which variable's which. Dfns also do a weird shadowing thing where a←1⋄a←2 makes two different variables; in BQN this is an error because the second should use ↩. Tradfns are removed entirely, along with control structures.

-

BQN doesn't have an exact replacement for dfn guards, although the predicate ? can look similar: {2|⍵ : 1+3×⍵ ⋄ ⍵÷2} is equivalent to {2|𝕩 ? 1+3×𝕩 ; 𝕩÷2}. But note that where APL uses the statement separator ⋄, BQN uses the body separator ;. This means that the if-true branch in BQN can consist of multiple statements (including additional predicates), but also that the if-false branch can't access variables defined in or before the condition. In both cases the "better" behavior can be obtained with an extra set of braces and possibly assigning names to arguments ⍵/𝕩. I think guards end up being cleaner when they work, and predicates are more versatile.

+

BQN doesn't have an exact replacement for dfn guards, although the predicate ? can look similar: {2|⍵ : 1+3×⍵ ⋄ ⍵÷2} is equivalent to {2|𝕩 ? 1+3×𝕩 ; 𝕩÷2}. But note that where APL uses the statement separator ⋄, BQN uses the body separator ;. This means that the if-true branch in BQN can consist of multiple statements (including additional predicates), but also that the if-false branch can't access variables defined in or before the condition. In both cases the "better" behavior can be obtained with an extra set of braces and possibly assigning names to arguments ⍵/𝕩. I think guards end up being cleaner when they work, and predicates are more versatile.

BQN's namespaces have a dedicated syntax, are much easier to create than Dyalog namespaces, and have better performance. I use them all the time, and they feel like a natural part of the language.

J

See also the BQN-J dictionary. J is under development again and a moving target. I stopped using it completely shortly after starting work on BQN in 2020, and while I try to keep up to date on language changes, some remarks here might not fit with the experience you'd get starting with J today.

To me building with J feels like making a tower out of wood and nails by hand: J itself is reliable but I soon don't trust what I'm standing on. J projects start to feel hacky when I have multiple files, locales, or a bit of global state. With BQN I begin to worry about maintainability only when I have enough functions that I can't remember what arguments they expect, and with lexically-scoped variables I simply don't use global state. If you don't reach this scale (in particular, if you use J as a calculator or spreadsheet substitute) you won't feel these concerns, and will have less to gain by moving to BQN. And if you go beyond, you'd need to augment your programs with rigorous documentation and testing in either language.

-

The biggest difference could be in file loading. If you write a script that depends on other files, and want it to work regardless of the directory it's called from, you need to deal with this. In J, >{:4!:3 '' gives the name of the most recently loaded script (the current one, if you put it before any imports), but to make it into a utility you need this glob of what's-going-on:

-
cur_script =: {{(4!:3$0) {::~ 4!:4<'y'}}
+

The biggest difference could be in file loading. If you write a script that depends on other files, and want it to work regardless of the directory it's called from, you need to deal with this. In J, >{:4!:3 '' gives the name of the most recently loaded script (the current one, if you put it before any imports), but to make it into a utility you need this glob of what's-going-on:

+
cur_script =: {{(4!:3$0) {::~ 4!:4<'y'}}
 

In BQN it's •path. And usually you don't need it because •Import resolves paths relative to the file containing it.

-

J uses numeric codes; BQN uses mostly names. So J's 1&o. is BQN's •math.Sin, and 6!:9 corresponds to BQN's •MonoTime.

-

J uses bytestrings by default, making Unicode handling a significant difficulty (see u:). BQN strings are lists of codepoints, so you don't have to worry about how they're encoded or fight to avoid splitting up UTF-8 bytes that need to go together.

+

J uses numeric codes; BQN uses mostly names. So J's 1&o. is BQN's •math.Sin, and 6!:9 corresponds to BQN's •MonoTime.

+

J uses bytestrings by default, making Unicode handling a significant difficulty (see u:). BQN strings are lists of codepoints, so you don't have to worry about how they're encoded or fight to avoid splitting up UTF-8 bytes that need to go together.

But J has its type advantages as well. I miss complex number support in BQN, as it's an optional extension that we haven't yet implemented. And BQN has a hard rule that only one numeric type is exposed to the programmer, which means high-precision integers and rationals aren't allowed at all for a float-based implementation. I think this rule is worth it because J's implicit type conversion is hard to predict and an unexpected numeric type can cause sporadic or subtle program errors.

BQN uses a modifier ⟜ for J's hook, adding ⊸ for a reversed version (which I use nearly twice as often). This frees up the 2-train, which is made equivalent to Atop (∘). It's the system Roger Hui came to advocate, since he argued in favor of a hook conjunction here and made 2-train an Atop when he brought it to Dyalog APL. As an example, the J hook (#~0&<:) to remove negative numbers becomes 0⊸≤⊸/ in BQN. Hooks are also the topic of Array Cast episode 14, where the panel points out that in J, adding a verb at the far left of a dyadic train changes the rest of the train from dyadic to monadic or vice-versa, an effect that doesn't happen in BQN.
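The hook translation can be tried directly. This is a minimal sketch; the sample list is an assumption picked to include negative, zero, and positive entries.

```bqn
    0⊸≤⊸/ ¯2‿3‿0‿¯1‿5    # (0≤𝕩) / 𝕩 : keep the non-negative elements
⟨ 3 0 5 ⟩
```

Here 0⊸≤ computes the boolean mask 0‿1‿1‿0‿1 from the argument, and ⊸/ uses that mask as the left argument of Replicate on the same argument.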

J locales are not first-class values, and BQN namespaces are. I think BQN's namespaces are a lot more convenient to construct, although it is lacking an inheritance mechanism (but J's path system can become confusing quickly). More importantly, BQN namespaces (and closures) are garbage collected. J locales leak unless manually freed by the programmer. More generally, J has no mutable data at all, and to simulate it properly you'd have to write your own tracing garbage collection as the J interpreter doesn't have any. I discussed this issue some in this J forum thread.

-

In J, each function has a built-in rank attribute: for example the ranks of + are 0 0 0. This rank is accessed by the "close" compositions @, &, and &.. Choosing the shorter form for the close compositions (for example @ rather than @:) is often considered a mistake within the J community. And function ranks are unreliable: consider that the ranks of ]@:+, a function that always has the same result as +, are _ _ _. In BQN there aren't any close compositions at all, and no function ranks. J's &.> is simply ¨, and other close compositions, in my opinion, just aren't needed.

+

In J, each function has a built-in rank attribute: for example the ranks of + are 0 0 0. This rank is accessed by the "close" compositions @, &, and &.. Choosing the shorter form for the close compositions (for example @ rather than @:) is often considered a mistake within the J community. And function ranks are unreliable: consider that the ranks of ]@:+, a function that always has the same result as +, are _ _ _. In BQN there aren't any close compositions at all, and no function ranks. J's &.> is simply ¨, and other close compositions, in my opinion, just aren't needed.

J has several adverbs (key, prefix, infix, outfix…) to slice up an argument in various ways and apply a verb to those parts. In BQN, I rejected this approach: there are 1-modifiers for basic iteration patterns, and functions such as Group (⊔) that do the slicing but don't apply anything. So </.~a is ⊐⊸⊔a, but fn/.~a is >Fn¨⊐⊸⊔a (I also reject J's implicit merge except for the Rank modifier, as I don't think function results should be homogeneous by default). BQN's approach composes better, and is more predictable from a performance perspective.
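A short sketch of the slice-then-apply split described above, using Length (≠) as the applied function. The string is an arbitrary example of mine.

```bqn
    a ← "abracadabra"    # hypothetical example data
    ⊐⊸⊔ a                # slicing only: group equal elements
⟨ "aaaaa" "bb" "rr" "c" "d" ⟩
    >≠¨⊐⊸⊔ a             # then apply a function to each group and merge
⟨ 5 2 2 1 1 ⟩
```

The grouping and the per-group function are separate steps, so either one can be swapped out or reused on its own.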

Gerunds are J's answer to BQN's first-class functions. For example J's (%&2)`(1+3*])@.(2&|) would be written 2⊸|◶⟨÷⟜2,1+3×⊢⟩ with a list of functions. I think lists of functions are a big improvement, since there's no need to convert between gerund and function, and no worries about arrays that just happen to be valid gerunds (worried about losing the ability to construct gerunds? Constructing tacit functions in BQN is much easier). The usability gap widens because passing J functions around either as values or gerunds presents some highly idiosyncratic challenges, discussed below.
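For reference, the Choose expression above can be named and called like any other function. The name Step is my own label for this sketch.

```bqn
    Step ← 2⊸|◶⟨÷⟜2,1+3×⊢⟩   # hypothetical name for the function list example
    Step 6                    # 2|6 is 0, so the first function ÷⟜2 is used
3
    Step 7                    # 2|7 is 1, so the train 1+3×⊢ is used
22
```

The list ⟨÷⟜2,1+3×⊢⟩ is an ordinary list whose elements are functions, so it can be built, indexed, or passed around with the usual list tools.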

Named functions

Its impact on the programmer is smaller than a lot of the issues above, but this section describes a behavior that I find pretty hard to justify. What does the identifier fn indicate in a J expression? The value of fn in the current scope, one might suppose. Nope: only if the value is a noun. Let's make it a function.

-
   fn =: -
+
   fn =: -
    fn`-
 ┌──┬─┐
 │fn│-│
 └──┴─┘
 
-

The tie adverb ` makes gerund representations of both operands and places them in a list. It returns 'fn';,'-' here: two different strings for what we'd think of as the same function. But it's just being honest. The value of fn really is more like a name than the primitive -. To see this we can pass it in to an adverb that defines its own local, totally separate copy of fn.

+

The tie adverb ` makes gerund representations of both operands and places them in a list. It returns 'fn';,'-' here: two different strings for what we'd think of as the same function. But it's just being honest. The value of fn really is more like a name than the primitive -. To see this we can pass it in to an adverb that defines its own local, totally separate copy of fn.

   fn{{u 3}}
 _3
    fn{{
@@ -67,27 +67,27 @@
 0.333333
 

That's right, it is not safe to use fn as an operand! Instead you're expected to write fn f., where f. (fix) is a primitive that recursively expands all the names. Okay, but if you didn't have these weird name wrappers everywhere you wouldn't have to expand them. Why?

-
   a =: 3 + b
-   b =: a^:(10&<) @: -:
+
   a =: 3 + b
+   b =: a^:(10&<) @: -:
    b 100
 15.25
 

This feature allows tacit recursion and mutual recursion. You can't do this in BQN, because A ← 3 + B with no B defined is a reference to an undefined identifier. You have to use {B𝕩} instead. So this is actually kind of nice. 'Cept it's broken:

   b f.  NB. impossible to fix all the way
-(3 + b)^:(10&<)@:-:
+(3 + b)^:(10&<)@:-:
 
    b f.{{
      b =. 2
      u 100
    }}
-|domain error: b
+|domain error: b
 |       u 100
 

A tacit-recursive function can't be called unless its definition is visible, period. We gained the ability to do this cool tacit recursion thing, and all it cost us was… the ability to reliably use functions as values at all, which should be one of the things tacit programming is good for.

It gets worse.

-
   g =: -
-   f =: g
-   g =: |.
+
   g =: -
+   f =: g
+   g =: |.
    f i. 3
 2 1 0
    <@f i. 3
@@ -96,4 +96,4 @@
 └─┴─┴─┘
 

This should not be possible. f here doesn't behave like -, or quite like |.: in fact there is no function that does what f does. The result of f depends on the entire argument, but <@f encloses rank 0 components! How long would it take you to debug an issue like this? It's rare, but I've run into it in my own code and seen similar reports on the forums.

-

The cause is that the value of f here (a named g function) is not just a name, but also comes with a function rank. The function rank is set by the assignment f =: g, and doesn't change along with g. Calling f doesn't rely on the rank, but @ does, so <@f effectively becomes <@|."-, mixing the two versions of g. The only explanation I have for this one is implementation convenience.

+

The cause is that the value of f here (a named g function) is not just a name, but also comes with a function rank. The function rank is set by the assignment f =: g, and doesn't change along with g. Calling f doesn't rely on the rank, but @ does, so <@f effectively becomes <@|."-, mixing the two versions of g. The only explanation I have for this one is implementation convenience.

diff --git a/docs/doc/block.html b/docs/doc/block.html
index 64643e7d..40de8ab2 100644
--- a/docs/doc/block.html
+++ b/docs/doc/block.html
@@ -135,26 +135,26 @@

Because 𝕣 only ever refers to a 1-modifier or 2-modifier, it can never make sense to refer to it as a function, and the uppercase letter ℝ is not recognized by BQN. In order to allow 𝕣 to be spelled as a 1-modifier _𝕣 or 2-modifier _𝕣_, it is treated as an ordinary identifier character, so it must be separated from letters or numbers by spaces.

Block headers

-

As a program becomes larger, it often becomes necessary to name inputs to blocks rather than just using special names. It can also become difficult to identify what kind of block is being defined, as it requires scanning through the block for special names. A block header, which is separated from the body of a block by a colon :, specifies the kind of block and can declare names for the block and its inputs.

-
Fact ← { F n:
+

As a program becomes larger, it often becomes necessary to name inputs to blocks rather than just using special names. It can also become difficult to identify what kind of block is being defined, as it requires scanning through the block for special names. A block header, which is separated from the body of a block by a colon :, specifies the kind of block and can declare names for the block and its inputs.

+
Fact ← { F n:
  n × (0⊸<)◶1‿F n-1
 }
 

Its syntax mirrors an application of the block. As suggested by the positioning, the names given in a header apply only inside the block: for example F above is only defined inside the {} braces while Fact could be used either outside or inside. Some other possibilities are given below.

# A dyadic function that refers to itself as Func
-{ l Func r:
+{ l Func r:
   …
 
 # A deferred 1-modifier with a list argument
-{ Fn _apply ⟨a,b⟩:
+{ Fn _apply ⟨a,b⟩:
   …
 
 # A monadic function with no names given
-{ 𝕊𝕩:
+{ 𝕊𝕩:
   …
 
 # An immediate or deferred 2-modifier
-{ F _op_ val:
+{ F _op_ val:
   …
 

In all cases special names still work just like in a headerless function. In this respect the effect of the header is the same as a series of assignments at the beginning of a function, such as the following translation of the second header above:

@@ -167,31 +167,31 @@

Unlike these assignments, the header also constrains what inputs the block can take: a monadic 1-modifier like the one above can't take a right operand or left argument, and consequently its body can't contain 𝔾 or 𝕨. Calling it with a left argument, or a right argument that isn't a two-element list, will result in an error.

Destructuring

Arguments, but not operands, allow destructuring like assignment does. While assignment only tolerates lists of variables, header destructuring also allows constants. The argument must match the given structure, including the constants where they appear, or an error results.

-↗️
    Destruct ← { 𝕊 a‿1‿⟨b,2⟩: a≍b }
+↗️
    Destruct ← { 𝕊 a‿1‿⟨b,2⟩: a≍b }
     Destruct       5‿1‿⟨7,2⟩
 ⟨ 5 7 ⟩
 

Special names in headers

-

Any element of a function or modifier header can be left nameless by using the corresponding special name in that position, instead of an identifier. For example, the header 𝕨 𝔽_𝕣_𝔾 𝕩: incorporates as much vagueness as possible. It indicates a deferred 2-modifier, but provides no other information.

+

Any element of a function or modifier header can be left nameless by using the corresponding special name in that position, instead of an identifier. For example, the header 𝕨 𝔽_𝕣_𝔾 𝕩: incorporates as much vagueness as possible. It indicates a deferred 2-modifier, but provides no other information.

The name 𝕨 in this context can refer to either a left argument or no left argument, allowing a header with arguments to be used even for an ambiguous function. Recall that 𝕨 is the only token other than · that can have no value. If an identifier or list is given as the left argument, then the function must be called with a left argument.

Short headers

-

A header does not need to include all inputs, as shown by the F _op_ val: header above. The simplest case, when no inputs are given, is called a label. While it doesn't restrict the inputs, a label specifies the type of the block and gives an internal name that can be used to refer to it.

-
{ b:   # Block
-{ 𝕊:   # Function
-{ _𝕣:  # 1-Modifier
-{ _𝕣_: # 2-Modifier
+

A header does not need to include all inputs, as shown by the F _op_ val: header above. The simplest case, when no inputs are given, is called a label. While it doesn't restrict the inputs, a label specifies the type of the block and gives an internal name that can be used to refer to it.

+
{ b:   # Block
+{ 𝕊:   # Function
+{ _𝕣:  # 1-Modifier
+{ _𝕣_: # 2-Modifier
 

For immediate blocks, this is the only type of header possible, and it must use an identifier as there is no applicable special name. However, the name can't be used: it doesn't make sense to refer to a value while it is still being computed!

Multiple bodies

-

Blocks that define functions and deferred modifiers can include more than one body, separated by semicolons ;. The body used for a particular evaluation is chosen based on the arguments to the block. One special case applies when there are exactly two bodies either without headers or with labels only: in this case, the first applies when there is one argument and the second when there are two.

-↗️
    Ambiv ← { ⟨1,𝕩⟩ ; ⟨2,𝕨,𝕩⟩ }
+

Blocks that define functions and deferred modifiers can include more than one body, separated by semicolons ;. The body used for a particular evaluation is chosen based on the arguments to the block. One special case applies when there are exactly two bodies either without headers or with labels only: in this case, the first applies when there is one argument and the second when there are two.

+↗️
    Ambiv ← { ⟨1,𝕩⟩ ; ⟨2,𝕨,𝕩⟩ }
     Ambiv 'a'
 ⟨ 1 'a' ⟩
     'a' Ambiv 'b'
 ⟨ 2 'a' 'b' ⟩
 

Bodies before the last two must have headers that include arguments. When a block that includes this type of header is called, its headers are checked in order for compatibility with the arguments. The first body with a compatible header is used.

-↗️
    CaseAdd ← { 2𝕊3:0‿5 ; 2𝕊𝕩:⟨1,2+𝕩⟩ ; 𝕊𝕩:2‿𝕩 }
+↗️
    CaseAdd ← { 2𝕊3:0‿5 ; 2𝕊𝕩:⟨1,2+𝕩⟩ ; 𝕊𝕩:2‿𝕩 }
     2 CaseAdd 3
 ⟨ 0 5 ⟩
     2 CaseAdd 4
@@ -206,16 +206,16 @@
 

Case headers

A special rule allows for convenient case-matching syntax for one-argument functions. In any function header with one argument, the function name can be omitted as long as the argument is not a plain identifier: it must be 𝕩 or a compound value like a list to distinguish it from an immediate block label.

Test ← {
-  "abc": "string" ;
-  ⟨2,b⟩: ⌽𝕩       ;
-  5:     "number" ;
-  𝕩:     "default"
+  "abc": "string" ;
+  ⟨2,b⟩: ⌽𝕩       ;
+  5:     "number" ;
+  𝕩:     "default"
 }
 

These case-style headers function exactly the same as if they were preceded by 𝕊, and can be mixed with other kinds of headers.

Predicates

-

Destructuring with a header is quite limited, only allowing matching structure and data with exact equality. A predicate, written with ?, allows you to test an arbitrary property before evaluating the rest of the body, and also serves as a limited kind of control flow. It can be thought of as an extension to a header, so that for example the following function requires the argument to have two elements and for the first to be less than the second before using the first body. Otherwise it moves to the next body, which is unconditional.

-↗️
    CheckPair ← { 𝕊⟨a,b⟩: a<b? "ok" ; "not ok" }
+

Destructuring with a header is quite limited, only allowing matching structure and data with exact equality. A predicate, written with ?, allows you to test an arbitrary property before evaluating the rest of the body, and also serves as a limited kind of control flow. It can be thought of as an extension to a header, so that for example the following function requires the argument to have two elements and for the first to be less than the second before using the first body. Otherwise it moves to the next body, which is unconditional.

+↗️
    CheckPair ← { 𝕊⟨a,b⟩: a<b? "ok" ; "not ok" }
 
     CheckPair ⟨3,8⟩    # Destructures, and 3<8
 "ok"
@@ -224,12 +224,12 @@
     CheckPair ⟨3,¯1⟩   # Not ascending
 "not ok"
 
-

The body where the predicate appears doesn't need to start with a header, and there can be other statements before it. In fact, ? functions just like a separator (like ⋄ or ,) with a side effect.

-↗️
    { r←⌽𝕩 ⋄ 't'=⊑r ? r ; 𝕩 }¨ "test"‿"this"
+

The body where the predicate appears doesn't need to start with a header, and there can be other statements before it. In fact, ? functions just like a separator (like ⋄ or ,) with a side effect.

+↗️
    { r←⌽𝕩 ⋄ 't'=⊑r ? r ; 𝕩 }¨ "test"‿"this"
 ⟨ "tset" "this" ⟩
 
-

So r is the reversed argument, and if its first character (the last one in 𝕩) is 't' then it returns r, and otherwise we abandon that line of reasoning and return 𝕩. This sounds a lot like an if statement. And { a<b ? a ; b }, which computes a⌊b the hard way, shows how the syntax can be similar to a ternary operator. This is an immediate block with multiple bodies, something that makes sense with predicates but not headers. But ?; offers more possibilities. It can support any number of options, with multiple tests for each one: the structure below is "if _ and _ then _; else if _ then _; else _".

-↗️
    Thing ← { 𝕩≥3? 𝕩≤8? 2|𝕩 ; 𝕩=0? @ ; ∞ }
+

So r is the reversed argument, and if its first character (the last one in 𝕩) is 't' then it returns r, and otherwise we abandon that line of reasoning and return 𝕩. This sounds a lot like an if statement. And { a<b ? a ; b }, which computes a⌊b the hard way, shows how the syntax can be similar to a ternary operator. This is an immediate block with multiple bodies, something that makes sense with predicates but not headers. But ?; offers more possibilities. It can support any number of options, with multiple tests for each one: the structure below is "if _ and _ then _; else if _ then _; else _".

+↗️
    Thing ← { 𝕩≥3? 𝕩≤8? 2|𝕩 ; 𝕩=0? @ ; ∞ }
 
     (⊢ ≍ Thing¨) ↕10  # Table of arguments and results
 ┌─                     
@@ -237,8 +237,8 @@
   @ ∞ ∞ 1 0 1 0 1 0 ∞  
                       β”˜
 
-

This structure is still constrained by the rules of block bodies: each instance of ; is a separate scope, so that variables defined before a ? don't survive past the ;.

-↗️
    { 0=n←≠𝕩 ? ∞ ; n } "abc"
+

This structure is still constrained by the rules of block bodies: each instance of ; is a separate scope, so that variables defined before a ? don't survive past the ;.

+↗️
    { 0=n←≠𝕩 ? ∞ ; n } "abc"
 Error: Undefined identifier
 
-

This is the main drawback of predicates relative to guards in APL dfns (also written with ?), while the advantage is that it allows multiple expressions, or extra conditions, after a ?. It's not how I would have designed it if I just wanted to make a syntax for if statements, but it's a natural fit for the header system.

+

This is the main drawback of predicates relative to guards in APL dfns (also written with ?), while the advantage is that it allows multiple expressions, or extra conditions, after a ?. It's not how I would have designed it if I just wanted to make a syntax for if statements, but it's a natural fit for the header system.

diff --git a/docs/doc/control.html b/docs/doc/control.html
index 5abfa166..b4426630 100644
--- a/docs/doc/control.html
+++ b/docs/doc/control.html
@@ -10,26 +10,26 @@

The surfeit of ways to write control structures could be a bit of an issue for reading BQN. My hope is that the community can eventually settle on a smaller set of standard forms to recommend so that you won't have to recognize all the variants given here. On the other hand, the cost of using specialized control structures is lower in a large project without too many contributors. In this case BQN's flexibility allows developers to adapt to the project's particular demands (for example, some programs use switch/case statements heavily but most do not).

The useful control structures introduced here are collected as shortened definitions below. While uses the slightly more complicated implementation that avoids stack overflow, and DoWhile and For are written in terms of it in order to share this property. The more direct versions with linear stack use appear in the main text.

If      ← {𝕏⍟𝕎@}´                 # Also Repeat
-IfElse  ← {c‿T‿F: c◶F‿T@}
+IfElse  ← {c‿T‿F: c◶F‿T@}
 While   ← {𝕩{𝔽⍟𝔾∘𝔽_𝕣_𝔾∘𝔽⍟𝔾𝕩}𝕨@}´  # While 1‿{... to run forever
 DoWhile ← {𝕏@ ⋄ While 𝕨‿𝕩}´
-For     ← {I‿C‿P‿A: I@ ⋄ While⟨C,P∘A⟩}
+For     ← {I‿C‿P‿A: I@ ⋄ While⟨C,P∘A⟩}
 
 # Switch/case statements have many variations; these are a few
 Match   ← {𝕏𝕨}´
 Select  ← {(⊑𝕩)◶(1↓𝕩)@}
 Switch  ← {c←⊑𝕩 ⋄ m‿a←<˘⍉∘‿2⥊1↓𝕩 ⋄ (⊑a⊐C)◶m@}
-Test    ← {fn←{C‿A𝕊e:C◶A‿E}´𝕩⋄Fn@}
+Test    ← {fn←{C‿A𝕊e:C◶A‿E}´𝕩⋄Fn@}
 

Blocks and functions

Control structures are generally defined to work with blocks of code, which they might skip, or execute one or more times. This might sound like a BQN immediate block, which also consists of a sequence of code to execute, but immediate blocks are always executed as soon as they are encountered and can't be manipulated the way that blocks in imperative languages can. They're intended to be used with lexical scoping as a tool for encapsulation. Instead, the main tool we will use to get control structures is the block function.

Using functions as blocks is a little outside their intended purpose, and the fact that they have to be passed an argument and are expected to use it will be a minor annoyance. The following conventions signal a function that ignores its argument and is called purely for the side effects:

  • Pass @ to a function that ignores its argument. It's a nice signal that nothing is happening and is easy to type.
  • -
  • A headerless function that doesn't use an argument will be interpreted as an immediate block by default. Start it with the line 𝕤 to avoid this (it's an instruction to navel gaze: the function contemplates its self, but does nothing about it). Other options like 𝕊:, F:, or 𝕩 also work, but are more visually distracting.
  • +
  • A headerless function that doesn't use an argument will be interpreted as an immediate block by default. Start it with the line 𝕤 to avoid this (it's an instruction to navel gaze: the function contemplates its self, but does nothing about it). Other options like 𝕊:, F:, or 𝕩 also work, but are more visually distracting.

Even with these workarounds, BQN's "niladic" function syntax is quite lightweight, comparing favorably to a low-boilerplate language like JavaScript.

-
fn = ()=>{m+=1;n*=2}; fn()
+
fn = ()=>{m+=1;n*=2}; fn()
 Fn ← {𝕤⋄  m+↩1,n×↩2}, Fn @
 

Control structures are called "statements" below to match common usage, but they are actually expressions, and return a value that might be used later.

@@ -49,7 +49,7 @@

The result of any of these if statements is the result of the action if it's performed, and otherwise it's whatever argument was passed to the statement, which is @ or 10 here.

BQN's syntax for a pure if statement isn't so good, but predicates handle if-else statements nicely. So in most cases you'd forego the definitions above in favor of an if-else with nothing in the else branch:

-
{ a<10 ? a+↩10 ; @ }
+
{ a<10 ? a+↩10 ; @ }
 

Repeat

There's no reason the condition in an if statement from the previous section has to be boolean: it could be any natural number, causing the action to be repeated that many times. If the action is never performed, the result is the statement's argument, and otherwise it's the result of the last time the action was performed.
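The result rule for Repeat can be modelled in plain JavaScript. This is only a sketch of the semantics described above, not BQN's implementation; `repeat` is a hypothetical helper name, and `null` stands in for the `@` argument.

```javascript
// Plain-JavaScript model of the Repeat result rule (hypothetical helper).
// The action runs `count` times; each run receives the previous result.
function repeat(count, action, arg = null) {
  let result = arg;            // returned unchanged when count is 0
  for (let i = 0; i < count; i++) {
    result = action(result);
  }
  return result;
}

let m = 1;
const last = repeat(3, () => (m *= 2));         // action performed three times
const none = repeat(0, () => (m *= 2), "arg");  // action never performed
console.log(last, none);
```

With a count of 0 the argument comes back untouched, which matches the if statement case where the condition is false.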

@@ -57,14 +57,14 @@

If-Else

In most cases, the easy way to write an if-else statement is with predicates:

{
-  threshold < 6 ?
-  a ↩ Small threshold ;  # If predicate was true
+  threshold < 6 ?
+  a ↩ Small threshold ;  # If predicate was true
   b ↩ 1 Large threshold  # If it wasn't
 }
 

We might also think of an if-else statement as a kind of switch-case statement, where the two cases are true (1) and false (0). As a result, we can implement it either with Choose (◶) or with case headers of 1 and 0.

When using Choose, note that the natural ordering places the false case before the true one to match list index ordering. To get the typical if-else order, the condition should be negated or the statements reversed. Here's a function to get an if-else statement by swapping the conditions, and two ways its application might be written.

-
IfElse ← {cond‿True‿False: cond◶False‿True @}
+
IfElse ← {cond‿True‿False: cond◶False‿True @}
 
 IfElse ⟨𝕩<mid⊑𝕨
   {𝕤⋄ hi↩mid}
@@ -79,22 +79,22 @@
 

Case headers have similar syntax, but the two cases are labelled explicitly. In this form, the two actions are combined in a single function, which could be assigned to call it on various conditions.

{𝕏𝕨}´ (𝕩<mid⊑𝕨)‿{
-  1: hi↩mid
-;
-  0: lo↩mid
+  1: hi↩mid
+;
+  0: lo↩mid
 }
 

The result of an if-else statement is just the result of whichever branch was used; chained if-else and switch-case statements will work the same way.
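The index ordering that Choose imposes can be sketched in JavaScript, assuming a list of branch functions indexed by the condition. The names `choose` and `ifElse` are hypothetical and only model the idea; they are not part of any BQN API.

```javascript
// Model of Choose-style selection: index 0 is the false branch, 1 the true one.
const choose = (index, branches) => branches[index]();

// Takes the branches in the familiar if-else order and swaps them for choose.
const ifElse = (cond, onTrue, onFalse) =>
  choose(Number(cond), [onFalse, onTrue]);

let hi = 10, lo = 0;
const mid = 5;
// The result is whatever the chosen branch returns.
const result = ifElse(3 < mid, () => (hi = mid), () => (lo = mid));
console.log(result, hi, lo);
```

Only the selected branch runs, which is why the branches are passed as functions rather than as already-evaluated values.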

Chained If-Else

One pattern in imperative languages is to check one condition and apply an action if it succeeds, but check a different condition if it fails, in sequence until some condition succeeds or every one has been checked. Languages might make this pattern easier by making if-else right associative, so that the programmer can write an if statement followed by a sequence of else if "statements", or might just provide a unified elif keyword that works similarly. BQN's predicates work really well for this structure:

{
-  a<b ? a+↩1 ;
-  a<c ? c-↩1 ;
+  a<b ? a+↩1 ;
+  a<c ? c-↩1 ;
         a-↩2
 }
 

For a function-based approach, it's possible to nest IfElse expressions, but it's also possible to write a control structure that chains them all at one level. For this statement the input will be a sequence of ⟨Test,Action⟩ pairs, followed by a final action to perform if no test succeeds. The first test is always performed; other tests should be wrapped in blocks because otherwise they'll be executed even if an earlier test succeeded.

-
Test ← {fn←{Cond‿Act 𝕊 else: Cond◶Else‿Act}´𝕩 ⋄ Fn@}
+
Test ← {fn←{Cond‿Act 𝕊 else: Cond◶Else‿Act}´𝕩 ⋄ Fn@}
 
 Test ⟨
   (  a<b)‿{𝕤⋄a+↩1}
@@ -107,11 +107,11 @@
 
Match ← {𝕏𝕨}´
 
 Match value‿{
-  0‿b: n-↩b
-;
-  a‿b: n+↩a-b
-;
-  𝕩: n∾↩𝕩
+  0‿b: n-↩b
+;
+  a‿b: n+↩a-b
+;
+  𝕩: n∾↩𝕩
 }
 

A simplified version of a switch-case statement is possible if the cases are natural numbers 0, 1, and so on. The Choose (◶) modifier does just what we want. The Select statement below generalizes IfElse, except that it doesn't rearrange the cases relative to Choose while IfElse swaps them.

@@ -144,7 +144,7 @@ } arg

To convert this to a control structure format, we want to take an action A, and produce a function that runs A, then runs itself. Finally we want to call that function on some argument, say @. The argument is a single function, so to call Forever, we need to convert that function to a subject role.

-
Forever ← {𝕊a:{𝕊A𝕩}@}
+
Forever ← {𝕊a:{𝕊A𝕩}@}
 
 Forever 1⊑@‿{𝕤
   # Stuff to do forever
@@ -184,13 +184,13 @@
 Fn¨ ⌽↕n    # for (𝕩=n; --𝕩; )
 

Very well… a for loop is just a while loop with some extra pre- and post-actions.

-
For ← {Pre‿Cond‿Post‿Act: Pre@ ⋄ {𝕊∘Post∘Act⍟Cond 𝕩}@}
+
For ← {Pre‿Cond‿Post‿Act: Pre@ ⋄ {𝕊∘Post∘Act⍟Cond 𝕩}@}
 
 For (c←27⊣n←0)‿{𝕤⋄1<c}‿{𝕤⋄n+↩1}‿{𝕤
   {𝕏𝕨}´ (2|c)‿{
-    0: c÷↩2
-  ;
-    1: c↩1+3×c
+    0: c÷↩2
+  ;
+    1: c↩1+3×c
   }
 }
 
@@ -199,9 +199,9 @@
c←27 ⋄ n←0
 While ⟨{𝕤⋄1<c}, {𝕤⋄n+↩1}{𝔾∘𝔽}{𝕤
   Match (2|c)‿{
-    0: c÷↩2
-  ;
-    1: c↩1+3×c
+    0: c÷↩2
+  ;
+    1: c↩1+3×c
   }
 }⟩
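For comparison, here is the same loop written out in JavaScript: a sketch that treats a for loop as a while loop with a pre-action and a post-action, as described above. `forLoop` is a hypothetical helper, not something BQN or its JS embedding provides.

```javascript
// A for loop as a while loop with pre- and post-actions (hypothetical helper).
function forLoop(pre, cond, post, act) {
  pre();
  while (cond()) {
    act();
    post();   // runs after every action, like P∘A in the BQN version
  }
}

let c, n;
forLoop(
  () => { c = 27; n = 0; },                        // initialize
  () => 1 < c,                                     // continue while c exceeds 1
  () => { n += 1; },                               // count the step
  () => { c = c % 2 === 0 ? c / 2 : 1 + 3 * c; }   // halve if even, else 3c+1
);
console.log(c, n);
```

Unlike the recursive BQN definition of For, a native `while` loop uses constant stack space.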
 
diff --git a/docs/doc/couple.html b/docs/doc/couple.html index 5f9b8dd5..1f71bb19 100644 --- a/docs/doc/couple.html +++ b/docs/doc/couple.html @@ -78,5 +78,5 @@

A note on the topic of Solo and Couple applied to units. As always, one axis will be added, so that the result is a list (strangely, J's laminate differs from Couple in this one case, as it will add an axis to get a shape 2‿1 result). For Solo, this is interchangeable with Deshape (⥊), and either primitive might be chosen for stylistic reasons. For Couple, it is equivalent to Join-to (∾), but this is an irregular form of Join-to because it is the only case where Join-to adds an axis to both arguments instead of just one. Couple should be preferred in this case.

The function Pair (⋈) can be written ≍○<, while ≍ in either valence is >∘⋈. As an interesting consequence, ≍ ←→ >∘≍○<, and ⋈ ←→ >∘⋈○<. These two identities have the same form because adding ○< commutes with adding >∘.

Definitions

-

As discussed above, ≍ is equivalent to >{⟨𝕩⟩;⟨𝕨,𝕩⟩}. To complete the picture we should describe Merge fully. Merge is defined on an array argument 𝕩 such that there's some shape s satisfying ∧´⥊(s≡≢)¨𝕩. If 𝕩 is empty then any shape satisfies this expression; s should be chosen based on known type information for 𝕩 or otherwise assumed to be ⟨⟩. If s is empty then 𝕩 is allowed to contain atoms as well as unit arrays, and these will be implicitly promoted to arrays by the ⊑ indexing used later. We construct the result by combining the outer and inner axes of the argument with Table; since the outer axes come first they must correspond to the left argument and the inner axes must correspond to the right argument. 𝕩 is a natural choice of left argument, and because no concrete array can be used, the right argument will be ↕s, the array of indices into any element of 𝕩. To get the appropriate element corresponding to a particular choice of index and element of 𝕩 we should select using that index. The result of Merge is 𝕩⊑˜⌜↕s.

+

As discussed above, ≍ is equivalent to >{⟨𝕩⟩;⟨𝕨,𝕩⟩}. To complete the picture we should describe Merge fully. Merge is defined on an array argument 𝕩 such that there's some shape s satisfying ∧´⥊(s≡≢)¨𝕩. If 𝕩 is empty then any shape satisfies this expression; s should be chosen based on known type information for 𝕩 or otherwise assumed to be ⟨⟩. If s is empty then 𝕩 is allowed to contain atoms as well as unit arrays, and these will be implicitly promoted to arrays by the ⊑ indexing used later. We construct the result by combining the outer and inner axes of the argument with Table; since the outer axes come first they must correspond to the left argument and the inner axes must correspond to the right argument. 𝕩 is a natural choice of left argument, and because no concrete array can be used, the right argument will be ↕s, the array of indices into any element of 𝕩. To get the appropriate element corresponding to a particular choice of index and element of 𝕩 we should select using that index. The result of Merge is 𝕩⊑˜⌜↕s.

Given this definition we can also describe Rank (⎉) in terms of Each (¨) and the simpler monadic function Enclose-Rank <⎉k. We assume effective ranks j for 𝕨 (if present) and k for 𝕩 have been computed. Then the correspondence is 𝕨F⎉k𝕩 ←→ >(<⎉j𝕨)F¨(<⎉k𝕩).

diff --git a/docs/doc/embed.html b/docs/doc/embed.html index bdd03679..b02342cc 100644 --- a/docs/doc/embed.html +++ b/docs/doc/embed.html @@ -9,16 +9,16 @@

There is only one mechanism to interface between the host language and BQN: the function bqn evaluates a string containing a BQN program and returns the result. Doesn't sound like much, especially considering these programs can't share any state such as global variables (BQN doesn't have those). But taking first-class functions and closures into account, it's all you could ever need!

Passing closures

Probably you can figure out the easy things like calling bqn("×´1+↕6") to compute six factorial. But how do you get JS and BQN to talk to each other, for example to compute the factorial of a number n? Constructing a source string with bqn("×´1+↕"+n) isn't the best way; in fact I would recommend you never use this strategy.

-

Instead, return a function from BQN and call it: bqn("{×´1+↕𝕩}")(n). This strategy also has the advantage that you can store the function, so that it will only be compiled once. Define let fact = bqn("{×´1+↕𝕩}"); at the top of your program and use it as a function elsewhere.

+

Instead, return a function from BQN and call it: bqn("{×´1+↕𝕩}")(n). This strategy also has the advantage that you can store the function, so that it will only be compiled once. Define let fact = bqn("{×´1+↕𝕩}"); at the top of your program and use it as a function elsewhere.

BQN can also call JS functions, to use functionality that isn't native to BQN or interact with a program written in JS. For example, bqn("{𝕏'a'+↕26}")(alert) calls the argument alert from within BQN. The displayed output isn't quite right here, because a BQN string is stored as a JS array, not a string. See the next section for more information.

Cool, but none of these examples really use closures, just self-contained functions. Closures are functions that use outside state, which is maintained over the course of the program. Here's an example program that defines i and then returns a function that manipulates i and returns its new value.

let push = bqn(`
     i←4⥊0
     {i+↩𝕩»i}
-`);
-push(3);    // [3,0,0,0]
-push(-2);   // [1,3,0,0]
-push(4);    // [5,4,3,0]
+`);
+push(3);    // [3,0,0,0]
+push(-2);   // [1,3,0,0]
+push(4);    // [5,4,3,0]
 

Note that this program doesn't have any outer braces. It's only run once, and it initializes i and returns a function. Just putting braces around it wouldn't have any effect (it just changes it from a program that does something to a program that runs a block that does the same thing), but adding braces and using 𝕨 or 𝕩 inside them would turn it into a function that could be run multiple times to create different closures. For example, pushGen = bqn("{i←4⥊𝕩⋄{i+↩𝕩»i}}") causes pushGen(n) to create a new closure with i initialized to 4⥊n.
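To see what the closure is doing without running BQN at all, here is a plain-JavaScript analogue of push and pushGen. It is a sketch only: it does not call bqn(), `makePush` is a hypothetical name, and the array arithmetic is written out by hand.

```javascript
// Plain-JavaScript analogue of the BQN closure; it does not call bqn().
// The state `i` lives in the enclosing scope, one copy per makePush call.
function makePush(init = 0) {
  let i = [init, init, init, init];           // i←4⥊init
  return (x) => {
    const shifted = [x, ...i.slice(0, -1)];   // 𝕩»i: shift x in from the left
    i = i.map((v, k) => v + shifted[k]);      // i+↩ the shifted list
    return i;
  };
}

const push = makePush();
console.log(push(3), push(-2), push(4));
```

Each call to `makePush` plays the role of pushGen: it creates a fresh `i`, so two closures made this way don't share state.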

The program also returns only one function, which can be limiting. But it's possible to get multiple closures out of the same program by returning a list of functions. For example, the following program defines three functions that manipulate a shared array in different ways.

@@ -28,7 +28,7 @@ RotY ← {a↩𝕩⌽a} Flip ← {𝕤⋄a↩⍉a} RotX‿RotY‿Flip -`); +`);

When defining closures for their side effects like this, make sure they are actually functions! For example, since flip ignores its argument (you can call it with flip(), because a right argument of undefined isn't valid but will just be ignored), it needs an extra 𝕤 in the definition to be a function instead of an immediate block.

You can also use an array to pass multiple functions or other values from JS into BQN all at once. However, a JS array can't be used directly in BQN because its shape isn't known. The function list() converts a JS array into a BQN list by using its length for the shape; the next section has a few more details.

diff --git a/docs/doc/expression.html b/docs/doc/expression.html index 8676b3d9..6b7f588c 100644 --- a/docs/doc/expression.html +++ b/docs/doc/expression.html @@ -22,7 +22,7 @@ -w? +w? F x Subject @@ -30,7 +30,7 @@ RtL, looser -F? +F? G H Function @@ -56,7 +56,7 @@

The four roles (subject, function, two kinds of modifier) describe expressions, not values. When an expression is evaluated, the value's type doesn't have to correspond to its role, and can even change from one evaluation to another. An expression's role is determined entirely by its source code, so it's fixed.

-

In the table, ? marks an optional left argument. If there isn't a value in that position, or it's Nothing (·), the middle function will be called with only one argument.

+

In the table, ? marks an optional left argument. If there isn't a value in that position, or it's Nothing (·), the middle function will be called with only one argument.

If you're comfortable reading BNF and want to understand things in more detail than described below, you might check the grammar specification as well.

Syntactic role

This issue is approached from a different angle in Context free grammar.

diff --git a/docs/doc/fromDyalog.html b/docs/doc/fromDyalog.html index ac3d4974..1f256416 100644 --- a/docs/doc/fromDyalog.html +++ b/docs/doc/fromDyalog.html @@ -286,7 +286,7 @@ ! Γ—Β΄1+↕ -˜(+Γ·β—‹(Γ—Β΄)⊒)1+β†•βˆ˜βŠ£ β—‹ Ο€βŠΈΓ— β€’math ~ Β¬ ¬∘∊/⊣ - ? β€’rand.Rangeβš‡0 β€’rand.Deal + ? β€’rand.Rangeβš‡0 β€’rand.Deal ⍲ ¬∘∧ ⍱ ¬∘∨ ⍴ β‰’ β₯Š diff --git a/docs/doc/fromJ.html b/docs/doc/fromJ.html index a0dfde9e..524ec8d3 100644 --- a/docs/doc/fromJ.html +++ b/docs/doc/fromJ.html @@ -61,18 +61,18 @@ ' creates characters -=. and =: +=. and =: ← and ↩ ← to define; ↩ to modify -3 :… or {{…}} +3 :… or {{…}} {…} -: -; +: +; To separate function cases @@ -96,7 +96,7 @@ -[: +[: Β· Cap @@ -133,11 +133,11 @@ * % ^ -%: +%: <. >. -<: ->: +<: +>: [ ] @@ -156,10 +156,10 @@ J -,: -,&:< +,: +,&:< |. -|: +|: @@ -181,15 +181,15 @@ Monad -/:~ -\:~ +/:~ +\:~ -. #@$ # L. $ , -; +; Dyad @@ -197,9 +197,9 @@ +. +-. = -~: --: --.@-: +~: +-: +-.@-: $ , @@ -256,22 +256,22 @@ Monad -/: -/: +/: +/: {. -0{::, +0{::, i.~~. … -~: +~: ~. </.i.@# Dyad I. -I.&:- +I.&:- { -{:: +{:: i. … e. @@ -300,12 +300,12 @@ J "_ ~ -@: -&: -&.: -: +@: +&: +&.: +: @. -:: +:: @@ -336,9 +336,9 @@ /\ "_1 " -L: -^: -^:_1 +L: +^: +^:_1 @@ -357,13 +357,13 @@ @+↕256 -a: +a: <↕0

Functions + - | < > are the same in both languages.

-

Some other primitives are essentially the same in J and BQN, but with different spellings (but transpose behaves differently; J's dyadic |: is more like ⍉⁼):

+

Some other primitives are essentially the same in J and BQN, but with different spellings (but transpose behaves differently; J's dyadic |: is more like ⍉⁼):

@@ -372,13 +372,13 @@ - + - + @@ -403,14 +403,14 @@ - - - - + + + + - - - + + + @@ -444,12 +444,12 @@ - + - + @@ -459,7 +459,7 @@ - + @@ -469,7 +469,7 @@ - + @@ -479,7 +479,7 @@ - + @@ -499,7 +499,7 @@ - + @@ -514,12 +514,12 @@ - + - + @@ -544,12 +544,12 @@ - + - + @@ -564,12 +564,12 @@ - + - + @@ -579,7 +579,7 @@ - + @@ -589,12 +589,12 @@ - + - + @@ -614,7 +614,7 @@ - + @@ -673,11 +673,11 @@ - - + + - + diff --git a/docs/doc/glossary.html b/docs/doc/glossary.html index 1220fa6c..75048532 100644 --- a/docs/doc/glossary.html +++ b/docs/doc/glossary.html @@ -139,8 +139,8 @@
  • Block modifier: A block defining a 1- or 2-modifier.
  • Immediate modifier: A modifier that's evaluated as soon as it receives its operands.
  • Deferred modifier: The opposite of an immediate modifier, one that's only evaluated when called with operands and arguments.
  • -
  • Header: A preface to a body in a block function or modifier indicating possible inputs, which is followed by a colon :.
  • +
  • Header: A preface to a body in a block function or modifier indicating possible inputs, which is followed by a colon :.
  • Label: A header consisting of a single name.
  • -
  • Body: One sequence of statements in a block. Bodies, possibly preceded by headers, are separated by semicolons ;.
  • +
  • Body: One sequence of statements in a block. Bodies, possibly preceded by headers, are separated by semicolons ;.
  • Tacit: Code that defines functions or modifiers without using blocks.
  • diff --git a/docs/doc/oop.html b/docs/doc/oop.html index af0daaa2..66df070f 100644 --- a/docs/doc/oop.html +++ b/docs/doc/oop.html @@ -71,7 +71,7 @@ View⇐{𝕤l} - Move⇐{from‿to: + Move⇐{from‿to:l↩Transfer´⌾(𝕩⊸⊏)⍟(≠´𝕩)l}# Move a disk from 𝕨 to 𝕩 @@ -137,7 +137,7 @@ Undo ⇐ t.Move∘⌽∘Pop } -

    This class composes a Tower of Hanoi with an undo stack that stores previous moves. To undo a move from a to b, it moves from b to a, although if you felt really fancy you might define Move⁼ in towerOfHanoi instead with 𝕊⁼𝕩: 𝕊⌽𝕩.

    +

    This class composes a Tower of Hanoi with an undo stack that stores previous moves. To undo a move from a to b, it moves from b to a, although if you felt really fancy you might define Move⁼ in towerOfHanoi instead with 𝕊⁼𝕩: 𝕊⌽𝕩.

    It's also possible to copy several variables and only export some of them, with an export statement. For example, if I wasn't going to make another method called Move, I might have written View‿Move ← towerOfHanoi and then View⇐. In fact, depending on your personal style and how complicated your classes are, you might prefer to avoid inline ⇐ exports entirely, and declare all the exports at the top.

    Self-reference

    An object's class is given by 𝕊. Remember, a class is an ordinary BQN function! It might be useful for an object to produce another object of the same class (particularly if it's immutable), and an object might also expose a field class⇐𝕤 to test whether an object o belongs to a class c with o.class = c.

    diff --git a/docs/doc/pair.html b/docs/doc/pair.html index a4bf9b75..12165060 100644 --- a/docs/doc/pair.html +++ b/docs/doc/pair.html @@ -62,4 +62,4 @@ ↗️
        4 ↑ "a"‿5 ⋈ "b"‿7
     ⟨ ⟨ "a" 5 ⟩ ⟨ "b" 7 ⟩ ⟨ " " 0 ⟩ ⟨ " " 0 ⟩ ⟩
     
    -

    This means that ⋈ may not always behave the same as the obvious implementation {⟨𝕩⟩;⟨𝕨,𝕩⟩}. However, ≍○< and even >∘{⟨𝕩⟩;⟨𝕨,𝕩⟩}○< compute the result fill as ⋈ does and are identical implementations.

    +

    This means that ⋈ may not always behave the same as the obvious implementation {⟨𝕩⟩;⟨𝕨,𝕩⟩}. However, ≍○< and even >∘{⟨𝕩⟩;⟨𝕨,𝕩⟩}○< compute the result fill as ⋈ does and are identical implementations.

    diff --git a/docs/doc/primitive.html b/docs/doc/primitive.html index 05cb3dbc..2f39dda5 100644 --- a/docs/doc/primitive.html +++ b/docs/doc/primitive.html @@ -468,7 +468,7 @@
    - + diff --git a/docs/doc/rebqn.html b/docs/doc/rebqn.html index 81a00418..932a647f 100644 --- a/docs/doc/rebqn.html +++ b/docs/doc/rebqn.html @@ -75,4 +75,4 @@ ⟨ 0 1 2 0 ⟩

    Above, ^ becomes a 1-modifier, so that it modifies % rather than being called directly on 1‿2 as a function.

    -

    The glyph can be any character that's not being used by BQN already. Characters like c or ⟩ or : will result in an error, as they'd break BQN syntax. Other than that, the sky's the limit! Or rather, the Unicode consortium is the limit. If they don't recognize your symbol, you're going to have to petition to make it an emoji or something. Oh well.

    +

    The glyph can be any character that's not being used by BQN already. Characters like c or ⟩ or : will result in an error, as they'd break BQN syntax. Other than that, the sky's the limit! Or rather, the Unicode consortium is the limit. If they don't recognize your symbol, you're going to have to petition to make it an emoji or something. Oh well.

    diff --git a/docs/doc/syntax.html b/docs/doc/syntax.html index 6eed337e..460fc460 100644 --- a/docs/doc/syntax.html +++ b/docs/doc/syntax.html @@ -69,15 +69,15 @@ - + - + - + @@ -140,7 +140,7 @@ - + @@ -149,7 +149,7 @@ - + diff --git a/docs/doc/undo.html b/docs/doc/undo.html index f2082a10..3a2ddcba 100644 --- a/docs/doc/undo.html +++ b/docs/doc/undo.html @@ -45,8 +45,8 @@

    Undo headers

    Of course BQN will never be able to invert all the functions you could write (if it could you could earn a lot of bitcoins, among other feats). But it does recognize some header forms that you can use to specify the inverse of a block function. BQN will trust you and won't verify the results your specified inverse gives.

    {
    -  𝕊𝕩:  𝕩÷1+𝕩 ;
    -  𝕊⁼𝕩: 𝕩÷1-𝕩
    +  𝕊𝕩:  𝕩÷1+𝕩 ;
    +  𝕊⁼𝕩: 𝕩÷1-𝕩
     }
     

    The above function could also be defined with the automatically invertible 1⊸+⌾÷, but maybe there's a numerical reason to use the definition above. Like a normal header, an undo header reflects the normal use, but it includes ⁼ and possibly ˜ in addition to the function and arguments.
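    Since BQN won't verify a specified inverse, it's worth checking one by hand. Algebraically, with y = x/(1+x) we get 1-y = 1/(1+x), so y/(1-y) = x. The JavaScript below is a quick numeric sanity check of the same thing; the names `f` and `fInv` are just for illustration.

```javascript
// Numeric check that the undo header's body really inverts the main body.
// BQN evaluates right to left, so 𝕩÷1+𝕩 means x / (1 + x).
const f = (x) => x / (1 + x);      // 𝕊𝕩:  𝕩÷1+𝕩
const fInv = (y) => y / (1 - y);   // 𝕊⁼𝕩: 𝕩÷1-𝕩

for (const x of [0, 0.5, 2, 10]) {
  const roundTrip = fInv(f(x));
  // Compare with a tolerance, since floating point division isn't exact.
  console.log(x, Math.abs(roundTrip - x) < 1e-9);
}
```

    The inverse is undefined at y = 1, which f never reaches for finite x, so that input is left out of the check.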

    diff --git a/docs/editors/index.html b/docs/editors/index.html index 6ce04840..8aa9bb6e 100644 --- a/docs/editors/index.html +++ b/docs/editors/index.html @@ -15,11 +15,11 @@

    System-wide

    XKB (Unix)

    The file bqn is for configuring XKB on Linux, or other systems using X11. To use, copy it to /usr/share/X11/xkb/symbols/, then run

    -
    $ setxkbmap -layout us,bqn -option grp:switch
    +
    $ setxkbmap -layout us,bqn -option grp:switch
     

    replacing us with your ordinary keyboard layout. switch indicates the right alt key and can be replaced with lswitch for left alt or other codes. The setting will go away on shutdown so you will probably want to configure it to run every time you start up. The way to do this depends on your desktop environment. For further discussion, see Wikipedia or the APL Wiki.

    Another XKB option is to place XCompose (possibly with adjustments) in ~/.XCompose and enable a compose key. This can be done using either OS-specific settings or the following command:

    -
    $ setxkbmap -option compose:rwin
    +
    $ setxkbmap -option compose:rwin
     

    Windows

    Folder autohotkey-win contains an AutoHotKey script and the generated .exe file. It runs as an ordinary program that recognizes BQN key combinations system-wide, using the right alt key (to change this, replace RAlt in the script and rebuild). Move it to the startup folder if you'd like to have it running all the time. You can right-click its icon in the system tray to disable it temporarily.

    @@ -39,11 +39,11 @@
      au! BufRead,BufNewFile *.bqn setf bqn
       au! BufRead,BufNewFile * if getline(1) =~ '^#!.*bqn$' | setf bqn | endif
     
    -

    Include syntax on in your .vimrc for syntax highlighting and filetype plugin on for keyboard input. View docs from vim with :help bqn.

    +

    Include syntax on in your .vimrc for syntax highlighting and filetype plugin on for keyboard input. View docs from vim with :help bqn.

    To use vim-plug to install BQN support for vim, add this to your plugin section of your .vimrc:

    -
      Plug 'mlochbaum/BQN', {'rtp': 'editors/vim'}
    +
      Plug 'mlochbaum/BQN', {'rtp': 'editors/vim'}
     
    -

    Then run :PlugInstall.

    +

    Then run :PlugInstall.

    Neovim interactivity

    See this repository for an additional plugin that provides bindings to run BQN code as you're editing it.

    Emacs

    diff --git a/docs/help/currentfunction.html b/docs/help/currentfunction.html index cb3c3d60..3d884422 100644 --- a/docs/help/currentfunction.html +++ b/docs/help/currentfunction.html @@ -9,7 +9,7 @@

    →full documentation

    A variable assigned to the current function block. 𝕤 accesses the same value but has a subject role.

    𝕊 can be used for recursion.

    -↗️
        F ← {𝕊 0: 1; 𝕩 × 𝕊 𝕩-1} # Factorial
    +↗️
        F ← {𝕊 0: 1; 𝕩 × 𝕊 𝕩-1} # Factorial
         F 5
     120
     
    diff --git a/docs/help/nothing.html b/docs/help/nothing.html
    index 09188be2..b83a26c4 100644
    --- a/docs/help/nothing.html
    +++ b/docs/help/nothing.html
    @@ -17,7 +17,7 @@
     

    In Block Headers

    For Block header pattern matching syntax, Nothing can be used to indicate an unused value.

    -↗️
        F ← {𝕊 a‿·‿b: a∾b}
    +↗️
        F ← {𝕊 a‿·‿b: a∾b}
     
         F 1‿2‿3
     ⟨ 1 3 ⟩
    diff --git a/docs/implementation/kclaims.html b/docs/implementation/kclaims.html
    index 26bbe7ca..d99d8ea4 100644
    --- a/docs/implementation/kclaims.html
    +++ b/docs/implementation/kclaims.html
    @@ -41,35 +41,35 @@
            [Cycles where a code fetch is stalled due to L1 instruction cache miss]
     

    That's just the whole cost (in cycles) of L1 misses, exactly what we want! First I'll run this on a J program I have lying around, building my old Honors thesis with JtoLaTeX.

    -
     Performance counter stats for 'jlatex document.jtex nopdf':
    +
     Performance counter stats for 'jlatex document.jtex nopdf':
     
    -     1,457,284,402      cycles:u
    -        56,485,452      icache_16b.ifdata_stall:u
    -         2,254,192      cache-misses:u
    -        37,849,426      L1-dcache-load-misses:u
    -        28,797,332      L1-icache-load-misses:u
    +     1,457,284,402      cycles:u
    +        56,485,452      icache_16b.ifdata_stall:u
    +         2,254,192      cache-misses:u
    +        37,849,426      L1-dcache-load-misses:u
    +        28,797,332      L1-icache-load-misses:u
     
            0.557255985 seconds time elapsed
     

    Here's the BQN call that builds CBQN's object code sources:

    -
     Performance counter stats for './genRuntime /home/marshall/BQN/':
    +
     Performance counter stats for './genRuntime /home/marshall/BQN/':
     
    -       241,224,322      cycles:u
    -         5,452,372      icache_16b.ifdata_stall:u
    -           829,146      cache-misses:u
    -         6,954,143      L1-dcache-load-misses:u
    -         1,291,804      L1-icache-load-misses:u
    +       241,224,322      cycles:u
    +         5,452,372      icache_16b.ifdata_stall:u
    +           829,146      cache-misses:u
    +         6,954,143      L1-dcache-load-misses:u
    +         1,291,804      L1-icache-load-misses:u
     
            0.098228740 seconds time elapsed
     

    And the Python-based font tool I use to build font samples for this site:

    -
     Performance counter stats for 'pyftsubset […more stuff]':
    +
     Performance counter stats for 'pyftsubset […more stuff]':
     
    -       499,025,775      cycles:u
    -        24,869,974      icache_16b.ifdata_stall:u
    -         5,850,063      cache-misses:u
    -        11,175,902      L1-dcache-load-misses:u
    -        11,784,702      L1-icache-load-misses:u
    +       499,025,775      cycles:u
    +        24,869,974      icache_16b.ifdata_stall:u
    +         5,850,063      cache-misses:u
    +        11,175,902      L1-dcache-load-misses:u
    +        11,784,702      L1-icache-load-misses:u
     
            0.215698059 seconds time elapsed
     
    @@ -84,13 +84,13 @@

    So, roughly 4%, 2%, and 5%. The cache miss counts are also broadly in line with these numbers. Note that full cache misses are pretty rare, so that most misses just hit L2 or L3 and don't suffer a large penalty. Also note that instruction cache misses are mostly lower than data misses, as expected.
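    The arithmetic behind those figures is easy to redo, assuming each percentage is the icache_16b.ifdata_stall count divided by total cycles from the perf output above. The object layout and names below are only for illustration.

```javascript
// Share of cycles stalled on L1 instruction-cache misses, per program.
// Counts are copied from the perf output above.
const runs = {
  jlatex:     { stall: 56485452, cycles: 1457284402 },
  genRuntime: { stall: 5452372,  cycles: 241224322 },
  pyftsubset: { stall: 24869974, cycles: 499025775 },
};

for (const [name, { stall, cycles }] of Object.entries(runs)) {
  const percent = (100 * stall) / cycles;
  console.log(name, percent.toFixed(1) + "%");
}
```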

    Don't get me wrong, I'd love to improve performance even by 2%. But it's not exactly world domination, is it? And it doesn't matter how cache-friendly K is, that's the absolute limit.

    For comparison, here's ngn/k (which does aim for a small executable) running one of its unit tests: test 19 in the a20/ folder, chosen because it's the longest-running of those tests.

    -
     Performance counter stats for '../k 19.k':
    +
     Performance counter stats for '../k 19.k':
     
    -     3,341,989,998      cycles:u
    -        21,136,960      icache_16b.ifdata_stall:u
    -           336,847      cache-misses:u
    -        10,748,990      L1-dcache-load-misses:u
    -        20,204,548      L1-icache-load-misses:u
    +     3,341,989,998      cycles:u
    +        21,136,960      icache_16b.ifdata_stall:u
    +           336,847      cache-misses:u
    +        10,748,990      L1-dcache-load-misses:u
    +        20,204,548      L1-icache-load-misses:u
     
            1.245378356 seconds time elapsed
     
    diff --git a/docs/implementation/vm.html b/docs/implementation/vm.html index 41bfee29..7a9db034 100644 --- a/docs/implementation/vm.html +++ b/docs/implementation/vm.html @@ -33,7 +33,7 @@

    The last property can be a single number or a list of lists. A single number indicates the body to be executed, and is used only for blocks with exactly one body. If it's a list of lists, the length is 1 for a block without arguments and 2 or more for a block with arguments (function or deferred modifier). Each element is a list of body indices. After selecting the appropriate list, execution begins at the first body in the appropriate list, moving to the next one if a header test (SETH or PRED instruction) fails. If a test fails but there's no next body, block evaluation is an error.

    The five possible cases for a function are monadic, dyadic, inverse monadic (𝕊⁼x), inverse dyadic (w𝕊⁼x), and swapped-inverse dyadic (x𝕊˜⁼w). The first two will always be provided, while the remaining three typically don't exist as they have to be specified with undo headers. The smallest length that covers all possible cases will be used.

    Bodies

    -

    Bodies in a block are separated by ;. Each entry in bodies is a list containing:

    +

    Bodies in a block are separated by ;. Each entry in bodies is a list containing:

    • Starting index in code
    • Number of variables the block needs to allocate
    • @@ -43,7 +43,7 @@

      The starting index refers to the position in bytecode where execution starts in order to evaluate the block. Different bodies will always have the same set of special names, but the variables they define are unrelated, so of course they can have different counts. The given number of variables includes special names, but list of names and export mask don't.

      The program's symbol list is included in the tokenization information t: it is 0⊑2⊑t. Since the entire program (the source code passed in one compiler call) uses this list, namespace field accesses can be performed with indices alone within a program. The symbol list is needed for cross-program access, for example if •BQN returns a namespace.

      Instructions

      -

      The following instructions are defined (those without names are tentatively reserved only). The ones emitted by the self-hosted BQN compiler are marked in the "used" column. Instructions marked "NS" are used only in programs with namespaces, and those marked "HE" are used only with headers : or predicates ?. Only those marked "X" are needed to support the compiler and self-hosted runtime.

      +

      The following instructions are defined (those without names are tentatively reserved only). The ones emitted by the self-hosted BQN compiler are marked in the "used" column. Instructions marked "NS" are used only in programs with namespaces, and those marked "HE" are used only with headers : or predicates ?. Only those marked "X" are needed to support the compiler and self-hosted runtime.

    % ^ ^.%:%: <. >. [ ] |.|:|:
    J ~@:&:&.::@:&:&.:: "L:^:::L:^:::
    =
    <:<: -⟜1 ≤
    >:>: 1⊸+ ≥
    ∨
    +:+: 2⊸× ¬∨
    ∧
    *:*: ×˜ ¬∧
    ¬∘∊/⊣
    -:-: ÷⟜2 ≡
    ~:~: ∊ ≠
    ∾˘
    ,:,: ≍
    ;; ∾ ∾⟜(<⍟(1≥≡))
    -˜(+÷○(×´)⊢)1+↕∘⊣
    /:/: ⍋ ⍋⊸⊏
    \:\: ⍒ ⍒⊸⊏
    ↑
    {:{: ⊢˝
    {::{:: ⊑
    ↓
    }:}: ¯1⊸↓
    ":": •Fmt
    ?? •rand.Range⚇0 •rand.Deal
    ⊐
    i:i: {𝕩-˜↕1+2×𝕩} ≠∘⊣-1+⌽⊸⊐
    x F˝∘G⎉1‿∞ y
    F :. G{𝕊: 𝕨F𝕩; 𝕊⁼: 𝕨G𝕩}F :. G{𝕊: 𝕨F𝕩; 𝕊⁼: 𝕨G𝕩}
    <;._1<;._1 ((1-˜¬×+`)=⟜⊏⊘⊣)⊔⊢
    ⊘ Valences{𝔽𝕩;𝕨𝔾𝕩}{𝔽𝕩;𝕨𝔾𝕩} Apply 𝔽 if there's one argument but 𝔾 if there are two
    Block such as a function definition
    :: Block header
    ;; Block body separator
    ?? Predicate
    ↕ 10w?w? F x Subject
    + β‹ˆ -F?F? G H Function
    @@ -400,7 +400,7 @@ - + diff --git a/docs/spec/complex.html b/docs/spec/complex.html index a5d4826e..3ef70e13 100644 --- a/docs/spec/complex.html +++ b/docs/spec/complex.html @@ -8,7 +8,7 @@

    Complex numbers are an optional extension to BQN's numeric system. If they are supported, the following functionality must also be supported. This extension is a draft and is versioned separately from the rest of the BQN specification.

    A complex number is a value with two components, a real part and an imaginary part. The type of each component is a real number, as described in the type specification. However, this type replaces the number type given there.

    The numeric literal notation is extended with the character i, which separates two real-valued components (in effect, it has lower "precedence" than other characters like e and ¯). If a second component is present (using i or I), that component's value is multiplied by the imaginary unit i and added to the first component; otherwise the value is the first component's value without modification. As with real numbers, the exact complex number given is rounded to fit the number system in use.

    -
    complexnumber = number ( ( "i" | "I" ) number )?
    +
    complexnumber = number ( ( "i" | "I" ) number )?
     

    Basic arithmetic functions +-×÷ are extended to complex numbers. A monadic case for the function + is added, which returns the conjugate of its argument: a number with real part equal to the real part of 𝕩 and imaginary part negated relative to 𝕩.

    The primitive function ⍳ is added: the character ⍳ forms a primitive function token, and its value is the function {𝕨⊢⊘+0j1×𝕩}. This function multiplies 𝕩 by i, then adds 𝕨 if given.

    diff --git a/docs/spec/evaluate.html b/docs/spec/evaluate.html index ef0bd0ca..9bb02864 100644 --- a/docs/spec/evaluate.html +++ b/docs/spec/evaluate.html @@ -20,9 +20,9 @@

    Assignment

    An assignment is one of the four rules containing ASGN. It is evaluated by first evaluating the right-hand-side subExpr, FuncExpr, _m1Expr, or _m2Expr_ expression, and then storing the result in the left-hand-side identifier or identifiers. The result of the assignment expression is the result of its right-hand side. Except for subjects, only a lone identifier is allowed on the left-hand side and storage sets it equal to the result. For subjects, destructuring assignment is performed when an lhs is lhsList or lhsStr. Destructuring assignment is performed recursively by assigning right-hand-side values to the left-hand-side targets, with single-identifier assignment as the base case. The target "·" is also possible in place of a NAME, and performs no assignment.

    The right-hand-side value, here called v, in destructuring assignment must be a list (rank 1 array) or namespace. If it's a list, then each LHS_ENTRY node must be an LHS_ELT. The left-hand side is treated as a list of lhs targets, and matched to v element-wise, with an error if the two lists differ in length. If v is a namespace, then the left-hand side must be an lhsStr where every LHS_ATOM is a NAME, or an lhsList where every LHS_ENTRY is a NAME or lhs "⇐" NAME, so that it can be considered a list of NAME nodes some of which are also associated with lhs nodes. To perform the assignment, the value of each name is obtained from the namespace v, giving an error if v does not define that name. The value is assigned to the lhs node if present (which may be a destructuring assignment or simple subject assignment), and otherwise assigned to the same NAME node used to get it from v.
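
The list case of destructuring can be sketched as a short recursion. This is an illustration in Python rather than BQN, with made-up conventions: a target is a string (a single name), `None` (standing in for the "·" target), or a Python list of targets, and `env` is a dict acting as the variable store. The namespace case is left out.

```python
def destructure(target, value, env):
    """Assign value to target, recursing through list-shaped targets."""
    if target is None:
        # The "·" target: accepted, but nothing is stored.
        return
    if isinstance(target, str):
        # Base case: single-identifier assignment.
        env[target] = value
        return
    # Compound target: the value must be a list of the same length.
    if not isinstance(value, list):
        raise ValueError("destructuring target needs a list value")
    if len(target) != len(value):
        raise ValueError("target and value differ in length")
    for sub_target, sub_value in zip(target, value):
        destructure(sub_target, sub_value, env)
```

For instance, the target `["a", None, ["b", "c"]]` matched against `[1, 2, [3, 4]]` stores a, b, and c and skips the middle element.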

    -

    Modified assignment is the subject assignment rule lhs Derv "↩" subExpr?. In this case, lhs is evaluated as if it were a subExpr (the syntax is a subset of subExpr), and passed as an argument to Derv. The full application is lhs Derv subExpr, if subExpr is given, and Derv lhs otherwise. Its value is assigned to lhs, and is also the result of the modified assignment expression.

    +

    Modified assignment is the subject assignment rule lhs Derv "↩" subExpr?. In this case, lhs is evaluated as if it were a subExpr (the syntax is a subset of subExpr), and passed as an argument to Derv. The full application is lhs Derv subExpr, if subExpr is given, and Derv lhs otherwise. Its value is assigned to lhs, and is also the result of the modified assignment expression.

    Expressions

    -

    We now give rules for evaluating an atom, Func, _mod1 or _mod2_ expression (the possible options for ANY). A literal or primitive sl, Fl, _ml, or _cl_ has a fixed value defined by the specification (literals and built-ins). An identifier s, F, _m, or _c_, if not preceded by atom ".", must have an associated variable due to the scoping rules, and returns this variable's value, or causes an error if it has not yet been set. If it is preceded by atom ".", then the atom node is evaluated first; its value must be a namespace, and the result is the value of the identifier's name in the namespace, or an error if the name is undefined. A parenthesized expression such as "(" _modExpr ")" simply returns the result of the interior expression. A braced construct such as BraceFunc is defined by the evaluation of the statements it contains after all parameters are accepted. Finally, a list "⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩" or ANY ( "‿" ANY )+ consists grammatically of a list of expressions. To evaluate it, each expression is evaluated in source order and their results are placed as elements of a rank-1 array. The two forms have identical semantics but different punctuation.

    +

    We now give rules for evaluating an atom, Func, _mod1 or _mod2_ expression (the possible options for ANY). A literal or primitive sl, Fl, _ml, or _cl_ has a fixed value defined by the specification (literals and built-ins). An identifier s, F, _m, or _c_, if not preceded by atom ".", must have an associated variable due to the scoping rules, and returns this variable's value, or causes an error if it has not yet been set. If it is preceded by atom ".", then the atom node is evaluated first; its value must be a namespace, and the result is the value of the identifier's name in the namespace, or an error if the name is undefined. A parenthesized expression such as "(" _modExpr ")" simply returns the result of the interior expression. A braced construct such as BraceFunc is defined by the evaluation of the statements it contains after all parameters are accepted. Finally, a list "⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩" or ANY ( "‿" ANY )+ consists grammatically of a list of expressions. To evaluate it, each expression is evaluated in source order and their results are placed as elements of a rank-1 array. The two forms have identical semantics but different punctuation.

    Rules in the table below are function and modifier evaluation.

    08 RETDx? → nx? → n Clears stack, dropping 0 or 1 value
    @@ -38,7 +38,7 @@ - + @@ -82,7 +82,7 @@ - + diff --git a/docs/spec/grammar.html b/docs/spec/grammar.html index 3b18a102..fc8e2542 100644 --- a/docs/spec/grammar.html +++ b/docs/spec/grammar.html @@ -8,19 +8,19 @@

    BQN's grammar is given below. Terms are defined in a BNF variant. However, handling special names properly is possible but difficult in BNF, so they are explained in text along with the braced block grammar.

    The symbols s, F, _m, and _c_ are identifier tokens with subject, function, 1-modifier, and 2-modifier classes respectively. Similarly, sl, Fl, _ml, and _cl_ refer to literals and primitives of those classes. While names in the BNF here follow the identifier naming scheme, this is informative only: syntactic roles are no longer used after parsing and cannot be inspected in a running program.

    A program is a list of statements. Almost all statements are expressions. Namespace export statements, and valueless results stemming from ·, or 𝕨 in a monadic brace function, can be used as statements but not expressions.

    -
    PROGRAM  = ⋄? ( STMT ⋄ )* STMT ⋄?
    +
    PROGRAM  = ⋄? ( STMT ⋄ )* STMT ⋄?
     STMT     = EXPR | nothing | EXPORT
     ⋄        = ( "⋄" | "," | \n )+
     EXPR     = subExpr | FuncExpr | _m1Expr | _m2Expr_
    -EXPORT   = LHS_ELT? "⇐"
    +EXPORT   = LHS_ELT? "⇐"
     

    Here we define the "atomic" forms of functions and modifiers, which are either single tokens or enclosed in paired symbols. Stranded lists with ‿, which binds more tightly than any form of execution, are also included.

    ANY      = atom | Func | _mod1 | _mod2_
    -_mod2_   = ( atom "." )? _c_ | _cl_ | "(" _m1Expr_ ")" | _blMod2_
    -_mod1    = ( atom "." )? _m  | _ml  | "(" _m2Expr  ")" | _blMod1
    -Func     = ( atom "." )?  F  |  Fl  | "(" FuncExpr ")" |  BlFunc
    -atom     = ( atom "." )?  s  |  sl  | "(" subExpr  ")" |  blSub | list
    -list     = "⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩"
    +_mod2_   = ( atom "." )? _c_ | _cl_ | "(" _m1Expr_ ")" | _blMod2_
    +_mod1    = ( atom "." )? _m  | _ml  | "(" _m2Expr  ")" | _blMod1
    +Func     = ( atom "." )?  F  |  Fl  | "(" FuncExpr ")" |  BlFunc
    +atom     = ( atom "." )?  s  |  sl  | "(" subExpr  ")" |  blSub | list
    +list     = "⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩"
     subject  = atom | ANY ( "‿" ANY )+
     

    Starting at the highest-order objects, modifiers have simple syntax. In most cases the syntax for ← and ↩ is the same, but only ↩ can be used for modified assignment. The export arrow ⇐ can be used in the same ways as ←, but it can also be used at the beginning of a header to force a namespace result, or with no expression on the right in an EXPORT statement.

    @@ -46,12 +46,12 @@

    Subject expressions consist mainly of function application. We also define nothing-statements, which have very similar syntax to subject expressions but do not permit assignment. They can be used as an STMT or in place of a left argument.

    arg      = subExpr
    -         | ( subject | nothing )? Derv arg
    +         | ( subject | nothing )? Derv arg
     nothing  = "·"
    -         | ( subject | nothing )? Derv nothing
    +         | ( subject | nothing )? Derv nothing
     subExpr  = arg
              | lhs ASGN subExpr
    -         | lhs Derv "↩" subExpr?      # Modified assignment
    +         | lhs Derv "↩" subExpr?      # Modified assignment
     

    The target of subject assignment can be compound to allow for destructuring. List and namespace assignment share the nodes lhsList and lhsStr and cannot be completely distinguished until execution. The term sl in LHS_SUB is used for header inputs below: as an additional rule, it cannot be used in the lhs term of a subExpr node.

    NAME     = s | F | _m | _c_
    @@ -61,7 +61,7 @@
     LHS_ELT  = LHS_ANY | lhsStr
     LHS_ENTRY= LHS_ELT | lhs "⇐" NAME
     lhsStr   = LHS_ATOM ( "‿" LHS_ATOM )+
    -lhsList  = "⟨" ⋄? ( ( LHS_ENTRY ⋄ )* LHS_ENTRY ⋄? )? "⟩"
    +lhsList  = "⟨" ⋄? ( ( LHS_ENTRY ⋄ )* LHS_ENTRY ⋄? )? "⟩"
     lhsComp  = LHS_SUB | lhsStr | "(" lhs ")"
     lhs      = s | lhsComp
     
    @@ -81,19 +81,19 @@

    There are some extra possibilities for a header that specifies arguments. As a special rule, a monadic function header specifically can omit the function when the argument is not just a name (as this would conflict with a subject label). Additionally, an inference header doesn't affect evaluation of the function, but describes how an inferred property (Undo) should be computed. Here "˜" and "⁼" are both specific instances of the _ml token.

    ARG_HEAD = LABEL
    -         | headW? IMM_HEAD      "⁼"? headX
    +         | headW? IMM_HEAD      "⁼"? headX
              | headW  IMM_HEAD "˜"  "⁼"  headX
    -         |        FuncName "˜"? "⁼"
    +         |        FuncName "˜"? "⁼"
              | lhsComp
     
    -

    A braced block contains bodies, which are lists of statements, separated by semicolons and possibly preceded by headers, which are separated from the body with a colon. A non-final expression can be made into a predicate by following it with the separator-like ?. Multiple bodies allow different handling for various cases, which are pattern-matched by headers. A block can have any number of bodies with headers. After these there can be bodies without headers: up to one for an immediate block and up to two for a block with arguments. If a block with arguments has one such body, it's ambivalent, but two of them refer to the monadic and dyadic cases.

    -
    BODY     = ⋄? ( STMT ⋄ | EXPR ⋄? "?" ⋄? )* STMT ⋄?
    +

    A braced block contains bodies, which are lists of statements, separated by semicolons and possibly preceded by headers, which are separated from the body with a colon. A non-final expression can be made into a predicate by following it with the separator-like ?. Multiple bodies allow different handling for various cases, which are pattern-matched by headers. A block can have any number of bodies with headers. After these there can be bodies without headers: up to one for an immediate block and up to two for a block with arguments. If a block with arguments has one such body, it's ambivalent, but two of them refer to the monadic and dyadic cases.

    +
    BODY     = ⋄? ( STMT ⋄ | EXPR ⋄? "?" ⋄? )* STMT ⋄?
     CASE     = BODY
    -I_CASE   = ⋄? IMM_HEAD ⋄? ":" BODY
    -A_CASE   = ⋄? ARG_HEAD ⋄? ":" BODY
    +I_CASE   = ⋄? IMM_HEAD ⋄? ":" BODY
    +A_CASE   = ⋄? ARG_HEAD ⋄? ":" BODY
     IMM_BLK  = "{" ( I_CASE ";" )* ( I_CASE | CASE ) "}"
    -ARG_BLK  = "{" ( A_CASE ";" )* ( A_CASE | CASE ( ";" CASE )? ) "}"
    -blSub    = "{" ( ⋄? s ⋄? ":" )? BODY "}"
    +ARG_BLK  = "{" ( A_CASE ";" )* ( A_CASE | CASE ( ";" CASE )? ) "}"
    +blSub    = "{" ( ⋄? s ⋄? ":" )? BODY "}"
     BlFunc   =           ARG_BLK
     _blMod1  = IMM_BLK | ARG_BLK
     _blMod2_ = IMM_BLK | ARG_BLK
    @@ -145,10 +145,10 @@
     
    𝕨( subject | nothing )?( subject | nothing )? Derv arg 𝕩{(𝕨L𝕩)C(𝕨R𝕩)}
    nothing?nothing? Derv Fork { C(𝕨R𝕩)}
    -

    The rules for special names can be expressed in BNF by making many copies of all expression rules above. For each "level", or row in the table, a new version of every rule should be made that allows that level but not higher ones, and another version should be made that requires exactly that level. The values themselves should be included in s, F, _m, and _c_ for these copies. Then the "allowed" rules are made simply by replacing the terms they contain (excluding blSub and so on) with the same "allowed" versions, and "required" rules are constructed using both "allowed" and "required" rules. For every part of a production rule, an alternative should be created that requires the relevant name in that part while allowing it in the others. For example, ( subject | nothing )? Derv arg would be transformed to

    +

    The rules for special names can be expressed in BNF by making many copies of all expression rules above. For each "level", or row in the table, a new version of every rule should be made that allows that level but not higher ones, and another version should be made that requires exactly that level. The values themselves should be included in s, F, _m, and _c_ for these copies. Then the "allowed" rules are made simply by replacing the terms they contain (excluding blSub and so on) with the same "allowed" versions, and "required" rules are constructed using both "allowed" and "required" rules. For every part of a production rule, an alternative should be created that requires the relevant name in that part while allowing it in the others. For example, ( subject | nothing )? Derv arg would be transformed to

    arg_req1 = subExpr_req1
              | ( subject_req1 | nothing_req1 ) Derv_allow1 arg_allow1
    -         | ( subject_allow1 | nothing_allow1 )? Derv_req1 arg_allow1
    -         | ( subject_allow1 | nothing_allow1 )? Derv_allow1 arg_req1
    +         | ( subject_allow1 | nothing_allow1 )? Derv_req1 arg_allow1
    +         | ( subject_allow1 | nothing_allow1 )? Derv_allow1 arg_req1
     

    Quite tedious. The explosion of rules is partly due to the fact that the brace-typing rule falls into a weaker class of grammars than the other rules. Most of BQN is deterministic context-free but brace-typing is not, only context-free. Fortunately brace typing does not introduce the parsing difficulties that can be present in a general context-free grammar, and it can easily be performed in linear time: after scanning but before parsing, move through the source code maintaining a stack of the current top-level set of braces. Whenever a colon or special name is encountered, annotate that set of braces to indicate that it is present. When a closing brace is encountered and the top brace is popped off the stack, the type is needed if there was no colon, and can be found based on which names were present. One way to present this information to the parser is to replace the brace tokens with new tokens that indicate the type.
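
The linear-time pass described above might look roughly like this in Python. It is a sketch under assumptions: tokens are plain strings, `annotate_braces` is a hypothetical name, and `SPECIAL_NAMES` lists only the special names that appear in this text, not the full set from the specification's table. It stops at recording the annotations and leaves the final block type to the caller.

```python
# Assumption: a partial set of special names, for illustration only.
SPECIAL_NAMES = frozenset("𝕨𝕩𝕊𝔽𝔾")


def annotate_braces(tokens, special=SPECIAL_NAMES):
    """Return one record per brace pair, in order of closing.

    Each record holds the open and close token indices, whether a colon
    was seen directly inside the pair, and which special names were seen
    directly inside it (not inside a nested pair).
    """
    stack = []    # records for braces that are currently open
    records = []
    for index, token in enumerate(tokens):
        if token == "{":
            stack.append({"open": index, "colon": False, "names": set()})
        elif token == "}":
            if not stack:
                raise SyntaxError("unmatched closing brace")
            record = stack.pop()
            record["close"] = index
            records.append(record)
        elif stack:
            # Only the innermost open brace is annotated.
            if token == ":":
                stack[-1]["colon"] = True
            elif token in special:
                stack[-1]["names"].add(token)
    if stack:
        raise SyntaxError("unclosed brace")
    return records
```

Each token is visited once and each brace is pushed and popped at most once, so the pass is linear in the number of tokens.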

    diff --git a/docs/spec/inferred.html b/docs/spec/inferred.html index fd247c65..11c27867 100644 --- a/docs/spec/inferred.html +++ b/docs/spec/inferred.html @@ -381,7 +381,7 @@ ⌜ -{!0<≡𝕩⋄ 𝔽⁼⌜𝕩;} +{!0<≡𝕩⋄ 𝔽⁼⌜𝕩;} Monadic case only diff --git a/docs/spec/literal.html b/docs/spec/literal.html index c171dbf5..0dd938cd 100644 --- a/docs/spec/literal.html +++ b/docs/spec/literal.html @@ -8,9 +8,9 @@

    A literal is a single token that indicates a fixed character, number, or array. While literals indicate values of a data type, primitives indicate values of an operation type: function, 1-modifier, or 2-modifier.

    Two types of literal deal with text. As the source code is considered to be a sequence of unicode code points ("characters"), and these code points are also used for BQN's character data type, the representation of a text literal is very similar to its value. In a text literal, the newline character is always represented using the ASCII line feed character, code point 10. A character literal is enclosed with single quotes ' and its value is identical to the single character between them. A string literal is enclosed in double quotes ", and any double quotes between them must come in pairs, as a lone double quote marks the end of the literal. The value of a string literal is a rank-1 array whose elements are the characters in between the enclosing quotes, after replacing each pair of double quotes with only one such quote. The null literal is the token @ and represents the null character, code point 0.
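
As a small illustration of the string literal rule, here is a Python sketch that turns a string literal token into its value. The name `string_literal_value` is hypothetical, and a Python list of one-character strings stands in for a rank-1 array of characters.

```python
def string_literal_value(token):
    """Return the characters denoted by a double-quoted string literal."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("not enclosed in double quotes")
    inner = token[1:-1]
    # Quotes between the delimiters must come in pairs: once every pair
    # is removed, a leftover quote means some quote was a lone one.
    if '"' in inner.replace('""', ''):
        raise ValueError("lone double quote inside literal")
    # Each pair of double quotes stands for a single quote character.
    return list(inner.replace('""', '"'))
```

So the token `"a""b"` gives the three characters a, ", b, and `""` gives the empty list.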

    The format of a numeric literal is more complicated. From the tokenization rules, a numeric literal consists of a numeric character (one of ¯∞π.0123456789) followed by any number of numeric or alphabetic characters. Some numeric literals are valid and indicate a number, while others are invalid and cause an error. The grammar for valid numbers is given below in a BNF variant. The alphabetic character allowed is "e" or "E", which functions as in scientific notation. Not included in this grammar are underscores: they can be placed anywhere in a number, including after the last non-underscore character, and are ignored entirely.

    -
    number    = "¯"? ( "∞" | mantissa ( ( "e" | "E" ) exponent )? )
    -exponent  = "¯"? digit+
    -mantissa  = "π" | digit+ ( "." digit+ )?
    +
    number    = "¯"? ( "∞" | mantissa ( ( "e" | "E" ) exponent )? )
    +exponent  = "¯"? digit+
    +mantissa  = "π" | digit+ ( "." digit+ )?
     digit     = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
     

    The digits or arabic numerals correspond to the numbers from 0 to 9 in the conventional way (also, each corresponds to its code point value minus 48). A sequence of digits gives a natural number by evaluating it in base 10: the number is 0 for an empty sequence, and otherwise the last digit's numerical value plus ten times the number obtained from the remaining digits. The symbol ∞ indicates infinity and π indicates the ratio pi of a perfect circle's circumference to its diameter. The high minus symbol ¯ indicates that the number containing it is to be negated. When an exponent is provided (with e or E), the corresponding mantissa is multiplied by ten to that power, giving the value mantissa×10⋆exponent.
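
The grammar and the value rules can be tried out with a short Python sketch. It assumes 64-bit floats as the number system, and `parse_number` is a hypothetical name; a conforming implementation is free to round differently.

```python
import math
import re

# One regex for the whole grammar: optional ¯, then either ∞ or a
# mantissa (π or digits with an optional fraction) and optional exponent.
_NUMBER = re.compile(
    r"(¯?)(?:(∞)|(π|[0-9]+(?:\.[0-9]+)?)(?:[eE](¯?)([0-9]+))?)"
)


def parse_number(token):
    """Return the float value of a numeric literal, or raise ValueError."""
    text = token.replace("_", "")  # underscores are ignored entirely
    match = _NUMBER.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid numeric literal: {token!r}")
    negative, infinity, mantissa, exp_negative, exp_digits = match.groups()
    if infinity:
        value = math.inf
    else:
        value = math.pi if mantissa == "π" else float(mantissa)
        if exp_digits is not None:
            # mantissa × 10⋆exponent, with ¯ negating the exponent
            sign = "-" if exp_negative else ""
            value *= float("1e" + sign + exp_digits)
    return -value if negative else value
```

For example, `1_000` and `1e3` both give 1000, `¯∞` gives negative infinity, and `1.` or `∞e3` are rejected because the grammar has no rule for them.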

    diff --git a/docs/spec/primitive.html b/docs/spec/primitive.html index b1c115f2..05ed2dcc 100644 --- a/docs/spec/primitive.html +++ b/docs/spec/primitive.html @@ -87,7 +87,7 @@
  • Before/Bind (⊸)
  • After/Bind (⟜)
  • -

    The somewhat complicated definition of Valences could be replaced with {𝔽𝕩;𝕨𝔾𝕩} using headers. However, reference.bqn uses a simple subset of BQN's syntax that doesn't include headers. Instead, the definition relies on the fact that 𝕨 works like · if no left argument is given: (1˙𝕨)-0 is 1-0 or 1 if 𝕨 is present and (1˙·)-0 otherwise: this reduces to ·-0 or 0.

    +

    The somewhat complicated definition of Valences could be replaced with {𝔽𝕩;𝕨𝔾𝕩} using headers. However, reference.bqn uses a simple subset of BQN's syntax that doesn't include headers. Instead, the definition relies on the fact that 𝕨 works like · if no left argument is given: (1˙𝕨)-0 is 1-0 or 1 if 𝕨 is present and (1˙·)-0 otherwise: this reduces to ·-0 or 0.

    Array properties

    The reference implementations extend Shape (≢) to atoms as well as arrays, in addition to implementing other properties. In all cases, an atom behaves as if it has shape ⟨⟩. The functions in this section never cause an error.

      diff --git a/docs/style.css b/docs/style.css index 15cf4031..e76956cd 100644 --- a/docs/style.css +++ b/docs/style.css @@ -189,6 +189,7 @@ kbd { a:link { color: #0b39dc; text-decoration-color: #0b39dc91; } a:visited { color: #3d155f; } +.Head ,a.Head, .Value ,a.Value { color: #1f2020; } .Function ,a.Function { color: #1f7229; } .Modifier ,a.Modifier { color: #7b3b60; } @@ -225,6 +226,7 @@ a:visited { color: #3d155f; } a:link { color: #5592d9; text-decoration-color: #508dd978; } a:visited { color: #8781c1; } + .Head ,a.Head, .Value ,a.Value { color: #b2b9bb; } .Function ,a.Function { color: #3aa548; } .Modifier ,a.Modifier { color: #93428b; } diff --git a/md.bqn b/md.bqn index c2955824..d252c192 100644 --- a/md.bqn +++ b/md.bqn @@ -563,6 +563,7 @@ hlchars‿classTag ← { "Paren" , "()" "Bracket" , "⟨⟩" "Brace" , "{}" + "Head" , ":;?" "Ligature" , "‿" "Nothing" , "·" "Separator" , "⋄," -- cgit v1.2.3