author Marshall Lochbaum <mwlochbaum@gmail.com> 2022-06-13 14:02:54 -0400
committer Marshall Lochbaum <mwlochbaum@gmail.com> 2022-06-13 14:02:54 -0400
commit 316ae01c16db690a972fcba9ce735818a31773ac
tree 34edbf515e8da9a7df803dbcaa96e1831defe0f1 /spec
parent 0d6c26b9aa607ff14e14e6488bace207e324022a
Specify []; spec is now a version 1.0 draft
Diffstat (limited to 'spec')
-rw-r--r-- spec/README.md | 2
-rw-r--r-- spec/evaluate.md | 10
-rw-r--r-- spec/grammar.md | 8
-rw-r--r-- spec/scope.md | 2
-rw-r--r-- spec/token.md | 2
5 files changed, 15 insertions, 9 deletions
diff --git a/spec/README.md b/spec/README.md
index a549066e..bcbccf3a 100644
--- a/spec/README.md
+++ b/spec/README.md
@@ -2,7 +2,7 @@
# BQN specification
-This document, and the others in this directory (linked in the list below) make up the pre-versioning BQN specification. The specification differs from the [documentation](../doc/README.md) in that its purpose is only to describe the exact details of BQN's operation in the most quickly accessible way, rather than to explain the central ideas of BQN functionality and how it might be used. The core of BQN, which excludes system-provided values, is now almost completely specified. Planned changes to the specification are tracked on [this page](https://topanswers.xyz/apl?q=1888).
+This document, and the others in this directory (linked in the list below) make up the pre-versioning BQN specification. The specification differs from the [documentation](../doc/README.md) in that its purpose is only to describe the exact details of BQN's operation in the most quickly accessible way, rather than to explain the central ideas of BQN functionality and how it might be used. The core language specification, which excludes the system-provided values section (to be versioned separately), is now considered a version 1.0 draft: all functionality to be included in version 1.0 is described, but it must be reviewed to confirm that it's described as intended.
Under this specification, a language implementation is a **BQN pre-version implementation** if it behaves as specified for all input programs. It is a **BQN pre-version implementation with extensions** if it behaves as specified in all cases where the specification does not require an error, but behaves differently in at least one case where it requires an error. It is a **partial** version of either of these if it doesn't conform to the description but differs from a conforming implementation only by rejecting with an error some programs that the conforming implementation accepts. As the specification is not yet versioned, other instances of the specification define these terms in different ways. An implementation can use one of these terms if it conforms to any instance of the pre-versioning BQN specifications that defines them. When versioning is begun, there will be only one specification for each version.
diff --git a/spec/evaluate.md b/spec/evaluate.md
index 48d00014..de3aca44 100644
--- a/spec/evaluate.md
+++ b/spec/evaluate.md
@@ -28,9 +28,11 @@ If there is no left argument, but the `BODY` contains `𝕨` or `𝕎` at the to
### Assignment
-An *assignment* is one of the four rules containing `ASGN`. It is evaluated by first evaluating the right-hand-side `subExpr`, `FuncExpr`, `_m1Expr`, or `_m2Exp_` expression, and then storing the result in the left-hand-side identifier or identifiers. The result of the assignment expression is the result of its right-hand side. Except for subjects, only a lone identifier is allowed on the left-hand side and storage sets it equal to the result. For subjects, *destructuring assignment* is performed when an `lhs` is `lhsList` or `lhsStr`. Destructuring assignment is performed recursively by assigning right-hand-side values to the left-hand-side targets, with single-identifier assignment as the base case. The target `"·"` is also possible in place of a `NAME`, and performs no assignment.
+An *assignment* is one of the four rules containing `ASGN`. It is evaluated by first evaluating the right-hand-side `subExpr`, `FuncExpr`, `_m1Expr`, or `_m2Exp_` expression, and then storing the result in the left-hand-side identifier or identifiers. The result of the assignment expression is the result of its right-hand side. Except for subjects, only a lone identifier is allowed on the left-hand side and storage sets it equal to the result. For subjects, *destructuring assignment* is performed when an `lhs` is `lhsList`, `lhsStr`, or `lhsArray`. Destructuring assignment is performed recursively by assigning right-hand-side values to the left-hand-side targets, with single-identifier assignment as the base case. The target `"·"` is also possible in place of a `NAME`, and performs no assignment.
-The right-hand-side value, here called `v`, in destructuring assignment must be a list (rank 1 array) or namespace. If it's a list, then each `LHS_ENTRY` node must be an `LHS_ELT`. The left-hand side is treated as a list of `lhs` targets, and matched to `v` element-wise, with an error if the two lists differ in length. If `v` is a namespace, then the left-hand side must be an `lhsStr` where every `LHS_ATOM` is an `NAME`, or an `lhsList` where every `LHS_ENTRY` is an `NAME` or `lhs "⇐" NAME`, so that it can be considered a list of `NAME` nodes some of which are also associated with `lhs` nodes. To perform the assignment, the value of each name is obtained from the namespace `v`, giving an error if `v` does not define that name. The value is assigned to the `lhs` node if present (which may be a destructuring assignment or simple subject assignment), and otherwise assigned to the same `NAME` node used to get it from `v`.
+In assignment to `lhsList` or `lhsStr`, the right-hand-side value, here called `v`, must be a list (rank 1 array) or namespace. If it's a list, then each `LHS_ENTRY` node must be an `LHS_ELT`. The left-hand side is treated as a list of `lhs` targets, and matched to `v` element-wise, with an error if the two lists differ in length. If `v` is a namespace, then the left-hand side must be an `lhsStr` where every `LHS_ATOM` is an `NAME`, or an `lhsList` where every `LHS_ENTRY` is an `NAME` or `lhs "⇐" NAME`, so that it can be considered a list of `NAME` nodes some of which are also associated with `lhs` nodes. To perform the assignment, the value of each name is obtained from the namespace `v`, giving an error if `v` does not define that name. The value is assigned to the `lhs` node if present (which may be a destructuring assignment or simple subject assignment), and otherwise assigned to the same `NAME` node used to get it from `v`.
+
+Assignment to `lhsArray` destructures the major cells of right-hand-side value `v`, which must be an array of rank at least 1. The number of cells in `v` is its length `l`, that is, the first element of its shape. The shape of each is the shape of `v` without its first element, and the cell ravels are formed by splitting `v`'s ravel evenly into `l` sections. Besides this difference in how `v` is divided, assignment behaves the same way as assignment of a list `v` to `lhsList`.
A destructuring assignment is performed in program order, or equivalently index order, with each sub-assignment fully completed before beginning the next (a depth-first order). Thus if an assignment with `↩` encounters an error but it's caught with `⎊`, some of the assignment may have already been performed, changing variable values.
@@ -38,7 +40,9 @@ A destructuring assignment is performed in program order, or equivalently index
### Expressions
-We now give rules for evaluating an `atom`, `Func`, `_mod1` or `_mod2_` expression (the possible options for `ANY`). A literal or primitive `sl`, `Fl`, `_ml`, or `_cl_` has a fixed value defined by the specification ([literals](literal.md) and [built-ins](primitive.md)). An identifier `s`, `F`, `_m`, or `_c_`, if not preceded by `atom "."`, must have an associated variable due to the scoping rules, and returns this variable's value, or causes an error if it has not yet been set. If it is preceded by `atom "."`, then the `atom` node is evaluated first; its value must be a namespace, and the result is the value of the identifier's name in the namespace, or an error if the name is undefined. A parenthesized expression such as `"(" _modExpr ")"` simply returns the result of the interior expression. A block is defined by the evaluation of the statements it contains after all parameters are accepted, as described above. Finally, a list `"⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩"` or `ANY ( "‿" ANY )+` consists grammatically of a list of expressions. To evaluate it, each expression is evaluated in source order and their results are placed as elements of a rank-1 array. The two forms have identical semantics but different punctuation.
+We now give rules for evaluating an `atom`, `Func`, `_mod1` or `_mod2_` expression (the possible options for `ANY`). A literal or primitive `sl`, `Fl`, `_ml`, or `_cl_` has a fixed value defined by the specification ([literals](literal.md) and [built-ins](primitive.md)). An identifier `s`, `F`, `_m`, or `_c_`, if not preceded by `atom "."`, must have an associated variable due to the scoping rules, and returns this variable's value, or causes an error if it has not yet been set. If it is preceded by `atom "."`, then the `atom` node is evaluated first; its value must be a namespace, and the result is the value of the identifier's name in the namespace, or an error if the name is undefined. A parenthesized expression such as `"(" _modExpr ")"` simply returns the result of the interior expression. A block is defined by the evaluation of the statements it contains after all parameters are accepted, as described above.
+
+A list `"⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩"` or `ANY ( "‿" ANY )+` consists grammatically of a list of expressions. To evaluate it, each expression is evaluated in source order and their results are placed as elements of a rank-1 array. The two forms have identical semantics but different punctuation. The square bracket notation `"[" ⋄? ( EXPR ⋄ )* EXPR ⋄? "]"` evaluates expressions in the same way, but makes them into major cells of an array instead of elements. The result is identical to applying the [primitive](primitive.md) function Merge (`>`) to a list of the expression results.
Rules in the table below are function and modifier evaluation.
| L | Left | Called | Right | R | Types
diff --git a/spec/grammar.md b/spec/grammar.md
index 6e6c28be..acb00ef5 100644
--- a/spec/grammar.md
+++ b/spec/grammar.md
@@ -20,8 +20,9 @@ Here we define the "atomic" forms of functions and modifiers, which are either s
_mod2_ = ( atom "." )? _c_ | _cl_ | "(" _m2Expr_ ")" | _blMod2_
_mod1 = ( atom "." )? _m | _ml | "(" _m1Expr ")" | _blMod1
Func = ( atom "." )? F | Fl | "(" FuncExpr ")" | BlFunc
- atom = ( atom "." )? s | sl | "(" subExpr ")" | blSub | list
- list = "⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩"
+ atom = ( atom "." )? s | sl | "(" subExpr ")" | blSub | array
+ array = "⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩"
+ | "[" ⋄? ( EXPR ⋄ )* EXPR ⋄? "]"
subject = atom | ANY ( "‿" ANY )+
Starting at the highest-order objects, modifiers have simple syntax. In most cases the syntax for `←` and `↩` is the same, but only `↩` can be used for modified assignment. The export arrow `⇐` can be used in the same ways as `←`, but it can also be used at the beginning of a header to force a namespace result, or with no expression on the right in an `EXPORT` statement.
@@ -60,13 +61,14 @@ Subject expressions consist mainly of function application. We also define nothi
The target of subject assignment can be compound to allow for destructuring. List and namespace assignment share the nodes `lhsList` and `lhsStr` and cannot be completely distinguished until execution. The term `sl` in `LHS_SUB` is used for header inputs below: as an additional rule, it cannot be used in the `lhs` term of a `subExpr` node.
NAME = s | F | _m | _c_
- LHS_SUB = "·" | lhsList | sl
+ LHS_SUB = "·" | lhsList | lhsArray | sl
LHS_ANY = NAME | LHS_SUB | "(" LHS_ELT ")"
LHS_ATOM = LHS_ANY | "(" lhsStr ")"
LHS_ELT = LHS_ANY | lhsStr
LHS_ENTRY= LHS_ELT | lhs "⇐" NAME
lhsStr = LHS_ATOM ( "‿" LHS_ATOM )+
lhsList = "⟨" ⋄? ( ( LHS_ENTRY ⋄ )* LHS_ENTRY ⋄? )? "⟩"
+ lhsArray = "[" ⋄? ( LHS_ELT ⋄ )* LHS_ELT ⋄? "]"
lhsComp = LHS_SUB | lhsStr | "(" lhs ")"
lhs = s | lhsComp
diff --git a/spec/scope.md b/spec/scope.md
index a5ffb38c..fb364275 100644
--- a/spec/scope.md
+++ b/spec/scope.md
@@ -18,7 +18,7 @@ Two identifier instances have the *same name* if their tokens, as strings, match
- The two identifiers are the same instance (a defined variable is its own definition).
The definition for an identifier is chosen from the potential definitions based on their containing scopes: it is the one whose containing scope does not contain or match the containing scope of any other potential definition. If for any identifier there is no definition, then the program is not valid and results in an error. This can occur if the identifier has no potential definition, and also if two potential definitions appear in the same scope. In fact, under this scheme it is never valid to make two definitions with the same name at the top level of a single scope, because both definitions would be potential definitions for the one that comes second in program order. Both definitions have the same containing scope, and any potential definition must contain or match this scope, so no potential definition can be selected.
-The definition of *program order* for identifier tokens follows the order of BQN [execution](evaluate.md). It corresponds to the order of a particular traversal of the abstract syntax tree for a program. To find the relative ordering of two identifiers in a program, we consider the highest-depth node that they both belong to; in this node they must occur in different components, or that component would be a higher-depth node containing both of them. In most nodes, the program order goes from right to left: components further to the right come earlier in program order. The exceptions are `PROGRAM`, `BODY`, `list`, `subject` (for stranding), `lhsList`, `lhsStr`, and body structure (`I_CASE`, `A_CASE`, `IMM_BLK`, `ARG_BLK`, and `blSub`) nodes, in which program order goes in the opposite order, from left to right.
+The definition of *program order* for identifier tokens follows the order of BQN [execution](evaluate.md). It corresponds to the order of a particular traversal of the abstract syntax tree for a program. To find the relative ordering of two identifiers in a program, we consider the highest-depth node that they both belong to; in this node they must occur in different components, or that component would be a higher-depth node containing both of them. In most nodes, the program order goes from right to left: components further to the right come earlier in program order. The exceptions are `PROGRAM`, `BODY`, `array`, `subject` (for stranding), `lhsList`, `lhsArray`, `lhsStr`, and body structure (`I_CASE`, `A_CASE`, `IMM_BLK`, `ARG_BLK`, and `blSub`) nodes, in which program order goes in the opposite order, from left to right.
A subject label is the `s` term in a `blSub` node. As part of a header, it can serve as the definition for an identifier. However, it's defined to be a syntax error if another instance of this identifier appears.
diff --git a/spec/token.md b/spec/token.md
index 12d04679..eafea8f3 100644
--- a/spec/token.md
+++ b/spec/token.md
@@ -25,6 +25,6 @@ Otherwise, a single character forms a token. Only the specified set of character
| Primitive 1-Modifier | `` ˙˜˘¨⌜⁼´˝` ``
| Primitive 2-Modifier | `∘○⊸⟜⌾⊘◶⎉⚇⍟⎊`
| Special name | `𝕨𝕩𝕗𝕘𝕤𝕎𝕏𝔽𝔾𝕊`
-| Punctuation | `←⇐↩(){}⟨⟩‿⋄,.` and newline
+| Punctuation | `←⇐↩(){}⟨⟩[]‿⋄,.` and newline
In the BQN [grammar specification](grammar.md), the three primitive classes are grouped into terminals `Fl`, `_ml`, and `_cl`, while the punctuation characters are identified separately as keywords such as `"←"`. The special names are handled specially. The uppercase versions `𝕎𝕏𝔽𝔾𝕊` and lowercase versions `𝕨𝕩𝕗𝕘𝕤` are two spellings of the five underlying inputs and function.
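
The two square-bracket rules this commit adds to spec/evaluate.md (splitting `v` into major cells for `lhsArray`, and building an array from cells as Merge does) can be illustrated with a rough Python sketch. It is not part of the commit and not how any BQN implementation works: it assumes an array is modeled as a hypothetical `(shape, ravel)` pair, and the helper names `major_cells` and `merge` are invented for illustration. Merge is simplified to the case where every cell is an array of the same shape.

```python
from math import prod

def major_cells(shape, ravel):
    """Split an array, modeled as (shape, ravel), into its major cells.

    Mirrors the lhsArray description: the number of cells is the first
    element of the shape, each cell's shape is the shape without that
    element, and the ravel is split evenly into that many sections.
    """
    if len(shape) < 1:
        raise ValueError("value must have rank at least 1")
    length, cell_shape = shape[0], tuple(shape[1:])
    size = prod(cell_shape)  # elements per cell; 1 when cells are rank 0
    if len(ravel) != length * size:
        raise ValueError("ravel length does not match shape")
    return [(cell_shape, list(ravel[i * size:(i + 1) * size]))
            for i in range(length)]

def merge(cells):
    """Combine equally shaped cells into one array (simplified Merge).

    The bracket grammar requires at least one expression, so an empty
    list of cells is rejected here.
    """
    if not cells:
        raise ValueError("at least one cell is required")
    cell_shape = tuple(cells[0][0])
    if any(tuple(s) != cell_shape for s, _ in cells):
        raise ValueError("cells must all have the same shape")
    ravel = [x for _, r in cells for x in r]
    return ((len(cells),) + cell_shape, ravel)

# A 2x3 array splits into two length-3 cells, and merging restores it.
cells = major_cells((2, 3), [1, 2, 3, 4, 5, 6])
restored = merge(cells)
```

Under this model, destructuring with `lhsArray` amounts to calling `major_cells` and then assigning the resulting cells to the targets exactly as a list would be assigned to `lhsList`, including the error when the number of targets differs from the number of cells.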