author Marshall Lochbaum <mwlochbaum@gmail.com> 2020-07-07 16:07:15 -0400
committer Marshall Lochbaum <mwlochbaum@gmail.com> 2020-07-07 16:09:49 -0400
commit 85c54f4c22897972025d76502b9e305541ec5a6e
tree 754479d1800e086eefa1c6369d8f5abce90df251
parent a9d01bc7c2be4b66595af69d53ff015bf5da2023
Specify headers (still too much hand-waving)
-rw-r--r-- spec/evaluate.md 18
-rw-r--r-- spec/grammar.md 31
2 files changed, 33 insertions, 16 deletions
diff --git a/spec/evaluate.md b/spec/evaluate.md
index 03edc813..d4e345d9 100644
--- a/spec/evaluate.md
+++ b/spec/evaluate.md
@@ -2,14 +2,28 @@ This page describes the semantics of the code constructs whose grammar is given
Here we assume that the referent of each identifier, or equivalently the connections between identifiers, have been identified according to the [scoping rules](scope.md).
-A `PROGRAM` or `BRACED` block is a list of `STMT`s (for `BRACED`, the last must be an `EXPR`, a particular kind of `STMT`), which are evaluated in program order. The statement `nothing` does nothing when evaluated, while `EXPR` evaluates some APL code and possibly assigns the results, as described below.
+### Programs and blocks
-One additional rule for `BRACED` blocks makes it easier to define ambivalent functions. Each such block that contains `𝕨` at the top level is parsed normally to give a dyadic function, but is also parsed a second time with all instances of `𝕨` replaced by `·` to give a monadic function (as the only effect is to change some instances of `arg` to `nothing`, this can be achieved efficiently by annotating parts of the AST that depend on `𝕨` as conditionally-nothing). When called, the dyadic function is used if both arguments are given and the monadic one is used if there is only one. This applies to modifiers and compositions written with `𝕨` as well, where the choice of which version to use is made when the derived function is called.
+The result of parsing a valid BQN program is a `PROGRAM`, and the program is run by evaluating this term.
+
+A `PROGRAM` or `BODY` is a list of `STMT`s (for `BODY`, the last must be an `EXPR`, a particular kind of `STMT`), which are evaluated in program order. The statement `nothing` does nothing when evaluated, while `EXPR` evaluates some APL code and possibly assigns the results, as described below.
+
+A block consists of several `BODY` terms, some of which may have an accompanying header describing accepted inputs and how they are processed. A value block `brVal` can only have one `BODY`, and is evaluated by evaluating the code in it. Other types of blocks do not evaluate any `BODY` immediately, but instead return a function, modifier, or operator that obtains its result by evaluating a particular `BODY`. The `BODY` is identified and evaluated once the block has received enough inputs (operands or arguments), which for modifiers and compositions can take one or two calls: if two calls are required, then on the first call the operands are simply stored and no code is evaluated yet. Two calls are required if there is more than one `BODY` term, if the `BODY` contains the special names `𝕨𝕩𝕤𝕎𝕏𝕊`, or if its header specifies arguments (the header-body is a `_mCase` or `_cCase_`). Otherwise only one is required.
+
+To evaluate a block when enough inputs have been received, first the correct case must be identified. To do this, first each special case (`FCase`, `_mCase`, or `_cCase_`) is checked in order to see if its arguments are structurally compatible with the given arguments. That is, if `headW` is a `value`, there must be a left argument matching that structure, and if `headX` is a `value`, the right argument must match that structure. This means that `𝕨` not only matches any left argument but also no argument. The test for compatibility is the same as for multiple assignment described below, except that the header may contain constants, which must match the corresponding part of the given argument. If no special case matches, then an appropriate general case (`FMain`, `_mMain`, or `_cMain_`) is used: if there are two, the first is used with no left argument and the second with a left argument; if there is one, it is always used, and if there are none, an error results.
+
+The only remaining step before evaluating the `BODY` is to bind the inputs and other names. Special names are always bound when applicable: `𝕨𝕩𝕤` if arguments are used, `𝕨` if there is a left argument, `𝕗𝕘` if operands are used, and `_𝕣` and `_𝕣_` for modifiers and combinators, respectively. Any names in the header are also bound, allowing multiple assignment for arguments.
+
+If there is no left argument, but the `BODY` contains `𝕨` at the top level, then it is conceptually re-parsed with `𝕨` replaced by `·` to give a monadic version before application. As the only effect when this re-parsed form is valid is to change some instances of `arg` to `nothing`, this can be achieved efficiently by annotating parts of the AST that depend on `𝕨` as conditionally-nothing. However, it also causes an error if `𝕨` is used as an operand or list element, where `nothing` is not allowed by the grammar.
+
+### Assignment
An *assignment* is one of the four rules containing `ASGN`. It is evaluated by first evaluating the right-hand-side `valExpr`, `FuncExpr`, `_modExpr`, or `_cmpExp_` expression, and then storing the result in the left-hand-side identifier or identifiers. The result of the assignment expression is the result of its right-hand side. Except for values, only a lone identifier is allowed on the left-hand side and storage is obvious. For values, *multiple assignment* with a list left-hand side is also allowed. Multiple assignment is performed recursively by assigning right-hand-side values to the left-hand-side targets, with single-identifier (`v`) assignment as the base case. When matching the right-hand side to a list left-hand side, the left hand side is treated as a list of `lhs` targets. The evaluated right-hand side must be a list (rank-1 array) of the same length, and is matched to these targets element-wise.
*Modified assignment* is the value assignment rule `lhs Derv "↩" valExpr`. In this case, `lhs` should be evaluated as if it were a `valExpr` (the syntax is a subset of `valExpr`), and the result of the function application `lhs Derv valExpr` should be assigned to `lhs`, and is also the result of the modified assignment expression.
+### Expressions
+
We now give rules for evaluating an `atom`, `Func`, `_mod` or `_comp_` expression (the possible options for `ANY`). A literal `vl`, `Fl`, `_ml`, or `_cl_` has a fixed value defined by the specification ([value literals](literal.md) and [built-ins](primitive.md)). An identifier `v`, `F`, `_m`, or `_c_` is evaluated by returning its value; because of the scoping rules it must have one when evaluated. A parenthesized expression such as `"(" _modExpr ")"` simply returns the result of the interior expression. A braced construct such as `BraceFunc` is defined by the evaluation of the statements it contains after all parameters are accepted. Finally, a list `"⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩"` or `ANY ( "‿" ANY )+` consists grammatically of a list of expressions. To evaluate it, each expression is evaluated in source order and their results are placed as elements of a rank-1 array. The two forms have identical semantics but different punctuation.
Rules in the table below are function and operator evaluation.
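The case-selection rule added to `spec/evaluate.md` in the hunk above can be sketched in Python (not BQN) as follows. This is only an illustration under stated assumptions, not the specification's algorithm: the names `SPECIAL`, `ABSENT`, `Name`, `Const`, `compatible`, and `select_body` are hypothetical, headers are modelled as already-parsed patterns, every special case is assumed to carry a left pattern (the omitted-`headW` form is not modelled), and bodies are stood in for by plain strings.

```python
from typing import Any, NamedTuple

SPECIAL = object()  # hypothetical stand-in for the special name 𝕨 or 𝕩 in a header
ABSENT = object()   # hypothetical marker for "no left argument was given"

class Name(NamedTuple):
    """A plain identifier in a header: matches any argument that is present."""
    ident: str

class Const(NamedTuple):
    """A constant in a header: must equal the corresponding part of the argument."""
    value: Any

def compatible(pat, arg):
    """Structural compatibility test between a header pattern and an argument."""
    if pat is SPECIAL:
        return True                      # matches any argument, and also no argument
    if arg is ABSENT:
        return False                     # a value pattern needs an actual argument
    if isinstance(pat, Name):
        return True
    if isinstance(pat, Const):
        return pat.value == arg
    # Otherwise pat is a Python list of sub-patterns: the argument must be a
    # list of the same length, matched element-wise (as in multiple assignment).
    return (isinstance(arg, list) and len(arg) == len(pat)
            and all(compatible(p, a) for p, a in zip(pat, arg)))

def select_body(cases, mains, w, x):
    """cases: list of (w_pattern, x_pattern, body); mains: list of 0 to 2 bodies."""
    for w_pat, x_pat, body in cases:     # special cases are checked in order
        if compatible(w_pat, w) and compatible(x_pat, x):
            return body
    if len(mains) == 2:                  # first is monadic, second is dyadic
        return mains[0] if w is ABSENT else mains[1]
    if len(mains) == 1:                  # a single general case is always used
        return mains[0]
    raise ValueError("no case matches the given arguments")

cases = [(SPECIAL, Const(0), "zero"),
         ([Name("a"), Name("b")], Name("x"), "pair-left")]
mains = ["monadic", "dyadic"]
```

With these definitions, `select_body(cases, mains, ABSENT, 0)` picks the first special case because the `𝕨` stand-in also matches a missing left argument, while a call whose arguments match no special case falls through to the monadic or dyadic main body.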
diff --git a/spec/grammar.md b/spec/grammar.md
index ea3aabf5..2ea60e65 100644
--- a/spec/grammar.md
+++ b/spec/grammar.md
@@ -61,7 +61,7 @@ Value expressions are complicated by the possibility of list assignment. We also
| lhs ASGN valExpr
| lhs Derv "↩" valExpr ⍝ Modified assignment
-A header looks like a name for the thing being headed, or its application to inputs (possibly twice in the case of modifiers and compositions). As with assignment, it is restricted to a simple form with no extra parentheses. The full list syntax is allowed for arguments. As a special rule, a function header specifically can omit the function.
+A header looks like a name for the thing being headed, or its application to inputs (possibly twice in the case of modifiers and compositions). As with assignment, it is restricted to a simple form with no extra parentheses. The full list syntax is allowed for arguments. As a special rule, a monadic function header specifically can omit the function when the argument is not just a name (as this would conflict with a value label). The following cases define only headers with arguments, which are assumed to be special cases; there can be any number of these. Headers without arguments can only refer to the general case (note that operands are not pattern matched), so there can be at most two of these kinds of headers, indicating the monadic and dyadic cases.
headW = value | "𝕨"
headX = value | "𝕩"
@@ -69,22 +69,25 @@ A header looks like a name for the thing being headed, or its application to inp
HeadG = F | "𝕘" | "𝔾"
ModH1 = HeadF ( _m | "_𝕣" )
CmpH1 = HeadF ( _c_ | "_𝕣_" ) HeadG
- valHead = v
- FuncHead = F | headW? ( F | "𝕊" ) headX
- | vl | "(" valExpr ")" | brVal | list ⍝ value but not v
- _modHead = _m | ModH1 | headW? ModH1 headX
- _cmpHed_ = _c_ | CmpH1 | headW? CmpH1 headX
+ FuncHead = headW? ( F | "𝕊" ) headX
+ | vl | "(" valExpr ")" | brVal | list ⍝ value,
+ | ANY ( "‿" ANY )+ ⍝ but not v
+ _modHead = headW? ModH1 headX
+ _cmpHed_ = headW? CmpH1 headX
-A braced block contains bodies, which are lists of statements, separated by semicolons and possibly preceded by headers, which are separated from the body with a colon. Multiple bodies allow different handling for various cases, which are pattern-matched by headers. For a value block there are no inputs, so there can only be one possible case and one body. Functions and operators allow any number of bodies with headers followed by at most two bodies without headers. These bodies refer to the general cases: ambivalent if there is only one and split into monadic and dyadic if there are two.
+A braced block contains bodies, which are lists of statements, separated by semicolons and possibly preceded by headers, which are separated from the body with a colon. Multiple bodies allow different handling for various cases, which are pattern-matched by headers. For a value block there are no inputs, so there can only be one possible case and one body. Functions and operators allow any number of "matched" bodies, with headers that have arguments, followed by at most two "main" bodies with either no headers or headers without arguments. If there is one main body, it is ambivalent, but two main bodies refer to the monadic and dyadic cases.
BODY = ⋄? ( STMT ⋄ )* EXPR ⋄?
- FuncCase = ⋄? FuncHead ":" BODY
- _modCase = ⋄? _modHead ":" BODY
- _cmpCas_ = ⋄? _cmpHed_ ":" BODY
- brVal = "{" ( ⋄? valHead ":" )? BODY "}"
- BrFunc = "{" ( FuncCase ";" )* ( FuncCase | BODY | BODY ";" BODY ) "}"
- _brMod = "{" ( _modCase ";" )* ( _modCase | BODY | BODY ";" BODY ) "}"
- _brComp_ = "{" ( _cmpCas_ ";" )* ( _cmpCas_ | BODY | BODY ";" BODY ) "}"
+ FCase = ⋄? FuncHead ":" BODY
+ _mCase = ⋄? _modHead ":" BODY
+ _cCase_ = ⋄? _cmpHed_ ":" BODY
+ FMain = ( ⋄? F ":" )? BODY
+ _mMain = ( ⋄? ( _m | ModH1 ) ":" )? BODY
+ _cMain_ = ( ⋄? ( _c_ | CmpH1 ) ":" )? BODY
+ brVal = "{" ( ⋄? v ":" )? BODY "}"
+ BrFunc = "{" ( FCase ";" )* ( FCase | FMain ( ";" FMain )? ) "}"
+ _brMod = "{" ( _mCase ";" )* ( _mCase | _mMain ( ";" _mMain )? ) "}"
+ _brComp_ = "{" ( _cCase_ ";" )* ( _cCase_ | _cMain_ ( ";" _cMain_ )? ) "}"
Two additional rules apply to blocks, based on the special name associations in the table below. First, each block allows the special names in its column to be used as the given token types within `BODY` terms (not headers). Except for the spaces labelled "None", each column is cumulative and a given entry also includes all the entries above it. Second, for `BrFunc`, `_brMod`, and `_brComp_` terms, if no header is given, then at least one `BODY` term in it *must* contain one of the names on, and not above, the corresponding row. Otherwise the syntax would be ambiguous, since for example a simple `"{" BODY "}"` sequence could have any type.
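As a rough check on the block productions in the hunk above (`BrFunc`, `_brMod`, `_brComp_`), the following Python sketch tests only the overall shape they share: any number of matched cases, followed either by one more case or by one or two main bodies. The function name `valid_block_shape` and the `"case"` / `"main"` labels are hypothetical; classifying a body as one or the other from its header is assumed to have been done already, and the special-name rules described in the paragraph above are not checked.

```python
def valid_block_shape(kinds):
    """Return True if the body kinds, in source order, fit the shape
    ( case ";" )* ( case | main ( ";" main )? ).

    kinds is a list of the strings "case" and "main" (hypothetical labels).
    """
    if not kinds:
        return False                      # a block needs at least one body
    n_cases = 0
    while n_cases < len(kinds) and kinds[n_cases] == "case":
        n_cases += 1                      # skip the leading run of matched cases
    mains = kinds[n_cases:]
    # What follows the cases must be at most two bodies, all of them main bodies.
    return len(mains) <= 2 and all(k == "main" for k in mains)
```

Flattening the grammar this way makes the constraint easy to see: the sequence must look like zero or more cases followed by zero to two main bodies, with at least one body in total.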