From a1b60a18922e578a97e50efe1a2e29863b6f8d92 Mon Sep 17 00:00:00 2001
From: Marshall Lochbaum
Date: Sun, 17 Apr 2022 18:00:17 -0400
Subject: Fix the spec's treatment of multiple bodies and predicates

---
 spec/evaluate.md |  8 ++++----
 spec/grammar.md  | 16 +++++++++-------
 2 files changed, 13 insertions(+), 11 deletions(-)

(limited to 'spec')

diff --git a/spec/evaluate.md b/spec/evaluate.md
index 15861dda..0f340825 100644
--- a/spec/evaluate.md
+++ b/spec/evaluate.md
@@ -16,13 +16,13 @@ The result of parsing a valid BQN program is a `PROGRAM`, and the program is run
 
 A `PROGRAM` or `BODY` is a list of `STMT`s, which are evaluated in program order. A `BODY` also allows an `EXPR` followed by `"?"` in place of an `STMT`: then the expression is evaluated as usual but its result is checked as discussed below. A result is always required for `BODY` nodes, and sometimes for `PROGRAM` nodes (for example, when loaded with `•Import`). If any identifiers in the node's scope are exported, or any of its statements is an `EXPORT`, then the result is the namespace created in order to evaluate the node. If a result is required but the namespace case doesn't apply, then the last `STMT` node must be an `EXPR` and its result is used. The statement `EXPR` evaluates some BQN code and possibly assigns the results, while `nothing` evaluates any `subject` or `Derv` terms it contains but discards the results. An `EXPORT` statement performs no action.
 
-A block consists of several `BODY` terms, some of which may have an accompanying header describing accepted inputs and how they are processed. An immediate block `brImm` can only have one `BODY`, and is evaluated by evaluating it. Other types of blocks don't evaluate any `BODY` immediately, but instead return a function or modifier that obtains its result by evaluating a particular `BODY`. The `BODY` is identified and evaluated once the block has received enough inputs (operands or arguments), which for modifiers can take one or two calls: if two calls are required, then on the first call the operands are simply stored and no code is evaluated yet. The stored values can be accessed by equality checking, or `•Decompose` if defined. Two calls are required if there is more than one `BODY` term, if the `BODY` contains the special names `𝕨𝕩𝕤𝕎𝕏𝕊`, or if its header specifies arguments (the header-body combination is a `_mCase` or `_cCase_`). Otherwise only one is required.
+A block consists of several `BODY` terms, some of which may have an accompanying header describing accepted inputs and how they are processed. An immediate block `blSub` is evaluated when reached. Other types of blocks don't evaluate any `BODY` immediately, but instead return a function or modifier that obtains its result by evaluating a particular `BODY`. The `BODY` is identified and evaluated once the block has received enough inputs (operands or arguments), which for modifiers takes one call for an `IMM_BLK` and two for an `ARG_BLK`. If two calls are required, then on the first call the operands are simply stored and no code is evaluated yet. The stored values can be accessed by equality checking, or `•Decompose` if defined.
 
-To evaluate a block when enough inputs have been received, first the correct case must be identified. To do this, first each special case (`I_CASE` or `A_CASE`), excluding `A_CASE` nodes whose `ARG_HEAD` contains `"⁼"`, is checked in order to see if its arguments are strucurally compatible with the given arguments. That is, is `headW` is an `lhs`, there must be a left argument matching that structure, and if `headX` is an `lhs`, the right argument must match that structure. This means that `𝕨` not only matches any left argument but also no argument. The test for compatibility is the same as for multiple assignment described below, except that the header may contain constants, which must match the corresponding part of the given argument. If no special case matches, then an appropriate general `CASE` is used: if there are two, the first is used with no left argument and the second with a left argument; if there is one, it is always used, and if there are none, an error results.
+To evaluate a block when enough inputs have been received, each case (`I_CASE`, `A_CASE`, or `S_CASE`), excluding `A_CASE` nodes whose `ARG_HEAD` contains `"⁼"`, is tried in order. If any case completes, the block returns the result of that evaluation, and if all cases are tried but none finishes, an error results. A case might not complete because of an incompatible header or failed predicate, as described below. A general case (one with no header or predicates, as defined in the grammar) is always compatible, unless it is the first of two general cases in an `ARG_BLK` block and a left argument is given—this will be handled by the second case.
 
-When a predicate `"?"` is evaluated, it may change the choice of case. The associated `EXPR` is evaluated and its result is checked. If it's not one of the numbers `0` or `1`, an error results. If it's `1`, evaluation of the `BODY` continues as usual. If it's `0`, evaluation is stopped and the next compatible `BODY` term is evaluated using the block's original inputs.
+If a case has a header, then it must structurally match the inputs to begin evaluation. That is, if `headX` is an `lhs`, the right argument must match that structure, and similarly for `HeadF` with a left operand and `HeadG` with a right operand. If `headW` is an `lhs`, there must be a left argument matching that structure. This means that `𝕨` not only matches any left argument but also no argument. The test for compatibility is the same as for destructuring assignment described below, except that the header may contain constants, which must match the corresponding part of the given argument. For a compatible header, inputs and other names are bound when evaluation of a `BODY` is begun. Special names are always bound when applicable: `𝕨𝕩𝕤` if arguments are used, `𝕨` if there is a left argument, `𝕗𝕘` if operands are used, and `_𝕣` and `_𝕣_` for modifiers and combinators, respectively. Any names in the header are also bound, allowing multiple assignment for arguments.
 
-Inputs and other names are bound when evaluation of a `BODY` is begun. Special names are always bound when applicable: `𝕨𝕩𝕤` if arguments are used, `𝕨` if there is a left argument, `𝕗𝕘` if operands are used, and `_𝕣` and `_𝕣_` for modifiers and combinators, respectively. Any names in the header are also bound, allowing multiple assignment for arguments.
+When a predicate `"?"` is evaluated, the associated `EXPR` is evaluated and its result is checked. If it's not one of the numbers `0` or `1`, an error results. If it's `1`, evaluation of the `BODY` continues as usual. If it's `0`, evaluation is stopped and the next compatible `BODY` term is evaluated using the block's original inputs.
 
 If there is no left argument, but the `BODY` contains `𝕨` or `𝕎` at the top level, then it is conceptually re-parsed with `𝕨` replaced by `·` to give a monadic version before application; this modifies the syntax tree by replacing some instances of `subject`, `arg`, or `Operand` with `nothing`. The token `𝕎` is not allowed in this case and causes an error. Re-parsing `𝕨` can also cause an error if it's used as an operand or list element, where `nothing` is not allowed by the grammar. Note that these errors must not appear if the block is always called with two arguments. True re-parsing is not required, as the same effect can also be achieved dynamically by treating `·` as a value and checking for it during execution. If it's used as a left argument, then the function should instead be called with no left argument (and similarly in trains); if it's used as a right argument, then the function and its left argument are evaluated but rather than calling the function `·` is "returned" immediately; and if it's used in another context then it causes an error.
 
diff --git a/spec/grammar.md b/spec/grammar.md
index 50e465ec..6e6c28be 100644
--- a/spec/grammar.md
+++ b/spec/grammar.md
@@ -93,15 +93,17 @@ There are some extra possibilities for a header that specifies arguments. As a s
              | FuncName "˜"? "⁼"
              | lhsComp
 
-A block is written with braces. It contains bodies, which are lists of statements, separated by semicolons. Multiple bodies can handle different cases, as determined by headers and predicates. A header is written before its body with a separating colon, and an expression other than the last in a body can be made into a predicate by following it with the separator-like `?`. A block can have any number of bodies with headers. After these there can be bodies without headers—up to one for an immediate block and up to two for a block with arguments. If a block with arguments has one such body, it's ambivalent, but two of them refer to the monadic and dyadic cases.
+A block is written with braces. It contains bodies, which are lists of statements, separated by semicolons. Multiple bodies can handle different cases, as determined by headers and predicates. A header is written before its body with a separating colon, and an expression other than the last in a body can be made into a predicate by following it with the separator-like `?`.
+
+An `I_CASE`, `A_CASE`, or `S_CASE` is called a *general case* or *general body* if it has no header or predicate, or, more formally, it doesn't directly include a `":"` token and its `BODY` node doesn't use the `EXPR ⋄? "?" ⋄?` case. A program must satisfy some additional rules regarding general cases, but these are not needed to resolve the grammar and shouldn't strictly be considered part of it. First, no general body can appear before a body that isn't general in a block. Second, a `IMM_BLK` or `blSub` can directly contain at most one general body and an `ARG_BLK` at most two (these are monadic and dyadic cases).
 
     BODY     = ⋄? ( STMT ⋄ | EXPR ⋄? "?" ⋄? )* STMT ⋄?
-    CASE     = BODY
-    I_CASE   = ⋄? IMM_HEAD ⋄? ":" BODY
-    A_CASE   = ⋄? ARG_HEAD ⋄? ":" BODY
-    IMM_BLK  = "{" ( I_CASE ";" )* ( I_CASE | CASE ) "}"
-    ARG_BLK  = "{" ( A_CASE ";" )* ( A_CASE | CASE ( ";" CASE )? ) "}"
-    blSub    = "{" ( ⋄? s ⋄? ":" )? BODY "}"
+    I_CASE   = ( ⋄? IMM_HEAD ⋄? ":" )? BODY
+    A_CASE   = ( ⋄? ARG_HEAD ⋄? ":" )? BODY
+    S_CASE   = ( ⋄? s ⋄? ":" )? BODY
+    IMM_BLK  = "{" ( I_CASE ";" )* I_CASE "}"
+    ARG_BLK  = "{" ( A_CASE ";" )* A_CASE "}"
+    blSub    = "{" ( S_CASE ";" )* S_CASE "}"
     BlFunc   = ARG_BLK
     _blMod1  = IMM_BLK | ARG_BLK
     _blMod2_ = IMM_BLK | ARG_BLK
-- 
cgit v1.2.3
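
The case-selection rule introduced by the new evaluate.md text (try each case in order, skip a case whose header does not match or whose predicate gives `0`, and fail if none completes) can be sketched roughly as follows. This is an illustrative Python model, not code from the BQN implementation: `run_block`, `check_predicate`, `CaseError`, and `PredicateFailed` are hypothetical names, a case is modelled as a `(header_matches, body)` pair of Python callables, and the special handling of two general cases in an `ARG_BLK` is left out.

```python
class CaseError(Exception):
    """No case completed, or a predicate result was not 0 or 1 (hypothetical name)."""


class PredicateFailed(Exception):
    """Signals that a predicate evaluated to 0, so the next case should be tried."""


def check_predicate(value):
    # Mirrors the spec text: 1 continues, 0 abandons the case, anything else is an error.
    if value == 1:
        return
    if value == 0:
        raise PredicateFailed()
    raise CaseError("predicate result must be 0 or 1")


def run_block(cases, inputs):
    # Each case is (header_matches, body); both receive the block's original inputs.
    for header_matches, body in cases:
        if not header_matches(inputs):
            continue  # incompatible header: try the next case
        try:
            return body(inputs, check_predicate)
        except PredicateFailed:
            continue  # failed predicate: try the next case with the same inputs
    raise CaseError("no case completed")


def positive_body(inputs, predicate):
    # Body with a leading predicate, modelled on a body that tests whether x is positive.
    predicate(1 if inputs["x"] > 0 else 0)
    return "positive"


cases = [
    (lambda inp: inp["x"] == 0, lambda inp, predicate: "zero"),   # header with a constant
    (lambda inp: True, positive_body),                            # no header, one predicate
    (lambda inp: True, lambda inp, predicate: "negative"),        # general case
]

print(run_block(cases, {"x": 0}))
print(run_block(cases, {"x": 5}))
print(run_block(cases, {"x": -3}))
```

Raising an exception for a failed predicate keeps the body code linear: a body simply calls `predicate(...)` wherever a `?` would appear, and the driver loop decides what to try next.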