From 316ae01c16db690a972fcba9ce735818a31773ac Mon Sep 17 00:00:00 2001
From: Marshall Lochbaum
Date: Mon, 13 Jun 2022 14:02:54 -0400
Subject: Specify []; spec is now a version 1.0 draft

---
 docs/spec/evaluate.html | 8 +++++---
 docs/spec/grammar.html  | 8 +++++---
 docs/spec/index.html    | 2 +-
 docs/spec/scope.html    | 2 +-
 docs/spec/token.html    | 2 +-
 5 files changed, 13 insertions(+), 9 deletions(-)
 (limited to 'docs/spec')

diff --git a/docs/spec/evaluate.html b/docs/spec/evaluate.html
index 50e42796..8c99de10 100644
--- a/docs/spec/evaluate.html
+++ b/docs/spec/evaluate.html
@@ -18,12 +18,14 @@

When a predicate "?" is evaluated, the associated EXPR is evaluated and its result is checked. If it's not one of the numbers 0 or 1, an error results. If it's 1, evaluation of the BODY continues as usual. If it's 0, evaluation is stopped and the next compatible BODY term is evaluated using the block's original inputs.
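The predicate rule above can be modelled outside BQN. The following Python sketch is illustrative only and is not part of the specification: the name `eval_block`, the `(kind, fn)` statement encoding, and the decision to treat every body as compatible are all assumptions made for the example. It also assumes that running out of bodies is an error.

```python
def eval_block(bodies, inputs):
    """Try each body in order; a body is a list of (kind, fn) statements.

    Hypothetical model: kind is "pred" for a predicate, "expr" otherwise,
    and every body is assumed to be compatible with the inputs.
    """
    for body in bodies:
        env = dict(inputs)  # each body starts from the block's original inputs
        result = None
        for kind, fn in body:
            value = fn(env)
            if kind == "pred":
                # Only the numbers 0 and 1 are accepted; anything else is an error.
                if type(value) not in (int, float) or value not in (0, 1):
                    raise ValueError("predicate result must be 0 or 1")
                if value == 0:
                    break  # stop this body and move on to the next one
            else:
                result = value
        else:
            return result  # the body ran to the end without a failed predicate
    raise ValueError("no body completed")  # assumption: exhausting bodies is an error


# A three-body "sign" block: the first two bodies are guarded by predicates.
sign = [
    [("pred", lambda e: int(e["x"] < 0)), ("expr", lambda e: -1)],
    [("pred", lambda e: int(e["x"] == 0)), ("expr", lambda e: 0)],
    [("expr", lambda e: 1)],
]
```

Calling `eval_block(sign, {"x": -5})` falls through no bodies, while `{"x": 3}` fails both predicates and reaches the last body.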

If there is no left argument, but the BODY contains 𝕨 or 𝕎 at the top level, then it is conceptually re-parsed with 𝕨 replaced by · to give a monadic version before application; this modifies the syntax tree by replacing some instances of subject, arg, or Operand with nothing. The token 𝕎 is not allowed in this case and causes an error. Re-parsing 𝕨 can also cause an error if it's used as an operand or list element, where nothing is not allowed by the grammar. Note that these errors must not appear if the block is always called with two arguments. True re-parsing is not required, as the same effect can also be achieved dynamically by treating · as a value and checking for it during execution. If it's used as a left argument, then the function should instead be called with no left argument (and similarly in trains); if it's used as a right argument, then the function and its left argument are evaluated but rather than calling the function · is "returned" immediately; and if it's used in another context then it causes an error.

Assignment

-

An assignment is one of the four rules containing ASGN. It is evaluated by first evaluating the right-hand-side subExpr, FuncExpr, _m1Expr, or _m2Exp_ expression, and then storing the result in the left-hand-side identifier or identifiers. The result of the assignment expression is the result of its right-hand side. Except for subjects, only a lone identifier is allowed on the left-hand side and storage sets it equal to the result. For subjects, destructuring assignment is performed when an lhs is lhsList or lhsStr. Destructuring assignment is performed recursively by assigning right-hand-side values to the left-hand-side targets, with single-identifier assignment as the base case. The target "·" is also possible in place of a NAME, and performs no assignment.

-

The right-hand-side value, here called v, in destructuring assignment must be a list (rank 1 array) or namespace. If it's a list, then each LHS_ENTRY node must be an LHS_ELT. The left-hand side is treated as a list of lhs targets, and matched to v element-wise, with an error if the two lists differ in length. If v is a namespace, then the left-hand side must be an lhsStr where every LHS_ATOM is an NAME, or an lhsList where every LHS_ENTRY is an NAME or lhs "⇐" NAME, so that it can be considered a list of NAME nodes some of which are also associated with lhs nodes. To perform the assignment, the value of each name is obtained from the namespace v, giving an error if v does not define that name. The value is assigned to the lhs node if present (which may be a destructuring assignment or simple subject assignment), and otherwise assigned to the same NAME node used to get it from v.

+

An assignment is one of the four rules containing ASGN. It is evaluated by first evaluating the right-hand-side subExpr, FuncExpr, _m1Expr, or _m2Exp_ expression, and then storing the result in the left-hand-side identifier or identifiers. The result of the assignment expression is the result of its right-hand side. Except for subjects, only a lone identifier is allowed on the left-hand side and storage sets it equal to the result. For subjects, destructuring assignment is performed when an lhs is lhsList, lhsStr, or lhsArray. Destructuring assignment is performed recursively by assigning right-hand-side values to the left-hand-side targets, with single-identifier assignment as the base case. The target "·" is also possible in place of a NAME, and performs no assignment.

+

In assignment to lhsList or lhsStr, the right-hand-side value, here called v, must be a list (rank 1 array) or namespace. If it's a list, then each LHS_ENTRY node must be an LHS_ELT. The left-hand side is treated as a list of lhs targets, and matched to v element-wise, with an error if the two lists differ in length. If v is a namespace, then the left-hand side must be an lhsStr where every LHS_ATOM is an NAME, or an lhsList where every LHS_ENTRY is an NAME or lhs "⇐" NAME, so that it can be considered a list of NAME nodes some of which are also associated with lhs nodes. To perform the assignment, the value of each name is obtained from the namespace v, giving an error if v does not define that name. The value is assigned to the lhs node if present (which may be a destructuring assignment or simple subject assignment), and otherwise assigned to the same NAME node used to get it from v.
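As a rough illustration of the namespace case, the Python sketch below models a namespace and the variable environment as dictionaries. The function name `assign_from_namespace` and the `(target, name)` encoding are hypothetical; a `target` of `None` stands for a bare NAME entry, and for simplicity a non-`None` target is a plain variable name rather than a nested destructuring target.

```python
def assign_from_namespace(entries, namespace, env):
    """Assign fields of `namespace` into `env`.

    entries: list of (target, name) pairs in program order. `name` is looked
    up in the namespace; the value is stored under `target` if one is given
    (modelling: lhs "⇐" NAME), otherwise under `name` itself.
    """
    for target, name in entries:  # left to right
        if name not in namespace:
            raise KeyError(f"namespace does not define {name!r}")
        env[name if target is None else target] = namespace[name]
    return env


# Models something like ⟨a, y⇐b⟩ ← ns, where ns defines a, b and c.
env = assign_from_namespace([(None, "a"), ("y", "b")], {"a": 1, "b": 2, "c": 3}, {})
```

Fields of the namespace that are not named on the left-hand side (here `c`) are simply ignored, while a name missing from the namespace raises an error.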

+

Assignment to lhsArray destructures the major cells of right-hand-side value v, which must be an array of rank at least 1. The number of cells in v is its length l, that is, the first element of its shape. The shape of each is the shape of v without its first element, and the cell ravels are formed by splitting v's ravel evenly into l sections. Besides this difference in how v is divided, assignment behaves the same way as assignment of a list v to lhsList.
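The division into major cells described above can be sketched in Python, representing an array as a pair of its shape and its ravel. This is an illustrative model under that assumption, not text from the specification; `major_cells` and `assign_cells` are hypothetical names, and targets are limited to plain names or `None` (standing for "·").

```python
from math import prod


def major_cells(shape, ravel):
    """Split an array, given as (shape, ravel), into its major cells."""
    if len(shape) < 1:
        raise ValueError("array must have rank at least 1")
    if len(ravel) != prod(shape):
        raise ValueError("ravel length does not match shape")
    length, cell_shape = shape[0], list(shape[1:])
    size = prod(cell_shape)  # elements per cell; prod([]) is 1
    # Split the ravel evenly into `length` sections, one per cell.
    return [(cell_shape, ravel[i * size:(i + 1) * size]) for i in range(length)]


def assign_cells(targets, shape, ravel, env):
    """Assign each major cell to the matching target, left to right."""
    cells = major_cells(shape, ravel)
    if len(cells) != len(targets):
        raise ValueError("number of targets differs from number of cells")
    for name, cell in zip(targets, cells):
        if name is not None:  # None models "·", which performs no assignment
            env[name] = cell
    return env
```

For a 2-by-3 array with ravel `[0, 1, 2, 3, 4, 5]`, `major_cells` returns two cells of shape `[3]` holding `[0, 1, 2]` and `[3, 4, 5]`.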

A destructuring assignment is performed in program order, or equivalently index order, with each sub-assignment fully completed before beginning the next (a depth-first order). Thus if an assignment with ↩ encounters an error but it's caught with ⎊, some of the assignment may have already been performed, changing variable values.

Modified assignment is the subject assignment rule lhs Derv "↩" subExpr?. In this case, lhs is evaluated as if it were a subExpr (the syntax is a subset of subExpr), and passed as an argument to Derv. The full application is lhs Derv subExpr, if subExpr is given, and Derv lhs otherwise. Its value is assigned to lhs, and is also the result of the modified assignment expression.
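A minimal Python sketch of modified assignment, assuming the lhs is a single variable held in a dictionary `env`. The names `modified_assign` and `_MISSING` are hypothetical; `derv` is an ordinary Python callable taking `(left, right)` in the two-argument case and a single argument otherwise.

```python
_MISSING = object()  # sentinel meaning "no subExpr was given"


def modified_assign(env, name, derv, right=_MISSING):
    """Model `name Derv↩ right` (or `name Derv↩` when right is omitted)."""
    current = env[name]  # lhs is evaluated like a subExpr
    if right is _MISSING:
        value = derv(current)          # Derv lhs
    else:
        value = derv(current, right)   # lhs Derv subExpr
    env[name] = value                  # store the result back into lhs
    return value                       # the result is also the expression's value


env = {"a": 3}
first = modified_assign(env, "a", lambda w, x: w + x, 4)   # like a +↩ 4
second = modified_assign(env, "a", lambda x: -x)           # like a -↩
```

After both calls `env["a"]` holds the negated sum, and each call returned the value it stored.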

Expressions

-

We now give rules for evaluating an atom, Func, _mod1 or _mod2_ expression (the possible options for ANY). A literal or primitive sl, Fl, _ml, or _cl_ has a fixed value defined by the specification (literals and built-ins). An identifier s, F, _m, or _c_, if not preceded by atom ".", must have an associated variable due to the scoping rules, and returns this variable's value, or causes an error if it has not yet been set. If it is preceded by atom ".", then the atom node is evaluated first; its value must be a namespace, and the result is the value of the identifier's name in the namespace, or an error if the name is undefined. A parenthesized expression such as "(" _modExpr ")" simply returns the result of the interior expression. A block is defined by the evaluation of the statements it contains after all parameters are accepted, as described above. Finally, a list "⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩" or ANY ( "‿" ANY )+ consists grammatically of a list of expressions. To evaluate it, each expression is evaluated in source order and their results are placed as elements of a rank-1 array. The two forms have identical semantics but different punctuation.

+

We now give rules for evaluating an atom, Func, _mod1 or _mod2_ expression (the possible options for ANY). A literal or primitive sl, Fl, _ml, or _cl_ has a fixed value defined by the specification (literals and built-ins). An identifier s, F, _m, or _c_, if not preceded by atom ".", must have an associated variable due to the scoping rules, and returns this variable's value, or causes an error if it has not yet been set. If it is preceded by atom ".", then the atom node is evaluated first; its value must be a namespace, and the result is the value of the identifier's name in the namespace, or an error if the name is undefined. A parenthesized expression such as "(" _modExpr ")" simply returns the result of the interior expression. A block is defined by the evaluation of the statements it contains after all parameters are accepted, as described above.

+

A list "⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩" or ANY ( "‿" ANY )+ consists grammatically of a list of expressions. To evaluate it, each expression is evaluated in source order and their results are placed as elements of a rank-1 array. The two forms have identical semantics but different punctuation. The square bracket notation "[" ⋄? ( EXPR ⋄ )* EXPR ⋄? "]" evaluates expressions in the same way, but makes them into major cells of an array instead of elements. The result is identical to applying the primitive function Merge (>) to a list of the expression results.
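The difference between elements and major cells can be sketched in Python with arrays represented as `(shape, ravel)` pairs. This is a simplified model rather than the specification's definition of Merge: the name `merge` is hypothetical, every expression result is assumed to already be in `(shape, ravel)` form (an atom is modelled as shape `[]` with a one-element ravel), and all results must share one shape.

```python
def merge(cells):
    """Combine equally shaped (shape, ravel) cells into a single array."""
    if not cells:
        raise ValueError("bracket notation needs at least one expression")
    cell_shape = list(cells[0][0])
    ravel = []
    for shape, data in cells:  # source order, left to right
        if list(shape) != cell_shape:
            raise ValueError("all cells must have the same shape")
        ravel.extend(data)     # cell ravels are concatenated in order
    # The new leading axis has one position per cell.
    return ([len(cells)] + cell_shape, ravel)


# Two length-2 lists become the rows of a 2-by-2 array.
table = merge([([2], [1, 2]), ([2], [3, 4])])
```

Here `table` has shape `[2, 2]`, whereas the angle-bracket form would instead give a rank-1 array whose two elements are themselves lists.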

Rules in the table below are function and modifier evaluation.

diff --git a/docs/spec/grammar.html b/docs/spec/grammar.html
index 139a5dd4..c7212568 100644
--- a/docs/spec/grammar.html
+++ b/docs/spec/grammar.html
@@ -19,8 +19,9 @@
 _mod2_   = ( atom "." )? _c_ | _cl_ | "(" _m2Expr_ ")" | _blMod2_
 _mod1    = ( atom "." )? _m  | _ml  | "(" _m1Expr  ")" | _blMod1
 Func     = ( atom "." )?  F  |  Fl  | "(" FuncExpr ")" |  BlFunc
-atom     = ( atom "." )?  s  |  sl  | "(" subExpr  ")" |  blSub | list
-list     = "⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩"
+atom     = ( atom "." )?  s  |  sl  | "(" subExpr  ")" |  blSub | array
+array    = "⟨" ⋄? ( ( EXPR ⋄ )* EXPR ⋄? )? "⟩"
+         | "[" ⋄?   ( EXPR ⋄ )* EXPR ⋄?    "]"
 subject  = atom | ANY ( "‿" ANY )+

Starting at the highest-order objects, modifiers have simple syntax. In most cases the syntax for ← and ↩ is the same, but only ↩ can be used for modified assignment. The export arrow ⇐ can be used in the same ways as ←, but it can also be used at the beginning of a header to force a namespace result, or with no expression on the right in an EXPORT statement.

@@ -55,13 +56,14 @@

The target of subject assignment can be compound to allow for destructuring. List and namespace assignment share the nodes lhsList and lhsStr and cannot be completely distinguished until execution. The term sl in LHS_SUB is used for header inputs below: as an additional rule, it cannot be used in the lhs term of a subExpr node.

 NAME     = s | F | _m | _c_
-LHS_SUB  = "·" | lhsList | sl
+LHS_SUB  = "·" | lhsList | lhsArray | sl
 LHS_ANY  = NAME | LHS_SUB | "(" LHS_ELT ")"
 LHS_ATOM = LHS_ANY | "(" lhsStr ")"
 LHS_ELT  = LHS_ANY | lhsStr
 LHS_ENTRY= LHS_ELT | lhs "⇐" NAME
 lhsStr   = LHS_ATOM ( "‿" LHS_ATOM )+
 lhsList  = "⟨" ⋄? ( ( LHS_ENTRY ⋄ )* LHS_ENTRY ⋄? )? "⟩"
+lhsArray = "[" ⋄?   ( LHS_ELT   ⋄ )* LHS_ELT   ⋄?    "]"
 lhsComp  = LHS_SUB | lhsStr | "(" lhs ")"
 lhs      = s | lhsComp
 
diff --git a/docs/spec/index.html b/docs/spec/index.html
index 5aa2bc7d..fe5e4471 100644
--- a/docs/spec/index.html
+++ b/docs/spec/index.html
@@ -5,7 +5,7 @@

BQN specification

-

This document, and the others in this directory (linked in the list below) make up the pre-versioning BQN specification. The specification differs from the documentation in that its purpose is only to describe the exact details of BQN's operation in the most quickly accessible way, rather than to explain the central ideas of BQN functionality and how it might be used. The core of BQN, which excludes system-provided values, is now almost completely specified. Planned changes to the specification are tracked on this page.

+

This document, and the others in this directory (linked in the list below) make up the pre-versioning BQN specification. The specification differs from the documentation in that its purpose is only to describe the exact details of BQN's operation in the most quickly accessible way, rather than to explain the central ideas of BQN functionality and how it might be used. The core language specification, which excludes the system-provided values section (to be versioned separately), is now considered a version 1.0 draft: all functionality to be included in version 1.0 is described, but it must be reviewed to confirm that it's described as intended.

Under this specification, a language implementation is a BQN pre-version implementation if it behaves as specified for all input programs. It is a BQN pre-version implementation with extensions if it behaves as specified in all cases where the specification does not require an error, but behaves differently in at least one case where it requires an error. It is a partial version of either of these if it doesn't conform to the description but differs from a conforming implementation only by rejecting with an error some programs that the conforming implementation accepts. As the specification is not yet versioned, other instances of the specification define these terms in different ways. An implementation can use one of these terms if it conforms to any instance of the pre-versioning BQN specifications that defines them. When versioning is begun, there will be only one specification for each version.

The following documents are included in the BQN specification. A BQN program is a sequence of Unicode code points: to evaluate it, it is converted into a sequence of tokens using the token formation rules, then these tokens are arranged in a syntax tree according to the grammar, and then this tree is evaluated according to the evaluation semantics. The program may be evaluated in the presence of additional context such as a filesystem or command-line arguments; this context is presented to the program and manipulated through the system-provided values.

The definition for an identifier is chosen from the potential definitions based on their containing scopes: it is the one whose containing scope does not contain or match the containing scope of any other potential definition. If for any identifier there is no definition, then the program is not valid and results in an error. This can occur if the identifier has no potential definition, and also if two potential definitions appear in the same scope. In fact, under this scheme it is never valid to make two definitions with the same name at the top level of a single scope, because both definitions would be potential definitions for the one that comes second in program order. Both definitions have the same containing scope, and any potential definition must contain or match this scope, so no potential definition can be selected.

-

The definition of program order for identifier tokens follows the order of BQN execution. It corresponds to the order of a particular traversal of the abstract syntax tree for a program. To find the relative ordering of two identifiers in a program, we consider the highest-depth node that they both belong to; in this node they must occur in different components, or that component would be a higher-depth node containing both of them. In most nodes, the program order goes from right to left: components further to the right come earlier in program order. The exceptions are PROGRAM, BODY, list, subject (for stranding), lhsList, lhsStr, and body structure (I_CASE, A_CASE, IMM_BLK, ARG_BLK, and blSub) nodes, in which program order goes in the opposite order, from left to right.

+

The definition of program order for identifier tokens follows the order of BQN execution. It corresponds to the order of a particular traversal of the abstract syntax tree for a program. To find the relative ordering of two identifiers in a program, we consider the highest-depth node that they both belong to; in this node they must occur in different components, or that component would be a higher-depth node containing both of them. In most nodes, the program order goes from right to left: components further to the right come earlier in program order. The exceptions are PROGRAM, BODY, array, subject (for stranding), lhsList, lhsArray, lhsStr, and body structure (I_CASE, A_CASE, IMM_BLK, ARG_BLK, and blSub) nodes, in which program order goes in the opposite order, from left to right.

A subject label is the s term in a blSub node. As part of a header, it can serve as the definition for an identifier. However, it's defined to be a syntax error if another instance of this identifier appears.

Special names

Special names such as 𝕩 or 𝕣 refer to variables, but have no definition and do not use scoping. Instead, they always refer to the immediately enclosing scope, and are defined automatically when the block is evaluated.

diff --git a/docs/spec/token.html b/docs/spec/token.html
index b935b9b9..c54fb384 100644
--- a/docs/spec/token.html
+++ b/docs/spec/token.html
@@ -43,7 +43,7 @@
-Punctuation | ←⇐↩(){}⟨⟩‿⋄,. and newline
+Punctuation | ←⇐↩(){}⟨⟩[]‿⋄,. and newline
-- cgit v1.2.3