| -rw-r--r-- | README.md | 14 |
| -rw-r--r-- | doc/context.md | 82 |
2 files changed, 89 insertions, 7 deletions
@@ -16,7 +16,7 @@ It incorporates concepts developed over years of APL practice:
 But BQN is redesigned from the ground up, with brand new ideas to make these paradigms easier to use and less likely to fail.
 * The **based array model** makes non-arrays a fundamental part of the language, and removes the surprise of floating arrays and the hassle of explicit boxes. New **array notation** eliminates the gotchas of [stranding](https://aplwiki.com/wiki/Strand_notation).
-* A **context-free grammar** where a value's syntactic role is determined by its spelling makes it easier for machines and humans to understand code.
+* A [**context-free grammar**](doc/context.md) where a value's syntactic role is determined by its spelling makes it easier for machines and humans to understand code.
 * Oh, and it naturally leads to **first-class functions**, a feature often missed in APL.
 * The **new symbols** for built-in functionality allow the syntactic role of a primitive to be distinguished at a glance, and aim to be more consistent and intuitive.
@@ -44,15 +44,15 @@ BQN syntax consists of expressions where computation is done with a little organ
 ### Expressions
-Like APL, BQN values have one of four *syntactic classes*:
+Like APL, BQN uses four *syntactic roles* for values in expressions:
 * **Values**, like APL arrays or J nouns
 * **Functions**, or verbs in J
 * **Modifiers**, like APL monadic operators or J adverbs
 * **Compositions**, like APL dyadic operators or J conjunctions.
-These classes work exactly like they do in APL, with functions applying to one or two arguments, modifiers taking a single function or value on the left, and compositions taking a function or value on each side.
+These roles work exactly like they do in APL, with functions applying to one or two arguments, modifiers taking a single function or value on the left, and compositions taking a function or value on each side.
-Unlike APL, in BQN the syntactic class of a value is determined purely by the way it's spelled: a lowercase first letter (`name`) makes it a value, an uppercase first letter (`Name`) makes it a function, and underscores are used for modifiers (`_name`) and compositions (`_name_`). Below, the function `{𝕎𝕩}` treats its left argument `𝕎` as a function and its right argument `𝕩` as an argument. With a list of functions, we can make a table of the square and square root of a few numbers:
+Unlike APL, in BQN the syntactic role of a value is determined purely by the way it's spelled: a lowercase first letter (`name`) makes it a value, an uppercase first letter (`Name`) makes it a function, and underscores are used for modifiers (`_name`) and compositions (`_name_`). Below, the function `{𝕎𝕩}` treats its left argument `𝕎` as a function and its right argument `𝕩` as an argument. With a list of functions, we can make a table of the square and square root of a few numbers:
     ⟨×˜,√⟩ {𝕎𝕩}⌜ 1‿4‿9
     ┌
@@ -60,7 +60,7 @@ Unlike APL, in BQN the syntactic class of a value is determined purely by the wa
     1 2 3
     ┘
-BQN's built-in operations also have patterns to indicate the syntactic class: modifiers (`` ˜¨˘⁼⌜´` ``) are all superscript characters, and compositions (`βββΈββΎβββ`) all have an unbroken circle (two functions `⌽⍉` have broken circles with lines through them). Every other built-in constant is a function, although the special symbols `¯`, `∞`, and `π` are used as part of numeric literal notation.
+BQN's built-in operations also have patterns to indicate the syntactic role: modifiers (`` ˜¨˘⁼⌜´` ``) are all superscript characters, and compositions (`βββΈββΎβββ`) all have an unbroken circle (two functions `⌽⍉` have broken circles with lines through them). Every other built-in constant is a function, although the special symbols `¯`, `∞`, and `π` are used as part of numeric literal notation.
 ### Special syntax
@@ -179,9 +179,9 @@ If added, [sets and dictionaries](#sets-and-dictionaries) would also use a list-
 ### Explicit functions
-Functions, modifiers, and combinators can be defined using curly braces `{}`. The contents are simply a sequence of expressions, where each is evaluated and the result of the last is returned. This result can have any value, and its syntactic class in the calling context is determined by the normal rules: functions return values and modifiers and compositions return functions. Operations defined in this way have lexical scope.
+Functions, modifiers, and combinators can be defined using curly braces `{}`. The contents are simply a sequence of expressions, where each is evaluated and the result of the last is returned. This result can have any value, and its syntactic role in the calling context is determined by the normal rules: functions return values and modifiers and compositions return functions. Operations defined in this way have lexical scope.
-The special values `𝕨` and `𝕩`, which stand for arguments, and `𝕗` and `𝕘`, which stand for operands, are available inside curly braces. Like ordinary names, the lowercase forms indicate values and the uppercase forms `𝕎𝕏𝔽𝔾` indicate functions. The type (including syntactic class) of the result is determined by its contents: a composition contains `𝕘`, a modifier contains `𝕗` but not `𝕘`, and a function contains neither.
+The special values `𝕨` and `𝕩`, which stand for arguments, and `𝕗` and `𝕘`, which stand for operands, are available inside curly braces. Like ordinary names, the lowercase forms indicate values and the uppercase forms `𝕎𝕏𝔽𝔾` indicate functions. The type (including syntactic role) of the result is determined by its contents: a composition contains `𝕘`, a modifier contains `𝕗` but not `𝕘`, and a function contains neither.
 A modifier or composition can be evaluated twice: once when passed operands and again when the resulting function is passed arguments. If it contains `𝕨` or `𝕩`, the first evaluation simply remembers the operands, and the contents will be executed only on the second evaluation, when the arguments are available. If it doesn't contain these, then the contents are executed on the first evaluation and the result is treated as a function.
diff --git a/doc/context.md b/doc/context.md
new file mode 100644
index 00000000..7a9e218d
--- /dev/null
+++ b/doc/context.md
@@ -0,0 +1,82 @@
+# BQN's context-free grammar
+
+APL has a problem. To illustrate, let's look at an APL expression:
+
+    a b c d e
+
+It is impossible to say anything about this sentence! Is `c` a dyadic operator being applied to `b` and `d`, or are `b` and `d` two dyadic functions being applied to arrays? In contrast, expressions in C-like or Lisp-like languages show their structure of application:
+
+    b(a, d(c)(e))
+    (b a ((d c) e))
+
+In each case, some values are used as inputs to functions while others are the functions being applied. The result of a function can be used either as an input or as a function again. These expressions correspond to the APL expression where `a` and `e` are arrays, `b` and `c` are functions, and `d` is a monadic operator. However, these syntactic classes have to be known to see what the APL expression is doing; they are a form of context that is required for a reader to know the grammatical structure of the expression. In a context-free grammar like that of simple C or Lisp expressions, a value's grammatical role is part of the expression itself, indicated with parentheses: they come after the function in C and before it in Lisp. Of course, a consequence of using parentheses in this way is having a lot of parentheses. BQN uses a different method to annotate grammatical role:
+
+    a B C _d e
+
+Here, the lowercase spelling indicates that `a` and `e` are to be treated as values ("arrays" in APL) while uppercase variables `B` and `C` are used as functions and `_d` is a modifier ("monadic operator"). Like parentheses for function application, the spelling is not inherent to the variable values used, but instead indicates their grammatical role in this particular expression. While we still don't know anything about what values `a`, `b`, `c`, and so on have, we know how they interact in this line of code.
+
+## Is grammatical context really a problem?
+
+Yes, in the sense of [problems with BQN](../problems.md). A grammar that uses context is harder for humans to read and machines to execute. A particular difficulty is that parts of an expression you don't yet understand can interfere with parts you do, making it difficult to work through an unknown codebase.
+
+One difficulty beginners to APL will encounter is that code in APL at first appears like a string of undifferentiated symbols. For example, a tacit Unique Mask implementation `⍳⍨=⍳∘≢` consists of six largely unfamiliar characters with little to distinguish them (in fact, the one obvious bit of structure, the repeated `⍳`, is misleading as it means different things in each case!). Simply placing parentheses into the expression, like `(⍳⍨)=(⍳∘≢)`, can be a great help to a beginner, and part of learning APL is to naturally see where the parentheses should go. The equivalent BQN expression, `⊐˜=↕∘≠`, will likely appear equally intimidating at first, but the path to learning which things apply to which is much shorter: rather than learning the entire list of APL primitives, a beginner just needs to know that superscript characters like `˜` are modifiers and characters like `∘` with unbroken circles are compositions before beginning to learn the BQN grammar that will explain how to tie the various parts together.
+
+This sounds like a distant concern to a master of APL or a computer that has no difficulty memorizing a few dozen glyphs. Quite the opposite: the same concern applies whenever you begin work with an unfamiliar codebase! Many APL programmers even enforce variable name conventions to ensure they know the class of a variable. By having such a system built in, BQN keeps you from having to rely on programmers following a style guide, and also allows greater flexibility, as we'll see later.
+
+Shouldn't a codebase define all the variables it uses, so we can see their class from the definition? Not always: consider that in a language with libraries, code might be imported from dependencies. Many APLs also have some dynamic features that can allow a variable to have more than one class, such as the `⍺←⊢` pattern in a dfn that makes `⍺` an array in the dyadic case but a function in the monadic case. Regardless, searching for a definition somewhere in the code is certainly a lot more work than knowing the class right away! One final difficulty is that even one unknown can delay understanding of an entire expression. Suppose in `A B c`, `B` is a function and `c` is an array, and both values are known to be constant. If `A` is known to be a function (even if its value is not yet known), its right argument `B c` can be evaluated ahead of time. But if `A`'s type isn't known, it's impossible to know if this optimization is worth it, because if it is an array, `B` will instead be called dyadically.
+
+## BQN's spelling system
+
+BQN's expression grammar is a simplified version of the typical APL, removing some oddities like niladic functions and the two-glyph Outer Product operator. Values can be used in four syntactic roles:
+
+| BQN         | APL              | J
+|-------------|------------------|------
+| Value       | Array            | Noun
+| Function    | Function         | Verb
+| Modifier    | Monadic operator | Adverb
+| Composition | Dyadic operator  | Conjunction
+
+BQN primitives have only one spelling, and a fixed role (but their values can be used in a different role by storing them in variables). Superscript glyphs `` ˜¨˘⁼⌜´` `` are used for modifiers, and glyphs `βββΈββΎββββΆβ` with an unbroken circle are compositions. Other primitives are functions. String and numeric literals are values.
+
+BQN's variables use another system. Unlike primitives, variables can be spelled as any of the four syntactic types. Its value remains the same, as the spelling only indicates how this value is used. A variable spelled with a lowercase first letter, like `var`, is a value. Spelled with an uppercase first letter, like `Var`, it is a function. Underscores are placed where operands apply to indicate a modifier `_var` or composition `_var_`. Other than the first letter or underscore, variables are case-insensitive.
+
+The associations between spelling and syntactic role are considered part of BQN's [token formation rules](../spec/token.md).
+
+## BQN's grammar
+
+A formal treatment is included in [the spec](../spec/grammar.md). BQN's grammar (the ways syntactic roles interact) follows the original APL model (plus trains) closely, with allowances for new features like list notation. In order to keep BQN's syntax context-free, the syntactic role of any expression must be known, just like tokens.
+
+BQN fails to be context-free in one fairly mild way: the role of a brace construct `{}` is determined by which special arguments it uses. This means the grammar is not context-free in the technical sense, but since the "context" in this case is carried between the braces and cannot be left out it's not harmful in the same way as variable values. Informally it still makes sense to call BQN "context-free".
+
+Here is a table of the APL-derived operator and function application rules:
+
+| left  | main  | right | output   | name
+|-------|-------|-------|----------|------
+|       | `F`   | `x`   | Value    | Monadic function
+| `w`   | `F`   | `x`   | Value    | Dyadic function
+|       | `F`   | `G`   | Function | 2-train
+| `F*`  | `G`   | `H`   | Function | 3-train
+| `F*`  | `_m`  |       | Function | Modifier
+| `F*`  | `_c_` | `G*`  | Function | Composition
+|       | `_c_` | `G*`  | Modifier | Partial application
+| `F*`  | `_c_` |       | Modifier | Partial application
+
+A function with an asterisk indicates that a value can also be used: in these positions there is no difference between function and value spellings. Operator applications bind more tightly than functions, and associate left-to-right while functions associate right-to-left.
+
+BQN lists can be written with angle brackets `⟨elt0,elt1,…⟩` or ligatures `elt0‿elt1‿…`. In either case the elements can have any type, and the result is a value.
+
+The statements in a brace block, function, or operator can also be any role, including the return value at the end. These roles have no effect: outside of braces, a function always returns an array regardless of how it was defined.
+
+## Mixing roles
+
+BQN's basic types align closely with its syntactic roles: functions, modifiers, and compositions are all basic types, while values are split into numbers, characters, and arrays. This is no accident, and usually values will be used in roles that match their underlying type. However, the ability to use a role that doesn't match the type is very useful.
+
+Any type can be passed as an argument to a function, or as an operand, by treating it as an array. This means that BQN fully supports Lisp-style functional programming, where functions can be used as values.
+
+It can also be useful to treat an array as a function, in which case it applies as a constant function. This rule is useful with most built-in operators. For example, `F⎉1` uses a constant for the rank even though in general a function can be given, and `a⌾(b⊸/)` inserts the values in `a` into the positions selected by `b`, ignoring the old values rather than applying a function to them.
+
+Other mixes of roles are generally not useful. While a combination such as treating a function as a modifier is allowed, attempting to apply it to an operand will fail. Only a modifier can be applied as a modifier and only a composition can be applied as a composition. Only a function or array can be applied as a function.
+
+It's also worth noting that something that appears to be an array may actually be a function! For example, the result of `𝕨˜𝕩` may not always be `𝕨`. `𝕨˜𝕩` is exactly identical to `𝕎˜𝕩`, which gives `𝕩𝕎𝕩`. If `𝕎` is a number, character, or array, that's the same as `𝕨`, but if it is a function, then it will be applied.
+
+The primary way to change the role of a value in BQN is to use a name, including one of the arguments to a brace function. In particular, you can use `{𝔽}` to convert a value operand into a function. Converting a function to a value is more difficult. Often an array of functions is wanted, in which case they can be stranded together; otherwise it's probably best to give the function a name. Picking a function out of a list, for example `⊑⟨+⟩` will give it as a value.
