Diffstat (limited to 'spec/token.md')
| -rw-r--r-- | spec/token.md | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/spec/token.md b/spec/token.md
index 3c2235eb..52e46d25 100644
--- a/spec/token.md
+++ b/spec/token.md
@@ -1,3 +1,5 @@
+*View this file with results and syntax highlighting [here](https://mlochbaum.github.io/BQN/spec/token.html).*
+
 This page describes BQN's token formation rules (token formation is also called scanning). Most tokens in BQN are a single character long, but quoted characters and strings, identifiers, and numbers can consist of multiple characters, and comments, spaces, and tabs are discarded during token formation.
 
 BQN source code should be considered as a series of unicode code points, which we refer to as "characters". The separator between lines in a file is considered to be a single character, newline, even though some operating systems such as Windows typically represent it with a two-character CRLF sequence. Implementers should note that not all languages treat unicode code points as atomic, as exposing the UTF-8 or UTF-16 representation instead is common. For a language such as JavaScript that uses UTF-16, the double-struck characters `𝕨𝕎𝕩𝕏𝕗𝔽𝕘𝔾` are represented as two 16-bit surrogate characters, but BQN treats them as a single unit.
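
The last context paragraph of this hunk warns implementers about UTF-16 hosts and CRLF line endings. A minimal TypeScript sketch of what that could mean in practice follows. The helper names `toCodePoints` and `normalizeNewlines` are hypothetical and do not come from the BQN spec or any BQN implementation, and replacing CRLF with a plain newline before scanning is only one assumed way to satisfy the "single newline character" rule.

```typescript
// Hypothetical helper (not from the BQN spec): split source text into
// code points instead of UTF-16 code units.
function toCodePoints(source: string): string[] {
  // Array.from uses the string iterator, which yields whole code points,
  // so a surrogate pair such as the one encoding 𝕨 stays together.
  return Array.from(source);
}

// Hypothetical helper: collapse each CRLF pair into one newline character.
// Assumption: normalizing before scanning is acceptable for the host.
function normalizeNewlines(source: string): string {
  return source.replace(/\r\n/g, "\n");
}

const src = "𝕨+𝕩\r\n";
const chars = toCodePoints(normalizeNewlines(src));

console.log(src.length);   // 7: UTF-16 code units, two each for 𝕨 and 𝕩
console.log(chars.length); // 4: 𝕨, +, 𝕩, newline
```

Indexing the raw string with `src[0]` would return a lone surrogate rather than `𝕨`, which is why a scanner on such a host would likely want to work on the code point array instead.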
