From 29bc342af8527f9bada1d011b17d7fd87d4ebdad Mon Sep 17 00:00:00 2001
From: Marshall Lochbaum
Date: Tue, 25 Jan 2022 19:03:40 -0500
Subject: Editing

---
 docs/doc/arrayrepr.html         | 8 ++++----
 docs/implementation/codfns.html | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

(limited to 'docs')

diff --git a/docs/doc/arrayrepr.html b/docs/doc/arrayrepr.html
index 43c0ad51..0e59315a 100644
--- a/docs/doc/arrayrepr.html
+++ b/docs/doc/arrayrepr.html
@@ -203,16 +203,16 @@

Strand notation is mainly useful for simple elements that don't require parentheses. A strand with one set of parentheses is no shorter than using list notation (but could look nicer), and one with more parentheses will be longer.
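For instance, with hypothetical values (my example, not from the original page), a strand with one parenthesized element matches the list form character for character, while a second set of parentheses makes the strand longer:

    (2+2)‿5        # 7 characters, same as ⟨2+2,5⟩
⟨ 4 5 ⟩

    (2+2)‿(3+3)    # 11 characters, versus 9 for ⟨2+2,3+3⟩
⟨ 4 6 ⟩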

Why not whitespace?

In APL, two or more arrays that are next to each other in the code are combined into a list, a convention known as stranding. So 2 3 5 + 1 adds a list to a number. This looks substantially cleaner than a BQN list, so it's reasonable to ask: why give it up? I admit I've been jealous of that clean look at times. But I'm also finding I view it with a certain unease: what's hiding in that space?
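For comparison (my example, not from the original page), the APL-style strand 2 3 5 + 1 is written in BQN with explicit ligatures:

    2‿3‿5 + 1
⟨ 3 4 6 ⟩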

-

This feeling comes because the language is doing something I didn't ask it to, and it's justified. Consider the BQN expression a +˝∘×⎉1‿∞ b for a matrix product. If we remove the ‿ in favor of a stranding space then we have ⎉1 ∞ b. There's no good rule to say which of the three subjects 1, ∞, and b to strand together. For modifiers like Rank and Depth we'd like stranding to bind more tightly than modifier application, but in order to actually use arguments for these modifiers the modifier application should take precedence. Similar but simpler cases show up more often when binding an argument to a function. The difference between the following two statements is obvious in BQN, but with space-for-stranding one of them would require a complicating parenthesis.

+

This feeling comes because the language is doing something I didn't ask it to, and it's well justified. Consider the BQN expression a +˝∘×⎉1‿∞ b for a matrix product. If we remove the ‿ in favor of a stranding space then we have ⎉1 ∞ b. There's no good rule to say which of the three subjects 1, ∞, and b to strand together. For modifiers like Rank and Depth we'd like stranding to bind more tightly than modifier application, but in order to actually use arguments for these modifiers the modifier application should take precedence. Similar but simpler cases show up more often when binding an argument to a function. The difference between the following two statements is obvious in BQN, but with space-for-stranding one of them would require a complicating parenthesis.

↗️
    3 1⊸+⊸× 5
20

    3‿1⊸+⊸× 5
⟨ 40 30 ⟩
 
-

Explicit stranding is also more general, because it applies equally to elements of any role. 2‿+‿3 is a perfectly fine list in BQN—maybe it's part of an AST—while 2 + 3 is clearly not a list. J and K restrict their stranding even further, to numbers only. It does mean that issues with stranding show up in fewer cases, but it also means that changing one element of a list from a constant to a variable requires rewriting the whole list.

-

Why can't the more explicit list notation ⟨a,b,c⟩ drop the separators? This is also largely for reasons of generality, which is even more important given that ⟨⟩ is the more general-purpose list notation. Writing ⟨÷,-,4⟩ without the , won't go well. For something like ⟨2×c,b-1⟩, maybe the interpreter could sort it out but it would be pretty confusing. Pretty soon you're going through the list character by character trying to figure out which space is actually a separator. And cursing, probably.

-

Fortunately, I find that after a reasonable period of adjustment typing ligatures instead of spaces doesn't feel strange, and reading code is improved overall by the more explicit notation. A minor note is that lists of literal numbers, where APL-style stranding is best, tend to show up more in the snippets that beginners write to test out the language than in programs even in the tens of lines. So this issue sticks out in first experiences with BQN, but will probably come up less later on.

+

Explicit stranding is also more general, because it applies equally to elements of any role. 2‿+‿3 is a perfectly fine list in BQN—maybe it's part of an AST—while 2 + 3 is clearly not a list. Meanwhile J and K restrict their stranding even further, to numbers only. It does mean that issues with stranding show up in fewer cases, but it also means that changing one element of a list from a constant to a variable requires rewriting the whole list.
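A quick illustration with made-up input (results shown as a recent CBQN REPL prints them; formatting may differ):

    2‿+‿3    # a list holding two numbers and a function
⟨ 2 + 3 ⟩

    2 + 3    # ordinary addition, not a list
5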

+

Why couldn't the more explicit list notation ⟨a,b,c⟩ drop the separators? This is also largely for reasons of generality—even more important here since ⟨⟩ is the more general-purpose list notation. Writing ⟨÷,-,4⟩ without the , won't go well. For something like ⟨2×c,b-1⟩, maybe the interpreter could sort it out but it would be pretty confusing. Pretty soon you're going through the list character by character trying to figure out which space is actually a separator. And cursing, probably.
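To make this concrete (a hypothetical snippet, with output as a recent CBQN prints it), removing the separators leaves a single expression, which evaluates instead of forming a list:

    ⟨÷,-,4⟩    # three elements: two functions and a number
⟨ ÷ - 4 ⟩

    ⟨÷-4⟩      # one element: ÷ applied to -4
⟨ ¯0.25 ⟩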

+

Fortunately, I find that after a reasonable period of adjustment typing ligatures instead of spaces doesn't feel strange, and reading code is improved overall by the more explicit notation. A minor note is that lists of literal numbers, where APL-style stranding is best, tend to show up more in the snippets that beginners write to test out the language than in programs even in the tens of lines. So this issue sticks out in first experiences with BQN, but will come up less later on.

Array notation?

BQN has literal notation for lists only right now. To get an array with rank other than 1, either reshape a list, or merge a list of arrays:

↗️
    3‿2 ⥊ ⟨2,3, 4,1, 0,5⟩
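The merge route builds the same array from a list of rows (a sketch assuming the 3‿2 example above):

    > ⟨2‿3, 4‿1, 0‿5⟩   # Merge: rows become the trailing axes of a rank-2 array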
diff --git a/docs/implementation/codfns.html b/docs/implementation/codfns.html
index 65ac1391..ba71fb26 100644
--- a/docs/implementation/codfns.html
+++ b/docs/implementation/codfns.html
@@ -28,7 +28,7 @@
 

The sort of static guarantee I want is not really a type system but an axis system. That is, if I take a+b I want to know that the arithmetic mapping makes sense because the two variables use the same axis. And I want to know that if a and b are compatible, then so are i⊏a and i⊏b, but not a and i⊏b. I could use a form of Hungarian notation for this, and write ia←i⊏a and ib←i⊏b, but it's inconvenient to rewrite the axis every time the variable appears, and I'd much prefer a computer checking agreement rather than my own fallible self.
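A sketch of the agreement such a system would check, with made-up values for a, b, and the index list i:

    a←10‿20‿30 ⋄ b←1‿2‿3 ⋄ i←2‿0
    (i⊏a) + i⊏b    # both arguments selected along the same axis: fine
⟨ 33 11 ⟩

    a + i⊏b        # mixed axes: length 3 against length 2, a shape error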

Performance

In his Co-dfns paper Aaron compares to nanopass implementations of his compiler passes. Running on the CPU and using Chez Scheme (not Racket, which is also presented) for nanopass, he finds Co-dfns is up to 10 times faster for large programs. The GPU is of course slower for small programs and faster for larger ones, breaking even above 100,000 AST nodes—quite a large program. I think comparing the self-hosted BQN compiler to the one in dzaima/BQN shows that this large improvement is caused as much by nanopass being slow as by Co-dfns being fast.

-

The self-hosted compiler running in CBQN reachej full performance at about 1KB of dense source code. On large files it achieves speeds around 3MB/s, about two-thirds as fast as dzaima/BQN's compiler. This compiler was written in Java by dzaima in a much shorter time than the self-hosted compiler, and is equivalent for benchmarking purposes. While there are minor differences in syntax accepted and the exact bytecode output, I'm sure that either compiler could be modified to match the other with negligible changes in compilation time. The Java compiler is written with performance in mind, but dzaima has expended only a moderate amount of effort to optimize it.

+

The self-hosted compiler running in CBQN reaches full performance at about 1KB of dense source code. On large files it achieves speeds around 3MB/s, about two-thirds as fast as dzaima/BQN's compiler. This compiler was written in Java by dzaima in a much shorter time than the self-hosted compiler, and is equivalent for benchmarking purposes. While there are minor differences in syntax accepted and the exact bytecode output, I'm sure that either compiler could be modified to match the other with negligible changes in compilation time. The Java compiler is written with performance in mind, but dzaima has expended only a moderate amount of effort to optimize it.

A few factors other than the speed of the nanopass compiler might partly cause the discrepancy, or otherwise be worth taking into account. I doubt that these can add up to a factor of 15, so I think that nanopass is simply not as fast as more typical imperative compiler methods.