| author    | Marshall Lochbaum <mwlochbaum@gmail.com> | 2022-10-02 21:03:25 -0400 |
| committer | Marshall Lochbaum <mwlochbaum@gmail.com> | 2022-10-02 21:03:25 -0400 |
| commit    | 7f2cea6c9eef68fe2521508d87de7d37e6c0edbf (patch) | |
| tree      | d0b52eb05081cc87b44ec3c89d0fd3af0b5133fb /docs | |
| parent    | 86497094c33e53600917b01db592650e6504e706 (diff) | |
Co-dfns no longer literate
Diffstat (limited to 'docs')
| -rw-r--r-- | docs/implementation/codfns.html | 7 |
1 file changed, 3 insertions, 4 deletions
diff --git a/docs/implementation/codfns.html b/docs/implementation/codfns.html
index 066b99ce..09401ea1 100644
--- a/docs/implementation/codfns.html
+++ b/docs/implementation/codfns.html
@@ -5,7 +5,7 @@
 </head>
 <div class="nav">(<a href="https://github.com/mlochbaum/BQN">github</a>) / <a href="../index.html">BQN</a> / <a href="index.html">implementation</a></div>
 <h1 id="co-dfns-versus-bqns-implementation"><a class="header" href="#co-dfns-versus-bqns-implementation">Co-dfns versus BQN's implementation</a></h1>
-<p><em>Co-dfns is under active development so this document might not reflect its current state. Last update 2022-07-18.</em></p>
+<p><em>Co-dfns is under active development so this document might not reflect its current state. Last update 2022-10-02.</em></p>
 <p>The BQN self-hosted compiler is directly inspired by the <a href="https://github.com/Co-dfns/Co-dfns">Co-dfns</a> project, a compiler for a subset of <a href="../doc/fromDyalog.html">Dyalog APL</a>. I'm very grateful to Aaron for showing that array-oriented compilation is even possible! In addition to the obvious difference of target language, BQN differs from Co-dfns both in goals and methods.</p>
 <p>The shared goals of BQN and Co-dfns are to implement a compiler for an array language with whole-array operations. This provides the theoretical benefit of a short <em>critical path</em>, which in practice means that both compilers can make good use of a GPU or a CPU's vector instructions simply by providing an appropriate runtime (Co-dfns has good runtimes—an ArrayFire program on the GPU and Dyalog APL on the CPU—while CBQN isn't at the same level yet). The two implementations also share a preference for working "close to the metal" by passing around arrays of numbers rather than creating abstract types to work with data. Objects are right out. These choices lead to a compact source code implementation, and may have some benefits in terms of how easy it is to write and understand the compiler.</p>
 <h2 id="compilation-strategy"><a class="header" href="#compilation-strategy">Compilation strategy</a></h2>
@@ -17,8 +17,7 @@
 <h2 id="error-reporting"><a class="header" href="#error-reporting">Error reporting</a></h2>
 <p>Co-dfns initially didn't check for compilation errors, but has started to add some checks with messages. It's behind BQN, which has complete error checking and good error messages, and includes source positions in compiler errors as well as in the compiled code for use in runtime errors. Position tracking and error checking add up to a little more than 20% overhead for the compiler, both in runtime and lines of code. And improving the way errors are reported once found has no cost for working programs, because reporting code only needs to be run if there's a compiler error. The only thing that really takes advantage of this now is the reporting for bracket matching, which goes over all brackets with a stack-based (not array-oriented or parallel) method.</p>
 <h2 id="comments"><a class="header" href="#comments">Comments</a></h2>
-<p>At the time I began work on BQN, Aaron advocated the almost complete separation of code from comments (thesis) in addition to his very terse style as a general programming methodology. I find that such a style makes it hard to connect the documentation to the code, and is very slow in providing a summary or reminder of functionality that a comment might. I chose one comment per line as a better balance of compactness and faster accessibility.</p>
-<p>Subsequently Co-dfns undertook perhaps the greatest shift in comment-to-code ratio that's ever happened. Aaron began by adding a full-line comment for every group of two to three lines, not too far from BQN. Then he converted the whole thing to a literate program using the noweb framework and is working on writing the prose half, perhaps a half page per line of code. This could make development harder but it seems like a great way to make it easier to get into a difficult style of programming, so I'm interested to see how it goes.</p>
+<p>At the time I began work on BQN, Aaron advocated the almost complete separation of code from comments (thesis) in addition to his very terse style as a general programming methodology. I find that such a style makes it hard to connect the documentation to the code, and is very slow in providing a summary or reminder of functionality that a comment might. I chose one comment per line as a better balance of compactness and faster accessibility. Co-dfns has since fluctuated wildly in its approach to documenting the code, including a stab at literate programming with perhaps half a page to describe each line for the parts that were finished.</p>
 <h2 id="is-it-a-good-idea"><a class="header" href="#is-it-a-good-idea">Is it a good idea?</a></h2>
 <p>In short: <strong>no</strong>, I don't think it makes sense to use an array style for a compiler without a good reason. BQN uses it so it can self-host while maintaining good performance; Co-dfns uses it to prove it's possible to implement the core of a compiler with low parallel asymptotic complexity. It could also make a fun hobby project, although it's very draining. If the goal is to produce a working compiler then my opinion is that using the array style will take longer and require more skill, and the resulting compiler will be slower than a traditional one in a low-level language for practical tasks. Improvements in methodology could turn this around, but I'm pessimistic.</p>
 <p>It's important to note that my judgment here applies specifically to implementing compilers. Many people already apply APL successfully to problems in finance, or NumPy in science, TensorFlow in machine learning, and so on. Whatever their other qualities, these programs can be developed quickly and have competitive performance. Other problems, such as sparse graph algorithms, are unlikely to be approachable with array programming (and the <a href="https://en.wikipedia.org/wiki/P-complete">P-complete</a> problems are thought to not be efficiently parallelizable, which would rule out any array solution in the sense used here). Compiling seems to occupy an interesting spot near the fuzzy boundary of that useful-array region. Is it more inside, or outside?</p>
@@ -35,7 +34,7 @@
 <ul>
 <li>CBQN is a different runtime from Dyalog. We've profiled the compiler and implemented lots of things in SIMD, and I'm confident CBQN runs the compiler at least as fast as Dyalog 17.0, the latest version for Aaron's thesis, would have.</li>
 <li>Co-dfns and BQN use different compilation strategies. I think that my methods are at least as fast, and scale better to a full compiler.</li>
-<li>The portion of the compiler implemented by Co-dfns could be better for arrays than other sections, which in fact seems likely to me—I would say parsing is the worst for array relative to scalar programming.</li>
+<li>The portion of the compiler implemented for the paper could be better for arrays than other sections, which in fact seems likely to me—I would say parsing is the worst for array relative to scalar programming.</li>
 <li>The Co-dfns and nanopass implementations are pass-for-pass equivalent, while BQN and Java are only comparable as a whole. As the passes were chosen to be ideal for Co-dfns, there's some chance this could be slowing down nanopass in a way that doesn't translate to real-world performance.</li>
 </ul>
 <p>I reckon that most of the effect is that the nanopass compiler just isn't very fast. I see our whole-compiler results as superseding Aaron's comparison, and it's worth noting as well that C could probably beat Java for a BQN compiler. So I conclude that, with current technology, array-based compilers are competitive with scalar compilers on the CPU and not vastly better. This result is still remarkable! APL and BQN are high-level dynamically-typed languages, and wouldn't be expected to go near the performance of a compiled language like Java. However, it makes it much harder to recommend array-based compilation than the numbers from the Co-dfns paper would.</p>
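The error-reporting section of the patched document mentions that BQN's bracket-matching diagnostics use a stack-based (not array-oriented) pass that only runs once a compiler error is known to exist, so it costs nothing for working programs. The following is a minimal illustrative sketch of that two-phase idea in Python; it is a hypothetical stand-in, not BQN's actual implementation, and the whole-array phase is simplified to a depth prefix-sum that ignores bracket kind:

```python
# Hypothetical sketch of two-phase bracket checking: a cheap scan-based
# balance filter on the fast path, and a stack-based pass that recovers
# the error position, run only when an error is known to exist.
# Illustrative Python only; not BQN's actual code.
from itertools import accumulate

OPEN, CLOSE = "([{", ")]}"
MATCH = dict(zip(CLOSE, OPEN))  # closer -> expected opener

def brackets_balanced(src):
    """Array-style check: map each character to +1/-1/0, then take a
    running sum (prefix sum). Balanced iff the depth never dips below
    zero and ends at zero. This ignores bracket *kind*, so it is only
    a cheap first filter, not a full validator."""
    deltas = [1 if c in OPEN else -1 if c in CLOSE else 0 for c in src]
    depths = list(accumulate(deltas))
    return all(d >= 0 for d in depths) and (not depths or depths[-1] == 0)

def first_bracket_error(src):
    """Stack-based pass, run only on the error path: returns a
    (position, message) pair for the first problem, or None."""
    stack = []  # (opener, position) pairs still unclosed
    for i, ch in enumerate(src):
        if ch in OPEN:
            stack.append((ch, i))
        elif ch in CLOSE:
            if not stack:
                return i, f"unmatched '{ch}'"
            opener, j = stack.pop()
            if opener != MATCH[ch]:
                return j, f"'{opener}' closed by '{ch}'"
    if stack:
        opener, j = stack[-1]
        return j, f"unclosed '{opener}'"
    return None
```

For example, `first_bracket_error("(a]")` returns `(0, "'(' closed by ']'")`, pointing at the opening bracket so a source position can be reported; the scan-based filter never pays for this bookkeeping on valid input.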
