aboutsummaryrefslogtreecommitdiff
path: root/docs/implementation
diff options
context:
space:
mode:
Diffstat (limited to 'docs/implementation')
-rw-r--r--docs/implementation/primitive/replicate.html2
1 files changed, 1 insertions, 1 deletions
diff --git a/docs/implementation/primitive/replicate.html b/docs/implementation/primitive/replicate.html
index 9e07a754..64cf73a7 100644
--- a/docs/implementation/primitive/replicate.html
+++ b/docs/implementation/primitive/replicate.html
@@ -7,7 +7,7 @@
<h1 id="implementation-of-indices-and-replicate"><a class="header" href="#implementation-of-indices-and-replicate">Implementation of Indices and Replicate</a></h1>
<p>The replicate family of functions contains not just primitives but powerful tools for implementing other functionality. The most important is converting <a href="#booleans-to-indices">bits to indices</a>: AVX-512 extensions implement this natively for various index sizes, and even with no SIMD support at all there are surprisingly fast table-based algorithms for it.</p>
<p><a href="#replicate">General replication</a> is more complex. Branching will slow many useful cases down considerably when using the obvious solution. However, branch-free techniques introduce overhead for larger replication amounts. Hybridizing these seems to be the only way, but it's finicky.</p>
-<p>Replicate by a <a href="#constant-replicate">constant amount</a> (so <code><span class='Value'>𝕨</span></code> is a single number) is not too common in itself, but it's notable because it can be the fastest way to implement outer products and scalar dyadics with prefix agreement.</p>
+<p>Replicate by a <a href="#constant-replicate">constant amount</a> (so <code><span class='Value'>𝕨</span></code> is a single number) is not too common in itself, but it's notable because it can be the fastest way to implement outer products and arithmetic with prefix agreement.</p>
<h2 id="indices"><a class="header" href="#indices">Indices</a></h2>
<p>Branchless algorithms are fastest, but with unbounded values in <code><span class='Value'>𝕨</span></code> a fully branchless algorithm is impossible because you can't write an arbitrary amount of memory without branching. So the best algorithms depend on bounding <code><span class='Value'>𝕨</span></code>. Fortunately the most useful case is that <code><span class='Value'>𝕨</span></code> is boolean.</p>
<h3 id="booleans-to-indices"><a class="header" href="#booleans-to-indices">Booleans to indices</a></h3>