Comments on interpolation search

author: Marshall Lochbaum <mwlochbaum@gmail.com> 2022-02-13 11:28:04 -0500
committer: Marshall Lochbaum <mwlochbaum@gmail.com> 2022-02-13 11:28:04 -0500
commit: cde6f699cd4b76b3b193a8d5bd39242261196fbd (patch)
tree: f247073605808c90ba352ee685bb2835f461b6e6
parent: 1853387a0eca09eb4a9778aad64488602e29dca0 (diff)
2 files changed, 9 insertions, 0 deletions
diff --git a/docs/implementation/primitive/sort.html b/docs/implementation/primitive/sort.html
index 8e69a9ea..12160286 100644
--- a/docs/implementation/primitive/sort.html
+++ b/docs/implementation/primitive/sort.html
@@ -58,3 +58,6 @@
 <p>It's cheap and sometimes worthwhile to trim <code><span class='Value'>𝕨</span></code> down to the range of <code><span class='Value'>𝕩</span></code>. After finding the range of <code><span class='Value'>𝕩</span></code>, binary cut <code><span class='Value'>𝕨</span></code> to a smaller list that contains the range. Stop when the middle element fits inside the range, and search each half of <code><span class='Value'>𝕨</span></code> for the appropriate endpoint of the range.</p>
 <p>If <code><span class='Value'>𝕩</span></code> is small-range, then a lookup table method is possible. Check the length of <code><span class='Value'>𝕨</span></code> because if it's too large then this method is slower—binary search doesn't have to hit every element! The approach is simply to create a table of the number of elements in <code><span class='Value'>𝕨</span></code> with each value, then take a prefix sum. In BQN, <code><span class='Value'>𝕩</span><span class='Function'>⊏+</span><span class='Modifier'>`</span><span class='Paren'>(</span><span class='Number'>1</span><span class='Function'>+⌈</span><span class='Modifier'>´</span><span class='Value'>𝕩</span><span class='Paren'>)</span><span class='Function'>↑/</span><span class='Modifier'>⁼</span><span class='Value'>𝕨</span></code>, assuming a minimum of 0.</p>
 <p><a href="#partitioning">Partitioning</a> allows one pivot <code><span class='Value'>t</span></code> from <code><span class='Value'>𝕨</span></code> to be compared with all of <code><span class='Value'>𝕩</span></code> at once. Although the comparison <code><span class='Value'>t</span><span class='Function'>≤</span><span class='Value'>𝕩</span></code> can be vectorized, the overhead of partitioning still makes this method a little slower per-comparison than sequential binary search <em>when</em> <code><span class='Value'>𝕨</span></code> <em>fits in L1 cache</em>. For larger <code><span class='Value'>𝕨</span></code> (and randomly positioned <code><span class='Value'>𝕩</span></code>) cache churn is a huge cost and partitioning can be many times faster. It should be performed recursively, switching to sequential binary search when <code><span class='Value'>𝕨</span></code> is small enough. Unlike quicksort there is no difficulty in pivot selection: always take it from the middle of <code><span class='Value'>𝕨</span></code> as in a normal binary search. However, there is a potential issue with memory. If <code><span class='Value'>𝕩</span></code> is unbalanced with respect to <code><span class='Value'>𝕨</span></code>, then the larger part can be nearly the whole length of <code><span class='Value'>𝕩</span></code> (if it's all of <code><span class='Value'>𝕩</span></code> partitioning isn't actually needed and it doesn't need to be saved). This can require close to <code><span class='Number'>2</span><span class='Function'>⋆</span><span class='Modifier'>⁼</span><span class='Function'>≠</span><span class='Value'>𝕨</span></code> saved partitions of length <code><span class='Function'>≠</span><span class='Value'>𝕩</span></code>, while the expected use would be a total length <code><span class='Function'>≠</span><span class='Value'>𝕩</span></code>.</p>
+<h4 id="interpolation-search"><a class="header" href="#interpolation-search">Interpolation search?</a></h4>
+<p>Binary search is the optimal approach for truly unknown data. However, if the searched array has an approximately uniform distribution, giving an approximately linear relationship between index and value, <a href="https://en.wikipedia.org/wiki/Interpolation_search">interpolation search</a> uses asymptotically fewer comparisons. Because of the high overhead of index computation (a division!!), it could only ever be useful for large <code><span class='Value'>𝕨</span></code>, and <code><span class='Value'>𝕩</span></code> small enough that partitioning isn't viable.</p>
+<p>Interpolation is generally held to be counterproductive because of how badly it fails for non-uniform data. But the continuous-domain equivalent, bracketed root-finding, is better studied, with hybrid approaches developed to fix this. The recently published <a href="https://en.wikipedia.org/wiki/ITP_Method">ITP method</a> (2020) gets the advantages of interpolation with the same maximum number of comparisons as binary search, plus a configurable constant (even if set to 0, it can take advantage of the gap between the length and the next power of 2, but 1 is a better setting). And once the search range is small enough that memory access stops being very expensive, it should switch to binary search and avoid the division.</p>
diff --git a/implementation/primitive/sort.md b/implementation/primitive/sort.md
index 36b9c916..beb99a58 100644
--- a/implementation/primitive/sort.md
+++ b/implementation/primitive/sort.md
@@ -100,3 +100,9 @@ It's cheap and sometimes worthwhile to trim `𝕨` down to the range of `𝕩`.
 If `𝕩` is small-range, then a lookup table method is possible. Check the length of `𝕨` because if it's too large then this method is slower—binary search doesn't have to hit every element! The approach is simply to create a table of the number of elements in `𝕨` with each value, then take a prefix sum. In BQN, ``𝕩⊏+`(1+⌈´𝕩)↑/⁼𝕨``, assuming a minimum of 0.
 
 [Partitioning](#partitioning) allows one pivot `t` from `𝕨` to be compared with all of `𝕩` at once. Although the comparison `t≤𝕩` can be vectorized, the overhead of partitioning still makes this method a little slower per-comparison than sequential binary search *when* `𝕨` *fits in L1 cache*. For larger `𝕨` (and randomly positioned `𝕩`) cache churn is a huge cost and partitioning can be many times faster. It should be performed recursively, switching to sequential binary search when `𝕨` is small enough. Unlike quicksort there is no difficulty in pivot selection: always take it from the middle of `𝕨` as in a normal binary search. However, there is a potential issue with memory. If `𝕩` is unbalanced with respect to `𝕨`, then the larger part can be nearly the whole length of `𝕩` (if it's all of `𝕩` partitioning isn't actually needed and it doesn't need to be saved). This can require close to `2⋆⁼≠𝕨` saved partitions of length `≠𝕩`, while the expected use would be a total length `≠𝕩`.
+
+#### Interpolation search?
+
+Binary search is the optimal approach for truly unknown data. However, if the searched array has an approximately uniform distribution, giving an approximately linear relationship between index and value, [interpolation search](https://en.wikipedia.org/wiki/Interpolation_search) uses asymptotically fewer comparisons. Because of the high overhead of index computation (a division!!), it could only ever be useful for large `𝕨`, and `𝕩` small enough that partitioning isn't viable.
+
+Interpolation is generally held to be counterproductive because of how badly it fails for non-uniform data. But the continuous-domain equivalent, bracketed root-finding, is better studied, with hybrid approaches developed to fix this. The recently published [ITP method](https://en.wikipedia.org/wiki/ITP_Method) (2020) gets the advantages of interpolation with the same maximum number of comparisons as binary search, plus a configurable constant (even if set to 0, it can take advantage of the gap between the length and the next power of 2, but 1 is a better setting). And once the search range is small enough that memory access stops being very expensive, it should switch to binary search and avoid the division.
author	Marshall Lochbaum <mwlochbaum@gmail.com>	2022-02-13 11:28:04 -0500
committer	Marshall Lochbaum <mwlochbaum@gmail.com>	2022-02-13 11:28:04 -0500
commit	cde6f699cd4b76b3b193a8d5bd39242261196fbd (patch)
tree	f247073605808c90ba352ee685bb2835f461b6e6
parent	1853387a0eca09eb4a9778aad64488602e29dca0 (diff)