From 8ddad454b30cdafc9bbdc0cbd51c653bee8a87e5 Mon Sep 17 00:00:00 2001
From: Marshall Lochbaum
Date: Fri, 30 Dec 2022 20:57:20 -0500
Subject: Rebuild with CBQN's new number formatting (ryu)

---
 docs/implementation/kclaims.html | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/implementation/kclaims.html b/docs/implementation/kclaims.html
index d11de34a..83876814 100644
--- a/docs/implementation/kclaims.html
+++ b/docs/implementation/kclaims.html
@@ -86,12 +86,12 @@

Dividing the stall number by total cycles gives us percentage of program time that can be attributed to L1 instruction misses.

    l ← "J"‿"BQN"‿"BQN"‿"Python"
    l ≍˘ 100 × 56‿4.5‿494‿25 ÷ 1_457‿232‿5_633‿499
-┌─                            
-╵ "J"      3.843514070006863  
-  "BQN"    1.939655172413793  
-  "BQN"    8.76974968933073   
-  "Python" 5.01002004008016   
-                             ┘
+┌─                             
+╵ "J"      3.8435140700068633  
+  "BQN"    1.9396551724137931  
+  "BQN"    8.76974968933073    
+  "Python" 5.01002004008016    
+                              ┘
 

So, roughly 4%, 2 to 9%, and 5%. The cache miss counts are also broadly in line with these numbers. Note that full cache misses are pretty rare, so that most misses just hit L2 or L3 and don't suffer a large penalty. Also note that instruction cache misses are mostly lower than data misses, as expected.
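
As a cross-check, the same division can be sketched in Python. This is only an illustration, not part of the original page: the operands are read from the listing as 56, 4.5, 494 and 25 stall cycles over 1457, 232, 5633 and 499 total cycles (the split that reproduces the printed results), and the names `stalls`, `cycles` and `stall_percentages` are made up for the example.

```python
# Operands as read from the BQN listing (an assumption about how the
# digits split; this split reproduces the printed percentages).
labels = ["J", "BQN", "BQN", "Python"]
stalls = [56, 4.5, 494, 25]
cycles = [1457, 232, 5633, 499]


def stall_percentages(stalls, cycles):
    """Return 100 * (stall / total) for each pair of measurements."""
    # Parenthesized to mirror BQN's right-to-left evaluation of `100 × s ÷ c`.
    return [100 * (s / c) for s, c in zip(stalls, cycles)]


percentages = stall_percentages(stalls, cycles)
for label, pct in zip(labels, percentages):
    # repr() prints the shortest digit string that round-trips to the float.
    print(f"{label:<8} {pct!r}")
```

Python's `repr` for floats prints the shortest digits that round-trip, which is also what ryu is designed to produce, so the output should have the same general shape as the new column in this diff rather than the old, shorter one.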

Don't get me wrong, I'd love to improve performance even by 2%. But it's not exactly world domination, is it? The perf results are an upper bound for how much these programs could be sped up with better treatment of the instruction cache. If K is faster by more than that, it's because of other optimizations.

-- cgit v1.2.3