author    | Aria Shrimpton <me@aria.rip> | 2024-03-30 12:38:12 +0000
committer | Aria Shrimpton <me@aria.rip> | 2024-03-30 12:38:12 +0000
commit    | a6db0f978daa4bbfcd4cd25f05711a2f590413be (patch)
tree      | f16a3ada59573bf0b91ab50435e20dce83c224b9 /thesis/parts/results.tex
parent    | d7aaba98bd0fe7fe68d3cd11d0e0400edea6a724 (diff)
update graphs with new data
Diffstat (limited to 'thesis/parts/results.tex')
-rw-r--r-- | thesis/parts/results.tex | 74
1 file changed, 34 insertions, 40 deletions
diff --git a/thesis/parts/results.tex b/thesis/parts/results.tex
index 944ecd8..f16c502 100644
--- a/thesis/parts/results.tex
+++ b/thesis/parts/results.tex
@@ -78,7 +78,8 @@ Figure \ref{fig:cm_insert_small_n} shows the cost models for insert operations o
 In particular, for $n<1800$ the overhead from sorting a vec is less than running the default hasher function (at least on this hardware).
 
 We also see a sharp spike in the cost for \code{SortedVecSet} at low $n$ values, and an area of supposed 0 cost from around $n=200$ to $n=800$.
-This seems inaccurate, and is likely a result of few data points at low n values resulting in poor fitting.
+This seems inaccurate, and indicates that our current fitting procedure may not be able to deal with low $n$ values properly.
+More work is required to improve this.
 
 \begin{figure}[h!]
   \centering
@@ -186,15 +187,15 @@ We now compare the implementations suggested by our system to the selection that
 For now, we ignore suggestions for adaptive containers.
 
 Table \ref{table:predicted_actual} shows the predicted best assignments alongside the actual best assignment, obtained by brute-force.
-In all but three of our test cases (marked with *), we correctly identify the best container.
+In all but two of our test cases (marked with *), we correctly identify the best container.
 
 \begin{table}[h!]
   \centering
   \begin{tabular}{|c|c|c|c|c|}
     Project & Container Type & Best implementation & Predicted best & \\
    \hline
-    aoc\_2021\_09 & Set & HashSet & HashSet & \\
     aoc\_2021\_09 & Map & HashMap & HashMap & \\
+    aoc\_2021\_09 & Set & HashSet & HashSet & \\
     aoc\_2022\_08 & Map & HashMap & HashMap & \\
     aoc\_2022\_09 & Set & HashSet & HashSet & \\
     aoc\_2022\_14 & Set & HashSet & HashSet & \\
@@ -202,27 +203,18 @@ In all but three of our test cases (marked with *), we correctly identify the be
     example\_mapping & Map & HashMap & HashMap & \\
     example\_sets & Set & HashSet & HashSet & \\
     example\_stack & StackCon & Vec & Vec & \\
+    prime\_sieve & Primes & BTreeSet & BTreeSet & \\
     prime\_sieve & Sieve & Vec & LinkedList & * \\
-    prime\_sieve & Primes & HashSet & BTreeSet & * \\
   \end{tabular}
   \caption{Actual best vs predicted best implementations}
   \label{table:predicted_actual}
 \end{table}
 
-Two of these failures appear to be caused by being overly eager to suggest a \code{LinkedList}.
+Both of these failures appear to be caused by being overly eager to suggest a \code{LinkedList}.
 From looking at detailed profiling information, it seems that both of these container types had a relatively small amount of items in them.
 Therefore this is likely caused by our cost models being inaccurate at small $n$ values, such as in Figure \ref{fig:cm_insert_small_n}.
 
-Our only other failure comes from suggesting a \code{BTreeSet} instead of a \code{HashSet}.
-Our cost models suggest that a \code{BTreeSet} is more suitable for the \code{prime_sieve} benchmarks with a smaller $n$ value, but not for the larger ones.
-However, because the smaller benchmarks complete in less time, Criterion (the benchmarking framework used) chooses to run them for more iterations.
-This causes the smaller $n$ values to carry more weight than they should.
-
-This could be worked around by adjusting Criterion's settings to run all benchmarks for the same number of iterations, at the cost of the increased accuracy for smaller benchmarks that the existing strategy gives.
-Another method would be to only fix the number of iterations when profiling the application, and to run benchmarks as normal otherwise.
-Whilst this should be possible, Criterion doesn't currently support this.
-
-Overall, our results show our system is able to suggest the best containers, at least for large $n$ values.
+Overall, our results show our system is able to suggest the best containers, at least for large enough $n$ values.
 Unfortunately, these tests are somewhat limited, as the best container seems relatively predictable: \code{Vec} where uniqueness is not important, and \code{Hash*} otherwise.
 Therefore more thorough testing is needed to fully establish the system's effectiveness.
 
@@ -268,31 +260,33 @@ The exact definition of this varies by benchmark.
 \begin{table}[h]
   \centering
   \begin{adjustbox}{angle=90}
-    \begin{tabular}{|c|c|c|c|c|c|}
-      \hline
-      Project & Assignment & 100 & 1000 & 2000 & \\
-      \hline
-      aoc\_2022\_09 & Set=HashSet & 1ms $\pm$ 5us & 13ms $\pm$ 828us & 27ms $\pm$ 1ms & \\
-      aoc\_2022\_09 & Set=Adaptive & 1ms $\pm$ 2us & 11ms $\pm$ 17us & 39ms $\pm$ 684us & \\
-      \hline
-      & & 100 & 200 & & \\
-      \hline
-      aoc\_2022\_08 & Map=HashMap & 1ms $\pm$ 9us & 5ms $\pm$ 66us & &\\
-      aoc\_2022\_08 & Map=Adaptive & 1ms $\pm$ 6us & 5ms $\pm$ 41us & & \\
-      \hline
-      & & 50 & 150 & 2500 & 7500 \\
-      \hline
-      example\_mapping & Map=HashMap & 3us $\pm$ 7ns & 11us $\pm$ 49ns & 185us $\pm$ 2us & 591us $\pm$ 1us \\
-      example\_mapping & Map=Adaptive & 4us $\pm$ 7ns & 33us $\pm$ 55ns & 187us $\pm$ 318ns & 595us $\pm$ 1us \\
-      \hline
-      & & 50 & 500 & 50000 & \\
-      \hline
-      prime\_sieve & Sieve=Vec, Primes=HashSet & 1us $\pm$ 3ns & 78us $\pm$ 1us & 766ms $\pm$ 1ms & \\
-      prime\_sieve & Sieve=Vec, Primes=Adaptive & 1us $\pm$ 3ns & 84us $\pm$ 138ns & 785ms $\pm$ 730us & \\
-      prime\_sieve & Sieve=Adaptive, Primes=HashSet & 2us $\pm$ 6ns & 208us $\pm$ 568ns & 763ms $\pm$ 1ms & \\
-      prime\_sieve & Sieve=Adaptive, Primes=Adaptive & 2us $\pm$ 4ns & 205us $\pm$ 434ns & 762ms $\pm$ 2ms & \\
-      \hline
-    \end{tabular}
+    \begin{tabular}{|c|c|c|c|c|c|}
+      \hline
+      Project & Implementations & \multicolumn{4}{|c|}{Benchmark size} \\
+      \hline
+      & & 100 & 200 & & \\
+      \hline
+      aoc\_2022\_08 & Map=HashMap & 1ms $\pm$ 12us & 6ms $\pm$ 170us & & \\
+      aoc\_2022\_08 & Map=Adaptive & 1ms $\pm$ 74us & 6ms $\pm$ 138us & & \\
+      \hline
+      & & 100 & 1000 & 2000 & \\
+      \hline
+      aoc\_2022\_09 & Set=HashSet & 1ms $\pm$ 6us & 10ms $\pm$ 51us & 22ms $\pm$ 214us & \\
+      aoc\_2022\_09 & Set=Adaptive & 1ms $\pm$ 3us & 10ms $\pm$ 27us & 40ms $\pm$ 514us & \\
+      \hline
+      & & 50 & 150 & 2500 & 7500 \\
+      \hline
+      example\_mapping & Map=HashMap & 3us $\pm$ 6ns & 11us $\pm$ 20ns & 184us $\pm$ 835ns & 593us $\pm$ 793ns \\
+      example\_mapping & Map=Adaptive & 4us $\pm$ 9ns & 33us $\pm$ 55ns & 192us $\pm$ 311ns & 654us $\pm$ 19us \\
+      \hline
+      & & 50 & 500 & 50000 & \\
+      \hline
+      prime\_sieve & Primes=BTreeSet, Sieve=Vec & 1us $\pm$ 2ns & 75us $\pm$ 490ns & 774ms $\pm$ 4ms & \\
+      prime\_sieve & Primes=BTreeSet, Sieve=Adaptive & 2us $\pm$ 7ns & 194us $\pm$ 377ns & 765ms $\pm$ 4ms & \\
+      prime\_sieve & Primes=Adaptive, Sieve=Vec & 1us $\pm$ 10ns & 85us $\pm$ 179ns & 788ms $\pm$ 2ms & \\
+      prime\_sieve & Primes=Adaptive, Sieve=Adaptive & 2us $\pm$ 5ns & 203us $\pm$ 638ns & 758ms $\pm$ 4ms & \\
+      \hline
+    \end{tabular}
   \end{adjustbox}
   \caption{Adaptive containers vs the best single container, by size of benchmark}
   \label{table:adaptive_perfcomp}
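
The removed paragraphs above discuss working around Criterion's per-benchmark iteration weighting by pinning its settings so every benchmark runs with the same sample count. A minimal sketch of what such a configuration could look like follows; the `run_sieve` workload and benchmark names are hypothetical stand-ins, not code from the thesis, while `sample_size`, `measurement_time`, and the `criterion_group!` `config` field are real Criterion APIs.

// Sketch only: pin every benchmark to the same sample count so that fast
// (small-n) benchmarks no longer accumulate disproportionately many runs.
use std::time::Duration;
use criterion::{criterion_group, criterion_main, Criterion};

// Hypothetical stand-in for the benchmarked code (not from the thesis):
// counts primes below `limit` by trial division.
fn run_sieve(limit: usize) -> usize {
    (2..limit)
        .filter(|&n| (2..).take_while(|&d| d * d <= n).all(|d| n % d != 0))
        .count()
}

fn bench_sieve(c: &mut Criterion) {
    // Benchmark sizes mirroring those in the table above.
    for n in [50usize, 500, 50_000] {
        c.bench_function(&format!("prime_sieve_{n}"), |b| b.iter(|| run_sieve(n)));
    }
}

// Fix the sample count and measurement time globally, rather than letting
// Criterion choose them per benchmark.
fn fixed_config() -> Criterion {
    Criterion::default()
        .sample_size(100)
        .measurement_time(Duration::from_secs(5))
}

criterion_group! {
    name = benches;
    config = fixed_config();
    targets = bench_sieve
}
criterion_main!(benches);

Note that this fixes only the number of samples Criterion collects; it still scales the iteration count within each sample by elapsed time, which matches the removed text's observation that truly fixed iteration counts aren't currently supported.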