%% * Testing setup, benchmarking rationale
\section{Testing setup}

%% ** Specs and VM setup
To keep results consistent and reduce the chance of outliers, all benchmarks were run on a KVM virtual machine hosted on server hardware.
The VM was allocated 4 cores of an Intel Xeon E5-2687W v4 CPU and 4\,GiB of RAM.

%% ** Reproducibility
The VM was managed and provisioned using NixOS, so it can be reproduced with the exact software we used.

\section{Cost models}

We start by examining our generated cost models, comparing them both to the observations they are based on and to what we expect from asymptotic analysis.
As we build a total of 51 cost models from our library, we will not examine all of them.
Instead, we look at the models for the most common operations, grouped by containers that are commonly selected together.

\subsection{Insertion operations}
Starting with the \code{insert} operation, Figure~\ref{fig:cm_insert} shows how the estimated cost changes with the size of the container.
The lines correspond to our fitted curves, while the points indicate the raw observations they were fitted to.
To help readability, we group these into regular \code{Container} implementations and our associative key-value \code{Mapping} implementations.

\begin{figure}[h]
  \centering
  \includegraphics[width=10cm]{assets/insert_containers.png}
  \par\centering\rule{11cm}{0.5pt}
  \includegraphics[width=10cm]{assets/insert_mappings.png}
  \caption{Estimated cost of the \code{insert} operation on \code{Container} implementations (top) and \code{Mapping} implementations (bottom)}
  \label{fig:cm_insert}
\end{figure}


For \code{Vec}, we see that insertion is very cheap, and gets slightly cheaper as the size of the container increases.
This is to be expected, as Rust's \code{Vec} implementation grows by a multiple whenever it reaches its maximum capacity, so we would expect amortised inserts to require fewer resizes as $n$ increases.
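
As a rough sanity check on this explanation, suppose (as an assumption, since the exact growth policy is an implementation detail of the standard library) that the vector starts with capacity $c$ and multiplies its capacity by a fixed factor $g > 1$ whenever it is full.
Ignoring rounding, the reallocation with index $i$ (counting from $0$) copies $c g^{i}$ elements, and $n > c$ insertions trigger $k < \log_g(n/c) + 1$ reallocations, so the total number of elements copied is
\[
  \sum_{i=0}^{k-1} c g^{i} = c \, \frac{g^{k} - 1}{g - 1} < \frac{g}{g - 1} \, n .
\]
Under this assumption the copying cost per insert is bounded by a constant (fewer than $2$ element copies when $g = 2$), while the number of reallocations per insert, at most $(\log_g(n/c) + 1)/n$, shrinks as $n$ grows.
This is consistent with the slight downward trend in the fitted curve.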

\code{LinkedList} has a more stable but significantly higher insertion cost.
This is likely because it requires a heap allocation for every item inserted, no matter the current size.
This would also explain why its data points are more spread out, as it can be hard to predict the performance of allocator and kernel calls, even on systems with few other processes running.

It is unsurprising that these two implementations are the cheapest, as they have no ordering or uniqueness guarantees, unlike our other implementations.

\code{HashSet} insertions are the next most expensive, and their cost appears to rise as the size of the collection goes up.
This is likely due to hash collisions becoming more frequent as the collection grows.

\code{BTreeSet} insertions are also expensive, but their cost appears to level out as the collection size goes up (a logarithmic curve).
It is important to note that Rust's \code{BTreeSet} is not a binary search tree, but a B-tree, a more general search tree originally proposed by R.~Bayer and E.~McCreight~\parencite{bayer_organization_1970}, in which each node contains $B-1$ to $2B-1$ elements in an array.
\todo{The standard library documentation states that searches are expected to take $B\log(n)$ comparisons on average~\parencite{rust_documentation_team_btreemap_2024}, which would explain the logarithm-like growth.}

Our two mapping types, \code{BTreeMap} and \code{HashMap}, mimic the behaviour of their set counterparts.

Our two outlier containers, \code{SortedUniqueVec} and \code{SortedVec}, both have a substantially higher insertion cost, which grows roughly linearly.
Internally, both of these containers perform a binary search to determine where the new element should go.
Taken alone, this would suggest a roughly logarithmic complexity.
However, the new element must then be inserted into the underlying list, which means copying every element after the insertion point.
As most elements are inserted near the middle of the list, we will on average be copying half the list every time.
This could explain why we see a roughly linear growth.
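
A back-of-the-envelope model makes this concrete, under the assumption (ours, for illustration only) that each element that is actually inserted is equally likely to land at any of the $n + 1$ positions $p \in \{0, \dots, n\}$ of a list of $n$ elements.
Inserting at position $p$ moves the $n - p$ elements after it, so the expected number of elements moved is
\[
  \frac{1}{n + 1} \sum_{p=0}^{n} (n - p) = \frac{1}{n + 1} \cdot \frac{n (n + 1)}{2} = \frac{n}{2} .
\]
Writing $a$ for the cost of one comparison and $b$ for the cost of moving one element (both hypothetical constants), the cost of an insert is then approximately
\[
  a \log_2 n + b \, \frac{n}{2} ,
\]
in which the linear term dominates for large $n$.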

\todo{Graph this, and justify further}

\subsection{Contains operations}

We now examine the cost of the \code{contains} operation.

\subsection{Outliers / errors}

\subsection{Evaluation}

%% * Predictions
\section{Selections}

%% ** Chosen benchmarks
Our test cases broadly fall into two categories: example cases, which just repeat a few operations many times, and `real' cases, which are implementations of common algorithms and solutions to programming puzzles.
We expect the results from our example cases to be relatively unsurprising, while our real cases are more complex and harder to predict.

Most of our real cases are solutions to puzzles from Advent of Code~\parencite{wastl_advent_2015}, a popular collection of programming puzzles.
Table~\ref{table:test_cases} lists and briefly describes our test cases.

\begin{table}[h]
  \centering
  \begin{tabular}{|c|c|}
    Name & Description \\
    \hline
    example\_sets & Repeated insert and contains on a set. \\
    example\_stack & Repeated push and pop from a stack. \\
    example\_mapping & Repeated insert and get from a mapping. \\
    prime\_sieve & Sieve of Eratosthenes algorithm. \\
    aoc\_2021\_09 & Flood-fill-like algorithm (Advent of Code 2021, Day 9). \\
    aoc\_2022\_08 & Simple 2D raycasting (AoC 2022, Day 8). \\
    aoc\_2022\_09 & Simple 2D soft-body simulation (AoC 2022, Day 9). \\
    aoc\_2022\_14 & Simple 2D particle simulation (AoC 2022, Day 14). \\
  \end{tabular}

  \caption{Our test applications}
  \label{table:test_cases}
\end{table}

%% ** Effect of selection on benchmarks (spread in execution time)

%% ** Summarise predicted versus actual

%% ** Evaluate performance

%% ** Comment on distribution of best implementation

%% ** Surprising ones / Explain failures

%% * Performance of adaptive containers
\section{Adaptive containers}

%% ** Find where adaptive containers get suggested

%% ** Comment on relative performance speedup

%% ** Suggest future improvements?

%% * Selection time / developer experience
\section{Selection time}

%% ** Mention speedup versus naive brute force