
This chapter outlines the design of our container selection system (Candelabra), and justifies the decisions behind it.

We first describe our aims and priorities for the system, and illustrate its usage with an example.

We then provide an overview of the container selection process and each of its stages.

\section{Aims \& Usage}

We aim to create a program for container selection that can select based on both functional and non-functional requirements.
Flexibility is a high priority: it should be easy to add new container implementations, and to integrate our system into existing applications.
Our system should also be able to scale to larger programs, and remain convenient for developers to use.

We chose to implement our system as a Rust CLI, and limit it to selecting containers for Rust programs.
We require the user to provide their own benchmarks, which should be representative of a typical application run.

The user can specify their functional requirements by listing the required traits, and by specifying properties that must always hold in a Lisp-like language.
This part of the process is handled by Primrose\parencite{qin_primrose_2023}, with only minor modifications to integrate with the rest of our system.

For example, consider the code below from our test case based on the sieve of Eratosthenes (\code{src/tests/prime\_sieve} in the source artifacts).
Here we request two container types: \code{Sieve} and \code{Primes}.
The first must implement the \code{Container} and \code{Stack} traits, and must satisfy the \code{lifo} property. This property is defined at the top as only being applicable to \code{Stack}s, and requires that for any \code{x}, pushing \code{x} then popping from the container returns \code{x}.

The second container type, \code{Primes}, need only implement the \code{Container} trait, and must satisfy the \code{ascending} property.
This property requires that at any point, for all consecutive \code{x, y} pairs in the container, \code{x $\leq$ y}.

\begin{lstlisting}
/*SPEC*
property lifo<T> {
    \c <: (Stack) -> (forall \x -> ((equal? (pop ((push c) x))) x))
}

property ascending<T> {
    \c -> ((for-all-consecutive-pairs c) leq?)
}


type Sieve<S> = {c impl (Container, Stack) | (lifo c)}
type Primes<S> = {c impl (Container) | (ascending c)}
*ENDSPEC*/
\end{lstlisting}
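
To make the meaning of the two properties concrete, the following is a minimal Rust sketch that checks them at runtime on a \code{Vec}. It only illustrates what the properties say: Primrose verifies them against models of each implementation rather than by running checks like these, and the helper names \code{lifo\_holds} and \code{ascending\_holds} are hypothetical.

```rust
/// Runtime illustration of the `lifo` property on a `Vec` used as a stack:
/// pushing `x` and then popping must return `x`.
/// (Hypothetical helper, not part of Primrose or Candelabra.)
fn lifo_holds<T: Clone + PartialEq>(mut stack: Vec<T>, x: T) -> bool {
    stack.push(x.clone());
    stack.pop() == Some(x)
}

/// Runtime illustration of the `ascending` property:
/// every consecutive pair `(x, y)` in the container satisfies `x <= y`.
/// (Hypothetical helper, not part of Primrose or Candelabra.)
fn ascending_holds<T: PartialOrd>(items: &[T]) -> bool {
    items.windows(2).all(|pair| pair[0] <= pair[1])
}
```

For instance, \code{ascending\_holds(\&[2, 3, 5, 7])} holds, whereas \code{ascending\_holds(\&[2, 5, 3])} does not.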

Once we've specified our functional requirements and provided a benchmark (\code{src/tests/prime\_sieve/benches/main.rs}), we can simply run Candelabra to select a container:

\todo{Show selection process}

Our tool integrates with Rust's packaging system (Cargo) to discover the information it needs about our project, then runs Primrose to find a list of implementations satisfying our functional requirements.

Once it has this list, it estimates a `cost' for each candidate, which is an upper bound on the total time taken for all container operations.
At this point, it also checks whether an `adaptive' container would be better, by checking whether one implementation performs better at lower $n$, and another at higher $n$.

Finally, it picks the container with the minimum cost, and creates a new Rust file where the chosen container type is exported.

Our tool requires little user intervention, integrates well with existing workflows, and the time it takes scales linearly with the number of container types in a given project.

\section{Selection Process}

We now describe the design of our selection process in detail, and justify our choices.

As mentioned above, the first stage of our process is to satisfy functional requirements, which we do using code from Primrose\parencite{qin_primrose_2023}.
The exact internals are beyond the scope of this paper, but in brief this works by:
\begin{itemize}
\item Finding all implementations in the container library that implement all required traits
\item Translating any specified properties into a Rosette expression
\item For each implementation, modelling the behaviour of each operation in Rosette, and checking that the required properties always hold
\end{itemize}

We use the code provided with the Primrose paper, with minor modifications elaborated on in Chapter~\ref{chap:implementation}.

Once a list of functionally suitable implementations has been found, selection is done by:

\begin{itemize}
\item Building a cost model for each operation of each implementation, which estimates the `cost' of that operation at any given container size $n$
\item Profiling the program, tracking operation frequency and container sizes
\item Combining the two to estimate the relative cost of each implementation
\end{itemize}
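
The combination step can be sketched roughly as follows. This is an illustration under stated assumptions, not our exact implementation: the types \code{CostModel} and \code{UsagePartition}, the functions \code{estimate\_cost} and \code{cheapest}, and the choice of a weighted sum over partitions are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical cost model: polynomial coefficients, lowest degree first.
struct CostModel {
    coeffs: Vec<f64>,
}

impl CostModel {
    /// Estimated cost of one operation at container size `n` (Horner's rule).
    fn estimate(&self, n: f64) -> f64 {
        self.coeffs.iter().rev().fold(0.0, |acc, c| acc * n + c)
    }
}

/// Hypothetical profiling summary: average operation counts observed at
/// container size `n`, with a weight saying how common this situation was.
struct UsagePartition {
    n: f64,
    weight: f64,
    op_counts: HashMap<&'static str, f64>,
}

/// Weighted sum of (operation count * estimated cost per operation).
/// Assumption: operations without a cost model are ignored.
fn estimate_cost(models: &HashMap<&'static str, CostModel>, profile: &[UsagePartition]) -> f64 {
    profile
        .iter()
        .map(|p| {
            let per_instance: f64 = p
                .op_counts
                .iter()
                .filter_map(|(op, count)| models.get(op).map(|m| count * m.estimate(p.n)))
                .sum();
            p.weight * per_instance
        })
        .sum()
}

/// Name of the candidate implementation with the lowest estimated cost.
fn cheapest<'a>(
    candidates: &'a [(&'a str, HashMap<&'static str, CostModel>)],
    profile: &[UsagePartition],
) -> Option<&'a str> {
    candidates
        .iter()
        .map(|(name, models)| (*name, estimate_cost(models, profile)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(name, _)| name)
}
```

With a linear model for one candidate's \code{contains} and a constant model for another's, a profile dominated by \code{contains} calls on large containers would favour the second candidate.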

\subsection{Cost Models}

We use an approach similar to CollectionSwitch\parencite{costa_collectionswitch_2018}, which assumes that the main factor in how long an operation takes is the current size of the collection.

Each operation has a separate cost model, which we build by executing the operation repeatedly on collections of various sizes.

For example, to build a cost model for \code{Vec::contains}, we would create several \code{Vec}s of varying sizes, and find the average execution time $t$ of \code{contains} at each.
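
A minimal sketch of such a measurement loop is given below. It is an illustration only: the function names \code{time\_contains} and \code{collect\_samples}, the element type, and the mix of present and absent lookups are assumptions, and a real benchmark would need more care with warm-up and noise.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Average wall-clock time in seconds of `Vec::contains` on a `Vec` of size `n`.
/// (Hypothetical helper; `repetitions` should be non-zero.)
fn time_contains(n: usize, repetitions: u32) -> f64 {
    let v: Vec<usize> = (0..n).collect();
    let start = Instant::now();
    for i in 0..repetitions {
        // Look up a mix of present and absent values; `black_box` stops the
        // compiler from optimising the call away.
        let needle = i as usize % (2 * n.max(1));
        black_box(v.contains(black_box(&needle)));
    }
    start.elapsed().as_secs_f64() / f64::from(repetitions)
}

/// One `(n, average time)` observation per requested collection size.
fn collect_samples(sizes: &[usize], repetitions: u32) -> Vec<(usize, f64)> {
    sizes
        .iter()
        .map(|&n| (n, time_contains(n, repetitions)))
        .collect()
}
```

Calling \code{collect\_samples(\&[10, 100, 1000], 1000)} would yield three observations that can then be used as input for regression.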

We then perform linear regression, fitting a polynomial in the collection size $n$ to predict $t$.
In the case of \code{Vec::contains}, we would expect the resulting polynomial to be roughly linear.

This method works well for many operations and structures, although it has notable limitations.

For example, the container implementation \code{LazySortedVec} (provided by Primrose) inserts new elements at the end by default, and only sorts them when an operation that relies on the order is called.
The cost of such an operation therefore depends on which operations preceded it, and not only on $n$, so timing each operation in isolation can misattribute the deferred sorting work.

We were unable to work around this, although a potential future solution could be to perform untimed `warmup' operations before each timed operation.
This is complex, because it requires some understanding of which operations will have work deferred to them.

Once we have the timing data, we fit a polynomial to it.
Whilst we could use a more complex technique, in practice this is good enough: very few common operations are worse than $O(n^3)$, and factors such as logarithms are usually `close enough'.
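
As an illustration of the fitting step, the sketch below fits a polynomial by ordinary least squares, solving the normal equations with Gaussian elimination. It is a simplified stand-in rather than our actual implementation: the function name \code{fit\_polynomial} and the singularity threshold are assumptions, and a production version would likely use a numerically more robust solver.

```rust
/// Fits a polynomial of the given degree to `(n, t)` observations by ordinary
/// least squares. Returns coefficients lowest degree first, or `None` if the
/// normal equations are (numerically) singular.
/// (Hypothetical helper; the 1e-12 threshold is an arbitrary assumption.)
fn fit_polynomial(samples: &[(f64, f64)], degree: usize) -> Option<Vec<f64>> {
    let k = degree + 1;
    // Augmented normal-equation matrix [X^T X | X^T t].
    let mut a = vec![vec![0.0; k + 1]; k];
    for &(n, t) in samples {
        let powers: Vec<f64> = (0..k).map(|i| n.powi(i as i32)).collect();
        for row in 0..k {
            for col in 0..k {
                a[row][col] += powers[row] * powers[col];
            }
            a[row][k] += powers[row] * t;
        }
    }
    // Gaussian elimination with partial pivoting.
    for col in 0..k {
        let pivot = (col..k).max_by(|&i, &j| a[i][col].abs().total_cmp(&a[j][col].abs()))?;
        if a[pivot][col].abs() < 1e-12 {
            return None;
        }
        a.swap(col, pivot);
        for row in (col + 1)..k {
            let factor = a[row][col] / a[col][col];
            for c in col..=k {
                let delta = factor * a[col][c];
                a[row][c] -= delta;
            }
        }
    }
    // Back substitution.
    let mut coeffs = vec![0.0; k];
    for row in (0..k).rev() {
        let tail: f64 = ((row + 1)..k).map(|c| a[row][c] * coeffs[c]).sum();
        coeffs[row] = (a[row][k] - tail) / a[row][row];
    }
    Some(coeffs)
}
```

Given observations that lie exactly on $t = 2 + 3n$, fitting with degree 1 recovers coefficients close to $2$ and $3$.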

We cache this data for as long as the implementation is unchanged.
Whilst it would be possible to share this data across computers, micro-architecture can have a large effect on collection performance\parencite{jung_brainy_2011}, so we calculate it on demand.

\subsection{Profiling}

As mentioned above, the ordering of operations can have a large effect on container performance.
Unfortunately, tracking every container operation in order quickly becomes unfeasible, so we settle for tracking the count of each operation, and the size of the collection.

Every instance of the collection is tracked separately, and results are collated after profiling.
Results with a close enough $n$ value are sorted into partitions, where each partition holds the average count of each operation, and a weight indicating how common results in that partition were.
This is done to compress the data, and also to allow searching for adaptive containers later.
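
One way this collation could work is sketched below. The grouping rule (a fixed maximum distance \code{max\_gap} from the smallest $n$ in the partition) and the names \code{InstanceProfile}, \code{Partition} and \code{partition\_profiles} are assumptions made for illustration, and every instance is assumed to record the same list of operations.

```rust
/// Hypothetical record for one container instance: its size `n` and the
/// number of times each operation was called (same operation order for all).
struct InstanceProfile {
    n: f64,
    op_counts: Vec<f64>,
}

/// Hypothetical partition: averages over instances with similar `n`.
struct Partition {
    avg_n: f64,
    avg_op_counts: Vec<f64>,
    weight: f64,
}

/// Groups instances whose `n` lies within `max_gap` of the smallest `n` in the
/// group, then averages each group. Weights are the fraction of instances.
fn partition_profiles(mut profiles: Vec<InstanceProfile>, max_gap: f64) -> Vec<Partition> {
    profiles.sort_by(|a, b| a.n.total_cmp(&b.n));
    let total = profiles.len() as f64;
    let mut groups: Vec<Vec<InstanceProfile>> = Vec::new();
    for p in profiles {
        let start_new = groups.last().map_or(true, |g| p.n - g[0].n > max_gap);
        if start_new {
            groups.push(vec![p]);
        } else {
            groups.last_mut().unwrap().push(p);
        }
    }
    groups
        .into_iter()
        .map(|g| {
            let len = g.len() as f64;
            let ops = g[0].op_counts.len();
            Partition {
                avg_n: g.iter().map(|p| p.n).sum::<f64>() / len,
                avg_op_counts: (0..ops)
                    .map(|i| g.iter().map(|p| p.op_counts[i]).sum::<f64>() / len)
                    .collect(),
                weight: len / total,
            }
        })
        .collect()
}
```

With instances at $n = 10$, $n = 12$ and $n = 1000$ and a \code{max\_gap} of 5, this produces two partitions with weights of roughly $2/3$ and $1/3$.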

\todo{deal with not taking into account 'preparatory' operations during benchmarks}

\todo{Combining}

\subsection{Adaptive Containers}

Adaptive containers are implemented using const generics and a wrapper type.
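
The general shape of such a wrapper might look like the sketch below. This is an assumption-laden illustration, not our actual wrapper: the name \code{AdaptiveSet}, the choice of \code{Vec} and \code{BTreeSet} as the two underlying implementations, and the set-like interface are all hypothetical.

```rust
use std::collections::BTreeSet;

/// Hypothetical adaptive set: backed by a `Vec` while it holds at most
/// `THRESHOLD` elements, and by a `BTreeSet` once it grows beyond that.
enum AdaptiveSet<T: Ord, const THRESHOLD: usize> {
    Small(Vec<T>),
    Large(BTreeSet<T>),
}

impl<T: Ord, const THRESHOLD: usize> AdaptiveSet<T, THRESHOLD> {
    fn new() -> Self {
        Self::Small(Vec::new())
    }

    fn insert(&mut self, value: T) {
        match self {
            Self::Small(items) => {
                if !items.contains(&value) {
                    items.push(value);
                }
                if items.len() > THRESHOLD {
                    // Switching empties the old container and re-inserts
                    // every element into the new one.
                    let large: BTreeSet<T> = items.drain(..).collect();
                    *self = Self::Large(large);
                }
            }
            Self::Large(set) => {
                set.insert(value);
            }
        }
    }

    fn contains(&self, value: &T) -> bool {
        match self {
            Self::Small(items) => items.contains(value),
            Self::Large(set) => set.contains(value),
        }
    }

    fn len(&self) -> usize {
        match self {
            Self::Small(items) => items.len(),
            Self::Large(set) => set.len(),
        }
    }

    /// True once the wrapper has switched to the second implementation.
    fn is_large(&self) -> bool {
        matches!(self, Self::Large(_))
    }
}
```

Because the threshold is a const generic parameter, a generated type alias such as \code{AdaptiveSet<u32, 64>} fixes the switching point at compile time.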

They are detected by finding the best implementation for each partition, sorting the partitions by $n$, and checking whether we can split them into two halves such that a different implementation is best on each side.

We then check whether the cost saving is greater than the cost of a clear operation and $n$ insert operations.
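
The detection and the break-even check can be sketched as follows. The names \code{PartitionCosts}, \code{Split}, \code{find\_split} and \code{worth\_adapting} are hypothetical, as is the representation of implementations by index; the sketch only mirrors the two steps described above.

```rust
/// Hypothetical input: for one partition, its container size `n` and the
/// estimated cost of each candidate implementation (indexed consistently).
struct PartitionCosts {
    n: f64,
    costs: Vec<f64>,
}

/// Hypothetical result: use `low_impl` below `threshold_n`, `high_impl` from it.
#[derive(Debug, PartialEq)]
struct Split {
    low_impl: usize,
    high_impl: usize,
    threshold_n: f64,
}

/// Index of the cheapest implementation for one partition.
fn best_impl(costs: &[f64]) -> Option<usize> {
    (0..costs.len()).min_by(|&a, &b| costs[a].total_cmp(&costs[b]))
}

/// Sorts partitions by `n` and looks for a single point where the best
/// implementation changes from one candidate to a different one.
fn find_split(partitions: &[PartitionCosts]) -> Option<Split> {
    let mut sorted: Vec<&PartitionCosts> = partitions.iter().collect();
    sorted.sort_by(|a, b| a.n.total_cmp(&b.n));
    let bests = sorted
        .iter()
        .map(|p| best_impl(&p.costs))
        .collect::<Option<Vec<usize>>>()?;
    let low = *bests.first()?;
    let high = *bests.last()?;
    // First partition where the best implementation differs from the low side.
    let split = bests.iter().position(|&b| b != low)?;
    if bests[split..].iter().all(|&b| b == high) {
        Some(Split { low_impl: low, high_impl: high, threshold_n: sorted[split].n })
    } else {
        None
    }
}

/// Switching costs one clear plus `n` inserts; adapt only if the saving is larger.
fn worth_adapting(saving: f64, clear_cost: f64, insert_cost: f64, n: f64) -> bool {
    saving > clear_cost + n * insert_cost
}
```

If the best implementation alternates more than once as $n$ grows, \code{find\_split} returns \code{None} and no adaptive container is suggested.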

\subsection{Associative Collections for Primrose}

We add a new mapping trait to Primrose to express key-value maps.

\todo{add and list library types}

The constraint solver has been updated to allow properties on dictionaries (\code{dictproperty}), but this is currently unused.

\todo{Summary}