\todo{Introduction}
\todo{Aims / expected input}

\section{Overview of approach}

Selection between container implementations is done in four steps:

\begin{itemize}
\item Finding the implementations that satisfy the program's functional requirements
\item Estimating the cost of each operation, for each implementation, at any given collection size $n$
\item Profiling the program to rank the `importance' of each operation
\item Combining the cost estimates and the profiling data to estimate the relative cost of each implementation
\end{itemize}
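
The last step above can be read as a weighted sum: for each operation, the number of calls seen during profiling multiplied by the estimated cost of one call at the profiled size. The following is a minimal sketch under that assumption, not the actual implementation. The names \code{CostModel} and \code{Profile}, the use of strings to name operations, and the representation of a cost model as polynomial coefficients are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical cost model: polynomial coefficients, lowest degree first.
type CostModel = Vec<f64>;

/// Estimated cost of one call of an operation at collection size `n`.
fn estimate(model: &CostModel, n: f64) -> f64 {
    // Horner's method: evaluates c0 + c1*n + c2*n^2 + ...
    model.iter().rev().fold(0.0, |acc, c| acc * n + c)
}

/// Hypothetical profile: how often each operation ran, at a typical size `n`.
struct Profile {
    n: f64,
    op_counts: HashMap<&'static str, f64>,
}

/// Total estimated cost of one implementation under a profile.
/// Operations without a model are skipped here (an assumption of this sketch).
fn implementation_cost(models: &HashMap<&'static str, CostModel>, profile: &Profile) -> f64 {
    profile
        .op_counts
        .iter()
        .filter_map(|(op, count)| models.get(op).map(|m| count * estimate(m, profile.n)))
        .sum()
}

/// Name of the cheapest candidate implementation, if there is one.
fn cheapest(
    candidates: &HashMap<&'static str, HashMap<&'static str, CostModel>>,
    profile: &Profile,
) -> Option<&'static str> {
    candidates
        .iter()
        .map(|(name, models)| (*name, implementation_cost(models, profile)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(name, _)| name)
}
```

With a linear model for one candidate and a constant model for another, a profile dominated by lookups at large $n$ would favour the constant-time candidate.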

\subsection{Cost Estimation}

We use an approach similar to CollectionSwitch~\parencite{costa_collectionswitch_2018}, which assumes that the main factor in how long an operation takes is the current size of the collection.

Each operation has a separate cost model, which we build by executing the operation repeatedly on collections of various sizes.

For example, to build a cost model for \code{Vec::contains}, we would create several \code{Vec}s of varying sizes, and find the average execution time $t$ of \code{contains} at each.
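
As an illustration of this measurement step, the sketch below times \code{Vec::contains} at several sizes using only the standard library. It is a simplified assumption of how such a benchmark could look, not the benchmarking code used in this project: the parameter \code{repeats}, the choice to search for an element that is never present, and the use of wall-clock time are all choices made for the example.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Average time in nanoseconds of one `Vec::contains` call at each size.
/// Returns one `(n, average time)` pair per requested size.
fn time_contains(sizes: &[usize], repeats: u32) -> Vec<(usize, f64)> {
    assert!(repeats > 0, "need at least one repetition to average over");
    sizes
        .iter()
        .map(|&n| {
            let v: Vec<usize> = (0..n).collect();
            let missing = n; // never present, so every call scans the whole Vec
            let start = Instant::now();
            for _ in 0..repeats {
                // black_box stops the optimiser from removing the call
                black_box(black_box(&v).contains(black_box(&missing)));
            }
            let avg = start.elapsed().as_nanos() as f64 / f64::from(repeats);
            (n, avg)
        })
        .collect()
}
```

The resulting pairs are the observations of $n$ and $t$ that a cost model would be fitted to. A more careful benchmark would also control for noise, for example by discarding outliers.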

We then fit a polynomial to these observations by linear regression, using the collection size $n$ to predict $t$.
In the case of \code{Vec::contains}, we would expect the resulting polynomial to be roughly linear.

This method works well for many operations and structures, although it has notable limitations.

For example, the container implementation \code{LazySortedVec} (provided by Primrose) inserts new elements at the end by default, and only sorts them when an operation that relies on the order is called.
The cost of an operation therefore depends on which operations preceded it, which a model based only on $n$ cannot capture.

We were unable to work around this, although a potential later solution could be to perform untimed `warmup' operations before each timed operation.
This is complex because it requires some understanding of which operations have work deferred to them.
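
To make the problem concrete, the following toy container defers its sorting in the way described above. It is a hypothetical stand-in written for this example and is not Primrose's \code{LazySortedVec}; the type name \code{LazySorted} and its fields are assumptions.

```rust
/// Toy stand-in for a lazily sorted container (not Primrose's implementation).
struct LazySorted {
    items: Vec<i32>,
    sorted: bool,
}

impl LazySorted {
    fn new() -> Self {
        LazySorted { items: Vec::new(), sorted: true }
    }

    /// Cheap: appends to the end and defers the sorting work.
    fn insert(&mut self, x: i32) {
        self.items.push(x);
        self.sorted = false;
    }

    /// Pays for any deferred sort, then does a binary search.
    fn contains(&mut self, x: i32) -> bool {
        if !self.sorted {
            self.items.sort_unstable();
            self.sorted = true;
        }
        self.items.binary_search(&x).is_ok()
    }
}
```

A benchmark that times \code{insert} alone only ever sees the cheap append, whilst the first \code{contains} after a run of inserts is charged for the whole sort. A second \code{contains} at the same size is much cheaper, so the two calls have different costs at the same $n$.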

Once we have the data, we fit a polynomial to it.
Whilst we could use a more complex technique, in practice this is good enough: very few common operations are worse than $O(n^3)$, and a polynomial approximates factors such as logarithms closely enough.
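
One standard way to fit such a polynomial is ordinary least squares via the normal equations. The sketch below shows this technique using only the standard library; it is an illustration under that assumption and may differ from the fitting routine actually used. The singularity threshold is an arbitrary choice for the example.

```rust
/// Least-squares polynomial fit via the normal equations (A^T A) c = A^T t,
/// where A[i][j] = ns[i]^j. Returns coefficients lowest degree first,
/// or None if the system is (numerically) singular.
fn fit_polynomial(ns: &[f64], ts: &[f64], degree: usize) -> Option<Vec<f64>> {
    let m = degree + 1;
    // Augmented matrix [A^T A | A^T t].
    let mut aug = vec![vec![0.0; m + 1]; m];
    for (&n, &t) in ns.iter().zip(ts) {
        let powers: Vec<f64> = (0..m).map(|j| n.powi(j as i32)).collect();
        for r in 0..m {
            for c in 0..m {
                aug[r][c] += powers[r] * powers[c];
            }
            aug[r][m] += powers[r] * t;
        }
    }
    // Gauss-Jordan elimination with partial pivoting.
    for col in 0..m {
        let pivot = (col..m)
            .max_by(|&a, &b| aug[a][col].abs().total_cmp(&aug[b][col].abs()))?;
        if aug[pivot][col].abs() < 1e-12 {
            return None; // no usable pivot: the data cannot determine this degree
        }
        aug.swap(col, pivot);
        for row in 0..m {
            if row != col {
                let factor = aug[row][col] / aug[col][col];
                for k in col..=m {
                    let pivot_value = aug[col][k];
                    aug[row][k] -= factor * pivot_value;
                }
            }
        }
    }
    // The matrix is now diagonal, so each coefficient is a single division.
    Some((0..m).map(|i| aug[i][m] / aug[i][i]).collect())
}
```

The normal equations become poorly conditioned when $n$ is large and the degree is high, so a practical implementation would probably rescale $n$ first or use an established numerical library.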

We cache this data for as long as the implementation is unchanged.
Whilst it would be possible to share this data across computers, micro-architecture can have a large effect on collection performance~\parencite{jung_brainy_2011}, so we instead calculate it on demand on each machine.

\subsection{Profiling}

As mentioned above, the ordering of operations can have a large effect on container performance.
Unfortunately, tracking every container operation in order quickly becomes infeasible, so we settle for tracking the count of each operation, and the size of the collection.
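
A counting wrapper of this kind can be sketched as follows. This is a hypothetical illustration rather than the profiling code itself: the type name \code{Profiled}, the choice of \code{Vec} as the wrapped container, and the decision to record the largest size seen are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical profiling wrapper around a `Vec`; names are illustrative only.
struct Profiled<T> {
    inner: Vec<T>,
    op_counts: HashMap<&'static str, u64>,
    max_n: usize,
}

impl<T: PartialEq> Profiled<T> {
    fn new() -> Self {
        Profiled { inner: Vec::new(), op_counts: HashMap::new(), max_n: 0 }
    }

    /// Records one call of `op` and updates the largest size seen so far.
    fn record(&mut self, op: &'static str) {
        *self.op_counts.entry(op).or_insert(0) += 1;
        self.max_n = self.max_n.max(self.inner.len());
    }

    fn push(&mut self, x: T) {
        self.inner.push(x);
        self.record("push");
    }

    fn contains(&mut self, x: &T) -> bool {
        self.record("contains");
        self.inner.contains(x)
    }
}
```

Only counts and a size are stored, so the memory used by profiling does not grow with the number of operations performed.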

Every instance of the collection is tracked separately, and results are collated after profiling.
Results with similar $n$ values are grouped into partitions, where each partition holds the average count of each operation, and a weight indicating how common results in that partition were.
This compresses the data, and also allows us to search for adaptive containers later.
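
The partitioning step can be sketched as below. The text does not define when two $n$ values are close enough, so the relative \code{tolerance} used here is an assumption, as are the names \code{Observation} and \code{Partition}.

```rust
use std::collections::HashMap;

/// One profiled collection instance: its size and how often each operation ran.
struct Observation {
    n: f64,
    op_counts: HashMap<&'static str, f64>,
}

/// A group of observations with similar `n`.
struct Partition {
    avg_n: f64,
    avg_op_counts: HashMap<&'static str, f64>,
    /// Fraction of all observations that fell into this partition.
    weight: f64,
}

/// Groups observations whose `n` is within a relative `tolerance` of the
/// smallest `n` in the current group (an assumed rule for this sketch).
fn partition(mut obs: Vec<Observation>, tolerance: f64) -> Vec<Partition> {
    obs.sort_by(|a, b| a.n.total_cmp(&b.n));
    let total = obs.len() as f64;
    let mut groups: Vec<Vec<Observation>> = Vec::new();
    for o in obs {
        let start_new = groups
            .last()
            .map_or(true, |g| o.n > g[0].n * (1.0 + tolerance));
        if start_new {
            groups.push(vec![o]);
        } else if let Some(g) = groups.last_mut() {
            g.push(o);
        }
    }
    groups
        .iter()
        .map(|g| {
            let len = g.len() as f64;
            let mut avg_op_counts = HashMap::new();
            for o in g {
                for (op, count) in &o.op_counts {
                    *avg_op_counts.entry(*op).or_insert(0.0) += count / len;
                }
            }
            Partition {
                avg_n: g.iter().map(|o| o.n).sum::<f64>() / len,
                avg_op_counts,
                weight: len / total,
            }
        })
        .collect()
}
```

With a tolerance of 0.1, observations at sizes 98, 100 and 105 would share one partition, whilst an observation at size 1000 would form its own.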

\todo{deal with not taking into account 'preparatory' operations during benchmarks}

\todo{Combining}

\subsection{Adaptive Containers}

Adaptive containers are implemented using const generics and a wrapper type.
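
The general shape of such a wrapper can be sketched as follows. This is a hypothetical example rather than the generated code: the type name \code{Adaptive}, the pairing of \code{Vec} with \code{BTreeSet}, and the fixed element type are assumptions. The const generic parameter \code{THRESHOLD} is the size above which the wrapper switches implementation.

```rust
use std::collections::BTreeSet;

/// Hypothetical adaptive set: a `Vec` while small, a `BTreeSet` once it
/// grows past `THRESHOLD` elements.
enum Adaptive<const THRESHOLD: usize> {
    Small(Vec<u32>),
    Large(BTreeSet<u32>),
}

impl<const THRESHOLD: usize> Adaptive<THRESHOLD> {
    fn new() -> Self {
        Adaptive::Small(Vec::new())
    }

    fn insert(&mut self, x: u32) {
        match self {
            Adaptive::Small(v) => {
                // Keep set semantics while backed by a Vec.
                if !v.contains(&x) {
                    v.push(x);
                }
                if v.len() > THRESHOLD {
                    // Switch: empty the old container and insert every element
                    // into the new one.
                    let set: BTreeSet<u32> = v.drain(..).collect();
                    *self = Adaptive::Large(set);
                }
            }
            Adaptive::Large(s) => {
                s.insert(x);
            }
        }
    }

    fn contains(&self, x: u32) -> bool {
        match self {
            Adaptive::Small(v) => v.contains(&x),
            Adaptive::Large(s) => s.contains(&x),
        }
    }

    fn is_large(&self) -> bool {
        matches!(self, Adaptive::Large(_))
    }
}
```

Because the threshold is a compile-time constant, a selection tool can emit a type such as \code{Adaptive<1000>} without changing any of the code that uses the container.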

They are detected by finding the best implementation for each partition, sorting the partitions by $n$, and checking whether they can be split into two groups such that a different implementation is best on each side of the split.
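
The detection step can be sketched as a search for a single change point. The function below is an illustrative assumption of how this could be done: it takes one pair of $n$ and best implementation name per partition, and it only accepts a split where each side agrees on exactly one implementation.

```rust
/// Looks for a split in partitions (sorted by `n`) such that one implementation
/// is best for every partition before it and a different one for every
/// partition from it onwards. Returns (before, n at the split, after).
fn find_split<'a>(best: &[(f64, &'a str)]) -> Option<(&'a str, f64, &'a str)> {
    let mut sorted = best.to_vec();
    sorted.sort_by(|a, b| a.0.total_cmp(&b.0));
    // Index of the first partition whose best implementation differs from
    // that of the smallest partition; None if they all agree.
    let split = sorted.iter().position(|p| p.1 != sorted[0].1)?;
    // Everything from the split onwards must agree on a single implementation.
    if sorted[split..].iter().all(|p| p.1 == sorted[split].1) {
        Some((sorted[0].1, sorted[split].0, sorted[split].1))
    } else {
        None
    }
}
```

If the best implementation alternates more than once as $n$ grows, this sketch reports no split, since a single threshold could not describe it.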

We then check whether the cost saving is greater than the cost of a clear operation and $n$ insert operations, which is the work needed to move the elements from one implementation to the other.
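
This check amounts to a simple comparison, sketched below. The function and parameter names are hypothetical, and the per-operation costs are taken as inputs rather than looked up from a cost model.

```rust
/// Estimated cost of moving `n` elements between implementations:
/// one `clear` on the old container plus `n` inserts into the new one.
fn switching_cost(clear_cost: f64, insert_cost: f64, n: usize) -> f64 {
    clear_cost + insert_cost * n as f64
}

/// True when the adaptive container is still cheaper than the best single
/// implementation after paying for the switch.
fn worth_switching(best_single_cost: f64, adaptive_cost: f64, switch_cost: f64) -> bool {
    best_single_cost - adaptive_cost > switch_cost
}
```

For instance, with a clear cost of 5, an insert cost of 2 and $n = 10$, the switch costs 25, so an adaptive container is only worthwhile if it saves more than 25 over the best single implementation.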

\subsection{Associative Collections for Primrose}

We add a new mapping trait to Primrose to express key-value maps.
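
As an indication of what such a trait could look like, the sketch below defines a small key-value interface and implements it for two standard library maps. The trait name \code{Mapping} and its method set are assumptions made for this example; they are not the trait added to Primrose.

```rust
use std::collections::{BTreeMap, HashMap};
use std::hash::Hash;

/// Hypothetical key-value trait; the method set is illustrative only.
trait Mapping<K, V> {
    fn insert(&mut self, key: K, value: V) -> Option<V>;
    fn get(&self, key: &K) -> Option<&V>;
    fn remove(&mut self, key: &K) -> Option<V>;
    fn len(&self) -> usize;
}

impl<K: Hash + Eq, V> Mapping<K, V> for HashMap<K, V> {
    fn insert(&mut self, key: K, value: V) -> Option<V> { HashMap::insert(self, key, value) }
    fn get(&self, key: &K) -> Option<&V> { HashMap::get(self, key) }
    fn remove(&mut self, key: &K) -> Option<V> { HashMap::remove(self, key) }
    fn len(&self) -> usize { HashMap::len(self) }
}

impl<K: Ord, V> Mapping<K, V> for BTreeMap<K, V> {
    fn insert(&mut self, key: K, value: V) -> Option<V> { BTreeMap::insert(self, key, value) }
    fn get(&self, key: &K) -> Option<&V> { BTreeMap::get(self, key) }
    fn remove(&mut self, key: &K) -> Option<V> { BTreeMap::remove(self, key) }
    fn len(&self) -> usize { BTreeMap::len(self) }
}

/// Code written against the trait works with either implementation.
fn fill<M: Mapping<&'static str, u32>>(m: &mut M) {
    m.insert("a", 1);
    m.insert("b", 2);
    m.insert("a", 3); // overwrites the earlier value for "a"
}
```

Writing user code against the trait rather than a concrete type is what allows the underlying map implementation to be swapped without further changes.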

\todo{add and list library types}

The constraint solver has been updated to allow properties on dicts (dictproperty), but this remains unused.

\todo{Summary}