author | Aria Shrimpton <me@aria.rip> | 2024-03-31 19:35:02 +0100 |
---|---|---|
committer | Aria Shrimpton <me@aria.rip> | 2024-03-31 19:35:02 +0100 |
commit | 8ddb66266b2d433f235fdfce68259f3a642575da (patch) | |
tree | e3ed7dc653692bb13760eae6c3033b254994c485 /thesis/parts/implementation.tex | |
parent | 597b6678380a88864a518c11933fdff63705a5cb (diff) |
redraft #1
Diffstat (limited to 'thesis/parts/implementation.tex')
-rw-r--r-- | thesis/parts/implementation.tex | 88 |
1 files changed, 48 insertions, 40 deletions
diff --git a/thesis/parts/implementation.tex b/thesis/parts/implementation.tex index 8c5483d..bba1a3f 100644 --- a/thesis/parts/implementation.tex +++ b/thesis/parts/implementation.tex @@ -1,4 +1,4 @@ -This chapter elaborates on some implementation details glossed over in the previous chapter. +We now elaborate on our implementation, explaining some of the finer details of our design. With reference to the source code, we explain the structure of our system's implementation, and highlight areas with difficulties. \section{Modifications to Primrose} @@ -8,15 +8,15 @@ In order to facilitate integration with Primrose, we refactored large parts of t This also required updating the older code to a newer edition of Rust, and improving the error handling throughout. %% Mapping trait -As suggested in the original paper, we added the ability to ask for associative container types: ones that map a key to a value. -This was done by adding a new \code{Mapping} trait to the library, and updating the type checking and analysis code to support multiple type variables in container type declarations, and be aware of the operations available on mappings. +As suggested in the original paper, we added the ability to deal with associative container types: key to value mappings. +We added the \code{Mapping} trait to the implementation library, and updated the type checking and analysis code to support multiple type variables. Operations on mapping implementations can be modelled and checked against constraints in the same way that regular containers can be. They are modelled in Rosette as a list of key-value pairs. \code{src/crates/library/src/hashmap.rs} shows how mapping container types can be declared, and operations on them modelled. Table \ref{table:library} shows the library of container types we used. -Most come from the Rust standard library, with the exceptions of \code{SortedVec} and \code{SortedUniqueVec}, which use \code{Vec} internally. 
+Most come from the Rust standard library, with the exceptions of the \code{SortedVec} family of containers, which use \code{Vec} internally. The library source can be found in \code{src/crates/library}. \begin{table}[h] @@ -46,95 +46,88 @@ We also added new syntax to Primrose's domain-specific language to support defin While performing integration testing, we found and fixed several other issues with the existing code: \begin{enumerate} -\item Only push and pop operations could be modelled in properties without raising an error during type-checking. -\item The Rosette code generated for properties using other operations would be incorrect. +\item Only push and pop operations could be modelled in properties. Other operations would raise an error during type-checking. +\item The Rosette code generated for properties using other operations was incorrect. \item Some trait methods used mutable borrows unnecessarily, making it difficult or impossible to write safe Rust using them. \item The generated code would perform an unnecessary heap allocation for every created container, which could affect performance. \end{enumerate} -We also added a requirement for all \code{Container}s and \code{Mappings} to implement \code{IntoIterator} and \code{FromIterator}, as well as to allow iterating over elements. +We also added requirements to the \code{Container} and \code{Mapping} traits related to Rust's \code{Iterator} API. +Among other things, this allows us to use for loops, and to more easily move data from one implementation to another. \section{Building cost models} %% Benchmarker crate -In order to benchmark container types, we use a separate crate (\code{src/crates/candelabra-benchmarker}) which contains benchmarking code for each trait in the Primrose library. +In order to benchmark container types, we use a separate crate (\code{src/crates/benchmarker}) containing benchmarking code for each trait in the Primrose library. 
When benchmarks need to be run for an implementation, we dynamically generate a new crate, which runs all benchmark methods appropriate for the given implementation (\code{src/crate/candelabra/src/cost/benchmark.rs}). As Rust's generics are monomorphised, our generic code is compiled as if we were using the concrete type in our code, so we don't need to worry about affecting the benchmark results. Each benchmark is run in a 'warmup' loop for a fixed amount of time (currently 500ms), then runs for a fixed number of iterations (currently 50). -This is important because we use every observation when fitting our cost models, so varying the number of iterations would change our curve's fit. -We repeat each benchmark at a range of $n$ values, ranging from $10$ to $60,000$. +This is important because we are using least squares fitting - if there are less data points at higher $n$ values then our resulting model may not fit those points as well. +We repeat each benchmark at a range of $n$ values: $10, 50, 100, 250, 500, 1,000, 6,000, 12,000, 24,000, 36,000, 48,000, 60,000$. Each benchmark we run corresponds to one container operation. For most operations, we insert $n$ random values to a new container, then run the operation once per iteration. For certain operations which are commonly amortized (\code{insert}, \code{push}, and \code{pop}), we instead run the operation itself $n$ times and divide all data points by $n$. -We use least squares to fit a polynomial to all of our data. -As operations on most common data structures are polynomial or logarithmic complexity, we believe that least squares fitting is good enough to capture the cost of most operations. -We originally experimented with coefficients up to $x^3$, but found that this led to bad overfitting. +As discussed previously, we discard all points that are outwith one standard deviation of the mean for each $n$ value. 
+We use the least squares method to fit a polynomial of form $x_0 + x_1 n + x_2 n^2 + x_3 \log_2 n$. +As most operations on common data structures are polynomial or logarithmic complexity, we believe that least squares fitting is good enough to capture the cost of most operations. +We originally experimented with coefficients up to $x^3$, but found that this led to overfitting. \section{Profiling} -We implement profiling by using a \code{ProfilerWrapper} type (\code{src/crates/library/src/profiler.rs}), which takes as type parameters the 'inner' container implementation and an index later used to identify what type the profiling info corresponds to. +We implement profiling using a \code{ProfilerWrapper} type (\code{src/crates/library/src/profiler.rs}), which takes as type parameters the inner container implementation and an index, used later to identify what container type the output corresponds to. We then implement any primrose traits that the inner container implements, counting the number of times each operation is called. We also check the length of the container after each insertion operation, and track the maximum. -This tracking is done per-instance, and recorded when the instance goes out of scope and its \code{Drop} implementation is called. +Tracking is done per-instance, and recorded when the container goes out of scope and its \code{Drop} implementation is called. We write the counts of each operation and maximum size of the collection to a location specified by an environment variable. When we want to profile a program, we pick any valid inner implementation for each selection site, and use that candidate with our profiling wrapper as the concrete implementation for that site. +We then run all of the program's benchmarks once, which gives us an equal sample of data from each of them. This approach has the advantage of giving us information on each individual collection allocated, rather than only statistics for the type as a whole. 
For example, if one instance of a container type is used in a very different way from the rest, we will be able to see it more clearly than a normal profiling tool would allow us to. -Although there is noticeable overhead in our current implementation, it's not important as we aren't measuring the program's execution time when profiling. -Future work could likely improve the overhead by batching file outputs, however this wasn't necessary for us. +Although there is noticeable overhead in our current implementation, this is not important as we aren't measuring the program's execution time when profiling. +We could likely reduce profiling overhead by batching file outputs, however this wasn't necessary for us. -\section{Selection and Codegen} +\section{Container Selection} +\label{section:impl_selection} %% Selection Algorithm incl Adaptiv Selection is done per container type. For each candidate implementation, we calculate its cost on each partition in the profiler output, then sum these values to get the total estimated cost for each implementation. -This provides us with estimates for each singular candidate. +This is implemented in \code{src/crates/candelabra/src/profiler/info.rs} and \code{src/crates/candelabra/src/select.rs}. In order to try and suggest an adaptive container, we use the following algorithm: \begin{enumerate} -\item Sort partitions in order of ascending maximum n values. -\item Calculate the cost for each candidate and for each partition +\item Sort the list of partitions in order of ascending maximum n values. +\item Calculate the cost for each candidate in each partition individually. \item For each partition, find the best candidate and store it in the array \code{best}. Note that we don't sum across all partitions this time. 
\item Find the lowest index \code{i} where \code{best[i] != best[0]} -\item Check that \code{i} partitions the list properly: For all \code{j < i}, \code{best[j] == best[0]} and for all \code{j>=i}, \code{best[j] == best[i]}. +\item Check that \code{i} splits the list properly: For all \code{j < i}, \code{best[j] == best[0]} and for all \code{j>=i}, \code{best[j] == best[i]}. \item Let \code{before} be the name of the candidate in \code{best[0]}, \code{after} be the name of the candidate in \code{best[i]}, and \code{threshold} be halfway between the maximum n values of partition \code{i} and partition \code{i-1}. \item Calculate the cost of switching as: $$ - C_{\textrm{before,clear}}(\textrm{threshold}) + \textrm{threshold} * C_{\textrm{after,insert}}(\textrm{threshold}) + C_{\mathit{before,clear}}(\mathit{threshold}) + \mathit{threshold} * C_{\mathit{after,insert}}(\mathit{threshold}) $$ \item Calculate the cost of not switching: The sum of the difference in cost between \code{before} and \code{after} for all partitions with index \code{> i}. -\item If the cost of not switching is less than the cost of switching, we can't make a suggestion. -\item Otherwise, suggest an adaptive container which switches from \code{before} to \code{after} when $n$ gets above \code{threshold}. Its estimated cost is the cost for \code{before} up to partition \code{i}, plus the cost of \code{after} for all other partitions. +\item If the cost of not switching is less than the cost of switching, don't make a suggestion. +\item Otherwise, suggest an adaptive container which switches from \code{before} to \code{after} when $n$ gets above \code{threshold}. Its estimated cost is the cost for \code{before} up to partition \code{i}, plus the cost of \code{after} for all other partitions, and the cost of switching. 
\end{enumerate} +\section{Code Generation} + %% Generated code (opaque types) -As mentioned above, the original Primrose code would generate code as in Listing \ref{lst:primrose_codegen}. +As mentioned in chapter \ref{chap:design}, we made modifications to Primrose's code generation in order to improve the resulting code's performance. +The original Primrose code would generate code as in Listing \ref{lst:primrose_codegen}. In order to ensure that users specify all of the traits they need, this code only exposes methods on the implementation that are part of the trait bounds given. However, it does this by using a \code{dyn} object, Rust's mechanism for dynamic dispatch. -Although this approach works, it adds an extra layer of indirection to every call: The caller must use the dyn object's vtable to find the method it needs to call. -This also prevents the compiler from optimising across this boundary. - -In order to avoid this, we make use of Rust's support for existential types: Types that aren't directly named, but are inferred by the compiler. -Existential types only guarantee their users the given trait bounds, therefore they accomplish the same goal of forcing users to specify all of their trait bounds upfront. - -Figure \ref{lst:new_codegen} shows our equivalent generated code. -The type alias \code{Stack<S>} only allows users to use the \code{Container<S>}, \code{Stack<S>}, and \code{Default} traits. -Our unused 'dummy' function \code{_StackCon} has the return type \code{Stack<S>}. -Rust's type inference step sees that its actual return type is \code{Vec<S>}, and therefore sets the concrete type of \code{Stack<S>} to \code{Vec<S>} at compile time. - -Unfortunately, this feature is not yet in stable Rust, meaning we have to opt in to it using an unstable compiler flag (\code{feature(type_alias_impl_trait)}). 
-At time of writing, the main obstacle to stabilisation appears to be design decisions that only apply to more complicated use-cases, therefore we are confident that this code will remain valid and won't encounter any compiler bugs. - \begin{figure}[h] \begin{lstlisting}[caption=Code generated by original Primrose project,label={lst:primrose_codegen},language=Rust] pub trait StackTrait<T> : Container<T> + Stack<T> {} @@ -155,6 +148,17 @@ impl<T: 'static + Ord + std::hash::Hash> ContainerConstructor for Stack<T> { \end{lstlisting} \end{figure} +Although this approach works, it adds an extra layer of indirection to every call: The caller must use the dyn object's vtable to find the method it needs to call. +This also prevents the compiler from optimising across this boundary. + +In order to avoid this, we make use of Rust's support for existential types: Types that aren't directly named, but are inferred by the compiler. +Existential types only guarantee their users the given trait bounds, therefore they accomplish the same goal of forcing users to specify all of their trait bounds upfront. + +Figure \ref{lst:new_codegen} shows our equivalent generated code. +The type alias \code{Stack<S>} only allows users to use the \code{Container<S>}, \code{Stack<S>}, and \code{Default} traits. +Our unused 'dummy' function \code{_StackCon} has the return type \code{Stack<S>}. +Rust's type inference step sees that its actual return type is \code{Vec<S>}, and therefore sets the concrete type of \code{Stack<S>} to \code{Vec<S>} at compile time. 
+ \begin{figure}[h] \begin{lstlisting}[caption=Code generated with new method,label={lst:new_codegen},language=Rust] pub type StackCon<S: PartialEq + Ord + std::hash::Hash> = impl Container<S> + Stack<S> + Default; @@ -165,3 +169,7 @@ fn _StackCon<S: PartialEq + Ord + std::hash::Hash>() -> StackCon<S> { } \end{lstlisting} \end{figure} + +Unfortunately, this feature is not yet in stable Rust, meaning we have to opt in to it using an unstable compiler flag (\code{feature(type_alias_impl_trait)}). +At time of writing, the main obstacle to stabilisation appears to be design decisions that only apply to more complicated use-cases, therefore we are confident that this code will remain valid and won't encounter any compiler bugs. +
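The cost model described in the "Building cost models" section has the form $x_0 + x_1 n + x_2 n^2 + x_3 \log_2 n$ and is fitted by least squares. The following is a minimal sketch of such a fit, not the code from the repository: the names `basis`, `fit` and `estimate` are hypothetical, and the choice to solve the normal equations with column scaling and Gauss-Jordan elimination is an assumption made to keep the example dependency-free.

```rust
/// Basis functions of the cost model: 1, n, n^2 and log2(n).
/// Assumes n > 0 so that the logarithm is finite.
fn basis(n: f64) -> [f64; 4] {
    [1.0, n, n * n, n.log2()]
}

/// Evaluate a fitted model at a given n.
fn estimate(coeffs: &[f64; 4], n: f64) -> f64 {
    basis(n).iter().zip(coeffs).map(|(b, c)| b * c).sum()
}

/// Fit [x0, x1, x2, x3] to (n, cost) observations by ordinary least squares.
/// Returns None when there is not enough data to determine the coefficients.
fn fit(points: &[(f64, f64)]) -> Option<[f64; 4]> {
    // Scale each basis column by its largest magnitude, so that the
    // n^2 column does not dominate the normal equations numerically.
    let mut scale = [0.0f64; 4];
    for &(n, _) in points {
        let b = basis(n);
        for k in 0..4 {
            scale[k] = scale[k].max(b[k].abs());
        }
    }
    if scale.iter().any(|&s| s == 0.0) {
        return None;
    }

    // Build the augmented normal equations [A^T A | A^T y].
    let mut m = [[0.0f64; 5]; 4];
    for &(n, y) in points {
        let b = basis(n);
        for r in 0..4 {
            for c in 0..4 {
                m[r][c] += (b[r] / scale[r]) * (b[c] / scale[c]);
            }
            m[r][4] += (b[r] / scale[r]) * y;
        }
    }

    // Gauss-Jordan elimination with partial pivoting.
    for col in 0..4 {
        let pivot = (col..4).max_by(|&a, &b| m[a][col].abs().total_cmp(&m[b][col].abs()))?;
        if m[pivot][col].abs() < 1e-12 {
            return None; // singular system, e.g. too few distinct n values
        }
        m.swap(col, pivot);
        for row in 0..4 {
            if row != col {
                let f = m[row][col] / m[col][col];
                for k in col..5 {
                    m[row][k] -= f * m[col][k];
                }
            }
        }
    }

    // Read off the solution and undo the column scaling.
    let mut coeffs = [0.0f64; 4];
    for k in 0..4 {
        coeffs[k] = m[k][4] / m[k][k] / scale[k];
    }
    Some(coeffs)
}
```

Calling `fit` on the observations for one operation yields the four coefficients, and `estimate` then predicts the cost of that operation at any container size $n$.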
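The "Profiling" section describes a wrapper that counts operations per instance, tracks the maximum length, and records the result when the instance is dropped. The sketch below illustrates that pattern under simplifying assumptions: `CountingStack` and `Report` are hypothetical names, the wrapper only covers `push` and `pop` on a `Vec`, and results go to an in-memory sink rather than to a location given by an environment variable as in the thesis.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Per-instance profiling record (hypothetical layout).
#[derive(Debug, Clone, PartialEq, Default)]
struct Report {
    site: usize,    // index identifying the selection site
    pushes: usize,  // number of push calls
    pops: usize,    // number of pop calls
    max_len: usize, // largest length observed after an insertion
}

/// Wraps an inner container and counts the operations performed on it.
struct CountingStack<T> {
    inner: Vec<T>,
    stats: Report,
    sink: Rc<RefCell<Vec<Report>>>,
}

impl<T> CountingStack<T> {
    fn new(site: usize, sink: Rc<RefCell<Vec<Report>>>) -> Self {
        Self {
            inner: Vec::new(),
            stats: Report { site, ..Report::default() },
            sink,
        }
    }

    fn push(&mut self, value: T) {
        self.inner.push(value);
        self.stats.pushes += 1;
        // Check the length after each insertion and keep the maximum.
        self.stats.max_len = self.stats.max_len.max(self.inner.len());
    }

    fn pop(&mut self) -> Option<T> {
        self.stats.pops += 1;
        self.inner.pop()
    }
}

impl<T> Drop for CountingStack<T> {
    /// Record this instance's counts when it goes out of scope.
    fn drop(&mut self) {
        self.sink.borrow_mut().push(self.stats.clone());
    }
}
```

Because each instance emits its own record, a container that is used very differently from the others at the same site shows up as a distinct entry rather than being averaged away.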
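The adaptive container steps listed in the "Container Selection" section can be sketched roughly as follows. This is an illustration, not the code in \code{select.rs}: `Adaptive` and `suggest_adaptive` are hypothetical names, candidates are identified by index, partitions are assumed to be already sorted by ascending maximum $n$, and the switching cost $C_{\mathit{before,clear}} + \mathit{threshold} * C_{\mathit{after,insert}}$ is supplied by the caller as a closure.

```rust
/// A suggestion to use `before` until n exceeds `threshold`, then `after`.
#[derive(Debug, PartialEq)]
struct Adaptive {
    before: usize,
    after: usize,
    threshold: f64,
    cost: f64,
}

/// `max_n[p]` is the maximum n of partition p (sorted ascending).
/// `costs[c][p]` is the estimated cost of candidate c on partition p.
/// `switch_cost(before, after, threshold)` estimates the cost of migrating.
fn suggest_adaptive(
    max_n: &[f64],
    costs: &[Vec<f64>],
    switch_cost: impl Fn(usize, usize, f64) -> f64,
) -> Option<Adaptive> {
    let parts = max_n.len();

    // Best candidate for each partition individually (no summing).
    let best: Vec<usize> = (0..parts)
        .map(|p| (0..costs.len()).min_by(|&a, &b| costs[a][p].total_cmp(&costs[b][p])))
        .collect::<Option<Vec<usize>>>()?;

    // Lowest index whose best candidate differs from the first partition's.
    let first = *best.first()?;
    let i = best.iter().position(|&b| b != first)?;

    // The split must be clean: everything from i onwards prefers best[i].
    if !best[i..].iter().all(|&b| b == best[i]) {
        return None;
    }

    let (before, after) = (first, best[i]);
    let threshold = (max_n[i - 1] + max_n[i]) / 2.0;

    // Compare the cost of switching with the cost of not switching,
    // summed over partitions with index > i as in the listed steps.
    let switching = switch_cost(before, after, threshold);
    let not_switching: f64 = ((i + 1)..parts)
        .map(|p| costs[before][p] - costs[after][p])
        .sum();
    if not_switching < switching {
        return None;
    }

    // Estimated cost: `before` up to partition i, `after` for the rest,
    // plus the cost of switching.
    let cost = costs[before][..i].iter().sum::<f64>()
        + costs[after][i..].iter().sum::<f64>()
        + switching;
    Some(Adaptive { before, after, threshold, cost })
}
```

A `None` result covers every case in which no suggestion is made: a single candidate is best everywhere, the preferences do not split cleanly into two runs, or switching costs more than it saves.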