From 47941ae2594c8eb3cea07d40352a96a7243b8cee Mon Sep 17 00:00:00 2001
From: Aria Shrimpton
Date: Mon, 1 Apr 2024 15:20:39 +0100
Subject: redraft #2

---
 thesis/parts/design.tex | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/thesis/parts/design.tex b/thesis/parts/design.tex
index 01cd858..796549e 100644
--- a/thesis/parts/design.tex
+++ b/thesis/parts/design.tex
@@ -7,7 +7,7 @@ We leave detailed discussion of implementation for chapter \ref{chap:implementat
 \section{Aims \& Usage}
-As mentioned previously, we aim to create an all-in-one solution for container selection that can select based on both functional and non-functional requirements.
+As mentioned previously, we aim to create an all-in-one solution for container selection that takes into account both functional and non-functional requirements.
 Flexibility is a high priority: It should be easy to add new container implementations, and to integrate our system into existing applications.
 Our system should also be able to scale to larger programs, and remain convenient for developers to use.
@@ -47,8 +47,8 @@ The first must implement the \code{Container} and \code{Stack} traits, and must
 The second container type, \code{Primes}, must implement the \code{Container} trait, and must satisfy the \code{ascending} property.
 This property requires that for all consecutive \code{x, y} pairs in the container, \code{x <= y}.
-Once we've specified our functional requirements and provided a benchmark (\code{src/tests/prime_sieve/benches/main.rs}), we can simply run Candelabra to select a container: \code{candelabra-cli -p prime_sieve select}.
-This command outputs something like table \ref{table:selection_output}, and saves the best combination of container types to be used the next time the program is run.
+Once we have specified our functional requirements and provided a benchmark, we can simply run Candelabra to select a container: \code{candelabra-cli -p prime_sieve select}.
+This command outputs the information in table \ref{table:selection_output} and saves the best combination of container types to be used the next time the program is run.
 Here, the code generated uses \code{Vec} as the implementation for \code{Sieve}, and \code{HashSet} as the implementation for \code{Primes}.
 \begin{table}[h]
@@ -66,18 +66,19 @@ Here, the code generated uses \code{Vec} as the implementation for \code{Sieve},
 \label{table:selection_output}
 \end{table}
+\newpage
 \section{Overview of process}
 Our tool integrates with Rust's packaging system (Cargo) to discover the information it needs about our project.
-It then runs Primrose to find a list of implementations satsifying our functional requirements from a pre-built library of container implementations.
+It then runs a modified version of Primrose \citep{qin_primrose_2023} to find a list of implementations satsifying our functional requirements from a pre-built library of container implementations.
-Once we have this list, we build a 'cost model' for each candidate type. This allows us to get an upper bound for the runtime cost of an operation at any given n.
-We choose to focus only on CPU time, and disregard memory usage due to the difficulty of accurately measuring memory footprint.\footnote{As Rust is not interpreted, we would need to hook into calls to the OS' memory allocator. This is very platform-specific, although the currently work in progress allocator API may make this easier in future.}
+Once we have this list, we build a \emph{cost model} for each candidate type. This allows us to get an upper bound for the runtime cost of an operation at any given n.
+We choose to focus only on CPU time, and disregard memory usage due to the difficulty of accurately measuring memory footprint.\footnote{As Rust is not interpreted, we would need to hook into calls to the OS' memory allocator. This is very platform-specific, although the currently work in progress allocator API \citep{rust_rfc_allocators} may make this easier in future.}
 We then run the user-provided benchmarks, using a wrapper around any of the valid candidates to track how many times each operation is performed, and the maximum size the container reaches.
 We combine this information with our cost models to estimate a total cost for each candidate, which is an upper bound on the total time taken for all container operations.
-At this point, we also check if an 'adaptive' container would be better, by checking if one implementation is better performing at a lower n, and another at a higher n.
+At this point, we also check if an adaptive container would be better, by checking if one implementation is better performing at a lower n, and another at a higher n.
 Finally, we pick the implementation with the minimum cost, and generate code which allows the program to use that implementation.
@@ -88,10 +89,10 @@ We now go into more detail on how each step works, although we leave some specif
 \section{Functional requirements}
 %% Explain role in entire process
-As described in Chapter \ref{chap:background}, any implementation we pick must satisfy the program's functional requirements.
+As described in chapter \ref{chap:background}, any implementation we pick must satisfy the program's functional requirements.
 To do this, we integrate Primrose \citep{qin_primrose_2023} as a first step.
-Primrose allows users to specify both the traits they require in an implementation (essentially the API and methods available), and what properties must be satisfied.
+Primrose allows users to specify both the traits they require (syntactic properties), and the semantic properties that must be satisfied.
 Each container type that we want to select an implementation for is bound by a list of traits and a list of properties (lines 11 and 12 in Listing \ref{lst:selection_example}).
@@ -128,11 +129,11 @@ Although we use primrose in our implementation, the rest of our system isn't dep
 \section{Cost Models}
-Now that we have a list of correct implementations for each container type, we need a way to understand the performance characteristics of each of them in isolation.
-We use an approach similar to CollectionSwitch\citep{costa_collectionswitch_2018}, which assumes that the main factor in how long an operation takes is the current size of the collection.
+Now that we have a list of correct implementations for each container type, we need a way to understand the performance characteristics of each of them.
+We use an approach similar to CollectionSwitch \citep{costa_collectionswitch_2018}, which assumes that the main factor in how long an operation takes is the current size of the collection.
 %% Benchmarks
-An implementation has a seperate cost model for each operation, which we obtain by executing the operation repeatedly on collections of various sizes.
+Implementations have a seperate cost model for each operation, which we obtain by executing that operation repeatedly at various collection sizes.
 For example, to build a cost model for \code{Vec::contains}, we would create several \code{Vec}s of varying sizes, and find the average execution time $t$ of \code{contains} at each.
@@ -233,7 +234,7 @@ But when the size of the container grows, the cost of doing \code{contains} may
 Adaptive containers attempt to address this need, by starting off with one implementation (referred to as the low or before implementation), and switching to a new implemenation (the high or after implementation) once the size of the container passes a certain threshold.
-This is similar to systems such as CoCo\citep{hutchison_coco_2013} and \cite{osterlund_dynamically_2013}.
+This is similar to systems such as CoCo \citep{hutchison_coco_2013} and \cite{osterlund_dynamically_2013}.
 However, we decide when to switch container implementation before the program is run, rather than as it is running.
 We also do so in a way that requires no knowledge of the implementation internals.
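The adaptive-container idea in the last hunk can be illustrated with a minimal sketch. It is not the thesis's implementation: the type `AdaptiveSet`, its methods, and the choice of `Vec` as the low implementation and `HashSet` as the high implementation are all hypothetical, chosen only to show a switch that happens once the size passes a threshold fixed before the program runs.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical adaptive set (illustration only, not Candelabra's code).
/// It starts with a `Vec` (the "low" implementation) and moves to a
/// `HashSet` (the "high" implementation) once it outgrows `threshold`.
enum AdaptiveSet<T> {
    Low { items: Vec<T>, threshold: usize },
    High(HashSet<T>),
}

impl<T: Eq + Hash> AdaptiveSet<T> {
    /// The threshold is fixed up front, mirroring a decision made
    /// before the program is run rather than while it is running.
    fn new(threshold: usize) -> Self {
        AdaptiveSet::Low { items: Vec::new(), threshold }
    }

    fn insert(&mut self, value: T) {
        // Insert into whichever implementation is active. If the low
        // implementation has outgrown the threshold, take its contents.
        let promote = match self {
            AdaptiveSet::Low { items, threshold } => {
                if !items.contains(&value) {
                    items.push(value);
                }
                if items.len() > *threshold {
                    Some(std::mem::take(items))
                } else {
                    None
                }
            }
            AdaptiveSet::High(set) => {
                set.insert(value);
                None
            }
        };
        // Switch using only the public APIs of both containers.
        if let Some(items) = promote {
            *self = AdaptiveSet::High(items.into_iter().collect());
        }
    }

    fn contains(&self, value: &T) -> bool {
        match self {
            AdaptiveSet::Low { items, .. } => items.contains(value),
            AdaptiveSet::High(set) => set.contains(value),
        }
    }

    fn is_high(&self) -> bool {
        matches!(self, AdaptiveSet::High(_))
    }
}

fn main() {
    let mut set = AdaptiveSet::new(4);
    for i in 0..4 {
        set.insert(i);
    }
    assert!(!set.is_high()); // 4 elements: still the low implementation
    set.insert(4);
    assert!(set.is_high()); // 5 elements: switched to the high implementation
    assert!(set.contains(&2));
}
```

The switch moves elements through the ordinary `Vec` and `HashSet` interfaces, which matches the stated goal of needing no knowledge of implementation internals. How the threshold itself would be chosen from the cost models is outside this sketch.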