diff options
-rw-r--r-- | Tasks.org | 16 | ||||
-rw-r--r-- | analysis/vis.livemd | 23 | ||||
-rw-r--r-- | thesis/assets/insert_containers.png | bin | 0 -> 53213 bytes | |||
-rw-r--r-- | thesis/assets/insert_mappings.png | bin | 0 -> 38720 bytes | |||
-rw-r--r-- | thesis/assets/insert_sets.png | bin | 0 -> 47599 bytes | |||
-rw-r--r-- | thesis/biblio.bib | 200 | ||||
-rw-r--r-- | thesis/main.tex | 6 | ||||
-rw-r--r-- | thesis/parts/results.tex | 90 |
8 files changed, 227 insertions, 108 deletions
@@ -266,17 +266,13 @@ Ideas: ** TODO Results & Analysis -*** TODO Testing setup, benchmarking rationale +*** DONE Testing setup, benchmarking rationale -**** TODO Specs and VM setup +**** DONE Specs and VM setup -**** TODO Reproducibility +**** DONE Reproducibility -**** TODO Chosen benchmarks - -**** TODO Effect of selection on benchmarks (spread in execution time) - -*** TODO Cost model analysis +*** DONE Cost model analysis **** TODO Insertion operations @@ -288,6 +284,10 @@ Ideas: *** TODO Predictions +**** TODO Chosen benchmarks + +**** TODO Effect of selection on benchmarks (spread in execution time) + **** TODO Summarise predicted versus actual **** TODO Evaluate performance diff --git a/analysis/vis.livemd b/analysis/vis.livemd index 9f5a58d..044e41f 100644 --- a/analysis/vis.livemd +++ b/analysis/vis.livemd @@ -96,9 +96,9 @@ cost_models ## Cost model exploratory plots ```elixir -startn = 100 -endn = 5000 -resolution = 50 +startn = 1000 +endn = 60_000 +resolution = 100 points_for = fn impl, op -> %{"coeffs" => [coeffs]} = @@ -127,11 +127,14 @@ end <!-- livebook:{"reevaluate_automatically":true} --> ```elixir -# inspect_op = "insert" -# impls = ["BTreeSet", "EagerSortedVec", "HashSet"] +set_impls = ["BTreeSet", "SortedUniqueVec", "HashSet"] +mapping_impls = ["HashMap", "BTreeMap"] +list_impls = ["Vec", "LinkedList", "SortedVec"] +stack_impls = ["Vec", "LinkedList"] -inspect_op = "pop" -impls = ["Vec", "LinkedList"] +inspect_op = "insert" +# impls = set_impls ++ list_impls +impls = mapping_impls Tucan.layers([ cost_models @@ -154,9 +157,13 @@ Tucan.layers([ clip: true ) ]) +|> Tucan.Axes.set_y_title("Estimated cost") +|> Tucan.Axes.set_x_title("Size of container (n)") |> Tucan.Scale.set_x_domain(startn, endn) |> Tucan.Scale.set_y_domain(0, 200) -|> Tucan.set_size(500, 500) +|> Tucan.set_size(500, 250) +|> Tucan.Legend.set_title(:color, "Implementation") +|> Tucan.Legend.set_orientation(:color, "bottom") ``` ## Read benchmark data diff --git 
a/thesis/assets/insert_containers.png b/thesis/assets/insert_containers.png Binary files differnew file mode 100644 index 0000000..82b433f --- /dev/null +++ b/thesis/assets/insert_containers.png diff --git a/thesis/assets/insert_mappings.png b/thesis/assets/insert_mappings.png Binary files differnew file mode 100644 index 0000000..6f00f97 --- /dev/null +++ b/thesis/assets/insert_mappings.png diff --git a/thesis/assets/insert_sets.png b/thesis/assets/insert_sets.png Binary files differnew file mode 100644 index 0000000..581d2ba --- /dev/null +++ b/thesis/assets/insert_sets.png diff --git a/thesis/biblio.bib b/thesis/biblio.bib index fe7a041..203e938 100644 --- a/thesis/biblio.bib +++ b/thesis/biblio.bib @@ -1,23 +1,85 @@ -@article{jung_brainy_2011, - title = {Brainy: effective selection of data structures}, - volume = {46}, - issn = {0362-1340, 1558-1160}, - url = {https://dl.acm.org/doi/10.1145/1993316.1993509}, - doi = {10.1145/1993316.1993509}, - shorttitle = {Brainy}, - abstract = {Data structure selection is one of the most critical aspects of developing effective applications. By analyzing data structures' behavior and their interaction with the rest of the application on the underlying architecture, tools can make suggestions for alternative data structures better suited for the program input on which the application runs. Consequently, developers can optimize their data structure usage to make the application conscious of an underlying architecture and a particular program input. - This paper presents the design and evaluation of Brainy, a new program analysis tool that automatically selects the best data structure for a given program and its input on a specific microarchitecture. The data structure's interface functions are instrumented to dynamically monitor how the data structure interacts with the application for a given input. The instrumentation records traces of various runtime characteristics including underlying architecture-specific events. 
These generated traces are analyzed and fed into an offline model, constructed using machine learning, to select the best data structure. That is, Brainy exploits runtime feedback of data structures to model the situation an application runs on, and selects the best data structure for a given application/input/architecture combination based on the constructed model. The empirical evaluation shows that this technique is highly accurate across several real-world applications with various program input sets on two different state-of-the-art microarchitectures. Consequently, Brainy achieved an average performance improvement of 27\% and 33\% on both microarchitectures, respectively.}, +@inproceedings{jung_brainy_2011, + location = {New York, {NY}, {USA}}, + title = {Brainy: Effective Selection of Data Structures}, + isbn = {978-1-4503-0663-8}, + url = {https://doi.org/10.1145/1993498.1993509}, + doi = {10.1145/1993498.1993509}, + series = {{PLDI} '11}, + abstract = {Data structure selection is one of the most critical aspects of developing effective applications. By analyzing data structures' behavior and their interaction with the rest of the application on the underlying architecture, tools can make suggestions for alternative data structures better suited for the program input on which the application runs. Consequently, developers can optimize their data structure usage to make the application conscious of an underlying architecture and a particular program input.This paper presents the design and evaluation of Brainy, a new program analysis tool that automatically selects the best data structure for a given program and its input on a specific microarchitecture. The data structure's interface functions are instrumented to dynamically monitor how the data structure interacts with the application for a given input. The instrumentation records traces of various runtime characteristics including underlying architecture-specific events. 
These generated traces are analyzed and fed into an offline model, constructed using machine learning, to select the best data structure. That is, Brainy exploits runtime feedback of data structures to model the situation an application runs on, and selects the best data structure for a given application/input/architecture combination based on the constructed model. The empirical evaluation shows that this technique is highly accurate across several real-world applications with various program input sets on two different state-of-the-art microarchitectures. Consequently, Brainy achieved an average performance improvement of 27\% and 33\% on both microarchitectures, respectively.}, pages = {86--97}, - number = {6}, - journaltitle = {{ACM} {SIGPLAN} Notices}, - shortjournal = {{SIGPLAN} Not.}, + booktitle = {Proceedings of the 32nd {ACM} {SIGPLAN} Conference on Programming Language Design and Implementation}, + publisher = {Association for Computing Machinery}, author = {Jung, Changhee and Rus, Silvius and Railing, Brian P. and Clark, Nathan and Pande, Santosh}, - urldate = {2023-09-21}, - date = {2011-06-04}, - langid = {english}, + date = {2011}, + note = {event-place: San Jose, California, {USA}}, + keywords = {application generator, data structure selection, performance counters, training framework}, +} + +@inproceedings{thomas_framework_2005, + location = {New York, {NY}, {USA}}, + title = {A Framework for Adaptive Algorithm Selection in {STAPL}}, + isbn = {1-59593-080-9}, + url = {https://doi.org/10.1145/1065944.1065981}, + doi = {10.1145/1065944.1065981}, + series = {{PPoPP} '05}, + abstract = {Writing portable programs that perform well on multiple platforms or for varying input sizes and types can be very difficult because performance is often sensitive to the system architecture, the run-time environment, and input data characteristics. This is even more challenging on parallel and distributed systems due to the wide variety of system architectures. 
One way to address this problem is to adaptively select the best parallel algorithm for the current input data and system from a set of functionally equivalent algorithmic options. Toward this goal, we have developed a general framework for adaptive algorithm selection for use in the Standard Template Adaptive Parallel Library ({STAPL}). Our framework uses machine learning techniques to analyze data collected by {STAPL} installation benchmarks and to determine tests that will select among algorithmic options at run-time. We apply a prototype implementation of our framework to two important parallel operations, sorting and matrix multiplication, on multiple platforms and show that the framework determines run-time tests that correctly select the best performing algorithm from among several competing algorithmic options in 86-100\% of the cases studied, depending on the operation and the system.}, + pages = {277--288}, + booktitle = {Proceedings of the Tenth {ACM} {SIGPLAN} Symposium on Principles and Practice of Parallel Programming}, + publisher = {Association for Computing Machinery}, + author = {Thomas, Nathan and Tanase, Gabriel and Tkachyshyn, Olga and Perdue, Jack and Amato, Nancy M. and Rauchwerger, Lawrence}, + date = {2005}, + note = {event-place: Chicago, {IL}, {USA}}, keywords = {ml, read}, - file = {Jung et al. - 2011 - Brainy effective selection of data structures.pdf:/home/aria/Zotero/storage/DPJPURT8/Jung et al. 
- 2011 - Brainy effective selection of data structures.pdf:application/pdf}, +} + +@inproceedings{osterlund_dynamically_2013, + title = {Dynamically transforming data structures}, + doi = {10.1109/ASE.2013.6693099}, + pages = {410--420}, + booktitle = {2013 28th {IEEE}/{ACM} International Conference on Automated Software Engineering ({ASE})}, + author = {Österlund, Erik and Löwe, Welf}, + date = {2013}, + keywords = {read, rules-based}, +} + +@inproceedings{franke_collection_2022, + location = {New York, {NY}, {USA}}, + title = {Collection Skeletons: Declarative Abstractions for Data Collections}, + isbn = {978-1-4503-9919-7}, + url = {https://doi.org/10.1145/3567512.3567528}, + doi = {10.1145/3567512.3567528}, + series = {{SLE} 2022}, + abstract = {Modern programming languages provide programmers with rich abstractions for data collections as part of their standard libraries, e.g. Containers in the C++ {STL}, the Java Collections Framework, or the Scala Collections {API}. Typically, these collections frameworks are organised as hierarchies that provide programmers with common abstract data types ({ADTs}) like lists, queues, and stacks. While convenient, this approach introduces problems which ultimately affect application performance due to users over-specifying collection data types limiting implementation flexibility. In this paper, we develop Collection Skeletons which provide a novel, declarative approach to data collections. Using our framework, programmers explicitly select properties for their collections, thereby truly decoupling specification from implementation. By making collection properties explicit immediate benefits materialise in form of reduced risk of over-specification and increased implementation flexibility. We have prototyped our declarative abstractions for collections as a C++ library, and demonstrate that benchmark applications rewritten to use Collection Skeletons incur little or no overhead. 
In fact, for several benchmarks, we observe performance speedups (on average between 2.57 to 2.93, and up to 16.37) and also enhanced performance portability across three different hardware platforms.}, + pages = {189--201}, + booktitle = {Proceedings of the 15th {ACM} {SIGPLAN} International Conference on Software Language Engineering}, + publisher = {Association for Computing Machinery}, + author = {Franke, Björn and Li, Zhibo and Morton, Magnus and Steuwer, Michel}, + date = {2022}, + note = {event-place: Auckland, New Zealand}, + keywords = {read, functional requirements}, + file = {Accepted Version:/home/aria/Zotero/storage/TJ3AGL2S/Franke et al. - 2022 - Collection Skeletons Declarative Abstractions for.pdf:application/pdf}, +} + +@article{qin_primrose_2023, + title = {Primrose: Selecting Container Data Types by Their Properties}, + volume = {7}, + issn = {2473-7321}, + url = {http://arxiv.org/abs/2205.09655}, + doi = {10.22152/programming-journal.org/2023/7/11}, + shorttitle = {Primrose}, + abstract = {Context: Container data types are ubiquitous in computer programming, enabling developers to efficiently store and process collections of data with an easy-to-use programming interface. Many programming languages offer a variety of container implementations in their standard libraries based on data structures offering different capabilities and performance characteristics. Inquiry: Choosing the *best* container for an application is not always straightforward, as performance characteristics can change drastically in different scenarios, and as real-world performance is not always correlated to theoretical complexity. Approach: We present Primrose, a language-agnostic tool for selecting the best performing valid container implementation from a set of container data types that satisfy *properties* given by application developers. 
Primrose automatically selects the set of valid container implementations for which the *library specifications*, written by the developers of container libraries, satisfies the specified properties. Finally, Primrose ranks the valid library implementations based on their runtime performance. Knowledge: With Primrose, application developers can specify the expected behaviour of a container as a type refinement with *semantic properties*, e.g., if the container should only contain unique values (such as a `set`) or should satisfy the {LIFO} property of a `stack`. Semantic properties nicely complement *syntactic properties* (i.e., traits, interfaces, or type classes), together allowing developers to specify a container's programming interface *and* behaviour without committing to a concrete implementation. Grounding: We present our prototype implementation of Primrose that preprocesses annotated Rust code, selects valid container implementations and ranks them on their performance. The design of Primrose is, however, language-agnostic, and is easy to integrate into other programming languages that support container data types and traits, interfaces, or type classes. Our implementation encodes properties and library specifications into verification conditions in Rosette, an interface for {SMT} solvers, which determines the set of valid container implementations. We evaluate Primrose by specifying several container implementations, and measuring the time taken to select valid implementations for various combinations of properties with the solver. We automatically validate that container implementations conform to their library specifications via property-based testing. Importance: This work provides a novel approach to bring abstract modelling and specification of container types directly into the programmer's workflow. 
Instead of selecting concrete container implementations, application programmers can now work on the level of specification, merely stating the behaviours they require from their container types, and the best implementation can be selected automatically.}, + pages = {11}, + number = {3}, + journaltitle = {The Art, Science, and Engineering of Programming}, + shortjournal = {Programming}, + author = {Qin, Xueying and O'Connor, Liam and Steuwer, Michel}, + urldate = {2023-09-25}, + date = {2023-02-15}, + eprinttype = {arxiv}, + eprint = {2205.09655 [cs]}, + keywords = {read, functional requirements}, + file = {arXiv Fulltext PDF:/home/aria/Zotero/storage/IL59NESA/Qin et al. - 2023 - Primrose Selecting Container Data Types by Their .pdf:application/pdf;arXiv.org Snapshot:/home/aria/Zotero/storage/DCIW4XE4/2205.html:text/html}, } @inproceedings{costa_collectionswitch_2018, @@ -35,7 +97,7 @@ urldate = {2023-09-21}, date = {2018-02-24}, langid = {english}, - keywords = {estimate-based, read}, + keywords = {read, estimate-based}, file = {Costa and Andrzejak - 2018 - CollectionSwitch a framework for efficient and dy:/home/aria/Zotero/storage/7B8QMVRU/Costa and Andrzejak - 2018 - CollectionSwitch a framework for efficient and dy:application/pdf}, } @@ -58,37 +120,6 @@ file = {Shacham et al. - 2009 - Chameleon adaptive selection of collections.pdf:/home/aria/Zotero/storage/75CS9CWY/Shacham et al. - 2009 - Chameleon adaptive selection of collections.pdf:application/pdf}, } -@article{qin_primrose_2023, - title = {Primrose: Selecting Container Data Types by Their Properties}, - volume = {7}, - issn = {2473-7321}, - url = {http://arxiv.org/abs/2205.09655}, - doi = {10.22152/programming-journal.org/2023/7/11}, - shorttitle = {Primrose}, - abstract = {Context: Container data types are ubiquitous in computer programming, enabling developers to efficiently store and process collections of data with an easy-to-use programming interface. 
Many programming languages offer a variety of container implementations in their standard libraries based on data structures offering different capabilities and performance characteristics. Inquiry: Choosing the *best* container for an application is not always straightforward, as performance characteristics can change drastically in different scenarios, and as real-world performance is not always correlated to theoretical complexity. Approach: We present Primrose, a language-agnostic tool for selecting the best performing valid container implementation from a set of container data types that satisfy *properties* given by application developers. Primrose automatically selects the set of valid container implementations for which the *library specifications*, written by the developers of container libraries, satisfies the specified properties. Finally, Primrose ranks the valid library implementations based on their runtime performance. Knowledge: With Primrose, application developers can specify the expected behaviour of a container as a type refinement with *semantic properties*, e.g., if the container should only contain unique values (such as a `set`) or should satisfy the {LIFO} property of a `stack`. Semantic properties nicely complement *syntactic properties* (i.e., traits, interfaces, or type classes), together allowing developers to specify a container's programming interface *and* behaviour without committing to a concrete implementation. Grounding: We present our prototype implementation of Primrose that preprocesses annotated Rust code, selects valid container implementations and ranks them on their performance. The design of Primrose is, however, language-agnostic, and is easy to integrate into other programming languages that support container data types and traits, interfaces, or type classes. 
Our implementation encodes properties and library specifications into verification conditions in Rosette, an interface for {SMT} solvers, which determines the set of valid container implementations. We evaluate Primrose by specifying several container implementations, and measuring the time taken to select valid implementations for various combinations of properties with the solver. We automatically validate that container implementations conform to their library specifications via property-based testing. Importance: This work provides a novel approach to bring abstract modelling and specification of container types directly into the programmer's workflow. Instead of selecting concrete container implementations, application programmers can now work on the level of specification, merely stating the behaviours they require from their container types, and the best implementation can be selected automatically.}, - pages = {11}, - number = {3}, - journaltitle = {The Art, Science, and Engineering of Programming}, - shortjournal = {Programming}, - author = {Qin, Xueying and O'Connor, Liam and Steuwer, Michel}, - urldate = {2023-09-25}, - date = {2023-02-15}, - eprinttype = {arxiv}, - eprint = {2205.09655 [cs]}, - keywords = {functional requirements, read}, - file = {arXiv Fulltext PDF:/home/aria/Zotero/storage/IL59NESA/Qin et al. 
- 2023 - Primrose Selecting Container Data Types by Their .pdf:application/pdf;arXiv.org Snapshot:/home/aria/Zotero/storage/DCIW4XE4/2205.html:text/html}, -} - -@inproceedings{osterlund_dynamically_2013, - title = {Dynamically transforming data structures}, - doi = {10.1109/ASE.2013.6693099}, - pages = {410--420}, - booktitle = {2013 28th {IEEE}/{ACM} International Conference on Automated Software Engineering ({ASE})}, - author = {Österlund, Erik and Löwe, Welf}, - date = {2013}, - keywords = {rules-based}, -} - @incollection{hutchison_coco_2013, location = {Berlin, Heidelberg}, title = {{CoCo}: Sound and Adaptive Replacement of Java Collections}, @@ -123,44 +154,49 @@ file = {Full Text:/home/aria/Zotero/storage/KTJNYCES/L. Liu and S. Rus - 2009 - Perflint A Context Sensitive Performance Advisor .pdf:application/pdf}, } -@article{chung_towards_2004, - title = {Towards Automatic Performance Tuning}, - author = {Chung, I-Hsin}, - date = {2004-11}, - file = {Chung - 2004 - Towards Automatic Performance Tuning.pdf:/home/aria/Zotero/storage/WQBJMSN8/Chung - 2004 - Towards Automatic Performance Tuning.pdf:application/pdf}, +@article{jung_brainy_2011-1, + title = {Brainy: effective selection of data structures}, + volume = {46}, + issn = {0362-1340, 1558-1160}, + url = {https://dl.acm.org/doi/10.1145/1993316.1993509}, + doi = {10.1145/1993316.1993509}, + shorttitle = {Brainy}, + abstract = {Data structure selection is one of the most critical aspects of developing effective applications. By analyzing data structures' behavior and their interaction with the rest of the application on the underlying architecture, tools can make suggestions for alternative data structures better suited for the program input on which the application runs. Consequently, developers can optimize their data structure usage to make the application conscious of an underlying architecture and a particular program input. 
+ This paper presents the design and evaluation of Brainy, a new program analysis tool that automatically selects the best data structure for a given program and its input on a specific microarchitecture. The data structure's interface functions are instrumented to dynamically monitor how the data structure interacts with the application for a given input. The instrumentation records traces of various runtime characteristics including underlying architecture-specific events. These generated traces are analyzed and fed into an offline model, constructed using machine learning, to select the best data structure. That is, Brainy exploits runtime feedback of data structures to model the situation an application runs on, and selects the best data structure for a given application/input/architecture combination based on the constructed model. The empirical evaluation shows that this technique is highly accurate across several real-world applications with various program input sets on two different state-of-the-art microarchitectures. Consequently, Brainy achieved an average performance improvement of 27\% and 33\% on both microarchitectures, respectively.}, + pages = {86--97}, + number = {6}, + journaltitle = {{ACM} {SIGPLAN} Notices}, + shortjournal = {{SIGPLAN} Not.}, + author = {Jung, Changhee and Rus, Silvius and Railing, Brian P. and Clark, Nathan and Pande, Santosh}, + urldate = {2023-09-21}, + date = {2011-06-04}, + langid = {english}, + keywords = {ml, read}, + file = {Jung et al. - 2011 - Brainy effective selection of data structures.pdf:/home/aria/Zotero/storage/DPJPURT8/Jung et al. 
- 2011 - Brainy effective selection of data structures.pdf:application/pdf}, } -@inproceedings{thomas_framework_2005, +@inproceedings{costa_empirical_2017, location = {New York, {NY}, {USA}}, - title = {A Framework for Adaptive Algorithm Selection in {STAPL}}, - isbn = {1-59593-080-9}, - url = {https://doi.org/10.1145/1065944.1065981}, - doi = {10.1145/1065944.1065981}, - series = {{PPoPP} '05}, - abstract = {Writing portable programs that perform well on multiple platforms or for varying input sizes and types can be very difficult because performance is often sensitive to the system architecture, the run-time environment, and input data characteristics. This is even more challenging on parallel and distributed systems due to the wide variety of system architectures. One way to address this problem is to adaptively select the best parallel algorithm for the current input data and system from a set of functionally equivalent algorithmic options. Toward this goal, we have developed a general framework for adaptive algorithm selection for use in the Standard Template Adaptive Parallel Library ({STAPL}). Our framework uses machine learning techniques to analyze data collected by {STAPL} installation benchmarks and to determine tests that will select among algorithmic options at run-time. 
We apply a prototype implementation of our framework to two important parallel operations, sorting and matrix multiplication, on multiple platforms and show that the framework determines run-time tests that correctly select the best performing algorithm from among several competing algorithmic options in 86-100\% of the cases studied, depending on the operation and the system.}, - pages = {277--288}, - booktitle = {Proceedings of the Tenth {ACM} {SIGPLAN} Symposium on Principles and Practice of Parallel Programming}, + title = {Empirical Study of Usage and Performance of Java Collections}, + isbn = {978-1-4503-4404-3}, + url = {https://doi.org/10.1145/3030207.3030221}, + doi = {10.1145/3030207.3030221}, + series = {{ICPE} '17}, + abstract = {Collection data structures have a major impact on the performance of applications, especially in languages such as Java, C\#, or C++. This requires a developer to select an appropriate collection from a large set of possibilities, including different abstractions (e.g. list, map, set, queue), and multiple implementations. In Java, the default implementation of collections is provided by the standard Java Collection Framework ({JCF}). However, there exist a large variety of less known third-party collection libraries which can provide substantial performance benefits with minimal code changes.In this paper, we first study the popularity and usage patterns of collection implementations by mining a code corpus comprised of 10,986 Java projects. We use the results to evaluate and compare the performance of the six most popular alternative collection libraries in a large variety of scenarios. We found that for almost every scenario and {JCF} collection type there is an alternative implementation that greatly decreases memory consumption while offering comparable or even better execution time. 
Memory savings range from 60\% to 88\% thanks to reduced overhead and some operations execute 1.5x to 50x faster.We present our results as a comprehensive guideline to help developers in identifying the scenarios in which an alternative implementation can provide a substantial performance improvement. Finally, we discuss how some coding patterns result in substantial performance differences of collections.}, + pages = {389--400}, + booktitle = {Proceedings of the 8th {ACM}/{SPEC} on International Conference on Performance Engineering}, publisher = {Association for Computing Machinery}, - author = {Thomas, Nathan and Tanase, Gabriel and Tkachyshyn, Olga and Perdue, Jack and Amato, Nancy M. and Rauchwerger, Lawrence}, - date = {2005}, - note = {event-place: Chicago, {IL}, {USA}}, - keywords = {ml, read}, + author = {Costa, Diego and Andrzejak, Artur and Seboek, Janos and Lo, David}, + date = {2017}, + note = {event-place: L'Aquila, Italy}, + keywords = {collections, empirical study, execution time, java, memory, performance}, + file = {Full Text:/home/aria/Zotero/storage/DLA43MW4/Costa et al. - 2017 - Empirical Study of Usage and Performance of Java C.pdf:application/pdf}, } -@inproceedings{franke_collection_2022, - location = {New York, {NY}, {USA}}, - title = {Collection Skeletons: Declarative Abstractions for Data Collections}, - isbn = {978-1-4503-9919-7}, - url = {https://doi.org/10.1145/3567512.3567528}, - doi = {10.1145/3567512.3567528}, - series = {{SLE} 2022}, - abstract = {Modern programming languages provide programmers with rich abstractions for data collections as part of their standard libraries, e.g. Containers in the C++ {STL}, the Java Collections Framework, or the Scala Collections {API}. Typically, these collections frameworks are organised as hierarchies that provide programmers with common abstract data types ({ADTs}) like lists, queues, and stacks. 
While convenient, this approach introduces problems which ultimately affect application performance due to users over-specifying collection data types limiting implementation flexibility. In this paper, we develop Collection Skeletons which provide a novel, declarative approach to data collections. Using our framework, programmers explicitly select properties for their collections, thereby truly decoupling specification from implementation. By making collection properties explicit immediate benefits materialise in form of reduced risk of over-specification and increased implementation flexibility. We have prototyped our declarative abstractions for collections as a C++ library, and demonstrate that benchmark applications rewritten to use Collection Skeletons incur little or no overhead. In fact, for several benchmarks, we observe performance speedups (on average between 2.57 to 2.93, and up to 16.37) and also enhanced performance portability across three different hardware platforms.}, - pages = {189--201}, - booktitle = {Proceedings of the 15th {ACM} {SIGPLAN} International Conference on Software Language Engineering}, - publisher = {Association for Computing Machinery}, - author = {Franke, Björn and Li, Zhibo and Morton, Magnus and Steuwer, Michel}, - date = {2022}, - note = {event-place: Auckland, New Zealand}, - keywords = {functional requirements}, - file = {Accepted Version:/home/aria/Zotero/storage/TJ3AGL2S/Franke et al. 
- 2022 - Collection Skeletons Declarative Abstractions for.pdf:application/pdf}, +@online{wastl_advent_2015, + title = {Advent of Code}, + url = {https://adventofcode.com/2022/about}, + author = {Wastl, Eric}, + urldate = {2024-03-08}, + date = {2015}, } diff --git a/thesis/main.tex b/thesis/main.tex index f12e6eb..f252b3c 100644 --- a/thesis/main.tex +++ b/thesis/main.tex @@ -5,6 +5,7 @@ \usepackage{ugcheck} \usepackage[dvipsnames]{xcolor} +\usepackage{graphicx} \usepackage{microtype} \usepackage[style=numeric]{biblatex} @@ -88,12 +89,9 @@ from the Informatics Research Ethics committee. \chapter{Implementation} \label{chap:implementation} \input{parts/implementation} -\chapter{Results} \label{chap:results} +\chapter{Results \& analysis} \label{chap:results} \input{parts/results} -\chapter{Analysis} \label{chap:analysis} -\input{parts/analysis} - \chapter{Conclusion} \label{chap:conclusion} \input{parts/conclusion} diff --git a/thesis/parts/results.tex b/thesis/parts/results.tex index 47c7c98..17b6088 100644 --- a/thesis/parts/results.tex +++ b/thesis/parts/results.tex @@ -1,9 +1,87 @@ -\todo{Selection of benchmarks} -\todo{Testing setup} -\todo{Justification of tested elements} +%% * Testing setup, benchmarking rationale +\section{Testing setup} -\todo{Produced cost models} +%% ** Specs and VM setup +In order to ensure consistent results and reduce the chance of outliers, all benchmarks were run on a KVM virtual machine on server hardware. +We used 4 cores of an Intel Xeon E5-2687Wv4 CPU, and 4GiB of RAM. -\todo{Estimated costs} +%% ** Reproducibility +The VM was managed and provisioned using NixOS, meaning it can be easily reproduced with the exact software we used. -\todo{Real benchmark results} +\section{Cost models} + +We start by looking at our generated cost models, and comparing them both to the observations they are based on, and what we expect from asymptotic analysis. 
+As we build a total of 51 cost models from our library, we will not examine all of them. +We look at ones for the most common operations, and group them by containers that are commonly selected together. + +%% ** Insertion operations +Starting with the \code{insert} operation, Figure \ref{fig:cm_insert} shows how the estimated cost changes with the size of the container. +To help readability, we group these into regular \code{Container} implementations, and our associative key-value \code{Mapping} implementations. + +\begin{figure}[h] + \centering + \includegraphics[width=10cm]{assets/insert_containers.png} + \par\centering\rule{11cm}{0.5pt} + \includegraphics[width=10cm]{assets/insert_mappings.png} + \caption{Estimated cost of insert operation on \code{Container} implementations and \code{Mapping} implementations} + \label{fig:cm_insert} +\end{figure} + +%% ** Contains operations + +%% ** Comment on some bad/weird ones + +%% ** Conclusion + +%% * Predictions +\section{Selections} + +%% ** Chosen benchmarks +Our test cases broadly fall into two categories: Example cases, which just repeat a few operations many times, and our 'real' cases, which are implementations of common algorithms and solutions to programming puzzles. +We expect the results from our example cases to be relatively unsurprising, while our real cases are more complex and harder to predict. + +Most of our real cases are solutions to puzzles from Advent of Code\parencite{wastl_advent_2015}, a popular collection of programming puzzles. +Table \ref{table:test_cases} lists and briefly describes our test cases. + +\begin{table}[h] + \centering + \begin{tabular}{|c|c|} + Name & Description \\ + \hline + example\_sets & Repeated insert and contains on a set. \\ + example\_stack & Repeated push and pop from a stack. \\ + example\_mapping & Repeated insert and get from a mapping. \\ + prime\_sieve & Sieve of Eratosthenes algorithm. 
\\ + aoc\_2021\_09 & Flood-fill like algorithm (Advent of Code 2021, Day 9) \\ + aoc\_2022\_08 & Simple 2D raycasting (AoC 2022, Day 8) \\ + aoc\_2022\_09 & Simple 2D soft-body simulation (AoC 2022, Day 9) \\ + aoc\_2022\_14 & Simple 2D particle simulation (AoC 2022, Day 14) \\ + \end{tabular} + + \caption{Our test applications} + \label{table:test_cases} +\end{table} + +%% ** Effect of selection on benchmarks (spread in execution time) + +%% ** Summarise predicted versus actual + +%% ** Evaluate performance + +%% ** Comment on distribution of best implementation + +%% ** Surprising ones / Explain failures + +%% * Performance of adaptive containers +\section{Adaptive containers} + +%% ** Find where adaptive containers get suggested + +%% ** Comment on relative performance speedup + +%% ** Suggest future improvements? + +%% * Selection time / developer experience +\section{Selection time} + +%% ** Mention speedup versus naive brute force |