-rw-r--r--  Tasks.org           | 141
-rw-r--r--  analysis/vis.livemd |  43
2 files changed, 19 insertions(+), 165 deletions(-)
diff --git a/Tasks.org b/Tasks.org
--- a/Tasks.org
+++ b/Tasks.org
@@ -2,147 +2,6 @@ #+TODO: TODO(t) DOING(w) | DONE(d) BLOCKED(b) CANCELED(c)
 #+FILETAGS: :@school:
-* Original design notes
-
-  - Existing work: Primrose
-    - Deals with selecting the best container type (currently only for lists)
-    - Specify which pre-defined interfaces are required, and which semantic properties
-      - e.g. ascending order, uniqueness
-    - Library of premade implementations, with simplified models of their behaviour
-    - Outputs a list of implementations that meet the given requirements
-  - Problem: picking which one of these implementations to use
-    - The number of possibilities is exponential in the number of types to be selected
-    - Infeasible even for larger problems
-  - Assume we have some user-defined benchmarks that we want to run well
-  - Use an approach similar to CollectionSwitch:
-    - Least intervention required per implementation, so it should scale well
-    - For each collection and each 'critical operation', generate a cost estimate $C_{op}(n)$ for the operation when the collection has size $n$
-      - Perform the operation repeatedly at various $n$, and fit a polynomial to the results
-      - Requires some trait constraints, and some annotation of traits to mark which operations are 'critical'
-      - This step should only need to be run once per computer
-        - could be shared by default and run again for better accuracy
-    - Semantic profiler, recording for each allocated collection:
-      - Max size (in terms of items)
-      - # of each operation
-      - This should be aggregated by 'allocation site' (specified by the last few bits of the callstack)
-      - Doesn't need to be /super/ lightweight, just enough to not make things painful to run
-    - Approximate a cost for each candidate as $\sum_{op} C_{op}(n) \cdot \frac{\#op}{\#total}$ (see the sketch at the end of this page)
-    - We could extend this to suggest different approaches if there is a spread of max $n$
-  - If time allows, could attempt to create a 'wrapper type' that switches between collections as $n$ changes, using rules decided by something similar to the above algorithm
-
-* DONE Integrating with Primrose
-
-We want to be able to run primrose on some file and get a list of candidates out.
-We also need to know where we can find the implementation types, for use in building cost models later.
-
-This will involve exposing interfaces for the library specifications, candidate generation, and code generation.
-We can also take this as an opportunity to make it slightly more robust and well-documented, although this is a secondary goal.
-
-** DONE Interface for implementation library
-
-We can list every implementation available, its module path, the traits it implements, and the generated Racket code.
-
-We don't need a list of operations right now, as we'll make a separate implementation of cost model building for each trait, and that will give us one.
-
-This is provided by ~primrose::LibSpec~, and defined in the ~lib_specs~ module.
-
-** DONE Interface for candidate generation
-
-We should be able to get a list of candidates for a given file, linked up correctly with the implementation library.
-
-This is provided by ~primrose::ContainerSelector~, and defined in the ~selector~ module.
-
-** DONE Interface for code generation
-
-Given we know which type to use, we should be able to generate the real code that replaces some input file.
-
-This is provided by ~primrose::ContainerSelector~, and defined in the ~codegen~ module.
-
-** DONE Proof of concept
-
-We can get all the candidates for all files in a given project, or even workspace!
-
-This is provided by ~Project~ in ~candelabra_cli::project~.
-
-* Building cost models for types
-
-We need to be able to build cost models for any given type and operation.
-This will probably involve some codegen.
-
-There should also be caching of the outputs.
-
-** Generic trait benchmarkers
-
-Given some type implementing a given trait, we should be able to get performance information at various values of $n$.
-
-We have a first pass of these benchmarks, although they may need to be refined over time.
-
-*** DONE Container trait benchmarker
-*** DONE Indexable trait benchmarker
-*** DONE Stack trait benchmarker
-
-** DONE Generate benchmarks for arbitrary type
-
-We can generate and run benchmarks from library specs using ~candelabra_cli::cost::benchmark::run_benchmarks~.
-
-** DONE Caching and loading outputs
-
-We cache the benchmark results by type, and invalidate based on the library modification time.
-Relevant code is in ~candelabra_cli::cache~ and ~candelabra_cli::cost~.
-We also cache primrose results in ~candelabra_cli::candidates~.
-
-** DONE Integrate with CLI
-
-The CLI should get cost models for all of the candidate types, and for now just print them out.
-
-** DONE Build cost model from benchmark
-
-We can fit polynomials for each operation from the benchmarking data (see the sketch at the end of this page).
-This seems to be working, but could use further testing.
-
-* DONE Semantic profiler
-
-We need to be able to pick some random candidate type, wrap it with profiling instrumentation, and run user benchmarks to get data.
-
-Ideally, we could use information about the cargo project the file resides in to find benchmarks.
-
-We also need to figure out which operations we want to count, and how we can get an 'allocation context'/callstack.
-
-* DONE Integration
-
-We have the code to do all of this end-to-end, and to run a 'brute force' search for comparison.
-
-* TODO Benchmarks & Evaluation
-
-We implement several test programs which require different data structures and different implementations.
-We compare our results to the best possible result, and to using the standard library implementation.
-
-Ideas:
-  - https://git.tardisproject.uk/tcmal/advent-of-code/-/blob/main/2022/src/day11.rs?ref_type=heads
-  - https://git.tardisproject.uk/tcmal/advent-of-code/-/blob/main/2022/src/day14.rs?ref_type=heads
-  - Dijkstra's algorithm
-
-* TODO Add more collection types
-
-Ideas:
-  - https://lib.rs/crates/hashbrown
-  - https://lib.rs/crates/indexmap
-  - https://lib.rs/crates/smallvec
-  - https://lib.rs/crates/hashlink
-  - https://lib.rs/crates/thin-vec
-
-* BLOCKED Nice to haves
-
-  - Better intermediate outputs, possibly export plots?
-  - Make primrose use actual temp directories, and respect ~CANDELABRA_SRC_DIR~ in the same way the CLI does
-  - Separate candelabra into a library and a CLI, and give the CLI nicer outputs/subcommands for each step
-  - Add nix build files for everything
-    - Nixify the install of rosette
-  - See if the ~linked_list_cursors~ feature / nightly compiler is actually necessary
-  - Find an alternative to opaque type aliases for codegen
-  - Real checking of map types
-
 * Writing
 ** TODO Abstract
diff --git a/analysis/vis.livemd b/analysis/vis.livemd
index 2fa59d2..3eead6f 100644
--- a/analysis/vis.livemd
+++ b/analysis/vis.livemd
@@ -96,9 +96,9 @@ cost_models
 ## Cost model exploratory plots
 
 ```elixir
-startn = 1000
-endn = 60_000
-resolution = 100
+startn = 1
+endn = 350
+resolution = 10
 
 points_for = fn impl, op ->
   %{"coeffs" => [coeffs]} =
@@ -127,16 +127,12 @@ end
 <!-- livebook:{"reevaluate_automatically":true} -->
 
 ```elixir
-set_impls = ["BTreeSet", "SortedUniqueVec", "HashSet"]
-mapping_impls = ["HashMap", "BTreeMap"]
-list_impls = ["Vec", "LinkedList", "SortedVec"]
-stack_impls = ["Vec", "LinkedList"]
+set_impls = ["BTreeSet", "HashSet", "VecSet", "SortedVecSet"]
+mapping_impls = ["HashMap", "BTreeMap", "VecMap", "SortedVecMap"]
+other_impls = ["Vec", "LinkedList", "SortedVec"]
 
-inspect_op = "clear"
-# impls = set_impls ++ list_impls ++ mapping_impls
-impls = ["Vec"]
-# impls = mapping_impls
-# impls = ["SortedUniqueVec", "SortedVec"]
+inspect_op = "insert"
+impls = mapping_impls
 
 Tucan.layers([
   cost_models
@@ -146,18 +142,17 @@ Tucan.layers([
   |> Enum.map(fn %{"impl" => impl} -> points_for.(impl, inspect_op) end)
   |> DF.concat_rows()
   |> DF.filter(impl in ^impls)
-  |> Tucan.lineplot("n", "t", color_by: "impl", clip: true),
-  # |> Tucan.Scale.set_y_domain(0, 200)
-  Tucan.scatter(
-    cost_model_points
-    |> DF.filter(op == ^inspect_op and impl in ^impls)
-    |> DF.group_by(["impl", "n"]),
-    # |> DF.summarise(t: mean(t)),
-    "n",
-    "t",
-    color_by: "impl",
-    clip: true
-  )
+  |> Tucan.lineplot("n", "t", color_by: "impl", clip: true)
+  # Tucan.scatter(
+  #   cost_model_points
+  #   |> DF.filter(op == ^inspect_op and impl in ^impls)
+  #   |> DF.group_by(["impl", "n"])
+  #   |> DF.summarise(t: mean(t)),
+  #   "n",
+  #   "t",
+  #   color_by: "impl",
+  #   clip: true
+  # )
 ])
 |> Tucan.Axes.set_y_title("Estimated cost")
 |> Tucan.Axes.set_x_title("Size of container (n)")
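
For reference, the design notes above reduce candidate selection to the rule $\sum_{op} C_{op}(n) \cdot \frac{\#op}{\#total}$: each operation's fitted cost at the profiled size, weighted by how often the profiler saw that operation. A minimal Rust sketch of that rule, assuming the profiler yields per-operation counts and a max observed size n; `CostModel`, `cost_at`, and `candidate_cost` are hypothetical names for illustration, not candelabra's actual API:

```rust
use std::collections::HashMap;

/// Fitted cost model for one operation: polynomial coefficients in n,
/// lowest degree first. Hypothetical representation, not candelabra's real types.
struct CostModel {
    coeffs: Vec<f64>,
}

impl CostModel {
    /// Estimated cost of a single call at container size n (Horner's rule).
    fn cost_at(&self, n: f64) -> f64 {
        self.coeffs.iter().rev().fold(0.0, |acc, c| acc * n + c)
    }
}

/// Sum over ops of C_op(n) * #op / #total, using the profiler output:
/// per-operation counts and the max observed size n of the collection.
fn candidate_cost(models: &HashMap<String, CostModel>, counts: &HashMap<String, u64>, n: f64) -> f64 {
    let total: u64 = counts.values().sum();
    if total == 0 {
        return 0.0;
    }
    counts
        .iter()
        .filter_map(|(op, &k)| models.get(op).map(|m| m.cost_at(n) * k as f64 / total as f64))
        .sum()
}
```

Ranking candidates is then one `candidate_cost` evaluation per candidate followed by taking the minimum, which is what lets the approach avoid the exponential search over type assignments the notes worry about.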
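The "Build cost model from benchmark" step fits those per-operation polynomials from (n, time) benchmark observations. Below is a toy least-squares fit via the normal equations, under the assumption (as in the design notes) that a low-degree polynomial in n is an adequate model; candelabra's real fitting code may differ, and a production version would more sensibly use a linear-algebra crate:

```rust
/// Least-squares fit of a degree-`deg` polynomial to (n, t) benchmark points,
/// via the normal equations (X^T X) a = X^T y and Gauss-Jordan elimination.
/// Assumes points.len() > deg and a well-conditioned system; returns
/// coefficients lowest degree first (matching `CostModel` above).
fn polyfit(points: &[(f64, f64)], deg: usize) -> Vec<f64> {
    let m = deg + 1;
    let mut a = vec![vec![0.0; m + 1]; m]; // augmented matrix [X^T X | X^T y]
    for &(n, t) in points {
        // powers[j] = n^j
        let powers: Vec<f64> = (0..m)
            .scan(1.0, |p, _| {
                let v = *p;
                *p *= n;
                Some(v)
            })
            .collect();
        for i in 0..m {
            for j in 0..m {
                a[i][j] += powers[i] * powers[j];
            }
            a[i][m] += powers[i] * t;
        }
    }
    // Gauss-Jordan elimination with partial pivoting.
    for col in 0..m {
        let pivot = (col..m)
            .max_by(|&x, &y| a[x][col].abs().total_cmp(&a[y][col].abs()))
            .unwrap();
        a.swap(col, pivot);
        for row in 0..m {
            if row != col {
                let f = a[row][col] / a[col][col];
                for k in col..=m {
                    let s = f * a[col][k];
                    a[row][k] -= s;
                }
            }
        }
    }
    (0..m).map(|i| a[i][m] / a[i][i]).collect()
}
```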
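Finally, the `points_for` helper in the livemd change above just samples each fitted polynomial at evenly spaced sizes between `startn` and `endn` to draw the cost curves. The same idea in Rust, again with hypothetical names:

```rust
/// Sample a fitted polynomial (coefficients lowest degree first) at
/// `resolution` evenly spaced sizes in [startn, endn], yielding
/// (n, estimated cost) points for plotting. Mirrors what the livemd's
/// `points_for` does.
fn sample_curve(coeffs: &[f64], startn: f64, endn: f64, resolution: usize) -> Vec<(f64, f64)> {
    (0..resolution)
        .map(|i| {
            let n = startn + (endn - startn) * i as f64 / (resolution.max(2) - 1) as f64;
            let t = coeffs.iter().rev().fold(0.0, |acc, c| acc * n + c);
            (n, t)
        })
        .collect()
}
```

The new bounds in the diff (`startn = 1`, `endn = 350`) zoom the plots into small container sizes, presumably the region where the Vec-backed implementations are competitive with the tree- and hash-based ones.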