#+TITLE: Tasks
#+TODO: TODO(t) DOING(w) | DONE(d) BLOCKED(b) CANCELED(c)

* Original design notes

- Existing work: Primrose
  - Deals with selecting the best container type (currently only for lists)
  - Specify which pre-defined interfaces are required, and which semantic properties
    - eg ascending order, uniqueness
  - Library of premade implementations, with simplified models of their behaviour
  - Outputs a list of implementations that meet the given requirements
- Problem: Picking which one of these implementations
  - Number of possibilities is exponential in number of types to be selected
  - Infeasible for larger problems
- Assume we have some user-defined benchmarks that we want to run well
- Use approach similar to collectionswitch:
  - Least intervention required per implementation, so should scale well
  - For each collection and for each 'critical operation', generate a cost estimate when the collection is a given size, $C_{op}(n)$
    - Perform operation repeatedly at various n, and fit a polynomial to that
    - Requires some trait constraints, and some annotation of traits to know what are 'critical operations'
    - This step should only need to be run once per computer
      - could be shared by default and run again for better accuracy
  - Semantic Profiler
    - For each allocated collection:
      - Max size (in terms of items)
      - # of each operation
    - This should be aggregated by 'allocation site' (specified by last few bits of callstack).
    - doesn't need to be /super/ lightweight, just enough to not make things painful to run.
  - Approximate a cost for each candidate as $\sum_{op} C_{op}(n) \cdot \#op / \#total$.
    - We could extend this to suggest different approaches if there is a spread of max n.
- If time allows, could attempt to create a 'wrapper type' that switches between collections as n changes, using rules decided by something similar to the above algorithm.

* DONE Integrating with Primrose

We want to be able to run primrose on some file, and get a list of candidates out.
We also need to know where we can find the implementation types, for use in building cost models later.

This will involve exposing interfaces for the library specifications, candidate generation, and code generation.
We can also take this as an opportunity to make it slightly more robust and well-documented, although this is a secondary goal.

** DONE Interface for implementation library

We can list every implementation available, its module path, the traits it implements, and a bunch of generated Racket code.

We don't need a list of operations right now, as we'll just make a separate implementation of cost model building for each trait, and that will give us the list.

This is provided by ~primrose::LibSpec~, and defined in the ~lib_specs~ module.

** DONE Interface for candidate generation

We should be able to get a list of candidates for a given file, which links up with the implementation library correctly.

This is provided by ~primrose::ContainerSelector~, and defined in the ~selector~ module.

** DONE Interface for code generation

Given we know what type to use, we should be able to generate the real code that should replace some input file.

This is provided by ~primrose::ContainerSelector~, and defined in the ~codegen~ module.

** DONE Proof of concept type thing

We can get all the candidates for all files in a given project, or even workspace!

This is provided by ~Project~ in ~candelabra_cli::project~.

* Building cost models for types

We need to be able to build cost models for any given type and operation.
This will probably involve some codegen.

There should also be caching of the outputs.

** Generic trait benchmarkers

Given some type implementing a given trait, we should be able to get performance information at various Ns.
We have a first pass of these benchmarks, although they may need to be refined over time.

*** DONE Container trait benchmarker

*** DONE Indexable trait benchmarker

*** DONE Stack trait benchmarker

** DONE Generate benchmarks for arbitrary type

We can generate and run benchmarks from library specs using ~candelabra_cli::cost::benchmark::run_benchmarks~.

** DONE Caching and loading outputs

We cache the benchmark results by type and invalidate based on library modification time.
Relevant code is in ~candelabra_cli::cache~ and ~candelabra_cli::cost~.

We also cache primrose results in ~candelabra_cli::candidates~.

** DONE Integrate with CLI

The CLI should get cost models for all of the candidate types, and for now just print them out.

** DONE Build cost model from benchmark

We can fit polynomials for each operation from the benchmarking data.
This seems(?) to be working, but could use some further testing.

* DONE Semantic profiler

We need to be able to pick some random candidate type, wrap it with profiling stuff, and run user benchmarks to get data.

Ideally, we could use information about the cargo project the file resides in to get benchmarks.

We also need to figure out which operations we want to bother counting, and how we can get an 'allocation context'/callstack.

* DONE Integration

We have the code to do all this end-to-end, and to run a 'brute force' for comparison.

* TODO Benchmarks & Evaluation

We implement several test programs which require different data structures and different implementations.
We compare our results to the best possible result, and to using the standard library implementation.
Ideas:
- https://git.tardisproject.uk/tcmal/advent-of-code/-/blob/main/2022/src/day11.rs?ref_type=heads
- https://git.tardisproject.uk/tcmal/advent-of-code/-/blob/main/2022/src/day14.rs?ref_type=heads
- Dijkstra's

* TODO Add more collection types

Ideas:
- https://lib.rs/crates/hashbrown
- https://lib.rs/crates/indexmap
- https://lib.rs/crates/smallvec
- https://lib.rs/crates/hashlink
- https://lib.rs/crates/thin-vec

* BLOCKED Nice to haves

- Better intermediate outputs, possibly export plots?
- Make primrose use actual temp directories, and respect the ~CANDELABRA_SRC_DIR~ stuff that the CLI does
- Separate candelabra into a library and a CLI, and give the CLI nicer outputs/subcommands for each step
- Add nix build stuff for everything
- Nixify the install of rosette
- See if the ~linked_list_cursors~ feature / nightly compiler is actually necessary
- Alternative to opaque type aliases for codegen
- Real checking of map types

* Writing

** TODO Abstract

** TODO Introduction

*** DONE Introduce problem

**** DONE Container types common in programs

**** DONE Functionally identical implementations

**** DONE Large difference in performance

*** DONE Motivate w/ effectiveness claims

*** TODO Overview of aims & approach

**** TODO Scalability to larger projects

**** TODO Ease of integration into existing projects

**** TODO Ease of adding new container types

**** TODO Flexibility of selection

*** TODO Overview of results

** DONE Background

*** DONE Introduce problem

*** DONE Functional vs non-functional requirements

*** DONE Existing approaches & their shortfalls

*** DONE Lead to next chapter

** TODO Design

*** DONE Usage Example

*** DONE Primrose Integration

**** DONE Explain role in entire process

**** DONE Short explanation of selection method

**** DONE Abstraction over backend

*** DONE Building cost models

**** DONE Benchmarks

**** DONE Linear Regression

**** DONE Limitations

*** DONE Profiling applications

**** DONE Data collected

**** DONE Segmentation

**** TODO Limitations w/ pre-benchmark steps

*** DONE Selection process & adaptive containers

**** DONE Selection process

**** TODO Adaptive container detection

**** TODO Code generation

** TODO Implementation

*** DONE Modifications to Primrose

**** DONE API

**** DONE Mapping trait

**** DONE Resiliency, etc

*** DONE Cost models

**** DONE Benchmarker crate

**** DONE Code generation

**** DONE Chosen benchmarks

**** TODO Fitting

*** DONE Profiling wrapper

**** DONE Use of Drop

**** DONE Generics and stuff

*** DONE Selection / Codegen

**** DONE Selection Algorithm incl Adaptive

**** DONE Implementation w/ const generics

**** DONE Generated code (opaque types)

*** TODO Misc Concerns

**** TODO Explain cargo's role in rust projects & how it is integrated

**** TODO Caching and stuff

**** TODO Ease of use

**** TODO Integration w/ Cargo

***** TODO Metadata fetching

***** TODO Caching of build dependencies

** TODO Results & Analysis

*** TODO Testing setup, benchmarking rationale

**** TODO Specs and VM setup

**** TODO Reproducibility

**** TODO Chosen benchmarks

**** TODO Effect of selection on benchmarks (spread in execution time)

*** TODO Cost model analysis

**** TODO Insertion operations

**** TODO Contains operations

**** TODO Comment on some bad/weird ones

**** TODO Conclusion

*** TODO Predictions

**** TODO Summarise predicted versus actual

**** TODO Evaluate performance

**** TODO Comment on distribution of best implementation

**** TODO Surprising ones / Explain failures

*** TODO Performance of adaptive containers

**** TODO Find where adaptive containers get suggested

**** TODO Comment on relative performance speedup

**** TODO Suggest future improvements?

*** TODO Selection time / developer experience

**** TODO Mention speedup versus naive brute force

** TODO Conclusion
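
* Sketch of the cost estimate

The cost estimate from the original design notes (sum over operations of $C_{op}(n)$ weighted by each operation's share of calls) could be sketched roughly as below.
This is only an illustration under assumptions: it takes the per-operation polynomial models and the per-allocation-site operation counts as already available, and it evaluates every model at the site's maximum size.
The names ~CostModel~, ~Profile~, ~candidate_cost~ and ~cheapest~ are hypothetical and are not taken from candelabra or primrose.

#+begin_src rust
use std::collections::HashMap;

/// Hypothetical cost model for one operation: polynomial coefficients,
/// lowest degree first, so C_op(n) = c[0] + c[1]*n + c[2]*n^2 + ...
struct CostModel {
    coeffs: Vec<f64>,
}

impl CostModel {
    /// Evaluate C_op(n) using Horner's method.
    fn estimate(&self, n: f64) -> f64 {
        self.coeffs.iter().rev().fold(0.0, |acc, c| acc * n + *c)
    }
}

/// Hypothetical profiler output for one allocation site.
struct Profile {
    /// Largest number of items the collection held.
    max_n: usize,
    /// Number of calls observed for each operation.
    op_counts: HashMap<&'static str, u64>,
}

/// Weighted cost of one candidate: sum over ops of C_op(n) * #op / #total.
/// Operations without a model are skipped (an assumption of this sketch).
fn candidate_cost(models: &HashMap<&'static str, CostModel>, profile: &Profile) -> f64 {
    let total: u64 = profile.op_counts.values().sum();
    if total == 0 {
        return 0.0;
    }
    let n = profile.max_n as f64;
    profile
        .op_counts
        .iter()
        .filter_map(|(op, count)| {
            models
                .get(op)
                .map(|m| m.estimate(n) * (*count as f64) / (total as f64))
        })
        .sum()
}

/// Name of the candidate with the lowest estimated cost, if there is one.
fn cheapest<'a>(
    candidates: &'a [(&'a str, HashMap<&'static str, CostModel>)],
    profile: &Profile,
) -> Option<&'a str> {
    candidates
        .iter()
        .map(|(name, models)| (*name, candidate_cost(models, profile)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(name, _)| name)
}
#+end_src

Using only the maximum size is the simplest reading of the notes; a spread of max n across sites would need the extension mentioned there.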