#+TITLE: Tasks
#+TODO: TODO(t) DOING(w) | DONE(d) BLOCKED(b) CANCELED(c)
* Original design notes
- Based on design used by collectionswitch
- Least intervention required per implementation
- Integrate with primrose to get the candidate collections
- Ideally this would just be using the rust crate, or having a JSON interface to a CLI
- For each collection and for each 'critical operation', generate a cost estimate when the collection is a given size - $C_{op}(n)$
- Perform operation repeatedly at various n, and fit a polynomial to that
- Requires some trait constraints, and some annotation of traits to know what are 'critical operations'
- This step should only need to be run once per computer
- could be shared by default and run again for better accuracy
- Semantic Profiler
- For each allocated collection:
- Max size (in terms of items)
- # of each operation
- This should be aggregated by 'allocation site' (specified by last few bits of callstack).
- Not sure how to do this, maybe look at how tracing crate does it
- Requires user to write their own benchmarks
- criterion is popular for this, and might have hooks?
- doesn't need to be /super/ lightweight, just enough to not make things painful to run.
- Approximate a cost for each candidate as $\sum_{op} C_{op}(n) \cdot \frac{\#op}{\#total}$.
- We could extend this to suggest different approaches if there is a spread of max n.
- If time allows, could attempt to create a 'wrapper type' that switches between collections as n changes, using rules decided by something similar to the above algorithm.
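
The weighted cost estimate from these notes could look roughly like the following. This is a minimal sketch, not the actual implementation: it assumes each cost model is a polynomial in n stored as a coefficient list, and that the profiler reports a maximum size and per-operation counts. The names ~CostModel~, ~Profile~ and ~estimate_cost~ are hypothetical.

#+begin_src rust
use std::collections::HashMap;

/// Hypothetical cost model: polynomial coefficients, lowest degree first,
/// so the estimated cost at size n is c[0] + c[1]*n + c[2]*n^2 + ...
pub struct CostModel {
    pub coeffs: Vec<f64>,
}

impl CostModel {
    /// Evaluate C_op(n) using Horner's method.
    pub fn estimate(&self, n: f64) -> f64 {
        self.coeffs.iter().rev().fold(0.0, |acc, c| acc * n + c)
    }
}

/// Hypothetical profiler output for one allocation site.
pub struct Profile {
    pub max_n: usize,
    pub op_counts: HashMap<String, u64>,
}

/// Sum over operations of C_op(n) * #op / #total, evaluated at the maximum n.
/// Operations with no cost model are skipped (an assumption of this sketch).
pub fn estimate_cost(models: &HashMap<String, CostModel>, profile: &Profile) -> f64 {
    let total: u64 = profile.op_counts.values().sum();
    if total == 0 {
        return 0.0;
    }
    let n = profile.max_n as f64;
    profile
        .op_counts
        .iter()
        .filter_map(|(op, &count)| {
            models
                .get(op)
                .map(|m| m.estimate(n) * count as f64 / total as f64)
        })
        .sum()
}
#+end_src

Evaluating only at the maximum n is the simplest choice; a spread of sizes would need one estimate per observed n.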
* DONE Integrating with Primrose
We want to be able to run primrose on some file, and get a list of candidates out.
We also need to know where we can find the implementation types, for use in building cost models later.
This will involve exposing interfaces for the library specifications, candidate generation, and code generation.
We can also take this as an opportunity to make it slightly more robust and well-documented, although this is a secondary goal.
** DONE Interface for implementation library
We can list every implementation available, its module path, the traits it implements, and a bunch of generated racket code.
We don't need a list of operations right now, as we'll just write a separate cost-model-building implementation for each trait, and that will give us the operations.
This is provided by ~primrose::LibSpec~, and defined in the ~lib_specs~ module.
** DONE Interface for candidate generation
We should be able to get a list of candidates for a given file, which links up with the implementation library correctly.
This is provided by ~primrose::ContainerSelector~, and defined in the ~selector~ module.
** DONE Interface for code generation
Given we know what type to use, we should be able to generate the real code that should replace some input file.
This is provided by ~primrose::ContainerSelector~, and defined in the ~codegen~ module.
** DONE Proof of concept type thing
We can get all the candidates for all files in a given project, or even workspace!
This is provided by ~Project~ in ~candelabra_cli::project~.
* Building cost models for types
We need to be able to build cost models for any given type and operation.
This will probably involve some codegen.
There should also be caching of the outputs.
** Generic trait benchmarkers
Given some type implementing a given trait, we should be able to get performance information at various Ns.
We have a first pass of these benchmarks, although they may need refining over time.
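As an illustration of the idea only, timing one operation at several sizes might be sketched as below. The function ~bench_at_sizes~ and its parameters are hypothetical and do not reflect the existing benchmarkers. The sketch uses only ~std::time~, with none of the warm-up or statistical handling a harness such as criterion provides.

#+begin_src rust
use std::time::{Duration, Instant};

/// Time `op` on a freshly built container for each size in `ns`.
/// Returns (n, mean time per run). `build` and `op` come from the caller.
pub fn bench_at_sizes<C>(
    ns: &[usize],
    runs: u32,
    mut build: impl FnMut(usize) -> C,
    mut op: impl FnMut(&mut C),
) -> Vec<(usize, Duration)> {
    ns.iter()
        .map(|&n| {
            let mut total = Duration::ZERO;
            for _ in 0..runs {
                let mut container = build(n); // setup is not timed
                let start = Instant::now();
                op(&mut container);
                total += start.elapsed();
            }
            // Avoid dividing by zero if the caller asks for no runs.
            (n, total / runs.max(1))
        })
        .collect()
}
#+end_src

For example, ~bench_at_sizes(&[10, 100], 5, |n| (0..n).collect::<Vec<_>>(), |v: &mut Vec<usize>| v.push(0))~ would give one mean duration per size.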
*** DONE Container trait benchmarker
*** DONE Indexable trait benchmarker
*** DONE Stack trait benchmarker
** DONE Generate benchmarks for arbitrary type
We can generate and run benchmarks from library specs using ~candelabra_cli::cost::benchmark::run_benchmarks~.
** DONE Caching and loading outputs
We cache the benchmark results by type and invalidate based on library modification time.
Relevant code is in ~candelabra_cli::cache~ and ~candelabra_cli::cost~.
We also cache primrose results in ~candelabra_cli::candidates~.
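Invalidation by modification time can be sketched as follows. This is an assumption-laden illustration rather than the code in ~candelabra_cli::cache~: the names ~CacheEntry~ and ~get_or_rebuild~ are hypothetical, and the entry is kept in memory instead of being serialised to disk.

#+begin_src rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical cache entry: a stored result plus the modification time of
/// the library file it was built from.
pub struct CacheEntry<T> {
    pub lib_modified: SystemTime,
    pub value: T,
}

/// Reuse `cached` if the library file has not changed since it was built,
/// otherwise call `rebuild` and stamp the result with the current mtime.
pub fn get_or_rebuild<T>(
    cached: Option<CacheEntry<T>>,
    lib_path: &Path,
    rebuild: impl FnOnce() -> T,
) -> io::Result<CacheEntry<T>> {
    let modified = fs::metadata(lib_path)?.modified()?;
    if let Some(entry) = cached {
        if entry.lib_modified == modified {
            return Ok(entry);
        }
    }
    Ok(CacheEntry {
        lib_modified: modified,
        value: rebuild(),
    })
}
#+end_src

Comparing for equality rather than "newer than" also invalidates the entry if the file is replaced by an older copy.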
** DONE Integrate with CLI
The CLI should get cost models for all of the candidate types, and for now just print them out.
** DOING Build cost model from benchmark
Fit polynomials for each operation from the benchmarking data.
Possibly helpful crates:
- https://docs.rs/fitme/latest/fitme/fn.fit.html
- https://lib.rs/crates/enterpolation
- https://docs.rs/varpro/latest/varpro/
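
For reference, a polynomial fit can also be done without an external crate. The sketch below is one possible approach, not a decision about which method this project uses: ordinary least squares via the normal equations, solved with Gaussian elimination. The name ~fit_polynomial~ is hypothetical. The normal equations become poorly conditioned for large n or high degrees, so rescaling n or using one of the crates above may be preferable in practice.

#+begin_src rust
/// Fit a polynomial of the given degree to (x, y) points by least squares.
/// Returns coefficients lowest degree first, or None if there are too few
/// points or the system is (numerically) singular.
pub fn fit_polynomial(points: &[(f64, f64)], degree: usize) -> Option<Vec<f64>> {
    let m = degree + 1;
    if points.len() < m {
        return None;
    }
    // Augmented matrix [A^T A | A^T y], where A[i][j] = x_i^j.
    let mut a = vec![vec![0.0f64; m + 1]; m];
    for &(x, y) in points {
        let powers: Vec<f64> = (0..2 * m - 1).map(|k| x.powi(k as i32)).collect();
        for r in 0..m {
            for c in 0..m {
                a[r][c] += powers[r + c];
            }
            a[r][m] += powers[r] * y;
        }
    }
    // Gaussian elimination with partial pivoting.
    for col in 0..m {
        let pivot = (col..m).max_by(|&i, &j| a[i][col].abs().total_cmp(&a[j][col].abs()))?;
        if a[pivot][col].abs() < 1e-12 {
            return None;
        }
        a.swap(col, pivot);
        for row in col + 1..m {
            let factor = a[row][col] / a[col][col];
            for k in col..=m {
                let delta = factor * a[col][k];
                a[row][k] -= delta;
            }
        }
    }
    // Back substitution.
    let mut coeffs = vec![0.0; m];
    for row in (0..m).rev() {
        let tail: f64 = (row + 1..m).map(|k| a[row][k] * coeffs[k]).sum();
        coeffs[row] = (a[row][m] - tail) / a[row][row];
    }
    Some(coeffs)
}
#+end_src

Each point would be an (n, measured time) pair for one operation, and the returned coefficients define that operation's $C_{op}(n)$.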
* BLOCKED Semantic profiler
We need to be able to pick some random candidate type, wrap it with profiling stuff, and run user benchmarks to get data.
Ideally, we could use information about the cargo project the file resides in to get benchmarks.
We also need to figure out which operations we want to bother counting, and how we can get an 'allocation context'/callstack.
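One possible shape for the profiling wrapper is sketched below, under several assumptions: it wraps a plain ~Vec~ rather than an arbitrary candidate type, it counts only ~push~ and ~pop~, and it uses ~#[track_caller]~ with ~std::panic::Location::caller()~ as a stand-in for an allocation context. That captures only the immediate call site of ~new~, not a callstack. The type ~Profiled~ is hypothetical.

#+begin_src rust
use std::collections::HashMap;
use std::panic::Location;

/// Hypothetical profiling wrapper around a Vec used as a stack.
pub struct Profiled<T> {
    inner: Vec<T>,
    max_len: usize,
    op_counts: HashMap<&'static str, u64>,
    site: &'static Location<'static>,
}

impl<T> Profiled<T> {
    /// Records the source location of the caller as the allocation site.
    #[track_caller]
    pub fn new() -> Self {
        Profiled {
            inner: Vec::new(),
            max_len: 0,
            op_counts: HashMap::new(),
            site: Location::caller(),
        }
    }

    /// Count one operation and update the high-water mark for size.
    fn record(&mut self, op: &'static str) {
        *self.op_counts.entry(op).or_insert(0) += 1;
        self.max_len = self.max_len.max(self.inner.len());
    }

    pub fn push(&mut self, item: T) {
        self.inner.push(item);
        self.record("push");
    }

    pub fn pop(&mut self) -> Option<T> {
        let item = self.inner.pop();
        self.record("pop");
        item
    }

    pub fn max_len(&self) -> usize {
        self.max_len
    }

    pub fn count(&self, op: &str) -> u64 {
        self.op_counts.get(op).copied().unwrap_or(0)
    }

    pub fn site(&self) -> &'static Location<'static> {
        self.site
    }
}
#+end_src

Results from several wrappers could then be aggregated by grouping on ~site().file()~ and ~site().line()~.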
* BLOCKED Integration
We create the last bits of the whole pipeline:
- Use primrose to get candidates
- Get/retrieve cached cost models
- Estimate cost of each candidate
- Pick a candidate, and generate code using that candidate
Ideally we have a usable enough CLI implementation that can produce structured intermediate outputs, as this will aid with benchmarking and presenting results later.
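
The "pick a candidate" step could be as small as the sketch below, assuming each candidate has already been reduced to a name and a single estimated cost. The function ~pick_cheapest~ is hypothetical, and skipping non-finite costs is an assumption about how failed estimates might be represented.

#+begin_src rust
/// Return the name of the candidate with the lowest finite estimated cost,
/// or None if there are no usable candidates.
pub fn pick_cheapest(costs: &[(String, f64)]) -> Option<&str> {
    costs
        .iter()
        .filter(|(_, cost)| cost.is_finite())
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(name, _)| name.as_str())
}
#+end_src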
* BLOCKED Benchmarks & Evaluation
We implement several test programs which require different data structures and different implementations.
We compare our results to the best possible result, and to using the standard library implementation.
* BLOCKED Nice to haves
- Make primrose use actual temp directories, and respect the ~CANDELABRA_SRC_DIR~ stuff that the CLI does
- Separate candelabra into a library and a CLI, and give the CLI nicer outputs/subcommands for each step
- Add nix build stuff for everything
- Nixify the install of rosette
- See if the ~linked_list_cursors~ feature / nightly compiler is actually necessary