#+TITLE: Tasks
#+TODO: TODO(t) DOING(w) | DONE(d) BLOCKED(b) CANCELED(c)

* Original design notes

  - Existing work: Primrose
    - Deals with selecting the best container type (currently only for lists)
    - Specify which pre-defined interfaces are required, and which semantic properties
      - eg ascending order, uniqueness
    - Library of premade implementations, with simplified models of their behaviour
    - Outputs a list of implementations that meet the given requirements
  - Problem: picking which of these implementations to use
    - The number of possibilities is exponential in the number of types to be selected
    - Exhaustive evaluation is therefore infeasible for all but the smallest problems
  - Assume we have some user-defined benchmarks that we want to run well
  - Use approach similar to collectionswitch:
    - Least intervention required per implementation, so should scale well
    - For each collection and for each 'critical operation', estimate the cost of that operation when the collection holds $n$ items: $C_{op}(n)$
      - Perform the operation repeatedly at various $n$, and fit a polynomial to the results
      - Requires some trait constraints, and some annotation of traits to know which operations are 'critical'
      - This step should only need to be run once per computer
        - Could be shared by default, and re-run locally for better accuracy
    - Semantic Profiler
      - For each allocated collection:
        - Max size (in terms of items)
        - # of each operation
      - This should be aggregated by 'allocation site' (identified by the last few frames of the call stack).
      - Doesn't need to be /super/ lightweight, just light enough that profiled runs aren't painful.
    - Approximate a cost for each candidate as $\sum_{op} C_{op}(n) \cdot \frac{\#op}{\#total}$ (see the sketch after this list).
      - We could extend this to suggest different implementations per allocation site if there is a wide spread of max $n$.
    - If time allows, could attempt to create a 'wrapper type' that switches between collections as n changes, using rules decided by something similar to the above algorithm.
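
A minimal Rust sketch of that cost estimate, assuming cost models are stored as fitted polynomial coefficients and the profiler reports per-operation counts; the types and names here are illustrative, not the real API:

#+begin_src rust
/// A fitted cost model for one operation: polynomial coefficients in n.
/// (Illustrative; the real representation may differ.)
struct CostModel {
    coeffs: Vec<f64>, // coeffs[i] multiplies n^i
}

impl CostModel {
    /// Estimated cost of a single call when the collection holds n items.
    fn estimate(&self, n: f64) -> f64 {
        self.coeffs
            .iter()
            .enumerate()
            .map(|(i, c)| c * n.powi(i as i32))
            .sum()
    }
}

/// Approximate total cost of one candidate: sum over operations of
/// C_op(n), weighted by that operation's share of all recorded calls.
fn candidate_cost(models: &[(&str, CostModel)], counts: &[(&str, u64)], max_n: f64) -> f64 {
    let total: u64 = counts.iter().map(|&(_, c)| c).sum();
    counts
        .iter()
        .map(|&(op, count)| {
            let (_, model) = models
                .iter()
                .find(|(name, _)| *name == op)
                .expect("no cost model for operation");
            model.estimate(max_n) * (count as f64 / total as f64)
        })
        .sum()
}
#+end_src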

* DONE Integrating with Primrose

We want to be able to run primrose on some file, and get a list of candidates out.
We also need to know where we can find the implementation types, for use in building cost models later.

This will involve exposing interfaces for the library specifications, candidate generation, and code generation.
We can also take this as an opportunity to make it slightly more robust and well-documented, although this is a secondary goal.

** DONE Interface for implementation library

We can list every implementation available, its module path, the traits it implements, and a bunch of generated Racket code.

We don't need a list of operations right now, as we'll make a separate cost-model-building implementation for each trait, and that will give us the operations.

This is provided by ~primrose::LibSpec~, and defined in the ~lib_specs~ module.

** DONE Interface for candidate generation

We should be able to get a list of candidates for a given file, which links up with the implementation library correctly.

This is provided by ~primrose::ContainerSelector~, and defined in the ~selector~ module.
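
A rough sketch of what driving this interface could look like; the constructor and method names below are assumptions for illustration, not the actual ~ContainerSelector~ API:

#+begin_src rust
use std::path::Path;

use primrose::ContainerSelector;

fn print_candidates(file: &Path) -> Result<(), Box<dyn std::error::Error>> {
    let selector = ContainerSelector::from_file(file)?; // assumed constructor
    for candidate in selector.find_candidates()? {      // assumed method
        println!("{candidate:?}"); // assumes the candidate type is Debug
    }
    Ok(())
}
#+end_src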

** DONE Interface for code generation

Given we know what type to use, we should be able to generate the real code that should replace some input file.

This is provided by ~primrose::ContainerSelector~, and defined in the ~codegen~ module.

** DONE Proof of concept type thing

We can get all the candidates for all files in a given project, or even a whole workspace!

This is provided by ~Project~ in ~candelabra_cli::project~.

* Building cost models for types

We need to be able to build cost models for any given type and operation.
This will probably involve some codegen.

There should also be caching of the outputs.

** Generic trait benchmarkers

Given some type implementing a given trait, we should be able to get performance information at various values of $n$.

We have a first pass of these benchmarks, although they may need to be refined over time.
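
A minimal sketch of the shape of such a benchmarker, assuming a ~Container~ trait with an ~insert~ operation; the trait and timing details are illustrative, not the real benchmarking code:

#+begin_src rust
use std::time::{Duration, Instant};

// Assumed trait; the real Container trait will differ.
trait Container<T>: Default {
    fn insert(&mut self, item: T);
}

/// Time `op` repeatedly against a container pre-filled with n items,
/// returning the mean duration per call.
fn benchmark_op<C: Container<u64>>(n: u64, reps: u32, op: impl Fn(&mut C)) -> Duration {
    let mut c = C::default();
    for i in 0..n {
        c.insert(i);
    }
    let start = Instant::now();
    for _ in 0..reps {
        op(&mut c);
    }
    start.elapsed() / reps
}
#+end_src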

*** DONE Container trait benchmarker
*** DONE Indexable trait benchmarker
*** DONE Stack trait benchmarker

** DONE Generate benchmarks for arbitrary type

We can generate and run benchmarks from library specs using ~candelabra_cli::cost::benchmark::run_benchmarks~.

** DONE Caching and loading outputs

We cache the benchmark results by type and invalidate based on library modification time.
Relevant code is in ~candelabra_cli::cache~ and ~candelabra_cli::cost~.
We also cache primrose results in ~candelabra_cli::candidates~.
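
The invalidation check boils down to comparing the cache entry's timestamp against the library's on-disk modification time, roughly like this (a sketch, not the actual cache code):

#+begin_src rust
use std::{fs, path::Path, time::SystemTime};

/// Is a cache entry written at `cached_at` still valid for `lib_path`?
/// It is invalid if the library has been modified since it was written.
fn cache_valid(lib_path: &Path, cached_at: SystemTime) -> std::io::Result<bool> {
    let modified = fs::metadata(lib_path)?.modified()?;
    Ok(modified <= cached_at)
}
#+end_src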

** DONE Integrate with CLI

The CLI should get cost models for all of the candidate types, and for now just print them out.

** DONE Build cost model from benchmark

We can fit polynomials for each operation from the benchmarking data.
This appears to be working, but needs further testing.
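
The fitting step itself is plain least squares: build the normal equations for a degree-$d$ polynomial and solve them. A self-contained sketch (the real code may lean on a crate, but the idea is the same):

#+begin_src rust
/// Fit a least-squares polynomial of degree `deg` to (x, y) observations,
/// returning coefficients c[0] + c[1]*x + ... + c[deg]*x^deg.
/// Solves the normal equations (V^T V) c = V^T y by Gaussian elimination;
/// fine for a sketch, though a QR- or SVD-based fit is more robust.
fn fit_polynomial(xs: &[f64], ys: &[f64], deg: usize) -> Vec<f64> {
    let m = deg + 1;
    // Accumulate A = V^T V and b = V^T y directly from the data.
    let mut a = vec![vec![0.0; m]; m];
    let mut b = vec![0.0; m];
    for (&x, &y) in xs.iter().zip(ys) {
        let powers: Vec<f64> = (0..m).map(|i| x.powi(i as i32)).collect();
        for i in 0..m {
            b[i] += powers[i] * y;
            for j in 0..m {
                a[i][j] += powers[i] * powers[j];
            }
        }
    }
    // Gaussian elimination with partial pivoting.
    for col in 0..m {
        let pivot = (col..m)
            .max_by(|&r, &s| a[r][col].abs().total_cmp(&a[s][col].abs()))
            .unwrap();
        a.swap(col, pivot);
        b.swap(col, pivot);
        for row in col + 1..m {
            let f = a[row][col] / a[col][col];
            for k in col..m {
                a[row][k] -= f * a[col][k];
            }
            b[row] -= f * b[col];
        }
    }
    // Back substitution.
    let mut c = vec![0.0; m];
    for row in (0..m).rev() {
        let s: f64 = (row + 1..m).map(|k| a[row][k] * c[k]).sum();
        c[row] = (b[row] - s) / a[row][row];
    }
    c
}
#+end_src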

* DONE Semantic profiler

We need to be able to pick some random candidate type, wrap it with profiling instrumentation, and run user benchmarks to get data.

Ideally, we could use information about the cargo project the file resides in to get benchmarks.

We also need to figure out which operations we want to bother counting, and how we can get an 'allocation context'/callstack.
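
The wrapper itself can be simple: delegate each operation, count it, and track the maximum observed size. A hedged sketch with illustrative names, shown here for ~Vec~:

#+begin_src rust
use std::collections::HashMap;

/// Wraps a collection, counting operations and the maximum observed size.
/// (Illustrative; the real profiler also records the allocation site.)
struct Profiled<C> {
    inner: C,
    op_counts: HashMap<&'static str, u64>,
    max_size: usize,
}

impl<C: Default> Profiled<C> {
    fn new() -> Self {
        Profiled { inner: C::default(), op_counts: HashMap::new(), max_size: 0 }
    }
}

impl<C> Profiled<C> {
    fn record(&mut self, op: &'static str, size: usize) {
        *self.op_counts.entry(op).or_insert(0) += 1;
        self.max_size = self.max_size.max(size);
    }
}

impl<T> Profiled<Vec<T>> {
    fn push(&mut self, item: T) {
        self.inner.push(item);
        let size = self.inner.len();
        self.record("push", size);
    }
}
#+end_src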

* DONE Integration

We have the code to do all this end-to-end, and to run a 'brute force' for comparison.

* TODO It's fucking broken

We can pick the right option, but our cost models look wrong.
~insert~ seems to be the main culprit, probably because we only ever measure long, uninterrupted runs of inserts.

** TODO Merge observations from multiple benchmarks

i.e. if we have two different ~insert/100 ...~ lines, we should use both instead of only one (see the sketch below).
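
A sketch of that merging step: group every observation by its benchmark ID before fitting, rather than keeping only the last one seen:

#+begin_src rust
use std::collections::HashMap;

/// Group raw observations by benchmark id (e.g. "insert/100"),
/// keeping every data point rather than the last one seen.
fn merge_observations(raw: Vec<(String, f64)>) -> HashMap<String, Vec<f64>> {
    let mut merged: HashMap<String, Vec<f64>> = HashMap::new();
    for (id, cost) in raw {
        merged.entry(id).or_default().push(cost);
    }
    merged
}
#+end_src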

** TODO Fix seed for benchmarks
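
Assuming the benchmarks generate their inputs with the ~rand~ crate (0.8 API), fixing the seed is enough to make runs deterministic:

#+begin_src rust
use rand::{rngs::StdRng, Rng, SeedableRng};

/// Deterministic input generation: the same seed yields the same
/// benchmark data on every run.
fn benchmark_inputs(n: usize) -> Vec<u64> {
    let mut rng = StdRng::seed_from_u64(0xCA7DE1AB); // fixed, arbitrary seed
    (0..n).map(|_| rng.gen()).collect()
}
#+end_src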

* BLOCKED Benchmarks & Evaluation

We implement several test programs that require different data structures and different implementations.
We compare our results to the best possible choice (found by brute force), and to the standard library implementation.

Ideas:
  - https://git.tardisproject.uk/tcmal/advent-of-code/-/blob/main/2021/day9/09.cpp?ref_type=heads
  - https://git.tardisproject.uk/tcmal/advent-of-code/-/blob/main/2022/src/day05.rs?ref_type=heads
  - https://git.tardisproject.uk/tcmal/advent-of-code/-/blob/main/2022/src/day08.rs?ref_type=heads
  - https://git.tardisproject.uk/tcmal/advent-of-code/-/blob/main/2022/src/day09.rs?ref_type=heads
  - https://git.tardisproject.uk/tcmal/advent-of-code/-/blob/main/2022/src/day11.rs?ref_type=heads
  - https://git.tardisproject.uk/tcmal/advent-of-code/-/blob/main/2022/src/day14.rs?ref_type=heads
  - Dijkstra's

** Add more collection types

Ideas:
  - https://lib.rs/crates/hashbrown
  - https://lib.rs/crates/indexmap
  - https://lib.rs/crates/smallvec
  - https://lib.rs/crates/hashlink
  - https://lib.rs/crates/thin-vec

* BLOCKED Nice to haves

  - Better intermediate outputs, possibly export plots?
  - Make primrose use actual temp directories, and respect the ~CANDELABRA_SRC_DIR~ handling that the CLI already does
  - Separate candelabra into a library and a CLI, and give the CLI nicer outputs/subcommands for each step
  - Add nix build stuff for everything
  - Nixify the install of rosette
  - See if the ~linked_list_cursors~ feature / nightly compiler is actually necessary
  - Alternative to opaque type aliases for codegen
  - Real checking of map types

* Writing

** TODO Create outline
DEADLINE: <2024-01-26 Fri>