author     Aria <me@aria.rip>    2023-10-01 17:03:09 +0100
committer  Aria <me@aria.rip>    2023-10-01 17:03:09 +0100
commit     57e87634490eca3333e2283b596ed48f887cfb89 (patch)
tree       9a3761711bbc5094a94fac9e078d5a76ea780dcd
parent     5b60993829edaab8254491358ac11a0a19268168 (diff)
some notes on planned design
Diffstat:
-rw-r--r--  Tasks.org  25
1 file changed, 24 insertions, 1 deletion
@@ -47,4 +47,27 @@
 https://ieeexplore.ieee.org/abstract/document/4907670
 [20] MITCHELL, J. C. Representation independence and data abstraction. In POPL ’86: Proceedings of the 13th ACM SIGACT-SIGPLAN symposium on Principles of programming languages (New York, NY, USA, 1986), ACM, pp. 263–276.
 
-* NEXT Make notes on planned design
+* Planned design
+
+CollectionSwitch's design is the one that requires the least intervention per new collection type.
+
+We need some way to integrate with primrose to get the candidate collections.
+Ideally this would just be using the Rust crate, or having a JSON interface to a CLI.
+
+For each collection and for each 'critical operation', we generate a performance estimate by running the operation repeatedly at various $n$ and fitting a polynomial to the results.
+This gives us an estimate of the cost of each operation when the collection is a given size: $C_{op}(n)$.
+This step should only need to be run once per computer, or the results could even be shared by default and re-run for better accuracy.
+
+Then, we need a 'semantic profiler'. For our cases, this should collect:
+ - Max size (in terms of memory used?)
+ - Max size (in terms of items)
+ - # of each operation
+for each individually allocated collection.
+This should then be aggregated by 'allocation site' (identified by the last few frames of the call stack).
+This does require the user to write their own benchmarks - we could maybe hook into criterion for data collection, as it is already popular.
+This profiler doesn't need to be /super/ lightweight, just enough to not make things painful to run.
+
+Then we can approximate a cost for each candidate as $\sum_{op} C_{op}(n) \cdot \frac{\#op}{\#total}$.
+We could extend this to suggest different approaches if there is a spread of max $n$.
+
+If time allows, we could attempt to create a 'wrapper type' that switches between collections as $n$ changes, using rules decided by something similar to the above algorithm.
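A first sketch of the primrose integration, shelling out to a CLI and reading back a JSON candidate list. The 'primrose' binary name, the --json flag, and the 'candidates' output schema are all assumptions for illustration (as is the serde_json dependency); the crate's real interface may differ.

#+BEGIN_SRC rust
// Sketch only: the 'primrose' binary name, the --json flag, and the
// output schema below are assumptions, not primrose's real interface.
use std::error::Error;
use std::process::Command;

fn candidate_collections(spec: &str) -> Result<Vec<String>, Box<dyn Error>> {
    let output = Command::new("primrose")
        .args(["--json", spec]) // hypothetical interface
        .output()?;
    let parsed: serde_json::Value = serde_json::from_slice(&output.stdout)?;
    // Assumed output shape: { "candidates": ["Vec", "HashSet", ...] }
    let candidates = parsed["candidates"]
        .as_array()
        .ok_or("missing 'candidates' array")?
        .iter()
        .filter_map(|v| v.as_str().map(String::from))
        .collect();
    Ok(candidates)
}
#+END_SRC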
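For the benchmarking step, a minimal sketch of fitting $C_{op}(n)$, assuming ordinary least squares over (n, mean time per operation) samples solved via the normal equations. The polynomial degree and the choice of sample points are placeholders.

#+BEGIN_SRC rust
/// Fit coefficients c0 + c1*n + ... + ck*n^k to (n, cost) samples by
/// least squares. Fine for a handful of points; no external crates.
fn fit_polynomial(observations: &[(f64, f64)], degree: usize) -> Vec<f64> {
    let k = degree + 1;
    // Build the normal equations A^T A x = A^T b, where A is the
    // Vandermonde matrix of the sampled n values.
    let mut ata = vec![vec![0.0; k]; k];
    let mut atb = vec![0.0; k];
    for &(n, cost) in observations {
        let powers: Vec<f64> = (0..k).map(|i| n.powi(i as i32)).collect();
        for i in 0..k {
            atb[i] += powers[i] * cost;
            for j in 0..k {
                ata[i][j] += powers[i] * powers[j];
            }
        }
    }
    // Gaussian elimination with partial pivoting (assumes the system
    // is well-conditioned enough for a cost model sketch).
    for col in 0..k {
        let pivot = (col..k)
            .max_by(|&a, &b| {
                ata[a][col].abs().partial_cmp(&ata[b][col].abs()).unwrap()
            })
            .unwrap();
        ata.swap(col, pivot);
        atb.swap(col, pivot);
        for row in (col + 1)..k {
            let factor = ata[row][col] / ata[col][col];
            for j in col..k {
                ata[row][j] -= factor * ata[col][j];
            }
            atb[row] -= factor * atb[col];
        }
    }
    // Back substitution.
    let mut coeffs = vec![0.0; k];
    for row in (0..k).rev() {
        let sum: f64 = ((row + 1)..k).map(|j| ata[row][j] * coeffs[j]).sum();
        coeffs[row] = (atb[row] - sum) / ata[row][row];
    }
    coeffs
}

/// Evaluate the fitted model at a given collection size (Horner's rule).
fn estimate_cost(coeffs: &[f64], n: f64) -> f64 {
    coeffs.iter().rev().fold(0.0, |acc, &c| acc * n + c)
}
#+END_SRC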
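One possible shape for the semantic profiler, as a sketch: a wrapper collection records operation counts and max item count, keyed by allocation site using #[track_caller]. ProfiledVec, SiteStats, and the global STATS map are illustrative names, and only two operations are instrumented here.

#+BEGIN_SRC rust
// Sketch: per-allocation-site statistics. All names are placeholders.
use std::collections::HashMap;
use std::panic::Location;
use std::sync::Mutex;

#[derive(Default, Clone)]
struct SiteStats {
    max_items: usize,
    pushes: u64,
    gets: u64,
}

// Global table: allocation site -> aggregated stats.
static STATS: Mutex<Option<HashMap<&'static Location<'static>, SiteStats>>> =
    Mutex::new(None);

struct ProfiledVec<T> {
    inner: Vec<T>,
    site: &'static Location<'static>,
}

impl<T> ProfiledVec<T> {
    // #[track_caller] makes Location::caller() report where the
    // collection was constructed, i.e. the allocation site.
    #[track_caller]
    fn new() -> Self {
        ProfiledVec { inner: Vec::new(), site: Location::caller() }
    }

    fn with_stats(&self, f: impl FnOnce(&mut SiteStats)) {
        let mut guard = STATS.lock().unwrap();
        let map = guard.get_or_insert_with(HashMap::new);
        f(map.entry(self.site).or_default());
    }

    fn push(&mut self, value: T) {
        self.inner.push(value);
        let len = self.inner.len();
        self.with_stats(|s| {
            s.pushes += 1;
            s.max_items = s.max_items.max(len);
        });
    }

    fn get(&self, index: usize) -> Option<&T> {
        self.with_stats(|s| s.gets += 1);
        self.inner.get(index)
    }
}
#+END_SRC

A Mutex-guarded map is the simple option here; if contention ever mattered, thread-local counters merged at exit would fit the 'not super lightweight, just not painful' bar.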
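With the fitted models and the profiler counts in hand, the cost formula above is just a weighted sum. A sketch, with all names placeholders:

#+BEGIN_SRC rust
use std::collections::HashMap;

/// Evaluate fitted polynomial coefficients at size n (same helper as
/// in the fitting sketch above).
fn estimate_cost(coeffs: &[f64], n: f64) -> f64 {
    coeffs.iter().rev().fold(0.0, |acc, &c| acc * n + c)
}

/// Score one candidate collection for one allocation site:
/// sum over ops of C_op(n) * (#op / #total).
fn score_candidate(
    models: &HashMap<String, Vec<f64>>, // op name -> fitted C_op coefficients
    op_counts: &HashMap<String, u64>,   // op name -> observed #op
    max_n: f64,                         // observed max size for this site
) -> f64 {
    let total: u64 = op_counts.values().sum();
    if total == 0 {
        return 0.0; // site never used; any candidate is equally fine
    }
    op_counts
        .iter()
        .map(|(op, &count)| {
            // Panics if the candidate has no model for an op; a real
            // version would treat that as "candidate unsupported".
            let cost = estimate_cost(&models[op], max_n);
            cost * (count as f64 / total as f64)
        })
        .sum()
}
#+END_SRC

The candidate with the lowest score at the site's observed max $n$ would be the one suggested.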
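And a rough sketch of the 'wrapper type' idea: a set that starts Vec-backed (cheap at small $n$) and migrates to a HashSet once it crosses a size threshold. The hard-coded THRESHOLD stands in for whatever rule the fitted cost models would actually produce.

#+BEGIN_SRC rust
use std::collections::HashSet;
use std::hash::Hash;

// Placeholder; would be derived from the C_op(n) models.
const THRESHOLD: usize = 128;

enum AdaptiveSet<T> {
    Small(Vec<T>),     // linear scans, but cache-friendly at small n
    Large(HashSet<T>), // O(1) lookups once n is big enough to pay off
}

impl<T: Eq + Hash> AdaptiveSet<T> {
    fn new() -> Self {
        AdaptiveSet::Small(Vec::new())
    }

    fn insert(&mut self, value: T) -> bool {
        match self {
            AdaptiveSet::Small(vec) => {
                if vec.contains(&value) {
                    return false;
                }
                vec.push(value);
                if vec.len() > THRESHOLD {
                    // Migrate: move the Vec's contents into a HashSet.
                    let set: HashSet<T> =
                        std::mem::take(vec).into_iter().collect();
                    *self = AdaptiveSet::Large(set);
                }
                true
            }
            AdaptiveSet::Large(set) => set.insert(value),
        }
    }

    fn contains(&self, value: &T) -> bool {
        match self {
            AdaptiveSet::Small(vec) => vec.contains(value),
            AdaptiveSet::Large(set) => set.contains(value),
        }
    }
}
#+END_SRC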