-rw-r--r-- | Tasks.org | 25 |
1 files changed, 24 insertions, 1 deletions
@@ -47,4 +47,27 @@ https://ieeexplore.ieee.org/abstract/document/4907670
 [20] MITCHELL, J. C. Representation independence and data abstraction. In POPL ’86: Proceedings of the 13th ACM SIGACT-SIGPLAN symposium on Principles of programming languages (New York, NY, USA, 1986), ACM, pp. 263–276.
-* NEXT Make notes on planned design
+* Planned design
+
+We adopt the design used by CollectionSwitch, as it requires the least intervention per new collection type.
+
+We need some way to integrate with primrose to get the candidate collections.
+Ideally this would mean using the Rust crate directly, or a JSON interface to a CLI.
+
+For each collection and each 'critical operation', we generate a performance estimate by running the operation repeatedly at various $n$ and fitting a polynomial to the timings.
+This gives us an estimate of the cost of each operation when the collection is a given size - $C_{op}(n)$.
+This step should only need to be run once per computer, or the results could even be shared by default and re-run locally for better accuracy.
+
+Then, we need a 'semantic profiler'. For our case, this should collect:
+ - Max size (in terms of memory used?)
+ - Max size (in terms of items)
+ - # of each operation
+for each individually allocated collection.
+This should then be aggregated by 'allocation site' (identified by the last few frames of the call stack).
+This does require the user to write their own benchmarks - we could perhaps hook into criterion for data collection, as it is already popular.
+This profiler doesn't need to be /super/ lightweight, just light enough not to make things painful to run.
+
+Then we can approximate the cost of each candidate as $\sum_{op} C_{op}(n) \cdot \frac{\#op}{\#total}$.
+We could extend this to suggest different approaches if there is a spread of max $n$.
+
+If time allows, we could attempt to create a 'wrapper type' that switches between collections as $n$ changes, using rules decided by something similar to the above algorithm.
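The semantic profiler described in the diff (per-site max size and operation counts, aggregated by call-stack suffix) could be sketched roughly as below. This is a hypothetical illustration, not code from the repository; the type and field names are invented, and real call-stack capture would need a backtrace mechanism rather than hand-supplied frame names.

```rust
use std::collections::HashMap;

/// An allocation site, identified by a suffix of the call stack
/// (function names stand in for real stack frames here).
type Site = Vec<&'static str>;

/// Profile data aggregated for one allocation site.
#[derive(Default, Debug)]
struct SiteStats {
    /// Largest item count seen for any collection allocated here.
    max_items: usize,
    /// How many times each operation was performed.
    op_counts: HashMap<&'static str, u64>,
}

#[derive(Default)]
struct Profiler {
    sites: HashMap<Site, SiteStats>,
}

impl Profiler {
    /// Record one operation on a collection allocated at `site`,
    /// along with the collection's length after the operation.
    fn record(&mut self, site: &Site, op: &'static str, len: usize) {
        let stats = self.sites.entry(site.clone()).or_default();
        stats.max_items = stats.max_items.max(len);
        *stats.op_counts.entry(op).or_insert(0) += 1;
    }
}

fn main() {
    let mut p = Profiler::default();
    let site: Site = vec!["main", "load_config"]; // last two stack frames
    p.record(&site, "push", 1);
    p.record(&site, "push", 2);
    p.record(&site, "contains", 2);
    let stats = &p.sites[&site];
    println!("max_items = {}, pushes = {}", stats.max_items, stats.op_counts["push"]);
}
```

Aggregating by a fixed-length call-stack suffix, rather than the full stack, keeps the number of sites bounded while still distinguishing collections allocated from different code paths.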
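The final scoring step - combining the fitted cost polynomials $C_{op}(n)$ with the profiled operation frequencies via $\sum_{op} C_{op}(n) \cdot \#op/\#total$ - could look something like the following. All coefficients and names here are made up for illustration; in practice the polynomials would come from the one-off benchmarking step and the counts from the profiler.

```rust
/// Cost polynomial C_op(n) = a0 + a1*n + a2*n^2, fitted offline
/// from repeated timings of one operation at various sizes.
struct CostPoly {
    coeffs: [f64; 3], // [a0, a1, a2]
}

impl CostPoly {
    /// Evaluate the polynomial at size `n` (Horner's method).
    fn eval(&self, n: f64) -> f64 {
        self.coeffs.iter().rev().fold(0.0, |acc, c| acc * n + c)
    }
}

/// Approximate the cost of one candidate collection at size `n`:
/// sum over operations of C_op(n) weighted by that operation's
/// share of the total profiled operation count.
fn estimate_cost(polys: &[(&str, CostPoly)], counts: &[(&str, u64)], n: f64) -> f64 {
    let total: u64 = counts.iter().map(|(_, c)| *c).sum();
    counts
        .iter()
        .map(|(op, count)| {
            let (_, poly) = polys
                .iter()
                .find(|(name, _)| name == op)
                .expect("no fitted polynomial for operation");
            poly.eval(n) * (*count as f64 / total as f64)
        })
        .sum()
}

fn main() {
    // Made-up fits: `push` roughly constant, `contains` roughly linear.
    let polys = vec![
        ("push", CostPoly { coeffs: [5.0, 0.0, 0.0] }),
        ("contains", CostPoly { coeffs: [1.0, 0.5, 0.0] }),
    ];
    // Profiled counts for one allocation site: mostly lookups.
    let counts = vec![("push", 100u64), ("contains", 900u64)];
    println!("estimated cost: {:.1}", estimate_cost(&polys, &counts, 1000.0));
}
```

Running this per candidate at the profiled max $n$ and picking the minimum would give the basic selection rule; a spread of max $n$ across sites is what would motivate per-site suggestions instead of a single global choice.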