Diffstat (limited to 'Tasks.org')
-rw-r--r-- | Tasks.org | 43 |
1 files changed, 21 insertions, 22 deletions
@@ -49,25 +49,24 @@ https://ieeexplore.ieee.org/abstract/document/4907670
 * Planned design
-The design used by CollectionSwitch is the one that works with the least intervention required per new collection type.
-
-We need some way to integrate with primrose to get the candidate collections.
-Ideally this would just be using the rust crate, or having a JSON interface to a CLI.
-
-For each collection and for each 'critical operation', we generate a performance estimate by performing it repeatedly at various n, and fitting a polynomial to that.
-This gives us an estimate of the cost of each operation when the collection is a given size - $C_{op}(n)$.
-This step should only need to be run once per compuer, or it could even be shared by default and run again for better accuracy.
-
-Then, we need a 'semantic profiler'. For our cases, this should collect:
-  - Max size (in terms of memory used?)
-  - Max size (in terms of items)
-  - # of each operation
-for each individually allocated collection.
-This should then be aggregated by 'allocation site' (specified by last few bits of callstack).
-This does require the user to write their own benchmarks - we could maybe hook into criterion for data collection, as it is already popular.
-This profiler doesn't need to be /super/ lightweight, just enough to not make things painful to run.
-
-Then we can approximate a cost for each candidate as $\sum_{op} C_{op}(n) \cdot \#op / \#total$.
-We could extend this to suggest different approaches if there is a spread of max n.
-
-If time allows, we could attempt to create a 'wrapper type' that switches between collections as n changes, using rules decided by something similar to the above algorithm.
+- Based on the design used by CollectionSwitch
+  - Least intervention required per implementation
+- Integrate with primrose to get the candidate collections
+  - Ideally this would just be using the rust crate, or having a JSON interface to a CLI
+- For each collection and for each 'critical operation', generate a cost estimate for when the collection is a given size - $C_{op}(n)$
+  - Perform the operation repeatedly at various n, and fit a polynomial to that
+  - Requires some trait constraints, and some annotation of traits to know which operations are 'critical'
+  - This step should only need to be run once per computer
+    - Could be shared by default and run again for better accuracy
+- Semantic profiler
+  - For each allocated collection:
+    - Max size (in terms of items)
+    - # of each operation
+  - This should be aggregated by 'allocation site' (specified by the last few frames of the callstack)
+    - Not sure how to do this, maybe look at how the tracing crate does it
+  - Requires the user to write their own benchmarks
+    - criterion is popular for this, and might have hooks?
+  - Doesn't need to be /super/ lightweight, just enough to not make things painful to run
+- Approximate a cost for each candidate as $\sum_{op} C_{op}(n) \cdot \#op / \#total$
+- We could extend this to suggest different approaches if there is a spread of max n
+- If time allows, could attempt to create a 'wrapper type' that switches between collections as n changes, using rules decided by something similar to the above algorithm
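
To make the cost-estimation step above concrete, here is a minimal sketch under some assumptions: it times one 'critical operation' (HashSet::insert, chosen purely as an example) at several sizes n, then fits a straight line by least squares. The function names and the linear fit are illustrative only; the planned tool would presumably fit a higher-degree polynomial per (collection, operation) pair.

#+begin_src rust
use std::collections::HashSet;
use std::time::Instant;

/// Illustrative only: mean cost (ns) of an insert into a HashSet that
/// already holds n elements, for each n in `sizes`.
fn measure_insert_cost(sizes: &[usize], batch: usize) -> Vec<(f64, f64)> {
    sizes
        .iter()
        .map(|&n| {
            // Pre-fill to size n, then time `batch` further inserts and average.
            let mut set: HashSet<usize> = (0..n).collect();
            let start = Instant::now();
            for x in n..n + batch {
                set.insert(x);
            }
            let per_op = start.elapsed().as_nanos() as f64 / batch as f64;
            (n as f64, per_op)
        })
        .collect()
}

/// Least-squares fit of cost(n) = a + b*n over the (n, cost) samples.
fn fit_linear(samples: &[(f64, f64)]) -> (f64, f64) {
    let m = samples.len() as f64;
    let (sx, sy) = samples
        .iter()
        .fold((0.0, 0.0), |(sx, sy), (x, y)| (sx + x, sy + y));
    let (mx, my) = (sx / m, sy / m);
    let num: f64 = samples.iter().map(|(x, y)| (x - mx) * (y - my)).sum();
    let den: f64 = samples.iter().map(|(x, _)| (x - mx).powi(2)).sum();
    let b = num / den;
    (my - b * mx, b)
}

fn main() {
    let samples = measure_insert_cost(&[100, 1_000, 10_000, 100_000], 1_000);
    let (a, b) = fit_linear(&samples);
    println!("C_insert(n) ~= {a:.1} + {b:.5} * n  (ns)");
}
#+end_src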
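A sketch of how the profiler output and the fitted models could be combined into the $\sum_{op} C_{op}(n) \cdot \#op / \#total$ score from the plan. All type and field names here (CostPoly, Candidate, SiteProfile, and so on) are invented for illustration and are not part of primrose or CollectionSwitch; the coefficients in main are made up.

#+begin_src rust
use std::collections::HashMap;

/// Fitted cost polynomial, coefficients lowest degree first:
/// cost(n) = c0 + c1*n + c2*n^2 + ...
struct CostPoly(Vec<f64>);

impl CostPoly {
    fn eval(&self, n: f64) -> f64 {
        // Horner's rule over the coefficients.
        self.0.iter().rev().fold(0.0, |acc, c| acc * n + c)
    }
}

/// One candidate implementation and its per-operation cost models.
struct Candidate {
    name: &'static str,
    op_costs: HashMap<&'static str, CostPoly>,
}

/// Aggregated profiler output for a single allocation site.
struct SiteProfile {
    max_n: f64,
    op_counts: HashMap<&'static str, u64>,
}

/// Sum over ops of C_op(max_n) * (#op / #total).
fn estimated_cost(c: &Candidate, p: &SiteProfile) -> f64 {
    let total: u64 = p.op_counts.values().sum();
    p.op_counts
        .iter()
        .map(|(op, &count)| {
            let per_op = c.op_costs.get(op).map_or(0.0, |poly| poly.eval(p.max_n));
            per_op * (count as f64 / total as f64)
        })
        .sum()
}

/// Pick the candidate with the lowest estimated cost for this site.
fn pick_best<'a>(candidates: &'a [Candidate], p: &SiteProfile) -> &'a Candidate {
    candidates
        .iter()
        .min_by(|a, b| estimated_cost(a, p).total_cmp(&estimated_cost(b, p)))
        .expect("at least one candidate")
}

fn main() {
    // Made-up cost models: a scan-based candidate vs a hash-based one.
    let vec_like = Candidate {
        name: "Vec",
        op_costs: HashMap::from([
            ("insert", CostPoly(vec![5.0, 0.01])),
            ("contains", CostPoly(vec![5.0, 0.02])),
        ]),
    };
    let set_like = Candidate {
        name: "HashSet",
        op_costs: HashMap::from([
            ("insert", CostPoly(vec![30.0])),
            ("contains", CostPoly(vec![25.0])),
        ]),
    };
    // Made-up profile for one allocation site: mostly lookups, large max n.
    let profile = SiteProfile {
        max_n: 10_000.0,
        op_counts: HashMap::from([("insert", 1_000), ("contains", 50_000)]),
    };
    println!("best: {}", pick_best(&[vec_like, set_like], &profile).name);
}
#+end_src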
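And a very rough sketch of the optional 'wrapper type' idea: an enum that starts out backed by a Vec and migrates to a HashSet once it grows past a threshold. The concrete pair of collections and the fixed crossover point are placeholders; in the planned design the switching rule would come from the cost model rather than a hard-coded constant.

#+begin_src rust
use std::collections::HashSet;

const SWITCH_AT: usize = 256; // placeholder crossover point

enum AdaptiveSet {
    Small(Vec<u64>),
    Large(HashSet<u64>),
}

impl AdaptiveSet {
    fn new() -> Self {
        AdaptiveSet::Small(Vec::new())
    }

    fn insert(&mut self, x: u64) {
        match self {
            AdaptiveSet::Small(v) => {
                if !v.contains(&x) {
                    v.push(x);
                }
                // Migrate once the linear-scan representation gets too big.
                if v.len() > SWITCH_AT {
                    let set: HashSet<u64> = v.drain(..).collect();
                    *self = AdaptiveSet::Large(set);
                }
            }
            AdaptiveSet::Large(s) => {
                s.insert(x);
            }
        }
    }

    fn contains(&self, x: &u64) -> bool {
        match self {
            AdaptiveSet::Small(v) => v.contains(x),
            AdaptiveSet::Large(s) => s.contains(x),
        }
    }
}

fn main() {
    let mut s = AdaptiveSet::new();
    for x in 0..1_000 {
        s.insert(x); // silently switches representation partway through
    }
    assert!(s.contains(&999));
}
#+end_src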