#+TITLE: Tasks
* TODO Write background chapter
DEADLINE: <2023-10-20 Fri>
** TODO Problem Introduction
- applications use many different container types
- developers often only care about the functional requirements/semantics of these containers
- however, they are usually forced to specify a concrete implementation (examples)
** TODO Motivation
- justify performance benefit
** TODO Look into Perflint
https://ieeexplore.ieee.org/abstract/document/4907670
** TODO Brainy
- uses an AI model to predict the best implementation based on the target microarchitecture and runtime behaviour
- uses access patterns, etc.
- also assumes semantically identical set of candidates
- uses application generator for training data
- focuses on the performance difference between microarchitectures
- intended to be run at each install site
** TODO Redraft Chameleon
** TODO CollectionSwitch
- online selection - uses library so easier to integrate
- collects access patterns, size patterns, etc.
- performance model is built beforehand for each concrete implementation, with a cost model used to estimate the relative performance of each based on observed usage
- switches underlying implementation dynamically
- can also decide size thresholds at which the implementation should change, and perform the switch itself
- doesn't require specific knowledge of the implementations, although does still assume all are semantically equivalent
** TODO Primrose
- primrose allows specifying syntactic and semantic properties, and gives concrete implementations satisfying these properties
- however, this only deals with the functional requirements of the program, not its non-functional requirements
- it is still up to the developer to choose which of the candidates performs best, or to brute-force the choice
** TODO other papers
[20] MITCHELL, J. C. Representation independence and data abstraction. In POPL ’86: Proceedings of the 13th ACM SIGACT-SIGPLAN symposium on Principles of programming languages (New York, NY, USA, 1986), ACM, pp. 263–276.
* Planned design
Of the surveyed approaches, the design used by CollectionSwitch requires the least intervention per new collection type.
We need some way to integrate with primrose to get the candidate collections.
Ideally this would just be using the rust crate, or having a JSON interface to a CLI.
For each collection and for each 'critical operation', we generate a performance estimate by performing the operation repeatedly at various sizes $n$ and fitting a polynomial to the measured timings.
This gives us an estimate of the cost of each operation when the collection is a given size - $C_{op}(n)$.
This step should only need to be run once per computer, or the results could even be shared by default and re-run locally for better accuracy.
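The fitting step above can be sketched as follows. This is an illustrative implementation, not an existing crate API: =fit_quadratic= recovers the coefficients of $C_{op}(n) = a_0 + a_1 n + a_2 n^2$ from (size, cost) samples via the least-squares normal equations; in practice the samples would come from timing the operation at each size.

#+BEGIN_SRC rust
```rust
// Sketch: least-squares fit of a degree-2 cost model C_op(n) = a0 + a1*n + a2*n^2
// to (n, cost) samples. Names are illustrative; real samples would come from
// benchmarking one collection operation at several sizes.

/// Solve the 3x3 normal equations for a quadratic least-squares fit.
fn fit_quadratic(ns: &[f64], costs: &[f64]) -> [f64; 3] {
    // Normal equations: A[i][j] = sum(n^(i+j)), b[i] = sum(cost * n^i).
    let mut a = [[0.0f64; 3]; 3];
    let mut b = [0.0f64; 3];
    for (&n, &c) in ns.iter().zip(costs) {
        let pows = [1.0, n, n * n, n * n * n, n * n * n * n];
        for i in 0..3 {
            for j in 0..3 {
                a[i][j] += pows[i + j];
            }
            b[i] += c * pows[i];
        }
    }
    // Gaussian elimination with partial pivoting on the 3x3 system.
    for col in 0..3 {
        let pivot = (col..3)
            .max_by(|&x, &y| a[x][col].abs().partial_cmp(&a[y][col].abs()).unwrap())
            .unwrap();
        a.swap(col, pivot);
        b.swap(col, pivot);
        for row in col + 1..3 {
            let f = a[row][col] / a[col][col];
            for k in col..3 {
                a[row][k] -= f * a[col][k];
            }
            b[row] -= f * b[col];
        }
    }
    // Back-substitution.
    let mut coeffs = [0.0f64; 3];
    for row in (0..3).rev() {
        let mut s = b[row];
        for k in row + 1..3 {
            s -= a[row][k] * coeffs[k];
        }
        coeffs[row] = s / a[row][row];
    }
    coeffs
}

fn main() {
    // Synthetic samples from a known curve 2 + 0.5n + 0.01n^2,
    // standing in for measured per-operation timings.
    let ns: Vec<f64> = (1..=10).map(|i| (i * 10) as f64).collect();
    let costs: Vec<f64> = ns.iter().map(|n| 2.0 + 0.5 * n + 0.01 * n * n).collect();
    let [a0, a1, a2] = fit_quadratic(&ns, &costs);
    println!("C_op(n) ~= {a0:.3} + {a1:.3}*n + {a2:.4}*n^2");
}
```
#+END_SRC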
Then, we need a 'semantic profiler'. For our cases, this should collect:
- Max size (in terms of memory used?)
- Max size (in terms of items)
- # of each operation
for each individually allocated collection.
This should then be aggregated by 'allocation site' (identified by the last few frames of the call stack).
This does require the user to write their own benchmarks - we could maybe hook into criterion for data collection, as it is already popular.
This profiler doesn't need to be /super/ lightweight, just enough to not make things painful to run.
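A minimal sketch of what that profiler's data model might look like, assuming per-site aggregation keyed by a call-stack hash (all type and field names here are hypothetical):

#+BEGIN_SRC rust
```rust
use std::collections::HashMap;

// Hypothetical shape of the semantic profiler's per-site record;
// this is a sketch, not an existing API.

#[derive(Default, Debug)]
struct SiteStats {
    max_items: usize,                      // largest observed length
    op_counts: HashMap<&'static str, u64>, // e.g. "insert" -> 42
}

#[derive(Default)]
struct Profiler {
    // Keyed by an allocation-site identifier, e.g. a hash of the
    // last few call-stack frames at construction time.
    sites: HashMap<u64, SiteStats>,
}

impl Profiler {
    fn record(&mut self, site: u64, op: &'static str, current_len: usize) {
        let stats = self.sites.entry(site).or_default();
        *stats.op_counts.entry(op).or_insert(0) += 1;
        stats.max_items = stats.max_items.max(current_len);
    }
}

fn main() {
    let mut p = Profiler::default();
    // Simulate a collection at one allocation site growing to 3 items.
    for len in 1..=3 {
        p.record(0xABC, "insert", len);
    }
    p.record(0xABC, "contains", 3);
    println!("{:?}", p.sites[&0xABC]);
}
```
#+END_SRC

Max memory use could be tracked the same way via a size-of estimate per element; it is omitted here for brevity.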
Then we can approximate a cost for each candidate as $\sum_{op} C_{op}(n) \cdot \frac{\#op}{\#total}$.
We could extend this to suggest different approaches if there is a spread of max n.
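Combining the fitted cost polynomials with an observed profile reduces to a weighted sum per candidate. A sketch with invented numbers (candidate names, coefficients, and counts are all illustrative):

#+BEGIN_SRC rust
```rust
use std::collections::HashMap;

// Sketch: estimate each candidate's cost as
// sum over ops of C_op(n) * (#op / #total), then compare candidates.
// All coefficients and counts below are invented for illustration.

/// Evaluate C_op(n) = a0 + a1*n + a2*n^2 from fitted coefficients.
fn cost_at(coeffs: [f64; 3], n: f64) -> f64 {
    coeffs[0] + coeffs[1] * n + coeffs[2] * n * n
}

/// Weighted cost of one candidate given the observed profile.
fn estimate(
    op_costs: &HashMap<&str, [f64; 3]>,
    op_counts: &HashMap<&str, u64>,
    max_n: f64,
) -> f64 {
    let total: u64 = op_counts.values().sum();
    op_counts
        .iter()
        .map(|(op, &count)| {
            let c = op_costs.get(op).copied().unwrap_or([0.0; 3]);
            cost_at(c, max_n) * count as f64 / total as f64
        })
        .sum()
}

fn main() {
    // A "vec-like" candidate with cheap insert but linear contains,
    // vs a "set-like" one with costlier constant-time operations.
    let vec_like = HashMap::from([("insert", [1.0, 0.0, 0.0]), ("contains", [0.0, 0.5, 0.0])]);
    let set_like = HashMap::from([("insert", [5.0, 0.0, 0.0]), ("contains", [8.0, 0.0, 0.0])]);
    // Observed profile: contains dominates, and the collection reaches n = 1000.
    let profile = HashMap::from([("insert", 100u64), ("contains", 900u64)]);

    let v = estimate(&vec_like, &profile, 1000.0);
    let s = estimate(&set_like, &profile, 1000.0);
    println!("vec-like: {v:.1}, set-like: {s:.1}");
}
```
#+END_SRC

With lookups dominating at large $n$, the set-like candidate wins; at small $n$ or an insert-heavy profile, the ranking flips, which is exactly the spread the extension above would detect.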
If time allows, we could attempt to create a 'wrapper type' that switches between collections as n changes, using rules decided by something similar to the above algorithm.
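The wrapper-type idea could look something like the following sketch: an enum that starts as a =Vec= and migrates to a =HashSet= once it crosses a threshold. The candidate pair and the hard-coded threshold are illustrative (in practice the threshold would come from the cost model), and both variants are treated with set semantics here, per the semantic-equivalence assumption above.

#+BEGIN_SRC rust
```rust
use std::collections::HashSet;
use std::hash::Hash;

// Illustrative threshold; a real implementation would derive this
// from the fitted cost model rather than hard-coding it.
const THRESHOLD: usize = 128;

enum Adaptive<T: Eq + Hash> {
    Small(Vec<T>),
    Large(HashSet<T>),
}

impl<T: Eq + Hash> Adaptive<T> {
    fn new() -> Self {
        Adaptive::Small(Vec::new())
    }

    fn insert(&mut self, item: T) {
        match self {
            Adaptive::Small(v) => {
                v.push(item);
                if v.len() > THRESHOLD {
                    // Migrate: drain the Vec into a HashSet.
                    let set: HashSet<T> = v.drain(..).collect();
                    *self = Adaptive::Large(set);
                }
            }
            Adaptive::Large(s) => {
                s.insert(item);
            }
        }
    }

    fn contains(&self, item: &T) -> bool {
        match self {
            Adaptive::Small(v) => v.contains(item),
            Adaptive::Large(s) => s.contains(item),
        }
    }

    fn len(&self) -> usize {
        match self {
            Adaptive::Small(v) => v.len(),
            Adaptive::Large(s) => s.len(),
        }
    }
}

fn main() {
    let mut c = Adaptive::new();
    for i in 0..200 {
        c.insert(i); // crosses THRESHOLD partway through and migrates
    }
    println!("len = {}, contains(150) = {}", c.len(), c.contains(&150));
}
```
#+END_SRC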