#+TITLE: Tasks

* TODO Write background chapter
DEADLINE: <2023-10-20 Fri>
** TODO Problem Introduction
- applications use many different container types
- developers often only care about the functional requirements/semantics of these containers
- however, they are usually forced to specify a concrete implementation (examples)
** TODO Motivation
- justify performance benefit
** TODO Primrose
- primrose allows specifying syntactic and semantic properties, and gives concrete implementations satisfying those properties
- however, this only deals with the functional requirements of the program, not the non-functional requirements
- it is still up to the developer to choose which of these performs best, or to brute-force it
** TODO Other papers
- https://ieeexplore.ieee.org/abstract/document/4907670
- Mitchell, J. C. Representation independence and data abstraction. In POPL '86: Proceedings of the 13th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (New York, NY, USA, 1986), ACM, pp. 263-276.
** TODO Redraft CollectionSwitch
** TODO Redraft Chameleon
** TODO Redraft Brainy

* Workshop background
- gives context for the project
- motivates the project and explains its importance/contributions, as well as feasibility
1. what has been done previously
2. where does the project fit
3. what does the reader need to know
- 2 & 3 guide 1
- audience is someone with a background in informatics, but not necessarily in the same problem space

* Planned design
- Based on the design used by CollectionSwitch
  - least intervention required per implementation
- Integrate with primrose to get the candidate collections
  - ideally this would just be using the rust crate, or having a JSON interface to a CLI
- For each collection and each 'critical operation', generate a cost estimate $C_{op}(n)$ for when the collection has a given size $n$
  - perform the operation repeatedly at various $n$, and fit a polynomial to the results
  - requires some trait constraints, and some annotation of traits to mark which operations are 'critical'
  - this step should only need to be run once per computer
    - results could be shared by default, and re-run for better accuracy
- Semantic profiler
  - for each allocated collection, record:
    - max size (in terms of items)
    - number of calls to each operation
  - this should be aggregated by 'allocation site' (identified by the last few frames of the call stack)
    - not sure how to do this; maybe look at how the tracing crate does it
  - requires the user to write their own benchmarks
    - criterion is popular for this, and might have hooks?
  - doesn't need to be /super/ lightweight, just enough to not make things painful to run
- Approximate a cost for each candidate as $\sum_{op} C_{op}(n) \cdot \frac{\#op}{\#total}$
- We could extend this to suggest different approaches if there is a spread of max $n$
- If time allows, could attempt to create a 'wrapper type' that switches between collections as $n$ changes, using rules decided by something similar to the above algorithm
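The cost-approximation step above can be sketched in Rust. This is a minimal illustration, not part of primrose or any existing crate: the names =CostModel=, =OpProfile=, and =estimate_cost= are hypothetical, and the polynomial coefficients stand in for values that would come from the offline fitting step.

#+BEGIN_SRC rust
/// Fitted polynomial cost model for one operation, with coefficients in
/// ascending order of degree: cost(n) = c0 + c1*n + c2*n^2 + ...
/// (These would be produced by the once-per-computer fitting step.)
struct CostModel {
    coeffs: Vec<f64>,
}

impl CostModel {
    /// Evaluate the fitted polynomial at size n (Horner's method).
    fn cost(&self, n: f64) -> f64 {
        self.coeffs.iter().rev().fold(0.0, |acc, &c| acc * n + c)
    }
}

/// Profiler output for one operation at a single allocation site.
struct OpProfile {
    name: &'static str,
    count: u64,
}

/// Approximate a candidate's cost as sum over ops of
/// C_op(max_n) * (#op / #total), as in the formula above.
fn estimate_cost(models: &[(&str, CostModel)], profile: &[OpProfile], max_n: f64) -> f64 {
    let total: u64 = profile.iter().map(|p| p.count).sum();
    profile
        .iter()
        .map(|p| {
            let model = models
                .iter()
                .find(|(name, _)| *name == p.name)
                .map(|(_, m)| m)
                .expect("no cost model for operation");
            model.cost(max_n) * (p.count as f64 / total as f64)
        })
        .sum()
}

fn main() {
    // Toy models: 'insert' is roughly constant-time, 'contains' roughly linear.
    let models = vec![
        ("insert", CostModel { coeffs: vec![10.0] }),
        ("contains", CostModel { coeffs: vec![0.0, 1.0] }),
    ];
    // Profile for one allocation site: 75% inserts, 25% contains, max size 100.
    let profile = vec![
        OpProfile { name: "insert", count: 75 },
        OpProfile { name: "contains", count: 25 },
    ];
    // 10.0 * 0.75 + 100.0 * 0.25 = 32.5
    println!("estimated cost: {}", estimate_cost(&models, &profile, 100.0));
}
#+END_SRC

Ranking candidates then reduces to evaluating =estimate_cost= once per candidate with that candidate's fitted models and the shared profile, and picking the minimum.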