Working Sets: Past and Present -- Peter J Denning ------------------------------------------------- Saltzer's Problem: Ultimate objective of a multiprogrammed memory manager is to be an adaptive control that would allocate memory and schedule the CPU to maximize performance. In 1967 the coupling between memory and CPU scheduling was not considered essential and Denning's working-set based memory manager was considered "risky". Since programs during that time had little locality, studies seemed to show that there was no clear winner among page replacement algorithms. A program's working set is the collection of the recently references pages of a program's virtual address space. It provides an intrinsic measurement of the program's memory demand. By 1976, it was shown through research on program behaviour of various memory policies and queuing theory, that working set principle is a cost-effective basis for near-optimal management of multiprogrammed memory. Reference string: sequence of T references, r(1)...r(T), where r(i) is the segment containing the ith virtual address generated by the program. Resident set: Set of all the pages of a program present in main memory Cold start: empty initial resident set; warm start: initially non-empty resident set due to preloading of some segments Memory policy's control parameter theta is used to trade paging load against resident set size. Larger theta means larger resident set with longer mean interfault times. Multiple control parameters do not seem to have any advantage over a single parameter. Swapping curve: A function f relating the rate of segment faults to the size of the resident set. fixed-space memory policy interprets theta as the size of resident set. variable-space memory policy interprets theta as a bound on the residence times of segments. Lifetime curve: g(x) = 1/f(x) where x=mean resident set size. This gives the mean number of references between segment faults (i.e, paging rate). knee and primary knee. Resident set size at virtual time t is denoted by R(t, theta) Inclusion property: R(t, theta) is subset of R(t, theta+a) for a>0 Empirical models of g(x): Belady model, Chamberlain model; both of them do not correspond exactly to real data. A lifetime curve model should have knees which are close to the real data, since optimal performance is associated with knees.