Application-Controlled Physical Memory using External Page-Cache Management
----------------------------------------------------------------------------
Harty, K. and Cheriton, D., 1992

Short Summary:
The end-to-end argument applied to VM systems. However, the authors are careful to point out that this is not the approach to be used for all systems; it is meant only for the power-hungry apps. Memory and processors are rapidly improving, and the paper distinguishes the apps which are resource intensive (like simulations and databases) from those which are not. The basic premise is that a general (app-transparent) virtual memory (VM) system will not be sufficient for the resource-intensive apps. So, the authors propose a VM system that gives the application control over its physical memory: the amount of memory, its contents, and the scheduling of pages.

---

The predicted demise of secondary storage/networked storage due to the increase in primary memory is unwarranted; larger memories only separate the modest apps from the power-hungry ones. They also make page faults very expensive.

Three major problems with current VM systems:
1. The application cannot monitor and control the size of its physical memory, and cannot control the specific physical pages allotted to it. e.g. MP3D, parallel DB query, garbage collector.
2. The program cannot efficiently control the contents of its memory. e.g. DASH, databases.
3. The program cannot easily control the read-ahead, writeback and discarding of pages within its physical memory. e.g. prefetching in scientific computations to minimize I/O.

Page pinning, external pagers, Unix's madvise, and mincore try to address the problem, but they cannot solve it completely, and they increase kernel complexity. The proposed solution is effective and does not increase kernel complexity.

External page-cache management: the VM system provides one or more page caches to the application, which manages them completely on its own to get the best performance. A process-level pager is available for apps which do not want to perform their own management.

Implemented in the V++ kernel. The kernel provides segments to processes, which manage them themselves. The mapping of virtual segments/pages to physical pages is similar to that of conventional systems, but a few extra operations are added to the kernel: SetSegmentManager, MigratePages, ModifyPages, and GetPageAttributes.

User-level page fault handling: when a page fault occurs, a trap is sent to the kernel, which forwards it to the segment manager for that segment. The manager might request the data in the segment from the file server (swap is stored on a network file server), or use the data directly if it already has it locally. The manager then requests the kernel to copy the pages into the app's segment and indicates to the app that it can resume. Simple operations such as maintaining the TLBs are taken care of by the kernel. Files are also implemented using segments. To avoid infinite recursion in the page-fault-handling module, the fault-handling stack is always kept in memory.
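How a fault is served under this scheme, roughly: the application faults, the kernel traps and forwards the fault to the segment's registered manager, the manager fills one of the physical frames it owns (from its local cache or by fetching from the file server), and then asks the kernel to map that frame into the faulting segment and resume the application. Below is a small user-space sketch of that flow; every name in it (struct frame, handle_fault, kernel_map_and_resume, fetch_from_file_server, the replacement policy) is a made-up stand-in for illustration, not the actual V++ interface.

    /* Sketch of the external page-cache fault path, simulated in user space.
     * All names are hypothetical stand-ins for the interface described in
     * the paper, not its actual API. */
    #include <stdio.h>
    #include <string.h>

    #define PAGE_SIZE 4096

    /* One physical frame of the page cache the application was granted. */
    struct frame {
        char data[PAGE_SIZE];
        int  in_use;
    };

    /* Stand-in for fetching a page's contents from the network file server
     * that holds the backing store; it just fabricates recognizable data. */
    static void fetch_from_file_server(unsigned long page_no, char *buf)
    {
        snprintf(buf, PAGE_SIZE, "contents of page %lu", page_no);
    }

    /* Stand-in for the kernel call that maps a filled frame into the
     * faulting segment (MigratePages-like) and lets the application resume. */
    static void kernel_map_and_resume(unsigned long page_no, struct frame *f)
    {
        (void)f;
        printf("kernel: mapped page %lu, application resumes\n", page_no);
    }

    /* The app-specific segment manager's fault handler: the kernel forwards
     * the fault here, the manager picks a frame using its own policy, fills
     * it, and asks the kernel to map it. */
    static void handle_fault(unsigned long page_no, struct frame *cache, int nframes)
    {
        int victim = (int)(page_no % (unsigned long)nframes); /* trivial app-chosen policy */
        struct frame *f = &cache[victim];

        if (f->in_use)
            printf("manager: evicting frame %d\n", victim);

        fetch_from_file_server(page_no, f->data);
        f->in_use = 1;
        kernel_map_and_resume(page_no, f);
    }

    int main(void)
    {
        struct frame cache[4];            /* frames granted by the page-cache manager */
        memset(cache, 0, sizeof cache);

        /* Faults the kernel would forward to this segment manager. */
        unsigned long faults[] = { 7, 3, 7, 11 };
        for (size_t i = 0; i < sizeof faults / sizeof faults[0]; i++) {
            printf("kernel: fault on page %lu -> forwarded to segment manager\n", faults[i]);
            handle_fault(faults[i], cache, 4);
        }
        return 0;
    }

The point of the sketch is that the replacement policy and the choice of where the data comes from live entirely in the application-level manager; the kernel only performs the mapping and the simple bookkeeping (e.g. TLBs).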
Each large-scale application has its own app-specific segment manager. It has to handle page faults, reclaim pages from segments, and interact with the system page cache manager (SPCM) to allocate and return pages. The SPCM can allocate pages based on their physical addresses (to allow app-level page coloring), etc.

A "memory market" approach is used to allocate memory between the different app-specific managers. This lets the managers know when they are about to lose their memory pages, and also lets them make trade-offs such as computing for less time with more memory versus computing for more time with less memory. The pages of the app-specific manager itself can be managed by a default manager, or by the app-specific manager itself; the latter approach is preferable, with the app-specific manager pinning its own pages in memory. The default manager itself does not page fault.

Performance analysis: results show that for normal applications, the performance of the V++ VM and the Ultrix VM is comparable. An example of a database transaction system is also given, where app-specific VM management provides a performance benefit.

Concept of "efficient completeness": when the OS kernel provides an abstraction of hardware, it should provide efficient and complete access to the functionality and performance of that hardware.

Questions:
----------
Is this a strong enough argument, considering that the number of power-hungry apps (such as databases) is so small, and that they anyway bypass the OS and completely manage the resources they need themselves?

Doubts:
-------
What is page-coloring?
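On the page-coloring doubt, the usual meaning (general background, not specific to this paper): in a physically indexed cache, the cache sets a page maps to are determined by bits of its physical address; pages whose physical addresses map to the same set range have the same "color" and compete for the same cache lines. Since the SPCM can hand out pages by physical address, an application can pick pages of different colors for data it accesses together. A tiny illustration with made-up cache parameters:

    /* Rough illustration of page coloring; the constants are invented for
     * the example and are not taken from the paper. */
    #include <stdio.h>

    #define PAGE_SIZE   4096u                       /* bytes per page          */
    #define CACHE_SIZE  (256u * 1024u)              /* direct-mapped cache     */
    #define NUM_COLORS  (CACHE_SIZE / PAGE_SIZE)    /* distinct page colors    */

    /* The "color" of a physical page: which group of cache sets it maps to. */
    static unsigned page_color(unsigned long phys_addr)
    {
        return (unsigned)((phys_addr / PAGE_SIZE) % NUM_COLORS);
    }

    int main(void)
    {
        /* Two hypothetical physical pages handed out by the page-cache manager. */
        unsigned long page_a = 0x00840000UL;
        unsigned long page_b = 0x00880000UL;

        printf("page A color: %u\n", page_color(page_a));
        printf("page B color: %u\n", page_color(page_b));

        /* Same color means the two pages conflict in the cache; an app-level
         * allocator would prefer pages of different colors for data that is
         * accessed together. */
        printf("conflict: %s\n",
               page_color(page_a) == page_color(page_b) ? "yes" : "no");
        return 0;
    }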