TITLE: The Rio File Cache: Surviving Operating System Crashes
AUTHORS: Peter M. Chen, Wee Teck Ng, Subhachandra Chandra, Christopher Aycock, Gurushankar Rajamani, David Lowell
ABSTRACT: One of the fundamental limits to high-performance, high-reliability file systems is memory's vulnerability to system crashes. Because memory is viewed as unsafe, systems periodically write data back to disk. The extra disk traffic lowers performance, and the delay period before data is safe lowers reliability. The goal of the Rio (RAM I/O) file cache is to make ordinary main memory safe for persistent storage by enabling memory to survive operating system crashes. Reliable memory enables a system to achieve the best of both worlds: reliability equivalent to a write-through file cache, where every write is instantly safe, and performance equivalent to a pure write-back cache, with no reliability-induced writes to disk. To achieve reliability, we protect memory during a crash and restore it during a reboot (a "warm" reboot). Extensive crash tests show that even without protection, warm reboot enables memory to achieve reliability close to that of a write-through file system. Adding protection makes memory even safer than a write-through file system while adding essentially no overhead. By eliminating reliability-induced disk writes, Rio performs 4-22 times as fast as a write-through file system, 2-14 times as fast as a standard Unix file system, and 1-3 times as fast as an optimized system that risks losing 30 seconds of data and metadata.
PROBLEM: Because memory is treated as unreliable, the file system must write frequently to reliable secondary storage, which limits its performance.
GOAL: Increase file system performance without giving up reliability.
BOTTLENECK: Use of unreliable memory.
TRADEOFF:
ABSTRACTION:
TECHNIQUE: Remove the "bottleneck" resource by making memory reliable across crashes and reboots: protect memory during a crash and restore it during a warm reboot.
ALTERNATIVES:
NOTES:
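A rough way to see the trade-off described in the abstract is to count disk writes under three caching policies. The sketch below is a toy model and is not taken from the paper: the function name `count_disk_writes`, the policy labels, and the sample trace are all hypothetical, and the "reliable_memory" case simply assumes the cache is large enough that nothing is ever written to disk for reliability reasons.

```python
def count_disk_writes(writes, policy, flush_interval=30.0):
    """Count disk writes for a trace of (time_seconds, block_id) pairs.

    `writes` must be sorted by time. This is an illustrative model only:
    it ignores cache capacity, metadata, and read traffic.
    """
    if policy == "write_through":
        # Every write goes to disk immediately, so it is instantly safe.
        return len(writes)

    if policy == "delayed_write_back":
        # Dirty blocks are flushed once per interval; repeated writes to the
        # same block inside one interval cost a single disk write.
        disk_writes = 0
        dirty = set()
        next_flush = flush_interval
        for t, block in writes:
            while t >= next_flush:
                disk_writes += len(dirty)
                dirty.clear()
                next_flush += flush_interval
            dirty.add(block)
        # Count one final flush of whatever is still dirty at the end.
        return disk_writes + len(dirty)

    if policy == "reliable_memory":
        # Assumption: memory survives crashes, so no reliability-induced
        # disk writes are needed for this trace.
        return 0

    raise ValueError(f"unknown policy: {policy}")


# Hypothetical trace: block 1 is written three times, blocks 2 and 3 once.
trace = [(0, 1), (1, 1), (2, 2), (31, 1), (65, 3)]
for policy in ("write_through", "delayed_write_back", "reliable_memory"):
    print(policy, count_disk_writes(trace, policy))
```

In this model the delayed policy saves disk writes only by leaving up to `flush_interval` seconds of data at risk, which is the window that reliable memory is meant to close.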