Recovery Techniques for Database Systems
========================================

Joost S M Verhofstad, 1978

A survey paper describing recovery techniques for databases.

Motivation
----------

Failures are part of life. How do we design database systems to withstand
failures?

Overview
--------

- Not all failures are recoverable: drastic failures are "catastrophes",
  some are too rare to take into account, some are "surprises", and some
  are too expensive to handle.
- The paper approaches the problem from a "data structures" point of view,
  distinguishing between "normal/current" data and "recovery" data. It
  shows how such data can be designed, maintained and used for recovery.

Seven techniques
----------------

- Salvation! [The holy grail of our very existence!]
    - aka fsck.
    - requires no recovery data.
    - runs after a system crash and does best-effort scavenging to restore
      the system to a valid state.
- Incremental Dumping
    - only new/updated files are dumped.
    - done after a certain # of jobs or at regular intervals.
    - changes made since the last dump are lost.
- Audit Trail
    - similar to the "Write Ahead Logging" protocol, popular in
      contemporary database systems: provides recovery at the
      transactional level (redo/undo). The paper briefly mentions some
      issues in this context: cascading aborts, locking and checkpoints.
      A comprehensive summary of transactional recovery schemes can be
      found in a paper by Haerder and Reuter.
    - fuzzy checkpointing takes care of the problem mentioned on pg 179.
- Differential Files
    - Maintain a master file and a differential file (the "diff"-file).
      To retrieve any data, look at the diff file first, then fall back
      to the master. See the paper for a trick to speed up lookups.
    - Problems: slower access to elements (extra diff-file lookup),
      merging diffs into the master can be problematic for systems that
      are always busy, multiple updates to the same element require
      special handling.
- Backup/current Versions
- Multiple Copies
    - Two flavors:
        - Majority voter circuits with odd # of participants.
        - Two copies with "update-in-progress" flag.
    - [isn't voting used primarily in hardware these days? Software modules
      don't use voting.]
- Careful Replacement
    - A technique specialized to updating data structures "in place".
    - Shadow paging.
    - Maintaining structures such that the so-called "root-segment rule"
      is never violated. See pg 188 for details.

Questions
---------

- What recovery techniques are actually used in today's systems?
- Has any new technique emerged in the last twenty years?
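
To make the Differential Files technique from the list of seven more concrete, here is a minimal sketch. Everything in it is an assumption for illustration, not the paper's design: the class name `DifferentialStore`, the use of in-memory dicts in place of files, the tombstone used for deletions, and the choice to let a later update to the same element simply overwrite its earlier diff entry.

```python
# Hypothetical in-memory model of a master file plus a differential file.
# Names and structure are illustrative assumptions, not taken from the paper.

_DELETED = object()  # tombstone: key was removed since the last merge


class DifferentialStore:
    def __init__(self, master):
        self.master = dict(master)  # stable data, untouched between merges
        self.diff = {}              # every change made since the last merge

    def get(self, key, default=None):
        # Look in the diff first; only on a miss consult the master.
        if key in self.diff:
            value = self.diff[key]
            return default if value is _DELETED else value
        return self.master.get(key, default)

    def put(self, key, value):
        # Assumption: a repeated update overwrites the earlier diff entry.
        self.diff[key] = value

    def delete(self, key):
        # Record the deletion in the diff; the master keeps the old value.
        self.diff[key] = _DELETED

    def discard_changes(self):
        # Recovery: dropping the diff returns to the state at the last merge.
        self.diff.clear()

    def merge(self):
        # Fold the diff into the master, then start a fresh diff.
        for key, value in self.diff.items():
            if value is _DELETED:
                self.master.pop(key, None)
            else:
                self.master[key] = value
        self.diff.clear()


store = DifferentialStore({"a": 1, "b": 2})
store.put("a", 10)   # master still holds a=1
store.delete("b")    # master still holds b=2
store.merge()        # master now reflects both changes
```

The sketch shows both costs noted in the list: every read pays for an extra diff lookup, and `merge` has to touch the master, which is the step that is hard to schedule on a system that is always busy.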