* Replication in the Harp Filesystem
* Authors: Barbara Liskov, Sanjay Ghemawat, Robert Gruber, Paul Johnson,
*   Liuba Shrira, Michael Williams
* Field: OS/File Systems

Server replication with a trio: primary, backup, and witness. Modifications
are relayed to and acknowledged by the backup before the primary returns
success. Any network partition or crash that separates only one machine is
recoverable. If the witness crashes, there is no effect. If the backup
crashes, the primary promotes the witness to record a log of changes. If the
primary crashes, the backup becomes the primary and the witness is again
promoted. A change in the group structure is called a view change, and the
paper details the intricate steps needed to perform one correctly and
efficiently.

Harp uses a UPS and communication to keep disk access off the critical path:
neither the primary nor the backup writes an operation to disk before
replying. Instead they rely on crashes affecting only one machine at a time
and on the UPS to ride out power outages. Committal to disk happens in the
background.

Harp tries to guard against the possibility of a "killer packet" causing any
node that receives it to fail by 1) modifying Unix so that a portion of
volatile memory survives soft crashes (rio vista?) and 2) delaying log
application at the backup until after it has been applied at the primary.
This way, if applying a particular log record crashes the primary, the view
will change and the witness will be called upon to record a log of committed
operations. Then, if the backup crashes the same way, at least the witness
still has a log. (Rough sketches of the commit path, the role changes, and
the delayed application appear at the end of these notes.)

Discussion: Brewer has spoken of "CAP" -- Consistency, Availability, network
Partitioning: pick two. Harp harps on consistency. It also tries to hide the
performance impact by keeping things in volatile storage, and it tries to
recover from network partitioning. Does it succeed? Is the complexity of
Harp worth it? The authors also suggest modifying NFS and using multicast in
their relentless pursuit of simultaneous, transparent consistency and
availability in the face of failures.
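
Sketch 1: to make the normal-case flow concrete, here is a rough Python
sketch of the commit path as I understand it (class and function names are
mine, not the paper's code): the primary replies to the client only once the
record sits in both volatile logs, and disk writes happen in a background
flush.

import threading

def apply_to_fs(op):
    # placeholder for actually performing the file system operation locally
    return ("ok", op)

class VolatileLog:
    # In-memory log; assumed to survive power failures thanks to the UPS and
    # to reach disk only via the background flush below.
    def __init__(self):
        self.records = []
        self.flushed = 0                  # records[:flushed] are on disk
        self.lock = threading.Lock()

    def append(self, op):
        with self.lock:
            self.records.append(op)
            return len(self.records) - 1

    def flush_batch(self, write_to_disk):
        # background committal: push any unflushed records to stable storage
        with self.lock:
            batch = self.records[self.flushed:]
            self.flushed = len(self.records)
        for rec in batch:
            write_to_disk(rec)

class Backup:
    def __init__(self):
        self.log = VolatileLog()

    def replicate(self, op):
        self.log.append(op)               # the ack means "in my volatile
        return "ack"                      # log", not "on my disk"

class Primary:
    def __init__(self, backup):
        self.log = VolatileLog()
        self.backup = backup

    def handle_client_op(self, op):
        self.log.append(op)                           # 1. own volatile log
        assert self.backup.replicate(op) == "ack"     # 2. backup has it too
        return apply_to_fs(op)                        # 3. apply and reply;
                                                      #    disk write deferred

For example, Primary(Backup()).handle_client_op(("write", "/f", b"data"))
returns before anything touches disk; a background thread would periodically
call flush_batch on each log.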
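
Sketch 2: the role changes on a single failure can be summarized as below.
This only shows who plays which role in the next view (the View fields and
next_view logic are my own invention); the actual view-change protocol is
the intricate part of the paper.

from dataclasses import dataclass
from typing import Optional

@dataclass
class View:
    number: int
    primary: str
    backup: Optional[str]     # None when the witness is standing in for it
    witness: str
    witness_promoted: bool    # a promoted witness records a log of operations

def next_view(view: View, failed: str) -> View:
    # A failure of any single machine is survivable.
    if failed == view.witness:
        return view           # no effect on normal-case processing
    if failed == view.backup:
        # primary keeps serving; the witness is promoted to log changes
        return View(view.number + 1, view.primary, None, view.witness, True)
    if failed == view.primary:
        # backup takes over as primary; the witness is again promoted
        return View(view.number + 1, view.backup, None, view.witness, True)
    return view

For example, next_view(View(0, "A", "B", "C", False), failed="A") yields a
view with B as primary and C promoted to record a log.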
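
Sketch 3: point 2 of the killer-packet defense (delayed application at the
backup) might look roughly like this, assuming the primary piggybacks how
far it has applied the log on each message (an assumption for this sketch,
not the paper's wire format); apply_to_fs is the placeholder from sketch 1.

class GuardedBackup:
    # Acks as soon as the record is in its log, but applies a record only
    # after the primary reports having applied it and survived.
    def __init__(self):
        self.log = []             # committed but possibly unapplied records
        self.applied = 0          # number of records applied locally

    def replicate(self, op, primary_applied_point):
        self.log.append(op)
        self.catch_up(primary_applied_point)
        return "ack"

    def catch_up(self, primary_applied_point):
        # If applying record i crashed the primary, the primary never sent
        # an applied point past i, so this backup never applies record i
        # either; the view changes instead and the witness keeps the log.
        while self.applied < min(primary_applied_point, len(self.log)):
            apply_to_fs(self.log[self.applied])
            self.applied += 1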