* Disconnected Operation in the Coda File System
* Authors: James J. Kistler and M. Satyanarayanan
* Field: OS/File Systems

Coda allows clients to operate while disconnected from the primary file servers. Some design goals were:
1. No special hardware
2. Transparency -- seamless integration with the Unix environment
3. Scalability
4. Balance availability and consistency

Optimistic vs. pessimistic replica control: These are file-system analogues of optimistic/pessimistic concurrency control. In the pessimistic case a client can acquire exclusive or shared access -- exclusive access gives read/write permission to the owner but no access to anyone else (similar to a write lock), while shared access allows reading by many clients but writing by none (similar to a read lock). Problem: what happens on an involuntary disconnect? The system can leave the disconnected client with no access (it cannot access anything), shared access (nobody else can update, even if the client never uses the file), or exclusive access (nobody else can even read the file). A refinement is to use leases to bound the duration of this exclusive access (a minimal sketch appears below). Problem: the client loses access after the lease expires, and updates made after expiry are lost (debatable?).

Optimistic replica control: There is potential for conflicting updates, which requires machinery to detect conflicts and leave behind enough information to allow manual repair. On the other hand, it gives high availability.

Implementation: A user-level process called Venus supports disconnected operation. A "MiniCache" within the kernel satisfies requests when possible, saving two kernel context switches in many cases, which is crucial for performance. The implementation uses the vnode interface.

Venus has three logical states: Hoarding, Emulation, and Reintegration. Venus uses Hoarding while connected, Emulation while disconnected, and Reintegration temporarily upon reconnection.

Hoarding: The system tries to keep the cache ready for disconnected operation, which may occur unexpectedly. User-specified file/directory priorities are mixed with a metric measuring recent usage. Occasionally (~every 10 minutes) Venus performs a hoard walk, which essentially performs eviction and fetching (sketched below). [Callback breaks from the servers invalidate cached objects in this state.]

Emulation: During disconnected operation all update operations are logged so they can be replayed on the servers upon reconnection. Several tricks are used to minimize log size, such as discarding the previous store record for a file when a new one is written (sketched below). Metadata is stored using a transactional system called Recoverable Virtual Memory (RVM) to provide persistence across crashes.

Reintegration: 1) Parse the log and lock all objects mentioned in it; 2) check and replay each log entry, creating stubs for the stores; 3) fetch the data for the stubs; 4) commit and release the locks (sketched below). Reintegration fails if there are write/write conflicts between files. For directories, reintegration fails only if the same directory entry has conflicting modifications.

To detect conflicts, each object replica is tagged with a "storeid", which is compared with the value on the server during reintegration. If the storeid differs, the server's copy has been modified and there is a conflict.
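The lease refinement above is easy to picture in code. The following is a minimal, hypothetical server-side lease: none of the names or the 60-second duration come from the paper, and a disconnected client whose lease has expired simply has its later writes rejected, which is exactly the "updates are lost" problem noted above.

    import time

    LEASE_DURATION = 60.0  # seconds; an illustrative value, not from the paper

    class Lease:
        """Hypothetical lease granting one client exclusive access for a bounded time."""
        def __init__(self, client_id, duration=LEASE_DURATION):
            self.client_id = client_id
            self.expires_at = time.time() + duration

        def valid(self):
            return time.time() < self.expires_at

    class LeasedFile:
        """Server-side view of a file guarded by a single exclusive lease."""
        def __init__(self):
            self.lease = None

        def acquire(self, client_id):
            # Grant the lease if none is held or the previous one has expired.
            if self.lease is None or not self.lease.valid():
                self.lease = Lease(client_id)
                return True
            return False

        def write(self, client_id, data):
            # A disconnected client whose lease has expired loses write permission,
            # so any updates it tries to push after expiry are rejected here.
            if self.lease and self.lease.valid() and self.lease.client_id == client_id:
                return True   # update accepted
            return False      # update lost -- the problem the notes flag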
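Hoarding mixes user-specified priorities with recent usage. The notes do not give the exact priority function, so the sketch below uses hypothetical weights and a made-up recency metric; fetch and evict stand in for whatever Venus actually does to populate or trim the cache.

    import time

    # Hypothetical weights; the real Venus priority function differs in detail.
    HOARD_WEIGHT = 0.75
    RECENCY_WEIGHT = 0.25

    class CachedObject:
        def __init__(self, path, hoard_priority=0, cached=False):
            self.path = path
            self.hoard_priority = hoard_priority  # user-assigned, from the hoard database
            self.last_used = 0.0                  # updated on every reference
            self.cached = cached

        def priority(self, now):
            # Mix the user-specified priority with a recent-usage metric.
            recency = 1.0 / (1.0 + (now - self.last_used))
            return HOARD_WEIGHT * self.hoard_priority + RECENCY_WEIGHT * recency

    def hoard_walk(objects, cache_slots, fetch, evict):
        """Periodic walk (roughly every 10 minutes in Venus): keep the
        highest-priority objects cached, evicting and fetching as needed."""
        now = time.time()
        ranked = sorted(objects, key=lambda o: o.priority(now), reverse=True)
        keep = {o.path for o in ranked[:cache_slots]}
        for obj in ranked:
            if obj.path in keep and not obj.cached:
                fetch(obj)       # bring a high-priority object into the cache
                obj.cached = True
            elif obj.path not in keep and obj.cached:
                evict(obj)       # make room by dropping a low-priority object
                obj.cached = False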
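The log-size optimization during emulation (drop an earlier store record when the same file is stored again) takes only a few lines. This is only a sketch: the record fields and the ReplayLog class are invented for illustration, and the RVM-backed persistence is omitted.

    class LogRecord:
        def __init__(self, op, path, payload=None, last_seen_storeid=None):
            self.op = op                                # e.g. "store", "mkdir", "rename"
            self.path = path
            self.payload = payload
            self.last_seen_storeid = last_seen_storeid  # version the client last saw

    class ReplayLog:
        """Sketch of the replay log Venus keeps while disconnected. Persistence
        is omitted; the real system writes records via RVM so they survive crashes."""
        def __init__(self):
            self.records = []

        def append(self, record):
            if record.op == "store":
                # Log optimization from the notes: a new store of the same file
                # makes any earlier store record redundant, so discard it.
                self.records = [r for r in self.records
                                if not (r.op == "store" and r.path == record.path)]
            self.records.append(record)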
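The four reintegration steps and the storeid conflict check might look roughly like the following, continuing the ReplayLog sketch above. The server object and every method on it (lock, storeid, replay, replay_as_stub, backfetch, commit, abort, unlock) are hypothetical stand-ins for the real client-server RPCs.

    class ReintegrationConflict(Exception):
        pass

    def reintegrate(server, log):
        """The four reintegration steps, continuing the ReplayLog sketch above."""
        objects = {r.path for r in log.records}

        # 1) Parse the log and lock every object mentioned in it.
        server.lock(objects)
        try:
            stubs = []
            # 2) Check and replay each record, creating stubs for the stores.
            for r in log.records:
                # Conflict check: if the server's storeid no longer matches the
                # one this client last observed, someone else updated the object
                # while we were disconnected.
                if server.storeid(r.path) != r.last_seen_storeid:
                    raise ReintegrationConflict(r.path)
                if r.op == "store":
                    stubs.append(server.replay_as_stub(r))
                else:
                    server.replay(r)
            # 3) Fetch the actual file contents for the stubs.
            for stub in stubs:
                server.backfetch(stub)
            # 4) Commit the whole log as a unit.
            server.commit()
        except ReintegrationConflict:
            server.abort()   # leave information behind for manual repair
            raise
        finally:
            # Release the locks whether or not reintegration succeeded.
            server.unlock(objects)

Committing the log as a single unit matches the all-or-nothing behavior described above: one conflicting record aborts the whole reintegration rather than partially applying it.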
Questions: On choosing optimistic concurrency control, the authors claim: "The dominant influence on our choice was the low degree of write-sharing typical of Unix. This implied that an optimistic strategy was likely to lead to relatively few conflicts. An optimistic strategy was also consistent with our overall goal of providing the highest possible availability of data."

I would argue that a low degree of write-sharing has relatively little to do with picking optimistic over pessimistic. Isn't the real question "is availability more valuable than consistency for disconnected operation in a distributed file system"?