Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency

Gray, Cheriton

 

Lease: leaseholder may read cached data during the term of a lease without contacting the server, since the server must get the leaseholder's approval before the datum is written.  lease may be renewed / extended by the leaseholder by contacting the server and doing another read.  server automatically grants a lease when a client asks for some datum.  if another client wants to modify the datum during the term of the lease, the server needs to ask the current leaseholder for permission, or must delay the write until the lease expires.
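a minimal sketch of the grant / approve-or-delay rule described above, not the paper's implementation.  the class name `LeaseServer`, its methods, and the explicit `now` argument (used in place of a real clock so the behavior is deterministic) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LeaseServer:
    """Toy lease table; every name here is hypothetical."""
    term: float = 10.0                          # lease term in seconds
    leases: dict = field(default_factory=dict)  # key -> {client: expiry time}

    def read(self, key, client, now):
        """Grant (or renew) a lease as a side effect of a read."""
        expiry = now + self.term
        self.leases.setdefault(key, {})[client] = expiry
        return expiry

    def can_write(self, key, writer, now, approvals=()):
        """A write may proceed only when every other client holding an
        unexpired lease on `key` has approved; otherwise it must be delayed."""
        for client, expiry in self.leases.get(key, {}).items():
            if client == writer or expiry <= now:
                continue            # own lease, or lease already expired
            if client not in approvals:
                return False        # still waiting on this leaseholder
        return True
```

with a 10 second term, a read by client A at time 0 blocks a write by client B at time 5 unless A approves it, and no longer blocks it at time 10 when the lease has expired.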

 

assumption: leases rely on synchronized physical clocks / small clock skew!

 

paper argues, analytically determines, and experimentally validates that short leases (e.g., 10 seconds) are efficient, and reduce the number of messages required to ensure consistency.  also argues that leases are good for fault-tolerance: if a client dies, server can continue to allow other clients to write to the file after the lease expires.  if the server dies, but keeps track of the lease it last gave out with the longest term, it may recover and delay all writes until that lease expires before allowing other clients to write, instead of being in an unknown state about which files it can allow clients to write.
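a small sketch of the server-recovery rule above, assuming the only thing that survives the crash is the longest lease term ever granted (`max_term_granted`, a hypothetical name).  any lease granted before the crash must then have expired by restart time plus that term, so the server holds writes until then.

```python
def writes_allowed_after(restart_time, max_term_granted):
    """Earliest time at which a recovered server may accept writes.

    Conservative bound: a lease granted just before the crash cannot
    outlive restart_time + max_term_granted.
    """
    return restart_time + max_term_granted


def may_write(now, restart_time, max_term_granted):
    """True once every pre-crash lease is guaranteed to have expired."""
    return now >= writes_allowed_after(restart_time, max_term_granted)
```

the shorter the lease term, the shorter this post-recovery delay, which is one more argument for short leases.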

 

false sharing – a lease conflict exists when no actual conflict exists.  that is, a client wants to write to a file, but cannot because another client holds a lease and is not using the file.

 

advantages of short leases

+ have to keep around less state for a shorter period at the server

+ leases are small (just a couple pointers); about 10 bytes per lease.  if lease state gets too big, we can just have leases over larger granularity objects, at the risk of more contention

 

advantages of longer leases

+ good for frequently read, but infrequently written files

 

installed files: files that are mostly read.  long leases over coarse grained objects can be used to reduce server load and added read/write delay due to consistency messages.  approx 50% of reads were on installed files.

 

“Simple” analytical model was developed to determine lower bound on effective lease time.  lease time > 1/(R(alpha-1)) where alpha = lease benefit factor = 2R/(SW).  (R=frequency of reads, W=frequency of writes, S=number of shared copies)  The bound only makes sense when alpha > 1.  The more reading there is relative to writing and to sharing, the bigger the benefit from leases.
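a worked example of the bound above, as a sketch.  the function names and the sample rates are made up for illustration and are not taken from the paper; returning `None` when alpha <= 1 is my own convention for "no lease term satisfies the bound".

```python
def lease_benefit_factor(R, W, S):
    """alpha = 2R / (S * W), with R and W in operations per second."""
    return 2.0 * R / (S * W)


def min_effective_term(R, W, S):
    """Lower bound 1 / (R * (alpha - 1)) on the lease term, in seconds.

    Returns None when alpha <= 1, where the bound does not apply.
    """
    alpha = lease_benefit_factor(R, W, S)
    if alpha <= 1.0:
        return None
    return 1.0 / (R * (alpha - 1.0))


# Illustrative rates only: 1 read/s, 0.05 writes/s, 2 sharing clients.
alpha = lease_benefit_factor(R=1.0, W=0.05, S=2)   # about 20
bound = min_effective_term(R=1.0, W=0.05, S=2)     # about 1/19 s, i.e. ~0.053 s
```

with reads this dominant, even a lease of a fraction of a second clears the bound; raising W or S pushes alpha toward 1 and the bound toward infinity.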

 

paper weakness? only did one test with S=1 to validate correctness of graphs generated from the model.

 

“knee” in all curves was at about 10 seconds and additional benefit of longer leases was insignificant (only an additional 3% to 10% gain by extending lease time to infinity).

 

In V, the ratio of reads to writes was an order of magnitude higher than in other systems because they factored out writes to temporary files, and didn’t count them as writes since the temporary files were not shared.  Also, they counted reads due to program loading and directory lookups as reads whereas most other studies did not do this.  also, in V they only counted opening files as reads and closing files as writes instead of counting block level reads and writes???  but they argue that unix would also benefit from leases despite these differences.

 

Trends: 1) increasing communication delay due to larger wide-area networks, 2) faster client processors.  (1) results in longer lease times being favorable, since we need to account for the longer latencies in granting the leases.  (2) allows the client processors to do more work (reads/writes) within a given lease term, hence an increase in client processor speed allows us to have a shorter lease time while keeping server load and average delay due to consistency messages constant.

 

Increases in number of clients and servers have no effect on lease time (unless amount of write sharing increases, but, in general, this will not happen due to just more clients and servers.)

 

Lease Management Options / Enhancements

 

 

Fault-tolerance advantages of leases:

 

If server advances clock too fast, or client advances clock too slowly, then leases will not be respected appropriately, and the result will be inconsistency.  (However, we can deal with slower server clocks and faster client clocks without any problems, so potentially skew should be adjusted to be “off” in those directions to maintain consistency.)
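one way to bias the error in the safe direction is for the client to stop trusting its lease a little early.  the sketch below assumes a known bound `max_skew` on how far the two clocks can drift apart over one lease term, and measures the term from when the request was sent rather than when the reply arrived; both the function and its parameters are hypothetical, not taken from the paper.

```python
def client_safe_expiry(request_sent_at, term, max_skew):
    """Local time after which the client must stop using its cached copy.

    Measuring from the send time and subtracting max_skew both shorten the
    client's view of the lease, so it can only expire early at the client,
    never late relative to the server.
    """
    return request_sent_at + term - max_skew


def cache_is_valid(now, request_sent_at, term, max_skew):
    """True while the client may still read from its cache."""
    return now < client_safe_expiry(request_sent_at, term, max_skew)
```

the cost is a slightly shorter effective lease at the client, which matters more as the term gets short relative to the skew allowance.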

 

 

Related Work [ take more notes on this later ]

 

Sprite, RFS use infinite leases when processes open files.  The opening process owns the file until it closes it.

 

A number of other systems are talked about.  The authors claim that most systems could gain performance by using short-term leases.