Implementation and Performance of Munin

Carter et al.

 

Munin: a distributed shared memory (DSM) system that employs multiple consistency protocols and release consistency.

 

Shared variables are annotated to select the best consistency protocol for each one.  The choice is flexible: the protocol for a variable can be changed while the program is running.

 

Release consistency: don’t propagate write updates or invalidations until the lock release at the end of a critical section.  Writes are delayed, batched, and performed at release time, which reduces both the number of messages and the time the writing process spends blocked.
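The batching idea can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not Munin's actual code: the class name DelayedUpdateQueue, its methods, and the use of a list entry to stand for one network message are all hypothetical.

```python
class DelayedUpdateQueue:
    """Hypothetical sketch: buffer writes locally, send them as one batch at release."""

    def __init__(self):
        self.pending = {}        # address -> latest value written in the critical section
        self.messages_sent = []  # each entry models one network message (a batch of updates)

    def write(self, address, value):
        # An ordinary store is only recorded locally; no message is sent yet.
        self.pending[address] = value

    def release(self):
        # At release time, all buffered writes go out together as a single message.
        if self.pending:
            self.messages_sent.append(dict(self.pending))
            self.pending.clear()


queue = DelayedUpdateQueue()
for i in range(4):
    queue.write(("x", i), i * i)  # four writes inside the critical section
queue.write(("x", 0), 99)         # a later write to the same address replaces the earlier one
queue.release()                   # one message carries all four addresses
```

In this toy model five stores produce a single message at release, whereas a protocol that propagates every write immediately would have sent five.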

 

Challenge: achieve good DSM performance without requiring the programmer to deviate significantly from the conventional programming model.  (Don’t require the programmer to rewrite a parallel version of the application using message passing to obtain efficiency.)

 

By annotating shared variables appropriately, Munin comes within 10% of the efficiency of message passing.  Interestingly, Munin is built on top of V, as V was the first distributed operating system to allow applications to supply their own page fault handlers and modify page tables.  All synchronization operations must be visible to the Munin DSM runtime system, so programs must use Munin’s synchronization primitives.

 

Release consistency was first introduced in the Stanford DASH multiprocessor, where it is implemented in hardware; Munin implements it in software.

 

(All memory accesses are either ordinary accesses (load/store) or synchronization accesses (acquire/release lock).)

 

Conditions required for ensuring release consistency:

1)     Before an ordinary load or store is allowed to perform with respect to any other processor, all previous acquires must be performed.

2)     Before a release is allowed to perform with respect to any other processor, all previous ordinary loads and stores must be performed.

3)     Before an acquire is allowed to perform with respect to any other processor, all previous releases must be performed.  Before a release is allowed to perform with respect to any other processor, all previous acquires and releases must be performed.

 

“all previous accesses” = all accesses by the same thread that precede the current access in program order

 

A load is said to have “performed with respect to another processor” when a subsequent store on that processor cannot affect the value returned by the load.

 

A store is said to have “performed with respect to another processor” when a subsequent load by that processor will return the value stored (or the value stored by a later store).

 

 

 

 

Most previous DSMs use sequential consistency, under which a writer must block until the copies of a page at other processors have been updated or invalidated.

 

Protocol Parameters:

I – invalidate or update

R – replicas allowed?

D – can system delay updates or invalidations?

FO – send all writes to a fixed owner?

M – multiple writers allowed w/o synchronization?

S – stable sharing pattern?  (Is the same set of threads always accessing the shared variable?  If so, Munin can eagerly send updates and lock grants to only those threads.)

FI – flush changes to owner?  (send changes only to the object’s owner and invalidate the local copy)

W – writeable? Munin generates a run-time error if an attempt is made to modify a non-writeable object.
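One way to picture the parameter list is as a record attached to each shared object.  The sketch below is an assumption-laden illustration, not Munin's data structure: ProtocolParams, SharedObject, and store are hypothetical names, and None is used here to mean that a parameter does not apply.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProtocolParams:
    """The eight protocol parameters (hypothetical field names)."""
    invalidate: Optional[bool]        # I: True = invalidate, False = update
    replicas: Optional[bool]          # R
    delayed: Optional[bool]           # D
    fixed_owner: Optional[bool]       # FO
    multiple_writers: Optional[bool]  # M
    stable_sharing: Optional[bool]    # S
    flush_to_owner: Optional[bool]    # FI
    writable: Optional[bool]          # W


class SharedObject:
    def __init__(self, value, params):
        self.value = value
        self.params = params

    def store(self, value):
        # Mirrors the W rule: modifying a non-writable object is a run-time error.
        if not self.params.writable:
            raise RuntimeError("attempt to modify a non-writable shared object")
        self.value = value


# An illustrative non-writable setting; the remaining values are arbitrary.
frozen_params = ProtocolParams(
    invalidate=False, replicas=True, delayed=None, fixed_owner=None,
    multiple_writers=None, stable_sharing=None, flush_to_owner=None,
    writable=False,
)
table = SharedObject(value=1, params=frozen_params)
try:
    table.store(2)
    write_rejected = False
except RuntimeError:
    write_rejected = True
```

Keeping the parameters in an immutable record makes it natural to swap the whole record when a variable's protocol changes at run time.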

 

Sharing Annotations:

(try filling in chart below!)

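Sharing Annotation   Parameter Settings
                     I    R    D    FO   M    S    FI   W
Read-only            __   __   __   __   __   __   __   __
Migratory            __   __   __   __   __   __   __   __
Write-Shared         __   __   __   __   __   __   __   __
Producer-Consumer    __   __   __   __   __   __   __   __
Reduction            __   __   __   __   __   __   __   __
Result               __   __   __   __   __   __   __   __
Conventional         __   __   __   __   __   __   __   __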

Sharing Annotation   Parameter Settings
                     I    R    D    FO   M    S    FI   W
Read-only            N    Y    -    -    -    -    -    N
Migratory            Y    N    -    N    N    -    N    Y
Write-Shared         N    Y    Y    N    Y    N    N    Y
Producer-Consumer    N    Y    Y    N    Y    Y    N    Y
Reduction            N    Y    N    Y    N    -    N    Y
Result               N    Y    Y    Y    Y    -    Y    Y
Conventional         Y    Y    N    N    N    -    N    Y
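For checking answers, the completed chart can be encoded as a small lookup table.  This is a study aid only: the names PARAMS, ANNOTATIONS, settings, and annotations_with are hypothetical, and "-" is assumed here to mean that the parameter does not apply.

```python
PARAMS = ("I", "R", "D", "FO", "M", "S", "FI", "W")

# Rows copied from the completed chart; "-" is assumed to mean "does not apply".
ANNOTATIONS = {
    "read-only":         "N Y - - - - - N",
    "migratory":         "Y N - N N - N Y",
    "write-shared":      "N Y Y N Y N N Y",
    "producer-consumer": "N Y Y N Y Y N Y",
    "reduction":         "N Y N Y N - N Y",
    "result":            "N Y Y Y Y - Y Y",
    "conventional":      "Y Y N N N - N Y",
}


def settings(annotation):
    """Return a dict mapping each parameter name to its setting for one annotation."""
    return dict(zip(PARAMS, ANNOTATIONS[annotation].split()))


def annotations_with(**wanted):
    """Return the sorted names of annotations whose settings match all given values."""
    return sorted(
        name for name in ANNOTATIONS
        if all(settings(name)[param] == value for param, value in wanted.items())
    )


delayed_multi_writer = annotations_with(D="Y", M="Y")
non_writable = annotations_with(W="N")
```

Here delayed_multi_writer collects the annotations that both delay updates and allow multiple writers, and non_writable picks out the single annotation whose objects cannot be modified.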

 

Advanced Extensions