Broadcast Disks: Data Management for Asymmetric Communication Environments
S. Acharya, R. Alonso, M. Franklin, S. Zdonik. This summary also includes
material from "Dissemination-Based Data Delivery Using Broadcast Disks,"
by the same authors minus Alonso.
One-line summary:
Multilevel (but not truly "hierarchical") carouseling of multiple
broadcast "disks" onto a single channel; constructing the broadcast
program and managing the client caches must be considered in tandem,
since not all uncached pages are equidistant from the client. An
LRU-chain-per-disk caching strategy (LIX) works well in practice.
Overview/Main Points
- Conceptual notion: partition broadcast channel into "slots" and
treat channel as multiple spinning "disks" of different sizes.
Listeners can read data when it passes under the head. Differing
disk sizes allow hot/cold pages to be scheduled differently.
- Research goals:
- How to organize data on the broadcast disks? ("construct
the broadcast program")
- How should clients manage their local caches?
- Simplifying Assumptions: static client population;
read-only data; no upstream communication.
- Benefit of "Non-flat" (multiple-disk) broadcast programs over a
"randomly skewed" single-disk program:
- The advantage for clients grows as the variance of access
probabilities across pages grows.
- With random skew, the gaps between successive broadcasts of a page
vary, and a request arriving at a random time is proportionally more
likely to land in a long gap than in a short one, so the expected wait
is higher than with regular spacing (illustrated below). (Ed.: only
for Gaussian-mean arrival model, right?)
- Unpredictability of arrival time for a particular page
disallows "sleeping" to conserve client power.
- Creating a multidisk broadcast program: order pages by expected
relative frequency of client access and bin them into "disks"; assign
each disk a relative broadcast frequency; split disk i into a number of
chunks equal to the least common multiple of the relative frequencies
divided by freq(i); then interleave one chunk from each disk per minor
cycle (sketched below).
- Choose relative frequencies with care to avoid extremely
long "major cycle" periods.
- Cache management on client: Not all missed pages are
equidistant.
- Goal: retain the pages with the highest ratio of local access
probability P to broadcast frequency X (cost-based replacement), since
a rarely broadcast page is more expensive to miss.
- Implementing "PIX policy" (P inverse X) requires perfect
knowledge, so instead use LIX: LRU with one chain per
bcast disk, keeping an EWMA of probability that page will
be accessed in the next timeslot =
1/(currentTime-lastAccessTime). Usually performs within a
constant of PIX.
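
A rough Python sketch of LIX as described above; the EWMA weight, the
initial probability estimate for a newly cached page, and all names are
illustrative assumptions rather than the paper's parameters.

    from collections import OrderedDict

    ALPHA = 0.25   # EWMA weight -- illustrative; the paper tunes this empirically

    class LIXCache:
        """One LRU chain per broadcast disk; evict the chain-bottom page with
        the smallest lix = (estimated access probability) / (broadcast freq X)."""

        def __init__(self, capacity, disk_freq):
            self.capacity = capacity
            self.disk_freq = disk_freq                           # disk id -> broadcast frequency X
            self.chains = {d: OrderedDict() for d in disk_freq}  # page -> (prob estimate, last access)
            self.size = 0

        def access(self, page, disk, now):
            chain = self.chains[disk]
            if page in chain:
                p_old, last = chain.pop(page)
                # Fold the new sample 1/(now - last) into the running estimate.
                p = ALPHA / max(now - last, 1) + (1 - ALPHA) * p_old
            else:
                p = ALPHA            # first-touch estimate: an assumption, not from the paper
                if self.size >= self.capacity:
                    self._evict()
                else:
                    self.size += 1
            chain[page] = (p, now)   # re-insert at the most-recently-used end

        def _evict(self):
            # Only the least-recently-used page of each chain is a candidate.
            candidates = []
            for d, chain in self.chains.items():
                if chain:
                    page, (p, _) = next(iter(chain.items()))
                    candidates.append((p / self.disk_freq[d], d, page))
            _, d, page = min(candidates)
            del self.chains[d][page]
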
- Prefetching: (from "Data dissemination" paper)
- Tag-team caching: X and Y are pages on the same bcast
disk. Cache X until Y goes by; then replace X with Y. For two equally
likely pages spaced half a cycle apart, the expected cost is 0.125 of a
cycle, half the expected cost of the simple "demand fetching" strategy:
with probability 1/2 the request is for the uncached page, which is due
within half a cycle, for an expected wait of a quarter cycle. Assigning
pages to "tag teams" is a discrete bin-packing problem.
- PT value: product of the probability P of page access and the time T
that will elapse until that page next appears on the broadcast. A
page's PT value changes with each "tick" of the broadcast and traces
out a sawtooth: T shrinks toward its minimum as the page approaches,
then jumps back to its maximum the moment the page is broadcast
(sketched below).
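
A small sketch of the pt computation over a known broadcast program (my
own illustration; the replacement rule that compares pt values of cached
and in-flight pages is omitted here).

    def slots_until(page, program, now):
        """Slots until `page` next appears on a cyclic broadcast `program`,
        where `now` is the index of the slot currently on the air."""
        n = len(program)
        for step in range(1, n + 1):
            if program[(now + step) % n] == page:
                return step
        raise ValueError(f"{page!r} is not in the broadcast program")

    def pt(prob, page, program, now):
        """PT value: access probability times the wait until the page comes around."""
        return prob * slots_until(page, program, now)

    # Traced over a major cycle, a page's pt value decays as its slot nears,
    # then jumps back up right after it is broadcast -- the sawtooth above.
    program = ["A", "B1", "C1", "A", "B2", "C2", "A", "B1", "C3", "A", "B2", "C4"]
    print([pt(0.05, "C1", program, slot) for slot in range(len(program))])
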
Relevance
Flaws
I think more attention should be focused on application-level
considerations: what is a "page" from the point of view of
typical apps? How do large "logical pages" get subdivided into
physical pages (application-level framing), and can you exploit
application semantics to decide how to schedule the fragments?
Can you analyze application usage/traffic to help construct the
bcast schedule? I recognize that this paper represents early
work, so maybe the authors will address these questions as they
relax their simplifying assumptions.
Also the relationship to NUMA work was mentioned in passing but I'm
surprised they weren't able to draw more from it, or at least state the
differences (e.g. duality between compiler-directed static data
placement for NUMA programs and constructing a static broadcast schedule).