Broadcast Disks: Data Management for Asymmetric Communication Environments
S. Acharya, R. Alonso, M. Franklin, S. Zdonik. This summary also includes
material from "Dissemination-Based Data Delivery Using Broadcast Disks,"
by the same authors minus Alonso.
One-line summary:
Multilevel (but not truly "hierarchical") carouseling of multiple
broadcast "disks" onto a single channel; constructing the broadcast
program and managing the client caches must be considered in tandem,
since not all uncached pages are equidistant from the client. An
LRU-chain-per-disk caching strategy (LIX) works well in practice.
Overview/Main Points
- Conceptual notion: partition broadcast channel into "slots" and
treat channel as multiple spinning "disks" of different sizes.
Listeners can read data when it passes under the head. Differing
disk sizes allow hot/cold pages to be scheduled differently.
- Research goals:
- How to organize data on the broadcast disks? ("construct
the broadcast program")
- How should clients manage their local caches?
- Simplifying Assumptions: static client population;
read-only data; no upstream communication.
- Benefit of "Non-flat" (multiple-disk) broadcast programs over a
"randomly skewed" single-disk program:
- The advantage for clients grows as the variance of access
probabilities across pages grows.
- With random skew, the gaps between successive broadcasts of a page
vary, and a request arriving at a random time is proportionally more
likely to land in a long gap than in a short one, so the expected wait
is higher than with regular spacing (illustrated below). (Ed.: only
for Gaussian-mean arrival model, right?)
- Unpredictability of arrival time for a particular page
disallows "sleeping" to conserve client power.
- Creating a multidisk broadcast program: order pages by expected
relative frequency of client access and bin them into "disks"; assign
each disk a relative broadcast frequency; split disk i into a number of
chunks equal to the least common multiple of the relative frequencies
divided by freq(i); then interleave one chunk from each disk per minor
cycle (sketched below).
- Choose relative frequencies with care to avoid extremely
long "major cycle" periods.
- Cache management on client: Not all missed pages are
equidistant.
- Goal: retain the pages with the highest ratio of local access
probability P to broadcast frequency X (cost-based replacement), since
a rarely broadcast page is more expensive to miss.
- Implementing "PIX policy" (P inverse X) requires perfect
knowledge, so instead use LIX: LRU with one chain per
bcast disk, keeping an EWMA of probability that page will
be accessed in the next timeslot =
1/(currentTime-lastAccessTime). Usually performs within a
constant of PIX.
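
A rough Python sketch of LIX as described above; the EWMA weight, the
initial probability estimate for a newly cached page, and all names are
illustrative assumptions rather than the paper's parameters.

    from collections import OrderedDict

    ALPHA = 0.25   # EWMA weight -- illustrative; the paper tunes this empirically

    class LIXCache:
        """One LRU chain per broadcast disk; evict the chain-bottom page with
        the smallest lix = (estimated access probability) / (broadcast freq X)."""

        def __init__(self, capacity, disk_freq):
            self.capacity = capacity
            self.disk_freq = disk_freq                           # disk id -> broadcast frequency X
            self.chains = {d: OrderedDict() for d in disk_freq}  # page -> (prob estimate, last access)
            self.size = 0

        def access(self, page, disk, now):
            chain = self.chains[disk]
            if page in chain:
                p_old, last = chain.pop(page)
                # Fold the new sample 1/(now - last) into the running estimate.
                p = ALPHA / max(now - last, 1) + (1 - ALPHA) * p_old
            else:
                p = ALPHA            # first-touch estimate: an assumption, not from the paper
                if self.size >= self.capacity:
                    self._evict()
                else:
                    self.size += 1
            chain[page] = (p, now)   # re-insert at the most-recently-used end

        def _evict(self):
            # Only the least-recently-used page of each chain is a candidate.
            candidates = []
            for d, chain in self.chains.items():
                if chain:
                    page, (p, _) = next(iter(chain.items()))
                    candidates.append((p / self.disk_freq[d], d, page))
            _, d, page = min(candidates)
            del self.chains[d][page]
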
- Prefetching: (from "Data dissemination" paper)
- Tag-team caching: X and Y are pages on the same bcast
disk. Cache X until Y goes by; then replace X with Y. For two equally
likely pages spaced half a cycle apart, the expected cost is 0.125 of a
cycle, half the expected cost of the simple "demand fetching" strategy:
with probability 1/2 the request is for the uncached page, which is due
within half a cycle, for an expected wait of a quarter cycle. Assigning
pages to "tag teams" is a discrete bin-packing problem.
- PT value: product of the probability P of page access and the time T
that will elapse until that page next appears on the broadcast. A
page's PT value changes with each "tick" of the broadcast and traces
out a sawtooth: T shrinks toward its minimum as the page approaches,
then jumps back to its maximum the moment the page is broadcast
(sketched below).
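
A small sketch of the pt computation over a known broadcast program (my
own illustration; the replacement rule that compares pt values of cached
and in-flight pages is omitted here).

    def slots_until(page, program, now):
        """Slots until `page` next appears on a cyclic broadcast `program`,
        where `now` is the index of the slot currently on the air."""
        n = len(program)
        for step in range(1, n + 1):
            if program[(now + step) % n] == page:
                return step
        raise ValueError(f"{page!r} is not in the broadcast program")

    def pt(prob, page, program, now):
        """PT value: access probability times the wait until the page comes around."""
        return prob * slots_until(page, program, now)

    # Traced over a major cycle, a page's pt value decays as its slot nears,
    # then jumps back up right after it is broadcast -- the sawtooth above.
    program = ["A", "B1", "C1", "A", "B2", "C2", "A", "B1", "C3", "A", "B2", "C4"]
    print([pt(0.05, "C1", program, slot) for slot in range(len(program))])
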
Relevance
Flaws
I think more attention should be focused on application-level
considerations: what is a "page" from the point of view of
typical apps? How do large "logical pages" get subdivided into
physical pages (application-level framing), and can you exploit
application semantics to decide how to schedule the fragments?
Can you analyze application usage/traffic to help construct the
bcast schedule? I recognize that this paper represents early
work, so maybe the authors will address these questions as they
relax their simplifying assumptions.
Also the relationship to NUMA work was mentioned in passing but I'm
surprised they weren't able to draw more from it, or at least state the
differences (e.g. duality between compiler-directed static data
placement for NUMA programs and constructing a static broadcast schedule).