An Evaluation of Buffer Management Strategies for Relational Database
Systems.
Hong-Tai Chou and David J. DeWitt
Summary by Steve Gribble and Armando Fox
One-line summary:
A model of relational query behaviour (the query locality set model,
or QLSM) is used to motivate a buffer management strategy called DBMIN;
QLSM predicts future reference behaviour but separates this prediction from
any particular buffer management algorithm.
Overview/Main Points
- "Bad" buffer mgmt algorithms:
- Domain separation: pages are classified into types,
each of which is separately managed in its associated
domain of buffers by LRU discipline. Example type
assignment: one domain to each non-leaf level of B-tree
structure. Limitation is that concept of domain is
static, and relative importance of types is not
considered.
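The scheme above can be sketched as one LRU chain per page type; the
class name and structure here are illustrative, not from the paper:

```python
from collections import OrderedDict

class DomainSeparationPool:
    """Sketch of domain separation: each page type has its own
    statically sized domain of buffers, managed by LRU."""
    def __init__(self, domain_sizes):
        # domain_sizes: {domain_name: buffer_count}; static, which is
        # exactly the limitation the paper points to.
        self.domains = {d: OrderedDict() for d in domain_sizes}
        self.sizes = dict(domain_sizes)

    def access(self, page_id, page_type):
        lru = self.domains[page_type]
        hit = page_id in lru
        if hit:
            lru.move_to_end(page_id)     # most-recently-used at the end
        else:
            if len(lru) >= self.sizes[page_type]:
                lru.popitem(last=False)  # evict LRU victim in this domain only
            lru[page_id] = True
        return hit
```

For example, one domain per B-tree level: a flood of leaf-page accesses
can never steal buffers from the root's domain, but a domain's budget
also never adapts to how important its page type actually is.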
- "New" algorithm: buffer pool is
subdivided and allocated on a per-relation basis. Each
active relation assigned a resident set that is
initially empty, resident sets of relations are
linked in a priority list. MRU within each relation.
Limitations: why MRU all the time? Why order
rleations on priority list? Searching through list can
be expensive.
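A minimal sketch of one relation's resident set under MRU replacement
(class and names are illustrative, not from the paper):

```python
class RelationResidentSet:
    """One relation's resident set in the "New" algorithm, with
    MRU replacement within the relation."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = []          # order of use; last element = most recent

    def access(self, page_id):
        if page_id in self.pages:
            self.pages.remove(page_id)
            self.pages.append(page_id)
            return True          # hit
        if len(self.pages) >= self.capacity:
            self.pages.pop()     # MRU: evict the most recently used page
        self.pages.append(page_id)
        return False             # miss
```

MRU keeps the oldest pages resident, which is the right choice for a
sequential loop over a relation but wrong for many other access
patterns - hence the "why MRU all the time?" objection.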
- Hot Set: a set of pages over which there is
looping behaviour is called a hot set. By plotting the
number of page faults vs. buffer size, one observes
discontinuities at "hot points". Each query
is provided a separate list of buffers managed by an LRU
discipline; the number of buffers a query is entitled to is
predicted by the hot set model. Limitations: why LRU?
MRU is better in some situations; LRU will over-allocate
memory.
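The hot-point discontinuity is easy to reproduce with a toy LRU
simulation (a sketch, not the paper's methodology): a query that loops
over N pages faults on every reference until the buffer holds all N,
at which point the fault count collapses.

```python
def lru_faults(refs, buf_size):
    """Count page faults for an LRU buffer of the given size."""
    buf = []
    faults = 0
    for p in refs:
        if p in buf:
            buf.remove(p)
            buf.append(p)        # refresh: most recently used at the end
        else:
            faults += 1
            if len(buf) >= buf_size:
                buf.pop(0)       # evict the least recently used page
            buf.append(p)
    return faults

# A loop over 5 pages, repeated 10 times: below 5 buffers LRU faults on
# every reference; at the "hot point" of 5 it drops to the 5 cold misses.
loop = list(range(5)) * 10
curve = {b: lru_faults(loop, b) for b in range(1, 7)}
```

Plotting `curve` gives the discontinuity the hot set model looks for:
50 faults at sizes 1 through 4, then 5 faults from size 5 onward.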
- Query Locality Set Model (QLSM): relational database
systems support a limited set of operations, and page reference
patterns exhibited by these operations are regular and
predictable. Database operation can be decomposed into a number
of simple reference patterns:
- Straight sequential (SS): one page frame is all that's
required, as no page is ever revisited.
- Clustered sequential (CS): records in a cluster set
(records with the same key value) should be kept in
memory at the same time if possible.
- Looping sequential (LS): MRU.
- Independent random (IR): similar to SS.
- Clustered random (CR): similar to CS.
- Straight hierarchical (SH): index traversed only once,
so one page frame is all that's necessary.
- Hierarchical with straight sequential (H/SS): index
traversed once, followed by scan through leaves -
similar to SS.
- Hierarchical with clustered sequential (H/CS): similar
to CS.
- Looping hierarchical (LH): pages closer to the root are
more likely to be accessed - the access probability
of a page at the ith level is inversely proportional
to the ith power of the fan-out factor, so keep pages
at upper levels in memory preferentially (for a large
fan-out, the root page is the only one worth keeping
in memory).
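One way to read the QLSM is as a table mapping each reference pattern
to a buffer requirement and a local replacement policy; the mapping
below is my reading of this summary, not the paper's exact table. The
last lines illustrate the LH fall-off in access probability.

```python
# Pattern -> (buffer requirement, suggested local policy); illustrative.
QLSM_POLICY = {
    "SS":   ("1 frame",      "any"),        # no page ever revisited
    "CS":   ("cluster size", "FIFO/LRU"),
    "LS":   ("loop size",    "MRU"),
    "IR":   ("1 frame",      "any"),        # like SS
    "CR":   ("cluster size", "FIFO/LRU"),   # like CS
    "SH":   ("1 frame",      "any"),
    "H/SS": ("1 frame",      "any"),        # like SS
    "H/CS": ("cluster size", "FIFO/LRU"),   # like CS
    "LH":   ("upper levels", "keep root"),
}

# LH fall-off: with fan-out f, a level-i page (i = 0 at the root) is
# accessed with probability proportional to f**-i, so for a large
# fan-out the root dominates and is the only page worth pinning.
f = 100
weights = [f ** -i for i in range(4)]
probs = [w / sum(weights) for w in weights]
```

With f = 100, the root accounts for about 99% of the index-page access
probability, which justifies keeping only the root resident.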
- DBMIN: buffers are allocated and managed on a per file
instance basis. (A file instance seems to be a file opened by
a query; if multiple queries open the same file, this is
multiple file instances.) The set of buffered pages associated
with a file instance is its locality set. Each locality
set is managed separately by a discipline selected according to
the intended usage of the file instance.
- If buffer contains a page that does not belong to any
locality set, it is placed on global free list.
- Pages in the buffer can belong to at most one locality
set. A file instance is considered the owner of all
pages in its locality set.
- To allow for data sharing among concurrent queries, all
buffers in memory are also accessible through a global
buffer table.
- Algorithm: when a page is requested by a query, the
global table is searched, followed by an adjustment to
the associated locality set. Three cases:
- Page found in both global table and locality
set: only usage statistics need to be updated
as dictated by local replacement policy.
- Page found in global table but not in the
file instance's locality set: if the page has
an owner, it is simply given to the requesting
query. Otherwise, the page is added to the
locality set of the file instance; if this
exceeds the maximum number of buffers allocated
to the file instance, a page is released
according to the local replacement policy.
- Page is not in memory: it is brought in, then
proceed as in case 2.
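The three cases can be sketched as follows; `FileInstance`, `DBMinPool`,
and the LRU local policy are illustrative stand-ins (DBMIN actually
chooses the local policy per QLSM reference pattern):

```python
from collections import OrderedDict

class FileInstance:
    """Stand-in for a DBMIN file instance: a locality set with a frame
    budget, managed here by LRU for simplicity."""
    def __init__(self, max_frames):
        self.max_frames = max_frames
        self.locality = OrderedDict()   # page_id -> True, in LRU order

class DBMinPool:
    def __init__(self):
        self.global_table = {}          # page_id -> owning FileInstance or None
        self.free_list = []             # in-memory pages owned by no one

    def request(self, fi, page_id):
        """Handle a page request; returns True on a page fault."""
        fault = page_id not in self.global_table
        if fault:
            # Case 3: page not in memory -- "read" it in as unowned,
            # then proceed exactly as in case 2.
            self.global_table[page_id] = None
        owner = self.global_table[page_id]
        if owner is fi:
            # Case 1: page already in this instance's locality set; only
            # the local policy's usage statistics (LRU order) change.
            fi.locality.move_to_end(page_id)
        elif owner is not None:
            # Case 2a: page owned by another file instance -- the
            # requesting query uses the shared copy; ownership unchanged.
            pass
        else:
            # Case 2b: unowned page joins this instance's locality set.
            if page_id in self.free_list:
                self.free_list.remove(page_id)
            fi.locality[page_id] = True
            self.global_table[page_id] = fi
            if len(fi.locality) > fi.max_frames:
                # Over budget: the local policy releases a victim, which
                # stays in memory on the global free list.
                victim, _ = fi.locality.popitem(last=False)
                self.global_table[victim] = None
                self.free_list.append(victim)
        return fault
```

A released page stays in memory and can later be re-claimed by any
file instance via case 2b without a disk read, which is how DBMIN
supports data sharing among concurrent queries.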
- Performance: a hybrid trace/distribution driven simulation
was used to compare DBMIN with random (RAND), FIFO, CLOCK, and
working set (WS) algorithms. Metric of comparison was throughput
(average number of queries completed per second).
- DBMIN wins in all cases, followed closely by the hot
set algorithm (HOT). CLOCK, WS, FIFO, and RAND don't
do well at all.
- Data sharing increases competitiveness of other
algorithms, but DBMIN and HOT still win. (Data sharing
in three cases: full, in which all queries access the
same database, half, in which every two queries share a
copy of the database, and none, in which every query
has its own copy of the database.)
- Load control (keeping the paging device busy only
about half the time by descheduling thrashing queries)
also made the other algorithms more competitive, but
there are well-known problems with load control: high
overhead, it responds only after a bad situation
arises, and it doesn't work well in an environment
with a large number of transactions.
Relevance
Minimization of I/Os is key, and a good buffer management strategy is
critical for this task.
Flaws
- Need to know intended usage of file instance so that a management
algorithm and required number of buffers can be allocated. Is it
ever possible that intended usage cannot be predicted?
- What if there are multiple users, and required number of buffers
for all concurrent queries cannot be satisfied? Do you just
delay some of the queries, or do you let all of them suffer?
- Each file instance is considered independently. Is there some
opportunity for optimization across file instances? Surely there
is locality across queries, or across file accesses within a
query.