Summary by Armando Fox and Steve Gribble
- Goal: allow continuous monitoring of incrementally refined
results of a long-running aggregation query (a query in which
some statistic is computed over a selection, e.g. "select
avg(R.c1*R.c2) from R where ..."). Related work:
- online analytical processing (OLAP): also
support super-aggregation and sub-aggregation
- "fast-first" query proc: attempt to quickly return first
- APPROXIMATE: defines an approximate relational algebra which
it uses to process standard queries in an iteratively
refined fashion
- Design goals:
- Continuous observation in UI: regular updates at a
selectable rate (time granularity)
- Control of fairness: if query is computing several running
aggregates, user should be able to control the relative
rates at which they are updated
- Minimum time to accuracy: should converge to a narrow
confidence interval quickly
- Minimum time to completion: less important; the authors
conjecture that most long-running online aggregations will
be halted by the user long before completion.
- Naive implementation: Ingres allows arbitrary functions in the
query, e.g. "SELECT running_avg(S.grade) WHERE....", and you
define running_avg to be a function with internal state
that returns the next refinement of a running aggregate. But
these functions can't be used with SQL "group by", aren't well
integrated into the query language, and don't perform very well.
So we modify the DBMS engine itself....
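The naive stateful-function idea can be sketched as follows (Python, purely illustrative; Ingres user-defined functions were not Python, and the names here are hypothetical):

```python
class RunningAvg:
    """Stateful aggregate: each call folds in one value and
    returns the current refinement of the running average."""
    def __init__(self):
        self.count = 0
        self.total = 0.0

    def __call__(self, value):
        self.count += 1
        self.total += value
        return self.total / self.count

# Feeding grades one tuple at a time yields successive refinements:
running_avg = RunningAvg()
for grade in [3.0, 4.0, 2.0]:
    print(running_avg(grade))   # 1.0 -> 3.0, then 3.5, then 3.0
```

Note how the state lives outside the query engine, which is exactly why such functions compose poorly with "group by" and the rest of the query language.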
- Problems to be solved in DB engine modifications:
- Data must be randomly accessed, in order for running
aggregate to be statistically meaningful
- If query specifies "group by", need non-blocking way to do
the grouping (think of an event loop: each "event" of
incrementally refining one of the aggregates has to be
short, to keep behavior and control interactive)
- Need non-blocking join algorithms (same reasoning)
- Query optimization metrics may change
- Random access to data.
- Heap scan: records are usually stored in unspecified
order, though sometimes with residual dependence on (e.g.)
insertion order. This is usually the best way to get
pseudo-random access.
- Index scan: works only if the indexed attributes are not the
ones being aggregated.
- Random sampling from indices: works, but requires repeated
probes to random hash buckets. Bleah.
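To see why access order matters for statistical meaningfulness, a small illustrative sketch (hypothetical data, not from the paper) comparing running-average estimates under clustered (sorted) vs. random tuple order:

```python
import random

def running_estimates(values):
    """Return the sequence of running averages over `values`."""
    total, out = 0.0, []
    for i, v in enumerate(values, 1):
        total += v
        out.append(total / i)
    return out

data = list(range(100))              # true mean = 49.5
sorted_scan = running_estimates(sorted(data))

random.seed(0)
shuffled = data[:]
random.shuffle(shuffled)
random_scan = running_estimates(shuffled)

# After only 10 tuples, the clustered scan's estimate (4.5) is far
# from the true mean; the random-order scan is typically much closer.
print(sorted_scan[9], random_scan[9])
```

Both converge to 49.5 eventually, but only the random-order estimates are unbiased at every prefix, which is what the running confidence intervals rely on.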
- Non-blocking "group by":
- hash input relation on grouping
columns; then as soon as a tuple is read from input, can
immediately update aggregate for its group.
- Index striding: round-robin fetches of tuples that
satisfy the "group by" constraints. To do this, given a
B-tree index on grouping columns, scan index looking for
distinct keys placing each in its own bucket, and once the
complete set of distinct keys has been found, round-robin
tuples from each bucket. Can also weight the round-robin.
- Index striding allows control of the relative rates of
updating the aggregates for different groups, and if the user
requests that one group be stopped, the other groups will run
faster.
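The hash-based non-blocking group-by above can be sketched like this (illustrative Python; the per-group state layout is an assumption, not the paper's implementation):

```python
from collections import defaultdict

def online_group_avg(tuples):
    """Non-blocking GROUP BY: hash each incoming (group, value)
    tuple on its grouping column and immediately refine that
    group's running average, yielding the current estimates for
    all groups after every tuple."""
    state = defaultdict(lambda: [0, 0.0])   # group -> [count, sum]
    for group, value in tuples:
        s = state[group]
        s[0] += 1
        s[1] += value
        yield {g: tot / cnt for g, (cnt, tot) in state.items()}

stream = [("a", 1.0), ("b", 3.0), ("a", 2.0)]
for estimates in online_group_avg(stream):
    print(estimates)   # refined estimates after each tuple
```

Because the only per-tuple work is one hash probe and one aggregate update, each "event" stays short, which is the interactivity property the notes ask for.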
- Non-blocking join. Sort-merge and hybrid-hash join are
out, since the sort or
inner-relation-hash steps (respectively) are blocking ops. Use
nested-loop join instead.
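A pipelined nested-loop join can be sketched as a generator that emits each match as soon as it is found (illustrative Python, not the paper's code):

```python
def nested_loop_join(outer, inner, pred):
    """Pipelined nested-loop join: yields each matching pair
    immediately, so downstream running aggregates can refine
    without waiting for a sort or hash-build phase to finish."""
    for o in outer:
        for i in inner:
            if pred(o, i):
                yield (o, i)

R = [(1, "x"), (2, "y")]
S = [(1, "p"), (3, "q"), (2, "r")]
matches = nested_loop_join(R, S, lambda o, i: o[0] == i[0])
print(next(matches))   # first joined pair, available right away
```

Contrast with sort-merge or hybrid-hash: those must finish sorting (or building the inner hash table) before producing the first tuple, which blocks the running estimate.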
- Query optimization gotchas.
- Sorting can be avoided entirely
- Don't want to produce results that are ordered on
aggregation or grouping columns, so want to identify
"interestingly bad" (or "anti-interesting") orders (by
analogy with the optimizer's usual "interesting orders")
- Blocking sub-operations should be modelled by superlinear
cost functions, to make them highly unlikely to be chosen
- Plans that increase user control (eg by using index
striding) should be favored
- Best "batch execution" plan may sometimes be better than
any interactive plan, for short enough query
- Running confidence intervals.
- Can be conservative (final value inside the confidence
interval with probability at least p), large-sample based
on Central Limit Theorems (probability approximately p),
or deterministic (probability 1; not very useful until
nearly all records have been processed)
- Derivation given for how to compute running
CI for arithmetic mean of an arbitrary combination of
fields of the relation, i.e. "SELECT AVG(f(R.c1,...,R.cN))
FROM R WHERE..."
- Convergence to a narrow 99% confidence interval is fast
(tens of seconds for a query that takes hundreds of
seconds to run to completion)
- Implementation artifact: for high update rates, the Tk
script is not able to process updates as fast as the DB
engine can deliver them, so progress appears to be slower.
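The large-sample (CLT-based) running interval for an average can be sketched as below; the incremental mean/variance bookkeeping and the z value for a 99% interval are standard statistics, but this is a simplification of the paper's actual derivation for AVG(f(R.c1,...,R.cN)):

```python
import math

def running_ci(values, z=2.576):   # z ~ 2.576 for a 99% two-sided CI
    """After each value, yield (running_mean, half_width) where
    half_width = z * sample_stddev / sqrt(n), i.e. a CLT-based
    running confidence interval for the final average."""
    n, total, sumsq = 0, 0.0, 0.0
    for v in values:
        n += 1
        total += v
        sumsq += v * v
        mean = total / n
        if n > 1:
            var = (sumsq - n * mean * mean) / (n - 1)
            half = z * math.sqrt(max(var, 0.0) / n)
        else:
            half = float("inf")   # no variance estimate from one tuple
        yield mean, half

for mean, half in running_ci([1.0, 2.0, 3.0, 4.0, 5.0]):
    print(mean, half)   # half-width shrinks roughly as 1/sqrt(n)
```

This is the mechanism behind "minimum time to accuracy": the interval narrows as tuples stream in, long before the scan completes.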
- Future work and open questions.
- Nested queries? (in which results at outer level depend on
results at inner level)
- User interface for naive users
- usability and performance metrics for OLA not crisply
defined here or elsewhere
- Checkpointing for very long running queries
- tracking online queries: identifying the specific tuples
that caused some interesting behavior (eg a spike) in the
running aggregate
- Confidence interval formulae for additional aggregate
functions
- Incremental refinement for long queries; probably very applicable to
"online data mining" (e.g. looking for interesting stuff in WWW
data)
- UI issues may have some overlap with other techniques that
provide incremental refinement. (Can you think of any? Prefix
encodings for images and video maybe?)
- Running CI derivation applies only to a particular aggregation
function (arith mean). How hard is it to do for other functions?
Do you have to add code to the DB core for each new aggregation
function you want to compute using OLA, or can it be done
extensibly while preserving performance? (May be a subset of
the general problem of building extensible DB's)
- Design and approach are all mixed in with implementation
decisions. Should clearly separate design goals and implications
from effects resulting from the specific implementation.