Back to index

long title

authors

Summary by Armando Fox and Steve Gribble

One-line summary:

Overview/Main Points

Goal: allow continuous monitoring of incrementally refined results of doing a long-running aggregation query (query in which some statistics are computed over a selection, e.g. "select avg(R.c1*R.c2) from R where ..."). Related work:
- online analytical processing (OLAP): also support super-aggregation and sub-aggregation
- "fast-first" query proc: attempt to quickly return first few tuples
- APPROXIMATE: defines approximate relational algebra which it uses to process standard queries in an iteratively refining manner.
Design goals:
- Continuous observation in UI: regular updates at a selectable rate (time granularity)
- Control of fairness: if query is computing several running aggregates, user should be able to control the relative rates at which they are updated
- Minimum time to accuracy: should converge to a narrow confidence interval quickly
- Minimum time to completion: less important; authors conjecture that most long running online aggregations will be halted by user long before completion.
Naive implementation: Ingres allows arbitrary functions in the query, e.g. "SELECT running_avg(S.grade) WHERE....", and you define running_avg to be a function with internal state that returns the next refinement of a running aggregate. But these functions can't be used with SQL "group by", aren't well integrated into the query language, and don't perform very well. So we modify the DBMS engine itself....
Problems to be solved in DB engine modifications:
- Data must be randomly accessed, in order for running aggregate to be statistically meaningful
- If query specifies "group by", need non-blocking way to do the grouping (think of an event loop: each "event" of incrementally refining one of the aggregates has to be short, to keep behavior and control interactive)
- Need non-blocking join algorithms (same reasoning)
- Query optimization metrics may change
Random access to data.
- Heap scan: records are usually stored in unspecified order, though sometimes some residual dependence on (e.g.) insertion order. This is usually the best random access method.
- Index scan: only if the indexed attributes are not the ones being aggregated.
- Random sampling from indices: works, but requires repeated probes to random hash buckets. Bleah.
Non-blocking "group by":
- hash input relation on grouping columns; then as soon as a tuple is read from input, can immediately update aggregate for its group.
- Index striding: round-robin fetches of tuples that satisfy the "group by" constraints. To do this, given a B-tree index on grouping columns, scan index looking for distinct keys placing each in its own bucket, and once the complete set of distinct keys has been found, round-robin tuples from each bucket. Can also weight the round-robin.
- Index striding allows control of relative rates of updating the aggregates for different groups, and if user requests that one group be stopped, other groups will run faster.
Non-blocking join. Sort-merge and hybrid-hash join are out, since the sort or inner-relation-hash steps (respectively) are blocking ops. Use nested-loop join instead.
Query optimization gotchas.
- Sorting can be avoided entirely
- Don't want to produce results that are ordered on aggregation or grouping columns, so want to identify "interestingly bad" (or "anti-interesting") orders (using Selinger's terminology)
- Blocking sub-operations should be modelled by superlinear cost functions, to make them highly unlikely to be chosen
- Plans that increase user control (eg by using index striding) should be favored
- Best "batch execution" plan may sometimes be better than any interactive plan, for short enough query
Running confidence intervals.
- Can be conservative (final value inside confidence interval with probability at least p), large-sample based on Central Limit Thms (probability p), or deterministic (probability 1; not very useful unless number of records is huge).
- Derivation given for how to compute running CI for arithmetic mean of an arbitrary combination of fields of the relation, i.e. "SELECT AVG(f(R.c1,...,R.cN)) FROM R WHERE..."
Performance.
- Convergence to narrow 99% confidence interval is fast (tesn of seconds for a query that takes hundreds of seconds).
- Implementation artifact: for high update rates, the Tk script is not able to process updates as fast as the DB engine can deliver them, so progress appears to be slower.
Future work and open questions.
- Nested queries? (in which results at outer level depend on results at inner level)
- User interface for naive users
- usability and performance metrics for OLA not crisply defined here or elsewhere
- Checkpointing for very long running queries
- tracking online queries: identifying the specific tuples that caused some interesting behavior (eg spike) in the running aggregate
- Confidence interval formulae for additional aggregate functions

Relevance

Incremental refinement for long queries; probably very applicable to "online data mining" (e.g. looking for interesting stuff in WWW traffic).
UI issues may have some overlap with other techniques that provide incremental refinement. (Can you think of any? Prefix encodings for images and video maybe?)

Flaws

Running CI derivation applies only to a particular aggregation function (arith mean). How hard is it to do for other functions? Do you have to add code to the DB core for each new aggregation function you want to compute using OLA, or can it be done extensibly while preserving performance? (May be a subset of the general problem of building extensible DB's)
Design and approach are all mixed in with implementation decisions. Should clearly separate design goals and implications from effects resulting from the specific implementation.

Back to index