Parallel Database Systems: The Future of High Performance Database Systems
David DeWitt and Jim Gray
One-line summary:
The NOW argument applied to parallel database systems.
Overview/Main Points
- customized "database machine" hardware was a fad that
didn't work out - no economy of scale could be realized.
Similarly, high performance parallel systems (shared memory or
shared storage) didn't work out, partly for economic reasons and
partly because of the interference that sharing of resources
induced.
- shared memory/shared disk: data affinity introduced as
coarse-grained partitioning strategy - authors point out this is
first step towards shared-nothing model (i.e. essentially a NOW).
- parallel metrics:
- speedup:
small_system_elapsed_time / big_system_elapsed_time,
both measured on the same problem.
throw more resources at the problem, how much faster is
it solved?
- scaleup:
small_system_elapsed_time_on_small_problem /
big_system_elapsed_time_on_big_problem. Like TPC
benchmarks - difficulty of problem scales up with
amount of resources you throw at it.
- Shared-nothing architectures: if done right, can achieve
near-perfect linear speedups and scaleups on complex relational
queries.
- relational queries are ideal for parallelization, as queries are
relational operators (at an abstract, non-procedural logical
level) applied to large collections of data. Execute as a dataflow
graph, using:
- "pipelined parallelism" (which is doing many
stages of a query in pipelined fashion), and
- "partitioned
pipelined parallelism" (which is doing many such pipelines
at once on different machines - partitioned either over operator
space or over data space).
- data partitioning - use hash, round-robin, or range partitioning.
- brief mention of the state of the art in 1992: Teradata,
Tandem NonStop SQL, Gamma, Bubba, ...
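The two metrics and the three partitioning strategies listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the paper: the function names, the in-memory lists standing in for disks or nodes, and the `boundaries` parameter are all hypothetical.

```python
from bisect import bisect_right


def speedup(small_system_time, big_system_time):
    """Same problem on both systems. With N times the hardware,
    linear speedup would give a value of N."""
    return small_system_time / big_system_time


def scaleup(small_time_small_problem, big_time_big_problem):
    """Problem size grows along with the system.
    Linear scaleup would give a value of 1.0."""
    return small_time_small_problem / big_time_big_problem


def round_robin_partition(rows, n):
    """Send row i to partition i mod n (even spread, no key needed)."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_partition(rows, n, key):
    """Send each row to the partition chosen by hashing its key.
    Note: Python's built-in hash() is randomized per process for
    strings, so a real system would use a stable hash function."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(key(row)) % n].append(row)
    return parts


def range_partition(rows, boundaries, key):
    """Send each row to the partition whose key range contains it.
    `boundaries` is a sorted list of split points (an assumption of
    this sketch); it yields len(boundaries) + 1 partitions."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        parts[bisect_right(boundaries, key(row))].append(row)
    return parts
```

For example, `range_partition([1, 10, 20, 35], [10, 30], key=lambda r: r)` places 1 in the first partition, 10 and 20 in the second, and 35 in the third, while `speedup(100.0, 25.0)` gives 4.0, which would be linear speedup if the big system had four times the hardware.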
Relevance
The NOW argument applied to parallel databases. Not distributed databases,
but parallel, so get sysadmin and LAN/SAN benefits. Good argument, makes
sense for databases and other systems in general.
Flaws
- this paper seemed targeted at a non-tech audience; more PR than
paper
- paper is a series of assertions, not backed up by data,
measurements, experiments, or historical evidence.
- examined tech trends leading to state of the world in 1992 - how
about tech trends leading into the future? Will this argument
still be true in 2000?