*Summary by: Steve Gribble and Armando Fox*

**One-line summary:**
Aggregates and group-by operators produce 0- or 1-dimensional answers;
cross-tabs produce 2-d answers; the data-cube produces N-dimensional
aggregates and answers by treating each of the N aggregation attributes as
a dimension of N-space.

- I cannot easily imagine the visualization techniques necessary to extend this to N>3 dimensions. Sure, color and motion could be used, but that is not at all intuitive for a human. 3-d shadows, slices, or extrusions could be used, but that gets cluttered and confusing (e.g. after many years of churning away at it, I still can't totally grok a hypercube. Maybe I'm just bad at this sort of a thing.)
- The section discussing algebraic vs. distributive vs. holistic wasn't clear about the exact cost of computing the power set. Give me the gory details!
- Can the power set be lazily generated somehow? Probably don't need all dimentionalities of aggregations, and could compute on demand.

- aggregation operators:
- 5 functions in SQL:
`count, sum, min, max, and avg`

; these return a single value. - group by operator creates a table of aggregates indexed
by some set of attributes (e.g.
`max(salary) group by department`

) - proprietary extensions to SQL:
`cumulative sum, running sum, running average`

(Red Brick systems).

- 5 functions in SQL:
- problems with group by:
- does not easily permit histograms (need to construct table-valued expression, then aggregate over the resulting table, and even then histogram does not necessarily have evenly spaced categories)
- roll-up totals and sub-totals for drill-downs.
Requires N unions for an N-dimensional roll-up.
- roll-up total: reports aggregate data at a
coarse level (e.g.
`average(sales) group by model`

), then at a finer level (e.g.`average(sales) group by model,year`

), etc. for successively finer levels - going from finer to coarser is rolling-up the
data, from coarser to finer is drilling-down
the data (
*I think - paper was ambiguous about which term corresponds to which direction.*)

- roll-up total: reports aggregate data at a
coarse level (e.g.
- cross-tabulations. imagine data:
Model Year Color sales Chevy 1994 black 50 Chevy 1994 white 40 Chevy 1995 black 85 Chevy 1995 white 115 Chevy 1994 1995 total(ALL) black *50**85*135 white *40**115*155 total(ALL) 90 200 290

- data-cube: simply a generalization of the cross-tab to N-dimensions (usually 3). Core cube is data. Extra planes are added for 2-d aggregations of the 3-d data. Extra edges are added for 1-d aggregations of the 3-d data. An extra vertex is added for 0-d aggregation of the 3-d data.
- end up with 2^n aggregates for the N-d data cube, one for each set in the power set that represents the cube.
- issues:
- to implement as a relation, need the power set of the
relation in question - use the special token ALL to
indicate fields that have been aggregated across
(e.g. if a tuple from the sales table is
`(Chevy, 1990, blue, 64)`

, then you could imagine a new field in the power set`(Chevy, ALL, blue, 182)`

which is the aggregation across all years of blue Chevys.) - ALL token greatly complicates SQL code. Think of ALL
as being a set-value, e.g. above, ALL would represent
multiple years
`{1990, 1991, 1992}`

. - How do you efficiently compute this power set (which is
really the data cube)? (If data
aggregation function has special properties, e.g. is
*Distributive*,*Algebraic*, or*Holistic*, can exploit those properties in the algorithm for computing the data cube.

- to implement as a relation, need the power set of the
relation in question - use the special token ALL to
indicate fields that have been aggregated across
(e.g. if a tuple from the sales table is