**One-line summary:**
R-trees (index structure optimized for spatial searching of "geographic"
data) are B-trees in which interior-node keys are k-dimensional
rectangloids rather than scalars, and the comparison test is "overlaps"
rather than linear range comparison.

- Each leaf node's rectangle is the smallest rectangle that encloses the leaf's data object.
- Each interior node's rectangle is the smallest one that encloses all the rectangles in its children.
- Search comparison test is "overlapping", so more than one subtree under a node may need to be visited during a search; so worst-case performance is not log but linear in number of keys.
- Insertion:
- add leaf to the parent whose rectangle must be enlarged by the least amount (break ties using area) to accommodate it
- go up the tree adjusting the enclosing rectangles
- when nodes must be split (overfull): see below

- Deletion: may result in deletion/merge of nodes. Underfull nodes are merged with whichever sibling will have its area increased least.
- Node merging is better than redistributing the orphans to many
siblings because:
- the insert procedure can be called to do orphan insertion, and the nodes it will visit are likely to have been visited during preceding search (that caused the merge);
- reinsertion preserves the spatial structure ("spatial balance") of the tree, preventing deterioriation that might occur if a given child is forced to never be reparented.

- Node splitting: want to minimize chance that
*both*new nodes will need to be visited on future searches => minimize total area of the two covering rectangles. - Exhaustive alg. must examine exponential number of choices;
authors propose 2 heuristics:
- repeatedly pick 2
entries that would waste the most space if they were put in the
same group (examines O(M
^{2}) pairs) - examine entries with highest low side and lowest high side, and pick pair with largest separation (normalized along corresponding "width" dimension), which examines O(M) pairs.

- repeatedly pick 2
entries that would waste the most space if they were put in the
same group (examines O(M
- Experimental observations
- node splitting is responsible for only a small part of insert cost in practice;
- Quadratic algorithm obviously has higher overhead but leads to more balanced ("geographically tighter") R-trees;
- R-trees are good at quickly narrowing geographic search.

- No theoretical analysis of worst/expected case performance of insert/delete (update is just delete and reinsert)
- No analysis of expected performance of heuristics compared to optimal alg.
- Page sizes tried up to 2KB; would quadratic split algorithm be much worse on today's 8KB-16KB or larger pages?