The Clearinghouse: A Decentralized Agent for Locating Named Objects
in a Distributed Environment
Derek C. Oppen and Yogen K. Dalal
One-line summary:
The Clearinghouse is a decentralized set of processes that provides
an efficient but not terribly robust distributed name service.
Overview/Main Points
- Name Spaces: Name spaces can be described in terms of
directed graphs: a node is an object, and an edge is part of an
object's name. An absolute name space has a root node with one edge
leading to each other (unlabeled) node; all edges have unique labels.
A relative name space has unlabeled nodes and labeled edges, and
there is either zero or one uniquely labeled edge from any node to
any other. The concept of hierarchy is orthogonal to that of
flat (absolute) vs. relative: a hierarchical name space's graph
is partitioned into subgraphs; absolute naming must be used within
leaf subgraphs, but either relative or absolute naming can be used
between subgraphs. Flat spaces provide one-to-one mappings from names
to objects. Relative spaces provide one-to-many mappings (many nodes
may have the same name, so uniqueness requires qualification with a
source node). Aliases (alternate names) mean that many-to-one or
many-to-many mappings are possible.
- Clearinghouse names: Names are of the form L:D:O, where L
is a local name, D is a domain name, and O is the organization. All
objects (including clearinghouses and users) have such names. This
namespace is therefore absolute and hierarchical with 3 levels.
Usernames are made unique by providing a person's full name and a
"birthmark" to resolve conflicts. The clearinghouse maps
names into sets of properties; a property is a (PropertyName,
PropertyType, PropertyValue) tuple. Types are either items or
groups, where a group is a set of names.
- Client's perspective: Client apps talk to clearinghouse
stubs, which contain pointers to clearinghouse servers. Clients
may request the value of a binding, update the value of a binding,
create a new binding, or delete a binding. Each of these 4 operations
is atomic at a clearinghouse but not across the set of all clearinghouses;
temporary inconsistencies can therefore occur, so information should be
treated as a hint in the short term but as truth in the long term.
- Architecture: Each domain D in each organization O has one
or more clearinghouse servers; each such domain clearinghouse has a
unique name of the form (anything):O:CHServers. The set of all domain
clearinghouses for D:O is accessed with the name D:O:CHServers and
PropertyType "Distribution List". Each organization also has one or more
organization clearinghouses named (anything):CHServers:CHServers; such
organization clearinghouses contain the name and address of every
domain clearinghouse in the organization. Thus, O:CHServers is a
domain used for naming clearinghouses, whose domain clearinghouse is
the organization's clearinghouse. O:CHServers:CHServers maps into
the set of names of organization clearinghouses for O.
- Lookup: Client stubs contact known clearinghouses. If
the request is for an object in the contacted clearinghouse's domain,
that clearinghouse returns the answer. If not, it returns a list
of organization clearinghouses for the object. The stub then
contacts one of these to obtain a domain clearinghouse for the object,
and contacts that for the object itself. Lookup in the worst case
is then 3 transactions. Sideways shortcuts between domains can be
cached to avoid an up and down step. Domain clearinghouses can
also cache bindings for common objects not in their domains; cache
consistency is not a problem, since clients are responsible for
detecting incorrect values and deleting them!
- Update: Update requests are submitted through stubs to
a domain clearinghouse for the object, which forwards the update to
all sibling domain clearinghouses. Updates are only atomic
within a clearinghouse; out-of-order updates across siblings
are discovered and resolved with an unspecified algorithm.
- Authentication: Authentication is used, but the paper omits the
details.
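The three-step walk described under Lookup can be sketched in a few
lines of Python. This is only an illustration under stated assumptions,
not the paper's implementation: the class and field names, the example
names (Printer:Lab:Acme and so on), and the address string are all
hypothetical, and a "transaction" is modeled as a counter rather than a
network call. Caching and sideways shortcuts are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Domain = Tuple[str, str]  # (D, O)


@dataclass(frozen=True)
class Name:
    """A three-part name L:D:O (local, domain, organization)."""
    local: str
    domain: str
    org: str

    @classmethod
    def parse(cls, text: str) -> "Name":
        parts = text.split(":")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected L:D:O, got {text!r}")
        return cls(*parts)


@dataclass(eq=False)
class Clearinghouse:
    """Toy server; all fields are assumptions made for this sketch."""
    label: str
    # Domains this server answers for directly.
    serves: Set[Domain] = field(default_factory=set)
    # Name -> value (a single string stands in for the property set).
    bindings: Dict[Name, str] = field(default_factory=dict)
    # Organization -> its organization clearinghouses.
    org_servers: Dict[str, List["Clearinghouse"]] = field(default_factory=dict)
    # Held by organization clearinghouses: domain -> domain clearinghouses.
    domain_servers: Dict[Domain, List["Clearinghouse"]] = field(default_factory=dict)


def lookup(known: Clearinghouse, name: Name) -> Tuple[str, int]:
    """Return (value, transactions used) for the walk described above."""
    domain = (name.domain, name.org)
    transactions = 1  # 1: ask the clearinghouse the stub already knows
    if domain in known.serves:
        return known.bindings[name], transactions
    # The first reply carries a list of organization clearinghouses.
    org_list = known.org_servers[name.org]
    transactions += 1  # 2: ask one organization clearinghouse
    domain_list = org_list[0].domain_servers[domain]
    transactions += 1  # 3: ask one domain clearinghouse for the object
    return domain_list[0].bindings[name], transactions


printer = Name.parse("Printer:Lab:Acme")
lab = Clearinghouse("lab", serves={("Lab", "Acme")},
                    bindings={printer: "net-address-17"})
acme_org = Clearinghouse("acme-org", domain_servers={("Lab", "Acme"): [lab]})
sales = Clearinghouse("sales", serves={("Sales", "Acme")},
                      org_servers={"Acme": [acme_org]})

print(lookup(lab, printer))    # ('net-address-17', 1)
print(lookup(sales, printer))  # ('net-address-17', 3)
```

The second call shows the worst case of 3 transactions: the known
clearinghouse, then an organization clearinghouse, then a domain
clearinghouse.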
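The "hint in the short term, truth in the long term" behavior noted
under Client's perspective and Update can be shown with a small
simulation. It is a sketch under assumptions: the class, its outbox
queue, and the example name and address are hypothetical, and delivery
of forwarded updates is triggered by hand. Resolution of out-of-order
updates is deliberately not modeled, since the summary above says the
algorithm is unspecified.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple


class DomainClearinghouse:
    """Toy domain clearinghouse that forwards updates to its siblings."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.bindings: Dict[str, str] = {}
        self.siblings: List["DomainClearinghouse"] = []
        # Forwards that have been queued but have not yet arrived.
        self.outbox: Deque[Tuple["DomainClearinghouse", str, str]] = deque()

    def update(self, name: str, value: str) -> None:
        """Apply the update locally, then queue a forward per sibling."""
        self.bindings[name] = value
        for sibling in self.siblings:
            self.outbox.append((sibling, name, value))

    def deliver_pending(self) -> None:
        """Simulate the queued forwards reaching the siblings later."""
        while self.outbox:
            sibling, name, value = self.outbox.popleft()
            sibling.bindings[name] = value

    def lookup(self, name: str) -> Optional[str]:
        return self.bindings.get(name)


a, b = DomainClearinghouse("a"), DomainClearinghouse("b")
a.siblings, b.siblings = [b], [a]

a.update("Printer:Lab:Acme", "net-address-17")
stale = b.lookup("Printer:Lab:Acme")  # sibling has not seen the update yet
a.deliver_pending()
fresh = b.lookup("Printer:Lab:Acme")  # sibling has converged

print(stale, fresh)  # None net-address-17
```

Between update() and deliver_pending(), a client asking sibling b gets
an answer that differs from a's, which is the temporary inconsistency
the summary describes.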
Relevance
The clearinghouse is a somewhat fault-tolerant and relatively
efficient distributed naming scheme. This paper identifies
the issues, and solves some of them well. It certainly takes a
pragmatic rather than rigorous perspective on the problem.
Flaws
- Some issues are swept under the rug, such as detecting and
recovering from failures, network partitions, hostile or untrustworthy
servers, detecting and recovering from update or deletion conflicts
across name servers, providing cache consistency, etc.
- The paper introduces name spaces with more rigor than it needs,
then solves the actual problems without rigor.