The TSIMMIS Approach to Mediation: Data Models and Languages
Hector Garcia-Molina, Yannis Papakonstantinou, et al.
One-line summary: Support for integrating "related but
not-quite-the-same" information sources. Desired results are expressed as
queries over semistructured self-describing data; query normalization can be used
to compute the answers to queries that are not directly supported; wrappers around sources
are used for homogeneity.
Overview/Main Points
- Goal: integrate query-like, service-like, etc. info resources, and provide a query
interface (note, not a service I/F), by wrapping each of the sources. A query
can be made against a wrapped source or another mediator, so the potential for composition
seems to exist, though this paper doesn't go into any detail about it.
- Wrapper type system: self-describing objects, which may be atomic or set-valued.
The atomic types are few and simple. Webc's type system is even simpler.
- The logic-based (datalog-like) Mediator Specification Language expresses queries that
capture the structure of the data, although no schema is imposed a priori. Lorel is
the end-user query language.
- TSIMMIS is designed to integrate "information from related but not-quite-the-same
information sources." (Direct quote from beginning of sec. 5)
- A TSIMMIS wrapper decides whether a query is directly supported or can be indirectly
supported by filtering the result of a supported query; if neither is the case, query
normalization can sometimes be used to generate strategies in which a query that is not
directly supported is answered by performing many queries that are supported. For example,
if the only query supported is "parents of X", we can find X's grandparents by running
two sets of queries (first for X's parents, then for the parents of each of those parents)
and taking the union of the second set's results. The strategies produced may be
expensive to find (NP-hard in general?) and expensive to execute.
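
The grandparents example can be sketched in a few lines of Python. This is only an illustration of the idea, not TSIMMIS code: the PARENTS table, parents_of, and grandparents_of are hypothetical names, and the "source" is an in-memory dictionary standing in for a wrapped source whose only supported query is "parents of X".

```python
# Hypothetical wrapped source (an assumption for illustration):
# the only query it supports is "parents of X".
PARENTS = {
    "carol": {"alice", "bob"},
    "alice": {"dave", "erin"},
    "bob": {"frank"},
}


def parents_of(person):
    """The single supported query: return the set of known parents."""
    return set(PARENTS.get(person, set()))


def grandparents_of(person):
    """Answer an unsupported query by composing supported ones."""
    result = set()
    # First set of queries: one call to find the parents of `person`.
    for parent in parents_of(person):
        # Second set of queries: one call per parent; union the answers.
        result |= parents_of(parent)
    return result


print(sorted(grandparents_of("carol")))  # ['dave', 'erin', 'frank']
```

Even in this toy case the strategy issues one source query plus one more per parent, which hints at why generated strategies can be expensive to execute.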
Relevance
They emphasize what we call "semantically similar" sources: the desired
result is expressed as a query in Lorel, and the necessary "compositions" are
computed using query normalization. The fact that the sources are semantically
similar to begin with is the reason a query language is a natural way to express
"composition" implicitly. In our case, we don't really have a language
(yet) for expressing the desired result, but we want to express "composition" of
semantically distinct sources using an explicit composition language (similar to CLAM?).
Because the sources (services) are semantically distinct, the tasks we want to
perform through automatic composition cannot be gracefully expressed in terms of a query.