Abstract
This work considers the problem of optimal query processing
in heterogeneous and distributed database systems. A global query sub-
mitted at a local site is decomposed into a number of queries processed
at the remote sites. The partial results returned by the queries are
integrated at the local site. The paper addresses the problem of optimally
scheduling the queries so as to minimize the time spent on integrating
the partial results into the final answer. A global data model defined in
this work provides a unified view of the heterogeneous data structures
located at the remote sites, and a system of operations is defined to
express the complex data integration procedures. This work shows that
transforming an entirely simultaneous query processing strategy
into a hybrid (simultaneous/sequential) strategy may in some cases
lead to significantly faster data integration. We show how to detect such
cases, what conditions must be satisfied to transform the schedules, and
how to transform the schedules into more efficient ones.