Data Exchange, Data Integration, and Chase

Alan Nash, Alin Deutsch and Jeff REmmel
April 26, 2006

We study the problem of computing certain answers to a query over a target schema for a source instance under constraints which relate the source and target schemas. Prior work has shown that, for restricted constraints (source-to-target and target-to-target embedded dependencies) and unions of conjunctive queries, there is a special instance, known as a universal solution, such that running the query on it essentially yields the certain answers. Such a universal solution does not always exist for even slight extensions of these classes of constraints and queries. We show that there may be a finite set of instances, which we call a universal set solution, which suffices to compute the certain answers. We also introduce the notion of a k-universal set solution, which is sufficient to compute the certain answers to queries with at most k variables, even when no universal set solution exists. We show how to compute such universal and k-universal set solutions for universal-existential constraints and existential queries. The main algorithm for computing the universal set solution is an extended chase. We provide a completeness result for this chase and sufficient conditions for termination, which strictly extend the best previously known conditions (such as weak acyclicity). We also introduce a new kind of chase to compute k-universal set solutions.

How to view this document

The authors of these documents have submitted their reports to this technical report series for the purpose of non-commercial dissemination of scientific work. The reports are copyrighted by the authors, and their existence in electronic format does not imply that the authors have relinquished any rights. You may copy a report for scholarly, non-commercial purposes, such as research or instruction, provided that you agree to respect the author's copyright. For information concerning the use of this document for other than research or instructional purposes, contact the authors. Other information concerning this technical report series can be obtained from the Computer Science and Engineering Department at the University of California at San Diego,

[ Search ]

This server operates at UCSD Computer Science and Engineering.
Send email to