|Title: ||Data Quality By Design: A Goal-oriented Approach|
|Authors: ||Jiang, Lei|
|Advisor: ||Mylopoulos, John|
|Department: ||Computer Science|
|Keywords: ||Data Quality|
|Issue Date: ||13-Aug-2010|
|Abstract: ||A successful information system is one that meets its design goals. Expressing these goals and subsequently translating them into a working solution is a major challenge for information systems engineering. This thesis adopts concepts and techniques from goal-oriented (software) requirements engineering research for conceptual database design, with a focus on data quality issues. Based on a real-world case study, a goal-oriented process is proposed for database requirements analysis and modeling. It spans from the analysis of high-level stakeholder goals to the detailed design of a conceptual database schema. This process is then extended specifically to deal with data quality issues: data of low quality may be detected and corrected by performing various quality assurance activities; to support these activities, the schema needs to be revised to accommodate additional data requirements. The extended process therefore focuses on analyzing and modeling quality assurance data requirements.
A quality assurance activity supported by a revised schema may involve manual work and/or rely on automatic techniques, which often depend on the specification and enforcement of data quality (DQ) rules. To address the constraint aspect of conceptual database design, data quality rules are classified according to a number of domain- and application-independent properties. This classification can be used to guide rule designers and to facilitate the building of a rule repository. A quantitative framework is then proposed for measuring and comparing DQ rules according to one of these properties, effectiveness; this framework relies on the derivation of formulas that represent the effectiveness of DQ rules under different probabilistic assumptions. A semi-automatic approach for deriving these effectiveness formulas is also presented.|
|Appears in Collections:||Doctoral|
Department of Computer Science - Doctoral theses
This item is licensed under a Creative Commons License