T-Space at The University of Toronto Libraries

Please use this identifier to cite or link to this item: http://hdl.handle.net/1807/24370

Title: Queries, Data, and Statistics: Pick Two
Authors: Mishra, Chaitanya
Advisor: Koudas, Nick
Department: Computer Science
Keywords: Database Systems
Issue Date: 21-Apr-2010
Abstract: The query processor of a relational database system executes declarative queries on relational data using query evaluation plans. The cost of a query evaluation plan depends on various statistics defined by the query and the data. These statistics include intermediate and base table sizes, and data distributions on columns. In addition to being an important factor in query optimization, such statistics also influence various runtime properties of the query evaluation plan. This thesis explores the interactions between queries, data, and statistics in the query processor of a relational database system. Specifically, we consider problems where any two of the three (queries, data, and statistics) are provided, with the objective of instantiating the missing element of the triple such that the query, when executed on the data, satisfies the statistics on the associated subexpressions. We present multiple query processing problems that can be abstractly formulated in this manner.
The first contribution of this thesis is a monitoring framework for collecting and estimating statistics during query execution. We apply this framework to the problems of monitoring the progress of query execution and adaptively reoptimizing query execution plans. Our monitoring and adaptivity framework has a low overhead, while significantly reducing query execution times. This work demonstrates the feasibility and utility of overlaying statistics estimators on query evaluation plans.
Our next contribution is a framework for testing the performance of a query processor by generating targeted test queries and databases. We present techniques for data-aware query generation and query-aware data generation that satisfy test cases specifying statistical constraints. We formally analyze the hardness of the problems considered, and present systems that support best-effort semantics for targeted query and data generation.
The final contribution of this thesis is a set of techniques for designing queries for business intelligence applications that specify cardinality constraints on the result. We present an interactive query refinement framework that explicitly incorporates user feedback into query design, refining queries that return too many or too few answers. Each of these contributions is accompanied by a formal analysis of the problem and a detailed experimental evaluation of an associated system.
URI: http://hdl.handle.net/1807/24370
Appears in Collections: Doctoral; Department of Computer Science - Doctoral theses

Files in This Item:

File: Mishra_Chaitanya_201003_PhD_thesis.pdf
Size: 1.64 MB
Format: Adobe PDF

Items in T-Space are protected by copyright, with all rights reserved, unless otherwise indicated.