|Title: ||Modeling and Optimization of Delay and Power for Key Components of Modern High-performance Processors|
|Authors: ||Safi, Elham|
|Advisor: ||Moshovos, Andreas|
|Department: ||Electrical and Computer Engineering|
|Keywords: ||Processor design|
|Issue Date: ||13-Apr-2010|
|Abstract: ||In designing a new processor, computer architects consider a myriad of possible organizations and designs to decide which best meets the performance, power, and cost constraints of each particular processor. To identify practical designs, architects need insight into the physical-level characteristics (delay, power, and area) of the various components of modern processors as implemented in recent fabrication technologies. During the early stages of design exploration, however, developing physical-level implementations for the various design options (often on the order of thousands) is impractical or undesirable due to time and/or cost constraints. In lieu of actual measurements, analytical and/or empirical models can offer reasonable estimates of these physical-level characteristics. However, existing models tend to be outdated for three reasons: (i) they were developed based on old circuits in old fabrication technologies; (ii) the high-level designs of the components have evolved, and older designs may no longer be representative; and (iii) the overall architecture of processors has changed significantly, and new components for which no models exist have been introduced or are being considered.
This thesis studies three key components of modern high-performance processors: Counting Bloom Filters (CBFs), Checkpointed Register Alias Tables (RATs), and Compacted Matrix Schedulers (CMSs). CBFs optimize membership tests (e.g., whether a block is cached). The RAT and the CMS increase the opportunities for exploiting instruction-level parallelism: the RAT is the core of the renaming stage, and the CMS is an implementation of the instruction scheduler. Physical-level studies or models for these components have been limited or non-existent. In addition to investigating these components at the physical level, this thesis (i) proposes a novel speed- and energy-efficient CBF implementation; (ii) studies how the number of RAT checkpoints affects RAT latency and energy, as well as overall processor performance; and (iii) studies the CMS and its accompanying logic at the physical level. This thesis also develops empirical and analytical latency and energy models that can be adapted for newer fabrication technologies. Additionally, this thesis proposes physical-level latency and energy optimizations for these components, motivated by design inefficiencies exposed during the physical-level study phase.|
|Appears in Collections:||Doctoral|
The Edward S. Rogers Sr. Department of Electrical & Computer Engineering - Doctoral theses
This item is licensed under a Creative Commons License