test Browse by Author Names Browse by Titles of Works Browse by Subjects of Works Browse by Issue Dates of Works

Advanced Search
& Collections
Issue Date   
Sign on to:   
Receive email
My Account
authorized users
Edit Profile   
About T-Space   

T-Space at The University of Toronto Libraries >
School of Graduate Studies - Theses >
Doctoral >

Please use this identifier to cite or link to this item: http://hdl.handle.net/1807/16794

Title: Accelerating Successive Approximation Algorithm Via Action Elimination
Authors: Jaber, Nasser M. A. Jr.
Advisor: Lee, Chi-Guhn
Department: Mechanical and Industrial Engineering
Keywords: successive approximation, action elimination, heuristic action elimination, action elimination for monotone policy MDPs
Issue Date: 20-Jan-2009
Abstract: This research is an effort to improve the performance of successive approximation algorithm with a prime aim of solving finite states and actions, infinite horizon, stationary, discrete and discounted Markov Decision Processes (MDPs). Successive approximation is a simple and commonly used method to solve MDPs. Successive approximation often appears to be intractable for solving large scale MDPs due to its computational complexity. Action elimination, one of the techniques used to accelerate solving MDPs, reduces the problem size through identifying and eliminating sub-optimal actions. In some cases successive approximation is terminated when all actions but one per state are eliminated. The bounds on value functions are the key element in action elimination. New terms (action gain, action relative gain and action cumulative relative gain) were introduced to construct tighter bounds on the value functions and to propose an improved action elimination algorithm. When span semi-norm is used, we show numerically that the actual convergence of successive approximation is faster than the known theoretical rate. The absence of easy-to-compute bounds on the actual convergence rate motivated the current research to try a heuristic action elimination algorithm. The heuristic utilizes an estimated convergence rate in the span semi-norm to speed up action elimination. The algorithm demonstrated exceptional performance in terms of solution optimality and savings in computational time. Certain types of structured Markov processes are known to have monotone optimal policy. Two special action elimination algorithms are proposed in this research to accelerate successive approximation for these types of MDPs. The first algorithm uses the state space partitioning and prioritize iterate values updating in a way that maximizes temporary elimination of sub-optimal actions based on the policy monotonicity. The second algorithm is an improved version that includes permanent action elimination to improve the performance of the algorithm. The performance of the proposed algorithms are assessed and compared to that of other algorithms. The proposed algorithms demonstrated outstanding performance in terms of number of iterations and omputational time to converge.
URI: http://hdl.handle.net/1807/16794
Appears in Collections:Doctoral
Department of Mechanical & Industrial Engineering - Doctoral theses

Files in This Item:

File Description SizeFormat
Jaber_Nasser_M_A_200811_PhD_thesis.pdf2.73 MBAdobe PDF

Items in T-Space are protected by copyright, with all rights reserved, unless otherwise indicated.