T-Space at The University of Toronto Libraries
School of Graduate Studies - Theses
Title: Accelerating Successive Approximation Algorithm Via Action Elimination
Authors: Jaber, Nasser M. A. Jr.
Advisor: Lee, Chi-Guhn
Department: Mechanical and Industrial Engineering
Keywords: successive approximation, action elimination, heuristic action elimination, action elimination for monotone policy MDPs
Issue Date: 20-Jan-2009
Abstract: This research is an effort to improve the performance of the successive approximation algorithm, with the prime aim of solving finite-state, finite-action, infinite-horizon, stationary, discrete, discounted Markov Decision Processes (MDPs). Successive approximation is a simple and commonly used method for solving MDPs, but it often proves intractable for large-scale MDPs because of its computational complexity. Action elimination, one of the techniques used to accelerate the solution of MDPs, reduces the problem size by identifying and eliminating sub-optimal actions. In some cases successive approximation can be terminated as soon as all actions but one per state have been eliminated.

Bounds on the value functions are the key element in action elimination. New terms (action gain, action relative gain and action cumulative relative gain) are introduced to construct tighter bounds on the value functions and to propose an improved action elimination algorithm.

When the span semi-norm is used, we show numerically that the actual convergence of successive approximation is faster than the known theoretical rate. The absence of easy-to-compute bounds on the actual convergence rate motivated the current research to try a heuristic action elimination algorithm. The heuristic utilizes an estimated convergence rate in the span semi-norm to speed up action elimination. The algorithm demonstrated exceptional performance in terms of solution optimality and savings in computational time.

Certain types of structured MDPs are known to have a monotone optimal policy. Two special action elimination algorithms are proposed in this research to accelerate successive approximation for these types of MDPs. The first algorithm partitions the state space and prioritizes the updating of iterate values in a way that maximizes the temporary elimination of sub-optimal actions based on the monotonicity of the policy. The second algorithm is an improved version that adds permanent action elimination. The performance of the proposed algorithms is assessed and compared to that of other algorithms; the proposed algorithms demonstrated outstanding performance in terms of the number of iterations and the computational time to converge.
Appears in Collections: Doctoral
Department of Mechanical & Industrial Engineering - Doctoral theses
Items in T-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
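The core machinery the abstract describes, successive approximation (value iteration) combined with bound-based action elimination, can be sketched as follows. This is a generic MacQueen-style elimination test, not the thesis's improved bounds (action gain, action relative gain, action cumulative relative gain), and the tiny two-state, three-action MDP and all names are made up for illustration.

```python
# Sketch: successive approximation with a conservative action elimination
# test for a discounted MDP. Illustrative only; not the thesis's algorithm.

GAMMA = 0.9  # discount factor

# R[s][a]: immediate reward; P[s][a][j]: transition probability to state j.
R = [[5.0, 10.0, 1.0],
     [2.0, 1.0, 3.0]]
P = [[[0.8, 0.2], [0.2, 0.8], [0.5, 0.5]],
     [[0.5, 0.5], [0.9, 0.1], [0.3, 0.7]]]

n_states, n_actions = 2, 3
active = [set(range(n_actions)) for _ in range(n_states)]  # surviving actions
V = [0.0] * n_states

for iteration in range(1000):
    # One successive-approximation backup.
    Q = [[R[s][a] + GAMMA * sum(P[s][a][j] * V[j] for j in range(n_states))
          for a in range(n_actions)] for s in range(n_states)]
    V_new = [max(Q[s][a] for a in active[s]) for s in range(n_states)]

    delta = [V_new[s] - V[s] for s in range(n_states)]
    span = max(delta) - min(delta)  # span semi-norm of successive differences

    # Eliminate action a at state s when an upper bound on Q*(s, a) falls
    # below a lower bound on V*(s), i.e. when
    #   V_new(s) - Q(s, a) > gamma / (1 - gamma) * span(delta).
    slack = GAMMA / (1.0 - GAMMA) * span
    for s in range(n_states):
        active[s] = {a for a in active[s] if V_new[s] - Q[s][a] <= slack}

    V = V_new
    # Terminate once every state keeps a single action, or on convergence.
    if all(len(acts) == 1 for acts in active) or span < 1e-10:
        break

policy = [max(acts, key=lambda a: Q[s][a]) for s, acts in enumerate(active)]
print("iterations:", iteration + 1)
print("surviving actions per state:", [sorted(acts) for acts in active])
print("greedy policy:", policy)
```

Because the elimination test only discards actions that are provably sub-optimal under the current bounds, the iterate values and the final greedy policy are unaffected; the payoff is that fewer Q-values need to be maximized over as the run progresses, which is the effect the abstract's tighter bounds are designed to amplify.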