test Browse by Author Names Browse by Titles of Works Browse by Subjects of Works Browse by Issue Dates of Works

Advanced Search
& Collections
Issue Date   
Sign on to:   
Receive email
My Account
authorized users
Edit Profile   
About T-Space   

T-Space at The University of Toronto Libraries >
School of Graduate Studies - Theses >
Doctoral >

Please use this identifier to cite or link to this item: http://hdl.handle.net/1807/24839

Title: Visual Object Recognition Using Generative Models of Images
Authors: Nair, Vinod
Advisor: Hinton, Geoffrey
Department: Computer Science
Keywords: machine learning
computer vision
Issue Date: 1-Sep-2010
Abstract: Visual object recognition is one of the key human capabilities that we would like machines to have. The problem is the following: given an image of an object (e.g. someone's face), predict its label (e.g. that person's name) from a set of possible object labels. The predominant approach to solving the recognition problem has been to learn a discriminative model, i.e. a model of the conditional probability $P(l|v)$ over possible object labels $l$ given an image $v$. Here we consider an alternative class of models, broadly referred to as \emph{generative models}, that learns the latent structure of the image so as to explain how it was generated. This is in contrast to discriminative models, which dedicate their parameters exclusively to representing the conditional distribution $P(l|v)$. Making finer distinctions among generative models, we consider a supervised generative model of the joint distribution $P(v,l)$ over image-label pairs, an unsupervised generative model of the distribution $P(v)$ over images alone, and an unsupervised \emph{reconstructive} model, which includes models such as autoencoders that can reconstruct a given image, but do not define a proper distribution over images. The goal of this thesis is to empirically demonstrate various ways of using these models for object recognition. Its main conclusion is that such models are not only useful for recognition, but can even outperform purely discriminative models on difficult recognition tasks. We explore four types of applications of generative/reconstructive models for recognition: 1) incorporating complex domain knowledge into the learning by inverting a synthesis model, 2) using the latent image representations of generative/reconstructive models for recognition, 3) optimizing a hybrid generative-discriminative loss function, and 4) creating additional synthetic data for training more accurate discriminative models. Taken together, the results for these applications support the idea that generative/reconstructive models and unsupervised learning have a key role to play in building object recognition systems.
URI: http://hdl.handle.net/1807/24839
Appears in Collections:Doctoral
Department of Computer Science - Doctoral theses

Files in This Item:

File Description SizeFormat
Nair_Vinod_G_201006_PhD_thesis.pdf5.08 MBAdobe PDF

Items in T-Space are protected by copyright, with all rights reserved, unless otherwise indicated.