
Please use this identifier to cite or link to this item: http://hdl.handle.net/1807/19194

Title: Cumulative Distribution Networks: Inference, Estimation and Applications of Graphical Models for Cumulative Distribution Functions
Authors: Huang, Jim C.
Advisor: Frey, Brendan J.
Department: Electrical and Computer Engineering
Keywords: Graphical models; Cumulative distribution function; Inference; Message-passing; Learning to rank; Information retrieval; Computational biology; Bioinformatics; Genomics; Extreme value distribution; microRNA; Gene regulation; Copula
Issue Date: 1-Mar-2010
Abstract: This thesis presents a class of graphical models for directly representing the joint cumulative distribution function (CDF) of many random variables, called cumulative distribution networks (CDNs). Unlike graphical models for probability density and mass functions, in a CDN, the marginal probabilities for any subset of variables are obtained by computing limits of functions in the model. We will show that the conditional independence properties in a CDN are distinct from the conditional independence properties of directed, undirected and factor graph models, but include the conditional independence properties of bidirected graphical models. As a result, CDNs are a parameterization for bidirected models that allows us to represent complex statistical dependence relationships between observable variables. We will provide a method for constructing a factor graph model with additional latent variables for which graph separation of variables in the corresponding CDN implies conditional independence of the separated variables in both the CDN and in the factor graph with the latent variables marginalized out. This will then allow us to construct multivariate extreme value distributions for which both a CDN and a corresponding factor graph representation exist. In order to perform inference in such graphs, we describe the 'derivative-sum-product' (DSP) message-passing algorithm where messages correspond to derivatives of the joint cumulative distribution function. We will then apply CDNs to the problem of learning to rank, or estimating parametric models for ranking, where CDNs provide a natural means with which to model multivariate probabilities over ordinal variables such as pairwise preferences. We will show that many previous probability models for rank data, such as the Bradley-Terry and Plackett-Luce models, can be viewed as particular types of CDN. Applications of CDNs will be described for the problems of ranking players in multiplayer team-based games, document retrieval and discovering regulatory sequences in computational biology using the above methods for inference and estimation of CDNs.
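Illustration (not taken from the thesis): the abstract's statement that marginals are obtained by taking limits can be sketched with a toy CDN. The chain structure x1 - x2 - x3, the choice of Gumbel's bivariate logistic CDF for each function, and all function names below are assumptions made for this example only.

```python
import math


def bivariate_logistic_cdf(x: float, y: float) -> float:
    """Gumbel's bivariate logistic CDF, 1 / (1 + exp(-x) + exp(-y)).

    Chosen here only as a convenient closed-form bivariate CDF.
    """
    return 1.0 / (1.0 + math.exp(-x) + math.exp(-y))


def joint_cdf(x1: float, x2: float, x3: float) -> float:
    """Toy CDN: the joint CDF is a product of functions, one per edge."""
    return bivariate_logistic_cdf(x1, x2) * bivariate_logistic_cdf(x2, x3)


def marginal_cdf_x1_x3(x1: float, x3: float) -> float:
    """Marginalize x2 by sending it to +infinity (no integration needed)."""
    return joint_cdf(x1, math.inf, x3)


if __name__ == "__main__":
    x1, x3 = 0.3, -1.2
    joint_13 = marginal_cdf_x1_x3(x1, x3)
    product = joint_cdf(x1, math.inf, math.inf) * joint_cdf(math.inf, math.inf, x3)
    # x1 and x3 share no function, so their marginal CDF factorizes.
    print(joint_13, product)
```

In this toy model x1 and x3 do not appear in a common function, so the marginal CDF of (x1, x3) factorizes into the product of the two univariate marginals, which is the kind of marginal independence associated with bidirected graphical models.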
URI: http://hdl.handle.net/1807/19194
Appears in Collections: Doctoral; The Edward S. Rogers Sr. Department of Electrical & Computer Engineering - Doctoral theses

Files in This Item:

File: Huang_Jim_C_200911_PhD_thesis.pdf
Size: 4.32 MB
Format: Adobe PDF

This item is licensed under a Creative Commons License

Items in T-Space are protected by copyright, with all rights reserved, unless otherwise indicated.