Dimitris Margaritis

Learning Bayesian Network Model Structure from Data

Degree Type: Ph.D. in Computer Science
Advisor(s): Sebastian Thrun
Graduated: May 2003

Keywords: Bayesian networks, Bayesian network structure learning, continuous variable independence test, Markov blanket, causal discovery, DataCube approximation, database count queries

Abstract

In this thesis I address the important problem of determining the structure of directed statistical models, using the widely used class of Bayesian network models as a concrete vehicle for my ideas. The structure of a Bayesian network represents a set of conditional independence relations that hold in the domain. Learning the structure of the Bayesian network model that represents a domain can reveal insights into its underlying causal structure. Moreover, it can be used to predict quantities that are difficult, expensive, or unethical to measure -- such as the probability of lung cancer -- based on other quantities that are easier to obtain. The contributions of this thesis include (a) an algorithm for determining the structure of a Bayesian network model from statistical independence statements; (b) a statistical independence test for continuous variables; and (c) a practical application of structure learning to a decision support problem, where a model learned from the database -- most importantly its structure -- is used in lieu of the database to yield fast approximate answers to count queries, surpassing in certain aspects other state-of-the-art approaches to the same problem.
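As a rough illustration of contribution (c), the sketch below shows how a learned model could stand in for the database when answering a count query. It is not the algorithm from the thesis: the toy table, the column names A, B, C, the assumed structure (A -> B, with C independent of both), and the helper names approx_count and exact_count are all invented for this example.

```python
from collections import Counter
from itertools import product

# Toy table with three binary columns (A, B, C); rows are invented for illustration.
rows = [
    (0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 0, 1),
    (1, 1, 0), (1, 1, 1), (1, 0, 0), (1, 1, 1),
]
N = len(rows)

# Assumed (hypothetical) structure: A -> B, with C independent of A and B.
# The model keeps only these small summaries, not the rows themselves.
count_a = Counter(r[0] for r in rows)
count_ab = Counter((r[0], r[1]) for r in rows)
count_c = Counter(r[2] for r in rows)


def approx_count(a, b, c):
    """Estimate the number of rows with A=a, B=b, C=c from the model alone."""
    p_a = count_a[a] / N                          # P(A=a)
    p_b_given_a = count_ab[(a, b)] / count_a[a]   # P(B=b | A=a)
    p_c = count_c[c] / N                          # P(C=c)
    return N * p_a * p_b_given_a * p_c


def exact_count(a, b, c):
    """Scan the table; this is the query the model is meant to avoid."""
    return sum(1 for r in rows if r == (a, b, c))


# The estimates over all value combinations add up to the table size.
total = sum(approx_count(a, b, c) for a, b, c in product((0, 1), repeat=3))

print(approx_count(0, 0, 1), exact_count(0, 0, 1), total)
```

The estimate differs from the exact count wherever the assumed independences do not hold exactly in the data, which is why the answers are approximate; the benefit is that the model is far smaller than the table it summarizes.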

Thesis Committee

Sebastian Thrun (Chair)
Christos Faloutsos
Andrew W. Moore
Peter Spirtes
Gregory F. Cooper (University of Pittsburgh)

Randy Bryant, Head, Computer Science Department
James Morris, Dean, School of Computer Science

Thesis Document

CMU-CS-03-153.pdf (1.05 MB) (126 pages)

