Off-campus University of Miami users: To download campus access dissertations, please use the following link to log into our proxy server with your University of Miami CaneID and Password.

Non-University of Miami users: Please talk to your librarian about requesting this dissertation through interlibrary loan.

Publication Date

2008-06-09

Availability

UM campus only

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PHD)

Department

Electrical and Computer Engineering (Engineering)

Date of Defense

2008-05-01

First Committee Member

Kamal Premaratne - Committee Chair

Second Committee Member

Michael Scordilis - Committee Member

Third Committee Member

Mei-Ling Shyu - Committee Member

Fourth Committee Member

Manohar N. Murthi - Committee Member

Fifth Committee Member

Subramanian Ramakrishnan - Outside Committee Member

Abstract

Management of data imprecision has become increasingly important, especially as advances in technology enable applications to collect and store huge amounts of data from multiple sources. Data collected in such applications involve a large number of variables and various types of data imperfections. These data, when used in knowledge discovery applications, require the following: 1) computationally efficient algorithms that work quickly with limited resources, 2) an effective methodology for modeling data imperfections, and 3) procedures for enabling knowledge discovery and for quantifying and propagating partial or incomplete knowledge throughout the decision-making process. Bayesian Networks (BNs) provide a convenient framework for modeling these applications probabilistically, enabling a compact representation of the joint probability distribution involving large numbers of variables. BNs also form the foundation for a number of computationally efficient algorithms for making inferences. The underlying probabilistic approach, however, is not sufficiently capable of handling the wider range of data imperfections that may appear in many new applications (e.g., medical data). Dempster-Shafer (DS) theory, on the other hand, provides a strong framework for modeling a broader range of data imperfections. However, it must overcome the challenge of a potentially enormous computational burden. In this dissertation, we introduce the joint Dirichlet body of evidence (BoE), a certain mass assignment in the DS theoretic framework that reduces the computational complexity while enabling one to model many common types of data imperfections. We first use this Dirichlet BoE model to enhance the performance of the EM algorithm used in learning BN parameters from data with missing values. To form a framework for reasoning with the Dirichlet BoE, the DS theoretic notions of conditionals, independence, and conditional independence are revisited.
These notions are then used to develop the DS-BN, a BN-like graphical model in the DS theoretic framework that enables a compact representation of the joint Dirichlet BoE. We also show how one may use the DS-BN in different types of reasoning tasks. A local message passing scheme is developed for efficient propagation of evidence in the DS-BN. We also extend the use of the joint Dirichlet BoE to Markov models and hidden Markov models to address the uncertainty arising from inadequate training data. Finally, we present the results of various experiments carried out on synthetically generated data sets as well as data sets from medical applications.

Keywords

Evidential Network; Dependency Network; Dempster Shafer Theory; Bayesian Network; Data Imperfections; Partially Observed Data
