Publication Date
2012-04-21
Availability
Open access
Embargo Period
2012-04-21
Degree Type
Dissertation
Degree Name
Doctor of Philosophy (PHD)
Department
Electrical and Computer Engineering (Engineering)
Date of Defense
2011-04-26
First Committee Member
Miroslav Kubat
Second Committee Member
Kamal Premaratne
Third Committee Member
Akmal A. Younis
Fourth Committee Member
Nigel M. John
Fifth Committee Member
Maria M. Llabre
Abstract
The traditional approach to automated classification assumes that each object should be assigned to exactly one of two or more classes. However, some real-world applications depart from this generic scenario in two important ways. First, each example can belong to several classes simultaneously (multi-label classification). Second, the classes can be hierarchically ordered in the sense that some are more specific versions of others (hierarchical classification). Seeking to address both of these issues, the presented work deals with “hierarchical multi-label classification”. In non-hierarchical multi-label classification, a survey of the literature indicates that good performance is achieved when a Support Vector Machine (SVM) is used to induce each class separately. That said, some experiments suggest that further improvement can be achieved by explicitly dealing with the problem of imbalanced training sets. The author proposes a solution in the form of a technique referred to as R-SVM; the idea is to re-adjust the SVM hyperplane offset to compensate for this imbalance. Experiments in the first part of this dissertation rely on data from text-categorization domains. More important, however, is the second part, which focuses on hierarchical multi-label classification. Here, the author proposes a new technique, HR-SVM, a hierarchical extension of R-SVM that proceeds in a top-down fashion and adds a new mechanism to correct errors propagated from classifiers at higher levels of the hierarchy. The system has been evaluated in experiments with data from the field of gene function prediction. The results show that the new technique compares favorably with existing approaches on various performance criteria.
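The general idea of re-adjusting a hyperplane offset for an imbalanced class can be illustrated with a small sketch. This is not the R-SVM procedure from the dissertation, whose details the abstract does not give. It only assumes that a trained linear classifier has already produced a signed score for each validation example, and it picks the offset shift that maximizes F1 on those scores. The function names, the choice of F1 as the criterion, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_offset_shift(scores: Sequence[float],
                      y_true: Sequence[int]) -> Tuple[float, float]:
    """Return (shift, f1): the amount to add to the offset and the F1 it yields.

    A score s is predicted positive when s + shift >= 0, which is the same
    as s >= threshold with threshold = -shift. Hypothetical helper, not
    taken from the dissertation.
    """
    # Baseline: the unadjusted hyperplane (threshold 0, shift 0).
    best_shift = 0.0
    best_f1 = f1_score(y_true, [1 if s >= 0 else 0 for s in scores])

    # Try placing the decision boundary at each observed score.
    for threshold in sorted(set(scores)):
        preds = [1 if s >= threshold else 0 for s in scores]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:
            best_shift, best_f1 = -threshold, f1
    return best_shift, best_f1


# Toy imbalanced validation set (made up): 8 negatives, 3 positives.
# The classifier is biased toward the majority class, so two positives
# fall on the negative side of the unadjusted hyperplane.
scores = [-2.0, -1.8, -1.5, -1.3, -1.2, -1.0, -0.9, -0.7, -0.4, -0.2, 0.3]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

shift, f1 = best_offset_shift(scores, labels)
print(shift, f1)
```

On this toy data the unadjusted boundary recovers only one of the three positives, while shifting the offset moves the boundary toward the majority class and recovers all three. In a multi-label setting, one such adjustment would presumably be made per class, since each class has its own binary classifier.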
Keywords
hierarchical multi-label classification; support vector machines; threshold adjustment; decision trees; text categorization; gene-function prediction
Recommended Citation
Vateekul, Peerapon, "Hierarchical Multi-Label Classification: Going Beyond Generalization Trees" (2012). Open Access Dissertations. 723.
http://scholarlyrepository.miami.edu/oa_dissertations/723