Treatment of Instance-Based Classifiers Containing Ambiguous Attributes and Class Labels

Hans Mullinnix Holland, University of Miami

Abstract

The importance of attribute vector ambiguity has been largely overlooked by the machine learning community. A pattern recognition problem can be solved in many ways within the scope of machine learning. Neural Networks, Decision Tree Algorithms such as C4.5, Bayesian Classifiers, and Instance Based Learning are the main algorithms. All listed solutions fail to address ambiguity in the attribute vector. The research reported shows, ignoring this ambiguity leads to problems of classifier scalability and issues with instance collection and aggregation. The Algorithm presented accounts for both ambiguity of the attribute vector and class label thus solving both issues of scalability and instance collection. The research also shows that when applied to sanitized data sets, suitable for traditional instance based learning, the presented algorithm performs equally as well.