Doctor of Philosophy (PHD)
Electrical and Computer Engineering (Engineering)
Date of Defense
First Committee Member
Second Committee Member
Third Committee Member
Fourth Committee Member
Fifth Committee Member
Biometrics attracted the attention of researchers in computer vision and machine learning for its use in many applications. We propose systems for face and ear recognition, gender classification and object tracking. First, we present a fully automated system for recognition from ear images based upon sparse representation. In sparse representation, extracted features from the training data is used to develop a dictionary. Classification is achieved by representing the extracted features of the test data as a linear combination of entries in the dictionary. In fact, there are many solutions for this problem and the goal is to find the sparsest solution. We use a relatively new algorithm named smoothed l0 norm to find the sparsest solution and Gabor Wavelet features are used for building the dictionary. Experimental results conducted on the University of Notre Dame (UND) collection J data set, containing large appearance, pose, and lighting variations, resulted in a gender classification rate of 89.49%. Furthermore, the proposed method is evaluated on the WVU data set and classification rates for different view angles are presented. Results show improvement and great robustness in gender classification over existing methods. Furthermore, we present an approach for gender classification using facial images based upon sparse representation and Basis Pursuit. In sparse representation, the training data is used to develop a dictionary based on extracted features. Basis pursuit is used to find the best representation by minimizing the l1 norm. Experimental results are conducted on the FERET data set and obtained results are compared with other works in this area. The results show improvement in gender classification over existing methods. We present a novel classification technique based on sparse representation. Currently, most of the methods for sparse representation classification do not apply constraints to the coefficients that form the linear combination of the atoms, which leads to having coefficients that can be positive or negative. In addition, all the training samples are treated uniformly without differentiating between the training samples in the dictionary. In this technique, we impose non-negative constraint on the components of the coefficient vector to ensure that the coefficient vector represents the contributions of the training samples towards the query, which is more natural for classification purposes. We also use the mutual information between the query sample and each of the training samples to obtain a weight for each of the atoms in the dictionary. These weights have the effect of reducing the search space and speeding the convergence of the algorithm in finding the coefficient vector. Experiments conducted on the Extended Yale B database for face recognition and on the University of Notre Dame (UND) database for ear recognition show that the proposed nonnegative weighted sparse representation obtained by smoothed l0 norm outperforms other state-of-the-art classifiers. Finally, a general tracking system is developed based upon sparse representation. Developing an effective and complete tracking algorithm is a challenging task because of factors such as illumination, occlusion and pose variations. Most of the tracking algorithms do not consider the situation when the tracked object or disappears temporarily from the video sequence or becomes temporarily fully occluded. Here, our goal is to develop an automatic object tracking system that can handle pose variations, scale variations and temporary disappearance of the object from the scene. We present a robust tracking system based on adaptive sparse representation and feedback. We focus on automatic tracking with no prior knowledge other than the location of the region to be tracked in the first frame, which can either be located manually or using a detector that finds the region of interest (ROI). The visual tracking is a binary classification problem. The positive samples are bounding boxes that have high overlap with current position of the target while negative samples are drawn from regions outside the ROI to model background close to the target. The tracking algorithm uses the dictionary to locate the ROI in the following frames via adaptive sparse representation. One of the main issues in tracking systems is false tracking when the object disappears from the scene. Motivated by the concept of feedback in control systems, we overcome the problem of false tracking when the object disappears by comparing the newly tracked region with previous regions to confirm that the object is still in the frame. A structural similarity measure is used to measure similarity between a newly tracked ROI and the previously tracked ROIs and if the similarity is below a certain threshold, the object is assumed to be out of the scene. In fact, this similarity evaluation is like a feedback loop in our tracking algorithm which makes our method robust, reliable and accurate when compared to the state-of-the-art methods on challenging sequences. If the object is not located in the current frame, the algorithm stops tracking and starts searching for the object in the following frames. The searching is achieved by using a detector based on sparse representation and an adaptive dictionary to efficiently locate the object when it reappears in the scene.
Sparse Representation; Tracking; Biometrics; Recognition; Classification; Dictionary Learning
Khorsandi, Rahman, "Sparse Representation and Dictionary Learning for Biometrics and Object Tracking" (2015). Open Access Dissertations. 1565.