A data mining and semantic Web framework for building a Web-based recommender system

Date of Award




Degree Name

Doctor of Philosophy (Ph.D.)


Electrical and Computer Engineering

First Committee Member

Mei-Ling Shyu, Committee Chair


Recent technological advances in networks and applications, particularly the Internet and the World Wide Web (WWW), have made a huge amount of information available to users. One of the most widely applied Information Retrieval (IR) approaches for helping users find information is keyword-based search, as adopted by many Web search engines. However, without prior knowledge of the retrieval process, or keywords that accurately depict the search topic, discovering the desired information can be a tedious and formidable task. In addition, traditional IR approaches have no way to customize the results according to the users' preferences. In general, the main goal of the users is to obtain the information that best suits their needs within a reasonable amount of time. The system administrator, on the other hand, is mainly concerned with the technical issues and details of the system implementation. As a result, the users' individual needs are often neglected. Therefore, the remaining challenges for IR research lie in the effective assessment of the interaction between the users and the system.
In this thesis, a new framework based on data mining techniques and the Semantic Web concept is proposed to overcome the drawbacks associated with traditional IR approaches. The proposed framework is then applied to construct a Web-based recommender system, which automatically generates a recommended list of information based on an individual's preferences. Two information filtering methods for providing the recommended information are considered: (1) analyzing the information content, i.e., content-based filtering, and (2) referencing other users' access behaviors, i.e., collaborative filtering.
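
The two filtering methods named above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the term-weight dictionaries, the cosine and Jaccard similarity measures, and the sample page paths are all assumptions chosen for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def content_based(profile, pages, top_n=3):
    """Content-based filtering: rank pages by how closely their term
    weights match the user's profile (both hypothetical dicts)."""
    scored = [(cosine(profile, terms), page) for page, terms in pages.items()]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [page for score, page in ranked if score > 0][:top_n]


def collaborative(user, visits, top_n=3):
    """Collaborative filtering: recommend unseen pages visited by users
    whose visit sets overlap with this user's (Jaccard overlap assumed)."""
    seen = visits[user]
    scores = {}
    for other, other_seen in visits.items():
        if other == user:
            continue
        union = seen | other_seen
        overlap = len(seen & other_seen) / len(union) if union else 0.0
        for page in other_seen - seen:
            scores[page] = scores.get(page, 0.0) + overlap
    ranked = sorted(scores.items(), key=lambda s: (-s[1], s[0]))
    return [page for page, score in ranked if score > 0][:top_n]


# Hypothetical sample data, not taken from the thesis.
pages = {
    "/admissions": {"apply": 1.0, "tuition": 1.0},
    "/research": {"lab": 1.0, "grant": 1.0},
    "/financial-aid": {"tuition": 1.0, "grant": 1.0},
}
profile = {"tuition": 1.0, "apply": 0.5}
visits = {
    "alice": {"/admissions", "/financial-aid"},
    "bob": {"/admissions", "/financial-aid", "/housing"},
    "carol": {"/research", "/library"},
}

print(content_based(profile, pages))
print(collaborative("alice", visits))
```

The content-based function needs only the page content and one user's profile, while the collaborative function needs no content at all and relies on other users' access behavior, which is why a framework may combine the two.
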
In order to utilize all the underlying data components (e.g., feature content, link structure, and user access sequences), three Web mining algorithms are developed: (1) Web document classification based on the fuzzy association concept, (2) document category merging using a clustering algorithm, and (3) a method for mining user access patterns using the association rule mining technique. In addition, the Semantic Web and ontology concepts, in which the information is given well-defined meaning, are incorporated into the framework in order to provide the users with semantically enhanced information and thus improve the results of the IR process.
To demonstrate a potential use of the proposed framework, a system prototype for recommending the University of Miami's Web pages has been designed and implemented. The Web-page recommender system improves the performance of traditional query-based IR systems, such as search engines, with its ability to automatically recommend a list of information according to each individual user's profile. In addition, the interaction between the users and the system is also improved by providing each Web page with semantic information.
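
As an illustration of the third mining algorithm listed above (mining user access patterns with association rules), the following is a minimal sketch under stated assumptions. It treats each user session as an unordered set of pages and mines rules between page pairs only; the function name, thresholds, and sample sessions are hypothetical, and the thesis's actual method may differ, for example by taking access order into account.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) rules over
    page pairs that meet both thresholds, strongest confidence first."""
    n = len(sessions)
    page_counts = Counter()
    pair_counts = Counter()
    for session in sessions:
        pages = sorted(set(session))  # ignore repeats and visit order
        page_counts.update(pages)
        pair_counts.update(combinations(pages, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of sessions containing both pages
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence: of the sessions with lhs, how many also have rhs
            confidence = count / page_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Hypothetical access sessions, not taken from the thesis.
sessions = [
    ["/admissions", "/financial-aid", "/housing"],
    ["/admissions", "/financial-aid"],
    ["/admissions", "/research"],
    ["/research", "/library"],
]

for lhs, rhs, support, confidence in mine_pair_rules(sessions):
    print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

A rule such as "/financial-aid -> /admissions" can then drive a recommendation: when a user's current session contains the antecedent page, the consequent page becomes a candidate for the recommended list.
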


Engineering, Electronics and Electrical; Computer Science
