Show simple item record

dc.contributor.advisorSaiedian, Hossein
dc.contributor.authorMorozov, Serhiy
dc.date.accessioned2011-10-09T03:54:21Z
dc.date.available2011-10-09T03:54:21Z
dc.date.issued2011-08-31
dc.date.submitted2011
dc.identifier.otherhttp://dissertations.umi.com/ku:11692
dc.identifier.urihttp://hdl.handle.net/1808/8160
dc.description.abstractThe use of recommender systems is an emerging trend today, when user behavior information is abundant. There are many large datasets available for analysis because many businesses are interested in future user opinions. Sophisticated algorithms that predict such opinions can simplify decision-making, improve customer satisfaction, and increase sales. However, modern datasets contain millions of records, which represent only a small fraction of all possible data. Furthermore, much of the information in such sparse datasets may be considered irrelevant for making individual recommendations. As a result, there is a demand for a way to make personalized suggestions from large amounts of noisy data. Current recommender systems are usually all-in-one applications that provide one type of recommendation. Their inflexible architectures prevent detailed examination of recommendation accuracy and its causes. We introduce a novel architecture model that supports scalable, distributed suggestions from multiple independent nodes. Our model consists of two components, the input matrix generation algorithm and multiple platform-independent combination algorithms. A dedicated input generation component provides the necessary data for combination algorithms, reduces their size, and eliminates redundant data processing. Likewise, simple combination algorithms can produce recommendations from the same input, so we can more easily distinguish between the benefits of a particular combination algorithm and the quality of the data it receives. Such flexible architecture is more conducive for a comprehensive examination of our system. We believe that a user's future opinion may be inferred from a small amount of data, provided that this data is most relevant. We propose a novel algorithm that generates a more optimal recommender input. Unlike existing approaches, our method sorts the relevant data twice. Doing this is slower, but the quality of the resulting input is considerably better. Furthermore, the modular nature of our approach may improve its performance, especially in the cloud computing context. We implement and validate our proposed model via mathematical modeling, by appealing to statistical theories, and through extensive experiments, data analysis, and empirical studies. Our empirical study examines the effectiveness of accuracy improvement techniques for collaborative filtering recommender systems. We evaluate our proposed architecture model on the Netflix dataset, a popular (over 130,000 solutions), large (over 100,000,000 records), and extremely sparse (1.1\%) collection of movie ratings. The results show that combination algorithm tuning has little effect on recommendation accuracy. However, all algorithms produce better results when supplied with a more relevant input. Our input generation algorithm is the reason for a considerable accuracy improvement.
dc.format.extent210 pages
dc.language.isoen
dc.publisherUniversity of Kansas
dc.rightsThis item is protected by copyright and unless otherwise specified the copyright of this thesis/dissertation is held by the author.
dc.subjectComputer science
dc.titleA Distributed, Architecture-Centric Approach to Computing Accurate Recommendations from Very Large and Sparse Datasets
dc.typeDissertation
dc.contributor.cmtememberAgah, Arvin
dc.contributor.cmtememberGrzymala-Busse, Jerzy
dc.contributor.cmtememberLuo, Bo
dc.contributor.cmtememberStahl, Saul
dc.thesis.degreeDisciplineElectrical Engineering & Computer Science
dc.thesis.degreeLevelPh.D.
kusw.oastatusna
dc.identifier.doi10.1146/annurev.ecolsys.28.1.289
kusw.oapolicyThis item does not meet KU Open Access policy criteria.
kusw.bibid7643113
dc.rights.accessrightsopenAccess


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record