A Stochastic System Model for PageRank: Parameter Estimation and Adaptive Control
Issue Date
2015-05-31
Author
Clifton, Cody Edward
Publisher
University of Kansas
Format
139 pages
Type
Dissertation
Degree Level
Ph.D.
Discipline
Mathematics
Rights
Copyright held by the author.
Abstract
A key feature of modern web search engines is the ability to display relevant and reputable pages near the top of the list of query results. The PageRank algorithm provides one way of achieving such a useful hierarchical indexing by assigning a measure of relative importance, called the PageRank value, to each webpage. PageRank is motivated by the inherently hypertextual structure of the World Wide Web; specifically, the idea that pages with more incoming hyperlinks should be considered more popular and that popular pages should rank highly in search results, all other factors being equal. We begin by giving an overview of the original PageRank algorithm and discussing subsequent developments in the mathematical theory of PageRank. We focus on important contributions to improving the quality of rankings via topic-dependent or "personalized" PageRank, as well as techniques for improving the efficiency of PageRank computation based on Monte Carlo methods, extrapolation and adaptive methods, and aggregation methods.
We next present a model for PageRank whose dynamics are described by a controlled stochastic system that depends on an unknown parameter. The fact that the value of the parameter is unknown implies that the system is unknown. We establish strong consistency of a least squares estimator for the parameter. Furthermore, motivated by recent work on distributed randomized methods for PageRank computation, we show that the least squares estimator remains strongly consistent within a distributed framework.
Finally, we consider the problem of controlling the stochastic system model for PageRank. Under various cost criteria, we use the least squares estimates of the unknown parameter to iteratively construct an adaptive control policy whose performance, according to the long-run average cost, is equivalent to that of the optimal stationary control that would be used if we had knowledge of the true value of the parameter.
This research lays a foundation for future work in a number of areas, including testing the estimation and control procedures on real data or larger scale simulation models, considering more general parameter estimation methods such as weighted least squares, and introducing other types of control policies.
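As a minimal illustration of the link-based ranking idea described in the abstract, the following sketch computes PageRank values by power iteration. It follows the standard textbook formulation and is not the stochastic system model developed in the dissertation. The function name pagerank, the damping factor 0.85, the tolerance, and the four-page example graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Textbook PageRank by power iteration (illustrative sketch).

    links maps each page to the list of pages it links to.
    damping=0.85 is a commonly used value, assumed here.
    """
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from the uniform distribution

    for _ in range(max_iter):
        # Rank held by pages with no outgoing links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes its rank in equal shares along its outgoing links.
        for p in pages:
            targets = links.get(p, [])
            for q in targets:
                new_rank[q] += damping * rank[p] / len(targets)

        # Stop once the total change between iterations is negligible.
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical four-page web: C receives links from A, B, and D.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(example_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this small graph, page C collects the most incoming links and ends up with the largest PageRank value, while page D, which no page links to, keeps only the baseline share.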
Collections
- Dissertations [4702]
- Mathematics Dissertations and Theses [179]