# PageRank, HITS and a unified framework for link analysis

SIAM International Conference on Data Mining, pp. 249-253 (2003)

Two popular webpage ranking algorithms are HITS and PageRank. HITS emphasizes mutual reinforcement between authority and hub webpages, while PageRank emphasizes hyperlink weight normalization and web surfing based on random walk models. We systematically generalize/combine these concepts into a unified framework. The ranking framewo...

- Two popular webpage ranking algorithms are PageRank [1] and HITS (Hypertext Induced Topic Selection) [3]. HITS makes the crucial distinction between hubs and authorities and computes them in a mutually reinforcing way.
- In the HITS algorithm [3], each webpage $i$ has both a hub score $y_i$ and an authority score $x_i$.
- Since $L^T L$ determines the authority ranking, the authors call $L^T L$ the authority matrix.
- This shows the close relationship between authority and co-citation.
- Since $L L^T$ determines the hub scores, the authors call $L L^T$ the hub matrix.
- The authors prove that $L L^T = D_{out} + R$, where $D_{out}$ is the diagonal matrix containing the out-degrees of all nodes, and $R$
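The mutual reinforcement between hub and authority scores can be illustrated with a minimal sketch. It assumes the convention that `L[i, j] = 1` when page `i` links to page `j`; the four-page graph, the function name `hits`, the iteration count, and the choice of Euclidean normalization are all hypothetical choices made for illustration, not details taken from the paper.

```python
import numpy as np

# Hypothetical 4-page web graph (an assumption for illustration):
# L[i, j] = 1 when page i links to page j.
L = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Authority matrix and hub matrix as named in the text.
authority_matrix = L.T @ L  # entry (i, j): number of pages linking to both i and j
hub_matrix = L @ L.T        # entry (i, j): number of pages linked to by both i and j

in_degree = L.sum(axis=0)   # column sums
out_degree = L.sum(axis=1)  # row sums


def hits(L, iterations=100):
    """Power iteration for HITS: x are authority scores, y are hub scores."""
    n = L.shape[0]
    x = np.ones(n)
    y = np.ones(n)
    for _ in range(iterations):
        x = L.T @ y              # good authorities are pointed to by good hubs
        y = L @ x                # good hubs point to good authorities
        x /= np.linalg.norm(x)   # rescale so the scores stay bounded
        y /= np.linalg.norm(y)
    return x, y


authority, hub = hits(L)

# For a 0/1 link matrix, the diagonals of the two matrices hold the degrees.
print(np.allclose(np.diag(authority_matrix), in_degree))
print(np.allclose(np.diag(hub_matrix), out_degree))
print("top authority:", int(np.argmax(authority)), "top hub:", int(np.argmax(hub)))
```

Because each update of `x` amounts to one multiplication by `L.T @ L`, the authority scores approach the principal eigenvector of the authority matrix, and the hub scores likewise approach that of the hub matrix.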

- We prove that $L^T L = D_{in} + C$, where $D_{in}$ is the diagonal matrix containing the in-degrees of all nodes, and
- Assuming the web graph is a fixed-degree-sequence random graph, the average-case HITS results can be solved in closed form [2], which proves that authority ranking by HITS
- Democracy: each website has a total of one vote. Another key feature is that PageRank adopts a web surfing model based on a Markov process in determining the scores:
- We propose to define hubs in PageRank using the same random surfer model as in the definition of authorities
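The "one vote per page" normalization and the random surfer model can be sketched as a short power iteration. This is a minimal illustration under stated assumptions rather than the paper's formulation: the damping value `0.85`, the uniform treatment of pages without out-links, the function name `pagerank`, and the four-page graph are all hypothetical choices.

```python
import numpy as np


def pagerank(L, damping=0.85, iterations=100):
    """Random-surfer scores for a link matrix with L[i, j] = 1 when i links to j."""
    n = L.shape[0]
    out_degree = L.sum(axis=1, keepdims=True)
    # Each page splits its single vote evenly among its out-links.
    # Assumption: a page with no out-links votes uniformly for every page.
    P = np.where(out_degree > 0, L / np.maximum(out_degree, 1.0), 1.0 / n)
    scores = np.full(n, 1.0 / n)
    for _ in range(iterations):
        # Follow a link with probability `damping`, otherwise jump to a random page.
        scores = damping * (P.T @ scores) + (1.0 - damping) / n
    return scores


# Hypothetical 4-page web graph, used only for illustration.
L = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

scores = pagerank(L)
print(scores, scores.sum())
```

Since every row of `P` sums to one, the scores remain a probability distribution throughout the iteration, which is the Markov-process view of the surfer described above.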

- Columns: HITS, InDgr, Page
- www.runnersworld.com/
- www.coolrunning.com/
- www.kicksports.com/
- www.halhigdon.com/
- www.ontherun.com/
- www.runningroom.com/

- www.adidas.com/
- The most important feature of HITS is the mutual reinforcement between hubs and authorities, while the most important feature of PageRank is the hyperlink weight normalization.
- The authors clarify and formalize weight propagation and random surfing as two different but related methods for computing ranking scores.
- Within the unified framework, one can design new ranking algorithms.
- The authors study three new ranking algorithms: Auth-Rank, Hub-Rank, and Sym-Rank.
- HITS ranking and PageRank ranking are highly correlated with indegree ranking.

- Table 1: Iop and Oop operations for HITS, PageRank, Auth-Rank, Hub-Rank, and Sym-Rank

- S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. Proc. of 7th WWW Conference, 1998.
- C. Ding, H. Zha, X. He, P. Husbands, and H. Simon. Analysis of hubs and authorities on the web. Lawrence Berkeley Nat’l Lab Tech Report 47847, May 2001.
- J. M. Kleinberg. Authoritative sources in a hyperlinked environment. J. ACM, 46:604–632, 1999.

