pt.tumba.links
Class PageRank

java.lang.Object
  extended by pt.tumba.links.PageRank

public class PageRank
extends java.lang.Object

Pagerank is a an algorithm that Google utilizes to rank its search results in presence of multiple resources matching a certain query. In very simple words Pagerank evaluates and ranks Web sites according to a computed value determined by the number of other sites linking to them. The way the Pagerank value is computed makes Pagerank somewhat immune to artificial linking efforts. The rank (a numerical measure of importance) of each web page is dependent on the importance conferred on it by other web pages that have links to it; each web page divides its importance equally amongst all of the web pages it references. The rank R(p) for any web page p can be expressed via a simple summation given the set of web pages B(p) that link to p and the outdegree function d+: V -> Z: R(p) = sum(q in B(p), R(q)/d+(q)) Note that the right-hand side of the expression may contain R(p), because the rank of some R(q) may depend on R(p). For this reason, the rank of each web page is not computed directly, but instead an iterative algorithm is applied using the formula: Ri+1(p) = sum(q in B(p), Ri(q)/d+(q))

Author:
Bruno Martins

Field Summary
private  double dampening
          The value for the PageRank dampening factor
private  WebGraph graph
          The data structure containing the Web linkage graph
private  java.util.Map scores
          A Map containing the PageRank values for each page
 
Constructor Summary
PageRank(WebGraph graph)
          Constructor for PageRank
 
Method Summary
 void computePagerank()
          Computes the PageRank value for all the nodes in the Web Graph.
 void computePagerank(int iter)
          Computes the PageRank value for all the nodes in the Web Graph.
 double getDampening()
          Returns the dampening factor used for the PageRank Algorithm.
 void initializePageRank(java.lang.Integer id, double value)
          Initializes PageRank value associated with a given link identifyer.
 void initializePageRank(java.lang.String link, double value)
          Initializes the PageRank value associated with a given link.
private  java.lang.Double pageRank(java.lang.Integer id)
          Returns the PageRank value associated with a given link identifyer.
 java.lang.Double pageRank(java.lang.String link)
          Returns the PageRank value associated with a given link
 void setDampening(double damp)
          Sets the value for the PageRank dampening factor.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

dampening

private double dampening
The value for the PageRank dampening factor


graph

private WebGraph graph
The data structure containing the Web linkage graph


scores

private java.util.Map scores
A Map containing the PageRank values for each page

Constructor Detail

PageRank

public PageRank(WebGraph graph)
Constructor for PageRank

Parameters:
graph - The data structure containing the Web linkage graph
Method Detail

setDampening

public void setDampening(double damp)
Sets the value for the PageRank dampening factor. The amount of PageRank that is transferred depends on a dampening factor which stands for “the probability that a random surfer will get bored”. The dampening factor generally is set to 0.85.

Parameters:
damp - The dampening factor

getDampening

public double getDampening()
Returns the dampening factor used for the PageRank Algorithm. The amount of PageRank that is transferred depends on a dampening factor which stands for “the probability that a random surfer will get bored”. The dampening factor generally is set to 0.85.

Returns:
The dampening factor

pageRank

public java.lang.Double pageRank(java.lang.String link)
Returns the PageRank value associated with a given link

Parameters:
link - The url for the link
Returns:
The PageRank value associated with the given link

pageRank

private java.lang.Double pageRank(java.lang.Integer id)
Returns the PageRank value associated with a given link identifyer. Identifyers are Integer numberes, used in WebGraph to represent the Web graph for efficiency reasons.

Parameters:
link - An identifyer for the link
Returns:
The PageRank value associated with the given link
See Also:
WebGraph.IdentifyerToURL()

initializePageRank

public void initializePageRank(java.lang.String link,
                               double value)
Initializes the PageRank value associated with a given link.

Parameters:
link - The url for the link
value - The PageRank value to assign

initializePageRank

public void initializePageRank(java.lang.Integer id,
                               double value)
Initializes PageRank value associated with a given link identifyer. Identifyers are Integer numberes, used in WebGraph to represent the Web graph for efficiency reasons.

Parameters:
link - An identifyer for the link
value - The PageRank value to assign
See Also:
WebGraph.IdentifyerToURL()

computePagerank

public void computePagerank()
Computes the PageRank value for all the nodes in the Web Graph. In this method, the number of iterations of the algorithm is set accordingly to the number of nodes in the Web graph.


computePagerank

public void computePagerank(int iter)
Computes the PageRank value for all the nodes in the Web Graph. The formula of Sergey Brin and Lawrence Page (founders of Google) can be found in their original document. Essentially: The amount of PageRank that is transferred depends on a dampening factor which stands for “the probability that a random surfer will get bored”. The dampening factor generally is set to 0.85. The more outgoing links a web page has, the less PageRank of that page will be transferred to each of the pages it links to. Very simple: devide the real PageRank by the number of outgoing links and multiply it with the dampening factor to calculate the amount of PageRank that is transferred. Do this for all pages that link to your page and you know your own PageRank.

Parameters:
iter - The number of iterations for the algorithm