
Page Rank Algorithm: All you need to know


Are you familiar with the PageRank algorithm? It sits at the heart of Google’s positioning of web pages, so let’s find out how it works.

You’ve probably wondered why certain web pages appear first when you make a query on a search engine.

If you type the term “python” into the Google search bar, the official Python language website comes up first, followed by the Wikipedia page dedicated to the Python language.

But why aren’t web pages about the python snake at the top of the search results? This is due to a confidential positioning algorithm developed by Google.

The algorithm was the brainchild of Google co-founder Larry Page, after whom it was named.

His ambition was to rank web pages according to a score of “importance” or, in other words, “popularity”.

Since its launch, the role of the algorithm has changed, and other indicators have been developed to contribute to the positioning of web pages. There are now more than 200 of them, compared with just a dozen at the outset! Nevertheless, the score obtained by the PageRank algorithm still contributes to positioning, although its role and weight are not communicated.

Before going into more detail about how the algorithm works, it’s important to understand what is meant by “indexing a web page”. To be part of the index is simply to be part of a search engine’s catalog of web pages. If a web page is in the index, it is said to be indexed. Without this indexing, it will not appear in the search results.

As a general rule, a page is considered to be well positioned when it appears in the first 20 search results. Beyond this rank, web pages are rarely consulted by Internet users.

How are web pages ordered?

A search contains keywords, so Google looks within its catalog for web pages that contain these keywords. There are often a huge number of results! Every web page containing these keywords is therefore a candidate for inclusion in the first search results.
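
To make this candidate step concrete, here is a minimal sketch in Python. It assumes a tiny hypothetical catalog (the URLs and texts below are invented for illustration) and a simple word-to-pages index; Google’s real catalog and matching rules are not public, so this only illustrates the idea of keyword candidates.

```python
from collections import defaultdict

# Hypothetical mini-catalog: URL -> page text (invented for illustration).
CATALOG = {
    "python.org": "python is a programming language",
    "snakes.example": "the python is a large snake",
    "cooking.example": "recipes for a quick dinner",
}

def build_index(catalog):
    """Map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in catalog.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def candidates(index, query):
    """Return the pages that contain every keyword of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

INDEX = build_index(CATALOG)
print(sorted(candidates(INDEX, "python")))
```

Both the language site and the snake page are candidates for “python” here. Keywords alone cannot decide which one comes first, and that is where a score such as page rank comes in.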

It’s important to note that the algorithm’s workings are confidential, so we don’t know exactly how the score assigned to a web page, and therefore its rank, is calculated. A web page’s rank is not immutable, but the more stable it is, the better.

If we were to summarize the algorithm’s workings succinctly, we’d say that the more often a web page is linked to from other web pages, the higher its score and the better its ranking.

This summary is a little simplistic, however, as these are not the only elements that come into play when calculating this score!

In addition to how often the web page is linked from other pages, the popularity of those linking pages also comes into play, along with the thematic proximity between the web pages.

This means that a link from a page that is not very popular does not have the same impact on the final score as a link from a popular one: the more popular the citing page is, the higher the score of the cited page.

The closer the pages linking to the web page are to it thematically, the more positively its score will be impacted.

What’s more, if a page links to a large number of web pages, each of those links will have a lesser impact on the score of the page it points to.
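
As a rough illustration of this dilution, the sketch below assumes the rule from the publicly described version of PageRank, where a page’s score is split equally among its outgoing links. The function name `vote_per_link` is ours, and Google’s current weighting is not public.

```python
def vote_per_link(score, outgoing_links):
    """Share of a page's score passed along each of its outgoing links.

    Assumes an equal split between links, as in the published
    description of PageRank (not necessarily what Google does today).
    """
    if outgoing_links == 0:
        return 0.0  # a page with no links casts no vote
    return score / outgoing_links

# The same page score counts for much less once it is split 20 ways.
print(vote_per_link(1.0, 2))
print(vote_per_link(1.0, 20))
```

With the same score, a page with 2 links passes ten times more weight through each link than a page with 20 links.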

Let’s now go into a little more detail about how the algorithm works.

Let’s consider a query on the Google search engine. Based on the keywords it contains, Google will then search its catalog to find the web pages that contain them.

To rank these pages, Google looks, for each web page, at the number of links pointing to it from the pages in its catalog.

Each link is perceived as a “vote” cast by the page containing the link for the page it points to. A link can be either an internal link (a link that takes the user to a page within the same website) or an external link (a link that takes the user to a page on another website).
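
Counting these votes can be sketched as follows. The list of links is hypothetical, and treating “same website” as “same hostname” is a simplifying assumption of ours, not a rule stated by Google.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical (source, target) links, invented for illustration.
LINKS = [
    ("https://diy-blog.example/home", "https://diy-blog.example/shelves"),
    ("https://diy-store.example/tools", "https://diy-blog.example/shelves"),
    ("https://news.example/today", "https://diy-blog.example/shelves"),
    ("https://diy-blog.example/shelves", "https://diy-store.example/tools"),
]

def link_kind(source, target):
    """Classify a link as 'internal' (same hostname) or 'external'."""
    same_host = urlsplit(source).hostname == urlsplit(target).hostname
    return "internal" if same_host else "external"

def count_votes(links):
    """Count the links pointing to each target page, split by kind."""
    votes = {}
    for source, target in links:
        votes.setdefault(target, Counter())[link_kind(source, target)] += 1
    return votes

VOTES = count_votes(LINKS)
print(dict(VOTES["https://diy-blog.example/shelves"]))
```

Here the “shelves” page receives one internal vote and two external votes. A raw count like this treats every vote as equal, which is exactly what the popularity of the voting pages is meant to correct.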

In addition to these links, the algorithm takes into account the page rank of the pages listing the web page in question (popularity indicator). So the higher the page rank of the pages linking to the web page of interest, the higher the page rank of that page.
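
Because each page’s score depends on the scores of the pages linking to it, the calculation is circular and is usually solved by repetition. The sketch below follows the publicly described version of PageRank, with the commonly cited damping factor of 0.85; the four-page web is hypothetical, the function and variable names are ours, and it says nothing about thematic proximity or about how Google computes the score today.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively score pages from a dict: page -> list of pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Score held by pages without outgoing links is spread evenly.
        dangling = sum(scores[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_scores = {page: base for page in pages}
        for source, targets in links.items():
            if not targets:
                continue
            # Each link passes an equal share of the source page's score.
            share = damping * scores[source] / len(targets)
            for target in targets:
                new_scores[target] += share
        scores = new_scores
    return scores

# Hypothetical four-page web, invented for illustration.
WEB = {
    "store": ["blog"],
    "forum": ["blog"],
    "blog": ["store"],
    "orphan": ["forum"],
}
SCORES = pagerank(WEB)
for page, score in sorted(SCORES.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy web, “blog” ends up first because it is linked from two pages, one of which (“store”) is itself well scored, while “orphan”, which no page links to, ends up last.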

Let’s say the web page of interest is a DIY blog. If a link to this web page appears on a DIY retailer’s website, then the page’s page rank will increase all the more, since there’s a thematic link between these two web pages. Now suppose that the DIY store’s website is also very popular: this will increase the score of our web page even more!

Early on, Google offered a “ToolBar PageRank” (TBPR), which gave an approximation of the page rank of web pages. Google stopped updating this public score at the end of 2013 and retired the toolbar display in 2016, so unfortunately it’s no longer possible to obtain this approximation.

Despite all these indications, Google remains very discreet about this Page Rank algorithm! However, the quantity of links plays a major role in the positioning of web pages, although it is not the only criterion involved. The popularity of pages, their thematic proximity and the number of links on these web pages all play an important role in their positioning.

To conclude this article, let’s take a look at how our website ranks when you type the query “datascience training” into a search engine.
Datascientest’s web page comes up in 1st place.

It would seem that page rank holds no secrets for us!
