Google never stops innovating on its search engine, and where others try to follow, Google seems to be not just one step ahead but ten. Its latest innovation, which may in fact already have been in place for a year or longer, can be found in the patent “Information Retrieval Based on Historical Data.”
The abstract of the patent reads: “A system identifies a document and obtains one or more types of history data associated with the document. The system may generate a score for the document based, at least in part, on the one or more types of history data.”
This article aims to give a simplified representation of this patent and contains recommendations as to which SEO techniques are best suited to obtaining high rankings, with a specific focus on links. This article is the opinion of the writer, and following the recommendations in it is done at your own risk.
Google’s search results have become increasingly difficult to explain, and many theories have been developed about what is going on. The most popular is the “sand box” theory, which says that a new site is put in a virtual sand box and has to wait until it has aged before it can obtain high rankings. This patent contains some excellent information that can explain this phenomenon.
Information Retrieval
According to the patent, the information that Google retrieves from the historical data falls into three categories:
- Age/Time
- Change
- Trends
A score is calculated based on the above 3 factors which can then, at least partially, be used to rank the selected pages.
Historical Data
The patent describes a huge amount of historical data. The following is an overview of most items for which historical data can be measured:
- Pages/sites
- Links
- Anchor Texts
- Content
- Query
- Traffic
- Ranking
- User
- Domain
Ranking Based On Information Retrieved From Historical Data
The patent describes in quite a lot of detail how selected pages are ranked based on the information retrieved from historical data. This chapter will describe the basic logic applied.
Age/Time
For all historical data, a date of inception is used to determine four important values:
- Age
- Average Age
- Date
- Average Date
These values can be determined for pages, links, anchor text, content, topics, queries, etc. Comparing the age or date of a page to the average for the site, for example, tells the search engine whether this information is relatively new or old.
Comparing the average age or date of a page to the average age or date of all pages selected for a query (keyword phrase) tells the search engine if the page is relatively new or old. This information can be used to rank the selected pages.
Comparing to an average has the advantage that no preset set of rules is needed to determine the ranking of a page. For one query, 6 months may be considered new (product descriptions, for example), while for another query 6 days may be considered old (news items, for example). It all depends on the average age.
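The comparison to an average can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: the patent does not publish a formula, so the function names (age_in_days, relative_age) and the simple ratio used here are hypothetical, and the dates are made up for the example.

```python
from datetime import date, timedelta
from statistics import mean


def age_in_days(inception: date, today: date) -> int:
    """Age of an item, measured from its (assumed known) date of inception."""
    return (today - inception).days


def relative_age(page_inception: date, result_inceptions: list, today: date) -> float:
    """Hypothetical measure: page age divided by the average age of all
    pages selected for the same query. Below 1 means newer than average,
    above 1 means older than average."""
    average_age = mean(age_in_days(d, today) for d in result_inceptions)
    if average_age == 0:
        return 1.0  # every result was created today; treat the page as average
    return age_in_days(page_inception, today) / average_age


today = date(2005, 6, 1)

# A news query: the competing pages are 1, 2 and 3 days old (made-up data).
news_results = [today - timedelta(days=n) for n in (1, 2, 3)]
# A product query: the competing pages are 360 and 720 days old (made-up data).
product_results = [today - timedelta(days=n) for n in (360, 720)]

# A 6-day-old page is old for the news query ...
print(relative_age(today - timedelta(days=6), news_results, today))  # 3.0
# ... while a 180-day-old page is new for the product query.
print(relative_age(today - timedelta(days=180), product_results, today))
```

The same page age can therefore come out as "old" or "new" depending purely on the set of pages it is compared against, which is the point made above.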
The same logic applies to links. To determine how popular a page or site is, the average age of all back links tells the search engine whether the popularity of the page is recent or not. It makes sense that if most back links were obtained 4 years ago and hardly anybody has been interested in linking to this page/site since then, the page is not as popular as the existing back links would suggest.
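A rough Python sketch of this idea follows. It assumes that the date on which each back link was first seen is known. The function names and the one-year window are hypothetical choices made for illustration, not values taken from the patent.

```python
from datetime import date, timedelta
from statistics import mean


def average_link_age_days(link_first_seen: list, today: date):
    """Average age, in days, of all back links pointing to a page."""
    return mean((today - d).days for d in link_first_seen)


def recent_link_share(link_first_seen: list, today: date, window_days: int = 365) -> float:
    """Fraction of back links first seen within the last `window_days` days.
    The one-year default is an arbitrary assumption."""
    recent = sum(1 for d in link_first_seen if (today - d).days <= window_days)
    return recent / len(link_first_seen)


today = date(2005, 6, 1)

# Made-up example: nine links obtained about 4 years ago, one obtained last month.
stale_profile = [today - timedelta(days=1460)] * 9 + [today - timedelta(days=30)]

print(average_link_age_days(stale_profile, today))  # 1317
print(recent_link_share(stale_profile, today))      # 0.1
```

A page with ten back links looks reasonably popular by link count alone, but an average link age of over three and a half years, with only one in ten links being recent, suggests that interest in it has faded.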
The patent even goes as far as determining age factors for the anchor texts of links.
Change
Information changes over time. Opinions change, - continued below ...