What are the Search Quality Analyzers?

The quality of web search is important for everyone. If search engines (SEs) do their job well, users save time and quickly find what they need.

But how do you assess search quality? One cannot rely on individual opinions here, since every user has his or her own search habits and preferred query types. Google may work well for some users, while others prefer, say, Yahoo or another engine simply because they usually search for different things in a different way.
The popularity of an SE does not directly reflect its search quality either, because popularity is highly influenced by marketing and PR.

To assess search quality independently, we developed a set of analyzers, one for each type of search query. For all of these analyzers we use special sets of sample queries and sample sites. We measure the quality of navigational and informational search, the percentage of pornography among the pages found by an SE, and so on.

We hope that our tests are (or eventually will be) an objective and reliable source of information on search quality.
Enjoy.

How do the Analyzers work?

To estimate the search quality for various types of queries, we use special sets of test queries and analyze the pages returned for these queries. For example, here is how we test the quality of navigational search, that is, queries aimed at finding a particular web page. We use approximately 500 sample queries and specify the corresponding set of ‘test sites’ (the sites that would be good responses for these queries).

Thus, if a user enters "CNN", he or she probably wants to see www.cnn.com as the first result. Cnn.com is listed as an organic result for the query "CNN".

In order to prevent the Analyzers from being compromised by search engine developers, we use a different set of queries every day. We constantly replenish and refine the pool of queries from which each day’s set is randomly selected.
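The daily selection described above could be sketched roughly as follows. This is only an illustration: the function name `daily_query_set`, the sample pool, and the choice of an operating-system random source are assumptions, not a description of the actual implementation.

```python
import random

def daily_query_set(pool, sample_size, rng=None):
    """Pick a random subset of the query pool for one day's test run.

    By default an OS-backed random source is used so that the selection
    cannot be predicted from a known seed (an assumption of this sketch).
    """
    rng = rng or random.SystemRandom()
    unique_queries = sorted(set(pool))           # drop duplicate queries
    size = min(sample_size, len(unique_queries))
    return rng.sample(unique_queries, size)

# Hypothetical pool; a real pool would hold many more queries.
pool = ["cnn", "bbc", "wikipedia", "amazon", "ebay", "imdb"]
todays_queries = daily_query_set(pool, sample_size=3)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which can be convenient when debugging.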

You can find a description of the methods used in a particular analyzer on the page where the analyzer data is shown.

We highly appreciate any corrections and welcome any criticism. Please feel free to report the errors you find, suggest new sample queries, criticize the method, and so on.

Analyzer of navigational search

A search query whose purpose is to find a particular website is called a navigational query. Such queries include "sberbank", "komsomolskaya pravda", "rambler", "gazeta ru", etc.

The best result for a navigational query is the required site in the first position of search results.
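A check of this kind might look like the sketch below. It assumes that each test query is paired with one expected site and that `search` is some function returning an ordered list of result URLs. The helper names and the host-matching rule (ignoring a leading "www.") are hypothetical and are not taken from the analyzer itself.

```python
from urllib.parse import urlparse

def host_of(url):
    """Return the lowercase host of a URL, without a leading 'www.'."""
    if "://" not in url:
        url = "http://" + url                    # allow bare hosts like www.cnn.com
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host

def position_of(expected_site, results):
    """1-based position of the first result on expected_site, or None."""
    target = host_of(expected_site)
    for rank, url in enumerate(results, start=1):
        if host_of(url) == target:
            return rank
    return None

def first_position_share(test_set, search):
    """Share of queries whose expected site is the very first result."""
    if not test_set:
        return 0.0
    hits = sum(
        1 for query, site in test_set.items()
        if position_of(site, search(query)) == 1
    )
    return hits / len(test_set)

# Made-up search results standing in for a real search engine.
fake_results = {
    "cnn": ["https://www.cnn.com/", "https://en.wikipedia.org/wiki/CNN"],
    "rambler": ["https://example.org/", "https://www.rambler.ru/"],
}
share = first_position_share(
    {"cnn": "www.cnn.com", "rambler": "rambler.ru"},
    lambda query: fake_results[query],
)
```

In this made-up data the expected site is first for one of the two queries, so `share` is 0.5.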

Analyzer of person search

Queries consisting of a first name and a last name often target a particular web page: the official personal page of the person named in the query. Even if the user is not sure that such a site exists, it is an obvious hit and should appear in the top 10 search results.

Analyzer of question answering

This analyzer assesses the ability of search engines to find answers to question queries. Question queries include explicit questions with a question word ([when did CSKA win UEFA cup], [where is Lhasa located]) as well as other queries that imply an obvious short answer ([currency of Brasil], [nitric acid formula]).

Analyzer of correct hints

Most search engines attempt to suggest a correct spelling for a query when a typo is suspected. The quality of such hints is an important addition to the overall quality of the search. This analyzer looks for the correct hint in the search results for a query with a deliberate typo and estimates the number of occurrences of a 'correct' query contained in the hint.

Typo resistance analyzer

Humans are not machines; they make mistakes. This includes mistakes made while typing a search query: a neighboring key pressed by accident ("quety" instead of "query"), a missing or doubled character ("qury" or "queery"), or a word typed 'by ear' without knowing the correct spelling ("yandax" instead of "yandex").
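The mechanical kinds of typo listed above (neighboring key, missing character, doubled character) are easy to generate automatically. The sketch below is one possible way to do it. The keyboard model is a simplified assumption that only considers left and right neighbors on a QWERTY row. 'By ear' misspellings such as "yandax" are not covered.

```python
# Simplified QWERTY layout: only same-row neighbours are modelled
# (an assumption made for this sketch).
KEY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

def neighbours(ch):
    """Keys directly to the left and right of ch on its keyboard row."""
    for row in KEY_ROWS:
        i = row.find(ch)
        if i != -1:
            return [row[j] for j in (i - 1, i + 1) if 0 <= j < len(row)]
    return []

def typo_variants(word):
    """Generate single-error misspellings of a lowercase word."""
    variants = set()
    for i, ch in enumerate(word):
        variants.add(word[:i] + word[i + 1:])             # missing character
        variants.add(word[:i] + ch + word[i:])            # doubled character
        for key in neighbours(ch):
            variants.add(word[:i] + key + word[i + 1:])   # neighbouring key
    variants.discard("")      # deleting the only character leaves nothing
    variants.discard(word)
    return variants

query_typos = typo_variants("query")
```

For the word "query" the resulting set contains "quety", "qury" and "queery", the three examples mentioned in the text.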

Quotation search quality analyzer

People are often looking for a particular text using a quotation from that text. Usually such queries are based on quotations mentioned somewhere on the Web for which the user wants to find the original source. In this case the job of a search engine is to find the full text rather than a bunch of excerpts.

Catch phrase analyzer

This analyzer is devoted to queries containing popular quotations. These quotations often come from fiction, but they are also used in everyday life.

For example, a Russian user typing the query [контора пишет] is likely to be looking for the meaning or the original source of the expression. Quite often, however, the search engine results are disappointing: they include many pages where the expression is merely used, rather than the original source or the definition of the popular quotation.

Analyzer of the ranking of original texts

Unfortunately, copyrighted content is illegally copied all too often on the web. Any author faces the fact that his or her original texts are being stolen: the text of a newly created article can be copied to some other web page within days or even hours of its publication. The websites stealing content in this way usually claim that it was "taken from open sources" or "uploaded by one of the users". The stolen content can be used to attract search engine users to a web page, and the resulting traffic can be converted to money. This is the main reason for such 'borrowing'. The ability to identify original texts and rank the corresponding web pages higher than pages containing copied materials is a crucial property of any search engine.

Analyzer of search spam level

At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.

Search spam is any text, URL, technology, program code or other web element created by a webmaster for the sole purpose of promoting a site in search engines' results, rather than to support fast and reliable search based on complete and authentic information.

Analyzer of 'adult sites' presence in search results

This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.

For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.

Analyzer of the quality of SafeSearch

SafeSearch is a filter that is supposed to block "adult only" websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?

This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.

Recall analyzer

The recall analyzer estimates the relative sizes of the indices of Internet search engines.

Update analyzer

‘Update’ refers to the process of search results renewal. When the results are updated, some sites may make it to the top 10, some other sites may "sink". Every search engine has its own update style which becomes clear in this analyzer. Every day the search engine update analyzer monitors the top ten responses to 140 queries in order to assess the number of sites that changed their positions, and how much the positions have changed.
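A day-to-day comparison of this kind could be sketched as follows. The function name `compare_top` is hypothetical, and so is the rule that a site missing from the top list counts as sitting one place below it (position 11 for a top 10). The analyzer's actual scoring method is not described here.

```python
def compare_top(yesterday, today, depth=10):
    """Compare two ranked lists of sites for the same query.

    Returns (number of sites whose position changed,
             total number of positions moved).
    Each list is assumed to contain every site at most once.
    """
    old = {site: rank for rank, site in enumerate(yesterday[:depth], start=1)}
    new = {site: rank for rank, site in enumerate(today[:depth], start=1)}
    outside = depth + 1       # assumed rank of a site that is not in the list
    changed = 0
    total_shift = 0
    for site in old.keys() | new.keys():
        shift = abs(old.get(site, outside) - new.get(site, outside))
        if shift:
            changed += 1
            total_shift += shift
    return changed, total_shift

# Made-up example with a top 3 instead of a top 10.
result = compare_top(["a.com", "b.com", "c.com"],
                     ["b.com", "a.com", "d.com"], depth=3)
```

In the made-up example two sites swap places, one drops out and one enters, so all four sites change position by one place each. Summing such pairs over all monitored queries would give a daily picture of how strongly the results were reshuffled.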