Integral Search Quality
Overall Search Quality
This rating helps to assess the overall quality of search for each search engine. It is based on the results of all the special analyzers, the analyzers of 'Clicks' and 'Updates' being excluded, since their results have a purely informational value.
The overall rating is calculated as follows:
1) For each special analyzer, the search engines' scores are normalized to 100 with respect to the best score. This is done in order to eliminate the differences between the scoring scales of the various analyzers.
2) For each analyzer, the normalized search engine scores are multiplied by the specific coefficient assigned to that analyzer. These coefficients reflect our understanding of how important the given feature or type of search is for the overall search quality. If in your opinion some search features have more or less merit for the overall search, please feel free to adjust the coefficients of the corresponding analyzers by moving the sliders at the top of this page. Once you adjust the weights, the overall search quality scores will be recalculated.
3) The resulting numbers are then added up and divided by the sum of the coefficients. Because every normalized score lies between 0 and 100, this weighted average also yields a number between 0 and 100, which represents the overall search quality of the search engine.
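The three steps above can be sketched in a few lines of Python. This is only an illustration: the function name, the analyzer names and the sample numbers are hypothetical, and the sketch assumes that a higher raw score is better for every analyzer.

```python
def overall_scores(raw_scores, weights):
    """Weighted overall score per engine.

    raw_scores: {analyzer: {engine: raw score}}  (hypothetical layout)
    weights:    {analyzer: coefficient}
    """
    engines = {e for scores in raw_scores.values() for e in scores}
    totals = {e: 0.0 for e in engines}
    for analyzer, scores in raw_scores.items():
        best = max(scores.values())
        for engine in engines:
            # Step 1: normalize to 100 with respect to the best score.
            normalized = 100.0 * scores.get(engine, 0.0) / best if best else 0.0
            # Step 2: multiply by the analyzer's coefficient.
            totals[engine] += weights[analyzer] * normalized
    # Step 3: divide the sum by the sum of the coefficients.
    weight_sum = sum(weights[a] for a in raw_scores)
    return {e: totals[e] / weight_sum for e in engines}


# Made-up example: two analyzers with different scales and weights.
raw = {"navigational": {"A": 8, "B": 4}, "quotes": {"A": 30, "B": 60}}
coefficients = {"navigational": 2, "quotes": 1}
result = overall_scores(raw, coefficients)
print(result)
```

Moving a slider corresponds to changing one value in `coefficients` and calling the function again.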
‘Update’ refers to the process of search results renewal. When the results are updated, some sites may make it to the top 10, some other sites may "sink". Every search engine has its own update style which becomes clear in this analyzer. Every day the search engine update analyzer monitors the top ten responses to 140 queries in order to assess the number of sites that changed their positions, and how much the positions have changed.
Let Di be the change in position for the page that appeared i-th in top 10 search results on day 1. For example, if the fifth page from the first day top 10 appeared third or seventh on the second day, D5=2. If the second day top 10 did not contain a certain page which was present on the first day, then we will assume that Di=10 for that page.
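The update indicator U for a single query is the sum of the position changes divided by 100, which is the largest possible sum (ten pages, each with Di=10):
U = (D1 + D2 + ... + D10) / 100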
Consider a couple of examples:
On Day 1, a certain query has the following Top 10:
C1, C2, C3, C4, C5, C6, C7, C8, C9, C10.
On Day 2, the same query has this Top 10:
Cn, C1, C2, C3, C4, C5, C6, C7, C8, C9.
In this case the update indicator is calculated as follows:
((2-1)+(3-2)+(4-3)+...+(10-9)+10)/100 = 19/100 = 0.19 (19%)
For Day 1, a certain query has the following Top 10:
C1, C2, C3, C4, C5, C6, C7, C8, C9, C10.
For Day 2, the same query has this Top 10:
Cn1, Cn2, Cn3, Cn4, Cn5, Cn6, Cn7, Cn8, Cn9, Cn10.
In this case the update indicator equals:
10*10/100 = 1.00 (100%)
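The calculation in the two examples above can be reproduced with a short Python sketch. The function name is hypothetical; the rule (absolute position change, 10 for a page that left the top 10, sum divided by 100) follows the definition of Di given earlier.

```python
TOP_N = 10


def update_indicator(day1, day2):
    """Update indicator for one query, given the top 10 of two days."""
    total = 0
    for i, page in enumerate(day1[:TOP_N]):
        if page in day2[:TOP_N]:
            # Di is the absolute change in position.
            total += abs(day2.index(page) - i)
        else:
            # The page dropped out of the top 10: Di = 10.
            total += TOP_N
    return total / (TOP_N * TOP_N)


day1 = [f"C{i}" for i in range(1, 11)]
shifted = ["Cn"] + day1[:9]                      # first example
replaced = [f"Cn{i}" for i in range(1, 11)]      # second example

print(update_indicator(day1, shifted))   # 0.19
print(update_indicator(day1, replaced))  # 1.0
```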
The analyzer also calculates the additional parameters: the number of sites which disappeared from the search results and the number of sites which changed their positions.
This analyzer does not grade the search engines. The results can be interpreted in two ways: a search engine with frequent, large updates could be considered more up-to-date, while a search engine with rare updates could be considered more stable and predictable. The informer of this analyzer sorts the search engines in ascending order of update level.
Navigational Search Analyzers
The analyzers in this group estimate how well a search engine handles navigational queries. Different kinds of queries are used to check whether the site or page in question is found on the first result page.
Navigational queries are those looking for a specific site, file or page. Such queries will usually consist of the name of some organization or business (e.g., "Punjab and Sind Bank" or "Moores Glassworks"), of some print source or web site (e.g., "Cooking Light" or "bash.org"), or they will just name the page needed (like "Bofinger Rue Sherbrooke Ouest Montréal"). Likewise, eminent bloggers or official site owners often become a target of navigational queries (think of Art Garfunkel or Jessica Gottlieb).
Evidently, a navigational query can have more than one meaning: a user searching for "alabama state university" or "avril lavigne" might be looking for independent information about the organization or person in question. Still, the official site must be present in the SERP, and its position must be high enough. Furthermore, the analyzers of this group allow switching from stricter criteria (the official site takes the first or close to the first position) to looser ones (it is enough that the official site is among the top ten results).
Analyzer of Navigational Search
A search query whose purpose is to find a certain website is called a navigational query. Such queries include "sberbank", "komsomolskaya pravda", "rambler", "gazeta ru", etc.
The best result for a navigational query is the required site in the first position of search results.
To evaluate navigational search, the search engines are tested with 200 queries randomly selected from the array of navigational queries. Each query is assigned one or more sites/markers. The top 10 search results are checked for the site/marker entries. When several sites/markers are assigned to a query, any of them listed in one of the top positions counts as a hit. The percentage of queries that yield the site/marker on the first page is then calculated. This number is the aggregate indicator of the quality of navigational search.
The best search engine is the one with the highest aggregate indicator for this analyzer. In the informer, the search engines are sorted by the aggregate indicator.
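As a rough illustration, the aggregate indicator could be computed as below. The function name and data layout are hypothetical, and matching a marker as a substring of the result URL is an assumption; the text does not specify how markers are matched. The `top_n` parameter mimics the switch between stricter and looser criteria.

```python
def navigational_hit_rate(results, markers, top_n=10):
    """Percentage of queries with at least one marker in the top_n results.

    results: {query: [result URLs in ranked order]}
    markers: {query: [site/marker strings]}  (substring match is an assumption)
    """
    if not results:
        return 0.0
    hits = 0
    for query, urls in results.items():
        top = urls[:top_n]
        if any(marker in url for url in top for marker in markers[query]):
            hits += 1
    return 100.0 * hits / len(results)


# Made-up data: the first query is a hit, the second is not.
results = {
    "bash.org": ["http://bash.org/", "http://example.com/about-bash"],
    "cooking light": ["http://example.com/recipes", "http://example.org/food"],
}
markers = {"bash.org": ["bash.org"], "cooking light": ["cookinglight.com"]}
print(navigational_hit_rate(results, markers))           # looser: top 10
print(navigational_hit_rate(results, markers, top_n=1))  # stricter: first position
```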
Analyzer of Person Search
Queries consisting of a first and a last name often target a particular web page: the official personal page of the person named in the query. Even if the user is not sure that the personal site exists, such a website is an obvious hit and should appear in the top 10 search results.
The queries for this analyzer include the names of celebrities as well as people who are well known in some domain (for example, photographers, scientists, psychologists, etc.).
The analysis of queries and hits in this analyzer is analogous to that of the Analyzer of Navigational Search.
Information Search Analyzers
The largest and least well-defined group of queries consists of those aiming to find information, in the broad sense of the word. Although an exhaustive survey of all such queries seems impossible, some aspects of informational search come under close scrutiny in this group of analyzers.
Our analyzers cover Quote Search (Quotations, Catch Phrase and partly Originals) and Answer Search. It is very important that the search engine is able to (and is willing to) distinguish the original information source from its copies or imitations. This is the issue of the Originals' analyzer.
Our plans include broadening the scope of the search aspects under investigation. Yet even at this moment the scope is wider than it may seem, since a whole chain of analyzers in other groups is directly related to information search. Thus, it is usually the informational query that is affected by the search engine's "mistakes". Several of the Data Freshness analyzers also deal with informational queries. And naturally, informational queries make up most of the Assessing analyzers, as they make up most of web search in general.
Analyzer of Quotation Search Quality
Quotation search is the search for the source of a certain text fragment, i.e. either the original text (in which case a larger portion of it should appear on the site), or at least the author and the title of this text.
This analyzer examines 100 queries that consist of fairly long extracts from texts published on the Web. For each search engine we calculate the percentage of the search results containing at least one of the following: a. a larger fragment of the original text or b. the name of the author and the title of the text.
The positions of the pages in the search results are not taken into consideration. Neither are the sites where the text in question was first published (unlike in the original texts analyzer, where the priority of the copyright holder is important).
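The hit criterion of this analyzer (a larger fragment of the text, or the author together with the title) can be sketched as follows. The function names and the sample strings are hypothetical, and plain case-insensitive substring matching is an assumption made for brevity; the real check is not described here.

```python
def is_source_hit(page_text, larger_fragment, author, title):
    """True if the page holds the larger fragment, or both author and title."""
    text = page_text.lower()
    has_fragment = larger_fragment.lower() in text
    has_attribution = author.lower() in text and title.lower() in text
    return has_fragment or has_attribution


def source_hit_percentage(pages, larger_fragment, author, title):
    """Percentage of result pages that count as hits; positions are ignored."""
    if not pages:
        return 0.0
    hits = sum(is_source_hit(p, larger_fragment, author, title) for p in pages)
    return 100.0 * hits / len(pages)


# Made-up result pages for one query.
pages = [
    "... the quick brown fox jumps over the lazy dog, and then ...",
    "A review of Sample Essay by Jane Doe",
    "An unrelated page that only mentions Jane Doe",
]
share = source_hit_percentage(
    pages, "the quick brown fox jumps over the lazy dog", "Jane Doe", "Sample Essay"
)
print(round(share, 1))
```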
Catch Phrase Analyzer
This analyzer is devoted to queries containing short popular quotations. These quotations often come from fiction, but they are also used in everyday life.
For example, a Russian user typing the query [контора пишет] is likely to be looking for the meaning or the original source (the title of the text and the name of the author) of the expression. The search engines, however, often provide multiple examples of use of the expression, which is hardly what the user is looking for.
The analyzer examines 100 queries that consist of a popular quotation, the source of which is known. For each search engine we calculate the percentage of the search results containing at least one of the following: a. the given fragment (or one of several fragments) of the text where the quotation comes from or b. the name of the author and the title of the text. The positions of the pages among the search results are not taken into consideration.
Analyzer of Question Answering
This analyzer assesses the ability of search engines to find answers to question queries. The question queries include obvious questions with a question word ([when did CSKA win UEFA cup], [where is Lhasa located]) as well as other queries that imply an obvious short answer ([currency of Brasil], [nitric acid formula]).
The users typing such queries are most likely looking for the answer to their question. The quicker they get the answer, the better. Ideally, the snippet of the first page in search results should include the answer.
This analyzer has several tabs. In each of the tabs the total score of a search engine is the sum of its scores for each query.
1. Answer position in top 10 snippets
Here a search engine gets a score from 0 to 1 for each query, reflecting the highest position, among the top 10 results, of a snippet that contains the answer. For example, if the answer is found in the first snippet, the search engine gets 1 for the query. If it is found in the second snippet, the score is 0.9, and so on. If the answer is not found in the snippets of the top 10 results, the score is 0.
2. Answer presence in snippets
Here a search engine gets 1 for a query if at least one of the top 10 snippets contained an answer to the question. 0 is assigned otherwise.
3. Position of answer web pages in top 10
A search engine gets a score from 0 to 1 for each query, reflecting the highest position, among the top 10 results, of a web page that contains the answer. The score is 1 if the first page found contains the answer, 0.9 if the answer first appears on the second page, and 0 if none of the web pages found contains the answer.
4. Answer presence on web pages found
Here a search engine gets 1 for a query if at least one of the top 10 web pages contained an answer to the question. 0 is assigned otherwise.
A question query may have multiple possible answers. For example, the possible answers for [Where are Maldives located] are "Indian ocean" and "Asia".
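The per-query scores used in the four tabs can be sketched as below. The function names are hypothetical. The text only gives the scores for the first two positions (1 and 0.9); continuing in steps of 0.1 down to 0.1 for the tenth position is an assumption. The input is a list of booleans, one per result, saying whether that snippet (tabs 1 and 2) or web page (tabs 3 and 4) contains any of the accepted answers.

```python
def position_score(answer_flags):
    """Score by the highest-ranked result containing an answer (tabs 1 and 3)."""
    for position, has_answer in enumerate(answer_flags[:10], start=1):
        if has_answer:
            # 1 for position 1, 0.9 for position 2; linear below that (assumed).
            return (11 - position) / 10
    return 0.0


def presence_score(answer_flags):
    """1 if any of the top 10 results contains an answer, else 0 (tabs 2 and 4)."""
    return 1 if any(answer_flags[:10]) else 0


def total_score(per_query_flags, scorer):
    """Total score of a search engine: the sum of its per-query scores."""
    return sum(scorer(flags) for flags in per_query_flags)


# Made-up flags for three queries: answer at position 1, at position 2, nowhere.
queries = [[True] + [False] * 9, [False, True] + [False] * 8, [False] * 10]
print(total_score(queries, position_score))
print(total_score(queries, presence_score))
```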
Analyzer of Original Texts Ranking
Unfortunately, copyrighted content is illegally copied all too often on the web. Any author faces the fact that his or her original texts are being stolen: the text of a newly created article can be copied to some web page within days or even hours of the article's creation. The websites stealing content in this way usually claim that it was "taken from open sources" or "uploaded by one of the users". The stolen content can be used to attract search engine users to a web page, and the resulting traffic can be converted to money. This is the main reason for such 'borrowing'. The ability to identify original texts and rank the corresponding web pages higher than pages containing copied material is a crucial property of any search engine.
The analyzer of the ranking of original texts uses exact quotation queries to daily monitor the position of 100 marker articles. For these articles, the web sites of copyright holders are known. The analyzer can thus calculate the percentage of queries for which the original text is ranked higher than the copied material.
The queries in this analyzer are fragments from the original article. By default the queries are submitted to the search engines in quotation marks. This way we expect that the only responses found will be the original article and its copies. However the real users rarely rely on quotation marks, so an additional tab estimates the results based on queries submitted without quotation marks.
In the informer of this analyzer, the search engines are sorted by their ability to rank original texts higher than the copies.
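A minimal sketch of the indicator for this analyzer is given below. The names and the data layout are hypothetical. It assumes that the copyright holder's page is recognized by a substring of its URL and that every other result returned for the quoted fragment is a copy, as expected for queries submitted in quotation marks.

```python
def original_first_percentage(results, original_sites):
    """Percentage of queries where the original outranks every copy.

    results:        {query: [result URLs in ranked order]}
    original_sites: {query: site of the copyright holder}  (substring match assumed)
    """
    if not results:
        return 0.0
    wins = 0
    for query, urls in results.items():
        site = original_sites[query]
        original_ranks = [i for i, url in enumerate(urls) if site in url]
        copy_ranks = [i for i, url in enumerate(urls) if site not in url]
        # The original must be found, and must precede the best-ranked copy.
        if original_ranks and (not copy_ranks or original_ranks[0] < copy_ranks[0]):
            wins += 1
    return 100.0 * wins / len(results)


# Made-up data: the original wins for the first query only.
results = {
    "fragment one": ["https://author.example/a", "https://copy.example/a"],
    "fragment two": ["https://copy.example/b", "https://author.example/b"],
}
originals = {"fragment one": "author.example", "fragment two": "author.example"}
print(original_first_percentage(results, originals))
```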
Recall & Diversity Analyzers
As web search is used for every possible need, it is vital that a search engine be able to find meaningful answers to the most diverse questions. This ability depends both on the search engine's scope and on its interpreting potential.
These analyzers reflect the two above-mentioned aspects of a search engine's functioning: the quantity of the output and its diversity (when the non-strict sense of the query allows for it). To keep the results valid, we regularly test our set of queries.
It is worth noting that since the information that cannot be retrieved by means of a search engine is in some sense unattainable, the relative values in these analyzers are in no way less important than the absolute ones.
The recall analyzer estimates the relative size of indices of the Internet search engines.
The total number of documents indexed, as reported by a search engine (SE) itself, cannot be used for comparison because different SEs count documents in different ways. For example, some of them include duplicate documents in the count while others do not. Counting the duplicate documents may double the reported index size or increase it even more.
Additionally, the size of the index is a very PR-sensitive issue, as it is one of the very few simple notions in the SE area that are easily understood by journalists. This means that the bigger the index size you report, the better the press you get.
The number of documents found for a particular query does not always reflect the real number of documents indexed by an SE. Almost every frequent query will return tens of thousands of results in all search engines. But the user will never be allowed to see them all: the search session will be interrupted after browsing through the first few hundred pages. Thus the exact number of web pages found can be verified only when the number of possible results is very small, that is, for queries containing very rare words.
For multiple-word queries, certain SEs show in the search results not only the documents where all the words of the query are found, but also documents containing only some of the words. These "tail" documents are usually irrelevant to the query, but counting them can increase the total number of pages found.
In order to obtain independent and reliable data on the relative index size of the popular SEs, we developed a simple automatic method based on a set of sample queries. We gathered a set of very rare words, each of which occurs several tens of times on the web. Once a day, we count how many of these occurrences are found by each search engine.
To make the data steadier, we use a different set of sample queries from the whole query pool every day.
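The counting step can be sketched as follows. The names and the data layout are hypothetical, and expressing each count relative to the best engine is an assumption; the text only says that the analyzer estimates the relative size of the indices.

```python
def occurrences_found(found):
    """Number of rare-word occurrences found by each engine.

    found: {engine: {rare word: set of URLs returned for that word}}
    """
    return {
        engine: sum(len(urls) for urls in by_word.values())
        for engine, by_word in found.items()
    }


def relative_index_size(found):
    """Counts scaled so that the best engine gets 1.0 (assumed normalization)."""
    counts = occurrences_found(found)
    best = max(counts.values(), default=0)
    return {engine: (count / best if best else 0.0) for engine, count in counts.items()}


# Made-up daily sample of two rare words and two engines.
found = {
    "A": {"word1": {"u1", "u2", "u3"}, "word2": {"u4"}},
    "B": {"word1": {"u1"}, "word2": set()},
}
print(occurrences_found(found))
print(relative_index_size(found))
```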
The set of sample queries is constantly replenished by our linguists. If you have some rare words and want to help us cover the 'faraway' areas of the Net, please send us these words, and we will consider including them into the sample queries list.
Query Comprehension Analyzers
Of course, the query comprehension by a search engine is but a metaphor. Yet one eventually gets used to the search engines' "creativity" in the interpretation of queries: their search prompts, spelling corrections, their attempts at refining or improving the query, or re-ordering the answers according to the assumed object of inquiry etc.
At present, the analyzers in this group test such relatively primitive skills of the search engines as correcting typos, giving prompts, and expanding queries with synonyms. Meanwhile, some more refined search techniques (query interpretation, dealing with queries that have multiple meanings, and so on) can only be estimated indirectly, through the mistakes that arise when the search engine misunderstands the query. (Search Engine Mistake Analyzers, several in number, are arranged in a distinct group.) But then, it is the occasional mistakes that most clearly demonstrate an SE's skill in processing queries. When the search results are correct, we just don't pay any attention to the search engine's "tricks". These tricks are only exposed through incongruent or funny output.
Analyzer of Correct Hints
Most search engines attempt to suggest a correct spelling for a query when a typo is suspected. The quality of such hints is an important contribution to the overall quality of the search. This analyzer submits a query with a deliberate typo, looks for a hint in the search results, and counts how often the hint contains the 'correct' query.
The evaluation is based on the same set of queries containing typos that is used for the typo resistance analyzer. The more correct hints are given, the higher the search engine's index for this analyzer.
Typo Resistance Analyzer
Humans are not machines; they make mistakes, including when typing a search query: an adjacent key pressed by accident ("quety" instead of "query"), a missed or doubled character ("qury" or "queery"), or a word typed 'by ear' without knowing the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can adhere to one of the following strategies:
1) no processing: search with exact spelling only
2) recognize the typo, but still search for the entered query with an additional hint: "perhaps you were looking for [correct spelling]?"
3) recognize the typo and search for the correct spelling immediately
Depending on the chosen strategy, the user either remains unaware of the mistake, or notices it and may make an extra click (this is up to the user), or gets the correct results without ever noticing the mistake.
This analyzer compares the search results of the "correct query" and several forms of its possible mistypings. The similarity of results to those of a "correct" query is evaluated.
Apart from deliberate typo correction, matches can arise in four cases:
3) the page contains both the correct and mistyped spelling
4) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz" which is a typo of "mushrooms" is corrected to "mushroom")
5) promotion of the same websites both for correct and incorrect spelling of queries
All of these cases produce noise in this analyzer: an accidental match of results.
The similarity is evaluated in the same way as for the update analyzer but with a different set of queries.
The more matching results are registered, the higher the index of the search engine for this analyzer. This determines the order of search engines in the informer of the analyzer.
In the future, a rotation of query sets with typos drawn from a wide array will be introduced.
Disturbing Content Analyzers
However well a search engine works, there are still small annoyances that can easily dampen the user's good spirits and significantly shake his or her loyalty to a specific SE. These include, for example, the danger of contracting a computer virus and the presence of irritating, obtrusive ads.
Whereas the amount of advertisements or dangerous scripts on websites is not for the search engines to control, the concentration of objectionable content in the output is entirely in their power. So, if the websites in the output abound with annoying factors, the search engine would only benefit from ranking such sites far below high-quality safe ones.
Most analyzers of this group make use of specific techniques developed by "Ashmanov & Partners" for detecting ads, pornography, computer viruses etc. To make the results clearer, we selected the markers so that the probability of undesirable results is higher than usual.
Analyzer of Search Spam
At "Ashmanov and Partners" we study the phenomenon of search spam – the methods and technologies reducing the quality of search results and interfering with the operation of search engines.
Search spam is any text, URL, technology, program code or other web element created by a webmaster for the sole purpose of promoting a site in search engine results, rather than to support fast and reliable search based on complete and authentic information.
The experts check Top 10 results of search queries on a regular basis, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer. It shows the percentage of sites marked as spam in the overall number of sites that appeared in Top 10 of analyzed queries.
The source of information on the spam status of a given URL is the data of the anti-spam lab of the company "Ashmanov and Partners". The following categories of search spam are used:
* doorway – definite spam: doorways, leading the user to other pages,
* spamcatalog – definite spam: spammer catalogues,
* spamcontent – definite spam: spammers' stolen content,
* pseudosite – definite spam: site disguised as corporate (pseudo-company),
* catalog – catalogues,
* board – bulletin boards,
* domainsale – domains for sale,
* secondary – secondary, stolen content,
* partner – any partner programs,
* linksite – link support site,
* spamforum – forum containing spam,
* techspam – technical spam,
* searchres – search results
* cj – circular jerk
An aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer of the analyzer.
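The aggregate indicator can be sketched as below. The names and data layout are hypothetical. The text does not say whether every category in the list counts toward the share or only the "definite spam" ones, so the set of counted categories is left as a parameter; counting each result slot rather than each distinct site is likewise an assumption.

```python
DEFINITE_SPAM = {"doorway", "spamcatalog", "spamcontent", "pseudosite"}


def spam_share(top10_by_query, labels, counted_categories=None):
    """Percentage of top 10 results whose URL carries a counted spam label.

    top10_by_query:     {query: [result URLs]}
    labels:             {URL: category assigned by the experts}
    counted_categories: categories that count as spam; None means any label
    """
    urls = [url for top in top10_by_query.values() for url in top]
    if not urls:
        return 0.0

    def is_spam(url):
        category = labels.get(url)
        if category is None:
            return False
        return counted_categories is None or category in counted_categories

    return 100.0 * sum(is_spam(url) for url in urls) / len(urls)


# Made-up data: four results, one doorway and one catalogue.
top10 = {"q1": ["u1", "u2"], "q2": ["u3", "u4"]}
labels = {"u1": "doorway", "u3": "catalog"}
print(spam_share(top10, labels))                 # any labeled site counts
print(spam_share(top10, labels, DEFINITE_SPAM))  # only definite spam counts
```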
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. No queries which unambiguously indicate that the user is searching for porn are included.
For instance, the query [stockings] could come from a user looking for a stockings shop or for the corresponding category of pornography.
When search engines include pornographic results in responses to such queries, they run the risk of showing these results to users (potentially including minors) who were not looking for adult content.
We do not claim that it is «bad» or «immoral» for a search engine to return pornographic pages in response to unambiguously pornographic queries. However the appearance of adult results in responses to «regular» queries is a drawback in our opinion.
We make use of the 'Semantic Mirror' technology developed at 'Ashmanov and Partners' in order to detect adult content.
For every search engine, we calculate the percentage of pages with adult content among the top 10 results.
Analyzer of the SafeSearch Quality
SafeSearch is a filter that is supposed to block the «adult only» websites from appearing in the search results. Nowadays most search engines allow their users to filter the results in this way. But does SafeSearch do its job well enough?
This analyzer collects search results for ambiguous queries which may be interpreted as targeting a certain category of pornography, but also admit other interpretations. The question is whether SafeSearch effectively filters out the adult responses in such cases.
The set of queries in this analyzer is the same as in the analyzer of the presence of adult sites in search results. We submit these queries to the search engines with the SafeSearch option turned on wherever possible. In this case, the ideal results should contain no adult pages at all, nor any results with obscene words. If adult pages still show up in the search results, the search engine is not keeping its promise to protect the user with SafeSearch.
We make use of the Semantic Mirror technology developed at 'Ashmanov and Partners' in order to detect pornography. For every search engine, we calculate the percentage of pages with adult content among the top 10 results.