*** Welcome to piglix ***

Search queries


A web search query is a query that a user enters into a web search engine to satisfy his or her information needs. Web search queries are distinctive in that they are often plain text or hypertext with optional search-directives (such as "and"/"or" with "-" to exclude). They vary greatly from standard query languages, which are governed by strict syntax rules as command languages with keyword or positional parameters.

There are three broad categories that cover most web search queries: informational, navigational, and transactional. These are also called "do, know, go." Although this model of searching was not theoretically derived, the classification has been empirically validated with actual search engine queries.

Search engines often support a fourth type of query that is used far less frequently:

Most commercial web search engines do not disclose their search logs, so information about what users are searching for on the Web is difficult to come by. Nevertheless, research studies appeared in 1998. Later, a study in 2001 analyzed the queries from the Excite search engine showed some interesting characteristics of web search:

A study of the same Excite query logs revealed that 19% of the queries contained a geographic term (e.g., place names, zip codes, geographic features, etc.). Studies also show that, in addition to short queries (i.e., queries with few terms), there are also predictable patterns to how users change their queries.

A 2005 study of Yahoo's query logs revealed 33% of the queries from the same user were repeat queries and that 87% of the time the user would click on the same result. This suggests that many users use repeat queries to revisit or re-find information. This analysis is confirmed by a Bing search engine blog post telling about 30% queries are navigational queries

In addition, much research has shown that query term frequency distributions conform to the power law, or long tail distribution curves. That is, a small portion of the terms observed in a large query log (e.g. > 100 million queries) are used most often, while the remaining terms are used less often individually. This example of the Pareto principle (or 80–20 rule) allows search engines to employ optimization techniques such as index or database partitioning, caching and pre-fetching. In addition, studies have been conducted on discovering linguistically-oriented attributes that can recognize if a web query is navigational, informational or transactional.


...
Wikipedia

...