Advanced search and query language. Search engine query language

Anonim

A query language is a formal computer language used to make queries in databases and information systems.

In general, query languages can be classified according to whether they are used for a database or for information retrieval. The difference is that a database query is made to obtain factual answers to the question posed, while an information retrieval query tries to find documents containing information related to the user's area of interest.

Databases

Database query languages include the following examples:

  • .QL - an object-oriented query language for relational databases; a successor to Datalog.
  • Contextual Query Language (CQL) - a formal query language for information retrieval systems (such as web indexes or bibliographic catalogs).
  • CQLF (CODASYL Query Language, Flat) - for CODASYL-type databases.
  • Concept-Oriented Query Language (COQL) - used in the concept-oriented model (COM). It is based on the concept data modeling construct and uses operations such as projection and de-projection for multidimensional analysis, analytical operations and inference.
  • DMX - used for data mining models.
  • Datalog - a deductive database query language.
  • Gellish English - a language that can be used to query Gellish English databases and allows for dialogues (queries and responses) as well as information and knowledge modeling.
  • HTSQL - translates HTTP requests to SQL.
  • ISBL - used for PRTV (one of the first relational database management systems).
  • LDAP - a protocol for querying directory services that runs over TCP/IP.
  • MDX - a query language for OLAP databases.

Search engines

A search query language, in turn, is aimed at finding data in search engines. Its queries often consist of plain text or hypertext with additional syntax (e.g. "and"/"or"). It differs significantly from standard query languages, which are governed by strict command syntax rules or contain positional parameters.

How are search queries classified?

There are three broad categories that cover most searches: informational, navigational, and transactional. Although this classification has not been derived theoretically, it has been confirmed empirically against actual search engine queries.

Informational queries are those that cover broad topics (such as a particular city or truck model) for which there may be thousands of relevant results.

Navigational queries are queries that look for a single site or web page (such as YouTube).

Transactional queries reflect the user's intention to perform a certain action, such as buying a car or booking a ticket.

Search engines often support a fourth type of query, which is used much less frequently. These are so-called connectivity queries, which report on the connectivity of the indexed web graph (for example, the number of links to a particular URL, or how many pages are indexed from a particular domain).
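As a rough illustration of what a connectivity query computes, the sketch below counts links over a tiny, made-up web graph. The graph, the URLs and the function names are all hypothetical; a real search engine answers such queries from its own index.

```python
from collections import Counter

# Hypothetical toy web graph: each indexed page maps to the pages it links to.
web_graph = {
    "a.example/home": ["b.example/post", "c.example/"],
    "b.example/post": ["c.example/"],
    "c.example/": ["a.example/home"],
    "a.example/about": ["c.example/"],
}

def inlink_count(graph, url):
    """Number of indexed pages that link to `url`."""
    return sum(url in targets for targets in graph.values())

def pages_per_domain(graph):
    """How many indexed pages belong to each domain."""
    return Counter(page.split("/", 1)[0] for page in graph)

print(inlink_count(web_graph, "c.example/"))
print(pages_per_domain(web_graph))
```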

How is information searched?

Most search engines do not disclose their search logs, so information about what users are looking for on the Web is very difficult to find. Nevertheless, the first scientific studies appeared in 1998. A follow-up study in 2001 analyzed search engine queries and also showed how users employ the query language of search engines.

The study revealed several interesting features of web search:

  • The average search query length was 2.4 words.
  • About half of users made one request, and just under a third of users made three or more unique requests in succession.
  • Almost half of users viewed only the first one or two pages of results.
  • Less than 5% of users used advanced search options (for example, selecting specific categories or searching within results).

Features of user behavior

The study also showed that 19% of queries contained a geographic term (e.g. place names, postcodes, geographic features). It is also worth noting that, in addition to queries being short (that is, containing only a few terms), there were often predictable patterns in how users changed their search phrases.

It was also found that 33% of queries from the same user are repeat queries, and that in 87% of those cases the user clicks on the same result. This suggests that many users use repeat queries to revisit or re-find information.

Frequency distributions of queries

In addition, researchers have confirmed that the frequency distributions of query terms follow a power law. That is, a small portion of the terms observed in a large query log (for example, more than 100 million queries) are used most often, while the remaining terms are each used less often. This is an example of the Pareto principle (or "80-20 rule"), and it has allowed search engines to use optimization techniques such as index or database partitioning, caching and prefetching, and has also made it possible to improve the search engine's query language.
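The skew described above can be measured with a simple term count. The following is a minimal sketch over a hypothetical eight-query log; the log contents and the `head_share` helper are assumptions made for illustration, and a real analysis would run over millions of queries.

```python
from collections import Counter

# Hypothetical miniature query log; a real log would hold millions of queries.
query_log = [
    "weather", "weather today", "cheap flights", "weather",
    "news", "weather radar", "news today", "cheap hotels",
]

def term_frequencies(queries):
    """Count how often each term occurs across all queries."""
    return Counter(term for q in queries for term in q.lower().split())

def head_share(freqs, top_fraction=0.2):
    """Share of all term occurrences covered by the top fraction of distinct terms."""
    ranked = [count for _, count in freqs.most_common()]
    head = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:head]) / sum(ranked)

freqs = term_frequencies(query_log)
print(freqs.most_common(3))
print(head_share(freqs))
```

A high head share for a small `top_fraction` is what makes caching the most frequent terms worthwhile.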

In recent years, it has been found that the average length of queries has steadily increased over time. So, the average query in English has become longer. In this regard, Google implemented an update called "Hummingbird" (in August 2013), which is able to handle long search phrases written in conversational, "colloquial" language (like "where is the nearest coffee shop?").

Longer queries receive additional processing: they are divided into phrases formulated in standard language, and answers are displayed separately for the different parts.

Structured queries

Search engines that support logical (Boolean) operators and syntax offer more advanced query languages. A user who is looking for documents that cover multiple topics or facets can describe each of them by a logical combination of characteristic words. At its core, the logical query language is a collection of certain phrases and punctuation marks.
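One common way to read such a multi-facet query is as an AND of facets, where each facet is an OR of characteristic words. The sketch below assumes that reading; the function name, the facets and the sample documents are hypothetical, and matching is done on whole lowercase words only.

```python
def matches(facets, document):
    """True if the document contains at least one word from every facet.

    `facets` is a list of word lists: words inside a facet are OR-ed,
    and the facets themselves are AND-ed.
    """
    words = set(document.lower().split())
    return all(any(term in words for term in facet) for facet in facets)

# (electric OR hybrid) AND (cars OR vehicles)
facets = [["electric", "hybrid"], ["cars", "vehicles"]]
docs = [
    "Hybrid vehicles are selling well",
    "Electric guitars for beginners",
    "Used cars and trucks",
]
hits = [d for d in docs if matches(facets, d)]
print(hits)
```

Only the first document satisfies both facets; the other two each match a single facet.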

What is advanced search?

The query languages of Yandex and Google are capable of performing a more narrowly targeted search, subject to certain conditions. Advanced search can search by part of the page title or by title prefix, as well as within specific categories and lists of names. It can also restrict the search to pages that contain certain words in the title or that belong to certain subject groups. With the right use of a query language, it can handle parameters far more complex than the basic results of most search engines, including user-specified words with variable endings and similar spellings. When advanced search results are presented, a link to the relevant sections of the page is displayed.

Advanced search can also find all pages containing a specific phrase, whereas a standard search engine query may not reach discussion pages. In many cases, the query language can also reach pages marked with noindex tags.

In some cases, a well-formed query allows you to find information containing a number of special characters and letters from other alphabets (Chinese characters for example).

How are query language characters read?

Upper and lower case, as well as some diacritics (umlauts and accents), are not taken into account in searches. For example, searching for the keyword Citroen will also find pages containing the word "Citroën". Some ligatures also correspond to individual letters. For example, a search for "aeroskobing" will find pages containing "Ærøskøbing" (AE=Æ).
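Case and diacritic folding of this kind can be approximated with the Python standard library. This is a minimal sketch, not how any particular search engine does it: the small `FOLDING_TABLE` is an assumption, since real engines use much larger, language-aware mappings.

```python
import unicodedata

# Assumed mapping for letters that Unicode decomposition does not split.
FOLDING_TABLE = str.maketrans({"æ": "ae", "œ": "oe", "ø": "o"})

def fold(text):
    """Lowercase the text, expand a few special letters, and strip accents."""
    text = text.casefold().translate(FOLDING_TABLE)
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(fold("Citroën") == fold("citroen"))
print(fold("Ærøskøbing") == fold("aeroskobing"))
```

If both the indexed text and the query pass through the same `fold` function, the two spellings compare as equal.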

Many non-alphanumeric characters are consistently ignored. For example, it is impossible to find information with a query containing the string |LT| (letters between two vertical bars), even though this string is used in some conversion templates. The results will simply contain all data with "LT". Some characters and phrases are handled differently: the query "loan (finance)" will display articles with the words "loan" and "finance", ignoring the brackets, even if there is an article with the exact title "Loan (finance)".
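The effect of ignoring non-alphanumeric characters can be sketched with a simple tokenizer. The `tokenize` function below is a hypothetical illustration that keeps only runs of letters and digits; actual engines have their own, more detailed rules.

```python
import re

def tokenize(query):
    """Split a query into lowercase alphanumeric terms, dropping punctuation."""
    return re.findall(r"[^\W_]+", query.lower())

print(tokenize("|LT|"))
print(tokenize("loan (finance)"))
```

Because the vertical bars and brackets never reach the index lookup, a query for the bare terms and a query with the punctuation produce the same term list.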

The query language offers many functions.

Syntax

The query languages of Yandex and Google may use some punctuation marks to refine the search. One example is curly braces: {{search}}. The phrase they contain is searched for in its entirety, without changes.

A phrase in double quotes allows you to define the search object. For example, a word in quotation marks will be recognized as being used in a figurative sense or as a fictional character, without quotation marks - as information of a more documentary nature.

Also, all major search engines support the "-" character for logical NOT, as well as AND and OR. The exception is terms that cannot be prefixed with a hyphen or dash.
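The exclusion operator can be illustrated with a small sketch. The `parse` and `search` functions and the sample documents are hypothetical: every plain term is treated as required (an implicit AND), every term prefixed with "-" is excluded, and a lone "-" is treated as an ordinary term.

```python
def parse(query):
    """Split a query into required and excluded terms ('-' prefix means NOT)."""
    required, excluded = [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            required.append(token)
    return required, excluded

def search(query, documents):
    """Return documents containing all required terms and no excluded ones."""
    required, excluded = parse(query)
    results = []
    for doc in documents:
        words = set(doc.lower().split())
        if all(t in words for t in required) and not any(t in words for t in excluded):
            results.append(doc)
    return results

docs = ["jaguar car review", "jaguar habitat and diet", "car rental"]
print(search("jaguar -car", docs))
```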

Inexact match of the search phrase is marked with ~ symbol. For example, if you don't remember the exact wording of a term or name, you can enter it in the search bar with the specified character, and you will be able to get the most similar results.
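How search engines implement inexact matching is not described here. As a stand-in, the sketch below uses `difflib.get_close_matches` from the Python standard library to pick the vocabulary entries most similar to a misspelled term. The vocabulary, the `fuzzy_lookup` name and the cutoff value are assumptions for illustration.

```python
import difflib

def fuzzy_lookup(term, vocabulary, cutoff=0.8):
    """Return up to three vocabulary entries that closely resemble `term`."""
    return difflib.get_close_matches(term, vocabulary, n=3, cutoff=cutoff)

vocabulary = ["receive", "recipe", "deceive"]
print(fuzzy_lookup("recieve", vocabulary))
```

Lowering `cutoff` admits more distant candidates, at the cost of less relevant suggestions.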

Custom search options

There are also search parameters such as intitle and incategory. They are colon-separated filters in the form "filter:query string". The query string can contain the term or phrase you are looking for, or part or all of the page title.

The "intitle:query" feature prioritizes search results by title, but also shows normal results for the body content. Several of these filters can be used simultaneously. How can this be used?

A query like "intitle:name of the airport" will return all articles containing the name of the airport in the title. If you formulate it as "parking intitle:name of the airport", then you will get articles with the name of the airport in the title that mention parking in the text.

Searching with the "incategory:Category" filter works by first returning articles that belong to a certain group or list of pages. For example, a search query like "Temples incategory:History" will return results related to the history of temples. This function can also be used as an advanced function by setting various parameters.
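To show how colon-separated filters can be separated from the rest of a query, here is a minimal parsing sketch. The `parse_filters` function and the airport name are hypothetical, and the sketch only handles single-word filter values; it does not reproduce the exact grammar of any search engine.

```python
import re

# Matches the two filters discussed above, followed by a single-word value.
FILTER_RE = re.compile(r"\b(intitle|incategory):\s*(\S+)", re.IGNORECASE)

def parse_filters(query):
    """Separate filter:value pairs from the free-text part of a query."""
    filters = {}
    for name, value in FILTER_RE.findall(query):
        filters.setdefault(name.lower(), []).append(value)
    free_text = FILTER_RE.sub("", query).split()
    return free_text, filters

print(parse_filters("parking intitle:Heathrow"))
print(parse_filters("Temples incategory:History"))
```

The free-text terms would then be matched against the page body, while each filter restricts the candidate set by title or category.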
