Internet search engines: Yandex, Google, Rambler, Yahoo. Composition, functions, operating principles. What exactly does a search engine do?


Hello, dear readers of the blog. If you are involved in search engine optimization, whether professionally (promoting commercial projects for money) or as an amateur, you will sooner or later find that you need to know how search engines work in general in order to successfully optimize your own or someone else's site.

You need to know the enemy by sight, as they say, although of course search engines (for RuNet these are Yandex and Google) are not enemies at all but rather partners, since their share of traffic is in most cases the predominant one. There are exceptions, of course, but they only confirm the rule.

What a snippet is and how search engines work

But first we need to sort out what a snippet is, what it is for, and why its content matters so much to an optimizer. In the search results it appears right below the link to the found document (I have already written about where the text of that link comes from).

Pieces of the document's text are usually used as the snippet. Ideally, it lets the user form an opinion about the page's content without visiting it (though only if the snippet turns out well, which is not always the case).

The snippet is generated automatically, and the algorithm decides which fragments of the text to use. Importantly, the same web page will have different snippets for different queries.

It is also possible that the content of the Description tag will sometimes be used as the snippet (especially in Google). Of course, this also depends on the query for which the page is shown.

For example, the content of the Description tag may be shown when the query keywords match the words you used in the description, or when the algorithm has not yet found text fragments on your site for all the keywords of the query for which your page appears in Yandex or Google results.

So do not be lazy: fill in the Description tag for every article. In WordPress you can do this with a plugin I have described (and I strongly recommend using it).

If you are a Joomla fan, there is separate material on that.

But a snippet cannot be built from the inverted index, because it stores only information about the words used on the page and their positions in the text. That is why, to create snippets of the same document for different queries, our beloved Yandex and Google keep, in addition to the inverted index (needed for the search itself; read about it below), a forward index, i.e. a copy of the web page.

By keeping a copy of the document in their database, they can then conveniently cut the required snippets from it without going back to the original.

So it turns out that search engines store both the forward and the inverted index of a web page in their database. By the way, you can indirectly influence snippet formation by optimizing the page text so that the algorithm picks exactly the fragment you have in mind. We will talk about that in another article.
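To make the idea of cutting a snippet out of a stored copy more concrete, here is a minimal sketch in Python. It is only an illustration: the function name `make_snippet`, the `radius` parameter and the sample page text are my own assumptions, and real search engines choose fragments in far more sophisticated ways.

```python
def make_snippet(text, query, radius=2):
    """Cut a few words around the first query word found in the stored page copy.

    Hypothetical helper for illustration only; not how any real engine works.
    """
    words = text.split()
    query_words = {w.lower() for w in query.split()}
    for i, word in enumerate(words):
        # Compare without trailing punctuation so "page." matches "page".
        if word.lower().strip(".,!?") in query_words:
            start = max(0, i - radius)
            return " ".join(words[start:i + radius + 1])
    # No query word found: fall back to the beginning of the text.
    return " ".join(words[:2 * radius + 1])


# A stored copy of a page (the "forward index" in the article's terms).
page = ("Search engines keep a copy of every page. "
        "The inverted index stores word positions only.")

snippet_one = make_snippet(page, "inverted index")
snippet_two = make_snippet(page, "copy")
```

The same stored text yields different fragments for the two queries, which is exactly why the copy of the page has to be kept alongside the inverted index.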

How search engines work

The essence of optimization is to “help” search engine algorithms lift the pages of the sites you promote to the highest possible positions in the results for particular queries.

I put the word “help” in the previous sentence in quotation marks because our optimization does not exactly help; often it actively hinders the algorithm from producing results that are fully relevant to the query.

But that is the optimizers' bread and butter, and until search algorithms become perfect there will be opportunities to improve positions in Yandex and Google through internal and external optimization.

Before moving on to optimization methods, we need at least a working understanding of how search engines operate, so that all further work is done consciously, knowing why it is needed and how those we are trying to fool a little will react to it.

Clearly we cannot understand the entire logic of their work, because much of the information is not disclosed, but at first an understanding of the basic principles will be enough. So, let's begin.

So how do search engines work? Oddly enough, the logic is basically the same for all of them: information is collected about every web page they can reach, and then this data is processed in a clever way so that it is convenient to search. That, strictly speaking, is all, and the article could end here, but let's add some detail.

First, let's clarify that what we call a page of a site is called a document. A document must have its own unique address and, notably, hash links do not produce a new document.

Second, we should look at the algorithms (methods) for searching the collected database of documents.

Forward and inverted index algorithms

Obviously, simply scanning all the pages stored in the database will not be optimal. This method is called the direct search algorithm, and although it certainly finds the needed information without missing anything important, it is completely unsuitable for large volumes of data, because the search would take too long.

Therefore, to work efficiently with large volumes of data, the inverted index algorithm was developed. Notably, it is used by all the major search engines in the world. So let's dwell on it and look at how it works.

With the inverted index algorithm, documents are converted into text files containing a list of all the words they contain.

The words in such lists (index files) are arranged alphabetically, and next to each one the positions on the web page where it occurs are recorded as coordinates. Besides the position in the document, other parameters that determine the word's significance are stored for each word.

If you remember, many books (mostly technical or scientific) have on their last pages a list of words used in the book, with the page numbers where they occur. Of course, this list does not include every word in the book, but it can serve as an example of building an index file with inverted indexes.

Note that search engines look for information not on the Internet itself but in the inverted indexes of the web pages they have processed. They do keep forward indexes (the original text) as well, because these are needed later for composing snippets, as we discussed at the beginning of this publication.

All major systems use the inverted index algorithm because it speeds up the process, although some information is inevitably lost through the distortions introduced when a document is converted into an index file. To make storage easier, inverted index files are usually compressed in a clever way.
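Here is a small Python sketch of what such an index file might look like. It is a toy under my own assumptions: the function names, the three sample documents and the "all query words must be present" search rule are illustrative, and no compression or extra significance parameters are modeled.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map every word to {document id: [positions of the word in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(position)
    # Keep words in alphabetical order, like the index files described above.
    return dict(sorted(index.items()))


def search(index, query):
    """Return ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], {}))
    for word in words[1:]:
        result &= set(index.get(word, {}))
    return result


# Hypothetical mini-collection.
documents = {
    "doc1": "how to choose a car",
    "doc2": "how to choose a car radio",
    "doc3": "search engines index the web",
}
index = build_inverted_index(documents)
```

Looking up `index["car"]` returns the documents and word positions directly, without scanning the text of every page, which is the whole point compared with direct search.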

The mathematical model used for ranking

To search the inverted indexes, a mathematical model is used that simplifies both finding the needed web pages (for the entered query) and determining the relevance of each found document to that query. The better a document matches the query (the more relevant it is), the higher it should appear in the search results.

So the main task of the mathematical model is to find pages in its inverted index database that match the query and then sort them in order of decreasing relevance to it.

A simple logical model, in which a document counts as found if it contains the searched phrase, will not do, because of the huge number of such web pages that would be returned.

A search engine must not only return a list of all web pages containing the words from the query. It must return this list in a form where the most relevant documents come first (sorted by relevance). This task is not trivial and, by its nature, cannot be solved perfectly.

By the way, optimizers take advantage of the imperfection of any mathematical model, using various means to influence document ranking (in favor of the site they are promoting, naturally). The mathematical model used by all search engines belongs to the vector class. It uses the notion of a document's weight with respect to the query entered by the user.

In the basic vector model, a document's weight for a given query is calculated from two main parameters: the frequency with which the word occurs in it (TF, term frequency) and how rarely the word occurs on all the other pages of the collection (IDF, inverse document frequency).

The collection here means the entire set of pages known to the search engine. Multiplying these two parameters together gives the document's weight for the given query.

Naturally, different search engines use many other factors besides TF and IDF when calculating the weight, but the essence remains the same: a page's weight is greater the more often the query word occurs in it (up to a certain limit, beyond which the document may be treated as spam) and the more rarely that word occurs in all the other documents indexed by the system.
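A rough Python sketch of this weight calculation follows. Note the assumptions: the article only says that TF and IDF are multiplied, so the exact formulas here (relative frequency for TF, a logarithm of the ratio for IDF) are a common textbook variant that I picked for illustration, and the tiny collection is made up.

```python
import math


def tf(word, doc_words):
    """Term frequency: share of the document's words that are `word`."""
    return doc_words.count(word) / len(doc_words)


def idf(word, collection):
    """Inverse document frequency: the rarer the word in the collection, the higher.

    The logarithmic form is an assumption (a common textbook choice).
    """
    containing = sum(1 for doc in collection if word in doc)
    if containing == 0:
        return 0.0
    return math.log(len(collection) / containing)


def weight(word, doc_words, collection):
    """Weight of a document for a one-word query: TF multiplied by IDF."""
    return tf(word, doc_words) * idf(word, collection)


# Hypothetical collection of three already tokenized documents.
collection = [
    ["car", "car", "radio"],
    ["car", "engine"],
    ["web", "search"],
]

w_radio = weight("radio", collection[0], collection)
w_car = weight("car", collection[0], collection)
```

In this toy collection the word "radio" occurs in the first document less often than "car", yet it gets the larger weight because it is rarer across the collection.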

Assessors evaluate the quality of the formula

So it turns out that results for particular queries are generated entirely by a formula, without human involvement. But no formula works perfectly, especially at first, so the operation of the mathematical model has to be monitored.

For this purpose specially trained people, assessors, look through the results (of the search engine that hired them) for various queries and evaluate the quality of the current formula.

All their remarks are taken into account by the people responsible for tuning the model. Changes or additions are made to the formula, and as a result the quality of the search engine improves. Assessors thus act as a kind of feedback loop between the algorithm's developers and its users, which is necessary for improving quality.

The main criteria for assessing the quality of the formula are:

Precision of the results: the percentage of relevant documents (those that match the query). The fewer irrelevant web pages (for example, doorways) are present, the better.
Completeness of the results: the ratio of the number of relevant web pages returned for the query to the total number of relevant documents in the entire collection. That is, the database may contain more pages matching the query than were shown in the results. In that case the results are incomplete. Some relevant pages may have fallen under a filter, having been mistaken, for example, for doorways or other junk.
Freshness of the results: the degree to which the real web page on the site matches what is written about it in the search results. For example, the document may no longer exist or may have changed greatly, yet it will still appear in the results for the query despite being physically absent at that address or no longer matching the query at all. Freshness depends on how often search robots scan the documents in their collection.

How Yandex and Google build their collection

Despite the apparent simplicity of indexing web pages, there are many nuances you need to know and later use when optimizing (SEO) your own or clients' sites. Indexing the web (building the collection) is carried out by a specially designed program called a search robot (bot).

The robot receives an initial list of addresses to visit, copies the content of those pages, and hands it to the algorithm for further processing (which converts it into inverted indexes).

The robot can not only walk a predefined list but also follow the links on those pages and index the documents behind them. So the robot behaves just like an ordinary user following links.
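The walk described above can be sketched in Python as a breadth-first traversal. Everything here is an assumption made for illustration: the `crawl` function, the `max_pages` limit and the in-memory `WEB` dictionary that stands in for real pages and their links (no network access is involved).

```python
from collections import deque


def crawl(seed_urls, get_links, max_pages=100):
    """Visit the seed addresses, then follow links found on each visited page."""
    queue = deque(dict.fromkeys(seed_urls))  # initial list, duplicates removed
    seen = set(queue)
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)  # here a real robot would copy the page content
        for link in get_links(url):
            if link not in seen:  # do not queue the same address twice
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical link graph: page -> links found on that page.
WEB = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}

order = crawl(["a"], lambda url: WEB.get(url, []))
```

Starting from a single address, the robot reaches every page that can be reached by links, just as a user clicking through the site would.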

It follows that a robot can index everything that is normally available to a user with a browser (search engines index publicly visible documents that any Internet user can see).

There are a number of features associated with indexing documents on the web (let me recall what we have already discussed).

The first feature is that, in addition to the inverted index built from the original document downloaded from the web, the search engine also keeps a copy of it; in other words, search engines also keep a forward index. Why? As I mentioned earlier, it is needed to compose different snippets depending on the query entered.

How many pages of one site Yandex shows in results and indexes

I would like to draw your attention to a peculiarity of Yandex: only one document from each site is shown in the results for a given query. Until recently, two pages from the same resource could not appear at different positions in the results.

This was one of Yandex's basic rules. Even if a site has a hundred pages relevant to a given query, only one (the most relevant) will appear in the results.

Yandex wants the user to receive varied information rather than scroll through several pages of results from the same site, which the user may have found uninteresting for one reason or another.

However, I hasten to correct myself: while finishing this article I learned that Yandex has begun to allow a second document from the same resource in the results as an exception, if that page turns out to be “very good and appropriate” (in other words, highly relevant to the query).

Notably, these additional results from the same site are also numbered, so some resources in lower positions will drop out of the top because of them.

Search engines try to index all sites evenly, but this is often difficult because the number of pages on them varies enormously (some have ten, some have ten million). What to do in that case?

Yandex handles this by limiting the number of documents it will add to the index from one site.

For projects with a second-level domain name, the maximum number of pages that the RuNet mirror can index is in the range of one hundred to one hundred and fifty thousand (the specific number depends on its attitude toward the project).

For resources with third-level domain names: from ten to thirty thousand pages (documents).

If you have a site on a second-level domain and need to get, say, a million web pages indexed, the only way out is to create a number of subdomains.

Subdomains of a second-level domain can look like this: JOOMLA.site. The number of subdomains of a second-level domain that Yandex can index is a little over 200 (sometimes up to a thousand), so in this simple way you can get considerably more web pages into the index of the RuNet mirror.

How Yandex treats sites in non-Russian domain zones

Because Yandex was until recently a search engine only for the Russian-language part of the Internet, it indexed mainly Russian-language projects.

So if you create a site outside the domain zones it treats as Russian-language by default (RU, SU and UA), do not expect quick indexing, because it will most likely find the site no sooner than a month later. But once indexing has started, it will proceed at the same frequency as in Russian-language domain zones.

That is, the domain zone affects only the time that passes before indexing begins, not its subsequent frequency. By the way, what does this frequency depend on?

The logic search engines follow when re-indexing pages is roughly this:

having found and indexed a new page, the robot visits it again the next day;
having compared its content with yesterday's and found no differences, the robot comes back in three days;
if nothing has changed by then either, it comes back in ten days, and so on.

So over time the frequency of the robot's visits to a page becomes equal to, or at least comparable with, the frequency of its updates. Moreover, the revisit interval for different sites can range from minutes to years.
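The revisit logic above can be sketched like this in Python. The ladder of 1, 3 and 10 days comes from the list above; everything else is my assumption for the sake of the example: that the interval stays at the last step once it is reached, and that any detected change sends the page back to daily visits. Real schedules are not public.

```python
# Revisit intervals in days, taken from the steps described above.
INTERVALS_DAYS = [1, 3, 10]


def next_step(step, changed):
    """Move along the ladder: back to the start on a change, one step up otherwise."""
    if changed:
        return 0  # assumption: a changed page is revisited the next day again
    return min(step + 1, len(INTERVALS_DAYS) - 1)  # assumption: stay on the last step


def simulate(change_flags):
    """Return the sequence of waiting intervals for a series of visits.

    change_flags[i] tells whether the page had changed at the i-th revisit.
    """
    step = 0
    intervals = [INTERVALS_DAYS[step]]  # first revisit: the day after indexing
    for changed in change_flags:
        step = next_step(step, changed)
        intervals.append(INTERVALS_DAYS[step])
    return intervals


# Unchanged twice, then changed, then unchanged again.
schedule = simulate([False, False, True, False])
```

A page that never changes quickly drifts to rare visits, while a page that changes often keeps pulling the robot back, so the visit frequency follows the update frequency.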

That is how clever search engines are: they create an individual visiting schedule for different pages of different resources. You can, however, get search engines to re-index a page at your request even if nothing has changed on it, but that is for another article.

We will continue studying the principles of search in the next article, where we will look at the problems search engines face and examine the nuances. And many other things, of course, that help with promotion in one way or another.

Good luck! See you soon on the blog.


Search engines (SE) are now an important part of the Internet. Today they are large, complex mechanisms that serve not only as tools for finding any needed information but also as attractive arenas for business.

Most users have never thought about how they work, how user queries are processed, or how these systems are built and function. This material will help people involved in site optimization and promotion understand the structure and basic functions of search engines.

Functions and definition of a search engine

A search engine is a hardware and software complex designed to perform search on the Internet; it responds to a user's request, entered as a text phrase (more precisely, a search query), with a list of links to information sources ranked by relevance. The largest and most widespread search engines are Google, Bing, Yahoo and Baidu. In RuNet these are Yandex, Mail.Ru and Rambler.

Let's take a closer look at what a query actually is, using Yandex as an example.

A query should be formulated by the user to match the subject of the search as simply and briefly as possible. Say we want to find information in this search engine on “how to choose a car for yourself”. To do this, we open the main page and enter the query “how to choose a car”. After that, our part comes down to following the links offered to information sources on the web.

But even doing this, we may not get the information we need. If we get such a negative result, we either need to reformulate the query, or the search database really has no useful information for this kind of query (this is quite possible with “narrow” query parameters such as “how to choose a car in Anadyr”).

The most important task of every search engine is to deliver to people exactly the kind of information they need. And it is practically impossible to teach users to make the “right” kind of queries, that is, phrases that match the way search engines work.

That is why search engine developers create algorithms and operating principles that allow users to find exactly the information they are interested in. This means the system must “think” the way a person thinks when searching for information on the Internet.

When a user submits a query to a search engine, they want to find what they need as simply and quickly as possible. Having received the result, the user evaluates the system by several criteria. Did they find the information they needed? If not, how many times did they have to rephrase the query to find it? How much relevant information did they get? How quickly did the search engine process the query? How convenient were the search results to use? Was the desired result first, or in 30th place? How much “junk” (unneeded information) came along with the useful results? Will up-to-date information still be found through this search engine in a week or a month?

To get good answers to such questions, search developers constantly refine their ranking principles and algorithms, add new capabilities and functions, and try in every way to make the systems work better.

Main characteristics of search engines

Let's outline the main characteristics of search:

Completeness.

Completeness is one of the most important characteristics of search: it is the ratio of the number of documents found for a query to the total number of documents on the Internet that match the query. For example, if the web has 100 pages containing the phrase “how to choose a car” and the search returns only 60 of them, then search completeness is 0.6. Clearly, the more complete the search, the more likely the user is to find the document they need, provided, of course, that it exists at all.

Precision.

Another main characteristic of a search engine is precision. It measures how well the found pages match the user's query. For example, if 100 documents are found for the key phrase “how to choose a car”, half of them contain the exact phrase and the rest merely contain its words (“how to choose a car radio and install it in a car”), then search precision is 50/100 = 0.5.

The more precise the search, the sooner the user finds the needed information, the less “junk” appears among the results, and the fewer found documents fail to match the meaning of the query.
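The two ratios from the examples above can be checked with a few lines of Python. This is just the arithmetic from the text; the function names and the use of integer ids to stand for documents are my own assumptions.

```python
def completeness(returned, relevant):
    """Share of all relevant documents that the search actually returned."""
    return len(returned & relevant) / len(relevant)


def precision(returned, relevant):
    """Share of the returned documents that are actually relevant."""
    return len(returned & relevant) / len(returned)


# Completeness example: 100 matching pages exist, the search returned 60 of them.
all_matching = set(range(100))
found = set(range(60))
completeness_value = completeness(found, all_matching)

# Precision example: 100 documents returned, only 50 contain the exact phrase.
returned_docs = set(range(100))
exact_phrase_docs = set(range(50))
precision_value = precision(returned_docs, exact_phrase_docs)
```

With these inputs the functions reproduce the figures from the text: 0.6 for completeness and 0.5 for precision.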

Freshness.

This important characteristic describes the time that passes from the moment information is published on the Internet until it is entered into the search engine's index.

For example, the day after news of a new iPad release appeared, many users turned to search with the corresponding queries. In most cases information about the news is already available in search, even though very little time has passed since it appeared. This is possible because large search engines have a “fast database” that is updated several times a day.

Search speed.

This characteristic is closely tied to so-called “load resistance”. A huge number of people use search every second, and such load requires a significant reduction in the time needed to process a single query. Here the interests of the search engine and the user fully coincide: the visitor wants results as quickly as possible, and the search engine must handle the query as quickly as possible so as not to delay the processing of subsequent queries.

Presentation.

Clear presentation of results is an important element of search convenience. For many queries, the search engine finds thousands and in some cases millions of documents. Because key phrases are often vague or imprecise, even the very first results do not always contain only the needed information.

This means a person often has to do their own searching among the results provided. Various components of the results page help the user navigate the search results.

History of the development of search engines

When the Internet had just begun to develop, the number of regular users was small, and the amount of information available was relatively small too. Access to the network was mostly limited to people in scientific and research fields. At that time, finding information on the web was not as pressing a task as it is now.

One of the first ways of organizing broad access to information resources was the creation of site directories, in which links were grouped by topic. Yahoo.com, which opened in the spring of 1994, was the first such project. Later, as the number of sites in the Yahoo directory grew considerably, an option to search the directory was added. It was not yet a full search engine, since the search covered only the sites included in the directory rather than all resources on the Internet. Link directories were widely used in the past, but today they have almost completely lost their popularity.

Even today's very large directories contain information about only a small fraction of the sites on the Internet. The most popular and largest directory in the world contains information about five million sites, while the Google database contains information about more than 25 billion sites.

The world's first full-fledged search engine was WebCrawler, which appeared in 1994.

AltaVista and Lycos appeared the following year. The former was the leader in information search for a very long time.

In 1997, Sergey Brin and Larry Page created the Google search engine as a research project at Stanford University. Today Google is the most popular search engine in the world.

In September 1997 the Yandex search engine was (officially) announced; it went on to become the most popular search engine in RuNet.

According to data for spring 2015, the worldwide shares of search engines were distributed as follows:

Google – 69.24%;
Bing – 12.26%;
Yahoo! - 9.19%;
Baidu – 6.48%;
AOL – 1.11%;
Ask - 0.23%;
Excite - 0.00%

According to data for December 2016, the shares of search engines in RuNet:

Yandex - 48.40%
Google – 45.10%
Search.Mail.ru - 5.70%
Rambler – 0.40%
Bing – 0.30%
Yahoo - 0.10%

How a search engine works

Russia's main search engine is Yandex, followed by Google and then Mail.Ru search. All large systems have their own structure, which differs from the others. However, one can still identify basic elements common to all search engines.

Indexing module.

This component consists of three robot programs:

Spider (English for “spider”) is a program designed to download web pages. The spider downloads a given page and at the same time extracts all the links from it. The html code is downloaded from practically every page. The robots use the HTTP protocol for this.

The spider works as follows. The robot sends the server a "get/path/document" request and other HTTP commands. In response, the robot receives a text stream containing service information and, of course, the document itself. The following is recorded:

URL of the downloaded page;
date when the page was downloaded;
http header of the server response;
html code, "body" of the page.
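As a rough illustration of what the spider does with such a response, here is a Python sketch that splits a raw HTTP response into the service part and the document and pulls the links out of the html. It uses only the standard library; the canned response text `RAW_RESPONSE`, the `split_response` helper and the `LinkExtractor` class are assumptions of mine, and nothing is actually sent over the network.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def split_response(raw):
    """Split a raw HTTP response into status line, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return status_line, headers, body


# A canned server answer, invented for the example.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body><a href=\"/page1\">One</a> <a href='/page2'>Two</a></body></html>"
)

status, headers, body = split_response(RAW_RESPONSE)
extractor = LinkExtractor()
extractor.feed(body)
links = extractor.links
```

The status line and headers correspond to the service information, the body is the document itself, and the extracted links are what gets handed on for further crawling.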

Crawler (a “traveling” spider). This program automatically visits all the links found on a page and also extracts them. Its task is to determine where the spider should go next, based on those links or on a given list of addresses.

Indexer (the indexing robot) is a program that analyzes the pages the spiders have downloaded.

The indexer breaks a page into its component elements and analyzes them using its own morphological and lexical algorithms.

Various parts of the page are analyzed: headings, text, links, stylistic and structural features, html tags, and so on.

Thus the indexing module makes it possible to crawl a given set of resources, download pages, extract links to new pages from the downloaded documents, and perform a detailed analysis of them.

Database

The database (or search engine index) is a data storage complex, an array of information in which the specially transformed parameters of every document downloaded and processed by the indexing module are stored.

Search server

This is the most important element of the whole system, because the speed and, above all, the quality of search depend directly on the algorithms underlying its operation.

The search server works as follows:

The query received from the user undergoes morphological analysis. The information environment of every document in the database is generated (it will later be displayed as a snippet, i.e. a text field of information corresponding to the query).
The resulting data is passed as input parameters to a special ranking module. All documents are processed, and as a result each document is assigned its own rating characterizing its relevance to the user's query and to other components.
Depending on conditions set by the user, this rating may be adjusted by additional ones.
Then the snippet itself is generated: for each found document, the title, the abstract that best matches the query, and the link to the document are extracted from the corresponding table, with the found word forms and words highlighted.
The resulting search results are delivered to the user as a page of links, the search engine results page (SERP).

All these elements are closely interconnected and work together, forming a well-defined but quite complex mechanism, which requires a huge investment of resources.

Many users need the Internet in order to get answers to the queries they enter.

If there were no search engines, users would have to look for the needed sites on their own, memorize them, and write them down. In many cases finding something suitable “by hand” would be very difficult, and often simply impossible.

Search engines do all this routine work of finding, storing and sorting information on websites for us.

Let's start with the well-known search engines of RuNet.

Search systems on the Russian Internet

1) Let's start with the domestic search engine. Yandex operates not only in Russia but also in Belarus, Kazakhstan, Ukraine, and Turkey. There is also an English-language Yandex.

2) The Google search engine came to us from America and has a Russian localization.

3) The popular search engine Mail.ru, which also represents the social networks VKontakte and Odnoklassniki, as well as My World and other Mail.ru projects.

4) Intelligent search engine

Nigma (Nigma) http://www.nigma.ru/

Since 19 June 2017 the "intelligent" Nigma has not been working. It stopped being of financial interest to its creators, who switched to a different search engine called CocCoc.

5) The Rostelecom company created the Sputnik search engine.

There is also a Sputnik search engine made specially for children, which I have written about.

6) Rambler was one of the first popular search engines.

There are other search engines in the world:

Bing,
Yahoo!,
Baidu,
Ecosia,

Let's try to understand how a search engine works: how sites are indexed, how the results of indexing are analyzed, and how search results are formed. The operating principles of search engines are roughly the same: they search for information on the Internet, then store and sort it in order to return relevant answers to users' queries. The algorithms that search engines use, however, can differ greatly. These algorithms are kept secret, and disclosing them is prohibited.

If you enter the same query into the search bars of different search engines, you can get different answers. The reason is that every search engine uses its own algorithms.

The goal of search engines

First of all, you need to understand that search engines are commercial organizations. Their goal is to make a profit. Profit can come from contextual advertising, from other types of advertising, and from promoting particular sites to the top rows of the results. There are many ways.

Profit depends on the size of the audience, that is, on how many people use the search engine. The larger the audience, the more people will see the advertising and, accordingly, the more that advertising will cost. Search engines can grow their audience through their own advertising, and also by attracting users through better services, better algorithms, and more convenient search.

The most important and most difficult thing here is developing a fully working search algorithm that produces relevant results for most user queries.

The work of the search engine and webmasters

Each search engine has its own algorithm, which must take a large number of different factors into account when analyzing information and compiling results in response to a user's query:

the age of a site,
the characteristics of the site's domain,
the quality of the site's content,
the navigation and structure of the site,
usability (convenience for users),
behavioral factors (the search engine can tell whether the user found an answer on the site, or returned to the search engine and looked for an answer to the same query again),
etc.

All this is needed so that the results for a user's query are as relevant as possible and satisfy the user. For this reason search engine algorithms are constantly changing and being refined. As they say, there is no limit to perfection.

On the other hand, webmasters and optimizers constantly invent new ways of promoting their sites, and these are not always honest. The task of the developers of a search algorithm is to change it so that the "bad" sites of dishonest optimizers do not end up in the TOP.

How does the search system work?

Now let's talk about how a search engine actually works. The process consists of at least three stages:

scanning,
indexing,
ranking.

The number of sites on the Internet is simply astronomical. And each site is information, content created for readers (living people).

Scanning

This is the search engine roaming the Internet to collect new information, analyze links, and find new content that can be shown to a user in response to a query. For scanning, search engines have special robots, which are called search robots or spiders.

Search robots are programs that automatically visit websites and collect information from them. Scanning can be primary (the robot visits a new site for the first time). After the initial collection of information from the site and its entry into the search engine's database, the robot begins to revisit its pages with some regularity. If any changes have occurred (new content has been added, old content has been removed), the search engine will record all of them.

The main task of the search spider is to find new information and hand it to the search engine for the next stage of processing, that is, indexing.
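The way a spider moves from page to page by following links can be sketched roughly as below. This is only an illustration: the names `LinkParser`, `crawl`, and `FAKE_WEB` are made up for the example, and a small in-memory dictionary stands in for the real network, so nothing is actually downloaded.

```python
from collections import deque
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collects the href values of all <a> tags on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, limit=100):
    """Breadth-first walk over pages; `fetch` returns HTML or None for a dead link."""
    seen = {start_url}
    queue = deque([start_url])
    collected = {}
    while queue and len(collected) < limit:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page is unavailable, nothing to collect
        collected[url] = html  # hand the page over for indexing
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # visit each address only once
                seen.add(link)
                queue.append(link)
    return collected


# A hypothetical four-page "web" used instead of real HTTP requests.
FAKE_WEB = {
    "/": '<a href="/a">A</a> <a href="/b">B</a>',
    "/a": '<a href="/">home</a> <a href="/c">C</a>',
    "/b": "no links here",
    "/c": '<a href="/missing">broken</a>',
}
pages = crawl("/", FAKE_WEB.get)
```

A real robot would also respect robots.txt, limit its request rate, and schedule repeat visits; all of that is left out here.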

Indexing

A search engine can look for information only on those sites that are already in its database (indexed by it). While scanning is the process of finding and collecting information on a site, indexing is the process of entering this information into the search engine's database. At this stage the search engine automatically decides whether to enter a given piece of information into its database and, if so, into which section. For example, Google indexes almost all the information its robots find on the Internet, while Yandex is pickier and does not index everything.

For new sites the indexing stage can take a long time, so visitors from search engines may be slow to reach them. New information that appears on old, well-promoted sites, by contrast, can be indexed almost instantly and get into the "index", that is, into the search engine's database, almost at once.

Ranking

Ranking is the ordering of information that was previously indexed and entered into a search engine's database by rank: it decides which information the search engine shows its users first and which is placed lower. Ranking can be regarded as the stage at which the search engine serves its client, the user.

On the search engine's servers the collected information is processed and results are generated for a huge range of queries. This is where the search algorithms come into play. All sites entered into the database are classified by topic, and topics are divided into groups of queries. For each group of queries a preliminary set of results can be compiled, which is then adjusted.
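Since the real ranking formulas are kept secret, the sketch below uses the simplest textbook substitute, a TF-IDF score, only to show what "assigning a rank to every document" can look like. The function names `tokenize` and `rank` and the sample documents are assumptions for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"\w+", text.lower())


def rank(query, documents):
    """Return ids of matching documents sorted by a simple TF-IDF score, best first."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in documents.items()}
    n_docs = len(documents)
    scores = {}
    for doc_id, doc_counts in counts.items():
        score = 0.0
        for term in tokenize(query):
            # Document frequency: in how many documents the term occurs.
            df = sum(1 for c in counts.values() if term in c)
            if df == 0:
                continue
            # Smoothed inverse document frequency: rare terms weigh more.
            idf = math.log((1 + n_docs) / (1 + df)) + 1.0
            score += doc_counts[term] * idf
        scores[doc_id] = score
    matching = [doc_id for doc_id in scores if scores[doc_id] > 0]
    return sorted(matching, key=lambda doc_id: scores[doc_id], reverse=True)


docs = {
    "d1": "search engine ranking",
    "d2": "ranking ranking of pages by a search engine",
    "d3": "cooking recipes",
}
```

Real search engines combine text relevance with many other signals, such as the site age, structure, and behavioral factors listed earlier.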

Hello, fellow readers of the blog site. In the early days of the Internet, its few users could get by with their own bookmarks. However, as you remember, the number of sites grew in geometric progression, and it soon became harder to navigate all this diversity.

Then catalogs appeared (Yahoo, Dmoz and others), in which their authors added various sites and sorted them into categories. This immediately made life easier for the users of the global network, who were not yet very numerous at the time. Many of these catalogs are still alive.

After a while their databases became so large that the developers first thought about creating a search within them, and then about creating an automated system for indexing all the content of the Internet in order to make it accessible to everyone who wants it.

The main search engines of the Russian-language segment of the Internet

As you can imagine, this idea was implemented with resounding success, but things turned out well only for a handful of companies that managed to survive on the Internet. Almost all the search engines that appeared in the first wave have by now either disappeared, are barely alive, or were bought by more successful competitors.

A search engine is a very complex and, importantly, very resource-intensive mechanism (meaning not only material resources but human ones too). Behind the outwardly simple home page of a search engine, or its ascetic counterpart at Google, stand thousands of employees, hundreds of thousands of servers, and billions of dollars of investment, all of which are needed for this colossus to keep working and stay competitive.

Entering this market now and starting from scratch is more of a utopia than a real business project. For example, one of the world's richest corporations, Microsoft, spent decades trying to gain a foothold in the search market, and only now is its search engine Bing slowly beginning to live up to expectations. Before that there was a whole series of failures and setbacks.

What can we say, then, about those who try to enter this market without serious financial backing. For example, our domestic search engine Nigma has a lot of useful and innovative features in its arsenal, yet its traffic is thousands of times lower than that of the leaders of the Russian market. For comparison, look at the size of Yandex's audience.

In this regard, we can consider that the list of the main (best and most successful) search engines of the RuNet and of the whole Internet has already been formed, and the only intrigue is who will end up swallowing whom, or how their percentage shares will be distributed if they all survive and stay afloat.

The search engine market in Russia is easy to survey: here you can probably pick out two or three main players and a couple of minor ones. In general, a rather unique situation has developed in the RuNet, one that, as I understand it, has been repeated in only two other countries in the world.

I'm talking about the fact that Google's search engine, which arrived in Russia in 2004, has still not managed to take the lead. In fact, around that time they tried to buy Yandex, but something didn't work out, and now "our Russia", together with the Czech Republic and China, is one of the places where the almighty Google, if not defeated, has at least met serious resistance.

Actually, anyone can see the current state of affairs among the best search engines of the RuNet. All you need to do is paste this URL into the address bar of your browser:

http://www.liveinternet.ru/stat/ru/searches.html?period=month;total=yes

The point is that most site owners use this counter on their sites, and this URL lets you see statistics on visitors arriving from various search engines at all sites in the RU domain zone.

After entering this URL you will see a picture that is not very attractive or presentable, but that reflects the essence well. Pay attention to the top five search engines from which Russian sites receive traffic.

Of course, not all resources with Russian-language content are hosted in this zone. There are also SU and RF, and general zones such as COM or NET are full of Internet projects aimed at the RuNet, but the sample is still quite representative.

This relationship can also be presented more graphically, as, for example, someone on the web did for their presentation.

The essence does not change. There are a couple of leaders and several search engines lagging far behind. By the way, I have already written about many of them. It can be quite interesting to delve into a success story or, conversely, to dig into the reasons for the failure of once-promising search engines.

So, in order of their importance for Russia and the RuNet as a whole, I will list them and give each a brief description:

Searching in Google has become a household word for many inhabitants of the planet. In this search engine I really liked the "translated results" option, where you received answers from all over the world but in your native language; unfortunately, it is no longer available (at least on google.ru).

Lately I have also been less than happy with the quality of their results (Search Engine Result Page). Personally, I always turn first to the "mirror of the RuNet" search engine (I'm simply used to it), and only if I find no sensible answer there do I go to Google.

Their results usually pleased me, but lately they only puzzle me: sometimes such nonsense comes up. It is possible that their struggle to increase income from contextual advertising and the constant reshuffling of results aimed at discrediting SEO promotion could backfire. In any case, in the RuNet this search engine does have a competitor.

I think it is unlikely that anyone goes to Go.mail.ru specifically to search the RuNet. Still, on certain kinds of projects the traffic from this search engine can be considerably higher than ten percent. Owners of such projects should pay closer attention to this system.

However, besides the clear leaders of the search engine market in the Russian-language segment of the Internet, there are several more players whose share is quite low, but the very fact of their existence makes it worth saying a few words about them.

Second-tier search engines of the RuNet

Search engines of the whole Internet

By and large, on the scale of the whole Internet there is only one serious player: Google. It is the undisputed leader, but it still has some competition.

First, there is the same Bing, which, for example, has a very good position in the American market, especially considering that its engine is also used on all Yahoo services (almost a third of the whole US search market).

And second, because users from China make up such a large share of all Internet users, their main search engine, Baidu, wedges itself into the distribution of places on the world Olympus. It was born in 2000, and its audience is now about 80% of the whole national audience in China.

It is hard to say anything definite about Baidu, but there are opinions on the Internet that places in its Top are occupied not only by the sites most relevant to the query but also by those who paid for them (directly to the search engine, not to an SEO office). Of course, this applies primarily to commercial results.

Looking at the statistics, it becomes clear why Google so readily lets its results get worse in exchange for greater income from contextual advertising. In fact, they are not afraid of losing users, because in most cases users have nowhere else to go. This situation is a little sad, but let's see what happens next.

By the way, to make life even harder for optimizers, or perhaps to reassure the users of this search engine, Google has recently started applying encryption when transmitting queries from the user's browser to the search engine. Soon it will no longer be possible to see in visitor counter statistics which queries brought users from Google.

Of course, besides the search engines mentioned in this publication, there are thousands of others: regional, specialized, exotic, and so on. Listing and describing them all within one article would be impossible and, frankly, unnecessary. Let's rather say a few words about how hard it is to create a search engine, and how hard and expensive it is to keep it up to date.

The vast majority of systems work on similar principles and pursue the same goal: to give users an answer to their question. Moreover, the answer must be relevant (matching the question), comprehensive and, no less important, current (as fresh as possible).

Solving this problem is not so simple, especially considering that the search engine has to analyze, on the fly, the content of billions of Internet pages, discard the unsuitable ones, and form from the rest a list (the results) at the top of which are the answers that best fit the user's question.

This extremely complex task is solved by collecting information from these pages in advance with the help of various indexing robots. They collect links from already visited pages and load the information from them into the search engine's database. There are robots that index text (a regular one and a fast one, which lives on news sites and frequently updated resources so that the freshest data is always present in the results).

In addition, there are robots that index images (for later display in image search), favicons, and site mirrors (for their later comparison and possible merging), and robots that check whether Internet pages are working, including pages reported by users or through tools for webmasters.

The indexing process itself and the subsequent updating of the index databases take time. Google does this much faster than its competitors, at least faster than Yandex, which needs a week or two.

Typically, the search engine breaks the text content of an Internet page into individual words and reduces them to their base forms, so that it can later give correct answers to questions asked in different morphological forms. All the extra wrapping, such as HTML tags, spaces, and so on, is removed, and the remaining words are sorted alphabetically, with their position in the document recorded next to them.

This structure is called an inverted index and makes it possible to search not across websites but across structured data located on the search engine's servers.
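A minimal sketch of such an inverted index is shown below. It is a toy: the function name `build_inverted_index` and the two sample pages are invented for the example, tags are stripped with a simple regular expression, and words are only lowercased rather than reduced to their base forms.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to the documents it occurs in and its positions there."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, html in documents.items():
        # Drop HTML tags; a real indexer would use a proper HTML parser.
        text = re.sub(r"<[^>]+>", " ", html)
        # Lowercasing stands in for real morphological normalization.
        for position, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word][doc_id].append(position)
    return index


docs = {
    "page1": "<p>Search engines index pages</p>",
    "page2": "<b>Pages</b> are ranked by search engines",
}
index = build_inverted_index(docs)

# Words in alphabetical order, each with its documents and positions.
for word in sorted(index):
    print(word, dict(index[word]))
```

Looking up a word is now a single dictionary access instead of a scan over every page, which is the whole point of building the index in advance.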

The number of such servers at Yandex (which searches mainly Russian-language sites, plus some Ukrainian and Turkish ones) is in the tens or even hundreds of thousands, and at Google (which searches in hundreds of languages) it is in the millions.

Many servers have copies, which both improve the safety of the stored documents and help increase the speed of query processing (by distributing the load). You can estimate for yourself what it costs to maintain all of this.

A user's query is sent by the load balancer to the server segment that is least loaded at that moment. Then the region from which the user sent the query is determined, and a morphological analysis of the query is carried out. If a similar query was recently entered in the search bar, the user is given data from the cache so as not to load the servers again.

If the query has not yet been cached, it is passed to the area where the search engine's index database is located. The response is a list of all Internet pages that have at least some relation to the query. Not only direct matches are taken into account, but also other morphological forms, synonyms, and so on.
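The "check the cache first, go to the index only on a miss" logic can be sketched as follows. The class name `SearchFrontend`, the `index_lookup` callable, and the `index_hits` counter are hypothetical names used only for this illustration; real systems also expire and evict cached entries, which is omitted here.

```python
class SearchFrontend:
    """Answers repeated queries from a cache and calls the index only on a miss."""

    def __init__(self, index_lookup):
        self._index_lookup = index_lookup  # expensive call into the index database
        self._cache = {}
        self.index_hits = 0  # how many times the index was actually queried

    def search(self, query):
        # Normalize case and whitespace so that similar queries share one entry.
        key = " ".join(query.lower().split())
        if key in self._cache:
            return self._cache[key]
        self.index_hits += 1
        result = self._index_lookup(key)
        self._cache[key] = result
        return result


# A stand-in for the real index: it just echoes the normalized query.
frontend = SearchFrontend(lambda q: ["result for " + q])
first = frontend.search("  Search   Engine ")
second = frontend.search("search engine")
```

Here the second call is served from the cache, so the index is queried only once for two requests.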

These pages need to be ranked, and at this stage the algorithm (artificial intelligence) comes into play. In essence, the user's query is multiplied into all possible variants of its interpretation, and answers to many queries are searched for at once (using query language operators, some of which are also available to ordinary users).

As a rule, the results contain one page from each site (sometimes more). Ranking algorithms today are very complex and take a large number of factors into account. In addition, to adjust them, human assessors manually evaluate reference sites, which makes it possible to correct the operation of the algorithm as a whole.

In general, it's clear that the matter is murky. One could talk about the search process for a long time, but it is already clear that satisfying a search engine user is not easy. And there will always be those who are not satisfied with the answer, like you and me, dear readers.

Good luck to you! See you soon on the blog site
