What Is a Search Engine?
Search Engine Definition
A search engine is a piece of software with natural language capabilities that applies an algorithm to find, gather, filter, and rank data about web pages. The information collected includes the URLs, the inbound and outbound links of each page, the keywords relevant to the page as a whole, and the code that makes up the page itself.
Brief History of Search Engines
In 1990, Alan Emtage, a student at McGill University, created the first search tool, Archie, to solve the problem of indexing files scattered across FTP servers. The following year, Mark McCahill, a student at the University of Minnesota, developed Gopher, a protocol for indexing and browsing plain-text documents that became one of the most popular ways to publish information on the early Internet.
Making of Search Engines
In the wake of Gopher came Wandex, widely regarded as the first search engine. Matthew Gray developed it in June 1993 to perform two different tasks: indexing web pages and searching that index. It was the first program of its kind to crawl the web, and on that foundation the development of commercial search engines took off.
1. Excite – 1993
2. Yahoo – 1994
3. WebCrawler (meta search engine) – 1994
4. Lycos – 1994
5. Infoseek – 1995, acquired by Disney in 1999
6. AltaVista – 1995, acquired by Yahoo in 2003
7. Inktomi – 1996, acquired by Yahoo in 2003
8. Ask Jeeves – 1997
9. Google – 1997
10. MSN Search – 1998
How a Search Engine Works
A search engine's task begins when a user submits one or more search terms as a query, requesting particular information. This request triggers the algorithm, which digs through billions of records in the information-retrieval system and returns the most relevant web documents to the user.
Basic Steps on How It Works
A spider from a search engine (Yahoo, Bing, Google) visits a web page and painstakingly checks every URL on the page, collecting the keywords and phrases found there.
The collected data is stored in the search engine's database.
The indexed data is retrieved from the database whenever a user submits a query containing specific search terms or keywords.
The retrieved data is processed by the algorithm, which weighs valuable ranking factors and shows the result in the SERP.
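The steps above can be sketched as a toy pipeline. This is a minimal illustration, not any engine's actual implementation: the pages, URLs, and the simple intersection "ranking" are all made up for the example, and the database is just an in-memory inverted index.

```python
from collections import defaultdict

# Toy corpus standing in for crawled pages (URL -> page text).
pages = {
    "https://example.com/a": "search engines rank web pages",
    "https://example.com/b": "spiders crawl web pages and collect keywords",
    "https://example.com/c": "keywords are stored in a database",
}

# Steps 1-2: "crawl" each page and store its keywords in an inverted
# index (keyword -> set of URLs), the database that powers retrieval.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.lower().split():
        index[word].add(url)

# Steps 3-4: answer a query by intersecting the URL sets of each term,
# then ordering the matches (here, simply sorting) for the SERP.
def search(query):
    terms = query.lower().split()
    results = set(pages)
    for term in terms:
        results &= index.get(term, set())
    return sorted(results)

print(search("web pages"))  # URLs of pages containing both terms
```

Real engines replace the final sort with the ranking factors discussed below, but the crawl, store, retrieve, and rank stages are the same.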
Search Engine Terms
Database: a large storage repository that stores all the URLs and page data collected from the web, thanks to the spider whose job is gathering information.
Crawler: the crawler looks at every URL it can reach and collects the keywords and phrases on each page, which are stored in the database to power the search engine's information-retrieval (IR) system.
SERP: the search engine results page. Whenever a query is sent to the search engine, it returns a collection of information relevant to the searched terms; the page that displays these results is called the SERP. It is the component of a search engine everyone wants to be listed on.
How a Page Is Ranked
Collected data is processed, arranged, and ranked; Google does this with its PageRank method, which uses sophisticated ranking factors to vote for the right web page.
PageRank is a method the Google search engine employs to rank web pages, combined with other factors such as link structure and the behavior of links from page to page. The vast link structure of the web is crucial to this program.
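The core idea can be illustrated with a small power-iteration sketch of the published PageRank formula. This is a minimal illustration, not Google's production system; the three-page graph and the damping value are assumptions chosen for the example.

```python
# Minimal PageRank sketch: each page's score is divided among the pages
# it links to, plus a damping term, iterated until the scores settle.
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_ranks = {page: (1 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its score evenly
                for p in pages:
                    new_ranks[p] += damping * ranks[page] / n
            else:
                for target in outlinks:
                    new_ranks[target] += damping * ranks[page] / len(outlinks)
        ranks = new_ranks
    return ranks

# A tiny link graph: A and C both link to B, so B ends up ranked highest.
graph = {"A": ["B"], "B": ["C"], "C": ["A", "B"]}
ranks = pagerank(graph)
```

Notice that a page's rank depends not just on how many links point to it but on the rank of the pages doing the linking, which is why the article calls the link structure of the web crucial to this program.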
Ranking by Quality Score
Quality score is an estimate of how relevant your web documents are to the users visiting your website. Search engines take relevancy very seriously, and the following factors are the yardsticks Google, Bing, Yahoo, and other search engines use to measure the quality of your web pages:
- Domain name relevance and URLs
- Page content
- Link structure
- Usability and accessibility
- Meta tags
- Page structure
- Indexing and crawlability
- Domain age
- Content quality
- Content relevance
- Content length
- Structured schema markup
and much more.
The Search Engine Algorithms
A search algorithm is a problem-solving method that weighs a problem in the different ways possible and returns the solution to that specific problem.
The puzzle here is the word or phrase being searched for, and the procedure is combing through the database (repository) of cataloged keywords and the URLs with which those words are associated.
The results returned by a search algorithm are based on the perceived quality of the page in question. Search algorithms can be grouped into three kinds: on-page, whole-site, and off-site. Each considers different elements of a web page for ranking.
On Page Algorithms
The on-page algorithm considers keyword relevancy and measures on-page factors such as how keywords are used in the content and how the other words on the page relate to the keyword.
Characteristics of On-Page Algorithm
- Measures the on-page factors that contain the relevant elements leading the user to the page.
- Compares and weighs the relationship between the queried search terms and the content found on the page.
- Examines the topic of the page as it relates to the keyword.
- Monitors and compares the proximity of related words or key phrases.
- Weighs how meta tags are implemented on the web page against the other elements of on-site optimization.
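The proximity check in the list above can be illustrated with a small scoring function. This is a simplified stand-in, not any engine's actual formula: it finds the smallest window of words containing every query term, so documents where the terms sit close together score higher.

```python
# Toy proximity score: find the smallest window of words in the document
# that contains every query term; smaller windows score higher.
# A simplified stand-in, not any real engine's formula.
def proximity_score(document, query):
    words = document.lower().split()
    terms = set(query.lower().split())
    positions = [i for i, w in enumerate(words) if w in terms]
    best = None
    # Slide over term occurrences, looking for a span covering all terms.
    for i, start in enumerate(positions):
        seen = set()
        for j in range(i, len(positions)):
            seen.add(words[positions[j]])
            if seen == terms:
                span = positions[j] - start + 1
                if best is None or span < best:
                    best = span
                break
    if best is None:
        return 0.0            # not all terms are present
    return len(terms) / best  # 1.0 when the terms are adjacent

close = proximity_score("cheap flights to rome today", "cheap flights")
far = proximity_score("cheap hotels and late flights", "cheap flights")
```

Here `close` beats `far` because the query terms appear side by side in the first document but four words apart in the second, which is the behavior the proximity factor rewards.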
Meta Tags Effect
Meta tags are designed specifically to satisfy crawlers, letting a crawler gauge the primary intent of your website alongside the other elements of on-site optimization.
The placement and prominence of these elements within the web document inform the algorithmic results and in turn contribute to the quality score of any web page.
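To make this concrete, here is a minimal sketch of how a crawler might read meta tags from a page, using only Python's standard-library HTML parser. The sample HTML and its tag values are invented for the example; production crawlers are far more robust.

```python
from html.parser import HTMLParser

# Minimal sketch of a crawler reading meta tags from a fetched page.
class MetaTagReader(HTMLParser):
    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Collect <meta name="..." content="..."> pairs as they appear.
        if tag == "meta":
            attrs = dict(attrs)
            name = attrs.get("name")
            if name:
                self.meta[name] = attrs.get("content", "")

page = """<html><head>
<meta name="description" content="A guide to search engines">
<meta name="keywords" content="search, crawler, ranking">
</head><body>...</body></html>"""

reader = MetaTagReader()
reader.feed(page)
```

After `feed()`, `reader.meta` holds the description and keywords the page declares about itself, which an engine can weigh against the page's actual content.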
Whole-Site Algorithms
The whole-site algorithm checks the relationship of pages on a site, how they are connected, and the relevance of those connections.
Characteristics of Whole-site Algorithm
- Primarily takes in the entire website architecture, the page structure, and how anchor text is used.
- Checks how the pages on your site are functionally linked together.
- Determines the relationship of pages on the website and whether the visited page delivers the right result to the visitor.
- Considers user confidence and whether the visitor's needs are met.
- Checks the quality of incoming links (backlinks) to your page.
Search Engine Ranking
It's helpful to understand how search engines work and what factors each search engine considers when ranking your website; however, the ranking criteria are not fully understood even by many SEO evangelists. So a quick glimpse through the following information wouldn't hurt to acquaint you with the search criteria of interest.
The Google search engine combines the searched keyword with link popularity and usage to rank the relevancy of web pages. Google's link-ranking capability is considered accurate by many.
New websites may not be listed quickly and can be delayed for a long time because of the link quality Google weighs, which happens to be the major factor it considers during ranking.
The Yahoo search engine combines AlltheWeb, AltaVista, and Overture to search its directory of links, ranking pages with the blend of technologies acquired from different sources.
Yahoo's link-ranking capability is not as accurate as Google's, but it's a good start for a newbie site.
The MSN search engine relies strongly on page content; a properly tagged website containing a good ratio of keywords is ranked better and listed faster by MSN.
Pros: it's worth considering if you are having difficulty ranking on Yahoo and Google, as there is a possibility of being listed sooner.
Although search engines follow different patterns in executing their ranking strategies, some factors never change. These include link structure, term frequency, keyword proximity within the web document, and the amount of click-through each website receives.