Search engines create temporary "databases" of internet sites: think of each one as a "snapshot" of the web. Each search engine uses a different method to determine which sites to list when you do a search--that's why the results vary so greatly. In general, these are some of the factors that are taken into account:
When you search for a particular keyword, like weather, most search engines will look for that keyword in a site's URL. However, even if a site is at www.weather.com, that doesn't mean it will automatically be first on the list. Variations in search algorithms can make seemingly unrelated links float to the top and obvious links disappear. For example, we searched for Farmingdale University using WebCrawler and AltaVista, and none of our pages were the first ones listed. Our pages were listed in WebCrawler along with some unrelated pages that had the words "farmingdale" or "university" somewhere in the text.
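As a rough illustration of the URL factor described above, the following Python sketch checks whether a keyword appears as a separate word in a site's address. The function name `url_contains_keyword` is hypothetical, and real search engines weigh this signal alongside many others, which is why a matching URL alone does not guarantee the top spot.

```python
import re
from urllib.parse import urlparse


def url_contains_keyword(url: str, keyword: str) -> bool:
    """Return True if `keyword` is one of the words in the URL's host or path.

    Hypothetical helper for illustration only; it is not how any
    particular search engine is known to work.
    """
    # urlparse only fills in the host when a scheme is present,
    # so add one for bare addresses such as "www.weather.com".
    if "://" not in url:
        url = "http://" + url
    parsed = urlparse(url)
    # Break the host and path into lowercase words at dots, slashes, hyphens, etc.
    words = re.split(r"[^a-z0-9]+", (parsed.netloc + parsed.path).lower())
    return keyword.lower() in words


if __name__ == "__main__":
    print(url_contains_keyword("www.weather.com", "weather"))
    print(url_contains_keyword("http://www.example.com/news", "weather"))
```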
Another factor considered is the title of the page. This is the phrase that appears at the top of your browser, above the navigational buttons. In HTML, the title is defined using the <title></title> tag. In other words, whatever words are placed between <title> and </title> in an HTML document will be indexed by most search sites. Theoretically, if a site has the word movie in its title, it's more likely to be listed when you do a search for movie sites.
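To make the title factor concrete, here is a minimal Python sketch of how an indexing program might pull the words out of a page's title, using the standard library's `html.parser` module. The class name `TitleParser`, the function `extract_title`, and the sample page are all made up for this example; actual search engines do not publish their code.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found between <title> and </title>."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only keep text that appears while we are inside the title.
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


# A made-up page used only for illustration.
SAMPLE_PAGE = (
    "<html><head><title>Movie Reviews and Showtimes</title></head>"
    "<body><p>Welcome to our site.</p></body></html>"
)

if __name__ == "__main__":
    title = extract_title(SAMPLE_PAGE)
    print(title)
    print("movie" in title.lower().split())
```

A search for movie would match this page because the word appears among the title words, even though it never appears in the body text.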
Some search engines--like AltaVista, Infoseek, or Excite--send out spiders that actually retrieve the full text of the pages they visit. So when you do a search for a word, such as College, they look through the pages in their database and return a list of the ones that include the word College. One factor that determines how high a particular page appears on the list of results is how many times the word you searched for appears. So a document that mentions College seven times will probably be listed ahead of a page that mentions College once. If you search for a phrase, like New York, documents that include the two words close together should appear higher on the results list than documents that include the word new and then the word york five sentences later.
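The two full-text factors described above, how often a word appears and how close together the words of a phrase are, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function names are hypothetical, words are split at anything that is not a letter or digit, and real engines combine these counts with other signals in ways they do not disclose.

```python
import re


def tokenize(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_frequency(text, word):
    """Count how many times `word` appears in `text`, ignoring case."""
    return tokenize(text).count(word.lower())


def min_distance(text, first, second):
    """Smallest gap, in words, from an occurrence of `first` to a later `second`.

    Returns 1 when the two words are adjacent, and None when `second`
    never follows `first`.
    """
    first, second = first.lower(), second.lower()
    best = None
    last_first = None  # position of the most recent `first`
    for position, token in enumerate(tokenize(text)):
        if token == first:
            last_first = position
        elif token == second and last_first is not None:
            gap = position - last_first
            if best is None or gap < best:
                best = gap
    return best


def rank_by_frequency(pages, word):
    """Order page names so that pages mentioning `word` most often come first."""
    return sorted(pages, key=lambda name: term_frequency(pages[name], word), reverse=True)


if __name__ == "__main__":
    # Made-up page texts for illustration.
    pages = {
        "once": "Our college is small.",
        "often": "College news: the college opened a new college hall.",
    }
    print(rank_by_frequency(pages, "College"))
    print(min_distance("I love New York in the fall", "new", "york"))
    print(min_distance("A new idea. Later we drove to York", "new", "york"))
```

In this sketch a smaller distance would count in a page's favor, so a page containing "New York" as adjacent words would outrank one where the two words are several words apart.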
Search engines use different methods to generate the site descriptions that appear in the list of results when you do a search. Some engines just use the first 200 words of the document--which is why you sometimes come across incomprehensible descriptions. Others, such as Excite and Lycos, use proprietary technology to generate their summaries. Still others, like Infoseek and AltaVista, support an HTML tag known as the META description tag. This tag allows Webmasters to write their own site description. It's included in the HTML document and is read by robots or spiders, but it's not visible on the site itself unless one views the site's source code. Besides using these site descriptions in their list of results, search engines will look in the description tag for keywords when you conduct a search. Engines that don't support the META description tag will look for keywords in whatever description they do use.
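The choice between a Webmaster-written description and the opening words of a page can be sketched as follows. This Python example is an assumption-laden illustration, not any engine's actual method: the names `DescriptionParser` and `describe` are hypothetical, the 200-word limit comes from the paragraph above, and the sketch simply treats all text outside scripts and style sheets as the document's words.

```python
from html.parser import HTMLParser


class DescriptionParser(HTMLParser):
    """Records the META description, if any, plus the page's text."""

    def __init__(self):
        super().__init__()
        self.meta_description = None
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.meta_description = attributes.get("content") or ""
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.text_parts.append(data)


def describe(html, max_words=200):
    """Return the META description, or else the first `max_words` words of text."""
    parser = DescriptionParser()
    parser.feed(html)
    parser.close()
    if parser.meta_description:
        return parser.meta_description.strip()
    words = " ".join(parser.text_parts).split()
    return " ".join(words[:max_words])


# Made-up pages used only for illustration.
PAGE_WITH_META = (
    '<html><head><meta name="description" '
    'content="Daily forecasts for Long Island."></head>'
    "<body><p>Welcome to our site.</p></body></html>"
)
PAGE_WITHOUT_META = "<html><body><p>Welcome to our   site about weather</p></body></html>"

if __name__ == "__main__":
    print(describe(PAGE_WITH_META))
    print(describe(PAGE_WITHOUT_META, max_words=3))
```

The fallback branch shows why automatically generated descriptions can read poorly: the first words on a page are often navigation text or greetings rather than a summary.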
Another HTML tag that influences how a site is indexed by search engines is the META keyword tag. Like the description tag, this tag allows Webmasters to influence how their site will be ranked by search engines, but without having to show visitors any extra text. For example, a site for a business that does Web marketing can include a keyword tag that defines keywords such as marketing, promotion, or advertising. These words form part of the code for a Web page, but they aren't visible to people who visit the site. (It's kind of like including washing instructions or fabric content on a tag inside a shirt, rather than printing it on the outside.) However, not all Webmasters use keyword tags, and not all search engines look for them.
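As an illustration of the keyword tag, the Python sketch below reads the comma-separated keywords from a made-up page for the Web marketing business mentioned above. The names `KeywordParser` and `extract_keywords` are hypothetical, and the sample markup assumes the common form of the tag, a `meta` element with `name="keywords"` and a `content` attribute.

```python
from html.parser import HTMLParser


class KeywordParser(HTMLParser):
    """Collects the comma-separated entries of a META keywords tag."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            # Split on commas, trim spaces, and drop empty entries.
            for entry in content.split(","):
                entry = entry.strip().lower()
                if entry:
                    self.keywords.append(entry)


def extract_keywords(html):
    parser = KeywordParser()
    parser.feed(html)
    parser.close()
    return parser.keywords


# A made-up page: the keywords live in the code, not in the visible text.
MARKETING_PAGE = (
    "<html><head><title>Acme Web Services</title>"
    '<meta name="keywords" content="marketing, promotion, advertising">'
    "</head><body><p>We help small businesses grow online.</p></body></html>"
)

if __name__ == "__main__":
    print(extract_keywords(MARKETING_PAGE))
```

Note that none of the three keywords appears in the sentence a visitor would actually see, which is the point of the tag. An engine that ignores keyword tags would have only the title and body text of this page to work with.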
This information was adapted from "Can You Trust Your Search Engine?" by Susan Stellin, http://www.cnet.com/Content/Features/Dlife/Search/index.html.