Google Gets Rivals
Originally printed May 17, 2001
Joyce Kasman Valenza
When you practically live on the Web, you tend to believe that you can search with your eyes closed. It's just then that they start moving the furniture around. And so it's important that we take some time every few months to take stock of any changes or improvements in the searching landscape.
Joe Barker, Web Instruction Coordinator in the Teaching Library at UC Berkeley, continues to promote Google (http://google.com). "It's gotten better and the others are falling behind," said Barker. "It is possible, but hard, to find searches that Google is not the best engine for." Barker noted some recent Google improvements. Google is now "doing PDF," said Barker. Google recently became the first of the search engines to translate Adobe PDF (Portable Document Format) files into HTML and index them across the web. PDFs are a popular format for posting serious, often lengthy Web documents. Until now the text of PDF documents could not be read by search engine spiders, and searchers needed to use a specialized PDF search tool. Many remained buried as part of what's known as "the invisible web." Google now displays PDFs, tagged as such, along with other results, and users have the option to display them as HTML.
Another reason for selecting Google is that the engine searches keywords in context. Your results page displays your keywords surrounded by document text, making it possible to evaluate what you are going to click on and sometimes making it possible to see the answer to a query without even clicking on a result. Google's relatively new directory integrates its own PageRank technology with the exhaustive Open Directory Project where "volunteer armies of editors" work to organize the Web. Google does have its shortcomings. "It may not be great for complicated searches, because it doesn't provide full Boolean," said Barker. "I teach when not to use Google." Use AltaVista (http://altavista.com) for those complex searches, when you don't want what everyone else wants. And because Google relies on link relevance, very current material generally does not make it to the top of its result lists.
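The keyword-in-context display described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Google's actual method; the function name `keyword_in_context` and the `width` parameter are invented for the example.

```python
import re


def keyword_in_context(text, keyword, width=30):
    """Return each occurrence of keyword with `width` characters of
    surrounding text on either side (hypothetical helper, illustration only)."""
    snippets = []
    for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(match.start() - width, 0)
        end = min(match.end() + width, len(text))
        snippets.append(text[start:end])
    return snippets


page_text = "The Civil War began in 1861 and ended in 1865."
for snippet in keyword_in_context(page_text, "civil war", width=15):
    print("..." + snippet + "...")
```

A short snippet like this is what lets a searcher judge a result, or even spot an answer such as a date, without opening the page.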
"It's important that we teach people the universe of databases exists," said Barker. "Keyword searches are often frustrating and in many cases it's easier to search a specialized directory, like a database of every state's tax forms." Though there are a growing number of gateway sites to help you locate the valuable databases of the "invisible web," Barker suggests "you can find databases by using the words 'database' or 'directory' as keywords combined with another important keyword, for instance 'database AND plane crash.'" Academic Info (http://www.academicinfo.net/table.html) and Argus Clearinghouse (http://www.clearinghouse.net/) are among Barker's favorite tools to find directories. Barker noted that Yahoo also lists thousands of directories.
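Barker's tip is easy to turn into a habit. The sketch below simply builds the query strings he describes; the function name `directory_queries` is made up for illustration, and it assumes the engine you paste the query into accepts an uppercase AND operator, which not every engine does.

```python
def directory_queries(topic, finder_words=("database", "directory")):
    """Pair each finder word with the topic using AND, following
    Barker's "database AND plane crash" example (hypothetical helper)."""
    topic = " ".join(topic.split())  # collapse stray whitespace
    return [f"{word} AND {topic}" for word in finder_words]


for query in directory_queries("plane crash"):
    print(query)
```

Try both variants: a site that calls itself a "directory" may never use the word "database," and vice versa.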
"The Internet has exploded with content and search engines are struggling to deal with it," said Tara Calishain, author of five books on the Internet, and the editor of the e-newsletter Research Buzz, when I asked her what was new. Calishain, also a big Google fan, told me that the engine now supports more search syntaxes, for instance the "OR" operator, and that Google will now translate search results into the language of your choice. Calishain turned me on to a few new search tools. Guidebeam (http://guidebeam.com/) "generates categories for searches on the fly," said Calishain. My search on the Civil War led me to an intermediate page where I could select among a wide array of related categories. This combination of browsing and searching might be just the tool to help students narrow a topic for research.
Subjex (http://subjex.com/) calls itself the "world's first dialogue based Internet search engine portal." When users type their query in natural language, they are returned a list of suggested resources. "And they can actually have a conversation with it," said Calishain. Calishain recommended Moreover (http://w.moreover.com/categories/category_list.html), a news portal that she says "aggregates news from around the web and sorts it into 700 categories. You can request daily news in any of the categories via email or you may choose to have any of the news feeds appear on your web site."
"This year there's been a lot of talk about the deep web and deficiency of the typical search engine or the spider-crawled web," said Laura Cohen, Network Services Librarian at the University at Albany. "The static web, sometimes called the invisible web, with its content stored in databases, PDF files, multimedia, is not picked up by search engines. This invisible Web is estimated to be more than 500 times the size of the standard web. Sometimes it's superb content, sometimes it's merely items in catalogs or lists of employees."
Cohen and I discussed some of the databases that offer access to the "invisible web," for instance, Complete Planet (http://completeplanet.com), Invisible Web (http://invisibleweb.com/) or Search IQ (http://www.zdnet.com/searchiq/).
Cohen led me to two new metasearch tools that use "clustering" or autocategorization as a feature to help users fine-tune a search. QueryServer (http://www.queryserver.com/) broadcasts a query across a set of web search tools that may be selected by the user and lists results as "a single merged, ranked and conceptually clustered list." Users may also choose to narrow their search using search tools for news, health, money or government. Launched in February, Vivisimo (http://vivisimo.com/) was developed by faculty and students at Carnegie Mellon's Computer Science Department. The "clustering engine" organizes results from several search engines into meaningful groups displayed in hierarchical categories in the left frame. Vivisimo is getting great reviews and is quickly becoming one of my own favorite search tools. Old-timer Excite (http://www.excite.com/) recently added a "Zoom In" feature. Click on the Zoom In button to open a new window that suggests other terms that might help clarify your search. These suggestions are generally derived from the most popular searches related to your own. This same pop-up window suggests possible spelling errors.
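For readers curious how a metasearch tool might fold several result lists into one, here is a rough sketch. It is not how QueryServer or Vivisimo actually work, which the column does not describe; it assumes a simple scheme in which a page earns more credit the higher it ranks in each engine and the credits are summed. The engine names and example.com addresses are placeholders.

```python
from collections import defaultdict


def merge_results(results_by_engine):
    """Merge ranked URL lists from several engines into one ranked list.

    Assumed scoring (illustration only): each engine gives a URL 1/rank
    points, and a URL's points from all engines are added together.
    """
    scores = defaultdict(float)
    for urls in results_by_engine.values():
        seen = set()
        for rank, url in enumerate(urls, start=1):
            if url in seen:  # count a URL once per engine
                continue
            seen.add(url)
            scores[url] += 1.0 / rank
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))


sample = {
    "engine_a": ["http://example.com/a", "http://example.com/b", "http://example.com/c"],
    "engine_b": ["http://example.com/b", "http://example.com/a", "http://example.com/d"],
}
merged = merge_results(sample)
for url in merged:
    print(url)
```

Pages that several engines agree on float to the top, which is the basic appeal of metasearch. Clustering the merged list into topics is a further step this sketch does not attempt.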
"One of my big points is that you've got to think about the query, and gear that query towards a particular tool--to fine-tune your search using the right tool," said Cohen. Cohen offered advice to searchers: combine your tools. "We have so much choice that we can now pick and choose little steps within your search." Cohen suggests using the clustering and narrowing features of the new search tools. "Bring the alternate terms and categories you discover back to your favorite engine."
"And it's important to understand we're dealing with commercial products," said Cohen. Even when there are features like Google's page ranking you have to be careful. "Nothing is neutral," said Cohen. "And paid listings are the least neutral." Searchers should be aware of result lists that involve an increasing amount of paid placement.