Thursday, May 17, 2001
tech.life@school | Joyce Kasman Valenza
Searches change, so Web users must
When you practically live on the Web, you tend to believe you can search with your eyes closed. It's just then that they start moving the furniture around. So it is important to take stock every few months of any changes or improvements in the searching landscape.
Joe Barker, Web instruction coordinator in the teaching library at the University of California at Berkeley, continues to promote Google, http://google.com. "It's gotten better, and the others are falling behind," Barker said. Google recently became the first of the search engines to translate Adobe PDF (portable document format) files into HTML and index them across the Web. PDFs are a popular format for posting serious, often lengthy, Web documents. Until now, the text of PDF documents could not be read by search-engine spiders, and searchers needed to use a specialized PDF search tool. Google now displays PDFs along with other results, and users have the option to display them as HTML.
Another reason for selecting Google is that the engine searches keywords in context. Your results page displays your keywords surrounded by document text, making it possible to evaluate what you are going to click on and sometimes making it possible to see the answer to a query without even clicking on a result. Google's relatively new directory integrates its own PageRank technology with the exhaustive Open Directory Project, in which "volunteer armies of editors" work to organize the Web. Barker says it is important that searchers recognize that databases exist as an alternative to search engines: "Keyword searches are often frustrating and in many cases it's easier to search a specialized directory, like a database of every state's tax forms."
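The keywords-in-context display described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Google's method; the function name `keyword_in_context` and the `width` parameter are made up for this example.

```python
import re

def keyword_in_context(text, keyword, width=30):
    """Return each occurrence of `keyword` with up to `width`
    characters of surrounding text on either side (hypothetical helper)."""
    snippets = []
    # re.escape treats the keyword as literal text; matching ignores case.
    for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(match.start() - width, 0)
        end = min(match.end() + width, len(text))
        snippets.append(text[start:end].strip())
    return snippets

page = "The Civil War began in 1861. Many civil war records are online."
for snippet in keyword_in_context(page, "civil war", width=10):
    print(snippet)
```

A searcher scanning snippets like these can often judge a page, or even find the answer, before clicking through.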
A growing number of gateway sites help searchers locate the valuable databases of the "invisible Web." Academic Info, www.academicinfo.net/table.html, and Argus Clearinghouse, www.clearinghouse.net/, are among Barker's favorite tools for finding directories. Yahoo also lists thousands of directories, he noted.
"The Internet has exploded with content, and search engines are struggling to deal with it," said Tara Calishain, author of five books on the Internet and the editor of the e-newsletter Research Buzz.
Calishain turned me on to a few new search tools.
Guidebeam, http://guidebeam.com/, "generates categories for searches on the fly," Calishain said. My search on the Civil War led me to an intermediate page where I could select among a wide array of related categories. This combination of browsing and searching might be just the tool to help students narrow a topic for research. Subjex, http://subjex.com/, calls itself the "world's first dialogue-based Internet search engine portal." When users type a query in natural language, they get back a list of suggested resources. "And they can actually have a conversation with it," Calishain said.
She recommended Moreover, http://w.moreover.com/categories/category_list.html, a news portal that she says "aggregates news from around the Web and sorts it into 700 categories. You can request daily news in any of the categories via e-mail or you may choose to have any of the news feeds appear on your Web site." Also new is the news search tool Rocket News at http://www.rocketnews.com/.
"This year, there's been a lot of talk about the deep Web and deficiency of the typical search engine or the spider-crawled Web," said Laura Cohen, network services librarian at the University at Albany. "The static Web, sometimes called the invisible Web, with its content stored in databases, PDF files, multimedia, is not picked up by search engines. This invisible Web is estimated to be more than 500 times the size of the standard Web. Sometimes it's superb content, sometimes it's merely items in catalogs or lists of employees." Cohen and I discussed some gateways that offer access to the invisible Web: Complete Planet, http://completeplanet.com; Invisible Web, http://invisibleweb.com/; and Search IQ, http://zdnet.com/searchiq/.
Cohen led me to two new metasearch tools that use "clustering" to help users fine-tune a search. QueryServer, which is at www.queryserver.com/, broadcasts a query across a set of Web search tools that may be selected by the user and lists results as "a single merged, ranked and conceptually clustered list." Users may also narrow their search using search tools for news, health, money or government.
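To picture what a "single merged, ranked" list involves, here is a minimal Python sketch. It assumes each search tool returns an ordered list of URLs and scores every URL by summing the reciprocal of its rank in each list. That scoring rule and the name `merge_results` are assumptions made for illustration; the column does not say how QueryServer actually ranks or clusters results.

```python
from collections import defaultdict

def merge_results(results_by_engine):
    """Merge ranked URL lists from several engines into one ranked list.

    Hypothetical scoring: each URL earns 1/rank from every engine that
    returned it, so pages ranked highly by several engines rise to the top.
    """
    scores = defaultdict(float)
    for urls in results_by_engine.values():
        for rank, url in enumerate(urls, start=1):
            scores[url] += 1.0 / rank
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Made-up result lists standing in for two search tools.
merged = merge_results({
    "engine_a": ["x.example", "y.example"],
    "engine_b": ["y.example", "z.example"],
})
print(merged)
```

Duplicates collapse into a single entry, which is one reason a merged list can be shorter and easier to scan than the separate lists it came from.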
Launched in February, Vivisimo, http://vivisimo.com/, was developed by faculty and students at Carnegie Mellon University's computer science department. The "clustering engine" organizes results from several search engines into meaningful groups displayed in hierarchical categories on the left frame. Vivisimo is getting great reviews and is quickly becoming one of my favorite search tools.
Old-timer Excite, www.excite.com/, just added a Zoom In feature. Click on the Zoom In button to open a new window that suggests other terms that might help clarify your search. These suggestions are generally derived from the most popular searches related to your own. This same pop-up window suggests possible spelling errors.
Cohen offered advice to searchers: Combine your tools. "We have so much choice that we can now pick and choose little steps within your search." Cohen suggests using the clustering and narrowing features of the new search tools, and using concept-processing search tools such as Oingo, http://oingo.com, and Simpli, http://simpli.com, as a thesaurus. Bring the alternate terms and categories you discover back to your favorite engine.
"And it's important to understand we're dealing with commercial products," Cohen said. Even when there are features like Google's page-ranking, you have to be careful. "Nothing is neutral," Cohen said. "And paid listings are the least neutral." Searchers should be aware that result lists involve an increasing amount of paid placement. GoTo, for instance, carries only paid listings.
------------------------------------------------------------------------
Joyce Kasman Valenza is the librarian at Springfield High School in Erdenheim, Pa. Her column appears each week in tech.life. Her e-mail address is joyce.valenza@phillynews.com