WebSPHINX: A Personal, Customizable Web Crawler

Whole Earth, Summer 2001, by Mikael Huss and Joel Westerberg

There's a problem with web surfing: no sense of direction. Slow-loading webpages and crummy browsers feed your screen one page at a time. You never see an overview of the web's structure.

Explorer and Netscape, the factory-issue web browsers, are too crude for conceptual web research. But a web spider crawls the web for you, plowing through page after page, relentlessly extracting links, page titles, page sizes, and even keywords. Search engines acquire their databases this way, but they jealously guard that painstakingly indexed data.
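To make the extraction step concrete, here is a minimal sketch of what a spider might pull from a single page: its title, its size, and its outgoing links. It uses only Python's standard library and is not WebSphinx's own code. The names `PageSummary` and `summarize` are made up for illustration, and the sketch assumes the page's HTML has already been downloaded as a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageSummary(HTMLParser):
    """Hypothetical helper: collects the title and outgoing links of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarize(base_url, html):
    """Return the title, size in bytes, and absolute links of one HTML page."""
    parser = PageSummary(base_url)
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "size": len(html.encode("utf-8")),
        "links": parser.links,
    }
```

Keyword extraction is left out here; a real spider would add it on top of the same parsing pass.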

Though the standard home PC isn't on a par with the indexing factories of major search engines, it's perfectly possible to do a limited, personal crawl of smaller portions of the web. This is where WebSphinx comes in. It's Java-based, and we may also rejoice in the fact that it's open-source. To start, select a local or global crawl, type in the URL you care to crawl from, then watch that spider make its multithreaded way through the linkspace. Each page becomes a document icon, with links as arrows connecting pages.
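The crawl itself can be sketched in a few lines. This is a simplified, single-threaded illustration, not how WebSphinx is actually implemented. It assumes that a "local" crawl means staying on the starting host, and it takes a hypothetical `get_links` function that returns the links found on a page, so the example runs without touching the network. The recorded edges correspond to the arrows drawn between page icons.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(start_url, get_links, local=True, max_pages=50):
    """Breadth-first crawl; returns visited pages and (source, target) link pairs."""
    start_host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    edges = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for target in get_links(url):
            edges.append((url, target))  # one arrow in the link graph
            if target in seen:
                continue
            # Assumption: a local crawl never leaves the starting host.
            if local and urlparse(target).netloc != start_host:
                continue
            seen.add(target)
            queue.append(target)
    return visited, edges


# A toy three-page "web" standing in for real downloads.
toy_web = {
    "http://a.test/": ["http://a.test/x", "http://b.test/"],
    "http://a.test/x": ["http://a.test/"],
    "http://b.test/": ["http://b.test/y"],
}
pages, arrows = crawl("http://a.test/", lambda url: toy_web.get(url, []))
```

The `max_pages` cap is what keeps a personal crawl personal: without some limit, a global crawl on a home PC would simply never finish.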

WebSphinx shines with its graphic interface: an ongoing webcrawl is like an evolving protozoan lifeform, rocking in the gentle currents of the oceanic web. The look and feel of WebSphinx resemble yesteryear's big hit from the net.art scene, Plumb Design's Visual Thesaurus.

WebSphinx navigation is fast and intuitive. You hover the cursor above a page to see its title, then double-click to launch the page in another window. The real strength of this program is its God's-eye view of the web, zeroing in on content-dense formations deep within websites.

WebSphinx is ideal for ego-surfing; by launching it from your own home page, you can easily find your place in the web's semantic space.

Joel Westerberg is a website editor and graphic designer who is heavily into power tools, screwdrivers, and tool belts. Mikael Huss is Sweden's foremost expert on Chinese science fiction.

COPYRIGHT 2001 New Whole Earth LLC
COPYRIGHT 2008 Gale, Cengage Learning