The Metaphors of the Net – Part II

Written By: Sam Vaknin

2. The Internet as a Chaotic Library

A. The Problem of Cataloguing

The Internet is an assortment of billions of pages which contain information. Some of them are visible and others are generated from hidden databases by users’ requests (“Invisible Internet”).

The Internet exhibits no discernible order, classification, or categorization. Amazingly, as opposed to “classical” libraries, no one has yet invented a (sorely needed) Internet cataloguing standard (remember Dewey?). Some sites do apply the Dewey Decimal System to their contents (Suite101). Others default to a directory structure (Open Directory, Yahoo!, LookSmart and others).

Had such a standard existed (an agreed-upon numerical cataloguing method), each site could have classified itself. Sites would have an interest in doing so, to increase their visibility. This, naturally, would have eliminated the need for today’s clunky, incomplete and highly inefficient search engines.

Thus, a site whose number starts with 900 would be immediately identified as dealing with history, and multiple classification would be encouraged to allow finer cross-sections to emerge. An example of such an emerging technology of “self-classification” and “self-publication” (though limited to scholarly resources) is the “Academic Resource Channel” by Scindex.
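The idea of self-classification can be illustrated with a minimal sketch. It assumes that a site declares one or more Dewey-style codes for itself; the function names and the sample codes below are hypothetical and are not part of any existing standard or service. Only the ten main Dewey classes are mapped, so a code starting with 9 resolves to history and geography.

```python
# A minimal sketch of Dewey-style self-classification.
# Assumption: each site publishes its own list of Dewey codes.
# The names main_class and classify_site are hypothetical.

# The ten main classes of the Dewey Decimal Classification,
# keyed by the hundreds digit of the code.
DEWEY_MAIN_CLASSES = {
    0: "Computer science, information and general works",
    1: "Philosophy and psychology",
    2: "Religion",
    3: "Social sciences",
    4: "Language",
    5: "Science",
    6: "Technology",
    7: "Arts and recreation",
    8: "Literature",
    9: "History and geography",
}


def main_class(code: str) -> str:
    """Return the main Dewey class for a code such as '942.05'."""
    number = float(code)  # raises ValueError for non-numeric input
    if not 0 <= number < 1000:
        raise ValueError(f"not a valid Dewey code: {code!r}")
    return DEWEY_MAIN_CLASSES[int(number) // 100]


def classify_site(codes):
    """Return the distinct main classes a site has declared, sorted."""
    return sorted({main_class(code) for code in codes})


# Hypothetical codes declared by an imaginary site (multiple classification).
site_codes = ["942.05", "909", "330.9"]
print(classify_site(site_codes))
```

A browser or catalogue built on such declarations could hide the numbers from the user entirely and show only the class names.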

Moreover, users would not be required to remember reams of numbers. Future browsers will be akin to catalogues, very much like the applications used in modern-day libraries. Compare this utopia to the current dystopia. Users struggle with mounds of irrelevant material to finally reach a partial and disappointing destination. At the same time, there likely are web sites which exactly match the poor user’s needs. Yet what currently determines the chances of a happy encounter between user and content is the whims of the specific search engine used and things like meta-tags, headlines, a fee paid, or the right opening sentences.

B. Screen vs. Page

The computer screen, because of its physical limitations (its size, the fact that it has to be scrolled), fails to compete effectively with the printed page. The latter is still the most ingenious medium yet invented for the storage and release of textual information. Granted: a computer screen is better at highlighting discrete units of information. These differing capacities draw the battle lines: structures (printed pages) versus units (screen), the continuous and easily reversible (print) versus the discrete (screen).

The solution lies in finding an efficient way to translate computer screens to printed matter. It is hard to believe, but no such thing exists. Computer screens are still hostile to off-line printing. In other words: if a user copies information from the Internet to his word processor (or vice versa, for that matter), he ends up with a fragmented, garbage-filled and unaesthetic document.

Very few site developers try to do something about it – even fewer succeed.

C. Dynamic vs. Static Interactions

One of the biggest mistakes of content suppliers is that they do not provide a “static-dynamic interaction”.

Internet-based content can now easily interact with other media (e.g., CD-ROMs) and with non-PC platforms (PDAs, mobile phones).

Examples abound:

A CD-ROM shopping catalogue interacts with a Web site to allow the user to order a product. The catalogue could also be updated through the site (as is the practice with CD-ROM encyclopedias). The advantages of the CD-ROM are clear: very fast access time (dozens of times faster than the access to a Web site using a dial-up connection) and a data storage capacity hundreds of times bigger than the average Web page.

Another example:

A PDA plug-in disposable chip containing hundreds of advertisements or a “yellow pages”. The consumer selects the ad or entry that she wants to see and connects to the Internet to view a relevant video. She could then also have an interactive chat (or a conference) with a salesperson, receive information about the company, about the ad, about the advertising agency which created the ad – and so on.

CD-ROM based encyclopedias (such as the Britannica, or the Encarta) already contain hyperlinks which carry the user to sites selected by an Editorial Board.


CD-ROMs are probably a doomed medium. Storage capacity keeps increasing exponentially and, within a year, desktops with 80 GB hard disks will be a common sight. Moreover, the much-heralded Network Computer – the stripped-down version of the personal computer – will put at the disposal of the average user terabytes in storage capacity and the processing power of a supercomputer. What separates computer users from this utopia is the communication bandwidth. With the introduction of radio and satellite broadband services, DSL and ADSL, and cable modems, coupled with advanced compression standards, video (on demand), audio and data will be available speedily and plentifully.

The CD-ROM, on the other hand, is not mobile. It requires installation and the utilization of sophisticated hardware and software. This is no user-friendly push technology. It is nerd-oriented. As a result, CD-ROMs are not an immediate medium. There is a long time lapse between the moment of purchase and the moment the user accesses the data. Compare this to a book or a magazine. Data in these oldest of media are instantly available to the user, and they allow for easy and accurate “back” and “forward” functions.

Perhaps the biggest mistake of CD-ROM manufacturers has been their inability to offer an integrated hardware and software package. CD-ROMs are not compact. A Walkman is a compact hardware-cum-software package. It is easily transportable, it is thin, it contains numerous, user-friendly, sophisticated functions, it provides immediate access to data. So does the discman, or the MP3-man, or the new generation of e-books (e.g., E-Ink’s). This cannot be said about the CD-ROM. By tying its future to the obsolete concept of stand-alone, expensive, inefficient and technologically unreliable personal computers – CD-ROMs have sentenced themselves to oblivion (with the possible exception of reference material).

D. Online Reference

A visit to the on-line Encyclopaedia Britannica demonstrates some of the tremendous, mind-boggling possibilities of online reference – as well as some of the obstacles.

Each entry in this mammoth work of reference is hyperlinked to relevant Web sites. The sites are carefully screened. Links are available to data in various forms, including audio and video. Everything can be copied to the hard disk or to a rewritable (R/W) CD.

This is a new conception of a knowledge centre – not just a heap of material. The content is modular and continuously enriched. It can be linked to a voice Q&A centre. Queries by subscribers can be answered by e-mail or by fax, or posted on the site, and hard copies can be sent by post. This “Trivial Pursuit” or “homework” service could be very popular – there is considerable appetite for “Just in Time Information”. The Library of Congress – together with a few other libraries – is in the process of making just such a service available to the public (CDRS – Collaborative Digital Reference Service).

E. Derivative Content

The Internet is an enormous reservoir of archives of freely accessible, or even public domain, information.

With a minimal investment, this information can be gathered into coherent, theme-oriented, cheap compilations (on CD-ROMs, in print, as e-books or on other media).

F. E-Publishing

The Internet is by far the world’s largest publishing platform. It incorporates FAQs (Q&As regarding almost every technical matter in the world), e-zines (electronic magazines), the electronic versions of print dailies and periodicals (in conjunction with on-line news and information services), reference material, e-books, monographs, articles, minutes of discussions (“threads”), conference proceedings, and much more besides.

The Internet represents major advantages to publishers. Consider the electronic version of a p-zine.

Publishing an e-zine promotes the sales of the printed edition, helps sign on subscribers and leads to the sale of advertising space. The electronic archive function (see next section) eliminates the need to file back issues, the physical space required to do so and the irritating search for data items.

The future trend is a combined subscription to both the electronic edition (mainly for the archival value and the ability to hyperlink to additional information) and to the print one (easier to browse the current issue). The Economist is already offering free access to its electronic archives as an inducement to its print subscribers.

The electronic daily presents other advantages:

It allows for immediate feedback and for flowing, almost real-time, communication between writers and readers. The electronic version, therefore, acquires a gyroscopic function: a navigation instrument, always indicating deviations from the “right” course. The content can be instantly updated and breaking news incorporated in older content.

Specialty hand-held devices already allow for downloading and storage of vast quantities of data (up to 4000 print pages). The user gains access to libraries containing hundreds of texts, adapted to be downloaded, stored and read by the specific device. A convergence of standards is to be expected in this field as well (the final contenders will probably be Adobe’s PDF against Microsoft’s MS-Reader).

Currently, e-books are treated dichotomously: either as a continuation of print books (p-books) by other means, or as a whole new publishing universe.

Since p-books are a more convenient medium than e-books – they will prevail in any straightforward “medium replacement” or “medium displacement” battle.

In other words, if publishers persist in the simple and straightforward conversion of p-books to e-books – then e-books are doomed. They are simply inferior and cannot offer the comfort, tactile delights, browseability and scanability of p-books.

But e-books – being digital – open up a vista of hitherto neglected possibilities. These will only be enhanced and enriched by the introduction of e-paper and e-ink. Among them:

Hyperlinks within the e-book and without it – to web content, reference works, etc.;
Embedded instant shopping and ordering links;
Divergent, user-interactive, decision driven plotlines;
Interaction with other e-books (using a wireless standard) – collaborative authoring or reading groups;
Interaction with other e-books – gaming and community activities;
Automatically or periodically updated content;
Database, Favourites, Annotations, and History Maintenance (archival records of reading habits, shopping habits, interaction with other readers, plot related decisions and much more);
Automatic and embedded audio conversion and translation capabilities;
Full wireless piconetworking and scatternetworking capabilities.

The technology is still not fully there. Wars rage in both the wireless and the e-book realms. Platforms compete. Standards clash. Gurus debate. But convergence is inevitable and with it the e-book of the future.

G. The Archive Function

The Internet is also the world’s biggest cemetery: tens of thousands of defunct sites, still accessible – the “Ghost Sites” of this electronic frontier.

This, in a way, is collective memory. One of the Internet’s main functions will be to preserve and transfer knowledge through time. It is called “memory” in biology – and “archive” in library science. The history of the Internet is being documented by search engines (Google) and specialized services (Alexa) alike.


