Faceted classification and logical division in information retrieval
Library Trends, Winter 2004, by Jack Mills
ABSTRACT
THE MAIN OBJECT OF THE PAPER is to demonstrate in detail the role of classification in information retrieval (IR) and the design of classificatory structures by the application of logical division to all forms of the content of records, subject and imaginative. The natural product of such division is a faceted classification. The latter is seen not as a particular kind of library classification but as the only viable form enabling the locating and relating of information to be optimally predictable. A detailed exposition of the practical steps in facet analysis is given, drawing on the experience of the new Bliss Classification (BC2). The continued existence of the library as a highly organized information store is assumed. But, it is argued, the library must acknowledge the relevance of the revolution in library classification that has taken place. The paper also considers how alphabetically arranged subject indexes may make controlled use of categorical (generically inclusive) and syntactic relations to produce similarly predictable locating and relating systems for IR.
1. INTRODUCTION
As a memorable aphorism prefacing his novel Howards End, E. M. Forster gave simply "Only connect." It could claim to be the finest, even though briefest, definition of intelligence we have. To understand anything, whether it is the operation of a complicated mechanism or the complex social factors that underlie almost any human situation, means seeing the connections. The basic intellectual instrument we use to do this is classification. It is appropriate that libraries, which seek to organize everything in the way of recorded human knowledge, should find explicit classification central to their organization.
1.1. Indexing and searching
Indexing and searching are the two fundamental operations in retrieval. The usual situation in the library is that the librarian prepares the scene for retrieval by indexing each document (assigning to it retrieval handles such as classmarks, subject headings, etc.). Searching may then be done directly, by examining the documents on the shelf, or vicariously, via their surrogates in the catalog. Although the term "indexing" is used with various connotations, especially ones involving terms in alphabetical order, the central meaning of pointing out or indicating describes exactly what librarians do when, in response to any inquiry, they indicate where the inquirer may best begin looking and, perhaps, where to look next should the first search prove inadequate. This function is neatly summarized in cataloging theory as one of locating and relating.
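The two operations described above can be illustrated with a minimal sketch. It is not taken from the paper: the classmarks, document identifiers, and function names are all hypothetical, and "relating" is reduced here to adjacency in filing order, which is only one of the ways a real classification relates classes.

```python
from collections import defaultdict

# Hypothetical documents with hypothetical classmarks as retrieval handles.
DOCUMENTS = {
    "doc1": ["K10"],
    "doc2": ["K10", "K20"],
    "doc3": ["K30"],
}


def build_index(documents):
    """Indexing: map each retrieval handle to the documents that carry it."""
    index = defaultdict(list)
    for doc_id, handles in documents.items():
        for handle in handles:
            index[handle].append(doc_id)
    return {handle: sorted(ids) for handle, ids in index.items()}


def locate(index, handle):
    """Locating: the documents where the inquirer may best begin looking."""
    return index.get(handle, [])


def relate(index, handle):
    """Relating: neighboring handles in filing order, to be tried next."""
    ordered = sorted(index)
    if handle not in ordered:
        return []
    pos = ordered.index(handle)
    return ordered[max(pos - 1, 0):pos] + ordered[pos + 1:pos + 2]


INDEX = build_index(DOCUMENTS)
```

Under these assumptions, `locate(INDEX, "K10")` gives the starting point of a search, and `relate(INDEX, "K20")` suggests the adjacent classes to examine should that first search prove inadequate.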
1.2. Classification
Classification is the most fundamental operation in indexing. In its broadest sense, it is the action of recognizing and establishing groups or classes of objects, the subclasses and members of which all manifest (even though in different ways) a particular characteristic or set of characteristics. The different kinds of shared characteristic(s) used to define a class for retrieval have been called index devices (Cleverdon et al., 1966). Library classification, via shelf order and the classified catalog, uses a number of different devices. Two of these reflect the sort of class definition usually understood by the term "classification," namely those defined by generic and whole-part relations; but coordination (combination), synonym control, role indication (by inclusion of terms in facets defining their relation, such as agent, property), and some confounding of word forms (via their adjacency in the A/Z index) are also prominent. Mechanized retrieval systems developed a number of less direct devices, e.g., an extended confounding of word forms and oblique ways of defining a set of documents sharing the same subject content, such as is found in citation indexing. Electronic systems have now extended these oblique forms of class definition (see Section 3.5).
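Three of the index devices named above (generic relations, synonym control, and coordination) can be sketched as operations on sets of documents. This is an illustrative toy only, not Cleverdon's or the paper's formulation: the vocabulary, the postings, and every function name are hypothetical.

```python
# Hypothetical vocabulary: each term points to its broader (generic) term.
BROADER = {"poodle": "dog", "dog": "mammal", "cat": "mammal"}

# Hypothetical synonym control: non-preferred term -> preferred term.
SYNONYMS = {"canine": "dog", "hound": "dog"}

# Hypothetical postings: preferred term -> documents indexed under it.
POSTINGS = {
    "poodle": {"d1"},
    "dog": {"d2"},
    "cat": {"d3"},
    "mammal": {"d4"},
    "grooming": {"d1", "d3"},
}


def preferred(term):
    """Synonym control: replace a term by its preferred form, if any."""
    return SYNONYMS.get(term, term)


def generic_closure(term):
    """Generic relation: the term together with all of its narrower terms."""
    result = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in BROADER.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def search_class(term):
    """All documents in the class defined by a term and its subclasses."""
    docs = set()
    for member in generic_closure(preferred(term)):
        docs |= POSTINGS.get(member, set())
    return docs


def coordinate(*terms):
    """Coordination: documents belonging to every one of the given classes."""
    sets = [search_class(t) for t in terms]
    return set.intersection(*sets) if sets else set()
```

In this sketch each device defines a class in a different way: `search_class("canine")` relies on synonym control and generic inclusion, while `coordinate("hound", "grooming")` combines two classes into a narrower one.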
2. WHAT IS CLASSIFIED IN THE LIBRARY
Physically, library materials are the object of a relatively rudimentary classification, in that significantly different physical forms are separately housed and may be separately indexed. However, in nearly all cases it is their content which is their ultimate justification, and the problems of information retrieval (IR) are paramount. Whether this content is better described as information or knowledge is a question best left to the philosophers. Early writers on library classification tended to use the term "knowledge" for the object of classification and retrieval. A dissident voice at the beginning of the last century, when Bliss began opting firmly for a knowledge basis, was Wyndham Hulme (1911-1912). Hulme distinguished mechanical classification from philosophical and claimed that library classification belonged to the first kind. He coined the term "literary warrant" and described library classification as the plotting of areas preexisting in literature. This was, in fact, not a bad description of what the Library of Congress was doing in many of its classes, although it interpreted preexistence as being what it held in stock. When we consider content only, a major distinction is found in all general libraries (and some special) between subject content and what we may call, for lack of better words, "imaginative content." Much discussion of the exact nature of information reflects unease over the use of the term "information retrieval" when it is clear that the content of a significant class of documents is not sensibly defined as information. The term "knowledge" appears to be somewhat more receptive to the inclusion of imaginative works than the term "information."