It's Back to Work for this Eighty-four Year Old Librarian

Frances Williams has been asked to come back to work, even though she's now 84 years old. She'll be working at a library in Madras, India, helping them to catalogue thousands of books using the Dewey Decimal system. She worked at the same library during the 1980s. Story from the UK's Malvern Gazette.


Working Group Established To Discuss Future of Bibliographic Control

Is there anything librarians love more than committees and working groups? Probably not. The LOC announced a new group: "Advances in search-engine technology, the popularity of the Internet and the influx of electronic information resources have greatly changed the way libraries do their work. To address those changes, the Library of Congress has convened a Working Group on the Future of Bibliographic Control to examine the future of bibliographic description in the 21st century."

Brown University cataloguing its repository of rare maps

The AP Wire has one on a Brown University collection of more than 1,000 rare maps that librarians are in the process of cataloging online in an effort to move into the digital age.

Officials say the push to catalog the artifacts -- some brittle with age, and many dating back 100 years or more -- will make them more accessible to the public and help those interested in urban studies, genealogy and other research areas.

What is Going on at the Library of Congress?

I thought we had pointed to What is Going on at the Library of Congress? [pdf] but I could be wrong. In any case, Steven Chabot has written A summary and commentary of Thomas Mann's "What's Going on at the Library of Congress?" Both pieces are worth a read. Chabot focuses on two of Mann's points: a move to abandon the LC system of headings (essentially leaving categorization to Google-like keyword searches and Amazon-like user recommendations), and a move to accept digital copies of those works not "born digital", i.e. books, in place of their paper representation on a physical shelf.

New Blog Tracks Typos in Online Catalogs

Terry Ballard writes "The blog Typo of the day was inaugurated on October 11, 2006. This will highlight one word each day from the list of thousands found by the members of the forum LIBTYPOS-L. The group was formed in the summer of 2000 to supplement the findings of Terry Ballard's 1992 study of typographical errors in library databases. Since that time the group, now numbering more than 150 librarians, has found thousands of correctable errors in online catalogs, using the massive OHIOLINK catalog as their baseline. The full list, which is updated annually, can be found at yposcomplete.html"

Update: 10/18 19:14 GMT by B: Terry Ballard writes some more: "I sent in a story about the Typo of the Day blog yesterday, and notice two things that need fixing (how ironic). Wendy Eyler should be Wendee Eyler, and the postings are done every weekday."


A catalogue of errors

How many books written in seemingly obscure languages are misfiled and languishing unfindable in libraries? Joyce Flynn's experience at Harvard suggests the answer is: a lot.
Flynn, a researcher in Celtic languages, discovered some common mishaps that no one discusses much.

Sometimes, cataloguers and shelfers did strange things with books written in foreign languages. They mangled the catalogue listings, and tucked the books away on the wrong shelves.


NYPL Reading Room switches from home-grown classification to LC

The New York Times ran an editorial today praising the NYPL Main Reading Room for reclassifying its materials from the Billings system (created by a former director) to Library of Congress classification. There is a fair bit of musing about library classification, not something you normally see on the newspaper editorial page.

The Economics of Open Bibliographic Data Provision

Paper from the University of Connecticut, Department of Economics. Abstract: In this paper, we discuss the provision of bibliographic data as an extension of the open source concept. Our particular concern is the sustainability of such endeavors. We describe the RePEc (Research Papers in Economics) project, probably the largest "open source" bibliographic database. It demonstrates that open source bibliographic data collection is sustainable. Click here for full text of paper.


Thirteen Tips for Effective Tagging

Thirteen Tips for Effective Tagging is one from over at TechSoup. Tagging is a collaboratively generated, open-ended labeling system that enables Internet users to categorize content such as Web pages, online photographs, and Web links. Tagging lets you categorize information online your way. Sounds suspiciously like cataloging, only without the messy MARC rules.
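
For the curious, here is a minimal sketch of what such a labeling system might look like as a data structure. It is not taken from the TechSoup piece: the `TagIndex` class, its method names, and the lowercase normalization are all assumptions made for illustration.

```python
from collections import defaultdict


class TagIndex:
    """Toy tag store: any user may attach any label to any item.

    Hypothetical illustration only; not based on any real tagging service.
    """

    def __init__(self):
        # tag -> set of items carrying that tag
        self.items_by_tag = defaultdict(set)
        # item -> tag -> set of users who applied that tag
        self.users_by_item_tag = defaultdict(lambda: defaultdict(set))

    @staticmethod
    def _normalize(tag):
        # Light cleanup only; there is no controlled vocabulary.
        return tag.strip().lower()

    def add(self, user, item, tag):
        """Record that `user` labeled `item` with `tag`."""
        tag = self._normalize(tag)
        self.items_by_tag[tag].add(item)
        self.users_by_item_tag[item][tag].add(user)

    def find(self, tag):
        """Return every item carrying `tag`, in sorted order."""
        return sorted(self.items_by_tag.get(self._normalize(tag), set()))

    def top_tags(self, item, n=3):
        """Return up to `n` tags for `item`, most widely applied first."""
        tags = self.users_by_item_tag.get(item, {})
        ranked = sorted(tags, key=lambda t: (-len(tags[t]), t))
        return ranked[:n]
```

Nothing here says which label is the "right" one for an item. The index simply records what people typed, and the most widely applied tags float to the top.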

Ontology is Overrated

At this website you can listen to the entire one hour presentation.
During the presentation Clay Shirky refers to several graphics. You can see the graphics at this website. This makes it easier to follow the presentation because you can then see what the audience is seeing.

Overview of presentation:
There are many ways to organize data: labels, lists, categories, taxonomies, ontologies. Of these, ontology -- assertions about essence and relations among a group of items -- seems to be the highest-order method of organization. Indeed, the predicted value of the Semantic Web assumes that ontological successes such as the Library of Congress's classification scheme are easily replicable.

Those successes are not easily replicable. Ontology, far from being an ideal high-order tool, is a 300-year-old hack, now nearing the end of its useful life. The problem ontology solves is not how to organize ideas but how to organize things -- the Library of Congress's classification scheme exists not because concepts require consistent hierarchical placement, but because books do.

The LC scheme, when examined closely, is riddled with inconsistencies, bias, and gaps. Top level geographic categories, for example, include "The Balkan Peninsula" and "Asia." The primary medical categories don't include oncology, defaulting to the older and now discredited notion that cancers were more related to specific organs than to common processes. And the list of such oddities goes on.

The reason the LC scheme is accumulating these errors faster than they can correct them is the physical fact of the book, which makes a card catalog scheme necessary, and constant re-shelving impossible. Likewise, it enforces cookie-cutter categorization that doesn't reflect the polyphony of its contents--there is a literature of creativity, for example, made up of books about art, science, engineering, and so on, and yet those books are not categorized (which is to say shelved) together, because the LC scheme doesn't recognize creativity as an organizing principle. For a reader interested in creativity, the LC ontology destroys value rather than creating it.

As we have learned from the Web, when data is decoupled from physical presence, it is fluid enough to be grouped differently by different readers, and on different days. The Web's main virtue, in handling data, is to transmute organization from an a priori, content-based judgment to one that can be ad hoc, context-based, socially embedded, and constantly altered. The Web frees us from needing to argue about whether The Book of 5 Rings "is" a business book or a primer on war -- it is plainly both, and not only are we freed from making that judgment firmly or in advance, we are freed from needing to make it explicit at all.

This talk begins by exploring the rise of ontological classification. In the period after the invention of the printing press but before the invention of the search engine, intellectual production was vested in books, objects that were numerous but opaque. When you have more than a few hundred books, categorization becomes a forced move, even if the categories are somewhat arbitrary, because without categories, you can no longer locate individual books.

It will relate this "opaque objects" problem to the more recent history of organizing pure data -- a hierarchical file-system; then the emergence of "symbolic" links, which undermined the hierarchy but left intact the idea that data "was" somewhere, and that all other pointers were second-class; to our current system, where the URL makes all links equally symbolic.
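
The "second-class pointer" idea can be tried out on a POSIX file system. The following is a rough sketch, not anything from the talk: the folder and file names are hypothetical, and it assumes a platform where `os.symlink` is permitted.

```python
import os
import tempfile


def demo_symlink():
    """Return (is_link, same_target, dangling_after_delete) for a toy file tree."""
    with tempfile.TemporaryDirectory() as root:
        # The "real" location of the file in the hierarchy (hypothetical names).
        os.makedirs(os.path.join(root, "business"))
        os.makedirs(os.path.join(root, "war"))
        original = os.path.join(root, "business", "five_rings.txt")
        with open(original, "w") as f:
            f.write("strategy")

        # A symbolic link files the same text under a second category.
        link = os.path.join(root, "war", "five_rings.txt")
        os.symlink(original, link)

        is_link = os.path.islink(link)
        same_target = os.path.realpath(link) == os.path.realpath(original)

        # Remove the original: the link itself survives but points at nothing,
        # which is what makes it a second-class pointer.
        os.remove(original)
        dangling = os.path.islink(link) and not os.path.exists(link)
        return is_link, same_target, dangling
```

Both paths open the same file, but only one of them is where the data "is": delete that one and the other is left dangling.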

The URL represents the inversion of the traditional scale, making the mere label and not the mighty ontology the key site of organizational value. The talk will go on to describe the tension between productive and extractive modes of metadata, and the effects of scale, heterogeneous user assumptions, naïve and flat classifications, lowered barriers to production and tagging, and long-lived classifications by individuals. These are all things that are inimical to ontology but predictive of extractive organizational value, in the manner of Google.

The talk ends by discussing key technologies in the spread of extractive value -- Google, fotonotes, purple numbers, RDF -- and wrapping up with some predictions about where value might be encapsulated in user-tagged, semi-structured data in the future.

Clay Shirky teaches at NYU's graduate Interactive Telecommunications Program. He writes and consults on the social and economic effects of the Internet, concentrating particularly on the decentralization of applications (peer-to-peer architectures and programmatic interfaces) and on the current explosion in social software.

This presentation is one of a series from the O'Reilly Emerging Technology Conference held in San Diego, California, March 14-17, 2005.



