Information Retrieval


Popline Has Unlocked Abortion Searches

Here's an update to the blog post we pointed to this morning about Popline, which was quietly blocking searches on the word "abortion," concealing nearly 25,000 search results. At some point in the past 6 hours, they stopped blocking the search.

A sandbox for collecting search examples, patterns, and anti-patterns

Peter Morville put together this neat sandbox for collecting search examples, patterns, and anti-patterns. He's looking for folks to add tags, notes, and comments, and suggest new examples. Over time, he hopes to add patterns that illustrate user behavior and the information architecture of search. He's blogging about search patterns at
(Link stolen from the NGC4LIB list)

Feed Reader Down, Reading Up

A few months ago Connie Reece did some serious pruning on her Google Reader, which was choked by an overgrowth of blog feeds. One day she decided she had officially hit Information Overload. She was either spending so much time reading that she had no time to write, or feeling guilty for clicking on “mark all as read.” The choices were difficult, but she managed to cut back to 50 RSS feeds.

Now she's trying another tactic: cutting RSS feeds even further, yet increasing the number of blogs she's able to read, through Twitter, Instapaper, and a couple of other sites.

Towards a modern, functional OPAC

Aaron Schmidt has used quite a few library OPACs. He's also used and sought out the best of the open web. You’ve probably done the same and like him, you’ve probably been dismayed at the disparity between the two worlds. The open web can be fun and inspiring. Would you say the same of our OPACs? He's thought about what OPACs should be like in bits and pieces and decided to assemble them here.

A Problem
Besides all of the small, simple usability enhancements OPACs need (listed way below), a big concern about library websites and OPACs is the distracting transition between the two. You know the routine. Ubiquitous “Click here to search the catalog” links take users from one place to another and create a disjointed experience.

Aaron's Solution
One way to provide a seamless experience is to put some OPAC functions into the website, letting people accomplish OPAC tasks without having to leave the library website. In Aaron's dream OPAC this go-between is essentially an ecommerce shopping basket but called a backpack or bookshelf in this instance. Just like on, when logged in, a patron’s library backpack appears on every library webpage, whether it be the homepage, a book list, or the results list of a search. Any item cover on the website can be dragged and dropped into users’ backpack/bookshelf.

Where Do the Footprints of Metadata Lead?

Metadata, often described as "data about data," is electronically stored information that generally is not visible from the face of a document that has been printed out, or as first seen on a computer screen. Embedded in the software, metadata gives information about the creation or modification of the document -- information which often is mundane but at other times can be quite significant and perhaps even privileged.

By "mining" the metadata in a document, someone may be able to identify the document's author, changes made during various stages of its preparation, comments made by others who reviewed the document and other documents embedded within the document.

Pondering Podcast Access -- A Conundrum I Have No Solution To

The Online Dictionary for Library and Information Science defines podcast as:
A digital media file (audio or video) syndicated over the Internet via an RSS feed. The author or host of a podcast is known as a podcaster. Once available online, podcasts can be downloaded for listening on portable media devices (MP3 players, pocket CDs, cell phones) and personal computers. Despite the similarity in name, listening to or watching a podcast does not require an iPod, although the device can be used for that purpose. Online directories of podcasts are usually browsable by subject and searchable by keyword(s) (examples: Podcast Alley,, and
This describes the program produced here in the Las Vegas metro. This also describes the method of normal distribution. Is this the normal means of accessing LISTen, though?
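
As a rough illustration of the definition above, the sketch below parses a small RSS document and lists the media enclosures a podcast client would download. The feed content, show title, and example.org URLs are all made up for the example; a real client would first fetch the feed over HTTP.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: the titles and example.org URLs are invented for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Library Podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.org/ep1.mp3" length="1234" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://example.org/ep2.mp3" length="5678" type="audio/mpeg"/>
    </item>
    <item>
      <title>Show notes only</title>
    </item>
  </channel>
</rss>"""


def list_episodes(feed_xml):
    """Return (title, media URL) pairs for every item that carries an enclosure."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # a plain text entry, nothing to download
        episodes.append((item.findtext("title"), enclosure.get("url")))
    return episodes


for title, url in list_episodes(SAMPLE_FEED):
    print(title, "->", url)
```

The enclosure element is what separates a podcast feed from an ordinary blog feed, which is why the item without one is skipped.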

Calais and the Semantic Web

Reuters has released an open API for its new Calais Web service that aims to make it easier for publishers, bloggers, and other content producers to automatically metatag their content and to develop their own applications for the Semantic Web. To use the service, a publisher simply inputs unstructured text and the service returns semantic metadata in RDF format in less than a second. Using natural-language processing and machine learning techniques, Calais locates entities, facts, and events and processes those components into metadata.

An Online Organizer That Helps Connect the Dots

HOW often have you wasted time searching through page after page of e-mail messages, Web sites, notes, news feeds and YouTube videos on your computer, trying to find an important item?

If the answer is “too often,” a San Francisco company, Radar Networks, is testing a free, Web-based application, called Twine, that may provide some robotic secretarial help in organizing and retrieving documents.

Twine can scan almost any electronic document for the names of people, places, businesses and many other entities that its algorithms recognize.

Then it does something unusual: it automatically tags or marks all of these items in orange and transfers them to an index on the right side of the screen. This index grows with every document you view, as the program adds subjects that it can recognize or infer from their context.

Article continued here.

ISU-based center to help librarians train info consumers

Teaching people to find, evaluate and use information effectively has always been part of a librarian’s job, but Ward wants to create a centralized one-stop shop to help. With a nearly $60,000 federal grant in hand, he aims to build the Illinois Center for Information Literacy at ISU.

“The idea is to have one place to see what is going on in Illinois in terms of information literacy,” Ward said.

A new search engine...

Sarah, from LibrarianInBlack, shares this cool search engine that I hadn't seen before. It's called Carrot, and not only is it open source (so you can use it on your library's website), but it clusters results together. To see what I mean, try searching for the term Harry Potter. Over on the side, the results are divided up by topic so that you can narrow them down, for example by book titles or wands. There are also subheadings that show which sources the engine found the results in (such as Ask, Google, etc.).
Very cool!
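
For a feel of what clustering results means, here is a toy sketch that groups result titles under the other words they share. It is not Carrot's actual algorithm, just a simple keyword grouping, and the titles and stopword list are made up.

```python
from collections import defaultdict

STOPWORDS = {"the", "and", "of", "a", "in", "for"}  # tiny illustrative list


def cluster_titles(query, titles):
    """Group titles under each non-query, non-stopword term they contain."""
    query_terms = set(query.lower().split())
    clusters = defaultdict(list)
    for title in titles:
        words = {w.strip(".,:!?").lower() for w in title.split()}
        for word in sorted(words - query_terms - STOPWORDS):
            if word:
                clusters[word].append(title)
    # Keep only labels shared by at least two results.
    return {label: hits for label, hits in clusters.items() if len(hits) >= 2}


# Hypothetical result titles for the query "Harry Potter".
titles = [
    "Harry Potter books in order",
    "Harry Potter wands explained",
    "Collectible Harry Potter wands",
    "New Harry Potter books announced",
]
for label, hits in cluster_titles("Harry Potter", titles).items():
    print(label, hits)
```

With these four titles, the results fall into a "books" group and a "wands" group, much like the sidebar topics described above.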

ChaCha Guided Searching Goes Mobile

A search that uses human guides didn't make much sense in a world of libraries with IM service, and when ChaCha debuted in 2006, the buzz died down quickly. Expect to see a revival now that they've started providing text message services. Still powered by human guides, the service will likely be a hit with folks unwilling to pay for Internet access on their phones. It's free for now, but the company plans to charge $5-10/mo. for the service in the future, which may effectively kill any interest from users without smart phones.

Before Memex

Information Re/Evolution

The professor behind The Machine is Us/ing Us video has produced a new one, called Information Re/Evolution. This video deals with the search for information and has some relevance for libraries and librarians.

New York Times Will Now Give Away Web Access

The NYTimes, they are a-changin'.

According to this article in today's paper, the New York Times is ending payment for the 'Times Select' program. It found that readers were accessing those articles via search engines, and has chosen instead to give free online access to the work of its columnists and to the newspaper's archives, with a few minor exceptions.

Eight steps to thriving on information overload

Eight steps to thriving on information overload: "There is no quick fix; enhancing our skills requires effort and is an ongoing process. Implementing the following principles can make a real difference to your effectiveness in dealing with the new reality of information overload..."

1. Set information objectives.

4. Filter aggressively.

8. Sleep on it!

Full-text Searching in Books

Scott Boren has a great look at electronic searching of the full-text of books. He covers Several Book Search Web Sites, Text Archives for older materials, Proprietary Book Search sites, some Smaller Collections, Google Book Search, some Emerging Projects, and some sites for Further Reading.

A Long Look @ Service Oriented Library Systems

Be sure to check out Eric Schnell's Service Oriented Library Systems series. In Part 4: Challenges, he says that if we step back and look at it, we have nobody to blame for outdated library information systems or our inability to more readily adopt SOA except for ourselves. We keep licensing these products at a time when the number of alternative approaches available to libraries has never been greater.

In the last posting in this series he'll provide some closing remarks and observations.

Be sure to read all 3 posts so far:


What We Have Today

Where Are We Heading?

Robots...The New Librarians

The LA Times reports: "It's not often that a librarian is warned to stay away from the bookshelves because of high voltage and that students aren't allowed to roam freely through the stacks - but it's becoming more common.

At Chicago State University only robots are allowed to browse most books and archives. Books are supplied with RFID chips, and to get a particular book, students and faculty must log onto the library's website from home or school and place an order for a title.

Once the order is received, the library's computer system directs a robotic crane - dubbed "Rover" by the librarians - to retrieve one of more than 6,300 bins. Each bin holds the equivalent of four bookshelves.

The crane then brings the bin to a workstation at the front of the warehouse, where a HUMAN BEING--a librarian--picks up the book.

Archival Environment, Part 3: The Metadata Dilemma

Archival Environment: The Metadata Dilemma: There is an apocryphal story about metadata and archive processing that illustrates the unique relationship of the two environments. It seems that once there was a government store of space exploration information. As rockets entered outer space, metrics (i.e., telemetry) were sent back to the ground and recorded. This telemetry information provided much interesting scientific information about space exploration and was of great value to the scientific community. Indeed, the government built an impressive archive of the telemetric data that had been captured. The archive had state-of-the-art archival technology. The archival storage was protected. The telemetry data was classified. The metrics were protected and stored in excruciating detail.

In order to make your archival data really useful, you need to know what the archival data means.

NYTimes Article on Video Search Engine

Paul Lewis writes "Article discusses a video site that applies speech recognition technology to posted videos to enhance keyword searching of video content. They say Blinkx's speech-recognition technology employs neural networks and machine learning using "hidden Markov models," a method of statistical analysis in which the hidden characteristics of a thing are guessed from what is known."
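
To make the quoted description a bit more concrete, the sketch below runs the Viterbi algorithm on a toy hidden Markov model: it guesses the most likely sequence of hidden states from a sequence of observations. The states, observations, and probabilities are invented for illustration and say nothing about how Blinkx's own models work.

```python
# Toy model (all values are assumptions): is each audio frame silence or speech,
# given only a coarse loudness reading for that frame?
STATES = ["silence", "speech"]
START_P = {"silence": 0.6, "speech": 0.4}
TRANS_P = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.4, "speech": 0.6},
}
EMIT_P = {
    "silence": {"low": 0.5, "mid": 0.4, "high": 0.1},
    "speech": {"low": 0.1, "mid": 0.3, "high": 0.6},
}


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
            column[s] = (prob, prev)
        best.append(column)
    # Walk back from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path))


print(viterbi(["low", "mid", "high"], STATES, START_P, TRANS_P, EMIT_P))
# ['silence', 'silence', 'speech']
```

The hidden states are never observed directly; only the loudness readings are known, and the algorithm picks the state sequence that best explains them.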

