Information Retrieval


Words,Extended: find, rank, read and cite textual information from the web

<a href="">Glenn Scheper</a> is a software engineer, and gives away a freeware program he developed over the past ten years, which you can use to find, rank, read and cite textual information from the web. He says WordsEx.exe is easy to use, ergometric, and very powerful.

ChaCha Promises to Answer Any Reference Question Any Time

A new 24/7 service from ChaCha allows cell phone users on the go to ask a wide range of reference questions in conversational English and get answers free of charge. Each question is routed to a human guide who searches the Web for the information and within minutes returns the answer in a text message with a web reference link.

Asked if ChaCha represents competition for reference librarians, David Tyckoson, president of ALA's Reference and User Services Association, told American Libraries that people already rely on librarians less for help with finding short, factual results that they can obtain on their own. "What they need a librarian's help with are the more complex searches," he said.


Von Totanes sent this one over, on LCSH and SKOS: the entire LCSH has recently been uploaded to an experimental service that aims to "encourage experimentation and use of LCSH on the web."

It's far from being an ebook that non-techies like me can use easily, but if you understand what Simple Knowledge Organization System (SKOS) is all about, you might be able to make it work for you and develop non-traditional ways of using LCSH online, aside from not having to buy a new set every few years. More...

Dead Media: What Other Forms Of Technology Have Gone Extinct?

Kevin Kelly has <a href="">an interesting article</a> about what he believes to be the only extinct form of technology (at least that he could find): edge-notched cards. Any library out there that can prove him wrong?

Piercing the fog of production

Rarely is it good to talk about the inner workings of editorial decision-making. Such ranks up there with the making of sausage and the creation of laws as things best not known. Sometimes it is necessary to do so, though. This week's episode of LISTen features five separate Public Service Announcements. We received absolutely no compensation for running such. The five discrete ads are all available as free downloads from a federal agency, namely the Federal Communications Commission. While it may sound fairly odd to some and perhaps quite condescending, there is a purpose to such.

The role of the librarian in today's Amazoogle world is to meet information needs. When you start from that philosophical standpoint you have to consider some things. When there is a lack of a clutch in a coming paradigm shift, what responsibility do you have to those you serve? How does such impact serving their information needs? For the audience that LISTen serves, the whole discussion of the digital television transition in the United States probably seemed meaningless. Such misses the forest for the trees. While we acknowledge that librarians are striving today to be technological elites, the people who are served by librarians more often than not are not such elites at all. The whole Tech for Techies discussion was an attempt to discuss the transition in terms of how to approach patron questions. Rather than tell a patron you don't know, why not take a look at some of the common questions patrons might pose, let alone some uncommon ones?

I made a conscious choice to use all five of the ads I used. Those are the US government's best effort to reach out to the public. Have you ever heard such outside LISTen, though? With reports of somewhere around eighty percent of the population not even knowing this is coming, can we take steps to at least prevent catastrophic information-seeking sessions that barely help anyone involved? I will not order anyone to "be creative". That's not the way such works! Considering that ALA is entering into a public education partnership with an electronics retailer to try to get word out to folks, it is not like this is an issue that the profession's organization in the United States is ignoring. I would much rather you heard the government's best effort at outreach and were stirred to action on your own to try to do better. As information professionals who deal with the information-seeking needs of rather diverse populations, this should be an easy one to plan a program on! The ALA is already trying to make it easier for you to get speakers in as it is. If a listener can come up with something creative on their own, the result is probably going to be far better than my sounding like a drill sergeant barking orders.

Part of the infrastructure to our Amazoogle world is changing fundamentally. What is the role of libraries in trying to be relevant to their served populations? I do not agree that being hip and trendy is the way to go. Establishing a firm foundation and reputation as being the source for good information is what you build relevance on top of. In an unorthodox way I tried to show something that would be an easy thing to start with. This wouldn't require an investment in new servers or software. This would not necessarily require an infrastructure investment. If anything this is something that libraries do well but have gotten away from over time. Being the "People's University" doesn't always require a new social network and sometimes requires merely a meeting room as well as speakers and potentially refreshments.

The Web Time Forgot

On a fog-drizzled Monday afternoon, this fading medieval city feels like a forgotten place. Apart from the obligatory Gothic cathedral, there is not much to see here except for a tiny storefront museum called the Mundaneum, tucked down a narrow street in the northeast corner of town. It feels like a fittingly secluded home for the legacy of one of technology’s lost pioneers: Paul Otlet.

In 1934, Otlet sketched out plans for a global network of computers (or “electric telescopes,” as he called them) that would allow people to search and browse through millions of interlinked documents, images, audio and video files. He described how people would use the devices to send messages to one another, share files and even congregate in online social networks. He called the whole thing a “réseau,” which might be translated as “network” — or arguably, “web.”

Full story here.

Selling Placement in Library Search Results

If you're like me (and you know you want to be) you love ads!

The Disruptive Library Technology Jester isn't like me; he writes about selling placement in library search results:

All of this still leaves me vaguely uncomfortable, and I’m not yet sure why. (Writing it all down in this posting hasn’t helped.) It would be one thing if “preferential placement” meant “invisibly raising the relevance of such content” in the search results list. That clearly would seem to be out of bounds: invisible mucking with search results placement leads to distrust of the underlying service. Google has shown us, though, that it is possible to sell conspicuously marked advertisements on search results pages and make billions doing it. Could the same thing work for libraries selling conspicuously marked, relevant results that could lead users to an e-commerce transaction at a publisher site? Is the value libraries (and our users) could receive in exchange for such placement the free access to the digital form of the content?

Shopping is a way of interacting with the world around us

Shopping is a way of interacting with the world around us: "This means searching becomes a way for us to interact with the world around us, an experiential horizon where certain aspects loom large in the foreground while others are pushed into the background," he explains.

In particular, his research focuses on what is actually going on when we are "window shopping", i.e. strolling round and "just looking" at things without having a clear idea of what we are looking for. The people he has been studying search patiently for certain things, but more than anything, they are searching for the feeling of having found something that is better and finer than they could have imagined. At this point they have stretched the boundaries of what would be reasonable to expect to find.

Usability testing on VuFind

Katie Bauer posted this one to NGC4LIB:
Of possible interest to those who may be contemplating doing usability testing on their OPAC, Yale recently conducted two tests on pilot VuFind installations. One study looked at a subject-based presentation of ebooks for the Cushing/Whitney Medical Library, and the other looked at a pilot test of VuFind with a sample of 400,000 records drawn from the Library's Voyager system.

Test questions were drawn from user search logs in the current library system, and some were designed to test for those problems that the logs have demonstrated exist for patrons, such as incorrect spellings, and incomplete title information. In reading the reports please be aware that some of the problems uncovered may have had a lot to do with peculiarities of the Yale implementation, such as the sample of records imported into VuFind for this test, and less to do with VuFind itself.

In general, participants were intrigued by the possibilities offered by facets, although the topic facets in particular did not always seem to function as they expected or desired. The feature participants most wanted to see developed was an easy, direct export from the catalog to a bibliographic citation management tool such as RefWorks or EndNote. (While other catalogs may have this feature already, the current Voyager system at Yale does not have a direct export feature.)

Storage comes full circle

Storage comes full circle:

“And what storage format has zero energy consumption, a tiny carbon footprint, can sit for long periods of time without degrading and offers the easiest data destruction?”

Mrs. Rat rolled her eyes. “Paper.”

“Yes. The future of data storage is paper,” the Rat said, grinning as an entrepreneurial idea came to him. “And, of course, no one has the capacity to store all that paper because we got rid of all the filing cabinets.”

Towards a Value-Added User Data Economy

Every week it seems like the debate over access to, portability of and privacy over user data on the social web has reached new heights. It's only going to get louder though, just as discussions about other forms of economics will never be resolved.

User data has been sold by ISPs, leveraged by ad networks and hoarded by social networks for years. Now, users are storming the castle to recapture their own booty. Read Write Web argues that it's in everyone's best interest that the data be freed. Vendors have far more to gain by working to add value to freely flowing data than they do from trying to hoard as much data as they can.

Locked Information

The recent earthquake in China came as no surprise to some scientists. Last July they published the results of a study showing that the region was ripe for a major quake. There is little reason to believe Chinese officials were aware of the report, or that it would have made much difference if they had been. “We had certainly identified the potential of these active faults,” said one of the co-authors. “But that information was effectively locked in an academic journal.”

MP3 Audio: How To Find Someone's Date of Birth

Pete Weiss sent over this MP3 audio file (3:36 minutes). It's a recording of the article How To Find Someone's Date of Birth. It suggests sources of information and research strategies for finding a person's date of birth.

The History of TEMPEST

One never knows how and where information may be encoded.

Take the case of a Bell telephone engineer back in the early 40s. He noticed that an oscilloscope seemed to spike every time the brand new, fancy, and highly top secret encrypted teletype machine coded a letter. He figured out that anyone who studied the spikes could read the plain text the machine was encrypting.

And thus was born TEMPEST, the US Government's top secret method of gathering information based solely on the electromagnetic waves that all electronic devices give off. Newly de-classified documents recount the history of this still highly confidential information gathering system.

On Innovation in the ILS Marketplace

The Disruptive Library Technology Jester takes a look at last month's meeting between the DLF's ILS Discovery Interface Task Force and library system vendors (including one commercial support organization for open source ILS software), held to talk about the state of computer-to-computer interfaces into and out of the ILS. The meeting comes as the work of the task force is winding down. An outcome of the meeting, the “Berkeley Accord,” was posted last week to Peter Brantley’s blog. The accord has three basic parts: automated interfaces for offloading records from the ILS, a mechanism for determining the availability of an item, and a scheme for creating persistent links to records.

Information alert

A recent survey shows many students from the so-called 'Google generation' lack the basic skills needed for online research, Wendy Wallace says. Many libraries have assumed young students have learned to use the internet for research simply by virtue of their age. But while many are proficient with Facebook and Wikipedia, they may not be information-literate. Many lack the skills to differentiate between authoritative information and amateur blogging.

Evernote: a search engine for your brain

Apparently you never need to remember that website or the wonderful bottle of wine you had at dinner last night, just upload it to <a href="">Evernote</a> and search for it at your leisure. Farhad Manjoo briefly explores this <a href="">obsession with remembering everything.</a>

Copyright in today's world

This is a podcast from the "Real Deal," where they discuss copyright with Colette Vogele, an attorney and fellow at Stanford's Center for Internet and Society. They discuss some of the concerns people have over copyright in today's world with the internet, downloads, mashups, etc.

In Storing 1’s and 0’s, the Question Is $

LISTEN. Do you hear it? The bits are dying.

The digital revolution has spawned billions upon billions of gigabytes of data, from the vast electronic archives of government and business to the humblest photo on a home PC. And the trove is growing — the International Data Corporation, a technology research and advisory firm, estimates that by 2011 the digital universe of ones and zeros will be 10 times the size it was in 2006.

But the downside is that much of this data is ephemeral, and society is headed toward a kind of digital Alzheimer’s. What’s on those old floppies stuck in a desk drawer? Can anything be read off that ancient mainframe’s tape drive? Will today’s hard disk be tomorrow’s white elephant?

Data is “the natural resource for the Internet age,” said Francine Berman, director of the San Diego Supercomputer Center at the University of California, San Diego, a national center for high-performance computing resources. But, she added, “digital data is enormously fragile.” It can degrade as it is stored, copied and transferred between hard drives across data networks. The storage systems might not be around or accessible in the future — it is like putting precious information on eight-track tapes.

Full story in the New York Times.

Links to OPAC Enhancements, Wrappers, and Replacements

Here are the supplemental links for the presentation at the NISO workshop on discovery layers in Chapel Hill, NC, on March 28, 2008. Carolyn McCallum at Wake Forest University posted a great summary of day two of the NISO discovery layer forum, including an overview of the talk.

Foundational Pieces
The presentation started as an extension of a DLTJ blog post. Also mentioned was Marshall Breeding’s Library Technology Report published in July/August of 2007 and available from the ALA store.

Tour of Systems
For each of the 10 systems that were toured in the course of the presentation there is a link to the home page of the product/project and a link to a demo or canonical live example.

