Information Retrieval

$5 million contract awarded to Amherst software firm

The Buffalo News has a piece on Janya, a local software company that designs programs to help computers "read" vast chunks of text. The firm has won a $5 million research contract from the Air Force Research Laboratories. "We pick up where search engines leave off," Srihari said.

First, the program converts documents into words. Then, it scans each word and determines - based on context and rules of grammar - whether it is a name, place or other entity and how each is connected.
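To make the two steps concrete, here is a toy sketch in Python. It is not Janya's software, whose rules are not described in the article; the cue-word lists and function names are hypothetical, and a real system would rely on far richer grammar and context.

```python
import re

# Hypothetical cue words, assumed for illustration only.
PERSON_TITLES = {"Mr", "Ms", "Dr", "Prof"}
PLACE_PREPOSITIONS = {"in", "at", "from", "near"}

def tokenize(text):
    """Step 1: convert a document into a list of word tokens."""
    return re.findall(r"[A-Za-z]+", text)

def tag_entities(tokens):
    """Step 2: scan each word and guess its entity type from the preceding word."""
    entities = []
    for i, tok in enumerate(tokens):
        # Only capitalized words that are not themselves titles are candidates.
        if not tok[0].isupper() or tok in PERSON_TITLES:
            continue
        prev = tokens[i - 1] if i > 0 else ""
        if prev in PERSON_TITLES:
            label = "PERSON"
        elif prev.lower() in PLACE_PREPOSITIONS:
            label = "PLACE"
        else:
            label = "OTHER"
        entities.append((tok, label))
    return entities

sample = "Dr. Srihari works in Amherst."
print(tag_entities(tokenize(sample)))
```

Working out how the entities are connected, the last part of the description, would need a further pass that this sketch leaves out.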

Can subjects be relevancy ranked?

Over at LibraryThing, Tim Spalding wondered: can subjects be relevancy ranked?
Some ideas he considered:

  • Treating subjects as links, and running some sort of "page-rank" style connection algorithm against them. Maybe this would bring out coincidences that simple statistics misses.
  • Using other library data, such as LCC and Dewey. This would be reminiscent of how I made LibraryThing's LCSH/LCC/Dewey recommendations.
  • Doing statistics on other fields, such as the title. So, for example, there's probably a statistical correlation between "Man-woman relationships" and books with "dating," "men and women" and "proposal" in the title.
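The third idea can be sketched in a few lines of Python. This is only one possible reading of "doing statistics" on titles, not LibraryThing's method: it computes lift, the ratio of how often a title word and a subject occur together to how often they would if they were independent. The function name and the four-book catalog are invented for illustration.

```python
from collections import Counter

def title_subject_lift(books):
    """books: list of (title, set_of_subjects). Returns {(word, subject): lift}."""
    n = len(books)
    word_counts = Counter()
    subject_counts = Counter()
    pair_counts = Counter()
    for title, subjects in books:
        words = set(title.lower().split())  # count each word once per title
        word_counts.update(words)
        subject_counts.update(subjects)
        pair_counts.update((w, s) for w in words for s in subjects)
    # lift > 1: the word and subject co-occur more often than chance predicts.
    return {
        (w, s): (c / n) / ((word_counts[w] / n) * (subject_counts[s] / n))
        for (w, s), c in pair_counts.items()
    }

# Made-up toy catalog, assumed for illustration only.
books = [
    ("Dating Made Simple", {"Man-woman relationships"}),
    ("The Proposal", {"Man-woman relationships"}),
    ("Dating After College", {"Man-woman relationships", "Self-help"}),
    ("Simple Gardening", {"Gardening"}),
]
lift = title_subject_lift(books)
print(lift[("dating", "Man-woman relationships")])
print(lift[("simple", "Man-woman relationships")])
```

On this toy data "dating" scores above 1 for "Man-woman relationships" and "simple" scores below 1. A subject whose lift against a book's title words is high could then be ranked as more relevant to that book, though a real catalog would need smoothing for rare words.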

EBSCO Hands it to Me

WoodyE writes "A recent post about certain ill changes in EBSCO database interfaces gets answered in fine form by Kate Hanson, Customer Account Specialist at EBSCO HQ. (Link to Story).
Lesson learned: EBSCO's foray into Google-fication is not so deep nor total as it may at first seem."

Tech Friday: ChaCha - Human Guided Search

mdoneil writes "A new search engine has sprung up to provide human powered search (I thought that is what librarians did) not as pretty as, but interesting.
You can check it out at

I tried it this morning using Drogheda as my search. The first human powered search person bailed on me and pawned me off on another search person, who found some results, but I consider the result set inferior to what I could have found using any algorithmic search engine.
It is quite like any other ask-a-librarian service, except that the search staff are not necessarily librarians.
It is an ad-driven model, and the search staff can earn $5-10 USD per hour according to their website.
I doubt it will catch on, especially for Internet search. It may be something used in enterprise search, but then again it is no different than an enterprise ask-a-librarian service would be.
Have a look and tell us what you think."

Microsoft to Google: Move Over, We're Gonna Do Book Search Too

PC World and The Sydney Morning Herald report that tech heavyweight Microsoft is planning to launch "a US test of Live Search Books featuring tens of thousands of out-of-copyright books, including works held by the British library and major universities in Toronto and California."

ALA's custom search engine

rteeter writes "The American Library Association is experimenting with a Google custom search engine called the Librarian's E-Library."

Could Digitalization Cause a Void in History?

Search Engines WEB sends us something from TechDirt: "Even if you can store the data perfectly forever, without the right applications, it's meaningless. Matt Sullivan writes in with yet another article on the topic, this time from Popular Mechanics, that suggests we could be facing a 'digital ice age' as plenty of data from this era of history are lost to bad archiving capabilities."

College Students Can't Search Effectively

shoe writes "Via Slashdot comes an article confirming something we as librarians see every day on the front lines... People (in this case, college students) are unable to evaluate web resources for things like objectivity, timeliness or audience. And (don't scream) they can't narrow down a search. Who'd a thunk it?"

Dutch Websites Archived by National Library

TransLibrarian writes "The national library of the Netherlands, the KB, has recently started a research project to archive the estimated 1.4 million active websites and 60 million webpages based in Holland. From a blog entitled 'Will this blog be preserved for eternity?'"

Change Afoot for the New York Public Library Reading Room

Shelves are empty, but not because of missing or stolen books. The reading room at the New York Public Library is letting go of the outdated ordering system created by John Shaw Billings, the library system's director from 1896 (the year after it was founded) until his death in 1913, and replacing it with a system people might actually understand.

The current system is used only by the New York Public Library. Its greatest drawback is that no one but the system's librarians really understands it. The library is switching to a classification system parallel to that used by the Library of Congress [dividing all knowledge among 21 classes, each signified by a letter]. The New York Times reports.

