By Jeffrey Beall
Word-sense disambiguation is the ability of an online system to distinguish among the senses, or meanings, of a word during a search. Say, for example, that you need information on boxers, so you go to an Internet search engine and enter "boxers" in the search box. The search engine finds documents that contain the word "boxers" and returns them to you as search results.
You probably already see the problem here: the word "boxers" is a homonym with several different meanings, and the search engine doesn’t know which one you want. Boxers are a breed of dog, a category of athlete, and a kind of men’s garment. The word also matches the possessive of a surname, as in "Barbara Boxer’s bill …" Finally, Boxers were those who participated in the Boxer Rebellion in China from 1899 to 1901. There may be additional meanings.
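The problem can be made concrete with a minimal sketch of literal keyword matching. Everything here is hypothetical and invented for illustration: the five-document corpus, the sense labels, and the function name `keyword_search`. Real search engines are far more elaborate, but the ambiguity arises in the same way.

```python
import re

# Hypothetical toy corpus of (document text, intended sense) pairs.
# The sense labels exist only so we can see what the search mixed together;
# a plain keyword search never looks at them.
CORPUS = [
    ("Boxers are playful dogs that need daily exercise.", "dog breed"),
    ("The two boxers trained for months before the title fight.", "athlete"),
    ("Cotton boxers are on sale this week.", "garment"),
    ("The Boxers besieged the foreign legations in Beijing in 1900.", "Boxer Rebellion"),
    ("Terriers are energetic dogs.", "dog breed"),
]

def keyword_search(query, corpus):
    """Return every document containing the query as a whole word, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(query) + r"\b", re.IGNORECASE)
    return [(text, sense) for text, sense in corpus if pattern.search(text)]

hits = keyword_search("boxers", CORPUS)
for text, sense in hits:
    print(f"[{sense}] {text}")
```

The search returns four documents, one for each of four unrelated senses, because the matcher sees only the string "boxers" and has no notion of which meaning the searcher intended.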
Information retrieval in libraries has transitioned from the high precision and recall that legacy library systems offered to the probabilistic and linguistic free-for-all that Internet search engines now provide. One of the great values of legacy library databases was that they effectively handled polysemy (the capacity of a term to have multiple meanings) in searching. Because online searching needs word-sense disambiguation to be effective and precise, it’s important for all librarians to understand the problem and its solutions.
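To make "precision" and "recall" concrete, here is a small worked sketch using the standard definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. The document IDs and relevance judgments below are hypothetical, chosen only to illustrate the arithmetic.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) from two sets of document IDs."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical scenario: the searcher wants documents about the dog breed.
relevant = {1, 2, 3}             # documents actually about boxer dogs
retrieved = {1, 2, 4, 5, 6, 7}   # everything a keyword match on "boxers" returned

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case only two of the six retrieved documents are relevant (precision of about 0.33), and two of the three relevant documents were found (recall of about 0.67). Results in unwanted senses of the word are what pull precision down.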