Information Retrieval

Violate Terms & Conditions, Get Indicted

The New York Times Bits blog reports that programmer Aaron Swartz was indicted for allegedly stealing 4 million documents from MIT and JSTOR. According to documents posted to Scribd, the arrest warrant cites alleged violations of 18 USC 1343, 18 USC 1030(a)(4), 18 USC 1030(a)(2), 18 USC 1030(a)(5)(B), and 18 USC 2. The Boston Globe summed up the charges, stating:
Aaron Swartz, 24, was charged with wire fraud, computer fraud, unlawfully obtaining information from a protected computer, and recklessly damaging a protected computer. He faces up to 35 years in prison and a $1 million fine.
Activist group Demand Progress, where Swartz previously served as Executive Director, has posted a statement. Internet luminary Dave Winer has also posted a thought on the indictment. Wired's report cites the current Executive Director of Demand Progress as likening the matter to checking too many books out of a library. (h/t Evan Prodromou and Dave Winer) (Update at 1641 Eastern: The Register has reporting here)

The Echo Chamber Revisited

In 2004, we spoke with law professor Cass Sunstein about the echo chamber effect, the phenomenon by which the explosion of information streams allows us to cherry-pick our media diet so we encounter only news that reinforces our worldview (while evading facts and opinions that contradict it). And so, seven years later, are we on a path to ever more intellectual isolation? Eli Pariser, Lee Rainie, Clay Shirky, Joseph Turow and Ethan Zuckerman weigh in. If you do not want to listen to the piece, you can read the transcript.

How the Modern Web Environment is Reinventing the Theory of Cataloguing

Panizzi, Lubetzky, and Google: How the Modern Web Environment is Reinventing the Theory of Cataloguing: This paper uses cataloguing theory to interpret the partial results of an exploratory study of university students using Web search engines and Web-based OPACs. The participants expressed frustration with the OPAC; while they sensed that it was "organized," they were unable to exploit that organization and attributed their failure to the inadequacy of their own skills. In the Google searches, on the other hand, students were getting the support traditionally advocated in catalogue design. Google gave them starting points: resources that broadly addressed their requirements, enabling them to get a greater sense of the knowledge structure that would help them to increase their precision in subsequent searches. While current OPACs apparently fail to provide these starting points, the effectiveness of Google is consistent with the aims of cataloguing as expressed in the theories of Anthony Panizzi and Seymour Lubetzky.

Scrapers Dig Deep for Data on Web

The market for personal data about Internet users is booming, and in the vanguard is the practice of "scraping." Firms offer to harvest online conversations and collect personal details from social-networking sites, résumé sites and online forums where people might discuss their lives.

The Guardian: Yahoo! to sell Delicious

The Guardian reports that Yahoo! is rumored to be preparing to sell Delicious to StumbleUpon. From the story:
At the same time of the December announcement the handful of engineers who were developing the Delicious system are understood to have either been sacked or redeployed inside Yahoo, leaving only support staff.
Services like Pinboard and Opera Link exist as potential replacements among other offerings online.

This Data Isn’t Dull. It Improves Lives.

The private sector can often reformat government information in ways that help consumers, workers and companies.

Full article in the NYT

Mendeley Offers $10,001 for Best New Research Tool

From the Chronicle of Higher Ed
March 8, 2011, 4:32 pm
By Ben Wieder

The developers of Mendeley, a research-management tool that has more than a million users, want to put more than 70 million academic papers, reader recommendations, and social-networking tags to new and innovative uses. The company announced Tuesday its “Binary Battle,” a contest for outside developers to build applications drawing from Mendeley’s collected information, with a $10,001 grand prize for the best new application.

Steven Rosenbaum and the Curation Nation

What if, instead of relying on search engines to get our information, we relied on each other - friends, experts, journalists - to deliver us information by way of carefully curated websites? Steven Rosenbaum, CEO of and author of Curation Nation: How to Win in a World Where Consumers are Creators, tells Bob that our curated content future may have already arrived. If the player does not show above, or if you want to download the MP3 or read the transcript, you can find them here.

