Technology

How Big Data Creates False Confidence

In the case of language and culture, big data showed up in a big way in 2011, when Google released its Ngrams tool. Announced with fanfare in the journal Science, Google Ngrams allowed users to search for short phrases in Google’s database of scanned books (about 4 percent of all books ever published!) and see how the frequency of those phrases has shifted over time. The paper’s authors heralded the advent of “culturomics,” the study of culture based on reams of data. Since then, Google Ngrams has been, well, largely an endless source of entertainment, but also a goldmine for linguists, psychologists, and sociologists, who have scoured its millions of books to show, for instance, that yes, Americans are becoming more individualistic; that we’re “forgetting our past faster with each passing year”; and that moral ideals are disappearing from our cultural consciousness.
From How Big Data Creates False Confidence - Facts So Romantic - Nautilus
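
The viewer itself is point-and-click, but the same phrase-frequency curves can be pulled programmatically. Here is a minimal Python sketch; it leans on the unofficial JSON endpoint behind the Ngram Viewer, so the URL, parameter names, and corpus identifier are assumptions that may change without notice, not a supported API.

# Sketch: fetch a phrase-frequency time series from the Google Ngram Viewer.
# The /ngrams/json endpoint is undocumented and unofficial; the URL, the
# parameter names, and the corpus id below are assumptions, not a stable API.
import requests

def ngram_frequencies(phrase, year_start=1900, year_end=2000, corpus="en-2019"):
    resp = requests.get(
        "https://books.google.com/ngrams/json",
        params={
            "content": phrase,
            "year_start": year_start,
            "year_end": year_end,
            "corpus": corpus,      # assumed id for the English 2019 corpus
            "smoothing": 0,
        },
        timeout=30,
    )
    resp.raise_for_status()
    data = resp.json()  # list of {"ngram": ..., "timeseries": [...]} entries
    if not data:
        return {}
    series = data[0]["timeseries"]  # one relative frequency per year
    return dict(zip(range(year_start, year_end + 1), series))

if __name__ == "__main__":
    freqs = ngram_frequencies("individualism")
    for year in (1900, 1950, 2000):
        print(year, freqs.get(year))

For anything heavier than casual exploration, the raw n-gram datasets that Google publishes for bulk download are a sturdier starting point than scraping the viewer.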

LITA ALA Annual Precon: Digital Privacy

Don’t miss these amazing speakers at this important LITA preconference to the ALA Annual 2016 conference in Orlando, FL.

Digital Privacy and Security: Keeping You And Your Library Safe and Secure In A Post-Snowden World
Friday, June 24, 2016, 1:00 – 4:00 pm
Presenters: Blake Carver, LYRASIS, and Jessamyn West, Library Technologist at Open Library
From LITA ALA Annual Precon: Digital Privacy – LITA Blog

Google BigQuery Public Datasets Include GDELT HathiTrust and Internet Archive Book Data

A public dataset is any dataset that is stored in BigQuery and made available to the general public. This page lists a special group of public datasets that Google BigQuery hosts for you to access and integrate into your applications. Google pays for the storage of these datasets and provides public access to the data via BigQuery; you pay only for the queries that you perform on the data (the first 1 TB per month is free, subject to query pricing details). Among them is the GDELT HathiTrust and Internet Archive Book Data, a dataset of 3.5 million digitized books stretching back two centuries, encompassing the complete English-language public domain collections of the Internet Archive (1.3 million volumes) and HathiTrust (2.2 million volumes).
From Google BigQuery Public Datasets — Google Cloud Platform
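
To make the query model concrete, here is a short Python sketch using the google-cloud-bigquery client. It first performs a dry run, which reports how many bytes the query would scan (the quantity that per-query pricing and the 1 TB monthly free tier are measured against), and then runs the query for real. The table path in the SQL is a placeholder; substitute the actual book-data table listed on the public datasets page.

# Sketch: query a BigQuery public dataset and estimate cost in bytes scanned.
# Requires: pip install google-cloud-bigquery, Application Default Credentials,
# and a billing-enabled Google Cloud project.
from google.cloud import bigquery

client = bigquery.Client()  # uses your default project and credentials

# Placeholder table path; substitute the real HathiTrust/Internet Archive
# book table named on the BigQuery public datasets page.
sql = """
    SELECT COUNT(*) AS n_volumes
    FROM `some-public-project.some_dataset.books`
    WHERE year BETWEEN 1800 AND 1900
"""

# Dry run: BigQuery validates the query and reports the bytes it would
# process, without running it or incurring charges.
dry_cfg = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
dry = client.query(sql, job_config=dry_cfg)
print(f"Would scan about {dry.total_bytes_processed / 1e9:.2f} GB")

# Run it for real; bytes scanned beyond the monthly free tier are billed.
for row in client.query(sql).result():
    print(row.n_volumes)

Because billing is metered on bytes scanned, selecting only the columns you need (and filtering on partitioned or clustered fields where the table offers them) is the main cost lever.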

When our culture’s past is lost in the cloud

We tend to think of memory as a purely mental phenomenon, something ethereal that goes on inside our minds. That’s a misperception. Scientists are discovering that our senses and even our emotions play important roles in recollection and remembrance. Memory seems to have emerged in animals as a way to navigate and make sense of the world, and the faculty remains tightly tied to the physical body and its material surroundings. Just taking a walk can help unlock memory’s archives, studies have shown.
From When our culture’s past is lost in the cloud - The Washington Post

Japanese AI Writes Novel, Passes First Round for Literary Prize

A short-form novel “coauthored” by humans and an artificial intelligence (AI) program passed the first screening process for a domestic literary prize, it was announced on Monday. However, the book did not win the final prize.

Two teams that submitted novels produced using AI held a press conference in Tokyo to make the announcement, which follows the recent victory of an AI program over a top Go player from South Korea. These achievements strongly suggest a dramatic improvement in AI capabilities.

From Japanese AI Writes Novel, Passes First Round for Literary Prize - The Japan News: http://the-japan-news.com/news/article/0002826970

Data Is a Toxic Asset

We can be smarter than this. We need to regulate what corporations can do with our data at every stage: collection, storage, use, resale and disposal. We can make corporate executives personally liable so they know there's a downside to taking chances. We can make the business models that involve massively surveilling people the less compelling ones, simply by making certain business practices illegal.

From Data Is a Toxic Asset - Schneier on Security

Apple and the FBI: Why is this relevant to libraries?

Why is this relevant to libraries? I think it’s past time that we start paying very close attention to the details of our data in ways that we have, at best, hand-waved as a vendor responsibility in the past. There have been amazing strides lately in libraryland with regard to the security of our data connections via SSL (Let’s Encrypt), as well as a resurgence in anonymization and privacy tools for our patrons (Tor and the like, thank you very much Library Freedom Project).

Data about our patrons and their interactions that isn’t encrypted at rest, in either the local database or the vendor database hosted on their servers (and our electronic resource access, and our proxy logins, and, and, and…), is data that is subject to subpoena and could be accessed in ways that we would not want. It is the job of the librarian to protect the data about the information-seeking process of their patrons. And while this has been talked about before in library circles (Peter Murray’s 2011 article is a good example of past discussions), this court case brings into focus the lengths to which some parts of the law enforcement community will go in order to gain the power to collect data about individuals.

From Apple, the FBI, and Libraries | Pattern Recognition
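
To make the “encrypted at rest” point concrete, here is a minimal Python sketch, assuming the third-party cryptography package: a sensitive field of a patron record is encrypted before it is written, so a copied database file or a subpoenaed export is unreadable without the separately held key. A real deployment would add key management and rotation and decide which fields must remain searchable; this only illustrates the idea.

# Sketch: field-level encryption of patron data before it reaches the database.
# Assumes the third-party "cryptography" package (pip install cryptography).
# Key handling here is deliberately naive; a real system would keep the key in
# a secrets manager or HSM, never alongside the database it protects.
import sqlite3
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # store this somewhere other than the database
cipher = Fernet(key)

conn = sqlite3.connect("circulation.db")
conn.execute("CREATE TABLE IF NOT EXISTS loans (patron_blob BLOB, item_id TEXT)")

def record_loan(patron_id: str, item_id: str) -> None:
    # Only ciphertext is written to disk; the database never sees the plain ID.
    token = cipher.encrypt(patron_id.encode("utf-8"))
    conn.execute("INSERT INTO loans VALUES (?, ?)", (token, item_id))
    conn.commit()

def patron_for_loan(encrypted_blob: bytes) -> str:
    # Decryption requires the key, which lives outside the database.
    return cipher.decrypt(encrypted_blob).decode("utf-8")

record_loan("patron-12345", "item-67890")
stored = conn.execute("SELECT patron_blob, item_id FROM loans").fetchone()
print(patron_for_loan(stored[0]), stored[1])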

What it looks like to process 3.5 million books in Google’s cloud

What did it look like to process 3.5 million books? Data-mining and creating a public archive on that scale is an example of an application perfectly suited to the cloud, in which a large amount of specialized processing power is needed for only a brief period of time. Here are the five main steps that I took to make the knowledge held in millions of books more easily and speedily accessible in the cloud:

From Google Cloud Platform Blog: What it looks like to process 3.5 million books in Google’s cloud
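
The pattern described here, a large burst of compute rented briefly and then released, usually reduces to something embarrassingly parallel: list the digitized volumes in object storage, fan the per-book work out across workers, collect the results, and shut everything down. The Python sketch below is a generic illustration of that fan-out using the google-cloud-storage client and a local process pool; the bucket name and the per-book analysis are hypothetical, and it is not a reconstruction of the five steps from the original post.

# Sketch: embarrassingly parallel processing of digitized books held in
# Google Cloud Storage. The bucket name and per-book logic are hypothetical.
# Requires: pip install google-cloud-storage, plus Google Cloud credentials.
from concurrent.futures import ProcessPoolExecutor
from google.cloud import storage

BUCKET = "example-digitized-books"   # hypothetical bucket of plain-text volumes

def word_count(blob_name: str) -> tuple:
    # Each worker downloads one volume and runs a (trivial, stand-in) analysis.
    client = storage.Client()
    text = client.bucket(BUCKET).blob(blob_name).download_as_text()
    return blob_name, len(text.split())

def main() -> None:
    client = storage.Client()
    names = [b.name for b in client.list_blobs(BUCKET, prefix="volumes/")]
    # On one large VM, max_workers tracks the core count; across many VMs,
    # each instance would take a shard of the names list instead.
    with ProcessPoolExecutor(max_workers=8) as pool:
        for name, count in pool.map(word_count, names):
            print(name, count)

if __name__ == "__main__":
    main()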

A Short History of the Index Card

Index cards are mostly obsolete nowadays. We use them to create flash cards, write recipes, and occasionally fold them up into cool paper airplanes. But their original purpose was nothing less than organizing and classifying every known animal, plant, and mineral in the world. Later, they formed the backbone of the library system, allowing us to index vast amounts of information and inadvertently creating many of the underlying ideas that allowed the Internet to flourish.

From A Short History of the Index Card

Medieval Handwriting App - Medieval Histories

If you want to study medieval scripts, handwriting, and manuscripts, or simply want to get acquainted with some of the finest medieval codices, here is an app to get you started.

The origins of the app – Medieval Handwriting – lie in online exercises in palaeography developed for postgraduate students in the Institute for Medieval Studies at the University of Leeds in West Yorkshire, U.K.

The aim is to provide practice in the transcription of a wide range of medieval hands, from the twelfth to the late fifteenth century. Please note that it is not a tutorial on the development of handwriting in medieval Western Europe. Users can examine 26 selected manuscripts, zoom in on individual words, attempt transcription and receive immediate feedback. They can optionally compare their transcription with a full transcript. The user’s transcripts can be saved and reopened. The saved transcripts are accessible via File Manager apps.

From Medieval Handwriting App - Medieval Histories
