By way of Bernie Sloan:
There’s an article on Google’s digitization project in the new (March)
issue of American Libraries. It’s an “e-mail symposium” involving four
librarians: Deanna Marcum, Susan McGlamery, Ann Wolpert, and Michael
Gorman: Google at the Gates. American Libraries. 36(3), 40-43. March 2005.
Among Gorman’s quips: “Any user of Google knows that it is pathetic as an information-retrieval system – utterly lacking both recall and precision, the essential criteria for efficiency in such systems. If you cannot find what you want and if you are lucky enough to find something, it is a paragraph or two wrenched out of context; where is the advance in that?”
I Honestly Don’t Understand
Unfortunately I can’t RTFA, so I’m only able to comment on this one quote, but it’s probably safe to assume that it’s representative of all the FUD Gorman is now spreading.
“Any user of Google knows that it is pathetic as an information-retrieval system – utterly lacking both recall and precision, the essential criteria for efficiency in such systems. If you cannot find what you want and if you are lucky enough to find something, it is a paragraph or two wrenched out of context; where is the advance in that?”
I’m not even sure why I’m even bothering, other than to really explore my own ideas.
Recall & Precision?? Is he serious? These are great theoretical constructs, but in practice they are impossible to measure on any database, let alone one made up of 8+ billion documents written in different languages and different formats with no standardization. I’d say all databases are utterly lacking both recall and precision; these are numbers that are impossible to measure. Ah to be an academic and really think this is something that matters…
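For anyone who wants the textbook definitions spelled out, here is a minimal sketch. The document IDs and the `precision_recall` helper are made up for illustration. Note that recall needs the complete list of relevant documents in the collection, which is exactly the number nobody has for a web-sized index.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: documents the search system returned
    relevant:  every document in the collection that is actually relevant
               (hypothetical here; in practice this set is rarely known)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were returned

    # Precision: what fraction of the returned documents are relevant?
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: what fraction of all relevant documents were returned?
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data: four documents returned, five relevant ones in the collection.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7", "d8", "d9"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two of the four returned documents are relevant (precision 0.5) and two of the five relevant documents were found (recall 0.4).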
As for “it is a paragraph or two wrenched out of context” I don’t even understand what he’s picking on here. Is it the search results page? Having a paragraph or two wrenched out of context in the search results page with the terms you searched for highlighted is a HUGE thing, anyone remember when they first did this?
I just don’t get it, how can he call Google (or any modern search engine or DB) “pathetic as an information-retrieval system.” This is just sad pathetic thinking I’ve seen far too often. As an information-retrieval system Google is a truly amazing piece of work. I don’t think it’s perfect, I don’t think it’s all anyone needs, and I don’t think it’s useful for everything.
Imagine 10 years ago if I said in 10 years you’ll be able to search a database (full text) of over 8 billion documents. You would’ve said I was crazy.
So what’s his reasoning, can anyone explain it to me? Is he on the attack here because he thinks if he knocks Google down he makes us (libraries & Librarians) look better?
Do we knock down & attack everything that isn’t perfect or do we use all the tools we have available to us to answer our users’ questions? Google is (in its current form) an invaluable tool. Google in 10 years, I can’t even imagine.
Brain Pollution Re:I Honestly Don’t Understand
Google is cool, it’s the wet dream of every trivia master out there. It can be good for getting you started on something but it’s not a primary resource for serious research. I think he finds the idea of the ‘quick fix’ a little offensive. There’s merit in that. The same way we get mad at cars with a bass stereo that rattles windows or billboards as far as the eye can see. There’s room for debate on this.
Collected Gorman Quotes…
Since I have the actual magazine in front of me, I’ll just type in the Gorman ones for Blake.
Q: Following the announcement of Google’s digitization plan, librarians and others made sweeping claims for the project’s impact. A month later, what do you foresee as the implications for the larger library world?
– Since scholarly books are, with very few exceptions, intended to be read cumulatively and not consulted for snippets of information, making those that are out of copyright available by means of a notoriously fallible search engine seems to be, at best, a misallocation of resources.
– With all due respect, this response [Ann Wolpert’s comment that the Google initiative is a wonderful experiment] is typical of many in that it contains numerous unverifiable assumptions. How will a “wealth of historical print material” be made available to the world? Any user of Google knows that it is pathetic as an information-retrieval system – utterly lacking both recall and precision, the essential criteria for efficiency in such systems. If you cannot find what you want and if you are lucky enough to find something, it is a paragraph or two wrenched out of context; where is the advance in that?
Also, no amount of ‘research on search engines’ is going to overcome the fundamental fact that free-text searching is inherently inferior to controlled-vocabulary systems and will be so until we have computers with the capabilities of human brains. Google is supposed to have complex algorithms but still produces piles of rubbish for almost all searches. You can put lipstick on a pig, but it’s still a pig.
What’s more, this program will contribute nothing to the preservation of the human record. In fact, a scheme to transfer recorded knowledge from the stability and fixity of paper to the instability and mutability of digital records is a giant step back.
Q: Do you share some observers’ concern that a for-profit company is poised to become the gatekeeper to the world’s knowledge?
– If Google were to be the gatekeeper to the world’s knowledge, I would be very concerned, but since that statement is a combination of hype and hubris, I am not losing sleep.
Q: What are ramifications down the road of prestigious libraries taking digitization “sides”…?
– I do not think a choice between the two is very important since both will exclude copyrighted books and atomize out-of-copyright books.
Q: Does Google’s move open the door to the commercialization of libraries?
– No, but it does provide cover for the de-professionalization of libraries, allowing university administrators in the grip of cupidity to replace librarians with clerks and eventually with bibliographic versions of ATMs. The University of Wales at Bangor has just announced that it is doing away with almost all its librarians because “information resources” are available to all. The Google project will give more ammunition to such philistines.
=========================================
I’m tired of typing…he also is against digitization and doesn’t care if these efforts might help smaller libraries. If you read the whole article, the other three people are generally hopeful and positive – go figure. I found it interesting that I agreed with Gorman’s concerns about Google and digitization, but not his degree of contempt.
Since he is the ALA president, I’m looking for more answers on where his leadership will take the Association during his tenure. For many of those answers, he could have added a solution or a strategy, rather than just saying ‘no’ to everything.
Re:Collected Gorman Quotes…
Thanks kmhess.
“pathetic as an information-retrieval system“
I already addressed this. Is it really possible that he believes this? Again, I’m not saying it’s perfect or the answer to all our problems. Google TODAY is amazing, Google in 3 or 5 years will be more amazing still, and there’s a million brilliant people working hard to make Google look like a booger.
“no amount of ‘research on search engines’“
Methinks Gorman doesn’t know the first thing about computers, let alone what research is being done on information retrieval. Please, someone explain the past, present and future of information retrieval research.
“transfer recorded knowledge from the stability and fixity of paper to the instability and mutability of digital records“
I don’t recall reading anywhere part of the plan was to ditch the books, did I miss that? And that’s not a rhetorical question, was that part of the plan?
“If Google were to be the gatekeeper to the world’s knowledge…I am not losing sleep. “
And while he is sleeping Google is slowly becoming the gatekeeper to the world’s knowledge and he’s doing nothing to stop it. I doubt there are many librarians that want Google to be the gatekeeper, but burying our heads in the sand is doing nothing to stop this.
“bibliographic versions of ATMs“
Eek! It’s 1993 again!
This is, no doubt, something we all worry about. As the President of the “Voice of America’s Libraries” I sure hope he has a good understanding of what this really means and what he can do to stop it.
And finally one from kmhess:
“For many of those answers, he could have added a solution or a strategy, rather than just saying ‘no’ to everything.“
The Voice Of America’s Libraries just became a bitter, contemptuous old man.
Re: There’s room for debate on this.
Are you saying there’s an issue you see two sides on?! Now that’s surprising!!
Of course there’s room for debate, that’s a great point, but one I think Gorman doesn’t get. I’d expect the kind of rhetoric I am seeing from him here from us blog people, but I’d want better from someone in his position. I don’t think the ALA needs to look any more extreme than they already do, plus he’s in a position to encourage productive action in this area that I think is critical to our future.
Re: There’s room for debate on this.
Are you saying there’s an issue you see two sides on?! Now that’s surprising!!
Bitch…
You’re not the first person to say that ‘someone in his position’ should be somehow better. It ain’t exactly a rigorous campaign to run for ALA President. If you get your hands on a print copy of this discussion, flip back a few pages and you’ll see the text from the open forum of the two current candidates. There was only one hardball question and I’m proud to say it was from me.
Re: I’m proud to say it was from me.
good, at least someone is asking them something difficult. Maybe we need to start the LISNews interviews again, not that we were ever really hard on them, but at least we had some good questions.
I could never tell any of the candidates apart, they all had the same platform as far as I could tell.
Re: I’m proud to say it was from me.
I sent them a long list of difficult questions. They chose to opt out.
Maybe you’ll have better luck.
Re:Collected Gorman Quotes…
The ironic thing is that Gorman didn’t sound nearly as bad with the selected quotes as he did reading them in context in the article. The contrast between three people with positive attitudes with his was simply unbelievable.
I give
I feel that when the President of the ALA is clearly proving himself to be a fool on a weekly basis, it is much more damaging to libraries than Google or the Internet.
Call this flamebait if you wish, because I feel his comments deserve nothing else, since he is clearly incapable of rational thought and debate.
Google Wars
Actually, google does have many problems which make it an unreliable information source. This is a re-hash of the boring librarian vs. computer staff wars of the 1970’s on Keyword in Context (KWIC) and Keyword Out of Context (KWOC). The best example of this, is of course, to do a google search on dolphins in Florida. This was posed by a librarian some years ago, and shows the problems of trying to find scientific information when the web is flooded with fans of the Miami Dolphins football team.
There are new advantages to finding information with scanning historical books online, and advantages to be found that haven’t even been thought of today.
But the mix of clutter and fact on the web makes any search process inherently unreliable. Another librarian did a search on the text of Martin Luther King’s “I have a Dream” speech during the time of the Million Man March. She found well over 200 sites with the text, but all of them were incorrect. Dr. King had given out a text to reporters before his speech, but made some impromptu changes as he delivered the speech. Not a single site had what he actually said- all had copies that were given to press beforehand. Also, many of these sites had simply copied from each other- many typos and other mistakes were replicated again and again. Some copies of the speech were deliberately altered by people with other agendas. End result- all the sites were unreliable in showing what Dr. King actually said.
More access to more crap does not make a better machine. If everyone reproduces incorrect data, then all data sources are incorrect. If you need to use the Latin term for dolphins to find out about the cheerful water mammals in Florida, then you are using the hierarchical terms, and not KWIC.
Now if google will put all the Project Gutenberg texts into one big file that can be searched for phrases, that would be interesting indeed. But this is still limited by copyright to older texts, not the current ones people need. If you are looking for stuff on Florida football teams, google is great. If you are looking for real heart cutting, library research questions such as what to do when your own child is diagnosed with cancer, data from scientific and medical publications are still not available online for free, and most people won’t ever know when or where to look for more information. The tragedy is that they buy the hype (I call it the “KWIC-Fix”) and think that if they have done a google search, they have done it all. Instead, they have only scratched the surface. And they often don’t know it.
One day, I hope to meet Kwic and Kwoc, the “Shelve-it Brothers,” and shake them by their hands.
Re:Google Wars
Good comment!
>>Actually, google does have many problems which make it an unreliable information source.
Of course it does, I don’t think anyone is saying Google [today] doesn’t have any problems. Just because it’s not perfect doesn’t mean it’s unusable.
>>The best example of this, is of course, to do a google search on dolphins in Florida.
Good example.
>>But the mix of clutter and fact on the web makes any search process inherently unreliable.
Great! That makes libraries more important than ever. Where did Gorman make this point, when does anyone at the ALA make this point? I would just add to your point, today it makes it less reliable, what about in 5 or 10 years?
>>More access to more crap does not make a better machine.
>>If everyone reproduces incorrect data, then all data sources are incorrect.
True. Another good point, one that “The Voice Of America’s Libraries” needs to be SCREAMING.
>>Now if google will put all the Project Gutenberg texts into one big file that can be
>> searched for phrases, that would be interesting indeed.
Interesting indeed, there’s a million interesting things they can now do.
>>limited by copyright to older texts, not the current ones people need.
Copyright is another thing we should all be talking about, and making people aware of.
>>If you are looking for stuff on Florida football teams, google is great.
Google is great at many things, libraries are great at many things, and some of these things currently overlap. We need to make sure we have someone who is focusing on what we can do, rather than spreading FUD about our biggest competitor.
>> If you are looking for real heart cutting, library research questions such as what
>> to do when your own child is diagnosed with cancer, data from scientific and
>> medical publications are still not available online for free, and most people
>> won’t ever know when or where to look for more information.
Yes, yes, good, great! Again my rhetorical question, how does calling Google a pathetic information-retrieval system help us get the word out?
>>One day, I hope to meet Kwic and Kwoc, the “Shelve-it Brothers,” and shake them by their hands.
HA! Great quote!
Insanity from Gorman (again)
I posted a much longer response on my blog, but I’m beginning to believe that Gorman is doing this on purpose for some reason. I can’t believe that any librarian is this out of touch with reality these days.
Re:Insanity from Gorman (again)
“The fact that it’s a freaking verb at this point in time.“
heh. some good points in there.
Unfortunately, I CAN believe that a librarian is that out of touch with reality these days. That’s not really a bad thing, not everyone needs to believe Google is useful for anything, or even believe the ‘net is useful. BUT, The Voice Of America’s Libraries should be well versed in as many of the issues impacting our profession as possible. That voice shouldn’t be dismissive and curmudgeony.
If he is doing this on purpose what does he think he’s accomplishing?
Re:Google Wars
On the other hand, Googling “dolphin Florida -football” is about seven-in-ten cetaceans; and “dolphin Florida -football biology” gets not only interesting stuff, but stuff that’s academic enough to provide expert-approved pointers. It’s easier to exclude terms in a full-text search than when using even a very good paper index.
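To make the exclusion idea concrete, here is a toy sketch of a full-text filter that honors a leading minus sign. It is only an illustration of the principle, not how Google actually works; the `search` function and the three sample documents are invented for the example.

```python
def search(docs, query):
    """Return the ids of documents containing every plain query term
    and none of the terms prefixed with '-'.

    docs:  dict mapping a document id to its text (hypothetical sample data)
    query: e.g. "dolphin Florida -football"
    """
    terms = query.lower().split()
    required = [t for t in terms if not t.startswith("-")]
    excluded = [t[1:] for t in terms if t.startswith("-")]

    results = []
    for doc_id, text in docs.items():
        words = set(text.lower().split())  # naive whitespace tokenizer
        if all(t in words for t in required) and not any(t in words for t in excluded):
            results.append(doc_id)
    return results


docs = {
    "a": "Miami dolphin football schedule Florida",
    "b": "Bottlenose dolphin biology in Florida waters",
    "c": "Florida dolphin watching tours",
}

print(search(docs, "dolphin Florida"))                     # all three match
print(search(docs, "dolphin Florida -football"))           # football page dropped
print(search(docs, "dolphin Florida -football biology"))   # narrowed further
```

One excluded word removes the football page, and one extra required word narrows the rest to the biology page. Doing the same with a printed index means reading every entry under "dolphin" by hand.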
Of course, you do have to use one of the links to get useful info, and I’m beginning to wonder if Gorman ever does; is he mistaking the index for the substance? I’m gravely puzzled by his constant harping on out-of-context paragraphs, because I get full copies of all sorts of original docs. It’s true many are erroneous – but I don’t think reproducing King’s own press release is the worst kind of error, and besides there’s plenty of bad transmission in print. I myself found an error in a Project Gutenberg copy of _Extraordinary Popular Delusions…_ that’s probably inherited from a human printer’s transcription error; it took about a week for someone at PG to check an old source and correct the online version. Some of the print copies are correct, some not.
Sadly, librarians are needed to inform many of the wild starts of digitalizers, but all this FUD won’t help. And it doesn’t escape the attention of a lot of digitalizers that we’re giving our work away for free, and the professional world can give us good advice or just quit complaining already.