University of Michigan's confidential agreement with Google
Google Watch appeals to the American Library Association
Google-eyed U.Michigan gives away its library
by Daniel Brandt
June 19, 2005
Public Information Research, the sponsor of this site, filed a freedom of information request with the University of Michigan on May 25. We were already aware that libraries that had contracted with Google to digitize some or all of their collections were under nondisclosure constraints. John Wilkin, a U of M librarian who is working with Google, admitted this last December when I asked him directly. A couple of months later, a reporter told me that he got nowhere when he tried to ask Harvard about their program.
Unlike the other four libraries that signed with Google, the University of Michigan is a public institution, and is subject to Michigan's freedom of information law. The University notified me on June 17 that they had just posted a copy of the contract on their website, and that our copy was in the mail. Sure enough, every page is marked "confidential," and there's an entire section in the contract about confidentiality. Curiously, it ends with an acknowledgement that because the University is subject to Michigan's freedom of information law, it will not be considered a violation of this section if the University reveals the contract. That raises the question of why the entire thing was confidential to begin with. Or did they tack this on after we filed our request?
The next issue doesn't have much to do with the contract, but it's interesting because this is the most immediate issue facing the University. This is the copyright issue. I predict that the University of Michigan's current intention of providing their entire seven-million-book collection to Google will not stand. This will be true even though the logistics of the digitizing process are locked in until 2009, and the terms and conditions for using the digital files are perpetual, according to the language in the contract.
Google does not have the right to challenge the University if the library simply says that certain types of material are unavailable for digitization. The University has an easy way out of the contract in this respect, and they will probably need it. This is almost the only item in the contract language that wasn't conceded to Google by the University.
The University of Michigan is wide open to a court order over copyright issues. U of M is the only one of the five libraries to openly brag that they expect their entire collection will be digitized by Google. Stanford and Harvard also plan to allow Google access to copyrighted material, but their projects appear to be less ambitious, or at least on a slower track, than Michigan's. The other two libraries, the New York Public Library and Oxford, have excluded copyrighted material.
There are copyright issues that Google must deal with once they own the digital files. These involve the definition of "fair use." But at this particular moment, this aspect of copyright law is not the potential show-stopper. The immediate question is, does the University of Michigan have the right to turn over copyrighted material to Google for the express purpose of digitization by Google? That's the show-stopper.
Look at it this way: You walk into the library down the street with a big bag full of quarters. You take a book down from the shelf, one which is copyrighted and cannot be checked out, and you start copying from page one. Many librarians would be nervous about allowing you to do this under current copyright law, even if they prefer not to stop you.
This is exactly what Google is doing, except that they have seven million bags of quarters, and the University is telling Google, "Wonderful, here's the next cart of books for you! Can we be of further assistance?" It's not going to fly in the long run. The Association of American University Presses will probably not file suit. Their member organizations are from campuses, where anti-copyright librarians hang out, and no one likes internal wars. But Google has also sparked the interest of major for-profit publishers. It would not take much to cause the University of Michigan to become sufficiently nervous over liability issues, so that they decide to withhold all copyrighted material from Google until this matter is resolved.
I dwell on copyright issues only because they are the weakest link for the University of Michigan. Other issues are also important, although they don't have equivalent legal standing. Even if Google cannot get any copyrighted material from libraries, they can still get the public domain material. There's a lot of public domain material that Google would love to acquire. That's the point at which these other issues become crucial.
Beyond the privacy issue, we have the cultural issue and the censorship issue. If Google has a near-monopoly on digitized books, which books will they choose to make available through web search services? Mostly English or lots of other languages too? Is Karl Marx okay? What about a book on explosives? How about that glowing biography of Osama bin Laden that's in the public domain?
There's also the monetization issue. Google will not impose a "direct cost" for access to their digital books through their search services, although they can license or sell their files to their partners. However, we all know by now that there are many ways for Google to monetize their asset indirectly. Some of these are intrusive, and an insult to our cultural heritage and our self-respect. Will Google care, as long as they are making money and their stockholders are happy? (Click on the cartoon to find out why it's not much of an exaggeration.)
And what about the language in the contract that says the University of Michigan can only use their copies of the files on their own website, and only if they lock out bots and third party redistribution, and only if they have limited traffic? Sure, the University gets a copy of Google's files. But they cannot do anything with these copies by way of making them publicly available. The University is not allowed to compete with whatever Google decides to do with the digitized files. The library at the University of Michigan has betrayed the trust that we placed in them, as a public institution that acts as a custodian of our public-domain printed heritage. They've agreed to hand it over to Google, at which point Google claims its own copyright on the digital files. Small wonder that the contract was confidential!
4.4.1 ... U of M shall restrict access to the U of M Digital Copy to those persons having a need to access such materials and shall also cooperate in good faith with Google to mutually develop methods and systems for ensuring that the substantial portions of the U of M Digital Copy are not downloaded from the services offered on U of M's website or otherwise disseminated to the public at large.
Librarians at the University of Michigan were blindsided by the opportunity to get their entire collection digitized, and they signed on with Google before they considered their responsibility to the public. The trend of universities selling out to private interests has been pronounced over the last two decades, yet this is bigger than some mere proprietary research done at a university and funded by a big drug company. This is our culture and our consciousness. Why hand that over to a greedy corporation?
Google will bastardize this material by monetizing it, at the same time that they track everyone who accesses it, to the tune of millions of users a day. Meanwhile, the University of Michigan will be able to brag that students with a library card can use the University library from their dorm rooms. This is an unfortunate situation for the University and for the public. Assuming that the University doesn't have the funds to do its own digitization, it would still be better for all of us if they just waited until someone came along with better terms and a more public-spirited proposal.
Any major library is in a strong bargaining position, and it should recognize this. It is expensive and time-consuming to digitize books, but John Wilkin makes it sound like it would all be impossible without a mysterious Google technical innovation, and then he declines to explain the digitizing process in further detail. Google is bluffing, and I suspect that Mr. Wilkin knows this. However, the process is too far along at U of M by now, and Mr. Wilkin has to protect himself.
Secret Google technology revealed
Detroit Free Press, December 14, 2004:
The size of the U-M undertaking is staggering. It involves the use of new technology developed by Google that greatly speeds the digitizing process. Without that technology, which Google won't discuss in detail, the task would be impossible, says John Wilkin, the U-M associate librarian who is heading the project.
Christian Science Monitor, June 27, 2005:
"We had all these cockamamie schemes for how we could get content," recalls Marissa Mayer, director of consumer Web products at Google. "We thought, well, could we just buy books? But then you don't get the old content. We thought maybe we should just buy one of every book, like from Amazon, and scan them all." How long would it take to scan all the world's books? No one knew, so Ms. Mayer and Google cofounder Larry Page decided to experiment with a book, photographing each page so that it could be digitally scanned. "We had a metronome to keep us on rhythm for turning the pages. Larry's job was to click the shutter, and my job was to turn the pages," Mayer says. "It took us about 45 minutes to do a 300-page book."
After five years of watching Google, I don't buy the secret sauce argument. Google needs access to major libraries, because they really believe that they're on a mission from God to organize the world's information. It's more difficult to acquire access to out-of-print books than it is to digitize them, which means that the big libraries themselves are holding a winning hand here. This entire situation simply shows us that Google plays a much better game of poker than the University of Michigan.