Carl H. Pforzheimer University Professor em. and University Librarian em., Harvard University, Cambridge, Massachusetts, USA.

2011 is the year of the 'big no' to Google Books, the project conceived in Mountain View to digitize every book ever published. The agreement that Google proposed to settle the class action brought against it by the Authors Guild and the Association of American Publishers, which alleged copyright infringement, was rejected by a federal judge in Manhattan. While many applauded the decision to prevent a company from monopolizing access to public cultural heritage, the dream of making all the books in the world available to everyone is inspiring an ambitious new project embraced by a pioneer of the history of the book, Robert Darnton. This Chevalier of the Légion d'Honneur, whose career has spanned Harvard, Oxford and Princeton, is today Professor and University Librarian at Harvard, a trustee of the New York Public Library, the founder of the Gutenberg-e program, sponsored by the Mellon Foundation, and an international authority on electronic publishing whose articles have been featured in the New York Times. Prof. Darnton will bring the latest experiences from his research group at Harvard, which is studying the legal, financial, technological and political implications of a new model of digital public library that would provide digital books free of charge to readers.

Breaking the Wall of Inaccessible Knowledge. How Digitization Can Democratize Culture


The wall that I think most deserves destruction is the wall that prevents open access to knowledge. In its most powerful and pervasive form, it is invisible, but it can also be a physical presence as imposing and oppressive as the wall that used to divide Berlin.

As an example, I would like to cite Thomas Hardy’s great last novel, Jude the Obscure. Its hero, Jude Fawley, is a working-class lad consumed with the ambition to penetrate the closed world of learning, which is epitomized by Christminster, a lightly fictionalized version of Oxford University. Having taught himself Latin and Greek, Jude deserves to be admitted into one of the university’s colleges. But he is kept out by the barriers of class. He works as a stonemason and never gets beyond repairing the walls that shut the colleges off from the outside world.

When I was a student at Oxford, 65 years after Hardy published his novel, the walls still looked formidable. The massive gate of my college slammed shut at 10:00 in the evening, and if you hadn’t made it inside, you had to climb over one of the walls – a daunting experience, as the walls were ten to fifteen feet high and bristled at their top with spikes and shards of glass. A few secret passes existed, but even they were treacherous – as you can see in this photo, which shows me posing with a friend at one of my favourite entry points, where you had to slip in between rows of fixed and revolving spikes.

I will flash more pictures of Oxford walls as accompaniment to the rest of my remarks, which are aimed at the invisible barriers to knowledge, especially the knowledge contained in libraries. Most Oxford libraries are walled inside the colleges.

Even the Bodleian Library of the university has the air of a fortress, with crenellations and iron gates. Libraries everywhere in the world frequently keep outsiders outside by all sorts of measures: locked doors, turnstiles, restrictive qualifications for entry, payment to obtain a reader’s card and an atmosphere of intimidation. Ordinary folk hesitate to brave these barriers. They are kept at a distance by the learned elite, who wear an air of effortless superiority, which corresponds to the social sifting that the French sociologist Pierre Bourdieu identified as “la distinction”. Oxford students captured this intellectual hauteur in verse, which they aimed at Benjamin Jowett, the master of Balliol College, who was also reviled in Jude the Obscure:

Here come I, my name is Jowett
All there is to know I know it
I am master of this college
What I don’t know is not knowledge

A counter-tendency gathered force in the age of Enlightenment, when spreading light became identified with reading books. The French Royal Library began to admit readers in 1692, although not very many (the royal librarian often gave all of them lunch on Thursdays). The British Museum was opened to the public in 1759. In the United States, the first large public library, established in Boston in 1848, allowed any citizen to borrow books and take them home to read. The New York Public Library opened its great collections in 1911 to anyone who walked in from the street. It served as an informal university for generations of immigrants who wanted to read their native literatures in Yiddish or Italian or Chinese. Andrew Carnegie financed the creation of 39 branch libraries throughout New York, while paying $40 million to build 1,697 community libraries everywhere in the country between 1886 and 1919.

It would be misleading, however, to conclude triumphantly that everyone at last has access to learning. We have only now reached a point where the democratization of knowledge can be achieved on a mass scale. Thanks to modern technology, we can create a library system that will make our entire cultural heritage openly accessible to everyone. That sounds utopian, I know: how can we create a worldwide, twenty-first-century, Carnegian system without any Carnegies to finance it? You will also reply with other objections – the Digital Divide, the problem of illiteracy, the grinding poverty that keeps most of the world’s population beyond the boundaries of book culture. I don’t pretend to any insight about the global dimension of this problem, although it may be that developing countries are capable of great leaps forward into a digital future that we cannot foresee. Instead, I would like to limit my remarks to the developed world, where there are still plenty of unacknowledged walls that need to fall before we all have access to knowledge. “The field of knowledge”, Thomas Jefferson said, “is the common property of mankind.” But since Jefferson’s day, it has been appropriated and fenced off by commercial interests. I would like to mention two examples.

1. The cost of academic journals. Everyone complains of information overload. My doctor, for example, laments that medical knowledge doubles every two years. How can he possibly keep up with it? The number of medical journals increased from 3,472 in 2000 to 4,866 in 2010. But my doctor does not know how much those journals cost, because he has access to them through his hospital, which subscribes to them online. Their price has increased at four times the rate of inflation since 1980. The Journal of Comparative Neurology now costs $29,113 for a year’s subscription; Brain Research costs $23,446; and I could give many more examples. Three big publishers – Elsevier, Wiley-Blackwell and Springer – publish 42% of all academic journal articles. They have no effective competition, because the market is divided into highly specialized sectors. Last year, Elsevier’s profit margin was 36% on revenues of 2 billion pounds.

These publishers have a stranglehold on the market, and they squeeze all the money they can get out of research libraries, whose budgets are declining. Libraries are therefore cancelling subscriptions, having already cut back on their acquisitions of monographs.

In other words, while the amount of knowledge is increasing, the proportion of it that is available to the public is decreasing. Of course, public funds subsidized most of that research in the first place, so you might think that the public should have access to the results of the research. The National Institutes of Health (NIH) acted on that principle in 2009, when the US Congress required that articles based on NIH grants be made available on an open-access repository, PubMed Central. But lobbyists for the publishers blunted that requirement by getting the NIH to accept a twelve-month embargo to prevent public accessibility long enough for them to cream off the demand. The publishers defend themselves by insisting on their value-added function and by invoking the sacred cause of copyright. They certainly add some value by editing and marketing the journals, but can those services justify profit margins of 30-40%? In article 1, section 8, the United States Constitution set two goals in establishing copyright: the advancement of knowledge and emolument to authors for a limited period. The copyright law of 1790 set that limit at 14 years, renewable once. The copyright extension act of 1998, pushed through Congress by more lobbyists, stretched that limit to the life of the author plus 70 years – that is, in effect, for a century or more. As a result, most twentieth-century literature has been excluded from the public domain. An enclosure movement has fenced off the public’s access to a public good.

2. A digital library. Commercialization has also threatened to take over the entire corpus of books in our research libraries. In 2004 Google began to digitize the libraries’ collections and to display snippets of them, with advertisements attached, as an online search service to its users. A group of authors and publishers then sued Google for alleged infringement of their copyrights. If Google had won its case in court, it could have scored a great victory for the doctrine of fair use. Instead, it reached a settlement with the litigants, which transformed the search operation into a joint speculation called Google Book Search. Having digitized many millions of books, Google would sell access to their contents in its database, which would become a gigantic commercial library. The research libraries, which had provided their books to Google for free, would have to buy back access to the same books, in digital form, at a subscription rate to be determined by Google and the copyright owners, who would have an interest in extracting all the money they could get. No one spoke for the public interest, because the deal was negotiated in secret, and the public was not consulted. Fortunately, the settlement had to be approved by a federal court. On March 23, the court rejected it on the grounds that, among other things, it threatened to create a monopoly in restraint of trade. It now seems unlikely that Google Book Search can be revived in a way that will satisfy the court, but we are still staring in the face of a great danger: a commercial enterprise nearly walled off an enormous stretch of our cultural heritage in order to exploit it for its own profit.

The collections of research libraries are a public good. At Harvard, we have built them up at enormous expense and labour over many generations since 1638. Strictly speaking, they belong to the president and fellows of Harvard College, but they are a national asset, and we should make them available to the entire country, in fact to the whole world. The same is true of the books, manuscripts, photographs, recordings, films and databases in all the research libraries of the United States. Thanks to the latest technology, we can connect them together in a way that will make them accessible to everyone within the range of the internet.

We are doing so at this moment. We call our project the Digital Public Library of America, and we are designing it so that it will be interoperable with Europeana, the digital network that is aggregating the collections of 27 countries in Europe. Within a decade, a digital library system will make the world of learning available to the entire world.

A utopian dream, you say? I believe we need an infusion of utopian energy if we are to tear down the walls that stand in the way of the public good. But the DPLA is do-able. We will tap other sources of energy available in the United States: private foundations eager to invest resources in the public welfare, technological expertise ready to be harnessed, legal talent capable of negotiating a path through copyright restrictions and the no-nonsense, can-do, pragmatic spirit of people ready to commit themselves to the enterprise.

I will spare you the details. We unveiled most of them at a public launch of the DPLA in Washington last 21 October. Let me conclude with a promise: we will have a preliminary version of a great, open-access, digital library up and running by April 2013. We will build it with material recovered from the dismantling of walls that have kept the public from the public domain for centuries, and we will build it for the benefit of everyone.