A few days ago OCLC, the organization that operates the WorldCat library cataloging system, unveiled The Library 100, their version of a most-popular-novels list. Rather than tallying sales, OCLC decided to rank novels based on how many libraries that register information with WorldCat hold at least one copy of a given book.
Just glancing at The Library 100, something becomes apparent almost immediately. Rather than featuring contemporary bestsellers, the list is dominated by “classics,” the marketing category that covers older, timeless literature and usually carries prestigious connotations. Classics are also my wheelhouse, so on a personal taste level I don’t really have any complaints. I’m more interested in talking about what sort of classics ended up on this list, because I get the sense that libraries have a more narrow conception of the term than I do.
As an exercise, I recorded the publication date of every novel on The Library 100 and sorted each one into one of eight broad eras: pre–17th century, 1601–1700, 1701–1800, 1801–1850, 1851–1900, 1901–1950, 1951–2000, and 2001–present. I then counted the number of novels that fell into each period, to get a sense of which points in time libraries were especially fond of. The results are presented in the chart below:
Before going any further, I’ll note a few limitations to this approach. First, pinpointing exactly when some novels were published can take a bit of guesswork, especially for older works where the records may have been lost. Second, even if records are present and accurate, there may be multiple possible publication dates to choose from. For instance, many of the novels on The Library 100 were originally published in serial formats, and were subsequently compiled into a single book. In such cases, it’s not clear which date should be the “official” date of publication: when the first installment was published? the last installment? the completed and compiled book? It was because of such ambiguous cases that I decided to just use broad periods rather than precise years.
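For anyone who would rather not futz with a spreadsheet, the sorting-and-counting step described above could be sketched in a few lines of Python. This is only an illustration, not the workflow I actually used: the function names are made up for the example, the era boundaries are written so that no year falls into two buckets (an assumption on my part), and the three sample titles are a tiny stand-in for the full list of 100.

```python
from collections import Counter

# Eight broad eras as (label, first year, last year); None means open-ended.
# Assumption: boundaries are non-overlapping (1801-1850, 1951-2000).
ERAS = [
    ("pre-17th century", None, 1600),
    ("1601-1700", 1601, 1700),
    ("1701-1800", 1701, 1800),
    ("1801-1850", 1801, 1850),
    ("1851-1900", 1851, 1900),
    ("1901-1950", 1901, 1950),
    ("1951-2000", 1951, 2000),
    ("2001-present", 2001, None),
]


def era_for(year):
    """Return the label of the era that contains the given year."""
    for label, start, end in ERAS:
        if (start is None or year >= start) and (end is None or year <= end):
            return label
    # Safeguard only: the eras above cover every possible year.
    raise ValueError(f"no era covers {year}")


def count_by_era(years):
    """Count how many years fall into each era, keeping empty eras at zero."""
    counts = Counter(era_for(y) for y in years)
    return {label: counts.get(label, 0) for label, _, _ in ERAS}


# Illustrative sample only, not the full Library 100 data.
SAMPLE = {
    "Don Quixote": 1605,          # first part; the second followed in 1615
    "Pride and Prejudice": 1813,
    "A Christmas Carol": 1843,
}

if __name__ == "__main__":
    for label, n in count_by_era(SAMPLE.values()).items():
        print(f"{label}: {n}")
```

Because each book is reduced to a single year before bucketing, the serial-publication ambiguity still has to be settled by hand; the broad eras just make it matter less which date gets picked.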
Based on the above chart, we can see that the periods 1851–1900 and 1901–1950 make up a large portion of the list. Combined, this 100-year stretch accounts for 63 of the 100 novels. We then see a sharp drop-off on either side of this combined stretch, with the periods on either extreme of the chart accounting for just 1 novel each. Why exactly is this the case?
A few reasons spring to mind immediately. First, a list that only includes novels, like The Library 100, will necessarily be biased towards the period of time when the novel was popular, i.e., from the 18th century onward. You can surely imagine that a list that included drama and poetry would at least feature the likes of Shakespeare and Homer. Second, a list based on library holdings will be biased towards works that have been around for long enough to end up in such collections, especially if the novel in question still has to be translated into other languages. And third, well, the classics are popular. It may not be reflected on the bestseller charts, but think of how many people read Pride and Prejudice or A Christmas Carol every year. Almost by definition, they have a proven, consistent fanbase, and that will convince libraries to keep those books on shelves.
But of course, there are other, more socially systemic reasons why one would expect classics to dominate this particular list, reasons that OCLC actually acknowledges in the FAQ section. They note that classics “are the novels most often translated, retold in different editions, taught and widely distributed in library collections,” and that as a result, “the list tends to reflect more dominant cultural views.” (They go on to mention various efforts to diversify their holdings and encourage the reader to lend a hand in the effort.) It’s no surprise that white men are overrepresented here, but something that did surprise me was how Anglophone the list was as well. You can see just how much English-language works dominate the list in the chart below:
Even though I tend to think of French and Russian as especially literary languages, combined they only account for 17 of the 100 books on the list. And that’s to say nothing of languages that are completely absent: no Arabic, no Japanese, no Mandarin, etc. English is especially over-represented in the top slots. While the #1 novel on the list was written in Spanish (Don Quixote by Miguel de Cervantes), the rest of the top 20 was all written in English. This might not be too shocking if WorldCat was only used in, say, the United States, where publishers have been historically reluctant to publish works in translation. But OCLC boasts that WorldCat is used in over 120 countries, so what gives?
My best hypothesis is based around the fact that the rise of the novel coincided with the height of the British Empire and the emergence of the United States as a world power. In addition to imposing economic, political, and social systems onto the rest of the world, both the British and American empires could impose their cultural products onto it as well. This cultural imperialism could take a softer form, such as associating Anglophone literature with high class and prestige, or a harder form, such as forcing Anglophone literature into classroom curricula at the expense of literature in the local language. Even in our slightly more conscious postcolonial world, the effects of that imperialism may still linger in the collective taste of libraries.
Combine the context of world and literary history and the dominance of Anglophone literature in general on the list, and it’s almost natural that Charles Dickens is the most-represented author here. Six of his novels made The Library 100, with four of them in the top 20. Dickens is the epitome of Victorian novelists, which, in the somewhat narrow conception of classics this list presents, makes him the epitome of literature. Which, hey, maybe he is, at least to some people! He’s never been to my taste, exactly, but what’s wonderful about libraries (in theory, at least) is that they don’t pander to any one group’s preferences. They’re not marketplaces that conflate popularity with quality, but repositories and archives that treat all entries as worthy of respect. (Libraries are in fact run by fallible humans who do face economic realities, but can’t we live the dream for a few more minutes?)
Libraries are a hodge-podge—meticulously organized, but a hodge-podge nonetheless. That’s what I love about them, and that’s what I tried to capture in my Classics Club reading list. As I wrote in December, I wanted “kitchen-sink Naturalism and spiritual science fiction, epic and lyrical poetry, literary theory and analytic philosophy, Renaissance and modernist drama.” But I also wanted works from people of different backgrounds, from different languages and vastly different time periods. I’m not trying to disparage the list per se, which seems like a perfectly fine piece of descriptive analysis of library holdings. I’ve just been trying to figure out why I, of all people, found it all just a little bit boring.
Not too boring, though. Otherwise I wouldn’t have bothered futzing with Excel to write about it.
Thank you for reading! If you share my love of the classics but want something a little less obvious than The Library 100 catalog, you might enjoy my own list of books that should be taught in high school, which if nothing else includes some really good poetry collections.