What makes a book a “classic”? How should we decide which works belong in the Western literary canon? These questions have been debated in both academic and public circles for centuries. To some, the classics represent the pinnacle of human literary expression, works that deserve to be taught because they hold inherent and objective value. Others reject the notion of the “classics,” arguing that the term is outdated and too narrow to describe literature that spans hundreds of years and a variety of subcultures.
In any case, even if we cannot agree on what defines a “classic,” or on whether this is even a term we should be using anymore, it is hard to deny that certain books have had a clear and measurable impact on Western culture. Better questions to start with, then, might be these: how did these works become so ingrained in Western culture? What sets a “classic” apart from other novels in terms of its cultural impact? How does the popularity of classic novels correlate with, or even drive, the rising popularity of other novels?
To try to answer these questions, we compiled a list of around 100 widely agreed-upon Western classics and collected data on how mentions of their titles in other written works have changed over time. In theory, if a title is mentioned more at certain points in history than at others, this can act as a measure of how much discussion surrounded the novel at those points.
To determine if there were any general trajectories that novels follow in their path to becoming classics, we ran a clustering algorithm to group our classic novels by similar patterns of their mentions in the first 50 years after their publication. In doing so, we found that there isn't a "one size fits all" path for a book to become a classic; rather, three common trajectories emerged. Hover over the lines in each graph to see which novels fall under each trajectory.
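The grouping step described above can be illustrated with a small sketch. The article does not say which clustering algorithm was used, so this example assumes plain k-means on trajectories rescaled to the range 0 to 1; the titles, the numbers, and the helper names (`normalize`, `sq_dist`, `kmeans`) are all hypothetical, and five time points stand in for the 50 years after publication.

```python
from statistics import mean

def normalize(series):
    """Rescale a trajectory to [0, 1] so clusters reflect shape, not volume."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)
    return [(v - lo) / (hi - lo) for v in series]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100):
    """Plain k-means with a deterministic farthest-point initialization."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Pick the point farthest from every centroid chosen so far.
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        # Assign each trajectory to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels

# Hypothetical mention counts, already aligned to years since publication.
trajectories = {
    "Novel A": [1, 2, 3, 4, 5],     # steady rise
    "Novel B": [2, 4, 6, 8, 10],    # steady rise, higher volume
    "Novel C": [10, 6, 3, 2, 1],    # popular at release, then fades
    "Novel D": [20, 10, 5, 3, 2],   # popular at release, then fades
    "Novel E": [0, 0, 0, 1, 5],     # ignored at first, late growth
    "Novel F": [1, 1, 1, 2, 9],     # ignored at first, late growth
}

titles = list(trajectories)
points = [normalize(trajectories[t]) for t in titles]
labels = kmeans(points, k=3)

for title, label in zip(titles, labels):
    print(f"{title} -> cluster {label}")
```

Rescaling each trajectory before clustering is a design choice in this sketch: it groups novels by the shape of their curve rather than by how often they were mentioned overall.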
This category describes those novels that either stayed stable in their relevance in the first 50 years after their publication, or that saw repeated rises and falls in their mentions over time. In either case, the novels in this trajectory saw an overall linear rise in their popularity in the first few years after their release.
Many of the classic novels we examined were massively popular at the time of their release, but quickly trailed off in popularity in the following years. This is consistent with the idea that many classics were, in their own time, considered "fads" that wouldn't stand the test of time.
In some cases, however, novels that we consider to be classics today were not acclaimed at all at the time of their publication, but saw steadily growing interest in the fifty years after. This fits the view of many classics as being "ahead of their time" -- even if these stories did not speak to readers of their own day, they were still able to captivate those of later generations.
Great works of literature don't exist in a vacuum. Rather, novels that touch on similar themes and philosophical ideas tend to see similar trends in their popularity over time, and this can help us understand how classics might influence the discourse on other novels, driving up their popularity and eventually allowing them to become classics as well.
Each cell in the matrix below shows the correlation between the yearly mention frequencies of two classic novels in other written works from 1900 to 2019. A value near 1 (red) means the mentions of the two books in that cell rise and fall together over time; a value near −0.3 (blue) means their mentions diverge. Hover over any cell to see the pair of titles and how strongly their popularities are correlated.
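As a rough illustration of what a single cell represents, the sketch below computes the Pearson correlation r between two yearly mention series and ranks every pair of titles by it. The titles and numbers are hypothetical, the function names `pearson` and `ranked_pairs` are illustrative, and the real analysis may have smoothed or otherwise preprocessed the Ngram frequencies first.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def ranked_pairs(series_by_title):
    """Return (r, title_a, title_b) for every pair, highest r first."""
    pairs = [
        (pearson(series_by_title[a], series_by_title[b]), a, b)
        for a, b in combinations(series_by_title, 2)
    ]
    return sorted(pairs, reverse=True)

# Hypothetical yearly mention frequencies (one value per year).
series_by_title = {
    "Novel A": [1, 2, 3, 4, 5],
    "Novel B": [2, 4, 6, 8, 10],   # rises and falls with Novel A
    "Novel C": [5, 3, 4, 1, 2],    # mostly moves against Novel A
}

ranked = ranked_pairs(series_by_title)
for r, a, b in ranked:
    print(f"{a} / {b}: r = {r:.2f}")
```

Note that r compares how two series move over time. It says nothing about whether the two titles are mentioned in the same texts.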
Taken from the above visualization, the pairs below had the highest correlation in year-by-year Ngram frequency of mentions from 1900 to 2019. A high r means both books were mentioned more (and less) at the same moments in history, which would likely indicate that they touch on similar themes that audiences deem culturally relevant. However, some of the pairs listed -- Dracula and Wide Sargasso Sea, Frankenstein and Beloved, or Brave New World and Anna Karenina -- seem to have been produced in widely different cultural contexts (not to mention being published decades, even centuries, apart).
Nevertheless, when viewed through a wide lens, these novels can be compared -- Dracula and Wide Sargasso Sea, for example, both have central characters who are victims of xenophobia and of being "othered." Frankenstein and Beloved both have characters who were created and abandoned, and who appear to haunt their "makers" for the duration of the book. Brave New World and Anna Karenina both comment on the depraved nature of modern elite society and the lives of people who live in empty and fabricated worlds. The correlation patterns that these novels show in their popularity over time therefore suggest that the staying power and universal relevance that define the classics are found in the common themes, philosophical ideas, and character archetypes that have resonated among readers across time and place.
The following pairs moved in opposite directions over time -- when one was frequently cited, the other tended to recede, suggesting they occupy different cultural moments or audiences. This provides an important counterpoint to the picture of the classics painted above, as a body of works with interconnected themes, motifs, and universal appeal. In reality, "the classics" refers to a body of works that, as a whole, defies categorization, as classic novels often contain philosophical and moral ideas that contradict those found in other classics, or hold relevance to cultures that might be largely isolated from one another. While the correlation heat map shown above does indicate that a majority of the most lauded classics are positively correlated with each other -- suggesting that the popularity of most classic literature tracks the popularity of other classics -- it is important to recognize that there is still no single thematic thread that runs through all the works that earn the title of "classics."
If the popularity of novels is correlated based on the shared presence of certain themes and motifs, this might indicate that these novels belong to the same literary "genre." Organizing our novels by genre could therefore provide insight into how classics within those genres have changed in popularity over time.
Fantasy classics have seen the most dramatic rise in popularity. While books like Wuthering Heights and Dracula were not immediately successful upon their release, they have since become staples of the Western literary canon. On the other hand, classic adventure novels have seen a decrease in popularity since around 1850, suggesting they have diminished cultural relevance in the modern day. While we might tend to think of "the classics" as a static group of books with a steady pattern of popular attention and acclaim, the reality is that there are many books once considered classics that many scholars would argue no longer hold that title. Only time will tell if classics in the Adventure genre -- such as Moby Dick or The Three Musketeers -- will see renewed interest among readers of coming generations, or if many of them will be lost to history.
The jumps seen in the above graph indicate the publication of works that were seminal in establishing certain genres, or in pushing into the spotlight genres that were not previously considered worthy of reading. For example, books such as Ulysses, A Portrait of the Artist as a Young Man, and To the Lighthouse were all published around 1920, and are viewed as foundational works in the genre of "Literary Fiction" and modernism -- these books, then, help account for the rise of this genre seen at the beginning of the 20th century.
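The genre-level trends discussed above could be built along the following lines. This is only a sketch: it assumes each genre's curve is the per-year mean of its member novels' mention frequencies, which the article does not confirm, and the titles, genre assignments, numbers, and the function name `genre_trends` are hypothetical.

```python
from collections import defaultdict

def genre_trends(frequencies, genres):
    """Average mention frequency per genre and year.

    frequencies: {title: {year: frequency}}
    genres:      {title: genre name}
    returns:     {genre name: {year: mean frequency across its titles}}
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for title, by_year in frequencies.items():
        genre = genres[title]
        for year, freq in by_year.items():
            buckets[genre][year].append(freq)
    return {
        genre: {year: sum(vals) / len(vals) for year, vals in sorted(years.items())}
        for genre, years in buckets.items()
    }

# Hypothetical data: mention frequency per title in two sample years.
frequencies = {
    "Novel A": {1900: 2.0, 1950: 4.0},
    "Novel B": {1900: 4.0, 1950: 8.0},
    "Novel C": {1900: 6.0, 1950: 3.0},
}
genres = {
    "Novel A": "Fantasy",
    "Novel B": "Fantasy",
    "Novel C": "Adventure",
}

trends = genre_trends(frequencies, genres)
for genre, by_year in trends.items():
    print(genre, by_year)
```

Averaging rather than summing keeps a genre with many titles from dominating the chart simply because it has more members. Summing would instead show each genre's total share of discussion.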
While the analysis above is useful for determining the macro trends that emerge among the classics, it is undeniable that each book in the Western canon has had a unique and singular impact on literary history that can't be fully appreciated from the above data. The following "case studies" explore in depth some of the most recognizable classics in our dataset. Explore the following pages to learn about the impact that these novels have had on our society over the years.
En un lugar de La Mancha... there was a really cool article about Don Quixote.
It is a truth universally acknowledged that all people reading this article should take a look at this case study.
A true American classic, and a staple on any fifth grade girl's bookshelf.
You'll regret not reading this article as much as Raskolnikov regretted... everything.
"You can’t consume much if you sit still and read books."
If you want to learn more about classics that we didn't include in our case studies, you can explore their popularity graphs using the same dataset we used for this project (the Google Ngram dataset) here. If you want to make your own graphs like the ones shown in this article, feel free to visit the GitHub!