Joyce, Beckett, and the Meaning of "Statistical Similarity"

My research applies modern stylometric techniques to assess the similarity between the works of the Irish authors James Joyce and Samuel Beckett. The two do share some similarities. Both were Irish expatriates writing in Paris between the wars; they were friends and colleagues, and Beckett was a disciple of Joyce. Yet despite these parallels, and despite how frequently the two are compared, they occupied different historical moments and wrote in considerably different styles and forms. Joyce may be classified as a turn-of-the-century, often comic modernist working in stream of consciousness, while Beckett may be classified as a tragicomic existentialist with a considerably bleaker view of human existence. Comparing the two may therefore be neither particularly appropriate nor generative, and may even be reductive, since such a comparison risks flattening consideration of both authors to the single axis of their Irish heritage.

To assess the similarity between their respective oeuvres, I built a corpus comprising works by Joyce, Beckett, and other authors whose works may be considered similar to theirs. For Joyce, these authors include Virginia Woolf and William Faulkner, among others; for Beckett, they include Wyndham Lewis, Edward Albee, and Joe Orton. By comparing these texts in the statistical programming language R, using the stylo package and others, I hope to address a number of questions. To what degree are the works of Joyce and Beckett similar? How similar are their works to those of these other authors? From a computational standpoint, what do we mean when we say that the works of two authors are similar? What do we mean by the phrase "statistical similarity"? And what metrics can we use to assess similarity?
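To make the question of metrics concrete, here is a minimal sketch of one measure widely used in stylometry, Burrows's Delta: the mean absolute difference between two texts' z-scored frequencies for the corpus's most frequent words. It is written in Python for illustration only and does not reproduce the stylo workflow; the function names, the tokenizer, and the toy strings are all hypothetical placeholders, not material from my corpus.

```python
import re
from collections import Counter
from statistics import mean, stdev


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def burrows_delta(texts, n_words=5):
    """Return pairwise Burrows's Delta for a dict mapping label -> text.

    Assumes at least two texts; n_words is the number of most frequent
    words (pooled over all texts) used as stylistic features.
    """
    names = list(texts)
    token_lists = [tokenize(texts[name]) for name in names]

    # Feature set: the most frequent words across the whole toy corpus.
    pooled = Counter()
    for tokens in token_lists:
        pooled.update(tokens)
    vocab = [word for word, _ in pooled.most_common(n_words)]

    # Relative frequency of each feature word in each text.
    freqs = []
    for tokens in token_lists:
        counts = Counter(tokens)
        freqs.append([counts[word] / len(tokens) for word in vocab])

    # z-score every word's frequency across the texts.
    z = [[0.0] * len(vocab) for _ in names]
    for j in range(len(vocab)):
        column = [row[j] for row in freqs]
        mu, sigma = mean(column), stdev(column)
        for i in range(len(names)):
            # A word with no variation carries no signal; leave its z-score at 0.
            z[i][j] = (column[i] - mu) / sigma if sigma > 0 else 0.0

    # Delta = mean absolute difference of z-scores for each pair of texts.
    delta = {}
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            pair = (names[a], names[b])
            delta[pair] = mean(abs(x - y) for x, y in zip(z[a], z[b]))
    return delta


# Hypothetical placeholder strings, NOT quotations from any author.
toy_corpus = {
    "text_a": "the cat and the dog and the bird",
    "text_b": "the dog and the cat and the fish",
    "text_c": "of a cat of a dog of a bird",
}
for pair, score in burrows_delta(toy_corpus).items():
    print(pair, round(score, 3))
```

A lower score means two texts use the shared high-frequency words in more similar proportions. In a real study the feature list would run to hundreds of words and the texts to full novels or plays, and the choice of measure is itself part of what "statistical similarity" ends up meaning.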

This research experience has taught me the value of interdisciplinary research, in this case at the intersection of literary studies, statistics, and computer science. In particular, it has shown me how the methodologies and strengths of one field can be leveraged to further knowledge in what appears to be a totally disparate field. The boundaries between fields are far more permeable than they appear on the surface, and I hope to continue this interdisciplinary focus in my studies.