My project for this summer focused on the decline in usage of the English definite article ‘the’ and determining an explanation for the trend. We hypothesized that the trend was the result of English becoming less and less formal over the past century or so. We sought both to verify that formality is influencing “the” and also to determine what might be happening linguistically to steer such a trend.
My work involved writing Python scripts and using a formality classifier to determine correlations between formality and “the”; determining correlations between formality, “the” and various concrete linguistic features (e.g. parts of speech); as well as figuring out how the above are changing over time. In the end, we found strong evidence that formality may indeed be contributing to the decline in ‘the’ usage, and also determined a select number of linguistic features that could be replacing it.
I learned a great deal from working on this project this summer, both in terms of hard skills that I may apply directly in the future, as well as soft skills related to the research process in general. As far as hard skills are concerned, I learned a lot about libraries available for natural language processing and statistics using Python, which I am sure I will make use of as I complete further coursework in my field (computer science and linguistics) and begin my career. With regards to the research process, I learned that after crunching lots of numbers, it can be easy to lose sight of what they actually mean; it is important to constantly ask yourself what they indicate and how they should steer further research. I also learned a lot about the importance of building off the work of others. It is almost impossible to get anything significant done completely from scratch all by yourself. When doing research, it is crucial to focus on your main question of interest and leverage the work of others appropriately to get yourself to the point where you can answer that question. I’m confident that the research skills I have developed will prove invaluable as I perform future research or work in teams on larger-scale projects for courses or work.