I entered this Praxis assignment with a goal: measure pronoun usage in at least one Shakespearean play. Why pronouns? I was curious to see how often characters referred to each other using gendered pronouns. Furthermore, in my research, I found that pronouns are often treated as stop words (filler words that are removed before mining – think “the”, “an”, “a”, etc.), which made me curious about how much these overlooked words could tell. Why Shakespeare? His plays are easy to find as public domain text files, the plots are well known, several have interesting gender dynamics, and... I was in a Shakespeare company in my undergrad years. I figured that for new tools I may as well retread familiar territory.
I settled on As You Like It as this play particularly plays with gender – an exiled daughter of a duke disguises herself as a man who then offers to play the role of a woman (herself, no less) to help her love interest cope with his inability to see her. It makes more sense in context. With a woman as the main character, and one whose gender presentation changes throughout the play, I felt that As You Like It would serve as a good test case.
Next came the question of which software to use. I wound up trying a few, each of which will receive its own section:
Voyant
I started with Voyant, which was definitely the easiest to use. I copied and pasted the link from MIT’s webpage of the script (https://shakespeare.mit.edu/asyoulikeit/full.html) and clicked Reveal.
Immediately, I saw this Cirrus map of the most common words:

The script has 22,817 total words and 3,267 unique word forms according to Voyant, with the top five words being Rosalind (the main character), Orlando (her love interest), Celia (Rosalind’s cousin / best friend), love, and good. I scrolled through the most popular words, trying to select both pronouns and gendered terms, but realized that this would be far too manual to be efficient. I then tried searching for pronouns before realizing that this was essentially using the Find function on the original webpage, which also didn’t seem like the best use of the tool. Interestingly, the context tab does show what precedes and follows the term. Unfortunately, the terms only show up when you use a search term ending in an asterisk, so you do have to filter out other words that start with the pronoun (e.g. she -> shepherd). It’s fascinating and provides some additional information, but it doesn’t always reveal to whom the pronoun refers. Overall it was a fascinating starting point, but it didn’t really answer my question.
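To make the she -> shepherd problem concrete, here is a minimal Python sketch (not something Voyant runs, just an illustration) comparing a prefix-style search with a whole-word search. The sample sentence is invented for the demonstration and is not a quotation from the play.

```python
import re

# Invented sample sentence, not a quotation from the play
sample = "She is fair, and the shepherd says she shall wed. Sheep graze nearby."

# Prefix-style match, similar in spirit to searching "she*":
# any word that starts with "she"
prefix_hits = re.findall(r"\bshe\w*", sample, flags=re.IGNORECASE)

# Whole-word match: only the pronoun itself
exact_hits = re.findall(r"\bshe\b", sample, flags=re.IGNORECASE)

print(prefix_hits)  # ['She', 'shepherd', 'she', 'Sheep']
print(exact_hits)   # ['She', 'she']
```

The word-boundary marker `\b` on both sides is what keeps "shepherd" and "Sheep" out of the second list.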
Google Ngram (Aside)
I pulled up Google Ngram to see if it would help. Since it shows how often words appear across a large selection of texts rather than within a single play, it did not help with my original question. I did want to share, however, that out of curiosity I searched for the term “themself” (the singular reflexive form of “they”) to see whether singular “they” has historical validation as a legitimate pronoun. The result? From 1800 to 2022, we see a spike in “themself” usage in the most recent years, which seems to affirm the narrative that singular “they” is a more recent trend.

However, if you expand the search to texts from 1500 to 2022, the modern usage pales in comparison to its usage throughout the 16th century. Did this answer my question either? No, but I did find this to be an informative aside.

Lexos
The last software that I tried was Lexos, a data cleaning and analysis tool from Wheaton College. My original MIT link resulted in unwanted HTML that I couldn’t scrub, so I downloaded a text file of the script from Project Gutenberg and trimmed the front matter (title, Dramatis Personae, and scene list) and end matter (terms of use). Then, I went to the scrub function and tried to: make all text lowercase to avoid case sensitivity, remove digits, and remove punctuation but keep apostrophes. I had to replace the Roman numerals for Act and Scene numbers with digits so that I wouldn’t get an inflated count of “I”s in the text. I attempted to remove all words but pronouns and character names, but the result wasn’t as revealing as I’d hoped. Fascinating, but not revealing. Then, I took out a number of the stop words (e.g. the, an, a, that, enter, exit) to attempt to get a more comprehensive look at word frequency, resulting in this word cloud:
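For anyone who would rather script these scrubbing steps than click through them, they can be roughly approximated in Python. This is a sketch under assumptions, not what Lexos does internally: the `scrub` function and the short `STOP_WORDS` list are hypothetical, the sample line is invented, and it assumes the Roman numerals have already been swapped for digits as described above.

```python
import re

# Hypothetical stop-word list for illustration; not the full list used above
STOP_WORDS = {"the", "an", "a", "that", "enter", "exit"}

def scrub(text, stop_words=STOP_WORDS):
    """Lowercase, drop digits, drop punctuation except apostrophes, drop stop words."""
    text = text.lower()
    text = re.sub(r"\d+", " ", text)
    # Keep only letters, whitespace, and apostrophes (straight and curly)
    text = re.sub(r"[^a-z\s'’]", " ", text)
    return [token for token in text.split() if token not in stop_words]

# Invented sample line, not a quotation from the play
sample = "ACT 1. SCENE 2. Enter ROSALIND. I pray thee, Rosalind, be merry; 'tis the day!"
print(scrub(sample))
# ['act', 'scene', 'rosalind', 'i', 'pray', 'thee', 'rosalind', 'be', 'merry', "'tis", 'day']
```

Keeping the apostrophe matters for Shakespeare in particular, since contractions like "'tis" would otherwise be split or mangled.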

From this cloud, we can see that the first and second person are the most popular, but I still wanted more information. So, I went into the Content Analysis tab and entered text file dictionaries for each pronoun type I was looking for: first person, second person, third person masculine, third person feminine, they pronouns, and themself (out of curiosity). For second person, I included both forms of you and thou. I then selected all of the dictionaries and hit analyze. Lexos generated a table of how many times each pronoun type occurred. Ranked from most to least frequent, the order was: first person, second person, third person masculine, third person feminine, then they pronouns (sadly, no record of themself).
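The dictionary approach can also be sketched in a few lines of Python. The word lists below are my guess at what such dictionaries might contain, not the exact files I uploaded, and `count_pronouns` is a hypothetical helper rather than anything from Lexos. The sample sentence is invented.

```python
import re
from collections import Counter

# Hypothetical dictionaries; the exact word lists may differ from the ones used above
PRONOUN_SETS = {
    "first person": {"i", "me", "my", "mine", "myself", "we", "us", "our", "ours", "ourselves"},
    "second person": {"you", "your", "yours", "yourself", "thou", "thee", "thy", "thine", "thyself"},
    "third person masculine": {"he", "him", "his", "himself"},
    "third person feminine": {"she", "her", "hers", "herself"},
    "they": {"they", "them", "their", "theirs", "themselves"},
    "themself": {"themself"},
}

def count_pronouns(text):
    """Return the total number of occurrences for each pronoun category."""
    word_counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return {
        label: sum(word_counts[word] for word in words)
        for label, words in PRONOUN_SETS.items()
    }

# Invented sample sentence, not a quotation from the play
sample = "I love her, and she loves me. Thou art his friend; they know thee."
print(count_pronouns(sample))
```

Note that this tokenizer splits on apostrophes, so the "I" in "I'll" is counted as a first person pronoun. Whether that is desirable depends on how the scrubbed text treats contractions.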

Interestingly, in As You Like It, the most popular feminine pronoun is “her”, which was used 81 times; “she” occurred in the text only 46 times. Compare that with the most popular pronoun in each of the other categories: first person – I: 503 times, second person – you: 422 times, third person masculine – he: 180 times, and they – they: 48 times.
Conclusion
Did I (eventually) answer my original question? Yes, thanks to Lexos, I was able to find out the exact number of times pronouns appeared in As You Like It. Am I surprised that the number of feminine pronouns was significantly lower than the number of first person, second person, or third person masculine pronouns? Not entirely – since a play involves characters either monologuing their inner thoughts or speaking with each other, I’m not surprised that first and second person pronouns had such high representation. The fact that masculine pronouns occurred roughly three times as often as feminine pronouns was surprising, though: the characters are mostly men, but the lead is a woman, and there is quite a lot of pining at the center of this plot. The fact that the most popular feminine pronoun was the objective “her” rather than the nominative “she” was also surprising.
However, while I was able to find some information, my next step is to dig deeper into to whom the pronouns refer. Is it neat to know how often different pronouns occur and compare them? Sure. Is there a lot of context missing that would make this study even richer? Absolutely. In the future, I’d like to do a web scrape of the script to find who is speaking when each pronoun occurs and at what point in the play, so I can go back into the text and analyze who is being referred to. Is Rosalind (/ Ganymede, her male alter ego) referred to more with feminine pronouns or masculine pronouns? How often is each character referred to by pronouns? Do any characters have an interesting balance of pronoun types (e.g. does Jaques have more objective pronouns used toward him than nominative)? I had originally selected As You Like It for its gender subversion. While I answered my literal original question, I feel somewhat unsatisfied by the lack of an answer to the spirit behind it.
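As a starting point for that next step, here is a rough Python sketch of tallying gendered pronouns by speaker. It rests on an assumption I have not verified against any particular edition: that each speech in the plain-text script begins with the speaker's name in capitals followed by a period on its own line. The `pronouns_by_speaker` function is hypothetical and the sample dialogue is invented, not quoted from the play.

```python
import re
from collections import Counter, defaultdict

FEMININE = {"she", "her", "hers", "herself"}
MASCULINE = {"he", "him", "his", "himself"}

# Assumed format: speaker's name in capitals, then a period, alone on a line
SPEAKER_LINE = re.compile(r"^([A-Z][A-Z ]+)\.$")

def pronouns_by_speaker(script):
    """Count feminine and masculine pronouns spoken by each character."""
    counts = defaultdict(Counter)
    speaker = None
    for line in script.splitlines():
        line = line.strip()
        match = SPEAKER_LINE.match(line)
        if match:
            speaker = match.group(1)
            continue
        if speaker is None:
            continue  # skip anything before the first speech
        for token in re.findall(r"[a-z]+", line.lower()):
            if token in FEMININE:
                counts[speaker]["feminine"] += 1
            elif token in MASCULINE:
                counts[speaker]["masculine"] += 1
    return counts

# Invented sample dialogue, not quotations from the play
sample = """ORLANDO.
I would kiss her before I spoke.
ROSALIND.
He is a proper man; his wit is quick.
ORLANDO.
She is fair, and he knows it.
"""
result = pronouns_by_speaker(sample)
for name, tally in result.items():
    print(name, dict(tally))
# ORLANDO {'feminine': 2, 'masculine': 1}
# ROSALIND {'masculine': 2}
```

This only records who is speaking when a pronoun occurs. Working out who each pronoun refers to would still mean going back into the text by hand.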


