Tag Archives: text analysis

Echoes of Resistance: Revealing Afrobeat’s Cultural Impact – Analyzing Fela Kuti’s Legacy.

Fela Anikulapo Kuti was a Nigerian musician, the original creator of Afrobeat, and an outspoken activist from a family of generational activists, who used his music to awaken African consciousness, challenge political corruption, and call out social injustice in Nigeria and around the globe. I grew up as a young Nigerian millennial in the 1990s, exposed to the sound of this legendary figure. To me, this vibrant and revolutionary genre, conceived in Nigeria in the 1970s, has always been more than just music: it is a cultural force intertwined with political commentary and social movements. My final project, Echoes of Resistance: Revealing Afrobeat’s Cultural Impact – Analyzing Fela Kuti’s Legacy, researches this unique intersection through the lens of Digital Humanities, uncovering hidden narratives and reimagining how we understand music’s role in shaping history.

The project focuses on Afrobeat’s representation in Nigerian media during 1996-1997, a politically tumultuous period marked by military rule and social unrest. Drawing on the Archivi.ng digital newspaper repository, my research uses natural language processing, sentiment analysis, and other text analysis methods to dissect the cultural expressions embedded in the media coverage of Afrobeat. By combining these techniques with data visualization, the project aims to reveal the connections between political events, musical commentary, and grassroots activism.

An interdisciplinary team of statisticians, data engineers, and digital humanities researchers will collaborate to bring this project to life. Through web scraping and interactive visualization development, we will transform historical newspaper data into dynamic tools for analysis. The planned results include an interactive web platform, thematic dashboards, and temporal visualizations that highlight Afrobeat’s evolution and its societal impact.

The research is expected to uncover narratives that highlight Afrobeat’s dual role as a cultural critique and a rallying cry for positive change. The project’s deliverables will not only preserve these narratives but also make them accessible to a global audience, fostering a deeper appreciation of Afrobeat’s legacy.

Looking ahead, this project aims to contribute to academia and beyond. Open-source code and research tools will be made available on platforms like GitHub, encouraging collaboration and innovation in Digital Humanities. Academic publications will be developed to share insights with a broader scholarly community, while public-facing platforms will bring Afrobeat’s historical significance to light for enthusiasts and researchers alike.

By merging traditional scholarship with digital analysis, Echoes of Resistance demonstrates the transformative potential of integrating music, culture, and technology. It reaffirms Afrobeat’s place not just as a musical genre, but as a powerful agent of societal change.

PowerPoint slides for reference: Echoes of Resistance_ Revealing Afrobeat’s Cultural Impact – Analyzing Fela legacy..pptx

Kelechi Iwuagwu (Data Analysis & Viz Candidate, CUNY Graduate)

Text Mining Praxis: Gender and As You Like It. . . Or Maybe Not Quite

I entered this Praxis assignment with a goal: measure pronoun usage in at least one Shakespearean play. Why pronouns? I was curious to see how often characters reflected on each other using gendered pronouns. Furthermore, in my research, I found that pronouns were often treated as stop words (filler words that are removed before mining – think “the”, “an”, “a”, etc.), which made me curious about how much these overlooked words could tell. Why Shakespeare? His plays are easy to find as public domain text files, the plots are well known, several have interesting gender dynamics, and. . . I was in a Shakespeare company in my undergrad years. I figured that for new tools I may as well retread familiar territory.

I settled on As You Like It as this play particularly plays with gender – an exiled daughter of a duke disguises herself as a man who then offers to play the role of a woman (herself, no less) to help her love interest cope with his inability to see her. It makes more sense in context. With a woman as the main character, and one whose gender presentation changes throughout the play, I felt that As You Like It would serve as a good test case.

Next came the question of which software to use. I wound up trying a few, each of which will receive its own section:

Voyant

I started with Voyant, which was definitely the easiest to use. I copied and pasted the link from MIT’s webpage of the script (https://shakespeare.mit.edu/asyoulikeit/full.html) and clicked Reveal.

Immediately, I saw this Cirrus map of the most common words:

The script has 22,817 total words and 3,267 unique word forms according to Voyant, with the top five words being Rosalind (the main character), Orlando (her love interest), Celia (Rosalind’s cousin / best friend), love, and good. I scrolled through the most popular words, trying to select both pronouns and gendered terms, but realized that this would be far too manual to be efficient. I then tried searching for pronouns before realizing that this was essentially using the Find function on the original webpage, which also didn’t seem like the best use of the tool. Interestingly, the context tab does show what precedes and follows the term. Unfortunately, the terms only showed up when I used a wildcard search term ending in an asterisk, so you do have to filter through words that merely start with the pronoun (e.g. she -> shepherd). It’s fascinating and provides some additional information, but it doesn’t always show to whom the pronoun refers. Overall it was a fascinating starting point, but it didn’t really answer my question.
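Outside Voyant, the prefix problem described above (a search for “she” also catching “shepherd”) can be avoided with a whole-word match. Here is a minimal Python sketch of that idea; the function name `count_whole_word` and the sample sentence are made up for illustration and are not part of Voyant or the play.

```python
import re

def count_whole_word(text, word):
    """Count occurrences of `word` as a whole word, ignoring case."""
    # \b marks a word boundary, so "she" does not match inside "shepherd".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))

# Hypothetical sample text, not a quotation from the play.
sample = "She asked the shepherd where she might find her cousin."

whole = count_whole_word(sample, "she")
# A prefix search behaves like a term ending in an asterisk: it also catches "shepherd".
prefix = len(re.findall(r"\bshe\w*", sample, re.IGNORECASE))
print(whole, prefix)
```

The whole-word count leaves out “shepherd”, while the prefix search includes it, which is the filtering step I had to do by hand.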

Google Ngram (Aside)

I pulled up Google Ngram to see if it would help. Since it shows how often words appear across a large collection of texts, it did not help with my original question. I did want to share, however, that out of curiosity I searched for the term “themself” (the singular reflexive form of “they”) to see whether singular “they” has historical validation as a legitimate pronoun. The result? From 1800 to 2022, we see a spike in “themself” usage in the most recent years, which seems to affirm the narrative that singular “they” is a more recent trend.

However, if you were to expand the search to texts from 1500 – 2022, the modern usage pales in comparison to its usage throughout the 16th century. Did this answer my question either? No, but I did find this to be an informative aside.

Lexos

The last software that I tried was Lexos, a data cleaning and analysis tool from Wheaton College. My original MIT link resulted in unwanted HTML that I couldn’t scrub, so I downloaded a text file of the script from Project Gutenberg and trimmed the front matter (title, Dramatis Personae, and scene list) and end matter (terms of use). Then, I went to the scrub function and tried to: make all text lowercase to avoid case sensitivity, remove digits, and remove punctuation but keep apostrophes. I had to replace the Roman numerals for Act and Scene numbers with digits so that I wouldn’t get an improper count of “I”s in the text. I attempted to remove all words but pronouns and character names, but the result wasn’t as revealing as I’d hoped. Fascinating, but not revealing. Then, I took out a number of the stop words (e.g. the, an, a, that, enter, exit) to attempt to get a more comprehensive look at word frequency, resulting in this word cloud:
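For anyone who wants to reproduce the scrubbing steps without Lexos, here is a rough Python sketch of the same three choices (lowercase, remove digits, remove punctuation but keep apostrophes). It is an approximation under my own assumptions, not how Lexos implements its scrubber, and the sample line is invented.

```python
import re

def scrub(text):
    """Lowercase, drop digits, and drop punctuation except apostrophes."""
    text = text.lower()
    # Treat the curly apostrophe as a straight one so contractions survive.
    text = text.replace("\u2019", "'")
    text = re.sub(r"\d+", "", text)
    # Keep a-z, whitespace, and apostrophes; replace everything else with a space.
    # Assumption: the text is plain English, so accented letters are not handled.
    text = re.sub(r"[^a-z'\s]", " ", text)
    # Collapse the runs of whitespace left behind.
    return re.sub(r"\s+", " ", text).strip()

# Hypothetical sample line, not a quotation from the play.
sample = "ACT 1, SCENE 2. I'll not be sad; 'tis folly!"
print(scrub(sample))
```

Because the Act and Scene numbers are digits by this point, removing digits also removes them from the word counts.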

From this cloud, we can see that the first and second person are the most popular, but I still wanted more information. So, I went into the Content Analysis tab and entered text file dictionaries for each pronoun type I was looking for: first person, second person, third person masculine, third person feminine, they pronouns, and themself (out of curiosity). For second person, I included both forms of you and thou. I then selected all of the dictionaries and hit analyze. Lexos generated a table of how many times each pronoun type occurred – first was first person, then second, then third person masculine, then third person feminine, then they pronouns (sadly, no record of themself).

Interestingly, in As You Like It, the most popular feminine pronoun is “her”, which was used 81 times, as opposed to the most popular pronouns for first person – I: 503 times, second person – you: 422 times, third person masculine – he: 180 times, and they – they: 48 times. “She” occurred in the text 46 times.
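The dictionary approach can be sketched in a few lines of Python. The word lists below are my guesses at what such dictionaries might contain, not the exact files I loaded into Lexos, and the sample sentence is invented.

```python
import re
from collections import Counter

# Hypothetical dictionaries; the exact word lists used in Lexos are not reproduced here.
DICTIONARIES = {
    "first person": {"i", "me", "my", "mine", "myself"},
    "second person": {"you", "your", "yours", "thou", "thee", "thy", "thine"},
    "third person masculine": {"he", "him", "his", "himself"},
    "third person feminine": {"she", "her", "hers", "herself"},
    "they": {"they", "them", "their", "theirs", "themselves"},
}

def count_by_dictionary(text):
    """Return the total number of hits for each pronoun dictionary."""
    tokens = re.findall(r"[a-z']+", text.lower())
    word_counts = Counter(tokens)
    return {
        name: sum(word_counts[word] for word in words)
        for name, words in DICTIONARIES.items()
    }

# Invented sample sentence, not a quotation from the play.
sample = "I told her that he would meet you, and she gave me thy letter."
totals = count_by_dictionary(sample)
print(totals)
```

Run on the full scrubbed script, a table like this would give the per-type totals rather than only the most popular pronoun in each group.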

Conclusion

Did I (eventually) answer my original question? Yes, thanks to Lexos, I was eventually able to find out the exact number of times pronouns appeared in As You Like It. Am I surprised that the number of feminine pronouns was significantly lower than first person, second person, or third person masculine? Not entirely – since a play involves characters either monologuing their inner thoughts or speaking with each other, I’m not surprised that first and second person pronouns had such high representation. The fact that the masculine pronouns occurred roughly three times as often as feminine pronouns was surprising – though the characters are mostly men, the lead is a woman, and there is quite a lot of pining at the center of this plot. The fact that the most popular feminine pronoun was the objective her rather than the nominative she was in fact surprising.

However, while I was able to find some information, my next step is to dig deeper into whom the pronouns refer to. Is it neat to know how often different pronouns occur and compare them? Sure. Is there a lot of context missing that would make this study even richer? Absolutely. In the future, I’d like to do a web scrape of the script to find who is speaking when each pronoun occurs and at what time so I can go back into the text and analyze who is being referred to. Is Rosalind (/ Ganymede, her male alter ego) referred to more with feminine pronouns or masculine pronouns? How often is each character referred to by pronouns? Do any characters have an interesting balance of pronoun types (e.g. does Jaques have more objective pronouns used toward him than nominative)? I had originally selected As You Like It for its gender subversion. While I answered my literal original question, I feel somewhat unsatisfied by the lack of an answer to the spirit behind my original question.
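As a first step toward that follow-up, here is a hedged Python sketch of tallying gendered pronouns by speaker. It assumes a plain-text script in which each speech begins with a line holding only the speaker’s name in capitals followed by a period (for example “ROSALIND.”); that format is an assumption and the real file would need to be checked. The sample lines are invented, and the names `SPEAKER_CUE` and `pronouns_by_speaker` are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumed cue format: a line holding only the speaker's name in capitals, e.g. "ROSALIND."
SPEAKER_CUE = re.compile(r"^([A-Z][A-Z ]+)\.$")
FEMININE = {"she", "her", "hers", "herself"}
MASCULINE = {"he", "him", "his", "himself"}

def pronouns_by_speaker(lines):
    """Tally feminine and masculine pronouns spoken by each character."""
    counts = defaultdict(Counter)
    speaker = None
    for line in lines:
        cue = SPEAKER_CUE.match(line.strip())
        if cue:
            speaker = cue.group(1)
            continue
        if speaker is None:
            continue  # skip anything before the first speaker cue
        for token in re.findall(r"[a-z']+", line.lower()):
            if token in FEMININE:
                counts[speaker]["feminine"] += 1
            elif token in MASCULINE:
                counts[speaker]["masculine"] += 1
    return counts

# Made-up lines in the assumed format, not quotations from the play.
sample = [
    "ORLANDO.",
    "I would speak with her, if she would hear me.",
    "CELIA.",
    "He is gone, and his sister follows him.",
]
result = pronouns_by_speaker(sample)
print({name: dict(tally) for name, tally in result.items()})
```

This only records who says each pronoun. Working out who the pronoun refers to would still mean going back into the text and reading.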

Praxis: Text Mining

Wow! Voyant is so powerful and accessible. Personally, I am really curious about what goes into building a tool like this.

The thing I enjoyed the most about exploring the text mining resources was actually looking through the Library of Congress’ Chronicling America database. How rich! I think it would be really interesting to try to look through this specifically for Black newspapers throughout the country over time.

I had some “fun” looking at mentions of Palestine pre-1948 in Nebraska newspapers:

Omaha daily bee. (Omaha [Neb.]), 08 Jan. 1911. Chronicling America: Historic American Newspapers. Lib. of Congress.

I played with entering some of these texts in Voyant, as well as personal writing and cover letters. I found this tool easier to experiment with and reason about than the more general-purpose data visualization tools, which makes sense because it is designed for a narrower set of use cases.

As with our other praxis assignments, it’s hard for me to really get in the weeds and reason about the use of these tools without a defined problem or question to work through. I think so much of what is interesting about Digital Humanities is the lines of inquiry that open up to us when using these powerful tools, and I’m interested in better learning how to ask meaningful questions.

Blog Post Textual Analysis Praxis

Text Analysis Praxis Assignment

I chose to experiment with Voyant and word tree for the text analysis praxis. I wanted a text that I was familiar with so that I could at least kind of understand what I was looking at, so I went to Project Gutenberg to see what was available. I was excited to find The Well of Loneliness by Radclyffe Hall. Project Gutenberg is an awesome resource that I enjoyed exploring and hope to continue to use in the future.

The first tool I used was Voyant. I liked the various charts that this tool produced and the ease of viewing/playing with the different visualizations.

Voyant – The Well of Loneliness

I didn’t think it was all that illuminating to see that “like”, “little”, and “said” were among the most used words. I was also unsure of how to read or manipulate the data in any meaningful way based on the initial output. But once I played around a little bit, things got more interesting. I decided to look at the main character’s love interests throughout the novel and compare them to instances of the word “longing”. I wanted to use the words “longing”, “longed”, and “lonely”, but I could not figure out how to combine them into one category. I think if I had played around with the tool more I would have been able to figure that out. Based on the data, “longing” was the word in this cluster used the most throughout the book, so I chose to use that word for my analysis.
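Grouping several word forms into one category is straightforward to sketch outside Voyant. The Python below counts any of “longing”, “longed”, or “lonely” across equal-sized segments of a text, roughly the way a trends chart splits a document. The function name and the sample sentences are made up for illustration; this is not how Voyant itself does it.

```python
import re

LONGING_WORDS = {"longing", "longed", "lonely"}  # the three forms named above

def category_counts_by_segment(text, words, segments=10):
    """Count hits for any word in `words` within each equal-sized segment."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = [0] * segments
    for position, token in enumerate(tokens):
        if token in words:
            # Map the token's position to one of the equal-sized segments.
            counts[position * segments // len(tokens)] += 1
    return counts

# Invented sample, not a quotation from the novel.
sample = "She longed for home. The lonely road went on. Years passed and the longing stayed."
print(category_counts_by_segment(sample, LONGING_WORDS, segments=3))
```

With the full novel and character names as a second category, the two sets of counts could be compared segment by segment.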

Briefly, the novel follows Stephen, a lesbian in early 20th century England. Collins is her tutor (childhood), Angela is a friend with whom she has a relationship (adolescence/early 20s), and Mary is arguably the love of her life (adulthood). I think it’s interesting that Stephen’s feelings of longing are heightened when she is in a relationship. In the novel Stephen is obviously queer: she wears “men’s” clothes, doesn’t marry, does traditionally masculine activities, etc. Whether it’s a symptom of the time or genuine attraction, Stephen dates feminine women who are often betrothed to men (Angela) or who face discrimination and a harder life, which prompts Stephen to push them away and into the arms of a man (Mary). I think you could extrapolate that these factors influence her feelings of longing.

The other tool I explored for this praxis was word tree. I really liked the interface and how the user interacts with the text. It was helpful to have the full quote highlighted on the side of the page, which I think would be really useful for performing close readings of texts. This tool also seemed to capture the overall themes of the novel better than the Voyant analysis. As a fun little treat, the results also read like poetry to me.

word tree – The Well of Loneliness

Blogpost: (PRAXIS) Text Analysis of the US Constitution using Voyant for the first time.

My Experience with Voyant.

Before resolving to use Voyant, I initially explored Google N-gram but found it “kinda” difficult to navigate for deeper insights. Voyant, on the other hand, felt much more user-friendly, especially with its collection of very helpful features. The Cirrus tool, which creates a word cloud, stood out immediately. It highlights the most frequent words in a corpus, offering a quick, visual snapshot of key terms. Another useful feature was Terms, which displays the frequency of terms across the document, making it easy to track word usage patterns.

Links, a network diagram tool, was particularly helpful for exploring how words co-occur, offering insight into relationships between key concepts. The Reader view displayed the full text, allowing me to highlight and analyze terms within the document directly. Additionally, TermsBerry, a playful bubble chart, allowed me to visualize word frequency and connections in an engaging manner.

Other features, such as Trends, as well as Context and Bubblelines, added even more depth to the analysis. Voyant also provides statistics such as word counts, vocabulary density, and readability scores, making it not only visually engaging but also a quantitative tool for text analysis. Its ability to generate instant visual feedback and downloadable outputs made it ideal for my praxis.

Analyzing the U.S. Constitution

First, as part of the mining, I searched Google for a .txt file of the US Constitution and found THE CONSTITUTION OF THE UNITED STATES OF AMERICA As Amended on www.govinfo.gov. I selected all of the text, copied it, and pasted it into Voyant for analysis.

Using Voyant to analyze the U.S. Constitution was an interesting experience. The corpus was a single document, containing 39,243 words and 1,896 unique word forms. Voyant’s summary statistics revealed key insights, such as a vocabulary density of 0.048, indicating high repetition in language, and a readability index of 10.001, suggesting that the text is accessible to a broad audience.
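The vocabulary density figure can be checked by hand. Assuming it is simply the number of unique word forms divided by the total number of words (which matches the numbers Voyant reported), a quick Python check looks like this:

```python
total_words = 39_243   # total words reported by Voyant
unique_forms = 1_896   # unique word forms reported by Voyant

# Assumed definition: unique word forms divided by total words.
vocabulary_density = unique_forms / total_words
print(round(vocabulary_density, 3))
```

A value this close to zero means each distinct word is reused many times on average, which fits the repetitive legal phrasing.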

From the Cirrus tool, it was revealed that the most frequent words in the text were terms like “shall” (1,268 occurrences), “states” (592), “congress” (396), “state” (387), and “president” (370). These terms reflect the U.S. Constitution’s focus on governance, authority, and the distribution of power.

Voyant Cirrus for analysis of The US Constitution

The Links tool allowed me to explore how these terms are connected. For example, it was interesting to see how frequently “states” and “congress” appeared together, highlighting their relationship in the text.
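To get a feel for what co-occurrence means, here is a simple Python sketch that counts which words appear within a few tokens of a target word. It is only a plain window count under my own assumptions, not the measure the Links tool uses, and the sample text is made up in the style of the document rather than quoted from it.

```python
import re
from collections import Counter

def cooccurrences(text, target, window=5):
    """Count words appearing within `window` tokens of each `target` occurrence."""
    tokens = re.findall(r"[a-z']+", text.lower())
    neighbours = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        start = max(0, i - window)
        # Words before and after the target, excluding the target itself.
        context = tokens[start:i] + tokens[i + 1:i + 1 + window]
        neighbours.update(context)
    return neighbours

# Made-up sample in the style of the text, not a quotation.
sample = "The Congress shall have power. No State shall coin money. The Congress shall assemble."
near_shall = cooccurrences(sample, "shall", window=1)
print(near_shall.most_common(3))
```

A wider window picks up looser associations, while a window of one only captures immediate neighbours.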

The Reader view allowed me to read the full document while tracking specific words, and TermsBerry provided an interactive visualization of word frequency, which made it easy to explore patterns and relationships between terms.

The Trends tool combines line and bar charts to show term frequency across the document.

In summary, Voyant offered a visually engaging, data-oriented approach to analyzing the U.S. Constitution, making the analysis colorful, accessible, and insightful. As a prospective Digital Humanist, I will very likely be using it much more in the future.

Kelechi Iwuagwu (A Data Analysis & Viz Candidate, CUNY Grad Center)

Introduction to Digital Humanities Fall 2024

thinking, writing, and reading digitally