| | text_length | count |
|---|---|---|
| 0 | <=05 | 28704 |
| 1 | <=10 | 52709 |
| 2 | <=15 | 18678 |
| 3 | <=20 | 6262 |
| 4 | <=25 | 2171 |
| 5 | <=30 | 838 |
| 6 | <=40 | 596 |
| 7 | <=50 | 183 |
| 8 | >50 | 106 |
Executive summary
In our analysis of the r/anime community, we found a balanced mix of positive and negative sentiment. Just over half of submission titles (52.05%) show a positive tone, which suggests a generally upbeat and harmonious atmosphere in the community. We also found that non-controversial comments far outnumber controversial ones, highlighting the community’s preference for calm discussions despite its large size.
Our study also ventured into sentiment analysis over time for popular anime series like One Piece, Pokémon, and Naruto. We observed a declining trend in positive sentiment for some series, indicating changing interests or perceptions among fans. For Pokémon, despite high engagement, there was a notable amount of negative sentiment, suggesting complex fan relationships with the series.
Furthermore, our analysis of Pokémon subreddit sentiments over time revealed interesting patterns. We noticed more positive sentiments in comments than in submissions, suggesting a difference in how users express satisfaction or criticism. Our findings also highlighted certain events that caused significant shifts in sentiment, like the Pokémon Go event featuring Latias and Latios and the Eevee Day events, which influenced fan discussions and reactions.
Lastly, our examination of sentiment by day of the week and hour of the day for Pokémon-related discussions uncovered patterns linked to the community’s engagement rhythms. Weekdays generally saw more positive interactions, while weekends had more critical discussions. Similarly, positive sentiments were more frequent in the afternoon and early evening, while late evenings saw an increase in negative sentiments.
These findings on the anime community’s sentiment dynamics offer a clearer understanding of the nature of fan engagement and the factors influencing fans’ perceptions and discussions.
Analysis report
Basic Text Analysis
| | text_length | count |
|---|---|---|
| 0 | <=10 | 7524 |
| 1 | <=20 | 11460 |
| 2 | <=30 | 12833 |
| 3 | <=40 | 11887 |
| 4 | <=50 | 9311 |
| 5 | <=60 | 7986 |
| 6 | >60 | 49246 |
| | text_length | count |
|---|---|---|
| 0 | <=010 | 2384465 |
| 1 | <=030 | 2282496 |
| 2 | <=050 | 881447 |
| 3 | <=070 | 414939 |
| 4 | <=100 | 365355 |
| 5 | <=140 | 232981 |
| 6 | <=200 | 182853 |
| 7 | >200 | 134583 |
Figure 1 illustrates the distribution of submission title length, categorized into nine groups. The distribution is skewed to the right: about 74% of titles (81,413 of 110,247) contain 10 words or fewer. Titles of 6 to 10 words are the most common group, and titles of 5 words or fewer form the second most common group. This pattern contrasts with submission selftext and comments, which is not surprising. Typically, users try to keep titles concise and to the point, opting to provide more detailed information in the selftext.
Figure 2 illustrates the distribution of submission selftext length, categorized into seven groups. Unlike titles, the largest group by far is selftext of more than 60 words (49,246 of 110,247 submissions, about 45%). Since the title only states the main idea, users tend to write more in the selftext to fully explain their submissions to other users.
Figure 3 illustrates the distribution of comment body length, categorized into eight groups. The first two groups, which cover comments of 30 words or fewer, are the most common and together account for about 68% of comments. Comments of 31 to 50 words are also common. Given that comments often serve as a platform for discussions related to submission topics, it is not surprising to observe varied lengths. However, comments tend to be fairly short, fostering a back-and-forth dynamic.
The analysis of submission title, selftext, and comment body length distributions reveals distinct patterns. Submission titles are predominantly concise, with the majority containing 10 words or fewer, while selftext tends to be longer, with the largest group exceeding 60 words. Comments, which serve as a discussion platform, most commonly contain 30 words or fewer, fostering interactive and concise exchanges.
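The grouping behind these tables can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the report: it assumes that length is counted in whitespace-separated words, and the names `bin_label`, `length_distribution`, `TITLE_EDGES`, and `COMMENT_EDGES` are hypothetical. The bin edges are read off the table labels.

```python
from bisect import bisect_left
from collections import Counter

# Upper bin edges taken from the table labels (assumed to be word counts).
TITLE_EDGES = [5, 10, 15, 20, 25, 30, 40, 50]
COMMENT_EDGES = [10, 30, 50, 70, 100, 140, 200]


def bin_label(n_words, edges, width=2):
    """Return the label of the first bin whose upper edge is >= n_words."""
    i = bisect_left(edges, n_words)
    if i == len(edges):
        # Longer than the last edge: overflow bin such as ">50".
        return f">{edges[-1]}"
    return f"<={edges[i]:0{width}d}"


def length_distribution(texts, edges, width=2):
    """Count how many texts fall into each word-length bin."""
    counts = Counter(bin_label(len(t.split()), edges, width) for t in texts)
    return dict(counts)


titles = ["a b c", "one two three four five six", ""]
print(length_distribution(titles, TITLE_EDGES))
```

Zero-padding the labels (`<=05`, `<=010`) keeps the bins in order when they are sorted as strings, which is presumably why the tables use it.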
Submission Title
Figure 4 shows the 20 most common words in submission titles. The most frequently used words are closely associated with recommendations. Common words such as ‘like,’ ‘good,’ ‘rewatch,’ ‘recommendations,’ and ‘best’ suggest a general theme of users seeking or providing anime suggestions. The presence of words like ‘anime,’ ‘episode,’ ‘watch,’ and ‘season’ among the top 20 words reflects a strong connection to the anime content itself.
The common words generated via TFIDF for submission titles in Figure 5, by contrast, differ from the most common words and are closely tied to Japanese words and culture.
Some of these words are:
- Dango: a Japanese sweet dumpling.
- Muri (Japanese): “muri” (無理) means “impossible” or “unreasonable”.
- Toaru (Japanese): Toaru Majutsu no Indekkusu (A Certain Magical Index), a popular light novel series.
- Kougeki (Japanese): “Kougeki” (攻撃) means “attack”.
- Zutto (Japanese): “zutto” (ずっと) means “always” or “forever”.
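The difference between the raw-frequency ranking and the TFIDF ranking follows from how TFIDF weights terms: a word that appears in almost every title gets a weight near zero, while a word concentrated in a few titles is boosted. The sketch below uses the textbook formula tf × log(N / df) on made-up token lists. The report does not state which TFIDF variant or smoothing was used, so treat the formula and the function name `tfidf` as assumptions.

```python
import math
from collections import Counter


def tfidf(docs):
    """docs: list of token lists. Returns one {term: score} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        term_counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return scores


# Made-up titles, already tokenized, for illustration only.
docs = [
    ["anime", "watch", "dango"],
    ["anime", "watch", "episode"],
    ["anime", "good", "episode"],
]
scores = tfidf(docs)
print(max(scores[0], key=scores[0].get))  # the rarest term in the first title
```

Here ‘anime’ appears in every document and scores 0, while ‘dango’ appears in only one and scores highest, which mirrors how rare Japanese terms can surface in a TFIDF ranking.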
Submission Selftext
Figure 6 shows the 20 most common words in submission selftext. They are very similar to those in submission titles and are also closely related to recommendations, with additional words like ‘please’.
Figure 7 shows the common words generated via TFIDF for submission selftext. They also include many Japanese words:
- Kita (Japanese): “kita” (来た) means “came” or “arrived”.
- Ryo (Japanese):
- “ryo” (良) as a given name.
- “ryō” (両) was a gold currency unit.
- Gagumber (Japanese): a character in “Gagumber the Gale” (疾風のガガンバー).
- Saori (Japanese): “Saori” as a Japanese given name.
In addition, the presence of words like ‘irony’ and ‘situational’ indicates users’ exploration of content and preferences.
External Data Analysis
Data source: Yahoo Finance
| | week | total_submissions | total_comments | NTDOY | SONY | TYO |
|---|---|---|---|---|---|---|
| 108 | 2023-01-22 | 707 | 48227.0 | 10.734 | 89.502000 | 11.827641 |
| 109 | 2023-01-29 | 640 | 37987.0 | 10.772 | 91.054001 | 11.796592 |
| 110 | 2023-02-05 | 634 | 52538.0 | 10.146 | 90.430002 | 12.403984 |
| 111 | 2023-02-12 | 626 | 50620.0 | 10.036 | 87.999998 | 12.794034 |
| 112 | 2023-02-19 | 598 | 51840.0 | 9.840 | 82.777500 | 13.246667 |
| 113 | 2023-02-26 | 524 | 47569.0 | 9.394 | 83.927998 | 13.407248 |
| 114 | 2023-03-05 | 248 | 39966.0 | 9.402 | 86.634000 | 13.191847 |
| 115 | 2023-03-12 | 249 | 39501.0 | 9.486 | 85.529999 | 12.037221 |
| 116 | 2023-03-19 | 267 | 44612.0 | 9.570 | 88.206000 | 11.796666 |
| 117 | 2023-03-26 | 159 | 24835.0 | 9.670 | 88.058000 | 11.973750 |
Figure 10. External Data
As mentioned during the EDA phase for the external data, we selected stock prices of anime production companies such as the Nintendo ADR (NTDOY) for Pokémon and the Tokyo Stock Exchange (TYO) to represent the broader anime production industry. Notably, we decided to exclude TOEAF and NCBDF from our external stock prices dataset due to their status in the US stock market, and SONY as well due to its broader association with electronics rather than anime production.
The graph shows weekly total submissions and comments compared with the average adjusted stock prices of NTDOY and TYO from January 2021 to March 2023. The x-axis denotes time, and the secondary y-axis on the right side represents stock prices. Both total submissions and comments exhibit a decreasing trend, mirroring the trend observed in NTDOY stock prices, whereas TYO (in green) started to increase after March 2022. Since TYO is a more general stock index, it serves as a reference rather than a direct representation of anime production companies’ status in our case. Given the similarity in trends between Nintendo’s stock price and anime subreddit activity, we next dig into the Pokémon subreddit within the anime community to gain further insight.
Note that we chose a weekly basis because stock exchanges trade only on business days. In 2023, there are exactly 250 trading days, and 105 of the 365 days are weekend days when the stock exchanges are closed. A weekly basis is therefore a better aggregation unit for comparing stock price fluctuations with anime subreddit activity over time.
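The weekly alignment described above can be sketched as follows. This is only an illustration under assumptions: the week labels in the table (for example 2023-01-22) fall on Sundays, so the sketch assigns each day to the Sunday that ends its week, but the report does not say whether labels mark the start or the end of a week. The function names and the sample prices are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_ending_sunday(d):
    """Map a date to the Sunday that closes its Monday-to-Sunday week."""
    # date.weekday(): Monday is 0, Sunday is 6.
    return d + timedelta(days=6 - d.weekday())


def weekly_mean(daily):
    """daily: iterable of (date, value). Returns {week_end: mean value}."""
    buckets = defaultdict(list)
    for day, value in daily:
        buckets[week_ending_sunday(day)].append(value)
    return {week: sum(vals) / len(vals) for week, vals in sorted(buckets.items())}


# Made-up daily closing prices; weekends simply have no rows.
prices = [
    (date(2023, 1, 17), 10.0),
    (date(2023, 1, 20), 11.0),
    (date(2023, 1, 23), 12.0),
]
print(weekly_mean(prices))
```

Because days without trading contribute no rows, each weekly mean is taken over trading days only, and subreddit counts summed with the same week key line up with the prices week by week.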
NLP
Text Cleaning Pipeline
We built a Spark NLP pipeline to clean our text. The pipeline includes six stages in total, as follows:
- DocumentAssembler(): Transforms raw texts to document annotation
- DocumentNormalizer(): Removes all dirty characters from text following a regex pattern
- Tokenizer(): Identifies tokens with tokenization open standards
- LemmatizerModel.pretrained(): Finds lemmas out of words with the objective of returning a base dictionary word
- Stemmer(): Finds stems out of words
- StopWordsCleaner(): Drops all the stop words from the input sequences
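To make the order of the stages concrete, here is a toy plain-Python analogue of the cleaning flow. It is not Spark NLP code and does not reproduce its behaviour: the regex, the tiny stop-word list, and the suffix-stripping `toy_stem` are all assumptions made for illustration, and lemmatization is skipped because it needs a dictionary model.

```python
import re

# Tiny illustrative stop-word list, not the one used by Spark NLP.
STOP_WORDS = {"the", "is", "a", "an", "and", "of", "to", "in", "it", "i"}


def normalize(text):
    """Lowercase and replace every character that is not a-z, 0-9, or whitespace."""
    return re.sub(r"[^a-z0-9\s]", " ", text.lower())


def tokenize(text):
    """Split on whitespace."""
    return text.split()


def toy_stem(token):
    """Strip a few common English suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def clean(text):
    """Normalize, tokenize, stem, then drop stop words, in that order."""
    tokens = [toy_stem(t) for t in tokenize(normalize(text))]
    return [t for t in tokens if t not in STOP_WORDS]


print(clean("Watching the BEST episodes!!!"))
```

Note that this ASCII-only regex would also strip Japanese characters, so a real pipeline that needs to keep them would use a different pattern.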
Sentiment Analysis and Controversiality for r/anime
| Text Category | Positive | Negative |
|---|---|---|
| submission title | 52.05% | 47.95% |
| submission selftext | 49.42% | 50.58% |
| comment body | 45.53% | 54.47% |
The table shows no substantial gap between the proportions of positive and negative posts within the r/anime community. Over half of submission titles conveyed positive sentiment, and the split between positive and negative sentiment in submission selftext was nearly even. However, comments leaned towards negativity, with 54.47% classified as negative.
This overall scenario suggests a generally harmonious atmosphere within this community. The prevalence of submissions expressing positive emotions indicates a tendency among individuals to initiate discussions with a positive tone rather than with negative sentiments.
| Controversiality | Count |
|---|---|
| Non-controversial | 6768222 |
| Controversial | 110897 |
Non-controversial comments outnumber controversial ones by more than sixty-fold. This is consistent with the sentiment analysis for r/anime above, which suggests that the overall vibe of this community is calm and harmonious rather than extreme, despite the large number of people involved. This could be attributed to the broad appeal of anime: people mostly engage with anime as a hobby and for leisure, and rarely hold offensive views.
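The sixty-fold figure can be checked directly from the two counts in the table. The short calculation below uses only those counts; the variable names are illustrative.

```python
# Counts copied from the controversiality table.
non_controversial = 6_768_222
controversial = 110_897

ratio = non_controversial / controversial
share = 100 * controversial / (non_controversial + controversial)

print(f"{ratio:.1f} non-controversial comments per controversial one")
print(f"{share:.2f}% of all comments are controversial")
```

The ratio comes out at about 61 to 1, which means roughly 1.6% of all comments are flagged as controversial.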
Sentiment Analysis for Top-tier anime subreddits
Sentiment over time by anime series
| | sentiment | OnePiece | Pokemon | Naruto | OnePunchMan | YuGiOh | OnePiece_Perc (%) | Pokemon_Perc (%) | Naruto_Perc (%) | OnePunchMan_Perc (%) | YuGiOh_Perc (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | negative | 12461 | 39728 | 12535 | 2726 | 1616 | 25.213978 | 37.590599 | 24.892269 | 18.128616 | 31.118814 |
| 1 | neutral | 3367 | 7497 | 3373 | 652 | 396 | 6.812893 | 7.093655 | 6.698175 | 4.335971 | 7.625650 |
| 2 | positive | 33593 | 58461 | 34449 | 11659 | 3181 | 67.973129 | 55.315747 | 68.409556 | 77.535413 | 61.255536 |
The time series graphs present an intriguing comparative sentiment analysis for five beloved anime franchises: One Piece, Pokémon, Naruto, One Punch Man, and YuGiOh. A discernible declining trend in positive sentiment over time is evident for One Piece, Naruto, and One Punch Man, hinting at a possible waning interest or reception among their fan bases. In contrast, Pokémon and YuGiOh exhibit pronounced fluctuations, with noticeable peaks and troughs suggesting episodic events triggering varied fan reactions. Notably, YuGiOh experienced a significant sentiment bifurcation in July 2022, likely indicating a polarizing event that split the fanbase into starkly positive and negative camps. Despite the general downtrend for three franchises, an upsurge in positive sentiment across all series in October 2022 could point to seasonal releases or events that traditionally energize the anime community, perhaps correlating with new content releases or other significant developments that tend to cluster around the autumn season. This uptick might also imply a rejuvenation of the audience demographic, perhaps through strategic initiatives aimed at broadening viewership or revitalizing interest in these storied series.
Sentiment counts and percentages across anime series
The grouped bar chart of sentiment counts across anime series reveals that Pokémon garners the highest volume of commentary within the anime subreddit, but when evaluating sentiment percentages, a different narrative emerges. Pokémon stands out for having a disproportionately high percentage of negative feedback relative to its comment volume, coupled with the lowest positive sentiment percentage. This juxtaposition of high engagement and lower relative satisfaction is quite striking, especially when visualized in the line and area plots, indicating a complexity in fan engagement that warrants further exploration. When considering the reasons behind the sentiment distribution for Pokémon, the high negative sentiment could be indicative of a vocal minority within a large fan base or it might reflect recent controversies or disappointments in the franchise. The line plot and area chart amplify this story, revealing that despite the breadth of its audience, there’s a nuanced undercurrent of dissatisfaction that could be tied to specific developments or decisions within the Pokémon series. To unpack these dynamics, a focused analysis of the Pokémon subreddit could offer insights into specific issues or events that have influenced fan sentiment.
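The gap between raw counts and percentages can be reproduced from the sentiment table. The sketch below copies the counts from that table and normalizes each series by its own total; the helper name `to_percent` is hypothetical.

```python
# Sentiment counts per series, copied from the table above.
counts = {
    "OnePiece": {"negative": 12461, "neutral": 3367, "positive": 33593},
    "Pokemon": {"negative": 39728, "neutral": 7497, "positive": 58461},
    "Naruto": {"negative": 12535, "neutral": 3373, "positive": 34449},
    "OnePunchMan": {"negative": 2726, "neutral": 652, "positive": 11659},
    "YuGiOh": {"negative": 1616, "neutral": 396, "positive": 3181},
}


def to_percent(series_counts):
    """Convert one series' sentiment counts to percentages of its own total."""
    total = sum(series_counts.values())
    return {label: 100 * n / total for label, n in series_counts.items()}


percentages = {series: to_percent(c) for series, c in counts.items()}

most_discussed = max(counts, key=lambda s: sum(counts[s].values()))
least_positive = min(percentages, key=lambda s: percentages[s]["positive"])
print(most_discussed, least_positive)
```

Both lookups return Pokémon: it has the largest comment volume and the lowest positive share, which is the contrast the charts highlight.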
Sentiment distribution across anime series
The sunburst chart illustrates the sentiment distribution across different anime series, with each series segmented into positive, neutral, and negative sentiments. This visual suggests that while Pokémon has a substantial share of the discussion, its sentiment is not overwhelmingly positive. In contrast, One Piece and Naruto lean more clearly positive, each with roughly 68% positive and 25% negative comments. The chart helps to quickly identify which series are more polarizing versus those with a more balanced or positive sentiment profile. It also indicates the relative volume of discussion for each anime, as larger segments suggest more comments and a higher level of engagement within the community.
Comment Body
In contrast, the 20 most common comment body words in Figure 8 are more general, apart from words like ‘anime,’ ‘watch,’ and ‘episode’ that also appear in submission titles and selftext. This suggests that comment sections may serve as spaces for broader, more divergent and inclusive conversations about various anime-related topics than submissions. The 20 common words generated via TFIDF for comment bodies in Figure 9 contain some Japanese words and anime terms, but notably also words that are considered 18+ content.
The analysis of text data from submission titles, selftext, and comment bodies reveals distinct patterns. Submission titles predominantly focus on recommendations, with users frequently using words like ‘like,’ ‘good,’ and ‘rewatch.’ The selftext of submissions follows a similar trend but tends to be more detailed, providing additional context. Comment bodies, on the other hand, exhibit a broader and more inclusive vocabulary, suggesting that the comment sections foster diverse discussions about various anime-related topics.