Introduction

Background

Online discourse brings together diverse and often clashing opinions, which makes it important to understand how online communities form and interact. This project analyzes Reddit’s political and economic communities. We augment the Reddit data with external data from Federal Reserve Economic Data (FRED) to examine the socio-political, economic, and sentiment-driven dimensions of these discussions. In doing so, we aim to offer a holistic perspective on how online political discourse unfolds.

At the heart of this project is the ideological diversity represented across Reddit’s subreddits. The nine chosen subreddits, seven related to politics and two related to finance and economics, cover the four quadrants and the center of the political compass, and so span a broad spectrum of beliefs. The political subreddits serve as hubs for ideological exchange and mirror the range of societal viewpoints found online. Our examination goes beyond counting user activity: it looks at how online public discourse interacts with the socio-economic climate.

Figure 1: Political Compass

We examine these subreddits through several analytical lenses: activity metrics, sentiment analysis, topic modeling, and alignment with external economic indicators. The aim is to uncover patterns, trends, and correlations. Combining Reddit’s user-generated content with external economic data lets us relate online ideological discussions to their broader socio-economic context. Through this, we aim to offer insights into the interactions that characterize Reddit’s political and economic spheres and into how digital discourse reflects contemporary societal dynamics.

Our study relied on several cloud computing technologies and platforms. Given the scale of the Reddit data, we ran our analysis on Amazon Web Services (AWS) and Microsoft Azure ML, using Spark for data processing and analysis. We used Spark ML and Spark NLP for the machine learning and natural language processing tasks. To do so, we combined pre-trained models from John Snow Labs with custom models trained on our data.

Subreddit Submissions Comments
r/Ask_politics 5,903 60,149
r/changemyview 64,632 3,909,587
r/finance 28,904 137,118
r/Economics 40,604 1,428,423
r/Conservative 343,938 5,231,661
r/socialism 40,094 371,369
r/Liberal 11,086 96,396
r/Libertarian 51,153 2,706,903
r/centrist 13,594 921,871
Table 1: Number of Submissions and Comments in the Selected Subreddits from January 2021 to March 2023.
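
As a rough illustration of the kind of activity metric mentioned above, the counts in Table 1 can be turned into a comments-per-submission ratio. The sketch below is not the project's Spark pipeline; it is plain Python over a few rows copied from Table 1, and the names `TABLE_1` and `comments_per_submission` are hypothetical, introduced only for this example.

```python
# Illustrative only: (submissions, comments) pairs copied from a few rows of Table 1.
# TABLE_1 and comments_per_submission are hypothetical names, not from the project code.
TABLE_1 = {
    "r/changemyview": (64_632, 3_909_587),
    "r/finance": (28_904, 137_118),
    "r/Economics": (40_604, 1_428_423),
    "r/Conservative": (343_938, 5_231_661),
    "r/Libertarian": (51_153, 2_706_903),
}


def comments_per_submission(counts):
    """Return {subreddit: comments / submissions} for each row."""
    return {
        name: comments / submissions
        for name, (submissions, comments) in counts.items()
    }


ratios = comments_per_submission(TABLE_1)

# Rank subreddits from most to least comment-heavy.
ranked = sorted(ratios.items(), key=lambda item: item[1], reverse=True)

for name, ratio in ranked:
    print(f"{name:<16} {ratio:6.1f} comments per submission")
```

A ratio like this is only a coarse indicator of engagement. It says nothing about comment length, the number of distinct users, or moderation practices, so it is best read alongside the raw counts.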

About the Team

Raunak Advani
Tegveer Ghura
Eric Lim
Anthony Moubarak

Acknowledgment

We would like to disclose that we used Grammarly to assist with grammar and proofreading for this section.