Data Science as Movement Building


Civic Education via Natural Language Processing

As we evaluate the long-term impacts of capitalism and consider alternatives for our future, it is important to understand the underlying principles of socio-economic systems that we have never experienced firsthand or that have been systematically vilified through pro-capitalist propaganda. Nothing exemplifies this more than the misinformation surrounding Socialism and Communism. There is much to be learned from those who have studied, debated, and lived through these systems. One way to disseminate this information more easily and promote more universal civic education is to build and harmonize a collection (corpus) of these principles and compare them to the status quo. Web scraping and Natural Language Processing can assist with this and provide useful context on how ideologies are currently viewed and how they have evolved over time.

In this study I build a text classification algorithm that can be used as the basis for developing civic education tools by extracting the most common and most distinctive themes in communist and socialist political discourse.
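To make the idea concrete, here is a minimal sketch of one way such a classifier could work. It is not the model built in this study: it assumes a multinomial Naive Bayes classifier written with only the Python standard library, and the training sentences, labels, stopword list, and function names (`train`, `classify`, `distinctive_terms`) are all hypothetical and invented for illustration. A real corpus would come from scraped and cleaned text.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy corpus: invented sentences, not data from the study.
TRAIN = [
    ("workers should collectively own the means of production", "socialism"),
    ("public ownership and collective planning serve the workers", "socialism"),
    ("private ownership and free markets drive competition", "capitalism"),
    ("competition among private firms sets market prices", "capitalism"),
]

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "of", "and"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def train(examples):
    """Count words and documents per label for a multinomial Naive Bayes model."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab

def log_prob(word, label, word_counts, vocab):
    """Laplace-smoothed log P(word | label)."""
    total = sum(word_counts[label].values())
    return math.log((word_counts[label][word] + 1) / (total + len(vocab)))

def classify(text, model):
    """Return the label with the highest posterior log-score for the text."""
    word_counts, doc_counts, vocab = model
    n_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        score = math.log(doc_counts[label] / n_docs)  # log prior
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += log_prob(word, label, word_counts, vocab)
        scores[label] = score
    return max(scores, key=scores.get)

def distinctive_terms(label, model, k=3):
    """Rank words by how much likelier they are under `label` than under any other label."""
    word_counts, doc_counts, vocab = model
    others = [other for other in doc_counts if other != label]

    def score(word):
        own = log_prob(word, label, word_counts, vocab)
        rest = max(log_prob(word, other, word_counts, vocab) for other in others)
        return own - rest

    # Sort by descending score, breaking ties alphabetically.
    return sorted(vocab, key=lambda w: (-score(w), w))[:k]

model = train(TRAIN)
print(classify("workers own production collectively", model))
print(distinctive_terms("socialism", model))
```

On this toy data, "workers" ranks first among the distinctive terms for the socialism label, because it appears twice under that label and never under the other. The same log-ratio score is one simple way to surface the themes most specific to each body of discourse; with a larger corpus, a weighting scheme such as TF-IDF and a held-out test set would be the natural next steps.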