Semantic & Natural Language Analysis of Open Source Communities

Open Source Software (OSS) communities have many avenues of communication available to them, including mailing lists, chat channels, and forums.

Mailing lists have become popular targets for mining sentiment and emotions, as they usually provide a centralized communication hub for members of a distributed OSS community. The sentiment and emotions expressed within a community can provide insight into its health. Understanding community health, in turn, supports initiatives aimed at fostering positive interactions between OSS community members, strengthening social ties, and helping the community accomplish its tasks.

This motivated a project to conduct sentiment analysis on the Fedora user and developer mailing lists. Over the summer, the project's focus shifted to detecting hate speech and offensive language. Over the next few months, multiple combinations of models and training sets will be tested to see which gives the best accuracy. A service will then be developed that can be used to analyze data from any OSS mailing list.
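
As a rough illustration of what testing combinations of models and training sets might look like, the sketch below scores every (model, training set) pair on one held-out test set and reports accuracy. Everything in it is an assumption made for illustration: the two toy classifiers (`MajorityBaseline`, `WordVote`), the names `TRAINING_SETS`, `TEST_SET`, and `evaluate_grid`, and the example messages are hypothetical and are not the project's actual models or data.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data: (text, label) pairs, where 1 = offensive and 0 = not.
TRAINING_SETS = {
    "set_a": [("you are an idiot", 1), ("thanks for the patch", 0),
              ("this is stupid garbage", 1), ("please review my change", 0)],
    "set_b": [("what an idiot", 1), ("great work on the release", 0)],
}
TEST_SET = [("you idiot", 1), ("thanks for the review", 0),
            ("stupid change", 1), ("great patch", 0)]


class MajorityBaseline:
    """Always predicts the most common label seen during training."""

    def fit(self, examples):
        counts = Counter(label for _, label in examples)
        self.label = counts.most_common(1)[0][0]
        return self

    def predict(self, text):
        return self.label


class WordVote:
    """Predicts the label whose training texts share the most words with the input."""

    def fit(self, examples):
        self.words = {0: Counter(), 1: Counter()}
        for text, label in examples:
            self.words[label].update(text.lower().split())
        return self

    def predict(self, text):
        tokens = text.lower().split()
        scores = {label: sum(counts[t] for t in tokens)
                  for label, counts in self.words.items()}
        # On a tie, max() keeps the first key, so label 0 wins.
        return max(scores, key=scores.get)


MODELS = {"majority": MajorityBaseline, "word_vote": WordVote}


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(model.predict(text) == label for text, label in examples)
    return correct / len(examples)


def evaluate_grid(models, training_sets, test_set):
    """Train each model on each training set and score it on the shared test set."""
    results = {}
    for model_name, set_name in product(models, training_sets):
        model = models[model_name]().fit(training_sets[set_name])
        results[(model_name, set_name)] = accuracy(model, test_set)
    return results


if __name__ == "__main__":
    results = evaluate_grid(MODELS, TRAINING_SETS, TEST_SET)
    for (model_name, set_name), score in sorted(results.items()):
        print(f"{model_name:10s} {set_name:6s} accuracy={score:.2f}")
    print("best combination:", max(results, key=results.get))
```

Scoring every combination against the same held-out test set keeps the accuracy figures comparable. In practice the toy classifiers would be replaced by real sentiment or offensive-language models, and the test set by labeled mailing list messages.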

Red Hat Intern: Cali Dolfi


