The WallStreetBets Sentiment Algorithm Explained

Quantbase
2 min read · Jun 15, 2021

We’re kicking our sentiment venture off with a strategy that automatically follows posts and comments about tickers on the Reddit community r/WallStreetBets. The algorithm scans the most popular trading communities and logs the tickers mentioned in due-diligence or discussion-style posts. Instead of just counting how many times each ticker is mentioned in a comment, we also log how popular the post is within its subreddit. Essentially, if a post makes it to the ‘hot’ page, regardless of the subreddit, its tickers will most likely be on this list.
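To make the "mentions plus popularity" idea concrete, here is a minimal sketch in Python. It is not our production code: the sample posts, the `score` field standing in for upvotes, the `KNOWN_TICKERS` whitelist, and the `rank_tickers` helper are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical sample posts; a real pipeline would pull these from Reddit.
# "score" stands in for the post's upvote count (an assumption of this sketch).
POSTS = [
    {"title": "GME DD: why the squeeze is not over",
     "body": "Holding $GME and some AMC.", "score": 1200},
    {"title": "Thoughts on AMC earnings",
     "body": "AMC looks interesting. YOLO.", "score": 300},
    {"title": "Daily discussion",
     "body": "I like $TSLA but the CEO tweets too much.", "score": 50},
]

# Assumed whitelist of real symbols, so that all-caps words such as
# "DD", "YOLO" or "CEO" are not mistaken for tickers.
KNOWN_TICKERS = {"GME", "AMC", "TSLA"}

# One to five capital letters, optionally prefixed with "$".
TICKER_RE = re.compile(r"\$?\b([A-Z]{1,5})\b")


def rank_tickers(posts, known=KNOWN_TICKERS):
    """Return (mentions, weighted) counters keyed by ticker."""
    mentions = Counter()   # number of posts that mention the ticker
    weighted = Counter()   # the same count, weighted by post popularity
    for post in posts:
        text = f"{post['title']} {post['body']}"
        # A set, so each post counts once per ticker however often it repeats it.
        found = {m for m in TICKER_RE.findall(text) if m in known}
        for ticker in found:
            mentions[ticker] += 1
            weighted[ticker] += post["score"]
    return mentions, weighted


mentions, weighted = rank_tickers(POSTS)
for ticker, total in weighted.most_common():
    print(ticker, mentions[ticker], total)
```

Weighting by popularity is what lets one heavily upvoted post outrank a ticker that shows up in several posts nobody read. Whether to add raw scores, as done here, or to dampen them first is a design choice this sketch does not try to settle.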

So, how is this sentiment calculated? We’ll give you the general view today and save a walkthrough of the source code for a later blog post. The algorithm uses VADER (Valence Aware Dictionary and sEntiment Reasoner), a model for text sentiment analysis that is sensitive to both the polarity (positive/negative) and the intensity (strength) of emotion. It relies on a dictionary that maps lexical (aka word-based) features to emotion intensities, known as sentiment scores. The overall sentiment score of a comment or post is obtained by summing the intensity of each word in the text; VADER then normalizes that sum into a compound score between -1 and 1.

In some ways, it’s easy: words like ‘love’, ‘enjoy’, ‘happy’, and ‘like’ all convey a positive sentiment. VADER is also smart enough to understand the basic context of these words, so it reads “didn’t really like” as a rather negative statement. It also picks up on the emphasis added by capitalization and punctuation, such as “I LOVED”, which is pretty cool. Phrases like “The turkey was great, but I wasn’t a huge fan of the sides” carry sentiment in both polarities, which makes this kind of analysis tricky; essentially, VADER weighs which part of the sentence is more intense.

There’s still room for more fine-tuning here, but be careful not to overdo it. Trying too hard to fit existing data is a well-known problem in statistics called overfitting, and you don’t want to fall into it.
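Until the source-code walkthrough is out, here is a toy sketch of the lexicon idea in plain Python. It is not VADER and it is not our algorithm: the tiny `LEXICON`, the `NEGATION_FACTOR` and `CAPS_BOOST` constants, the three-word negation window, and the `ALPHA` value in the normalization are all illustrative assumptions. It only shows the mechanics described above: look up each word, adjust for negation and capitalization, sum, and squash the result into the range -1 to 1.

```python
import math
import re

# Toy lexicon with made-up valence values; VADER's real dictionary is far larger.
LEXICON = {
    "love": 3.0, "loved": 3.0, "enjoy": 2.0, "happy": 2.5,
    "like": 1.5, "great": 3.0, "hate": -3.0, "bad": -2.5,
}
NEGATIONS = {"not", "never", "didn't", "wasn't", "don't", "isn't"}

NEGATION_FACTOR = -0.75  # assumed: a negated word flips sign and is dampened
CAPS_BOOST = 0.75        # assumed: extra intensity for ALL-CAPS words
ALPHA = 15               # assumed constant for the normalization below


def score(text):
    """Return a toy sentiment score strictly between -1 and 1."""
    tokens = re.findall(r"[A-Za-z']+", text)
    total = 0.0
    for i, token in enumerate(tokens):
        valence = LEXICON.get(token.lower())
        if valence is None:
            continue  # word carries no sentiment in our lexicon
        # Capitalization adds emphasis in the word's own direction.
        if token.isupper() and len(token) > 1:
            valence += CAPS_BOOST if valence > 0 else -CAPS_BOOST
        # A negation within the previous three words flips the sentiment.
        window = [t.lower() for t in tokens[max(0, i - 3):i]]
        if any(t in NEGATIONS for t in window):
            valence *= NEGATION_FACTOR
        total += valence
    # Squash the raw sum into (-1, 1).
    return total / math.sqrt(total * total + ALPHA)


for sentence in ["I love it", "I didn't really like it", "I LOVED it"]:
    print(f"{sentence!r}: {score(sentence):+.3f}")
```

With these made-up numbers, “I didn't really like it” comes out negative and “I LOVED it” scores higher than “I loved it”. The sketch does not handle mixed sentences like the turkey example, which is exactly where a real model has to do more work.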
