Using chat emotes as signals for content themes
At Twitch, our Science team is always working to better understand the complex three-sided marketplace that revolves around streamers, viewers, and game developers. With a strong knowledge of what motivates viewers to watch particular channels, we can build more personalized experiences for users to discover content that is relevant to their interests.
Our goal is to identify elements of a channel that are especially good at captivating viewers.
Individual streams are filled with a wide variety of moments, which makes it difficult to categorize an entire stream as exciting, funny, or sad. That’s where Clips come in! Clips are short video segments captured by our viewers whenever a noteworthy moment happens on stream. They are essentially microcosms of the best moments on Twitch and are a high-quality proxy for the different segments that make up a single piece of long-form content. Here is one of my personal favorite Clips:
Each Clip contains not only a snippet of video but the accompanying chat from the channel during that segment as well. The frenetic nature of chat makes for a very messy and complex data source, which makes extracting overarching content themes particularly hard. Luckily, Twitch chat contains emotes, which we believe encapsulate some aspects of viewer feelings and reactions very succinctly. Emotes also have a much smaller cardinality than the set of all words used in chat and generally carry over the same meaning across languages, making them a perfect way to capture the sentiment of a channel at any given time.
Building a Training Dataset
For this exercise, we construct our dataset from Twitch emote frequencies in Clip chat. We start by pulling the chat logs that occur between the start and end timestamps of each Clip made over a month, then sum the counts of each global emote used. Next, we weight these emotes for relevance, so that our transformed dataset consists of (clip URL, emote name, TF-IDF score) triplets for every Clip in the studied period:
The clip_slug above is a unique identifier for a Clip, while the TF-IDF score is a measure of each emote’s prevalence in the particular Clip, adjusted for its global frequency. In essence, it gives high weight to emotes that are used frequently in a given Clip but are not commonly used across all Clips. It also assigns a lower weight to emotes that are used infrequently in a given Clip, but have high prevalence across all Clips.
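To make the weighting concrete, here is a minimal sketch of one common TF-IDF variant applied to per-Clip emote counts. It is an illustration under stated assumptions, not our production pipeline: the function name `emote_tfidf`, the input shape, the sample slugs, and the exact formula (term frequency as a share of the Clip's emotes, inverse document frequency as `log(N / df)`) are all hypothetical choices.

```python
import math
from collections import Counter

def emote_tfidf(clip_counts):
    """Hypothetical helper: clip_counts maps clip_slug -> {emote: count}.
    Returns a list of (clip_slug, emote, tfidf) triplets."""
    n_clips = len(clip_counts)

    # Document frequency: number of Clips in which each emote appears.
    doc_freq = Counter()
    for counts in clip_counts.values():
        doc_freq.update(e for e, c in counts.items() if c > 0)

    rows = []
    for slug, counts in clip_counts.items():
        total = sum(counts.values())
        for emote, count in counts.items():
            if count <= 0:
                continue
            tf = count / total                          # share of this Clip's emotes
            idf = math.log(n_clips / doc_freq[emote])   # rarer across Clips -> larger
            rows.append((slug, emote, tf * idf))
    return rows

# Made-up sample data, for illustration only.
clips = {
    "clip-a": {"PogChamp": 2, "Kappa": 1},
    "clip-b": {"Kappa": 1},
    "clip-c": {"BibleThump": 1, "Kappa": 1},
}
rows = emote_tfidf(clips)
```

With this particular formula, an emote that appears in every Clip (Kappa in the sample) gets a score of zero, while an emote concentrated in a single Clip scores highest there. Smoothed variants of the inverse document frequency avoid the hard zero.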
Clustering Clips by Emote Usage
With this dataset, we want to identify meaningful emotional themes in Clip chat. Our goal is to arrange Clips visually so that those with similar emote usage sit next to each other. Our training dataset is made up of all Clips created in a 30-day period that contain at least one global emote.
We use a custom expectation-maximization algorithm similar to spherical k-means (k-means using cosine similarity). The difference is that our algorithm generates many (k = 250) distinct clusters of roughly equal size and arranges them in a continuous order, like the leaves of a dendrogram.
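For readers unfamiliar with the baseline, the following is a minimal sketch of plain spherical k-means, not our custom algorithm: it does not enforce equal cluster sizes and does not order the clusters. The function names and the farthest-first initialization are assumptions chosen to keep the example short and deterministic.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def spherical_kmeans(vectors, k, n_iter=20):
    points = [normalize(v) for v in vectors]

    # Assumed initialization: start from the first point, then repeatedly add
    # the point that is least similar to the centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = min(points, key=lambda p: max(dot(p, c) for c in centroids))
        centroids.append(list(far))

    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: pick the centroid with the highest cosine similarity.
        labels = [max(range(k), key=lambda j: dot(p, centroids[j])) for p in points]
        # Update step: new centroid is the re-normalized sum of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = normalize([sum(col) for col in zip(*members)])
    return labels, centroids

# Made-up TF-IDF-style vectors over three emotes, for illustration only.
vectors = [[5, 0, 0], [4, 1, 0], [0, 0, 3], [0, 1, 6]]
labels, centroids = spherical_kmeans(vectors, k=2)
```

On this toy input the first two vectors land in one cluster and the last two in the other. The custom algorithm described above additionally balances cluster sizes and arranges the 250 clusters in order, which this sketch does not attempt.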
The output is the following pretty visualization:
In the stacked-bar chart above, we have laid out 250 equal-width vertical bars along the x-axis, one for each cluster centroid. Each bar represents roughly 1/250th of all Clips created that month. On the y-axis, the 20 most frequently used emotes are each represented by a distinct color, with black being used as the catch-all for emotes less frequently used. For instance, cyan maps to PogChamp emotes, green maps to WutFace, and pink maps to Kappa. In each bin, the amount of color for each emote is proportional to the usage of that emote in that bin.
Notice that the graph has sections where a single emote dominates across many bins. With the aid of this low-dimensional representation of our data, we can condense bins that share the same dominant emote into mega-clusters and label sections of our graph with their dominant emotes.
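The condensing step can be sketched roughly as follows. This is an assumption about one way to do it, not a description of our tooling: the function name `mega_clusters` is hypothetical, each bin is represented as a dict of emote weights in display order, and only adjacent bins with the same dominant emote are merged.

```python
from itertools import groupby

def mega_clusters(ordered_bins):
    """Hypothetical helper: ordered_bins is a list of {emote: weight} dicts,
    one per bin, in left-to-right display order.
    Returns (dominant_emote, first_bin_index, last_bin_index) per section."""
    # Dominant emote of a bin = the emote with the largest weight in it.
    dominant = [max(weights, key=weights.get) for weights in ordered_bins]

    sections = []
    start = 0
    # groupby merges consecutive runs of bins that share a dominant emote.
    for emote, run in groupby(dominant):
        length = len(list(run))
        sections.append((emote, start, start + length - 1))
        start += length
    return sections

# Made-up bins, for illustration only.
bins = [
    {"PogChamp": 0.8, "Kappa": 0.2},
    {"PogChamp": 0.6, "Kappa": 0.4},
    {"BibleThump": 0.7, "PogChamp": 0.3},
    {"Kappa": 0.9, "WutFace": 0.1},
]
sections = mega_clusters(bins)
```

Here the first two bins collapse into a single PogChamp section, followed by one-bin sections for BibleThump and Kappa. Because the clusters are already arranged so that similar ones sit side by side, merging contiguous runs is enough to recover labeled sections in this sketch.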
We started with the premise that we could learn what themes were prevalent in a live stream using just emote content. We typically think of Clips as highlighting important moments in individual streams. Looking at this visualization, we have several interesting theories to manually validate:
- The cyan area, which indexes heavily on PogChamp emotes, dominates most of the mass. Qualitatively, we think of PogChamp as an emote typically used to express awe. Are Clips in this region generally of hype moments? Below are the most popular Clips from this section:
- The magenta area is dominated by BibleThump emotes, which are commonly used in an emotional context. Are Clips in these clusters good at eliciting compassion? Popular Clips from this subset of clusters include:
Productizing Data Science
With the output of this algorithm, we can use the cluster assignments to start categorizing Clips by theme. A few practical applications might be to inform a tagging service with hashtags or provide an additional feature to power our recommender systems.
We discuss further how we build products on top of data science models in this post on our Science blog. Be sure to keep an eye out for future product releases built on machine learning models such as this one!
If you are passionate about using data to build consumer products, consider joining our Science team. We’re hiring!