
Twitch Engineering: An Introduction and Overview

Dec 18 2015

If you’re a Twitch user, you probably know that Twitch is a pretty big site that streams live video, but you may not appreciate the sheer scale and complexity of the moving parts needed to hold everything together.

Among other things, we have:

Each of these components of Twitch comes with its own challenges, among them low-level optimization of video encoding, scaling large systems, building products across multiple platforms and form factors, and improving our engineering efficiency through better tooling.


Engineering at Scale

Before we dive into the technology we currently use, it’s important to note that many of our technology decisions are driven by the size of Twitch and how rapidly it is growing:

This means that we have to scale at both an architectural and an organizational level. As each individual system grows, it exposes failure modes that aren’t problems in smaller systems but become serious at scale: hardware and network failures that are rare in a small deployment become routine in a large one, and capacity planning becomes essential so that new infrastructure and systems are designed and built before they are needed.
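Designing for failure mostly comes down to patterns like redundancy, retries, and backoff. As a small illustration (the function names, failure rate, and timings here are hypothetical, not anything from Twitch’s systems), a Go sketch of retrying a flaky network call with exponential backoff looks roughly like this:

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// callDependency is a stand-in for any network call that can fail transiently.
func callDependency() error {
	if rand.Float64() < 0.7 { // simulate a 70% transient failure rate
		return errors.New("temporary network error")
	}
	return nil
}

// withRetries retries fn with exponential backoff, up to maxAttempts tries.
func withRetries(fn func() error, maxAttempts int) error {
	backoff := 100 * time.Millisecond
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		fmt.Printf("attempt %d failed: %v; retrying in %v\n", attempt, err, backoff)
		time.Sleep(backoff)
		backoff *= 2 // double the wait between attempts
	}
	return fmt.Errorf("all %d attempts failed: %w", maxAttempts, err)
}

func main() {
	if err := withRetries(callDependency, 5); err != nil {
		fmt.Println(err)
	} else {
		fmt.Println("call succeeded")
	}
}
```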

Beyond individual systems growing larger, we are also adding more systems to Twitch to make our product better. Each new system adds complexity to the overall picture, which makes engineering harder for everyone. Adding the extra engineers to build those systems also adds organizational complexity: with n people there are n(n−1)/2 possible pairs, so communication and coordination become an O(n²) scaling problem if left alone. There have been many great articles and ideas about how to solve these problems, but there isn’t enough room here to do them justice.

Our Technology Stack (A 50,000-Foot View)

There is far too much to cover in depth here, but I’ll attempt a high-level overview of the key pieces of technology that keep Twitch up and running.

Video System

The video system is responsible for getting video from the broadcaster to our viewers. This includes the following core components:
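The component list from the original post is not reproduced here, but independent of those specifics, the core shape of the problem is one-to-many delivery: a single broadcaster’s stream has to reach many viewers at once. Here is a deliberately simplified Go sketch of that fan-out pattern; it ignores codecs, transcoding, and real transport protocols, and none of the names reflect Twitch’s actual architecture.

```go
package main

import "fmt"

// broadcast fans out chunks of stream data from one source channel
// to every subscribed viewer channel. Real video delivery involves
// ingest protocols, transcoding, and edge servers; this only shows
// the one-to-many shape of the problem.
func broadcast(source <-chan []byte, viewers []chan []byte) {
	for chunk := range source {
		for _, v := range viewers {
			v <- chunk
		}
	}
	for _, v := range viewers {
		close(v)
	}
}

func main() {
	source := make(chan []byte)
	viewers := make([]chan []byte, 3)
	done := make(chan struct{})

	for i := range viewers {
		viewers[i] = make(chan []byte, 1)
		go func(id int, ch <-chan []byte) {
			for chunk := range ch {
				fmt.Printf("viewer %d got %d bytes\n", id, len(chunk))
			}
			done <- struct{}{}
		}(i, viewers[i])
	}

	go broadcast(source, viewers)

	source <- []byte("segment-1")
	source <- []byte("segment-2")
	close(source)

	for range viewers {
		<-done
	}
}
```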

Chat

Chat is a highly scalable real-time distributed system written in Go. It delivers hundreds of billions of messages per day to users watching video, both over our own protocols and over IRC; supporting IRC as a communication protocol makes it easy for developers to create IRC bots that add their own custom chat features. Core chat components include:
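To make the IRC point concrete, here is a minimal Go sketch of a bot that connects to chat over IRC, answers keepalive PINGs, and prints chat messages. The hostname, port, and credential format shown are assumptions to check against the current chat documentation, not guaranteed values.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

func main() {
	// Hostname, port, and credential format are assumptions; check the
	// official chat documentation for the current connection details.
	conn, err := net.Dial("tcp", "irc.chat.twitch.tv:6667")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Standard IRC handshake: authenticate, pick a nick, join a channel.
	fmt.Fprintf(conn, "PASS oauth:%s\r\n", "your-token-here")
	fmt.Fprintf(conn, "NICK %s\r\n", "your-bot-name")
	fmt.Fprintf(conn, "JOIN #%s\r\n", "somechannel")

	scanner := bufio.NewScanner(conn)
	for scanner.Scan() {
		line := scanner.Text()

		// Reply to server keepalives so the connection stays open.
		if strings.HasPrefix(line, "PING") {
			fmt.Fprintf(conn, "PONG %s\r\n", strings.TrimPrefix(line, "PING "))
			continue
		}

		// PRIVMSG lines carry chat messages; everything else is protocol chatter.
		if strings.Contains(line, "PRIVMSG") {
			fmt.Println(line)
		}
	}
}
```

From there, a bot adds its features by parsing PRIVMSG lines and writing PRIVMSG commands back over the same connection.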

Web APIs and Data

Besides our real-time services (video and chat), we also have a substantial number of other services, including but not limited to:

This is all built on a mix of Rails, Go, and various open-source applications used for routing, caching, and data storage. We have also been doing a lot of vertical and horizontal partitioning of our data and data APIs to make our systems scale better, both at a technical and an organizational level.
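To make the partitioning idea concrete, here is a minimal sketch of the horizontal case (sharding): rows for a given user always land on the same database, chosen by hashing the user ID. The shard count, connection strings, and routing function are illustrative assumptions, not a description of our actual scheme.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a user ID onto one of numShards shards by hashing it.
// Horizontal partitioning: every shard holds the same tables, but each
// holds only a slice of the rows.
func shardFor(userID string, numShards int) int {
	h := fnv.New32a()
	h.Write([]byte(userID))
	return int(h.Sum32()) % numShards
}

func main() {
	// Illustrative shard DSNs; a real deployment would load these from config.
	shards := []string{
		"postgres://db-shard-0/users",
		"postgres://db-shard-1/users",
		"postgres://db-shard-2/users",
		"postgres://db-shard-3/users",
	}

	for _, id := range []string{"alice", "bob", "carol"} {
		fmt.Printf("user %q -> %s\n", id, shards[shardFor(id, len(shards))])
	}
}
```

Vertical partitioning is the complementary move: splitting different kinds of data (say, profile data versus follow relationships) into separate stores and services, which also lets separate teams own them independently.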

Web and Client Applications

All of these services are lovely, but useless without easy-to-use client applications that create a good user experience. We have a multitude of client applications that make it easy to access Twitch.

Data Science Infrastructure

Our Data Science Engineering team builds data collection systems and analysis tools to study how our users interact with our products. These systems include:
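As a rough sketch of what the collection side of such a pipeline looks like in general (the endpoint path, event schema, and storage are all illustrative assumptions, not our actual pipeline), an event-tracking service boils down to accepting small structured events over HTTP and handing them off for durable storage and later analysis:

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"time"
)

// Event is an illustrative client-side analytics event; the real schema
// would be defined by the data science pipeline.
type Event struct {
	Name       string            `json:"name"`
	UserID     string            `json:"user_id"`
	Properties map[string]string `json:"properties"`
	ReceivedAt time.Time         `json:"received_at"`
}

func trackHandler(w http.ResponseWriter, r *http.Request) {
	var e Event
	if err := json.NewDecoder(r.Body).Decode(&e); err != nil {
		http.Error(w, "bad event", http.StatusBadRequest)
		return
	}
	e.ReceivedAt = time.Now().UTC()

	// Stand-in for durable storage: a real pipeline would batch events
	// into a queue or data warehouse instead of logging them.
	log.Printf("event=%s user=%s props=%v", e.Name, e.UserID, e.Properties)
	w.WriteHeader(http.StatusNoContent)
}

func main() {
	http.HandleFunc("/track", trackHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```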

Tools and Operational Infrastructure

The above services directly support the application and product. However, the operational infrastructure that we need to manage those services is just as important. We build and maintain tools for:

At the physical level, we run a fairly large number of “bare metal” POPs (points of presence) all over the world; given the unusual requirements of video delivery (lots of bandwidth!), this allows us to deliver higher quality video.

We have also been moving an increasing number of our services to Amazon Web Services, which reduces our operational overhead and lets us take advantage of the convenience and scalability of many of their services.

In Closing

We hope this has been an interesting overview of what makes Twitch tick under the hood. If you are interested in learning more about these (or other) systems, please look at the Twitch Engineering site!
