Rocket League: Let's Talk About Server Performance

By Michelle-Louise Janion, 25 days ago
“Community First” has long been part of our mission statement here at Psyonix, and it’s a mantra we are always pushing to improve upon. The happiness of our players will always drive us to grow and influence the decisions we make in Rocket League, which is why I want to directly address the recent issues with our servers, matchmaking delays, and our PsyNet backend.

With regard to both the backend (PsyNet) issues and game server performance, we agree that the server outages and recent lengthy matchmaking times are totally unacceptable. We sincerely apologize to all of our players for the quality of online play, and we are focusing all of our available resources on addressing PsyNet’s capabilities and performance, and the quality of our game servers.

The number of monthly active Rocket League players has jumped roughly 40% over 2016 in the first few months of this year alone. While our player population continues to grow at a healthy pace, we need to do a better job of scaling up our systems and internal processes to handle this kind of growth. We are growing so fast, both as a company and as a game, that we are racing to fill new positions on our Online Services team, the group at Psyonix that builds and maintains our backend systems. (This makes it a good time to mention our Careers page, for those of you who want to come work on Rocket League with us.) We are effectively becoming an Online Service company, as Rocket League is primarily an online game, and we continue to grow in size and scope. Growing this team is one of our top priorities as we approach Rocket League’s two-year anniversary in July.

IMPROVING THE PSYNET DATABASE

Our Online Services team continues to work on the PsyNet database; if you have hopped into Rocket League recently and seen, in red letters, “Cannot Connect to Rocket League Servers” where the Playlists normally live, you’re seeing the PsyNet DB gone awry.

Issues with the PsyNet database started spiking around the same time we had a free weekend on Xbox One last month. We have seen more and more of you playing Rocket League online, and this has led to new issues that hadn’t appeared as we scaled our capacity up to this point.

Further complicating the problem have been outages and performance problems within the Google Cloud infrastructure, which we use to power PsyNet. Some of our downtime has resulted from unexpected slowdowns in database operations that led to a death spiral but ultimately do not appear to have been caused by our own usage or overhead. At the end of the day, however, we need to own our own game’s performance. Here’s what we’ve been doing to address these issues:

  • Since the February outages, we have assigned additional dedicated staff from our Online Services team to database stability and reliability. They are hard at work on changes such as separating high-traffic features, like Player Trading and the “Scraper API” access used by third-party sites, from our core services. This will reduce load on the PsyNet database and reduce exposure to outages during peak hours. We definitely don’t want to keep you off the pitch on weekends or holidays.
  • We are working closely with Google engineers to investigate the disturbances to our database performance from outside our cloud instance.

MATCHMAKING SERVER DELAYS

Issues with matchmaking delays have only become apparent in recent days, going back to the Dropshot release last week. Our matchmaking server has periodically fallen into a problem state with a huge backlog of match reservations — there are empty servers for you to play on, but PsyNet isn’t putting you into the servers fast enough. This causes the extreme search times being reported on social media.

As this is a new problem, we are still investigating the root cause. We have made changes in the interim to reduce the likelihood of it recurring, but more substantial improvements will be made throughout 2017.
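The backlog described in this section can be pictured with a toy model. The sketch below is purely illustrative and is not how PsyNet works internally: the function name, the tick-based loop, and all of the numbers are assumptions made up for the example. It only shows why search times can balloon even when empty servers exist, because reservations pile up whenever they arrive faster than they are assigned.

```python
def simulate_backlog(arrivals_per_tick, assign_rate):
    """Return the number of pending match reservations after each tick.

    arrivals_per_tick: new reservations arriving in each tick (hypothetical).
    assign_rate: maximum reservations placed onto servers per tick (hypothetical).
    """
    backlog = 0
    history = []
    for arrivals in arrivals_per_tick:
        backlog += arrivals
        # Only assign_rate reservations can be placed per tick,
        # no matter how many empty servers are available.
        backlog -= min(backlog, assign_rate)
        history.append(backlog)
    return history


# A short burst above the assignment rate leaves a backlog that
# takes several ticks to drain after the burst has ended.
history = simulate_backlog([100, 100, 150, 150, 100, 50], assign_rate=110)
print(history)
```

In this made-up run, the backlog keeps players waiting for a few ticks after demand has already fallen back below the assignment rate.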

DEDICATED SERVER PERFORMANCE

The last category of “server issues” deals with problems when actually connected to a dedicated game server. This is most commonly reported as in-game lag spikes or hiccups independent of the player’s ping to the server.

This problem is particularly difficult to identify because so many issues factor into a user’s experience with a dedicated server. Local connection problems can cause packet loss or an unstable connection despite a steady ping sample. Individual ISPs can have routing issues to one of our server hosts that cause unusually high ping, which we have no control over.

However, these outside factors do not mean there are not legitimate issues at hand, and this is another set of problems we need to own. After reports of instability in January, we significantly expanded our hardware investment in Europe, US-East, US-West, and Oceania to reduce our dependency on virtual servers. For those who are unfamiliar with the term, a “virtual” server exists in the cloud and can be rapidly spun up in quantity to meet demand. While virtual servers are not inherently bad — and many games successfully and heavily rely on them — we weren’t confident in their recent reliability.

For context, we do not view virtual servers as a cost-saving measure, but instead as a way to scale to unexpected demand. It takes time to order and deploy hardware servers, and without a virtual solution, players would be forced to wait in long queues to get into an available server if demand spiked above our capacity.
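As a rough illustration of the trade-off described above, the sketch below splits demand across hardware servers first, then virtual servers, and queues whatever is left. It is a simplified assumption rather than a description of the actual allocation logic, and the function name, parameters, and numbers are all hypothetical.

```python
def allocate_servers(demand, hardware_capacity, virtual_limit):
    """Split server demand between hardware, virtual overflow, and a queue.

    All three parameters are hypothetical counts of game server slots.
    """
    hardware = min(demand, hardware_capacity)
    overflow = demand - hardware
    # Virtual servers absorb demand above the hardware capacity.
    virtual = min(overflow, virtual_limit)
    # Anything still unserved would have to wait in a queue.
    queued = overflow - virtual
    return {"hardware": hardware, "virtual": virtual, "queued": queued}


# With a virtual option, a spike above hardware capacity is absorbed.
print(allocate_servers(1200, hardware_capacity=1000, virtual_limit=500))
# Without one, the same spike leaves players waiting.
print(allocate_servers(1200, hardware_capacity=1000, virtual_limit=0))
```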

We ordered enough hardware servers to satisfy demand above our then-peak population to ensure a good experience for everyone. For a time, this seemed to have a positive impact, but the concerns have resurfaced since our latest update.

We are addressing the issues with our game servers on a few fronts:

  • We are doing our own profiling, as well as working with our server providers, to investigate this issue further, and we are seeking a resolution as quickly as possible.
  • Part of this profiling is completing a performance pass on Dropshot, as we want to ensure the best server performance possible when you’re playing our new game mode. Some of these improvements will be coming in our next patch — a hotfix we hope to deploy in the very near future.
  • We are also working on improving our internal metrics to detect and isolate these problems more efficiently and rapidly.
  • Finally, while we have already invested in new server hardware in Europe and North America, we are continuing to investigate new hardware options in other regions, including Asia, South Africa, and Central America.

Growing pains are exactly that: painful. We cannot thank our players enough for sticking with us as we continue to grow. Some of the issues we face will be addressed shortly, including those in our next hotfix, while other projects, like growing our Online Services team, will take more time to bear fruit. We thank you for your continued patience during this time, and as always, we welcome your feedback on Reddit, Facebook, and Twitter. We promise to do better by all of our players in the future.