Starym

The Second War of the Shifting Sands Behind the Scenes: Engineer's Workshop

Blizzard are taking an in-depth look at how the second Ahn'Qiraj gate opening series of events was created, from a technical standpoint. They delve into a bit of history and the first, Vanilla opening and what was learned from that, how they used automated players and stress tests to get the second version working as well as possible, the limitations of the Classic/original code itself, the very first openings of the gates and how GMs were following them live and implementing solutions on the fly, and a whole lot more!

If you have even a slight and passing interest in how things are actually done behind the scenes, it's a fascinating look at one of the biggest events ever in WoW history and how it was improved and handled.

    Blizzard: AQ (source)

     Join us for a behind-the-scenes deep dive on recreating one of World of Warcraft’s most iconic events, the Ahn’Qiraj war effort.

     

    War is upon us. Earlier this month, one of the most anticipated events of World of Warcraft: Classic went live: the Ahn’Qiraj war effort. Entire Classic realms, the might of the Horde and Alliance combined, came together, contributing resources to open the gates and unlock the Ahn’Qiraj raids. When the War of the Shifting Sands took place the first (and only) time in 2006, thousands of players from each realm flew or hoofed it over to Silithus to partake in or witness the chaos. The turnout was beyond the development team’s wildest imaginings and, simply put, we were not prepared. Servers quickly became overloaded, and many players were caught in a loop of logging in, disconnecting, and trying to get back online over a 12-hour period while our engineers scrambled to hotfix issues and get players reconnected. While we did manage to stabilize servers during the event, and learned quite a few lessons, we saw opportunities to do better. Fifteen years later, we were ready to recreate one of the most epic moments in WoW history for WoW Classic by focusing on server optimization to combat lag and eliminate server crashes, all while hosting up to twice as many players in Silithus as we did during the event’s debut in 2006.

    In this article, we’ll walk you through how we were able to recreate this highly anticipated event by going over how we use automated players and stress tests to determine breakpoints and handcraft optimization solutions, how we came up with solutions in the software to solve problems that hardware couldn’t, and how we curated a global event with limited server crashes, all while preserving the WoW Classic gameplay experience.

    Recreating the Second War of the Shifting Sands

    We had three specific goals in mind when approaching how we would need to engineer this event: Prevent chain crashes, increase the expected zone player limits, and determine how much lag was tolerable before porting players outside of Silithus. Before we can get into the nuts and bolts of how we maximized server performance, it’s important to understand the constraints we’re working in: the limitations of WoW Classic’s codebase, how population management solutions work, and how they affect gameplay.

    Anubisaths Invade Azeroth

    Beyond Boundaries

    The modern version of World of Warcraft was built upon the foundation of the original codebase released 15 years ago. Since the game’s launch, we’ve developed more modern ways to handle high player counts within Battle for Azeroth, most notably sharding. Shards allow WoW servers to host many more players in-game than we were capable of in 2006. In Battle for Azeroth, we use them to manage servers’ player load by making a copy of a zone (e.g. Zuldazar) once the player count reaches a certain threshold. This neutralizes lag issues by spreading players across different versions of the zone, since player interactions are among the most CPU-intensive operations, due to the number of packets that players constantly send to the server for pinpoint accuracy on their movements and spell casts. Additionally, sharding mitigates potential lag issues that can be encountered when transitioning into a new zone where the player count goes over the threshold. Sounds simple enough, except there’s a catch: WoW Classic has been engineered to be a faithful recreation of the original 1.12 game data, which includes preserving its gameplay quirks. In rare cases, shards will cause your quarry, such as an enemy player or NPC, to disappear when phasing into a new zone. Keeping shards in would mean losing some of those nostalgic gameplay moments of chasing players and NPCs across zone boundaries. So, now we needed to come up with a solution that didn’t interfere with the original gameplay while also allowing us to get more players onto the server without forcing players to suffer through unplayable lag.

    To handle this issue, we elected to use layers, which are copies of entire regions (e.g. Eastern Kingdoms), to manage player population and lag issues while keeping the memorable charm of the original release intact, so players could once again kite world bosses across zones and chase enemy players across borders within a region without the risk of them being reassigned to a different shard. However, layers were designed as a non-permanent solution. Because the original 1.12 release did not use either sharding or layering technologies, we promised players that we would only use layers at the launch of WoW Classic and phase them out over time as players dispersed more evenly throughout the world. There are a few cases in which we still use layering due to incredibly high populations of active players (e.g. North America’s Faerlina), but we have reduced the number of layers active on these realms since the game’s release. With 15 years of buildup, the AQ war was among the most highly anticipated events of WoW Classic, and we expected it to draw the most players into one area, outside of starting areas at the game’s release, without layers to manage it. Without layers or sharding population tech, we had to get creative, and quickly.

    Players Gather around the Gong

    Handcrafting an Unforgettable Experience

    We started looking for a population solution that relied on neither layers nor shards by generating headless clients (automated players) and instructing them to mimic what real players might do, such as casting spells, fighting NPCs, and moving around the area. This allowed us to take a snapshot of what performance could look like with thousands of players interacting in a single zone. After running these simulations, we then organized stress tests with volunteers so we could capture realistic player behavior and see how the two compared. This gave us an indication of certain breakpoints and which pieces of our server’s code were experiencing the most issues at high player counts. Server frame time measurements were heavily scrutinized to see how close they were to causing a server to become unresponsive, also known as deadlocking.

    The next step was to analyze what was affecting server performance so we could begin breaking down this monumental task into comprehensible goals. What we faced is a polynomial problem: the number of packets grows with roughly the square of the player count, which means we can’t solve it by throwing faster hardware at it because hardware’s not exponentially better. Instead, we have to handcraft the optimization by deliberately choosing which data should be communicated to players and how often. To illustrate this conundrum, let’s say we have 20 players jumping in a circle. The server relays the actions of each player to the other 19 through packets (data deliverables). In this group of 20, the server processes 380 packets (20 total players * 19 recipients = 380 packets). This issue compounds when more players do the same action in the zone. If we increase our example to 500 players, then 249,500 packets are sent from the server. If we increase our example again to 1,500 players, then 2,248,500 packets are sent from the server. Depending on player actions, multiple packets are sent per second; keep in mind the above examples only account for one action. The more packets there are, the more processing time the server must spend on a single player before going on to handle every other player’s actions. When this problem compounds, the servers begin to approach deadlocking. In WoW Classic, we have significantly more players per realm than realms did back in 2006, so the expectation is that we accommodate more players around the gates than we ever did before.
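
The arithmetic in the paragraph above can be checked with a short Python sketch. The function name `fanout_packets` is made up for illustration and is not part of any Blizzard code; it only assumes that each action by one player is relayed once to every other player in range.

```python
def fanout_packets(players: int, actions_per_player: int = 1) -> int:
    """Packets the server sends when each action is relayed to every other player."""
    return players * (players - 1) * actions_per_player


for n in (20, 500, 1500):
    # Reproduces the figures quoted in the article for a single action per player.
    print(f"{n} players -> {fanout_packets(n)} packets")
```

Tripling the crowd from 500 to 1,500 players multiplies the packet count by roughly nine, which is why faster hardware alone does not keep up.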

    Optimizing Server Performance

    Our servers are engineered to crash and restart if they encounter a deadlock, so we knew it was critical to do everything in our power to minimize processing time. After some testing, it became clear that movement was the first piece of processing that was putting heavy stress on our servers. We began by dropping facing updates (which display the direction a character model is facing) and only sending out player updates whenever a player starts, stops, or uses keyboard movement. Since latency with an excessive number of players is already compromised, spending CPU time sending minor facing updates made the fidelity worse. As such, it was better to stop sending them. We made the decision to cull how often we sent movement updates in favor of having more players in a zone. Keep in mind we’re trying to find the breaking point before the servers fall over while allowing as many players into Silithus as possible. After all, it’s better to miss some movement updates than to not be able to log in to your character at all. We also started throttling data that was marked as lower priority. An action deemed “less important” should not be sent at the same rate as “more important” actions. We saw many messages all being sent at once regardless of how important they were, and optimized the code to only send less important information in batches and less frequently.
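
As a rough illustration of throttling lower-priority data, here is a minimal Python sketch. The `Outbox` class, the `flush_every` interval, and the message names are all assumptions made for this example; the article does not describe how the real server classifies or schedules messages.

```python
from dataclasses import dataclass, field


@dataclass
class Outbox:
    """Per-recipient send queue (hypothetical): low-priority messages are held and batched."""
    flush_every: int = 5  # assumed interval, in server ticks, between low-priority flushes
    _low: list = field(default_factory=list)
    _tick: int = 0

    def tick(self, high, low):
        """Return the packets to send this tick; each packet is a list of messages."""
        self._tick += 1
        packets = [[msg] for msg in high]  # important messages go out immediately
        self._low.extend(low)              # less important ones wait for the next flush
        if self._tick % self.flush_every == 0 and self._low:
            packets.append(self._low)      # one batched packet for everything held so far
            self._low = []
        return packets


box = Outbox(flush_every=3)
print(box.tick(["spell_cast"], ["emote"]))   # only the spell cast is sent
print(box.tick([], ["facing"]))              # nothing is sent this tick
print(box.tick(["move_start"], ["emote2"]))  # the move plus one batch of three held messages
```

The trade-off is the one the article describes: low-priority information arrives a little late, but the server sends far fewer packets per tick.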

    Buffs and debuffs were another large hit on our performance. Throughout the world, especially when fighting mobs, buffs and debuffs are applied to units all the time. Though this may not seem like a big deal, with a high concentration of players all around each other, this information needs to be passed around. Similar to throttling low priority data, we now batch the buffs and debuffs to avoid sending multiple packets in succession to players.
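
One way to picture batching buff and debuff updates is to collapse everything that happened to a unit during a tick into a single update. The sketch below is an assumption about the general technique, not the actual server logic; `batch_aura_updates` and the unit and aura names are hypothetical.

```python
from collections import defaultdict


def batch_aura_updates(events):
    """Group (unit, aura, applied) events from one tick into one update per unit.

    Later events for the same aura on the same unit overwrite earlier ones,
    so only the final state of each aura is sent.
    """
    per_unit = defaultdict(dict)
    for unit, aura, applied in events:
        per_unit[unit][aura] = applied
    return dict(per_unit)


events = [
    ("mob1", "armor_debuff", True),
    ("mob1", "slow", True),
    ("player7", "shield", True),
    ("mob1", "slow", False),  # applied and removed within the same tick
]
# Four events collapse into two updates, one per affected unit.
print(batch_aura_updates(events))
```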

    Managing Player Populations

    Aside from optimizing the servers to handle more players in each zone, it didn’t escape us that it’d be impossible to fit an entire realm’s population (more than double what the original 1.12 WoW realm could handle) within Silithus. Hard decisions had to be made to limit access into the zone by controlling who we allowed in and how many players we could allow in. We decided that we would only allow level 60 characters inside Silithus and would stop allowing eligible characters inside if it was full. Creating this restriction was the right choice to make since the event in Silithus is known to be end-game content, and lower-level characters can still participate in the war effort in other zones, such as slaying the anubisaths that roam in The Barrens intended for level 20 to level 30 players. The second sticking point was that we knew the upper bound for how many players in an area we could handle without crashing the server; the question then became what that number should be reduced to for the best balance between performance and player count. Over testing, we found this number to be around 1,500 players if they were stacked on top of each other. However, since the event takes over the whole zone, we saw minimal performance problems once players spread out.
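
The two admission rules described above (level 60 only, and no entry once the zone is full) can be sketched as a single check. The constant and function names are hypothetical, and the cap simply reuses the article's "around 1,500" figure as a placeholder; the real value was tuned through the stress tests.

```python
# Assumed values for illustration: the level gate comes from the article,
# the cap is the "around 1,500" figure and is only a placeholder.
MIN_LEVEL = 60
ZONE_CAP = 1500


def can_enter_zone(level: int, zone_population: int) -> bool:
    """Admit a character only if it is level 60 and the zone still has room."""
    return level >= MIN_LEVEL and zone_population < ZONE_CAP


print(can_enter_zone(60, 1200))  # eligible and the zone has room
print(can_enter_zone(60, 1500))  # eligible, but the zone is full
print(can_enter_zone(45, 10))    # zone has room, but the character is too low level
```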

    The event was planned to take place in all regions, so we had to make sure this event worked across multiple layers. This means that a Scepter-bearer who rang the gong on one layer should begin the event across all other layers connected to that realm. Since the trigger for the event was based on a player interaction, we wanted to ensure the Scepter-bearer was visible across multiple layers so all players on the same realm could see them. This created an interesting problem since servers now had to relay information that they typically wouldn’t need to communicate to each other. This can create a lot of complications as we compile and send updates through the servers to make sure we mirror the data across multiple layers, potentially to thousands of players.

    We began developing this tech with the introduction of the Stranglethorn Fishing Tournament and applied it to the Onyxia, Nefarian, Zul’Gurub, and Rend world buffs later. Once we felt it worked as intended, we were ready to test it along with our other tech for the AQ war event.

    Horde Players in Silithus

    Experimenting with Solutions

    Now that we had addressed major tech hurdles and implemented several ways to optimize server performance, it was time to test everything we had worked on. We created a shortened version of the 10-hour war, scaled down to only run for an hour.

    During the first stress test, we let nearly all players into the zone to see what would happen. At one point, we were nearly at 150% of the capacity of an entire 1.12 realm. This was when we saw our test realm crash. We knew we had put a very high number on how many people we’d cap the zone to, and we were seeing numbers that greatly exceeded it. We investigated the issue and realized that the code allowing players to transfer both into and out of a zone was a queue that didn’t process many players at once. This was why players weren’t being ported out and why players were stuck on flight paths for an unusually long time. We restored the server and continued the stress test, adjusting as we went. We slowly lowered the number to a point where we felt it was still laggy, somewhat playable, and retained a much higher number of players than any zone had seen before. The event that was supposed to only take an hour and a half ended up taking up to four hours to complete because of crashes.
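
To see why a transfer queue that handles only a few players at a time leaves people stuck, consider this small Python simulation. It is a guess at the general shape of the problem, not the actual server code: the `drain` function, the per-tick batch sizes, and the queue length of 3,000 are all made-up numbers.

```python
from collections import deque


def drain(queue: deque, per_tick: int) -> int:
    """Process up to per_tick transfers each tick; return the ticks needed to empty the queue."""
    ticks = 0
    while queue:
        for _ in range(min(per_tick, len(queue))):
            queue.popleft()
        ticks += 1
    return ticks


# With a small batch size, the last player in a 3,000-long queue waits 300 ticks.
print(drain(deque(range(3000)), per_tick=10))
# Raising the batch size cuts that wait to 15 ticks.
print(drain(deque(range(3000)), per_tick=200))
```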

    The second stress test was performed a week later. This allowed us to see if our optimizations worked. Upon loading into the stress test, we immediately noticed improvements: players were no longer stuck on flight paths leading into Silithus! We were able to obtain enough data to demonstrate how many players we could comfortably have in Silithus. After both tests, we moved forward with numbers that we felt struck the best balance between managing lag and server stability. We consider both tests successful since they allowed us to identify zone caps and iterate on them.

    Spreading Server Solutions Across Azeroth

    Originally, the optimizations were planned to only be active for Silithus during the War of the Shifting Sands. After we determined they’d be safe to roll out globally, we applied them to the entire world in 1.13.5. Once the war effort started, players began turning in supplies and harvesting bug corpses en masse. We saw a massive spike of players not only in Silithus, but also in our capital cities and outside zones. These optimizations helped make these experiences more performant, allowing large-scale PvP battles to take place across Azeroth. Some players even went as far as spawning the world boss Thunderaan to help clear out the other faction from a Hive.

    Even though the gate opening event hadn’t taken place yet, some servers were experiencing strange issues with their war effort not progressing. Some servers were completing their war effort so fast that it caused a race condition in the logic of each turn-in, which could prevent the five-day timer from starting. Since the chance of this edge case happening was so small, we were able to fix those servers manually and then address this issue for future realms completing their effort.

    Once the war efforts had been completed and five days had passed to open the gates, we began monitoring the Chinese realms that were first to open in the world. The first server in China to have an active Gong was Ouro. As we monitored our layer populations, we saw that most players on each layer were in Silithus. The event going off across multiple maxed-out layers for several thousand players at once was something we’d never done before. Though there was apparent lag, our servers didn’t experience any crashes during the first set of China realms opening.

    Bang a Gong!

    On August 4, it was noted that there would be several realms in North America ready to hit their gongs shortly after servers came up from reset. One by one, we actively monitored these realms on Game Master accounts and through our observation tools so we could address any issues that might be encountered. Each realm opened and began the event without issue. Scepter-bearers received their prestigious Black Qiraji Battle Tank mounts, players got to fight even bigger bugs, and we were pleased with the stability. As we were waiting for our first post-reset server to complete its five-day wait period, we noticed a significant issue: events weren’t persisting after server restarts. This means that if a server crashed or restarted, we would lose all progression in the event. Though this problem had existed since the beginning of WoW Classic’s development, there hadn’t been many cases where events needed to persist across server restarts. Our team was able to address the problem quickly, but we needed to ensure that no further restarts could happen until we were able to deploy a fix and properly catalog the existing status of all war efforts into our database without interruption to players.


    Some may argue that allowing servers to crash is what made the original AQ war chaotic, which in turn made it memorable. Instead, we strove to cultivate that same fervor by curating a much more stable experience that could be shared with around 1,500 players in Silithus at the same time on each server. We wanted the memories of the Classic AQ war to be of having as many players as possible play through the 10-hour event without interruption. While we did experience a few realms crashing, they fully recovered and were back online within minutes, and no subsequent crashes took place.

    Over 4,000 players worldwide have become Scarab Lords, and that number continues to climb as each server progresses their war efforts. The excitement and engagement on Classic since the AQ war effort began has been incredible to watch and we’re grateful to all who joined us for the second War of the Shifting Sands!

     
