Starym

The Second War of the Shifting Sands Behind the Scenes: Engineer's Workshop


Blizzard are taking an in-depth look, from a technical standpoint, at how the second Ahn'Qiraj gate opening series of events was created. They delve into a bit of history, covering the first (Vanilla) opening and what was learned from it, how they used automated players and stress tests to get the second version working as well as possible, the limitations of the Classic/original code itself, the very first openings of the gates and how GMs were following them live and implementing solutions on the fly, and a whole lot more!

If you have even a slight and passing interest in how things are actually done behind the scenes, it's a fascinating look at one of the biggest events ever in WoW history and how it was improved and handled.

    Blizzard (source)

     Join us for a behind-the-scenes deep dive on recreating one of World of Warcraft’s most iconic events, the Ahn’Qiraj war effort.

     

    War is upon us. Earlier this month, one of the most anticipated events of World of Warcraft: Classic went live: the Ahn’Qiraj war effort. Entire Classic realms, the might of the Horde and Alliance combined, came together, contributing resources to open the gates and unlock the Ahn’Qiraj raids. When the War of the Shifting Sands took place the first (and only) time in 2006, thousands of players from each realm flew or hoofed it over to Silithus to partake in or witness the chaos. The turnout was beyond the development team’s wildest imaginings and, simply put, we were not prepared. Servers quickly became overloaded, and many players were caught in a loop of logging in, disconnecting, and trying to get back online over a 12-hour period while our engineers scrambled to hotfix issues and get players reconnected. While we did manage to stabilize servers during the event, and learned quite a few lessons, we saw opportunities to do better. Fifteen years later, we were ready to recreate one of the most epic moments in WoW history for WoW Classic by focusing on server optimization to combat lag and eliminate server crashes, all while hosting up to twice as many players in Silithus as we did during the event’s debut in 2006.

    In this article, we’ll walk you through how we were able to recreate this highly anticipated event by going over how we use automated players and stress tests to determine breakpoints and handcraft optimization solutions, how we came up with solutions in the software to solve problems that hardware couldn’t, and how we curated a global event with limited server crashes, all while preserving the WoW Classic gameplay experience.

    Recreating the Second War of the Shifting Sands

    We had three specific goals in mind when approaching how we would need to engineer this event: prevent chain crashes, increase the expected zone player limits, and determine how much lag was tolerable before porting players outside of Silithus. Before we can get into the nuts and bolts of how we maximized server performance, it’s important to understand the constraints we’re working within: the limitations of WoW Classic’s codebase, how population management solutions work, and how they affect gameplay.

    Anubisaths Invade Azeroth

    Beyond Boundaries

    The modern version of World of Warcraft was built upon the foundation of the original codebase released 15 years ago. Since the game’s launch, we’ve developed more modern ways to handle high player counts within Battle for Azeroth, most notably sharding. Shards allow WoW servers to host many more players in-game than we were capable of in 2006. In Battle for Azeroth, we use them to manage servers’ player load by making a copy of a zone (e.g. Zuldazar) once the player count reaches a certain threshold. This neutralizes lag issues by spreading players across different versions of the zone, since player interactions are among the most CPU-intensive operations due to the number of packets that players constantly send to the server for pinpoint accuracy on their movements and spell casts. Additionally, sharding mitigates potential lag issues that can be encountered when transitioning into a new zone where the player count goes over the threshold. Sounds simple enough, except there’s a catch: WoW Classic has been engineered to be a faithful recreation of the original 1.12 game data, which includes preserving its gameplay quirks. In rare cases, shards will cause your quarry, such as an enemy player or NPC, to disappear when phasing into a new zone. Keeping shards in would mean losing some of those nostalgic gameplay moments of chasing players and NPCs across zone boundaries. So, we needed to come up with a solution that didn’t interfere with the original gameplay while also allowing us to get more players onto the server without forcing them to suffer through unplayable lag.
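The threshold behaviour described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Blizzard's implementation: the class name `ShardedZone`, the `threshold` parameter, and the "first copy with room" placement rule are all assumptions made for the example.

```python
class ShardedZone:
    """Toy model of a zone that opens a new copy once a copy is full."""

    def __init__(self, name: str, threshold: int):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.name = name
        self.threshold = threshold      # assumed per-copy player limit
        self.shards = [[]]              # each shard is a list of player ids

    def add_player(self, player_id: str) -> int:
        """Place the player in the first copy with room and return its index.

        If every existing copy has reached the threshold, a new copy of
        the zone is created for the player.
        """
        for index, shard in enumerate(self.shards):
            if len(shard) < self.threshold:
                shard.append(player_id)
                return index
        self.shards.append([player_id])
        return len(self.shards) - 1
```

With a threshold of 2, five arriving players would land in copies 0, 0, 1, 1, and 2, which is also why two players who cross into the zone together are not guaranteed to end up in the same copy.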

    To handle this issue, we elected to use layers, which are copies of entire regions (e.g. Eastern Kingdoms), to manage player population and lag issues while keeping the memorable charm of the original release intact, so players could once again kite world bosses across zones and chase enemy players across borders within a region without the risk of them being reassigned to a different shard. However, layers were designed as a non-permanent solution. Because the original 1.12 release did not use either sharding or layering technologies, we promised players that we would only use layers at the launch of WoW Classic and phase them out over time as players dispersed more evenly throughout the world. There are a few cases in which we still use layering due to incredibly high populations of active players (e.g. North America’s Faerlina), but we have reduced the number of layers active on these realms since the game’s release. With 15 years of buildup, the AQ war is among the most highly anticipated events of WoW Classic, and we expect it to have the most players in one area, outside of starting areas at the game’s release, without layers to manage it. Without layers or sharding population tech, we had to get creative, and quickly.

    Players Gather around the Gong

    Handcrafting an Unforgettable Experience

    We started the undertaking of finding a population solution that relied on neither layers nor shards by generating headless clients (automated players) and instructing them to mimic what real players might do, such as casting spells, fighting NPCs, and moving around the area. This allowed us to take a snapshot of what performance could look like with thousands of players interacting in a single zone. After running these simulations, we then organized stress tests with volunteers so we could capture realistic player behavior and see how the two compared. This gave us an indication of certain breakpoints and which pieces of our server’s code were experiencing the most issues at high player counts. Server frame time measurements were heavily scrutinized to see how close they were to causing a server to become unresponsive, also known as deadlocking.

    The next step was to analyze what was affecting server performance so we could begin breaking down this monumental task into comprehensible goals. What we faced is a polynomial problem, which means we can’t solve it by throwing faster hardware at it, because hardware’s not exponentially better. Instead, we have to handcraft the optimization by deliberately choosing which data should be communicated to players and how often. To illustrate this conundrum, let’s say we have 20 players jumping in a circle. The server relays the actions of each player to the other 19 through packets (data deliverables). In this group of 20, the server processes 380 packets (20 total players * 19 recipients = 380 packets). This issue compounds when more players do the same action in the zone. If we increase our example to 500 players, then 249,500 packets are sent from the server. If we increase our example again to 1,500 players, then 2,248,500 packets are sent from the server. Depending on player actions, multiple packets are sent per second; keep in mind the above examples only account for one action. The more packets the server receives, the more processing time it must spend on a single player before going on to handle every other player’s actions. When this problem compounds, the servers begin to approach deadlocking. In WoW Classic, we have significantly more players per realm than realms did back in 2006, so the expectation is that we accommodate more players around the gates than we ever did before.
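The packet counts quoted above follow from n * (n - 1): each of n players performs one action, and that action is relayed to the other n - 1 players. A minimal sketch that reproduces the article's figures, where the function name `fanout_packets` is made up for this example:

```python
def fanout_packets(players: int) -> int:
    """Packets relayed when every player performs one action and each
    action is sent to every other player in the group."""
    if players < 0:
        raise ValueError("players must be non-negative")
    return players * (players - 1)


for n in (20, 500, 1500):
    print(n, fanout_packets(n))
# prints:
# 20 380
# 500 249500
# 1500 2248500
```

Tripling the crowd from 500 to 1,500 multiplies the packet count by roughly nine, which is the quadratic growth the article refers to as a polynomial problem.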

    Optimizing Server Performance

    Our servers are engineered to crash and restart if they encounter a deadlock, so we knew it was critical to do everything in our power to help minimize processing time. After some testing, it became clear that movement was the first area putting heavy stress on our servers. We began by dropping facing updates (displaying the direction a character model is facing) and only sending out player updates whenever a player starts, stops, or uses keyboard movement. Since latency with an excessive number of players is already compromised, spending CPU time sending minor facing updates made the fidelity worse. As such, it was better to stop sending them. We made the decision to cull how often we sent movement updates in favor of having more players in a zone. Keep in mind we’re trying to find the breaking point before the servers fall over while allowing as many players into Silithus as possible. After all, it’s better to miss some movement updates than to not be able to log in to your character at all. We also started throttling data that was marked as lower priority. A “less important” action should not be sent at the same rate as “more important” actions. We saw many messages all being sent at once regardless of how important they were, and optimized the code to send less important information in batches and less frequently.
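The two ideas in this paragraph, dropping facing-only updates and sending low-priority data in less frequent batches, might look roughly like the sketch below. It is an assumption-laden illustration: the update kinds, the `PriorityBatcher` class, and the one-second `flush_interval` are hypothetical and are not taken from the WoW server code.

```python
from dataclasses import dataclass, field

# Hypothetical update kinds; the article only names start, stop,
# keyboard movement, and facing changes.
RELAYED_KINDS = {"start", "stop", "keyboard"}


def should_relay(kind: str) -> bool:
    """Relay start/stop/keyboard movement, drop facing-only updates."""
    return kind in RELAYED_KINDS


@dataclass
class PriorityBatcher:
    """Sends high-priority messages at once and holds low-priority ones
    until `flush_interval` seconds have passed since the last flush."""
    flush_interval: float = 1.0          # assumed value, not from the article
    last_flush: float = 0.0
    pending: list = field(default_factory=list)

    def submit(self, message: str, high_priority: bool, now: float) -> list:
        """Return the packets to send right now; each packet is a list
        of messages."""
        out = []
        if high_priority:
            out.append([message])
        else:
            self.pending.append(message)
        # Flush the held low-priority messages as a single packet.
        if self.pending and now - self.last_flush >= self.flush_interval:
            out.append(self.pending)
            self.pending = []
            self.last_flush = now
        return out
```

In this sketch a low-priority message submitted at 0.2 s is held, a high-priority one at 0.3 s goes out on its own, and the next low-priority message after the one-second mark leaves in a single packet together with the held one.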

    Buffs and debuffs were another large hit on our performance. Throughout the world, especially when fighting mobs, buffs and debuffs are applied to units all the time. Though this may not seem like a big deal, with a high concentration of players all around each other, this information needs to be passed around. Similar to throttling low priority data, we now batch the buffs and debuffs to avoid sending multiple packets in succession to players.
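One way to picture the batching of buffs and debuffs is to collect changes per unit and emit one packet per unit on each flush. The sketch below is hypothetical: the `AuraBatcher` name is invented, and collapsing repeated changes to the same aura into the latest state is an assumption of this example, not something the article states.

```python
from collections import defaultdict


class AuraBatcher:
    """Collects buff/debuff changes and emits one packet per unit."""

    def __init__(self):
        # unit id -> {aura name: True if applied, False if removed}
        self._pending = defaultdict(dict)

    def record(self, unit_id: int, aura: str, applied: bool) -> None:
        # A later change to the same aura overwrites the earlier one
        # (assumption: only the latest state matters to observers).
        self._pending[unit_id][aura] = applied

    def flush(self) -> list:
        """Return one (unit_id, changes) packet per unit and reset."""
        packets = [(unit, dict(changes))
                   for unit, changes in self._pending.items()]
        self._pending.clear()
        return packets
```

Three changes recorded for one unit between flushes would leave as a single packet rather than three.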

    Managing Player Populations

    Aside from optimizing the servers to handle more players in each zone, it didn’t escape us that it’d be impossible to fit an entire realm’s population (more than double what the original 1.12 WoW realm could handle) all within Silithus. Hard decisions had to be made to limit access into the zone by controlling who we allowed in and how many players we could allow in. We decided that we would only allow level 60 characters inside Silithus and would stop allowing eligible characters inside if it was full. Creating this restriction was the right choice to make since the event in Silithus is known to be end-game content, and lower-level characters can still participate in the war effort in other zones, such as slaying the anubisaths that roam in The Barrens, which are intended for level 20 to level 30 players. The second sticking point was that we knew the upper bound for how many players in an area we could handle without crashing the server; the question then became what that number should be reduced to for the best performance-to-player ratio. Over testing, we found this number to be around 1,500 players if they were stacked on top of each other. However, since the event takes over the whole zone, we saw minimal performance problems once players spread out.
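The admission rules described here (level 60 only, no entry once the zone is full) reduce to a small gate check. The following is a sketch under assumptions: `ZoneGate` and its methods are hypothetical names, and the cap of 1,500 is the approximate figure mentioned in the article rather than a confirmed production setting.

```python
from dataclasses import dataclass, field

REQUIRED_LEVEL = 60   # from the article
ZONE_CAP = 1500       # approximate figure from the article, assumed as the cap


@dataclass
class ZoneGate:
    """Decides whether a character may enter a capped zone."""
    cap: int = ZONE_CAP
    occupants: set = field(default_factory=set)

    def try_enter(self, player_id: str, level: int) -> bool:
        if level < REQUIRED_LEVEL:
            return False                  # not eligible for the zone
        if player_id in self.occupants:
            return True                   # already inside
        if len(self.occupants) >= self.cap:
            return False                  # zone is full
        self.occupants.add(player_id)
        return True

    def leave(self, player_id: str) -> None:
        self.occupants.discard(player_id)
```

A slot frees up as soon as someone leaves, so an eligible player who was turned away can be admitted on a later attempt.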

    The event was planned to take place in all regions, so we had to make sure this event worked across multiple layers. This means that a Scepter-bearer who rang the gong on one layer should begin the event across all other layers connected to that realm. Since the trigger for the event was based on a player interaction, we wanted to ensure the Scepter-bearer was visible across multiple layers so all players on the same realm could see them. This created an interesting problem since servers now had to relay information that they typically wouldn’t need to communicate to each other. This can create a lot of complications as we compile and send updates through the servers to make sure we mirror the data across multiple layers, potentially to thousands of players.

    We began developing this tech with the introduction of the Stranglethorn Fishing Tournament and applied it to the Onyxia, Nefarian, Zul’Gurub, and Rend world buffs later. Once we felt it worked as intended, we were ready to test it along with our other tech for the AQ war event.

    Horde Players in Silithus

    Experimenting with Solutions

    Now that we had addressed major tech hurdles and implemented several ways to optimize server performance, it was time to test everything we had worked on. We created a shortened version of the 10-hour war, scaled down to only run for an hour.

    During the first stress test, we let nearly all players into the zone to see what would happen. At one point, we were at nearly 150% of the capacity of an entire 1.12 realm. This was when we saw our test realm crash. We knew we had put a very high number on how many people we’d cap the zone to, and we were seeing numbers that greatly exceeded that cap. We investigated the issue and realized that the code allowing players to transfer both into a zone and out of a zone was a queue that didn’t process many players at once. This was why players weren’t being ported out and why players were stuck on flight paths for an unusually long time. We restored the server and continued the stress test, adjusting as we went. We slowly lowered the number to a point where we felt it was still laggy, somewhat playable, and retained a much higher number of players than any zone had seen before. The event that was supposed to only take an hour and a half ended up taking up to four hours to complete because of crashes.

    The second stress test was performed a week later. This allowed us to see if our optimizations worked. Upon loading into the stress test, we immediately noticed improvements: players were no longer stuck on flight paths leading into Silithus! We were able to obtain enough data to demonstrate how many players we could comfortably have in Silithus. After both tests, we moved forward with numbers that we felt accounted for the best balance between managing lag and server stability. We consider both tests successful since they allowed us to identify zone caps and iterate on them.

    Spreading Server Solutions Across Azeroth

    Originally, the optimizations were planned to only be active for Silithus during the War of the Shifting Sands. After we determined they’d be safe to roll out globally, we applied them to the entire world in 1.13.5. Once the war effort started, players began turning in supplies and harvesting bug corpses en masse. We saw a massive spike of players not only in Silithus, but also in our capital cities and outside zones. These optimizations helped make these experiences more performant, allowing large-scale PvP battles to take place across Azeroth. Some players even went as far as spawning the world boss Thunderaan to help clear out the other faction from a Hive.

    Even though the gate opening event hadn’t taken place yet, some servers were experiencing strange issues with their war effort not progressing. Some servers were completing their war effort so fast that it could trigger a race condition in the logic of each turn-in, which could prevent the five-day timer from starting. Since the chance of this edge case happening was so small, we were able to fix those servers manually and then address this issue for future realms completing their effort.
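The article does not say what the actual bug or fix was. As a generic illustration only, a race of this kind is commonly avoided by doing the "add the turn-in, check the goal, start the timer" sequence under one lock so that exactly one turn-in performs the transition. The `WarEffort` class and its fields below are hypothetical.

```python
import threading


class WarEffort:
    """Toy war-effort counter whose completion timer starts exactly once."""

    def __init__(self, goal: int):
        self.goal = goal
        self.total = 0
        self.timer_started = False
        self._lock = threading.Lock()

    def turn_in(self, amount: int) -> bool:
        """Add a turn-in. Return True only for the single call that
        pushes the effort over the goal and starts the timer."""
        with self._lock:
            # Update and check happen atomically, so concurrent
            # turn-ins cannot both skip or both start the timer.
            self.total += amount
            if self.total >= self.goal and not self.timer_started:
                self.timer_started = True
                return True
            return False
```

Without the lock, two near-simultaneous turn-ins could each read a stale total, which is the sort of window that only shows up when turn-ins arrive very quickly.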

    Once the war efforts had been completed and five days had passed to open the gates, we began monitoring the Chinese realms that were first to open in the world. The first server in China to have an active Gong was Ouro. As we monitored our layer populations, we saw that most players on each layer were in Silithus. The event going off across multiple maxed-out layers for several thousand players at once was something we’d never done before. Though there was apparent lag, our servers didn’t experience any crashes during the first set of China realms opening.

    Bang a Gong!

    On August 4, it was noted that there would be several realms in North America ready to hit their gongs shortly after servers came up from reset. One by one, we watched these realms on Game Master accounts and through our observation tools so we could address any issues that might be encountered. Each realm opened and began the event without issue. Scepter-bearers received their prestigious Black Qiraji Battle Tank mounts, players got to fight even bigger bugs, and we were pleased with the stability. As we were waiting for our first post-reset server to complete its five-day wait period, we noticed a significant issue: events weren’t persisting after server restarts. This means that if a server crashed or restarted, we would lose all progression in the event. Though this problem had existed since the beginning of WoW Classic’s development, there hadn’t been many cases where events needed to persist across server restarts. Our team was able to address the problem quickly, but we needed to ensure that no further restarts could happen until we were able to deploy a fix and properly catalog the existing status of all war efforts into our database without interruption to players.
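Persisting event progress so that it survives a restart can be illustrated with a small save/load pair. This sketch uses a JSON file as a stand-in for the database the article mentions; the function names, the file format, and the state fields are all assumptions for the example.

```python
import json
import os
import tempfile


def save_event_state(path: str, state: dict) -> None:
    """Write the state to a temporary file, then atomically swap it in,
    so a crash mid-write never leaves a half-written state file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_event_state(path: str, default: dict) -> dict:
    """Return the saved state, or `default` if nothing was saved yet."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return default
```

On startup a server would call `load_event_state` and resume from the stored stage instead of beginning the event from scratch.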


    Some may argue that allowing servers to crash is what made the original AQ war chaotic, which in turn made it memorable. Instead, we strove to cultivate that same fervor by curating a much more stable experience that could be shared with around 1,500 players in Silithus at the same time on each server. We wanted the memories of the Classic AQ war to be of having as many players as possible play through the 10-hour event without interruption. While we did experience a few realms crashing, these realms fully recovered and were back online within minutes, and no subsequent crashes took place.

    Over 4,000 players worldwide have become Scarab Lords, and that number continues to climb as each server progresses their war efforts. The excitement and engagement on Classic since the AQ war effort began has been incredible to watch and we’re grateful to all who joined us for the second War of the Shifting Sands!

     
