Late nights, high scores, and blanket forts: the challenges of testing music and fitness games

For games like Just Dance and EA Sports Active, QA is a tricky process

Photo by Amelia Holowaty Krales / The Verge

“I hopped into the lounge, with myself, and was baffled to see that this dumb idea actually worked.”

One day during the testing of Dance Central VR in the Harmonix office, Robbie Russell was struggling with a very particular problem. They were running a multiplayer check of a social hub feature where up to four players can hang out in a private virtual space and take part in dances, play mini-games, or just chat. But everyone else in quality assurance (QA) was busy at the time. So Russell came up with a workaround.

“You can’t just log someone in and then take the headset off because eventually it’ll time out,” they say. “So me, in my infinite dumbassery, put two headsets on, one Oculus Quest and one Oculus Rift, and strapped two sets of controllers to my wrist. The Rift controllers had holes big enough that I could slip my hands through them and wear them like very stupid bracelets while the Quest controllers fit in my hand normally. That became my go-to way to check multiplayer stuff for the rest of the project.”

Finding ways to cheat video game features and mechanics is not uncommon in QA. Testers are expected to dive into each corner of the experience in a thorough manner, often multiple times, to ensure that everything is working as intended. DIY solutions and shortcuts are bound to happen to optimize time. When it comes to QA testing for music and fitness games, however, these hacks aren’t just a way to save time — they’re a necessity.

The physical component of these genres, all the way from Dance Dance Revolution to Ring Fit Adventure, has always made them unique. Breaking a sweat with a drum kit in Rock Band or strapping a Joy-Con around your leg to go for a jog in Nintendo’s fitness RPG are not experiences you find in most games. But while players can pause or take breaks at any time, testers have to juggle these activities on a 9-5 routine. All of a sudden, the prospect of doing hundreds of jumping jacks or dancing to “What Does The Fox Say” a dozen times per day isn’t as enticing.

Usher during the reveal of Dance Central 3 in 2012.
Photo by Robyn Beck / AFP / GettyImages

The development of EA Sports Active 2 is a prime example of this complexity and of how the work differs from traditional testing. The first game in the series was released as a Wii exclusive in May 2009, offering body workouts that required wearing a strappable pouch to hold the Nunchuk controller, alongside a resistance band depending on the exercise. Its sequel, released in November 2010, was multiplatform. This meant that PlayStation 3 players used Move controllers, while Xbox 360 players had the just-released Kinect scanning their full body instead.

A contract QA tester with Volt Media Consulting working at EA Vancouver at the time, who we’ll be calling Kyle upon his request for anonymity, recalls the project as a nightmare. “It was kinda like one of those Benny Hill sketches where everyone is running around,” he says over a Discord call.

QA was divided into three groups, one per platform, and Kyle was part of the Xbox 360 team. For both him and his co-workers, this meant learning the ins and outs of Kinect’s first iteration. There was no training manual available. Aside from the availability of an engineer (who was also learning on the fly) to verify whether certain bugs came from the game or the Kinect itself, understanding how the hardware worked came purely from trial and error.

Due to the sensor-tracking nature of the device, the entire office floor had to be adapted. Testers taped off and fragmented the room into triangle-shaped spaces after finding out exactly where the limits of the Kinect’s sensor started and ended, marking the perfect spot — an imaginary box that doesn’t exist for anyone except the hardware. 

Everyone was sharing desks to avoid moving their own all the time, as that would ruin the invisible spot, while being extra careful during walks around the office to avoid being read in somebody else’s triangle. Even jackets hanging on the walls had to be carefully placed. But the worst moments were when the device couldn’t read people at all, whether that was because they were wearing dark colors or baggy clothes like a hoodie. Figuring out what to wear at work became part of the learning process.

“The Kinect used infrared,” Kyle continues. “In the summer, when it can get over [80 degrees Fahrenheit] in Vancouver, the air conditioner ran through the entire building. If you by chance had one of the massive air vents in your triangle with cold air blowing on you, you would lose your body being read.” Since the Kinect could only read a person by their body temperature, the team had to make covers or lids to block the air conditioning vents. 

According to interviews with eight current and former QA workers in AAA studios and outsource companies, not all offices offered the proper accommodation to cover the basic needs of this kind of testing. Elements such as showers, proper ventilation, hydration at hand, and sound-proof spaces varied greatly.

An anonymous source who worked on Just Dance for Ubisoft Reflections says the office was well-equipped, including showers and gym mats to dance on. There were around 10 testers on Just Dance 4, with more at other Ubisoft studios due to co-development. “We had a lot of space and the way they’d set up the consoles and PCs made it easy to test and input bugs quickly,” they add.

While working on Xbox Fitness at an outsource studio, one former QA recalls the company commissioning a purpose-built gym to be constructed above a kitchen and bathroom stalls, providing ventilation and preventing noise disturbances. In addition, the team was also able to expense appropriate clothing after requesting a budget for it.

Milan Games Week 2019
Photo by Emanuele Cremaschi / Getty Images

The inclusion of properly equipped, separated spaces for QA that work on these types of games is crucial. Not just to keep singing or loud noises at a minimum, but also to divide such different work environments. When daily tasks demand physical activity in the way these tests do, it becomes the equivalent of putting an office and a gym under the same roof.

A former QA on Rock Band 4 and the expansion Rock Band Rivals, as part of the outsourcing studio Keywords, originally worked in a noise-proof room located within the main floor. The company later had it torn down to expand the space, so QA (composed of around 16 testers) was relocated with everybody else. “I tested mainly drums and vocals which is an obvious distraction for everybody,” they mention. “Couldn’t do much about it.”

At the Harmonix office, Russell (who is currently a senior QA at Proletariat working on Spellbreak) adds that there was a dedicated room for drum testing in Rock Band, and meeting rooms would often be booked entirely to do full-band runs when needed. But for the most part, people would “always hear somebody clacking away at a guitar or quietly humming at their desk to test vocal pitches.”

At Harmonix, there was plenty of room for QA to do their jobs safely, and hydration was available. Sadly, aside from a member-exclusive gym in the building (which was free for salaried employees), there were no office showers available. This is similar to how amenities worked at EA Vancouver. According to Kyle, there were showers and lockers available in the gym, but only full-time employees had access to them. The EA Sports Active 2 QA team, due to how chaotic the project was, was granted this privilege. But this wasn’t the norm for contractors. “It was kind of encouraged among the team to take breaks and not overexert yourself,” Russell explains. “But it’s rapid physical activity, you’re gonna get sweaty eventually.”

Caelyn Sandel, Caves of Qud developer and former QA at Harmonix, recalls the encouragement for “everyone to bathe regularly” from the team leads. “I’m not saying it never got ripe, but it wasn’t a persistent issue.” That being said, Sandel recalls a specific day that stood out from the rest. “One time the HVAC system had some kind of crisis and that smelled so bad that we all evacuated and went to Flour Bakery. That was a fun day except for the horrible stink.”

The lack of widely available showers isn’t unique to Harmonix or EA. During a CEDEC 2020 presentation reported by Famitsu and translated by PushDustin on Twitter, it was mentioned that the Ring Fit Adventure office only had one, so the testing team would have to wait sometimes. (Nintendo declined our request for comment.)

The intensity of the required effort from testers, even as they all juggle far more ordinary QA tasks in between (inputting bugs, working on tickets on platforms such as JIRA, maintaining databases, and so on), can lead to a myriad of results. Not all QAs are professional drummers, dancers, or fitness trainers.

Two of the developers recall how the constant movement and exercising helped them get in better shape, one case being the result of the studio hiring professional dancers who had teaching experience and would lead testers through stretches and better practices. For others, the training activity during the day job was a complement to already established exercise and health habits, while a few opted not to change theirs.

While pretty much everyone I spoke to mentioned ways in which they could get away with reducing the physical workload, it’s important to note that tests that involved completionist achievements or high difficulties would usually require traditional testing. Without the possibility of workarounds, some tests could only be done by someone who excelled at the task at hand.

These logistics are part of daily planning. For a studio like Harmonix, this meant making sure there were enough working instruments for particular tests, knowing who in the office had a reliable vibrato to test out certain exploits on vocals, and, ultimately, recognizing whether there was someone who had the required level of skill to test specific bugs.

Some cases could be bypassed by just playing on lower difficulties, but the full spectrum had to be tested at some point to avoid inaccurate results. Drum testing on a fake plastic kit might not seem like much. Yet, as Russell points out, playing it at a decent skill level “is a huge burden and often caused a lot of issues.” If the two people who were good enough at drums to play on the expert difficulty weren’t in the office, then those bugs had to wait.

In most cases, testers were able to take breaks throughout the day without objection from leads or management, sometimes outside of scheduled company breaks as well. EA, for example, offered two 15-minute breaks across mornings and evenings, on top of the lunch hour. Everyone would usually be understanding, allowing QA to trade tasks between them and, once again, encouraging them to find ways to cheat exercises.

Based on those I interviewed, studios and managers mostly had the right intentions in terms of setting healthy boundaries. But in practice, issues often arose. Testers who suffered any kind of injury outside of work still had to put up with physical testing, although there was a push to focus more on administrative work during the recovery periods. Scenarios like this, however, are inevitable, even in safe and accommodating environments.

Two particular cases stood out during interviews. An anonymous QA who worked on the Nintendo DS game New International Track & Field says that while there weren’t any set breaks, the team didn’t need to put 100 percent effort in, except for high-end tasks or multiplayer tests. The latter would get “a bit too competitive” when the producers would ask the team to take part in a game mode that involved playing every event back to back. Producers would claim the testers wouldn’t be able to beat them, staking “free morning Starbucks on the challenge” as an incentive.

A similar scenario took place at EA during EA Sports Active 2, set up by a POC (point of contact). Back in the day, POCs acted as intermediaries who received work orders from managers or developers and passed them on to employees. While our source is unsure whether this came as an order, he recalls a voluntary incentive for testers: whoever lost the most weight within a certain period of time would receive a prize. The reward was a one-month gym pass or a punch card that could be used for a gym or yoga class, one-on-one training, or even a consultation with a chiropractor, all activities that were already available to full-time employees.

On social media, it’s common to see people joking about how QA testers who work on these games must be in excellent shape, but the reality is far more difficult to parse. Office preparations and day-to-day responsibilities play a big role, but so do each person’s capabilities and health conditions. In most cases, the aforementioned problems only compound the issues that testers already have to deal with across the video game industry as a whole.

In a report on Paradox Interactive, one tester mentions how the team “often got overlooked and it was very, very obvious that QA was lowest down on the hierarchical ladder.” During an interview with EGM, a contractor who worked on Titanfall 2 paints a similar picture of the clear and enforced distinction between salaried employees and contractors, as the latter are often subject to lower pay and a lack of benefits. In that same report, another source calls working QA for VR projects a “horrific job.” Both internships and contract work are especially common among young developers, offering a foot in the door of the industry in exchange for deplorable conditions. In many cases, QA is considered a stepping stone instead of receiving the same recognition as any other role in game development.

Sandel considers her time at Harmonix a “weird employment experience,” as she was kept as a contractor (alongside the professional dancers hired for Dance Central 2) for two years with “no benefits and low pay.” She mentions it was a different world after getting hired full-time for Dance Central 3. “Harmonix’s healthcare coverage is the best in the industry by a country mile. I’m still kicking myself for only transitioning after I stopped working there.”

For many testers, one of the things that made these challenging conditions more tolerable was the camaraderie that emerged from working with a small team on intense projects. “I have this really strong positive memory from close to the end of Dance Central’s production,” Sandel says, “trying to record the filter data for individual moves over and over because the first scoring system was a mess. We were up late and kind of delirious, fueling ourselves on stress and whiskey. We crunched a lot for the first game and I definitely sort of fell for the exhilaration of it, which was helped by how personable and fun to be around the team members were. Thankfully, crunch eased up a bit on subsequent projects, but we did still crunch.”

The former Ubisoft QA recalls their time working on the Just Dance series with “a massive wave of nostalgia” as they “genuinely had so much fun on that project.” They fondly remember one specific contraption called “Sweaty Betty,” which involved strapping a controller to a desk fan to test the scoring during Just Dance 3’s development.

But the series’s tight schedule often resulted in late nights. “It’s difficult for me to critically analyze exactly why there was so much overtime on that project since I was so new to the industry at the time, but Just Dance was usually featured at E3 too so there was a lot of work to get that ready for the reveal.” The source adds that at that point in time in their career, they were “extremely stoked to be in the industry, so I kind of expected it.”

According to Kyle, in order to avoid overtime, the team was split into day and night shifts. During the last few months of development, EA put budget aside to offer employees working late nights “10 to 15 dollars” toward the campus’s own restaurant for dinner. A second Volt Media contractor recalls taking part in the so-called “graveyard shifts.” During the last month or two of the project, some shifts would go from 8AM to 8PM, while the other half of the team worked 8PM to 8AM. The source did about five nights a week.

Photo by Amelia Holowaty Krales / The Verge

While most of the stories from the developers I spoke to happened years ago, it’s worth noting that the issues surrounding QA continue to this day. Music and fitness games have their own particular challenges when the work conditions aren’t suitable for the physical task at hand. But at the very least, this also opens the door to inventive solutions. As absurd and anecdotal as they may sound, they’re often a lifesaver.

During a graveyard shift, the second EA contractor recalls one time when the team was asked to test the Kinect in low-light situations. Since the project was one of the first games for the hardware, the UX was uncharted territory. This, of course, led to its own array of experimental testing.

“For a lot of us, around 5AM, when your last two brain cells have decided they’ve had enough, that’s when the real ingenuity kicks in,” the source concludes. “To ensure the game would pick up the user’s motions in a low-light environment, some of us ended up using a combination of cubicle walls and blankets to build tunnels that we would modify to allow more or less light in to see how the game responded. It’s the only time in my life I’ve been paid to build blanket forts.”