Some Stuff About User Experience, eCommerce, Social Media & etc.

The Science Of Usability Testing

From unskippable cutscenes to galvanic skin response, we investigate the world of videogame user research.

Difficulty spikes, unreliable checkpoints, context-sensitive buttons that might open a door, but might bounce a grenade into your lap instead: these things matter. “Every moment in a game, you’re bleeding players,” says John Hopson, Bungie’s user research lead. “Hopefully, you’re bleeding them as slowly as possible. The most powerful thing I ever did on Halo was make a graph showing how many players we lost each mission. We had these people: they bought the game, they wanted to play, and we failed them.”
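The graph Hopson describes can be approximated from very simple telemetry. The sketch below is only an illustration: the function name `mission_dropoff`, the input format and the player counts are all hypothetical, and nothing here reflects Bungie’s actual tooling or data.

```python
def mission_dropoff(reached):
    """Given reached[i] = number of players who started mission i+1
    (hypothetical telemetry), return one row per mission:
    (mission number, players lost before the next mission,
     loss as a share of everyone who started the game)."""
    total = reached[0]
    rows = []
    for i in range(len(reached) - 1):
        lost = reached[i] - reached[i + 1]
        rows.append((i + 1, lost, lost / total))
    return rows


# Made-up counts for a four-mission game, for illustration only.
counts = [1000, 900, 600, 570]
for mission, lost, share in mission_dropoff(counts):
    # A crude text bar chart: one '#' per 2% of the starting audience.
    print(f"Mission {mission}: lost {lost:4d} {'#' * round(share * 50)}")
```

Plotted this way, the mission with the longest bar is the one where most players gave up, which is where a researcher would probably look first.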

Usability testing didn’t start with videogames. It started with product development of a more domestic stripe: with teapots, toasters and car dashboards. Although designers have spared a thought for their audiences since the days of Jet Set Willy – it’s hard to make even the simplest videogame without thinking of what the player’s going to do or see from one second to the next – usability has only become a serious issue in the games industry relatively recently. It may be gaining respectability, but with no bespoke track at GDC, no standardised terminology, and no agreed best practices, it’s still one of the least understood aspects of design. That poses some interesting questions. How does the industry approach user research today, and why has something so fundamental waited so long to be taken seriously?

Usability is made up of two elements: user testing, which investigates whether people can understand how to play a game properly, and playtesting, which then looks at whether they’re actually enjoying themselves. Playtesting has been taking place on an unofficial basis since Spacewar. User testing, however, has been far less common.

“The problem is that user testing is complex,” says Chris Viggers, the development director at Blitz Games Studios. “It’s about the psychology of how people interact with a computer and with different control systems. It’s about what they’re expecting out of a game and how they think it should react. You’re working out how to factor it into the game, and making sure that testing sessions are as objective as possible when it comes to what kind of questions you ask. You can quite easily skew your own results by approaching your testers incorrectly.”

That said, certain developers began thinking about usability a lot earlier than others. “Because Microsoft was a conventional software company, they were used to doing usability for Word and Office already,” says Hopson, who worked for the platform holder prior to joining Bungie. “They just transferred that philosophy across. We had to bend the process around quite a bit, though. When you’re testing whether a spellchecker works, you don’t have to worry about whether it’s fun.”

Speaking of processes, while there are currently as many approaches to usability as there are developers, there’s one golden rule everyone can agree on. “Start early,” laughs Dr Graham McAllister, the director of Vertical Slice, the UK’s first game usability studio. “Come to us earlier and we solve more problems. We almost always have fundamental changes to make and, at the moment, most companies come to us at the end. When they get our report and we say: ‘Here are the five things that are absolutely critical and must be changed or there’ll be an impact on the review score,’ it may be too late.”

“Now we do usability as soon as we can get something for people to play,” says Jason Avent, a game director at Black Rock Studios, the creator of Pure. “That can sometimes mean it’s not even first playable for the game: it’s a prototype in XNA or Unity. That gives you enough data to make more committed choices. With Pure, we started user testing early. We had a fairly early version of a track with a couple of massive jumps in it, and just one guy on the track. We had some art, but it didn’t look great. The most important thing was that we had the handling, the collision response, and the rider response. Those were the aspects we were testing. At that stage, you can change stuff, but as you go further and further it gets harder.”

At its core, usability testing is fairly simple: developers bring people in to play their game, and then talk to them about their experiences. Increasingly, however, researchers are trying to look into the player’s head a little more directly. Vertical Slice is one of the pioneers of the biometric approach, using diagnostic tools to dig deeper into user responses. “You’ve got an emotional spectrum,” explains McAllister. “Think of it as a graph, with arousal – positive and negative excitement – on the Y axis, and then mood – happiness and sadness – on the X. Arousal can be measured by galvanic skin response, which we do by placing sensors on players’ fingers, while we measure overall skin temperature to give us valence – whether a player is happy or sad. This is all still research, but we’re already seeing, for example, skin temperature decreasing and an increase in galvanic skin response during combat in good FPS games – they seem to be aroused and happy.”
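McAllister’s two-axis graph can be read as four quadrants. The sketch below assumes both signals have already been normalised to scores between -1 and 1, with 0 as the neutral point; how raw skin temperature and skin conductance would be turned into such scores is not described in the article, so the function `quadrant` and its labels are purely illustrative.

```python
def quadrant(valence, arousal):
    """Place one reading on the emotional graph described above.

    valence: X axis, assumed score in [-1, 1] (sad to happy).
    arousal: Y axis, assumed score in [-1, 1] (calm to excited).
    The cut-off at 0 is an assumption made for this sketch.
    """
    mood = "happy" if valence >= 0 else "unhappy"
    energy = "aroused" if arousal >= 0 else "calm"
    return f"{energy} and {mood}"


# Hypothetical reading taken during a firefight.
print(quadrant(valence=0.6, arousal=0.8))
```

A reading in the top-right quadrant corresponds to the “aroused and happy” state McAllister reports seeing during combat in good FPS games.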

Biometrics helps to pick up disparities between what players say and what they may actually think, but not everyone in the usability community is convinced. “We don’t use biometrics pretty much at all,” says Hopson. “However, we do look at the difference between what people say and what they do. If they say they love the shotgun, but then when we look through the data it shows they never pick it up, we know we have to investigate. It would be too strong to say that I consider the use of biometrics in game research to be snake oil, but it’s close to how strongly I feel about it. To pick up a problem with biometrics that you couldn’t pick up with other techniques, there’d have to be something in the game that isn’t fun, which the player never said isn’t much fun, never acts any differently, and which the experts watching them play don’t pick up on. That’s a very small category of problem.”
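Hopson’s shotgun example amounts to comparing what players report with what the play data records. A minimal sketch of that comparison follows, assuming a 1 to 5 survey rating per weapon and a count of pickups across sessions; the function `say_do_mismatches`, its thresholds and the sample numbers are all hypothetical.

```python
def say_do_mismatches(stated, pickups, sessions,
                      min_rating=4, max_pickup_rate=0.1):
    """Return weapons players rate highly but rarely pick up.

    stated:   weapon -> average survey rating (assumed 1-5 scale)
    pickups:  weapon -> number of pickups recorded in play data
    sessions: total number of play sessions observed
    Both thresholds are arbitrary values chosen for illustration.
    """
    flagged = []
    for weapon, rating in stated.items():
        rate = pickups.get(weapon, 0) / sessions
        if rating >= min_rating and rate <= max_pickup_rate:
            flagged.append(weapon)
    return sorted(flagged)


# Made-up survey and telemetry figures.
stated = {"shotgun": 5, "pistol": 2, "rifle": 4}
pickups = {"shotgun": 0, "pistol": 30, "rifle": 25}
print(say_do_mismatches(stated, pickups, sessions=40))
```

Anything the function returns is not a verdict, only a prompt to investigate, as Hopson puts it.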

“In some ways that’s right – it’s potentially a small set of issues that biometrics can identify,” says McAllister. “But they’re a different sort of issue than can be revealed with other methods. We use the same methods as Bungie also, of course – observing behaviour, listening to the player, interviews. But biometrics help us to identify how the player feels about the game. They also offer us the ability to identify the precise moment when the player reacted to gameplay elements, which helps us to counteract the problem of collecting general feedback. We’ve seen many instances where players may not say anything about an issue, or behave differently, but their physiology can show us that they’ve reacted to a game element. In other words, we can identify the precise second when the player felt something. We then interview the player afterwards about that precise moment.”
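Finding “the precise second when the player felt something” could, in its simplest form, mean scanning a skin-response recording for sudden rises. The sketch below is one naive way to do that and is not Vertical Slice’s method: the sample format, the `min_rise` threshold and the `refractory` gap are assumptions chosen for illustration.

```python
def reaction_moments(samples, min_rise=0.05, refractory=2.0):
    """Return timestamps where skin conductance jumps sharply.

    samples:    list of (seconds, conductance) pairs in time order
    min_rise:   smallest jump between consecutive samples that counts
                as a reaction (assumed value)
    refractory: seconds to wait after a flagged moment before flagging
                another, so one long rise is reported once (assumed value)
    """
    moments = []
    last = None
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rising = v1 - v0 >= min_rise
        clear = last is None or t1 - last >= refractory
        if rising and clear:
            moments.append(t1)
            last = t1
    return moments


# Hypothetical one-sample-per-second recording.
recording = [(0.0, 1.00), (1.0, 1.01), (2.0, 1.10), (3.0, 1.20),
             (4.0, 1.21), (5.0, 1.22), (6.0, 1.40)]
print(reaction_moments(recording))
```

The returned timestamps could then be matched against a gameplay recording, giving the interviewer specific moments to ask the player about.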

Biometrics isn’t the only controversial issue in user research; there’s also the huge matter of where to draw your testers from. “Initially we got guys in who weren’t in our team, but were inside the company,” explains Avent. “Then we got people who were in the same building, then we got friends of people at work, and then we got people on the street. You get more intelligent feedback from people who understand games, but it doesn’t necessarily mean it’s more usable. Sometimes insider knowledge pollutes it and sometimes it doesn’t. You’ve just got to make a judgement. It’s always important to have a few people in the room watching the tests so you can decide what to ignore and what to take on board.”

“Getting the right audience is very important,” agrees McAllister. “We have profiles on everyone we use for testing in our database. We know who’s hardcore and who isn’t, which games they’ve played, how many hours per week they play. We’d like to go further. We’re thinking of psychometric testing to learn about their game styles. It’s so clients can come to us and we can give them the right audience for their tests. It’s one of the things that drives me mad about usability. People will say: ‘82 per cent of people feel like this,’ and I’m thinking: ‘Who were the 82 per cent?’ I want to know much more about the person. We have a list of the big problems in usability – the stuff that’s really hard to solve – and audience is right up there. In fact, understanding people is at the very top.”

And even the right audience doesn’t guarantee that you’ll get the right data. “You need to focus the tests, and focus them on the things the designers are worried about,” says Hopson. “The places where they’d taken a risk and they don’t know how it’s going to play out.”

Bungie user research lead John Hopson (above left) and Vertical Slice director Dr Graham McAllister

Viggers concurs with this point. “It’s important to keep the sessions distinct: one session you’ll look at control or menus, and you won’t look at anything else. You keep doing that throughout the process, and try to get the results feeding back into the game as quickly as possible. Regression testing is crucial, too: expose a feature to the player, get feedback, make changes, and then expose the feature again to see what they think now. It’s a great way of seeing if your changes are working, and if the problems are going away or whether they’re just revealing new problems.”
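The regression loop Viggers outlines can be tracked with nothing more than two lists of observed problems, one from each round of testing. The sketch below assumes issues are logged as short text labels; the function `compare_rounds` and the example issues are hypothetical.

```python
def compare_rounds(before, after):
    """Compare issues seen before and after a change to a feature.

    before, after: iterables of issue labels noted in each session.
    Returns which issues went away, which remain, and which are new.
    """
    before, after = set(before), set(after)
    return {
        "resolved": sorted(before - after),
        "persisting": sorted(before & after),
        "new": sorted(after - before),
    }


# Made-up notes from two sessions on the same menu screen.
round_one = ["can't find options menu", "misses back button"]
round_two = ["misses back button", "confused by new icons"]
print(compare_rounds(round_one, round_two))
```

An empty “new” list alongside a growing “resolved” list suggests the changes are working; anything under “new” is a problem the fix itself may have revealed.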

“Even after that, you get a player’s view of what you’ve made, but it’s all about interpretation,” suggests Avent. “You can’t listen to the solutions that people voice. You have to look deeper, at what they really mean, what they’re really saying, and what’s really making them unhappy. There’s always interpretation involved when you’re user testing, regardless of what methods you use, and it’s crucial to never forget that.”

With such knotty issues to consider, it becomes easier to understand why so many developers still put usability to one side entirely, deeming it too complex or too expensive to work into the production schedule. Avent believes that’s a mistake. “You shouldn’t be put off by usability. You really don’t need all that high-end stuff all the time,” he argues. “It’s like levels of service. You can do the fundamentals very cheaply with your own staff and going out and finding people on the street. For 60 per cent of the testing that has to go on, that probably does it. For stuff like measuring fun and excitement and fine-tuning, I think you need biometrics. It’s here that those additional tools will really help.”

“And even if you only user test late on, there’s still a lot you can change,” says Hopson. “But towards the end of development, what you’re doing is parameter tweaks. You can make the guns fire faster, you can make something more powerful or a little less powerful, you can move things around in the level. You can’t redesign an environment, but you can put a pile of ammo in the middle of the floor. Even at the late stage, there’s still something you can do.”

So what’s driving the industry’s sudden engagement with players and their myriad frustrations with games? “I think people are starting to see that if a game doesn’t get a good Metacritic rating, there’s going to be trouble,” says McAllister. “We can’t affect marketing, but we can work alongside that. I recently saw some research based on a study of 1,700 PS2 games, in North America only. The results said that for a game to sell one million units, you had to get at least 60 on Metacritic. No PS2 game that ever got below that went on to sell over a million. People see this as evidence of the marketing ceiling. What the researchers ended up with is a large number of games that sold under one million, and there’s no correlation about quality – some got very good scores, some got very bad scores. But there is a correlation between all the games that sell over that amount. They all had 80 per cent and above on Metacritic, more or less. It seems there’s a correlation between reviews and buyers, and usability matters to reviewers.”

Black Rock Studios game director Jason Avent (above, left) and Blitz Games Studios development director Chris Viggers.

For Viggers, meanwhile, Blitz’s user research was sparked in part by changes in hardware. “What’s really kicked it off for us, and a lot of people, is that the Wii came out, which had a unique control system,” he says. “We did a Wii launch title, and we had no idea how people were even going to hold the controller, let alone know how they’d react to onscreen prompts. We were very much aware that we had to go back to the player to see how they were just going to relate to it. Now with touchscreen phones and Kinect and Move, every platform has a different – and often untested – way of interacting. Doing a launch title for Kinect, we had no other games to look at to see how people were going to react. We had to start thinking about usability. We had to go outside of the studio, and just bring in real people and test out what they think they’d do, and what they’d expect to happen with the hardware. We know how these pieces of hardware work from a technical level; a user on the street will just approach it in a much more natural way, and we need to capture that.”

“The other thing that’s pushing usability in the industry is free-to-play games,” suggests Hopson. “You’re getting people playing who haven’t paid you any money yet, so every usability problem that stops people from playing has a direct impact on the bottom line. It’s always been difficult to make a return-on-investment argument for usability: how does making this mission better translate into money? But with free-to-play, there’s a very direct line between the two issues.”

“In the old days, designers just designed for themselves,” agrees Avent. “That’s why games used to be hard: the team was really good at them because they balanced for themselves. I don’t think the audience complained so much, either. They literally didn’t have the forums. But you didn’t have so many different alternatives for entertainment, either. You just accepted there’s this amazing new medium for entertainment, but it’s hard. Not any more.”

And so, slowly, studios around the world are beginning to involve the player in the design of games, bringing in playtesters, listening to feedback, and building usability into production schedules. “That’s something we’re moving to: testing with every milestone of the game,” says Viggers. “Testing all the features and getting that feedback flowing throughout the game. It does become an overhead, but the proof is there using it this year with Kinect and Move: sessions with usability have forced us to make large changes, and we’ve been able to do that because of where we are in the cycle of development. That’s proved to us how vital it is that we do this kind of regular testing.”

“Usability and marketing should complement each other,” argues McAllister. “Take a high-profile game like Black Ops. I don’t know what the marketing budget was, but let’s say it’s $10 million. Now, I’m betting the usability budget was fractions of that. When people talk about the difficulty of building usability into the production schedule, I always think about how marketing’s already an accepted part of the process. You wouldn’t start development without a marketing director, but projects often start without usability budgets.” He laughs. “Ultimately, we just have to get better at explaining ourselves. Some people think we’re QA, some people think we’re market research, and we’re not: we’re there to present the player’s perspective before your game is on the shelf.”

We also asked John Hopson, Jason Avent and Codemasters executive producer Adam Parsons for their opinions on some of gaming’s more irritating quirks. You can read what they said here.

via: http://www.next-gen.biz

