How Ordinary User Testing Can Miss What's Amazing About Websites

July 13th, 2017

  Steve Ellis

Usability is essential. It establishes a baseline. Without it you have nothing but problems. But once you've got it, you need something else.

Frictionless, seamless. These are words we use to describe what we think are the best designs, for products that are so good, they disappear. You forget you are using them.

My Keurig makes coffee seamlessly. I don’t have to think about boiling water, or about how a traditional coffee maker filters hot water through coffee grounds. I don’t have to think about measuring the coffee that goes into the filter or the water that gets poured in the top. I don’t think much about making coffee the old way anymore.

A bedazzled Keurig on Etsy
It resists Keurig's tendency to make me not think about making coffee.

Websites ≠ coffeemakers

But websites are not coffeemakers. Websites are so much more, except in their most utilitarian form, when we use them to do the most boring things, like registering for an account or entering credit card details. Most websites that matter even a little bit are interfaces to bigger things like mortgages or the Library of Congress or Cinnabon or the Red Cross or Kleenex. Websites are magical looking glasses we create to communicate complex ideas and feelings.

Were we to break down the average bank website and quantify how much of it is devoted to this “communication of complex ideas and feelings” and how much to simply getting things done, like applying for an account, we would find something like an 80/20 split in favor of the former.

Most bank websites devote a good deal of effort to communicating the feeling that the bank values you as a customer, that the bank knows what it is doing. The best bank websites go further by leaving you feeling more confident about your financial well-being. The best experiences are mostly about a “feeling” and not necessarily about being more efficient at something. Efficiency might enter into it—making it faster to get to a content piece or calculator—but efficiency isn’t the whole story. But if websites are not coffeemakers, do we study them like they are?

I think we mostly do.

Some history: The science of usability was developed to improve efficiency and reduce human error as automated systems became more complex. Usability is a discipline of engineering, which concerns itself with such things. Usability finds user interface problems, problems that can largely be fixed by making things more efficient and more clear, to reduce errors. Usability predates websites by a couple of decades.

As the science of usability advanced, researchers began to notice something. You could find most problems by testing with a small number of people, with as few as five. In my own experience having run hundreds of usability studies over the years, it’s true: you’ll find the majority of problems with just a few people. But if you’re looking to learn about more than just the problems of a site, studying a few people won’t get you as far as you might like.
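The “five people” claim comes from a simple probability model often attributed to Nielsen and Landauer: if each participant independently uncovers a given problem with probability p (their published estimate was roughly 0.31, not a universal constant), the expected share of problems found grows quickly and then flattens. A minimal sketch, under those assumptions:

```python
# Sketch of the classic problem-discovery model. Assumption:
# each participant independently finds a given problem with
# probability p, so n participants find 1 - (1 - p) ** n of
# all problems. p = 0.31 is Nielsen and Landauer's estimate.

def problems_found(n_participants: int, p: float = 0.31) -> float:
    """Expected fraction of usability problems found by n participants."""
    return 1 - (1 - p) ** n_participants

for n in (1, 3, 5, 10, 15):
    print(f"{n:2d} participants -> {problems_found(n):.0%} of problems")
```

With p = 0.31, five participants surface about 84% of problems, which is why small studies work so well for problem-finding. But the model only counts problems; it says nothing about whether the experience hits the mark in other ways.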

How to look beyond usability

Think about how a website can elevate a brand to a new height, or communicate the importance of an idea, like why we might want to pay attention to something like global warming, or care about refugees, or buy this particular pair of purple pants now instead of some other time. Conventional user testing will only get us so far. We’ll find problems for sure, but likely miss whether the experience hits the mark in other important ways.

In a user test, people use a website. They think aloud as they do it. In a moderated test, we ask them non-leading questions. We probe. But after our test is done, how do we know whether Kleenex now matters more to people than Puffs? How do we know if people really want to call it “global warming” instead of “climate change” after visiting our global warming action website? How do we know if they care more about refugees after their visit? How do we know if they’re more likely to buy the purple pants on one site than another (and why)?

Measuring and comparing

One way we can answer questions like this is by measuring and comparing. Although it may not require as large a sample size as you might think, measuring requires more than five people to give us valid statistics. Because of this we recommend testing with at least 30 people. But some will inevitably ask: why measure to compare at all? Just show the same five people the same two sites and ask them to make the comparisons. Done.

Doing a purely qualitative, comparative test will reveal valuable qualitative findings, but consider the following problems. When we’re measuring things beyond usability, we’re asking people to tell us their opinions about their experiences. We’re asking them to tell us about their feelings. How do we know they can articulate those feelings in words? How do we know we are hearing them accurately? This is where measuring with simple Likert scales can help. To put it another way: there’s a reason why we don’t try to predict elections with focus groups.
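To make that concrete, here is a minimal sketch of what measuring and comparing Likert responses might look like. The ratings below are invented, and the 95% confidence interval uses a normal approximation (reasonable around n = 30); a real study would use proper inferential tests.

```python
import statistics
from math import sqrt

# Hypothetical 1-5 Likert responses ("I feel more confident about
# my finances after using this site") from two 30-person panels.
# The data is invented purely for illustration.
site_a = [4, 5, 3, 4, 4, 5, 4, 3, 5, 4, 4, 5, 3, 4, 4,
          5, 4, 4, 3, 5, 4, 4, 5, 4, 3, 4, 5, 4, 4, 4]
site_b = [3, 3, 4, 2, 3, 4, 3, 3, 2, 4, 3, 3, 4, 3, 2,
          3, 4, 3, 3, 3, 2, 4, 3, 3, 4, 3, 3, 2, 3, 3]

def summarize(scores):
    """Mean and 95% CI (normal approximation, z = 1.96)."""
    mean = statistics.fmean(scores)
    se = statistics.stdev(scores) / sqrt(len(scores))  # standard error
    return mean, (mean - 1.96 * se, mean + 1.96 * se)

mean_a, ci_a = summarize(site_a)
mean_b, ci_b = summarize(site_b)
print(f"Site A: mean {mean_a:.2f}, 95% CI ({ci_a[0]:.2f}, {ci_a[1]:.2f})")
print(f"Site B: mean {mean_b:.2f}, 95% CI ({ci_b[0]:.2f}, {ci_b[1]:.2f})")
```

When the intervals don’t overlap, the difference is likely real rather than noise — the kind of conclusion five think-aloud sessions can’t support.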

Measurement helps us see things that are hard to see. It focuses our attention as researchers. It also helps focus the attention of others when we communicate results, and it lets us track things over time—something we humans also have trouble with.

We think another reason we often don’t measure is the difficulty of setting up tests that enable measurement and comparison. We think this should change. One goal of SoundingBox is to make well-designed quantitative, comparative tests as easy to conduct as conventional user testing. One day, we hope that all user testing will incorporate measurement and comparison to help us capture the full range of human experience, not just usability, so that what is amazing about websites can shine through.