
Detect Subtle Differences With Prototype A/B Tests

You've got two or more versions of a prototype, and you need to test how people react to the subtle differences between them.

Producing multiple versions of a prototype is a cornerstone of any design process. Your job is to generate ideas and see which one sticks. But getting to the final version can be hard when the team disagrees. Perhaps the client has asked for something more empirical. A lot is riding on your decision. Gathering more input shouldn't be a blocker.

Testing multiple prototype versions can be handled through a balanced comparison test. This can be a great option if the things you're comparing are easy for people to detect and consider. But what if the differences are subtle? And what if you're looking to get empirical data about what people prefer?

Asking testers to spot the differences between design alternatives can be tricky. The things you care about, the things that are different, the things you know matter as a designer, can go unnoticed by test participants.

Prototype A/B tests

A SoundingBox experience2 prototype A/B test solves these problems by creating a workflow for measuring and comparing the differences between prototype versions. It borrows principles from both usability testing and A/B testing. User test participants provide feedback on your prototype versions, so you don't have to build anything out or test with actual customers on a live site. Group A interacts with version A. Group B interacts with version B. Each participant group can be composed of the same mix of people (the same demographic mix), and each group is asked the same questions about their preferences after interacting with the prototype.

At the end of the day, your prototype A/B test lets you say people preferred version A over B N% of the time. Of course, you can also determine the extent to which people felt successful and were successful using each version—all without leaving the comfort, and low cost, of your favorite prototyping tool. Participants aren't asked to detect differences between the designs on their own. Instead, the numbers tell the story, and the qualitative data helps flesh it out. You can go back to clients and team members and present empirical results, arguing persuasively for one version over the other.
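If it helps to see the arithmetic behind numbers like these, here's a minimal sketch, independent of SoundingBox, of comparing each group's share of favorable responses. The counts are hypothetical, and the two-proportion z-statistic is just one common way to judge whether the gap between groups is larger than sampling noise would explain.

```python
# A minimal sketch (not a SoundingBox feature) of summarizing responses
# from two participant groups. All counts below are hypothetical.
from math import sqrt

favorable_a, n_a = 24, 30   # group A: 24 of 30 rated version A favorably
favorable_b, n_b = 17, 30   # group B: 17 of 30 rated version B favorably

p_a = favorable_a / n_a
p_b = favorable_b / n_b

# Two-proportion z-statistic: a standard way to check whether the gap
# between the groups is bigger than sampling noise would explain.
p_pool = (favorable_a + favorable_b) / (n_a + n_b)
se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_a - p_b) / se

print(f"Version A favorable: {p_a:.0%}, version B favorable: {p_b:.0%}")
print(f"Difference: {p_a - p_b:+.0%} (z = {z:.2f})")
```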

Creating your first prototype A/B test, step-by-step

  1. Create an account if you haven't already.
  2. Create a new study, give your study a name, and select experience2 as your test type.
  3. Choose your devices and people, taking care to choose at least five people for each prototype version you're creating. So a two-prototype test should have at least 10 participants, and a three-prototype test needs at least 15.
  4. Set up your screening questions to determine who can take your test.
  5. Next, the study step will pop up a dialog asking you to name your prototype groups. If you have two prototypes, you'll have two groups. If you have three prototypes, you'll have three groups. Give these groups names that make it easy to identify your prototype versions later.
  6. Close out of the groups dialog and choose a template, or start adding your tasks and questions one by one. Click on any task that you have added. You'll notice that the prototype group names you defined appear in your task. This is where you put your prototype URLs.
  7. If you have additional tasks you would like to include, add those here. Each task that you add has a field for its own alternative prototype versions.
  8. The final step is to set your quotas for any screening questions you have defined. For prototype A/B tests, we recommend choosing yes here: you want the same mix of people in each of your groups to reduce sources of bias.
  9. Initialize your study and click on Get Study URL to try it out as a participant. If it looks good, you're all set to launch.

Read more about designing your study.

A note on sample size

Since prototype A/B tests usually involve asking people for their opinions of an experience, an adequate sample size can be vital to making claims about your data. That said, you can remain agile, and avoid breaking the bank, with around 30 participants per group. If you want greater certainty about your results, you're welcome to go higher, and many customers do.
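To get a feel for why roughly 30 per group is a workable starting point, here's a rough back-of-the-envelope sketch, separate from anything in the product, of the 95% margin of error on a simple preference proportion at different sample sizes (assuming the worst case, p = 0.5):

```python
# Approximate margin of error for a proportion (worst case p = 0.5).
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion."""
    return z * sqrt(p * (1 - p) / n)

for n in (5, 30, 100):
    print(f"n = {n:>3}: ±{margin_of_error(n):.0%}")
# n =   5: ±44%
# n =  30: ±18%
# n = 100: ±10%
```

Five participants per version is plenty for surfacing usability problems, but the interval around a preference percentage is still wide; around 30 per group narrows it considerably, and going higher buys more certainty at diminishing returns.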

Analyzing your results

Once your results are in, click on Analyze and load up your study by clicking on the study tile. You'll notice right off that we load your prototype versions to the right of your study. Clicking on the prototype tile loads more tiles, each of which summarizes the questions you've asked. At a glance, you can see which prototype "won" by looking at the Overall tile and toggling between your versions.

Next, try clicking on the Comparison tab. Here you'll see a chart showing the same summary data for each prototype version, with one dot for each version. If you've iterated and have prior A/B tests you'd like to compare, load them. You'll see them in this view alongside your current study, letting you see how much things have changed between iterations.
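If you prefer to poke at the numbers outside the dashboard, here's a minimal sketch of building the same kind of per-version summary yourself, assuming you've exported responses to a CSV. The file name and the prototype_group and overall_rating columns are hypothetical, not a documented SoundingBox export format.

```python
# A minimal sketch of one summary "dot" per prototype version, assuming a
# hypothetical CSV export with "prototype_group" and "overall_rating" columns.
import pandas as pd

responses = pd.read_csv("experience2_responses.csv")  # hypothetical export

summary = (
    responses.groupby("prototype_group")["overall_rating"]
             .agg(["mean", "count"])
             .rename(columns={"mean": "avg_rating", "count": "participants"})
)
print(summary)
```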

Read more about our analysis dashboard.

Getting to the "why"

All of this is just a starting point. You can see which version won the competition, but you still need to come up with reasons why it won so you can tell the story to your colleagues or your client. That's where the open-ended responses and the replays come in. Participants will often give you clues about their feelings in the open-ended (free-text) responses you've asked them to provide. You can find other clues by replaying their interactions. Did they encounter usability problems, or react negatively in their spoken comments as they interacted?

You'll find replays in the Replay tab, and you'll find open-ended text responses in the Grid view. Remember that clicking on any tile or data point in the dashboard will sort replays by that measure, making it easy to prioritize which responses to watch first.