Usability Testing: How to Run Effective User Tests

You've shipped a feature your team spent weeks building. Usage metrics come in. They're disappointing. Users aren't completing the flow. Support tickets pile up asking how to do something that seemed obvious. The checkout page everyone agreed was "clean and intuitive" has a 68% abandonment rate.

Sound familiar?

Here's the uncomfortable truth: what makes sense to your team rarely makes sense to users. You've spent months staring at the same screens, talking about the same features, building assumptions on top of assumptions. Your users see it for the first time with fresh eyes and zero context. They get stuck in places you never anticipated.

Usability testing catches these problems before they cost you users, revenue, and credibility. It's the practice of watching real people attempt real tasks with your product, observing where they struggle, and fixing those friction points before launch. Done right, it's the fastest way to turn guesswork into validated design decisions.

This guide walks you through how to run usability tests that produce actionable insights. You'll learn how to define what to test, recruit the right participants, and ask questions that reveal truth instead of confirming your biases. Whether you're a product manager running your first test or a seasoned practitioner looking for sharper techniques, you'll find a practical framework you can apply immediately.

What Is Usability Testing?

Usability testing is a user research method where you observe real users attempting specific tasks with your product. You're not asking people what they think of your design or whether they like the colors. You're watching them try to accomplish something: sign up for an account, complete a purchase, find a specific piece of information. Then you document where they succeed, struggle, or fail completely.

The key distinction from other research methods: usability testing evaluates how well your solution works, not whether people want it in the first place. It sits on the evaluative side of research, answering questions like "Can users complete this task?" and "Where do they get confused?" rather than "Is this problem worth solving?"

This makes usability testing different from user interviews (which explore problems and needs), surveys (which collect opinions at scale), A/B testing (which compares conversion metrics), and focus groups (which gather reactions to concepts). Each has its place in your product discovery toolkit, but usability testing is specifically designed to identify friction in existing or proposed solutions.

The output isn't a satisfaction score or a list of feature requests. It's a documented record of where users got stuck, why they got stuck, and how you can fix it.

Why Usability Testing Matters for Product Teams

The business case for usability testing is stark. According to a Cambridge/NIST study, developers spend 42% of their time reworking code due to avoidable errors. Many of these errors stem from building the wrong thing or building it in a way users can't figure out. That's more than 40% of your engineering capacity spent fixing problems that shouldn't have shipped in the first place.

The flip side is equally compelling. Forrester Research reports that every $1 invested in UX returns $100, a 9,900% ROI. And the Baymard Institute, drawing on 200,000+ hours of UX research, found that large e-commerce sites can increase conversion rates by 35.26% through checkout redesign alone.

But beyond the numbers, usability testing changes how teams make decisions. Instead of debating whether users will understand a flow, you test it. Instead of assuming the design is intuitive because it makes sense to you, you watch someone prove or disprove that assumption in real time.

"The best ideas come from the people that are doing the work."
Ryan Sousa, ex-Amazon

This principle applies equally to usability testing. The insights don't come from executives in a conference room debating user behavior. They come from watching users actually behave. And the people best positioned to interpret those insights are the ones building the product.

The Sample Size Question: How Many Users Do You Need?

One of the most common questions in usability testing is how many participants you need. The answer depends on what you're trying to learn. But there's a research-backed starting point that works for most teams.

"Five to seven users per segment. That's the magic number for usability testing."
Nikki Anderson, User Research Lead at Zalando

The "per segment" qualifier is crucial and often overlooked. If you're testing a product used by both novice and expert users, you need 5-7 of each. If you're operating in multiple markets with different user behaviors, you need 5-7 per market. Five users total tells you nothing if three of them are power users and two are first-timers. You're conflating fundamentally different experiences.

"When I say talk to five to seven users, it's five to seven users per segment."
Nikki Anderson

The research behind this comes from Jakob Nielsen at Nielsen Norman Group, who found that testing with 5 users uncovers approximately 85% of usability issues. Returns diminish beyond that: 10 users find about 95% of issues, and 20 users find 98%. For most qualitative usability studies, 5-7 users per segment gives you the best insight-to-effort ratio.
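
The math behind those figures is a simple probability model. Here is a rough sketch, assuming Nielsen's commonly cited average detection rate of about 31% per user (so the exact percentages differ slightly from the ones above):

```python
# A rough sketch of the problem-discovery model behind these figures,
# assuming an average detection rate of ~31% of issues per user.
def issues_found(n_users: int, detection_rate: float = 0.31) -> float:
    """Expected share of usability issues uncovered by n_users."""
    return 1 - (1 - detection_rate) ** n_users

for n in (1, 5, 10, 20):
    print(f"{n:>2} users -> {issues_found(n):.0%} of issues found")
# 1 user   -> 31%
# 5 users  -> 84%  (close to the ~85% figure above)
# 10 users -> 98%
# 20 users -> 100% (rounded)
```

Each additional user mostly re-finds problems earlier participants already hit, which is why the curve flattens so quickly.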

The exception is quantitative studies where you need statistical significance. If you're measuring task completion rates or comparing two designs with metrics, you need at least 20 participants to draw meaningful conclusions. But that's a different type of study with different goals.
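
To see why quantitative comparisons need bigger samples, here is a rough illustration (not from the article) using a two-proportion z-test on task completion rates; the helper function is an assumption for demonstration, not a prescribed method:

```python
# A rough illustration: comparing task completion rates between two designs
# with a two-proportion z-test (pure standard library).
from statistics import NormalDist

def completion_p_value(success_a: int, n_a: int, success_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two completion rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# 4/5 vs 2/5 completions looks like a big gap but isn't significant
# (and the normal approximation is generous at n=5).
print(completion_p_value(4, 5, 2, 5))      # ~0.20
# The same rates with 25 participants per design clear p < 0.05.
print(completion_p_value(20, 25, 10, 25))  # ~0.004
```

Even a 40-point gap in completion rates isn't trustworthy with only five users per design, which is why metric comparisons call for larger samples.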

Sample Size Quick Reference

Study Type                    | Recommended Sample Size | Goal
------------------------------|-------------------------|--------------------------------------
Qualitative usability testing | 5-7 users per segment   | Identify major usability issues
Quantitative comparison       | 20+ users               | Statistical significance for metrics
Early prototype testing       | 3-5 users               | Catch obvious problems quickly
Card sorting                  | 15+ users per group     | Information architecture validation

Remember: you're not trying to survey a representative sample of your entire user base. You're trying to observe enough people to identify patterns in where your product breaks down.

Types of Usability Tests

Not all usability tests are structured the same way. The right approach depends on your goals, timeline, and resources.

Moderated vs. Unmoderated

Moderated testing involves a facilitator guiding each session in real time. This can happen in person or via video call. The facilitator introduces tasks, observes behavior, and asks follow-up questions when something unexpected happens. It's more time-intensive but produces richer context because you can probe deeper into why users struggled at specific moments.

Unmoderated testing uses a platform to present tasks and record user behavior without a live facilitator. Participants complete sessions on their own time while software captures their screen and voice. It's faster to scale and easier to schedule, but you lose the ability to ask follow-up questions or redirect when users misunderstand a task.

Choose moderated when you need to understand the "why" behind behavior or when testing complex flows. Choose unmoderated when you need quick feedback on specific screens or when scheduling logistics make live sessions impractical.

Remote vs. In-Person

Remote testing happens over video calls or through asynchronous platforms. Participants use their own devices in their natural environment. This often reveals real-world context you'd miss in a lab setting: interruptions, different browser configurations, or slow internet connections.

In-person testing gives you more control over the environment and makes it easier to observe body language and micro-expressions. It's valuable when testing physical products, hardware interactions, or when you need to see exactly how users physically interact with a device.

For most software products, remote testing is now the default. It's cheaper, easier to recruit for, and often produces more realistic behavior since users are in their natural context.

Exploratory vs. Evaluative

This distinction maps to where you are in the product development cycle.

"There are two buckets of user research. The first one is generative research, understanding the problem space. The other side is evaluative user research, evaluating solutions."
Nikki Anderson

Exploratory (generative) testing happens early, often before you've built anything. You're testing concepts, wireframes, or early prototypes to understand whether your approach makes sense. For exploratory research, Nikki recommends 10-12 users to surface a broader range of perspectives.

Evaluative testing happens later, when you have a working prototype or live product. You're validating that your solution actually works and that users can complete tasks successfully. The 5-7 user guideline applies here.

How to Run a Usability Test: 6 Steps

A structured approach ensures you get useful insights rather than scattered observations. Here's the process that works.

Step 1: Define Your Goals

Start with clarity on what you want to learn. Vague goals like "see if users like the new design" produce vague results. Sharp goals lead to actionable findings.

Good usability testing goals answer specific questions:

  • Can users complete the checkout flow without assistance?
  • Where do first-time users get stuck during onboarding?
  • How long does it take users to find the settings page?
  • What workarounds do power users create for missing features?

Define success criteria before you test. What task completion rate would indicate the design is working? What error rate is acceptable? What time-on-task would signal a problem? Having benchmarks makes analysis much cleaner.
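
To make "define success criteria before you test" concrete, here is a hypothetical sketch; the metric names and thresholds are illustrative assumptions, not benchmarks from the article:

```python
# A hypothetical sketch of success criteria defined before testing.
# Metric names and thresholds are illustrative, not prescribed benchmarks.
CHECKOUT_TEST_PLAN = {
    "question": "Can users complete the checkout flow without assistance?",
    "benchmarks": {
        "min_completion_rate": 0.80,   # what completion rate indicates the design works
        "max_errors_per_task": 1,      # what error rate is acceptable
        "max_time_on_task_sec": 180,   # what time-on-task would signal a problem
    },
}

def meets_benchmarks(results: dict, plan: dict = CHECKOUT_TEST_PLAN) -> bool:
    """Compare observed session metrics with the criteria set before the test."""
    b = plan["benchmarks"]
    return (
        results["completion_rate"] >= b["min_completion_rate"]
        and results["median_errors"] <= b["max_errors_per_task"]
        and results["median_time_sec"] <= b["max_time_on_task_sec"]
    )

print(meets_benchmarks({"completion_rate": 0.6, "median_errors": 0, "median_time_sec": 150}))  # False
```

Writing the thresholds down before the sessions keeps the analysis honest: you compare results against what you committed to, not against what you hoped to see.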

Step 2: Recruit the Right Participants

Your findings are only as good as your participants. Testing with the wrong people produces misleading results.

Define your segments clearly. If your product serves both technical and non-technical users, test with both. If you're launching in a new market, recruit users from that market. Don't assume German users behave the same as American users.

Write screener questions that filter for your actual user profile. Don't just ask "Have you ever bought something online?" Ask about recency ("When was your last online purchase?"), frequency ("How many times per month do you shop online?"), and relevant behaviors ("Do you typically use mobile or desktop for shopping?").

Exclude people too close to your product. Employees, investors, and friends of the team bring biases that contaminate results. You want fresh eyes that match your real user base.
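
As a hypothetical illustration of a screener that filters on recency, frequency, and relevant behavior, and screens out people too close to the product (question wording and qualification rules are assumptions, not a prescribed template):

```python
# A hypothetical screener sketch. Question wording and qualification rules
# are illustrative assumptions.
SCREENER = [
    ("When was your last online purchase?",
     {"Within the last week", "Within the last month"}),
    ("How many times per month do you shop online?",
     {"2-5 times", "More than 5 times"}),
    ("Do you typically use mobile or desktop for shopping?",
     {"Mobile", "Desktop", "Both"}),  # captures behavior for segmenting
    ("Do you or anyone close to you work for our company?",
     {"No"}),  # screen out employees, investors, and friends of the team
]

def qualifies(answers: list[str]) -> bool:
    """A participant qualifies only if every answer is in the allowed set for that question."""
    return len(answers) == len(SCREENER) and all(
        answer in allowed for (_, allowed), answer in zip(SCREENER, answers)
    )

print(qualifies(["Within the last week", "2-5 times", "Mobile", "No"]))  # True
```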

Step 3: Create Realistic Tasks

Tasks should mirror what users actually do with your product. Avoid artificial scenarios designed to showcase features.

Bad task: "Click on the navigation menu and select Settings." (You're telling them exactly what to do.)

Good task: "You want to change your email notification preferences. Show me how you'd do that." (You're giving them a goal and watching how they achieve it.)

Frame tasks as scenarios with motivation. Instead of "Find a blue shirt," try "You need a blue shirt for a job interview next week. Find one that you'd want to order." Context makes behavior more realistic.

Start with easy tasks to build confidence, then progress to more complex ones. Don't front-load the hardest scenarios. Frustrated users produce less useful data.

Step 4: Prepare Your Moderator Guide

A moderator guide keeps sessions consistent and ensures you cover everything important. It includes your introduction script, tasks, and the questions you'll ask.

The key to getting honest feedback is asking open-ended questions that don't lead users toward particular answers.

"The way that I recommend writing and asking open ended questions is by following an acronym. And that acronym is TEDW... The T stands for tell me about or talk me through. The E stands for explain. The D stands for describe. And the W stands for walk me through."
Nikki Anderson

The TEDW framework keeps your questions exploratory instead of leading:

  • T - Tell me about / Talk me through: "Talk me through what you were trying to do there."
  • E - Explain: "Explain what you expected to happen when you clicked that button."
  • D - Describe: "Describe what's confusing about this screen."
  • W - Walk me through: "Walk me through your thought process as you completed that task."

This matters because leading questions corrupt your data.

"You should never really ask a user, hey, did you ever try to press this button?"
Nikki Anderson

Questions like that plant ideas and create false positives. The user might say "Oh yeah, I guess I could have pressed that" even if it never would have occurred to them naturally. Instead, observe what they actually do, then ask them to explain their choices afterward.

Step 5: Run the Test

During the session, your job is to observe and facilitate. Not to help, explain, or defend the design.

Use the think-aloud protocol. Ask participants to verbalize their thoughts as they work through tasks. "Just say whatever comes to mind as you're doing this: what you're looking for, what you're clicking on, anything that surprises or confuses you." This narration reveals mental models you can't see from behavior alone.

Stay neutral. When users struggle, resist the urge to help. A gentle "What would you do if I wasn't here?" keeps them working independently. If they ask whether they're doing it right, redirect: "There's no right or wrong way. I'm just interested in how you'd approach this."

Know when to probe. If something unexpected happens, don't let it pass.

"If you are struggling with what to ask next, just ask why. That's the best question that you could ask. Just ask why."
Nikki Anderson

"Why did you click there?" "Why did you expect that to happen?" "Why did you go back to the homepage?" These follow-ups turn observations into insights.

Take structured notes. Document timestamps, what the user did, what they said, and your interpretation. Separate observation ("User paused for 8 seconds on the pricing page") from inference ("User seemed confused by pricing options"). You need both, but mixing them makes analysis messy.
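
One lightweight way to keep observation and inference apart is a structured note log; here is a minimal sketch, with field names assumed for illustration:

```python
# A minimal sketch of a structured note log that keeps observation and
# interpretation in separate fields. Field names are illustrative assumptions.
import csv
from dataclasses import dataclass, asdict, fields

@dataclass
class Note:
    participant: str     # e.g. "P3"
    timestamp: str       # e.g. "00:12:45"
    task: str            # which task the note belongs to
    observation: str     # what the user did or said, as close to verbatim as possible
    interpretation: str  # your inference, kept apart from the raw observation

notes = [
    Note("P3", "00:12:45", "Change notification settings",
         "Paused for 8 seconds on the pricing page, scrolled up and down twice",
         "Seemed unsure whether settings live under Profile or Account"),
]

with open("session_notes.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=[fld.name for fld in fields(Note)])
    writer.writeheader()
    writer.writerows(asdict(n) for n in notes)
```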

Step 6: Analyze and Report Findings

Raw observations aren't useful until you've identified patterns across participants.

Start by reviewing all sessions and flagging moments where users struggled, succeeded, or exhibited unexpected behavior. Group similar issues together. If three users missed the same button, that's a pattern worth investigating.

Rate issues by severity:

  • Critical: Users cannot complete the task at all
  • Serious: Users complete the task but with significant difficulty or errors
  • Minor: Users notice a problem but work around it easily

Structure your findings as problem, evidence, recommendation. "Users couldn't find the settings page (3/5 participants). They looked under Profile instead of Account. Recommendation: Rename 'Account' to 'Settings' or add a secondary link under Profile."
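
If it helps, here is a rough sketch of ranking findings by severity and then by how many participants hit each issue; the issue tags and counts are illustrative, not real data:

```python
# A rough sketch of ranking findings by severity, then by frequency across
# participants. Issues and counts are illustrative.
SEVERITY_ORDER = {"critical": 0, "serious": 1, "minor": 2}

# issue -> (severity, participants who hit it out of 5)
findings = {
    "Could not find the settings page": ("serious", ["P1", "P3", "P4"]),
    "Payment step failed silently":     ("critical", ["P2", "P5"]),
    "Missed the promo code field":      ("minor", ["P2"]),
}

ranked = sorted(
    findings.items(),
    key=lambda item: (SEVERITY_ORDER[item[1][0]], -len(item[1][1])),
)

for issue, (severity, participants) in ranked:
    print(f"[{severity.upper()}] {issue} ({len(participants)}/5 participants)")
```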

When sharing with stakeholders, lead with insights, not methodology. Executives don't need to know your sample size first. They need to know what's broken and how to fix it. Build trust in your process over time through consistently useful findings.

Common Usability Testing Mistakes

Even experienced teams fall into these traps. Watch for them.

Testing with the wrong users. If your participants don't match your real user segments, your findings won't generalize. Five power users will navigate your product differently than five first-timers. Treating those results as equivalent leads to bad design decisions.

Leading questions. Asking "Did you notice the help button in the corner?" after a user struggled plants the answer. Ask what they would have done differently, what they were looking for, or where they expected to find help. Don't point them to solutions.

Helping users when they struggle. The whole point is to see where people get stuck. If you intervene at the first sign of confusion, you've eliminated the most valuable data from your study. Let users work through problems as they would in the real world.

Ignoring bias.

"Average human has almost 600 biases. The biggest one is confirmation bias."
Discovery Panel, Munich

You'll naturally notice evidence that supports what you already believe and downweight evidence that contradicts it. Counter this by defining success criteria before testing, having multiple people review sessions independently, and specifically looking for data that disproves your hypotheses.

Testing too late. Usability testing after development is complete turns into a change request negotiation rather than a design improvement process. Test early with prototypes when changes are cheap. The best time for usability testing is before you've written the code.

When to Use Usability Testing

Usability testing isn't a once-per-project activity. It fits at multiple points in your development cycle.

Before design (concept testing). Test wireframes and low-fidelity prototypes to validate your approach before investing in visual design or development.

During design (prototype testing). Test interactive prototypes to identify navigation issues, unclear labels, and missing steps before they're coded.

Before launch (validation testing). Test the functional product to catch implementation issues, edge cases, and final polish problems.

After launch (optimization testing). Test live flows that aren't performing. Analytics show where users drop off, but usability testing shows why. Integrate this with your product launch process to catch issues early.

The teams that ship great products test continuously, not as a one-time checkpoint. Build usability testing into your regular workflow rather than treating it as a special event.

Summary: Start Testing This Week

Usability testing isn't complicated. Recruit 5-7 users from your target segment. Give them realistic tasks. Watch them work. Ask TEDW questions to understand their mental models. Document patterns. Fix the problems you find.

The barrier isn't expertise or resources. It's just doing it. One afternoon of testing with five users will reveal more about your product's usability than months of internal debate.

If you're building products without usability testing, you're guessing. And your competitors who test aren't.

For more research methods that complement usability testing, from user interviews to surveys to diary studies, explore our complete guide to user research methods.
