Zeno's Paradox
While critical thinking may not make up for a lack of knowledge, it is essential for gaining knowledge.
Friday, January 10, 2003:  

Human Factors, Critical Thinking:

Toward a Reliable and Valid Usability Testing Methodology: Part 1 - The Problem

As I’ve pointed out before, there have been some nice studies showing that usability testing is unreliable. Disappointingly, little has been done to address the problem.

As long as people are calling radically different testing methods "usability testing" and are using them for radically different goals, then discussions about the value of usability testing, the number of test participants required to achieve certain results, the ways to shortcut usability testing methods, etc. are a waste of time.

There is no known usability testing method that is both reliable and valid. End of discussion. Why do people continue to ignore this, when instead we should be doing something about it?

Step back for a moment and examine the problem:

What type of usability testing am I talking about?

I'm discussing usability testing that is at least fairly formal, primarily diagnostic, and that quantitatively measures the behavior of test participants using the system being tested. This type of testing is important because it is the standard against which any other usability testing should be compared. In practice, less formal testing is the norm, but we must still know the cost of choosing informal testing over formal testing.

Why does it matter that testing is reliable and valid, since statistically significant results are often not required?

Reliable: Yielding the same or compatible results in different clinical experiments or statistical trials. (via dictionary.com)
With a reliable method, different people using that method will produce similar and compatible results. Current usability testing methods are unreliable: they produce dissimilar and (at least sometimes) incompatible results.

Valid: Producing the desired results; efficacious: valid methods. (via dictionary.com)
A valid method truly produces the results claimed.

The issue isn’t whether statistical significance is required, but whether the usability testing method produces repeatable results and whether those results are a true measure of usability. Currently, no such method is known.

What evidence is there that usability testing is unreliable?

In hindsight, the evidence exists in almost every study that compares or consolidates test results from different teams – many times the results are dissimilar, even incompatible. I say hindsight because of recent studies specifically on the reliability of usability testing by Molich and Kessner. In Kessner's study, which built upon Molich's, six usability teams could not find a single common usability problem when independently testing the same product. No team found 50% or more of the total problems found by all.
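The two figures quoted above (problems common to every team, and each team's share of all problems found) are simple set calculations. Here is a minimal sketch of how they might be computed. The function name, the problem identifiers, and the sample data are all hypothetical and are not taken from the Molich or Kessner studies; the pairwise overlap measure is an extra illustration, not something those studies are claimed to report.

```python
from itertools import combinations


def overlap_summary(findings):
    """Summarize agreement between teams.

    findings: dict mapping a team name to the set of problem
    identifiers that team reported (hypothetical input format).
    """
    if not findings:
        raise ValueError("need at least one team")

    problem_sets = list(findings.values())
    all_problems = set().union(*problem_sets)   # found by any team
    common = set.intersection(*problem_sets)    # found by every team

    total = len(all_problems)
    # Fraction of all reported problems that each team found.
    share = {
        team: (len(found) / total if total else 0.0)
        for team, found in findings.items()
    }

    # Mean pairwise overlap (shared problems / combined problems).
    ratios = []
    for a, b in combinations(problem_sets, 2):
        combined = a | b
        ratios.append(len(a & b) / len(combined) if combined else 0.0)
    mean_pairwise = sum(ratios) / len(ratios) if ratios else 0.0

    return {
        "total": total,
        "common": common,
        "share": share,
        "mean_pairwise_overlap": mean_pairwise,
    }


# Made-up data for illustration only.
example_findings = {
    "Team A": {"p1", "p2", "p3"},
    "Team B": {"p3", "p4"},
    "Team C": {"p1", "p5", "p6"},
}

if __name__ == "__main__":
    summary = overlap_summary(example_findings)
    print("Total distinct problems:", summary["total"])
    print("Found by every team:", sorted(summary["common"]))
    for team, fraction in summary["share"].items():
        print(f"{team} found {fraction:.0%} of all problems")
```

In this made-up data set no problem is reported by all three teams, which is the kind of result the reliability studies describe. A reliable method would push the common set toward the full set and each team's share toward 100%.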

What are the consequences of usability testing being unreliable?

Until we can define a usability testing method that produces reliable results, we can't clearly demonstrate that claims about the value of usability testing aren't erroneous, fraudulent, or the product of unconscious cheating or self-deception. We need a reliable and valid usability testing method, even if it is some impractical, formal method that is used solely to determine the effectiveness of more practical methods.


References:
Deciding Which Type of Test to Conduct at usability.gov.
How reliable is usability performance testing? analysis by Dr. Robert Bailey.
Comparative Usability Evaluation - CUE results and reports of research managed by Rolf Molich.
Kessner, M. (2000), On the reliability of usability testing, Master's thesis, Carleton University, Ottawa, Ontario, Canada, December.
Kessner, M., Wood, J., Dillon, R.F. and West, R.L. (2001), On the reliability of usability testing, CHI 2001 Poster.

Seeking Comments:
I'd like your comments and questions on this, so I can address them in Part 2. Please email me.

    -  Ron  11:26 AM

Tuesday, January 07, 2003:  

Human Factors:

Usability Testing: Myths, Misconceptions and Misuses (Richard F. Dillon - HOT Lab)
In this article, I identify and try to straighten out some common misconceptions about usability testing...

1) Usability testing is user-centred design.
2) There are many types of usability testing procedures.
3) Evaluation isn't necessary in an iterative user-interface design process.
4) You don't have to measure user performance.
5) Usability test results will help you determine what users want.
6) A usability test tells if an application is useful.
7) A beta test is a usability test.
8) You can do a usability test in a focus group.
9) Iterative means one usability test.
10) A usability test tells you how to fix the problems you detect.
11) Almost any usability test is better than no usability test.
12) Anybody can do a usability test.
13) Don't test until the UI is complete.
14) You need a usability lab to run a usability test.
15) You have to test a large number of users.
16) Usability testing is a good way to test large numbers of navigation paths.
17) More usability tests are required for web sites than for desktop applications.
18) You have to test everything.
19) If usability testing is about performance, why include rating scales and comments?
20) Let users decide what they want to do.
20 misconceptions, all very common, some held even by the so-called "gurus" and authors of well-known websites on usability, IA, UX, etc. Highly recommended!

    Information quality:   High
    Propaganda quality:  High
    Propaganda level:     Very Low

    -  Ron  9:50 AM

Monday, January 06, 2003:  

Human Factors:

56 Rules to Design By (UI Design Update - Dec'02 - Bob Bailey)
Every year since 1983, I have reviewed and summarized much of the usability-related research literature that was published during the previous year. This has provided the basis for the popular, annual 3-day User Interface Update course. My annual two-month "read and outline" activity provides me with a number of research-based insights into "what works" and "what does not work" in usability. I have listed many of these insights in this article. What makes these "Do's and Don'ts" unique is that they all have recent research to support them.
56 rules for design, all backed by quality research, with references! There are many important topics here, most of which could use a great deal of explanation. Highly recommended!

    Information quality:   Very High - I only wish there were more explanation of each guideline.
    Propaganda quality:  Medium
    Propaganda level:     Low - Some promotion for Bailey's course and HFI.

    -  Ron  8:41 AM

Copyright © 2002-2005 Ron Zeno
Musings not completely unrelated to human factors, management, critical thinking, medicine, software engineering, science, or the like.