Marielle's little place on the web

about usability, cognition, neuroscience, psychology, learning, interface design, ergonomics, and other interesting things
  • InterPlay is a platform for designers in the Media Lab to create dynamic social simulations, which transform public spaces into immersive environments where people become the central agents. It utilizes computer vision and projection to facilitate full body interaction with digital content.
  • Mouseless consists of an Infrared (IR) laser beam (with line cap) and an Infrared camera. Both IR laser and IR camera are embedded in the computer. The laser beam module is modified with a line cap and placed such that it creates a plane of IR laser just above the surface the computer sits on. The user cups their hand, as if a physical mouse was present underneath, and the laser beam lights up the hand which is in contact with the surface.

How many people are needed for a usability study? The question comes up time and time again, with different answers. This time, Hwang and Salvendy have tried to answer it with a meta-analysis of the literature published since 1990. As inclusion criteria they used:

  1. the usability evaluation was done with one of three methods: think-aloud, heuristic evaluation or cognitive walkthrough
  2. the study reported the number of participants in the evaluation (users or evaluators) and the overall discovery rate of errors.

Out of the 102 usability evaluation experiments found, only 27 satisfied the inclusion criteria. Hwang and Salvendy then performed a linear regression analysis on the data, and tried to estimate the number of people needed to detect 80% of the usability problems. The results: 9 for think-aloud, 8 for heuristic evaluation and 11 for cognitive walkthrough. This leads them to propose 10±2 as a rule of thumb.

I have a few problems with this approach. The ‘how many users’ question is a very logical question to ask, both when planning a study and when interpreting the results. But the statistical analyses are just numbers, and to determine the real value of a study one should not forget the content.

  • Not all errors are equal. While 80% detection sounds good, issue severity should not be neglected. Hwang and Salvendy mention that the cognitive walkthrough method is good at finding critical issues, but less adequate for detecting minor flaws. Knowing that, I wouldn’t choose to increase the number of evaluators to 11 (!), but would rather combine a smaller, early cognitive walkthrough with another evaluation method later, maybe even on a fixed prototype. If 2 or 3 evaluators using the cognitive walkthrough method can point to some of the severe issues, this is valuable enough in itself.
  • Issues are not always errors. A usability evaluation can help discover potential problems, but the way these issues influence the user experience and user behavior in real life may not be the same as in the user study.
  • The usefulness of your results depends on how they are used. Using the results wisely to improve the next design iteration of a system is useful. Using them to formulate specific questions to be answered with other methods (think of A/B testing for example), or to decide on what data to collect, makes sense too. But using them to calculate a magic number to report to stakeholders (based on ‘few issues found = good system’) makes less sense.

Besides, what does that 80% number mean? It is 80% of the total number of issues hidden in a system, but that total is not a real, measurable quantity. Okay, when the number of distinct issues found is plotted against the number of participants used, the curve does flatten and an upper limit can be estimated. But even with very large groups, there is still a chance that one more participant will find something that all of the others overlooked. In addition, the characteristics of the participants and the protocol used also influence which types of issues will be found easily.
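To make the flattening curve concrete, here is a minimal sketch in Python. Note that it does not reproduce Hwang and Salvendy's regression; it assumes the simple cumulative discovery model often used in this debate, where every participant independently finds any given issue with the same probability p. The function names and the values of p are hypothetical and chosen only for illustration.

```python
def discovery_rate(p: float, n: int) -> float:
    """Expected share of all issues found by n participants.

    Assumption: each participant finds any given issue independently
    with the same probability p (a simplification, not the paper's model).
    """
    return 1.0 - (1.0 - p) ** n


def participants_needed(p: float, target: float = 0.80) -> int:
    """Smallest number of participants whose expected discovery rate reaches target."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    n = 1
    while discovery_rate(p, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical per-participant detection probabilities.
    for p in (0.15, 0.20, 0.30):
        n = participants_needed(p)
        print(f"p = {p:.2f}: {n} participants, expected rate {discovery_rate(p, n):.2f}")
```

Under this assumed model the answer depends entirely on p: with p = 0.15 it takes 10 participants to reach 80%, with p = 0.30 only 5. The sketch also shows why the 80% figure is slippery, because the curve only approaches 100% and never reaches it.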

However, I think that in practice the 10±2 guideline should work pretty well, especially for the think-aloud case with non-expert users. With a very small group of users, it is often difficult to say whether the findings will generalize across the real user base. On the other hand, using a very large number of people is costly in time and money, and does not add much value since there will be lots of repetition in your findings.

Hwang, W., & Salvendy, G. (2010). Number of people required for usability evaluation. Communications of the ACM, 53 (5). DOI: 10.1145/1735223.1735255

Imagine the following task: you have arranged to meet someone on the university campus at a specific place and at a specific time, or the variant where you have just agreed to meet but haven’t decided on a time yet. Your challenge is to arrive at the right place together with your meeting partner. But something can go wrong: your partner may be late, or may have gone to the wrong location.

For this task, Dearman, Inkpen and Truong created a (Wizard of Oz) prototype for a map application to be used on your location-aware mobile device (such as your smartphone). They instructed participants to perform a number of rendezvous tasks, observed their behavior and noted their comments. Only one person in each pair was a real participant; the other was a fellow researcher following a detailed script.

Although all participants were familiar with the campus, only a few chose not to use the application. The others used it to find their own location, plan their routes to the target location and find out where their partner was. In the first part of the study participants had to manually zoom or pan the map to change the map view, and many of the interactions with the map consisted of keeping yourself, or your partner, from walking off the map. In the second part of the study the prototype was adapted to support this with autofocus and autozoom.

For the male participants this automatic zooming and panning seemed to do the trick pretty well. When they were obliged to use the autofocus feature, they apparently rarely felt the need to manipulate the map further. Their number of interactions in this condition differed significantly from that of the female participants: 9.8 (sd 16.9) for the males versus 52 (sd 31.6) for the females.

The autofocus as implemented was not ideal though. It did not always show the appropriate level of detail (which made it hard to see in which direction one was moving; and street names were not always visible); landmarks that people would like to use for orientation sometimes fell outside the map boundaries; and the map did not take direction into account (so a participant could be shown at the edge of the map).

Still, the Wizard of Oz approach with a minimal prototype was sufficient to carry out this type of user research with interesting results. Automation of the most frequent user interactions seemed to work well, combined with giving the user the opportunity to temporarily override the automated view, or to switch the autofocus off entirely. Mobile maps show promise for use in social tasks, such as a rendezvous. The map task in this study was different from traditional navigation tasks, because the navigation target was not stationary (the other person is moving too). Other social tasks are even more complex, involving more than two people, or strangers (privacy issues come to mind). When mobile devices become even more common, we’re bound to discover more applications to support such social tasks.

Dearman, D., Inkpen, K., & Truong, K. (2008). Mobile map interactions during a rendezvous: exploring the implications of automation. Personal and Ubiquitous Computing, 14 (1), 1-13. DOI: 10.1007/s00779-008-0195-2

I recently read an interesting article by Roth, Schmutz and others about ‘Mental models for web objects’. In the main part of the study, the authors asked 516 participants to construct prototypes of what they considered typical for a company website, a news site and a webshop.

The results describe very ‘normal’ websites. Most people expect a logo in the top left corner. A navigation menu is placed left or above the main content. Login fields are expected in the top right corner (for the news website and the webshop, that is; for the company website 77% of the participants did not include a login field). Webshops have shopping carts and no archives; company websites show no advertisements.

The study also compared the web design experts in the group with the laypeople. For most objects, the two groups agreed on the placement (which is a good sign I believe). However, there were a few notable exceptions: ‘external links’ and ‘newsletters’ were placed on the left by the laypeople but on the right by the experts; in the webshop the laypeople expected ‘contact’ to be on the left, where the experts placed it in the top right corner.

It’s good to see that this kind of research is done. It offers a little insight into the mental models your users may have. Expectations and unwritten standards keep changing, but it’s nice to have some information about what your general audience may expect. If you wish to place your logo somewhere other than the top left, you will now have to ask yourself critically whether you have a good reason.

Roth, S., Schmutz, P., Pauwels, S., Bargas-Avila, J., & Opwis, K. (2010). Mental models for web objects: Where do users expect to find the most frequent objects in online shops, news portals, and company web pages? Interacting with Computers, 22 (2), 140-152. DOI: 10.1016/j.intcom.2009.10.004

  • reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly.