Common Industry Format for Usability Test Reports
Example Report v3.2 (DiaryMate)
4-Aug-99
This is an example of a report that follows the guidelines contained in the Common Industry Format for Usability Test Reports version dated Jan 15, 1999.
This example is derived and adapted from a real usability test, and shows how the report format can be customized to meet specific needs. In particular:
· Under “Participants”, it contains a table whose example of relevant characteristics and capabilities differs from the example in the report format.
· It contains an example of how mean goal achievement can be calculated.
· The performance data includes the number of references to manuals.
It contains two examples of satisfaction data:
1. An example of results from a commercial user satisfaction questionnaire (SUMI).
2. An example of results from an in-house subjective ratings scale.
Please provide feedback on which example is preferred. The in-house data may be preferred because it is more typical; the SUMI data may be preferred because it is more useful to a consumer organization.
Editor’s note
The CIF format does not currently include sections for
Nigel Bevan
Common Industry Format Usability Test Report
Super Software Inc
January 1, 1999
Tested December 1998
Any enquiries about the content of this report should be addressed to
E Frost, Usability Manager
Super Software Inc
19483 Outerbelt Ave
Hayden CA 95014 USA
408 555-2340
EFrost@supersoft.com
Contents
2.2 Context of Product Use in the Test
Participant's Computing Environment
6 Appendix A – Participant Instructions
6.1 Participant General Instructions
6.2 Participant Task Instructions
DiaryMate is a computer version of a paper diary and address book. DiaryMate provides diary, contact and meetings management facilities for individuals and work groups. The test demonstrated the usability of DiaryMate installation, calendar and address book tasks for secretaries and managers.
Four managers and four secretaries were provided with the distribution disk and user manual, and asked to install the product. Having spent some time familiarizing themselves with it, they were asked to add information for a new contact, and to schedule a meeting. No significant differences in the results were found between managers and secretaries.
All participants installed the product successfully in a mean time of 5.6 minutes (although a minor subcomponent was missing from one installation). All participants successfully added the new contact information. The mean time to complete the task was 4.3 minutes.
Seven of the eight participants successfully scheduled a meeting in a mean time of 4.5 minutes.
Overall satisfaction on the SUMI global scale was 51 (above the industry average of 50). The target value of 50 was within the 95% confidence limits for all scales.
DiaryMate is a computer version of a paper diary and address book. DiaryMate provides diary, contact and meetings management facilities for individuals and work groups. It is a commercial product which includes online help and a 50 page user manual.
The primary user group for DiaryMate is office workers, typically lower and middle level managers and their secretaries. DiaryMate requires Microsoft Windows 3 or higher, and is intended for users who have a basic knowledge of Windows. A full technical specification is provided on the SuperSoft web site: www.supersoft.com/diarymate.
The aim of the evaluation was to validate the usability of the calendar and address book functions, which are the major features of DiaryMate. Representative users were asked to complete typical tasks, and measures were taken of effectiveness, efficiency and satisfaction.
It was expected that installation would take less than 10 minutes, and that all users could successfully fill in contact information in an average time of less than 5 minutes. All SUMI scores should be above the industry average of 50.
Intended context of use: The key characteristics and capabilities expected of DiaryMate users are:
· Familiarity with a PC and a basic working knowledge of Microsoft Windows
· A command of the English language
· Familiarity with office tasks
· At least 10 minutes a day spent on tasks related to diary and contact information
Other user characteristics that could be expected to influence the usability of DiaryMate are:
· amount of experience with Microsoft Windows
· amount of experience with any other diary applications
· attitude towards use of computer applications to support diary tasks
· job function and length of time in current job
Context used for the test: Four junior or middle managers and four secretaries were selected who had the key characteristics and capabilities, but no previous experience of DiaryMate. The other characteristics of the participants that might influence usability were recorded, together with the age group and gender.
Participant # | Job | Time in job (years) | Windows experience (years) | Computer diary experience (years) | Attitude to computer diaries (1-7)* | Gender | Age group
1 | secretary | 3.5 | 3.5 | 0 | 6 | F | 20-35
2 | secretary | 0.8 | 2.1 | 0.8 | 1 | F | 20-35
3 | secretary | 2.1 | 2.5 | 2.1 | 3 | F | 20-35
4 | secretary | 4.9 | 3.5 | 1.5 | 2 | F | 36-50
5 | junior manager | 0.7 | 0.7 | 0.7 | 2 | M | 20-35
6 | junior manager | 1.6 | 2.1 | 0 | 3 | F | 36-50
7 | middle manager | 4.3 | 1.4 | 0 | 4 | M | 36-50
8 | middle manager | 2.7 | 4.6 | 2.7 | 4 | F | 20-35
*1=prefer to use a computer as much as possible, 7=prefer to use a computer as little as possible
Intended context of use: Interviews with potential users suggested that installing the software was an important task. Having gained familiarity with the application, other key tasks would be adding information for a new contact, and scheduling a meeting.
Context used for the test: The tasks selected for the evaluation were:
[1] The participant will be presented with a copy of the application on a disk together with the documentation and will be asked to perform the installation.
[2] Following this each user will restart the program and spend some time familiarizing themselves with the diary and address book functions.
[3] Each participant will then be asked to add details of a new contact using information supplied.
[4] Each participant will then be asked to schedule a meeting using the diary facility.
Intended context of use: office environment.
Context used for the test: The evaluation was carried out in our usability laboratory in Hayden. The test room was configured to represent a closed office with a desk, chair and other office fittings. Participants worked alone without any interruptions, and were observed through a one-way mirror, by video cameras and on a remote screen.
Intended context of use: DiaryMate is intended for use on any Pentium-based PC running Windows, with at least 8MB free memory.
Context used for the test: The PC used was a Netex PC-560/1 (Pentium 60, 32MB RAM) in standard configuration, with a Netex pro mouse and a 17" color monitor at 800x600 resolution. The operating system was Windows 95.
Tasks were timed using Hanks Usability Logger. Sessions were videotaped (a combined picture of the screen and a view of the participant), although information derived from the videotapes does not form part of this report. At the end of the sessions, participants completed a subjective ratings scale and the SUMI satisfaction questionnaire. SUMI scores have a mean of 50 and a standard deviation of 10 (based on a standardization sample of 200 office-type systems tested in Europe and the USA; for more information, see http://www.ucc.ie/hfrg/questionnaires/sumi/index.html ).
Eight users were tested, divided into two subgroups: managers and secretaries, to explore any major differences between these groups.
The mean completion rate, mean goal achievement, mean task time, mean completion rate efficiency and mean goal achievement efficiency were calculated for three tasks:
· Install the product
· Add information for a new contact
· Schedule a meeting
On arrival, participants were informed that the usability of DiaryMate was being tested, to find out whether it met the needs of users such as themselves. They were told that it was not a test of their abilities. Participants were shown the evaluation suite, including the control room, and informed that their interaction would be recorded. They were asked to sign a release form. They were then asked to confirm the information they had provided about themselves before participating: Job description, Time in job (years), Windows experience (years), Computer diary experience (years), and Age group. They also scored their attitude towards use of computer applications to support diary and contact management tasks, on a scale of 1 to 7, with anchors: prefer to use a computer as much as possible, prefer to use a computer as little as possible.
Participants were given introductory instructions. The evaluator reset the state of the computer before each task, and provided instructions for the next task. Participants were told the time allocated for each task, and asked to inform the evaluator (by telephone) when they had completed each task. Participants were told that no external assistance could be provided.
After the last task, participants were asked to complete a subjective ratings scale and the SUMI questionnaire.
The evaluator then asked them about any difficulties they had encountered (this information is not included in this report).
Finally they were given $75 for their participation.
Completion Rate: Percentage of participants who completed each task correctly.
Mean goal achievement: Mean extent to which each task was completely and correctly achieved, scored as a percentage.
Errors: Errors were not measured.
Assists: The participants were given no assistance.
Task time: Mean time taken to complete each task (for correctly completed tasks).
Completion rate efficiency: mean completion rate/mean task time.
Goal achievement efficiency: mean goal achievement/mean task time.
Number of references to the manual: number of separate references made to the manual.
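As a minimal sketch of how the performance measures defined above fit together, the following Python computes them for a single task. The record layout, the function name task_metrics and the three sample records are assumptions made for illustration; they are not the test data, and this is not the output format of the logging tool.

```python
from statistics import mean

def task_metrics(records):
    """Summarize one task from (completed, goal_achievement_pct, time_min) records."""
    completion_rate = 100 * mean(1 if done else 0 for done, _, _ in records)
    goal_achievement = mean(goal for _, goal, _ in records)
    # Task time is averaged over correctly completed tasks only.
    # (mean() raises StatisticsError if no participant completed the task.)
    task_time = mean(t for done, _, t in records if done)
    return {
        "completion_rate": completion_rate,        # percent
        "goal_achievement": goal_achievement,      # percent
        "task_time": task_time,                    # minutes
        "completion_rate_efficiency": completion_rate / task_time,    # percent per minute
        "goal_achievement_efficiency": goal_achievement / task_time,  # percent per minute
    }

# Hypothetical records for three participants: two completions and one failure.
sample = [(True, 100, 4.0), (True, 90, 6.0), (False, 50, 0.0)]
metrics = task_metrics(sample)
```

With these invented records, the completion rate is about 67%, the mean task time is 5.0 minutes, and the goal achievement efficiency is 16% per minute.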
Satisfaction was measured using a subjective ratings scale and the SUMI questionnaire, at the end of the session, giving scores for each participant’s perception of: overall satisfaction, efficiency, affect, controllability and learnability.
Mean goal achievement
Mean extent to which each task was completely and correctly achieved, scored as a percentage.
The business impact of potential diary and contact information errors was discussed with several potential customers, leading to the following scoring scheme for calculating mean goal achievement:
· Installation: all components successfully installed: 100%; for each necessary subcomponent omitted from the installation deduct 20%.
· New contact: all details entered correctly: 100%; for each missing item of information, deduct 50%; for each item of information in the wrong field, deduct 20%; for each typo deduct 5%.
· New meeting: all details entered correctly: 100%; incorrect time or date: 0%; for each item of information in the wrong field, deduct 20%; for each typo deduct 5%.
Combined deductions equaling or exceeding 100% would be scored as 0% goal achievement.
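The scoring scheme above can be sketched as three small Python functions. The function and parameter names are hypothetical; only the deduction values and the 0% floor come from the scheme itself.

```python
def score_installation(omitted_subcomponents=0):
    """100%, less 20% for each necessary subcomponent omitted."""
    return max(0, 100 - 20 * omitted_subcomponents)

def score_new_contact(missing_items=0, wrong_field_items=0, typos=0):
    """100%, less 50% per missing item, 20% per item in the wrong field, 5% per typo."""
    deductions = 50 * missing_items + 20 * wrong_field_items + 5 * typos
    return max(0, 100 - deductions)  # deductions of 100% or more score 0%

def score_new_meeting(time_or_date_wrong=False, wrong_field_items=0, typos=0):
    """0% if the time or date is wrong; otherwise 100% less 20% per wrong field, 5% per typo."""
    if time_or_date_wrong:
        return 0
    deductions = 20 * wrong_field_items + 5 * typos
    return max(0, 100 - deductions)

one_missing_subcomponent = score_installation(omitted_subcomponents=1)
contact_with_one_typo = score_new_contact(typos=1)
meeting_on_wrong_date = score_new_meeting(time_or_date_wrong=True)
```

Under this sketch, an installation with one subcomponent missing scores 80%, a contact record with a single typo scores 95%, and a meeting scheduled for the wrong date scores 0% whatever else was entered correctly.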
In addition to data for each task, the combined results show the total task time and the mean results for effectiveness and efficiency metrics.
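A minimal sketch of how one participant's combined results could be derived from the per-task results follows. The tuple layout and the function name combine_tasks are assumptions; the input values are participant 1's figures for the three measured tasks as reported in this document. Dividing the mean completion rate by the total task time is inferred from the reported figures rather than stated in the text.

```python
def combine_tasks(task_results):
    """Combine per-task results for one participant.

    task_results holds one (completion_pct, goal_pct, time_min, manual_refs)
    tuple per task.
    """
    n = len(task_results)
    completion = sum(r[0] for r in task_results) / n  # mean completion rate (%)
    goal = sum(r[1] for r in task_results) / n        # mean goal achievement (%)
    total_time = sum(r[2] for r in task_results)      # total task time (min)
    total_refs = sum(r[3] for r in task_results)      # total references to the manual
    return {
        "completion": completion,
        "goal": goal,
        "total_time": total_time,
        "efficiency": completion / total_time,        # percent per minute
        "total_refs": total_refs,
    }

# Participant 1: installation, new contact, schedule meeting (not completed).
participant_1 = [(100, 100, 5.3, 1), (100, 100, 4.4, 0), (0, 0, 0, 3)]
combined = combine_tasks(participant_1)
```

Rounded, this gives a 67% mean completion rate, 67% mean goal achievement, a total task time of 9.7 minutes, 7% per minute and 4 references to the manual.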
SUMI results were analyzed using the SUMI scoring program (SUMISCO).
Install the product
Participant # | Unassisted Task Completion Rate (%) | Goal Achievement (%) | Task Time (min) | Completion Rate / Task Time* | References to manual
1 | 100% | 100% | 5.3 | 19% | 1
2 | 100% | 100% | 3.9 | 26% | 0
3 | 100% | 100% | 6.2 | 16% | 1
4 | 100% | 80% | 9.5 | 11% | 2
5 | 100% | 100% | 4.1 | 24% | 0
6 | 100% | 100% | 5.9 | 17% | 1
7 | 100% | 100% | 4.2 | 24% | 0
8 | 100% | 100% | 5.5 | 18% | 0
Mean | 100% | 98% | 5.6 | 19% | 0.6
Std error | 0.0 | 2.5 | 0.6 | 1.8 | 0.3
Std Deviation | 0.0 | 7.1 | 1.8 | 5.1 | 0.7
Min | 100% | 80% | 3.9 | 11% | 0.0
Max | 100% | 100% | 9.5 | 26% | 2.0
*This combined figure of percentage completion per minute is useful when making comparisons between products. A related measure can be obtained by dividing goal achievement by task time.
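As a rough illustration of the footnote, the per-participant figures and the summary rows for the installation task can be reproduced from the task times in the table above. This is a sketch, not the procedure actually used for the report; in particular, the use of the sample standard deviation (n - 1) is an assumption that happens to match the reported values.

```python
from math import sqrt
from statistics import mean, stdev

# Installation task times in minutes for participants 1-8 (from the table above).
task_times = [5.3, 3.9, 6.2, 9.5, 4.1, 5.9, 4.2, 5.5]
completion = [100] * 8  # every participant completed the installation

# Percentage completion per minute, one figure per participant.
per_minute = [c / t for c, t in zip(completion, task_times)]
rounded = [round(x) for x in per_minute]

mean_time = mean(task_times)
sd_time = stdev(task_times)                # sample standard deviation (n - 1), assumed
se_time = sd_time / sqrt(len(task_times))  # standard error of the mean
mean_per_minute = mean(per_minute)
```

Note that averaging the per-participant ratios, as done here, generally gives a different figure from dividing the mean completion rate by the mean task time, so it is worth stating which of the two is used when comparing products.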
Add information for a new contact
Participant # | Unassisted Task Completion Rate (%) | Goal Achievement (%) | Task Time (min) | Completion Rate / Mean Task Time | References to manual
1 | 100% | 100% | 4.4 | 23% | 0
2 | 100% | 100% | 3.5 | 29% | 0
3 | 100% | 95% | 4.6 | 22% | 1
4 | 100% | 100% | 5.5 | 18% | 1
5 | 100% | 100% | 3.8 | 26% | 0
6 | 100% | 100% | 4.5 | 22% | 0
7 | 100% | 95% | 4.9 | 20% | 1
8 | 100% | 100% | 3.3 | 30% | 0
Mean | 100% | 99% | 4.3 | 24% | 0.4
Std error | 0.0 | 0.8 | 0.3 | 1.5 | 0.2
Std Deviation | 0.0 | 2.3 | 0.7 | 4.2 | 0.5
Min | 100% | 95% | 3.3 | 18% | 0.0
Max | 100% | 100% | 5.5 | 30% | 1.0
Schedule a meeting
Participant # | Unassisted Task Completion Rate (%) | Goal Achievement (%) | Task Time (min) | Completion Rate / Mean Task Time | References to manual
1 | 0% | 0% | 0 | 0% | 3
2 | 100% | 95% | 4.2 | 24% | 2
3 | 100% | 80% | 5.6 | 18% | 0
4 | 100% | 100% | 3.5 | 29% | 1
5 | 100% | 90% | 3.8 | 26% | 1
6 | 100% | 60% | 6.1 | 16% | 0
7 | 100% | 75% | 4.6 | 22% | 0
8 | 100% | 80% | 3.5 | 29% | 2
Mean | 88% | 73% | 4.5 | 22% | 1.1
Std error | 0.0 | 4.8 | 0.4 | 1.7 | 0.4
Std Deviation | 0.0 | 13.5 | 1.0 | 4.9 | 1.1
Min | 100% | 60% | 3.5 | 16% | 0.0
Max | 100% | 100% | 6.1 | 29% | 3.0
Combined results
Participant # | Unassisted Task Completion Rate (%) | Goal Achievement (%) | Total Task Time (min) | Completion Rate / Total Task Time | Total References to manual
1 | 67% | 67% | 9.7 | 7% | 4.0
2 | 100% | 98% | 11.6 | 9% | 2.0
3 | 100% | 92% | 16.4 | 6% | 2.0
4 | 100% | 93% | 18.5 | 5% | 4.0
5 | 100% | 97% | 11.7 | 9% | 1.0
6 | 100% | 87% | 16.5 | 6% | 1.0
7 | 100% | 90% | 13.7 | 7% | 1.0
8 | 100% | 93% | 12.3 | 8% | 2.0
Mean | 96% | 90% | 13.8 | 7% | 2.1
Std error | 4.2 | 3.5 | 0.5 | 0.3 | 0.4
Std Deviation | 11.8 | 10.0 | 1.4 | 0.8 | 1.2
Min | 67% | 67% | 9.7 | 5% | 1.0
Max | 100% | 98% | 18.5 | 9% | 4.0
Subjective Ratings Results
These subjective ratings data are based on 7-point bipolar Likert-type scales, where 1 = worst rating and 7 = best rating on the different dimensions shown below:
Participant # | Satisfaction | Usefulness | Ease of Use | Clarity | Attractiveness
1 | 5 | 3 | 3 | 3 | 4
2 | 5 | 6 | 6 | 5 | 5
3 | 5 | 5 | 4 | 5 | 6
4 | 2 | 5 | 4 | 2 | 5
5 | 4 | 4 | 4 | 4 | 5
6 | 4 | 4 | 6 | 5 | 6
7 | 3 | 2 | 4 | 2 | 3
8 | 6 | 6 | 4 | 5 | 6
Mean | 4.3 | 4.4 | 4.4 | 3.9 | 5.0
Std. dev. | 1.3 | 1.4 | 1.1 | 1.4 | 1.1
Min | 2 | 2 | 3 | 2 | 3
Max | 6 | 6 | 6 | 5 | 6
SUMI Results
Participant # | Global | Efficiency | Affect | Helpfulness | Control | Learnability
1 | 35 | 39 | 33 | 30 | 40 | 42
2 | 50 | 62 | 33 | 44 | 54 | 36
3 | 55 | 52 | 45 | 53 | 46 | 49
4 | 51 | 53 | 51 | 52 | 55 | 47
5 | 48 | 45 | 44 | 46 | 48 | 42
6 | 51 | 59 | 36 | 45 | 53 | 38
7 | 54 | 52 | 46 | 52 | 47 | 50
8 | 52 | 49 | 49 | 53 | 56 | 48
Median | 51 | 52 | 44 | 49 | 50 | 44
Upper confidence level | 58 | 58 | 51 | 55 | 56 | 50
Lower confidence level | 44 | 46 | 37 | 43 | 44 | 38
Min | 35 | 39 | 33 | 30 | 40 | 36
Max | 55 | 62 | 51 | 53 | 56 | 50
The global measure gives an overall indication of satisfaction. Efficiency indicates the participant’s perception of their efficiency, affect indicates how much they like the product, helpfulness indicates how helpful they found it, control indicates whether they felt in control, and learnability is the participant’s perception of ease of learning.
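To relate these scales to the target value of 50, a short sketch such as the following can check each scale's reported confidence limits and median. The figures are copied from the SUMI table above; the dictionary layout, the function name and the choice of an inclusive comparison at the limits are assumptions made for illustration, not part of the SUMI scoring program.

```python
# (lower confidence limit, median, upper confidence limit) for each SUMI scale,
# copied from the table above.
sumi = {
    "Global": (44, 51, 58),
    "Efficiency": (46, 52, 58),
    "Affect": (37, 44, 51),
    "Helpfulness": (43, 49, 55),
    "Control": (44, 50, 56),
    "Learnability": (38, 44, 50),
}
TARGET = 50  # industry average used as the target value

def target_within_limits(lower, upper, target=TARGET):
    """True if the target lies inside the limits (limits counted as inside)."""
    return lower <= target <= upper

within = {scale: target_within_limits(lo, hi) for scale, (lo, _, hi) in sumi.items()}
above_target = [scale for scale, (_, med, _) in sumi.items() if med > TARGET]
```

With the tabulated values, the target lies within the limits for every scale (only just for Learnability, whose upper limit equals 50), while the median exceeds 50 on the Global and Efficiency scales only.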
Thank you for helping us in this evaluation.
The purpose of this exercise is to find out how easily people like you can use DiaryMate, a diary and contact management software application.
To achieve this, we will ask you to perform some tasks, and your performance will be recorded on videotape for later analysis. Then, to help us understand the results, we will ask you to complete a standard questionnaire, and to answer a few questions about yourself and your usual workplace.
The aim of this evaluation is to help assess the product, and the results may be used to help in the design of new versions.
Please remember that we are testing the software, not you.
When you have finished each task, or got as far as you can, please phone us by dialing 1234. I am afraid that we cannot give you any assistance with the tasks.
You have just received your copy of DiaryMate. You are keen to have a look at the product which you have not seen before, to find out whether it could meet your current business needs.
You will perform the following tasks:
1. Install the software.
2. Following this, you will be asked to restart the program and take some time to familiarize yourself with it, specifically the diary and address book functions.
3. Add details of a new contact to the address book using information supplied.
4. Schedule a meeting using the diary facility.
We are interested to know how you go about these tasks using DiaryMate and whether you find the software helpful or not.
LET US KNOW WHEN YOU ARE READY TO BEGIN
Task 1 – Install the software
(YOU HAVE UP TO 15 MINUTES FOR THIS TASK)
There is an envelope on the desk entitled DiaryMate. It contains a diskette, and an instruction manual.
When you are ready, install the software. All the information you need is provided in the envelope.
LET US KNOW WHEN YOU ARE READY TO MOVE ON
Task 2 – Familiarization period
Spend as long as you need to familiarize yourself with the diary and address book functions.
(YOU HAVE UP TO 20 MINUTES)
LET US KNOW WHEN YOU ARE READY TO MOVE ON
Task 3 – Add a contact record
(YOU HAVE ABOUT 15 MINUTES FOR THIS TASK)
Use the software to add the following contact details.
NAME - Dr. Gianfranco Zola
COMPANY - Chelsea Dreams Ltd
ADDRESS - 25 Main Street
Los Angeles
California 90024
TEL: (work) 222 976 3987
(home) 222 923 2346
LET US KNOW WHEN YOU ARE READY TO MOVE ON
Task 4 – Schedule a meeting
(YOU HAVE ABOUT 15 MINUTES FOR THIS TASK)
Use the software to schedule the following meeting.
DATE: 23 November 2001
PLACE: The Blue Flag Inn, Cambridge
TIME: 12.00 AM to 1.30 PM
ATTENDEES: Yourself and Gianfranco Zola.
LET US KNOW WHEN YOU HAVE FINISHED
Last modified: Monday, 18-Mar-02 14:44:57