September 1, 2014
Evaluate Design
Develop
many iterations
• Usability refers to the extent to which a product can be • Identify any usability problems that the product has.
used by specified users to achieve specified goals with • Collect empirical data on users' performance.
effectiveness, efficiency and satisfaction in a specified • Determine users' satisfaction with the product.
context of user
• Usability measures the quality of a user's experience
when interacting with a product or system • Test early;
– Ease of learning • Test often.
– Efficiency of use
– Memorability
– Error frequency and severity
– Subjective satisfaction
1
3/09/2014
2
3/09/2014
• Empirical evaluation:
– Observational experiment
• Observation, problem identification
– Controlled Experiment
• Formal controlled scientific experiment
• Comparisons, statistical analysis
3
3/09/2014
Interview Questionnaire
• Established questionnaires will give more reliable and • A focus group is a small-group discussion guided by a
repeatable results than ad hoc questionnaires. trained leader (moderator) in an interactive setting.
• Three questionnaire for assessing the perceived • It is used to learn more about opinions on a designated
usability of an interactive system: topic, or a product, and then to guide future action.
– Questionnaire for User Interface Satisfaction (QUIS) (1988) • A focus group:
– Computer System Usability Questionnaire (CSUQ) (1995) – is an interview,
– System Usability Scale (SUS) (1996) – conducted by a trained moderator among a small group of
respondents,
– conducted in an unstructured and natural way where
respondents are free to give views from any aspect.
4
3/09/2014
Nielsen’s heuristics
Analytic inspection
• Visibility of system status.
• Experts use their knowledge of users & technology to • Match between system and real world.
review system usability.
• User control and freedom.
• Expert critiques can be formal or informal reports.
• Consistency and standards.
• Benefits: • Error prevention.
– Generate results quickly with low cost.
• Recognition rather than recall.
– Can be used early in the design phases
• Flexibility and efficiency of use.
• Aesthetic and minimalist design.
• Heuristic evaluation is a review guided by a set of
heuristics. • Help users recognize, diagnose, recover from errors.
• Walkthroughs involve stepping through a pre-planned • Help and documentation.
scenario and noting potential problems.
5
3/09/2014
• Observational experiment
– Observation, problem identification • In an observational usability test, representative users try
• Controlled Experiment to do typical tasks with the product, while observers,
– Formal controlled scientific experiment including the development staff, watch, listen, and take
– Comparisons, statistical analysis notes.
35 36 / 66
6
3/09/2014
Eye tracking
Think aloud
7
3/09/2014
• Independent Variable (IV, what you very) • Dependent Variable (DV, what you measure)
– Independent of participant behavior – User performance time
– Examples: interface, visual layout, gender, age – Accuracy, errors
• Test conditions: levels, or value of an IV – Subjective satisfaction
• Provide a name for both IV and its levels (test
conditions)
IV Conditions (levels)
Device Mouse, trackball, joystick
visualization 2d, 3d, animated
Search engine Google, Bing
Feedback mood Audio, video, tactile, none
• How to “prove” a hypotheses in science? • Statistical tests calculate the probability that the pattern
• In most cases, it is impossible to prove the hypothesis of results we have observed in our data could have
directly. arisen by chance alone.
• This is done by disproving the null hypothesis. This is true for the
population sample
IT students are better 100 IT students and sample. What about
Result:
– Easier to disprove things, by counter-example than psychology 100 psychology population?
Mean for IT =80;
– First we suppose the null hypothesis true: Null students in memorizing students were
Mean for psychology
hypothesis=opposite of hypothesis numbers? tested
=60
– Then a conflicting result is found
Two possibilities for the population: Statistical tests allow us to
– Disprove the null hypothesis decide which of the two
– Hence, the hypothesis is proved 1) There really is a difference in the
possible explanations is the
population, or
most likely by calculating a p
2) There is no difference in the value.
population.
8
3/09/2014
• Within-subjects
• Main types:
– Each subject is exposed to each of the conditions
– Between-subjects (B-S)
– Each subject contributes one entry for each of the conditions
– Within-subjects (W-S)
• Usually, WS design is used in HCI evaluations, and
• Between-subjects
requires 10-30 subjects
– One subject is exposed to only one condition, and contributes
one entry to the whole data.
– Subjects are randomly allocated to one of the conditions Condition 1 Condition 2
subject 1 subject 1
Condition 1 Condition 2
subject 2 subject 2
subject 1 subject 13
…. ….
subject 2 subject 14
subject 12 subject 12
…. ….
subject 12 subject 24
9
3/09/2014
• Advantages • Disadvantages
– fewer subjects – Learning effect
– Less time – Carryover effect
– Fatigue, boredom
– Less expensive
• Solutions:
– Increased control of subjects variability: comparisons between – More practice before testing; randomization; counterbalancing
(Latin square)
conditions happen within each subject.
– More power to detect significant difference A B C
B C A
• Parametric methods are more powerful in detecting Parametric t test t test paired t test B-S ANOVA W-S ANOVA
significant differences, but have more strict assumptions Non- Wilcoxon rank Mann- Krusakl-
on data. parametric sum Whitney Wilcoxon Wallis Friedman
10
3/09/2014
Ethics Acknowledgement
11