
Self Reported

Mustika Febrillia 13414076
Atikah Arysolia Taifur 14414056
Self Reported Data
To learn about the usability of something, ask
the participants to tell you about their experience
with it.

The key question is how exactly to ask participants
so that we get good data.

Self-reported data is one of the best ways to capture this.

Self-reported data includes the verbatim comments
made by participants while using a product.
Kinds of Self-Reported Data
Subjective data: often used as a counterpart to
"objective," which typically describes performance
data from a usability study. But this may imply a
lack of objectivity in the data you're collecting.

Preference data: often used as a counterpart to
performance data. "Preference" implies a choice of
one option over another, which is often not the case
in UX studies.
Importance of Self-Reported Data
This data gives the most important information about
users' perception of the system and their interaction
with it.

The data tells us how users feel about the system.

Fact: users will not remember how long the process
of using a website took or how many clicks they made,
but if the experience made them happy, that's the
only thing that matters.

Subjective reactions may be the best predictor of
their likelihood to return or make a purchase in the
future.
Rating Scales
One of the most common ways to capture self-
reported data in a UX study is with some type of
rating scale.
Likert Scales
Use a statement to which respondents rate their
level of agreement.

Characteristics: (1) express a degree of
agreement with a statement, (2) give an odd
number of response options, allowing a neutral response.

When designing statements, avoid adverbs such as
"very," "extremely," or "absolutely," and use
unmodified versions of adjectives.
Semantic Differential Scales
The semantic differential technique involves
presenting pairs of bipolar, or opposite, adjectives.

Five- or seven-point scales are commonly used (odd).

Be aware of the connotations of different
pairings of words.
When to Collect Self-
Reported Data?
Post-task ratings

A quick rating immediately after each task can help
pinpoint tasks and parts of the interface that are
particularly problematic.

Post-study ratings

Can provide an overall evaluation after the
participant has had a chance to interact with the
product more fully. Allows more in-depth ratings.
How to Collect Ratings
There are three ways:

Answer questions or provide ratings orally

The easiest method, but it can introduce bias, e.g.,
participants may feel uncomfortable verbally stating
poor ratings.

Record responses on a paper form

Manual data entry afterward can introduce errors.

Provide responses using some type of online tool

Requires tablets or laptops.

Biases in Collecting Self-
Reported Data
Social desirability bias: respondents tend to give
answers they believe will make them look better
in the eyes of others.

Studies have shown (Dillman et al., 2008) that people
who are asked directly for self-reported data provide
more positive feedback than when asked
through an anonymous web survey.
General Guidelines for Rating Scales
Multiple scales help triangulate: you can get
more reliable data if you can think of different ways
to ask participants about the same thing.

Odd or even number of values: an odd number
allows a neutral response.

Total number of points: five or seven points is
typical, but more is not always better.
Analyzing Rating-Scale Data
The most common technique for analyzing data
from rating scales is to assign a numeric value to
each of the scale positions and then compute
the averages.

Descriptive statistics such as the mean and
mode can also be used.
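As a minimal sketch (the ratings below are made up), a five-point scale can be coded as 1-5 and summarized with the standard library:

```python
from statistics import mean, median, mode

# Hypothetical responses from eight participants on a 5-point scale
# (1 = strongly disagree ... 5 = strongly agree)
ratings = [4, 5, 3, 4, 2, 5, 4, 4]

print(mean(ratings))    # 3.875 -- the average rating
print(median(ratings))  # 4.0
print(mode(ratings))    # 4 -- the most frequent response
```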
Post-Task Ratings
The main goal of ratings associated with each
task is to give you some insight into which tasks
the participants thought were the most difficult. Next
we will examine some specific techniques.
Ease of Use
Usually asks users to rate how easy or how
difficult each task was.

Some UX professionals prefer to use a traditional
Likert scale.

Compared to several other post-task ratings, it has
been found to be among the most effective.
After-Scenario Questionnaire (ASQ)
Touches upon three fundamental areas of usability: (1)
effectiveness, (2) efficiency, and (3) satisfaction.

Three rating scales were developed, covering these areas.

Expectation Measure
The most important thing about each task is how
easy or difficult it was in comparison to how easy
or difficult the user thought it was going to be.

Expectation measure: ask respondents to rate
how easy/difficult they expect each task to be
before attempting it.

The results can be interpreted in quadrants
(expected vs. actual difficulty).
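The quadrant interpretation can be sketched as follows. The task names, ratings, and the 4.0 midpoint are hypothetical, and the quadrant labels follow the common "fix it fast / big opportunity / don't touch it / promote it" framing:

```python
# Hypothetical expected vs. actual ease ratings on a 7-point scale (7 = very easy)
tasks = {
    "search":   {"expected": 6.2, "actual": 3.1},
    "checkout": {"expected": 2.5, "actual": 2.8},
    "login":    {"expected": 6.5, "actual": 6.4},
    "export":   {"expected": 3.0, "actual": 6.0},
}

MID = 4.0  # assumed midpoint separating "easy" from "difficult"

def quadrant(expected, actual):
    if expected >= MID and actual < MID:
        return "Fix it fast"      # expected easy, turned out hard
    if expected < MID and actual < MID:
        return "Big opportunity"  # expected hard, was hard
    if expected >= MID and actual >= MID:
        return "Don't touch it"   # expected easy, was easy
    return "Promote it"           # expected hard, pleasantly easy

for name, r in tasks.items():
    print(f"{name}: {quadrant(r['expected'], r['actual'])}")
```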

A Comparison of Post-Task
Self-Reported Metrics
Goal: to see if these rating techniques are
sensitive enough to detect differences in the
perceived difficulty of the tasks.

Also wanted to see how the perceived difficulty
of the tasks corresponded to the task
performance data.

These self-reported metrics can be used as an
overall barometer of the usability of the product.
Aggregating Individual Task Ratings
Take an average of the individual task-based
ratings, using a simple or weighted average.

Simply take an average of the data. If some tasks
are more important than others, then take a
weighted average.

By looking at the data, we can track average
perception as it changes over time.
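A minimal sketch of both aggregations, with hypothetical task names, ratings, and importance weights:

```python
# Hypothetical mean ease ratings (1-5) and importance weights per task
ratings = {"login": 4.2, "search": 3.1, "checkout": 2.5}
weights = {"login": 1.0, "search": 2.0, "checkout": 3.0}

# Simple average treats every task equally
simple_avg = sum(ratings.values()) / len(ratings)

# Weighted average lets the more important tasks count for more
weighted_avg = sum(ratings[t] * weights[t] for t in ratings) / sum(weights.values())

print(round(simple_avg, 2))    # 3.27
print(round(weighted_avg, 2))  # 2.98 -- checkout's low rating weighs more
```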
System Usability Scale (SUS)
One of the most widely used
tools for assessing the
perceived usability of a system
or product.

Consists of 10 statements to
which users rate their level of
agreement.

Interpretation of SUS scores:
<50 : Not acceptable
50-70 : Marginal
>70 : Acceptable
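SUS's published scoring rule can be sketched as follows (the ten responses shown are hypothetical):

```python
def sus_score(responses):
    """Compute a SUS score (0-100) from the 10 item responses (each 1-5).

    Odd-numbered items are positively worded: contribution = response - 1.
    Even-numbered items are negatively worded: contribution = 5 - response.
    The sum of contributions (0-40) is multiplied by 2.5 to scale to 0-100.
    """
    assert len(responses) == 10
    total = sum((r - 1) if i % 2 == 1 else (5 - r)
                for i, r in enumerate(responses, start=1))
    return total * 2.5

# Hypothetical participant who mostly agrees with the positive items
print(sus_score([4, 2, 5, 1, 4, 2, 4, 1, 5, 2]))  # 85.0
```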
Computer System Usability
Questionnaire (CSUQ)
The CSUQ was designed to be administered by mail or online.

Consists of 19 statements; users rate agreement
on a seven-point scale.

Statement example: "It was simple to use this
system."

Results are viewed in four categories: System
Usefulness, Information Quality, Interface Quality,
and Overall Satisfaction.
Questionnaire for User
Interface Satisfaction (QUIS)
Developed at the HCIL, University of Maryland (1988).

Consists of 27 rating scales divided into five
categories: Overall Reaction, Screen,
Terminology/System Information, Learning, and
System Capabilities.

The anchors of the 10-point scales change
depending on the statement.
USE Questionnaire
Proposed by Arnie Lund (2001).

Results can be summarized in a radar chart.
Product Reaction Cards
Proposed by Benedek and Miner (2002).

A set of 118 cards, each with a descriptive word.

Participants pick their top 5 and explain why!

Results can be visualized using a word cloud.
Comparison of Post-Session
Self-Reported Metrics
Study conducted by Tullis and Stetson (2004).

Used the SUS, QUIS, CSUQ, product reaction words,
and their own questionnaire to evaluate
two web portals; 123 participants, each using
one questionnaire.
Net Promoter Score (NPS)
Originated by Fred Reichheld (2003).

"How likely is it that you would recommend this to
a friend or colleague?" (rated 0-10)

Three categories of respondents:

o Detractors : gave ratings 0-6
o Passives : gave ratings 7 or 8
o Promoters : gave ratings 9 or 10
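The score itself is the percentage of promoters minus the percentage of detractors; a sketch with made-up ratings:

```python
def nps(ratings):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)

# Hypothetical sample: 3 promoters, 4 passives, 3 detractors
print(nps([10, 9, 9, 8, 8, 7, 7, 6, 5, 3]))  # 0.0 (30% - 30%)
```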
Using SUS to Compare Designs
Traci Hart (2004): compared three different
websites for adults. After attempting tasks,
participants filled out the SUS.

The American Institutes for Research (2001):
compared Windows ME and Windows XP; 36
experts attempted tasks, then filled out the SUS
questionnaire.

Sarah (2006): compared three types of paper
ballots; participants used the ballots in a
simulation, then filled out the SUS.
Online Services
VoC (Voice of the Customer) studies are typically
done on live websites.

Common approach: pop-up surveys.

Another approach: a standard mechanism for
getting feedback, such as:

1. Website Analysis and Measurement Inventory
(WAMMI): composed of 20 statements on a
five-point Likert scale

2. American Customer Satisfaction Index (ACSI)

3. OpinionLab: page-level feedback from users
Issues with Live-Site Surveys
Number of questions

Self-selection of respondents

Number of respondents

Nonduplication of respondents
Other Types of Self-Reported Data
Assessing Specific Attributes

Assessing Specific Elements

Open-Ended Questions

Awareness and Comprehension

Awareness and Usefulness Gaps

Assessing Specific Attributes
Some attributes of a product or website that
might be assessed:
- Visual Appeal
- Perceived Efficiency
- Confidence
- Usefulness
- Credibility
- Appropriateness of terminology
- Ease of navigation
- Responsiveness
Assessing Specific Elements
Such as:

- Instructions

- FAQs or Online help

- Homepage

- Search function

- Site map

- etc.
Open-Ended Questions
Allow users to add comments related to any of
the individual rating scales.

Ask users to list the three to five things they like
the most about the product and the three to five
things they like the least.

Ask them to describe anything they found confusing
about the interface.

A simple analysis method: word clouds.
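Word clouds boil down to word frequencies; a minimal sketch with invented comments and an ad hoc stop-word list:

```python
from collections import Counter
import re

# Hypothetical open-ended comments from three participants
comments = [
    "The search was fast but the checkout was confusing",
    "Checkout was confusing, search results were useful",
    "Fast search, confusing error messages",
]

STOPWORDS = {"the", "was", "but", "were", "and", "a"}

# Lowercase, split into words, and drop the stop words
words = [w for c in comments
         for w in re.findall(r"[a-z]+", c.lower())
         if w not in STOPWORDS]

# These counts would drive the word sizes in a word cloud
print(Counter(words).most_common(3))
```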
Awareness and Comprehension
Test users' learning and comprehension of
website contents by giving a quiz to test their
comprehension of the information.

If necessary, administer a pretest to determine
what they already knew and compare it to the
post-test.
Awareness and Usefulness Gaps
Typically ask users about awareness as
a yes/no question, e.g., "Were you
aware of this functionality?", then ask them
to rate its usefulness on a 1-5 scale.

Convert the rating-scale data into top-2-
box scores.

Plot the % of users who were aware against the
% of users who found the functionality useful.
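A minimal sketch of the gap computation, with hypothetical answers (top-2-box = share of ratings of 4 or 5 on the 1-5 scale):

```python
# Hypothetical data for one piece of functionality:
# awareness is yes/no; usefulness is rated 1-5 by the aware participants only
awareness = [True, True, False, True, False, True, True, False]
usefulness = [5, 4, 3, 5, 2]

aware_pct = 100.0 * sum(awareness) / len(awareness)

# Top-2-box: percentage of usefulness ratings in the top two positions (4-5)
useful_pct = 100.0 * sum(1 for r in usefulness if r >= 4) / len(usefulness)

# Plotting aware_pct against useful_pct reveals the awareness-usefulness gap
print(aware_pct, useful_pct)  # 62.5 60.0
```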