Evaluation Techniques in Human-Computer Interaction

Departemen Teknik Elektro, Universitas Indonesia

Prof. Dr. Ir. Riri Fitri Sari, M.Sc., M.M.

Evaluation Techniques

• Evaluation tests the usability, functionality and acceptability of an interactive system

• Its purpose is to assess designs and test systems to ensure that they actually behave as we expect and meet user requirements

• Evaluation should be considered at all stages in the design life cycle

• It is not possible to perform extensive experimental testing continuously throughout the design, but analytic and informal techniques can be used

3 Main Goals of Evaluation

• assess the extent and accessibility of the system's functionality

• assess users' experience of the interaction

• identify any specific problems with the system

Evaluation through Expert Analysis

• Ideally, the first evaluation should be performed before any implementation work has started

• Expensive mistakes can then be avoided

• The later in the design process an error is discovered, the more costly it is to put right

• A number of methods have been proposed to evaluate interactive systems through expert analysis

• We consider four approaches to expert analysis: cognitive walkthrough, heuristic evaluation, the use of models and the use of previous work

Cognitive Walkthrough

Proposed by Polson and colleagues

• evaluates a design on how well it supports the user in learning a task

• usually performed by an expert in cognitive psychology

• the expert ‘walks through’ the design to identify potential problems using psychological principles

• main focus: how easy a system is to learn

• For each task, the walkthrough considers:

  • what impact will the interaction have on the user?

  • what cognitive processes are required?

  • what learning problems may occur?

• The analysis focuses on goals and knowledge: does the design lead the user to generate the correct goals?

Heuristic Evaluation

• Proposed by Nielsen and Molich

• A heuristic is a guideline, general principle or rule of thumb that can guide a design decision

• Heuristic evaluation can be performed on a design specification, so it is useful for evaluating early designs

• It can also be used on prototypes, storyboards and fully functioning systems, which makes it a flexible and relatively cheap approach

• The design is examined by experts to see whether any of the heuristics are violated

Heuristic Evaluation

• Example heuristics:

  • system behaviour is predictable

  • system behaviour is consistent

  • feedback is provided

• Heuristic evaluation ‘debugs’ the design

Heuristic Evaluation

Nielsen’s ten heuristics are:

1) Visibility of system status

2) Match between system and the real world

3) User control and freedom

4) Consistency and standards

5) Error prevention

6) Recognition rather than recall

7) Flexibility and efficiency of use

8) Aesthetic and minimalist design

9) Help users recognize, diagnose, and recover from errors

10) Help and documentation
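
When several experts examine the same design, their reports have to be collated. The sketch below shows one simple way to do this by tallying reported problems per heuristic. It is illustrative only: the `Finding` record, the evaluator labels and the example findings are hypothetical, not part of any standard tool or of Nielsen's method.

```python
from collections import Counter
from dataclasses import dataclass

# Nielsen's ten heuristics, in their usual order.
HEURISTICS = [
    "Visibility of system status",
    "Match between system and the real world",
    "User control and freedom",
    "Consistency and standards",
    "Error prevention",
    "Recognition rather than recall",
    "Flexibility and efficiency of use",
    "Aesthetic and minimalist design",
    "Help users recognize, diagnose, and recover from errors",
    "Help and documentation",
]


@dataclass(frozen=True)
class Finding:
    """One problem reported by one evaluator (hypothetical record format)."""
    evaluator: str    # label for the expert who reported the problem
    heuristic: int    # 1-based index into HEURISTICS
    description: str  # what the evaluator observed


def violations_per_heuristic(findings):
    """Return (heuristic name, count) pairs, most frequently violated first."""
    for finding in findings:
        if not 1 <= finding.heuristic <= len(HEURISTICS):
            raise ValueError(f"unknown heuristic number: {finding.heuristic}")
    counts = Counter(finding.heuristic for finding in findings)
    return [(HEURISTICS[number - 1], n) for number, n in counts.most_common()]


# Invented findings from two evaluators, for illustration only.
findings = [
    Finding("E1", 1, "No progress indicator while saving"),
    Finding("E2", 1, "File upload gives no feedback"),
    Finding("E2", 5, "Delete has no confirmation step"),
]
summary = violations_per_heuristic(findings)
for name, count in summary:
    print(f"{count} x {name}")
```

A tally like this only shows where reports cluster; deciding which problems matter most still requires the evaluators' judgement.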

10 Usability Heuristics for User Interface Design

• https://www.nngroup.com/articles/ten-usability-heuristics/

10 Usability Heuristics for Virtual Reality

1. Visibility of System Status: The design should constantly inform users about system status. For example, the Oculus Quest displays battery life for the headset and controllers.

2. Match Between System and the Real World: The design should be familiar to the user. Real-world conventions can help new users navigate VR systems, as illustrated by the Immersed VR environment.

3. User Control and Freedom: Provide clear exits for users to leave unwanted actions. Beat Saber allows users to cancel customization processes easily, emphasizing this principle.

4. Consistency and Standards: Ensure uniformity across your design to avoid confusion. For instance, Gravity Sketch used a slider for on-off switches, which contradicted standard designs and increased cognitive load.

5. Error Prevention: Design proactive systems that can anticipate and prevent errors. Oculus uses a grid to alert users when they approach the boundary of their play area, thereby preventing errors.

6. Recognition Rather than Recall: Reduce memory load by making elements and options visible. Oculus used unlabeled icons with tooltips, causing potential strain on short-term memory.

7. Flexibility and Efficiency of Use: Design should cater to both novice and experienced users. Beat Saber and Firefox Reality allow users to customize their experiences, thereby enhancing user engagement.

8. Aesthetic and Minimalist Design: Remove irrelevant information. While YouTube maintained a minimalist interface focusing on primary actions, Pokerstars VR had a cluttered interface.

9. Help Users Recognize, Diagnose, and Recover from Errors: Provide clear, constructive error messages. Firefox Reality and Pokerstars VR demonstrated failures in this area with their lack of diagnostic and actionable help.

10. Help and Documentation: Provide documentation to help users understand how to complete their tasks. Immersed offered multiple channels of support, such as video tutorials and FAQs, for user assistance.

In conclusion, while VR presents a unique interface, standard usability heuristics remain relevant for enhancing the user experience.

• https://www.nngroup.com/articles/usability-heuristics-virtual-reality/

Model-based evaluation

• Another expert-based approach is the use of models

• Certain cognitive and design models provide a means of combining design specification and evaluation into the same framework

e.g. the GOMS (goals, operators, methods and selection rules) model predicts user performance with a particular interface and can be used to filter design options

• Design rationale can also provide useful evaluation information
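
To illustrate how a GOMS-style model can predict performance and filter design options, the sketch below uses the Keystroke-Level Model (KLM), a simplified member of the GOMS family. The operator times are commonly cited approximations attributed to Card, Moran and Newell and should be treated as assumptions; the two operator sequences are hypothetical designs invented for this example.

```python
# Approximate KLM operator times in seconds (assumed, commonly cited values).
OPERATOR_TIMES = {
    "K": 0.2,   # press a key or mouse button (skilled typist)
    "P": 1.1,   # point with the mouse at a target on screen
    "H": 0.4,   # move the hand between keyboard and mouse
    "M": 1.35,  # mentally prepare for the next step
}


def predict_time(operators):
    """Sum the operator times for a sequence such as ["M", "P", "K"]."""
    return sum(OPERATOR_TIMES[op] for op in operators)


# Hypothetical design A: save via a menu.
# think, reach for mouse, point at menu, click, point at item, click
menu_design = ["M", "H", "P", "K", "P", "K"]

# Hypothetical design B: save via a two-key keyboard shortcut.
# think, press modifier, press letter
shortcut_design = ["M", "K", "K"]

menu_time = predict_time(menu_design)
shortcut_time = predict_time(shortcut_design)

print(f"menu:     {menu_time:.2f} s")
print(f"shortcut: {shortcut_time:.2f} s")
```

Predictions of this kind apply to error-free expert performance, so they are best used to compare design options rather than to estimate absolute task times.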

Review-based evaluation

• Results from the literature can be used to support or refute parts of the design

• Care is needed to ensure that the results are transferable to the new design

Evaluation through User Participation

• The techniques considered so far concentrate on evaluating a design or system through analysis by the designer or an expert evaluator, rather than testing with actual users

• There are a number of different approaches to evaluation through user participation

Styles of evaluation

• We distinguish between two distinct evaluation styles:

• Those performed under laboratory conditions

• Those conducted in the work environment or ‘in the field’

Laboratory studies

• Advantages:

• specialist equipment available

• uninterrupted environment

• Disadvantages:

• lack of context (e.g. filing cabinets, calendars, books)

• difficult to observe several users cooperating

Field Studies

• Advantages:

• natural environment

• longitudinal studies possible

• Disadvantages:

• distraction

• noise

Experimental evaluation

• One of the most powerful methods of evaluating a design is to use a controlled experiment

• It can be used to study a wide range of different issues at different levels of detail

• A number of factors are important to the overall reliability of an experiment and must be considered carefully in experimental design

• These include the participants chosen, the variables tested and manipulated, and the hypothesis tested

Experimental factors

• Subjects: who takes part; they should be representative of the intended users

• Variables: the things to modify and measure

• Hypothesis: what you would like to show

• Experimental design: how you are going to do it

Variables

• Independent variables (IVs) are those elements of the experiment that are manipulated to produce different conditions for comparison, e.g. interface style, number of menu items and icon design

• Dependent variables (DVs) are the variables that can be measured in the experiment; their value is ‘dependent’ on the changes made to the independent variable, e.g. the time taken to complete a task or the number of errors made

Hypothesis

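• A hypothesis is a prediction of the outcome of an experiment, framed in terms of the independent and dependent variables

• The aim of the experiment is to show that this prediction is correct, in practice by disproving the null hypothesis that the independent variable has no effect on the dependent variable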

Experimental design

• In order to produce reliable and generalizable results, an experiment must be carefully designed

• The first phase in experimental design is to choose the hypothesis: to decide exactly what it is you are trying to demonstrate

• In doing this we are likely to clarify the independent and dependent variables

• A number of experimental conditions are then considered, which differ only in the value of the variable being manipulated
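
The sketch below shows how two experimental conditions might be compared once the data are in. It assumes a between-subjects design in which the independent variable is interface style (menu vs. icon) and the dependent variable is task completion time in seconds. The completion times are invented for illustration, and Welch's unequal-variance t statistic is only one possible choice of test.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Squared standard error of each sample mean.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical completion times (seconds), one value per participant.
menu_times = [12.0, 14.0, 13.0, 15.0, 16.0]  # condition 1: menu interface
icon_times = [10.0, 11.0, 9.0, 12.0, 13.0]   # condition 2: icon interface

t, df = welch_t(menu_times, icon_times)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The statistic is then compared with the critical value for the chosen significance level; if it exceeds that value, the null hypothesis of no difference between the conditions is rejected.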

Experimental studies on groups

More difficult than single-user experiments

Problems with:

• complexities of human-human communication & group working

• choice of task

• data gathering

• analysis

Participant groups

• larger number of subjects, so more expensive

• longer time to ‘settle down’, so even more variation

• difficult to timetable

The Task

• Choosing a suitable task is also difficult; we may want to test a variety of different task types

Options:

• creative task e.g. ‘write a short report on …’

• decision games e.g. desert survival task

• control task e.g. ARKola bottling plant

Data gathering

• Even a single-user experiment may use several video cameras

• In a group setting this is replicated for each participant

Problems:

• synchronisation

• sheer volume of data

One solution:

• record from each perspective
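
Once each perspective has been recorded, the separate streams have to be lined up before analysis. The sketch below merges several timestamped streams into one timeline. It assumes that every stream is already sorted by time and that all timestamps are measured in seconds from a common start signal; the stream names and events are hypothetical.

```python
import heapq


def merge_streams(*streams):
    """Merge time-ordered (timestamp, source, event) streams into one timeline.

    Each input stream must already be sorted by timestamp.
    """
    return list(heapq.merge(*streams, key=lambda record: record[0]))


# Invented observations from two cameras and a system log.
camera_a = [
    (0.0, "camera A", "participant A starts typing"),
    (4.2, "camera A", "participant A points at the screen"),
]
camera_b = [
    (1.5, "camera B", "participant B reads the task sheet"),
    (3.0, "camera B", "participant B asks a question"),
]
system_log = [
    (2.1, "system log", "document saved"),
]

timeline = merge_streams(camera_a, camera_b, system_log)
for timestamp, source, event in timeline:
    print(f"{timestamp:5.1f}s  {source:10}  {event}")
```

In practice the hard part is establishing the common start signal, for example a visible or audible cue captured by every recording device.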

Observational techniques

• One way to gather information about the actual use of a system is to observe users interacting with it

• Usually they are asked to complete a set of predetermined tasks

• The evaluator watches and records the users' actions

• Users may be asked to elaborate on their actions by ‘thinking aloud’

Think Aloud

• the user is observed performing a task

• the user is asked to describe what they are doing and why, what they think is happening, etc.

• Advantages

• simplicity: requires little expertise

• can provide useful insight

• can show how the system is actually used

Cooperative evaluation

• A variation on think aloud is cooperative evaluation, in which the user is encouraged to see themselves as a collaborator in the evaluation

• Both user and evaluator can ask each other questions throughout

• Additional advantages

• less constrained and easier to use

• user is encouraged to criticize system

Protocol analysis

• paper and pencil – cheap, limited to writing speed

• audio – good for think aloud, difficult to match with other protocols

• video – accurate and realistic, needs special equipment, obtrusive

• computer logging – relatively easy to have the system record user actions automatically; it tells us what the user is doing on the system

• user notebooks – coarse and subjective, useful insights, good for longitudinal studies
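
As an illustration of computer logging, the sketch below records timestamped user actions and reports the time between consecutive actions. The `InteractionLog` class and its method names are hypothetical, not taken from any particular toolkit, and the injected clock exists only to make the example repeatable.

```python
import time


class InteractionLog:
    """Record user actions with timestamps (illustrative sketch only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so that the log can be tested
        self._events = []

    def record(self, action, target):
        """Store one user action together with the current time."""
        self._events.append((self._clock(), action, target))

    def events(self):
        """Return a copy of the recorded (time, action, target) tuples."""
        return list(self._events)

    def gaps(self):
        """Seconds between consecutive actions; long gaps may indicate hesitation."""
        times = [timestamp for timestamp, _, _ in self._events]
        return [later - earlier for earlier, later in zip(times, times[1:])]


# Usage with a fake clock so that the result is deterministic.
fake_times = iter([0.0, 1.5, 6.0])
log = InteractionLog(clock=lambda: next(fake_times))
log.record("click", "File menu")
log.record("click", "Save As")
log.record("type", "filename field")
print(log.gaps())
```

A log like this shows what the user did and when, so it is usually combined with another protocol, such as think aloud, to find out why.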

Automated analysis – EVA

• Analyzing protocols, whether video, audio or system logs, is time consuming

• It is harder still if there is more than one stream of data

• One solution to this problem is to provide automatic analysis tools to support the task

• EVA (Experimental Video Annotator) is a system that runs on a multimedia workstation with a direct link to a video recorder

• Post-task walkthrough: the user reflects on their actions after the event; used to fill in intention

Interviews and Questionnaires

Interviews

• the analyst questions the user on a one-to-one basis, usually working from prepared questions

• informal, subjective and relatively cheap

• Advantages

• issues can be explored more fully

• can elicit user views and identify unanticipated problems

• Disadvantages

• very subjective

• time consuming

Questionnaires

• Set of fixed questions given to users

• Advantages

• quick, and can reach a large user group

• can be analyzed more rigorously

• Disadvantages

• less flexible

• less probing

Questionnaires

• Need careful design

• what information is required?

• how are answers to be analyzed?

• Styles of question

• general

• open-ended

• scalar

• multi-choice
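
Deciding how answers are to be analyzed is easiest for scalar questions. The sketch below summarises responses to a single scalar question, assuming a 1 to 5 agreement scale; the scale range, the question and the responses are all hypothetical.

```python
import statistics
from collections import Counter


def summarise_scalar(responses, low=1, high=5):
    """Summarise answers to one scalar question on a low..high scale."""
    for value in responses:
        if not low <= value <= high:
            raise ValueError(f"response {value} is outside {low}..{high}")
    counts = Counter(responses)
    return {
        "n": len(responses),
        "median": statistics.median(responses),
        "mean": statistics.mean(responses),
        # Include scale points nobody chose, so the distribution is complete.
        "counts": {point: counts.get(point, 0) for point in range(low, high + 1)},
    }


# Invented answers to "The system was easy to use" (1 = disagree, 5 = agree).
ease_of_use = [4, 5, 3, 4, 2, 4]
summary = summarise_scalar(ease_of_use)
print(summary)
```

Because scale points are ordered categories rather than true measurements, the median and the distribution of counts are often more informative than the mean.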

Choosing an evaluation method

• A range of techniques is available for evaluating a system at all stages in the design process

• So how do we decide which methods are most appropriate? There are no hard and fast rules

• Each method has its particular strengths and weaknesses, and each is useful when applied appropriately

• There are a number of factors that should be taken into account when selecting evaluation techniques

Choosing an Evaluation Method

• when in process: design vs. implementation

• style of evaluation: laboratory vs. field

• type of measures: qualitative vs. quantitative

• level of information: high level vs. low level

• resources available: time, subjects, equipment, expertise