Copyright © 2022 Sunu Wibirama | Do not distribute without permission @sunu_wibirama 1
Sunu Wibirama
[email protected]
Department of Electrical and Information Engineering Faculty of Engineering
Universitas Gadjah Mada INDONESIA
Basic Statistics for User Experience
Version: 1 September 2022
Dr. Sunu Wibirama

Positions
❑ 2008–now: Faculty member in the Department of Electrical and Information Engineering, Universitas Gadjah Mada, Indonesia
❑ 2015: Post-doctoral researcher in Tampere Unit for Computer-Human Interaction, Tampere University, Finland
❑ 2016–2019: Visiting research fellow in Anadolu University, Turkey; Shibaura Institute of Technology, Japan; and Universiti Teknologi Malaysia (UTM), Malaysia
❑ 2018–now: Section Editor-in-Chief, ASEAN Engineering Journal (AEJ), Computer and Information Engineering (ASEAN University Network, JICA, and UTM; indexed by Scopus and ASEAN Citation Index)
❑ 2019–now: Chair, IEEE Systems, Man, and Cybernetics Indonesia Chapter

Education
❑ 2014: Dr.Eng., Science and Technology, Tokai University, Tokyo, Japan
❑ 2010: M.Eng., Electronics Engineering, KMITL, Bangkok, Thailand
❑ 2007: B.Eng., Electrical Engineering, UGM, Yogyakarta, Indonesia

Interests
❑ Human-computer interaction / touchless technology
❑ Eye tracking applications and eye movements analysis
❑ Virtual reality and human factors
❑ Applied artificial intelligence
Outlines
• Part 1: Types of data and descriptive statistics
• Part 2: Inferential statistics
• Part 3: Data visualization [self-study, see reading material]
Reading material:
• Albert, B. and Tullis, T., 2013. Measuring the user experience: collecting, analyzing, and presenting usability metrics, 2nd Edition, Morgan Kaufmann, Chapter 2.
TYPE OF DATA AND DESCRIPTIVE STATISTICS
PART 1
S. Wibirama, P. I. Santosa, P. Widyarani, N. Brilianto, W. Hafidh, “Physical Discomfort and Eye Movements during Arbitrary and Optical Flow-Like Motions in Stereo 3D Contents”, Virtual Reality, Vol. 24, 2020, pp. 39-51.
S. Wibirama, S. Murnani, and N.A. Setiawan, “Spontaneous Gaze Gesture Interaction in the Presence of Noises and Various Types of Eye Movements”, in Symposium on Eye Tracking Research and Applications (ETRA ’20 Short Papers), June 2–5, 2020, Stuttgart, Germany. ACM, New York, NY, USA, 5 pages, 2020.
Introduction
• Statistics is a basic mathematical tool for analyzing UX metrics. Statistics for UX is mostly drawn from statistics for applied psychology.
• If you are good at statistics, you can draw an inference (conclusion) from your experiment with an appropriate analysis → some final projects/capstone projects involve participants in their experiments.
• Purpose of this lecture:
• Provide basic information about understanding data
• Give a practical step-by-step guide to analyzing data without a large number of formulas or complicated statistics
• You can use it to present the results of your market research and the validation of your interface designs
(related / repeated-measures)
Summary: within vs. between-subject design
If you want to compare two or more stimuli, I suggest using the same participants for all stimuli (within-subject).
Special case: if you want to compare participants based on gender or age (gender or age as the independent variable), you should use a different group of participants for each stimulus (between-subject).
[Figure: within-subject design (Interface A, Interface B, Interface C) vs. between-subject design (Interface A, Interface B)]
[Figure: raw task time for 12 users and the resulting descriptive statistics]
In Mac:
(1) Tools > Excel Add-Ins > Analysis ToolPak (checked)
(2) Data tab > Data Analysis
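The same kind of summary that the Excel Data Analysis tool produces can be sketched in Python with only the standard library. The task times below are hypothetical values chosen for illustration (they are not the data from the slide), and the function name `describe` is an assumption.

```python
import statistics

# Hypothetical task times in seconds for 12 users (not the lecture's data).
task_times = [34, 33, 28, 44, 46, 21, 22, 53, 22, 29, 39, 50]

def describe(values):
    """Return common descriptive statistics for a list of numbers."""
    n = len(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return {
        "count": n,
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": sd,
        "variance": statistics.variance(values),
        "range": max(values) - min(values),
        "minimum": min(values),
        "maximum": max(values),
        "standard_error": sd / n ** 0.5,
    }

summary = describe(task_times)
for name, value in summary.items():
    print(f"{name:>15}: {value:.2f}")
```

The mean and median describe the center of the data, while the standard deviation, variance, and range describe its spread.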
• Confidence level 95%: you are 95% certain that the true population value lies within the designated range
• Alpha level 5%: you are willing to be wrong 5% of the time
Note: [standard deviation / sqrt(sample size)] is the “standard error of the mean” (SE)
SE: how precisely the sample mean estimates the population mean
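A minimal sketch of how the standard error turns into a confidence interval is given here. It assumes a normal (z) approximation, which is an assumption of this example; with small samples a t-based interval would be somewhat wider. The checkout times and the function name `confidence_interval` are hypothetical.

```python
import statistics
from statistics import NormalDist

def confidence_interval(values, confidence=0.95):
    """Return (mean, lower, upper) using a normal (z) approximation."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / n ** 0.5   # standard error of the mean
    alpha = 1 - confidence                      # e.g. 0.05 for a 95% level
    z = NormalDist().inv_cdf(1 - alpha / 2)     # about 1.96 for 95%
    margin = z * se
    return mean, mean - margin, mean + margin

# Hypothetical checkout times in seconds for one design.
design_a = [52, 48, 60, 55, 47, 58, 51, 49]
mean, lower, upper = confidence_interval(design_a)
print(f"mean = {mean:.2f} s, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

A smaller standard error (less spread or more participants) gives a narrower interval, which is why SE is read as the precision of the sample mean.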
If the error bars (confidence intervals) of design A and design B do not overlap, you can be reasonably confident that the difference between their mean checkout times is significant
INFERENTIAL STATISTICS
PART 2
Inferential statistics
• Inferential statistics allow you to draw a conclusion (make a generalization) about a population based on your samples.
• It is a very powerful tool: most usability tests with interval and ratio data use inferential statistics.
• The choice of statistical test depends heavily on the design of the experiment (e.g. repeated-measures / paired samples vs. between-subject / independent samples) and the normality of the data.
Null hypothesis (H0), example: there is no difference in time to complete the task between interface A and B
Alternative hypothesis (H1), example: there is a difference in time to complete the task between interface A and B
Note: time to complete the task is the dependent variable; type of interface is the independent variable with two levels (interface A and B)
Alpha and Beta
Alpha = probability of a type 1 error
You believe that there is a “genuine” effect of your interface on task completion time but, in reality, there is no effect.
Beta = probability of a type 2 error
You believe that there is no “genuine” effect of your interface on task completion time but, in reality, there is an effect.
P-value
(Dr. Andy Field, 2018)
Use statistical software such as G*Power to determine the minimum sample size.
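To give a feel for what such software computes, here is a rough sketch for one common case: comparing the means of two independent groups with a two-sided test. It uses the textbook normal approximation, n per group ≈ 2 × ((z for alpha + z for power) / d)², where d is the standardized effect size. The function name and the default values (alpha = 0.05, power = 0.80) are assumptions of this example; dedicated tools use more exact calculations and may report slightly larger numbers.

```python
import math
from statistics import NormalDist

def min_sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants per group for two independent means (two-sided)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls the type 1 error rate
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta (type 2 error rate)
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return math.ceil(n)

# Conventional small, medium, and large standardized effect sizes.
for d in (0.2, 0.5, 0.8):
    n = min_sample_size_per_group(d)
    print(f"effect size d = {d}: about {n} participants per group")
```

The smaller the effect you want to detect, the more participants you need for the same alpha and beta.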
Note: use this test for a between-subject design (e.g. novices and experts are two different groups; no participant belongs to both “novice” and “expert” at the same time)
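Assuming the test referred to here is the independent-samples t-test, a minimal standard-library sketch could look as follows. It uses the classic pooled-variance (Student) form, and the p-value is obtained by numerically integrating the t density, which is a convenience choice for this example rather than how statistical packages do it. The data and function names are hypothetical.

```python
import math
import statistics

def two_sided_p_value(t, df, steps=2000):
    """Two-sided p-value for Student's t, via Simpson integration of the density."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps                      # steps must be even for Simpson's rule
    total = pdf(0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3                   # P(0 < T < |t|)
    return max(0.0, 1 - 2 * area)

def independent_t_test(group_a, group_b):
    """Student's t-test for two independent groups (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, df, two_sided_p_value(t, df)

# Hypothetical task times in seconds; each participant is in exactly one group.
novice = [35, 42, 38, 50, 44, 39]
expert = [28, 31, 25, 33, 30, 27]
t, df, p = independent_t_test(novice, expert)
print(f"t({df}) = {t:.2f}, p = {p:.4f}")
```

If p is below the chosen alpha level (for example 0.05), the null hypothesis of equal means is rejected.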
Note: use this test for within-subject design
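Assuming the test referred to here is the paired-samples t-test, the sketch below shows its core idea: each participant contributes one difference score, and the test asks whether the mean of those differences is far from zero relative to its standard error. The data and the function name are hypothetical; the p-value would then be looked up for the returned t and degrees of freedom in a t table or statistical software.

```python
import math
import statistics

def paired_t_statistic(condition_a, condition_b):
    """Return (t, df) for paired samples measured on the same participants."""
    if len(condition_a) != len(condition_b):
        raise ValueError("paired samples must have the same length")
    # One difference score per participant.
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se_diff = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean_diff / se_diff, n - 1

# Hypothetical task times in seconds; position i is the same participant in both lists.
interface_a = [41, 38, 45, 50, 36, 44, 39, 47]
interface_b = [37, 36, 40, 46, 35, 39, 38, 41]
t, df = paired_t_statistic(interface_a, interface_b)
print(f"t({df}) = {t:.2f}")
```

Because each participant serves as their own control, differences between people do not inflate the error term, which is one reason within-subject designs need fewer participants.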
Concept of ANOVA
• Suppose that two basketball teams (team A and team B) compete to make a successful three-point shot.
• If a player misses a shot, they can repeat until they make a successful one.
• You set two conditions during the experiment:
– Three-point shot with an audience watching both teams
– Three-point shot without an audience (isolated sport center)
• You then measure the duration (in seconds) each player needs to successfully score a three-point shot
Concept of ANOVA
Team A: mean difference = 9.53 − 8.79 = 0.74
Team B: mean difference = 9.53 − 8.79 = 0.74
Note: all data are in seconds
Concept of ANOVA
Team A: standard deviations 0.279 and 0.286; mean difference 0.74
Team B: standard deviations 4.75 and 5.101; mean difference 0.74
Note: all data are in seconds
• ANOVA compares within-treatments variability and between-treatments variability.
• In team A, within-treatments variability is low (about 0.28) compared with between-treatments variability (0.74).
• Therefore, the difference in scoring time (in seconds) between “with audience” and “alone” is a meaningful one, not a chance one.
• Hence, the difference in the duration of scoring a three-point shot between “with audience” and “alone” is a result of the treatment (letting the audience enter the sport center).
Concept of ANOVA
• In team B, however, within-treatments variability is large compared with between-treatments variability (the difference between the two means).
• If within-treatments variability is large compared with between-treatments variability, any difference between means is not convincing.
• Thus, the difference in the duration of scoring a three-point shot between “with audience” and “alone” cannot be attributed to the audience.
Concept of ANOVA
• Measuring the mean of each group is not enough to say that letting the audience in has an effect in the three-point shot experiment.
• To determine whether our treatment (letting the audience into the sport center) has an effect, we have to compare variability within treatments and between treatments.
• If within-treatments variability is low compared with between-treatments variability, then the effect of the treatment is significant.
• One independent variable: One-Way ANOVA
• Two independent variables: Two-Way ANOVA
[Figure labels: With audience, Alone]
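The comparison of between-treatments and within-treatments variability is what the F ratio of a one-way ANOVA expresses. The sketch below computes it with the standard library. The shot durations are hypothetical numbers invented to mimic the two teams (same mean difference, very different spread); they are not the values from the slides, and the function name is an assumption.

```python
import statistics

def one_way_anova(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_values)
    k, n_total = len(groups), len(all_values)
    # Between-treatments variability: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-treatments variability: how far each value is from its own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical durations in seconds: same group means for both teams,
# but team A has low spread and team B has high spread.
team_a_audience, team_a_alone = [9.3, 9.5, 9.8], [8.5, 8.8, 9.1]
team_b_audience, team_b_alone = [4.5, 9.5, 14.6], [3.8, 8.8, 13.8]

f_team_a, _, _ = one_way_anova(team_a_audience, team_a_alone)
f_team_b, _, _ = one_way_anova(team_b_audience, team_b_alone)
print(f"Team A: F = {f_team_a:.2f}")
print(f"Team B: F = {f_team_b:.2f}")
```

With these numbers team A yields a large F ratio and team B an F ratio close to zero, even though the mean difference is identical, which mirrors the argument in the bullets.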
Little/no effect vs. strong effect
• High within-treatments variability: little/no effect; no significant difference of means between treatments
• Low within-treatments variability: strong effect; significant difference of means between treatments
• Null hypothesis (H0): there is no effect from the treatment
• Between-subject design: there is no difference in the effect of the treatment between groups
• Example: ANOVA is used to analyze the means of 5 groups
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
$H_1$: at least one of the five groups has a mean that differs from the other groups
Basic assumptions of ANOVA
• Samples in each group are independent. Samples in group A do not interfere with samples in group B, C, and so on.
• Samples are taken randomly; there is no bias in sampling.
• Samples are taken from populations with a normal distribution.
• In practice, you can inspect the histogram or run a normality check (e.g. the Kolmogorov-Smirnov test) to see whether the data are consistent with a normally distributed population.
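As an illustration of the normality check, the sketch below computes the Kolmogorov-Smirnov D statistic: the largest gap between the empirical distribution of the sample and a normal distribution fitted to it. The data and function name are hypothetical. Fitting the mean and standard deviation from the same sample is an assumption of this sketch; it makes the usual large-sample critical value (about 1.36 / sqrt(n) at alpha = 0.05) conservative, so treat the comparison as a rough guide and prefer statistical software for a formal test.

```python
import statistics
from statistics import NormalDist

def ks_statistic_normal(values):
    """Largest gap between the empirical CDF and a normal CDF fitted to the data."""
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(statistics.mean(data), statistics.stdev(data))
    d = 0.0
    for i, x in enumerate(data, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

# Hypothetical task times in seconds: one roughly symmetric sample, one strongly skewed.
symmetric = [31, 35, 38, 40, 41, 43, 44, 46, 49, 53]
skewed = [20, 21, 21, 22, 22, 23, 23, 24, 25, 120]
for name, sample in (("symmetric", symmetric), ("skewed", skewed)):
    d = ks_statistic_normal(sample)
    print(f"{name}: D = {d:.3f}")
```

A small D means the sample follows the fitted normal curve closely; a large D signals a departure from normality, in which case a non-parametric test is the safer choice.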
R² = 0.28
Note: nominal and ordinal data are generally not normally distributed, and their variances are generally not equal
In the previous example we were just examining the distribution of success rates across a single variable (experience group).
There are some situations in which you might want to examine more than one variable, such as experience group and design prototype. Performing this type of evaluation works the same way.
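Assuming the evaluation described here is the chi-square test covered in the reading material, a minimal sketch for a table of success and failure counts could look as follows. The counts are hypothetical and the function name is an assumption. The same function accepts any table of observed counts, so a second variable can be examined by tabulating the counts along it instead.

```python
import math

def chi_square_independence(table):
    """Return (chi2, df) for a table of observed counts (rows x columns)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows are experience groups, columns are [success, failure].
observed = [
    [6, 14],   # novice
    [12, 8],   # intermediate
    [18, 2],   # expert
]
chi2, df = chi_square_independence(observed)
# For df == 2 the chi-square upper-tail probability has the closed form exp(-chi2 / 2);
# other degrees of freedom need a table or statistical software.
p_value = math.exp(-chi2 / 2) if df == 2 else None
print(f"chi2({df}) = {chi2:.2f}, p = {p_value}")
```

A small p-value indicates that the success rate differs across the groups by more than chance alone would explain.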
Summary: what test should we use?
(courtesy of Prof. Hideyuki Takagi, 2015)
DATA VISUALIZATION
PART 3
[This part is self-study material]
Reading material:
– Albert, B. and Tullis, T., 2013. Measuring the user experience: collecting, analyzing, and presenting usability metrics, 2nd Edition, Morgan Kaufmann, Chapter 2.
Presenting your data graphically
[Figure: good example vs. bad example]
Presenting your data graphically
Question: should we present “task” as continuous data points or as discrete data points?
S. Wibirama, P. I. Santosa, P. Widyarani, N. Brilianto, W. Hafidh, “Physical Discomfort and Eye Movements during Arbitrary and Optical Flow-Like Motions in Stereo 3D Contents”, Virtual Reality, Vol. 24, 2020, pp. 39-51.
Legends
You need to tell the story right away
Presenting your data graphically
[Figure: good example vs. bad example]
Presenting your data graphically
[Figure: good example vs. not-so-good example (unless you write the percentage inside the bar)]
Presenting your data graphically
Bar graphs: independent vs. dependent variable
S. Wibirama, P. I. Santosa, P. Widyarani, N. Brilianto, W. Hafidh, “Physical Discomfort and Eye Movements during Arbitrary and Optical Flow-Like Motions in Stereo 3D Contents”, Virtual Reality, Vol. 24, 2020, pp. 39-51.
Figure annotations: statistical significance; independent variable; dependent variable; standard error or standard deviation
Figure 1 | Factors that affect the conservation status of European fishes. b, Box plots of IUCN Red List category against size.
Middle band is the median, boxes indicate the interquartile range (IQR), whiskers min(max(x), Q3 + 1.5 × IQR) and max(min(x), Q1 − 1.5 × IQR), where Q1 and Q3 are the 1st and 3rd quartiles respectively, and dots are outliers from the whiskers.
P. Fernandes, G. Ralph, A. Nieto, et al., “Coherent assessments of Europe’s marine fishes show regional divergence and megafauna loss”, Nat. Ecol. Evol., Vol. 1, 2017, p. 0170.
Figure annotations: maximum, 75%, median, 25%, minimum, outliers
Presenting your data graphically
Scatter plots – data transparency
Q. Li, Y. Zhang, P. Pluchon, et al. “Extracellular matrix scaffolding guides lumen elongation by inducing anisotropic intercellular mechanical tension”, Nat. Cell. Biol.,Vol. 18, 2016, pp. 311–318.
Mean
Transparently show distribution of the data
Conclusion
• Understanding the type of data is important. The specific type of data will dictate what statistics you can (and can’t) do.
• If your data are interval or ratio data, you should check whether the data are normally distributed. If so, you can use a parametric test. If not, you can use a non-parametric test.
• Nominal and ordinal data are generally not normally distributed. You can use a non-parametric test.
• Displaying the confidence interval (or standard error) is important for quickly seeing any difference between means.
• Use the appropriate type of graph when presenting your data. Use bar graphs for categorical data and line graphs for continuous data. Use pie charts or stacked bar graphs when the data sum to 100%.
Assignment – see PDF file