
Basic Statistics for User Experience


Academic year: 2023


[email protected]

Copyright © 2022 Sunu Wibirama | Do not distribute without permission @sunu_wibirama 1

Sunu Wibirama

[email protected]

Department of Electrical and Information Engineering Faculty of Engineering

Universitas Gadjah Mada INDONESIA

Basic Statistics for User Experience

Version: 1 September 2022


Dr. Sunu Wibirama

Positions

❑ 2008–now: Faculty member in the Department of Electrical and Information Engineering, Universitas Gadjah Mada, Indonesia

❑ 2015: Post-doctoral researcher in the Tampere Unit for Computer-Human Interaction, Tampere University, Finland

❑ 2016–2019: Visiting research fellow at Anadolu University, Turkey; Shibaura Institute of Technology, Japan; and Universiti Teknologi Malaysia (UTM), Malaysia

❑ 2018–now: Section Editor-in-Chief, ASEAN Engineering Journal (AEJ), Computer and Information Engineering (ASEAN University Network, JICA, and UTM; indexed by Scopus and ASEAN Citation Index)

❑ 2019–now: Chair, IEEE Systems, Man, and Cybernetics Indonesia Chapter

Education

❑ 2014: Dr.Eng., Science and Technology, Tokai University, Tokyo, Japan

❑ 2010: M.Eng., Electronics Engineering, KMITL, Bangkok, Thailand

❑ 2007: B.Eng., Electrical Engineering, UGM, Yogyakarta, Indonesia

Interests

❑ Human-computer interaction / touchless technology

❑ Eye tracking applications and eye movements analysis

❑ Virtual reality and human factors

❑ Applied artificial intelligence


Outlines

• Part 1: Types of data and descriptive statistics

• Part 2: Inferential statistics

• Part 3: Data visualization [self-study, see reading material]

Reading material:

• Albert, B. and Tullis, T., 2013. Measuring the user experience: collecting, analyzing, and presenting usability metrics, 2nd Edition, Morgan Kaufmann, Chapter 2.


PART 1: TYPES OF DATA AND DESCRIPTIVE STATISTICS


S. Wibirama, P. I. Santosa, P. Widyarani, N. Brilianto, W. Hafidh, “Physical Discomfort and Eye Movements during Arbitrary and Optical Flow-Like Motions in Stereo 3D Contents”, Virtual Reality, Vol. 24, 2020, pp. 39-51.

S. Wibirama, S. Murnani, and N.A. Setiawan, “Spontaneous Gaze Gesture Interaction in the Presence of Noises and Various Types of Eye Movements”, in Symposium on Eye Tracking Research and Applications (ETRA ’20 Short Papers), June 2–5, 2020, Stuttgart, Germany. ACM, New York, NY, USA, 5 pages, 2020.


Introduction

• Statistics is a basic mathematical tool for analyzing UX metrics. Statistics for UX is mostly drawn from statistics for applied psychology.

• If you are good at statistics, you can draw an inference (conclusion) from your experiment with appropriate analysis. This matters because some final projects/capstone projects involve participants in their experiments.

Purpose of this lecture:

• Provide basic information about understanding data

• Give a practical step-by-step guide to analyzing data without a large number of formulas or complicated statistics

• You can use it to present the results of your market research and the validation of your interface designs



Summary: within vs. between-subject design

If you want to compare two or more stimuli, I suggest using the same participants for all stimuli (within-subject).

Special case: if you want to discriminate participants based on gender or age (gender or age as the independent variable), you should use a different group of participants for each stimulus (between-subject).

[Figure: within-subject design (Interface A, Interface B, Interface C) and between-subject design (Interface A, Interface B)]


Raw task time for 12 users and descriptive statistics

In Excel on Mac:

(1) Tools > Excel Add-Ins > Analysis ToolPak (checked)

(2) Data tab > Data Analysis
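The same kind of descriptive summary can also be produced outside Excel. The following is a minimal sketch using only the Python standard library. The 12 task times are hypothetical values chosen for illustration, not the data shown on the slide, and the set of statistics is an assumption about what a typical descriptive summary contains.

```python
import statistics

# Hypothetical task times in seconds for 12 users (illustrative only,
# not the raw data from the slide).
task_times = [34, 33, 28, 44, 46, 21, 22, 53, 22, 29, 39, 50]

summary = {
    "n": len(task_times),
    "mean": statistics.mean(task_times),
    "median": statistics.median(task_times),
    "sample_sd": statistics.stdev(task_times),  # sample standard deviation (n - 1)
    "minimum": min(task_times),
    "maximum": max(task_times),
    "range": max(task_times) - min(task_times),
}

# Print one statistic per line, rounded to two decimals.
for name, value in summary.items():
    print(f"{name:>10}: {value:.2f}")
```

The mean and median describe the center of the data, while the standard deviation and range describe its spread.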


• Confidence level 95%: you are 95% certain that the true population value lies within the designated range

• Alpha level 5%: you are willing to be wrong 5% of the time

Note: [standard deviation / sqrt(sample size)] is the “standard error of the mean” (SE)

SE: how precisely the sample mean estimates the population mean
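As an illustration of the note above, here is a minimal sketch that computes the standard error and a 95% confidence interval for a mean. The function name `confidence_interval` and the task times are hypothetical. The sketch assumes a normal critical value (about 1.96 for 95%); for small samples, a critical value from the t-distribution would give a somewhat wider interval.

```python
import math
import statistics
from statistics import NormalDist


def confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, lower bound, upper bound)."""
    mean = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(sample size).
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    alpha = 1 - confidence
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * se
    return mean, se, mean - margin, mean + margin


# Hypothetical task times in seconds (illustrative only).
task_times = [34, 33, 28, 44, 46, 21, 22, 53, 22, 29, 39, 50]
mean, se, lower, upper = confidence_interval(task_times)
print(f"mean = {mean:.2f}, SE = {se:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

A smaller SE (from less variable data or a larger sample) gives a narrower interval around the mean.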


If the error bars (confidence intervals) do not overlap, you can be reasonably confident that the difference between the mean checkout time on design A and on design B is statistically significant.


PART 2: INFERENTIAL STATISTICS


Inferential statistics

• Inferential statistics allow you to draw a conclusion (make a generalization) about a population based on your samples.

• It is a very powerful tool: most usability testing with interval and ratio data uses inferential statistics.

• The choice of statistical test depends strongly on the design of the experiment (e.g. repeated-measures / paired samples vs. between-subject / independent samples) and on the normality of the data.


Example (null hypothesis): there is no difference in task completion time between interface A and interface B.

Example (alternative hypothesis): there is a difference in task completion time between interface A and interface B.

Note: task completion time is the dependent variable; type of interface is the independent variable with two levels (interface A and B).


Alpha and Beta


Alpha = probability of a Type I error

You believe that there is a “genuine” effect of your interface on task completion time, but in reality there is no effect.

Beta = probability of a Type II error

You believe that there is no “genuine” effect of your interface on task completion time, but in reality there is an effect.


P-value


(Dr. Andy Field, 2018)

Use statistical software such as G*Power to determine the minimum sample size.


Note: use this test for a between-subject design (e.g. novice and expert are two different groups; no participant belongs to both “novice” and “expert” at the same time).
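The slide text does not name the test, so the following is only a sketch under the assumption that an independent-samples t-test is meant. It uses Welch's variant (which does not assume equal variances) rather than the pooled-variance version. The function name `welch_t` and the task times are hypothetical. The sketch computes only the t statistic and degrees of freedom; the p-value would come from a t table or from a statistics package.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Return (t statistic, approximate degrees of freedom) for two independent groups."""
    # Variance of each group mean: sample variance / group size.
    va = statistics.variance(group_a) / len(group_a)
    vb = statistics.variance(group_b) / len(group_b)
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1))
    return t, df


# Hypothetical task times in seconds; each participant belongs to one group only.
novice = [52, 61, 48, 70, 66, 58]
expert = [35, 41, 30, 44, 38, 40]

t, df = welch_t(novice, expert)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A large absolute t value relative to the degrees of freedom suggests that the two group means differ by more than chance alone would explain.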


Note: use this test for within-subject design
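The slide text does not name the test, so the following is only a sketch under the assumption that a paired-samples t-test is meant. The function name `paired_t` and the task times are hypothetical. Each position in the two lists belongs to the same participant, which is what makes the design within-subject.

```python
import math
import statistics


def paired_t(condition_a, condition_b):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    if len(condition_a) != len(condition_b):
        raise ValueError("paired data need one value per participant in each condition")
    # Work on the per-participant differences.
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se, len(diffs) - 1


# Hypothetical task times in seconds; index i is the same participant in both lists.
interface_a = [40, 36, 53, 56, 43, 48]
interface_b = [35, 34, 47, 51, 42, 41]

t, df = paired_t(interface_a, interface_b)
print(f"t = {t:.2f}, df = {df}")
```

Because each participant serves as their own control, differences between participants do not inflate the error term.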


Concept of ANOVA

• Suppose that you have a competition between two basketball teams (team A and team B) to make a successful three-point shot.

• If a player misses a shot, they can repeat until they make a successful one.

• You set two conditions during the experiment:

– Three-point shot with an audience watching both teams

– Three-point shot without an audience (isolated sport center)

• You then measure the duration (in seconds) needed by each player to successfully score a three-point shot


Team A: mean difference between the two conditions 9.53 − 8.79 = 0.74; standard deviations of the two conditions 0.279 and 0.286

Team B: mean difference between the two conditions 9.53 − 8.79 = 0.74; standard deviations of the two conditions 4.75 and 5.101

Note: all data are in seconds


• ANOVA compares within-treatments variability and between-treatments variability.

• In team A, within-treatments variability is low (about 0.28) compared with between-treatments variability (0.74).

• Therefore, the difference in scoring time (in seconds) between “with audience” and “alone” is a real effect, not a chance one.

• Hence, the difference in the duration of scoring a three-point shot between “with audience” and “alone” is the result of the treatment (letting the audience enter the sport center).


• In team B, however, within-treatments variability is large compared with between-treatments variability (the difference between the two means).

• If within-treatments variability is large compared with between-treatments variability, any difference between the means is not convincing.

• Thus, the difference in the duration of scoring a three-point shot between “with audience” and “alone” cannot be attributed to the audience.


• Measuring the mean of each group is not enough to say that letting the audience in has an effect in the three-point shot experiment.

• To determine whether our treatment (letting the audience into the sport center) has an effect, we have to compare the variability within treatments and between treatments.

• If within-treatments variability is low compared with between-treatments variability, then the effect of the treatment is significant.

• One independent variable: One-Way ANOVA

• Two independent variables: Two-Way ANOVA
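The comparison of between-treatments and within-treatments variability can be sketched as the F ratio of a one-way ANOVA. The code below is a minimal illustration, not the calculation behind the slides. The function name `one_way_anova_f` and all scoring times are hypothetical; they are chosen so that both teams have the same pair of means but very different spreads.

```python
import statistics


def one_way_anova_f(groups):
    """Return the F ratio: between-treatments variance / within-treatments variance."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-treatments sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-treatments sum of squares: how far each value is from its own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical scoring times in seconds (with audience, alone).
team_a = [[9.2, 9.5, 9.8, 9.6], [8.5, 8.8, 9.1, 8.8]]    # low spread within each condition
team_b = [[3.0, 9.0, 16.0, 10.1], [2.5, 8.0, 15.0, 9.7]]  # high spread, same means

f_a = one_way_anova_f(team_a)
f_b = one_way_anova_f(team_b)
print(f"Team A: F = {f_a:.2f}")
print(f"Team B: F = {f_b:.2f}")
```

With identical mean differences, the team with low within-treatments variability yields a much larger F ratio, which is the pattern the slides describe for team A versus team B.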


Little/no effect vs. strong effect

High within-treatments variability: little/no effect. No significant difference of means between treatments.

Low within-treatments variability: strong effect. Significant difference of means between treatments.


(no effect from the treatment)

• Between-subject design: the treatment has no different effect between groups

• ANOVA is used to analyze means from 5 groups

H0: µ1 = µ2 = µ3 = µ4 = µ5

H1: at least one of the five groups has a mean that differs from the other groups


Basic assumptions of ANOVA

• Samples in each group are independent. Samples in group A do not interfere with samples in group B, C, and so on.

• Samples are taken randomly; there is no bias in sampling.

• Samples are taken from populations with a normal distribution.

• In practice, you can inspect the histogram or run a normality check (Kolmogorov-Smirnov test) to see whether the data are consistent with a normally distributed population.
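To illustrate the idea behind a Kolmogorov-Smirnov normality check, the sketch below computes only the test statistic D: the largest gap between the empirical distribution of a sample and a normal distribution fitted to that sample. The function name `ks_statistic_normal` and both samples are hypothetical. No p-value is computed here; note that when the mean and standard deviation are estimated from the same sample, the standard Kolmogorov-Smirnov critical values are too lenient, and an adjusted variant (Lilliefors) is normally used.

```python
import statistics
from statistics import NormalDist


def ks_statistic_normal(sample):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    data = sorted(sample)
    n = len(data)
    fitted = NormalDist(statistics.mean(data), statistics.stdev(data))
    gaps = []
    for i, x in enumerate(data, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides.
        gaps.append(max(i / n - cdf, cdf - (i - 1) / n))
    return max(gaps)


# Hypothetical samples (illustrative only).
roughly_symmetric = [28, 30, 31, 32, 33, 34, 35, 36, 38]
strongly_skewed = [1, 1, 1, 2, 2, 3, 4, 50, 120]

d_symmetric = ks_statistic_normal(roughly_symmetric)
d_skewed = ks_statistic_normal(strongly_skewed)
print(f"D (roughly symmetric) = {d_symmetric:.3f}")
print(f"D (strongly skewed)   = {d_skewed:.3f}")
```

A larger D means the sample departs more from the fitted normal curve; the skewed sample produces the larger value.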


R² = 0.28


Note: the nominal and ordinal data are generally not normally distributed and the variances are not equal


In the previous example we were just examining the distribution of success rates across a single variable (experience group).

There are some situations in which you might want to examine more than one variable, such as experience group and design prototype. Performing this type of evaluation works the same way.
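The slide text does not name the test used for success rates, so the following is only a sketch under the assumption that a chi-square test of independence is meant. The function name `chi_square_statistic` and the success/failure counts are hypothetical. The p-value line uses a closed form that holds only when the degrees of freedom equal 2, as they do for this 3 × 2 table.

```python
import math


def chi_square_statistic(table):
    """table: rows of observed counts. Return (chi-square statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and outcome were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts of [successes, failures] for 20 participants per experience group.
observed = [
    [6, 14],   # novice
    [12, 8],   # intermediate
    [18, 2],   # expert
]

chi2, df = chi_square_statistic(observed)
# For df == 2 only, the chi-square upper-tail probability is exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.4f}")
```

The same function accepts larger tables, for example when experience group is crossed with design prototype, but the p-value would then need a chi-square table or a statistics package.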


Summary: what test should we use?

(courtesy of Prof. Hideyuki Takagi, 2015)


PART 3: DATA VISUALIZATION

[This part is self-study material]


Reading material:

– Albert, B. and Tullis, T., 2013. Measuring the user experience: collecting, analyzing, and presenting usability metrics, 2nd Edition, Morgan Kaufmann, Chapter 2.


Presenting your data graphically

[Figure: a good example and a bad example of a graph]

Question: should “task” be presented as a continuous data point or as a discrete data point?


S. Wibirama, P. I. Santosa, P. Widyarani, N. Brilianto, W. Hafidh, “Physical Discomfort and Eye Movements during Arbitrary and Optical Flow-Like Motions in Stereo 3D Contents”, Virtual Reality, Vol. 24, 2020, pp. 39-51.

Legends

You need to tell the story right away


[Figure: a good example and a bad example of a graph]

[Figure: a good example and a not-so-good example of a graph; the latter is acceptable if you write the percentage inside the bar]


Presenting your data graphically

Bar graphs: independent vs. dependent variable

S. Wibirama, P. I. Santosa, P. Widyarani, N. Brilianto, W. Hafidh, “Physical Discomfort and Eye Movements during Arbitrary and Optical Flow-Like Motions in Stereo 3D Contents”, Virtual Reality, Vol. 24, 2020, pp. 39-51.

[Figure annotations: statistical significance, independent variable, dependent variable, standard error or standard deviation]


Figure 1 | Factors that affect the conservation status of European fishes. b, Box plots of IUCN Red List category against size.

Middle band is the median, boxes indicate the interquartile range (IQR), whiskers min(max(x), Q3 + 1.5 × IQR) and max(min(x), Q1 − 1.5 × IQR), where Q1 and Q3 are the 1st and 3rd quartiles respectively, and dots are outliers from the whiskers.

[Figure annotations: maximum, 75%, median, 25%, minimum, outliers]

P. Fernandes, G. Ralph, A. Nieto, et al., “Coherent assessments of Europe’s marine fishes show regional divergence and megafauna loss”, Nat. Ecol. Evol., Vol. 1, 2017, p. 0170.
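The box plot quantities described in the caption above (median, quartiles, IQR, whiskers, outliers) can be computed directly. The sketch below follows the whisker formula as written in the caption. The function name `box_plot_summary` and the data are hypothetical, and the choice of the "inclusive" quartile method is an assumption, since different tools compute quartiles slightly differently.

```python
import statistics


def box_plot_summary(values):
    """Return the quantities a box plot displays, using the caption's whisker formula."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers are capped at 1.5 x IQR beyond the quartiles.
    upper_whisker = min(max(values), q3 + 1.5 * iqr)
    lower_whisker = max(min(values), q1 - 1.5 * iqr)
    # Values beyond the whiskers are drawn as individual dots.
    outliers = [v for v in values if v < lower_whisker or v > upper_whisker]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outliers": outliers,
    }


# Hypothetical measurements with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
box = box_plot_summary(data)
for name, value in box.items():
    print(f"{name:>14}: {value}")
```

Some plotting tools instead draw each whisker to the most extreme data point that still lies within the 1.5 × IQR limit, so whisker positions can differ slightly between tools.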


Presenting your data graphically

Scatter plots – data transparency

Q. Li, Y. Zhang, P. Pluchon, et al., “Extracellular matrix scaffolding guides lumen elongation by inducing anisotropic intercellular mechanical tension”, Nat. Cell Biol., Vol. 18, 2016, pp. 311–318.

[Figure annotations: mean; the plot transparently shows the distribution of the data]


Conclusion

• Understanding the type of data is important. The specific type of data dictates what statistics you can (and can’t) do.

• If your data are interval or ratio data, you should check whether the data are normally distributed. If so, you can use a parametric test. If not, you can use a non-parametric test.

• Nominal and ordinal data are generally not normally distributed. You can use a non-parametric test.

• Displaying the confidence interval (or standard error) is important for quickly seeing any difference between means.

• Use the appropriate type of graph when presenting your data. Use bar graphs for categorical data and line graphs for continuous data. Use pie charts or stacked bar graphs when the data sum to 100%.


Assignment – see PDF file
