ART Iwan Setyawan Perceptual quality Full text

(1)

PERCF.PTV4L QH4L TTY OF OEOMETRJC4.H Y I>TSTORTEJ> IMAGES PART II: HUMAN PERCEPTION OF THE QUAUrY OF (,'EOMETRil ALLY DTSTORTEl> IMA(IES /wan Setyawan

PERCEPTUAL QUALITY OF GEOMETRICALLY DI.\'TORTEI)

EYAGES

PART IT:

HUMAN PERCEPTION OF THE QUALITY OF GEOMETRICALLY

DISTORTEIJ

TMA

GES

lwan Setyawan

(iwan.setyawan(Q)ieee.org)

Abstract

120JU4l''

This paper presents a ｳｴｵ､ｾ＠ of how human perceire the ｱｵ｡ｬｩｴｾ＠ of geometrically

distot1ed images by presenting the design and analysis of a sulJ.jectiYe-test experiment

The results of this experiment ts then used to eYaluate the performance HPQM

(Homogeneity-based Perceptual Quality Measurement) method presented in the first

paper of this series.

Keyworfls:Geometric distortion. human Yisual system. perceptual ｱｵ｡ｬｩｴＮｾ＠ measurenwnt of images. paired-comparison experiment

1.

Introduction

Research on human perception of image quality has been widely performed.

Aspects of images considered in such research are. for example. color. granularity or

sharpness. Another example is to test specific artifacts of a compression algorithm

(e.g .. the blocking artifact of JPEG compression) or watermarking system (e.g .. the

random noise artifact of noise-based watermark-ing systems). Some examples of the

image quality assessment for these disto11ions can be found in [I]. As a result "e

already haYe a good understanding of how these aspects influence human perception

of quality and we are able to quantify these perceptual aspects in cases "·here the

distot1ion is near the Yisibility threshold. We can use the result. tor example. to build a

system to objectiYely measure image quality based on these aspects \Yhich

corresponds quite well to subJectiYe ｱｵ｡ｬｩｴｾ＠ perception. We can also use the result of

this research to improYe the performance of mrious applications dealing "·ith images

(2)

Techne Jurnal llnuah Elektrotekmka Vol. <J No. i April 20 lO Hal 13 .l'J

mentioned above, namely the compression algoritluns and watermarking systems .. are

t\Yo examples of applications that can take adYantage of this knowledge. HoweYer. the

research on human perception of image quality has not dealt with another type of

distortion that an image cru1 undergo, namely geometric distortion (i e .. distortions due

to geometric operations). As a result we are currently unable to ｱｵｲｵｬｴｩｾ＠ the

nerceptual impact of geometric distmtions on images.

This paper presents a study of the impact of geometric distoi1Ions on human

perception of the quality of the affected images. The aim of tllis study is to proYide

both a better understru1ding of human perception of geometric distm1ion and a

reference pomt with "hich to emluate the performance of our noYel OQjectiYe

geometric distortion measure scheme. the HPQM (Homogeneity·based Perceptual

Quality Measurement) method. described in the preYious paper of this series (see also

[2]).

In order to perform this study we propose a user test system that is speciftcally

designed to obserye the impact of geometric distortion on humru1 perception of image

quality. The results we obtain from tllis test can also be useful to other researchers

performing similar research in the fields of watermarking. image processing ru1d

human dsual systems. Therefore. we haYe also made our test set and test results

aYailable for download on our website [3].

The rest of this paper is organized as follows. In Section 2. we present the

design of our user test experiment and statistical analysis methods used to process the

test results. In Section 3. we present the actual setup of our user test. In Section 4. we

present and analyze the result obtained from this user test. In Section 5. we will

briefly reYiew our objectiYe geometric distortion measure algorithm. present scores

obtained using this method and e\ aluate its performance based on the ｳｵｾｪ･｣ｴｩＧ＠ e test

result and compare its performance witl1 other possible objectiYe perceptual quality

measurement systems. Finally. in Section 6, we present our conclusions and proYide

an outlook for fui1her research.

2. Test design & analysis method

In this section we shall discuss in more detail the test design and the analysis

tools we use to ｡ｮ｡ｬｾ＠ ze the test results. The test design and ｡ｮ｡ｬｾ＠ sis tools "e use are

(3)

f'EIU 'EPH '.4 '-ｑｬｾＴｬＮｉｔｙ＠ OF UEOMETRJC4l.LY DISTORTED IMAGES PART II: HUMAN PERCEPTION OF THE QUAL/1T OF UEOMETRICAU Y InSTORrED IMA(i£S /wan .'\erymran to determine consumer preference to certain products or product Yariants (e.g ..

different flmors offood)

f4l.

Ho\\eYer. their usage in e,·aluating perceptual impact of

geometric distortions in images. to the best of the author's knowledge. is nm·el and

has neYer been discussed in the literature.

In order to e\aluate the perceptual impact of geometnc distortiOn. we

performed a subjectne test inYOlYing a panel ofusers. who are asked to e\aluate a test

set comprised of an onginal image and Yarious distorted Yersions of it. The test

subjects eYaluate one pair of images at a time. comparing 2 images and choosing the

one they flunk is more distorted This t\pe of experiment is called the prured

comparison test. There are two experiment designs for a paired comparison test

namely the balanced and incomplere designs [4. 5}. In a balanced design. a test

subject has to eYaluate all possible comparison pairs taken from the test set In the

incomplete design. a test subject only has to perform comparisons of part of the

complete test set. The latter design is useful "hen the number of ob_jects in the test set

is Yery large. ln our experiment we used the balanced ｰ｡ｩｲ･､ｾ｣ｯｭｰ｡ｲｩｳｯｮ＠ destgn. Our

choice for this design is based on three factors. Firstly. the number of ob_jects in our

test set is not Yery lru·ge and a test subject can finish the test within a reasonable time

frame (as a rule of thumb. \Ye consider a test lasting 60 minutes or less to be

reasonable). Secondly. by asking eYery test subject to e\·aluate all objects in the test

set \Ye

'"ill

be able to get a more complete picture of the perceptual quality of the

images in the test set. ｆｩｮ｡ｬｬｾﾷＭ in this design we make sure that each test ｳｵｾｪ･｣ｴ＠

eYaluates an identical test set. Thts makes it easier to eYaluate and compare the

performance of each test subject.

Let 1 be the number of objects in the test set One test ｳｵｾｪ･｣ｴ＠ petforming all

possible comparisons of 2 ｯｾｪ･｣ｴｳ＠ A,. and A1 from the test set. eYaluating each pair

once. will make 1 ₎ _{paired comparisons in total. The result of the comparisons is}

usually presented in a 1 x

r

matrix. If ties are not allowed (i.e . a test ｳｵｾｪ･｣ｴ＠ must cast

his/her Yote for one ｯｾｪ･｣ｴ＠ of the pair). the matrix is also called a rwu-wuy pre.ferem .. ..:

mamx \Yith entries containing l s if the object \\US chosen and (J's othen,ise. An

(4)

Tcchne Jurnal llmiah Elcktroteknib Vol SJ No I April 20 I 0 Hal ｉｾ＠ - YJ

interpreted as objecl A, is prejerred 10 ubji.:cl A,. The indices i and .1 refer to the ro"s

and columns of the matrix. respectively.

A, A:- ａｾ＠ａｾ＠

A] / l ₁ ₀

A? 0 I 1

A3 , 11 ll II

ａｾ＠ 1 0 1 X

Ftgure 1. An c.twnplc o(a prc{ercncc mmrix

Let u, be the number of votes object A, receh ed during the test. In other 'vords.

o,

=fA,, .

We call a, the score of object A,. It JS easy to see that the total score for all J=l

1'11-.J

objects is

I I

' ( / =-f(l _£.., I ., -I)

'""l

-and that the a,·erage score among all ｯｾｪ･｣ｴｳ＠ is

I

'"

£.., I I

ｵ］ｾ］ＭｕＭｉｊ＠

f 2

(l)

(2)

We can extend these results to the case "·here 've have n test subjects

performing the paired comparison test In this case. the test result can also be

presented in a preference matrix similar to the one presented in Figure 1. Ho" eYer.

each entry A,,, of this matnx nmY contains the number of test subjects "ho prefer

object A, to object A,. If again \Ve do not allmv ties. the values of A,_," ill be integers

ranging from 0 ton. We also note that in tlus case A1.1 = n-A,_,. Finally. in tllis case.

the total an9 average scores are expressed as .!_nt(f -1) and ..!.n(f -1). respectiYeh.

J 2 2

(5)

PERCEPTUAL f_JUALITY OF UEOMETRIC4/.LY DISTORTEI> IMAGES PART II: HUMAN

PERCEPTION OF THE QUALITY OF &EOMETR/( ALLY IJI.\'TORTEIJ IMA&E.\ !wan .'>'etyawan

2.2. Statistical analysis of the experiment

After performing paired comparison tests. "e obtain a preference matnx for each test set. No" "e haYe to perform an analysis of this test result. We ha' e t" o main objectiYes for this analysis. In the first place. we \Yant to obtain the OYerall ranking of 1 he test objects. The second objectin is to see the relati\ e qualih differences bet\\een the test ｯｾＱ･｣ｴｳＮ＠ that Is. \\hether obJect A, Is perceiYed to be either similar to or Yery different in quality from ｯｾｪ･｣ｴ＠ A,. The analyses "e perform on the data to achieYe these objecti,·es are the coejl/cient of consisten<.y the coejlicent of

agreement and the significance test on score (Nierences. Each of these analyses IS

discussed in the folio" ing sections.

2.2.1. Coefficient of consistency

A test subject is consistent when he/she. in eyaJuating three objects A,.. A,. and

ａｾ＠ from the test set. does not make a choice such that ａＬＮｾ＠ａＬＮｾ＠ａｾ＠ but ａｾ＠ｾａｸＮ＠ The arrows can be interpreted as "'preferred to'·. Such a condition is called a circular triad.

While circles inYo!Ying more than three ｯｾｪ･｣ｴｳ＠ are also possible. any such circles can easily be broken up into hYo or more circular triads. The preference matrix presented in Figure 6.1 has one such triad. namely A1 ｾａ｟ＮＬｾ＠ A-1 but A-1 ｾ＠ A1.

For smaller Yalues of t. one can easily enumerate the circular triads encountered. For larger t. tllis task becomes Yery tedious. Ho,Yever. ''e can compute the number of circular triads. c. from the scores a, using the follo\\ing relation f4.6]:

I . T

c = ( r l )

-24 2 (3)

''here

I

T =

L(a, Ｍ｡Ｉｾ＠

(4)

d

(6)

Techne hunallhniah Elcktrotcknika Vol.() No. I Aprii20LO Hal 13 ... ,lJ

possible number of circular triads. The coetflcient of consistency 1, IS defined as

follows:

24c .

:: =

1 - , . 1ft odd

- t(r-1)

(5a)

24c

C"" 1---ift eyen (5b)

There are no mconsistenc1es 1C and only iC ｾ＠ = l. This number "ill mo' e to zero as

the number of circular triads. thus the inconsistencies. increases.

The coefficient of consistency can be used in the following \Yays In the first

place. we can use this coefficient to JUdge the quality of the test subject. ｓ･｣ｯｮ､ＡｾＮ＠ "e

can use tllis coefficient as an indication of the similarity of the test objects. If. on

aYerage. the test suhfecrs are inconsistent (either for the "·hole data set or a subset

thereof} ''e can conclude that the test ohJecrs being e' aluated are Yery similar and

thus it is difficult to make a consistent judgement. OthenYise. if one particular test ｳｵｾｦ･｣ｲ＠ is inconsistent ''"hile the other test subjects are - on aYerage - consistent ''e

may conclude that this particular subject is not performing ''ell. If the consistency of

this subject is significantly lmYer than ayerage. ''e may consider remoYing the result

obtained by this subject from further analysis.

2.2.2. Coefficient of agreement

The coefficient of agreement shmYS us the di,ersity of preferences among n test

ｳｵｾｪ･｣ｴｳ＠ Complete agreement 1s reached when all n test subjects make identical choices

during the test. From Section 2. 1. we see that if eYery subject had made the same choice

during the test (in other \\Ords. if there has been complete agreement). then half of the entries

in the preference matrix \Yill be equal ton. while the other half would be zero. AltematiYely.

in the worst case situation. all entries will be equal to n/2 (if 11 is e\en) or (n ± l )/2 if n is odd

J

It is ob\ious that the nlinimum number of test subjects. n. that "e need in

order to be able to measure agreement is 2. Each time 2 test subjects make the same

(7)

PERCEPTUAL QUALITY OF 6'EOMETRIC4U Y J)JSTORJ'£/) IMAGES PART II: HUMAN PERCEP110N OF THE QUALITY OF GEOMETRIC4.LL Y J)J.\'TORTEJ) IMAGES !wan ,\'etyawan of pairs of test subjects that make the same decision about each pair of test objects.

We do tllis by computing r. defined as

(6)

/ ,..j '

In Equation ( 6

).l· ;'

J

giYes us the number of pairs of test subJects making the

same choice regarding objects A, and A. Thus r giYes us the total number of

agreements among n test subjects eYaluating t objects. ObYiously. \\henA,_₁

=

l \\e do not haYe any agreement among the subjects and the contribution of this particular A,., to r would be zero. If A,,₁

=

u.

it means that a11 test subjects agree nor to choose A, oYer

A,. Although the contribution of this A,,₁to r is also zero. the number of agreements

regarding this pair of test objects will be reflected by the value of ａ _ＱＬｾＮ＠

We haYe (

ｾＩ＠

paJCS of compansons and (

ｾＩ＠

possible paJCS of subjects.

therefore the ma'-.imum number of agreements between the subjects is given by

max( r)

=C)(;)

(7)

Meamvhile. the nlinimum value of r is gi,·en by

( r

y

Ln

n

J

1

min(r)=i.2Jl 2 J (8)

We can also express r in a more computationally conYenient \vay. as follo,vs.

r

=+[I

｡Ｎ］ＬＭｾｾＨＺ＠

1]

- 1=/ - )

(9)

(8)

Teclme .Turnal Umiah Elcktroteknika Vol. l) No. April 2010 Hal l l -19

II= 2r (10)

max( r)

The 'alue of 11

=

1 if and only if there is complete agreement among the test

subjects. and it decreases when there is less agreement among the test subjects. The

iillllillllllli 'dlllt ＭＭｾｾ＠ '

,-a1ue of 11 IS -l '' h1ch can ｯｮｬｾ＠ be aclue\ ed ''hen 11 '"' This \ alue of 11 shtms the

strongest form of disagreement bet\Yeen the test subjects. ｮ｡ｭ･ｬｾ＠ that the test subjects

completely contradict each other.

We can perform a hypothesis test to test the significance of the value 11. The

null hypothesis IS that all test ｳｵｾｪ･｣ｴｳ＠ cast their preference completely at random The

alternative hypothesis is that the value of 11 is greater than what one "·otlld expect if

the choices would lul\·e been made completely at random To test the significance of u

we use the follO\ving statistic.

as

proposed in

Hl

• -t [

l(')(n)(ll

3)

x-

=

n-2

r-2

2

ＲＩＨｮｾＲＩ＠

\\hich has

x

2 distribution with ( 1) n(n

ｉｾ＠

degrees of freedom

ｾＲ＠ (11 2)"

(11)

As n increases. the expression in Equation ( 11) reduces to a simpler form [7]

x=

I :

lP

₊ll ( 11 I)

J

(12)

' ｾｉ＠

\Yith (

ｾ＠

J

degrees of freedom.

It is important to note that consistency and agreement are two different

concepts fTherefore. a high 11 value does not necessarily imply the absence of

inconsistencies and nee wrsa.

The coefficient of agreement also shows \\hether the test objects. on a\erage.

(9)

I

r

PERCEPTUAl. QUAI.ITY OF lrEOMETRJCALLY DISTORTED IMAGE.\' PART 11: HUMAN PERCEPTION OF l'HE QUALITY OF GEOMETRICALLY DISTORTED IMAGE.\ !wan .)'etyawan agreement is very low we can expect that the score of each test ob.iect

''ill

be ,·ery

close to the avemge scores of all test objects. i.e .. there is no significant difference

among the scores. As a consequence. assigning ranks to the ob_jects or drmving the

conclusion that one object is better (or "·orse) than the others is pointless since the

obsened score differences (if any) cannot be used to suppot1 the conclusion. On the

,,ther hand <.:lrnnQ Lll'reement ｡ｭｯｮｾ＠ the test subJects indtcates thnt there e'\ist significant differences among the scores

2.2.3. Significance test of the

score

difference

A significance test of the score difference is performed in order to see whether the perceptual quality of any 2 objects from the test set is perceived as different In

other words. the perceptual quality of object A1 is declared to be different fi·om the

quality of object A,. only if a, is significantly different from C/_1•Othenvise. we have to

conclude that the test ｳｵｾｪ･｣ｴｳ＠ consider the perceptual quality of the 2 ｯｾｪ･｣ｴｳ＠ to be

similar.

Tllis problem is equh alent to the problem of diYiding the set of scores "e

obtain. i.e. S' = (a J. a:;. . .. ar}. into sub-groups such that the mriance-nonnalized

range (the difference of the largest and lowest Yalues) of the scores witllin each group.

R (13)

is lower or equal to a certain Yalue

r

Rcl

(in other words. tile difference of all) 2 scores

within the group must be lo\\ er or equal to

r

K.l>.

\\llich depends on the Yalue of the significance leYel a. In other \Yords. ne want to find

K·

such that the probabilit) P[R 2:: K] is lo\Yer or equal to the chosen sigtlificallce le\'el

a.

We declare the ｯｾｪ･｣ｴｳ＠ ''ithin each group to be not significalltly different. while those from different groups are

declared to be significantly different. By ad.iusting the Yalue of a. we call ad.iust the

size of the groups. Th.is in turn controls the probability of false positiYes (declaring 2

objects to be significantly different when they are not) and false negatiYes. The larger

the groups. the higher the probability of false negatiYes On the otl1er hand. the

(10)

Tcchne Jurnalllmiah Elektrolekmka Vol tJ No I Apnl 20!0 Hal 11-:;lJ

The distribution of the rangeR is asymptotically the same as the distribution of Yariance-normaJized range. W,, of a set of normal random Yariables with yariance

=

1 and t samples

l4l

Therefore. "e can use the folio" ing relation to approXimate f'LR :::_

Rl

2R l

' 2

Pl

ｵｾ＠

. ·, . ---

J

J;i

( 14)

In Equation ( 14 ). W,., is the yaJue of the upper percentage point of W, at significance point u. The 'alues of Wr.u are tabulated in statistics books for example the one prO\ ided in [g

J.

The significance test for the differences bet\\ een scores proceeds as folio\\ s.

1. Choose the desired significance leYel u...

2. Compute the critical Yalue

R

using the following relation

1

1 r:

tl

R.= _'

-W,,-vnt+-l

₂ _. ₄ (15)

3 Am difference bet\Yeen 2 scores that is 10\Yer than R is declared to be insignificant. OthenYise. the score difference is declared significant.

3. Test procedure

User test mechanism to measure the impact of geometric distortion on the human perception of image quality is not ''idely discussed in the literature. Therefore. "e ha' e proposed a ne" user test system that is specifica11y designed for tllis purpose. In this section\\ e shall describe in more detail the design of a suitable test set and user interface for such user test system.

3.1. Test set

J

We

used the same test sets as the ones "e used in the preYious paper in tllis series. The t" o images used as a basis to build our test set. i e. the Bird and Kremlin images (shom1 in Figure 2). are g-bit grayscale images with 512 ' 512 pixels resolution. The images are chosen prunarily due to theu content. The first image.

(11)

PERl'FPJ'C:4L ｑｬｾＴｌＯｔｙ＠ OF UEOMETR/('ALLY J)JSTORTED IMAOES PART 11: HUMAN PERCEPTION OF THE QUALITY OF GEOMETRICALLY D/.\'TORTE/) IMA(JES

bmn .\'e ryawan

Bird. does not haYe many stmctures such as straight lines. Furthermore. not eYery test

subject is Yery familiar ''"ith the shape of a bird (in particular the species of bird

depicted in the image). So in this case. a subject should haYe little (if any) .. mental

picture·· of \\hat things should look like. On the other hand. the Kremlin picture has a

lot of structures and eYen though a test subject may not be familiar "ith the Kremlin.

h"' dw dwnld h<n e some prior kno" led11e nf \\hat buildings should look like

(a) (h)

Figure 2. The 2 hasis images. (a) Bird and (h) Kremlin

We used 17 different Yersions of the images. Each Yersion is geometrically disto11ed in a different way. Thus in our test ''"e use t

=

17. The geometric distortions we used in the experiment are also identical to the ones used in the preYious paper and

their descriptions are repeated here in Table l for conyenience. As in the preYious

paper. ''e use the notation A,. with i

=

I. 2 ... I 7 to ｩ､･ｮｴｩｾﾷ＠ each image.

The distortions chosen for the test set range from distortions that are

perceptually not disturbing to distortions that are easily Yisible. The global bending

distortions (A(,. A-. As. A<Jl are chosen because these kinds of distortions are. up to a certain extent. 'isually not ｙ･ｲｾ＠ disturbing in natural images. Howe' er. this disto11ion

seYerely affects the PSNR Yalue of the distorted images. The sinusoid (stretch-shrink)

distortions (A](;. An AD A13l distort the image by locally stretching and shrinking the image. Depending on the image content. this kind of disto11ion may not be

perceptually disturbing. The rest of the distortions distorts the image by shifting the

pixels to the left/right or upwards/downwards. TI1ese distortions are easily Yisible.

eYen ''hen the seYerity is lo". The distortions (A.:. A3. A.;. A5) ｡ｰｰｬｾ＠ the same

[image:11.522.72.456.202.406.2]

(12)

Techue Junwlllmiah Elcktrotcknib Vol 1₎_{No I April2010 Hal13}_--39

Au" Arl is n1ried within the image. Some examples of the geometric distortions used

in the experiment are shown in Figure 3.

Tahle 1. Geometnc distortwns used m the expemnenr

--- ＭＭＭＭｾＭＭＭ

Image Description

A,

No distortion {original image)

-- "' ｾﾷﾷＭＭ

I A, t Sinusoid. am p litude factor

=

o .2. 5 p eriods

r

.

Ｍﾷﾷｾ＠

A,

Smus01d. amplitude factor= (I 2. 10 penods

I

--ａｾ＠ Sinusoid, amplitude factor= 0.5. 5 periods r-·--- ... -- --- ｾＭｾＭＭＭ

A_; Sinusoid, amplitude factor= 0.5. 10 periods

I

An Global bending. bending factor= O.R

ＭＭｾ＠

...

A

Global bending. bending factor=- 0.8

As Global bending. bending factor= 3

i

A,) Global bending. bending factor= -3

AJO

Sinusoid (stretch-shrink). scaling factor L 0. 5 period

- - -

-Ail Sinusoid (stretch-shrink). scaling factor L I period

---A;.: Sinusoid (stretch-shrink). scaling factor 3. 0.5 period

A13 Sinusoid (stretch-shrink). scaling factor 3. l period

Au Sinusoid (increasing freq). amplitude factor = 0.2. stru1ing period =L

ti·eq increase factor = 4

A15 Sinusoid (increasing freq). runplitude factor 0.2. stru·ting period

=L

freq increase factor = 9

ｾＭＭﾷ＠ ·

-ｬＧｳｩｮｾ［ｳｯｾ､＠ (increasing amplitude). start umplih1de factor 0.1. 5

periods. amplitude increase factor 4

AJ Sinusoid (increasing runplitude). start amplitude factor 0. 5

periods. amplitude increase factor 9

... L ... - - ---

(13)

PERCEPTE4L Ol!AI.ITY OF OEOMETRICAU Y I>JSTORTEI> /MAGE.4i PART II: HUMAN PER(::ePTWN OF THE QUALITY OF liEOMETRICALL Y J>ISTOR1'El> IMAG'E.\'

/wan .')eTJm1·an

the left-right ordering of the images re,·ersed. Thus we have 306 pairs of images for each of the hYO images for a totaJ of 6 12 pairs of images in the test set

(a}

___, _____ _

i

(--l __

(h)

Figure 3. Examples rd' rhe geometric distortions:

(c)

(a) Disrortion A;;. (b) Distortion A13 and (c) Distortion A16

3.2. Test subjects

The user test experiment tnYohed 16 subJects. consisting of 12 male (IL ON.

PD. AH. ES. DS. IS. JO. JK. JJ. KK and RH) and 4 female (KC. CL. CE and ID)

subjects. The subjects haYe ditTerent backgrounds and leYels of familiarity with the

field of digital image processing. As discussed in Section 3. L each user will examine

each pair of test images twice in one test session. Furthermore. ｳｵｾｪ･｣ｴｳ＠ IL. OS and lS

each perform 3 test sessions. Therefore. in the tables found in Section 4. a number "·ill be added to the subject names to show different test sessions ( eg .. IL 1 shows the

result of subject IL from the 1st test. etc.). These repetitions are done to see the

difference bet\veen test results for one person when the test is repeated. We assume

that each repetition of the test (both within a single test session and bet\veen test

sessions) is independent. Therefore. "e ha,·e the total number of test repetitions n =

44.

3.3. Test Jlrocedure

The test is performed on a PC with a 19-mch flatscreen CRT monitor. The resolulion is ｾ･ｴ＠ at ll 52 R64 pixels The 'ertical refresh rate of the monitor is set at 75 Hz. T0 perfbrm the test. a graphical user interface is used. This user interface is

[image:13.524.68.462.123.599.2]

(14)

Techne Jumal Ilmiah Flektroteknika Vol. 9 No. I April :WIO Hal 13-V>

[image:14.518.74.451.75.509.2]

Ｐｾ＠ . . . .

Figure -1. The user infeliace used in the e1periment

4. Test results and analysis

4.1. User preference matrix

After performing the user test. "e obtain the preference matrices for the Bird

and Kremlin linages. In Tables 2(a) and 2(b), we shmY the preference matrices

obtained for the Bird and Kremlin test images. These preference matrices are

a\·ailable for dmmloading from our website [3]. The images codes refer to Table l.

The column

a/

shows the sum of each ro\\. i.e .. the score of each image AI- Since in our experiment the test subject is asked to choose the image \Yith the most distortion. a

(15)

A1

Az

PERCEPTVAL QUAUTY OF CiEOMEfRIC4l.LY lJISTORTED IMAliES PART /l: HUMAN PERCEPTION OF THE QUAliTY OF OEOMETRJCAH Y mSTORTEJJ IMAUES

/won ｓ｣ｾｶｭｮｭ＠

Tahle 2(a). Preference matrtxti>r the Bird image

AJ A1

ｾｾ＠

A_; A6 A,. As A9 A1o An AI1 Au Au AH An) A17

/ 7 . 0 0 11 21 19 N 8 24 2 10 9 1 0 0

37 :-:· ₄ ₀ ₀ _{33 28 29 24} ₃₀ ₃₆ _lO _{I l} ₁₂ ₅ _I _l

(lj

123

i

261

ｾＮＱＨＩ＠

ｾ＠

! -C

_.3

j<) )il ' ₄₁ 4::! ""'t; _tR T7 q

'

i) ＬｾＮＮＮＬＮＺ［＠

ｾＧ＠

A-I

A·

:>

Ao

A,.

As

A9

A1o

An

Au

AIJ

Al6

AJ7

44 44 41 3 44 44 42 43 43 43 37 3'J 43 31 15 l 557

44 44 43 41 y 44 ₄₄ 44 43 43 44 42 43 ₄₄ 43 42 24 672 33 II 1 0 () ₂₅ 15 ₁₇ ₁₅ ₃₃ ₄ _{I l} ₈ ₁ 0 0 175 23 16 I 0 0 19 15 13 21 28 3 11 5 2

o

I

o

157 25 IS 5 2 0 29 29 12 17 27 6 10 9 I I

Ｐｾ＠

36 20 5 I 1 27 31 2 X 30 40 8 15 15 2 ₂ ()

36 14 3 1 l 29 23 27 14 X 34 6 9 9 ()

20 8 2 1 0 ll I 16 17 4 10 4 5 6 I 0 42 34 19 7 2 40 41 38 36 38 40 ;.: ₂₀ ₃₁ ₉ ₅

34 33 26 5 1 33 33 34 29 35 39 24 X 25 17 5

35 32 7 1 0 36 39 35 29 35 38 13 19 X 6 1

43 39

r

) 13 l 43 42 43 42 44 43 35 27 38 7 44 43 39 29 2 44 44 43 42 44 44 39 3') 43 37 X

44 43 44 43 20 44 44 44 44 44 44 43 44 44 42 43

4.2. Statistical analysis of the preference matrix

4.2.1. Coefficient of consistency

ＨｾＩ＠

We measured the coefficient of consistency for indiddual test sub.iects using

Equation (5a) since we haYe t = 17. Since each test subject performs the user test twice per session. we use the average Yalue of

C

as an indication of each subjecr s

consistency The aYerage coefficient of consistencv is presented in Table 3.

0 206 0 105

l 403

() ₃₇₃

0 326

2 497

(16)

rt:chne Junwl Jhniah Elektroteknib VoL ') \lo I April ｾＰ＠ I 0 Hnl ｉｾ＠ - Yl

Tahle 3. Coefficient l?f"consistem.y ((.1

r : :

-; Subject Bud Kremlin Subject Bird · Kremlin

ILl 0.83 0.93 DSl 0.67 0.87

IL2 0.83 0.91 DS2 0.73 0.92

IL3 0.85 0.95 DS3 082 093

i KC

tl.851

ｦｕｾＶ＠ IS J ' 0.921 0. 95

-!

' ON 0.94 0.98 IS2 0.94 0.93

'

i PD 0.70 IU{7 IS3 0.94 O.Y7

i-... ,_. ____ i -· .... ｾＭＭＭＭ ··· ···- ＭＭＭＭＭｾＭＭＭｾＭＭﾷＭＭｾ＠

AH 0. 87 0.96 JO O.Y3 0.97

I CL 0.82 0.90 JK 0.90

ＰＹｾｾ＠

I

CE 0.83 0.94 JJ 0.85 0 88'

l

ES 0.89 0.94 KK 0.70 0.79

I

i

I

ID 0.66 0. 90 RH 0.90 0.95 _____j

From Table 3 we can conclude that in general the test subjects are consistent

in their decisions. We can also see that in general the yalues ｯｦｾ＠ for the Bird image

are lower than those of the Kremlin image. This is due to the fact that the Kremlin

image contains more stmcture compared to the Bird image. which helps the test

subjects to make consistent decisions. Furthermore. the wlfamiliarity of the test

subjects with the particular species of bird depicted in the image also makes it

difficult to make consistent decisions.

4.2.2. Coefficient of agreement (u)

We measured 1\Yo types of coefficient of agreement from the preference

matrix. The first is the overall coefficient of agreement that measures the agreement

among all test subjects in the experiment. The second is the indh·idual coefficient of

agreement that measures the agreement of a test subject with him-/herself during the

<I

t\vo repetirlons in a test session. A low u value in this case \YOuld indicate that the

subject is confused and does not haYe a clear preference for the images being sho\\11.

For the m erall coefficient of agreement we haYe n = 44 and t = 1 7. For these values. the maximum and minimum Yalues of 11 are 1 and -IU>227. respectiYely. From

(17)

PERCEPTUAL QUALITY OF GEOMETR!C4.U Y [)[STORTEIJ IMAGES PART 11: HUMAN PERCEPTION OF THE QUALITY OF (rEOMETRICAJI Y J)JSTORTEI> IMA(iES

!wan S'etymmn

u1, " 0.574 and llkr.:mim = 0.731. Petforming the significance test on both 11 Yalues

using the method described in Section 2.2.2 sho\YS that in both cases the Yalue of 11 is significant at a OJI5. Therefore. we can conclude that in both cases there are strong agreements among the test subjects. HoweYer. we can also see that the agreement in the case of the Bird image is much weaker than the Kremlin image. due to the image

Tahlc ../. lndividuut Coefficient ofAgreements(u)

- - - - --ｾＭＭＭＭＢｾＭＭＭＭＭ

ﾷ｡ｩｾ､ｬ＠ｋｲ･ｭｬｩｬｾ＠

Subject Bird Kremlin SubJect

ILl 0.559 0.750 DSl 0.265 o.o47 IL2 0.574 0.721 DS2 0.471 0.794

c--- ＭＭＭｾＭ -·-

----DS3 ---c--- - . ----ＭＭＭＭＭｾＭＭＮ＠

IL3 0.677 0.779 0.55l) I () 750 KC 0.662 0.691 ISl 0.721 0.882 ON 0.779 0.868 IS2 0.809 0.721 PD 0.485 0.677 IS3 0. 735 O.X38 AH 0.559 I 0.794 JO 0.721 0.853 CL 0.456 0.750 JK 0.824 0.735 CE 0.618 0.691 JJ 0.529 0.691

ＭＭｾ＠ｾＭ

---ES '0.765 0.765

KK

0.368 () 515 ID 0.279 0.691 RH 0.691 0.765

I

[image:17.523.74.452.209.504.2]

For the indiYidual coefficient of agreement \\'e haYe n = 2 and t::; 17. In tllis case. \Ye ha' e -1 :::; 11 :::; 1. The indi' idual coefficient of agreements are presented in

Table 4. As expected. we see that all subjects haYe larger 11 Yalues for the Kremlin

image. The exceptions to this are subJect ES. who has the same u Yalues for both images. and ｳｵｾｪ･｣ｴｳ＠ JS2 and JK \Yho haYe larger 11 for the Bird image. After

performing the significance test on the Yalues of u. we can conclude that all subjects haYe u Yalues that are significant at a= 0.05 for both the Bird and Kremlin linages.

4.2.3. Significance

test of score

differences

(18)

Tcchne Jurual llmiah Elcktroteknika Vol 9 No I April 2010 Hal 13 39

the test ob_jects. We use the procedure described in Section 2.2A to find the critical Yalue for the score difference for the images, at significance le,·el a= 0.05. From [8] we haYe

W.

"= 4.XIJ. Substitutmg this value into Equation (15). \\e ｨ｡ｮｾ＠ R m.12 and thus "·e set R = 6g_ Therefore. only objects ha,·ing a score difference of more than 68 are to be declared significantly different.

In Figure 5. "e present the grouping of the images in the test set based on the s1gnitkance ()f the score dttlerences. lhe Images lla\ e been soneo Hum ｴｾ［ｈ＠ lu 11g1u

based on their scores. starting from the tmage with the smallest score (I_ e., perc en ed to haYe the highest quality) to the one with the largest score. The score for each image is shown directly under the image code. Images ha,ing a score difference smaller than 68 are grouped together. This is represented by the shaded boxes under the image code For e'-::lmple. m Figme 5(a). imagesA1-1 and

A1-"

belong to one group.

(J (,

10 12 15 17 18 20 26 26 32 37 40 42 4l) 55 57 67 67

5 3 7 5 8 6 5 6 3 3 5 7 7 7 2 4

(a)

A1 A1

A,

A(,

A,

A A1 As A::

Al AI)

A3 A, A.; A1 A1

A_;

0 3 5 ()

85 94 10 1 I 15 20 27 31 34 38 39 48 49 57 59 67 68

8 3 4 2 4 5 8 l) l) ₅ l)

"

5

..

(b)

Figure 5. Score groupingfbr: (a) Bird image and (h) Kremlin Image

Frpm Figure 5. "e can see that the images occupying the last 6 positions of the ranking for both the Bird and Kremlin images are distorted using the same dist011ion. Futihennore. they are sorted m the same order (except for images A5 and A 1 _ but the

difference between their scores is not signiticant). Thus we can conclude that these distortions are percei,-ed similarly by the test subjects. regardless of the image

[image:18.520.76.461.184.581.2]

(19)

PERl'EPTUAL QUALITYOFliEOME1'Rll'AHY lJJ.'n'ORTED JMAliES PARTJJ: HUMAN PER( 'EPTJON OF THE QUALJTY OF &'EOMEJ'RJCALL Y J)JSJ'ORTEiJ JMA6'ES

hmn .\etya;um

content. These distortions occupy the "lower quality" segment of the ranking so we

can also conclude that the distortions are so seYere that the image content no longer

plays a significant role. For the other images. the influence of image content on the

perceiYed quality of the distorted images is larger.

Bud Kremlin

I

·---

ＭＭＭｾｊ＠

l

Group 11 Significant') Group u Significant?

1 - - · ---ＭＭＭＭｾＭＭＭＭＭＭＭＭＭｾＭﾷ＠ ·---ＭｾｾＭＭＭＭＬＭ

AuA 1A- 0.006 No A1AJOA12Ao O.OOS No

A1A-AtAs 0.061 Yes AJoA12A(1 0.03 Yes

Au

ＭＭＭｾ＠ ---- ＭＭＭＭＭＭｾＭＭＭＭＭＭｾＭＭｾｾ＠ ---t- - ---·-- --- ---·--

--ａＭａｾａｳａｊｵ＠ 11.041 Yes I _{Au A} 0_011 No

!

Aj(A_:AQ 0.07 Yes A13As 0.08 Yes

A.:A<Au 0.085 Yes AsA.:AJ.J 0.175 Yes

--- ＭｾＭＭＭＭ

A1¥1J3 -O.IJ04 No A2AJ-1A{) (),(J54 Yes

AuA1.:A3 -0.003 No A3A15

-

No

i

0.021

I

A1sA-1 0.148 Yes A..A16 O.l12 Y1

--- ···--- - ＭＭｾＭＭＭＭＭＭﾷ＠

-A ..A/(. 0.08 Yes ArA5 0.01 I N1

A:Ar -0.015 No

-

-Table 5 shows the oYerallu yaJues for each score group. We expect that '"hen

the images in a group do not haye significantly different scores. there will not be any

clear preference tor any of them among the test subjects and therefore the u ya]ues

should be lon. The groups are presented 111 the I '1 and ＴＱｾ＾＠ columns using their

members as group names. The 3.-d and 6th colunms of the table show the result of the

significance test for 11. as described in Section 2.2.2. with significance leYel a= 0.05.

We can conclude from Table 5 that the u yafues for each group are Yery lo".

Some groups eYen hm·e ll Yalues that are not significantly larger than the

u

Yalues that

"ould haYe been achieYed had the Yotes within that group been cast at random. This

result shows that indeed the grouping of the images pe1formed based on the

[image:19.523.77.455.173.510.2]

(20)

Techne Jurnal lhniah Elektroteknika Vol. 9 No I April2010 Hal 13 39

4.3. Conclusions

From the analysis of the user test results. we cru1 draw the follo,Ying conclusions:

J. The test objects are generally perceptually distinguishable by the test ｳｵｾｪ･｣ｴｳＮ＠

This is supported by the fact that the consistency of the test subjects is

indn idualu Yalues (shmHl in Table 4) are also high.

2. There is a general agreement as to the relati,·e perceptual quality of the test unages among the test subjects. This is supported by both the high oYerall and indindualu ,·alues. Therefore. ''e can make a ranking of the images based on their perceiyed quality.

3 For some images. the relattYe perceptual quality among them ts not clearly distinguishable. We can see this from the grouping of the scores based on the significance test of score differences. Tllis is further supported by the lack of agreement among test ｳｵｾｪ･｣ｴｳ＠ regarding the relatiYe quality of images \Yithin such groups.

5. Evaluation of the objective perceptual quality measurement

method

5.1. Overview of the method

The objectiYe geometric distortion measurement is based on the ideas in our pre,iously published work

191

m1d further deYeloped and described in

121

and in the pre' ious paper of this series Thl! algorithm is based on the hypothesis that the perceptual quality of a geometrically distorted image depends on the homogeneity of the geometric distortion. We call our proposed scheme the Homogeneity-based Perceptual Quality Measurement (HPQM). The less homogenous the geometric distot1ion is. the lower the perceptual quality of the image ''ill be. We proposed a

I

(21)

I

PERCEPJ'U.4L QUALITY OF GEOMETRICALLY /JlSTORTED IMAfiE.ft PART If: HUMAN PERCEP110N OF THE QUALITY OF GEOMETR/( 'ALLY ])JSTORTED IMAGES

hmn .\'etyrrwm1

predetermined threshold or until the locality of the approximation reaches a

predetermined maximum. The locality is increased using quadtree partitioning of the

image. \Yhere smaller block sizes indicate higher approximation ｊｯ｣｡ｬｩｴｾﾷ＠ We then

determine the score (i.e .. the quality) of the image based on the resulting quadtree

structure In the objecti' e test the score that can be achie,·ed by an image is

5.2. Performance evaluation

In this section we shall e,·aluate the performance of our proposed ＰｾＱ･｣ｴｩｙ･＠

quality measurement algorithm In this performance e\·aluation we use the results of

the subjectiYe-test as a ground truth. In other words. the proposed algorithm "ill be

considered to be perfonning well if its results haYe a good correspondence to the

subjective-test results. Furthermore. in order to evaluate the performance of the

proposed obJectiYe quality measurement algorithm relative to the peiformance of

other possible measurement schemes. \Ye also evaluate the performances oftwo other

possible objecti,·e quality measurement schemes. The other possible measurement

schemes we eYaluate in this section are PSNR measurement and Motion-Estimation

(ME)-based measurement scheme.

The PSNR measurement is a widely used tool used to e,·atuate the ob.iective

quality of images. Although this measurement does not al\Yays correspond well to

human perception of quality. its performance is good enough to eYaluate the quality

of. for example. images degraded by addith·e noise. HoweYer. PSNR measurement

relies hea' ｩｬｾ＠ on the pixel-per-pixel correspondence between the images being

eYaluated. Since geometric distortion ､･ｳｴｲｯｾﾷｳ＠ this correspondence. PSNR

measurement is not well suited for eYaJuating geometrically distoi1ed Images.

Therefore. in our experiments the results of the PSNR measurement are used to

indicate the \Yorst-case scenario (i.e .. an ineffectiYe measurement scheme).

The second alternatiYe objectn·e quality measurement scheme we evaluate is

an motion estimation (ME)-based measurement scheme. This measurement scheme is

inspired by lhe use of motion estimation techniques in image and 'ideo \Yatermarking

to deal "·ith geometric distortion for example the technique presented in [ l2l In order

(22)

Techne Jurnal Ilmiah Elektroteknika Vol 9 No I April 2010 Hnl 13 --39

l\Yo outputs of the motion estimation process. namely the motion Yector entropy and

the ,-ariance of the prediction error. The motion 'ector entropy is used to indicate the

"acti' ity·· of the distortion. A high acti' ity means that Yarious parts of the image are

dist011ed in a different "ay. The higher the actiYil\ of the distortion. the lower the

perceptual quality of the image. The ,·ariance of the prediction error sho\\ s the

residual error after the motion estimation and compensation process A large error

'anance mdicates a hem y distortiOn and thus a I on ei perceptual qualll). Uw

obsen·ations mdicate that the motion Yector entropy plays a more important role in

determinmg the perceptual quality of the image. Therefore. we giYe this measurement

parameter a larger n eight than the residual error 'ariance. These weights are

determined experimentally. The proposed ME-based quality measurement (MEQM)

scheme is presented in Figure 6. In our experiments. "e chose a block size of 16 1(,

pixels. maximum displacement of 7 pixels and full-search method. This ME-based

measurement approach is somewhat similar to the HPQM approach \Yith two main

differences. The first difference between the h\'O is the simpler approximation model

of the MEQM scheme. The MEQM scheme uses only translation instead of an

RTS/affine model used by HPQM. The second difference is in the locality of the

approximation. The MEQM scheme uses a fixed locality tor the approximation. This

locality is determined by the chosen block size. In other "ords. we can regard the

MEQM scheme as a simpler. more restricted. Yersion of the HPQM.

Original 1mage

r --·---,

l_

Distorted image

j

-____ t_ ____ _

I

Motion

111-j Estimation & . ..,.! Motion vectors

I

1 Compensation

ＢＭＭＭｾＭＭＭＭ ---.---'

'

Compensated I_

- l

'

,- I ｾＭＭＭＭ

score calculation j....J Image score

--

..

i

I

!

Residual error r---

[image:22.519.67.459.255.652.2]

(23)

PFRCEPTllA/. Ql!AUTY OF GEOMETR/C4H Y I>ISTORTEJJ IMAGES PART ll: HUMAN PERCEPTION OF THE QUAUTY OF (iEOME1'R/C-tU Y J)JSTORTEfJ IMAW:::s ]wan .\'etyawan

In entluating the performance of the objecti\e quality measurements. we look at the imra- and inter-distortwn comparisons. For intra-distot11on comparisons. we eYaluate the scores of the images "·ithin one type of geometric distot1ion. but with different distortion parameters. For example. we perform an intra-distortion comparison by eYaluating the scores of images A.:. A3. A-1 and A5 that are distorted by

i!w 'Wilh' "'!Ptt,oid distortion hut "ith different parameter:-> (see Table ll In thi<; comparison an tmage \Yith a more se\ ere disto11ton parameters should get a IO\\ er

score. For inter-distot1ion comparisons. we eyaJuate the scores of alJ images in the test set Tltis is a more diflicult test for the objectne quality measurement schemes since they haYe to be able to indtcate the relatiYe perceptual qualities between different types of geometnc distot1ions.

All measurement schemes that we eYaluated m our expenments. mcludmg PSNR measurement. perform weil 111 the intra-distortion comparison. In other ''ords.

the images distorted "ith a more seYere parameter set are correctly giYen lower scores. In order to eYaluate the performance of the objectiYe quality measurement schemes in performing inter-distot1ion comparisons. we plot their results against the subjecti,·e test scores. The comparison plots for the Bird image set is sh0\\11 in Figure 7. The plots for the Kremlin image set sho" sintilar behm·ior.

User test vs HPOM

Figure . Rnullt-nmpuri.wmsjhr the Rtrd imugc.

[image:23.524.59.457.290.631.2]

(24)

Techne Jurnal Ilmiah Elektroteknika Vol <)No I April 2010 Hal 13-39

3or

- - - - T - · · 28 ... '

I I

ＲＶｾﾷﾷﾷ＠

I

ｾ＠ 24

r

m . . . . ' •

(L i

22f. 20; i I 18" 100- • 70 I ｾ＠ sol

,.

50f 200

20 I ____ · - · - - - · - _ J _ _

!00 200

Usertestvs PSNR value

ｾＭＭＭＭＭｾｾＭＺ＠ｾｾＭｾｾＭｾｾＭＭｾＭＭ

.. ··---.--- ····l

i

I

• • i

I . usr,testvspsnr ·1..

ｬＮ］ＮﾷﾷＭ］ＭＭｾｾｴｾ＠ --·

• ---..:..L...:

300 400

User test score

500 600

Figure 7 (continued)

(h) ｣Ｚｾﾷ＼Ｚｲ＠ leSI \'S. FSNR

User test vs MEOM

··i·

.: ... ····L •

Us« test va MEOM l.

- l i n e fit _j

_ .. ..1.. _ _ _ _ _ _ - j ___ _ _ _ _ l _ __ - .

300 400 500

ｕｾ･ｲ＠ test $COI"e

[image:24.517.76.450.59.585.2]

(c)

Figure 7 (confinued):

(c) User resr vs. MEQM

600

Fm m Figure 7 (b) ''"e can see that the PSNR measurement has a Yery poor

i

correspondence to the subjectiYe test result This is shm\n by the regression line that

is ｙｩｲｴｵ｡ｬｬｾ＠ horizontaL The 'alue of the correlation coefficient J) in this case also

(25)

PERl 'EP1'1 !A I. QUALITY OF (IEOMETRIC4.LLY I>JSTORTED IMAGES PART II: HUMAN PERCEPTION OF THE QUALITY OF UEOMETRU 'ALLY L>ISTORTEIJ IMAGES

/won Se!yml'0/1

also see that the HPQM scheme giYes the best performance among the three eYaluated

schemes. as sho,Yn in Figure 7(a) and "ith fJuh

=

-0.6. The negatiYe Yalues of p"'" and

p111, correctly reflect the fact that in our experiments a larger subjectiYe test score

represents a lo,Yer perceptual quality.

If "e e' aluate Figure 7 (a) we can see that image A 1-' does not properly fit the

hehm ior of (he rest of the data set and can be considered an outlier RemoYing this

image from the data set and recalculating the correlation coefficient. "e get flu,

=

-0.87 In generaL ''"e obserYe that the HPQM scheme crumot hru1dle images distorted

by the sinusoid (srrerch-shrink) disto11ion (see Table I) "elL except for image A1o1. At

present. we do not yet hm e a satisfactory explanation regarding this phenomenon. In

the case of image Aw. the geometric transformation applied to this image is similar to

the one tmplemented in teleYiston broadcasting when It ts ｮ･｣･ｳｳ｡ｲｾ＠ to com ert Yideo

frames from one aspect ratio to another. This transformation is perceptually not

disturbing (unless there is a lot of moYement. for example camera pruming). and

therefore. our test subjects giYe tllis image a high rru1king. In this distot1ion. the image

is stretched slightly in the horizontal and YeJ1ical direction. The slight increase in

image '"idth and height is compensated by slu-inking the outer pru·ts of the image. This

distortion cru1 be approximated by slightly enlru·ging the original image. Since tllis is a

homogenous RTS approximation. the HPQM scheme giYes this image a high score.

Image Au of the Bird test set is interesting since the subjects prefer tlus image to the

undistorted image A1. Tllis is probably due to t11e unfruniliarity of the subjects to the

bird species shown in the picture. Apparently. the test subjects get the impression that

the size of the bird·s head in t11e original image was either too large or too flat

Therefore. they preferred the image in which the head of the bird is slightly shrunk

horizontally (ru1d ｣ｯｮｳ･ｱｵ･ｮｴＡｾﾷ＠ slightly rounder) The fact that this does not happen in

the Kremlin test set (see Table 2(b)) seems to suppot1 this conclusion.

6. Conclusion and future works

In this paper. "e haYe described the method we use to petfonn a perceptual

user test for geometrically distot1ed images We also described the statistical tools we

(26)

Techne Jurnal flmiah Elektroteknil.a Vol 9 No I April 20 I

o

Hal 13 3t)

ground truth to validate our objective perceptual quality measurement scheme. the

HPQM. which is based on the hypothesis that the perceptual quality of a distorted

image depends on the homogeneity of the geometric transformation causing the

distortion. Furthermore. in order to have a better assessment of the performance of the

HPQM. "e also compare its performance to the performance of the PSNR

measuremenl and the MEQM scheme. In our experiments. we eYaluate the

perlormance o1 aH three obJeCUH:: measurement Ｎｳ｣ｨ･ｭｾ［ｳ＠ m l\\u weas. nameh 111

performing intra- and inter- distortion compansons.

All objecti,·e measurements evaluated in our experiments. the HPQM. PSNR

and MEQM. gi'e similar performance in performing intra-distortion compansons. For

inter-distortion comparisons. the PSNR measurement performs poorly. The MEQM

and HPQM schemes outperform PSNR measurement in this category, "ith the HPQM

giving the best performance among the three e\·aluated schemes.

Whde the amount of data collected in our experiments is not yet large enough

to form firm conclusions. we obserYe a Yery strong tendency that our HPQM scheme

has a very good oYerall conespondence to the results of the subjecti,·e test The

scheme is not yet perfect. hO\vever. and we still obserYe some discrepancies bet\veen

the ranking of the images generated by HPQM to that generated by the subjective test

result.

In the future. more measurements and user test experiments similar to the one

described and analyzed in this chapter should be performed. The data collected from

such experiments can than be used to fm1her Yalidate or refine the hypothesis and to

further fine-tune the pe1formance of the HPQM scheme. Finally, other ｯｾｪ･｣ｴｩＬＭ･＠

quality measurement approaches should also be explored and tested.

7. References

I. Keelan. B.W .. Handbook of] mage Quality: Characteri::atton and Prediction.

Marcel Dekker. Inc .. Ne\\ York. 2002

2. I. Setya\\an. D. Delannay. B. Macq and RL Lagendijk. Perceptual Quality

Evaluation ＨｾＨｇ･ｯｭ･ｴｲｩ｣｡ｬｬｹ＠ Distorted Images 11sing Relevant Ueomefric

TransformatiOn .l14ode!lrng. m Proceedings of SPIE. Security and Watennarkmg of

Multimedia Contents V_ Vol. 5020. pp. 85 94. Santa Clara. CA. 2003

(27)

PER( 'EPTUAJ. QUALITJ' OF (i£0METRIC4Ll Y f)l .. TORT£1) lll!AG'E.'i PART II: HUMAJV PERCEPTION OF THE QUALITY OF GEOMETRICALLY [)[STORTEI> IMAG'ES /wan .\etyawan 4. Dm id. H. A.. The Method otPaircd ComJXtrisom;_ 2'\d eel.. Charles Griffin &

ｃｯｭｰ｡ｮｾＬＮ＠ Ltd .. London. I9R8

5. Bechhofer. RE.. T.J. Santner ru1d D.M. Goldsman. Design and Analysis o(

Experiments

.for

Statistical ,)'elecllun. Screening and !vfulliple ( 'omparisons. John

Wiley & Sons. Ltd .. Ne"· York. 1995

Lld .. London. ! '>7'5

7 SiegeL S . ru1d N J Castellan. Jr .. Nonporamerric Swfi.l·ficsf()r the Behavioral

,\'ciences 2nd ed .. McGraw-HilL Boston.. 1988

8 Pearson. E.S. and H.O. Hartley. BiomerrtlaJ Tahle5'.f(n· Statisticwns. Vol I. 3'd ed ..

Cambridge UniYersity Press. 1966.

1

) D. Delru1na}. I. Setyawan. RL Lagendi.Jk ru1d B. Macq. Re!evum A1odellmg ano

Comparison ＨｾＨｇ･ｯｭ･ｴｲｩ｣＠ Distortions m Watermarking Sy\·tems. in Proceedings

ofSPIE. Application of Digital Image Processing XXV. Vol. 4790. pp. 200 210.

Seattle. W A. 2002

10. Tekalp. AM .. Digital Video Processing. Prentice--HalL Inc .. Upper Saddle River.

1995

11. Tekalp. AM .. Differential Methods. prui of the lecture notes for Digital Video

Processing. Unhersity of Rochester. Ne"- York. USA. 2001

12. D. Delannay. J-F Delaigle. B. Macq and M. Barlaud. Compensation Ｈｾｬ＠

geometrical detimnationsti:Jr 'A'Cltermark extraction in the dtgttal cinema

applic"'ion. in Proceedings of SPIE. Security and Watermarking of Multimedia