• Tidak ada hasil yang ditemukan

Adapted from Hang Li, MSR Asia, 09

N/A
N/A
Protected

Academic year: 2018

Membagikan "Adapted from Hang Li, MSR Asia, 09"

Copied!
43
0
0

Teks penuh

(1)

Learning to Rank

Pradeep Ravikumar

[email protected] !

CS 371R !

Adapted from Hang Li, MSR Asia, 09

(2)

Ranking

(3)

Ranking

(4)

Ranking

(5)

Whither Machine Learning?

Machine Learning Magic Box

Coke, Pepsi, Sprite Coke > Sprite > Pepsi

(6)

Whither Machine Learning?

Machine Learning Magic Box

Coke, Pepsi, Sprite Coke > Sprite > Pepsi

List of Objects Ranking of Objects

(7)

Ranking

16

s

n s

s s

o o

o

, 2 ,

1 ,

s

request offerings

ranking(of(offerings

o o oN

O , , ,

2

1

) , (s o

f Objects

Ranking of Objects

(8)

Ranking via Machine Learning

Learning(to(Rank

1

Learning( System

Ranking( System

1

(9)

Is learning to rank necessary for IR?

• Has been successfully used in web-document search

• Has been the focus of increasing research at the intersection of machine

(10)

Example: Machine Translation

138 Re5Ranking(

Model

1 0 0 0 2 1

e e

e

ranked(sentence

candidates(in(target( language

Generative Model

f

sentence(source(language

e

~

re5ranked(top

(11)

Example: Collaborative Filtering

I tem1 I tem2 I tem3

...

User1 5 4

User2 1 2 2

... ? ? ?

(12)

Example: Collaborative Filtering

I tem1 I tem2 I tem3

...

User1 5 4

User2 1 2 2

... ? ? ?

User M 4 3

Input: User Ratings on some items

(13)

Example: Collaborative Filtering

I tem1 I tem2 I tem3

...

User1 5 4

User2 1 2 2

... ? ? ?

User M 4 3

Any user is a query; based on this query,

(14)

Example: Key Phrase Extraction

• Input: Document

• Output: Key phrases in document

• Pre-processing step: extraction of possible phrases in document

(15)

Other IR applications of learning to rank

• Question Answering

• Document Summarization Opinion Mining

• Sentiment Analysis

• Machine Translation

• Sentence Parsing

(16)

Our focus: ranking documents

q

n q

q q

d d

d

, 2 ,

1 ,

q

query documents

ranking(of(documents

d d dN

D 1, 2,,

) , (q d f

ranking(based(on( relevance,(importance,(

(17)

Traditional approach to “learning to rank”

Probabilistic(Model

N

d d

d

2 1

q

) ,

| ( ~

) ,

| ( ~

) ,

| ( ~

2 2

1 1

n n P r q d

d

d q r P d

d q r P d

query

documents

ranking(of(documents

} 0 , 1 {

) , | (

R

(18)

Modern approach to learning to rank

Learning(to(Rank

1

Learning( System

Ranking( System

(19)

Learning to Rank: Training Process

1.(Data(Labeling (rank)

& & % & & $

& & % & & $

& & % & & $

& & % & & $

& & % & & $ #

(20)

Learning to Rank: Training Process

1.(Data(Labeling (rank)

2.(Feature( Extraction

& & % & & $

& & % & & $

& & % & & $

& & % & & $

& & % & & $ #

(21)

Learning to Rank: Training Process

1.(Data(Labeling

(rank) 3.(Learning

) (x

f

2.(Feature( Extraction

& & % & & $

& & % & & $

& & % & & $

& & % & & $

& & % & & $

& & % & & $

(22)

Learning to Rank: Testing Process

1.(Data(Labeling (rank)

& & % & & $

& & % & & $

& & % & & $

(23)

Learning to Rank: Testing Process

1.(Data(Labeling (rank)

2.(Feature( Extraction

& & % & & $

& & % & & $

& & % & & $

(24)

Learning to Rank: Testing Process

1.(Data(Labeling (rank)

2.(Feature( Extraction

3.((Ranking( with f (x)

& & % & & $

& & % & & $

& & % & & $

(25)

Learning to Rank: Testing Process

1.(Data(Labeling (rank)

2.(Feature( Extraction

28

3.((Ranking( with f (x)

& & % & & $

4.((Evaluation

Evaluation Result

& & % & & $

& & % & & $

(26)

Issues in learning to rank

• Data Labeling

• Feature Extraction

• Evaluation Measure

(27)

Data Labeling

E.g.,(relevance(of(documents(w.r.t.(query

Doc(A

Doc(B

Doc(C

Query

(28)

Data Labeling: Supervision takes different forms

• Multiple levels (e.g., relevant, partially relevant, irrelevant)

‣ Widely used in IR

• Ordered Pairs

‣ e.g. ordered pairs between documents (e.g. A>B, B>C)

‣ e.g. implicit relevance judgments derived from click-through data

• Fully ordered list (i.e. a permutation) of documents is given

(29)

Data Labeling: Implicit Relevance Judgement

Doc(A

Doc(B

Doc(C

B(>(A ranking(of(documents(at(search(system

users(often(clicked(on(Doc(B

(30)

Feature Extraction

35

Doc(A

Doc(B

Doc(C Query

BM25

BM25

BM25

PageRank

PageRank

PageRank

...

...

...

Query5document(feature

Document(feature

(31)

Feature Extraction: Example Features

• BM25 (a classical ranking score function)

• Proximity/distance measures between query and document

• Whether query exactly occurs in document

• PageRank

(32)

Evaluation Measures

• Allows us to quantify how good the predicted ranking is when compared with ground truth

ranking

• Typical evaluation metrics care more about ranking top results correctly

• Popular Measures

‣ NDCG (Normalized Discounted Cumulative Gain)

‣ MAP (Mean Average Precision)

‣ MRR (Mean Reciprocal Rank)

‣ WTA (Winners Take All)

(33)

Kendall’s Tau

-!#

Comparing(agreement(between(two(permutations

Number(of(pairs(

Number(of(concordant(pairs(and(disconcordant pairs(

Completely(agree

Completely(disagree( )

1 (

2 1

n n

Q P

) 1 (

2 1

n n

Q P,

1

-1

(34)

Kendall’s Tau

-!#0"-1

Example

A((B((C

C((A((B

3 1

3 2 -1

(35)

Learning to Rank Methods

• Three major approaches

‣ Pointwise approach  

‣ Pairwise approach

(36)

Pointwise Approach

• Transforming ranking to regression, classification, or ordinal regression

(37)

Pointwise Approach

Pointwise(Approach

Learning

Regression Classification Ordinal2Regression

Input Feature(vector

Output Real(number Category Ordered category

Model Ranking(function

Loss2

Function Regression(loss Classification loss

Ordinal(regression( loss

x

) (x f

y y sign( f (x)) y thresold( f (x))

(38)

Pairwise Approach

• Transforming ranking to pairwise classification

(39)

Pairwise Approach

Pairwise(Approach

Learning

Input Ordered(feature(vector(pair

Output Classification on(order(of(vector(pair

Model Ranking function(

Loss2

Function Pairwise classification(loss(

)) (

sign(

,j i j

i f x x

y

) , (xi xj

Ranking function((f (x)

(40)

Pairwise Ranking

Converting(document(list(to(document(pairs

Doc(A

Doc(B

Doc(C

Query

Doc(A

Doc(B Doc(C

Query

Doc(A Doc(B

Doc(C Rank(1

Rank(2

(41)

Transformed Pairwise Classification Problem

Transformed(Pairwise(Classification(Problem

66

) ;

(x w

f

3

1 x

x

3

2 x

x

2

1 x

x

+1 51

1

2 x

x

2

3 x

x

1

3 x

x

Positive(Examples

(42)

Listwise Approach

• Entire list of documents as input-instance

• Query-document group structure is used

• Straightforwardly represents learning to rank problem

(43)

Listwise Approach

Learning &2Ranking

Input Feature(vectors

Output Permutation(on(feature(vectors

Model Ranking(function

Loss2Function Listwise loss(function

(ranking(evaluation(measure)

n i i

x } 1

{

x

) (x

f

) )} (

{

sort( f xi ni1

Referensi

Dokumen terkait

In this paper we prove the uniqueness of bounded solutions to a viscous diffusion equation based on approximate Holmgren’s approach.. Key words and phrases: Holmgren’s

In order to relate the system to the KP trigonometric solitons introduced in Definition 1 , we need the following lemma which can be extracted from Ruijsenaars [ 16 ], and was

In the future, it is anticipated that 1 the multidisciplinary approach will continue; 2 patenting issues will cause disruptions hopefully small in information sharing; 3 funding will

The approach adopted in the report drew heavily on feedback from stakeholders and acknowledged a number of major policy issues including: • Health care is a national issue and that

IMPLEMENTING ABCD TOOLS AND PROCESSES IN THE CONTEXT OF SOCIAL WORK STUDENT PRACTICE Natalie Mansvelt INTRODUCTION Community development as an approach is well aligned to the

Email: [email protected] Earlier, in the Editorial note of the 13th issue of the journal, regarding the ranking of universities, these issues were raised that "Is the university

Subordinating conjunctions COORDINATING CONJUNCTIONS Coordinating conjunctions allow you to join words, phrases, and clauses of equal grammatical rank in a sentence1. The most common

The approach used in this paper is entirely new in two ways; first, a metric similar to journal ranking is proposed for ranking authors and their institutions and secondly,