AQE is also covered in the books by Baeza-Yates and Ribeiro-Neto [1999] and Manning et al. [2008]. However, the additional terms can cause query drift, that is, an unwanted change in the focus of the search topic caused by improper expansion [Mitra et al. 1998].
Generation and Ranking of Candidate Expansion Features
Using this formula, we can calculate the correlation between each term of the query and each term in the document collection. Formula (5) is applied to calculate a term-concept correlation (instead of a term-term correlation), where $w_{u,j}$ is the frequency of the term $u$ in the $j$th passage and $w_{v,j}$ is the frequency of the concept $v$ in the $j$th passage.
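Since formula (5) itself is not reproduced here, the following minimal Python sketch assumes the common form of such correlations, namely a sum over passages of the product of within-passage frequencies; all names are illustrative.

```python
from collections import Counter

def correlation(term_u, concept_v, passages):
    """Correlate query term u with candidate concept v across passages.

    Hypothetical reading of formula (5): the pair is scored by summing,
    over all passages j, the product w_{u,j} * w_{v,j} of within-passage
    frequencies.
    """
    score = 0.0
    for passage in passages:
        freqs = Counter(passage)                    # term frequencies in passage j
        score += freqs[term_u] * freqs[concept_v]   # w_{u,j} * w_{v,j}
    return score

# Toy usage: rank candidate concepts for the query term "virus".
passages = [
    "the virus spread through an infected email attachment".split(),
    "antivirus software found the virus in the attachment".split(),
]
candidates = ["attachment", "software", "email"]
print(sorted(candidates, key=lambda v: correlation("virus", v, passages), reverse=True))
```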
Selection of Expansion Features
The idea is to use as input a query supplemented with some context, instead of just the user query. A similar idea is exploited by Collins-Thompson and Callan [2007], with the difference that multiple feedback models are created from the same term classification function by resampling the feedback documents and by generating variants of the original query. In Chirita et al. [2007], the number of expansion terms is a function of the ambiguity of the original query on the Web (or in the user's personal data repository), as measured by the clarity score [Cronen-Townsend and Croft 2002].
The selection of the best expansion terms for a given query (including the null set) is explicitly cast as an optimization problem in Collins-Thompson [2009]. By optimizing with respect to uncertainty sets defined around the observed data (e.g., using query perturbations and topic-specific constraints such as aspect balance, aspect coverage, and query support), the system manages the risk-reward trade-off of expansion.
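Collins-Thompson's method solves a constrained convex program; as a rough, hypothetical illustration of the underlying risk-reward idea only, the sketch below scores each candidate term by its mean weight across resampled feedback models minus a penalty on the variability of that weight.

```python
import statistics

def select_expansion_terms(term_weights_per_model, k=10, risk_penalty=0.5):
    """Trade reward against risk when selecting expansion terms.

    term_weights_per_model: list of dicts (one per resampled feedback model)
    mapping candidate terms to weights. Reward = mean weight across models;
    risk = standard deviation of that weight across models. Illustrative
    only -- the cited work uses constrained convex optimization with
    aspect-balance, aspect-coverage, and query-support constraints.
    """
    vocabulary = set().union(*term_weights_per_model)
    scores = {}
    for term in vocabulary:
        ws = [m.get(term, 0.0) for m in term_weights_per_model]
        scores[term] = statistics.mean(ws) - risk_penalty * statistics.pstdev(ws)
    return sorted(scores, key=scores.get, reverse=True)[:k]

models = [
    {"virus": 0.9, "malware": 0.7, "apple": 0.6},
    {"virus": 0.8, "malware": 0.6, "apple": 0.1},
    {"virus": 0.85, "malware": 0.65},
]
print(select_expansion_terms(models, k=2))  # stable terms win over volatile ones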
Query Reformulation
There has been some research on the optimal number of features to include, with various suggestions ranging from five to ten features [Amati 2003; Chang et al.]. Different queries have been shown to have different optimal numbers of expansion features [Billerbeck and Zobel 2004a; Buckley and Harman 2003; Cao et al. 2008]. On the other hand, the loss of effectiveness associated with suboptimal values is usually modest [Carpineto et al. 2001].
Even a simple reweighting scheme based on an inverse function of joint ranks can give good results (e.g., Carpineto et al. [2002]; Hu et al. [2006]). Various methods of creating a query expansion model have been investigated, based not only on feedback documents [Lavrenko and Croft 2001; Zhai and Lafferty 2001a] but also on term relations [Bai et al. 2005].
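A hedged sketch of such an inverse-rank combination (the specifics of the cited methods differ): each feature's weight is the sum of the reciprocals of its rank positions in the rankings produced by different term-scoring functions.

```python
def inverse_rank_combination(rankings):
    """Fuse several ranked lists of candidate expansion features.

    Each ranking lists features best-first; a feature scores the sum of
    1/position over the rankings that contain it -- a simple instance of
    weighting by an inverse function of joint ranks.
    """
    scores = {}
    for ranking in rankings:
        for position, feature in enumerate(ranking, start=1):
            scores[feature] = scores.get(feature, 0.0) + 1.0 / position
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Two term-ranking functions propose different orderings; fusion rewards
# features ranked highly by both.
print(inverse_rank_combination([
    ["malware", "virus", "worm"],
    ["virus", "trojan", "malware"],
]))
```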
A CLASSIFICATION OF APPROACHES
- Linguistic Analysis
- Corpus-Specific Global Techniques
- Query-Specific Local Techniques
- Search Log Analysis
- Web Data
- A Feature Chart
 
In particular, its use for query expansion has been shown to be beneficial only if the query words are nearly unambiguous [Voorhees 1994; Gonzalo et al. 1998], while word sense disambiguation remains a difficult problem [Navigli 2009]. For example, it is possible to index the user query and the top-ranked snippets by relation paths induced from parse trees, and then learn the paths most relevant to the query [Sun et al. 2006]. Examples of the latter approach include using top results from previous queries [Fitzpatrick and Dent 1997], finding queries related to the same documents [Billerbeck et al. 2003] or to the same user clicks [Beeferman and Berger 2000], and extracting terms directly from clicked results [Cui et al. 2003].
Another interesting method, based on Wikipedia documents and hyperlinks, is proposed in Arguello et al. [2008]. Other types of web data that can be used for AQE include FAQs [Riezler et al. 2007].
RETRIEVAL EFFECTIVENESS
Experimental Setting
The initial set of candidates associated with a query is restricted by considering only those anchor texts that point to a short list of top-ranked documents (drawn from a larger set of top-ranked documents); each candidate is then scored proportionally to its frequency and inversely proportionally to the number of documents to which it is linked. In Tables III and IV we consider some of the most influential or innovative AQE methods, regardless of their broad conceptual paradigms, and provide a detailed classification along five specific problem dimensions.
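Returning to the anchor-text weighting described at the start of this passage, here is a minimal, hypothetical sketch (names and the exact scoring are assumptions; the cited work may differ in details):

```python
def score_anchor_candidates(anchors, top_docs):
    """Generate and score anchor-text expansion candidates.

    anchors: iterable of (anchor_text, target_doc) pairs harvested from the
    collection. A candidate is kept only if at least one of its occurrences
    points to a top-ranked document; it is then scored proportionally to its
    total frequency and inversely proportionally to the number of distinct
    documents it links to.
    """
    frequency, targets = {}, {}
    for text, doc in anchors:
        frequency[text] = frequency.get(text, 0) + 1
        targets.setdefault(text, set()).add(doc)
    top = set(top_docs)
    return {
        text: frequency[text] / len(targets[text])
        for text in targets
        if targets[text] & top
    }

anchors = [("computer virus", "d1"), ("computer virus", "d2"),
           ("malware", "d1"), ("homepage", "d9")]
print(score_anchor_candidates(anchors, top_docs=["d1", "d2"]))
# {'computer virus': 1.0, 'malware': 1.0}
```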
Published Results of AQE Approaches
To allow cross-comparisons, we only considered the experiments performed on the full set of documents. In addition to the average precision of the individual AQE methods, for each test collection we list the best baseline (an implementation of the ranking system without AQE) and true relevance feedback, if available. It is important to note that such findings should be taken with caution, because Table VI lists the absolute overall performance of each system, which reflects more than the AQE component alone.
One method (published in 2005), for example, benefited greatly from a very high baseline performance (e.g., 0.3107 in TREC-3), even higher than that of other methods with AQE. For instance, the removal of spurious, low-frequency words at indexing time from the TREC-9 and TREC-10 collections was found to be very useful because it reduced the number of typographical errors in the documents, which is one of the causes of poor query expansion in noisy collections.
Other Evaluation Methods
One thing common to these four methods is that they explicitly take term dependence into account, although through different techniques; in addition, they primarily used top-retrieved documents, possibly combined with other sources of evidence, and were built on top of very effective baseline ranking systems. Even an effective AQE method will clearly produce poor results when combined with an ineffective baseline IR system, and vice versa. In fact, the underlying ranking methods used in the experiments were usually very different and never exactly the same.
We also need to consider that even when strict single-word indexing is performed and the same weighting function is used for document ranking, there are a number of system-specific factors that can significantly change the final retrieval performance, including document parsing, stop-word removal, and stemming. Another problem that can complicate the interpretation of results is that the parameters involved in each AQE method may have been tuned on training data or other types of data that are not always readily or widely available.
COMPUTATIONAL EFFICIENCY
Which AQE Method is Best?
In general, linguistic techniques are considered less effective than those based on statistical analysis, whether global or local, because they require nearly exact word sense disambiguation; but statistical analysis, too, cannot always be applied (e.g., when good expansion features do not co-occur frequently with the query terms). Among the statistical techniques, local analysis seems to perform better than global corpus analysis because the extracted features are query specific, while methods based on web data (query logs or anchor texts) have not yet been systematically evaluated against the others on standard test collections. From the perspective of computational efficiency, query-specific techniques require two retrieval passes at query time, whereas the other forms of AQE are mostly performed offline, although the inherent complexity of the latter techniques may prevent their use in very large domains.
Query-specific techniques depend on the quality of the first-pass retrieval; corpus-specific techniques are not suitable for dynamic document collections; and linguistic techniques, as well as methods based on the analysis of query logs or hyperlinks, make use of data that are not always available or suitable for the IR task at hand. To summarize, there is a wide range of AQE techniques, offering different features and each being mostly useful or appropriate in certain situations.
CRITICAL ISSUES
Parameter Setting
The best choice depends on the evaluation of a number of factors, including the type of collection being searched, the availability and characteristics of external data, the facilities provided by the underlying ranking system, the type of queries, and efficiency requirements.
Efficiency
The adoption of such approximate techniques usually involves a moderate trade-off between speed and retrieval performance, although their overall adequacy ultimately depends on the requirements of the search application.
Usability
RESEARCH DIRECTIONS
Selective AQE
The most well-known predictive function, called the clarity score [Cronen-Townsend and Croft 2002], is the Kullback-Leibler divergence between the query model, estimated from the top-ranked documents, and the collection model (a toy implementation is sketched at the end of this section). Arguably, a more effective strategy is to use the difference between the clarity score of the original query and that of the expanded query, yielding performance comparable to the non-AQE ranking on the worst and most difficult queries (those hurt by expansion) and better than conventional AQE over the entire set of queries [Amati et al. 2004]. Queries have also been classified into three types: (1) entity queries, if they match the title of an entity or redirect page; (2) ambiguous queries, if they match the title of a disambiguation page; and (3) broader queries in all other cases.
Navigational queries are then handled by a separate anchor-based retrieval model that expands the anchor terms with their synonyms. Selective expansion has also been applied in a federated search environment, producing specific queries for each resource [Shokouhi et al. 2009].
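As noted above, the clarity score admits a very compact formulation; the toy sketch below takes the query language model (here assumed to have been estimated beforehand from top-ranked documents) and the collection model as term-probability dictionaries, omitting the estimation and smoothing steps.

```python
import math

def clarity_score(query_model, collection_model):
    """Kullback-Leibler divergence (in bits) between the query model,
    estimated from the top-ranked documents, and the collection model.

    Both arguments map terms to probabilities; terms with zero query-model
    probability contribute nothing to the sum.
    """
    return sum(
        p * math.log2(p / collection_model[term])
        for term, p in query_model.items()
        if p > 0
    )

# A peaked query model diverges more from the collection model (clearer
# query) than a flatter one dominated by common words.
collection  = {"the": 0.05, "computer": 0.02, "virus": 0.01, "apple": 0.01}
clear_query = {"virus": 0.6, "computer": 0.4}
vague_query = {"the": 0.5, "apple": 0.5}
print(clarity_score(clear_query, collection) > clarity_score(vague_query, collection))  # True
```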
Evidence Combination
Instead of just disabling AQE when its use is predicted to be harmful, it may be more convenient to use different expansion strategies depending on the query type. When multiple feedback models are combined through weighted mixing, the best-fitting mixture weights confirmed that all models affected the final performance, but the pseudo-relevance feedback model was by far the most effective.
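A minimal sketch of such evidence combination by weighted mixing, assuming each source of evidence yields a term-weight model and the mixture weights are given (the cited work learns them from data):

```python
def combine_models(models, weights):
    """Linearly interpolate several expansion term models.

    models: list of dicts mapping terms to probabilities or weights;
    weights: one mixing weight per model (assumed to sum to 1). A sketch
    of evidence combination by mixture, not a specific published method.
    """
    combined = {}
    for model, w in zip(models, weights):
        for term, p in model.items():
            combined[term] = combined.get(term, 0.0) + w * p
    return combined

prf_model = {"virus": 0.5, "malware": 0.3}    # pseudo-relevance feedback evidence
thesaurus = {"virus": 0.2, "pathogen": 0.4}   # term-relation evidence
print(combine_models([prf_model, thesaurus], [0.7, 0.3]))
```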
Active Feedback
CONCLUSIONS