Computational generation of referring expressions: A survey

It is this later algorithm that became known as "the" Incremental Algorithm (IA). To refer to {d1, d2}, a CNF-oriented algorithm can generate (man ∪ woman) ∩ (left ∪ center) ("the people who are on the left or in the center").
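The set-theoretic reading of such a description can be sketched in a few lines of Python. The three-object domain below and the helper names `extension` and `denotation` are assumptions made for illustration only; they are not taken from the survey's figures or tables.

```python
# Hypothetical three-object domain (an assumption for illustration).
CNF_DOMAIN = {
    "d1": {"type": "man", "position": "left"},
    "d2": {"type": "woman", "position": "center"},
    "d3": {"type": "man", "position": "right"},
}


def extension(attribute, value, domain=CNF_DOMAIN):
    """Return the set of objects that carry the given attribute value."""
    return {obj for obj, props in domain.items() if props.get(attribute) == value}


def denotation(cnf, domain=CNF_DOMAIN):
    """Evaluate a CNF description: an intersection of unions of properties."""
    result = set(domain)
    for clause in cnf:
        union = set()
        for attribute, value in clause:
            union |= extension(attribute, value, domain)
        result &= union
    return result


# (man ∪ woman) ∩ (left ∪ center)
people_left_or_center = [
    [("type", "man"), ("type", "woman")],
    [("position", "left"), ("position", "center")],
]
```

Under these assumptions, `denotation(people_left_or_center)` contains exactly d1 and d2, so the description singles out the intended target set.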

Figure 2 contains a sketch of the IA in pseudo code. It takes as input a target object r, a domain D consisting of a collection of domain objects, and a domain-specific list of preferred attributes Pref (1).
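As a rough illustration of the inputs just listed, the following Python sketch implements a simplified incremental selection loop. It is not a transcription of Figure 2: it omits refinements such as searching for the best value of an attribute or always including the type, and the example domain, attribute names, and function name are assumptions.

```python
# Hypothetical domain: each object maps attributes to values (an assumption).
IA_DOMAIN = {
    "d1": {"type": "man", "clothing": "suit", "position": "left"},
    "d2": {"type": "woman", "clothing": "t-shirt", "position": "center"},
    "d3": {"type": "man", "clothing": "t-shirt", "position": "right"},
}


def incremental_algorithm(r, domain, pref):
    """Simplified incremental selection of attribute-value pairs.

    r      -- identifier of the target object
    domain -- dict mapping object identifiers to attribute dicts
    pref   -- list of attributes in order of preference
    Returns a list of (attribute, value) pairs, or None if the target
    cannot be distinguished with the attributes in pref.
    """
    description = []
    distractors = {obj for obj in domain if obj != r}
    for attribute in pref:
        if not distractors:
            break
        value = domain[r].get(attribute)
        if value is None:
            continue
        # Distractors that do not share the target's value are ruled out.
        ruled_out = {d for d in distractors if domain[d].get(attribute) != value}
        if ruled_out:
            description.append((attribute, value))
            distractors -= ruled_out
    return description if not distractors else None
```

With the preference order `["type", "clothing", "position"]` and target `"d3"`, the sketch first selects the type (removing d2) and then the clothing (removing d1), and never reaches the position attribute. Because attributes are never retracted once selected, the result depends on the order in `pref`.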

Relational Descriptions

Context Dependency, Vagueness, and Gradability

Given that type and height identify the referent uniquely, this set of attributes can simply be realized as "the man who is 180 cm tall." But other possibilities exist. The new descriptor can be realized as "the tallest man" or simply as "the tall man" (provided the referent's height exceeds a certain minimum value).

Degrees of Salience and the Generation of Pronouns

They propose to assign individual salience weights (sws) to the objects in the domain and to reinterpret referring expressions such as "the man" as referring to the currently most salient man. The algorithm will select ⟨type, man⟩, which excludes the only distractor d2, leading to a successful reference ("the man").
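One way to picture this reinterpretation is the minimal Python sketch below. The salience values, the domain, the choice of d3 as the target, and both function names are assumptions made for illustration; the sketch only captures the idea that an object counts as a distractor when it is at least as salient as the target.

```python
# Hypothetical domain and salience weights (assumptions for illustration).
SAL_DOMAIN = {
    "d1": {"type": "man"},
    "d2": {"type": "woman"},
    "d3": {"type": "man"},
}
SALIENCE = {"d1": 0, "d2": 10, "d3": 10}


def salient_distractors(target, domain, salience):
    """Objects other than the target that are at least as salient as it."""
    return {
        obj for obj in domain
        if obj != target and salience[obj] >= salience[target]
    }


def most_salient_match(description, domain, salience):
    """Return the unique most salient object fitting the description.

    description is a list of (attribute, value) pairs. Returns None when
    nothing matches or when several matches tie for the highest salience.
    """
    matches = [
        obj for obj, props in domain.items()
        if all(props.get(a) == v for a, v in description)
    ]
    if not matches:
        return None
    top = max(salience[obj] for obj in matches)
    best = [obj for obj in matches if salience[obj] == top]
    return best[0] if len(best) == 1 else None
```

With these weights, d1 is less salient than d3 and so does not compete with it. The only distractor for d3 is d2, and the single property ⟨type, man⟩ is enough for `most_salient_match` to return d3.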

Beyond Content Determination

For example, a text might begin with "The new president applauded the old president." From this alone, the algorithm has to figure out whether it can speak of "the old president" (or some other appropriate noun) in the next sentence without risking misinterpretation by the reader. Some REG studies have taken a different approach, mixing content determination and surface realization (Horacek 1997; Stone and Webber 1998; Krahmer and Theune 2002; Siddharthan and Copestake 2004), which goes against the pipeline architecture (Mellish et al. 2006). One president may be "old" in the sense of former while another is "old" in the sense of aged, in which case "the old president" may become ambiguous between the two people.

In this situation, "the old president" seems clear enough, because only one of the two interpretations justifies the definite article (namely, the one where "old" should be understood as "former"). Similar ambiguities occur in conjoined references to plurals, as in "the old men and women", where "old" may or may not apply to the women. These issues have been examined in a systematic study of the ambiguities that arise in coordinated expressions of the form "the Adjective Noun and Noun", asking when such expressions give rise to actual comprehension problems and when they should be avoided by a generator (Chantree et al. 2005; Khan, van Deemter, and Ritchie 2008).

Discussion

Kopp, Bergmann, and Wachsmuth (2008) are more ambitious, modeling different types of pointing gestures and integrating their approach with the generation strategy of Stone et al.

REG Frameworks

REG Using Graph Search
REG Using Constraint Satisfaction
REG Using Modern Knowledge Representation
Discussion

Intuitively, such a distinguishing graph can be "placed" over the target node with its associated edges, and not over any other node in the scene graph. The leftmost one, which could be realized as "the man", fails to distinguish our target, because it can be "placed" over the scene graph in two different ways (over nodes 1 and 3). In the graph-based perspective, relations are treated in the same way as individual properties, and there is no risk of running into infinite loops ("the cup to the left of the saucer to the right of the cup ...").
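The notion of "placing" a description graph over the scene graph can be sketched as a brute-force subgraph matching test in Python. The scene below, the encoding of properties as self-loops, and the function name `placements` are assumptions for illustration; an actual graph-based REG system would also search over candidate description graphs and use cost functions, which this sketch leaves out.

```python
from itertools import permutations

# Hypothetical scene graph: edges are (source, label, target) triples,
# with properties encoded as self-loops (an assumption for illustration).
SCENE_NODES = [1, 2, 3]
SCENE_EDGES = {
    (1, "man", 1),
    (2, "woman", 2),
    (3, "man", 3),
    (3, "t-shirt", 3),
    (1, "left_of", 2),
    (2, "left_of", 3),
}


def placements(desc_edges, root, scene_edges=SCENE_EDGES, scene_nodes=SCENE_NODES):
    """Return the scene nodes over which the description's root can be placed.

    A placement is an injective mapping from description nodes to scene
    nodes such that every description edge is also an edge of the scene.
    """
    desc_nodes = {root}
    for source, _, target in desc_edges:
        desc_nodes.update((source, target))
    others = [n for n in desc_nodes if n != root]
    hits = set()
    for candidate in scene_nodes:
        rest = [n for n in scene_nodes if n != candidate]
        for image in permutations(rest, len(others)):
            mapping = {root: candidate, **dict(zip(others, image))}
            if all((mapping[s], label, mapping[t]) in scene_edges
                   for s, label, t in desc_edges):
                hits.add(candidate)
                break
    return hits


the_man = [("x", "man", "x")]
the_man_left_of_the_woman = [
    ("x", "man", "x"), ("x", "left_of", "y"), ("y", "woman", "y"),
]
```

A description graph is distinguishing for a target exactly when `placements` returns the singleton set containing that target. In this hypothetical scene, `the_man` can be placed over nodes 1 and 3 and therefore fails, whereas the relational graph can only be placed over node 1.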

For example, from 'chair a is in room b', 'room b is in apartment c', and 'apartment c is in house d', the transitivity of 'in' allows us to infer that 'chair a is in house d'. Now the REG task can be formalized as: Given a model M and a target set S ⊆ D, look for a Description Logic formula ϕ such that ⟦ϕ⟧ = S. Three Description Logic expressions serve as the counterparts of the referring graphs in Figure 5. Finally, by using more expressive fragments of Description Logic, it becomes possible to identify objects that previous REG algorithms were unable to identify, as when we say "the man who owns three dogs" or "the man who only kisses women."
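Both ideas in this paragraph, inference over a transitive relation and checking whether a formula denotes exactly the target set, can be sketched in Python. The tuple encoding of formulas, the tiny model, and the function names are assumptions for illustration; this is not a Description Logic reasoner and it covers only atoms, conjunction, and the two role restrictions.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def interpret(formula, domain, concepts, roles):
    """Return the set of domain objects denoted by a formula.

    Formulas are nested tuples (a hypothetical encoding):
    ("atom", C), ("and", f, g), ("exists", R, f), ("forall", R, f).
    """
    kind = formula[0]
    if kind == "atom":
        return set(concepts.get(formula[1], set()))
    if kind == "and":
        return (interpret(formula[1], domain, concepts, roles)
                & interpret(formula[2], domain, concepts, roles))
    pairs = roles.get(formula[1], set())
    fillers = interpret(formula[2], domain, concepts, roles)
    if kind == "exists":
        return {x for x in domain if any((x, y) in pairs for y in fillers)}
    if kind == "forall":
        return {x for x in domain
                if all(y in fillers for (s, y) in pairs if s == x)}
    raise ValueError(f"unknown formula kind: {kind}")


# 'in' facts from the running example.
IN_FACTS = {("chair_a", "room_b"), ("room_b", "apartment_c"),
            ("apartment_c", "house_d")}

# Hypothetical model for "the man who only kisses women".
DL_DOMAIN = {"m1", "m2", "w1", "w2"}
DL_CONCEPTS = {"man": {"m1", "m2"}, "woman": {"w1", "w2"}}
DL_ROLES = {"kisses": {("m1", "w1"), ("m2", "w2"), ("m2", "m1")}}
ONLY_KISSES_WOMEN = ("and", ("atom", "man"),
                     ("forall", "kisses", ("atom", "woman")))
```

In this model, m2 also kisses a man, so the formula denotes only m1 and would serve as a referring formula for the target set {m1}. Note that the universal restriction is vacuously satisfied by individuals who kiss nobody, which is why the conjunction with `man` matters here.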

Evaluating REG

Corpora for REG Evaluation

The referring expressions in a subset of these stories were analyzed by Passonneau (1996), who asked how best to predict the form of the re-descriptions (such as 'he', 'she', and 'the man') in these stories, comparing "informational" considerations (which are at the heart of most algorithms in the tradition started by Dale and Reiter, as we have seen) with considerations based on Centering Theory (Grosz, Joshi, and Weinstein 1995). Referring expressions are routinely produced in this task to refer to the landmarks on the maps ("the cliff"). All objects had the same shape and size, so targets could only be distinguished by their color (green or purple) and their location on the surface ("the green cone at the bottom left").

Referring expressions ("the pink drawer in the first row, third column") are again used in this corpus for identification purposes only. Viethen and Dale (2008) found that spatial relations were often used ("the ball in front of the cube"), even though they were never required for identification. Example trials from the TUNA corpus: a singular trial for the furniture domain ("the little blue fan", left) and a plural trial for the people domain ("the men with glasses", right).

Evaluation Metrics

Consider the two (roughly equivalent) descriptions "palomino" and "a horse with a golden coat and a white mane and tail." A direct count of shared attribute-value pairs would yield an overlap score of zero, which would be misleading, since both descriptions express essentially the same content, with the former combining in one property all the properties expressed in the latter. One well-known distance metric between strings that has also been proposed for evaluating REG is the Levenshtein (1966) distance: the minimum number of insertions, deletions, and substitutions required to convert one string into another, preferably normalized by length (Bangalore, Rambow, and Whittaker 2000). The BLEU (Papineni et al. 2002) and NIST (Doddington 2002) metrics, which derive from machine translation evaluation, have also been proposed for REG evaluation.
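The Levenshtein distance just described can be computed with a standard dynamic-programming routine, sketched here in Python at the character level. The normalization by the length of the longer string is one common choice and is an assumption here; it is not necessarily the normalization used in the cited work.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn string a into string b (character level)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_levenshtein(a, b):
    """Distance divided by the length of the longer string (an assumed
    normalization); 0.0 means identical, 1.0 means maximally different."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0
```

The same routine works on word sequences if the inputs are lists of tokens rather than strings, which changes what counts as a single edit.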

It is not obvious, for example, that a smaller Levenshtein distance is always preferable to a larger one; the descriptions "man wearing a t-shirt" and "woman wearing a t-shirt" are only a Levenshtein distance of 2 apart, but only the former would be a good description for target d3. On the other hand, "the male person on the right" is at a Levenshtein distance of 15 from "the man wearing a t-shirt", and both are perfect descriptions of d3. How easily would they be able to find it, based on the given sentence?") and Fluency ("How fluent is this description?").

Discussion

After participants read this description, a scene appeared and participants were asked to click on the intended target. Moreover, tasks such as the one on which the TUNA corpus is based can be argued not to be "ecologically valid": human participants produce typed expressions for an imaginary audience on the basis of abstract visual scenes. The effects of these limitations on the descriptions produced are partly unknown, although some reassuring results have recently been obtained.

For example, speakers addressing an imaginary audience have been shown to refer in a similar way to those addressing a physically present audience (van der Wege). Speaking instead of typing has likewise been shown not to affect the type and number of attributes in the resulting referring expressions, although speakers usually use more words than typists to convey the same amount of information. GREC (Belz et al. 2010) focuses on the task of deciding what form a referring expression should take in a textual context, which is important for generating coherent texts such as summaries (see also Section 6). GIVE (Koller et al. 2010) focuses on the generation of instructions in a virtual 3D environment, where reference is only one task among many others.

Open Issues

How Do We Match an REG Algorithm to a Particular Domain and Application?
How Do We Move beyond the “Paradigms” of Reference?
How Do We Handle Functions of Referring Expressions Other than Identification?
How Do We Generate Suitable Referring Expressions in Interactive Settings?
What Is the Impact of Visual Information?
What Knowledge Representation Framework Suits REG Best?

Gatt and Belz (2010) note that the nature of the TUNA data may be partially responsible. Regeneration of referring expressions is a potentially attractive way to recover some of the coherence of the source document (Nenkova and McKeown 2003). One well-performing entry (Hendrickx et al. 2008) predicted the correct type of referring expression 76% of the time, using a memory-based learner.

This model offers a new way of thinking about interactive REG and the role therein of REG algorithms of the kind discussed in this survey. When discussing visual scenes, most REG researchers assume that some of the relevant visual information is stored in a database (compare our example visual scene in Figure 1 and its database representation in Table 1). For example, how do people generate referring expressions of the kind highlighted by the work of Poesio and colleagues?

General Conclusion and Discussion

New Complexities

Human-Likeness and Evaluation

In NLG systems whose main purpose is to be practically useful, for example, it may be more important for referring expressions to be clear than to be human-like in all respects. If usefulness or success (Garoufi and Koller 2011), rather than human-likeness, is the criterion, a different type of evaluation test must be used. A variety of listener-oriented tests are beginning to be used in recent REG research (Paraboni, van Deemter, and Masthoff 2007; Khan, van Deemter, and Ritchie 2008), but evaluation of REG algorithms (and of NLG in general) remains difficult (see, e.g., Oberlander 1998; Belz 2009; and Gatt and Belz 2010).

It is one thing for a REG algorithm to use logical quantification to generate a fairly simple description, such as "the woman who owns four cats," but quite another to generate a highly complex description ("the woman who owns four cats chased by between three and six dogs, each fed only by men"), which can be generated using the same methods. There are difficult methodological questions to be answered here about whether the purpose of the generator is to model human competence or human performance. And if it is performance that needs to be modeled, then this raises the question of what types of complexities are exploited by human speakers, and what types of complexities are comprehensible to human hearers.

Widening the Scope of REG Algorithms

In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL), Columbus, OH. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), pages 55–59, Uppsala. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL), pages 206–213, Madrid.

In Proceedings of the 21st International Conference on Computational Linguistics (COLING) and 44th Annual Meeting of the Association for Computational Linguistics (ACL), Sydney. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pages 311–318, Philadelphia, PA. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL), pages 407–414, Barcelona.