Summary of the panel "Figurative Language in WordNets and other Lexical Resources, and their Applications" at the Second Global Wordnet Conference 2004 (Brno, Czech Republic, January 21, 2004)

The panel addressed several questions pertaining to the representation of figurative language in lexical resources and to the prospect of enhancing information (coverage and/or linking) on figurative language use, especially metaphors, in wordnets. It was organized by Antonietta Alonge and Birte Lönneker and chaired by Piek Vossen. Please refer to the Web pages of the conference (http://www.fi.muni.cz/gwc2004/) for more information, including the conference program, proceedings and archived presentation slides.

In her opening statement, Christiane Fellbaum pointed out the difference between conventionalized metaphors (e.g. Engl. tiger in the 'fierce person' sense) and ad-hoc metaphors such as 'my work is a jail', which are created by language users on the fly. Some metaphors of the first kind are already represented in Princeton WordNet synsets, for example {tiger} glossed as "a fierce or audacious person". Inspired by the class inclusion theory (Glucksberg/Keysar 1990), Christiane Fellbaum explained the second kind of metaphorical language use as a mechanism of class creation: in our example, the creation of a (non-lexicalized) ad-hoc superordinate like {confinement, lack of freedom,...}, of which the metaphorical interpretation of jail is a prototypical member. The ad-hoc class creation relies on similarity and analogy between the two basic classes s1={job, place_of_work,...} and s2={jail, prison,...} in a given context (that of freedom). Christiane Fellbaum is currently heading a research project in which subjects estimate the likelihood that a WordNet synset evokes others. The results promise to provide more insight into the perception of concept similarity and thus eventually into the mechanics of metaphor production and comprehension.
Wim Peters had prepared a contribution on building and extending knowledge fragments using WordNet, which was delivered in his absence by Paul Clough. The contribution showed that a great amount of implicit information is available in the hierarchical structure and in the glosses associated with each synset. His research starts with the detection of regular polysemy patterns resulting from figurative language use. The set of instances of systematic lexicalised figurative language use forms the basis for building knowledge fragments. By parsing and analyzing WordNet and EuroWordNet synsets, glosses and relations, Wim Peters showed that verbs that occur between pairs of words from different sets or classes can be taken as semantic relations between the word senses. For example, the analysis of the WordNet glosses yields 'master' as a significant relation between PROFESSION and DISCIPLINE. The triple 'PERSON speak LANGUAGE' directly illustrates a regular polysemy pattern instantiated in WordNet in synsets containing words like Tatar, Assyrian and Hopi. Other polysemy patterns can be explained by means of long chains of relations. For example, the chain "PERSON make/accomplish MUSIC, MUSICIAN is-a PERSON, MUSICIAN play MUSIC, MUSIC accompany ACTIVITY, DANCING is-a ACTIVITY" explains the regular polysemy between designations of MUSIC and DANCE subconcepts like tango and bolero.

Eneko Agirre presented some results of sense clustering experiments, where all WordNet synsets containing the same "literal" (lexical unit) were interpreted as different senses of a polysemous word. For example, the word channel has seven senses, because it occurs in seven different synsets. Sense clustering is useful for reducing the number of senses of polysemous items in very fine-grained lexical resources, which might improve Word Sense Disambiguation.
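
The view of polysemy used in both contributions above (a word has as many senses as there are synsets containing it, and a regular polysemy pattern is a pair of classes shared by several words) can be illustrated with a minimal sketch. The inventory below is a hypothetical toy list, not real WordNet data, and the function names and the class labels attached to each synset are assumptions made for illustration only; this is not the procedure used by Wim Peters or Eneko Agirre.

```python
from collections import defaultdict

# Hypothetical toy inventory, NOT real WordNet data:
# each entry is (synset id, coarse semantic class, literals in the synset).
TOY_SYNSETS = [
    ("n01", "PERSON", ["Tatar"]),
    ("n02", "LANGUAGE", ["Tatar"]),
    ("n03", "PERSON", ["Hopi"]),
    ("n04", "LANGUAGE", ["Hopi"]),
    ("n05", "MUSIC", ["tango"]),
    ("n06", "DANCE", ["tango"]),
    ("n07", "MUSIC", ["bolero"]),
    ("n08", "DANCE", ["bolero"]),
    ("n09", "BUILDING", ["jail", "prison"]),
]


def senses_by_literal(synsets):
    """Map each literal to the synsets (senses) in which it occurs."""
    index = defaultdict(list)
    for synset_id, sem_class, literals in synsets:
        for literal in literals:
            index[literal].append((synset_id, sem_class))
    return dict(index)


def polysemy_patterns(synsets, min_words=2):
    """Return class pairs that are shared by at least `min_words` literals."""
    patterns = defaultdict(set)
    for literal, senses in senses_by_literal(synsets).items():
        classes = sorted({sem_class for _, sem_class in senses})
        # Record every unordered pair of classes this literal belongs to.
        for i, first in enumerate(classes):
            for second in classes[i + 1:]:
                patterns[(first, second)].add(literal)
    return {
        pair: sorted(words)
        for pair, words in patterns.items()
        if len(words) >= min_words
    }
```

On this toy data, polysemy_patterns(TOY_SYNSETS) reports the class pair LANGUAGE/PERSON for Hopi and Tatar and the pair DANCE/MUSIC for bolero and tango, while jail, which occurs in a single synset, counts as having one sense.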
In the sense clustering experiments performed by Eneko Agirre and Oier Lopez de Lacalle, senses of 16 polysemous nouns were clustered using various automatic methods based on example sentences (contexts) extracted from the Web, or on so-called Topic Signatures, which can be derived from document collections in which only one sense of the word occurs. The clustering method based on topic signatures successfully distinguishes, for instance, between the "literal (basic)" sense and the "metaphorical" senses of the word channel: the 'body of water that allows a passage for vessels' sense forms a cluster of its own, while more abstract senses like 'television channel' and 'transmission channel' are grouped together in another cluster. In comparison with sense clustering performed by humans (Senseval-2), intermediate cases like 'a passage for water or other fluids to flow through' present a problem: while humans clustered this sense together with the literal sense, the automatic procedure attached it to the cluster containing the metaphorical senses. Eneko Agirre concluded with the wish for more indications, directly in the lexical resource, about the nature of the relations or about the similarity/dissimilarity between different word senses.

Julio Gonzalo showed in his talk that sense distinctions and sense clusters are not absolute. Rather, different applications need distinct clusters, which means that they should exploit the fine-grained sense distinctions in wordnets in different ways. For example, many conventionalized metaphors can be observed across several languages (as, for instance, fight in the sense of 'intense verbal dispute') and would not need to be distinguished in Machine Translation, but they occur in different contexts and documents and should thus be distinguished in Information Retrieval (IR).
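
The idea of clustering the senses of channel by their topic signatures, as described above, can be made concrete with a small sketch. Everything here is an assumption made for illustration: the signatures are hypothetical word weights, not data from the experiments, and cosine similarity with a single-link threshold stands in for whatever clustering methods Eneko Agirre and Oier Lopez de Lacalle actually used.

```python
import math

# Hypothetical topic signatures (word -> weight) for three senses of "channel".
# These numbers are invented for illustration, not experimental data.
TOY_SIGNATURES = {
    "waterway": {"ship": 3.0, "sea": 2.0, "water": 2.0},
    "television": {"broadcast": 3.0, "signal": 2.0, "station": 1.0},
    "transmission": {"signal": 3.0, "broadcast": 1.0, "data": 2.0},
}


def cosine(a, b):
    """Cosine similarity between two sparse word-weight vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_senses(signatures, threshold):
    """Single-link clustering: two senses end up in the same cluster
    if a chain of pairwise similarities >= threshold connects them."""
    clusters = []
    for sense, signature in signatures.items():
        # Find every existing cluster the new sense is linked to.
        linked = [
            cluster for cluster in clusters
            if any(cosine(signature, signatures[other]) >= threshold
                   for other in cluster)
        ]
        merged = {sense}
        for cluster in linked:
            merged |= cluster
            clusters.remove(cluster)
        clusters.append(merged)
    return clusters
```

With a threshold of 0.5, cluster_senses(TOY_SIGNATURES, 0.5) keeps the waterway sense in a cluster of its own and groups the television and transmission senses together, because only the latter two signatures share vocabulary. A threshold-based single-link scheme was chosen here only because it is short to write down; the choice of similarity measure and threshold would have to be tuned for any real sense inventory.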
Experiments performed by Irina Chugur and Julio Gonzalo showed that explicit information on the kind of polysemy could add substantial information to sense clustering. They showed that polysemy based on sense extension justifies the clustering of word senses for IR, because the senses tend to occur in similar documents, while polysemy based on metaphor or on homonymy should be preserved in IR, as the distinct senses belong to different contexts and documents. The metonymy relation was ambiguous with regard to the need for sense clustering in IR. In the semantic tagging of relations between senses of 1,000 WordNet nouns, metonymy and metaphor turned out to be among the most frequent relations: metonymy was identified between 32.5% of the sense pairs, while metaphor accounted for relations between 13% of the sense pairs. Julio Gonzalo proposed a schema along which metonymy could be further subdivided, especially into target-in-source (e.g. animal-meat) and source-in-target (e.g. material-product) groups, and concluded that an annotation of relations between word senses is very useful for applications.

John Barnden reported on insights from his work on ATT-Meta, an implemented reasoning system designed to work out the significance of metaphorical utterances. Individual metaphors generally rest on conceptual metaphor schemas (Lakoff 1993). Each schema maps between a source domain (e.g. PHYSICAL SPACE; PHYSICAL OBJECTS; ...) and a target domain (e.g. THE MIND; IDEAS; ...) and contains a set of links that put individual aspects of the two domains in correspondence with each other. John Barnden is mainly interested in metaphorical utterances that rest on such metaphor schemas, but nevertheless go beyond them by using unmapped elements of the source domains. His ATT-Meta method copes with this phenomenon by means of complex reasoning within the terms of the source domains rather than by the creation of new links.
Relatedly, only very general mappings are needed between source and target. John Barnden therefore argued for a relatively sparing use of source-target mappings in lexical resources like wordnets, but for such resources to be able to facilitate reasoning within source domains. He also held that while it is beneficial to include conventionalized metaphorical phrases in wordnets, such phrases can be richly varied in ways that require ATT-Meta-style reasoning. A more general insight was that metaphor does not necessarily involve what we would normally view as qualitatively different domains, so the inclusion of domain markers within a wordnet can be at best a heuristic guide for metaphor handling. Finally, John Barnden evoked the relativity of metaphor: matters such as whether an utterance is metaphorical and what metaphorical schemas it rests on are not objective, and are instead determined by what lexical senses and metaphorical schemas a particular user possesses.

Antonietta Alonge then presented a possible way of representing information on metaphors in wordnets. The method, which had been designed by Antonietta Alonge and Birte Lönneker, relies on the above-mentioned theoretical framework of Conceptual Metaphor (Lakoff/Johnson 1980), in which individual conventionalized metaphors are instantiations of broader mapping schemas between concrete source domains and abstract target domains. For example, an Italian corpus shows that the following words illustrate the general mapping ARGUMENT/DISCUSSION IS WAR: difendere (to defend), attaccare (to attack), lottare (to fight), vincitore (winner), guerra (war), as they can all be used figuratively in a context referring to arguments or discussions rather than to war. The corpus furthermore shows that this is a normal way to talk about arguments. Problems in WordNet and EWN concern 1) the lack of data or inconsistency in representing the metaphoric uses (i.e. the conventionalized individual mappings): not all the established metaphoric senses are encoded (for example, lottare in this sense is missing from ItalWordNet); 2) the lack of structure: there is no connection between literal and related metaphoric senses and consequently no way to connect source and target domains. After discussing the possible enhancement of wordnets by language-internal relations between word senses (e.g. by a DERIVED_FROM_LITERAL relation), Antonietta Alonge presented a way of using a composite interlingual index, which had already been introduced in EuroWordNet to account for metonymic sense extensions, to make possible the disambiguation of infrequent or novel metaphoric senses that fit into the general mapping schema.

Patrick Hanks started his reflections on WordNet in general by looking at some examples of figurative language use. For instance, the inclusion of a sense of the word mousetrap, referring to a special situation in American Football, is not justified by corpus material. According to Patrick Hanks, lexical resources should include information on the norm (i.e. conventions), rather than on dynamic language use ("exploitations", including ad-hoc metaphors). He also observed that some verb senses in WordNet were either included in the wrong synsets or lacked correct information on phraseology. For example, the correct use of the metaphorical sense of plow (in the sense of 'cover' or 'treat', which are in the same synset) requires a PP-complement introduced by through, as in "This book plows through all aspects of lexicography." Patrick Hanks argued that a great deal of corpus pattern analysis is needed in order to revise the "pre-corpus" work of the WordNet structure (synsets and relations like hyperonymy) in general, and to achieve a closer match between the lexical resource and actual conventional language use.
The presentations were followed by a lively discussion between the panel participants and the other participants of the Global Wordnet Conference. Some of the questions raised concerned the degree to which machines should understand and produce metaphors: while it seems useful to understand all kinds of figurative language automatically, the idea of an automatic metaphor generation system might seem frightening. A related topic is the involvement of emotion in figurative language, but many questions remain open: which information on emotional aspects can be included in the lexicon, and how? Should we distinguish between different kinds of emotions, or between different degrees of emotionality?

A practical question concerned the use of information on metaphors for Word Sense Disambiguation (WSD). Conference participants were interested in the degree to which errors in WSD are caused by ad-hoc metaphors. The panel participants estimated that pure ad-hoc metaphors outside the conventionalized metaphorical mapping schemas occur by definition very rarely in texts and therefore should not cause a high percentage of errors. This estimate was later confirmed by conference participants who had been working on sense assignment to the Prague Treebank.

The general results of the panel and discussion were that it is useful both for humans and for applications to include information on conventionalized figurative language and on sense relations in the lexicon, as long as the distinction between conventionalized and ad-hoc metaphors (corresponding largely to norms and exploitations) is respected, and that general similarity estimations based on contexts could further help in doing so. What remains open is how "intermediate cases" in the continuum of figurativity should be treated, for example novel metaphors resting on a pre-existing conceptual mapping schema.

Comments are welcome to the Global WordNet Association mailing list (gwa@lists.ut.ee).
You can subscribe to the GWA mailing list by sending a mail with the message "subscribe gwa@lists.ut.ee" to kvider@psych.ut.ee.

Birte Lönneker, 6 February 2004 (birte.loenneker@uni-hamburg.de)

Thanks to Antonietta Alonge and John Barnden for comments on a previous version.