EuroWordNet top-ontology
To maximize the uniform encoding of the wordnets, the Base Concepts (BCs) have been classified using a Top Ontology specifically designed
for this purpose. The Top Ontology is based on existing linguistic classifications
and is adapted to represent the diversity of the Base Concepts. The Top Concepts can be
applied either disjunctively or conjunctively. In the latter case it is
possible to get complex clusters of features, such as Container+Part+Object+Natural,
which could apply to "seed case". The BC
Clustering Overview Image shows some examples of how BCs can be clustered.
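As an illustration of conjunctive clusters, the following minimal Python sketch splits a cluster label into its individual Top Concepts and tests whether a cluster carries a given set of features. The function names parse_cluster and has_all are hypothetical, and the sketch assumes that clusters are written with "+" as the separator, as in the example above.

```python
from __future__ import annotations


def parse_cluster(label: str) -> frozenset[str]:
    """Split a conjunctive cluster label into its individual Top Concepts.

    Assumption: Top Concepts are joined with '+', as in
    'Container+Part+Object+Natural'.
    """
    return frozenset(part.strip() for part in label.split("+") if part.strip())


def has_all(cluster: frozenset[str], required: set[str]) -> bool:
    """Return True if every required Top Concept occurs in the cluster."""
    return set(required) <= cluster


# The "seed case" example from the text.
seed_case = parse_cluster("Container+Part+Object+Natural")
print(sorted(seed_case))
print(has_all(seed_case, {"Container", "Natural"}))
```

Representing a cluster as a set makes the order of the Top Concepts irrelevant, so Container+Part and Part+Container compare as equal.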
The first level of the Top Ontology is divided into three
types:
- 1stOrderEntity (roughly corresponding to concrete, perceivable
objects and substances)
- 2ndOrderEntity (states, situations and events)
- 3rdOrderEntity (mental entities such as ideas, concepts,
knowledge)
The data is available as:
- Top Concept Ontology: 64 Top Concepts linked to the Inter-Lingual-Index
- EuroWordNet database format: EuroWordNet database created
from the previous file.
- Included in the EuroWordNet database.
- The common Base Concepts and their classifications in terms
of the Top Ontology.
The Base Concepts are specified in terms of WordNet1.5
synsets (identified by their file offset position+pos, e.g. 00123456-n).
The classification in terms of Top Concepts is available in 2 formats:
- Flat ASCII files for the 1stOrderEntities,
2ndOrderEntities and 3rdOrderEntities,
listing the Base Concepts and the cluster of Top Concepts that applies.
Additional information is provided in the form of a domain label, a gloss,
the hyperonym in WordNet1.5, and one synset member from the synset (the sense
numbers do not correspond to the sense numbers in the WordNet1.5 database).
Fields are separated by TABs. A fixed number of fields is provided per
line: 9 for 1stOrderEntities and 22 for 2ndOrderEntities and 3rdOrderEntities.
- Flat ASCII files for the 1stOrderEntities,
2ndOrderEntities and 3rdOrderEntities,
listing the TopConcept combinations that occurred, followed by all Base
Concepts that belong to these clusters.
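The first flat file format described above (TAB-separated fields, a fixed number of fields per line, synsets identified as offset+pos) can be read with a few lines of Python. This is only a sketch: the order of the fields is not specified here, so the code checks the field count and looks for synset identifiers by pattern instead of by column position. The names split_record and find_synset_ids are hypothetical, and the pattern assumes an 8-digit offset followed by one of the part-of-speech letters n, v, a or r.

```python
import re

# Field counts per line, as stated in the description of the flat files.
EXPECTED_FIELDS = {
    "1stOrderEntity": 9,
    "2ndOrderEntity": 22,
    "3rdOrderEntity": 22,
}

# Assumption: 8-digit file offset, a hyphen, then a pos letter (n, v, a, r),
# following the example 00123456-n.
SYNSET_ID = re.compile(r"^\d{8}-[nvar]$")


def split_record(line, entity_type):
    """Split one TAB-separated line and verify the fixed field count."""
    # Strip only the line ending, so empty trailing fields are preserved.
    fields = line.rstrip("\r\n").split("\t")
    expected = EXPECTED_FIELDS[entity_type]
    if len(fields) != expected:
        raise ValueError(
            f"expected {expected} fields for {entity_type}, got {len(fields)}"
        )
    return fields


def find_synset_ids(fields):
    """Return all fields that look like WordNet1.5 synset identifiers."""
    return [f.strip() for f in fields if SYNSET_ID.match(f.strip())]


# A made-up 9-field line; only the identifier format comes from the text.
example = "\t".join(["00123456-n"] + ["placeholder"] * 8) + "\n"
record = split_record(example, "1stOrderEntity")
print(find_synset_ids(record))
```

Checking the field count on every line catches truncated or wrapped records early, which is the main practical benefit of the fixed-width layout.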
In addition to classifying the 1024 Common Base Concepts,
we have also constructed a reduced set of 164 Core Base Concepts
that occur in 3 or more wordnets as important meanings. They can be accessed
separately. They are listed as WordNet1.5 synsets with glosses, their WordNet1.5
hyperonym and the EuroWordNet Top Concepts that have been applied. Clicking
on the links activates either the EuroWordNet Top Ontology or the WordNet1.5
hyponymy tree. Finally, we have reduced the 164 Core Base Concepts
to 71 Base Types. The reduction involved removing unbalanced hyponyms
(when both the hyperonym and hyponym are present but no other co-hyponyms)
and replacing closely related synsets (e.g. act and action)
by a single Type. The Base Types can be seen as a minimal list of fundamental
concepts (semantic primitives or taxonomy tops). For each Base Type we
have provided the mapping to the Core Base Concepts that it represents.
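The two reduction steps can be sketched in Python as follows. This is one possible reading of the description, not the procedure that was actually used to derive the 71 Base Types: a hyponym is treated as unbalanced when its hyperonym is in the selected set and none of its co-hyponyms are. The function names and the toy hierarchy (a, b1, b2, ...) are hypothetical; only the act/action pair comes from the text, and the choice of "act" as the representative is an assumption.

```python
def drop_unbalanced_hyponyms(selected, parent_of, children_of):
    """Remove concepts whose hyperonym is selected but whose co-hyponyms are not."""
    selected = set(selected)
    kept = set()
    for concept in selected:
        parent = parent_of.get(concept)
        if parent in selected:
            siblings = children_of.get(parent, set()) - {concept}
            if not (siblings & selected):
                # Hyperonym present, no co-hyponym present: unbalanced.
                continue
        kept.add(concept)
    return kept


def merge_related(concepts, merged_into):
    """Map each Base Type to the set of concepts it represents."""
    mapping = {}
    for concept in concepts:
        base_type = merged_into.get(concept, concept)
        mapping.setdefault(base_type, set()).add(concept)
    return mapping


# Toy hierarchy with hypothetical identifiers.
parent_of = {"b1": "a", "b2": "a", "d1": "c", "d2": "c"}
children_of = {"a": {"b1", "b2"}, "c": {"d1", "d2"}}

# b1 is dropped (its co-hyponym b2 is absent); d1 and d2 balance each other.
print(sorted(drop_unbalanced_hyponyms({"a", "b1", "c", "d1", "d2"}, parent_of, children_of)))

# Closely related synsets are replaced by a single Type.
print(merge_related({"act", "action"}, {"action": "act"}))
```

The dictionary returned by merge_related corresponds to the mapping from each Base Type to the Core Base Concepts that it represents.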