In recent years the creation of a computer thesaurus of Russian, similar in structure and functionality to the WordNet thesaurus, has attracted considerable interest. Such thesauri offer ample opportunities for investigating semantic relations between the meanings of words in a natural language. Unfortunately, the lexical coverage of such thesauri for languages other than English remains limited, despite considerable efforts to expand the synsets and their interrelations (a synset is the basic semantic unit of WordNet: a set of English words that encode a single meaning). Hence there is a need for automated extraction of lexical-semantic relations from existing sources such as text corpora or explanatory dictionaries. We apply the methods of formal concept analysis (FCA) to this problem. We develop methods that use bilingual (English-Russian) dictionaries as the source of the formal context and then construct a conceptual network representing ontological relations within the class of Russian adjectives.

In this paper we describe the semantic paradigm of adjectives that characterize a person's appearance. Words in this group are rather frequent: большой (big) - 1631 ipm (items per million), хороший (good) - 854 ipm, старый (old) - 528 ipm, белый (white) - 493 ipm [9], etc. This group was also chosen because of its importance for specifying the system relations of the Russian evaluative lexicon, notions about types of lexical meanings, features of connotation, standard lexical associations [3], and understanding the structure of a fiction novel [6]. It is also important for language didactics, as a basis for creating various manuals for speech development and for teaching Russian to native speakers and foreigners, and for translating legal, psychological, and other documents. The investigation of adjective meanings is similar to the investigation of other parts of speech.
Several approaches are in use. Component analysis of adjectives draws on explanatory dictionaries. Corpus research is used to analyse the combinability of adjective-noun syntagms, which makes it possible to cluster adjectives as attributes of nouns for which some classification [12] has already been constructed. Direct field testing is used to reveal connotations, i.e. to narrow the set of possible syntagmatic partners (adjectives) of a given lexeme (noun) [4]. System relations in the lexicon are reflected in a thesaurus, where the lexical meaning of an adjective is frequently the same as that of a semantically similar verb or noun. WordNet records no hierarchical relations for adjectives comparable to hyponymy between nouns or troponymy between verbs. As a rule, no direct hypernym is indicated; instead the reference «Pertains to noun …» is given, so the hypernym of an adjective is often a noun. For example, for adjectives denoting size (big, small, narrow, spacious) the generic hypernym is the noun "size". In this paper we nevertheless expect to find hierarchical and other relations within the class of adjectives.

Formal concept analysis rests on the intuition that a concept has two sides: an extent, which contains some objects, and an intent, which includes all attributes shared by these objects. To analyse concepts formally, one first defines a formal context $K := (G, M, I)$, where $G$ is a set of objects, $M$ is a set of attributes, and $I \subseteq G \times M$ is a binary relation showing which attributes $m$ belong to which objects $g$. A formal context is conveniently presented as a table. Table 1 takes some Russian adjectives as objects and the set of their translations as attributes; if a Russian word, e.g. алчный, has the translation equivalent rapacious, the intersection of the corresponding row and column is marked with a cross (X).
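A formal context of this kind might be held in code as a mapping from each object to its set of attributes. The following is only a minimal sketch: Table 1 is not reproduced here, so apart from the pair алчный - rapacious mentioned in the text, the incidences are assumptions chosen to agree with the worked example in the text. The names `CONTEXT`, `incident` and `cross_table` are hypothetical.

```python
# Hypothetical fragment of a formal context K = (G, M, I).
# Only the pair (алчный, rapacious) is stated in the text; every other
# incidence below is an assumption made for illustration.
CONTEXT = {
    "хищный": {"ravening", "rapacious", "ravenous", "wolfish"},
    "жадный": {"ravening", "rapacious", "ravenous", "wolfish", "greedy"},
    "прожорливый": {"ravening", "rapacious", "ravenous", "voracious"},
    "алчный": {"rapacious", "greedy"},
}


def objects(context):
    """G: the set of objects (Russian adjectives)."""
    return set(context)


def attributes(context):
    """M: the set of attributes (English translation equivalents)."""
    return set().union(*context.values())


def incident(context, g, m):
    """True if (g, m) is in I, i.e. m is listed as a translation of g."""
    return m in context.get(g, set())


def cross_table(context):
    """Render the context as a text table, marking each pair in I with X."""
    cols = sorted(attributes(context))
    width = max(len(g) for g in context)
    lines = [" " * width + " | " + " | ".join(cols)]
    for g in sorted(context):
        cells = [
            ("X" if incident(context, g, m) else "").center(len(m))
            for m in cols
        ]
        lines.append(g.ljust(width) + " | " + " | ".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    print(cross_table(CONTEXT))
```

Storing one attribute set per object keeps a row lookup cheap and makes the set operations of FCA direct to express; a Boolean matrix would serve equally well for larger contexts.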
The derivation operators over the formal context are defined as follows:

$$X \subseteq G:\quad X \mapsto X' := \{\, m \in M \mid gIm \ \text{for all}\ g \in X \,\}$$
$$Y \subseteq M:\quad Y \mapsto Y' := \{\, g \in G \mid gIm \ \text{for all}\ m \in Y \,\}$$

In our example, let X := {ХИЩНЫЙ, прожорливый} and Y := {ravening, wolfish}. Then X' = {ravening, rapacious, ravenous}, Y' = {ХИЩНЫЙ, жадный}, and further X'' = {ХИЩНЫЙ, жадный, прожорливый}, etc. It can be shown that in general $X \subseteq X''$ and $X' = X'''$, and likewise $Y \subseteq Y''$ and $Y' = Y'''$.

A formal concept of the given formal context is a pair $(A, B)$ with $A = B'$ and $B = A'$; that is, $A$ is the set of objects that have all attributes in $B$, and $B$ is the set of attributes shared by all objects in $A$. The relation $\le$ establishes a partial order on the set $\mathfrak{B}(K)$ of all formal concepts of the context:

$$(A_1, B_1) \le (A_2, B_2) \iff A_1 \subseteq A_2 \ (\iff B_2 \subseteq B_1).$$

This relation is called the subconcept-superconcept relation. With it, $\mathfrak{B}(K)$ forms a complete lattice $\underline{\mathfrak{B}}(K)$, which can be depicted as a labelled directed graph (fig. 1). The nodes of this graph are the formal concepts, and the edges reflect the subconcept-superconcept relation.

Our technique was tested experimentally on the dictionary «Assessment of a person appearance» by Boguslavsky (hereinafter, the Dictionary), which contains more than 200 dominants and more than 1200 members of synonymic series of adjectives describing a person's appearance. In particular, for 603 adjectives some 1040 conceptual lattices with more than 2 attributes have been constructed. For each Russian adjective $a^r_i$, all English equivalents $a^e_{ij} = L_j(a^r_i)$ from the Dictionary, as contained in the lexical database (LDB), are listed. For every $a^e_{ij}$, the set of synsets $\{s_k\} = WN(a^e_{ij})$ containing $a^e_{ij}$ is determined. For each synset $s_k$, all Russian adjectives that are translation equivalents of the synset's elements are listed, and duplicates are discarded. This yields the set of objects $G$ and the set of attributes $M$ of the formal context $K$.
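The derivation operators and the notion of a formal concept can be illustrated with a short sketch. It assumes a small hypothetical context whose incidences were chosen only so that the values of X', Y' and X'' given above are reproduced; it is not the Dictionary data. The function names are illustrative, and the brute-force enumeration is a stand-in suitable for toy contexts, not the procedure used in the experiments.

```python
from itertools import combinations

# Hypothetical context: object -> set of attributes. The incidences are
# assumptions chosen to agree with the worked example in the text.
CONTEXT = {
    "хищный": {"ravening", "rapacious", "ravenous", "wolfish"},
    "жадный": {"ravening", "rapacious", "ravenous", "wolfish", "greedy"},
    "прожорливый": {"ravening", "rapacious", "ravenous", "voracious"},
    "алчный": {"rapacious", "greedy"},
}


def intent(context, objs):
    """X -> X': the attributes shared by every object in objs."""
    result = set().union(*context.values())  # start from all of M
    for g in objs:
        result &= context[g]
    return result


def extent(context, attrs):
    """Y -> Y': the objects that have every attribute in attrs."""
    attrs = set(attrs)
    return {g for g, ms in context.items() if attrs <= ms}


def concepts(context):
    """Enumerate all formal concepts (A, B) by closing every object subset.

    Exponential in |G|, so adequate only for small illustrative contexts.
    """
    found = set()
    objs = sorted(context)
    for r in range(len(objs) + 1):
        for subset in combinations(objs, r):
            b = intent(context, subset)
            a = extent(context, b)
            found.add((frozenset(a), frozenset(b)))
    return found


def is_subconcept(c1, c2):
    """(A1, B1) <= (A2, B2) holds iff A1 is a subset of A2."""
    return c1[0] <= c2[0]


if __name__ == "__main__":
    x = {"хищный", "прожорливый"}
    print(sorted(intent(CONTEXT, x)))                   # X'
    print(sorted(extent(CONTEXT, intent(CONTEXT, x))))  # X''
    for a, b in sorted(concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(a), sorted(b))
```

Each pair returned by `concepts` satisfies A = B' and B = A' by construction, because every candidate extent is obtained by applying the two operators in turn. Comparing the pairs with `is_subconcept` gives the order from which a lattice diagram could be drawn.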
At this stage we do not semantically separate inconsistent translation equivalents, which do occur: for example, large-handed is translated both as жадный (greedy) and as расточительный (wasteful). Nor do we select only the adjectives that concern a person's appearance; that selection is carried out later, when the constructed concept lattice is analysed. Some of the relations obtained coincide with those registered in the Dictionary, e.g. изящный (graceful) and тонкий (delicate), коварный (artful) and хитрый (sly); others are newly revealed or contradict the Dictionary, where, for example, the adjective ястребиный (hawk) is a hyponym of the adjective беличий (squirrel) (?).

Previous research confirms how complex the problem of revealing the semantic structure of adjectives is. Applying formal concept analysis (FCA) to this problem can usefully complement corpus-based methods, component analysis, and other approaches. We plan to develop the described methods further in order to extract hierarchical relations formally from the concept lattice. The proposed approach can also be extended to other semantic relations.