Intellectual Partner
INTELPART: Intellectual technologies in business


Preface

Part 1. Standard machine for comprehensive information search based on module of elementary sense

Part 2. Standard machine for search with context

Part 3. Search machine with time variant context

 Part 4. Machine of categories. Two machines interaction

Conclusion

Projects. Comprehensive information processing

Part 1. Standard machine for comprehensive information search based on module of elementary sense

The structure of a standard information search process can by convention be presented as follows:

(1) User with his demand for information -- (2) Request for the necessary information -- (3) Search machine transforming the user's request -- (4) Machine's immediate searching algorithm -- (5) Information space -- (6) Information found for the user

Where

  1. User with his demand for information.
  2. User's specifically presented request for the necessary information (user presenting his demand for information in the form of a request).
  3. Search machine transforming the user's request and forming a search pattern.
  4. Machine's immediate searching algorithm of identification of the search pattern in the information space.
  5. Information space: the space where search is carried out.
  6. Information found for the user.

A search machine operates on one of two basic principles. The first is search based on key words; the other is search based on analogies in one form or another.

A crucial drawback of a standard search machine is the following contradiction (contradiction of request):

  • if key words are used for information search, the search machine does not find information sources connected with the request by sense;
  • if analogies are used for information search, the number of information sources found grows without bound in the course of the search.

The problem of comprehensive search based on analogies lies in the fact that in the general case the number of possible analogies (relations) for any notion (for that reflected by word in particular) is limitless. For instance: direct analogies, associations, system relations, temporal relations and similarities, relations and similarities based on properties, functional similarities, professional and disciplinary similarities, etc.

A convenient way of circumventing the problem of the multitude of analogies is the use of analogies characteristic of natural languages (subject-oriented analogies).

For instance, an explanatory dictionary of a natural language depicts the analogies that a native speaker is most likely to imply. For any notion designated by a word the dictionary provides a specific set of analogies characteristic of this particular notion.

In order to be able to use subject-oriented analogies for information search it is necessary to have:

  • Analogies peculiar to notions (words) of a natural language.
  • Principles of relation and interaction between different analogies.
  • A framework for presentation of notions of a language in the form of a system of analogies.

It is convenient to present notions of a natural language in the form of modules of elementary sense (module of notion). Module of elementary sense is a structure reflecting objects and phenomena by means of recording their fundamental properties and analogies. Module of elementary sense is described in detail in the Addendum 1.1. Module of elementary sense.

Module of elementary sense makes it possible to organize information search based on the notions implied by the user or those the latter is interested in.

In addition, the universal structure of the module of elementary sense makes it possible to:

  • Combine several notions into a more general one.
  • Expand a notion to its components.
  • Transform notions immanently inherent to the language into the form necessary for the user.

Association and dissociation of notions make it possible to operate with notions at the level of meaningful groups of words, i.e. sentences, since a sentence is itself a notion. Specific patterns of association and dissociation of notions, and of presentation of natural language sentences in the form of modules of elementary sense, are analyzed in detail in the Addendum 1.2. Operating module of elementary sense.

Patterns described in the Addendum 1.2 make it possible to construct a search machine with the following features:

1) User's request for information search may be presented in a natural language.
2) The machine can carry out search of complex notions.

The general operational algorithm of such a machine is described in the Addendum 1.3. Information on search machine based on module of elementary sense.

Use of notion operating search machine makes it possible, for instance,

  • To organize search of a specified notion in the information space (to determine where the notion is located and what it is related to; to deliver a text fragment or complete text of a document where the notion in question is mentioned or where the required implication is present).
  • To scan the information space in search of everything related to the notion of interest (the search machine forms priority centers in the information space from the point of view of the specified notion).
  • To determine priority sense centers in the information space (to determine, for instance, what the analyzed text is about).
  • To regard user's requests as executive commands and to search their sense bringing it into correlation with particular executable procedures.

A model of such a system made it possible to give commands to the computer in the natural language.


Addendum 1.1. Module of elementary sense

Let us examine subject-oriented analogies within the framework of a simple sentence of a natural language. The grammar structure of a natural language sentence consists in the general case of the following elements (here and further examples are derived from Belarusian language) [V.A.Karpov, Language like system. Minsk: Vyshejshaja shkola, 1992]:

  • S -- subject,
  • A -- action,
  • O -- object,
  • Adr -- addressee,
  • In -- instrument,
  • Topic -- topic,
  • Loc -- location,
  • G -- belonging,
  • Adv -- feature of an action,
  • Atr -- feature of an object,
  • Cause -- cause,
  • Goal -- goal,
  • Time -- time,
  • Condition -- condition,
  • Number -- number,
  • Prep -- preposition,
  • Modal -- modality and others.

As a matter of fact, all the elements listed above are analogies of various types and sorts. However, it is convenient to regard the basic elements of a sentence as analogies corresponding to a) interaction, b) structure and c) time.

In that case the first significant feature of a phenomenon is interaction and its properties. Any phenomenon "action" has subjects (S) that carry out this action and objects (O) this action is directed at. In a similar way any phenomenon "subject-object" has actions that it initiates as a subject (As) and actions that are directed at it as an object (Ao) [V.V.Martynov, Universal semantic code. Grammar. Dictionary. Texts. Minsk: Navuka i Technika, 1977].

The inherent structural properties of a phenomenon - supersystems (SuperS) and subsystems (subS) form the second significant feature of a phenomenon. The supersystem for a phenomenon "subject-object" comprises phenomena that include the phenomenon in question. The subsystem for the latter contains phenomena the "subject-object" consists of, those it includes. The supersystem for an "action" comprises phenomena indicating the form of what the action in question is. The subsystem for an "action" includes phenomena indicating the existing types of action.

The third significant feature of a phenomenon is time and the temporal interval inherent to the phenomenon. For an "action" time can be regarded as a number of cause-and-effect chains revealing what processes the action comes into (t+) and what processes it consists of (t-). For a phenomenon "subject-object" time can be regarded as a certain number of basic qualitative phases of existence within the temporal interval inherent to the phenomenon (t-intro - internal time) and of basic qualitative phases one of which is the phenomenon itself (T-extro - external time).

[Figure: internal time phenomena and external time phenomena, where t-intro stands for internal time phenomena and T-extro stands for external time phenomena.]

As a result we have two modules: phenomenon "subject-object" and phenomenon "action" with the above listed systems of subject-oriented analogies. It is convenient to present the key "analogies" of any word of a natural language in the following way:

Module of a phenomenon "subject-object S-O".

  • External time (T-extro): external time phenomena of the OBJECT.
  • Supersystem: Who comprises the OBJECT? What comprises the OBJECT?
  • Interaction: What is done to the OBJECT? What does the OBJECT do?
  • Subsystem: What does the OBJECT consist of? What parts does the OBJECT consist of? What does the OBJECT comprise?
  • Internal time (t-intro): internal time phenomena of the OBJECT.

Module of a phenomenon "action A".

  • Time (t+): What processes does the ACTION come into?
  • Supersystem: The form of what is the ACTION?
  • Interaction: Who performs the ACTION? What is the ACTION directed at?
  • Subsystem: What types of ACTION exist?
  • Time (t-): What processes does the ACTION consist of?
These modules reflect and operate the elements of grammar structure of a sentence in the following way.

1) Subject (S) and Object (O) are the centers of the "subject-object S-O" module of elementary sense.
[Diagram: Subject or Object]

2) Action (A) is the center of the "action A" module of elementary sense.
[Diagram: Action]

3) The acting subject ("subject S performs action A") is presented as
[Diagram: Subject -- Action]
For an immanent subject S the specific action A "cuts off" the alternative analogies of its possible actions, thus specifying and concretizing this subject. In addition, the other axes of the interacting modules are mutually correlated and refined. It also becomes possible to furl this pair into a new phenomenon "subject-object" or into a new phenomenon "action", each with new refined properties. An action upon an object O is presented in a similar way.
[Diagram: Action -- Object]

4) Instrument (In). There are two ways of presenting it.
Version A. Instrument (In) is presented as the center of an elementary module following the module of the subject.
[Diagram: Subject -- Action -- Instrument]
Where A` is a specific action connecting the subject (S) and the instrument (In).

Version B. Instrument (In) is presented as a specific subject - "subject having an instrument", where the element instrument (In) finds its way into the subsystem of the subject S and corrects its other axes.
[Diagram: Subject having an instrument]

5) Addressee (Adr) is presented analogously to instrument (In).
Version A. Addressee (Adr) is presented as the center of an elementary module following the module of the object:
[Diagram: Object -- Action -- Addressee]
Where A` is a specific action connecting the object (O) and the addressee (Adr).

Version B. Addressee (Adr) is presented as a specific object - "object with an addressee", where the element addressee (Adr) finds its way into the subsystem of the object O and corrects its other axes:
[Diagram: Object with an addressee]

6) Topic (Topic) is a specific object presented analogously to addressee (Adr).

7) Modality (Modal) is a specific action and is presented analogously to addressee (Adr) and instrument (In), with the only difference that it pertains to the phenomenon "action".

8) Belonging (G) reflects the relation of any module of elementary sense with its supersystem and is located accordingly i.e. in the supersystem. For instance,

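For S: [Diagram: Belonging placed in the supersystem of the Subject]
For A: [Diagram: Belonging placed in the supersystem of the Action]
For Adr: [Diagram: Belonging placed in the supersystem of the Addressee]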
For other elements of the sentence grammar structure (object, instrument, topic, location, etc.) belonging is presented analogously.

9) Feature of an action (Adv) is a procedure correcting the subsystem of the phenomenon "action".

10) Feature of an object (Atr) is a procedure correcting the subsystem of the phenomenon "subject-object". Feature of an object belongs to the features of such phenomena as subject (S), object (O), addressee (Adr), instrument (In), topic (Topic), location (Loc) and belonging (G).

11) Cause (Cause) is an element located on the axis of cause-and-effect chains. It is supported by procedure.

For the phenomenon "action": [Diagram: Cause -- Action, or Action -- Cause]
For the phenomenon "subject-object": [Diagram: Cause -- Subject, or Subject -- Cause]

12) Goal (Goal) is analogous to cause (Cause), with the particular quality of being a phenomenon that follows the phenomenon under review in time. From this point of view the phenomenon under review becomes the cause vis-a-vis the goal (Goal). It is supported by procedure.

13) Condition (Condition) is presented analogously to cause (Cause) and goal (Goal). It is supported by procedure.

14) Number (Number) is a specific phenomenon "subject-object" corresponding to the category of "quantity". Its specificity lies in the fact that for the number the axis of the external time phenomena (T-extro) takes the form of a row of figures differing from one another by the value of 1.
[Diagram: Number]

The elements of quantity may be regarded as the process of addition of new qualities proceeding along the axis of external time (T-extro). For instance,
- quantity 1 = quality "one",
- 1 + 1 = 2, quantity 2 = new quality "two",
- 2 + 1 = 3, quantity 3 = new quality "three" (or the "third").
In the interaction of the number with the phenomenon "subject-object", the number presents itself as a category and can be found simultaneously both in the supersystem and subsystem properties of the phenomenon "subject-object".

15) Time (Time) is a specific phenomenon "subject-object" corresponding to the category of "time". It is presented analogously to the category of "quantity".

16) Location (Loc) is presented analogously to number (Number) and time (Time).

Commentary.
A) The same principle is applicable to such categories as "distance", "importance" etc.
B) Categories can be dealt with using the principle of reducing them to adjectives. For instance,
- category of time "early" = adjective "early";
- category of number "one" = adjective "single".
This allows the categories to be regarded as a feature of an object (Atr) or a feature of an action (Adv).

17) Preposition (Prep) is a grammar element that cannot be found directly in the module of elementary sense. However, its presence there is not required: its main function is to indicate the location of modules and the type of relations between them. It is supported by procedure.


Addendum 1.2. Operating module of elementary sense

1. Bringing a sentence in a natural language to the form correlating with module of elementary sense.
2. Search matrix construction.
3. An example of a query matrix construction algorithm in operation.

1. Bringing a sentence in a natural language to the form correlating with module of elementary sense.
The essence of the method lies in the fact that each word according to its specific grammar characteristics holds a specific position in the grammar structure of the sentence, which makes it possible to code the grammar characteristics of words and then use this code for the analysis of sentences. Professor V.A.Karpov [V.A.Karpov, Language like system. Minsk: Vyshejshaja shkola, 1992] has worked out the rules of grammar coding upon the authors' request. A model program of analysis and parsing of natural language sentences based on this approach showed acceptable results in operation.

Commentary.
Our examples of grammar coding use the Belarusian language. However, it is worth mentioning that:
A) If necessary, similar work can be quickly carried out for any other language (English, German, French, Russian etc.).
B) Based on a rich enough vocabulary it is possible to work out rules of inflection and word-formation. Using these rules the program is capable of developing grammar codes of unknown words and of supplementing the vocabulary automatically.

The grammar code structure looks as follows:

Meaning of each value (0-9) in the first, second and third figure of the code:

Value 0
  first figure: -
  second figure: nominative case; indication of analytical forms of complex future tense of verbs of the imperfective aspect; adverbial particles of perfective and imperfective aspect; adverbs of quantity answering the question "how much (how many)?"
  third figure: adverbs; infinitive of verbs of the imperfective aspect

Value 1
  first figure: personal pronouns: I, thou, he, she, it, we, you, they
  second figure: genitive case; prepositions used only with genitive case
  third figure: singular, 1-st person of personal pronouns; adverbial particles of the imperfective aspect; singular, 1-st person of verbs

Value 2
  first figure: inanimate nouns
  second figure: dative case; prepositions used only with dative case
  third figure: plural, 1-st person of personal pronouns; plural, 1-st person of verbs

Value 3
  first figure: verbs of the imperfective aspect ("non-resulting verbs") and adverbial particles of these verbs
  second figure: accusative case; prepositions used only with accusative case
  third figure: singular, 2-nd person of personal pronouns; singular, 2-nd person of verbs

Value 4
  first figure: adverbs
  second figure: instrumental case; prepositions used only with instrumental case
  third figure: plural, 2-nd person of personal pronouns; plural, 2-nd person of verbs

Value 5
  first figure: adjectives
  second figure: prepositional case; prepositions used only with prepositional case
  third figure: masculine gender

Value 6
  first figure: numerals
  second figure: present tense of verbs; adverbs of time answering the question "when?"
  third figure: plural

Value 7
  first figure: verbs of the perfective aspect ("resulting verbs")
  second figure: past tense of verbs; adverbs of manner answering the question "how?"
  third figure: feminine gender

Value 8
  first figure: animate nouns; proper names
  second figure: simple future tense of verbs and the verb "to be"; adverbs of direction answering the question "where to?"
  third figure: interrogative operators (always connected with interrogative words belonging to different parts of speech)

Value 9
  first figure: connective words: prepositions, particles, conjunction, interjections
  second figure: imperative mood of verbs; adverbs of location answering the question "where from?"
  third figure: neutral gender

The grammar code is a three-figure code.

The first figure.

The first figure of the codes serves only for indicating the parts of speech:
1 - pronouns,
2 - inanimate nouns,
3 - infinitive of verbs of the imperfective aspect,
4 - adverbs,
5 - adjectives,
6 - numerals,
7 - infinitive of verbs of the perfective aspect;
8 - animate nouns,
9 - connective parts of speech (prepositions, conjunctions, interjections).

The second figure.

The second figure of the codes of pronouns, nouns, adjectives and numerals serves for indicating cases:
0 - nominative,
1 - genitive,
2 - dative,
3 - accusative,
4 - instrumental,
5 - prepositional.

The second figure of the codes of verbs serves for indicating the category of tense:
0 - with the infinitive indicates the analytical forms of complex future tense used only with the auxiliary verb "to be",
6 - present tense,
7 - past tense,
8 - future tense.
It also indicates the category of mood:
9 - imperative mood.

The second figure of the codes of adverbs signifies their falling into this or that semantic category. In that connection the second figures of the codes of adverbs and those of the codes of the corresponding interrogative words coincide.

Adverbs:
400 - adverbs of quantity,
460 - adverbs of time,
470 - adverbs of manner,
480 - adverbs of direction,
490 - adverbs of location,
Interrogative words:
how much (many)? - 408;
when? - 468;
how? - 478;
where to? - 488;
where from? - 498.

The second figure of the codes of connective words separates them into the following groups:
910 - prepositions used with genitive case,
920 - prepositions used with dative case,
930 - prepositions used with accusative case,
940 - prepositions used with instrumental case,
950 - prepositions used with prepositional case,
960 and 970 - conjunctions,
980 - interjections.

The third figure.

The third figure of the codes of pronouns and verbs serves for indicating person:
1 - 1-st person singular,
2 - 2-nd person singular,
3 - 1-st person plural,
4 - 2-nd person plural.

The third figure of the codes of nouns, adjectives and pronouns in the third person singular and verbs in the past tense serves for indicating the category of gender:
5 - masculine gender,
7 - feminine gender,
9 - neutral gender.

The third figure "6" of the codes of nouns, adjectives and pronouns in the third person plural, of verbs in the past tense and also of numerals signifies only plural.

The third figure "8" signifies interrogative words. For example,

who? - 108,
whom? - 118,
to whom? - 128,
whom? - 138,
by whom? - 148,
about whom? - 158,
what? - 208,
what for? - 218,
to what? - 228,
what? - 238,
by what? - 248,
about what? - 258,
when? - 468 and so on.

Commentary.
A) Further work with adverbs is possible involving other semantic categories for which the codes 410, 420, 430, 440, 450 can be used.
B) For more precise work with conjunctions and particles further elaboration of their semantic categories is possible.
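
The coding rules above can be read as a small lookup scheme. The sketch below is an illustrative decoder written for this text, not the authors' program: the function name `decode`, the dictionary names and the wording of the returned labels are assumptions. The third figure is decoded only for the values 5-9; the values 1-4 are reported as a generic person marker.

```python
PART_OF_SPEECH = {
    "1": "pronoun", "2": "inanimate noun", "3": "verb, imperfective aspect",
    "4": "adverb", "5": "adjective", "6": "numeral",
    "7": "verb, perfective aspect", "8": "animate noun", "9": "connective word",
}
CASES = {"0": "nominative", "1": "genitive", "2": "dative",
         "3": "accusative", "4": "instrumental", "5": "prepositional"}
TENSES = {"0": "infinitive / analytical future", "6": "present tense",
          "7": "past tense", "8": "future tense", "9": "imperative mood"}
ADVERB_CATEGORIES = {"0": "quantity", "6": "time", "7": "manner",
                     "8": "direction", "9": "location"}
CONNECTIVES = {"1": "preposition with genitive case", "2": "preposition with dative case",
               "3": "preposition with accusative case", "4": "preposition with instrumental case",
               "5": "preposition with prepositional case", "6": "conjunction",
               "7": "conjunction", "8": "interjection"}
THIRD = {"0": "unmarked", "5": "masculine", "6": "plural", "7": "feminine",
         "8": "interrogative word", "9": "neutral gender"}


def decode(code: str) -> dict:
    """Split a three-figure grammar code into its three readings."""
    if len(code) != 3 or not code.isdecimal() or code[0] not in PART_OF_SPEECH:
        raise ValueError(f"not a grammar code: {code!r}")
    first, second, third = code
    if first in "12568":      # pronouns, nouns, adjectives, numerals carry case
        second_table = CASES
    elif first in "37":       # verbs carry tense or mood
        second_table = TENSES
    elif first == "4":        # adverbs carry a semantic category
        second_table = ADVERB_CATEGORIES
    else:                     # connective words
        second_table = CONNECTIVES
    return {
        "part_of_speech": PART_OF_SPEECH[first],
        "second_figure": second_table.get(second, "unassigned"),
        "third_figure": THIRD.get(third, "person marker"),
    }
```

For example, `decode("257")` reads as an inanimate noun in the prepositional case, feminine gender, and `decode("468")` as the interrogative word for adverbs of time.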

2. Search matrix construction.
Methods described in the Addendum 1.1 make it possible to present a sentence as a linked structure of modules of elementary senses (a subject-oriented picture of the world for a sentence). As a result we get a matrix of query with subject-oriented analogies. Description of a possible algorithm of construction of a matrix of query is given below.

Algorithm for constructing a matrix of query.
1) In the sentence regarded as a query the grammar codes of each word are determined. The grammar codes are determined:
a) By the vocabulary.
b) Based on the system of endings if the word is missing in the vocabulary.

2) The word having the code of action is the first to be found. A module of elementary sense for action is then constructed.

3) Module of elementary sense constructed at the step 2 is the basis for constructing a matrix of query. In the general case a matrix of query for a simple sentence of a natural language consists of three linked modules of phenomena - subject (S), action (A) and object (O), each with its own group of dependent words.

4) The group of the subject and the group of the object are now identified in the sentence. The group of the subject includes all the words dependent on the subject (S). The group of the object includes all the words dependent on the object (O).

The rule for identifying the groups:
- the words preceding the action by their position in the sentence are regarded as belonging to the group of the subject;
- the words positioned in the sentence after the action are regarded as belonging to the group of the object.
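
Steps 2) and 4) can be sketched as a single pass over the coded words. This is a simplified illustration, not the authors' implementation: the helper names are hypothetical, and the test for an "action" code (first figure 3 or 7 with a tense or mood in the second figure) is an assumption derived from the coding rules. The sample data are the codes of the Belarusian sentence analysed in the example later in this addendum.

```python
def is_action(code: str) -> bool:
    """Assumption: a finite verb has first figure 3 or 7 and second figure 6-9."""
    return code[0] in "37" and code[1] in "6789"


def split_groups(words):
    """Split (word, codes) pairs into subject group, action and object group.

    Words before the first action belong to the group of the subject,
    words after it to the group of the object.
    """
    for i, (_, codes) in enumerate(words):
        if any(is_action(c) for c in codes):
            return words[:i], words[i], words[i + 1:]
    raise ValueError("no word with the code of action was found")


sentence = [
    ("У", ["930", "950"]),
    ("сям'і", ["227", "257"]),
    ("Барадуліных", ["816", "836", "216", "236"]),
    ("любяць", ["366"]),
    ("спорт", ["205", "235"]),
]
subject_group, action, object_group = split_groups(sentence)
```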

5) Operations with the group of the subject.

Here are a few basic grammar rules for the Belarusian language:
- each pair of words contains the main word and the dependent word;
- the agreement between the main word and the dependent one is achieved according to grammar codes;
- the main word in turn may be dependent on another word;
- the preposition and noun must have the same code of case (the second figure);
- the preposition is always positioned before the noun it governs;
- if the noun is found in the group of subject, the codes of object cannot pertain to it, as it cannot belong simultaneously to the group of the subject and that of object;
- if the noun is found in the group of object, the codes of subject cannot pertain to it, as it cannot belong simultaneously to the group of the object and that of subject;
- the word in the genitive case indicates belonging, belonging to the supersystem in particular;
- if the verb is in the plural, the subject is also in the plural.
The complete set of rules for any natural language is determined by its grammar.

a) The group of the subject is searched for the word having the code of subject.
b) The module of elementary sense is constructed for the subject.
c) In order to adjust the matrix of query the module of elementary sense of the subject is interfaced with the module of elementary sense of the action. After that the analogy axes of each of the modules are mutually corrected.
d) The words that are dependent on the subject are then found.
e) The module of elementary sense is constructed for each dependent word and is integrated into the picture of the world of the sentence similarly to 5c).

6) Operations with the group of the object. The basic rules and the processing algorithm are similar to those for the group of the subject (see item 5).

7) Detection of the "dropped out" words. The "dropped out" words are those that haven't found their way (couldn't be linked) to the query matrix.
The "dropping out" of the words may occur for the following reasons:
- the list of rules is incomplete (see item 5);
- the rules (item 5) are contradictory;
- the rules of connection cannot be stated as the connection of the "dropped out" word with the others is not on the grammar level but on the sense level (for instance, the agglutination type of word connection).
In such case there appears a necessity to operate with the sense of the word. Then it is expedient to use a dictionary where each word is represented as a module of elementary sense.

Then the "dropped out" word is linked to the query matrix in the following way.
a) If the "dropped out" word is present in the vocabulary,
- the module of elementary sense of the "dropped out" word is found;
- it is found out where the analogies pertaining to the "dropped out" word and those of the already built query matrix intersect;
- based on the found intersections of analogies the place of the "dropped out" word in the query matrix is determined.
b) If the "dropped out" word cannot be found in the vocabulary,
- the dictionary of phenomena is supplemented and the word is entered into the query matrix;
- if the user's query is presented in the form of a text, the work with the "dropped out" word should be postponed until there appears a possibility to clarify its sense based on another sentence of the same text.
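
Case a) amounts to looking for the element of the query matrix whose analogies overlap most with those of the "dropped out" word. The sketch below illustrates this under strong simplifying assumptions: each element is reduced to one flat set of analogies, the function name is hypothetical and all the sample analogies are made up.

```python
def place_dropped_word(word_analogies, matrix):
    """Return the matrix element with the largest analogy overlap, and the overlap.

    matrix maps an element of the query matrix to the set of its analogies.
    Returns (None, set()) when no analogies intersect.
    """
    best, best_overlap = None, set()
    for element, analogies in matrix.items():
        overlap = word_analogies & analogies
        if len(overlap) > len(best_overlap):
            best, best_overlap = element, overlap
    return best, best_overlap


# Made-up analogies for the elements of an already built query matrix.
query_matrix = {
    "family": {"home", "parents", "children", "relatives"},
    "like": {"enjoy", "prefer"},
    "sport": {"football", "training", "game", "health"},
}
# Made-up analogies of a "dropped out" word.
stadium = {"game", "training", "spectators"}
```

Here `place_dropped_word(stadium, query_matrix)` attaches the word to "sport", because only the analogies of "sport" intersect with those of the word.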

Possible approaches to supplementing the vocabulary.
- Reference to the user (in order to find out what he had in mind concerning the word in question).
- Search for the word in an explanatory dictionary and construction of the module of elementary sense based on the information derived from this dictionary.

Commentary.
Supplementing the vocabulary by hand may present a problem. However, this problem is not pressing, as the procedure of automatic supplementing can be used. If a word is missing from the vocabulary but can be positioned in the query matrix according to the rules, the elements of the query matrix constitute the analogies for this word. Based on these analogies the module of elementary sense of the unknown word is constructed and then automatically introduced into the vocabulary. In other words, the problem of completing the vocabulary by hand exists only at the initial stage of the work of the search machine. It is necessary to start by building up a minimal vocabulary and then to train the search machine on texts of gradually growing sophistication.

8) As a result we have the matrix of query, corresponding to the initial sentence.

Note.
After the matrix of query is completed it becomes possible to find out specifically what the user is really after. The user's explanations will be used to cut off the analogies that have nothing to do with the topic of the search.

3. An example of a query matrix construction algorithm in operation.
Let us scrutinize a sentence in Belarusian: "У сям'і Барадуліных любяць спорт" [They like sport in the Baradulins' family].

1) Grammar codes are determined for each word in the sentence (the grammar codes are determined by the vocabulary or, if the word is missing in the vocabulary, based on the system of endings).

У [in] = 930, 950
сям'i [family] = 227, 257
Барадулiных [the Baradulins'] = 816, 836, 216, 236
любяць [like] = 366
спорт [sport] = 205, 235

2) The word with the code "action" is found.

любяць [like] = 366

Note.
In Belarusian the word "любяць" [like] carries the formal easily identifiable signs of its being a verb of the third person plural in the present tense.

3) The action "любяць" [like] is entered into the module of elementary sense of the phenomenon "action".


4) The words so positioned in the sentence that they precede the "action" and those positioned after it are identified. The words preceding the "action" by their position in the sentence are regarded as belonging to the group of the subject and those positioned in the sentence after the "action" are regarded as belonging to the group of the object.

The group of the subject (S):
У [in] = 930, 950
сям'і [family] = 227, 257
Барадуліных [the Baradulins'] = 816, 836, 216, 236

Action (A):
любяць [like] = 366

The group of the object (O):
спорт [sport] = 205, 235

5) The group of the subject is scanned for the words that have the code of subject (nouns in the nominative case).

У [in] = 930, 950
сям'i [family] = 227, 257
Барадулiных [the Baradulins'] = 816, 836, 216, 236

No such words are found which means that the subject is implicit. The grammar of the Belarusian language allows the existence of formally impersonal sentences (without the subject) and the sentence under analysis is one of them. Let us denote it by "X".


6) To handle the group of the subject the following words are examined:

У [in] = 930, 950
сям'i [family] = 227, 257
Барадулiных [the Baradulins'] = 816, 836, 216, 236

According to the rules:
- Each pair of words contains the main word and the dependent word, and the main word in turn may be dependent on another word.
- The preposition and noun must have the same code of case.
- The preposition is always positioned before the noun it governs.

The word agreement in the word combination "у сям'i" [in family] is described by the pair of codes 950+257.

Note.
In Belarusian the prepositional case of the noun "сям'i" [family] is identified not only by the preposition but also by the ending "-i" as opposed to the form of the nominative case - "сям'я".
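
The agreement rule for a preposition and a noun can be checked mechanically: a pair of codes agrees when the preposition code and the noun code share the same second figure. A minimal sketch follows, with a hypothetical function name and restricted to nouns (first figure 2 or 8) for simplicity:

```python
def agreeing_pairs(prep_codes, noun_codes):
    """Return (preposition code, noun code) pairs with the same code of case."""
    return [
        (p, n)
        for p in prep_codes
        for n in noun_codes
        if p[0] == "9" and n[0] in "28" and p[1] == n[1]
    ]


# Codes of "у" [in] and "сям'і" [family] from the example.
pairs = agreeing_pairs(["930", "950"], ["227", "257"])
```

For these codes only the pair 950+257 survives, which matches the agreement stated above.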

According to the rule:
- The preposition "у" [in] in the prepositional case indicates the relation to the supersystem phenomenon.
Based on this rule we position the word "сям'я" [family]:

[Diagram: position of the word "family" in the query matrix]

The following word is examined:

Барадулiных [the Baradulins'] = 816, 836, 216, 236

According to the rule:
- If a noun is found in the group of the subject the code of the object can be omitted, as this noun cannot simultaneously belong to the group of the object.
The word is examined again:

Барадулiных [the Baradulins'] = 816, 216

According to the rule:
- A word in the genitive case indicates belonging to a supersystem phenomenon.


According to the rule:
- If the verb "любяць" is in the plural, consequently the subject "X" is in the plural too.
Thus we can partially restore the code of the subject "X" = _06
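
Steps 5) and 6) can be summarized in two small functions: one looks for a noun in the nominative case in the group of the subject, the other partially restores the code of an implicit subject from the verb. This is only a sketch with hypothetical names. It covers just the plural case stated in the text and leaves the remaining figures blank otherwise.

```python
def find_subject(group):
    """Return the first word carrying a noun code in the nominative case, or None."""
    for word, codes in group:
        if any(c[0] in "28" and c[1] == "0" for c in codes):
            return word
    return None


def implicit_subject_code(verb_code):
    """Partially restore the code of an implicit subject "X".

    "_" marks a figure that remains unknown. The second figure is 0
    (nominative); the third figure is 6 when the verb is in the plural.
    """
    return "_06" if verb_code[2] == "6" else "_0_"


subject_group = [
    ("У", ["930", "950"]),
    ("сям'і", ["227", "257"]),
    ("Барадуліных", ["816", "836", "216", "236"]),
]
subject = find_subject(subject_group)     # no nominative noun in this group
x_code = implicit_subject_code("366")     # code of "любяць" [like]
```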

7) To handle the group of the object the following word is examined:

спорт [sport] = 205, 235

According to the rule:
- If a noun is found in the group of the object, the codes of the subject can be omitted.
The word is examined again:

спорт [sport] = 235

As a result the word "спорт" [sport] has only one code left.
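
The pruning of codes in steps 6) and 7) can be sketched as a filter. The reading used here is an assumption inferred from the example: the "codes of the object" are the accusative noun codes (second figure 3) and the "codes of the subject" are the nominative noun codes (second figure 0). The function name is hypothetical.

```python
def prune(codes, group):
    """Drop the noun codes that contradict the group the word belongs to."""
    if group == "subject":
        drop = "3"   # accusative codes cannot pertain to a word of the subject group
    elif group == "object":
        drop = "0"   # nominative codes cannot pertain to a word of the object group
    else:
        raise ValueError("group must be 'subject' or 'object'")
    return [c for c in codes if not (c[0] in "28" and c[1] == drop)]


baradulins = prune(["816", "836", "216", "236"], "subject")
sport = prune(["205", "235"], "object")
```

With the codes from the example this leaves 816 and 216 for "Барадуліных" and the single code 235 for "спорт".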

We have obtained a matrix of query with subject-oriented analogies built for the sentence "У сям'і Барадуліных любяць спорт" [They like sport in the Baradulins' family]:

[Figure: matrix of query with subject-oriented analogies for the sentence "They like sport in the Baradulins' family"]

As can be seen in this example, besides the analogies pertaining to separate elements we have also obtained some complex analogies such as "X любіць спорт" [X likes sport], "сям'я любіць спорт" [family likes sport], "Барадуліны любяць спорт" [the Baradulins like sport] and so forth. At this point it becomes possible to find out from the user what exactly he needs and what projections of senses he is interested in.

Consequence.
Language phenomena are significant and repetitive in their connections, relations and in the movement of their elements (sense). This can be used as the basis for constructing a subject-oriented picture of the world for a sentence, which is then used for information search.


Addendum 1.3. Machine for information search based on the module of elementary sense

General algorithm.

1) The query in the form of a sentence is scrutinized.
a) The composition of the sentence is determined.
b) The structure of the sentence is built.

2) A search matrix is constructed and submitted to the user for correction of sense. In general a search matrix has the following form:
[Diagram: Subject -- Action -- Object]

3) The search of required information is initiated. The search matrix of query becomes the center of "crystallization" to which sentences in the information space under review begin to link.
a) If a sentence links to the query, the sentence contains elements of the desired sense, and the order of connection describes in what way the query is related to the found sense.
b) If the sentence does not link to the query, it contains no elements of the desired sense.

Consequences.
1) The given algorithm makes it possible to introduce the "levels" of information search depending upon what the target notions are:
- analogies of the first order are involved in the search based on the notions making up the query itself;
- analogies of the second order are involved in the search of notions that are close to the query;
- and so on.
2) Reverting to the analysis of the structure of the process of information search, we see that with the use of the information search machine based on elementary senses the searching process changes.
At step 2: The user's request (query) for the necessary information is presented in the form of a regular sentence in a natural language.
At step 3: Construction of the search matrix is reduced to developing a picture of the world for the sentence (the query) based on the use of subject-oriented analogies (with module of elementary sense as the basis).
At step 4: The basic principle of the search mechanism is the principle of "crystallization": the search matrix of query links to (identifies itself with) only those sentences in the information space under analysis that reflect either the immediate target sense or a sense that is close to it by its subject-oriented analogies.
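
The idea of search "levels" can be illustrated with a toy model in which analogies form a graph and each order of search expands the query one step further along it. This is a deliberately crude sketch, not the authors' algorithm: sentences are reduced to sets of words, a sentence "links" to the query when it shares at least one notion with the expanded pattern, and the function names and all data are made up.

```python
def expand(notions, analogies, order):
    """Return the query notions plus their analogies up to the given order."""
    reached = set(notions)
    frontier = set(notions)
    for _ in range(order):
        # One more step along the analogy graph, skipping notions already reached.
        frontier = {a for n in frontier for a in analogies.get(n, ())} - reached
        reached |= frontier
    return reached


def link(query, sentences, analogies, order):
    """Return the sentences that share at least one notion with the expanded query."""
    pattern = expand(query, analogies, order)
    return [s for s in sentences if pattern & set(s.split())]


# Made-up analogies and information space.
analogies = {
    "sport": ["football", "training"],
    "football": ["stadium"],
    "family": ["parents"],
}
sentences = ["parents watch football", "the stadium is closed", "rain is expected"]
query = {"family", "like", "sport"}

first_order = link(query, sentences, analogies, order=1)
second_order = link(query, sentences, analogies, order=2)
```

Limiting the order is what keeps the number of found sources bounded: each further order admits more distant analogies, and the user can stop at the level that still matches the intended sense.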

Next: Part 2. Standard machine for search with context



Copyright (C) 2000 S.Alaksandrau, P.Fadziejeu. All Rights Reserved.
Copyright (C) 2001-2024 INTELLECTUAL PARTNER. All Rights Reserved.