GHuRU - Reduced Conceptual Form (RCF)

http://www.cs.hmc.edu/~dbethune/ghuru/RCF.html


|| GHuRU || HRU || NLP || Search Engine ||

To exhibit any level of comprehension at all, a computer must be able to build grammar trees from the available text. In this process the individual words are first analyzed independently (to determine their part of speech), then strung together into grammatical units (phrases), and eventually into sentences. This is analogous to the way you and I learn to understand written and spoken English, by gradually piecing small sections together to make larger ones.

To then actually understand what is being said well enough to respond to it logically is another step, one that requires some sort of vocabulary: a set of words that are "known" by the computer. At the most basic level, these words are known by their part of speech. At a slightly higher level, you can try to figure out what is actually being communicated by developing a list of relationships between the words. Let me try an example.

Given the sentence ... The purple dog swallows a cat.
You would start to list the parts of speech.
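
As a minimal sketch of this first step, the lookup below tags each word of the sentence on its own. The tiny lexicon, the tag names (DET, ADJ, NOUN, VERB), and the function name tag_words are assumptions made for illustration; they are not taken from GHuRU itself.

```python
# Hypothetical lexicon mapping each known word to a part of speech.
LEXICON = {
    "the": "DET",
    "a": "DET",
    "purple": "ADJ",
    "dog": "NOUN",
    "cat": "NOUN",
    "swallows": "VERB",
}


def tag_words(sentence):
    """Return (word, part_of_speech) pairs, looking up each word on its own."""
    words = sentence.lower().rstrip(".").split()
    # Words missing from the lexicon are marked UNKNOWN rather than guessed.
    return [(word, LEXICON.get(word, "UNKNOWN")) for word in words]


for word, pos in tag_words("The purple dog swallows a cat."):
    print(f"{word:10s}{pos}")
```

A real system would need a far larger vocabulary and some way to handle words that can play more than one part of speech; this sketch only shows the shape of the step.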

Then you would start developing dependencies.
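
One possible way to develop those dependencies is sketched below. It assumes the words have already been tagged (the tag names are hypothetical), that articles are dropped, that an adjective wraps the noun it modifies, and that a verb wraps its subject and object. The notation head(argument, ...) follows the style of the constructions used later on this page; the function names are invented for this sketch.

```python
# Tagged words for "The purple dog swallows a cat." (tag names are assumed).
TAGGED = [
    ("the", "DET"), ("purple", "ADJ"), ("dog", "NOUN"),
    ("swallows", "VERB"),
    ("a", "DET"), ("cat", "NOUN"),
]


def noun_phrase(tagged, start):
    """Read DET? ADJ* NOUN from `start`; return (construction, next_index).

    Articles are dropped and each adjective wraps the noun it modifies,
    so "the purple dog" becomes purple(dog).
    """
    i = start
    if i < len(tagged) and tagged[i][1] == "DET":
        i += 1
    adjectives = []
    while i < len(tagged) and tagged[i][1] == "ADJ":
        adjectives.append(tagged[i][0])
        i += 1
    if i >= len(tagged) or tagged[i][1] != "NOUN":
        raise ValueError("expected a noun")
    construction = tagged[i][0]
    # The adjective nearest the noun binds first.
    for adjective in reversed(adjectives):
        construction = f"{adjective}({construction})"
    return construction, i + 1


def sentence(tagged):
    """Read NP VERB NP and return verb(subject, object)."""
    subject, i = noun_phrase(tagged, 0)
    if i >= len(tagged) or tagged[i][1] != "VERB":
        raise ValueError("expected a verb")
    verb = tagged[i][0]
    obj, i = noun_phrase(tagged, i + 1)
    if i != len(tagged):
        raise ValueError("unexpected trailing words")
    return f"{verb}({subject}, {obj})"


print(noun_phrase(TAGGED, 0)[0])  # purple(dog)
print(noun_phrase(TAGGED, 4)[0])  # cat
print(sentence(TAGGED))           # swallows(purple(dog), cat)
```
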
The last construction represents the entire sentence, and indeed the thought that is being communicated. By recursively going through this process, these thoughts, or concepts, can be built up. Being able to understand what it means to define (or equate) something would also allow GHuRU to build its vocabulary, replacing those cumbersome conceptual constructions with new words (for instance, continually(thinks(Bob, Mary)) could be replaced with loves(Bob, Mary)).
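
The vocabulary-building idea can be sketched as a small rewriting step. Here a construction is stored as a nested tuple, a definition is a pattern paired with its shorter replacement, and names beginning with "?" stand for any sub-construction. The tuple representation, the "?" convention, and all function names are assumptions of this sketch, not a description of how GHuRU works.

```python
# A construction is either a word (str) or a tuple (head, arg1, arg2, ...).
# Strings starting with "?" are pattern variables (an assumed convention).

def match(pattern, term, bindings):
    """Match `term` against `pattern`; return updated bindings, or None."""
    if isinstance(pattern, str):
        if pattern.startswith("?"):
            if pattern in bindings:
                return bindings if bindings[pattern] == term else None
            return {**bindings, pattern: term}
        return bindings if pattern == term else None
    if not isinstance(term, tuple) or len(term) != len(pattern):
        return None
    for sub_pattern, sub_term in zip(pattern, term):
        bindings = match(sub_pattern, sub_term, bindings)
        if bindings is None:
            return None
    return bindings


def substitute(template, bindings):
    """Fill the variables of `template` with their bound constructions."""
    if isinstance(template, str):
        return bindings.get(template, template)
    return tuple(substitute(part, bindings) for part in template)


def rewrite(term, definitions):
    """Rewrite bottom-up, replacing any construction that matches a definition."""
    if isinstance(term, tuple):
        term = tuple(rewrite(part, definitions) for part in term)
    for pattern, replacement in definitions:
        bindings = match(pattern, term, {})
        if bindings is not None:
            return substitute(replacement, bindings)
    return term


def show(term):
    """Render a construction in head(arg, ...) notation."""
    if isinstance(term, str):
        return term
    head, *args = term
    return f"{head}({', '.join(show(arg) for arg in args)})"


# The definition from the text: continually(thinks(x, y)) means loves(x, y).
DEFINITIONS = [
    (("continually", ("thinks", "?x", "?y")), ("loves", "?x", "?y")),
]

thought = ("continually", ("thinks", "Bob", "Mary"))
print(show(thought))                        # continually(thinks(Bob, Mary))
print(show(rewrite(thought, DEFINITIONS)))  # loves(Bob, Mary)
```

Because the rewriting works from the inside out, a definition also applies when the cumbersome construction is buried inside a larger thought.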

Most natural language systems being worked on today are built on some concept like this. The problem I see is that each individual project seems to be working on its own version of computerese. It did seem that some of the work on Multilingua, an intermediate language being used for translation, may have been in the direction of standardization, though. I think that is where the work should go. If a common language form could be developed (and continually redeveloped) for applications to support, then a lot of redundant effort could be avoided.

My idea of this standard language is called Reduced Conceptual Form. The name points out that this language should incorporate some compression by tossing out unneeded words, and should also make the natural language more understandable by building word and phrase dependencies in a logical manner (unlike most natural languages). A definition of RCF would need to include a set of rules covering what syntax should be used for every possible dependency type. This is a major task, and the specification would surely have to be revised many times.
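
To make the idea of a rule set concrete, here is a very small sketch of what a table of dependency types might look like. No RCF specification exists on this page, so the two dependency types ("modifier" and "action"), the list of unneeded words, and the function names are all invented for illustration.

```python
# Hypothetical rule table: dependency type -> number of arguments its head takes.
RULES = {
    "modifier": 1,  # adjective(noun), e.g. purple(dog)
    "action": 2,    # verb(subject, object), e.g. swallows(dog, cat)
}

# Words assumed to carry no content of their own, and so left out of RCF.
UNNEEDED = {"the", "a", "an"}


def compress(words):
    """Drop the words listed in UNNEEDED."""
    return [word for word in words if word.lower() not in UNNEEDED]


def construct(dependency_type, head, *args):
    """Emit head(arg, ...) after checking it against the rule table."""
    if dependency_type not in RULES:
        raise KeyError(f"no rule for dependency type {dependency_type!r}")
    expected = RULES[dependency_type]
    if len(args) != expected:
        raise ValueError(f"{dependency_type} takes {expected} argument(s)")
    return f"{head}({', '.join(args)})"


print(compress("The purple dog swallows a cat".split()))
subject = construct("modifier", "purple", "dog")
print(construct("action", "swallows", subject, "cat"))
```

Keeping the rules in a table rather than in the code is one way to let the specification be revised without rewriting the applications that support it.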

A simple version could be built off of the small example I have shown above.


Questions or comments should be sent to dbethune@hmc.edu