Uploaded on Mar 31, 2010
What is our knowledge of language all about?
As language has diversified so much in recent years, how much can computers understand human language? With this question in mind, the Ohara Laboratory is doing research on natural languages. The research is based on the idea that computers can be a starting point for looking at human language. That is, the problems that computers have in processing language reflect the key features of the human ability to process it. So by looking at the problems computers have in processing language, the researchers expect the true nature of human language to become clear.
Q. "When you look a word up in a dictionary, what you find are definitions such as 'left is the opposite of right' and 'east is the opposite of west.' But understanding the meaning of each word involves not just knowing the purely linguistic meaning of the word like that, but also having encyclopedic knowledge of it. So in our descriptions of the meaning of words, we want to incorporate such encyclopedic knowledge, including common sense and scientific knowledge, which is not usually found in dictionaries. That's the aim of our project."
In kana-to-kanji conversion by computers, accuracy has now improved to over 90%. But getting a computer to understand the meaning of words, which changes subtly depending on the situation, is a problem that's yet to be solved. For example, the Japanese phrase kurumadematsu can mean "wait in the car" or "wait until someone comes." A person can judge which of the two is meant from the words that precede the phrase. But a computer can't always understand the context, so it can't always convert a phrase to the correct
To open the way to technology that can solve this kind of problem, the Ohara Lab has started a project called the Japanese FrameNet.
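The kurumadematsu ambiguity mentioned above can be made concrete with a minimal sketch. It assumes a tiny hand-written lexicon of romanized words (the `LEXICON` table and the `segmentations` function are illustrative names, not part of Japanese FrameNet or of any real conversion system) and simply lists every way the phrase can be split into known words.

```python
# Toy lexicon of romanized Japanese words with rough English glosses.
# This table is an assumption made for illustration only.
LEXICON = {
    "kuruma": "car",
    "de": "in/at (particle)",
    "kuru": "come",
    "made": "until",
    "matsu": "wait",
}


def segmentations(text, lexicon=LEXICON):
    """Return every way to split `text` into words found in `lexicon`."""
    if not text:
        return [[]]  # one way to split the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            # Keep this word and split the remainder recursively.
            for rest in segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results


for words in segmentations("kurumadematsu"):
    glosses = [LEXICON[w] for w in words]
    print(" ".join(words), "=>", " + ".join(glosses))
```

With this lexicon the sketch finds two readings, "kuru made matsu" (wait until someone comes) and "kuruma de matsu" (wait in the car). Nothing in the string itself says which one is intended, which is why the surrounding context matters.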
Q. "We're going to describe the ways in which Japanese speakers normally use words. So first of all, we're collecting examples of how Japanese speakers write, read, and talk. Such a database of words is called a corpus, and a representative Japanese Language Corpus is currently being created by researchers at the National Institute for Japanese Language and other organizations. We use the corpus to look up each word and carefully select example sentences containing it, so we can analyze the meaning of the words as they are used in those sentences."
If Japanese FrameNet becomes widely used in society, it will be possible to investigate online how contemporary people use and understand words. Our colleagues outside Japan have also been building FrameNets for English, German, and Spanish. We are going to link Japanese FrameNet to those for other languages, so that the FrameNets can be used by speakers of one language to understand another, helping them to overcome language barriers.
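One way to picture the linking of FrameNets across languages is as a shared frame that lists words from each language. The sketch below is a simplified assumption, not the actual Japanese FrameNet data model: the `Frame` class, the `counterparts` function, and the frame name "Waiting" with its word lists are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """A semantic frame with the words that evoke it, grouped by language."""
    name: str
    lexical_units: dict = field(default_factory=dict)  # language code -> words

    def add(self, lang, word):
        self.lexical_units.setdefault(lang, []).append(word)


def counterparts(frames, word, src, dst):
    """Words in language `dst` that share a frame with `word` in language `src`."""
    found = []
    for frame in frames:
        if word in frame.lexical_units.get(src, []):
            found.extend(frame.lexical_units.get(dst, []))
    return found


# Hypothetical frame and entries, not taken from any FrameNet database.
waiting = Frame("Waiting")
waiting.add("ja", "matsu")
waiting.add("en", "wait")
waiting.add("de", "warten")
waiting.add("es", "esperar")

print(counterparts([waiting], "matsu", "ja", "en"))
```

The design choice here is that words are never linked to each other directly. They are linked only through the frame they share, so adding a new language means adding words to existing frames.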
Q. "What I'm most interested in is human cognition, and the question 'Why can people understand words?' Or: 'How does the knowledge needed to understand Japanese differ from the knowledge needed to understand English?' When Japanese FrameNet becomes available, we want to use it for computer processing such as machine translation. And we want to see how far we can describe not just Japanese, but German and Spanish, using semantic frames. In that sense, contrasting Japanese with English, German, and Spanish is linguistically very interesting."