Linguistics Researcher Develops New System to Help Computers ‘Learn’ Natural Language

Newswise — AUSTIN, Texas - For more than 50 years, linguists and computer scientists have tried to get computers to understand human language by programming semantics as software. Now, a University of Texas at Austin linguistics researcher, Katrin Erk, is using supercomputers to develop a new method for helping computers learn natural language.

Instead of hard-coding human logic or deciphering dictionaries to try to teach computers language, Erk decided to try a different tactic: feed computers a vast body of texts (which are a reflection of human knowledge) and use the implicit connections between the words to create a map of relationships.

“An intuition for me was that you could visualize the different meanings of a word as points in space,” says Erk, a professor of linguistics who is conducting her research at the Texas Advanced Computing Center. “You could think of them as sometimes far apart, like a battery charge and criminal charges, and sometimes close together, like criminal charges and accusations (‘the newspaper published charges…’). The meaning of a word in a particular context is a point in this space. Then we don’t have to say how many senses a word has. Instead we say: ‘This use of the word is close to this usage in another sentence, but far away from the third use.’ ”
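The idea of word uses as points in space can be illustrated with a small sketch. This is not Erk's system: the six-sentence corpus, the stopword list, and the function names below are all invented for illustration, and the technique shown (summing sentence-level co-occurrence counts of the context words, then comparing uses by cosine similarity) is only one simple way to place each use of a word at a point.

```python
import math
from collections import Counter

# Toy corpus, invented for illustration; real work uses 100 million words or more.
CORPUS = [
    "the phone battery lost its charge overnight",
    "the battery needs a full charge before the trip",
    "the prosecutor filed a criminal charge in court",
    "the newspaper published the charge against the official",
    "the court heard the accusation against the official",
    "the prosecutor filed a criminal accusation in court",
]

# Hypothetical stopword list covering the function words in the toy corpus.
STOPWORDS = {"the", "a", "its", "in", "before", "against"}


def content_words(sentence):
    """Split a sentence on whitespace and drop stopwords."""
    return [w for w in sentence.split() if w not in STOPWORDS]


def cooccurrence_vectors(sentences):
    """Map each word to counts of the words that share a sentence with it."""
    vectors = {}
    for sentence in sentences:
        words = content_words(sentence)
        for i, word in enumerate(words):
            vectors.setdefault(word, Counter()).update(words[:i] + words[i + 1:])
    return vectors


def usage_point(sentence, target, word_vectors):
    """Point for one use of `target`: the summed vectors of its context words."""
    point = Counter()
    for w in content_words(sentence):
        if w != target:
            point.update(word_vectors.get(w, {}))
    point.pop(target, None)  # the target itself would make every use look alike
    return point


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


word_vectors = cooccurrence_vectors(CORPUS)
battery_use = usage_point(CORPUS[0], "charge", word_vectors)
criminal_use = usage_point(CORPUS[2], "charge", word_vectors)
newspaper_use = usage_point(CORPUS[3], "charge", word_vectors)

print("criminal vs. newspaper use:", round(cosine(criminal_use, newspaper_use), 3))
print("criminal vs. battery use:  ", round(cosine(criminal_use, battery_use), 3))
```

In this toy setting the criminal and newspaper uses of "charge" land closer to each other than either does to the battery use, without anyone having to decide in advance how many senses the word has.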

Creating a model that can accurately reproduce the intuitive human ability to distinguish word meanings requires a lot of text and a lot of analytical horsepower.

“The lower end for this kind of research is a text collection of 100 million words,” she explains. “If you can give me a few billion words, I’d be much happier. But how can we process all of that information? That’s where supercomputers come in.”
