By Roland Piquepaille
It took almost thirty years to get decent speech recognition programs on our computers. But while they're good enough to translate our words into characters, they can't engage in a conversation with us (I must say that some humans can't either). According to this article from Technology Research News, things are changing. Computer scientists from Scotland and California have designed a multithreaded system which can anticipate what you're going to say and can also switch context when you jump from one topic to another. This approach, which could be used in a wide range of applications, is welcome. Unfortunately, these researchers have selected the name "Conversational Interface Architecture" for their system, which leads to the worrisome acronym CIA. Anyway, the first commercial applications should be available within two years.
Here is a general description of this dialogue management system.
Researchers from Edinburgh University in Scotland and Stanford University have built a dialogue management system that promises to improve verbal communication with computers by giving the machine a sense of the type of phrase a person is likely to say next.
The Conversational Interface Architecture goes beyond the slot-filling dialogue systems commonly used for airline ticket booking systems by tracking multiple conversation threads, said Oliver Lemon, a senior research fellow at Edinburgh University. Slot-filling dialogue systems prompt users to provide topic-specific information and listen for keywords that determine the system's response to the user.
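To make the contrast concrete, here is a minimal sketch of what a slot-filling dialogue looks like. It is my own illustration, not code from the researchers or from any airline system: the slot names, prompts and keyword patterns are all hypothetical.

```python
import re

# Hypothetical slots for a flight-booking dialogue. Each slot has a prompt
# and a keyword pattern; none of these names come from the article.
SLOTS = {
    "origin": ("Where are you flying from?",
               re.compile(r"\bfrom (\w+)", re.I)),
    "destination": ("Where are you flying to?",
                    re.compile(r"\bto (\w+)", re.I)),
    "day": ("What day do you want to travel?",
            re.compile(r"\b(monday|tuesday|wednesday|thursday|friday"
                       r"|saturday|sunday)\b", re.I)),
}


def fill_slots(utterance, filled):
    """Copy keyword matches from the utterance into the still-empty slots."""
    for name, (_prompt, pattern) in SLOTS.items():
        match = pattern.search(utterance)
        if match and name not in filled:
            filled[name] = match.group(1).lower()
    return filled


def next_prompt(filled):
    """Return the prompt for the first empty slot, or None when all are set."""
    for name, (prompt, _pattern) in SLOTS.items():
        if name not in filled:
            return prompt
    return None


state = fill_slots("A flight from Edinburgh on Friday please", {})
print(state)               # slots found so far
print(next_prompt(state))  # what the system would ask next
```

A system like this only knows about one topic and one fixed list of questions, which is the limitation the multithreaded approach is meant to address.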
And here are some details on how this dialogue management works.
The software follows multithreaded conversations -- those that switch back and forth between several topics -- without having to be programmed, regulates particular topics, and uses this information to improve speech recognition rates, according to the researchers. It also recognizes corrective fragments -- phrases that correct something a user has just said -- and it allows users to initiate, extend and correct dialogue threads at any time.
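The idea of following several threads and using the active one to guide recognition can be sketched roughly as follows. This is only my reading of the description above, under stated assumptions: the `Thread` and `DialogueContext` classes, the move labels and the language model names are hypothetical and do not come from the researchers' system.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from the expected type of the next utterance
# to the name of a recognizer language model.
MODELS = {
    "yn-answer": "yes-no",
    "wh-answer": "wh-phrases",
    "correction": "corrections",
}


@dataclass
class Thread:
    topic: str
    expected_move: str  # type of utterance expected next on this thread
    history: list = field(default_factory=list)


class DialogueContext:
    """Keeps several open threads; the last one in the list is active."""

    def __init__(self):
        self.threads = []

    def open_thread(self, topic, expected_move):
        self.threads.append(Thread(topic, expected_move))

    def switch_to(self, topic):
        """Make an existing thread active when the user changes topic."""
        for i, thread in enumerate(self.threads):
            if thread.topic == topic:
                self.threads.append(self.threads.pop(i))
                return thread
        raise KeyError(topic)

    def language_model(self):
        """Name of the language model suggested by the active thread."""
        if not self.threads:
            return "general"
        return MODELS.get(self.threads[-1].expected_move, "general")


ctx = DialogueContext()
ctx.open_thread("route", "wh-answer")
ctx.open_thread("confirm-takeoff", "yn-answer")
print(ctx.language_model())  # model for the most recent thread
ctx.switch_to("route")
print(ctx.language_model())  # model after jumping back to the first topic
```

The point of the sketch is that the recognizer does not have to listen for everything at once: the active thread narrows down what kind of phrase is likely to come next.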
The system accomplishes this by tracking different types of utterances, including yes or no answers; who, what, where answers; and corrections like "I meant the office" and "not the tree."
[Note: An utterance is a complete unit of talk, bounded by silence.]
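As a rough illustration of what "tracking different types of utterances" could mean, here is a toy classifier for the three types mentioned above. The real system certainly does something more sophisticated; the patterns and labels below are my own assumptions, chosen only so that the two example corrections quoted in the article are recognized.

```python
import re

# Hypothetical surface patterns, checked in this order.
CORRECTION = re.compile(r"^(no,?\s+)?(i meant|not)\b", re.I)
YES_NO = re.compile(r"^(yes|yeah|yep|no|nope|okay|ok)\b", re.I)
WH_ANSWER = re.compile(r"^(at|in|on|to|the|a|an)\b", re.I)


def classify(utterance):
    """Label an utterance as a correction, a yes/no answer, a wh-answer or other."""
    text = utterance.strip()
    if CORRECTION.search(text):
        return "correction"
    if YES_NO.search(text):
        return "yn-answer"
    if WH_ANSWER.search(text):
        return "wh-answer"
    return "other"


for phrase in ["I meant the office", "not the tree", "yes", "to the tower"]:
    print(phrase, "->", classify(phrase))
```

Corrections are checked first so that a phrase such as "No, not the tree" is treated as a correction rather than as a plain "no".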
It's interesting to note that, by using this analysis of utterances, the system can work with any speech recognition system.
What could we do with such software?
The approach could be used in a wide variety of speech recognition systems including telephone-based information systems, interactive entertainment devices, robots, computer interfaces for the visually impaired, in-car dialogue applications, and speech interfaces for personal computers.
Another question remains: when will such systems be available?
The context-sensitive component of the researchers' system could be applied to practical applications now, said Lemon. Multithreaded dialogue management could be used practically within two years, he said.
This research work was published last year in ACM Transactions on Computer-Human Interaction (TOCHI), a journal, in its September 2004 issue (Volume 11, Issue 3, pages 241-267).
Here is a link to the abstract of this paper, titled "Multithreaded context for robust conversational interfaces: Context-sensitive speech recognition and interpretation of corrective fragments." Here is a summary of their results.
In an evaluation of a dialogue system built using this architecture we found that 87.9 percent of recognized utterances were recognized using a context-specific language model, resulting in an 11.5 percent reduction in the overall utterance recognition error rate, and a 13.4 percent reduction in concept error rate. Thus we show that by using context-sensitive recognition based on the predicted type of the user's next dialogue move, a more flexible dialogue system can also exhibit an improvement in speech recognition performance.
Sources: Eric Smalley, Technology Research News, April 6/13, 2005; and various websites