Error Handling Of Spoken Dialogue System
Abstract
The extensive application of spoken dialogue systems will make people's lives considerably easier and will also bring about enormous business opportunities. However, because the recognition components are imperfect, the occurrence of errors is unavoidable. This inherent defect has hindered the development of the technology.
In this paper, I give an overview of error handling techniques, from the importance of error handling and the sources of errors through to some error handling approaches. Some active research directions are also summarized.
Introduction
Speech is the most natural way for human beings to communicate. Spoken dialogue systems make it possible for humans to communicate with machines using speech. Spoken dialogue system technology has grown rapidly in recent years, owing to advances in natural language processing techniques. This trend is also driven by economic interests. Spoken dialogue systems have a wide range of applications, including computer-based call centers, interactive enquiry systems, and other hands-free applications. Although the development of spoken dialogue systems has been impressive, some technical obstacles still prevent them from reaching their full potential. One of the most serious problems is how to deal with errors.
Practical problems in spoken dialogue systems and related devices.
Practical problems may complicate the development of spoken dialogue systems. Examples of potential problems include error handling, communication failure, error parsing, error handling with language-like syntax, error handling with complex syntax, processing of error parameters, and other error types. In general, we try to avoid such situations, although the best approach is not to try to address them all at the same time. A particularly dangerous problem, one that might limit the capabilities of speech processing systems, is error handling. In particular, an error is possible whenever the data is not processed properly or the system does not accept the input. It is possible to treat errors as if they were caused by a software fault, but this does not apply to a fully functional language. This has led to a number of problems that hold back the development of speech-based voice communication systems. There is a widespread misconception that, because of errors, voice communication is carried out by the computer alone, but this is mistaken as a method of solving problems. For example, although most human-computer interaction is performed by hand, it is very difficult to imagine a hands-free voice communication system that relies on a hand-held electronic keyboard during its operation. For other systems to work with hands-free communication, the software would need to send a message at the correct frequency at least, or send it by manual command or to a suitable data location on the computer's motherboard. These problems can lead to poor communication performance, since communication problems are difficult for other applications and sometimes even for communication systems that are not human-related. This also includes the problem of computer error handling; error handling will only affect the communication of human-related information to the processor.
For example, if we send a message to a computer and the program does not process the data properly, the message is not received and the communication breaks down or is stopped. We must understand the nature of errors. Many errors are not related to hardware or software faults. For example, if we send a message to a hard drive and see a message saying that it does not work, the hard drive may freeze and the message may go to a new file, possibly because of the error handling. One example of an error arising from computer error handling is the problem of incorrect input. The problem of incorrect input also exists for software errors. In such cases, the real problem with speech communication is that errors will be handled by a machine automatically. To avoid problems of improper input, systems have to be built with proper software. If this is easy to do, it is important to provide some useful information about the system so that it can be used for other functions. For instance, information may be sent to the computer so that the processing can be carried out correctly. In other cases, the input of the system may be transmitted to the software instead. This can convey information very efficiently and help to avoid errors in the communication.
Designing a robust spoken dialogue system is a challenging task. Although automatic speech recognition (ASR) has reached a relatively successful level, it is still imperfect and is often blamed for the uncertainty in spoken dialogue systems. Moreover, a spoken dialogue system can never be certain of what the user has said; it merely makes decisions based on the values of some parameters, which is the second reason for uncertainty.
Miscommunication often happens in human-human dialogue, let alone in a human-computer environment. Errors in spoken dialogue systems have become a bottleneck that constrains the adoption of spoken dialogue system technology in real life. Since the occurrence of errors in a spoken dialogue system is unavoidable, the system should be equipped with well-designed error handling mechanisms.
In this paper, I give an overview of error handling techniques. The first two parts discuss the importance of error handling and the sources of errors in spoken dialogue systems. After this, I review some related studies from the following aspects: human-human recovery strategies, studies on the human side, and the application of machine learning theory to the error handling domain.
Why is error handling important?
A spoken dialogue system is a dialogue system delivered through voice [1]. Basically, a spoken dialogue system should combine the following modules:
1. Speech recognizer: This module is the core of the system; it ensures that speech signals can be transformed into digital data. The quality of speech recognition influences the performance of the whole spoken dialogue system.
2. Response generator: Its function goes beyond that of a traditional speech synthesizer, and it must take task-domain information into account.
3. Dialog manager: This is the brain of a spoken dialogue system. It decides which dialogue strategy to adopt based on the current system status.
4. Natural language understanding: This module tries to understand the user's intention by employing natural language processing techniques.
The occurrence of errors significantly influences the performance and success of a spoken dialogue system. 25-35% of utterances generated by the system are used to correct system mistakes [2]. Further, miscommunication leads to a higher error rate in subsequent utterances. For example, if users detect that the system cannot understand their intention, they tend to switch to another form of language (for example, longer sentences, marked word order, or repeated information), which makes the recognition results even worse. In addition, when errors occur, users tend to correct them in a hyperarticulated manner, characterized by longer duration and a louder voice. This also leads to worse recognition results, since the recognizers are trained on normal speech. Worse still, users may become bored with repeating their utterances again and again. If they find that the system still cannot understand their intention, they may choose another input method instead.
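To make the division of labour between the four modules listed in this section concrete, the following is a minimal sketch in Python of how one user turn might pass through them. It is an illustration under stated assumptions, not the design of any particular system: the function names (`recognize`, `understand`, `manage`, `generate`, `run_turn`), the keyword-based understanding step, and the use of a text string as a stand-in for the audio signal are all hypothetical.

```python
def recognize(audio: str) -> str:
    """Speech recognizer stand-in: a string plays the role of the audio signal."""
    return audio.lower().strip()


def understand(text: str) -> dict:
    """Natural language understanding stand-in: a hypothetical keyword match."""
    if "weather" in text:
        return {"intent": "get_weather"}
    return {"intent": None}  # no interpretation could be built


def manage(state: dict, meaning: dict) -> str:
    """Dialog manager stand-in: choose the next action from the system status."""
    if meaning["intent"] is None:
        # Count consecutive failed turns so that a strategy could change later.
        state["failures"] = state.get("failures", 0) + 1
        return "ask_repeat"
    state["failures"] = 0
    return "answer"


def generate(action: str) -> str:
    """Response generator stand-in: map the chosen action to a system prompt."""
    prompts = {
        "answer": "Here is the weather forecast.",
        "ask_repeat": "Sorry, could you repeat that?",
    }
    return prompts[action]


def run_turn(state: dict, audio: str) -> str:
    """Pass one user turn through recognizer, understanding, manager, generator."""
    return generate(manage(state, understand(recognize(audio))))


state = {}
print(run_turn(state, "What is the WEATHER today"))
print(run_turn(state, "mumble mumble"))
```

In this sketch an error made in an early stage is passed on to every later stage, which is one reason why the quality of the recognizer affects the performance of the whole system.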
Sources of errors
To develop better error handling strategies, the first step is to investigate where the sources of errors in spoken dialogue systems lie. Miscommunication is usually divided into two categories: misunderstanding and non-understanding. A misunderstanding means that the system constructs an incorrect interpretation of the user's turn; it is generated when the system is overconfident in its hypotheses. In a non-understanding, the system fails totally to construct an interpretation [3]; this usually happens because the system underestimates its hypotheses.
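The two categories can be illustrated with a small Python sketch. It assumes, purely for illustration, that the recognizer attaches a confidence score between 0 and 1 to each hypothesis and that the system accepts a hypothesis when the score reaches a fixed threshold; the names `Hypothesis`, `classify_outcome`, and `accept_threshold` are hypothetical and do not come from the studies cited here.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str          # what the system believes the user said
    confidence: float  # assumed score in [0, 1]; higher means more certain


def classify_outcome(hyp: Hypothesis, actual: str,
                     accept_threshold: float = 0.5) -> str:
    """Label one user turn with the miscommunication categories."""
    if hyp.confidence < accept_threshold:
        # The system does not trust its hypothesis, so no interpretation is built.
        return "non-understanding"
    if hyp.text != actual:
        # The system trusted a hypothesis that turned out to be wrong.
        return "misunderstanding"
    return "correct understanding"


print(classify_outcome(Hypothesis("boston", 0.9), "austin"))
print(classify_outcome(Hypothesis("austin", 0.3), "austin"))
print(classify_outcome(Hypothesis("austin", 0.9), "austin"))
```

Under these assumptions, raising `accept_threshold` turns some misunderstandings into non-understandings, while lowering it does the opposite, so the threshold trades one kind of error against the other.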
Bohus et al. (2005) conducted an empirical study of errors in spoken dialogue systems. They analyzed error sources using the grounding model used in the Conversational Architectures project by Paek and Horvitz (2000). According to this model, the process of communication between users and spoken dialogue systems is broken down into four levels: Conversation Level, Intention Level, Signal Level, and Channel Level. Errors are classified according to these levels.
1. Out-of-Application: The user's intention is outside the system's application. This type of error can be further divided into out-of-domain utterances and out-of-application-scope utterances.
2. Out-of-Grammar: The user's utterance is outside the system's semantic grammar.
3. ASR Error: The user's utterance is not recognized correctly due to acoustic or statistical language modeling mismatches.
4. End-pointer Error: The end-pointer cannot correctly transform the incoming audio signal.
The classification above may not include all of the sources of errors, and the occurrence of errors may vary between specific task domains. Nevertheless, it shows that errors in a spoken dialogue system can happen anywhere. Misunderstandings happen rarely, but the cost of resolving them is much higher than that of non-understandings. It is difficult to recover from