Spoken language systems cover the following subfields
(see also Gibbon et al., 1997a, User's guide, Section 1.2.1):
- automatic speech recognition: for example, dictation software, which converts spoken utterances into written text;
- automatic speech generation/speech synthesis systems: this covers
text-to-speech systems and read-back systems that use grapheme-to-phoneme conversion to produce audible and intelligible utterances;
- speech input/output systems ('speech understanding' systems): systems
combining automatic speech recognition and synthesis which are able to process
semantic information;
- spoken dialogue systems and speech-to-speech translation systems:
systems mediating, for example, between different language communities by
translating spoken utterances (these include a speech understanding system and
at least a text-to-speech system for the target language);
- speech coding;
- speech analysis or paralinguistic processing: prosodic analysis, speaker identification;
- general speech processing;
- multimodal speech recognition and synthesis systems: the most general
approach of all those mentioned, combining speech input/output systems
with systems that use non-verbal communication such as gestures, facial
expressions, and typed input (see Figure 1.1).

Figure 1.1: Multimodal systems
Multimodal systems form the most comprehensive class
in the field of spoken language systems; one might even
define the domain as
multimodal systems and treat spoken language systems as a subset
thereof.
For historical reasons (spoken language systems have a longer
tradition as a domain), this paper nevertheless treats spoken language systems as
the top-level domain, with the subdomains listed above, including
multimodal systems.
Thorsten Trippel
Fri May 21 13:04:11 MET DST 1999