Speech is efficient and robust, and remains the method of choice for human communication. Consequently, speech output is used increasingly to deliver information in automated systems such as talking GPS and live-but-remote forms such as public address systems. However, these systems are essentially one-way, output-oriented technologies that lack an essential ingredient of human interaction: communication. When people speak, they also listen. When machines speak, they do not listen. As a result, there is no guarantee that the intended message is intelligible, appropriate or well-timed. The current generation of speech output technology is deaf, incapable of adapting to the listener's context, inefficient in use and lacking the naturalness that comes from rapid appreciation of the speaker-listener environment. Crucially, when speech output is employed in safety-critical environments such as vehicles and factories, inappropriate interventions can increase the chance of accidents through divided attention, while similar problems can result from the fatiguing effect of unnatural speech. In less critical environments, crude solutions involve setting the gain level of the output signal to a level that is unpleasant, repetitive and at times distorted. All of these applications of speech output will, in the future, be subject to more sophisticated treatments based at least in part on understanding how humans communicate.

The purpose of the LISTA (The Listening Talker) project, funded under the EU Framework 7 Future and Emerging Technologies (FET) Programme, was to develop the scientific foundations needed to enable the next generation of spoken output technologies. LISTA targeted all forms of generated speech -- synthetic, recorded and live -- by observing how listeners modify their production patterns in realistic environments that are characterised by noise and natural, rapid interactions.

Items in this Collection

  • Hurricane natural speech corpus - higher quality version 

    Valentini-Botinhao, Cassia; Mayo, Cassie; Cooke, Martin
    Single male native British-English talker recorded producing three speech sets (Harvard sentences, Modified Rhyme Test, news sentences) in quiet and while the talker was listening to speech-shaped noise at 84dB(A). This ...
  • Sharvard_IJA 

    Aubanel, Vincent; Garcia Lecumberri, Maria Luisa; Cooke, Martin (LISTA Consortium: (i) Language and Speech Laboratory, Universidad del Pais Vasco, Spain and Ikerbasque, Spain; (ii) Centre for Speech Technology Research, University of Edinburgh, UK; (iii) KTH Royal Institute of Technology, Sweden; (iv) Institute of Computer Science, FORTH, Greece, 2014-07-28)
    Two native Spanish talkers (one male, one female) recorded producing 700 Spanish sentences designed to be the Spanish equivalent of the English language Harvard sentences (thus phonetically balanced across sets of ten ...
  • DiapixFL 

    Cooke, Martin; Garcia Lecumberri, Maria Luisa; Wester, Mirjam (LISTA Consortium: (i) Language and Speech Laboratory, Universidad del Pais Vasco, Spain and Ikerbasque, Spain; (ii) Centre for Speech Technology Research, University of Edinburgh, UK; (iii) KTH Royal Institute of Technology, Sweden; (iv) Institute of Computer Science, FORTH, Greece, 2013-10-01)
    DiapixFL consists of speakers whose first language (L1) is either English or Spanish solving a "spot-the-difference" task in both their L1 and their second language (L2, which for native English talkers is Spanish, and for ...
  • Hurricane natural speech corpus 

    Cooke, Martin; Mayo, Catherine; Valentini-Botinhao, Cassia (LISTA Consortium: (i) Language and Speech Laboratory, Universidad del Pais Vasco, Spain and Ikerbasque, Spain; (ii) Centre for Speech Technology Research, University of Edinburgh, UK; (iii) KTH Royal Institute of Technology, Sweden; (iv) Institute of Computer Science, FORTH, Greece, 2013-10-01)
    Single male native British-English talker recorded producing three speech sets (Harvard sentences, Modified Rhyme Test, news sentences) in quiet and while the talker was listening to speech-shaped noise at 84dB(A). A higher ...
  • Sharvard 

    Cooke, Martin; Garcia Lecumberri, Maria Luisa; Aubanel, Vincent (LISTA Consortium: (i) Language and Speech Laboratory, Universidad del Pais Vasco, Spain and Ikerbasque, Spain; (ii) Centre for Speech Technology Research, University of Edinburgh, UK; (iii) KTH Royal Institute of Technology, Sweden; (iv) Institute of Computer Science, FORTH, Greece, 2013-09-24)
    Two native Spanish talkers (one male, one female) recorded producing 720 Spanish sentences designed to be the Spanish equivalent of the English language Harvard sentences (thus phonetically balanced across sets of ten sentences).
  • Acted clear speech corpus 

    Mayo, Catherine (LISTA Consortium: (i) Language and Speech Laboratory, Universidad del Pais Vasco, Spain and Ikerbasque, Spain; (ii) Centre for Speech Technology Research, University of Edinburgh, UK; (iii) KTH Royal Institute of Technology, Sweden; (iv) Institute of Computer Science, FORTH, Greece, 2013-09-24)
    Single male native British English talker recorded producing 25 TIMIT sentences in 5 conditions, two natural: (i) quiet, (ii) while the talker listened to high-intensity speech-shaped noise, and three acted: (i) as if to ...