The Centre for Speech Technology Research (CSTR) is an interdisciplinary research centre linking Informatics with Linguistics and English Language.

Founded in 1984, CSTR is concerned with research in all areas of speech technology including speech recognition, speech synthesis, speech signal processing, information access, multimodal interfaces and dialogue systems. We have many collaborations with the wider community of researchers in speech science, language, cognition and machine learning for which Edinburgh is renowned.

http://www.cstr.ed.ac.uk/publications/

Recent Submissions

  • The Edinburgh International Accents of English Corpus 

    Sanabria, Ramon; Markl, Nina; Carmantini, Andrea; Klejch, Ondrej; Bell, Peter; Bogoychev, Nikolay
    English is the most widely spoken language in the world, used daily by millions of people as a first or second language in many different contexts. As a result, there are many varieties of English. Although the great many ...
  • SUPERSEDED - The Edinburgh International Accents of English Corpus 

    Sanabria, Ramon; Bogoychev, Nikolay; Markl, Nina; Carmantini, Andrea; Klejch, Ondrej; Bell, Peter
    ## This item has been replaced by the one which can be found at https://datashare.ed.ac.uk/handle/10283/4836 - https://doi.org/10.7488/ds/3832 ##. English is the most widely spoken language in the world, used daily by ...
  • REYD Yiddish TTS Corpus 

    Webber, Jacob; Bleaman, Isaac; Lo, Samuel; King, Simon
    * The Reading Electronic Yiddish Documents (REYD) Dataset. The REYD TTS dataset is a speech dataset for Yiddish consisting of 4,892 short audio clips, with a total duration of 475.7 minutes. The recordings are of three ...
  • CSTR NAM TIMIT Plus 

    Yamagishi, Junichi; Brown, Georgina; Yang, ChenYu; Clark, Rob; King, Simon
    CSTR NAM TIMIT Plus (Version 0.8) RELEASE May 2012 The Centre for Speech Technology Research University of Edinburgh Copyright (c) 2012 Junichi Yamagishi jyamagis@inf.ed.ac.uk Overview This CSTR NAM TIMIT Plus corpus ...
  • Listening-test materials for "Where do the improvements come from in sequence-to-sequence neural TTS?" 

    Watts, Oliver; Henter, Gustav Eje; Fong, Jason; Valentini-Botinhao, Cassia
    This data release contains listening-test materials associated with the paper "Where do the improvements come from in sequence-to-sequence neural TTS?", presented at SSW10 (the 10th ISCA Speech Synthesis Workshop) in Vienna, ...
  • CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit (version 0.92) 

    Yamagishi, Junichi; Veaux, Christophe; MacDonald, Kirsten
    This CSTR VCTK Corpus includes speech data uttered by 110 English speakers with various accents. Each speaker reads out about 400 sentences, which were selected from a newspaper, the rainbow passage and an elicitation ...
  • ASVspoof 2019: The 3rd Automatic Speaker Verification Spoofing and Countermeasures Challenge database 

    Yamagishi, Junichi; Todisco, Massimiliano; Sahidullah, Md; Delgado, Héctor; Wang, Xin; Evans, Nicolas; Kinnunen, Tomi; Lee, Kong Aik; Vestman, Ville; Nautsch, Andreas
    This is a database used for the Third Automatic Speaker Verification Spoofing and Countermeasures Challenge, for short, ASVspoof 2019 (http://www.asvspoof.org) organized by Junichi Yamagishi, Massimiliano Todisco, Md ...
  • Listening-test materials for "Modern speech synthesis for phonetic sciences: a discussion and an evaluation" 

    Malisz, Zofia; Henter, Gustav Eje; Valentini-Botinhao, Cassia; Watts, Oliver; Beskow, Jonas; Gustafson, Joakim
    This data release contains listening-test materials associated with the paper "Modern speech synthesis for phonetic sciences: a discussion and an evaluation", presented at ICPhS 2019 in Melbourne, Australia.
  • Alba speech corpus 

    Valentini-Botinhao, Cassia; Yamagishi, Junichi
    Single speaker read speech corpus of a Scottish accented female native English speaker (Alba). The corpus was recorded in four speaking styles: plain (normal read speech, around 4 hours of recordings), fast (speaking as ...
  • Listening test results of the Voice Conversion Challenge 2018 

    Yamagishi, Junichi; Wang, Xin
    This dataset is associated with a paper and a dataset below: (1) Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, Fernando Villavicencio, Tomi Kinnunen, Zhenhua Ling, "The Voice Conversion Challenge ...
  • UltraSuite Repository - sample data 

    Eshky, Aciel; Ribeiro, Manuel Sam; Cleland, Joanne; Renals, Steve; Richmond, Korin; Roxburgh, Zoe; Scobbie, James; Wrench, Alan
    UltraSuite is a repository of ultrasound and acoustic data from child speech therapy sessions. The current release includes three data collections, one from typically developing children -- Ultrax Typically Developing ...
  • Hurricane natural speech corpus - higher quality version 

    Valentini-Botinhao, Cassia; Mayo, Cassie; Cooke, Martin
    Single male native British-English talker recorded producing three speech sets (Harvard sentences, Modified Rhyme Test, news sentences) in quiet and while the talker was listening to speech-shaped noise at 84dB(A). This ...
  • Parallel Audiobook Corpus 

    Ribeiro, Manuel Sam
    The Parallel Audiobook Corpus (version 1.0) is a collection of parallel readings of audiobooks. The corpus consists of approximately 121 hours of speech at 22.05KHz across 4 books and 59 speakers. The data is provided in ...
  • Manual and automatic labels for version 1.0 of UXTD, UXSSD, and UPX core data -- version 1.0 

    Eshky, Aciel; Ribeiro, Manuel Sam; Cleland, Joanne; Renals, Steve; Richmond, Korin; Roxburgh, Zoe; Scobbie, James; Wrench, Alan
    UltraSuite is a repository of ultrasound and acoustic data from child speech therapy sessions. The current release includes three data collections, one from typically developing children (UXTD) and two from children with ...
  • The Voice Conversion Challenge 2018: database and results 

    Lorenzo-Trueba, Jaime; Yamagishi, Junichi; Toda, Tomoki; Saito, Daisuke; Villavicencio, Fernando; Kinnunen, Tomi; Ling, Zhenhua
    Voice conversion (VC) is a technique to transform a speaker identity included in a source speech waveform into a different one while preserving linguistic information of the source speech waveform. In 2016, we have ...
  • The 2nd Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2017) Database, Version 2 

    Kinnunen, Tomi; Sahidullah, Md; Delgado, Héctor; Todisco, Massimiliano; Evans, Nicholas; Yamagishi, Junichi; Lee, Kong Aik
    This is a database used for the Second Automatic Speaker Verification Spoofing and Countermeasures Challenge, for short, ASVspoof 2017 (http://www.asvspoof.org) organized by Tomi Kinnunen, Md Sahidullah, Héctor Delgado, ...
  • Device Recorded VCTK (Small subset version) 

    Sarfjoo, Seyyed Saeed; Yamagishi, Junichi
    This dataset is a new variant of the voice cloning toolkit (VCTK) dataset: device-recorded VCTK (DR-VCTK), where the high-quality speech signals recorded in a semi-anechoic chamber using professional audio devices are ...
  • SUPERSEDED - The 2nd Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2017) Database, Version 2 

    Kinnunen, Tomi; Sahidullah, Md; Delgado, Héctor; Todisco, Massimiliano; Evans, Nicholas; Yamagishi, Junichi; Lee, Kong Aik
    ## This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/2332 ##
  • SUPERSEDED - Dutch English Lombard Speech Native and Non-Native (DELNN) 

    Marcoux, Katherine; Ernestus, Mirjam; King, Simon
    ## SUPERSEDED - This dataset has been replaced by the one which can be found at https://doi.org/10.5281/zenodo.4267819 ## The DELNN (Dutch English Lombard speech Native and Non-Native) corpus consists of 30 native Dutch ...
