• Datasets of journal paper "Arduino-Based Myoelectric Control: Towards Longitudinal Study of Prosthesis Use" 

      Wu, Hancong; Nazarpour, Kianoush
      This dataset comprises the mean absolute value (MAV) data used to draw Figure 6 and the normalized MAV data used to draw Figure 7 in the paper "Arduino-Based Myoelectric Control: Towards Longitudinal Study of Prosthesis Use".
    • Device Recorded VCTK (Small subset version) 

      Sarfjoo, Seyyed Saeed; Yamagishi, Junichi
      This dataset is a new variant of the voice cloning toolkit (VCTK) dataset: device-recorded VCTK (DR-VCTK), where the high-quality speech signals recorded in a semi-anechoic chamber using professional audio devices are ...
    • DiapixFL 

      Cooke, Martin; Garcia Lecumberri, Maria Luisa; Wester, Mirjam (LISTA Consortium: (i) Language and Speech Laboratory, Universidad del Pais Vasco, Spain and Ikerbasque, Spain; (ii) Centre for Speech Technology Research, University of Edinburgh, UK; (iii) KTH Royal Institute of Technology, Sweden; (iv) Institute of Computer Science, FORTH, Greece, 2013-10-01)
      DiapixFL consists of speakers whose first language (L1) is either English or Spanish solving a "spot-the-difference" task in both their L1 and their second language (L2, which for native English talkers is Spanish, and for ...
    • Dutch English Lombard Speech Native and Non-Native (DELNN) 

      Marcoux, Katherine; Ernestus, Mirjam; King, Simon
      The DELNN (Dutch English Lombard speech Native and Non-Native) corpus consists of 30 native Dutch speakers reading sentences in a quiet environment and in a noisy environment, to elicit Lombard speech. The Dutch speakers ...
    • EEMBC Benchmark Suite Simulations 

      Tomusk, Erik
      This dataset contains gem5 simulation results and McPAT power consumption figures for 3000 out-of-order CPU cores running EEMBC DENBench (digital entertainment) and Networking 2.0 benchmarks. The benchmarks have been ...
    • EEMBC FPMark Benchmark Suite Simulations 

      Tomusk, Erik
      This dataset contains gem5 simulation results and McPAT power consumption figures for 3000 out-of-order CPU cores running EEMBC FPMark benchmarks. The benchmarks have been compiled for the ARM ISA and have been simulated ...
    • Experiment materials for "Disfluencies in change detection in natural, vocoded and synthetic speech." 

      Dall, Rasmus; Wester, Mirjam; Corley, Martin
      The current dataset is associated with the DiSS paper "Disfluencies in change detection in natural, vocoded and synthetic speech." In this paper we investigate the effect of filled pauses, a discourse marker and silent ...
    • Experiment materials for "Testing the consistency assumption: pronunciation variant forced alignment in read and spontaneous speech synthesis" 

      Dall, Rasmus
      The MATLAB scripts are used to analyse the results files in the results folder. Test_Wavs contains the WAV files used for the listening test, divided by group, along with the pre-test test files.
    • Experiment materials for "The temporal delay hypothesis: Natural, vocoded and synthetic speech." 

      Corley, Martin; Dall, Rasmus; Wester, Mirjam
      Including disfluencies in synthetic speech is being explored as a way of making synthetic speech sound more natural and conversational. How to measure whether the resulting speech is actually more natural, however, is not ...
    • GitHub Java Corpus 

      Allamanis, Miltiadis; Sutton, Charles
      The GitHub Java Corpus is a snapshot of all open-source Java code on GitHub in October 2012 that is contained in open-source projects that at the time had at least one fork. It contains code from 14,785 projects amounting ...
    • Hiberlink project data 

      Tobin, Richard; Grover, Claire; Zhou, Ke
      Summary files (in XML format) listing URIs referenced in papers from arXiv, Elsevier, and PMC respectively (approximately 1 million URIs from 3 million papers in total). The focus of the Hiberlink project was to assess the ...
    • The Human Know-How Dataset 

      Pareti, Paolo; Klein, Ewan H.
      The Human Know-How Dataset describes 211,696 human activities from many different domains. These activities are decomposed into 2,609,236 entities (each with an English textual label). These entities represent over two ...
    • Human vs Machine Spoofing 

      Wester, Mirjam; Wu, Zhizheng; Yamagishi, Junichi
      Listening test materials for "Human vs Machine Spoofing Detection on Wideband and Narrowband data." They include lists of the speech material selected from the SAS spoofing database and the listeners' responses. The main ...
    • Hurricane natural speech corpus 

      Cooke, Martin; Mayo, Catherine; Valentini-Botinhao, Cassia (LISTA Consortium: (i) Language and Speech Laboratory, Universidad del Pais Vasco, Spain and Ikerbasque, Spain; (ii) Centre for Speech Technology Research, University of Edinburgh, UK; (iii) KTH Royal Institute of Technology, Sweden; (iv) Institute of Computer Science, FORTH, Greece, 2013-10-01)
      Single male native British-English talker recorded producing three speech sets (Harvard sentences, Modified Rhyme Test, news sentences) in quiet and while the talker was listening to speech-shaped noise at 84 dB(A). A higher ...
    • Hurricane natural speech corpus - higher quality version 

      Valentini-Botinhao, Cassia; Mayo, Cassie; Cooke, Martin
      Single male native British-English talker recorded producing three speech sets (Harvard sentences, Modified Rhyme Test, news sentences) in quiet and while the talker was listening to speech-shaped noise at 84 dB(A). This ...
    • IDEAL Household Energy Dataset 

      Goddard, Nigel; Kilgour, Jonathan; Pullinger, Martin; Arvind, D.K.; Lovell, Heather; Moore, Johanna; Shipworth, David; Sutton, Charles; Webb, Jan; Berliner, Niklas; Brewitt, Cillian; Dzikovska, Myroslava; Farrow, Edmund; Farrow, Elaine; Mann, Janek; Morgan, Evan; Webb, Lynda; Zhong, Mingjun
      The IDEAL Household Energy Dataset comprises data from 255 UK homes. Alongside electric and gas data from each home the corpus contains individual room temperature and humidity readings and temperature readings from the ...
    • Key files for Spoofing and Anti-Spoofing (SAS) corpus v1.0 

      Wu, Zhizheng; Khodabakhsh, Ali; Demiroglu, Cenk; Yamagishi, Junichi; Saito, Daisuke; Toda, Tomoki; Ling, Zhen-Hua; King, Simon
      These files are complementary to the fileset: Wu et al. (2015). Spoofing and Anti-Spoofing (SAS) corpus v1.0, [dataset]. University of Edinburgh. The Centre for Speech Technology Research (CSTR). https://doi.org/10.7488/ds/252. ...
    • Listening test materials for "A study of speaker adaptation for DNN-based speech synthesis" 

      Wu, Zhizheng
      The dataset contains the testing stimuli and listeners' MUSHRA test responses for the Interspeech 2015 paper, "A study of speaker adaptation for DNN-based speech synthesis". In this paper, we conduct an experimental analysis ...
    • Listening test materials for "A template-based approach for speech synthesis intonation generation using LSTMs" 

      Ronanki, Srikanth; Henter, Gustav Eje; Wu, Zhizheng; King, Simon
      This data release contains listening test materials associated with the paper "A template-based approach for speech synthesis intonation generation using LSTMs", presented at Interspeech 2016 in San Francisco, USA.
    • Listening test materials for "Deep neural network context embeddings for model selection in rich-context HMM synthesis" 

      Merritt, Thomas
      These are the listening test materials for "Deep neural network context embeddings for model selection in rich-context HMM synthesis". They include the waveforms played to listeners as well as the listeners' responses.