    • ManySStuBs4J Dataset

      Karampatsis, Rafael-Michael
      The ManySStuBs4J corpus contains simple statement bugs mined from open-source Java projects hosted on GitHub. There are two variations of the dataset: one mined from the 100 Java Maven Projects and one mined from the top ...
    • multimodal TRIPOD 

      Papalampidi, P; Keller, F; Lapata, M
      The data contain multimodal features extracted for the TRIPOD dataset and used in the AAAI 2021 paper "Movie Summarization via Sparse Graph Construction". The data contain 122 pickle files, each one corresponding to a movie ...
    • SUPERSEDED - ManySStuBs4J Dataset 

      Karampatsis, Rafael-Michael
      ## This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/2628 ## The ManySStuBs4J corpus contains simple statement bugs mined from open-source Java projects hosted on GitHub. There are ...
    • Visual and Linguistic Treebank 

      Elliott, Desmond; Keller, Frank (2014-09-04)
      The Visual and Linguistic Treebank is a data set of images annotated with human-written descriptions, object boundaries, and Visual Dependency Representations. The images are freely available from the Action Recognition ...
    • WikiCatSum 

      Perez-Beltrachini, Laura; Liu, Yang; Lapata, Mirella
      WikiCatSum is a domain-specific Multi-Document Summarisation (MDS) dataset. It assumes the summarisation task of generating Wikipedia lead sections for Wikipedia entities of a certain domain (e.g. Companies) from the set ...