Assembly and validation of conserved long non-coding RNAs in the ruminant transcriptome
Date Available
2018-01-08
Type
dataset
Data Creator
Bush, Stephen J
Muriuki, Charity
McCulloch, Mary E. B.
Farquhar, Iseabail L.
Clark, Emily L.
Hume, David A.
Publisher
Roslin Institute. University of Edinburgh
Citation
Bush, Stephen; Muriuki, Charity; McCulloch, Mary E. B.; Farquhar, Iseabail L.; Clark, Emily L.; Hume, David A. (2018). Assembly and validation of conserved long non-coding RNAs in the ruminant transcriptome [dataset]. Roslin Institute. University of Edinburgh. https://doi.org/10.7488/ds/2284
Description
mRNA-like long non-coding RNAs (lncRNA) are a significant component of mammalian transcriptomes, although most are expressed only at low levels, with high tissue-specificity and/or at specific developmental stages. This dataset demonstrates that few lncRNA are fully captured by biological replicates of the same RNA-seq library. In a transcriptional atlas of the domestic sheep (https://doi.org/10.1371/journal.pgen.1006997), 31 diverse tissues/cell types were sampled in each of 6 individual adults (3 females, 3 males, all unrelated virgin animals approximately 2 years of age). By taking a subset of 31 common tissues per individual, each of the 6 adults (f1, f2, f3, m1, m2, and m3) was represented by ~0.75 billion reads. In a typical lncRNA assembly pipeline, read alignments from all individuals are merged to maximise the number of candidate gene models (using, for instance, StringTie --merge). With n = 6 adults (and ~0.75 billion reads per adult), there are 2^n - 1 = 63 possible non-empty combinations of adults for which GTFs can be made with StringTie --merge. This dataset comprises those GTFs.
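The count of 63 combinations can be checked by enumerating every non-empty subset of the six adults. The sketch below is illustrative only and is not the authors' pipeline: the per-adult input file names (e.g. f1.gtf) and the merged output names are hypothetical, and the commands are only constructed, not run. It assumes one assembled GTF per adult is passed to StringTie in merge mode.

```python
from itertools import combinations

# The six adults named in the description.
ADULTS = ["f1", "f2", "f3", "m1", "m2", "m3"]


def non_empty_subsets(items):
    """Yield every non-empty subset of items, smallest subsets first."""
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)


def merge_command(subset):
    """Build a StringTie merge command for one subset of adults.

    Assumption: one GTF per adult named '<adult>.gtf'; these file names
    and the output naming scheme are hypothetical.
    """
    inputs = [f"{adult}.gtf" for adult in subset]
    output = "_".join(subset) + ".merged.gtf"
    return ["stringtie", "--merge", "-o", output, *inputs]


subsets = list(non_empty_subsets(ADULTS))
commands = [merge_command(s) for s in subsets]

# 2^6 - 1 non-empty subsets, one merged GTF per subset.
print(len(subsets))
```

Each entry in commands is an argument list that could be passed to subprocess.run once the input GTFs exist. The single-adult subsets are included because they count towards the 63 combinations.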