Description
This track shows the locations of non-protein-coding RNA genes and
pseudogenes.
Feature types include:
- tRNA: Transfer RNA (or pseudogene)
- rRNA: Ribosomal RNA (or pseudogene)
- scRNA: Small cytoplasmic RNA (or pseudogene)
- snRNA: Small nuclear RNA (or pseudogene)
- snoRNA: Small nucleolar RNA (or pseudogene)
- miRNA: MicroRNA (or pseudogene)
- misc_RNA: Miscellaneous other RNA, such as Xist (or pseudogene)
- mt-tRNA: Mitochondrial tRNA-derived pseudogene
Methods
Eddy-tRNAscanSE (tRNA genes, Sean Eddy):
tRNAscan-SE 1.23 with default parameters.
The score field contains the tRNAscan-SE bit score; a score >20 is good
and a score >50 is very good.
Eddy-BLAST-tRNAlib (tRNA pseudogenes, Sean Eddy):
Wublast 2.0, with options "-kap wordmask=seg B=50000
W=8 cpus=1".
Score field contains % identity in blast-aligned region.
Used each of 602 tRNAs and pseudogenes predicted by tRNAscan-SE
in the human oo27 assembly as queries. Kept all nonoverlapping
regions that hit one or more of these with P <= 0.001.
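The filtering step above can be sketched roughly as follows. This is a minimal illustration, not the pipeline actually used: the `Hit` record and the `nonoverlapping_regions` function are hypothetical names, and merging overlapping hits into one region is an assumption, since the description does not say how overlaps were resolved. Strand is ignored for simplicity.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    """Hypothetical record for one BLAST hit on the genome."""
    chrom: str
    start: int      # 0-based, inclusive
    end: int        # exclusive
    p_value: float

P_CUTOFF = 0.001    # threshold stated in the text: P <= 0.001

def nonoverlapping_regions(hits, p_cutoff=P_CUTOFF):
    """Drop hits above the P cutoff, then merge overlapping hits per chromosome.

    Merging is an assumed way to obtain nonoverlapping regions.
    """
    kept = sorted(
        (h for h in hits if h.p_value <= p_cutoff),
        key=lambda h: (h.chrom, h.start, h.end),
    )
    regions = []
    for h in kept:
        if regions and regions[-1][0] == h.chrom and h.start < regions[-1][2]:
            # Overlaps the previous region: extend it instead of adding a new one.
            chrom, start, end = regions[-1]
            regions[-1] = (chrom, start, max(end, h.end))
        else:
            regions.append((h.chrom, h.start, h.end))
    return regions
```

Each genomic region is reported once, no matter how many of the query sequences hit it.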
Eddy-BLAST-snornalib (known snoRNAs and snoRNA pseudogenes, Steve Johnson):
Wublastn 2.0, with options "-V=25 -hspmax=5000 -kap wordmask=seg
B=5000 W=8 cpus=1".
Score field contains blast score.
Used each of 104 unique snoRNAs in snorna.lib as a query.
Any hit >=95% full length and >=90% identity is annotated as a
"true gene".
Any other hit with P <= 0.001 is annotated as a "related
sequence" and interpreted as a putative pseudogene.
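The two annotation rules for snoRNA hits can be written as a small decision function. This is a sketch under stated assumptions: the function name and arguments are hypothetical, and "full length" is taken to mean the aligned length as a percentage of the query length, which the description does not define explicitly.

```python
def classify_snorna_hit(aligned_len, query_len, pct_identity, p_value):
    """Return the annotation label for a snoRNA BLAST hit, or None if unannotated.

    Assumes coverage = aligned length / query length (an interpretation of
    "full length" in the text).
    """
    coverage = 100.0 * aligned_len / query_len
    # Rule 1: >=95% full length and >=90% identity is a "true gene".
    if coverage >= 95.0 and pct_identity >= 90.0:
        return "true gene"
    # Rule 2: any other hit with P <= 0.001 is a "related sequence"
    # (interpreted as a putative pseudogene).
    if p_value <= 0.001:
        return "related sequence"
    return None
```

For example, a hit covering 98 of 100 query bases at 93% identity would be labeled a "true gene", while a hit covering 60 bases with P = 1e-6 would be labeled a "related sequence".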
Eddy-BLAST-otherrnalib (non-tRNA, non-snoRNA noncoding RNAs with GenBank
entries for the human gene):
Wublastn 2.0 [15 Apr 2002]
with options: "-kap -cpus=1 -wordmask=seg -W=8 -E=0.01 -hspmax=0
-B=50000 -Z=3000000000". Exceptions to this are:
- Large ncRNAs (LSU & SSU rRNA, H19, Xist):
"-W=8" is changed to "-W=11", and "-maskextra=50" is added.
Xist contains repetitive elements and was masked with
RepeatMasker, Library version 6.8.
- microRNAs:
"-kap -cpus=1 -S=70 -hspmax=0 -B=100" replaces all
above parameters.
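The default options and the two exceptions can be summarized as a lookup. This is only an illustrative sketch: the function and the query-class names ("default", "large_ncrna", "mirna") are hypothetical, and the option strings are copied from the text above without any claim about how they were passed to the program.

```python
# Option strings copied from the description; nothing here invokes BLAST.
BASE_OPTIONS = [
    "-kap", "-cpus=1", "-wordmask=seg", "-W=8",
    "-E=0.01", "-hspmax=0", "-B=50000", "-Z=3000000000",
]
MIRNA_OPTIONS = ["-kap", "-cpus=1", "-S=70", "-hspmax=0", "-B=100"]

def blast_options(query_class):
    """Return the option list for a query class (hypothetical class names)."""
    if query_class == "mirna":
        # microRNAs: this set replaces all default options.
        return list(MIRNA_OPTIONS)
    if query_class == "large_ncrna":
        # Large ncRNAs (LSU & SSU rRNA, H19, Xist): W changed to 11, maskextra added.
        opts = ["-W=11" if opt == "-W=8" else opt for opt in BASE_OPTIONS]
        opts.append("-maskextra=50")
        return opts
    if query_class == "default":
        return list(BASE_OPTIONS)
    raise ValueError(f"unknown query class: {query_class!r}")
```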
The score field contains the blastn score.
41 unique miRNAs and 29 other ncRNAs were used as queries.
Any hit >=95% full length and >=95% identity is annotated as a
"true gene".
Any other hit with P <= 0.001 and >= 65% identity is annotated
as a "related sequence". miRNAs are an exception: their GenBank
sequences are only 16-26 bp long, so a miRNA hit is annotated only
if it is 100% full length with 100% identity. The set of miRNAs
used consists of Let-7 from Pasquinelli et al. (2000) and 40 miRNAs
from Mourelatos et al. (2002); both are listed in the References
section below.
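The annotation rules for this library, including the miRNA exception, can be sketched as follows. The function name and arguments are hypothetical, and coverage is assumed to be the aligned length as a percentage of the query length. The description does not say which label an accepted miRNA hit receives; the sketch returns "true gene" because a 100%/100% hit also satisfies the general rule.

```python
def classify_otherrna_hit(coverage_pct, identity_pct, p_value, is_mirna=False):
    """Return the annotation label for a hit, or None if it is not annotated.

    coverage_pct: aligned length as a percentage of the query length
    (an assumed reading of "full length").
    """
    if is_mirna:
        # miRNA queries are only 16-26 bp, so only perfect full-length hits count.
        if coverage_pct >= 100.0 and identity_pct >= 100.0:
            return "true gene"  # label assumed; see lead-in
        return None
    # General rule 1: >=95% full length and >=95% identity.
    if coverage_pct >= 95.0 and identity_pct >= 95.0:
        return "true gene"
    # General rule 2: any other hit with P <= 0.001 and >=65% identity.
    if p_value <= 0.001 and identity_pct >= 65.0:
        return "related sequence"
    return None
```

Under this sketch, a miRNA hit at 99% coverage is dropped entirely, while the same hit for a longer ncRNA would be labeled a "true gene".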
Credits
These data were kindly provided by Sean Eddy at Washington University.
References
Pasquinelli AE, Reinhart BJ, Slack F, Martindale MQ, Kuroda MI, Maller B,
Hayward DC, Ball EE, Degnan B, Müller P, et al.
Conservation of the sequence and temporal expression of let-7
heterochronic regulatory RNA. Nature.
2000 Nov 2;408(6808):86-9.
Mourelatos Z, Dostie J, Paushkin S, Sharma A, Charroux B, Abel L,
Rappsilber J, Mann M, Dreyfuss G.
miRNPs: a novel class of ribonucleoproteins containing numerous microRNAs.
Genes Dev. 2002 Mar 15;16(6):720-8.