Description
The polyA_DB database is a set of human mRNA polyadenylation sites based
on EST/cDNA evidence.
A site is a single base denoting the beginning of a poly(A) tail in a nascent
mRNA transcript and is typically 10-30 nucleotides downstream of a
polyadenylation signal (most commonly AAUAAA).
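The spacing between signal and site described above can be checked with a short script. The following is a minimal sketch, not part of polyA_DB: the function name upstream_signals, its parameters, and the choice to measure distance from the first base of the signal to the site are all assumptions made for illustration.

```python
def upstream_signals(seq, site, signal="AAUAAA", min_dist=10, max_dist=30):
    """Find occurrences of `signal` located min_dist..max_dist bases upstream of `site`.

    seq    -- transcript or genomic sequence (RNA or DNA alphabet), 5' to 3'
    site   -- 0-based index of the poly(A) site in `seq`
    Returns a list of (signal_start, distance) tuples, where distance is
    site - signal_start (an assumed convention, measured from the first
    base of the signal).
    """
    # Work in the DNA alphabet so that cDNA/EST and RNA input both match.
    dna = seq.upper().replace("U", "T")
    motif = signal.upper().replace("U", "T")

    hits = []
    first = max(0, site - max_dist)   # farthest allowed signal start
    last = site - min_dist            # nearest allowed signal start
    for start in range(first, last + 1):
        if dna[start:start + len(motif)] == motif:
            hits.append((start, site - start))
    return hits


if __name__ == "__main__":
    # Toy sequence: a signal at index 20 and a poly(A) tail starting at index 40.
    toy = "C" * 20 + "AATAAA" + "C" * 14 + "A" * 10
    print(upstream_signals(toy, site=40))
```

Only the canonical AAUAAA hexamer is searched by default; other signal variants can be passed through the signal argument.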
The polyA_DB web server is found at
http://exon.umdnj.edu/polya_db/.
The Poly(A) composite track consists of two subtracks: a polyA_DB
subtrack that displays reported poly(A) sites, and a poly(A)
prediction subtrack that displays poly(A) sites predicted using a
support vector machine (SVM).
The poly(A) predictions are made using 1500-base DNA sequences centered at
the end of each RefSeq gene. These sequences serve as input to the
SVM described in Cheng et al., 2006. The SVM scores each base using a
model derived from 15 different cis-elements and reports, for each region
of DNA, an E-value between 0 (excellent) and 0.5 (worst). This E-value is
then normalized to an integer score between 0 (worst) and 1000 (excellent).
High-scoring regions are highlighted, with the highest-scoring base
indicated by a thicker line. The median length of these regions is 48 bases.
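The input window and the score normalization can be illustrated as follows. This is a sketch under stated assumptions rather than the procedure used to build the track: the description does not give the normalization formula, so a linear rescaling is assumed, and the function names, the 0-based half-open coordinates, and the clipping at chromosome boundaries are hypothetical.

```python
def prediction_window(gene_end, chrom_size, width=1500):
    """Return (start, end) of a `width`-base window centered at `gene_end`.

    Coordinates are assumed to be 0-based, half-open. The window is clipped
    to the chromosome, so it may be shorter than `width` near either end.
    """
    half = width // 2
    start = max(0, gene_end - half)
    end = min(chrom_size, gene_end + half)
    return start, end


def normalize_e_value(e_value, worst=0.5, max_score=1000):
    """Map an E-value in [0, worst] to an integer score in [0, max_score].

    ASSUMPTION: a linear mapping in which an E-value of 0 gives max_score
    and an E-value of `worst` gives 0. The actual formula may differ.
    """
    if not 0.0 <= e_value <= worst:
        raise ValueError(f"E-value must be between 0 and {worst}, got {e_value}")
    return round((1.0 - e_value / worst) * max_score)


if __name__ == "__main__":
    print(prediction_window(gene_end=10_000, chrom_size=1_000_000))
    for e in (0.0, 0.25, 0.5):
        print(e, normalize_e_value(e))
```

Under this assumed mapping, lower E-values always yield higher scores, which matches the direction of the two scales described above.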
References
Cheng Y, Miura RM, Tian B.
Prediction of mRNA polyadenylation sites by support vector machine.
Bioinformatics. 2006 Oct 1;22(19):2320-5.
PMID: 16870936
Zhang H, Hu J, Recce M, Tian B.
PolyA_DB: a database for mammalian mRNA polyadenylation.
Nucleic Acids Res. 2005 Jan 1;33(Database issue):D116-20.
PMID: 15608159; PMC: PMC540009