Description
This track shows conserved elements predicted by the phastCons program
from a whole-genome alignment of vertebrates and, separately, from the
placental mammal subset of species in that alignment.
The predictions are based on a phylogenetic hidden Markov model
(phylo-HMM), a type of probabilistic model that describes both the process
of DNA substitution at each site in a genome and the way this process
changes from one site to the next.
Methods
Best-in-genome pairwise alignments were generated for
each species using blastz, followed by chaining and netting. A multiple
alignment was then constructed from these pairwise alignments using multiz.
Predictions of conserved elements were then obtained by running phastCons
on the multiple alignments with the --most-conserved option.
For more details see the track description for the Conservation track.
PhastCons constructs a two-state phylo-HMM with a state for conserved
regions and a state for non-conserved regions. The two states share a
single phylogenetic model, except that the branch lengths of the tree
associated with the conserved state are multiplied by a constant scaling
factor rho (0 <= rho <= 1). The free parameters of the
phylo-HMM, including the scaling factor rho, are estimated from
the data by maximum likelihood using an EM algorithm. This procedure is
subject to certain constraints on the "coverage" of the genome by conserved
elements and the "smoothness" of the conservation scores. Details can be
found in Siepel et al. (2005).
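The effect of the scaling factor rho can be illustrated with a small sketch. It assumes a Jukes-Cantor substitution model purely for simplicity (this description does not state which substitution model phastCons uses), and the branch names and lengths are made-up values, not estimates from any real alignment. The function names are hypothetical and are not part of phastCons.

```python
import math


def scale_branches(branch_lengths, rho):
    """Return a copy of the tree's branch lengths multiplied by rho.

    This mirrors how the conserved state shares the phylogenetic model of
    the non-conserved state but with shorter branches (0 <= rho <= 1).
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return {name: rho * t for name, t in branch_lengths.items()}


def jc_same_prob(t):
    """Jukes-Cantor probability that a base is unchanged along a branch of length t.

    Assumption: Jukes-Cantor is used here only as the simplest possible
    substitution model for illustration.
    """
    return 0.25 + 0.75 * math.exp(-4.0 * t / 3.0)


# Hypothetical branch lengths in expected substitutions per site.
nonconserved = {"human": 0.1, "mouse": 0.4}
conserved = scale_branches(nonconserved, rho=0.3)

for name in nonconserved:
    # Shorter branches make an unchanged base more probable.
    print(name, jc_same_prob(nonconserved[name]), jc_same_prob(conserved[name]))
```

Because every branch is shortened by the same factor, alignment columns with few substitutions become more probable under the conserved state than under the non-conserved state, which is what lets the phylo-HMM distinguish the two.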
The predicted conserved elements are segments of the alignment that are
likely to have been "generated" by the conserved state of the phylo-HMM.
Each element is assigned a log-odds score equal to its log probability
under the conserved model minus its log probability under the non-conserved
model. The "score" field associated with this track contains transformed
log-odds scores, taking values between 0 and 1000. (The scores are
transformed using a monotonic function of the form a * log(x) + b.) The
raw log-odds scores are retained in the "name" field and can be seen on the
details page or in the browser when the track's display mode is set to
"pack" or "full".
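The scoring described above can be sketched as follows. This is a minimal illustration, not the code used to build the track: it assumes that x in a * log(x) + b is the raw log-odds score, that results are rounded and clamped to the range 0 to 1000, and the values of a and b in the example are placeholders, since the actual constants are not given here.

```python
import math


def log_odds(logp_conserved, logp_nonconserved):
    """Raw log-odds score: log probability under the conserved model
    minus log probability under the non-conserved model."""
    return logp_conserved - logp_nonconserved


def browser_score(raw, a, b, lo=0, hi=1000):
    """Map a raw log-odds score to the 0-1000 'score' field.

    Assumptions: x in a * log(x) + b is the raw log-odds score, and the
    result is rounded and clamped to [lo, hi]. Non-positive raw scores,
    for which the logarithm is undefined, are mapped to lo.
    """
    if raw <= 0:
        return lo
    value = a * math.log(raw) + b
    return int(min(hi, max(lo, round(value))))


# Hypothetical log probabilities for one element and placeholder constants.
raw = log_odds(-120.0, -150.0)
print(raw, browser_score(raw, a=100.0, b=200.0))
```

Because the transform is monotonic for positive a, ranking elements by the "score" field preserves the ranking by raw log-odds, apart from ties introduced by rounding and clamping.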
Credits
This track was created at UCSC using the following programs:
- Blastz and multiz by Minmei Hou, Scott Schwartz and Webb Miller of the
Penn State Bioinformatics Group.
- AxtBest, axtChain, chainNet, netSyntenic, and netClass
by Jim Kent at UCSC.
- PhastCons by Adam Siepel at Cornell University.
References
PhastCons
Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K,
Clawson H, Spieth J, Hillier LW, Richards S, et al.
Evolutionarily conserved elements in vertebrate, insect, worm,
and yeast genomes.
Genome Res. 2005 Aug;15(8):1034-50.
Chain/Net
Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D.
Evolution's cauldron:
duplication, deletion, and rearrangement in the mouse and human genomes.
Proc Natl Acad Sci U S A. 2003 Sep 30;100(20):11484-9.
Multiz
Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM,
Baertsch R, Rosenbloom K, Clawson H, Green ED, et al.
Aligning multiple genomic sequences with the threaded blockset
aligner.
Genome Res. 2004 Apr;14(4):708-15.
Blastz
Chiaromonte F, Yap VB, Miller W.
Scoring pairwise genomic sequence alignments.
Pac Symp Biocomput. 2002;:115-26.
Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC,
Haussler D, Miller W.
Human-mouse alignments with BLASTZ.
Genome Res. 2003 Jan;13(1):103-7.