Vienna RNAz Track Settings
 
University of Vienna RNA secondary structure predicted by RNAz (All Pilot ENCODE Regions and Genes tracks)

Data version: Dec 2006
Data coordinates converted via liftOver from: May 2004 (NCBI35/hg17)
Data last updated: 2007-06-14

Description

This track displays regions containing putative functional RNA secondary structures as predicted by RNAz on the basis of thermodynamic stability and evolutionary conservation.

Methods

RNAz evaluates multiple sequence alignments for unusually stable and conserved RNA secondary structures. These are two typical characteristics of functional RNA structures, such as those found in noncoding RNAs or cis-acting regulatory elements of mRNAs.

The RNAz algorithm works as follows. First, a consensus secondary structure is predicted using the RNAalifold approach (Hofacker et al., 2002), which extends classical minimum free energy folding algorithms to aligned sequences. The significance of the predicted consensus structure is evaluated by calculating a structure conservation index: the ratio of the folding energy obtained when all aligned sequences are forced to fold into a common structure to the average unconstrained folding energy of the individual sequences. Thermodynamic stability is evaluated by calculating a normalized z-score for the sequences in the alignment. The z-score indicates whether the given sequences are more stable than random sequences of the same length and base composition. Based on these two features, structure conservation index and z-score, an alignment is classified as structural RNA or "other" using a support vector machine classification algorithm (Washietl et al., 2005; Washietl et al., 2007).
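The two features can be illustrated with a minimal Python sketch. It is not RNAz code: the function names are hypothetical, and all folding energies are assumed to be supplied by the caller (for example from separate folding runs). The structure conservation index is written in the convention of Washietl et al., 2005, with the consensus folding energy divided by the mean single-sequence energy. The z-score here is computed from an explicit list of background energies, which is an assumption made only to show the arithmetic.

```python
from statistics import mean, stdev


def structure_conservation_index(consensus_energy, single_energies):
    """Consensus folding energy divided by the mean single-sequence MFE.

    Energies are in kcal/mol and normally negative. Values near 1 suggest
    that the common structure is almost as stable as the individual folds;
    values near 0 suggest no conserved structure.
    """
    avg = mean(single_energies)
    if avg == 0:
        raise ValueError("mean single-sequence energy is zero")
    return consensus_energy / avg


def stability_z_score(native_mfe, background_mfes):
    """Standard z-score of a native MFE against background MFEs.

    background_mfes is assumed to hold energies of random sequences with
    the same length and base composition. A negative result means the
    native sequence is more stable than the background.
    """
    mu = mean(background_mfes)
    sigma = stdev(background_mfes)
    return (native_mfe - mu) / sigma


# Illustrative numbers only, not taken from the track data.
sci = structure_conservation_index(-30.0, [-40.0, -35.0, -45.0])
z = stability_z_score(-30.0, [-20.0, -22.0, -18.0, -20.0])
print(sci, z)
```

In this toy example the index is 0.75 and the z-score is negative, so the alignment would look both conserved and stable. The final classification step (the support vector machine) is not reproduced here.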

This track shows the result of an RNAz screen of 28-way TBA/MULTIZ alignments. Alignments were sliced into overlapping 120-nt windows with a step size of 40 nt. Sequences with more than 25% gaps with respect to the human sequence were discarded. Only alignments with more than four sequences, a minimum size of 50 columns and at most 1% repeat-masked letters were considered. RNAz can only handle alignments with up to six sequences, so from alignments with more than six sequences we chose a subset of six. For subset selection, we used a greedy algorithm that iteratively selected sequences, optimizing the set for a mean pairwise identity of around 80%. For alignments with more than 10 sequences, we sampled three different such subsets. The windows were finally scored with RNAz version 0.1.1 in the forward and reverse complement directions. Overlapping hits with at least one sampled alignment with RNAz score > 0.5 were combined into a single genomic region. The track shows regions in which at least one window in the cluster has an average RNAz score over all samples > 0.5 and at least one hit has an RNAz score > 0.9. More details may be found in Washietl et al., 2007.
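The windowing and clustering steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used for the track: the function names and the tuple layout are hypothetical, coordinates are half-open, and the handling of a trailing partial window is a guess. Each scored window is assumed to carry one RNAz score per sampled sub-alignment.

```python
from statistics import mean


def slice_windows(length, size=120, step=40):
    """Return half-open (start, end) windows over an alignment of `length` columns."""
    if length <= size:
        return [(0, length)]
    starts = list(range(0, length - size + 1, step))
    # Assumption: add one final window so the alignment end is covered.
    if starts[-1] + size < length:
        starts.append(length - size)
    return [(s, s + size) for s in starts]


def cluster_hits(windows, hit_cutoff=0.5):
    """Merge overlapping windows that have at least one sample above hit_cutoff.

    windows is a list of (start, end, scores) tuples, where scores is the
    list of RNAz scores for the sampled sub-alignments of that window.
    """
    hits = sorted((w for w in windows if max(w[2]) > hit_cutoff),
                  key=lambda w: w[0])
    clusters = []
    for w in hits:
        if clusters and w[0] < clusters[-1]["end"]:
            clusters[-1]["end"] = max(clusters[-1]["end"], w[1])
            clusters[-1]["windows"].append(w)
        else:
            clusters.append({"start": w[0], "end": w[1], "windows": [w]})
    return clusters


def passes_track_filter(cluster, mean_cutoff=0.5, strong_cutoff=0.9):
    """At least one window with mean score > 0.5 and one sample score > 0.9."""
    good_mean = any(mean(w[2]) > mean_cutoff for w in cluster["windows"])
    strong_hit = any(s > strong_cutoff
                     for w in cluster["windows"] for s in w[2])
    return good_mean and strong_hit


# Illustrative scores only, not taken from the track data.
scored = [
    (0, 120, [0.95, 0.60, 0.40]),
    (40, 160, [0.55, 0.20, 0.10]),
    (400, 520, [0.60, 0.30, 0.20]),
    (600, 720, [0.10]),
]
regions = [c for c in cluster_hits(scored) if passes_track_filter(c)]
print([(c["start"], c["end"]) for c in regions])
```

With these toy scores, the first two windows merge into one region spanning columns 0 to 160, which passes both thresholds. The isolated hit at 400 to 520 is dropped because its average score is below 0.5 and it has no score above 0.9.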

Credits

The RNAz program and browser track were developed by Stefan Washietl, Ivo Hofacker (Institute for Theoretical Chemistry, Univ. of Vienna) and Peter F. Stadler (Bioinformatics group, Department of Computer Science, Univ. of Leipzig).

References

Hofacker IL, Fekete M, Stadler PF. Secondary structure prediction for aligned RNA sequences. J. Mol. Biol. 2002 Jun 21;319(5):1059-66.

Washietl S, Hofacker IL, Stadler PF. Fast and reliable prediction of noncoding RNAs. Proc. Natl. Acad. Sci. USA. 2005 Feb 15;102(7):2454-59.

Washietl S, Pedersen JS, Korbel JO, Fried C, Gruber AR, Hackermuller J, Hertel J, Lindemeyer M, Missal K, Tanzer A, et al. Structured RNAs in the ENCODE Selected Regions of the Human Genome. Genome Res. 2007 Jun;17(6):852-64.