Nucl Occ: A375 Track Settings

Description

Inside the nucleus, DNA is packaged into a complex molecular structure called chromatin. Its fundamental unit, the nucleosome, consists of approximately 150 bp of DNA wrapped around an eight-histone protein complex. This track contains predicted nucleosome occupancy scores produced by a model trained on data from the A375 cell line (Ozsolak et al., 2007). The A375 samples were prepared with weak MNase digestion. The A375 model excels at recognizing regions of strong protection from MNase cleavage; i.e., positions that are frequently occupied by a nucleosome.

Display Conventions and Configuration

The output of the support vector machine (SVM) is a unitless discriminant score. In the browser, the score of each 50-mer is assigned to its 26th base. Canonically, a score of 0 indicates an uncertain assignment, a score of 1.0 corresponds to a confident prediction of the positive class (i.e., a position of frequent nucleosome occupancy), and a score of -1.0 corresponds to a confident prediction of the negative class.
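The placement rule can be illustrated with a minimal Python sketch. The function names and the use of 1-based window start coordinates are assumptions made for illustration; they are not taken from the browser's own code.

```python
WINDOW = 50         # length of each scored 50-mer (from the text)
CENTER_OFFSET = 25  # offset from a window's first base to its 26th base


def score_position(window_start):
    """Return the coordinate of the 26th base of a 50-mer.

    Assumes window_start is the 1-based coordinate of the first base.
    """
    return window_start + CENTER_OFFSET


def assign_scores(window_starts, window_scores):
    """Map each window's score to the coordinate of its 26th base."""
    return {
        score_position(start): score
        for start, score in zip(window_starts, window_scores)
    }
```

Under these assumptions, a 50-mer covering bases 1 to 50 places its score at base 26, and the next overlapping window (bases 2 to 51) places its score at base 27.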

Methods

For a given microarray experiment, we identify the 1000 50 bp probes with the highest log intensity ratios; these make up our positive training samples. Negative training samples are selected in the same way from the probes with the lowest log intensity ratios. Each 50-mer in the training set is converted into a 2772-element vector of k-mer frequencies for k = 1 through 6, with each k-mer and its reverse complement counted as a single feature. A linear SVM is then trained to discriminate between the two classes. The SVM regularization parameter is selected by evaluating the entire regularization path on a held-out portion of the training data set. After training, each 50-mer in the human genome is converted to the same 2772-element representation and scored using the trained SVM.
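The 2772-element representation can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: the function names, the feature ordering, the per-k normalization of counts, and the handling of ambiguous bases are all assumptions. The linear scoring helper only shows the form of a linear SVM discriminant; its weights and bias would come from training and are not given here.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Collapse a k-mer with its reverse complement (keep the smaller one)."""
    return min(kmer, reverse_complement(kmer))


def build_feature_index(max_k=6):
    """Assign one column to each reverse-complement-collapsed k-mer, k = 1..max_k."""
    index = {}
    for k in range(1, max_k + 1):
        for letters in product("ACGT", repeat=k):
            key = canonical("".join(letters))
            if key not in index:
                index[key] = len(index)
    return index


FEATURE_INDEX = build_feature_index()  # 2 + 10 + 32 + 136 + 512 + 2080 columns


def kmer_frequencies(seq, max_k=6):
    """Convert a sequence into a vector of collapsed k-mer frequencies.

    Assumption: counts for each k are divided by the number of length-k
    windows in the sequence. K-mers containing non-ACGT characters are skipped.
    """
    seq = seq.upper()
    vec = [0.0] * len(FEATURE_INDEX)
    for k in range(1, max_k + 1):
        n_windows = len(seq) - k + 1
        if n_windows <= 0:
            continue
        for i in range(n_windows):
            idx = FEATURE_INDEX.get(canonical(seq[i:i + k]))
            if idx is not None:
                vec[idx] += 1.0 / n_windows
    return vec


def linear_score(features, weights, bias=0.0):
    """Linear discriminant: dot product of weights and features, plus bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias
```

Counting the collapsed k-mers for k = 1 through 6 gives 2 + 10 + 32 + 136 + 512 + 2080 = 2772 features, which matches the vector length stated above. Because reverse complements share a column, a sequence and its reverse complement map to the same feature vector.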

Detailed methods are given in Gupta et al. (2008), and supplementary data is available here.

References

Dennis JH, Fan HY, Reynolds SM, Yuan G, Meldrim JC, Richter DJ, Peterson DG, Rando OJ, Noble WS, Kingston RE. Independent and complementary methods for large-scale structural analysis of mammalian chromatin. Genome Res. 2007 Jun;17(6):928-39.

Gupta S, Dennis J, Thurman RE, Kingston R, Stamatoyannopoulos JA, Noble WS. Predicting human nucleosome occupancy from primary sequence. PLoS Comput Biol. 2008 Aug 22;4(8):e1000134.
