Description
This track shows regions that co-precipitate with antibodies against
each of ten factors in all ENCODE regions, in retinoic acid-stimulated
HL-60 cells harvested after 0, 2, 8, and 32 hours. Median P-values are shown in
separate subtracks for each of the ten antibodies:
- Brg1 - Brahma-related Gene 1
- CEBPe - CCAAT-enhancer binding protein-epsilon
- CTCF - CCCTC-binding factor
- H3K27me3 (H3K27T) - Histone H3 tri-methylated lysine 27
- H4Kac4 (HisH4) - Histone H4 tetra-acetylated lysine
- P300 - E1A-binding protein, 300-KD
- PU1 - Spleen focus forming virus proviral integration oncogene
- Pol2 - RNA Polymerase II (8WG16 ab against pre-initiation complex form)
- RARA (RARecA) - Retinoic Acid Receptor-Alpha
- SIRT1 - Sirtuin-1
Retinoic acid-stimulated HL-60 cells were harvested and
whole cell extracts (control) were made. An antibody was used to
immunoprecipitate bound chromatin fragments (treatment). DNA was
purified from these samples and hybridized to Affymetrix ENCODE
oligonucleotide tiling arrays, which have 25-mer probes tiled
every 22 bp on average in the non-repetitive ENCODE regions.
Only median P-values are displayed; data for all biological replicates
can be downloaded from Affymetrix in wiggle, cel, and soft formats.
Display Conventions and Configuration
The subtracks within this composite annotation track
may be configured in a variety of ways to highlight different aspects of the
displayed data. The graphical configuration options for the subtracks
are shown at the top of the track description page, followed by a list of
subtracks.
For more information about the graphical configuration options, click the
Graph configuration help link.
Color differences among the subtracks are arbitrary. They provide a
visual cue for finding the same antibody in different timepoint tracks.
Methods
The data from replicate arrays were quantile-normalized (Bolstad et
al., 2003) and all arrays were scaled to a median array intensity of
22. Within a sliding 1001 bp window centered on each probe, a signal
estimator S = ln[max(PM - MM, 1)] (where PM is perfect match and MM is
mismatch) was computed for each biological replicate treatment- and
all replicate control-probe pairs. An estimate of the significance of
the enrichment of treatment signal for each replicate over control
signal in each window was given by the P-value computed using the
Wilcoxon Rank Sum test over each biological replicate treatment and
all control signal estimates in that window. The median of the
log-transformed P-values (-10 log10 P) across the processed replicate
data is displayed.
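The scoring steps above can be sketched in a few lines of Python. This is
an illustration only, not the pipeline used to produce the track: the
function names are hypothetical, the rank-sum P-value uses a normal
approximation without tie correction, and the test is assumed to be
one-sided (treatment greater than control), which the description does not
state. Note that under the -10 log10 transform a P-value of 0.01 maps to a
score of 20.

```python
import math
from statistics import median


def signal(pm, mm):
    """Signal estimator S = ln(max(PM - MM, 1)) for one probe pair."""
    return math.log(max(pm - mm, 1.0))


def rank_sum_p(treatment, control):
    """One-sided Wilcoxon rank-sum P-value (treatment > control).

    Assumption: normal approximation with midranks for ties and no tie
    correction in the variance; the original software may differ.
    """
    pooled = sorted([(v, 0) for v in treatment] + [(v, 1) for v in control])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        mid = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = mid
        i = j
    n1, n2 = len(treatment), len(control)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail probability


def score(p, floor=1e-300):
    """Log-transformed P-value, -10 * log10(P); floor avoids log(0)."""
    return -10.0 * math.log10(max(p, floor))


def window_scores(positions, treat_s, ctrl_s, half_width=500):
    """Score each probe using all probes within a 1001 bp centered window.

    positions: probe coordinates; treat_s: one signal per probe for one
    treatment replicate; ctrl_s: a list of control signals per probe
    (all control replicates pooled).
    """
    out = []
    for center in positions:
        idx = [i for i, p in enumerate(positions)
               if abs(p - center) <= half_width]
        t = [treat_s[i] for i in idx]
        c = [s for i in idx for s in ctrl_s[i]]
        out.append(score(rank_sum_p(t, c)))
    return out


def median_track(per_replicate_scores):
    """Per-probe median of the scores across biological replicates."""
    return [median(col) for col in zip(*per_replicate_scores)]
```

The quadratic window search keeps the sketch short; a production
implementation would slide two pointers over position-sorted probes.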
Several independent biological replicates (four each for Brg1, CEBPe,
CTCF, PU1, and SIRT1; five each for H3K27me3, H4Kac4, P300, Pol2 and
RARA) were generated and hybridized
to duplicate arrays (two technical replicates). Reproducible enriched
regions were generated from the signal by first applying, to each
biological replicate, a cutoff of 20 on the log-transformed P-values
together with a maxGap of 500 bp and a minRun of 0 bp. Since each
region or site may comprise more than one probe, a median
based on the distribution of log-transformed P-values was computed per
site for each of the respective replicates. These seed sites were then
ranked individually within each of the replicates. If a site was
absent in a replicate, the maximum or worst rank of the distribution
was assigned to it.
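A minimal sketch of the site-calling and ranking steps follows. Several
details are assumptions, since the description does not define them: probes
are kept when their score is at least the cutoff, maxGap is read as the
largest distance between passing probes that are still merged into one
site, minRun as the minimum site length, and the "worst rank" for an absent
site is taken to be the number of sites in the union across replicates.
The function names and the site identifiers are hypothetical.

```python
from statistics import median


def call_sites(positions, scores, cutoff=20.0, max_gap=500, min_run=0):
    """Merge passing probes into sites.

    Returns a list of (start, end, median_score) tuples. Positions are
    assumed to be sorted in ascending order.
    """
    runs, current = [], []
    for pos, sc in zip(positions, scores):
        if sc < cutoff:
            continue  # probe does not pass the threshold
        if current and pos - current[-1][0] > max_gap:
            runs.append(current)  # gap too large: close the current site
            current = []
        current.append((pos, sc))
    if current:
        runs.append(current)

    sites = []
    for run in runs:
        start, end = run[0][0], run[-1][0]
        if end - start >= min_run:
            sites.append((start, end, median(s for _, s in run)))
    return sites


def rank_sites(site_scores, all_sites):
    """Rank sites within one replicate (1 = highest median score).

    site_scores maps a site identifier to its median score in this
    replicate; sites listed in all_sites but absent here receive the
    worst rank, assumed to be len(all_sites).
    """
    ordered = sorted(site_scores, key=site_scores.get, reverse=True)
    ranks = {site: i + 1 for i, site in enumerate(ordered)}
    worst = len(all_sites)
    return {site: ranks.get(site, worst) for site in all_sites}
```

For example, passing probes at 0 and 100 bp form one site and probes at
1000 and 1100 bp form another, because the 900 bp gap between them exceeds
maxGap.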
The following three values were computed for each
site by combining data from all biological replicates:
- average of all ranks computed among biological replicates
- sum of all pairwise differences in these ranks computed among biological
replicates
- a combined P-value, using a chi square distribution,
across all replicates
Sites were selected as final when all three of the above metrics were
low, where "low" corresponds to the top 25 percent of the respective
distribution.
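The three per-site metrics and the final filter might be computed along the
following lines. This is a sketch under stated assumptions: the pairwise
differences are taken as absolute values, the chi-square combination is
assumed to be Fisher's method (the description only says "using a chi
square distribution"), and ties at the 25 percent boundary are broken
arbitrarily. All function names are hypothetical.

```python
import math
from itertools import combinations


def mean_rank(ranks):
    """Average of a site's ranks across biological replicates."""
    return sum(ranks) / len(ranks)


def pairwise_rank_spread(ranks):
    """Sum of absolute pairwise rank differences across replicates."""
    return sum(abs(a - b) for a, b in combinations(ranks, 2))


def fisher_combined_p(pvalues):
    """Combine P-values with Fisher's method (an assumption here).

    X = -2 * sum(ln p) follows a chi-square distribution with 2k degrees
    of freedom under the null; for even degrees of freedom the upper tail
    has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


def select_sites(metrics, fraction=0.25):
    """Keep sites that fall in the lowest `fraction` on all three metrics.

    metrics maps a site identifier to a tuple
    (mean_rank, pairwise_rank_spread, combined_p).
    """
    n_keep = max(1, math.ceil(fraction * len(metrics)))
    keep = set(metrics)
    for j in range(3):
        ordered = sorted(metrics, key=lambda site: metrics[site][j])
        keep &= set(ordered[:n_keep])
    return keep
```

Requiring a low average rank and a low rank spread together favors sites
that are both strong and consistently ranked across replicates.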
Verification
Using the P-values from the biological replicates, all pairwise
rank correlation coefficients were computed among biological
replicates. Data sets showing both consistent pairwise correlation
coefficients and at least weak positive correlation across all pairs
were considered reproducible.
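The pairwise comparison can be illustrated as follows. The description does
not say which rank correlation was used, so this sketch assumes Spearman's
coefficient (Pearson correlation of midranks); the function names are
hypothetical, and no numeric threshold for "consistent" or "weak positive"
is implied.

```python
import math
from itertools import combinations


def midranks(values):
    """1-based ranks of values, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        for k in range(i, j):
            ranks[order[k]] = (i + 1 + j) / 2.0
        i = j
    return ranks


def spearman(x, y):
    """Spearman rank correlation between two equal-length sequences."""
    rx, ry = midranks(x), midranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


def pairwise_spearman(replicates):
    """Correlation for every pair of replicates, keyed by index pair."""
    return {
        (i, j): spearman(replicates[i], replicates[j])
        for i, j in combinations(range(len(replicates)), 2)
    }
```

Each replicate is passed as a list of per-probe P-values (or scores) in the
same probe order; a data set with four replicates yields six coefficients.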
Credits
These data were generated and analyzed by the Gingeras/Struhl
collaboration between Tom Gingeras's group at Affymetrix and
Kevin Struhl's group at Harvard Medical School.
References
Please see the
Affymetrix Transcriptome site for a project overview and
additional references to Affymetrix tiling array publications.
Bolstad, B. M., Irizarry, R. A., Astrand, M., and Speed, T. P.
A comparison of normalization methods for high density oligonucleotide
array data based on variance and bias.
Bioinformatics 19(2), 185-193 (2003).
Cawley, S., Bekiranov, S., Ng, H. H., Kapranov, P., Sekinger,
E. A., Kampa, D., Piccolboni, A., Sementchenko, V., Cheng, J.,
Williams, A. J., et al.
Unbiased mapping of transcription factor binding sites along human
chromosomes 21 and 22 points to widespread regulation of noncoding RNAs.
Cell 116(4), 499-509 (2004).