Description
This track displays the results of ENCODE region-wide localization for three
transcription factors (HNF-3b, HNF-4a and USF-1) and acetylated
histone H3 (H3ac). The heights of the peaks in the graphical display indicate
the ratio of enriched non-amplified DNA to input DNA.
The data for each of the transcription factors and H3ac are displayed in
individual subtracks. The analysis cut-off threshold is indicated in each
subtrack by a horizontal line. Tentative binding sites (TBSs)
in spots passing the cut-off are displayed in a separate subtrack,
ChIP-chip (HepG2) Sites. These sites are numbered corresponding to the ranking
of spots based on enrichment ratios. Each TBS is assigned a value indicating
how often it was found in separate BioProspector software runs for the
prediction of TBSs (e.g. 1000 indicates that a TBS was found in ten
out of ten runs).
The raw data for this track are available at
EBI ArrayExpress, as experiment
E-MEXP-452.
Methods
Chromatin from HepG2 cells was cross-linked with formaldehyde and sonicated
to produce DNA fragments of size 0.5-2 kb. Chromatin was precipitated using
antibodies against HNF-4a, HNF-3b, USF-1 or H3ac. DNA from a single ChIP
reaction was labeled with Cy5, and a fraction of the total input was labeled
with Cy3. To avoid introducing bias, neither the ChIP DNA nor the input DNA was
amplified prior to this step.
This DNA was combined and hybridized
to PCR-based tiling path ENCODE arrays. Most array elements were printed
only once on the slide, but X-chromosomal regions (ENm006 and ENr324) were
printed in duplicate. There were approximately 19,000 spots/slide. The array
provided about 75% coverage of the ENCODE regions.
Spots flagged as bad by the image processing step were removed; those that
remained were normalized. The average log2 ratio was calculated for spots
that were replicated on the array. A log odds score for differential
enrichment with the negative control was calculated using an empirical Bayes
method. There were four log odds scores for each spot, one for each
antibody. If this score was greater than 0 and the log2 ratio was greater than
1.25 (indicative of a strong positive signal), based on at least 2 replicates,
the spots were considered to be enriched.
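As an illustration only, the enrichment call described above can be sketched in Python. The thresholds come from the text; the SpotResult record and its field names are hypothetical, and the sketch assumes that "at least 2 replicates" refers to the number of replicate measurements supporting the spot. The empirical Bayes log odds score is taken as a precomputed input, not derived here.

```python
from dataclasses import dataclass

# Thresholds stated in the Methods text.
LOG_ODDS_CUTOFF = 0.0
LOG2_RATIO_CUTOFF = 1.25
MIN_REPLICATES = 2


@dataclass
class SpotResult:
    """Hypothetical per-spot, per-antibody summary (names are illustrative)."""
    spot_id: str
    log_odds: float      # log odds of differential enrichment vs. negative control
    log2_ratio: float    # average log2 ratio of ChIP DNA to input DNA
    n_replicates: int    # replicates supporting the measurement (assumed meaning)


def is_enriched(spot: SpotResult) -> bool:
    """Return True if the spot passes all three criteria from the text."""
    return (
        spot.log_odds > LOG_ODDS_CUTOFF
        and spot.log2_ratio > LOG2_RATIO_CUTOFF
        and spot.n_replicates >= MIN_REPLICATES
    )
```

For example, a spot with a log odds score of 0.8 and a log2 ratio of 1.4 in three replicates would be called enriched, whereas the same spot with a log2 ratio of 1.2 would not.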
Binding sites were identified using the BioProspector software.
Because the software is non-deterministic, different runs
may produce different results for the same data. Predictions consistent
across many runs are more likely to be correct; therefore, the analysis was
repeated, keeping all binding sites occurring in each top-scoring motif to
generate a set of candidates. TBSs present in at least
five out of ten runs were selected. Further method details are described in
Rada-Iglesias et al. (2005).
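The run-consistency filter can be sketched as follows. This is not the BioProspector interface: each run is represented simply as a collection of predicted sites, the function name is hypothetical, and the score assumes a linear scale on which ten out of ten runs corresponds to 1000, as in the example given in the Description.

```python
from collections import Counter


def consensus_sites(runs, min_runs=5):
    """Keep sites predicted in at least `min_runs` runs.

    `runs` is a list of collections of sites, one per motif-finding run.
    A site is any hashable value, e.g. a (chrom, start, end) tuple.
    Returns a dict mapping each retained site to a score in 0-1000
    (assumed linear: 1000 means the site was found in every run).
    """
    n_runs = len(runs)
    counts = Counter()
    for sites in runs:
        counts.update(set(sites))  # count each site at most once per run
    return {
        site: round(1000 * n / n_runs)
        for site, n in counts.items()
        if n >= min_runs
    }
```

With ten runs and the default threshold, a site found in every run is kept with a score of 1000, a site found in five runs is kept with a score of 500, and a site found in four runs is discarded.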
In the graphical display, overlapping sequences were
removed by changing the start position of downstream spots to generate a
continuous track. To give each track a comparable scale, the values for the
most enriched spots were lowered to 15. Spots deemed false positives by
comparison with a no-antibody ChIP-chip experiment were assigned a value of 0.
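A minimal sketch of these display adjustments is given below, assuming spots on a single chromosome are supplied as (spot_id, start, end, value) tuples. The function name, the tuple layout and the handling of a spot that lies entirely inside the preceding one (it is dropped) are assumptions made for illustration; they are not taken from the original analysis.

```python
def prepare_display(spots, false_positive_ids, cap=15.0):
    """Trim overlaps, cap high values and zero out false positives.

    `spots` is a list of (spot_id, start, end, value) tuples on one chromosome.
    `false_positive_ids` is a set of spot ids flagged by the no-antibody control.
    """
    out = []
    prev_end = None
    for spot_id, start, end, value in sorted(spots, key=lambda s: (s[1], s[2])):
        if prev_end is not None and start < prev_end:
            start = prev_end  # move the downstream start to remove the overlap
        if start >= end:
            continue  # assumption: nothing is left of this spot, so skip it
        if spot_id in false_positive_ids:
            value = 0.0
        else:
            value = min(value, cap)  # lower the most enriched spots to the cap
        out.append((spot_id, start, end, value))
        prev_end = end
    return out
```

Applied to three spots spanning 0-100, 80-200 and 300-400, the second spot would start at 100 after trimming, so the resulting intervals no longer overlap.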
Verification
To reduce the number of false positives, a negative control ChIP-chip experiment
was performed without antibody. Three independent biological replicates
were performed for each antibody; three negative control ChIPs were also
analyzed. Semi-quantitative PCR was used to verify enrichment in at
least ten positive spots for each antibody.
Credits
These experiments were performed in the
Claes Wadelius lab. The statistical analysis was done at the
Linnaeus Centre for
Bioinformatics at Uppsala University. Microarrays were produced at the
Sanger Institute.
References
Rada-Iglesias A, Wallerman O, Koch C, Ameur A, Enroth S, Clelland G, Wester K, Wilcox S, Dovey OM,
Ellis PD et al.
Binding sites for metabolic disease related transcription factors inferred at base pair resolution
by chromatin immunoprecipitation and genomic microarrays.
Hum Mol Genet. 2005 Nov 15;14(22):3435-47.