Description
Regulatory Factor Binding Regions (RFBRs) were identified from ChIP-chip
experimental data. They are non-randomly distributed in the ENCODE regions,
with local enrichment and depletion. By mapping the full set of RFBRs onto the
human genome sequence, we identified 689 genomic subregions with RFBR
enrichment and 726 subregions with RFBR depletion (the RFBR clusters and
deserts, respectively) in the ENCODE regions.
Methods
The data set analyzed in this study consists of 105 lists of transcriptional
regulatory elements (TREs) in the ENCODE regions. It was released on December
13, 2005, by the Transcriptional Regulation Group. TRE lists made available
after this data freeze were not included in this study. A total of 29
transcription factors (BAF155, BAF170, Brg1, CEBPe, CTCF, E2F1, E2F4, H3ac,
H4ac, H3K27me3, H3K27me3, H3K4me1, H3K4me2, H3K4me3, H3K9K14me2, HisH4, c-Jun,
c-Myc, P300, P63, Pol2, PU1, RARecA, SIRT1, Sp1, Sp3, STAT1, Suz12, and TAF1)
were assayed by seven laboratories (Affymetrix, Sanger, Stanford, UCD, UCSD,
UT, Yale) using ChIP-chip experiments on three different microarray
platforms (Affymetrix tiling array, NimbleGen tiling array, and traditional
PCR array) in nine cell lines (HL-60, HeLa, GM06990, K562, IMR90, HCT116,
THP1, Jurkat, and fibroblasts) or at two different experimental time points
(P0, before addition of gamma-interferon, and P30, 30 minutes after the
addition of gamma-interferon).
The raw data from these 105 ChIP-chip experiments were uniformly processed
using a method based on the false discovery rate (Efron, 2004). Three sets of
TRE lists were generated at 1%, 5%, and 10% false discovery rates, respectively,
and the list generated at the lowest (1%) false discovery rate was used in
this study. The non-redundant factor-specific RFBR lists were mapped onto the
ENCODE regions. Uninterrupted genomic regions that are covered by one or more
RFBRs were identified as RFBR groups. Neighboring groups that are less than
1 kb apart were collected into RFBR clusters. Un-clustered groups that are
covered by more than three RFBRs were promoted into clusters. Further details
of the method may be found in Zhang et al. (2007).
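
The grouping and clustering steps described above can be illustrated with a
minimal Python sketch. It is not the authors' implementation. It assumes that
RFBRs are half-open (start, end) intervals on a single chromosome, that
"covered by more than three RFBRs" means the group was built from more than
three RFBRs, and that the gap between two groups is measured from the end of
one to the start of the next. The names Group, merge_into_groups, and
cluster_groups are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Group(NamedTuple):
    start: int    # inclusive start coordinate
    end: int      # exclusive end coordinate
    n_rfbrs: int  # number of RFBRs merged into this region


def merge_into_groups(rfbrs: List[Tuple[int, int]]) -> List[Group]:
    """Merge overlapping or abutting RFBRs into uninterrupted groups."""
    groups: List[Group] = []
    for start, end in sorted(rfbrs):
        if groups and start <= groups[-1].end:
            # This RFBR continues the current uninterrupted region.
            last = groups[-1]
            groups[-1] = Group(last.start, max(last.end, end), last.n_rfbrs + 1)
        else:
            groups.append(Group(start, end, 1))
    return groups


def cluster_groups(groups: List[Group], max_gap: int = 1000,
                   promote_above: int = 3) -> List[Group]:
    """Chain groups less than max_gap bp apart into clusters.

    A group with no close neighbor becomes a cluster only if it was
    built from more than promote_above RFBRs (assumed interpretation).
    """
    runs: List[List[Group]] = []
    for g in groups:  # groups are already sorted by start
        if runs and g.start - runs[-1][-1].end < max_gap:
            runs[-1].append(g)
        else:
            runs.append([g])

    clusters: List[Group] = []
    for run in runs:
        total = sum(g.n_rfbrs for g in run)
        if len(run) > 1 or total > promote_above:
            clusters.append(Group(run[0].start, run[-1].end, total))
    return clusters


# Made-up coordinates, for illustration only.
example_rfbrs = [(100, 200), (150, 300), (900, 1000), (5000, 5100),
                 (9000, 9100), (9050, 9200), (9100, 9300), (9250, 9400)]
example_groups = merge_into_groups(example_rfbrs)
example_clusters = cluster_groups(example_groups)
# Two clusters result: 100-1000 (two neighboring groups, 3 RFBRs) and
# 9000-9400 (a single group promoted because it holds 4 RFBRs).
```

The lone group at 5000-5100 is dropped because it has no neighbor within 1 kb
and contains only one RFBR. The published procedure in Zhang et al. (2007)
remains the authoritative description.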
Credits
The data set was made available by the Transcriptional Regulation Group of the
ENCODE Project Consortium. The RFBR cluster and desert tracks were generated
by Zhengdong Zhang from Mark Gerstein's group at Yale University.
References
Efron B.
Large-scale simultaneous hypothesis testing: The choice of a null hypothesis.
Journal of the American Statistical Association. 2004;99(465):96-104.
Zhang ZD, Paccanaro A, Fu Y, Weissman S, Weng Z, Chang J, Snyder M,
Gerstein M.
Statistical analysis of the genomic distribution and correlation of
regulatory elements in the ENCODE regions.
Genome Res. 2007 Jun;17(6):787-97.