Description
This track shows likely TAF1 binding sites in fibroblastoid
(IMR90) cells as assayed by ChIP-chip using a NimbleGen microarray.
The two subtracks show known TAF1 binding
sites and additional novel sites where, based on the data in the
LI TAF1Signal companion track, TAF1 is most likely to bind.
TAF1, a protein found at the start of transcribed genes, is a general
transcription factor that is a key part of the pre-initiation complex found
on the promoter. It is more fully known as TBP-associated factor 1 of the
TFIID complex or by its molecular weight as TAF250.
To survey the entire human genome in an unbiased fashion, a
total of 38 high-density oligonucleotide arrays (NimbleGen platform)
were fabricated, representing approximately 1.45 billion base pairs of
non-repetitive DNA with 50-mer oligonucleotides positioned at every 100
base pairs throughout the human genome (UCSC hg16). Using this array,
genome-wide location analysis of TAF1 was conducted employing ChIP-chip using
chromatin extracted from primary fibroblast IMR90 cells.
Methods
Chromatin from IMR90 cells was cross-linked, precipitated with
TAF1 antibody (sc-735, Santa Cruz), sheared, amplified and hybridized
to 38 high-density oligonucleotide arrays (NimbleGen). These arrays
contain a total of 14,535,659 50-mer oligonucleotides positioned at
every 100 base pairs through the human genome (UCSC hg16). Using this
set of arrays, a total of 9,966 clusters of TFIID binding sites were
identified.
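As a quick consistency check on the figures above, the probe count and the 100 bp spacing can be multiplied to recover the approximate amount of sequence represented on the arrays. This is only an arithmetic sketch; the function and variable names are illustrative and do not come from the original study.

```python
# Figures quoted in the text above.
PROBE_COUNT = 14_535_659   # 50-mer probes across the full array set
SPACING_BP = 100           # one probe every 100 base pairs
ARRAY_COUNT = 38           # arrays in the genome-wide set


def tiled_span_bp(probe_count: int, spacing_bp: int) -> int:
    """Approximate genomic span represented by evenly spaced probes."""
    return probe_count * spacing_bp


span = tiled_span_bp(PROBE_COUNT, SPACING_BP)
mean_probes_per_array = PROBE_COUNT / ARRAY_COUNT

print(f"tiled span: {span / 1e9:.2f} billion bp")
print(f"mean probes per array: {mean_probes_per_array:,.0f}")
```

The product agrees with the "approximately 1.45 billion base pairs" of non-repetitive DNA given in the Description section.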
To verify the binding of TFIID to these sequences, a condensed array was
designed containing a total of 379,521 oligonucleotides to
represent the 9,966 putative TFIID binding sequences plus 29 control
genomic loci at 100 bp resolution. Using
these condensed arrays, two independent chromatin immunoprecipitation
(ChIP) experiments were performed with antibodies against TAF1, RNA
polymerase II, acetylated histone H3 and K4-dimethylated histone H3. A
total of 8,597 TFIID binding regions, ranging in size from 400 bp to 9.8
Kbp, were confirmed by the TAF1 replicate experiments.
The verification data can be viewed in the LI TAF1 Valid
track.
To further define the sites of TFIID binding within the identified
regions, a model-based peak-finding algorithm was developed that
estimates the most likely TFIID binding sites based on the hybridization
intensity of probes within each fragment. The signals from a set of
consecutive, significantly enriched probes were used collectively to
assign the most likely TFIID binding site to the probe with the peak
signal. The algorithm predicted a total of 12,150 TFIID binding
sites within the 8,597 confirmed TFIID binding fragments.
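The general idea of taking the peak probe within a run of consecutive enriched probes can be sketched as follows. This is a simplified stand-in, not the model-based algorithm used in the study: it merely picks the strongest probe in each run. The `threshold` and `min_run` parameters, the function name and the sample values are all hypothetical, and the probes are assumed to be sorted by position on a single chromosome.

```python
from typing import List, Tuple


def find_peak_probes(
    positions: List[int],
    signals: List[float],
    threshold: float,
    min_run: int = 3,
) -> List[Tuple[int, float]]:
    """Return (position, signal) for the strongest probe in each run of at
    least `min_run` consecutive probes whose signal exceeds `threshold`.

    `threshold` and `min_run` are hypothetical parameters, not values
    taken from the original study.
    """
    peaks = []
    run_start = None
    # A trailing sentinel below any threshold closes a run at the end.
    for i, value in enumerate(signals + [float("-inf")]):
        if value > threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_run:
                best = max(range(run_start, i), key=lambda j: signals[j])
                peaks.append((positions[best], signals[best]))
            run_start = None
    return peaks


# Made-up probe positions (100 bp spacing) and enrichment values.
positions = [1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700]
signals = [0.1, 1.2, 2.5, 1.8, 0.2, 1.5, 0.3, 0.1]

print(find_peak_probes(positions, signals, threshold=1.0))
```

With these sample values, the three-probe run at 1100 to 1300 yields one peak at position 1200, while the isolated enriched probe at 1500 is discarded because its run is shorter than `min_run`.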
The locations of the 12,150 peaks were compared to the annotated 5' end of
transcripts from RefSeq, GenBank and DBTSS, using a cutoff of 2.5 Kbp.
It was found that 10,504 peaks corresponding to 9,281 non-redundant
transcripts were within 2.5 Kbp of the annotated 5' end. Of the
remaining peaks, 47 were within 2.5 Kbp of Ensembl genes, resulting in a
total of 9,328 known non-redundant promoters. The remaining peaks were
further filtered using Acembly annotation and H3ac, RNAP and MeH3K4
ChIP-chip data. The total number of novel peaks was 1,239.
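The comparison of peaks against annotated 5' ends can be illustrated with a small sketch that splits peaks by their distance to the nearest annotated start. Only the 2.5 Kbp cutoff comes from the text; the function names, the inclusive treatment of the cutoff, and the coordinates are assumptions made for illustration, and the sketch ignores strand and handles one chromosome at a time.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

CUTOFF_BP = 2_500  # cutoff stated in the text


def nearest_tss_distance(peak: int, sorted_tss: List[int]) -> Optional[int]:
    """Distance from `peak` to the closest entry in `sorted_tss`
    (ascending, single chromosome), or None if the list is empty."""
    if not sorted_tss:
        return None
    i = bisect_left(sorted_tss, peak)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = sorted_tss[max(i - 1, 0): i + 1]
    return min(abs(peak - t) for t in candidates)


def classify_peaks(
    peaks: List[int], sorted_tss: List[int], cutoff: int = CUTOFF_BP
) -> Tuple[List[int], List[int]]:
    """Split peaks into those within `cutoff` bp of an annotated 5' end
    (inclusive, an assumption) and the remainder."""
    known, remaining = [], []
    for p in peaks:
        d = nearest_tss_distance(p, sorted_tss)
        if d is not None and d <= cutoff:
            known.append(p)
        else:
            remaining.append(p)
    return known, remaining


# Made-up coordinates for illustration only.
tss = [10_000, 50_000, 90_000]
peaks = [9_200, 52_500, 70_000]

known, remaining = classify_peaks(peaks, tss)
print("near annotated 5' end:", known)
print("remaining:", remaining)
```

In the study, peaks left in the remaining group were filtered further using additional annotation and ChIP-chip data before being called novel; that filtering step is not modelled here.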
The raw data are available from
GEO GSE2672.
Verification
The peaks from genome scan experiments were verified using condensed
arrays, as described in the Methods section.
The verification data may be viewed in the LI TAF1 Valid
track.
References
Kim TH, Barrera LO, Zheng M, Qu C, Singer MA,
Richmond TA, Wu Y, Green RD, Ren B.
A high-resolution map of active promoters in the human genome.
Nature. 2005 Aug 11;436:876-80.