Duke Affy Exon Track Settings
 
Affymetrix Exon Array from ENCODE/Duke   (ENC Exon Array)


Cell Line
GM12878 (Tier 1)
H1-hESC (Tier 1)
K562 (Tier 1)
A549 (Tier 2)
HeLa-S3 (Tier 2)
HepG2 (Tier 2)
HUVEC (Tier 2)
MCF-7 (Tier 2)
8988T
AoSMC
Chorion
CLL
Colo829
Fibrobl
FibroP AG08395
FibroP AG08396
FibroP AG20443
Gliobla
GM12891
GM12892
GM18507
GM19238
GM19239
GM19240
H7-hESC
H9ES
HEK293T
Hepatocytes
HMEC
HPDE6-E6E7
HSMM
HSMM FSHD
HSMMtube
HSMMtube FSHD
HTR8svn
Huh-7
Huh-7.5
iPS CWRU1
iPS NIHi7
iPS NIHi11
LNCaP
Medullo
Mel 2183
Melano
Myometr
NH-A
NHEK
Osteoblasts
ProgFib
Stellate
UCH-1
Urothelia
Cell Line   Rep   Track Name                                         Restricted Until
GM12878     1     GM12878 Exon array Signal Rep 1 from ENCODE/Duke   2010-09-16
GM12878     2     GM12878 Exon array Signal Rep 2 from ENCODE/Duke   2010-09-16
H1-hESC     1     H1-hESC Exon array Signal Rep 1 from ENCODE/Duke   2010-09-16
H1-hESC     2     H1-hESC Exon array Signal Rep 2 from ENCODE/Duke   2010-09-16
K562        1     K562 Exon array Signal Rep 1 from ENCODE/Duke      2010-09-16
K562        2     K562 Exon array Signal Rep 2 from ENCODE/Duke      2010-09-16

Description

This track displays human tissue microarray data using Affymetrix Human Exon 1.0 ST expression arrays. This RNA expression track was produced as part of the ENCODE Project. The RNA was extracted from cells that were also analyzed by DNaseI hypersensitivity (Duke DNaseI HS), FAIRE (UNC FAIRE), and ChIP (UTA TFBS).

Display Conventions and Configuration

In contrast to the hg18 annotation, this track now displays exon array data that has been aggregated to the gene level for those probes that have been linked to genes. Probes not linked to genes are not included. The display for this track shows gene probe location and signal value as grayscale-colored items where higher signal values correspond to darker-colored blocks.

Items with scores between 900 and 1000 have signal values greater than 9 that have been linearly scaled for that particular cell type. Items scoring 400 to 900 have signal values between 4 and 9; for these, the signal is simply multiplied by 100 to get the score. Items with scores between 200 and 400 have signal values below 4 that have been linearly scaled to fit that score range.
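The three score ranges above can be sketched as a piecewise mapping. This is only an illustration: the text does not say which endpoints anchor the two linearly scaled ranges, so the sketch assumes a per-cell-type minimum and maximum signal (the hypothetical parameters min_signal and max_signal), and the function name is likewise invented here.

```python
def signal_to_score(signal, min_signal, max_signal):
    """Map a gene-level signal value to a 200-1000 display score.

    min_signal and max_signal are ASSUMED per-cell-type extremes; the
    track description does not state how the linear scaling is anchored.
    """
    if signal > 9:
        # Assumed: signals in (9, max_signal] are scaled onto 900-1000.
        span = max_signal - 9
        frac = (signal - 9) / span if span > 0 else 1.0
        score = 900 + 100 * min(frac, 1.0)
    elif signal >= 4:
        # Stated in the text: signals between 4 and 9 are multiplied by 100.
        score = signal * 100
    else:
        # Assumed: signals in [min_signal, 4) are scaled onto 200-400.
        span = 4 - min_signal
        frac = (signal - min_signal) / span if span > 0 else 0.0
        score = 200 + 200 * max(frac, 0.0)
    return int(round(score))
```

With these assumptions, a signal of 6.5 maps to a score of 650 regardless of the chosen extremes, because only the middle range is fully specified by the text.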

The subtracks within this composite annotation track correspond to data from different cell types and tissues. The configuration options are shown at the top of the track description page, followed by a list of subtracks. To display only selected subtracks, uncheck the boxes next to the tracks you wish to hide.

For information regarding specific microarray probes, turn on the Affy Exon Probes track, which can be found in the Expression track group. See Methods for a description of how probe-level data were processed to produce gene-level annotations.

Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.

Data from these tracks are stored as BED files whose first six fields follow the BED format standard. The three additional fields are as follows:

  • signalValue: The normalized expression value for a gene, calculated as described below.
  • exonCount: The number of exons used in the calculation of the expression value.
  • constitutiveExons: The number of constitutive exons used in the calculation of the expression value.
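A minimal sketch of reading one record in this nine-field layout is shown below. It assumes tab-separated fields and that the three extra fields appear in the order listed above; the class name, function name, and the sample line in the usage note are hypothetical and not taken from the actual data files.

```python
from typing import NamedTuple


class GeneSignal(NamedTuple):
    # Standard BED fields 1-6.
    chrom: str
    chrom_start: int
    chrom_end: int
    name: str
    score: int
    strand: str
    # Track-specific fields 7-9 (column order assumed from the list above).
    signal_value: float
    exon_count: int
    constitutive_exons: int


def parse_bed_line(line):
    """Parse one tab-separated BED 6+3 line into a GeneSignal record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    return GeneSignal(
        chrom=fields[0],
        chrom_start=int(fields[1]),
        chrom_end=int(fields[2]),
        name=fields[3],
        score=int(fields[4]),
        strand=fields[5],
        signal_value=float(fields[6]),
        exon_count=int(fields[7]),
        constitutive_exons=int(fields[8]),
    )
```

For example, a made-up line such as "chr1", "1000", "2000", "GENE1", "650", "+", "6.5", "5", "4" joined by tabs would yield a record whose signal_value is 6.5 and whose constitutive_exons is 4.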

Methods

Cells were grown according to the approved ENCODE cell culture protocols. Total RNA was isolated from these cells using TRIzol extraction followed by cleanup on an RNeasy column (Qiagen) that included a DNaseI step. The RNA was checked for quality using a NanoDrop and an Agilent Bioanalyzer. RNA (1 µg) deemed to be of good quality was then processed by either 1) the standard Affymetrix Whole Transcript Sense Target Labeling protocol that included a riboreduction step, or 2) the NuGEN labeling system. The fragmented biotin-labeled cDNA was hybridized over 16 h to Affymetrix Exon 1.0 ST arrays and scanned on an Affymetrix Scanner 3000 7G using AGCC software.

Data from all replicates were then normalized together. Probesets flagged as cross-hybridizing were removed from the analysis (Salomonis et al. 2010). Though these arrays provide exon-level resolution, gene-level expression was estimated by grouping probesets by gene for normalization (Bemmo et al. 2008). Probesets were assigned to genes based on the GENCODE v10 annotation (July 2011). An exon was classified as constitutive or non-constitutive based on whether it was present in all protein-coding transcripts. For genes with at least 4 constitutive probes, only constitutive probesets were used to estimate gene expression. For all other genes, including all non-protein-coding genes, all (non-cross-hybridizing) probesets that mapped to an expressed exon in any transcript of the gene were used. Gene-level expression estimates were normalized using Affymetrix Power Tools (APT) (Lockstone 2011) with the chipstream command "rma-bg, med-norm, pm-gcbg, med-polish". This chipstream calls for an RMA normalization with GC-background correction using antigenomic background probes.
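The probeset selection rule described above could be sketched roughly as follows. The data layout (a list of dictionaries with the hypothetical keys "constitutive" and "cross_hyb") and the function name are assumptions made for illustration. The sketch also counts constitutive probesets, whereas the text states the threshold in terms of probes, and it assumes the input has already been limited to probesets that map to an expressed exon of the gene.

```python
def select_probesets(probesets, protein_coding, min_constitutive=4):
    """Choose which probesets contribute to a gene-level estimate.

    probesets: list of dicts with hypothetical boolean keys
        'constitutive' and 'cross_hyb'.
    protein_coding: whether the gene is protein-coding.
    """
    # Cross-hybridizing probesets are always excluded.
    usable = [p for p in probesets if not p["cross_hyb"]]
    constitutive = [p for p in usable if p["constitutive"]]

    # Enough constitutive evidence: use constitutive probesets only.
    if protein_coding and len(constitutive) >= min_constitutive:
        return constitutive

    # Otherwise (including all non-protein-coding genes): use everything usable.
    return usable
```

Under these assumptions, a protein-coding gene with four constitutive and two other usable probesets would be summarized from the four constitutive ones, while a gene with only three constitutive probesets would fall back to all five.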

Although the data were generated using the same microarray platform, two different experimental backgrounds were present due to a change in labeling reagents (Affymetrix vs. NuGEN; see Methods above). Batch effects related to this change caused array data to group by experimental protocol rather than by cell type relatedness. We used an R script (ComBat) to correct for this batch effect (Johnson et al. 2007).

Verification

When biological replicates were available, data were verified by analyzing replicates displaying a Pearson correlation coefficient > 0.9.
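As a rough illustration of this kind of check, the sketch below computes a Pearson correlation coefficient between two replicates and compares it with the 0.9 threshold. It assumes the replicates are compared on paired gene-level signal values, which the text does not state explicitly; the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(ss_x * ss_y)


def replicates_agree(rep1, rep2, threshold=0.9):
    """True if the two replicates correlate above the threshold."""
    return pearson(rep1, rep2) > threshold
```

Here rep1 and rep2 would be lists of signal values for the same genes in the same order, one list per replicate.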

Release Notes

This is release 3 of this track (April 2012). Several new cell types have been added. The name of cell line Astrocy was changed to NH-A.

Credits

RNA was extracted from each cell type by Greg Crawford's group at Duke University. RNA was purified and hybridized to Affymetrix Exon arrays by Sridar Chittur and Scott Tenenbaum at the University at Albany-SUNY. Data analyses were primarily performed by Nathan Sheffield (Duke University) with assistance from Melissa Cline (UCSC), Zhancheng Zhang (UNC Chapel Hill), and Darin London (Duke University).

Contact: Terry Furey

References

Bemmo A, Benovoy D, Kwan T, Gaffney DJ, Jensen RV, Majewski J. Gene expression and isoform variation analysis using Affymetrix Exon Arrays. BMC Genomics. 2008 Nov 7;9:529.

Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007 Jan;8(1):118-27.

Lockstone HE. Exon array data analysis using Affymetrix power tools and R statistical software. Brief Bioinform. 2011 Nov;12(6):634-44.

Salomonis N, Schlieve CR, Pereira L, Wahlquist C, Colas A, Zambon AC, Vranizan K, Spindler MJ, Pico AR, Cline MS et al. Alternative splicing regulates mouse embryonic stem cell pluripotency and differentiation. Proc Natl Acad Sci U S A. 2010 Jun 8;107(23):10514-9.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.