Description
This track shows average methylation status in CpG islands. In general,
methylation of CpG sites within a promoter causes silencing of the gene
associated with that promoter.
Release Notes
This is release 2 of this track. Release 2 adds tables for several new cell types: GM12891, GM12892, H1-hESC, HeLa-S3, and HepG2.
Track Conventions
Methylation status is color-coded as:
- orange = methylated (bed score = 1000)
- blue = non-methylated (bed score = 0)
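As a minimal sketch of how the color convention above could be read back from a BED record, the snippet below maps the score column to a label. The function name and the sample BED line are hypothetical and only illustrate the two score values documented for this track.

```python
def methylation_status(bed_score: int) -> str:
    """Map a BED score from this track to its methylation label."""
    if bed_score == 1000:
        return "methylated"       # displayed in orange
    if bed_score == 0:
        return "non-methylated"   # displayed in blue
    raise ValueError(f"unexpected score for this track: {bed_score}")


# Hypothetical BED line (chrom, start, end, name, score, strand).
example_line = "chr1\t1000\t1200\tregion_1\t1000\t+"
fields = example_line.split("\t")
status = methylation_status(int(fields[4]))  # score is the fifth BED column
```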
Methods
CpG regions were assayed via Methyl-seq, a method developed in the Myers
laboratory to measure the methylation status at CpGs throughout the genome. It
combines DNA digestion by the methyl-sensitive enzyme HpaII and its
methyl-insensitive isoschizomer MspI with the Illumina DNA sequencing
platform. The method was first applied in a collaboration with the laboratory
of Dr. Julie Baker at Stanford University to study methylation and gene
expression changes that occur in human embryonic stem cells before and after
differentiation to definitive endoderm. A paper describing the results as
well as the method has been published [1].
This study profiled genomic DNA and mRNA samples derived from two human
embryonic stem cell lines: H9 and BG02. These cells were differentiated into
definitive endoderm, embryoid bodies, embryoid body-derived cells, and AFP+
(alpha-fetoprotein positive) hepatocytes. These in vitro samples were profiled
with Methyl-seq and compared with normal tissue samples from 11-week and
24-week fetal liver and adult liver.
Methyl-seq assays more than 250,000 methyl-sensitive restriction enzyme
cleavage sites, representing more than 90,000 genomic regions. These regions
include 35,528 annotated CpG islands, while the remaining 55,084 non-CpG
island regions are distributed across the genome in promoters, genes, and
intergenic regions. Sequence tags present in MspI libraries but not in HpaII
libraries are derived from methylated regions. Conversely, sequence tags that
occur in HpaII libraries come from at least partially unmethylated regions.
In vitro differentiation
Definitive endoderm precursor cells were generated from H9 hES cells by treating
them with activin A. Embryoid bodies (EBs) were generated by growing
undifferentiated H9 and BG02 hESCs in suspension. EB-derived
cells were obtained by plating clumps of the cells from the EBs.
AFP+ fetal hepatocytes were derived from EBs by plating EB cells with FGF,
followed by fluorescence-activated cell sorting (FACS) to isolate cells
expressing the green fluorescent protein (GFP) reporter gene driven by the
AFP promoter.
Isolation of genomic DNA
Genomic DNA was isolated from biological replicates of each cell line using
the QIAGEN DNeasy Blood & Tissue Kit according to the manufacturer's
instructions. The DNA concentration and quality of each preparation were
determined by UV absorbance.
HpaII and MspI digestions
Cleavage of DNA by restriction endonuclease HpaII is prevented by the presence
of a 5-methyl group at the internal C residue of its recognition sequence
CCGG. MspI, an isoschizomer of HpaII, cleaves DNA irrespective of the
presence of a methyl group at this position.
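The difference between the two enzymes can be illustrated with a small simulation. This is a single-strand sketch under stated assumptions, not the laboratory protocol: both enzymes are modeled as cutting C^CGG, the function names are hypothetical, and `methylated_sites` is an illustrative set of 0-based start positions of CCGG sites whose internal C is methylated.

```python
import re

CUT_OFFSET = 1  # both enzymes cleave C^CGG, i.e. after the first C of the site


def cut_positions(seq, enzyme, methylated_sites=frozenset()):
    """Return cut coordinates in seq for 'MspI' or 'HpaII'."""
    if enzyme not in ("MspI", "HpaII"):
        raise ValueError(f"unsupported enzyme: {enzyme}")
    cuts = []
    for match in re.finditer("CCGG", seq.upper()):
        site_start = match.start()
        # HpaII is blocked by methylation of the internal C; MspI is not.
        if enzyme == "HpaII" and site_start in methylated_sites:
            continue
        cuts.append(site_start + CUT_OFFSET)
    return cuts


def digest(seq, cuts):
    """Split seq at the given cut coordinates."""
    bounds = [0] + list(cuts) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


seq = "AAACCGGTTTTCCGGAAA"   # CCGG sites start at positions 3 and 11
methylated = {11}            # assume the second site is methylated
mspi_fragments = digest(seq, cut_positions(seq, "MspI", methylated))
hpaii_fragments = digest(seq, cut_positions(seq, "HpaII", methylated))
```

In this toy example MspI cuts at both sites while HpaII skips the methylated one, so a fragment beginning at the second site appears only in the MspI digest.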
For the MspI library, 5 μg genomic DNA was digested in a 100 μl reaction
with 1X NEB Buffer2 and 20 units MspI restriction enzyme and incubated for 18 hr
at 37°C. For the HpaII library, 5 μg genomic DNA was digested in a
100 μl reaction with 1X NEB Buffer1 and 20 units HpaII restriction enzyme
and incubated for 18 hr at 37°C.
Note that in subsequent versions of the Methyl-seq protocol, which will be
described later, much lower amounts of genomic DNA were used (1 μg and
potentially lower).
DNA library construction and sequencing
High-throughput sequencing libraries were generated from DNA fragments of the
HpaII or MspI digested genomic DNA according to the protocol posted at the
Myers lab protocols page.
This approach was recently modified by removing the first PCR amplification
step, which preceded the gel electrophoresis size-selection step. Removing
this step was found to reduce a fragment-size bias in the sequencing libraries.
These libraries were sequenced with an Illumina Genome Analyzer (GA2) according to the
manufacturer's recommendations.
Data analysis
For this analysis, reads that align to the human genome assembly hg18 and contain the
5'-CGG-3' HpaII-cut signature at their 5' end were used.
These aligned sequence reads were mapped to CCGG sites predicted in silico on hg18.
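The read filter and site assignment described here might look roughly like the following. This is a sketch under assumptions: alignments are given as 0-based, half-open intervals on the forward strand, the offsets assume a C^CGG cut with end-repaired fragments, and all function names are hypothetical rather than taken from the actual pipeline.

```python
import re


def predict_ccgg_sites(reference):
    """Return the 0-based start positions of all CCGG sites in a sequence."""
    return {m.start() for m in re.finditer("CCGG", reference.upper())}


def has_cut_signature(read_seq):
    """True if the read begins with the 5'-CGG-3' signature left by the cut."""
    return read_seq.upper().startswith("CGG")


def site_for_alignment(strand, start, end, sites):
    """Assign an aligned read to a predicted site, or return None.

    Assumed geometry: a forward read starts one base into the site, and a
    reverse read ends three bases past the start of the site.
    """
    candidate = start - 1 if strand == "+" else end - 3
    return candidate if candidate in sites else None


reference = "AAACCGGTTTTCCGGAAA"
sites = predict_ccgg_sites(reference)                   # {3, 11}
forward_hit = site_for_alignment("+", 4, 10, sites)     # read sequence CGGTTT
reverse_hit = site_for_alignment("-", 8, 14, sites)     # read sequence CGGAAA
no_hit = site_for_alignment("+", 7, 13, sites)          # does not start at a cut
```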
Sites with four or more MspI tags occurring in either the forward or reverse
direction were retained for analysis. These "assayable" sites were then grouped with
neighboring sites that are within 35-75 bp of each other. Thus, a "region"
can comprise between 2 and 18 digestion sites that are each within
35-75 bp of another site. Methylated and non-methylated calls were made by
using HpaII tag data from all assayable cut sites.
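One possible reading of the site filter and grouping rule is sketched below. It assumes a single chromosome, treats "four or more tags in either direction" as the larger of the two strand counts reaching four, and chains consecutive sites whose spacing falls between 35 and 75 bp; the function names and input layout are hypothetical.

```python
MIN_MSPI_TAGS = 4          # threshold stated in the text
MIN_GAP, MAX_GAP = 35, 75  # allowed spacing between neighboring sites, in bp


def assayable_sites(mspi_counts):
    """mspi_counts maps site position -> (forward_tags, reverse_tags)."""
    return sorted(
        pos for pos, (fwd, rev) in mspi_counts.items()
        if max(fwd, rev) >= MIN_MSPI_TAGS
    )


def group_into_regions(positions):
    """Chain sorted site positions into regions of two or more sites."""
    regions, current = [], []
    for pos in positions:
        if current and MIN_GAP <= pos - current[-1] <= MAX_GAP:
            current.append(pos)
        else:
            if len(current) >= 2:
                regions.append(current)
            current = [pos]
    if len(current) >= 2:
        regions.append(current)
    return regions


# Illustrative counts only, not real data.
counts = {100: (5, 0), 150: (0, 4), 200: (3, 3), 230: (6, 1), 400: (9, 9)}
kept = assayable_sites(counts)      # site 200 falls below the threshold
regions = group_into_regions(kept)  # only 100 and 150 are 35-75 bp apart
```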
For each site across each region, the larger of
either the forward read count or reverse read count was used.
Regions that have an average of 0 or 1 read per cut site are called
methylated, and regions with more than one sequence read per site are called
unmethylated.
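The calling rule above can be written as a short function. This is a sketch, assuming that fractional averages up to and including 1 are treated as methylated (the text only names averages of 0 or 1 and "more than one"); the function name and input layout are hypothetical.

```python
def call_region(hpaii_counts):
    """Call a region from per-site HpaII counts.

    hpaii_counts is a list of (forward_reads, reverse_reads) tuples, one per
    assayable cut site in the region.
    """
    if not hpaii_counts:
        raise ValueError("a region needs at least one site")
    # Use the larger of the forward and reverse counts at each site.
    per_site = [max(fwd, rev) for fwd, rev in hpaii_counts]
    average = sum(per_site) / len(per_site)
    # Few HpaII reads means the sites were protected from cutting.
    return "methylated" if average <= 1 else "unmethylated"


# Illustrative counts only, not real data.
protected_region = call_region([(0, 1), (1, 0), (0, 0)])  # average 2/3
cut_region = call_region([(3, 0), (0, 2)])                # average 2.5
```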
Credits
Dr. Richard M. Myers
Mr. Yuya Kobayashi: yuyak@stanford.edu
Dr. Devin M. Absher: dabsher@hudsonalpha.org
Dr. Rebekka O. Sprouse: rsprouse@hudsonalpha.org
Contact: Flo Pauli.
References
1. Brunner AL, Johnson DS, Kim SW, Valouev A, Reddy TE, Neff NF, Anton E, Medina C, Nguyen L, Chiao E et al.
Distinct DNA methylation patterns characterize differentiated human embryonic
stem cells and developing human fetal liver.
Genome Research. 2009 Jun;19(6):1044-56. Epub 2009 Mar 9.
Data Release Policy
Data users may freely use ENCODE data, but may not, without prior
consent, submit publications that use an unpublished ENCODE dataset until
nine months following the release of the dataset. This date is listed in
the Restricted Until column on the track configuration page and
the download page. The full data release policy for ENCODE is available
here.