Description
This track displays genomic mappings for human LongSAGE tags from
The Cancer Genome Anatomy
Project. SAGE (Serial Analysis of Gene Expression) [Velculescu 1995] is a
quantitative technique for measuring gene expression. For a brief overview
of SAGE, see the CGAP SAGE information page.
Display Conventions and Configuration
Genomic mappings of 17-base LongSAGE tags are displayed. Tag counts are
normalized to tags per million (TPM) in each tissue or library. Tags with higher TPM are
more darkly shaded. The CATG restriction site before the start of the
tag is rendered as a thick line; the 17 bases of the tag are drawn as a thinner
line. Thus the thin end of the tag points in the direction of transcription.
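As a rough illustration of the normalization described above, the sketch below converts raw tag counts from a single library to tags per million. It assumes TPM is simply the tag's count divided by the total tag count of the library, scaled by one million; the function name `tags_per_million` and the example counts are hypothetical and are not taken from the CGAP pipeline.

```python
def tags_per_million(tag_counts):
    """Normalize raw tag counts from one library to tags per million (TPM).

    Assumption: TPM = count / total tags in the library * 1,000,000.
    """
    total = sum(tag_counts.values())
    if total == 0:
        # An empty library has no meaningful normalization; report zeros.
        return {tag: 0.0 for tag in tag_counts}
    return {tag: count * 1_000_000 / total for tag, count in tag_counts.items()}


# Hypothetical library of 50,000 tags; the last entry stands in for
# all remaining tags in the library.
library = {
    "ACGTACGTACGTACGTA": 25,
    "TTTTGGGGCCCCAAAAT": 5,
    "GGGGGGGGGGGGGGGGG": 49_970,
}
tpm = tags_per_million(library)
print(tpm["ACGTACGTACGTACGTA"])  # 500.0
```

Because the values are scaled by library size, a tag seen 25 times in a 50,000-tag library and a tag seen 50 times in a 100,000-tag library receive the same TPM, and so would be shaded the same.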
The track display modes are:
- dense - Draws locations of mapped tags on a single line.
- squish - Draws one item per tag per library without labels.
- pack - Draws one item per tag per tissue with labels. The label
includes the number of libraries of each tissue type containing the tag.
Clicking on an item lists the libraries containing the tag, with the libraries
from the selected tissue in bold. Clicking on a library in the list
displays detailed information about that library.
- full - Draws one item per tag per library.
Clicking on an item displays information about the library, along with other
libraries containing the tag.
The track can be configured to display only tags from a selected tissue.
Methods
Tag and library data, along with genomic mappings, were obtained
from The Cancer Genome Anatomy Project.
Information about the various SAGE libraries, data downloads and other tools
for exploring and analyzing these data is available from the
CGAP SAGE Genie web
site.
Mapping SAGE tags to the human genome
The goal of the SAGE tag mapping is to identify the genomic
loci of the associated mRNAs. Since it is impossible to disambiguate tags
that map to multiple loci, only unique genomic mappings are kept. To compensate
for polymorphisms between the reference genome and the mRNA libraries,
the mapping algorithm takes SNPs into account.
For each position in the genome, on both strands, all
possible 21-mers were considered, given all combinations of SNP alleles.
Of these, the 21-mers beginning with CATG (the 4-base restriction site followed
by a 17-base tag) were generated for use in mapping. Only 21-mers
that were unique across the genome were used in placing SAGE tags.
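The enumeration described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the CGAP implementation: the function names, the `snps` dictionary (0-based position mapped to a string of alleles that includes the reference base), and the in-memory sequence are all hypothetical, and positions are reported in strand-local coordinates.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def variant_kmers(seq, snps, start, k=21):
    """Yield every k-mer at `start` over all combinations of SNP alleles."""
    choices = [snps.get(pos, seq[pos]) for pos in range(start, start + k)]
    for combo in product(*choices):
        yield "".join(combo)


def unique_catg_kmers(seq, snps, k=21):
    """Map each CATG-prefixed k-mer found at exactly one locus to that locus.

    A locus is (strand, start), with start counted along that strand.
    """
    n = len(seq)
    strands = {
        "+": (seq, snps),
        # On the minus strand, positions are mirrored and alleles complemented.
        "-": (
            reverse_complement(seq),
            {n - 1 - pos: alleles.translate(COMPLEMENT) for pos, alleles in snps.items()},
        ),
    }
    loci = {}
    for strand, (strand_seq, strand_snps) in strands.items():
        for start in range(n - k + 1):
            for kmer in variant_kmers(strand_seq, strand_snps, start, k):
                if kmer.startswith("CATG"):
                    loci.setdefault(kmer, set()).add((strand, start))
    # Keep only k-mers that occur at a single locus.
    return {kmer: next(iter(where)) for kmer, where in loci.items() if len(where) == 1}


# Toy sequence: CATG + 17-base tag + 2 flanking bases, with a C/T SNP at position 5.
toy = "CATG" + "ACGTTGCAACGTTGCAA" + "GG"
index = unique_catg_kmers(toy, {5: "CT"})
print(index.get("CATG" + "ATGTTGCAACGTTGCAA"))  # ('+', 0)
```

With such an index, a 17-base tag would be placed by looking up `"CATG" + tag`. Note that the number of 21-mers per position grows multiplicatively with the number of SNPs in the window, so a genome-scale version would need a more careful data structure than this dictionary.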
Only SNPs from dbSNP with the following characteristics were used:
- single-base
- maps to a single genomic location
- reference allele matches reference genome
- does not occur in a tandem repeat
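The four SNP criteria listed above amount to a simple filter. The sketch below is only an illustration: the `Snp` record and its field names are hypothetical and do not reflect the dbSNP schema or the fields actually used when building the track.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    """Hypothetical, simplified SNP record (not the dbSNP schema)."""
    name: str
    ref_allele: str
    alleles: tuple          # all observed alleles, reference included
    num_mappings: int       # number of genomic locations the SNP maps to
    in_tandem_repeat: bool


def is_usable(snp, genome_base):
    """Apply the four criteria; `genome_base` is the reference genome base at the SNP."""
    return (
        all(len(allele) == 1 for allele in snp.alleles)  # single-base
        and snp.num_mappings == 1                        # single genomic location
        and snp.ref_allele == genome_base                # reference allele matches genome
        and not snp.in_tandem_repeat                     # not in a tandem repeat
    )


# Made-up records for illustration only.
good = Snp("snpA", "C", ("C", "T"), 1, False)
indel = Snp("snpB", "C", ("C", "CT"), 1, False)
multi = Snp("snpC", "C", ("C", "T"), 3, False)

print([s.name for s in (good, indel, multi) if is_usable(s, "C")])  # ['snpA']
```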
Human embryonic stem cell (ESC) library construction
Detailed information regarding the human ESC lines used in this study can be
found at https://stemcells.nih.gov and in Hirst et al. 2007.
The ESC tags were generated from RNA purified from human ESCs maintained under
conditions that promote their maintenance in an undifferentiated state.
A complete set of embryonic stem cell LongSAGE tags is available through the
CGAP web portal.
Credits
Many thanks to Martin Hirst of Canada's Michael Smith Genome Sciences Centre for his
assistance in developing this track.
The LongSAGE data and genomic mappings were provided by
The Cancer Genome Anatomy
Project of the National
Cancer Institute, U.S. National
Institutes of Health.
The human embryonic stem cell library was supported by funds from the
National Cancer Institute, National Institutes of Health, under Contract
No. N01-C0-12400 and by grants from Genome Canada, Genome British Columbia and
the Canadian Stem Cell Network.
References
Boon K, Osorio EC, Greenhut SF, Schaefer CF, Shoemaker J, Polyak K, Morin PJ, Buetow KH, Strausberg
RL, De Souza SJ et al.
An anatomy of normal and malignant gene expression.
Proc Natl Acad Sci U S A. 2002 Aug 20;99(17):11287-92.
PMID: 12119410; PMC: PMC123249
Hirst M, Delaney A, Rogers SA, Schnerch A, Persaud DR, O'Connor MD, Zeng T, Moksa M, Fichter K, Mah
D et al.
LongSAGE profiling of nine human embryonic stem cell lines.
Genome Biol. 2007;8(6):R113.
PMID: 17570852; PMC: PMC2394759
Khattra J, Delaney AD, Zhao Y, Siddiqui A, Asano J, McDonald H, Pandoh P, Dhalla N, Prabhu AL, Ma K
et al.
Large-scale production of SAGE libraries from microdissected tissues, flow-sorted cells, and cell
lines.
Genome Res. 2007 Jan;17(1):108-16.
PMID: 17135571; PMC: PMC1716260
Lal A, Lash AE, Altschul SF, Velculescu V, Zhang L, McLendon RE, Marra MA, Prange C, Morin PJ,
Polyak K et al.
A public database for gene expression in human cancers.
Cancer Res. 1999 Nov 1;59(21):5403-7.
PMID: 10554005
Liang P.
SAGE Genie: a suite with panoramic view of gene expression.
Proc Natl Acad Sci U S A. 2002 Sep 3;99(18):11547-8.
PMID: 12195021; PMC: PMC129301
Riggins GJ, Strausberg RL.
Genome and genetic resources from the Cancer Genome Anatomy Project.
Hum Mol Genet. 2001 Apr;10(7):663-7.
PMID: 11257097
Saha S, Sparks AB, Rago C, Akmaev V, Wang CJ, Vogelstein B, Kinzler KW, Velculescu VE.
Using the transcriptome to annotate the genome.
Nat Biotechnol. 2002 May;20(5):508-12.
PMID: 11981567
Siddiqui AS, Khattra J, Delaney AD, Zhao Y, Astell C, Asano J, Babakaiff R, Barber S, Beland J,
Bohacec S et al.
A mouse atlas of gene expression: large-scale digital gene-expression profiles from precisely
defined developing C57BL/6J mouse tissues and cells.
Proc Natl Acad Sci U S A. 2005 Dec 20;102(51):18485-90.
PMID: 16352711; PMC: PMC1311911
Velculescu VE, Zhang L, Vogelstein B, Kinzler KW.
Serial analysis of gene expression.
Science. 1995 Oct 20;270(5235):484-7.
PMID: 7570003