Description
These tracks display the level of sequence uniqueness of the mm9 reference
genome. They were generated using different window sizes; high signal
indicates regions where the sequence is unique.
Display Conventions and Configuration
This track contains multiple subtracks, one per window size,
that display individually on the browser. Instructions for configuring tracks
with multiple subtracks are
here.
These tracks provide a measure of how often the sequence found at a particular
location will align within the whole genome. Unlike measures of uniqueness, alignability
tolerates up to 2 mismatches. The tracks are signals ranging from
0 to 1 and have several configuration options.
Methods
The CRG Alignability tracks show how uniquely k-mer sequences align
to a region of the genome.
The tracks were computed with the GEM mapper, allowing up to two mismatches.
The method is equivalent to mapping sliding windows of k-mers back to the
genome (where k was set to 36, 40, 50, 75 or 100 nucleotides to produce
these tracks).
For each window, a mappability score S was computed as
S = 1/(number of matches found in the genome): S = 1 means one match in the
genome, S = 0.5 means two matches in the genome, and so on. The
CRG Alignability tracks were
generated independently of the ENCODE project, in the framework of the GEM
(GEnome Multitool) project.
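The scoring rule above can be illustrated with a minimal brute-force sketch. This is not the GEM implementation: the function names `hamming` and `mappability` are hypothetical, the comparison is quadratic in the sequence length, and only the forward strand is searched (an assumption made for brevity). It is meant only to show how S = 1/(number of matches) behaves on a toy sequence.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def mappability(genome, k, max_mismatches=2):
    """Return one score per k-mer start position in `genome`.

    Each score is 1 / (number of k-mer windows in `genome` that match
    the window at that position with at most `max_mismatches`
    mismatches). Every window matches itself, so the count is >= 1
    and the score lies in (0, 1].
    """
    # Enumerate all sliding windows of length k (forward strand only).
    windows = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    scores = []
    for w in windows:
        matches = sum(
            1 for other in windows if hamming(w, other) <= max_mismatches
        )
        scores.append(1.0 / matches)
    return scores


# Toy example with exact matching: "ACGT" occurs twice, so its two
# positions score 0.5, while the other windows are unique and score 1.
print(mappability("ACGTACGT", k=4, max_mismatches=0))
# → [0.5, 1.0, 1.0, 1.0, 0.5]
```

Raising `max_mismatches` can only increase the number of matches per window, so scores stay the same or drop, which is why alignability is a stricter signal than exact uniqueness.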
Release Notes
This is Release 1 (June 2012) of the ENCODE mappability track. It is a port of the old mappability track into the ENCODE format.
There are no new datasets.
Credits
The CRG Alignability track was created by Thomas Derrien and
Paolo Ribeca
in Roderic Guigo's lab at the Centre for Genomic
Regulation (CRG), Barcelona, Spain. TD was supported by funds from NHGRI
for the ENCODE project, while PR was funded by a Consolider grant
CDS2007-00050 from the Spanish Ministerio de Educación y Ciencia.
Data Release Policy
Data users may freely use ENCODE data, but may not, without prior consent,
submit publications that use an unpublished ENCODE dataset until nine months
following the release of the dataset. This date is listed in the
Restricted Until column, above. The full data release policy for ENCODE is available
here.