GWAS Catalog Track Settings

Home
Genomes
Genome Browser
Tools
Mirrors
- Third Party Mirrors
- Mirroring Instructions
Downloads
My Data
Projects
Help
About Us
- News
- Publications
- Blog
- Cite Us
- Credits
- Release Log
- Staff
- Contact Us
- Conditions of Use
- Jobs
- Licenses

Description

This track displays single nucleotide polymorphisms (SNPs) identified by published Genome-Wide Association Studies (GWAS), collected in the NHGRI-EBI GWAS Catalog published jointly by the National Human Genome Research Institute (NHGRI) and the European Bioinformatics Institute (EMBL-EBI). Some abbreviations are used above.

The Catalog is a quality controlled, manually curated, literature-derived collection of all published genome-wide association studies assaying at least 100,000 SNPs and all SNP-trait associations with p-values < 1.0 x 10^-5 (Hindorff et al., 2009). For more details about the Catalog curation process and data extraction procedures, please refer to the Methods page.

Methods

The GWAS Catalog data is extracted from the literature. Extracted information includes publication information, study cohort information such as cohort size, country of recruitment and subject ethnicity, and SNP-disease association information including SNP identifier (i.e. RSID), p-value, gene and risk allele. Each study is also assigned a trait that best represents the phenotype under investigation. When multiple traits are analysed in the same study either multiple entries are created, or individual SNPs are annotated with their specific traits. Traits are used both to query and visualise the data in the Catalog's web form and diagram-based query interfaces.

Data extraction and curation for the GWAS Catalog is an expert activity; each step is performed by scientists supported by a web-based tracking and data entry system which allows multiple curators to search, annotate, verify and publish the Catalog data. Papers that qualify for inclusion in the Catalog are identified through weekly PubMed searches. They then undergo two levels of curation. First all data, including association information for SNPs, traits and general information about the study, are extracted by one curator. A second curator then performs an additional round of curation to double-check the accuracy and consistency of all the information. Finally, an automated pipeline performs validation of the extracted data, see the Quality control and SNP mapping section below for more details. This information is then used for queries and in the production of the diagram.

Description

Methods

References