Notes from ASHG 2010 (American Society of Human Genetics)
November 2, 2010
Eric Lander (Broad Institute) – The human genome project: A decade later
The draft (~90% complete) of the human genome was announced in June, 2000 and published in February, 2001. The finished (~99.3%) sequence was announced in April, 2003 and published in October, 2004.
With the sequence available, we can now build maps of all kinds. Some types include structure maps, maps of molecular function and disease maps. We can also put together a catalog of signatures – allowing us to build platforms for gene expression and proteomics.
In 2000, there were four completed eukaryotic genomes (S. cerevisiae, C. elegans, D. melanogaster, A. thaliana) and 38 known prokaryotic genomes. In 2010, the completed count stands at 250 eukaryotes, 4000 bacteria/viruses and at least 500 human genomes. There are various reasons for this, a primary one being the drop in the cost of sequencing; it has fallen ~100,000-fold since 1999.
Understanding the genome. In 2000, the thinking was that there are 35,000 to 100,000 protein-coding genes, that regulatory sequences are not so numerous, that there is some non-coding sequence, and that transposons and the like are junk. In 2010, the gene count is 21,000, and the genome holds much more information than we thought: ~25% of evolutionarily conserved sequences are non-coding and number about 3 million elements (found by sequencing and comparing the genomes of 29 mammals); transposons are big players in the dissemination of these conserved elements; and there are the epigenome and the approximately 5000 large intergenic non-coding RNAs.
Mendelian traits. In 1990, we knew the source of 70 Mendelian traits. In 2000, that number was 1300. In 2010, it stands at 2900 Mendelian disorders identified (see OMIM). About 1800 more remain to be identified.
The basis of disease – complex diseases and traits. In 1990, we knew of only one locus, HLA. In 2000, the count was ~25, with things like APOE and Alzheimer disease. In 2010, it has risen to ~1100 across 165 common disease traits. But there is disappointment in GWAS because effect sizes are small and there is the problem of missing heritability. He thinks that rare variants are not needed to explain it, for four reasons: (1) the heritability explained increases as the number of subjects in the GWAS increases; (2) population genetics suggests that for many common diseases rare variants explain less than other variants; (3) (a point I missed); and (4) epistasis hugely distorts the estimate of variance ((a) GWAS finds all loci, (b) but the loci explain 33% of variance, (c) thus we need to use GWAS to identify the biology and then look at variance).
Cancer. In 1990, we knew of 12 solid tumor cancer genes. In 2000, that number was 80. In 2010, it is 240. New pathways are being discovered as pertinent in certain cancers.
History of human populations. He rushed through this and did not really provide any information that is not widely published.
John Stamatoyannopoulos (University of Washington) – Using ENCODE to read the human genome: Function and disease
ENCODE is used to guide interpretation of disease-associated genetic variation (GWAS). Many GWAS point to non-coding SNPs. The breakdown of GWAS SNPs by location: 47% in introns, 2% in promoters, 7% coding, 10% are 1-50 kbp from the nearest known gene, 14% are 50-100 kbp from the nearest known gene, and 18% are >100 kbp from the nearest known gene.
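As a quick check on the breakdown above, the categories can be tallied. This is only a sketch: the percentages are taken as recorded in these notes, and the dictionary keys are labels made up here for illustration, not terms from the talk. The recorded figures sum to 98%, so roughly 2% is not accounted for in the notes.

```python
# Percentages of GWAS SNPs by location, as recorded in the notes.
# Key names are illustrative labels (an assumption), not ENCODE terminology.
snp_location_pct = {
    "intron": 47,
    "promoter": 2,
    "coding": 7,
    "distal_1_50_kbp": 10,
    "distal_50_100_kbp": 14,
    "distal_over_100_kbp": 18,
}

# Total of everything that was noted down.
total_pct = sum(snp_location_pct.values())

# Everything recorded except the coding category.
noncoding_pct = total_pct - snp_location_pct["coding"]

# SNPs described only by their distance from the nearest known gene.
distal_pct = sum(
    pct for label, pct in snp_location_pct.items() if label.startswith("distal_")
)

print(f"recorded total: {total_pct}%")
print(f"non-coding:     {noncoding_pct}%")
print(f"distal to gene: {distal_pct}%")
```

Of the 98 percentage points recorded, 91 fall outside coding sequence, and 42 are described only by their distance from the nearest known gene.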
DNase I hypersensitivity site (D1HS) maps were overlaid on inflammatory bowel disease GWAS results near PTGER4. He uses data from relevant cell lines (Th2, Th1, B lymphocytes) and sees signals of histone marks in those cells.
Cancer GWAS at 8q24 (upstream of MYC). One SNP lands in an H3K27Ac site, a binding site for TCF7L2 (in colonic cells) and a D1HS.
26% of GWAS SNPs fall in D1HSs, a ~2.5-fold enrichment. GWAS SNPs for cognition, Parkinson disease, bipolar disorder, and others map to D1HSs found only in brain. He sees a similar result in heart for Q-T interval, atrial fibrillation, EKG traits and response to statin therapy.
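For reference, fold enrichment is usually the observed fraction divided by the fraction expected by chance. The notes do not record how the expected rate was defined in the talk, so the sketch below simply back-calculates it from the two numbers given: 26% observed at ~2.5-fold implies a background of about 10.4%. The function names are hypothetical and the background figure is an inference, not something stated by the speaker.

```python
def fold_enrichment(observed_fraction, expected_fraction):
    """Ratio of the observed overlap to the overlap expected by chance."""
    if expected_fraction <= 0:
        raise ValueError("expected_fraction must be positive")
    return observed_fraction / expected_fraction


def implied_background(observed_fraction, fold):
    """Back-calculate the expected fraction from an observed fraction and a fold value."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return observed_fraction / fold


# Numbers from the notes: 26% of GWAS SNPs in D1HSs, ~2.5-fold enrichment.
observed = 0.26
reported_fold = 2.5

# Inferred, not reported: the chance rate these two numbers imply.
background = implied_background(observed, reported_fold)

print(f"implied background fraction: {background:.3f}")
print(f"round-trip fold enrichment:  {fold_enrichment(observed, background):.2f}")
```

Because the reported fold is approximate, the implied background is approximate as well.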
ENCODE is heading to a point of nucleotide resolution in order to better define the regulatory genome.
Nathalie Cartier (INSERM) – Gene therapy for neurodegenerative diseases
Brain: 2% of body weight but 25% of all cholesterol.
LP: Hence the Alzheimer-lipid links
Michael Meaney – Environmental regulation of the neural epigenome
Environmental factors are social (parental) and economic (food, shelter, safety).
Parental care leads to epigenetic marks, which lead to changes in gene expression, which in turn lead to a phenotype. His example is licking of young rat pups (in the first one to two weeks of life) by rat mothers. This licking (care) leads to changes in phenotypic responses to stress, neural development, female reproduction and metabolism. He intends to discuss the endocrine response to stress, which involves expression of specific genes in specific brain region(s).
[Cool stuff – but delivered like a speed reading of a journal article...]