Thursday, November 4, 2010

ASHG 2010 conference notes - 4 Nov 2010

Notes from ASHG 2010 (American Society of Human Genetics)
Washington, D.C. 4 November 2010

A. Goldstein – Challenges to identification of high-risk alleles

High-risk alleles are rare to very rare and typically confer a relative risk greater than 5.

Challenges to finding high-risk alleles
There really is no major high-risk gene
Lack of power or informativeness
Underlying complexity of genetics
Clinical and epidemiological heterogeneity and/or misclassification
Follow-up of linkage results

Illustrations of challenges
BRCA1 – 10% of risk of breast cancer
BRCA2 – 12% of risk of breast cancer
Existence of a high-risk "BRCA3" is rather unlikely

CDKN2A/ARF – ~20% risk for melanoma
CDK4 – ~1% risk for melanoma

So, increase power of the study. Better use or incorporate:
Molecular genetic data
Functional genomics data
Epidemiological and clinical data

New technology may help – such as NextGen sequencing


J. Bailey-Wilson – Complex traits really are complex

Major environmental risk factors may be common
Major genetic risk alleles for serious diseases tend to be rare in population
- Due to selection
- A major locus may have many “risk” alleles

She offers breast cancer as a model. Traditional approaches identified BRCA1 and BRCA2, but then came GWAS.

Linkage is very powerful to detect high penetrance risk alleles in families. Association is very powerful to detect common risk alleles but – if each family has a different, rare or private allele/variant, association will not succeed.

Why has “the gene” not been found?
- False positive linkage
- Have the right gene but don’t understand it yet
- Haven’t yet sequenced fully the region defined by the linkage study
- It is not a gene but a regulatory region
- Could be a long, non-coding RNA
- MicroRNAs and intronic variants, too

Synonymous variants are interesting – change the kinetics of translation!

She is hopeful that more sequencing will be done under broad linkage peaks. But need to phenotype well to fully test for GxE influence.


E. Wijsman – Cardiovascular QTLs and large pedigrees

They are looking at familial combined hyperlipidemia (FCHL) in 4 families with 253 subjects. They looked at 600 STRs and 48K SNPs on CVD chip. The phenotype of choice is plasma APOB. For plasma APOB levels, they noted a LOD score of 3.1 on chromosome 4.
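As a reminder of what a LOD score of 3.1 means, here is a minimal sketch assuming the standard definition (log10 of the ratio of the likelihood under linkage to the likelihood under no linkage). The function names are illustrative and are not taken from the talk.

```python
import math


def lod_score(likelihood_linked: float, likelihood_unlinked: float) -> float:
    """LOD = log10 of the likelihood ratio (linkage vs. no linkage)."""
    return math.log10(likelihood_linked / likelihood_unlinked)


def lod_to_odds(lod: float) -> float:
    """Convert a LOD score back to the likelihood ratio it represents."""
    return 10 ** lod


if __name__ == "__main__":
    # The chromosome 4 peak reported for plasma APOB.
    print(f"LOD 3.1 corresponds to odds of about {lod_to_odds(3.1):.0f}:1")
```

So a LOD just above 3 says the data are roughly a thousand times more likely under linkage than under no linkage.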

Across this large APOB linkage peak, they used each SNP as a covariate to see which one(s) abolish the peak. Then, which gene? Do exome sequencing. All this identified a SNP in LRBP but direct genotyping of the entire pedigree brought the variance from 0.4 to ~0.18 – killed it. So, need to generate many candidate variants for quick screening by genotyping the entire pedigree – because finding one SNP and testing it in a one-by-one manner is not efficient.

The exome data may identify a haplotype which extends to the non-exome.


N. Camp – Analytical strategies to identify rare risk variants using extended high-risk pedigrees

They use Utah family data: 2.2 million individuals over three to eleven generations, with hospital records.


J. Degner – Using genome-wide sensitivity data to infer transcription factor binding

Transcription factor binding sites (TFBS) are poorly annotated. They use ENCODE’s DNase I data. See for their tool – it uses 230 position weight matrices, 800,000 sites. They also have an article in press at Genome Research. So, use this to check GWAS hits. An example is a binding site QTL for PEBPI.
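For readers who have not worked with position weight matrices, the sketch below shows the basic idea of scoring a sequence against one. It is only a toy illustration and not the speakers' tool: the count matrix, the uniform background, and the pseudocount are all made-up assumptions, and no DNase I data is used.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per motif position).
COUNTS = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
BACKGROUND = 0.25   # assumed uniform base frequencies
PSEUDOCOUNT = 1.0   # avoids log(0) for bases never observed at a position


def log_odds_matrix(counts):
    """Turn raw counts into log2(observed frequency / background) scores."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * PSEUDOCOUNT
        pwm.append({
            base: math.log2(((column[base] + PSEUDOCOUNT) / total) / BACKGROUND)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; returns (start, score) pairs."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return hits


def best_hit(sequence, pwm):
    """Return the (start, score) pair with the highest score."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])


if __name__ == "__main__":
    pwm = log_odds_matrix(COUNTS)
    print(best_hit("TTTACGTTT", pwm))
```

A real scan would also cover the reverse strand and apply a score threshold; both are left out here to keep the sketch short.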


I. Aneas – What are the downstream targets of Tbx20?

- differential expression in Tbx20 wildtype vs knockout mice, in heart tissue
- ChIP-seq data from embryo gives 2000 binding sites, from adult gives 4000 binding sites

Combining the above gives 2000 genes. This set is enriched for ion transport and calcium homeostasis functions.


A. Letourneau – Effect of trisomy 21 on gene expression

They used a twin study – monozygotic twins where one is trisomic for Chr21 and the other is not. Many genes on Chr21 and elsewhere in the genome show differential expression. Many Chr21 genes show a >1.5-fold increase in expression in the trisomic:normal comparison. 58 genes show Chr21-trisomy-specific alternative splicing. [LP: This has got to be a harbinger of what is possible with careful analysis of the effect of CNVs.]


T. Teslovich – Sequencing of 400 cases, 200 controls at 26 genes for type 2 diabetes

Goal: Identify rare variants in genes implicated by GWAS.

To date, the most interesting finding is GCKR variant E584X (stop codon). In study #1, the minor allele frequency (MAF) was 0.56% in cases and 0.80% in controls. In study #2, the MAF was 0.08% in cases and 0.15% in controls. (I missed values for study #3.) The point is that these differences in allele frequency are not significant. So, go to the Metabochip with 14,000 cases and 17,000 controls. This is on-going…
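To illustrate why frequencies this low are hard to separate, here is a minimal sketch of a two-sided Fisher's exact test on allele counts. The counts are hypothetical: the sample sizes behind the study #1 frequencies are not recorded in these notes, so 2,000 chromosomes per group is assumed, with 11 and 16 minor alleles chosen only to approximate MAFs of 0.56% and 0.80%.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Hypothetical counts: (minor alleles, other alleles) per group.
    cases = (11, 1989)     # ~0.55% MAF in 2,000 case chromosomes (assumed)
    controls = (16, 1984)  # 0.80% MAF in 2,000 control chromosomes (assumed)
    p = fisher_exact_two_sided(*cases, *controls)
    print(f"two-sided p = {p:.3f}")
```

With only a few dozen minor alleles in total, a difference of this size sits well within sampling noise, which is the motivation for moving to a much larger genotyping sample.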


H. Daoud – Exome sequencing in ALS families

Six candidate genes were identified that are shared in two ALS families, but none are shared in three families. This is indicative of the heterogeneity of ALS.


D. MacArthur – Loss-of-function mutations in healthy human genomes

A loss-of-function (LOF) variant is a premature stop, a splice site disruption, a small indel leading to a frameshift, or similar.

Data from the 1000G pilot:
- 1088 stop SNPs
- 643 splice disruptors
- 956 small (< 40 bp) frameshift indels
- 147 genes disrupted by large indels

The implication is that each person carries many variants of these types. ~25% (453 of ~1743) LOF variants did not pass manual validation. So, a few of these LOF variants are actually artifacts of RefSeq errors and gene model errors. Gene models will be corrected in the next release of Gencode so that subsequent clinical sequencing won’t have to deal with this. In other words, these annotation errors should no longer appear.

The estimate is that there are ~140 true LOF variants per individual and about 35 of these are homozygous.
