Variable Genome: 2011

Monday, December 12, 2011

A bitter taste for the microbiome

Taste is a fine thing - nuanced by the presence of many compounds, their ratios to one another, and past experience. That is true of foods that enter the mouth. Could this also be true of the mix of food waste and bacteria in the colon? Although sampling a spoonful of Firmicutes or Bacteroidetes is admittedly something quite revolting, my thinking is leaning toward "yes." The brain likely has an idea of who is in residence in the colon and which metabolic byproducts are present.

A stir was created in 2007 when it was reported that two taste receptors and gustducin have a role in glucose-mediated responses, suggesting function as a previously undescribed glucose sensor in the gut lumen (Jang, Kokrashvili, et al. 2007 Proc Natl Acad Sci USA 104:15069–15074). One can find many news articles from that time highlighting this finding. It seems that the repertoire of taste receptors expressed in the gut, particularly in the colon, is much more extensive.

The ATLAS gene expression tool at the EBI is a semantically enriched database of meta-analysis based summary statistics over a curated subset of ArrayExpress gene expression data. ATLAS supports queries for condition-specific gene expression patterns as well as broader exploratory searches for biologically interesting genes. Using ATLAS, I found that the following ten taste receptors are expressed either in the small intestine or the colon:

TAS1R1
TAS2R13
TAS2R14
TAS2R16
TAS2R3
TAS2R38
TAS2R4
TAS2R7
TAS2R8
TAS2R9

Every one of these, except for one, is described as responding primarily to bitter tastants. TAS2R38 is sensitive to glucosinolates, a plant-derived family of compounds that do have a bitter-like taste. TAS1R1 is the exception: it is a member of the sweet/umami (TAS1R) receptor family, detecting primarily L-enantiomers of certain amino acids. It is highly likely that querying other gene expression databases will turn up other members of the taste receptor family as expressed in the lower gastrointestinal tract.

It makes sense to me that proteins that are designed to communicate what is in essence the composition of the external environment should be given such roles as sentry or monitor beyond the taste buds on the tongue. Taste receptors are designed to sense specific classes of chemicals and to relay a signal to the brain. Fecal fermentation of proanthocyanins, phytochemicals and other complex molecules is likely to be monitored in some way for the benefit of the human host. Perhaps taste receptors have a role in that process.

Wednesday, October 26, 2011

The curious case of SNP rs2880301

"Wow! I've never seen anything like that before," my colleague Chao-Qiang Lai exclaimed when examining output from his analysis of genome-wide association (GWAS) data. He was looking for genetic markers influencing the level of triglyceride in serum as part of the GOLDN study. GOLDN is looking at the genetics of the response to lipid-lowering medication. The result of Chao's preliminary analysis indicated that SNP rs2880301 associated with TG levels with a p-value of 10^-218. He showed me some data from the scan of the Affymetrix 6.0 genotyping chip and we postulated that we could be looking at some type of CNV (copy number variant) or deletion, but the lack of minor allele homozygotes troubled us.

What intrigued us right from the start was that our colleagues at other institutions who are also analyzing the GOLDN GWAS data did not report this SNP in their initial findings. The dbSNP entry for rs2880301 indicates a C to T variant with an allele frequency of 0.24 in the four primary HapMap populations from USA, Nigeria, China and Japan. No differences in allele frequency means no chance of positive (or negative) selection on this variant. No, none indeed as we were to learn later.

So, Chao dug deeper into his data and he and I shot ideas back and forth. After my suggestion to look at sex, he saw that when the SNP and sex are together in the same model, the analysis did not complete. Then, looking at the individual genotypes, he saw that all the men had genotype CT and all the women CC. This is from a total of just over 800 subjects.
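Why the joint model failed can be shown with a small check: if genotype is fully determined by sex, the two predictors carry the same information and a regression cannot separate their effects. Here is a minimal sketch of that check. The records and function names are made up for illustration and are not GOLDN data.

```python
from collections import defaultdict

def genotypes_by_sex(records):
    """Map each sex to the set of genotypes observed in that sex."""
    seen = defaultdict(set)
    for sex, genotype in records:
        seen[sex].add(genotype)
    return dict(seen)

def is_perfectly_confounded(records):
    """True if each sex shows exactly one genotype and no genotype is shared."""
    table = genotypes_by_sex(records)
    if any(len(genotypes) != 1 for genotypes in table.values()):
        return False
    singles = [next(iter(genotypes)) for genotypes in table.values()]
    return len(set(singles)) == len(singles)

# Hypothetical toy records mirroring the pattern described: men CT, women CC.
toy_records = [("M", "CT"), ("F", "CC"), ("M", "CT"), ("F", "CC")]
print(is_perfectly_confounded(toy_records))
```

When this returns True, SNP and sex are interchangeable as predictors, which is consistent with an analysis that cannot complete when both are in the model.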

OK, time for me to step in and see where this "SNP" maps in the genome. My first query was the flanking sequence supplied by Affymetrix. This 33-bp segment maps nearly perfectly both to chromosome 13, within intron 1 of the TPTE2 gene and agreeing with both dbSNP and Affymetrix's annotation of the SNP, and, curiously, to a spot on the Y chromosome. (TPTE2 is a membrane-associated phosphatase which acts on the 3-position phosphate of inositol phospholipids and could be argued to be relevant to TG biology.) The only residue not matching is the "polymorphic" base of the "SNP." A C is found on chr13 and a T is found on Y. Thus, the SNP becomes a marker of sex and Chao was right - it is a type of deletion (females carry no Y chromosome) - but a deletion he had not envisioned.

Is rs2880301 then a marker for gender? Not really. I compared the genomic regions where the homologous SNP sequences were found on both chromosomes, extending over 6 kbp in each direction. I saw a large region of sequence identity between 13 and Y - over 96% - for a ~5 kbp segment. Running RepeatMasker indicated that rs2880301 falls within an L1 LINE, a common repeat element. Thus, while it is intriguing that an array of repeats (70% of the 13-kbp segment of chr13 is masked by RepeatMasker) is conserved, and in order, between chromosomes 13 and Y, SNP rs2880301 is not really a SNP. All subjects are C on chr13 and all Y chromosomes are T.

What we then had in our data were five genotypes: CC on chr13 for all women, CC on chr13 for all men and T on Y. Thus, the "allele frequencies" of C between 0.75 and 0.80 and T between 0.20 and 0.25 seen by us and others, including the HapMap data, roughly correspond to populations that are half to slightly more than half women.
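The arithmetic behind those "allele frequencies" can be sketched quickly. The assumption here is that the genotyping software treats the marker as an ordinary diploid SNP, so every man is called CT and every woman CC, as we saw in the individual genotypes. The function name is my own.

```python
def apparent_t_frequency(fraction_women):
    """Apparent T allele frequency when men are called CT and women CC.

    Each subject contributes two called alleles; only men contribute a T,
    and each man contributes exactly one.
    """
    if not 0.0 <= fraction_women <= 1.0:
        raise ValueError("fraction_women must be between 0 and 1")
    fraction_men = 1.0 - fraction_women
    return fraction_men / 2.0

for fraction_women in (0.5, 0.6):
    t_freq = apparent_t_frequency(fraction_women)
    print(f"women={fraction_women:.2f}  T={t_freq:.2f}  C={1.0 - t_freq:.2f}")
```

Under that assumption, a cohort that is half women gives T = 0.25 and one that is 60% women gives T = 0.20, which brackets the reported range.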

Tuesday, September 13, 2011

Genetics of hypertension

An important paper describing genetic factors involved in blood pressure and heart disease was published this week. 29 loci were identified - 16 of these novel. This is impressive, but not so surprising as both diastolic and systolic blood pressure are complex, heritable traits. Many, many factors are at play here. In fact, the authors describe a risk score based on genotypes at 29 genome-wide significant variants, which was associated with hypertension, left ventricular wall thickness, stroke and coronary artery disease, but not kidney disease or kidney function.
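For readers unfamiliar with genetic risk scores, one common construction, which is not necessarily the one the authors used, is a sum of risk-allele counts across variants, optionally weighted by each variant's effect size. A minimal sketch follows; the variant names and weights are hypothetical placeholders, not values from the paper.

```python
def genetic_risk_score(risk_allele_counts, weights=None):
    """Sum risk-allele counts (0, 1 or 2 per variant), optionally weighted.

    risk_allele_counts: dict of variant id -> number of risk alleles carried.
    weights: optional dict of variant id -> per-allele effect size;
             variants without a weight count as 1.0.
    """
    weights = weights or {}
    score = 0.0
    for variant, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{variant}: risk-allele count must be 0, 1 or 2")
        score += weights.get(variant, 1.0) * count
    return score

# Hypothetical subject genotyped at three made-up variants.
subject = {"variant_1": 2, "variant_2": 0, "variant_3": 1}
print(genetic_risk_score(subject))
print(genetic_risk_score(subject, {"variant_1": 0.5, "variant_2": 1.0, "variant_3": 0.25}))
```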

Some time ago, I wrote about the genome of the SHR (spontaneously hypertensive rat). This rat and the FHH strain are models for hypertension. The genome of the SHR strain showed that 788 genes contained variants with respect to the reference rat genome. Whether any or all of those variants function in producing the hypertension phenotype is not clear, but the reasoning went that these variants would be a logical place from which to build a list of candidate genes.

Well, I thought it would be a fun and quick exercise to see how many of these 29 loci, detected by genome-wide associations in nearly 70,000 individuals of European ancestry (with validation of top signals in up to 133,000 additional individuals of European descent), correspond to genes that possess genetic variants in either the FHH or SHR rat strains. This may give insight into the applicability of sequencing the genomes of specific model organism strains and into how many or how often genes are identified by both GWAS and whole-genome sequencing. To do this, I used the Rat Genome Browser hosted by the Medical College of Wisconsin.

Of the 29 human loci, 23 map in the rat genome within a QTL for blood pressure. This looks good, but consider that there are numerous blood pressure (BP) QTL mapped throughout the rat genome. In fact, the rat Adm gene maps within 27 different BP QTL and three other gene regions, Furin - Fes, Plekha7 and Mov10, map within more than 20 different BP QTL. Some of these QTL are large, spanning many genes, which means that fine-mapping is needed - such as a GWAS - to identify more precisely candidate loci.

Only 18 of the human BP genes identified in the paper contain SNPs in either the FHH or SHR strains. Often, the variants are shared in both strains. Both synonymous and nonsynonymous SNPs were noted, but synonymous far outnumbered those variants that altered the underlying amino acid sequence. No SNPs in gene control regions were noted, which may indeed be the case or a limitation of the data sources used here.

The human genes whose rat versions contain SNPs in the hypertensive-susceptible strains are:

SLC39A8
ATP2B1
GNAS - EDN3
MTHFR - NPPB
FGF5
CYP1A1 - ULK3
FURIN - FES
FLJ32810 - TMEM133
NPR3 - C5orf23
EBF1
PLCE1
BAT2-BAT5
ZNF652
TBX5 - TBX3
JAG1
GUCY1A3 - GUCY1B3
MECOM
ULK4

SNPs altering gene expression still need to be added to this analysis. Nonetheless, the numbers and types of genes that share genetic variation in hypertensive mammals (human, rat) are revealing. It is likely that the 788 identified genes with variation in the SHR rat are not all important for hypertension, but that strain does carry variants in 17 of these new BP genes. Or is that just 17?

Tuesday, May 24, 2011

More on microRNAs - a nutrition connection

The landscape at the intersection of microRNA (miR) expression and diet is sparse. It is sparser still when it comes to how bioactive food components affect the physical aspects of the miR-mRNA interaction.

Nonetheless, evidence has been reported to suggest that miRs are key metabolic regulators. In adipose of mice, expression of miRs was shown to be sensitive to conjugated linoleic acid in the diet. In rats fed a diet of corn oil/fish oil with pectin/cellulose and in which colonic tumors were induced, a number of miRs, including miR-16, miR-19b, miR-21, miR-26b, miR-27b, miR-93 and miR-203, exhibited altered expression and were linked to oncogenic signaling pathways. Also in rats, downregulation in the liver of three miRs (miR-122, miR-451 and miR-27) and upregulation of miR-200a, miR-200b and miR-429 were noted after feeding of either a high-fat or high-fructose diet with consequences of diet-induced nonalcoholic fatty liver disease.

In mice, pregnant and lactating dams fed a high-fat diet displayed reduced expression of miR-26a, miR-122, miR-192, miR-194, miR-709 and the let-7 family with a common predicted target of methyl-CpG binding protein 2 (Mecp2).

A comparison of miR expression profiles in subcutaneous adipose of women highlighted eleven miRNAs as significantly deregulated in obese subjects with and without type 2 diabetes. Many of the same miRs also showed significant deregulation during adipocyte differentiation. The role of diet in regulating miR expression in prostate cancer has been reviewed. MiR-33, encoded in an intron of SREBF1/SREBF2, cooperatively regulates cholesterol homeostasis via targeting of ABCA1 and NPC1. The FXR/SHP signaling cascade regulates miR-34a and its target SIRT1, which likely functions as either a regulator of epigenetic gene silencing or an intracellular regulatory protein with mono-ADP-ribosyltransferase activity.

Using a mouse diet-induced obesity model, it was shown that hepatic expression of miR-107 decreases while its target FASN, encoding fatty acid synthase, increases.

In summary, there is a growing body of evidence that strongly implicates microRNAs as having significant functions in regulating the metabolic response of a number of cell types.

Friday, April 1, 2011

CDKN2A and its response to diet

Although it is April 1st here, there is some serious business taking place on my desktop: Cleanup day. I'm reading through an electronic pile of papers and news items that have gathered over the last weeks.

Here's an interesting bit about human gene CDKN2A. This gene encodes cyclin-dependent kinase inhibitor 2A and is also known as p16^INK4a. Suppression of CDKN2A by glucose restriction in human cells (fetal lung fibroblasts) was shown by Li & Tollefsbol to contribute to lifespan extension via epigenetic and genetic mechanisms that were mediated by SIRT1.

A year ago, we published a paper showing the effects on gene expression in mononuclear cells in metabolic syndrome subjects after intake of phenol-rich virgin olive oil. We noted repressed expression of several pro-inflammatory genes. Interestingly, CDKN2A was also significantly repressed. Thus, two dietary conditions - low glucose and phenol-rich olive oil - repress expression of this gene, albeit in different cell types and under different circumstances.

Monday, March 28, 2011

MicroRNAs and glucose metabolism

A noteworthy article released yesterday in Nature Cell Biology reports that, in mouse liver, microRNA miR-143 induces the down-regulation of oxysterol binding protein-like 8 (OSBPL8, ORP8). This leads to an impaired ability of insulin to induce activation of AKT, a protein kinase. What is most interesting about this work is the presentation of evidence that microRNA-based regulation of gene activity is important in glucose homeostasis and perhaps in the onset of type 2 diabetes.

Before I get into what else is known about miR-143/MIRN143, it is interesting to note that OSBPL8 suppresses ABCA1 expression and cholesterol efflux from macrophages, as reported by Yan et al. (2008). ABCA1 is itself regulated, in part, by microRNA MIR33A encoded within SREBF2 to regulate both HDL biogenesis in the liver and cellular cholesterol efflux.

MIRN143:
MIRN143 is frequently observed to be downregulated in colorectal (Ng Sung 2009 Br J Cancer 101:699) and gastric cancers (Takagi Akao 2009 Oncology 77:12)

MIRN143 is frequently downregulated in pancreatic cancer cells (Kent Mendell 2009 Cancer Biol Ther 8:2013)

MIRN143 is a transcriptional target of myocardin and other transcriptional factors involved in smooth muscle cell fate (Cordes Srivastava 2009 Nature 460:705)

MIRN143 has also been found to play a role in adipocyte differentiation (Xie Lodish 2009 Diabetes 58:1050, Walden Cannon 2009 J Cell Physiol 218:444, Takanabe Hasegawa 2008 Biochem Biophys Res Commun 376:728, Esau Griffey 2004 J Biol Chem 279:52361)

Expression of MIRN143 was elevated in differentiating adipocytes and inhibition of MIRN143 could suppress differentiation of adipocytes (Esau Griffey 2004 J Biol Chem 279:52361)

Ectopically expressed MIRN143 in preadipocyte 3T3-L1 cells has been found to accelerate adipogenesis (Xie Lodish 2009 Diabetes 58:1050)

In addition, MIRN145, which neighbors MIRN143 in the human genome, is also a participant in this regulatory network:

IRS1 translation is downregulated by MIRN145 (Shi B, Baserga R, et al J. Biol. Chem. 282:32582-32590, 2007)

MIRN145 regulates actin cytoskeletal dynamics (Xin 2009 Genes Dev 23:2166)

Stem cell pluripotency is regulated by MIRN145 (Xu 2009 Cell 137:647)

Thursday, March 10, 2011

Genetics of coronary heart disease

Note: This is a guest-post, authored by geneticist and molecular biologist Dr. Chao-Qiang Lai; with edits added by LP.

Last week, Nature Genetics published three letters reporting results from genome-wide association studies (GWAS) for coronary artery disease (CAD). The studies reported a number of markers that reached the threshold of statistical significance for association to CAD with concomitant association to traditional biomarkers of disease risk, such as elevated LDL-cholesterol (LDL-C), elevated total cholesterol, decreased HDL-cholesterol (HDL-C), hypertension, obesity (as measured by elevated body mass index), or type 2 diabetes. However, the two larger and more highly powered GWAS (C4D Genetics Consortium, Schunkert, et al.) also identified CAD-associated variants that are not associated with traditional biomarkers. The third study is of interest because it examines CAD in Chinese populations, but beginning with a discovery set of 130 cases and 130 controls leaves it a bit under-powered. Its authors report a unique association between a SNP in C6orf105 and CAD, which is not found in European or south Asian populations. Curiously, this gene has also been implicated in non-syndromic oral cleft.

There are many sources of CAD. Blood lipids are most commonly thought of as the prime source, but blood pressure in the form of hypertension is also a source. Traditional biomarkers such as LDL-C, HDL-C, triglycerides, and hypertension have been used almost as the sole surrogates for measuring the development and progression of CAD over the course of some 50 years. Meta-analyses of GWAS based on over 100,000 subjects (22,233 cases and 64,762 controls from 14 GWAS) thus far have identified 23 genetic variants associating with CAD. The eye-opening aspect to this is that these variants account for about 10% of CAD cases, with the shocking observation that 17 of 23 confirmed loci appear to have no association with traditional markers. This observation then suggests two possible explanations.

One possibility is that the remaining CAD cases (90%) do carry risk tied to traditional markers, but the genetic factors involved cannot be detected with current GWAS methodology. This is likely to be true because the effect sizes of these variants are too small, or because their effects are camouflaged by gene-gene (GxG) and gene-environment (GxE) interactions or by epigenetic mechanisms.

The second possibility rests on the fundamental premise that all markers associating with CAD have a more or less equal chance of being detected. It then follows that a majority of the genetic factors that contribute to CAD have nothing to do with traditional markers. If this is indeed the case, it opens a new avenue to identify the new mechanism(s) and new biomarkers that lead to CAD. In fact, this possibility is supported by many observations. For example, 50% of those individuals who have CAD have low LDL-C (Braunwald & Shattuck; Ridker).

These genes – for example, ADAMTS7, PDGFD, ABO and PPAP2B – point to new mechanisms. While GxG, GxE and epigenetic interactions remain as viable contributors to CAD risk, the path to better understanding of the other component(s) to CAD risk will likely transit through metabolic profiling to identify the compounds that distinguish elevated from nominal risk. Furthermore, research will need to be conducted in model organisms based on these newly discovered genes, perhaps in pig as this is a good model for heart function and disease in human.

Friday, February 18, 2011

10 years with the human genome

This week marks the 10-year anniversary of the publications of a (nearly) completed human genome sequence. Much has been made already of this passage of time, as well as what we can look forward to in the next ten years.

What I thought I would do in this space is share a little personal story on my connection with this achievement. In late spring of 2001, I happened to search the Internet and PubMed for my name because I wanted to check whether any conference presentations or publications from previous laboratories in which I had worked had been released. To my surprise, I found a website in Japan with the title of something like "list of authors" which contained a collection of names of former colleagues from my days in the Genome Sequencing Center at Cold Spring Harbor Laboratory. That seemed strange, and so, investigating a bit, I learned that we were included on the Nature paper describing the human genome - along with some 5000+ other authors (hence the special listing on this website, and no hits in PubMed). Needless to say, that was quite a thrill. I quickly updated my CV to include this landmark publication.

Back in 1997 to 1999, as the publicly funded project to sequence the human genome was ramping up and dollars were dangled in front of genome centers around the USA and the globe, we at CSHL were trying to deposit as much finished sequence into GenBank as possible. Monthly and quarterly totals of base pairs deposited were key to securing grant money. An introduction to all this came within my first two weeks as the Computational Fellow (post-doc) with Dick McCombie when I was told I would be leading the analysis segment of his Genome Sequencing course. I learned the ins and outs of a new computer system and new software tools (I came from a cell biology lab) just in time to teach the students. We worked hard during that 2-week course to sequence a 143-kbp BAC clone containing some critical HIV/AIDS-relevant genes: CCR2, CCR5 and CCR6. You can view the sequence entry I deposited to GenBank here, accession U95626.

From this initial BAC, we worked on many more to try to show that we could put high-quality sequence data together and to get as much sequence finished as possible. Of course, our main funding was to contribute to the Arabidopsis thaliana genome and so the human projects (BACs and cosmid/fosmid clones) took second priority. But we did contribute enough sequence to warrant inclusion on the paper and Dick was kind enough to remember everyone who had passed through his lab during those years.

Wednesday, February 16, 2011

PCSK, cholesterol homeostasis and osteoporosis

Today, I saw a news release on a series of articles concerning the PCSK gene family published by Dr. Nabil Seidah's group at the Institut de recherches cliniques de Montréal. The combined body of work suggests that the PCSK enzymes could influence health from cholesterol homeostasis to osteoporosis.

PCSK stands for proprotein convertase subtilisin/kexin. This means that it enzymatically converts a larger proprotein into a smaller functional entity. PCSK9 is certainly the most well publicized member of this family with much known about genetic variants associating with myocardial infarction, heart disease and plasma lipid levels, particularly LDL-cholesterol. PCSK9 interacts with the LDL-cholesterol receptor.

PCSK9 also shows decreased expression in a circadian rhythmic fashion in mouse liver depleted for Mir122. This comes from a report by Gatfield, Schibler, et al. 2009 Genes Dev. 23:1313-26.

Here are some other interesting bits about members of the PCSK gene family.

PCSK2 - homolog of nematode gene C51E3.7 which is involved in determination of adult lifespan. SNPs in PCSK2 may increase susceptibility to myocardial infarction and type 2 diabetes, which are both age-related afflictions. A QTL for HDL has been mapped to the vicinity of Pcsk2 in mouse: Hdlq19.

Interestingly, some of my own work on literature mining with the Biomax BioLT tool indicated that both PCSK7 and PCSK1N have relationships with HDL-cholesterol. A QTL for HDL at the PCSK7 locus has been described.

Heterozygous knock-out mice for Pcsk1 show increased adipose mass. Transgenic expression in mice of Pcsk1n driven by an actin promoter yielded adult-onset obesity. This gene, in human, was recently proposed as a candidate obesity/type 2 diabetes (T2DM) gene by Chang Hsu (2011 Diabetes, in press) but did not pass their test for Fst measures of positive selection.

An interesting paper by Tiffin, Hide, et al. (2006) suggested that PCSK2 and PCSK7 are candidate obesity and T2DM genes.

Certainly interesting phenotypes here. Keep your eyes on these genes.

Thursday, February 10, 2011

Transcription factor databases

The following is a guest-post by my colleague Jacqueline Lane (with some editing by me). She has been interested in identifying novel transcription factors (TF) involved in obesity and genetic variants in their binding sites as well as in the TF genes themselves.

Jackie has put together a list of TF-gene interaction databases she is willing to share here. There are three types of data:

1) TF-gene interaction
This is a compilation of databases with TF-gene interaction data. This might be of the most interest because it lists many databases. See http://www.pazar.info/. There is also the oRegAnno database, which is easy to view if you click on the tfview link on the right-hand side; see http://www.oreganno.org/oregano/. Lastly, TF-gene binding data can also be found at http://www.tfcat.ca/.

2) TF-TF interactions
This is a database of TF-TF co-activators and co-repressors (TFs that direct transcription of a gene in concert). This helps with determining tissue/temporal specific combinatorial regulation. See http://www.cell.com/retrieve/pii/S0092867410000796.

3) TF co-activators
The TF co-factor database lists proteins that bind to TFs, but not directly to DNA. These protein interactions can give a better picture of the full interaction. Find the data at http://nar.oxfordjournals.org/content/early/2010/10/20/nar.gkq945.full.

Friday, February 4, 2011

A water flea's phenotypic plasticity and HDL-cholesterol in humans

This week marked the announcement of the completion of the genome sequence of the water flea Daphnia pulex. I remember peering through a microscope in my first biology classes amazed at the activity and diversity of structures of these creatures. Now, the 200-megabase genome has been deduced. One of the startling discoveries is that the small D. pulex genome is packed full with more than 30000 genes, far exceeding the number in the human genome. Some 13000 genes were identified in the paper by Colbourne, et al. as paralogs - arising from gene duplication.

Here is part A of figure 1 from the paper illustrating major differences in gene numbers between D. pulex and other animal genomes.

So, why all these paralogous genes? Well, the upshot here is one of likely gene duplication as a means to build an inventory of possibilities for a wide range of phenotypes. This scenario is spelled out rather nicely by Dieter Ebert in an accompanying overview. The water flea is remarkably able to sense its predators in a very precise manner and in turn activate any of a number of genes that direct expression of defense mechanisms. Some of these are structural features such as protective helmets, tail spines and neck teeth. Herein is the water flea's phenotypic plasticity - different environments induce expression of different subsets of the vast genome for the purpose of evading the predator. A gene for each bad guy swimming nearby.

Now, let's consider humans and their environment. In particular, I'd like to offer the example of diet, which for most is high in fat and sugar, and the important blood lipid HDL-cholesterol, so-called "good cholesterol." Regular readers of this blog know that our research expends a good deal of effort in describing gene-environment interactions (GxEs). This is a situation where one allele of a genetic variant like a SNP associates with disease risk only when a given environmental factor passes a certain threshold. We have compiled a series of these GxEs for phenotypes pertinent to metabolic syndrome - phenotypes such as body weight, BMI, blood lipids, blood pressure, glucose and insulin levels, as well as heart disease and type 2 diabetes risk. Those data are available here. If you mine those data, you'll notice that by far there are more GxEs reported in the literature for HDL-cholesterol than any other commonly measured phenotype.

Thus, it seems to me that the water flea has a lot of very similar genes, mostly in paralogous pairs, to cope with slight changes in its environment. Humans do not. Eating a sub-optimal diet will likely drive HDL levels down (unhealthy). There are also age-related, natural declines in HDL. At the same time, there are a number of variants in our genomes that show an environmental sensitivity with respect to HDL - there are many ways to activate a program of increased risk (by lowering HDL levels). And similar cases can be presented for LDL, triglycerides, total cholesterol, blood pressure, waist circumference, body weight, etc. So, while it may take years of indulging in a sub-optimal diet before an adverse event such as diabetes or atherosclerosis is diagnosed, perhaps it is our (relatively) small number of genes, each with a collection of variants, that sets us up for sensitivity to what we put in our mouths. If we can't eat right, then perhaps more genes would be the answer to a better defense against a poor diet.

Wednesday, February 2, 2011

Synonymous SNPs are not so synonymous

Early this week, an excellent paper by Brest, Darfeuille-Michaud, Hofman, et al. in Nature Genetics provided a prime example of going beyond genome-wide association studies (GWAS) to dissect the functional consequences of a genetic variant associated with disease risk. In so doing, the authors provide another case of synonymous SNPs not being so synonymous.

Here are what I find to be the key points of the research presented in this report:

1. The exonic SNP c.313C>T (rs10065172) is in perfect linkage disequilibrium (r²=1.0) with a deletion polymorphism of 20 kbp mapping upstream of the IRGM gene. This deletion has been strongly associated with Crohn's disease in several European populations or those with European ancestry. What is important here is that a SNP can act as a tag or proxy for the deletion.

2. The c.313C>T variant alters codon 105 of the IRGM protein from CTG>TTG. Both codons call for leucine upon translation and so this SNP is classified as synonymous. The authors speculate that there could be allele-specific consequences to protein expression. Based on two other reports from other groups, the authors decided to investigate whether allele-specific interactions between the IRGM transcript and a microRNA could be at play here. They observed a predicted binding between the IRGM transcript and microRNA-196 (miR-196; both miR-196A, encoded by the A1 and A2 genes, and miR-196B) that was affected by the variation at SNP c.313C>T. Importantly, they show that not only is the miR-196-IRGM interaction real but that expression of miR-196 is elevated in inflammatory epithelia from Crohn's sufferers. These results underscore the point that synonymous SNPs are not so synonymous. The different alleles can exhibit different functions that have health consequences.

3. From GWAS to function. Although this paper does not report original results from GWAS, it builds on those results in an important way. There are four key papers reporting GWAS results for IRGM and Crohn's disease. These papers are by Parkes et al (2007), the Wellcome Trust Case Control Consortium (2007), Barrett et al (2008) and Franke et al (2010). So, in just over three years from the initial discovery of association of this once rather unremarkable gene (only 5 papers were published on IRGM prior to the initial GWAS report of 2007, most reporting a role in autophagy), we now have a much deeper understanding of how a synonymous variant leads to the disease condition.
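The codon arithmetic in point 2 is easy to check by hand or with a few lines of code. This sketch assumes standard coding-sequence numbering, where position 1 is the A of the start codon, and uses the six leucine codons of the standard genetic code. The function names are my own.

```python
# The six codons that encode leucine in the standard genetic code.
LEUCINE_CODONS = {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}

def codon_position(cds_position):
    """Return (codon number, base within codon), both 1-based, for a 1-based coding position."""
    if cds_position < 1:
        raise ValueError("coding positions start at 1")
    index = cds_position - 1
    return index // 3 + 1, index % 3 + 1

def is_synonymous_leucine(ref_codon, alt_codon):
    """True if both codons encode leucine, i.e. the change is synonymous."""
    return ref_codon in LEUCINE_CODONS and alt_codon in LEUCINE_CODONS

print(codon_position(313))                   # (105, 1)
print(is_synonymous_leucine("CTG", "TTG"))   # True
```

Position 313 is the first base of codon 105, which agrees with a C>T change turning CTG into TTG while still coding for leucine.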

This is very nice work indeed and can be held up as an example of the success of GWAS in laying a foundation for getting at the mechanism of a disease.
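As a footnote to point 1, here is what "perfect linkage disequilibrium (r²=1.0)" means numerically. This is a minimal sketch using the standard r² definition computed from haplotype counts. The counts below are invented for illustration and are not from the paper.

```python
def ld_r_squared(n_ab, n_aB, n_Ab, n_AB):
    """r^2 between two biallelic loci, from counts of the four haplotypes.

    First letter is the allele at locus 1 (a or A), second letter the
    allele at locus 2 (b or B).
    """
    total = n_ab + n_aB + n_Ab + n_AB
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_a = (n_ab + n_aB) / total          # frequency of allele a at locus 1
    p_b = (n_ab + n_Ab) / total          # frequency of allele b at locus 2
    d = n_ab / total - p_a * p_b         # disequilibrium coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denominator

# Hypothetical counts: allele a always travels with b, A always with B.
print(round(ld_r_squared(30, 0, 0, 70), 6))
# Hypothetical counts with all four haplotypes equally common (no association).
print(round(ld_r_squared(25, 25, 25, 25), 6))
```

With only two of the four possible haplotypes present, knowing the allele at one locus tells you the allele at the other, which is why the SNP can stand in for the deletion.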

Friday, January 21, 2011

One size does not fit all

On January 11th of this year, 23andMe, one of several companies offering direct-to-consumer genotyping (or genetic testing), put out a press release entitled, "23andMe Presents Top Ten Most Interesting Genetic Findings of 2010."

I found number 5 on that list to be quite appealing. It reads, in part:

One size doesn’t fit all — personalizing treatment

The old adage, “take two aspirin and call me in the morning,” doesn’t work as well as we might think. It turns out that one size doesn’t fit all when it comes to drug response, and for some people, certain drugs might be more effective, not work at all, or even produce serious side effects. The growing body of pharmacogenomics research has helped us understand that, at least in part, genetics play a role in how well some drugs work for different people. The 23andMe Drug Response reports link customers’ genetics to the way they might respond to certain drugs and medications. The results range from whether you’re likely to benefit from a drug, need a different dose due to sensitivity, experience toxic or adverse effects, or even have increased risk for other conditions. 23andMe cautions that its Drug Response reports should not be used to independently establish, abolish, or adjust medical treatment and medications but should be discussed with your physician. Only a medical professional can determine whether a particular drug or dose is appropriate for you.

The piece goes on to describe, briefly, two genes, CYP2C9 and VKORC1 and the role of variants of these genes in warfarin dosing.

OK, so this is all neat but really only represents the tip of the tip of the iceberg. There are many more examples of one size not fitting all and reaching far beyond pharmaceuticals. We and many others have reported on many such interactions between certain genetic variants and diet which affect disease risk. On this blog, I have listed some examples pertaining to HDL-cholesterol. And those variants that show interactions with physical activity as modifiers of disease risk are really interesting. And not to forget other environmental factors or lifestyle choices of sleep, latitude and altitude of residence (how much seasonality you experience, oxygen tension), use of alcohol, use of tobacco, and so forth.

For scientists, medical professionals and the general public alike, clearly a greater understanding of elements at the basis of "one size does not fit all" would be welcome. Stay tuned, it's happening - more of these elements, the gene-environment interactors, are being described and collated into databases.