Height is a highly heritable trait that is easily visible within family trees. For instance, tall parents tend to have tall children. You might think this would make it relatively easy to identify the genes and non-coding regions of the genome involved in human height: you could compare the genomes of tall families with those of short families and look for differences. However, height is a complex genetic trait. This means that more than a handful of genomic variants are responsible (for height, hundreds of variants, if not thousands), and how those variants are combined within each person defines the outcome (short, average or tall). Because so many variants are involved, each having only a minor effect on the trait in isolation, identifying the genetics of human height has been a topic of research for decades. And despite all those years of study, geneticists can still only pinpoint a small proportion of the variants responsible. In a recent study published in Nature, a huge international team of scientists worked together using new tools and techniques to identify new genetic variants that may help explain the genetics of human height.
Before this publication, genome-wide association studies (GWAS) had identified 697 height-associated variants in the human genome, but those explained only about 20% of the heritability of height. As is common for complex genetic traits and diseases, most of these variants are non-coding (they lie in genomic regions that do not make proteins), which complicates the identification of genes and pathways that influence height. Also typical of complex traits and diseases, most of the height-associated variants discovered thus far were common, meaning that each occurs in at least 5% of the people studied in a given GWAS cohort. Whether low-frequency and rare variants could also influence complex genetic traits was still up for debate. And because common variants could not explain the bulk of the heritability of height, it was clear that new techniques, new analyses and larger sample sizes would be needed to uncover rare variants that contribute to height.
The authors used ExomeChip to tackle this problem. ExomeChip is a genotyping array designed to query variants in protein-coding DNA sequences, based on variants identified in approximately 12,000 participants. They tested the association between 241,453 variants and adult height variation in 711,428 individuals. The large sample size gave them the statistical power to pinpoint rare variants while guarding against false discoveries. Their main goals were to determine whether rare and low-frequency coding variants influence human adult height, and to discover and characterize new genes and biological pathways implicated in human growth.
They found 606 independent, significant ExomeChip variants, of which 252 were non-synonymous (the variant alters the amino acid coded for in the gene, or results in an early stop codon) or splice-site (the variant disrupts how the gene's RNA is spliced, so the resulting protein can differ significantly in size and may be non-functional) variants. Of those 252, 83 were rare or low-frequency. These 83 height variants are the largest set of validated rare and low-frequency coding variants associated with any complex human trait or disease to date.
So what is the value of connecting rare variants to traits? Models based on theoretical and empirical evidence suggest that variants with strong phenotypic (measurable trait) effects, such as making a person significantly taller than average, are more likely to be deleterious, and therefore rarer. Additionally, validating rare variants is another step toward personalized medicine and toward assessing disease risk in the people who harbor them, as several height-associated genetic pathways are also linked to diseases such as osteoporosis.
The results from this study corroborate the idea that variants with large effect sizes are rarer. The largest effect sizes were observed for four rare missense variants, which alter the amino acid coded for in the gene. One of the missense variants was located in the gene STC2, which encodes stanniocalcin-2, a calcium-binding protein active in a long list of physiological processes but not previously associated with height. Carriers of the rare STC2 missense variant are approximately 2.1 cm taller than non-carriers; the other three missense variants were each associated with a loss of about 2 cm in height.
In a beautiful mouse validation study, the authors determined that over-expression of STC2 protein diminishes growth in mice. Because they were able to use a model system to study the phenotype, they could also validate the downstream pathway involved in the trait. They found that STC2 binds to and inhibits a proteinase, PAPP-A, that cleaves insulin-like growth factor binding protein 4 (IGFBP-4). The altered cascade resulted in reduced levels of bioactive insulin-like growth factors. Although there was no prior genetic evidence implicating STC2 variation in human growth, the PAPPA and IGFBP4 genes were both identified in previous height GWAS, emphasizing the likely relevance of this pathway in humans.
So while this study did identify and validate new height-associated variants, where does this put us in terms of explaining the genetics of height? Previous GWAS of common variants explained 23.3% of height heritability. When considering all rare, low-frequency and common height-associated variants validated in this study, science can now explain 27.4% of the heritability of height. And that gain of roughly 4 percentage points was earned only after analyzing the genomes of more than 700,000 people. Thus, the complex trait of height will continue to be a great genomic puzzle worth pursuing for many more years.
Sara Cassidy, Ph.D. comes from a long line of hobbits, and is a senior science writer at The Jackson Laboratory for Genomic Medicine.