Mapping genotype-phenotype associations of point mutations in coiled-coil domains of the human proteome (#427)
Genome-wide association studies (GWAS) have led to exponential growth in the identification of novel loci associated with complex diseases. However, our ability to interpret these data has not kept pace with their generation. With the advent of exome sequencing, the need to identify pathogenic mutations at the amino acid level has become urgent. While several tools have been developed to distinguish pathogenic mutations from harmless genetic variation, many of these knowledge-based systems are predicated on data from Mendelian diseases, which may not translate well to the interpretation of disease-associated variants in complex diseases. In this work, we instead adopted a physicochemical bioinformatic approach to distinguish harmless polymorphisms from those likely to be pertinent to the phenotype in question. To interpret the effect of these substitutions on protein function, we assume that the context of the mutation (such as the functional domain in which it occurs) is an important factor in determining the phenotype. We therefore studied the genotype-phenotype relationship based on patterns of mutations and variations observed in a common structural unit, the coiled-coil domain. Coiled coils are encoded in over 2,038 genes of the human genome and account for up to 10% of expressed protein sequences. With a simple architecture of alpha helices built from repeats of seven residues (heptad positions a to g), coiled coils perform versatile functions as structural elements, spacers, oligomerisation subunits and molecular recognition systems in expressed proteins. Residues of the major register (a, d, e and g) are important for the stability and specificity of the coiled-coil structure, whereas minor-register residues (b, c and f) support the architecture indirectly. Our results show that disease-associated mutations are concentrated at the minor-register residues of the structure.
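
The heptad bookkeeping described above can be illustrated with a minimal sketch. It assumes the register of the first residue of a coiled-coil segment is already known (in practice it would come from a coiled-coil prediction tool or a solved structure), and it uses the major (a, d, e, g) and minor (b, c, f) grouping exactly as stated in the abstract. The function names and the example indices are hypothetical and chosen for illustration only; they are not the authors' pipeline or data.

```python
from collections import Counter

HEPTAD = "abcdefg"
MAJOR = set("adeg")  # major register, as grouped in the abstract
MINOR = set("bcf")   # minor register, as grouped in the abstract


def heptad_position(index, start_register="a"):
    """Heptad letter of the residue at 0-based `index` within a segment.

    `start_register` is the (assumed known) heptad letter of residue 0.
    """
    offset = HEPTAD.index(start_register)
    return HEPTAD[(offset + index) % 7]


def register_class(position):
    """Classify a heptad letter as 'major' or 'minor'."""
    return "major" if position in MAJOR else "minor"


def tally_mutations(mutation_indices, start_register="a"):
    """Count mutated residues falling in major vs. minor register positions."""
    counts = Counter()
    for i in mutation_indices:
        counts[register_class(heptad_position(i, start_register))] += 1
    return counts


# Made-up residue indices within one segment, purely illustrative.
example = tally_mutations([1, 2, 5, 7, 9], start_register="a")
print(example)
```

For these made-up indices, positions 1, 2, 5 and 9 map to b, c, f and c (minor) and position 7 maps to a (major). Comparing such tallies between disease-associated mutations and harmless polymorphisms is one simple way to express the kind of enrichment the abstract reports.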