New Machine Learning Method Predicts Damaging Missense Variants
The premise of genomic medicine is that a person’s genomic characterization can be used to improve medical diagnosis, prognosis, and treatment. Each person, however, has millions of genetic variants, the vast majority of which have negligible impact on their health. How to determine which variants are relevant to a particular condition is a central issue in genomic medicine.
The issue is most pressing in the case of missense variants, which alter a single amino acid in a protein. Only about 20–30 percent of these mutations have a functional impact, so it is highly uncertain whether any given missense variant changes protein function and thereby contributes to a health condition. As a result, most missense variants found in clinical genetic testing are classified as variants of uncertain significance (VUS).
Yufeng Shen, PhD, an associate professor in the Department of Systems Biology and the Department of Biomedical Informatics, and his group have developed a new method for predicting which missense variants are potentially damaging. The method, called gMVP (graphical model for predicting Missense Variant Pathogenicity), uses a graph attention model, one of the latest machine learning techniques, to capture the information relevant to that prediction. Their paper, “Predicting Functional Effect of Missense Variants Using Graph Attention Neural Networks,” was published in Nature Machine Intelligence on November 15, 2022.