Mining Patterns in Genomic and Clinical Cancer Data to Characterize Novel Driver Genes
Cancer research, like many areas of science, is adapting to a new era characterized by increasing quantity, quality, and diversity of observational data. One example of these advances, and of the challenges they bring, is The Cancer Genome Atlas, an enormous public effort that has provided genomic profiles of hundreds of tumors for each of the most common solid cancer types. Alongside this resource is a host of other data and knowledge. A current challenge is therefore how best to integrate these data to discover mechanisms of oncogenesis and cancer progression. Ultimately, this could enable genomics-based prediction of an individual patient’s outcome and targeted therapies, a goal termed precision medicine. Here, I develop novel approaches that examine patterns in populations of cancer patients to identify key genetic changes and to suggest the likely roles of these driver genes in the disease.
I treat tumors as the result of random events that must collaborate to endow a cell with all of the invasive and immortal properties of a cancer. Some combinations of events are lethal to a developing tumor, while other combinations are simply not preferentially selected. To discover these complex patterns, I introduce GAMToC, a method based on the joint entropy of a set of genes. Using GAMToC, I identify sets of genes with a strongly non-random joint pattern of co-occurrence and mutual exclusivity. Novel genes with a role in cancer can be highlighted by virtue of their non-random pattern of alteration, and insights into the roles of these novel drivers can come from their most strongly co-selected partners.
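The abstract does not spell out how GAMToC scores a gene set, so the following is only an illustrative sketch of the underlying idea, not the method itself. It assumes each gene is represented as a 0/1 alteration vector across tumors, computes the joint entropy of the gene set, and compares it with a null distribution obtained by shuffling each gene independently. The function names, the toy data, and the permutation test are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def joint_entropy(columns):
    """Shannon entropy (in bits) of the joint alteration pattern.

    columns: list of equal-length 0/1 lists, one list per gene,
    where position i is the alteration status in tumor i.
    """
    patterns = Counter(zip(*columns))  # one tuple of statuses per tumor
    n = sum(patterns.values())
    return -sum((c / n) * math.log2(c / n) for c in patterns.values())


def permutation_p_value(columns, n_perm=1000, seed=0):
    """Estimate how unusual the observed joint entropy is (hypothetical test).

    Each gene is shuffled independently, which keeps its alteration
    frequency but breaks any dependence between genes. The returned
    p-value is the fraction of shuffles whose joint entropy is at most
    the observed value (low entropy means a structured joint pattern).
    """
    rng = random.Random(seed)
    observed = joint_entropy(columns)
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(col, len(col)) for col in columns]
        if joint_entropy(shuffled) <= observed + 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (invented): two genes altered in disjoint halves of 20 tumors,
# i.e. a perfectly mutually exclusive pair.
gene_a = [1] * 10 + [0] * 10
gene_b = [0] * 10 + [1] * 10

observed, p_value = permutation_p_value([gene_a, gene_b])
print(f"joint entropy = {observed:.3f} bits, permutation p = {p_value:.4f}")
```

Because shuffling preserves each gene's own alteration frequency, a joint entropy well below the shuffled values reflects dependence between genes rather than how often each is altered. The same score is low for both co-occurrence and mutual exclusivity, which is why a single entropy-based measure can capture both kinds of pattern.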
Finally, I develop the use of cancer comorbidity, or increased cancer risk, as a novel data source for understanding cancer. The recent availability of clinical records spanning a large percentage of the American population has enabled discovery of many cancer comorbidities. Germline mutations, those present at birth, could predispose some rare populations to increased cancer risk. A Mendelian disease phenotype provides strong insight into the germline genotype of an afflicted individual. Thus, if Mendelian diseases with cancer comorbidity can be shown to have specific defects in processes that are important in the development of that cancer, statistical comorbidity could provide a new resource for prioritizing Mendelian disease genes as novel cancer-related genes. For this purpose, I integrate clinical comorbidity, Mendelian disease causal variants, and somatic genomic profiles of thousands of cancers. I demonstrate that comorbidity is indeed associated with significant genetic similarity between Mendelian diseases and the cancers to which these patients are predisposed, suggesting plausible new candidate cancer genes.