Computational Biology

News

September 28, 2016

Can Math Crack Cancer's Code?

An essay coauthored by Andrea Califano (Chair, Department of Systems Biology) and Gideon Bosker and published in the Wall Street Journal asks whether quantitative modeling could reveal the keys for turning cancer off. They write:

  • Disappointed with the slow pace of discovery and inclined to look for elegant, universal explanations for nature’s conundrums, many cancer researchers have increasingly been asking: Is there some sort of “Da Vinci Code” for cancer? And can we crack it using mathematics?

    Quantitative modeling has been extremely successful in disciplines as diverse as astronomy, physics, economics and computer science. Can “cancer quants”—scientists applying quantitative analyses to the landscape of cancer biology—find the answers we seek? And, if so, what would the new paradigm look like? 

The essay goes on to describe how computational methods developed in the Califano Lab are being tested in personalized N-of-1 clinical trials to identify essential checkpoints in the molecular regulatory networks that sustain individual patients' tumors, as well as drugs capable of targeting them.

Click here to read the essay. (subscription may be required)

Yufeng Shen
Yufeng Shen's lab is interested in developing better computational methods for identifying rare genetic variants that increase disease risk.

On the surface, birth defects and cancer might not seem to have much in common. For some time, however, scientists have observed increased cancer risk among patients with certain developmental syndromes. One well-known example is seen in children with Noonan syndrome, who have an eightfold increased risk of developing leukemia. Recently, researchers studying the genetics of autism also observed mutations in PTEN, an important tumor suppressor gene. Although such findings have been largely isolated and anecdotal, they raise the tantalizing question of whether cancer and developmental disorders might be fundamentally linked.

According to a paper recently published in the journal Human Mutation, many of these similarities might not be just coincidental, but the result of shared genetic mutations. The study, led by Yufeng Shen, an Assistant Professor in the Columbia University Departments of Systems Biology and Biomedical Informatics, together with Wendy Chung, Kennedy Family Associate Professor of Pediatrics at Columbia University Medical Center, found that cancer-driving genes also make up more than a third of the risk genes for developmental disorders. Moreover, many of these genes appear to function through similar modes of action. The scientists suggest that this could make tumors “natural laboratories” for pinpointing and predicting the damaging effects of rare genetic alterations that cause developmental disorders.

“In comparison with cancer, there are relatively few patients with developmental disorders,” Shen explains. “For geneticists, this makes it hard to identify the risk genes solely based on statistical evidence of mutations from these patients. This study indicates that we should be able to use what we learn from cancer genetics, where much more data are available, to help in the interpretation of genetic data in developmental disorders.”

Master regulators of tumor homeostasis

In this rendering, master regulators of tumor homeostasis (white) integrate upstream genetic and epigenetic events (yellow) and regulate downstream genes (purple) responsible for implementing cancer programs such as proliferation and migration. CaST aims to develop systematic methods for identifying drugs capable of disrupting master regulator activity.

The Columbia University Department of Systems Biology has been named one of four inaugural centers in the National Cancer Institute’s (NCI) new Cancer Systems Biology Consortium. This five-year grant will support the creation of the Center for Cancer Systems Therapeutics (CaST), a collaborative research center that will investigate the general principles and functional mechanisms that enable malignant tumors to grow, evade treatment, induce disease progression, and develop drug resistance. Using this knowledge, the Center aims to identify new cancer treatments that target master regulators of tumor homeostasis.

CaST will build on previous accomplishments in the Department of Systems Biology and its Center for Multiscale Analysis of Genomic and Cellular Networks (MAGNet), which developed several key systems biology methods for characterizing the complex molecular machinery underlying cancer. At the same time, however, the new center constitutes a step forward, as it aims to move beyond a static understanding of cancer biology toward a time-dependent framework that can account for the dynamic, ever-changing nature of the disease. This more nuanced understanding could eventually enable scientists to better predict how individual tumors will change over time and in response to treatment.

Papers

Each year, participants in the ISCB/RECOMB Conference on Regulatory and Systems Genomics select the publications from the past year that they consider to have made the most significant contributions to the field. During the most recent conference, held in Philadelphia on November 15-18, 2015, the top 10 papers were announced. Among those selected were four involving Columbia University Department of Systems Biology investigators.

Factors affecting protein activity
Following gene transcription and translation, a protein can undergo a variety of modifications that affect its activity. By analyzing downstream gene expression patterns in single tumors, VIPER can account for these changes to identify proteins that are critical to cancer cell survival.

In a paper just published in Nature Genetics, the laboratory of Andrea Califano introduces what it describes as the first method capable of analyzing a single tumor biopsy to systematically identify proteins that drive cancerous activity in individual patients. Based on knowledge gained by modeling networks of molecular interactions in the cell, their computational algorithm, called VIPER (Virtual Inference of Protein activity by Enriched Regulon analysis), offers a unique new strategy for understanding how cancer cells survive and for identifying personalized cancer therapeutics.

Developed by Mariano Alvarez as a research scientist in the Califano laboratory, VIPER has become one of the cornerstones of Columbia University’s precision medicine initiative. Its effectiveness in cancer diagnosis and treatment planning is currently being tested in a series of N-of-1 clinical trials, which analyze the unique molecular characteristics of individual patients’ tumors to identify drugs and drug combinations that will be most effective for them. If successful, it could soon become an important component of cancer care at Columbia University Medical Center.

According to Dr. Califano, “VIPER makes it possible to find actionable proteins in 100% of cancer patients, independent of their genetic mutations. It also enables us to track tumors as they progress or relapse to determine the most appropriate therapeutic approach at different points in the evolution of disease. So far, this method is looking extremely promising, and we are excited about its potential benefits in finding novel therapeutic strategies to treat cancer patients.”

Clonal evolution in GBM tumors
The researchers' model of tumor evolution indicates that different clonal lineages branch from a common ancestral cell and then diversify, independently causing aggressive tumor behavior at different stages of disease.

Glioblastoma multiforme (GBM) is the most common and most aggressive type of primary brain tumor in adults. Existing treatments against the disease are very limited in their effectiveness, meaning that in most patients tumors recur within a year. Once GBM returns, no beneficial therapeutics currently exist and prognosis is generally very poor.

To better understand how GBM evades treatment, an international team led by Antonio Iavarone and Raul Rabadan at the Columbia University Center for Topology of Cancer Evolution and Heterogeneity has been studying how the cellular composition of GBM tumors changes over the course of therapy. In a paper just published online by Nature Genetics, they provide the first sketch of the main routes of GBM tumor evolution during treatment, showing that different cellular clones within a tumor become dominant within specific tumor states. The study uncovers important general principles of tumor evolution, novel genetic markers of disease progression, and new potential therapeutic targets.

Andrew Anzalone and Sakellarios Zairis
MD/PhD students Andrew Anzalone and Sakellarios Zairis combined approaches based in chemical biology, synthetic biology, and computational biology to develop a new method for protein engineering.

The ribosome is a reliable machine in the cell, precisely translating the nucleotide code carried by messenger RNAs (mRNAs) into the polypeptide chains that form proteins. But although the ribosome typically reads this code with uncanny accuracy, translation has some unusual quirks. One is a phenomenon called -1 programmed ribosomal frameshifting (-1 PRF), in which the ribosome begins reading an mRNA one nucleotide before it should. This hiccup bumps translation “out of frame,” creating a different sequence of three-nucleotide-long codons. In essence, -1 PRF thus gives a single gene the unexpected ability to code for two completely different proteins.
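As a rough illustration of the frameshift described above, the following Python sketch splits a toy mRNA into codons in the normal reading frame and then again with a one-nucleotide backward slip. The sequence, the function names (`codons`, `frameshift_minus1`), and the slip position are all invented for illustration; this is not the pipeline developed by Anzalone and Zairis.

```python
def codons(seq, start=0):
    """Split an mRNA string into complete three-nucleotide codons from `start`."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


def frameshift_minus1(seq, slip_after):
    """Read `slip_after` codons in frame, then slip back one nucleotide.

    `slip_after` is a hypothetical parameter: the number of codons read
    normally before the ribosome slips (must be at least 1).
    """
    if slip_after < 1:
        raise ValueError("slip_after must be at least 1")
    in_frame = codons(seq[:3 * slip_after])
    # After the slip, reading resumes one nucleotide earlier than expected.
    shifted = codons(seq, start=3 * slip_after - 1)
    return in_frame + shifted


# Toy sequence, assumed purely for illustration.
mrna = "AUGUUUAAACGGCUA"

print(codons(mrna))                # ['AUG', 'UUU', 'AAA', 'CGG', 'CUA']
print(frameshift_minus1(mrna, 3))  # ['AUG', 'UUU', 'AAA', 'ACG', 'GCU']
```

After the slip, the last nucleotide of the third codon is read a second time, so every downstream codon differs from the in-frame reading. This is how one mRNA can encode two different polypeptides.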

Recently Andrew Anzalone, an MD/PhD student in the laboratory of Virginia Cornish, set out to explore whether he could take advantage of -1 PRF to engineer cells capable of producing alternate proteins. Together with Sakellarios Zairis, another MD/PhD student in the Columbia University Department of Systems Biology, he developed a pipeline for identifying RNA motifs capable of producing this effect, as well as a method for rationally designing -1 PRF “switches.” These switches, made up of carefully tuned strands of RNA bound to ligand-sensing aptamers, can react to the presence of a specific small molecule and reliably modulate the ratio in which two distinct proteins are produced from a single mRNA. The technology, they anticipate, could offer a variety of exciting new applications for synthetic biology. A paper describing their approach and findings has been published in Nature Methods.

cQTLs modify TF binding

Cofactors work with transcription factors (TFs) to enable efficient transcription of a TF's target gene. The Bussemaker Lab showed that genetic alterations in the cofactor gene (cQTLs) change the nature of this interaction, affecting the connectivity between the TF and its target gene. This, combined with other factors called aQTLs that affect the availability of the TF in the nucleus, can lead to downstream changes in gene expression.

When different people receive the same drug, they often respond to it in different ways — what is highly effective in one patient can often have no benefit or even cause dangerous side effects in another. From the perspective of systems biology, this is because variants in a person’s genetic code lead to differences in the networks of genes, RNA, transcription factors (TFs), and other proteins that implement the drug’s effects inside the cell. These multilayered networks are much too complex to observe directly, and so systems biologists have been developing computational methods to infer how subtle differences in the genome sequence produce these effects. Ultimately, the hope is that this knowledge could improve scientists’ ability to identify drugs that would be most effective in specific patients, an approach called precision medicine.

In a paper published in the Proceedings of the National Academy of Sciences, a team of Columbia University researchers led by Harmen Bussemaker proposes a novel approach for discovering some critical components of this molecular machinery. Using statistical methods to analyze biological data in a new way, the researchers identified genetic alterations they call connectivity quantitative trait loci (cQTLs), a class of variants in transcription cofactors that affect the connections between specific TFs and their gene targets.

Staphylococcus epidermidis
Interactions between human cells and the bacteria that inhabit our bodies can affect health. Here, Staphylococcus epidermidis binds to nasal epithelial cells. (Image courtesy of Sheetal Trivedi and Sean Sullivan.)

Launched in 2014 by investigators in the Mailman School of Public Health, the CUMC Microbiome Working Group brings together basic, clinical, and population scientists interested in understanding how the human microbiome (the ecosystems of bacteria that inhabit and interact with our tissues and organs) affects our health. Computational biologists in the Department of Systems Biology have become increasingly involved in this interdepartmental community, contributing expertise in analytical approaches that help make sense of the large data sets that microbiome studies generate.

Nicholas Tatonetti
Nicholas Tatonetti is an assistant professor in the Department of Biomedical Informatics and Department of Systems Biology.

A team of Columbia University Medical Center (CUMC) scientists led by Nicholas Tatonetti has identified several drug combinations that may lead to a potentially fatal type of heart arrhythmia known as torsades de pointes (TdP). The key to the discovery was a new bioinformatics pipeline called DIPULSE (Drug Interaction Prediction Using Latent Signals and EHRs), which builds on previous methods Tatonetti developed for identifying drug-drug interactions (DDIs) in observational data sets. The results are reported in a new paper in the journal Drug Safety and are covered in a detailed multimedia feature published by the Chicago Tribune.

The algorithm mined data contained in the US FDA Adverse Event Reporting System (FAERS) to identify latent signals of DDIs that cause QT interval prolongation, a disturbance in the electrical cycle that coordinates the heartbeat. It then validated these predictions by looking for their signatures in electrocardiogram results contained in a large collection of electronic health records at Columbia. Interestingly, the drugs the investigators identified do not cause the condition on their own, but only when taken in specific combinations.

Previously, no reliable methods existed for identifying these kinds of combinations. Although the findings are preliminary, the retrospective confirmation of many of DIPULSE’s predictions in actual patient data suggests its effectiveness, and the investigators plan to test them experimentally in the near future.

Yaniv Erlich
Yaniv Erlich. Photo: Jared Leeds.

A new article published online in Nature Genetics reports that short tandem repeats, a class of genetic alterations in which short motifs of nucleotide base pairs occur multiple times in a row, play a role in modulating gene expression. Leading the study was Yaniv Erlich, an assistant professor in the Columbia University Department of Computer Science and core member of the New York Genome Center who recently joined the Center for Computational Biology and Bioinformatics.

As an article in Columbia Engineering explains, the findings reveal a new class of genome regulation.

The Department of Systems Biology and Center for Computational Biology and Bioinformatics are pleased to announce that three Columbia University faculty members have recently joined our community. Kam Leong, the Samuel Y. Sheng Professor of Biomedical Engineering at Columbia University, is now an interdisciplinary faculty member in the Department of Systems Biology. In addition, Yaniv Erlich and Guy Sella are now members of the Center for Computational Biology and Bioinformatics (C2B2). Their addition to the Department and to C2B2 will bring new expertise that will benefit our research and education activities, incorporating perspectives from fields such as nanotechnology, bioinformatics, and evolutionary genomics.

Deep sequencing class

A new team-taught course covers both the experimental and analytical basics of next-generation sequencing. Assistant Professor Chaolin Zhang led the discussion in a recent class. Photo: Lynn Saville.

As the cost of next-generation sequencing has fallen, it has become a ubiquitous and indispensable tool for research across the biomedical sciences. DNA and RNA sequencing — along with other technologies for profiling phenomena such as de novo mutations, protein-nucleic acid interactions, chromatin accessibility, ribosome activity, and microRNA abundance — now make it possible to observe multiple layers of cellular function on a genome-wide scale.

Regardless of a biologist’s chosen area of investigation, such methods have made it possible to explore many exciting new kinds of problems. At the same time, however, they have also dramatically transformed the expertise that young scientists need to develop in order to participate in cutting-edge biological research. Bringing students up to speed with the pace of change in next-generation sequencing has posed a particular challenge for educators.

Now, a new multidisciplinary, graduate-level course organized by the Columbia University Department of Systems Biology is enabling young investigators to begin incorporating these powerful new tools into their studies and future research. Designed by assistant professors Yufeng Shen, Peter Sims, and Chaolin Zhang, the course covers both the experimental principles of next-generation sequencing and key statistical methods for analyzing the enormous datasets that such technologies produce. In this way, it gives students a strong grounding in principles that are critical for more advanced graduate courses as well as the ability to begin applying deep sequencing technologies to investigate the questions they are interested in pursuing.

As Dr. Sims explains, “Whether you are a graduate student in systems biology, biochemistry, or microbiology, the chance that you are going to be doing next-generation sequencing is pretty high. At the same time, it’s completely not taught at the undergraduate level. There is no textbook, nor is there any time in a typical undergraduate biology curriculum to get into this in any kind of detail. Even at top-tier universities students come into graduate school without having any experience with it, and often they’re expected to jump right into this kind of research. We decided that this was a problem we had to fix.”

DeMAND graphical abstract
By analyzing drug-induced changes in disease-specific patterns of gene expression, a new algorithm called DeMAND identifies the genes involved in implementing a drug's effects. The method could help predict undesirable off-target interactions, suggest ways of regulating a drug's activity, and identify novel therapeutic uses for FDA-approved drugs, three critical challenges in drug development.

Researchers in the Columbia University Department of Systems Biology have developed an efficient and accurate method for determining a drug’s mechanism of action — the cellular machinery through which it produces its pharmacological effect. Considering that most drugs, including widely used ones, act in ways that are not completely understood at the molecular level, this accomplishment addresses a key challenge to drug development. The new approach also holds great potential for improving drugs’ effectiveness, identifying better combination therapies, and avoiding dangerous drug-induced side effects.

According to Andrea Califano, the Clyde and Helen Wu Professor of Chemical Systems Biology and co-senior author on the study, “This new methodology makes it possible for the first time to generate a genome-wide footprint of the proteins that are responsible for implementing or modulating the activity of a drug. The accuracy of the method has been the most surprising result, with up to 80% of the identified proteins confirmed by experimental assays.”

Alex Lachmann
Alex Lachmann during his presentation to the RNA-Seq "boot camp."

In June 2015, the Columbia University Department of Systems Biology held a five-part lecture series focusing on advanced applications of RNA-Seq in biological research. The talks covered topics such as the use of RNA-Seq for studying heterogeneity among single cells, RNA-Seq experimental design, statistical approaches for analyzing RNA-Seq data, and the utilization of RNA-Seq for the prediction of molecular interaction networks. The speakers and organizers have compiled a list of lecture notes and study materials for those wishing to learn more. Click on the links below for more information.

PhenoGraph

PhenoGraph, a new algorithm developed in Dana Pe'er's laboratory, proved capable of accurately identifying AML stem cells, reducing high-dimensional single cell mass cytometry data to an interpretable two-dimensional graph. Image courtesy of Dana Pe'er.

A key problem that has emerged from recent cancer research has been how to deal with the enormous heterogeneity found among the millions of cells that make up an individual tumor. Scientists now know that not all tumor cells are the same, even within an individual, and that these cells diversify into subpopulations, each of which has unique properties, or phenotypes. Of particular interest are cancer stem cells, which are typically resistant to existing cancer therapies and lead to relapse and recurrence of cancer following treatment. Finding better ways to distinguish cancer stem cells from other subpopulations of cancer cells, and to characterize them, has therefore become an important goal, for once these cells are identified, their vulnerabilities could be studied with the aim of developing better, long-lasting cancer therapies.

In a paper just published online in Cell, investigators in the laboratories of Columbia University’s Dana Pe’er and Stanford University’s Garry Nolan describe a new method that takes an important step toward addressing this challenge. As Dr. Pe’er explains, “Biology has come to a point where we suddenly realize there are many more cell types than we ever imagined possible. In this paper, we have created an algorithm that can very robustly identify such subpopulations in a completely automatic and unsupervised way, based purely on high-dimensional single-cell data. This new method makes it possible to discover many new cell subpopulations that we have never seen before.”

Monthly disease risk

Columbia scientists used electronic records of 1.7 million New York City patients to map the statistical relationship between birth month and disease incidence. Image courtesy of Nicholas Tatonetti.

Columbia University Medical Center reports on a new study in the Journal of the American Medical Informatics Association led by Nicholas Tatonetti, an assistant professor in the Department of Biomedical Informatics and the Department of Systems Biology.

Columbia University scientists have developed a computational method to investigate the relationship between birth month and disease risk. The researchers used this algorithm to examine New York City medical databases and found 55 diseases that correlated with the season of birth. Overall, the study indicated people born in May had the lowest disease risk, and those born in October the highest. The study was published this week in the Journal of the American Medical Informatics Association.

“This data could help scientists uncover new disease risk factors,” said study senior author Nicholas Tatonetti, PhD, an assistant professor of biomedical informatics at Columbia University Medical Center (CUMC) and Columbia’s Data Science Institute. The researchers plan to replicate their study with data from several other locations in the U.S. and abroad to see how results vary with the change of seasons and environmental factors in those places. By identifying what’s causing disease disparities by birth month, the researchers hope to figure out how they might close the gap.
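To give a sense of what testing for a birth-month effect can look like, here is a minimal Python sketch of a chi-square goodness-of-fit test that compares the birth months of patients with a given disease against the birth months of all patients. This is a generic textbook test, not the algorithm used in the study; the function names and all counts are made up for illustration. A real analysis of many diseases would also need a multiple-testing correction.

```python
# Chi-square critical value for 11 degrees of freedom at alpha = 0.05.
CHI2_CRIT_DF11 = 19.675


def birth_month_chi2(disease_counts, baseline_counts):
    """Chi-square statistic for 12 monthly counts against a baseline.

    Expected counts are the disease total scaled by each month's share of
    the baseline population. Assumes every baseline count is non-zero.
    """
    if len(disease_counts) != 12 or len(baseline_counts) != 12:
        raise ValueError("expected 12 monthly counts")
    n_disease = sum(disease_counts)
    n_baseline = sum(baseline_counts)
    stat = 0.0
    for observed, base in zip(disease_counts, baseline_counts):
        expected = n_disease * base / n_baseline
        stat += (observed - expected) ** 2 / expected
    return stat


def is_seasonal(disease_counts, baseline_counts):
    """True if the statistic exceeds the 5% critical value for 11 df."""
    return birth_month_chi2(disease_counts, baseline_counts) > CHI2_CRIT_DF11


# Hypothetical counts (January..December), invented for this example.
baseline = [1000] * 12
flat = [50] * 12
skewed = [80, 70, 60, 50, 40, 30, 30, 40, 50, 60, 70, 80]

print(is_seasonal(flat, baseline))    # False
print(is_seasonal(skewed, baseline))  # True
```

Comparing against the birth months of the whole patient population, rather than assuming births are spread evenly over the year, keeps ordinary seasonality in birth rates from being mistaken for a disease association.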

Reposted from the Columbia University Medical Center Newsroom. Find the original article here.

Cancer bottlenecks
In an N-of-1 study, researchers at Columbia University use techniques from systems biology to analyze genomic information from an individual patient’s tumor. The goal is to identify key genes, called master regulators (green circles), which, while not mutated, are nonetheless necessary for the survival of cancer cells.

Columbia University Medical Center (CUMC) researchers are developing a new approach to cancer clinical trials, in which therapies are designed and tested one patient at a time. The patient’s tumor is “reverse engineered” to determine its unique genetic characteristics and to identify existing U.S. Food and Drug Administration (FDA)-approved drugs that may target them.

Rather than focusing on the usual mutated genes, only a very small number of which can be used to guide successful therapeutic strategies, the method analyzes the regulatory logic of the cell to identify genes and gene pairs that are critical for the survival of the tumor but are not critical for normal cells. FDA-approved drugs that inhibit these genes are then tested in a mouse model of the patient’s tumor and, if successful, considered as potential therapeutic agents for the patient — a journey from bedside to bench and back again that takes about six to nine months.

“We are taking a rather different approach to tailor therapy to the individual cancer patient,” said principal investigator Andrea Califano, PhD, Clyde and Helen Wu Professor of Chemical Systems Biology and chair of CUMC’s new Department of Systems Biology. “If we have learned one thing about this disease, it’s that it has tremendous heterogeneity both across patients and within individual patients. When we expect different patients with the same tumor subtype or different cells within the same tumor to respond the same way to a treatment, we make a huge simplification. Yet this is how clinical studies are currently conducted. To address this problem, we are trying to understand how tumors are regulated one at a time. Eventually, we hope to be able to treat patients not on an individual basis, but based on common vulnerabilities of the cancer cellular machinery, of which genetic mutations are only indirect evidence. Genetic alterations are clearly responsible for tumorigenesis but control points in molecular networks may be better therapeutic targets.”

Topology of cancer

The Columbia University Center for Topology of Cancer Evolution and Heterogeneity will combine mathematical approaches from topological data analysis with new single-cell experimental technologies to study cellular diversity in solid tumors. Image courtesy of Raul Rabadan.

The National Cancer Institute’s Physical Sciences in Oncology program has announced the creation of a new center for research and education based at Columbia University. The Center for Topology of Cancer Evolution and Heterogeneity will develop and utilize innovative mathematical and experimental techniques to explore how genetic diversity emerges in the cells that make up solid tumors. In this way it will address a key challenge facing cancer research in the age of precision medicine — how to identify the clonal variants within a tumor that are responsible for its growth, spread, and resistance to therapy. Ultimately, the strategies the Center develops could be used to identify more effective biomarkers of disease and new therapeutic strategies.

Tracking clones

After identifying T cell clones that react against donated kidney tissue in vitro, new computational methods developed in Yufeng Shen's Lab are used to track their frequency following organ transplant. The findings can help to predict transplant rejection or tolerance.

When a patient receives a kidney transplant, a battle often ensues. In many cases, the recipient’s immune system identifies the transplanted kidney as a foreign invader and mounts an aggressive T cell response to eliminate it, leading to a variety of destructive side effects. To minimize complications, many transplant recipients receive drugs that suppress the immune response. These have their own consequences, however, as they can lead to increased risk of infections. For these reasons, scientists have been working to gain a better understanding of the biological mechanisms that determine transplant tolerance and rejection. This knowledge could potentially improve physicians’ ability to predict the viability of an organ transplant and to provide the best approach to immunosuppression therapy based on individual patients’ immune system profiles.

Yufeng Shen, an assistant professor in the Columbia University Department of Systems Biology and JP Sulzberger Columbia Genome Center, together with Megan Sykes, director of the Columbia Center for Translational Immunology at the Columbia University College of Physicians and Surgeons, recently took an encouraging step toward this goal. In a paper published in Science Translational Medicine, they report that the deletion of specific donor-reactive T cell clones in the transplant recipient can support tolerance of a new kidney. Critical to this discovery was the development of a new computational genomics approach by the Shen Lab, which makes it possible to track how frequently rare T cell clones develop and how their frequencies change following transplantation. The paper suggests both a general strategy for understanding the causes of transplant rejection and a means of identifying biomarkers for predicting how well a transplant recipient will tolerate a new kidney.
