Computational Biology

News

Two new precision medicine tests, born out of research from the Califano Lab, that look beyond cancer genes to identify novel therapeutic targets have just received New York State Department of Health approval and are now available to both oncologists and cancer researchers for use at the front lines of patient care. As reported by Columbia University Irving Medical Center (CUIMC), the tests are based on research conducted by CUIMC investigators—and could pave the way for a more precise approach to cancer therapy and help find effective drugs when conventional approaches to precision medicine have failed.

Columbia University to co-host Feb. 7-8 Cancer Genomics and Mathematical Data Analysis Symposium with Cornell University and Memorial Sloan Kettering

A multidisciplinary team of researchers across Columbia University has been busy addressing the complex challenges in basic and translational cancer research. Faculty and investigators are bridging their expertise in fields ranging from mathematics, biology, and engineering to physics, genomics, and chemistry to develop innovative approaches to better understand, for instance, cancer disease progression, drug resistance, and the systems-wide network of tumor evolution.

Central to this ongoing work is research grounded in cancer genomics and mathematical data analysis, which will be explored during a two-day conference Feb. 7-8 co-hosted by the National Cancer Institute (NCI) centers at Columbia University Medical Center, Cornell University, and Memorial Sloan Kettering Cancer Center. The upcoming Cancer Genomics and Mathematical Data Analysis Symposium will be held at the Vivian and Seymour Milstein Family Heart Conference Center; pre-registration is required.

Courtesy of The Olive Lab

Shown here, a human pancreatic tumor stained with Masson's trichrome; Image credit: Dr. Kenneth Olive

The Lustgarten Foundation has awarded Columbia University’s Herbert Irving Comprehensive Cancer Center (HICCC) a three-year grant, as part of its Translational Clinical Program, to test a new precision medicine approach to the treatment of metastatic pancreatic cancer.

“The prevailing model in personalized cancer treatment is to attack the DNA mutations that are believed to be driving an individual patient’s tumor,” says principal investigator Kenneth P. Olive, PhD, assistant professor of medicine and pathology & cell biology at HICCC. “While this approach has been astonishingly effective for a handful of rare cancers, we expect it will only work for a very small fraction of patients with the most common types of cancer.”

Nicholas P. Tatonetti, PhD, has recently been named director of clinical informatics at the Institute for Genomic Medicine (IGM) at Columbia University Medical Center. In this new role, he is charged with planning, organizing, directing and evaluating all clinical informatics efforts across the Institute. In particular, he will focus on the integration of electronic health record data for use in genetics and genomics studies.

Dr. Tatonetti, who is Herbert Irving Assistant Professor of Biomedical Informatics with an interdisciplinary appointment in the Department of Systems Biology, specializes in advancing the application of data science in biology and health science. Researchers in his lab integrate their medical observations with systems and chemical biology models to not only explain drug effects but also further the understanding of basic biology and human disease. They also focus on integrating high-throughput data capture technologies, such as next-generation genome and transcriptome sequencing, metabolomics, and proteomics, with the electronic medical record to study the complex interplay between genetics, environment, and disease.

Broad, Columbia collaborators
Three of the investigators in a new Columbia-Broad Institute research collaboration aimed at gastric and esophageal cancer; L to R: Dr. Andrea Califano, Dr. Cory Johannessen, and Dr. Adam Bass (Johannessen image: Martin Adolfsson; Bass image: Sam Ogden/Dana-Farber Cancer Institute)

A research collaboration underway between Columbia’s Department of Systems Biology, the Broad Institute of MIT and Harvard, and Columbia University Medical Center (CUMC) is working to accelerate the discovery of new cancer drug combinations targeted at gastric and esophageal cancer. These tumors have not yet attracted prominent research focus and attention, and yet the general outcome for patients with these diseases is poor. According to the American Cancer Society, survival rates are only 20% at five years after diagnosis.

The newly formed alliance between research teams at Columbia and the Broad Institute came about thanks to a four-year gift from the Price Family Foundation, known for its philanthropic support of education, health, and biomedical research.

HTS
Research scientist Hai Li holds up a 384-well plate, pictured in front of Columbia Genome Center's Hamilton Star automation system for HTS; Image credit: Systems Biology

Drug screening and analysis are critical in advancing the research and discovery of cancer therapeutics. To this end, a Systems Biology-led team of investigators has recently developed PLATE-Seq, a new technique for low-cost, bulk mRNA sequencing. Coupled with genome-wide regulatory network analysis, the novel PLATE-Seq method advances the goal of providing cancer patients with personalized treatment.

Topological data analysis of cancer samples

Shown here, topological data analysis of cancer samples; Image credit: The Rabadan Lab

The new Program for Mathematical Genomics (PMG) is aiming to address a growing and much-needed area of research. Launched in the fall of 2017 by Raul Rabadan, a theoretical physicist in the Department of Systems Biology, the new program will serve as a research hub at Columbia University where computer scientists, mathematicians, evolutionary biologists and physicists can come together to uncover new quantitative techniques to tackle fundamental biomedical problems.

"Genomic approaches are changing our understanding of many biological processes, including many diseases, such as cancer," said Dr. Rabadan, professor of systems biology and of biomedical informatics. "To uncover the complexity behind genomic data, we need quantitative approaches, including data science techniques, mathematical modeling, statistical techniques, among many others, that can extract meaningful information in a systematic way from large-scale biological systems." 

Regulators of mesenchymal GBM subtype

An example of tumor oncotecture. Transcription factors involved in the activation of mesenchymal glioblastoma subtype are shown in purple. Together, they comprise a tightly knit tumor checkpoint, controlling 74% of the genes in the mesenchymal signature of high-grade glioma. CEBP (both β and δ subunits) and STAT3 regulate the other three transcription factors in the tumour checkpoint, synergistically regulating the state of mesenchymal GBM cells. (Image: Nature Reviews Cancer)

In a detailed Perspective article published in Nature Reviews Cancer, Department of Systems Biology chair Andrea Califano and research scientist Mariano Alvarez (DarwinHealth) summarize more than a decade of work to propose the existence of a universal, tumor-independent “oncotecture” that consistently defines cancer at the molecular level. Their findings, they argue, indicate that identifying and targeting highly conserved, essential proteins called master regulators, rather than the widely diverse genetic and epigenetic alterations that initiate cancer and have been the focus of much cancer research, could offer an effective way to classify and treat disease.

As coverage of the paper in The Economist reports:

ONE of the most important medical insights of recent decades is that cancers are triggered by genetic mutations. Cashing that insight in clinically, to improve treatments, has, however, been hard. A recent study of 2,600 patients at the M.D. Anderson Cancer Centre in Houston, Texas, showed that genetic analysis permitted only 6.4% of those suffering to be paired with a drug aimed specifically at the mutation deemed responsible. The reason is that there are only a few common cancer-triggering mutations, and drugs to deal with them. Other triggering mutations are numerous, but rare—so rare that no treatment is known nor, given the economics of drug discovery, is one likely to be sought. 

Facts such as these have led many cancer biologists to question how useful the gene-led approach to understanding and treating cancer actually is. And some have gone further than mere questioning. One such is Andrea Califano of Columbia University, in New York. He observes that, regardless of the triggering mutation, the pattern of gene expression—and associated protein activity—that sustains a tumour is, for a given type of cancer, almost identical from patient to patient. That insight provides the starting-point for a different approach to looking for targets for drug development. In principle, it should be simpler to interfere with the small number of proteins that direct a cancer cell’s behaviour than with the myriad ways in which that cancer can be triggered in the first place. (Read full article.)

PrePPI inputs
PrePPI predicts the likelihood that two proteins A and B are capable of interacting based on their similarities to other proteins that are known to interact. This requires integrating structural data (green) as well as other kinds of information (blue), such as evidence of protein co-activity in other species as well as involvement in similar cellular functions. PrePPI now offers a searchable database of unprecedented scope, constituting a virtual interactome of all proteins in human cells. (Image courtesy of eLife.) 

The molecular machinery within every living cell includes enormous numbers of components functioning at many different levels. Features like genome sequence, gene expression, proteomic profiles, and chromatin state are all critical in this complex system, but studying a single level is often not enough to explain why cells behave the way they do. For this reason, systems biology strives to integrate different types of data, developing holistic models that more comprehensively describe networks of interactions that give rise to biological traits. 

Although the concept of an interaction network can seem abstract, at its foundation each interaction is a physical event that takes place when two proteins encounter one another, bind, and cause a change that affects a cell’s activity. In order for this to take place, however, they need to have compatible shapes and physical properties. Being able to predict the entire universe of possible pairwise protein-protein interactions could therefore be immensely valuable to systems biology, as it could both offer a framework for interpreting the feasibility of interactions proposed by other methods and potentially reveal unique features of networks that other approaches might miss. 

In a 2012 paper in Nature, scientists in the laboratory of Barry Honig first presented a landmark algorithm and database they call PrePPI (Predicting Protein-Protein Interactions). At the time, PrePPI used a novel computational strategy that deployed concepts from structural biology to predict approximately 300,000 protein-protein interactions, a dramatic increase in the number of available interactions when compared with experimentally generated resources.

Since then, the Honig Lab has been working hard to improve PrePPI’s scope and usefulness. In a paper recently published in eLife, they now report on some impressive developments. With enhancements to their algorithm and the incorporation of several new types of data into its analysis, the PrePPI database now contains more than 1.35 million predictions of protein-protein interactions, covering about 85% of the entire human proteome. This makes it the largest resource of its kind. In parallel with these improvements, the investigators have also begun to apply PrePPI in new ways, using the information it contains to provide new kinds of insights into the organization and function of protein interaction networks.
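To make the idea of integrating structural and non-structural evidence more concrete, here is a minimal sketch of one common way to combine independent lines of evidence: multiplying per-source likelihood ratios and converting the result into a probability. It is an illustration only and is not presented as PrePPI's actual implementation. The function names, evidence labels, prior, and all numeric values are hypothetical.

```python
from math import prod
from typing import Dict


def combined_likelihood_ratio(evidence_lrs: Dict[str, float]) -> float:
    """Multiply per-source likelihood ratios, assuming the sources are independent."""
    return prod(evidence_lrs.values())


def posterior_probability(prior: float, lr: float) -> float:
    """Turn a prior probability and a likelihood ratio into a posterior probability."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * lr
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical evidence for one candidate protein pair (values are made up).
example_lrs = {
    "structural_similarity": 50.0,  # resembles a known interacting pair
    "co_expression": 4.0,           # expressed together across samples
    "shared_function": 5.0,         # annotated with similar cellular functions
}

lr = combined_likelihood_ratio(example_lrs)
p = posterior_probability(prior=0.001, lr=lr)
```

In this toy example no single source is decisive, but together they lift a very unlikely pair to roughly even odds, which is why combining evidence types can reveal interactions that any one data type would miss.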

Cell Types in Autism

By inventing a new computational pipeline called DAMAGES, Chaolin Zhang and Yufeng Shen showed that brain cell types on the left of the plot are more prone to have rare autism risk mutations than cell types at the right. Narrowing the focus to these types of cells also helped to identify a molecular signature of the disorder that involves haploinsufficiency. Figure: Human Mutation.

Autism, a spectrum of neurodevelopmental disorders typically identified during early childhood, is widely thought to be the result of genetic alterations that change how the growing brain is wired. Nevertheless, despite a substantial effort in the field of autism genetics, the specific alterations that place one child at greater risk than another remain elusive. Although the list of alterations associated with autism is growing, it has been difficult to conclusively distinguish those that truly increase disease risk from those that are merely coincident with it. One troubling reason for this is that research so far seems to indicate that specific genetic abnormalities associated with autism risk are extremely rare, with many being found only in single patients. This has made it hard to reproduce findings conclusively.

In a paper recently published in the journal Human Mutation, Department of Systems Biology faculty members Chaolin Zhang and Yufeng Shen describe a method and some new findings that could help to more precisely identify rare autism-driving alterations. A new analytical pipeline they call DAMAGES (Disease Associated Mutation Analysis using Gene Expression Signatures) uses a unique approach to identifying autism risk genes, looking at differences in gene expression among different cell types in the brain in order to focus more specifically on mechanisms that are likely to be relevant for autism. Using this approach, they identified a pronounced molecular signature that is shared by disease risk genes due to haploinsufficiency, a type of genetic alteration that causes a dramatic drop in the expression of a particular protein.

September 28, 2016

Can Math Crack Cancer's Code?

An essay coauthored by Andrea Califano (Chair, Department of Systems Biology) and Gideon Bosker and published in the Wall Street Journal asks whether quantitative modeling could reveal the keys for turning cancer off. They write:

  • Disappointed with the slow pace of discovery and inclined to look for elegant, universal explanations for nature’s conundrums, many cancer researchers have increasingly been asking: Is there some sort of “Da Vinci Code” for cancer? And can we crack it using mathematics?

    Quantitative modeling has been extremely successful in disciplines as diverse as astronomy, physics, economics and computer science. Can “cancer quants”—scientists applying quantitative analyses to the landscape of cancer biology—find the answers we seek? And, if so, what would the new paradigm look like? 

The essay goes on to describe how computational methods developed in the Califano Lab are being tested in personalized N-of-1 clinical trials to identify essential checkpoints in the molecular regulatory networks that sustain individual patients' tumors, as well as drugs capable of targeting them.

Click here to read the essay. (subscription may be required)

Yufeng Shen
Yufeng Shen's lab is interested in developing better computational methods for identifying rare genetic variants that increase disease risk.

On the surface, birth defects and cancer might not seem to have much in common. For some time, however, scientists have observed increased cancer risk among patients with certain developmental syndromes. One well-known example is seen in children with Noonan syndrome, who have an eightfold increased risk of developing leukemia. Recently, researchers studying the genetics of autism also observed mutations in PTEN, an important tumor suppressor gene. Although such findings have been largely isolated and anecdotal, they raise the tantalizing question of whether cancer and developmental disorders might be fundamentally linked.

According to a paper recently published in the journal Human Mutation, many of these similarities might not be just coincidental, but the result of shared genetic mutations. The study, led by Yufeng Shen, an Assistant Professor in the Columbia University Departments of Systems Biology and Biomedical Informatics, together with Wendy Chung, Kennedy Family Associate Professor of Pediatrics at Columbia University Medical Center, found that cancer-driving genes also make up more than a third of the risk genes for developmental disorders. Moreover, many of these genes appear to function through similar modes of action. The scientists suggest that this could make tumors “natural laboratories” for pinpointing and predicting the damaging effects of rare genetic alterations that cause developmental disorders.

“In comparison with cancer, there are relatively few patients with developmental disorders,” Shen explains. “For geneticists, this makes it hard to identify the risk genes solely based on statistical evidence of mutations from these patients. This study indicates that we should be able to use what we learn from cancer genetics, where much more data are available, to help in the interpretation of genetic data in developmental disorders.”

Master regulators of tumor homeostasis

In this rendering, master regulators of tumor homeostasis (white) integrate upstream genetic and epigenetic events (yellow) and regulate downstream genes (purple) responsible for implementing cancer programs such as proliferation and migration. CaST aims to develop systematic methods for identifying drugs capable of disrupting master regulator activity.

The Columbia University Department of Systems Biology has been named one of four inaugural centers in the National Cancer Institute’s (NCI) new Cancer Systems Biology Consortium. This five-year grant will support the creation of the Center for Cancer Systems Therapeutics (CaST), a collaborative research center that will investigate the general principles and functional mechanisms that enable malignant tumors to grow, evade treatment, induce disease progression, and develop drug resistance. Using this knowledge, the Center aims to identify new cancer treatments that target master regulators of tumor homeostasis.

CaST will build on previous accomplishments in the Department of Systems Biology and its Center for Multiscale Analysis of Genomic and Cellular Networks (MAGNet), which developed several key systems biology methods for characterizing the complex molecular machinery underlying cancer. At the same time, however, the new center constitutes a step forward, as it aims to move beyond a static understanding of cancer biology toward a time-dependent framework that can account for the dynamic, ever-changing nature of the disease. This more nuanced understanding could eventually enable scientists to better predict how individual tumors will change over time and in response to treatment.

Papers

Each year, participants in the ISCB/RECOMB Conference on Regulatory and Systems Genomics select publications from the past year that they consider to have made the most significant contributions to the field. During the most recent conference, held in Philadelphia on November 15-18, 2015, the top 10 papers were announced. Among those selected were four involving Columbia University Department of Systems Biology investigators.

Factors affecting protein activity
Following gene transcription and translation, a protein can undergo a variety of modifications that affect its activity. By analyzing downstream gene expression patterns in single tumors, VIPER can account for these changes to identify proteins that are critical to cancer cell survival.

In a paper just published in Nature Genetics, the laboratory of Andrea Califano introduces what it describes as the first method capable of analyzing a single tumor biopsy to systematically identify proteins that drive cancerous activity in individual patients. Based on knowledge gained by modeling networks of molecular interactions in the cell, their computational algorithm, called VIPER (Virtual Inference of Protein activity by Enriched Regulon analysis), offers a unique new strategy for understanding how cancer cells survive and for identifying personalized cancer therapeutics.

Developed by Mariano Alvarez as a research scientist in the Califano laboratory, VIPER has become one of the cornerstones of Columbia University’s precision medicine initiative. Its effectiveness in cancer diagnosis and treatment planning is currently being tested in a series of N-of-1 clinical trials, which analyze the unique molecular characteristics of individual patients’ tumors to identify drugs and drug combinations that will be most effective for them. If successful, it could soon become an important component of cancer care at Columbia University Medical Center.

According to Dr. Califano, “VIPER makes it possible to find actionable proteins in 100% of cancer patients, independent of their genetic mutations. It also enables us to track tumors as they progress or relapse to determine the most appropriate therapeutic approach at different points in the evolution of disease. So far, this method is looking extremely promising, and we are excited about its potential benefits in finding novel therapeutic strategies to treat cancer patients.”
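The idea of inferring a protein's activity from the expression of its downstream genes can be illustrated with a deliberately simplified sketch. The code below is not the VIPER algorithm; it only shows the general principle of scoring a regulator by how consistently its targets move in the expected direction. The function name, gene labels, z-scores, and the scoring formula (sum of signed z-scores divided by the square root of the number of targets) are all assumptions made for illustration.

```python
from math import sqrt
from typing import Dict


def regulator_activity(expression_z: Dict[str, float], regulon: Dict[str, int]) -> float:
    """Score a regulator from the expression z-scores of its target genes.

    regulon maps each target gene to +1 (activated by the regulator) or
    -1 (repressed by it). Targets without a measurement are skipped.
    Returns 0.0 when none of the targets were measured.
    """
    signed = [sign * expression_z[gene]
              for gene, sign in regulon.items()
              if gene in expression_z]
    if not signed:
        return 0.0
    # Combine the signed z-scores so that larger regulons are not favored unfairly.
    return sum(signed) / sqrt(len(signed))


# Hypothetical single-tumor expression profile and regulon (made-up values).
tumor_z = {"G1": 2.0, "G2": 1.0, "G3": -3.0, "G4": 0.5}
regulon = {"G1": 1, "G2": 1, "G3": -1, "G5": 1}  # G5 was not measured

score = regulator_activity(tumor_z, regulon)
```

Here the activated targets are up and the repressed target is down, so the regulator receives a high positive score even though its own expression level was never examined.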

Clonal evolution in GBM tumors
The researchers' model of tumor evolution indicates that different clonal lineages branch from a common ancestral cell and then diversify, independently causing aggressive tumor behavior at different stages of disease.

Glioblastoma multiforme (GBM) is the most common and most aggressive type of primary brain tumor in adults. Existing treatments against the disease are very limited in their effectiveness, meaning that in most patients tumors recur within a year. Once GBM returns, no beneficial therapeutics currently exist and prognosis is generally very poor.

To better understand how GBM evades treatment, an international team led by Antonio Iavarone and Raul Rabadan at the Columbia University Center for Topology of Cancer Evolution and Heterogeneity has been studying how the cellular composition of GBM tumors changes over the course of therapy. In a paper just published online by Nature Genetics, they provide the first sketch of the main routes of GBM tumor evolution during treatment, showing that different cellular clones within a tumor become dominant within specific tumor states. The study uncovers important general principles of tumor evolution, novel genetic markers of disease progression, and new potential therapeutic targets.

Andrew Anzalone and Sakellarios Zairis
MD/PhD students Andrew Anzalone and Sakellarios Zairis combined approaches based in chemical biology, synthetic biology, and computational biology to develop a new method for protein engineering.

The ribosome is a reliable machine in the cell, precisely translating the nucleotide code carried by messenger RNAs (mRNAs) into the polypeptide chains that form proteins. But although the ribosome typically reads this code with uncanny accuracy, translation has some unusual quirks. One is a phenomenon called -1 programmed ribosomal frameshifting (-1 PRF), in which the ribosome begins reading an mRNA one nucleotide before it should. This hiccup bumps translation “out of frame,” creating a different sequence of three-nucleotide-long codons. In essence, -1 PRF thus gives a single gene the unexpected ability to code for two completely different proteins.
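The effect of a -1 frameshift on codon reading can be shown with a short sketch. The sequence and the slip position below are made up for illustration, and the function names are hypothetical; the code only demonstrates how backing up one nucleotide changes every downstream codon.

```python
from typing import List


def codons(seq: str, start: int = 0) -> List[str]:
    """Split seq into complete three-nucleotide codons, beginning at index start."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


def read_with_minus_one_shift(seq: str, shift_at: int) -> List[str]:
    """Read codons in the original frame up to shift_at, then slip back one nucleotide.

    shift_at is the index of the codon boundary where the slip occurs,
    so it must be a positive multiple of 3.
    """
    if shift_at <= 0 or shift_at % 3 != 0:
        raise ValueError("shift_at must be a positive multiple of 3")
    before = codons(seq[:shift_at])
    # After the slip, reading resumes one nucleotide earlier than expected,
    # so the nucleotide at shift_at - 1 is read a second time.
    after = codons(seq, shift_at - 1)
    return before + after


mrna = "AUGGCCAAAUUUGGGCCCUAA"  # made-up example sequence

in_frame = codons(mrna)
shifted = read_with_minus_one_shift(mrna, shift_at=9)
```

With this example, the in-frame reading is AUG GCC AAA UUU GGG CCC UAA, while the shifted reading is AUG GCC AAA AUU UGG GCC CUA: the first three codons match, and everything after the slip is different.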

Recently Andrew Anzalone, an MD/PhD student in the laboratory of Virginia Cornish, set out to explore whether he could take advantage of -1 PRF to engineer cells capable of producing alternate proteins. Working with Sakellarios Zairis, another MD/PhD student in the Columbia University Department of Systems Biology, he developed a pipeline for identifying RNA motifs capable of producing this effect, as well as a method for rationally designing -1 PRF “switches.” These switches, made up of carefully tuned strands of RNA bound to ligand-sensing aptamers, can react to the presence of a specific small molecule and reliably modulate the ratio in which two distinct proteins are produced from a single mRNA. The technology, they anticipate, could offer a variety of exciting new applications for synthetic biology. A paper describing their approach and findings has been published in Nature Methods.

cQTLs modify TF binding

Cofactors work with transcription factors (TFs) to enable efficient transcription of a TF's target gene. The Bussemaker Lab showed that genetic alterations in the cofactor gene (cQTLs) change the nature of this interaction, affecting the connectivity between the TF and its target gene. This, combined with other factors called aQTLs that affect the availability of the TF in the nucleus, can lead to downstream changes in gene expression.

When different people receive the same drug, they often respond to it in different ways — what is highly effective in one patient can often have no benefit or even cause dangerous side effects in another. From the perspective of systems biology, this is because variants in a person’s genetic code lead to differences in the networks of genes, RNA, transcription factors (TFs), and other proteins that implement the drug’s effects inside the cell. These multilayered networks are much too complex to observe directly, and so systems biologists have been developing computational methods to infer how subtle differences in the genome sequence produce these effects. Ultimately, the hope is that this knowledge could improve scientists’ ability to identify drugs that would be most effective in specific patients, an approach called precision medicine.

In a paper published in the Proceedings of the National Academy of Sciences, a team of Columbia University researchers led by Harmen Bussemaker proposes a novel approach for discovering some critical components of this molecular machinery. Using statistical methods to analyze biological data in a new way, the researchers identified genetic alterations they call connectivity quantitative trait loci (cQTLs), a class of variants in transcription cofactors that affect the connections between specific TFs and their gene targets.

Staphylococcus epidermidis
Interactions between human cells and the bacteria that inhabit our bodies can affect health. Here, Staphylococcus epidermidis binds to nasal epithelial cells. (Image courtesy of Sheetal Trivedi and Sean Sullivan.)

Launched in 2014 by investigators in the Mailman School of Public Health, the CUMC Microbiome Working Group brings together basic, clinical, and population scientists interested in understanding how the human microbiome (the ecosystems of bacteria that inhabit and interact with our tissues and organs) affects our health. Computational biologists in the Department of Systems Biology have become increasingly involved in this interdepartmental community, contributing expertise in analytical approaches that help make sense of the large data sets that microbiome studies generate.

Nicholas Tatonetti
Nicholas Tatonetti is an assistant professor in the Department of Biomedical Informatics and Department of Systems Biology.

A team of Columbia University Medical Center (CUMC) scientists led by Nicholas Tatonetti has identified several drug combinations that may lead to a potentially fatal type of heart arrhythmia known as torsades de pointes (TdP). The key to the discovery was a new bioinformatics pipeline called DIPULSE (Drug Interaction Prediction Using Latent Signals and EHRs), which builds on previous methods Tatonetti developed for identifying drug-drug interactions (DDIs) in observational data sets. The results are reported in a new paper in the journal Drug Safety and are covered in a detailed multimedia feature published by the Chicago Tribune.

The algorithm mined data contained in the US FDA Adverse Event Reporting System (FAERS) to identify latent signals of DDIs that cause QT interval prolongation, a disturbance in the electrical cycle that coordinates the heartbeat. It then validated these predictions by looking for their signatures in electrocardiogram results contained in a large collection of electronic health records at Columbia. Interestingly, the drugs the investigators identified do not cause the condition on their own, but only when taken in specific combinations.

Previously, no reliable methods existed for identifying these kinds of combinations. Although the findings are preliminary, the retrospective confirmation of many of DIPULSE’s predictions in actual patient data suggests its effectiveness, and the investigators plan to test them experimentally in the near future.
