Research News

Andrew Anzalone and Sakellarios Zairis
MD/PhD students Andrew Anzalone and Sakellarios Zairis combined approaches based in chemical biology, synthetic biology, and computational biology to develop a new method for protein engineering.

The ribosome is a reliable machine in the cell, precisely translating the nucleotide code carried by messenger RNAs (mRNAs) into the polypeptide chains that form proteins. But although the ribosome typically reads this code with uncanny accuracy, translation has some unusual quirks. One is a phenomenon called -1 programmed ribosomal frameshifting (-1 PRF), in which the ribosome begins reading an mRNA one nucleotide before it should. This hiccup bumps translation “out of frame,” creating a different sequence of three-nucleotide-long codons. In essence, -1 PRF thus gives a single gene the unexpected ability to code for two completely different proteins.
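
The effect of a -1 shift on the codon sequence can be illustrated with a short sketch. This is only a toy model, not the researchers' pipeline: the function names, the `slip_index` parameter, and the example sequence are hypothetical, and the model simply assumes the ribosome backs up one nucleotide at a chosen position and keeps reading.

```python
def codons(seq, start=0):
    """Split seq into complete three-nucleotide codons, starting at index start."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


def minus_one_frameshift(seq, slip_index):
    """Read in frame up to slip_index, then back up one nucleotide and continue.

    slip_index (hypothetical parameter) is the index where the next in-frame
    codon would have started; it must be a positive multiple of 3.
    """
    if slip_index < 3 or slip_index % 3 != 0:
        raise ValueError("slip_index must be a positive multiple of 3")
    before = codons(seq[:slip_index])      # codons read in the original frame
    after = codons(seq, slip_index - 1)    # codons read after the -1 shift
    return before + after


# Made-up toy sequence, chosen only to make the shift easy to see.
mrna = "AUGAAAUUUGGGCCC"
in_frame = codons(mrna)
shifted = minus_one_frameshift(mrna, 6)
print(in_frame)
print(shifted)
```

In this toy model the nucleotide just before the slip position is read twice, so every codon downstream of the slip differs from the in-frame reading.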

Recently Andrew Anzalone, an MD/PhD student in the laboratory of Virginia Cornish, set out to explore whether he could take advantage of -1 PRF to engineer cells capable of producing alternate proteins. Together with Sakellarios Zairis, another MD/PhD student in the Columbia University Department of Systems Biology, he developed a pipeline for identifying RNA motifs capable of producing this effect, as well as a method for rationally designing -1 PRF “switches.” These switches, made up of carefully tuned strands of RNA bound to ligand-sensing aptamers, can react to the presence of a specific small molecule and reliably modulate the ratio in which two distinct proteins are produced from a single mRNA. The technology, they anticipate, could offer a variety of exciting new applications for synthetic biology. A paper describing their approach and findings has been published in Nature Methods.

cQTLs modify TF binding

Cofactors work with transcription factors (TFs) to enable efficient transcription of a TF's target gene. The Bussemaker Lab showed that genetic alterations in the cofactor gene (cQTLs) change the nature of this interaction, affecting the connectivity between the TF and its target gene. This, combined with other factors called aQTLs that affect the availability of the TF in the nucleus, can lead to downstream changes in gene expression.

When different people receive the same drug, they often respond to it in different ways — what is highly effective in one patient can often have no benefit or even cause dangerous side effects in another. From the perspective of systems biology, this is because variants in a person’s genetic code lead to differences in the networks of genes, RNA, transcription factors (TFs), and other proteins that implement the drug’s effects inside the cell. These multilayered networks are much too complex to observe directly, and so systems biologists have been developing computational methods to infer how subtle differences in the genome sequence produce these effects. Ultimately, the hope is that this knowledge could improve scientists’ ability to identify drugs that would be most effective in specific patients, an approach called precision medicine.

In a paper published in the Proceedings of the National Academy of Sciences, a team of Columbia University researchers led by Harmen Bussemaker proposes a novel approach for discovering some critical components of this molecular machinery. Using statistical methods to analyze biological data in a new way, the researchers identified genetic alterations they call connectivity quantitative trait loci (cQTLs), a class of variants in transcription cofactors that affect the connections between specific TFs and their gene targets.

Nicholas Tatonetti
Nicholas Tatonetti is an assistant professor in the Department of Biomedical Informatics and Department of Systems Biology.

A team of Columbia University Medical Center (CUMC) scientists led by Nicholas Tatonetti has identified several drug combinations that may lead to a potentially fatal type of heart arrhythmia known as torsades de pointes (TdP). The key to the discovery was a new bioinformatics pipeline called DIPULSE (Drug Interaction Prediction Using Latent Signals and EHRs), which builds on previous methods Tatonetti developed for identifying drug-drug interactions (DDIs) in observational data sets. The results are reported in a new paper in the journal Drug Safety and are covered in a detailed multimedia feature published by the Chicago Tribune.

The algorithm mined data contained in the US FDA Adverse Event Reporting System (FAERS) to identify latent signals of DDIs that cause QT interval prolongation, a disturbance in the electrical cycle that coordinates the heartbeat. It then validated these predictions by looking for their signatures in electrocardiogram results contained in a large collection of electronic health records at Columbia. Interestingly, the drugs the investigators identified do not cause the condition on their own, but only when taken in specific combinations.

Previously, no reliable methods existed for identifying these kinds of combinations. Although the findings are preliminary, the retrospective confirmation of many of DIPULSE’s predictions in actual patient data suggests its effectiveness, and the investigators plan to test them experimentally in the near future.

Yaniv Erlich
Yaniv Erlich. Photo: Jared Leeds.

A new article published online in Nature Genetics reports that short tandem repeats, a class of genetic alterations in which short motifs of nucleotide base pairs occur multiple times in a row, play a role in modulating gene expression. Leading the study was Yaniv Erlich, an assistant professor in the Columbia University Department of Computer Science and core member of the New York Genome Center who recently joined the Center for Computational Biology and Bioinformatics.

As an article in Columbia Engineering explains, the findings reveal a new class of genome regulation.

Breast cancer cells

A histological slide of cancerous breast tissue. The pink "riverways" are normal connective tissue while areas stained blue are cancer cells. (Source: National Cancer Institute)

Investigators at Columbia University Medical Center and the Icahn School of Medicine at Mount Sinai have discovered a molecular signaling mechanism that drives a specific type of highly aggressive breast cancer. As reported in a paper in Genes & Development, a team led by Jose Silva and Andrea Califano determined that the gene STAT3 is a master regulator of breast tumors lacking hormone receptors but testing positive for human epidermal growth factor receptor 2 (HR-/HER2+). The researchers also characterized a pathway including IL-6, JAK2, STAT3, and S100A8/9 (genes already known to play important roles within the immune response) as being essential for the survival of HR-/HER2+ cancer cells. Additional tests showed that disrupting this pathway severely limits the ability of these cells to survive.

These findings are particularly exciting because the pathway the researchers identified contains multiple targets for which known FDA-approved drugs exist. The paper reports that when these drugs were tested in disease models, the cancer cells showed a dramatic response, suggesting promising strategies for the treatment of the HR-/HER2+ cancer subtype. A clinical trial is now underway to investigate the effects of these approaches in humans.

DeMAND graphical abstract
By analyzing drug-induced changes in disease-specific patterns of gene expression, a new algorithm called DeMAND identifies the genes involved in implementing a drug's effects. The method could help predict undesirable off-target interactions, suggest ways of regulating a drug's activity, and identify novel therapeutic uses for FDA-approved drugs, three critical challenges in drug development.

Researchers in the Columbia University Department of Systems Biology have developed an efficient and accurate method for determining a drug’s mechanism of action — the cellular machinery through which it produces its pharmacological effect. Considering that most drugs, including widely used ones, act in ways that are not completely understood at the molecular level, this accomplishment addresses a key challenge to drug development. The new approach also holds great potential for improving drugs’ effectiveness, identifying better combination therapies, and avoiding dangerous drug-induced side effects.

According to Andrea Califano, the Clyde and Helen Wu Professor of Chemical Systems Biology and co-senior author on the study, “This new methodology makes it possible for the first time to generate a genome-wide footprint of the proteins that are responsible for implementing or modulating the activity of a drug. The accuracy of the method has been the most surprising result, with up to 80% of the identified proteins confirmed by experimental assays.”

PhenoGraph

PhenoGraph, a new algorithm developed in Dana Pe'er's laboratory, proved capable of accurately identifying AML stem cells, reducing high-dimensional single cell mass cytometry data to an interpretable two-dimensional graph. Image courtesy of Dana Pe'er.

A key problem that has emerged from recent cancer research has been how to deal with the enormous heterogeneity found among the millions of cells that make up an individual tumor. Scientists now know that not all tumor cells are the same, even within an individual, and that these cells diversify into subpopulations, each of which has unique properties, or phenotypes. Of particular interest are cancer stem cells, which are typically resistant to existing cancer therapies and lead to relapse and recurrence of cancer following treatment. Finding better ways to distinguish cancer stem cells from other subpopulations of cancer cells, and to characterize them, has therefore become an important goal, for once these cells are identified, their vulnerabilities could be studied with the aim of developing better, long-lasting cancer therapies.

In a paper just published online in Cell, investigators in the laboratories of Columbia University’s Dana Pe’er and Stanford University’s Garry Nolan describe a new method that takes an important step toward addressing this challenge. As Dr. Pe’er explains, “Biology has come to a point where we suddenly realize there are many more cell types than we ever imagined possible. In this paper, we have created an algorithm that can very robustly identify such subpopulations in a completely automatic and unsupervised way, based purely on high-dimensional single-cell data. This new method makes it possible to discover many new cell subpopulations that we have never seen before.”

Monthly disease risk

Columbia scientists used electronic records of 1.7 million New York City patients to map the statistical relationship between birth month and disease incidence. Image courtesy of Nicholas Tatonetti.

Columbia University Medical Center reports on a new study in the Journal of the American Medical Informatics Association led by Nicholas Tatonetti, also an assistant professor in the Department of Systems Biology.

Columbia University scientists have developed a computational method to investigate the relationship between birth month and disease risk. The researchers used this algorithm to examine New York City medical databases and found 55 diseases that correlated with the season of birth. Overall, the study indicated people born in May had the lowest disease risk, and those born in October the highest. The study was published this week in the Journal of the American Medical Informatics Association.

“This data could help scientists uncover new disease risk factors,” said study senior author Nicholas Tatonetti, PhD, an assistant professor of biomedical informatics at Columbia University Medical Center (CUMC) and Columbia’s Data Science Institute. The researchers plan to replicate their study with data from several other locations in the U.S. and abroad to see how results vary with the change of seasons and environmental factors in those places. By identifying what’s causing disease disparities by birth month, the researchers hope to figure out how they might close the gap.

Reposted from the Columbia University Medical Center Newsroom. Find the original article here.

Cancer bottlenecks
In an N-of-1 study, researchers at Columbia University use techniques from systems biology to analyze genomic information from an individual patient’s tumor. The goal is to identify key genes, called master regulators (green circles), which, while not mutated, are nonetheless necessary for the survival of cancer cells.

Columbia University Medical Center (CUMC) researchers are developing a new approach to cancer clinical trials, in which therapies are designed and tested one patient at a time. The patient’s tumor is “reverse engineered” to determine its unique genetic characteristics and to identify existing U.S. Food and Drug Administration (FDA)-approved drugs that may target them.

Rather than focusing on the usual mutated genes, only a very small number of which can be used to guide successful therapeutic strategies, the method analyzes the regulatory logic of the cell to identify genes and gene pairs that are critical for the survival of the tumor but are not critical for normal cells. FDA-approved drugs that inhibit these genes are then tested in a mouse model of the patient’s tumor and, if successful, considered as potential therapeutic agents for the patient — a journey from bedside to bench and back again that takes about six to nine months.

“We are taking a rather different approach to tailor therapy to the individual cancer patient,” said principal investigator Andrea Califano, PhD, Clyde and Helen Wu Professor of Chemical Systems Biology and chair of CUMC’s new Department of Systems Biology. “If we have learned one thing about this disease, it’s that it has tremendous heterogeneity both across patients and within individual patients. When we expect different patients with the same tumor subtype or different cells within the same tumor to respond the same way to a treatment, we make a huge simplification. Yet this is how clinical studies are currently conducted. To address this problem, we are trying to understand how tumors are regulated one at a time. Eventually, we hope to be able to treat patients not on an individual basis, but based on common vulnerabilities of the cancer cellular machinery, of which genetic mutations are only indirect evidence. Genetic alterations are clearly responsible for tumorigenesis but control points in molecular networks may be better therapeutic targets.”

Tracking clones

After identifying T cell clones that react against donated kidney tissue in vitro, new computational methods developed in Yufeng Shen's Lab are used to track their frequency following organ transplant. The findings can help to predict transplant rejection or tolerance.

When a patient receives a kidney transplant, a battle often ensues. In many cases, the recipient’s immune system identifies the transplanted kidney as a foreign invader and mounts an aggressive T cell response to eliminate it, leading to a variety of destructive side effects. To minimize complications, many transplant recipients receive drugs that suppress the immune response. These have their own consequences, however, as they can lead to increased risk of infections. For these reasons, scientists have been working to gain a better understanding of the biological mechanisms that determine transplant tolerance and rejection. This knowledge could potentially improve physicians’ ability to predict the viability of an organ transplant and to provide the best approach to immunosuppression therapy based on individual patients’ immune system profiles.

Yufeng Shen, an assistant professor in the Columbia University Department of Systems Biology and JP Sulzberger Columbia Genome Center, together with Megan Sykes, director of the Columbia Center for Translational Immunology at the Columbia University College of Physicians and Surgeons, recently took an encouraging step toward this goal. In a paper published in Science Translational Medicine, they report that the deletion of specific donor-reactive T cell clones in the transplant recipient can support tolerance of a new kidney. Critical to this discovery was the development of a new computational genomics approach by the Shen Lab, which makes it possible to track how frequently rare T cell clones develop and how their frequencies change following transplantation. The paper suggests both a general strategy for understanding the causes of transplant rejection and a means of identifying biomarkers for predicting how well a transplant recipient will tolerate a new kidney.

Comorbidity between Mendelian disease and cancer
Researchers in the Rabadan Lab have found that comorbidity between Mendelian diseases and cancer may result from shared genetic factors.

Genetic diseases can arise in a variety of ways. Mendelian disorders, for example, occur when specific mutations in single genes — called germline mutations — are inherited from either of one’s two parents. Well-known examples of Mendelian diseases include cystic fibrosis, sickle cell disease, and Duchenne muscular dystrophy. Other genetic diseases, including cancer, result from somatic mutations, which occur in individual cells during a person’s lifetime. Because the genetic origins of Mendelian diseases and cancer are so different, they are typically understood to be distinct phenomena. However, scientists in the Columbia University Department of Systems Biology have found evidence that there might be interesting genetic connections between them. 

In a paper just published in Nature Communications, postdoctoral research scientist Rachel Melamed and colleagues in the laboratory of Associate Professor Raul Rabadan report on a new method that uses knowledge about Mendelian diseases to suggest mutations involved in cancer. The study takes advantage of an enormous collection of electronic health records representing over 110 million patients, a substantial percentage of US residents. The authors show that clinical co-occurrence of Mendelian diseases and cancer, known as comorbidity, can be tied to genetic changes that play roles in both diseases. The paper also identifies several specific relationships between Mendelian diseases and the cancers melanoma and glioblastoma.

ALK-negative ALCL mutation map
A map of mutations observed in ALK-negative anaplastic large cell lymphoma. (Credit: Dr. Rabadan)

The following article is reposted with permission from the Columbia University Medical Center Newsroom. Find the original here.

The first-ever systematic study of the genomes of patients with ALK-negative anaplastic large cell lymphoma (ALCL), a particularly aggressive form of non-Hodgkin’s lymphoma (NHL), shows that many cases of the disease are driven by alterations in the JAK/STAT3 cell signaling pathway. The study also demonstrates, in mice implanted with human-derived ALCL tumors, that the disease can be inhibited by compounds that target this pathway, raising hopes that more effective treatments might soon be developed. The study, led by researchers at Columbia University Medical Center (CUMC) and Weill Cornell Medical College, was published today in the online edition of Cancer Cell.

Gut bacteria

Photo by David Gregory and Debbie Marshall, Wellcome Images. 

Recent deep sequencing studies are providing an increasingly detailed picture of the genetic composition of the human microbiome, the diverse collection of bacterial species that inhabit the gut. At the same time, however, little is known about the dynamics of these colonies, particularly why certain microbial strains outcompete others in the same environment. In a new paper published in the journal Molecular Systems Biology, Department of Systems Biology Assistant Professor Harris Wang, in collaboration with Georg Gerber and researchers at Harvard University, reports on the development of the first method for using functional metagenomics to identify genes within commensal bacterial genomes that give them an evolutionary fitness advantage.

Bacterial evolutionary relationships

Image courtesy of Germán Plata and Dennis Vitkup.

Columbia News has just published an article covering recent research by associate professor Dennis Vitkup and postdoctoral research scientist Germán Plata that uses simulations of bacterial metabolism as a lens for studying how phenotypes adapt and diversify across evolutionary time scales. The article reports:

Despite their omnipresence, microbial evolutionary adaptations are often challenging to study, partly due to the difficulty of growing diverse bacteria in the lab. “Probably less than a dozen bacteria are really well studied in the laboratory,” Vitkup says.

Writing in the journal Nature this past January, Vitkup and Plata applied computational tools to investigate bacterial evolutionary adaptations by simulating metabolism for more than 300 bacterial species, covering the entire microbial tree of life.

Autism Spectrum Disorders Genetic Network

Network of autism-associated genes. (Credit: Dennis Vitkup)

The following article is reposted with permission from the Columbia University Medical Center Newsroom. Find the original here.

People with autism have a wide range of symptoms, with no two people sharing the exact type and severity of behaviors. Now a large-scale analysis of hundreds of patients and nearly 1000 genes has started to uncover how diversity among traits can be traced to differences in patients’ genetic mutations. The study, conducted by researchers at Columbia University Medical Center, was published Dec. 22 in the journal Nature Neuroscience.

Autism researchers have identified hundreds of genes that, when mutated, likely increase the risk of developing autism spectrum disorder (ASD). Much of the variability among people with ASD is thought to stem from the diversity of underlying genetic changes, including the specific genes mutated and the severity of the mutation.

“If we can understand how different mutations lead to different features of ASD, we may be able to use patients’ genetic profiles to develop accurate diagnostic and prognostic tools and perhaps personalize treatment,” said senior author Dennis Vitkup, PhD, associate professor of systems biology and biomedical informatics at Columbia University’s College of Physicians & Surgeons.

Sequence of genomic alterations in CLL
A graph representing the sequence of genomic alterations in chronic lymphocytic leukemia (CLL). Each node represents a mutation, with arrows indicating temporal relationships between them. The size of the nodes indicates the number of patients in the study who exhibited the alteration, while the thickness of the lines shows how often the temporal relationships between nodes were seen. The method the researchers used enabled them to identify multiple, distinct evolutionary patterns in CLL.

As biologists have gained a better understanding of cancer, it has become clear that tumors are often driven not by a single mutation, but by a series of genetic changes that correspond to particular stages of cancer progression. In this sense, a tumor is constantly evolving, with different groups of cells that harbor distinctive mutations multiplying at different rates, depending on their fitness for particular disease states. As the search for more effective cancer diagnostics and therapies continues, one key question is how to disentangle the order in which mutations occur in order to understand how tumors change over time. Being able to predict how a tumor will behave based on signs seen early in the course of disease could enable the development of new diagnostics that could better inform treatment planning.

In a paper just published in the journal eLife, a team of investigators led by Department of Systems Biology Associate Professor Raul Rabadan reports on a new computational strategy for addressing this challenge. Their framework, called tumor evolutionary directed graphs (TEDG), considers next-generation sequencing data from tumor samples from a large number of patients. Using TEDG to analyze cancer cells in patients with chronic lymphocytic leukemia (CLL), they were able to develop a model of how the disease’s mutational landscape changes from its initial onset to its late stages. Their findings suggest that CLL may not be just the result of a single evolutionary path, but can evolve in alternative ways.
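
The kind of graph described in the figure caption, with nodes weighted by patient count and arrows weighted by how often an ordering was seen, can be sketched in a few lines. This is an illustrative assumption rather than the TEDG method itself: the input format (one already-ordered list of alterations per patient), the function name, and the placeholder labels are all hypothetical, and how temporal order is inferred from sequencing data is not covered here.

```python
from collections import Counter


def build_alteration_graph(patients):
    """Aggregate per-patient alteration orderings into weighted nodes and edges.

    patients maps a patient ID to a list of alterations in their (assumed,
    already inferred) temporal order. Returns two Counters:
      node_counts[alteration]     -> number of patients with that alteration
      edge_counts[(early, late)]  -> number of patients showing that ordering
    """
    node_counts = Counter()
    edge_counts = Counter()
    for ordered in patients.values():
        # Count each alteration once per patient.
        node_counts.update(set(ordered))
        # Count each consecutive (earlier, later) pair once per patient.
        edge_counts.update(set(zip(ordered, ordered[1:])))
    return node_counts, edge_counts


# Placeholder data: "A", "B", "C" stand in for alterations; not real results.
toy_patients = {
    "p1": ["A", "B", "C"],
    "p2": ["A", "C"],
    "p3": ["B", "C"],
}
nodes, edges = build_alteration_graph(toy_patients)
print(dict(nodes))
print(dict(edges))
```

In a drawing of this toy graph, node size would follow `nodes` and arrow thickness would follow `edges`. Two different routes into "C" appearing across patients is the sort of pattern that would point to alternative evolutionary paths.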

Expanding the landscape of breast cancer drivers

In comparison with a previous study (Stephens et al., 2012, shown in gray), a new computational approach that focuses on somatic copy number mutations increased the number of known driver mutations in breast tumors to a median of five for each tumor. The findings could raise the likelihood of finding actionable targets in individual patients with breast cancer.

For many years, researchers have known that somatic copy number alterations (SCNAs), which are insertions, deletions, duplications, and transpositions of sections of DNA that are not inherited but occur after birth, play important roles in causing many types of cancer. Indeed, most recurrent drivers of epithelial tumors are copy number alterations, with some found in up to 40% of patients with specific tumor types. However, because SCNAs occur when entire sections of chromosomes become damaged, biologists have had difficulty developing effective methods for distinguishing genes within SCNAs that actually drive cancer from those genes that might lie near a driver but do not themselves cause disease.

Helios nearly doubled the number of high-confidence predictions of breast cancer drivers.

In a new paper published in Cell, researchers in the laboratories of Dana Pe’er (Columbia University Departments of Systems Biology and Biological Sciences) and Jose Silva (Icahn School of Medicine at Mount Sinai) report on a new computational algorithm that promises to dramatically improve researchers’ ability to identify cancer-driving genes within potentially large SCNAs. The algorithm, called Helios, was used to analyze a combination of genomic data and information generated by functional RNAi screens, enabling the team to predict several dozen new SCNA drivers of breast cancer. In follow-up in vitro experimental studies, they tested 12 of these predictions, 10 of which were validated in the laboratory. Their findings nearly double the number of breast cancer drivers, providing many new opportunities for personalized treatment of breast cancer. Their methodology is general and could also be used to locate disease-causing SCNAs in other cancer types.

Leading this effort was Felix Sanchez-Garcia, a recent PhD graduate from the Pe’er Lab and a first author on the paper. The story of how this breakthrough came about illuminates how the interdisciplinary research and education that take place at the Department of Systems Biology can address important challenges facing biological and biomedical research.

Fluidigm C1 Single-Cell Plate

At the core of the Fluidigm C1 Single-Cell Auto Prep System is a 96-well plate containing microfluidics. After individual cells are isolated in their own wells, the device amplifies their cDNA for genome-wide gene expression profiling. Scientists at the Columbia Genome Center are developing methods for addressing the technical and analytical challenges of single-cell RNA sequencing, and have begun generating some exciting data.

Since the invention of the first microscope, a procession of new technologies has enabled scientists to study individual cells at increasingly fine levels of detail. The last two years have witnessed an important next stage in this evolution, with the arrival of the first devices for genetically profiling single cells on a genome-wide scale.

The first commercial product in this field is the Fluidigm C1 Single-Cell Auto Prep System, which uses microfluidics to isolate single cells and offers the ability to generate gene expression profiles for up to 96 cells at a time. But because of the novelty of the technology and the inherent difficulties of working with single cells, it has presented a number of technical challenges for researchers interested in exploring biology at this level.

Now, scientists at the JP Sulzberger Columbia Genome Center led by Assistant Professors Peter Sims and Yufeng Shen have developed an experimental and computational pipeline that optimizes the C1’s capabilities. And even as they work to solve some of the challenges that are inherent to single-cell research, their approach has begun generating some exciting data for studying genetics in a variety of cell types.

DIGGIT identifies mutations upstream of master regulators.

A new algorithm called DIGGIT identifies mutations that lie upstream of crucial bottlenecks within regulatory networks. These bottlenecks, called master regulators, integrate these mutations and become essential functional drivers of diseases such as cancer.

Although genome-wide association studies have made it possible to identify mutations that are linked to diseases such as cancer, determining which mutations actually drive disease and the mechanics of how they do so has been an ongoing challenge. In a paper just published in Cell, researchers in the lab of Andrea Califano describe a new computational approach that may help address this problem.

Ashkenazi Population Bottleneck Model
The consortium’s model of Ashkenazi Jewish ancestry suggests that the population’s history was shaped by three critical bottleneck events. The ancestors of both the European and Middle Eastern populations underwent a bottleneck sometime between 85,000 and 91,000 years ago, which was likely coincident with an Out-of-Africa event. The founding European population underwent a bottleneck approximately 21,000 years ago, beginning a period of interbreeding between individuals of European and Middle Eastern ancestry. A severe bottleneck occurred in the Middle Ages, reducing the population to under 350 individuals. The modern-day Ashkenazi community emerged from this group.

An international research consortium led by Associate Professor Itsik Pe’er has produced a new panel of reference genomes that will significantly improve the study of genetic variation in Ashkenazi Jews. Using deep sequencing to analyze the genomes of 128 healthy individuals of Ashkenazi Jewish origin, The Ashkenazi Genome Consortium (TAGC) has just published a resource that will be much more effective than previously available European reference genomes for identifying disease-causing mutations within this historically isolated population. Their study also provides novel insights into the historical origins and ancestry of the Ashkenazi community. A paper describing their study has just been published online in Nature Communications.

The dataset produced by the consortium provides a high-resolution baseline genomic profile of the Ashkenazi Jewish population, which they revealed to be significantly different from that found in non-Jewish Europeans. In the past, clinicians’ only option for identifying disease-causing mutations in Ashkenazi individuals was to compare their genomes to more heterogeneous European reference sets. This new resource accounts for the historical isolation of this population, and so will make genetic screening much more accurate in identifying disease-causing mutations.

In an article that appears on the website of Columbia University’s Fu Foundation School of Engineering and Computer Science, Dr. Pe’er explains:

“Our study is the first full DNA sequence dataset available for Ashkenazi Jewish genomes... With this comprehensive catalog of mutations present in the Ashkenazi Jewish population, we will be able to more effectively map disease genes onto the genome and thus gain a better understanding of common disorders. We see this study serving as a vehicle for personalized medicine and a model for researchers working with other populations.”

In addition to offering an important resource for such future translational and clinical research, the paper’s findings also provide new insights that have implications for the much debated question of how European and Ashkenazi Jewish populations emerged historically.