Computational Biology

News

viSNE

viSNE reveals the progression of cancer in a sample of cells taken from a patient with acute myeloid leukemia. Cells are colored according to intensity of expression of the indicated cell markers, enabling the comparison of expression patterns before and after relapse. For example, Flt3 is expressed primarily in the diagnosis sample, while CD34 emerges in the relapse sample.

Researchers in the Columbia Initiative in Systems Biology have developed a computational method that enables scientists to visualize and interpret high-dimensional data produced by single-cell measurement technologies such as mass cytometry. The method, called viSNE (visual interactive Stochastic Neighbor Embedding), has just been published in the online edition of Nature Biotechnology. It has particular relevance to cancer research and therapeutics. As Columbia University Medical Center reports:

Researchers now understand that cancer within an individual can harbor subpopulations of cells with different molecular characteristics. Groups of cells may behave differently from one another, including in how they respond to treatment. The ability to study single cells, as well as to identify and characterize subpopulations of cancerous cells within an individual, could lead to more precise methods of diagnosis and treatment.

“Our method not only will allow scientists to explore the heterogeneity of cancer cells and to characterize drug-resistant cancer cells, but also will allow physicians to track tumor progression, identify drug-resistant cancer cells, and detect minute quantities of cancer cells that increase the risk of relapse,” said co-senior author Dana Pe’er, associate professor of biological sciences and systems biology at Columbia.

Barry Honig

When Columbia University founded the Center for Multiscale Analysis of Genomic and Cellular Networks (MAGNet) in 2005, one of its goals was to integrate the methods of structural biology with those of systems biology. Considering protein structure within the context of computational models of cellular networks, researchers hoped, would not only improve the predictive value of their models by giving another layer of evidence, but also lead to new types of predictions that could not be made using other methods.

In a new paper published in the journal Nature, Barry Honig, Andrea Califano, and other members of the Columbia Initiative in Systems Biology, including first authors Qiangfeng Cliff Zhang and Donald Petrey, report that this goal has now been realized. For the first time, the researchers have shown that information about protein structure can be used to make predictions about protein-protein interactions on a genome-wide scale. Their approach capitalizes on innovative techniques in computational structural biology that the Honig lab has developed over the last 15 years, culminating in the development of a new algorithm called Predicting Protein-Protein Interactions (PrePPI). In this interview, Honig describes the evolution of this new approach, and what it could mean for the future of systems biology.

Tumor-induced mRNA expression changes for individual biochemical reactions in central metabolism. 

A large study analyzing gene expression data from 22 cancer types has identified a broad spectrum of metabolic expression changes associated with cancer. The analysis, led by Dennis Vitkup and first author Jie Hu, a postdoctoral research scientist in the Vitkup lab, together with a multi-institutional group of collaborators, also identified hundreds of potential drug targets that could cut off a tumor’s fuel supply or interfere with its ability to synthesize the building blocks it needs to grow. The study has just been published in the online edition of Nature Biotechnology.

As Columbia University Medical Center reports:

The results should ramp up research into drugs that interfere with cancer metabolism, a field that dominated cancer research in the early 20th century and has recently undergone a renaissance.

Attractor Metagenes - DREAM7

Team Attractor Metagenes receives its award at the DREAM7 Conference. Gustavo Stolovitzky (IBM Research), Adam Margolin (Sage Bionetworks), Dimitris Anastassiou, Tai-Hsien Ou Yang, Wei-Yi Cheng, Stephen Friend (Sage Bionetworks), Erhan Bilal (IBM Research)

The team of Professor Dimitris Anastassiou and graduate students Wei-Yi Cheng and Tai-Hsien Ou Yang has been recognized as the best performer in the Sage Bionetworks – DREAM Breast Cancer Prognosis Challenge. This challenge, one of four organized as part of the seventh Dialogue for Reverse Engineering Assessments and Methods (DREAM7), was designed to assess the ability of participants’ computational models to predict breast cancer survival using patient clinical information and molecular profiling data. As a reward for this accomplishment, the journal Science Translational Medicine has just published a paper from the Anastassiou lab describing their model. It is also the journal’s cover theme for this issue, which includes a second article describing the Challenge.

The Columbia University researchers based their DREAM entry on previous work to identify what they call “attractor metagenes,” sets of strongly co-expressed genes that they have found to be present with very little variation in many cancer types. Moreover, these metagenes appear to be associated with specific attributes of cancer including chromosomal instability, epithelial-mesenchymal transition, and a lymphocyte-specific immune response. As Wei-Yi Cheng comments in Sage Synapse, “We like to think of these three main attractor metagenes as representing three key ‘bioinformatic hallmarks of cancer,’ reflecting the ability of cancer cells to divide uncontrollably and invade surrounding tissues, and the ability of the organism to recruit a particular type of immune response to fight the disease.”

Genes forming cluster I in the context of cellular signaling pathways. Proteins encoded by cluster genes are shown in yellow, and those corresponding to other relevant genes that were present in the input data but not selected by the NETBAG+ algorithm are shown in cyan. 

In a new paper published in the journal Nature Neuroscience, Columbia University researchers report that many of the genes that are mutated in schizophrenia are organized into two main networks. Surprisingly, the study also found that a genetic network that leads to schizophrenia is very similar to a network that has been linked to autism. 

Using a computational approach called NETBAG+, Dennis Vitkup and colleagues performed network-based analyses of rare de novo mutations to map the gene networks that lead to schizophrenia. When they compared one schizophrenia network to an autism network described in a study Vitkup published last year, they discovered that different copy number variants in the same genes can lead to either schizophrenia or autism. The overlapping genes are important for axon guidance, synapse function, and cell migration, processes within the brain that have been shown to play a role in the development of these two diseases. These gene networks are particularly active during prenatal development, suggesting that the foundations for schizophrenia and autism are laid very early in life.

Itsik Pe'er, an Associate Professor in the Department of Computer Science and member of the Columbia Initiative in Systems Biology, is using mathematics and computer analytics to identify the genetic makeup of the founding Ashkenazi Jews. By analyzing the full DNA sequences of hundreds of their descendants in the New York City area and comparing them to reference sets of non-Ashkenazi DNA, he aims to identify Ashkenazi-specific genetic mutations associated with diseases such as Tay-Sachs, Crohn's, and Parkinson's disease. As a new article in Columbia News explains:

By examining similarities in DNA segments shared by large numbers of related individuals, his lab developed statistical models that allow him to make generalizations about entire populations. The mix of genes that every child inherits from each parent travels in long sequences of code that remain together and are remarkably consistent from one generation to the next.

"The size of the gene chunks gets smaller with each generation, but they diminish at a consistent and predictable rate. As a result, Pe’er can use his models to determine distant relationships shared by two individuals by measuring the length of their common DNA segments."
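The link between segment length and relatedness can be illustrated with a standard textbook approximation, which is an assumption here and not a description of the statistical models developed in the Pe'er lab: if two people share a common ancestor g generations back, they are separated by 2g meioses, and the segments they both inherit have an expected length of roughly 100 / (2g) centimorgans. The function names below are hypothetical.

```python
def expected_segment_length_cm(generations: int) -> float:
    """Approximate mean length (in centimorgans) of a DNA segment shared by
    two people whose common ancestor lived `generations` generations ago.

    Assumes the textbook model: 2 * generations meioses separate the pair,
    and shared segments average 100 / (number of meioses) centimorgans.
    """
    if generations < 1:
        raise ValueError("generations must be at least 1")
    meioses = 2 * generations
    return 100.0 / meioses


def estimate_generations(mean_length_cm: float) -> float:
    """Invert the approximation: estimate how many generations back the
    common ancestor lived from the mean shared-segment length."""
    if mean_length_cm <= 0:
        raise ValueError("mean_length_cm must be positive")
    return 100.0 / (2.0 * mean_length_cm)


if __name__ == "__main__":
    for g in (1, 5, 10, 25):
        print(g, "generations ->", expected_segment_length_cm(g), "cM")
```

Under this simplified model, shorter shared segments point to a more distant common ancestor, which is the intuition behind inferring distant relationships from segment lengths.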


GLOBUS algorithm

An overview of the GLOBUS algorithm.

A Columbia University team led by professor Dennis Vitkup and PhD student German Plata of the Center for Computational Biology and Bioinformatics has developed a novel genome-wide framework for making probabilistic annotations of metabolic networks. Their approach, called Global Biochemical Reconstruction Using Sampling (GLOBUS), combines information about sequence homology with context-specific information including phylogeny, gene clustering, and mRNA co-expression to predict the probability of biochemical interactions between specific genes. By integrating these different categories of information using a principled probabilistic framework, this approach overcomes limitations of considering only one functional category or one gene at a time, providing a global and accurate prediction of metabolic networks.

In a paper published in Nature Chemical Biology, the scientists write, "Currently, most publicly available biochemical databases do not provide quantitative probabilities or confidence measures for existing annotations. This makes it hard for the users of these valuable resources to distinguish between confident assignments and mere guesses... The GLOBUS approach, which is based on statistical sampling of possible biochemical assignments, provides a principled framework for such global probabilistic annotations. The method assigns annotation probabilities to each gene and suggests likely alternative functions."
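The idea of turning sampled assignments into annotation probabilities can be sketched with a toy Gibbs sampler. This is not the GLOBUS implementation: the scoring scheme, the function and parameter names, and the example data are all hypothetical, and the context evidence is reduced to a single bonus for genes whose assigned functions fall in the same pathway.

```python
import math
import random
from collections import Counter


def gibbs_annotation_probabilities(homology, pathway, context_bonus=1.5,
                                   n_sweeps=2000, burn_in=200, seed=0):
    """Estimate per-gene annotation probabilities by sampling assignments.

    homology: dict gene -> dict candidate function -> log-score (assumed input)
    pathway:  dict function -> pathway label (stand-in for context evidence)
    context_bonus: log-score added per other gene currently assigned to a
                   function in the same pathway as the candidate
    """
    rng = random.Random(seed)
    genes = sorted(homology)
    # Start from the best homology hit for every gene.
    state = {g: max(homology[g], key=homology[g].get) for g in genes}
    counts = {g: Counter() for g in genes}

    for sweep in range(n_sweeps):
        for g in genes:
            candidates = sorted(homology[g])
            weights = []
            for f in candidates:
                # Count other genes whose current function shares f's pathway.
                agree = sum(1 for h in genes
                            if h != g and pathway[state[h]] == pathway[f])
                weights.append(math.exp(homology[g][f] + context_bonus * agree))
            # Resample this gene's function given all other assignments.
            state[g] = rng.choices(candidates, weights=weights, k=1)[0]
        if sweep >= burn_in:
            for g in genes:
                counts[g][state[g]] += 1

    kept = n_sweeps - burn_in
    return {g: {f: counts[g][f] / kept for f in sorted(homology[g])}
            for g in genes}


if __name__ == "__main__":
    homology = {"geneA": {"f1": 0.0},
                "geneB": {"f2": 0.0, "f3": 0.0}}  # geneB is ambiguous
    pathway = {"f1": "P", "f2": "P", "f3": "Q"}
    print(gibbs_annotation_probabilities(homology, pathway))
```

In this example, sequence evidence alone cannot separate the two candidate functions for geneB, but the context term shifts probability toward the function that shares a pathway with geneA, and the less likely function is still reported as an alternative with its own probability.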

Transforming activity of FGFR-TACC fusion proteins

Representative microphotographs of hematoxylin and eosin staining of advanced FGFR3-TACC3-shp53–generated tumors show histological features of high-grade glioma.

A new paper published by Columbia University Medical Center researchers in the journal Science has determined that some cases of glioblastoma, the most aggressive form of primary brain cancer, result from the fusion of the genes FGFR and TACC. Raul Rabadan, a co-senior author on the study, led efforts to identify these genes by using quantitative methods to analyze the glioblastoma genome from nine patients, and then compare these results with more than 300 genomes from the Cancer Genome Atlas project.

The collaboration with cancer genomics expert Antonio Iavarone and co-senior author Anna Lasorella found that the protein produced by the FGFR-TACC fusion disrupts the mitotic spindle (the cellular structure that guides mitosis) and causes aneuploidy, an uneven distribution of chromosomes that causes tumorigenesis. The researchers also found that drugs that target this aberration can dramatically slow the growth of tumors in mice, suggesting a potential therapeutic target.

An extensive microRNA-mediated network of RNA-RNA interactions

Genome-wide inference of sponge modulators identified a miR-program mediated post-transcriptional regulatory (mPR) network including ~248,000 interactions.

For decades, scientists have thought that the primary role of messenger RNA (mRNA) is to shuttle information from the DNA to the ribosomes, the sites of protein synthesis. However, new studies now suggest that the mRNA of one gene can control, and be controlled by, the mRNA of other genes via a large pool of microRNA molecules, with dozens to hundreds of genes working together in complex self-regulating sub-networks.

In work published in the journal Cell, Andrea Califano, José Silva, and colleagues analyzed gene expression data in glioblastoma in combination with matched microRNA profiles to uncover a posttranscriptional regulation layer of surprising magnitude, comprising more than 248,000 microRNA (miR)-mediated interactions. These include ∼7,000 genes whose transcripts act as miR “sponges.” When two genes share a set of microRNA regulators, changes in the expression of one gene affect the other. If, for instance, one of those genes is highly expressed, the increase in its mRNA molecules will “sponge up” more of the available microRNAs. As a result, fewer microRNA molecules will be available to bind and repress the other gene’s mRNAs, leading to a corresponding increase in expression.

Although such an effect had been previously elucidated, the range and relevance of this kind of interaction had not been characterized.
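The sponge effect can be made concrete with a deliberately simple titration model. This is an illustrative assumption and not the analysis used in the study: two transcripts compete with equal affinity for one shared microRNA pool, each microRNA silences one transcript, and the microRNAs are divided between the two transcripts in proportion to their abundance. The function name and the numbers are hypothetical.

```python
def free_transcripts(target: float, competitor: float, mirna: float) -> float:
    """Number of `target` transcripts left unrepressed when `target` and
    `competitor` transcripts share a pool of `mirna` microRNA molecules.

    Toy assumptions: equal binding affinity, one microRNA per transcript,
    and microRNAs split between the two transcripts by abundance.
    """
    total = target + competitor
    if total <= 0:
        return 0.0
    # Fraction of all transcripts that end up bound by a microRNA.
    bound_fraction = min(1.0, mirna / total)
    return target * (1.0 - bound_fraction)


if __name__ == "__main__":
    # Raising the competitor's expression frees more target transcripts.
    for competitor in (100, 300, 900):
        print(competitor, free_transcripts(100, competitor, 100))
```

With the target and the microRNA pool held fixed, tripling the competitor's expression raises the number of unrepressed target transcripts, which is the coupling between the two genes described above.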

Gene clusters found using NETBAG analysis of de novo CNV regions observed in autistic individuals. A) The highest scoring cluster obtained using the search procedure with up to one gene per each CNV region. B) The cluster obtained using the search with up to two genes per region.

Identification of complex molecular networks underlying common human phenotypes is a major challenge of modern genetics. A new network-based method developed at the lab of Dennis Vitkup was used to identify a large biological network of genes affected by rare de novo copy number variations (CNVs) in autism. The genes forming the network are primarily related to synapse development, axon targeting, and neuron motility. The identified network is strongly related to genes previously implicated in autism and intellectual disability phenotypes.

These findings are consistent with the hypothesis that significantly stronger functional perturbations are required to trigger the autistic phenotype in females compared to males. Overall, the analysis of de novo variants supports the hypothesis that perturbed synaptogenesis is at the heart of autism.

Systematic characterization of cancer genomes has revealed a staggering number of diverse alterations that differ among individuals, so that their functional importance and physiological impact remain poorly defined. To identify which genetic alterations are functional, the lab of Dr. Dana Pe’er has developed a novel Bayesian probabilistic algorithm, CONEXIC, which integrates copy number and gene expression data to identify tumor-specific “driver” aberrations, as well as the cellular processes they affect.

In work published in the journal Cell, the new method was applied to data from melanoma patients, identifying a list of 64 putative ‘drivers’ and the core processes affected by them. This list includes many known driver genes (e.g., MITF), which CONEXIC correctly identified and paired with their known targets. It also includes novel ‘driver’ candidates such as Rab27a and TBC1D16, both involved in protein trafficking. shRNA-mediated silencing of these genes in short-term tumor-derived cultures determined that they are tumor dependencies and validated their computationally predicted role in melanoma (including target identification), suggesting that protein trafficking may play an important role in this malignancy.

Flu cases in early 2009

Because flu viruses mutate nearly once every reproduction cycle, no two people are made sick by precisely the same virus, as illustrated by this chart documenting swine flu cases among humans in early 2009.

The recent outbreak and sudden spread of a novel H1N1 influenza virus has caused worldwide concern and has tested our ability to respond to major public health challenges. Significant scientific resources have been marshaled to discover the best possible responses against this novel swine-origin influenza virus. A group led by Raul Rabadan at the Center for Computational Biology and Bioinformatics and the Department of Biomedical Informatics at Columbia University has been studying the evolution of influenza viruses and the origins of flu pandemics by analyzing large data sets that contain genomic information.
