• Editor's note: The MAGNet center formally closed in July 2016, following the mandatory conclusion of its grant after more than 10 years of activity. The pages in this section constitute an archive of its work.

The following papers are a representative sample of high-impact publications produced with funding for the Center for Multiscale Analysis of Genomic and Cellular Networks (MAGNet).


Zhang QC, Petrey D, Deng L, Qiang L, Shi Y, Thu CA, Bisikirska B, Lefebvre C, Accili D, Hunter T, Maniatis T, Califano A, Honig B. Structure-based prediction of protein-protein interactions on a genome-wide scale. Nature. 2012 Oct 25;490(7421):556-60.

Summary: The genome-wide identification of pairs of interacting proteins is an important step in the elucidation of cell regulatory mechanisms. Much of our current knowledge derives from high-throughput techniques such as yeast two-hybrid and affinity purification, as well as from manual curation of experiments on individual systems. In this work we showed that three-dimensional structural information can be used to predict protein-protein interactions with an accuracy and coverage that is comparable to that of high-throughput experiments. Our new algorithm yields over 300,000 high confidence interactions for human proteins and opens the door for the use of structural information on an unprecedented scale and for a fuller integration of structural and systems biology than has been possible in the past.


Sumazin P, Yang X, Chiu HS, Chung WJ, Iyer A, Llobet-Navas D, Rajbhandari P, Bansal M, Guarnieri P, Silva J, Califano A. An extensive microRNA-mediated network of RNA-RNA interactions regulates established oncogenic pathways in glioblastoma. Cell. 2011 Oct 14;147(2):370-81.

Summary: This paper is part of a set of four papers co-published in the same issue of Cell to highlight the extensive role of competitive endogenous RNA interactions in regulating both normal physiology and tumorigenesis. Competitive endogenous RNAs (ceRNAs) are RNA species that share a significant number of common microRNA binding sites. By competing for the same finite pool of microRNAs, ceRNAs regulate each other. This paper, in particular, shows that the ceRNA regulatory network is comparable in size (i.e., number of interactions) and regulatory effect to transcriptional regulatory networks. Moreover, it shows that ceRNA interactions can explain a significant portion of missing genetic variability in tumorigenesis. Specifically, we have shown that different subsets of 13 ceRNAs that regulate PTEN are frequently deleted in glioblastoma (GBM) patients with an intact or hemizygously deleted PTEN locus, thus contributing to a significant downregulation of this tumor suppressor at the protein level.


Slattery M, Riley T, Liu P, Abe N, Gomez-Alcala P, Dror I, Zhou T, Rohs R, Honig B, Bussemaker HJ, Mann RS. Cofactor binding evokes latent differences in DNA binding specificity between Hox proteins. Cell. 2011 Dec 9;147(6):1270-82.

Summary: We developed an experimental and computational platform, SELEX-seq, that can be used to determine the relative binding affinities to any DNA sequence for any transcription factor or transcription factor complex. Applying this method to all eight Drosophila Hox proteins, we showed that they obtain novel recognition properties when they bind DNA with the dimeric cofactor Extradenticle-Homothorax (Exd). Our DNA structure predictions suggest that anterior and posterior Hox proteins prefer DNA sequences with distinct minor groove topographies. Our data suggest that emergent DNA recognition properties revealed by interactions with cofactors contribute to transcription factor specificities in vivo.


Tiacci E, Trifonov V, Schiavoni G, Holmes A, Kern W, Martelli MP, Pucciarini A, Bigerna B, Pacini R, Wells VA, Sportoletti P, Pettirossi V, Mannucci R, Elliott O, Liso A, Ambrosetti A, Pulsoni A, Forconi F, Trentin L, Semenzato G, Inghirami G, Capponi M, Di Raimondo F, Patti C, Arcaini L, Musto P, Pileri S, Haferlach C, Schnittger S, Pizzolo G, Foà R, Farinelli L, Haferlach T, Pasqualucci L, Rabadan R, Falini B. BRAF mutations in hairy-cell leukemia. N Engl J Med. 2011 Jun 16;364(24):2305-15.

Summary: This paper uses bioinformatics analysis of next-generation sequencing data to identify a mutation that is present in all hairy-cell leukemias (HCL). This is in stark contrast to mutations found in the genomes of other tumors sequenced so far, where the frequency of recurrence is rarely above 20% to 30%. The authors searched for HCL-associated mutations by sequencing the exome of leukemic and matched normal mononuclear cells purified from the peripheral blood of a single HCL patient. The analysis identified 5 missense somatic clonal mutations that were confirmed by Sanger sequencing, including a heterozygous V600E mutation involving the BRAF gene. Since the BRAF V600E mutation is oncogenic in other tumors, further analyses were focused on this genetic lesion. Sanger sequencing detected mutated BRAF in 46/46 additional HCL patients (47/47 including the index case; 100%). None of the 193 peripheral B-cell lymphomas/leukemias other than HCL that were investigated carried the BRAF V600E mutation, including 36 cases of splenic marginal zone lymphomas and unclassifiable splenic lymphomas/leukemias. Immunohistological and Western blot studies showed that HCL cells express phospho-MEK and phospho-ERK (the downstream targets of the BRAF kinase), indicating a constitutive activation of the RAF-MEK-ERK mitogen-activated protein kinase pathway in HCL. In vitro incubation of BRAF-mutated primary leukemic cells from 5 HCL patients with PLX-4720, a specific inhibitor of active BRAF, led to a marked decrease of phosphorylated ERK and MEK.


Gilman SR, Iossifov I, Levy D, Ronemus M, Wigler M, Vitkup D. Rare de novo variants associated with autism implicate a large functional network of genes involved in formation and function of synapses. Neuron. 2011 Jun 9;70(5):898-907.

Summary: To understand molecular networks underlying the autistic phenotype, we developed NETBAG, a novel method for network-based analysis of genetic associations. NETBAG was used to identify a large biological network of genes affected by rare de novo copy number variants (CNVs) associated with autism. The genes forming the identified network are primarily related to synapse development, axon targeting, and neuron motility. The results of the study are also consistent with the hypothesis that significantly stronger functional perturbations are required to trigger the autistic phenotype in females compared to males. More generally, the study provides proof of principle that networks underlying complex human phenotypes can be identified by a network-based analysis of relevant rare genetic variants.


Filion GJ, van Bemmel JG, Braunschweig U, Talhout W, Kind J, Ward LD, Brugman W, de Castro IJ, Kerkhoven RM, Bussemaker HJ, van Steensel B. Systematic protein location mapping reveals five principal chromatin types in Drosophila cells. Cell. 2010 Oct 15;143(2):212-24.

Summary: Chromatin is traditionally classified into transcriptionally active euchromatin and repressed heterochromatin. In this paper we updated this view via a massive integrative genome-wide analysis. Specifically, we used the DamID method to map the genomic locations of 53 chromatin-associated proteins in an embryonic Drosophila melanogaster cell line. We then applied a data-driven computational classification strategy to identify recurrent, distinct combinations of protein association across the genome. Our analysis reveals that chromatin comes in five principal types. One of those appears to be a new type of repressive chromatin which covers almost half the Drosophila genome and lacks classic heterochromatin markers. Furthermore, we find that transcriptionally active euchromatin consists of two types that differ in molecular organization and regulate distinct classes of genes. As chromatin-associated proteins are broadly conserved among species, it is likely that this classification will be widely applicable and could form a new framework for describing the epigenome.


Carro MS, Lim WK, Alvarez MJ, Bollo RJ, Zhao X, Snyder EY, Sulman EP, Anne SL, Doetsch F, Colman H, Lasorella A, Aldape K, Califano A, Iavarone A. The transcriptional network for mesenchymal transformation of brain tumours. Nature. 2010 Jan 21;463(7279):318-25.

Summary: This paper showed for the first time that assembly and interrogation of accurate, context-specific regulatory models can elucidate master regulators of pathologic phenotypes. Specifically, interrogation of an ARACNe-inferred regulatory network for high-grade glioma with a signature of the mesenchymal subtype of GBM, using the Master Regulator Inference algorithm (MARINa), revealed a small regulatory module comprising six transcription factors (C/EBPb, C/EBPd, Stat3, FosL2, Runx1, and BHLHB2) responsible for controlling the expression of mesenchymal genes in neural stem cells. Two of these genes (C/EBPb and Stat3) were found to be synergistic master regulators of this phenotype. Indeed, their co-ectopic expression was able to reprogram neural stem cells along an aberrant mesenchymal lineage, while their co-silencing in GBM-derived cell lines of a mesenchymal subtype was sufficient to abrogate the mesenchymal state and tumorigenesis in vivo. This paper is seminal in that it paves the way for the use of reverse-engineered regulatory models to elucidate mechanisms of cell state transition, both physiologic and pathologic. This has been followed up by a number of additional manuscripts published or in review where this approach has elucidated master regulators for a variety of cellular phenotypes.


Akavia UD, Litvin O, Kim J, Sanchez-Garcia F, Kotliar D, Causton HC, Pochanard P, Mozes E, Garraway LA, Pe'er D. An integrated approach to uncover drivers of cancer. Cell. 2010 Dec 10;143(6):1005-17.

Summary: CONEXIC is an algorithm that combines genomic characterization and computational modeling to identify drivers of tumorigenesis, the genes affected by these drivers, and their possible functions. When applied to melanoma data, CONEXIC identified novel driver candidates, including two genes, RAB27A and TBC1D16, which were validated experimentally, implicating vesicular trafficking in cancer cell proliferation. This algorithm provides a method to identify drivers of cancer, and was successfully applied to multiple other cancer types including breast, ovarian, glioblastoma, CLL and lung cancer.


Rohs R, West SM, Sosinsky A, Liu P, Mann RS, Honig B. The role of DNA shape in protein-DNA recognition. Nature. 2009 Oct 29;461(7268):1248-53. 

Summary: This paper generalized and extended the observations made in Joshi et al. (Cell, 2007). By comprehensively analyzing the three-dimensional structures of protein-DNA complexes, we showed that the binding of arginines to narrow minor grooves is a widely used mode for protein-DNA recognition. Minor groove narrowing is often associated with the presence of A-tracts, AT-rich sequences that exclude the flexible TpA step. Our findings suggest that the ability to detect local variations in DNA shape via their electrostatic potential is a general mechanism that enables proteins to use information in the minor groove, which otherwise offers few opportunities for the formation of base-specific hydrogen bonds, to achieve DNA binding specificity.


Compagno M, Lim WK, Grunn A, Nandula SV, Brahmachary M, Shen Q, Bertoni F, Ponzoni M, Scandurra M, Califano A, Bhagat G, Chadburn A, Dalla-Favera R, Pasqualucci L. Mutations of multiple genes cause deregulation of NF-kappaB in diffuse large B-cell lymphoma. Nature. 2009 Jun 4;459(7247):717-21.

Summary: This paper shows that an entire spectrum of genetic alterations in diffuse large B-cell lymphoma (DLBCL) is canalized through a regulatory bottleneck to induce non-oncogene addiction. Specifically, mutations in TNFAIP3 (A20), CARD11, TRAF2, TRAF5, MAP3K7 (TAK1), and TNFRSF11A (RANK) all converge on NF-kB as a master integrator, eliciting non-oncogene addiction. By non-oncogene addiction we mean that cells that harbor these genetic alterations become addicted to NF-kB activity despite the fact that the genes coding for this complex are not themselves mutated (i.e., NF-kB is not an oncogene). Thus, RNAi-mediated silencing of NF-kB subunits in DLBCL cells with these mutations leads to a dramatic reduction in proliferation, compared to DLBCL cells that harbor other mutations unrelated to NF-kB. The concept of non-oncogene addiction is becoming an important tool in cancer research to identify mutation-induced vulnerabilities that are more frequent than the classic oncogene alterations. This concept was further strengthened in the Carro et al. paper in 2010, where another non-oncogene addiction (C/EBPb and Stat3) was identified in GBM.


Pe'er I, Yelensky R, Altshuler D, Daly MJ. Estimation of the multiple testing burden for genomewide association studies of nearly all common variants. Genet Epidemiol. 2008 May;32(4):381-5.

Summary: In this paper the authors provide rigorous, simulation-based estimates of the multiple testing burden for genome-wide association studies (GWAS). At the time of the paper’s publication, GWAS were becoming very popular as affordable, array-based technologies for genotyping millions of variants were increasingly available. In that context, a key challenge facing researchers in the field was determining the significance of results in the face of testing a genome-wide set of multiple hypotheses, most of which were producing noisy, null-distributed association signals. The key contribution of the paper was the development of standards for whole-genome significance, based on data collected by the International Haplotype Map Consortium. The authors report an estimated burden of a million independent tests genome-wide in Europeans, and twice that number in Africans. They further identify the sensitivity of the testing burden to the required significance level, with implications for the staged design of association studies.
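
As a rough illustration of what a burden of this size implies, the short sketch below converts the reported numbers of independent tests into per-test significance thresholds. It assumes a plain Bonferroni correction and a family-wise error rate of 0.05; both are assumptions made here for illustration, not details taken from the paper, whose estimates are simulation-based. The function name and dictionary are likewise illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Independent-test counts as reported in the summary above.
TESTING_BURDEN = {
    "Europeans": 1_000_000,
    "Africans": 2_000_000,  # "twice that number"
}

ALPHA = 0.05  # assumed family-wise error rate, not stated in the summary

for population, n_tests in TESTING_BURDEN.items():
    threshold = bonferroni_threshold(ALPHA, n_tests)
    print(f"{population}: p < {threshold:.1e}")
```

Under these assumptions the European figure works out to 5 × 10⁻⁸ per test, and doubling the number of independent tests halves that threshold.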


Sebat J, Lakshmi B, Malhotra D, Troge J, Lese-Martin C, Walsh T, Yamrom B, Yoon S, Krasnitz A, Kendall J, Leotta A, Pai D, Zhang R, Lee YH, Hicks J, Spence SJ, Lee AT, Puura K, Lehtimäki T, Ledbetter D, Gregersen PK, Bregman J, Sutcliffe JS, Jobanputra V, Chung W, Warburton D, King MC, Skuse D, Geschwind DH, Gilliam TC, Ye K, Wigler M. Strong association of de novo copy number mutations with autism. Science. 2007 Apr 20;316(5823):445-9. 

Summary: The work presented in this paper represents the first large study to conclusively associate de novo copy number variations with autism spectrum disorders (ASD), paving the way for several follow-up efforts (some led by MAGNet investigators) that have further refined our understanding of the contribution of de novo mutations to the etiology of ASD. The authors performed comparative genomic hybridization (CGH) on the genomic DNA of patients and unaffected subjects to detect copy number variants not present in their respective parents. Candidate genomic regions were validated by higher-resolution CGH, paternity testing, cytogenetics, fluorescence in situ hybridization, and microsatellite genotyping. Confirmed de novo CNVs were significantly associated with autism (P = 0.0005). Such CNVs were identified in 12 of 118 (10%) patients with sporadic autism, in 2 of 77 (3%) patients with an affected first-degree relative, and in 2 of 196 (1%) controls. Most de novo CNVs were smaller than microscopic resolution. Affected genomic regions were highly heterogeneous and included mutations of single genes. These findings establish de novo germline mutation as a more significant risk factor for ASD than previously recognized.


Joshi R, Passner JM, Rohs R, Jain R, Sosinsky A, Crickmore MA, Jacob V, Aggarwal AK, Honig B, Mann RS. Functional specificity of a Hox protein mediated by the recognition of minor groove structure. Cell. 2007 Nov 2;131(3):530-43.

Summary: This paper was the first in a series that elucidated a new mode of protein-DNA recognition based on DNA shape. Using a combination of structure determination, computational analysis, and in vitro and in vivo assays, we showed that Hox proteins discriminate between specific and generic binding sites via residues located in unstructured regions that insert into the minor groove, but only when presented with the correct DNA sequence. Our results suggested that these residues, which are conserved in a paralog-specific manner, confer specificity by recognizing a sequence-dependent DNA structure instead of reading a specific DNA sequence.


Palomero T, Lim WK, Odom DT, Sulis ML, Real PJ, Margolin A, Barnes KC, O'Neil J, Neuberg D, Weng AP, Aster JC, Sigaux F, Soulier J, Look AT, Young RA, Califano A, Ferrando AA. NOTCH1 directly regulates c-MYC and activates a feed-forward-loop transcriptional network promoting leukemic cell growth. Proc Natl Acad Sci USA. 2006 Nov 28;103(48):18261-6.

Summary: The NOTCH1 signaling pathway directly links extracellular signals with transcriptional responses in the cell nucleus and plays a critical role during T cell development and in the pathogenesis of over 50% of human T cell lymphoblastic leukemia (T-ALL) cases. However, little is known about the transcriptional programs activated by NOTCH1. The authors use an integrative systems biology approach to show that NOTCH1 controls a feed-forward-loop transcriptional network that promotes cell growth. Inhibition of NOTCH1 signaling in T-ALL cells led to a reduction in cell size and elicited a gene expression signature dominated by down-regulated biosynthetic pathway genes. By integrating gene expression array and ChIP-on-chip data, the authors show that NOTCH1 directly activates multiple biosynthetic routes and induces c-MYC gene expression. Reverse engineering of regulatory networks from expression profiles using the ARACNe algorithm showed that NOTCH1 and c-MYC govern two directly interconnected transcriptional programs containing common target genes that together regulate the growth of primary T-ALL cells. These results identify c-MYC as an essential mediator of NOTCH1 signaling and integrate NOTCH1 activation with oncogenic signaling pathways upstream of c-MYC.


Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A. ARACNe: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics. 2006 Mar 20;7 Suppl 1:S7.

Summary: This paper describes ARACNe, an information-theoretic method for identifying transcriptional interactions between gene products using microarray expression profile data. ARACNe has been widely adopted by the systems biology community and has been used to construct gene interaction networks that have enabled many novel discoveries. The authors prove that ARACNe reconstructs regulatory networks exactly (asymptotically) if the effect of loops in network topology is negligible, and they show that the algorithm works well in practice, even in the presence of numerous loops and complex topologies. They assess ARACNe's ability to reconstruct transcriptional regulatory networks using both a realistic synthetic dataset and a microarray dataset from human B cells. On synthetic datasets ARACNe achieves very low error rates and outperforms established methods, such as Relevance Networks and Bayesian Networks. Application to the deconvolution of genetic networks in human B cells demonstrates ARACNe's ability to infer validated transcriptional targets of the cMYC proto-oncogene. The authors also study the effects of misestimation of mutual information on network reconstruction, and show that algorithms based on mutual information ranking are more resilient to estimation errors.
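
The summary does not spell out the algorithm's steps, so the sketch below is only a simplified illustration of the general idea: estimate pairwise mutual information between expression profiles, keep pairs above a cutoff, and then use the data processing inequality to drop the weakest edge in each fully connected triplet as a likely indirect interaction. It is not the authors' implementation. The histogram estimator, the fixed cutoff, the bin count, and all function names are assumptions chosen for brevity.

```python
import itertools

import numpy as np


def mutual_information(x, y, bins=8):
    """Plug-in (histogram) estimate of mutual information in nats.

    A simple stand-in estimator, chosen here only for illustration.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)  # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)  # marginal of y, shape (1, bins)
    nonzero = pxy > 0
    return float((pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])).sum())


def mi_network(expr, threshold=0.05, bins=8):
    """Build a pruned mutual-information network.

    expr: array of shape (genes, samples).
    Returns {(i, j): mi} for the edges that survive pruning (i < j).
    """
    n_genes = expr.shape[0]
    mi = np.zeros((n_genes, n_genes))
    for i, j in itertools.combinations(range(n_genes), 2):
        mi[i, j] = mi[j, i] = mutual_information(expr[i], expr[j], bins)

    # Step 1: keep only pairs whose MI exceeds the (assumed) fixed cutoff.
    edges = {
        (i, j)
        for i, j in itertools.combinations(range(n_genes), 2)
        if mi[i, j] > threshold
    }

    # Step 2: data processing inequality. In each fully connected triplet,
    # mark the weakest edge for removal; all triplets are checked against
    # the unpruned edge set, so the result does not depend on visit order.
    to_remove = set()
    for i, j, k in itertools.combinations(range(n_genes), 3):
        triplet = [(i, j), (i, k), (j, k)]
        if all(edge in edges for edge in triplet):
            to_remove.add(min(triplet, key=lambda edge: mi[edge]))

    return {edge: mi[edge] for edge in edges - to_remove}


if __name__ == "__main__":
    # Synthetic chain a -> b -> c: a and c are related only through b.
    rng = np.random.default_rng(0)
    a = rng.normal(size=2000)
    b = a + 0.5 * rng.normal(size=2000)
    c = b + 0.5 * rng.normal(size=2000)
    network = mi_network(np.vstack([a, b, c]))
    for (i, j), value in sorted(network.items()):
        print(f"gene {i} -- gene {j}: MI = {value:.3f}")
```

On the synthetic chain, the indirect pair (gene 0, gene 2) carries less information than either direct pair, so the pruning step is expected to remove it and leave only the two direct edges.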