Department of Systems Biology bioengineer Harris Wang describes the goals of the Human Genome Project - Write (HGP-write), an international initiative to develop new technologies for synthesizing very large genomes from scratch. 

In June 2016, a consortium of synthetic biologists, industry leaders, ethicists, and others published a proposal in Science calling for a coordinated effort to synthesize large genomes, including a complete human genome in cell lines. The organizers of the project, called GP-write (for work in model organisms and plants) or sometimes HGP-write (for work in human cell lines), envision it as a successor to the Human Genome Project (retroactively termed HGP-read), which 25 years ago promoted rapid advances in DNA sequencing technology. As the ability to read the genome became more efficient and less expensive, it in turn enabled a revolution in how we study biology and attempt to improve human health. Now, by coordinating the development of new technologies for writing DNA on a whole-genome scale, GP-write aims to have a similarly transformative impact.

Among the paper’s authors were Virginia Cornish and Harris Wang, two members of the Columbia University Department of Systems Biology whose contributions to the field of engineering biology have in part made the idea of writing large-scale DNA sequences imaginable. We spoke with them to learn more about what GP-write hopes to accomplish, its potential benefits, and how the effort is evolving.


In a recent paper published in Molecular Systems Biology, Kam Leong describes a two-compartment microfluidic device that consists of a chamber within which is embedded a "microbial swarmbot" that is isolated by a permeable hydrogel shell. In collaboration with Lingchong You (Duke University), Leong used the device to regulate the dynamics of a population of bacteria containing a genetically engineered switch that reacts to population size. The scale bar in panel 1 represents a length of 250 micrometers.

With a restless curiosity, Kam Leong always seems to be on the lookout for new problems to solve. A versatile biomedical engineer originally trained in chemical engineering, he has developed an impressive array of innovative nanotechnologies that have opened up new opportunities in biomedical research and drug delivery. 

The most widely known of his designs resulted from his work as a postdoc in the laboratory of MIT’s Robert Langer. While there, Leong played a critical role in the development of Gliadel, a controlled-release therapy that uses biodegradable polymer particles to deliver an anticancer drug to a brain tumor site following surgery. Since then his name has appeared on more than 70 patents covering a wide range of inventions — from microfluidics technologies, to scaffolds for growing organic tissues, to nanoscale fluorescent probes, to a method that uses nanoparticles instead of viruses for the oral delivery of gene therapies. These achievements have gained him widespread respect within the engineering community, as evidenced by his 2013 election to both the National Academy of Engineering and the National Academy of Inventors.

Dr. Leong joined Columbia University in 2014. Although his primary affiliation is with the Department of Biomedical Engineering, he was also attracted by the chance to assume an interdisciplinary faculty appointment in the Department of Systems Biology. Since his arrival he has been developing collaborations with several Systems Biology faculty members as well as other scientists at Columbia University Medical Center, and plans are underway for his lab to move into the Lasker Biomedical Research Building to better facilitate interactions with systems biology and clinical investigators. In the following interview, Leong describes why opportunities to interact with scientists in other disciplines are so important to his work, and how the kinds of technologies he has developed could be relevant for systems biology research, as well as for improving treatment of human diseases.


By using statistical methods to compare genomic data across species, such as chimpanzees and humans, the Przeworski Lab is gaining insights into the origins of genetic variation and adaptation. (Photo: Common chimpanzee at the Leipzig Zoo. Thomas Lersch, Wikimedia Commons.)

Launched approximately 100 years ago, population genetics is a subfield within evolutionary biology that seeks to explain how processes such as mutation, natural selection, and random genetic drift lead to genetic variation within and between species. Population genetics was originally born from the convergence of Mendelian genetics and biostatistics, but with the recent availability of genome sequencing data and high-performance computing technologies, it has bloomed into a mature computational science that is providing increasingly high-resolution models of the processes that drive evolution.

Molly Przeworski, a professor in the Columbia University Departments of Biological Sciences and Systems Biology, majored in mathematics at Princeton before beginning her PhD in evolutionary biology at the University of Chicago in the mid-1990s. While there, she realized that the availability of increasingly large data sets was changing population genetics, and has since been interested in using statistical approaches to investigate questions such as how genetic variation drives adaptation and why mutation rate and recombination rate differ among species. In the following interview, she describes how population genetics is itself evolving, as well as some of her laboratory’s contributions to the field.

Some factors in the exposome

The exposome incorporates factors such as the environment we inhabit, the food we eat, and the drugs we take.

Although genomics has dramatically improved our understanding of the molecular origins of certain human genetic diseases, our health is also influenced by exposures to our surrounding environment. Molecules found in food, air and water pollution, and prescription drugs, for example, interact with genetic, molecular, and physiologic features within our bodies in highly personalized ways. The nature of these relationships is important in determining who is immune to such exposures and who becomes sick because of them.

In the past, methods for studying this interface have been limited because of the complexity of the problem. After all, how could we possibly cross-reference a lifetime’s worth of exposures with individual genetic profiles in any kind of meaningful way? Recently, however, an explosion in the generation of quantitative data related to the environment, health, and genetics, along with new computational methods based in machine learning and bioinformatics, has made this landscape ripe for exploration.

At this year’s South by Southwest Interactive Festival in Austin, Texas, Department of Systems Biology Assistant Professor Nicholas Tatonetti and his collaborator Chirag Patel (Harvard Medical School) discussed the remarkable new opportunities that “big data” approaches offer for investigating this landscape. Driving Tatonetti and Patel’s approach is a concept called the exposome. First proposed by Christopher Wild (University of Leeds) in 2005, an exposome represents all of the environmental exposures a person has experienced during his or her life that could play a role in the onset of chronic diseases. Tatonetti and Patel’s presentation highlighted how investigation of the exposome has become tractable, as well as the important roles that individuals can play in supporting this effort.

In the following interview, Dr. Tatonetti discusses some of the approaches his team is using to explore the exposome, and how the project has evolved out of his previous research.

Harris Wang

As a graduate student in George Church’s lab at Harvard University, Harris Wang developed MAGE, a revolutionary tool for the field of synthetic biology that made it possible to introduce genomic mutations into E. coli cells in a highly specific and targeted way. Now an Assistant Professor in the Columbia University Department of Systems Biology, Dr. Wang recently published a paper in ACS Synthetic Biology that introduces an important advance in the MAGE technology. The new technique, called (MO)-MAGE, uses microarrays to engineer pools of oligonucleotides that, once amplified and integrated into a genome, can generate thousands or even millions of highly controlled mutations simultaneously. This new method offers a cost-effective way to design and produce large numbers of genomic variants, and provides an efficient platform for experimentally exploring genome-wide landscapes of mutations in bacteria and optimizing the organisms’ biochemical capabilities.

In the following interview, Dr. Wang explains the origins of the new technology, and discusses what he sees as the remarkable potential it holds for both basic biological research and industrial applications of synthetic biology.

Saeed Tavazoie

One of the defining features of systems biology has been its integration of computational and experimental methods for probing networks of molecular interactions. The research of Saeed Tavazoie, a professor in the Columbia University Department of Systems Biology, has been emblematic of this approach. After undergraduate studies in physics, he became fascinated by the processes that govern gene expression, particularly by how gene expression is regulated by information encoded in the genome. Since then, his multidisciplinary approach to research has generated important insights into the principles that orchestrate genome regulation, as well as a number of novel algorithms and technologies for exploring this complex landscape.

In this conversation, Dr. Tavazoie discusses his research in the areas of gene transcription, post-transcriptional regulation, and molecular evolution, as well as some innovative technologies and experimental methods his lab has developed.

Barry Honig

When Columbia University founded the Center for Multiscale Analysis of Genomic and Cellular Networks (MAGNet) in 2005, one of its goals was to integrate the methods of structural biology with those of systems biology. Considering protein structure within the context of computational models of cellular networks, researchers hoped, would not only improve the predictive value of their models by giving another layer of evidence, but also lead to new types of predictions that could not be made using other methods.

In a new paper published in Nature, Barry Honig, Andrea Califano, and other members of the Columbia Initiative in Systems Biology, including first authors Qiangfeng Cliff Zhang and Donald Petrey, report that this goal has now been realized. For the first time, the researchers have shown that information about protein structure can be used to make predictions about protein-protein interactions on a genome-wide scale. Their approach capitalizes on innovative techniques in computational structural biology that the Honig lab has developed over the last 15 years, culminating in the development of a new algorithm called Predicting Protein-Protein Interactions (PrePPI). In this interview, Honig describes the evolution of this new approach, and what it could mean for the future of systems biology.