
Building a Better Way to Decode the Genome

No Read Left Behind

Gradually eliminating low-affinity binding sites identified by NRLB (from left to right) results in a gradual reduction of gene expression (white). Credit: Mann Lab/Columbia’s Zuckerman Institute

As reported by the Zuckerman Mind Brain Behavior Institute, Columbia University researchers have developed a new computational method for deciphering DNA’s best-kept secrets, and the new algorithm may help find the links between genes and disease.

The researchers included Richard Mann, PhD, lead PI at the Zuckerman Institute, and collaborator Harmen Bussemaker, PhD, both faculty members of the Department of Systems Biology. They recently published their findings in the Proceedings of the National Academy of Sciences.

“The genomes of even simple organisms such as the fruit fly contain 120 million letters worth of DNA, much of which has yet to be decoded because the cues it provides have been too subtle for existing tools to pick up,” said Mann, who is also Higgins Professor of Biochemistry and Molecular Biophysics and senior author of the paper. “But our new algorithm lets us sweep through these millions of lines of genetic code and pick up even the faintest signals, resulting in a much more complete picture of what DNA encodes.”

A few years ago, the Mann and Bussemaker labs developed a genetic sequencing method called SELEX-seq to systematically characterize all Hox binding sites. Hox genes are known as the drivers of some of the body's earliest and most critical aspects of growth and differentiation. Still, SELEX-seq had limitations: it required the same DNA fragment to be sequenced over and over again. With each new round, more pieces of the puzzle were revealed, but information about the critical low-affinity binding sites remained hidden.

To address this challenge, the researchers developed a sophisticated new computer algorithm that was able to explain, for the first time, the behavior of all DNA sequences in the SELEX-seq experiment. They called this algorithm No Read Left Behind, or NRLB.

To learn more about the researchers' new computational framework, read the full article on the Zuckerman news page.