Nucleic Acid and Protein Sequence Analysis and Annotation




A software package for comprehensive and streamlined analysis of CLIP data, including peak calling and identification of precise protein-RNA crosslink sites.


No Read Left Behind

Gradually eliminating low-affinity binding sites identified by NRLB (from left to right) results in a gradual reduction of gene expression (white); Credit: Mann Lab/Columbia’s Zuckerman Institute

As reported by the Zuckerman Mind Brain Behavior Institute, Columbia University researchers have developed a new computational method for deciphering DNA’s best-kept secrets, and this new algorithm may help find the links between genes and disease.

The researchers included Richard Mann, PhD, lead PI at the Zuckerman Institute, and his collaborator Harmen Bussemaker, PhD, both faculty members of the Department of Systems Biology. They recently published their findings in the Proceedings of the National Academy of Sciences.

“The genomes of even simple organisms such as the fruit fly contain 120 million letters worth of DNA, much of which has yet to be decoded because the cues it provides have been too subtle for existing tools to pick up,” said Mann, who is also Higgins Professor of Biochemistry and Molecular Biophysics and senior author of the paper. “But our new algorithm lets us sweep through these millions of lines of genetic code and pick up even the faintest signals, resulting in a much more complete picture of what DNA encodes.”

A few years ago, the Mann and Bussemaker labs developed a sequencing method called SELEX-seq to systematically characterize all Hox binding sites. Hox genes are known as the drivers of some of the body's earliest and most critical aspects of growth and differentiation. Still, SELEX-seq had limitations: it required the same DNA fragment to be sequenced over and over again. With each new round, more pieces of the puzzle were revealed, but information about the critical low-affinity binding sites remained hidden.



A program for mapping RNA-seq reads, including de novo identification of exon junctions.




ADOMETA (Adoption of Orphan Metabolic Activities) is a bioinformatics resource designed to predict genes for orphan metabolic activities — known biochemical activities not currently assigned to genes in some or all organisms. ADOMETA returns a ranked list of genes likely to catalyze a given metabolic activity in a selected organism.