Although next-generation sequencing follows certain standard protocols, it can also be customized to address factors such as your research goals and budget. When planning a sequencing project, the key parameters to consider are the method of library preparation, depth of coverage, and sequencer.

Library Preparation Method

Sequencing library preparation converts your nucleic acids of interest into DNA that can be read by the Illumina sequencer. If you are starting with DNA, this means fragmenting the DNA to the correct size, ligating adapters onto the fragment ends so that they can attach to the Illumina flowcell, and amplifying the product. If you are starting with RNA, this means removing ribosomal RNA (usually), converting the mRNA into cDNA, then proceeding as above.

It is critical to choose the library preparation method best suited for answering your research question. Whole genome sequencing and whole exome sequencing have different benefits and drawbacks. Different ribosomal RNA removal techniques can help elucidate transcriptional changes or are better suited to certain sample types. You can find answers to questions about project design on our FAQ page, or contact us at genome@columbia.edu.

Depth of Coverage

The depth of coverage is the number of times that a specific genomic site is sequenced during a sequencing run. In exome sequencing, for example, the target might be 60X coverage, meaning that, on average, each targeted base is sequenced 60 times. This does not mean that every targeted base is sequenced exactly 60 times; some segments may be read 100 or more times, while others might be read only once or twice, or not at all. The more times a base is sequenced, the more confidently it can be called.
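The arithmetic behind an average coverage figure can be sketched in a few lines of Python. This is a rough planning aid, not a description of our pipeline: the function names, the 50 Mb target size, the 100 bp read length, and the `usable_fraction` parameter are all illustrative assumptions, and `num_reads` counts individual reads (a read pair counts as two). The Poisson estimate assumes reads land uniformly at random, so real capture data will usually be more uneven than it suggests.

```python
import math


def mean_coverage(num_reads: int, read_length: int, target_size: int,
                  usable_fraction: float = 1.0) -> float:
    """Average depth = usable sequenced bases / number of targeted bases."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return num_reads * read_length * usable_fraction / target_size


def reads_needed(target_coverage: float, read_length: int, target_size: int,
                 usable_fraction: float = 1.0) -> int:
    """Number of reads required to reach a given average depth."""
    if read_length <= 0 or usable_fraction <= 0:
        raise ValueError("read_length and usable_fraction must be positive")
    return math.ceil(target_coverage * target_size
                     / (read_length * usable_fraction))


def fraction_covered(mean_depth: float, min_depth: int) -> float:
    """Fraction of bases read at least min_depth times (idealized Poisson model)."""
    below = sum(math.exp(-mean_depth) * mean_depth ** k / math.factorial(k)
                for k in range(min_depth))
    return 1.0 - below


# Hypothetical example: 60X over an assumed 50 Mb target with 100 bp reads.
print(reads_needed(60, 100, 50_000_000))             # 30000000
print(mean_coverage(30_000_000, 100, 50_000_000))    # 60.0
# At a low average depth of 2X, a noticeable share of bases is never read.
print(round(fraction_covered(2.0, 1), 3))            # 0.865
```

Setting `usable_fraction` below 1.0 is one simple way to account for reads that are duplicates or fall outside the targeted regions; the right value depends on the library and is not something this sketch can supply.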

For RNA-Seq, we generally recommend a minimum of 20 million read pairs per sample for standard differential expression projects. For projects that require greater sensitivity, such as studies of alternative splicing, allele-specific expression, or expression of low-abundance transcripts, 40 million to 80 million read pairs per sample are recommended.
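To turn these per-sample recommendations into a project total, a small helper like the following can be used. It is a minimal sketch: the dictionary keys and function name are hypothetical labels chosen for illustration, and the only figures taken from the text are the 20 million minimum and the 40 million to 80 million range.

```python
# Recommended read pairs per sample, as (minimum, maximum), from the text above.
# The key names are illustrative labels, not official service names.
RECOMMENDED_PAIRS = {
    "differential_expression": (20_000_000, 20_000_000),
    "high_sensitivity": (40_000_000, 80_000_000),
}


def project_pairs(num_samples: int, application: str) -> tuple[int, int]:
    """Return the (low, high) total read pairs to budget for a project."""
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    low, high = RECOMMENDED_PAIRS[application]
    return num_samples * low, num_samples * high


# Hypothetical 12-sample projects.
print(project_pairs(12, "differential_expression"))  # (240000000, 240000000)
print(project_pairs(12, "high_sensitivity"))         # (480000000, 960000000)
```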

Sequencer

We are currently operating three models of Illumina sequencer: the NovaSeq 6000, three NextSeq 500/550s, and the MiSeq.

The NovaSeq is our highest-throughput sequencer. It is capable of generating 10 billion reads per flowcell and can run two flowcells in parallel. This economy of scale allows us to batch projects and gives the NovaSeq the lowest cost per base. Our standard projects are sequenced on this instrument unless otherwise agreed upon with the researcher. Please see our pricing page for more information about our standard offerings.

The NextSeq is a mid-throughput sequencer, with an output of 130 million or 400 million reads per run. The MiSeq has the lowest throughput of the three, but is capable of the longest reads. The NextSeqs and the MiSeq are available 24/7 for self-service to Columbia researchers. Please contact us at genome@columbia.edu to inquire about using these instruments for your self-prepared libraries.
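When deciding which instrument fits a project, it can help to estimate how many samples a single run can hold. The sketch below uses the run outputs quoted above and makes two assumptions you should check before relying on it: that each read counted in a run's output yields one read pair in paired-end mode, and that reads are split evenly across samples. The capacity labels, function names, and the `yield_fraction` parameter are hypothetical.

```python
import math

# Run outputs quoted in the text above; the key names are illustrative labels.
RUN_CAPACITY = {
    "novaseq_flowcell": 10_000_000_000,
    "nextseq_400M": 400_000_000,
    "nextseq_130M": 130_000_000,
}


def samples_per_run(capacity: int, pairs_per_sample: int,
                    yield_fraction: float = 1.0) -> int:
    """Whole samples that fit in one run, assuming an even split of reads."""
    if pairs_per_sample <= 0:
        raise ValueError("pairs_per_sample must be positive")
    return int(capacity * yield_fraction) // pairs_per_sample


def runs_needed(num_samples: int, capacity: int, pairs_per_sample: int,
                yield_fraction: float = 1.0) -> int:
    """Runs required to sequence every sample at the requested depth."""
    per_run = samples_per_run(capacity, pairs_per_sample, yield_fraction)
    if per_run == 0:
        raise ValueError("one sample does not fit in a single run")
    return math.ceil(num_samples / per_run)


# Samples at 20 million pairs each that fit on each run type.
for name, capacity in RUN_CAPACITY.items():
    print(name, samples_per_run(capacity, 20_000_000))
# novaseq_flowcell 500
# nextseq_400M 20
# nextseq_130M 6

# Hypothetical 24-sample project on the larger NextSeq run.
print(runs_needed(24, RUN_CAPACITY["nextseq_400M"], 20_000_000))  # 2
```

Lowering `yield_fraction` leaves headroom for reads lost to filtering or uneven pooling, which tends to matter more as a run approaches its nominal capacity.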