The generation of high-quality genome assemblies is crucial for their thorough characterisation and use as reliable reference sequences. However, the use of short reads produced by traditional sequencing technologies can make assembly a complex computational challenge, producing highly fragmented, incomplete sequences. As short reads cannot span important genomic regions such as repeats and structural variants, these may be incorrectly assembled. In contrast, nanopore technology enables the generation of long and ultra-long sequencing reads (current record >4 Mb) that can span complex genomic regions, delivering more complete, structurally accurate genomes.
Generate more contiguous genome assemblies using long sequencing reads
Genome assembly using short-read sequencing data is computationally demanding, and due to the lack of read overlap, tends to produce fragmented assemblies. Large structural variants, repeat sequences, and GC-rich regions are challenging to accurately assemble with short-read sequencing technology. Nanopore technology sequences the entire length of DNA (or RNA) presented to the pore, meaning read length is equal to fragment length (Figure 1). As a result, read length can be optimised by fragment size selection during sample preparation to suit specific genomes and assembly challenges. Longer read lengths allow genomes to be sequenced using fewer fragments, and the greater overlap between reads enables easier de novo genome assembly. The longest DNA fragment sequenced to date using nanopore technology is 4.2 Mb, which was achieved using the recently released Ultra-Long DNA Sequencing Kit. The long-read capability of nanopore sequencing not only enables accurate delineation of complex genomic regions such as repeats and structural variants, but also the sequencing of smaller microbial genomes in single reads — negating the need for assembly entirely (see poster).
Comprehensive genomic analysis, including direct detection of modified bases
A common metric for assessing genome assembly quality is contig N50: the contig length such that contigs of this length or longer together contain at least half of the nucleotides in the assembly. The use of long nanopore sequencing reads delivers significantly higher N50 values than provided by short-read sequencing technologies alone, indicating more complete, contiguous genome assemblies (Table 1). In addition, using Pore-C, a complete, end-to-end workflow for nanopore sequencing-based chromosome conformation capture, it is possible to further scaffold and correct large genome assemblies. Another advantage of long sequencing reads is the facility for haplotyping, enabling the resolution of compound heterozygosity and parental origin. Furthermore, nanopore sequencing does not require amplification, allowing the direct detection of base modifications (e.g. methylation) alongside the nucleotide sequence for even more comprehensive genomic analyses.
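To make the N50 definition concrete, the following is a minimal sketch of how the metric can be computed from a list of contig lengths. The function name and the toy lengths are illustrative assumptions, not part of any Oxford Nanopore tool or of the data in Table 1.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or read) lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first length where half of all bases are covered.
        if 2 * running >= total:
            return length


# Hypothetical toy assemblies with the same total size (300 bases).
contiguous = [80, 70, 50, 40, 30, 20, 10]
fragmented = [30] * 10

print(n50(contiguous))  # 70
print(n50(fragmented))  # 30
```

Both toy assemblies contain the same number of bases, but the one built from longer contigs has the higher N50, which is why the metric is used as a proxy for contiguity.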
Population-scale large genome studies
‘With real-time base calling, a DNA-to-de novo assembly could be achieved in less than 96 hours with little difficulty.’
Nanopore technology is uniquely scalable, enabling researchers to choose the most appropriate device to meet their specific project requirements. Offering the facility to run up to 48 high-capacity flow cells, the PromethION device is ideal for population-scale large genome studies. Using the PromethION in combination with their own publicly available analysis pipeline, Shafin et al. were able to sequence and assemble 11 human cell lines in just 9 days. In total, 2.3 Tb of sequence was generated, providing an average 63x depth of coverage per sample and a read N50 of 42 kb, including 6.5x coverage in ultra-long 100 kb+ reads.
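As a rough sanity check on figures like these, mean depth of coverage can be estimated as total sequenced bases divided by genome size. The sketch below assumes a haploid human genome size of about 3.1 Gb and an even split of output across samples; neither assumption comes from the study, so the result is only expected to land in the same range as the reported average of 63x.

```python
def mean_depth(total_bases, genome_size, samples=1):
    """Estimate mean depth of coverage per sample.

    Assumes output is split evenly across samples and that every
    sequenced base contributes to coverage.
    """
    return total_bases / samples / genome_size


TOTAL_BASES = 2.3e12        # 2.3 Tb, from the text
SAMPLES = 11                # 11 human cell lines, from the text
HUMAN_GENOME_SIZE = 3.1e9   # assumed haploid genome size (not from the text)

depth = mean_depth(TOTAL_BASES, HUMAN_GENOME_SIZE, samples=SAMPLES)
print(f"approx. {depth:.0f}x per sample")
```

The same arithmetic can be run in reverse when planning a project: multiplying a target depth by genome size and sample count gives the total yield required.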
How do I assemble genomes using nanopore sequencing?
Oxford Nanopore provides a range of sequencing devices suitable for any sized genome assembly project, from small individual microbial genomes to high-throughput, population-scale sequencing of large genomes.
For best practice advice on genome assembly, view our whole-genome sequencing Getting Started guides for small or large genomes. These guides provide a step-by-step overview of the entire sequencing workflow — from selecting the right nanopore sequencing device through to sample preparation, sequencing, and data analysis. Our best practice workflows for human and microbial genome assembly provide structured, recommended workflows for assembling genomes using nanopore sequencing technology.
Read our simple, end-to-end workflow for microbial genome assembly from an isolate.