High-quality genome assemblies are essential as reliable reference sequences. However, the short reads produced by traditional sequencing technologies lead to highly fragmented, incomplete assemblies. Short reads cannot span important genomic regions such as repeats and structural variants, so these regions are often assembled incorrectly. In contrast, nanopore technology can deliver long and ultra-long sequencing reads (current record >4 Mb) that can span complex genomic regions, enabling the generation of highly contiguous genome assemblies.
Generate more contiguous genome assemblies using long sequencing reads
Large structural variants, repeat sequences, and GC-rich regions are challenging to accurately characterise with short-read sequencing technology, and the resulting genome assemblies tend to be fragmented due to the lack of read overlap. Nanopore technology routinely generates sequencing reads that are tens of kilobases in length, and is also capable of sequencing ultra-long libraries (i.e. read N50 of >100 kb; Figure 1). The greater overlap between ultra-long reads enables easier de novo genome assembly. The longest DNA fragment sequenced to date using nanopore technology is 4.2 Mb, which was achieved using the Ultra-Long DNA Sequencing Kit. The long-read capability of nanopore sequencing not only enables accurate delineation of complex genomic regions such as repeats and structural variants, but also the sequencing of smaller microbial genomes in single reads — negating the need for assembly entirely (see poster).
Comprehensive genomic analysis, including direct detection of modified bases
A common metric for assessing genome assembly quality is contig N50: the contig length such that contigs of this length or longer together contain at least half of the nucleotides in the assembly. The use of long nanopore sequencing reads delivers significantly higher N50 values than those provided by short-read sequencing technologies, enabling the generation of more complete and more contiguous genome assemblies (Table 1). In addition, large genome assemblies can be further scaffolded and corrected using Pore-C, a complete, end-to-end workflow for nanopore sequencing-based chromosome conformation capture. Long sequencing reads also simplify haplotyping, enabling the resolution of compound heterozygosity and parental origin. Furthermore, nanopore sequencing does not require amplification, allowing the direct detection of base modifications (e.g. methylation) alongside the nucleotide sequence for even more comprehensive genomic analyses.
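To make the N50 definition concrete, here is a minimal Python sketch of the calculation. The function name `n50` and the contig lengths are hypothetical and chosen purely for illustration; they are not taken from Table 1 or from any Oxford Nanopore tool.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bases)."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest, accumulating bases
    # until at least half of the total has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Two toy assemblies of the same 100 kb genome (hypothetical lengths).
fragmented = [10_000] * 10              # ten equal, short contigs
contiguous = [60_000, 30_000, 10_000]   # three contigs, one dominant

print(n50(fragmented))
print(n50(contiguous))
```

Both toy assemblies contain the same number of bases, but the one with fewer, longer contigs has the higher N50. The same function can be applied to read lengths to obtain a read N50.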
Delivering improved crop reference genomes
‘using a plant-trained basecalling model, nanopore-only reference crop genomes can be obtained with outstanding contiguity and accuracy, reducing the requirements for multiple technologies to generate reference-quality genomes’
Alexander Wittenberg, KeyGene, Netherlands
Scientists at KeyGene in the Netherlands are at the forefront of technology innovation for crop improvement. A significant focus is crop improvement through breeding for traits such as pathogen resistance, extended shelf life, and improved taste and colour. The insights obtained using a high-quality reference genome enable better and faster selection of important breeding traits, allowing new plant varieties to be brought to market faster. Using the PromethION 24 device and a plant-trained basecalling model, the KeyGene team generated the most contiguous lettuce genome ever assembled. Using nanopore sequencing alone, the genome was captured in just 159 contigs. This contrasts with 153,952 contigs for the 2017 short-read-based reference genome, and 1,541 contigs for a genome assembled using an alternative long-read-capable sequencing technology. With their STL assembler, the team assembled the nanopore-only genome within 30 hours, and consensus accuracies were shown to be on par with those obtained using alternative technologies.
How do I assemble genomes using nanopore sequencing?
Oxford Nanopore provides a range of sequencing devices suitable for genome assembly projects of any size, from small individual microbial genomes to high-throughput, population-scale sequencing of large genomes.
For best practice advice on genome assembly, view our whole-genome sequencing Getting Started guides for small or large genomes. These guides provide a step-by-step overview of the entire sequencing workflow — from selecting the right nanopore sequencing device through to sample preparation, sequencing, and data analysis. Our best practice workflows for human and microbial genome assembly provide structured, recommended workflows for assembling genomes using nanopore sequencing technology.
Read our simple, end-to-end workflow for microbial genome assembly from an isolate.