PacBio Sequencing: Principle, Steps, Types, Applications

PacBio sequencing, also known as Single Molecule Real-Time (SMRT) DNA Sequencing, is a type of third-generation sequencing method developed by Pacific Biosciences that allows the real-time observation of DNA synthesis without the need for DNA amplification.

PacBio sequencing is a long-read sequencing technology that produces reads tens of kilobases long. These long, highly accurate reads are useful for studying complex genomic regions that are difficult to resolve with other sequencing technologies. Its genomic applications include de novo genome assembly, variant detection, and the study of epigenetic modifications.

PacBio has developed different sequencing systems over the years, starting with the PacBio RS and progressing to the more advanced Sequel systems.

Principle of PacBio Sequencing

PacBio sequencing works on the principle of observing DNA replication in real time using the SMRT Cell and fluorescence signals. This technology uses a closed, circular DNA template called a SMRTbell, which has hairpin adapters on both ends. The template is loaded onto a specialized sequencing chip called a SMRT Cell, which contains a large number of tiny wells called zero-mode waveguides (ZMWs), from hundreds of thousands to millions depending on the system.

Principle of PacBio Sequencing. Image Source: PacBio.

PacBio sequencing begins with the extraction of genetic material from the sample. The DNA is then prepared into a SMRTbell library by ligating hairpin adapters to both ends of double-stranded DNA fragments, forming a circular template. Primers and polymerases are added to this library, which is loaded onto the sequencing instrument that contains the SMRT Cell and its ZMWs. Ideally, a single template bound to a polymerase is immobilized in each ZMW. As the polymerase incorporates fluorescently labeled nucleotides into the growing DNA strand, light is emitted. This light emission is measured in real time, and the signals are converted into nucleotide sequences.

Process of PacBio Sequencing

The process of PacBio sequencing is divided into the following four steps.

1. Sample Preparation:

  • Sample preparation involves isolating DNA from the biological samples of interest.
  • The cells are lysed using chemical or mechanical methods, and the released DNA is purified to remove contaminants.

2. Library Construction:

  • The next step is library construction which involves several steps to prepare DNA for sequencing.
  • Initially, the extracted genomic DNA is sheared into fragments of the desired size, which then undergo end repair.
  • Then, hairpin adapters are ligated to both ends of the double-stranded DNA fragments. This creates closed, circular structures called SMRTbell templates, in which the double-stranded insert is capped by a single-stranded hairpin loop at each end.
  • Finally, the templates are purified and loaded onto the PacBio sequencing instrument.
PacBio Sequencing SMRTbell template
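
The topology of a SMRTbell template can be pictured with a small toy model. The sketch below is only an illustration: the insert and adapter sequences are made up, and the function names are hypothetical. It walks once around the closed circle in the 5'→3' direction, passing the top strand of the insert, one hairpin adapter, the bottom strand (the reverse complement of the top strand), and the second hairpin adapter.

```python
# Toy model of a SMRTbell template. All sequences here are invented for
# illustration and do not correspond to real PacBio adapters.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def smrtbell_circle(insert: str, adapter: str) -> str:
    """Linearized view of one lap around the closed template:
    top strand, hairpin adapter, bottom strand, hairpin adapter."""
    return insert + adapter + reverse_complement(insert) + adapter


insert = "ATGGCCTTAA"   # hypothetical insert (top strand)
adapter = "TCTCTC"      # hypothetical hairpin adapter sequence

circle = smrtbell_circle(insert, adapter)
print(circle)
print(len(circle))      # two insert strands plus two adapters
```

Because the circle has no free ends, a polymerase that completes one lap simply starts the next, which is what makes repeated passes over the same insert possible.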

3. Sequencing:

  • The sequencing reaction uses SMRTbell templates and the SMRT Cell.
  • First, the SMRTbell templates are loaded into the ZMWs of the SMRT Cell, ideally one template per ZMW. The DNA polymerase, bound to the template DNA, is fixed at the base of the ZMW.
  • Then, fluorescently labeled nucleotides are introduced into the ZMW. Laser light illuminates the base of the ZMW.
  • When the polymerase adds a complementary nucleotide to the growing DNA strand, the fluorescent label on that nucleotide emits a light signal. This signal is captured in real time by using a charge-coupled device (CCD) camera. The cycle repeats, revealing each base of the SMRTbell template in turn.
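
The idea of turning light pulses into bases can be sketched as follows. This is a heavily simplified illustration, not the instrument's actual base caller: the channel-to-base mapping, the `Pulse` record, and the `min_width_ms` noise threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical dye-channel-to-base mapping; real instruments and
# chemistries differ.
CHANNEL_TO_BASE = {0: "A", 1: "C", 2: "G", 3: "T"}


@dataclass
class Pulse:
    channel: int      # which dye colour was detected (hypothetical index)
    start_ms: float   # when the pulse began
    width_ms: float   # how long the pulse lasted


def call_bases(pulses, min_width_ms=5.0):
    """Order pulses by time, drop very brief ones (treated as noise here),
    and map each remaining pulse to a base."""
    ordered = sorted(pulses, key=lambda p: p.start_ms)
    kept = [p for p in ordered if p.width_ms >= min_width_ms]
    return "".join(CHANNEL_TO_BASE[p.channel] for p in kept)


pulses = [
    Pulse(2, 0.0, 40.0),
    Pulse(0, 120.0, 35.0),
    Pulse(3, 90.0, 2.0),    # too brief, ignored by the toy threshold
    Pulse(3, 300.0, 50.0),
    Pulse(1, 210.0, 45.0),
]

read = call_bases(pulses)
print(read)
```

A real base caller works on noisy intensity traces and uses statistical models rather than a fixed threshold; the sketch only shows the mapping from ordered pulses to a base string.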

4. Data Analysis:

  • After sequencing, the data is analyzed using different bioinformatics tools. 
  • The initial step in data analysis is base calling which converts raw data into nucleotide sequences. In PacBio sequencing, this involves translating fluorescence signals into base sequences. 
  • This results in a continuous long read (CLR). In Circular Consensus Sequencing (CCS) mode, the raw read is split at the adapter sequences into subreads, which are then combined into a consensus HiFi read.
  • Different tools are then used to check the quality of PacBio data. 
  • The next step is alignment and assembly into contiguous sequences. Once the assembly is complete, annotation is done to identify the structure and functions of genes within the assembled sequence. 
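
The step of dividing a raw read into subreads can be illustrated with a minimal sketch. It assumes error-free reads and an exact, made-up adapter sequence, so a plain string split is enough; production software has to locate adapters in noisy data, which this toy does not attempt. The function name `split_subreads` is hypothetical.

```python
def split_subreads(raw_read: str, adapter: str) -> list[str]:
    """Split a raw read at every exact adapter occurrence and
    discard empty pieces."""
    return [piece for piece in raw_read.split(adapter) if piece]


adapter = "TCTCTC"  # hypothetical adapter sequence

# A made-up raw read covering three full passes and one partial pass.
# Successive passes alternate between the two strands of the insert.
raw_read = (
    "ATGGCCTTAA" + adapter
    + "TTAAGGCCAT" + adapter
    + "ATGGCCTTAA" + adapter
    + "TTAAGG"
)

subreads = split_subreads(raw_read, adapter)
for number, subread in enumerate(subreads, start=1):
    print(number, subread)
```

The last subread is shorter than the others because the polymerase stopped partway through the final pass.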

Types of PacBio Sequencing Data (Sequencing Modes)

  1. Continuous Long Reads (CLRs) are the raw, unprocessed reads generated from single-pass sequencing of circular DNA templates in PacBio sequencing. They provide long read lengths but have higher error rates than consensus reads. CLRs are generated by using large DNA inserts to construct SMRTbell templates. Because the insert is so large, the polymerase typically makes only one pass through the template.
  2. Circular Consensus Sequencing (CCS) sequences the same circular DNA template multiple times. The subreads from these passes are combined into a highly accurate consensus sequence, known as a High-Fidelity (HiFi) read, which provides long-read data with high accuracy. CCS uses smaller DNA inserts in the SMRTbell templates, which allows the polymerase to make multiple passes around the template.
Circular Consensus Sequencing (CCS). Image Source: PacBio.
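
Why repeated passes improve accuracy can be shown with a toy consensus. The sketch below is not the algorithm used by PacBio software. It assumes subreads of equal length with substitution errors only (no insertions or deletions), flips every second subread back to the forward orientation, and then takes a simple majority vote at each position. All sequences are invented.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def toy_consensus(subreads: list[str]) -> str:
    """Majority vote per position. Assumes equal-length subreads that
    alternate strands, starting with the forward strand."""
    oriented = [
        s if i % 2 == 0 else reverse_complement(s)
        for i, s in enumerate(subreads)
    ]
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*oriented)
    )


subreads = [
    "ATGGCCTTAA",  # pass 1, forward strand
    "TTAAGGCCAT",  # pass 2, reverse strand
    "ATGGCGTTAA",  # pass 3, with one simulated substitution error
    "TTAAGGCCAT",  # pass 4, reverse strand
    "ATGGCCTTAA",  # pass 5, forward strand
]

print(toy_consensus(subreads))
```

The error in pass 3 is outvoted by the other four passes. Since errors in individual passes tend to fall at different positions, more passes give the vote more evidence at each base.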

PacBio Sequencing Systems

  • Different PacBio sequencing systems have been developed over the years, each with different features and improvements.
  • The first commercial system introduced by PacBio was the PacBio RS, released in 2011. An upgraded version, the PacBio RS II, followed in 2013 and produced longer continuous reads and better data quality.
  • The RS systems have been succeeded by the Sequel system, which offers improved features including advanced data processing and reduced computational cost.
  • The Sequel system was released in 2015. It features a redesigned SMRT Cell with a million ZMWs. The Sequel II system, released in 2019, further increased accuracy and throughput.
  • The Sequel IIe, released in 2020, added integrated computational resources to reduce data analysis time and cost. PacBio has since introduced newer platforms, such as the Revio system announced in 2022.

Advantages of PacBio Sequencing

  • PacBio sequencing can produce long reads, which are useful in the assembly of complex genomes.
  • PacBio sequencing is highly sensitive and can detect variants with a low frequency. This is useful when rare mutations need to be detected.
  • PacBio sequencing is effective in sequencing regions of both high and low GC content. Long reads are less prone to issues caused by complex or repetitive regions in the DNA.
  • Unlike many other sequencing methods, PacBio sequencing does not require PCR amplification of the DNA sample. This removes biases and errors that can be introduced during PCR amplification.
  • PacBio sequencing allows the direct detection of base modifications during sequencing without the need for chemical modifications. 

Limitations of PacBio Sequencing

  • PacBio sequencing has a lower throughput compared to short-read technologies which can be a limitation for large-scale projects.
  • The cost for PacBio sequencing including both the sequencing runs and computational tools is high.
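  • Raw single-pass long reads (CLRs) have a higher error rate than reads from second-generation sequencing technologies like Illumina. Reaching comparable accuracy requires multiple passes over the same template (CCS/HiFi reads), which in turn limits the insert size that can be used.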
  • PacBio sequencing requires large amounts of starting material for library preparation.
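
Accuracy comparisons between sequencing technologies are commonly expressed as Phred quality scores, where Q = -10 · log10(P_error). The short sketch below converts between accuracy and Phred score; the helper names are hypothetical, and the example values are plain arithmetic rather than measured performance of any platform.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Q = -10 * log10(error probability), where error = 1 - accuracy."""
    return -10 * math.log10(1 - accuracy)


def phred_to_accuracy(q: float) -> float:
    """Inverse conversion: accuracy = 1 - 10^(-Q/10)."""
    return 1 - 10 ** (-q / 10)


for accuracy in (0.90, 0.99, 0.999):
    print(f"accuracy {accuracy:.3f} -> Q{accuracy_to_phred(accuracy):.0f}")

for q in (10, 20, 30):
    print(f"Q{q} -> accuracy {phred_to_accuracy(q):.4f}")
```

Each increase of 10 in the Phred score corresponds to a tenfold drop in the error probability, so the gap between Q20 and Q30 is one error per 100 bases versus one per 1,000.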

Applications of PacBio Sequencing

  • PacBio sequencing is useful in the de novo genome assembly of complex genomes, particularly in regions with high complexity or repetitive sequences. This method can generate complete and accurate genome sequences without the need for reference sequences.
  • It is also used in whole transcriptome sequencing to study mRNA transcripts, which is necessary for understanding gene expression, alternative splicing, and non-coding RNAs.
  • PacBio sequencing can be used for the Iso-Seq method which sequences full-length transcripts to study gene isoforms.
  • It has applications in metagenomic sequencing where it provides detailed information on microbial communities in complex environmental samples.
  • PacBio sequencing is also used in closing gaps left by other sequencing technologies.
  • PacBio sequencing can detect various base modifications such as DNA methylation directly during the sequencing process. This is useful for understanding the process of gene regulation and disease mechanisms.
  • PacBio sequencing can be used to accurately detect the repeat sequences associated with genetic disorders, which are difficult to detect using conventional sequencing methods. This is important for diagnosing and studying these disorders.

About Author

Sanju Tamang

Sanju Tamang completed her Bachelor's (B.Tech) in Biotechnology from Kantipur Valley College, Lalitpur, Nepal. She is interested in genetics, microbiome, and their roles in human health. She is keen to learn more about biological technologies that improve human health and quality of life.
