Mate-Pair Sequencing: Principle, Steps, Applications

Mate-pair sequencing is a next-generation sequencing method that creates long-insert paired-end DNA libraries.

In this method, the two ends of a long DNA fragment, which lie several kilobases apart in the genome, are sequenced as a pair. Because of the way the library is built, the two reads map in the opposite relative orientation to standard paired-end reads, facing away from each other rather than toward each other. Unlike standard short-insert paired-end sequencing, in which the two reads come from the ends of a fragment only a few hundred bases long, mate-pair sequencing links regions that are far apart. This long-range information spans large gaps in the sequence and is especially useful for genome assembly and for detecting structural variation. It can be combined with short-insert paired-end sequencing to maximize coverage and resolution across the genome.

Principle of Mate-Pair Sequencing

The principle of mate-pair sequencing is to build libraries from long DNA fragments, sequence only the ends of each fragment, and use the approximate distance between those ends to map the large region that separates them. In this way, reads that are individually short still carry long-range genomic information.

Mate Pair Sequencing. Image Source (Right): ecSeq Bioinformatics.

Because both ends of each long-insert fragment are sequenced, every read pair provides the sequence of the two ends together with their approximate spacing and relative orientation. This helps bridge large gaps that conventional short-insert sequencing would miss and supports accurate identification of complex structural variations. It is particularly useful for identifying rearrangements such as inversions, duplications, deletions, and translocations, which short-read sequencing often misses.

The process starts by fragmenting the DNA into long pieces and labeling their ends with biotin. The labeled fragments are circularized, which brings the two ends of each fragment next to each other at a single junction. The circular DNA is then sheared into smaller segments that are easier to sequence. The segments that carry the labeled junction, and therefore both ends of the original long fragment, are captured, and paired-end sequencing is performed on them.
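
The effect of circularization on read placement can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the genome is a random string, the function name `mate_pair_reads` and all sizes are hypothetical, shearing is reduced to cutting one piece around the junction, and `tail_len` and `head_len` are assumed to be at least `read_len`.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def mate_pair_reads(genome, start, end, tail_len, head_len, read_len):
    """Return (read1, read2) for the long fragment genome[start:end].

    Hypothetical helper: models only the piece of the circle that
    spans the junction between the fragment's end and its start.
    """
    fragment = genome[start:end]
    # Circularization joins the fragment's end to its start, so the
    # junction piece holds the fragment's tail followed by its head.
    junction_piece = fragment[-tail_len:] + fragment[:head_len]
    # Paired-end sequencing reads inward from both ends of the piece.
    read1 = junction_piece[:read_len]            # inside the tail
    read2 = revcomp(junction_piece[-read_len:])  # inside the head
    return read1, read2

random.seed(0)
genome = "".join(random.choice("ACGT") for _ in range(10_000))
read1, read2 = mate_pair_reads(genome, start=1_000, end=6_000,
                               tail_len=200, head_len=200, read_len=50)

pos1 = genome.find(read1)           # forward-strand match near the fragment's end
pos2 = genome.find(revcomp(read2))  # reverse-strand match near the fragment's start
print(pos1, pos2)
```

In this toy model, read 1 matches the forward strand at `end - tail_len` and read 2 matches the reverse strand at `start + head_len - read_len`. The two reads sit several kilobases apart and point away from each other, even though the sequenced piece is only a few hundred bases long.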

This method provides bridging (physical) coverage: each read pair spans the region between the ends of a long fragment, so the distance and orientation between the two reads can be checked without sequencing the region itself. As a result, a relatively small amount of sequencing can reveal large structural changes.
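
A short calculation shows why a small amount of sequencing goes a long way here. The sketch below compares sequence coverage (bases actually read) with physical coverage (bases spanned by inserts). The function names and every number are illustrative assumptions, not values from a specific protocol.

```python
def sequence_coverage(n_pairs, read_len, genome_size):
    """Average number of times each base is actually read."""
    return n_pairs * 2 * read_len / genome_size

def physical_coverage(n_pairs, insert_size, genome_size):
    """Average number of inserts that span each base."""
    return n_pairs * insert_size / genome_size

# Illustrative values only (assumptions for this example).
n_pairs = 1_000_000
genome_size = 100_000_000  # 100 Mb genome

seq_cov = sequence_coverage(n_pairs, read_len=100, genome_size=genome_size)
phys_cov = physical_coverage(n_pairs, insert_size=5_000, genome_size=genome_size)
print(seq_cov, phys_cov)  # 2.0 50.0
```

With these assumed numbers, each base is read only about twice but is spanned by about 50 inserts, which is what makes large rearrangements visible at low sequencing depth.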

Process/Steps of Mate-Pair Sequencing

  1. DNA Fragmentation: The process of mate-pair sequencing begins with fragmenting the target DNA into large pieces using either mechanical shearing or restriction digestion. 
  2. End Labeling and Circularization: The fragments are size-selected and end-labeled with biotin. The labeled fragments are circularized, which brings the two ends of each DNA fragment close together. Non-circularized DNA is digested to remove unwanted material, leaving only circular DNA.
  3. Fragmentation of Circular DNA: The circularized DNA is sheared into smaller pieces to facilitate capture and sequencing. Because circularization joined the two ends of each original fragment, the piece that contains the biotin-labeled junction carries both ends side by side; pieces from the rest of the circle carry no label.
  4. Affinity Purification/Enrichment: The biotin-labeled pieces are isolated with streptavidin-coated magnetic beads. This enrichment removes unlabeled pieces so that the sequenced fragments are true mate pairs.
  5. Adapter Ligation: Sequence adapters are ligated to the purified fragments. These adapters help the fragments attach to the sequencing flow cell.
  6. Sequencing: The final mate-pair library consists of short DNA fragments, each containing two segments that were originally several kilobases apart. The library is ready for paired-end cluster generation and sequencing on a next-generation sequencing (NGS) platform such as Illumina, which yields a read from each end of every mate pair.
  7. Data Analysis: After sequencing, the reads are aligned against a reference genome to determine the genomic location of each end. The distance and orientation of each pair are then analyzed to close gaps and to identify structural variants and other complex genomic features.
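
The data-analysis step can be sketched as a rule-based classifier over aligned read pairs. This is a simplified heuristic, not the method of any particular tool. The `AlignedPair` record, the labels, and the default `expected_span` and `tolerance` values are assumptions made for illustration, and the expected orientation is taken to be outward-facing (left read on the reverse strand, right read on the forward strand).

```python
from dataclasses import dataclass

@dataclass
class AlignedPair:
    """Hypothetical record for one aligned mate pair."""
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str  # "+" or "-"

def classify_pair(pair, expected_span=5_000, tolerance=1_500):
    """Label a pair as concordant or as a structural-variant candidate."""
    if pair.chrom1 != pair.chrom2:
        return "translocation_candidate"
    if pair.strand1 == pair.strand2:
        return "inversion_candidate"
    # Order the two reads by genomic position.
    (left_pos, left_strand), (right_pos, _) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    # Assumed mate-pair layout: reads face away from each other.
    if left_strand != "-":
        return "unexpected_orientation"
    span = right_pos - left_pos
    if span > expected_span + tolerance:
        return "deletion_candidate"
    if span < expected_span - tolerance:
        return "insertion_candidate"
    return "concordant"

pairs = [
    AlignedPair("chr1", 1_000, "-", "chr1", 6_000, "+"),
    AlignedPair("chr1", 1_000, "-", "chr1", 20_000, "+"),
    AlignedPair("chr1", 1_000, "-", "chr2", 6_000, "+"),
    AlignedPair("chr1", 1_000, "+", "chr1", 6_000, "+"),
]
labels = [classify_pair(p) for p in pairs]
print(labels)
```

A single discordant pair is weak evidence. In practice a candidate would normally be reported only when several independent pairs support the same event, which also helps limit false positives.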

Advantages of Mate-Pair Sequencing

  • Mate-pair sequencing uses much longer inserts than standard short-insert paired-end sequencing, which helps assemble complex genomes more accurately.
  • Mate-pair sequencing effectively identifies large structural changes, which is crucial for studying complex genomes.
  • Although it is more expensive than standard sequencing, it can be more cost-effective for complex genomes, where other methods would need far more sequencing to obtain the same long-range information.
  • It is useful in fields such as cancer genomics, evolutionary biology, and de novo assembly, making it a versatile tool.
  • Read pairs that span several kilobases provide the long-range information needed to order and orient sequences during mapping.

Limitations of Mate-Pair Sequencing

  • Mate-pair sequencing can be more expensive than other methods due to the additional sample preparation and enrichment steps involved.
  • Preparing mate-pair libraries is complex and demanding, requiring highly skilled technicians and specific reagents.
  • The data can be harder to analyze and require sophisticated bioinformatics tools and expertise because of the large and complex structural variations they reveal.
  • It can sometimes produce lower-quality reads, which may require additional filtering or cleanup during data analysis.
  • High-quality DNA is necessary to generate accurate mate-pair libraries, making it challenging to work with degraded or low-quality samples.
  • The large insert sizes and complex structural variants can sometimes lead to false positives, so results require thorough verification.
  • Mate-pair library preparation is time-intensive, often taking longer than other preparation methods. 

Applications of Mate-Pair Sequencing

  • Mate-pair sequencing is useful for detecting large structural changes like inversions, duplications, translocations, and deletions. This is valuable in identifying genetic mutations associated with diseases. 
  • It can be used to compare structural variations across species, revealing how genomes have rearranged over time. This is useful in evolutionary and comparative genomics.
  • It is useful in de novo genome assembly as it fills in gaps and builds a more complete genome map by linking distant parts of the genome. 
  • It is used in clinical diagnostics to detect large structural changes associated with genetic disorders, helping in diagnosis and potential treatment planning.
  • This method is useful for resolving repetitive regions which are often difficult to sequence using only short reads. 
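
For the de novo assembly use case, the way mate pairs link distant sequences can be sketched as counting pairs whose two reads land on different contigs. This is a minimal illustration with hypothetical contig names and a hypothetical `min_links` threshold; real scaffolding also uses the orientation and expected distance of each pair.

```python
from collections import Counter

def contig_links(pair_hits, min_links=2):
    """Count mate pairs that connect two different contigs.

    pair_hits: iterable of (contig_of_read1, contig_of_read2).
    Returns only links supported by at least min_links pairs.
    """
    counts = Counter()
    for contig_a, contig_b in pair_hits:
        if contig_a != contig_b:  # pairs within one contig add no link
            counts[tuple(sorted((contig_a, contig_b)))] += 1
    return {link: n for link, n in counts.items() if n >= min_links}

# Hypothetical alignments of five mate pairs to three contigs.
pair_hits = [
    ("contig1", "contig2"),
    ("contig2", "contig1"),
    ("contig1", "contig1"),
    ("contig2", "contig3"),
    ("contig1", "contig2"),
]
links = contig_links(pair_hits)
print(links)  # {('contig1', 'contig2'): 3}
```

Requiring more than one supporting pair is a simple guard against chimeric or mis-mapped pairs joining contigs that are not actually adjacent.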

References

  1. ecSeq Bioinformatics. (n.d.). What is mate pair sequencing for? Retrieved from https://www.ecseq.com/support/ngs/what-is-mate-pair-sequencing-useful-for
  2. France Génomique. (2024, June 6). “Mate Pair” sequencing – France génomique. Retrieved from https://www.france-genomique.org/technological-expertises/whole-genome/mate-pair-equencing/?lang=en
  3. Gao, G., & Smith, D. I. (2015). Mate-Pair Sequencing as a Powerful Clinical Tool for the Characterization of Cancers with a DNA Viral Etiology. Viruses, 7(8), 4507–4528. https://doi.org/10.3390/v7082831
  4. Mate pair sequencing. (n.d.). Retrieved from https://www.illumina.com/science/technology/next-generation-sequencing/mate-pair-sequencing.html
  5. Mate-pair sequencing – (General Biology I) – Vocab, Definition, Explanations | Fiveable. (n.d.). Retrieved from https://library.fiveable.me/key-terms/college-bio/mate-pair-sequencing
  6. Van Nieuwerburgh, F., Thompson, R. C., Ledesma, J., Deforce, D., Gaasterland, T., Ordoukhanian, P., & Head, S. R. (2011). Illumina mate-paired DNA sequencing-library preparation using Cre-Lox recombination. Nucleic Acids Research, 40(3), e24. https://doi.org/10.1093/nar/gkr1000

About Author

Sanju Tamang

Sanju Tamang completed her Bachelor's (B.Tech) in Biotechnology from Kantipur Valley College, Lalitpur, Nepal. She is interested in genetics, microbiome, and their roles in human health. She is keen to learn more about biological technologies that improve human health and quality of life.
