HK1221710B - Reagent and method for constructing sequencing library based on molecular inversion probe

Info

Publication number: HK1221710B
Application number: HK16109811.0A
Authority: HK
Inventors: 耿春雨; 于源; 李梅艶; 郭晶; 蒋慧; 章文蔚; 郭荣荣; 傅书锦; 田凯; 安丹; 贺玲瑜
Original assignee: 深圳华大智造科技有限公司
Priority date: 2014-12-22
Filing date: 2016-08-16
Publication date: 2018-08-03

Abstract

The present invention discloses a method and a reagent for constructing a sequencing library based on a molecular inversion probe (MIP, referred to below as a molecular reverse probe). The method comprises: annealing and hybridizing a molecular reverse probe to denatured nucleic acid, wherein the molecular reverse probe comprises an anchor sequence at the 5' end, an extension sequence at the 3' end, and a sequencing adapter sequence between the two, and the anchor sequence and the extension sequence are complementary to the sequences at the two ends of a target region in the denatured nucleic acid, respectively; carrying out a polymerization reaction from the extension sequence at the 3' end, using the target region as a template, to generate a complementary sequence of the target region; ligating the anchor sequence at the 5' end to the complementary sequence of the target region to form a circular nucleic acid molecule; and digesting non-circularized linear molecules with nuclease to obtain a single-stranded circular molecule containing the target region. Sequencing the obtained single-stranded circular molecules achieves capture sequencing of the target region. The method integrates the construction of a single-stranded circular library with the capture of the target region, thereby greatly shortening the library construction workflow and turnaround time.

Description

Sequencing library construction method and reagent based on molecular reverse probe

Technical Field

The invention relates to the technical field of molecular biology, in particular to a sequencing library construction method and a sequencing library construction reagent based on a molecular reverse probe.

Background

With the development of DNA sequencing technology, high throughput sequencing technology has been widely used in various fields of life science research. Although the cost of sequencing technology is becoming lower and lower with the continuous update and spread of sequencing technology, the cost of whole genome sequencing technology itself is still expensive. One preferred approach to reconcile this problem is to enrich for the target region of interest followed by high throughput sequencing. In the conventional sequence capture technology, a high-throughput sequencing library is constructed, and then a probe is used for library enrichment and resequencing of a target region.

Sequencing platforms represented by the CG (Complete Genomics) platform require first constructing a single-stranded circular library, then performing rolling circle amplification, primed from a rolling circle amplification primer sequence with the single-stranded circular library as the template, to form DNA nanoballs (DNBs), and then sequencing. Constructing the single-stranded circular library is time-consuming, laborious and costly. It generally requires nucleic acid fragmentation and size selection, fragment end repair, ligation of 3' and 5' adapters, nick translation, polymerase chain reaction (PCR), single-strand separation and circularization, and so on. The whole process takes at least 1-2 days, and many steps require magnetic bead purification. In addition, PCR may introduce base errors, resulting in inaccurate sequencing results. Hybridization to capture the target region then takes a further 2 to 5 days.

Molecular inversion probe (MIP) technology, rendered in this document as "molecular reverse probe", is a nucleic acid target region capture technology that can be used to capture target regions of interest for downstream processing or analysis. In its usual application, a target region is captured by the MIP, a large number of linear amplification products are then obtained by amplification from the PCR primer sequences carried on the MIP (an adapter sequence of a sequencing platform can be introduced into the PCR primer sequences as needed), and the linear amplification products are then used for sequencing.

At present, the application of MIP technology to sequencing platforms that use a single-stranded circular library as the sequencing object has not been reported as a way to solve the problems that constructing the single-stranded circular library is time-consuming, laborious and costly and that enriching the target region is complex.

Disclosure of Invention

The invention provides a sequencing library construction method and a sequencing library construction reagent based on a molecular reverse probe, which integrate single-stranded circular library construction with target region capture, thereby greatly shortening the library construction workflow and turnaround time.

According to a first aspect of the present invention, the present invention provides a method for constructing a sequencing library based on a molecular reverse probe, comprising the following steps:

annealing and hybridizing a molecular reverse probe to denatured nucleic acid, wherein the molecular reverse probe comprises an anchor sequence at the 5' end, an extension sequence at the 3' end and a sequencing adapter sequence between the anchor sequence and the extension sequence, and the anchor sequence and the extension sequence are reverse complementary to the sequences at the two ends of a target region in the denatured nucleic acid, respectively;

under the action of polymerase, taking the target region as a template, and carrying out polymerization reaction from the extension sequence at the 3' end to generate a complementary sequence of the target region;

connecting the anchoring sequence at the 5' end with a complementary sequence of the target region under the action of ligase to form a circular nucleic acid molecule;

digesting the non-circularized linear molecule using exonuclease to obtain single-stranded circular molecule containing the target region.

In a preferred embodiment of the present invention, the sequencing adapter sequence is that of a sequencing platform that uses a single-stranded circular library as the sequencing object, preferably that of the CG sequencing platform.

As a preferred embodiment of the present invention, the sequencing adaptor sequence comprises a tag sequence, a rolling circle amplification primer binding sequence and a sequencing primer binding sequence.

In a preferred embodiment of the present invention, the anchor sequence at the 5' end and the extension sequence at the 3' end are each 15 to 25 bases in length.

As a preferred embodiment of the present invention, the exonuclease is exonuclease I or exonuclease III.

According to a second aspect of the present invention, the present invention provides a sequencing library construction reagent based on a molecular reverse probe, comprising the following components:

a molecular reverse probe for annealing and hybridizing with denatured nucleic acid, comprising an anchor sequence at the 5' end, an extension sequence at the 3' end and a sequencing adapter sequence between the anchor sequence and the extension sequence, wherein the anchor sequence and the extension sequence are reverse complementary to the sequences at the two ends of a target region in the denatured nucleic acid, respectively;

a polymerase for carrying out a polymerization reaction from the extension sequence at the 3' end, using the target region as a template, to generate a complementary sequence of the target region;

a ligase for ligating the anchor sequence at the 5' end to a complementary sequence of the target region to form a circular nucleic acid molecule;

exonuclease for digesting the non-circularized linear molecules to obtain single-stranded circular molecules containing the target region.

The method of the invention is based on specially designed molecular reverse probes and integrates the construction of the single-stranded circular library with the capture of the target region. Compared with the traditional approach of first constructing a whole-genome library, then capturing the target region with a target capture technology, and then sequencing, it greatly shortens the workflow and turnaround time, and the final library ready for loading on the sequencer can generally be obtained within one day. Because the workflow is simple, library construction with low input (less than 100 ng), high throughput, short turnaround and low cost can be achieved. In addition, the method does not need PCR to amplify the library, which greatly reduces the errors that PCR introduces during library construction.

Drawings

FIG. 1 is a schematic diagram of the technical principle of one embodiment of the sequencing library construction method based on the molecular reverse probe of the present invention;

FIG. 2 is a schematic diagram of the sequence structure of the sequencing linker sequence part (including the tag sequence, the rolling circle amplification primer binding sequence and the sequencing primer binding sequence) of the molecular reverse probe in one embodiment of the molecular reverse probe-based sequencing library construction method of the present invention;

FIG. 3 is a schematic diagram of PCR amplification and recyclization of circularized DNA molecules according to one embodiment of the present invention;

FIG. 4 is a schematic diagram illustrating the principle of preparing a first molecular reverse probe in the present invention;

FIG. 5 is a schematic diagram illustrating the preparation of a second molecular reverse probe according to the present invention;

FIG. 6 is a schematic diagram showing the principle of preparing a third molecular reverse probe in the present invention;

FIG. 7 is a diagram showing the results of gel electrophoresis detection of a sequencing library according to an embodiment of the method for constructing a sequencing library based on a molecular reverse probe, wherein M represents a DNA Marker, and lanes 1 and 2 represent the results of PCR product verification;

FIG. 8 shows the quality control result of Agilent Bioanalyzer2100 obtained from another embodiment of the method for constructing sequencing library based on molecular reverse probe;

FIG. 9 is a diagram showing a sequencing depth profile of each gene region obtained by another embodiment of the molecular reverse probe-based sequencing library construction method of the present invention;

FIG. 10 shows the sequencing coverage of target regions at different depths according to another embodiment of the method for constructing a sequencing library based on molecular reverse probes.

Detailed Description

The terms involved in the present invention are explained as follows:

index pooling sequencing: a sequencing mode in which tag sequences are added to multiple libraries, which are then pooled and sequenced together.

oligo pools: a synthesis mode in which many sequences are synthesized together as a mixture; up to 94K sequences can be synthesized in a single run.

Gibson assembly: a widely used sequence assembly method developed by Daniel Gibson of the J. Craig Venter Institute.

Agilent Bioanalyzer 2100: the Agilent 2100 biochip analysis system, a detection device based on the microfluidic chip technology, can give qualitative and quantitative digital data of up to 12 samples within 30 minutes.

Ampligase: a ligase available from Epicentre, supplied with 10× Ampligase Buffer.

Phusion: a high-fidelity DNA polymerase from Thermo Fisher, catalog number F-530L.

Exo III: exonuclease III, which acts on double-stranded DNA to remove single nucleotides gradually in the 3'→ 5' direction.

Exo I: exonuclease I, which acts on single-stranded DNA to degrade the single-stranded DNA in the 3'→ 5' direction.

Phusion MM: a high-fidelity DNA polymerase master mix from Thermo Fisher, catalog number F-531L.

TA Buffer: a buffer solution compatible with various restriction enzymes and nucleic acid modifying enzymes.

T4 DNA Ligase: T4 DNA ligase, derived from bacteriophage T4, which can ligate cohesive and blunt ends.

SE50: single-end 50 sequencing, i.e., a single read of 50 nt in the 5'→3' direction.

The present invention will be described in further detail with reference to specific examples. Unless otherwise specified, the techniques used in the following examples are conventional techniques known to those skilled in the art; the instruments, reagents and the like used are all available to those skilled in the art from public sources such as commercial sources and the like.

Referring to FIG. 1, the present invention uses a molecular reverse probe (A) comprising an anchor sequence at the 5' end, an extension sequence at the 3' end, and a linker sequence in between, which is annealed to denatured nucleic acid such as genomic DNA (B). The anchor sequence and the extension sequence are reverse complementary to the sequences at the two ends of a target region in the denatured nucleic acid, respectively, and the linker sequence comprises a sequencing adapter sequence (i.e., the adapter sequence used by a sequencing platform, which is known in the art to differ between platforms). Then, under the action of polymerase, a polymerization reaction is carried out from the extension sequence at the 3' end, using the target region as a template, to generate a complementary sequence of the target region. Under the action of ligase, the anchor sequence at the 5' end is ligated to the complementary sequence of the target region to form a circular nucleic acid molecule. Finally, the non-circularized linear molecules (C) are digested with exonucleases (e.g., Exo III and Exo I) to obtain single-stranded circular molecules containing the target region. The obtained single-stranded circular molecules can be used to prepare DNA nanoballs (D).

The denatured nucleic acid in the present invention may be any nucleic acid fragment of any origin that contains the target region to be captured. Typical but non-limiting examples are genomic DNA, random fragments obtained by physical shearing of genomic DNA or by fragmentation with a transposase complex, PCR amplification products, whole genome amplification (WGA) products, and the like. Genomic DNA in particular is used as the nucleic acid from which target regions are captured in the present invention. Methods for denaturing nucleic acids that are well known in the art, such as heat denaturation or alkali denaturation, can be used in the present invention.

It should be noted that the anchor sequence and the extension sequence depend on the target region to be captured; if several different target regions are to be captured simultaneously, a corresponding set of anchor and extension sequences is used. The anchor and extension sequences are therefore not fixed sequences. The linker sequence includes a sequencing adapter sequence and is generally specific to a particular sequencing platform (such as the CG sequencing platform), namely the sequencing adapter sequence commonly used on that platform.

In one embodiment of the present invention, the anchor sequence and the extension sequence are each 15 to 25 bases long. Neither longer nor shorter sequences are advantageous: longer anchor and extension sequences increase the synthesis cost of the molecular reverse probe, while shorter ones give insufficient annealing and hybridization strength between the molecular reverse probe and the denatured nucleic acid, so that binding is easily disrupted by temperature. Experiments show that anchor and extension sequences of 15-25 bases give better results.

In one embodiment of the invention, the sequencing adapter sequence comprises a tag sequence (barcode), a rolling circle amplification primer binding sequence and a sequencing primer binding sequence. The tag sequence distinguishes different samples: target regions to be captured from different sample sources are annealed and hybridized with molecular reverse probes carrying different tag sequences. Target regions from different samples can therefore be pooled for sequencing after library construction, and the sample source of each target region is identified from its tag sequence in the sequencing results, which increases throughput and reduces cost. The tag sequence may be a random sequence of appropriate length, for example 8 to 12 bases. In theory there are 4^N random sequences of N bases; for N = 10 this gives 4^10 = 1,048,576, enough to label more than a million samples. The rolling circle amplification primer binding sequence is the sequence used for amplification during DNA nanoball preparation. On sequencing platforms represented by the CG platform, the constructed single-stranded circular library must undergo primer-directed rolling circle amplification to form DNA nanoballs before sequencing, and the rolling circle amplification primer binds to this sequence to prime the reaction. During DNA nanoball preparation, each single-stranded circular molecule is amplified by rolling circle amplification into a linear molecule consisting of thousands of tandem repeat units, which forms a nanoball structure. The sequencing primer binding sequence is the binding site of the sequencing primer. After the DNA nanoballs are prepared, sequencing is carried out from the sequencing primer, and the base sequence on the linear molecule is read.
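The tag capacity stated above follows from simple counting. The sketch below is illustrative only; the function names are hypothetical and not part of the original disclosure. It computes the number of distinct tags for a given length and the shortest tag length that can label a given number of samples.

```python
def barcode_capacity(n_bases: int) -> int:
    """Number of distinct DNA sequences of length n_bases (4 choices per base)."""
    return 4 ** n_bases


def min_barcode_length(n_samples: int) -> int:
    """Shortest tag length whose capacity is at least n_samples."""
    n = 1
    while barcode_capacity(n) < n_samples:
        n += 1
    return n


# Capacity over the 8-12 base range mentioned in the text.
for n in (8, 10, 12):
    print(n, barcode_capacity(n))

# Shortest tag that can label one million samples.
print(min_barcode_length(1_000_000))
```

In practice fewer tags are usable than this upper bound, because tags that differ by a single base are easily confused by sequencing errors; that filtering step is not described in the source and is not modeled here.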

The tag sequence, the rolling circle amplification primer binding sequence and the sequencing primer binding sequence in the sequencing adaptor sequence may be independent sequences (i.e. there is no overlap), and their positions are not particularly limited, and may be arranged arbitrarily, such as the tag sequence 1 shown in fig. 2a located between the rolling circle amplification primer binding sequence 2 and the sequencing primer binding sequence 3, or the rolling circle amplification primer binding sequence 2 shown in fig. 2b located between the tag sequence 1 and the sequencing primer binding sequence 3, or the sequencing primer binding sequence 3 shown in fig. 2c located between the rolling circle amplification primer binding sequence 2 and the tag sequence 1. Furthermore, as shown in FIG. 2d, there may be an overlapping (or common) sequence portion between the rolling circle amplification primer binding sequence 2 and the sequencing primer binding sequence 3. Two adjacent sequences can be directly connected or connected by several other bases.

The molecular reverse probe of the invention is characterized in that the anchoring sequence and the extension sequence are respectively specifically combined on the sequences at the two ends of the target region to form reverse complementation, namely the 5 'end of the anchoring sequence is opposite to the 3' end of the extension sequence, so that the complementary sequence of the target region formed by polymerization from the 3 'end of the extension sequence under the action of polymerase meets with the 5' end of the anchoring sequence, and then can be connected into a ring under the action of ligase.
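The arm geometry described above can be sketched in a few lines. This is an illustrative sketch, not the inventors' design software: the function names are hypothetical, and the template fragment and binding sites are taken from the sequences listed for Example 1 (SEQ ID NO: 6 and 7). Given a template strand written 5'→3', the anchor is the reverse complement of the upstream binding site, the extension is the reverse complement of the downstream binding site, and the polymerase fills the gap with the reverse complement of the template between them.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_arms(template: str, upstream_site: str, downstream_site: str):
    """Return (anchor, extension, gap_fill) for a template written 5'->3'.

    upstream_site / downstream_site are the template sequences that the
    anchor and the extension arm hybridize to, respectively.
    """
    t = template.upper()
    i = t.index(upstream_site.upper())
    j = t.index(downstream_site.upper())
    gap_start = i + len(upstream_site)
    if gap_start > j:
        raise ValueError("upstream site must lie 5' of the downstream site")
    anchor = revcomp(upstream_site)        # 5' arm of the probe
    extension = revcomp(downstream_site)   # 3' arm of the probe
    gap_fill = revcomp(t[gap_start:j])     # synthesized by the polymerase
    return anchor, extension, gap_fill


# Fragment of the Example 1 template around the two binding sites.
template = ("ggaattcattgcctttgggatcagcacatcttctcaggattcttctcttg"
            "ttttgtggccaccactgctctttcccg")
anchor, extension, gap_fill = design_arms(
    template, "attcattgcctttggg", "gccaccactgctcttt")
print("anchor   :", anchor)
print("extension:", extension)
print("gap fill :", gap_fill)
```

After extension the linear molecule reads 5'-anchor, linker, extension, gap fill-3', so the 3' end of the gap fill abuts the phosphorylated 5' end of the anchor and the ligase can close the circle.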

After the ligase step, some linear molecular reverse probes remain alongside the circularized molecules. Some probes never anneal to their target region; others anneal, but the complementary sequence formed by polymerization does not reach the 5' end of the anchor sequence, so the probe cannot be ligated into a circle. These linear probes would interfere with subsequent amplification and sequencing, so the non-circularized linear molecules must be digested with exonucleases. The exonucleases used are generally exonuclease I (Exo I) and exonuclease III (Exo III). Exonuclease I hydrolyzes single-stranded DNA in the 3'→5' direction, releasing deoxyribonucleoside 5'-monophosphates, whereas exonuclease III acts on double-stranded DNA and removes mononucleotides stepwise in the 3'→5' direction.

Referring to FIG. 3, if the amount of circular DNA molecules generated after circularization is insufficient, they can be amplified by PCR. Primers are designed from the rolling circle amplification primer binding sequence and the sequencing primer binding sequence on the linker sequence, and PCR amplification (E in FIG. 3) yields a large amount of linear DNA molecules. The linear DNA molecules obtained by PCR are then circularized (F in FIG. 3) using a mediating bridge sequence that is complementary to both ends of the linear DNA molecule at the same time. The resulting circular DNA molecules can be used to prepare DNA nanoballs (G in FIG. 3) and loaded on the sequencer.

In the present invention, the molecular reverse probe is essential for the realization of the present invention, and the basic structure of the probe is as described above, and includes an anchor terminal sequence (i.e., anchor sequence) at the 5 'end, an extension terminal sequence (i.e., extension sequence) at the 3' end, and an intermediate linker sequence. Such molecular reverse probes can be obtained in various ways, and three methods for obtaining molecular reverse probes are described in the present application. It should be noted that the three methods described in the present application are merely exemplary, and the molecular reverse probe in the present application is not limited to the molecular reverse probes obtained by the three methods.

In the first method, referring to FIG. 4, the probe is prepared by PCR. Three oligos are designed and synthesized separately: probe primer L, the Ad153 linker sequence (Ad153Linker, a name coined here) and probe primer R. For use in index pooling sequencing, a tag sequence is designed in the middle of the Ad153Linker. The probe primers for each site and the Ad153Linker are added to separate reaction tubes and amplified with a high-fidelity polymerase. The PCR product is purified and then phosphorylated so that its 5' end carries a phosphate. Fragments are selected by agarose gel electrophoresis. After purification, the product is checked on an Agilent Bioanalyzer 2100 and, if it passes, used for subsequent library construction.

As shown below, a probe primer L, Ad153Linker and a probe primer R that can be used in the above-described first method are given, in which a capture region L and a capture region R respectively represent variable sequences of 16 to 27bp, and are designed based on the sequences of the capture regions. The Ad153Linker comprises a 10bp tag sequence (barcode), which can be a random sequence.

Probe primer L: (capture region L, 16-27 bp) AAGTCGGATCGTAGCCATGTCG (SEQ ID NO: 1);

Probe primer R: (capture region R, 16-27 bp) AAGTCGGAGGCCAAGCGGTCT (SEQ ID NO: 2);

Ad153 Linker: AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAA (10 bp barcode) CAACTCCTTGGCTCACAGAACGACATGGCTACGATCCGACTT (SEQ ID NO: 3).
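As a sanity check on the three oligos of the first method, the sketch below assembles an Ad153Linker with a given barcode and confirms how the fixed parts of the two probe primers relate to it. The helper names, the example barcode and the example capture sequence are hypothetical; the overlap relationship is read off the listed sequences rather than stated in the text, and the orientation in which a capture region is written is not specified by the source.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMER_L_FIXED = "AAGTCGGATCGTAGCCATGTCG"   # fixed part of SEQ ID NO: 1
PRIMER_R_FIXED = "AAGTCGGAGGCCAAGCGGTCT"    # fixed part of SEQ ID NO: 2
LINKER_LEFT = "AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAA"              # SEQ ID NO: 3, 5' part
LINKER_RIGHT = "CAACTCCTTGGCTCACAGAACGACATGGCTACGATCCGACTT"   # SEQ ID NO: 3, 3' part


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_linker(barcode: str) -> str:
    """Ad153Linker with a 10-base barcode inserted between the fixed parts."""
    if len(barcode) != 10 or set(barcode) - set("ACGT"):
        raise ValueError("barcode must be 10 bases of A/C/G/T")
    return LINKER_LEFT + barcode + LINKER_RIGHT


def build_primer(capture: str, fixed: str) -> str:
    """Probe primer: variable capture region (16-27 bases) followed by the fixed part."""
    if not 16 <= len(capture) <= 27:
        raise ValueError("capture region must be 16-27 bases")
    return capture + fixed


linker = build_linker("ACGTACGTAC")  # hypothetical barcode
# Primer R's fixed part is identical to the 5' end of the linker, and
# primer L's fixed part is the reverse complement of its 3' end, so the
# two primers can amplify across the whole linker.
print(linker.startswith(PRIMER_R_FIXED))
print(linker.endswith(revcomp(PRIMER_L_FIXED)))
print(build_primer("ACGTACGTACGTACGT", PRIMER_R_FIXED))  # hypothetical capture region
```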

In the second method, referring to FIG. 5, the probe sequence is split into two parts: the fixed sequence part, namely the backbone, is custom-made by oligo synthesis, and the part containing the capture sequences (anchor and extension) is custom-made by oligo pools synthesis. A PCR reaction with oligo amplification primers increases the quantity of the oligo molecules. The pre-constructed probe backbone and the synthetic oligos containing the capture sequences are joined by Gibson assembly to form circular DNA. The probe library precursor is amplified with universal primers. After amplification, the mixture is digested with the two restriction enzymes shown in FIG. 5 (BspQI and BsrDI; other restriction sites may be used). The short fragments corresponding to the primer binding regions are removed by polyacrylamide gel separation, and the probe library is obtained by purification and concentration. After purification, the product is checked on an Agilent Bioanalyzer 2100 and, if it passes, used for subsequent library construction.

As shown below, a framework sequence that can be used in the second method described above is given, in which the capture region L and the capture region R each represent a variable sequence of 16-27bp, and are designed based on the sequence of the capture region.

Oligo pools: GACCGCTTGGCCTCCGACTT (capture region L16-27 bp) TTGACGACTCAGTTGATCCTCGTCACGCAATGGAGTCCAGGT (capture region R16-27 bp) AAGTCGGATCGTAGCCATGTCG (SEQ ID NO: 4).

The third method, referring to FIG. 6, simplifies the second method. The full-length probe sequence is synthesized directly as oligo pools, with amplification primer regions (amplification primer L and R regions) added to both ends of the probe. After synthesis, a PCR reaction with the oligo amplification primers increases the quantity of the oligo molecules. The amplification product is purified and digested with a restriction enzyme (such as MlyI). After purification, the product is checked on an Agilent Bioanalyzer 2100 and, if it passes, used for subsequent library construction.

As shown below, a full-length probe sequence that can be used in the third method described above is given, in which the capture region L and the capture region R each represent a variable sequence of 16-27 bp, and the probe comprises a 10 bp tag sequence (barcode), which may be a random sequence.

Oligo pools: CCATGTCTAACCGGGAAGCAGAGTCGTCAAC (capture region L16-27 bp) AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAA (10bp barcode) CAACTCCTTGGCTCACAGAACGACATGGCTACGATCCGACTT (capture region R16-27 bp) CAGCTGGACTCATATTGGTAGCGGATGGGCA (SEQ ID NO: 5).

It should be noted that, based on the features of the present invention, the present invention is applicable to any sequencing platform using single-stranded circular library as a sequencing template, typically CG sequencing platform, but also includes other existing such platforms and such platforms that may be developed in the future.

The present invention will be described in detail below with reference to specific examples.

Example 1

In this example, the genomic DNA (gDNA) of Yanhuang (YH) was used as a test material, and one site for deafness gene detection was used as an example.

The detection site and its nearby sequence are as follows. The site to be detected is marked in underlined italics; the underlined sequences are the sequences at both ends of the target region, which specifically bind to the anchor sequence and the extension sequence of the molecular reverse probe, respectively.

5’-AGGATCGTTGTCATCCAGTCtcttccttaggaattcattgcctttgggatcagcacatcttctcaggattcttctcttgttttgtggccaccactgctctttcccgcacggccgtccaggagagcactggaggaaagacaCAGGTAGGAACAACAGCCTT-3’(SEQ ID NO：6)。

Based on the above detection site and the nearby sequences, the following molecular reverse probe sequences are designed, wherein the sequences of the underlined parts are the anchor sequence and the extension sequence of the molecular reverse probe, respectively.

5’-pho_CCCAAAGGCAATGAATAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTG AAAGAGCAGTGGTGGC-3’(SEQ ID NO：7)。

The template (YH gDNA) and molecular reverse probe were each pre-denatured at 95 ℃ for 5min and immediately placed on ice.

Annealing, extension and ligation reactions were carried out according to the system shown in table 1 below.

TABLE 1

The reaction program was: 95 ℃ for 3 min → (95 ℃ for 30 s → 55 ℃ for 5 min) × 20 cycles.
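To put the cycling program in terms of bench time, the following sketch totals the programmed hold times. It is a rough estimate only: the helper name is hypothetical, and instrument ramp times between temperatures are ignored.

```python
def program_hold_minutes(initial_s, cycle_steps_s, cycles, final_s=0):
    """Total programmed hold time in minutes, ignoring temperature ramps."""
    return (initial_s + cycles * sum(cycle_steps_s) + final_s) / 60


# 95 C for 3 min, then 20 cycles of (95 C for 30 s, 55 C for 5 min).
capture_min = program_hold_minutes(180, [30, 300], 20)
print(capture_min)
```

The annealing, extension and ligation program therefore holds for just under two hours, which is consistent with the statement that the library can generally be obtained within one day.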

After the annealing, extension and ligation reactions described above, the linear molecular reverse probes were digested with exonuclease. Specifically, 2 μL of exonuclease I (Exo I) and 1 μL of exonuclease III (Exo III) were added to the above reaction system, and the reaction was carried out at 37 ℃ for 60 min and then at 70 ℃ for 10 min.

Then, the digested product was purified with magnetic beads and dissolved in 1 × TE buffer to obtain a final single-stranded circular sequencing library.

Of the purified product, 5 μL was used for PCR verification; the PCR system is shown in Table 2.

TABLE 2

Wherein, the sequence of the verification primer-F is as follows: 5'-GTGCACACAGCCCAGCTT-3' (SEQ ID NO: 8); the verification primer-R sequence is: 5'-CGACCGCTGCGCCTTA-3' (SEQ ID NO: 9).

The PCR procedure was as follows:

94 ℃, 3min → (94 ℃, 15s → 60 ℃, 30s → 68 ℃, 30s) × 30 cycles → 68 ℃, 10min → 12 ℃ incubation.

The result of gel electrophoresis detection of the PCR product is shown in FIG. 7, and a band of about 200bp corresponding to the expected size is obtained, which primarily proves that the molecular reverse probe of this example captures the target region of the Yanhuang genomic DNA (i.e., the deafness gene detection site).

The resulting single-stranded circular sequencing library was sequenced on a CG sequencing platform to obtain the sequence shown below, in which the underlined part is the sequence of the captured deafness gene detection site, and the base in bold, underlined italics is the base on the complementary strand that corresponds to the site to be detected.

5’-pho_CCCAAAGGCAATGAATAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTGAAAGAGCAGTGGTGGCCACAAAACAAGAGAAGAATCCTGAGAAGATGTTGCTGAT-3’(SEQ ID NO：10)。

The above results show that the molecular reverse probe in this example indeed captures the sequence of the deafness gene detection site in Yanhuang genomic DNA, demonstrating that the method of the present invention is successful.

Example 2

In this example, human peripheral blood DNA (300 ng) was used as the experimental material, and the probes targeted 7 regions of the congenital deafness genes 12S-rRNA, GJB2, GJB3 and SLC26A4. The probe site information is shown in Table 3. The Complete Genomics platform was used as the sequencing platform, with the SE50 sequencing type.

TABLE 3

The experimental procedure was as follows:

1. MIP hybridization

The enzymatic reaction system is shown in Table 4, wherein the MIP probe uses the sequences of SEQ ID NO: 1-3.

TABLE 4

Component	Amount
Genomic DNA	5-300 ng
MIP probe	1 fmol
Ampligase buffer	2 μL
dNTP (10 mM)	2 μL
Ampligase	1 μL
Phusion	0.5 μL
Water	to 20 μL

Reaction program: 95 ℃ for 5 minutes, 65 ℃ for 30 minutes, then hold at 4 ℃.

2. Exonuclease digestion

The enzymatic reaction system is shown in table 5:

TABLE 5

Reaction program: 37 ℃ for 1 hour, 80 ℃ for 20 minutes, then hold at 4 ℃.

Magnetic bead purification: purify with 1.3× PEG32 magnetic beads and redissolve in 20 μL of TE buffer.

3. PCR amplification

The enzymatic reaction system is shown in Table 6, and the primer sequences are as follows:

MIP Ad153_PCR2_1: PHO-GACATGGCTACGATCCGACTT (SEQ ID NO: 11); MIP Ad153_PCR2_3: GTTCTGTGAGCCAAGGAGTTG (SEQ ID NO: 12).

TABLE 6

Components	Amount
Phusion MM	25 μL
Purified DNA from step 2	10 μL
MIP Ad153_PCR2_1	5 μL
MIP Ad153_PCR2_3	5 μL
Water	5 μL

Reaction procedure: 95 ℃ for 5 minutes; 30 cycles of (95 ℃ for 30 seconds, 65 ℃ for 30 seconds, 72 ℃ for 30 seconds); 72 ℃ for 5 minutes; then hold at 4 ℃.

Magnetic bead purification: purify with 1.2× Ampure XP beads and redissolve in 30 μL of TE buffer.
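As a cross-check on step 3, the cycling program and the Table 6 volumes can be written out as data and summed. This is only an illustrative sketch. The variable and function names are assumptions, and the computed time covers programmed holds only, excluding ramping and the final 4 ℃ hold.

```python
# Step 3 cycling program as (temperature in °C, hold time in seconds).
INITIAL_DENATURATION = (95, 5 * 60)
CYCLE_STEPS = [(95, 30), (65, 30), (72, 30)]
N_CYCLES = 30
FINAL_EXTENSION = (72, 5 * 60)

# Table 6 component volumes in µL.
PCR_MIX_UL = {
    "Phusion MM": 25.0,
    "Purified DNA from step 2": 10.0,
    "MIP Ad153_PCR2_1": 5.0,
    "MIP Ad153_PCR2_3": 5.0,
    "Water": 5.0,
}


def programmed_minutes() -> float:
    """Total programmed hold time in minutes (ramp times not included)."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    total_seconds = (
        INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]
    )
    return total_seconds / 60


def total_volume_ul() -> float:
    """Total PCR reaction volume in µL."""
    return sum(PCR_MIX_UL.values())


print(programmed_minutes())  # 55.0
print(total_volume_ul())     # 50.0
```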

4. Cyclization

Reaction system 1 was prepared as in Table 7, with the following splint (bridging) oligonucleotide sequence:

MIP_Splint Oligo: CGTAGCCATGTCGTTCTGTGAGCC (SEQ ID NO: 13).

TABLE 7

Components	Amount
DNA	660 ng
MIP_Splint Oligo (10 pmol/μL)	10 μL
H₂O	Make up to 70 μL

The mixture was pre-denatured at 95 ℃ for 3 minutes and then placed on ice until use.
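The relationship between the splint oligo and the PCR primers can be checked directly from the sequences. The sketch below assumes that the strand being circularized is the one primed by the 5'-phosphorylated primer (SEQ ID NO: 11), so that its 3' end is the reverse complement of the other primer (SEQ ID NO: 12). That reading is an inference from the sequences, not a statement in the text, and the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


PRIMER_1 = "GACATGGCTACGATCCGACTT"   # SEQ ID NO: 11, 5'-phosphorylated
PRIMER_2 = "GTTCTGTGAGCCAAGGAGTTG"   # SEQ ID NO: 12
SPLINT = "CGTAGCCATGTCGTTCTGTGAGCC"  # SEQ ID NO: 13


def junction(arm: int) -> str:
    """Junction (5'->3') formed when the phosphorylated strand is closed.

    Assumes the strand starts with PRIMER_1 and ends with the reverse
    complement of PRIMER_2; `arm` bases are taken from each side.
    """
    strand_3p_end = reverse_complement(PRIMER_2)[-arm:]
    strand_5p_end = PRIMER_1[:arm]
    return strand_3p_end + strand_5p_end


arm = len(SPLINT) // 2  # 12 bases on each side of the ligation point
print(junction(arm))                                # GGCTCACAGAACGACATGGCTACG
print(reverse_complement(junction(arm)) == SPLINT)  # True
```

Under that assumption, the 24-nt splint pairs with 12 bases on either side of the ligation point, holding the two ends of the strand adjacent for T4 DNA ligase.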

Reaction system 2 was prepared as in Table 8:

TABLE 8

Components	Amount
Reaction System 1	70 μL
ATP (0.1 M)	1.2 μL
10× TA Buffer	1.2 μL
T4 DNA Ligase	1.2 μL
Water	Make up to 120 μL

Cyclization was carried out at 37 ℃ for 1 hour.

Reaction system 3 was prepared as in Table 9:

TABLE 9

Components	Amount
Reaction System 2	120 μL
10× TA Buffer	0.8 μL
Exo I	3.9 μL
Exo III	1.3 μL

The reaction was carried out at 37 ℃ for 1 hour. Purification: purify with 1.3× PEG32 magnetic beads and redissolve in 20 μL of TE buffer.
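The volumes in Tables 8 and 9 can be tallied as follows. This is a rough sketch with illustrative names. It assumes that "1.3×" means a bead volume equal to 1.3 times the sample volume, which is the usual convention but is not spelled out in the text.

```python
# Table 8: components added to reach the 120 µL ligation reaction (µL).
SYSTEM_2_UL = {
    "Reaction System 1": 70.0,
    "ATP (0.1 M)": 1.2,
    "10x TA Buffer": 1.2,
    "T4 DNA Ligase": 1.2,
}
SYSTEM_2_FINAL_UL = 120.0

# Table 9: exonuclease digestion of the ligation reaction (µL).
SYSTEM_3_UL = {
    "Reaction System 2": 120.0,
    "10x TA Buffer": 0.8,
    "Exo I": 3.9,
    "Exo III": 1.3,
}

# Assumption: "1.3x" is the bead-to-sample volume ratio.
BEAD_RATIO = 1.3

# Water needed to bring reaction system 2 up to 120 µL.
system_2_water_ul = round(SYSTEM_2_FINAL_UL - sum(SYSTEM_2_UL.values()), 1)

# Total digestion volume and the corresponding bead volume.
system_3_total_ul = round(sum(SYSTEM_3_UL.values()), 1)
bead_volume_ul = round(BEAD_RATIO * system_3_total_ul, 1)

print(system_2_water_ul)  # 46.4
print(system_3_total_ul)  # 126.0
print(bead_volume_ul)     # 163.8
```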

5. Quality control

After library construction, quality was checked on an Agilent Bioanalyzer 2100, and libraries that passed were used for subsequent on-machine sequencing. One of the detection results is shown in FIG. 8.

6. Sequencing: performed according to the standard workflow of the Complete Genomics platform.

FIG. 9 shows the sequencing depth distribution for each gene region obtained in this example; FIG. 10 shows the sequencing coverage of the target regions at different depths. Together these results demonstrate that the method of the invention is successful.

The foregoing is a more detailed description of the present invention that is presented in conjunction with specific embodiments, and the practice of the invention is not to be considered limited to those descriptions. It will be apparent to those skilled in the art that a number of simple derivations or substitutions can be made without departing from the inventive concept.

Claims

1. A sequencing library construction method based on a molecular reverse probe comprises the following steps:

annealing and hybridizing the denatured nucleic acid with a molecular reverse probe, wherein the molecular reverse probe comprises an anchor sequence at the 5' end, an extension sequence at the 3' end and a sequencing linker sequence between the anchor sequence and the extension sequence, and the anchor sequence and the extension sequence are respectively reverse complementary to sequences at the two ends of a target region in the denatured nucleic acid;

under the action of polymerase, taking the target region as a template, and carrying out polymerization reaction from the extension sequence of the 3' end to generate a complementary sequence of the target region;

2. The method for constructing a sequencing library based on a molecular reverse probe according to claim 1, wherein the sequencing linker sequence is a sequencing linker sequence of a sequencing platform using a single-stranded circular library as a sequencing object.

3. The method for constructing a sequencing library based on a molecular reverse probe according to claim 1, wherein the sequencing linker sequence is a sequencing linker sequence of a CG sequencing platform.

4. The method for constructing a sequencing library based on a molecular reverse probe according to any one of claims 1 to 3, wherein the sequencing adaptor sequence comprises a tag sequence, a rolling circle amplification primer binding sequence and a sequencing primer binding sequence.

5. The method of claim 1, wherein the length of the 5' anchor sequence and the 3' extension sequence is 15-25 bases.

6. The method for constructing the sequencing library based on the molecular reverse probe of claim 1, wherein the exonuclease is exonuclease I and exonuclease III.

7. A sequencing library construction reagent based on a molecular reverse probe comprises the following components:

a molecular reverse probe for annealing hybridization with denatured nucleic acid, the molecular reverse probe comprising an anchor sequence at the 5' end and an extension sequence at the 3' end, and a sequencing linker sequence between the anchor sequence and the extension sequence, the anchor sequence and the extension sequence being respectively reverse complementary to sequences at both ends of a target region in the denatured nucleic acid;

a polymerase for carrying out a polymerization reaction from the extension sequence at the 3' end, using the target region as a template, to generate a complementary sequence of the target region;

a ligase for ligating the anchor sequence at the 5' end to the complementary sequence of the target region to form a circular nucleic acid molecule;

8. The reagent for constructing the sequencing library based on the molecular reverse probe of claim 7, wherein the sequencing linker sequence is a sequencing linker sequence of a sequencing platform taking a single-stranded circular library as a sequencing object.

9. The molecular reverse probe-based sequencing library construction reagent of claim 8, wherein the sequencing linker sequence is a sequencing linker sequence of a CG sequencing platform.

10. The molecular reverse probe-based sequencing library construction reagent of any one of claims 7 to 8, wherein the sequencing adaptor sequence comprises a tag sequence, a rolling circle amplification primer binding sequence and a sequencing primer binding sequence.

11. The molecular reverse probe-based sequencing library construction reagent of claim 7, wherein the length of the anchor sequence at the 5' end and the extension sequence at the 3' end is 15-25 bases.

12. The molecular reverse probe-based sequencing library building reagent of claim 7, wherein the exonuclease is exonuclease I and exonuclease III.