CN120400136A - Template switching oligonucleotides, kits and applications thereof

Info

Publication number: CN120400136A
Application number: CN202510898779.1A
Authority: CN
Inventors: 林秀妹; 陈亮; 王雪; 刘龙奇; 赵瑞; 李帆; 陈君杰
Original assignee: Hangzhou Huada Life Science Research Institute
Current assignee: Hangzhou Huada Life Science Research Institute
Priority date: 2025-07-01
Filing date: 2025-07-01
Publication date: 2025-08-01

Abstract

The present invention belongs to the field of molecular biology, and in particular relates to a template switching oligonucleotide, a kit and applications thereof. A first aspect of the invention provides a template switching oligonucleotide that includes a template switching region at the 3' end, wherein the template switching region includes a plurality of nucleotides, each nucleotide includes a sugar and a base, and the base is guanine; and wherein, among the plurality of nucleotides, the 3'-hydroxyl position of the sugar of the 3'-terminal nucleotide carries a modification that prevents the 3'-hydroxyl from forming a phosphodiester bond. Because the modification is made to the TSO used in conventional reverse transcription, the conventional reverse transcription workflows for single-cell omics, spatiotemporal omics and the like can be carried out without additional enrichment steps, and the procedure is simple. This helps to improve the data utilization rate of single-molecule sequencing, to improve the sensitivity of target gene capture, and to reduce background interference.

Description

Template switching oligonucleotide, kit and application thereof

Technical Field

The invention belongs to the technical field of molecular biology, and particularly relates to a template switching oligonucleotide, a kit and applications thereof.

Background

The rapid development of single-cell and spatiotemporal omics technologies provides powerful tools for dissecting cell heterogeneity and the spatial distribution of cells in tissues. Nevertheless, most large-scale omics studies currently rely primarily on gene expression data. Although next-generation sequencing (NGS) is favored for its low error rate and low cost, its short read length limits the ability to obtain full-length transcript information. The Smart-seq series of microwell-based sequencing techniques can achieve full-length transcriptome sequencing at the single-cell level, but their limited throughput and reliance on assembling short fragments can lead to inaccurate identification of duplicate copy variants in gene coding regions. Advances in single-molecule sequencing have created new opportunities for capturing full-length transcript information.

However, single-molecule sequencing also faces challenges in accuracy and sequencing throughput, which can be traced to invalid sequences in existing single-cell and spatiotemporal omics libraries. Most of these invalid sequences result from non-specific binding of the template switching oligonucleotide (TSO) to non-target nucleic acid molecules. Referring to FIG. 1, the bound TSO acts as a primer and initiates synthesis from non-target fragments, giving non-target products that carry TSO sequences at both ends. This problem is often ignored on NGS platforms, where throughput is not limiting, but its impact is amplified on single-molecule sequencing platforms, where throughput is itself limited. In addition, the TSO non-specific sequences may cause amplification bias, so that capture of target fragments is skewed or masked and interpretation of the data is affected. Thus, although short-read library construction workflows can remove these reads to some extent, the bias introduced by the preceding cDNA amplification cannot be avoided.

To relieve the throughput bottleneck of long-read sequencing, current single-cell library construction workflows based on PacBio and ONT label the target product with biotin and then capture it with streptavidin, in order to eliminate the influence of TSO non-specific sequences. However, this approach requires biotin labeling of the DNA product, magnetic bead capture and enrichment, and PCR amplification or enzymatic cleavage to enrich the target product, which makes the overall process cumbersome. In addition, releasing the target product by PCR amplification easily causes amplification bias through over-amplification, which affects amplification of the target fragments or leads to loss of real signal.

Disclosure of Invention

The invention provides a template switching oligonucleotide, a kit and applications thereof. The template switching oligonucleotide requires no labeling, the overall workflow is simpler, and the loss of real signal caused by amplification bias is avoided.

In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:

A first aspect of the present invention provides a template switching oligonucleotide comprising a template switching region at the 3' end of the template switching oligonucleotide and an amplification adaptor region located in the 5' direction of the template switching region;

The template switching region is 2-5 nucleotides in length, the 3'-terminal nucleotide of the template switching region has a 3'-terminal modification, and the 3'-terminal modification prevents the 3'-terminal nucleotide of the template switching region from forming a phosphodiester bond with the free 5'-phosphate group of a deoxynucleotide and/or a deoxynucleotide sequence.

In some embodiments of the invention, the 3'-terminal nucleotide having a 3'-terminal modification is a locked nucleic acid, or a 3'-terminal nucleotide in which the 3' hydroxyl group is replaced with an -O-R or -R' group.

Wherein R is any one of C1-C18 alkyl, C2-C18 alkenyl, C2-C18 alkynyl, aryl below C18, aralkyl below C18, heteroaryl below C18, a phosphate group, an azidomethyl group, an allyl group, a nitrobenzyl group, a carbamate group and a thiophosphate group; and R' is any one of H, C1-C18 alkyl, C2-C18 alkenyl, C2-C18 alkynyl, aryl below C18, aralkyl below C18 and heteroaryl below C18.

In some embodiments of the invention, the template switching region consists of 3 ribonucleotides.

In some embodiments of the invention, the sequence of the template switching region is rGrGrG.

In a second aspect of the invention, there is provided a kit comprising the template switching oligonucleotide as described above.

In some embodiments of the invention, the kit further comprises at least one of a reverse transcriptase, a reverse transcription primer, a DNA polymerase, an RNase inhibitor, polyethylene glycol, betaine, dNTPs, and a solid support.

A barcode probe is directly or indirectly immobilized on the solid support. The barcode probe comprises, from the 5' end to the 3' end, an amplification adaptor region', a barcode sequence and a capture sequence; the barcode sequence is a spatial barcode or a cell barcode, and the capture sequence is complementary to part of the sequence of RNA from the sample to be tested.

In a third aspect of the invention, a method is provided for constructing a spatial transcriptome sequencing library or a single cell transcriptome sequencing library using the template switching oligonucleotides described above.

In some embodiments of the invention, the following steps are included:

(1) Performing a reverse transcription reaction using a barcode probe as the primer and RNA in a tissue or cell as the template to obtain cDNA, wherein the barcode probe is directly or indirectly immobilized on a solid support and comprises, from the 5' end to the 3' end, an amplification adaptor region', a barcode sequence and a capture sequence, the barcode sequence is a spatial barcode or a cell barcode, and the capture sequence is complementary to part of the sequence of RNA from the sample to be tested;

(2) Adding a non-template polymeric sequence 2-5 nucleotides in length to the 3' end of the cDNA to obtain an extension product;

(3) Hybridizing the non-template polymeric sequence of the extension product to the template switching region of the template switching oligonucleotide, and continuing extension using the template switching oligonucleotide as the template to obtain library initial molecules, wherein each library initial molecule comprises, in order from the 5' end to the 3' end, the amplification adaptor region', the barcode sequence, the capture sequence, a cDNA sequence and the complementary sequence of the amplification adaptor region;

(4) Amplifying or enriching the library initial molecules using a first library amplification primer and a second library amplification primer to obtain the spatial transcriptome sequencing library or single-cell transcriptome sequencing library, wherein the 3' sequence of the first library amplification primer is complementary to the complementary sequence of the amplification adaptor region, and the 3' sequence of the second library amplification primer is complementary to the complementary sequence of the amplification adaptor region'.
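
The layout of the library initial molecule produced by steps (1)-(3) can be sketched in a few lines of Python. This is an illustrative string model only, not part of the claimed method: the function names, the amplification adaptor region' sequence, the barcode and the cDNA are hypothetical placeholders, and the ribonucleotides of the template switching region are written with DNA letters. Only the TSO adaptor sequence is taken from the conventional TSO primer listed in Example 1.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def library_initial_molecule(adaptor_prime, barcode, capture, cdna,
                             tso_adaptor, switch_region="GGG"):
    """Assemble the first-strand product, written 5'->3' (toy model)."""
    # Step (2): the non-template tail pairs with the template switching region.
    non_template_tail = reverse_complement(switch_region)
    # Step (3): extension along the TSO copies its adaptor as a reverse complement.
    return (adaptor_prime + barcode + capture + cdna
            + non_template_tail + reverse_complement(tso_adaptor))


molecule = library_initial_molecule(
    adaptor_prime="TTGACCAGTC",            # hypothetical amplification adaptor region'
    barcode="ACGTACGT",                    # hypothetical barcode
    capture="T" * 8,                       # polyT capture sequence
    cdna="GATTACAGATTACA",                 # hypothetical cDNA
    tso_adaptor="AAGCAGTGGTATCAACGCAGAG",  # adaptor of the conventional TSO primer
)
print(molecule)
```

In this model the 3' end of the molecule is the reverse complement of the whole TSO, which is the site used by the first library amplification primer in step (4).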

In a fourth aspect of the invention, there is provided a sequencing method comprising the step of sequencing a spatial transcriptome library or a single cell transcriptome library obtained by the method described above.

In some embodiments of the invention, the sequencing is full length sequencing using a single molecule sequencing method.

In some embodiments, the library is constructed by selective capture of the 3' end, in which case the improvement in effective data utilization achieved by the above approach is more pronounced. Specifically, taking single-cell transcriptome sequencing as an example of library construction by selective capture from the 3' end, the library can be constructed with reference to CN112005115A.

The beneficial effects of the invention are as follows:

The invention makes a specific modification to the TSO used in conventional reverse transcription, so that the conventional reverse transcription workflows for single-cell transcriptomes, spatiotemporal transcriptomes and bulk transcriptomes (bulk RNA-seq) can be carried out without additional enrichment steps, and the procedure is simple. Its potential value is that it helps to improve the data utilization rate of single-molecule sequencing, to improve the sensitivity of target gene capture, and to reduce background interference.

Meanwhile, the kit is compatible with both long-read and short-read sequencing platforms, enables accurate analysis of complex transcripts and sequencing at single-base resolution, and is expected to become a routine library construction kit for NGS and single-molecule sequencing. By using the blocked TSO sequence, the invention reduces the effect of amplification bias in single-molecule long-read sequencing (i.e., third-generation sequencing) and short-read sequencing (i.e., second-generation sequencing), and also improves the effective data utilization rate of long-read sequencing. The utilization rate of single-cell omics data is greatly improved, providing high-quality data support for an in-depth understanding of cell states and functions and promoting the development of precision medicine. In addition, combining transcriptome analysis with spatiotemporal information can reveal the dynamics of transcripts in different cell types and microenvironments, opening a new technical path for spatiotemporal omics research.

Drawings

FIG. 1 is a schematic diagram of the nonspecific initiation of TSO.

FIG. 2 compares data utilization after single-molecule sequencing library construction using different template switching oligonucleotides for different cell lines in an embodiment of the invention. C denotes the existing conventional TSO (control group); P denotes replacement of the 3'-OH of the ribose of the last nucleotide at the 3' end of the conventional TSO with 3'-P (phosphate modification); H denotes replacement of the 3'-OH with 3'-H (deoxygenation); and C3 denotes replacement of the 3'-OH with a 3'-C3 Spacer (C3 alkyl chain modification).

Detailed Description

In the description of the present invention, the terms "first," "second," "third," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", and "a third" may explicitly or implicitly include at least one such feature. In the description of the present invention, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.

In a first aspect of the present invention, there is provided a template switching oligonucleotide comprising a template switching region at the 3' end of the template switching oligonucleotide and an amplification adaptor region located 5' to the template switching region.

The template switching region comprises 2-5 nucleotides, for example 2, 3, 4 or 5 nucleotides. The 3'-terminal nucleotide of the template switching region has a 3'-terminal modification that prevents it from forming a phosphodiester bond with the free 5'-phosphate group of a deoxynucleotide and/or a deoxynucleotide sequence. It will be appreciated that a phosphodiester bond between the 3'-terminal nucleotide of the template switching region and a free 5'-phosphate group may be formed in a polymerase-mediated extension reaction, a ligase-mediated ligation reaction, or the like; preventing formation of the phosphodiester bond therefore prevents such reactions. Thus, when the template switching oligonucleotide binds non-specifically to a non-target nucleic acid molecule, it cannot act as a primer to initiate synthesis from the non-target fragment, which reduces the formation of non-target products.
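
As a rough illustration of the two constraints just described, the following Python sketch models a TSO as a small record and checks the 2-5 nucleotide length of the template switching region and the availability of a free 3'-hydroxyl. The class, field and method names are assumptions made for this sketch; the adaptor sequence is the one from the conventional TSO primer listed in Example 1, and the switching region is written with DNA letters for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TemplateSwitchOligo:
    """Toy model of a TSO; all field names are illustrative assumptions."""
    adaptor: str                      # amplification adaptor region (5' side)
    switch_region: str                # template switching region at the 3' end
    three_prime_block: Optional[str]  # name of the 3' modification, None for a free 3'-OH

    def __post_init__(self) -> None:
        # The template switching region is 2-5 nucleotides long.
        if not 2 <= len(self.switch_region) <= 5:
            raise ValueError("template switching region must be 2-5 nucleotides")

    def can_form_phosphodiester_bond(self) -> bool:
        # Extension or ligation needs a free 3'-hydroxyl on the last nucleotide.
        return self.three_prime_block is None


conventional = TemplateSwitchOligo("AAGCAGTGGTATCAACGCAGAG", "GGG", None)
blocked = TemplateSwitchOligo("AAGCAGTGGTATCAACGCAGAG", "GGG", "3'-phosphate")
print(conventional.can_form_phosphodiester_bond())  # True
print(blocked.can_form_phosphodiester_bond())       # False
```

In this model a blocked TSO can still serve as a template for extension of the cDNA, but it cannot itself be extended when it binds a non-target molecule.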

Nucleotides include sugars and bases. In some embodiments, the base comprises adenine (A), cytosine (C), guanine (G) or thymine (T). In some embodiments, the bases further include one or more of uracil (U), hypoxanthine (I), and other types of bases. In the nucleotides of the template switching region of the template switching oligonucleotide described above, the base is guanine, so that the template switching oligonucleotide can bind to the cDNA through complementary pairing between the guanines of the template switching region and the cytosines added at the 3' end of the cDNA by the reverse transcriptase. In some embodiments, the sugar is ribose or deoxyribose, so that the nucleotides of the template switching region can be ribonucleotides or deoxyribonucleotides.

In some embodiments, the 3'-terminal nucleotide having a 3'-terminal modification is a locked nucleic acid. In some embodiments, the locked nucleic acid has a bicyclic structure formed by a methylene bridge between the 2'-O atom and the 4'-C atom of the sugar of the nucleotide.

In some embodiments, the 3'-terminal nucleotide having a 3'-terminal modification is a 3'-terminal nucleotide in which the 3' hydroxyl group is replaced with an -O-R or -R' group. Here, the 3' hydroxyl group is the hydroxyl group attached to the 3'-C of the sugar of the nucleotide, and its replacement gives 3'-C-O-R or 3'-C-R'.

Wherein R is any one of C1-C18 alkyl, C2-C18 alkenyl, C2-C18 alkynyl, aryl below C18, aralkyl below C18, heteroaryl below C18, phosphoric acid group, azidomethyl group, allyl group, nitrobenzyl group, carbamate group and thiophosphoric acid group.

R' is any one of H, alkyl groups of 1 to 18 carbon atoms, alkenyl groups of 2 to 18 carbon atoms, alkynyl groups of 2 to 18 carbon atoms, aryl groups of less than 18 carbon atoms, aralkyl groups of less than 18 carbon atoms and heteroaryl groups of less than 18 carbon atoms.

Among the above groups, some groups (such as azidomethyl and carbamate) or 3'-deoxynucleotide structures (such as 3'-CH) eliminate the reactivity of the hydroxyl group, while others (such as C1-C18 spacers) inhibit the primer extension reaction by sterically hindering the catalytic activity of the DNA polymerase at the 3' end.

Among the above groups, the alkyl group of C1 to C18, the alkenyl group of C2 to C18 and the alkynyl group of C2 to C18 include any one of straight-chain hydrocarbon groups, branched-chain hydrocarbon groups and cyclic hydrocarbon groups. Taking alkyl as an example, the alkyl of C1-C18 comprises any one of C1-C18 straight-chain alkyl, C3-C18 branched-chain alkyl and C3-C18 cycloalkyl. In some embodiments, the C1-C18 alkyl group includes any of methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, sec-butyl, tert-butyl, n-pentyl, isopentyl, neopentyl, tert-pentyl, n-hexyl, isohexyl, n-heptyl, 2-methylhexyl, n-octyl, 2-ethylhexyl, n-nonyl, n-decyl, and the like.

In some embodiments, a heteroaryl group below C18 may be one in which at least one C atom in the aromatic ring is replaced with a heteroatom (e.g., N, O, S). In some embodiments, 1 to 3 heteroatoms are included in the heteroaryl group below C18.

In some embodiments, R or R' each independently may be selected from any one of C3 spacer, C6 spacer, C12 spacer, spacer, spacer 18, and the like.

In some embodiments, the sugar of at least one nucleotide of the template switching region is ribose; for example, the sugar of 1, 2, 3, 4 or 5 nucleotides may be ribose. This increases the binding strength between the nucleotides of the template switching region and the cytosine nucleotides added at the end of the cDNA, improving pairing stability and specificity. Accordingly, 0, 1, 2, 3 or 4 nucleotides of the template switching region may be deoxynucleotides, provided that at least one nucleotide is a ribonucleotide. In some embodiments, the template switching region consists of ribonucleotides. In some embodiments, the base of at least one nucleotide of the template switching region is guanine; for example, the base of 1, 2, 3, 4 or 5 nucleotides may be guanine. In some embodiments, the template switching region consists of guanine ribonucleotides. In some embodiments, the sequence of the template switching region is rGrGrG, i.e., three consecutive guanine ribonucleotides. In some embodiments of the invention, the sequence of the template switching region is 5'-rGrGrG-3', wherein the 3'-terminal rG is a locked nucleic acid.

In some embodiments, the number of nucleotides in the template switch oligonucleotide is 10-30, e.g., 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30.

In some embodiments, a functional sequence, such as a unique molecular identifier (UMI) sequence, may also be included between the template switching region and the amplification adaptor region.

In some embodiments, the amplification adaptor region is incorporated into the cDNA once template switching is complete, and the complementary sequence of the amplification adaptor region is then available for subsequent library construction and sequencing.

In some embodiments, the kit further comprises at least one of a reverse transcriptase, a reverse transcription primer, a DNA polymerase, an RNase inhibitor, polyethylene glycol, betaine, dNTPs, and a solid support.

In some embodiments, the reverse transcriptase includes, but is not limited to, at least one of MMLV reverse transcriptase, HIV reverse transcriptase, and AMV reverse transcriptase. The reverse transcriptase may be a wild type reverse transcriptase or a mutant reverse transcriptase obtained by subjecting a wild type reverse transcriptase to genetic engineering, for example, random mutation, site-directed mutation, DNA shuffling or the like for one or more of the purposes of reducing the RNase H activity, improving the synthesis ability, improving the thermostability and the like.

In some embodiments, the DNA polymerase includes, but is not limited to, at least one of Taq DNA polymerase, Klenow DNA polymerase, Bst DNA polymerase, Pfu DNA polymerase, Tfi DNA polymerase, Tfl DNA polymerase, Vent DNA polymerase, KOD DNA polymerase, phi29 DNA polymerase, and the like. It will be appreciated that the DNA polymerase may be a wild type of the above polymerases, or a mutant or modified form (e.g., antibody-modified or chemically modified) designed to improve specificity, fidelity, heat resistance, amplification rate or the like.

In some embodiments, the solid support may be made of at least one material selected from inorganic materials, natural polymers and synthetic polymers, including but not limited to cellulose and its derivatives (e.g., nitrocellulose), resins, glass, silica gel, polystyrene, agarose, gelatin, polyvinylpyrrolidone, vinyl-acrylamide copolymers, polyacrylamide, latex, dextran, rubber, silicon, plastic, natural sponge, metal plastic, hydrogel, and the like. In some embodiments, the solid support has a planar structure, such as a slide, chip, microchip or array. In some embodiments, the solid support or its surface is non-planar, such as the inner or outer surface of a tube or vessel. In some embodiments, the solid support comprises a microsphere or bead, and in some embodiments, the solid support comprises an array of beads or wells. In some embodiments, the solid support comprises any one of a bead, a chip, glass, a sensor, an electrode and a silicon wafer.

In some embodiments, the solid support has a barcode probe immobilized directly or indirectly thereon.

The barcode probe comprises, from the 5' end to the 3' end, an amplification adaptor region', a barcode sequence and a capture sequence.

The barcode sequence is a spatial barcode or a cell barcode: a spatial barcode identifies the spatial location of transcriptome data in spatial transcriptome sequencing, and a cell barcode identifies the cell of origin of transcriptome data in single-cell transcriptome sequencing. In some embodiments, the barcode sequence is 3-30 nucleotides in length, e.g., 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 15, 16, 18, 20, 22, 24, 25, 26, 28 or 30 nucleotides in length. When the barcode sequence is a spatial barcode, different barcode probes ("different" meaning that the spatial barcode sequences differ; the capture sequences may be the same or different) are typically immobilized at different positions on the same solid support. That is, each position (i.e., spot) may carry one barcode probe or one cluster of barcode probes, and the barcode sequences of the probes or probe clusters at different positions differ. When the barcode sequence is a cell barcode, different barcode probes ("different" meaning that the cell barcode sequences differ; the capture sequences may be the same or different) are typically immobilized on different solid supports. That is, each solid support carries a unique barcode probe or cluster of barcode probes whose barcode sequence differs from those on the other solid supports.
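
The difference between the two barcode types is only in what a barcode maps to: a position on one solid support, or one solid support among many. A minimal Python sketch of this lookup follows. The function name, the fixed barcode offset, the exact-match lookup and all sequences are hypothetical simplifications chosen for illustration; real pipelines typically tolerate sequencing errors and variable offsets.

```python
from collections import defaultdict


def group_by_barcode(reads, barcode_start, barcode_len, whitelist):
    """Group reads by the label of the barcode found at a fixed offset."""
    groups = defaultdict(list)
    for read in reads:
        barcode = read[barcode_start:barcode_start + barcode_len]
        label = whitelist.get(barcode)
        if label is not None:  # reads with unknown barcodes are dropped
            groups[label].append(read)
    return dict(groups)


# Spatial barcodes map to spot coordinates on the same solid support.
spatial_whitelist = {"ACGTACGT": (0, 0), "TTGGCCAA": (0, 1)}
# Cell barcodes map to distinct solid supports (e.g., beads), one per cell.
cell_whitelist = {"ACGTACGT": "bead_1", "TTGGCCAA": "bead_2"}

reads = ["ACGTACGTGATTACA", "TTGGCCAACATCAT", "GGGGGGGGAAAA"]
by_spot = group_by_barcode(reads, 0, 8, spatial_whitelist)
by_cell = group_by_barcode(reads, 0, 8, cell_whitelist)
print(by_spot)
print(by_cell)
```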

The capture sequence is complementary to a portion of the sequence of the RNA from the sample to be tested, thereby allowing the RNA from the sample to be tested to be bound to the solid support. In some embodiments, the capture sequence may be a polyT sequence that can be complementarily paired to bind to the polyA tail of an RNA molecule. In some embodiments, the capture sequence may be a target specific sequence, whereby targeted capture sequencing and the like may be performed. In some embodiments, the capture sequence may also be a random sequence.

In some embodiments, cleavage sites, e.g., USER cleavage sites or exonuclease cleavage sites, may also be included in the 5' direction of the amplification adaptor region for release of products (e.g., labeled cDNA molecules). Alternatively, without cleavage sites, the products may be eluted with an alkaline solution, or the second strand may be released after second-strand synthesis.

In some embodiments, functional sequences such as UMI (unique molecular identifier) may also be included between the barcode sequence and the capture sequence.

In some embodiments, the method comprises the steps of:

(1) Performing a reverse transcription reaction using a barcode probe as the primer and RNA in a tissue or cell as the template to obtain cDNA, wherein the barcode probe is directly or indirectly immobilized on a solid support and comprises, from the 5' end to the 3' end, an amplification adaptor region', a barcode sequence and a capture sequence, the barcode sequence is a spatial barcode or a cell barcode, and the capture sequence is complementary to part of the sequence of RNA from the sample to be tested;

(2) Adding a non-template polymeric sequence 2-5 nucleotides in length to the 3' end of the cDNA to obtain an extension product;

(3) Hybridizing the non-template polymeric sequence of the extension product to the template switching region of the template switching oligonucleotide, and continuing extension using the template switching oligonucleotide as the template to obtain library initial molecules, wherein each library initial molecule comprises, in order from the 5' end to the 3' end, the amplification adaptor region', the barcode sequence, the capture sequence, a cDNA sequence and the complementary sequence of the amplification adaptor region;

(4) Amplifying or enriching the library initial molecules using a first library amplification primer and a second library amplification primer to obtain a spatial transcriptome sequencing library or a single-cell transcriptome sequencing library, wherein the 3' sequence of the first library amplification primer is complementary to the complementary sequence of the amplification adaptor region, and the 3' sequence of the second library amplification primer is identical to the amplification adaptor region'.
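
The primer relationships in step (4) can be checked with a small string model. The sketch below is an assumption-laden illustration rather than a primer design tool: `anneals_to` uses exact full-length complementarity, the amplification adaptor region', barcode and cDNA sequences are hypothetical, and only the TSO adaptor is taken from the conventional TSO primer listed in Example 1.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def anneals_to(primer: str, template: str) -> bool:
    """True if the primer (5'->3') is fully complementary to a site on the template (5'->3')."""
    return reverse_complement(primer) in template


tso_adaptor = "AAGCAGTGGTATCAACGCAGAG"  # adaptor of the conventional TSO primer
adaptor_prime = "TTGACCAGTC"            # hypothetical amplification adaptor region'

# Library initial molecule (first strand), written 5'->3'.
first_strand = (adaptor_prime + "ACGTACGT" + "TTTTTTTT" + "GATTACAGATTACA"
                + "CCC" + reverse_complement(tso_adaptor))
second_strand = reverse_complement(first_strand)

first_primer = tso_adaptor     # complementary to the adaptor complement at the 3' end
second_primer = adaptor_prime  # identical to amplification adaptor region'

print(anneals_to(first_primer, first_strand))    # True
print(anneals_to(second_primer, second_strand))  # True
```

The second primer has no binding site until the first primer has produced the second strand, which is why its 3' sequence is described as identical to, rather than complementary to, the amplification adaptor region'.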

In some embodiments, steps (1)-(3) may be performed in the same reaction system. In this case, the reverse transcription of step (1) and the non-template sequence polymerization and further extension of steps (2) and (3) can be completed by the same reverse transcriptase. In other embodiments, the steps may be performed in different reaction systems.

In some embodiments, the non-template polymeric sequence comprises a homopolymeric sequence (e.g., 5'-CCC-3') or a sequence of different bases (e.g., 5'-CGC-3').
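
Because the two strands pair antiparallel, the sequence that pairs with a given non-template tail is its reverse complement. The short Python sketch below shows this for the two example tails above; the function name is hypothetical and the ribonucleotides are written with DNA letters.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def pairing_switch_region(non_template_tail: str) -> str:
    """Sequence (5'->3') that base-pairs with a non-template tail (5'->3')."""
    return non_template_tail.translate(_COMPLEMENT)[::-1]


print(pairing_switch_region("CCC"))  # GGG
print(pairing_switch_region("CGC"))  # GCG
```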

In some embodiments, the invalid fragments that can be reduced by the above library construction method include at least one of non-specific TSO sequences, tandem sequences of a barcode probe and a TSO, and the like.

The non-specific TSO sequences include TSO-containing sequences that lack the barcode sequence, with a structure such as TSO-cDNA-TSO complementary sequence. The tandem sequences of the barcode probe and the TSO include sequences that lack cDNA, with a structure such as barcode probe-TSO complementary sequence.
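
The two invalid structures can be told apart from a valid molecule by what flanks the TSO complementary sequence. The following Python sketch classifies reads on that basis. It is a deliberately simplified illustration: the function and label names are hypothetical, matching is by exact prefix and suffix against one known barcode probe, and reads are assumed to be in first-strand orientation. The TSO is written with DNA letters, and the probe and cDNA sequences are placeholders.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def classify_read(read: str, barcode_probe: str, tso: str, min_cdna_len: int = 1) -> str:
    """Classify a read by its end structure (toy model, exact matching only)."""
    tso_rc = reverse_complement(tso)
    if not read.endswith(tso_rc):
        return "other"
    if read.startswith(tso):
        return "tso_cdna_tso"  # non-specific TSO product lacking the barcode
    if read.startswith(barcode_probe):
        insert_len = len(read) - len(barcode_probe) - len(tso_rc)
        # No cDNA between the probe and the TSO complement means a tandem product.
        return "valid" if insert_len >= min_cdna_len else "probe_tso_tandem"
    return "other"


tso = "AAGCAGTGGTATCAACGCAGAG" + "GGG"           # conventional TSO, DNA letters
probe = "TTGACCAGTC" + "ACGTACGT" + "TTTTTTTT"   # hypothetical barcode probe
cdna = "GATTACAGATTACA"                          # hypothetical cDNA
tso_rc = reverse_complement(tso)

print(classify_read(probe + cdna + tso_rc, probe, tso))  # valid
print(classify_read(tso + cdna + tso_rc, probe, tso))    # tso_cdna_tso
print(classify_read(probe + tso_rc, probe, tso))         # probe_tso_tandem
```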

In some embodiments, the method of amplification includes, but is not limited to, at least one of Polymerase Chain Reaction (PCR), isothermal amplification (e.g., loop-mediated isothermal amplification LAMP, recombinase polymerase amplification RPA, rolling circle amplification RCA, cross primer amplification CPA, strand displacement amplification SDA, helicase dependent amplification HDA), and the like.

In some embodiments, different types of cDNA libraries, e.g., single-end or paired-end, linear or circular, may be constructed for different sequencing platforms.

In a fourth aspect of the invention, there is provided a sequencing method comprising the step of sequencing a spatial transcriptome library or a single cell transcriptome library obtained by the method described above.

In some embodiments, the sequencing is full-length sequencing without fragmentation. In some embodiments, the sequencing is full-length sequencing using a single-molecule sequencing method.

In some embodiments, the sequencing method includes any of first-generation, second-generation and third-generation sequencing. First-generation sequencing includes, for example, Maxam-Gilbert sequencing, Sanger dideoxy sequencing, pyrosequencing, automated fluorescence sequencing and sequencing by hybridization; second-generation sequencing includes, for example, Roche 454 GS FLX, Illumina Solexa, SOLiD, Ion Torrent and BGISEQ; third-generation sequencing includes, for example, HeliScope, PacBio HiFi, PacBio CLR, ONT and Cyclone WT.

In some embodiments, the sequencing method comprises at least one of single-cell sequencing, spatiotemporal omics sequencing, bulk RNA sequencing, direct RNA sequencing, and the like. In some embodiments, sequencing at the single-cell level usually requires capturing single cells in advance; capture methods include limiting dilution, flow sorting, laser microdissection, micromanipulation, and newer capture methods based on microfluidics, among others. In some embodiments, to achieve sequencing at the spatial level, tissue samples are frozen or paraffin-embedded and then sectioned, and the tissue sections are attached to a specific carrier (e.g., chip, magnetic bead, etc.) for nucleic acid capture and labeling.

In a further aspect of the invention, there is provided the use of the template switching oligonucleotide, the kit, the method of constructing a spatial transcriptome sequencing library or a single-cell transcriptome sequencing library, or the sequencing method described above in an omics assay or in the preparation of a product for an omics assay.

In some embodiments, the omics assay comprises a transcriptome assay. In some embodiments, the transcriptome assay includes single-cell transcriptome assays (scRNA-seq), spatiotemporal transcriptome assays (spatial transcriptome), and non-single-cell transcriptome assays (bulk RNA sequencing), which may specifically be Smart-seq, Smart-seq2, 10X Genomics, MGI (DNBelab C) transcriptome sequencing, and the like. In some embodiments, the omics assay also includes integration of the transcriptome with other omics (e.g., genomic, proteomic, metabolomic) assays, for example a single-cell multi-omics assay, a spatiotemporal omics assay, and the like.

In some embodiments, the foregoing template switching oligonucleotide, kit, or method of constructing a spatial transcriptome sequencing library or single-cell transcriptome sequencing library is applied to (a) sequencing library preparation prior to omics analysis, or (b) preparation of sequencing library products for omics analysis. The sequencing platform comprises second-generation sequencing or single-molecule long-read (third-generation) sequencing, such as single-molecule real-time sequencing or nanopore sequencing.

In some embodiments, the sequencing library is a library that is constructed by selective capture of the 3' end of the mRNA molecules of the transcriptome. In this case, the improvement of the effective data utilization rate is more remarkable in the above manner.

Taking single-cell transcriptome sequencing as an example of library construction by selective capture from the 3' end, the library can be constructed with reference to CN112005115A.

The present invention will be described in further detail with reference to specific examples.

It is to be understood that these examples are illustrative of the present invention and are not intended to limit the scope of the present invention.

Experimental methods for which specific conditions are not noted in the following examples are generally conducted under conventional conditions or under conditions recommended by the manufacturer. Unless otherwise specified, the materials, reagents and the like used in the examples are commercially available.

Example 1

This example provides DNBelab C4 single-cell transcriptome library construction and sequencing, with the following specific procedure:

1. Single cell library optimization

1. Droplet generation system formulation

The system was formulated according to the standard protocol of the DNBelab C series high-throughput single-cell RNA library preparation kit set V3.0 (MGI, 940-001819-00).

1.1 Preparation of sample phase suspension

Cell suspensions of the different samples were prepared according to the kit recommendations. The human-mouse mixed cell line sample consisted of HEK293T and NIH3T3 cells mixed at a 1:1 ratio; the human peripheral blood mononuclear cells (PBMC) were obtained from the peripheral blood of healthy volunteers; and the mouse brain nuclei were extracted from dissociated mouse brain tissue. The single-cell suspensions from the different sample sources were loaded onto the chip at 20,000 cells each. The prepared cell suspension was gently mixed by pipetting, and the kit components were used to prepare the sample phase suspension according to Table 1:

Table 1. Sample phase suspension system

RT Primer-V3 was replaced by the TSO Primer at 100 μmol/L. After preparation, the mixture was gently mixed by pipetting, briefly centrifuged, and placed on ice until use.

The sequence of the conventional TSO Primer (RT Primer-V3) is 5'-AAGCAGTGGTATCAACGCAGAG/rG//rG//rG/-3';

The sequences of the TSO primers of the experimental group were:

1)5'-AAGCAGTGGTATCAACGCAGAG/rG//rG//XNA_G/-3';

2)5'-AAGCAGTGGTATCAACGCAGAG/rG//rG//rG/-3'

Here, rG is a guanine ribonucleotide. In 1), XNA_G is a guanine ribonucleotide modified as a locked nucleic acid, and the 3'-OH of the sugar is replaced by 3'-P (phosphate group modification). In 2), the last rG at the 3' end is a guanine ribonucleotide carrying, respectively, a 3'-H (deoxy modification) or a 3'-C3 Spacer modification (C3 alkyl chain modification).

(Remark: the TSO modification is not limited to the single-cell C4 commercial kit and can be adapted to the experimental platform; likewise, the PCR amplification primers, amplification enzymes, and purification magnetic beads can be substituted and are not limited to the components of this product.)
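The TSO variants described above can be represented programmatically, which is convenient when tracking which variant went into which library. The following is a minimal sketch, not part of the disclosed method: the `Tso` class, the `parse_tso` helper, and the descriptive labels for the 3' end are hypothetical names introduced for illustration, and the assignment of each 3' modification to a sequence follows one reading of the description above. Only the adapter sequence and the slash-delimited unit notation are taken from the text.

```python
import re
from dataclasses import dataclass

ADAPTER = "AAGCAGTGGTATCAACGCAGAG"  # amplification linker sequence quoted in the text


@dataclass(frozen=True)
class Tso:
    name: str             # hypothetical label used only in this sketch
    units: tuple          # template switching region, 5'->3'
    three_prime_end: str  # descriptive label for the 3'-terminal position

    def notation(self) -> str:
        # Rebuild the 5'-ADAPTER/u1//u2//u3/-3' string used in the text.
        return "5'-" + ADAPTER + "".join(f"/{u}/" for u in self.units) + "-3'"

    def is_blocked(self) -> bool:
        # Assumption: only a free 3'-OH counts as extendable in this sketch.
        return self.three_prime_end != "3'-OH"


def parse_tso(notation: str):
    """Return (adapter, units) from a 5'-ADAPTER/u1//u2/.../-3' string."""
    m = re.fullmatch(r"5'-([ACGT]+)((?:/[^/]+/)+)-3'", notation.strip())
    if m is None:
        raise ValueError(f"unrecognised TSO notation: {notation!r}")
    return m.group(1), tuple(re.findall(r"/([^/]+)/", m.group(2)))


VARIANTS = [
    Tso("conventional", ("rG", "rG", "rG"), "3'-OH"),
    Tso("LNA", ("rG", "rG", "XNA_G"), "LNA"),
    Tso("3'-phosphate", ("rG", "rG", "rG"), "3'-P"),
    Tso("3'-deoxy", ("rG", "rG", "rG"), "3'-H"),
    Tso("C3 spacer", ("rG", "rG", "rG"), "3'-C3 Spacer"),
]

for tso in VARIANTS:
    print(f"{tso.name:14s} {tso.notation()}  blocked={tso.is_blocked()}")
```

Keeping the 3'-end state as a separate field reflects the fact that several experimental variants share the same rGrGrG sequence and differ only in the terminal modification.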

1.2 Preparation of magnetic bead phase suspension

The Cell Beads-V3 and Index Carrier in the kit set were taken out and inverted or pipetted until completely mixed. Cell Beads-V3 and Index Carrier were transferred to a 0.2 mL low-adsorption PCR tube in the single-sample amounts given in Table 2. The PCR tube was placed on a magnetic rack for 3-5 minutes, and the supernatant was slowly discarded to avoid loss of magnetic beads. The PCR tube was removed from the magnetic rack, and the Beads Buffer and the Lysis Buffer-V3 in the kit set were added in sequence. After preparation, the suspension was gently pipetted until completely mixed, briefly centrifuged, and placed on ice until use.

Table 2. Magnetic bead phase suspension system

2. Droplet generation

Slides were prepared according to the instructions of the kit set and placed in a droplet generator; 80 μL of sample phase suspension, 950 μL of droplet generation oil (P100 Oil), and 100 μL of magnetic bead phase suspension were then added in sequence to start droplet generation.

3. Obtaining cDNA product

1) After droplet generation was complete, the droplets were collected in a PCR tube, and the reverse transcription reaction (42 °C for 90 min; 10 cycles of 50 °C for 2 min and 42 °C for 2 min; 85 °C for 5 min; hold at 4 °C) was performed according to the instructions of the kit.

2) After the reverse transcription reaction was complete, a demulsifier (Breakage Reagent, MGI, 940-001820-00) was added, and the demulsified aqueous phase (middle layer) was purified with 0.6X (magnetic bead volume : sample volume) magnetic beads (DNA Clean Beads, MGI, 940-001820-00). PCR amplification (95 °C for 3 min; 10 cycles of 98 °C for 20 s, 65 °C for 30 s, and 72 °C for 3 min; 72 °C for 10 min; hold at 12 °C) was then performed using CDNA AMP PRIMER-V3 (MGI, 940-001819-00), in which the upstream primer is the adaptor sequence of the TSO and the downstream primer is the 5'-terminal sequence of the probe.

3) After the amplification reaction was complete, 0.6X (magnetic bead volume : sample volume) magnetic bead purification was performed (DNA Clean Beads, MGI, 940-001820-00). The purified full-length cDNA product was then subjected to concentration measurement and fragment size distribution analysis.
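As a worked check of the thermal programs and the 0.6X bead ratio quoted in the steps above, the arithmetic can be sketched as follows. This is an illustrative calculation only: the function names and the 100 μL sample volume are hypothetical, ramp times between temperatures are ignored, and the final 4 °C and 12 °C holds are excluded because their duration is open-ended.

```python
# Each program is a list of (repeats, [(temperature_C, seconds), ...]) blocks.
RT_PROGRAM = [
    (1, [(42, 90 * 60)]),
    (10, [(50, 2 * 60), (42, 2 * 60)]),
    (1, [(85, 5 * 60)]),
]

PCR_PROGRAM = [
    (1, [(95, 3 * 60)]),
    (10, [(98, 20), (65, 30), (72, 3 * 60)]),
    (1, [(72, 10 * 60)]),
]


def total_seconds(program):
    """Sum the programmed hold times, ignoring ramping and the final hold."""
    total = 0
    for repeats, steps in program:
        total += repeats * sum(seconds for _temp_c, seconds in steps)
    return total


def as_min_sec(seconds):
    """Format a whole number of seconds as 'M min SS s'."""
    minutes, rest = divmod(seconds, 60)
    return f"{minutes} min {rest:02d} s"


def bead_volume_ul(sample_volume_ul, ratio=0.6):
    """Bead volume for a given bead:sample ratio (0.6X in the text)."""
    return sample_volume_ul * ratio


print("Reverse transcription:", as_min_sec(total_seconds(RT_PROGRAM)))
print("cDNA amplification:   ", as_min_sec(total_seconds(PCR_PROGRAM)))
# 100 uL is a hypothetical sample volume, not a value from the protocol.
print("Beads for 100 uL sample:", bead_volume_ul(100.0), "uL")
```

Under these assumptions the reverse transcription program sums to 135 min of programmed hold time and the cDNA amplification program to 51 min 20 s.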

4. Short-read library construction and sequencing

1) One third of the full-length cDNA product obtained in the previous step was taken for fragmentation, end repair, adapter ligation, and amplification of the fragmented product (MGI, 940-001821-00);

2) The amplified library products were sequenced on the DNBSEQ platform.

2. Single molecule library sequencing

1) The full-length cDNA product obtained in step 3 of the single cell library optimization procedure was used to construct a single-molecule library according to the instructions of the CycloneSEQ library preparation kit (MGI, H940-000001-00), including end repair with A-tailing and adapter ligation.

2) The library product was loaded at the amount specified in the kit instructions and subjected to single-molecule sequencing.

If the full-length cDNA product obtained in step 3 is insufficient as starting material for single-molecule library construction, it can be amplified further as appropriate. Likewise, the cDNA obtained by this procedure is compatible with other commercially available single-molecule sequencing platforms, in which case library construction is carried out according to the commercial instructions of each platform.

3. Analysis of results

The data utilization of the sequencing results was calculated according to the following formula:

Data utilization = (number of effective reads / total number of reads) × 100%.
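The formula can be sketched in code as follows. This is a minimal illustration, assuming the numerator is the count of reads judged usable (effective) by the analysis pipeline; the criterion for an effective read is not specified here, and the function name and the example counts are hypothetical rather than experimental values.

```python
def data_utilization(effective_reads: int, total_reads: int) -> float:
    """Return effective reads as a percentage of all reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= effective_reads <= total_reads:
        raise ValueError("effective_reads must lie between 0 and total_reads")
    return effective_reads / total_reads * 100.0


# Hypothetical counts for illustration only; not data from this example.
print(f"{data_utilization(750_000, 1_000_000):.1f}%")
```

Comparing this percentage between the conventional TSO and each modified TSO on the same sample type is the comparison that the results below describe.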

The results are shown in Fig. 2. The long-read sequencing data from the single-molecule libraries show that, for the human-mouse cell line, PBMC, and mouse brain nuclei samples alike, the TSO structural variants obtained by modifying the 3'-hydroxyl site of the terminal nucleotide of the TSO or by adding a molecular spacer systematically improve the effective data utilization rate. This indicates that the TSO 3'-end blocking strategy can effectively suppress TSO-mediated non-specific amplification by blocking non-templated polymerase extension through multiple routes, providing a standardized solution for workflow optimization of single-cell and spatial transcriptome technologies.

The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to them. Any other changes, modifications, substitutions, combinations, and simplifications made without departing from the spirit and principle of the present invention are equivalent replacements and are included in the protection scope of the present invention.

Claims

1. A template switching oligonucleotide, characterized in that the template switching oligonucleotide comprises: a template switching region located at the 3' end of the template switching oligonucleotide, and an amplification linker region located 5' to the template switching region;

The length of the template switching region is 2 to 5 nucleotides; the 3'-end nucleotide of the template switching region has a 3'-end modification, and the 3'-end modification is used to prevent the 3'-end nucleotide of the template switching region from forming a phosphodiester bond with a deoxynucleotide and/or a free 5'-phosphate group of a deoxynucleotide sequence.

2. The template switching oligonucleotide according to claim 1, wherein the 3'-terminal nucleotide with a 3'-terminal modification is a locked nucleic acid, or a 3'-terminal nucleotide in which the 3'-terminal hydroxyl group is replaced by an -O-R or -R' group;

Wherein, R is any one of a C1-C18 alkyl group, a C2-C18 alkenyl group, a C2-C18 alkynyl group, an aryl group of C18 or less, an aralkyl group of C18 or less, a heteroaryl group of C18 or less, a phosphate group, an azidomethyl group, an allyl group, a nitrobenzyl group, a carbamate group, and a thiophosphate group; and R' is any one of H, a C1-C18 alkyl group, a C2-C18 alkenyl group, a C2-C18 alkynyl group, an aryl group of C18 or less, an aralkyl group of C18 or less, and a heteroaryl group of C18 or less.

3. The template switching oligonucleotide according to claim 1, wherein the template switching region consists of 3 ribonucleotides.

4. The template switching oligonucleotide according to claim 3, wherein the sequence of the template switching region is rGrGrG.

5. A kit, characterized in that it comprises the template switching oligonucleotide according to any one of claims 1 to 4.

6. The kit according to claim 5, characterized in that the kit further comprises at least one of a reverse transcriptase, a reverse transcription primer, a DNA polymerase, an RNase inhibitor, polyethylene glycol, betaine, dNTPs, and a solid phase carrier; wherein a barcode probe is directly or indirectly fixed to the solid phase carrier; the barcode probe comprises, from the 5' end to the 3' end: an amplification linker region, a barcode sequence, and a capture sequence; the barcode sequence is a spatial barcode or a cellular barcode, and the capture sequence is complementary to a partial sequence of RNA derived from the sample to be tested.

7. A method for constructing a spatial transcriptome sequencing library or a single-cell transcriptome sequencing library using the template switching oligonucleotide according to any one of claims 1 to 4.

8. The method according to claim 7, characterized in that it comprises the following steps:

(1) Using a barcode probe as a primer and RNA in a tissue or cell as a template, a reverse transcription reaction is performed to obtain cDNA; wherein the barcode probe is directly or indirectly fixed on a solid phase carrier and includes, from the 5' end to the 3' end: an amplification linker region, a barcode sequence, and a capture sequence; the barcode sequence is a spatial barcode or a cellular barcode, and the capture sequence is complementary to a portion of the sequence of the RNA derived from the sample to be tested;

(2) Introducing a non-templated polymerization sequence of 2 to 5 nucleotides in length at the 3' end of the cDNA to obtain an extension product;

(3) The non-templated polymerization sequence of the extension product is hybridized with the template switching region of the template switching oligonucleotide according to any one of claims 1 to 4, and the template switching oligonucleotide is used as a template to continue extension to obtain a library starting molecule; the library starting molecule includes, from the 5' end to the 3' end, an amplification linker region, a barcode sequence, a capture sequence, a cDNA sequence, and a complementary sequence to the amplification linker region;

(4) Amplifying or enriching the library starting molecules using a first library amplification primer and a second library amplification primer to obtain the spatial transcriptome sequencing library or the single-cell transcriptome sequencing library; wherein the 3' end sequence of the first library amplification primer is complementary to the complementary sequence of the amplification linker region, and the 3' end sequence of the second library amplification primer is complementary to the complementary sequence of the amplification linker region.

9. A sequencing method, comprising the step of sequencing the spatial transcriptome library or single-cell transcriptome library obtained by the method according to claim 7 or 8.

10. The sequencing method according to claim 9, wherein the sequencing is full-length sequencing performed using a single-molecule sequencing method.