
WO2025062002A1 - Simultaneous sequencing using single-stranded nick translation - Google Patents

Info

Publication number
WO2025062002A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequencing
sequence
cleavable
primers
polynucleotide
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/EP2024/076525
Other languages
English (en)
Inventor
Aathavan KARUNAKARAN
Eli CARRAMI
Merek SIU
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Marks & Clerk LLP
Illumina Inc
Original Assignee
Marks & Clerk LLP
Illumina Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Marks & Clerk LLP, Illumina Inc

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Definitions

  • the invention relates to methods and kits for use in nucleic acid sequencing, in particular methods for use in double-stranded sequencing, and in particular for use in double-stranded sequencing by synthesis (SBS).
  • the methods of the invention can be used in concurrent sequencing.
  • the invention also relates to methods and kits for use in nucleic acid sequencing, in particular methods for use in concurrent sequencing using nick translation, and in particular using a polymerase with 5’ to 3’ exonuclease activity.
  • next-generation sequencing technologies
  • a nucleic acid cluster is created on a flow cell by amplifying an original template nucleic acid strand. Sequencing cycles may be performed as complementary strands of the template nucleic acids are being synthesized, i.e., using sequencing-by-synthesis (SBS) processes.
  • SBS sequencing-by-synthesis
  • deoxyribonucleic acid analogs conjugated to fluorescent labels are hybridized to the template nucleic acids, and excitation light sources are used to excite the fluorescent labels on the deoxyribonucleic acid analogs.
  • Detectors capture fluorescent emissions from the fluorescent labels and identify the deoxyribonucleic acid analogs.
  • the sequence of the template nucleic acids may be determined by repeatedly performing such sequencing cycles.
  • NGS allows for the sequencing of a number of different template nucleic acids simultaneously, which has significantly reduced the cost of sequencing in the last twenty years.
  • G-quads are highly stable secondary structures formed by G-rich sequences in DNA and RNA.
  • the structures of guanine quartets sterically hinder DNA polymerase’s movement along a polynucleotide and diminish the enzyme’s kinetics and fidelity. As such, reducing the formation of G-quads is highly desirable.
  • the high stability of G-quads means these structures are difficult to remove without affecting the integrity of the template strand.
  • one solution may be to use cleavable sequencing primers to perform double-stranded SBS, consequently improving the accuracy of the sequence read.
  • a method of sequencing a first and second polynucleotide sequence comprises preparing the first and second polynucleotide sequence for sequencing and sequencing the nucleobases in the first and second polynucleotide sequence, wherein the first and second polynucleotide sequences are sequenced using nick translation.
  • applying a plurality of first and second sequencing primers, wherein the first sequencing primers comprise a first cleavable site and wherein the second sequencing primers comprise a second cleavable site, and conducting an amplification reaction to extend the first and second sequencing primers; and e. selectively cleaving the first cleavable site in the first sequencing primer and/or selectively cleaving the second cleavable site in the second sequencing primer.
  • the first and second immobilised primers that have not been extended are removed using a single-stranded exonuclease.
  • first adaptor comprises at least one cleavable site and the second adaptor comprises a complement of at least one cleavable site, and wherein the first and second adaptors comprise an immobilised primer-binding sequence and a sequencing primer binding sequence or complements thereof; c. hybridising the first and second polynucleotide sequences to first and second immobilised primers on a solid support, wherein optionally, the first and second immobilised primers comprise at least one cleavable site; d. forming a cluster of amplified first and second polynucleotide sequences; and e. cleaving at least one cleavable site to provide sequencing sites for nick translation.
  • the first adaptor may comprise two or at least two cleavable sites and the second adaptor may comprise two or at least two cleavable sites.
  • the method may further comprise a step of removing a or the sequence between the cleavable sites.
  • the method may further comprise blocking 3’-ends of the first polynucleotide sequences and the second polynucleotide sequences.
  • the method may further comprise applying a plurality of first and second sequencing primers, wherein the first or second sequencing primers comprise a mixture of blocked sequencing primers and unblocked sequencing primers.
  • the blocked sequencing primer may comprise a blocking group at a 3’ end of the sequencing primer.
  • the blocking group may be selected from the group consisting of: a hairpin loop, a deoxynucleotide, a deoxyribonucleotide, a hydrogen atom instead of a 3’-OH group, a phosphate group, a phosphorothioate group, a propyl spacer, a modification blocking the 3’-hydroxyl group, or an inverted nucleobase.
  • the first or second sequencing primers may comprise a mixture of blocked sequencing primers and unblocked sequencing primers wherein the ratio of blocked to unblocked sequencing primers is 1:1.
  • the cleavable sites may be restriction sites for a nicking endonuclease.
  • the cluster may be formed by bridge amplification.
  • the step of cleaving at the cleavage sites is carried out following bridge amplification and before linearization of the extended sequences.
  • the first polynucleotide sequence is a forward strand of a double-stranded polynucleotide to be identified and the second polynucleotide sequence is a reverse strand of a double-stranded polynucleotide to be identified.
  • a method of simultaneously sequencing a first and second polynucleotide sequence comprising: preparing polynucleotide sequences for simultaneous sequencing using a method according to the invention; and concurrently sequencing nucleobases in the first portion and the second portion.
  • the step of concurrently sequencing nucleobases comprises performing sequencing-by-synthesis or sequencing-by-ligation.
  • the step of concurrently sequencing nucleobases comprises treatment with a polymerase and a 5’-3’ exonuclease or a polymerase with 5’ to 3’ exonuclease activity.
  • a sequencing kit comprising a plurality of first and second sequencing primers, wherein the first and second sequencing primers comprise at least one cleavage site.
  • the kit may further comprise first and second sequencing primers wherein the first and second sequencing primers comprise a mixture of cleavable sequencing primers and uncleavable sequencing primers.
  • the kit may further comprise a polymerase and a 5’ to 3’ exonuclease or a polymerase with 5’ to 3’ exonuclease activity.
  • the kit may further comprise a single-stranded exonuclease and cleavable effectors, such as nickases.
  • a library preparation kit comprising a plurality of first adaptors and a plurality of second adaptors, wherein the first adaptors comprise at least one cleavable site and the second adaptors comprise a complement of at least one cleavable site, and wherein the first and second adaptors comprise an immobilised primer-binding sequence and a sequencing primer binding sequence or complements thereof.
  • a library preparation kit comprising a plurality of first and second sequencing primers, wherein the first and second sequencing primers comprise at least one cleavage site, and wherein the kit comprises a plurality of first adaptors and a plurality of second adaptors, wherein the first adaptors comprise at least one cleavable site and the second adaptors comprise a complement of at least one cleavable site, and wherein the first and second adaptors comprise an immobilised primer-binding sequence and a sequencing primer binding sequence or complements thereof.
  • the kit may further comprise first and second sequencing primers wherein the first and second sequencing primers comprise a mixture of blocked sequencing primers and unblocked sequencing primers, and wherein the ratio of blocked to unblocked sequencing primers is 1:1.
  • the kit may also comprise a polymerase and a 5’ to 3’ exonuclease or a polymerase with 5’ to 3’ exonuclease activity.
  • a data processing device comprising means for carrying out the method of the invention.
  • the data processing device is a polynucleotide sequencer.
  • a computer-readable storage medium comprising instructions which, when executed by a processor, cause the processor to carry out a method of the invention.
  • a computer-readable data carrier having stored thereon a computer program product of the invention.
  • the present invention provides all the advantages of double-stranded SBS sequencing, without the need to reconfigure the immobilised and/or sequencing primers.
  • Figure 1 shows a forward strand, reverse strand, forward complement strand, and reverse complement strand of a polynucleotide molecule.
  • Figure 2 shows an example of a polynucleotide sequence (or insert) with 5’ and 3’ adaptor sequences.
  • Figure 3 shows a typical polynucleotide with 5’ and 3’ adaptor sequences.
  • the second extended immobilised strand may comprise a second primer sequence at its 5’ end (e.g.
  • the first and second library strands may be the forward strand and reverse strand respectively of a polynucleotide duplex.
  • G Hybridising the 3’ primer binding sequence of the first library strand to a first lawn primer and hybridising the 3’ primer binding sequence of the second library strand to a second lawn primer; and carrying out an extension reaction to extend the lawn primers to generate a first or second immobilised (also referred to herein as extended) template strand complementary to the library strands, wherein the immobilised strands comprise a 3’ (second or first respectively) primer binding sequence.
  • Figure 5 shows a comparison between four workflows: (A) the conventional NGS SBS workflow, (B) the SPEAR workflow, (C) single-read dsDNA nick translation, and (D) 16 QAM SEAR using nick translation SBS. These workflows are only examples and can be altered according to the embodiments described herein. The dark font highlights which stages of the conventional SBS and SPEAR SBS workflows are eliminated using the nick translation of this invention.
  • Figure 6 shows an exemplar solid support 200 comprising a substrate 204 with a plurality of wells 203. Immobilised primers 201, 202 are found within the well.
  • Figure 7 shows a graphical representation of sixteen distributions of signals generated by polynucleotide sequences according to one embodiment.
  • Figure 8 shows the detection of nucleobases using 4-channel, 2-channel and 1-channel chemistry.
  • Figure 9 shows that by plotting relative intensities of light signals obtained from a first channel (ch1) and a second channel (ch2), a constellation of 16 clouds is obtained.
  • Figure 10 shows a flow diagram showing a method for base calling according to one embodiment.
  • Figure 11 shows G-quad coverage.
  • Figure 12 shows that dsDNA-SBS using the present invention reduces sequencing errors at G-quad regions.
  • Reads 1-5 represent dsDNA-SBS and read 6 represents standard SBS.
  • Read 6 (standard SBS) had a greater error rate at G-quad regions (shown by the peak) than the multiple reads using ds-DNA-nt-SBS (reads 1-5).
  • Figure 13 illustrates the key stages of the single-read embodiment of this invention.
  • E) Nick translate SBS read 1 is completed, using a polymerase and a 5’->3’ exonuclease.
  • F) The 3’ end of the growing first read is blocked, and the second read primer is nicked.
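The nick-translation read described in stages E and F can be illustrated with a toy model. This is a minimal sketch, not the patented chemistry: the function name `nick_translate`, the example sequence and the nick position are all hypothetical, and one base is assumed to be incorporated per cycle while the exonuclease removes one base ahead of the nick.

```python
# Toy model of nick translation. All names and sequences are illustrative assumptions.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def nick_translate(template_3to5, nick_pos, cycles):
    """Advance a nick along a duplex, one base per cycle.

    template_3to5 is the intact strand written 3'->5', so index i pairs with
    index i of the nicked strand written 5'->3'.
    Returns the bases incorporated (the read) and the new nick position.
    """
    nicked = "".join(COMPLEMENT[b] for b in template_3to5)
    upstream, downstream = nicked[:nick_pos], nicked[nick_pos:]
    read = []
    for _ in range(cycles):
        if not downstream:
            break  # the nick has reached the end of the duplex
        # 5'->3' exonuclease activity removes the base just ahead of the nick.
        downstream = downstream[1:]
        # The polymerase extends the free 3' end with the base that pairs
        # with the template at that position.
        base = COMPLEMENT[template_3to5[len(upstream)]]
        upstream += base
        read.append(base)
    return "".join(read), len(upstream)


read, new_nick = nick_translate("TACGGATC", nick_pos=3, cycles=4)
print(read, new_nick)
```

In this model the duplex stays double-stranded throughout: every base removed ahead of the nick is replaced by a newly incorporated base behind it.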
  • Figure 14 illustrates the key stages of the simultaneous sequencing embodiment of this invention.
  • Nick-translation SBS of read 1 and read 2 are completed simultaneously. The difference in intensity between read 1 and read 2, which results from the mix of nickable and non-nickable primers, is decoded by 16 QAM.
  • Figure 15 shows preparation of a library using a loop fork method.
  • Figure 16 shows an example of a concatenated polynucleotide sequence comprising a first portion and a second portion, as well as terminal and internal adaptor sequences.
  • Figure 17 shows an example of a concatenated polynucleotide sequence comprising a first portion and a second portion, as well as terminal and internal adaptor sequences.
  • Figure 18 shows an example of possible cleavage positions (e.g. nicking sites) (arrows) within the 5’ and 3’ adaptors.
  • Each adaptor sequence may comprise at least one or at least two cleavage sites.
  • Exemplar possible cleavage sites include at the junctions between each component forming the 3’ or 5’ adaptor or at positions within each component of the adaptor; for example, a cleavage site may be positioned at the junction between an index (I5 or I7) and a SBS primer site (SBS3 or SBS12’), or alternatively be positioned within the SBS primer site or within the index sequence itself.
  • Figure 19 shows the key steps of one embodiment of the present invention: A) clustering by bridge amplification of first and second polynucleotide strands each comprising two cleavage sites within the adaptor regions; B) cleaving the first and second polynucleotide strands at a cleavable site found in the 5’ adaptor regions and melting the nicked region; C) Addition of first and second sequencing primers, wherein either the first or second sequencing primers comprise a mix of blocked and unblocked sequencing primers; D) Nick-translation sequencing-by-synthesis using a polymerase and a 5’-3’ exonuclease or a polymerase comprising 5’ to 3’ exonuclease activity.
  • the use of blocked and unblocked sequencing primers generates a different (preferably 2:1) intensity between read 1 and read 2, allowing for their simultaneous detection.
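Why a 2:1 intensity ratio lets two concurrent reads be separated can be shown with a small numerical sketch. The two-channel base encoding in `BASE_SIGNAL` is an assumption made for illustration, not taken from the source, and the weights 2.0 and 1.0 stand in for the preferred 2:1 intensity ratio. The functions `combined_signal`, `constellation` and `decode` are hypothetical names.

```python
from itertools import product

# Assumed two-channel encoding (channel 1, channel 2) per base; illustrative only.
BASE_SIGNAL = {"A": (1, 1), "C": (1, 0), "T": (0, 1), "G": (0, 0)}


def combined_signal(base1, base2, w1=2.0, w2=1.0):
    """Sum the signals of read 1 and read 2, weighted by their relative intensity."""
    s1, s2 = BASE_SIGNAL[base1], BASE_SIGNAL[base2]
    return (w1 * s1[0] + w2 * s2[0], w1 * s1[1] + w2 * s2[1])


def constellation(w1=2.0, w2=1.0):
    """Map each combined signal point to the base pairs that produce it."""
    points = {}
    for b1, b2 in product("ACGT", repeat=2):
        points.setdefault(combined_signal(b1, b2, w1, w2), []).append(b1 + b2)
    return points


def decode(signal, w1=2.0, w2=1.0):
    """Return the base pair(s) at the constellation point nearest to a measured signal."""
    points = constellation(w1, w2)
    nearest = min(points, key=lambda p: (p[0] - signal[0]) ** 2 + (p[1] - signal[1]) ** 2)
    return points[nearest]


print(len(constellation(2.0, 1.0)))  # distinct points with a 2:1 ratio
print(len(constellation(1.0, 1.0)))  # distinct points with equal intensities
print(decode((2.9, 1.1)))
```

Under these assumptions a 2:1 ratio gives every one of the sixteen base pairs its own signal point, whereas equal intensities make several pairs coincide, so the two reads could no longer be told apart.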
  • variant refers to a variant polypeptide sequence or part of the polypeptide sequence that retains desired function of the full non-variant sequence.
  • a desired function of the immobilised primer retains the ability to bind (i.e. hybridise) to a target sequence.
  • a “variant” has at least 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%,
  • Sequencing typically comprises four fundamental steps: 1) library preparation to form a plurality of target polynucleotides for identification; 2) cluster generation to form an array of amplified template polynucleotides; 3) sequencing the cluster array of amplified template polynucleotides; and 4) data analysis to identify characteristics of the target polynucleotides from the amplified template polynucleotide sequences. These steps are described in greater detail below.
  • the polynucleotide sequence 100 comprises a forward strand of the sequence 101 and a reverse strand of the sequence 102. See Figure 1.
  • replication of the polynucleotide sequence 100 provides a double-stranded polynucleotide sequence 100a that comprises a forward strand of the sequence 101 and a forward complement strand of the sequence 101’, and a double-stranded polynucleotide sequence 100b that comprises a reverse strand of the sequence 102 and a reverse complement strand of the sequence 102’.
  • the term “template” may be used to describe a complementary version of the double-stranded polynucleotide sequence 100.
  • the “template” comprises a forward complement strand of the sequence 101’ and a reverse complement strand of the sequence 102’.
  • by using the forward complement strand of the sequence 101’ as a template for complementary base pairing, a sequencing process (e.g. a sequencing-by-synthesis or a sequencing-by-ligation process) reproduces information that was present in the original forward strand of the sequence 101.
  • similarly, by using the reverse complement strand of the sequence 102’ as a template for complementary base pairing, a sequencing process (e.g. a sequencing-by-synthesis or a sequencing-by-ligation process) reproduces information that was present in the original reverse strand of the sequence 102.
  • the two strands in the template may also be referred to as a forward strand of the template 101’ and a reverse strand of the template 102’.
  • the complement of the forward strand of the template 101’ is termed the forward complement strand of the template 101.
  • the complement of the reverse strand of the template 102’ is termed the reverse complement strand of the template 102.
  • where the terms forward strand, reverse strand, forward complement strand, and reverse complement strand are used herein without qualifying whether they are with respect to the original polynucleotide sequence 100 or with respect to the “template”, these terms may be interpreted as referring to the “template”.
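The strand terminology above can be checked with a short sketch. It assumes that the forward strand 101 and the reverse strand 102 are the two complementary strands of one duplex, with every sequence written 5’ to 3’. The example sequence and the variable names are hypothetical.

```python
# Translation table for Watson-Crick base pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand):
    """Return the complementary strand, written 5'->3'."""
    return strand.translate(COMPLEMENT)[::-1]


forward_101 = "ATGGCATTC"                                  # hypothetical insert sequence
reverse_102 = reverse_complement(forward_101)              # other strand of the duplex
forward_complement_101p = reverse_complement(forward_101)  # 101', copied from 101
reverse_complement_102p = reverse_complement(reverse_102)  # 102', copied from 102

# Sequencing against 101' as the template rebuilds the information in 101,
# and sequencing against 102' rebuilds the information in 102.
print(reverse_complement(forward_complement_101p) == forward_101)
print(reverse_complement(reverse_complement_102p) == reverse_102)
```

Under this assumption 101’ carries the same sequence as 102, and 102’ carries the same sequence as 101, which is why reading both template strands recovers both strands of the original duplex.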
  • Library preparation is the first step in any high-throughput sequencing platform. These libraries allow templates to be generated via complementary base pairing that can subsequently be clustered and amplified. During library preparation, a nucleic acid sample, for example a genomic DNA, cDNA or RNA sample, is converted into a sequencing library, which can then be sequenced.
  • the first step in library preparation is random fragmentation of the DNA sample. Sample DNA is first fragmented and the fragments of a specific size (typically 200-500 bp, but can be larger) are ligated, sub-cloned or “inserted” in-between two oligo adaptors (adaptor sequences). The original sample DNA fragments are referred to as “inserts”.
  • the target polynucleotides may advantageously also be size-fractionated prior to modification with the adaptor sequences.
  • the templates to be generated typically include separate polynucleotide sequences, in particular a first polynucleotide sequence comprising a first portion and a second polynucleotide sequence comprising a second portion. Generating these templates from particular libraries may be performed according to methods known to persons of skill in the art. However, some example approaches of preparing libraries suitable for generation of such templates are described below.
  • the library may be prepared by ligating adaptor sequences to double-stranded polynucleotide sequences, each comprising a forward strand of the sequence and a reverse strand of the sequence, as described in more detail in e.g. WO 07/052006, which is incorporated herein by reference.
  • “tagmentation” can be used to attach the sample DNA to the adaptors, as described in more detail in e.g. WO 10/048605, US 2012/0301925, US 2013/0143774 and WO 2016/189331, each of which is incorporated herein by reference.
  • in tagmentation, double-stranded DNA is simultaneously fragmented and tagged with adaptor sequences and PCR primer binding sites.
  • the combined reaction eliminates the need for a separate mechanical shearing step during library preparation.
  • These procedures may be used, for example, for preparing templates including a first polynucleotide sequence comprising a first portion and a second polynucleotide sequence comprising a second portion, wherein the first portion is a forward strand of the template, and the second portion is a forward complement strand of the template - i.e. a copy of the forward strand (or alternatively, wherein the first portion is a reverse strand of the template, and the second portion is a reverse complement strand of the template).
  • library preparation may comprise ligating a first primer-binding sequence 301’ or an adaptor that comprises a primer-binding sequence 301’ (e.g. P5’, such as SEQ ID NO. 3) and a second terminal sequencing primer binding site 304 (e.g. SBS3’, for example, SEQ ID NO. 8) to a 3’-end of a forward strand of a sequence 101.
  • the library preparation may be arranged such that the second terminal sequencing primer binding site 304 is attached (e.g. directly attached) to the 3’-end of the forward strand of the sequence 101, and such that the first primer-binding sequence 301’ is attached (e.g. directly attached) to the 3’-end of the second terminal sequencing primer binding site 304.
  • the library preparation may further comprise ligating a complement or an adaptor comprising a complement of first terminal sequencing primer binding site 303’ (e.g. SBS12, such as SEQ ID NO. 9) (also referred to herein as a first terminal sequencing primer binding site complement 303’) and a complement of a second primer-binding sequence 302 (also referred to herein as a second primer-binding complement sequence 302) (e.g. P7, such as SEQ ID NO. 2) to a 5’-end of the forward strand of the sequence 101.
  • the library preparation may be arranged such that first terminal sequencing primer binding site complement 303’ is attached (e.g. directly attached) to the 5’-end of the forward strand of the sequence 101, and such that second primer-binding complement sequence 302 is attached (e.g. directly attached) to the 5’-end of first terminal sequencing primer binding site complement 303’.
  • one strand of a polynucleotide within a polynucleotide library may comprise, in a 5’ to 3’ direction, a second primer-binding complement sequence 302 or a first adaptor comprising a second primer-binding complement sequence 302 (e.g. P7), a first terminal sequencing primer binding site complement 303’ (e.g. SBS12), a forward strand of the sequence/insert 101, a second terminal sequencing primer binding site 304 or a second adaptor comprising a second terminal sequencing primer binding site 304 (e.g. SBS3’), and a first primer-binding sequence 301’ (e.g. P5’) (Figure 2, bottom strand).
  • the strand may further comprise one or more index sequences.
  • a first index sequence (e.g. i7) may be provided between the second primer-binding complement sequence 302 (e.g. P7) and the first terminal sequencing primer binding site complement 303’ (e.g. SBS12).
  • a second index complement sequence (e.g. i5’) may be provided between the second terminal sequencing primer binding site 304 (e.g. SBS3’) and the first primer-binding sequence 301’ (e.g. P5’).
  • one strand of a polynucleotide within a polynucleotide library may comprise, in a 5’ to 3’ direction, a second primer-binding complement sequence 302 (e.g. P7), a first index sequence (e.g. i7), a first terminal sequencing primer binding site complement 303’ (e.g. SBS12), a forward strand of the sequence 101, a second terminal sequencing primer binding site 304 (e.g. SBS3’), a second index complement sequence (e.g. i5’), and a first primer-binding sequence 301’ (e.g. P5’).
  • a typical polynucleotide is shown in Figure 3.
  • the library preparation may also comprise ligating a second primer-binding sequence 302’ or a third adaptor comprising a second primer-binding sequence 302’ (e.g. P7’) and a first terminal sequencing primer binding site 303 (e.g. SBS12’) to a 3’-end of a reverse strand of a sequence 102.
  • the library preparation may be arranged such that first terminal sequencing primer binding site 303 is attached (e.g. directly attached) to the 3’-end of the reverse strand of the sequence 102, and such that the second primer-binding sequence 302’ is attached (e.g. directly attached) to the 3’-end of first terminal sequencing primer binding site 303.
  • the library preparation may further comprise ligating a complement of a second terminal sequencing primer binding site 304’ or a fourth adaptor comprising a complement of a second terminal sequencing primer binding site 304’ (e.g. SBS3) (also referred to herein as a second terminal sequencing primer binding site complement 304’) and a complement of a first primer-binding sequence 301 (also referred to herein as a first primer-binding complement sequence 301) (e.g. P5) to a 5’-end of the reverse strand of the sequence 102.
  • the library preparation may be arranged such that the second terminal sequencing primer binding site complement 304’ is attached (e.g. directly attached) to the 5’-end of the reverse strand of the sequence 102, and such that the first primer-binding complement sequence 301 is attached (e.g. directly attached) to the 5’-end of the second terminal sequencing primer binding site complement 304’.
  • another strand of a polynucleotide within a polynucleotide library may comprise, in a 5’ to 3’ direction, a first primer-binding complement sequence 301 or a third adaptor comprising a first primer-binding complement sequence 301 (e.g. P5), a second terminal sequencing primer binding site complement 304’ (e.g. SBS3), a reverse strand of the sequence/insert 102, a first terminal sequencing primer binding site 303 or a fourth adaptor comprising a first terminal sequencing primer binding site 303 (e.g. SBS12’), and a second primer-binding sequence 302’ (e.g. P7’) (Figure 2, top strand).
  • the other strand may further comprise one or more index sequences.
  • a second index sequence (e.g. i5) may be provided between the first primer-binding complement sequence 301 (e.g. P5) and the second terminal sequencing primer binding site complement 304’ (e.g. SBS3).
  • a first index complement sequence (e.g. i7’) may be provided between the first terminal sequencing primer binding site 303 (e.g. SBS12’) and the second primer-binding sequence 302’ (e.g. P7’).
  • another strand of a polynucleotide within a polynucleotide library may comprise, in a 5’ to 3’ direction, a first primer-binding complement sequence 301 (e.g. P5), a second index sequence (e.g. i5), a second terminal sequencing primer binding site complement 304’ (e.g. SBS3), a reverse strand of the sequence 102, a first terminal sequencing primer binding site 303 (e.g. SBS12’), a first index complement sequence (e.g. i7’), and a second primer-binding sequence 302’ (e.g. P7’).
  • a typical polynucleotide is shown in Figure 3 (top strand).
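The two indexed strand layouts described above mirror each other, which can be verified symbolically. This is a sketch that works on component labels only, not on real adaptor sequences. The labels `insert_101` and `insert_102` and the helper names are hypothetical, and an ASCII apostrophe stands in for the prime that marks a complement.

```python
# Component order, 5'->3', as listed in the text for the two library strands.
BOTTOM_STRAND = ["P7", "i7", "SBS12", "insert_101", "SBS3'", "i5'", "P5'"]
TOP_STRAND = ["P5", "i5", "SBS3", "insert_102", "SBS12'", "i7'", "P7'"]


def toggle_prime(label):
    """Swap a component label with the label of its complement."""
    return label[:-1] if label.endswith("'") else label + "'"


def complement_layout(layout):
    """Layout of the complementary strand, read 5'->3'."""
    return [toggle_prime(label) for label in reversed(layout)]


def adaptors_only(layout):
    """Drop the insert so that only adaptor components are compared."""
    return [label for label in layout if not label.startswith("insert")]


print(complement_layout(BOTTOM_STRAND))
print(adaptors_only(complement_layout(BOTTOM_STRAND)) == adaptors_only(TOP_STRAND))
```

The adaptor components of the top strand are exactly those of the complement of the bottom strand, so the two strands can pair along their full adaptor regions.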
  • the term “genetically unrelated” refers to portions which are not related in the sense of being any two of the group consisting of: forward strands, reverse strands, forward complement strands, and reverse complement strands.
  • the “genetically unrelated” sequences could be different fragment sequences which are derived from the same source, but are different fragments from that source (e.g. from the same fragmented library preparation process). This includes sequences that can be overlapping in sequence (but not identical in sequence).
  • further processes may be used to generate templates including a first polynucleotide sequence comprising a first portion and a second polynucleotide sequence comprising a second portion, wherein the first portion and the second portion are genetically unrelated.
  • the library may be prepared using a loop fork method, which is described below. This procedure may be used, for example, for preparing templates including a first polynucleotide sequence comprising a first portion and a second polynucleotide sequence comprising a second portion, wherein the first portion is a forward strand of the template, and the second portion is a reverse complement strand of the template (or alternatively, wherein the first portion is a reverse strand of the template, and the second portion is a forward complement strand of the template).
  • Such libraries may also be referred to as self-tandem inserts.
  • a representative process for conducting a loop fork method is shown in Figure 15.
  • adaptors may be ligated to a first end of the sequence (e.g. using processes as described in more detail in e.g. WO 07/052006, or “tagmentation” methods as described above).
  • a second end of the sequence (different from the first end) may be ligated to a loop, which connects the forward strand of the sequence and the reverse strand of the sequence, thus generating a loop fork ligated polynucleotide sequence.
  • templates including a first polynucleotide sequence comprising a first portion and a second polynucleotide sequence comprising a second portion, wherein the first portion is a forward strand of the template, and the second portion is a reverse complement strand of the template (or alternatively, wherein the first portion is a reverse strand of the template, and the second portion is a forward complement strand of the template).
  • the P5’ and P7’ primer-binding sequences are complementary to short primer sequences (or lawn primers) present on the surface of a flow cell. Binding of P5’ and P7’ to their complements (P5 and P7) on - for example - the surface of the flow cell, permits nucleic acid amplification. As used herein, ’ denotes the complementary strand.
  • the primer-binding sequences in the adaptor that permit hybridisation to amplification primers will typically be around 20-40 nucleotides in length, although the invention is not limited to sequences of this length.
  • the precise identity of the amplification primers (e.g. lawn primers), and hence the cognate sequences in the adaptors, are generally not material to the invention, as long as the primer-binding sequences are able to interact with the amplification primers in order to direct PCR amplification.
  • the sequence of the amplification primers may be specific for a particular target nucleic acid that it is desired to amplify, but in other embodiments these sequences may be "universal" primer sequences which enable amplification of any target nucleic acid of known or unknown sequence which has been modified to enable amplification with the universal primers.
  • the criteria for design of PCR primers are generally well known to those of ordinary skill in the art.
  • the index sequences (also known as a barcode or tag sequence) are unique short DNA (or RNA) sequences that are added to each DNA (or RNA) fragment during library preparation. The unique sequences allow many libraries to be pooled together and sequenced simultaneously.
  • Sequencing reads from pooled libraries are identified and sorted computationally, based on their barcodes, before final data analysis.
  • Library multiplexing is also a useful technique when working with small genomes or targeting genomic regions of interest. Multiplexing with barcodes can exponentially increase the number of samples analysed in a single run, without drastically increasing run cost or run time. Examples of tag sequences are found in WO05/068656, whose contents are incorporated herein by reference in their entirety.
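The computational sorting of pooled reads by barcode described above can be sketched as follows. This is a minimal illustration only: the function name `demultiplex`, the 6-base barcodes, the sample names and the reads are all hypothetical, and exact barcode matching is assumed (no tolerance for sequencing errors).

```python
from collections import defaultdict

def demultiplex(reads, barcode_to_sample):
    """Group (barcode, sequence) reads by sample.

    Reads whose barcode is not in the lookup table are collected
    under the key "undetermined".
    """
    bins = defaultdict(list)
    for barcode, sequence in reads:
        sample = barcode_to_sample.get(barcode, "undetermined")
        bins[sample].append(sequence)
    return dict(bins)

# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ATCACG": "sample_A", "CGATGT": "sample_B"}
reads = [
    ("ATCACG", "GGTTAACC"),
    ("CGATGT", "TTGGCCAA"),
    ("ATCACG", "ACGTACGT"),
    ("NNNNNN", "CCCC"),
]
demultiplexed = demultiplex(reads, barcodes)
```

Each sample's reads can then be passed separately to downstream analysis.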
  • the invention is not limited by the number of reads per cluster, for example two reads per cluster: three or more reads per cluster are obtainable simply by dehybridising a first extended sequencing primer, and rehybridising a second primer before or after a cluster repopulation/strand resynthesis step.
  • Single or dual indexing may also be used. With single indexing, up to 48 unique 6-base indexes can be used to generate up to 48 uniquely tagged libraries. With dual indexing, up to 24 unique 8-base Index 1 sequences and up to 16 unique 8-base Index 2 sequences can be used in combination to generate up to 384 uniquely tagged libraries. Pairs of indexes can also be used such that every i5 index and every i7 index are used only one time. With these unique dual indexes, it is possible to identify and filter index-hopped reads, providing even higher confidence in multiplexed samples.
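The arithmetic behind the dual indexing figures above (24 × 16 = 384), and the way unique dual indexes allow index-hopped reads to be flagged, can be sketched as below. The index names are placeholders rather than real index sequences, and the helper names are hypothetical.

```python
from itertools import product

def combinatorial_dual_indexes(index1_set, index2_set):
    """Every Index 1 is combined with every Index 2."""
    return list(product(index1_set, index2_set))

def unique_dual_indexes(index1_set, index2_set):
    """Each Index 1 and each Index 2 is used in exactly one pair."""
    return list(zip(index1_set, index2_set))

def is_index_hopped(index1, index2, expected_pairs):
    """A read is flagged when its index pair was never assigned to a library."""
    return (index1, index2) not in expected_pairs

# Placeholder names standing in for 8-base index sequences.
index1_set = [f"index1_{n:02d}" for n in range(1, 25)]  # 24 Index 1 sequences
index2_set = [f"index2_{n:02d}" for n in range(1, 17)]  # 16 Index 2 sequences

all_pairs = combinatorial_dual_indexes(index1_set, index2_set)
udi_pairs = set(unique_dual_indexes(index1_set, index2_set))

n_combinatorial = len(all_pairs)  # 24 * 16
hopped = is_index_hopped("index1_01", "index2_02", udi_pairs)
not_hopped = is_index_hopped("index1_01", "index2_01", udi_pairs)
```

With combinatorial pairing every index combination is a valid library, so a hopped read cannot be told apart from a genuine one; with unique pairs, a mismatched combination is never expected and can be filtered.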
  • the sequencing primer binding sites are sequencing and/or index primer binding sites and indicate the starting point of the sequencing read.
  • sequencing primer binding site is only required to be a binding site for a sequencing primer.
  • a sequencing primer anneals (i.e. hybridises) to at least a portion of the sequencing primer binding site on the template strand.
  • the polymerase enzyme binds to this site and incorporates complementary nucleotides base by base into the growing opposite strand.
  • a double-stranded nucleic acid will typically be formed from two complementary polynucleotide strands comprised of deoxyribonucleotides or ribonucleotides joined by phosphodiester bonds, but may additionally include one or more ribonucleotides and/or non-nucleotide chemical moieties and/or non-naturally occurring nucleotides and/or non-naturally occurring backbone linkages.
  • the double-stranded nucleic acid may include non-nucleotide chemical moieties, e.g. linkers or spacers, at the 5' end of one or both strands.
  • the double-stranded nucleic acid may include methylated nucleotides, uracil bases, phosphorothioate groups, peptide conjugates etc.
  • Such non-DNA or non-natural modifications may be included in order to confer some desirable property to the nucleic acid, for example to enable covalent, non-covalent or metal-coordination attachment to a solid support, or to act as spacers to position the site of cleavage an optimal distance from the solid support.
  • a single stranded nucleic acid consists of one such polynucleotide strand.
  • where a polynucleotide strand is only partially hybridised to a complementary strand (for example, a long polynucleotide strand hybridised to a short nucleotide primer), it may still be referred to herein as a single stranded nucleic acid.
  • a sequence comprising at least a primer-binding sequence (a primer-binding sequence and a sequencing primer binding site, or in another aspect, a combination of a primer-binding sequence, an index sequence and a sequencing primer binding site) may be referred to herein as an adaptor sequence, and an insert is flanked by a 5’ adaptor sequence and a 3’ adaptor sequence.
  • the 5’ adaptor sequence is the adaptor sequence that is attached to the 5’ end of the insert.
  • the 3’ adaptor sequence is the adaptor sequence that is attached to the 3’ end of the insert.
  • the primer-binding sequence may also comprise a sequencing primer for the index read.
  • one advantage of one embodiment of the present invention is that a separate sequencing primer for the index read is not needed, reducing the number of reagents/primers needed for a sequencing read.
  • an “adaptor” refers to a sequence that comprises a short sequence-specific oligonucleotide that is ligated to the 5' and 3' ends of each DNA (or RNA) fragment in a sequencing library as part of library preparation.
  • the adaptor sequence may further comprise non-peptide linkers.
  • each adaptor sequence comprises at least one cleavable site or a complement of a (or the same) cleavable site.
  • the 5’ adaptor sequence comprises a cleavable site and the 3’ adaptor sequence comprises a complement of that cleavable site. This is shown, for example, in Figure 18.
  • by “complement of a cleavable site” is meant a sequence that is the complement of, or substantially complementary to, a site (e.g. a sequence) that is cleavable. Unlike the cleavable site, the complement of the cleavable site may not be cleavable. It will be recognised that the adaptor sequences comprise complements of cleavage sites so that, when the library strands are used as a template during amplification, cleavable sites are translated into the copied strand.
  • each adaptor sequence comprises at least two cleavable sites or complements of each cleavable site.
  • the 5’ adaptor sequences comprise two cleavable sites and the 3’ adaptor sequences comprise two complements of those (same) cleavable sites. This is shown, for example, in Figure 18.
  • the at least one cleavable site or complement of the cleavable site may be 5’ of, 3’ of, or within one of the sequences selected from the primer-binding site, the index sequence or the sequencing primer binding site as shown in Figure 18.
  • the at least one cleavable site is 3’ or 5’ of the sequencing primer binding site. That is, the at least one cleavable site is between the insert and the sequencing primer binding site.
  • the 5’ adaptor sequence comprises at least one cleavable site, where the at least one cleavable site is 3’ of the sequencing primer binding site. That is, the at least one cleavable site is between the insert and the sequencing primer binding site.
  • the 3’ adaptor sequence comprises at least one complement of a cleavable site, where the complement of the cleavable site is 5’ of the (complements of) the sequencing primer binding sites. That is, the complement of the cleavable site is between the insert and the sequencing primer binding site.
  • the first and second sites may be between 10 and 40 bases apart, and again, may be 5’ of, 3’ of, or within one of the sequences selected from the primer-binding site, the index sequence or the sequencing primer binding site.
  • the 5’ adaptor comprises a first and second cleavable site, where the first site is 3’ of the sequencing primer binding site and the second site is 5’ of the sequencing primer binding site. That is, the first and second cleavable sites are either side of the sequencing primer binding site.
  • the 5’ adaptor comprises a first and second cleavable site, where the first site is 3’ of the sequencing primer binding site and the second site is 5’ of the index sequence. That is, the cleavable sites are either side of the sequence for the sequencing primer and the index sequence.
  • the 3’ adaptor comprises a first and second cleavable site, where the first site is 5’ of the sequencing primer binding site and the second site is 3’ of the sequencing primer binding site. That is, the first and second cleavable sites are either side of the sequencing primer binding site.
  • the 3’ adaptor comprises a first and second cleavable site, where the first site is 5’ of the sequencing primer binding site and the second site is 3’ of the index sequence. That is, the cleavable sites are either side of the sequence for the sequencing primer and the index sequence.
  • the cleavable site is a restriction site. This allows the cleavage site to be nicked and allows sequencing to occur starting from the nick location (e.g. in conjunction with a polymerase as described below).
  • by “restriction site” is meant a sequence of nucleotides recognised by an endonuclease, such as a single-stranded endonuclease.
  • a restriction site may also be referred to as a “recognition site” or “recognition sequence”, and such terms may be used interchangeably.
  • the endonuclease is a single strand restriction endonuclease, a nicking endonuclease or nicking enzyme or nickase (again, such terms may be used interchangeably).
  • by a “nicking endonuclease”, “nicking enzyme” or “nickase” is meant an enzyme that can hydrolyze only one strand of the double-stranded polynucleotide (duplex), to produce DNA molecules that are “nicked”, rather than fully cleaved on both strands.
  • examples of nicking enzymes include, but are not limited to, Nb.BbvCI, Nb.BsmI, Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, BssSI, Nb.Bpu10I and Nt.CviPII.
  • These nickases can be used either alone or in various combinations.
  • Other suitable nicking endonucleases are available from commercial sources, including New England Biolabs and Fisher Scientific.
  • restriction sites vary depending on the nickase used, and are well known in the art.
  • the restriction site is selected from the following:
  • the nickase is Nb.BssSI
  • the restriction site is CACGAG, wherein Nb.BssSI catalyzes a single strand break within the recognition sequence.
  • the nickase is Nt.BspQI
  • the restriction site is GCTCTTC(1/-7)
  • Nt.BspQI catalyzes a single strand break one base beyond the 3’ side of the restriction site.
  • the nickase is Nt.CviPII and the restriction site is (0/-1)CCD, wherein Nt.CviPII catalyzes a single strand break at the 5’ side of the restriction site.
  • the nickase is Nb.BsrDI and the restriction site is GCAATG, wherein Nb.BsrDI catalyzes a single strand break within the restriction site.
  • the nickase is Nt.AlwI and the restriction site is GGATC(4/-5), wherein Nt.AlwI catalyzes a single strand break four bases beyond the 3’ side of the restriction site.
  • the nickase is Nb.BbvCI and the restriction site is CCTCAGC, wherein Nb.BbvCI catalyzes a single strand break within the restriction site.
  • the nickase is Nb.BsmI and the restriction site is GAATGC, wherein Nb.BsmI catalyzes a single strand break within the restriction site.
  • the nickase is Nt.BsmAI and the restriction site is GTCTC(1/-5), wherein Nt.BsmAI catalyzes a single strand break one base beyond the 3’ side of the restriction site.
  • the nickase is Nb.Bpu10I and the restriction site is CCTNAGC, wherein Nb.Bpu10I catalyzes a single strand break within the restriction site.
  • the restriction site is described in the following format (x/-y)
  • x is the number of nucleotides beyond (i.e. 3’ of) the 3’ end of the restriction site where cleavage occurs
  • y is the number of nucleotides in the restriction site.
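As an illustration of the (x/-y) format, the sketch below parses entries written as a site followed by (x/-y) and reports where the nick would fall on a strand containing that site. It assumes, as stated above, that x counts nucleotides 3’ of the 3’ end of the restriction site; it only searches the strand as given (not its complement), does not handle degenerate bases or entries where the parenthesis precedes the site, and all function names and test strands are hypothetical.

```python
import re

# Entries copied from the list above that use the "site(x/-y)" layout.
NICKASE_SITES = {
    "Nt.BspQI": "GCTCTTC(1/-7)",
    "Nt.AlwI": "GGATC(4/-5)",
    "Nt.BsmAI": "GTCTC(1/-5)",
}

def parse_site(notation):
    """Split 'GCTCTTC(1/-7)' into ('GCTCTTC', 1, 7)."""
    match = re.fullmatch(r"([ACGT]+)\((\d+)/-(\d+)\)", notation)
    if match is None:
        raise ValueError(f"unsupported notation: {notation}")
    return match.group(1), int(match.group(2)), int(match.group(3))

def nick_positions(strand, notation):
    """Return 0-based indexes i such that the nick lies between
    strand[i - 1] and strand[i], for every occurrence of the site."""
    site, x, _ = parse_site(notation)
    positions = []
    start = strand.find(site)
    while start != -1:
        positions.append(start + len(site) + x)
        start = strand.find(site, start + 1)
    return positions

# Hypothetical strands written 5' to 3'.
bspqi_nicks = nick_positions("AAGCTCTTCAGGT", NICKASE_SITES["Nt.BspQI"])
alwi_nicks = nick_positions("TTGGATCAAAACC", NICKASE_SITES["Nt.AlwI"])
```

For the first strand the site occupies indexes 2 to 8, so a nick one base beyond its 3’ side falls between indexes 9 and 10.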
  • the endonuclease is a Cas9 nickase.
  • examples of a Cas9 nickase include Cas9 D10A and Cas9 H840A.
  • the Cas9 protein may comprise the D10A or H840A amino acid substitutions. These nickases cleave only the DNA strand that is complementary to and recognized by a gRNA.
  • the restriction site may be or may comprise a PAM (protospacer adjacent motif) sequence.
  • PAM sequences include NGG, NGAG, NGCG, NGN, NG, GAA, GAT, NNG, NGN, NRN, YG, NNGRRT, NNNRRT, NNAGAA, NNNNGATT and NNNNCRAA and complements thereof.
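The PAM sequences listed above use degenerate base codes. A minimal sketch of locating such a motif in a sequence is given below, assuming the standard IUPAC meanings N = any base, R = A or G, and Y = C or T. The function names and test sequences are hypothetical, and only the strand as given is searched (complements are not considered).

```python
import re

# Standard IUPAC codes needed for the PAM sequences listed above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

def pam_to_pattern(pam):
    """Translate a degenerate PAM such as 'NGG' into a regular expression."""
    return "".join(IUPAC[base] for base in pam)

def find_pam_sites(sequence, pam):
    """Return 0-based start positions of every (possibly overlapping) PAM match."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    lookahead = re.compile(f"(?={pam_to_pattern(pam)})")
    return [m.start() for m in lookahead.finditer(sequence)]

# Hypothetical sequences written 5' to 3'.
ngg_sites = find_pam_sites("ACGGTGGA", "NGG")
nngrrt_sites = find_pam_sites("AAGAGT", "NNGRRT")
```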
  • the Cas9 protein may alternatively or additionally comprise the N863A or N854A amino acid substitutions.
  • the Cas9 protein has been modified to improve activity.
  • the Cas9 protein may additionally comprise a D1135E substitution.
  • the Cas9 protein may also be the VQR variant
  • the library is subjected to denaturing conditions to provide single stranded nucleic acids. Suitable denaturing conditions will be apparent to the skilled reader with reference to standard molecular biology protocols (Sambrook et al., 2001, Molecular Cloning, A Laboratory Manual, 4th Ed, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY; Current Protocols, eds Ausubel et al). In one embodiment, chemical denaturation may be used.
  • a single-stranded library may be contacted in free solution onto a solid support comprising surface capture moieties (for example P5 and P7 lawn primers).
  • a solid support 200 such as a flowcell.
  • seeding and clustering can be conducted off-flowcell using other types of solid support.
  • the solid support 200 may comprise a substrate 204.
  • An example of a solid support is shown in Figure 6.
  • the substrate 204 comprises at least one well 203 (e.g. a nanowell), and typically comprises a plurality of wells 203 (e.g. a plurality of nanowells).
  • the solid support comprises a plurality of first immobilised primers and a plurality of second immobilised primers.
  • each well 203 may comprise a plurality of first immobilised primers 201.
  • each well 203 may comprise a plurality of second immobilised primers 202.
  • each well 203 may comprise a plurality of first immobilised primers 201 and a plurality of second immobilised primers 202.
  • the first immobilised primer 201 may be attached via a 5’-end of its polynucleotide chain to the solid support 200.
  • the extension may be in a direction away from the solid support 200.
  • the second immobilised primer 202 may be attached via a 5’-end of its polynucleotide chain to the solid support 200.
  • the extension may be in a direction away from the solid support 200.
  • the first immobilised primer 201 may be different to the second immobilised primer 202 and/or a complement of the second immobilised primer 202.
  • the second immobilised primer 202 may be different to the first immobilised primer 201 and/or a complement of the first immobilised primer 201.
  • the (or each of the) first immobilised primer(s) 201 may comprise a sequence as defined in SEQ ID NO. 1 or 5, or a variant or fragment thereof.
  • the (or each of the) second immobilised primer(s) 202 may comprise a sequence as defined in SEQ ID NO. 2, or a variant or fragment thereof. Whilst first immobilised primer(s) 201 are shown here to correspond to P5 and second immobilised primer(s) 202 are shown here to correspond to P7, the definitions of these may be swapped - in other words, first immobilised primer(s) 201 may correspond instead to P7, and second immobilised primer(s) 202 may correspond to P5.
  • the immobilised primers (that is, the first or the second, or both the first and the second immobilised primers) are configured to be cleavable under cleavage conditions.
  • the immobilised primers may be configured to be cleavable by a thermal trigger (e.g. by heating), a light trigger (e.g. by exposure to ultraviolet light), and/or a chemical/biochemical trigger (e.g. an enzyme, periodate).
  • the location at which the immobilised primers are configured to be cleavable under cleavage conditions may also be referred to as a cleavage site. Accordingly, in one embodiment, the immobilised primers comprise at least one cleavable site. The cleavage site may be at the 3’ end of, or within, the sequence of the immobilised primer.
  • the cleavage site comprises a cleavable covalent bond.
  • a “cleavable covalent bond” refers to a covalent bond that can be cleaved, for example, under the application of heat, light or other (bio)chemical methods (e.g. by exposure to a degradation agent, such as an enzyme or a catalyst), while a “non-cleavable covalent bond” is stable to degradation under such conditions.
  • examples of cleavable covalent bonds include thermally or photolytically cleavable cycloadducts (e.g. furan-maleimide cycloadducts), alkenylene linkages, esters, amides, acetals, hemiaminal ethers, aminals, imines, hydrazones, 1,2-diol linkages (e.g. glycols cleavable by periodates), polysulfide linkages (e.g. disulfide linkages), boron-based linkages (e.g. boronic and borinic acids/esters), silicon-based linkages (e.g. silyl ether, siloxane), and phosphorus-based linkages (e.g. phosphite, phosphate).
  • the immobilised primers may be configured to be cleavable by a glycosylase.
  • the cleavage conditions involve exposure to a glycosylase.
  • the immobilised primers may be configured to be cleavable by a glycosylase that recognises any nitrogenous base (e.g. purine or pyrimidine) which is not selected from guanine (G), cytosine (C), adenine (A) and thymine (T) when the first immobilised primer is a DNA sequence; or the immobilised primers may be configured to be cleavable by a glycosylase that recognises any nitrogenous base (e.g.
  • the glycosylase may recognise an unnatural nucleobase (i.e. one which is not usually present in a typical DNA sequence or an RNA sequence).
  • examples of unnatural nucleobases may include oxoguanine (e.g. 8-oxoguanine), hypoxanthine, xanthine, methylguanines (e.g. O6-methylguanine, N7-methylguanine), methyladenines (e.g. 3-methyladenine, N6-methyladenine), modified cytosines including methylcytosines (e.g. 5-methylcytosine, 5-hydroxymethylcytosine, 5-formylcytosine, 5-carboxylcytosine), dihydrouracil, and uracil (if the first immobilised primer is a DNA sequence).
  • the immobilised primers may be configured to be cleavable by a uracil glycosylase (when the immobilised primer is a DNA sequence) or an oxoguanine glycosylase (e.g. 8-oxoguanine glycosylase); and in a further embodiment, an oxoguanine glycosylase (e.g. 8-oxoguanine glycosylase).
  • each immobilised primer that is cleavable may comprise a nucleobase which is not selected from guanine, cytosine, adenine or thymine when the immobilised primer is a DNA sequence, or wherein each immobilised primer that is cleavable may comprise a nucleobase which is not selected from guanine, cytosine, adenine or uracil when the immobilised primer is an RNA sequence.
  • each immobilised primer that is cleavable may comprise an unnatural nucleobase (i.e. one which is not usually present in a typical DNA sequence or an RNA sequence).
  • examples of unnatural nucleobases may include oxoguanine (e.g. 8-oxoguanine), hypoxanthine, xanthine, methylguanines (e.g. O6-methylguanine, N7-methylguanine), methyladenines (e.g. 3-methyladenine, N6-methyladenine), modified cytosines including methylcytosines (e.g. 5-methylcytosine, 5-hydroxymethylcytosine, 5-formylcytosine, 5-carboxylcytosine), dihydrouracil, and uracil (if the first immobilised primer is a DNA sequence).
  • each immobilised primer that is cleavable may comprise oxoguanine (e.g. 8-oxoguanine) or uracil when the immobilised primer is a DNA sequence, or each immobilised primer that is cleavable may comprise oxoguanine (e.g. 8-oxoguanine) when the immobilised primer is an RNA sequence; and in an even further embodiment, each immobilised primer that is cleavable may comprise oxoguanine (e.g. 8-oxoguanine) when the immobilised primer is a DNA sequence.
  • the cleavable site is a restriction site. A restriction site is already defined above.
  • the solid support may be contacted with the template to be amplified under conditions which permit hybridisation (or annealing - such terms may be used interchangeably) between the template and the immobilised primers.
  • the template is usually added in free solution under suitable hybridisation conditions, which will be apparent to the skilled reader.
  • hybridisation conditions are, for example, 5xSSC at 40°C.
  • other temperatures may be used during hybridisation, for example about 50°C to about 75°C, about 55°C to about 70°C, or about 60°C to about 65°C. Solid-phase amplification can then proceed.
  • the first step of the amplification is a primer extension step in which nucleotides are added to the 3' end of the immobilised primer using the template to produce a fully extended complementary strand.
  • the template is then typically washed off the solid support.
  • the complementary strand will include at its 3' end a primer-binding sequence (i.e. either P5’ or P7’) which is capable of bridging to the second primer molecule immobilised on the solid support and binding.
  • Further rounds of amplification lead to the formation of clusters or colonies of template molecules bound to the solid support. This is called clustering.
  • amplification may be isothermal amplification using a strand displacement polymerase; or may be exclusion amplification as described in WO 2013/188582. Further information on amplification can be found in WO 02/06456 and WO 07/107710, the contents of which are incorporated herein in their entirety by reference.
  • a cluster of template molecules is formed, comprising copies of a template strand and copies of the complement of the template strand.
  • each first polynucleotide sequence may be attached (via the 5’-end of the first polynucleotide sequence) to a first immobilised primer, and wherein each second polynucleotide sequence is attached (via the 5’-end of the second polynucleotide sequence) to a second immobilised primer.
  • Each first polynucleotide sequence may comprise a second adaptor sequence, wherein the second adaptor sequence comprises a portion, which is substantially complementary to the second immobilised primer (or is substantially complementary to the second immobilised primer).
  • the second adaptor sequence may be at a 3’-end of the first polynucleotide sequence.
  • Each second polynucleotide sequence may comprise a first adaptor sequence, wherein the first adaptor sequence comprises a portion, which is substantially complementary to the first immobilised primer (or is substantially complementary to the first immobilised primer).
  • the first adaptor sequence may be at a 3’-end of the second polynucleotide sequence.
  • a solution comprising a polynucleotide library prepared by ligating adaptor sequences to double-stranded polynucleotide sequences as described above may be flowed across a flowcell.
  • a particular polynucleotide strand from the polynucleotide library to be sequenced comprising, in a 5’ to 3’ direction, a second primer-binding complement sequence 302 (e.g. P7), a first terminal sequencing primer binding site complement 303’ (e.g. SBS12), a forward strand of the sequence 101, a second terminal sequencing primer binding site 304 (e.g. SBS3’) and a first primer-binding sequence 301’ (e.g. P5’), may anneal (via the first primer-binding sequence 301’) to the first immobilised primer 201 (e.g. P5 lawn primer) located within a particular well 203 (Figure 4A).
  • the polynucleotide library may comprise other polynucleotide strands with different forward strands of the sequence 101.
  • Such other polynucleotide strands may anneal to corresponding first immobilised primers 201 (e.g. P5 lawn primers) in different wells 203, thus enabling parallel processing of the various different strands within the polynucleotide library.
  • a new polynucleotide strand may then be synthesised, extending from the first immobilised primer 201 (e.g. P5 lawn primer) in a direction away from the substrate 204.
  • this generates a template strand comprising, in a 5’ to 3’ direction, the first immobilised primer 201 (e.g. P5 lawn primer) which is attached to the solid support 200, a second terminal sequencing primer binding site complement 304’ (e.g. SBS3), a forward strand of the template 101’ (which represents a type of “first portion”), a first terminal sequencing primer binding site 303 (which represents a type of “first sequencing primer binding site”) (e.g. SBS12’), and a second primer-binding sequence 302’ (e.g. P7’) (Figure 4B).
  • Such a process may utilise an appropriate polymerase, such as a DNA or RNA polymerase.
  • if the polynucleotides in the library comprise index sequences, then corresponding index sequences are also produced in the template.
  • where the adaptors comprise at least one, or at least two, cleavage sites (or complements of cleavage sites) as described above, these will be incorporated into the newly synthesised immobilised strands.
  • where the adaptors comprise complements of cleavage sites, for example complements of the above restriction sites, these will therefore be translated into cleavage sites (e.g. restriction sites) in the newly synthesised immobilised strands.
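The translation of a complement of a cleavage site into a cleavage site during strand synthesis can be illustrated with a short sketch. It uses the GCTCTTC restriction site given above for Nt.BspQI; the flanking bases and variable names are hypothetical, and strand synthesis is modelled simply as taking the reverse complement of the template.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(sequence):
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]

restriction_site = "GCTCTTC"                          # site given above for Nt.BspQI
site_complement = reverse_complement(restriction_site)

# Hypothetical library strand carrying only the complement of the site.
library_strand = "TTAA" + site_complement + "CCGG"

# Copying the library strand yields a strand that carries the site itself.
copied_strand = reverse_complement(library_strand)

site_in_library = restriction_site in library_strand
site_in_copy = restriction_site in copied_strand
```

The library strand itself does not contain the recognition sequence, whereas the strand copied from it does.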
  • the polynucleotide strand from the polynucleotide library may then be dehybridised and washed away, leaving a template strand attached to the first immobilised primer 201 (e.g. P5 lawn primer) (Figure 4C).
  • the second primer-binding sequence 302’ (e.g. P7’) on the template strand may then anneal to a second immobilised primer 202 (e.g. P7 lawn primer) located within the well 203. This forms a “bridge” or “sequence bridge” (Figure 4D).
  • a new polynucleotide strand may then be synthesised by bridge amplification, extending from the second immobilised primer 202 (e.g. P7 lawn primer) (initially) in a direction away from the substrate 204.
  • this generates a template strand comprising, in a 5’ to 3’ direction, the second immobilised primer 202 (e.g. P7 lawn primer), a first terminal sequencing primer binding site complement 303’ (e.g. SBS12), a forward complement strand of the template 101 (which represents a type of “second portion”), a second terminal sequencing primer binding site 304 (which represents a type of “second sequencing primer binding site”) (e.g. SBS3’), and a first primer-binding sequence 301’ (e.g. P5’).
  • Such a process may utilise a suitable polymerase, such as a DNA or RNA polymerase.
  • the strand attached to the second immobilised primer 202 may then be dehybridised from the strand attached to the first immobilised primer 201 (e.g. P5 lawn primer) (Figure 4F).
  • a subsequent bridge amplification cycle can then lead to amplification of the strand attached to the first immobilised primer 201 (e.g. P5 lawn primer) and the strand attached to the second immobilised primer 202 (e.g. P7 lawn primer).
  • the second primer-binding sequence 302’ (e.g. P7’)
  • the first primer-binding sequence 301’ (e.g. P5’)
  • the second immobilised primer 202 (e.g. P7 lawn primer)
  • Completion of bridge amplification and dehybridisation may then provide an amplified (duoclonal) cluster, thus providing a plurality of first polynucleotide sequences comprising the forward strand of the template 101’ (i.e. “first portions”), and a plurality of second polynucleotide sequences comprising the forward complement strand of the template 101 (i.e. “second portions”) (Figure 4H).
  • the “first portion” corresponds with the forward strand of the template 101’
  • the “second portion” corresponds with the forward complement strand of the template 101.
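The segment order of the two strand types produced during bridge amplification can be traced with a small sketch that treats each strand as a list of named segments written 5’ to 3’. Copying a strand reverses the segment order and swaps each segment for its complement. The labels follow the examples given above (P5, P7, SBS3, SBS12, with 101 standing for the sequence between the adaptors); an ASCII apostrophe is used for the prime symbol, and the function names are hypothetical.

```python
def complement_label(label):
    """Toggle the prime suffix: "P5'" <-> "P5"."""
    return label[:-1] if label.endswith("'") else label + "'"

def copy_strand(segments):
    """Return the 5'->3' segment order of the strand synthesised
    using `segments` (also written 5'->3') as the template."""
    return [complement_label(label) for label in reversed(segments)]

# Library strand as described above, 5' to 3'.
library_strand = ["P7", "SBS12", "101", "SBS3'", "P5'"]

# Strand extended from the first immobilised primer (P5 lawn primer).
first_strand = copy_strand(library_strand)

# Strand extended from the second immobilised primer (P7 lawn primer).
second_strand = copy_strand(first_strand)
```

Under this model the strand grown from the P7 lawn primer has the same segment order as the original library strand, which is why the cluster ends up containing both the template strand and its complement.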
  • other set-ups may be obtained by changing the library used. For example, by using a loop fork method to prepare a library, a portion at or close to the loop (or the loop complement) may be cleaved (e.g. by nicking).
  • the loop may comprise a cleavage site (e.g. a restriction recognition site, a cleavable linker, a modified nucleotide, or the like).
  • first portions and second portions may be prepared for templates including a first polynucleotide sequence comprising a first portion and a second polynucleotide sequence comprising a second portion, and as such the forward strand of the template 101’ and the forward complement strand of the template 101 may be substituted as appropriate.
  • one strand of a concatenated polynucleotide within a polynucleotide library may comprise, in a 5’ to 3’ direction, a second primer-binding complement sequence 302 (e.g. P7), a first terminal sequencing primer binding site complement 303’ (e.g. B15-ME; or if ME is not present, then B15), a first insert sequence 401, a hybridisation complement sequence 403 (e.g. ME’-HYB2-ME; or if ME’ and ME are not present, then HYB2), a second insert sequence 402, a second terminal sequencing primer binding site 304 (e.g. ME’-A14’; or if ME’ is not present, then A14’), and a first primer-binding sequence 301’ (e.g. P5’) (Figures 16 and 17 - bottom strand).
  • the strand may further comprise one or more index sequences.
  • a first index sequence (e.g. i7) may be provided between the second primer-binding complement sequence 302 (e.g. P7) and the first terminal sequencing primer binding site complement 303’ (e.g. B15-ME; or if ME is not present, then B15).
  • a second index complement sequence (e.g. i5’) may be provided between the second terminal sequencing primer binding site 304 (e.g. ME’-A14’) and the first primer-binding sequence 301’ (e.g. P5’).
  • one strand of a polynucleotide within a polynucleotide library may comprise, in a 5’ to 3’ direction, a second primer-binding complement sequence 302 (e.g. P7), a first index sequence (e.g. i7), a first terminal sequencing primer binding site complement 303’ (e.g. B15-ME; or if ME is not present, then B15), a first insert sequence 401, a hybridisation complement sequence 403 (e.g. ME’-HYB2-ME; or if ME’ and ME are not present, then HYB2), a second insert sequence 402, a second terminal sequencing primer binding site 304 (e.g. ME’-A14’; or if ME’ is not present, then A14’), a second index complement sequence (e.g. i5’), and a first primer-binding sequence 301’ (e.g. P5’).
  • Another strand of a concatenated polynucleotide within a polynucleotide library may comprise, in a 5’ to 3’ direction, a first primer-binding complement sequence 301 (e.g. P5), a second terminal sequencing primer binding site complement 304’ (e.g. A14-ME; or if ME is not present, then A14), a second insert complement sequence 402’, a hybridisation sequence 403’ (e.g. ME’-HYB2’-ME; or if ME’ and ME are not present, then HYB2’), a first insert complement sequence 401’, a first terminal sequencing primer binding site 303 (e.g. ME’-B15’; or if ME’ is not present, then B15’), and a second primer-binding sequence 302’ (e.g. P7’) (Figures 16 and 17 - top strand).
  • the other strand may further comprise one or more index sequences.
  • a second index sequence (e.g. i5) may be provided between the first primer-binding complement sequence 301 (e.g. P5) and the second terminal sequencing primer binding site complement 304’ (e.g. A14-ME; or if ME is not present, then A14).
  • a first index complement sequence (e.g. i7’) may be provided between the first terminal sequencing primer binding site 303 (e.g. ME’-B15’; or if ME’ is not present, then B15’) and the second primer-binding sequence 302’ (e.g. P7’).
  • another strand of a polynucleotide within a polynucleotide library may comprise, in a 5’ to 3’ direction, a first primer-binding complement sequence 301 (e.g. P5), a second index sequence (e.g. i5), a second terminal sequencing primer binding site complement 304’ (e.g. A14-ME; or if ME is not present, then A14), a second insert complement sequence 402’, a hybridisation sequence 403’ (e.g. ME’-HYB2’-ME; or if ME’ and ME are not present, then HYB2’), a first insert complement sequence 401’, a first terminal sequencing primer binding site 303 (e.g.
  • the first insert sequence 401 and the second insert sequence 402 may comprise different types of library sequences.
  • the first insert sequence 401 may be different to the second insert sequence 402 (e.g. genetically unrelated, and/or obtained from different sources), for example where the library is prepared using PCR stitching.
  • the first insert sequence 401 may comprise a forward strand of the sequence 101
  • the second insert sequence may comprise a reverse complement strand of the sequence 102’ (or the first insert sequence 401 may comprise a reverse strand of the sequence 102, and the second insert sequence 402 may comprise a forward complement strand of the sequence 101’), for example where the library is prepared using a tandem insert method.
  • the first insert sequence 401 may comprise a forward strand of the sequence 101
  • the second insert sequence 402 may comprise a reverse strand of the sequence 102 (or the first insert sequence 401 may comprise a forward complement strand of the sequence 101’, and the second insert sequence 402 may comprise a reverse complement strand of the sequence 102’), for example where the library is prepared using a loop fork method.
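  • The 5’ to 3’ element order of the two library strands described above can be checked with a short sketch. It assumes that the complementary strand is obtained by reversing the element order and swapping each element for its primed (complement) counterpart; the element labels (`insert401`, `insert402`) and function names are illustrative only, and a plain ASCII apostrophe stands in for the prime symbol.

```python
def toggle_prime(name: str) -> str:
    """Turn X into X' and X' into X (the prime marks a complement)."""
    return name[:-1] if name.endswith("'") else name + "'"


def complement_strand(elements: list[str]) -> list[str]:
    """Return the 5'->3' element order of the complementary strand.

    Composite elements such as "B15-ME" are split on "-", reversed and
    primed part by part, so "B15-ME" becomes "ME'-B15'".
    """
    result = []
    for element in reversed(elements):
        parts = [toggle_prime(p) for p in reversed(element.split("-"))]
        result.append("-".join(parts))
    return result


# Strand starting with P7 and ending with P5', with index sequences present.
strand = ["P7", "i7", "B15-ME", "insert401", "ME'-HYB2-ME",
          "insert402", "ME'-A14'", "i5'", "P5'"]

other_strand = complement_strand(strand)
print(" / ".join(other_strand))
```

  • Under these assumptions the derived order starts with P5, i5 and A14-ME and ends with ME'-B15', i7' and P7', which agrees with the layout given for the other strand.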
  • the present invention is directed to methods of SBS sequencing, and in particular double-stranded SBS sequencing. This leads to reductions in error rates - i.e. improvements in accuracy.
  • the method of the invention comprises a first sequencing read that sequences the first polynucleotide strand and a second sequencing read that sequences a second polynucleotide strand, where the first and second reads are temporally separated - that is, the first sequencing read is before the second sequencing read.
  • the method of the invention comprises altering the ratio of first to second sequencing reads, which in turn allows the sequencing reads to be differentiated, and allows the first and second polynucleotide strands to be sequenced simultaneously.
  • the methods of the present invention also enable SBS sequencing without the need for a paired-end turn and cluster re-synthesis. This in turn reduces the time taken to sequence a target polynucleotide, thus improving even further the efficiency of the sequencing protocol when using the method of the present invention.
  • a paired-end turn refers to the sequence of stages required to effectively invert the sequence for the second read in paired-end reading, after sequencing read 1 (Figure 5A). The paired-end turn may be facilitated by a cycle of bridge amplification and linearization.
  • the present invention also eliminates the need for cluster re-synthesis of the first or second polynucleotide sequences for read 2 of conventional SBS workflows (Figure 5A).
  • Cluster re-synthesis follows the paired-end turn, and consists of cycles of bridge amplification to restore the clusters of library preparations for the second read (Figure 5A).
  • eliminating these steps provides additional time savings.
  • the sequencing time can be almost halved.
  • the method of the invention also requires the redesigning of only the amplification primers, which also simplifies subsequent amplification and sequencing steps, as it is possible to rely on normal amplification and sequencing methods (such as those described above), rather than specifically designed amplification or sequencing methods to achieve double-stranded SBS sequencing, and in particular concurrent double-stranded SBS sequencing.
  • the method of the invention also simplifies the immobilised primers.
  • the immobilised primers of the present invention do not need to be cleavable from the solid support. Accordingly, in one embodiment, the immobilised primers may not comprise a cleavable site. In another embodiment, where the immobilised primers comprise a cleavable site, this cleavable site is different from, or cleavable under different cleavage conditions from, the cleavable sites present in the first and/or second sequencing primer. In another embodiment, the immobilised primers may not comprise a uracil or oxoguanine (e.g. 8-oxoguanine).
  • a method of identifying a target polynucleotide comprising a. hybridising the first polynucleotide sequence to first immobilised primers on a solid support and hybridising the second polynucleotide sequence to second immobilised primers on a solid support; b. synthesising a plurality of first and second polynucleotide sequences by conducting an amplification reaction to extend the first and second immobilised primers; c. selectively removing first and second immobilised primers that have not been extended; d.
  • first sequencing primers comprise a first cleavable site and wherein the second sequencing primers comprise a second cleavable site and conducting an amplification reaction to extend the first and second sequencing primers; and e. selectively cleaving the first cleavable site in the first sequencing primer and/or selectively cleaving the second cleavable site in the second sequencing primer.
  • the plurality of first and second polynucleotide sequences is synthesised by bridge or exclusion amplification, as described above. This leads to the formation of clusters or colonies of template molecules that are bound to the solid support via the immobilised primers. Accordingly, in one embodiment, synthesising a plurality of first and second polynucleotide sequences by conducting an amplification reaction forms a cluster or plurality of clusters. As described above, bridge amplification may be performed a number of times (i.e. a number of cycles are carried out) to obtain a cluster of first and second polynucleotide sequences.
  • excess solid support sequence(s) are removed after clustering.
  • by “excess solid support sequence(s)” is meant the immobilised primer sequences (e.g. P5, P7’) that have not been extended during the clustering stages. Removing excess immobilised primers may serve to prevent further bridge amplification in the subsequent amplification stage described below.
  • the excess immobilised primers are preferably removed by an exonuclease.
  • the exonuclease is a single-strand exonuclease.
  • Non-limiting examples of exonucleases include RecJf, Lambda Exonuclease, T7 exonuclease domain, T5 exonuclease, the DNA polymerase I-like H3TH domain, Exonuclease V (RecBCD) or Exonuclease VII.
  • Other suitable methods of removing a single-stranded DNA sequence known in the art may also be used to remove the excess immobilised primers.
  • the method of preparing a first and second polynucleotide sequence for sequencing comprises removing excess immobilised primers from the solid surface, such as using an exonuclease, and optionally a single-stranded exonuclease.
  • the library is linearised. Linearisation can occur as only one end of each side of the “bridge” formed during bridge amplification is attached (e.g. covalently bonded) to the solid support. The other side of the bridge is hybridized to its complementary immobilized primer. As shown in Figures 13C and 14C, linearization leads to a plurality of single-stranded first and second polynucleotide sequences immobilised on the solid support through the first and second immobilised primers. We may refer to these as first immobilised polynucleotide strands and second immobilised polynucleotide strands.
  • the first and second polynucleotides may be subjected to further amplification using first and second sequencing primers.
  • the method comprises applying (i.e. flowing across the surface of the solid support) a plurality of first and second sequencing primers.
  • the sequencing primers must be sufficiently complementary to the sequencing primer-binding sites within the first and second polynucleotide strands to allow hybridisation.
  • the first and second sequencing primers bind to the first and second polynucleotide strands at first and second sequencing primer-binding sites respectively.
  • the first and second sequencing primers bind (i.e. substantially hybridise) to sequencing primer-binding sites at the 3’ end of the immobilised first and/or second polynucleotide sequences.
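  • The requirement that a sequencing primer be sufficiently complementary to a binding site at the 3’ end of an immobilised strand can be illustrated with a minimal sketch. The sequences below are made up for illustration (they are not any of the SEQ ID NOs), and the `max_mismatches` threshold is an arbitrary placeholder rather than a value taken from the description.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def mismatches_at_3prime_end(template: str, primer: str) -> int:
    """Count mismatches when the primer anneals to the template's 3' end.

    Both sequences are written 5'->3'. A perfectly matching primer equals
    the reverse complement of the last len(primer) bases of the template.
    """
    site = template[-len(primer):]
    expected = reverse_complement(site)
    return sum(1 for a, b in zip(primer.upper(), expected) if a != b)


def can_hybridise(template: str, primer: str, max_mismatches: int = 2) -> bool:
    """Hypothetical rule: tolerate up to max_mismatches mismatched bases."""
    if len(primer) > len(template):
        return False
    return mismatches_at_3prime_end(template, primer) <= max_mismatches


template = "ACGTTGCAAGGCTTAC"          # illustrative immobilised strand
perfect_primer = reverse_complement(template[-8:])
print(perfect_primer)                   # GTAAGCCT
print(can_hybridise(template, perfect_primer))
```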
  • the first and second sequencing primers comprise at least one cleavable site.
  • Cleavage of the cleavable site occurs under cleavage conditions, i.e. reaction conditions that cause cleavage, and results in a nick or gap within the sequencing primer.
  • This cleaved site acts as the initiation site for sequencing by nick translation, which in particular may be carried out using a polymerase and a 5’ to 3’ exonuclease, or a polymerase with 5’ to 3’ exonuclease activity, as described below.
  • first cleavage conditions refers to reaction conditions that cause cleavage within the first only, or the first and second sequence primer (i.e. at the first or first and second cleavage site).
  • second cleavage conditions refers to reaction conditions that cause cleavage within the second and/or subsequent sequence primers (i.e. the second and/or subsequent cleavage site).
  • the method comprises exposing the immobilised sequencing primers to first and/or second cleavage conditions.
  • the location at which the first sequencing primer is configured to be cleavable under first cleavage conditions may also be referred to as a first cleavage site.
  • the first sequencing primer comprises a first cleavage site.
  • the first cleavage site may comprise a cleavable covalent bond. In some cases, when the first cleavage site is nicked, this allows sequencing to occur starting from the nick location (e.g. in conjunction with a polymerase and exonuclease).
  • the location at which the second sequencing primer is configured to be cleavable under second cleavage conditions may also be referred to as a second cleavage site.
  • the second sequencing primer comprises a second cleavage site.
  • the second cleavage site may comprise a cleavable covalent bond. In some cases, when the second cleavage site is nicked, this allows sequencing to occur starting from the nick location (e.g. in conjunction with a polymerase and exonuclease).
  • the method comprises selectively cleaving the first cleavable site in the first sequencing primer and/or selectively cleaving the second cleavable site in the second sequencing primer.
  • the cleavable site may be at the 5’ end, the 3’ end or within the sequencing primer; in particular, the cleavable site may be at the 3’ end of the sequencing primer.
  • a “cleavable covalent bond” refers to a covalent bond that can be cleaved, for example, under the application of heat, light or other (bio)chemical methods (e.g. by exposure to a degradation agent, such as an enzyme or a catalyst), while a “non-cleavable covalent bond” is stable to degradation under such conditions.
  • cleavable covalent bonds include thermally or photolytically cleavable cycloadducts (e.g. furan-maleimide cycloadducts), alkenylene linkages, esters, amides, acetals, hemiaminal ethers, aminals, imines, hydrazones, 1,2-diol linkages (e.g. glycols cleavable by periodates), polysulfide linkages (e.g. disulfide linkages), boron-based linkages (e.g. boronic and borinic acids/esters), silicon-based linkages (e.g. silyl ether, siloxane), and phosphorus-based linkages (e.g. phosphite, phosphate).
  • the first and/or second cleavage conditions may involve exposure to a thermal trigger (e.g. by heating), a light trigger (e.g. by exposure to ultraviolet light), and/or a chemical/biochemical trigger (e.g. an enzyme, periodate).
  • a proportion of first and/or second sequencing primers may be configured to be cleavable by a thermal trigger (e.g. by heating), a light trigger (e.g. by exposure to ultraviolet light), and/or a chemical/biochemical trigger (e.g. an enzyme, periodate).
  • the first cleavage conditions and the second cleavage conditions are different. This allows control over which sequencing primer is cleaved first, which in turn allows sequential sequencing of the first and then the second polynucleotide sequence, or vice versa.
  • the first cleavage conditions and the second cleavage conditions may be the same. This allows the first and second sequencing primers to be cleaved simultaneously, which in turn allows concurrent sequencing of the first and second polynucleotide sequences.
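  • The relationship between cleavage conditions and read mode can be expressed as a minimal sketch. The `SequencingPrimer` class and its `cleavage_trigger` field are hypothetical names introduced for illustration; the trigger labels reuse nicking enzymes named elsewhere in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequencingPrimer:
    name: str
    cleavage_trigger: str  # condition that nicks this primer (illustrative label)


def read_mode(first: SequencingPrimer, second: SequencingPrimer) -> str:
    """Same trigger -> both primers nick together -> concurrent reads."""
    if first.cleavage_trigger == second.cleavage_trigger:
        return "concurrent"
    return "sequential"


def primers_cleaved(primers: list[SequencingPrimer], applied: str) -> list[str]:
    """Names of the primers that are nicked when one condition is applied."""
    return [p.name for p in primers if p.cleavage_trigger == applied]


first = SequencingPrimer("first", "Nb.BbvCI")
second_diff = SequencingPrimer("second", "Nt.BspQI")
second_same = SequencingPrimer("second", "Nb.BbvCI")

print(read_mode(first, second_diff))                       # sequential
print(read_mode(first, second_same))                       # concurrent
print(primers_cleaved([first, second_diff], "Nb.BbvCI"))   # ['first']
```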
  • a portion of the first sequencing primers or the second sequencing primers comprise a cleavable site. That is, in one embodiment, the method comprises applying a plurality of first sequencing primers and second sequencing primers, where the first or second sequencing primers comprise a mix of cleavable sequencing primers and un-cleavable primers.
  • by “cleavable” is meant that the sequencing primers comprise at least one cleavable site.
  • by “un-cleavable” or “non-cleavable” is meant that the sequencing primers do not comprise a cleavable site.
  • the proportion of first sequencing primers that are cleavable relative to the proportion of first sequencing primers that are not cleavable may be between 20:80 and 80:20; in a further embodiment, between 1:2 and 2:1; and in an even further embodiment, about 1:1.
  • the proportion of second sequencing primers that are cleavable relative to the proportion of second sequencing primers that are not cleavable may be between 20:80 and 80:20; in a further embodiment, between 1:2 and 2:1; and in an even further embodiment, about 1:1.
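  • The ratios above can be converted into the fraction of primers that are cleavable, which makes the nested ranges easier to compare. The following sketch uses exact fractions; the function names are illustrative and the default bounds are simply the 20:80 and 80:20 limits quoted above.

```python
from fractions import Fraction


def cleavable_fraction(cleavable: int, uncleavable: int) -> Fraction:
    """Fraction of all primers that carry a cleavable site."""
    return Fraction(cleavable, cleavable + uncleavable)


def within_range(ratio: tuple[int, int],
                 low: tuple[int, int] = (20, 80),
                 high: tuple[int, int] = (80, 20)) -> bool:
    """True if a cleavable:un-cleavable ratio lies between low and high."""
    value = cleavable_fraction(*ratio)
    return cleavable_fraction(*low) <= value <= cleavable_fraction(*high)


for ratio in [(20, 80), (1, 2), (1, 1), (2, 1), (80, 20)]:
    print(ratio, cleavable_fraction(*ratio))
```

  • On this reading, 20:80 to 80:20 corresponds to one fifth to four fifths of the primers being cleavable, 1:2 to 2:1 corresponds to one third to two thirds, and 1:1 to one half.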
  • each first sequencing primer may comprise a sequence selected from one of SEQ ID NOs. 7 to 10 or a variant or fragment thereof; and each second sequencing primer may comprise a different sequence selected from SEQ ID NOs 7 to 10, or a variant or fragment thereof.
  • the cleavable site may comprise one or more unnatural nucleobases (i.e. nucleobases which are not usually present in a typical DNA sequence or RNA sequence).
  • unnatural nucleobases may include oxoguanine (e.g. 8-oxoguanine), hypoxanthine, xanthine, methylguanines (e.g. O6-methylguanine, N7-methylguanine), methyladenines (e.g. 3-methyladenine, N6-methyladenine), modified cytosines including methylcytosines (e.g. 5-methylcytosine, 5-hydroxymethylcytosine, 5-formylcytosine, 5-carboxylcytosine), dihydrouracil, and uracil (if the sequencing primer is a DNA sequence).
  • a proportion of first and/or second sequencing primers may be configured to be cleavable by a glycosylase.
  • the cleavage conditions involve exposure to a glycosylase or a glycosylase and an AP lyase or endonuclease
  • a proportion of first and/or second sequencing primers may be configured to be cleavable by a glycosylase that recognises any nitrogenous base (e.g. purine or pyrimidine) which is not selected from guanine (G), cytosine (C), adenine (A) and thymine (T) when the first and/or second sequencing primer is a DNA sequence; or the proportion of sequence primers may be configured to be cleavable by a glycosylase that recognises any nitrogenous base (e.g.
  • the glycosylase may recognise an unnatural nucleobase (i.e. one which is not usually present in a typical DNA sequence or RNA sequence).
  • unnatural nucleobases may include oxoguanine (e.g. 8-oxoguanine), hypoxanthine, xanthine, methylguanines (e.g. O6-methylguanine, N7-methylguanine), methyladenines (e.g. 3-methyladenine, N6-methyladenine), modified cytosines including methylcytosines (e.g. 5-methylcytosine, 5-hydroxymethylcytosine, 5-formylcytosine, 5-carboxylcytosine), dihydrouracil, and uracil (if the first sequencing primer is a DNA sequence).
  • the unnatural nucleobase will be located far enough within the sequencing primer both to allow hybridisation of the sequencing primer to the sequencing primer binding site on the immobilised strand and to leave a nicked sequencing primer that is long enough to be extended by a polymerase.
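  • The placement constraint on the unnatural nucleobase can be sketched as a simple length check. The sketch assumes that nicking removes the modified base and that the fragment 5’ of it stays hybridised and is extended; the `min_extendable` value of 15 nucleotides is an arbitrary placeholder, not a figure from this description, and the primer sequences are made up.

```python
def nicked_fragment_length(primer: str, modified_base: str = "U") -> int:
    """Length of the 5' fragment left after nicking at the modified base.

    The primer is written 5'->3'; the first occurrence of modified_base
    is taken as the cleavable site.
    """
    index = primer.upper().find(modified_base)
    if index < 0:
        raise ValueError("primer contains no cleavable base")
    return index


def placement_ok(primer: str, modified_base: str = "U",
                 min_extendable: int = 15) -> bool:
    """Hypothetical rule: the 5' fragment must reach min_extendable bases."""
    return nicked_fragment_length(primer, modified_base) >= min_extendable


good = "ACGTACGTACGTACGTACUGGCATT"   # uracil after 18 bases
bad = "ACGTUACGT"                     # uracil after only 4 bases

print(nicked_fragment_length(good), placement_ok(good))   # 18 True
print(nicked_fragment_length(bad), placement_ok(bad))     # 4 False
```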
  • a proportion of sequence primers may be configured to be cleavable by a uracil glycosylase (when the first sequencing primer is a DNA sequence) or an oxoguanine glycosylase (e.g. 8-oxoguanine glycosylase); and in a further embodiment, an oxoguanine glycosylase (e.g. 8-oxoguanine glycosylase).
  • the term “glycosylase” may refer to an enzyme which catalyses the removal of a nitrogenous base from one of the nucleotides in a (poly)nucleotide chain by breaking a N-glycosidic bond, resulting in the formation of an apurinic/apyrimidinic site (AP site).
  • An AP lyase or an endonuclease may then be used to digest the remaining phosphodiester 3’ bond in the polynucleotide, to create a nick.
  • the phosphodiester backbone will be cleaved in an uncatalysed manner by elimination.
  • a “glycosylase” refers to a nickase when used alone or in conjunction with an AP lyase or an endonuclease.
  • the glycosylase may recognise any nitrogenous base (e.g. purine or pyrimidine) which is not selected from guanine (G), cytosine (C), adenine (A) and thymine (T); for RNA chains, the glycosylase may recognise any nitrogenous base (e.g. purine or pyrimidine) which is not selected from guanine (G), cytosine (C), adenine (A) and uracil (U).
  • nitrogenous bases recognised by glycosylases include oxoguanine (e.g. 8-oxoguanine), uracil, inosine and alkylpurines.
  • Glycosylases may be monofunctional, such that they only possess glycosylase activity (i.e. breaking of the N-glycosidic bond) - cleavage of a phosphodiester bond in the sugar-phosphate backbone may then occur in an uncatalysed manner by elimination.
  • an AP lyase or endonuclease may be used in conjunction with a glycosylase to cleave the remaining phosphodiester bond to create a nick.
  • Other glycosylases may be bifunctional, such that they also possess AP lyase activity, catalysing cleavage of the phosphodiester bond of the (poly)nucleotide chain; an example is the Fpg enzyme.
  • the glycosylase is bifunctional (i.e. possesses both glycosylase and AP lyase activity).
  • the cleavable site may comprise inosine (e.g. deoxyinosine) and is recognised by an endonuclease (e.g. endonuclease V) to nick at the inosine residue.
  • the cleavable site may comprise uracil (e.g. deoxyuracil), and is recognised by a uracil DNA glycosylase and then endonuclease VIII. This combination is called the USER enzyme mix.
  • a 3’ phosphate group will remain. Therefore, treatment with a kinase (e.g. T4 kinase) will be required to release the 3’ OH group.
  • Kinases are well-known in the art and are widely available commercially.
  • the cleavable site may comprise oxoguanine (e.g. 8-oxoguanine).
  • the cleavable site may comprise uracil, particularly when the sequencing primer is a DNA sequence.
  • the cleavable site is an AP site and is recognised by an AP lyase.
  • the AP site may be due to a site modification within a nucleotide where the nitrogenous base has been removed.
  • the AP lyase will recognise the AP site and cleave the phosphodiester bond to create a nick.
  • the cleavable site is a restriction site. This allows the cleavage site to be nicked and allows sequencing to occur starting from the nick location (e.g. in conjunction with a polymerase as described below).
  • by “restriction site” is meant a sequence of nucleotides recognised by an endonuclease, such as a single-stranded endonuclease.
  • a restriction site may also be referred to as a “recognition site” or “recognition sequence”, and such terms may be used interchangeably.
  • the method comprises cleaving the first and/or second sequencing primer by applying (i.e. flowing over the solid support) an endonuclease.
  • the endonuclease is a single strand restriction endonuclease, a nicking endonuclease or nicking enzyme or nickase (again, such terms may be used interchangeably).
  • by a “nicking endonuclease”, “nicking enzyme” or “nickase” is meant an enzyme that can hydrolyze only one strand of the double-stranded polynucleotide (duplex), to produce DNA molecules that are “nicked”, rather than fully cleaved on both strands.
  • Suitable nicking enzymes include, but are not limited to, Nb.BbvCI, Nb.BsmI, Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, Nb.BssSI, Nb.Bpu10I and Nt.CviPII. These nickases can be used either alone or in various combinations.
  • restriction sites vary depending on the nickase used, and are well known in the art.
  • the restriction site is selected from the following:
  • the nickase is Nb.BssSI and the restriction site is CACGAG, wherein Nb.BssSI catalyzes a single strand break within the recognition sequence.
  • the nickase is Nt.BspQI and the restriction site is GCTCTTC(1/-7), wherein Nt.BspQI catalyzes a single strand break one base beyond the 3’ side of the restriction site.
  • the restriction site must be at least 1 base from the 3’ end of the sequencing primer to enable nicking.
  • the nickase is Nt.CviPII and the restriction site is (0/-1)CCD, wherein Nt.CviPII catalyzes a single strand break at the 5’ side of the restriction site.
  • the nickase is Nt.BstNBI and the restriction site is GAGTC(4/-5), wherein Nt.BstNBI catalyzes a single strand break four bases beyond the 3’ side of the restriction site.
  • the nickase is Nb.BsrDI and the restriction site is GCAATG, wherein Nb.BsrDI catalyzes a single strand break within the restriction site.
  • the nickase is Nb.BtsI and the restriction site is GCAGTG, wherein Nb.BtsI catalyzes a single strand break within the restriction site.
  • the nickase is Nt.AlwI and the restriction site is GGATC(4/-5), wherein Nt.AlwI catalyzes a single strand break four bases beyond the 3’ side of the restriction site.
  • the nickase is Nb.BbvCI and the restriction site is CCTCAGC, wherein Nb.BbvCI catalyzes a single strand break within the restriction site.
  • the nickase is Nb.BsmI and the restriction site is GAATGC, wherein Nb.BsmI catalyzes a single strand break within the restriction site.
  • the nickase is Nt.BsmAI and the restriction site is GTCTC(1/-5), wherein Nt.BsmAI catalyzes a single strand break one base beyond the 3’ side of the restriction site.
  • the nickase is Nb.Bpu10I and the restriction site is CCTNAGC, wherein Nb.Bpu10I catalyzes a single strand break within the restriction site.
  • in the notation (x/-y) used for the restriction sites above, x is the number of nucleotides beyond (i.e. 3’ of) the 3’ end of the restriction site where cleavage occurs; and y is the number of nucleotides in the restriction site.
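  • The (x/-y) notation can be turned into a nick position with a few lines of code. The sketch below covers only the four Nt. enzymes whose sites are given above with an explicit offset x; the input sequences are made up, the function name is illustrative, and treating a nick that would fall at or beyond the end of the supplied sequence as "no nick" is an assumption of this sketch.

```python
# Recognition site and offset x, taken from the (x/-y) entries listed above.
NT_NICKASES = {
    "Nt.BspQI": ("GCTCTTC", 1),
    "Nt.BstNBI": ("GAGTC", 4),
    "Nt.AlwI": ("GGATC", 4),
    "Nt.BsmAI": ("GTCTC", 1),
}


def nick_index(strand: str, nickase: str):
    """Index at which the strand (5'->3') is split, or None if no nick.

    The nick lies x bases 3' of the recognition site, so the 5' fragment
    is strand[:index] and the 3' fragment is strand[index:].
    """
    site, x = NT_NICKASES[nickase]
    start = strand.upper().find(site)
    if start < 0:
        return None                      # recognition site absent
    cut = start + len(site) + x
    if cut >= len(strand):
        return None                      # no bond left to break in this sequence
    return cut


strand = "AAGAGTCTTTTGGG"                # illustrative sequence containing GAGTC
cut = nick_index(strand, "Nt.BstNBI")
print(cut, strand[:cut], strand[cut:])   # 11 AAGAGTCTTTT GGG
```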
  • the endonuclease is a Cas9 nickase.
  • Cas9 nickase examples include Cas9 D10A and Cas9 H840A.
  • the Cas9 protein may comprise the D10A or H840A amino acid substitutions. These nickases cleave only the DNA strand that is complementary to and recognized by a gRNA.
  • the restriction site may be or may comprise a PAM (protospacer adjacent motif) sequence.
  • PAM sequences include NGG, NGAG, NGCG, NGN, NG, GAA, GAT, NNG, NGN, NRN, YG, NNGRRT, NNNRRT, NNAGAA, NNNNGATT and NNNNCRAA and complements thereof.
  • the Cas9 protein may alternatively or additionally comprise the N863A or N854A amino acid substitutions.
  • the Cas9 protein has been modified to improve activity.
  • the Cas9 protein may additionally comprise a D1135E substitution.
  • the Cas9 protein may also be the VQR variant.
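  • The PAM sequences listed above use IUPAC ambiguity codes (N = any base, R = A or G, Y = C or T). A minimal sketch of locating such a motif in a sequence is given below; it uses a regular expression with a lookahead so that overlapping matches are reported. The test sequences and function names are illustrative, and the sketch searches one strand only.

```python
import re

# IUPAC codes used by the PAM motifs listed above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]",
}


def motif_to_regex(motif: str) -> str:
    """Translate an IUPAC motif such as NGG into a regex fragment."""
    return "".join(IUPAC[code] for code in motif.upper())


def find_motif(sequence: str, motif: str) -> list[int]:
    """Start positions (0-based) of every match, overlaps included."""
    pattern = re.compile(f"(?={motif_to_regex(motif)})")
    return [m.start() for m in pattern.finditer(sequence.upper())]


print(find_motif("ATGGCGGTT", "NGG"))        # [1, 4]
print(find_motif("TTCCGAGTAA", "NNGRRT"))    # [2]
```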
  • the method may comprise the step of conducting an amplification reaction to extend the hybridised first and second sequencing primers.
  • the amplification reaction extends the first and second sequencing primers, using the first and second immobilised strands as a template and a polymerase, to produce an extended polynucleotide strand.
  • the resulting strand will comprise a complement of all or the majority of the first or second polynucleotide immobilised strand, including the insert.
  • the necessary conditions to enable amplification have been discussed above.
  • the newly synthesised polynucleotide strand remains unattached from the solid surface, and is hybridised to the template strand.
  • a number of rounds of amplification are carried out before cleavage. In another embodiment, one round of amplification is carried out before cleavage.
  • where an immobilised strand comprises a first and second portion of a polynucleotide, for example from loop fork preparation, one or more rounds of amplification may be carried out before cleavage.
  • a method of sequencing a first and second polynucleotide sequence comprises preparing first and second polynucleotide sequences for sequencing using the method described above for preparing a first and second polynucleotide sequence for sequencing; and sequencing nucleobases in the first polynucleotide sequence and the second polynucleotide sequence.
  • the template provides information (e.g. identification of the genetic sequence, identification of epigenetic modifications) on the original target polynucleotide sequence.
  • a sequencing process e.g. a sequencing-by-synthesis or sequencing-by-ligation process
  • the sequencing process comprises a first sequencing read and a second sequencing read. These sequencing reads may be carried out sequentially or concurrently. In one embodiment, the first sequencing read and the second sequencing read are conducted concurrently. In other words, the first sequencing read and the second sequencing read may be conducted at the same time. In one embodiment, the first read sequences the forward strand of a library strand, and the second read sequences the reverse strand of a library strand.
  • the first and second sequencing read may comprise cleaving the sequence bridge at the above-described cleavage sites. Cleavage of the above-described cleavage sites may follow after a plurality of rounds of bridge amplification. In one embodiment, the cleavage sites are cleaved following at least 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 cycles, following at least 15 cycles, following at least 20 cycles or following at least 25 cycles of bridge amplification.
  • sequencing of the first polynucleotide sequence and the second polynucleotide sequence occurs sequentially. That is, read one comprises sequencing the first polynucleotide sequence and read two comprises sequencing the second polynucleotide sequence or vice versa, and read one and two are temporally separated.
  • steps A to C the first and second polynucleotide sequences are hybridised to immobilised primers on the solid support (A), and a cluster comprising a plurality of immobilised first and second polynucleotide strands is generated using bridge amplification, as described in detail elsewhere.
  • Excess primers are then removed (B), for example using a single-stranded exonuclease, and sequencing primers flowed across the solid support to hybridise at the 3’ ends of the immobilised first and second polynucleotide sequences (C); again as described in detail above.
  • the first and second sequencing primers comprise different cleavage sites.
  • the first sequencing primers are cleaved by applying first cleavage conditions, such as a first nicking enzyme.
  • the second sequencing primers are cleaved by applying second cleavage conditions, such as a second nicking enzyme. Either leads to the cleavage (i.e. a single-stranded cleavage or nicking) of either the first or second sequencing primer, as shown in Figure 13D.
  • This allows sequencing to be performed starting from the nicked location.
  • the nicked sites provide sequencing start sites for nick translation or nick sequencing.
  • read one comprises cleaving the first or second sequencing primer and sequencing the immobilised strand that is hybridised to the first or second sequencing primer.
  • the method may comprise blocking all or substantially all free 3’ ends.
  • free 3’ ends will be present on the read 1 sequenced strand.
  • the blocking group may be any modification that prevents extension (i.e. elongation) of the free end by a polymerase. This serves to ensure that all polymerases are available to sequence the second sequencing strand.
  • suitable blocking groups include a hairpin loop (e.g. a polynucleotide attached to the 3’-end comprising, in a 5’ to 3’ direction, a cleavable site such as a nucleotide comprising uracil, a loop portion, and a complement portion, wherein the complement portion is substantially complementary to all or a portion of the lawn primer), a hydrogen atom instead of a 3’-OH group, a phosphate group, a propyl spacer (e.g. -O-(CH2)3-OH instead of a 3’-OH group), a modification blocking the 3’-hydroxyl group (e.g. hydroxyl protecting groups, such as silyl ether groups, ether groups (e.g. benzyl, allyl, t-butyl, methoxymethyl (MOM), 2-methoxyethoxymethyl (MEM), tetrahydropyranyl) and acyl groups (e.g. acetyl, benzoyl)).
  • the sequencing primer not cleaved in read 1 is cleaved.
  • the second or first sequencing primers are cleaved by applying second or first cleavage conditions (such as a second or first nicking enzyme). Again, either leads to the cleavage (i.e. a single-stranded cleavage or nicking) of either the second or first sequencing primer, as shown in Figure 13F.
  • read two comprises cleaving the second or first sequencing primer and sequencing the immobilised strand that is hybridised to the second or first sequencing primer.
  • a method of sequencing a first polynucleotide and a second polynucleotide sequentially allows the sequencing of a first and second polynucleotide sequence without needing to re-synthesise the second polynucleotide sequence prior to sequencing (also referred to as cluster resynthesis), which is required for read 2 under conventional SBS workflows (Figure 5A). This in turn reduces the time required to sequence a target polynucleotide compared to conventional single strand sequencing.
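  • The order of operations in the sequential embodiment can be summarised as a short sketch. It simply lists the steps described above in order and checks that the two cleavage conditions differ; the function name and the step wording are illustrative, and the example triggers are nicking enzymes named earlier in this description.

```python
def sequential_workflow(first_trigger: str, second_trigger: str) -> list[str]:
    """Ordered steps for two temporally separated reads.

    Sequential reads rely on the two sequencing primers being nicked
    under different cleavage conditions.
    """
    if first_trigger == second_trigger:
        raise ValueError("sequential reads need different cleavage conditions")
    return [
        f"apply {first_trigger}: nick the first sequencing primers",
        "read 1: sequence from the nick in the first sequencing primers",
        "block free 3' ends left by read 1",
        f"apply {second_trigger}: nick the second sequencing primers",
        "read 2: sequence from the nick in the second sequencing primers",
    ]


steps = sequential_workflow("Nb.BbvCI", "Nt.BspQI")
for number, step in enumerate(steps, start=1):
    print(number, step)
```

  • No paired-end turn or cluster re-synthesis step appears between read 1 and read 2 in this list, which is the source of the time saving described above.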
  • a method of sequencing a first polynucleotide and a second polynucleotide concurrently. That is, read 1 and read 2 are carried out at the same time or substantially the same time.
  • steps A to C the first and second polynucleotide sequences are hybridised to immobilised primers on the solid support (A), and a cluster comprising a plurality of immobilised first and second polynucleotide strands is generated using bridge amplification, as described in detail elsewhere. Excess primers are then removed, for example using a single-stranded exonuclease, and sequencing primers flowed across the solid support to hybridise at the 3’ ends of the immobilised first and second polynucleotide sequences; again as described in detail above.
  • the first and second sequencing primers comprise the same cleavage sites.
  • the first and second sequencing primers are cleaved by applying the same cleavage conditions (i.e. cleaved by applying the same nicking enzyme).
  • when this cleavage condition is applied, both the first and second sequencing primers are simultaneously cleaved, as shown in Figure 14D.
  • concurrent sequencing may be achieved using a mixture of cleavable and non-cleavable first sequencing primers and cleavable second sequencing primers.
  • any one of the above described cleavage conditions for example, any of the above-described nicking enzymes is applied
  • a portion of the first sequencing primers are cleaved and all or substantially all of the second sequencing primers are cleaved.
  • concurrent sequencing may be achieved using a mixture of cleavable and non-cleavable second sequencing primers and cleavable first sequencing primers.
  • any one of the above described cleavage conditions for example, any of the above-described nicking enzymes is applied
  • a portion of the second sequencing primers are cleaved and all or substantially all of the first sequencing primers are cleaved.
  • as shown in Figure 14E, either alternative allows sequencing of the first and second polynucleotide sequences to be performed starting from the nicked location.
  • any ratio of cleavable:un-cleavable first or second sequencing primers can be used that generates or leads to a sequencing signal that is of a lower intensity than the second or first sequencing signal respectively.
  • the ratio of cleavable:un-cleavable primers may be: 20:80 to 80:20, or 1:2 to 2:1, or 1:1.
  • a ratio of 50:50 of cleavable:un-cleavable first or second sequencing primers is used, which in turn generates a sequencing signal that is around 50% of the intensity of the second or first sequencing signal respectively.
  • this enables at least a doubling of throughput of a sequencing reaction (i.e. increased sequencing efficiency) as well as a decrease in the time taken to sequence target polynucleotide strands.
  • designing this feature into the sequencing primers simplifies the adaptors and the immobilised primers on the solid support, as described above.
  • the above-described embodiment would allow spatially separated clusters to be read in a temporally simultaneous manner through the generation of an optically unresolved signal that can be analytically separated using 16QAM, as described below.
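The relationship between the cleavable:un-cleavable primer ratio and the resulting relative signal intensity described above can be illustrated with a short sketch. It assumes, purely for illustration, that signal intensity scales linearly with the fraction of primers that can be cleaved; the function name `relative_signal` is hypothetical and does not appear in the source.

```python
def relative_signal(cleavable: float, uncleavable: float) -> float:
    """Return the fraction of primers that can be nicked.

    Assumption (not from the source): signal intensity is proportional
    to the fraction of cleavable primers in the mixed population.
    """
    total = cleavable + uncleavable
    if total <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    return cleavable / total


# Ratios named in the text: 20:80, 50:50 (i.e. 1:1) and 80:20.
for ratio in [(20, 80), (50, 50), (80, 20)]:
    print(f"{ratio[0]}:{ratio[1]} -> {relative_signal(*ratio):.2f} of full intensity")
```

Under this assumption, a 50:50 mixture gives a signal at around half the intensity of a fully cleavable primer population, in line with the figure quoted above.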
  • the first polynucleotide sequence is a forward strand of a double-stranded polynucleotide to be identified and the second polynucleotide sequence is a reverse strand of a double-stranded polynucleotide to be identified.
  • the first polynucleotide sequence comprises a first portion to be identified and a second portion to be identified.
  • the second polynucleotide sequence comprises a first portion to be identified and a second portion to be identified.
  • the first and second portions may be genetically unrelated, as described above (Figures 16 and 17). Such sequences can be produced by PCR-stitching, as described above.
  • the first portion on the first polynucleotide sequence may be a forward strand of a target polynucleotide and the second portion on the first polynucleotide sequence may be the reverse complement of the forward strand of a target polynucleotide, i.e. a copy of the forward strand.
  • the first portion on the second polynucleotide sequence may be a reverse strand of a target polynucleotide and the second portion on the second polynucleotide sequence may be the forward complement of the forward strand of a target polynucleotide, i.e. a copy of the reverse strand.
  • first and second polynucleotide sequences comprise first and second portions
  • a further round of sequencing may be performed to sequence the first and second portions respectively.
  • the first and second portions of the first polynucleotide and the first and second portions of the second polynucleotide are sequenced one-at-a-time. That is, each portion of each polynucleotide is sequenced sequentially.
  • the first portion of the first polynucleotide may be sequenced using a first cleavable sequencing primer
  • the second portion of the first polynucleotide may be sequenced using a second cleavable sequencing primer
  • the first portion of the second polynucleotide may be sequenced using a third cleavable sequencing primer
  • the second portion of the second polynucleotide may be sequenced using a fourth cleavable sequencing primer
  • the first, second, third and fourth cleavable sequencing primers comprise first, second, third and fourth cleavable sites that are all different.
  • the method may further require blocking any free 3’ ends of the sequencing strand following completion of each read sequencing, as described above.
  • the portions are sequenced, as long as the four portions are sequenced sequentially.
  • the first and then the second portion of the first polynucleotide will be sequenced, followed by the first and then the second portion of the second polynucleotide.
  • the result will be four sequencing reads.
  • an index read i.e. sequencing of one or more of the index sequences
  • One or more index reads may be performed before or after the SBS primer reads.
  • the extended SBS primer strands are denatured (for example using heat or NaOH) and washed away.
  • Index primer 1 and Index primer 2 (if there are dual indexes) are sequentially hybridized to the flow cell and read using standard SBS.
  • the index reads are performed first. Index 1 is hybridized and read, then denatured and washed off, following which, Index 2 (if needed) is hybridized and read, then denatured and washed off. Then, the amplification with in-solution SBS primers and nick-translation SBS is performed.
  • sequencing is initiated at the cleavage site within the first and/or second sequencing primer that results from the cleavage event/application of the cleavage conditions.
  • the polymerase enzyme binds to this site and incorporates complementary nucleotides base by base into the growing opposite strand.
  • the method comprises applying a polymerase.
  • the polymerase may be a strand displacement polymerase (e.g. phi29).
  • a polymerase and a 5’ to 3’ exonuclease or a polymerase with 5’ to 3’ exonuclease activity is used.
  • the 5’ to 3’ exonuclease activity is used to essentially clear the path ahead of the growing strand (i.e. remove the downstream hybridised strand) for the polymerase to follow and incorporate a complementary nucleotide.
  • Exonucleases are enzymes that catalyse the removal of individual nucleotides from an end of a polynucleotide chain, by cleaving the phosphodiester bond via hydrolysis. Exonucleases are described as having a 5’ to 3’ activity or 3’ to 5’ activity, according to the direction in which they travel along and digest the polynucleotide.
  • the DNA polymerase has native 5’-3’ exonuclease activity. That is, the DNA polymerase naturally has 5’ to 3’ exonuclease activity. Examples of DNA polymerase with native 5’ to 3’ exonuclease activity include: Taq DNA Polymerase, T7 DNA polymerase and DNA Polymerase I.
  • a DNA polymerase and an exonuclease are separately applied (i.e. flowed across the solid support) either sequentially or concurrently.
  • the exonuclease is applied prior to incorporation of the polymerase.
  • examples of suitable exonucleases include RecJf, Lambda Exonuclease, T7 exonuclease domain, T5 exonuclease, or the DNA polymerase I-like H3TH domain, Exonuclease V (RecBCD) or Exonuclease VII, with ss- and ds-DNA exonuclease activity.
  • the DNA polymerase has been engineered to have 5’ to 3’ exonuclease activity.
  • a DNA polymerase may be fused with a protein possessing 5’ to 3’ exonuclease activity.
  • the resulting fusion protein will contain an exonuclease domain, a DNA-binding domain and a polymerase domain.
  • Suitable DNA polymerases include Pfu DNA polymerase, DNA Pol 5, Therminator™ DNA Polymerase, DNA Polymerase III.
  • Commercially available fusion proteins include the Phusion DNA polymerase.
  • a fusion protein could be engineered using the DNA polymerases and exonucleases discussed above. The skilled person would understand that fusion proteins can be recombinant fusion proteins, created through genetic engineering of a fusion gene. Most commonly, fusion proteins are created by tandem fusion or linker-mediated fusion.
  • exonuclease activity reduces or eliminates issues caused by having a displaced strand in the well.
  • the exonuclease’s gradual digestion of the strand ahead of the polymerase activity eliminates the displaced strand and, with it, the issues caused by displaced strand reannealing, which in turn can affect the sequencing signal.
  • sequencing may be carried out using any suitable "sequencing-by-synthesis" technique, wherein nucleotides are added successively in cycles to the free 3' hydroxyl group, resulting in synthesis of a polynucleotide chain in the 5' to 3' direction.
  • the nature of the nucleotide added may be determined after each addition.
  • One particular sequencing method relies on the use of modified nucleotides that can act as reversible chain terminators. Such reversible chain terminators comprise removable 3' blocking groups.
  • the modified nucleotides may carry a label to facilitate their detection.
  • a label may be configured to emit a signal, such as an electromagnetic signal, or a (visible) light signal.
  • the label is a fluorescent label (e.g. a dye).
  • the label may be configured to emit an electromagnetic signal, or a (visible) light signal.
  • One method for detecting the fluorescently labelled nucleotides comprises using laser light of a wavelength specific for the labelled nucleotides, or the use of other suitable sources of illumination.
  • the fluorescence from the label on an incorporated nucleotide may be detected by a CCD camera or other suitable detection means. Suitable detection means are described in PCT/US2007/007991, the contents of which are incorporated herein by reference in their entirety.
  • the detectable label need not be a fluorescent label. Any label can be used which allows the detection of the incorporation of the nucleotide into the DNA sequence.
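The reversible-terminator cycle described above (incorporate a labelled, 3'-blocked nucleotide, detect it, then remove the block so the next nucleotide can be added) can be sketched as a toy simulation. This is a minimal illustration only; the class and function names are hypothetical, and real chemistry, imaging and phasing effects are not modelled.

```python
from dataclasses import dataclass

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


@dataclass
class IncorporatedNucleotide:
    base: str
    blocked: bool = True   # removable 3' blocking group present
    labelled: bool = True  # detectable label present


def run_sbs(template: str, cycles: int) -> str:
    """Simulate sequencing-by-synthesis, one base call per cycle."""
    strand = []
    calls = []
    for position in range(min(cycles, len(template))):
        # Extension is only possible from an unblocked 3' end.
        if strand and strand[-1].blocked:
            raise RuntimeError("3' end is still blocked")
        nucleotide = IncorporatedNucleotide(COMPLEMENT[template[position]])
        strand.append(nucleotide)
        # Detection step: the label identifies the incorporated base.
        calls.append(nucleotide.base)
        # Deblocking step: remove the label and the 3' block.
        nucleotide.labelled = False
        nucleotide.blocked = False
    return "".join(calls)


print(run_sbs("TACG", 3))
```

Because each incorporated nucleotide is blocked until it has been detected, exactly one base is added and called per cycle.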
  • Each cycle may involve simultaneous delivery of four different nucleotide types to the array of template molecules.
  • different nucleotide types can be added sequentially and an image of the array of template molecules can be obtained between each addition step.
  • each nucleotide type may have a (spectrally) distinct label.
  • four channels may be used to detect four nucleobases (also known as 4-channel chemistry) (Figure 8 - top left).
  • a first nucleotide type e.g. A
  • a second nucleotide type e.g. G
  • a second label e.g. configured to emit a second wavelength, such as blue light
  • a third nucleotide type e.g. T
  • a third label e.g.
  • a fourth nucleotide type may include a fourth label (e.g. configured to emit a fourth wavelength, such as yellow light).
  • Four images can then be obtained, each using a detection channel that is selective for one of the four different labels.
  • the first nucleotide type e.g. A
  • the second nucleotide type e.g. G
  • the second channel e.g. configured to detect the second wavelength, such as blue light
  • the third nucleotide type e.g. T
  • a third channel e.g.
  • the fourth nucleotide type (e.g. C) may be detected in a fourth channel (e.g. configured to detect the fourth wavelength, such as yellow light).
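The 4-channel scheme described above, in which each nucleotide type is detected in its own channel, can be sketched as a simple base caller. The channel order (A, G, T, C) follows the "e.g." assignments in the text; the function name and the rule of calling the brightest channel are illustrative assumptions, not the method of the source.

```python
# Channel 1..4 -> nucleotide type, following the example assignments in the text.
CHANNEL_TO_BASE = ("A", "G", "T", "C")


def call_base_4ch(intensities):
    """Call the base whose channel shows the highest intensity.

    `intensities` holds one value per detection channel (four images).
    Picking the brightest channel is an assumed, simplified decision rule.
    """
    if len(intensities) != 4:
        raise ValueError("expected one intensity per channel (4 values)")
    brightest = max(range(4), key=lambda i: intensities[i])
    return CHANNEL_TO_BASE[brightest]


# A cluster that is bright only in the second channel is called as G.
print(call_base_4ch([0.1, 0.9, 0.2, 0.1]))
```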
  • detection of each nucleotide type may be conducted using fewer than four different labels.
  • sequencing-by-synthesis may be performed using methods and systems described in US 2013/0079232, which is incorporated herein by reference.
  • two channels may be used to detect four nucleobases (also known as 2-channel chemistry) (Figure 8 - top right).
  • a first nucleotide type e.g. A
  • a second label e.g. configured to emit a second wavelength, such as red light
  • a second nucleotide type e.g. G
  • a third nucleotide type e.g. T
  • the first label e.g.
  • the first nucleotide type (e.g. A) may be detected in both a first channel (e.g. configured to detect the first wavelength, such as red light) and a second channel (e.g. configured to detect the second wavelength, such as green light), the second nucleotide type (e.g.
  • the third nucleotide type (e.g. T) may be detected in the first channel (e.g. configured to detect the first wavelength, such as red light) and may not be detected in the second channel
  • the fourth nucleotide type (e.g. C) may not be detected in the first channel and may be detected in the second channel (e.g. configured to detect the second wavelength, such as green light).
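The 2-channel scheme described above can be sketched as a lookup from on/off states in the two channels to a base. The signatures for A, T and C follow the text; the text is truncated where it describes the second nucleotide type (G), so this sketch assumes G is detected in neither channel. The threshold value and function name are likewise hypothetical.

```python
# (detected in first channel, detected in second channel) -> base
SIGNATURES_2CH = {
    (True, True): "A",    # detected in both channels
    (True, False): "T",   # first channel only
    (False, True): "C",   # second channel only
    (False, False): "G",  # assumption: dark in both channels
}


def call_base_2ch(ch1: float, ch2: float, threshold: float = 0.5) -> str:
    """Call a base from two channel intensities using a fixed threshold."""
    return SIGNATURES_2CH[(ch1 >= threshold, ch2 >= threshold)]


print(call_base_2ch(0.9, 0.8))  # both channels on
print(call_base_2ch(0.1, 0.9))  # second channel only
```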
  • one channel may be used to detect four nucleobases (also known as 1-channel chemistry) (Figure 8 - bottom).
  • a first nucleotide type e.g. A
  • a second nucleotide type e.g. G
  • a third nucleotide type e.g. T
  • a non-cleavable label e.g. configured to emit the wavelength, such as green light
  • a fourth nucleotide type e.g. C
  • a label-accepting site which does not include the label.
  • a first image can then be obtained, and a subsequent treatment carried out to cleave the label attached to the first nucleotide type, and to attach the label to the label-accepting site on the fourth nucleotide type.
  • a second image may then be obtained.
  • the first nucleotide type e.g. A
  • the second nucleotide type e.g. G
  • the third nucleotide type e.g. T
  • the channel e.g.
  • the fourth nucleotide type (e.g. C) may not be detected in the channel in the first image and may be detected in the channel in the second image (e.g. configured to detect the wavelength, such as green light).
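The 1-channel scheme described above distinguishes bases by comparing two images taken in the same channel, before and after the treatment that cleaves one label and attaches another. A minimal sketch follows. The signatures for A (cleavable label), T (non-cleavable label) and C (label-accepting site) follow the text; the behaviour of G is not legible in the text, so it is assumed here to be dark in both images. Names and threshold are hypothetical.

```python
# (detected in first image, detected in second image) -> base
SIGNATURES_1CH = {
    (True, False): "A",   # cleavable label: removed before the second image
    (True, True): "T",    # non-cleavable label: present in both images
    (False, True): "C",   # label attached to the label-accepting site after image 1
    (False, False): "G",  # assumption: never labelled
}


def call_base_1ch(image1: float, image2: float, threshold: float = 0.5) -> str:
    """Call a base from the intensities of one cluster in two sequential images."""
    return SIGNATURES_1CH[(image1 >= threshold, image2 >= threshold)]


print(call_base_1ch(0.9, 0.1))  # bright then dark
print(call_base_1ch(0.1, 0.9))  # dark then bright
```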
  • each adaptor sequence comprises two or at least two cleavage sites (or complements of the cleavage sites), as described above.
  • the 5’ adaptor may comprise at least two cleavage sites, as described above, and the 3’ adaptor comprises at least two complements of the same cleavage sites that are in the 5’ adaptor, again as described above.
  • Binding of a single-stranded library strand (which includes the insert and a 5’ and 3’ adaptor sequence) to a first (or second) immobilised primer and extension of the first (or second) immobilised strand results in a first (or second) immobilised strand comprising two or at least two 5’ cleavage sites and at least two or two 3’ complements of the same cleavage sites.
  • the second (or first) immobilised resynthesized strand will also comprise at least two or two 5’ cleavable sites and at least two or two 3’ complements of (the same) cleavable sites.
  • both 5’ ends of the first and second immobilised strands now comprise two or at least two cleavable sites.
  • the cleavable sites at the 5’ ends of both immobilised strands are simultaneously cleaved. As shown in Figure 19, this leads to cleavage of just one strand of the sequence bridge, at each 5’ end, of the immobilised sequences - in other words, a nick at two sites (e.g. a double nick), at both 5’ ends of the sequence bridges.
  • each adaptor sequence comprises a single cleavage site or complement of a cleavage site and each immobilised primer comprises at least one cleavable site, as described in detail above.
  • the 5’ adaptor comprises a single cleavage site and the 3’ adaptor comprises a complement of the same cleavage site (that is in the 5’ adaptor).
  • Binding of a single-stranded library strand (which includes the insert and a 5’ and 3’ adaptor sequence) to a first (or second) immobilised primer and extension of the first (or second) immobilised strand results in a first (or second) immobilised strand comprising a 5’ cleavage site and a 3’ complement of the same cleavage site.
  • the second (or first) immobilised resynthesized strand will also comprise a 5’ cleavable site and a 3’ complement of (the same) cleavable site.
  • both 5’ ends of the first and second immobilised strands now comprise a cleavable site.
  • the cleavable site in the immobilised primer and the cleavable site at the 5’ end of both immobilised strands are simultaneously cleaved.
  • the cleavage sites are restriction sites
  • the cleavable sites are cleaved using an endonuclease, such as a single strand restriction endonuclease. Examples of suitable endonucleases are described above.
  • the cleavage sites can be cleaved simultaneously.
  • the cleavage sites are restriction sites (the same restriction site) for a nicking endonuclease
  • the cleavage sites can be cleaved by the addition of the (a single) nicking endonuclease.
  • different cleavage sites are used - for example, if the 5’ and 3’ adaptor of one strand of the library strand (e.g. the forward strand) comprises a first cleavage site and complement of the first cleavage site respectively, and the 5’ and 3’ adaptor of another strand of the library strand (e.g.
  • both the first and second cleavage sites can be simultaneously cleaved, for example by the simultaneous addition of a first and second endonuclease.
  • the first and second cleavage sites can be sequentially cleaved by the sequential addition of e.g. a first and second endonuclease.
  • the method further comprises applying (i.e. adding/ flowing over the surface of the solid support) a first endonuclease.
  • the method comprises applying (i.e. adding/ flowing over the surface of the solid support) a first and second endonuclease.
  • the first and second endonucleases may be applied sequentially or concurrently.
  • the sequence between the two cleavage sites is washed away.
  • the conditions required to wash away the sequence between the two cleavage sites may be determined empirically, and will depend on the sequence makeup of the double nicked insert. In essence, whatever temperature is sufficient to melt the bridge structure is too high. A lower temperature can be determined that is sufficient to denature the short, double-nicked insert. In some embodiments, it could be as low as 50°C, but as explained above, will depend on the sequence context of the insert. As described above, given the position of the cleavage sites the sequence that is washed away will either comprise the sequencing primer binding site or the sequencing primer binding site and an index sequence. However, in all embodiments described, the sequencing start sites are formed simultaneously by e.g. nicking enzymes, which therefore allows both strands of the duplex to be sequenced simultaneously.
  • the method comprises blocking all or substantially all free 3’ ends of the immobilised strands.
  • the blocking group may be any modification that prevents extension (i.e. elongation) of the free end by a polymerase.
  • suitable blocking groups include a hairpin loop (e.g.
  • a polynucleotide attached to the 3’- end comprising in a 5’ to 3’ direction, a cleavable site such as a nucleotide comprising uracil, a loop portion, and a complement portion, wherein the complement portion is substantially complementary to all or a portion of the lawn primer), a hydrogen atom instead of a 3’-OH group, a phosphate group, a propyl spacer (e.g.-O-(CH2)3-OH instead of a 3’-OH group), a modification blocking the 3’-hydroxyl group (e.g. hydroxyl protecting groups, such as silyl ether groups (e.g.
  • ether groups e.g. benzyl, allyl, t-butyl, methoxymethyl (MOM), 2-methoxyethoxymethyl (MEM), tetrahydropyranyl
  • acyl groups e.g. acetyl, benzoyl
  • the method comprises simultaneously sequencing the first and second immobilised strands. Accordingly, the method comprises carrying out a first sequencing read to determine the sequence of the first and second immobilised strands simultaneously, such as by a sequencing-by-synthesis technique or by a sequencing-by-ligation technique.
  • the method comprises applying (i.e. adding/ flowing over the surface of the solid support) a plurality of first and second sequencing primers.
  • the first sequencing primers are capable of hybridising to a first sequencing primer binding site on the second immobilised strand.
  • the second sequencing primers are capable of hybridising to a second sequencing primer binding site on the first immobilised strand.
  • washing away the sequence between the two cleavage sites provides a binding site(s) for sequencing primers.
  • the nick sites provide sequencing start sites for nick translation or nick sequencing.
  • by “nick translation” is meant a process of extending a polynucleotide from a single-strand break in the polynucleotide (hereafter referred to as a ‘nick site’ or ‘cleavable site’) by incorporating labelled nucleotides at one side of the nick site while removing nucleotides from the other side of the nick site, resulting in the progression (or translation) of the nick site along the polynucleotide strand.
  • nick translation allows for the polynucleotide to be sequenced from the nick site by the methods described herein.
  • a nick site, and therefore a preferred sequencing start position, can be designed through the incorporation of a cleavable site into the polynucleotide, as described herein.
  • Exonucleases are enzymes that catalyse the removal of individual nucleotides from an end of a polynucleotide chain, by cleaving the phosphodiester bond via hydrolysis. Exonucleases are described as having a 5’ to 3’ activity or 3’ to 5’ activity, according to the direction in which they travel along and digest the polynucleotide.
  • the key stages to nick translation include: the introduction of a nick in a polynucleotide, to act as a start point for the introduction of labelled nucleotides; the activity of an exonuclease enzyme or exonuclease domain to remove nucleotides away from the nick site; and a polymerase that incorporates labelled nucleotides at the nick site.
  • the nick site is progressively moved or translated along the polynucleotide.
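The key stages of nick translation listed above (a nick as the start point, exonuclease removal of the nucleotide ahead of the nick, polymerase incorporation of a labelled nucleotide) can be sketched as a toy simulation. The template is written in the order in which it is read by the polymerase; the function name and data representation are hypothetical, chosen only for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def nick_translate(template: str, nick_index: int, cycles: int):
    """Simulate nick translation along a template.

    `template` is the strand being read, listed in the order the polymerase
    traverses it. `nick_index` is the template position opposite the first
    nucleotide downstream of the nick. Returns the labelled bases that were
    incorporated and the final nick position.
    """
    if not 0 <= nick_index <= len(template):
        raise ValueError("nick_index must lie within the template")
    incorporated = []
    nick = nick_index
    for _ in range(cycles):
        if nick >= len(template):
            break  # reached the end of the template
        # Exonuclease step: the downstream nucleotide at `nick` is removed.
        # Polymerase step: a labelled complementary nucleotide replaces it.
        incorporated.append(COMPLEMENT[template[nick]])
        # The nick has now moved (translated) one position along the strand.
        nick += 1
    return "".join(incorporated), nick


read, final_nick = nick_translate("TACGGA", nick_index=2, cycles=3)
print(read, final_nick)
```

Each cycle yields one labelled base and shifts the nick by one position, which is why the sequence can be read starting from the designed nick site.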
  • the first sequencing read may comprise the binding of a first sequencing primer (also known as a read 1 sequencing primer) to the first sequencing primer binding site (e.g. first terminal sequencing primer binding site 303 in templates including a first polynucleotide sequence comprising a first portion and a second polynucleotide sequence comprising a second portion).
  • the second sequencing read may comprise the binding of a second sequencing primer (also known as a read 2 sequencing primer) to the second sequencing primer binding site (e.g. second terminal sequencing primer binding site 304 in templates including a first polynucleotide sequence comprising a first portion and a second polynucleotide sequence comprising a second portion).
  • the method comprises applying a first sequencing primer and a second sequencing primer, where the first sequencing primer comprises a mixed population of blocked and unblocked sequencing primers.
  • the method comprises applying a first sequencing primer and a second sequencing primer, where the second sequencing primer comprises a mixed population of blocked and unblocked sequencing primers.
  • the mixed population of blocked and unblocked first or second sequencing primers can be used to generate a sequence read where the intensity of the first sequencing read is different from the intensity of the second sequencing read, thereby allowing both sequencing reads to be analysed simultaneously. This means, for example, that the forward and reverse strand of a library strand can be sequenced simultaneously, and without the need for paired-end resynthesis.
  • any ratio of blocked:unblocked first or second sequencing primers can be used that generates or leads to a sequencing signal that is of a lower intensity than the second or first sequencing signal respectively.
  • the ratio of blocked:unblocked primers may be: 20:80 to 80:20, or 1:2 to 2:1, or 1:1.
  • a ratio of 50:50 of blocked:unblocked first or second sequencing primers is used, which in turn generates a sequencing signal that is around 50% of the intensity of the second or first sequencing signal respectively.
  • blocking groups include a hairpin loop (e.g. a polynucleotide attached to the 3’-end, comprising in a 5’ to 3’ direction, a cleavable site such as a nucleotide comprising uracil, a loop portion, and a complement portion, wherein the complement portion is substantially complementary to all or a portion of the immobilised primer), a deoxynucleotide, a deoxyribonucleotide, a hydrogen atom instead of a 3’-OH group, a phosphate group, a phosphorothioate group, a propyl spacer (e.g.
  • a modification blocking the 3’-hydroxyl group e.g. hydroxyl protecting groups, such as silyl ether groups (e.g. trimethylsilyl, triethylsilyl, triisopropylsilyl, t-butyl(dimethyl)silyl, t-butyl(diphenyl)silyl), ether groups (e.g. benzyl, allyl, t-butyl, methoxymethyl (MOM), 2-methoxyethoxymethyl (MEM), tetrahydropyranyl), or acyl groups (e.g. acetyl, benzoyl)), or an inverted nucleobase.
  • the blocking group may be any modification that prevents extension (i.e. elongation) of the primer by a polymerase.
  • the sequences of the sequencing primers and the sequencing primer binding sites are not material to the methods of the invention, as long as the sequencing primers are able to bind to the sequencing primer binding site to enable amplification and sequencing of the regions to be identified.
  • the first sequencing primer comprises or consists of a sequence as defined in SEQ ID NO: 7 or a variant thereof.
  • the second sequencing primer comprises or consists of a sequence as defined in SEQ ID NO: 9 or a variant thereof.
  • the method comprises applying a polymerase.
  • the polymerase may be a strand displacement polymerase (e.g. phi29).
  • a polymerase and a 5’ to 3’ exonuclease or a polymerase with 5’ to 3’ exonuclease activity is used.
  • the 5’ to 3’ exonuclease activity is used to essentially clear the path ahead of the growing strand (i.e. remove the downstream hybridised strand)
  • Exonucleases are enzymes that catalyse the removal of individual nucleotides from an end of a polynucleotide chain, by cleaving the phosphodiester bond via hydrolysis. Exonucleases are described as having a 5’ to 3’ activity or 3’ to 5’ activity, according to the direction in which they travel along and digest the polynucleotide.
  • the DNA polymerase has native 5’-3’ exonuclease activity. That is, the DNA polymerase naturally has 5’ to 3’ exonuclease activity.
  • Examples of DNA polymerase with native 5’ to 3’ exonuclease activity include: Taq DNA Polymerase, T7 DNA polymerase and DNA Polymerase I.
  • a DNA polymerase and an exonuclease are separately applied (i.e. flowed across the solid support) either sequentially or concurrently.
  • the exonuclease is applied prior to incorporation of the polymerase.
  • suitable exonucleases include RecJf, Lambda Exonuclease, T7 exonuclease domain, T5 exonuclease, or the DNA polymerase I-like H3TH domain, Exonuclease V (RecBCD) or Exonuclease VII, with ss- and ds-DNA exonuclease activity.
  • the DNA polymerase has been engineered to have 5’ to 3’ exonuclease activity.
  • a DNA polymerase may be fused with a protein possessing 5’ to 3’ exonuclease activity.
  • the resulting fusion protein will contain an exonuclease domain, a DNA-binding domain and a polymerase domain.
  • Suitable DNA polymerases include Pfu DNA polymerase, DNA Pol 5, Therminator™ DNA Polymerase, DNA Polymerase III.
  • Commercially available fusion proteins include the Phusion DNA polymerase.
  • a fusion protein could be engineered using the DNA polymerases and exonucleases discussed above. The skilled person would understand that fusion proteins can be recombinant fusion proteins, created through genetic engineering of a fusion gene. Most commonly, fusion proteins are created by tandem fusion or linker-mediated fusion.
  • sequencing may be carried out using any suitable "sequencing-by-synthesis" technique, wherein nucleotides are added successively in cycles to the free 3' hydroxyl group, resulting in synthesis of a polynucleotide chain in the 5' to 3' direction.
  • the nature of the nucleotide added may be determined after each addition.
  • One particular sequencing method relies on the use of modified nucleotides that can act as reversible chain terminators.
  • reversible chain terminators comprise removable 3' blocking groups.
  • the 3' block may be removed to allow addition of the next successive nucleotide.
  • the modified nucleotides may carry a label to facilitate their detection.
  • a label may be configured to emit a signal, such as an electromagnetic signal, or a (visible) light signal.
  • the label is a fluorescent label (e.g. a dye).
  • the label may be configured to emit an electromagnetic signal, or a (visible) light signal.
  • One method for detecting the fluorescently labelled nucleotides comprises using laser light of a wavelength specific for the labelled nucleotides, or the use of other suitable sources of illumination.
  • the fluorescence from the label on an incorporated nucleotide may be detected by a CCD camera or other suitable detection means. Suitable detection means are described in PCT/US2007/007991, the contents of which are incorporated herein by reference in their entirety.
  • the detectable label need not be a fluorescent label. Any label can be used which allows the detection of the incorporation of the nucleotide into the DNA sequence.
  • Each cycle may involve simultaneous delivery of four different nucleotide types to the array of template molecules.
  • different nucleotide types can be added sequentially and an image of the array of template molecules can be obtained between each addition step.
  • each nucleotide type may have a (spectrally) distinct label.
  • four channels may be used to detect four nucleobases (also known as 4-channel chemistry) (Figure 8 - left).
  • a first nucleotide type e.g. A
  • a second nucleotide type e.g. G
  • a second label e.g. configured to emit a second wavelength, such as blue light
  • a third nucleotide type e.g. T
  • a third label e.g.
  • a fourth nucleotide type may include a fourth label (e.g. configured to emit a fourth wavelength, such as yellow light).
  • Four images can then be obtained, each using a detection channel that is selective for one of the four different labels.
  • the first nucleotide type e.g. A
  • the second nucleotide type e.g. G
  • the second channel e.g. configured to detect the second wavelength, such as blue light
  • the third nucleotide type e.g. T
  • a third channel e.g.
  • the fourth nucleotide type (e.g. C) may be detected in a fourth channel (e.g. configured to detect the fourth wavelength, such as yellow light).
  • detection of each nucleotide type may be conducted using fewer than four different labels.
  • sequencing-by-synthesis may be performed using methods and systems described in US 2013/0079232, which is incorporated herein by reference.
  • two channels may be used to detect four nucleobases (also known as 2-channel chemistry) (Figure 8 - middle).
  • a first nucleotide type e.g. A
  • a second label e.g. configured to emit a second wavelength, such as red light
  • a second nucleotide type e.g. G
  • a third nucleotide type e.g. T
  • the first label e.g.
  • the first nucleotide type (e.g. A) may be detected in both a first channel (e.g. configured to detect the first wavelength, such as red light) and a second channel (e.g. configured to detect the second wavelength, such as green light), the second nucleotide type (e.g.
  • the third nucleotide type (e.g. T) may be detected in the first channel (e.g. configured to detect the first wavelength, such as red light) and may not be detected in the second channel
  • the fourth nucleotide type (e.g. C) may not be detected in the first channel and may be detected in the second channel (e.g. configured to detect the second wavelength, such as green light).
  • one channel may be used to detect four nucleobases (also known as 1-channel chemistry) (Figure 8 - right).
  • a first nucleotide type e.g. A
  • a second nucleotide type e.g. G
  • a third nucleotide type e.g. T
  • a non-cleavable label e.g. configured to emit the wavelength, such as green light
  • a fourth nucleotide type e.g. C
  • a label-accepting site which does not include the label.
  • a first image can then be obtained, and a subsequent treatment carried out to cleave the label attached to the first nucleotide type, and to attach the label to the label-accepting site on the fourth nucleotide type.
  • a second image may then be obtained.
  • the first nucleotide type e.g. A
  • the second nucleotide type e.g. G
  • the third nucleotide type e.g. T
  • the channel e.g.
  • the fourth nucleotide type (e.g. C) may not be detected in the channel in the first image and may be detected in the channel in the second image (e.g. configured to detect the wavelength, such as green light).
  • the above-described example would allow spatially separated clusters to be read in a temporally simultaneous manner through the generation of an optically unresolved signal that can be analytically separated using 16-QAM.
  • Figure 9 is a scatter plot showing an example of sixteen distributions of signals generated by polynucleotide sequences disclosed herein.
  • the scatter plot of Figure 9 shows sixteen distributions (or bins) of intensity values from the combination of a brighter signal (i.e. a first signal as described herein) and a dimmer signal (i.e. a second signal as described herein); the two signals may be co-localized and may not be optically resolved as described above.
  • the intensity values shown in Figure 9 may be up to a scale or normalisation factor; the units of the intensity values may be arbitrary or relative (i.e., representing the ratio of the actual intensity to a reference intensity).
  • the sum of the brighter signal generated by the first portions and the dimmer signal generated by the second portions results in a combined signal.
  • the combined signal may be captured by a first optical channel and a second optical channel.
  • the computer system can map the combined signal generated into one of the sixteen bins, and thus determine the added nucleobase at the first portion and the added nucleobase at the second portion, respectively. For example, when the combined signal is mapped to bin 1612 for a base calling cycle, the computer processor base calls both the added nucleobase at the first portion and the added nucleobase at the second portion as C.
  • the processor base calls the added nucleobase at the first portion as C and the added nucleobase at the second portion as T.
  • the processor base calls the added nucleobase at the first portion as C and the added nucleobase at the second portion as G.
  • the processor base calls the added nucleobase at the first portion as C and the added nucleobase at the second portion as A.
  • the processor base calls the added nucleobase at the first portion as T and the added nucleobase at the second portion as C.
  • the processor base calls both the added nucleobase at the first portion and the added nucleobase at the second portion as T.
  • the processor base calls the added nucleobase at the first portion as T and the added nucleobase at the second portion as G.
  • the processor base calls the added nucleobase at the first portion as T and the added nucleobase at the second portion as A.
  • the processor base calls the added nucleobase at the first portion as G and the added nucleobase at the second portion as C.
  • the processor base calls the added nucleobase at the first portion as G and the added nucleobase at the second portion as T.
  • the processor base calls both the added nucleobase at the first portion and the added nucleobase at the second portion as G.
  • the processor base calls the added nucleobase at the first portion as G and the added nucleobase at the second portion as A.
  • the processor base calls the added nucleobase at the first portion as A and the added nucleobase at the second portion as C.
  • the processor base calls the added nucleobase at the first portion as A and the added nucleobase at the second portion as T.
  • the processor base calls the added nucleobase at the first portion as A and the added nucleobase at the second portion as G.
  • the processor base calls both the added nucleobase at the first portion and the added nucleobase at the second portion as A.
  • T is configured to emit a signal in both the IMAGE 1 channel and the IMAGE 2 channel
  • A is configured to emit a signal in the IMAGE 1 channel only
  • C is configured to emit a signal in the IMAGE 2 channel only
  • G does not emit a signal in either channel.
  • A may be configured to emit a signal in both the IMAGE 1 channel and the IMAGE 2 channel
  • T may be configured to emit a signal in the IMAGE 1 channel only
  • C may be configured to emit a signal in the IMAGE 2 channel only
  • G may be configured to not emit a signal in either channel.
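The first of the two dye arrangements listed above can be written as an on/off lookup table. The following is a minimal sketch only: the names `ENCODING`, `DECODING` and `call_base` are hypothetical, and the thresholding that turns a raw intensity into "on" or "off" is assumed to have happened upstream.

```python
# On/off emission pattern per nucleobase as (IMAGE 1, IMAGE 2), following the
# first arrangement listed above: T in both channels, A in IMAGE 1 only,
# C in IMAGE 2 only, G in neither. All names here are illustrative.
ENCODING = {
    "T": (1, 1),
    "A": (1, 0),
    "C": (0, 1),
    "G": (0, 0),
}

# Invert the table so that an observed pattern maps back to a nucleobase.
DECODING = {pattern: base for base, pattern in ENCODING.items()}


def call_base(image1_on: bool, image2_on: bool) -> str:
    """Return the nucleobase whose on/off pattern matches the two channels."""
    return DECODING[(int(image1_on), int(image2_on))]
```

The alternative arrangement (A in both channels, T in IMAGE 1 only) would only require swapping the `"T"` and `"A"` entries of the table.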
  • Figure 10 is a flow diagram showing a method 1700 of base calling according to the present disclosure.
  • the described method allows for simultaneous sequencing of two (or more) portions (e.g. the first portion and the second portion) in a single sequencing run from a single combined signal obtained from the first portion and the second portion, thus requiring less sequencing reagent consumption and faster generation of data from both the first portion and the second portion.
  • the simplified method may reduce the number of workflow steps while producing the same yield as compared to existing next-generation sequencing methods. Thus, the simplified method may result in reduced sequencing runtime.
  • the disclosed method 1700 may start from block 1701. The method may then move to block 1710.
  • intensity data is obtained.
  • the intensity data includes first intensity data and second intensity data.
  • the first intensity data comprises a combined intensity of a first signal component obtained based upon a respective first nucleobase of the first portion and a second signal component obtained based upon a respective second nucleobase of the second portion.
  • the second intensity data comprises a combined intensity of a third signal component obtained based upon the respective first nucleobase of the first portion and a fourth signal component obtained based upon the respective second nucleobase of the second portion.
  • the first portion is capable of generating a first signal comprising a first signal component and a third signal component.
  • the second portion is capable of generating a second signal comprising a second signal component and a fourth signal component.
  • the first portion and the second portion may be arranged on the solid support such that signals from the first portion and the second portion are detected by a single sensing portion and/or may comprise a single cluster such that first signals and second signals from each of the respective first portions and second portions cannot be spatially resolved.
  • obtaining the intensity data comprises selecting intensity data that corresponds to two (or more) different portions (e.g. the first portion and the second portion).
  • intensity data is selected based upon a chastity score.
  • a chastity score may be calculated as the ratio of the brightest base intensity divided by the sum of the brightest and second brightest base intensities.
  • the desired chastity score may be different depending upon the expected intensity ratio of the light emissions associated with the different portions. As described above, it may be desired to produce clusters comprising the first portion and the second portion, which give rise to signals in a ratio of 2:1.
  • high-quality data corresponding to two portions with an intensity ratio of 2:1 may have a chastity score of around 0.8 to 0.9.
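The chastity calculation described above may be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the function names are hypothetical, each cluster is assumed to be represented as a list of per-base intensities, and the default selection window simply reuses the 0.8 to 0.9 range quoted above for a 2:1 intensity ratio.

```python
def chastity(intensities):
    """Brightest base intensity divided by the sum of the two brightest."""
    if len(intensities) < 2:
        raise ValueError("need at least two base intensities")
    brightest, second = sorted(intensities, reverse=True)[:2]
    total = brightest + second
    if total <= 0:
        raise ValueError("the two brightest intensities must have a positive sum")
    return brightest / total


def select_by_chastity(clusters, low=0.8, high=0.9):
    """Keep clusters whose chastity lies in [low, high].

    The window is an assumption taken from the range quoted in the text;
    a different expected intensity ratio would call for a different window.
    """
    return [c for c in clusters if low <= chastity(c) <= high]
```

A score near 1.0 suggests a signal dominated by a single portion, while a score near 0.5 suggests two portions of similar brightness, so both extremes fall outside the illustrative window.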
  • the method may proceed to block 1720.
  • one of a plurality of classifications is selected based on the intensity data.
  • Each classification represents a possible combination of respective first and second nucleobases.
  • the plurality of classifications comprises sixteen classifications as shown in Figure 9, each representing a unique combination of first and second nucleobases. Where there are two portions, there are sixteen possible combinations of first and second nucleobases.
  • Selecting the classification based on the first and second intensity data comprises selecting the classification based on the combined intensity of the first and second signal components and the combined intensity of the third and fourth signal components.
  • the method may then proceed to block 1730, where the respective first and second nucleobases are base called based on the classification selected in block 1720.
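The selection of a classification and the subsequent base call may be illustrated with a nearest-centroid sketch. Several assumptions are made here and are not taken from the disclosure: intensities are normalised so that the dimmer signal has unit amplitude, the brighter signal is exactly twice as bright (the 2:1 ratio mentioned above), the dye arrangement is T in both channels, A in the first only, C in the second only and G in neither, and Euclidean distance is used to pick the bin. All names are hypothetical.

```python
from itertools import product
from math import dist

# Assumed on/off pattern per nucleobase in (channel 1, channel 2).
ENCODING = {"T": (1, 1), "A": (1, 0), "C": (0, 1), "G": (0, 0)}

# Assumed relative amplitudes of the first (brighter) and second (dimmer) signal.
BRIGHT, DIM = 2.0, 1.0


def build_centroids(bright=BRIGHT, dim=DIM):
    """Expected combined intensity in each channel for all sixteen base pairs."""
    table = {}
    for first, second in product(ENCODING, repeat=2):
        e1, e2 = ENCODING[first], ENCODING[second]
        # Each channel sees the sum of the brighter and dimmer components.
        table[(first, second)] = (
            bright * e1[0] + dim * e2[0],
            bright * e1[1] + dim * e2[1],
        )
    return table


CENTROIDS = build_centroids()


def call_pair(channel1, channel2):
    """Return (first base, second base) for the bin closest to the observation."""
    observed = (channel1, channel2)
    return min(CENTROIDS, key=lambda pair: dist(CENTROIDS[pair], observed))
```

With a 2:1 amplitude ratio each channel takes one of four levels (0, 1, 2 or 3), so the sixteen expected points form a 4 x 4 grid and every combination of first and second nucleobase lands in its own bin. For example, `call_pair(2.1, 0.9)` falls nearest the point (2, 1), which under these assumptions corresponds to A at the first portion and C at the second portion.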
  • the signals generated during a cycle of a sequencing are indicative of the identity of the nucleobase(s) added during sequencing (e.g. using sequencing-by-synthesis). It will be appreciated that there is a direct correspondence between the identity of the nucleobases that are incorporated and the identity of the complementary base at the corresponding position of the template sequence bound to the solid support. Therefore, any references herein to the base calling of respective nucleobases at the two portions encompasses the base calling of nucleobases hybridised to the template sequences and, alternatively or additionally, the identification of the corresponding nucleobases of the template sequences.
  • the method may then end at block 1740.
  • Methods as described herein may be performed by a user physically.
  • a user may themselves conduct the methods of preparing polynucleotide sequences for identification as described herein, and as such the methods as described herein may not need to be computer-implemented.
  • a sequencing kit comprising a plurality of first and second sequencing primers, wherein the first sequencing primers comprise a first cleavage site and wherein the second sequencing primers comprise a second cleavage site, where the first and second cleavage sites are different.
  • a sequencing kit comprising a plurality of first and second sequencing primers, wherein the first sequencing primers comprise a first cleavage site and wherein the second sequencing primers comprise a second cleavage site, where the first and second cleavage sites are the same.
  • the kit comprises a mixture of cleavable and non-cleavable second sequencing primers and cleavable first sequencing primers.
  • the kit comprises a mixture of cleavable and non-cleavable first sequencing primers and cleavable second sequencing primers.
  • the ratio of cleavable to non-cleavable first or second sequencing primers is 1:1.
  • the kit may also further comprise a polymerase and a 5’ to 3’ exonuclease or a polymerase with 5’ to 3’ exonuclease activity.
  • the kit may also comprise cleavage effectors.
  • cleavage effectors is meant effectors that mediate cleavage of the first and/or second sequencing primers.
  • suitable cleavage effectors include AP lyases, glycosylases with or without an AP lyase or endonuclease, and endonucleases, such as endonuclease V and the USER mix (which is a cocktail of uracil glycosylase and endonuclease VIII).
  • the kit may also comprise a kinase, such as T4 kinase.
  • kits comprising a plurality of first 5’ adaptors, a plurality of second 3’ adaptors, as described above.
  • the kits further comprise instructions for use.
  • the kit may further comprise at least one single-stranded endonuclease or restriction endonuclease.
  • the endonuclease is selected from Nt.BspQI, Cas9 D10A and Cas9 H840A.
  • the kit may additionally comprise a uracil glycosylase or USER enzyme mix (which is a cocktail of uracil glycosylase and endonuclease VIII).
  • a solid support comprising a plurality of a third and/or fourth primer immobilised thereon, as described above.
  • kits comprising instructions for preparing polynucleotide sequences for identification according to the methods described herein and/or sequencing polynucleotide sequences according to the methods described herein.
  • methods as described herein may be performed by a computer.
  • a computer may contain instructions to conduct the methods of preparing polynucleotide sequences for identification as described herein, and as such the methods as described herein may be computer-implemented.
  • a data processing device comprising means for carrying out the methods as described herein.
  • the data processing device may be a polynucleotide sequencer.
  • the data processing device may comprise reagents used for methods as described herein.
  • the data processing device may comprise a solid support as described herein, such as a flow cell.
  • a computer program product comprising instructions which, when the program is executed by a processor, cause the processor to carry out the methods as described herein.
  • a computer-readable storage medium comprising instructions which, when executed by a processor, cause the processor to carry out the methods as described herein.
  • a computer-readable data carrier having stored thereon the computer program product as described herein.
  • DSP digital signal processor
  • ASIC application specific integrated circuit
  • FPGA field programmable gate array
  • a processor can be a microprocessor, but in the alternative, the processor can be a controller, microcontroller, or state machine, combinations of the same, or the like.
  • a processor can also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
  • systems described herein may be implemented using a discrete memory chip, a portion of memory in a microprocessor, flash, EPROM, or other types of memory.
  • a software module can reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of computer-readable storage medium known in the art.
  • An exemplary storage medium can be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium can be integral to the processor.
  • the processor and the storage medium can reside in an ASIC.
  • a software module can comprise computer-executable instructions which cause a hardware processor to execute the computer-executable instructions.
  • Computer-executable instructions may be stored in a (transitory or non-transitory) computer readable storage medium (e.g., memory, storage system, etc.) storing code, or computer readable instructions.
  • a method of preparing a first polynucleotide sequence and a second polynucleotide sequence for sequencing comprises a. hybridising the first polynucleotide sequence to first immobilised primers on a solid support and hybridising the second polynucleotide sequence to second immobilised primers on a solid support; b. synthesising a plurality of first and second polynucleotide sequences by conducting an amplification reaction to extend the first and second immobilised primers; c. selectively removing first and second immobilised primers that have not been extended; d.
  • first sequencing primers comprise a first cleavable site and wherein the second sequencing primers comprise a second cleavable site and conducting an amplification reaction to extend the first and second sequencing primers; and e. selectively cleaving the first cleavable site in the first sequencing primer and/or selectively cleaving the second cleavable site in the second sequencing primer.
  • each of the first and second cleavable sites is an unnatural nucleotide, wherein preferably the unnatural nucleotide is selected from an oxoguanine, even more preferably 8-oxoguanine, a hypoxanthine, even more preferably inosine, xanthine, a methylguanine, a methyladenine, a modified cytosine, dihydrouracil, and uracil.
  • the method further comprises a step of linearising the first polynucleotide sequence and the second polynucleotide sequence following step (b).
  • a method of sequencing a first and second polynucleotide sequence comprises: preparing first and second polynucleotide sequences for sequencing using a method according to any one of clauses 1 to 8; and sequencing nucleobases in the first polynucleotide sequence and the second polynucleotide sequence.
  • the method comprises cleaving the first cleavable site in the first sequencing primer and sequencing the nucleobases in the first polynucleotide sequence.
  • a method according to clause 9 or 10 wherein the method further comprises subsequently cleaving the second cleavable site in the second sequencing primer and sequencing the nucleobases in the first polynucleotide sequence.
  • a method according to any of clauses 9 to 11 wherein the method further comprises blocking the 3’ end of the first polynucleotide sequence prior to sequencing the second polynucleotide.
  • the method comprises concurrently cleaving the first cleavable site in the first sequencing primer and cleaving the second cleavable site in the second sequencing primer and sequencing nucleobases in the first polynucleotide sequence and the second polynucleotide sequence, wherein preferably the method comprises applying a plurality of sequencing primers, wherein the first or second sequencing primers comprise a mixture of cleavable sequencing primers and non-cleavable sequencing primers.
  • the step of concurrently sequencing nucleobases comprises performing sequencing-by-synthesis or sequencing-by-ligation.
  • step of concurrently sequencing nucleobases comprises treatment with a polymerase and a 5’-3’ exonuclease or a polymerase with 5’ to 3’ exonuclease activity.
  • step of concurrently sequencing nucleobases comprises treatment with a strand displacement polymerase (e.g. phi29).
  • a sequencing kit comprising a plurality of first and second sequencing primers, wherein the first and second sequencing primers comprise at least one cleavage site.
  • kits further comprises first and second sequencing primers and wherein the first and second sequencing primers comprise a mixture of cleavable sequencing primers and un-cleavable sequencing primers.
  • a data processing device comprising means for carrying out a method according to any one of clauses 1 to 17.
  • a data processing device according to clause 22, wherein the data processing device is a polynucleotide sequencer.
  • a computer program product comprising instructions which, when the program is executed by a processor, cause the processor to carry out a method according to any one of clauses 1 to 17.
  • a computer-readable storage medium comprising instructions which, when executed by a processor, cause the processor to carry out a method according to any one of clauses 1 to 17.
  • a method of preparing a first polynucleotide sequence and a second polynucleotide sequence for simultaneous sequencing comprises a. attaching a first adaptor to a 5’ end of the first polynucleotide sequence and attaching a first adaptor to a 5’ end of the second polynucleotide sequence; b.
  • first adaptor comprises at least one cleavable site and the second adaptor comprises a complement of at least one cleavable site, and wherein the first and second adaptors comprise an immobilised primer-binding sequence and a sequencing primer binding sequence or complements thereof; c. hybridising the first and second polynucleotide sequences to first and second immobilised primers on a solid support, wherein optionally, the first and second immobilised primers comprise at least one cleavable site; d. forming a cluster of amplified first and second polynucleotide sequences; and e. cleaving at least one cleavable site to provide sequencing sites for nick translation.
  • a method according to any one of clauses 1 to 4 further comprising a step of blocking 3’-ends of the first polynucleotide sequences and the second polynucleotide sequences. 6. A method according to any one of clauses 1 to 5, further comprising the step of applying a plurality of first and second sequencing primers, wherein the first or second sequencing primers comprise a mixture of blocked sequencing primers and unblocked sequencing primers.
  • the blocking group is selected from the group consisting of: a hairpin loop, a deoxynucleotide, a deoxyribonucleotide, a hydrogen atom instead of a 3’-OH group, a phosphate group, a phosphorothioate group, a propyl spacer, a modification blocking the 3’-hydroxyl group, or an inverted nucleobase.
  • first or second sequencing primers comprise a mixture of blocked sequencing primers and unblocked sequencing primers, and wherein the ratio of blocked to unblocked sequencing primers is 1:1.
  • a method of simultaneously sequencing a first and second polynucleotide sequence comprising: preparing polynucleotide sequences for simultaneous sequencing using a method according to any one of clauses 1 to 13; and concurrently sequencing nucleobases in the first portion and the second portion.
  • a library preparation kit comprising a plurality of first adaptors and a plurality of second adaptors, wherein the first adaptors comprise at least one cleavable site and the second adaptors comprise a complement of at least one cleavable site, and wherein the first and second adaptors comprise an immobilised primer-binding sequence and a sequencing primer binding sequence or complements thereof.
  • kit further comprises first and second sequencing primers and wherein the first and second sequencing primers comprise a mixture of blocked sequencing primers and unblocked sequencing primers, and wherein the ratio of blocked to unblocked sequencing primers is 1:1.
  • the library preparation kit of clause 17 or 18, wherein the kit further comprises a polymerase and a 5’ to 3’ exonuclease or a polymerase with 5’ to 3’ exonuclease activity.
  • a data processing device comprising means for carrying out a method according to any one of clauses 1 to 16.
  • a data processing device according to clause 20, wherein the data processing device is a polynucleotide sequencer.
  • a computer program product comprising instructions which, when the program is executed by a processor, cause the processor to carry out a method according to any one of clauses 1 to 16.
  • a computer-readable storage medium comprising instructions which, when executed by a processor, cause the processor to carry out a method according to any one of clauses 1 to 16.
  • Disjunctive language such as the phrase “at least one of X, Y or Z,” unless specifically stated otherwise, is otherwise understood with the context as used in general to present that an item, term, etc., may be either X, Y or Z, or any combination thereof (e.g., X, Y and/or Z). Thus, such disjunctive language is not generally intended to, and should not, imply that certain embodiments require at least one of X, at least one of Y or at least one of Z to each be present.
  • the terms “about” or “approximate” and the like are synonymous and are used to indicate that the value modified by the term has an understood range associated with it, where the range can be ±20%, ±15%, ±10%, ±5%, or ±1%.
  • the term “substantially” is used to indicate that a result (e.g., measurement value) is close to a targeted value, where close can mean, for example, the result is within 80% of the value, within 90% of the value, within 95% of the value, or within 99% of the value.
  • the term “partially” is used to indicate that an effect is only in part or to a limited extent.
  • “a device configured to” or “a device to” are intended to include one or more recited devices.
  • Such one or more recited devices can also be collectively configured to carry out the stated recitations.
  • a processor to carry out recitations A, B and C can include a first processor configured to carry out recitation A working in conjunction with a second processor configured to carry out recitations B and C.
  • Example 1 Evaluation of dsSBS to accurately read sequences that include high G content
  • Sequencing performance at a known G-quadruplex region was compared between ssSBS and dsSBS. The assay was performed on an iSeq, with ssSBS performed according to the standard manufacturer protocol and dsDNA SBS performed as shown in Figure 13. SBS (both dsDNA and ssDNA) was performed using methods such as those described in US Patent No. 11,293,061 B2 and US Patent Pub. No. US 2021/0403500A1, with incorporation time extended to 2 minutes.
  • Figure 12 shows that dsSBS using the present invention reduces sequencing errors at G-quad regions.
  • Reads 1-5 represent dsSBS and Read 6 represents standard ssSBS.
  • Read 6 standard SBS
  • dsSBS (Reads 1-5) had a much lower error rate, demonstrating the ability of double-stranded sequencing to remove the effect of G-quadruplexes on sequencing performance.
  • SEQ ID NO: 1 P5 sequence
  • SEQ ID NO: 2 P7 sequence
  • SEQ ID NO: 4 P7’ sequence (complementary to P7) ATCTCGTATGCCGTCTTCTGCTTG
  • SEQ ID NO: 6 Alternative P5’ sequence (complementary to alternative P5 sequence) TCGGTCGCCGTATCATT
  • SEQ ID NO. 7 SBS3
  • SEQ ID NO. 9 SBS12

Abstract

The invention relates to methods and kits for use in nucleic acid sequencing, in particular methods for use in double-stranded sequencing, and in particular for use in double-stranded sequencing-by-synthesis (SBS). In another embodiment, the methods of the invention may be used in simultaneous sequencing.
PCT/EP2024/076525 2023-09-20 2024-09-20 Simultaneous sequencing using single-stranded nick translation Pending WO2025062002A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202363584039P 2023-09-20 2023-09-20
US202363583994P 2023-09-20 2023-09-20
US63/584,039 2023-09-20
US63/583,994 2023-09-20

Publications (1)

Publication Number Publication Date
WO2025062002A1 (fr) 2025-03-27

Family

ID=92894680

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2024/076525 Pending WO2025062002A1 (fr) 2023-09-20 2024-09-20 Simultaneous sequencing using single-stranded nick translation

Country Status (1)

Country Link
WO (1) WO2025062002A1 (fr)

Citations (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998044152A1 (fr) 1997-04-01 1998-10-08 Glaxo Group Limited Method of nucleic acid sequencing
WO1998044151A1 (fr) 1997-04-01 1998-10-08 Glaxo Group Limited Method of nucleic acid amplification
WO2000018957A1 (fr) 1998-09-30 2000-04-06 Applied Research Systems Ars Holding N.V. Methods of nucleic acid amplification and sequencing
WO2001079553A1 (fr) 2000-04-14 2001-10-25 Lynx Therapeutics, Inc. Method and compositions for ordering restriction fragments
WO2002006456A1 (fr) 2000-07-13 2002-01-24 Invitrogen Corporation Methods and compositions for rapid extraction and isolation of proteins and peptides using a lysis matrix
WO2003074734A2 (fr) 2002-03-05 2003-09-12 Solexa Ltd. Methods for detecting genome-wide sequence variations associated with a phenotype
WO2005068656A1 (fr) 2004-01-12 2005-07-28 Solexa Limited Nucleic acid characterisation
US20060024681A1 (en) 2003-10-31 2006-02-02 Agencourt Bioscience Corporation Methods for producing a paired tag from a nucleic acid sequence and methods of use thereof
WO2006110855A2 (fr) 2005-04-12 2006-10-19 454 Life Sciences Corporation Methods for determining sequence variants using amplicon sequencing
WO2006135342A1 (fr) 2005-06-14 2006-12-21 Agency For Science, Technology And Research Method of processing and/or mapping ditag sequences to a genome
US20060292611A1 (en) 2005-06-06 2006-12-28 Jan Berka Paired end sequencing
WO2007010251A2 (fr) * 2005-07-20 2007-01-25 Solexa Limited Preparation of templates for nucleic acid sequencing
WO2007010252A1 (fr) 2005-07-20 2007-01-25 Solexa Limited Method for sequencing a polynucleotide template
WO2007052006A1 (fr) 2005-11-01 2007-05-10 Solexa Limited Method of preparing libraries of template polynucleotides
WO2007091077A1 (fr) 2006-02-08 2007-08-16 Solexa Limited Method for sequencing a polynucleotide template
WO2007107710A1 (fr) 2006-03-17 2007-09-27 Solexa Limited Isothermal methods for creating clonal single molecule arrays
WO2008041002A2 (fr) 2006-10-06 2008-04-10 Illumina Cambridge Limited Method for sequencing a polynucleotide template
WO2008093098A2 (fr) 2007-02-02 2008-08-07 Illumina Cambridge Limited Methods for indexing samples and sequencing multiple nucleotide templates
WO2010048605A1 (fr) 2008-10-24 2010-04-29 Epicentre Technologies Corporation Transposon end compositions and methods for modifying nucleic acids
US20120301925A1 (en) 2011-05-23 2012-11-29 Alexander S Belyaev Methods and compositions for dna fragmentation and tagging by transposases
US20120316086A1 (en) 2011-06-09 2012-12-13 Illumina, Inc. Patterned flow-cells useful for nucleic acid analysis
US20130079232A1 (en) 2011-09-23 2013-03-28 Illumina, Inc. Methods and compositions for nucleic acid sequencing
US20130143774A1 (en) 2011-12-05 2013-06-06 The Regents Of The University Of California Methods and compositions for generating polynucleic acid fragments
WO2013188582A1 (fr) 2012-06-15 2013-12-19 Illumina, Inc. Kinetic exclusion amplification of nucleic acid libraries
WO2016189331A1 (fr) 2015-05-28 2016-12-01 Illumina Cambridge Limited Surface-based tagmentation
WO2017117235A1 (fr) * 2015-12-30 2017-07-06 Omniome, Inc. Methods for sequencing double-stranded nucleic acids
US20190212294A1 (en) 2018-01-08 2019-07-11 Illumina, Inc. High-Throughput Sequencing with Semiconductor-Based Detection
US20210403500A1 (en) 2020-06-22 2021-12-30 Illumina Cambridge Limited Nucleosides and nucleotides with 3' acetal blocking group
US11293061B2 (en) 2018-12-26 2022-04-05 Illumina Cambridge Limited Sequencing methods using nucleotides with 3′ AOM blocking group
WO2022087150A2 (en) 2020-10-21 2022-04-28 Illumina, Inc. Sequencing templates comprising multiple inserts and compositions and methods for improving sequencing throughput
US20230212667A1 (en) 2021-12-29 2023-07-06 Illumina Cambridge Limited Methods of nucleic acid sequencing using surface-bound primers

Patent Citations (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998044152A1 (en) 1997-04-01 1998-10-08 Glaxo Group Limited Method of nucleic acid sequencing
WO1998044151A1 (en) 1997-04-01 1998-10-08 Glaxo Group Limited Method of nucleic acid amplification
WO2000018957A1 (en) 1998-09-30 2000-04-06 Applied Research Systems Ars Holding N.V. Methods of nucleic acid amplification and sequencing
WO2001079553A1 (en) 2000-04-14 2001-10-25 Lynx Therapeutics, Inc. Method and compositions for ordering restriction fragments
WO2002006456A1 (en) 2000-07-13 2002-01-24 Invitrogen Corporation Methods and compositions for rapid extraction and isolation of proteins and peptides using a lysis matrix
WO2003074734A2 (en) 2002-03-05 2003-09-12 Solexa Ltd. Methods for detecting genome-wide sequence variations associated with a phenotype
US20060024681A1 (en) 2003-10-31 2006-02-02 Agencourt Bioscience Corporation Methods for producing a paired tag from a nucleic acid sequence and methods of use thereof
WO2005068656A1 (en) 2004-01-12 2005-07-28 Solexa Limited Nucleic acid characterization
WO2006110855A2 (en) 2005-04-12 2006-10-19 454 Life Sciences Corporation Methods for determining sequence variants using amplicon sequencing
US20060292611A1 (en) 2005-06-06 2006-12-28 Jan Berka Paired end sequencing
WO2006135342A1 (en) 2005-06-14 2006-12-21 Agency For Science, Technology And Research Method of processing and/or mapping ditag sequences to a genome
WO2007010251A2 (en) * 2005-07-20 2007-01-25 Solexa Limited Preparation of templates for nucleic acid sequencing
WO2007010252A1 (en) 2005-07-20 2007-01-25 Solexa Limited Method for sequencing a polynucleotide template
WO2007052006A1 (en) 2005-11-01 2007-05-10 Solexa Limited Method for preparing libraries of template polynucleotides
WO2007091077A1 (en) 2006-02-08 2007-08-16 Solexa Limited Method for sequencing a polynucleotide template
WO2007107710A1 (en) 2006-03-17 2007-09-27 Solexa Limited Isothermal methods for creating clonal single molecule arrays
WO2008041002A2 (en) 2006-10-06 2008-04-10 Illumina Cambridge Limited Method for sequencing a polynucleotide template
WO2008093098A2 (en) 2007-02-02 2008-08-07 Illumina Cambridge Limited Methods for indexing samples and sequencing multiple nucleotide templates
WO2010048605A1 (en) 2008-10-24 2010-04-29 Epicentre Technologies Corporation Transposon end compositions and methods for modifying nucleic acids
US20120301925A1 (en) 2011-05-23 2012-11-29 Alexander S Belyaev Methods and compositions for dna fragmentation and tagging by transposases
US20120316086A1 (en) 2011-06-09 2012-12-13 Illumina, Inc. Patterned flow-cells useful for nucleic acid analysis
US20130079232A1 (en) 2011-09-23 2013-03-28 Illumina, Inc. Methods and compositions for nucleic acid sequencing
US20130143774A1 (en) 2011-12-05 2013-06-06 The Regents Of The University Of California Methods and compositions for generating polynucleic acid fragments
WO2013188582A1 (en) 2012-06-15 2013-12-19 Illumina, Inc. Kinetic exclusion amplification of nucleic acid libraries
WO2016189331A1 (en) 2015-05-28 2016-12-01 Illumina Cambridge Limited Surface-based tagmentation
WO2017117235A1 (en) * 2015-12-30 2017-07-06 Omniome, Inc. Methods for sequencing double-stranded nucleic acids
US20190212294A1 (en) 2018-01-08 2019-07-11 Illumina, Inc. High-Throughput Sequencing with Semiconductor-Based Detection
US11293061B2 (en) 2018-12-26 2022-04-05 Illumina Cambridge Limited Sequencing methods using nucleotides with 3′ AOM blocking group
US20210403500A1 (en) 2020-06-22 2021-12-30 Illumina Cambridge Limited Nucleosides and nucleotides with 3' acetal blocking group
WO2022087150A2 (en) 2020-10-21 2022-04-28 Illumina, Inc. Sequencing templates comprising multiple inserts and compositions and methods for improving sequencing throughput
US20230212667A1 (en) 2021-12-29 2023-07-06 Illumina Cambridge Limited Methods of nucleic acid sequencing using surface-bound primers
WO2023126457A1 (en) * 2021-12-29 2023-07-06 Illumina Cambridge Ltd. Methods of nucleic acid sequencing using surface-bound primers

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SAMBROOK ET AL.: "Molecular Cloning, A Laboratory Manual", 2001, COLD SPRING HARBOR LABORATORY PRESS

Similar Documents

Publication Publication Date Title
US20230098456A1 (en) Methods for sequencing a polynucleotide template
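CN110036117B (en) Method for increasing the throughput of single-molecule sequencing by concatenating short DNA fragments
CA2810931C (en) Direct capture, amplification and sequencing of target DNA using immobilized primers
CN111100911B (en) Method for amplifying a target nucleic acid
CN101484589B (en) High-throughput physical mapping using AFLP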
US20220364169A1 (en) Sequencing method for genomic rearrangement detection
WO2018148289A2 (en) Duplex adapters and duplex sequencing
WO2023175037A2 (en) Simultaneous sequencing of forward and reverse complement strands on separate polynucleotides for methylation detection
WO2010060046A2 (en) Genotyping by dye-probe fluorescence resonance energy transfer
US20250043275A1 (en) Methods of preparing loop fork libraries
DK2456892T3 (en) Method for sequencing a polynucleotide template
JP7788280B2 (en) Flexible and high-throughput sequencing of target gene regions
WO2024256581A1 (en) Identification of modified cytosines
WO2025062002A1 (en) Simultaneous sequencing using single-stranded nick translation
EP4259826A1 (en) Methods for sequencing polynucleotide fragments from both ends
WO2024256580A1 (en) Simultaneous sequencing with spatially separated rings
WO2025062001A1 (en) Optimized sequencing of nucleic acids
WO2026006746A2 (en) Techniques for nucleic acid preparation and analysis
WO2024254003A1 (en) Identification and mapping of methylation sites

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 24776253; Country of ref document: EP; Kind code of ref document: A1