
WO2025240509A1 - Ligation of polynucleotides by ligases and screening methods thereof - Google Patents

Ligation of polynucleotides by ligases and screening methods thereof

Info

Publication number
WO2025240509A1
Authority
WO
WIPO (PCT)
Prior art keywords
polynucleotide
ligase
substrate
double stranded
concentration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2025/029185
Other languages
French (fr)
Other versions
WO2025240509A9 (en
Inventor
Alexander Jacob BIMM
Jonathan Dewayne DORIGATTI
Stephan JENNE
Supriya Vijaykumar KADAM
Mathew G. MILLER
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Codexis Inc
Original Assignee
Codexis Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Codexis Inc
Publication of WO2025240509A1
Publication of WO2025240509A9

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/93 Ligases (6)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/25 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving enzymes not classifiable in groups C12Q1/26 - C12Q1/66

Definitions

  • nucleic acid ligases for ligating modified polynucleotides, such as those with modifications at the 2’-position of the sugar moiety, can be used to synthesize modified polynucleotides, including modified siRNAs from shorter modified polynucleotide fragments (see, e.g., Paul et al., ACS Chem Biol., 2023, 18(10):2183-2187).
  • RNA ligases e.g., T4 RNA ligase 1 and T4 RNA ligase 2
  • T4 RNA ligase 1 and T4 RNA ligase 2 ligation of adaptors to 2’-O-methylated RNA
  • T4 RNA ligase 1 buffer conditions, ligation enhancers, incubation time, and temperature conditions may have significant differential effects on ligation efficiency of 2’-O-methylated RNAs.
  • ligases appear to display differences in sequence bias and tolerance to mismatches in the ligation reaction, indicating that the choice of ligase can be an important factor in the application of the ligase to polynucleotide synthesis.
  • the differential effects of ligation conditions as well as the nature of substrates on ligation efficiency of nucleic acid ligases and the variability in activity of different ligases indicate uncertainty or unpredictability in the activity of a nucleic acid ligase with regard to reaction conditions and substrate.
  • the present disclosure provides a method of predicting a reaction condition activity profile of a polynucleotide ligase for a ligase substrate by using ligase activity data obtained for different reaction conditions and applying Gaussian Process Regression to ligase activity data to generate a predicted reaction condition activity profile.
  • the predicted reaction condition activity profile for a polynucleotide ligase allows selection of reaction conditions best fit for the ligase for the ligase substrate.
  • a method of predicting a reaction condition activity profile of a ligase for a ligase substrate comprises: obtaining activity data of a polynucleotide ligase for a substrate under different reaction conditions; applying Gaussian Process Regression (GPR) on the activity data obtained for the different reaction conditions; and generating an output of the GPR-predicted reaction condition activity profile of the ligase for the ligase substrate.
  • GPR Gaussian Process Regression
  • the ligase is a single stranded DNA ligase, a double stranded DNA ligase, a single stranded RNA ligase, or a double stranded RNA ligase.
  • the different reaction conditions comprise varying two or more, or three or more of: divalent metal concentration, ligase substrate concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration.
  • the different reaction conditions comprise at least variable divalent metal concentration, variable cofactor or co-substrate NTP concentration, and variable double stranded RNA ligase substrate concentration.
  • the output of predicted reaction condition activity profile is a contour plot of the different reaction condition variables. In some embodiments, the output contour plot is a three or four dimensional surface plot of predicted ligase activity for the different reaction condition variables.
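The GPR step described above (activity data in, predicted activity profile out) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the squared-exponential kernel, the fixed length scales and noise level, the `SimpleGPR` class, and the `toy_activity` response surface are all hypothetical and are not taken from the disclosure; real input would be measured ligase activities.

```python
import itertools
import math

def rbf(x, y, length_scales, variance=1.0):
    """Squared-exponential kernel with one length scale per condition variable."""
    d2 = sum(((a - b) / s) ** 2 for a, b, s in zip(x, y, length_scales))
    return variance * math.exp(-0.5 * d2)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def solve_lower(low, b):
    """Solve low @ y = b by forward substitution."""
    y = []
    for i, bi in enumerate(b):
        y.append((bi - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y

def solve_upper(low, y):
    """Solve low.T @ x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

class SimpleGPR:
    """Gaussian process regression with fixed (assumed) hyperparameters."""

    def __init__(self, length_scales, variance=1.0, noise=1e-4):
        self.ls, self.var, self.noise = length_scales, variance, noise

    def fit(self, X, y):
        self.X = [tuple(x) for x in X]
        self.mean = sum(y) / len(y)  # constant prior mean
        K = [[rbf(a, b, self.ls, self.var) + (self.noise if i == j else 0.0)
              for j, b in enumerate(self.X)] for i, a in enumerate(self.X)]
        self.low = cholesky(K)
        resid = [v - self.mean for v in y]
        self.alpha = solve_upper(self.low, solve_lower(self.low, resid))
        return self

    def predict(self, x):
        """Return (posterior mean, posterior variance) at one condition."""
        k = [rbf(x, xi, self.ls, self.var) for xi in self.X]
        mu = self.mean + sum(ki * ai for ki, ai in zip(k, self.alpha))
        v = solve_lower(self.low, k)
        return mu, max(self.var - sum(vi * vi for vi in v), 0.0)

def toy_activity(mg, atp, sub):
    """Made-up response surface standing in for measured ligase activity."""
    return (math.exp(-((mg - 10) / 6) ** 2) * math.exp(-((atp - 1) / 0.8) ** 2)
            * sub / (sub + 20))

# Conditions: (MgCl2 in mM, ATP in mM, substrate in uM); values are illustrative.
X = list(itertools.product([2, 10, 20], [0.2, 1.0, 2.0], [10, 50]))
y = [toy_activity(*x) for x in X]
model = SimpleGPR(length_scales=(5.0, 0.5, 20.0)).fit(X, y)

# Predicted profile over a finer grid; the arg-max is a candidate best-fit condition.
grid = list(itertools.product(range(2, 21, 2), [0.2, 0.6, 1.0, 1.4, 2.0], [10, 30, 50]))
best = max(grid, key=lambda c: model.predict(c)[0])
print("predicted best-fit condition (MgCl2 mM, ATP mM, substrate uM):", best)
```

The predicted means over `grid` are the values one would render as a contour or surface plot. A library implementation such as scikit-learn's `GaussianProcessRegressor` would normally also fit the kernel hyperparameters rather than fixing them by hand as done here.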
  • the ligase substrate comprises a modified nucleoside and/or internucleoside linkage.
  • the polynucleotide acceptor of the ligase substrate comprises a modified nucleoside and/or internucleoside linkage.
  • at least the 3’-terminal nucleoside of the polynucleotide acceptor comprises a modified nucleoside. Docket No. CX10-278WO1 [0014]
  • the polynucleotide donor comprises a modified nucleoside and/or modified internucleoside linkage.
  • the present disclosure further provides a method of screening polynucleotide ligases for activity on a ligase substrate, comprising: (a) contacting a plurality of different polynucleotide ligases with a ligase substrate under a first reaction condition; (b) selecting ligases with activity on the ligase substrate under the first reaction condition; (c) predicting the reaction condition activity profile for each of the selected ligases on the ligase substrate; and (d) retesting activity of the selected ligase for the ligase substrate under best-fit reaction conditions for the ligase as determined from the predicted reaction condition activity profile, to identify the ligases having optimal activity from the screened ligases for the ligase substrate.
  • the ligase is a single stranded DNA ligase, a double stranded DNA ligase, a single stranded RNA ligase, or a double stranded RNA ligase.
  • the ligase is a double stranded RNA ligase and the method of screening double stranded RNA ligases for activity on a double stranded RNA ligase substrate comprises: (a) contacting a plurality of different double stranded RNA ligases with a double stranded RNA ligase substrate under a first reaction condition; (b) selecting ligases with activity on the ligase substrate under the first reaction condition; (c) predicting a reaction condition activity profile for each of the selected ligases on the ligase substrate; and (d) retesting activity of the selected ligase for the ligase substrate under best-fit reaction conditions for the ligase as determined from the predicted reaction condition activity profile, to identify the ligases having optimal activity from the screened ligases for the ligase substrate.
  • the predicted reaction condition activity profile for each ligase is determined by applying Gaussian Process Regression (GPR) on ligase activity data obtained under different reaction conditions.
  • the different reaction conditions comprise varying two or more, or three or more of: divalent metal concentration, ligase substrate concentration, cofactor or co-substrate NTP concentration, buffer concentration, salt concentration, and ligase reaction temperature.
  • the different reaction conditions comprise at least variable divalent metal concentration, variable cofactor or co-substrate NTP concentration, and variable ligase substrate concentration.
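The four screening steps (a) to (d) above can be laid out as a short control-flow sketch. Everything named here is hypothetical: `make_toy_assay` stands in for a wet-lab measurement, the ligase names and numbers are invented, and `grid_best_condition` is a deliberately simple placeholder for step (c), which in the disclosure is a GPR-predicted profile rather than a direct grid evaluation.

```python
import itertools
import math

def make_toy_assay(peak, mg_opt, atp_opt):
    """Hypothetical stand-in for measuring one ligase's activity at a condition."""
    def assay(condition):
        mg, atp, _substrate = condition
        return peak * math.exp(-((mg - mg_opt) / 5) ** 2 - ((atp - atp_opt) / 0.5) ** 2)
    return assay

def grid_best_condition(assay, grid):
    """Placeholder for step (c): evaluate the assay on a coarse grid, take the max."""
    return max(grid, key=assay)

def screen_ligases(assays, first_condition, threshold, grid):
    # (a) contact every ligase with the substrate under the first reaction condition
    first_pass = {name: assay(first_condition) for name, assay in assays.items()}
    # (b) keep only ligases showing activity under that first condition
    selected = [name for name, act in first_pass.items() if act >= threshold]
    results = []
    for name in selected:
        best = grid_best_condition(assays[name], grid)    # (c) best-fit condition
        results.append((name, best, assays[name](best)))  # (d) retest at that condition
    # rank the retested ligases by activity at their own best-fit condition
    return sorted(results, key=lambda r: r[2], reverse=True)

# Conditions are (MgCl2 mM, ATP mM, substrate uM); all values are illustrative.
assays = {
    "ligase_A": make_toy_assay(0.9, mg_opt=10, atp_opt=1.0),
    "ligase_B": make_toy_assay(0.6, mg_opt=2, atp_opt=0.2),
    "ligase_C": make_toy_assay(0.01, mg_opt=10, atp_opt=1.0),
}
grid = list(itertools.product([2, 5, 10, 20], [0.2, 0.5, 1.0, 2.0], [10]))
ranked = screen_ligases(assays, first_condition=(5, 0.5, 10), threshold=0.05, grid=grid)
for name, condition, activity in ranked:
    print(name, condition, round(activity, 3))
```

In this toy data set, ligase_B is more active than ligase_A under the first condition, yet ligase_A ranks first once each ligase is retested at its own best-fit condition, which is the situation the retest in step (d) is meant to catch.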
  • the present disclosure also provides a computer implemented method for predicting a reaction condition activity profile of a ligase for a ligase substrate, comprising inputting or receiving ligase activity data of a ligase for a ligase substrate obtained under different reaction conditions; applying a Gaussian Process Regression to the activity data of the ligase for the ligase substrate under the different reaction conditions; and generating an output of the predicted reaction condition activity profile of the ligase for the ligase substrate.
  • the present disclosure provides a system for predicting a reaction condition activity profile of a ligase for a ligase substrate, comprising one or more processors and a memory storing instructions configured to, when executed by the processor, cause the processor to input or receive ligase activity data of a ligase for a ligase substrate for different reaction conditions; process the reaction condition activity data by applying Gaussian Process Regression; and generate an output of a predicted reaction condition activity profile of the ligase for the ligase substrate for the different reaction conditions.
  • the present disclosure provides a computer readable storage medium for predicting the reaction condition activity profile of a ligase for a ligase substrate, comprising one or more programmed instructions configured to direct one or more processors to: input or receive activity data of a ligase for a ligase substrate obtained under different reaction conditions; apply Gaussian Process Regression (GPR) on the activity data obtained for the different reaction conditions; and generate an output of the GPR-predicted reaction condition activity profile of the ligase for the ligase substrate for the different reaction conditions.
  • the different reaction conditions comprise varying two or more, or three or more of: divalent metal concentration, ligase substrate concentration, cofactor or co-substrate NTP concentration, buffer concentration, salt concentration, and ligase reaction temperature.
  • the different reaction conditions comprise varying three or more of: divalent metal concentration, ligase substrate(s) concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration, as described herein.
  • FIG. 1 shows the flow diagram of the input data of activity for a double stranded RNA ligase, application of Gaussian Process Regression to the ligase activity data, and the output of the predicted reaction condition activity profile.
  • FIG. 2 shows a surface plot of predicted double stranded RNA ligase activity for variations in ATP, MgCl2, and ligase substrate concentrations for two different engineered double stranded RNA ligases, Panel A and Panel B.
  • the present disclosure provides a method of predicting the reaction condition activity profile of a ligase for a ligase substrate, which can be used to define the best-fit reaction conditions for a particular ligase for a defined ligase substrate.
  • Abbreviations and Definitions
  • [0029] In reference to the present disclosure, the technical and scientific terms used in the descriptions herein will have the meanings commonly understood by one of ordinary skill in the art, unless specifically defined otherwise. Accordingly, the following terms are intended to have the following meanings.
  • [0030] As used herein, the singular forms “a”, “an” and “the” include plural referents unless the context clearly indicates otherwise. Thus, for example, reference to “a polypeptide” includes more than one polypeptide.
  • “About” means within 0.05%, 0.5%, 1.0%, or 2.0% of a given value range. In some instances, “about” means within 1, 2, 3, or 4 standard deviations of a given value.
  • “EC” number refers to the Enzyme Nomenclature of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB).
  • The NC-IUBMB biochemical classification is a numerical classification system for enzymes based on the chemical reactions they catalyze.
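Because EC numbers are hierarchical, checking whether an enzyme entry such as EC 6.5.1.3 falls within a broader class such as EC 6.5.1 is a component-wise prefix comparison, not a string comparison. The helper names below are hypothetical and the sketch handles only fully numeric EC numbers.

```python
def parse_ec(ec):
    """Turn 'EC 6.5.1.3' or '6.5.1' into a tuple of integers, e.g. (6, 5, 1, 3)."""
    return tuple(int(part) for part in ec.replace("EC", "").strip().split("."))

def in_ec_class(ec, parent):
    """True if `ec` lies within the (sub)class `parent` of the EC hierarchy."""
    e, p = parse_ec(ec), parse_ec(parent)
    return e[:len(p)] == p

print(in_ec_class("EC 6.5.1.3", "EC 6.5.1"))  # entry within the 6.5.1 sub-subclass
print(in_ec_class("6.5.10.3", "6.5.1"))       # different sub-subclass despite the shared text prefix
```

Comparing integer tuples avoids the pitfall that `"6.5.10.3".startswith("6.5.1")` is true even though 6.5.10 and 6.5.1 would be different sub-subclasses.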
  • “DNA ligase” refers to an enzyme that covalently joins the 5’-phosphoryl termini (“donor”) and 3’-hydroxyl termini (“acceptor”) of DNA to each other.
  • DNA ligases can be grouped into two families based on cofactor/co-substrate requirements: ATP-dependent ligases and NAD+-dependent ligases.
  • DNA ligases of eukaryal and archaeal organisms are generally ATP-dependent.
  • DNA ligases of eubacterial origin are generally NAD+-dependent.
  • DNA ligases include enzymes within the general class of EC 6.5.1.
  • [0037] “RNA ligase” refers to an enzyme that covalently joins the 5’-phosphoryl termini (donor) of RNA or DNA to the 3’-hydroxyl termini (acceptor) of RNA or DNA.
  • Families of known RNA ligases include RNA ligase 1, also referred to as single-stranded RNA ligase or ssRNA ligase, which catalyzes the covalent joining of single-stranded 5’-phosphoryl termini of RNA or DNA to single-stranded 3’-hydroxyl termini of RNA or DNA.
  • RNA ligase 2, also referred to as double stranded RNA ligase or dsRNA ligase, catalyzes the covalent joining of a 3’-hydroxyl terminus of RNA to a 5’-phosphorylated RNA or DNA but shows preference for double stranded substrates.
  • RNA ligases include those enzymes classified in EC 6.5.1.3. It is to be understood that the ligation reaction is not limited to naturally occurring RNA and DNA substrates and includes nucleotide substrates that contain modified nucleotides and/or nucleotide analogs.
  • [0038] “Polynucleotide,” “nucleic acid,” or “oligonucleotide” is used herein to denote a polymer comprising at least two nucleotides where the nucleotides are either deoxyribonucleotides or ribonucleotides or mixtures of deoxyribonucleotides and ribonucleotides.
  • the abbreviations used for genetically encoding nucleosides are conventional and are as follows: adenosine (A); guanosine (G); cytidine (C); thymidine (T); and uridine (U).
  • the abbreviated nucleosides may be either ribonucleosides or 2’-deoxyribonucleosides.
  • the nucleosides may be specified as being either ribonucleosides or 2’-deoxyribonucleosides on an individual basis or on an aggregate basis.
  • DNA refers to deoxyribonucleic acid.
  • RNA refers to ribonucleic acid.
  • the polynucleotide or nucleic acid may be single-stranded or double-stranded, or may include both single-stranded regions and double-stranded regions.
  • “Polynucleotide” encompasses polynucleotide or nucleic acid or oligonucleotide analogs or modified polynucleotide or nucleic acid or oligonucleotide, which include, among others, nucleosides linked together via other than standard phosphodiester linkages, such as non-standard linkages of phosphoramidates, phosphorothioates, amide linkages, etc.; nucleosides with modified and/or synthetic nucleobases, for example inosine, xanthine, hypoxanthine, etc.; nucleosides with modified sugar residues, such as 2’-O-alkyl, 2’-halo, 2,3-dideoxy, 2’-halo-2’-deoxy, β-D-ribo LNA, α-L-ribo-LNA (e.g., locked nucle
  • Nucleobase refers to an unmodified nucleobase or a modified nucleobase.
  • an “unmodified nucleobase” is adenine (A), thymine (T), cytosine (C), uracil (U), or guanine (G).
  • a “modified nucleobase” refers to a group of atoms other than unmodified A, T, C, U, or G capable of pairing with at least one unmodified nucleobase.
  • Nucleoside refers to a compound comprising a nucleobase and a sugar moiety. The nucleobases and sugar moiety are each, independently, unmodified or modified.
  • “Internucleoside linkage” refers to a linkage that covalently couples two nucleosides together.
  • internucleoside linkages covalently couple adjacent nucleosides together, typically forming a bond between the sugar moieties of the adjacent nucleosides.
  • Non-limiting examples of internucleoside linkages include phosphodiester -O-P(O)2-O- linkages and modified internucleoside linkages, such as phosphorothioate -O-P(O,S)-O- and phosphorodithioate -O-P(S)2-O-, as further described herein.
  • “Modified oligonucleotide” or “modified polynucleotide” refers to an oligonucleotide or polynucleotide which contains at least one modified internucleoside linkage and/or a modified nucleoside, or a modified terminal group.
  • “Modified nucleotide” refers to a nucleotide (e.g., NMP, NDP, NTP) in which at least one of the phosphates is a modified phosphate group and/or the nucleoside is a modified nucleoside.
  • “Modified nucleoside” or “nucleoside modification” refers to a nucleoside modified as compared to the equivalent DNA or RNA nucleoside by the introduction of one or more modifications of the sugar moiety or the nucleobase.
  • the modified nucleoside comprises a modified nucleobase and/or a modified sugar residue.
  • the term “modified nucleoside” may also be used herein interchangeably with the term “nucleoside analogue.” Nucleosides with an unmodified DNA or RNA sugar moiety are termed DNA or RNA nucleosides herein.
  • Nucleosides with modifications in the nucleobase of the DNA or RNA nucleoside are still generally termed DNA or RNA if they allow Watson-Crick base pairing.
  • “Modified internucleoside linkage” refers to a linkage other than a phosphodiester (PO) linkage that covalently connects two nucleosides together.
  • an exemplary modified internucleoside linkage is a phosphorothioate or phosphorodithioate internucleoside linkage.
  • Other modified phosphorus-containing internucleoside linkages include phosphotriesters, methylphosphonates, and phosphoramidates (P-NH2).
  • internucleoside linkages having a chiral atom can be prepared as a mixture of the stereoisomers, or as separate stereoisomers.
  • “Phosphorothioate internucleoside linkage” refers to an internucleoside linkage in which one of the oxygen atoms in a phosphodiester linkage is replaced with a sulfur atom.
  • a phosphorothioate linkage may be represented as -O-P(O,S)-O-, wherein one of the non-bridging oxygen atoms is replaced with a sulfur atom.
  • Phosphorothioate internucleoside linkages are chiral (see, for example, Jahns et al. 2022, Nucleic Acids Research Vol. 50, No. 3, 1221-1240), with right-handed (Rp) and left-handed (Sp) isomers.
  • the Rp diastereomer may be referred to as an R-PS internucleoside linkage or an srP internucleoside linkage.
  • the Sp diastereomer may be referred to as an S-PS internucleoside linkage or ssP internucleoside linkage.
  • the oligonucleotide comprises one or more srP internucleoside linkages.
  • the oligonucleotide comprises one or more ssP internucleoside linkages.
  • that phosphorothioate internucleoside linkage may be either an srP linkage or an ssP linkage.
  • “Non-bridging phosphorothioate internucleoside linkage” refers to a phosphorothioate internucleoside linkage in which the sulfur atom attached to the phosphorus atom is in place of a non-bridging oxygen atom.
  • Non-bridging phosphorodithioate internucleoside linkage refers to a modified internucleoside linkage which is a non-bridging phosphorodithioate internucleoside linkage.
  • a non-bridging phosphorodithioate internucleoside linkage has two identical sulfur atoms attached to the phosphorus atom, achieved by replacing the non-bridging oxygen atom in the phosphorothioate linkage with a sulfur atom.
  • “Abasic sugar moiety” refers to a sugar moiety of a nucleoside that is not attached to a nucleobase.
  • abasic sugar moieties are referred to as “abasic nucleoside.”
  • “Inverted nucleoside” refers to a nucleotide having a 3’ to 3’ and/or 5’ to 5’ internucleoside linkage.
  • “inverted sugar moiety” refers to the sugar moiety of an inverted nucleoside or an abasic sugar moiety having a 3’ to 3’ and/or 5’ to 5’ internucleoside linkage.
  • LNA nucleoside or “locked nucleoside” refers to 2'-modified nucleoside which comprises a biradical linking the C2' and C4' of the ribose sugar ring of said nucleoside (also referred to as a "2'- 4' bridge"), which restricts or locks the conformation of the ribose ring.
  • These nucleosides are also termed bridged nucleic acid or bicyclic nucleic acid (BNA) in the literature.
  • Non-limiting, exemplary LNA nucleosides are disclosed in WO 99/014226, WO 00/66604, WO 98/039352, WO 2004/046160, WO 00/047599, WO 2007/134181, WO 2010/077578, WO 2010/036698, WO 2007/090071, WO 2009/006478, WO 2011/156202, WO 2008/154401, WO 2009/067647, WO 2008/150729, Morita et al., Bioorganic & Med. Chem. Lett. 2002, 12, 73-76, Seth et al.
  • Terminal group refers to a group located at the first or last nucleoside in a polynucleotide or oligonucleotide.
  • a 5’-terminal group refers to the terminal group bonded to the 5’- or 4’-carbon atom of the first nucleoside within a polynucleotide.
  • a 3’-terminal group is a terminal group bonded to the 3’-carbon atom of the last nucleoside within a polynucleotide or oligonucleotide.
  • “5’-blocking group” as used herein refers to a moiety or chemical group that prevents or inhibits attachment of another nucleoside, nucleotide or oligonucleotide to the 5’-terminal nucleoside.
  • a 5’-blocking group prevents or inhibits the enzyme(s) from attachment of another nucleoside, nucleotide or oligonucleotide to the 5’-terminal nucleoside, particularly the 5’-OH of the 5’-terminal nucleoside.
  • “3’-blocking group” refers to a moiety or chemical group that prevents or inhibits attachment of another nucleoside, nucleotide, or oligonucleotide to the 3’-terminal nucleoside.
  • a 3’-blocking group prevents or inhibits the enzyme(s) from attachment of another nucleoside, nucleotide, or oligonucleotide to the 3’-terminal nucleoside, particularly the 3’-OH of the 3’-terminal nucleoside.
  • “Reversible blocking group” refers to a blocking group that can be removed or cleaved off to provide a free 3’-OH.
  • the blocking group is removable with a deblocking agent, which can be a chemical or enzymatic deblocking agent.
  • Enzymatically reversible blocking group refers to a blocking group that is susceptible to removal or cleaving by an enzyme.
  • Duplex and “ds” refer to a double-stranded nucleic acid (e.g., DNA or RNA) molecule comprised of two single-stranded polynucleotides that are complementary in their sequence (e.g., A pairs to T or U, C pairs to G), arranged in an antiparallel 5’ to 3’ orientation, and held together by hydrogen bonds between the nucleobases (e.g., adenine [A], guanine [G], cytosine [C], thymine [T], uridine [U]).
  • “Complementary” is used herein to describe the structural relationship between nucleotide bases that are capable of forming base pairs with one another.
  • a purine nucleotide base present on a polynucleotide that is complementary to a pyrimidine nucleotide base on a polynucleotide may base pair by forming hydrogen bonds with one another.
  • Complementary nucleotide bases can base pair via Watson/Crick base pairing or in any other manner that forms stable duplexes or other nucleic acid structures.
  • “Watson/Crick Base-Pairing” refers to a pattern of specific pairs of nucleobases and analogs that bind together through sequence-specific hydrogen-bonds, e.g., A pairs with T or U, and G pairs with C.
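The pairing rules just stated (A with T or U, G with C, strands antiparallel) can be expressed as a small check. This is a simplified sketch with hypothetical function names: it considers only unmodified bases, equal-length strands, and fully paired blunt-ended duplexes, and ignores overhangs, mismatches, and non-Watson/Crick pairing.

```python
# Allowed Watson/Crick pairs, written as (base on strand A, base on strand B).
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
_COMPLEMENT_RNA = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement_rna(seq):
    """Return the RNA strand (written 5'->3') that pairs with an RNA `seq` (5'->3')."""
    return "".join(_COMPLEMENT_RNA[base] for base in reversed(seq))

def is_antiparallel_duplex(strand_a, strand_b):
    """True if two equal-length 5'->3' strands pair base-for-base when antiparallel."""
    if len(strand_a) != len(strand_b):
        return False
    # Reversing strand_b aligns its 3' end with the 5' end of strand_a.
    return all((a, b) in WATSON_CRICK for a, b in zip(strand_a, reversed(strand_b)))

print(reverse_complement_rna("AUGC"))
print(is_antiparallel_duplex("AUGC", "GCAU"))
print(is_antiparallel_duplex("ATGC", "GCAU"))  # DNA strand paired with an RNA strand
```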
  • “Annealing” or “Hybridization” refers to the base-pairing interactions of one nucleobase polymer (e.g., poly- and oligonucleotides) with another that results in the formation of a double-stranded structure, a triplex structure or a quaternary structure.
  • Annealing or hybridization can occur via Watson-Crick base-pairing interactions, but may be mediated by other hydrogen-bonding interactions, such as Hoogsteen base pairing.
  • the nucleobase polymer that anneals or hybridizes to another is a single nucleobase polymer while in other embodiments, the nucleobase polymers are separate nucleobase polymers.
  • a polynucleotide or a polypeptide refer to a material or a material corresponding to the natural or native form of the material that has been modified in a manner that would not otherwise exist in nature or is identical thereto but produced or derived from synthetic materials and/or by manipulation using recombinant techniques.
  • [0064] “Wild-type” and “naturally-occurring” refer to the form found in nature.
  • a wild-type polypeptide or polynucleotide sequence is a sequence present in an organism that can be isolated from a source in nature and which has not been intentionally modified by human manipulation.
  • “Alkyl” refers to straight or branched chain hydrocarbon groups having the number of carbon atoms designated, for example 1 to 20 carbon atoms (C1-C20), particularly 1 to 12 carbon atoms (C1-C12 or C1-12), and more particularly 1 to 8 carbon atoms (C1-C8 or C1-8).
  • “alkyl” includes, but is not limited to, methyl, ethyl, n-propyl, i-propyl, n-butyl, s-butyl, t-butyl, n-pentyl, and s-pentyl.
  • “Alkenyl” refers to straight or branched chain hydrocarbon having the number of carbon atoms designated, for example 2 to 20 carbon atoms (C2-C20), particularly 2 to 12 carbon atoms (C2-C12 or C2-12), and most particularly 2 to 8 carbon atoms (C2-C8 or C2-8), having at least one double bond.
  • “alkenyl” includes, but is not limited to, vinyl (ethenyl), allyl, isopropenyl, 1-propenyl, 2-methyl-1-propenyl, 1-butenyl, 2-butenyl, 3-butenyl, 2-ethyl-1-butenyl, 3-methyl-2-butenyl, 1-pentenyl, 2-pentenyl, 3-pentenyl, 4-pentenyl, 4-methyl-3-pentenyl, 1-hexenyl, 2-hexenyl, 3-hexenyl, 4-hexenyl and 5-hexenyl.
  • “Alkynyl” refers to a straight or branched chain hydrocarbon having the number of carbon atoms designated, for example 2 to 12 carbon atoms (C2-C12 or C2-12), particularly 2 to 8 carbon atoms (C2-C8 or C2-8), containing at least one triple bond.
  • “alkynyl” includes ethynyl, 1-propynyl, 2-propynyl, 1-butynyl, 2-butynyl, 3-butynyl, 1-pentynyl, 2-pentynyl, 3-pentynyl, 4-pentynyl, 1-hexynyl, 2-hexynyl, 3-hexynyl, 4-hexynyl and 5-hexynyl.
  • “Alkylene,” “alkenylene,” and “alkynylene” refer to a straight or branched chain divalent hydrocarbon radical of the corresponding alkyl, alkenyl, and alkynyl, respectively.
  • “alkylene,” “alkenylene,” and “alkynylene” may be optionally substituted, for example with alkyl, alkyloxy, hydroxyl, carbonyl, carboxyl, halo, nitro, and the like.
  • “Lower” in reference to substituents refers to a group having between one and six carbon atoms.
  • “Heteroalkyl,” “heteroalkenyl,” and “heteroalkynyl” refer to the corresponding alkyl, alkenyl, and alkynyl in which one or more of the carbon atoms is replaced with a heteroatom, such as O, S and N.
  • Cycloalkyl refers to any stable monocyclic or polycyclic system which consists of carbon atoms, any ring of which being saturated.
  • Cycloalkenyl refers to any stable monocyclic or polycyclic system which consists of carbon atoms, with at least one ring thereof being partially unsaturated. Examples of cycloalkyls include, but are not limited to, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, cycloheptyl, cyclooctyl, bicycloalkyls and tricycloalkyls (e.g., adamantyl).
  • Heterocycloalkyl or “heterocyclyl” refers to a substituted or unsubstituted 3 to 14 membered, mono- or bicyclic, non-aromatic hydrocarbon, wherein 1 to 3 carbon atoms are replaced by a heteroatom.
  • Heteroatoms and/or heteroatomic groups which can replace the carbon atoms include, but are not limited to, -O-, -S-, -S-O-, -NR’-, -PH-, -S(O)-, -S(O)2-, -S(O)NR’-, -S(O)2NR’-, and the like, including combinations thereof, where each R’ is independently hydrogen or lower alkyl.
  • Examples include oxiranyl, oxetanyl, azetidinyl, oxazolyl, thiazolidinyl, thiazolyl, morpholinyl, pyrrolidinonyl, pyrrolidinyl, piperidinyl, piperazinyl, 2,3-dihydrofuranyl, dihydropyranyl, tetrahydrofuranyl, tetrahydropyranyl, dihydropyridinyl, tetrahydropyridinyl, tetrahydropyrimidinyl, tetrahydrothiophenyl, tetrahydrothiopyranyl, azepanyl, and the like.
  • Aryl refers to a six- to fourteen-membered, mono- or bi-carbocyclic ring, wherein the monocyclic ring is aromatic and at least one of the rings in the bicyclic ring is aromatic. Unless stated otherwise, the valency of the group may be located on any atom of any ring within the radical, valency rules permitting. Examples of “aryl” groups include phenyl, naphthyl, indenyl, biphenyl, phenanthrenyl, naphthacenyl, and the like.
  • Heteroaryl refers to an aromatic heterocyclic ring, including both monocyclic and bicyclic ring systems, where at least one carbon atom of one or both of the rings is replaced with a heteroatom independently selected from nitrogen, oxygen, and sulfur, or at least two carbon atoms of one or both of the rings are replaced with a heteroatom independently selected from nitrogen, oxygen, and sulfur.
  • the heteroaryl can be a 5 to 6 membered monocyclic, or 7 to 11 membered bicyclic ring systems.
  • heteroaryl groups include pyrrolyl, pyrazolyl, imidazolyl, pyrazinyl, oxazolyl, isoxazolyl, thiazolyl, furyl, thienyl, pyridyl, pyrimidyl, benzoxazolyl, benzisoxazolyl, benzothiazolyl, purinyl, benzimidazolyl, indolyl, isoquinolyl, quinoxalinyl, quinolyl, and the like.
  • Bridged bicyclic refers to any bicyclic ring system, i.e., carbocyclic or heterocyclic, saturated or partially unsaturated, having at least one bridge.
  • a “bridge” is an unbranched chain of atoms or an atom or a valence bond connecting two bridgeheads, where a “bridgehead” is any skeletal atom of the ring system which is bonded to three or more skeletal atoms (excluding hydrogen).
  • a bridged bicyclic group has 5 to 12 ring members and 0-4 heteroatoms independently selected from nitrogen, oxygen, and sulfur.
  • bridged bicyclic groups include those groups set forth below where each group is attached to the rest of the molecule at any substitutable carbon or nitrogen atom. Unless otherwise specified, a bridged bicyclic group is optionally substituted with one or more substituents as set forth for aliphatic groups. Additionally or alternatively, any substitutable nitrogen of a bridged bicyclic group is optionally substituted. Exemplary bridged bicyclics include: [chemical structure drawings not reproduced in this text]
  • [0077] “Fused ring” refers to a ring system with two or more rings having at least one bond and two atoms in common.
  • a “fused aryl” and a “fused heteroaryl” refer to ring systems having at least one aryl and heteroaryl, respectively, that share at least one bond and two atoms in common with another ring.
  • “Carbonyl” refers to -C(O)-.
  • the carbonyl group may be further substituted with a variety of substituents to form different carbonyl groups including acids, acid halides, aldehydes, amides, esters, and ketones.
  • a -C(O)R’ group, wherein R’ is an alkyl, is referred to as an alkylcarbonyl.
  • R’ is selected from an optionally substituted: alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl.
  • Halogen or “halo” refers to fluorine, chlorine, bromine and iodine.
  • Haloalkyl refers to an alkyl substituted with 1 or more halogen atoms. Preferably, the alkyl is substituted with 1 to 3 halogen atoms.
  • “Hydroxy” refers to –OH.
  • Oxy refers to group -O-, which may have various substituents to form different oxy groups, including ethers and esters.
  • the oxy group is an –OR’, wherein R’ is selected from an optionally substituted: alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl.
  • acyl refers to -C(O)R’, where R’ is hydrogen, or an optionally substituted alkyl, heteroalkyl, cycloalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, or heteroarylalkyl as defined herein.
  • exemplary acyl groups include, but are not limited to, formyl, acetyl, cyclohexylcarbonyl, cyclohexylmethylcarbonyl, benzoyl, benzylcarbonyl, and the like.
  • Alkyloxy or “alkoxy” refers to —OR’, wherein R’ is an optionally substituted alkyl.
  • Aryloxy refers to –OR’, wherein R’ is an optionally substituted aryl.
  • Carboxy refers to –COO- or –COOM, wherein M is H or a counterion M+.
  • Carbamoyl refers to -C(O)NR’R’, wherein each R’ is independently selected from H or an optionally substituted alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, or heteroarylalkyl.
  • Cyano refers to –CN.
  • R is selected from an optionally substituted: alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl.
  • Silyl refers to –SiR’R’R’, where R’ is as defined in the specification.
  • each R’ is independently selected from alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl.
  • any heterocycloalkyl or heteroaryl group present in a silyl group has from 1 to 3 heteroatoms selected independently from O, N, and S.
  • “Thiol” or “sulfhydryl” refers to –SH.
  • “Disulfide” refers to -S-S- groups.
  • “Sulfanyl” refers to –SR’, wherein R’ is selected from an optionally substituted: alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl.
  • R is selected from an optionally substituted: alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl.
  • -SR wherein R is an alkyl is an alkylsulfanyl.
  • “Sulfonyl” refers to -S(O)2-, which may have various substituents to form different sulfonyl groups including sulfonic acids, sulfonamides, sulfonate esters, and sulfones.
  • -S(O)2R’, wherein R’ is an alkyl refers to an alkylsulfonyl.
  • R’ is selected from an optionally substituted: alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl.
  • Amino refers to the group –NR’R’ or –NR’R’R’, wherein each R’ is independently selected from H and an optionally substituted: alkyl, cycloalkyl, heterocycloalkyl, alkyloxy, aryl, heteroaryl, heteroarylalkyl, acyl, alkyloxycarbonyl, sulfanyl, sulfinyl, sulfonyl, and the like.
  • “Optional” or “optionally” means that a described event or circumstance may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.
  • “optionally substituted alkyl” refers to an alkyl group that may or may not be substituted, and the description encompasses both substituted alkyl groups and unsubstituted alkyl groups.
  • “Substituted” as used herein means one or more hydrogen atoms of the group is replaced with a substituent atom or group commonly used in pharmaceutical chemistry. Each substituent can be the same or different.
  • substituents include, but are not limited to, alkyl, alkenyl, alkynyl, cycloalkyl, aryl, arylalkyl, heterocycloalkyl, heteroaryl, OR’ (e.g., hydroxyl, alkyloxy (e.g., methoxy, ethoxy, and propoxy), aryloxy, heteroaryloxy, arylalkyloxy, ether, ester, carbamate, etc.), hydroxyalkyl, alkyloxycarbonyl, alkyloxyalkyloxy, perhaloalkyl, alkyloxyalkyl, SR’ (e.g., thiol, alkylthio, arylthio, heteroarylthio, arylalkylthio, etc.), S+R’2, S(O)R’, SO2R’, NR’R” (e.g., primary amine (i.e., i.
  • substitutions will typically number less than about 10 substitutions, more preferably about 1 to 5, with about 1 or 2 substitutions being preferred.
  • “Stereoisomer” refers to a compound made up of the same atoms bonded by the same bonds but having different three-dimensional structures, which are not interchangeable.
  • “stereoisomer thereof” with respect to a compound includes any stereoisomer of the compound and mixtures of stereoisomers, and includes “enantiomers,” which refers to two stereoisomers whose molecules are nonsuperimposable mirror images of one another.
  • a compound may have more than one chiral center such that the compound may exist as either an individual diastereomer or as a mixture of diastereomers.
Method of generating a predicted reaction condition activity profile of a polynucleotide ligase
  • [0100] In one aspect, the present disclosure provides a method of predicting the reaction condition activity profile of a polynucleotide ligase for one or more ligase substrates.
  • reaction condition activity profiles (e.g., ligase activity at different Mg+2, cofactor or co-substrate ATP, and double stranded RNA ligase substrate concentrations) of each dsRNA ligase can differ for the same double stranded RNA ligase substrate.
  • a double stranded RNA ligase can exhibit different reaction condition activity profiles for different double stranded RNA ligase substrates.
  • a predicted reaction condition activity profile can be generated for identifying the reaction conditions favorable for the double stranded RNA ligase on the specified double stranded RNA ligase substrate.
  • the present invention is applicable to different types of ligases, including, among others, single stranded DNA ligase, double stranded DNA ligase, single stranded RNA ligase, and double stranded RNA ligase.
  • the present disclosure provides a method of predicting or modeling reaction condition activity profile of a polynucleotide ligase for a polynucleotide ligase substrate, comprising: obtaining activity data of a polynucleotide ligase for a ligase substrate under different or variable reaction conditions; applying Gaussian Process Regression (GPR) on the activity data obtained for the different or variable reaction conditions; and generating an output of the GPR-predicted or modeled reaction condition activity profile of the ligase for the ligase substrate.
  • the steps in Gaussian Process Regression include: (1) data collection, involving gathering the input-output data pairs for the regression problem; (2) choosing a kernel function, involving selecting a covariance function (kernel) that is appropriate to the problem, where the choice of kernel influences the shape of the functions that GPR can model; (3) parameter optimization, involving estimating the parameters of the kernel function by maximizing the likelihood of the data; and (4) prediction, in which, given a new input, the trained GPR model is used to make predictions.
  • GPR provides both the predicted mean and the associated uncertainty (variance).
  • Software for Gaussian Process Regression is available, for example commercially from JMP® Statistical Discovery, and in scikit-learn (available at https://scikit-learn.org).
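The four GPR steps listed above can be sketched with scikit-learn, one of the packages the text names. This is a minimal illustration under stated assumptions, not the disclosed workflow: the activity values are synthetic (generated by the hypothetical `toy_activity` function), the three inputs are assumed to be Mg+2, ATP, and substrate concentrations in mM, and the anisotropic RBF (squared-exponential) kernel is one reasonable choice rather than a kernel specified in the source.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

# (1) Data collection: synthetic stand-in for measured ligase activity.
# Columns are assumed to be Mg+2, ATP, and substrate concentration (mM).
rng = np.random.default_rng(0)
X = rng.uniform([0.5, 0.5, 0.1], [20.0, 10.0, 2.0], size=(40, 3))

def toy_activity(x):
    """Hypothetical response surface, used only to generate example data."""
    mg, atp, sub = x[:, 0], x[:, 1], x[:, 2]
    return (np.exp(-((mg - 10.0) / 6.0) ** 2)
            * np.exp(-((atp - 4.0) / 3.0) ** 2)
            * sub / (sub + 0.5))

y = toy_activity(X) + rng.normal(0.0, 0.02, size=X.shape[0])

# (2) Kernel choice: squared-exponential with one length scale per input,
# plus a white-noise term for measurement error (an assumption).
kernel = (ConstantKernel(1.0) * RBF(length_scale=[5.0, 3.0, 1.0])
          + WhiteKernel(noise_level=1e-3))

# (3) Parameter optimization: fit() maximizes the log marginal likelihood.
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                               n_restarts_optimizer=2, random_state=0)
gpr.fit(X, y)

# (4) Prediction: predicted mean and standard deviation at a new condition.
X_new = np.array([[10.0, 4.0, 1.0]])
mean, std = gpr.predict(X_new, return_std=True)
print(f"predicted activity {mean[0]:.3f} +/- {std[0]:.3f}")
```

The standard deviation returned alongside the mean is the uncertainty estimate mentioned in the text; conditions with a large value are candidates for additional measurements.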
  • model parameters are optimized (parameter optimization) to identify optimal conditions for each ligase and, in some embodiments, across all ligases, by setting as the desirability parameter an equally weighted average of the measured responses of the ligases within an examined set.
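As a small worked example of the equally weighted desirability described above, the sketch below averages the responses of several ligases at each candidate condition and picks the condition with the highest average. The conditions, response values, and the helper name `equal_weight_desirability` are all hypothetical and chosen only for illustration.

```python
def equal_weight_desirability(responses):
    """Equally weighted average of the per-ligase responses at one condition."""
    return sum(responses) / len(responses)

# Hypothetical responses of three ligases at three (Mg mM, ATP mM) conditions.
responses_by_condition = {
    (5.0, 1.0): [0.40, 0.70, 0.55],
    (10.0, 2.0): [0.80, 0.60, 0.70],
    (20.0, 5.0): [0.90, 0.20, 0.40],
}

# The condition that maximizes the shared desirability across all ligases.
best_condition = max(
    responses_by_condition,
    key=lambda cond: equal_weight_desirability(responses_by_condition[cond]),
)
print(best_condition)
```

Note that the condition giving the single highest response for one ligase (0.90 at 20 mM, 5 mM) is not selected, because the other two ligases perform poorly there.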
  • the Gaussian Process platform implements two possible correlation structures, the Gaussian and the Cubic. See JMP® Statistical Discovery.
  • the Gaussian correlation structure uses the product exponential correlation function with a power of 2 as the estimated model. This model assumes that Y is normally distributed with mean μ and covariance matrix σ²R.
  • the Cubic correlation structure also assumes that Y is normally distributed with mean μ and covariance matrix σ²R.
  • the R matrix consists of the following elements: See Santner et al., In The Design and Analysis of Computer Experiments. New York: Springer-Verlag (2003).
  • the theta parameter used in the Cubic correlation structure is the reciprocal of the parameter often used in the literature.
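The element-wise expressions for the R matrix are not present in this text. For orientation, the forms usually given for these two correlation structures are shown below; they are an assumption about what the missing expressions were, based on the standard definitions of the product exponential (power 2) and cubic correlation functions, and are not recovered from this document. Here x_ik is the value of the k-th of K factors for observation i, and θ_k is the correlation parameter for that factor.

```latex
% Gaussian correlation structure (product exponential, power 2)
\[
r_{ij} = \exp\!\left(-\sum_{k=1}^{K} \theta_k \,(x_{ik} - x_{jk})^{2}\right)
\]

% Cubic correlation structure
\[
r_{ij} = \prod_{k=1}^{K} \rho(d_k;\theta_k), \qquad d_k = x_{ik} - x_{jk},
\]
\[
\rho(d;\theta) =
\begin{cases}
1 - 6(d\theta)^{2} + 6(|d|\theta)^{3}, & |d| \le \tfrac{1}{2\theta},\\[2pt]
2\,(1 - |d|\theta)^{3}, & \tfrac{1}{2\theta} < |d| \le \tfrac{1}{\theta},\\[2pt]
0, & |d| > \tfrac{1}{\theta}.
\end{cases}
\]
```

In this parameterization the cubic correlation falls to zero once two observations differ by more than 1/θ in a factor, which is consistent with the remark that θ is the reciprocal of the parameter often used in the literature.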
  • the Gaussian Process Regression can use categorical predictors. If the Gaussian Process model includes categorical predictors, the Gaussian correlation structure is used for the correlation structure. See, e.g., JMP® Statistical Discovery.
  • the covariance element r_ij depends on the combination of levels of the categorical predictors obtained from the i-th and j-th observations. See, e.g., JMP® Statistical Discovery, referencing Qian et al., “Gaussian process models for computer experiments with qualitative and quantitative factors,” Technometrics, 2012, 50:383–396.
  • the different or variable reaction conditions comprise varying two or more of: divalent metal concentration, double stranded ligase substrate(s) concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration.
  • the different or variable reaction conditions comprise varying three or more of: divalent metal concentration, ligase substrate(s) concentration, cofactor or co-substrate NTP (or NAD+) concentration, buffer concentration, and salt concentration.
  • the different or variable reaction conditions comprise at least variable divalent metal concentration, variable cofactor or co-substrate NTP concentration, and variable ligase substrate concentration.
  • the divalent metal is any divalent metal with which a ligase is active in the ligation reaction.
  • the divalent metal is Mn+2 or Mg+2.
  • the divalent metal is Mg+2.
  • the activity profile is obtained for divalent metal concentrations under which the ligase shows activity for the ligase substrate. In some embodiments, the activity profile is obtained for divalent metal concentrations varied from 0.1 mM to 100 mM, from 0.5 to 50 mM, 0.1 to 40 mM, 0.5 to 40 mM, 0.1 to 20 mM, 0.5 to 20 mM, 0.1 to 10 mM, or 0.5 to 10 mM.
  • Changes in divalent metal concentrations can be done in increments, e.g., of 1 mM, 2 mM, 5 mM, 10 mM, etc., as needed to obtain the activity profile for the desired divalent metal concentrations.
  • the ligase activity data is obtained for different or variable cofactor or co-substrate NTP concentrations.
  • the cofactor or co-substrate NTP concentrations are from 0.1 mM to 40 mM, 0.5 to 40 mM, 0.1 to 30 mM, 0.5 mM to 30 mM, 0.1 to 20 mM, 0.5 to 20 mM, 0.1 to 10 mM, 0.5 to 10 mM, 0.1 to 5 mM, or 0.5 to 5 mM.
  • the cofactor or co-substrate NTP is ATP.
  • the cofactor or co-substrate NAD+ concentrations are from 0.1 mM to 40 mM, 0.5 to 40 mM, 0.1 to 30 mM, 0.5 mM to 30 mM, 0.1 to 20 mM, 0.5 to 20 mM, 0.1 to 10 mM, 0.5 to 10 mM, 0.1 to 5 mM, or 0.5 to 5 mM.
  • the ligase activity data is obtained for different or variable ligase substrate concentrations, and adjusted for the type of ligase (e.g., single stranded ligase substrate or double stranded ligase substrate).
  • concentration refers to the concentration of each component polynucleotide acting as a substrate.
  • a single stranded RNA ligase can use a polynucleotide acceptor and a polynucleotide donor provided as separate polynucleotides, where the concentration of polynucleotide acceptor or polynucleotide donor defines “substrate” concentration, as opposed to the sum of concentrations of the polynucleotide acceptor and polynucleotide donor.
  • the activity profile is obtained for ligase substrate concentrations from 0.01 to 20 mM, from 0.1 to 20 mM, from 0.2 to 10 mM, from 0.1 to 5 mM, from 0.2 to 5 mM, from 0.1 to 2 mM, or from 0.2 to 2 mM.
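To make the idea of varying divalent metal, cofactor, and substrate concentrations concrete, the sketch below enumerates a small grid of reaction conditions drawn from within the ranges recited above. The specific levels and the full-factorial layout are assumptions made for illustration; the text does not prescribe a particular experimental design.

```python
from itertools import product

# Hypothetical levels, each inside a range recited in the text.
mg_mM = [0.5, 5.0, 10.0, 20.0]      # divalent metal (e.g., Mg+2)
atp_mM = [0.5, 2.0, 5.0, 10.0]      # cofactor or co-substrate NTP (e.g., ATP)
substrate_mM = [0.2, 1.0, 2.0]      # each component polynucleotide

# Full-factorial set of conditions at which activity would be measured.
design = [
    {"Mg_mM": mg, "ATP_mM": atp, "substrate_mM": sub}
    for mg, atp, sub in product(mg_mM, atp_mM, substrate_mM)
]

print(len(design))   # number of reaction conditions
print(design[0])     # first condition in the grid
```

Each row of such a grid, paired with its measured activity, is one input-output pair for the regression.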
  • the ligase activity data is obtained for different or variable buffer types, and/or different or variable buffer concentrations.
  • Exemplary buffers for ligases include, by way of example and not limitation, borate, potassium phosphate, 2-(N-morpholino)ethane sulfonic acid (MES), 3-(N-morpholino)propanesulfonic acid (MOPS), acetate, triethanolamine, 2-amino-2-hydroxymethyl-propane-1,3-diol (Tris), and the like.
  • the ligase activity data is obtained for buffer concentrations from 1 to 200 mM, 5 to 200 mM, 1 to 150 mM, 5 to 150 mM, 1 to 100 mM, 5 to 100 mM, 1 to 50 mM, 5 to 50 mM, 1 to 20 mM, 5 to 20 mM, 1 to 10 mM, or 5 to 10 mM.
  • [0116] In some embodiments, the ligase activity data is obtained for different or variable salts, and/or different or variable salt concentrations.
  • the salt is, among others, NaCl, KCl, ammonium (e.g., ammonium acetate, ammonium chloride, etc.), acetate (e.g., sodium acetate, etc.). In some embodiments, the salt is NaCl.
  • the activity profile is obtained for salt concentrations, e.g., NaCl, from 0 to 500 mM, 1 to 500 mM, 5 mM to 500 mM, 1 to 400 mM, 5 to 400 mM, 1 to 300 mM, 5 to 300 mM, 1 to 200 mM, 5 to 200 mM, 1 to 100 mM, 5 to 100 mM, 1 to 50 mM, or 5 to 50 mM.
  • the reaction condition activity data is obtained for different ligase concentrations.
  • the ligase is provided at concentrations from about 0.01 g/L to about 50 g/L; about 0.05 g/L to about 50 g/L; about 0.1 g/L to about 40 g/L; about 1 g/L to about 40 g/L; about 2 g/L to about 40 g/L; about 5 g/L to about 40 g/L; about 5 g/L to about 30 g/L; about 0.1 g/L to about 10 g/L; about 0.5 g/L to about 10 g/L; about 1 g/L to about 10 g/L; about 0.1 g/L to about 5 g/L; about 0.5 g/L to about 5 g/L; or about 0.1 g/L to about 2 g/L.
  • the ligase concentration is about 0.01 g/L, about 0.05 g/L, about 0.1 g/L, about 0.5 g/L, about 1 g/L, about 2 g/L, about 5 g/L, about 10 g/L, about 30 g/L, about 40 g/L, or about 50 g/L.
  • an output of the GPR-predicted or modeled reaction condition activity profile of the ligase for the ligase substrate is generated.
  • the output is a multi-output of the predicted or modeled reaction condition activity profile of the ligase for the ligase substrate providing covariance of each output.
  • the multi-output is generated by prescribing an additional covariance function (kernel) over the outputs, specifying the covariance between outputs.
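One common way to prescribe an additional covariance function over the outputs is to combine an output-by-output covariance matrix with the kernel over reaction conditions through a Kronecker product (often called an intrinsic coregionalization model). The sketch below illustrates that construction; it is an assumption about how such a multi-output covariance could be built, not a description of the disclosed implementation, and the matrix `B`, the length scale, and the example conditions are hypothetical.

```python
import numpy as np

def rbf(X1, X2, length_scale=1.0):
    """Squared-exponential kernel between two sets of input points."""
    sq_dist = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

# Three reaction conditions (e.g., Mg+2 concentration in mM).
X = np.array([[1.0], [5.0], [10.0]])
K_x = rbf(X, X, length_scale=4.0)            # 3 x 3 covariance over inputs

# Hypothetical covariance between two outputs (e.g., two ligases).
B = np.array([[1.0, 0.6],
              [0.6, 1.0]])

# Joint covariance over every (output, condition) pair: 6 x 6.
K_multi = np.kron(B, K_x)

print(K_multi.shape)
```

The off-diagonal blocks of `K_multi` equal 0.6 times `K_x`, so a measurement of one output at a condition informs the prediction for the other output at nearby conditions.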
  • the output is a contour plot of the predicted or modeled ligase activity for the different reaction condition variables.
  • the contour plot is a three or four dimensional surface plot of predicted ligase activity for the variable reaction conditions.
  • the ligase substrate comprises at least a polynucleotide acceptor and a polynucleotide donor.
  • the 3’-terminal nucleotide of the polynucleotide acceptor strand has a requisite 3’-OH, or functional form thereof, to act as the acceptor.
  • the 5’-terminal nucleotide of the nucleotide or polynucleotide donor has a requisite 5’-phosphate, or functional form thereof, to act as the donor in the ligase reaction.
  • the polynucleotide acceptor and the polynucleotide donor can be any length and/or form suitable for the ligase.
  • the polynucleotide acceptor comprises 3 to 400, 4 to 350, 5 to 300, 6 to 250, 7 to 200, 8 to 150, 9 to 100, or 10 to 50 nucleotides in length.
  • the polynucleotide acceptor for a single stranded ligase can have some double stranded regions but have sufficient single stranded region at the 3’-terminal end to function as an acceptor for the single stranded ligase.
  • the polynucleotide donor for a ligase comprises 3 to 400, 4 to 350, 5 to 300, 6 to 250, 7 to 200, 8 to 150, 9 to 100, or 10 to 50 nucleotides in length.
  • the polynucleotide donor for a single stranded ligase, similar to the polynucleotide acceptor, can have some double stranded regions but have sufficient single stranded region at the 5’-terminal end to function as a donor for the single stranded ligase.
  • the donor comprises a nucleotide donor, as further described herein.
  • the ligase substrate further comprises a polynucleotide strand complementary to the polynucleotide acceptor strand and polynucleotide donor strand and forms a double stranded polynucleotide substrate comprising a ligatable nick.
  • the polynucleotide acceptor strand and the polynucleotide donor strand are provided on a single polynucleotide, and the ligatable nick is formed through base pairing of self-complementary regions on the single polynucleotide (e.g., to form a hairpin structure).
  • the polynucleotide acceptor strand and the polynucleotide donor strand of the double stranded polynucleotide substrate are provided as separate polynucleotides and form a nick when they base pair to a polynucleotide complementary to the polynucleotide acceptor strand and polynucleotide donor strand.
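The nick geometry described above, with separate acceptor and donor strands annealed side by side on a complementary strand, can be illustrated with a short sequence check. This is a deliberately simplified sketch: it assumes unmodified RNA, strict Watson-Crick pairing, and a complementary strand that exactly spans the acceptor followed by the donor. The sequences and the function names are hypothetical.

```python
# Watson-Crick complements for unmodified RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def forms_ligatable_nick(acceptor, donor, complementary_strand):
    """True if acceptor (3'-OH end) and donor (5'-phosphate end) abut when
    base paired to the complementary strand, leaving a single nick."""
    return reverse_complement(acceptor + donor) == complementary_strand

acceptor = "GGAUC"              # its 3' end sits at the nick
donor = "CAUGG"                 # its 5' end sits at the nick
template = "CCAUGGAUCC"         # complementary to acceptor + donor

print(forms_ligatable_nick(acceptor, donor, template))
```

A hairpin substrate, where both strands and the complementary region are on one polynucleotide, would need a different check that looks for self-complementary regions.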
  • the double stranded polynucleotide substrate is formed from double stranded polynucleotide fragments that have cohesive ends, which can base pair to form ligatable nicks.
  • the cohesive ends comprise a complementary region of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotides.
  • the double stranded polynucleotide substrate comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nicks. In some embodiments, the double stranded polynucleotide substrate comprises a plurality of polynucleotide acceptor substrates. In some embodiments, the double stranded polynucleotide substrate comprises a plurality of polynucleotide donor substrates.
  • [0125] In some embodiments, the double stranded polynucleotide substrate comprises blunt ended substrates, wherein the double stranded polynucleotide ligase is capable of ligating blunt ended substrates.
  • the ligase substrate comprises a modified nucleoside, modified internucleoside linkage, or a combination of modified nucleoside and modified internucleoside linkage.
  • the polynucleotide acceptor of the ligase substrate comprises a modified nucleoside and/or modified internucleoside linkage.
  • [0128] In some embodiments, the polynucleotide acceptor comprises a 3’-terminal modified nucleoside.
  • the polynucleotide acceptor comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 3’-terminal nucleoside towards the 5’-terminus of the polynucleotide acceptor.
  • the polynucleotide acceptor comprises a 5’-terminal modified nucleoside.
  • the polynucleotide acceptor comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 5’-terminal nucleoside towards the 3’-terminus of the polynucleotide acceptor.
  • polynucleotide acceptor comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified nucleosides.
  • the modified 3’-terminal nucleoside of the polynucleotide acceptor comprises a modified sugar moiety.
  • the sugar moiety is modified at the 2’-position of the sugar moiety.
  • the modified 3’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro-cytidine, 2’-fluoro-uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
  • the modified 5’-terminal nucleoside of the polynucleotide acceptor comprises a modified sugar moiety.
  • the sugar moiety is modified at the 2’-position of the sugar moiety.
  • the modified 5’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro-cytidine, 2’-fluoro-uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
  • the polynucleotide acceptor comprises a 3’-terminal modified internucleoside linkage. In some embodiments, the polynucleotide acceptor comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 3’-terminal nucleoside of the polynucleotide acceptor.
  • [0134] In some embodiments, the polynucleotide acceptor comprises a 5’-terminal modified internucleoside linkage.
  • the polynucleotide acceptor comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 5’-terminal nucleoside of the polynucleotide acceptor strand.
  • the polynucleotide acceptor comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified internucleoside linkages.
  • the polynucleotide donor of the ligase substrate comprises a modified nucleoside and/or modified internucleoside linkage.
  • the polynucleotide donor comprises a 5’-terminal modified nucleoside.
  • the polynucleotide donor comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 5’-terminal nucleoside of the polynucleotide donor strand towards the 3’-terminus.
  • the polynucleotide donor comprises a 3’-terminal modified nucleoside.
  • the polynucleotide donor comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 3’-terminal nucleoside of the polynucleotide donor strand towards the 5’-terminus.
  • polynucleotide donor comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified nucleosides.
  • the modified 5’-terminal nucleoside of the polynucleotide donor comprises a modified sugar moiety.
  • the sugar moiety is modified at the 2’-position of the sugar moiety.
  • the modified 5’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro-cytidine, 2’-fluoro-uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
  • the modified 3’-terminal nucleoside of the polynucleotide donor comprises a modified sugar moiety.
  • the sugar moiety is modified at the 2’-position of the sugar moiety.
  • the modified 3’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro-cytidine, 2’-fluoro-uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
  • the polynucleotide donor comprises a 5’-terminal modified internucleoside linkage. In some embodiments, the polynucleotide donor comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 5’-terminal nucleoside of the polynucleotide donor strand.
  • [0143] In some embodiments, the polynucleotide donor comprises a 3’-terminal modified internucleoside linkage.
  • the polynucleotide donor comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 3’-terminal nucleoside of the polynucleotide donor strand.
  • [0144] In some embodiments, the polynucleotide donor comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified internucleoside linkages.
  • the 3’-terminal nucleoside of a polynucleotide acceptor strand and/or the 5’-terminal nucleoside of the polynucleotide donor strand forming a ligation junction of the ligase substrate comprise modified nucleosides.
  • one or more nucleosides 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 bases from the ligation junction on the polynucleotide acceptor strand are modified.
  • one or more nucleosides 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 bases from the ligation junction on the polynucleotide donor strand are modified.
  • the ligation junction comprises one or more modified internucleoside linkages.
  • the polynucleotide acceptor strand forming the ligation junction comprises a modified internucleoside linkage.
  • the polynucleotide donor strand forming the ligation junction comprises a modified internucleoside linkage.
  • the ligation junction comprises at least a modified 3’-terminal nucleoside on the polynucleotide acceptor, and a modified 5’-terminal nucleoside on the polynucleotide donor.
  • the ligation junction comprises at least a modified 3’- terminal nucleoside on the polynucleotide acceptor, wherein the modified 3’-terminal nucleoside is 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro cytidine, 2’-fluoro uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O- methyl-thymidine, and comprises at least a modified 5’-terminal nucleoside on the polynucleotide donor, wherein the modified 5’-terminal nucleoside is 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’- fluoro cytidine, 2’-fluoro
  • the donor substrate for single stranded RNA ligase comprises a nucleotide donor substrate.
  • the nucleotide donor substrate for the single stranded RNA ligase comprises the structure pN, where the prefix p represents a 5’-phosphate group, and N represents a nucleoside.
  • the nucleotide donor comprises the structure pNp, where prefix p represents a 5’-phosphate group, N represents a nucleoside, and the suffix p represents a 3’-phosphate group.
  • the nucleoside N of the nucleotide donor is unmodified. In some embodiments, the nucleoside N of the nucleotide donor is modified. In some embodiments, the modified nucleoside of the nucleotide donor comprises a 2’-modified, 3’-modified, or 2’- and 3’-modified sugar moiety. In some embodiments, the modified nucleoside comprises a modified nucleobase. In some embodiments, the modified nucleoside comprises a modified sugar moiety and a modified nucleobase. In some embodiments, the nucleotide donor substrate is modified with a conjugate moiety, such as a targeting moiety, as further described herein.
  • the method of predicting or modeling the reaction condition activity profile of a ligase for a ligase substrate further comprises predicting or modeling reaction condition activity profile of at least a second ligase substrate, and comparing the output of predicted reaction condition activity profile of the ligase for the ligase substrate with the output of predicted reaction condition activity profile of the ligase for the second ligase substrate.
  • the ligation of the ligase substrate and the ligation of the second ligase substrate produce the same ligated product.
  • the ligase substrate and the second ligase substrate that produce the same product comprise at least a different ligation point or ligation junction, thereby showing the effect of different ligation points or ligation junctions on the reaction condition activity profile.
  • the ligase substrate and the second ligase substrate result in the same product and comprise at least 2 different ligation junctions.
  • the ligase substrate and the second ligase substrate result in the same product and comprise at least 3, 4, 5, 6, 7, 8, 9, 10 or more different ligation junctions.
  • the ligase is a single stranded DNA ligase, a double stranded DNA ligase, a single stranded RNA ligase, or a double stranded RNA ligase.
  • the ligase is preferably a double stranded RNA ligase.
  • a method of predicting or modeling reaction condition activity profile of a double stranded RNA ligase for a double stranded RNA ligase substrate comprises: obtaining activity data of a double stranded RNA ligase for a double stranded RNA ligase substrate under different or variable reaction conditions; applying Gaussian Process Regression (GPR) on the activity data obtained for the different or variable reaction conditions; and generating an output of the GPR-predicted or modeled reaction condition activity profile of the double stranded RNA ligase for the double stranded RNA ligase substrate.
  • the different or variable reaction conditions comprise varying two or more, or three or more of: divalent metal concentration, double stranded RNA ligase substrate(s) concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration.
  • the different or variable reaction conditions comprise at least different or variable divalent metal concentration, cofactor or co-substrate NTP concentration, and double stranded RNA ligase substrate concentration.
  • the divalent metal is any divalent metal with which the double stranded RNA ligase is active in the ligation reaction.
  • the divalent metal is Mn+2 or Mg+2. In some embodiments, the divalent metal for the double stranded RNA ligase is Mg+2. In some embodiments, the activity profile is obtained for divalent metal concentrations under which the double stranded RNA ligase shows activity for the ligase substrate.
  • the double stranded RNA ligase activity profile is obtained for divalent metal concentrations, e.g., Mg+2, varied from 0.1 mM to 100 mM, from 0.5 to 50 mM, 0.1 to 40 mM, 0.5 to 40 mM, 0.1 to 20 mM, 0.5 to 20 mM, 0.1 to 10 mM, or 0.5 to 10 mM.
  • changes in divalent metal concentrations can be done in increments, e.g., of 1 mM, 2 mM, 5 mM, 10 mM, etc., as needed to obtain the activity profile for the desired divalent metal concentrations.
  • the double stranded RNA ligase activity profile is obtained for different or variable cofactor or co-substrate NTP concentrations.
  • the cofactor or co-substrate NTP concentrations are from 0.1 mM to 40 mM, 0.5 to 40 mM, 0.1 to 30 mM, 0.5 mM to 30 mM, 0.1 to 20 mM, 0.5 to 20 mM, 0.1 to 10 mM, 0.5 to 10 mM, 0.1 to 5 mM, or 0.5 to 5 mM.
  • the cofactor or co-substrate NTP is ATP.
  • the double stranded RNA ligase activity profile is obtained for different or variable double stranded RNA ligase substrate concentrations.
  • the activity profile is obtained for ligase substrate concentrations from 0.01 to 20 mM, from 0.1 to 20 mM, from 0.2 to 10 mM, from 0.1 to 5 mM, from 0.2 to 5 mM, from 0.1 to 2 mM, or from 0.2 to 2 mM.
  • the double stranded RNA ligase substrate contains one or more modified nucleosides, one or more modified internucleoside linkages, or a combination of modified nucleoside and modified internucleoside linkages.
  • the modified nucleoside and/or modified internucleoside linkage is present at the ligation junction, and/or the polynucleotide strand complementary to the ligation junction, as further described herein.
  • the double stranded RNA ligase activity profile is obtained for different or variable buffer types, and/or different or variable buffer concentrations.
  • Exemplary buffers for double stranded RNA ligases include, by way of example and not limitation, borate, potassium phosphate, 2-(N-morpholino)ethane sulfonic acid (MES), 3-(N-morpholino)propanesulfonic acid (MOPS), acetate, triethanolamine, 2-amino-2-hydroxymethyl-propane-1,3-diol (Tris), and the like.
  • the double stranded RNA ligase activity profile is obtained for buffer concentrations from 1 to 200 mM, 5 to 200 mM, 1 to 150 mM, 5 to 150 mM, 1 to 100 mM, 5 to 100 mM, 1 to 50 mM, 5 to 50 mM, 1 to 20 mM, 5 to 20 mM, 1 to 10 mM, or 5 to 10 mM.
  • the double stranded RNA ligase activity profile is obtained for different or variable salts, and/or different or variable salt concentrations.
  • the salt is, among others, NaCl, KCl, ammonium (e.g., ammonium acetate, ammonium chloride, etc.), acetate (e.g., sodium acetate, etc.).
  • the salt is NaCl.
  • the double stranded RNA ligase activity profile is obtained for salt concentrations, e.g., NaCl, from 0 to 500 mM, 1 to 500 mM, 5 to 500 mM, 1 to 400 mM, 5 to 400 mM, 1 to 300 mM, 5 to 300 mM, 1 to 200 mM, 5 to 200 mM, 1 to 100 mM, 5 to 100 mM, 1 to 50 mM, or 5 to 50 mM.
  • the double stranded RNA ligase is provided at different concentrations.
  • the double stranded RNA ligase is provided at concentrations from about 0.01 g/L to about 50 g/L; about 0.01 to about 0.1 g/L; about 0.05 g/L to about 50 g/L; about 0.1 g/L to about 40 g/L; about 1 g/L to about 40 g/L; about 2 g/L to about 40 g/L; about 5 g/L to about 40 g/L; about 5 g/L to about 30 g/L; about 0.1 g/L to about 10 g/L; about 0.5 g/L to about 10 g/L; about 1 g/L to about 10 g/L; about 0.1 g/L to about 5 g/L; about 0.5 g/L to about 5 g/L; or about 0.1 g/L to about 2 g/L.
  • an output of the predicted or modeled reaction condition activity profile of the double stranded RNA ligase for the double stranded RNA ligase substrate is generated.
  • the output is a multi-output of the predicted or modeled reaction condition activity profile of the double stranded RNA ligase for the ligase substrate providing covariance of each output, as described herein.
  • the output is a contour plot of the predicted or modeled activity for the different variables for the double stranded RNA ligase activity.
  • the contour plot is a three or four dimensional surface plot of the double stranded RNA ligase activity for the variable reaction conditions.
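By way of illustration only, the variable reaction conditions over which an activity profile is obtained could be enumerated as a full-factorial grid. The following Python sketch assumes hypothetical variable names and hypothetical levels, chosen only so that each level falls inside a range recited above; none of these specific values is taken from the disclosure.

```python
from itertools import product

# Hypothetical screening levels (assumptions for illustration), each inside
# a concentration range recited in the text.
levels = {
    "atp_mM": [0.5, 5.0, 20.0],
    "substrate_mM": [0.2, 1.0, 2.0],
    "nacl_mM": [0.0, 50.0, 200.0],
    "buffer_mM": [5.0, 50.0],
}

def condition_grid(levels):
    """Return every combination of the given levels as a list of dicts."""
    names = list(levels)
    combos = product(*(levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]

grid = condition_grid(levels)
```

Each entry of `grid` is one candidate reaction condition; measured or modeled activity at each entry could then be plotted as a contour or surface over any two or three of the variables.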
  • the double stranded RNA ligase substrate comprises a polynucleotide acceptor strand, a polynucleotide donor strand, and a polynucleotide strand complementary to the polynucleotide acceptor strand and polynucleotide donor strand, which together form a double stranded RNA ligase substrate comprising a ligatable nick.
  • the polynucleotide acceptor strand and the polynucleotide donor strand are provided on a single polynucleotide, and the ligatable nick is formed through base pairing of self-complementary regions on the single polynucleotide (e.g., to form a hairpin structure).
  • the polynucleotide acceptor strand and the polynucleotide donor strand of the double stranded RNA ligase substrate are provided as separate polynucleotides, which form a nick when they base pair to a polynucleotide complementary to the polynucleotide acceptor strand and polynucleotide donor strand.
  • the double stranded RNA ligase substrates are formed from double stranded polynucleotide fragments having cohesive ends that can base pair to form ligatable nicks.
  • the cohesive ends comprise a complementary region of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotides.
  • the double stranded RNA ligase substrate comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nicks.
  • the double stranded RNA ligase substrate comprises a plurality of polynucleotide acceptor substrates and a plurality of polynucleotide donor substrates.
  • the double stranded RNA ligase substrate is formed from at least two, 3, 4, 5, 6, 7, 8 or more double stranded polynucleotide fragments.
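As a simplified illustration of the three-strand substrate described above, the following Python sketch checks whether an acceptor strand followed by a donor strand base pairs along its full length with a complementary strand, leaving a single nick between acceptor and donor. The function names and example sequences are hypothetical. The sketch assumes unmodified RNA, Watson-Crick pairing only (no G-U wobble), and a complementary strand of exactly the combined length; it does not check for the 3’-OH or 5’-phosphate groups.

```python
# Watson-Crick pairs for unmodified RNA (assumption: no wobble pairs).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a 5'->3' RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def forms_ligatable_nick(acceptor, donor, complementary):
    """True if acceptor + donor (both 5'->3') pair fully with the
    complementary strand (5'->3'), so that the acceptor 3'-end abuts
    the donor 5'-end at a nick."""
    return reverse_complement(acceptor + donor) == complementary

# Hypothetical example sequences.
ok = forms_ligatable_nick("GGAUC", "CAUGG", "CCAUGGAUCC")
```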
  • the polynucleotide acceptor strand, the polynucleotide donor strand, and/or the polynucleotide complementary to the polynucleotide acceptor strand and polynucleotide donor strand of the double stranded RNA ligase substrate comprises a modified nucleoside and/or modified internucleoside linkage.
  • the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a modified nucleoside and/or modified internucleoside linkage.
  • the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a 3’-terminal modified nucleoside.
  • the polynucleotide acceptor strand comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 3’-terminal nucleoside of the polynucleotide acceptor strand to the 5’-terminal end of the polynucleotide acceptor strand.
  • the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a 5’-terminal modified nucleoside.
  • the polynucleotide acceptor strand comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 5’-terminal nucleoside of the polynucleotide acceptor strand to the 3’-terminal end of the polynucleotide acceptor strand.
  • the polynucleotide acceptor strand of the double stranded RNA ligase substrate has at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified nucleosides.
  • the modified 3’-terminal nucleoside of the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a modified sugar moiety.
  • the sugar moiety is modified at the 2’-position of the sugar moiety.
  • the modified 3’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro-cytidine, 2’-fluoro-uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
  • the modified 5’-terminal nucleoside of the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a modified sugar moiety.
  • the sugar moiety is modified at the 2’-position of the sugar moiety.
  • the modified 5’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro-cytidine, 2’-fluoro-uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
  • the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a 3’-terminal modified internucleoside linkage.
  • the polynucleotide acceptor strand comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 3’-terminal nucleoside of the polynucleotide acceptor strand.
  • the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a 5’-terminal modified internucleoside linkage.
  • the polynucleotide acceptor strand comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 5’-terminal nucleoside of the polynucleotide acceptor strand.
  • the polynucleotide acceptor strand of the double stranded RNA ligase substrate has at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified internucleoside linkages.
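The position numbering and percentage language above can be made concrete with a small Python sketch. It assumes a hypothetical representation in which a strand is a list of (base, modification) pairs written 5’ to 3’, with None marking an unmodified nucleoside; the representation, the function names, and the example strand are illustrative assumptions, not part of the disclosure.

```python
def percent_modified(strand):
    """Percentage of nucleosides in the strand that carry a modification."""
    modified = sum(1 for _, mod in strand if mod is not None)
    return 100.0 * modified / len(strand)

def from_3prime(strand, position):
    """Nucleoside at the given position counted from the 3'-terminal
    nucleoside (position 1 is the 3'-terminal nucleoside itself)."""
    return strand[-position]

def from_5prime(strand, position):
    """Nucleoside at the given position counted from the 5'-terminal
    nucleoside (position 1 is the 5'-terminal nucleoside itself)."""
    return strand[position - 1]

# Hypothetical acceptor strand, written 5' -> 3'.
acceptor = [("G", None), ("A", "2'-F"), ("U", None), ("C", "2'-OMe")]
```

With this numbering, a 3’-terminal modified nucleoside is `from_3prime(acceptor, 1)`, and "position 3 from the 3’-terminal nucleoside" is `from_3prime(acceptor, 3)`.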
  • the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a modified nucleoside and/or modified internucleoside linkage.
  • the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a 5’-terminal modified nucleoside.
  • the polynucleotide donor strand comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 5’-terminal nucleoside of the polynucleotide donor strand toward the 3’-terminus.
  • the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a 3’-terminal modified nucleoside.
  • the polynucleotide donor strand comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 3’-terminal nucleoside of the polynucleotide donor strand toward the 5’-terminus.
  • the polynucleotide donor strand of the double stranded RNA ligase substrate has at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified nucleosides.
  • the modified 5’-terminal nucleoside of the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a modified sugar moiety.
  • the sugar moiety is modified at the 2’-position of the sugar moiety.
  • the modified 5’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro cytidine, 2’-fluoro uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
  • the modified 3’-terminal nucleoside of the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a modified sugar moiety.
  • the sugar moiety is modified at the 2’-position of the sugar moiety.
  • the modified 3’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro cytidine, 2’-fluoro uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
  • the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a 5’-terminal modified internucleoside linkage.
  • the polynucleotide donor strand comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 5’-terminal nucleoside of the polynucleotide donor strand.
  • the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a 3’-terminal modified internucleoside linkage.
  • the polynucleotide donor strand comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 3’-terminal nucleoside of the polynucleotide donor strand.
  • the polynucleotide donor strand of the double stranded RNA ligase substrate comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified internucleoside linkages.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises a modified nucleoside and/or modified internucleoside linkage.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises a modified nucleoside in the nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide acceptor strand.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises one or more modified nucleosides at nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide acceptor strand.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises a modified nucleoside in the nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide acceptor strand.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises one or more modified nucleosides at nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide acceptor strand.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises a modified internucleoside linkage.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises at least 1, 2, 3, 4 or 5 modified internucleoside linkages at the 3’-terminal region of the polynucleotide strand. In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises at least 1, 2, 3, 4 or 5 modified internucleoside linkages at the 5’-terminal region of the polynucleotide strand.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises a modified nucleoside and/or modified internucleoside linkage.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises a modified nucleoside in the nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide donor strand.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises one or more modified nucleosides at nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide donor strand.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises a modified nucleoside in the nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide donor strand.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises one or more modified nucleosides at nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide donor strand.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified nucleosides.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises a modified internucleoside linkage.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises at least 1, 2, 3, 4 or 5 modified internucleoside linkages at the 3’-terminal region of the polynucleotide strand. In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises at least 1, 2, 3, 4 or 5 modified internucleoside linkages at the 5’-terminal region of the polynucleotide strand.
  • the polynucleotide complementary to the polynucleotide acceptor strand and polynucleotide donor strand of the double stranded RNA ligase substrate comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified internucleoside linkages.
  • the 3’-terminal nucleoside of a polynucleotide acceptor strand and/or the 5’-terminal nucleoside of the polynucleotide donor strand forming a ligation junction of the double stranded RNA ligase substrate comprise modified nucleosides.
  • one or more nucleosides 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 bases from the ligation junction on the polynucleotide acceptor strand are modified.
  • one or more nucleosides 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 bases from the ligation junction on the polynucleotide donor strand are modified.
  • the ligation junction comprises one or more modified internucleoside linkages.
  • the polynucleotide acceptor strand forming the ligation junction comprises a modified internucleoside linkage.
  • the polynucleotide donor strand forming the ligation junction comprises a modified internucleoside linkage.
  • the ligation junction of the double stranded RNA ligase substrate comprises at least a modified 3’-terminal nucleoside on the polynucleotide acceptor strand, and a modified 5’-terminal nucleoside on the polynucleotide donor strand.
  • the ligation junction comprises at least a modified 3’-terminal nucleoside on the polynucleotide acceptor, wherein the modified 3’-terminal nucleoside is 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro cytidine, 2’-fluoro uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine, and comprises at least a modified 5’-terminal nucleoside on the polynucleotide donor, wherein the modified 5’-terminal nucleoside is 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro cytidine, 2’-fluoro
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand at the ligation junction of the double stranded RNA ligase substrate comprises a modified nucleoside in the nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide acceptor strand.
  • the modified nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide acceptor strand is a complementary 2’-fluoro or 2’-O-methyl nucleoside.
  • the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand at the ligation junction of the double stranded RNA ligase substrate comprises a modified nucleoside in the nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide donor strand.
  • the modified nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide donor strand is a complementary 2’-fluoro or 2’-O-methyl nucleoside.
  • the method of predicting or modeling the reaction condition activity profile of a double stranded RNA ligase for a double stranded RNA ligase substrate further comprises predicting or modeling the reaction condition activity profile of the double stranded RNA ligase for at least a second double stranded RNA ligase substrate, and comparing the output of the predicted reaction condition activity profile of the double stranded RNA ligase for the double stranded RNA ligase substrate with the output of the predicted reaction condition activity profile of the double stranded RNA ligase for the second double stranded RNA ligase substrate.
  • the ligation of the double stranded RNA ligase substrate and ligation of the second double stranded RNA ligase substrate produces the same ligated product.
  • the double stranded RNA ligase substrate and the second double stranded RNA ligase substrate that produce the same product comprise at least one different ligation junction.
  • the double stranded RNA ligase substrate and the second double stranded RNA ligase substrate that result in the same product comprise at least two different ligation junctions.
  • the double stranded RNA ligase substrate and the second double stranded RNA ligase substrate comprise 3, 4, 5, 6, 7, 8, 9, 10, or more different ligation junctions.
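To illustrate how two substrates with different ligation junctions can yield the same ligated product, the following Python sketch splits one strand of a product sequence at two alternative sets of junction positions. The function name, the example sequence, and the junction positions are hypothetical and serve only to show the idea; the complementary strand and any modifications are ignored.

```python
def fragments_for_junctions(product_seq, junctions):
    """Split a 5'->3' product sequence into the fragments that would be
    joined at the given junction positions (0-based cut indices)."""
    cuts = [0] + sorted(junctions) + [len(product_seq)]
    return [product_seq[a:b] for a, b in zip(cuts, cuts[1:])]

# Hypothetical product strand and two alternative junction designs.
product_seq = "GGAUCCAUGGCA"
design_1 = fragments_for_junctions(product_seq, [4, 8])
design_2 = fragments_for_junctions(product_seq, [5, 9])
```

Both designs reassemble to the same product, so their predicted activity profiles can be compared to choose the junction placement that a given ligase handles best.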
  • the application of Gaussian Process Regression on ligase activity data and generation of predicted reaction condition activity profiles for polynucleotide ligases provides a method of screening of ligases for activity on one or more ligase substrates, and identifying ligases having activity and corresponding reaction conditions favorable for the ligase substrate.
  • a method of identifying or screening polynucleotide ligases for activity on a ligase substrate comprises: (a) contacting a plurality of different polynucleotide ligases with a ligase substrate under a first reaction condition; (b) selecting the polynucleotide ligases with activity on the ligase substrate under the first reaction condition; (c) determining or predicting the reaction condition activity profile for each of the selected polynucleotide ligases on the ligase substrate; and (d) retesting activity of each ligase for the ligase substrate under best-fit reaction conditions for the ligase as determined from the predicted or modeled reaction condition activity profile, to identify the ligases having optimal activity among the screened ligases for the ligase substrate.
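Steps (a) through (d) can be sketched as a short Python workflow. Everything named here is a hypothetical stand-in: the ligase names, the activity values, the activity threshold, the condition labels (including magnesium as the divalent metal), and the `retest` callable, which in practice would be a laboratory measurement rather than a lookup.

```python
def select_active(first_pass, threshold):
    """Step (b): keep ligases whose first-condition activity meets a threshold."""
    return [name for name, activity in first_pass.items() if activity >= threshold]

def best_fit_condition(profile):
    """Pick the condition with the highest predicted activity in a profile."""
    return max(profile, key=profile.get)

def screen(first_pass, profiles, retest, threshold):
    """Steps (b)-(d): select, look up each predicted profile, retest at best fit."""
    results = {}
    for name in select_active(first_pass, threshold):
        condition = best_fit_condition(profiles[name])  # profile from step (c)
        results[name] = (condition, retest(name, condition))
    return results

# Hypothetical data: step (a) activities and step (c) predicted profiles.
first_pass = {"ligA": 0.40, "ligB": 0.05, "ligC": 0.25}
profiles = {
    "ligA": {"Mg 5 mM / ATP 1 mM": 0.6, "Mg 10 mM / ATP 1 mM": 0.8},
    "ligC": {"Mg 5 mM / ATP 1 mM": 0.5, "Mg 10 mM / ATP 1 mM": 0.3},
}
# Stand-in for the wet-lab retest: simply returns the predicted value.
results = screen(first_pass, profiles, lambda n, c: profiles[n][c], threshold=0.1)
```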
  • a polynucleotide ligase of the plurality of ligases is provided at different concentrations.
  • the ligase is provided at concentrations from about 0.01 g/L to about 50 g/L; about 0.01 to about 0.1 g/L; about 0.05 g/L to about 50 g/L; about 0.1 g/L to about 40 g/L; about 1 g/L to about 40 g/L; about 2 g/L to about 40 g/L; about 5 g/L to about 40 g/L; about 5 g/L to about 30 g/L; about 0.1 g/L to about 10 g/L; about 0.5 g/L to about 10 g/L; about 1 g/L to about 10 g/L; about 0.1 g/L to about 5 g/L; about 0.5 g/L to about 5 g/L; or about 0.1 g/L to about 2
  • the predicted reaction condition activity profile for each ligase is determined by applying Gaussian Process Regression (GPR) on ligase activity data obtained under different or variable reaction conditions.
  • the different reaction conditions comprise varying two or more, or three or more of: divalent metal concentration, ligase substrate concentration, cofactor or co-substrate NTP concentration, buffer concentration, salt concentration, and ligase reaction temperature.
  • the reaction condition activity data comprises activity determined for at least varying divalent metal concentrations, varying cofactor or co-substrate NTP concentrations, and varying double stranded RNA ligase substrate concentrations.
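A minimal Gaussian Process Regression sketch in plain Python is given below to make the modeling step concrete. It is not the implementation of the disclosure: it assumes a zero-mean prior, a squared-exponential (RBF) kernel with hand-picked hyperparameters, a single output with predictive variance (not the multi-output covariance described elsewhere herein), and invented toy data in which each input is a (divalent metal mM, ATP mM) pair.

```python
import math

def rbf(x, y, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two condition vectors."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return variance * math.exp(-0.5 * sq / length_scale ** 2)

def cholesky(A):
    """Lower-triangular L with A = L L^T, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_fit(X, y, noise=1e-6, **kern):
    """Factor K + noise*I and precompute alpha = (K + noise*I)^-1 y."""
    K = [[rbf(a, b, **kern) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    L = cholesky(K)
    return L, solve_upper_t(L, solve_lower(L, y))

def gp_predict(X, L, alpha, x_new, **kern):
    """Posterior mean and variance of activity at a new condition."""
    k_star = [rbf(a, x_new, **kern) for a in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf(x_new, x_new, **kern) - sum(t * t for t in v)
    return mean, max(var, 0.0)

# Invented toy data: (metal mM, ATP mM) -> relative activity.
X = [(1.0, 0.5), (5.0, 0.5), (1.0, 5.0), (5.0, 5.0), (3.0, 2.5)]
y = [0.2, 0.5, 0.3, 0.6, 0.9]
L, alpha = gp_fit(X, y, length_scale=2.0)
mean_mid, var_mid = gp_predict(X, L, alpha, (3.0, 2.5), length_scale=2.0)
mean_far, var_far = gp_predict(X, L, alpha, (100.0, 100.0), length_scale=2.0)
```

Near a measured condition the predictive variance is close to zero, and far from all measurements it returns to the prior variance; this is what allows the model to flag which regions of reaction condition space are well supported by data. In practice the inputs would typically be rescaled and the kernel hyperparameters fitted rather than fixed.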
  • the ligase is a single stranded DNA ligase, a double stranded DNA ligase, a single stranded RNA ligase, or a double stranded RNA ligase.
  • the ligase is a single stranded DNA ligase and nucleic acid substrates are single stranded DNA substrates.
  • the single stranded DNA ligase uses ATP as a cofactor.
  • the single stranded DNA ligase is bacteriophage TS2126 RNA ligase (e.g., CircLigase; Epicentre Biotechnologies), Methanobacterium thermoautotrophicum RNA ligase 1, or 5' AppDNA/RNA Ligase (New England Biolabs).
  • the nucleic acid ligase is a double stranded DNA (dsDNA) ligase and the NMP donor cofactor or co-substrate comprises ATP or NAD (or dNAD).
  • double stranded DNA ligases include, among others, viral, bacterial, fungal, plant, insect, and mammalian dsDNA ligases.
  • the double stranded DNA ligase uses NAD or dNAD as a cofactor/co-substrate.
  • dsDNA ligases using NAD as a cofactor/co-substrate include double stranded DNA ligases of eubacteria, such as E.
  • the double stranded DNA ligase uses ATP as a cofactor or co-substrate.
  • double stranded DNA ligases using ATP as a cofactor or co-substrate include bacteriophage double stranded DNA ligases (e.g., T3, T4, T6, and T7-DNA ligases), fungal double stranded DNA ligases (e.g., Saccharomyces pombe, Schizosaccharomyces pombe, etc.), human DNA ligase I, vaccinia DNA ligase, and African swine fever virus dsDNA ligase.
  • the double stranded DNA ligase comprises an engineered or recombinant dsDNA ligase variant or a chemically modified double stranded DNA ligase.
  • the double stranded DNA ligase comprises an engineered double stranded DNA ligase disclosed in U.S. Patent No. 8728725, U.S. Patent No. 10626390, U.S. Patent No. 10837009, U.S. Patent No. 11124789, WO2018208665, WO2024158764, and Wilson et al., Protein Engineering, Design & Selection, 2013, 26(7):471-478, all of which are incorporated by reference herein.
  • the nucleic acid ligase is a single stranded RNA (ssRNA) ligase and the cofactor or co-substrate NTP comprises ATP.
  • various single stranded RNA ligases useful in the enzymatic reactions include, among others, RNA ligase 1, such as of bacteriophage T4, Citrobacter phage Merlin, Escherichia phage vB_EcoM_VR25, Serratia phage PS2, Phage TS2126, and Rhodothermus phage RM378.
  • the single stranded RNA ligase comprises an engineered single stranded RNA ligase.
  • the engineered single stranded RNA ligase is disclosed in U.S. provisional application No. 63/634,859, filed April 16, 2024, and U.S. provisional application No. 63/646,841, filed May 13, 2024, incorporated by reference herein.
  • the nucleic acid ligase is a double stranded RNA (dsRNA) ligase and the cofactor or co-substrate NTP comprises ATP.
  • the dsRNA ligase comprises an engineered dsRNA ligase disclosed in WO2024138200; U.S. provisional application No. 63/618,203, filed January 5, 2024; U.S. provisional application No. 63/554,938, filed January 16, 2024; U.S. provisional application No. 63/646,753, filed May 13, 2024; and U.S. provisional application No. 63/601,699, filed November 21, 2023; all references incorporated herein by reference.
  • the nucleic acid ligase that produces reaction product NMP is an RNA splicing ligase and the cofactor or co-substrate comprises ATP.
  • the RNA splicing ligase is tRNA splicing ligase.
  • the RNA splicing ligase comprises RNA ligase RtcB or rRNA ligase, and homologs thereof, including human and C. elegans.
  • the modification or modifications in the ligase substrate can comprise modifications described herein and below.

2’- and 3’-sugar modifications
  • the modified nucleotide comprises a modified nucleoside, wherein the modification is on the sugar moiety of the nucleoside.
  • the modified sugar moiety is a modified furanosyl sugar moiety, for example ribose or deoxyribose.
  • the furanosyl sugar moiety is modified or substituted at the 2’, 3’, or a combination of 2’ and 3’ positions, as appropriate.
  • the modification is at the 2’-position of the sugar moiety.
  • substitutions at the 2’-position include, among others, halo (e.g., Cl, F, Br, etc.) or -O-alkyl or 2’-alkoxy (e.g., O-methyl, O-ethyl, etc.).
  • other modifications at the 2’-position include, but are not limited to, allyl, amino, azido, SH, CN, OCN, CF3, OCF3, SCH3, SOCH3, SO2CH3, ONO2, NO2, N3, and NH2.
  • substituent groups at the 2’-position include, among others, O-(C1-C10)alkoxy, alkoxyalkyl, O-alkyl, S-alkyl, N-alkyl, O-alkenyl, S-alkenyl, N-alkenyl, O-alkynyl, S-alkynyl, N-alkynyl, O-alkyl-O-alkyl, alkynyl, wherein the alkyl, alkenyl and alkynyl can be substituted or unsubstituted C1-C10 alkyl or C1-C10 alkenyl and alkynyl.
  • substituent groups at the 2’-position include, but are not limited to, alkaryl, aralkyl, O-alkaryl, and O-aralkyl.
  • the substitution at the 2’-position is a phosphate (see, e.g., Current Protocols in Nucleic Acid Chemistry, 13.1.1-13.1.31, John Wiley & Sons (2003)).
  • the modified 2’-position of the sugar moiety is halo, 2’-O-R’, or 2’-O-COR’, where R’ is an alkyl, alkyloxyalkyl, cycloalkyl, heterocyclyl, aryl, heteroaryl, cycloalkylalkyl, heterocyclylalkyl, arylalkyl, or heteroarylalkyl.
  • R’ is a C1-C4 alkyl.
  • the modified 2’-position is a 2’-O-R’, wherein R’ is alkyloxyalkyl, alkylamine, cyanoalkyl, or -C(O)-alkyl.
  • the 2’-position of the sugar moiety of the nucleoside substrate is -O-R’, wherein R’ is -CH3 or -CH2CH3 or -CH2CH2OCH3.
  • the modified 2’-position is 2’-O-(2-methoxyethyl), 2’-O-allyl, 2’-O-propargyl, 2’-O-ethylamine, 2’-O-cyanoethyl, 2’-O-amine, or 2’-O-acetate ester.
  • a modification at the 2’-position comprises a locked nucleoside.
  • locked nucleosides comprise a biradical linking the C2’ and C4’ of the ribose sugar ring of said nucleoside (also referred to as a “2’-4’ bridge”), which restricts or locks the conformation of the ribose ring (see, e.g., Obika et al., Tetrahedron Letters, 1997, 38(50):8735–8738; Orum et al., Current Pharmaceutical Design, 2008, 14(11):1138–1142).
  • the ribose moiety of the locked nucleotide is in the C3’-endo (beta-D) or C2’-endo (alpha-L) conformation.
  • the bridge is a methylene bridge.
  • the bridge is an ethylene bridge, also referred to as ENA (see, e.g., Morita et al., Bioorg Med Chem Lett., 2002, 12(1):73-6).
  • Other locked nucleosides are described in International patent publication WO2021249993, incorporated by reference herein.
  • other locked nucleosides include, among others, 5’-methyl-LNA, 2’-amino-LNA, alpha-L-LNA, and thio-LNA. In the structures of certain locked nucleosides (structure drawings not reproduced in this text), R is alkyl or acyl, and B refers to a nucleobase.
  • a modification at the 2’-position comprises a reactive moiety or a conjugate moiety, including a conjugate moiety attached via a linker or a linker, as described herein.
  • the modification is at the 3’-position of the sugar moiety.
  • the 3’-modification is on the 3’-terminal nucleoside of the nucleotide donor.
  • the modifications at the 3’-position are similar to those at the 2’-position.
  • substitutions at the 3’-position include, among others, halo (e.g., Cl, F, Br, etc.) or -O-alkyl or 3’-alkoxy (e.g., O-methyl, O-ethyl, etc.).
  • other modifications at the 3’-position include, but are not limited to, allyl, amino, azido, SH, CN, OCN, CF3, OCF3, SCH3, SOCH3, SO2CH3, ONO2, NO2, N3, and NH2.
  • substituent groups at the 3’-position include, among others, O-(C1-C10)alkoxy, alkoxyalkyl, O-alkyl, S-alkyl, N-alkyl, O-alkenyl, S-alkenyl, N-alkenyl, O-alkynyl, S-alkynyl, N-alkynyl, O-alkyl-O-alkyl, alkynyl, wherein the alkyl, alkenyl and alkynyl can be substituted or unsubstituted C1-C10 alkyl or C1-C10 alkenyl and alkynyl.
  • substituent groups at the 3’-position include, but are not limited to, alkaryl, aralkyl, O-alkaryl, and O-aralkyl.
  • the substitution at the 3’-position is a phosphate.
  • the modified 3’-position of the sugar moiety is halo, 3’-O-R’, or 3’-O-COR’, where R’ is an alkyl, alkyloxyalkyl, cycloalkyl, heterocyclyl, aryl, heteroaryl, cycloalkylalkyl, heterocyclylalkyl, arylalkyl, or heteroarylalkyl.
  • R’ is a C1-C4 alkyl.
  • the modified 3’-position is a 3’-O-R’, wherein R’ is alkyloxyalkyl, alkylamine, cyanoalkyl, or -C(O)-alkyl.
  • the 3’-position of the sugar moiety of the nucleoside substrate is -O-R’, wherein R’ is -CH3 or -CH2CH3 or -CH2CH2OCH3.
  • the modified 3’-position is 3’-O-(2-methoxyethyl), 3’-O-allyl, 3’-O-propargyl, 3’-O-ethylamine, 3’-O-cyanoethyl, 3’-O-amine, or 3’-O-acetate ester.
  • the modification at the 3’-position is a reversible or cleavable 3’-blocking group.
  • removal or cleaving of the reversible or cleavable 3’-blocking group results in a free 3’-OH group, which in some embodiments can serve as an acceptor for single-stranded RNA ligase or a terminal nucleotidyl transferase.
  • exemplary reversible or cleavable 3’-blocking groups include, among others, 3’-O-azidomethyl, 3’-O-(2-methoxyethyl), 3’-O-allyl, 3’-O-propargyl, 3’-O-ethylamine, 3’-O-cyanoethyl, 3’-O-amine, 3’-O-acetate ester, 3’-phosphate, 3’-diphosphate, or 3’-triphosphate.
  • the 3’-blocking group is paired with the corresponding deblocking agent used in the deblocking or cleavage of the 3’-blocking group.
  • a modification at the 3’-position comprises a reactive moiety or a conjugate moiety, including a conjugate moiety attached via a linker or a linker, as described herein.
  • the modified sugar moiety comprises an unlocked nucleoside. In some embodiments, in the unlocked nucleoside, the furanosyl ring is opened to give an acyclic structure (structure drawing not reproduced in this text), where B represents the nucleobase.
  • Unlocked nucleosides are described in, among others, International patent publication WO2022/098990 and Snead et al., Molecular Therapy-Nucleic Acids, 2013, 2, e103.
  • As described herein, in some embodiments, where the modification is to the 3’-terminal nucleotide of the polynucleotide acceptor, the 3’-OH group of the nucleoside, or equivalent position thereof, is maintained to act as an acceptor for the ligase reaction.
  • the 5’-phosphate group of the nucleoside, or equivalent position thereof, is maintained to act as a donor for the ligase reaction.
  • the 5’-phosphate group of the polynucleotide donor strand comprises a 5’-phosphorothioate (see, e.g., U.S. Patent No. 6811986, incorporated by reference herein).
  • Modified nucleobases [0239] In some embodiments, the modified nucleotide comprises a modified nucleobase. In some embodiments, modified nucleobase that is capable of hydrogen bonding to form Watson and Crick type base pairing is selected.
  • the nucleobase comprise an inosine nucleoside (i.e., nucleosides comprising a hypoxanthine nucleobase).
  • the modified nucleobase is 5- substituted pyrimidines, 6-azapyrimidines, alkyl or alkynyl substituted pyrimidines, alkyl substituted purines, and N-2. N-6 and O-6 substituted purines.
  • the modified nucleobase is 2-aminopropyladenine.5-hydroxymethyl cytosine, 5-methylcytosine, xanthine, hypoxanthine, 2- aminoadenine, 6-N-methylguanine, 6-N-methyladenine, 2-propyladenine, 2-thiouracil, 2-thiothymine, and 2-thiocytosine.5-propynyl uracil, 5-propynylcytosine.6-azouracil, 6-azocytosine, 6-azothymine.
  • 5-ribosyluracil (pseudouracil), 4-thiouracil, 8-halo purine, 8-amino purine, 8-thio purine, 8-thioalkyl purine, 8-hydroxy purine, 8-aza purine, 5-bromocytosine, 5-trifluoromethylcytosine, 5-halouracil, 5-halocytosine, 7-methylguanine, 7-methyladenine, 2-F-adenine, 2-aminoadenine, 7-deazaguanine, 7-deazaadenine, 3-deazaguanine, 3-deazaadenine, 6-N-benzoyladenine, 2-N-isobutyrylguanine, 4-N-benzoylcytosine, 4-N-benzoyluracil, 5-methyl 4-N-benzoylcytosine, and 5-methyl 4-N-benzoyluracil.
  • modified nucleobases include tricyclic pyrimidines, e.g., 1,3-diazaphenoxazine-2-one, 1,3-diazaphenothiazine-2-one, and 9-(2-aminoethoxy)-1,3-diazaphenoxazine-2-one (G-clamp).
  • the modified nucleobase includes, among others, nucleobases based on 2,4-dihalotoluene and benzimidazole groups.
  • the modified nucleobase is 4-methylbenzimidazole, 2,4-difluorotoluene, 9-methylimidazo[(4,5)-b]pyridine, 2,4-dibromotoluene, benzimidazole, 5-nitrobenzimidazole, 6-nitrobenzimidazole, and 5-nitroindole.
  • the modified nucleobase is 7-azaindole or isocarbostyril (see, e.g., Berdis et al., Front. Chem. 10:1051525).
  • Other modified nucleobases are described in, among others, patent publication WO2021249993.
  • a nucleoside that does not have a nucleobase, also referred to as an abasic nucleoside.
  • the abasic nucleoside is present in the internal portion of an oligonucleotide acceptor.
  • an abasic nucleoside is attached to the 3’- or 5’-terminal end, which is in certain embodiments grouped as a terminal group.
  • the modified nucleobase is present on the 5’-terminal nucleoside of the polynucleotide acceptor or polynucleotide donor, 3’-terminal nucleoside of the polynucleotide acceptor or polynucleotide donor, and/or present on the internal nucleosides of the polynucleotide acceptor or polynucleotide donor, as described herein.
  • the blocks or contiguous stretches of nucleosides in the polynucleotide acceptor or polynucleotide donor have modified nucleobases.
  • the nucleic acid ligase substrate is a double stranded nucleic acid ligase substrate
  • the polynucleotide complementary to the polynucleotide acceptor strand and/or the polynucleotide donor strand comprises one or more modified nucleobases.
  • Modified internucleoside linkages [0245] In some embodiments, the modified nucleotide comprises at least one modified, non-naturally occurring internucleoside linkage.
  • the modified polynucleotide has 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more modified internucleoside linkages. In some embodiments, all of the internucleoside linkages are modified internucleoside linkages. [0246] In some embodiments, the modified internucleoside linkage is a phosphorous containing modified internucleoside linkage.
  • Exemplary phosphorous-containing internucleoside linkages include, among others, phosphotriesters, alkylphosphonates (e.g., methyl phosphonate, ethyl phosphonate, etc.), phosphoramidates, phosphorothioate, and phosphorodithioate.
  • the modified internucleoside linkage is a non-phosphorous containing internucleoside linkage.
  • the modified internucleoside linkage is amide linkage, such as those of glycine nucleosides or nucleoside ⁇ -amino acids (see, e.g., Banerjee et al., Bioconjugate Chem., 2015, 26, 8, 1737–1742).
  • the modified internucleoside linkage provides for a chiral center.
  • a phosphorothioate or alkylphosphonate internucleoside linkage can be in the Rp or Sp stereomeric configuration.
  • the polynucleotide acceptor and/or polynucleotide donor have a mixture of stereoisomers in the internucleoside linkages.
  • the polynucleotide acceptor and/or polynucleotide donor have greater than 50% of the internucleoside linkages as Rp or Sp configuration.
  • the polynucleotide acceptor and/or polynucleotide donor have at least 60%, 70%, 80%, 90%, or greater of Rp or Sp stereomeric configuration.
  • the modified internucleoside linkages are present in the 5’-terminal region of the polynucleotide acceptor and/or polynucleotide donor.
  • At least 1, 2, 3, 4, or 5 modified internucleoside linkages are present at the 5’-terminal region of the polynucleotide acceptor and/or polynucleotide donor. In some embodiments, at least 1 or 2 phosphorothioate internucleoside linkages are present at the 5’-terminal region of the polynucleotide acceptor and/or polynucleotide donor. In some embodiments, the phosphorothioate linkage is a non-bridging phosphorothioate internucleoside linkage.
  • the modified internucleoside linkages are present in the 3’-terminal region of the polynucleotide acceptor or polynucleotide donor. In some embodiments, at least 1, 2, 3, 4, or 5 modified internucleoside linkages are present at the 3’-terminal region of the polynucleotide acceptor and/or polynucleotide donor. In some embodiments, at least 1 or 2 phosphorothioate internucleoside linkages are present at the 3’-terminal region of a polynucleotide acceptor or polynucleotide donor.
  • the modified internucleoside linkages are present in the internal portions of the polynucleotide acceptor or polynucleotide donor.
  • the polynucleotide acceptor and/or polynucleotide donor comprises at least a phosphorothioate internucleoside linkage, where the phosphorothioate linkage is in the Sp configuration, the Rp configuration, or a mixture of Sp and Rp configurations in the nucleotides of the polynucleotide acceptor and/or polynucleotide donor strand.
  • the polynucleotide complementary to the polynucleotide acceptor strand and/or the polynucleotide donor strand comprises one or more modified internucleoside linkages. Terminal groups [0253] In some embodiments, the polynucleotide acceptor and/or polynucleotide donor comprises a terminal group.
  • the polynucleotide complementary to the polynucleotide acceptor strand and/or the polynucleotide donor strand comprises a terminal group.
  • the terminal group is attached to the 5’-OH or 4’-carbon atom of the terminal nucleoside.
  • the terminal group comprises a C-4’ modification of the 5’-terminal nucleoside, including among others, 4’-thio-C2’ modifications, 4’-aminoalkyl, C4’-
  • the 5’-terminal group is a 5’-phosphate modification.
  • the 5'-phosphate modification includes, among others, 5’-C-methyl, particularly S isomer; 5’-(E or Z)-vinylphosphonate, or 5’-methylenephosphonate.
  • the 5’-terminal group comprises an abasic nucleotide attached to the 5’-OH.
  • the 5’-terminal group comprises an inverted abasic nucleotide (5’-5’) attached to the 5’-OH of the 5’-end nucleoside.
  • the terminal group comprises a 3’-terminal group.
  • the 3’-terminal group comprises a 3’-phosphate, which can also function as a reversible blocking group.
  • the 3’-phosphate is modified, such as with 3’-(E or Z)-vinylphosphonate, or 3’-methylenephosphonate.
  • the 3’-terminal group on the nucleotide donor comprises an abasic nucleoside.
  • the 3’-terminal group comprises an inverted abasic nucleotide (3’-3’).
  • Conjugate moiety [0258]
  • the modified nucleoside comprises a conjugate moiety.
  • the polynucleotide acceptor and/or the polynucleotide donor comprises a conjugate moiety.
  • the nucleic acid ligase substrate is a double stranded nucleic acid ligase substrate
  • the polynucleotide complementary to the polynucleotide acceptor strand and/or the polynucleotide donor strand comprises a conjugate moiety.
  • the conjugate moiety includes, among others, carbohydrates (e.g. GalNAc), lipids, sterols, drug substances, hormones, polymers (e.g., polyethylene glycol, etc.), proteins, peptides, toxins (e.g. bacterial toxins, etc.), vitamins (e.g., folate, tocopherol, retinoic acid, etc.), or combinations thereof.
  • the conjugate moiety is used to affect the pharmacokinetics of an oligonucleotide, including for cellular targeting of an oligonucleotide.
  • the conjugate moiety can be attached to the 5’-terminal nucleotide, the 3’-terminal nucleotide, or an internal nucleotide of a ligase substrate.
  • the conjugate moiety is attached to the 2’-position of the sugar moiety of a nucleoside, for example, to the 2’-OH.
  • the conjugate moiety is attached to the 3’-position of the sugar moiety of the nucleoside, for example 3’-OH.
  • the conjugate moiety is attached to the nucleobase, as discussed above (see, e.g., Biscans et al., Nucleic Acids Res. 2019 Feb 20; 47(3): 1082–1096). In some embodiments, the conjugate moiety is attached directly or attached using a linker. [0261] In some embodiments, the conjugate moiety comprises a C6-C22 alkyl, C6-C22 alkenyl, or C6-C22 alkynyl.
  • the conjugate moiety comprises a C6-alkyl, C7-alkyl, C8-alkyl, C9-alkyl, C10-alkyl, C11-alkyl, C12-alkyl, C13-alkyl, C14-alkyl, C15-alkyl, C16-alkyl, C17-alkyl, C18-alkyl, C19-alkyl, C20-alkyl, C21-alkyl, or C22-alkyl.
  • the conjugate moiety comprises a C6-alkenyl, C7-alkenyl, C8-alkenyl, C9-alkenyl, C10-alkenyl, C11-alkenyl, C12-alkenyl, C13-alkenyl, C14-alkenyl, C15-alkenyl, C16-alkenyl, C17-alkenyl, C18-alkenyl, C19-alkenyl, C20-alkenyl, C21-alkenyl, or C22-alkenyl.
  • the conjugate moiety comprises a C6-alkynyl, C7-alkynyl, C8-alkynyl, C9-alkynyl, C10-alkynyl, C11-alkynyl, C12-alkynyl, C13-alkynyl, C14-alkynyl, C15-alkynyl, C16-alkynyl, C17-alkynyl, C18-alkynyl, C19-alkynyl, C20-alkynyl, C21-alkynyl, or C22-alkynyl.
  • the conjugate moiety comprises a heteroalkyl, heteroalkenyl, or heteroalkynyl.
  • the heteroalkyl, heteroalkenyl or heteroalkynyl has one or more carbon atoms replaced with a heteroatom, such as O, S, or N.
  • the conjugate moiety comprises a cycloalkyl or heterocycloalkyl group.
  • the cycloalkyl includes, among others, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, 1-cyclohexenyl, 3-cyclohexenyl, and cycloheptyl.
  • the heterocycloalkyl includes, among others, 1-(1,2,5,6-tetrahydropyridyl), 1-piperidinyl, 2-piperidinyl, 3-piperidinyl, 4-morpholinyl, 3-morpholinyl, tetrahydrofuran-2-yl, tetrahydrofuran-3-yl, tetrahydrothien-2-yl, tetrahydrothien-3-yl, 1-piperazinyl, and 2-piperazinyl.
  • the conjugate moiety comprises an aryl or heteroaryl moiety.
  • the aryl group includes, among others, phenyl, naphthyl, indenyl, biphenyl, phenanthrenyl, naphthacenyl, anthracenyl, fluorenyl, indenyl, and azulenyl.
  • a heteroaryl group includes, among others, pyridyl, furanyl, thienyl, pyrrolyl, oxazolyl, oxadiazolyl, imidazolyl, thiazolyl, isoxazolyl, quinolinyl, pyrazolyl, isothiazolyl, pyridazinyl, pyrimidinyl, pyrazinyl, triazinyl, isoquinolinyl, and indazolyl.
  • the conjugate moiety comprises a cycloalkylalkyl-, heterocycloalkylalkyl-, arylalkyl-, heteroarylalkyl-, cycloalkylheteroalkyl-, heterocycloalkylheteroalkyl-, arylheteroalkyl-, heteroarylheteroalkyl-, cycloalkylalkenyl-, heterocycloalkylalkenyl-, arylalkenyl-, heteroarylalkenyl-, cycloalkylheteroalkenyl-, heterocycloalkylheteroalkenyl-, arylheteroalkenyl-, or heteroarylheteroalkenyl-.
  • the conjugate moiety comprises a lipid or lipophilic moiety, for example a fatty acid.
  • the fatty acid comprises a saturated fatty acid, unsaturated fatty acid, or a polyunsaturated fatty acid.
  • the fatty acid comprises caprylic acid, lauric acid, myristic acid, palmitic acid, stearic acid, arachidic acid, behenic acid, oleic acid, elaidic acid, cis-vaccenic acid, trans-vaccenic acid, linoleic acid, alpha-linoleic acid, gamma-linoleic acid, arachidonic acid, eicosapentaenoic acid, decanoic acid, docosahexaenoic acid (DHA),
  • the conjugate moiety comprises a sterol.
  • the sterol comprises cholesterol, alpha-cholesterol, cholesterol ester (e.g., cholesteryl palmitate, etc.), cholesterol sulfate, phytosterol, cholic acid, or lithocholic acid.
  • the conjugate moiety comprises a phospholipid.
  • the phospholipid comprises phosphatidic acid, phosphatidylethanolamine, phosphatidylcholine, phosphatidylinositol, phosphatidylserine, or a sphingolipid.
  • the conjugate moiety comprises a carbohydrate, particularly a carbohydrate moiety acting as a ligand for a cellular receptor for cellular targeting of the oligonucleotide.
  • the carbohydrate moiety comprises galactose or galactose derivatives.
  • the carbohydrate moiety is attached to the nucleoside via a linker.
  • exemplary carbohydrates that can be used include the following.
  • the oligonucleotide acceptor and/or nucleotide donor may be conjugated to at least one conjugate moiety comprising at least one N-acetylgalactosamine (GalNAc) moiety.
  • the conjugate moiety is a monovalent, divalent, trivalent or tetravalent, GalNAc.
  • the GalNAc moiety has the following structure, where L is a linker, and .
  • the W is the 2’-OH of the sugar moiety of a GalNAc moiety is 2’-position of a nucleoside, such as contiguous nucleotides in an oligonucleotide (see, e.g., WO2024040041).
  • the conjugate moiety is a trivalent GalNAc. Trivalent N-acetylgalactosamine conjugate moieties are described in, for example, WO 2014/076196, WO 2014/207232 and WO 2014/179620.
  • the term “trivalent GalNAc” refers to a residue comprising three N-acetylgalactosamine moieties, typically attached via a linker.
  • the conjugate moiety is L96. Exemplary trivalent GalNAc conjugate moiety is depicted below:
  • the conjugate moiety comprises a reporter molecule.
  • reporter molecules include, among others, fluorescent moieties, such as fluorescein and fluorescein dyes (e.g., fluorescein isothiocyanate or FITC, naphthofluorescein, 4′,5′-dichloro-2′,7′-dimethoxy-fluorescein, 6-carboxyfluorescein or FAM), carbocyanine, merocyanine, styryl dyes, oxonol dyes, phycoerythrin, erythrosin, eosin, rhodamine dyes (e.g., carboxytetramethylrhodamine or TAMRA, carboxyrhodamine 6G, carboxy-X-rhodamine (ROX), lissamine rhodamine B, rhodamine 6G, rh
  • BODIPY dyes e.g., BODIPY FL, BODIPY R6G, BODIPY TMR, BODIPY TR, BODIPY 530/550, BODIPY 558/568, BODIPY 564/570, BODIPY 576/589, BODIPY 581/591, BODIPY 630/650, BODIPY 650/665
  • IRDyes e.g., IRD40, IRD 700, IRD 800.
  • the reporter moiety is a chemiluminescent moiety, for example acridinium esters, ruthenium derivatives (e.g., tris(2,2′-bipyridyl) ruthenium), and dioxetanes.
  • the conjugate moiety comprises an affinity or capture tag.
  • affinity or capture tag includes, among others, biotin, desthiobiotin, digoxigenin, 3-amino-3-deoxydigoxigenin, and a hapten (e.g., dinitrophenol, Alexa Fluor 40, Alexa Fluor 488, dansyl, Lucifer yellow, Oregon Green 488, fluorescein).
  • the conjugate moiety comprises a peptide.
  • the peptide comprises a cellular targeting peptide and/or cell penetration peptide (CPP) for enhancing cellular delivery of a conjugate-modified polynucleotide.
  • CPP cell penetration peptide
  • the cell penetrating peptide is attached via a linker, including a cleavable linker.
  • Cell penetrating peptides include among others, TAT, penetratin, MAP, transportan/TP10, VP22, polyarginine, MPG, Pep-1, pVEC, YTA2, YTA4, M918, and CADY.
  • the conjugate moiety comprises an RGD (Arg-Gly-Asp) peptide. Sequences of some cell penetrating peptides are described in Copolovici et al., 2014, 8(3):1972–1994, incorporated by reference herein.
  • the peptide can be attached using a thiol group on the 5’-phosphate of a polynucleotide or oligonucleotide.
  • the method herein is implemented by a computer.
  • the present disclosure provides a computer implemented method for predicting or modeling a reaction condition activity profile of a ligase for a ligase substrate, comprising receiving ligase activity data of a ligase for a ligase substrate under different or variable reaction conditions; applying Gaussian Process Regression using the ligase activity data for the ligase substrate
  • the present disclosure further provides a system for predicting or modeling the reaction condition activity profile of one or more ligases for a ligase substrate, comprising: one or more processors, a memory storing instructions configured to, when executed by the processor, cause the processor to input or receive activity data of a ligase for a ligase substrate for different or variable reaction conditions; process the reaction condition activity data using a Gaussian Process Regression; and generate an output of a predicted or modeled reaction condition activity profile of the ligase for the ligase substrate for the different reaction conditions.
  • the present disclosure further provides a computer readable storage medium for predicting or modeling the reaction condition activity profile of a ligase for a ligase substrate, comprising one or more programmed instructions configured to direct one or more processors to: input or receive activity data of a ligase for a ligase substrate under different or variable reaction conditions; apply Gaussian Process Regression (GPR) on the activity data obtained for the different or variable reaction conditions; and generate an output of the GPR predicted or modeled reaction condition activity profile of the ligase for the ligase substrate.
  • GPR Gaussian Process Regression
  • the activity data for the different or variable reaction conditions is obtained by varying two or more of: divalent metal concentration, double stranded ligase substrate(s) concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration.
  • the different or variable reaction conditions comprise varying three or more of: divalent metal concentration, ligase substrate(s) concentration, cofactor or co-substrate NTP (or NAD+) concentration, buffer concentration, and salt concentration, as described herein.
  • the different or variable reaction conditions comprise at least variable divalent metal concentration, variable cofactor or co-substrate NTP concentration, and variable ligase substrate concentration.
  • a computer system is used for implementing one or more of the methods described herein.
  • the computer system comprises one or more processors, a memory, a terminal interface, input/output device interface, a storage interface, and in some embodiments, a network interface.
  • the computer system can be an electronic device of a user or a remotely located computer system.
  • the computer device is a mobile electronic device.
  • the computer system contains one or more general purpose programmable processing units.
  • the computer system may have a single processor or may have multiple processors. Each processor may execute the instructions stored in one or more memory modules.
  • the memory can include computer readable media, such as volatile memory, random access memory, and/or cache memory.
  • the storage system can be for reading and writing from a non-removable storage medium, e.g., a hard drive or an optical disk.
  • the memory can be a flash memory.
  • the computer system uses a communication interface, and peripheral devices are in communication with the processing units.
  • the memory may store one or more programs, each having at least a program module stored in the memory.
  • the computer readable program instructions are for applying the Gaussian Process Regression to the activity data obtained for the different or variable reaction conditions inputted into the computer system.
  • the programmed instructions or program code for carrying out operations of the present invention may be written in any programming language, such as Java, Python, Julia, or “C”; a scripting language such as Perl or VBS; or another language such as R or MATLAB.
  • the ligase activity data is inputted and stored in a memory, where the data is processed by one or more of the computer programs, e.g., for the Gaussian Process Regression algorithms.
  • the computer system includes an output interface and a peripheral device for receiving and/or processing the output of the predicted reaction condition activity profile of a ligase for a ligase substrate.
  • the peripheral device is an electronic display or printer connected via a user interface.
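The input step of the computer implemented method described above (receiving ligase activity data obtained under different or variable reaction conditions) might look roughly like the following Python sketch. The CSV layout, the column names (`mgcl2_mM`, `atp_mM`, `substrate_mM`, `activity`), and the numeric values are hypothetical and are not taken from the disclosure; the check that at least two factors vary reflects the "two or more" language above.

```python
import csv
import io

# Hypothetical column names for three reaction-condition factors.
FACTORS = ("mgcl2_mM", "atp_mM", "substrate_mM")

def load_activity_data(text):
    """Parse CSV text into condition tuples and measured activities."""
    reader = csv.DictReader(io.StringIO(text))
    conditions, activities = [], []
    for row in reader:
        conditions.append(tuple(float(row[name]) for name in FACTORS))
        activities.append(float(row["activity"]))
    return conditions, activities

def varied_factors(conditions):
    """Return the names of factors that take more than one value."""
    return [
        name
        for i, name in enumerate(FACTORS)
        if len({c[i] for c in conditions}) > 1
    ]

# Illustrative (made-up) activity measurements.
raw = """mgcl2_mM,atp_mM,substrate_mM,activity
5,0.5,0.2,12.0
10,0.5,0.2,30.5
10,1.0,0.2,41.0
10,1.0,0.4,38.2
"""

conditions, activities = load_activity_data(raw)
varied = varied_factors(conditions)
if len(varied) < 2:
    raise ValueError("activity data should vary two or more reaction conditions")
print(varied)
```

The parsed `conditions` and `activities` lists are the kind of input a regression step would then consume.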

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Microbiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Medicinal Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present disclosure provides a method of predicting a reaction condition activity profile of a ligase for a ligase substrate by Gaussian Process Regression using activity data obtained for different reaction conditions. Further provided is a method of screening a plurality of ligases for activity on a ligase substrate for identifying ligases active on the ligase substrate.

Description

Docket No. CX10-278WO1 LIGATION OF POLYNUCLEOTIDES BY LIGASES AND SCREENING METHODS THEREOF CROSS-REFERENCE TO RELATED APPLICATIONS [0001] This application claims the benefit of U.S. Provisional Application No. 63/646,753, filed May 13, 2024, and U.S. Provisional Application No. 63/718,893, filed November 11, 2024, all of which are incorporated by reference herein. BACKGROUND [0002] Synthesis of polynucleotides by ligation of polynucleotide fragments using a nucleic acid ligase is a commonly used technique in molecular biology. Use of nucleic acid ligases for ligating modified polynucleotides, such as those with modifications at the 2’-position of the sugar moiety, can be used to synthesize modified polynucleotides, including modified siRNAs from shorter modified polynucleotide fragments (see, e.g., Paul et al., ACS Chem Biol., 2023, 18(10):2183-2187). [0003] However, studies with 2’-O-methylated RNA indicate a decrease in ligation efficiency of RNA ligases (e.g., T4 RNA ligase 1 and T4 RNA ligase 2) for ligation of adaptors to 2’-O-methylated RNA (Munafo et al., RNA, 2010, 16(12):2537-2552). For T4 RNA ligase 1, buffer conditions, ligation enhancers, incubation time, and temperature conditions may have significant differential effects on ligation efficiency of 2’-O-methylated RNAs. Moreover, different ligases appear to display differences in sequence bias and tolerance to mismatches in the ligation reaction, indicating that the choice of ligase can be an important factor in the application of the ligase to polynucleotide synthesis. [0004] The differential effects of ligation conditions as well as the nature of substrates on ligation efficiency of nucleic acid ligases and the variability in activity of different ligases indicate uncertainty or unpredictability in the activity of a nucleic acid ligase with regard to reaction conditions and substrate. 
SUMMARY [0005] The present disclosure provides a method of predicting a reaction condition activity profile of a polynucleotide ligase for a ligase substrate by using ligase activity data obtained for different reaction conditions and applying Gaussian Process Regression to ligase activity data to generate a predicted reaction condition activity profile. The predicted reaction condition activity profile for a polynucleotide ligase allows selection of reaction conditions best fit for the ligase for the ligase substrate. [0006] In some embodiments, a method of predicting a reaction condition activity profile of a ligase for a ligase substrate comprises: obtaining activity data of a polynucleotide ligase for a substrate under different reaction conditions; applying Gaussian Process Regression (GPR) on the activity data obtained for the different reaction conditions; and generating an output of the GPR-predicted reaction condition activity profile of the ligase for the ligase substrate. [0007] In some embodiments, the ligase is a single stranded DNA ligase, a double stranded DNA ligase, a single stranded RNA ligase, or a double stranded RNA ligase. [0008] In some embodiments, the ligase is a double stranded RNA ligase and the method of predicting reaction condition activity profile of a double stranded RNA ligase for a double stranded RNA ligase substrate comprises: obtaining activity data of a double stranded RNA ligase for a double stranded RNA ligase substrate under different reaction conditions; applying Gaussian Process Regression (GPR) on the activity data obtained for the different reaction conditions; and generating an output of the GPR-predicted reaction condition activity profile of the double stranded RNA ligase for the double stranded RNA ligase substrate. 
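The three steps in paragraphs [0006] and [0008] (obtain activity data, apply GPR, output a predicted profile) can be illustrated with a minimal standard-library Python sketch. It assumes a squared-exponential (RBF) kernel with a fixed length scale and a constant mean; the disclosure does not specify a kernel or hyperparameters at this point, so those choices, the function names, and the normalized example data are all hypothetical.

```python
import math

def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two condition vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return variance * math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(matrix):
    """Return lower-triangular L such that L * L^T equals matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def cholesky_solve(lower, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(lower[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / lower[i][i]
    return x

def gpr_fit(X, y, noise=1e-6, **kernel_args):
    """Fit a GP with constant mean; returns what prediction needs."""
    K = [
        [rbf_kernel(a, b, **kernel_args) + (noise if i == j else 0.0)
         for j, b in enumerate(X)]
        for i, a in enumerate(X)
    ]
    lower = cholesky(K)
    mean_y = sum(y) / len(y)
    alpha = cholesky_solve(lower, [v - mean_y for v in y])
    return {"X": X, "alpha": alpha, "lower": lower,
            "mean": mean_y, "kernel_args": kernel_args}

def gpr_predict(model, x_new):
    """Posterior mean and variance of activity at a new condition."""
    kargs = model["kernel_args"]
    k_star = [rbf_kernel(x_new, x, **kargs) for x in model["X"]]
    mean = model["mean"] + sum(k * a for k, a in zip(k_star, model["alpha"]))
    v = cholesky_solve(model["lower"], k_star)
    var = rbf_kernel(x_new, x_new, **kargs) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)

# Made-up data: (MgCl2, ATP, substrate) scaled to [0, 1]; activity in percent.
X = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
     (0.0, 0.0, 1.0), (1.0, 1.0, 1.0), (0.5, 0.5, 0.5)]
y = [10.0, 35.0, 20.0, 15.0, 40.0, 80.0]

model = gpr_fit(X, y, length_scale=0.5)
center_mean, center_var = gpr_predict(model, (0.5, 0.5, 0.5))
far_mean, far_var = gpr_predict(model, (5.0, 5.0, 5.0))
print(center_mean, center_var, far_mean, far_var)
```

At a measured condition the posterior mean reproduces the measurement with near-zero variance, while far from any data the prediction falls back to the mean of the observations with variance close to the prior. That behavior is what makes the predicted variance useful for judging how far a profile can be trusted.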
[0009] In some embodiments, the different reaction conditions comprise varying two or more, or three or more of: divalent metal concentration, ligase substrate concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration. [0010] In some embodiments, the different reaction conditions comprise at least variable divalent metal concentration, variable cofactor or co-substrate NTP concentration, and variable double stranded RNA ligase substrate concentration. [0011] In some embodiments, the output of predicted reaction condition activity profile is a contour plot of the different reaction condition variables. In some embodiments, the output contour plot is a three or four dimensional surface plot of predicted ligase activity for the different reaction condition variables. [0012] In some embodiments, the ligase substrate comprises a modified nucleoside and/or internucleoside linkage. [0013] In some embodiments, the polynucleotide acceptor of the ligase substrate comprises a modified nucleoside and/or internucleoside linkage. In some embodiments, at least the 3’-terminal nucleoside of the polynucleotide acceptor comprises a modified nucleoside. [0014] In some embodiments, the polynucleotide donor comprises a modified nucleoside and/or modified internucleoside linkage. In some embodiments, at least the 5’-terminal nucleoside of the polynucleotide donor comprises a modified nucleoside. 
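The contour or surface output of paragraph [0011] amounts to evaluating the predicted activity over a grid of the varied conditions. The Python sketch below builds such a grid over three factors and extracts a two-dimensional slice suitable for a contour plot. The `predicted_activity` function is a hypothetical stand-in for a fitted GPR posterior mean, and the concentration levels are invented for illustration.

```python
import math
from itertools import product

def predicted_activity(mg_mm, atp_mm, substrate_mm):
    """Hypothetical stand-in for a fitted GPR mean (not from the source)."""
    return 100.0 * math.exp(
        -((mg_mm - 10.0) / 5.0) ** 2
        - ((atp_mm - 1.0) / 0.5) ** 2
        - ((substrate_mm - 0.5) / 0.4) ** 2
    )

# Invented concentration levels (mM) for each varied condition.
mg_levels = [2.0, 6.0, 10.0, 14.0, 18.0]
atp_levels = [0.25, 0.5, 1.0, 2.0]
substrate_levels = [0.1, 0.5, 1.0]

# Full grid of predicted activity over all condition combinations.
grid = {
    cond: predicted_activity(*cond)
    for cond in product(mg_levels, atp_levels, substrate_levels)
}

# Best-fit condition: the grid point with the highest predicted activity.
best_condition = max(grid, key=grid.get)

# 2-D slice at fixed substrate concentration: rows are MgCl2, columns are ATP.
fixed_substrate = best_condition[2]
contour_slice = [
    [grid[(mg, atp, fixed_substrate)] for atp in atp_levels]
    for mg in mg_levels
]

print(best_condition, round(grid[best_condition], 1))
```

Each slice like `contour_slice` can be handed to a plotting library as one panel; repeating the slice at other substrate levels gives the fourth dimension.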
[0015] In some embodiments, the present disclosure further provides a method of screening polynucleotide ligases for activity on a ligase substrate, comprising: (a) contacting a plurality of different polynucleotide ligases with a ligase substrate under a first reaction condition; (b) selecting ligases with activity on the ligase substrate under the first reaction condition; (c) predicting the reaction condition activity profile for each of the selected ligases on the ligase substrate; and (d) retesting activity of the selected ligase for the ligase substrate under best-fit reaction conditions for the ligase as determined from the predicted reaction condition activity profile, to identify the ligases having optimal activity from the screened ligases for the ligase substrate. [0016] In some embodiments, for the screening method, the ligase is a single stranded DNA ligase, a double stranded DNA ligase, a single stranded RNA ligase, or a double stranded RNA ligase. [0017] In some embodiments, the ligase is a double stranded RNA ligase and the method of screening double stranded RNA ligases for activity on a double stranded RNA ligase substrate comprises: (a) contacting a plurality of different double stranded RNA ligases with a double stranded RNA ligase substrate under a first reaction condition; (b) selecting ligases with activity on the ligase substrate under the first reaction condition; (c) predicting a reaction condition activity profile for each of the selected ligases on the ligase substrate; and (d) retesting activity of the selected ligase for the ligase substrate under best-fit reaction conditions for the ligase as determined from the predicted reaction condition activity profile, to identify the ligases having optimal activity from the screened ligases for the ligase substrate. 
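The screening steps (a) through (d) can be sketched as a small selection routine in Python. This is only an outline under stated assumptions: the activity threshold, the ligase names, and all numbers are hypothetical, and the predicted profiles are supplied as plain dictionaries rather than produced by an actual regression. Step (d) is experimental, so the code stops at proposing which condition to retest for each selected ligase.

```python
def screen_ligases(first_pass, profiles, threshold):
    """Pick active ligases, then propose a best-fit condition for each.

    first_pass: ligase name -> activity under the first reaction condition
    profiles:   ligase name -> {condition tuple: predicted activity}
    threshold:  hypothetical minimum first-pass activity to keep a ligase
    """
    # Step (b): keep ligases showing activity under the first condition.
    selected = [name for name, act in first_pass.items() if act >= threshold]

    # Steps (c) and (d): take each selected ligase's predicted profile and
    # choose the condition with the highest predicted activity for retesting.
    proposals = []
    for name in selected:
        condition, activity = max(profiles[name].items(), key=lambda kv: kv[1])
        proposals.append((name, condition, activity))

    # Rank candidates by predicted activity at their own best-fit condition.
    return sorted(proposals, key=lambda p: p[2], reverse=True)

# Made-up first-pass activities (step (a) results) for three ligases.
first_pass = {"ligase_A": 42.0, "ligase_B": 3.0, "ligase_C": 18.0}

# Made-up predicted profiles keyed by (MgCl2 mM, ATP mM, substrate mM).
profiles = {
    "ligase_A": {(10.0, 1.0, 0.5): 71.0, (5.0, 0.5, 0.5): 55.0},
    "ligase_C": {(10.0, 1.0, 0.5): 40.0, (2.0, 2.0, 0.5): 83.0},
}

ranked = screen_ligases(first_pass, profiles, threshold=10.0)
for name, condition, activity in ranked:
    print(name, condition, activity)
```

In this invented example the ligase that looked weaker under the first condition ranks first once each ligase is compared at its own best-fit condition, which is the situation the retesting step is meant to catch.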
[0018] In some embodiments, the predicted reaction condition activity profile for each ligase is determined by applying Gaussian Process Regression (GPR) on ligase activity data obtained under different reaction conditions. [0019] In some embodiments, the different reaction conditions comprise varying two or more, or three or more of: divalent metal concentration, ligase substrate concentration, cofactor or co-substrate NTP concentration, buffer concentration, salt concentration, and ligase reaction temperature. [0020] In some embodiments, the different reaction conditions comprise at least variable divalent metal concentration, variable cofactor or co-substrate NTP concentration, and variable ligase substrate concentration. [0021] In another aspect, the present disclosure also provides a computer implemented method for predicting a reaction condition activity profile of a ligase for a ligase substrate, comprising inputting or receiving ligase activity data of a ligase for a ligase substrate obtained under different reaction conditions; applying a Gaussian Process Regression to the activity data of the ligase for the ligase substrate under the different reaction conditions; and generating an output of the predicted reaction condition activity profile of the ligase for the ligase substrate. [0022] In another aspect, the present disclosure provides a system for predicting a reaction condition activity profile of a ligase for a ligase substrate, comprising one or more processors and a memory storing instructions configured to, when executed by the processor, cause the processor to input or receive ligase activity data of a ligase for a ligase substrate for different reaction conditions; process the reaction condition activity data by applying Gaussian Process Regression; and generate an output of a predicted reaction condition activity profile of the ligase for the ligase substrate for the different reaction conditions. 
[0023] In some embodiments, the present disclosure provides a computer readable storage medium for predicting the reaction condition activity profile of a ligase for a ligase substrate, comprising one or more programmed instructions configured to direct one or more processors to: input or receive activity data of a ligase for a ligase substrate obtained under different reaction conditions; apply Gaussian Process Regression (GPR) on the activity data obtained for the different reaction conditions; and generate an output of the GPR predicted reaction condition activity profile of the ligase for the ligase substrate for the different reaction conditions. [0024] In some embodiments, for the computer implemented method, system, or computer readable storage medium, the different reaction conditions comprise varying two or more, or three or more of: divalent metal concentration, ligase substrate concentration, cofactor or co-substrate NTP concentration, buffer concentration, salt concentration, and ligase reaction temperature. [0025] In some embodiments, the different reaction conditions comprise varying three or more of: divalent metal concentration, ligase substrate(s) concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration, as described herein. BRIEF DESCRIPTION OF THE DRAWINGS [0026] FIG. 1 shows the flow diagram of the input data of activity for a double stranded RNA ligase, application of Gaussian Process Regression to the ligase activity data, and the output of the predicted reaction condition activity profile. [0027] FIG. 2 shows a surface plot of predicted double stranded RNA ligase activity for variations in ATP, MgCl2, and ligase substrate concentrations for two different engineered double stranded RNA ligases, Panel A and Panel B. 
DETAILED DESCRIPTION
[0028] In the present disclosure, it is shown that different double stranded RNA ligases exhibit varying ligation efficiencies on a double stranded RNA ligase substrate and that each double stranded ligase appears to have a different defined reaction condition activity profile for a specified double stranded ligase substrate. Thus, the reaction conditions favored by one double stranded RNA ligase can be different from the reaction conditions favored by another double stranded RNA ligase for the same double stranded RNA ligase substrate. The present disclosure provides a method of predicting the reaction condition activity profile of a ligase for a ligase substrate, which can be used to define the best-fit reaction conditions for a particular ligase for a defined ligase substrate.
Abbreviations and Definitions
[0029] In reference to the present disclosure, the technical and scientific terms used in the descriptions herein will have the meanings commonly understood by one of ordinary skill in the art, unless specifically defined otherwise. Accordingly, the following terms are intended to have the following meanings.
[0030] As used herein, the singular forms “a”, “an” and “the” include plural referents unless the context clearly indicates otherwise. Thus, for example, reference to “a polypeptide” includes more than one polypeptide.
[0031] Similarly, “comprise,” “comprises,” “comprising,” “include,” “includes,” and “including” are interchangeable and not intended to be limiting. Thus, as used herein, the term “comprising” and its cognates are used in their inclusive sense (i.e., equivalent to the term “including” and its corresponding cognates).
[0032] It is to be further understood that where descriptions of various embodiments use the term “comprising,” those skilled in the art would understand that in some specific instances, an embodiment can be alternatively described using language “consisting essentially of” or “consisting of.”
[0033] “About” means an acceptable error for a particular value. In some instances, “about” means within 0.05%, 0.5%, 1.0%, or 2.0% of a given value or range. In some instances, “about” means within 1, 2, 3, or 4 standard deviations of a given value.
[0034] “EC” number refers to the Enzyme Nomenclature of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB). The IUBMB biochemical classification is a numerical classification system for enzymes based on the chemical reactions they catalyze.
[0035] “Protein,” “polypeptide,” and “peptide” are used interchangeably to denote a polymer of at least two amino acids covalently linked by an amide bond, regardless of length or post-translational modification (e.g., glycosylation or phosphorylation).
[0036] “DNA ligase” refers to an enzyme that covalently joins the 5’-phosphoryl termini (“donor”) and 3’-hydroxyl termini (“acceptor”) of DNA to each other. DNA ligases can be grouped into two families based on cofactor/co-substrate requirements: ATP-dependent ligases and NAD+-dependent ligases. DNA ligases of eukaryotic and archaeal organisms are generally ATP-dependent. DNA ligases of eubacterial origin are generally NAD+-dependent. DNA ligases include enzymes within the general class of EC 6.5.1.
[0037] “RNA ligase” refers to an enzyme that covalently joins the 5’-phosphoryl termini (donor) of RNA or DNA to the 3’-hydroxyl termini (acceptor) of RNA or DNA. Families of known RNA ligases include RNA ligase 1, also referred to as single-stranded RNA ligase or ssRNA ligase, which catalyzes the covalent joining of single-stranded 5’-phosphoryl termini of RNA or DNA to single-stranded 3’-hydroxyl termini of RNA or DNA. RNA ligase 2, also referred to as double stranded RNA ligase or dsRNA ligase, catalyzes the covalent joining of a 3’-hydroxyl terminus of RNA to a 5’-phosphorylated RNA or DNA but shows preference for double stranded substrates.
In some embodiments, RNA ligases include those enzymes classified in EC 6.5.1.3. It is to be understood that the ligation reaction is not limited to naturally occurring RNA and DNA substrates and includes nucleotide substrates that contain modified nucleotides and/or nucleotide analogs.
[0038] “Polynucleotide,” “nucleic acid,” or “oligonucleotide” is used herein to denote a polymer comprising at least two nucleotides where the nucleotides are either deoxyribonucleotides or ribonucleotides or mixtures of deoxyribonucleotides and ribonucleotides. In some embodiments, the abbreviations used for genetically encoding nucleosides are conventional and are as follows: adenosine (A); guanosine (G); cytidine (C); thymidine (T); and uridine (U). Unless specifically delineated, the abbreviated nucleosides may be either ribonucleosides or 2’-deoxyribonucleosides. The nucleosides may be specified as being either ribonucleosides or 2’-deoxyribonucleosides on an individual basis or on an aggregate basis. When polynucleotide, nucleic acid, or oligonucleotide sequences are presented as a string of one-letter abbreviations, the sequences are presented in the 5’ to 3’ direction in accordance with common convention, and the phosphates are not indicated. The term “DNA” refers to deoxyribonucleic acid. The term “RNA” refers to ribonucleic acid. The polynucleotide or nucleic acid may be single-stranded or double-stranded, or may include both single-stranded regions and double-stranded regions.
[0039] In some embodiments, the terms “polynucleotide,” “nucleic acid” and “oligonucleotide” encompass polynucleotide or nucleic acid or oligonucleotide analogs or modified polynucleotide or nucleic acid or oligonucleotide, which include, among others, nucleosides linked together via other than standard phosphodiester linkages, such as non-standard linkages of phosphoramidates, phosphorothioates, amide linkages, etc.; nucleosides with modified and/or synthetic nucleobases, for example inosine, xanthine, hypoxanthine, etc.; nucleosides with modified sugar residues, such as 2’-O-alkyl, 2’-halo, 2’,3’-dideoxy, 2’-halo-2’-deoxy, β-D-ribo-LNA, α-L-ribo-LNA (e.g., locked nucleic acids), etc.; and/or 5’-phosphate analogs, including, among others, phosphorothioate, phosphoacetate, phosphoramidate, monomethylphosphate, methylphosphonate, or phosphonocarboxylate.
[0040] “Nucleobase” refers to an unmodified nucleobase or a modified nucleobase. As used herein, in some embodiments, an “unmodified nucleobase” is adenine (A), thymine (T), cytosine (C), uracil (U), or guanine (G). A “modified nucleobase” refers to a group of atoms other than unmodified A, T, C, U, or G capable of pairing with at least one unmodified nucleobase.
[0041] “Nucleoside” refers to a compound comprising a nucleobase and a sugar moiety. The nucleobase and sugar moiety are each, independently, unmodified or modified.
[0042] “Internucleoside linkage” refers to a linkage that covalently couples two nucleosides together. In some embodiments, internucleoside linkages covalently couple adjacent nucleosides together, typically forming a bond between the sugar moieties of the adjacent nucleosides. Non-limiting examples of internucleoside linkages include phosphodiester -O-P(O)2-O- linkages and modified internucleoside linkages, such as phosphorothioate -O-P(O,S)-O- and phosphorodithioate -O-P(S)2-O-, as further described herein.
[0043] “Modified oligonucleotide” or “modified polynucleotide” refers to an oligonucleotide or polynucleotide which contains at least one modified internucleoside linkage and/or a modified nucleoside, or a modified terminal group.
[0044] “Modified nucleotide” refers to a nucleotide (e.g., NMP, NDP, NTP) in which at least one of the phosphates is a modified phosphate group and/or the nucleoside is a modified nucleoside.
[0045] “Modified nucleoside” or “nucleoside modification” refers to a nucleoside modified as compared to the equivalent DNA or RNA nucleoside by the introduction of one or more modifications of the sugar moiety or the nucleobase. The modified nucleoside comprises a modified nucleobase and/or a modified sugar residue. The term “modified nucleoside” may also be used herein interchangeably with the term “nucleoside analogue.” Nucleosides with an unmodified DNA or RNA sugar moiety are termed DNA or RNA nucleosides herein. Nucleosides with modifications in the nucleobase of the DNA or RNA nucleoside are still generally termed DNA or RNA if they allow Watson-Crick base pairing.
[0046] “Modified internucleoside linkage” refers to a linkage other than a phosphodiester (PO) linkage that covalently connects two nucleosides together. In some embodiments, an exemplary modified internucleoside linkage is a phosphorothioate or phosphorodithioate internucleoside linkage. Other modified phosphorus-containing internucleoside linkages include phosphotriesters, methylphosphonates, and phosphoramidates (P-NH2). (See, e.g., Clave et al., RSC Chem Biol., 2021 2(1): 94–150).
In some embodiments, the modified internucleoside linkage is a non-phosphorus containing internucleoside linkage, including but not limited to methylenemethylimino (-CH2-N(CH3)-O-CH2-), thiodiester, thionocarbamate (-O-C(=O)(NH)-S-); siloxane (-O-SiH2-O-); N,N’-dimethylhydrazine (-CH2-N(CH3)-N(CH3)-); MMI (3'-CH2-N(CH3)-O-5'), amide-3 (3'-CH2-C(=O)-N(H)-5'), amide-4 (3'-CH2-N(H)-C(=O)-5'), formacetal (3'-O-CH2-O-5'), methoxypropyl, and thioformacetal (3’-S-CH2-O-5'). In some embodiments, internucleoside linkages having a chiral atom can be prepared as a mixture of the stereoisomers, or as separate stereoisomers.
[0047] “Phosphorothioate internucleoside linkage” refers to an internucleoside linkage in which one of the oxygen atoms in a phosphodiester linkage is replaced with a sulfur atom. In some embodiments, a phosphorothioate linkage may be represented as -O-P(O,S)-O-, wherein one of the non-bridging oxygen atoms is replaced with a sulfur atom. Phosphorothioate internucleoside linkages are chiral (see, for example, Jahns et al. 2022, Nucleic Acids Research Vol. 50, No. 3, 1221-1240), with right-handed (Rp) and left-handed (Sp) isomers. In some embodiments, the Rp diastereomer may be referred to as an R-PS internucleoside linkage or an srP internucleoside linkage. The Sp diastereomer may be referred to as an S-PS internucleoside linkage or ssP internucleoside linkage. In some embodiments, the oligonucleotide comprises one or more srP internucleoside linkages. In some embodiments, the oligonucleotide comprises one or more ssP internucleoside linkages. Where the chirality of a phosphorothioate internucleoside linkage is not specified, that phosphorothioate internucleoside linkage may be either an srP linkage or an ssP linkage.
[0048] “Non-bridging phosphorothioate internucleoside linkage” refers to a phosphorothioate internucleoside linkage in which the sulfur atom attached to the phosphorus atom is in place of a non-bridging oxygen atom.
[0049] “Non-bridging phosphorodithioate internucleoside linkage” refers to a modified internucleoside linkage which is a non-bridging phosphorodithioate internucleoside linkage. A non-bridging phosphorodithioate internucleoside linkage has two identical sulfur atoms attached to the phosphorus atom, achieved by replacing the non-bridging oxygen atom in the phosphorothioate linkage with a sulfur atom.
[0050] “Abasic sugar moiety” refers to a sugar moiety of a nucleoside that is not attached to a nucleobase. In some embodiments, such abasic sugar moieties are referred to as “abasic nucleoside.”
[0051] “Inverted nucleoside” refers to a nucleotide having a 3’ to 3’ and/or 5’ to 5’ internucleoside linkage. Similarly, an “inverted sugar moiety” refers to the sugar moiety of an inverted nucleoside or an abasic sugar moiety having a 3’ to 3’ and/or 5’ to 5’ internucleoside linkage.
[0052] “LNA nucleoside” or “locked nucleoside” refers to a 2'-modified nucleoside which comprises a biradical linking the C2' and C4' of the ribose sugar ring of said nucleoside (also referred to as a "2'-4' bridge"), which restricts or locks the conformation of the ribose ring. These nucleosides are also termed bridged nucleic acid or bicyclic nucleic acid (BNA) in the literature. The locking of the conformation of the ribose is associated with an enhanced affinity of hybridization (duplex stabilization) when the LNA is incorporated into an oligonucleotide for a complementary RNA or DNA molecule. This can be routinely determined by measuring the melting temperature of the oligonucleotide/complement duplex.
[0053] Non-limiting, exemplary LNA nucleosides are disclosed in WO 99/014226, WO 00/66604, WO 98/039352, WO 2004/046160, WO 00/047599, WO 2007/134181, WO 2010/077578, WO 2010/036698, WO 2007/090071, WO 2009/006478, WO 2011/156202, WO 2008/154401, WO 2009/067647, WO 2008/150729, Morita et al., Bioorganic & Med. Chem. Lett. 2002, 12, 73-76, Seth et al. J. Org.
Chem. 2010, Vol 75(5) pp. 1569-81, Mitsuoka et al., Nucleic Acids Research 2009, 37(4), 1225-1238, and Wan and Seth, J. Medical Chemistry 2016, 59, 9645-9667.
[0054] “Terminal group” as used herein refers to a group located at the first or last nucleoside in a polynucleotide or oligonucleotide. A 5’-terminal group refers to the terminal group bonded to the 5’- or 4’-carbon atom of the first nucleoside within a polynucleotide. A 3’-terminal group is a terminal group bonded to the 3’-carbon atom of the last nucleoside within a polynucleotide or oligonucleotide.
[0055] “5’-blocking group” as used herein refers to a moiety or chemical group that prevents or inhibits attachment of another nucleoside, nucleotide or oligonucleotide to the 5’-terminal nucleoside. In the context of enzymes active on the 5’-terminal nucleoside, a 5’-blocking group prevents or inhibits the enzyme(s) from attachment of another nucleoside, nucleotide or oligonucleotide to the 5’-terminal nucleoside, particularly the 5’-OH of the 5’-terminal nucleoside.
[0056] “3’-blocking group” refers to a moiety or chemical group that prevents or inhibits attachment of another nucleoside, nucleotide, or oligonucleotide to the 3’-terminal nucleoside. In the context of single-stranded RNA ligase or other enzymes active on the 3’-terminal nucleoside, a 3’-blocking group prevents or inhibits the enzyme(s) from attachment of another nucleoside, nucleotide, or oligonucleotide to the 3’-terminal nucleoside, particularly the 3’-OH of the 3’-terminal nucleoside.
[0057] “Reversible blocking group” refers to a blocking group that can be removed or cleaved off to provide a free 3’-OH. In some embodiments, the blocking group is removable with a deblocking agent, which can be a chemical or enzymatic deblocking agent.
[0058] “Enzymatically reversible blocking group” refers to a blocking group that is susceptible to removal or cleaving by an enzyme.
[0059] “Duplex” and “ds” refer to a double-stranded nucleic acid (e.g., DNA or RNA) molecule comprised of two single-stranded polynucleotides that are complementary in their sequence (e.g., A pairs to T or U, C pairs to G), arranged in an antiparallel 5’ to 3’ orientation, and held together by hydrogen bonds between the nucleobases (e.g., adenine [A], guanine [G], cytosine [C], thymine [T], uracil [U]).
[0060] “Complementary” is used herein to describe the structural relationship between nucleotide bases that are capable of forming base pairs with one another. For example, a purine nucleotide base present on a polynucleotide that is complementary to a pyrimidine nucleotide base on a polynucleotide may base pair by forming hydrogen bonds with one another. Complementary nucleotide bases can base pair via Watson/Crick base pairing or in any other manner that forms stable duplexes or other nucleic acid structures.
[0061] “Watson/Crick Base-Pairing” refers to a pattern of specific pairs of nucleobases and analogs that bind together through sequence-specific hydrogen-bonds, e.g., A pairs with T or U, and G pairs with C.
[0062] “Annealing” or “Hybridization” refers to the base-pairing interactions of one nucleobase polymer (e.g., poly- and oligonucleotides) with another that results in the formation of a double-stranded structure, a triplex structure or a quaternary structure. Annealing or hybridization can occur via Watson-Crick base-pairing interactions, but may be mediated by other hydrogen-bonding interactions, such as Hoogsteen base pairing. In some embodiments, the nucleobase polymer that anneals or hybridizes to another is a single nucleobase polymer while in other embodiments, the nucleobase polymers are separate nucleobase polymers.
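By way of illustration only, the antiparallel Watson/Crick relationship defined above can be sketched in a few lines of Python. The names `PAIRS` and `is_antiparallel_duplex` are hypothetical and not part of the disclosure, and the sketch assumes unmodified bases, equal strand lengths, and strict Watson/Crick pairing (no Hoogsteen pairing, wobble pairs, or modified nucleobases).

```python
# Watson/Crick pairs: A pairs with T or U, and G pairs with C.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def is_antiparallel_duplex(strand_5to3, other_5to3):
    """Return True if two strands, both written 5' to 3', pair at every position
    when the second strand is aligned in the antiparallel orientation."""
    if len(strand_5to3) != len(other_5to3):
        return False
    # Reversing the second strand places its 3' end opposite the first strand's 5' end.
    return all(
        (a, b) in PAIRS
        for a, b in zip(strand_5to3.upper(), reversed(other_5to3.upper()))
    )


rna_duplex = is_antiparallel_duplex("AUGC", "GCAU")   # fully complementary RNA strands
hybrid = is_antiparallel_duplex("ATGC", "GCAU")       # DNA strand paired with RNA strand
mismatch = is_antiparallel_duplex("AUGC", "AUGC")     # same sequence, not complementary
```

The check treats a DNA/RNA hybrid the same as an RNA/RNA duplex, consistent with the definition that A pairs with T or U.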
[0063] “Engineered,” “recombinant,” “non-naturally occurring,” and “variant,” when used with reference to a cell, a polynucleotide or a polypeptide, refer to a material or a material corresponding to the natural or native form of the material that has been modified in a manner that would not otherwise exist in nature or is identical thereto but produced or derived from synthetic materials and/or by manipulation using recombinant techniques.
[0064] “Wild-type” and “naturally-occurring” refer to the form found in nature. For example, a wild-type polypeptide or polynucleotide sequence is a sequence present in an organism that can be isolated from a source in nature and which has not been intentionally modified by human manipulation.
[0065] “Alkyl” refers to straight or branched chain hydrocarbon groups having the number of carbon atoms designated, for example 1 to 20 carbon atoms (C1-C20), particularly 1 to 12 carbon atoms (C1-C12 or C1-12), and more particularly 1 to 8 carbon atoms (C1-C8 or C1-8). Exemplary “alkyl” groups include, but are not limited to, methyl, ethyl, n-propyl, i-propyl, n-butyl, s-butyl, t-butyl, n-pentyl, and s-pentyl.
[0066] “Alkenyl” refers to straight or branched chain hydrocarbon having the number of carbon atoms designated, for example 2 to 20 carbon atoms (C2-C20), particularly 2 to 12 carbon atoms (C2-C12 or C2-12), and most particularly 2 to 8 carbon atoms (C2-C8 or C2-8), having at least one double bond. Exemplary “alkenyl” groups include, but are not limited to, vinyl (ethenyl), allyl, isopropenyl, 1-propenyl, 2-methyl-1-propenyl, 1-butenyl, 2-butenyl, 3-butenyl, 2-ethyl-1-butenyl, 3-methyl-2-butenyl, 1-pentenyl, 2-pentenyl, 3-pentenyl, 4-pentenyl, 4-methyl-3-pentenyl, 1-hexenyl, 2-hexenyl, 3-hexenyl, 4-hexenyl and 5-hexenyl.
[0067] “Alkynyl” refers to a straight or branched chain hydrocarbon having the number of carbon atoms designated, for example 2 to 12 carbon atoms (C2-C12 or C2-12), particularly 2 to 8 carbon atoms (C2-C8 or C2-8), containing at least one triple bond. Exemplary “alkynyl” groups include ethynyl, 1-propynyl, 2-propynyl, 1-butynyl, 2-butynyl, 3-butynyl, 1-pentynyl, 2-pentynyl, 3-pentynyl, 4-pentynyl, 1-hexynyl, 2-hexynyl, 3-hexynyl, 4-hexynyl and 5-hexynyl.
[0068] “Alkylene”, “alkenylene” and “alkynylene” refer to a straight or branched chain divalent hydrocarbon radical of the corresponding alkyl, alkenyl, and alkynyl, respectively. The “alkylene”, “alkenylene” and “alkynylene” may be optionally substituted, for example with alkyl, alkyloxy, hydroxyl, carbonyl, carboxyl, halo, nitro, and the like.
[0069] “Lower” in reference to substituents refers to a group having between one and six carbon atoms.
[0070] “Heteroalkyl,” “heteroalkenyl,” and “heteroalkynyl” refer to the corresponding alkyl, alkenyl, and alkynyl in which one or more of the carbon atoms is replaced with a heteroatom, such as O, S and N.
[0071] “Cycloalkyl” refers to any stable monocyclic or polycyclic system which consists of carbon atoms, any ring of which being saturated. “Cycloalkenyl” refers to any stable monocyclic or polycyclic system which consists of carbon atoms, with at least one ring thereof being partially unsaturated. Examples of cycloalkyls include, but are not limited to, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, cycloheptyl, cyclooctyl, bicycloalkyls and tricycloalkyls (e.g., adamantyl).
[0072] “Heterocycloalkyl” or “heterocyclyl” refers to a substituted or unsubstituted 3 to 14 membered, mono- or bicyclic, non-aromatic hydrocarbon, wherein 1 to 3 carbon atoms are replaced by a heteroatom.
Heteroatoms and/or heteroatomic groups which can replace the carbon atoms include, but are not limited to, -O-, -S-, -S-O-, -NR’-, -PH-, -S(O)-, -S(O)2-, -S(O)NR’-, -S(O)2NR’-, and the like, including combinations thereof, where each R’ is independently hydrogen or lower alkyl. Examples include oxiranyl, oxetanyl, azetidinyl, oxazolyl, thiazolidinyl, thiazolyl, morpholinyl, pyrrolidinonyl, pyrrolidinyl, piperidinyl, piperazinyl, 2,3-dihydrofuranyl, dihydropyranyl, tetrahydrofuranyl, tetrahydropyranyl, dihydropyridinyl, tetrahydropyridinyl, tetrahydropyrimidinyl, tetrahydrothiophenyl, tetrahydrothiopyranyl, azepanyl, and the like.
[0073] “Aryl” refers to a six- to fourteen-membered, mono- or bi-carbocyclic ring, wherein the monocyclic ring is aromatic and at least one of the rings in the bicyclic ring is aromatic. Unless stated otherwise, the valency of the group may be located on any atom of any ring within the radical, valency rules permitting. Examples of “aryl” groups include phenyl, naphthyl, indenyl, biphenyl, phenanthrenyl, naphthacenyl, and the like.
[0074] “Heteroaryl” refers to an aromatic heterocyclic ring, including both monocyclic and bicyclic ring systems, where at least one carbon atom of one or both of the rings is replaced with a heteroatom independently selected from nitrogen, oxygen, and sulfur, or at least two carbon atoms of one or both of the rings are replaced with a heteroatom independently selected from nitrogen, oxygen, and sulfur. In some embodiments, the heteroaryl can be a 5 to 6 membered monocyclic, or 7 to 11 membered bicyclic ring system. Examples of “heteroaryl” groups include pyrrolyl, pyrazolyl, imidazolyl, pyrazinyl, oxazolyl, isoxazolyl, thiazolyl, furyl, thienyl, pyridyl, pyrimidyl, benzoxazolyl, benzisoxazolyl, benzothiazolyl, purinyl, benzimidazolyl, indolyl, isoquinolyl, quinoxalinyl, quinolyl, and the like.
[0075] “Bridged bicyclic” refers to any bicyclic ring system, i.e., carbocyclic or heterocyclic, saturated or partially unsaturated, having at least one bridge. As defined by IUPAC, a “bridge” is an unbranched chain of atoms or an atom or a valence bond connecting two bridgeheads, where a “bridgehead” is any skeletal atom of the ring system which is bonded to three or more skeletal atoms (excluding hydrogen). In some embodiments, a bridged bicyclic group has 5 to 12 ring members and 0-4 heteroatoms independently selected from nitrogen, oxygen, and sulfur. Such bridged bicyclic groups include those groups set forth below where each group is attached to the rest of the molecule at any substitutable carbon or nitrogen atom. Unless otherwise specified, a bridged bicyclic group is optionally substituted with one or more substituents as set forth for aliphatic groups. Additionally or alternatively, any substitutable nitrogen of a bridged bicyclic group is optionally substituted. Exemplary bridged bicyclics include: [chemical structure drawings not reproduced in this text]
[0077] “Fused ring” refers to a ring system with two or more rings having at least one bond and two atoms in common. A “fused aryl” and a “fused heteroaryl” refer to ring systems having at least one aryl and heteroaryl, respectively, that share at least one bond and two atoms in common with another ring.
[0078] “Carbonyl” refers to -C(O)-. The carbonyl group may be further substituted with a variety of substituents to form different carbonyl groups including acids, acid halides, aldehydes, amides, esters, and ketones. For example, a -C(O)R’ group, wherein R’ is an alkyl, is referred to as an alkylcarbonyl. In some embodiments, R’ is selected from an optionally substituted: alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl.
[0079] “Halogen” or “halo” refers to fluorine, chlorine, bromine and iodine.
[0080] “Haloalkyl” refers to an alkyl substituted with 1 or more halogen atoms. Preferably, the alkyl is substituted with 1 to 3 halogen atoms.
[0081] “Hydroxy” refers to –OH.
[0082] “Oxy” refers to the group -O-, which may have various substituents to form different oxy groups, including ethers and esters. In some embodiments, the oxy group is an –OR’, wherein R’ is selected from an optionally substituted: alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl.
[0083] “Acyl” refers to -C(O)R’, where R’ is hydrogen, or an optionally substituted alkyl, heteroalkyl, cycloalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, or heteroarylalkyl as defined herein. Exemplary acyl groups include, but are not limited to, formyl, acetyl, cyclohexylcarbonyl, cyclohexylmethylcarbonyl, benzoyl, benzylcarbonyl, and the like.
[0084] “Alkyloxy” or “alkoxy” refers to –OR’, wherein R’ is an optionally substituted alkyl.
[0085] “Aryloxy” refers to –OR’, wherein R’ is an optionally substituted aryl.
[0086] “Carboxy” refers to –COO- or –COOM, wherein M is H or an M+ counterion.
[0087] “Carbamoyl” refers to -C(O)NR’R’, wherein each R’ is independently selected from H or an optionally substituted alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, or heteroarylalkyl.
[0088] “Cyano” refers to –CN.
[0089] “Ester” refers to a group such as -C(=O)OR’, alternatively illustrated as –C(O)OR’, wherein R’ is selected from an optionally substituted: alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl.
[0090] “Silyl” refers to Si, which may have various substituents, for example –SiR’R’R’, where R’ is as defined in the specification.
For example, each R’ is independently selected from alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl. As defined herein, any heterocycloalkyl or heteroaryl group present in a silyl group has from 1 to 3 heteroatoms selected independently from O, N, and S.
[0091] “Thiol” or “sulfhydryl” refers to –SH.
[0092] “Disulfide” refers to -S-S- groups.
[0093] “Sulfanyl” refers to –SR’, wherein R’ is selected from an optionally substituted: alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl. For example, -SR’, wherein R’ is an alkyl, is an alkylsulfanyl.
[0094] “Sulfonyl” refers to -S(O)2-, which may have various substituents to form different sulfonyl groups including sulfonic acids, sulfonamides, sulfonate esters, and sulfones. For example, -S(O)2R’, wherein R’ is an alkyl, refers to an alkylsulfonyl. In some embodiments of -S(O)2R’, R’ is selected from an optionally substituted: alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl.
[0095] “Amino” or “amine” refers to the group –NR’R’ or –NR’R’R’, wherein each R’ is independently selected from H and an optionally substituted: alkyl, cycloalkyl, heterocycloalkyl, alkyloxy, aryl, heteroaryl, heteroarylalkyl, acyl, alkyloxycarbonyl, sulfanyl, sulfinyl, sulfonyl, and the like. Exemplary amino groups include, but are not limited to, dimethylamino, diethylamino, trimethylammonium, triethylammonium, methylsulfonylamino, furanyl-oxy-sulfamino, and the like.
[0096] “Amide” refers to a group such as -C(=O)NR’R’, wherein each R’ is independently selected from H and an optionally substituted: alkyl, cycloalkyl, cycloalkylalkyl, heterocycloalkyl, heterocycloalkylalkyl, aryl, arylalkyl, heteroaryl, and heteroarylalkyl.
[0097] “Optional” or “optionally” means that a described event or circumstance may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where the event or circumstance does not. For example, “optionally substituted alkyl” refers to an alkyl group that may or may not be substituted, and the description encompasses both substituted and unsubstituted alkyl groups.
[0098] “Substituted” as used herein means one or more hydrogen atoms of the group is replaced with a substituent atom or group commonly used in pharmaceutical chemistry. Each substituent can be the same or different. Examples of suitable substituents include, but are not limited to, alkyl, alkenyl, alkynyl, cycloalkyl, aryl, arylalkyl, heterocycloalkyl, heteroaryl, OR (e.g., hydroxyl, alkyloxy (e.g., methoxy, ethoxy, and propoxy), aryloxy, heteroaryloxy, arylalkyloxy, ether, ester, carbamate, etc.), hydroxyalkyl, alkyloxycarbonyl, alkyloxyalkyloxy, perhaloalkyl, alkyloxyalkyl, SR (e.g., thiol, alkylthio, arylthio, heteroarylthio, arylalkylthio, etc.), S+R2, S(O)R, SO2R, NRR (e.g., primary amine (i.e., NH2), secondary amine, tertiary amine, amide, carbamate, urea, etc.), hydrazide, halo, nitrile, nitro, sulfide, sulfoxide, sulfone, sulfonamide, thiol, carboxy, aldehyde, keto, carboxylic acid, ester, amide, imine, and imide, including seleno and thio derivatives thereof, wherein each of the substituents can be optionally further substituted. In embodiments in which a functional group with an aromatic carbon ring is substituted, such substitutions will typically number less than about 10 substitutions, more preferably about 1 to 5, with about 1 or 2 substitutions being preferred.
[0099] “Stereoisomer” refers to a compound made up of the same atoms bonded by the same bonds but having different three-dimensional structures, which are not interchangeable.
Thus, “stereoisomer thereof” with respect to a compound includes any stereoisomer of the compound and mixtures of stereoisomers, and includes “enantiomers,” which refers to two stereoisomers whose molecules are nonsuperimposable mirror images of one another. A compound may have more than one chiral center such that the compound may exist as either an individual diastereomer or as a mixture of diastereomers.
Method of generating a predicted reaction condition activity profile of a polynucleotide ligase
[0100] In one aspect, the present disclosure provides a method of predicting the reaction condition activity profile of a polynucleotide ligase for one or more ligase substrates. It is demonstrated herein that for double stranded RNA ligases, the reaction condition activity profiles, e.g., ligase activity at different Mg+2, cofactor or co-substrate ATP, and double stranded RNA ligase substrate concentrations, of each dsRNA ligase can differ for the same double stranded RNA ligase substrate. Relatedly, a double stranded RNA ligase can exhibit a different reaction condition activity profile for different double stranded RNA ligase substrates. By employing Gaussian Process Regression analysis of the double stranded RNA ligase activity data obtained for different reaction conditions, a predicted reaction condition activity profile can be generated for identifying the reaction conditions favorable for the double stranded RNA ligase on the specified double stranded RNA ligase substrate. The present invention is applicable to different types of ligases, including, among others, single stranded DNA ligase, double stranded DNA ligase, single stranded RNA ligase, and double stranded RNA ligase.
[0101] In one aspect, the present disclosure provides a method of predicting or modeling the reaction condition activity profile of a polynucleotide ligase for a polynucleotide ligase substrate, comprising: obtaining activity data of a polynucleotide ligase for a ligase substrate under different or variable reaction conditions; applying Gaussian Process Regression (GPR) to the activity data obtained for the different or variable reaction conditions; and generating an output of the GPR-predicted or modeled reaction condition activity profile of the ligase for the ligase substrate.
[0102] As used herein, Gaussian Process Regression (GPR) is a non-parametric regression technique used in machine learning and statistics, and is useful when dealing with problems involving continuous data, where the relationship between input variables and output is not explicitly known or can be complex. GPR is a Bayesian approach that can model uncertainty in predictions, making it a tool for various applications, including optimization. The steps in Gaussian Process Regression include: (1) data collection, involving gathering the input-output data pairs for the regression problem; (2) choosing a kernel function, involving selecting a covariance function (kernel) that is appropriate to the problem, where the choice of kernel influences the shape of the functions that GPR can model; (3) parameter optimization, involving estimating the parameters of the kernel function by maximizing the likelihood of the data; and (4) prediction, in which, given a new input, the trained GPR model is used to make predictions. GPR provides both the predicted mean and the associated uncertainty (variance). Software for Gaussian Process Regression is available commercially, such as from JMP® Statistical Discovery, and as open source, such as scikit-learn (available at https://scikit-learn.org).
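The four steps above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the implementation used in the disclosure: the activity values are synthetic (generated by the made-up function `synthetic_activity`), the concentration ranges and units are hypothetical, and the squared-exponential kernel with a noise term is one reasonable kernel choice among many.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

# (1) Data collection: hypothetical reaction conditions and activity values.
# Columns: MgCl2 (mM), ATP (mM), ligase substrate (mM). Synthetic, not measured.
rng = np.random.default_rng(0)
X = rng.uniform(low=[1.0, 0.1, 0.05], high=[20.0, 2.0, 1.0], size=(40, 3))


def synthetic_activity(x):
    """Made-up smooth response with an optimum near 10 mM Mg and 1 mM ATP."""
    mg, atp, sub = x[:, 0], x[:, 1], x[:, 2]
    peak = np.exp(-((mg - 10.0) / 6.0) ** 2 - ((atp - 1.0) / 0.8) ** 2)
    return 100.0 * peak * sub / (sub + 0.2)


y = synthetic_activity(X) + rng.normal(scale=1.0, size=len(X))

# (2) Kernel choice: anisotropic squared-exponential kernel plus a noise term.
kernel = ConstantKernel(1.0) * RBF(length_scale=[5.0, 0.5, 0.3]) + WhiteKernel(noise_level=1.0)

# (3) Parameter optimization: fit() maximizes the log marginal likelihood
# over the kernel hyperparameters.
gpr = GaussianProcessRegressor(
    kernel=kernel, normalize_y=True, n_restarts_optimizer=2, random_state=0
)
gpr.fit(X, y)

# (4) Prediction: predicted mean and standard deviation at new conditions.
X_new = np.array([[10.0, 1.0, 0.5], [2.0, 0.2, 0.1]])
mean, std = gpr.predict(X_new, return_std=True)
```

Here `mean` plays the role of the predicted reaction condition activity profile at the queried conditions, and `std` is the associated uncertainty reported alongside it.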
In some embodiments, model parameters are optimized (parameter optimization) to identify optimal conditions for each ligase, and in some embodiments across all ligases, by setting the desirability parameter to the equally weighted average of the measured responses of the ligases within an examined set.
[0103] In some embodiments, where the Gaussian Process model contains continuous predictors, the Gaussian Process platform implements two possible correlation structures, the Gaussian and the Cubic. See JMP® Statistical Discovery.
[0104] In some embodiments, the Gaussian correlation structure uses the product exponential correlation function with a power of 2 as the estimated model. This model assumes that Y is normally distributed with mean μ and covariance matrix σ²R. The elements of the R matrix are defined as follows:
$$r_{ij} = \exp\left(-\sum_{k=1}^{K} \theta_k \,(x_{ik} - x_{jk})^2\right)$$
where K = number of continuous predictors; θk = theta parameter for the kth predictor; xik = the value of the kth predictor for subject i; and xjk = the value of the kth predictor for subject j.
[0105] In some embodiments, the Cubic correlation structure also assumes that Y is normally distributed with mean μ and covariance matrix σ²R. The R matrix consists of the following elements:
$$r_{ij} = \prod_{k=1}^{K} \rho(d;\theta_k), \qquad d = x_{ik} - x_{jk},$$
$$\rho(d;\theta) = \begin{cases} 1 - 6(d\theta)^2 + 6(|d|\theta)^3, & |d| \le \tfrac{1}{2\theta} \\ 2\,(1 - |d|\theta)^3, & \tfrac{1}{2\theta} < |d| \le \tfrac{1}{\theta} \\ 0, & |d| > \tfrac{1}{\theta} \end{cases}$$
See Santner et al., The Design and Analysis of Computer Experiments. New York: Springer-Verlag (2003). The theta parameter used in the Cubic correlation structure is the reciprocal of the parameter often used in the literature. The reciprocal is used so that when theta has no effect on the model, then rho has a value of zero, rather than infinity.
[0106] In some embodiments, the Gaussian Process Regression can use categorical predictors. If the Gaussian Process model includes categorical predictors, the Gaussian correlation structure is used for the correlation structure. See, e.g., JMP® Statistical Discovery.
The elements of the R matrix are defined as follows:
$$r_{ij} = \left(\prod_{p=1}^{P} \tau_{p,ij}\right) \exp\left(-\sum_{k=1}^{K} \theta_k \,(x_{ik} - x_{jk})^2\right)$$
where K = number of continuous predictors; P = number of categorical predictors; θk = theta parameter for the kth continuous predictor; xik = the value of the kth continuous predictor for subject i; xjk = the value of the kth continuous predictor for subject j; and τp,ij = the correlation between the observed level of predictor p for subject i and the observed level of predictor p for subject j.
[0107] There is a τ parameter for each combination of levels of a categorical variable, where τij corresponds to the unique combination formed by the observed levels of subject i and subject j. Thus, the covariance element, rij, depends on the combination of levels of the categorical predictors obtained from the ith and jth observations. See, e.g., JMP® Statistical Discovery, referencing Qian et al., “Gaussian process models for computer experiments with qualitative and quantitative factors,” Technometrics, 2012, 50:383–396.
[0108] In some embodiments, the different or variable reaction conditions comprise varying two or more of: divalent metal concentration, ligase substrate(s) concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration.
[0109] In some embodiments, the different or variable reaction conditions comprise varying three or more of: divalent metal concentration, ligase substrate(s) concentration, cofactor or co-substrate NTP (or NAD+) concentration, buffer concentration, and salt concentration.
[0110] In some embodiments, the different or variable reaction conditions comprise at least variable divalent metal concentration, variable cofactor or co-substrate NTP concentration, and variable ligase substrate concentration.
[0111] In some embodiments, the divalent metal is any divalent metal with which the ligase is active in the ligation reaction. In some embodiments, the divalent metal is Mn+2 or Mg+2. In some embodiments, the divalent metal is Mg+2.
In some embodiments, the activity profile is obtained for divalent metal concentrations under which the ligase shows activity for the ligase substrate. In some embodiments, the activity profile is obtained for divalent metal concentrations varied from 0.1 mM to 100 mM, from 0.5 to 50 mM, 0.1 to 40 mM, 0.5 to 40 mM, 0.1 to 20 mM, 0.5 to 20 mM, 0.1 to 10 mM, or 0.5 to 10 mM. Changes in divalent metal concentrations can be made in increments, e.g., of 1 mM, 2 mM, 5 mM, 10 mM, etc., as needed to obtain the activity profile for the desired divalent metal concentrations.
[0112] In some embodiments, the ligase activity data is obtained for different or variable cofactor or co-substrate NTP concentrations. In some embodiments, the cofactor or co-substrate NTP concentrations are from 0.1 mM to 40 mM, 0.5 to 40 mM, 0.1 to 30 mM, 0.5 mM to 30 mM, 0.1 to 20 mM, 0.5 to 20 mM, 0.1 to 10 mM, 0.5 to 10 mM, 0.1 to 5 mM, or 0.5 to 5 mM. In some embodiments, the cofactor or co-substrate NTP is ATP.
[0113] In some embodiments, where the ligase uses a cofactor or co-substrate NAD+, the cofactor or co-substrate NAD+ concentrations are from 0.1 mM to 40 mM, 0.5 to 40 mM, 0.1 to 30 mM, 0.5 mM to 30 mM, 0.1 to 20 mM, 0.5 to 20 mM, 0.1 to 10 mM, 0.5 to 10 mM, 0.1 to 5 mM, or 0.5 to 5 mM.
[0114] In some embodiments, the ligase activity data is obtained for different or variable ligase substrate concentrations, and adjusted for the type of ligase substrate (e.g., single stranded ligase substrate or double stranded ligase substrate). As used herein, “concentration” of substrate refers to the concentration of each component polynucleotide acting as a substrate.
By way of example and not limitation, a single stranded RNA ligase can use a polynucleotide acceptor and a polynucleotide donor provided as separate polynucleotides, where the concentration of polynucleotide acceptor or polynucleotide donor defines “substrate” concentration, as opposed to the sum of concentrations of the polynucleotide acceptor and polynucleotide donor. In some embodiments, the activity profile is obtained for ligase substrate concentrations from 0.01 to 20 mM, from 0.1 to 20 mM, from 0.2 to 10 mM, from 0.1 to 5 mM, from 0.2 to 5 mM, from 0.1 to 2 mM, or from 0.2 to 2 mM.
[0115] In some embodiments, the ligase activity data is obtained for different or variable buffer types, and/or different or variable buffer concentrations. Exemplary buffers for ligases include, by way of example and not limitation, borate, potassium phosphate, 2-(N-morpholino)ethane sulfonic acid (MES), 3-(N-morpholino)propanesulfonic acid (MOPS), acetate, triethanolamine, 2-amino-2-hydroxymethyl-propane-1,3-diol (Tris), and the like. In some embodiments, the ligase activity data is obtained for buffer concentrations from 1 to 200 mM, 5 to 200 mM, 1 to 150 mM, 5 to 150 mM, 1 to 100 mM, 5 to 100 mM, 1 to 50 mM, 5 to 50 mM, 1 to 20 mM, 5 to 20 mM, 1 to 10 mM, or 5 to 10 mM.
[0116] In some embodiments, the ligase activity data is obtained for different or variable salts, and/or different or variable salt concentrations. In some embodiments, the salt is, among others, NaCl, KCl, an ammonium salt (e.g., ammonium acetate, ammonium chloride, etc.), or an acetate salt (e.g., sodium acetate, etc.). In some embodiments, the salt is NaCl. In some embodiments, the activity profile is obtained for salt concentrations, e.g., NaCl, from 0 to 500 mM, 1 to 500 mM, 5 mM to 500 mM, 1 to 400 mM, 5 to 400 mM, 1 to 300 mM, 5 to 300 mM, 1 to 200 mM, 5 to 200 mM, 1 to 100 mM, 5 to 100 mM, 1 to 50 mM, or 5 to 50 mM.
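As a hedged illustration of varying several reaction conditions at once and of the equally weighted desirability described in paragraph [0102], the sketch below enumerates a full-factorial set of conditions and selects the condition with the highest average response across two ligases. The concentration levels are hypothetical picks from inside the ranges above, `toy_response` is an invented stand-in for measured or GPR-predicted activity, and the full-factorial layout is an assumption; the source does not prescribe a particular experimental design.

```python
import math
from itertools import product

# Hypothetical levels (mM) chosen inside the ranges recited above.
mg_levels = [0.5, 5.0, 10.0, 15.0, 20.0]      # divalent metal (Mg+2)
atp_levels = [0.5, 2.0, 5.0]                  # cofactor or co-substrate NTP (ATP)
substrate_levels = [0.2, 1.0, 2.0]            # ligase substrate

# Full-factorial set of reaction conditions (an assumed design).
design = [
    {"Mg_mM": mg, "ATP_mM": atp, "substrate_mM": s}
    for mg, atp, s in product(mg_levels, atp_levels, substrate_levels)
]

def toy_response(cond, mg_opt):
    """Invented stand-in for a measured or GPR-predicted ligase activity."""
    mg_term = math.exp(-((cond["Mg_mM"] - mg_opt) / 10.0) ** 2)
    atp_term = cond["ATP_mM"] / (cond["ATP_mM"] + 1.0)
    sub_term = cond["substrate_mM"] / (cond["substrate_mM"] + 0.5)
    return mg_term * atp_term * sub_term

def desirability(responses):
    """Equally weighted average of the responses of all ligases in the set."""
    return sum(responses) / len(responses)

# Two hypothetical ligases with different Mg+2 optima.
predicted = {
    "ligase_A": [toy_response(c, mg_opt=10.0) for c in design],
    "ligase_B": [toy_response(c, mg_opt=20.0) for c in design],
}

scores = [
    desirability([values[i] for values in predicted.values()])
    for i in range(len(design))
]
best_index = max(range(len(design)), key=scores.__getitem__)
best_condition, best_score = design[best_index], scores[best_index]
print(best_condition, round(best_score, 3))
```

With these toy responses neither ligase's individual optimum is selected; the shared optimum falls between them, which is the behavior an equally weighted desirability is meant to produce.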
[0117] In some embodiments, the reaction condition activity data is obtained for different ligase concentrations. In some embodiments, the ligase is provided at concentrations from about 0.01 g/L to about 50 g/L; about 0.05 g/L to about 50 g/L; about 0.1 g/L to about 40 g/L; about 1 g/L to about 40 g/L; about 2 g/L to about 40 g/L; about 5 g/L to about 40 g/L; about 5 g/L to about 30 g/L; about 0.1 g/L to about 10 g/L; about 0.5 g/L to about 10 g/L; about 1 g/L to about 10 g/L; about 0.1 g/L to about 5 g/L; about 0.5 g/L to about 5 g/L; or about 0.1 g/L to about 2 g/L. In some embodiments, the ligase concentration is about 0.01 g/L, about 0.05 g/L, about 0.1 g/L, about 0.5 g/L, about 1 g/L, about 2 g/L, about 5 g/L, about 10 g/L, about 30 g/L, about 40 g/L, or about 50 g/L.
[0118] In some embodiments, after applying the GPR analysis on the activity data obtained for the different or variable reaction conditions, an output of the GPR-predicted or modeled reaction condition activity profile of the ligase for the ligase substrate is generated. In some embodiments, the output is a multi-output of the predicted or modeled reaction condition activity profile of the ligase for the ligase substrate providing covariance of each output. In some embodiments, the multi-output is generated by prescribing an additional covariance function (kernel) over the outputs, specifying the covariance between outputs.
[0119] In some embodiments, the output is a contour plot of the predicted or modeled ligase activity for the different reaction condition variables. In some embodiments, the contour plot is a three or four dimensional surface plot of predicted ligase activity for the variable reaction conditions.
[0120] In some embodiments, the ligase substrate comprises at least a polynucleotide acceptor and a polynucleotide donor.
In some embodiments, the 3’-terminal nucleotide of the polynucleotide acceptor strand has a requisite 3’-OH, or functional form thereof, to act as the acceptor, and the 5’-terminal nucleotide of the nucleotide or polynucleotide donor has a requisite 5’-phosphate, or functional form thereof, to act as the donor in the ligase reaction.
[0121] In some embodiments, the polynucleotide acceptor and the polynucleotide donor can be any length and/or form suitable for the ligase. In some embodiments, the polynucleotide acceptor comprises 3 to 400, 4 to 350, 5 to 300, 6 to 250, 7 to 200, 8 to 150, 9 to 100, or 10 to 50 nucleotides in length. In some embodiments, the polynucleotide acceptor for a single stranded ligase can have some double stranded regions but have sufficient single stranded region at the 3’-terminal end to function as an acceptor for the single stranded ligase.
[0122] In some embodiments, the polynucleotide donor for a ligase comprises 3 to 400, 4 to 350, 5 to 300, 6 to 250, 7 to 200, 8 to 150, 9 to 100, or 10 to 50 nucleotides in length. In some embodiments, the polynucleotide donor for a single stranded ligase, similar to the polynucleotide acceptor, can have some double stranded regions but have sufficient single stranded region at the 5’-terminal end to function as a donor for the single stranded ligase. In some embodiments, for a single stranded RNA ligase, the donor comprises a nucleotide donor, as further described herein.
[0123] In some embodiments, where the ligase is a double stranded polynucleotide ligase, the ligase substrate further comprises a polynucleotide strand complementary to the polynucleotide acceptor strand and polynucleotide donor strand and forms a double stranded polynucleotide substrate comprising a ligatable nick.
In some embodiments, the polynucleotide acceptor strand and the polynucleotide donor strand are provided on a single polynucleotide, and the ligatable nick is formed through base pairing of self-complementary regions on the single polynucleotide (e.g., to form a hairpin structure).
[0124] In some embodiments, the polynucleotide acceptor strand and the polynucleotide donor strand of the double stranded polynucleotide substrate are provided as separate polynucleotides and form a nick when they base pair to a polynucleotide complementary to the polynucleotide acceptor strand and polynucleotide donor strand. In some embodiments, the double stranded polynucleotide substrate is formed from double stranded polynucleotide fragments that have cohesive ends, which can base pair to form ligatable nicks. In some embodiments, the cohesive ends comprise a complementary region of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotides. In some embodiments, the double stranded polynucleotide substrate comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nicks. In some embodiments, the double stranded polynucleotide substrate comprises a plurality of polynucleotide acceptor substrates. In some embodiments, the double stranded polynucleotide substrate comprises a plurality of polynucleotide donor substrates.
[0125] In some embodiments, the double stranded polynucleotide substrate comprises blunt ended substrates, wherein the double stranded polynucleotide ligase is capable of ligating blunt ended substrates.
[0126] In some embodiments, the ligase substrate comprises a modified nucleoside, modified internucleoside linkage, or a combination of modified nucleoside and modified internucleoside linkage.
[0127] In some embodiments, the polynucleotide acceptor of the ligase substrate comprises a modified nucleoside and/or modified internucleoside linkage.
[0128] In some embodiments, the polynucleotide acceptor comprises a 3’-terminal modified nucleoside.
In some embodiments, the polynucleotide acceptor comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 3’-terminal nucleoside towards the 5’-terminus of the polynucleotide acceptor.
[0129] In some embodiments, the polynucleotide acceptor comprises a 5’-terminal modified nucleoside. In some embodiments, the polynucleotide acceptor comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 5’-terminal nucleoside towards the 3’-terminus of the polynucleotide acceptor.
[0130] In some embodiments, the polynucleotide acceptor comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified nucleosides.
[0131] In some embodiments, the modified 3’-terminal nucleoside of the polynucleotide acceptor comprises a modified sugar moiety. In some embodiments, the sugar moiety is modified at the 2’-position of the sugar moiety. In some embodiments, the modified 3’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro-cytidine, 2’-fluoro-uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
[0132] In some embodiments, the modified 5’-terminal nucleoside of the polynucleotide acceptor comprises a modified sugar moiety. In some embodiments, the sugar moiety is modified at the 2’-position of the sugar moiety. In some embodiments, the modified 5’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro-cytidine, 2’-fluoro-uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
[0133] In some embodiments, the polynucleotide acceptor comprises a 3’-terminal modified internucleoside linkage.
In some embodiments, the polynucleotide acceptor comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 3’-terminal nucleoside of the polynucleotide acceptor.
[0134] In some embodiments, the polynucleotide acceptor comprises a 5’-terminal modified internucleoside linkage. In some embodiments, the polynucleotide acceptor comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 5’-terminal nucleoside of the polynucleotide acceptor strand.
[0135] In some embodiments, the polynucleotide acceptor comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified internucleoside linkages.
[0136] In some embodiments, the polynucleotide donor of the ligase substrate comprises a modified nucleoside and/or modified internucleoside linkage.
[0137] In some embodiments, the polynucleotide donor comprises a 5’-terminal modified nucleoside. In some embodiments, the polynucleotide donor comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 5’-terminal nucleoside of the polynucleotide donor strand towards the 3’-terminus.
[0138] In some embodiments, the polynucleotide donor comprises a 3’-terminal modified nucleoside. In some embodiments, the polynucleotide donor comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 3’-terminal nucleoside of the polynucleotide donor strand towards the 5’-terminus.
[0139] In some embodiments, the polynucleotide donor comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified nucleosides.
[0140] In some embodiments, the modified 5’-terminal nucleoside of the polynucleotide donor comprises a modified sugar moiety.
In some embodiments, the sugar moiety is modified at the 2’-position of the sugar moiety. In some embodiments, the modified 5’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro-cytidine, 2’-fluoro-uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
[0141] In some embodiments, the modified 3’-terminal nucleoside of the polynucleotide donor comprises a modified sugar moiety. In some embodiments, the sugar moiety is modified at the 2’-position of the sugar moiety. In some embodiments, the modified 3’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro-cytidine, 2’-fluoro-uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
[0142] In some embodiments, the polynucleotide donor comprises a 5’-terminal modified internucleoside linkage. In some embodiments, the polynucleotide donor comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 5’-terminal nucleoside of the polynucleotide donor strand.
[0143] In some embodiments, the polynucleotide donor comprises a 3’-terminal modified internucleoside linkage. In some embodiments, the polynucleotide donor comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 3’-terminal nucleoside of the polynucleotide donor strand.
[0144] In some embodiments, the polynucleotide donor comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified internucleoside linkages.
[0145] In some embodiments, the 3’-terminal nucleoside of a polynucleotide acceptor strand and/or the 5’-terminal nucleoside of the polynucleotide donor strand forming a ligation junction of the ligase substrate comprise modified nucleosides.
[0146] In some embodiments, one or more nucleosides 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 bases from the ligation junction on the polynucleotide acceptor strand are modified. In some embodiments, one or more nucleosides 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 bases from the ligation junction on the polynucleotide donor strand are modified.
[0147] In some embodiments, the ligation junction comprises one or more modified internucleoside linkages. In some embodiments, the polynucleotide acceptor strand forming the ligation junction comprises a modified internucleoside linkage. In some embodiments, the polynucleotide donor strand forming the ligation junction comprises a modified internucleoside linkage.
[0148] In some embodiments, the ligation junction comprises at least a modified 3’-terminal nucleoside on the polynucleotide acceptor, and a modified 5’-terminal nucleoside on the polynucleotide donor. In some embodiments, the ligation junction comprises at least a modified 3’-terminal nucleoside on the polynucleotide acceptor, wherein the modified 3’-terminal nucleoside is 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro-cytidine, 2’-fluoro-uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine, and comprises at least a modified 5’-terminal nucleoside on the polynucleotide donor, wherein the modified 5’-terminal nucleoside is 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro-cytidine, 2’-fluoro-uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine.
[0149] In some embodiments, where the polynucleotide ligase is a single stranded RNA ligase, the donor substrate for the single stranded RNA ligase comprises a nucleotide donor substrate. In some embodiments, the nucleotide donor substrate for the single stranded RNA ligase comprises the structure pN, where the prefix p represents a 5’-phosphate group, and N represents a nucleoside. In some embodiments, the nucleotide donor comprises the structure pNp, where the prefix p represents a 5’-phosphate group, N represents a nucleoside, and the suffix p represents a 3’-phosphate group. In some embodiments, the nucleoside N of the nucleotide donor is unmodified. In some embodiments, the nucleoside N of the nucleotide donor is modified. In some embodiments, the modified nucleoside of the nucleotide donor comprises a 2’-modified, 3’-modified, or 2’- and 3’-modified sugar moiety. In some embodiments, the modified nucleoside comprises a modified nucleobase. In some embodiments, the modified nucleoside comprises a modified sugar moiety and a modified nucleobase. In some embodiments, the nucleotide donor substrate is modified with a conjugate moiety, such as a targeting moiety, as further described herein.
[0150] In some embodiments, the method of predicting or modeling the reaction condition activity profile of a ligase for a ligase substrate further comprises predicting or modeling the reaction condition activity profile of at least a second ligase substrate, and comparing the output of the predicted reaction condition activity profile of the ligase for the ligase substrate with the output of the predicted reaction condition activity profile of the ligase for the second ligase substrate.
[0151] In some embodiments, the ligation of the ligase substrate and the ligation of the second ligase substrate produce the same ligated product.
In some embodiments, the ligase substrate and the second ligase substrate that produce the same product comprise at least one different ligation point or ligation junction, thereby revealing the effect of different ligation points or ligation junctions on the reaction condition activity profile.
[0152] In some embodiments, the ligase substrate and the second ligase substrate result in the same product and comprise at least 2 different ligation junctions. In some embodiments, the ligase substrate and the second ligase substrate result in the same product and comprise at least 3, 4, 5, 6, 7, 8, 9, 10 or more different ligation junctions.
[0153] In some embodiments, as discussed above, the ligase is a single stranded DNA ligase, a double stranded DNA ligase, a single stranded RNA ligase, or a double stranded RNA ligase.
[0154] In some embodiments, the ligase is preferably a double stranded RNA ligase. In some embodiments, a method of predicting or modeling the reaction condition activity profile of a double stranded RNA ligase for a double stranded RNA ligase substrate comprises: obtaining activity data of a double stranded RNA ligase for a double stranded RNA ligase substrate under different or variable reaction conditions; applying Gaussian Process Regression (GPR) on the activity data obtained for the different or variable reaction conditions; and generating an output of the GPR-predicted or modeled reaction condition activity profile of the double stranded RNA ligase for the double stranded RNA ligase substrate.
[0155] In some embodiments, the different or variable reaction conditions comprise varying two or more, or three or more of: divalent metal concentration, double stranded RNA ligase substrate(s) concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration.
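The comparison step of paragraphs [0150] to [0152] can be sketched as follows, assuming each predicted profile is available as a mapping from a reaction condition to a predicted activity. The two profiles, their values, and the names `junction_a` and `junction_b` are hypothetical and stand for two substrates that yield the same ligated product through different ligation junctions.

```python
def optimum(profile):
    """Return the condition with the highest predicted activity, and that activity."""
    condition = max(profile, key=profile.get)
    return condition, profile[condition]

def compare_profiles(first, second):
    """Compare two predicted profiles over the conditions they share."""
    shared = sorted(set(first) & set(second))
    return {
        "optimum_first": optimum(first),
        "optimum_second": optimum(second),
        # Positive values mean the first substrate is predicted to ligate better.
        "difference": {c: first[c] - second[c] for c in shared},
    }

# Hypothetical predicted activity keyed by (Mg+2 mM, ATP mM).
junction_a = {(5, 1): 0.42, (10, 1): 0.71, (20, 1): 0.55}
junction_b = {(5, 1): 0.63, (10, 1): 0.48, (20, 1): 0.30}

comparison = compare_profiles(junction_a, junction_b)
print(comparison["optimum_first"], comparison["optimum_second"])
```

In this toy case the two junctions have different optimal Mg+2 concentrations, which is the kind of junction-dependent effect the comparison is intended to expose.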
[0156] In some embodiments, the different or variable reaction conditions comprise at least different or variable divalent metal concentration, cofactor or co-substrate NTP concentration, and double stranded RNA ligase substrate concentration.
[0157] In some embodiments, the divalent metal is any divalent metal with which the double stranded RNA ligase is active in the ligation reaction. In some embodiments, the divalent metal is Mn+2 or Mg+2. In some embodiments, the divalent metal for the double stranded RNA ligase is Mg+2. In some embodiments, the activity profile is obtained for divalent metal concentrations under which the double stranded RNA ligase shows activity for the ligase substrate. In some embodiments, the double stranded RNA ligase activity profile is obtained for divalent metal concentrations, e.g., Mg+2, varied from 0.1 mM to 100 mM, from 0.5 to 50 mM, 0.1 to 40 mM, 0.5 to 40 mM, 0.1 to 20 mM, 0.5 to 20 mM, 0.1 to 10 mM, or 0.5 to 10 mM. In some embodiments, changes in divalent metal concentrations can be made in increments, e.g., of 1 mM, 2 mM, 5 mM, 10 mM, etc., as needed to obtain the activity profile for the desired divalent metal concentrations.
[0158] In some embodiments, the double stranded RNA ligase activity profile is obtained for different or variable cofactor or co-substrate NTP concentrations. In some embodiments, the cofactor or co-substrate NTP concentrations are from 0.1 mM to 40 mM, 0.5 to 40 mM, 0.1 to 30 mM, 0.5 mM to 30 mM, 0.1 to 20 mM, 0.5 to 20 mM, 0.1 to 10 mM, 0.5 to 10 mM, 0.1 to 5 mM, or 0.5 to 5 mM. In some embodiments, the cofactor or co-substrate NTP is ATP.
[0159] In some embodiments, the double stranded RNA ligase activity profile is obtained for different or variable double stranded RNA ligase substrate concentrations.
In some embodiments, the activity profile is obtained for ligase substrate concentrations from 0.01 to 20 mM, from 0.1 to 20 mM, from 0.2 to 10 mM, from 0.1 to 5 mM, from 0.2 to 5 mM, from 0.1 to 2 mM, or from 0.2 to 2 mM. In some embodiments, the double stranded RNA ligase substrate contains one or more modified nucleosides, one or more modified internucleoside linkages, or a combination of modified nucleosides and modified internucleoside linkages. In some embodiments, the modified nucleoside and/or modified internucleoside linkage is present at the ligation junction, and/or on the polynucleotide strand complementary to the ligation junction, as further described herein.
[0160] In some embodiments, the double stranded RNA ligase activity profile is obtained for different or variable buffer types, and/or different or variable buffer concentrations. Exemplary buffers for double stranded RNA ligases include, by way of example and not limitation, borate, potassium phosphate, 2-(N-morpholino)ethane sulfonic acid (MES), 3-(N-morpholino)propanesulfonic acid (MOPS), acetate, triethanolamine, 2-amino-2-hydroxymethyl-propane-1,3-diol (Tris), and the like. In some embodiments, the double stranded RNA ligase activity profile is obtained for buffer concentrations from 1 to 200 mM, 5 to 200 mM, 1 to 150 mM, 5 to 150 mM, 1 to 100 mM, 5 to 100 mM, 1 to 50 mM, 5 to 50 mM, 1 to 20 mM, 5 to 20 mM, 1 to 10 mM, or 5 to 10 mM.
[0161] In some embodiments, the double stranded RNA ligase activity profile is obtained for different or variable salts, and/or different or variable salt concentrations. In some embodiments, the salt is, among others, NaCl, KCl, an ammonium salt (e.g., ammonium acetate, ammonium chloride, etc.), or an acetate salt (e.g., sodium acetate, etc.). In some embodiments, the salt is NaCl.
In some embodiments, the double stranded RNA ligase activity profile is obtained for salt concentrations, e.g., NaCl, from 0 to 500 mM, 1 to 500 mM, 5 mM to 500 mM, 1 to 400 mM, 5 to 400 mM, 1 to 300 mM, 5 to 300 mM, 1 to 200 mM, 5 to 200 mM, 1 to 100 mM, 5 to 100 mM, 1 to 50 mM, or 5 to 50 mM.
[0162] In some embodiments, the double stranded RNA ligase is provided at different concentrations. In some embodiments, the double stranded RNA ligase is provided at concentrations from about 0.01 g/L to about 50 g/L; about 0.01 g/L to about 0.1 g/L; about 0.05 g/L to about 50 g/L; about 0.1 g/L to about 40 g/L; about 1 g/L to about 40 g/L; about 2 g/L to about 40 g/L; about 5 g/L to about 40 g/L; about 5 g/L to about 30 g/L; about 0.1 g/L to about 10 g/L; about 0.5 g/L to about 10 g/L; about 1 g/L to about 10 g/L; about 0.1 g/L to about 5 g/L; about 0.5 g/L to about 5 g/L; or about 0.1 g/L to about 2 g/L.
[0163] In some embodiments, after applying the GPR analysis on the double stranded RNA ligase activity data obtained for the different or variable reaction conditions, an output of the predicted or modeled reaction condition activity profile of the double stranded RNA ligase for the double stranded RNA ligase substrate is generated. In some embodiments, the output is a multi-output of the predicted or modeled reaction condition activity profile of the double stranded RNA ligase for the ligase substrate providing covariance of each output, as described herein.
[0164] In some embodiments, the output is a contour plot of the predicted or modeled activity for the different variables for the double stranded RNA ligase activity. In some embodiments, the contour plot is a three or four dimensional surface plot of the double stranded RNA ligase activity for the variable reaction conditions.
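One way to picture a multi-output model in which an additional covariance function over the outputs specifies the covariance between outputs is the separable form sketched below, where the covariance between output i at condition x and output j at condition x' is B[i][j] · k(x, x'). This separable (intrinsic coregionalization) form is a common choice assumed here for illustration; the source does not state which multi-output construction is used, and the matrix `B`, the `theta` values, and the conditions are hypothetical.

```python
import math

def input_kernel(x1, x2, theta):
    """Squared-exponential covariance between two reaction-condition vectors."""
    return math.exp(-sum(t * (a - b) ** 2 for t, a, b in zip(theta, x1, x2)))

def multi_output_cov(points, B, theta):
    """Covariance matrix for a list of (condition, output_index) pairs.

    B[i][j] is the assumed covariance between outputs i and j, for example
    the activities of one ligase on two different substrates.
    """
    return [
        [B[i][j] * input_kernel(x, x_other, theta) for (x_other, j) in points]
        for (x, i) in points
    ]

# Hypothetical output covariance: two positively correlated outputs.
B = [[1.0, 0.6],
     [0.6, 1.0]]
theta = (0.01, 0.1)  # hypothetical kernel parameters for (Mg+2 mM, ATP mM)

points = [
    ((10.0, 2.0), 0),  # output 0 at 10 mM Mg+2, 2 mM ATP
    ((10.0, 2.0), 1),  # output 1 at the same condition
    ((20.0, 2.0), 0),  # output 0 at 20 mM Mg+2, 2 mM ATP
]
C = multi_output_cov(points, B, theta)
for row in C:
    print([round(v, 3) for v in row])
```

Because the two outputs share the input kernel, a measurement of one output at a given condition reduces the predicted uncertainty of the other output near that condition, in proportion to the off-diagonal entry of `B`.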
[0165] In some embodiments, the double stranded RNA ligase substrate comprises a polynucleotide acceptor strand, a polynucleotide donor strand, and a polynucleotide strand complementary to the polynucleotide acceptor strand and polynucleotide donor strand, which together form a double stranded RNA ligase substrate comprising a ligatable nick.
[0166] In some embodiments, the polynucleotide acceptor strand and the polynucleotide donor strand are provided on a single polynucleotide, and the ligatable nick is formed through base pairing of self-complementary regions on the single polynucleotide (e.g., to form a hairpin structure).
[0167] In some embodiments, the polynucleotide acceptor strand and the polynucleotide donor strand of the double stranded RNA ligase substrate are provided as separate polynucleotides, which form a nick when they base pair to a polynucleotide complementary to the polynucleotide acceptor strand and polynucleotide donor strand.
[0168] In some embodiments, the double stranded RNA ligase substrates are formed from double stranded polynucleotide fragments having cohesive ends that can base pair to form ligatable nicks. In some embodiments, the cohesive ends comprise a complementary region of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotides. In some embodiments, the double stranded RNA ligase substrate comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nicks. In some embodiments, the double stranded RNA ligase substrate comprises a plurality of polynucleotide acceptor substrates and a plurality of polynucleotide donor substrates. In some embodiments, the double stranded RNA ligase substrate is formed from at least 2, 3, 4, 5, 6, 7, 8 or more double stranded polynucleotide fragments.
[0169] In some embodiments, the polynucleotide acceptor strand, the polynucleotide donor strand, and/or the polynucleotide complementary to the polynucleotide acceptor strand and polynucleotide donor strand of the double stranded RNA ligase substrate comprises a modified nucleoside and/or modified internucleoside linkage.
[0170] In some embodiments, the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a modified nucleoside and/or modified internucleoside linkage.
[0171] In some embodiments, the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a 3’-terminal modified nucleoside. In some embodiments, the polynucleotide acceptor strand comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 3’-terminal nucleoside of the polynucleotide acceptor strand to the 5’-terminal end of the polynucleotide acceptor strand.
[0172] In some embodiments, the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a 5’-terminal modified nucleoside. In some embodiments, the polynucleotide acceptor strand comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 5’-terminal nucleoside of the polynucleotide acceptor strand to the 3’-terminal end of the polynucleotide acceptor strand.
[0173] In some embodiments, the polynucleotide acceptor strand of the double stranded RNA ligase substrate has at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified nucleosides.
[0174] In some embodiments, the modified 3’-terminal nucleoside of the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a modified sugar moiety. In some embodiments, the sugar moiety is modified at the 2’-position of the sugar moiety.
In some embodiments, the modified 3’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro cytidine, 2’-fluoro uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine. [0175] In some embodiments, the modified 5’-terminal nucleoside of the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a modified sugar moiety. In some embodiments, the sugar moiety is modified at the 2’-position of the sugar moiety. In some embodiments, the modified 5’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro cytidine, 2’-fluoro uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine. [0176] In some embodiments, the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a 3’-terminal modified internucleoside linkage. In some embodiments, the polynucleotide acceptor strand comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 3’-terminal nucleoside of the polynucleotide acceptor strand. [0177] In some embodiments, the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a 5’-terminal modified internucleoside linkage. In some embodiments, the polynucleotide acceptor strand comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 5’-terminal nucleoside of the polynucleotide acceptor strand. [0178] In some embodiments, the polynucleotide acceptor strand of the double stranded RNA ligase substrate has at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified internucleoside linkages. 
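The positional and percentage language used for the acceptor strand can be made concrete with a small data model. The sketch below is illustrative only: the `Nucleoside` class, its field names, and the modification labels are hypothetical and not part of the disclosure. It represents a strand written 5'->3', reports the fraction of modified nucleosides, and lists modified positions counted from the 3'-terminal nucleoside (position 1) toward the 5'-terminal end.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Nucleoside:
    """One residue of a strand; names and labels are illustrative assumptions."""
    base: str                        # e.g. "A", "G", "C", "U"
    sugar_mod: Optional[str] = None  # e.g. "2'-F", "2'-OMe"; None if unmodified


def fraction_modified(strand: List[Nucleoside]) -> float:
    """Fraction of nucleosides in the strand that carry a sugar modification."""
    if not strand:
        raise ValueError("strand must contain at least one nucleoside")
    return sum(1 for n in strand if n.sugar_mod is not None) / len(strand)


def modified_positions_from_3prime(strand: List[Nucleoside]) -> List[int]:
    """1-based positions of modified nucleosides, counted from the 3' terminus.

    The strand is given 5'->3', so it is walked in reverse; position 1 is
    the 3'-terminal nucleoside.
    """
    return [
        position
        for position, n in enumerate(reversed(strand), start=1)
        if n.sugar_mod is not None
    ]
```

With a four-residue acceptor carrying a 2'-O-methyl at the second residue and a 2'-fluoro at the 3'-terminal residue, half of the nucleosides are modified and the modified positions from the 3' terminus are 1 and 3. The same functions apply to a donor strand by reversing the direction of counting.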
[0179] In some embodiments, the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a modified nucleoside and/or modified internucleoside linkage. [0180] In some embodiments, the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a 5’-terminal modified nucleoside. In some embodiments, the polynucleotide donor strand comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 5’-terminal nucleoside of the polynucleotide donor strand toward the 3’-terminus. [0181] In some embodiments, the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a 3’-terminal modified nucleoside. In some embodiments, the polynucleotide donor strand comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 3’-terminal nucleoside of the polynucleotide donor strand toward the 5’-terminus. [0182] In some embodiments, the polynucleotide donor strand of the double stranded RNA ligase substrate has at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified nucleosides. [0183] In some embodiments, the modified 5’-terminal nucleoside of the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a modified sugar moiety. In some embodiments, the sugar moiety is modified at the 2’-position of the sugar moiety. In some embodiments, the modified 5’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro cytidine, 2’-fluoro uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine. [0184] In some embodiments, the modified 3’-terminal nucleoside of the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a modified sugar moiety. 
In some embodiments, the sugar moiety is modified at the 2’-position of the sugar moiety. In some embodiments, the modified 3’-terminal nucleoside is a 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro cytidine, 2’-fluoro uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine. [0185] In some embodiments, the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a 5’-terminal modified internucleoside linkage. In some embodiments, the polynucleotide donor strand comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 5’-terminal nucleoside of the polynucleotide donor strand. [0186] In some embodiments, the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a 3’-terminal modified internucleoside linkage. In some embodiments, the polynucleotide donor strand comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 3’-terminal nucleoside of the polynucleotide donor strand. [0187] In some embodiments, the polynucleotide donor strand of the double stranded RNA ligase substrate comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified internucleoside linkages. [0188] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises a modified nucleoside and/or modified internucleoside linkage. [0189] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises a modified nucleoside in the nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide acceptor strand. 
In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises one or more modified nucleosides at nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide acceptor strand. Docket No. CX10-278WO1 [0190] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises a modified nucleoside in the nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide acceptor strand. In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises one or more modified nucleosides at nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide acceptor strand. [0191] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises a modified internucleoside linkage. [0192] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises at least 1, 2, 3, 4 or 5 modified internucleoside linkage at the 3’-terminal region of the polynucleotide strand. In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand comprises at least 1, 2, 3, 4 or 5 modified internucleoside linkage at the 5’-terminal region of the polynucleotide strand. [0193] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises a modified nucleoside and/or modified internucleoside linkage. 
[0194] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises a modified nucleoside in the nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide donor strand. In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises one or more modified nucleosides at nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide donor strand. [0195] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises a modified nucleoside in the nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide donor strand. In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises one or more modified nucleosides at nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide donor strand. [0196] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified nucleosides. [0197] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises a modified internucleoside linkage. [0198] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises at least 1, 2, 3, 4 or 5 modified internucleoside linkages at the 3’-terminal region of the polynucleotide strand. 
In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand comprises at least 1, 2, 3, 4 or 5 modified internucleoside linkages at the 5’-terminal region of the polynucleotide strand. [0199] In some embodiments, the polynucleotide complementary to the polynucleotide acceptor strand and polynucleotide donor strand of the double stranded RNA ligase substrate comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified internucleoside linkages. [0200] In some embodiments, the 3’-terminal nucleoside of a polynucleotide acceptor strand and/or the 5’-terminal nucleoside of the polynucleotide donor strand forming a ligation junction of the double stranded RNA ligase substrate comprise modified nucleosides. [0201] In some embodiments, one or more nucleosides 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 bases from the ligation junction on the polynucleotide acceptor strand are modified. In some embodiments, one or more nucleosides 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 bases from the ligation junction on the polynucleotide donor strand are modified. [0202] In some embodiments, the ligation junction comprises one or more modified internucleoside linkages. In some embodiments, the polynucleotide acceptor strand forming the ligation junction comprises a modified internucleoside linkage. In some embodiments, the polynucleotide donor strand forming the ligation junction comprises a modified internucleoside linkage. [0203] In some embodiments, the ligation junction of the double stranded RNA ligase substrate comprises at least a modified 3’-terminal nucleoside on the polynucleotide acceptor strand, and a modified 5’-terminal nucleoside on the polynucleotide donor strand. 
In some embodiments, the ligation junction comprises at least a modified 3’-terminal nucleoside on the polynucleotide acceptor, wherein the modified 3’-terminal nucleoside is 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro cytidine, 2’-fluoro uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’- O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O-methyl-thymidine, and comprises at least a modified 5’-terminal nucleoside on the polynucleotide donor, wherein the modified 5’-terminal nucleoside is 2’-fluoro-adenosine, 2’-fluoro-guanosine, 2’-fluoro cytidine, 2’-fluoro uridine, 2’-fluoro-thymidine, 2’-O-methyl-adenosine, 2’-O-methyl-guanosine, 2’-O-methyl-cytidine, 2’-O-methyl-uridine, or 2’-O- methyl-thymidine. Docket No. CX10-278WO1 [0204] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide acceptor strand at the ligation junction of the double stranded RNA ligase substrate comprises a modified nucleoside in the nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide acceptor strand. In some embodiments, the modified nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide acceptor strand is a complementary 2’-fluoro or 2’-O- methyl nucleoside. [0205] In some embodiments, the polynucleotide strand, or region thereof, complementary to the polynucleotide donor strand at the ligation junction of the double stranded RNA ligase substrate comprises a modified nucleoside in the nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide donor strand. In some embodiments, the modified nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide donor strand is a complementary 2’-fluoro or 2’-O- methyl nucleoside. 
[0206] In some embodiments, the method of predicting or modeling the reaction condition activity profile of a double stranded RNA ligase for a double stranded RNA ligase substrate further comprises predicting or modeling the reaction condition activity profile of at least a second double stranded RNA ligase substrate, and comparing the output of the predicted reaction condition activity profile of the double stranded RNA ligase for the double stranded RNA ligase substrate with the output of the predicted reaction condition activity profile of the double stranded RNA ligase for the second double stranded RNA ligase substrate. [0207] In some embodiments, the ligation of the double stranded RNA ligase substrate and ligation of the second double stranded RNA ligase substrate produce the same ligated product. In some embodiments, the double stranded RNA ligase substrate and the second double stranded RNA ligase substrate that produce the same product comprise at least one different ligation junction. [0208] In some embodiments, the double stranded RNA ligase substrate and the second double stranded RNA ligase substrate that result in the same product comprise at least two different ligation junctions. [0209] In some embodiments, the double stranded RNA ligase substrate and the second double stranded RNA ligase substrate comprise 3, 4, 5, 6, 7, 8, 9, 10, or more different ligation junctions. [0210] In some embodiments, the application of Gaussian Process Regression on ligase activity data and generation of predicted reaction condition activity profiles for polynucleotide ligases provides a method of screening ligases for activity on one or more ligase substrates, and identifying ligases having activity and corresponding reaction conditions favorable for the ligase substrate. 
[0211] In some embodiments, a method of identifying or screening polynucleotide ligases for activity on a ligase substrate comprises: (a) contacting a plurality of different polynucleotide ligases with a ligase substrate under a first reaction condition; (b) selecting the polynucleotide ligases with activity on the ligase substrate under the first reaction condition; (c) determining or predicting the reaction condition activity profile for each of the selected polynucleotide ligases on the ligase substrate; and (d) retesting activity of each ligase for the ligase substrate under best-fit reaction conditions for the ligase as determined from the predicted or modeled reaction condition activity profile, to identify the ligases having optimal activity among the screened ligases for the ligase substrate. [0212] In some embodiments of the screening method, a polynucleotide ligase of the plurality of ligases is provided at different concentrations. In some embodiments, the ligase is provided at concentrations from about 0.01 g/L to about 50 g/L; about 0.01 g/L to about 0.1 g/L; about 0.05 g/L to about 50 g/L; about 0.1 g/L to about 40 g/L; about 1 g/L to about 40 g/L; about 2 g/L to about 40 g/L; about 5 g/L to about 40 g/L; about 5 g/L to about 30 g/L; about 0.1 g/L to about 10 g/L; about 0.5 g/L to about 10 g/L; about 1 g/L to about 10 g/L; about 0.1 g/L to about 5 g/L; about 0.5 g/L to about 5 g/L; or about 0.1 g/L to about 2 g/L. [0213] In some embodiments, the predicted reaction condition activity profile for each ligase is determined by applying Gaussian Process Regression (GPR) on ligase activity data obtained under different or variable reaction conditions. As described herein, the different reaction conditions comprise varying two or more, or three or more of: divalent metal concentration, ligase substrate concentration, cofactor or co-substrate NTP concentration, buffer concentration, salt concentration, and ligase reaction temperature. 
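The use of Gaussian Process Regression to turn activity measurements into a predicted reaction condition activity profile, and then to pick best-fit conditions for retesting, can be sketched in a few lines. The following is a minimal, dependency-free illustration and not the implementation of the disclosure: the squared-exponential kernel, the fixed length scale and noise term, the absence of hyperparameter fitting, and all function names are assumptions chosen for the example. Each reaction condition is a tuple of scaled values (for instance divalent metal, NTP, and substrate concentrations mapped to the range 0 to 1).

```python
import math


def rbf(x, y, length_scale=1.0):
    """Squared-exponential kernel between two reaction-condition vectors."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq / length_scale ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L L^T) x = b by forward, then backward, substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gpr_fit(conditions, activities, noise=1e-6, length_scale=1.0):
    """Precompute weights alpha = (K + noise*I)^-1 (y - mean) for prediction."""
    mean = sum(activities) / len(activities)
    k = [[rbf(p, q, length_scale) + (noise if i == j else 0.0)
          for j, q in enumerate(conditions)]
         for i, p in enumerate(conditions)]
    alpha = solve_cholesky(cholesky(k), [y - mean for y in activities])
    return {"x": list(conditions), "alpha": alpha, "mean": mean, "ls": length_scale}


def gpr_predict(model, point):
    """Posterior mean activity at a reaction condition."""
    return model["mean"] + sum(
        w * rbf(point, x, model["ls"]) for w, x in zip(model["alpha"], model["x"]))


def best_condition(model, candidates):
    """Candidate condition with the highest predicted activity."""
    return max(candidates, key=lambda c: gpr_predict(model, c))
```

In this sketch, `gpr_fit` would be called once per selected ligase on its measured activities, `gpr_predict` evaluated over a grid of candidate conditions to give the predicted profile, and `best_condition` used to choose the conditions for the retest step. In practice an established GPR library with fitted kernel hyperparameters and predictive uncertainty would likely be preferred over this hand-rolled version.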
[0214] In some embodiments, the reaction condition activity data comprises activity determined for at least varying divalent metal concentrations, varying cofactor or co-substrate NTP concentrations, and varying double stranded RNA ligase substrate concentrations. [0215] As described herein, the ligase is a single stranded DNA ligase, a double stranded DNA ligase, a single stranded RNA ligase, or a double stranded RNA ligase. [0216] In some embodiments, the ligase is a single stranded DNA ligase and nucleic acid substrates are single stranded DNA substrates. In some embodiments, the single stranded DNA ligase uses ATP as a cofactor. In some embodiments, the single stranded DNA ligase is bacteriophage TS2126 RNA ligase (e.g., CircLigase; Epicentre Biotechnologies), Methanobacterium thermoautotrophicum RNA ligase 1, or 5' AppDNA/RNA Ligase (New England Biolabs). [0217] In some embodiments, the nucleic acid ligase is a double stranded DNA (dsDNA) ligase and the NMP donor cofactor or co-substrate comprises ATP or NAD (or dNAD). Various double stranded DNA ligases include, among others, viral, bacterial, fungal, plant, insect, and mammalian dsDNA ligases. [0218] In some embodiments, the double stranded DNA ligase uses NAD or dNAD as a cofactor/co-substrate. In some embodiments, dsDNA ligases using NAD as a cofactor/co-substrate include double stranded DNA ligases of eubacteria, such as E. coli, Thermus filiformis, Thermus aquaticus, Thermus thermophilus, Bacillus stearothermophilus, Haemophilus influenzae, Desulfurolobus ambivalens, Sulfolobus acidocaldarius, Methanothermus fervidus, and Methanococcus vannielii. [0219] In some embodiments, the double stranded DNA ligase uses ATP as a cofactor or co-substrate. 
In some embodiments, double stranded DNA ligases using ATP as a cofactor or co-substrate include bacteriophage double stranded DNA ligases (e.g., T3, T4, T6, and T7-DNA ligases), fungal double stranded DNA ligases (e.g., Saccharomyces cerevisiae, Schizosaccharomyces pombe, etc.), human DNA ligase I, vaccinia DNA ligase, and African swine fever virus dsDNA ligase. [0220] In some embodiments, the double stranded DNA ligase comprises engineered or recombinant dsDNA ligase variants or chemically modified double stranded DNA ligases. In some embodiments, the double stranded DNA ligase comprises an engineered double stranded DNA ligase disclosed in U.S. Patent No.8728725, U.S. Patent No.10626390, U.S. Patent No.10837009, U.S. Patent No.11124789, WO2018208665, WO2024158764, and Wilson et al., Protein Engineering, Design & Selection, 2013, 26(7):471-478, all of which are incorporated by reference herein. [0221] In some embodiments, the nucleic acid ligase is a single stranded RNA (ssRNA) ligase and the cofactor or co-substrate NTP comprises ATP. In some embodiments, various single stranded RNA ligases useful in the enzymatic reactions include, among others, RNA ligase 1, such as of bacteriophage T4, Citrobacter phage Merlin, Escherichia phage vB_EcoM_VR25, Serratia phage PS2, Phage TS2126, and Rhodothermus phage RM378. [0222] In some embodiments, the single stranded RNA ligase comprises an engineered single stranded RNA ligase. In some embodiments, the engineered single stranded RNA ligase is disclosed in U.S. provisional application No.63/634,859, filed April 16, 2024, and U.S. provisional application No.63/646,841, filed May 13, 2024, incorporated by reference herein. [0223] In some embodiments, the nucleic acid ligase is a double stranded RNA (dsRNA) ligase and the cofactor or co-substrate NTP comprises ATP. [0224] In some embodiments, the dsRNA ligase comprises an engineered dsRNA ligase disclosed in WO2024138200; U.S. 
provisional application No.63/618,203, filed January 5, 2024; U.S. provisional application No.63/554,938, filed January 16, 2024; U.S. provisional application 63/646,753, filed May 13, 2024; and U.S. provisional application No.63/601,699, filed November 21, 2023; all references incorporated herein by reference. Docket No. CX10-278WO1 [0225] In some embodiments, the nucleic acid ligase that produces reaction product NMP is an RNA splicing ligase and the cofactor or co-substrate comprises ATP. In some embodiments, the RNA splicing ligase is tRNA splicing ligase. [0226] In some embodiments, the RNA splicing ligase comprises RNA ligase RtcB or rRNA ligase, and homologs thereof, including human and C. elegans. Modified Ligase Substrate [0227] In some embodiments, the modification or modifications in the ligase substrate, such as modified nucleosides and/or modified internucleoside linkages, can comprise modifications described herein and below. 2’- and 3’-sugar modifications [0228] In some embodiments, the modified nucleotide comprises a modified nucleoside, wherein the modification is on the sugar moiety of the nucleoside. In some embodiments, the modified sugar moiety is a modified furanosyl sugar moiety, for example ribose or deoxyribose. In some embodiments, the furanosyl sugar moiety is modified or substituted at the 2’, 3’, or a combination of 2’ and 3’ positions, as appropriate. In some embodiments, the modification is at the 2’-position of the sugar moiety. In some embodiments, substitutions at the 2’- position include, among others, halo (e.g., Cl, F, Br, etc.) or -O-alkyl or 2’-alkoxy (e.g., O-methyl, O-ethyl, etc.). In some embodiments, other modifications at the 2’-position include, but are not limited to, allyl, amino, azido, SH, CN, OCN, CF3, OCF3, SCH3, SOCH3, SO2CH3, ONO2, NO2, N3, and NH2. 
In some embodiments, substituent groups at the 2’-position include, among others, O-(C1-C10)alkoxy, alkoxyalkyl, O-alkyl, S-alkyl, N-alkyl, O-alkenyl, S-alkenyl, N-alkenyl, O-alkynyl, S-alkynyl, N-alkynyl, O-alkyl-O-alkyl, alkynyl, wherein the alkyl, alkenyl and alkynyl can be substituted or unsubstituted C1-C10 alkyl or C1-C10 alkenyl and alkynyl. In some embodiments, substituent groups at the 2’-position include, but are not limited to, alkaryl, aralkyl, O-alkaryl, and O-aralkyl. In some embodiments, the substitution at the 2’-position is a phosphate (see, e.g., Current Protocols in Nucleic Acid Chemistry, 13.1.1-13.1.31, John Wiley & Sons (2003)). [0229] In some embodiments, the modified 2’-position of the sugar moiety is halo, 2’-O-R’, or 2’-O-COR’, where R’ is an alkyl, alkyloxyalkyl, cycloalkyl, heterocyclyl, aryl, heteroaryl, cycloalkylalkyl, heterocyclylalkyl, arylalkyl, or heteroarylalkyl. In some embodiments, R’ is a C1-C4alkyl. In some embodiments, the modified 2’-position is a 2’-O-R’, wherein R’ is alkyloxyalkyl, alkylamine, cyanoalkyl, or -C(O)-alkyl. In some embodiments, the 2’-position of the sugar moiety of the nucleoside substrate is -O-R’, wherein R’ is -CH3 or -CH2CH3 or -CH2CH2OCH3. In some embodiments, the modified 2’-position is 2’-O-(2-methoxyethyl), 2’-O-allyl, 2’-O-propargyl, 2’-O-ethylamine, 2’-O-cyanoethyl, 2’-O-amine, or 2’-O-acetate ester. [0230] In some embodiments, a modification at the 2’-position comprises a locked nucleoside. In some embodiments, a locked nucleoside comprises a biradical linking the C2’ and C4’ of the ribose sugar ring of said nucleoside (also referred to as a “2’-4’ bridge”), which restricts or locks the conformation of the ribose ring (see, e.g., Obika et al., Tetrahedron Letters, 1997, 38(50):8735–8738; Orum et al., Current Pharmaceutical Design, 2008, 14(11):1138–1142). 
In some embodiments, the ribose moiety of the locked nucleotide is in the C3’-endo (beta-D) or C2’-endo (alpha-L) conformation. In some embodiments, the bridge is a methylene bridge. In some embodiments, the bridge is an ethylene bridge, also referred to as ENA (see, e.g., Morita et al., Bioorg Med Chem Lett., 2002, 12(1):73-6). Other locked nucleosides are described in International patent publication WO2021249993, incorporated by reference herein. [0231] In some embodiments, other locked nucleosides include, among others, 5’-methyl-LNA, 2’-amino-LNA, alpha-L-LNA, and thio-LNA. Structures of certain locked nucleosides are shown below: where R in the above is alkyl or acyl, and B refers to a nucleobase. [0232] In some embodiments, a modification at the 2’-position comprises a reactive moiety or a conjugate moiety, including a conjugate moiety attached via a linker or a linker, as described herein. [0233] In some embodiments, the modification is at the 3’-position of the sugar moiety. In some embodiments, in view of the effect of a 3’-modification on ligase activity, and use of the 3’-OH group for internucleoside linkage, the 3’-modification is on the 3’-terminal nucleoside of the nucleotide donor. In some embodiments, the modifications at the 3’-position are similar to those at the 2’-position. In some embodiments, substitutions at the 3’-position include, among others, halo (e.g., Cl, F, Br, etc.) or -O-alkyl or 3’-alkoxy (e.g., O-methyl, O-ethyl, etc.). In some embodiments, other modifications at the 3’-position include, but are not limited to, allyl, amino, azido, SH, CN, OCN, CF3, OCF3, SCH3, SOCH3, SO2CH3, ONO2, NO2, N3, and NH2. 
In some embodiments, substituent groups at the 3’-position include, among others, O-(C1-C10)alkoxy, alkoxyalkyl, O-alkyl, S-alkyl, N-alkyl, O-alkenyl, S-alkenyl, N-alkenyl, O-alkynyl, S-alkynyl, N-alkynyl, O-alkyl-O-alkyl, alkynyl, wherein the alkyl, alkenyl and alkynyl can be substituted or unsubstituted C1-C10 alkyl or C1-C10 alkenyl and alkynyl. In some embodiments, substituent groups at the 3’-position include, but are not limited to, alkaryl, aralkyl, O-alkaryl, and O-aralkyl. In some embodiments, the substitution at the 3’-position is a phosphate. [0234] In some embodiments, the modified 3’-position of the sugar moiety is halo, 3’-O-R’, or 3’-O-COR’, where R’ is an alkyl, alkyloxyalkyl, cycloalkyl, heterocyclyl, aryl, heteroaryl, cycloalkylalkyl, heterocyclylalkyl, arylalkyl, or heteroarylalkyl. In some embodiments, R’ is a C1-C4alkyl. In some embodiments, the modified 3’-position is a 3’-O-R’, wherein R’ is alkyloxyalkyl, alkylamine, cyanoalkyl, or -C(O)-alkyl. In some embodiments, the 3’-position of the sugar moiety of the nucleoside substrate is -O-R’, wherein R’ is -CH3 or -CH2CH3 or -CH2CH2OCH3. In some embodiments, the modified 3’-position is 3’-O-(2-methoxyethyl), 3’-O-allyl, 3’-O-propargyl, 3’-O-ethylamine, 3’-O-cyanoethyl, 3’-O-amine, or 3’-O-acetate ester. [0235] In some embodiments, the modification at the 3’-position is a reversible or cleavable 3’-blocking group. In some embodiments, removal or cleaving of the reversible or cleavable 3’-blocking group results in a free 3’-OH group, which in some embodiments can serve as an acceptor for single-stranded RNA ligase or a terminal nucleotidyl transferase. In some embodiments, exemplary reversible or cleavable 3’-blocking groups include, among others, 3’-O-azidomethyl, 3’-O-(2-methoxyethyl), 3’-O-allyl, 3’-O-propargyl, 3’-O-ethylamine, 3’-O-cyanoethyl, 3’-O-amine, 3’-O-acetate ester, 3’-phosphate, 3’-diphosphate, or 3’-triphosphate. 
In some embodiments, the 3’-blocking group is paired with the corresponding deblocking agent used in the deblocking or cleavage of the 3’-blocking group. 
Other reversible or cleavable 3’-blocking groups are described in International patent publication WO2023183569, incorporated by reference herein. [0236] In some embodiments, a modification at the 3’-position comprises a reactive moiety or a conjugate moiety, including a conjugate moiety attached via a linker or a linker, as described herein. [0237] In some embodiments, the modified sugar moiety comprises an unlocked nucleoside. In some embodiments, in the unlocked nucleoside, the furanosyl ring is opened to result in the structure below: where B represents the nucleobase. Unlocked nucleosides are described in, among others, International patent publication WO2022/098990 and Snead et al., Molecular Therapy-Nucleic Acids, 2013, 2, e103. [0238] As described herein, in some embodiments, where the modification is to the 3’-terminal nucleotide of the polynucleotide acceptor, the 3’-OH group of the nucleoside, or equivalent position thereof, is maintained to act as an acceptor for the ligase reaction. In some embodiments, where the modification is to the 5’-terminal nucleotide of the polynucleotide donor, the 5’-phosphate group of the nucleoside, or equivalent position thereof, is maintained to act as a donor for the ligase reaction. In some embodiments, the 5’-phosphate group of the polynucleotide donor strand comprises a 5’-phosphorothioate (see, e.g., U.S. Patent No.6811986, incorporated by reference herein). Modified nucleobases [0239] In some embodiments, the modified nucleotide comprises a modified nucleobase. In some embodiments, a modified nucleobase that is capable of hydrogen bonding to form Watson and Crick type base pairing is selected. [0240] In some embodiments, the modified nucleoside is an inosine nucleoside (i.e., a nucleoside comprising a hypoxanthine nucleobase). 
In some embodiments, the modified nucleobase is selected from 5-substituted pyrimidines, 6-azapyrimidines, alkyl or alkynyl substituted pyrimidines, alkyl substituted purines, and N-2, N-6 and O-6 substituted purines. In some embodiments, the modified nucleobase is 2-aminopropyladenine, 5-hydroxymethyl cytosine, 5-methylcytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-N-methylguanine, 6-N-methyladenine, 2-propyladenine, 2-thiouracil, 2-thiothymine, 2-thiocytosine, 5-propynyl uracil, 5-propynylcytosine, 6-azouracil, 6-azocytosine, 6-azothymine, 5-ribosyluracil (pseudouracil), 4-thiouracil, 8-halo purine, 8-amino purine, 8-thio purine, 8-thioalkyl purine, 8-hydroxy purine, 8-aza purine, 5-bromocytosine, 5-trifluoromethylcytosine, 5-halouracil, 5-halocytosine, 7-methylguanine, 7-methyladenine, 2-F-adenine, 7-deazaguanine, 7-deazaadenine, 3-deazaguanine, 3-deazaadenine, 6-N-benzoyladenine, 2-N-isobutyrylguanine, 4-N-benzoylcytosine, 4-N-benzoyluracil, 5-methyl 4-N-benzoylcytosine, or 5-methyl 4-N-benzoyluracil. Further modified nucleobases include tricyclic pyrimidines, e.g., 1,3-diazaphenoxazine-2-one, 1,3-diazaphenothiazine-2-one, and 9-(2-aminoethoxy)-1,3-diazaphenoxazine-2-one (G-clamp). [0241] In some embodiments, the modified nucleobase includes, among others, nucleobases based on 2,4-dihalotoluene and benzimidazole groups. In some embodiments, the modified nucleobase is 4-methylbenzimidazole, 2,4-difluorotoluene, 9-methylimidazo[(4,5)-b]pyridine, 2,4-dibromotoluene, benzimidazole, 5-nitrobenzimidazole, 6-nitrobenzimidazole, or 5-nitroindole. In some embodiments, the modified nucleobase is 7-azaindole or isocarbostyril (see, e.g., Berdis et al., Front. Chem. 10:1051525). Other modified nucleobases are described in, among others, patent publication WO2021249993. [0242] In some embodiments, included within modified nucleobases is a nucleoside that does not have a nucleobase, also referred to as an abasic nucleoside. 
In some embodiments, the abasic nucleoside is present in the internal portion of an oligonucleotide acceptor. In some embodiments, an abasic nucleoside is attached to the 3’- or 5’-terminal end, which is in certain embodiments grouped as a terminal group. [0243] In some embodiments, the modified nucleobase is present on the 5’-terminal nucleoside of the polynucleotide acceptor or polynucleotide donor, 3’-terminal nucleoside of the polynucleotide acceptor or polynucleotide donor, and/or present on the internal nucleosides of the polynucleotide acceptor or polynucleotide donor, as described herein. In some embodiments, blocks or contiguous stretches of nucleosides in the polynucleotide acceptor or polynucleotide donor have modified nucleobases. [0244] In some embodiments, where the nucleic acid ligase substrate is a double stranded nucleic acid ligase substrate, the polynucleotide complementary to the polynucleotide acceptor strand and/or the polynucleotide donor strand comprises one or more modified nucleobases. Modified internucleoside linkages [0245] In some embodiments, the modified nucleotide comprises at least one modified, non-naturally occurring internucleoside linkage. In some embodiments, the modified polynucleotide has 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more modified internucleoside linkages. In some embodiments, all of the internucleoside linkages are modified internucleoside linkages. [0246] In some embodiments, the modified internucleoside linkage is a phosphorous containing modified internucleoside linkage. Exemplary phosphorous-containing internucleoside linkages include, among others, phosphotriesters, alkylphosphonates (e.g., methyl phosphonate, ethyl phosphonate, etc.), phosphoramidates, phosphorothioate, and phosphorodithioate. [0247] In some embodiments, the modified internucleoside linkage is a non-phosphorous containing internucleoside linkage. 
Exemplary non-phosphorous containing internucleoside linkages include, among others, methylenemethylimino (-CH2-N(CH3)-O-CH2-), thiodiester, thionocarbamate (-O-C(=O)(NH)-S-), siloxane (-O-SiH2-O-), N,N’-dimethylhydrazine (-CH2-N(CH3)-N(CH3)-), MMI (3'-CH2-N(CH3)-O-5'), amide-3 (3'-CH2-C(=O)-N(H)-5'), amide-4 (3'-CH2-N(H)-C(=O)-5'), formacetal (3'-O-CH2-O-5'), methoxypropyl, and thioformacetal (3'-S-CH2-O-5'). In some embodiments, the modified internucleoside linkage is an amide linkage, such as those of glycine nucleosides or nucleoside β-amino acids (see, e.g., Banerjee et al., Bioconjugate Chem., 2015, 26, 8, 1737–1742).

[0248] In some embodiments, the modified internucleoside linkage provides a chiral center. For example, a phosphorothioate or alkylphosphonate internucleoside linkage can be in the Rp or Sp stereoisomeric configuration. In some embodiments, the polynucleotide acceptor and/or polynucleotide donor have a mixture of stereoisomers in the internucleoside linkages. In some embodiments, the polynucleotide acceptor and/or polynucleotide donor have greater than 50% of the internucleoside linkages in the Rp or Sp configuration. In some embodiments, the polynucleotide acceptor and/or polynucleotide donor have at least 60%, 70%, 80%, 90%, or greater of the internucleoside linkages in the Rp or Sp stereoisomeric configuration.

[0249] In some embodiments, the modified internucleoside linkages are present in the 5’-terminal region of the polynucleotide acceptor and/or polynucleotide donor. In some embodiments, at least 1, 2, 3, 4, or 5 modified internucleoside linkages are present at the 5’-terminal region of the polynucleotide acceptor and/or polynucleotide donor. In some embodiments, at least 1 or 2 phosphorothioate internucleoside linkages are present at the 5’-terminal region of the polynucleotide acceptor and/or polynucleotide donor. In some embodiments, the phosphorothioate linkage is a non-bridging phosphorothioate internucleoside linkage.
[0250] In some embodiments, the modified internucleoside linkages are present in the 3’-terminal region of the polynucleotide acceptor or polynucleotide donor. In some embodiments, at least 1, 2, 3, 4, or 5 modified internucleoside linkages are present at the 3’-terminal region of the polynucleotide acceptor and/or polynucleotide donor. In some embodiments, at least 1 or 2 phosphorothioate internucleoside linkages are present at the 3’-terminal region of a polynucleotide acceptor or polynucleotide donor. In some embodiments, the modified internucleoside linkages are present in the internal portions of the polynucleotide acceptor or polynucleotide donor.

[0251] In some embodiments, the polynucleotide acceptor and/or polynucleotide donor comprises at least a phosphorothioate internucleoside linkage, where the phosphorothioate linkage is in the Sp configuration, the Rp configuration, or a mixture of Sp and Rp configurations in the nucleotides of the polynucleotide acceptor and/or polynucleotide donor strand.

[0252] In some embodiments, where the nucleic acid ligase substrate is a double stranded nucleic acid ligase substrate, the polynucleotide complementary to the polynucleotide acceptor strand and/or the polynucleotide donor strand comprises one or more modified internucleoside linkages.

Terminal groups

[0253] In some embodiments, the polynucleotide acceptor and/or polynucleotide donor comprises a terminal group. In some embodiments, where the nucleic acid ligase substrate is a double stranded nucleic acid ligase substrate, the polynucleotide complementary to the polynucleotide acceptor strand and/or the polynucleotide donor strand comprises a terminal group.

[0254] In some embodiments, the terminal group is attached to the 5’-OH or 4’-carbon atom of the terminal nucleoside. In some embodiments, the terminal group comprises a C-4’ modification of the 5’-terminal nucleoside, including among others, 4’-thio-C2’ modifications, 4’-aminoalkyl, C4’-guanidino-C2’-modifications, and C4’-O-methyl (see, e.g., Gangopadhyay et al., RNA Biology, 2022, 19:1, 452-467).

[0255] In some embodiments, the 5’-terminal group is a 5’-phosphate modification. In some embodiments, the 5’-phosphate modification includes, among others, 5’-C-methyl, particularly the S isomer; 5’-(E or Z)-vinylphosphonate; or 5’-methylenephosphonate.

[0256] In some embodiments, the 5’-terminal group comprises an abasic nucleotide attached to the 5’-OH. In some embodiments, the 5’-terminal group comprises an inverted abasic nucleotide (5’-5’) attached to the 5’-OH of the 5’-end nucleoside.

[0257] In some embodiments, the terminal group comprises a 3’-terminal group. In some embodiments, the 3’-terminal group comprises a 3’-phosphate, which can also function as a reversible blocking group. In some embodiments, the 3’-phosphate is modified, such as with 3’-(E or Z)-vinylphosphonate, or 3’-methylenephosphonate. In some embodiments, the 3’-terminal group on the nucleotide donor comprises an abasic nucleoside. In some embodiments, the 3’-terminal group comprises an inverted abasic nucleotide (3’-3’).

Conjugate moiety

[0258] In some embodiments, the modified nucleoside comprises a conjugate moiety. In some embodiments, the polynucleotide acceptor and/or the polynucleotide donor comprises a conjugate moiety. In some embodiments, where the nucleic acid ligase substrate is a double stranded nucleic acid ligase substrate, the polynucleotide complementary to the polynucleotide acceptor strand and/or the polynucleotide donor strand comprises a conjugate moiety.

[0259] In some embodiments, the conjugate moiety (i.e., non-nucleotide moiety) includes, among others, carbohydrates (e.g., GalNAc), lipids, sterols, drug substances, hormones, polymers (e.g., polyethylene glycol, etc.), proteins, peptides, toxins (e.g., bacterial toxins, etc.), vitamins (e.g., folate, tocopherol, retinoic acid, etc.), or combinations thereof.
In some embodiments, the conjugate moiety is used to affect the pharmacokinetics of an oligonucleotide, including for cellular targeting of an oligonucleotide.

[0260] In some embodiments, the conjugate moiety can be attached to the 5’-terminal nucleotide, the 3’-terminal nucleotide, or an internal nucleotide of a ligase substrate. In some embodiments, the conjugate moiety is attached to the 2’-position of the sugar moiety of a nucleoside, for example, to the 2’-OH. In some embodiments, the conjugate moiety is attached to the 3’-position of the sugar moiety of the nucleoside, for example, to the 3’-OH. In some embodiments, the conjugate moiety is attached to the nucleobase, as discussed above (see, e.g., Biscans et al., Nucleic Acids Res. 2019 Feb 20; 47(3): 1082–1096). In some embodiments, the conjugate moiety is attached directly or attached using a linker.

[0261] In some embodiments, the conjugate moiety comprises a C6-C22 alkyl, C6-C22 alkenyl, or C6-C22 alkynyl. In some embodiments, the conjugate moiety comprises a C6-alkyl, C7-alkyl, C8-alkyl, C9-alkyl, C10-alkyl, C11-alkyl, C12-alkyl, C13-alkyl, C14-alkyl, C15-alkyl, C16-alkyl, C17-alkyl, C18-alkyl, C19-alkyl, C20-alkyl, C21-alkyl, or C22-alkyl. In some embodiments, the conjugate moiety comprises a C6-alkenyl, C7-alkenyl, C8-alkenyl, C9-alkenyl, C10-alkenyl, C11-alkenyl, C12-alkenyl, C13-alkenyl, C14-alkenyl, C15-alkenyl, C16-alkenyl, C17-alkenyl, C18-alkenyl, C19-alkenyl, C20-alkenyl, C21-alkenyl, or C22-alkenyl. In some embodiments, the conjugate moiety comprises a C6-alkynyl, C7-alkynyl, C8-alkynyl, C9-alkynyl, C10-alkynyl, C11-alkynyl, C12-alkynyl, C13-alkynyl, C14-alkynyl, C15-alkynyl, C16-alkynyl, C17-alkynyl, C18-alkynyl, C19-alkynyl, C20-alkynyl, C21-alkynyl, or C22-alkynyl.

[0262] In some embodiments, the conjugate moiety comprises a heteroalkyl, heteroalkenyl, or heteroalkynyl.
In some embodiments, the heteroalkyl, heteroalkenyl or heteroalkynyl has one or more carbon atoms replaced with a heteroatom, such as O, S, or N.

[0263] In some embodiments, the conjugate moiety comprises a cycloalkyl or heterocycloalkyl group. In some embodiments, the cycloalkyl includes, among others, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, 1-cyclohexenyl, 3-cyclohexenyl, and cycloheptyl. In some embodiments, the heterocycloalkyl includes, among others, 1-(1,2,5,6-tetrahydropyridyl), 1-piperidinyl, 2-piperidinyl, 3-piperidinyl, 4-morpholinyl, 3-morpholinyl, tetrahydrofuran-2-yl, tetrahydrofuran-3-yl, tetrahydrothien-2-yl, tetrahydrothien-3-yl, 1-piperazinyl, and 2-piperazinyl.

[0264] In some embodiments, the conjugate moiety comprises an aryl or heteroaryl moiety. In some embodiments, the aryl group includes, among others, phenyl, naphthyl, indenyl, biphenyl, phenanthrenyl, naphthacenyl, anthracenyl, fluorenyl, indenyl, and azulenyl. In some embodiments, a heteroaryl group includes, among others, pyridyl, furanyl, thienyl, pyrrolyl, oxazolyl, oxadiazolyl, imidazolyl, thiazolyl, isoxazolyl, quinolinyl, pyrazolyl, isothiazolyl, pyridazinyl, pyrimidinyl, pyrazinyl, triazinyl, isoquinolinyl, and indazolyl.

[0265] In some embodiments, the conjugate moiety comprises a cycloalkylalkyl-, heterocycloalkylalkyl-, arylalkyl-, heteroarylalkyl-, cycloalkylheteroalkyl-, heterocycloalkylheteroalkyl-, arylheteroalkyl-, heteroarylheteroalkyl-, cycloalkylalkenyl-, heterocycloalkylalkenyl-, arylalkenyl-, heteroarylalkenyl-, cycloalkylheteroalkenyl-, heterocycloalkylheteroalkenyl-, arylheteroalkenyl-, or heteroarylheteroalkenyl-.

[0266] In some embodiments, the conjugate moiety comprises a lipid or lipophilic moiety, for example a fatty acid. In some embodiments, the fatty acid comprises a saturated fatty acid, unsaturated fatty acid, or a polyunsaturated fatty acid.
In some embodiments, the fatty acid comprises caprylic acid, lauric acid, myristic acid, palmitic acid, stearic acid, arachidic acid, behenic acid, oleic acid, elaidic acid, cis-vaccenic acid, trans-vaccenic acid, linoleic acid, alpha-linolenic acid, gamma-linolenic acid, arachidonic acid, eicosapentaenoic acid, decanoic acid, docosahexaenoic acid (DHA), and docosanoic acid (DCA) conjugate moieties (see, e.g., Kubo et al., ACS Chem. Biol., 2021, 16, 150−164; see also, WO2024040041; incorporated herein by reference).

[0267] In some embodiments, the conjugate moiety comprises a sterol. In some embodiments, the sterol comprises cholesterol, alpha-cholesterol, cholesterol ester (e.g., cholesteryl palmitate, etc.), cholesterol sulfate, phytosterol, cholic acid, or lithocholic acid.

[0268] In some embodiments, the conjugate moiety comprises a phospholipid. In some embodiments, the phospholipid comprises phosphatidic acid, phosphatidylethanolamine, phosphatidylcholine, phosphatidylinositol, phosphatidylserine, or a sphingolipid.

[0269] In some embodiments, the conjugate moiety comprises a carbohydrate, particularly a carbohydrate moiety acting as a ligand for a cellular receptor for cellular targeting of the oligonucleotide. In some embodiments, the carbohydrate moiety comprises galactose or galactose derivatives. In some embodiments, the carbohydrate moiety is attached to the nucleoside via a linker. In some embodiments, exemplary carbohydrates that can be used include the following (structures not reproduced here).

[0270] [...] conjugate moiety. In some embodiments, the oligonucleotide acceptor and/or nucleotide donor may be conjugated to at least one conjugate moiety comprising at least one N-acetylgalactosamine (GalNAc) moiety. In some embodiments, the conjugate moiety is a monovalent, divalent, trivalent, or tetravalent GalNAc.

[0271] In some embodiments, the GalNAc moiety has the following structure (structure not reproduced here), where L is a linker, and [...].
In some embodiments, the W is the 2’-OH of the sugar moiety of a GalNAc moiety is 2’-position of a nucleoside, such as contiguous nucleotides in an oligonucleotide (see, e.g., WO2024040041).

[0272] In some embodiments, the conjugate moiety is a trivalent GalNAc. Trivalent N-acetylgalactosamine conjugate moieties are described in, for example, WO 2014/076196, WO 2014/207232 and WO 2014/179620. The term “trivalent GalNAc” refers to a residue comprising three N-acetylgalactosamine moieties, typically attached via a linker. In some embodiments, the conjugate moiety is L96. An exemplary trivalent GalNAc conjugate moiety is depicted below (structure not reproduced here):
[0273] In some embodiments, the conjugate moiety comprises a reporter molecule. Examples of reporter molecules include, among others, fluorescent moieties, such as fluorescein and fluorescein dyes (e.g., fluorescein isothiocyanate or FITC, naphthofluorescein, 4′,5′-dichloro-2′,7′-dimethoxy-fluorescein, 6-carboxyfluorescein or FAM), carbocyanine, merocyanine, styryl dyes, oxonol dyes, phycoerythrin, erythrosin, eosin, rhodamine dyes (e.g., carboxytetramethylrhodamine or TAMRA, carboxyrhodamine 6G, carboxy-X-rhodamine (ROX), lissamine rhodamine B, rhodamine 6G, rhodamine Green, rhodamine Red, tetramethylrhodamine or TMR), coumarin and coumarin dyes (e.g., methoxycoumarin, dialkylaminocoumarin, hydroxycoumarin and aminomethylcoumarin or AMCA), Oregon Green Dyes (e.g., Oregon Green 488, Oregon Green 500, Oregon Green 514), Texas Red, Texas Red-X, Spectrum Red™, Spectrum Green™, cyanine dyes (e.g., Cy-3™, Cy-5™, Cy-3.5™, Cy-5.5™), Alexa Fluor dyes (e.g., Alexa Fluor 350, Alexa Fluor 488, Alexa Fluor 532, Alexa Fluor 546, Alexa Fluor 568, Alexa Fluor 594, Alexa Fluor 633, Alexa Fluor 660 and Alexa Fluor 680), BODIPY dyes (e.g., BODIPY FL, BODIPY R6G, BODIPY TMR, BODIPY TR, BODIPY 530/550, BODIPY 558/568, BODIPY 564/570, BODIPY 576/589, BODIPY 581/591, BODIPY 630/650, BODIPY 650/665), and IRDyes (e.g., IRD40, IRD 700, IRD 800). (See, e.g., “The Handbook of Fluorescent Probes and Research Products”, 9th Ed., R.P. Haugland, 2002, Molecular Probes, Inc., Eugene, Oregon).

[0274] In some embodiments, the reporter moiety is a chemiluminescent moiety, for example acridinium esters, ruthenium derivatives (e.g., tris(2,2′-bipyridyl) ruthenium), and dioxetanes.

[0275] In some embodiments, the conjugate moiety comprises an affinity or capture tag.
Exemplary affinity or capture tags include, among others, biotin, desthiobiotin, digoxigenin, 3-amino-3-deoxydigoxigenin, and a hapten (e.g., dinitrophenol, Alexa Fluor 40, Alexa Fluor 488, dansyl, Lucifer yellow, Oregon Green 488, fluorescein).

[0276] In some embodiments, the conjugate moiety comprises a peptide. In some embodiments, the peptide comprises a cellular targeting peptide and/or cell penetrating peptide (CPP) for enhancing cellular delivery of a conjugate-modified polynucleotide. In some embodiments, the cell penetrating peptide is attached via a linker, including a cleavable linker. Cell penetrating peptides include, among others, TAT, penetratin, MAP, transportan/TP10, VP22, polyarginine, MPG, Pep-1, pVEC, YTA2, YTA4, M918, and CADY. In some embodiments, the conjugate moiety comprises an RGD (Arg-Gly-Asp) peptide. Sequences of some cell penetrating peptides are described in Copolovici et al., 2014, 8(3):1972–1994, incorporated by reference herein.

[0277] Other cell penetrating peptides, including those conjugated to nucleic acids, are disclosed in, among others, patent publications WO24063570, WO24044663, US2024083949, WO24026141, WO23230600, WO23219933, WO23177261, WO23178327, WO23093960, WO23086342, WO23081893, WO23069332, WO23070108, WO23034515, US2023248630, US2023053924, WO23003380, WO23277628, WO23277575, US2022378946, WO22171972, WO22162200, WO2020144233, WO22180242, WO22132520, WO22129926, WO22125673, WO22120276, WO22101193, US2023287086, US2023357334, US2023144488, and US2023048338; incorporated by reference herein. In some embodiments, the peptide can be attached using a thiol group on the 5’-phosphate of a polynucleotide or oligonucleotide.

Systems and Computer Implementation

[0278] In some embodiments, the method herein is implemented by a computer.
In some embodiments, the present disclosure provides a computer implemented method for predicting or modeling a reaction condition activity profile of a ligase for a ligase substrate, comprising: receiving ligase activity data of a ligase for a ligase substrate under different or variable reaction conditions; applying Gaussian Process Regression using the ligase activity data for the ligase substrate obtained under different or variable reaction conditions; and generating an output of the predicted or modeled reaction condition activity profile of the ligase for the ligase substrate.

[0279] In some embodiments, the present disclosure further provides a system for predicting or modeling the reaction condition activity profile of one or more ligases for a ligase substrate, comprising: one or more processors; and a memory storing instructions configured to, when executed by the processor, cause the processor to input or receive activity data of a ligase for a ligase substrate for different or variable reaction conditions; process the reaction condition activity data using a Gaussian Process Regression; and generate an output of a predicted or modeled reaction condition activity profile of the ligase for the ligase substrate for the different reaction conditions.

[0280] In some embodiments, the present disclosure further provides a computer readable storage medium for predicting or modeling the reaction condition activity profile of a ligase for a ligase substrate, comprising one or more programmed instructions configured to direct one or more processors to: input or receive activity data of a ligase for a ligase substrate under different or variable reaction conditions; apply Gaussian Process Regression (GPR) on the activity data obtained for the different or variable reaction conditions; and generate an output of the GPR predicted or modeled reaction condition activity profile of the ligase for the ligase substrate.
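As a rough illustration of the receive-data, apply-GPR, generate-output sequence described above, the following Python sketch fits a Gaussian process to a small table of activity measurements and tabulates a predicted activity profile over a grid of conditions. It is a minimal sketch under stated assumptions, not the implementation of the disclosure: the squared-exponential kernel, the fixed hyperparameters (length_scale, signal_var, noise_var), the log2 scaling of concentrations, the choice of MgCl2 and ATP as the two varied conditions, all function names, and the activity values themselves are hypothetical and chosen only for illustration.

```python
import math

def rbf(x1, x2, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two condition vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return signal_var * math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def solve_lower(low, b):
    """Forward substitution: solve low @ y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y

def solve_upper_t(low, y):
    """Back substitution: solve transpose(low) @ x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / low[i][i]
    return x

def gpr_fit(X, y, noise_var=1e-3):
    """Precompute the quantities needed for GP posterior predictions."""
    mean_y = sum(y) / len(y)  # constant prior mean set to the data mean
    K = [[rbf(a, b) + (noise_var if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    low = cholesky(K)
    alpha = solve_upper_t(low, solve_lower(low, [v - mean_y for v in y]))
    return {"X": X, "low": low, "alpha": alpha, "mean": mean_y}

def gpr_predict(model, x_new):
    """Posterior mean and variance of the latent activity at one condition."""
    k_star = [rbf(x_new, xi) for xi in model["X"]]
    mean = model["mean"] + sum(k * a for k, a in zip(k_star, model["alpha"]))
    v = solve_lower(model["low"], k_star)
    var = rbf(x_new, x_new) - sum(t * t for t in v)
    return mean, max(var, 0.0)

def features(mg_mM, atp_mM):
    """Map concentrations to log2 space so two-fold steps are evenly spaced."""
    return (math.log2(mg_mM), math.log2(atp_mM))

# Hypothetical activity data: ((MgCl2 mM, ATP mM), fraction of substrate ligated).
observations = [
    ((2.5, 0.5), 0.20), ((2.5, 1.0), 0.35), ((2.5, 2.0), 0.30),
    ((5.0, 0.5), 0.40), ((5.0, 1.0), 0.70), ((5.0, 2.0), 0.55),
    ((10.0, 0.5), 0.35), ((10.0, 1.0), 0.60), ((10.0, 2.0), 0.50),
]
model = gpr_fit([features(*c) for c, _ in observations],
                [act for _, act in observations])

# Predicted activity profile over a 9 x 9 grid spanning the tested ranges.
mg_grid = [2.5 * 2 ** (i / 4) for i in range(9)]
atp_grid = [0.5 * 2 ** (i / 4) for i in range(9)]
profile = [(mg, atp) + gpr_predict(model, features(mg, atp))
           for mg in mg_grid for atp in atp_grid]

best = max(profile, key=lambda row: row[2])
print("highest predicted activity at MgCl2 = %.2f mM, ATP = %.2f mM" % best[:2])
```

The `profile` list holds one (MgCl2, ATP, predicted mean, predicted variance) row per grid point, which is the kind of table a contour or surface plot could be drawn from. In practice the kernel hyperparameters would normally be fitted to the data rather than fixed, and additional conditions (substrate, buffer, or salt concentration) would simply add dimensions to the feature vector.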
[0281] In some embodiments, the activity data for the different or variable reaction conditions is obtained by varying two or more of: divalent metal concentration, double stranded ligase substrate(s) concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration. In some embodiments, the different or variable reaction conditions comprise varying three or more of: divalent metal concentration, ligase substrate(s) concentration, cofactor or co-substrate NTP (or NAD+) concentration, buffer concentration, and salt concentration, as described herein.

[0282] In some embodiments, the different or variable reaction conditions comprise at least variable divalent metal concentration, variable cofactor or co-substrate NTP concentration, and variable ligase substrate concentration.

[0283] In some embodiments, a computer system is used for implementing one or more of the methods described herein. In some embodiments, the computer system comprises one or more processors, a memory, a terminal interface, an input/output device interface, a storage interface, and in some embodiments, a network interface. In some embodiments, the computer system can be an electronic device of a user or a remotely located computer system. In some embodiments, the computer device is a mobile electronic device.

[0284] In some embodiments, the computer system contains one or more general purpose programmable processing units. In some embodiments, the computer system may have a single processor or may have multiple processors. Each processor may execute the instructions stored in one or more memory modules.

[0285] In some embodiments, the memory can include computer readable media, such as volatile memory, random access memory, and/or cache memory. In some embodiments, the storage system can be for reading from and writing to non-removable storage media, e.g., a hard drive or an optical disk.
In some embodiments, the memory can be a flash memory. In some embodiments, the computer system uses a communication interface, and peripheral devices are in communication with the processing units.

[0286] In some embodiments, the memory may store one or more programs, each having at least a program module stored in the memory. In some embodiments, the computer readable program instructions are for applying the Gaussian Process Regression to the activity data obtained for the different or variable reaction conditions inputted into the computer system.

[0287] In some embodiments, the programmed instruction or program code for carrying out operations of the present invention may be written in any programming language, such as Java, Python, Julia, or the “C” programming language; a scripting language such as Perl or VBS; or another language such as R or MATLAB.

[0288] In some embodiments, the ligase activity data is inputted and stored in a memory, where the data is processed by one or more of the computer programs, e.g., for the Gaussian Process Regression algorithms. In some embodiments, the computer system includes an output interface and a peripheral device for receiving and/or processing the output of the predicted reaction condition activity profile of a ligase for a ligase substrate. In some embodiments, the peripheral device is an electronic display or printer connected via a user interface.

[0289] While the invention has been described with reference to the specific embodiments, various changes can be made and equivalents can be substituted to adapt to a particular situation, material, composition of matter, process, process step or steps, thereby achieving benefits of the invention without departing from the scope of what is claimed.
[0290] For all purposes, each and every publication and patent document cited in this disclosure is incorporated herein by reference as if each such publication or document was specifically and individually indicated to be incorporated herein by reference. Citation of publications and patent documents is not intended as an indication that any such document is pertinent prior art, nor does it constitute an admission as to its contents or date.

Claims

What is claimed is:
1. A method of predicting a reaction condition activity profile of a double stranded RNA ligase for a double stranded RNA ligase substrate, comprising: obtaining activity data of a double stranded RNA ligase for a double stranded RNA ligase substrate under different reaction conditions; applying Gaussian Process Regression (GPR) on the activity data obtained for the different reaction conditions; and generating an output of the GPR-predicted reaction condition activity profile of the double stranded RNA ligase for the double stranded RNA ligase substrate.
2. The method of claim 1, wherein the different reaction conditions comprise varying three or more of: divalent metal concentration, double stranded RNA ligase substrate concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration.
3. The method of claim 2, wherein the different reaction conditions comprise at least variable divalent metal concentration, variable cofactor or co-substrate NTP concentration, and variable double stranded RNA ligase substrate concentration.
4. The method of claim 2 or 3, wherein the divalent metal is MgCl2, and wherein the divalent metal concentrations are from 0.5 mM to 50 mM.
5. The method of any one of claims 2-4, wherein the double stranded RNA ligase substrate concentrations are from 0.1 to 20 mM.
6. The method of any one of claims 2-5, wherein the cofactor or co-substrate NTP concentrations are from 0.1 mM to 20 mM.
7. The method of claim 6, wherein the NTP is ATP.
8. The method of any one of claims 2-7, wherein the buffer concentrations are from 1 to 200 mM.
9. The method of any one of claims 2-8, wherein the salt concentrations are from 0 to 500 mM.
10. The method of claim 9, wherein the salt is NaCl.
11. The method of any one of claims 1-10, wherein the output of the predicted reaction condition activity profile is a contour plot of the different reaction condition variables.
12. The method of claim 11, wherein the output contour plot is a three or four dimensional surface plot of predicted ligase activity for the different reaction condition variables.
13. The method of any one of claims 1-12, further comprising predicting a reaction condition activity profile of at least a second double stranded RNA ligase substrate, and comparing the multi-output of the GPR of the double stranded RNA ligase for the double stranded RNA ligase substrate with the multi-output of the GPR of the double stranded RNA ligase for the second double stranded RNA ligase substrate.
14. The method of claim 13, wherein ligation of the double stranded RNA ligase substrate and ligation of the second double stranded RNA ligase substrate produce the same ligated product, and wherein the double stranded RNA ligase substrate and the second double stranded RNA ligase substrate comprise at least a different ligation junction.
15. The method of claim 14, wherein the double stranded ligase substrate and the second double stranded ligase substrate comprise at least one different ligation junction.
16. The method of claim 14, wherein the double stranded ligase substrate and the second double stranded ligase substrate comprise at least two different ligation junctions.
17. The method of claim 14, wherein the double stranded ligase substrate and the second double stranded ligase substrate comprise 3 or more ligation junctions.
18. The method of any one of claims 1-17, wherein the double stranded RNA ligase substrate comprises a modified nucleoside and/or internucleoside linkage.
19. The method of claim 18, wherein the polynucleotide acceptor strand of the double stranded RNA ligase substrate comprises a modified nucleoside and/or modified internucleoside linkage.
20. The method of claim 18, wherein the polynucleotide acceptor strand comprises a 3’-terminal modified nucleoside.
21. The method of claim 20, wherein the polynucleotide acceptor strand comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 3’-terminal nucleoside to the 5’-terminal end of the polynucleotide acceptor strand.
22. The method of claim 21, wherein the polynucleotide acceptor strand has at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified nucleosides.
23. The method of any one of claims 18-22, wherein the polynucleotide acceptor strand comprises a 3’-terminal modified internucleoside linkage.
24. The method of any one of claims 18-23, wherein the polynucleotide acceptor strand comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 3’-terminal nucleoside of the polynucleotide acceptor strand.
25. The method of any one of claims 18-24, wherein the polynucleotide acceptor strand comprises a 5’-terminal modified internucleoside linkage.
26. The method of any one of claims 18-25, wherein the polynucleotide acceptor strand comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 5’-terminal nucleoside of the polynucleotide acceptor strand.
27. The method of claim 18, wherein the polynucleotide donor strand of the double stranded RNA ligase substrate comprises a modified nucleoside and/or modified internucleoside linkage.
28. The method of claim 27, wherein the polynucleotide donor strand comprises a 5’-terminal modified nucleoside.
29. The method of claim 28, wherein the polynucleotide donor strand comprises a modified nucleoside at one or more of nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 5’-terminal nucleoside to the 3’-terminal end of the polynucleotide donor strand.
30. The method of claim 29, wherein the polynucleotide donor strand has at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all modified nucleosides.
31. The method of any one of claims 27-30, wherein the polynucleotide donor strand comprises a 5’-terminal modified internucleoside linkage.
32. The method of any one of claims 27-31, wherein the polynucleotide donor strand comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 5’-terminal nucleoside of the polynucleotide acceptor strand.
33. The method of any one of claims 27-32, wherein the polynucleotide donor strand comprises one or more modified internucleoside linkages at the internucleoside linkage position 1, 2, 3, 4, or 5 from the 3’-terminal nucleoside of the polynucleotide donor strand.
34. The method of claim 18, wherein the 3’-terminal nucleoside of a polynucleotide acceptor strand and the 5’-terminal nucleoside of the polynucleotide donor strand forming a ligation junction of the double stranded RNA ligase substrate comprise modified nucleosides.
35. The method of any one of claims 18-34, wherein the polynucleotide strand complementary to the polynucleotide acceptor strand comprises a modified nucleoside and/or modified internucleoside linkage.
36. The method of claim 35, wherein the polynucleotide strand complementary to the polynucleotide acceptor strand comprises a modified nucleoside in the nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide acceptor strand.
37. The method of claim 36, wherein the polynucleotide strand complementary to the polynucleotide acceptor strand comprises one or more modified nucleosides at nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 3’-terminal nucleoside complementary to the 3’-terminal nucleoside of the polynucleotide acceptor strand.
38. The method of claim 36 or 37, wherein the polynucleotide strand complementary to the polynucleotide acceptor strand comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all of modified nucleosides.
39. The method of any one of claims 36-38, wherein the polynucleotide strand complementary to the polynucleotide acceptor strand comprises a modified internucleoside linkage.
40. The method of claim 39, wherein the polynucleotide strand complementary to the polynucleotide acceptor strand comprises at least 1, 2, 3, 4 or 5 modified internucleoside linkages at the 3’-terminal region of the polynucleotide strand complementary to the polynucleotide acceptor strand.
41. The method of claim 39, wherein the polynucleotide strand complementary to the polynucleotide acceptor strand comprises at least 1, 2, 3, 4 or 5 modified internucleoside linkages at the 5’-terminal region of the polynucleotide strand complementary to the polynucleotide acceptor strand.
42. The method of any one of claims 18-41, wherein the polynucleotide strand complementary to the polynucleotide donor strand comprises a modified nucleoside and/or modified internucleoside linkage.
43. The method of any one of claims 18-41, wherein the polynucleotide strand complementary to the polynucleotide donor strand comprises a modified nucleoside in the nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide donor strand.
44. The method of claim 43, wherein the polynucleotide strand complementary to the polynucleotide donor strand comprises one or more modified nucleosides at nucleoside residue positions 2, 3, 4, 5, 6, 7, 8, 9, or 10 from the 5’-terminal nucleoside complementary to the 5’-terminal nucleoside of the polynucleotide donor strand.
45. The method of claim 43 or 44, wherein the polynucleotide strand complementary to the polynucleotide donor strand comprises at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or all of modified nucleosides.
46. The method of any one of claims 42-45, wherein the polynucleotide strand complementary to the polynucleotide donor strand comprises a modified internucleoside linkage.
47. The method of claim 46, wherein the polynucleotide strand complementary to the polynucleotide donor strand comprises at least 1, 2, 3, 4 or 5 modified internucleoside linkages at the 3’-terminal region of the polynucleotide strand complementary to the polynucleotide donor strand.
48. The method of claim 46, wherein the polynucleotide strand complementary to the polynucleotide donor strand comprises at least 1, 2, 3, 4 or 5 modified internucleoside linkages at the 5’-terminal region of the polynucleotide strand complementary to the polynucleotide donor strand.
49. A method of screening polynucleotide ligases for activity on a ligase substrate, comprising: (a) contacting a plurality of polynucleotide ligases with a ligase substrate under a first reaction condition; (b) selecting polynucleotide ligases with activity on the ligase substrate under the first reaction condition; (c) predicting a reaction condition activity profile for each of the selected polynucleotide ligases on the ligase substrate; and (d) retesting activity of each selected polynucleotide ligase for the ligase substrate under best-fit reaction conditions for the ligase as determined from the predicted reaction condition activity profile, to identify the ligases having optimal activity among screened ligases for the ligase substrate.
50. The method of claim 49, wherein each of the ligases is provided at different concentrations.
51. The method of claim 50, wherein the concentrations of ligase are from 0.1 mg/mL to 20 mg/mL.
52. The method of any one of claims 49-51, wherein the predicted reaction condition activity profile for each ligase is determined by applying Gaussian Process Regression (GPR) on ligase activity data obtained under different reaction conditions.
53. The method of claim 52, wherein the different reaction conditions comprise varying three or more of: divalent metal concentration, double stranded RNA ligase substrate concentration, cofactor or co-substrate NTP concentration, buffer concentration, salt concentration, and ligase reaction temperature.
54. The method of claim 52, wherein the different reaction conditions comprise at least variable divalent metal concentration, variable cofactor or co-substrate NTP concentration, and variable ligase substrate concentration.
55.
The method of any one of claims 49-54, further comprising predicting reaction condition activity profile of at least a second ligase substrate, and comparing the multi-output of the GPR of the ligase for the ligase substrate with the multi-output of the GPR of the ligase for the second ligase substrate. 56. The method of claim 55, wherein ligation of the double stranded RNA ligase substrate and ligation of the second double stranded RNA ligase substrate produces the same ligated product, and wherein the double stranded RNA ligase substrate and the second double stranded RNA ligase substrate comprises at least a different ligation junction. 57. The method of claim 56, wherein the double stranded ligase substrate and the second double stranded ligase substrate comprises at least one different ligation junction. 58. The method of claim 57, wherein the double stranded ligase substrate and the second double stranded ligase substrate comprise at least two different ligation junctions. Docket No. CX10-278WO1 59. The method of claim 58, wherein the double stranded ligase and the second double stranded ligase substrate comprise 3 or more ligation junctions. 60. A computer implemented method for predicting a reaction condition activity profile of a ligase for a ligase substrate, comprising receiving reaction condition activity data of a ligase for a ligase substrate obtained under different reaction conditions; applying a Gaussian Process Regression to the ligase activity data for the ligase substrate under the different reaction conditions; and generating an output of the predicted reaction condition activity profile of the ligase for the ligase substrate. 61. 
The computer implemented method of claim 60, wherein the ligase activity data for the different reaction or variable conditions is obtained by varying two or more of: divalent metal concentration, double stranded ligase substrate(s) concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration. 62. The computer implemented method of claim 60, wherein activity data for the different reaction conditions comprise varying three or more of: divalent metal concentration, ligase substrate(s) concentration, cofactor or co-substrate NTP (or NAD+) concentration, buffer concentration, and salt concentration. 63. The computer implemented method of claim 60, wherein the activity data for the different reaction conditions comprise at least divalent metal concentration, cofactor or co-substrate NTP concentration, and ligase substrate concentration. 64. A system for predicting a reaction condition activity profile of a ligase for a ligase substrate, comprising one or more processors and a memory storing instructions configured to, when executed by the processor, cause the processor to input or receive activity data of a ligase for a ligase substrate for different reaction conditions; process the reaction condition activity data using a Gaussian Process Regression; and generate an output of a predicted reaction condition activity profile of the ligase for the ligase substrate for the different reaction conditions. 65. The system of claim 64, wherein activity data for the different reaction conditions comprise varying three or more of: divalent metal concentration, ligase substrate concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration. Docket No. CX10-278WO1 66. The system of claim 64, wherein the activity data for the different reaction conditions comprise at least divalent metal concentration, cofactor or co-substrate NTP concentration, and ligase substrate concentration. 67. 
A computer readable storage medium for predicting the reaction condition activity profile of a ligase for a ligase substrate, comprising one or more programmed instructions configured to direct one or more processors to: input or receive activity data of a ligase for a ligase substrate under different reaction conditions; applying Gaussian Process Regression (GPR) on the activity data obtained for the different reaction conditions; and generating an output of the GPR predicted reaction condition activity profile of the ligase for the ligase substrate for the different reaction conditions. 68. The computer readable storage medium of claim 67, wherein the activity data for the different reaction or variable conditions is obtained by varying two or more of: divalent metal concentration, double stranded ligase substrate(s) concentration, cofactor or co-substrate NTP concentration, buffer concentration, and salt concentration. 69. The computer readable storage medium of claim 67, wherein activity data for the different reaction conditions comprise varying three or more of: divalent metal concentration, ligase substrate(s) concentration, cofactor or co-substrate NTP (or NAD+) concentration, buffer concentration, and salt concentration, as described herein. 70. The computer readable storage medium of claim 67, wherein the activity data for the different reaction conditions comprise at least divalent metal concentration, cofactor or co-substrate NTP concentration, and ligase substrate concentration.
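For orientation only, the Gaussian Process Regression step recited in claims 52, 60, 64 and 67 can be illustrated with a minimal sketch. It is not the applicant's implementation. It assumes a zero-mean prior, a squared-exponential kernel with fixed hyperparameters, and three reaction variables (divalent metal, NTP cofactor, and substrate concentration, as in claim 54) scaled to the range 0 to 1. All function names are hypothetical, and the activity values come from a synthetic placeholder function (`toy_activity`), not from measured data.

```python
import itertools
import math

def rbf_kernel(a, b, length_scale=0.5):
    """Squared-exponential kernel between two condition vectors (assumed choice)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(m):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low

def forward_sub(low, b):
    """Solve L y = b for y."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y

def backward_sub_t(low, y):
    """Solve L^T x = y for x."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / low[i][i]
    return x

def gpr_fit(conditions, activities, noise_var=1e-6):
    """Fit a zero-mean GP: factor K + noise*I and precompute alpha = (K + noise*I)^-1 y."""
    n = len(conditions)
    k = [[rbf_kernel(conditions[i], conditions[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    low = cholesky(k)
    alpha = backward_sub_t(low, forward_sub(low, activities))
    return conditions, low, alpha

def gpr_predict(model, x):
    """Return the predictive mean and variance of activity at condition x."""
    conditions, low, alpha = model
    k_star = [rbf_kernel(x, c) for c in conditions]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = forward_sub(low, k_star)
    var = max(rbf_kernel(x, x) - sum(t * t for t in v), 0.0)
    return mean, var

def toy_activity(c):
    """Synthetic placeholder response peaking at the mid-point of every variable."""
    return math.exp(-sum((v - 0.5) ** 2 for v in c) / 0.1)

# Hypothetical design: 3 levels of each of 3 scaled reaction variables.
train_conditions = list(itertools.product((0.0, 0.5, 1.0), repeat=3))
train_activities = [toy_activity(c) for c in train_conditions]
model = gpr_fit(train_conditions, train_activities)

# Scan a finer grid and take the condition with the highest predicted activity
# as the "best-fit" condition to retest (step (d) of claim 49).
grid = [i / 4 for i in range(5)]
candidates = list(itertools.product(grid, repeat=3))
best_condition = max(candidates, key=lambda c: gpr_predict(model, c)[0])
best_mean, best_var = gpr_predict(model, best_condition)
print(best_condition, round(best_mean, 3), best_var)
```

In practice the kernel hyperparameters would normally be fitted to the data rather than fixed, and an established library would usually replace the hand-written linear algebra. The predictive variance returned by `gpr_predict` is what lets a screen weigh predicted activity against model uncertainty when choosing conditions to retest.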
PCT/US2025/029185 2024-05-13 2025-05-13 Ligation of polynucleotides by ligases and screening methods thereof Pending WO2025240509A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202463646753P 2024-05-13 2024-05-13
US63/646,753 2024-05-13
US202463718893P 2024-11-11 2024-11-11
US63/718,893 2024-11-11

Publications (2)

Publication Number Publication Date
WO2025240509A1 (en) 2025-11-20
WO2025240509A9 (en) 2025-12-26

Family

ID=97720623

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2025/029185 Pending WO2025240509A1 (en) 2024-05-13 2025-05-13 Ligation of polynucleotides by ligases and screening methods thereof

Country Status (1)

Country Link
WO (1) WO2025240509A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210074377A1 (en) * 2019-09-11 2021-03-11 Wisconsin Alumni Research Foundation Systems and methods for fully automated protein engineering
US20230048421A1 (en) * 2013-09-27 2023-02-16 Codexis, Inc. Automated screening of enzyme variants
US20230366010A1 (en) * 2017-10-06 2023-11-16 10X Genomics, Inc. Rna templated ligation

Also Published As

Publication number Publication date
WO2025240509A9 (en) 2025-12-26

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 25804271

Country of ref document: EP

Kind code of ref document: A1