
WO2026030345A1 - Signal and pro-region sequence variants for enhanced protease production in bacillus cells - Google Patents

Signal and pro-region sequence variants for enhanced protease production in bacillus cells

Info

Publication number
WO2026030345A1
Authority
WO
WIPO (PCT)
Prior art keywords
substitution
modified
protease
seq
signal peptide
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2025/039697
Other languages
French (fr)
Inventor
Cristina Bongiorni
Frits Goedegebuur
Yunlong Li
Mai DU
Rei OTSUKA
Harm Mulder
Andrew Abel PRIOR
Laurens LAMMERTS
Sina Pricelius
Viktor Yuryevich Alekseyev
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Genencor International BV
Danisco US Inc
Original Assignee
Genencor International BV
Danisco US Inc
Genencor International Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Genencor International BV, Danisco US Inc, Genencor International Inc

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62 DNA sequences coding for fusion proteins
    • C12N15/625 DNA sequences coding for fusion proteins containing a sequence coding for a signal sequence
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/74 Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora
    • C12N15/75 Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora for Bacillus
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/48 Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/52 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from bacteria or Archaea
    • C12N9/54 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from bacteria or Archaea bacteria being Bacillus
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P21/00 Preparation of peptides or proteins
    • C12P21/02 Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y304/00 Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/21 Serine endopeptidases (3.4.21)
    • C12Y304/21062 Subtilisin (3.4.21.62)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/01 Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/02 Fusion polypeptide containing a localisation/targetting motif containing a signal sequence
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12R INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
    • C12R2001/00 Microorganisms ; Processes using microorganisms
    • C12R2001/01 Bacteria or Actinomycetales ; using bacteria or Actinomycetales
    • C12R2001/07 Bacillus
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12R INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
    • C12R2001/00 Microorganisms ; Processes using microorganisms
    • C12R2001/01 Bacteria or Actinomycetales ; using bacteria or Actinomycetales
    • C12R2001/07 Bacillus
    • C12R2001/125 Bacillus subtilis ; Hay bacillus; Grass bacillus

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Biophysics (AREA)
  • Medicinal Chemistry (AREA)
  • General Chemical & Material Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The instant disclosure generally relates to nucleic acid (DNA) sequences encoding novel signal peptide sequences, nucleic acid (DNA) sequences encoding novel pro-region sequences, recombinant polynucleotides (e.g., vectors, plasmids, expression cassettes, etc.) encoding precursor proteases, recombinant (modified) Bacillus sp. cells expressing a precursor protease and secreting the mature protease into the fermentation broth when fermented under suitable conditions, and the like.

Description

IFF10076-WO-PCT[2]
SIGNAL AND PRO-REGION SEQUENCE VARIANTS FOR ENHANCED PROTEASE PRODUCTION IN BACILLUS CELLS
FIELD
[0001] The present disclosure is generally related to the fields of microbial host cells, molecular biology, fermentation, protein engineering, protein production and the like. Certain aspects of the disclosure are related to signal peptide sequences, pro-region sequences, recombinant polynucleotides, expression cassettes, and methods thereof for constructing genetically modified (recombinant) Bacillus sp. strains producing proteases of interest.
REFERENCE TO A SEQUENCE LISTING
[0002] The contents of the electronic submission of the Sequence Listing text file, named “IFF10076WOPCT2_SequenceListing.xml”, which was created on July 22, 2024, and is 13,824 bytes in size, are hereby incorporated by reference in their entirety.
BACKGROUND
[0003] Gram-positive microorganisms are often used for large-scale industrial fermentation due to their ability to secrete their fermentation products into their culture media. Secreted proteins are exported across a cell membrane and a cell wall, and then subsequently released into the external media. For example, large-scale industrial fermentation and secretion of heterologous polypeptides is a widely used technique in industry, wherein microbial cells are transformed with a nucleic acid encoding a heterologous polypeptide to be expressed. Despite various advances in protein production methods, there remains a need in the art to provide more efficient methods for protein expression with the aim to enhance the production of proteins of interest which find use in various industries.
SUMMARY
[0004] As generally described herein, the instant disclosure provides, inter alia, nucleic acid (DNA) sequences encoding novel signal peptide sequences, nucleic acid (DNA) sequences encoding novel pro-region sequences, recombinant polynucleotides (e.g., vectors, plasmids, expression cassettes, etc.) encoding precursor proteases, recombinant (modified) Bacillus sp. cells expressing a precursor protease and secreting the mature protease into the fermentation broth when fermented under suitable conditions, and the like.
[0005] Certain one or more embodiments of the disclosure are therefore directed to isolated polynucleotides encoding modified (variant) signal peptide sequences comprising an amino acid substitution, insertion, or deletion. In certain embodiments, modified (variant) signal peptide sequences comprise a substitution selected from any one of positions V1, S3, L10, A13, L14, T15, T19 and M20, wherein the positions of the modified (variant) signal peptide are numbered according to the reference (native) signal peptide of SEQ ID NO: 5. In other embodiments, modified (variant) signal peptide sequences comprise an amino acid insertion selected from a position immediately following one of positions K5, L6 and T19, wherein the positions of the modified (variant) signal peptide are numbered according to the reference (native) signal peptide of SEQ ID NO: 5. In certain other embodiments, modified (variant) signal peptide sequences comprise an amino acid deletion selected from any one of positions S3, L14, L16, F18 and T19, wherein the positions of the modified (variant) signal peptide are numbered according to the reference (native) signal peptide of SEQ ID NO: 5. In certain related embodiments, modified (variant) signal peptide sequences comprise at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 5. In certain other embodiments, the amino acid substitution at position V1 is a methionine (V1M), the amino acid substitution at position S3 is a phenylalanine (S3F), an isoleucine (S3I) or a threonine (S3T), the amino acid substitution at position L10 is a tryptophan (L10W), the amino acid substitution at position A13 is a threonine (A13T), the amino acid substitution at position L14 is a proline (L14P), the amino acid substitution at position T15 is a valine (T15V), the amino acid substitution at position T19 is a valine (T19V) or the amino acid substitution at position M20 is an aspartic acid (M20D). In other embodiments, the amino acid insertion immediately following one of positions K5, L6 or T19 is an alanine (A).
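By way of non-limiting illustration only, the substitution nomenclature used above (e.g., S3F denotes the serine at position 3 replaced with a phenylalanine) may be parsed and checked as in the following Python sketch. The names `REFERENCE_RESIDUES`, `parse_substitution` and `check_substitution` are hypothetical and are introduced solely for illustration; the residue map contains only the positions of SEQ ID NO: 5 recited in this paragraph and is not the complete signal peptide sequence.

```python
import re

# Partial map of reference residues, limited to the positions of SEQ ID NO: 5
# recited in the text (assumption: illustrative only, not the full sequence).
REFERENCE_RESIDUES = {1: "V", 3: "S", 5: "K", 6: "L", 10: "L", 13: "A",
                      14: "L", 15: "T", 16: "L", 18: "F", 19: "T", 20: "M"}

# One-letter reference residue, 1-based position, one-letter new residue.
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'S3F' into (reference residue, position, new residue)."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    ref, pos, new = match.groups()
    return ref, int(pos), new


def check_substitution(label, reference=REFERENCE_RESIDUES):
    """Return True when the label's reference residue agrees with the map."""
    ref, pos, _new = parse_substitution(label)
    return reference.get(pos) == ref
```

A check of this kind may help catch transcription errors in variant labels, such as a reference residue that does not match the numbered position.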
[0006] In certain other embodiments, modified (variant) signal peptide sequences comprise a substitution selected from any one of positions G3, L10, A13, L14, A15, T19 and M20, wherein the positions of the modified (variant) signal peptide are numbered according to the reference (native) signal peptide of SEQ ID NO: 10. In other embodiments, modified (variant) signal peptide sequences comprise an amino acid insertion selected from a position immediately following one of positions K5, V6 and T19, wherein the positions of the modified (variant) signal peptide are numbered according to the reference (native) signal peptide of SEQ ID NO: 10. In yet other embodiments, modified (variant) signal peptide sequences comprise a deletion selected from any one of positions G3, L14, L16, F18 and T19, wherein the amino acid positions of the modified signal peptide are numbered according to the reference signal peptide of SEQ ID NO: 10.
[0007] In certain other embodiments, the disclosure provides isolated polynucleotides encoding modified (variant) pro-region sequences comprising an amino acid substitution, insertion, or deletion. In certain embodiments, modified (variant) pro-region sequences comprise a substitution selected from any one of positions K9, T17, S19, T20, M21, A23, K36, Q38, T50, K57, E58, K60, D62, S64 and E70, wherein the positions of the modified (variant) pro-region are numbered according to the reference (native) pro-region of SEQ ID NO: 7. In certain other embodiments, modified (variant) pro-region sequences comprise an amino acid insertion selected from a position immediately following one of positions E7, A24 and A76, wherein the positions of the modified (variant) pro-region are numbered according to the reference (native) pro-region of SEQ ID NO: 7. In certain other embodiments, modified (variant) pro-region sequences comprise an amino acid deletion at position T20, wherein the positions of the modified (variant) pro-region are numbered according to the reference (native) pro-region of SEQ ID NO: 7. In related embodiments, modified (variant) pro-region sequences comprise at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 7.
In certain other embodiments, the amino acid substitution at position K9 is a methionine (K9M), the amino acid substitution at position T17 is a glycine (T17G), the amino acid substitution at position S19 is a lysine (S19K) or a threonine (S19T), the amino acid substitution at position T20 is a histidine (T20H) or a methionine (T20M), the amino acid substitution at position M21 is a leucine (M21L), the amino acid substitution at position A23 is an asparagine (A23N), the amino acid substitution at position K36 is a glutamic acid (K36E), the amino acid substitution at position Q38 is an isoleucine (Q38I), the amino acid substitution at position T50 is a glutamic acid (T50E), the amino acid substitution at position K57 is an alanine (K57A), the amino acid substitution at position E58 is an aspartic acid (E58D) or a glutamine (E58Q), the amino acid substitution at position K60 is a glutamic acid (K60E), the amino acid substitution at position D62 is a histidine (D62H), the amino acid substitution at position S64 is a tyrosine (S64Y) or the amino acid substitution at position E70 is an arginine (E70R). In another embodiment, the amino acid insertion immediately following one of positions E7, A24 or A76 is an alanine (A).
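For illustration only, an insertion "immediately following" a numbered position and a deletion at a numbered position may be applied to a sequence string as in the following Python sketch. The function names `insert_after` and `delete_at` are hypothetical, positions are 1-based as in the numbering used herein, and the short sequence in the usage note is a made-up placeholder rather than SEQ ID NO: 7.

```python
def insert_after(sequence, position, residue="A"):
    """Insert `residue` immediately following the 1-based `position`."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    # Python slices are 0-based, so sequence[:position] ends at `position`.
    return sequence[:position] + residue + sequence[position:]


def delete_at(sequence, position):
    """Remove the residue at the 1-based `position`."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    return sequence[:position - 1] + sequence[position:]
```

With the placeholder sequence "MKTE", `insert_after("MKTE", 2)` yields "MKATE" and `delete_at("MKTE", 3)` yields "MKE". Note that either change shifts the numbering of downstream residues in the variant, which is why positions are stated relative to the reference (native) sequence.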
[0008] In certain other one or more embodiments, the disclosure provides recombinant polynucleotides encoding modified precursor proteases, wherein the polynucleotides encoding the modified precursor proteases comprise a first (1st) polynucleotide encoding a modified (variant) signal peptide sequence operably linked to a second (2nd) polynucleotide encoding a reference (native) pro-region operably linked to a third (3rd) polynucleotide encoding a mature protease, wherein the modified (variant) signal peptide comprises an amino acid substitution, insertion, or deletion. In certain embodiments, the 3rd polynucleotide encodes a mature subtilisin protease derived or obtained from a Bacillus sp. cell.
[0009] In other embodiments, the disclosure provides recombinant polynucleotides encoding modified precursor proteases, wherein the polynucleotides encoding the modified precursor proteases comprise a first (1st) polynucleotide encoding a reference (native) signal peptide sequence operably linked to a second (2nd) polynucleotide encoding a modified (variant) pro-region operably linked to a third (3rd) polynucleotide encoding a mature protease, wherein the modified (variant) pro-region comprises an amino acid substitution, insertion, or deletion. In certain embodiments, the 3rd polynucleotide encodes a mature subtilisin protease derived or obtained from a Bacillus sp. cell.
[0010] Thus, certain other one or more embodiments provide recombinant (modified) Bacillus sp. cells comprising introduced polynucleotides encoding modified precursor proteases of the disclosure, recombinant (modified) Bacillus sp. cells comprising introduced expression cassettes of the disclosure and the like. In related embodiments, the Bacillus sp. cell is selected from the group consisting of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. halodurans, B. megaterium, B. coagulans, B. circulans, B. lautus, and B. thuringiensis.
[0011] Certain other embodiments of the disclosure are related to, inter alia, compositions and methods for producing heterologous proteases in Bacillus sp. cells. Thus, in certain embodiments, the disclosure provides methods for producing heterologous proteases in Bacillus sp. cells comprising introducing into a Bacillus sp. cell an expression cassette encoding a modified precursor protease, wherein the cassette comprises an upstream promoter region operably linked to a downstream DNA sequence encoding a modified (variant) signal peptide sequence operably linked to a downstream DNA sequence encoding a reference (native) pro-region sequence operably linked to a downstream DNA sequence encoding a mature protease, and fermenting the modified cell under suitable conditions for the production of the protease.
[0012] In certain other embodiments of the methods, the modified (recombinant) cell produces an increased amount of the protease relative to a control (isogenic) Bacillus sp. cell fermented under the same conditions. For instance, the control Bacillus sp. cell may be constructed by introducing into the cell a control precursor protease expression cassette, wherein the control cassette comprises an upstream promoter sequence operably linked to a downstream DNA sequence encoding the reference (native) signal peptide of SEQ ID NO: 5 or SEQ ID NO: 10 operably linked to a downstream DNA sequence encoding the reference (native) pro-region of SEQ ID NO: 7 operably linked to a downstream DNA sequence encoding the mature protease, wherein the promoter and mature protease sequences of the control cassette are the same as the promoter and mature protease sequences used in the modified precursor protease cassette. In certain embodiments of the methods, the increased amount of the protease produced by the modified cell is at least about a 1%, 2%, 3%, 4% to about 5% increase relative to the control cell. In other embodiments of the methods, the increased amount of the protease produced by the modified cell is at least about a 10% increase relative to the control cell. In other embodiments, the protease is secreted into the fermentation broth. In yet other embodiments, the secreted protease is recovered from the fermentation broth.
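As a non-limiting illustration of the comparison described above, the percent increase of a modified cell relative to its control cell may be computed as in the following Python sketch. The function names `percent_increase` and `meets_threshold` are hypothetical, and the sketch assumes that both measurements are expressed in the same units and were obtained under the same fermentation conditions; it does not represent any particular assay of the disclosure.

```python
def percent_increase(modified, control):
    """Percent increase of `modified` over `control` (same units assumed)."""
    if control <= 0:
        raise ValueError("control measurement must be positive")
    return (modified - control) * 100.0 / control


def meets_threshold(modified, control, threshold_percent):
    """True when the increase is at least `threshold_percent`."""
    return percent_increase(modified, control) >= threshold_percent
```

For example, a modified cell measured at 110 units against a control at 100 units corresponds to a 10% increase, so `meets_threshold(110.0, 100.0, 5.0)` is true.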
[0013] Certain other embodiments are related to methods for producing heterologous proteases in Bacillus sp. cells comprising introducing into a Bacillus sp. cell an expression cassette encoding a modified precursor protease, wherein the cassette comprises an upstream promoter region operably linked to a downstream DNA sequence encoding a reference (native) signal peptide of SEQ ID NO: 5 or SEQ ID NO: 10 operably linked to a downstream DNA sequence encoding a modified (variant) pro-region operably linked to a downstream DNA sequence encoding a mature protease, and fermenting the modified cell under conditions suitable for the production of the protease. In certain other embodiments of the methods, the mature protease comprises an amino acid sequence having at least about 75% to 100% identity to SEQ ID NO: 3. In other embodiments of the methods, the modified (variant) pro-region sequence comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 7. In other embodiments of the methods, the reference (native) signal peptide sequence comprises at least 95% to about 100% identity to SEQ ID NO: 7. In other embodiments of the methods, the modified (recombinant) cell produces an increased amount of the protease relative to a control (isogenic) Bacillus sp. cell fermented under the same conditions. In certain embodiments, the increased amount of the protease produced by the modified cell is at least a 5% increase relative to the control cell. In other embodiments of the methods, the increased amount of the protease produced by the modified cell is at least about a 1%, 2%, 3%, 4% to about 5% increase relative to the control cell. In other embodiments of the methods, the protease is secreted into the fermentation broth. In yet other embodiments, the secreted protease is recovered from the fermentation broth. In other embodiments of the one or more methods of the disclosure, the Bacillus sp. cell is selected from the group consisting of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. halodurans, B. megaterium, B. coagulans, B. circulans, B. lautus, and B. thuringiensis.
BRIEF DESCRIPTION OF DRAWINGS
[0014] Figure 1 presents a schematic of an exemplary expression cassette and the encoded precursor protease. In particular, FIG. 1A shows an expression cassette comprising, in the 5' to 3' direction, a promoter region (promoter) DNA sequence operably linked to a DNA sequence (AprE_ss) encoding a reference signal peptide sequence operably linked to a DNA sequence (BPN'_PRO) encoding a reference pro-region sequence operably linked to a DNA sequence (mature protease) encoding a mature protease operably linked to a transcriptional terminator (DNA) sequence. Additionally, FIG. 1B presents a schematic of the precursor protease encoded by the exemplary cassette, wherein a site evaluation library (SEL) was performed on the reference signal sequence (AprE_ss; SEQ ID NO: 5) and the reference pro-region sequence (BPN'_PRO; SEQ ID NO: 7) as indicated with arrows in FIG. 1B.
[0015] Figure 2 shows the amino acid sequence of the native (reference) B. subtilis AprE signal (pre) peptide sequence (AprE_ss; SEQ ID NO: 5) used in the site evaluation libraries (SELs) described herein. More particularly, FIG. 2A presents the amino acid sequence of the reference AprE_ss (SEQ ID NO: 5) using single letter amino acid abbreviations, and FIG. 2B shows the same reference AprE_ss with amino acid residue positions 1-29 of SEQ ID NO: 5 annotated with numerical subscripts.
[0016] Figure 3 shows the amino acid sequence of the native (reference) B. amyloliquefaciens subtilisin (BPN') pro-region sequence (BPN'_PRO; SEQ ID NO: 7) used in the SELs described herein. More particularly, FIG. 3A presents the amino acid sequence of the reference BPN'_PRO sequence (SEQ ID NO: 7) using single letter amino acid abbreviations, and FIG. 3B shows the same reference BPN'_PRO sequence with amino acid positions 1-77 of SEQ ID NO: 7 annotated with numerical subscripts.
[0017] Figure 4 presents the amino acid sequence of the native (reference) B. amyloliquefaciens subtilisin (BPN') signal peptide sequence (BPN'_ss; SEQ ID NO: 10) used in the site evaluation libraries (SELs) described herein. More particularly, FIG. 4A presents the amino acid sequence of the reference BPN'_ss (SEQ ID NO: 10) using single letter amino acid abbreviations, and FIG. 4B shows the same reference BPN'_ss with amino acid residue positions 1-30 of SEQ ID NO: 10 annotated with numerical subscripts.
[0018] Figure 5 shows the amino acid sequence alignment of the reference BPN' signal peptide sequence (FIG. 5A; BPN'_ss; SEQ ID NO: 10) and the reference AprE signal peptide sequence (FIG. 5B; AprE_ss; SEQ ID NO: 5). For instance, FIG. 5C presents a BLAST pairwise alignment of SEQ ID NO: 10 vs. SEQ ID NO: 5, wherein the amino acid residues presented in bold text are identical. The BLAST alignment statistics presented in FIG. 5D show that the two sequences have about 79% sequence identity with one gap.
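For illustration only, a percent identity of the kind reported in pairwise alignment statistics may be computed from two already-aligned sequences as in the following Python sketch, which counts identical columns and divides by the alignment length (gap columns included). The function name `percent_identity` is hypothetical, the gap character "-" is an assumption, and the sequences in the usage note are made-up placeholders rather than SEQ ID NO: 5 or SEQ ID NO: 10; the sketch does not perform the alignment itself.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both rows.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    The denominator is the full alignment length, gap columns included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return identical * 100.0 / len(aligned_a)
```

With the placeholder rows "MKT-LL" and "MKTALV", four of the six columns are identical, giving about 66.7% identity with one gap.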
BRIEF DESCRIPTION OF THE BIOLOGICAL SEQUENCES
[0019] SEQ ID NO: 1 is a synthetic nucleic acid (DNA) sequence comprising an upstream B. subtilis rrnI-P2 promoter operably linked to a downstream aprE 5'-untranslated region (UTR) sequence.
[0020] SEQ ID NO: 2 is a synthetic DNA sequence encoding a mature subtilisin variant (protease) of SEQ ID NO: 3.
[0021] SEQ ID NO: 3 is the amino acid sequence of the mature subtilisin variant (protease) encoded by SEQ ID NO: 2.
[0022] SEQ ID NO: 4 is a DNA sequence (AprE_ss) encoding a native B. subtilis AprE signal peptide sequence (AprE_ss) set forth in SEQ ID NO: 5.
[0023] SEQ ID NO: 5 is the amino acid sequence of the native AprE signal peptide sequence (AprE_ss).
[0024] SEQ ID NO: 6 is a DNA sequence (BPN'_PRO) encoding a native B. amyloliquefaciens BPN' pro-region sequence (BPN'_PRO) set forth in SEQ ID NO: 7.
[0025] SEQ ID NO: 7 is the amino acid sequence of the native BPN' pro-region sequence (BPN'_PRO).
[0026] SEQ ID NO: 8 is the DNA sequence of a wild-type B. amyloliquefaciens BPN' terminator.
[0027] SEQ ID NO: 9 is the DNA sequence of a kanamycin (kan) gene expression cassette.
[0028] SEQ ID NO: 10 is the amino acid sequence of the native BPN' signal peptide sequence (BPN'_ss).
[0029] SEQ ID NO: 11 is a DNA sequence (BPN'_ss) encoding the native B. amyloliquefaciens BPN' signal peptide sequence (BPN'_ss) set forth in SEQ ID NO: 10.
DETAILED DESCRIPTION
[0030] As briefly set forth above and described hereinafter in detail, the present disclosure provides, inter alia, nucleic acid (DNA) sequences encoding novel signal peptide sequences, nucleic acid (DNA) sequences encoding novel pro-region sequences, recombinant polynucleotides (e.g., vectors, plasmids, expression cassettes, etc.) encoding precursor proteases, recombinant Bacillus sp. strains expressing a precursor protease and secreting the mature protease into the fermentation broth when fermented under suitable conditions for the production of the protease, and the like. Certain embodiments of the disclosure are therefore related to recombinant polynucleotides comprising a first (1st) polynucleotide encoding a signal peptide sequence operably linked to a second (2nd) polynucleotide encoding a pro-region sequence operably linked to a third (3rd) polynucleotide encoding a mature protease. Certain other embodiments are therefore related to expression cassettes comprising (in the 5' to 3' direction) an upstream promoter region (DNA) sequence operably linked to a DNA sequence encoding a signal peptide sequence operably linked to a DNA sequence encoding a pro-region sequence operably linked to a DNA sequence encoding a mature protease, which may optionally comprise a transcriptional terminator sequence operably linked to the DNA encoding the mature protease. Certain other embodiments provide recombinant Bacillus sp. strains comprising an introduced expression cassette of the disclosure, methods and compositions for fermenting recombinant Bacillus sp. strains for the enhanced production and secretion of proteases and the like.
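By way of non-limiting illustration, the 5' to 3' ordering of the cassette elements described above may be represented as in the following Python sketch. The class name `ExpressionCassette` and its field and method names are hypothetical, the terminator is treated as optional consistent with the paragraph above, and the short DNA strings in the usage note are made-up placeholders rather than any sequence of the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExpressionCassette:
    """Cassette elements as DNA strings, listed in 5' to 3' order."""
    promoter: str
    signal_peptide: str
    pro_region: str
    mature_protease: str
    terminator: Optional[str] = None  # optional element

    def coding_sequence(self):
        """DNA encoding the precursor protease: signal + pro-region + mature."""
        return self.signal_peptide + self.pro_region + self.mature_protease

    def assemble(self):
        """Concatenate every element in the 5' to 3' direction."""
        return self.promoter + self.coding_sequence() + (self.terminator or "")
```

For example, `ExpressionCassette("TTGACA", "ATGAAA", "GCT", "GGTTAA").assemble()` returns "TTGACAATGAAAGCTGGTTAA". Keeping the elements as separate fields makes it straightforward to exchange a reference signal peptide or pro-region element for a variant while holding the promoter and mature protease elements constant.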
I. DEFINITIONS
[0031] In view of the recombinant (modified) strains of the disclosure and methods thereof described herein, the following terms and phrases are defined. Terms not defined herein should be accorded their ordinary meaning as used in the art.
[0032] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the present compositions and methods apply. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present compositions and methods, representative illustrative methods and materials are now described. All publications and patents cited herein are incorporated by reference in their entirety.
[0033] It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only”, “excluding”, “not including” and the like, in connection with the recitation of claim elements, or use of a “negative” limitation or proviso thereof.
[0034] As will be apparent to those of skill in the art upon reading this disclosure, each of the individual embodiments described and illustrated herein has discrete components and features which may be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present compositions and methods described herein. Any recited method can be carried out in the order of events recited or in any other order which is logically possible.
[0035] Certain ranges are presented herein with numerical values being preceded by the term “about”. The term “about” is used herein to provide literal support for the exact number that it precedes, as well as a number that is near to or approximately the number that the term precedes. In determining whether a number is near to or approximately a specifically recited number, the near or approximating unrecited number can be a number which, in the context in which it is presented, provides the substantial equivalent of the specifically recited number. For example, in connection with a numerical value, the term “about” refers to a range of -10% to +10% of the numerical value, unless the term is otherwise specifically defined in context.
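For illustration only, the default -10% to +10% range associated with the term “about” may be expressed as in the following Python sketch. The function names `about_range` and `is_about` are hypothetical, and the sketch covers only the default numerical range, not any context in which the term is otherwise specifically defined.

```python
def about_range(value, tolerance=0.10):
    """Return (low, high) spanning -10% to +10% of `value` by default."""
    margin = abs(value) * tolerance
    return value - margin, value + margin


def is_about(candidate, value, tolerance=0.10):
    """True when `candidate` lies within the range returned by about_range."""
    low, high = about_range(value, tolerance)
    return low <= candidate <= high
```

For example, `is_about(95.0, 100.0)` is true, whereas `is_about(111.0, 100.0)` is false.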
[0036] As used herein, the phrases “Gram-positive bacteria”, “Gram-positive cells”, “Gram-positive bacterial strains”, and/or “Gram-positive bacterial cells” have the same meaning as used in the art. For example, Gram-positive bacterial cells include all strains of Actinobacteria and Firmicutes. In certain embodiments, such Gram-positive bacteria are of the classes Bacilli, Clostridia and Mollicutes.
[0037] As used herein, the genus “Bacillus” includes all species within the genus “Bacillus” as known to those of skill in the art, including but not limited to B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. halodurans, B. megaterium, B. coagulans, B. circulans, B. lautus, and B. thuringiensis. It is recognized that the genus Bacillus continues to undergo taxonomical reorganization. Thus, it is intended that the genus include species that have been reclassified, including but not limited to such organisms as B. stearothermophilus, which is now named “Geobacillus stearothermophilus”.
[0038] As used herein, the terms “recombinant” or “non-natural” refer to an organism, microorganism, cell, nucleic acid molecule, or vector that has at least one engineered genetic alteration, or has been modified by the introduction of a heterologous nucleic acid molecule, or refer to a cell (e.g., a microbial cell) that has been altered such that the expression of a heterologous or endogenous nucleic acid molecule or gene can be controlled. Recombinant also refers to a cell that is derived from a non-natural cell or is progeny of a non-natural cell having one or more such modifications. Genetic alterations include, for example, modifications introducing expressible nucleic acid molecules encoding proteins, or other nucleic acid molecule additions, deletions, substitutions, or other functional alteration of a cell’s genetic material. For example, recombinant cells may express genes or other nucleic acid molecules that are not found in identical or homologous form within a native (wild-type) cell (e.g., a fusion or chimeric protein), or may provide an altered expression pattern of endogenous genes, such as being over-expressed, under-expressed, minimally expressed, or not expressed at all. “Recombination”, “recombining” or generating a “recombined” nucleic acid is generally the assembly of two or more nucleic acid fragments wherein the assembly gives rise to a chimeric gene.
[0039] The term “derived” encompasses the terms “originated,” “obtained,” “obtainable,” and “created,” and generally indicates that one specified material or composition finds its origin in another specified material or composition, or has features that can be described with reference to another specified material or composition. For example, recombinant Gram-positive bacterial cells of the disclosure may be derived/obtained from any known Gram-positive bacterial strains. IFF10076-W0-PCT[2]
[0040] As used herein, “nucleic acid” refers to a nucleotide or polynucleotide sequence, and fragments or portions thereof, as well as to DNA, cDNA, and RNA of genomic or synthetic origin, which may be double-stranded or single-stranded, whether representing the sense or antisense strand. It will be understood that as a result of the degeneracy of the genetic code, a multitude of nucleotide sequences may encode a given protein.
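By way of non-limiting illustration of the degeneracy noted above, the following minimal sketch counts how many distinct nucleotide sequences could encode a given peptide under the standard genetic code. The function name `count_encodings` and the example peptides are hypothetical and are not taken from the disclosure; stop codons are not counted.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter codes; the 61 sense codons are distributed over 20 residues).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Return the number of distinct coding sequences for `peptide`.

    Each residue contributes independently, so the total is the product
    of the per-residue codon counts.
    """
    return prod(CODON_COUNTS[residue] for residue in peptide.upper())


if __name__ == "__main__":
    # Hypothetical tripeptide Leu-Ser-Arg: 6 * 6 * 6 possible encodings.
    print(count_encodings("LSR"))
```

Because the count is a product over residues, it grows exponentially with peptide length, which is why a full-length protein can be encoded by a very large number of nucleotide sequences.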
[0041] It is understood that the polynucleotides (or nucleic acid molecules) described herein include “genes”, “vectors” and “plasmids”.
[0042] Accordingly, the term “gene” refers to a polynucleotide that codes for a particular sequence of amino acids, which comprises all, or part of, a protein coding sequence, and may include regulatory (non-transcribed) DNA sequences, such as promoter sequences, which determine, for example, the conditions under which the gene is expressed. The transcribed region of the gene may include untranslated regions (UTRs), including 5'-untranslated regions (5'-UTRs) and 3'-UTRs, as well as the coding sequence.
[0043] As used herein, an “endogenous gene” refers to a gene in its natural location in the genome of an organism.
[0044] As used herein, a “heterologous” gene, a “non-endogenous” gene, or a “foreign” gene refer to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. The term “foreign” gene(s) comprises native genes inserted into a non-native organism and/or chimeric genes inserted into a native or non-native organism.
[0045] As used herein, a “heterologous control sequence” refers to a gene expression control sequence (e.g., promoters, enhancers, terminators, etc.) which does not function in nature to regulate (control) the expression of the gene of interest. Generally, heterologous nucleic acids are not endogenous (native) to the cell, or a part of the genome in which they are present, and have been added to the cell by infection, transfection, transduction, transformation, microinjection, electroporation, and the like. A “heterologous” nucleic acid construct may contain a control sequence/DNA coding (ORF) sequence combination that is the same as, or different from, a control sequence/DNA coding sequence combination found in the native host cell.
[0046] As used herein, the term “expression” refers to the transcription and stable accumulation of sense (mRNA) or anti-sense RNA, derived from a nucleic acid molecule of the disclosure. Expression may also refer to translation of mRNA into a polypeptide. Thus, the term “expression” includes any steps involved in the production of the polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, secretion and the like.
[0047] As used herein, the term “coding sequence” (CDS) refers to a nucleotide sequence which directly specifies the amino acid sequence of its (encoded) protein product. The boundaries of the coding sequence are generally determined by an open reading frame (hereinafter, “ORF”), which usually begins with an ATG start codon. The coding sequence typically includes DNA, cDNA, and recombinant nucleotide sequences.
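As a non-limiting illustration of how an ORF bounds a coding sequence, the following minimal sketch returns the first stretch of a DNA string that begins with an ATG start codon and ends at the first in-frame stop codon. The function name `first_orf` and the example input are hypothetical; the sketch assumes the forward strand only and ignores alternative start codons.

```python
# Stop codons of the standard genetic code, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(dna: str):
    """Return the first ATG-initiated ORF (start through stop), or None.

    Candidate ATG codons are tried left to right; for each one the
    sequence is read in steps of three until an in-frame stop is found.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        for i in range(start, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        # No in-frame stop for this ATG; try the next candidate start.
        start = dna.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    # Hypothetical input: two flanking bases, then ATG AAA TAA.
    print(first_orf("ccATGAAATAAgg"))
```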
[0048] As used herein, the terms “promoter”, “promoter element”, “promoter sequence” and the like, refer to a nucleic acid (DNA) sequence capable of controlling the transcription of a gene coding sequence (CDS) into messenger RNA (mRNA) when the promoter region sequence is placed upstream (5') and operably linked to the downstream (3') gene CDS. As generally understood by those skilled in the art, promoters typically provide a site for specific binding by RNA polymerase and the initiation of transcription. In certain aspects, the term “promoter” refers to the minimal portion of the promoter nucleic acid sequence required to initiate transcription (i.e., comprising RNA polymerase binding sites). For example, a promoter generally comprises a “-10” (consensus sequence) element and a “-35” (consensus sequence) element, which are located upstream (5') of, and numbered relative to, the +1 transcription start site (TSS) of the gene CDS to be transcribed. The core promoter -10 and -35 elements are generally referred to in the art as the “TATAAT” (Pribnow box) consensus region and the “TTGACA” consensus region, respectively. The core promoter (-10 and -35) regions are generally separated (spaced) by about fifteen to twenty (15-20) intervening base pairs (nucleotides).
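As a non-limiting illustration of the core promoter architecture described above, the following minimal sketch scans a DNA string for an exact “TTGACA” (-35) element followed by 15 to 20 intervening nucleotides and an exact “TATAAT” (-10) element. The function name `find_core_promoters` and the test sequence are hypothetical; naturally occurring promoters frequently deviate from the consensus, so exact matching is a simplifying assumption.

```python
import re

# Lookahead allows overlapping candidates; the lazy quantifier reports the
# shortest spacer of 15-20 nucleotides that satisfies the pattern.
CORE_PROMOTER = re.compile(r"(?=TTGACA([ACGT]{15,20}?)TATAAT)")


def find_core_promoters(dna: str):
    """Return (index of the -35 element, spacer length) for each match.

    Indices are 0-based positions of the first base of TTGACA.
    """
    dna = dna.upper()
    return [(m.start(), len(m.group(1))) for m in CORE_PROMOTER.finditer(dna)]


if __name__ == "__main__":
    # Hypothetical sequence with a 17-nucleotide spacer.
    spacer = "GC" * 8 + "G"
    print(find_core_promoters("GG" + "TTGACA" + spacer + "TATAAT" + "CC"))
```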
[0049] Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic nucleic acid segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different cell types, or at different stages of development, or in response to different environmental or physiological conditions. Promoters can be constitutive promoters, inducible promoters, tunable promoters, hybrid promoters, synthetic promoters, tandem promoters, etc. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters”. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.
[0050] As used herein, a “functional promoter sequence controlling the expression of a gene of interest linked to the gene of interest’s protein coding sequence” refers to a promoter sequence which controls the transcription and translation of the coding sequence in a desired Gram-positive host cell. For example, in certain embodiments, the present disclosure provides polynucleotides comprising an upstream (5') promoter (or 5' promoter region, or tandem 5' promoters and the like) functional in a Gram-positive cell, wherein the functional promoter region is operably linked to a nucleic acid sequence encoding a protein of interest.
[0051] As used herein, the term “precursor protein” refers to an inactive form of a protein. In certain aspects, a full-length protein is synthesized as a precursor, in the form of a pro-sequence and mature protein (abbreviated, “pro-protein”). In other aspects, a full-length protein is synthesized as a precursor, in the form of a signal (pre) peptide sequence, a pro-sequence and mature protein (abbreviated, “pre-pro-protein”). For example, pre-sequences usually act as signal peptides for transport, and pro-sequences are typically essential for the correct folding of the associated (mature) protein.
[0052] As used herein, the term “precursor protease” refers to a full-length protease synthesized as a precursor, in the form of a signal (pre) peptide sequence, a pro-region sequence and mature protease sequence (abbreviated, “pre-pro-protease”).
[0053] As used herein, the term “mature protease” refers to an active form of a protease, in contrast to the inactive precursor (full-length) protease.
[0054] As used herein, the phrase “polynucleotide encoding a precursor protease” (e.g., a “modified precursor protease”, a “control precursor protease”) refers to a polynucleotide sequence encoding a precursor protease, wherein the polynucleotide comprises in the 5' to 3' direction a DNA sequence encoding a signal peptide (pre) sequence operably linked to a DNA sequence encoding a pro-region sequence operably linked to a DNA sequence encoding a mature protease.
[0055] As used herein, the terms “signal sequence”, “secretion signal” and “signal peptide” may be used interchangeably and refer to a sequence of amino acid residues that may participate in the secretion or direct transport of a precursor protein. The signal (pre) sequence is typically cleaved from the precursor protein by a signal peptidase during translocation. The signal (pre) sequence is typically located N-terminal to the mature protein sequence, or located N-terminal to the pro-region (PRO) sequence when a signal (pre) sequence and a pro-region (PRO) sequence are used in operable combination and upstream (5') of the mature POI sequence.
[0056] As used herein, the terms “pro sequence”, “pro-sequence”, “pro-region sequence” and the like may be used interchangeably and abbreviated as “PRO” sequence, “Pro” sequence, “pro” sequence and the like. The term pro-sequence as used herein has the same meaning as understood in the art.
[0057] For example, subtilisin proteases are first produced as a pre-pro-subtilisin (precursor protein), which consists of a signal peptide (pre) sequence followed by a pro-region (pro) sequence followed by the mature subtilisin sequence (i.e., pre-pro-subtilisin). Pro-region sequences are often essential for the correct folding of the associated (mature) protein, acting as an intra-molecular chaperone (e.g., catalyzing the protein-folding reaction directly). Likewise, pro-region sequences may be required for both folding and intracellular transport (or secretion) of the mature protein.
[0058] In certain embodiments, a modified (variant) signal peptide sequence of the disclosure comprises an amino acid sequence derived from a reference B. subtilis AprE signal sequence (SEQ ID NO: 5). In certain embodiments, amino acid modifications of the signal sequence described herein are numbered by reference to amino acid positions 1-29 of SEQ ID NO: 5 as shown in FIG. 2.
[0059] In certain other embodiments, a modified (variant) pro-region sequence of the disclosure comprises an amino acid sequence derived from a reference B. amyloliquefaciens BPN' pro-region sequence (SEQ ID NO: 7). In certain embodiments, amino acid modifications of the pro-region sequence described herein are numbered by reference to amino acid positions 1-77 of SEQ ID NO: 7 as shown in FIG. 3.
[0060] In certain embodiments, a modified (variant) signal peptide sequence of the disclosure comprises an amino acid sequence derived from a reference B. amyloliquefaciens BPN' signal sequence (SEQ ID NO: 10). In certain embodiments, amino acid modifications of the signal sequence described herein are numbered by reference to amino acid positions 1-30 of SEQ ID NO: 10 as shown in FIG. 4.
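As a non-limiting illustration of numbering amino acid modifications by reference to positions in a reference sequence, the following minimal sketch applies a single substitution to a reference string using 1-based positions. The function name `apply_substitution`, the “T3S” style notation (original residue, position, replacement residue), and the five-residue placeholder sequence are assumptions made for illustration only; the placeholder is not SEQ ID NO: 5, SEQ ID NO: 7 or SEQ ID NO: 10.

```python
import re


def apply_substitution(reference: str, substitution: str) -> str:
    """Apply one substitution such as 'T3S' to `reference`.

    Positions are 1-based and counted along the reference sequence; the
    original residue in the notation must match the reference at that
    position, otherwise a ValueError is raised.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"unrecognized substitution: {substitution!r}")
    original, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(reference):
        raise ValueError(f"position {position} is outside 1-{len(reference)}")
    if reference[position - 1] != original:
        raise ValueError(
            f"reference has {reference[position - 1]!r} at position {position}, not {original!r}"
        )
    return reference[:position - 1] + replacement + reference[position:]


if __name__ == "__main__":
    placeholder_reference = "MKTAY"  # hypothetical, for illustration only
    print(apply_substitution(placeholder_reference, "T3S"))
```

Validating the original residue against the reference guards against off-by-one numbering errors when the same notation is used for signal peptide and pro-region positions.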
[0061] As used herein, the variant Bacillus amyloliquefaciens (BPN') subtilisin protease comprising the amino acid sequence of SEQ ID NO: 3 is abbreviated “variant 1” subtilisin. More specifically, as described below in the Examples, the variant 1 subtilisin was used as a reporter to monitor protein expression in Bacillus sp. cells as described hereinafter. For instance, US Provisional Application Serial No. 63/482634, incorporated herein by reference in its entirety, generally discloses variant B. amyloliquefaciens (BPN') subtilisin proteases derived from the native B. amyloliquefaciens BPN' subtilisin, as well as methods and compositions for constructing such variants. However, the variant 1 subtilisin reporter protein (SEQ ID NO: 3) is not meant to be limiting, as other (mature) subtilisin sequences known in the art may be expressed/produced/secreted using one or more modified (variant) signal peptide sequence and/or modified (variant) pro-region sequences of the disclosure, as described herein.
[0062] As used herein, the term “untranslated region” may be abbreviated “UTR”.
[0063] As used herein, the phrases “five prime (5') untranslated region”, “5' untranslated region” and/or “5' transcript leader” may be used interchangeably and abbreviated as “5'-UTR”. As generally understood in the art, the 5'-UTR is known to be the region of a messenger RNA (mRNA) that is directly upstream (5') from the initiation codon.
[0064] A nucleic acid is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence. Generally, “operably linked” means that the DNA sequences being linked are contiguous, and, in the case of a signal sequence operably linked to a pro-region sequence operably linked to a mature protein sequence, contiguous and in reading phase. However, enhancers do not have to be contiguous. Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice. Thus, the term operably linked generally refers to the association (juxtaposition) of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other.
[0065] As used herein, an “rrnI-P2 promoter and 5'-UTR region” sequence (abbreviated, “rrnI-P2/aprE 5'-UTR”) refers to an upstream B. subtilis rrnI-P2 promoter operably linked to a downstream aprE gene 5'-UTR as presented in SEQ ID NO: 1.
[0066] As used herein, “suitable regulatory sequences” refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, transcription leader sequences, RNA processing site, effector binding site and stem-loop structures.
[0067] As used herein, a “host cell” refers to a cell that has the capacity to act as a host or expression vehicle for a newly introduced DNA sequence. Thus, in certain embodiments of the disclosure, the host cells are Gram-positive cells (e.g., Bacillus sp.) and/or Gram-negative cells (e.g., E. coli).
[0068] As used herein, a “modified cell” refers to a recombinant cell that comprises at least one genetic modification which is not present in the parental, reference, or control cell from which the modified cell is derived or obtained.
[0069] As used herein, when the expression/production of a protein of interest (POI) in a recombinant (modified) cell is being compared to the expression/production of the same POI in an unmodified (control) cell, it will be understood that the modified and unmodified cells are grown/cultivated/fermented under the same conditions (e.g., the same conditions such as media, temperature, pH and the like).
[0070] As used herein, an “increased amount”, when used in phrases such as a “recombinant cell ‘expresses/produces/secretes an increased amount’ of a protein of interest relative to the unmodified (control or isogenic) cell”, particularly refers to an “increased amount” of a protein of interest (POI) expressed/produced/secreted by the recombinant cell, which “increased amount” is always relative to the unmodified (control or isogenic) cells expressing/producing/secreting the same POI, wherein the modified and unmodified (control or isogenic) cells are grown/cultured/fermented under the same conditions.
[0071] As used herein, by “enhanced” protein production, “increasing” protein production or “increased” protein production is meant an increased amount of protein produced (e.g., a protein of interest). The protein may be produced inside the host cell, or secreted (or transported) into the culture medium. In certain embodiments, the protein of interest is produced (secreted) into the culture medium. Increased protein production may be detected, for example, as a higher maximal level of protein or enzymatic activity (e.g., amylase or protease activity), or total extracellular protein produced, as compared to the parental host cell.
[0072] As used herein, the terms “modification” and “genetic modification” are used interchangeably and include, but are not limited to, (a) the insertion, substitution, or removal (deletion) of one or more nucleotides in a gene (or an ORF/CDS thereof), or the introduction, substitution, or removal of one or more nucleotides in a regulatory element required for the transcription or translation of the gene or ORF/CDS thereof, (b) a gene disruption, (c) a gene conversion, (d) a gene deletion, (e) the down-regulation of a gene, (f) specific mutagenesis and/or (g) random mutagenesis of any one or more of the genes disclosed herein.
[0073] As used herein, the term “introducing”, as used in phrases such as “introducing into a Gram-positive bacterial cell a ‘gene’, a ‘polynucleotide’, an ‘open reading frame’ (ORF), a ‘gene coding sequence’, a ‘vector’, an ‘expression cassette’”, and the like, includes methods known in the art for introducing polynucleotides (DNA) into a cell, including, but not limited to, protoplast fusion, natural or artificial transformation (e.g., calcium chloride, electroporation), transduction, transfection, conjugation and the like.
[0074] As used herein, “transformed” or “transformation” mean a cell has been transformed by use of recombinant DNA techniques. Transformation typically occurs by insertion of one or more nucleotide sequences (e.g., a polynucleotide, an ORF or gene) into a cell. The inserted nucleotide sequence may be a heterologous nucleotide sequence (i.e., a sequence that is not naturally occurring in the cell that is to be transformed). Transformation therefore generally refers to introducing an exogenous DNA into a host cell so that the DNA is maintained as a chromosomal integrant or a self-replicating extra-chromosomal vector.
[0075] As used herein, “transforming DNA”, “transforming sequence”, and “DNA construct” refer to DNA that is used to introduce sequences into a host cell or organism. The DNA may be generated in vitro by PCR or any other suitable techniques. In some embodiments, the transforming DNA comprises an incoming sequence, while in other embodiments it further comprises an incoming sequence flanked by homology boxes. In yet a further embodiment, the transforming DNA comprises other non-homologous sequences, added to the ends (i.e., stuffer sequences or flanks). The ends can be closed such that the transforming DNA forms a closed circle, such as, for example, insertion into a vector.
[0076] As used herein, “disruption of a gene” or a “gene disruption” are used interchangeably and refer broadly to any genetic modification that substantially prevents a host cell from producing a functional gene product (e.g., a protein). Thus, as used herein, a gene disruption includes, but is not limited to, frameshift mutations, premature stop codons (i.e., such that a functional protein is not made), substitutions eliminating or reducing activity of the protein, internal deletions (such that a functional protein is not made), insertions disrupting the coding sequence, mutations removing the operable link between a native promoter required for transcription and the open reading frame, and the like.
[0077] As used herein, “an incoming sequence” refers to a DNA sequence that is introduced into the bacterial cell chromosome. In some embodiments, the incoming sequence is part of a DNA construct. In other embodiments, the incoming sequence encodes one or more proteins of interest. In some embodiments, the incoming sequence comprises a sequence that may or may not already be present in the genome of the cell to be transformed (i.e., it may be either a homologous or heterologous sequence). In some embodiments, the incoming sequence encodes one or more proteins of interest, a gene, and/or a mutated or modified gene. In alternative embodiments, the incoming sequence encodes a functional wild-type gene or operon, a functional mutant gene or operon, or a non-functional gene or operon. In some embodiments, the non-functional sequence may be inserted into a gene to disrupt function of the gene. In another embodiment, the incoming sequence includes a selective marker. In a further embodiment, the incoming sequence includes two homology boxes.
[0078] As used herein, “homology box” refers to a nucleic acid sequence which is homologous to a sequence in the bacterial cell chromosome. More specifically, a homology box is an upstream or downstream region having between about 80 and 100% sequence identity, between about 90 and 100% sequence identity, or between about 95 and 100% sequence identity with the immediate flanking coding region of a gene or part of a gene to be deleted, disrupted, inactivated, down-regulated and the like, according to the invention. These sequences direct where in the bacterial cell chromosome a DNA construct is integrated and direct what part of the chromosome is replaced by the incoming sequence. While not meant to limit the present disclosure, a homology box may include between about 1 base pair (bp) and 200 kilobases (kb). Preferably, a homology box includes between about 1 bp and 10.0 kb; between 1 bp and 5.0 kb; between 1 bp and 2.5 kb; between 1 bp and 1.0 kb; or between 0.25 kb and 2.5 kb. A homology box may also include about 10.0 kb, 5.0 kb, 2.5 kb, 2.0 kb, 1.5 kb, 1.0 kb, 0.5 kb, 0.25 kb or 0.1 kb. In some embodiments, the 5' and 3' ends of a selective marker are flanked by a homology box, wherein the homology box comprises nucleic acid sequences immediately flanking the coding region of the gene.
[0079] As used herein, a host cell “genome”, a bacterial (host) cell “genome”, or a Bacillus sp. (host) cell “genome” includes chromosomal and extrachromosomal genes.
[0080] As used herein, the terms “plasmid”, “vector” and “cassette” refer to extrachromosomal elements, often carrying genes which are typically not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single-stranded or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3' untranslated sequence into a cell.
[0081] As used herein, the term “plasmid” refers to a circular double-stranded (ds) DNA construct used as a cloning vector, and which forms an extrachromosomal self-replicating genetic element in many bacteria and some eukaryotes. In some embodiments, plasmids become incorporated into the genome of the host cell; in some embodiments, plasmids exist in a parental cell and are lost in the daughter cell.
[0082] As used herein, a “transformation cassette” refers to a specific vector comprising a gene (or ORF thereof), and having elements in addition to the foreign gene that facilitate transformation of a particular host cell.
[0083] As used herein, the term “vector” refers to any nucleic acid that can be replicated (propagated) in cells and can carry new genes or DNA segments into cells. Thus, the term refers to a nucleic acid construct designed for transfer between different host cells. Vectors include viruses, bacteriophage, pro-viruses, plasmids, phagemids, transposons, and artificial chromosomes such as YACs (yeast artificial chromosomes), BACs (bacterial artificial chromosomes), PLACs (plant artificial chromosomes), and the like, that are “episomes” (i.e., replicate autonomously or can integrate into a chromosome of a host organism).
[0084] An “expression vector” refers to a vector that has the ability to incorporate and express heterologous DNA in a cell. Many prokaryotic and eukaryotic expression vectors are commercially available and known to one skilled in the art. Selection of appropriate expression vectors is within the knowledge of one skilled in the art.
[0085] As used herein, the terms “expression cassette” and “expression vector” refer to a nucleic acid construct generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a target cell (i.e., these are vectors or vector elements, as described above). The recombinant expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant expression cassette portion of an expression vector includes, among other sequences, a nucleic acid sequence to be transcribed and a promoter. In some embodiments, DNA constructs also include a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a target cell. In certain embodiments, a DNA construct of the disclosure comprises a selective marker and an inactivating chromosomal or gene or DNA segment as defined herein.
[0086] As used herein, a “targeting vector” is a vector that includes polynucleotide sequences that are homologous to a region in the chromosome of a host cell into which the targeting vector is transformed and that can drive homologous recombination at that region. For example, targeting vectors find use in introducing mutations into the chromosome of a host cell through homologous recombination. In some embodiments, the targeting vector comprises other non-homologous sequences, e.g., added to the ends (i.e., stuffer sequences or flanking sequences). The ends can be closed such that the targeting vector forms a closed circle, such as, for example, insertion into a vector. For example, in certain embodiments, a parental B. licheniformis (host) cell is modified (e.g., transformed) by introducing therein one or more “targeting vectors”.
[0087] As used herein, the term “protein of interest” or “POI” refers to a polypeptide of interest that is desired to be expressed in a modified (recombinant) Gram-positive host cell. Thus, as used herein, a POI may be an enzyme, a substrate-binding protein, a surface-active protein, a structural protein, a receptor protein, and the like. In certain embodiments, a modified cell of the disclosure produces an increased amount of a heterologous protein of interest relative to a control cell. In particular embodiments, an increased amount of a protein of interest produced by a modified cell of the disclosure is at least a 0.5% increase, at least a 1.0% increase, at least a 5.0% increase, or a greater than 5.0% increase, relative to the control cell.
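As a non-limiting illustration of the relative increases recited above, the following minimal sketch computes the percent increase in POI produced by a modified cell relative to a control cell and reports which of the recited thresholds (0.5%, 1.0%, 5.0%) are met. The function names and the example amounts are hypothetical; the sketch assumes both amounts were measured in the same units from cells cultivated under the same conditions.

```python
# Thresholds recited in the paragraph above, in percent.
THRESHOLDS = (0.5, 1.0, 5.0)


def percent_increase(modified: float, control: float) -> float:
    """Percent increase of `modified` over `control` (same units)."""
    if control <= 0:
        raise ValueError("control amount must be positive")
    return (modified - control) / control * 100.0


def thresholds_met(modified: float, control: float):
    """Return the recited thresholds satisfied by the observed increase."""
    increase = percent_increase(modified, control)
    return [t for t in THRESHOLDS if increase >= t]


if __name__ == "__main__":
    # Hypothetical amounts in arbitrary units.
    print(percent_increase(102.0, 100.0), thresholds_met(102.0, 100.0))
```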
[0088] Similarly, as defined herein, a “gene of interest” or “GOI” refers to a nucleic acid sequence (e.g., a polynucleotide, a gene or an ORF) which encodes a POI. A “gene of interest” encoding a “protein of interest” may be a naturally occurring gene, a mutated gene or a synthetic gene.
[0089] As used herein, the terms “polypeptide” and “protein” are used interchangeably, and refer to polymers of any length comprising amino acid residues linked by peptide bonds. The conventional one (1) letter or three (3) letter codes for amino acid residues are used herein. The polypeptide may be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids. The term polypeptide also encompasses an amino acid polymer that has been modified naturally or by intervention; for example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation or modification, such as conjugation with a labeling component. Also included within the definition are, for example, polypeptides containing one or more analogs of an amino acid (including, for example, unnatural amino acids, etc.), as well as other modifications known in the art.
[0090] In certain embodiments, a gene of the instant disclosure encodes a commercially relevant industrial protein of interest, such as an enzyme (e.g., acetyl esterases, aminopeptidases, amylases, arabinases, arabinofuranosidases, carbonic anhydrases, carboxypeptidases, catalases, cellulases, chitinases, chymosins, cutinases, deoxyribonucleases, epimerases, esterases, α-galactosidases, β-galactosidases, α-glucanases, glucan lysases, endo-β-glucanases, glucoamylases, glucose oxidases, α-glucosidases, β-glucosidases, glucuronidases, glycosyl hydrolases, hemicellulases, hexose oxidases, hydrolases, invertases, isomerases, laccases, lipases, lyases, mannosidases, oxidases, oxidoreductases, pectate lyases, pectin acetyl esterases, pectin depolymerases, pectin methyl esterases, pectinolytic enzymes, perhydrolases, polyol oxidases, peroxidases, phenoloxidases, phytases, polygalacturonases, proteases, peptidases, rhamno-galacturonases, ribonucleases, transferases, transport proteins, transglutaminases, xylanases, hexose oxidases, and combinations thereof).
[0091] As used herein, a “variant” polypeptide refers to a polypeptide that is derived from a parent (or reference or control) polypeptide by the substitution, insertion, or deletion of one or more amino acids, typically by recombinant DNA techniques. Variant polypeptides may differ from a reference polypeptide by a small number of amino acid residues and may be defined by their level of primary amino acid sequence homology/identity with the reference polypeptide. In one or more embodiments, variant polypeptides have at least about 40% to about 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or even at least 99% to about 100% amino acid sequence identity with a reference polypeptide sequence.
[0092] As used herein, a “variant” polynucleotide refers to a polynucleotide having a specified degree of sequence homology/identity with a parent (reference or control) polynucleotide, or hybridizes with a parent (reference or control) polynucleotide (or a complement thereof) under stringent hybridization conditions. In certain embodiments, a variant polynucleotide comprises at least about 40% to about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% to about 100% nucleotide sequence identity with a reference polynucleotide sequence.
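As a non-limiting illustration of how a percent sequence identity of the kind recited above might be computed, the following minimal sketch compares two sequences that have already been aligned to equal length, with “-” marking gaps. The function name `percent_identity` and the example strings are hypothetical, and the choice of the full alignment length as the denominator is an assumption; alignment programs differ in how they treat gapped columns.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    A column counts as identical only when both sequences carry the same
    non-gap character; the denominator is the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return matches / len(aligned_a) * 100.0


if __name__ == "__main__":
    # Hypothetical aligned fragments: three of four columns identical.
    print(percent_identity("ACGT", "ACGA"))
```

The same function applies to aligned amino acid sequences, since it compares characters column by column without reference to the alphabet.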
[0093] As used herein, a “mutation” refers to any change or alteration in a nucleic acid sequence. Several types of mutations exist, including point mutations, deletion mutations, silent mutations, frame shift mutations, splicing mutations and the like. Mutations may be performed specifically (e.g., via site directed mutagenesis) or randomly (e.g., via chemical agents, passage through repair minus bacterial strains).
[0094] As used herein, “specific productivity” is the total amount of protein produced per cell per time over a given time period.
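As a non-limiting illustration of the definition above, the following minimal sketch divides the amount of protein produced over a period by the cell count and the elapsed time. The function name `specific_productivity`, the units, and the use of a single representative cell count for the period are assumptions made for illustration; in practice the cell count changes during cultivation and an averaged or integrated value may be used instead.

```python
def specific_productivity(protein_amount: float, cell_count: float, hours: float) -> float:
    """Protein produced per cell per hour over the given period.

    `protein_amount` is the total protein produced during the period and
    `cell_count` a representative number of cells for that period.
    """
    if cell_count <= 0 or hours <= 0:
        raise ValueError("cell_count and hours must be positive")
    return protein_amount / (cell_count * hours)


if __name__ == "__main__":
    # Hypothetical values in arbitrary units.
    print(specific_productivity(6.0, 2.0, 3.0))
```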
[0095] As used herein, the terms “purified”, “isolated” or “enriched” mean that a biomolecule (e.g., a polypeptide or polynucleotide) is altered from its natural state by virtue of separating it from some, or all of, the naturally occurring constituents with which it is associated in nature. Such isolation or purification may be accomplished by art-recognized separation techniques such as ion exchange chromatography, affinity chromatography, hydrophobic separation, dialysis, protease treatment, ammonium sulphate precipitation or other protein salt precipitation, centrifugation, size exclusion chromatography, filtration, microfiltration, gel electrophoresis or separation on a gradient to remove whole cells, cell debris, impurities, extraneous proteins, or enzymes undesired in the final composition. It is further possible to then add constituents to a purified or isolated biomolecule composition which provide additional benefits, for example, activating agents, anti-inhibition agents, desirable ions, compounds to control pH or other enzymes or chemicals.
II. MODIFIED SIGNAL AND PRO-REGION SEQUENCES FOR ENHANCED PROTEASE PRODUCTION
[0096] As briefly set forth above, and further described hereinafter, certain embodiments of the disclosure are related to, inter alia, polynucleotides encoding genetically modified signal peptide sequences, polynucleotides encoding genetically modified pro-region sequences, polynucleotides encoding mature proteases, recombinant polynucleotides comprising a first polynucleotide encoding a signal peptide sequence operably linked to a second polynucleotide encoding a pro-region sequence operably linked to a third polynucleotide encoding a mature protease, expression cassettes thereof, and the like.
[0097] More particularly, as set forth and exemplified herein, in certain embodiments, Applicant constructed expression cassettes (Example 1) encoding a mature reporter protease (i.e., subtilisin variant 1; SEQ ID NO: 3), wherein the cassettes generally comprise an upstream promoter region (DNA) sequence operably linked to a downstream (DNA) sequence encoding a signal peptide sequence operably linked to a downstream (DNA) sequence encoding a pro-region sequence operably linked to a downstream (DNA) sequence encoding a mature protease operably linked to a downstream transcriptional terminator (DNA) sequence.
[0098] For instance, as described below in Example 1, a DNA fragment comprising an upstream aprE gene flanking region (aprE 5'-FR) was operably linked to a polynucleotide expression cassette (FIG. 1) comprising a heterologous B. subtilis rrnI-P2 promoter/aprE 5'-UTR region (DNA) sequence (SEQ ID NO: 1) operably linked to a DNA sequence (SEQ ID NO: 4) encoding a reference signal peptide sequence (FIG. 2, SEQ ID NO: 5) operably linked to a DNA sequence (SEQ ID NO: 6) encoding a reference pro-region sequence (FIG. 3, SEQ ID NO: 7) operably linked to a DNA sequence encoding the mature subtilisin (SEQ ID NO: 3) operably linked to a BPN' terminator DNA sequence (SEQ ID NO: 8), which polynucleotide construct was operably linked to a downstream aprE gene flanking region (aprE 3'-FR) sequence which includes a kanamycin (kan) gene expression cassette (SEQ ID NO: 9). This DNA fragment was used as template to develop linear DNA expression cassettes (FIG. 1) comprising a signal peptide sequence modification or a pro-region sequence modification described herein. In particular, the cassettes were used to transform competent B. subtilis cells, wherein the transformation mixtures were grown overnight at 37°C, and single colonies were sorted and grown at 37°C under antibiotic selection.
[0099] Subsequently, sequence analysis was performed to determine novel signal peptide sequence and pro-region sequence variants that were cherry picked into 384-well microtiter plates (MTPs). For example, in the case of the signal peptide sequence (FIG. 2B, AprE_ss; SEQ ID NO: 5), the first (1st) amino acid position is a valine (“Val” or “V”), which was altered (substituted) to all other nineteen (19) naturally occurring amino acid residues, the second (2nd) amino acid position is an arginine (“Arg” or “R”), which was altered (substituted) to all other nineteen (19) naturally occurring amino acid residues, and so on, until all twenty-nine (29) positions of the signal sequence (AprE_ss; SEQ ID NO: 5) were evaluated. In addition to the substitutions at each position, amino acid insertions or deletions were introduced and evaluated at each position of the signal sequence (AprE_ss; SEQ ID NO: 5).
[0100] Likewise, in the case of the pro-region sequence (FIG. 3B, BPN'_PRO; SEQ ID NO: 7), the first (1st) amino acid position is an alanine (“Ala” or “A”), which was altered (substituted) to all other nineteen (19) naturally occurring amino acid residues, and the second (2nd) amino acid position is a glycine (“Gly” or “G”), which was altered (substituted) to all other nineteen (19) naturally occurring amino acid residues, and so on, until all seventy-seven (77) positions of the pro-region sequence (BPN'_PRO; SEQ ID NO: 7) were evaluated. In addition to the substitutions at each position, amino acid insertions or deletions were introduced and evaluated at each position of the pro-region sequence (BPN'_PRO; SEQ ID NO: 7).
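By way of illustration only, the position-by-position scan described above can be sketched in Python. The function name `enumerate_sel` is hypothetical, and the three-residue input is merely the first three residues named in the text (V1, R2, S3), not a complete sequence of the disclosure. Insertions are omitted for brevity.

```python
# The 20 naturally occurring amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def enumerate_sel(reference: str) -> dict:
    """List single-site substitutions and deletions for a reference sequence.

    Names follow the notation used in this disclosure: "V1G" for a
    substitution and "S3Z" for a deletion (Z marks the deleted residue).
    """
    substitutions = []
    deletions = []
    for position, residue in enumerate(reference, start=1):
        for replacement in AMINO_ACIDS:
            if replacement != residue:
                substitutions.append(f"{residue}{position}{replacement}")
        deletions.append(f"{residue}{position}Z")
    return {"substitutions": substitutions, "deletions": deletions}


# Toy input: only the first three residues named in the text (V1, R2, S3).
library = enumerate_sel("VRS")
print(len(library["substitutions"]))  # 3 positions x 19 alternative residues
print(library["deletions"])
```

Under this scheme a sequence of N residues gives N x 19 possible single substitutions and N possible single deletions; the number of variants actually constructed and screened may be smaller.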
[0101] Thus, for the reporter protease expression experiments described below in Example 2, transformed cells were grown in 96-well MTPs in cultivation medium for two days at 32°C, 270 rpm, with 80% humidity in a shaking incubator, and the cultures were then centrifuged and filtered. Clarified culture supernatants were used to assay reporter protease activity and determine productivity levels, wherein samples were taken after forty (40) hours. More particularly, the activity of the reporter protease (Example 2) was determined by measuring the hydrolysis of the synthetic suc-AAPF-pNA peptide substrate, wherein the protease activity was expressed as mOD/minute. The protease activity of each SEL variant constructed (Example 1) was measured and compared to a reference construct that was grown in the same plate. For example, by dividing the value of a variant sample by the value of the reference sample ([variant]/[reference]), the performance index (PI) of the signal peptide sequence SEL (TABLE 1) and pro-region sequence SEL (TABLE 2) was determined.
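As a minimal sketch of this calculation, the following Python example estimates an activity in mOD/minute as the slope of absorbance versus time and then forms the performance index. It assumes that PI is the variant activity divided by the reference activity from the same plate, so that values above 1 indicate improved productivity. The least-squares slope, the function names and the absorbance readings are illustrative assumptions, not data from the Examples.

```python
def activity_mod_per_min(times_min, absorbance_mod):
    """Least-squares slope of absorbance (mOD) versus time (minutes)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbance_mod) / n
    numerator = sum(
        (t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbance_mod)
    )
    denominator = sum((t - mean_t) ** 2 for t in times_min)
    return numerator / denominator


def performance_index(variant_activity, reference_activity):
    """PI = variant / reference, both measured on the same plate."""
    if reference_activity <= 0:
        raise ValueError("reference activity must be positive")
    return variant_activity / reference_activity


# Made-up readings for illustration only.
times = [0, 1, 2, 3]
reference_rate = activity_mod_per_min(times, [100, 150, 200, 250])
variant_rate = activity_mod_per_min(times, [100, 160, 220, 280])
print(performance_index(variant_rate, reference_rate))
```

Normalizing each variant to the reference grown in the same plate reduces the influence of plate-to-plate variation on the comparison.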
[0102] In particular, TABLE 1 (Example 2) presents the PI of certain signal peptide SEL variants having improved protease productivity as compared to (vis-a-vis) the reference construct. For instance, as presented in TABLE 1, certain amino acid substitutions at position S3 (S3F, S3I or S3T), position L10 (L10W), position A13 (A13T), position L14 (L14P), position T15 (T15V), position T19 (T19V) or position M20 (M20D) of the signal sequence produced increased amounts of the reporter protease relative to the reference signal peptide sequence (SEQ ID NO: 5). As further shown in TABLE 1, certain amino acid insertions immediately following the lysine (K) at position 5, immediately following the leucine (L) at position 6, or immediately following the threonine (T) at position 19 produced increased amounts of the reporter protease relative to the reference signal peptide sequence (SEQ ID NO: 5). In addition, as presented in TABLE 1, certain amino acid deletions (Z) at positions S3 (S3Z), L14 (L14Z), L16 (L16Z), F18 (F18Z) or T19 (T19Z) of the signal sequence produced increased amounts of the reporter protease relative to the reference signal peptide sequence (SEQ ID NO: 5). Thus, as summarized in TABLE 1, out of the 445 variant (SEL) signal sequences constructed, 17 of the variant signal sequences had improved protease productivity (PI values) as compared to the reference sequence.
[0103] Likewise, TABLE 2 (Example 2) presents the PI of certain pro-region sequence SEL variants having improved protease productivity as compared to (vis-a-vis) the reference construct. For example, as shown in TABLE 2, certain amino acid substitutions at position K9 (K9M), position T17 (T17G), position S19 (S19K or S19T), position T20 (T20H or T20M), position M21 (M21L), position A23 (A23N), position K36 (K36E), position Q38 (Q38I), position T50 (T50E), position K57 (K57A), position E58 (E58D or E58Q), position K60 (K60E), position D62 (D62H), position S64 (S64Y) or position E70 (E70R) of the pro-region sequence produced increased amounts of the reporter protease relative to the reference pro-region sequence (SEQ ID NO: 7). As further shown in TABLE 2, certain amino acid insertions immediately following the glutamic acid (E) at position 7, immediately following the alanine (A) at position 24 or immediately following the alanine (A) at position 76 produced increased amounts of the reporter protease relative to the reference pro-region sequence (SEQ ID NO: 7). In addition, as presented in TABLE 2, deletion (Z) of the threonine (T) at position 20 (T20Z) produced increased amounts of the reporter protease relative to the reference pro-region sequence (SEQ ID NO: 7). As summarized in TABLE 2, out of the 1,461 variant (SEL) pro-region sequences constructed, 22 of the variant pro-region sequences had improved protease productivity (PI values) as compared to the reference sequence.
[0104] In certain other embodiments, expression cassettes (Example 3) encoding a mature reporter protease (i.e., subtilisin variant 1; SEQ ID NO: 3) are constructed, wherein the cassettes comprise an upstream promoter region (DNA) sequence operably linked to a downstream (DNA) sequence encoding a signal peptide sequence operably linked to a downstream (DNA) sequence encoding a pro-region sequence operably linked to a downstream (DNA) sequence encoding a mature protease operably linked to a downstream transcriptional terminator (DNA) sequence. For instance, as described below in Example 3, a DNA fragment comprising an upstream aprE gene flanking region (aprE 5'-FR) is operably linked to a polynucleotide expression cassette comprising a heterologous B. subtilis rrnI-P2 promoter/aprE 5'-UTR region (DNA) sequence (SEQ ID NO: 1) operably linked to a DNA sequence (SEQ ID NO: 11) encoding a reference signal peptide sequence (FIG. 4, SEQ ID NO: 10) operably linked to a DNA sequence (SEQ ID NO: 6) encoding a reference pro-region sequence (FIG. 3, SEQ ID NO: 7) operably linked to a DNA sequence encoding the mature subtilisin (SEQ ID NO: 3) operably linked to a BPN' terminator DNA sequence (SEQ ID NO: 8), which polynucleotide construct is operably linked to a downstream aprE gene flanking region (aprE 3'-FR) sequence which includes a kanamycin (kan) gene expression cassette (SEQ ID NO: 9). This DNA fragment is used as template to develop linear DNA expression cassettes comprising a signal peptide sequence modification described herein. In particular, the cassettes are used to transform competent B. subtilis cells, wherein the transformation mixtures are grown overnight at 37°C, and single colonies are sorted and grown at 37°C under antibiotic selection.
[0105] Subsequently, sequence analysis is performed to determine novel signal peptide sequence variants that are cherry picked into 384-well microtiter plates (MTPs). For example, in the case of the signal peptide sequence (FIG. 4B, BPN'_ss; SEQ ID NO: 10), the first (1st) amino acid position is a methionine (“Met” or “M”), which is altered (substituted) to all other nineteen (19) naturally occurring amino acid residues, the second (2nd) amino acid position is an arginine (“Arg” or “R”), which is altered (substituted) to all other nineteen (19) naturally occurring amino acid residues, and so on, until all thirty (30) positions of the signal sequence (BPN'_ss; SEQ ID NO: 10) are evaluated. In addition to the substitutions at each position, amino acid insertions or deletions are introduced and evaluated at each position of the signal sequence (BPN'_ss; SEQ ID NO: 10).
[0106] As set forth below in Example 4, reporter protease expression experiments are performed as generally described in Example 2, wherein the clarified culture supernatants are used to assay reporter protease activity and determine productivity levels after forty (40) hours. More particularly, the protease activity of each SEL variant constructed (Example 3) is measured and compared to a reference construct that is grown in the same plate. For example, by dividing the value of a variant sample by the value of the reference sample ([variant]/[reference]), the performance index (PI) of the signal peptide sequence SEL (TABLE 3) is determined.
[0107] In particular, TABLE 3 (Example 4) presents the PI of certain signal peptide SEL variants having improved protease productivity as compared to (vis-a-vis) the reference construct. For instance, as presented in TABLE 3, certain amino acid substitutions at position G3 (G3F, G3I or G3T), position L10 (L10W), position A13 (A13T), position L14 (L14P), position A15 (A15V), position T19 (T19V) or position M20 (M20D) of the signal sequence produced increased amounts of the reporter protease relative to the reference signal peptide sequence (SEQ ID NO: 10). As further shown in TABLE 3, certain amino acid insertions immediately following the lysine (K) at position 5 (Z5.01A), immediately following the leucine (L) at position 6 (Z6.01A), or immediately following the threonine (T) at position 19 (Z19.01A) produced increased amounts of the reporter protease relative to the reference signal peptide sequence (SEQ ID NO: 10). In addition, as presented in TABLE 3, certain amino acid deletions (Z) at positions G3 (G3Z), L14 (L14Z), L16 (L16Z), F18 (F18Z) or T19 (T19Z) of the signal sequence produced increased amounts of the reporter protease relative to the reference signal peptide sequence (SEQ ID NO: 10). Thus, as summarized in TABLE 3, out of the XXX variant (SEL) signal sequences constructed, YY of the variant signal sequences had improved protease productivity (PI values) as compared to the reference sequence.
[0108] Thus, certain one or more embodiments of the disclosure are related to, inter alia, polynucleotides encoding genetically modified signal peptide sequences, polynucleotides encoding genetically modified pro-region sequences, polynucleotides encoding mature proteases, recombinant polynucleotides comprising a first polynucleotide encoding a signal peptide sequence operably linked to a second polynucleotide encoding a pro-region sequence operably linked to a third polynucleotide encoding a mature protease, expression cassettes thereof, methods and compositions for constructing recombinant (modified) Bacillus sp. strains comprising introduced polynucleotides encoding a precursor protease, Bacillus sp. strains thereof secreting the mature protease into the fermentation broth when fermented under suitable conditions and the like.
[0109] In certain embodiments, a modified signal peptide sequence of the disclosure comprises an amino acid sequence derived from a native (reference) signal peptide sequence (SEQ ID NO: 5). In certain related embodiments, amino acid modifications of the signal peptide sequence described herein are numbered by reference to amino acid positions 1-29 of SEQ ID NO: 5 (e.g., FIG. 2).
[0110] In certain other one or more embodiments, a modified pro-region sequence of the disclosure comprises an amino acid sequence derived from a native (reference) pro-region sequence (SEQ ID NO: 7). In certain related embodiments, amino acid modifications of the pro-region described herein are numbered by reference to amino acid positions 1-77 of SEQ ID NO: 7 (e.g., FIG. 3).
[0111] In certain other embodiments, a modified signal peptide sequence of the disclosure comprises an amino acid sequence derived from a native (reference) signal peptide sequence (SEQ ID NO: 10). In certain related embodiments, amino acid modifications of the signal peptide sequence described herein are numbered by reference to amino acid positions 1-30 of SEQ ID NO: 10 (e.g., FIG. 4).
[0112] For example, the amino acid sequence of a signal peptide sequence variant described herein can be aligned with the amino acid sequence of SEQ ID NO: 5 or SEQ ID NO: 10 using an alignment algorithm, and each amino acid residue in the given amino acid sequence that aligns (preferably optimally aligns) with an amino acid residue in SEQ ID NO: 5 or SEQ ID NO: 10 is conveniently numbered by reference to the numerical position of that corresponding amino acid residue. Likewise, the amino acid sequence of a pro-region sequence variant described herein can be aligned with the amino acid sequence of SEQ ID NO: 7 using an alignment algorithm, and each amino acid residue in the given amino acid sequence that aligns (preferably optimally aligns) with an amino acid residue in SEQ ID NO: 7 is conveniently numbered by reference to the numerical position of that corresponding amino acid residue.
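The numbering-by-alignment approach can be illustrated with a small Python sketch. It uses a basic Needleman-Wunsch global alignment with a linear gap penalty; the scoring values (match +1, mismatch -1, gap -2), the function names and the toy sequences are assumptions chosen for illustration and are not the parameters or sequences of the disclosure.

```python
def global_align(reference, query, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    rows, cols = len(reference) + 1, len(query) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if reference[i - 1] == query[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    # Trace back from the bottom-right corner to recover one optimal path.
    aligned_ref, aligned_query = [], []
    i, j = len(reference), len(query)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if reference[i - 1] == query[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                aligned_ref.append(reference[i - 1])
                aligned_query.append(query[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_ref.append(reference[i - 1])
            aligned_query.append("-")
            i -= 1
        else:
            aligned_ref.append("-")
            aligned_query.append(query[j - 1])
            j -= 1
    return "".join(reversed(aligned_ref)), "".join(reversed(aligned_query))


def number_by_reference(reference, query):
    """Pair each query residue with the reference position it aligns to.

    Query residues aligned to a gap in the reference (insertions) get None.
    """
    aligned_ref, aligned_query = global_align(reference, query)
    numbered, position = [], 0
    for ref_residue, query_residue in zip(aligned_ref, aligned_query):
        if ref_residue != "-":
            position += 1
        if query_residue != "-":
            numbered.append((position if ref_residue != "-" else None, query_residue))
    return numbered


# Toy sequences only: the query lacks the residue at reference position 4.
print(number_by_reference("ACDEFGH", "ACDFGH"))
```

In practice an established alignment program with a substitution matrix would normally be used; the sketch only shows how reference positions carry over to the aligned variant.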
[0113] In certain aspects, reference to an amino acid “position” may be presented as a single letter amino acid (residue) followed by the position number (e.g., SEQ ID NO: 5, valine (V) at position 1 presented as “V1”, arginine (R) at position 2 presented as “R2”, etc.). In other embodiments, reference to a variant sequence may be presented as a single letter amino acid (residue), wherein the amino acid position of the reference sequence is numbered followed by the substituted amino acid (residue) at the same position (e.g., SEQ ID NO: 5, valine (V) at position 1 substituted with a glycine (G) presented as “V1G”, arginine (R) at position 2 substituted with a histidine (H) presented as “R2H”, etc.). Multiple amino acid residues may also be substituted at the same position of a signal sequence, wherein the amino acid position of the parent (reference) sequence is numbered followed by the mutated amino acid residue(s) at the same position separated by a forward slash (e.g., SEQ ID NO: 5, valine (V) at position 1 substituted with glycine (G), glutamic acid (E) or lysine (K) may be presented as “V1G/E/K”, etc.).
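The notation can be read mechanically, as in the following Python sketch. The parser name `parse_variant` and the returned dictionary layout are hypothetical. Treating “Z” as a deletion (e.g., “S3Z”) and “Z5.01A” as an insertion after position 5 follows the usage elsewhere in this disclosure; interpreting “.01” as the index of the inserted residue is an assumption.

```python
import re

AA = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard residues

# e.g. "V1G", "V1G/E/K", or "S3Z" (Z marks a deletion)
SUBSTITUTION = re.compile(rf"^([{AA}])(\d+)([{AA}Z](?:/[{AA}Z])*)$")
# e.g. "Z5.01A": residue A inserted after position 5
INSERTION = re.compile(rf"^Z(\d+)\.(\d+)([{AA}])$")


def parse_variant(name):
    """Expand one variant name into a list of single modifications."""
    insertion = INSERTION.match(name)
    if insertion:
        position, index, residue = insertion.groups()
        return [{
            "type": "insertion",
            "after": int(position),
            "index": int(index),  # assumed meaning of the ".01" suffix
            "residue": residue,
        }]
    match = SUBSTITUTION.match(name)
    if not match:
        raise ValueError(f"unrecognized variant name: {name!r}")
    original, position, replacements = match.groups()
    results = []
    for residue in replacements.split("/"):
        kind = "deletion" if residue == "Z" else "substitution"
        results.append({
            "type": kind,
            "original": original,
            "position": int(position),
            "residue": None if kind == "deletion" else residue,
        })
    return results


for example in ("V1G/E/K", "S3Z", "Z5.01A"):
    print(example, parse_variant(example))
```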
[0114] As used herein, in the context of a polypeptide or a sequence thereof, the term “substitution” means the replacement (i.e., substitution) of one amino acid with another amino acid.
[0115] As used herein, “homologous signal sequence”, “homologous pro-region sequence” and/or “homologous protease sequence” refer to sequences that have distinct similarity in primary, secondary, and/or tertiary structure.
[0116] As used herein, the term “homology” relates to homologous polynucleotides or polypeptides. If two or more polynucleotides, or two or more polypeptides are homologous, this means that the homologous polynucleotides or polypeptides have a “degree of identity” of at least about 40%, 50%, 60%, more preferably at least 70%, even more preferably at least 85%, still more preferably at least 90%, more preferably at least 95%, and most preferably at least 98%. Whether two polynucleotide or polypeptide sequences have a sufficiently high degree of identity to be homologous as defined herein, can suitably be investigated by aligning the two sequences using a computer program known in the art, such as “GAP” provided in the GCG program package (Program Manual for the Wisconsin Package, Version 8, August 1994, Genetics Computer Group, 575 Science Drive, Madison, Wisconsin, USA 53711) (Needleman and Wunsch, 1970), using GAP with the following settings for DNA sequence comparison: GAP creation penalty of 5.0 and GAP extension penalty of 0.3.
[0117] As set forth above, the term “identical” in the context of two polynucleotide or polypeptide sequences refers to the nucleotides or amino acids in the two sequences that are the same when aligned for maximum correspondence, as measured using sequence comparison or analysis algorithms described below and known in the art. The phrase “percent (%) identity” (abbreviated, “PID”) refers to polynucleotide (nucleic acid) or polypeptide (amino acid) sequence identity. Percent identity may be determined using standard techniques known in the art. In certain embodiments, the percent amino acid identity shared by sequences of interest can be determined by aligning the sequences to directly compare the sequence information, e.g., by using an alignment program/algorithm such as BLAST, MUSCLE, or CLUSTAL. For example, the BLAST algorithm has been described in Altschul et al. (1990) and Karlin et al. (1993). In particular, a percent (%) amino acid sequence identity value is determined by the number of matching identical residues divided by the total number of residues of the “reference” sequence including any gaps created by the program for optimal/maximum alignment. BLAST algorithms refer to the “reference” sequence as the “query” sequence.
[0118] The BLAST program uses several search parameters, most of which are set to the default values. The NCBI BLAST algorithm finds the most relevant sequences in terms of biological similarity but is not recommended for query sequences of less than 20 residues (Altschul et al., 1997 and Schaffer et al., 2001). Exemplary default BLAST parameters for nucleic acid sequence searches include: Neighboring words threshold=11; E-value cutoff=10; Scoring Matrix=NUC.3.1 (match=1, mismatch=-3); Gap Opening=5; and Gap Extension=2. Exemplary default BLAST parameters for amino acid sequence searches include: Word size=3; E-value cutoff=10; Scoring Matrix=BLOSUM62; Gap Opening=11; and Gap extension=1. Using this information, protein sequences can be grouped and/or a phylogenetic tree built therefrom. Amino acid sequences can be entered in a program such as the Vector NTI Advance suite and a Guide Tree can be created using the Neighbor Joining (NJ) method (Saitou and Nei, 1987). The tree construction can be calculated using Kimura’s correction for sequence distance and ignoring positions with gaps. A program such as AlignX can display the calculated distance values in parentheses following the molecule name displayed on the phylogenetic tree.
[0119] The CLUSTAL W algorithm is another example of a sequence alignment algorithm (Thompson et al., 1994). Default parameters for the CLUSTAL W algorithm include: Gap opening penalty=10.0; Gap extension penalty=0.05; Protein weight matrix=BLOSUM series; DNA weight matrix=IUB; Delay divergent sequences %=40; Gap separation distance=8; DNA transitions weight=0.50; List hydrophilic residues=GPSNDQEKR; Use negative matrix=OFF; Toggle Residue specific penalties=ON; Toggle hydrophilic penalties=ON; and Toggle end gap separation penalty=OFF. In CLUSTAL algorithms, deletions occurring at either terminus are included. For example, a variant with a five amino acid deletion at either terminus (or within the polypeptide) of a polypeptide of 500 amino acids would have a percent sequence identity of 99% (495/500 identical residues x 100) relative to the “reference” polypeptide. Such a variant would be encompassed by a variant having “at least 99% sequence identity” to the polypeptide.
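The percent identity arithmetic described above can be reproduced with a short Python sketch. It assumes the two sequences have already been aligned (gaps written as “-”) and divides the number of identical aligned residues by the length of the aligned reference, including any gaps. The function name and the repeated toy sequence are illustrative assumptions.

```python
def percent_identity(aligned_reference, aligned_query):
    """Identical aligned residues divided by aligned reference length, x 100."""
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_reference:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for ref_residue, query_residue in zip(aligned_reference, aligned_query)
        if ref_residue == query_residue and ref_residue != "-"
    )
    return 100.0 * matches / len(aligned_reference)


# Toy 500-residue reference and a variant lacking the last five residues.
reference = "ACDEFGHIKL" * 50
variant = reference[:495] + "-" * 5
print(percent_identity(reference, variant))
```

With these inputs the calculation is 495/500 x 100, matching the five-residue deletion example given in the text.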
III. RECOMBINANT POLYNUCLEOTIDES AND MOLECULAR BIOLOGY
[0120] As generally set forth above, certain embodiments of the disclosure provide, inter alia, polynucleotides encoding genetically modified signal peptide sequences, polynucleotides encoding genetically modified pro-region sequences, polynucleotides encoding mature proteases, recombinant polynucleotides comprising a first polynucleotide encoding a signal peptide sequence operably linked to a second polynucleotide encoding a pro-region sequence operably linked to a third polynucleotide encoding a mature protease, expression cassettes thereof, methods and compositions for constructing recombinant strains comprising introduced polynucleotides encoding a precursor protease, and the like.
[0121] Thus, certain embodiments are related to recombinant polynucleotides encoding a protease of interest. Certain other embodiments provide polynucleotide constructs suitable for introducing into recombinant Gram-positive bacterial strains for the enhanced production of a protease of interest. Certain other embodiments provide expression cassettes comprising (in the 5' to 3' direction) an upstream promoter region (DNA) sequence operably linked to a downstream DNA sequence encoding a signal peptide sequence operably linked to a downstream DNA sequence encoding a pro-region sequence operably linked to a downstream DNA encoding a mature protease of interest operably linked to a downstream transcriptional terminator DNA sequence (e.g., see FIG. 1A).
[0122] For example, one or more nucleic acid sequences described herein can be generated by using any suitable synthesis, manipulation, and/or isolation techniques, or combinations thereof. For example, one or more polynucleotides described herein may be produced using standard nucleic acid synthesis techniques, such as solid-phase synthesis techniques that are well-known to those skilled in the art. In such techniques, fragments of up to fifty (50) or more nucleotide bases are typically synthesized, then joined (e.g., by enzymatic or chemical ligation methods) to form essentially any desired continuous nucleic acid sequence. The synthesis of the one or more polynucleotides described herein can also be facilitated by any suitable method known in the art, including but not limited to chemical synthesis using the classical phosphoramidite method (e.g., Beaucage and Caruthers, 1981) or the method described by Matthes et al. (1984) as is typically practiced in automated synthetic methods. One or more polynucleotides described herein can also be produced by using an automatic DNA synthesizer. Customized nucleic acids can be ordered from a variety of commercial sources (e.g., ATUM (DNA 2.0), Newark, CA, USA; Life Tech (GeneArt), Carlsbad, CA, USA; GenScript, Ontario, Canada; BaseClear B.V., Leiden, Netherlands; Integrated DNA Technologies, Skokie, IL, USA; Ginkgo Bioworks (Gen9), Boston, MA, USA; and Twist Bioscience, San Francisco, CA, USA). Other techniques for synthesizing nucleic acids and related principles are described and known in the art.
[0123] Recombinant DNA techniques useful in modification of nucleic acids are well known in the art, such as, for example, restriction endonuclease digestion, ligation, reverse transcription and cDNA production, and polymerase chain reaction (e.g., PCR). One or more polynucleotides described herein may also be obtained by screening cDNA libraries using one or more oligonucleotide probes that can hybridize to or PCR-amplify polynucleotides which encode one or more variants described herein. Procedures for screening and isolating cDNA clones and PCR amplification procedures are well known to those of skill in the art and described in standard references known to those skilled in the art. One or more polynucleotides described herein can be obtained by altering a naturally occurring polynucleotide backbone (e.g., that encodes one or more variant pro-region sequences described herein) by, for example, a known mutagenesis procedure (e.g., site-directed mutagenesis, site saturation mutagenesis, and in vitro recombination). A variety of methods are known in the art that are suitable for generating modified polynucleotides described herein that encode one or more variants described herein, including, but not limited to, for example, site-saturation mutagenesis, scanning mutagenesis, insertional mutagenesis, deletion mutagenesis, random mutagenesis, site-directed mutagenesis, and directed-evolution, as well as various other recombinatorial approaches.
[0124] As generally set forth above and further described below in the Examples, certain embodiments of the disclosure are related to recombinant (modified) Gram-positive cells capable of producing increased amounts of heterologous proteins of interest. Certain embodiments are therefore related to methods for constructing such recombinant Gram-positive cells having increased protein production capabilities. In certain embodiments, one or more expression cassettes encoding a protein of interest are introduced into Gram-positive cells of the disclosure. In exemplary embodiments, the cassettes are integrated into the genome of the cell. Thus, certain embodiments are related to nucleic acid molecules, polynucleotides (e.g., vectors, plasmids, expression cassettes), regulatory elements, and the like, suitable for use in constructing recombinant (modified) Gram-positive host cells. An example of such a regulatory or control sequence may be a promoter sequence or a functional part thereof (i.e., a part which is sufficient for affecting expression of the nucleic acid sequence). Other control sequences for modification include, but are not limited to, a leader sequence, a pro-peptide sequence, a signal sequence, a transcription terminator, a transcriptional activator and the like.
[0125] In certain embodiments, a modified cell is produced or constructed via CRISPR-Cas9 editing. For example, a gene encoding a protein of interest can be edited or disrupted (or deleted or down-regulated) by means of nucleic acid guided endonucleases, that find their target DNA by binding either a guide RNA (e.g., Cas9 and Cpf1) or a guide DNA (e.g., NgAgo), which recruits the endonuclease to the target sequence on the DNA, wherein the endonuclease can generate a single or double stranded break in the DNA. This targeted DNA break becomes a substrate for DNA repair, and can recombine with a provided editing template to disrupt or delete the gene. For example, the gene encoding the nucleic acid guided endonuclease (for this purpose Cas9 from S. pyogenes) or a codon optimized gene encoding the Cas9 nuclease is operably linked to a promoter active in the Gram-positive cell and a terminator active in Gram-positive cells, thereby creating a Gram-positive cell Cas9 expression cassette. Likewise, one or more target sites unique to the gene of interest are readily identified by a person skilled in the art. For example, to build a DNA construct encoding a gRNA directed to a target site within the gene of interest, the variable targeting domain (VT) will comprise nucleotides of the target site which are 5' of the proto-spacer adjacent motif (PAM) (TGG), which nucleotides are fused to DNA encoding the Cas9 endonuclease recognition domain for S. pyogenes Cas9 (CER). The combination of the DNA encoding a VT domain and the DNA encoding the CER domain thereby generates a DNA encoding a gRNA. Thus, a Gram-positive expression cassette for the gRNA is created by operably linking the DNA encoding the gRNA to a promoter active in Gram-positive cells and a terminator active in Gram-positive cells.
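As a hedged illustration of locating candidate target sites, the Python sketch below scans one strand of a DNA sequence for an NGG motif (the PAM generally associated with S. pyogenes Cas9, of which TGG is one instance) and reports the nucleotides immediately 5' of it as a candidate variable targeting domain. The 20-nucleotide length, the function name and the toy sequence are assumptions; the reverse strand and uniqueness checks against the genome are omitted.

```python
def find_cas9_targets(dna, spacer_length=20):
    """Return (start, spacer, pam) for each NGG PAM found on the given strand.

    'start' is the 0-based index of the first spacer nucleotide.
    """
    dna = dna.upper()
    targets = []
    # The PAM needs spacer_length nucleotides before it and 3 nucleotides itself.
    for pam_start in range(spacer_length, len(dna) - 2):
        pam = dna[pam_start:pam_start + 3]
        if pam[1:] == "GG":
            spacer = dna[pam_start - spacer_length:pam_start]
            targets.append((pam_start - spacer_length, spacer, pam))
    return targets


# Toy sequence: 20 nucleotides followed by a TGG PAM.
toy = "ACGTACGTACGTACGTACGT" + "TGG" + "CCC"
for start, spacer, pam in find_cas9_targets(toy):
    print(start, spacer, pam)
```

A practical design would also scan the reverse complement and discard spacers that occur elsewhere in the host genome, consistent with the requirement above that target sites be unique to the gene of interest.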
[0126] In certain embodiments, the DNA break induced by the endonuclease is repaired/replaced with an incoming sequence. For example, to precisely repair the DNA break generated by the Cas9 expression cassette and the gRNA expression cassette described above, a nucleotide editing template is provided, such that the DNA repair machinery of the cell can utilize the editing template. For example, about 500 bp 5' of the targeted gene can be fused to about 500 bp 3' of the targeted gene to generate an editing template, which template is used by the Gram-positive host’s machinery to repair the DNA break generated by the RGEN.
[0127] The Cas9 expression cassette, the gRNA expression cassette and the editing template can be co-delivered to filamentous fungal cells using many different methods (e.g., protoplast fusion, electroporation, natural competence, or induced competence). The transformed cells are screened by PCR amplifying the target gene locus, by amplifying the locus with a forward and reverse primer. These primers can amplify the wild-type locus or the modified locus that has been edited by the RGEN. These fragments are then sequenced using a sequencing primer to identify edited colonies.
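The editing template described in paragraph [0126], in which about 500 bp upstream of the targeted gene is fused to about 500 bp downstream of it, can be sketched in Python as follows. The function name, the coordinate convention (0-based, end-exclusive) and the toy chromosome are assumptions made for illustration.

```python
def build_deletion_template(chromosome, gene_start, gene_end, arm_length=500):
    """Fuse the arm upstream of a gene to the arm downstream of it.

    gene_start and gene_end are 0-based, end-exclusive coordinates of the
    targeted gene within the chromosome string.
    """
    if not 0 <= gene_start < gene_end <= len(chromosome):
        raise ValueError("gene coordinates are out of range")
    if gene_start < arm_length or gene_end + arm_length > len(chromosome):
        raise ValueError("not enough flanking sequence for the requested arms")
    upstream = chromosome[gene_start - arm_length:gene_start]
    downstream = chromosome[gene_end:gene_end + arm_length]
    return upstream + downstream


# Toy chromosome: 500 nt upstream, a 300 nt "gene", 500 nt downstream.
toy_chromosome = "A" * 500 + "G" * 300 + "T" * 500
template = build_deletion_template(toy_chromosome, 500, 800)
print(len(template))
```

Because the template joins the two flanks directly, repair from it removes the intervening gene; a template that introduces a modified sequence instead would place that sequence between the two arms.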
[0128] PCT Publication No. WO2003/083125 discloses methods for modifying Gram-positive (Bacillus) cells, such as the creation of Bacillus deletion strains and DNA constructs using PCR fusion to bypass E. coli. PCT Publication No. WO2002/14490 discloses methods for modifying Bacillus cells including (1) the construction and transformation of an integrative plasmid (pComK), (2) random mutagenesis of coding sequences, signal sequences and pro-peptide sequences, (3) homologous recombination, (4) increasing transformation efficiency by adding non-homologous flanks to the transformation DNA, (5) optimizing double cross-over integrations, (6) site directed mutagenesis and (7) marker-less deletion.
[0129] Those of skill in the art are well aware of suitable methods for introducing polynucleotide sequences into bacterial cells (e.g., Gram-negative cells, Gram-positive cells). Indeed, such methods as transformation including protoplast transformation and congression, transduction, and protoplast fusion are known and suited for use in the present disclosure. Methods of transformation are particularly preferred to introduce a DNA construct of the present disclosure into a host cell.
[0130] In addition to commonly used methods, in some embodiments, host cells are directly transformed (i.e., an intermediate cell is not used to amplify, or otherwise process, the DNA construct prior to introduction into the host cell). Introduction of the DNA construct into the host cell includes those physical and chemical methods known in the art to introduce DNA into a host cell, without insertion into a plasmid or vector. Such methods include, but are not limited to, calcium chloride precipitation, electroporation, naked DNA, liposomes and the like. In additional embodiments, DNA constructs are co-transformed with a plasmid without being inserted into the plasmid. In further embodiments, a selective marker is deleted or substantially excised from the modified Bacillus strain by methods known in the art. In some embodiments, resolution of the vector from a host chromosome leaves the flanking regions in the chromosome, while removing the indigenous chromosomal region.
[0131] Promoters and promoter sequence regions for use in the expression of genes, coding sequences (CDS), open reading frames (ORFs) and/or variant sequences thereof in Gram-positive cells are generally known to one of skill in the art. Promoter sequences of the disclosure are generally chosen so that they are functional in the Gram-positive cells. For example, promoters useful for driving gene expression in Bacillus cells include, but are not limited to, the B. subtilis alkaline protease (aprE) promoter, the α-amylase promoter (amyE) of B. subtilis, the α-amylase promoter (amyL) of B. licheniformis, the α-amylase promoter of B. amyloliquefaciens, the neutral protease (nprE) promoter from B. subtilis, a mutant aprE promoter, or any other promoter from B. licheniformis or other related Bacilli. Methods for screening and creating promoter libraries with a range of activities (promoter strength) in Bacillus cells are described in PCT Publication No. WO2002/14490.
IV. FERMENTING GRAM-POSITIVE CELLS FOR THE PRODUCTION OF PROTEINS
[0132] As generally described above, certain embodiments are related to compositions and methods for constructing and obtaining Gram-positive cells having increased protein production phenotypes. Thus, certain embodiments are related to methods of producing proteins of interest in Gram-positive cells by fermenting the cells in a suitable medium. Fermentation methods well known in the art can be applied to ferment Gram-positive cells of the disclosure.
[0133] In some embodiments, the cells are cultured under batch or continuous fermentation conditions. A classical batch fermentation is a closed system, where the composition of the medium is set at the beginning of the fermentation and is not altered during the fermentation. At the beginning of the fermentation, the medium is inoculated with the desired organism(s). In this method, fermentation is permitted to occur without the addition of any components to the system. Typically, a batch fermentation qualifies as a “batch” with respect to the addition of the carbon source, and attempts are often made to control factors such as pH and oxygen concentration. The metabolite and biomass compositions of the batch system change constantly up to the time the fermentation is stopped. Within typical batch cultures, cells can progress through a static lag phase to a high growth log phase, and finally to a stationary phase, where growth rate is diminished or halted. If untreated, cells in the stationary phase eventually die. In general, cells in log phase are responsible for the bulk of production of product.
[0134] A suitable variation on the standard batch system is the “fed-batch” fermentation system. In this variation of a typical batch system, the substrate is added in increments as the fermentation progresses. Fed-batch systems are useful when catabolite repression likely inhibits the metabolism of the cells and where it is desirable to have limited amounts of substrate in the medium. Measurement of the actual substrate concentration in fed-batch systems is difficult and is therefore estimated on the basis of the changes of measurable factors, such as pH, dissolved oxygen and the partial pressure of waste gases, such as CO2. Batch and fed-batch fermentations are common and known in the art.
[0135] Continuous fermentation is an open system where a defined fermentation medium is added continuously to a bioreactor, and an equal amount of conditioned medium is removed simultaneously for processing. Continuous fermentation generally maintains the cultures at a constant high density, where cells are primarily in log phase growth. Continuous fermentation allows for the modulation of one or more factors that affect cell growth and/or product concentration. For example, in one embodiment, a limiting nutrient, such as the carbon source or nitrogen source, is maintained at a fixed rate and all other parameters are allowed to moderate. In other systems, a number of factors affecting growth can be altered continuously while the cell concentration, measured by media turbidity, is kept constant. Continuous systems strive to maintain steady state growth conditions. Thus, cell loss due to medium being drawn off should be balanced against the cell growth rate in the fermentation. Methods of modulating nutrients and growth factors for continuous fermentation processes, as well as techniques for maximizing the rate of product formation, are well known in the art of industrial microbiology.
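By way of illustration only, the balance between cell loss and cell growth noted in paragraph [0135] can be sketched with the textbook chemostat biomass balance, dX/dt = (mu - D) * X, where D is the dilution rate (feed rate divided by culture volume) and mu is the specific growth rate. This model, the function names and the numerical values below are assumptions introduced for the example and are not taken from the disclosure.

```python
def dilution_rate(feed_rate_l_per_h: float, volume_l: float) -> float:
    """Dilution rate D (1/h): medium fed, and drawn off, per unit culture volume."""
    if volume_l <= 0:
        raise ValueError("culture volume must be positive")
    return feed_rate_l_per_h / volume_l


def biomass_change_rate(mu_per_h: float, d_per_h: float, biomass_g_per_l: float) -> float:
    """Net biomass change dX/dt = (mu - D) * X for an idealized, well-mixed chemostat."""
    return (mu_per_h - d_per_h) * biomass_g_per_l


# Hypothetical values: 2 L/h of medium fed to, and removed from, a 10 L culture.
D = dilution_rate(2.0, 10.0)
at_steady_state = biomass_change_rate(0.2, D, 5.0)  # growth rate equals D: no net change
washing_out = biomass_change_rate(0.1, D, 5.0)      # growth rate below D: biomass declines
```

Under these assumptions, steady state corresponds to a specific growth rate equal to the dilution rate; a growth rate below the dilution rate leads to a declining cell concentration.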
[0136] In certain embodiments, a protein of interest expressed/produced by a Gram-positive cell of the disclosure may be recovered from the culture medium by conventional procedures including separating the host cells from the medium by centrifugation or filtration, or if necessary, disrupting the cells and removing the supernatant from the cellular fraction and debris. Typically, after clarification, the proteinaceous components of the supernatant or filtrate are precipitated by means of a salt, e.g., ammonium sulfate. The precipitated proteins are then solubilized and may be purified by a variety of chromatographic procedures, e.g., ion exchange chromatography, gel filtration.
[0137] In some embodiments, the cells are cultured under batch or continuous fermentation conditions. A classical batch fermentation is a closed system, where the composition of the medium is set at the beginning of the fermentation and is not altered during the fermentation. At the beginning of the fermentation, the medium is inoculated with the desired organism(s). In this method, fermentation is permitted to occur without the addition of any components to the system. Typically, a batch fermentation qualifies as a “batch” with respect to the addition of the carbon source, and attempts are often made to control factors such as pH and oxygen concentration. The metabolite and biomass compositions of the batch system change constantly up to the time the fermentation is stopped. Within typical batch cultures, cells can progress through a static lag phase to a high growth log phase, and finally to a stationary phase, where growth rate is diminished or halted. If untreated, cells in the stationary phase eventually die. In general, cells in log phase are responsible for the bulk of production of product.
[0138] A suitable variation on the standard batch system is the “fed-batch” fermentation system. In this variation of a typical batch system, the substrate is added in increments as the fermentation progresses. Fed-batch systems are useful when catabolite repression likely inhibits the metabolism of the cells and where it is desirable to have limited amounts of substrate in the medium. Measurement of the actual substrate concentration in fed-batch systems is difficult and is therefore estimated on the basis of the changes of measurable factors, such as pH, dissolved oxygen, and the partial pressure of waste gases, such as CO2. Batch and fed-batch fermentations are common and known in the art.
[0139] Continuous fermentation is an open system where a defined fermentation medium is added continuously to a bioreactor, and an equal amount of conditioned medium is removed simultaneously for processing. Continuous fermentation generally maintains the cultures at a constant high density, where cells are primarily in log phase growth. Continuous fermentation allows for the modulation of one or more factors that affect cell growth and/or product concentration. For example, in one embodiment, a limiting nutrient, such as the carbon source or nitrogen source, is maintained at a fixed rate and all other parameters are allowed to moderate. In other systems, a number of factors affecting growth can be altered continuously while the cell concentration, measured by media turbidity, is kept constant. Continuous systems strive to maintain steady state growth conditions. Thus, cell loss due to medium being drawn off should be balanced against the cell growth rate in the fermentation. Methods of modulating nutrients and growth factors for continuous fermentation processes, as well as techniques for maximizing the rate of product formation, are well known in the art of industrial microbiology.
[0140] In certain embodiments, a protein of interest expressed/produced by a Gram-positive cell of the disclosure may be recovered from the culture medium by conventional procedures including separating the host cells from the medium by centrifugation or filtration, or if necessary, disrupting the cells and removing the supernatant from the cellular fraction and debris. Typically, after clarification, the proteinaceous components of the supernatant or filtrate are precipitated by means of a salt, e.g., ammonium sulfate. The precipitated proteins are then solubilized and may be purified by a variety of chromatographic procedures, e.g., ion exchange chromatography, gel filtration.
V. PROTEINS OF INTEREST
[0141] A protein of interest (POI) of the instant disclosure can be any endogenous or heterologous protein, and it may be a variant of such a POI. The protein can contain one or more disulfide bridges or is a protein whose functional form is a monomer or a multimer, i.e., the protein has a quaternary structure and is composed of a plurality of identical (homologous) or non-identical (heterologous) subunits, wherein the POI or a variant POI thereof is preferably one with properties of interest.
[0142] For example, in certain embodiments, a modified Gram-positive cell of the disclosure produces at least about 0.1% more, at least about 0.5% more, at least about 1% more, at least about 5% more, at least about 6% more, at least about 7% more, at least about 8% more, at least about 9% more, or at least about 10% or more of a POI, relative to its unmodified (reference or control) cell.
[0143] In certain embodiments, a modified Gram-positive cell of the disclosure exhibits an increased specific productivity (Qp) of a POI relative to the control cell. For example, the detection of specific productivity (Qp) is a suitable method for evaluating protein production. The specific productivity (Qp) can be determined using the following equation:
“Qp = gP/(gDCW·hr)” wherein “gP” is grams of protein produced in the tank, “gDCW” is grams of dry cell weight (DCW) in the tank, and “hr” is fermentation time in hours from the time of inoculation, which includes the time of production as well as growth time.
[0144] Thus, in certain other embodiments, a modified Gram-positive cell of the disclosure comprises a specific productivity (Qp) increase of at least about 0.1%, at least about 1%, at least about 5%, at least about 6%, at least about 7%, at least about 8%, at least about 9%, or at least about 10% or more, relative to the unmodified (parental) cell.
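By way of illustration only, the specific productivity (Qp) defined above and its percentage increase relative to a parental cell can be computed as in the following Python sketch. The function names and all fermentation values are hypothetical and are not data from the disclosure; the sketch assumes the equation is read as grams of protein divided by the product of grams of dry cell weight and hours.

```python
def specific_productivity(g_protein: float, g_dcw: float, hours: float) -> float:
    """Qp = gP / (gDCW * hr), with hours counted from the time of inoculation."""
    if g_dcw <= 0 or hours <= 0:
        raise ValueError("dry cell weight and fermentation time must be positive")
    return g_protein / (g_dcw * hours)


def percent_increase(modified: float, control: float) -> float:
    """Percentage increase of a modified-cell value over a control-cell value."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return (modified - control) / control * 100.0


# Hypothetical tank measurements for a parental (control) and a modified cell.
qp_control = specific_productivity(g_protein=50.0, g_dcw=200.0, hours=40.0)
qp_modified = specific_productivity(g_protein=55.0, g_dcw=200.0, hours=40.0)
qp_gain = percent_increase(qp_modified, qp_control)  # roughly a 10% increase
```

Because both cells in this toy comparison have the same dry cell weight and fermentation time, the Qp increase equals the increase in grams of protein produced.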
[0145] In certain embodiments, a POI or a variant POI thereof is selected from the group consisting of acetyl esterases, aminopeptidases, amylases, arabinases, arabinofuranosidases, carbonic anhydrases, carboxypeptidases, catalases, cellulases, chitinases, chymosins, cutinases, deoxyribonucleases, epimerases, esterases, α-galactosidases, β-galactosidases, α-glucanases, glucan lysases, endo-β-glucanases, glucoamylases, glucose oxidases, α-glucosidases, β-glucosidases, glucuronidases, glycosyl hydrolases, hemicellulases, hexose oxidases, hydrolases, invertases, isomerases, laccases, ligases, lipases, lyases, mannosidases, oxidases, oxidoreductases, pectate lyases, pectin acetyl esterases, pectin depolymerases, pectin methyl esterases, pectinolytic enzymes, perhydrolases, polyol oxidases, peroxidases, phenoloxidases, phytases, polygalacturonases, proteases, peptidases, rhamno-galacturonases, ribonucleases, transferases, transport proteins, transglutaminases, xylanases, hexose oxidases, and combinations thereof.
[0146] In certain embodiments, a POI is a protease. For example, a protease (also known as a proteinase) is an enzyme (protein) that has the ability to break down other proteins. A protease has the ability to conduct proteolysis, by hydrolysis of peptide bonds that link amino acids together in a peptide or polypeptide chain forming the protein. This activity of a protease as a protein-digesting enzyme is termed a proteolytic activity. Many well-known procedures exist for measuring proteolytic activity. For instance, proteolytic activity may be ascertained by comparative assays which analyze the respective protease’s ability to hydrolyze a commercial substrate. Exemplary substrates useful in the analysis of protease or proteolytic activity include, but are not limited to, di-methyl casein (Sigma C-9801), bovine collagen (Sigma C-9879), bovine elastin (Sigma E-1625), and bovine keratin (ICN Biomedical 902111). Colorimetric assays utilizing these substrates are well known in the art.
[0147] In certain related embodiments, a protease of the disclosure is a serine protease (EC No. 3.4.21) possessing an active site serine that initiates hydrolysis of peptide bonds of proteins. Serine proteases comprise a diverse class of enzymes having a wide range of specificities and biological functions that are further divided based on their structure into chymotrypsin-like (trypsin-like) and subtilisin-like. The prototypical subtilisin (EC No. 3.4.21.62) was initially obtained from Bacillus subtilis. Subtilisins (also sometimes referred to as subtilases) and their homologues are members of the S8 peptidase family of the MEROPS classification scheme. Members of family S8 have a catalytic triad in the order Asp, His and Ser in their amino acid sequence. Thus, in one or more embodiments, a protease of the disclosure is a native or variant subtilisin, such as native and variant subtilisins described in PCT Publication No. WO1995/10615, PCT Publication No. WO2019/108599, PCT Publication No. WO2019/245705, PCT Publication No. WO2020/112599, US Patent No. 5,185,258 and US Patent No. 5,204,015.
[0148] In certain embodiments, a POI or a variant POI thereof is an enzyme selected from Enzyme Commission (EC) Number EC 1, EC 2, EC 3, EC 4, EC 5, or EC 6.
[0149] Various assays are known to those of ordinary skill in the art for detecting and measuring the activity of intracellularly and extracellularly expressed proteins.
VI. EXEMPLARY EMBODIMENTS
[0150] Non-limiting embodiments of compositions and methods disclosed herein are as follows:
[0151] 1. A polynucleotide encoding a modified signal peptide comprising an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions V1, S3, L10, A13, L14, T15, T19 and M20, the insertion is selected from a position immediately following one of positions K5, L6 and T19, or the deletion is selected from any one of positions S3, L14, L16, F18 and T19, wherein the amino acid positions of the modified signal peptide are numbered according to the reference signal peptide of SEQ ID NO: 5.
[0152] 2. The polynucleotide of embodiment 1, wherein the modified signal peptide comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 5.
[0153] 3. The polynucleotide of embodiment 1, wherein the substitution at position V1 is a methionine (V1M), the substitution at position S3 is a phenylalanine (S3F), an isoleucine (S3I) or a threonine (S3T), the substitution at position L10 is a tryptophan (L10W), the substitution at position A13 is a threonine (A13T), the substitution at position L14 is a proline (L14P), the substitution at position T15 is a valine (T15V), the substitution at position T19 is a valine (T19V) or the substitution at position M20 is an aspartic acid (M20D), or the amino acid insertion immediately following one of positions K5, L6 or T19 is an alanine (A).
[0154] 4. A polynucleotide encoding a modified signal peptide comprising an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions G3, L10, A13, L14, A15, T19 and M20, the insertion is selected from a position immediately following one of positions K5, V6 and T19, or the deletion is selected from any one of positions G3, L14, L16, F18 and T19, wherein the amino acid positions of the modified signal peptide are numbered according to the reference signal peptide of SEQ ID NO: 10.
[0155] 5. The polynucleotide of embodiment 4, wherein the modified signal peptide comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 10.
[0156] 6. The polynucleotide of embodiment 4, wherein the substitution at position G3 is a phenylalanine (G3F), an isoleucine (G3I) or a threonine (G3T), the substitution at position L10 is a tryptophan (L10W), the substitution at position A13 is a threonine (A13T), the substitution at position L14 is a proline (L14P), the substitution at position A15 is a valine (A15V), the substitution at position T19 is a valine (T19V) or the substitution at position M20 is an aspartic acid (M20D), or the amino acid insertion immediately following one of positions K5, V6 or T19 is an alanine (A).
[0157] 7. A polynucleotide encoding a modified pro-region comprising an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions K9, T17, S19, T20, M21, A23, K36, Q38, T50, K57, E58, K60, D62, S64 and E70, the insertion is selected from a position immediately following one of positions E7, A24 and A76, or the amino acid deletion is at position T20, wherein the amino acid positions of the modified pro-region are numbered according to the reference pro-region of SEQ ID NO: 7.
[0158] 8. The polynucleotide of embodiment 7, wherein the modified pro-region comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 7.
[0159] 9. The polynucleotide of embodiment 7, wherein the substitution at position K9 is a methionine (K9M), the substitution at position T17 is a glycine (T17G), the substitution at position S19 is a lysine (S19K) or a threonine (S19T), the substitution at position T20 is a histidine (T20H) or a methionine (T20M), the substitution at position M21 is a leucine (M21L), the substitution at position A23 is an asparagine (A23N), the substitution at position K36 is a glutamic acid (K36E), the substitution at position Q38 is an isoleucine (Q38I), the substitution at position T50 is a glutamic acid (T50E), the substitution at position K57 is an alanine (K57A), the substitution at position E58 is an aspartic acid (E58D) or a glutamine (E58Q), the substitution at position K60 is a glutamic acid (K60E), the substitution at position D62 is a histidine (D62H), the substitution at position S64 is a tyrosine (S64Y) or the substitution at position E70 is an arginine (E70R), or the amino acid insertion immediately following one of positions E7, A24 or A76 is an alanine (A).
[0160] 10. A recombinant polynucleotide encoding a modified precursor protease, wherein the polynucleotide comprises a first polynucleotide encoding a modified signal peptide operably linked to a second polynucleotide encoding a native pro-region operably linked to a third polynucleotide encoding a mature protease, wherein the modified signal peptide comprises an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions 1, 3, 10, 13, 14, 15, 19 and 20, the insertion is selected from a position immediately following one of positions 5, 6 and 19, or the deletion is selected from any one of positions 3, 14, 16, 18 and 19, wherein the amino acid positions of the modified signal peptide are numbered according to the reference signal peptide of SEQ ID NO: 5.
[0161] 11. The polynucleotide of embodiment 10, wherein the modified signal peptide comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 5.
[0162] 12. The polynucleotide of embodiment 10, wherein the substitution at position 1 is a methionine (M), the substitution at position 3 is a phenylalanine (F), an isoleucine (I), or a threonine (T), the substitution at position 10 is a tryptophan (W), the substitution at position 13 is a threonine (T), the substitution at position 14 is a proline (P), the substitution at position 15 is a valine (V), the substitution at position 19 is a valine (V) or the substitution at position 20 is an aspartic acid (D), or the amino acid insertion immediately following one of positions K5, L6 or T19 is an alanine (A).
[0163] 13. A recombinant polynucleotide encoding a modified precursor protease, wherein the polynucleotide comprises a first polynucleotide encoding a modified signal peptide operably linked to a second polynucleotide encoding a native pro-region operably linked to a third polynucleotide encoding a mature protease, wherein the modified signal peptide comprises an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions 3, 10, 13, 14, 15, 19 and 20, the insertion is selected from a position immediately following one of positions 5, 6 and 19, or the deletion is selected from any one of positions 3, 14, 16, 18 and 19, wherein the amino acid positions of the modified signal peptide are numbered according to the reference signal peptide of SEQ ID NO: 10.
[0164] 14. The polynucleotide of embodiment 13, wherein the modified signal peptide comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 10.
[0165] 15. The polynucleotide of embodiment 13, wherein the substitution at position 3 is a phenylalanine (F), an isoleucine (I), or a threonine (T), the substitution at position 10 is a tryptophan (W), the substitution at position 13 is a threonine (T), the substitution at position 14 is a proline (P), the substitution at position 15 is a valine (V), the substitution at position 19 is a valine (V) or the substitution at position 20 is an aspartic acid (D), or the amino acid insertion immediately following one of positions K5, V6 or T19 is an alanine (A).
[0166] 16. The polynucleotide of embodiment 10 or embodiment 13, wherein the native pro-region comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 7.
[0167] 17. A recombinant polynucleotide encoding a modified precursor protease, wherein the polynucleotide comprises a first polynucleotide encoding a native signal peptide operably linked to a second polynucleotide encoding a modified pro-region operably linked to a third polynucleotide encoding a mature protease, wherein the modified pro-region comprises an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions 9, 17, 19, 20, 21, 23, 36, 38, 50, 57, 58, 60, 62, 64 and 70, the insertion is selected from a position immediately following one of positions 7, 24 and 76, or the deletion is at position 20, wherein the amino acid positions of the modified pro-region are numbered according to the reference pro-region of SEQ ID NO: 7.
[0168] 18. The polynucleotide of embodiment 17, wherein the modified pro-region comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 7.
[0169] 19. The polynucleotide of embodiment 17, wherein the substitution at position 9 is a methionine (M), the substitution at position 17 is a glycine (G), the substitution at position 19 is a lysine (K) or a threonine (T), the substitution at position 20 is a histidine (H) or a methionine (M), the substitution at position 21 is a leucine (L), the substitution at position 23 is an asparagine (N), the substitution at position 36 is a glutamic acid (E), the substitution at position 38 is an isoleucine (I), the substitution at position 50 is a glutamic acid (E), the substitution at position 57 is an alanine (A), the substitution at position 58 is an aspartic acid (D) or a glutamine (Q), the substitution at position 60 is a glutamic acid (E), the substitution at position 62 is a histidine (H), the substitution at position 64 is a tyrosine (Y) or the substitution at position 70 is an arginine (R), or the insertion immediately following one of positions 7, 24 or 76 is an alanine (A).
[0170] 20. The polynucleotide of embodiment 17, wherein the native signal peptide comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 5 or SEQ ID NO: 10.
[0171] 21. The polynucleotide of any one of embodiments 10, 13, or 17, wherein the mature protease is a subtilisin.
[0172] 22. An expression cassette comprising an upstream promoter operably linked to a downstream polynucleotide encoding a modified precursor protease of any one of embodiments 10, 13, or 17.
[0173] 23. The cassette of embodiment 22, further comprising a transcriptional terminator downstream and operably linked to the polynucleotide encoding the modified precursor protease.
[0174] 24. The cassette of embodiment 22, wherein the upstream promoter region comprises at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 1.
[0175] 25. A recombinant Bacillus sp. cell comprising an introduced polynucleotide of any one of embodiments 10, 13, or 17.
[0176] 26. A recombinant Bacillus sp. cell comprising an introduced expression cassette of embodiment 22.
[0177] 27. The Bacillus sp. cell of embodiment 25 or embodiment 26, selected from the group consisting of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. halodurans, B. megaterium, B. coagulans, B. circulans, B. lautus, and B. thuringiensis.
[0178] 28. A recombinant polynucleotide encoding a modified precursor protease, wherein the polynucleotide comprises a first polynucleotide encoding a modified signal peptide operably linked to a second polynucleotide encoding a modified pro-region operably linked to a third polynucleotide encoding a mature protease, wherein the modified signal peptide comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 5, a valine (V) to methionine (M) substitution at position 1 (V1M) and a threonine (T) to valine (V) substitution at position 19 (T19V) and the modified pro-region comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 7 and a serine (S) to lysine (K) substitution at position 19 (S19K).
[0179] 29. The polynucleotide of embodiment 28, wherein the mature protease is a subtilisin.
[0180] 30. An expression cassette comprising an upstream promoter operably linked to a downstream polynucleotide encoding a modified precursor protease of embodiment 28.
[0181] 31. The cassette of embodiment 30, further comprising a transcriptional terminator downstream and operably linked to the polynucleotide encoding the modified precursor protease.
[0182] 32. The cassette of embodiment 30, wherein the upstream promoter region comprises at least 95% identity to SEQ ID NO: 1.
[0183] 33. A recombinant Bacillus sp. cell comprising an introduced polynucleotide of embodiment 28.
[0184] 34. A recombinant Bacillus sp. cell comprising an introduced expression cassette of embodiment 30.
[0185] 35. The Bacillus sp. cell of embodiment 33 or embodiment 34, selected from the group consisting of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. halodurans, B. megaterium, B. coagulans, B. circulans, B. lautus, and B. thuringiensis.
[0186] 36. A method for the production of a heterologous protease in a Bacillus sp. cell comprising introducing into a Bacillus sp. cell an expression cassette encoding a modified precursor protease, wherein the cassette comprises an upstream promoter region operably linked to a downstream DNA sequence encoding a modified signal peptide operably linked to a downstream DNA sequence encoding a native pro-region operably linked to a downstream DNA sequence encoding a mature protease, and fermenting the modified cell under suitable conditions for the production of the protease.
[0187] 37. The method of embodiment 36, wherein the modified signal peptide comprises an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions 1, 3, 10, 13, 14, 15, 19 and 20, the insertion is selected from a position immediately following one of positions 5, 6 and 19, or the deletion is selected from any one of positions 3, 14, 16, 18 and 19, wherein the amino acid positions of the modified signal peptide sequence are numbered according to the reference signal peptide of SEQ ID NO: 5.
[0188] 38. The method of embodiment 36, wherein the modified signal peptide comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 5.
[0189] 39. The method of embodiment 37, wherein the substitution at position 1 is a methionine (M), the substitution at position 3 is a phenylalanine (F), an isoleucine (I), or a threonine (T), the substitution at position 10 is a tryptophan (W), the substitution at position 13 is a threonine (T), the substitution at position 14 is a proline (P), the substitution at position 15 is a valine (V), the substitution at position 19 is a valine (V) or the substitution at position 20 is an aspartic acid (D), or the amino acid insertion immediately following one of positions 5, 6 or 19 is an alanine (A).
[0190] 40. The method of embodiment 36, wherein the modified cell produces an increased amount of the protease relative to a control Bacillus sp. cell fermented under the same conditions, wherein the control cell comprises an introduced control expression cassette encoding a control precursor protease, wherein the control cassette comprises an upstream promoter sequence operably linked to a downstream DNA sequence encoding a native signal peptide of SEQ ID NO: 5 operably linked to a downstream DNA sequence encoding a native pro-region operably linked to a downstream DNA sequence encoding a mature protease, wherein the promoter and mature protease sequences of the control cassette are the same as the promoter and mature protease sequences used in the expression cassette encoding the modified precursor protease.
[0191] 41. The method of embodiment 40, wherein the increased amount of the protease produced by the modified cell is at least about 1%, 2%, 3%, 4%, or 5% increased relative to the control cell.
[0192] 42. The method of embodiment 36, wherein the protease is secreted into the fermentation broth.
[0193] 43. The method of embodiment 42, wherein the secreted protease is recovered from the fermentation broth.
[0194] 44. The method of embodiment 36, wherein the native pro-region comprises at least embodiment SEQ ID NO: 7.
[0195] 45. A method for the production of a heterologous protease in a Bacillus sp. cell comprising introducing into a Bacillus sp. cell an expression cassette encoding a modified precursor protease, wherein the cassette comprises an upstream promoter region operably linked to a downstream DNA sequence encoding a modified signal peptide operably linked to a downstream DNA sequence encoding a native pro-region operably linked to a downstream DNA sequence encoding a mature protease, and fermenting the modified cell under suitable conditions for the production of the protease.
[0196] 46. The method of embodiment 45, wherein the modified signal peptide comprises an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions 3, 10, 13, 14, 15, 19 and 20, the insertion is selected from a position immediately following one of positions 5, 6 and 19, or the deletion is selected from any one of positions 3, 14, 16, 18 and 19, wherein the amino acid positions of the modified signal peptide sequence are numbered according to the reference signal peptide of SEQ ID NO: 10.
[0197] 47. The method of embodiment 45, wherein the modified signal peptide comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 10.
[0198] 48. The method of embodiment 46, wherein the substitution at position 3 is a phenylalanine (F), an isoleucine (I) or a threonine (T), the substitution at position 10 is a tryptophan (W), the substitution at position 13 is a threonine (T), the substitution at position 14 is a proline (P), the substitution at position 15 is a valine (V), the substitution at position 19 is a valine (V) or the substitution at position 20 is an aspartic acid (D), or the amino acid insertion immediately following one of positions 5, 6 or 19 is an alanine (A).
[0199] 49. The method of embodiment 45, wherein the modified cell produces an increased amount of the protease relative to a control Bacillus sp. cell fermented under the same conditions, wherein the control cell comprises an introduced control expression cassette encoding a control precursor protease, wherein the control cassette comprises an upstream promoter sequence operably linked to a downstream DNA sequence encoding a native signal peptide of SEQ ID NO: 10 operably linked to a downstream DNA sequence encoding a native pro-region operably linked to a downstream DNA sequence encoding a mature protease, wherein the promoter and mature protease sequences of the control cassette are the same as the promoter and mature protease sequences used in the expression cassette encoding the modified precursor protease.
[0200] 50. The method of embodiment 49, wherein the increased amount of the protease produced by the modified cell is at least about 1%, 2%, 3%, 4%, or 5% increased relative to the control cell.
[0201] 51. The method of embodiment 45, wherein the protease is secreted into the fermentation broth.
[0202] 52. The method of embodiment 51, wherein the secreted protease is recovered from the fermentation broth.
[0203] 53. The method of embodiment 45, wherein the native pro-region comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 7.
[0204] 54. A method for the production of a protease in a modified Bacillus sp. cell, the method comprising introducing into a Bacillus sp. cell an expression cassette encoding a modified precursor protease, wherein the cassette comprises an upstream promoter sequence operably linked to a downstream DNA sequence encoding a native signal peptide operably linked to a downstream DNA sequence encoding a modified pro-region operably linked to a downstream DNA sequence encoding a mature protease, and fermenting the modified cell under conditions suitable for the production of the protease.
[0205] 55. The method of embodiment 54, wherein the modified pro-region sequence comprises an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions 9, 17, 19, 20, 21, 23, 36, 38, 50, 57, 58, 60, 62, 64 and 70, the insertion is selected from a position immediately following one of positions 7, 24 and 76, or the deletion is at position 20, wherein the amino acid positions of the modified pro-region are numbered according to the reference pro-region of SEQ ID NO: 7.
[0206] 56. The method of embodiment 55, wherein the substitution at position 9 is a methionine (M), the substitution at position 17 is a glycine (G), the substitution at position 19 is a lysine (K) or a threonine (T), the substitution at position 20 is a histidine (H) or a methionine (M), the substitution at position 21 is a leucine (L), the substitution at position 23 is an asparagine (N), the substitution at position 36 is a glutamic acid (E), the substitution at position 38 is an isoleucine (I), the substitution at position 50 is a glutamic acid (E), the substitution at position 57 is an alanine (A), the substitution at position 58 is an aspartic acid (D) or a glutamine (Q), the substitution at position 60 is a glutamic acid (E), the substitution at position 62 is a histidine (H), the substitution at position 64 is a tyrosine (Y) or the substitution at position 70 is an arginine (R), or the insertion immediately following one of positions 7, 24 and 76 is an alanine (A).
[0207] 57. The method of embodiment 54, wherein the modified pro-region comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 7.
[0208] 58. The method of embodiment 54, wherein the modified cell produces an increased amount of the protease relative to a control Bacillus sp. cell fermented under the same conditions, wherein the control cell comprises an introduced control expression cassette encoding a control precursor protease, wherein the control cassette comprises an upstream promoter sequence operably linked to a downstream DNA sequence encoding a native signal peptide of SEQ ID NO: 5 or SEQ ID NO: 10 operably linked to a downstream DNA sequence encoding a native pro-region of SEQ ID NO: 7 operably linked to a downstream DNA sequence encoding a same mature protease, wherein the promoter and mature protease sequences of the control cassette are the same as the promoter and mature protease sequences used in the expression cassette encoding the modified precursor protease.
[0209] 59. The method of embodiment 58, wherein the increased amount of the protease produced by the modified cell is at least about 1%, 2%, 3%, 4% or 5% increased relative to the control cell.
[0210] 60. The method of embodiment 54, wherein the protease is secreted into the fermentation broth.
[0211] 61. The method of embodiment 60, wherein the secreted protease is recovered from the fermentation broth.
[0212] 62. A method for the production of a protease in a modified Bacillus sp. cell, the method comprising introducing into a Bacillus sp. cell an expression cassette encoding a modified precursor protease, wherein the cassette comprises an upstream promoter sequence operably linked to a downstream DNA sequence encoding a modified signal peptide operably linked to a downstream DNA sequence encoding a modified pro-region operably linked to a downstream DNA sequence encoding a mature protease, and fermenting the modified cell under conditions suitable for the production of the protease, wherein the modified signal peptide comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 5, a valine (V) to methionine (M) substitution at position 1 (V1M) and a threonine (T) to valine (V) substitution at position 19 (T19V) and the modified pro-region comprises at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% to about 100% identity to SEQ ID NO: 7 and a serine (S) to lysine (K) substitution at position 19 (S19K).
[0213] 63. The method of any one of embodiments 36, 45, 54 or 62, wherein the mature protease is a subtilisin.
[0214] 64. The method of any one of embodiments 36, 45, 54 or 62, wherein the Bacillus sp. cell is selected from the group consisting of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. halodurans, B. megaterium, B. coagulans, B. circulans, B. lautus, and B. thuringiensis.
EXAMPLES
[0215] Certain aspects of the present disclosure may be further understood in light of the following examples, which should not be construed as limiting. Modifications to materials and methods will be apparent to those skilled in the art. Standard recombinant DNA and molecular cloning techniques used herein are well known in the art (Ausubel et al., 1987; Sambrook et al., 1989).
EXAMPLE 1
MODIFICATION OF NATIVE SIGNAL PEPTIDE AND PRO-REGION SEQUENCES BY SITE EVALUATION LIBRARIES
[0216] In the instant example, a mature subtilisin protein (variant 1; SEQ ID NO: 3) was used as a reporter to monitor protein expression as described herein. More specifically, a DNA fragment comprising an upstream (5') aprE gene flanking region (FR) was operably linked to a polynucleotide construct (e.g., expression cassette; FIG. 1) comprising a heterologous B. subtilis rrnI-P2 promoter/5'-aprE UTR region DNA sequence (SEQ ID NO: 1) operably linked to a DNA sequence (SEQ ID NO: 4) encoding a reference signal peptide sequence (FIG. 2, SEQ ID NO: 5) operably linked to a DNA sequence (SEQ ID NO: 6) encoding a reference pro-region sequence (FIG. 3, SEQ ID NO: 7) operably linked to a DNA sequence encoding the mature subtilisin (SEQ ID NO: 3) operably linked to a B. amyloliquefaciens BPN' terminator DNA sequence (SEQ ID NO: 8), which polynucleotide construct was operably linked to a downstream (3') aprE gene flanking region (FR) sequence which includes a kanamycin (kan) gene expression cassette (SEQ ID NO: 9). More particularly, this DNA fragment was assembled using standard molecular biology techniques and was used as a template to develop linear DNA expression cassettes comprising one or more signal peptide and/or pro-region sequence modifications (mutations) described herein.
[0217] Site Evaluation Libraries (SEL) of Signal and Pro-Region Sequences
[0218] As briefly stated above, the DNA sequences encoding the reference AprE signal peptide (FIG. 2, AprE_ss; SEQ ID NO: 5) in combination with the reference BPN' pro-region sequence (FIG. 3, BPN'_PRO; SEQ ID NO: 7) were used as templates to create site evaluation libraries (SELs) and were developed as 6.5 kb fragments using standard molecular biology techniques. The linear DNA of the expression cassettes was used to transform competent B. subtilis cells, wherein the transformation mixtures were grown in TSB broth containing five (5) ppm kanamycin and incubated overnight at 37°C. Single colonies were selected using methods known in the art and grown in TSB broth at 37°C under antibiotic selection.
[0219] More particularly, sequence analysis was performed to determine unique signal peptide and pro-region sequence variants that were cherry picked into 384-well microtiter plates (MTPs). For example, in the case of the signal peptide sequence (FIG. 2B, AprE_ss; SEQ ID NO: 5), the first (1st) amino acid position is a valine (“Val” or “V”), which was altered (substituted) to all other nineteen (19) naturally occurring amino acid residues, the second (2nd) amino acid position is an arginine (“Arg” or “R”), which was altered (substituted) to all other nineteen (19) naturally occurring amino acid residues, and so on, until all twenty-nine (29) positions of the signal sequence (AprE_ss; SEQ ID NO: 5) were evaluated. In addition to the substitutions at each position, amino acid insertions or deletions were introduced and evaluated at each position of the signal sequence (AprE_ss; SEQ ID NO: 5). Thus, in the case of the signal peptide sequence SEL, a total of 455 (SEL) variants were constructed and screened as described below in Example 2.
[0220] Likewise, in the case of the pro-region sequence (FIG. 3B, BPN'_PRO; SEQ ID NO: 7), the first (1st) amino acid position is an alanine (“Ala” or “A”), which was altered (substituted) to all other nineteen (19) naturally occurring amino acid residues, and the second (2nd) amino acid position is a glycine (“Gly” or “G”), which was altered (substituted) to all other nineteen (19) naturally occurring amino acid residues, and so on, until all seventy-seven (77) positions of the pro-region sequence (BPN'_PRO; SEQ ID NO: 7) were evaluated. In addition to the substitutions at each position, amino acid insertions or deletions were introduced and evaluated at each position of the pro-region sequence (BPN'_PRO; SEQ ID NO: 7). Thus, with regard to the pro-region sequence SEL, a total of 1,461 (SEL) variants were constructed and screened as described below in Example 2.
[0221] Thus, for the variant 1 reporter protein expression experiments, transformed cells were grown in 96-well MTPs in cultivation medium (enriched semi-defined media based on MOPS buffer) for two (2) days at 32°C, 270 rpm, with 80% humidity in a shaking incubator, after which the cultures were centrifuged and filtered. Clarified culture supernatants were used to measure (assay) reporter protease activity and determine productivity levels, wherein samples were taken after forty (40) hours. The reporter protease activity assay is further described below in Example 2.
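The substitution scan described above can be sketched in a few lines of Python. This is only an illustration: the function name `single_substitutions` and the three-residue input in the usage note are hypothetical, and the actual sequences of SEQ ID NO: 5 and SEQ ID NO: 7 are not reproduced here. The sketch enumerates every single-residue substitution of a reference peptide, which gives a theoretical ceiling of 19 substitutions per position (551 for 29 positions, 1,463 for 77 positions). The totals reported above (455 and 1,461) count the unique variants actually obtained, including insertions and deletions, so they need not match these ceilings.

```python
# The twenty naturally occurring amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(reference: str) -> list[str]:
    """Return codes such as 'V1A' for every single-residue substitution.

    Positions are 1-based and numbered against the reference peptide,
    matching the numbering convention used in the text.
    """
    codes = []
    for position, wild_type in enumerate(reference, start=1):
        if wild_type not in AMINO_ACIDS:
            raise ValueError(f"unexpected residue {wild_type!r} at {position}")
        for replacement in AMINO_ACIDS:
            if replacement != wild_type:
                codes.append(f"{wild_type}{position}{replacement}")
    return codes


def theoretical_substitution_count(length: int) -> int:
    """Upper bound on single substitutions for a peptide of this length."""
    return 19 * length
```

For example, `single_substitutions("VRS")` returns 57 codes beginning with `V1A`, and `theoretical_substitution_count(29)` evaluates to 551.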
EXAMPLE 2
PROTEASE ACTIVITY ASSAY
[0222] The protease activity of the subtilisin reporter (SEQ ID NO: 3) was determined by measuring the hydrolysis of the synthetic suc-AAPF-pNA peptide substrate. For the AAPF assay, the reagent solutions used were 100 mM Tris pH 8.6, 10 mM CaCl2, 0.005% Tween®-80 (Tris/Ca buffer) and 160 mM suc-AAPF-pNA in DMSO (suc-AAPF-pNA stock solution; Sigma: S-7388). To prepare a working solution, one (1) mL suc-AAPF-pNA stock solution was added to 100 mL Tris/Ca buffer and mixed. An enzyme sample was added to a microtiter plate (MTP) containing one (1) mg/mL suc-AAPF-pNA working solution and assayed for activity at 405 nm over three to five (3-5) minutes using a SpectraMax plate reader in kinetic mode at room temperature, wherein the protease activity was expressed as mOD/minute.
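As a rough check on the assay arithmetic, the following Python sketch shows (i) the substrate concentration that results from adding 1 mL of 160 mM stock to 100 mL of buffer and (ii) one way a kinetic read could be reduced to mOD/minute. Both function names are hypothetical. The molecular weight of suc-AAPF-pNA (about 624.6 g/mol) is an assumption taken from typical supplier data, and the least-squares slope is only one plausible way to obtain a rate; the plate reader software may calculate it differently.

```python
# Assumed molecular weight of suc-AAPF-pNA in g/mol (not stated in the text).
SUC_AAPF_PNA_MW = 624.6


def working_concentration_mg_per_ml(stock_mM: float = 160.0,
                                    stock_ml: float = 1.0,
                                    buffer_ml: float = 100.0) -> float:
    """Substrate concentration after diluting the stock into buffer."""
    diluted_mM = stock_mM * stock_ml / (stock_ml + buffer_ml)
    # mM * (g/mol) gives mg/L; divide by 1000 to get mg/mL.
    return diluted_mM * SUC_AAPF_PNA_MW / 1000.0


def rate_mod_per_min(times_s: list[float], absorbances: list[float]) -> float:
    """Least-squares slope of A405 versus time, reported in mOD/minute."""
    n = len(times_s)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two matched time/absorbance points")
    minutes = [t / 60.0 for t in times_s]
    t_mean = sum(minutes) / n
    a_mean = sum(absorbances) / n
    sxx = sum((t - t_mean) ** 2 for t in minutes)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (a - a_mean)
              for t, a in zip(minutes, absorbances))
    return 1000.0 * sxy / sxx  # OD/min converted to mOD/min
```

Under the stated molecular weight assumption, the dilution works out to roughly 0.99 mg/mL, which is consistent with the one (1) mg/mL working solution named above.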
[0223] More particularly, the protease activity of each SEL variant constructed (Example 1) was measured and compared to a reference construct that was grown in the same plate. Thus, by dividing the value of the reference sample by the value of a variant sample, the performance indices (PI) of the signal sequence variants (TABLE 1) and pro-region sequence variants (TABLE 2) were determined, as shown below in TABLE 1 and TABLE 2.
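The same-plate normalization can be sketched as follows. This is a minimal illustration, not the applicants' analysis pipeline: the function name, the `"reference"` label and the tuple layout are hypothetical. The sketch also assumes the common convention in which the variant activity is the numerator, so that a PI above 1.0 indicates more reporter activity than the same-plate reference; the paragraph above words the division in the opposite order, so this orientation should be treated as an assumption.

```python
from statistics import mean


def performance_indices(wells):
    """Compute PI per variant well, normalized to same-plate references.

    wells: iterable of (plate_id, construct_name, rate_mOD_per_min).
    Wells whose construct_name is "reference" hold the reference construct.
    Returns a dict keyed by (plate_id, construct_name).
    """
    wells = list(wells)

    # Collect reference rates separately for each plate.
    reference_rates = {}
    for plate, construct, rate in wells:
        if construct == "reference":
            reference_rates.setdefault(plate, []).append(rate)

    indices = {}
    for plate, construct, rate in wells:
        if construct == "reference":
            continue
        if plate not in reference_rates:
            raise ValueError(f"plate {plate!r} has no reference well")
        # Assumed orientation: variant activity over mean reference activity.
        indices[(plate, construct)] = rate / mean(reference_rates[plate])
    return indices
```

Normalizing within each plate, rather than against a global reference value, keeps plate-to-plate differences in growth or incubation from being mistaken for a variant effect.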
TABLE 1
PERFORMANCE INDEX (PI) OF SIGNAL SEQUENCE VARIANTS COMPARED TO REFERENCE SIGNAL SEQUENCE
[0224] Thus, as shown above in TABLE 1, certain signal sequence amino acid substitutions were capable of producing increased amounts of the reporter protease as compared to the reference signal sequence (SEQ ID NO: 5). More particularly, as shown in TABLE 1, certain amino acid substitutions at position S3 (S3F, S3I, S3T), position L10 (L10W), position A13 (A13T), position L14 (L14P), position T15 (T15V), position T19 (T19V) and position M20 (M20D) of the signal sequence produced increased amounts of the reporter protease as compared to the reference signal sequence (SEQ ID NO: 5). Likewise, as shown in TABLE 1, certain alanine (A) amino acid insertions (.01A) immediately following the lysine (K) at position 5 (Z5.01A), immediately following the leucine (L) at position 6 (Z6.01A) or immediately following the threonine (T) at position 19 (Z19.01A) produced increased amounts of the reporter protease as compared to the reference signal sequence (SEQ ID NO: 5). In addition, as presented in TABLE 1, certain amino acid deletions (Z) at positions S3 (S3Z), L14 (L14Z), L16 (L16Z), F18 (F18Z) or T19 (T19Z) of the signal sequence produced increased amounts of the reporter protease as compared to the reference signal sequence (SEQ ID NO: 5).
TABLE 2
PERFORMANCE INDEX (PI) OF PRO-REGION SEQUENCE VARIANTS COMPARED TO REFERENCE PRO-REGION SEQUENCE
[0225] As presented in TABLE 2 above, certain pro-region sequence amino acid substitutions were capable of producing increased amounts of the reporter protease as compared to the reference pro-region sequence (SEQ ID NO: 7). More particularly, as shown in TABLE 2, certain amino acid substitutions at position K9 (K9M), position T17 (T17G), position S19 (S19K, S19T), position T20 (T20H, T20M), position M21 (M21L), position A23 (A23N), position K36 (K36E), position Q38 (Q38I), position T50 (T50E), position K57 (K57A), position E58 (E58D, E58Q), position K60 (K60E), position D62 (D62H), position S64 (S64Y) or position E70 (E70R) of the pro-region sequence produced increased amounts of the reporter protease as compared to the reference pro-region sequence (SEQ ID NO: 7). Likewise, as shown in TABLE 2, certain alanine (A) amino acid insertions (.01A) immediately following the glutamic acid (E) at position 7 (Z7.01A), immediately following the alanine (A) at position 24 (Z24.01A) or immediately following the alanine (A) at position 76 (Z76.01A) produced increased amounts of the reporter protease as compared to the reference pro-region sequence (SEQ ID NO: 7). In addition, as presented in TABLE 2, deletion (Z) of the threonine (T) at position 20 (T20Z) produced increased amounts of the reporter protease as compared to the reference pro-region sequence (SEQ ID NO: 7).
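The variant codes used in this paragraph (for example S3F for a substitution, S3Z for a deletion and Z5.01A for an alanine insertion immediately after position 5) can be applied to a reference peptide as in the Python sketch below. The function and pattern names are hypothetical. The `REFERENCE` string is a 20-residue placeholder chosen only so that the residues named in the text (V1, R2, S3, K5, L6, L10, A13, L14, T15, L16, F18, T19, M20) sit at the stated positions; the remaining residues are arbitrary filler, and it should not be read as SEQ ID NO: 5.

```python
import re

# Placeholder only; replace with the actual reference sequence before use.
REFERENCE = "VRSKKLWISLLFALTLIFTM"

# "S3F" (substitution) or "S3Z" (deletion, Z marks the removed residue).
_SUBSTITUTION_OR_DELETION = re.compile(r"^([A-Z])(\d+)([A-Z])$")
# "Z5.01A": insert A immediately after reference position 5.
_INSERTION = re.compile(r"^Z(\d+)\.01([A-Z])$")


def apply_variant(reference: str, code: str) -> str:
    """Return the peptide obtained by applying one variant code."""
    insertion = _INSERTION.match(code)
    if insertion:
        position, new_residue = int(insertion.group(1)), insertion.group(2)
        if not 1 <= position <= len(reference):
            raise ValueError(f"position {position} is outside the reference")
        return reference[:position] + new_residue + reference[position:]

    match = _SUBSTITUTION_OR_DELETION.match(code)
    if not match:
        raise ValueError(f"unrecognized variant code {code!r}")
    wild_type, position, new_residue = (
        match.group(1), int(match.group(2)), match.group(3))
    if not 1 <= position <= len(reference):
        raise ValueError(f"position {position} is outside the reference")
    if reference[position - 1] != wild_type:
        raise ValueError(
            f"reference has {reference[position - 1]!r} at {position}, "
            f"not {wild_type!r}")
    if new_residue == "Z":  # deletion of the residue at this position
        return reference[:position - 1] + reference[position:]
    return reference[:position - 1] + new_residue + reference[position:]
```

Checking the wild-type letter against the reference before editing catches numbering mistakes early, which matters because all positions are defined relative to the unmodified reference sequence.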
EXAMPLE 3
MODIFICATION OF BPN' SIGNAL SEQUENCE BY SITE EVALUATION LIBRARIES
[0226] In the instant example, a mature subtilisin protein (variant 1; SEQ ID NO: 3) is used as a reporter to monitor protein expression as described herein. More specifically, a DNA fragment comprising an upstream (5') aprE gene flanking region (FR) is operably linked to a polynucleotide construct (e.g., expression cassette) comprising a heterologous B. subtilis rrnI-P2 promoter/5'-aprE UTR region DNA sequence (SEQ ID NO: 1) operably linked to a DNA sequence (SEQ ID NO: 11) encoding a reference signal peptide sequence (FIG. 4, SEQ ID NO: 10) operably linked to a DNA sequence (SEQ ID NO: 6) encoding a reference pro-region sequence (FIG. 3, SEQ ID NO: 7) operably linked to a DNA sequence encoding the mature subtilisin (SEQ ID NO: 3) operably linked to a B. amyloliquefaciens BPN' terminator DNA sequence (SEQ ID NO: 8), which polynucleotide construct is operably linked to a downstream (3') aprE gene flanking region (FR) sequence which includes a kanamycin (kan) gene expression cassette (SEQ ID NO: 9). More particularly, this DNA fragment is assembled using standard molecular biology techniques and is used as a template to develop linear DNA expression cassettes comprising one or more signal peptide and/or pro-region sequence modifications (mutations) described herein.
[0227] Site Evaluation Libraries (SEL) of the BPN' Signal Sequence
[0228] As briefly stated above, the DNA sequences encoding the reference BPN' signal peptide (FIG. 4, BPN'_ss; SEQ ID NO: 10) in combination with the reference BPN' pro-region sequence (FIG. 3, BPN'_PRO; SEQ ID NO: 7) are used as templates to create site evaluation libraries (SELs) and developed as 6.5 kb fragments using standard molecular biology techniques. The linear DNA of the expression cassettes is used to transform competent B. subtilis cells, wherein the transformation mixtures are grown in TSB broth containing five (5) ppm kanamycin and incubated overnight at 37°C. Single colonies are selected using methods known in the art and grown in TSB broth at 37°C under antibiotic selection.
[0229] More particularly, sequence analysis is performed to determine unique signal peptide sequence variants that are cherry picked into 384-well microtiter plates (MTPs). For example, in the case of the BPN' signal peptide sequence (FIG. 4B, BPN'_ss; SEQ ID NO: 10), the first (1st) amino acid position is a methionine (“Met” or “M”), which is altered (substituted) to all other nineteen (19) naturally occurring amino acid residues, the second (2nd) amino acid position is an arginine (“Arg” or “R”), which is altered (substituted) to all other nineteen (19) naturally occurring amino acid residues, and so on, until all thirty (30) positions of the signal sequence (BPN'_ss; SEQ ID NO: 10) are evaluated. In addition to the substitutions at each position, amino acid insertions or deletions are introduced and evaluated at each position of the signal sequence (BPN'_ss; SEQ ID NO: 10). Thus, in the case of the signal peptide sequence SEL, a total of X (SEL) variants were constructed and screened as described below in Example 4.
[0230] Thus, for the variant 1 reporter protein expression experiments, transformed cells were grown in 96-well MTPs in cultivation medium (enriched semi-defined media based on MOPS buffer) for two (2) days at 32°C, 270 rpm, with 80% humidity in a shaking incubator, after which the cultures were centrifuged and filtered. Clarified culture supernatants were used to measure (assay) reporter protease activity and determine productivity levels, wherein samples were taken after forty (40) hours. The reporter protease activity assay is further described below in Example 4.
EXAMPLE 4
PROTEASE ACTIVITY ASSAY
[0231] The protease activity of the subtilisin reporter (SEQ ID NO: 3) is determined by measuring the hydrolysis of the synthetic suc-AAPF-pNA peptide substrate, as described above in Example 2.
[0232] More particularly, the protease activity of each SEL variant constructed (Example 3) is measured and compared to a reference construct that was grown in the same plate. Thus, by dividing the value of the reference sample by the value of a variant sample, the performance index (PI) of the signal sequence (TABLE 3) is determined, as shown below in TABLE 3.
TABLE 3
PERFORMANCE INDEX (PI) OF SIGNAL SEQUENCE VARIANTS COMPARED TO REFERENCE SIGNAL SEQUENCE
[0233] Thus, as shown above in TABLE 3, certain signal sequence amino acid substitutions are capable of producing increased amounts of the reporter protease as compared to the reference signal sequence (SEQ ID NO: 10). More particularly, as shown in TABLE 3, certain amino acid substitutions at position A13 (A13T), position G3 (G3F, G3T), position M20 (M20D), position A15 (A15V), position T19 (T19V) or position L10 (L10W) of the signal sequence produced increased amounts of the reporter protease as compared to the reference signal sequence (SEQ ID NO: 10). Likewise, as shown in TABLE 3, certain alanine (A) amino acid insertions (.01A) immediately following the valine (V) at position 6 (Z6.01A), or immediately following the threonine (T) at position 19 (Z19.01A) produced increased amounts of the reporter protease as compared to the reference signal sequence (SEQ ID NO: 10). In addition, as presented in TABLE 3, certain amino acid deletions (Z) at positions G3 (G3Z), L14 (L14Z), F18 (F18Z) or T19 (T19Z) of the signal sequence produced increased amounts of the reporter protease as compared to the reference signal sequence (SEQ ID NO: 10).
REFERENCES
PCT Publication No. WO2003/083125
PCT Publication No. WO2002/14490
PCT Publication No. WO1995/10615
PCT Publication No. WO2019/108599
PCT Publication No. WO2019/245705
PCT Publication No. WO2020/112599
US Patent No. 5,185,258
US Patent No. 5,204,015
US Provisional Application Serial No. 63/482634
Altschul et al., “Basic local alignment search tool”, J. Mol. Biol., 215(3):403-410, 1990.
Altschul et al., “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs”, Nucleic Acids Res., 25(17):3389-3402, 1997.
Ausubel et al., “Current Protocols in Molecular Biology”, published by Greene Publishing Assoc. and Wiley-Interscience (1987).
Beaucage and Caruthers, “Deoxynucleoside phosphoramidites - A new class of key intermediates for deoxypolynucleotide synthesis”, Tetrahedron Lett., 22:1859-1862, 1981.
Karlin et al., “Applications and statistics for multiple high-scoring segments in molecular sequences”, PNAS USA, 90(12):5873-5877, 1993.
Matthes et al., “Simultaneous rapid chemical synthesis of over one hundred oligonucleotides on a microscale”, EMBO J., 3:801-805, 1984.
Needleman and Wunsch, “A general method applicable to the search for similarities in the amino acid sequences of two proteins”, Journal of Molecular Biology, 48 (3): 443-53, 1970.
Saitou and Nei, “The neighbor-joining method: a new method for reconstructing phylogenetic trees”, Mol. Biol. Evol., Vol. 4, issue 4, pages 406-425, 1987.
Sambrook et al., “Molecular Cloning: A Laboratory Manual”, Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1989), (2001) and (2012).
Schaffer et al., “Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements”, Nucleic Acids Res., 29(14):2994-3005, 2001.
Thompson et al., “CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice”, Nucleic Acids Res., 22(22):4673-4680, 1994.

Claims

CLAIMS
1. A polynucleotide encoding a modified signal peptide comprising an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions V1, S3, L10, A13, L14, T15, T19 and M20, the insertion is selected from a position immediately following one of positions K5, L6 and T19, or the deletion is selected from any one of positions S3, L14, L16, F18 and T19, wherein the amino acid positions of the modified signal peptide are numbered according to the reference signal peptide of SEQ ID NO: 5.
2. The polynucleotide of claim 1, wherein the modified signal peptide comprises at least 75% identity to SEQ ID NO: 5.
3. The polynucleotide of claim 1, wherein the substitution at position V1 is a methionine (V1M), the substitution at position S3 is a phenylalanine (S3F), an isoleucine (S3I) or a threonine (S3T), the substitution at position L10 is a tryptophan (L10W), the substitution at position A13 is a threonine (A13T), the substitution at position L14 is a proline (L14P), the substitution at position T15 is a valine (T15V), the substitution at position T19 is a valine (T19V) or the substitution at position M20 is an aspartic acid (M20D), or the amino acid insertion immediately following one of positions K5, L6 or T19 is an alanine (A).
4. A polynucleotide encoding a modified signal peptide comprising an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions G3, L10, A13, L14, A15, T19 and M20, the insertion is selected from a position immediately following one of positions K5, V6 and T19, or the deletion is selected from any one of positions G3, L14, L16, F18 and T19, wherein the amino acid positions of the modified signal peptide are numbered according to the reference signal peptide of SEQ ID NO: 10.
5. The polynucleotide of claim 4, wherein the modified signal peptide comprises at least 75% identity to SEQ ID NO: 10.
6. The polynucleotide of claim 4, wherein the substitution at position G3 is a phenylalanine (G3F), an isoleucine (G3I) or a threonine (G3T), the substitution at position L10 is a tryptophan (L10W), the substitution at position A13 is a threonine (A13T), the substitution at position L14 is a proline (L14P), the substitution at position A15 is a valine (A15V), the substitution at position T19 is a valine (T19V) or the substitution at position M20 is an aspartic acid (M20D), or the amino acid insertion immediately following one of positions K5, V6 or T19 is an alanine (A).
7. A polynucleotide encoding a modified pro-region comprising an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions K9, T17, S19, T20, M21, A23, K36, Q38, T50, K57, E58, K60, D62, S64 and E70, the insertion is selected from a position immediately following one of positions E7, A24 and A76, or the amino acid deletion is at position T20, wherein the amino acid positions of the modified pro-region are numbered according to the reference pro-region of SEQ ID NO: 7.
8. The polynucleotide of claim 7, wherein the modified pro-region comprises at least 75% identity to SEQ ID NO: 7.
9. The polynucleotide of claim 7, wherein the substitution at position K9 is a methionine (K9M), the substitution at position T17 is a glycine (T17G), the substitution at position S19 is a lysine (S19K) or a threonine (S19T), the substitution at position T20 is a histidine (T20H) or a methionine (T20M), the substitution at position M21 is a leucine (M21L), the substitution at position A23 is an asparagine (A23N), the substitution at position K36 is a glutamic acid (K36E), the substitution at position Q38 is an isoleucine (Q38I), the substitution at position T50 is a glutamic acid (T50E), the substitution at position K57 is an alanine (K57A), the substitution at position E58 is an aspartic acid (E58D) or a glutamine (E58Q), the substitution at position K60 is a glutamic acid (K60E), the substitution at position D62 is a histidine (D62H), the substitution at position S64 is a tyrosine (S64Y) or the substitution at position E70 is an arginine (E70R), or the amino acid insertion immediately following one of positions E7, A24 or A76 is an alanine (A).
10. A recombinant polynucleotide encoding a modified precursor protease, wherein the polynucleotide comprises a first polynucleotide encoding a modified signal peptide operably linked to a second polynucleotide encoding a native pro-region operably linked to a third polynucleotide encoding a mature protease, wherein the modified signal peptide comprises an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions 1, 3, 10, 13, 14, 15, 19 and 20, the insertion is selected from a position immediately following one of positions 5, 6 and 19, or the deletion is selected from any one of positions 3, 14, 16, 18 and 19, wherein the amino acid positions of the modified signal peptide are numbered according to the reference signal peptide of SEQ ID NO: 5.
11. The polynucleotide of claim 10, wherein the modified signal peptide comprises at least 75% identity to SEQ ID NO: 5.
12. The polynucleotide of claim 10, wherein the substitution at position 1 is a methionine (M), the substitution at position 3 is a phenylalanine (F), an isoleucine (I), or a threonine (T), the substitution at position 10 is a tryptophan (W), the substitution at position 13 is a threonine (T), the substitution at position 14 is a proline (P), the substitution at position 15 is a valine (V), the substitution at IFF10076-W0-PCT[2] position 19 is a valine (V) or the substitution at position 20 is an aspartic acid (D), or the amino acid insertion immediately following one of positions K5, L6 or T19 is an alanine (A).
13. A recombinant polynucleotide encoding a modified precursor protease, wherein the polynucleotide comprises a first polynucleotide encoding a modified signal peptide operably linked to a second polynucleotide encoding a native pro-region operably linked to a third polynucleotide encoding a mature protease, wherein the modified signal peptide comprises an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions 3, 10, 13, 14, 15, 19 and 20, the insertion is selected from a position immediately following one of positions 5, 6 and 19, or the deletion is selected from any one of positions 3, 14, 16, 18 and 9, wherein the amino acid positions of the modified signal peptide are numbered according to the reference signal peptide of SEQ ID NO: 10.
14. The polynucleotide of claim 13, wherein the modified signal peptide comprises at least 75% identity to SEQ ID NO: 10.
15. The polynucleotide of claim 13, wherein the substitution at position 3 is a phenylalanine (F), an isoleucine (I), or a threonine (T), the substitution at position 10 is a tryptophan (W), the substitution at position 13 is a threonine (T), the substitution at position 14 is a proline (P), the substitution at position 15 is a valine (V), the substitution at position 19 is a valine (V) or the substitution at position 20 is an aspartic acid (D), or the amino acid insertion immediately following one of positions K5, V6 or T19 is an alanine (A).
16. The polynucleotide of claim 10 or claim 13, wherein the native pro-region comprises at least 75% identity to SEQ ID NO: 7.
17. A recombinant polynucleotide encoding a modified precursor protease, wherein the polynucleotide comprises a first polynucleotide encoding a native signal peptide operably linked to a second polynucleotide encoding a modified pro-region operably linked to a third polynucleotide encoding a mature protease, wherein the modified pro-region comprises an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions 9, 17, 19, 20, 21, 23, 36, 38, 50, 57, 58, 60, 62, 64 and 70, the insertion is selected from a position immediately following one of positions 7, 24 and 76, or the deletion is at position 20, wherein the amino acid positions of the modified pro-region are numbered according to the reference pro-region of SEQ ID NO: 7.
18. The polynucleotide of claim 17, wherein the modified pro-region comprises at least 75% identity to SEQ ID NO: 7.
19. The polynucleotide of claim 17, wherein the substitution at position 9 is a methionine (M), the substitution at position 17 is a glycine (G), the substitution at position 19 is a lysine (K) or a threonine (T), the substitution at position 20 is a histidine (H) or a methionine (M), the substitution at position 21 is a leucine (L), the substitution at position 23 is an asparagine (N), the substitution at position 36 is a glutamic acid (E), the substitution at position 38 is an isoleucine (I), the substitution at position 50 is a glutamic acid (E), the substitution at position 57 is an alanine (A), the substitution at position 58 is an aspartic acid (D) or a glutamine (Q), the substitution at position 60 is a glutamic acid (E), the substitution at position 62 is a histidine (H), the substitution at position 64 is a tyrosine (Y) or the substitution at position 70 is an arginine (R), or the insertion immediately following one of positions 7, 24 or 76 is an alanine (A).
20. The polynucleotide of claim 17, wherein the native signal peptide comprises at least 75% identity to SEQ ID NO: 5 or SEQ ID NO: 10.
21. The polynucleotide of any one of claims 10, 13, or 17, wherein the mature protease is a subtilisin.
22. An expression construct (cassette) comprising an upstream promoter operably linked to a downstream polynucleotide encoding a modified precursor protease of any one of claims 10, 13, or 17.
23. The expression construct of claim 22, further comprising a transcriptional terminator downstream and operably linked to the polynucleotide encoding the modified precursor protease.
24. The expression construct of claim 22, wherein the upstream promoter region comprises at least 95% identity to SEQ ID NO: 1.
25. A recombinant Bacillus sp. cell comprising an introduced polynucleotide of any one of claims 10, 13, or 17.
26. A recombinant Bacillus sp. cell comprising an introduced expression construct of claim 22.
27. The Bacillus sp. cell of claim 25 or claim 26, selected from the group consisting of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. halodurans, B. megaterium, B. coagulans, B. circulans, B. lautus, and B. thuringiensis.
28. A recombinant polynucleotide encoding a modified precursor protease, wherein the polynucleotide comprises a first polynucleotide encoding a modified signal peptide operably linked to a second polynucleotide encoding a modified pro-region operably linked to a third polynucleotide encoding a mature protease, wherein the modified signal peptide comprises at least 75% identity to SEQ ID NO: 5, a valine (V) to methionine (M) substitution at position 1 (V1M) and a threonine (T) to valine (V) substitution at position 19 (T19V) and the modified pro-region comprises at least about 75% identity to SEQ ID NO: 7 and a serine (S) to lysine (K) substitution at position 19 (S19K).
29. The polynucleotide of claim 28, wherein the mature protease is a subtilisin.
30. An expression construct comprising an upstream promoter operably linked to a downstream polynucleotide encoding a modified precursor protease of claim 28.
31. The expression construct of claim 30, further comprising a transcriptional terminator downstream and operably linked to the polynucleotide encoding the modified precursor protease.
32. The cassette of claim 30, wherein the upstream promoter region comprises at least 95% identity to SEQ ID NO: 1.
33. A recombinant Bacillus sp. cell comprising an introduced polynucleotide of claim 28.
34. A recombinant Bacillus sp. cell comprising an introduced expression construct of claim 30.
35. The Bacillus sp. cell of claim 33 or claim 34, selected from the group consisting of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. halodurans, B. megaterium, B. coagulans, B. circulans, B. lautus, and B. thuringiensis.
36. A method for the production of a heterologous protease in a Bacillus sp. cell comprising introducing into a Bacillus sp. cell an expression construct encoding a modified precursor protease, wherein the construct comprises an upstream promoter region operably linked to a downstream DNA sequence encoding a modified signal peptide operably linked to a downstream DNA sequence encoding a native pro-region operably linked to a downstream DNA sequence encoding a mature protease, and fermenting the modified cell under suitable conditions for the production of the protease.
37. The method of claim 36, wherein the modified signal peptide comprises an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions 1, 3, 10, 13, 14, 15, 19 and 20, the insertion is selected from a position immediately following one of positions 5, 6 and 19, or the deletion is selected from any one of positions 3, 14, 16, 18 and 19, wherein the amino acid positions of the modified signal peptide sequence are numbered according to the reference signal peptide of SEQ ID NO: 5.
38. The method of claim 36, wherein the modified signal peptide comprises at least 75% identity to SEQ ID NO: 5.
39. The method of claim 37, wherein the substitution at position 1 is a methionine (M), the substitution at position 3 is a phenylalanine (F), an isoleucine (I), or a threonine (T), the substitution at position 10 is a tryptophan (W), the substitution at position 13 is a threonine (T), the substitution at position 14 is a proline (P), the substitution at position 15 is a valine (V), the substitution at position 19 is a valine (V) or the substitution at position 20 is an aspartic acid (D), or the amino acid insertion immediately following one of positions 5, 6 or 19 is an alanine (A).
40. The method of claim 36, wherein the modified cell produces an increased amount of the protease relative to a control Bacillus sp. cell fermented under the same conditions, wherein the control cell comprises an introduced control expression construct encoding a control precursor protease, wherein the control construct comprises an upstream promoter sequence operably linked to a downstream DNA sequence encoding a native signal peptide of SEQ ID NO: 5 operably linked to a downstream DNA sequence encoding a native pro-region operably linked to a downstream DNA sequence encoding a mature protease, wherein the promoter and mature protease sequences of the control construct are the same as the promoter and mature protease sequences used in the expression construct encoding the modified precursor protease.
41. The method of claim 40, wherein the increased amount of the protease produced by the modified cell is at least a 5% increase relative to the control cell.
42. The method of claim 36, wherein the protease is secreted into the fermentation broth.
43. The method of claim 42, wherein the secreted protease is recovered from the fermentation broth.
44. The method of claim 36, wherein the native pro-region comprises at least 75% identity to SEQ ID NO: 7.
45. A method for the production of a heterologous protease in a Bacillus sp. cell comprising introducing into a Bacillus sp. cell an expression construct encoding a modified precursor protease, wherein the construct comprises an upstream promoter region operably linked to a downstream DNA sequence encoding a modified signal peptide operably linked to a downstream DNA sequence encoding a native pro-region operably linked to a downstream DNA sequence encoding a mature protease, and fermenting the modified cell under suitable conditions for the production of the protease.
46. The method of claim 45, wherein the modified signal peptide comprises an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions 3, 10, 13, 14, 15, 19 and 20, the insertion is selected from a position immediately following one of positions 5, 6 and 19, or the deletion is selected from any one of positions 3, 14, 16, 18 and 19, wherein the amino acid positions of the modified signal peptide sequence are numbered according to the reference signal peptide of SEQ ID NO: 10.
47. The method of claim 45, wherein the modified signal peptide comprises at least 75% identity to SEQ ID NO: 10.
48. The method of claim 46, wherein the substitution at position 3 is a phenylalanine (F), an isoleucine (I) or a threonine (T), the substitution at position 10 is a tryptophan (W), the substitution at position 13 is a threonine (T), the substitution at position 14 is a proline (P), the substitution at position 15 is a valine (V), the substitution at position 19 is a valine (V) or the substitution at position 20 is an aspartic acid (D), or the amino acid insertion immediately following one of positions 5, 6 or 19 is an alanine (A).
49. The method of claim 45, wherein the modified cell produces an increased amount of the protease relative to a control Bacillus sp. cell fermented under the same conditions, wherein the control cell comprises an introduced control expression construct encoding a control precursor protease, wherein the control construct comprises an upstream promoter sequence operably linked to a downstream DNA sequence encoding a native signal peptide of SEQ ID NO: 10 operably linked to a downstream DNA sequence encoding a native pro-region operably linked to a downstream DNA sequence encoding a mature protease, wherein the promoter and mature protease sequences of the control construct are the same as the promoter and mature protease sequences used in the expression construct encoding the modified precursor protease.
50. The method of claim 49, wherein the increased amount of the protease produced by the modified cell is at least about a 1% to 5% increase relative to the control cell.
51. The method of claim 45, wherein the protease is secreted into the fermentation broth.
52. The method of claim 51, wherein the secreted protease is recovered from the fermentation broth.
53. The method of claim 45, wherein the native pro-region comprises at least 75% identity to SEQ ID NO: 7.
54. A method for the production of a protease in a modified Bacillus sp. cell, the method comprising introducing into a Bacillus sp. cell an expression construct encoding a modified precursor protease, wherein the construct comprises an upstream promoter sequence operably linked to a downstream DNA sequence encoding a native signal peptide operably linked to a downstream DNA sequence encoding a modified pro-region operably linked to a downstream DNA sequence encoding a mature protease, and fermenting the modified cell under conditions suitable for the production of the protease.
55. The method of claim 54, wherein the modified pro-region sequence comprises an amino acid substitution, insertion, or deletion, wherein the substitution is selected from any one of positions 9, 17, 19, 20, 21, 23, 36, 38, 50, 57, 58, 60, 62, 64 and 70, the insertion is selected from a position immediately following one of positions 7, 24 and 76, or the deletion is at position 20, wherein the amino acid positions of the modified pro-region are numbered according to the reference pro-region of SEQ ID NO: 7.
56. The method of claim 55, wherein the substitution at position 9 is a methionine (M), the substitution at position 17 is a glycine (G), the substitution at position 19 is a lysine (K) or a threonine (T), the substitution at position 20 is a histidine (H) or a methionine (M), the substitution at position 21 is a leucine (L), the substitution at position 23 is an asparagine (N), the substitution at position 36 is a glutamic acid (E), the substitution at position 38 is an isoleucine (I), the substitution at position 50 is a glutamic acid (E), the substitution at position 57 is an alanine (A), the substitution at position 58 is an aspartic acid (D) or a glutamine (Q), the substitution at position 60 is a glutamic acid (E), the substitution at position 62 is a histidine (H), the substitution at position 64 is a tyrosine (Y) or the substitution at position E70 is an arginine (R), or the insertion immediately following one of positions 7, 24 and 76 is an alanine (A).
57. The method of claim 54, wherein the modified pro-region comprises at least 75% identity to SEQ ID NO: 7.
58. The method of claim 54, wherein the modified cell produces an increased amount of the protease relative to a control Bacillus sp. cell fermented under the same conditions, wherein the control cell comprises an introduced control expression construct encoding a control precursor protease, wherein the control construct comprises an upstream promoter sequence operably linked to a downstream DNA sequence encoding a native signal peptide of SEQ ID NO: 5 or SEQ ID NO: 10 operably linked to a downstream DNA sequence encoding a native pro-region of SEQ ID NO: 7 operably linked to a downstream DNA sequence encoding a same mature protease, wherein the promoter and mature protease sequences of the control construct are the same as the promoter and mature protease sequences used in the expression construct encoding the modified precursor protease.
59. The method of claim 58, wherein the increased amount of the protease produced by the modified cell is at least about a 1% to 5% increase relative to the control cell.
60. The method of claim 54, wherein the protease is secreted into the fermentation broth.
61. The method of claim 60, wherein the secreted protease is recovered from the fermentation broth.
62. A method for the production of a protease in a modified Bacillus sp. cell, the method comprising introducing into a Bacillus sp. cell an expression construct encoding a modified precursor protease, wherein the construct comprises an upstream promoter sequence operably linked to a downstream DNA sequence encoding a modified signal peptide operably linked to a downstream DNA sequence encoding a modified pro-region operably linked to a downstream DNA sequence encoding a mature protease, and fermenting the modified cell under conditions suitable for the production of the protease, wherein the modified signal peptide comprises at least about 75% identity to SEQ ID NO: 5, a valine (V) to methionine (M) substitution at position 1 (V1M) and a threonine (T) to valine (V) substitution at position 19 (T19V) and the modified pro-region comprises at least about 75% identity to SEQ ID NO: 7 and a serine (S) to lysine (K) substitution at position 19 (S19K).
63. The method of any one of claims 36, 45, 54 or 62, wherein the mature protease is a subtilisin.
64. The method of any one of claims 36, 45, 54 or 62, wherein the Bacillus sp. cell is selected from the group consisting of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. halodurans, B. megaterium, B. coagulans, B. circulans, B. lautus, and B. thuringiensis.
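The claims above number every substitution, insertion, and deletion against a reference sequence (for example V1M and T19V against SEQ ID NO: 5) and express the production gain as a percentage over a control cell. The following is a minimal sketch of that bookkeeping, not the applicants' method. The sequence `PLACEHOLDER` is a made-up 20-residue string and is not SEQ ID NO: 5; the function names and the `T19del` and `K5insA` code forms are hypothetical conventions chosen for illustration, since the claims themselves only use the substitution shorthand.

```python
import re

# HYPOTHETICAL placeholder, NOT SEQ ID NO: 5. It only places V at position 1,
# K at 5, L at 6 and T at 19 so that the codes named in the claims can be parsed.
PLACEHOLDER = "V" + "AAA" + "KL" + "A" * 12 + "TA"

# "V1M" is the substitution shorthand used in the claims. "T19del" and
# "K5insA" are assumed notations for a deletion and for an insertion
# immediately following a position.
MUTATION = re.compile(r"^([A-Z])(\d+)(?:([A-Z])|(del)|ins([A-Z]))$")


def apply_mutations(reference: str, codes: list[str]) -> str:
    """Apply mutation codes numbered (1-based) against the reference."""
    # Keep one slot per reference position so that insertions and deletions
    # never shift the numbering used by the remaining codes.
    residues = list(reference)
    inserted = [""] * len(reference)
    for code in codes:
        match = MUTATION.match(code)
        if match is None:
            raise ValueError(f"unrecognized mutation code: {code}")
        ref_aa, pos, sub, deletion, ins = match.groups()
        idx = int(pos) - 1
        if not 0 <= idx < len(reference) or reference[idx] != ref_aa:
            raise ValueError(f"{code} does not match the reference sequence")
        if sub:
            residues[idx] = sub
        elif deletion:
            residues[idx] = ""
        else:
            inserted[idx] += ins
    return "".join(r + i for r, i in zip(residues, inserted))


def percent_increase(modified_titer: float, control_titer: float) -> float:
    """Percent increase of the modified cell's titer over the control's."""
    return (modified_titer - control_titer) / control_titer * 100.0


if __name__ == "__main__":
    print(apply_mutations(PLACEHOLDER, ["V1M", "T19V"]))
    print(apply_mutations(PLACEHOLDER, ["K5insA"]))
    print(percent_increase(105.0, 100.0))
```

Checking each code against the reference residue before applying it catches numbering mistakes early, which matters because the same position number can refer to different residues in SEQ ID NO: 5 and SEQ ID NO: 10 (L6 versus V6 in the claims).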

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202463676764P 2024-07-29 2024-07-29
US63/676,764 2024-07-29

Publications (1)

Publication Number Publication Date
WO2026030345A1 (en) 2026-02-05

Family

ID=96946523

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2025/039697 Pending WO2026030345A1 (en) 2024-07-29 2025-07-29 Signal and pro-region sequence variants for enhanced protease production in bacillus cells

Country Status (1)

Country Link
WO (1) WO2026030345A1 (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5185258A (en) 1984-05-29 1993-02-09 Genencor International, Inc. Subtilisin mutants
US5204015A (en) 1984-05-29 1993-04-20 Genencor International, Inc. Subtilisin mutants
WO1995010615A1 (en) 1993-10-14 1995-04-20 Genencor International, Inc. Subtilisin variants
WO2002014490A3 (en) 2000-08-11 2003-02-06 Genencor Int Bacillus transformation, transformants and mutant libraries
WO2003083125A1 (en) 2002-03-29 2003-10-09 Genencor International, Inc. Enhanced protein expression in bacillus
WO2015038792A1 (en) * 2013-09-12 2015-03-19 Danisco Us Inc. Compositions and methods comprising lg12-clade protease variants
WO2018169780A1 (en) * 2017-03-15 2018-09-20 Dupont Nutrition Biosciences Aps Methods of using an archaeal serine protease
WO2019108599A1 (en) 2017-11-29 2019-06-06 Danisco Us Inc Subtilisin variants having improved stability
WO2019245705A1 (en) 2018-06-19 2019-12-26 Danisco Us Inc Subtilisin variants
WO2020112599A1 (en) 2018-11-28 2020-06-04 Danisco Us Inc Subtilisin variants having improved stability
WO2023192953A1 (en) * 2022-04-01 2023-10-05 Danisco Us Inc. Pro-region mutations enhancing protein production in gram-positive bacterial cells
WO2024050503A1 (en) * 2022-09-02 2024-03-07 Danisco Us Inc. Novel promoter and 5'-untranslated region mutations enhancing protein production in gram-positive cells

Non-Patent Citations (12)

* Cited by examiner, † Cited by third party
Title
"Program Manual for the Wisconsin Package", 8 August 1994, GENETICS COMPUTER GROUP
ALTSCHUL ET AL.: "Basic local alignment search tool", J. MOL. BIOL., vol. 215, no. 3, 1990, pages 403 - 410, XP002949123, DOI: 10.1006/jmbi.1990.9999
ALTSCHUL ET AL.: "Gapped BLAST and PSI BLAST a new generation of protein database search programs", NUCLEIC ACIDS RES, SET 1, vol. 25, no. 17, 1997, pages 3389 - 402, XP002905950, DOI: 10.1093/nar/25.17.3389
BEAUCAGE AND CARUTHERS: "Deoxynucleoside phosphoramidites - A new class of key intermediates for deoxypolynucleotide synthesis", TETRAHEDRON LETT., vol. 22, 1981, pages 1859 - 1862
DATABASE Geneseq [online] 15 November 2018 (2018-11-15), "Bacillus subtilis AprE signal peptide, SEQ:16.", retrieved from EBI accession no. GSP:BFR51955 Database accession no. BFR51955 *
KARLIN ET AL.: "Applications and statistics for multiple high-scoring segments in molecular sequences", PNAS USA, vol. 90, no. 12, 1993, pages 5873 - 5877
MATTHES ET AL.: "Simultaneous rapid chemical synthesis of over one hundred oligonucleotides on a microscale", EMBO J., vol. 3, 1984, pages 801 - 805
NEEDLEMAN AND WUNSCH: "A general method applicable to the search for similarities in the amino acid sequences of two proteins", JOURNAL OF MOLECULAR BIOLOGY, vol. 48, no. 3, 1970, pages 443 - 53
SAITOU AND NEI: "The neighbor-joining method: a new method for reconstructing phylogenetic trees", MOL. BIOL. EVOL., vol. 4, 1987, pages 406 - 425
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY
SCHAFFER ET AL.: "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", NUCLEIC ACIDS RES, vol. 29, no. 14, 2001, pages 2994 - 3005
THOMPSON ET AL.: "CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice", NUCLEIC ACIDS RES, vol. 22, no. 22, 1994, pages 4673 - 4680, XP002956304

Similar Documents

Publication Publication Date Title
CN113366108B (en) Novel promoter sequences and methods thereof for enhancing protein production in Bacillus cells
CN111094576A (en) Modified 5' -untranslated region (UTR) sequences for increased protein production in Bacillus
EP4090738A1 (en) Compositions and methods for enhanced protein production in bacillus licheniformis
JP2025072450A (en) Compositions and methods for increased protein production in Bacillus licheniformis
US20240360430A1 (en) Methods and compositions for enhanced protein production in bacillus cells
US20250223623A1 (en) Pro-region mutations enhancing protein production in gram-positive bacterial cells
EP4581145A1 (en) Novel promoter and 5'-untranslated region mutations enhancing protein production in gram-positive cells
WO2026030345A1 (en) Signal and pro-region sequence variants for enhanced protease production in bacillus cells
US20220389372A1 (en) Compositions and methods for enhanced protein production in bacillus cells
WO2022178432A1 (en) Methods and compositions for producing proteins of interest in pigment deficient bacillus cells
WO2025034713A2 (en) Compositions and methods for enhanced protein production in gram‑positive bacterial cells
WO2025101486A1 (en) Methods and compositions for enhanced protein production in bacillus cells
EP4433588A1 (en) Compositions and methods for enhanced protein production in bacillus cells
WO2024091804A1 (en) Compositions and methods for enhanced protein production in bacillus cells
WO2023104846A1 (en) Improved protein production in recombinant bacteria