WO2013091661A2 - Proteolytic resistant protein affinity tag - Google Patents
- Publication number
- WO2013091661A2 (application PCT/DK2012/050505)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- tag
- pxp
- protein
- vector
- fusion polypeptide
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/0004—Oxidoreductases (1.)
- C12N9/0012—Oxidoreductases (1.) acting on nitrogen containing compounds as donors (1.4, 1.5, 1.6, 1.7)
- C12N9/0036—Oxidoreductases (1.) acting on nitrogen containing compounds as donors (1.4, 1.5, 1.6, 1.7) acting on NADH or NADPH (1.6)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K1/00—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
- C07K1/14—Extraction; Separation; Purification
- C07K1/16—Extraction; Separation; Purification by chromatography
- C07K1/22—Affinity chromatography or related techniques based upon selective absorption processes
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K7/00—Peptides having 5 to 20 amino acids in a fully defined sequence; Derivatives thereof
- C07K7/04—Linear peptides containing only normal peptide links
- C07K7/06—Linear peptides containing only normal peptide links having 5 to 11 amino acids
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K7/00—Peptides having 5 to 20 amino acids in a fully defined sequence; Derivatives thereof
- C07K7/04—Linear peptides containing only normal peptide links
- C07K7/08—Linear peptides containing only normal peptide links having 12 to 20 amino acids
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/20—Fusion polypeptide containing a tag with affinity for a non-protein ligand
- C07K2319/21—Fusion polypeptide containing a tag with affinity for a non-protein ligand containing a His-tag
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/20—Fusion polypeptide containing a tag with affinity for a non-protein ligand
- C07K2319/23—Fusion polypeptide containing a tag with affinity for a non-protein ligand containing a GST-tag
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/20—Fusion polypeptide containing a tag with affinity for a non-protein ligand
- C07K2319/24—Fusion polypeptide containing a tag with affinity for a non-protein ligand containing a MBP (maltose binding protein)-tag
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/35—Fusion polypeptide containing a fusion for enhanced stability/folding during expression, e.g. fusions with chaperones or thioredoxin
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/50—Fusion polypeptide containing protease site
Definitions
- the present invention relates to a protease resistant protein tag with high affinity for IMAC resins, which tag comprises a proline-X-proline motif.
- the C-/N- terminal tagging of recombinant proteins is a well-known procedure that enables both immunological detection of expression level and downstream purification of the recombinant protein by exploiting the properties of the tag to bind to a specific affinity column.
- the use of high affinity tags also increases the yield obtainable from recombinant production of proteins, which is a very important factor with great economic impact.
- different tags vary greatly with respect to yield of recombinant protein, and the position of the tag in a fusion protein also influences purification yield.
- the most used purification tag is the polyhistidine tag, typically 6x His, that consists of six consecutive histidine residues (HHHHHH) that with the use of PCR and/or restriction sites present in an expression vector can be incorporated into genetic constructs for making recombinant proteins.
- the linker amino acid sequence between the hexahistidine tag and the recombinant protein is the reason for unwanted proteolysis, which results in loss of the tag and hence diminishes the recovery yield.
- the present invention provides a modified protease resistant protein tag.
- This tag is not degraded by host proteases, and therefore delivers high yields of recombinant protein, when expressed and purified from a suitable host organism.
- the protein tag of the present invention comprises a short flanking amino acid motif, PXP (Proline-X-Proline), where X can be any amino acid, which, for example when added to an N- or C-terminal polyhistidine epitope tag (e.g. a 6xHis tag), drastically reduces the loss of this epitope tag.
- unwanted proteolysis usually occurs both in vivo and during purification.
- new improved variants of the polyhistidine tags are provided, which increase the binding to metal-chelate resins commonly used for purification of tagged recombinant proteins.
- the combination of the PXP tag and any protein tag, such as the canonical and/or new polyhistidine tags can generate an uncleavable C-/N- terminal affinity tag for recombinant protein purifications.
- a protein tag such as an epitope tag and/or an affinity purification tag, comprising a Proline-X-Proline (PXP) amino acid motif, wherein X is selected from any amino acid.
- the tag preferably comprises a polyhistidine tag, where the PXP motif is fused to the N- terminal and/or C-terminal end of the polyhistidine tag, thus having the formula: PXP-(His)n, (His)n-PXP or PXP-(His)n-PXP, where n is at least 2, and preferably 2-10, and even more preferred 4-6.
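The three layouts named above can be spelled out as one-letter sequences. The following is a minimal sketch only; the function name `build_pxp_his_tag` and the layout labels are hypothetical and are not part of the patent text.

```python
def build_pxp_his_tag(x: str = "K", n: int = 6, layout: str = "PXP-His") -> str:
    """Return a one-letter tag sequence for one of the three described layouts.

    x      : one-letter code of the central residue of the PXP motif
    n      : number of histidines (the text requires at least 2)
    layout : "PXP-His", "His-PXP" or "PXP-His-PXP" (labels are illustrative)
    """
    if len(x) != 1 or not x.isalpha():
        raise ValueError("x must be a single one-letter amino acid code")
    if n < 2:
        raise ValueError("n must be at least 2")
    pxp = f"P{x.upper()}P"
    his = "H" * n
    layouts = {
        "PXP-His": pxp + his,            # PXP-(His)n
        "His-PXP": his + pxp,            # (His)n-PXP
        "PXP-His-PXP": pxp + his + pxp,  # PXP-(His)n-PXP
    }
    if layout not in layouts:
        raise ValueError(f"unknown layout: {layout}")
    return layouts[layout]
```

With the defaults (X = K, n = 6) the sketch returns PKPHHHHHH, the C-terminal tag quoted for HvEP-B2 elsewhere in this document.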
- the tag preferably has the general formula: PXP-(HZ)n, where n is at least 1, preferably 1-10, Z is an amino acid, such as an amino acid selected from the group consisting of Histidine (H), Glycine (G), Serine (S), Threonine (T), Glutamic acid (E), Aspartic acid (D), Glutamine (Q) and Lysine (K), and X is any amino acid, but preferably an amino acid having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above.
- X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N).
- Preferred protein tags such as epitope or affinity tags, comprise the amino acid motif PKP, PQP, PHP, PNP and/or PRP.
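The general formula PXP-(HZ)n can be read as a simple sequence pattern. The sketch below is an illustration under stated assumptions: it restricts X to the preferred residues K, R, Q, H, N and Z to the residues explicitly listed (H, G, S, T, E, D, Q, K), although the text allows other choices. The names `PATTERN` and `find_pxp_hz` are hypothetical.

```python
import re

PREFERRED_X = "KRQHN"   # preferred central residues named in the text
LISTED_Z = "HGSTEDQK"   # Z residues explicitly listed (the text says "such as")

# PXP followed by one or more HZ repeats
PATTERN = re.compile(rf"P([{PREFERRED_X}])P((?:H[{LISTED_Z}])+)")


def find_pxp_hz(sequence: str):
    """Return X, the repeat count n and the start index of the first
    PXP-(HZ)n match, or None if the sequence contains no such stretch."""
    match = PATTERN.search(sequence.upper())
    if match is None:
        return None
    return {
        "x": match.group(1),
        "n": len(match.group(2)) // 2,  # each HZ repeat is two residues
        "start": match.start(),
    }
```

Under this reading a plain hexahistidine block after PKP counts as three HH repeats (Z = H), while a sequence with no PXP motif in front of the histidines returns None.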
- the invention relates to a fusion polypeptide for expression in a host cell, said fusion polypeptide comprising a first polypeptide sequence fused to a protein tag, such as an epitope tag, of the present invention.
- the fusion polypeptide in one embodiment further comprises a linker sequence between the protein tag and said first polypeptide, for example the linker sequence comprises
- a proteolytic cleavage site suitable for separating the first polypeptide from the protein tag, such as an epitope tag, or
- the invention also in a further aspect pertains to a nucleic acid encoding a protein tag and/or a fusion polypeptide of the present invention.
- nucleic acid vector comprising such a nucleic acid sequence of the invention.
- the nucleic acid vector is in a preferred embodiment an expression vector, such as a prokaryotic expression vector or a eukaryotic expression vector, for example a yeast or mammalian expression vector.
- a recombinant host cell comprising a nucleotide sequence, fusion polypeptide, a nucleic acid sequence, and/or a nucleic acid vector of the present invention.
- the host cell is for example selected from eukaryotic or prokaryotic cells, for example the host is selected from mammalian cells, human cells, mouse cells, plant cells, Chinese Hamster Ovary (CHO) cells, and insect cells.
- the present invention relates to a kit comprising a nucleic acid sequence, a nucleic acid vector and/or a host cell of the present invention.
- Another aspect relates to the use of an oligonucleotide comprising a sequence encoding a protein tag, such as an epitope tag of the present invention, or part thereof, for introducing said protein tag or part thereof in a nucleic acid cloning vector and/or expression vector, and/or for fusing said protein tag or part thereof to a nucleic acid sequence encoding a first polypeptide sequence.
- the invention pertains to a method of producing recombinant protein, said method comprising:
- the isolation of the fusion polypeptide preferably comprises
- the metal chelate resin is preferably NTA (nitrilotriacetic acid) or IDA (iminodiacetic acid)-charged agarose/silica resin, and the metal chelate preferably comprises a bivalent cation, such as Co2+, Zn2+, Cu2+, Ca2+, Cd2+, or Ni2+.
- FIG. 1 A: Vector 01 polylinker based on pPICZ_alpha_A (Life Technology) backbone. The vector has been obtained ligating the linker created with the oligos
- the C- terminal of the recombinant protein will contain the residues: INASAPKPHHHHHH (SEQ ID NO: 6).
- the C-terminal of the recombinant protein will contain the residues: INASAPKPHQHRHKHQP (SEQ ID NO: 7).
- FIG. 1 A. Vector 03 polylinker based on pPICZ_alpha_A backbone.
- the vector has been obtained by ligating the linker created with the oligos V03_up/V03_down to the vector pPICZ_alpha_A opened with XhoI and XbaI.
- the C-terminal of the recombinant protein will contain the residues: INASAPKPHGHTHGHSHGHP (SEQ ID NO: 8).
- the vector has been obtained by ligating the linker created with the oligos V04_up/V04_down to the vector pPICZ_alpha_A opened with XhoI and XbaI.
- the C-terminal of the recombinant protein will contain the residues: INASAPKPHEHDHEHDHEHP (SEQ ID NO: 9).
- Vector 05 polylinker based on pPICZ_alpha_A backbone. The vector has been obtained by ligating the linker created with the oligos V05_up/V05_down to the vector pPICZ_alpha_A opened with XhoI and XbaI.
- the C-terminal of the recombinant protein will contain the residues: SRPKPHHHHHH (SEQ ID NO: 10).
- Vector 06 polylinker based on pPICZ_alpha_A backbone.
- the vector has been obtained by ligating the linker created with the oligos V06_up/V06_down to the vector 05 opened with XbaI and SalI.
- the C-terminal of the recombinant protein will contain the residues: SRPKPHQHRHKHQP (SEQ ID NO: 11).
- FIG. 3 A. Vector 07 polylinker based on pPICZ_alpha_A backbone. The vector has been obtained by ligating the linker created with the oligos V07_up/V07_down to the vector 05 opened with XbaI and SalI. The C-terminal of the recombinant protein will contain the residues: SRPKPHGHTHGHSHGHP (SEQ ID NO: 12).
- Vector 08 polylinker based on pPICZ_alpha_A backbone.
- the vector has been obtained by ligating the linker created with the oligos V08_up/V08_down to the vector 05 opened with XbaI and SalI.
- the C-terminal of the recombinant protein will contain the residues: SRPKPHEHDHEHDHEHP (SEQ ID NO: 13).
- the vector has been obtained by ligating the linker created with the oligos V09_up/V09_down to the vector pET15a opened with NcoI and NdeI.
- the N-terminal of the recombinant protein will contain the residues: MGSSHHHHHHPKP (SEQ ID NO: 14).
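The terminal sequences listed for the vectors (SEQ ID NO: 6 to 14) all contain the PKP motif, and splitting each one at that motif makes the linker residues and the histidine-containing block easier to compare. This is an illustrative sketch only; the dictionary name and the helper `split_at_pkp` are hypothetical, and the sequences are copied from the lines above.

```python
# Terminal residues quoted above, keyed by SEQ ID NO.
TERMINAL_SEQUENCES = {
    6: "INASAPKPHHHHHH",
    7: "INASAPKPHQHRHKHQP",
    8: "INASAPKPHGHTHGHSHGHP",
    9: "INASAPKPHEHDHEHDHEHP",
    10: "SRPKPHHHHHH",
    11: "SRPKPHQHRHKHQP",
    12: "SRPKPHGHTHGHSHGHP",
    13: "SRPKPHEHDHEHDHEHP",
    14: "MGSSHHHHHHPKP",  # N-terminal tag: histidines precede PKP
}


def split_at_pkp(sequence: str):
    """Split a terminal sequence into (residues before PKP, 'PKP', residues after)."""
    before, motif, after = sequence.partition("PKP")
    if not motif:
        raise ValueError("sequence does not contain a PKP motif")
    return before, motif, after


if __name__ == "__main__":
    for seq_id, seq in TERMINAL_SEQUENCES.items():
        before, motif, after = split_at_pkp(seq)
        print(f"SEQ ID NO: {seq_id}: {before or '-'} | {motif} | {after or '-'}")
```

In the C-terminal constructs the histidine-containing block follows PKP, whereas in the N-terminal construct (SEQ ID NO: 14) it precedes the motif.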
- Fig. 4 Illustration of Immobilized Metal affinity chromatography (IMAC) with nickel. Comparison of binding of conventional consecutive histidine tag, and alternating histidine.
- Fig. 5 Comparison of a traditional HIS6 tagged protein and a PXP-HIS6 tagged protein.
- Top panel shows a traditional N- and C-terminally tagged HIS6 protein.
- the bottom panel shows a PXP N- and C-terminally tagged HIS6 protein, in which a 90 degree turn allows good use of the hexahistidine moiety without hindering folding. It is seen that the PXP polyhistidine tags stick out from the recombinant protein, ensuring better IMAC (Immobilized Metal Affinity Chromatography).
- Fig. 6 Advantage of the PXP tag over just 6xHis tag.
- the PXP-HIS tag displays strong binding to Ni-NTA or Ni-IDA columns.
- the C-terminal PKPHHHHHH tagged HvEP-B2 was all eluted from Ni-IDA (Protino, Macherey-Nagel) with 1 M NaCl and 0.5 M
- Fig. 8 Commercial Anti-His antibodies are dependent on the vicinal amino acid residues. 6xHIS-tagged proteins A-H were detected with QIAexpress Anti-His antibodies. Indicated amounts of pure 6xHis-tagged protein were applied to nitrocellulose membrane, and detection was carried out with the Anti-His primary antibody indicated diluted 1/2000, followed by chromogenic detection with AP- conjugated rabbit anti-mouse IgG and NBT/BCIP.
- Fig. 9. A schematic overview of PXP tag removal by DAPase (Tagzyme). Top panel shows DAPase cleavage sites in respect of amino acids, Lysine, Arginine, Proline, and Glutamine.
- the middle panel shows Tagzyme tag removal of a recombinant protein with an N-terminal 6xHIS-PKP tag.
- the bottom panel shows DAPase tag removal of a recombinant protein with an N-terminal 6xHIS-Q-PKP tag, where the polyhistidine tag is first removed by DAPase, and the N-terminal glutamine residue is converted to pyroglutamate by Qcyclase, which in turn is removed by pGAPase enzyme action.
- Fig. 10 SDS-PAGE/Coomassie and the elution profile of the Thioredoxin h isoform b as coming from pET15m or vector 10/11.
- Fig. 11 SDS-PAGE/Coomassie and the elution profile of the wheat NADPH dependent Thioredoxin reductase (NTR).
- The use of N- or C-terminal polyhistidine epitope tags for downstream affinity purification of heterologous recombinant proteins is always accompanied by some loss of the tag due to unpredictable proteolysis events. This phenomenon differs with the host, the expression strain and the type of recombinant protein produced.
- the present invention provides a short Proline-X-Proline (PXP) amino acid motif that, when added to the N- or C-terminus of a protein tag, such as a polyhistidine epitope tag (e.g. a 6xHis tag), drastically reduces the loss of the tag due to unwanted proteolysis occurring in vivo or during purification.
- the PXP tag of the present invention, for example in a polyhistidine tag, provides a proteolytic resistant IMAC tag for recombinant proteins, since the flanking amino acid residues act as a steric hindrance upstream and downstream of the polyhistidine motif, so that attack by proteases is blocked.
- the PXP-tags/PXP-IMAC tags will not disturb the correct folding of the C-terminal of the recombinant proteins
- Constructs tagged with a PXP tag of the invention display tight binding to IMAC resins (e.g. Nickel-IMAC resins), which allows the use of stronger washing conditions, and consequently, purer protein can be eluted from the resin.
- a PXP-protein tag of the present invention increases the yield of recombinant protein obtainable, when the tagged protein is purified, for example by affinity chromatography.
- the protein tags of the present invention offer several advantages over the prior art. Apart from displaying increased proteolytic resistance, recombinant proteins with a PXP-tag (such as modified IMAC tags) of the present invention provide higher recovery yield (>95%), high purity of the eluted proteins, and significantly better reactivity to anti-His antibodies.
- Affinity chromatography is a method of separating biochemical mixtures based on a highly specific interaction such as that between antigen and antibody, enzyme and substrate, or receptor and ligand. It can be used for purifying and concentrating a substance from a mixture into a buffering solution or purifying and concentrating an enzyme solution.
- Affinity chromatography employs a solid/immobile phase, which is typically a gel matrix/resin, often of agarose. Usually the starting point is an undefined heterogeneous group of molecules in solution, such as a cell lysate/extract or growth medium/broth. The molecule of interest will have a well-known and defined property which can be exploited during the affinity purification process.
- in affinity chromatography, the polypeptide of interest becomes trapped on the solid phase/matrix. The other molecules in solution will not become trapped as they do not possess this property.
- the solid medium can then be removed from the mixture, washed and the target polypeptide released from the entrapment in a process called elution.
- Affinity chromatography is commonly used for the purification of recombinant proteins.
- the solid phase is often loaded into a column into which the sample (e.g. cell extract) is loaded; thereafter, a washing buffer can be applied to wash off non-specific binding agents. Then the target polypeptide can be eluted from the column and collected.
- affinity chromatography comprises the steps of
- the choice of affinity matrix/resin depends on the specific protein to be purified.
- Recombinant proteins fused to an epitope tag or affinity tag may be purified by using an affinity matrix, which binds the specific tag used.
- One type of resin is a metal chelate resin, which can be used for isolating fusion polypeptides by metal ion affinity chromatography or immobilized metal ion affinity chromatography (IMAC) using the culture broth or a cell extract as starting material.
- This methodology is based on the specific coordinate covalent bond of amino acids, particularly histidine, to metals.
- the technique works by allowing proteins with an affinity for metal ions to be retained in a column containing immobilized metal ions, such as cobalt, nickel, copper for the purification of histidine containing proteins or peptides, iron, zinc or gallium for the purification of phosphorylated proteins or peptides.
- polyHIS tagged polypeptides bind metal chelate resins, such as NTA (nitrilotriacetic acid) or IDA (iminodiacetic acid)-charged agarose/silica resin comprising a bivalent cation, such as Co2+, Zn2+, Cu2+, Ca2+, Cd2+, or Ni2+.
- Methods used to elute the protein of interest include changing the pH, or adding a competitive molecule, such as imidazole.
- Amino acid is an entity comprising an amino terminal part (NH2) and a carboxy terminal part (COOH) separated by a central part comprising a carbon atom, or a chain of carbon atoms, comprising at least one side chain or functional group.
- NH2 refers to the amino group present at the amino terminal end of an amino acid or peptide
- COOH refers to the carboxy group present at the carboxy terminal end of an amino acid or peptide.
- the generic term amino acid comprises both natural and non-natural amino acids. Natural amino acids of standard nomenclature as listed in J. Biol. Chem., 243:3552-59 (1969) and adopted in 37 C.F.R., section 1.822(b)(2) belong to the group of amino acids listed herein below.
- Amino acid residues described herein can be in the "D" or "L" isomeric form. Where the L or D form has not been specified it is to be understood that the amino acid in question has the natural L form, cf. Pure & Appl. Chem. Vol. 56(5) pp 595-624 (1984), or the D form, so that the peptides formed may be constituted of amino acids of L form, D form, or a sequence of mixed L forms and D forms.
- amino acids are either indicated by the full name, or by their conventional 3-letter or 1-letter code.
- an amino acid can be selected from any amino acid, whether naturally occurring or not, such as alpha amino acids, beta amino acids, and/or gamma amino acids. Accordingly, the group comprises but is not limited to: Ala, Val, Leu, Ile, Pro, Phe, Trp, Met, Gly, Ser, Thr, Cys, Tyr, Asn, Gln, Asp, Glu, Lys, Arg, His, Aib, Nal, Sar, Orn, Lysine analogues DAP and DAPA.
- Nucleic acid, nucleic acid or polynucleotide is meant to encompass DNA and RNA as well as derivatives thereof such as peptide nucleic acids (PNA) or locked nucleic acids (LNA) throughout the description.
- the terms refer to polynucleotides, such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), oligonucleotides, fragments generated by the polymerase chain reaction (PCR), and fragments generated by any of ligation, scission, endonuclease action, and exonuclease action.
- Polynucleotides can be composed of monomers that are naturally-occurring nucleotides (such as DNA and RNA), or analogs of naturally-occurring nucleotides (e.g., alpha-enantiomeric forms of naturally-occurring nucleotides), or a combination of both.
- Modified nucleotides can have alterations in sugar moieties and/or in pyrimidine or purine base moieties.
- Sugar modifications include, for example, replacement of one or more hydroxyl groups with halogens, alkyl groups, amines, and azido groups, or sugars can be functionalized as ethers or esters.
- the entire sugar moiety can be replaced with sterically and electronically similar structures, such as aza-sugars and carbocyclic sugar analogs.
- modifications in a base moiety include alkylated purines and pyrimidines, acylated purines or pyrimidines, or other well-known heterocyclic substitutes.
- Nucleic acid monomers can be linked by phosphodiester bonds or analogs of such linkages. Analogs of phosphodiester linkages include phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoranilidate, phosphoramidate, and the like.
- polynucleotide also includes so-called “peptide nucleic acids,” which comprise naturally-occurring or modified nucleic acid bases attached to a polyamide backbone. Nucleic acids can be either single stranded or double stranded. Nucleic acids of the present invention are generally defined in terms of the translational gene products encoded by the respective nucleic acid. The sequence of a nucleic acid of the invention may thus be inferred from the sequence of the polypeptide gene product by using the conventional genetic code to decipher, which nucleotides will encode a specific polypeptide.
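Inferring a nucleic acid sequence from a polypeptide with the standard genetic code can be sketched as a back-translation. Because the code is degenerate, many nucleotide sequences encode the same tag; the single codon chosen per residue below is an arbitrary illustration, not a codon-optimized choice and not a sequence taken from the patent. The names `CODON_CHOICE`, `back_translate` and `translate` are hypothetical.

```python
# One valid standard-code codon per amino acid (arbitrary illustrative choice).
CODON_CHOICE = {
    "A": "GCT", "R": "CGT", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCA",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAC", "V": "GTT",
}
# Inverse table; only covers the codons chosen above.
AMINO_ACID_FOR = {codon: aa for aa, codon in CODON_CHOICE.items()}


def back_translate(peptide: str) -> str:
    """Return one possible DNA coding sequence for a one-letter peptide."""
    return "".join(CODON_CHOICE[aa] for aa in peptide.upper())


def translate(dna: str) -> str:
    """Translate DNA built by back_translate back into a peptide (round trip)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(AMINO_ACID_FOR[dna[i:i + 3]] for i in range(0, len(dna), 3))
```

For example, `back_translate("PKPHHHHHH")` gives a 27-nucleotide sequence, and translating it back recovers the tag.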
- Gene product refers to any transcriptional or translational product of a gene.
- a transcriptional product comprises any RNA-species, which is transcribed from the specific gene, such as pre-RNA, mRNA, tRNA, miRNA, spliced and nonspliced RNA.
- a transcriptional gene product of the present invention comprises any RNA- species encoded by or comprising a sequence selected from any ⁇ -glucosidase gene.
- a transcriptional gene product of the present invention comprises any RNA-species encoded by a sequence selected from any of SEQ ID NO: 15-40, 52-72 and 75-79.
- the transcript may be bound by RNA-binding proteins and, thus, packaged into a ribonucleoprotein (RNP), for example an mRNP molecule.
- a translational gene product of the present invention comprises any peptide or polypeptide encoded by the gene or a fragment thereof.
- a "polypeptide encoded by a gene of the present invention” is comprised in the terms “gene product", or “translational gene product”.
- a translational gene product of the present invention comprises any polypeptide-species encoded by a sequence selected from any ⁇ - glucosidase.
- Extract is used herein for any extraction of a microorganism and/or host cell of the present invention.
- the extract preferably comprises a polypeptide, in particular polypeptides of the present invention, such as fusion polypeptides having a modified HIS-tag.
- An extract may also comprise polynucleotides, such as nucleic acid molecules of nucleic acid vectors of the present invention.
- the extract may be prepared by opening the cells by lysis or chemical shear, and extracting the desired components in a suitable buffer.
- Broth is used herein to describe a medium, which has been used for the culturing of a microorganism and/or host cell of the present invention.
- the broth is preferably a liquid culture broth, and the broth preferably comprises metabolites and/or other secreted components of the cultured microorganism and/or host cell, for example polypeptides of the present invention.
- Polypeptide is a plurality of at least two covalently linked amino acid residues defining a sequence and linked by amide bonds. Proteins and enzymes are composed of polypeptides, and thus, the term "protein" is used herein analogously with polypeptide.
- the amino acids may be both natural amino acids and non-natural amino acids, including any combination thereof.
- the natural and/or non-natural amino acids may be linked by peptide bonds or by non-peptide bonds.
- polypeptide also embraces post-translational modifications introduced by chemical or enzyme-catalyzed reactions, as are known in the art. Such post-translational modifications can be introduced prior to partitioning, if desired. Amino acids as specified herein will preferentially be in the L- stereoisomeric form.
- Amino acid analogs can be employed instead of the 20 naturally- occurring amino acids.
- Several such analogs are known, including fluorophenylalanine, norleucine, azetidine-2-carboxylic acid, S-aminoethyl cysteine, 4-methyl tryptophan and the like.
- Protein tag is used herein for any entity, which can serve for identifying and/or isolating a protein to which it is associated.
- the present invention primarily relates to polypeptide tags, i.e. protein tags, which are composed of covalently linked amino acid residues. Protein tags can be used to identify and isolate a polypeptide of interest, which is attached to the tag by a chemical linkage, such as a peptide bond. Protein tags for example include epitope tags and affinity tags.
- Affinity tag is a protein tag consisting of an amino acid sequence that has been attached to a protein of interest to make its purification easier, so that the protein can be purified from a crude biological source using an affinity technique.
- the tag could be another protein or a short amino acid sequence, allowing purification by affinity chromatography.
- an affinity tag can be an epitope tag.
- Examples of affinity tags are chitin binding protein (CBP), maltose binding protein (MBP), and glutathione S-transferase (GST).
- polyHis tag is a widely-used affinity tag, which binds to metal matrices/resins.
- Affinity tags may also be used as epitope tags
- An epitope is a portion of a molecule to which an antibody binds, i.e. epitopes are recognizable and bound by an antibody or an antigen-binding fragment thereof.
- Epitopes can be composed of sugars, lipids or amino acids.
- epitope tags generally are constructed of amino acids.
- Epitope tags may be added to a molecule (usually a polypeptide) as a tool to visualize or isolate the protein.
- Visualization can take place in a gel, a western blot or labeling via
- Isolation can be done by protein purification techniques, such as chromatographic methods using antibodies or affinity chromatography.
- an epitope tag can be added to the polypeptide.
- Epitope tags range typically from 10 to 15 amino acids and are designed to create a molecular handle for the polypeptide of interest.
- An epitope tag could be placed anywhere within your protein, but typically they are placed on either the amino or carboxyl terminus to minimize any potential disruption in tertiary structure and thus function of your protein. Any short stretch of amino acids known to bind an antibody qualifies as an epitope tag.
- a number of epitope tags are widely used in the field of protein chemistry. These include:
- HHHHHH PolyHIS/His6 - 5-6 histidines placed in a row form a structure that binds divalent cations such as nickel. This is especially useful for affinity chromatography but can also be used as an epitope tag
- VSV-G (YTDIEMNRLGK)
- Epitope tags also include affinity tags, which are recognizable by an antibody or an antigen-binding fragment thereof.
- Expression vectors are defined herein as DNA sequences that are required for the transcription of cloned copies of genes and the translation of their mRNAs in an appropriate host. Such vectors can be used to express genes in a variety of hosts such as bacteria, yeast, bluegreen algae, plant cells, insect cells and animal cells. In particular, the vectors can be used for expression of a PXP-comprising epitope tag of the present invention, including fusion polypeptides.
- Protein tag comprising PXP amino acid motif
- the present invention provides a protein tag, such as an epitope tag or affinity tag, comprising a PXP amino acid motif.
- the PXP motif functions as a proteolytic resistant motif to the flanking region of the tag.
- the steric hindrance of two flanking proline residues renders the central amino acid of the PXP motif and the protein tag proteolytic resistant to most proteases present in the hosts for recombinant protein production (E. coli, other bacteria, yeast, fungi or other eukaryotic recombinant expression systems). This ensures a significantly higher yield of recombinant protein expressed with a protein tag, such as an epitope tag, of the present invention.
- the two proline residues of the PXP motif introduce a 90 degree angle turn so that the protein tag is projected outward from the recombinant protein, hence causing less steric hindrance to the folding and stability of the same.
- This enables the production of recombinant protein with increased biological activity and stability, and in particular allows the production of biologically active recombinant proteins, which cannot be produced by conventional protein tag methods, due to incorrect folding, and/or degradation.
- the conformation, where the PXP motif of the protein tag introduces a 90 degree angle turn so that the protein tag is projected outward from the recombinant protein, also facilitates the interaction of the protein tag, e.g. an epitope tag, with external binding partners, for example interaction with binding agents, such as antibodies or resins, which bind the tag, because the conformation exposes the tag outside of the recombinant protein.
- IMAC resins such as HIS-TRAP resins (e.g. Ni/NTA, Ni/IDA, etc.) is promoted.
- the present invention solves at least the problem of protein tag loss due to proteolysis that occurs close to the site of fusion to the heterologous recombinant protein.
- the flanking tag is resistant to intracellular and/or extracellular proteolysis for prokaryotic or eukaryotic expression systems.
- problems of reduced interaction with protein binding agents are solved; and problems resulting from incorrect folding and reduced stability are solved by inclusion of a protein tag, such as an epitope tag, of the present invention in a recombinant protein.
- the flanking tag can be incorporated by PCR into constructs comprising a gene of interest, either N- or C-terminal to a suitable protein tag (e.g. an epitope tag such as polyHIS (e.g. 6xHIS tag)).
- when the flanking tag is incorporated as a C-terminal tag, it should include a stop codon after the last residue of the tag, for example after the last histidine in a polyhistidine tag.
- the PXP sequence can be fused to a vector with a custom protein tag engineered in order to adapt the flanking PXP tag with a suitable restriction site in order to generate a maximum of 2 amino acid frame distance to the PXP and the polyhistidine tag.
- In the examples provided herein below, vectors are disclosed, wherein the flanking tag is incorporated in custom polyhistidine tags.
- the present invention relates to a polypeptide protein tag, such as an epitope tag or affinity purification tag, comprising a Proline-X-Proline (PXP) amino acid motif, wherein X is selected from any amino acid.
- the protein tag of the invention is not bound to be any specific length, but generally, the tag comprises at least 5 amino acids, and normally 5-50, such as 5-40 for example 5-30 or 5-20 amino acids.
- X may be selected among basic amino acids, and therefore X can be selected on the basis of the isoelectric point of the respective amino acid.
- X is selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7.
- X is selected from amino acids having an isoelectric point above 7.
- X is in one embodiment selected from His, Lys or Arg.
- X is selected on the basis of its charge, polarity and/or hydrophilicity.
- X is selected from positively charged amino acids; such as His, Lys, and Arg.
- X is selected from amino acids, which are positively charged (basic amino acids and non-acidic amino acids), polar and hydrophilic, such as His, Lys and Arg.
- X is selected from amino acids, which are negatively charged (acidic amino acids), polar and hydrophilic amino acids, such as Asp and Glu.
- X is not a non-polar, hydrophobic amino acid.
- X is an amino acid, which is not Phe, Met, Trp, Ile, Val, Leu, Ala, or Pro.
- X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N).
- X is selected from K, Q, and R; i.e. the PXP amino acid motif is PKP, PQP and/or PRP.
- preferred protein tags comprise the amino acid motif PKP, PQP, PRP and/or PNP.
- X is Lysine (K) or Arginine (R), i.e. the PXP tag is PKP or PRP.
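The selection of X by isoelectric point can be illustrated with a minimal sketch. The pI values below are approximate textbook values for the free amino acids and are an assumption of this example, not figures taken from the present text; the function names are hypothetical.

```python
# Approximate textbook isoelectric points (pI) of free amino acids.
# Assumption for illustration only: published values differ slightly by source.
APPROX_PI = {
    "K": 9.74, "R": 10.76, "H": 7.59,
    "Q": 5.65, "N": 5.41,
    "D": 2.77, "E": 3.22,
}

def candidate_x(min_pi):
    """Return the residues in the table whose pI is at least min_pi, sorted."""
    return sorted(aa for aa, pi in APPROX_PI.items() if pi >= min_pi)

def pxp_motifs(min_pi):
    """Build a P-X-P motif for every candidate X."""
    return ["P" + aa + "P" for aa in candidate_x(min_pi)]

print(candidate_x(7))   # residues passing the "pI at least 7" criterion
print(pxp_motifs(5))    # motifs passing the "pI at least 5" criterion
```

With this small table, a threshold of 5 admits K, R, Q, H and N while excluding Asp and Glu, which is consistent with the group of preferred X residues named above.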
- PXP motifs specifically PKP, PQP, PRP and/or PNP, in the flanking tag improve for example a polyhistidine tag because they introduce one extra positive charge, which increases the affinity of the polyhistidine tag to for example IMAC resins, such as nickel-immobilized affinity resins (e.g. Ni/NTA, Ni/IDA etc.).
- PKP or PRP motifs in a flanking polyhistidine tag leave the epitope or affinity tag more sensitive to anti-HISn antibodies, which show in the order of 10 to 100 fold more sensitivity to a PXP-comprising polyHIS tag than to the polyHIS tag itself, i.e. without a PXP of the present invention.
- the PXP motif may be introduced into any suitable epitope tag or affinity tag, including those tags, which are commercially available.
- suitable epitope tags to which a PXP motif of the present invention may be attached are c-myc, HA, PolyHIS, VSV-G, HSV, V5, HAT and FLAG.
- the PXP motif is introduced into a polyhistidine tag, which comprises or consists of at least 2 consecutive histidine residues, and preferably 2-10, and even more preferred 4-6 consecutive histidine residues, such as 5 or 6 His residues.
- the tag of the invention further comprises a polyhistidine tag, where the PXP motif is fused to the N- terminal and/or C-terminal end of the polyhistidine tag, thus having the general formula:
- n is at least 2, and preferably 2-10, and even more preferred 4-6.
- X is any amino acid, but preferably selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above.
- X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), for example PXP is PKP, PQP, PRP or PNP.
- X is selected from K, Q, and R; i.e. the PXP amino acid motif is PKP, PQP and/or PRP, or X is K or R; i.e. the PXP amino acid motif is PKP or PRP.
- the present invention also relates to epitope tags and/or affinity tags comprising a PXP motif as defined herein, where the tag is a variation of the classical consecutive polyhistidine tag.
- the invention encompasses a modified polyhistidine tag formed by the mix of any amino acid spaced by histidine comprising the PXP motif of the invention at the C or N terminal end.
- This modified HIS-tag can have the general formula:
- the modified HIS-tag above further comprises a C-terminal Proline residue, thus having the general formula:
- n is at least 1, preferably 1-10, such as 2-8, for example 4-6; and H is histidine.
- X is selected from any amino acid, and as described above, X is preferably selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, and/or X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), or in a specifically preferred embodiment, X is selected from K, Q, and R; i.e. the PXP amino acid motif is PKP, PQP and/or PRP.
- preferred polyhistidine tags comprise the amino acid motif PKP, PQP, PRP and/or PNP.
- Z is selected from any amino acid, and each repetitive (HZ) sequence may vary with respect to the identity of Z.
- Z is a basic amino acid.
- Z is an amino acid selected from the group consisting of Histidine (H), Glycine (G), Serine (S), Threonine (T), Glutamic acid (E), Aspartic acid (D), Glutamine (Q), Arginine (R) and Lysine (K).
- the protein tag of the invention is selected from the group consisting of SEQ ID NO: 1-5.
- the epitope and/or affinity tag of the present invention consists of or comprises an amino acid sequence selected from the group consisting of
- a tag of the present invention is selected from the group consisting of INASAPKPHHHHHH, INASAPKPHQHRHKHQP, INASAPKPHGHTHGHSHGHP, INASAPKPHEHDHEHDHEHP, SRPKPHHHHHH, SRPKPHQHRHKHQP, SRPKPHGHTHGHSHGHP, SRPKPHEHDHEHDHEHP, and/or MGSSHHHHHHPKP (SEQ ID NO: 6-14).
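As a quick consistency check, the listed tag sequences can be scanned for the P-X-P triplet. This is a minimal sketch; the helper name `find_pxp` is hypothetical, and the sequences are copied from the list above.

```python
import re

# Tag sequences as listed in the text (SEQ ID NO: 6-14).
TAGS = [
    "INASAPKPHHHHHH",
    "INASAPKPHQHRHKHQP",
    "INASAPKPHGHTHGHSHGHP",
    "INASAPKPHEHDHEHDHEHP",
    "SRPKPHHHHHH",
    "SRPKPHQHRHKHQP",
    "SRPKPHGHTHGHSHGHP",
    "SRPKPHEHDHEHDHEHP",
    "MGSSHHHHHHPKP",
]

# A lookahead is used so that overlapping motifs (e.g. in "PKPKP") are all found.
PXP = re.compile(r"(?=(P[A-Z]P))")

def find_pxp(seq):
    """Return (0-based position, motif) for every P-X-P triplet in seq."""
    return [(m.start(), m.group(1)) for m in PXP.finditer(seq)]

for tag in TAGS:
    print(tag, find_pxp(tag))
```

Each listed tag carries a PKP motif directly adjacent to its histidine-containing stretch, either before it or, in the last sequence, after it.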
- the epitope and/or affinity tag of the present invention consists of or comprises an amino acid sequence selected from the group consisting of NH2-MGHHHHHHPKPASHM (SEQ ID NO: 73) and NH2-MGHHHHHHPKPENLYFQGASH (SEQ ID NO: 74).
- the tag of the present invention is a polyhistidine tag comprising a mix of aliphatic, acidic and/or basic residues, wherein each of the aliphatic, acidic and/or basic residues are separated by one histidine residue.
- a tag of the invention has the general formula:
- Z is a basic amino acid and/or selected from the group consisting of Histidine (H), Glycine (G), Serine (S), Threonine (T), Glutamic acid (E), Aspartic acid (D), Glutamine (Q), Arginine (R) and Lysine (K); and X is selected from G,S,T,E,D,Q or K amino acids, or Z is K, Q, R or N.
- the tag has terminal proline residue and thus has the general formula: PZPHXHXHXHXHXHP.
- the inclusion of the terminal proline is optional.
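The tag layouts can be sketched as simple string builders. The notation PXP-(His)n and PXP-(HZ)n-P used here follows the general formulas given elsewhere in this text for the C-terminal tag variants; the function names and argument conventions are assumptions made for illustration.

```python
def his_tag(x, n):
    """PXP-(His)n: a PXP motif followed by n consecutive histidines."""
    if len(x) != 1:
        raise ValueError("x must be a single residue")
    if n < 2:
        raise ValueError("the text describes n as at least 2 for (His)n")
    return "P" + x + "P" + "H" * n

def hz_tag(x, z_residues, terminal_proline=True):
    """PXP-(HZ)n-P: every Z is preceded by one histidine; n = len(z_residues)."""
    if len(x) != 1 or not z_residues:
        raise ValueError("x must be one residue and z_residues non-empty")
    body = "".join("H" + z for z in z_residues)
    return "P" + x + "P" + body + ("P" if terminal_proline else "")

print(his_tag("K", 6))        # PKPHHHHHH
print(hz_tag("K", "QRKQ"))    # PKPHQHRHKHQP
```

Prefixing these outputs with the linker residues SR or INASA reproduces two of the sequences listed above, SRPKPHHHHHH and INASAPKPHQHRHKHQP.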
- Fusion polypeptide comprising PXP-comprising protein tag
- the invention relates to a recombinant polypeptide comprising a polypeptide of interest for recombinant expression and purification, which is fused to a protein tag, such as an epitope tag, of the present invention.
- a fusion polypeptide comprising a first polypeptide sequence fused to at least one protein tag of the present invention, such as a PXP-comprising protein tag, as defined elsewhere herein; i.e.
- the invention relates to a fusion polypeptide for expression in a host cell comprising a first polypeptide sequence R1 fused to a protein tag comprising a PXP amino acid motif, where X is any amino acid, but preferably an amino acid having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, or X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), preferably X is K, Q, R or N.
- the invention provides a fusion polypeptide obtainable by expression in a host cell, said fusion polypeptide comprising a first polypeptide sequence fused to at least one protein tag of the present invention, such as a PXP-comprising epitope tag of the present invention.
- the at least one protein tag such as epitope tag, may be fused to the N-terminal and/or C-terminal end of said first polypeptide.
- Linker sequences may be inserted into the fusion polypeptide of the invention, for example in order to create fusion proteins, custom proteolytic sites, introduce specific polypeptide signals or recognition sites in the polypeptide.
- the fusion polypeptide in one embodiment further comprises a linker sequence between the protein tag and the first polypeptide, for example the linker sequence comprises
- a proteolytic cleavage site suitable for separating the first polypeptide from the protein tag, such as the epitope tag of the invention, or
- the "first polypeptide” may be any functional polypeptide, which is desired to be purified by affinity chromatography or other methods.
- the "first polypeptide" may be any biologically active polypeptide.
- Particularly relevant polypeptides are enzymes produced in large scale, and polypeptides which are prone to proteolytic degradation, where the advantage of reduced proteolytic degradation is particularly relevant.
- Examples of “first polypeptides” are Alpha Mating Factor fusion protein containing Maltose Binding Protein (MBP) and wheat Thioredoxin.
- a fusion polypeptide of the present invention is selected from the group consisting of SEQ ID NO: 41-51.
- the fusion polypeptide comprises a sequence selected from any one of SEQ ID NO: 41-51, or a fragment thereof, or a functional homolog thereof having at least 70%, such as at least 80%, such as at least 85%, for example at least 90%, such as at least 95%, for example at least 98% identity to any of those sequences or fragments.
- a fusion polypeptide of the present invention is encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NO: 52-68.
- the fusion polypeptide is encoded by a nucleic acid comprising a sequence selected from any one of SEQ ID NO: 52-68, or a fragment thereof, or a functional homolog thereof having at least 70%, such as at least 80%, such as at least 85%, for example at least 90%, such as at least 95%, for example at least 98% identity to any of those sequences or fragments.
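The identity thresholds can be illustrated with a deliberately simplified calculation. The sketch assumes two sequences that are already aligned and of equal length; the text does not specify how identity is computed, and in practice a sequence alignment tool would be used. The function names and the example sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)

def meets_threshold(a, b, threshold=70.0):
    """True if the two sequences reach the given percent identity."""
    return percent_identity(a, b) >= threshold

# Illustrative pair differing at one of eleven positions (K versus R).
reference = "SRPKPHHHHHH"
variant = "SRPRPHHHHHH"
print(round(percent_identity(reference, variant), 1))
print(meets_threshold(reference, variant, 90.0))
print(meets_threshold(reference, variant, 95.0))
```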
- the PXP-motif could be integrated into an amino acid linker sequence comprising at least 3 amino acids, and preferably in the range of 3-100 amino acids, where the sequence serves as a protease resistant linker to connect two recombinant polypeptides.
- the PXP tag can serve as proteolytic uncleavable linker to join together two recombinant polypeptides.
- the fusion polypeptide comprises the PXP-comprising protein tag inserted between a first (R1) and a second (R2) polypeptide.
- the fusion polypeptide of the invention has a general formula selected from the group consisting of: R1-PXP-R2, R1-PXP-(His)n-R2, R1-(His)n-PXP-R2, R1-PXP-(His)n-P-R2, R1-P-(His)n-PXP-R2, R1-PXP-(HZ)n-R2, R1-PXP-(HZ)n-P-R2, R1-(HZ)n-PXP-R2, and R1-P-(HZ)n-PXP-R2, where n is at least 1, preferably 1-10, such as 4-6, Z is any amino acid, but preferably an amino acid selected from the group consisting of Histidine (H), Gly
- the protein tag inserted between a first (R1) polypeptide and a second (R2) polypeptide of a fusion polypeptide of the invention comprises a site-specific protease cleavable site, such as a sequence selected from the group consisting of ENLYFQ/X (Tobacco etch virus protease, TEV), EVLFQ/GP (Human rhinovirus 3C protein), IEGR/ (Factor Xa), DDDDK/ (Enterokinase, EntK), LVPR/GS (Thrombin, Thr protease), DXXD/ (Caspase-3 protease), where "/" designates the cleavage position.
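The site notation, with a slash marking the cleavage position and X standing for any residue, can be translated into a small search routine. This is a minimal sketch that matches only the literal patterns listed above; it does not model real protease specificity, and the helper names are hypothetical. The example sequence is the tag given in this text as SEQ ID NO: 74.

```python
import re

# Recognition sites as written in the text; "/" marks the cleavage position.
SITES = {
    "TEV": "ENLYFQ/X",
    "HRV 3C": "EVLFQ/GP",
    "Factor Xa": "IEGR/",
    "Enterokinase": "DDDDK/",
    "Thrombin": "LVPR/GS",
    "Caspase-3": "DXXD/",
}

def _to_regex(fragment):
    """Turn site notation into regex text; X matches any uppercase residue code."""
    return "".join("[A-Z]" if c == "X" else c for c in fragment)

def site_regex(site):
    """Match the residues before '/', requiring those after '/' via lookahead."""
    before, after = site.split("/")
    return re.compile("(" + _to_regex(before) + ")(?=" + _to_regex(after) + ")")

def cleave(seq, protease):
    """Split seq at the first matching site; return [seq] if no site is found."""
    m = site_regex(SITES[protease]).search(seq)
    if m is None:
        return [seq]
    return [seq[:m.end()], seq[m.end():]]

print(cleave("MGHHHHHHPKPENLYFQGASH", "TEV"))
print(cleave("MGHHHHHHPKPENLYFQGASH", "Thrombin"))
```

In the TEV example the PXP-comprising polyhistidine portion stays on the N-terminal fragment, so cleavage separates the tag from whatever follows the site.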
- the tag inserted between a first (R1) polypeptide and a second (R2) polypeptide of a fusion polypeptide is not bound to be any specific length, but generally, the tag comprises at least 5 amino acids, and normally 5-50, such as 5-40 for example 5-30 or 5-20 amino acids.
- the site-specific protease cleavable site can be inserted at any position in the tag, and the tag also preferably comprises an epitope and/or affinity tag sequence as described elsewhere herein.
- a first polypeptide could for example serve as a solubility enhancer for the fusion protein and a second polypeptide could be any biologically active polypeptide, in particular polypeptides of industrial, pharmaceutical or scientific interest, or any other nature of interest.
- a solubility enhancer polypeptide can be positioned either N-terminally or C-terminally of the linker sequence.
- the solubility enhancer polypeptide can be chosen from any polypeptide sequence known to promote solubility.
- Non-limiting examples include Maltose Binding Protein (MBP), NusA, Trx, GST, SUMO, SET, DsbC, Skp, T7PK, GB1, and ZZ as described by Esposito D. et al., 2006.
- the fusion polypeptide comprises as R1 or R2, Maltose Binding Protein (MBP), which has been shown to promote recombinant expression, and yield of correctly folded recombinant fusion protein, when MBP is fused to a biologically relevant recombinant protein of interest via a linker sequence; cf. Dalken B.,
- the invention also in one aspect pertains to nucleic acid molecules encoding one or more polypeptide sequences of the invention, or fragments or functional variants thereof.
- the invention thus provides a nucleic acid encoding a polypeptide protein tag of the present invention; i.e. the invention provides a nucleic acid sequence encoding an epitope tag and/or affinity purification tag, comprising a Proline-X-Proline (PXP) amino acid motif, wherein X is selected from any amino acid, and X preferably is selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, or is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), such as selected from K, Q, and R; i.e.
- the PXP amino acid motif is PKP, PQP and/or PRP.
- the invention relates to a nucleic acid encoding a polypeptide polyHIS tag or modified polyHIS tag comprising the PXP motif, such as epitope/affinity tags having the general formula:
- n is at least 2, and preferably 2-10, and even more preferred 4-6.
- X is any amino acid, but preferably selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7.
- X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N).
- X is selected from K, Q, R and N; i.e. the PXP amino acid motif is PKP, PQP, PRP and/or PNP, or X is K or R; i.e. the PXP amino acid motif is PKP or PRP.
- a nucleic acid of the invention encodes a polypeptide sequence selected from the group consisting of SEQ ID NO: 6-14, 73-74, or part thereof.
- the invention also encompasses oligonucleotide primers, which may be used for incorporating a PXP-motif into an existing protein tag, such as an epitope tag, in order to produce a tag of the invention.
- the invention also relates to nucleic acids encoding a PXP-motif of the invention and/or a modified protein tag, such as a modified epitope tag of the invention, which may be used for introducing the motif or tag into a first polypeptide of interest.
- the nucleic acid of the invention is in one
- the invention also provides a nucleic acid sequence encoding a fusion polypeptide of the present invention, such as a fusion polypeptide comprising a first polypeptide sequence fused to at least one protein tag of the present invention, as described herein above.
- a nucleic acid of the present invention encodes a fusion polypeptide selected from the group consisting of SEQ ID NO: 41-51 or part thereof, or a functional homolog thereof having at least 70%, such as at least 80%, such as at least 85%, for example at least 90%, such as at least 95%, for example at least 98% identity to any of those sequences or fragments.
- a nucleic acid of the present invention is selected from the group consisting of SEQ ID NO: 52-68, or a fragment thereof, or a functional homolog thereof having at least 70%, such as at least 80%, such as at least 85%, for example at least 90%, such as at least 95%, for example at least 98% identity to any of those sequences or fragments.
- nucleic acids of the present invention may be incorporated into a nucleic acid vector, for example for cloning and/or expression.
- a nucleic acid vector comprising a nucleic acid sequence encoding a polypeptide protein tag of the present invention; i.e. a nucleic acid sequence encoding an affinity purification tag or epitope tag, comprising a Proline-X-Proline (PXP) amino acid motif, as defined herein above.
- the invention also relates to a nucleic acid vector comprising a nucleic acid sequence encoding a fusion polypeptide of the present invention, such as a fusion polypeptide comprising a first polypeptide sequence fused to at least one protein tag, comprising a Proline-X-Proline (PXP) amino acid motif, as defined herein above, such as PKP, PQP, PRP and/or PNP
- the nucleic acid vector is in a preferred embodiment an expression vector, such as a prokaryotic expression vector or a eukaryotic expression vector, for example a yeast or mammalian expression vector.
- Numerous vectors are available and the skilled person will be able to select a useful vector for the specific purpose.
- the vector may, for example, be in the form of a plasmid, cosmid, viral particle or artificial chromosome.
- the appropriate nucleic acid sequence may be inserted into the vector by a variety of procedures, for example, DNA may be inserted into an appropriate restriction endonuclease site(s) using techniques well known in the art.
- the vector may furthermore comprise one or more of a signal sequence, an origin of replication, one or more marker genes, an enhancer element, a promoter, and a transcription termination sequence.
- the vector may also comprise additional sequences, such as enhancers, poly-A tails, linkers, polylinkers, operative linkers, multiple cloning sites (MCS), STOP codons, internal ribosomal entry sites (IRES) and host homologous sequences for integration or other defined elements.
- the vector is preferably an expression vector, comprising the nucleic acid operably linked to a regulatory nucleic acid sequence directing expression thereof in a suitable cell.
- a suitable host cell such as a yeast cell or prokaryotic cell, such as Escherichia coli.
- the vector may be any eukaryotic expression vector, for example a mammalian expression vector, or a yeast vector.
- the vector may comprise at least one intron, which will facilitate the transport from the nucleus to the cytoplasm of the vector encoded RNA.
- the vector is capable of expressing RNA in the cytoplasm by cytoplasmic transcription, which can be translated into polypeptide.
- nucleic acid sequence encoding a protein tag or a fusion polypeptide of the invention may be incorporated into the vector by any method available to the skilled person.
- the nucleic acid sequence may be incorporated by standard cloning techniques, involving PCR amplification, nucleolytic restriction digestion and ligation, transformation/transfection/recombination into host organisms and/or marker selection.
- the invention also in one aspect relates to the use of an oligonucleotide comprising a sequence encoding a protein tag comprising a Proline-X-Proline (PXP) amino acid motif, as defined herein above, or part thereof, for introducing said protein tag or part thereof in a nucleic acid vector, for example a cloning vector and/or expression vector.
- the invention also provides a use of said oligonucleotide for fusing said protein tag or part thereof to a nucleic acid sequence encoding a first polypeptide sequence in a fusion polypeptide of the invention, as defined above.
- the oligonucleotide claimed for such use comprises or consists of a sequence selected from the group consisting of SEQ ID NO: 15-40, 69-72 and/or 75-79 or a PXP-encoding part thereof.
- When the PXP-comprising tag is incorporated into a fusion polypeptide as a C-terminal tag, it should include a stop codon (TAA/UAA, TAG/UAG, TGA/UGA in DNA/RNA respectively) after the last histidine or proline in order to terminate translation of the transcript.
- the protein tag may be an epitope or affinity tag having the following general polypeptide sequences: PXP-(HZ)n-P-STOP or (HZ)n-P-PXP-STOP, PXP-(HZ)n-STOP or (HZ)n-PXP-STOP, PXP-(His)n-STOP, (His)n-PXP-STOP or PXP-(His)n-PXP-STOP.
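The stop-codon requirement for a C-terminal tag can be sketched as a small helper that appends a stop codon to an in-frame DNA coding sequence. The function name and the example coding sequence are hypothetical; the codons used (CCG for proline, AAA for lysine, CAT for histidine) are standard genetic code assignments chosen only for illustration, not the codons of any disclosed construct.

```python
STOP_CODONS = ("TAA", "TAG", "TGA")

def add_stop_codon(cds, stop="TAA"):
    """Append a stop codon to an in-frame DNA coding sequence."""
    cds = cds.upper()
    if stop not in STOP_CODONS:
        raise ValueError("unknown stop codon")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence is not a whole number of codons")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("coding sequence already contains an in-frame stop codon")
    return cds + stop

# Hypothetical coding sequence for P-K-P-(His)6; the codon choice is illustrative.
PKP_HIS6 = "CCGAAACCG" + "CAT" * 6

print(add_stop_codon(PKP_HIS6))
```

Placing the stop codon directly after the last histidine or proline codon keeps the tag as the final translated element of the fusion polypeptide.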
- the PXP-comprising tag sequence can be fused into a vector with a custom stop codon.
- the PXP sequence can also be fused into a vector with a custom protein tag, such as an epitope or affinity tag, engineered in order to adapt the flanking PXP tag with a suitable restriction site in order to generate a maximum of 2 amino acid frame distance to the PXP and the tag.
- the vector is engineered so that the flanking PXP-motif is incorporated into custom polyhistidine tags.
- an amino acidic linker between the PXP motif and the protein of interest for example engineered to also be resistant to proteolysis.
- the polylinker is for example structured for incorporating the PXP tag into prokaryotic (e.g. E. coli) or eukaryotic (e.g. P. pastoris, S. cerevisiae or Kluyveromyces lactis) expression systems.
- the PXP-comprising protein tag of the invention is a PXP-comprising polyHIS tag or modified polyHIS tag with an amino acid linker sequence.
- the PXP-comprising protein tag with a polylinker sequence consists of or comprises an amino acid sequence selected from the group consisting of INASAPKPHHHHHH, INASAPKPHQHRHKHQP,
- the nucleic acid vector of the invention preferably comprises a nucleic acid sequence encoding a PXP-comprising protein tag and polylinker sequence which consists of or comprises an amino acid sequence selected from the group consisting of INASAPKPHHHHHH, INASAPKPHQHRHKHQP,
- a recombinant host cell comprising a nucleotide sequence, fusion polypeptide, a nucleic acid sequence, and/or a nucleic acid vector of the present invention, and described elsewhere herein.
- the recombinant host cell preferably comprises a recombinant nucleic acid vector comprising a polynucleotide of the present invention, wherein the vector is introduced into a host cell so that the vector is maintained as a chromosomal integrant or as a self-replicating extrachromosomal vector.
- the term "host cell” encompasses any progeny of a parent cell that is not identical to the parent cell due to mutations that occur during replication. The choice of a host cell will to a large extent depend upon the gene encoding the polypeptide and its source.
- the host cell may also be any suitable host microorganism. Accordingly, the host cell is preferably selected from host
- microorganisms belonging to a phylum selected from the group consisting of yeasts, fungi, bacteria, algae or plants.
- the host cell may be any prokaryote or eukaryote, such as a mammalian, insect, plant, or fungal cell.
- the host cell is for example selected from eukaryotic or prokaryotic cells, for example the host is selected from mammalian cells, human cells, mouse cells, plant cells, Chinese Hamster Ovary (CHO) cells, and insect cells.
- the host cell is a prokaryotic cell, such as Escherichia coli, or a variant of E. coli.
- the host cell may also be a eukaryotic cell.
- the host cell is a yeast cell, such as a methylotrophic yeast.
- the host cell can be Saccharomyces cerevisiae, Kluyveromyces lactis, or Pichia pastoris.
- the host cell, e.g. Saccharomyces cerevisiae, Kluyveromyces lactis, Pichia pastoris, or E. coli, comprises in its genome and/or in a nucleic acid vector, such as a plasmid vector, a nucleic acid sequence encoding an affinity purification tag, comprising a Proline-X-Proline (PXP) amino acid motif, wherein X is selected from any amino acid, and X preferably is selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, or is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), such as selected from K, Q, and R; i.e.
- the PXP amino acid motif is PKP, PQP and/or PRP.
- the host cell comprises in its genome and/or in a plasmid vector a nucleic acid sequence encoding a polypeptide polyHIS tag or modified polyHIS tag comprising the PXP motif, such as epitope or affinity tags having the general formula: PXP-(His)n, (His)n-PXP or PXP-(His)n-PXP, PXP-(HZ)n or (HZ)n-PXP, PXP-(HZ)n-P or (HZ)n-P-PXP, as defined elsewhere herein.
- the nucleic acid may encode a fusion polypeptide comprising a gene of interest for transgenic expression in the host organism and a PXP-comprising tag, as described elsewhere herein.
- the invention also relates to a kit comprising one or more of the components of the invention, such as one or more polypeptide tags, fusion polypeptides, nucleic acid sequences, nucleic acid vectors and/or host cells of the present invention.
- the present invention relates to a kit comprising a nucleic acid sequence encoding an affinity purification tag, comprising a Proline-X-Proline (PXP) amino acid motif, wherein X is selected from any amino acid, and X preferably is selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7, or is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), such as selected from K, Q, and R; i.e. the PXP amino acid motif is PKP, PQP, PRP and/or PNP.
- kits comprising a nucleic acid encoding a polypeptide polyHIS tag or modified polyHIS tag comprising the PXP motif, such as epitope/affinity tags having the general formula:
- kits of the invention may also comprise a nucleic acid sequence encoding a fusion polypeptide of the present invention, such as a fusion polypeptide comprising a first polypeptide sequence of a gene of interest for expression fused to at least one PXP-comprising protein tag of the invention.
- kits of the invention may comprise one or more nucleic acid vectors of the invention, e.g. nucleic acid vectors comprising a nucleic acid sequence encoding a polypeptide protein tag of the present invention; i.e.
- kits may also comprise a nucleic acid vector comprising a nucleic acid sequence encoding a fusion polypeptide of the present invention, such as a fusion polypeptide comprising a first polypeptide sequence fused to at least one protein tag, comprising a Proline-X-Proline (PXP) amino acid motif, as defined herein above.
- the kit may comprise one or more oligonucleotide primers for PCR amplification of a specific target, wherein the PCR primers comprise a nucleic acid sequence of the invention or part thereof, for example a nucleic acid sequence encoding a PXP-comprising protein tag, as defined herein above.
- the kit may comprise one or more additional components, which are relevant for cloning and amplification of a gene of interest, and/or introduction of a PXP-comprising tag to a gene of interest.
- additional components include, for example, host cells (competent cells), buffers (e.g. restriction buffers, ligation buffers, transformation buffers, PCR buffers etc.) and enzymes (e.g. restriction enzymes, ligases, polymerases etc.).
- the nucleic acids and specifically nucleic acid vectors of the present invention are specifically suitable for production of recombinant protein.
- the invention therefore, also provides methods for producing recombinant protein by utilizing the PXP-comprising modified tag of the present invention. This method comprises culturing the host cell under suitable growth conditions, which allows the expression of a fusion polypeptide of the invention.
- the produced polypeptide is then preferably recovered from the microorganism and/or host cell, and/or recovered from the incubation broth of the host cell and/or
- the present invention pertains to a method of producing recombinant protein, said method comprising:
- Suitable culturing conditions will depend on the choice of host organism; for example, yeast cells generally require incubation temperatures around 30°C, whereas bacteria such as Escherichia coli, and mammalian cells, generally thrive at 37°C. The composition of culture media also depends on the choice of host organism, but suitable growth conditions and culture media will be known to the skilled practitioner. Both liquid and solid growth media may be employed.
- the level of expression of the fusion protein depends on the choice of expression vector, and the choice of promoter in the vector. Expression can be modulated by use of inducible promoters, which are activated by certain external stimuli, such as metabolites, temperature, chemicals or nutrients. Alternatively, constitutive promoters mediate a constant expression of the fusion polypeptide. Also the method of purification may depend on the choice of host as well as by other factors. Secreted proteins may be isolated from the culture broth, whereas intracellular proteins can be purified from cell extract, prepared from host cells collected from the media. For example, cells may be collected by centrifugation, and opened by chemical or mechanical methods, for example incubation in lysis buffer. Such methods are well- known in the art.
- the fusion polypeptide is preferably isolated by chromatographic methods, such as affinity chromatography.
- the fusion polypeptide is isolated from the culture by metal ion affinity chromatography or immobilized metal ion affinity chromatography (IMAC) using the culture broth or a cell extract as starting material.
- This methodology is based on the specific coordinate covalent bond of amino acids, particularly histidine, to metals.
- the technique works by allowing proteins with an affinity for metal ions to be retained in a column containing immobilized metal ions, such as cobalt, nickel or copper for the purification of histidine-containing proteins or peptides, or iron, zinc or gallium for the purification of phosphorylated proteins or peptides. Many naturally occurring proteins do not have an affinity for metal ions; therefore a recombinant fusion polypeptide of the present invention comprising a PXP-comprising polyHIS tag or modified polyHIS tag is particularly suitable.
- Methods used to elute the protein of interest include changing the pH, or adding a competitive molecule, such as imidazole.
- isolation of the fusion polypeptide comprises
- a. contacting a lysate or extract of the host cell containing the fusion polypeptide with a metal chelate resin, b. optionally, washing the resin-fusion polypeptide complex with a buffer to remove unbound proteins and other materials, and
- the metal chelate resin is preferably NTA (nitrilotriacetic acid) or IDA (iminodiacetic acid)-charged agarose/silica resin, and the metal chelate preferably comprises a bivalent cation, such as Co2+, Zn2+, Cu2+, Ca2+, Cd2+, or Ni2+.
- the tag of the present invention may be removed subsequent to its purification or in connection with its purification.
- the tag is removed when the tagged polypeptide is bound to a resin, thereby releasing the mature polypeptide without tag from the resin, and collecting the mature polypeptide as the eluate.
- the PXP-tag is removed subsequent to its elution from the resin matrix.
- proteases are available, which may be used to remove the PXP-tag, and any suitable protease may be used according to the present invention.
- the PXP-tag is removed by a protease selected from the group consisting of Tobacco etch virus protease, TEV (ENLYFQ/X), Human rhinovirus 3C protein (EVLFQ/GP), Factor Xa (IEGR/), Enterokinase, EntK (DDDDK/), Thrombin, Thr protease (LVPR/GS) and Caspase-3 protease (DXXD/).
- the cleavage sites of the respective proteases are indicated in parentheses, where "/" designates the cleavage position.
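The "/" notation above can be turned into a small search routine. The following is a minimal sketch, not the method of the application: it assumes the recognition sites exactly as they are written in the text, treats "X" as any residue, and uses a hypothetical helper name (`cut_positions`). The example substrate is the TEV-site-containing N-terminal sequence quoted for vector 11.

```python
import re

# Recognition sites as listed in the text, split at the "/" cut mark:
# (residues before the cut, residues required after the cut).
# "X" (any residue) is written as the regex wildcard ".".
SITES = {
    "TEV": ("ENLYFQ", "."),
    "HRV 3C": ("EVLFQ", "GP"),
    "Factor Xa": ("IEGR", ""),
    "Enterokinase": ("DDDDK", ""),
    "Thrombin": ("LVPR", "GS"),
    "Caspase-3": ("D..D", ""),
}


def cut_positions(sequence, protease):
    """Return the number of residues preceding each predicted cut."""
    before, after = SITES[protease]
    downstream = f"(?={after})" if after else ""
    # The outer lookahead keeps matches zero-width, so overlapping sites are found.
    pattern = re.compile(f"(?=({before}){downstream})")
    return [m.end(1) for m in pattern.finditer(sequence)]


# N-terminal sequence quoted in the text for vector 11 (PKP tag plus TEV site).
tag = "MGHHHHHHPKPENLYFQGASH"
for pos in cut_positions(tag, "TEV"):
    print(tag[:pos], "/", tag[pos:])
```

The sketch only matches sequence patterns; real protease specificity also depends on structure and accessibility, which a pattern search cannot capture.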
- the PXP-comprising tag of the present invention is removed by Tagzyme/DAPase. Removal by DAPase is illustrated in figure 9.
- the PXP-comprising tag of the present invention is removed by an A-type carboxypeptidase, such as bovine carboxypeptidase A (BoCPA) or Metarhizium anisopliae carboxypeptidase A (MeCPA).
- QRLLDDTSGKHHHHHH, after BoCPA treatment: protein-QRLLDDTSGK
- Examples of BoCPA removal of PXP-comprising HIS-tags of the present invention (substrate, then product after BoCPA treatment):
- protein-QRLLDDTSGPKPHHHHHH -> protein-QRLLDDTSGPKPH
- protein-KEEDDHRPKPSHLLVHHHH -> protein-KEEDDHRPKPS(S)
- protein-QTSSLISPPRPSFSHHHHHH -> protein-QTSSLISPPRPS
Examples
- the target expression system used is P. pastoris (strain KM71H, MutS) for extracellular expression, using a modified version of the Life Technologies pPICZ_alpha_A vector.
- the maltose binding protein (MBP, accession ACI46135) was chosen as test protein to be expressed as a fusion with the alpha mating peptide and the tag, using the USER™ (Uracil-Specific Excision Reagent, New England Biolabs) based cloning strategy (Nour-Eldin et al., 2006).
- the MBP sequence was amplified from vector pKLAC-malE
- PCR conditions were as follows: 1 min at 95°C for initial denaturation, then 36 cycles of 45 s at 95°C, 30 s of annealing at 64°C and 1.5 min of elongation at 72°C.
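As a quick arithmetic check of the cycling program above, the sketch below adds up the programmed hold times. It assumes that the `"` marks in the text denote seconds, and it ignores ramp times and any final extension or hold step, none of which are stated; the variable and function names are illustrative only.

```python
# Each step is (label, temperature in °C, duration in seconds).
INITIAL = [("initial denaturation", 95, 60)]
CYCLE = [
    ("denaturation", 95, 45),
    ("annealing", 64, 30),
    ("elongation", 72, 90),
]
N_CYCLES = 36


def total_seconds(initial, cycle, n_cycles):
    """Sum the hold times of the initial steps plus n_cycles repeats of the cycle."""
    one_cycle = sum(duration for _, _, duration in cycle)
    return sum(duration for _, _, duration in initial) + n_cycles * one_cycle


seconds = total_seconds(INITIAL, CYCLE, N_CYCLES)
print(f"{seconds} s = {seconds / 60:.0f} min of programmed hold time")
```

Under these assumptions the program amounts to 6000 s, i.e. 100 min, of hold time.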
- the PCR product was purified from a 1.5% agarose gel and treated with the USER™ enzyme mix (NEB).
- the Vector 01 was opened with Pac I and Nt.BbvCI enzymes (NEB) and likewise purified from gel.
- About 200 ng of the USER™-treated MBP PCR product was mixed with 50 ng of the Vector 01 opened as mentioned above, and this insert/vector mix was left at RT for 30 min prior to standard transformation of E. coli TOP10' competent cells.
- the positive colonies were identified by restriction analysis and sequencing at Eurofins/MWG.
- the elution was carried out with 0.5 M imidazole and 1 M NaCl, pH 8.0.
- the eluted peak fractions, as monitored by the OD at 280 nm above the threshold, were collected, dialyzed and concentrated with Vivaspin columns (GE Healthcare).
- the recovery yield was calculated to be up to 95% of the supernatant recombinant proteins as judged by western blotting with anti penta HIS (Qiagen) or C-terminal 6xHis antibodies (Life Technology).
- Vectors 2-4 were produced by the same methods as described in example 1.
- the pPICZ_alpha_A vector opened with Xho I and Xba I was ligated respectively to the linker created by mixing the oligos V02_up and V02_down (table 1), for obtaining the vector 2 (fig. 1B), the oligos V03_up and V03_down (table 1), for obtaining the vector 3 (fig. 1C) and the oligos V04_up and V04_down (table 1), for obtaining the vector 4 (fig. 2A).
- the resulting vectors 2, 3 and 4, used with the USER cloning technology described in Example 1, yielded a recombinant maltose binding protein with, respectively, the following C-terminal sequences: SEQ ID NO: 7 in vector 2, SEQ ID NO: 8 in vector 3 and SEQ ID NO: 9 in vector 4.
- the same procedures for making recombinant proteins using the fermentation conditions in example 1 apply in this example 2.
- the vector 5 was built in two stages. The first step was to introduce into pPICZ_alpha_A the linker formed by oligos V05_up and V05_down (table 1) in the Xho I/Xba I site.
- the second step was to introduce the second linker formed by the combination of the two oligos V05a_up and V05a_down (table 1) into the restriction site Xba I/Sal I of the latter formed vector.
- the resulting Vector 5 polylinker is illustrated in Fig. 2B.
- the vector 5 was then opened with EcoRI and Xba I, while the MBP was amplified by PCR with the primers MBP_EcoR_l_fw/MBP_Xba_l_rv. After the PCR was finished, the PCR product was cleaved with EcoRI/Xba I, and the cleaned-up fragment was ligated into the corresponding EcoRI/Xba I site of Vector 5 with 10 U of T4 Ligase (Fermentas, Thermo Scientific) for 1 h at 16 °C.
- the positive colonies were identified by restriction analysis and sequencing at
- the shuttle vector 5 containing the MBP was amplified in E. coli TOP10' cells (Life Technologies) by chemical transformation. Pichia KM71H transformation and fermentation were carried out as in example 1. Proteins were also purified as described in example 1.
Examples 6-8
- Vectors 6-8 were produced by the same methods as described in example 5.
- Vector 5 was chosen as template. It was opened with Xba I and Sal I and ligated respectively to the linker created by mixing the oligos V06_up and V06_down (table 1), for obtaining the vector 6 (fig. 2C), the oligos V07_up and V07_down (table 1), for obtaining the vector 7 (fig. 3A) and the oligos V08_up and V08_down (table 1), for obtaining the vector 8 (fig. 3B).
- the resulting vectors 6, 7 and 8 were used with the conventional ligase cloning technology described in Example 5, yielding a recombinant maltose binding protein with, respectively, the following C-terminal sequences:
- the prokaryotic expression was performed in the E. coli expression strain BL21 (DE3) pLysS from Promega Corporation, while the cloning was carried out in TOP10' chemically competent E. coli cells (Life Technology).
- the vector pET15b (Novagen, Merck) was opened with Nco I and Nde I, and the linker formed by the two oligos V09_up and V09_down was ligated into it using 10 U of T4 Ligase (Fermentas, Thermo Scientific) for 1 h at 16 °C.
- the newly ligated vector was used for transformation of TOP10' cells, which were used for amplification.
- the correct vector 9, having the N-terminal amino acid sequence MGSSHHHHHHPKP (SEQ ID NO: 14), was used for introducing the PCR product of MBP obtained with the primers MBP_Nde_l_fw/MBP_Xho_l_rv. After double digestion of both the vector 9 and the PCR product with Nde I/Xho I, these were purified by 1.5% agarose gel electrophoresis.
- the MBP was then ligated into the Nde I/Xho I site of vector 9 and amplified in E. coli TOP10' cells as described herein above. The positive colonies were identified by restriction analysis and sequencing at Eurofins/MWG. Chemically competent BL21 (DE3) pLysS cells were then transformed with the vector 9 containing the MBP.
- IPTG (isopropyl β-D-1-thiogalactopyranoside, Sigma-Aldrich)
- the cells were harvested by centrifugation and lysed with the BugBuster reagent (Novagen, Merck).
- the lysate was sonicated and, after first adjusting its pH to 8.0 with 1 N NaOH, passed through a Ni-IDA (Protino resin, Macherey-Nagel) silica column (Omnifit, Diba) using an ÄKTA FPLC machine (GE Healthcare).
- the chromatography steps were similar to those in example 1, as was the recovery yield.
- the prokaryotic expression was performed in the E. coli expression strain T7 Shuffle from New England Biolabs, while the cloning was carried out in TOP10' chemically competent E. coli cells (Life Technology).
- Two new vectors derived from pET15b (Novagen, Merck) were created by the In-Fusion cloning technique (Clontech, USA).
- pET15m (Dionisio et al., 2011) was opened with Nco I and Nde I, and the linker formed by the two oligos V10_up (SEQ ID NO: 69) and V10_down (SEQ ID NO: 70) was introduced by In-Fusion reaction for 5 min at 50 °C, according to the manufacturer's instructions.
- the newly generated vector 10 was used for transformation of TOP10' cells. Likewise, the oligos V11_up (SEQ ID NO: 71) and V11_down (SEQ ID NO: 72) were introduced into pET15m as described above.
- the vector pET15m and the vector 10 and 11 having these latter the N-terminal amino acid sequence respectively NH2- MGHHHHHHPKPASHM (SEQ ID NO: 73) and NH2- MGHHHHHHPKPENLYFQGASH (Seq ID.
- Thr_h_b (Genbank nucleotide accession AY072771 and protein accession AAL67139) obtained with the primers Thr_Nde_l_fw (SEQ ID NO: 75) and Thr_EcoR_l_rv (SEQ ID NO: 76).
- Thr_Nde_l_fw SEQ ID NO: 75
- Thr_EcoR_l_rv SEQ ID NO: 76
- the Thioredoxin was then ligated into the Nde I/EcoR I site of these vectors and amplified in E. coli TOP10' cells as described herein above.
- the positive colonies were identified by restriction analysis and sequencing at Eurofins/MWG.
- Chemically competent BL21 (DE3) T7 Shuffle cells were transformed with the vectors containing the Thr_h_b.
- the protein expression was carried out in 200 ml of LB (Luria-Bertani) medium, which was inoculated with 10 ml of pre-grown inoculum and allowed to grow at 37 °C until the OD600 of the culture was around 0.6. IPTG (isopropyl β-D-1-thiogalactopyranoside, Sigma-Aldrich) was then added to a final concentration of 0.1 mM and the culture was allowed to grow for 12 h at 26 °C. The cells were harvested by
- the tagged proteins were eluted by an imidazole gradient from 0 to 0.5 M, created using buffer A (20 mM Tris-HCl pH 8.0, 0.2 M NaCl) and buffer B (20 mM Tris-HCl, 0.5 M imidazole, 0.6 M NaCl, pH 8.0).
- SDS-PAGE/Coomassie and the elution profile of the Thioredoxin h isoform b as coming from pET15m or vector 10/11.
- Recombinant thioredoxin h proteins coming from vector pET15m were eluted with 125 mM Imidazole judging from the elution peak of the Ni/NTA column (figure 10a, filled circle line), meanwhile recombinant PKP or PKP TEV tagged thioredoxin h were eluted with 200 mM Imidazole (figure 10a, open circle and inverted filled triangle lines).
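The elution concentrations quoted above can be related to positions on the buffer A / buffer B gradient. The sketch below assumes an ideal linear mix of the two buffers, with compositions taken from the text (buffer A: no imidazole, 0.2 M NaCl; buffer B: 0.5 M imidazole, 0.6 M NaCl); the function names are illustrative and not part of the application.

```python
# Buffer compositions from the text, in mM.
BUFFER_A = {"imidazole": 0.0, "NaCl": 200.0}
BUFFER_B = {"imidazole": 500.0, "NaCl": 600.0}


def fraction_b_for(imidazole_mM):
    """Fraction of buffer B that gives the requested imidazole concentration."""
    span = BUFFER_B["imidazole"] - BUFFER_A["imidazole"]
    return (imidazole_mM - BUFFER_A["imidazole"]) / span


def composition(fraction_b):
    """Concentrations (mM) of an ideal linear mix of buffers A and B."""
    return {
        name: BUFFER_A[name] + fraction_b * (BUFFER_B[name] - BUFFER_A[name])
        for name in BUFFER_A
    }


for label, peak in [("pET15m tag", 125.0), ("PKP / PKP-TEV tag", 200.0)]:
    f = fraction_b_for(peak)
    print(f"{label}: {f:.0%} B, NaCl about {composition(f)['NaCl']:.0f} mM")
```

Under this assumption the 125 mM peak corresponds to 25% buffer B and the 200 mM peak to 40% buffer B, so the PKP-tagged proteins elute later in the gradient and at a somewhat higher salt concentration.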
- the figure 10b represents the 10-20% Tris-Glycine SDS-PAGE/Coomassie (Life Technologies) of the summary of the purification steps.
- lane 1 there are the all blue precision standard prestained markers (Bio-Rad, cat. num.
- NTR NADPH dependent Thioredoxin reductase
- X is selected from the group consisting of any amino acid, but is preferably an amino acid having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, or X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N).
- X is selected from the group consisting of Lysine (K), Arginine (R) and Glutamine (Q);
- Z is an amino acid, such as preferably an amino acid selected from the group consisting of Glycine (G), Serine (S), Threonine (T), Glutamic acid (E), Aspartic acid (D), Glutamine (Q) and Lysine (K), and
- n is at least 1 , preferably 2-10, and more preferred 4-6.
- Legend to the sequence figures: the Alpha Mating Factor (aMF) is shown in bold, the Maltose Binding Protein (MBP) is underlined, and the PXP tag is shown in italic and bold.
- C-terminal PXP tags: SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12 and SEQ ID NO: 13. N-terminal PXP tag: SEQ ID NO: 14.
- Alpha Mating Factor (aMF, in bold) fusion protein containing the USER™ site and the C-terminal PXP tag (SEQ ID NO: 6, in italic and bold) of Vector 1.
- Alpha Mating Factor (aMF, in bold) fusion protein containing the USER™ site and the C-terminal PXP tag (SEQ ID NO: 8, in italic and bold) of Vector 3.
- Alpha Mating Factor (aMF, in bold) fusion protein containing the USER™ site and the C-terminal PXP tag (SEQ ID NO: 9, in italic and bold) of Vector 4.
- EcoRI / Xba I conventional ligation with sticky ends.
- Nde I / Xho I sticky ends.
- Vector 9 polylinker obtained from pET15b (Novagen) with the Nco I / Nde I (underlined) insertion of a new N-terminal PXP tag (SEQ ID NO: 14, in italic and bold).
- N-terminal amino acid sequence related to vector 10; the PKP site is in bold: SEQ ID NO: 74.
- N-terminal amino acid sequence related to vector 11; the PKP site is in bold and the TEV protease site is underlined.
- Sequences labelled NTR_ATG_fw, NRR_C_t_6xHIS and NTR_C_t_PKP_6xHIS relate to the NADPH dependent Thioredoxin reductase (NTR).
Landscapes
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Medicinal Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Biophysics (AREA)
- Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Analytical Chemistry (AREA)
- Biomedical Technology (AREA)
- Biotechnology (AREA)
- Microbiology (AREA)
- General Engineering & Computer Science (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
- Peptides Or Proteins (AREA)
Abstract
A protein tag is provided, which comprises a Proline-X-Proline (PXP) amino acid motif, wherein X is selected from any amino acid. Furthermore, fusion polypeptides and host cells comprising the protein tag are provided, as well as nucleic acids encoding these. Moreover, methods are provided of producing recombinant protein by culturing the host cell, expressing a fusion polypeptide, and isolating the fusion polypeptide from the culture.
Description
Proteolytic resistant protein affinity tag
Field of invention
The present invention relates to a protease resistant protein tag with high affinity for IMAC resins, which tag comprises a proline-X-proline motif.
Background of invention
The C-/N-terminal tagging of recombinant proteins is a well-known procedure that enables both immunological detection of expression level and downstream purification of the recombinant protein by exploiting the ability of the tag to bind to a specific affinity column. The use of high affinity tags also increases the yield obtainable from recombinant production of proteins, which is a very important factor with great economic impact. However, different tags vary greatly with respect to yield of recombinant protein, and the position of the tag in a fusion protein also influences purification yield.
The most used purification tag is the polyhistidine tag, typically 6x His, which consists of six consecutive histidine residues (HHHHHH) that can be incorporated into genetic constructs for making recombinant proteins by PCR and/or by means of restriction sites present in an expression vector. Typically, the linker amino acid sequence between the hexahistidine tag and the recombinant protein is the cause of unwanted proteolysis, which results in loss of the tag and hence diminishes the recovery yield. Approaches to prevent this unwanted proteolysis by host proteases in Escherichia coli, yeasts and mammalian/plant cell cultures include lowering the temperature (Rozkov et al., 2000; Li et al., 2001), changing the pH of the media (Kang et al., 2000; Jahic et al., 2003) and supplying either casamino acids or protease inhibitors (Clare et al., 1991; Shi et al., 2003; Holmquist et al., 1997). These approaches, however, suffer from the drawbacks that protein yield is considerably reduced due to reduced biomass levels, and that costs rise due to the considerable price of protease inhibitors for large fermentation volumes.
Many attempts have been made to reduce such unwanted proteolysis, for example the use of protease-free (knock-out) E. coli strains (Rozkov et al., 2004) for recombinant protein expression. Protease knock-out strains of eukaryotic hosts, e.g. Saccharomyces cerevisiae or Pichia pastoris, have also been produced, but these strains also suffer from high loss of the protein tags due to the variability of the linker between the recombinant protein and the tag itself.
Summary of invention
The present invention provides a modified protease resistant protein tag. This tag is not degraded by host proteases, and therefore delivers high yields of recombinant protein, when expressed and purified from a suitable host organism.
The protein tag of the present invention comprises a short flanking amino acid motif, PXP (Proline-X-Proline), where X can be any amino acid, which, for example when added to an N- or C-terminal polyhistidine epitope tag (e.g. a 6xHis tag), drastically reduces the loss of this epitope tag. In fact, with the canonical hexahistidine tag, unwanted proteolysis usually occurs both in vivo and during purification. Additionally, new improved variants of the polyhistidine tag are provided, which increase the binding to metal-chelate resins commonly used for purification of tagged recombinant proteins. Hence the combination of the PXP tag and any protein tag, such as the canonical and/or new polyhistidine tags, can generate an uncleavable C-/N-terminal affinity tag for recombinant protein purification. In one aspect the present invention relates to a protein tag, such as an epitope tag and/or an affinity purification tag, comprising a Proline-X-Proline (PXP) amino acid motif, wherein X is selected from any amino acid.
The tag preferably comprises a polyhistidine tag, where the PXP motif is fused to the N- terminal and/or C-terminal end of the polyhistidine tag, thus having the formula: PXP-(His)n, (His)n-PXP or PXP-(His)n-PXP, where n is at least 2, and preferably 2-10, and even more preferred 4-6.
Alternatively, the tag preferably has the general formula: PXP-(HZ)n, where n is at least 1, preferably 1-10, Z is an amino acid, such as an amino acid selected from the group consisting of Histidine (H), Glycine (G), Serine (S), Threonine (T), Glutamic acid (E), Aspartic acid (D), Glutamine (Q) and Lysine (K), and X is any amino acid, but preferably an amino acid having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above.
In another embodiment, X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N).
Preferred protein tags, such as epitope or affinity tags, comprise the amino acid motif PKP, PQP, PHP, PNP and/or PRP.
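The polyhistidine formulas PXP-(His)n, (His)n-PXP and PXP-(His)n-PXP given above can be illustrated with a short sketch. The function name `pxp_his_tag` and the layout labels are hypothetical and only serve the illustration; the lower bound on n follows the "at least 2" stated for these formulas.

```python
def pxp_his_tag(x="K", n=6, layout="PXP-His"):
    """Assemble a PXP/polyhistidine tag for one of the three layouts in the text."""
    if len(x) != 1 or not x.isalpha():
        raise ValueError("x must be a single one-letter amino acid code")
    if n < 2:
        raise ValueError("the text requires n to be at least 2")
    pxp = f"P{x.upper()}P"   # the Proline-X-Proline motif
    his = "H" * n            # (His)n
    layouts = {
        "PXP-His": pxp + his,
        "His-PXP": his + pxp,
        "PXP-His-PXP": pxp + his + pxp,
    }
    return layouts[layout]


print(pxp_his_tag("K", 6, "PXP-His"))
print(pxp_his_tag("K", 6, "His-PXP"))
```

With X = K and n = 6, the first layout gives the PKPHHHHHH stretch found at the end of INASAPKPHHHHHH, and the second gives the HHHHHHPKP stretch found at the end of MGSSHHHHHHPKP.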
Preferred examples of epitope tags of the present invention are
a) INASAPKPHHHHHH
b) INASAPKPHQHRHKHQP
c) INASAPKPHGHTHGHSHGHP
d) INASAPKPHEHDHEHDHEHP
e) SRPKPHHHHHH
f) SRPKPHQHRHKHQP
g) SRPKPHGHTHGHSHGHP
h) SRPKPHEHDHEHDHEHP, and
i) MGSSHHHHHHPKP
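As a consistency check on the list above, the following sketch scans each listed tag for a Proline-X-Proline motif and reports the X residue together with the number of histidines. It is an illustration only; the dictionary and function names are assumptions, and positions are 0-based.

```python
import re

# Preferred tags a) to i) as listed in the text.
PREFERRED_TAGS = {
    "a": "INASAPKPHHHHHH",
    "b": "INASAPKPHQHRHKHQP",
    "c": "INASAPKPHGHTHGHSHGHP",
    "d": "INASAPKPHEHDHEHDHEHP",
    "e": "SRPKPHHHHHH",
    "f": "SRPKPHQHRHKHQP",
    "g": "SRPKPHGHTHGHSHGHP",
    "h": "SRPKPHEHDHEHDHEHP",
    "i": "MGSSHHHHHHPKP",
}

# Lookahead so that overlapping motifs (e.g. P-X-P-X-P) would all be reported.
PXP = re.compile(r"(?=P(.)P)")


def pxp_motifs(tag):
    """Return (position, X residue) for every P-X-P motif in the tag."""
    return [(m.start(), m.group(1)) for m in PXP.finditer(tag)]


for label, tag in PREFERRED_TAGS.items():
    print(label, tag, pxp_motifs(tag), "His count:", tag.count("H"))
```

Every listed tag contains exactly one such motif, and in each case it is PKP.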
In another aspect, the invention relates to a fusion polypeptide for expression in a host cell, said fusion polypeptide comprising a first polypeptide sequence fused to a protein tag, such as an epitope tag, of the present invention.
The fusion polypeptide in one embodiment further comprises a linker sequence between the protein tag and said first polypeptide; for example, the linker sequence comprises
a. a proteolytic cleavage site suitable for separating the first polypeptide from the protein tag, such as an epitope tag, or
b. one or more endonucleolytic cleavage sites.
The invention also in a further aspect pertains to a nucleic acid encoding a protein tag and/or a fusion polypeptide of the present invention.
Another aspect relates to a nucleic acid vector comprising such a nucleic acid sequence of the invention. The nucleic acid vector is in a preferred embodiment an expression vector, such as a prokaryotic expression vector or a eukaryotic expression vector, for example a yeast or mammalian expression vector.
Yet another aspect relates to a recombinant host cell comprising a nucleotide sequence, fusion polypeptide, a nucleic acid sequence, and/or a nucleic acid vector of the present invention. The host cell is for example selected from eukaryotic or prokaryotic cells, for example the host is selected from mammalian cells, human cells, mouse cells, plant cells, Chinese Hamster Ovary (CHO) cells, and insect cells.
In a further aspect, the present invention relates to a kit comprising a nucleic acid sequence, a nucleic acid vector and/or a host cell of the present invention.
Another aspect relates to the use of an oligonucleotide comprising a sequence encoding a protein tag, such as an epitope tag of the present invention, or part thereof, for introducing said protein tag or part thereof in a nucleic acid cloning vector and/or expression vector, and/or for fusing said protein tag or part thereof to a nucleic acid sequence encoding a first polypeptide sequence.
In a further aspect, the invention pertains to a method of producing recombinant protein, said method comprising:
(a) culturing a host cell of the invention under suitable growth conditions, which allows the expression of a fusion polypeptide of the invention, and
(b) isolating the fusion polypeptide from the culture.
The isolation of the fusion polypeptide preferably comprises
a. contacting a lysate or extract of said host cell containing the fusion polypeptide with a metal chelate resin,
b. optionally, washing the resin-fusion polypeptide complex with a buffer to remove unbound proteins and other materials, and
c. eluting the bound fusion polypeptide from the resin-fusion polypeptide complex.
The metal chelate resin is preferably NTA (nitrilotriacetic acid) or IDA (iminodiacetic acid)-charged agarose/silica resin, and the metal chelate preferably comprises a bivalent cation, such as Co2+, Zn2+, Cu2+, Ca2+, Cd2+, or Ni2+.
Description of Drawings
Fig. 1. A. Vector 01 polylinker based on pPICZ_alpha_A (Life Technology) backbone. The vector has been obtained by ligating the linker created with the oligos V01_up/V01_down to the vector pPICZ_alpha_A opened with Xho I and Xba I. The C-terminal of the recombinant protein will contain the residues: INASAPKPHHHHHH (SEQ ID NO: 6).
B. Vector 02 polylinker based on pPICZ_alpha_A backbone. The vector has been obtained by ligating the linker created with the oligos V02_up/V02_down to the vector pPICZ_alpha_A opened with Xho I and Xba I. The C-terminal of the recombinant protein will contain the residues: INASAPKPHQHRHKHQP (SEQ ID NO: 7).
C. Vector 03 polylinker based on pPICZ_alpha_A backbone. The vector has been obtained by ligating the linker created with the oligos V03_up/V03_down to the vector pPICZ_alpha_A opened with Xho I and Xba I. The C-terminal of the recombinant protein will contain the residues: INASAPKPHGHTHGHSHGHP (SEQ ID NO: 8).
Fig. 2. A. Vector 04 polylinker based on pPICZ_alpha_A backbone. The vector has been obtained by ligating the linker created with the oligos V04_up/V04_down to the vector pPICZ_alpha_A opened with Xho I and Xba I. The C-terminal of the recombinant protein will contain the residues: INASAPKPHEHDHEHDHEHP (SEQ ID NO: 9).
B. Vector 05 polylinker based on pPICZ_alpha_A backbone. The vector has been obtained by ligating the linker created with the oligos V05_up/V05_down to the vector pPICZ_alpha_A opened with Xho I and Xba I. The C-terminal of the recombinant protein will contain the residues: SRPKPHHHHHH (SEQ ID NO: 10).
C. Vector 06 polylinker based on pPICZ_alpha_A backbone. The vector has been obtained by ligating the linker created with the oligos V06_up/V06_down to the vector 05 opened with Xba I and Sal I. The C-terminal of the recombinant protein will contain the residues: SRPKPHQHRHKHQP (SEQ ID NO: 11).
Fig. 3. A. Vector 07 polylinker based on pPICZ_alpha_A backbone. The vector has been obtained by ligating the linker created with the oligos V07_up/V07_down to the vector 05 opened with Xba I and Sal I. The C-terminal of the recombinant protein will contain the residues: SRPKPHGHTHGHSHGHP (SEQ ID NO: 12).
B. Vector 08 polylinker based on pPICZ_alpha_A backbone. The vector has been obtained by ligating the linker created with the oligos V08_up/V08_down to the vector 05 opened with Xba I and Sal I. The C-terminal of the recombinant protein will contain the residues: SRPKPHEHDHEHDHEHP (SEQ ID NO: 13).
C. Vector 09 polylinker based on pET15b (Novagen, Merck) backbone. The vector has been obtained by ligating the linker created with the oligos V09_up/V09_down to the vector pET15b opened with Nco I and Nde I. The N-terminal of the recombinant protein will contain the residues: MGSSHHHHHHPKP (SEQ ID NO: 14).
Fig. 4. Illustration of Immobilized Metal affinity chromatography (IMAC) with nickel. Comparison of binding of conventional consecutive histidine tag, and alternating histidine.
Fig. 5. Comparison of a traditional HIS6 tagged protein and a PXP-HIS6 tagged protein. The top panel shows a traditional N- and C-terminally HIS6 tagged protein. The bottom panel shows a PXP N- and C-terminally HIS6 tagged protein, which has a 90 degree turn that leaves the hexahistidine moiety well exposed and does not hinder folding. It is seen that PXP polyhistidine tags stick out from the recombinant protein, ensuring better IMAC (Immobilized Metal affinity chromatography).
Fig. 6. Advantage of the PXP tag over a plain 6xHis tag. The PXP-HIS tag displays strong binding to Ni-NTA or Ni-IDA columns. The C-terminal PKPHHHHHH tagged HvEP-B2 was completely eluted from Ni-IDA (Protino, Macherey-Nagel) with 1 M NaCl and 0.5 M imidazole, in comparison to Ni-NTA using 0.5 M NaCl and 0.5 M imidazole. The latter elution conditions would elute most recombinant 6xHis tagged proteins. A. HiPrep IMAC FF 16/10 from GE Healthcare, Life Sciences; Wash: 0.5 M NaCl; Elution: 0.5 M NaCl and 0.5 M imidazole; B. Ni-IDA Protino from Macherey-Nagel; Wash: 1 M NaCl; Elution: 1 M NaCl and 0.25 M imidazole; C. Ni-IDA Protino from Macherey-Nagel; Wash: 1 M NaCl; Elution: 1 M NaCl and 0.5 M imidazole.
Fig. 7. The PXP tag and the new polyhistidine tags ensure very good proteolytic resistance and high yield of protein expressed in Pichia. Purification and quality check of 3 constructs: 1) recHvPAPhy_a (-CLLKHHHHHH); 2) recHvPAPhy_a (-CLLKVDHHHHHH); 3) recHvPAPhy_a (-CLLPKPHHHHHH). Constructs 1) and 2) carry conventional HIS tags, and construct 3) comprises a PKP-HIS6 tag of the invention. The following results are seen from the gel: 1) recHvPAPhy_a binds to Ni2+/NTA resin (50 % of the yield of sample 3); 2) recHvPAPhy_a does not bind to Ni2+/NTA resin (purified by IEX and gel filtration); 3) recHvPAPhy_a binds strongly to Ni2+/NTA resin. Thus, the amino acids adjacent to the His tag can drive proteolysis and hence loss of the HIS6 tag.
Fig. 8. Commercial Anti-His antibodies are dependent on the vicinal amino acid residues. 6xHIS-tagged proteins A-H were detected with QIAexpress Anti-His antibodies. Indicated amounts of pure 6xHis-tagged protein were applied to nitrocellulose membrane, and detection was carried out with the indicated Anti-His primary antibody diluted 1/2000, followed by chromogenic detection with AP-conjugated rabbit anti-mouse IgG and NBT/BCIP.
Fig. 9. A schematic overview of PXP tag removal by DAPase (Tagzyme). The top panel shows DAPase cleavage sites with respect to the amino acids Lysine, Arginine, Proline and Glutamine. The middle panel shows Tagzyme tag removal from a recombinant protein with an N-terminal 6xHIS-PKP tag. The bottom panel shows DAPase tag removal from a recombinant protein with an N-terminal 6xHIS-Q-PKP tag, where the polyhistidine tag is first removed by DAPase, and the N-terminal glutamine residue is converted to pyroglutamate by Qcyclase, which in turn is removed by pGAPase.
Fig. 10. SDS-PAGE/Coomassie and the elution profile of the Thioredoxin h isoform b as coming from pET15m or vector 10/11. A. elution profile; B. SDS-PAGE/Coomassie of the summary of the purification steps.
Fig. 11. SDS-PAGE/Coomassie and the elution profile of the wheat NADPH dependent Thioredoxin reductase (NTR). A. elution profile; B. SDS-PAGE/Coomassie of the summary of the purification steps.
Detailed description of the invention
The inclusion of N- or C-terminal polyhistidine epitope tags, which are used for downstream affinity purification, in heterologous recombinant proteins is always accompanied by a percentage of loss of the same due to unpredictable proteolysis events. This phenomenon differs from host to host, expression strain and type of recombinant protein produced.
The present invention provides a short Proline-X-Proline (PXP) amino acid motif that, when added to the N- or C-terminus of a protein tag, such as a polyhistidine epitope tag (e.g. a 6xHis tag), drastically reduces the loss of the tag due to unwanted proteolysis occurring in vivo or during purification.
The PXP tag of the present invention, for example in a polyhistidine tag, provides a proteolytic resistant IMAC tag to recombinant proteins, since the flanking amino acid residues act as a steric hindrance upstream and downstream of the polyhistidine motif, so that attack by proteases is blocked. The PXP-tags/PXP-IMAC tags will not disturb the correct folding of the C-terminal of the recombinant proteins.
Constructs tagged with a PXP tag of the invention display tight binding to IMAC resins (e.g. Nickel-IMAC resins), which allows the use of stronger washing conditions, and consequently, purer protein can be eluted from the resin.
Moreover, the addition of a PXP-protein tag of the present invention to a recombinant protein increases the yield of recombinant protein obtainable, when the tagged protein is purified, for example by affinity chromatography.
Thus, the protein tags of the present invention offer several advantages over the prior art. Apart from displaying increased proteolytic resistance, recombinant proteins with a PXP-tag (such as modified IMAC tags) of the present invention provide higher recovery yield (>95%), high purity of the eluted proteins, and significantly better reactivity to anti-His antibodies.
It is also possible to use the PXP tags as a linker for recombinant fusion proteins.
Definitions
Unless otherwise stated, the following terms used in this application, including the specification and claims, have the definitions given below. It must be noted that, as used in the specification and the appended claims, the singular forms "a," "an" and "the" include plural referents unless the context clearly dictates otherwise. Definitions of standard chemistry terms may be found in reference works, including Carey and Sundberg (1992) "Advanced Organic Chemistry 3rd Ed." Vols. A and B, Plenum Press, New York. The practice of the present invention will employ, unless otherwise indicated, conventional methods of protein chemistry, biochemistry and recombinant DNA techniques, within the skill of the art.
Affinity chromatography is a method of separating biochemical mixtures based on a highly specific interaction such as that between antigen and antibody, enzyme and substrate, or receptor and ligand. It can be used for purifying and concentrating a substance from a mixture into a buffering solution or purifying and concentrating an enzyme solution.
Affinity chromatography employs a solid/immobile phase, which is typically a gel matrix/resin, often of agarose. Usually the starting point is an undefined heterogeneous group of molecules in solution, such as a cell lysate/extract or growth medium/broth. The molecule of interest will have a well-known and defined property which can be exploited during the affinity purification process. In affinity chromatography, the polypeptide of interest becomes trapped on the solid phase/matrix. The other molecules in solution will not become trapped as they do not possess this property. The solid medium can then be removed from the mixture and washed, and the target polypeptide released from the entrapment in a process called elution. Affinity chromatography is commonly used for the purification of recombinant proteins.
The solid phase is often loaded into a column into which the sample (e.g. cell extract) is loaded; thereafter, a washing buffer can be applied to wash off non-specific binding agents. Then the target polypeptide can be eluted from the column and collected. Generally, affinity chromatography comprises the steps of
a) Loading affinity column
b) Proteins sieve through matrix of affinity beads
c) Proteins interact with affinity ligand with some binding loosely and others tightly
d) Wash off proteins that do not bind
e) Wash off proteins that bind loosely
f) Elute proteins that bind tightly to ligand and collect purified protein of interest
The choice of affinity matrix/resin depends on the specific protein to be purified.
Recombinant proteins fused to an epitope tag or affinity tag may be purified by using an affinity matrix, which binds the specific tag used.
One type of resin is a metal chelate resin, which can be used for isolating fusion polypeptides by metal ion affinity chromatography or immobilized metal ion affinity chromatography (IMAC) using the culture broth or a cell extract as starting material. This methodology is based on the specific coordinate covalent bond of amino acids, particularly histidine, to metals. The technique works by allowing proteins with an affinity for metal ions to be retained in a column containing immobilized metal ions, such as cobalt, nickel or copper for the purification of histidine containing proteins or peptides, or iron, zinc or gallium for the purification of phosphorylated proteins or peptides. Many naturally occurring proteins have no affinity for metal ions, and can therefore be fused to a HIS-tag, in the present invention a PXP-comprising HIS tag or another modified polyHIS tag. For example, polyHIS tagged polypeptides bind metal chelate resin, such as NTA (nitrilotriacetic acid) or IDA (iminodiacetic acid)-charged agarose/silica resin comprising a bivalent cation, such as Co2+, Zn2+, Cu2+, Ca2+, Cd2+, or Ni2+. Methods used to elute the protein of interest include changing the pH, or adding a competitive molecule, such as imidazole.
Amino acid is an entity comprising an amino terminal part (NH2) and a carboxy terminal part (COOH) separated by a central part comprising a carbon atom, or a chain of carbon atoms, comprising at least one side chain or functional group. NH2 refers to the amino group present at the amino terminal end of an amino acid or peptide, and COOH refers to the carboxy group present at the carboxy terminal end of an amino acid or peptide. The generic term amino acid comprises both natural and non-natural amino acids. Natural amino acids of standard nomenclature as listed in J. Biol. Chem., 243:3552-59 (1969) and adopted in 37 C.F.R., section 1.822(b)(2) belong to the group of amino acids listed herein below. Amino acid residues described herein can be in the "D" or "L" isomeric form. Where the L or D form has not been specified it is to be understood that the amino acid in question has the natural L form, cf. Pure & Appl. Chem. Vol. 56(5) pp 595-624 (1984), or the D form, so that the peptides formed may be constituted of amino acids of L form, D form, or a sequence of mixed L forms and D forms.
Throughout the description and claims amino acids are either indicated by the full name, or by their conventional 3-letter or 1-letter code.
List of natural amino acids and their respective 1-letter and 3-letter codes.
1-Letter  3-Letter  Amino acid
Y         Tyr       tyrosine
G         Gly       glycine
F         Phe       phenylalanine
M         Met       methionine
A         Ala       alanine
S         Ser       serine
I         Ile       isoleucine
L         Leu       leucine
T         Thr       threonine
V         Val       valine
P         Pro       proline
K         Lys       lysine
H         His       histidine
Q         Gln       glutamine
E         Glu       glutamic acid
W         Trp       tryptophan
R         Arg       arginine
D         Asp       aspartic acid
N         Asn       asparagine
C         Cys       cysteine
Where nothing else is specified, an amino acid can be selected from any amino acid, whether naturally occurring or not, such as alpha amino acids, beta amino acids, and/or gamma amino acids. Accordingly, the group comprises but is not limited to: Ala, Val, Leu, Ile, Pro, Phe, Trp, Met, Gly, Ser, Thr, Cys, Tyr, Asn, Gln, Asp, Glu, Lys, Arg, His, Aib, Nal, Sar, Orn, Lysine analogues DAP and DAPA.
Nucleic acid or polynucleotide is meant to encompass DNA and RNA as well as derivatives thereof such as peptide nucleic acids (PNA) or locked nucleic acids (LNA) throughout the description. The terms refer to polynucleotides, such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), oligonucleotides, fragments generated by the polymerase chain reaction (PCR), and fragments generated by any of ligation, scission, endonuclease action, and exonuclease action. Polynucleotides can be composed of monomers that are naturally-occurring nucleotides (such as DNA and RNA), or analogs of naturally-occurring nucleotides (e.g., alpha-enantiomeric forms of naturally-occurring nucleotides), or a combination of both. Modified nucleotides can have alterations in sugar moieties and/or in pyrimidine or purine base moieties. Sugar modifications include, for example, replacement of one or more hydroxyl groups with halogens, alkyl groups, amines, and azido groups, or sugars can be functionalized as ethers or esters. Moreover, the entire sugar moiety can be replaced with sterically and electronically similar structures, such as aza-sugars and carbocyclic sugar analogs. Examples of modifications in a base moiety include alkylated purines and pyrimidines, acylated purines or pyrimidines, or other well-known heterocyclic substitutes. Nucleic acid monomers can be linked by phosphodiester bonds or analogs of such linkages. Analogs of phosphodiester linkages include phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoranilidate, phosphoramidate, and the like. The term "polynucleotide" also includes so-called "peptide nucleic acids," which comprise naturally-occurring or modified nucleic acid bases attached to a polyamide backbone. Nucleic acids can be either single stranded or double stranded. Nucleic acids of the present invention are generally defined in terms of the translational gene products encoded by the respective nucleic acid. The sequence of a nucleic acid of the invention may thus be inferred from the sequence of the polypeptide gene product by using the conventional genetic code to decipher which nucleotides will encode a specific polypeptide.
Gene product refers to any transcriptional or translational product of a gene. A transcriptional product comprises any RNA-species, which is transcribed from the specific gene, such as pre-RNA, mRNA, tRNA, miRNA, spliced and nonspliced RNA. Thus, a transcriptional gene product of the present invention comprises any RNA-species encoded by or comprising a sequence selected from any β-glucosidase gene. For example, a transcriptional gene product of the present invention comprises any RNA-species encoded by a sequence selected from any of SEQ ID NO: 15-40, 52-72 and 75-79. The transcript may be bound by RNA-binding proteins and, thus, packaged into a ribonucleoprotein (RNP), for example an mRNP molecule.
A translational gene product of the present invention comprises any peptide or polypeptide encoded by the gene or a fragment thereof. Thus, a "polypeptide encoded by a gene of the present invention" is comprised in the terms "gene product", or "translational gene product". A translational gene product of the present invention comprises any polypeptide-species encoded by a sequence selected from any β-glucosidase.
Extract is used herein for any extraction of a microorganism and/or host cell of the present invention. The extract preferably comprises a polypeptide, in particular polypeptides of the present invention, such as fusion polypeptides having a modified HIS-tag. An extract may also comprise polynucleotides, such as nucleic acid molecules of nucleic acid vectors of the present invention. The extract may be prepared by opening the cells by lysis or chemical shear, and extracting the desired components in a suitable buffer.
Broth is used herein to describe a medium, which has been used for the culturing of a microorganism and/or host cell of the present invention. The broth is preferably a liquid culture broth, and the broth preferably comprises metabolites and/or other secreted components of the cultured microorganism and/or host cell, for example polypeptides of the present invention.
Polypeptide is a plurality of at least two covalently linked amino acid residues defining a sequence and linked by amide bonds. Proteins and enzymes are composed of polypeptides, and thus, the term "protein" is used herein analogously with polypeptide. The amino acids may be both natural amino acids and non-natural amino acids, including any combination thereof. The natural and/or non-natural amino acids may be linked by peptide bonds or by non-peptide bonds. The term polypeptide also embraces post-translational modifications introduced by chemical or enzyme-catalyzed reactions, as are known in the art. Such post-translational modifications can be introduced prior to partitioning, if desired. Amino acids as specified herein will preferentially be in the L-stereoisomeric form. Amino acid analogs can be employed instead of the 20 naturally-occurring amino acids. Several such analogs are known, including fluorophenylalanine, norleucine, azetidine-2-carboxylic acid, S-aminoethyl cysteine, 4-methyl tryptophan and the like.
Protein tag is used herein for any entity, which can serve for identifying and/or isolating a protein to which it is associated. The present invention primarily relates to polypeptide tags, i.e. protein tags, which are composed of covalently linked amino acid residues. Protein tags can be used to identify and isolate a polypeptide of interest, which is attached to the tag by a chemical linkage, such as a peptide bond. Protein tags for example include epitope tags and affinity tags.
Affinity tag is a protein tag consisting of an amino acid sequence that has been attached to a protein of interest to make its purification easier, so that the protein can be purified from a crude biological source using an affinity technique. The tag could be another protein or a short amino acid sequence, allowing purification by affinity chromatography. In particular, an affinity tag can be an epitope tag. Examples of affinity tags are chitin binding protein (CBP), maltose binding protein (MBP), and glutathione S-transferase (GST). The polyHis tag is also a widely-used affinity tag, which binds to metal matrices/resins. Affinity tags may also be used as epitope tags.
Epitope tag
An epitope is a portion of a molecule to which an antibody binds, i.e. epitopes are recognizable and bound by an antibody or an antigen-binding fragment thereof.
Epitopes can be composed of sugars, lipids or amino acids. In the present invention, epitope tags generally are constructed of amino acids. Epitope tags may be added to a molecule (usually a polypeptide) as a tool to visualize or isolate the protein. Visualization can take place in a gel, a western blot or labeling via immunofluorescence. Isolation can be done by protein purification techniques, such as chromatographic methods using antibodies or affinity chromatography.
For detection and isolation of a polypeptide, for example one for which no good antibody exists, an epitope tag can be added to the polypeptide. Epitope tags typically range from 10 to 15 amino acids and are designed to create a molecular handle for the polypeptide of interest. An epitope tag could be placed anywhere within the protein, but typically it is placed on either the amino or carboxyl terminus to minimize any potential disruption of the tertiary structure, and thus the function, of the protein. Any short stretch of amino acids known to bind an antibody qualifies as an epitope tag. A number of epitope tags are widely used in the field of protein chemistry. These include:
a) c-myc - a 10 amino acid segment of the human protooncogene myc (EQKLISEEDL)
b) HA - a segment of the human influenza hemagglutinin protein (YPYDVPDYA)
c) PolyHIS/His6 - 5-6 histidines (HHHHHH) placed in a row form a structure that binds divalent cations such as nickel. This is especially useful for affinity chromatography but can also be used as an epitope tag
d) VSV-G (YTDIEMNRLGK)
e) HSV (QPELAPEDPED)
f) V5 (GKPIPNPLLGLDST)
g) FLAG (DYKDDDDK)
h) HAT (KDHLIHNVHKEEHAHAHNK)
Epitope tags also include affinity tags, which are recognizable by an antibody or an antigen-binding fragment thereof.
Expression vectors are defined herein as DNA sequences that are required for the transcription of cloned copies of genes and the translation of their mRNAs in an appropriate host. Such vectors can be used to express genes in a variety of hosts such as bacteria, yeast, blue-green algae, plant cells, insect cells and animal cells. In particular, the vectors can be used for expression of a PXP-comprising epitope tag of the present invention, including fusion polypeptides.
Protein tag comprising PXP amino acid motif
The present invention provides a protein tag, such as an epitope tag or affinity tag, comprising a PXP amino acid motif. The PXP motif functions as a proteolytic resistant motif to the flanking region of the tag. The steric hindrance of the two flanking proline residues renders the central amino acid of the PXP motif, and the protein tag, proteolytic resistant to most proteases present in the hosts for recombinant protein production (E. coli, other bacteria, yeast, fungi or other eukaryotic recombinant expression systems). This ensures a significantly higher yield of recombinant protein expressed with a protein tag, such as an epitope tag, of the present invention.
In addition to the increased yield, the two proline residues of the PXP motif introduce a 90 degree angle turn so that the protein tag is projected outside in respect of the recombinant protein, hence causing less steric hindrance to the folding and stability of the same. This enables the production of recombinant protein with increased biological activity and stability, and in particular allows the production of biologically active recombinant proteins, which cannot be produced by conventional protein tag methods, due to incorrect folding, and/or degradation. Moreover, the conformation, where the PXP motif of the protein tag introduces a 90 degree angle turn so that the protein tag is projected outside in respect of the recombinant protein, also facilitates the interaction of the protein tag, e.g. epitope tag, with external binding partners, for example interaction with binding agents, such as antibodies or resins, which bind the tag, because the conformation exposes the tag outside of the recombinant protein. For example, the interaction with IMAC resins, such as HIS-TRAP resins (e.g. Ni/NTA, Ni/IDA, etc.) is promoted.
Thus, the present invention solves at least the problem of protein tag loss due to proteolysis that occurs close to the site of fusion to the heterologous recombinant protein. The flanking tag is resistant to intracellular and/or extracellular proteolysis for prokaryotic or eukaryotic expression systems. Also, problems of reduced interaction with protein binding agents are solved; and problems resulting from incorrect folding and reduced stability are solved by inclusion of a protein tag, such as an epitope tag, of the present invention in a recombinant protein.
The flanking tag can be incorporated by PCR into constructs comprising a gene of interest either as N- or C-terminal to a suitable protein tag (e.g. an epitope tag such as polyHIS (e.g. 6xHIS tag)). As C-terminal tag, it should include a stop codon after the last residue of the tag, for example after the last histidine in a polyhistidine tag.
Alternatively, the PXP sequence can be fused to a vector with a custom protein tag engineered in order to adapt the flanking PXP tag with a suitable restriction site in order to generate a maximum of 2 amino acid frame distance between the PXP and the polyhistidine tag. In the examples provided herein below, vectors are disclosed, wherein the flanking tag is incorporated in custom polyhistidine tags.
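As an illustration of the stop codon requirement for a C-terminal tag, the following minimal sketch writes a PXP-polyhistidine tag out as a coding sequence. The codon choices, the stop codon and the function name `encode_c_terminal_tag` are assumptions made for illustration only; the source does not specify them, and a real construct must also stay in frame with the gene of interest and any restriction site.

```python
# Illustrative single-codon table (standard genetic code); the particular
# codon chosen for each residue is an assumption, not taken from the source.
CODONS = {"P": "CCG", "K": "AAA", "H": "CAC", "R": "CGT", "Q": "CAG", "N": "AAC"}
STOP = "TAA"  # assumed stop codon; TAG or TGA would serve equally


def encode_c_terminal_tag(tag: str) -> str:
    """Back-translate a tag and place a stop codon after its last residue."""
    return "".join(CODONS[aa] for aa in tag) + STOP


# PXP-(His)6 used as a C-terminal tag: the stop codon follows the last histidine.
dna = encode_c_terminal_tag("PKPHHHHHH")
print(dna)
```

The tag is appended in frame to the 3' end of the gene of interest, for example through the reverse PCR primer.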
In one aspect the present invention relates to a polypeptide protein tag, such as an epitope tag or affinity purification tag, comprising a Proline-X-Proline (PXP) amino acid motif, wherein X is selected from any amino acid.
The protein tag of the invention is not bound to be any specific length, but generally, the tag comprises at least 5 amino acids, and normally 5-50, such as 5-40 for example 5-30 or 5-20 amino acids. X may be selected among basic amino acids, and therefore X can be selected on the basis of the isoelectric point of the respective amino acid. In one embodiment, X is selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7. In one embodiment, X is selected from amino acids having an isoelectric point above 7. Thus, X is in one embodiment selected from His, Lys or Arg.
However, in another embodiment, X is selected from amino acids having an isoelectric point above 5; cf. table below.
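The selection of X by isoelectric point can be sketched as a simple threshold filter. The pI values below are approximate textbook figures supplied as an assumption (published values differ slightly between sources, and they are not taken from the table of this specification); the function name `select_x` is hypothetical.

```python
# Approximate isoelectric points of the 20 natural amino acids.
# ASSUMPTION: rounded textbook values, not the table of this specification.
APPROX_PI = {
    "A": 6.00, "R": 10.76, "N": 5.41, "D": 2.77, "C": 5.07,
    "Q": 5.65, "E": 3.22, "G": 5.97, "H": 7.59, "I": 6.02,
    "L": 5.98, "K": 9.74, "M": 5.74, "F": 5.48, "P": 6.30,
    "S": 5.68, "T": 5.60, "W": 5.89, "Y": 5.66, "V": 5.96,
}


def select_x(min_pi: float) -> list:
    """Return the residues whose approximate pI is at least min_pi."""
    return sorted(aa for aa, pi in APPROX_PI.items() if pi >= min_pi)


# A threshold of 7 leaves only the basic residues His, Lys and Arg.
print(select_x(7.0))
```

Lower thresholds admit progressively more residues; with these values only Asp and Glu fall below a threshold of 5.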
In another embodiment, X is selected on the basis of its charge, polarity and/or hydrophilicity. Thus, in one embodiment, X is selected from positively charged amino acids, such as His, Lys, and Arg. In another embodiment, X is selected from amino acids which are positively charged (basic, non-acidic amino acids), polar and hydrophilic, such as His, Lys and Arg. In yet another embodiment, X is selected from amino acids which are negatively charged (acidic amino acids), polar and hydrophilic, such as Asp and Glu.
In one specific embodiment, it is provided that X is not a non-polar, hydrophobic amino acid. Thus, in this embodiment, X is an amino acid which is not Phe, Met, Trp, Ile, Val, Leu, Ala, or Pro.
In one embodiment, X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N). In a specifically preferred embodiment, X is selected from K, Q, and R; i.e. the PXP amino acid motif is PKP, PQP and/or PRP. Thus, preferred protein tags comprise the amino acid motif PKP, PQP, PRP and/or PNP. In a more specific embodiment, X is Lysine (K) or Arginine (R), i.e. the PXP tag is PKP or PRP.
These PXP motifs, specifically PKP, PQP, PRP and/or PNP, in the flanking tag improve for example a polyhistidine tag because they introduce one extra positive charge, which increases the affinity of the polyhistidine tag to for example IMAC resins, such as nickel-immobilized affinity resins (e.g. Ni/NTA, Ni/IDA etc.). Furthermore, PKP or PRP motifs in a flanking polyhistidine tag leave the epitope or affinity tag more sensitive to anti-HISn antibodies, with antibody sensitivity in the order of 10 to 100 fold higher for a PXP-comprising polyHIS tag than for the polyHIS tag itself, i.e. without a PXP of the present invention.
According to the present invention, the PXP motif may be introduced into any suitable epitope tag or affinity tag, including those tags, which are commercially available. Non-limiting examples of epitope tags to which a PXP motif of the present invention may be attached are c-myc, HA, PolyHIS, VSV-G, HSV, V5, HAT and FLAG.
In a preferred embodiment, the PXP motif is introduced into a polyhistidine tag, which comprises or consists of at least 2 consecutive histidine residues, and preferably 2-10, and even more preferred 4-6 consecutive histidine residues, such as 5 or 6 His residues. Thus, in one preferred embodiment, the tag of the invention further comprises a polyhistidine tag, where the PXP motif is fused to the N-terminal and/or C-terminal end of the polyhistidine tag, thus having the general formula:
PXP-(His)n, (His)n-PXP or PXP-(His)n-PXP,
where n is at least 2, and preferably 2-10, and even more preferred 4-6. X is any amino acid, but preferably selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above. In another embodiment, X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), for example PXP is PKP, PQP, PRP or PNP. In a specifically preferred embodiment, X is selected from K, Q, and R; i.e. the PXP amino acid motif is PKP, PQP and/or PRP, or X is K or R; i.e. the PXP amino acid motif is PKP or PRP.
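The three general formulas can be expanded into concrete one-letter sequences. The sketch below is illustrative only; the function name `pxp_his_tags` and its defaults (X = K, n = 6) are assumptions chosen to match the preferred embodiments.

```python
def pxp_his_tags(x: str = "K", n: int = 6) -> dict:
    """Expand PXP-(His)n, (His)n-PXP and PXP-(His)n-PXP for a given X and n."""
    if n < 2:
        raise ValueError("n must be at least 2")
    pxp = f"P{x}P"   # Proline-X-Proline motif
    his = "H" * n    # n consecutive histidine residues
    return {
        "PXP-(His)n": pxp + his,
        "(His)n-PXP": his + pxp,
        "PXP-(His)n-PXP": pxp + his + pxp,
    }


tags = pxp_his_tags("K", 6)
for formula, sequence in tags.items():
    print(f"{formula}: {sequence}")
```

With X = K and n = 6 the first two formulas give PKPHHHHHH and HHHHHHPKP, which are the terminal segments of the tags INASAPKPHHHHHH and MGSSHHHHHHPKP listed in this specification.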
The present invention also relates to epitope tags and/or affinity tags comprising a PXP motif as defined herein, where the tag is a variation of the classical consecutive polyhistidine tag. For example, the invention encompasses a modified polyhistidine tag formed by the mix of any amino acid spaced by histidine comprising the PXP motif of the invention at the C or N terminal end. This modified HIS-tag can have the general formula:
PXP-(HZ)n or (HZ)n-PXP.
In one embodiment, the modified HIS-tag above further comprises a C-terminal Proline residue, thus having the general formula:
PXP-(HZ)n-P or (HZ)n-P-PXP.
In these formulas, n is at least 1, preferably 1-10, such as 2-8, for example 4-6; and H is histidine.
X is selected from any amino acid, and as described above, X is preferably selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, and/or X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), or in a specifically preferred embodiment, X is selected from K, Q, and R; i.e. the PXP amino acid motif is PKP, PQP and/or PRP. Thus, preferred polyhistidine tags comprise the amino acid motif PKP, PQP, PRP and/or PNP.
Z is selected from any amino acid, and each repetitive (HZ) sequence may vary with respect to the identity of Z. In a preferred embodiment, Z is a basic amino acid. In another preferred embodiment, Z is an amino acid selected from the group consisting of Histidine (H), Glycine (G), Serine (S), Threonine (T), Glutamic acid (E), Aspartic acid (D), Glutamine (Q), Arginine (R) and Lysine (K).
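A minimal sketch of the modified HIS-tag formulas, in which each (HZ) repeat may carry a different Z. The function name `modified_his_tag` and its parameters are hypothetical and serve only to illustrate the formulas above.

```python
def modified_his_tag(x: str, zs: str, pxp_at_n_terminus: bool = True,
                     terminal_proline: bool = False) -> str:
    """Build PXP-(HZ)n[-P] or (HZ)n[-P]-PXP; zs lists one Z residue per repeat."""
    if len(zs) < 1:
        raise ValueError("at least one (HZ) repeat is required")
    pxp = f"P{x}P"
    core = "".join("H" + z for z in zs)   # (HZ)n, with Z varying per repeat
    if terminal_proline:
        core += "P"                       # optional proline after the repeats
    return pxp + core if pxp_at_n_terminus else core + pxp


# n = 4 with Z = Q, R, K, Q and a terminal proline
print(modified_his_tag("K", "QRKQ", terminal_proline=True))
```

The example call yields PKPHQHRHKHQP, which corresponds to the portion of the tag INASAPKPHQHRHKHQP that follows the INASA residues.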
Thus, in a preferred embodiment, the protein tag of the invention is selected from the group consisting of SEQ ID NO: 1-5. In one embodiment, the epitope and/or affinity tag of the present invention consists of or comprises an amino acid sequence selected from the group consisting of INASAPKPHHHHHH, INASAPKPHQHRHKHQP, INASAPKPHGHTHGHSHGHP, INASAPKPHEHDHEHDHEHP, SRPKPHHHHHH, SRPKPHQHRHKHQP, SRPKPHGHTHGHSHGHP, SRPKPHEHDHEHDHEHP, and/or MGSSHHHHHHPKP (i.e. SEQ ID NO: 6-14). In a preferred embodiment, a tag of the present invention is selected from the group consisting of INASAPKPHHHHHH, INASAPKPHQHRHKHQP, INASAPKPHGHTHGHSHGHP, INASAPKPHEHDHEHDHEHP, SRPKPHHHHHH, SRPKPHQHRHKHQP, SRPKPHGHTHGHSHGHP, SRPKPHEHDHEHDHEHP, and/or MGSSHHHHHHPKP (SEQ ID NO: 6-14). In another embodiment, the epitope and/or affinity tag of the present invention consists of or comprises an amino acid sequence selected from the group consisting of NH2-MGHHHHHHPKPASHM (SEQ ID NO: 73) and NH2-MGHHHHHHPKPENLYFQGASH (SEQ ID NO: 74).
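As a quick consistency check, the listed tag sequences can be scanned for the PXP motif. This is a sketch only: the regular expression assumes one-letter codes for X, and the helper name `describe` is hypothetical.

```python
import re

# Proline, any single residue in one-letter code, proline
PXP = re.compile(r"P[A-Z]P")

# Tag sequences as listed in the text (SEQ ID NO: 6-14)
TAGS = [
    "INASAPKPHHHHHH", "INASAPKPHQHRHKHQP", "INASAPKPHGHTHGHSHGHP",
    "INASAPKPHEHDHEHDHEHP", "SRPKPHHHHHH", "SRPKPHQHRHKHQP",
    "SRPKPHGHTHGHSHGHP", "SRPKPHEHDHEHDHEHP", "MGSSHHHHHHPKP",
]


def describe(tag: str):
    """Return (first PXP motif found or None, number of histidine residues)."""
    match = PXP.search(tag)
    return (match.group() if match else None, tag.count("H"))


for tag in TAGS:
    motif, his_count = describe(tag)
    print(f"{tag}: motif={motif}, histidines={his_count}")
```

Every listed sequence carries the PKP motif, placed either before or after the histidine-containing segment.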
In one embodiment, the tag of the present invention is a polyhistidine tag comprising a mix of aliphatic, acidic and/or basic residues, wherein each of the aliphatic, acidic and/or basic residues are separated by one histidine residue.
In one embodiment, a tag of the invention has the general formula:
PZPHXHXHXHXHXH, HXHXHXHXHXHPZP, N-PZPHXHXHXHXHXH, or N-HXHXHXHXHXHPZP;
where Z is a basic amino acid and/or selected from the group consisting of Histidine (H), Glycine (G), Serine (S), Threonine (T), Glutamic acid (E), Aspartic acid (D), Glutamine (Q), Arginine (R) and Lysine (K); and X is selected from G, S, T, E, D, Q or K amino acids, or Z is K, Q, R or N.
In another embodiment, the tag has a terminal proline residue and thus has the general formula: PZPHXHXHXHXHXHP. However, the inclusion of the terminal proline is optional.
Fusion polypeptide comprising PXP-comprising protein tag
In one aspect, the invention relates to a recombinant polypeptide comprising a polypeptide of interest for recombinant expression and purification, which is fused to a protein tag, such as an epitope tag, of the present invention. Thus, the invention provides a fusion polypeptide comprising a first polypeptide sequence fused to at least one protein tag of the present invention, such as a PXP-comprising protein tag, as defined elsewhere herein; i.e. the invention relates to a fusion polypeptide for expression in a host cell comprising a first polypeptide sequence R1 fused to a protein tag comprising a PXP amino acid motif, where X is any amino acid, but preferably an amino acid having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, or X is selected from the group consisting of Histidine (H), Lysine (K), Arginine (R), Glutamine (Q) and Asparagine (N), preferably X is K, Q, R or N.
In particular, the invention provides a fusion polypeptide obtainable by expression in a host cell, said fusion polypeptide comprising a first polypeptide sequence fused to at least one protein tag of the present invention, such as a PXP-comprising epitope tag of the present invention.
The at least one protein tag, such as epitope tag, may be fused to the N-terminal and/or C-terminal end of said first polypeptide.
Linker sequences may be inserted into the fusion polypeptide of the invention, for example in order to create fusion proteins, custom proteolytic sites, introduce specific polypeptide signals or recognition sites in the polypeptide. The fusion polypeptide in one embodiment further comprises a linker sequence between the protein tag and the first polypeptide, for example the linker sequence comprises
a. a proteolytic cleavage site suitable for separating the first polypeptide from the protein tag, such as the epitope tag of the invention, or
b. one or more endonucleolytic cleavage sites.
The "first polypeptide" may be any functional polypeptide, which is desired to be purified by affinity chromatography or other methods. The "first polypeptide" may be any biologically active polypeptide. Particularly relevant polypeptides are enzymes produced in large scale, and polypeptides which are prone to proteolytic degradation, where the advantage of reduced proteolytic degradation is particularly relevant. Examples of "first polypeptides" are Alpha Mating Factor fusion protein containing Maltose Binding Protein (MBP) and wheat Thioredoxin.
In one embodiment, a fusion polypeptide of the present invention is selected from the group consisting of SEQ ID NO: 41-51. In another embodiment, the fusion polypeptide comprises a sequence selected from any one of SEQ ID NO: 41-51, or a fragment thereof, or a functional homolog thereof having at least 70%, such as at least 80%, such as at least 85%, for example at least 90%, such as at least 95%, for example at least 98% identity to any of those sequences or fragments.
In one embodiment, a fusion polypeptide of the present invention is encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NO: 52-68. In another embodiment, the fusion polypeptide is encoded by a nucleic acid comprising a sequence selected from any one of SEQ ID NO: 52-68, or a fragment thereof, or a functional homolog thereof having at least 70%, such as at least 80%, such as at least 85%, for example at least 90%, such as at least 95%, for example at least 98% identity to any of those sequences or fragments.
In one particular embodiment, the PXP-motif could be integrated into an amino acid linker sequence comprising at least 3 amino acids, and preferably in the range of 3-100 amino acids, where the sequence serves as a protease resistant linker to connect two recombinant polypeptides. For example, the PXP tag can serve as a proteolytically uncleavable linker to join together two recombinant polypeptides.
Thus, in one embodiment, the fusion polypeptide comprises the PXP-comprising protein tag inserted between a first (R1) and a second (R2) polypeptide. In preferred embodiments, the fusion polypeptide of the invention has a general formula selected from the group consisting of: R1-PXP-R2, R1-PXP-(His)n-R2, R1-(His)n-PXP-R2, R1-PXP-(His)n-P-R2, R1-P-(His)n-PXP-R2, R1-PXP-(HZ)n-R2, R1-PXP-(HZ)n-P-R2, R1-(HZ)n-PXP-R2, and R1-P-(HZ)n-PXP-R2, where n is at least 1, preferably 1-10, such as 4-6, Z is any amino acid, but preferably an amino acid selected from the group consisting of Histidine (H), Glycine (G), Serine (S), Threonine (T), Glutamic acid (E), Aspartic acid (D), Glutamine (Q) and Lysine (K), and X is any amino acid, but preferably an amino acid having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, or X is selected from the group consisting of Histidine (H), Lysine (K), Arginine (R), Glutamine (Q) and Asparagine (N), preferably X is K, Q, R or N.
In a preferred embodiment, the protein tag inserted between a first (R1) polypeptide and a second (R2) polypeptide of a fusion polypeptide of the invention comprises a site-specific protease cleavable site, such as a sequence selected from the group consisting of ENLYFQ/X (Tobacco etch virus protease, TEV), EVLFQ/GP (Human rhinovirus 3C protein), IEGR/ (Factor Xa), DDDDK/ (Enterokinase, EntK), LVPR/GS (Thrombin, Thr protease), DXXD/ (Caspase-3 protease), where "/" designates the cleavage position. The tag inserted between a first (R1) polypeptide and a second (R2) polypeptide of a fusion polypeptide is not bound to be any specific length, but generally, the tag comprises at least 5 amino acids, and normally 5-50, such as 5-40, for example 5-30 or 5-20 amino acids. The site-specific protease cleavable site can be inserted at any position in the tag, and the tag also preferably comprises an epitope and/or affinity tag sequence as described elsewhere herein.
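The recognition sites listed above can be located in a candidate fusion sequence with a short script. The following is a minimal sketch only: the function names are hypothetical, X is treated as any residue, and real proteases show sequence preferences and secondary specificities that a plain pattern match does not capture.

```python
import re

# Recognition sites as listed in the text; "/" marks the cleavage position
# and X stands for any amino acid.
SITES = {
    "TEV": "ENLYFQ/X",
    "HRV 3C": "EVLFQ/GP",
    "Factor Xa": "IEGR/",
    "Enterokinase": "DDDDK/",
    "Thrombin": "LVPR/GS",
    "Caspase-3": "DXXD/",
}


def cleavage_positions(sequence, protease):
    """Return the indices of the first residue after each predicted cut."""
    left, right = SITES[protease].split("/")
    pattern = (left + right).replace("X", ".")
    # A lookahead is used so that overlapping sites are all reported.
    return [m.start() + len(left)
            for m in re.finditer("(?=" + pattern + ")", sequence)]


def cleave(sequence, protease):
    """Split the sequence at every predicted cleavage position."""
    bounds = [0] + cleavage_positions(sequence, protease) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# N-terminal PKP/His tag with a TEV site, fused to a made-up stand-in "AAAA"
# for the polypeptide of interest.
fragments = cleave("MHHHHHHPKPENLYFQGAAAA", "TEV")
print(fragments)  # ['MHHHHHHPKPENLYFQ', 'GAAAA']
```

The first fragment is the released tag and the second is the polypeptide carrying the residue that follows the cut.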
In a fusion polypeptide of the present invention, where a first (R1) and a second (R2) polypeptide are linked to a PXP-comprising linker sequence or PXP-comprising tag, a first polypeptide could for example serve as a solubility enhancer for the fusion protein and a second polypeptide could be any biologically active polypeptide, in particular polypeptides of industrial, pharmaceutical or scientific interest, or any other nature of interest. In particular, a solubility enhancer polypeptide can be positioned either N-terminally or C-terminally of the linker sequence. The solubility enhancer polypeptide can be chosen from any polypeptide sequence known to promote solubility. Non-limiting examples include Maltose Binding Protein (MBP), NusA, Trx, GST, SUMO, SET, DsbC, Skp, T7PK, GB1, and ZZ as described by Esposito D. et al., 2006. In a preferred embodiment, the fusion polypeptide comprises, as R1 or R2, Maltose Binding Protein (MBP), which has been shown to promote recombinant expression and the yield of correctly folded recombinant fusion protein when MBP is fused to a biologically relevant recombinant protein of interest via a linker sequence; cf. Dalken B., Jabulowsky R.A., Oberoi P., Benhar I., Wels W.S. (2011) Maltose-Binding Protein Enhances Secretion of Recombinant Human Granzyme B Accompanied by In Vivo Processing of a Precursor MBP Fusion Protein. PLoS One. 5(12):e14404.
Nucleic acid sequence and expression vectors encoding a protein tag
The invention also in one aspect pertains to nucleic acid molecules encoding one or more polypeptide sequences of the invention, or fragments or functional variants thereof.
The invention thus provides a nucleic acid encoding a polypeptide protein tag of the present invention; i.e. the invention provides a nucleic acid sequence encoding an epitope tag and/or affinity purification tag, comprising a Proline-X-Proline (PXP) amino acid motif, wherein X is selected from any amino acid, and X preferably is selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, or is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), such as selected from K, Q, and R; i.e. the PXP amino acid motif is PKP, PQP and/or PRP. In particular, the invention relates to a nucleic acid encoding a polypeptide polyHIS tag or modified polyHIS tag comprising the PXP motif, such as epitope/affinity tags having the general formula:
PXP-(His)n, (His)n-PXP, PXP-(His)n-PXP, PXP-(HZ)n, (HZ)n-PXP, PXP-(HZ)n-P or (HZ)n-P-PXP,
where n is at least 2, and preferably 2-10, and even more preferred 4-6. X is any amino acid, but preferably selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7. In another embodiment, X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N). In a specifically preferred embodiment, X is selected from K, Q, R and N; i.e. the PXP amino acid motif is PKP, PQP, PRP and/or PNP, or X is K or R; i.e. the PXP amino acid motif is PKP or PRP.
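The general formulas above translate directly into regular expressions, which can help when checking a designed tag against them. The sketch below is illustrative only: the function names are hypothetical, n is limited to the preferred range 2-10 by default, and Z is taken as any of the 20 standard amino acids. Because Z may itself be histidine, a plain His run also satisfies the corresponding (HZ)n formula.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PREFERRED_X = "KRQHN"                  # preferred X residues named in the text


def tag_patterns(x_allowed=AMINO_ACIDS, n_min=2, n_max=10):
    """Build one regular expression per general tag formula."""
    pxp = f"P[{x_allowed}]P"
    his = f"H{{{n_min},{n_max}}}"
    hz = f"(?:H[{AMINO_ACIDS}]){{{n_min},{n_max}}}"
    return {
        "PXP-(His)n": pxp + his,
        "(His)n-PXP": his + pxp,
        "PXP-(His)n-PXP": pxp + his + pxp,
        "PXP-(HZ)n": pxp + hz,
        "(HZ)n-PXP": hz + pxp,
        "PXP-(HZ)n-P": pxp + hz + "P",
        "(HZ)n-P-PXP": hz + "P" + pxp,
    }


def matching_formulas(tag, **kwargs):
    """Return the names of all formulas that the whole tag satisfies."""
    return [name for name, pattern in tag_patterns(**kwargs).items()
            if re.fullmatch(pattern, tag)]


print(matching_formulas("PKPHHHHHH"))      # ['PXP-(His)n', 'PXP-(HZ)n']
print(matching_formulas("PKPHQHRHKHQP"))   # ['PXP-(HZ)n-P']
print(matching_formulas("PAPHHHHHH", x_allowed=PREFERRED_X))  # []
```

The tag strings are passed without any flanking linker residues; a tag embedded in a longer sequence would need `re.search` or a trimmed input instead of `re.fullmatch`.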
In a particular embodiment, a nucleic acid of the invention encodes a polypeptide sequence selected from the group consisting of SEQ ID NO: 6-14, 73-74, or part thereof. The invention also encompasses oligonucleotide primers, which may be used for incorporating a PXP-motif into an existing protein tag, such as an epitope tag, in order to produce a tag of the invention. Thus, the invention also relates to nucleic acids encoding a PXP-motif of the invention and/or a modified protein tag, such as a modified epitope tag of the invention, which may be used for introducing the motif or tag into a first polypeptide of interest. Thus, the nucleic acid of the invention is in one embodiment selected from the group consisting of SEQ ID NO: 15-40, 69-72, and 75-79.
The invention also provides a nucleic acid sequence encoding a fusion polypeptide of the present invention, such as a fusion polypeptide comprising a first polypeptide sequence fused to at least one protein tag of the present invention, as described herein above.
In one embodiment, a nucleic acid of the present invention encodes a fusion polypeptide selected from the group consisting of SEQ ID NO: 41-51 or part thereof, or a functional homolog thereof having at least 70%, such as at least 80%, such as at least 85%, for example at least 90%, such as at least 95%, for example at least 98% identity to any of those sequences or fragments.
In another embodiment, a nucleic acid of the present invention is selected from the group consisting of SEQ ID NO: 52-68, or a fragment thereof, or a functional homolog thereof having at least 70%, such as at least 80%, such as at least 85%, for example at least 90%, such as at least 95%, for example at least 98% identity to any of those sequences or fragments.
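The identity thresholds recited above (70% to 98%) presuppose a way of computing percent identity. A minimal sketch follows, assuming the two sequences have already been aligned to equal length with "-" as the gap character, and assuming identity is counted as identical columns divided by alignment columns. This definition and the function name are assumptions made for illustration; the definition intended by the specification may differ, and the alignment itself would normally come from a dedicated alignment tool.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Two tag variants differing only in X of the PXP motif: 8 of 9 positions match.
identity = percent_identity("PKPHHHHHH", "PRPHHHHHH")
print(round(identity, 1))  # 88.9
print(identity >= 85.0)    # True
```

Under this definition the PKP and PRP variants of the tag would meet the 85% threshold but not the 90% threshold.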
In particular, the nucleic acids of the present invention may be incorporated into a nucleic acid vector, for example for cloning and/or expression. Thus, one aspect of the invention relates to a nucleic acid vector comprising a nucleic acid sequence encoding a polypeptide protein tag of the present invention; i.e. a nucleic acid sequence encoding an affinity purification tag or epitope tag, comprising a Proline-X-Proline (PXP) amino acid motif, as defined herein above. The invention also relates to a nucleic acid vector comprising a nucleic acid sequence encoding a fusion polypeptide of the present invention, such as a fusion polypeptide comprising a first polypeptide sequence fused to at least one protein tag, comprising a Proline-X-Proline (PXP) amino acid motif, as defined herein above, such as PKP, PQP, PRP and/or PNP.
The nucleic acid vector is in a preferred embodiment an expression vector, such as a prokaryotic expression vector or a eukaryotic expression vector, for example a yeast or mammalian expression vector. Numerous vectors are available and the skilled person will be able to select a useful vector for the specific purpose. The vector may, for example, be in the form of a plasmid, cosmid, viral particle or artificial chromosome. The appropriate nucleic acid sequence may be inserted into the vector by a variety of procedures, for example, DNA may be inserted into an appropriate restriction endonuclease site(s) using techniques well known in the art. Apart from the nucleic acid sequence according to the invention, the vector may furthermore comprise one or more of a signal sequence, an origin of replication, one or more marker genes, an enhancer element, a promoter, and a transcription termination sequence. The vector may also comprise additional sequences, such as enhancers, poly-A tails, linkers, polylinkers, operative linkers, multiple cloning sites (MCS), STOP codons, internal ribosomal entry sites (IRES) and host homologous sequences for integration or other defined elements. Methods for engineering nucleic acid constructs are well known in the art (see, e.g., Molecular Cloning: A Laboratory Manual, Sambrook et al., eds., Cold Spring Harbor Laboratory, 2nd Edition, Cold Spring Harbor, N.Y., 1989). The vector is preferably an expression vector, comprising the nucleic acid operably linked to a regulatory nucleic acid sequence directing expression thereof in a suitable cell. Within the scope of the present invention said regulatory nucleic acid sequence should in general be capable of directing expression in a suitable host cell, such as a yeast cell or prokaryotic cell, such as Escherichia coli.
The vector may be any eukaryotic expression vector, for example a mammalian expression vector, or a yeast vector. The vector may comprise at least one intron, which will facilitate the transport from the nucleus to the cytoplasm of the vector encoded RNA. In another embodiment, the vector is capable of expressing RNA in the cytoplasm by cytoplasmic transcription, which can be translated into polypeptide.
The nucleic acid sequence encoding a protein tag or a fusion polypeptide of the invention may be incorporated into the vector by any method available to the skilled person. For example, the nucleic acid sequence may be incorporated by standard cloning techniques, involving PCR amplification, nucleolytic restriction digestion and ligation, transformation/transfection/recombination into host organisms and/or marker selection.
To this end, the invention also in one aspect relates to the use of an oligonucleotide comprising a sequence encoding a protein tag comprising a Proline-X-Proline (PXP) amino acid motif, as defined herein above, or part thereof, for introducing said protein tag or part thereof in a nucleic acid vector, for example a cloning vector and/or expression vector. The invention also provides a use of said oligonucleotide for fusing said protein tag or part thereof to a nucleic acid sequence encoding a first polypeptide sequence in a fusion polypeptide of the invention, as defined above. In one embodiment, the oligonucleotide for such use comprises or consists of a sequence selected from the group consisting of SEQ ID NO: 15-40, 69-72 and/or 75-79 or a PXP-encoding part thereof. When the PXP-comprising tag is incorporated into a fusion polypeptide as a C-terminal tag, it should include a stop codon (TAA/UAA, TAG/UAG, TGA/UGA in DNA/RNA respectively) after the last histidine or proline in order to terminate translation of the transcript. Thus, the protein tag may be an epitope or affinity tag having the following general polypeptide sequences:
PXP-(HZ)n-P-STOP or (HZ)n-P-PXP-STOP, PXP-(HZ)n-STOP or (HZ)n-PXP-STOP, PXP-(His)n-STOP, (His)n-PXP-STOP or PXP-(His)n-PXP-STOP,
where X, Z, and n are as defined elsewhere herein. Alternatively, the PXP-comprising tag sequence can be fused into a vector with a custom stop codon. The PXP sequence can also be fused into a vector with a custom protein tag, such as an epitope or affinity tag, engineered so that the flanking PXP tag is adapted with a suitable restriction site and the frame distance between the PXP motif and the tag is at most 2 amino acids.
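The stop codon requirement for a C-terminal tag can be illustrated with a small encoder. This is a sketch under stated assumptions: the codon table picks one arbitrary valid codon per amino acid from the standard genetic code and is not optimized for any host, the function names are hypothetical, and the resulting DNA is not one of the SEQ ID NO sequences of the specification.

```python
# One arbitrarily chosen standard-code codon per amino acid (an assumption for
# illustration; a real design would use host-specific codon preferences).
CODON = {
    "A": "GCT", "C": "TGT", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGT", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAT", "P": "CCA", "Q": "CAA", "R": "CGT",
    "S": "TCT", "T": "ACT", "V": "GTT", "W": "TGG", "Y": "TAT",
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def encode_c_terminal_tag(peptide, stop="TAA"):
    """Encode a C-terminal tag as DNA and terminate it with a stop codon."""
    if stop not in STOP_CODONS:
        raise ValueError("stop must be TAA, TAG or TGA")
    return "".join(CODON[aa] for aa in peptide) + stop


def translate(dna):
    """Translate DNA produced by encode_c_terminal_tag until the first stop."""
    reverse = {codon: aa for aa, codon in CODON.items()}
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            break
        peptide.append(reverse[codon])
    return "".join(peptide)


dna = encode_c_terminal_tag("PKPHHHHHH")
print(dna)             # CCAAAACCACATCATCATCATCATCATTAA
print(translate(dna))  # PKPHHHHHH
```

Translation ends directly after the last histidine, which corresponds to the PXP-(His)n-STOP arrangement described above.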
In specific examples, the vector is engineered so that the flanking PXP-motif is incorporated into custom polyhistidine tags. In such vectors, an amino acid linker between the PXP motif and the protein of interest is for example engineered to also be resistant to proteolysis. The polylinker is for example structured for incorporating the PXP tag into prokaryotic (e.g. E. coli) or eukaryotic (e.g. P. pastoris, S. cerevisiae or Kluyveromyces lactis) expression systems. In one embodiment, the PXP-comprising protein tag of the invention is a PXP-comprising polyHIS tag or modified polyHIS tag with an amino acid linker sequence. For example, the PXP-comprising protein tag with a polylinker sequence consists of or comprises an amino acid sequence selected from the group consisting of INASAPKPHHHHHH, INASAPKPHQHRHKHQP, INASAPKPHGHTHGHSHGHP, INASAPKPHEHDHEHDHEHP, SRPKPHHHHHH, SRPKPHQHRHKHQP, SRPKPHGHTHGHSHGHP, SRPKPHEHDHEHDHEHP, and/or MGSSHHHHHHPKP. Thus, the nucleic acid vector of the invention preferably comprises a nucleic acid sequence encoding a PXP-comprising protein tag and polylinker sequence consisting of or comprising an amino acid sequence selected from the group consisting of INASAPKPHHHHHH, INASAPKPHQHRHKHQP, INASAPKPHGHTHGHSHGHP, INASAPKPHEHDHEHDHEHP, SRPKPHHHHHH, SRPKPHQHRHKHQP, SRPKPHGHTHGHSHGHP, SRPKPHEHDHEHDHEHP, and/or MGSSHHHHHHPKP.
Host cell
Yet another aspect relates to a recombinant host cell comprising a nucleotide sequence, a fusion polypeptide, a nucleic acid sequence, and/or a nucleic acid vector of the present invention, as described elsewhere herein.
The recombinant host cell preferably comprises a recombinant nucleic acid vector comprising a polynucleotide of the present invention, wherein the vector is introduced into a host cell so that the vector is maintained as a chromosomal integrant or as a self-replicating extrachromosomal vector. The term "host cell" encompasses any progeny of a parent cell that is not identical to the parent cell due to mutations that occur during replication. The choice of a host cell will to a large extent depend upon the gene encoding the polypeptide and its source. The host cell may also be any suitable host microorganism. Accordingly, the host cell is preferably selected from host microorganisms belonging to a phylum selected from the group consisting of yeasts, fungi, bacteria, algae or plants. However, the host cell may be any prokaryote or eukaryote, such as a mammalian, insect, plant, or fungal cell.
The host cell is for example selected from eukaryotic or prokaryotic cells, for example the host is selected from mammalian cells, human cells, mouse cells, plant cells, Chinese Hamster Ovary (CHO) cells, and insect cells.
In one preferred embodiment, the host cell is a prokaryotic cell, such as Escherichia coli, or a variant of E. coli.
The host cell may also be a eukaryotic cell. In another preferred embodiment, the host cell is a yeast cell, such as a methylotrophic yeast. Specifically, the host cell can be Saccharomyces cerevisiae, Kluyveromyces lactis, or Pichia pastoris.
In a preferred embodiment, the host cell (e.g. Saccharomyces cerevisiae, Kluyveromyces lactis, Pichia pastoris, or E. coli) comprises in its genome and/or in a nucleic acid vector, such as a plasmid vector, a nucleic acid sequence encoding an affinity purification tag, comprising a Proline-X-Proline (PXP) amino acid motif, wherein X is selected from any amino acid, and X preferably is selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, or is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), such as selected from K, Q, and R; i.e. the PXP amino acid motif is PKP, PQP and/or PRP. In a more specific embodiment, the host cell comprises in its genome and/or in a plasmid vector a nucleic acid sequence encoding a polypeptide polyHIS tag or modified polyHIS tag comprising the PXP motif, such as epitope or affinity tags having the general formula: PXP-(His)n, (His)n-PXP, PXP-(His)n-PXP, PXP-(HZ)n, (HZ)n-PXP, PXP-(HZ)n-P or (HZ)n-P-PXP, as defined elsewhere herein. Specifically, the nucleic acid may encode a fusion polypeptide comprising a gene of interest for transgenic expression in the host organism and a PXP-comprising tag, as described elsewhere herein.
Kit
The invention also relates to a kit comprising one or more of the components of the invention, such as one or more polypeptide tags, fusion polypeptides, nucleic acid sequences, nucleic acid vectors and/or host cells of the present invention.
Thus, in one embodiment, the present invention relates to a kit comprising a nucleic acid sequence encoding an affinity purification tag, comprising a Proline-X-Proline (PXP) amino acid motif, wherein X is selected from any amino acid, and X preferably is selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7, or is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), such as selected from K, Q, and R; i.e. the PXP amino acid motif is PKP, PQP, PRP and/or PNP.
More specifically, the invention provides a kit comprising a nucleic acid encoding a polypeptide polyHIS tag or modified polyHIS tag comprising the PXP motif, such as epitope/affinity tags having the general formula:
PXP-(His)n, (His)n-PXP, PXP-(His)n-PXP, PXP-(HZ)n, (HZ)n-PXP, PXP-(HZ)n-P or (HZ)n-P-PXP,
as defined herein above. A kit of the invention may also comprise a nucleic acid sequence encoding a fusion polypeptide of the present invention, such as a fusion polypeptide comprising a first polypeptide sequence of a gene of interest for expression fused to at least one PXP-comprising protein tag of the invention. Further kits of the invention may comprise one or more nucleic acid vectors of the invention, e.g. nucleic acid vectors comprising a nucleic acid sequence encoding a polypeptide protein tag of the present invention; i.e. a nucleic acid sequence encoding an epitope and/or affinity purification tag, comprising a Proline-X-Proline (PXP) amino acid motif, as defined herein above. The kit may also comprise a nucleic acid vector comprising a nucleic acid sequence encoding a fusion polypeptide of the present invention, such as a fusion polypeptide comprising a first polypeptide sequence fused to at least one protein tag, comprising a Proline-X-Proline (PXP) amino acid motif, as defined herein above.
The kit may comprise one or more oligonucleotide primers for PCR amplification of a specific target, wherein the PCR primers comprise a nucleic acid sequence of the invention or part thereof, for example a nucleic acid sequence encoding a PXP-comprising protein tag, as defined herein above.
Moreover, the kit may comprise one or more additional components, which are relevant for cloning and amplification of a gene of interest, and/or introduction of a PXP-comprising tag to a gene of interest. Such additional components include for example, host cells (competent cells), buffers (e.g. restriction buffers, ligation buffers, transformation buffers, PCR buffers etc.), enzymes (e.g. restriction enzymes, ligases, polymerases etc.), nucleotides etc.
Method of producing recombinant protein
The nucleic acids and specifically nucleic acid vectors of the present invention are specifically suitable for production of recombinant protein. The same applies to the host organisms of the invention. The invention, therefore, also provides methods for producing recombinant protein by utilizing the PXP-comprising modified tag of the present invention. This method comprises culturing the host cell under suitable growth conditions, which allows the expression of a fusion polypeptide of the invention.
The produced polypeptide is then preferably recovered from the microorganism and/or host cell, and/or recovered from the incubation broth of the host cell and/or microorganism. Thus, in one aspect, the present invention pertains to a method of producing recombinant protein, said method comprising:
(a) culturing a host cell of the invention under suitable growth conditions, which allows the expression of a fusion polypeptide of the invention, and
(b) purifying the fusion polypeptide from the culture.
Suitable culturing conditions will depend on the choice of host organism; for example, yeast cells generally require incubation temperatures around 30°C, whereas bacteria such as Escherichia coli, and mammalian cells, generally thrive at 37°C. The composition of culture media also depends on the choice of host organism, but suitable growth conditions and culture media will be known to the skilled practitioner. Both liquid and solid growth media may be employed.
The level of expression of the fusion protein depends on the choice of expression vector, and the choice of promoter in the vector. Expression can be modulated by use of inducible promoters, which are activated by certain external stimuli, such as metabolites, temperature, chemicals or nutrients. Alternatively, constitutive promoters mediate a constant expression of the fusion polypeptide. The method of purification may also depend on the choice of host as well as on other factors. Secreted proteins may be isolated from the culture broth, whereas intracellular proteins can be purified from cell extract, prepared from host cells collected from the media. For example, cells may be collected by centrifugation, and opened by chemical or mechanical methods, for example incubation in lysis buffer. Such methods are well-known in the art.
More importantly, the fusion polypeptide is preferably isolated by chromatographic methods, such as affinity chromatography. In a preferred embodiment, the fusion polypeptide is isolated from the culture by metal ion affinity chromatography or immobilized metal ion affinity chromatography (IMAC) using the culture broth or a cell extract as starting material. This methodology is based on the specific coordinate covalent bond of amino acids, particularly histidine, to metals. The technique works by allowing proteins with an affinity for metal ions to be retained in a column containing immobilized metal ions, such as cobalt, nickel or copper for the purification of histidine-containing proteins or peptides, or iron, zinc or gallium for the purification of phosphorylated proteins or peptides. Many naturally occurring proteins do not have an affinity for metal ions; therefore a recombinant fusion polypeptide of the present invention comprising a PXP-comprising polyHIS tag or modified polyHIS tag is particularly suitable. Methods used to elute the protein of interest include changing the pH, or adding a competitive molecule, such as imidazole.
In one embodiment, isolation of the fusion polypeptide comprises
a. contacting a lysate or extract of the host cell containing the fusion polypeptide with a metal chelate resin,
b. optionally, washing the resin-fusion polypeptide complex with a buffer to remove unbound proteins and other materials, and
c. eluting the bound fusion polypeptide from the resin-fusion polypeptide complex.
The metal chelate resin is preferably NTA (nitrilotriacetic acid) or IDA (iminodiacetic acid)-charged agarose/silica resin, and the metal chelate preferably comprises a bivalent cation, such as Co2+, Zn2+, Cu2+, Ca2+, Cd2+, or Ni2+.
It is often advantageous to remove a protein tag after purification of the tagged protein, because the tag may interfere with the desired function of the protein, and may for example cause antigenic reactions and other side effects where the protein is used as a pharmaceutical agent. Therefore, the tag of the present invention may be removed after purification of the fusion polypeptide or in connection with its purification. In one embodiment, the tag is removed while the tagged polypeptide is bound to a resin, thereby releasing the mature polypeptide without tag from the resin, and the mature polypeptide is collected as the eluate. However, preferably the PXP-tag is removed after elution of the fusion polypeptide from the resin matrix.
A number of proteases are available, which may be used to remove the PXP-tag, and any suitable protease may be used according to the present invention. In one embodiment, the PXP-tag is removed by a protease selected from the group consisting of Tobacco etch virus protease, TEV (ENLYFQ/X), Human rhinovirus 3C protein (EVLFQ/GP), Factor Xa (IEGR/), Enterokinase, EntK (DDDDK/), Thrombin, Thr protease (LVPR/GS) and Caspase-3 protease (DXXD/). The cleavage sites of the respective proteases are indicated in parentheses, where "/" designates the cleavage position. In one specific embodiment, the PXP-comprising tag of the present invention is removed by Tagzyme/DAPase. Removal by DAPase is illustrated in figure 9.
Examples of PXP-tag removal of different PXP-tags of the present invention by specific proteases are presented herein below:
Tobacco etch virus protease, TEV:
NH2-MHHHHHHPKPENLYFQ/G-protein-ENLYFQ/GPKPHHHHHH-COOH
Human rhinovirus 3C protein:
NH2-MHHHHHHPKPEVLFQ/GP-protein-EVLFQ/GPPKPHHHHHH-COOH
Factor Xa:
NH2-MHHHHHHPKPIEGR/-protein-IEGR/GPKPHHHHHH-COOH
Enterokinase, EntK:
NH2-MHHHHHHPKPDDDDK/-protein-DDDDK/GPKPHHHHHH-COOH
Thrombin, Thr:
NH2-MHHHHHHPKPLVPR/GS-protein-LVPR/GSPKPHHHHHH-COOH
Caspase-3 protease:
NH2-MHHHHHHPKPDXXD/-protein-DXXD/GPKPHHHHHH-COOH
where "/" designates the cleavage position.
In another embodiment, the PXP-comprising tag of the present invention is removed by an A-type carboxypeptidase, such as bovine carboxypeptidase A (BoCPA) or Metarhizium anisopliae carboxypeptidase A (MeCPA). A-type carboxypeptidases digest from the C-terminus upstream and stop after certain amino acids, namely: Arginine (R), Lysine (K), Proline (P+1, P+2). Examples of BoCPA removal of conventional HIS-tags are:
protein-QRLLDDTSGKHHHHHH → BoCPA → protein-QRLLDDTSGK
protein-KEEDDHRPSSSHLLVHHHHHH → BoCPA → protein-KEEDDHRPS(S)
protein-QTSSLISPPRSFSHHHHHH → BoCPA → protein-QTSSLISPPRS
Examples of BoCPA removal of PXP-comprising HIS-tags of the present invention are:
protein-QRLLDDTSGPKPHHHHHH → BoCPA → protein-QRLLDDTSGPKPH
protein-KEEDDHRPKPSHLLVHHHHHH → BoCPA → protein-KEEDDHRPKPS(S)
protein-QTSSLISPPRPSFSHHHHHH → BoCPA → protein-QTSSLISPPRPS
Examples
Example 1
Construction of restriction free vector for User™ based cloning strategies including a PKP tag and hexa histidine tag (Vector 01).
General molecular biology techniques are described in Maniatis et al., 1982. In this example, the target expression system is P. pastoris (strain KM71H, muts), with extracellular expression using a modified pPICZ_alpha_A vector (Life Technologies). In brief, the synthetic oligos V01_up and V01_down (see table 1) were used to create a linker that was ligated to pPICZ_alpha_A opened with XhoI/XbaI in order to generate Vector 01 (Fig. 1A), of which the C-terminal tag sequence is INASAPKPHHHHHH (SEQ ID NO: 6). The maltose binding protein (MBP, accession ACI46135) was chosen as test protein to be expressed as a fusion with the alpha mating peptide and the tag using the User™ (Uracil-Specific Excision Reagent, New England Biolabs) based cloning strategy (Nour-Eldin et al., 2006). In particular, the MBP sequence was amplified from vector pKLAC-malE (accession EU196354, New England Biolabs, NEB) by PCR using PfuTurbo Cx Hotstart DNA Polymerase (Agilent Technologies) and the following set of primers for User™ cloning: MBP_fw and MBP_rv (see table 1). The primers were synthesized at Eurofins-MWG Biotech (Germany); "U" represents the modified nucleotide deoxyuridine, used instead of T in this technique.
PCR conditions were as follows: 1 min at 95°C for initial denaturation, followed by 36 cycles of 45 s at 95°C, 30 s of annealing at 64°C and 1.5 min of elongation at 72°C. The PCR product was purified from a 1.5% agarose gel and treated with the User™ enzyme mix (NEB). In the meantime, Vector 01 was opened with Pac I and Nt.BbvCI enzymes (NEB) and likewise purified from gel. About 200 ng of the MBP PCR product treated with the User™ enzymes was mixed with 50 ng of Vector 01 opened as mentioned above, and this mix of insert and vector was left at RT for 30 min prior to standard transformation of E. coli TOP10' competent cells. The positive colonies were identified by restriction analysis and sequencing at Eurofins/MWG.
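As a quick plausibility check on the cycling programme and the insert-to-vector amounts given above, the hold times can be summed. The sketch below uses only the figures stated in the text; ramp times between temperatures and any final extension or hold step are not stated and are therefore not included, and the variable and function names are hypothetical.

```python
def program_seconds(initial_s, cycle_steps_s, cycles):
    """Total hold time of a PCR programme, excluding ramping between steps."""
    return initial_s + cycles * sum(cycle_steps_s)


INITIAL_DENATURATION_S = 60      # 1 min at 95 C
CYCLE_STEPS_S = {
    "denaturation, 95 C": 45,
    "annealing, 64 C": 30,
    "elongation, 72 C": 90,      # 1.5 min
}
CYCLES = 36

total_s = program_seconds(INITIAL_DENATURATION_S, CYCLE_STEPS_S.values(), CYCLES)
print(total_s / 60)              # 100.0 (minutes of hold time)

# Insert-to-vector ratio by mass for the User(TM) cloning mix: 200 ng : 50 ng.
INSERT_NG, VECTOR_NG = 200, 50
print(INSERT_NG / VECTOR_NG)     # 4.0
```

The ratio is by mass only; a molar ratio would additionally require the lengths of the insert and the opened vector.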
Plasmid minipreps from the positive colonies were sequenced, linearized with either Dra I or Sac I and transformed by electroporation into Pichia pastoris KM71H according to De Schutter et al., 2009. After 3 days, zeocin-resistant Pichia colonies were picked and screened for the presence of the MBP insert. The best candidate colony was then chosen for expression using the shake-flask fed-batch method according to the Life Technologies fermentation guidelines (http://tools.invitrogen.com/content/sfs/manuals/pichiaferm_prot.pdf).
After 3-4 days, the recombinant MBP was observed to have accumulated in the media by SDS-PAGE (4-12%). The supernatant was collected by centrifugation at 5000 x g for 10 min for recombinant protein purification. After the pH of the media had been adjusted to pH 8.0 with 1 N NaOH, about 400 ml of Pichia supernatant was passed through a Nickel-NTA Sepharose column (Qiagen) on an ÄKTA FPLC machine (GE Healthcare). The flow-through was discarded and the first wash was applied with a buffer containing 20 mM Tris-HCl pH 8.0, 0.5 M NaCl and 20 mM imidazole. The elution was carried out with 0.5 M imidazole and 1 M NaCl, pH 8.0. The eluted peak fractions, monitored by OD at 280 nm and above the threshold, were collected, dialyzed and concentrated with Vivaspin columns (GE Healthcare). The recovery yield was calculated to be up to 95% of the supernatant recombinant proteins as judged by western blotting with anti-penta-HIS (Qiagen) or C-terminal 6xHis antibodies (Life Technologies).
SDS-PAGE followed either by Coomassie staining or western blotting was carried out in order to verify purity and the presence of the HIS tag. The C-terminal anti-HIS6 tag antibodies from Life Technologies were used for the immunodetection. Finally, mass spectrometry was used to confirm the presence of the C-terminal PXP(His)6 tag.
Examples 2-4
Construction of restriction free vector for User™ based cloning strategies including a PKP tag and modified histidine tag (Vectors 2-4).
Vectors 2-4 were produced by the same methods as described in example 1. In this case the pPICZ_alpha_A vector opened with XhoI and XbaI was ligated respectively to the linker created by mixing the oligos V02_up and V02_down (table 1), for obtaining the vector 2 (fig. 1B), the oligos V03_up and V03_down (table 1), for obtaining the vector 3 (fig. 3) and the oligos V04_up and V04_down (table 1), for obtaining the vector 4 (fig. 2A). The resulting vectors 2, 3 and 4, used with the User cloning technology described in Example 1, yielded a recombinant maltose binding protein with the following respective C-terminal sequences: SEQ ID NO: 7 in vector 2, SEQ ID NO: 8 in vector 3 and SEQ ID NO: 9 in vector 4. The same procedures for making recombinant proteins using the fermentation conditions in example 1 apply in these examples.
Example 5
Construction of restriction vector for cloning strategies including a PKP tag and modified histidine tag (Vector 5).
The vector 5 was built in two stages. The first step was to introduce into pPICZ_alpha_A the linker formed by oligos V05_up and V05_down (table 1) in the Xho I/Xba I site. The second step was to introduce the second linker formed by the combination of the two oligos V05a_up and V05a_down (table 1) into the restriction site Xba I/Sal I of the resulting vector. The resulting Vector 5 polylinker is illustrated in Fig. 2B. The vector 5 was then opened with EcoRI and Xba I, while the MBP was amplified by PCR with the primers MBP_EcoR_l_fw/MBP_Xba_l_rv. After the PCR was finished, the PCR product was cleaved by EcoRI/Xba I and the cleaned-up PCR fragment was ligated into the corresponding EcoRI/Xba I site of Vector 5 using 10 U of T4 Ligase (Fermentas, Thermo Scientific) for 1 h at 16 °C.
The positive colonies were identified by restriction analysis and sequencing at Eurofins/MWG.
The shuttle vector 5 containing the MBP was amplified in E. coli TOP10' cells (Life Technologies) by chemical transformation. Pichia KM71H transformation and fermentation was carried out as in example 1. Proteins were also purified as described in example 1.
Examples 6-8
Construction of restriction vector for cloning strategies including a PKP tag and modified histidine tag (Vectors 6-8).
Vectors 6-8 were produced by the same methods as described in example 5. In this case, Vector 5 was chosen as template. It was opened with XbaI and Sal I and ligated respectively to the linker created by mixing the oligos V06_up and V06_down (table 1), for obtaining the vector 6 (fig. 2C), the oligos V07_up and V07_down (table 1), for obtaining the vector 7 (fig. 3A) and the oligos V08_up and V08_down (table 1), for obtaining the vector 8 (fig. 3B). The resulting vectors 6, 7 and 8 were used with the conventional ligase cloning technology described in Example 5, yielding a recombinant maltose binding protein with the following respective C-terminal sequences: SEQ ID NO: 11 in vector 6, SEQ ID NO: 12 in vector 7 and SEQ ID NO: 13 in vector 8. The same procedures for making recombinant proteins using the fermentation conditions in example 1 apply in these examples 6-8.
Example 9
Construction of restriction vector for cloning strategies including a PKP tag and hexa histidine tag (Vector 9).
The prokaryotic expression was performed in the E. coli expression strain BL21 (DE3) pLysS from Promega Corporation, while the cloning was carried out in Life Technologies TOP 10' E. coli chemically competent cells. In brief, the vector pET15b (Novagen, Merck) was opened with Nco I and Nde I, and the linker formed by the two oligos V09_up and V09_down was ligated into it using 10 U of T4 Ligase (Fermentas, Thermo Scientific) for 1 h at 16 °C. The newly ligated vector was used to transform TOP 10' cells, which were used for amplification. The correct Vector 9, having the N-terminal amino acid sequence MGSSHHHHHHPKP (SEQ ID NO: 14), was used for introducing the PCR product of MBP obtained with the primers MBP_Nde_l_fw/MBP_Xho_l_rv. After double digestion of both Vector 9 and the PCR product with Nde I/Xho I, these were purified by 1.5% agarose gel electrophoresis. The MBP was then ligated into the Nde I/Xho I site of Vector 9 and amplified in E. coli TOP 10' cells as described above. The positive colonies were identified by restriction analysis and sequencing at Eurofins/MWG. Chemically competent BL21 (DE3) pLysS cells were then transformed with Vector 9 containing the MBP.
The protein expression was carried out according to the pET System Manual, 11th Ed. (http://www.merck-chemicals.se/chemdat/en_CA/Merck-US-Site/USD/ViewProductDocuments-File?ProductSKU=EMD_BIO-71867&DocumentType=USP&Documentld=/emd/biosciences/userprotocols/en-US/TB055.pdf&DocumentSource=GDS). In brief, 400 ml of LB (Luria-Bertani) medium was inoculated with 10 ml of pre-growth inoculum, and the culture was grown at 37 °C until its OD600 was around 0.6. IPTG (isopropyl β-D-1-thiogalactopyranoside, Sigma-Aldrich) was then added as inducer to a final concentration of 0.5 mM, and the culture was grown for an additional 3 h at 37 °C. The cells were harvested by centrifugation and lysed with the BugBuster reagent (Novagen, Merck). The lysate was sonicated and, after its pH had been adjusted to 8.0 with 1 N NaOH, passed through a Ni-IDA (Protino resin, Macherey-Nagel) silica column (Omnifit, Diba) using an AKTA FPLC machine (GE Healthcare). The chromatography steps, as well as the recovery yield, were similar to those of Example 1.
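The claim that the Vector 9 linker encodes MGSSHHHHHHPKP (SEQ ID NO: 14) can be checked by translating the upper oligo. The following is a minimal sketch, not part of the original disclosure: it assumes the standard genetic code and assumes that the reading frame starts at the ATG contained in the Nco I overhang of V09_up (SEQ ID NO: 33). The helper name `translate` is illustrative.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from the first base; stop at a stop codon; ignore a trailing partial codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# V09_up as listed under SEQ ID NO: 33.
V09_UP = "CATGGGCAGCAGCCATCATCATCATCATCACCCGAAACCGCA"

# Assumption: translation starts at the ATG inside the Nco I overhang (C|ATGG...).
orf = V09_UP[V09_UP.index("ATG"):]
print(translate(orf))
```

The two bases left over at the 3' end of the oligo are completed by the Nde I site of the vector, so they are deliberately not translated here.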
Example 10
Construction of restriction vector for cloning strategies including an N-terminal hexa histidine tag, a PKP tag and an optional TEV tag (Vectors 10 and 11).
The prokaryotic expression was performed in the E. coli expression strain T7 Shuffle from New England Biolabs, while the cloning was carried out in Life Technologies TOP 10' E. coli chemically competent cells. Two new vectors derived from pET15b (Novagen, Merck) were created by the Infusion cloning technique (Clontech, USA). In brief, a previously described pET15m (Dionisio et al., 2011) was opened with Nco I and Nde I, and the linker formed by the two oligos V10_up (SEQ ID NO: 69) and V10_down (SEQ ID NO: 70) was introduced by an Infusion reaction for 5 min at 50 °C, according to the manufacturer's instructions. The newly generated Vector 10 was used to transform TOP 10' cells. Likewise, the oligos V11_up (SEQ ID NO: 71) and V11_down (SEQ ID NO: 72) were introduced into pET15m as described above. The vector pET15m and Vectors 10 and 11, the latter two having the N-terminal amino acid sequences NH2-MGHHHHHHPKPASHM (SEQ ID NO: 73) and NH2-MGHHHHHHPKPENLYFQGASH (SEQ ID NO: 74, TEV site underlined), respectively, were used for introducing the PCR product of wheat Thioredoxin H isoform b, Thr_h_b (GenBank nucleotide accession AY072771 and protein accession AAL67139), obtained with the primers Thr_Nde_l_fw (SEQ ID NO: 75) and Thr_EcoR_l_rv (SEQ ID NO: 76). After double digestion of pET15m, Vectors 10 and 11, and the Thr_h_b PCR product with Nde I/EcoR I, these were purified by 1.5% agarose gel electrophoresis. The Thioredoxin was then ligated into the Nde I/EcoR I site of these vectors and amplified in E. coli TOP 10' cells as described above. The positive colonies were identified by restriction analysis and sequencing at Eurofins/MWG. Chemically competent BL21 (DE3) T7 Shuffle cells were transformed with the vectors containing the Thr_h_b.
The protein expression was carried out in 200 ml of LB (Luria-Bertani) medium inoculated with 10 ml of pre-growth inoculum; the culture was grown at 37 °C until its OD600 was around 0.6. IPTG (isopropyl β-D-1-thiogalactopyranoside, Sigma-Aldrich) was then added to a final concentration of 0.1 mM, and the culture was grown for 12 h at 26 °C. The cells were harvested by centrifugation and lysed with the BugBuster reagent (Novagen, Merck) plus 100 units of Benzonase (Sigma-Aldrich) and 2 mg/ml of lysozyme. The lysate was centrifuged and, after its pH had been adjusted to 8.0, purified on a 5 ml HisTrap column (GE Healthcare) using an AKTA FPLC machine (GE Healthcare). The unbound proteins were collected as flow-through, and a wash step was performed with wash buffer (20 mM Tris-HCl pH 8.0, 10 mM imidazole, 350 mM NaCl). The tagged proteins were eluted with an imidazole gradient from 0 to 0.5 M, created from Buffer A (20 mM Tris-HCl pH 8.0, 0.2 M NaCl) and Buffer B (20 mM Tris-HCl, 0.5 M imidazole, 0.6 M NaCl, pH 8.0). Figure 10 shows the SDS-PAGE/Coomassie analysis and the elution profile of Thioredoxin h isoform b expressed from pET15m or from Vectors 10/11. Recombinant thioredoxin h from vector pET15m eluted at 125 mM imidazole, judging from the elution peak of the Ni/NTA column (figure 10a, filled circle line), whereas recombinant PKP or PKP TEV tagged thioredoxin h eluted at 200 mM imidazole (figure 10a, open circle and inverted filled triangle lines). Figure 10b shows the 10-20% Tris-Glycine SDS-PAGE/Coomassie gel (Life Technologies) summarizing the purification steps. Lane 1 contains the all blue precision standard prestained markers (Bio-Rad, cat. no. 161-0373); lane 2, the uninduced total proteins of the Thx_h expressed in pET15m; lane 3, the induced total soluble proteins from pET15m; lane 4, the purified Thx_h Ni/NTA peak from vector pET15m; lane 5, the induced total soluble proteins from Thx_h in Vector 10; lane 6, the purified Thx_h Ni/NTA peak from Vector 10; lane 7, the induced total soluble proteins from Thx_h in Vector 11; lane 8, the purified Thx_h Ni/NTA peak from Vector 11; lane 9, the TEV treated Thx_h from Vector 11, concentrated/overloaded.
Example 11
Introduction of a C-terminal hexa histidine tag, with or without an upstream PKP tag, into wheat Thioredoxin reductase by a cloning approach.
A wheat NADPH dependent Thioredoxin reductase (NTR) was amplified from a previously cloned NTR from developing seeds (GenBank nucleotide accession AJ421947; protein accession CAD19162) using the primers NTR_ATG_fw (SEQ ID NO: 77) and NTR_C_t_6xHIS (SEQ ID NO: 78), giving a PCR product that was introduced into pET15m opened with Nco I and EcoR I by Infusion cloning as described above. In a similar way, the NTR PCR product obtained with the primers NTR_ATG_fw (SEQ ID NO: 77) and NTR_C_t_PKP_6xHIS (SEQ ID NO: 79) was introduced by Infusion reaction into pET15m opened with Nco I and EcoR I. These two C-terminally (C/t) 6xHis tagged NTR constructs were expressed in BL21 (DE3) T7 Shuffle (New England Biolabs) and purified as described in Example 10. There was no difference in the Ni/NTA elution profile between the C/t 6xHis tagged NTR (figure 11a, open circle line) and the C/t PKP 6xHis tagged NTR (figure 11a, filled circle line). The purification yield was somewhat higher for the C/t PKP 6xHis construct (11 mg/L) than for the traditionally C/t 6xHis tagged counterpart (7 mg/L), as shown by the SDS-PAGE/Coomassie gel in figure 11b. Lane 1 contains the all blue precision standard prestained markers (Bio-Rad, cat. no. 161-0373); lane 2, the uninduced total proteins of the NTR C/t 6xHis expressed in pET15m; lane 3, the induced total soluble proteins from the NTR C/t 6xHis in pET15m; lane 4, the purified NTR C/t 6xHis Ni/NTA peak; lane 5, the uninduced total proteins of the NTR C/t PKP 6xHis expressed in pET15m; lane 6, the induced total soluble proteins from NTR C/t PKP 6xHis; lane 7, the purified NTR C/t PKP 6xHis Ni/NTA peak; lane 8, the NTR C/t PKP 6xHis concentrated with a Vivaspin 500 (GE Healthcare).
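The elution points reported in Example 10 (125 mM and 200 mM imidazole) can be related to the Buffer A/Buffer B gradient with simple arithmetic. The sketch below is illustrative only and assumes ideal linear mixing of the two buffers; the function names are hypothetical, while the buffer compositions and elution concentrations are those given in Example 10.

```python
def fraction_b(target: float, conc_a: float, conc_b: float) -> float:
    """Fraction of Buffer B that gives `target` concentration in a linear A/B mix."""
    return (target - conc_a) / (conc_b - conc_a)


def mixed_concentration(frac_b: float, conc_a: float, conc_b: float) -> float:
    """Concentration of a component at a given fraction of Buffer B."""
    return conc_a + frac_b * (conc_b - conc_a)


# Buffer compositions from Example 10 (molar).
IMIDAZOLE_A, IMIDAZOLE_B = 0.0, 0.5
NACL_A, NACL_B = 0.2, 0.6

# Elution points read from the Ni/NTA profile in Example 10.
for label, peak in (("6xHis (pET15m)", 0.125), ("PKP or PKP TEV tag", 0.200)):
    frac = fraction_b(peak, IMIDAZOLE_A, IMIDAZOLE_B)
    nacl = mixed_concentration(frac, NACL_A, NACL_B)
    print(f"{label}: {frac:.0%} Buffer B, about {nacl:.2f} M NaCl")
```

Because Buffer B also carries more NaCl than Buffer A, the ionic strength rises along the gradient together with the imidazole concentration, which is worth keeping in mind when comparing elution positions.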
Sequences
SEQ ID NO: 1
PXP-(His)n
SEQ ID NO: 2
(His)n-PXP
SEQ ID NO: 3
PXP-(His)n-PXP.
SEQ ID NO: 4
PXP-(HZ)n.
SEQ ID NO: 5
PXP-(HZ)n-P.
In SEQ ID NO: 1-5:
X is selected from the group consisting of any amino acid, but is preferably an amino acid having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, or X is selected from the group consisting of Lysine (K), Arginine (R),
Glutamine (Q), Histidine (H) and Asparagine (N), and more preferably X is selected from the group consisting of Lysine (K), Arginine (R) and Glutamine (Q);
Z is an amino acid, such as preferably an amino acid selected from the group consisting of Glycine (G), Serine (S), Threonine (T), Glutamic acid (E), Aspartic acid (D), Glutamine (Q) and Lysine (K), and
n is at least 1, preferably 2-10, and more preferred 4-6.
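As an illustration of the general formulas PXP-(His)n and (His)n-PXP, the following sketch searches a protein sequence for either arrangement. It is a simplified example, not a definition of the claimed scope: it assumes the more preferred choices X = K, R or Q and n = 4-6, and the function name `find_pxp_his_tag` is hypothetical.

```python
import re

# Assumption: restrict X to the more preferred residues (K, R, Q) and n to 4-6.
PREFERRED_X = "KRQ"
PXP_HIS = re.compile(rf"P[{PREFERRED_X}]P(H{{4,6}})(?!H)")      # PXP-(His)n
HIS_PXP = re.compile(rf"(?<!H)(H{{4,6}})P[{PREFERRED_X}]P")     # (His)n-PXP


def find_pxp_his_tag(sequence: str):
    """Return (formula, n) for the first tag arrangement found, or None."""
    match = PXP_HIS.search(sequence)
    if match:
        return ("PXP-(His)n", len(match.group(1)))
    match = HIS_PXP.search(sequence)
    if match:
        return ("(His)n-PXP", len(match.group(1)))
    return None


for seq in ("SRPKPHHHHHH", "MGSSHHHHHHPKP", "INASAPKPHHHHHH"):
    print(seq, find_pxp_his_tag(seq))
```

A sequence of the form PXP-(His)n-PXP would be reported as PXP-(His)n by this sketch, since the first pattern is tried first; the (HZ)n variants are not covered.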
SEQ ID NO: 6
C-terminal protein sequence present in vector 1
INASAPKPHHHHHH
SEQ ID NO: 7
C-terminal protein sequence present in vector 2
INASAPKPHQHRHKHQP
SEQ ID NO: 8
C-terminal protein sequence present in vector 3
INASAPKPHGHTHGHSHGHP
SEQ ID NO: 9
C-terminal protein sequence present in vector 4
INASAPKPHEHDHEHDHEHP
SEQ ID NO: 10
C-terminal protein sequence present in vector 5
SRPKPHHHHHH
SEQ ID NO: 11
C-terminal protein sequence present in vector 6
SRPKPHQHRHKHQP
SEQ ID NO: 12
C-terminal protein sequence present in vector 7
SRPKPHGHTHGHSHGHP
SEQ ID NO: 13
C-terminal protein sequence present in vector 8
SRPKPHEHDHEHDHEHP
SEQ ID NO: 14
N-terminal protein sequence present in vector 9
MGSSHHHHHHPKP
Primers used in the examples for producing a vector of the invention.
| SEQ ID NO | Primer name | Sequence (5' -> 3') |
|---|---|---|
| SEQ ID NO: 15 | V01_up | TCGAGAAAAGAGCTGAGGTCTTAATTAATGCCTCAGCACCAAAGCCACATCATCATCATCATCATTGAT |
| SEQ ID NO: 16 | V01_down | CTAGATCAATGATGATGATGATGATGTGGCTTTGGTGCTGAGGCATTAATTAAGACCTCAGCTCTTTTC |
| SEQ ID NO: 17 | V02_up | TCGAGAAAAGAGCTGAGGTCTTAATTAATGCCTCAGCACCAAAGCCACATCAACATAGACATAAGCATCAACATCCATGAT |
| SEQ ID NO: 18 | V02_down | CTAGATCATGGATGTTGATGCTTATGTCTATGTTGATGTGGCTTTGGTGCTGAGGCATTAATTAAGACCTCAGCTCTTTTC |
| SEQ ID NO: 19 | V03_up | TCGAGAAAAGAGCTGAGGTCTTAATTAATGCCTCAGCACCAAAGCCACATGGACATACTCATGGTCATAGTCATGGACATCCATGAT |
| SEQ ID NO: 20 | V03_down | CTAGATCATGGATGTCCATGACTATGACCATGAGTATGTCCATGTGGCTTTGGTGCTGAGGCATTAATTAAGACCTCAGCTCTTTTC |
| SEQ ID NO: 21 | V04_up | TCGAGAAAAGAGCTGAGGTCTTAATTAATGCCTCAGCACCAAAGCCACATGAACATGATCATGAGCATGACCATGAACATCCATGAT |
| SEQ ID NO: 22 | V04_down | CTAGATCATGGATGTTCATGGTCATGCTCATGATCATGTTCATGTGGCTTTGGTGCTGAGGCATTAATTAAGACCTCAGCTCTTTTC |
| SEQ ID NO: 23 | V05_up | TCGAGAAAAGAGAGGCTGAAGCTGAGGCTCCAGCCGAAGACATTCCAGAATTCGGACATATGGCTT |
| SEQ ID NO: 24 | V05_down | CTAGAAGCCATATGTCCGAATTCTGGAATGTCTTCGGCTGGAGCCTCAGCTTCAGCCTCTCTTTTC |
| SEQ ID NO: 25 | V05a_up | CTAGACCAAAGCCACATCATCATCATCATCATTGAG |
| SEQ ID NO: 26 | V05a_down | TCGACTCAATGATGATGATGATGATGTGGCTTTGGT |
| SEQ ID NO: 27 | V06_up | CTAGACCAAAGCCACATCAACATAGACATAAGCATCAACATCCATGAG |
| SEQ ID NO: 28 | V06_down | TCGACTCATGGATGTTGATGCTTATGTCTATGTTGATGTGGCTTTGGT |
| SEQ ID NO: 29 | V07_up | CTAGACCAAAGCCACATGGACATACTCATGGTCATAGTCATGGACATCCATGAG |
| SEQ ID NO: 30 | V07_down | TCGACTCATGGATGTCCATGACTATGACCATGAGTATGTCCATGTGGCTTTGGT |
| SEQ ID NO: 31 | V08_up | CTAGACCAAAGCCACATGAACATGATCATGAGCATGACCATGAACATCCATGAG |
| SEQ ID NO: 32 | V08_down | TCGACTCATGGATGTTCATGGTCATGCTCATGATCATGTTCATGTGGCTTTGGT |
| SEQ ID NO: 33 | V09_up | CATGGGCAGCAGCCATCATCATCATCATCACCCGAAACCGCA |
| SEQ ID NO: 34 | V09_down | TATGCGGTTTCGGGTGATGATGATGATGATGGCTGCTGC |
| SEQ ID NO: 35 | MBP_fw | GGTCTTAAUCATGAAGACTGAAGAAGGTAAATTGGTAATCT |
| SEQ ID NO: 36 | MBP_rv | GGCATTAAUCGAAGCCGAGTTAGTCTGCGCG |
| SEQ ID NO: 37 | MBP_EcoRI_fw | CTCAGAATTCATGAAGACTGAAGAAGGTAAATTGGTAATCT |
| SEQ ID NO: 38 | MBP_Xba_l_rv | TAGGTCTAGACGAAGCCGAGTTAGTCTGCGCG |
| SEQ ID NO: 39 | MBP_Nde_l_fw | CTCACATATGAAGACTGAAGAAGGTAAATTGGTAATCT |
| SEQ ID NO: 40 | MBP_Xho_l_rv | TAGGCTCGAGTTACGAAGCCGAGTTAGTCTGCGCG |
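The linker oligos listed above are designed to anneal into a duplex with single-stranded ends matching the restriction sites used for ligation. The sketch below checks this for the V05a pair (SEQ ID NO: 25 and 26). It is an illustrative check only and assumes 4-nucleotide 5' overhangs on both strands, as produced by Xba I and Sal I; the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def annealed_overhangs(up: str, down: str, overhang: int = 4):
    """Return the 5' overhangs (up, down) if the duplex cores pair perfectly, else None.

    Assumption: both oligos carry a 5' overhang of `overhang` nucleotides.
    """
    if up[overhang:] != reverse_complement(down[overhang:]):
        return None
    return up[:overhang], down[:overhang]


# V05a_up and V05a_down as listed under SEQ ID NO: 25 and 26.
V05A_UP = "CTAGACCAAAGCCACATCATCATCATCATCATTGAG"
V05A_DOWN = "TCGACTCAATGATGATGATGATGATGTGGCTTTGGT"

print(annealed_overhangs(V05A_UP, V05A_DOWN))
```

The same function does not apply unchanged to the V09 pair, because Nde I leaves a 2-nucleotide overhang rather than a 4-nucleotide one.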
SEQ ID NO: 41
Alpha Mating Factor (aMF, in bold) fusion protein containing Maltose Binding Protein (MBP, underlined) and the C-terminal PXP tag (SEQ ID NO: 6, in italic and bold) inserted using the User™ technology in vector 1
SEQ ID NO: 42
Alpha Mating Factor (aMF, in bold) fusion protein containing Maltose Binding Protein (MBP, underlined) and the C-terminal PXP tag (SEQ ID NO: 7, in italic and bold) inserted using the User™ technology in vector 2.
SEQ ID NO: 43
Alpha Mating Factor (aMF, in bold) fusion protein containing Maltose Binding Protein (MBP, underlined) and the C-terminal PXP tag (SEQ ID NO: 8, in italic and bold) inserted using the User™ technology in vector 3.
SEQ ID NO: 44
Alpha Mating Factor (aMF, in bold) fusion protein containing Maltose Binding Protein (MBP, underlined) and the C-terminal PXP tag (SEQ ID NO: 9, in italic and bold) inserted using the User™ technology in vector 4.
SEQ ID NO: 45
Alpha Mating Factor (aMF, in bold) fusion protein containing Maltose Binding Protein (MBP, underlined) and the C-terminal PXP tag (SEQ ID NO: 10, in italic and bold) inserted using conventional ligation with sticky ends (EcoR I / Xba I) in vector 5.
SEQ ID NO: 46
Alpha Mating Factor (aMF, in bold) fusion protein containing Maltose Binding Protein (MBP, underlined) and the C-terminal PXP tag (SEQ ID NO: 11, in italic and bold) inserted using conventional ligation with sticky ends (EcoR I / Xba I) in vector 6.
SEQ ID NO: 47
Alpha Mating Factor (aMF, in bold) fusion protein containing Maltose Binding Protein (MBP, underlined) and the C-terminal PXP tag (SEQ ID NO: 12, in italic and bold) inserted using conventional ligation with sticky ends (EcoR I / Xba I) in vector 7.
SEQ ID NO: 48
Alpha Mating Factor (aMF, in bold) fusion protein containing Maltose Binding Protein (MBP, underlined) and the C-terminal PXP tag (SEQ ID NO: 13, in italic and bold) inserted using conventional ligation with sticky ends (EcoR I / Xba I) in vector 8.
SEQ ID NO: 49
N-terminal PXP tag (SEQ ID NO: 14, in italic and bold) of vector 9.
MGSSHHHHHHPKP M
SEQ ID NO: 50
Maltose Binding Protein (MBP, underlined) with a N-terminal PXP tag (SEQ ID NO: 14, in italic and bold) inserted using conventional ligation with sticky ends (Nde I / Xho I) in vector 9.
SEQ ID NO: 51
Alpha Mating Factor (aMF, in bold) fusion protein containing the User™ site and the C-terminal PXP tag (SEQ ID NO: 6, in italic and bold) of Vector 1.
SEQ ID NO: 52
Alpha Mating Factor (aMF, in bold) fusion protein containing the User™ inserted Maltose Binding Protein (MBP, underlined) and the C-terminal PXP tag (SEQ ID NO: 6, in italic and bold) in Vector 1.
SEQ ID NO: 53
Alpha Mating Factor (aMF, in bold) fusion protein containing the User™ site and the C-terminal PXP tag (SEQ ID NO: 7, in italic and bold) of Vector 2.
SEQ ID NO: 54
Alpha Mating Factor (aMF, in bold) fusion protein containing the User™ inserted Maltose Binding Protein (MBP, underlined) and the C-terminal PXP tag (SEQ ID NO: 7, in italic and bold) in Vector 2.
SEQ ID NO: 55
Alpha Mating Factor (aMF, in bold) fusion protein containing the User™ site and the C-terminal PXP tag (SEQ ID NO: 8, in italic and bold) of Vector 3.
SEQ ID NO: 56
Alpha Mating Factor (aMF, in bold) fusion protein containing the User™ inserted Maltose Binding Protein (MBP, underlined) and the C-terminal PXP tag (SEQ ID NO: 8, in italic and bold) in Vector 3.
SEQ ID NO: 57
Alpha Mating Factor (aMF, in bold) fusion protein containing the User™ site and the C-terminal PXP tag (SEQ ID NO: 9, in italic and bold) of Vector 4.
SEQ ID NO: 58
Alpha Mating Factor (aMF, in bold) fusion protein containing the User™ inserted Maltose Binding Protein (MBP, underlined) and the C-terminal PXP tag (SEQ ID NO: 9, in italic and bold) in Vector 4.
SEQ ID NO: 59
Alpha Mating Factor (aMF, in bold) fusion protein containing EcoR I, Nde I and Xba I (underlined) in a new polylinker in frame with a C-terminal PXP tag (SEQ ID NO: 10, in italic and bold) of Vector 5.
SEQ ID NO: 60
Alpha Mating Factor (aMF, in bold) fusion protein with MBP (underlined) inserted in EcoR I and Xba I, in frame with a C-terminal PXP tag (SEQ ID NO: 10, in italic and bold), in Vector 5.
SEQ ID NO: 61
Alpha Mating Factor (aMF, in bold) fusion protein containing EcoR I, Nde I and Xba I (underlined) in a new polylinker in frame with a C-terminal PXP tag (SEQ ID NO: 11, in italic and bold) of Vector 6.
SEQ ID NO: 62
Alpha Mating Factor (aMF, in bold) fusion protein with MBP (underlined) inserted in EcoR I and Xba I, in frame with a C-terminal PXP tag (SEQ ID NO: 11, in italic and bold), in Vector 6.
SEQ ID NO: 63
Alpha Mating Factor (aMF, in bold) fusion protein containing EcoR I, Nde I and Xba I (underlined) in a new polylinker in frame with a C-terminal PXP tag (SEQ ID NO: 12, in italic and bold) of Vector 7.
SEQ ID NO: 64
Alpha Mating Factor (aMF, in bold) fusion protein with MBP (underlined) inserted in EcoR I and Xba I, in frame with a C-terminal PXP tag (SEQ ID NO: 12, in italic and bold), in Vector 7.
SEQ ID NO: 65
Alpha Mating Factor (aMF, in bold) fusion protein containing EcoR I, Nde I and Xba I (underlined) in a new polylinker in frame with a C-terminal PXP tag (SEQ ID NO: 13, in italic and bold) of Vector 8.
SEQ ID NO: 66
Alpha Mating Factor (aMF, in bold) fusion protein with MBP (underlined) inserted in EcoR I and Xba I, in frame with a C-terminal PXP tag (SEQ ID NO: 13, in italic and bold), in Vector 8.
SEQ ID NO: 67
Vector 9 polylinker obtained from pET15b (Novagen) with the Nco I / Nde I (underlined) insertion of a new N-terminal PXP tag (SEQ ID NO: 14, in italic and bold).
SEQ ID NO: 68
MBP (underlined) inserted into Nde I / Xho I (underlined, bold) of vector 9 containing an N-terminal PXP tag (SEQ ID NO: 14, in italic and bold).
SEQ ID NO: 69
Oligo V10_up:
SEQ ID NO: 70
Oligo V10_down:
SEQ ID NO: 71
Oligo V11_up:
SEQ ID NO: 72
Oligo V11_down:
SEQ ID NO: 73
N-terminal amino acid sequence of vector 10; the PKP site is in bold:
SEQ ID NO: 74
N-terminal amino acid sequence of vector 11; the PKP site is in bold and the TEV protease site is underlined:
SEQ ID NO: 75
Wheat (Triticum aestivum) Thioredoxin H isoform b construct primer (Thr_Nde_l_fw):
SEQ ID NO: 76
Wheat (Triticum aestivum) Thioredoxin H isoform b construct primer (Thr_EcoR_l_rv):
SEQ ID NO: 77
Wheat (Triticum aestivum) NADPH dependent Thioredoxin reductase (NTR) construct primer (NTR_ATG_fw):
SEQ ID NO: 78
Wheat (Triticum aestivum) NADPH dependent Thioredoxin reductase (NTR) construct primer (NTR_C_t_6xHIS):
SEQ ID NO: 79
Wheat (Triticum aestivum) NADPH dependent Thioredoxin reductase (NTR) construct primer (NTR_C_t_PKP_6xHIS):
Claims
1. A protein tag comprising a Proline-X-Proline (PXP) amino acid motif, wherein X is selected from any amino acid.
2. The protein tag according to any of the preceding claims, wherein said tag is an epitope tag and/or an affinity purification tag and/or a fusion linker for fusion proteins.
3. The protein tag according to any of the preceding claims having the general formula: PXP-(HZ)n or (HZ)n-PXP, where n is at least 1, preferably 1-10, Z is any amino acid, and preferably an amino acid selected from the group consisting of Histidine (H), Glycine (G), Serine (S), Threonine (T), Glutamic acid (E), Aspartic acid (D), Glutamine (Q) and Lysine (K).
4. The protein tag according to claim 3, further comprising a Proline (P) at the C-terminal end, thus having the general formula: PXP-(HZ)n-P.
5. The protein tag according to any of the preceding claims, wherein said tag comprises a polyhistidine tag, where the PXP motif is fused to the N-terminal and/or C-terminal end of the polyhistidine tag, thus having the formula: PXP-(His)n, (His)n-PXP or PXP-(His)n-PXP, where n is at least 2, and preferably 2-10, and even more preferred 4-6.
6. The protein tag according to any of the preceding claims, wherein X is selected from amino acids having an isoelectric point of at least 5, such as at least 6, for example at least 7.
7. The protein tag according to any of the preceding claims, wherein X is selected from the group consisting of Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N).
8. The protein tag according to any of the preceding claims, wherein X is K, Q or R, thus having the amino acid motif PKP, PQP or PRP.
9. The protein tag according to any of the preceding claims, said protein tag comprising or consisting of a sequence selected from the group consisting of:
a) INASAPKPHHHHHH
b) INASAPKPHQHRHKHQP
c) INASAPKPHGHTHGHSHGHP
d) INASAPKPHEHDHEHDHEHP
e) SRPKPHHHHHH
f) SRPKPHQHRHKHQP
g) SRPKPHGHTHGHSHGHP
h) SRPKPHEHDHEHDHEHP, and
i) MGSSHHHHHHPKP
10. The protein tag according to any of the preceding claims, comprising in vivo and/or in vitro posttranslational modifications, such as deaminations and/or phosphorylations.
11. A fusion polypeptide comprising a first polypeptide sequence R1 fused to a protein tag comprising a PXP amino acid motif, where X is any amino acid, but preferably an amino acid having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, or X is selected from the group consisting of Histidine (H), Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), preferably X is K, Q or R.
12. The fusion polypeptide according to claim 11, wherein said protein tag is as defined in any of the preceding claims 1 to 10.
13. The fusion polypeptide according to any one of claims 11 to 12, wherein said protein tag is fused to the N-terminal and/or C-terminal end of said first polypeptide R1.
14. The fusion polypeptide according to any one of claims 11 to 13, wherein said protein tag is inserted into said fusion polypeptide between a first (R1) and a second (R2) polypeptide.
15. The fusion polypeptide according to claim 14, wherein said fusion polypeptide has a general formula selected from the group consisting of: R1-PXP-R2, R1-PXP-(His)n-R2, R1-(His)n-PXP-R2, R1-PXP-(His)n-P-R2, R1-P-(His)n-PXP-R2, R1-PXP-(HZ)n-R2, R1-PXP-(HZ)n-P-R2, R1-(HZ)n-PXP-R2, and R1-P-(HZ)n-PXP-R2, where n is at least 1, preferably 1-10, such as 4-6, Z is any amino acid, but preferably an amino acid selected from the group consisting of Histidine (H), Glycine (G), Serine (S), Threonine (T), Glutamic acid (E), Aspartic acid (D), Glutamine (Q) and Lysine (K), and X is any amino acid, but preferably an amino acid having an isoelectric point of at least 5, such as at least 6, for example at least 7 or above, or X is selected from the group consisting of Histidine (H), Lysine (K), Arginine (R), Glutamine (Q), Histidine (H) and Asparagine (N), preferably X is K, Q or R.
16. The fusion polypeptide according to any one of claims 11 to 15, wherein said protein tag comprises a site-specific protease cleavable site, such as a sequence selected from the group consisting of ENLYFQ/X (Tobacco etch virus protease, TEV), EVLFQ/GP (Human rhinovirus 3C protein), IEGR/ (Factor Xa), DDDDK/ (Enterokinase, EntK), LVPR/GS (Thrombin, Thr protease), DXXD/ (Caspase-3 protease), where / designates the cleavage position.
17. The fusion polypeptide according to any one of claims 11 to 16, wherein said first or second polypeptide is a solubility enhancer, such as Maltose Binding Protein (MBP), NusA, Trx, GST, SUMO, SET, DsbC, Skp, T7PK, GB1 or ZZ.
18. The fusion polypeptide according to any one of claims 11 to 17, wherein said fusion polypeptide comprises a linker sequence between the protein tag and said first and/or second polypeptide.
19. The fusion polypeptide according to any one of claims 11 to 18, wherein said linker sequence comprises
a. a proteolytic cleavage site suitable for separating the first polypeptide from the protein tag, or
b. one or more endonucleolytic cleavage sites.
20. The fusion polypeptide according to any one of claims 11 to 19, wherein a linker sequence is introduced to create fusion proteins or custom proteolytic sites.
21. The fusion polypeptide according to any one of claims 11 to 20, wherein said polypeptide comprises or consists of a sequence selected from the group consisting of any one of SEQ ID NO: 41-50.
22. A nucleic acid encoding a protein tag as defined in any one of claims 1 to 10 and/or a fusion polypeptide as defined in any one of claims 11 to 21.
23. The nucleic acid according to claim 22, wherein said nucleic acid is selected from the group consisting of any one of SEQ ID NO: 15-40, 51-72 and 75-79.
24. A nucleic acid vector comprising a nucleic acid sequence as defined in any one of claims 22 and 23.
25. The nucleic acid vector according to claim 24, wherein said vector is an expression vector, such as a prokaryotic vector or a eukaryotic expression vector, for example a yeast or mammalian expression vector.
26. A recombinant host cell comprising a nucleotide sequence as defined in any one of claims 1-10, fusion polypeptide as defined in any of claims 11 to 21, a nucleic acid sequence as defined in any of claims 22 and 23, and/or a nucleic acid vector as defined in any of claims 24 and 25.
27. The recombinant host cell according to claim 26, wherein said host is selected from plant cells, CHO cells, insect cells, BAC virus and E. coli.
28. The recombinant host cell according to claim 26, wherein said host is selected from any yeast or fungi, such as Saccharomyces cerevisiae, Kluyveromyces lactis, Pichia pastoris or other methylotrophic yeast.
29. The recombinant host cell according to any one of claims 26 to 28, wherein said host cell expresses a fusion polypeptide as defined in any of claims 11 to 20.
30. A kit comprising a nucleic acid sequence as defined in any one of claims 22 and 23, a nucleic acid vector as defined in any of claims 24 and 25, and/or a recombinant host cell as defined in any one of claims 26 to 29.
31. Use of an oligonucleotide comprising a sequence encoding a protein tag as defined in any of claims 1 to 10, or a PXP-comprising part thereof, for introducing said protein tag or part thereof in a nucleic acid cloning and/or expression vector, and/or for fusing said protein tag or part thereof to a polypeptide sequence.
32. The use according to claim 31, wherein said nucleic acid vector and/or for fusion polypeptide comprising a first polypeptide sequence is as defined in any of the preceding claims.
33. The use according to any one of claim 31 and 32, wherein said oligonucleotide is selected from the group consisting of SEQ ID NO: 15-40, 69-72 and 75-79.
34. A method of producing recombinant protein, said method comprising:
(a) culturing a host cell as defined in any one of claims 26 to 29 under suitable growth conditions, which allows the expression of a fusion polypeptide as defined in any one of claims 11 to 20, and
(b) purifying the fusion polypeptide from the culture.
35. The method according to claim 34, wherein the purification comprises
a. contacting a lysate, extract or broth of said host cell containing the fusion polypeptide with a metal chelate resin,
b. optionally, washing the resin-fusion polypeptide complex with a buffer to remove unbound proteins and other materials, and
c. eluting the bound fusion polypeptide from the resin-fusion polypeptide complex.
36. The method according to any one of claims 34 and 35, wherein said metal chelate resin is NTA (nitrilotriacetic acid), TALON™ (Clontech) or IDA (iminodiacetic acid)-charged agarose/silica resin.
37. The method according to any of claims 34 to 36, wherein said metal chelate comprises a bivalent cation, such as Co2+, Zn2+, Cu2+, Ca2+, Cd2+, or Ni2+.
38. The method according to any of claims 34 to 37, wherein said PXP-tag is removed subsequent to its elution from the resin matrix.
39. The method according to claim 38, wherein said PXP-tag is removed by a protease selected from the group consisting of Tobacco etch virus protease, TEV (ENLYFQ/X), Human rhinovirus 3C protein (EVLFQ/GP), Factor Xa (IEGR/), Enterokinase, EntK (DDDDK/), Thrombin, Thr protease (LVPR/GS) and Caspase-3 protease (DXXD/), Tagzyme/DAPase and A-type carboxy peptidases.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| DKPA201170757 | 2011-12-23 | ||
| DKPA201170757 | 2011-12-23 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2013091661A2 (en) | 2013-06-27 |
| WO2013091661A3 (en) | 2013-08-15 |
Family
ID=47561008
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/DK2012/050505 Ceased WO2013091661A2 (en) | 2011-12-23 | 2012-12-21 | Proteolytic resistant protein affinity tag |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2013091661A2 (en) |
Cited By (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106188234A (en) * | 2016-07-27 | 2016-12-07 | 中国石油大学(华东) | A kind of protein purification label and application thereof with metal ion with higher adhesion |
| WO2017180587A2 (en) | 2016-04-11 | 2017-10-19 | Obsidian Therapeutics, Inc. | Regulated biocircuit systems |
| WO2018136572A1 (en) * | 2017-01-18 | 2018-07-26 | Savior Lifetec Corporation | Expression construct and method for producing proteins of interest |
| WO2019241315A1 (en) | 2018-06-12 | 2019-12-19 | Obsidian Therapeutics, Inc. | Pde5 derived regulatory constructs and methods of use in immunotherapy |
| WO2020086742A1 (en) | 2018-10-24 | 2020-04-30 | Obsidian Therapeutics, Inc. | Er tunable protein regulation |
| WO2020185632A1 (en) | 2019-03-08 | 2020-09-17 | Obsidian Therapeutics, Inc. | Human carbonic anhydrase 2 compositions and methods for tunable regulation |
| CN111690039A (en) * | 2020-07-02 | 2020-09-22 | 南开大学 | Self-assembly polypeptide probe for identifying 6xHis tag protein, preparation method and application thereof |
| WO2020252404A1 (en) | 2019-06-12 | 2020-12-17 | Obsidian Therapeutics, Inc. | Ca2 compositions and methods for tunable regulation |
| WO2020252405A1 (en) | 2019-06-12 | 2020-12-17 | Obsidian Therapeutics, Inc. | Ca2 compositions and methods for tunable regulation |
| WO2021046451A1 (en) | 2019-09-06 | 2021-03-11 | Obsidian Therapeutics, Inc. | Compositions and methods for dhfr tunable protein regulation |
| US20210130404A1 (en) * | 2015-12-28 | 2021-05-06 | Idemitsu Kosan Co., Ltd. | Peptide tag and tagged protein including same |
| CN115725000A (en) * | 2022-12-07 | 2023-03-03 | 中国科学技术大学 | Double-label recombinant protein for expressing and purifying small peptide and preparation method and application thereof |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107267475A (en) * | 2017-08-11 | 2017-10-20 | 浙江福斯特新材料研究院有限公司 | A kind of method that metal chelate affinity chromatography purifies thioredoxin |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| AUPN480095A0 (en) * | 1995-08-15 | 1995-09-07 | Commonwealth Scientific And Industrial Research Organisation | Epitope tagging system |
| MXPA02004283A (en) * | 1999-10-29 | 2002-10-17 | Chiron Spa | Neisserial antigenic peptides. |
| AR039067A1 (en) * | 2001-11-09 | 2005-02-09 | Pfizer Prod Inc | ANTIBODIES FOR CD40 |
| GB0214528D0 (en) * | 2002-06-24 | 2002-08-07 | Univ Aberdeen | Materials and methods for induction of immune tolerance |
| US20060099710A1 (en) * | 2004-11-10 | 2006-05-11 | Donnelly Mark I | Vector for improved in vivo production of proteins |
| US20060286047A1 (en) * | 2005-06-21 | 2006-12-21 | Lowe David J | Methods for determining the sequence of a peptide motif having affinity for a substrate |
| FR2890446B1 (en) * | 2005-09-05 | 2008-04-18 | Cis Bio Internat Sa | METHOD FOR DETECTING INTRACELLULAR INTERACTION BETWEEN BIO-MOLECULES |
- 2012-12-21: WO application PCT/DK2012/050505, published as WO2013091661A2; status: not active (ceased)
Non-Patent Citations (5)
| Title |
|---|
| "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY |
| CAREY; SUNDBERG: "Advanced Organic Chemistry", vol. A, B, 1992, PLENUM PRESS |
| DALKEN B.; JABULOWSKY R.A.; OBEROI P.; BENHAR I.; WELS W.S.: "Maltose-Binding Protein Enhances Secretion of Recombinant Human Granzyme B Accompanied by In Vivo Processing of a Precursor MBP Fusion Protein", PLOS ONE, vol. 5, no. 12, 2011, pages E14404, XP002689476, DOI: 10.1371/journal.pone.0014404 |
| J. BIOL. CHEM., vol. 243, 1969, pages 3552 - 59 |
| PURE & APPL. CHEM., vol. 56, no. 5, 1984, pages 595 - 624 |
Cited By (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20210130404A1 (en) * | 2015-12-28 | 2021-05-06 | Idemitsu Kosan Co., Ltd. | Peptide tag and tagged protein including same |
| US12371697B2 (en) * | 2015-12-28 | 2025-07-29 | Idemitsu Kosan Co., Ltd | Peptide tag and tagged protein including same |
| WO2017180587A2 (en) | 2016-04-11 | 2017-10-19 | Obsidian Therapeutics, Inc. | Regulated biocircuit systems |
| CN106188234B (en) * | 2016-07-27 | 2019-05-17 | 中国石油大学(华东) | A kind of protein purification label and its application with metal ion with stronger binding force |
| CN106188234A (en) * | 2016-07-27 | 2016-12-07 | 中国石油大学(华东) | A kind of protein purification label and application thereof with metal ion with higher adhesion |
| WO2018136572A1 (en) * | 2017-01-18 | 2018-07-26 | Savior Lifetec Corporation | Expression construct and method for producing proteins of interest |
| WO2019241315A1 (en) | 2018-06-12 | 2019-12-19 | Obsidian Therapeutics, Inc. | Pde5 derived regulatory constructs and methods of use in immunotherapy |
| WO2020086742A1 (en) | 2018-10-24 | 2020-04-30 | Obsidian Therapeutics, Inc. | Er tunable protein regulation |
| WO2020185632A1 (en) | 2019-03-08 | 2020-09-17 | Obsidian Therapeutics, Inc. | Human carbonic anhydrase 2 compositions and methods for tunable regulation |
| WO2020252405A1 (en) | 2019-06-12 | 2020-12-17 | Obsidian Therapeutics, Inc. | Ca2 compositions and methods for tunable regulation |
| WO2020252404A1 (en) | 2019-06-12 | 2020-12-17 | Obsidian Therapeutics, Inc. | Ca2 compositions and methods for tunable regulation |
| WO2021046451A1 (en) | 2019-09-06 | 2021-03-11 | Obsidian Therapeutics, Inc. | Compositions and methods for dhfr tunable protein regulation |
| CN111690039B (en) * | 2020-07-02 | 2022-04-05 | 南开大学 | Self-assembled peptide probe for recognizing 6xHis-tagged protein, preparation method and application |
| CN111690039A (en) * | 2020-07-02 | 2020-09-22 | 南开大学 | Self-assembly polypeptide probe for identifying 6xHis tag protein, preparation method and application thereof |
| CN115725000A (en) * | 2022-12-07 | 2023-03-03 | 中国科学技术大学 | Double-label recombinant protein for expressing and purifying small peptide and preparation method and application thereof |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2013091661A3 (en) | 2013-08-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2013091661A2 (en) | | Proteolytic resistant protein affinity tag |
| US7655413B2 (en) | | Methods and compositions for enhanced protein expression and purification |
| KR102079293B1 (en) | | Expression sequences |
| CA2819552C (en) | | Mgmt-based method for obtaining high yield of recombinant protein expression |
| Choi et al. | | Recombinant enterokinase light chain with affinity tag: expression from Saccharomyces cerevisiae and its utilities in fusion protein technology |
| KR101526019B1 (en) | | Methods and compositions for enhanced protein expression and purification |
| KR20150008852A (en) | | Method for the preparation of surfactant peptides |
| US20090023898A1 (en) | | Methods and compositions for protein purification |
| US7799561B2 (en) | | Affinity peptides and method for purification of recombinant proteins |
| CA2386471C (en) | | Purification of recombinant proteins fused to multiple epitopes |
| US7176298B2 (en) | | Polynucleotides encoding metal ion affinity peptides and related products |
| AU762984B2 (en) | | Expression vector for use in a one-step purification protocol |
| Li et al. | | Single-step affinity and cost-effective purification of recombinant proteins using the Sepharose-binding lectin-tag from the mushroom Laetiporus sulphureus as fusion partner |
| CA2816217C (en) | | Compositions and methods of producing enterokinase in yeast |
| JP4411192B2 (en) | | Recombinantly expressed carboxypeptidase B and purification thereof |
| KR20160077750A (en) | | Mass production method of recombinant trans glutaminase |
| WO2007096899A2 (en) | | Affinity polypeptide for purification of recombinant proteins |
| US20080039616A1 (en) | | Tandem affinity purification systems and methods utilizing such systems |
| US20170226489A1 (en) | | Solubility and Affinity Tag for Recombinant Protein Expression and Purification |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 12815989; Country of ref document: EP; Kind code of ref document: A2 |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 12815989; Country of ref document: EP; Kind code of ref document: A2 |