
WO1998012308A1 - Soluble polypeptides with activity of the ns3 serine protease of hepatitis c virus, and process for their preparation and isolation - Google Patents

Soluble polypeptides with activity of the ns3 serine protease of hepatitis c virus, and process for their preparation and isolation

Info

Publication number
WO1998012308A1
Authority
WO
WIPO (PCT)
Prior art keywords
hcv
protease
polypeptides
protein
pro
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/IT1997/000228
Other languages
French (fr)
Inventor
Raffaele De Francesco
Anna Tramontano
Licia Tomei
Maria Chiara Nardi
Christian STEINKÜHLER
Andrea Urbani
Rosa Letizia Vitale
Stefano Colloca
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Istituto di Ricerche di Biologia Molecolare P Angeletti SpA
Original Assignee
Istituto di Ricerche di Biologia Molecolare P Angeletti SpA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Istituto di Ricerche di Biologia Molecolare P Angeletti SpA
Priority to EP97942190A (patent EP0950094A1)
Priority to JP10514467A (patent JP2001500735A)
Priority to CA002264487A (patent CA2264487A1)
Priority to AU43970/97A (patent AU4397097A)
Publication of WO1998012308A1

Links

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/48: Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50: Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/503: Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from viruses
    • C12N9/506: Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from viruses derived from RNA viruses

Definitions

  • HCV hepatitis C virus
  • NANB non-A, non-B hepatitis
  • ORF open reading frame
  • NMR methods and X-ray crystallography require large amounts of soluble protein, and at the present time it is not possible to meet this requirement.
  • the simplest and most economical manner of obtaining large amounts of the desired polypeptide is expression of the corresponding gene in bacteria. Although numerous eucaryotic promoters and methods for maximising the expression of heterologous genes in E. coli are widely available, efficient production of the polypeptide in question, although necessary, might not be sufficient.
  • Many recombinant proteins do not fold the polypeptide chain correctly when they are expressed in E. coli. The result is the synthesis of polypeptides which are either degraded in the host cell or accumulate in an insoluble form in so-called inclusion bodies (15).
  • proteins of viral origin or proteins that are toxic for the bacterial cell (as is the case for certain proteases of viral origin) there are insurmountable difficulties in producing them in a native, soluble form.
  • this method is based on the unexpected discovery that the NS3 serine protease domain, in its native conformation, binds a Zn2+ ion. Because, as mentioned above, the structure of the HCV NS3 protease is not yet known, a structural model of the protein was prepared, to be used as a guide during experiments. However, the similarity of the NS3 protease to other serine proteases of known structure is extremely low (less than 15%), which does not allow good alignment between sequences and as a result does not allow construction of a three-dimensional model based solely on homology.
  • the HCV NS3 protease actually has a metal content equivalent to one mole of zinc to each mole of protein, and as is the case in other proteins the zinc is necessary to enable the protein to take on its native structure and become catalytically active (20, 21).
  • the NS3 protease has a binding site for a metal ion and that this binding site is so well preserved, even in viruses that are not phylogenetically close, opens the way to the study of antiviral therapeutic agents whose target site is this very region of the protein.
  • another viral protein that binds Zn 2+ ions that is to say the HIV virus nucleocapsid
  • An object of the present invention is therefore to provide a method for high-yield expression, in a native form, that is to say as a protein containing a bivalent metallic ion, and in a highly soluble form of the HCV NS3 protease using heterologous expression systems, such as E. coli cells transformed using suitable genetic constructs and cultivated in a medium enriched with salts containing divalent metal ions.
  • a further object of the present invention is to provide a general method allowing preparation and isolation in a native, pure and highly soluble form, of large amounts of polypeptides containing Zn + , Co or
  • an additional object of the present invention is to provide a method that allows preparation and isolation in a native, pure and highly soluble form of large amounts of polypeptides with the protease activity of HCV NS3, which are at the same time labelled with stable heavy isotopes such as 13C or 15N, as required for experiments to determine the three-dimensional structure of the protein using NMR.
  • the present invention provides new genetic constructs for the expression, in E. coli cells, of modified polypeptides with the protease activity of HCV
  • a procedure for obtaining production of the NS3 serine protease domain in its native form, that is to say containing a bivalent metal ion, which is necessary for the structural integrity of the protein.
  • the innovation in the procedure consists in the addition to the culture medium in which the transformed bacterial cells are grown of compounds containing metals such as Zn, Co, Cd, Mn, Cu, Ni, Ag, Fe, Cr, Hg, Au, Pt, V. These compounds provide the culture medium with the ions required by the protein to take on its native structure.
  • the protein is found in its native, soluble form in the cytoplasm of the bacterial cells, instead of being held in inclusion bodies, from which it can only be obtained by applying difficult resolubilisation procedures.
  • a procedure is provided that makes it possible to replace the zinc ion in the protease, which is spectroscopically silent, with other ions (for example Co2+ or Cd2+), which are spectroscopically active, so as to permit the study of possible inhibitors capable of co-ordinating the metal contained in the protein and therefore of disturbing the bond between the protein and the metal.
  • bivalent metal ions to a minimum culture medium, containing glucose and ammonium salts enriched with 13C or 15N as the sole sources of carbon and nitrogen, respectively, makes it possible to obtain large amounts of soluble protein marked with stable heavy isotopes such as 13C or 15N.
  • This type of isotope enrichment is necessary to determine the structure using NMR techniques.
  • polypeptide sequences are provided that contain the NS3 serine protease domain of hepatitis C virus, suitably modified.
  • polypeptides are characterised in that they have at their C-terminal end a sequence of extremely hydrophilic amino acids, such as for example a series of lysines, which are not present in the original sequence.
  • Subjects of the present invention are therefore: a) Isolated and purified polypeptides containing the HCV NS3 serine protease domain, characterised in that they have at their C-terminal end a tail of at least three lysines. b) A process for the preparation of polypeptides containing the HCV NS3 serine protease domain in a soluble form, of use for enzymological experiments, determination of the three-dimensional structure of the enzyme both by means of NMR and using X-ray crystallography, comprising the following operations:
  • a process for the renaturation in vitro of the above polypeptides characterised in that it comprises the following operations:
  • Figure 1 shows the alignment between the HCV NS3 serine protease sequence and the viruses GBV-A, GBV-B and GBV-C/HGV (Hcv, Hga, Hgb, Hgc) , with the poliovirus (Pol) 2A cysteine protease.
  • Amino acids conserved in the HCV proteases and in the viruses GBV-A, GBV-B and GBV-C/HGV are shaded.
  • the catalytic residues are underlined and the residues that bind zinc are indicated using the symbol _.
  • Figure 2 shows a diagrammatic model of the NS3 serine protease domain. In particular it shows the position within the structure of the amino acids involved in binding zinc (dark grey) and the catalytic triad
  • Figure 3 shows the effect of the zinc ion on HCV NS3 serine protease activity.
  • Figure 4 shows the effects of the zinc ion on the production of HCV NS3 protease as a soluble protein in E. coli on a minimum culture medium.
  • Column 2 refers to the results of the experiment carried out on the cells without inducing protease production (-IPTG).
  • Columns 3, 4 and 5 indicate that in the absence of ZnCl2 and following the induction of protease production (+IPTG) the protein remains locked in the insoluble portion (indicated by the abbreviation PT).
  • the protease is found entirely in the soluble portion (indicated by the abbreviation SN) .
  • FIG. 5 shows the electronic spectra of the HCV
  • Figure 5a shows the visible and near-UV spectrum of the Co2+-protease.
  • Figure 5b shows the UV absorption spectra of the Zn2+-protease and of the Cd2+-protease.
  • the plasmids pT7-7(Pro BK-asK4), pT7-7(Pro H-asK4), pT7-7(Pro J-asK4) and pT7-7(Pro J8-asK4) were constructed to allow expression in E. coli of polypeptides characterised in that they have a sequence chosen from the ones in the group from SEQ ID NO: 1 to SEQ ID NO: 4.
  • the polypeptides contain the NS3 protease domain of various HCV isolates (BK, H, J and J8, respectively) with the addition of a "tail" of four lysines at the C-terminal end.
  • pT7-7 (Pro BK-asK4) contains the sequence for HCV-BK (EMBL data bank access number: M58335) between the nucleotides 3411 and 3950, cloned in the vector pT7-7.
  • pT7-7 (Pro H-asK4) contains the sequence for HCV-H (EMBL data bank access number: M67463) between the nucleotides 3420 and 3959, cloned in the vector pT7-7.
  • pT7-7 (Pro J-asK4) contains the sequence for HCV-J (EMBL data bank access number: D90208) between the nucleotides 3408 and 3947, cloned in the vector pT7-7.
  • pT7-7(Pro J8-asK4) contains the sequence for HCV-J8 (EMBL data bank access number: D10988/D01221) between the nucleotides 3432 and 3971, cloned in the vector pT7-7.
  • the expression vector pT7-7 is a derivative of pBR322 which contains, in addition to the gene for β-lactamase and the replication origin of ColE1, the promoter and the ribosome binding site of the T7 bacteriophage φ10 gene (24).
  • the fragments coding for the HCV NS3 protease were cloned downstream of the T7 bacteriophage φ10 promoter, in reading frame with the first ATG codon of the gene 10 protein of phage T7 using methods known to the art.
  • the cDNA fragment containing the sequence HCV-BK between nucleotides 3411 and 3950 was amplified by Polymerase Chain Reaction (PCR), using the oligonucleotides PROT(BK-K4)S (SEQ ID NO: 5) and PROT(BK-K4)AS (SEQ ID NO: 6) as primers.
  • the cDNA fragment so obtained was digested with the restriction enzyme NdeI, and cloned in pT7-7, which was first linearised with the restriction enzymes NdeI and SmaI.
  • the cDNA fragment containing the sequence HCV-H between nucleotides 3420 and 3959 was amplified by PCR, using the oligonucleotides PROT(H-K4)S (SEQ ID NO: 7) and PROT(H-K4)AS (SEQ ID NO: 8) as primers.
  • the cDNA fragment so obtained was digested with the restriction enzymes NdeI and EcoRI, and cloned in pT7-7, which was first linearised with the same restriction enzymes.
  • the cDNA fragment containing the sequence HCV-J between nucleotides 3408 and 3947 was amplified by PCR, using the oligonucleotides PROT(J-K4)S (SEQ ID NO: 9) and PROT(J-K4)AS (SEQ ID NO: 10) as primers.
  • the cDNA fragment so obtained was digested with the restriction enzymes NdeI and EcoRI, and cloned in pT7-7, which was first linearised with the same restriction enzymes.
  • the cDNA fragment containing the sequence HCV-J8 between nucleotides 3432 and 3971 was amplified by PCR, using the oligonucleotides PROT(J8-K4)S (SEQ ID NO: 11) and PROT(J8-K4)AS (SEQ ID NO: 12) as primers.
  • the cDNA fragment so obtained was digested with the restriction enzymes NdeI and EcoRI, and cloned in pT7-7, which was first linearised with the same restriction enzymes.
  • the plasmids pT7-7(Pro BK-asK4), pT7-7(Pro H-asK4), pT7-7(Pro J-asK4) and pT7-7(Pro J8-asK4) containing NS3 sequences also contain the gene for β-lactamase, which can be used as a selection marker for E. coli cells transformed with these plasmids.
  • the plasmids are then transformed into the E. coli strain BL21 (DE3), normally used for high levels of expression of genes cloned in expression vectors containing the T7 promoter.
  • the T7 polymerase gene is carried by the bacteriophage λ DE3, which is integrated into the chromosome of BL21 cells (25).
  • Expression of the gene is induced by incubating the cultures at an A600 nm of 0.7-0.9 with 0.4 mM of isopropyl-1-thio-β-D-galactopyranoside (IPTG) for 3 hours at 20°C in LB culture medium supplemented with ZnCl2 at a concentration that can vary from 50 μM to 1 mM.
  • the cells are harvested and washed in phosphate-buffered saline (20 mM sodium phosphate pH 7.5, 140 mM NaCl), after which they are re-suspended in 25 mM sodium phosphate at pH 7.5, 10% glycerol, 500 mM NaCl, 10 mM DTT, 0.5% CHAPS (10 ml per 1 litre of culture medium).
  • the cells are then lysed by passing twice through a "French pressure cell" and the homogenate obtained in this way is centrifuged at 100,000xg for 1 hour, while the nucleic acids are removed by precipitation with 0.5% polyethylenimine.
  • the supernatants are loaded onto a HiLoad 16/10 SP Sepharose High Performance column (Pharmacia), equilibrated with 50 mM sodium phosphate at pH 7.5, 5% glycerol, 3 mM DTT, 0.1% CHAPS (buffer A).
  • the column was washed repeatedly with buffer A and the protease was eluted by applying a gradient of from 0 to 0.6 M NaCl.
  • the fractions containing the protease were then collected and concentrated using a chamber for ultrafiltration under magnetic stirring, equipped with a YM-10 membrane (Amicon) .
  • the sample was then loaded onto an HR 26/60 HiLoad Superdex 75 column (Pharmacia), equilibrated with buffer A, operating at a flow rate of 1 ml/min.
  • the fractions containing NS3 were collected and further purified on an HR 5/5 Mono S column (Pharmacia), equilibrated with buffer B and operating at a flow rate of 1 ml/min.
  • the protease was eluted from the column in pure form by applying a linear gradient of 0-0.6 M NaCl in buffer A.
  • the concentration of the protein was estimated by measuring the absorbance at 280 nm, using an extinction coefficient derived from the sequence data or from quantitative amino acid analysis. Both methods give the same results, within an error of 10%.
  • the purity of the enzyme was ascertained on SDS polyacrylamide gel and by HPLC using a reverse-phase Vydac C4 column (4.6x250 mm, 5 μm, 300 Å).
  • the eluents used were H2O/0.1% TFA (A) and acetonitrile/0.1% TFA (B).
  • a synthetic peptide of 13 amino acids was used as a substrate.
  • This peptide was derived from the cleavage sequence of the NS4A-NS4B junction (DEEMECSSHLPYK) .
  • a peptide with 14 amino acids corresponding to the central hydrophobic region of the protein NS4A (from position 21 to position 34) was used as a protease cofactor.
  • the peptides were synthesised by solid phase synthesis based on Fmoc chemistry. After washing and deprotection, the "raw" peptides were purified by HPLC to 98% purity. The identity of the peptides was determined by mass spectrometry .
  • the peptide stock solutions were prepared in DMSO and stored at -80°C; the concentrations were determined by quantitative amino acid analysis carried out on samples hydrolysed with HCl.
  • the cleavage tests were carried out using 300 nM - 1.6 μM of enzyme in 30 μl of 50 mM Tris pH 7.5, 50% glycerol, 2% CHAPS, 30 mM DTT and appropriate amounts of substrate and/or peptide-NS4A at 22°C.
  • the reaction was stopped by addition of 70 μl of H2O containing 0.1% TFA.
  • Cleavage of the peptide substrate was determined by HPLC using a Merck-Hitachi chromatograph. After this, 90 μl of each sample were injected into a reverse-phase Lichrospher C18 cartridge column (4x125 mm, 5 μm, Merck) and the fragments were separated using an acetonitrile gradient of 3-100% at 2%/min.
  • Tables 1 and 2 give the data for solubility and yield relating to the NS3 protease corresponding to various HCV isolates.
  • Table 1 gives the data for production of the various forms of protease both with and without the addition of four lysines at the C-terminal end, and both with and without the addition of ZnCl2 in the culture medium. The data are expressed as the percentage of protein recovered in the soluble fraction of the cell extracts and the protein found in the inclusion bodies.
  • Table 2 gives the yields and solubility of the various forms of protease, purified from E.
  • the modified proteases (BK-ASK4, J-ASK4, H-ASK4) are between 10 and 20 times more soluble and, when expressed in a culture medium containing an excess of ZnCl2, they give a yield up to 10 times greater than the respective proteases without the lysine tail.
  • the protease (at a concentration of 4 mg/ml) was dialysed for a period of at least 16 hours against a buffer containing 50 mM Tris/HCl pH 7.5, 3 mM DTT, 10% glycerol, 0.1% CHAPS.
  • a Chelex-100 resin (2.5 g/l) was held in suspension in the dialysis buffer to prevent contamination by adventitious metal ions.
  • the protein was then hydrolysed with nitric acid and then used to determine the metal content.
  • the standardised Zn2+, Co2+ and Cd2+ solutions were purchased from Merck.
  • NS3 serine protease activity its proteolytic activity was first measured on a synthetic substrate peptide.
  • NS3 protease were denatured by addition of TFA to a final concentration of 1%.
  • the denatured protein was then purified on a Resource RPC 3 ml column using an acetonitrile gradient of from 0% to 85% in the presence of
  • the apoprotein was diluted to a final concentration of 60 nM in the activity buffer containing the concentrations of ZnCl2 shown in the graph and 10 mM DTT to prevent oxidation of the thiol groups. After an incubation period of 1 hour at 22°C the reaction was started by adding the substrate peptide at a concentration of 40 mM. The reaction was then allowed to proceed for another hour before taking the measurements. As shown in figure 3, reconstitution of the enzymatic activity depends on the concentration of zinc ions in the buffer. Maximum reactivation was observed at a ZnCl2 concentration of 25 μM.
  • HCV NS3 protease contains a structural zinc atom has been used to increase the production of soluble protein in bacterial cells (E. coli) and therefore to produce a protein in a form that can be used for experiments aimed at determining the structure by means of NMR.
  • determination of structure by means of NMR involves metabolic labelling with 15N and 13C, to be carried out on a minimum culture medium, for example modified M9 culture medium ((NH4)2SO4 1 g/l, K-phosphate 100 mM, MgSO4 0.5 mM, CaCl2 0.5 mM, biotin 6 μM, thiamine 7 μM, ampicillin 5 μg/ml, glucose 4 g/l, FeSO4·7H2O 13 μM).
  • FIG. 4 shows how the protease (at approximately 21 kDa - indicated in the figure by an arrow) is produced as an insoluble aggregate (PT) when the bacterial cells are grown in minimum culture medium without zinc (columns 3, 4 and 5) .
  • the Zn2+ binding site of the HCV NS3 protease and zinc can be studied by replacing the zinc with metals that make spectroscopic studies possible.
  • the close binding of the structural zinc to the enzyme makes it difficult to remove the metal and replace it in vitro.
  • the Zn2+ was replaced by Co2+ and Cd2+ by incorporation in vivo.
  • the bacterial cells (E. coli) were transformed with an appropriate expression vector and grown in minimum culture medium containing 100 mM potassium phosphate at pH 7.0, 0.5 mM MgSO4, 0.5 mM CaCl2, 13 μM FeSO4, 7 μM thiamine, 6 μM biotin.
  • Glucose (4 g/l) and (NH4)2SO4 (1 g/l) were used as sources of carbon and nitrogen, respectively.
  • the phosphate buffer was made to pass through a Chelex-100 column.
  • 50 mM of CoCl2 and CdCl2 were added, respectively, 20 minutes before addition of IPTG.
  • Purification of the Co2+- and Cd2+-proteases was carried out using the procedure described in example 1, except that all the buffers used were treated with Chelex-100 resin (2.5 g/l) and the DTT was omitted.
  • the proteases containing Co2+ and Cd2+ were subjected to electronic absorption spectroscopic analysis.
  • the protease containing Co2+ shows a typical absorption spectrum in the visible region (figure 6a), which indicates a binding site with a tetrahedral geometry
  • the two main bands at 640 nm and at 685 nm and the minimums at 585 nm and at 740 nm indicate d-d transitions.
  • the energy in these transitions and the molar extinction coefficients are characteristic of complexes with a distorted tetrahedral co-ordination geometry (27) .
  • the d-d transition energy is consistent with a mixed sulphur-nitrogen co-ordination bond.
  • the centroid of the band corresponding to the d-d transition indicates a Co complex with a S3N bond (26).
  • Lam P.Y., Jadhav P.K., Eyermann C.J., Hodge C., Ru Y., Bacheler L.T., Meek J.L., Otto M.J., Rayner M.M., Wong Y.N., Chang C.-H., Weber P.C., Jackson D.A., Sharpe T.R., Erickson-Viitanen S. (1994) Science 263:380-384. 15. Georgiou G. and Valax P. (1996) Current Opinion in Biotechnology 7:190-197.
  • NAME Pro BK-asK4
  • D OTHER INFORMATION: sequence for the NS3 protease of HCV isolate BK.
  • Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
  • Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr Cys
  • NAME PROT(BK-K4)S (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
  • NAME PROT(BK-K4)AS (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: CTACTTCTTC TTCTTGCTAG CCCGCATAGT AGT 33
  • NAME PROT(H-K4)AS (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: TTTGAATTCC TACTTCTTCT TCTTGCTAGC TCTCATGGTT GT 42
  • NAME PROT(J-K4)S (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: TTTCATATGG CGCCTATCAC GGCCTAT 27
  • NAME PROT(J-K4)AS (xi) SEQUENCE DESCRIPTION SEQ ID NO: 10:
  • MOLECULE TYPE Synthetic DNA
  • ANTISENSE No
  • NAME PROT(J8-K4)S (xi) SEQUENCE DESCRIPTION SEQ ID NO: 12: TTTGAATTCC TACTTCTTCT TCTTGCTAGC CCGTGTGGCG AC 42
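The construct coordinates and antisense primers listed above can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not part of the patent: the helper names (`codons_in_range`, `reverse_complement`, `lysine_tail_length`) are hypothetical, the nucleotide ranges are assumed to be inclusive, and the primers are identified only by the SEQ ID numbers printed with them. It checks that each cloned range spans a whole number of codons and counts the lysine codons (AAG) that the reverse complement of each antisense primer places directly before a stop codon.

```python
import re

# Nucleotide ranges quoted in the text for each isolate (assumed inclusive).
CLONED_RANGES = {
    "BK": (3411, 3950),
    "H": (3420, 3959),
    "J": (3408, 3947),
    "J8": (3432, 3971),
}

# Antisense primers as printed in the sequence listing, spaces removed.
ANTISENSE_PRIMERS = {
    "SEQ ID NO: 6": "CTACTTCTTCTTCTTGCTAGCCCGCATAGTAGT",
    "SEQ ID NO: 8": "TTTGAATTCCTACTTCTTCTTCTTGCTAGCTCTCATGGTTGT",
    "SEQ ID NO: 12": "TTTGAATTCCTACTTCTTCTTCTTGCTAGCCCGTGTGGCGAC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def codons_in_range(first: int, last: int) -> int:
    """Number of codons in an inclusive nucleotide range."""
    length = last - first + 1
    if length % 3 != 0:
        raise ValueError(f"range of {length} nt is not a whole number of codons")
    return length // 3


def lysine_tail_length(antisense_primer: str) -> int:
    """Count consecutive AAG (Lys) codons directly preceding a stop codon
    on the coding strand encoded by an antisense primer."""
    coding = reverse_complement(antisense_primer)
    match = re.search(r"((?:AAG)+)(?:TAG|TAA|TGA)", coding)
    return len(match.group(1)) // 3 if match else 0


for isolate, (first, last) in CLONED_RANGES.items():
    print(isolate, codons_in_range(first, last), "codons")

for name, primer in ANTISENSE_PRIMERS.items():
    print(name, reverse_complement(primer), lysine_tail_length(primer), "Lys codons")
```

Under these assumptions every range covers 540 nucleotides, that is 180 codons, and each antisense primer encodes four lysine codons followed by a stop codon, in line with the 180-residue protease domain and the four-lysine tail described in the text. The pattern search is a plain string match and is not reading-frame aware, so it is only suitable for short primers like these.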

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Virology (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Enzymes And Modification Thereof (AREA)
  • Peptides Or Proteins (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The present invention relates to serine protease NS3 of hepatitis C virus, and in particular to the observation that the NS3 serine protease domain, in its native conformation, binds a Zn2+ ion and that bivalent metallic ions are necessary to the structural integrity of the protein and to the activity of the enzyme. The present invention further relates to recombinant polypeptides which comprise sequences of the NS3 protease and are characterised by a tail of at least three lysines at their C-terminal ends, to increase their solubility. A further subject of the present invention is a new process which allows the expression of said polypeptides, as metalloproteins, with the proteolytic activity of the HCV NS3 protease, in a soluble form and in a quantity sufficient to allow research to identify inhibitors and to determine the three-dimensional structure of the NS3 protease. Figure 4 shows the effects of the zinc ion on the production of the HCV NS3 protease as a soluble protein in E. coli in a minimum culture medium.

Description

SOLUBLE POLYPEPTIDES WITH ACTIVITY OF THE NS3 SERINE PROTEASE OF HEPATITIS C VIRUS, AND PROCESS FOR THEIR PREPARATION AND ISOLATION
DESCRIPTION
The hepatitis C virus (HCV) is the main etiologic agent of non-A, non-B hepatitis (NANB). It is estimated that HCV causes at least 90% of post-transfusional NANB viral hepatitis and 50% of sporadic NANB hepatitis. Although great progress has been made in the selection of blood donors and in the immunological characterisation of blood used for transfusions, there is still a high level of acute HCV infection among those receiving blood transfusions, resulting in one million or more infections every year throughout the world. Approximately 50% of HCV infected individuals develop cirrhosis of the liver within a period that can range from 5 to 40 years, and recent clinical studies suggest that there is a correlation between chronic HCV infection and the development of hepatocellular carcinoma. HCV is an enveloped virus containing a positive-strand RNA genome of approximately 9.4 kb. This virus is a member of the Flaviviridae family, the other members of which are the pestiviruses and flaviviruses.
The RNA genome of HCV has recently been sequenced. Comparison of sequences from the HCV genomes isolated in various parts of the world has shown that these sequences can be extremely heterogeneous. Most of the HCV genome is occupied by an open reading frame (ORF) that can vary between 9030 and 9099 nucleotides. This ORF codes for a single viral polyprotein, the length of which accordingly varies from 3010 to 3033 amino acids. During the virus infection cycle, the polyprotein is proteolytically processed into the individual gene products necessary for replication of the virus. The genes coding for HCV structural proteins are located at the 5' end of the ORF, whereas the region coding for the non-structural proteins occupies the rest of the ORF. The structural proteins consist of: C (core, 21 kDa), E1 (envelope, gp37) and E2 (NS1, gp61). C is a non-glycosylated protein of 21 kDa, which probably forms the viral nucleocapsid. The protein E1 is a glycoprotein of approximately 37 kDa and is believed to be a structural protein of the outer viral envelope. E2, another membrane glycoprotein of 61 kDa, is probably a second structural protein of the outer envelope of the virus. The non-structural region starts with NS2 (p24), a hydrophobic protein of 24 kDa whose function is not known. NS3, a protein of 68 kDa which follows NS2 in the polyprotein, has two functional domains: a serine protease domain in the first 180 amino-terminal amino acids and an RNA-dependent ATPase domain in the carboxy-terminal part. The gene region corresponding to NS4 codes for NS4A (p6), a membrane protein of 54 amino acids, and NS4B (p26). The gene corresponding to NS5 codes for two proteins, NS5A (p56) and NS5B (p65), of 56 and 65 kDa, respectively. Recently it has been shown that the NS5B region has an RNA-dependent RNA polymerase activity (1).
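The relationship between the ORF length and the polyprotein length quoted above is simply three nucleotides per codon. As a minimal Python sketch (the function name `polyprotein_length` is hypothetical, and it is assumed here that the quoted nucleotide counts cover coding codons only):

```python
def polyprotein_length(orf_nucleotides: int) -> int:
    """Amino acids encoded by an ORF, at three nucleotides per codon.

    Assumption: the nucleotide count covers coding codons only.
    """
    if orf_nucleotides % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_nucleotides // 3


# The two extremes of ORF length quoted in the text.
for nucleotides in (9030, 9099):
    print(nucleotides, "nt ->", polyprotein_length(nucleotides), "amino acids")
```

With the two extremes given in the text, 9030 and 9099 nucleotides, this yields 3010 and 3033 amino acids respectively, matching the stated range.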
Various molecular biological studies indicate that the signal peptidase, a protease associated with the endoplasmic reticulum of the host cell, is responsible for proteolytic processing in the structural region, that is to say at the sites C/E1, E1/E2 and E2/NS2 (2). A first protease activity of HCV is responsible for the cleavage between NS2 and NS3. This activity is contained in a region comprising both a part of NS2 and the part of NS3 containing the serine protease domain, but does not use the same catalytic mechanism (3). By contrast, the serine protease contained in the 180 amino acids at the amino-terminal end of NS3 is responsible for cleavage at the junctions between NS3 and NS4A, between NS4A and NS4B, between NS4B and NS5A, and between NS5A and NS5B (4-8). In particular it has been found that the cleavage produced by this serine protease leaves a residue of cysteine or threonine on the amino-terminal side (position P1) and a residue of alanine or serine on the carboxy-terminal side (position P1') of the substrate (6, 9). Recently it has been shown that NS4A binds the N-terminal end of NS3 with its central hydrophobic portion, thereby increasing the proteolytic activity of NS3 at all the cleavage sites on the polyprotein (10-12).
Inhibition of the protease activity would therefore stop the proteolytic processing of the non-structural portion of the HCV polyprotein and, as a consequence, would prevent virus replication in infected cells. This sequence of events has been verified in a flavivirus, homologous to the hepatitis C virus, which infects cells in culture. In this case it has been possible to show that genetic manipulation, producing a protease that is no longer capable of exerting its catalytic activity, abolishes the ability of the virus to replicate (13). Furthermore it has been widely demonstrated, both in vitro and in clinical studies, that compounds capable of interfering with the activity of the HIV protease are capable of inhibiting the replication of this virus (14).
Finally there is evidence of the fact that the NS5 region of HCV, which as we have mentioned above has an RNA dependent RNA-polymerase activity, does not display this function except after processing by the NS3 protease .
Therefore a substance capable of interfering with the proteolytic activity associated with the NS3 protein, could be a new therapeutic agent. From this point of view detailed knowledge of the three-dimensional structure of the protease takes on a great deal of importance, as it would allow both a greater understanding of the biological phenomena in which it is involved, and the analysis, study and design of inhibitor molecules capable of interfering with the protease activity, thus paving the way for the development of pharmaceutical compositions suitable for treatment of hepatitis C.
Nevertheless, determination of the structure, whether using NMR methods or X-ray crystallography, requires large amounts of soluble protein, and at the present time it is not possible to meet this requirement. In fact, although the simplest and most economical manner of obtaining large amounts of the desired polypeptide is expression of the corresponding gene in bacteria, and although there is a widespread availability of numerous eucaryotic promoters and methods for maximising the expression of heterologous genes in E. coli, an efficient production of the polypeptide in question, although necessary, might not be sufficient. Many recombinant proteins do not fold the polypeptide chain correctly when they are expressed in E. coli. The result is the synthesis of polypeptides which are either degraded in the host cell, or are accumulated in an insoluble form in the so-called inclusion bodies (15). Furthermore, in the case of extremely hydrophobic proteins, proteins of viral origin or proteins that are toxic for the bacterial cell (as is the case for certain proteases of viral origin) there are insurmountable difficulties in producing them in a native, soluble form.
In the case of the NS3 serine protease of the hepatitis C virus, due to the conditions in which the protein is normally produced, it has not been possible to date to obtain in E. coli a native type, soluble protease in amounts sufficient to enable the study of the structural nature of this protein, which requires solutions containing a high millimolar concentration of the protein.
It has now unexpectedly been found that these important limitations can be overcome by using the method according to the present invention. As will be seen from the following, this method is based on the unexpected discovery that the NS3 serine protease domain, in its native conformation, binds a Zn ion. Because, as mentioned above, the structure of the HCV NS3 protease is not yet known, a structural model of the protein was prepared, to be used as a guide during experiments. However, the similarity of the NS3 protease to other serine proteases of known structure is extremely low (less than 15%) , which does not allow good alignment between sequences and as a result does not allow construction of a three-dimensional model based solely on homology. For this reason, the available serine protease structures were used to build a multiple alignment of the structurally conserved regions and to draw up in this way a profile with which the sequence of the NS3 protease could subsequently be aligned. In this way it was possible to build an approximate three-dimensional model of the HCV NS3 protease (9, 16) .
Recently, three new viruses responsible for human hepatitis have been discovered (17). These new viruses, known as GBV-A, GBV-B and GBV-C, show a polyprotein organisation in common with that of HCV (18, 19). From alignment of the region corresponding to NS3 in these three new viruses with that of various HCV serotypes, several conserved amino acids were identified. These residues comprise: the amino acids in the active site, some glycines and prolines (probably involved in stabilising the structure of the protein) and three cysteines and one histidine (figure 1). In the model suggested by us for the NS3 protease these last four residues are found in a region of the molecule opposite the active site, in a close spatial relationship, and their relative position is such that it forms a binding site for a divalent metal ion, such as for example the ion Zn2+ (figure 2).
This observation was subsequently confirmed experimentally. In fact, as will be illustrated in greater detail in the examples, the HCV NS3 protease actually has a metal content equivalent to one mole of zinc to each mole of protein, and as is the case in other proteins the zinc is necessary to enable the protein to take on its native structure and become catalytically active (20, 21) .
The fact that the NS3 protease has a binding site for a metal ion and that this binding site is so well conserved, even in viruses that are not phylogenetically close, opens the way to the study of antiviral therapeutic agents whose target site is this very region of the protein. In fact, in the case of another viral protein that binds Zn2+ ions, that is to say the HIV nucleocapsid protein, it has been possible to identify compounds that interfere selectively with the bond between the protein and the Zn2+ ions (22, 23) and it has also been seen that these compounds interfere with the viral infection of cells grown in culture medium.
An object of the present invention is therefore to provide a method for high-yield expression of the HCV NS3 protease in a native form, that is to say as a protein containing a bivalent metal ion, and in a highly soluble form, using heterologous expression systems, such as E. coli cells transformed using suitable genetic constructs and cultivated in a medium enriched with salts containing divalent metal ions.
A further object of the present invention is to provide a general method allowing preparation and isolation in a native, pure and highly soluble form, of large amounts of polypeptides containing Zn2+, Co2+ or Cd2+, with the protease activity of HCV NS3.
Furthermore, an additional object of the present invention is to provide a method that allows preparation and isolation in a native, pure and highly soluble form of large amounts of polypeptides with the protease activity of HCV NS3, which are at the same time marked using stable heavy isotopes such as 13C or 15N, as required for experiments to determine the three-dimensional structure of the protein using NMR.
Finally, the present invention provides new genetic constructs for the expression, in E. coli cells, of modified polypeptides with the protease activity of HCV NS3, having a high yield of the native and soluble form of the HCV NS3 protease.
These and other objects are achieved using one or more of the embodiments of the present invention described below.
In an embodiment of the invention a procedure is provided for obtaining production of the NS3 serine protease domain in its native form, that is to say containing a bivalent metal ion, which is necessary for the structural integrity of the protein. The innovation in the procedure consists in the addition, to the culture medium in which the transformed bacterial cells are grown, of compounds containing metals such as Zn, Co, Cd, Mn, Cu, Ni, Ag, Fe, Cr, Hg, Au, Pt, V. These compounds provide the culture medium with the ions required by the protein to take on its native structure. In this way the protein is found in its native, soluble form in the cytoplasm of the bacterial cells, instead of being held in the inclusion bodies, from which it can only be obtained by applying difficult resolubilisation procedures.
In another embodiment of the invention, a procedure is provided that makes it possible to replace the zinc ion in the protease, which is spectroscopically silent, with other ions (for example Co2+ or Cd2+), which are spectroscopically active, so as to permit the study of possible inhibitors capable of co-ordinating the metal contained in the protein and therefore of disturbing the bond between the protein and the metal.
In another embodiment of the invention, the addition of bivalent metal ions to a minimum culture medium, containing glucose and ammonium salts enriched with 13C or 15N as the sole sources of carbon and nitrogen, respectively, makes it possible to obtain large amounts of soluble protein marked with stable heavy isotopes such as 13C or 15N. This type of isotope enrichment is necessary to determine the structure using NMR techniques . In a further embodiment of the present invention polypeptide sequences are provided that contain the NS3 serine protease domain of hepatitis C virus, suitably modified. These polypeptides are characterised in that they have at their C-terminal end a sequence of extremely hydrophilic amino acids, such as for example a series of lysines, which are not present in the original sequence. By using this other new method there is a substantial improvement in terms of solubility and integrity of the protein produced. These modified protease molecules are also to be considered as a subject of the present invention .
Subjects of the present invention are therefore:
a) Isolated and purified polypeptides containing the HCV NS3 serine protease domain, characterised in that they have at their C-terminal end a tail of at least three lysines.
b) A process for the preparation of polypeptides containing the HCV NS3 serine protease domain in a soluble form, of use for enzymological experiments and for determination of the three-dimensional structure of the enzyme both by means of NMR and using X-ray crystallography, comprising the following operations:
- transformation of a prokaryotic host cell with an expression vector containing a DNA sequence coding for a polypeptide with the proteolytic activity of the HCV NS3 protease;
- growth of the prokaryotic host cell on a special culture medium containing Zn2+ or alternatively salts of transition metals such as Co, Cd, Mn, Cu, Ni, Ag, Fe, Cr, Hg, Au, Pt, V;
- expression of the DNA sequence required to produce the chosen polypeptide;
- purification of the polypeptide without having to resort to resolubilisation protocols, and without the need for renaturation of the protein from inclusion bodies.
c) A process for the renaturation in vitro of the above polypeptides, characterised in that it comprises the following operations:
- transformation of a prokaryotic host cell with an expression vector containing a DNA sequence coding for a polypeptide with the proteolytic activity of HCV NS3 protease;
- expression of the DNA sequence required to produce the chosen polypeptide;
- purification of the denatured and renatured polypeptide using buffers containing Zn2+ or alternatively salts of transition metals such as Co, Cd, Mn, Cu, Ni, Ag, Fe, Cr, Hg, Au, Pt, V.
d) Expression vectors for the production of the polypeptides represented by the sequences SEQ ID NO:1 to SEQ ID NO:4 with the proteolytic activity of HCV NS3, comprising: a polynucleotide coding for one of said polypeptides; regulation, transcription and translation sequences, operating in said host cell, operationally bonded to said polynucleotide; and, optionally, a selectable marker.
e) A prokaryotic cell transformed with an expression vector containing a DNA sequence coding for polypeptides with the proteolytic activity of the HCV NS3 protease, so as to allow said host cell to express the specific polypeptide which is coded in the chosen sequence.
Figure 1 shows the alignment between the HCV NS3 serine protease sequence and the viruses GBV-A, GBV-B and GBV-C/HGV (Hcv, Hga, Hgb, Hgc) , with the poliovirus (Pol) 2A cysteine protease. Amino acids conserved in the HCV proteases and in the viruses GBV-A, GBV-B and GBV-C/HGV are shaded. The catalytic residues are underlined and the residues that bind zinc are indicated using the symbol _.
Figure 2 shows a diagrammatic model of the NS3 serine protease domain. In particular it shows the position within the structure of the amino acids involved in binding zinc (dark grey) and the catalytic triad (light grey).
Figure 3 shows the effect of the zinc ion on HCV NS3 serine protease activity.
Figure 4 shows the effects of the zinc ion on the production of HCV NS3 protease as a soluble protein in E. coli on a minimum culture medium. Column 2 refers to the results of the experiment carried out on the cells without inducing protease production (-IPTG). Columns 3, 4 and 5 indicate that in the absence of ZnCl2 and following the induction of protease production (+IPTG) the protein remains locked in the insoluble portion (indicated by the abbreviation PT). On the contrary, in the presence of ZnCl2 the protease is found entirely in the soluble portion (indicated by the abbreviation SN).
Figure 5 shows the electronic spectra of the HCV NS3 protease. Figure 5a shows the visible and near-UV spectrum of the Co2+-protease. Figure 5b shows the UV absorption spectra of the Zn2+-protease and of the Cd2+-protease.
DEPOSITS
Strains of E. coli DH1/p bacteria transformed with the plasmids pT7-7(Pro BK-asK4), pT7-7(Pro J-asK4), pT7-7(Pro H-asK4) and pT7-7(Pro J8-asK4) and coding for the amino acid sequences SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3 and SEQ ID NO:4, respectively, were deposited on August 8, 1996 with The National Collections of Industrial and Marine Bacteria Ltd (NCIMB), Aberdeen, Scotland, U.K., under accession numbers NCIMB 40821, NCIMB 40822, NCIMB 40823 and NCIMB 40824, respectively.
Up to this point a general description has been given of the present invention. With the aid of the following examples a more detailed description of specific embodiments of the invention will now be given, with the aim of clarifying the objects, characteristics, advantages and methods of application thereof.
EXAMPLE 1
EXPRESSION AND PURIFICATION OF POLYPEPTIDES WITH THE PROTEASE ACTIVITY OF HCV NS3, IN THEIR NATIVE SOLUBLE FORM
The plasmids pT7-7(Pro BK-asK4), pT7-7(Pro H-asK4), pT7-7(Pro J-asK4) and pT7-7(Pro J8-asK4) were constructed to allow expression in E. coli of polypeptides characterised in that they have a sequence chosen from the ones in the group from SEQ ID NO : 1 to SEQ ID NO : 4. The polypeptides contain the NS3 protease domain of various HCV isolates (BK, H, J and J8 , respectively) with the addition of a "tail" of four lysines at the C- terminal end. pT7-7 (Pro BK-asK4) contains the sequence for HCV-BK (EMBL data bank access number: M58335) between the nucleotides 3411 and 3950, cloned in the vector pT7-7. pT7-7 (Pro H-asK4) contains the sequence for HCV-H (EMBL data bank access number: M67463) between the nucleotides 3420 and 3959, cloned in the vector pT7-7. pT7-7 (Pro J-asK4) contains the sequence for HCV-J (EMBL data bank access number: D90208) between the nucleotides 3408 and 3947, cloned in the vector pT7-7. pT7-7 (Pro J8-asK4) contains the sequence for HCV-J8 (EMBL data bank access number: D10988/D01221) between the nucleotides 3432 and 3971, cloned in the vector pT7- 7.
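The nucleotide coordinates quoted above can be checked against the 180-residue protease domain with a few lines of arithmetic. The sketch below assumes that the quoted ranges are inclusive (the text does not state this) and uses hypothetical names; it counts only the HCV-derived codons and ignores the lysine tail and any vector-derived residues.

```python
# Nucleotide ranges of the cloned fragments, as quoted in the text.
FRAGMENTS = {
    "HCV-BK": (3411, 3950),
    "HCV-H": (3420, 3959),
    "HCV-J": (3408, 3947),
    "HCV-J8": (3432, 3971),
}


def fragment_summary(start: int, end: int) -> tuple[int, int, int]:
    """Return (length in nt, whole codons, leftover nt).

    Assumption: both coordinates are inclusive.
    """
    length = end - start + 1
    codons, leftover = divmod(length, 3)
    return length, codons, leftover


for isolate, (start, end) in FRAGMENTS.items():
    length, codons, leftover = fragment_summary(start, end)
    print(f"{isolate}: {length} nt, {codons} codons, {leftover} nt left over")
```

Under the inclusive-range assumption each fragment spans 540 nucleotides, that is 180 codons in frame, matching the size of the serine protease domain described for NS3.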
The expression vector pT7-7 is a derivative of pBR322 which contains, in addition to the gene for β-lactamase and the replication origin of ColE1, the promoter and the ribosome binding site of the T7 bacteriophage φ10 gene (24).
The fragments coding for the HCV NS3 protease were cloned downstream of the T7 bacteriophage φ10 promoter, in reading frame with the first ATG codon of the gene 10 protein of phage T7, using methods known to the art.
The cDNA fragment containing the sequence HCV-BK between nucleotides 3411 and 3950 was amplified by Polymerase Chain Reaction (PCR), using the oligonucleotides PROT(BK-K4)S (SEQ ID NO:5) and PROT(BK-K4)AS (SEQ ID NO:6) as primers. The cDNA fragment so obtained was digested with the restriction enzyme NdeI, and cloned in pT7-7, which was first linearised with the restriction enzymes NdeI and SmaI.
The cDNA fragment containing the sequence HCV-H between nucleotides 3420 and 3959 was amplified by PCR, using the oligonucleotides PROT(H-K4)S (SEQ ID NO:7) and PROT(H-K4)AS (SEQ ID NO:8) as primers. The cDNA fragment so obtained was digested with the restriction enzymes NdeI and EcoRI, and cloned in pT7-7, which was first linearised with the same restriction enzymes.
The cDNA fragment containing the sequence HCV-J between nucleotides 3408 and 3947 was amplified by PCR, using the oligonucleotides PROT(J-K4)S (SEQ ID NO:9) and PROT(J-K4)AS (SEQ ID NO:10) as primers. The cDNA fragment so obtained was digested with the restriction enzymes NdeI and EcoRI, and cloned in pT7-7, which was first linearised with the same restriction enzymes.
The cDNA fragment containing the sequence HCV-J8 between nucleotides 3432 and 3971 was amplified by PCR, using the oligonucleotides PROT(J8-K4)S (SEQ ID NO:11) and PROT(J8-K4)AS (SEQ ID NO:12) as primers. The cDNA fragment so obtained was digested with the restriction enzymes NdeI and EcoRI, and cloned in pT7-7, which was first linearised with the same restriction enzymes.
The plasmids pT7-7(Pro BK-asK4), pT7-7(Pro H-asK4), pT7-7(Pro J-asK4) and pT7-7(Pro J8-asK4) containing NS3 sequences also contain the gene for β-lactamase, which can be used as a selection marker for E. coli cells transformed with these plasmids.
The plasmids are then transformed into the E. coli strain BL21(DE3), normally used for high levels of expression of genes cloned in expression vectors containing the T7 promoter. In this strain the T7 polymerase gene is carried by the bacteriophage λ DE3, which is integrated into the chromosome of BL21 cells (25). Expression of the gene is induced by incubating the cultures at an A600 of 0.7-0.9 with 0.4 mM isopropyl-1-thio-β-D-galactopyranoside (IPTG) for 3 hours at 20°C in LB culture medium supplemented with ZnCl2 at a concentration that can vary from 50 μM to 1 mM. After the three hours have passed the cells are harvested and washed in a phosphate-buffered saline solution (20 mM sodium phosphate pH 7.5, 140 mM NaCl), after which they are re-suspended in 25 mM sodium phosphate at pH 7.5, 10% glycerol, 500 mM NaCl, 10 mM DTT, 0.5% CHAPS (10 ml per 1 litre of culture medium). The cells are then lysed by passing twice through a "French pressure cell" and the homogenate obtained in this way is centrifuged at 100,000xg for 1 hour, while the nucleic acids are removed by precipitation with 0.5% polyethylenimine. The supernatants are loaded onto a HiLoad 16/10 SP Sepharose High Performance column (Pharmacia) equilibrated with 50 mM sodium phosphate at pH 7.5, 5% glycerol, 3 mM DTT, 0.1% CHAPS (buffer A). The column was washed repeatedly with buffer A and the protease was eluted by applying a gradient of from 0 to 0.6 M NaCl. The fractions containing the protease were then collected and concentrated using a chamber for ultrafiltration under magnetic stirring, equipped with a YM-10 membrane (Amicon). The sample was then loaded onto an HR 26/60 HiLoad Superdex 75 column (Pharmacia), equilibrated with buffer A, operating at a flow rate of 1 ml/min. The fractions containing NS3 were collected and further purified on an HR 5/5 Mono S column (Pharmacia), equilibrated with buffer B and operating at a flow rate of 1 ml/min. The protease was eluted from the column in pure form by applying a linear gradient of 0-0.6 M NaCl in buffer A.
After this step the protein was stored in stocks at concentrations of 50-150 μM at a temperature of -80°C after freezing in liquid nitrogen. The concentration of the protein was estimated by determination of absorbance at 280 nm using an extinction coefficient derived from the sequence data or from quantitative amino acid analysis. Both methods give the same results, with an error of 10%. The purity of the enzyme was ascertained on SDS polyacrylamide gel and by HPLC using a reversed-phase Vydac C4 column (4.6x250 mm, 5 μm, 300 Å). The eluents used were H2O/0.1% TFA (A) and acetonitrile/0.1% TFA (B). A linear gradient of from 3% to 95% B over 60 minutes was used. Analysis of the N-terminal end, carried out using Edman degradation on a gas-phase sequencer (Applied Biosystems model 470A), and analysis by mass spectrometry revealed that more than 96% of the purified protein has the N-terminal sequence PITAYSSQ. The remaining 3% has the sequence MAPITAYSSQ, as expected from the data on the nucleotide sequence.
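The estimate of protein concentration from the absorbance at 280 nm follows the Beer-Lambert law, A = ε·c·l. The sketch below is illustrative only: it assumes the widely used per-residue approximation for ε at 280 nm (5500 per tryptophan, 1490 per tyrosine and 125 per cystine, in M^-1 cm^-1), and the residue counts and absorbance reading are hypothetical placeholders, not values taken from SEQ ID NO:1 to 4 or from the experiments described here.

```python
def extinction_coefficient(n_trp: int, n_tyr: int, n_cystine: int = 0) -> float:
    """Approximate molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Assumption: per-residue contributions of 5500 (Trp), 1490 (Tyr)
    and 125 (cystine), a common approximation for folded proteins.
    """
    return 5500.0 * n_trp + 1490.0 * n_tyr + 125.0 * n_cystine


def molar_concentration(a280: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law solved for concentration: c = A / (epsilon * l)."""
    return a280 / (epsilon * path_cm)


# Hypothetical residue counts and absorbance reading, for illustration only.
eps = extinction_coefficient(n_trp=2, n_tyr=5)
conc = molar_concentration(a280=0.9225, epsilon=eps)
print(f"epsilon = {eps:.0f} M^-1 cm^-1, concentration = {conc * 1e6:.1f} uM")
```

With these placeholder inputs the result falls at the lower end of the 50-150 μM stock range mentioned above; real values would have to be computed from the actual sequence.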
In order to measure the enzymatic activity of the purified protein, a synthetic peptide of 13 amino acids was used as a substrate. This peptide was derived from the cleavage sequence of the NS4A-NS4B junction (DEEMECSSHLPYK). A peptide with 14 amino acids corresponding to the central hydrophobic region of the protein NS4A (from position 21 to position 34) (Pep4A21-34: GSVVIVGRIILSGR) was used as a protease cofactor.
The peptides were synthesised by solid phase synthesis based on Fmoc chemistry. After cleavage from the resin and deprotection, the "raw" peptides were purified by HPLC to 98% purity. The identity of the peptides was determined by mass spectrometry. Stock solutions of the peptides were prepared in DMSO and stored at -80°C; their concentrations were determined by quantitative amino acid analysis carried out on samples hydrolysed with HCl.
The cleavage tests were carried out using 300 nM - 1.6 μM of enzyme in 30 μl of 50 mM Tris pH 7.5, 50% glycerol, 2% CHAPS, 30 mM DTT and appropriate amounts of substrate and/or peptide-NS4A at 22°C. The reaction was stopped by addition of 70 μl of H2O containing 0.1% TFA. Cleavage of the peptide substrate was determined by HPLC using a Merck-Hitachi chromatograph. For this, 90 μl of each sample were injected into a reversed-phase Lichrospher C18 cartridge column (4x125 mm, 5 μm, Merck) and the fragments were separated using an acetonitrile gradient of 3-100% at 2%/min. Peaks were identified by following both the absorbance at 220 nm and the fluorescence of the tyrosine (λex= 260 nm, λem= 305 nm).
Tables 1 and 2 give the data for solubility and yield relating to the NS3 protease corresponding to various HCV isolates. Table 1 gives the data for production of the various forms of protease both with and without the addition of four lysines at the C-terminal end, and both with and without the addition of ZnCl2 in the culture medium. The data are expressed as the percentage of protein recovered in the soluble fraction of the cell extracts and the protein found in the inclusion bodies. Table 2 gives the yields and solubility of the various forms of protease, purified from E. coli cells grown in the presence of ZnCl2. As can be seen from the results given, the modified proteases (BK-asK4, J-asK4, H-asK4) are between 10 and 20 times more soluble and, when expressed in a culture medium containing an excess of ZnCl2, they give a yield up to 10 times greater than the respective proteases without the lysine tail.
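TABLE 1
Construct | Culture medium | Soluble portion | Inclusion bodies
[Table body supplied as an image in the original (imgf000018_0001). Values surviving in the text, without row or column assignment: 95%, 80%, <1%, <1%, >98%, >95%, 95%, 50%.]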
TABLE 2
Construct      Yield (mg/l medium)   Solubility (mg/ml)
Pro BK         1-2                   1-2
Pro BK-asK4    10-15                 >40
Pro H          0.1-0.2               1-2
Pro H-asK4     1-2                   >40
Pro J          1-2                   0.5-1
Pro J-asK4     15-20                 >10
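The yield improvement conferred by the lysine tail can be read off Table 2 as a range of fold increases. The sketch below is one illustrative way of doing this; it takes the most conservative and the most optimistic ratio for each isolate, and the dictionary and function names are hypothetical.

```python
# Yield ranges in mg per litre of medium, copied from Table 2:
# isolate -> (unmodified protease, protease with the asK4 tail)
YIELDS = {
    "BK": ((1.0, 2.0), (10.0, 15.0)),
    "H": ((0.1, 0.2), (1.0, 2.0)),
    "J": ((1.0, 2.0), (15.0, 20.0)),
}


def fold_increase(unmodified, tailed):
    """Return (lowest, highest) fold increase compatible with two ranges."""
    low_u, high_u = unmodified
    low_t, high_t = tailed
    return low_t / high_u, high_t / low_u


for isolate, (unmodified, tailed) in YIELDS.items():
    low, high = fold_increase(unmodified, tailed)
    print(f"Pro {isolate}: {low:.1f}- to {high:.1f}-fold higher yield")
```

Because both entries in each pair are ranges, the ratio is itself only bounded; the bounds obtained this way bracket the roughly ten-fold improvement described in the text.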
EXAMPLE 2
DETERMINATION OF THE METAL CONTENT OF POLYPEPTIDES WITH THE PROTEOLYTIC ACTIVITY OF HCV NS3 PROTEASE
The polypeptides purified as described in examples 1, 3 and 5 were further dialysed against buffers containing a chelating agent, in order to remove any metal ions bound to the protein, and their metal content was determined by atomic absorption spectrometry using a Perkin-Elmer spectrometer. The glass equipment used for analysis of the metal content was washed using 30% nitric acid and rinsed thoroughly with deionised water. The protease (at a concentration of 4 mg/ml) was dialysed for a period of at least 16 hours against a buffer containing 50 mM Tris/HCl pH 7.5, 3 mM DTT, 10% glycerol, 0.1% CHAPS. A Chelex-100 resin (2.5 g/l) was held in suspension in the dialysis buffer to prevent contamination by adventitious metal ions. The protein was then hydrolysed with nitric acid and used to determine the metal content. The standardised Zn2+, Co2+ and Cd2+ solutions were purchased from Merck.
The metal content was found to be 1 g-atom per mole of enzyme (see table 3 - n.d.= not determined), with the exception of the apoprotein, which has a negligible metal content.
TABLE 3
Protein | Zn (g-atoms/mole) | Co (g-atoms/mole) | Cd (g-atoms/mole)
[Table body supplied as images in the original (imgf000019_0001, imgf000019_0002).]
n.d.: not determined
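The g-atoms per mole figure reported in Table 3 is the molar ratio of metal to protein. The sketch below shows how such a ratio could be obtained from an atomic absorption reading; the 4 mg/ml protein concentration comes from the text, whereas the molecular mass of about 21 kDa is taken as an approximation from the description of figure 4, and the zinc reading is a hypothetical value chosen for illustration.

```python
ZN_ATOMIC_MASS = 65.38  # g/mol, standard atomic weight of zinc


def metal_per_protein(metal_mg_per_l: float, atomic_mass: float,
                      protein_mg_per_ml: float, protein_mass_da: float) -> float:
    """Return the molar ratio of metal to protein (g-atoms per mole)."""
    metal_molar = metal_mg_per_l / 1000.0 / atomic_mass   # mol/l
    # mg/ml is numerically equal to g/l, so dividing by g/mol gives mol/l.
    protein_molar = protein_mg_per_ml / protein_mass_da   # mol/l
    return metal_molar / protein_molar


# Hypothetical zinc reading; protein mass of ~21 kDa is an approximation.
ratio = metal_per_protein(metal_mg_per_l=12.45, atomic_mass=ZN_ATOMIC_MASS,
                          protein_mg_per_ml=4.0, protein_mass_da=21000.0)
print(f"Zn per protein: {ratio:.2f} g-atoms/mole")
```

A reading close to 12.5 mg/l of zinc in a 4 mg/ml sample would thus correspond to about one zinc ion per protein molecule, while an apoprotein sample would give a ratio near zero.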
EXAMPLE 3
PROCEDURE FOR THE RENATURATION OF THE NS3 PROTEASE IN THE PRESENCE OF ZINC
To ascertain whether or not zinc is required for HCV NS3 serine protease activity, its proteolytic activity was first measured on a synthetic substrate peptide. This measurement was carried out in the presence of increasing concentrations of EDTA or of 1,10-phenanthroline. It was found that these two compounds do not inhibit proteolysis by NS3 at concentrations lower than 1 mM. Above these concentrations both EDTA and 1,10-phenanthroline only show a modest level of inhibition of NS3 activity. However, a similar inhibition behaviour was obtained in control experiments using a structural analogue of 1,10-phenanthroline that is not capable of chelating zinc ions, and the activity was not restored in the presence of an excess of Zn2+ ions. These results suggest that either zinc is not required for enzymatic activity, or that it is so strongly bound to the protein that it cannot be removed by treatment with chelating agents. It was therefore decided to proceed with preparation of a protein containing no zinc (apoprotein) and to measure its biochemical activity in the absence and in the presence of this metal. Bound zinc cannot be removed by dialysis against chelators at a pH exceeding 7, whereas on the other hand prolonged dialysis of the enzyme at a pH of less than 5 and in the presence of 10 mM EDTA causes a loss of zinc accompanied by irreversible precipitation of the sample. The above observations suggest that the zinc is strongly bound and that it is essential for the structural integrity of the protein. In order to facilitate the release of zinc the apoprotein was obtained by applying the following procedure: 1.7 mg of NS3 protease were denatured by addition of TFA to a final concentration of 1%. The denatured protein was then purified on a Resource RPC 3 ml column using an acetonitrile gradient of from 0% to 85% in the presence of 0.1% TFA. The flow rate of the column was 2 ml/min and the volume of the gradient was 45 ml. The zinc content of the apoprotein was found to be negligible. The enzymatic activity of the apoprotein was then tested in the presence and in the absence of zinc.
The apoprotein was diluted to a final concentration of 60 nM in the activity buffer containing the concentrations of ZnCl2 shown in the graph and 10 mM DTT to prevent oxidation of the thiol groups. After an incubation period of 1 hour at 22°C the reaction was started by adding the substrate peptide at a concentration of 40 mM. The reaction was then allowed to proceed for another hour before taking the measurements. As shown in figure 3, reconstitution of the enzymatic activity depends on the concentration of zinc ions in the buffer. Maximum reactivation was observed at a ZnCl2 concentration of 25 μM. At this concentration the enzymatic activity is found to be approximately 50% when compared to the protease containing zinc (diluted in the same buffer at the same final concentration). This experiment gives unequivocal proof that zinc is necessary in order for the enzyme to be structurally complete and active, and it also provides a method for reconstitution of NS3 serine protease activity starting from the apoprotein.
EXAMPLE 4
PROCESS FOR THE PRODUCTION OF HCV NS3 PROTEASE IN A FORM THAT CAN BE USED FOR DETERMINATION OF THE THREE-DIMENSIONAL STRUCTURE THEREOF USING NMR TECHNIQUES
The discovery that HCV NS3 protease contains a structural zinc atom has been used to increase the production of soluble protein in bacterial cells (E. coli) and therefore to produce a protein in a form that can be used for experiments aimed at determining the structure by means of NMR.
In effect, determination of structure by means of NMR involves metabolic marking with 15N and 13C, to be carried out on a minimum culture medium, for example modified M9 culture medium ((NH4)2SO4 1 g/l, K-phosphate 100 mM, MgSO4 0.5 mM, CaCl2 0.5 mM, biotin 6 μM, thiamine 7 μM, ampicillin 5 μg/ml, glucose 4 g/l, FeSO4.7H2O 13 μM). Induction in this culture medium, which does not include zinc salts in its composition, inevitably results in the production of insoluble protein, whereas the addition of 50 μM of ZnCl2 results in the production of a completely soluble protease. In this way it is possible to produce a marked protein using (15NH4)2SO4 as a source of nitrogen and 13C-glucose as a source of carbon.
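When preparing such a medium, the micromolar supplement has to be converted into a weighable mass. The sketch below performs that conversion for the 50 μM ZnCl2 addition described above; the molar mass is computed from rounded standard atomic weights, and the function name is hypothetical.

```python
# Rounded standard atomic weights (g/mol).
ATOMIC_MASS = {"Zn": 65.38, "Cl": 35.45}

# Molar mass of anhydrous ZnCl2.
ZNCL2_MOLAR_MASS = ATOMIC_MASS["Zn"] + 2 * ATOMIC_MASS["Cl"]  # 136.28 g/mol


def mg_per_litre(concentration_um: float, molar_mass: float) -> float:
    """Convert a micromolar concentration into mg of solute per litre."""
    moles_per_litre = concentration_um * 1e-6
    return moles_per_litre * molar_mass * 1000.0  # g -> mg


mass = mg_per_litre(50.0, ZNCL2_MOLAR_MASS)
print(f"50 uM ZnCl2 = {mass:.2f} mg per litre of medium")
```

The amount is under 7 mg per litre, so in practice the salt would normally be added from a concentrated stock solution rather than weighed directly.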
Following this new procedure, a protein has been obtained that remains in a soluble form in the cytoplasm and is not sequestered in inclusion bodies, as was the case using the old procedures. In this way the resolubilisation procedures become unnecessary, which results in considerable advantages, as these procedures have an extremely variable yield, require extremely controlled conditions and also frequently cause irreversible alterations in the protein. Figure 4 shows how the protease (at approximately 21 kDa - indicated in the figure by an arrow) is produced as an insoluble aggregate (PT) when the bacterial cells are grown in minimum culture medium without zinc (columns 3, 4 and 5). On the contrary, if ZnCl2 is added to the culture medium at a concentration of 50 μM the protein is found in the soluble fraction (SN) (columns 6, 7 and 8) and disappears from the insoluble fraction (PT).
EXAMPLE 5
REPLACEMENT OF THE Zn2* ROUND TO NS3 WITH SPECTROSCOPTC PROBES SUCH AS Co2 OR C 2*
The Zn2+ binding site of the HCV NS3 protease can be studied by replacing the zinc with metals that make spectroscopic studies possible. The tight binding of the structural zinc to the enzyme makes it difficult to remove the metal and replace it in vitro. As a result, the Zn2+ was replaced by Co2+ and Cd2+ by incorporation in vivo. The bacterial cells (E. coli) were transformed with an appropriate expression vector and grown in minimal culture medium containing 100 mM potassium phosphate at pH 7.0, 0.5 mM MgSO4, 0.5 mM CaCl2, 13 μM FeSO4, 7 μM thiamine and 6 μM biotin. Glucose (4 g/l) and (NH4)2SO4 (1 g/l) were used as sources of carbon and nitrogen, respectively. To reduce the amount of Zn2+ in the culture medium, the phosphate buffer was passed through a Chelex-100 column. To obtain production of Co2+- or Cd2+-NS3, 50 mM of CoCl2 or CdCl2, respectively, was added 20 minutes before addition of IPTG. The Co2+- and Cd2+-proteases were purified using the procedure described in example 1, except that all the buffers used were treated with Chelex-100 resin (2.5 g/l) and the DTT was omitted.
The addition of CoCl2 or CdCl2 to the culture medium still results in production of the soluble enzyme, which indicates that the Co2+ and Cd2+ ions can replace zinc in the metal-binding site of the protease.
The protease containing Co2+ and Cd2+ was subjected to electronic absorption spectroscopic analysis. The protease containing Co2+ shows a typical absorption spectrum in the visible region (figure 6a), which indicates a binding site with a tetrahedral geometry (26). The two main bands at 640 nm and at 685 nm and the minima at 585 nm and at 740 nm indicate d-d transitions. The energy of these transitions and the molar extinction coefficients are characteristic of complexes with a distorted tetrahedral co-ordination geometry (27). The d-d transition energy is consistent with mixed sulphur-nitrogen co-ordination. Furthermore, the centroid of the band corresponding to the d-d transition indicates a Co2+ complex with S3N co-ordination (26). A typical S -> Co2+ charge transfer band was observed at around 365 nm (figure 6a), implying that the metal ion is co-ordinated by thiolates. In accordance with these data, the UV absorbance spectrum of the Cd2+-protease (figure 6b) shows an increase in absorbance at around 250 nm, which in all probability is due to an S -> Cd2+ charge transfer band (28). In conclusion, spectroscopic analysis of the Co2+- and Cd2+-proteases is completely consistent with the three-dimensional model proposed by us. In fact, in the model the binding site for the metal is made up of the thiol groups of three cysteines and a nitrogen atom from the side chain of a histidine. Each of the residues that according to the model form the binding site for the metal has been changed to alanine and, as expected, none of the mutants obtained is capable of being expressed in a soluble form in E. coli.
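The band positions quoted above are given as wavelengths, whereas transition energies are usually compared in wavenumbers or electronvolts. The short sketch below performs that conversion for the reported bands. The function names are illustrative, and the constant 1239.842 eV·nm is the standard value of hc and is not taken from the specification.

```python
HC_EV_NM = 1239.842  # Planck constant times speed of light, in eV*nm


def wavenumber_cm1(wavelength_nm):
    """Convert a wavelength in nm to a wavenumber in cm^-1."""
    return 1.0e7 / wavelength_nm


def energy_ev(wavelength_nm):
    """Convert a wavelength in nm to a photon energy in eV."""
    return HC_EV_NM / wavelength_nm


# Band positions reported in the text (nm)
BANDS_NM = {
    "Co2+ d-d band 1": 640.0,
    "Co2+ d-d band 2": 685.0,
    "S -> Co2+ charge transfer": 365.0,
    "S -> Cd2+ charge transfer": 250.0,
}

if __name__ == "__main__":
    for label, nm in BANDS_NM.items():
        print(f"{label}: {nm:.0f} nm = {wavenumber_cm1(nm):.0f} cm^-1 "
              f"= {energy_ev(nm):.2f} eV")
```

On this scale the two d-d bands fall at roughly 15,600 and 14,600 cm^-1, well below the charge transfer bands, which is the separation the text relies on when assigning them.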
BIBLIOGRAPHY
1. Behrens S.E., Tomei L., De Francesco R. (1996) EMBO J. 15:12-22.
2. Hijikata M., Kato N., Ootuyama Y., Nakagawa M. & Shimotohno K. (1991) Proc. Natl. Acad. Sci. USA 88:5547-5551.
3. Grakoui A., McCourt D.W., Wychowski C., Feinstone S.M., Rice C.M. (1993) Proc. Natl. Acad. Sci. USA 90:10583-10587.
4. Bartenschlager R., Ahlborn-Laake L., Mous J. & Jacobsen H. (1993) J. Virol. 68:1045-1055.
5. Eckart M.R., Selby M., Masiarz F., Lee C., Berger K., Crawford K., Kuo C., Kuo G., Houghton M. & Choo Q.-L. (1993) Biochem. Biophys. Res. Commun. 192:399-406.
6. Grakoui A., McCourt D.W., Wychowski C., Feinstone S.M. & Rice C.M. (1993) J. Virol. 67:2832-2843.
7. Tomei L., Failla C., Santolini E., De Francesco R. & La Monica N. (1993) J. Virol. 67:4017-4026.
8. Manabe S., Fuke I., Tanishita O., Kaji C., Gomi Y., Yoshida S., Mori C., Takamizawa A., Yohida I. & Okayama H. (1994) Virology 198:636-644.
9. Pizzi E., Tramontano A., Tomei L., La Monica N., Failla C., Sardana M., Wood T., De Francesco R. (1994) Proc. Natl. Acad. Sci. USA 91:888-892.
10. Shimizu Y., Yamaji K., Masuho Y., Yokota T., Inouhe H., Sudo K., Satoh S. & Shimotohno K. (1996) J. Virol. 70:127-132.
11. Lin C., Thomson J.A. & Rice C.M. (1995) J. Virol. 69:4373-4380.
12. Tomei L., Failla C., Vitale R.L., Bianchi E. & De Francesco R. (1996) J. Gen. Virol. 77:1065-1070.
13. Chambers T.J., Weir R.C., Grakoui A., McCourt D.W., Bazan J.F., Fletterick R.J., Rice C.M. (1990) Proc. Natl. Acad. Sci. USA 87:8898-8902.
14. Lam P.Y., Jadhav P.K., Eyermann C.J., Hodge C., Ru Y., Bacheler L.T., Meek J.L., Otto M.J., Rayner M.M., Wong Y.N., Chang C.-H., Weber P.C., Jackson D.A., Sharpe T.R., Erickson-Viitanen S. (1994) Science 263:380-384.
15. Georgiou G. and Valax P. (1996) Current Opinion in Biotechnology 7:190-197.
16. Failla C., Pizzi E., De Francesco R., Tramontano A. (1996) Folding & Design 1:35-42.
17. Zuckermann A.J. (1996) The Lancet 347:58-559.
18. Muerhoff A.S., Leary T.P., Simons J.N., Pilot-Matias T.J., Dawson G.J., Erker J.C., Chalmers M.L., Schlauder G.C., Desai S.M. & Mushahwar I.K. (1995) J. Virol. 69:5621-5630.
19. Leary T.P., Muerhoff A.S., Simons J.N., Pilot-Matias T.J., Erker J.C., Chalmers M.L., Schlauder G.C., Dawson G.J., Desai S.M. & Mushahwar I.K. (1996) J. Medical Virol. 48:60-67.
20. Yu S.F. and Lloyd R.E. (1992) Virology 186:725-735.
21. Voss T., Meyer R. and Sommergruber W. (1995) Protein Science 4:2526-2531.
22. Rice W.G., Schaeffer C.A., Harten B., Villinger F., South T.L., Summers M.F., Henderson L.E., Bess J.W.J., Arthur L.O., McDougall J.S., Orloff S.L., Mendeleyev J. & Kun E. (1993) 361:473-475.
23. Rice W.G., Supko J.G., Malspeis L., Buckheit R.W.J., Clanton D., Bu M., Graham L., Schaeffer C.A., Turpin J.A., Domogala J., Gogliotti R., Bader J.P., Kalliday S.M., Coren L., Sowder R.C.I., Arthur L.O. & Henderson L.E. (1995) Science 270:1194-1197.
24. Tabor S. & Richardson C.C. (1985) Proc. Natl. Acad. Sci. USA 82:1074-1078.
25. Studier F.W. and Moffatt (1986) J. Mol. Biol. 189:113-130.
26. Maret W. & Vallee B.L. (1993) Meth. Enzym. 226:52-71.
27. Bertini I. & Luchinat C. (1984) Adv. Inorg. Biochem. 6:72-111.
28. Fitzgerald D.W. and Coleman J.E. (1991) Biochemistry 30:5195-5201.
SEQUENCE LISTING
GENERAL INFORMATION
(i) APPLICANT: ISTITUTO DI RICERCHE DI BIOLOGIA MOLECOLARE P. ANGELETTI S.p.A.
(ii) TITLE OF INVENTION: "SOLUBLE POLYPEPTIDES WITH ACTIVITY OF THE NS3 SERINE PROTEASE OF HEPATITIS C VIRUS, AND PROCESS FOR THEIR PREPARATION AND ISOLATION"
(iii) NUMBER OF SEQUENCES: 12
(iv) MAILING ADDRESS:
(A) ADDRESSEE: Societa' Italiana Brevetti
(B) STREET: Piazza di Pietra, 39
(C) CITY: Rome
(D) COUNTRY: Italy
(E) POST CODE: I-00186
(v) COMPUTER-READABLE FORM:
(A) TYPE OF SUPPORT: Floppy disk 3.5" 1.44 MBYTES
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS Rev. 5.0
(D) SOFTWARE: Microsoft Word 6.0
(viii) AGENT INFORMATION
(A) NAME: DI CERBO Mario (Dr.)
(B) REFERENCE: RM/X88878/PC-DC
(ix) TELECOMMUNICATIONS INFORMATION
(A) TELEPHONE: 06/6785941
(B) TELEFAX: 06/6794692
(C) TELEX: 612287 ROPAT
(1) INFORMATION ON SEQUENCE SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 187 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE
(A) NAME: Pro BK-asK4
(D) OTHER INFORMATION: sequence for the NS3 protease of HCV isolate BK.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
1               5                   10                  15
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
            20                  25                  30
Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
        35                  40                  45
Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
    50                  55                  60
Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
65                  70                  75                  80
Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr
                85                  90                  95
Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
            100                 105                 110
Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
        115                 120                 125
Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
    130                 135                 140
Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
145                 150                 155                 160
Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met
                165                 170                 175
Glu Thr Thr Met Arg Ala Ser Lys Lys Lys Lys
            180                 185
(2) INFORMATION ON SEQUENCE SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 187 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE
(A) NAME: Pro H-asK4
(D) OTHER INFORMATION: sequence for the NS3 protease of HCV isolate H.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Met Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly
1               5                   10                  15
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
            20                  25                  30
Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr Cys
        35                  40                  45
Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg Thr
    50                  55                  60
Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val Asp
65                  70                  75                  80
Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg Ser Leu Thr
                85                  90                  95
Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
            100                 105                 110
Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
        115                 120                 125
Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
    130                 135                 140
Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe Arg Ala Ala Val Cys
145                 150                 155                 160
Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Asn Leu
                165                 170                 175
Glu Thr Thr Met Arg Ala Ser Lys Lys Lys Lys
            180                 185
(3) INFORMATION ON SEQUENCE SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 187 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE
(A) NAME: Pro J-asK4
(D) OTHER INFORMATION: sequence for the NS3 protease of HCV isolate J.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
1               5                   10                  15
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Asp Gly
            20                  25                  30
Glu Val Gln Val Leu Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
        35                  40                  45
Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
    50                  55                  60
Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
65                  70                  75                  80
Gln Asp Leu Val Gly Trp Pro Ala Pro Pro Gly Ala Arg Ser Met Thr
                85                  90                  95
Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
            100                 105                 110
Asp Val Val Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
        115                 120                 125
Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
    130                 135                 140
Leu Cys Pro Ser Gly His Val Val Gly Ile Phe Arg Ala Ala Val Cys
145                 150                 155                 160
Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Ser Met
                165                 170                 175
Glu Thr Thr Met Arg Ala Ser Lys Lys Lys Lys
            180                 185
(4) INFORMATION ON SEQUENCE SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 186 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE
(D) OTHER INFORMATION: sequence for the NS3 protease of HCV isolate J8.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Ala Pro Ile Thr Ala Tyr Thr Gln Gln Thr Arg Gly Leu Leu Gly Ala
1               5                   10                  15
Ile Val Val Ser Leu Thr Gly Arg Asp Lys Asn Glu Gln Ala Gly Gln
            20                  25                  30
Val Gln Val Leu Ser Ser Val Thr Gln Thr Phe Leu Gly Thr Ser Ile
        35                  40                  45
Ser Gly Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu
    50                  55                  60
Ala Gly Pro Lys Gly Pro Val Thr Gln Met Tyr Thr Ser Ala Glu Gly
65                  70                  75                  80
Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Asp Pro
                85                  90                  95
Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn Ala Asp
            100                 105                 110
Val Ile Pro Val Arg Arg Lys Asp Asp Arg Arg Gly Ala Leu Leu Ser
        115                 120                 125
Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly Gly Pro Val Leu
    130                 135                 140
Cys Ser Arg Gly His Ala Val Gly Leu Phe Arg Ala Ala Val Cys Ala
145                 150                 155                 160
Arg Gly Val Ala Lys Ser Ile Asp Phe Ile Pro Val Glu Ser Leu Asp
                165                 170                 175
Val Ala Thr Arg Ala Ser Lys Lys Lys Lys
            180                 185
(5) INFORMATION ON SEQUENCE SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 26 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: No
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(BK-K4)S
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GCATACATAT GGCGCCCATC ACGGCC 26
(6) INFORMATION ON SEQUENCE SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 33 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: Yes
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(BK-K4)AS
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CTACTTCTTC TTCTTGCTAG CCCGCATAGT AGT 33
(7) INFORMATION ON SEQUENCE SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 26 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: No
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(H-K4)S
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GAGATACATA TGGCGCCTAT CACGGC 26
(8) INFORMATION ON SEQUENCE SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 42 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: Yes
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(H-K4)AS
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TTTGAATTCC TACTTCTTCT TCTTGCTAGC TCTCATGGTT GT 42
(9) INFORMATION ON SEQUENCE SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 27 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: No
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(J-K4)S
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
TTTCATATGG CGCCTATCAC GGCCTAT 27
(10) INFORMATION ON SEQUENCE SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 42 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: Yes
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(J-K4)AS
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
TTTGAATTCC TACTTCTTCT TCTTGCTAGC CCGCATGGTA GT 42
(11) INFORMATION ON SEQUENCE SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 24 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: No
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(J8-K4)S
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GGGAATTCCA TATGGCTCCC ATTACTGCT ACAC 24
(12) INFORMATION ON SEQUENCE SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 42 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: Yes
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(J8-K4)AS
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
TTTGAATTCC TACTTCTTCT TCTTGCTAGC CCGTGTGGCG AC 42
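The relationship between the antisense oligonucleotides and the lysine tail of the encoded proteins can be checked with a few lines of code. The sketch below, offered as an illustration rather than as part of the listing, reverse-complements SEQ ID NO: 6 and translates the result with the standard genetic code up to the first stop codon. The function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (spaces are ignored)."""
    dna = "".join(dna.split()).upper()
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# SEQ ID NO: 6, the antisense oligonucleotide PROT(BK-K4)AS
SEQ_ID_6 = "CTACTTCTTC TTCTTGCTAG CCCGCATAGT AGT"

if __name__ == "__main__":
    sense = reverse_complement(SEQ_ID_6)
    print(sense)
    print(translate(sense))
```

The translated fragment ends in Ala Ser followed by four lysines and a stop codon, which corresponds to the C-terminus given for SEQ ID NO: 1.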

Claims

CLAIMS
1. Isolated and purified polypeptides containing the HCV NS3 serine protease domain sequence, characterised in that they have at their C-terminal end a tail of at least three lysines.
2. Expression vectors for the production of the polypeptides according to claim 1, characterised in that they comprise: a polynucleotide coding for one of said polypeptides; regulation and translation sequences functional in said host cell, operably linked to said polynucleotide; and, optionally, a selectable marker.
3. A prokaryotic cell, characterised in that it is transformed with an expression vector containing a DNA sequence coding for the polypeptides according to claim 1, so as to allow said host cell to express the specific polypeptide which is coded in the chosen sequence.
4. A process for the preparation of polypeptides containing the HCV NS3 serine protease domain sequence, characterised in that it comprises the following operations:
- transformation of a prokaryotic host cell with an expression vector containing a DNA sequence coding for a polypeptide containing the HCV NS3 serine protease domain sequence;
- growth of the prokaryotic host cell on a special culture medium containing Zn2+ or alternatively salts of transition metals such as Co, Cd, Mn, Cu, Ni, Ag, Fe, Cr, Hg, Au, Pt, V;
- expression of the DNA sequence required to produce the chosen polypeptide;
- purification of the polypeptide without having to resort to resolubilisation protocols, and without the need for renaturation of the protein from inclusion bodies, said procedure making it possible to obtain said polypeptides in their native, soluble form suitable to enable determination of the three-dimensional structure of the enzyme by means of NMR or X-ray crystallography techniques.
5. A process for the renaturation in vitro of polypeptides containing the HCV NS3 serine protease domain sequence, characterised in that it comprises the following operations:
- transformation of a prokaryotic host cell using an expression vector containing a DNA sequence coding for a polypeptide that contains the HCV NS3 serine protease domain sequence;
- expression of the DNA sequence required to produce the chosen polypeptide;
- purification of the denatured polypeptide and renaturation of the protein using buffers containing Zn2+ or alternatively salts of transition metals such as Co, Cd, Mn, Cu, Ni, Ag, Fe, Cr, Hg, Au, Pt, V, said procedure making it possible to obtain said polypeptides in their native, soluble form suitable to enable determination of the three-dimensional structure of the enzyme by means of NMR or X-ray crystallography techniques.
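The structural feature recited in claim 1, a C-terminal tail of at least three lysines, is simple to test on a sequence written in three-letter code. The following sketch counts trailing Lys residues and applies the threshold of three. It is an illustrative check only, the function names are hypothetical, and the example input is the C-terminal row of SEQ ID NO: 1.

```python
def lysine_tail_length(three_letter_sequence):
    """Count consecutive Lys residues at the C-terminal end of a sequence."""
    residues = three_letter_sequence.split()
    count = 0
    for residue in reversed(residues):
        if residue != "Lys":
            break
        count += 1
    return count


def has_lysine_tail(three_letter_sequence, minimum=3):
    """True if the sequence ends in at least `minimum` lysines."""
    return lysine_tail_length(three_letter_sequence) >= minimum


# Residues 177 to 187 of SEQ ID NO: 1
C_TERMINUS_SEQ_ID_1 = "Glu Thr Thr Met Arg Ala Ser Lys Lys Lys Lys"

if __name__ == "__main__":
    print(lysine_tail_length(C_TERMINUS_SEQ_ID_1))
    print(has_lysine_tail(C_TERMINUS_SEQ_ID_1))
```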
PCT/IT1997/000228 1996-09-17 1997-09-17 Soluble polypeptides with activity of the ns3 serine protease of hepatitis c virus, and process for their preparation and isolation Ceased WO1998012308A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP97942190A EP0950094A1 (en) 1996-09-17 1997-09-17 Soluble polypeptides with activity of the ns3 serine protease of hepatitis c virus, and process for their preparation and isolation
JP10514467A JP2001500735A (en) 1996-09-17 1997-09-17 Soluble polypeptides having hepatitis C virus NS3 serine protease activity, and methods for their production and isolation
CA002264487A CA2264487A1 (en) 1996-09-17 1997-09-17 Soluble polypeptides with activity of the ns3 serine protease of hepatitis c virus, and process for their preparation and isolation
AU43970/97A AU4397097A (en) 1996-09-17 1997-09-17 Soluble polypeptides with activity of the ns3 serine protease of hepatitis c virus, and process for their preparation and isolation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
ITRM96A000632 1996-09-17
ITRM960632 IT1285158B1 (en) 1996-09-17 1996-09-17 SOLUBLE POLYPEPTIDES WITH THE SERINO-PROTEASIS ACTIVITY OF NS3 OF THE HEPATITIS C VIRUS, AND PROCEDURE FOR THEIR PREPARATION AND

Publications (1)

Publication Number Publication Date
WO1998012308A1 true WO1998012308A1 (en) 1998-03-26

Family

ID=11404423

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IT1997/000228 Ceased WO1998012308A1 (en) 1996-09-17 1997-09-17 Soluble polypeptides with activity of the ns3 serine protease of hepatitis c virus, and process for their preparation and isolation

Country Status (6)

Country Link
EP (1) EP0950094A1 (en)
JP (1) JP2001500735A (en)
AU (1) AU4397097A (en)
CA (1) CA2264487A1 (en)
IT (1) IT1285158B1 (en)
WO (1) WO1998012308A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1992012992A2 (en) * 1991-01-14 1992-08-06 James N. Gamble Institute Of Medical Research Basic structural immunogenic polypeptides having epitopes for hcv, antibodies, polynucleotide sequences, vaccines and methods
WO1996036702A2 (en) * 1995-05-12 1996-11-21 Schering Corporation Soluble, active hepatitis c virus protease
WO1997008304A2 (en) * 1995-08-22 1997-03-06 Istituto Di Ricerche Di Biologia Molecolare P. Angeletti S.P.A. Methodology to produce, purify and assay polypeptides with the proteolytic activity of the hcv ns3 protease

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KHUDYAKOV YU.E. ET AL: "Linear B-Cell Epitopes of the NS3-NS4-NS5 Proteins of the Hepatitis C Virus as modeled with Synthetic Peptides", VIROLOGY, vol. 206, 1995, pages 666 - 72, XP002003036 *
KIM D.W. ET AL.: "C-terminal Domain of Hepatitis C Virus NS3 protein contains an RNA helicase activity", BIOCHEM. BIOPHYS. RESEARCH COMMUNICATIONS, vol. 215, no. 1, 1995, pages 160 - 166, XP002035618 *
TOMEI L. ET AL.: "NS3 is a Serine Protease required for processing of Hepatitis C Virus Polyprotein", J. VIROLOGY, vol. 67, no. 7, 1993, pages 4017 - 4026, XP000561255 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2160046A1 (en) * 1998-03-30 2001-10-16 Hoffmann La Roche PENTAPEPTIDIC DERIVATIVES.
US6333186B1 (en) 1999-01-08 2001-12-25 Bristol-Myers Squibb Company Modified forms of Hepatitis C NS3 protease for facilitating inhibitor screening and structural studies of protease: inhibitor complexes
US6800456B2 (en) 1999-01-08 2004-10-05 Bristol-Myers Squibb Company Modified forms of hepatitis C NS3 protease for facilitating inhibitor screening and structural studies of protease:inhibitor complexes
US7012066B2 (en) 2000-07-21 2006-03-14 Schering Corporation Peptides as NS3-serine protease inhibitors of hepatitis C virus
US7169760B2 (en) 2000-07-21 2007-01-30 Schering Corporation Peptides as NS3-serine protease inhibitors of hepatitis C virus
US7244721B2 (en) 2000-07-21 2007-07-17 Schering Corporation Peptides as NS3-serine protease inhibitors of hepatitis C virus
US7595299B2 (en) 2000-07-21 2009-09-29 Schering Corporation Peptides as NS3-serine protease inhibitors of hepatitis C virus
USRE43298E1 (en) 2000-07-21 2012-04-03 Schering Corporation Peptides as NS3-serine protease inhibitors of hepatitis C virus
US7884199B2 (en) 2003-10-27 2011-02-08 Vertex Pharmaceuticals Incorporated HCV NS3-NS4 protease resistance mutants
US7705138B2 (en) * 2005-11-11 2010-04-27 Vertex Pharmaceuticals Incorporated Hepatitis C virus variants
US8501450B2 (en) 2005-11-11 2013-08-06 Vertex Pharmaceuticals Incorporated Hepatitis C virus variants
US8759026B2 (en) 2009-07-31 2014-06-24 Baxter International Inc. Methods for increasing recovery of an ADAMTS activity from a cell culture supernatant
US9127265B2 (en) 2009-07-31 2015-09-08 Baxalta Incorporated Cell culture medium for ADAMTS protein expression
US9441216B2 (en) 2009-07-31 2016-09-13 Baxalta Incorporated Cell culture medium for ADAMTS protein expression
US10072254B2 (en) 2009-07-31 2018-09-11 Baxalta Incorporated Cell culture methods for expressing ADAMTS13 protein
US10724024B2 (en) 2009-07-31 2020-07-28 Baxalta Incorporated Cell culture methods for expressing ADAMTS protein
US11254921B2 (en) 2009-07-31 2022-02-22 Takeda Pharmaceutical Company Limited ADAMTS13 protein cell culture supernatant

Also Published As

Publication number Publication date
AU4397097A (en) 1998-04-14
IT1285158B1 (en) 1998-06-03
JP2001500735A (en) 2001-01-23
CA2264487A1 (en) 1998-03-26
EP0950094A1 (en) 1999-10-20
ITRM960632A1 (en) 1998-03-17

Similar Documents

Publication Publication Date Title
Stempniak et al. The NS3 proteinase domain of hepatitis C virus is a zinc-containing enzyme
WO1999038888A2 (en) Peptide inhibitors of the serine protease activity associated to the ns3 protein of hcv, relevant uses and process of production
US5739002A (en) Method for reproducing in vitro the Proteolytic activity of the NS3 protease of hepatitis C virus (HCV)
EP0846164B1 (en) Method for assaying in vitro the proteolytic activity of polypeptides having hcv ns3 protease activity and peptide substrates to be used in this method.
EP0950094A1 (en) Soluble polypeptides with activity of the ns3 serine protease of hepatitis c virus, and process for their preparation and isolation
Shoji et al. Proteolytic activity of NS3 serine proteinase of hepatitis C virus efficiently expressed in Escherichia coli
KR100369838B1 (en) Protease Proteins Derived from Nonstructural Protein 3 of Korean Hepatitis C Virus and Methods of Manufacturing the Same
ES2282320T3 (en) PROTEASA HCV NS2 / 3 ACTIVE, PURIFIED.
De Francesco et al. Mechanisms of hepatitis C virus NS3 proteinase inhibitors
van Aken et al. Expression, purification, and in vitro activity of an arterivirus main proteinase
AU2002224688B2 (en) Purified active HCV NS2/3 protease
KR100241268B1 (en) Variants of Proteolytic Enzymes of Hepatitis C Virus and Methods for Preparing the Same
KR19980069020A (en) Histidine-labeled Korean non-structural protein 3-4A and its preparation method
Luo Expression of an active hepatitis C virus serine proteinase in E. coli
KR20020070125A (en) Recombinant hepatitis c virus ns5b protein, and preparation process and use thereof
Villarreal et al. Enzymatic Characterization of Refolded Human
KR19980069021A (en) Histidine-labeled Korean Non-Structural Protein 3 of Hepatitis C Virus and Its Preparation Method
AU2002224688A1 (en) Purified active HCV NS2/3 protease
MXPA01006885A (en) Modified forms of hepatitis c virus ns3 protease

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA CN IL JP KR MX US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
ENP Entry into the national phase

Ref document number: 2264487

Country of ref document: CA

Ref country code: CA

Ref document number: 2264487

Kind code of ref document: A

Format of ref document f/p: F

ENP Entry into the national phase

Ref country code: JP

Ref document number: 1998 514467

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 1997942190

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1997942190

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 09269020

Country of ref document: US

WWW Wipo information: withdrawn in national office

Ref document number: 1997942190

Country of ref document: EP