Cyclotron Mass Spectrometry Screening
CROSS-REFERENCES TO RELATED APPLICATIONS
Pursuant to 35 U.S.C. § 119(e), the present application claims benefit of and priority to USSN 60/142,478, entitled "Cyclotron Mass Spectrometry Screening," filed July 6, 1999 by Raillard et al.
COPYRIGHT NOTIFICATION
Pursuant to 37 C.F.R. 1.71(e), Applicants note that a portion of this disclosure contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
FIELD OF THE INVENTION
This invention relates to high throughput methods for mass spectrometry, for example, to monitor half-life and biological transport and permeability, e.g., of protein products of shuffled libraries of genes or other nucleic acids.
BACKGROUND OF THE INVENTION
High throughput screening of biological activity typically involves quantitative detection of one or more components in a biological system. A common detection method is mass spectrometry (MS), which allows identification of a particular molecule based on the mass and charge of the molecule. Traditionally, mass spectrometry is performed in tandem with liquid chromatography to purify and separate the components of interest. This purification typically consists of on-line sequential purification. The sequential nature of the purification limits the ability of mass spectrometry to screen a large number of reaction products in a short amount of time, because the purification must occur in line with and previous to the mass spectrometry. In addition, the resolution of traditional mass-spectrometry methods is insufficient to resolve more than a few components of interest at one time.
Present library construction methods, such as library construction by DNA shuffling, produce very large libraries of related products, e.g., that encode one or more biological components of interest. A shuffled library, which is constructed, e.g., by homologous exchange of DNA fragments during DNA recombination methods or by synthetic recombination methods, is not easily amenable to expression product detection by traditional mass-spectrometry methods, because of the size of the library and the closely related nature of the library products.
In addition, encoded activities which are commonly screened in vivo, such as serum half-life of encoded polypeptides, are even more difficult to screen by traditional MS, because of the need for purification of the components of interest from a biological expression system. The need to separate and purify components of interest before injection of the components into a mass spectrometer is time consuming and limits the number of samples that can be analyzed to about 100 samples per day. Typical purification runs (e.g., using liquid chromatography) take about 10 minutes per sample; at 6 samples per hour, at most 144 samples can be run in a 24 hour period. In addition, very closely related molecules are not easily resolved by traditional MS, or purified by simple chromatography.
One recently developed MS system is the electrospray method as described in "Quantitative Electrospray Mass-Spectrometry for the Rapid Assay of Enzyme Inhibitors," by Wu et al. in Chemistry & Biology 1997, Vol. 4 No. 9, p653-657. Electrospray ionization is a mild method of transferring charged polar organic molecules into the gas phase for mass spectrometry analysis and is applicable for most biologically relevant molecules. The electrospray method eliminates the need for prior derivatization of samples before injection into a mass-spectrometer as in GC/MS and thus shortens the analysis time for mass spectrometry. However, column separation is still utilized in this technique, limiting throughput as noted above.
Another recent development is described in "Fast Screening for Drugs of Abuse by Solid-Phase Extraction Combined with Flow-Injection Ionspray-Tandem Mass Spectrometry," by Weinman and Svoboda in Journal of Analytical Toxicology, Vol. 22, 1998, p319-328. The described technique combined tandem mass spectrometry and electrospray methods to simultaneously detect different drugs in serum or urine. Although no column separation was used because the tandem mass spectrometry allowed detection of multiple compounds, a solid phase extraction method was necessary in the sample preparation. The sample preparation steps were still too lengthy to provide high throughput screening by mass-spectrometry.
Some progress has been made on the problem of screening biological fluids such as blood for activity (or at least presence) of multiple biological components of interest. For example, a direct detection technique for the simultaneous determination of up to 10 drug candidates in plasma is proposed in Mcloughlin et al. (1997) J. Pharm. Biomed. Anal. 15(12): 1893-901. However, at 10 candidates per run, this approach is still too slow to screen large libraries effectively, and the resolution of the candidates is not ideal.
More generally, a number of new powerful mass-spectrometry methods have recently been developed. For example, Fourier Transform Ion Cyclotron Mass Spectrometry is described by Marshall et al. (1998) Mass Spectrometry Reviews 17:1-35. However, high-throughput systems coupled to this powerful mass spectrometry method are not provided.
Accordingly, a high throughput method of performing mass spectrometry, e.g., for large libraries of molecules of interest, such as libraries of molecules produced by DNA recombination methods, would be useful. The present invention fulfills these and many other needs which will become apparent upon complete review of this disclosure.
SUMMARY OF THE INVENTION
The present invention relates to the surprising discovery that up to hundreds of compounds can be assessed simultaneously for activity, bioavailability, transport, half-life and the like, e.g., by cyclotron mass spectrometry. Thus, methods of screening a library of compositions are provided. In the methods, a library comprising a plurality of compositions is provided. The plurality may number from as few as about 35 compositions to greater than 150 compositions. This plurality of compositions may be in the form of peptides, proteins, metabolic products, carbohydrates, lipids, small organic molecules, or combinations thereof. The plurality of compositions typically includes a plurality of proteins, such as a library of encoded proteins produced by recombination of two or more nucleic acids.
In the method of the present invention, the library is administered to a plant or animal, or alternatively, the library is administered to, or tested in, an in vitro system comprising a plant or animal tissue. A sample is obtained from the plant or animal (or the in vitro system) and subjected to cyclotron mass spectrometry. The data generated by mass spectrometry are then used to analyze the library of compositions for a desired property. Optionally, the sample may be fractionated or purified, for example by off-line parallel purification, prior to performing the mass spectrometry step. The mass spectrometry step may be performed, for example, by injecting the sample into a Fourier transform cyclotron mass spectrometer using electrospray ionization.
A method for producing a library of recombinant proteins for use in the method of screening is also described in the present invention. A first nucleic acid that encodes a protein of interest is identified. The first nucleic acid is recombined with at least a second nucleic acid to produce a library of related nucleic acids. To accomplish this, the first nucleic acid and the second nucleic acid differ from each other in two or more nucleotides. The library of related nucleic acids is then expressed, thereby creating the library of encoded proteins.
In another embodiment of the present invention, a method of screening a library of compositions is described, wherein the method comprises determining a transfer efficiency for transfer of one or more of the plurality of compositions from a surface or subsurface of an animal to a bloodstream of the animal. In this screening method, one or more of the plurality of compositions are administered to the surface or subsurface of an animal, including but not limited to a mouth, nose, lung, muscle, rectum, vagina or skin. A blood sample from the bloodstream of the animal is obtained and used to measure the concentration of the one or more of the plurality of compositions. The concentration of the one or more of the plurality of compositions in the blood sample is compared with the concentration of the one or more of the plurality of compositions as administered to the animal, to determine the transfer efficiency of the one or more of the plurality of compositions. In this manner the one or more compositions with the highest transfer efficiency can be identified. The nucleic acids encoding the one or more compositions with the highest transfer efficiency may then be determined, and recombined to form another library of compositions, and subjected to the administering, sample-obtaining, measuring and comparing steps in an iterative
fashion, thereby producing one or more compositions with improved transfer efficiency. Similarly, the composition can be administered to any tissue of a plant by standard methods and any tissue later screened for "transfer" efficiency from one plant tissue to another or from tissue into xylem or phloem.
In an additional embodiment of the present invention, a method of screening a library of compositions is described, wherein the serum half-life of each of the plurality of compositions is determined for use in the screening process. In this method, a first serum or other sample is obtained from the plant, animal or in vitro system at a first time t1, and a second serum or other sample is obtained at a second time t2. The concentrations of each of the plurality of compositions in the serum samples collected at t1 and t2 are determined. The differences in concentration between the t1 and t2 samples can be used to identify one or more compositions with the longest serum half-life. This method may also be performed in an iterative manner, wherein the nucleic acids encoding the one or more compositions with the longest serum half-life may then be determined and recombined to form an additional library of compositions. The new library of compositions is subjected to the administering, sample-obtaining, measuring and comparing steps in an iterative fashion, thereby producing one or more compositions with improved serum half-life.
Yet another embodiment of the present invention describes a method of obtaining a drug with a specified desired property, for example a drug with improved transfer efficiency, or a drug with an increased half-life. In this method, a library comprising a plurality of recombinant polynucleotides is created; this library is used to produce a library of polypeptides. The library of polypeptides is administered to a plant, an animal or to an in vitro system comprising a plant or animal tissue or plant or animal tissue derivative. A sample is obtained from the plant, animal or in vitro system and screened for the specified desired property by performing cyclotron mass spectrometry. The recombinant polypeptides with the desired property, and the recombinant nucleotides that encode the polypeptides with the desired property, are then identified. The recombinant polynucleotides that encode the polypeptides with the desired property are recombined to form another library of polynucleotides, and the process is performed again in an iterative manner.
In a further embodiment of the present invention, a method for monitoring serum half-life is described. In this method, a library comprising a plurality of protein sequences is provided and administered to a plant or to an animal. A first sample is obtained from the animal at a first time, and a second sample is obtained from the animal at a second time. The amount of each protein sequence of the plurality of protein sequences present in the sample at the first time is determined by injecting the sample into a cyclotron mass spectrometer. The amount of each protein sequence of the plurality of sequences present in the sample at the second time is also determined by injecting the sample into a cyclotron mass spectrometer, and the serum half-life of the plurality of protein sequences is determined.
Another embodiment of the present invention is an apparatus for cyclotron mass spectrometry screening. The apparatus comprises a library comprising a plurality of compositions and a cyclotron mass spectrometer injectably coupled to the library. Optionally, the apparatus may further comprise a column positioned between the library and the mass spectrometer and operably coupled to the mass spectrometer. The column comprises a chromatographic material for purifying or fractionating the library, such that during the operation of the apparatus, the library loads onto the column and flows through the column, resulting in a purified or fractionated library. In addition, the apparatus may optionally comprise an automatic sampler that is operably coupled to the mass spectrometer. The apparatus for cyclotron mass spectrometry may also further comprise a computer or computer readable medium and software operably coupled to the apparatus for recording and analyzing data from the mass spectrometer.
In an additional embodiment of the present application, an apparatus for identifying compositions with a high biological transfer efficiency is described. The apparatus comprises a library comprised of a plurality of compositions, and a sample obtained from the bloodstream of an animal or a selected portion of a plant at a time t2 after administration of the plurality of compositions in a predetermined concentration to the plant or animal at a time t1. The apparatus also includes a cyclotron mass spectrometer injectably coupled to the library, and a computer or computer readable medium operably coupled to the mass spectrometer for recording data obtained from the screening. The computer or computer readable medium, or a second computer or computer readable medium, also includes software for analyzing the data. For example,
the software includes a first instruction set for determining a concentration of each composition in the plurality of compositions at t2, a second instruction set for comparing the concentration of each composition in the plurality of compositions at t2 to the predetermined concentration administered to the plant or animal and determining the transfer efficiency of each composition in the plurality of compositions, and a third instruction set for identifying one or more compositions with high transfer efficiency.
A further embodiment of the present invention describes an apparatus for identifying compositions with a long serum half-life. The apparatus is constructed to include a first library comprising a plurality of compositions. The library includes a sample obtained from a plant or animal at a time t1 after an injection of a predetermined concentration of the plurality of compositions into the plant or animal. A cyclotron mass spectrometer for screening the first library for presence and concentration of the plurality of compositions is incorporated into the overall system or device. A computer or computer readable medium operably coupled to the mass spectrometer, and comprising software capable of recording and analyzing data obtained from the mass spectrometer, is also present in the device. The software includes a first instruction set for detecting the presence of each of the plurality of compositions, a second instruction set for determining a first concentration for each of the plurality of compositions at t1, a third instruction set for obtaining the difference between the first concentration and the predetermined concentration, a fourth instruction set for deconvoluting the data to determine the serum half-life of each of the plurality of compositions, and a fifth instruction set for identifying one or more compositions with a long serum half-life.
These and other novel features, advantages and embodiments of the present invention will become apparent from the following detailed description of the invention, when considered in conjunction with the following.
DETAILED DISCUSSION OF THE INVENTION
I. Definitions
"Screening," in the present invention refers to a method of examining a number of compositions, e.g., a library of compositions, such as drugs, proteins or the like, for a specified desired property, e.g., serum half-life, transfer efficiency, pharmacokinetic properties and the like. A large number of compositions can be
screened at once in the present invention, e.g., 30, 50, 100, 200, 300 compounds or more can be screened at once in a cyclotron mass spectrometer, to determine which of the compounds have, e.g., a high transfer efficiency. In the present invention, the screening is optionally used to determine those compounds that possess a desired property, e.g., a high transfer efficiency or a long serum half-life. Optionally, nucleic acids encoding those components are recombined to create new compounds possessing the desired property in an improved form compared to the original compounds, e.g., a higher transfer efficiency or a longer serum half-life.
The "specified desired property" of the invention is any desirable property of a composition, e.g., a protein, nucleic acid, pharmaceutical, small organic molecule, and the like, that can be screened or detected by mass spectrometry. A "small organic molecule" is one that has a molecular weight less than about 2000 daltons, more typically less than about 1500 daltons. Examples include, but are not limited to, erythromycin and cholesterol. Typically the specific desired property of the invention is serum half-life or a transfer property, e.g., biological transfer efficiency.
In the present invention, the term "transfer efficiency" or "biological transfer efficiency" refers to the efficiency or rate of transfer of a composition, e.g., a drug or protein, from the site of administration to the bloodstream of the subject. The transfer efficiency is optionally detected by comparing the amount or concentration of the composition administered to the amount or concentration at a specified time after the administration. The rate is determined by dividing the change in concentration by the change in time. Mass spectrometry is used to detect the presence of the composition in the bloodstream and the concentration of the compositions, thus determining which compositions in a library of compositions have the desired, e.g., the highest, transfer efficiency rate, by detecting the concentration of the compositions in the blood stream, e.g., which are present in the greatest amount in the bloodstream at the specified time. Those compounds are then optionally selected for further testing or recombination. A "high transfer efficiency" refers to a transfer efficiency that is one of the highest in the group of compositions or library being studied. Preferably, the compositions with a high transfer efficiency will have a transfer efficiency in the top 20% of the library of compositions, more preferably in the top 10% and most preferably in the top 5%. For
example, the transfer efficiency is optionally at least 10%, preferably greater than 30%, more preferably greater than 50% and most preferably greater than 60%.
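The definition above can be illustrated with a minimal sketch. It assumes that the administered and measured amounts for each composition are expressed on the same basis, so that their ratio is a meaningful fraction. The function names, composition names and numerical values are hypothetical and are not taken from this disclosure.

```python
def transfer_efficiencies(administered, measured):
    """Return {name: measured / administered} for each composition.

    Both inputs map a composition name to an amount or concentration.
    Assumption: both are expressed in the same units.
    A composition that is not detected in the sample scores 0.0.
    """
    return {
        name: measured.get(name, 0.0) / dose
        for name, dose in administered.items()
        if dose > 0
    }


def top_fraction(scores, fraction=0.2):
    """Return the names ranked in the top `fraction` of the library."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


# Hypothetical five-member library, each member administered at 10.0 units.
administered = {
    "variant_A": 10.0, "variant_B": 10.0, "variant_C": 10.0,
    "variant_D": 10.0, "variant_E": 10.0,
}
# Hypothetical amounts found in the blood sample at the specified time.
measured = {
    "variant_A": 1.2, "variant_B": 6.5, "variant_C": 3.0,
    "variant_D": 0.4, "variant_E": 5.1,
}

efficiencies = transfer_efficiencies(administered, measured)
best = top_fraction(efficiencies, fraction=0.2)
print(best)
```

In this sketch a "high transfer efficiency" is relative to the library, matching the top 20%, 10% or 5% thresholds given above; an absolute cutoff such as 10% or 30% could be applied to the same dictionary instead.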
The term "serum half-life" in the present invention refers to the time in which a given amount of a substance is reduced, e.g., in serum or blood, to 50% of its initial concentration, e.g., by normal turnover such as degradative and metabolic processes of cells and other serum or blood components. In the present invention, the serum half-life is determined by mass spectrometry for a library of compositions, e.g., proteins encoded by a shuffled library, in order to select those with the longest half-lives, e.g., those that will remain in the blood stream the longest amount of time. The half-life can be determined by comparing the concentration of a particular composition at two different times, preferably a time right after administration of the composition to the subject and a later time. Data obtained from a mass spectrometer indicate the presence of the compound at each time measured and are deconvoluted to provide the concentration of the composition. By monitoring at various times, the compositions with the longest half-lives are optionally identified as those that are still present at the later times in large concentrations. Typically, in the present invention, a "long serum half-life" is a desired property. A "long serum half-life" refers to a half-life that is one of the longest in the group of compositions, e.g., a library of compounds, being studied. Preferably, the compositions with a long serum half-life will have a half-life in the top 20% of the library of compositions, more preferably in the top 10% and most preferably in the top 5%. For example, the serum half-life is optionally on the order of minutes, preferably on the order of hours, and more preferably days.
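As an illustration of the two-time-point comparison, the following sketch estimates a half-life from two concentrations. It assumes single-exponential (first-order) clearance, which is a common pharmacokinetic simplification and is not stated in this disclosure. Under that assumption the half-life is ln(2) * (t2 - t1) / ln(C1 / C2). The function name, composition names and values are hypothetical.

```python
import math


def half_life(c1, c2, t1, t2):
    """Estimate half-life from concentrations c1 at time t1 and c2 at time t2.

    Assumption: first-order decay, C(t) = C(t1) * exp(-k * (t - t1)).
    The result is in the same time units as t1 and t2.
    """
    if t2 <= t1:
        raise ValueError("t2 must be later than t1")
    if not (c1 > c2 > 0):
        raise ValueError("expected c1 > c2 > 0 (a measurable decrease)")
    k = math.log(c1 / c2) / (t2 - t1)  # elimination rate constant
    return math.log(2) / k


# Hypothetical concentrations at t1 = 0 h and t2 = 4 h for three compositions.
samples = {
    "variant_A": (8.0, 2.0),
    "variant_B": (8.0, 4.0),
    "variant_C": (8.0, 1.0),
}
half_lives = {
    name: half_life(c1, c2, t1=0.0, t2=4.0)
    for name, (c1, c2) in samples.items()
}
longest = max(half_lives, key=half_lives.get)
print(longest, half_lives[longest])
```

With more than two time points, a least-squares fit of ln(concentration) against time would generally be a more robust way to estimate the rate constant than a single pair of measurements.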
II. The Methods and Integrated Systems of The Invention
The present invention preferably utilizes DNA shuffling methodologies to generate libraries of nucleic acids which are expressed. The expression products are expressed in or administered to a plant, an animal or in vitro system which mimics an aspect of in vivo physiology in a plant or animal. After a period of time, an effect of the expression or administration is observed in the animal or in vitro system, such as the presence or absence of a protein encoded by the library in the animal or in vitro system, or the presence or absence of a molecule affected by the protein in the system. The observation is performed by MS, e.g., by Fourier-transform cyclotron MS on purified or
unpurified mixtures of library products, typically by screening tens, or even hundreds of analytes simultaneously.
III. Making Shuffled Libraries For Screening By the System
A variety of recombination and recursive recombination (e.g., DNA shuffling) reactions and/or other diversity generating reactions, in addition to or concurrent with standard cloning methods, are optionally used to produce libraries of nucleic acids which are expressed and screened. As adapted to the present invention, these methods are used to make and express libraries, which are then screened by cyclotron MS. A variety of such reactions are known to those of skill in the art, including those developed by the inventors and their co-workers.
The following publications describe a variety of recursive recombination procedures and/or methods that can be incorporated into such procedures: Stemmer et al. (1999) "Molecular breeding of viruses for targeting and other clinical properties" Tumor Targeting 4:1-4; Ness et al. (1999) "DNA Shuffling of subgenomic sequences of subtilisin" Nature Biotechnology 17:893-896; Chang et al. (1999) "Evolution of a cytokine using DNA family shuffling" Nature Biotechnology 17:793-797; Minshull and Stemmer (1999) "Protein evolution by molecular breeding" Current Opinion in Chemical Biology 3:284-290; Christians et al. (1999) "Directed evolution of thymidine kinase for AZT phosphorylation using DNA family shuffling" Nature Biotechnology 17:259-264; Crameri et al. (1998) "DNA shuffling of a family of genes from diverse species accelerates directed evolution" Nature 391:288-291; Crameri et al. (1997) "Molecular evolution of an arsenate detoxification pathway by DNA shuffling" Nature Biotechnology 15:436-438; Zhang et al. (1997) "Directed evolution of an effective fucosidase from a galactosidase by DNA shuffling and screening" Proceedings of the National Academy of Sciences, U.S.A. 94:4504-4509; Patten et al. (1997) "Applications of DNA Shuffling to Pharmaceuticals and Vaccines" Current Opinion in Biotechnology 8:724-733; Crameri et al. (1996) "Construction and evolution of antibody-phage libraries by DNA shuffling" Nature Medicine 2:100-103; Crameri et al. (1996) "Improved green fluorescent protein by molecular evolution using DNA shuffling" Nature Biotechnology 14:315-319; Gates et al. (1996) "Affinity selective isolation of ligands from peptide libraries through display on a lac repressor 'headpiece dimer'" Journal of Molecular Biology 255:373-386; Stemmer (1996) "Sexual PCR and Assembly PCR" In: The Encyclopedia of Molecular Biology. VCH Publishers, New York, pp. 447-457; Crameri and Stemmer (1995) "Combinatorial multiple cassette mutagenesis creates all the permutations of mutant and wildtype cassettes" BioTechniques 18:194-195; Stemmer et al. (1995) "Single-step assembly of a gene and entire plasmid from large numbers of oligodeoxyribonucleotides" Gene 164:49-53; Stemmer (1995) "The Evolution of Molecular Computation" Science 270:1510; Stemmer (1995) "Searching Sequence Space" Bio/Technology 13:549-553; Stemmer (1994) "Rapid evolution of a protein in vitro by DNA shuffling" Nature 370:389-391; and Stemmer (1994) "DNA shuffling by random fragmentation and reassembly: In vitro recombination for molecular evolution" Proceedings of the National Academy of Sciences, U.S.A. 91:10747-10751.
Additional details regarding DNA shuffling methods are found in U.S. Patents by the inventors and their co-workers, including: United States Patent 5,605,793 to Stemmer (February 25, 1997), "METHODS FOR IN VITRO RECOMBINATION;" United States Patent 5,811,238 to Stemmer et al. (September 22, 1998) "METHODS FOR GENERATING POLYNUCLEOTIDES HAVING DESIRED CHARACTERISTICS BY ITERATIVE SELECTION AND RECOMBINATION;" United States Patent 5,830,721 to Stemmer et al. (November 3, 1998), "DNA MUTAGENESIS BY RANDOM FRAGMENTATION AND REASSEMBLY;" United States Patent 5,834,252 to Stemmer, et al. (November 10, 1998) "END-COMPLEMENTARY POLYMERASE REACTION," and United States Patent 5,837,458 to Minshull, et al. (November 17, 1998), "METHODS AND COMPOSITIONS FOR CELLULAR AND METABOLIC ENGINEERING."
In addition, details and formats for DNA shuffling are found in a variety of PCT and foreign patent application publications, including: Stemmer and Crameri, "DNA MUTAGENESIS BY RANDOM FRAGMENTATION AND REASSEMBLY" WO 95/22625; Stemmer and Lipschutz "END COMPLEMENTARY POLYMERASE CHAIN REACTION" WO 96/33207; Stemmer and Crameri "METHODS FOR GENERATING POLYNUCLEOTIDES HAVING DESIRED CHARACTERISTICS BY ITERATIVE SELECTION AND RECOMBINATION" WO 97/0078; Minshull and Stemmer, "METHODS AND COMPOSITIONS FOR CELLULAR AND METABOLIC ENGINEERING" WO 97/35966; Punnonen et al. "TARGETING OF GENETIC VACCINE VECTORS" WO 99/41402; Punnonen et al. "ANTIGEN LIBRARY IMMUNIZATION" WO 99/41383; Punnonen et al. "GENETIC VACCINE VECTOR ENGINEERING" WO 99/41369; Punnonen et al. "OPTIMIZATION OF IMMUNOMODULATORY PROPERTIES OF GENETIC VACCINES" WO 9941368; Stemmer and Crameri, "DNA MUTAGENESIS BY RANDOM FRAGMENTATION AND REASSEMBLY" EP 0934999; Stemmer "EVOLVING CELLULAR DNA UPTAKE BY RECURSIVE SEQUENCE RECOMBINATION" EP 0932670; Stemmer et al., "MODIFICATION OF VIRUS TROPISM AND HOST RANGE BY VIRAL GENOME SHUFFLING" WO 9923107; Apt et al., "HUMAN PAPILLOMA VIRUS VECTORS" WO 9921979; Del Cardayre et al. "EVOLUTION OF WHOLE CELLS AND ORGANISMS BY RECURSIVE SEQUENCE RECOMBINATION" WO 9831837; Patten and Stemmer, "METHODS AND COMPOSITIONS FOR POLYPEPTIDE ENGINEERING" WO 9827230; and Stemmer et al., "METHODS FOR OPTIMIZATION OF GENE THERAPY BY RECURSIVE SEQUENCE SHUFFLING AND SELECTION" WO 9813487. Certain U.S. Applications provide additional details regarding DNA shuffling and related techniques, including "SHUFFLING OF CODON ALTERED GENES" by Patten et al. filed September 29, 1998 (USSN 60/102,362), January 29, 1999 (USSN 60/117,729), and September 28, 1999, USSN 09/22588 (Attorney Docket Number 20-28520US/PCT); "EVOLUTION OF WHOLE CELLS AND ORGANISMS BY RECURSIVE SEQUENCE RECOMBINATION" by del Cardayre et al. filed July 15, 1998 (USSN 09/166,188), and July 15, 1999 (USSN 09/354,922); "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" by Crameri et al., filed February 5, 1999 (USSN 60/118,813) and filed June 24, 1999 (USSN 60/141,049) and filed September 28, 1999 (USSN 09/408,392, Attorney Docket Number 02-29620US); "USE OF CODON-BASED OLIGONUCLEOTIDE SYNTHESIS FOR SYNTHETIC SHUFFLING" by Welch et al., filed September 28, 1999 (USSN 09/408,393, Attorney Docket Number 02-010070US); and "METHODS FOR MAKING CHARACTER STRINGS, POLYNUCLEOTIDES & POLYPEPTIDES HAVING DESIRED CHARACTERISTICS" by Selifonov and Stemmer, filed February 5, 1999 (USSN 60/118854, USSN 09/416,375 and USSN 09/494,282).
As review of the foregoing publications, patents, published applications and U.S. patent applications reveals, shuffling (or "recursive recombination") of nucleic acids to provide new nucleic acid libraries, which are optionally screened for desired properties, is optionally carried out by a number of established methods. Any of these methods can be adapted to the present invention to produce libraries that are screened by mass spectrometry. In brief, at least five different general classes of recombination methods are applicable to the present invention. First, nucleic acids can be recombined in vitro by any of a variety of techniques discussed in the references above, including e.g., DNAse digestion of nucleic acids to be recombined followed by ligation and/or PCR reassembly of the nucleic acids. Second, nucleic acids can be recursively recombined in vivo, e.g., by allowing recombination to occur between nucleic acids in cells. Third, whole cell genome recombination methods can be used in which whole genomes of cells are recombined, optionally including spiking of the genomic recombination mixtures with desired library components. Fourth, synthetic recombination methods are optionally used, in which oligonucleotides corresponding to different nucleic acids are synthesized and reassembled in PCR or ligation reactions which include oligonucleotides which correspond to more than one parental nucleic acid, thereby generating new recombined nucleic acids. Oligonucleotides can be made by standard nucleotide addition methods, or by tri-nucleotide synthetic approaches. Fifth, in silico methods of recombination can be effected in which genetic algorithms are used in a computer to recombine sequence strings which correspond, e.g., to genes or proteins of interest. The resulting recombined sequence strings are optionally converted into nucleic acids by synthesis of nucleic acids that correspond to the recombined sequences, e.g., in concert with oligonucleotide synthesis/gene reassembly techniques.
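The fifth, in silico class can be illustrated with a minimal sketch of a crossover operator of the kind used in genetic algorithms. This is an illustrative assumption and not the specific procedure of the applications cited above: it assumes that the parental sequence strings are already aligned and of equal length, and the function names and example sequences are hypothetical.

```python
import random


def crossover(parent_a, parent_b, n_points=1, rng=None):
    """Recombine two aligned, equal-length sequence strings.

    The child copies from one parent and switches to the other parent
    at each of `n_points` randomly chosen crossover positions.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("this sketch assumes aligned, equal-length parents")
    rng = rng or random.Random()
    # Crossover positions fall strictly inside the sequence.
    points = sorted(rng.sample(range(1, len(parent_a)), n_points))
    parents = (parent_a, parent_b)
    pieces, source, start = [], 0, 0
    for end in points + [len(parent_a)]:
        pieces.append(parents[source][start:end])
        source = 1 - source  # switch to the other parent
        start = end
    return "".join(pieces)


def make_library(parents, size, n_points=2, rng=None):
    """Build `size` recombined strings from randomly paired parents."""
    rng = rng or random.Random()
    return [
        crossover(*rng.sample(parents, 2), n_points=n_points, rng=rng)
        for _ in range(size)
    ]


# Hypothetical aligned parental sequences that differ at a few positions.
parents = ["ATGGCTAAAGGT", "ATGGCCAAGGGA"]
library = make_library(parents, size=5, rng=random.Random(0))
for member in library:
    print(member)
```

Each resulting string could then be converted into a nucleic acid by synthesis, as described above. A fuller genetic algorithm would add scoring and selection of the recombined strings between rounds, which this sketch omits.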
Any of the preceding general recombination formats is optionally practiced in a reiterative fashion to generate a more diverse set of recombinant nucleic acids.
The above references provide these and other basic recombination formats as well as many modifications of these formats. Regardless of the format that is used, the nucleic acids of the invention are optionally recombined (with each other or with related (or even unrelated) nucleic acids) to produce a diverse set of recombinant nucleic acids, including homologous nucleic acids, thereby providing a very fast way of exploring the manner in which different combinations of sequences can affect a desired result. For example, desired results include, but are not limited to, improved transfer efficiency or serum half-life.
DNA shuffling and related techniques provide a robust, widely applicable means of generating diversity useful for the engineering of proteins, pathways, cells and organisms with improved characteristics. In addition to the basic formats described above, it is sometimes desirable to combine recombination methodologies with other techniques for generating diversity. In conjunction with (or separately from) recombination-based methods, a variety of diversity generation methods can be practiced and the results (i.e., diverse populations of nucleic acids) evaluated. Additional diversity can be introduced into nucleic acids by methods that result in the alteration of individual nucleotides or groups of contiguous or non-contiguous nucleotides, e.g., mutagenesis methods. Mutagenesis methods include, for example, recombination (PCT/US 98/05223; Publ. No. WO98/42727); oligonucleotide-directed mutagenesis (for review see, Smith, Ann. Rev. Genet. 19: 423-462 (1985); Botstein and Shortle, Science 229: 1193-1201 (1985); Carter, Biochem. J. 237: 1-7 (1986); Kunkel, "The efficiency of oligonucleotide directed mutagenesis" in Nucleic acids & Molecular Biology, Eckstein and Lilley, eds., Springer Verlag, Berlin (1987)). Included among these methods are oligonucleotide-directed mutagenesis (Zoller and Smith, Nucl. Acids Res. 10: 6487-6500 (1982), Methods in Enzymol. 100: 468-500 (1983), and Methods in Enzymol. 154: 329-350 (1987)); phosphorothioate-modified DNA mutagenesis (Taylor et al., Nucl. Acids Res. 13: 8749-8764 (1985); Taylor et al., Nucl. Acids Res. 13: 8765-8787 (1985); Nakamaye and Eckstein, Nucl. Acids Res. 14: 9679-9698 (1986); Sayers et al., Nucl. Acids Res. 16: 791-802 (1988); Sayers et al., Nucl. Acids Res. 16: 803-814 (1988)); mutagenesis using uracil-containing templates (Kunkel, Proc. Nat'l. Acad. Sci. USA 82: 488-492 (1985) and Kunkel et al., Methods in Enzymol. 154: 367-382); and mutagenesis using gapped duplex DNA (Kramer et al., Nucl. Acids Res. 12: 9441-9456 (1984); Kramer and Fritz, Methods in Enzymol. 154: 350-367 (1987); Kramer et al., Nucl. Acids Res. 16: 7207 (1988); and Fritz et al., Nucl. Acids Res. 16: 6987-6999 (1988)). Additional suitable methods include point mismatch repair (Kramer et al., Cell 38: 879-887 (1984)), mutagenesis using repair-deficient host strains (Carter et al., Nucl. Acids Res. 13: 4431-4443 (1985); Carter, Methods in Enzymol. 154: 382-403 (1987)), deletion mutagenesis (Eghtedarzadeh and Henikoff, Nucl. Acids Res. 14: 5115 (1986)), restriction-selection and restriction-purification (Wells et al., Phil. Trans. R. Soc. Lond. A 317: 415-423 (1986)), and mutagenesis by total gene synthesis (Nambiar et al., Science 223: 1299-1301 (1984); Sakamar and Khorana, Nucl. Acids Res. 14: 6361-6372 (1988); Wells et al., Gene 34: 315-323 (1985); and Grundstrom et al., Nucl. Acids Res. 13: 3305-3316 (1985)). Kits for mutagenesis are commercially available (e.g., Bio-Rad, Amersham International, Anglian Biotechnology).
Other relevant references which describe methods of diversifying nucleic acids include Schellenberger U.S. Patent No. 5,756,316; U.S. Patent No. 5,965,408; Ostermeier et al. (1999) "A combinatorial approach to hybrid enzymes independent of DNA homology" Nature Biotech 17:1205; U.S. Patent No. 5,783,431; U.S. Patent No. 5,824,485; U.S. Patent No. 5,958,672; Jirholt et al. (1998) "Exploiting sequence space: shuffling in vivo formed complementarity determining regions into a master framework" Gene 215: 471; U.S. Patent No. 5,939,250; WO 99/10539; WO 98/58085 and WO 99/10539. Any of these or other available diversity generating methods can be combined, in any combination selected by the user, to produce a nucleic acid library, which may be screened, e.g., using the screening methods provided herein.
In general, making libraries includes the construction of recombinant nucleic acids and the expression of genes in transfected host cells. Molecular cloning techniques to achieve these ends are known in the art. A wide variety of cloning and in vitro amplification methods suitable for the construction of recombinant nucleic acids such as expression vectors are well-known to persons of skill. General texts which describe molecular biological techniques useful herein, including mutagenesis, include Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152, Academic Press, Inc., San Diego, CA ("Berger"); Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989 ("Sambrook"); and Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc. (supplemented through 1999) ("Ausubel").
Methods of transducing cells, including plant and animal cells, with nucleic acids as in library construction are generally available, as are methods of expressing proteins encoded by such nucleic acids. In the present invention, the libraries typically comprise the proteins that are encoded and expressed by such nucleic acid libraries. In addition to Berger, Ausubel and Sambrook, useful general references for culture of animal cells include Freshney (Culture of Animal Cells, a Manual of Basic Technique, third edition, Wiley-Liss, New York (1994)) and the references cited therein, Humason (Animal Tissue Techniques, fourth edition, W.H. Freeman and Company (1979)) and Ricciardelli, et al., In Vitro Cell Dev. Biol. 25:1016-1024 (1989). References for plant cell cloning, culture and regeneration include Payne et al. (1992) Plant Cell and Tissue Culture in Liquid Systems John Wiley & Sons, Inc. New York, NY (Payne); and Gamborg and Phillips (eds) (1995) Plant Cell, Tissue and Organ Culture; Fundamental Methods Springer Lab Manual, Springer-Verlag (Berlin Heidelberg New York) (Gamborg). A variety of cell culture media are described in Atlas and Parks (eds), The Handbook of Microbiological Media (1993) CRC Press, Boca Raton, FL (Atlas). Additional information for plant cell culture is found in available commercial literature such as the Life Science Research Cell Culture Catalogue (1998) from Sigma-Aldrich, Inc (St Louis, MO) (Sigma-LSRCCC) and, e.g., the Plant Culture Catalogue and supplement (1997) also from Sigma-Aldrich, Inc (St Louis, MO) (Sigma-PCCS).
IV. Other Libraries Screened By the System
In addition to shuffled libraries, the integrated system and methods of the present invention can be used to screen any other library which is advantageously screened by administering pools of library members to a biological system of interest. Because pooling provides a way of increasing screening throughput (i.e., because multiple library members are administered to the biological system simultaneously and are assayed for an activity of interest in an essentially simultaneous fashion), essentially all large libraries of structurally or functionally related compositions benefit from the increase in throughput. A wide variety of libraries are available, including those which can be obtained commercially, e.g., from SIGMA, ALDRICH, or other companies and those which can be generated by standard methods including those taught in Sambrook, Ausubel, and Berger (all supra). In addition, many public sources such as the National Institutes of Health, the National Cancer Institute, the ATCC and the Department of Agriculture possess libraries of interest, including peptide libraries and compound
libraries. Essentially any of the tens of thousands of commercially and publicly available libraries can be screened using the methods and systems herein.
V. Sample Purification and/or Separation
One method of purification involves separation of sample components. The samples are optionally separated, e.g., by column chromatography, prior to injection into the mass spectrometer. Separation techniques, such as high performance liquid chromatography or capillary zone electrophoresis, are typically performed in-line with the mass spectrometry analysis. Therefore, a column is typically operably coupled to the apparatus of the invention between the library and the mass spectrometer. The sample is typically eluted from the column and directly injected into the mass spectrometer for analysis. However, the cyclotron mass spectrometer and tandem mass spectrometry provide enough resolution to distinguish the various sample components even when they are not column-separated. For a general discussion of separation techniques and column chromatography, including chromatographic materials and properties, see, e.g., Gel Filtration Theory and Practice, from Pharmacia Fine Chemicals; Supelco Chromatography products (1998), from Supelco (Bellefonte, PA); and Practical Protein Chromatography: Methods in Molecular Biology, Vol. 11, Kenney and Fowell (Eds.), (1992) Humana Press, Totowa, NJ.
Alternatively, an off-line parallel purification system is used for sample purification. Such a system allows high-throughput mass spectrometry analysis because the samples are purified in a system that is not sequentially tied to the mass spectrometry analysis and therefore does not act as a rate limiting step for it. The system allows for the off-line parallel purification of samples with no time-consuming column separation. Off-line parallel purification is optionally performed as part of sample preparation on a preparation plate. This allows all samples to be sufficiently purified for mass spectrometry analysis without a column separation that is performed sequentially and in-line with the mass spectrometer.
To do this, the system provides a chemical purification step that is selected based on the type of sample analyzed. Furthermore, this chemical purification step can be performed in the wells of a preparation plate (e.g., a microtiter dish) in an off-line system.
For example, off-line chemical purification optionally comprises the use of a different or additional buffer when preparing samples of interest. Alternatively, an off-line parallel purification system comprises the use of an ion exchange resin when preparing the samples of the invention. Examples of off-line parallel purification systems have been developed by the inventors for use with mass-spectrometry systems generally, as in USSN 60/119,766 and USSN 09/502,283, which are incorporated herein by reference.
In other embodiments, no purification is needed prior to ionization.
Autosamplers
An "automatic sampler" is a robotic handler that transports samples from one location to another. An automatic sampler is used, for example, to transport samples from a preparation plate and inject them into a mass spectrometer for analysis. Examples of automatic samplers include the Gilson 8-probe microtiter autosampler and the microtiter autosampler from CTC Analytics. Automatic samplers optionally include robotic handlers that are used to pick colonies, such as a Q-bot, and/or add or remove reagents to or from the preparation plate.
An autosampler is coupled with the apparatus of the invention to transport samples from the preparation plate, where samples are optionally purified, to the mass spectrometer for injection and analysis. Autosamplers can be purchased from standard laboratory equipment suppliers such as Gilson and CTC Analytics. Such samplers function at rates of about 10 seconds/sample to about 1 min/sample.
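As an illustrative calculation only, the sampling rates noted above can be converted into the time needed to work through one plate. The following Python sketch assumes a 384 well microtiter plate and one injection per well; the function name is hypothetical and is not part of any autosampler software.

```python
def plate_time_minutes(n_samples: int, seconds_per_sample: float) -> float:
    """Total autosampler time for one plate, in minutes.

    Assumes one injection per sample and no dead time between injections
    (both are simplifying assumptions).
    """
    return n_samples * seconds_per_sample / 60.0


# Assumed plate size: 384 wells. Rates are the two limits quoted in the text.
fastest = plate_time_minutes(384, 10)  # about 10 seconds/sample
slowest = plate_time_minutes(384, 60)  # about 1 min/sample
print(f"384 samples: {fastest:.0f} to {slowest:.0f} minutes")
```

Under these assumptions a full plate takes about 64 minutes at 10 seconds/sample and about 384 minutes at 1 min/sample.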
In addition, robotic sample handlers are optionally used to pick samples into the preparation plate and add reagents in the off-line parallel purification system. Such robotic handlers include but are not limited to those produced by Beckman Instruments and Genetix (e.g., the Q-bot).
Mass Spectrometers
Mass spectrometry is a method that allows detection of a large variety of proteins and different small molecule metabolites. Ionspray and electrospray mass spectrometry are used in many different fields for the analysis of organic compounds and for characterization of biomacromolecules. MS is optionally coupled to a separation technique, such as high performance liquid chromatography or capillary zone electrophoresis, which is performed in-line with the mass spectrometry analysis. For a general discussion of mass spectrometry theory and techniques, see, e.g., Kirk-Othmer Encyclopedia of Chemical Technology, Volume 15, Fourth Edition, pages 1071-1094, and all references therein.
A variety of mass spectrometer instruments are commercially available. For example, Micromass (U.K.) produces a variety of instruments such as the Quattro LC (a compact triple stage quadrupole system optimized, e.g., for API LC-MS-MS) which utilizes a dual stage orthogonal "Z" spray sampling technique. Other triple stage quadrupole mass spectrometers (e.g., the "TSQ" spectrometer) are produced by the Finnigan Corporation. As noted, the MS in the system herein is preferably a cyclotron MS, e.g., as described in Marshall et al. (1998) "FOURIER TRANSFORM ION CYCLOTRON RESONANCE MASS SPECTROMETRY: A PRIMER" Mass Spectrometry Reviews 17:1-35 and the references cited therein (see esp. Appendix A, Appendix C and the references 1-149).
Electrospray methods are optionally used instead of gas chromatography procedures because no prior derivatization is required to inject the sample. Flow injection analysis methods (FIA) with ionspray-ionization and tandem mass spectrometry further the ability of the present invention to perform high-throughput mass spectrometry analysis. The ionspray method allows the samples to be injected without prior derivatization and the tandem mass spectrometry (MS-MS) allows extremely high efficiency in the analysis. Electrospray ionization is a very mild ionization injection method that allows detection of molecules that are polar and large, which are typically difficult to detect in GC-MS without prior derivatization. Modern electrospray mass spectrometers detect samples in femtomole quantities. This allows for decreased sample size, which is important in cases where blood, tissue or serum samples are being collected and tested. Since a couple of microliters are injected, samples are optionally injected in nanomolar concentrations. Quantitation is very reproducible, with standard errors ranging from 2% to 5%.
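The relationship between injection volume, concentration and amount of analyte on which this statement rests can be sketched in Python as follows. The 2 microliter volume and 1 nanomolar concentration are assumed values chosen for illustration, and the function name is hypothetical.

```python
def injected_femtomoles(volume_microliters: float, conc_nanomolar: float) -> float:
    """Amount of analyte delivered in one injection, in femtomoles.

    amount (mol) = volume (L) x concentration (mol/L)
    """
    volume_liters = volume_microliters * 1e-6   # microliters -> liters
    conc_molar = conc_nanomolar * 1e-9          # nanomolar -> mol/L
    moles = volume_liters * conc_molar
    return moles / 1e-15                        # mol -> femtomoles


# Assumed example: "a couple of microliters" taken as 2 uL at 1 nM.
amount_fmol = injected_femtomoles(2.0, 1.0)
print(f"{amount_fmol:.1f} fmol injected")
```

Because one microliter of a one nanomolar solution contains one femtomole, the assumed injection delivers about 2 femtomoles, which is consistent with the femtomole detection range noted above.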
Tandem mass spectrometry uses the fragmentation of precursor ions to fragment ions within a triple quadrupole MS. The separation of compounds with different molecular weights occurs in the first quadrupole by the selection of a precursor ion. The identification is performed by the isolation of a fragment ion after collisionally induced dissociation of the precursor ion in the second quadrupole. Reviews of this
technique can be found in Kenneth, L. et al. (1988) "Techniques and Applications of Tandem Mass Spectrometry," VCH Publishers, Inc.
Triple quadrupole mass spectrometers allow MS/MS analysis of samples. For example, a triple quadrupole mass spectrometer with electrospray and atmospheric pressure chemical ionization sources, such as a Finnigan TSQ 7000, is optionally used. The machine is optionally set to let one particular parent ion through the first quadrupole, after which the ion undergoes fragmentation reactions with an inert gas. The most prominent daughter ion can then be singled out in the third quadrupole. This method creates two checkpoints for analyte identification: the particle must have the correct mass-to-charge ratio for both the parent ion and the daughter ion. Tandem mass spectrometry thus leads to higher specificity and often also to a higher signal to noise ratio. It also introduces further separation by distinguishing the analyte from impurities with the same mass.
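The two checkpoints described above can be illustrated with a short Python sketch. The class and function names, the m/z values and the 0.5 m/z tolerance are hypothetical and chosen only for illustration; they are not instrument settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """Expected parent and daughter m/z for one analyte (hypothetical values)."""
    parent_mz: float
    daughter_mz: float


def passes_both_checkpoints(parent_mz: float, daughter_mz: float,
                            target: Transition, tol: float = 0.5) -> bool:
    """Return True only if both the parent and the daughter ion match the target."""
    parent_ok = abs(parent_mz - target.parent_mz) <= tol      # first quadrupole
    daughter_ok = abs(daughter_mz - target.daughter_mz) <= tol  # third quadrupole
    return parent_ok and daughter_ok


analyte = Transition(parent_mz=524.3, daughter_mz=136.1)

# An impurity with the same parent mass but a different fragment is rejected.
print(passes_both_checkpoints(524.4, 136.0, analyte))
print(passes_both_checkpoints(524.3, 210.2, analyte))
```

Requiring both matches is what lets an impurity that shares the parent mass be told apart from the analyte.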
Generally, cyclotrons are particle accelerators that range in size from a few inches up to 236 inches in pole diameter, and in energy from several million electron volts (MeV) up to 700-MeV protons. The larger synchrotrons, which have a ring of magnets rather than a solid pole, have diameters up to 800 feet across and produce particle energies of up to 30,000 MeV (30 GeV). Machines that accelerate particles in a straight line, such as Van de Graaff or linear accelerators, have lengths from a few feet to 2 miles (e.g., the Stanford linear accelerator). Each of these types of accelerators produces certain types of particles in a particular energy range.
In the present invention, a "Fourier transform cyclotron mass spectrometer" is typically used to provide analysis of multiple compounds in a single injection with millidalton resolution, such that proteins that differ by 10 or fewer amino acids, or by only one amino acid, for example, are distinguished. For a general discussion of Fourier transform cyclotron mass spectrometry theory and techniques, see, e.g., Fourier Transform Ion Cyclotron Resonance Mass Spectrometry: a Primer, Marshall et al., Mass Spectrometry Reviews, 17, 1-35 (1998) and all references therein. See, also, FT-ICR/MS: Analytical Applications of Fourier Transform Ion Cyclotron Resonance Mass Spectrometry, Bruce Asamoto, VCH Publishers, Inc. (1991).
Computer Interface
Control of the other elements of the integrated system and/or the analysis of detected system information is coupled to an appropriately programmed processor or computer, or computer readable medium, which functions to instruct the operation of these instrument elements in accordance with preprogrammed or user input instructions, receive data and information from these instruments, and interpret, manipulate and report this information to the user. As such, the computer is typically appropriately coupled to any library storage elements, injection elements, and/or the MS, and/or to any analog to digital or digital to analog converter element as desired.
The computer typically includes appropriate software for receiving user instructions, either in the form of user input into set parameter fields, e.g., in a GUI, or in the form of preprogrammed instructions, e.g., preprogrammed for a variety of different specific operations. The software then converts these instructions to appropriate language for instructing movement of library elements, control of the MS and the like. The computer then receives the data from one or more signal sensor/detectors included within the MS system, and interprets the data, either providing it in a user interpretable format, or uses that data to initiate further instructions, in accordance with the programming, e.g., such as in monitoring and control of injection rates, library selection, temperatures, applied fields, and the like.
In the present invention, the computer typically includes software for the monitoring of materials in the MS. Additionally the software is optionally used to control injection or withdrawal of material into or from the MS. The injection or withdrawal is used to select and quantify library members in the system.
In general, one or more instruction sets for MS operation and for signal detection and deconvolution are present in computer memory or on a computer-readable medium such as a computer hard drive or CD-ROM. These instruction sets are provided by the present invention and are accessed by the system during operation.
Typically, a computer used to transform signals from the detection device into reaction rates will be a PC-compatible computer (e.g., having a central processing unit (CPU) compatible with x86 CPUs (e.g., a Pentium I, II or III class machine), and running an operating system such as LINUX, DOS™, OS/2 Warp™, WINDOWS/NT™, WINDOWS/NT™ workstation, or WINDOWS 98™), or a Macintosh™ (running MacOS™), or a UNIX workstation (e.g., a SUN™ workstation running a version of the Solaris™ operating system, a PowerPC™ workstation or a mainframe computer), all of which are commercially common, and known to one of skill in the art. Data analysis software on the computer is then employed to deconvolute signal information. Software for these purposes is available, or can easily be constructed by one of skill using a standard programming language such as Visual Basic, Fortran, Basic, Java, or the like.
One of skill will immediately recognize that any, or all, of these components are optionally manufactured in separable modular units, and assembled to form an apparatus or system of the invention. Computers, MS detectors, library manipulation robots, and the like are optionally manufactured in a single unit, but more commonly are constructed as separate modules which are assembled to form an apparatus or system for analyzing a library of components. Further, a computer does not have to be physically associated with the rest of the apparatus to be "operably linked" to the apparatus. A computer is operably linked when data is delivered from other components of the apparatus to the computer. One of skill will recognize that operable linkage can easily be achieved using either conductive cable coupled directly to the computer (e.g., USB, parallel, serial, ethernet, or phone line cables), or using data recorders which store data to computer readable media (typically magnetic or optical storage media such as computer disks and diskettes, CDs, magnetic tapes, but also optionally including physical media such as punch cards, vinyl media or the like) which is then accessed by the computer.
VI. Screening By Cyclotron Mass Spectrometry
The library
A "library of compositions" in the present invention is optionally composed of proteins, nucleic acids, or pharmacologically active compositions, e.g., drugs, small organic molecules or peptides. The libraries of the present invention comprise mixtures of components that are optionally injected into or administered to subjects, e.g., animals. Samples are then removed from the subject at various times, which samples contain various concentrations of the library components. At this point the concentration of each library component in the sample depends on how much of the component was transferred to the sample material after administration and how much of the component has been degraded in the body of the subject.
Typically the libraries are proteins and more typically protein libraries derived from shuffled nucleic acids as described above in the section on making libraries. When using the present invention to obtain a composition with an increased serum half-life or higher transfer efficiency, shuffled libraries are used and the steps of the methods are iteratively repeated to produce compositions with improved properties, such as a longer serum half-life.
"Nucleic acid" or "polynucleotide" library components in the present invention refer to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form which typically encode a protein or protein fragment (e.g., an extein or intein) of interest. The term encompasses nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, and even peptide-nucleic acids (PNAs). The term "related nucleic acids" is used herein to refer to a group of homologous nucleic acid sequences, for example gene sequences encoding homologous proteins, e.g., pharmacologically active proteins, or protein subunits that have been mutated, e.g., by evolving or shuffling to create new and/or related genes that encode proteins with enhanced serum half-lives or transfer properties, either alone or in combination with other genes.
The term "recombinant" when used with reference to a nucleic acid or protein indicates that the nucleic acid or protein has been modified by the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic acid or protein. For example, recombinant genes or nucleic acids are those that are not native forms of the genes, nucleic acids, or proteins.
The terms "polypeptide," "peptide," and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms also apply to amino acid polymers in which one or more amino acid residue is an analog or mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers.
The term "amino acid" refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. "Amino acid analogs" refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α-carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine), N-substitutions, or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. "Amino acid mimetics" refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that function in a manner similar to a naturally occurring amino acid.
The term "encoded proteins" refers to proteins that are encoded by nucleic acids. In the present invention encoded proteins are typically those proteins encoded and/or expressed by a group of related nucleic acids or genes that have been shuffled to create nucleic acids that encode new proteins with enhanced or improved properties, such as higher transfer efficiencies or longer serum half-lives.
The libraries of the present invention optionally comprise "pharmacologically active compositions." Such compositions are those that have a pharmaceutical or biological effect on a living system, e.g., drugs or protein pharmaceuticals.
In addition, the libraries of the present invention are optionally "unlabeled" libraries. In other words, the library members, e.g., the components or compositions in the library, do not contain a label moiety or tag. Therefore, the libraries are better for use in screening for transfer properties because no tag is present to interfere with the analysis. Furthermore, the library members can be easily administered to the subject animal because they do not contain a label moiety. In the present invention, the use of cyclotron mass spectrometry can distinguish between the library members, e.g., distinguish between up to 300 or more different peptides, when they are injected into the mass spectrometer in a complex mixture. Therefore, the library is also optionally
administered to the subject plant, animal or in vitro system in one complex mixture, instead of requiring injection of each individual library member into a different animal.
The libraries in the present invention are optionally purified and/or separated libraries. Purification and/or separation occurs before injection into a mass spectrometer, e.g., to remove components that interfere with the analysis. For example, a library comprising a blood or plant or animal tissue sample is optionally fractionated to remove unwanted components, spun out to remove red blood cells, or purified using monoclonal antibodies to enrich the sample for the components of interest, e.g., protein pharmaceuticals. The crude sample or the semi-purified sample containing the components of interest is optionally injected into the mass spectrometer for analysis.
The size of the libraries in the present invention varies. A library typically comprises more than about 20 or more than about 35 compositions. More typically the libraries are between about 50 and about 150 or more compositions and most typically, the libraries comprise more than about 150 compositions. For example, the library optionally comprises about 300 or more peptides or proteins. The library, e.g., comprising 300 pharmaceutical peptides or more (e.g., 384 as is conveniently stored and accessed from a 384 well microtiter plate), is optionally injected into an animal in one single dose or administration. Then the sample removed from the animal at a later time comprises those 300 peptides in a concentration indicative of their transfer efficiency and/or serum half-life.
Administration Of The Library To An Organism
When screening the library for transfer efficiency or serum half-life, a library as described above is administered to an animal, e.g., a mouse, rabbit, human, or the like, and a sample is obtained from the animal at a later time. The delivery of the library of compositions to an animal optionally occurs via a number of routes, such as oral, nasal, pulmonary, mucosal, dermal delivery or the like. Alternatively the administration is optionally by injection, either intramuscularly, intraperitoneally, or intravenously or the like. In plants, administration can be by injection, ballistic delivery, topical delivery, or using a vector for delivery to plant cells which are regenerated to form whole plants.
Administration of the library optionally comprises delivery of the entire library in one or a few doses to a single animal or plant, or delivery of one or more components of the library to more than one different plant or animal. In the latter case,
resulting samples are optionally pooled into one sample for MS analysis, thereby increasing the throughput of the system.
The Sample
Samples, e.g., blood, tissue, serum samples, or the like, are removed from the subject plant or animal, e.g., mouse, rabbit, human, corn plant, soybean plant, or the like, at various times. Alternatively, the sample can be withdrawn from an in vitro analysis system, such as a system comprising multiple chambers separated by biological tissues, or synthetic structures which mimic such tissues, or simply from cell culture.
For example, a sample may be taken upon administration of the library of compositions to an animal to determine the initial concentration of compounds in the library and then again at a later time to determine the change in concentration. Alternatively, the library delivered to the subject animal, e.g., orally, may contain a known concentration of each component, and a sample is then taken at a later time from the bloodstream of the subject to determine the efficiency of transfer, e.g., from the mouth and/or gastrointestinal tract to the bloodstream.
Purification of the sample
The samples of the present invention are optionally purified before injection or delivery of the sample to the mass spectrometer. Purification occurs as described above in a chromatographic system or by off-line parallel purification as described in U.S. Application Number 60/119,766, filed February 11, 1999, by Sun Ai Raillard and USSN 09/502,283, filed February 11, 2000. However, the samples are optionally injected as crude serum samples into the mass spectrometers or as semi-purified samples containing non-separated components.
When purifying the samples, e.g., a blood sample, components that interfere with the analysis are optionally removed. Red blood cells are optionally removed simply by centrifuging the samples. Alternatively, samples may be enriched in the components of interest. For example, for protein drugs, a C-terminal peptide is optionally added to allow purification using monoclonal antibodies. Alternatively, conserved epitopes can be used for purification with antibodies. In either case, the mixture of components is optionally injected into the mass spectrometer in one injection for simultaneous analysis.
Mass spectrometry
The libraries of the present invention are analyzed by mass spectrometry, typically cyclotron mass spectrometry, more typically electrospray ionization Fourier transform cyclotron mass spectrometry. The cyclotron mass spectrometry detects over a range from zero to several hundred thousand daltons and has millidalton resolution. The high resolution capability of the cyclotron mass spectrometer provides separation and independent quantitation of molecules that differ in mass by less than 100 millidaltons.
Since different amino acids differ in mass by more than that, all of the 19 single amino acid mutants of a protein at a specific site are optionally measured directly by this method. For example, a mixture of 100 different shuffled proteins is seen as 100 separate peaks in the resultant mass spectrometry data, generally even where the proteins all have the same overall number of amino acids. This allows the simultaneous detection of the 100 or more proteins from the library even when those proteins are closely related or differ by only one or a few amino acids. The term "simultaneously" refers to events that occur at essentially the same time. For example, all the compositions that make up the library in the present invention are optionally screened at one time. To save time in the analysis, all the compositions are injected into the mass spectrometer together and detected essentially together, thus allowing a high throughput mass spectrometry screening to occur. Actual analysis of the individual signals can be bifurcated after detection, i.e., signals can be analyzed simultaneously or sequentially.
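As a rough check on this mass argument, the pairwise mass differences between the twenty standard amino acid residues can be tabulated. The Python sketch below uses standard monoisotopic residue masses, which are reference values supplied here as an assumption and do not appear in the text above; the function name is hypothetical.

```python
from itertools import combinations

# Monoisotopic residue masses in daltons (standard reference values, assumed here).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def ranked_mass_differences(masses):
    """Return (difference, residue_a, residue_b) tuples, smallest difference first."""
    diffs = [
        (abs(masses[a] - masses[b]), a, b)
        for a, b in combinations(sorted(masses), 2)
    ]
    return sorted(diffs)


ranked = ranked_mass_differences(RESIDUE_MASS)
for diff, a, b in ranked[:3]:
    print(f"{a}/{b}: {diff * 1000:.1f} millidaltons")
```

With these reference values, nearly every substitution shifts the protein mass by about 0.9 dalton or more. The two closest pairs are leucine and isoleucine, which are isomers of identical mass and so are not separated by mass alone, and lysine and glutamine, which differ by about 36 millidaltons and therefore remain within reach of millidalton resolution.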
The ability to resolve 100 or more individual components by cyclotron mass spectrometry allows the determination of which components in a library have the highest transfer efficiency. For example, in a blood sample obtained from a rabbit after oral ingestion of a library of protein pharmaceuticals at equal concentrations, or after inhalation of a library of protein pharmaceuticals in equal concentrations, the highest peaks represent the components at the highest concentrations, which are therefore typically transferred to the bloodstream more efficiently than those at lower concentrations in the samples. Of course, to make relative determinations, both the starting concentration of the components in the library and the tested concentrations are typically determined. The components corresponding to the highest transfer efficiency are then subjected to tandem mass spectrometry to characterize the fragments.
The cyclotron mass spectrometry techniques described here and in the references included herein are optionally used to perform screening as described below.
VII. Screening for Transfer Efficiency
In one embodiment, the present invention provides for screening samples, e.g., blood, tissue, serum samples, or the like, for transfer efficiency. A library of compounds is optionally screened to determine the transfer efficiency of each individual compound in the library. Those compounds with the highest transfer efficiencies are optionally used to develop new compounds with improved transfer efficiencies compared to the original compounds. Mass spectrometry data is used to determine the transfer efficiency of each individual compound by comparing the concentration of a component at the time of administration of the library to the animal with the concentration of the same component at a later time. For example, a sample comprising a library of compounds, each component at a known concentration, is optionally administered to an animal, e.g., orally, intranasally, intraperitoneally, intramuscularly, or the like. At a later time, a blood sample is removed from the animal. By comparing the starting concentration at the time of administration to the concentration in the removed sample, a transfer efficiency is obtained for each component.
Those compounds identified as having the highest transfer efficiency in the screening process are then optionally selected for further recombination. The compositions with the highest transfer efficiency, e.g., the top 20%, top 10%, or top 5%, are optionally recombined to produce a new library which is then screened as described above. The selection process can then be iteratively performed to produce new compounds, e.g., pharmaceuticals, with high transfer efficiencies.
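The comparison and selection steps described above can be illustrated with a short sketch. It is not part of the disclosed system: the function names, the dictionary layout, and the definition of transfer efficiency as the simple ratio of recovered to administered concentration are assumptions made for illustration only.

```python
from math import ceil


def transfer_efficiencies(administered, recovered):
    """Return {component: recovered / administered concentration}.

    Both arguments are hypothetical dicts mapping a component name to a
    concentration in the same units. A component absent from the recovered
    sample is treated as having a recovered concentration of zero.
    """
    efficiencies = {}
    for name, start in administered.items():
        if start <= 0:
            raise ValueError(f"starting concentration of {name} must be positive")
        efficiencies[name] = recovered.get(name, 0.0) / start
    return efficiencies


def select_top(efficiencies, fraction):
    """Return the names of the top `fraction` of components, best first."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_keep = max(1, ceil(len(efficiencies) * fraction))
    ranked = sorted(efficiencies, key=efficiencies.get, reverse=True)
    return ranked[:n_keep]


if __name__ == "__main__":
    # Made-up concentrations, for illustration only.
    administered = {"A": 10.0, "B": 10.0, "C": 5.0, "D": 20.0}
    recovered = {"A": 1.0, "B": 4.0, "C": 1.0}
    eff = transfer_efficiencies(administered, recovered)
    print(select_top(eff, 0.5))
```

The names returned by `select_top` would correspond to the members carried forward into the next round of recombination.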
VIII. Screening for Serum Half-life
In another embodiment, the present invention is used to determine the serum half-life of a variety of compounds. Alternatively, the invention is used to evolve compounds with an increased serum half-life. In general, cyclotron mass spectrometry is used to screen a library of compounds, e.g., pharmaceuticals, proteins, or the like, for serum half-life. The serum half-life is optionally determined and the compositions with the longest serum half-life are optionally selected for further analysis and manipulation to produce new compounds with a long serum half-life.
To determine the half-life of a library of compounds in a biological system such as blood serum, the library is co-administered to a subject organism and then samples of serum, or another tissue or fluid of interest, are removed at a time, t1, and a later time, t2. For example, t1 is optionally at or shortly after the time of administration and t2 a few minutes to several hours or days later. The decrease in concentration of each component over time is analyzed to provide the half-life of any analyzed member in serum or any other tissue or biological fluid. Those compounds that have the smallest concentration change over time are those with the longest serum half-life. Alternatively, a known concentration of each component of the library is administered intravenously to the subject. A serum sample is then removed at a later time, e.g., a few hours later. The components with the highest relative concentrations (i.e., as compared to the amount administered), as determined by mass spectrometry, at the later time, are those with the longest serum half-lives. The half-life can be calculated using the computer system described above based on the predetermined beginning concentration injected into the bloodstream and the concentration at the specified later time. Instruction sets for performing this calculation are an optional feature of the integrated system.
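As an illustration of such a calculation, the following sketch estimates a half-life from two concentration measurements. The text does not specify a kinetic model; the sketch assumes simple first-order (exponential) clearance, C(t) = C0·exp(-k·t), so that the half-life is ln 2 / k. The function names and data layout are hypothetical.

```python
import math


def half_life(c1, c2, t1, t2):
    """Estimate a half-life from concentrations c1 at time t1 and c2 at t2.

    Assumes first-order decay (an assumption of this sketch, not stated in
    the text). The result is in the same time units as t1 and t2.
    """
    if t2 <= t1:
        raise ValueError("t2 must be later than t1")
    if c1 <= 0 or c2 <= 0:
        raise ValueError("concentrations must be positive")
    if c2 >= c1:
        # No measurable decrease between the two samples.
        return math.inf
    rate_constant = math.log(c1 / c2) / (t2 - t1)
    return math.log(2) / rate_constant


def rank_by_half_life(measurements, t1, t2):
    """Sort component names by estimated half-life, longest first.

    `measurements` is a hypothetical dict: name -> (c1, c2).
    """
    estimates = {
        name: half_life(c1, c2, t1, t2) for name, (c1, c2) in measurements.items()
    }
    return sorted(estimates, key=estimates.get, reverse=True)
```

With only two time points the estimate is sensitive to measurement noise; additional samples and a fitted decay curve would be a natural refinement.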
In another embodiment, the compositions identified as having the longest serum half-lives in the screening process are optionally selected for recombination, e.g., shuffling as described above. For example, the nucleic acids corresponding to these compositions are recombined to produce new nucleic acids, which are expressed to provide a protein library which is then screened as described above for compounds with a long serum half-life. This process can be repeated iteratively to produce compounds with an improved serum half-life compared to the original and/or to presently available compounds.
In one embodiment, a library of polypeptides comprising a library of epitope tags is optionally cloned into a vector. The polypeptides of interest are expressed and purified using the epitope tags and a half-life selection assay is performed. For example, the pool of proteins is injected into an animal, e.g., a mouse, a monkey, or the like. The pool of proteins is then isolated from a desired fraction, i.e., a fraction left in the bloodstream of the animal and removed after a desired number of half-lives. The purified protein is then cleaved and the epitope tagged fragments are optionally purified and screened using mass spectrometry, e.g., cyclotron MS, thereby identifying the most abundant peaks, representing those polypeptides with the longest serum half-life. For example, the library is optionally constructed so that there are no tags having a different sequence but the same molecular weight. The sequence of the polypeptide is deduced and the corresponding oligonucleotide is synthesized, screened and subjected to diversity generation as described above to yield polypeptides with improved half-lives.
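The requirement that no two tags share a molecular weight can be checked computationally before a tag library is built. The sketch below is an illustration only: it uses standard monoisotopic residue masses, treats each tag as an unmodified linear peptide, and flags pairs of tags whose masses differ by no more than a tolerance. The function names and the default tolerance of 0.01 Da are assumptions, not values taken from the text.

```python
from itertools import combinations

# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # added once per linear peptide


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide (one-letter codes)."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


def mass_collisions(tags, tolerance_da=0.01):
    """Return pairs of distinct tags whose masses differ by <= tolerance_da.

    The tolerance is a hypothetical parameter; a suitable value depends on
    the resolution of the instrument actually used.
    """
    masses = {tag: peptide_mass(tag) for tag in set(tags)}
    return [
        (a, b)
        for a, b in combinations(sorted(masses), 2)
        if abs(masses[a] - masses[b]) <= tolerance_da
    ]
```

Note that leucine and isoleucine are isobaric, and that any two tags with the same amino acid composition in a different order have the same mass, so such pairs are reported as collisions by this check.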
Other polypeptide selection methods of interest include, but are not limited to, screening uptake by macrophages and degradation in lysosomes, e.g., to get proteins having longer or shorter half-lives in lysosomes; oral administration of a protein followed by recovery of intact protein either within the gut, serum, or stool; contacting, e.g., a library of proteins, with a specific protease or with a mixture of proteases and recovering and quantitating, e.g., using the MS screening methods described above, the resistant molecules; and contacting with a ligand, washing, eluting and quantitating by MS.
Table I lists proteins that are of particular commercial interest to the pharmaceutical industry and thus optionally used in the libraries of the present invention. The majority of the proteins listed in Table I are either receptors or ligands of pharmaceutical interest. These proteins are all candidates for diversity generation to improve function, e.g., improved pharmaceutical properties, such as serum half-life or transfer efficiency, etc. All are well-suited to manipulation by the techniques of the invention.
First, high throughput methods for expressing and purifying libraries of mutant proteins are applied to the proteins of Table I. These mutants are screened for activity in a functional assay as described above, e.g., using cyclotron mass spectrometry. The genes from mutants with improved activity relative to wild-type are recovered and subjected to further diversity generation to improve the phenotype further.
Table I
POLYPEPTIDE CANDIDATES FOR EVOLUTION
Name
Alpha-1 antitrypsin
Alpha-, beta-, and gamma-interferon
Angiostatin
Antigens (e.g., peptide antigens)
Antihemolytic factor
Apolipoprotein
Apoprotein
Atrial natriuretic factor
Atrial natriuretic polypeptide
Atrial peptides
B7
C-X-C chemokines (e.g., T39765, NAP-2, ENA-78, Gro-a, Gro-b, Gro-c, IP-10, GCP-2, NAP-4, SDF-1, PF4, MIG)
CTLA4 binding protein and CD28 binding protein
Calcitonin
CC chemokines (e.g., Monocyte chemoattractant protein-1, Monocyte chemoattractant protein-2, Monocyte chemoattractant protein-3, Monocyte inflammatory protein-1 alpha, Monocyte inflammatory protein-1 beta, RANTES, I309, R83915, R91733, HCC1, T58847, D31065, T64262)
CD40 ligand
Collagen
Colony stimulating factor (CSF)
Complement factor 5a
Complement inhibitor
Complement receptor 1
Cytokines
Erythropoietin
Factor IX
Factor VII
Factor VIII
Factor X
Fibrinogen
Fibronectin
Glucocerebrosidase
Gonadotropin
Granulocyte colony stimulating factor (GCSF)
Hedgehog proteins (e.g., Sonic, Indian, Desert)
Hemoglobin (for blood substitute; for radiosensitization)
Hirudin
Human serum albumin
Lactoferrin
Luciferase
Neurturin
Neutrophil inhibitory factor (NIF)
Osteogenic protein
Parathyroid hormone
Protein A
Protein G
Relaxin
Renin
Salmon calcitonin
Salmon growth hormone
Soluble complement receptor I
Soluble I-CAM 1
Soluble interleukin receptors (IL-1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15)
Soluble TNF receptor
Somatomedin
Somatostatin
Somatotropin
Streptokinase
Superantigens, i.e., Staphylococcal enterotoxins (SEA, SEB, SEC1, SEC2, SEC3, SED, SEE), Toxic shock syndrome toxin (TSST-1), Exfoliating toxins A and B, Pyrogenic exotoxins A, B, and C, and M. arthritidis mitogen
Superoxide dismutase
Thymosin alpha 1
Tissue plasminogen activator
Tumor necrosis factor beta (TNF beta)
Tumor necrosis factor receptor (TNFR)
Tumor necrosis factor-alpha (TNF alpha)
Urokinase
IX. Kits
The system described herein is optionally packaged to include many, if not all, of the necessary reagents, e.g., libraries, for performing the preferred function of screening for transfer properties and serum half-life. Such kits also optionally include appropriate containers and instructions for using the devices or integrated systems herein as well as necessary reagents, and in cases where reagents are not predisposed in elements of the device, with appropriate instructions for introducing the reagents into the library storage or preparation medium (e.g., a microtiter dish or duplicate dish) or mass spectrometer of the device. Such kits typically include a preparation plate with necessary reagents, e.g., a shuffled library, predisposed in the wells or separately packaged.
Generally, such reagents are provided in a stabilized form, so as to prevent degradation or other loss during prolonged storage, e.g., from leakage. A number of stabilizing processes are widely used for reagents that are to be stored, such as the inclusion of chemical stabilizers (i.e., enzymatic inhibitors, microcides/bacteriostats, anticoagulants), the physical stabilization of the material, e.g., through immobilization on a solid support, entrapment in a matrix (i.e., a gel), lyophilization, or the like.
The discussion above is generally applicable to the aspects and embodiments of the invention described above. Moreover, modifications can be made to the method and apparatus described herein without departing from the spirit and scope of the invention as claimed, and the invention can be put to a number of different uses including the following:
The use of a mass spectrometry system to perform screening of libraries for transfer efficiency.
The use of a mass spectrometry system for performing screening of libraries for serum half-life determinations.
The use of a mass spectrometry system for monitoring serum half-life or transfer efficiency.
The use of a mass spectrometry system as described herein to perform screening of shuffled proteins for transfer efficiency or serum half-life.
The use of a mass spectrometry system to obtain compounds with improved transfer efficiency.
The use of a mass spectrometry system to obtain compounds with an improved serum half-life.
An assay utilizing a mass spectrometry system as described herein.
While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the invention. For example, all the techniques and apparatus described above may be used in various combinations. All publications, patent applications, patents, and other documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication or patent document were individually so denoted.