WO2002046362A2

WO2002046362A2 - Gene associated with benign prostatic hyperplasia in humans

Info

Publication number: WO2002046362A2
Application number: PCT/US2001/045826
Authority: WO
Inventors: William E. Munger; Prakash Kulkarni; Robert H. Getzenberg
Original assignee: Japan Tobacco Inc; Ore Pharmaceuticals Inc
Current assignee: Japan Tobacco Inc; Ore Pharmaceuticals Inc
Priority date: 2000-12-06
Filing date: 2001-12-06
Publication date: 2002-06-13
Anticipated expiration: 2003-06-06
Also published as: WO2002046362A3; AU2002239466A1

Abstract

The invention relates generally to the changes in gene expression in Benign Prostatic Hyperplasia (BPH). The invention relates specifically to a human gene family which is differentially expressed in BPH compared to normal prostate tissue.

Description

GENE ASSOCIATED WITH BENIGN PROSTATIC HYPERPLASIA IN HUMANS

INVENTORS: Prakash Kulkarni, William E. Munger and Robert H. Getzenberg

RELATED APPLICATION

This application claims priority to U.S. Provisional Application 60/251,420, filed December 6, 2000, which is herein incorporated by reference in its entirety.

FIELD OF THE INVENTION

The invention relates generally to the changes in gene expression in prostate tissue removed from male patients with benign prostatic hyperplasia (BPH). The invention specifically relates to a novel human gene which is differentially expressed in BPH tissue compared to normal prostate tissue.

BACKGROUND OF THE INVENTION

Benign Prostatic Hyperplasia (BPH)

BPH is the most common benign tumor in men 60 years of age or older. It is estimated that one in four men living to the age of 80 will require treatment for this disease. BPH is usually noted clinically after the age of 50, the incidence increasing with age, but as many as two thirds of men between the ages of 40 and 49 demonstrate histological evidence of the disease.

The anatomic location of the prostate at the bladder neck enveloping the urethra plays an important role in the pathology of BPH, including bladder outlet obstruction. Two prostate components are thought to play a role in bladder outlet obstruction. The first is the relatively increased prostate tissue mass. The second component is the prostatic smooth muscle tone. The causative factors of BPH in man have been intensively studied (see Ziada et al., Urology 53:1-6, 1999). In general, the two most important factors appear to be aging and the presence of functional testes. Although these factors appear to be key to the development of BPH, both appear to be nonspecific.

Molecular Changes in BPH

Little is known about the molecular changes in prostate cells associated with the development and progression of BPH. It has been demonstrated that the expression levels of a number of individual genes are changed compared to normal prostate cells. These changes in gene expression include a decreased level of Wilms' tumor gene (WT-1) and increased expression of insulin-like growth factor II (IGF-II) (Dong et al., J Clin Endocrinol Metab 82:2198-2203, 1997). While the changes in the expression levels of a number of individual genes have been identified, the investigation of the global changes in gene expression has not been reported.

Accordingly, there exists a need for the investigation of the changes in global gene expression levels as well as the need for the identification of new molecular markers associated with the development and progression of BPH. Furthermore, if intervention is expected to be successful in halting or slowing down BPH, means of accurately assessing the early manifestations of BPH need to be established. One way to accurately assess the early manifestations of BPH is to identify markers which are uniquely associated with disease progression. Likewise, the development of therapeutics to prevent or stop the progression of BPH relies on the identification of genes responsible for BPH growth and function.

SUMMARY OF THE INVENTION

The present invention is based on the discovery of a new gene family that is differentially expressed in BPH tissue compared to normal prostate tissue. The invention includes isolated nucleic acid molecules selected from the group consisting of an isolated nucleic acid molecule that encodes the amino acid sequence of SEQ ID NOS: 2 or 4, an isolated nucleic acid molecule that encodes a fragment of at least 6 contiguous amino acids of SEQ ID NOS: 2 or 4, an isolated nucleic acid molecule which hybridizes to the complement of a nucleic acid molecule comprising SEQ ID NOS: 1 or 3 and an isolated nucleic acid molecule which hybridizes to the complement of a nucleic acid molecule that encodes the amino acid sequence of SEQ ID NOS: 2 or 4. Nucleic acid molecules of the invention may encode a protein having at least about 50%, 60%, or 65% amino acid sequence identity to SEQ ID NOS: 2 or 4, preferably at least about 70%-75% sequence identity, more preferably at least about 80%-85% sequence identity, and even more preferably at least about 90%-95% sequence identity to SEQ ID NOS: 2 or 4. The present invention further includes the nucleic acid molecules operably linked to one or more expression control elements, including vectors comprising the isolated nucleic acid molecules. The invention further includes host cells transformed to contain the nucleic acid molecules of the invention and methods for producing a protein of the invention, comprising culturing a host cell transformed with a nucleic acid molecule of the invention under conditions in which the protein is expressed.

The invention further provides an isolated polypeptide selected from the group consisting of an isolated polypeptide comprising the amino acid sequence of SEQ ID NOS: 2 or 4, an isolated polypeptide comprising a functional or antigenic fragment of at least 10 contiguous amino acids of SEQ ID NOS: 2 or 4, an isolated polypeptide comprising conservative amino acid substitutions of SEQ ID NOS: 2 or 4 and an isolated polypeptide comprising naturally occurring amino acid sequence variants of SEQ ID NOS: 2 or 4.

Polypeptides of the invention also include polypeptides with an amino acid sequence having at least about 50%, 60%, 65%, 70% or 75% amino acid sequence identity with the sequence set forth in SEQ ID NOS: 2 or 4, more preferably at least about 80%-85%, even more preferably at least about 90%, and most preferably at least about 95% sequence identity with the sequence set forth in SEQ ID NOS: 2 or 4.

The invention further provides an isolated antibody or antigen-binding fragment that specifically binds to a polypeptide of the invention, including monoclonal and polyclonal antibodies.

The invention further provides methods of identifying an agent which modulates the expression of a nucleic acid encoding a protein of the invention, comprising exposing cells which express the nucleic acid to the agent and determining whether the agent modulates expression of said nucleic acid, thereby identifying an agent which modulates the expression of a nucleic acid encoding the protein.

The invention further provides methods of identifying an agent which modulates the level of or at least one activity of a protein of the invention, comprising exposing cells which express the protein to the agent and determining whether the agent modulates the level of or at least one activity of said protein, thereby identifying an agent which modulates the level of or at least one activity of the protein.

The invention further provides methods of identifying binding partners for a protein of the invention, comprising exposing said protein to a potential binding partner and determining if the potential binding partner binds to said protein, thereby identifying binding partners for the protein.

The present invention further provides methods of modulating the expression of a nucleic acid encoding a protein of the invention, comprising administering an effective amount of an agent which modulates the expression of a nucleic acid encoding a protein of the invention. The invention also provides methods of modulating at least one activity of a protein of the invention, comprising administering an effective amount of an agent which modulates at least one activity of a protein of the invention.

The present invention further includes non-human transgenic animals modified to contain the nucleic acid molecules of the invention or mutated nucleic acid molecules such that expression of the encoded polypeptides of the invention is prevented.

The present invention also includes non-human transgenic animals in which all or a portion of the gene comprising all or a portion of SEQ ID NOS: 1 or 3 has been knocked out or deleted from the genome of the animal. The invention further provides methods of diagnosing BPH or other disease states, comprising the steps of: acquiring a tissue, blood, urine or other sample from a subject; and determining the level of expression of a nucleic acid molecule of the invention or polypeptide of the invention.

The invention further includes compositions comprising a diluent and a polypeptide or protein selected from the group consisting of an isolated polypeptide comprising all or a portion of SEQ ID NOS: 2 or 4, an isolated polypeptide comprising a fragment of at least 6 contiguous amino acids of SEQ ID NOS: 2 or 4, an isolated polypeptide comprising one or more conservative amino acid substitutions of SEQ ID NOS: 2 or 4, naturally occurring amino acid sequence variants of SEQ ID NOS: 2 or 4, and an isolated polypeptide with an amino acid sequence having at least about 50%, 60%, 70% or 75% amino acid sequence identity with the sequence set forth in SEQ ID NOS: 2 or 4, preferably at least about 80% to 85%, more preferably at least about 90%, and most preferably at least about 95% sequence identity with the sequence set forth in SEQ ID NOS: 2 or 4.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 Figure 1 shows the expression of Clone No. AA233368, SEQ ID NOS: 1 or 3, as analyzed by an Affymetrix GeneChip® set in normal human control prostate and BPH samples, including BPH samples from individuals without symptoms (BPH-w/o), samples from BPH patients without symptoms who were diagnosed with prostate cancer (BPH-Ca) and BPH samples from men with symptoms (BPH-With). Samples labeled "JPN" are from Japanese patients diagnosed as having BPH with symptoms. In all cases, the subregion of the prostate analyzed was the transitional zone. Both unadjusted values (upper panel; raw Average Difference values) and "floored" values are shown (lower panel; all Average Difference values less than 20 are assigned a value of 20). AA233368 is consistently and significantly upregulated (mean fold change=3.87; T-test, p<0.045) in BPH with symptoms vs. normal controls.
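The "flooring" step described for Figure 1 can be illustrated with a minimal Python sketch. The floor of 20 comes from the text above; the sample values, the function names, and the choice to compute fold change as a ratio of group means are assumptions made only for illustration, since the text does not state how the mean fold change was calculated.

```python
from statistics import mean

FLOOR = 20.0  # floor value stated in the Figure 1 description


def floor_values(values, floor=FLOOR):
    """Assign `floor` to every Average Difference value below it."""
    return [max(v, floor) for v in values]


def mean_fold_change(disease, control, floor=FLOOR):
    """Ratio of mean floored disease signal to mean floored control signal.

    Assumption: fold change is taken as a ratio of group means; the source
    text does not specify the exact calculation.
    """
    return mean(floor_values(disease, floor)) / mean(floor_values(control, floor))


# Hypothetical Average Difference values, for illustration only.
normal = [-5.0, 12.0, 30.0, 18.0]
bph_with_symptoms = [60.0, 95.0, 40.0, 85.0]

print(floor_values(normal))
print(round(mean_fold_change(bph_with_symptoms, normal), 2))
```

Flooring keeps negative or near-zero raw values from producing meaningless or inflated ratios, which is presumably why both panels are shown.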

Figure 2 Figure 2 shows the tissue distribution of RNA encoding a protein of AA233368 as analyzed by Northern blot in human tissues. Lanes 1-8 contain total RNA from: 1. Brain, 2. Heart, 3. Placenta, 4. Lung, 5. Liver, 6. Skeletal Muscle, 7. Kidney and 8. Pancreas. Lane M contains RNA markers, indicated by small dots (the size of each marker is labeled). Expression is strongest in the liver, where a single band of about 2.2 kb is observed, but lower levels of expression of the same apparent size message are observed (from higher to lower expression levels) in skeletal muscle and pancreas, then kidney, then at approximately the same level in placenta, lung, heart and brain.

Figure 3 Figure 3 shows an electronic Northern, in which the expression level of AA233368 was measured across a panel of normal tissues and tissues from subtypes of BPH patients (see Figure 1) using the Affymetrix 42K human GeneChip® set. For each tissue type, the mean +/- SDM is shown as a horizontal bar graph for samples obtained from 3 or more normal individuals. In normal tissues, the results show that expression from higher to lower levels is ovary > myometrium > kidney=liver=colon=small intestine=thymus=skeletal muscle > lung=endometrium=breast=cervix=prostate=stomach > spleen=esophagus > brain.

Figure 4 Figure 4 shows the results of semi-quantitative PCR for the expression of AA233368 (SEQ ID NOS: 1 or 3) in various normal tissues. Upper panel of the gel: 25 cycles (25 O); lower panel of the gel: 30 cycles (30 ). Double lanes are shown for each tissue: the primer pair F13-34/R324-345 was used for the first lane per tissue (TGGAGAAGGTTTCTCTCTCATC (forward, bases 13-34 of SEQ ID NOS: 1 or 3); CAGACCTGGAGTCCCTGCGGA (reverse, bases 324-345 of SEQ ID NOS: 1 or 3)) and the primer pair F73-94/R514-535 (GCCTGCAGGTGGACTACGTCTT (forward, bases 73-94 of SEQ ID NOS: 1 or 3); GGTCACTGAAGGAGGAACTGGG (reverse, bases 514-535 of SEQ ID NOS: 1 or 3)) was used for the second lane per tissue. Lanes: M) low mass DNA markers; 1) heart; 2) brain; 3) leukocytes; 4) lung; 5) liver; 6) fetal brain; 7) kidney; 8) spleen; 9) placenta; 10) BRF-55T (an immortalized, human BPH cell line); 11) glomeruli; and 12) osteoblast.
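As a worked illustration of the primer coordinates given for Figure 4, the following Python sketch computes the product length each primer pair would be expected to yield. It assumes that the base numbers are 1-based and inclusive and that the template is a contiguous cDNA matching SEQ ID NOS: 1 or 3; the function names are hypothetical.

```python
def span_length(first_base, last_base):
    """Length of a 1-based, inclusive coordinate span."""
    if last_base < first_base:
        raise ValueError("last_base must not precede first_base")
    return last_base - first_base + 1


def expected_amplicon_length(forward, reverse):
    """Product length from the first forward-primer base to the last reverse-primer base.

    `forward` and `reverse` are (first_base, last_base) tuples on the same
    strand numbering, as in the primer names F13-34/R324-345.
    """
    return span_length(forward[0], reverse[1])


# Coordinates taken from the Figure 4 description.
PRIMER_PAIRS = {
    "F13-34/R324-345": ((13, 34), (324, 345)),
    "F73-94/R514-535": ((73, 94), (514, 535)),
}

for name, (fwd, rev) in PRIMER_PAIRS.items():
    print(name, expected_amplicon_length(fwd, rev), "bp")
```

Under these assumptions the two pairs give products of 333 bp and 463 bp, so the paired lanes for each tissue would be expected to show bands of different sizes.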

Figure 5 Figure 5 is a hydrophobicity plot (PEPPLOT) of the protein encoded by the first open reading frame of AA233368 (SEQ ID NO: 2). Analysis was done using the methods of Goldman et al. and of Kyte-Doolittle.

Figure 6 Figure 6 is a hydrophobicity plot (PEPPLOT) of the protein encoded by the second open reading frame of AA233368 (SEQ ID NO: 4). As in Figure 5, analysis was done using the methods of Goldman et al. and of Kyte-Doolittle.
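For readers unfamiliar with hydrophobicity plots such as those of Figures 5 and 6, a sliding-window Kyte-Doolittle profile can be sketched in Python as below. This is not the PEPPLOT program itself and does not implement the Goldman et al. scale; the window size of 9 and the example peptide are assumptions chosen for illustration, and the peptide is not SEQ ID NO: 2 or 4.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Mean Kyte-Doolittle score for each window along the sequence.

    Returns one value per window start position; positive values indicate
    hydrophobic stretches, negative values hydrophilic stretches.
    """
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


# Hypothetical peptide, for illustration only.
example = "MKDEERLLIVAVLLGIAKRSE"
for position, value in enumerate(hydropathy_profile(example), start=1):
    print(position, round(value, 2))
```

Windows with strongly negative means mark hydrophilic stretches, which are the regions commonly considered when choosing candidate antigenic peptides.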

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

I. General Description

The present invention is based in part on the identification of a new gene family that is differentially expressed in human BPH tissue compared to normal human prostate tissue. This gene family corresponds to the human cDNA of SEQ ID NOS: 1 and 3. Genes that encode the human protein of SEQ ID NOS: 2 and 4 may also be found in other animal species, particularly mammalian species.

The proteins and nucleic acids of the invention may be used as diagnostic agents to detect BPH in a sample or monitor the progression of BPH in a patient. The proteins of the present invention can also serve as a target for agents that can be used to modulate the expression or activity of the proteins. For example, agents may be identified that modulate biological processes associated with prostate growth, including the hyperplastic process of BPH.

The present invention is further based on the development of methods for isolating binding partners that bind to a protein of the invention. Probes based on the protein are used as capture probes to isolate potential binding partners, such as other proteins. Dominant negative proteins, DNAs encoding these proteins, antibodies to these proteins, peptide fragments of these proteins or mimics of these proteins may be introduced into cells to affect function. Additionally, the protein provides novel targets for the screening of synthetic small molecules and combinatorial or naturally occurring compound libraries to discover novel therapeutics to regulate prostate function.

II. Specific Embodiments

A. The Proteins Associated with BPH

The present invention provides isolated proteins, allelic variants of the proteins, and conservative amino acid substitutions of the proteins. As used herein, the "protein" or "polypeptide" refers, in part, to a protein that has the human amino acid sequence depicted in SEQ ID NOS: 2 or 4. The terms also refer to naturally occurring allelic variants and proteins that have a slightly different amino acid sequence than that specifically recited above. Allelic variants, though possessing a slightly different amino acid sequence than those recited above, will still have the same or similar biological functions associated with these proteins.

The present invention also encompasses proteins translated from alternative splice variants of the genes encoding the identified proteins.

As used herein, the family of proteins related to the human amino acid sequence of SEQ ID NOS: 2 or 4 refers to proteins that have been isolated from organisms in addition to humans. The methods used to identify and isolate other members of the family of proteins related to these proteins are described below. The proteins of the present invention are preferably in isolated form. As used herein, a protein is said to be isolated when physical, mechanical or chemical methods are employed to remove the protein from cellular constituents that are normally associated with the protein. A skilled artisan can readily employ standard purification methods to obtain an isolated protein. The proteins of the present invention further include insertion, deletion or conservative amino acid substitution variants of SEQ ID NOS: 2 or 4. As used herein, a conservative variant refers to alterations in the amino acid sequence that do not adversely affect the biological functions of the protein. A substitution, insertion or deletion is said to adversely affect the protein when the altered sequence prevents or disrupts a biological function associated with the protein. For example, the overall charge, structure or hydrophobic/hydrophilic properties of the protein can be altered without adversely affecting a biological activity. Accordingly, the amino acid sequence can be altered, for example to render the peptide more hydrophobic or hydrophilic, without adversely affecting the biological activities of the protein.

Ordinarily, the allelic variants, the conservative substitution variants, and the members of the protein family, will have an amino acid sequence having at least about 50%, 60%, 65%, 70% or 75% amino acid sequence identity with the sequence set forth in SEQ ID NOS: 2 or 4, more preferably at least about 80%, even more preferably at least about 90%, and most preferably at least about 95% sequence identity. Identity or homology with respect to such sequences is defined herein as the percentage of amino acid residues in the candidate sequence that are identical with the known peptides, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent homology, and not considering any conservative substitutions as part of the sequence identity (see section B for the relevant parameters). Fusion proteins, or N-terminal, C-terminal or internal extensions, deletions, or insertions into the peptide sequence shall not be construed as affecting homology.
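The identity definition above can be made concrete with a short Python sketch that scores two sequences that have already been aligned. The alignment step itself (BLAST or Bestfit, per section B) is not reproduced here. The choice of denominator, namely the number of residues in the candidate sequence, is one reading of the definition and is an assumption, as are the function name and the example sequences.

```python
def percent_identity(candidate_aln, reference_aln, gap="-"):
    """Percent of candidate residues identical to the aligned reference residue.

    Both inputs must be rows of the same pairwise alignment (equal length,
    gaps written as `gap`). Conservative substitutions are not counted as
    identities, in line with the definition in the text.
    """
    if len(candidate_aln) != len(reference_aln):
        raise ValueError("aligned sequences must have equal length")
    candidate_residues = sum(1 for c in candidate_aln if c != gap)
    if candidate_residues == 0:
        raise ValueError("candidate contains no residues")
    identical = sum(
        1
        for c, r in zip(candidate_aln, reference_aln)
        if c != gap and c == r
    )
    return 100.0 * identical / candidate_residues


# Hypothetical aligned fragments, for illustration only.
candidate = "MKT-LLV"
reference = "MKTALIV"
print(round(percent_identity(candidate, reference), 1))
```

In this example the L-to-I mismatch is a conservative substitution but still counts against identity, so five of the six candidate residues are scored as identical.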

Thus, the proteins of the present invention include molecules having the amino acid sequence disclosed in SEQ ID NOS: 2 or 4; fragments thereof having a consecutive sequence of at least about 6, 10, 15, 20, 25, 30, 35 or more contiguous amino acid residues of these proteins; amino acid sequence variants wherein one or more amino acid residues has been inserted N- or C-terminal to, or within, the disclosed coding sequence; and amino acid sequence variants of the disclosed sequence, or their fragments as defined above, that have been substituted by at least one residue. Such fragments, also referred to as peptides or polypeptides, may contain antigenic regions, functional regions of the protein identified as regions of the amino acid sequence which correspond to known protein domains, as well as regions of pronounced hydrophilicity. The regions are all easily identifiable by using commonly available protein sequence analysis software such as MacVector (Oxford Molecular).

Contemplated variants further include those containing predetermined mutations by, e.g., homologous recombination, site-directed or PCR mutagenesis, and the corresponding proteins of other animal species, including but not limited to rabbit, mouse, rat, porcine, bovine, ovine, equine and non-human primate species, and the alleles or other naturally occurring variants of the family of proteins; and derivatives wherein the protein has been covalently modified by substitution, chemical, enzymatic, or other appropriate means with a moiety other than a naturally occurring amino acid (for example a detectable moiety such as an enzyme or radioisotope).

As described below, members of the family of proteins can be used: (1) to identify agents which modulate at least one activity of the protein; (2) to identify binding partners for the protein; (3) as an antigen to raise polyclonal or monoclonal antibodies; (4) as a therapeutic agent or target; and (5) as a diagnostic agent or marker.

B. Nucleic Acid Molecules

The present invention further provides nucleic acid molecules that encode the proteins having SEQ ID NOS: 2 and 4 and the related proteins herein described, preferably in isolated form. As used herein, "nucleic acid" is defined as RNA or DNA that encodes a protein or peptide as defined above, is complementary to a nucleic acid sequence encoding such peptides, hybridizes to such a nucleic acid and remains stably bound to it under appropriate stringency conditions, or encodes a polypeptide sharing at least about 50%, 60%, 70% or 75% sequence identity, preferably at least about 80%, more preferably at least about 85%, and even more preferably at least about 90% or 95% or more identity with the peptide sequences. Alternatively, nucleic acid molecules will have at least about 50%, 60%, 70% or 75% nucleotide sequence identity to SEQ ID NOS: 1 or 3, preferably about 80%, more preferably about 85%, and even more preferably about 90% or 95% or more identity, particularly through the open reading frames of SEQ ID NOS: 1 or 3. Specifically contemplated are genomic DNA, cDNA, mRNA and antisense molecules, as well as nucleic acids based on alternative backbones or including alternative bases whether derived from natural sources or synthesized. Such hybridizing or complementary nucleic acids, however, are defined further as being novel and unobvious over any prior art nucleic acid including that which encodes, hybridizes under appropriate stringency conditions, or is complementary to nucleic acid encoding a protein according to the present invention.

Homology or identity at the nucleotide or amino acid sequence level is determined by BLAST (Basic Local Alignment Search Tool) analysis using the algorithm employed by the programs blastp, blastn, blastx, tblastn and tblastx (Karlin et al., Proc Natl Acad Sci USA 87:2264-2268, 1990, and Altschul, J Mol Evol 36:290-300, 1993, fully incorporated by reference) which are tailored for sequence similarity searching. The approach used by the BLAST program is to first consider similar segments between a query sequence and a database sequence, then to evaluate the statistical significance of all matches that are identified and finally to summarize only those matches which satisfy a preselected threshold of significance. For a discussion of basic issues in similarity searching of sequence databases, see Altschul et al., Nature Genetics 6:119-129, 1994, which is fully incorporated by reference, or the Washington University (St. Louis, MO) BLAST web site at http://blast.wustl.edu. The search parameters for histogram, descriptions, alignments, expect (i.e., the statistical significance threshold for reporting matches against database sequences), cutoff, matrix and filter are at the default settings. The default scoring matrix used by blastp, blastx, tblastn, and tblastx is the BLOSUM62 matrix (Henikoff et al., Proc Natl Acad Sci USA 89:10915-10919, 1992, fully incorporated by reference). For blastn, the scoring matrix is set by the ratios of M (i.e., the reward score for a pair of matching residues) to N (i.e., the penalty score for mismatching residues), wherein the default values for M and N are 5 and -4, respectively. Four blastn parameters were adjusted as follows: Q=10 (gap creation penalty); R=10 (gap extension penalty); wink=1 (generates word hits at every wink-th position along the query); and gapw=16 (sets the window width within which gapped alignments are generated). The equivalent Blastp parameter settings were Q=9; R=2; wink=1; and gapw=32.
A Bestfit comparison between sequences, available in the GCG package version 10.0, uses DNA parameters GAP=50 (gap creation penalty) and LEN=3 (gap extension penalty) and the equivalent settings in protein comparisons are GAP=8 and LEN=2.

"Stringent conditions" are those that (1) employ low ionic strength and high temperature for washing, for example, 0.015 M NaCl/0.0015 M sodium citrate/0.1% SDS at 50°C, or (2) employ during hybridization a denaturing agent such as formamide, for example, 50% (vol/vol) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM NaCl, 75 mM sodium citrate at 42°C. Another example is hybridization in 50% formamide, 5x SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5x Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate at 42°C, with washes at 42°C in 0.2x SSC and 0.1% SDS. A skilled artisan can readily determine and vary the stringency conditions appropriately to obtain a clear and detectable hybridization signal. Preferred molecules are those that hybridize under the above conditions to the complement of SEQ ID NO: 1 and which encode a functional protein. Even more preferred hybridizing molecules are those that hybridize under the above conditions to the complement strand of the open reading frame of SEQ ID NO: 1.
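The salt concentrations in the stringency examples above are all multiples of the same SSC stock, which the following Python sketch makes explicit. The 1x composition is derived from the 5x SSC figures given in the text (0.75 M NaCl, 0.075 M sodium citrate); the function names are hypothetical and the sketch is only a unit-conversion aid, not a protocol.

```python
# 1x SSC composition in mol/L, derived from the 5x SSC values stated in the text.
SSC_1X = {"NaCl": 0.75 / 5, "sodium citrate": 0.075 / 5}


def ssc_components(fold):
    """Molar concentration of each SSC component at the given fold strength."""
    if fold < 0:
        raise ValueError("fold must be non-negative")
    return {name: conc * fold for name, conc in SSC_1X.items()}


def ssc_fold_from_nacl(nacl_molar):
    """SSC fold strength implied by a NaCl molarity."""
    return nacl_molar / SSC_1X["NaCl"]


# Conditions quoted in the text, expressed as SSC fold strength.
print(round(ssc_fold_from_nacl(0.015), 3))   # low ionic strength wash example
print(round(ssc_fold_from_nacl(0.750), 3))   # formamide hybridization example
print({k: round(v, 4) for k, v in ssc_components(0.2).items()})  # 0.2x SSC wash
```

Expressed this way, the first wash example corresponds to 0.1x SSC and the formamide hybridization example to 5x SSC, which makes the listed conditions easier to compare.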

As used herein, a nucleic acid molecule is said to be "isolated" when the nucleic acid molecule is substantially separated from contaminant nucleic acid molecules encoding other polypeptides.

The present invention further provides fragments of the encoding nucleic acid molecule. As used herein, a fragment of an encoding nucleic acid molecule refers to a small portion of the entire protein coding sequence. The size of the fragment will be determined by the intended use. For example, if the fragment is chosen so as to encode an active portion of the protein, the fragment will need to be large enough to encode the functional region(s) of the protein. For instance, fragments which encode peptides corresponding to predicted antigenic regions may be prepared (see Figures 5 and 6). If the fragment is to be used as a nucleic acid probe or PCR primer, then the fragment length is chosen so as to obtain a relatively small number of false positives during probing/priming (see the discussion in Section H).

Fragments of the encoding nucleic acid molecules of the present invention (i.e., synthetic oligonucleotides) that are used as probes or specific primers for the polymerase chain reaction (PCR), or to synthesize gene sequences encoding proteins of the invention, can easily be synthesized by chemical techniques, for example, the phosphoramidite method of Matteucci et al. (J Am Chem Soc 103:3185-3191, 1981), or using automated synthesis methods. In addition, larger DNA segments can readily be prepared by well known methods, such as synthesis of a group of oligonucleotides that define various modular segments of the gene, followed by ligation of oligonucleotides to build the complete modified gene.

The encoding nucleic acid molecules of the present invention may further be modified so as to contain a detectable label for diagnostic and probe purposes. A variety of such labels are known in the art and can readily be employed with the encoding molecules herein described. Suitable labels include, but are not limited to, biotin, radiolabeled nucleotides and the like. A skilled artisan can readily employ any such label to obtain labeled variants of the nucleic acid molecules of the invention.

Modifications to the primary structure itself by deletion, addition, or alteration of the amino acids incorporated into the protein sequence during translation can be made without destroying the activity of the protein. Such substitutions or other alterations result in proteins having an amino acid sequence encoded by a nucleic acid falling within the contemplated scope of the present invention.

C. Isolation of Other Related Nucleic Acid Molecules

As described above, the identification and characterization of the human nucleic acid molecule having SEQ ID NOS: 1 or 3 allows a skilled artisan to isolate nucleic acid molecules that encode other members of the protein family in addition to the sequences herein described. Further, the presently disclosed nucleic acid molecules allow a skilled artisan to isolate nucleic acid molecules that encode other members of the family of proteins in addition to the protein having SEQ ID NOS: 2 or 4.

Essentially, a skilled artisan can readily use the amino acid sequence of SEQ ID NOS: 2 or 4 to generate antibody probes to screen expression libraries prepared from appropriate cells. Typically, polyclonal antiserum from mammals such as rabbits immunized with the purified protein (as described below) or monoclonal antibodies can be used to probe a mammalian cDNA or genomic expression library, such as a lambda gt11 library, to obtain the appropriate coding sequence for other members of the protein family. The cloned cDNA sequence can be expressed as a fusion protein, expressed directly using its own control sequences, or expressed by constructions using control sequences appropriate to the particular host used for expression of the enzyme.

Alternatively, a portion of the coding sequence herein described can be synthesized and used as a probe to retrieve DNA encoding a member of the protein family from any mammalian organism. Oligomers containing approximately 18-20 nucleotides (encoding about a 6-7 amino acid stretch) are prepared and used to screen genomic DNA or cDNA libraries to obtain hybridization under stringent conditions or conditions of sufficient stringency to eliminate an undue level of false positives.
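The relationship between oligomer length and the encoded amino acid stretch (three nucleotides per codon, so 18 nucleotides span 6 residues) can be sketched in Python by listing the in-frame candidate oligomers of a coding sequence. The coding sequence below is hypothetical and is not taken from SEQ ID NOS: 1 or 3; the function name and the default of 6 codons are likewise assumptions for illustration.

```python
def inframe_oligos(coding_sequence, n_codons=6):
    """Yield (start_index, oligo) for every in-frame window of `n_codons` codons.

    `start_index` is 0-based. Each oligo is 3 * n_codons nucleotides long,
    e.g. 18 nucleotides for a 6-residue stretch.
    """
    if n_codons < 1:
        raise ValueError("n_codons must be at least 1")
    size = 3 * n_codons
    for start in range(0, len(coding_sequence) - size + 1, 3):
        yield start, coding_sequence[start:start + size]


# Hypothetical 24-nucleotide coding sequence, for illustration only.
cds = "ATGGCTAAAGGTCTGGAACCGTGA"
for start, oligo in inframe_oligos(cds):
    print(start, oligo, len(oligo))
```

Candidate oligomers produced this way would still need to be checked for specificity, since the text notes that hybridization stringency is chosen to limit false positives.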

Additionally, pairs of oligonucleotide primers can be prepared for use in a polymerase chain reaction (PCR) to selectively clone an encoding nucleic acid molecule. A PCR denature-anneal-extend cycle for using such PCR primers is well known in the art and can readily be adapted for use in isolating other encoding nucleic acid molecules.

Nucleic acid molecules encoding other members of the protein family may also be identified in existing genomic or other sequence information using any available computational method, including but not limited to: PSI-BLAST (Altschul et al., Nucleic Acids Res 25:3389-3402, 1997); PHI-BLAST (Zhang et al., Nucleic Acids Res 26:3986-3990, 1998); 3D-PSSM (Kelley et al., J Mol Biol 299(2):499-520, 2000); and other computational analysis methods (Shi et al., Biochem Biophys Res Commun 262(1):132-138, 1999, and Matsunami et al., Nature 404(6778):601-604, 2000).

D. rDNA Molecules Containing a Nucleic Acid Molecule

The present invention further provides recombinant DNA molecules (rDNAs) that contain a coding sequence. As used herein, a rDNA molecule is a DNA molecule that has been subjected to molecular manipulation in situ. Methods for generating rDNA molecules are well known in the art, for example, see Sambrook et al., Molecular Cloning - A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989. In the preferred rDNA molecules, a coding DNA sequence is operably linked to expression control sequences and/or vector sequences.

The choice of vector and/or expression control sequences to which one of the protein family encoding sequences of the present invention is operably linked depends directly, as is well known in the art, on the functional properties desired, e.g., protein expression, and the host cell to be transformed. A vector contemplated by the present invention is at least capable of directing the replication or insertion into the host chromosome, and preferably also expression, of the structural gene included in the rDNA molecule. Expression control elements that are used for regulating the expression of an operably linked protein encoding sequence are known in the art and include, but are not limited to, inducible promoters, constitutive promoters, secretion signals, and other regulatory elements.

Preferably, the inducible promoter is readily controlled, such as being responsive to a nutrient in the host cell's medium. In one embodiment, the vector containing a coding nucleic acid molecule will include a prokaryotic replicon, i.e., a DNA sequence having the ability to direct autonomous replication and maintenance of the recombinant DNA molecule extrachromosomally in a prokaryotic host cell, such as a bacterial host cell, transformed therewith. Such replicons are well known in the art. In addition, vectors that include a prokaryotic replicon may also include a gene whose expression confers a detectable marker such as a drug resistance. Typical bacterial drug resistance genes are those that confer resistance to ampicillin or tetracycline.

Vectors that include a prokaryotic replicon can further include a prokaryotic or bacteriophage promoter capable of directing the expression (transcription and translation) of the coding gene sequences in a bacterial host cell, such as E. coli. A promoter is an expression control element formed by a DNA sequence that permits binding of RNA polymerase and transcription to occur. Promoter sequences compatible with bacterial hosts are typically provided in plasmid vectors containing convenient restriction sites for insertion of a DNA segment of the present invention. Typical of such vector plasmids are pUC8, pUC9, pBR322 and pBR329, available from BioRad Laboratories (Richmond, CA), and pPL and pKK223, available from Pharmacia (Piscataway, NJ).

Expression vectors compatible with eukaryotic cells, preferably those compatible with vertebrate cells, such as prostate cells, can also be used to form rDNA molecules that contain a coding sequence. Eukaryotic cell expression vectors, including viral vectors, are well known in the art and are available from several commercial sources. Typically, such vectors are provided containing convenient restriction sites for insertion of the desired DNA segment. Typical of such vectors are pSVL and pKSV-10 (Pharmacia), pBPV-1/pML2d (International Biotechnologies, Inc.), pTDT1 (ATCC, #31255), the vector pCDM8 described herein, and the like eukaryotic expression vectors. Vectors may be modified to include prostate cell specific promoters if needed.

Eukaryotic cell expression vectors used to construct the rDNA molecules of the present invention may further include a selectable marker that is effective in a eukaryotic cell, preferably a drug resistance selection marker. A preferred drug resistance marker is the gene whose expression results in neomycin resistance, i.e., the neomycin phosphotransferase (neo) gene (Southern et al., J Mol Appl Genet 1:327-341, 1982). Alternatively, the selectable marker can be present on a separate plasmid, in which case the two vectors are introduced by co-transfection of the host cell, and transformants are selected by culturing in the appropriate drug for the selectable marker.

E. Host Cells Containing an Exogenously Supplied Coding Nucleic Acid Molecule

The present invention further provides host cells transformed with a nucleic acid molecule that encodes a protein of the present invention. The host cell can be either prokaryotic or eukaryotic. Eukaryotic cells useful for expression of a protein of the invention are not limited, so long as the cell line is compatible with cell culture methods and compatible with the propagation of the expression vector and expression of the gene product. Preferred eukaryotic host cells include, but are not limited to, yeast, insect and mammalian cells, preferably vertebrate cells such as those from a mouse, rat, monkey or human cell line. Preferred eukaryotic host cells include Chinese hamster ovary (CHO) cells available from the ATCC as CCL61, NIH Swiss mouse embryo cells (NIH/3T3) available from the ATCC as CRL 1658, baby hamster kidney cells (BHK), and the like eukaryotic tissue culture cell lines. Any prokaryotic host can be used to express a rDNA molecule encoding a protein of the invention. The preferred prokaryotic host is E. coli.

Transformation of appropriate cell hosts with a rDNA molecule of the present invention is accomplished by well known methods that typically depend on the type of vector used and host system employed. With regard to transformation of prokaryotic host cells, electroporation and salt treatment methods are typically employed (see, for example, Cohen et al., Proc Natl Acad Sci USA 69:2110, 1972; and Sambrook et al., Molecular Cloning - A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989). With regard to transformation of vertebrate cells with vectors containing rDNAs, electroporation, cationic lipid or salt treatment methods are typically employed (see, for example, Graham et al., Virol 52:456, 1973; or Wigler et al., Proc Natl Acad Sci USA 76:1373-1376, 1979).

Successfully transformed cells, i.e., cells that contain a rDNA molecule of the present invention, can be identified by well known techniques including the selection for a selectable marker. For example, cells resulting from the introduction of an rDNA of the present invention can be cloned to produce single colonies. Cells from those colonies can be harvested and lysed, and their DNA content examined for the presence of the rDNA using a method such as that described by Southern, J Mol Biol 98:503, 1975, or Berent et al., Biotech 3:208, 1985, or the proteins produced from the cell can be assayed via an immunological method.

F. Production of Recombinant Proteins using a rDNA Molecule

The present invention further provides methods for producing a protein of the invention using nucleic acid molecules herein described. In general terms, the production of a recombinant form of a protein typically involves the following steps:

A nucleic acid molecule is first obtained that encodes a protein of the invention, such as a nucleic acid molecule comprising, consisting essentially of or consisting of SEQ ID NOS: 1 or 3, nucleotides 57-1592 (57-1595 with the stop codon) of SEQ ID NO: 1, or nucleotides 553-861 (553-864 with the stop codon) of SEQ ID NO: 3. If the encoding sequence is uninterrupted by introns, as are these open reading frames, it is directly suitable for expression in any host.
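The coordinate arithmetic behind these open reading frames can be checked with a short sketch. It assumes, as the text implies, that the stated coordinates are 1-based and inclusive and that the stop codon is the three nucleotides immediately following the coding region; the function name and the returned fields are illustrative only.

```python
def orf_summary(start, end):
    """Summarize an ORF given 1-based, inclusive coordinates that exclude the stop codon."""
    coding_nt = end - start + 1
    if coding_nt % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    return {
        "coding_nt": coding_nt,           # nucleotides in the coding region
        "codons": coding_nt // 3,         # encoded amino acid residues
        "end_with_stop": end + 3,         # last nucleotide of the stop codon
        "python_slice": (start - 1, end)  # 0-based, half-open slice of the sequence
    }


seq1_orf = orf_summary(57, 1592)   # open reading frame of SEQ ID NO: 1
seq3_orf = orf_summary(553, 861)   # open reading frame of SEQ ID NO: 3
```

Under these assumptions the first open reading frame spans 1536 nucleotides (512 codons) and the second spans 309 nucleotides (103 codons), with stop codons ending at positions 1595 and 864, respectively.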

The nucleic acid molecule is then preferably placed in operable linkage with suitable control sequences, as described above, to form an expression unit containing the protein open reading frame. The expression unit is used to transform a suitable host, and the transformed host is cultured under conditions that allow the production of the recombinant protein. Optionally the recombinant protein is isolated from the medium or from the cells. Recovery and purification of the protein may not be necessary in some instances where some impurities may be tolerated.

Each of the foregoing steps can be done in a variety of ways. For example, the desired coding sequences may be obtained from genomic fragments and used directly in appropriate hosts. The construction of expression vectors that are operable in a variety of hosts is accomplished using appropriate replicons and control sequences, as set forth above. The control sequences, expression vectors, and transformation methods are dependent on the type of host cell used to express the gene and were discussed in detail earlier. Suitable restriction sites can, if not normally available, be added to the ends of the coding sequence so as to provide an excisable gene to insert into these vectors. A skilled artisan can readily adapt any host/expression system known in the art for use with the nucleic acid molecules of the invention to produce recombinant protein.

G. Methods to Identify Binding Partners

Another embodiment of the present invention provides methods for isolating and identifying binding partners of proteins of the invention. In general, a protein of the invention is mixed with a potential binding partner or an extract or fraction of a cell under conditions that allow the association of potential binding partners with the protein of the invention. After mixing, peptides, polypeptides, proteins or other molecules that have become associated with a protein of the invention are separated from the mixture. The binding partner that bound to the protein of the invention can then be removed and further analyzed. To identify and isolate a binding partner, the entire protein, for instance a protein comprising the entire amino acid sequence of SEQ ID NOS: 2 or 4 can be used. Alternatively, a fragment of the protein can be used.

As used herein, a cellular extract refers to a preparation or fraction which is made from a lysed or disrupted cell. The preferred source of cellular extracts will be cells derived from human prostate tissue or cells, for instance, biopsy tissue or tissue culture cells from subjects with BPH. Alternatively, cellular extracts may be prepared from normal human prostate tissue or available cell lines, particularly prostate derived cell lines.

A variety of methods can be used to obtain an extract of a cell. Cells can be disrupted using either physical or chemical disruption methods. Examples of physical disruption methods include, but are not limited to, sonication and mechanical shearing. Examples of chemical lysis methods include, but are not limited to, detergent lysis and enzyme lysis. A skilled artisan can readily adapt methods for preparing cellular extracts in order to obtain extracts for use in the present methods.

Once an extract of a cell is prepared, the extract is mixed with the protein of the invention under conditions in which association of the protein with the binding partner can occur. A variety of conditions can be used, the most preferred being conditions that closely resemble conditions found in the cytoplasm of a human cell. Features such as osmolarity, pH, temperature, and the concentration of cellular extract used, can be varied to optimize the association of the protein with the binding partner.

After mixing under appropriate conditions, the bound complex is separated from the mixture. A variety of techniques can be utilized to separate the mixture. For example, antibodies specific to a protein of the invention can be used to immunoprecipitate the binding partner complex. Alternatively, standard chemical separation techniques such as chromatography and density/sediment centrifugation can be used. After removal of non-associated cellular constituents found in the extract, the binding partner can be dissociated from the complex using conventional methods. For example, dissociation can be accomplished by altering the salt concentration or pH of the mixture.

To aid in separating associated binding partner pairs from the mixed extract, the protein of the invention can be immobilized on a solid support. For example, the protein can be attached to a nitrocellulose matrix or acrylic beads. Attachment of the protein to a solid support aids in separating peptide/binding partner pairs from other constituents found in the extract. The identified binding partners can be either a single protein or a complex made up of two or more proteins. Alternatively, binding partners may be identified using a Far-Western assay according to the procedures of Takayama et al., Methods Mol Biol 69:171-184, 1997, or Sauder et al., J Gen Virol 77:991-996, 1996, or identified through the use of epitope tagged proteins or GST-fusion proteins.

Alternatively, the nucleic acid molecules of the invention can be used in a yeast two-hybrid system or other in vivo protein-protein detection system. The yeast two-hybrid system has been used to identify other protein partner pairs and can readily be adapted to employ the nucleic acid molecules herein described.

H. Methods to Identify Agents that Modulate the Expression of a Nucleic Acid Encoding the Genes Associated with BPH

Another embodiment of the present invention provides methods for identifying agents that modulate the expression of a nucleic acid encoding a protein of the invention such as a protein having the amino acid sequence of SEQ ID NOS: 2 or 4. Such assays may utilize any available means of monitoring for changes in the expression level of the nucleic acids of the invention. As used herein, an agent is said to modulate the expression of a nucleic acid of the invention if it is capable of up- or down-regulating expression of the nucleic acid in a cell. In one assay format, cell lines that contain reporter gene fusions between the open reading frames defined by nucleotides 57-1592 of SEQ ID NO: 1, nucleotides 553-861 of SEQ ID NO: 3 and/or the 5' and/or 3' regulatory elements and any assayable fusion partner may be prepared. Numerous assayable fusion partners are known and readily available, including the firefly luciferase gene and the gene encoding chloramphenicol acetyltransferase (Alam et al., Anal Biochem 188:245-254, 1990). Cell lines containing the reporter gene fusions are then exposed to the agent to be tested under appropriate conditions and time. Differential expression of the reporter gene between samples exposed to the agent and control samples identifies agents which modulate the expression of a nucleic acid of the invention.
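The comparison between agent-exposed and control samples in such a reporter assay can be sketched as a simple fold-change calculation. This is a minimal illustration under stated assumptions: the replicate readings are invented, and the two-fold cutoff is an arbitrary example threshold that is not specified in this document.

```python
from statistics import mean


def fold_change(treated, control):
    """Ratio of mean reporter signal in agent-exposed samples to mean control signal."""
    control_mean = mean(control)
    if control_mean <= 0:
        raise ValueError("control signal must be positive")
    return mean(treated) / control_mean


def classify(fc, threshold=2.0):
    """Label an agent by fold change; the threshold is an assumed example cutoff."""
    if fc >= threshold:
        return "up-regulated"
    if fc <= 1.0 / threshold:
        return "down-regulated"
    return "no call"


# Hypothetical luciferase readings (arbitrary light units), three replicates each.
control_readings = [100.0, 110.0, 90.0]
treated_readings = [200.0, 220.0, 180.0]

fc = fold_change(treated_readings, control_readings)
call = classify(fc)
```

In practice the cutoff would be set from the variability of the assay, and a statistical test across replicates would normally accompany the ratio.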

Additional assay formats may be used to monitor the ability of the agent to modulate the expression of a nucleic acid encoding a protein of the invention, such as the protein having SEQ ID NOS: 2 or 4. For instance, mRNA expression may be monitored directly by hybridization to the nucleic acids of the invention. Cell lines are exposed to the agent to be tested under appropriate conditions and time, and total RNA or mRNA is isolated by standard procedures such as those disclosed in Sambrook et al. (supra).

The preferred cells will be those derived from human prostate tissue, for instance, prostate biopsy tissue or cultured prostate cells from normal or BPH patients, for example BRF-55T cells (immortalized human prostate cells obtained from an individual with BPH; Iype et al., "Establishment and characterization of immortalized human cell lines from prostatic carcinoma and benign prostatic hyperplasia," Int J Oncol 12:257-263, 1998). Alternatively, other available cells or cell lines may be used.

Probes to detect differences in RNA expression levels between cells exposed to the agent and control cells may be prepared from the nucleic acids of the invention. It is preferable, but not necessary, to design probes which hybridize only with target nucleic acids under conditions of high stringency. Only highly complementary nucleic acid hybrids form under conditions of high stringency. Accordingly, the stringency of the assay conditions determines the amount of complementarity which should exist between two nucleic acid strands in order to form a hybrid. Stringency should be chosen to maximize the difference in stability between the probe:target hybrid and probe:non-target hybrids.

Probes may be designed from the nucleic acids of the invention through methods known in the art. For instance, the G+C content of the probe and the probe length can affect probe binding to its target sequence. Methods to optimize probe specificity are commonly available in Sambrook et al. (supra) or Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Co., 1995.
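The effect of G+C content and probe length on hybrid stability can be illustrated with a rough calculation. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, a common rule of thumb for short oligonucleotides that is not described in this document; the probe sequence is hypothetical and is not taken from SEQ ID NOS: 1 or 3.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA probe."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("probe must be a non-empty string of A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature (deg C) of a short probe by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 20-mer with five each of A, C, G and T.
probe = "ATGCGTACCTGAAGCTTGCA"
probe_gc = gc_fraction(probe)
probe_tm = wallace_tm(probe)
```

Because each G or C contributes twice as much as each A or T in this approximation, raising the G+C content or lengthening the probe raises the estimated melting temperature, which is one reason stringency conditions are adjusted for each probe.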

Hybridization conditions are modified using known methods, such as those described by Sambrook et al. and Ausubel et al., as required for each probe. Hybridization of total cellular RNA or RNA enriched for polyA RNA can be accomplished in any available format. For instance, total cellular RNA or RNA enriched for polyA RNA can be affixed to a solid support and the solid support exposed to at least one probe comprising at least one, or part of one, of the sequences of the invention under conditions in which the probe will specifically hybridize. Alternatively, nucleic acid fragments comprising at least one, or part of one, of the sequences of the invention can be affixed to a solid support, such as a silicon chip or a porous glass wafer. The solid support can then be exposed to total cellular RNA or polyA RNA from a sample under conditions in which the affixed sequences will specifically hybridize. Such solid supports and hybridization methods are widely available, for example, those disclosed by Beattie, WO 95/11755 (1995). By examining the ability of a given probe to specifically hybridize to an RNA sample from an untreated cell population and from a cell population exposed to the agent, agents which up- or down-regulate the expression of a nucleic acid encoding the protein having the sequence of SEQ ID NOS: 2 or 4 are identified.

Hybridization for qualitative and quantitative analysis of mRNAs may also be carried out by using an RNase Protection Assay (i.e., RPA; see Ma et al., Methods 10:273-278, 1996). Briefly, an expression vehicle comprising cDNA encoding the gene product and a phage specific DNA-dependent RNA polymerase promoter (e.g., T7, T3 or SP6 RNA polymerase) is linearized at the 3' end of the cDNA molecule, downstream from the phage promoter, wherein such a linearized molecule is subsequently used as a template for synthesis of a labeled antisense transcript of the cDNA by in vitro transcription. The labeled transcript is then hybridized to a mixture of isolated RNA (i.e., total or fractionated mRNA) by incubation at 45 °C overnight in a buffer comprising 80% formamide, 40 mM Pipes, pH 6.4, 0.4 M NaCl and 1 mM EDTA. The resulting hybrids are then digested in a buffer comprising 40 μg/ml ribonuclease A and 2 μg/ml ribonuclease. After deactivation and extraction of extraneous proteins, the samples are loaded onto urea/polyacrylamide gels for analysis.

In another assay format, cells or cell lines are first identified which express the gene products of the invention physiologically, for example BRF-55T cells (e.g., by using assays of tissue distribution via Northern blot, although RPAs may serve the identical purpose of expression selection). Cells and/or cell lines so identified would be expected to comprise the necessary cellular machinery such that the fidelity of modulation of the transcriptional apparatus is maintained with regard to exogenous contact of agent with appropriate surface transduction mechanisms and/or the cytosolic cascades. Further, such cells or cell lines would be transduced or transfected with an expression vehicle (e.g., a plasmid or viral vector) construct comprising an operable non-translated 5'-promoter-containing end of the structural gene encoding the instant gene products fused to one or more antigenic fragments, which are peculiar to the instant gene products, wherein said fragments are under the transcriptional control of said promoter and are expressed as polypeptides whose molecular weight can be distinguished from the naturally occurring polypeptides or may further comprise an immunologically distinct tag or other detectable marker. Such a process is well known in the art (see Sambrook et al., supra).

Cells or cell lines transduced or transfected as outlined above are then contacted with agents under appropriate conditions; for example, the agent in a pharmaceutically acceptable excipient is contacted with cells in an aqueous physiological buffer, such as phosphate-buffered saline (PBS) at physiological pH, Eagle's balanced salt solution (BSS) at physiological pH, PBS or BSS comprising serum, or conditioned media comprising PBS or BSS and/or serum incubated at 37 °C. Said conditions may be modulated as deemed necessary by one of skill in the art. Subsequent to contacting the cells with the agent, said cells will be disrupted and the polypeptides of the lysate fractionated such that a polypeptide fraction is pooled and contacted with an antibody to be further processed by immunological assay (e.g., ELISA, immunoprecipitation or Western blot). The pool of proteins isolated from the "agent-contacted" sample will be compared with a control sample where only the excipient is contacted with the cells, and an increase or decrease in the immunologically generated signal from the "agent-contacted" sample compared to the control will be used to distinguish the effectiveness of the agent.

I. Methods to Identify Agents that Modulate at Least One Activity of the BPH Associated Proteins

Another embodiment of the present invention provides methods for identifying agents that modulate at least one activity of a protein of the invention such as the protein having the amino acid sequence of SEQ ID NO: 2 or SEQ ID NO: 4. Such methods or assays may utilize any means of monitoring or detecting the desired activity.

In one format, the relative amounts of a protein of the invention between a cell population that has been exposed to the agent to be tested compared to an un-exposed control cell population may be assayed. In this format, probes such as specific antibodies are used to monitor the differential expression of the protein in the different cell populations. Cell lines or populations are exposed to the agent to be tested under appropriate conditions and time. Cellular lysates may be prepared from the exposed cell line or population and a control, unexposed cell line or population. The cellular lysates are then analyzed with the probe.

Antibody probes are prepared by immunizing suitable mammalian hosts in appropriate immunization protocols using the peptides, polypeptides or proteins of the invention if they are of sufficient length, or, if desired, or if required to enhance immunogenicity, conjugated to suitable carriers. Methods for preparing immunogenic conjugates with carriers such as BSA, KLH, or other carrier proteins are well known in the art. In some circumstances, direct conjugation using, for example, carbodiimide reagents may be effective; in other instances linking reagents such as those supplied by Pierce Chemical Co. (Rockford, IL) may be desirable to provide accessibility to the hapten. The hapten peptides can be extended at either the amino or carboxy terminus with a cysteine residue or interspersed with cysteine residues, for example, to facilitate linking to a carrier. Administration of the immunogens is conducted generally by injection over a suitable time period and with use of suitable adjuvants, as is generally understood in the art. During the immunization schedule, titers of antibodies are taken to determine adequacy of antibody formation.

While the polyclonal antisera produced in this way may be satisfactory for some applications, for pharmaceutical compositions, use of monoclonal preparations is preferred. Immortalized cell lines which secrete the desired monoclonal antibodies may be prepared using the standard method of Kohler and Milstein (Nature 256:495-497, 1975) or modifications which effect immortalization of lymphocytes or spleen cells, as is generally known. The immortalized cell lines secreting the desired antibodies are screened by immunoassay in which the antigen is the peptide hapten, polypeptide or protein. When the appropriate immortalized cell culture secreting the desired antibody is identified, the cells can be cultured either in vitro or by production in ascites fluid. The desired monoclonal antibodies are then recovered from the culture supernatant or from the ascites supernatant. Fragments of the monoclonal antibodies or the polyclonal antisera which contain the immunologically significant portion can be used as antagonists, as well as the intact antibodies. Use of immunologically reactive antibody fragments, such as the Fv, Fab, Fab', or F(ab')₂ fragments is often preferable, especially in a therapeutic context, as these fragments are generally less immunogenic than the whole immunoglobulin.

The antibodies or fragments may also be produced, using current technology, by recombinant means. Antibody regions that bind specifically to the desired regions of the protein can also be produced in the context of chimeras with multiple species origin, such as humanized antibodies.

In another assay format, one or more activities of a protein of the invention may be assayed in the presence and absence of an agent to be tested to determine whether the agent is capable of modulating (i.e., enhancing or inhibiting) this activity, for example cell adhesion between cells expressing a protein of the invention. The ability of an agent to modulate cell adhesion may generally be evaluated in vitro by assaying the effect of the agent on one or more of the following: (1) neurite outgrowth, (2) adhesion between endothelial cells, (3) adhesion between epithelial cells (e.g., normal rat kidney cells and/or human skin), (4) adhesion between cancer cells, and/or (5) adhesion between other cell types, wherein the cells under study express a protein of the invention either endogenously or as a result of transfection with a nucleic acid of the invention. In general, an agent is considered to be a modulator of cell adhesion if, within one or more of these representative assays, contact of the test cells with the agent results in a discernible disruption of cell adhesion. For example, see U.S. Pat. No. 6,031,072; U.S. Pat. No. 5,891,706; Urushihara et al., "Transformation of Cell Adhesion Properties by Exogenously Introduced E-cadherin cDNA," Dev Biol 70:206-216, 1979; Nagafuchi et al., Nature 329:341-343, 1987.

Agents that are assayed in the above method can be randomly selected or rationally selected or designed. As used herein, an agent is said to be randomly selected when the agent is chosen randomly without considering the specific sequences involved in the association of a protein of the invention alone or with its associated substrates, binding partners, etc. An example of randomly selected agents is the use of a chemical library, a peptide combinatorial library, or a growth broth of an organism.

As used herein, an agent is said to be rationally selected or designed when the agent is chosen on a nonrandom basis which takes into account the sequence of the target site and/or its conformation in connection with the agent's action. Agents can be rationally selected or rationally designed by utilizing the peptide sequences that make up these sites. For example, a rationally selected peptide agent can be a peptide whose amino acid sequence is identical to or a derivative of any functional consensus site.

The agents of the present invention can be, as examples, peptides, small molecules, vitamin derivatives, as well as carbohydrates. Dominant negative proteins, DNAs encoding these proteins, antibodies to these proteins, peptide fragments of these proteins or mimics of these proteins may be introduced into cells to affect function. "Mimic" as used herein refers to the modification of a region or several regions of a peptide molecule to provide a structure chemically different from the parent peptide but topographically and functionally similar to the parent peptide (see Grant GA in: Molecular Biology and Biotechnology, Meyers (ed.), pp. 659-664, VCH Publishers, New York, 1995). A skilled artisan can readily recognize that there is no limit as to the structural nature of the agents of the present invention.

The peptide agents of the invention can be prepared using standard solid phase (or solution phase) peptide synthesis methods, as is known in the art. In addition, the DNA encoding these peptides may be synthesized using commercially available oligonucleotide synthesis instrumentation and produced recombinantly using standard recombinant production systems. The production using solid phase peptide synthesis is necessitated if non-gene-encoded amino acids are to be included.

Another class of agents of the present invention are antibodies immunoreactive with critical positions of proteins of the invention. Antibody agents are obtained by immunization of suitable mammalian subjects with peptides, containing as antigenic regions, those portions of the protein intended to be targeted by the antibodies.

J. Uses for Agents that Modulate the Expression or at Least One Activity of the Proteins

As provided in the Examples, the proteins and nucleic acids of the invention, such as the proteins having the amino acid sequences of SEQ ID NOS: 2 or 4, are differentially expressed in BPH tissue. Agents that up- or down-regulate or modulate the expression of the protein, or at least one activity of the protein, such as agonists or antagonists, may be used to modulate biological and pathologic processes associated with the protein's function and activity.

As used herein, a subject can be any mammal, so long as the mammal is in need of modulation of a pathological or biological process mediated by a protein of the invention. The term "mammal" is defined as an individual belonging to the class Mammalia. The invention is particularly useful in the treatment of human subjects.

Pathological processes refer to a category of biological processes which produce a deleterious effect. For example, expression of a protein of the invention may be associated with prostate cell growth or hyperplasia. As used herein, an agent is said to modulate a pathological process when the agent reduces the degree or severity of the process. For instance, BPH may be prevented or disease progression modulated by the administration of agents which up- or down-regulate or modulate in some way the expression or at least one activity of a protein of the invention.

The agents of the present invention can be provided alone, or in combination with other agents that modulate a particular pathological process. For example, an agent of the present invention can be administered in combination with other known drugs. As used herein, two agents are said to be administered in combination when the two agents are administered simultaneously or are administered independently in a fashion such that the agents will act at the same time.

The agents of the present invention can be administered via parenteral, subcutaneous, intravenous, intramuscular, intraperitoneal, transdermal, or buccal routes. Alternatively, or concurrently, administration may be by the oral route. The dosage administered will be dependent upon the age, health, and weight of the recipient, kind of concurrent treatment, if any, frequency of treatment, and the nature of the effect desired.

The present invention further provides compositions containing one or more agents which modulate expression or at least one activity of a protein of the invention. While individual needs vary, determination of optimal ranges of effective amounts of each component is within the skill of the art. Typical dosages comprise 0.1 to 100 μg/kg body wt. The preferred dosages comprise 0.1 to 10 μg/kg body wt. The most preferred dosages comprise 0.1 to 1 μg/kg body wt.
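The stated dosage bands scale linearly with body weight, which can be illustrated with a short calculation. This is illustrative arithmetic only: the band labels mirror the wording above, while the function name and the 70 kg example weight are assumptions introduced here.

```python
# Dosage bands in micrograms per kilogram of body weight, as stated in the text.
DOSAGE_BANDS_UG_PER_KG = {
    "typical": (0.1, 100.0),
    "preferred": (0.1, 10.0),
    "most preferred": (0.1, 1.0),
}


def dose_range_ug(body_weight_kg, band="typical"):
    """Return the (low, high) total dose in micrograms for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = DOSAGE_BANDS_UG_PER_KG[band]
    # Round to suppress floating-point noise in the product.
    return (round(low * body_weight_kg, 3), round(high * body_weight_kg, 3))


# Hypothetical 70 kg recipient.
typical_range = dose_range_ug(70.0, "typical")
preferred_range = dose_range_ug(70.0, "preferred")
most_preferred_range = dose_range_ug(70.0, "most preferred")
```

For the assumed 70 kg recipient, the typical band corresponds to a total of 7 to 7000 μg, the preferred band to 7 to 700 μg, and the most preferred band to 7 to 70 μg.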

In addition to the pharmacologically active agent, the compositions of the present invention may contain suitable pharmaceutically acceptable carriers comprising excipients and auxiliaries which facilitate processing of the active compounds into preparations which can be used pharmaceutically for delivery to the site of action. Suitable formulations for parenteral administration include aqueous solutions of the active compounds in water-soluble form, for example, water-soluble salts. In addition, suspensions of the active compounds as appropriate oily injection suspensions may be administered. Suitable lipophilic solvents or vehicles include fatty oils, e.g., sesame oil, or synthetic fatty acid esters, e.g., ethyl oleate or triglycerides. Aqueous injection suspensions may contain substances which increase the viscosity of the suspension, including, for example, sodium carboxymethyl cellulose, sorbitol, and/or dextran. Optionally, the suspension may also contain stabilizers. Liposomes can also be used to encapsulate the agent for delivery into the cell. The pharmaceutical formulation for systemic administration according to the invention may be formulated for enteral, parenteral or topical administration. Indeed, all three types of formulations may be used simultaneously to achieve systemic administration of the active ingredient.

Suitable formulations for oral administration include hard or soft gelatin capsules, pills, tablets, including coated tablets, elixirs, suspensions, syrups or inhalations and controlled release forms thereof.

In practicing the methods of this invention, the compounds of this invention may be used alone or in combination, or in combination with other therapeutic or diagnostic agents. In certain preferred embodiments, the compounds of this invention may be coadministered along with other compounds typically prescribed for these conditions according to generally accepted medical practice. The compounds of this invention can be utilized in vivo, ordinarily in mammals, such as humans, sheep, horses, cattle, pigs, dogs, cats, rats and mice, or in vitro.

K. Transgenic Animals

Transgenic animals containing mutant, knock-out or modified genes corresponding to the cDNA sequence of SEQ ID NOS: 1 or 3, or the open reading frame encoding the polypeptide sequences of SEQ ID NOS: 2 or 4, or fragments thereof having a consecutive sequence of at least about 6, 10, 15, 20, 25, 30, 35 or more amino acid residues, are also included in the invention. Transgenic animals are genetically modified animals into which recombinant, exogenous or cloned genetic material has been experimentally transferred. Such genetic material is often referred to as a "transgene." The nucleic acid sequence of the transgene, in some embodiments, all or a portion of SEQ ID NOS: 1 or 3, may be integrated either at a locus of a genome where that particular nucleic acid sequence is not otherwise normally found or at the normal locus for the transgene. The transgene may consist of nucleic acid sequences derived from the genome of the same species or of a different species than the species of the target animal.

In some embodiments, transgenic animals in which all or a portion of one or more genes comprising SEQ ID NOS: 1 or 3 is deleted may be constructed. In those cases where the gene corresponding to SEQ ID NOS: 1 or 3 contains one or more introns, the entire gene (all exons, introns and the regulatory sequences) may be deleted. Alternatively, less than the entire gene may be deleted. For example, a single exon and/or intron may be deleted, so as to create an animal expressing a modified version of a protein of the invention. The term "germ cell line transgenic animal" refers to a transgenic animal in which the genetic alteration or genetic information was introduced into a germ line cell, thereby conferring the ability of the transgenic animal to transfer the genetic information to offspring. If such offspring in fact possess some or all of that alteration or genetic information, then they too are transgenic animals. The alteration or genetic information may be foreign to the species of animal to which the recipient belongs, foreign only to the particular individual recipient, or may be genetic information already possessed by the recipient. In the last case, the altered or introduced gene may be expressed differently than the native gene. Transgenic animals can be produced by a variety of different methods including transfection, electroporation, microinjection, gene targeting in embryonic stem cells and recombinant viral and retroviral infection (see, e.g., U.S. Patent No. 4,736,866; U.S. Patent No. 5,602,307; Mullins et al., Hypertension 22:630-633, 1993; Brenin et al., Surg Oncol 6:99-110, 1997; Tuan, Recombinant Gene Expression Protocols, Methods in Molecular Biology, Humana Press, 1997).

A number of recombinant or transgenic mice have been produced, including those which express an activated oncogene sequence (U.S. Patent No. 4,736,866); express simian SV40 T-antigen (U.S. Patent No. 5,728,915); lack the expression of interferon regulatory factor 1 (IRF-1) (U.S. Patent No. 5,731,490); exhibit dopaminergic dysfunction (U.S. Patent No. 5,723,719); express at least one human gene which participates in blood pressure control (U.S. Patent No. 5,731,489); display greater similarity to the conditions existing in naturally occurring Alzheimer's disease (U.S. Patent No. 5,720,936); have a reduced capacity to mediate cellular adhesion (U.S. Patent No. 5,602,307); possess a bovine growth hormone gene (Clutter et al., Genetics 143:1753-1760, 1996); or are capable of generating a fully human antibody response (McCarthy, Lancet 349:405, 1997).

While mice and rats remain the animals of choice for most transgenic experimentation, in some instances it is preferable or even necessary to use alternative animal species. Transgenic procedures have been successfully utilized in a variety of non-murine animals, including sheep, goats, pigs, dogs, cats, monkeys, chimpanzees, hamsters, rabbits, cows and guinea pigs (see, e.g., Kim et al., Mol Reprod Dev 46:515-526, 1997; Houdebine, Reprod Nutr Dev 35:609-617, 1995; Petters, Reprod Fertil Dev 6:643-645, 1994; Schnieke et al., Science 278:2130-2133, 1997; and Amoah, J Animal Science 75:578-585, 1997).

The method of introduction of nucleic acid fragments into recombination competent mammalian cells can be by any method which favors co-transformation of multiple nucleic acid molecules. Detailed procedures for producing transgenic animals are readily available to one skilled in the art, including the disclosures in U.S. Patent No. 5,489,743 and U.S. Patent No. 5,602,307.

L. Diagnostic Methods

As the genes and proteins of the invention are differentially expressed in BPH tissue compared to normal prostate tissue, the genes and proteins of the invention may be used to diagnose or monitor BPH, prostate function, or to track disease progression. One means of diagnosing BPH using the nucleic acid molecules or proteins of the invention involves obtaining prostate tissue from living subjects. Obtaining tissue samples from living sources is problematic for tissues such as prostate. However, due to the nature of the treatment paradigms for BPH, biopsy may be necessary. When possible, urine, blood or peripheral lymphocyte samples may be used as the tissue sample in the assay. Commonly, in hyperplastic diseases, genes which are up-regulated in the affected tissue (prostate, in this case) are also up-regulated in lymphocytes, which may be isolated from whole blood.

The use of molecular biological tools has become routine in forensic technology. For example, nucleic acid probes comprising all or at least part of the sequences of SEQ ID NOS: 1 or 3 may be used to determine the expression of a nucleic acid molecule in forensic/pathology specimens. Further, nucleic acid assays may be carried out by any means of conducting a transcriptional profiling analysis. In addition to nucleic acid analysis, forensic methods of the invention may target the proteins of the invention, particularly a protein comprising SEQ ID NOS: 2 or 4, to determine up or down regulation of the genes (Shiverick et al., Biochim Biophys Acta 393:124-133, 1975).

Methods of the invention may involve treatment of tissues with collagenases or other proteases to make the tissue amenable to cell lysis (Semenov et al., Biull Eksp Biol Med 104:113-116, 1987). Further, it is possible to obtain biopsy samples from different regions of the prostate for analysis. Assays to detect nucleic acid or protein molecules of the invention may be in any available format. Typical assays for nucleic acid molecules include hybridization or PCR-based formats. Typical assays for the detection of proteins, polypeptides or peptides of the invention include the use of antibody probes in any available format such as in situ binding assays, etc. (see Harlow & Lane, Antibodies - A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1988). In preferred embodiments, assays are carried out with appropriate controls.

The above methods may also be used in other diagnostic protocols, including protocols and methods to detect disease states in other tissues or organs, for example the tissues in which gene expression is detected.

M. Databases and Computer-Readable Formats

In one application of this embodiment, a nucleotide or protein sequence of the present invention can be recorded on computer readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM, CD-R, CD-R/W or DVD; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable media can be used to create a manufacture comprising a computer readable medium having recorded thereon a nucleotide or protein sequence of the present invention. As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled artisan can readily adopt any of the presently known methods for recording information on a computer readable medium to generate manufactures comprising nucleotide or protein sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide or protein sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide or protein sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring formats (i.e., text file or database) in order to obtain computer readable medium having recorded thereon nucleotide or protein sequence information of the present invention. By providing any of the nucleotide sequences, SEQ ID NOS: 1 or 3, or a fragment thereof; or a nucleotide sequence which encodes the protein or polypeptide sequence of SEQ ID NOS: 2 or 4, or a fragment thereof; or any one of the protein sequences of SEQ ID NOS: 2 or 4, or a fragment thereof; or a nucleotide sequence at least 95% identical to any of the nucleotide sequences of SEQ ID NOS: 1 or 3, or at least 95% identical to any nucleotide sequence which encodes the protein or polypeptide sequence of SEQ ID NOS: 2 or 4 in computer readable form, a skilled artisan can routinely access the sequence information through a user interface for a variety of purposes. Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium. For example, computer readable formats of the nucleic acid sequences of the present invention can be used to implement the BLAST (Altschul et al., J Mol Biol 215:403-410, 1990) and BLAZE (Brutlag et al., Comp Chem 17:203-207, 1993) search algorithms.
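By way of illustration only, recording a sequence as an ASCII text file and reading it back might be sketched as follows. The FASTA-style layout, the function names `format_fasta` and `parse_fasta`, and the toy record are assumptions introduced for this example; they are not taken from the disclosure, and the sequences of SEQ ID NOS: 1-4 are not reproduced here.

```python
def format_fasta(identifier, sequence, width=60):
    """Return a FASTA-style ASCII record (assumed layout, not from the source)."""
    # Strip whitespace and normalise case before wrapping.
    seq = "".join(sequence.split()).upper()
    lines = [f">{identifier}"]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"


def parse_fasta(text):
    """Read FASTA-style text back into a {identifier: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The identifier is the first token after '>'.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


# Toy, hypothetical record used only to show the round trip.
toy_text = format_fasta("toy_record", "acgt acgtac", width=4)
toy_records = parse_fasta(toy_text)
```

A plain-text record of this kind can be written to any of the storage media listed above and loaded into a search program or database application.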

As used herein, "a computer-based system" refers to the hardware means, software means, user interface and data storage means used to analyze the nucleotide or protein sequence information of the present invention. The minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available computer-based systems is suitable for use in the present invention. As stated above, the computer-based systems of the present invention comprise a data storage means having stored therein a nucleotide or protein sequence of the present invention and the necessary hardware means and software means for supporting and implementing a search means. As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of a known sequence which match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means are available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan can readily recognize that any one of the available algorithms or implementing software packages for conducting homology searches can be adapted for use in the present computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide residues. However, it is well recognized that searches for commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
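The relationship between target length and random occurrence can be made concrete with a short sketch. It assumes, purely for illustration, a database of independent, equally likely residues; the function names and the toy database are hypothetical, and the exact-match search shown is a simple stand-in for, not an implementation of, the homology search programs named above.

```python
MIN_NUCLEOTIDE_TARGET = 6  # minimum target length stated in the text


def expected_random_hits(target_len, db_len, alphabet=4):
    """Expected number of chance exact matches of a target of given length.

    Assumes uniformly random, independent residues (an idealisation).
    """
    positions = max(db_len - target_len + 1, 0)
    return positions * alphabet ** (-target_len)


def find_target(target, database):
    """Return (record name, 1-based position) for every exact match."""
    if len(target) < MIN_NUCLEOTIDE_TARGET:
        raise ValueError("nucleotide target must have six or more residues")
    hits = []
    for name, seq in database.items():
        pos = seq.find(target)
        while pos != -1:
            hits.append((name, pos + 1))
            pos = seq.find(target, pos + 1)  # allow overlapping matches
    return hits


# Hypothetical toy database, not sequences of the invention.
toy_db = {"rec1": "ACGTACGTTAGC", "rec2": "TTTT"}
toy_hits = find_target("ACGTAC", toy_db)
```

Under these assumptions a six-nucleotide target is expected to occur by chance hundreds of times in a database of one million nucleotides, whereas a thirty-nucleotide target is expected essentially never, which is why longer targets are more informative.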

As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

Without further description, it is believed that one of ordinary skill in the art can, using the preceding description and the following illustrative examples, make and utilize the compounds of the present invention and practice the claimed methods. The following working examples, therefore, specifically point out preferred embodiments of the present invention, and are not to be construed as limiting in any way the remainder of the disclosure.

EXAMPLES

Example 1

Identification of Differentially Expressed BPH mRNA

Human tissue was obtained from the transitional zone of the prostate in biopsy samples from normal individuals and from patients with BPH or prostate cancer. BPH was defined histologically in all samples. Normal tissue and asymptomatic BPH samples came from individuals who died of trauma, and did not report symptoms. Patients having BPH with symptoms were defined as those with a need for frequent urination; in these patients a radical prostatectomy had been performed. Prostate cancer patients provided age-matched tissue samples for symptomatic BPH patients, but were without symptoms and without cancer in the transitional zone under histological examination.

Microarray sample preparation was conducted with minor modifications, following the protocols set forth in the Affymetrix GeneChip® Expression Analysis Manual. Frozen tissue was ground to a powder using a Spex Certiprep 6800 Freezer Mill. Total RNA was extracted from ground tissue or cultured cells with Trizol® (GibcoBRL) utilizing the manufacturer's protocol. The total RNA yield for each tissue sample was 200-500 μg per 300 mg tissue weight. mRNA was isolated using the Oligotex mRNA Midi kit® (Qiagen) followed by ethanol precipitation. Double stranded cDNA was generated from mRNA using the Superscript Choice® system (GibcoBRL). First strand cDNA synthesis was primed with a T7-(dT24) oligonucleotide. The cDNA was phenol-chloroform extracted and ethanol precipitated to a final concentration of 1 μg/μl. From 2 μg of cDNA, cRNA was synthesized using Ambion's T7 MegaScript in vitro Transcription Kit®. To biotin label the cRNA, nucleotides Bio-11-CTP and Bio-16-UTP (Enzo Diagnostics) were added into the reaction. Following a 37°C incubation for six hours, impurities were removed from the labeled cRNA following the RNeasy Mini kit protocol (Qiagen). cRNA was fragmented (5× fragmentation buffer consisting of 200 mM Tris-acetate (pH 8.1), 500 mM KOAc, 150 mM MgOAc) for thirty-five minutes at 94°C. Following the Affymetrix protocol, 55 μg of fragmented cRNA was hybridized on the Human 35k chip set and the HuGeneFL array for twenty-four hours at 60 rpm in a 45°C hybridization oven. The chips were washed and stained with Streptavidin Phycoerythrin (SAPE) (Molecular Probes) in Affymetrix fluidics stations. To amplify staining, SAPE solution was added twice, with an anti-streptavidin biotinylated antibody (Vector Laboratories) staining step in between. Hybridization to the probe arrays was detected by fluorometric scanning (Hewlett Packard Gene Array Scanner). Data were analyzed using Affymetrix GeneChip version 3.0 and Expression Data Mining Tool (EDMT) software (version 1.0).

Differential expression of genes between the BPH and normal prostate samples (see Figure 1) was determined using the Affymetrix GeneChip® 35K human gene chip set (Hu35KsubB) by the following criteria: (1) for each gene, Affymetrix GeneChip average difference values were determined by standard Affymetrix EDMT software algorithms, which also made "Absent" (=not detected), "Present" (=detected) or "Marginal" (=not clearly Absent or Present) calls for each GeneChip element; (2) all negative values (=Absent) were raised to a floor of +20 (positive 20) so that fold change calculations could be made where values were not already greater than or equal to +20; (3) median levels of expression were compared between the normal control group and the BPH with symptoms disease group to obtain greater than or equal to 3-fold up/down values; (4) the median value for the higher expressing group needed to be greater than or equal to 200 average difference units in order to be considered for statistical significance; (5) genes passing the criteria of #1-4 were analyzed for statistical significance using a two-tailed T-test and deemed statistically significant if p < 0.05.
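Criteria (2) through (5) can be sketched in code as follows. This is an illustrative sketch only and is not the EDMT software: it assumes that the T-test is a pooled-variance Student's test applied to the floored values (the disclosure does not specify either point), it computes the p-value by numerical integration so that only the standard library is needed, and all function names and sample values are hypothetical.

```python
import math
from statistics import mean, median, variance

FLOOR = 20.0        # criterion 2: floor applied to low or negative values
MIN_FOLD = 3.0      # criterion 3: minimum fold change between group medians
MIN_MEDIAN = 200.0  # criterion 4: minimum median of the higher group
ALPHA = 0.05        # criterion 5: significance threshold


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value by Simpson integration of the density on [0, |t|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def student_t_test(a, b):
    """Pooled-variance two-sample test (assumed variant); returns (t, p)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t_stat = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t_stat, two_tailed_p(t_stat, df)


def is_differentially_expressed(normal, bph):
    """Apply criteria 2-5 to average difference values for one gene."""
    normal = [max(v, FLOOR) for v in normal]
    bph = [max(v, FLOOR) for v in bph]
    med_n, med_b = median(normal), median(bph)
    hi, lo = max(med_n, med_b), min(med_n, med_b)
    if hi / lo < MIN_FOLD or hi < MIN_MEDIAN:
        return False
    _, p = student_t_test(normal, bph)
    return p < ALPHA


# Hypothetical average difference values, not data from the disclosure.
toy_normal = [50, 60, 40, 55, 45, -10, 30, 70, 65, 35]
toy_bph = [220, 260, 300, 180, 240, 210, 280, 250, 230, 270]
toy_call = is_differentially_expressed(toy_normal, toy_bph)
```

The floor in criterion (2) keeps the fold change defined when a value is negative or near zero, and criterion (4) prevents a large ratio between two very low medians from being counted as a change.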

Figure 1 shows the expression of AA256268 across 33 human samples, including 10 normals, 5 BPH samples from individuals without symptoms, 8 BPH samples from individuals who were prostate cancer patients, and 10 BPH samples from patients with symptoms. Up-regulation of expression is observed in BPH samples from people with symptoms. The expression data show that up-regulation of AA256268 is diagnostic for BPH in patients with symptoms.

AA233368 exhibited on average a 4-fold change in expression levels in tissue from BPH patients with symptoms (p<0.045). AA233368 is also upregulated in selected BPH patients without symptoms or BPH with cancer (T-test, p>0.05 for all samples within a group). For these two groups, up-regulation of AA233368 could be predictive of the eventual onset of BPH with symptoms.

Example 2

Cloning of a Full Length Human cDNA (AA233368) Corresponding to the Differentially Expressed mRNA Species

The full length cDNA having SEQ ID NOS: 1 and 3 was obtained by the oligo-pulling method. Briefly, a gene-specific oligo was designed based on the sequence of the AA233368 clone. The specific oligo was labeled with biotin and used to hybridize with 2 μg of single-stranded plasmid DNA (cDNA recombinants) from a cDNA library obtained from a human K562 cell line following the procedures of Sambrook et al. (supra). The hybridized cDNAs were separated by streptavidin-conjugated beads and eluted by heating. The eluted cDNA was converted to double strand plasmid DNA and used to transform E. coli cells (DH10B) and the longest cDNA was screened. After positive selection was confirmed by PCR using gene-specific primers, the cDNA clone was subjected to DNA sequencing.

The nucleotide sequence of the full-length human cDNA corresponding to the differentially regulated mRNAs detected above is set forth in SEQ ID NOS: 1 and 3 (identical nucleotide sequences). The cDNA comprises 1957 base pairs, with a first open reading frame at nucleotides 57-1592 encoding a protein of 512 amino acids. The amino acid sequence corresponding to the encoded protein is set forth in SEQ ID NO: 2.

A second open reading frame at nucleotides 553-861 encodes a protein of 103 amino acids, as set forth in SEQ ID NO: 4.
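The stated coordinates and protein lengths are mutually consistent, as the following arithmetic sketch shows. The function name is hypothetical. The `includes_stop` option reflects an assumption, not a statement in the disclosure, that the slightly longer ranges recited in the claims (57-1595 and 553-864) extend each open reading frame by one stop codon.

```python
def encoded_residues(start, end, includes_stop=False):
    """Number of amino acids encoded by 1-based, inclusive ORF coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    codons = length // 3
    # A terminal stop codon occupies three nucleotides but encodes no residue.
    return codons - 1 if includes_stop else codons


first_orf = encoded_residues(57, 1592)   # (1592 - 57 + 1) / 3
second_orf = encoded_residues(553, 861)  # (861 - 553 + 1) / 3
```

The first range spans 1536 nucleotides (512 codons) and the second spans 309 nucleotides (103 codons), matching the lengths given for SEQ ID NOS: 2 and 4.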

Figures 5 and 6 show the results of various hydrophobicity/hydrophilicity analyses of the amino acid sequences of SEQ ID NOS: 2 and 4, respectively.

Analysis of the amino acid sequence of SEQ ID NO: 2 predicts the presence of amidation sites at amino acids 241 and 281, N-glycosylation sites at amino acids 402 and 406, numerous casein kinase II (ck2) phosphorylation sites (amino acids 88, 90, 154, 202, 293, 300, 316, 409, 423 and 425), a glycosaminoglycan attachment site at amino acid 376, N-myristoylation sites at amino acids 15, 101, 150, 221, 311 and 372 and numerous protein kinase C phosphorylation sites (amino acids 55, 94, 116, 154, 190, 261, 264, 300, 316, 320, 395 and 413). An additional motif identified by primary peptide sequence analysis suggests a potential cell attachment sequence starting at amino acid 397, the sequence R-G-D. The RGD peptide is considered a crucial sequence for the interaction of fibronectin with its cell surface receptor (an integrin protein) and has also been found in a number of other proteins with cell adhesion roles, such as fibrinogen, vitronectin and some collagens.
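A primary-sequence motif scan of the kind described can be sketched as follows. The disclosure does not name the analysis program that was used; the regular expressions below follow the widely used PROSITE-style consensus patterns for the RGD cell attachment sequence and for N-glycosylation, and their use here is an assumption. The function name and the toy peptides are hypothetical and are not sequences of the invention.

```python
import re

# Assumed consensus patterns written as regular expressions.
MOTIFS = {
    "cell_attachment_RGD": r"RGD",
    "n_glycosylation": r"N[^P][ST][^P]",  # N-{P}-[ST]-{P}
}


def scan_motif(sequence, pattern):
    """Return 1-based start positions of all (possibly overlapping) matches."""
    # The lookahead lets matches overlap, e.g. two adjacent glycosylation sites.
    return [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", sequence)]


# Toy peptides used only to exercise the scan.
rgd_sites = scan_motif("MKRGDAARGD", MOTIFS["cell_attachment_RGD"])
glyco_sites = scan_motif("AANGSAANPSA", MOTIFS["n_glycosylation"])
```

Short consensus patterns such as these match frequently by chance, so predicted sites of this kind are candidates for experimental confirmation rather than established modifications.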

Analysis of the amino acid sequence of SEQ ID NO: 4 predicts the presence of a casein kinase II (ck2) phosphorylation site at amino acid 33, a glycosaminoglycan attachment site at amino acid 71, N-myristoylation sites at amino acids 14 and 44 and protein kinase C phosphorylation sites at amino acids 29, 89 and 90.

Figure 2 shows relative AA233368 mRNA levels determined via Northern blot in a range of normal tissues. A probe based on SEQ ID NOS: 1 or 3 (probe F13-34/R324-345) was exposed to human mRNA blots (available from ClonTech) overnight at 65°C in Church-Gilbert hybridization buffer, following standard methodology as described by Sambrook et al. (supra). Expression is strongest in the liver, where a single band of about 2.2 kb is observed, but lower levels of expression of the same apparent size message are observed in skeletal muscle and pancreas, with still lower levels in kidney. The lowest levels of expression are found in placenta, lung, heart and brain.

Example 3

Semi-quantitative Analysis of Expression Levels

Figure 3 shows the results of an electronic Northern assay, in which the expression level was measured across a panel of human normal and BPH tissues using the Affymetrix human 42K 5-chip GeneChip set. The order of expression (high to low) across the normal tissues in Figure 3 is: ovary > myometrium > kidney = liver = colon = small intestine = thymus = skeletal muscle > lung = endometrium = breast = cervix = prostate = stomach > spleen = esophagus > brain.

Figure 4 shows the results of the semi-quantitative PCR analysis of expression levels of mRNA corresponding to SEQ ID NOS: 1 and 3 in various disease state and normal human tissue samples. Real time PCR detection was accomplished by the use of the ABI PRISM 7700 Sequence Detection System. The 7700 measures the fluorescence intensity of the sample in each cycle and is able to detect the presence of specific amplicons within the PCR reaction.

Each sample was assayed for the level of GAPDH and mRNA corresponding to SEQ ID NOS: 1 or 3. GAPDH detection was performed using Perkin Elmer part #402869 according to the manufacturer's directions. Primers were designed from SEQ ID NOS: 1 or 3 using Primer Express, a program developed by PE to efficiently find primers and probes for specific sequences. These primers were used in conjunction with SYBR green (Molecular Probes), a nonspecific double stranded DNA dye, to measure the expression level of mRNA corresponding to SEQ ID NOS: 1 or 3, which was normalized to the GAPDH level in each sample. A high level of expression was observed in lung, kidney and spleen tissues, and lower levels were detected in heart, liver, placenta, BRF-55T (immortalized human BPH cell line), glomerular and osteoblast tissues. Expression was not detected in brain or fetal brain tissues or in leukocytes. Because BPH with symptoms is associated with hypertrophic growth of the prostate and/or a low grade chronic inflammation, upregulated expression of AA233368 in disease states involving hypertrophic growth and/or inflammation is also likely of diagnostic value in other tissues.
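The disclosure does not state how the normalization to GAPDH was computed. One common convention, assumed here purely for illustration, is the comparative threshold-cycle calculation, in which the target level relative to the reference is 2 raised to the power (Ct of reference minus Ct of target), with amplification efficiency taken to be approximately 100% for both amplicons. The function names and Ct values below are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Target level relative to the reference gene, assuming 100% efficiency.

    ct_target and ct_reference are threshold cycles from the same sample.
    """
    return 2.0 ** (ct_reference - ct_target)


def fold_change(sample, control):
    """Ratio of normalized expression between two (ct_target, ct_gapdh) pairs."""
    return relative_expression(*sample) / relative_expression(*control)


# Hypothetical threshold cycles, not measurements from the disclosure.
toy_relative = relative_expression(25.0, 20.0)
toy_fold = fold_change((24.0, 20.0), (26.0, 20.0))
```

Because each cycle corresponds to an approximate doubling of product, a target that crosses threshold two cycles earlier than in the control, at equal GAPDH levels, corresponds to roughly four-fold higher expression under these assumptions.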

Example 4

Detection of BPH mRNA for BPH Disease Screening

The expression level of mRNA corresponding to SEQ ID NOS: 1 or 3 is determined in prostate tissue biopsy samples, in urine or in lymphocytes from blood samples, as described in Example 1 and Figure 1, i.e., by screening mRNA samples on a GeneChip, or as described in Example 3, i.e., by semi-quantitative PCR analysis using the fluorescent detection system and GAPDH with the primers described in Figure 4. Alternatively, samples from non-prostate hyperplastic tissues in malignant or non-malignant states may also be analyzed. Tissue samples from patients with BPH with symptoms and from normal subjects are used as positive and negative controls. Using an Affymetrix GeneChip set, a level of expression about 4 times higher than that of the normal control is indicative of BPH or a likelihood of developing BPH. Based on the amplitude of fluorescence measurements in semi-quantitative PCR, an increased level of expression compared to normal prostate tissue is indicative of BPH or a likelihood of developing BPH.

Alternatively, because BPH with symptoms involves a chronic inflammatory response, the up-regulation of AA233368 in prostate could specifically identify cases in which inflammation is occurring, which would have additional diagnostic and prognostic implications and utility. More generally for inflammatory conditions, the tissue distribution results (see Figures 2-4) indicate that the up-regulation of AA233368 is likely to be of diagnostic significance in inflammatory responses in tissues in which AA233368 is found and a potential target for therapeutic intervention.

The finding that AA233368 is upregulated in tissue adjacent to tumors could provide diagnostic and prognostic information for occurrence and localization of primary and metastatic tumors of prostate origin. More generally, because AA233368 is upregulated in non-cancerous BPH tissue adjacent to cancerous tissue, AA233368 is likely to be of diagnostic and prognostic significance in primary tumors and in metastatic tissues in which AA233368 is found (see Figures 2-4). This gene is also, therefore, a potential target for therapeutic intervention in both inflammatory and hyperplastic diseases.

Although the present invention has been described in detail with reference to examples above, it is understood that various modifications can be made without departing from the spirit of the invention. Accordingly, the invention is limited only by the following claims. All cited patents, patent applications and publications referred to in this application are herein incorporated by reference in their entirety.

Claims

What Is Claimed:

1. An isolated nucleic acid molecule selected from the group consisting of: (a) an isolated nucleic acid molecule that encodes the amino acid sequence of SEQ ID NOS: 2 or 4; (b) an isolated nucleic acid molecule that encodes a fragment of at least 6 contiguous amino acids of SEQ ID NOS: 2 or 4; (c) an isolated nucleic acid molecule which hybridizes to the complement of a nucleic acid molecule comprising SEQ ID NOS: 1 or 3; (d) an isolated nucleic acid molecule which hybridizes to the complement of a nucleic acid molecule that encodes the amino acid sequence of SEQ ID NOS: 2 or 4; (e) an isolated nucleic acid molecule that encodes a protein that is expressed in BPH and that exhibits at least about 50% amino acid sequence identity to SEQ ID NOS: 2 or 4; and (f) an isolated nucleic acid molecule that exhibits at least about 50% nucleotide sequence identity to the open reading frames of SEQ ID NOS: 1 or 3.

2. The isolated nucleic acid molecule of claim 1, wherein the nucleic acid molecule comprises nucleotides 57-1592 of SEQ ID NO: 1.

3. The isolated nucleic acid molecule of claim 1, wherein the nucleic acid molecule consists of nucleotides 57-1592 of SEQ ID NO: 1.

4. The isolated nucleic acid molecule of claim 1, wherein the nucleic acid molecule comprises nucleotides 57-1595 of SEQ ID NO: 1.

5. The isolated nucleic acid molecule of claim 1, wherein the nucleic acid molecule comprises nucleotides 553-861 of SEQ ID NO: 3.

6. The isolated nucleic acid molecule of claim 1, wherein the nucleic acid molecule consists of nucleotides 553-861 of SEQ ID NO: 3.

7. The isolated nucleic acid molecule of claim 1, wherein the nucleic acid molecule comprises nucleotides 553-864 of SEQ ID NO: 3.

8. The isolated nucleic acid molecule of any one of claims 1-7, wherein said nucleic acid molecule is operably linked to one or more expression control elements.

9. A vector comprising an isolated nucleic acid molecule of any one of claims 1-7.

10. A host cell transformed to contain the nucleic acid molecule of any one of claims 1-7.

11. A host cell comprising a vector of claim 9.

12. A host cell of claim 11, wherein said host is selected from the group consisting of prokaryotic host cells and eukaryotic host cells.

13. A method for producing a polypeptide, comprising:

(a) culturing a host cell transformed with the nucleic acid molecule of any one of claims 1-7 under conditions in which the protein encoded by said nucleic acid molecule is expressed.

14. The method of claim 13, wherein said host cell is selected from the group consisting of prokaryotic host cells and eukaryotic host cells.

15. An isolated polypeptide produced by the method of claim 13.

16. An isolated polypeptide or protein selected from the group consisting of: (a) an isolated polypeptide comprising the amino acid sequence of SEQ ID NOS: 2 or 4; (b) an isolated polypeptide comprising a fragment of at least 6 contiguous amino acids of SEQ ID NOS: 2 or 4; (c) an isolated polypeptide comprising conservative amino acid substitutions of SEQ ID NOS: 2 or 4; (d) an isolated polypeptide comprising naturally occurring amino acid sequence variants of SEQ ID NOS: 2 or 4; and (e) an isolated polypeptide exhibiting at least about 50% amino acid sequence identity with SEQ ID NOS: 2 or 4.

17. An isolated antibody that specifically binds to a polypeptide of either claim 15 or 16.

18. An antibody of claim 17, wherein said antibody is a monoclonal or a polyclonal antibody.

19. A method of identifying an agent which modulates the expression of a nucleic acid encoding a protein of claim 16, comprising:

(a) exposing cells which express the nucleic acid to the agent; and (b) determining whether the agent modulates the expression of said nucleic acid, thereby identifying an agent which modulates the expression of a nucleic acid encoding the protein.

20. A method of identifying an agent which modulates the level of or at least one activity of a protein of claim 16, comprising:

(a) exposing cells which express the protein to the agent; and

(b) determining whether the agent modulates the level of or at least one activity of said protein, thereby identifying an agent which modulates the level of or at least one activity of the protein.

21. The method of claim 20, wherein the agent modulates one activity of the protein.

22. A method of identifying binding partners for a protein of claim 16, comprising: (a) exposing said protein to a potential binding partner; and (b) determining if the potential binding partner binds to said protein, thereby identifying binding partners for the protein.

23. A method of modulating the expression of a nucleic acid encoding a protein of claim 16, comprising: (a) administering an effective amount of an agent which modulates the expression of a nucleic acid encoding the protein.

24. A method of modulating at least one activity of a protein of claim 16, comprising:

(a) administering an effective amount of an agent which modulates at least one activity of the protein.

25. A non-human transgenic animal modified to contain a nucleic acid molecule of any of claims 1-7.

26. A non-human transgenic animal modified to contain a nucleic acid molecule of any of claims 1-7, wherein all or a portion of SEQ ID NOS: 1 or 3 has been knocked out.

27. A method of diagnosing a disease state in a subject, comprising:

(a) determining the level of expression of a nucleic acid molecule or protein of any one of claims 1-7 or 16.

28. The method of claim 27, wherein the disease state is benign prostatic hyperplasia.

29. The method of claim 28, wherein the disease state is benign prostatic hyperplasia with symptoms.

30. A computer system comprising: (a) a database containing information identifying the expression level in a tissue of a set of nucleic acids comprising at least one nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1 and 3, or a complement thereof; and (b) a user interface to view the information.

31. A computer system comprising:

(a) a database containing information identifying the expression level in a tissue of a set of nucleic acids comprising at least one nucleic acid sequence encoding a protein selected from the group consisting of SEQ ID NOS: 2 and 4, or a complement of said nucleic acid sequence; and (b) a user interface to view the information.

32. A computer system of claims 30 or 31, wherein the database further comprises sequence information for said at least one nucleic acid sequence.

33. A computer system of claims 30 or 31, wherein the database further comprises information identifying the expression level for said at least one nucleic acid sequence in at least one prostate tissue sample.

34. A computer system of claims 30 or 31, wherein the database further comprises information identifying the expression level of said at least one nucleic acid sequence in at least one prostate tissue sample from a patient with benign prostatic hyperplasia.

35. A computer system of claim 34, wherein the patient with benign prostatic hyperplasia has benign prostatic hyperplasia with symptoms.

36. A computer system of claims 30 or 31, further comprising records including descriptive information from an external database, which information correlates said genes to records in the external database.

37. A computer system of claim 36, wherein the external database is GenBank.

38. A method of using a computer system of claim 30 to present information identifying the expression level in a tissue of a set of nucleic acids comprising at least one nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1 and 3, or a complement thereof, comprising comparing the expression level of at least one nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1 and 3, or a complement thereof, in the tissue to the level of expression of the nucleic acid sequence in the database.

39. A method of using a computer system of claim 31 to present information identifying the expression level in a tissue of a set of nucleic acids comprising at least one nucleic acid sequence encoding a protein selected from the group consisting of SEQ ID NOS: 2 and 4, or a complement of said nucleic acid sequence, comprising comparing the expression level of at least one nucleic acid sequence encoding a protein selected from the group consisting of SEQ ID NOS: 2 and 4, or a complement of said nucleic acid sequence, in the tissue to the level of expression of the nucleic acid sequence in the database.

40. A method of claims 38 or 39, wherein the expression levels of at least two nucleic acid sequences are compared.

41. A method of claim 40, wherein the expression levels of at least five nucleic acid sequences are compared.

42. A method of claim 41, wherein the expression levels of all the nucleic acid sequences are compared.

43. A computer system comprising: the nucleotide and/or amino acid sequence of at least one of SEQ ID NOS: 1-4 and a user interface to view the information.
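
By way of illustration only, the database and comparison step recited in claims 30 and 38 might be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation disclosed in this application: the class name `ExpressionDatabase`, the function `compare_to_database`, the tissue label, the use of a mean as the stored reference level, and all numeric values are hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class ExpressionDatabase:
    """Hypothetical in-memory store: sequence id -> tissue label -> levels."""
    records: dict = field(default_factory=dict)

    def add(self, seq_id: str, tissue: str, level: float) -> None:
        # Record one measured expression level for a sequence in a tissue.
        self.records.setdefault(seq_id, {}).setdefault(tissue, []).append(level)

    def reference_level(self, seq_id: str, tissue: str) -> float:
        # Assumption: the stored reference is the mean of recorded levels.
        return mean(self.records[seq_id][tissue])


def compare_to_database(db: ExpressionDatabase, seq_id: str,
                        tissue: str, sample_level: float) -> float:
    """Return the ratio of a sample's level to the stored reference level."""
    return sample_level / db.reference_level(seq_id, tissue)


# Placeholder values, not measured data.
db = ExpressionDatabase()
db.add("SEQ ID NO: 1", "normal prostate", 10.0)
db.add("SEQ ID NO: 1", "normal prostate", 14.0)

ratio = compare_to_database(db, "SEQ ID NO: 1", "normal prostate", 36.0)
print(f"Sample / reference expression ratio: {ratio}")
```

A ratio above 1 in this sketch would indicate higher expression in the sample than in the stored reference. Any threshold for calling a sequence differentially expressed is left to the user and is not specified here.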