US20020015950A1 - Atherosclerosis-associated genes - Google Patents

Atherosclerosis-associated genes

Info

Publication number
US20020015950A1
Authority
US
United States
Prior art keywords
seq
polypeptide
polynucleotide
ligand
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US09/349,015
Inventor
Karen Anne Jones
Wayne Volkmuth
Michael G. Walker
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Incyte Corp
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US09/349,015 (US20020015950A1)
Assigned to INCYTE PHARMACEUTICALS, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: JONES, KAREN ANN, VOLKMUTH, WAYNE, WALKER, MICHAEL G.
Priority to JP2001509468A (JP2003504044A)
Priority to AU58988/00A (AU5898800A)
Priority to EP00944981A (EP1196564A2)
Priority to PCT/US2000/017887 (WO2001004264A2)
Priority to CA002378985A (CA2378985A1)
Priority to US09/642,703 (US6524799B1)
Publication of US20020015950A1
Priority to US10/219,664 (US20030129176A1)
Priority to US10/247,451 (US20040018188A9)
Abandoned (current legal status)

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P43/00 Drugs for specific purposes, not provided for in groups A61P1/00-A61P41/00
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P9/00 Drugs for disorders of the cardiovascular system
    • A61P9/10 Drugs for disorders of the cardiovascular system for treating ischaemic or atherosclerotic diseases, e.g. antianginal drugs, coronary vasodilators, drugs for myocardial infarction, retinopathy, cerebrovascular insufficiency, renal arteriosclerosis
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides

Definitions

  • the invention relates to 34 atherosclerosis-associated polynucleotides identified by their co-expression with known atherosclerosis genes and their corresponding gene products.
  • the invention also relates to the use of these biomolecules in diagnosis, prognosis, prevention, treatment, and evaluation of therapies for diseases associated with atherosclerosis.
  • Atherosclerosis is a disorder characterized by cellular changes in the arterial intima and the formation of arterial plaques containing intra- and extracellular deposits of lipids.
  • the resultant thickening of artery walls and the narrowing of the arterial lumen is the underlying pathologic condition in most cases of coronary artery disease, aortic aneurysm, peripheral vascular disease, and stroke.
  • a cascade of molecules is involved in the cellular morphogenesis, proliferation, and cellular migration which results in an atherosclerotic lesion (Libby et al. (1997) Int. J. Cardiol. 62 (S2): 23-29).
  • a healthy artery consists of three layers.
  • the vascular intima, lined by a monolayer of endothelial cells in contact with the blood, contains smooth muscle cells in extracellular matrix.
  • An internal elastic lamina forms the border between the intima and the tunica media.
  • the media contains layers of smooth muscle cells surrounded by a collagen and elastin-rich extracellular matrix.
  • An external elastic lamina forms the border between the media and the adventitia.
  • the adventitia contains nerves and some mast cells and is the origin of the vasa vasorum which supplies blood to the outer layers of the tunica media.
  • Initiation of an atherosclerotic lesion often occurs following vascular endothelial cell injury as a result of hypertension, diabetes mellitus, hyperlipidemia, fluctuating shear stress, smoking, or transplant rejection.
  • the injury results in the local release of nitric oxide and superoxide anions which react to form cytodestructive peroxynitrite radicals, causing injury to the endothelium and myocytes of the intima.
  • This cellular injury leads to the expression of a variety of molecules that produce local and systemic effects.
  • the initial cellular response to injury includes the release of mediators of inflammation such as cytokines, complement components, prostaglandins, and downstream transcription factors.
  • These molecules promote monocyte infiltration of the vascular intima and lead to the upregulation of adhesion molecules which encourage attachment of the monocytes to the damaged endothelial cells. Additionally, components of the extracellular matrix including collagens, fibrinogens, and matrix Gla protein are induced and provide sites for monocyte attachment. Annexins, plasminogen activator inhibitor 1, and nitric oxide synthases are triggered to counteract these effects.
  • Monocytes that infiltrate the lesion accumulate modified low density lipoprotein lipid through scavenger receptors such as CD36 and macrophage scavenger receptor type I.
  • the accumulation of modified lipids is a factor in atherogenesis and is influenced by modifying enzymes such as lipoprotein lipase, carboxyl ester lipase, serum amyloid P component, LDL-receptor related protein, microsomal triglyceride transfer protein, and serum esterases such as paraoxonase.
  • Lipid metabolism is governed by cholesterol biosynthesis enzymes such as 3-hydroxy-3-methylglutaryl coenzyme A synthase, and products of the apolipoprotein genes. Modified lipid stabilization and accumulation is aided by perilipin and alpha-2-macroglobulin.
  • As monocytes accumulate in the lesion, they can rupture and release free cholesterol, cytokines, and procoagulants into the surrounding environment. This process leads to the development of a plaque which consists of a mass of lipid-engorged monocytes and a lipid-rich necrotic core covered by a fibrous cap. The gradual progression of plaque growth is punctuated by thrombus formation which leads to clinical symptoms such as unstable angina, myocardial infarction, or stroke. Thrombus formation is initiated by episodic plaque rupture which exposes flowing blood to tissue factors, which induce coagulation, and collagen, which activates platelets.
  • cytoskeletal proteins such as calponin, myosin, desmin, and other gene products in the cells.
  • smooth muscle cells lose their contractility and become able to migrate from the media to the intima, to proliferate, and to secrete extracellular matrix components which contribute to arterial intimal thickening.
  • the present invention satisfies a need in the art by providing new compositions that are useful for diagnosis, prognosis, treatment, prevention, and evaluation of therapies for diseases associated with atherosclerosis.
  • the invention provides for a substantially purified polynucleotide comprising a gene that is coexpressed with one or more known atherosclerosis-associated genes in a biological sample.
  • known atherosclerosis-associated genes include and encode human 22 kDa smooth muscle protein, calponin, desmin, smooth muscle myosin heavy chain, alpha tropomyosin, human tissue inhibitor of metalloproteinase 3, human tissue inhibitor of metalloproteinase-2, human tissue inhibitor of metalloproteinase-4, pro alpha 1(I) collagen, collagen alpha-2 type I, collagen alpha-6 type I, procollagen alpha 2(V), collagen VI alpha-2, type VI collagen alpha3, pro-alpha-1 type 3 collagen, pro-alpha-1 (V) collagen, collagenase type IV/matrix metalloproteinase 9/gelatinase B, matrix Gla protein, cathepsin K, fibrinogen beta chain gene, fibrinogen gamma chain gene, pre-pro-
  • the invention also provides a substantially purified polynucleotide comprising a gene that is coexpressed with one or more known atherosclerosis-associated genes in a plurality of samples.
  • the polynucleotide comprises a polynucleotide sequence selected from (a) a polynucleotide encoding a peptide selected from SEQ ID NOs: 1-34; (b) a polynucleotide sequence complementary to the polynucleotide sequence of (a); and (c) a probe comprising at least 18 sequential nucleotides of the polynucleotide sequence of (a) or (b).
  • the invention further provides a pharmaceutical composition comprising a polynucleotide and a pharmaceutical carrier.
  • the invention additionally provides methods for using a polynucleotide.
  • One method uses the polynucleotide to screen a library of molecules or compounds to identify at least one ligand which specifically binds the polynucleotide and comprises combining the polynucleotide with a library of molecules or compounds under conditions to allow specific binding and detecting specific binding, thereby identifying a ligand which specifically binds the polynucleotide.
  • the library is selected from DNA molecules, RNA molecules, PNAs, mimetics, and proteins; and the ligand identified using the method may be used to modulate the activity of the polynucleotide.
  • a second method uses the polynucleotide to purify a ligand which specifically binds the polynucleotide and comprises combining the polynucleotide with a sample under conditions to allow specific binding, detecting specific binding between the polynucleotide and a ligand, recovering the bound polynucleotide, and separating the polynucleotide from the ligand, thereby obtaining purified ligand.
  • a third method uses the polynucleotide to diagnose a disease or condition associated with the altered expression of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a plurality of biological samples and comprises hybridizing a polynucleotide to a sample under conditions to form one or more hybridization complexes, detecting the hybridization complexes, and comparing the levels of the hybridization complexes with the level of hybridization complexes in a non-diseased sample, wherein the altered level of hybridization complexes compared with the level of hybridization complexes of a non-diseased sample indicates the presence of the disease or condition.
  • a fourth method uses the polynucleotide to produce a polypeptide and comprises culturing a host cell containing an expression vector containing the polynucleotide under conditions for expression of the polypeptide and recovering the polypeptide from cell culture.
  • the invention provides a substantially purified polypeptide comprising the product of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a plurality of samples.
  • the invention also provides a polypeptide comprising a polypeptide sequence selected from (a) the polypeptides encoded by SEQ ID NOs: 1-34; and (b) an oligopeptide sequence comprising at least 6 sequential amino acids of the polypeptide sequence of (a).
  • The invention further provides a polypeptide comprising the amino acid sequence of SEQ ID NO: 35.
  • the invention still further provides a pharmaceutical composition comprising a polypeptide and a pharmaceutical carrier.
  • the invention additionally provides methods for using a polypeptide.
  • One method uses the polypeptide to screen a library of molecules or compounds to identify at least one ligand which specifically binds the polypeptide and comprises combining the polypeptide with the library of molecules or compounds under conditions to allow specific binding and detecting specific binding between the polypeptide and ligand, thereby identifying a ligand which specifically binds the polypeptide.
  • the library is selected from DNA molecules, RNA molecules, PNAs, mimetics, polypeptides, agonists, antagonists, and antibodies; and the ligand identified using the method is used to modulate the activity of the polypeptide.
  • a second method uses the polypeptide to purify a ligand from a sample and comprises combining the polypeptide with a sample under conditions to allow specific binding, detecting specific binding between the polypeptide and a ligand, recovering the bound polypeptide, and separating the polypeptide from the ligand, thereby obtaining purified ligand.
  • a third method uses the polypeptide to treat or to prevent a disease associated with the altered expression of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a subject in need and comprises administering to the subject in need the pharmaceutical composition containing the polypeptide in an amount effective for treating or preventing the disease.
  • the invention provides an antibody or Fab comprising an antigen binding site, wherein the antigen binding site specifically binds to the polypeptide.
  • the invention also provides a method for treating or preventing a disease associated with the altered expression of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a subject in need, the method comprising the step of administering to the subject in need the antibody or the Fab in an amount effective for treating or preventing the disease.
  • the invention further provides an immunoconjugate comprising the antigen binding site of the antibody or Fab joined to a therapeutic agent.
  • the invention additionally provides a method for treating or preventing a disease associated with the altered expression of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a subject in need, the method comprising the step of administering to the subject in need the immunoconjugate in an amount effective for treating or preventing the disease.
  • The Sequence Listing provides exemplary atherosclerosis-associated gene sequences including polynucleotide sequences SEQ ID NOs: 1-34 and the polypeptide sequence, SEQ ID NO: 35. Each sequence is identified by a sequence identification number (SEQ ID NO).
  • Atherosclerosis-associated gene refers to a gene or polynucleotide that exhibits a statistically significant coexpression pattern with known atherosclerosis-associated genes which are useful in the diagnosis, treatment, prognosis, or prevention of atherosclerosis.
  • known atherosclerosis-associated gene refers to a sequence which has been previously identified as useful in the diagnosis, treatment, prognosis, or prevention of atherosclerosis and includes polynucleotides encoding human 22 kDa smooth muscle protein, calponin, desmin, smooth muscle myosin heavy chain, alpha tropomyosin, human tissue inhibitor of metalloproteinase 3, human tissue inhibitor of metalloproteinase-2, human tissue inhibitor of metalloproteinase-4, pro alpha 1(I) collagen, collagen alpha-2 type I, collagen alpha-6 type I, procollagen alpha 2(V), collagen VI alpha-2, type VI collagen alpha3, pro-alpha-1 type 3 collagen, pro-alpha-1(V) collagen, collagenase type IV/matrix metalloproteinase 9/gelatinase B, matrix Gla protein, cathepsin K, fibrinogen beta chain gene, fibrinogen gamma chain gene, pre-pro-von Wille
  • Ligand refers to any molecule, agent, or compound which will bind specifically to a complementary site on a polynucleotide or polypeptide. Such ligands stabilize or modulate the activity of polynucleotides or polypeptides of the invention.
  • ligands are libraries of inorganic and organic molecules or compounds such as nucleic acids, proteins, peptides, carbohydrates, fats, and lipids.
  • NSEQ refers generally to a polynucleotide sequence of the present invention, including SEQ ID NO: 1-34.
  • PSEQ refers generally to a polypeptide sequence of the present invention, including SEQ ID NO:35.
  • a “fragment” refers to a nucleic acid sequence that is preferably at least 20 nucleotides in length, more preferably 40 nucleotides, and most preferably 60 nucleotides in length, and encompasses, for example, fragments consisting of 1-50, 51-400, 401-4000, 4001-12,000 nucleotides, and the like, of SEQ ID NO: 1-34.
  • Gene refers to the partial or complete coding sequence of a gene including 5′ or 3′ untranslated regions.
  • the gene may be in a sense or antisense (complementary) orientation.
  • Polynucleotide refers to a nucleic acid, nucleic acid sequence, oligonucleotide, nucleotide, or any fragment thereof. It may be DNA or RNA of genomic or synthetic origin, double-stranded or single-stranded, and combined with carbohydrate, lipids, protein or other materials to perform a particular activity or form a useful composition. “Oligonucleotide” is substantially equivalent to the terms amplimer, primer, oligomer, element, and probe.
  • Polypeptide refers to an amino acid, amino acid sequence, oligopeptide, peptide, or protein or portions thereof whether naturally occurring or synthetic.
  • a “portion” refers to peptide sequence which is preferably at least 5 to about 15 amino acids in length, most preferably at least 10 amino acids long, and which retains some biological or immunological activity of, for example, a portion of SEQ ID NO: 35.
  • sample is used in its broadest sense.
  • a sample containing nucleic acids may comprise a bodily fluid; an extract from a cell, chromosome, organelle, or membrane isolated from a cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; a cell; a tissue; a tissue print; and the like.
  • substantially purified refers to a nucleic acid or an amino acid sequence that is removed from its natural environment and that is isolated or separated, and is at least about 60% free, preferably about 75% free, and most preferably about 90% free, from other components with which it is naturally present.
  • Substrate refers to any rigid or semi-rigid support to which polynucleotides or polypeptides are bound and includes membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, capillaries or other tubing, plates, polymers, and microparticles with a variety of surface forms including wells, trenches, pins, channels and pores.
  • a “variant” refers to a polynucleotide or polypeptide whose sequence diverges from SEQ ID NO: 1-35. Polynucleotide sequence divergence may result from mutational changes such as deletions, additions, and substitutions of one or more nucleotides; it may also be introduced to accommodate differences in codon usage. Each of these types of changes may occur alone, or in combination, one or more times in a given sequence.
  • the present invention encompasses a method for identifying biomolecules that are associated with a specific disease, regulatory pathway, subcellular compartment, cell type, tissue type, or species.
  • the method identifies polynucleotides useful in diagnosis, prognosis, treatment, prevention, and evaluation of therapies for diseases associated with atherosclerosis including, but not limited to, stroke, myocardial infarction, hypertension, transient cerebral ischemia, mesenteric ischemia, coronary artery disease, angina pectoris, peripheral vascular disease, intermittent claudication, and renal artery stenosis.
  • the method entails first identifying polynucleotides that are expressed in a plurality of cDNA libraries.
  • the identified polynucleotides include genes of known or unknown function which are expressed in a specific disease process, subcellular compartment, cell type, tissue type, or species.
  • the expression patterns of the genes with known function are compared with those of genes with unknown function to determine whether a specified coexpression probability threshold is met. Through this comparison, a subset of the polynucleotides having a high coexpression probability with the known genes can be identified.
  • the high coexpression probability correlates with a particular coexpression probability threshold which is preferably less than 0.001 and more preferably less than 0.00001.
  • the polynucleotides originate from cDNA libraries derived from a variety of sources including, but not limited to, eukaryotes such as human, mouse, rat, dog, monkey, plant, and yeast; prokaryotes such as bacteria; and viruses. These polynucleotides can also be selected from a variety of sequence types including, but not limited to, expressed sequence tags (ESTs), assembled polynucleotide sequences, full length gene coding regions, promoters, introns, enhancers, 5′ untranslated regions, and 3′ untranslated regions. To have statistically significant analytical results, the polynucleotides need to be expressed in at least three cDNA libraries.
  • the cDNA libraries used in the coexpression analysis of the present invention can be obtained from adrenal gland, biliary tract, bladder, blood cells, blood vessels, bone marrow, brain, bronchus, cartilage, chromaffin system, colon, connective tissue, cultured cells, embryonic stem cells, endocrine glands, epithelium, esophagus, fetus, ganglia, heart, hypothalamus, immune system, intestine, islets of Langerhans, kidney, larynx, liver, lung, lymph, muscles, neurons, ovary, pancreas, penis, peripheral nervous system, phagocytes, pituitary, placenta, pleura, prostate, salivary glands, seminal vesicles, skeleton, spleen, stomach, testis, thymus, tongue, ureter, uterus, and the like.
  • the number of cDNA libraries selected can range from as few as 3 to greater than 10,000.
  • genes are assembled from related sequences, such as assembled sequence fragments derived from a single transcript. Assembly of the sequences can be performed using sequences of various types including, but not limited to, ESTs, extensions, or shotgun sequences.
  • the polynucleotide sequences are derived from human sequences that have been assembled using the algorithm disclosed in “Database and System for Storing, Comparing and Displaying Related Biomolecular Sequence Information”, Lincoln et al. Serial No: 60/079,469, filed Mar. 26, 1998, incorporated herein by reference.
  • differential expression of the polynucleotides can be evaluated by methods including, but not limited to, differential display by spatial immobilization or by gel electrophoresis, genome mismatch scanning, representational difference analysis, and transcript imaging. Additionally, differential expression can be assessed by microarray technology. These methods may be used alone or in combination.
  • Known atherosclerosis-associated genes are selected based on the use of these genes as diagnostic or prognostic markers or as therapeutic targets.
  • the procedure for identifying novel genes that exhibit a statistically significant coexpression pattern with known atherosclerosis-associated genes is as follows. First, the presence or absence of a gene in a cDNA library is defined: a gene is present in a cDNA library when at least one cDNA fragment corresponding to that gene is detected in a cDNA sample taken from the library, and a gene is absent from a library when no corresponding cDNA fragment is detected in the sample.
  • the significance of gene coexpression is evaluated using a probability method to measure a due-to-chance probability of the coexpression.
  • the probability method can be the Fisher exact test, the chi-squared test, or the kappa test. These tests and examples of their applications are well known in the art and can be found in standard statistics texts (Agresti (1990) Categorical Data Analysis, John Wiley & Sons, New York N.Y.; Rice (1988) Mathematical Statistics and Data Analysis, Duxbury Press, Pacific Grove Calif.).
  • a Bonferroni correction (Rice, supra, p. 384) can also be applied in combination with one of the probability methods for correcting statistical results of one gene versus multiple other genes.
  • the due-to-chance probability is measured by a Fisher exact test, and the threshold of the due-to-chance probability is set preferably to less than 0.001, more preferably to less than 0.00001.
  • occurrence data vectors can be generated as illustrated in Table 1. The presence of a gene occurring at least once in a library is indicated by a one, and its absence from the library, by a zero.

    TABLE 1. Occurrence data for genes A and B

              Library 1   Library 2   Library 3   . . .   Library N
    gene A        1           1           0       . . .       0
    gene B        1           0           1       . . .       0
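The occurrence vectors of Table 1 can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used in the invention: the library contents, the gene names, and the helper `occurrence_vector` are all hypothetical, and each cDNA library is simply modeled as the set of genes detected in it.

```python
# Hypothetical data: each cDNA library is modeled as the set of gene names
# for which at least one cDNA fragment was detected (illustrative only).
libraries = [
    {"geneA", "geneB"},  # Library 1
    {"geneA"},           # Library 2
    {"geneB"},           # Library 3
    set(),               # Library 4 (neither gene detected)
]


def occurrence_vector(gene, libraries):
    """Return 1 for each library in which the gene is detected, else 0."""
    return [1 if gene in library else 0 for library in libraries]


vec_a = occurrence_vector("geneA", libraries)
vec_b = occurrence_vector("geneB", libraries)
print("gene A", vec_a)
print("gene B", vec_b)
```

With four libraries, the two vectors reproduce the first three columns and the final column of Table 1.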
  • Table 2 presents co-occurrence data for gene A and gene B in a total of 30 libraries. Both gene A and gene B occur 10 times in the libraries. Table 2 summarizes and presents: 1) the number of times gene A and B are both present in a library; 2) the number of times gene A and B are both absent in a library; 3) the number of times gene A is present, and gene B is absent; and 4) the number of times gene B is present, and gene A is absent.
  • the upper left entry is the number of times the two genes co-occur in a library, and the middle right entry is the number of times neither gene occurs in a library.
  • the off diagonal entries are the number of times one gene occurs, and the other does not. Both A and B are present eight times and absent 18 times.
  • Gene A is present, and gene B is absent, two times; and gene B is present, and gene A is absent, two times.
  • the probability (“p-value”) that the above association occurs due to chance as calculated using a Fisher exact test is 0.0003. Associations are generally considered significant if a p-value is less than 0.01 (Agresti, supra; Rice, supra).
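The p-value quoted above can be checked directly from the counts given in the text (both genes present in 8 libraries, gene A only in 2, gene B only in 2, neither in 18, for 30 libraries in total). The following is a minimal sketch of a two-sided Fisher exact test using only the Python standard library; the function name and argument names are assumptions made for illustration and are not taken from the source.

```python
from math import comb


def fisher_exact_two_sided(both, a_only, b_only, neither):
    """Two-sided Fisher exact p-value for a 2x2 co-occurrence table."""
    a_total = both + a_only            # libraries containing gene A
    b_total = both + b_only            # libraries containing gene B
    n = both + a_only + b_only + neither
    denom = comb(n, b_total)

    def prob(k):
        # Hypergeometric probability of exactly k co-occurrences
        # when the row and column totals are held fixed.
        return comb(a_total, k) * comb(n - a_total, b_total - k) / denom

    k_min = max(0, a_total + b_total - n)
    k_max = min(a_total, b_total)
    p_observed = prob(both)
    # Sum every table that is no more probable than the observed one.
    return sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_observed * (1 + 1e-9)
    )


p_value = fisher_exact_two_sided(both=8, a_only=2, b_only=2, neither=18)
print(f"{p_value:.4f}")  # prints 0.0003
```

For these counts only the tables with 8, 9, or 10 co-occurrences are as extreme as the observed one, so the one-sided and two-sided results coincide at 8751/30045015, which rounds to the 0.0003 stated in the text.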
  • This method of estimating the probability for coexpression of two genes makes several assumptions. The method assumes that the libraries are independent and are identically sampled. However, in practical situations, the selected cDNA libraries are not entirely independent, because more than one library may be obtained from a single subject or tissue. Nor are they entirely identically sampled, because different numbers of cDNAs may be sequenced from each library. The number of cDNAs sequenced typically ranges from 5,000 to 10,000 cDNAs per library. In addition, because a Fisher exact coexpression probability is calculated for each gene versus 45,233 other assembled genes, a Bonferroni correction for multiple statistical tests is used.
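The effect of a Bonferroni correction over 45,233 comparisons can be sketched as follows. The source does not spell out how the correction is combined with its probability thresholds, so this example is an assumption: it uses the textbook form, in which a raw p-value is multiplied by the number of tests and capped at 1, and it borrows the 0.01 significance level mentioned above. The function name is hypothetical.

```python
N_TESTS = 45_233  # number of other assembled genes each gene is tested against


def bonferroni_adjust(p_value, n_tests=N_TESTS):
    """Return the Bonferroni-adjusted p-value (raw p times tests, capped at 1)."""
    return min(1.0, p_value * n_tests)


# Equivalent view: the raw p-value a single test must beat so that the
# family-wide significance level stays at alpha (0.01 assumed here).
alpha = 0.01
per_test_threshold = alpha / N_TESTS

print(bonferroni_adjust(0.0003))   # capped at 1.0, not significant after correction
print(bonferroni_adjust(1e-8))     # remains well below 0.01
print(per_test_threshold)          # roughly 2.2e-07
```

Under this assumption, a raw p-value of 0.0003 (as in the gene A and gene B example) would not survive correction across all 45,233 comparisons, which illustrates why much smaller raw thresholds are useful in a genome-scale screen.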
  • the present invention identifies 34 novel atherosclerosis-associated polynucleotides that exhibit strong association with genes known to be specific to atherosclerosis.
  • the results presented in Table 4 show that the expression of the 34 novel atherosclerosis-associated polynucleotides has direct or indirect association with the expression of known atherosclerosis-associated genes. Therefore, the novel atherosclerosis-associated polynucleotides can potentially be used in diagnosis, treatment, prognosis, or prevention of diseases associated with atherosclerosis or in the evaluation of therapies for atherosclerosis. Further, the gene products of the 34 novel atherosclerosis-associated polynucleotides are either potential therapeutics or targets of therapeutics against atherosclerosis.
  • the present invention encompasses a polynucleotide sequence comprising the sequence of SEQ ID NO: 1-34. These 34 polynucleotides are shown by the method of the present invention to have strong coexpression association with known atherosclerosis-associated genes and with each other.
  • the invention also encompasses a variant of the polynucleotide sequence, its complement, or 18 consecutive nucleotides of a sequence provided in the above described sequences.
  • Variant polynucleotide sequences typically have at least about 75%, more preferably at least about 85%, and most preferably at least about 95% polynucleotide sequence identity to NSEQ.
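As a rough illustration of the identity tiers above, percent identity can be computed for two sequences that have already been aligned. This sketch assumes equal-length, gap-free, pre-aligned sequences, which is a simplification; real comparisons would normally rely on an alignment tool. The function names and the example sequences are hypothetical, and the cut-offs are treated as exact even though the text says "about".

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def identity_tier(pct):
    """Map a percent identity onto the tiers named in the text."""
    if pct >= 95:
        return "at least about 95% (most preferred)"
    if pct >= 85:
        return "at least about 85% (more preferred)"
    if pct >= 75:
        return "at least about 75%"
    return "below the variant range"


reference = "ACGTACGTAC"  # hypothetical stand-in for part of an NSEQ
candidate = "ACGTACGTAA"  # differs at the final position
pct = percent_identity(reference, candidate)
print(pct, identity_tier(pct))
```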
  • NSEQ or the encoded PSEQ may be used to search against the GenBank primate (pri), rodent (rod), mammalian (mam), vertebrate (vrtp), and eukaryote (eukp) databases, SwissProt, BLOCKS (Bairoch et al. (1997) Nucleic Acids Res. 25:217-221), PFAM, and other databases that contain previously identified and annotated motifs, sequences, and gene functions. Methods that search for primary sequence patterns with secondary structure gap penalties (Smith et al. (1992) Protein Engineering 5:35-51) as well as algorithms such as Basic Local Alignment Search Tool (BLAST; Altschul (1993) J. Mol. Evol.
  • polynucleotide sequences that are capable of hybridizing to SEQ ID NO: 1-34, and fragments thereof under stringent conditions.
  • Stringent conditions can be defined by salt concentration, temperature, and other chemicals and conditions well known in the art. Conditions can be selected, for example, by varying the concentrations of salt in the prehybridization, hybridization, and wash solutions or by varying the hybridization and wash temperatures. With some substrates, the temperature can be decreased by adding formamide to the prehybridization and hybridization solutions.
  • Hybridization can be performed at low stringency, with buffers such as 5xSSC with 1% sodium dodecyl sulfate (SDS) at 60° C., which permits complex formation between two nucleic acid sequences that contain some mismatches. Subsequent washes are performed at higher stringency with buffers such as 0.2xSSC with 0.1% SDS at either 45° C. (medium stringency) or 68° C. (high stringency), to maintain hybridization of only those complexes that contain completely complementary sequences. Background signals can be reduced by the use of detergents such as SDS, Sarcosyl, or Triton X-100, and/or a blocking agent, such as salmon sperm DNA.
  • NSEQ can be extended utilizing a partial nucleotide sequence and employing various PCR-based methods known in the art to detect upstream sequences such as promoters and other regulatory elements.
  • (See, e.g., Dieffenbach and Dveksler (1995) PCR Primer, a Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y.). Additionally, one may use an XL-PCR kit (PE Biosystems, Foster City Calif.), nested primers, and commercially available cDNA libraries (Life Technologies, Rockville Md.) or genomic libraries (Clontech, Palo Alto Calif.) to extend the sequence.
  • primers may be designed using commercially available software, such as OLIGO 4.06 Primer Analysis software (National Biosciences, Plymouth Minn.) or another program, to be about 18 to 30 nucleotides in length, to have a GC content of about 50%, and to form a hybridization complex at temperatures of about 68° C. to 72° C.
  • NSEQ can be cloned in recombinant DNA molecules that direct the expression of PSEQ, or structural or functional portions thereof, in host cells. Due to the inherent degeneracy of the genetic code, other DNA sequences which encode substantially the same or a functionally equivalent amino acid sequence may be produced and used to express the polypeptide encoded by NSEQ.
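  • The degeneracy mentioned above can be illustrated with a small sketch: two different DNA sequences that translate to the same peptide. The codon table is deliberately partial (only the codons used here, taken from the standard genetic code), and the sequences are made-up examples, not sequences from this application.

```python
# Partial standard genetic code: only the codons needed for this example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A",
    "AAA": "K", "AAG": "K",
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon (no stop handling)."""
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


# Two hypothetical coding sequences that differ at two third-codon positions.
seq_a = "ATGGCTAAA"
seq_b = "ATGGCCAAG"
same_peptide = translate(seq_a) == translate(seq_b)
```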
  • the nucleotide sequences of the present invention can be engineered using methods generally known in the art in order to alter the nucleotide sequences for a variety of purposes including, but not limited to, modification of the cloning, processing, and/or expression of the gene product.
  • DNA shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be used to engineer the nucleotide sequences.
  • oligonucleotide-mediated site-directed mutagenesis may be used to introduce mutations that create new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, and so forth.
  • NSEQ may be inserted into an expression vector, i.e., a vector which contains the elements for transcriptional and translational control of the inserted coding sequence in a particular host.
  • elements include regulatory sequences, such as enhancers, constitutive and inducible promoters, and 5′ and 3′ untranslated regions.
  • Methods which are well known to those skilled in the art may be used to construct such expression vectors. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. (See, e.g., Sambrook, supra; and Ausubel, supra).
  • A variety of expression vector/host cell systems may be utilized to express NSEQ. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with baculovirus vectors; plant cell systems transformed with viral or bacterial expression vectors; or animal cell systems. For long term production of recombinant proteins in mammalian systems, stable expression in cell lines is preferred.
  • NSEQ can be transformed into cell lines using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable or visible marker gene on the same or on a separate vector. The invention is not to be limited by the vector or host cell employed.
  • host cells that contain NSEQ and that express PSEQ may be identified by a variety of procedures known to those of skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and protein bioassay or immunoassay techniques which include membrane, solution, or chip based technologies for the detection and/or quantification of nucleic acid or amino acid sequences. Immunological methods for detecting and measuring the expression of PSEQ using either specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and fluorescence activated cell sorting (FACS).
  • Host cells transformed with NSEQ may be cultured under conditions for the expression and recovery of the polypeptide from cell culture.
  • the polypeptide produced by a transgenic cell may be secreted or retained intracellularly depending on the sequence and/or the vector used.
  • expression vectors containing NSEQ may be designed to contain signal sequences which direct secretion of the polypeptide through a prokaryotic or eukaryotic cell membrane.
  • a host cell strain may be chosen for its ability to modulate expression of the inserted sequences or to process the expressed polypeptide in the desired fashion.
  • modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation.
  • Post-translational processing which cleaves a “prepro” form of the polypeptide may also be used to specify protein targeting, folding, and/or activity.
  • Different host cells which have specific cellular machinery and characteristic mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the American Type Culture Collection (ATCC, Bethesda Md.) and may be chosen to ensure the correct modification and processing of the expressed polypeptide.
  • natural, modified, or recombinant nucleic acid sequences are ligated to a heterologous sequence resulting in translation of a fusion polypeptide containing heterologous polypeptide moieties in any of the aforementioned host systems.
  • heterologous polypeptide moieties facilitate purification of fusion polypeptides using commercially available affinity matrices.
  • moieties include, but are not limited to, glutathione S-transferase, maltose binding protein, thioredoxin, calmodulin binding peptide, 6-His, FLAG, c-myc, hemagglutinin, and monoclonal antibody epitopes.
  • nucleic acid sequences are synthesized, in whole or in part, using chemical or enzymatic methods well known in the art (Caruthers et al. (1980) Nucl. Acids Res. Symp. Ser. 215-233; Ausubel, supra).
  • peptide synthesis can be performed using various solid-phase techniques (Roberge et al. (1995) Science 269:202-204), and machines such as the ABI 431A Peptide synthesizer (PE Biosystems) can be used to automate synthesis.
  • the amino acid sequence may be altered during synthesis and/or combined with sequences from other proteins to produce a variant protein.
  • the invention entails a substantially purified polypeptide comprising the amino acid sequence of SEQ ID NO: 35 and fragments thereof.
  • the polynucleotide sequences can be used in diagnosis, prognosis, treatment, prevention, and selection and evaluation of therapies for atherosclerosis including, but not limited to, stroke, myocardial infarction, hypertension, transient cerebral ischemia, mesenteric ischemia, coronary artery disease, angina pectoris, peripheral vascular disease, intermittent claudication, and renal artery stenosis.
  • the polynucleotide sequences may be used to screen a library of molecules for specific binding affinity.
  • the assay can be used to screen a library of DNA molecules, RNA molecules, PNAs, peptides, ribozymes, antibodies, agonists, antagonists, immunoglobulins, inhibitors, proteins including transcription factors, enhancers, repressors, and drugs and the like which regulate the activity of the polynucleotide sequence in the biological system.
  • the assay involves providing a library of molecules, combining the polynucleotide sequence or a fragment thereof with the library of molecules under conditions suitable to allow specific binding, and detecting specific binding to identify at least one molecule which specifically binds the polynucleotide sequence.
  • the polypeptide or a portion thereof may be used to screen libraries of molecules in any of a variety of screening assays.
  • the portion of the polypeptide employed in such screening may be free in solution, affixed to an abiotic or biotic substrate (e.g. borne on a cell surface), or located intracellularly. Specific binding between the polypeptide and molecule may be measured.
  • the assay can be used to screen a library of DNA molecules, RNA molecules, PNAs, peptides, mimetics, ribozymes, antibodies, agonists, antagonists, immunoglobulins, inhibitors, peptides, polypeptides, drugs and the like, which specifically bind the polypeptide.
  • One method for high throughput screening using very small assay volumes and very small amounts of test compound is described in Burbaum et al. U.S. Pat. No. 5,876,946, incorporated herein by reference, which screens large numbers of molecules for enzyme inhibition or receptor binding.
  • the polynucleotide sequences are used for diagnostic purposes to determine the absence, presence, and excess expression of the polypeptide.
  • the polynucleotides may be at least 18 nucleotides long and consist of complementary RNA and DNA molecules, branched nucleic acids, and/or peptide nucleic acids (PNAs).
  • the polynucleotides are used to detect and quantify gene expression in samples in which expression of NSEQ is correlated with disease.
  • NSEQ can be used to detect genetic polymorphisms associated with a disease. These polymorphisms may be detected in the transcript cDNA.
  • the specificity of the probe is determined by whether it is made from a unique region, a regulatory region, or from a conserved motif. Both probe specificity and the stringency of diagnostic hybridization or amplification (maximal, high, intermediate, or low) will determine whether the probe identifies only naturally occurring, exactly complementary sequences, allelic variants, or related sequences. Probes designed to detect related sequences should preferably have at least 75% sequence identity to any of the polynucleotides encoding PSEQ.
  • Methods for producing hybridization probes include the cloning of nucleic acid sequences into vectors for the production of mRNA probes.
  • Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by adding RNA polymerases and labeled nucleotides.
  • Hybridization probes may incorporate nucleotides labeled by a variety of reporter groups including, but not limited to, radionuclides such as 32 P or 35 S, enzymatic labels such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, fluorescent labels, and the like.
  • the labeled polynucleotide sequences may be used in Southern or northern analysis, dot blot, or other membrane-based technologies; in PCR technologies; and in microarrays utilizing samples from subjects to detect altered PSEQ expression.
  • NSEQ can be labeled by standard methods and added to a sample from a subject under conditions for the formation and detection of hybridization complexes. After incubation the sample is washed, and the signal associated with hybrid complex formation is quantitated and compared with a standard value. Standard values are derived from any control sample, typically one that is free of the suspect disease. If the amount of signal in the subject sample is altered in comparison to the standard value, then the presence of altered levels of expression in the sample indicates the presence of the disease. Qualitative and quantitative methods for comparing the hybridization complexes formed in subject samples with previously established standards are well known in the art.
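  • The comparison of a subject signal with a standard value can be sketched as a fold-change test. This is a minimal illustration; the text does not specify how large a difference counts as "altered", so the two-fold threshold and the function name below are assumptions.

```python
def is_expression_altered(subject_signal: float,
                          standard_value: float,
                          fold_threshold: float = 2.0) -> bool:
    """Flag a sample whose signal differs from the standard by a given fold.

    fold_threshold is a hypothetical cutoff: 2.0 flags signals at least
    twice, or at most half, the standard value.
    """
    if subject_signal < 0 or standard_value <= 0:
        raise ValueError("signals must be non-negative and the standard positive")
    ratio = subject_signal / standard_value
    return ratio >= fold_threshold or ratio <= 1.0 / fold_threshold
```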
  • Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the treatment of an individual subject. Once the presence of disease is established and a treatment protocol is initiated, hybridization or amplification assays can be repeated on a regular basis to determine if the level of expression in the subject begins to approximate that which is observed in a healthy subject. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to many years.
  • the polynucleotides may be used for the diagnosis of a variety of diseases associated with atherosclerosis. These include, but are not limited to, stroke, myocardial infarction, hypertension, transient cerebral ischemia, mesenteric ischemia, coronary artery disease, angina pectoris, peripheral vascular disease, intermittent claudication, and renal artery stenosis.
  • the polynucleotides may also be used as targets in a microarray.
  • the microarray can be used to monitor the expression patterns of large numbers of genes simultaneously and to identify splice variants, mutations, and polymorphisms. Information derived from analyses of the expression patterns may be used to determine gene function, to understand the genetic basis of a disease, to diagnose a disease, and to develop and monitor the activities of therapeutic agents used to treat a disease.
  • Microarrays may also be used to detect genetic diversity, single nucleotide polymorphisms which may characterize a particular population, at the genome level.
  • polynucleotides may be used to generate hybridization probes useful in mapping the naturally occurring genomic sequence.
  • Fluorescent in situ hybridization FISH
  • antibodies or Fabs comprising an antigen binding site that specifically binds PSEQ may be used for the diagnosis of diseases characterized by the over- or under-expression of PSEQ.
  • a variety of protocols for measuring PSEQ including ELISAs, RIAs, and FACS, are well known in the art and provide a basis for diagnosing altered or abnormal levels of expression.
  • Standard values for PSEQ expression are established by combining samples taken from healthy subjects, preferably human, with antibody to PSEQ under conditions for complex formation. The amount of complex formation may be quantitated by various methods, preferably by photometric means. Quantities of PSEQ expressed in disease samples are compared with standard values. Deviation between standard and subject values establishes the parameters for diagnosing or monitoring disease.
  • the anti-PSEQ antibodies of the present invention can be used for treatment or monitoring therapeutic treatment for atherosclerosis.
  • the NSEQ, or its complement may be used therapeutically for the purpose of expressing mRNA and polypeptide, or conversely to block transcription or translation of the mRNA.
  • Expression vectors may be constructed using elements from retroviruses, adenoviruses, herpes or vaccinia viruses, or bacterial plasmids, and the like. These vectors may be used for delivery of nucleotide sequences to a particular target organ, tissue, or cell population. Methods well known to those skilled in the art can be used to construct vectors to express nucleic acid sequences or their complements. (See, e.g., Maulik et al.
  • NSEQ may be used for somatic cell or stem cell gene therapy.
  • Vectors may be introduced in vivo, in vitro, and ex vivo.
  • vectors are introduced into stem cells taken from the subject, and the resulting transgenic cells are clonally propagated for autologous transplant back into that same subject.
  • Delivery of NSEQ by transfection, liposome injections, or polycationic amino polymers may be achieved using methods which are well known in the art. (See, e.g., Goldman et al.
  • NSEQ expression may be inactivated using homologous recombination methods which insert an inactive gene sequence into the coding region or other targeted region of NSEQ. (See, e.g. Thomas et al. (1987) Cell 51: 503-512.)
  • Vectors containing NSEQ can be transformed into a cell or tissue to express a missing polypeptide or to replace a nonfunctional polypeptide.
  • a vector constructed to express the complement of NSEQ can be transformed into a cell to downregulate the overexpression of PSEQ.
  • Complementary or antisense sequences may consist of an oligonucleotide derived from the transcription initiation site; nucleotides between about positions -10 and +10 from the ATG are preferred.
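  • Selecting an antisense oligonucleotide around the ATG can be sketched as taking the window from about position -10 to +10 and reverse-complementing it. This is an illustrative sketch; the numbering convention (+1 is the A of the ATG, with no position 0), the function name, and the example sequence are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_around_atg(sense_dna: str, atg_index: int, flank: int = 10) -> str:
    """Return the reverse complement of the window -flank..+flank around the ATG.

    atg_index is the 0-based index of the A of the ATG in sense_dna.
    Assumed convention: +1 is that A and there is no position 0.
    """
    seq = sense_dna.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG")
    if atg_index < flank or atg_index + flank > len(seq):
        raise ValueError("sequence too short for the requested flank")
    window = seq[atg_index - flank:atg_index + flank]
    return window.translate(COMPLEMENT)[::-1]
```

  • For the made-up sense sequence "CCCCCCCCCCATGAAAAAAAGGG" with the ATG at index 10, the function returns the 20-nucleotide antisense oligonucleotide "TTTTTTTCATGGGGGGGGGG".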
  • inhibition can be achieved using triple helix base-pairing methodology. Triple helix pairing is useful because it causes inhibition of the ability of the double helix to open sufficiently for the binding of polymerases, transcription factors, or regulatory molecules.
  • Ribozymes, enzymatic RNA molecules, may also be used to catalyze the cleavage of mRNA and decrease the levels of particular mRNAs, such as those comprising the polynucleotide sequences of the invention.
  • Ribozymes may cleave mRNA at specific cleavage sites.
  • ribozymes may cleave mRNAs at locations dictated by flanking regions that form complementary base pairs with the target mRNA. The construction and production of ribozymes is well known in the art and is described in Meyers (supra).
  • RNA molecules may be modified to increase intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5′ and/or 3′ ends of the molecule, or the use of phosphorothioate or 2′ O-methyl rather than phosphodiester linkages within the backbone of the molecule.
  • nontraditional bases such as inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases, may be included.
  • an antagonist, or an antibody that binds specifically to PSEQ may be administered to a subject to treat or prevent atherosclerosis.
  • the antagonist, antibody, or fragment may be used directly to inhibit the activity of the polypeptide or indirectly to deliver a therapeutic agent to cells or tissues which express the PSEQ.
  • An immunoconjugate comprising a PSEQ binding site of the antibody or the antagonist and a therapeutic agent may be administered to a subject in need to treat or prevent disease.
  • the therapeutic agent may be a cytotoxic agent selected from a group including, but not limited to, abrin, ricin, doxorubicin, daunorubicin, taxol, ethidium bromide, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, dihydroxy anthracin dione, actinomycin D, diphtheria toxin, Pseudomonas exotoxin A and 40, radioisotopes, and glucocorticoid.
  • Antibodies to PSEQ may be generated using methods that are well known in the art. Such antibodies may include, but are not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments produced by a Fab expression library. Neutralizing antibodies, such as those which inhibit dimer formation, are especially preferred for therapeutic use. Monoclonal antibodies to PSEQ may be prepared using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These include, but are not limited to, the hybridoma, the human B-cell hybridoma, and the EBV-hybridoma techniques. In addition, techniques developed for the production of chimeric antibodies can be used.
  • an agonist of PSEQ may be administered to a subject to treat or prevent a disease associated with decreased expression, longevity or activity of PSEQ.
  • An additional aspect of the invention relates to the administration of a pharmaceutical or sterile composition, in conjunction with a pharmaceutically acceptable carrier, for any of the therapeutic applications discussed above.
  • Such pharmaceutical compositions may consist of PSEQ or antibodies, mimetics, agonists, antagonists, or inhibitors of the polypeptide.
  • the compositions may be administered alone or in combination with at least one other agent, such as a stabilizing compound, which may be administered in any sterile, biocompatible pharmaceutical carrier including, but not limited to, saline, buffered saline, dextrose, and water.
  • the compositions may be administered to a subject alone or in combination with other agents, drugs, or hormones.
  • compositions utilized in this invention may be administered by any number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means.
  • these pharmaceutical compositions may contain pharmaceutically-acceptable carriers comprising excipients and auxiliaries which facilitate processing of the active compounds into preparations which can be used pharmaceutically. Further details on techniques for formulation and administration may be found in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing Co., Easton Pa.).
  • the therapeutically effective dose can be estimated initially either in cell culture assays or in animal models such as mice, rats, rabbits, dogs, or pigs.
  • An animal model may also be used to determine the concentration range and route of administration. Such information can then be used to determine useful doses and routes for administration in humans.
  • a therapeutically effective dose refers to that amount of active ingredient which ameliorates the symptoms or condition.
  • Therapeutic efficacy and toxicity may be determined by standard pharmaceutical procedures in cell cultures or with experimental animals, such as by calculating and contrasting the ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% of the population) statistics. Any of the therapeutic compositions described above may be applied to any subject in need of such therapy, including, but not limited to, mammals such as dogs, cats, cows, horses, rabbits, monkeys, and most preferably, humans.
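  • One common way to contrast the two statistics is the therapeutic index, the ratio of LD50 to ED50. The text above does not name a specific formula, so the ratio below is an assumption, as are the function name and the example doses.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return LD50 / ED50; larger values indicate a wider safety margin.

    Both doses must be positive and expressed in the same units.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
example_index = therapeutic_index(ld50=100.0, ed50=5.0)
```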
  • the cDNA library SMCCNOS01 was selected as an example to demonstrate the construction of cDNA libraries from which the polynucleotides associated with known atherosclerosis-associated genes were derived.
  • the SMCCNOS01 subtracted coronary artery smooth muscle cell library was constructed using 7.56 × 10^6 clones from the SMCCNOT02 library and was subjected to two rounds of subtraction hybridization for 48 hours with 6.12 × 10^6 clones from SMCCNOT01.
  • the SMCCNOT02 library was constructed using RNA isolated from coronary artery smooth muscle cells removed from a 3-year-old Caucasian male. The cells were treated for 20 hours with TNFα and IL-1β at 10 ng/ml each.
  • the SMCCNOT01 was constructed using RNA isolated from untreated coronary artery smooth muscle cells from the same donor. Subtractive hybridization conditions were based on the methodologies of Swaroop et al. (1991; Nucleic Acids Res. 19:1954) and Bonaldo et al. (1996; Genome Research 6:791).
  • the mRNA-Oligo(dT)25 bound streptavidin particles were separated from the supernatant, washed twice with hybridization buffer I (0.15 M NaCl, 0.01 M Tris-HCl pH 8.0, 1 mM EDTA, 0.1% lauryl sarcosinate) using magnetic separation at each step to remove the supernatant from the particles.
  • RNA was used for cDNA synthesis and construction of the cDNA library according to the recommended protocols in the SUPERSCRIPT plasmid system (Life Technologies).
  • the cDNAs were fractionated on a SEPHAROSE CL4B column (Amersham Pharmacia Biotech, Piscataway N.J.), and those cDNAs exceeding 400 bp were ligated into pINCY plasmid (Incyte Pharmaceuticals, Palo Alto, Calif.).
  • Recombinant plasmids were transformed into DH5α competent cells or ELECTROMAX cells (Life Technologies).
  • Plasmid DNA was released from the cells and purified using the REAL Prep 96 plasmid kit (Qiagen, Valencia Calif.). The recommended protocol was employed except for the following changes: 1) the bacteria were cultured in 1 ml of sterile Terrific Broth (Life Technologies) with carbenicillin at 25 mg/L and glycerol at 0.4%; 2) after inoculation, the cells were cultured for 19 hours and then lysed with 0.3 ml of lysis buffer; and 3) following isopropanol precipitation, the plasmid DNA pellet was resuspended in 0.1 ml distilled water, and samples were transferred to a 96-well block for storage at 4° C.
  • cDNAs were prepared using a MICROLAB 2200 System (Hamilton, Reno Nev.) in combination with the DNA ENGINE thermal cycler (MJ Research, Watertown Mass.). cDNAs were sequenced by the method of Sanger et al. (1975, J. Mol. Biol. 94:441f) using ABI PRISM 377 (PE Biosystems) or MEGABACE 1000 sequencing systems (Amersham Pharmacia Biotech).
  • sequences used for co-expression analysis were assembled from EST sequences, 5′ and 3′ longread sequences, and full length coding sequences. Selected assembled sequences were expressed in at least three cDNA libraries.
  • Bins were annotated by screening the consensus sequence in each bin against public databases, such as GBpri and GenPept from NCBI.
  • the annotation process involved a FASTn screen against the GBpri database in GenBank. Those hits with a percent identity of greater than or equal to 75% and an alignment length of greater than or equal to 100 base pairs were recorded as homolog hits.
  • the residual unannotated sequences were screened by FASTx against GenPept. Those hits with an E value of less than or equal to 10^-8 were recorded as homolog hits.
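  • The two annotation cutoffs described above (FASTn: at least 75% identity over at least 100 base pairs; FASTx: E value of at most 10^-8) can be sketched as simple predicates. The Hit record and its field names are hypothetical and do not reflect the actual FASTn or FASTx output format.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one database hit."""
    percent_identity: float   # 0 to 100
    alignment_length: int     # base pairs
    e_value: float


def is_fastn_homolog(hit: Hit) -> bool:
    """Nucleotide screen against GBpri: >= 75% identity and >= 100 bp."""
    return hit.percent_identity >= 75.0 and hit.alignment_length >= 100


def is_fastx_homolog(hit: Hit) -> bool:
    """Translated screen against GenPept: E value <= 1e-8."""
    return hit.e_value <= 1e-8
```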
  • Sequences were then reclustered using BLASTn and Cross-Match, a program for rapid amino acid and nucleic acid sequence comparison and database search (Green, supra), sequentially. Any BLAST alignment between a sequence and a consensus sequence with a score greater than 150 was realigned using cross-match. The sequence was added to the bin whose consensus sequence gave the highest Smith-Waterman score (Smith et al. (1992) Protein Engineering 5:35-51) amongst local alignments with at least 82% identity. Non-matching sequences were moved into new bins, and assembly processes were repeated.
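  • The bin assignment rule above can be sketched as follows: keep alignments with a BLAST score above 150 and at least 82% identity, then pick the bin with the highest Smith-Waterman score; a sequence with no qualifying alignment goes to a new bin. The Alignment record, its fields, and the function name are hypothetical, and the scores are assumed to have been computed already by BLASTn and Cross-Match.

```python
from typing import Iterable, NamedTuple, Optional


class Alignment(NamedTuple):
    """Hypothetical summary of one sequence-to-consensus alignment."""
    bin_id: str
    blast_score: float
    sw_score: float           # Smith-Waterman score from realignment
    percent_identity: float   # 0 to 100


def choose_bin(alignments: Iterable[Alignment]) -> Optional[str]:
    """Return the best bin id, or None if the sequence needs a new bin."""
    candidates = [
        a for a in alignments
        if a.blast_score > 150 and a.percent_identity >= 82.0
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda a: a.sw_score).bin_id
```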
  • Sixty-six known atherosclerosis-associated genes were selected to identify novel genes that are closely associated with atherosclerosis. The known atherosclerosis-associated genes which were examined in this analysis and brief descriptions of their functions are listed in Table 3.
  • TABLE 3. Descriptions of Known Atherosclerosis-Associated Genes (GENE: DESCRIPTION AND REFERENCES)
  • Human 22 kDa smooth muscle protein (SM22): Smooth muscle cell-specific gene which is down-regulated during smooth muscle cell dedifferentiation as part of atherogenic process (Sobue et al. (1998) Horm. Res. 50 Suppl. 2:15-24; Sobue et al. (1999) Mol. Cell. Biochem.
  • calponin (CNN1): Calponin is smooth muscle-specific and may mediate smooth muscle contractility through its binding of the amino-terminal end of the myosin regulatory light chain.
  • a feature of atherosclerosis (Szymanski et al. (1999) Biochemistry 38(12): 3778-84)
  • desmin (DES): Contractile component of myofibrils in differentiated smooth muscle cells. Regarded as a marker for smooth muscle cells (Shi et al. (1997) Circulation 95(12): 2684-93)
  • smooth muscle Contractile component of myofibrils in differentiated smooth muscle cells.
  • (TIMP3) PDGF and TGFbeta augment TIMP3 expression.
  • Human tissue inhibitor of metalloproteinase-2 (TIMP-2): TIMPs control the activity of matrix metalloproteinases and are important in local matrix remodeling of vasculature. TIMP2 is greatly increased during neointima formation in organ cultures of human saphenous vein (Kranzhofer et al. (1999) Arterioscler. Thromb. Vasc. Biol. 19(2): 255-65)
  • Human tissue inhibitor of metalloproteinase-4 (TIMP4): TIMPs control the activity of matrix metalloproteinases and are important in local matrix remodeling of vasculature (Greene et al. (1996) J. Biol. Chem. 271(48): 30375-80)
  • pro alpha 1(I) collagen: Member of family of fibrous structural proteins. Most abundant structural component of the extracellular matrix.
  • Collagens are important in atherosclerosis for promoting platelet aggregation and for providing sites for platelet adhesion to the vessel wall (Wen et al. (1999) Arterioscler. Thromb. Vasc. Biol.
  • (MMP9) MMP9 localized to lesional macrophages, along with MMP-1, MMP-2, MMP-3.
  • Rabbit aortic macrophage foam cells express immunoreactive MMP-9 (Moreau et al. (1999) Circulation 99(3): 420-426; Zaltsman et al.
  • CSK Cockayne syndrome
  • fibrinogen beta chain gene (FGG): Participant in adhesion and aggregation of platelets which occurs through binding of platelet receptors. FGG carries the main binding site for the platelet receptor binding. Mutations in FGG associated with clotting defects and thrombotic tendency. Fibrin deposition is an integral part of advanced atherosclerotic lesion development (Sueishi et al. (1998) Semin. Thromb. Hemost. 24 (3): 255-60; Cote et al. (1998) Blood 92(7): 2195-2212)
  • pre-pro-von Willebrand factor: Blood glycoprotein involved in normal hemostasis. Mediates adhesion of platelets to sites of vascular damage.
  • VWF blood factor
  • Increased levels of VWF are found in atherosclerosis and in several of its major risk factors, including hypercholesterolemia, diabetes, obesity, hypertension. Levels serve as a predictor of adverse clinical outcome following vascular surgery, possibly as an indicator of thrombus formation (Sadler, J. E. (1998) Annu Rev Biochem 67: 395-424; Blann et al. (1994) Eur. J. Vasc. Surg. 8(1): 10-15; Kessler et al. (1998) Diabetes Metab. 24(4): 327-36; Folsom et al.
  • Circulation 96(4): 1102-1108
  • coagulation factor II/prothrombin (F2): Central role in blood hemostasis by regulating platelet aggregation and blood coagulation. Converts fibrinogen to fibrin in the final stage of clotting cascade. Promotes cellular chemotaxis and proliferation, extracellular matrix turnover and release of inflammatory cytokines (Goldsack et al. (1998) Int. J. Biochem. Cell. Biol.
  • Deficiency in ATIII causes recurrent venous thrombosis and pulmonary embolism and can be inherited in autosomal dominant fashion (Hultin et al. (1988) Thromb. Haemost. 59(3): 468-73; Lane et al. (1996) Blood Rev. 10(2): 59-74)
  • plasminogen activator inhibitor-1 (PAI-1): Major physiological inhibitor of fibrinolysis. Plasma levels correlate with incidence of MI and venous thrombosis. Both adipocytes and endothelial cells produced PAI, possibly under the control of PPARG, as demonstrated using recombinant PPARG expression constructs in endothelial cell lines. Increased expression of PAI observed in coronary heart disease.
  • APOAII may have ability to convert HDL from an anti- to a pro-inflammatory particle, with paraoxonase having a role in this transformation process.
  • Plasma APOAII levels significantly associated with plasma free fatty acid levels.
  • Transgenic mice expressing varying levels of APOAII show more atherosclerotic lesions than wild-type (wt) mice when fed an atherogenic diet. Possible interaction between diet/genotype and atherogenic potential (Escola-Gil et al. (1998) J. Lipid Res. 39(2): 457-462; Warden et al. (1993) Proc. Natl. Acad. Sci.
  • APOB APOB100 underlie familial defective apolipoprotein B-100 in which patients suffer from premature atherosclerosis. Mutations result in defect in binding of LDL to LDL receptor, and accumulation of plasma LDL.
  • High-expressing APOB transgenic mice exhibit elevated VLDL-LDL cholesterol and atherogenic lesions (Callow et al. (1995) J Clin Invest 96(3): 1639-1646; Brasaemle et al. (1997) J. Biol. Chem.
  • apolipoprotein apoCII (APOC2): Role in lipoprotein metabolism. Cofactor in the activity of lipoprotein lipase, the enzyme that hydrolyzes triglycerides in plasma and transfers the fatty acids to tissues. Mutations in APOC2 responsible for hyperlipoproteinemia 1B, similar to lipoprotein lipase deficiency (Cox et al. (1978) N. Engl. J. Med. 299(26): 1421-1424; Arimoto et al. (1998) J. Lipid Res.
  • APOC3 triglyceride CIII (APOC3) rich particles.
  • SstI RFLP in apoCIII is associated with plasma triglyceride and apoCIII levels and hyperlipidemic phenotypes (Henderson et al. (1987) Hum. Genet. 75(1): 62-65)
  • apolipoprotein apoC-IV (APOC4): APOC4 is a lipid-binding protein that has the potential to alter lipid metabolism. Human APOC4 transgenic mice are hypertriglyceridaemic compared to normal controls (Allan et al. (1996) J. Lipid Res. 37(7): 1510-1518)
  • macrophage scavenger receptor type I (MSR1): Mediates binding, internalisation and processing of negatively-charged macromolecules. Implicated in the pathological deposition of cholesterol in arterial walls during atherogenesis (Han et al. (1998) Hum. Mol. Genet. 7(6): 1039-1046)
  • Human antigen Acts as a scavenger receptor for oxidised LDL.
  • carboxyl ester lipase gene (CEL): CEL gene expression increases in presence of oxidised and native LDL in vitro. It is expressed in the vessel wall and in aortic extracts - may interact with cholesterol to modulate progression of atherosclerosis (Li et al. (1998) Biochem. J. 329(Pt 3): 675-679)
  • paraoxonase 1 (PON1): Serum esterase exclusively associated with high-density lipoproteins; it might confer protection against coronary artery disease by destroying pro-inflammatory oxidized lipids in oxidized low-density lipoproteins. PON1 gln192-to-arg polymorphism associated with CAD.
  • HMG CoA coenzyme A reductase is regulated by oxysterols via sterol-regulatory element in the promoter, synthase as is found in APOE.
  • Target for cholesterol-lowering therapies pravastatin, (HMGCR) “statins” (Bocan et al.
  • VLDLR VLDLR
  • perilipin Lipid storage droplets of steroidogenic cells are surrounded by perilipins, a family of phosphorylated proteins encoded by a single gene, detected in adipocytes and steroidogenic cells.
  • EDN1 is normally expressed exclusively in endothelial cells.
  • EDN1 expression is enhanced and can be found in the tunica media and vascular smooth muscle cells.
  • Analysis of recombinant EDN1 expression in vitro suggests it influences vascular smooth muscle cell proliferation.
  • Potent vasoconstriction properties (Unoki et al. (1999) Cell Tissue Res. 295(1): 89-99; Rossi et al. (1999) Circulation 99(9): 1147-1155; Yoshizumi et al. (1998) Br. J. Pharmacol.
  • endothelin receptor A (EDNRA): Mediates action of endothelin1 on vascular smooth muscle migration, proliferation and monocyte/endothelial cell interaction during initiation and progression of atherosclerotic lesion development (Kohno et al. (1998) J. Cardiovasc. Pharmacol. 31 Suppl 1: S84-9; Alberts et al. (1994) J. Biol Chem 269(13): 10112-10118)
  • interleukin 6 (IL6): Inflammatory cytokine present in arterial atherosclerotic wall which is upregulated by platelets to stimulate smooth muscle cell growth.
  • IL6 Increased expression of IL6 in atherosclerotic aortas of APOE knockout vs aortas from age-matched controls.
  • Secretion levels of IL6 are positively associated with increased lesion surface area in APOE aortic tissue samples (Sukovich, et al. (1998) Arterioscler. Thromb. Vasc. Biol. 18(9): 1498-1505; Loppnow, et al. (1998) Blood 91(1): 134-141)
  • interleukin 1 (IL1): May contribute to regulation of local pathogenesis in the vessel wall by activation of the cytokine regulatory network.
  • IL-1 antagonist inhibits platelet-induced cytokine production of smooth muscle cells (Loppnow et al.
  • PSGDS Northern analysis shows strong specific expression in heart. Immunocytochemical localisation to myocardial and atrio endocardial cells, and accumulates in end-stage atherosclerotic plaques. High plasma levels detected in severe angina patients (Eguchi et al. (1997) Proc. Natl. Acad. Sci. USA 94(26): 14689-14694)
  • Annexin II/lipocortin II (ANX2): Inhibits phospholipase A2 activity and hence the production of arachidonic acid, the precursor of the inflammatory mediators prostaglandins and leukotrienes. ANX2 is an important anti-inflammatory molecule.
  • ANX1 is an important anti-inflammatory molecule (Wallner, et al.
  • Prostaglandin-endoperoxide Synthase 2 (PTGS2): Major mechanism for the regulation of prostaglandin synthesis. Arachidonic acid pathway. Role in inflammation and endothelial cell migration/angiogenesis. Regulated enzyme - major mediator of inflammation. Antiinflammatory glucocorticoids are potent inhibitors of this cyclooxygenase. Over expression of PTGS2 in vitro in rabbit epithelial cells causes increased adhesion to extracellular matrix proteins and inhibition of apoptosis, hallmarks of atherosclerotic plaque formation (Morham et al. (1995) Cell 83(3): 473-482; O'Banion et al.
  • IGFBP1 insulin-like protein-1 IGF1/IGFBP1 system found to be associated with cardiovascular risk and (IGFBP-1) atherosclerosis (Janssen et al. (1998) Arterioscler. Thromb. Vasc. Biol. 18(2): 277-282) Secreted protein, Extracellular glycoprotein secreted by endothelial cells which has a suspected role acidic and rich in in calcification of atherosclerotic plaques.
  • SPARC SPARC
  • PDGF-B cysteine dimers
  • Expression of SPARC and PDGF is minimal in most adult tissues, but is enhanced following injury and in advanced atherosclerotic lesions.
  • SPARC selective expression of SPARC causes rounding of adherent endothelial cells and influences extravasation of macromolecules (Raines et al. (1992) Proc. Natl. Acad Sci U.S.A. 89(4): 1281-1285; Goldblum et al. (1994) Proc. Natl. Acad. Sci.
  • NF-kappa-B transcription factor: Activated NF kappa B occurs in atherosclerotic lesions, and regulates the expression of genes important in recruitment of monocytes and inflammatory response.
  • NFkB smooth muscle cells during factor
  • AAS renin-angiotensin system
  • AGT Hypertensive mice carrying renin and angiotensinogen transgenes found to have higher total cholesterol levels on an atherogenic diet than their wt counterparts, and atherogenic lesions were 4× larger in surface area. Suggests hypertension induced by activated RAS is important atherogenic factor (Sugiyama et al. (1997) Lab. Invest. 76(6): 835-842) Nitric Oxide Mediates basal vasodilation.
  • NOS3 reactive oxygen
  • Nitric Oxide Synthase 2 (NOS2): Mediates basal vasodilation. Regulates the production of nitric oxide, an important signal transduction component and scavenger of reactive oxygen species. NOS2, known as inducible NOS, is expressed in most cells only after induction by immunologic and inflammatory stimuli, and is upregulated in pathological conditions such as atherosclerosis (Dusting et al. (1998) Clin. Expt. Pharmacol. Physiol. Suppl. 25: S34-41)
  • novel atherosclerosis-associated polynucleotides identified are listed in the table by their SEQ ID NOs numbers, and the known genes, by their names or the abbreviations shown in Table 3. TABLE 4 Co-expression of 34 novel genes with 66 known atherosclerosis genes.
  • Polynucleotides comprising the consensus sequences of SEQ ID NO: 1-34 of the present invention were first identified from Incyte bins and assembled as described in Example III. BLAST and other motif searches were performed for SEQ ID NOs: 1-34 according to Example VI. The full length and 5′-complete sequences were translated and sequence identity was sought with known sequences.
  • SEQ ID NO: 35 of the present invention was encoded by the nucleic acids of SEQ ID NO: 11.
  • SEQ ID NO: 35 has 366 amino acids which are encoded by SEQ ID NO: 11.
  • Motif analysis of SEQ ID NO: 35 shows one potential cAMP- and cGMP-dependent protein kinase phosphorylation site at residue S343, two potential casein kinase II phosphorylation sites at residues S179 and T351, and four potential protein kinase C phosphorylation sites at residues T29, S85, T269, and T324. Additionally, SEQ ID NO: 35 contains a potential sugar transport protein signature sequence from residues L201 to S217.
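A motif scan of this kind can be sketched in a few lines. The sketch below is illustrative only: it assumes the widely cited PROSITE-style consensus patterns for the three kinase sites ([RK](2)-x-[ST], [ST]-x(2)-[DE], and [ST]-x-[RK]), writes them as regular expressions, and runs them on a made-up toy peptide rather than on SEQ ID NO: 35. The names `PATTERNS`, `find_sites`, and `toy` are hypothetical, and this is not the MOTIFS program itself.

```python
import re

# Assumed PROSITE-style consensus patterns, expressed as regular expressions.
# Each entry is (pattern, offset of the phosphorylated S/T within the match).
# Lookaheads are used so that overlapping candidate sites are all reported.
PATTERNS = {
    "cAMP/cGMP-dependent kinase": (r"(?=[RK]{2}.[ST])", 3),  # [RK](2)-x-[ST]
    "casein kinase II": (r"(?=[ST].{2}[DE])", 0),            # [ST]-x(2)-[DE]
    "protein kinase C": (r"(?=[ST].[RK])", 0),               # [ST]-x-[RK]
}


def find_sites(sequence):
    """Return (site type, residue label) pairs, e.g. ('protein kinase C', 'T11')."""
    sites = []
    for name, (pattern, offset) in PATTERNS.items():
        for match in re.finditer(pattern, sequence):
            pos = match.start() + offset  # 0-based index of the S/T residue
            sites.append((name, f"{sequence[pos]}{pos + 1}"))  # 1-based label
    return sites


# Hypothetical toy peptide for illustration; not SEQ ID NO: 35.
toy = "MARRASPEDGTLKAA"
for site in find_sites(toy):
    print(site)
```

Residue labels follow the same convention as the text above (one-letter amino acid code plus 1-based position). Short consensus patterns like these match frequently by chance, which is why such sites are described as potential.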
  • polypeptide sequence SEQ ID NO: 35
  • databases derived from sources such as GenBank and SwissProt. These databases, which contain previously identified and annotated sequences, were searched for regions of similarity using BLAST (Altschul, supra). BLAST searched for matches and reported only those that satisfied the probability thresholds of 10^-25 or less for nucleotide sequences and 10^-8 or less for polypeptide sequences.
  • the polypeptide sequence was also analyzed for known motif patterns using MOTIFS, SPSCAN, BLIMPS, and HMM-based protocols.
  • MOTIFS (Genetics Computer Group, Madison Wis.) searches polypeptide sequences for patterns that match those defined in the Prosite Dictionary of Protein Sites and Patterns (Bairoch, supra) and displays the patterns found and their corresponding literature abstracts.
  • SPSCAN (Genetics Computer Group) searches for potential signal peptide sequences using a weighted matrix method (Nielsen et al. (1997) Prot. Eng. 10: 1-6). Hits with a score of 5 or greater were considered.
  • BLIMPS uses a weighted matrix analysis algorithm to search for sequence similarity between the polypeptide sequences and those contained in BLOCKS, a database consisting of short amino acid segments, or blocks of 3-60 amino acids in length, compiled from the PROSITE database (Henikoff; supra; Bairoch, supra), and those in PRINTS, a protein fingerprint database based on non-redundant sequences obtained from sources such as SwissProt, GenBank, PIR, and NRL-3D (Attwood et al. (1997) J. Chem. Inf. Comput. Sci. 37:417-424).
  • the BLIMPS searches reported matches with a cutoff score of 1000 or greater and a cutoff probability value of 1.0 × 10^-3.
  • HMM-based protocols were based on a probabilistic approach and searched for consensus primary structures of gene families in the protein sequences (Eddy, supra; Sonnhammer, supra). More than 500 known protein families with cutoff scores ranging from 10 to 50 bits were selected for use in this invention.
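The reporting cutoffs described for BLAST, SPSCAN, and BLIMPS amount to a simple filter over search hits. The following is a minimal sketch of that filter, assuming a hypothetical dictionary layout for each hit (`tool`, `kind`, `p`, `score`); none of these field names come from the tools themselves, and the BLIMPS probability cutoff is read here as "at most 1.0 × 10^-3", which is an interpretation.

```python
# Thresholds taken from the text; the hit record layout is hypothetical.
BLAST_MAX_P = {"nucleotide": 1e-25, "polypeptide": 1e-8}
SPSCAN_MIN_SCORE = 5
BLIMPS_MIN_SCORE = 1000
BLIMPS_MAX_P = 1.0e-3  # assumed to be an upper bound on the probability value


def keep_hit(hit):
    """Return True if a search hit passes the cutoff stated for its tool."""
    tool = hit["tool"]
    if tool == "BLAST":
        return hit["p"] <= BLAST_MAX_P[hit["kind"]]
    if tool == "SPSCAN":
        return hit["score"] >= SPSCAN_MIN_SCORE
    if tool == "BLIMPS":
        return hit["score"] >= BLIMPS_MIN_SCORE and hit["p"] <= BLIMPS_MAX_P
    raise ValueError(f"no cutoff defined for tool {tool!r}")


# Made-up hits for illustration only.
hits = [
    {"tool": "BLAST", "kind": "nucleotide", "p": 1e-30},
    {"tool": "BLAST", "kind": "polypeptide", "p": 1e-5},
    {"tool": "SPSCAN", "score": 5},
    {"tool": "BLIMPS", "score": 1200, "p": 5e-4},
    {"tool": "BLIMPS", "score": 900, "p": 5e-4},
]
kept = [h for h in hits if keep_hit(h)]
print(len(kept), "of", len(hits), "hits pass")
```

The HMM-based protocols are left out of the sketch because their cutoff scores (10 to 50 bits) vary by protein family.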
  • Nucleic acids are isolated from a biological source and applied to a substrate for standard hybridization protocols by one of the following methods.
  • a mixture of target nucleic acids, a restriction digest of genomic DNA, is fractionated by electrophoresis through a 0.7% agarose gel in 1xTAE [Tris-acetate-ethylenediamine tetraacetic acid (EDTA)] running buffer and transferred to a nylon membrane by capillary transfer using 20x saline sodium citrate (SSC).
  • the target nucleic acids are individually ligated to a vector and inserted into bacterial host cells to form a library.
  • Target nucleic acids are arranged on a substrate by one of the following methods.
  • bacterial cells containing individual clones are robotically picked and arranged on a nylon membrane.
  • the membrane is placed on bacterial growth medium, LB agar containing carbenicillin, and incubated at 37° C. for 16 hours.
  • Bacterial colonies are denatured, neutralized, and digested with proteinase K.
  • Nylon membranes are exposed to UV irradiation in a STRATALINKER UV-crosslinker (Stratagene) to cross-link DNA to the membrane.
  • target nucleic acids are amplified from bacterial vectors by thirty cycles of PCR using primers complementary to vector sequences flanking the insert.
  • Amplified target nucleic acids are purified using SEPHACRYL-400 beads (Amersham Pharmacia Biotech).
  • Purified target nucleic acids are robotically arrayed onto a glass microscope slide (Corning Science Products, Corning N.Y.). The slide is previously coated with 0.05% aminopropyl silane (Sigma-Aldrich, St. Louis Mo.) and cured at 110° C.
  • the arrayed glass slide (microarray) is exposed to UV irradiation in a STRATALINKER UV-crosslinker (Stratagene).
  • cDNA probes are made from mRNA templates. Five micrograms of mRNA is mixed with 1 μg random primer (Life Technologies), incubated at 70° C. for 10 minutes, and lyophilized. The lyophilized sample is resuspended in 50 μl of 1x first strand buffer (cDNA Synthesis systems; Life Technologies) containing a dNTP mix, [α-32P]dCTP, dithiothreitol, and MMLV reverse transcriptase (Stratagene), and incubated at 42° C. for 1-2 hours. After incubation, the probe is diluted with 42 μl dH2O, heated to 95° C. for 3 minutes, and cooled on ice.
  • mRNA in the probe is removed by alkaline degradation.
  • the probe is neutralized, and degraded mRNA and unincorporated nucleotides are removed using a PROBEQUANT G-50 microcolumn (Amersham Pharmacia Biotech).
  • Probes can be labeled with fluorescent markers, Cy3-dCTP or Cy5-dCTP (Amersham Pharmacia Biotech), in place of the radionucleotide, [32P]dCTP.
  • Hybridization is carried out at 65° C. in a hybridization buffer containing 0.5 M sodium phosphate (pH 7.2), 7% SDS, and 1 mM EDTA. After the substrate is incubated in hybridization buffer at 65° C. for at least 2 hours, the buffer is replaced with 10 ml of fresh buffer containing the probes. After incubation at 65° C. for 18 hours, the hybridization buffer is removed, and the substrate is washed sequentially under increasingly stringent conditions, up to 40 mM sodium phosphate, 1% SDS, 1 mM EDTA at 65° C.
  • the substrate is exposed to a PHOSPHORIMAGER cassette (Amersham Pharmacia Biotech), and the image is analyzed using IMAGEQUANT data analysis software (Amersham Pharmacia Biotech).
  • For a fluorescent probe hybridized on a microarray, the substrate is examined by confocal laser microscopy, and images are collected and analyzed using GEMTOOLS gene expression analysis software (Incyte Pharmaceuticals).
  • Molecules complementary to the polynucleotide, or a fragment thereof are used to detect, decrease, or inhibit gene expression.
  • Although the use of oligonucleotides comprising from about 18 to about 60 base pairs is described, the same procedure is used with larger or smaller fragments or their derivatives (PNAs).
  • Oligonucleotides are designed using OLIGO 4.06 software (National Biosciences) and SEQ ID NO: 1-34 or fragments thereof.
  • a complementary oligonucleotide is designed to bind to the most unique 5′ sequence, most preferably about 10 nucleotides before the initiation codon of the open reading frame.
  • a complementary oligonucleotide is designed to prevent ribosomal binding to the mRNA encoding the polypeptide.
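As an illustration of the design step, a complementary oligonucleotide for a window that begins about 10 nucleotides before the initiation codon is just the reverse complement of that window. The sketch below assumes the transcript is given as a DNA-alphabet string and that the index of the initiation codon is already known; `antisense_oligo`, its parameters, and the toy sequence are hypothetical and do not reproduce the OLIGO 4.06 software or any of SEQ ID NO: 1-34. Uniqueness of the chosen region is not checked.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(transcript, start_codon_index, length=18, upstream=10):
    """Oligo complementary to a window starting `upstream` nt before the start codon.

    `length` defaults to 18, the lower end of the 18-60 range given in the text.
    """
    begin = start_codon_index - upstream
    if begin < 0 or begin + length > len(transcript):
        raise ValueError("window falls outside the sequence")
    return reverse_complement(transcript[begin:begin + length])


# Made-up toy transcript: 16 nt of 5' untranslated sequence, then ATG.
toy_transcript = "GCGTACCGGATTCAGC" + "ATG" + "GCTAAGGTCCTGAAC"
oligo = antisense_oligo(toy_transcript, start_codon_index=16)
print(oligo)
```

Because the oligonucleotide is written 5′ to 3′, taking its reverse complement again recovers the targeted window, which is a quick sanity check on any designed sequence.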
  • Polypeptides encoded by SEQ ID NO: 1-34, or portions thereof, substantially purified using polyacrylamide gel electrophoresis or other purification techniques, are used to immunize rabbits and to produce antibodies using standard protocols as described in Pound (1998; Immunochemical Protocols, Methods Mol. Biol. Vol. 80).
  • amino acid sequence is analyzed using LASERGENE software (DNASTAR, Madison Wis.) to determine regions of high immunogenicity, and a corresponding oligopeptide is synthesized and used to raise antibodies by means known to those of skill in the art. Methods for selection of epitopes, such as those near the C-terminus or in hydrophilic regions are well described in the art.
  • oligopeptides 15 residues in length are synthesized using an ABI431A Peptide synthesizer (PE Biosystems) using fmoc-chemistry and coupled to keyhole limpet hemocyanin (KLH, Sigma-Aldrich) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (Ausubel, supra) to increase immunogenicity.
  • Rabbits are immunized with the oligopeptide-KLH complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide activity by, for example, binding the peptide to plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG.
  • polynucleotide, or fragments thereof, or the polypeptide, or portions thereof are labeled with 32P-dCTP, Cy3-dCTP, or Cy5-dCTP (Amersham Pharmacia Biotech), or with BODIPY or FITC (Molecular Probes, Eugene OR), respectively.
  • Libraries of candidate molecules previously arranged on a substrate are incubated in the presence of labeled polynucleotide or polypeptide. After incubation under conditions for either a nucleic acid or amino acid sequence, the substrate is washed, and any position on the substrate retaining label, which indicates specific binding or complex formation, is assayed, and the binding molecule is identified. Data obtained using different concentrations of the polynucleotide or polypeptide are used to calculate affinity between the labeled nucleic acid or protein and the bound molecule.
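The text does not say how affinity is calculated from the concentration series. One common approach, shown here purely as an assumption, is a single-site binding model B = Bmax·c / (Kd + c), which becomes a straight line in double-reciprocal form, 1/B = (Kd/Bmax)(1/c) + 1/Bmax. The sketch fits that line by ordinary least squares; `fit_one_site` and the data are hypothetical, and the data are synthetic values generated from the model itself rather than measurements.

```python
def fit_one_site(concentrations, bound):
    """Fit 1/B = (Kd/Bmax) * (1/c) + 1/Bmax by least squares; return (Kd, Bmax)."""
    xs = [1.0 / c for c in concentrations]
    ys = [1.0 / b for b in bound]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals Kd / Bmax
    intercept = mean_y - slope * mean_x  # equals 1 / Bmax
    bmax = 1.0 / intercept
    kd = slope * bmax
    return kd, bmax


# Synthetic, noise-free signal generated with Kd = 2.0 and Bmax = 100.0
# (arbitrary units), used only to show that the fit recovers its inputs.
concs = [0.5, 1.0, 2.0, 4.0, 8.0]
signal = [100.0 * c / (2.0 + c) for c in concs]
kd, bmax = fit_one_site(concs, signal)
print(round(kd, 6), round(bmax, 6))
```

With real, noisy data the double-reciprocal transform overweights the lowest concentrations, so a direct nonlinear fit of the untransformed model is often preferred; the linear form is used here only because it keeps the example self-contained.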

Landscapes

  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Molecular Biology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Toxicology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Genetics & Genomics (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Urology & Nephrology (AREA)
  • Vascular Medicine (AREA)
  • Cardiology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Pharmaceuticals Containing Other Organic And Inorganic Compounds (AREA)
  • Peptides Or Proteins (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The invention provides novel atherosclerosis-associated polynucleotides and polypeptides encoded by those genes. The invention also provides expression vectors, host cells, and antibodies. The invention also provides methods for screening or purifying ligands and diagnosing, treating or preventing atherosclerosis.

Description

    FIELD OF THE INVENTION
  • The invention relates to 34 atherosclerosis-associated polynucleotides identified by their co-expression with known atherosclerosis genes and their corresponding gene products. The invention also relates to the use of these biomolecules in diagnosis, prognosis, prevention, treatment, and evaluation of therapies for diseases associated with atherosclerosis. [0001]
  • BACKGROUND OF THE INVENTION
  • Atherosclerosis is a disorder characterized by cellular changes in the arterial intima and the formation of arterial plaques containing intra- and extracellular deposits of lipids. The resultant thickening of artery walls and the narrowing of the arterial lumen is the underlying pathologic condition in most cases of coronary artery disease, aortic aneurysm, peripheral vascular disease, and stroke. A cascade of molecules is involved in the cellular morphogenesis, proliferation, and cellular migration which results in an atherosclerotic lesion (Libby et al. (1997) Int. J. Cardiol. 62 (S2): 23-29). [0002]
  • A healthy artery consists of three layers. The vascular intima, lined by a monolayer of endothelial cells in contact with the blood, contains smooth muscle cells in extracellular matrix. An internal elastic lamina forms the border between the intima and the tunica media. The media contains layers of smooth muscle cells surrounded by a collagen and elastin-rich extracellular matrix. An external elastic lamina forms the border between the media and the adventitia. The adventitia contains nerves and some mast cells and is the origin of the vasa vasorum which supplies blood to the outer layers of the tunica media. [0003]
  • Initiation of an atherosclerotic lesion often occurs following vascular endothelial cell injury as a result of hypertension, diabetes mellitus, hyperlipidemia, fluctuating shear stress, smoking, or transplant rejection. The injury results in the local release of nitric oxide and superoxide anions which react to form cytodestructive peroxynitrite radicals, causing injury to the endothelium and myocytes of the intima. This cellular injury leads to the expression of a variety of molecules that produce local and systemic effects. The initial cellular response to injury includes the release of mediators of inflammation such as cytokines, complement components, prostaglandins, and downstream transcription factors. These molecules promote monocyte infiltration of the vascular intima and lead to the upregulation of adhesion molecules which encourage attachment of the monocytes to the damaged endothelial cells. Additionally, components of the extracellular matrix including collagens, fibrinogens, and matrix Gla protein are induced and provide sites for monocyte attachment. Annexins, plasminogen activator inhibitor 1, and nitric oxide synthases are triggered to counteract these effects. [0004]
  • Monocytes that infiltrate the lesion accumulate modified low density lipoprotein lipid through scavenger receptors such as CD36 and macrophage scavenger receptor type I. The abundance of modified lipids is a factor in atherogenesis and is influenced by modifying enzymes such as lipoprotein lipase, carboxyl ester lipase, serum amyloid P component, LDL-receptor related protein, microsomal triglyceride transfer protein, and serum esterases such as paraoxonase. Lipid metabolism is governed by cholesterol biosynthesis enzymes such as 3-hydroxy-3-methylglutaryl coenzyme A synthase, and products of the apolipoprotein genes. Modified lipid stabilization and accumulation is aided by perilipin and alpha-2-macroglobulin. [0005]
  • As monocytes accumulate in the lesion, they can rupture and release free cholesterol, cytokines, and procoagulants into the surrounding environment. This process leads to the development of a plaque which consists of a mass of lipid-engorged monocytes and a lipid-rich necrotic core covered by a fibrous cap. The gradual progression of plaque growth is punctuated by thrombus formation which leads to clinical symptoms such as unstable angina, myocardial infarction, or stroke. Thrombus formation is initiated by episodic plaque rupture which exposes flowing blood to tissue factors, which induce coagulation, and collagen, which activates platelets. After initiation of the atherosclerotic lesion, enzymes that degrade extracellular matrix components such as matrix metalloproteinases and cathepsin K are up-regulated, and their inhibitors are down-regulated. This results in destabilization of the atherosclerotic lesion and subsequent complications including myocardial infarction, angina, and stroke. Further arterial occlusion and infiltration increase with the expression of coagulation factors and down-regulation of their inhibitors, antithrombin III, and lipoprotein-associated coagulation inhibitor. [0006]
  • Smooth muscle cells build up in the arterial media and constitute one of the principal cell types in atherosclerotic and restenotic lesions. They show a high degree of plasticity and are able to shift between a differentiated, contractile phenotype and a less differentiated, synthetic phenotype. This modulation occurs as a response to factors secreted from cells at the site of vascular injury and results in structural reorganization with a loss of myofilaments and the formation of an extensive endoplasmic reticulum and a large Golgi complex. Genes encoding secreted protein, acidic and rich in cysteine (SPARC) and endothelin-1 contribute to these changes. At the same time, the expression of cytoskeletal proteins such as calponin, myosin, desmin, and other gene products in the cells is altered. As a result, the smooth muscle cells lose their contractility and become able to migrate from the media to the intima, to proliferate, and to secrete extracellular matrix components which contribute to arterial intimal thickening. [0007]
  • The initiation and progression of atherosclerotic lesion development requires the interplay of various molecular pathways. Many genes that participate in these processes are known, and some of them have been shown to have a direct role in atherosclerosis pathogenesis by animal model experiments, in vitro assays, and epidemiological studies (Krettek et al. (1997) Arterioscler. Thromb. Vasc. Biol. 17(11):2897-2903; Fisher et al. (1997) Atherosclerosis 135(2):145-159; Shih et al. (1998) Circulation 95(12):2684-2693; and Bocan et al. (1998) Atherosclerosis 139(1):21-30). [0008]
  • The present invention satisfies a need in the art by providing new compositions that are useful for diagnosis, prognosis, treatment, prevention, and evaluation of therapies for diseases associated with atherosclerosis. We have implemented a method for analyzing gene expression patterns and have identified 34 atherosclerosis-associated polynucleotides through their co-expression with 66 known atherosclerosis-associated genes. [0009]
  • SUMMARY OF THE INVENTION
  • The invention provides for a substantially purified polynucleotide comprising a gene that is coexpressed with one or more known atherosclerosis-associated genes in a biological sample. Known atherosclerosis-associated genes include and encode human 22 kDa smooth muscle protein, calponin, desmin, smooth muscle myosin heavy chain, alpha tropomyosin, human tissue inhibitor of metalloproteinase 3, human tissue inhibitor of metalloproteinase-2, human tissue inhibitor of metalloproteinase-4, pro alpha 1(I) collagen, collagen alpha-2 type I, collagen alpha-6 type I, procollagen alpha 2(V), collagen VI alpha-2, type VI collagen alpha3, pro-alpha-1 type 3 collagen, pro-alpha-1 (V) collagen, collagenase type IV/matrix metalloproteinase 9/gelatinase B, matrix Gla protein, cathepsin K, fibrinogen beta chain gene, fibrinogen gamma chain gene, pre-pro-von Willebrand factor, coagulation factor II/prothrombin, coagulation factor XII, coagulation factor VII, platelet endothelial cell adhesion molecule, lipoprotein-associated coagulation inhibitor, antithrombin III variant, plasminogen activator inhibitor-1, lipoprotein lipase, alpha-2-macroglobulin, apolipoprotein AI, apolipoprotein AII, apolipoprotein B-100, lipoprotein apoCII, pre-apolipoprotein CIII, apolipoprotein apo C-IV, macrophage scavenger receptor type I, human antigen CD36 gene, serum amyloid P component, carboxyl ester lipase gene, paraoxonase 1, paraoxonase 2, paraoxonase 3, LDL-receptor related protein, hepatic triglyceride lipase, 3-hydroxy-3-methylglutaryl coenzyme A synthase, very low density lipoprotein receptor, microsomal triglyceride transfer protein, perilipin, endothelin-1, endothelin receptor A, interleukin 6, interleukin 1, complement protein C8 alpha, complement component C9, prostaglandin D2 synthase, annexin II/lipocortin II, annexin I/lipocortin, prostaglandin-endoperoxide synthase 2, insulin-like growth factor binding protein-1, secreted protein, acidic and rich in cysteine, human NF-kappa-B transcription factor, angiotensinogen, nitric oxide synthase 3, and nitric oxide synthase 2. [0010]
  • The invention also provides a substantially purified polynucleotide comprising a gene that is coexpressed with one or more known atherosclerosis-associated genes in a plurality of samples. In one aspect, the polynucleotide comprises a polynucleotide sequence selected from (a) a polynucleotide encoding a peptide selected from SEQ ID NOs: 1-34; (b) a polynucleotide sequence complementary to the polynucleotide sequence of (a) or (b); and (c) a probe comprising at least 18 sequential nucleotides of the polynucleotide sequence of (a) or (b). The invention further provides a pharmaceutical composition comprising a polynucleotide and a pharmaceutical carrier. [0011]
  • The invention additionally provides methods for using a polynucleotide. One method uses the polynucleotide to screen a library of molecules or compounds to identify at least one ligand which specifically binds the polynucleotide and comprises combining the polynucleotide with a library of molecules or compounds under conditions to allow specific binding and detecting specific binding, thereby identifying a ligand which specifically binds the polynucleotide. In this first method, the library is selected from DNA molecules, RNA molecules, PNAs, mimetics, and proteins; and the ligand identified using the method may be used to modulate the activity of the polynucleotide. A second method uses the polynucleotide to purify a ligand which specifically binds the polynucleotide and comprises combining the polynucleotide with a sample under conditions to allow specific binding, detecting specific binding between the polynucleotide and a ligand, recovering the bound polynucleotide, and separating the polynucleotide from the ligand, thereby obtaining purified ligand. A third method uses the polynucleotide to diagnose a disease or condition associated with the altered expression of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a plurality of biological samples and comprises hybridizing a polynucleotide to a sample under conditions to form one or more hybridization complexes, detecting the hybridization complexes, and comparing the levels of the hybridization complexes with the level of hybridization complexes in a non-diseased sample, wherein the altered level of hybridization complexes compared with the level of hybridization complexes of a non-diseased sample indicates the presence of the disease or condition. 
A fourth method uses the polynucleotide to produce a polypeptide and comprises culturing a host cell containing an expression vector containing the polynucleotide under conditions for expression of the polypeptide and recovering the polypeptide from cell culture. [0012]
  • The invention provides a substantially purified polypeptide comprising the product of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a plurality of samples. The invention also provides a polypeptide comprising a polypeptide sequence selected from (a) the polypeptides encoded by SEQ ID NOs: 1-34; and (b) an oligopeptide sequence comprising at least 6 sequential amino acids of the polypeptide sequence of (a). The invention further provides a polypeptide comprising the amino acid sequence of SEQ ID NO: 35. The invention still further provides a pharmaceutical composition comprising a polypeptide and a pharmaceutical carrier. [0013]
  • The invention additionally provides methods for using a polypeptide. One method uses the polypeptide to screen a library of molecules or compounds to identify at least one ligand which specifically binds the polypeptide and comprises combining the polypeptide with the library of molecules or compounds under conditions to allow specific binding and detecting specific binding between the polypeptide and ligand, thereby identifying a ligand which specifically binds the polypeptide. In this method, the library is selected from DNA molecules, RNA molecules, PNAs, mimetics, polypeptides, agonists, antagonists, and antibodies; and the ligand identified using the method is used to modulate the activity of the polypeptide. A second method uses the polypeptide to purify a ligand from a sample and comprises combining the polypeptide with a sample under conditions to allow specific binding, detecting specific binding between the polypeptide and a ligand, recovering the bound polypeptide, and separating the polypeptide from the ligand, thereby obtaining purified ligand. A third method uses the polypeptide to treat or to prevent a disease associated with the altered expression of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a subject in need and comprises administering to the subject in need the pharmaceutical composition containing the polypeptide in an amount effective for treating or preventing the disease. [0014]
  • The invention provides an antibody or Fab comprising an antigen binding site, wherein the antigen binding site specifically binds to the polypeptide. The invention also provides a method for treating or preventing a disease associated with the altered expression of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a subject in need, the method comprising the step of administering to the subject in need the antibody or the Fab in an amount effective for treating or preventing the disease. The invention further provides an immunoconjugate comprising the antigen binding site of the antibody or Fab joined to a therapeutic agent. The invention additionally provides a method for treating or preventing a disease associated with the altered expression of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a subject in need, the method comprising the step of administering to the subject in need the immunoconjugate in an amount effective for treating or preventing the disease. [0015]
  • BRIEF DESCRIPTION OF THE SEQUENCE LISTING
  • The Sequence Listing provides exemplary atherosclerosis-associated gene sequences including polynucleotide sequences SEQ ID NOs: 1-34 and the polypeptide sequence, SEQ ID NO:35. Each sequence is identified by a sequence identification number (SEQ ID NO). [0016]
  • DESCRIPTION OF THE INVENTION
  • It must be noted that as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include the plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to “a host cell” includes a plurality of such host cells, and a reference to “an antibody” is a reference to one or more antibodies and equivalents thereof known to those skilled in the art, and so forth. [0017]
  • Definitions [0018]
  • “Atherosclerosis-associated gene” refers to a gene or polynucleotide that exhibits a statistically significant coexpression pattern with known atherosclerosis-associated genes which are useful in the diagnosis, treatment, prognosis, or prevention of atherosclerosis. [0019]
  • “Known atherosclerosis-associated gene” refers to a sequence which has been previously identified as useful in the diagnosis, treatment, prognosis, or prevention of atherosclerosis and includes polynucleotides encoding human 22 kDa smooth muscle protein, calponin, desmin, smooth muscle myosin heavy chain, alpha tropomyosin, human tissue inhibitor of metalloproteinase 3, human tissue inhibitor of metalloproteinase-2, human tissue inhibitor of metalloproteinase-4, pro alpha 1(I) collagen, collagen alpha-2 type I, collagen alpha-6 type I, procollagen alpha 2(V), collagen VI alpha-2, type VI collagen alpha3, pro-alpha-1 type 3 collagen, pro-alpha-1(V) collagen, collagenase type IV/matrix metalloproteinase 9/gelatinase B, matrix Gla protein, cathepsin K, fibrinogen beta chain gene, fibrinogen gamma chain gene, pre-pro-von Willebrand factor, coagulation factor II/prothrombin, coagulation factor XII, coagulation factor VII, platelet endothelial cell adhesion molecule, lipoprotein-associated coagulation inhibitor, antithrombin III variant, plasminogen activator inhibitor-1, lipoprotein lipase, alpha-2-macroglobulin, apolipoprotein AI, apolipoprotein AII, apolipoprotein B-100, lipoprotein apoCII, pre-apolipoprotein CIII, apolipoprotein apo C-IV, macrophage scavenger receptor type I, human antigen CD36 gene, serum amyloid P component, carboxyl ester lipase gene, paraoxonase 1, paraoxonase 2, paraoxonase 3, LDL-receptor related protein, hepatic triglyceride lipase, 3-hydroxy-3-methylglutaryl coenzyme A synthase, very low density lipoprotein receptor, microsomal triglyceride transfer protein, perilipin, endothelin-1, endothelin receptor A, interleukin 6, interleukin 1, complement protein C8 alpha, complement component C9, prostaglandin D2 synthase, annexin II/lipocortinII, annexin I/lipocortin, prostaglandin-endoperoxide synthase 2, insulin-like growth factor binding protein-1, secreted protein, acidic and rich in cysteine, human NF-kappa-B transcription factor, angiotensinogen, nitric oxide synthase 3, and nitric oxide synthase 2. Typically, this means that the known gene is expressed at higher levels (i.e., has more abundant transcripts) in atherosclerotic lesions than in normal or non-diseased arterial intima or any other tissue. [0020]
  • “Ligand” refers to any molecule, agent, or compound which will bind specifically to a complementary site on a polynucleotide or polypeptide. Such ligands stabilize or modulate the activity of polynucleotides or polypeptides of the invention. For example, ligands may be drawn from libraries of inorganic and organic molecules or compounds such as nucleic acids, proteins, peptides, carbohydrates, fats, and lipids. [0021]
  • “NSEQ” refers generally to a polynucleotide sequence of the present invention, including SEQ ID NO: 1-34. “PSEQ” refers generally to a polypeptide sequence of the present invention, including SEQ ID NO:35. [0022]
  • A “fragment” refers to a nucleic acid sequence that is preferably at least 20 nucleotides in length, more preferably 40 nucleotides, and most preferably 60 nucleotides in length, and encompasses, for example, fragments consisting of 1-50, 51-400, 401-4000, 4001-12,000 nucleotides, and the like, of SEQ ID NO: 1-34. [0023]
  • “Gene” refers to the partial or complete coding sequence of a gene including 5′ or 3′ untranslated regions. The gene may be in a sense or antisense (complementary) orientation. [0024]
  • “Polynucleotide” refers to a nucleic acid, nucleic acid sequence, oligonucleotide, nucleotide, or any fragment thereof. It may be DNA or RNA of genomic or synthetic origin, double-stranded or single-stranded, and combined with carbohydrate, lipids, protein or other materials to perform a particular activity or form a useful composition. “Oligonucleotide” is substantially equivalent to the terms amplimer, primer, oligomer, element, and probe. [0025]
  • “Polypeptide” refers to an amino acid, amino acid sequence, oligopeptide, peptide, or protein or portions thereof whether naturally occurring or synthetic. [0026]
  • A “portion” refers to a peptide sequence which is preferably at least 5 to about 15 amino acids in length, most preferably at least 10 amino acids long, and which retains some biological or immunological activity of, for example, a portion of SEQ ID NO: 35. [0027]
  • “Sample” is used in its broadest sense. A sample containing nucleic acids may comprise a bodily fluid; an extract from a cell, chromosome, organelle, or membrane isolated from a cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; a cell; a tissue; a tissue print; and the like. [0028]
  • “Substantially purified” refers to a nucleic acid or an amino acid sequence that is removed from its natural environment and that is isolated or separated, and is at least about 60% free, preferably about 75% free, and most preferably about 90% free, from other components with which it is naturally present. [0029]
  • “Substrate” refers to any rigid or semi-rigid support to which polynucleotides or polypeptides are bound and includes membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, capillaries or other tubing, plates, polymers, and microparticles with a variety of surface forms including wells, trenches, pins, channels and pores. [0030]
  • A “variant” refers to a polynucleotide or polypeptide whose sequence diverges from SEQ ID NO: 1-35. Polynucleotide sequence divergence may result from mutational changes such as deletions, additions, and substitutions of one or more nucleotides; it may also be introduced to accommodate differences in codon usage. Each of these types of changes may occur alone, or in combination, one or more times in a given sequence. [0031]
  • The Invention [0032]
  • The present invention encompasses a method for identifying biomolecules that are associated with a specific disease, regulatory pathway, subcellular compartment, cell type, tissue type, or species. In particular, the method identifies polynucleotides useful in diagnosis, prognosis, treatment, prevention, and evaluation of therapies for diseases associated with atherosclerosis including, but not limited to, stroke, myocardial infarction, hypertension, transient cerebral ischemia, mesenteric ischemia, coronary artery disease, angina pectoris, peripheral vascular disease, intermittent claudication, and renal artery stenosis. [0033]
  • The method entails first identifying polynucleotides that are expressed in a plurality of cDNA libraries. The identified polynucleotides include genes of known or unknown function which are expressed in a specific disease process, subcellular compartment, cell type, tissue type, or species. The expression patterns of the genes with known function are compared with those of genes with unknown function to determine whether a specified coexpression probability threshold is met. Through this comparison, a subset of the polynucleotides having a high coexpression probability with the known genes can be identified. A high coexpression probability corresponds to a low probability that the observed coexpression is due to chance; the threshold for this due-to-chance probability is preferably less than 0.001 and more preferably less than 0.00001. [0034]
  • The polynucleotides originate from cDNA libraries derived from a variety of sources including, but not limited to, eukaryotes such as human, mouse, rat, dog, monkey, plant, and yeast; prokaryotes such as bacteria; and viruses. These polynucleotides can also be selected from a variety of sequence types including, but not limited to, expressed sequence tags (ESTs), assembled polynucleotide sequences, full length gene coding regions, promoters, introns, enhancers, 5′ untranslated regions, and 3′ untranslated regions. To have statistically significant analytical results, the polynucleotides need to be expressed in at least three cDNA libraries. [0035]
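  • The minimum-library requirement can be illustrated with a short sketch. The gene names, library names, and the helper `genes_meeting_minimum` are hypothetical and serve only to show the filter; the threshold of three libraries is taken from the text above.

```python
from typing import Dict, Set


def genes_meeting_minimum(
    occurrences: Dict[str, Set[str]], min_libraries: int = 3
) -> Dict[str, Set[str]]:
    """Keep only genes detected in at least `min_libraries` cDNA libraries."""
    return {
        gene: libs
        for gene, libs in occurrences.items()
        if len(libs) >= min_libraries
    }


# Hypothetical input: gene -> set of libraries in which it was detected.
detected = {
    "gene_A": {"lib1", "lib2", "lib7"},
    "gene_B": {"lib1"},
    "gene_C": {"lib2", "lib3", "lib4", "lib9"},
}
eligible = genes_meeting_minimum(detected)
```

  • Genes seen in fewer than three libraries are dropped before any coexpression statistics are computed.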
  • The cDNA libraries used in the coexpression analysis of the present invention can be obtained from adrenal gland, biliary tract, bladder, blood cells, blood vessels, bone marrow, brain, bronchus, cartilage, chromaffin system, colon, connective tissue, cultured cells, embryonic stem cells, endocrine glands, epithelium, esophagus, fetus, ganglia, heart, hypothalamus, immune system, intestine, islets of Langerhans, kidney, larynx, liver, lung, lymph, muscles, neurons, ovary, pancreas, penis, peripheral nervous system, phagocytes, pituitary, placenta, pleurus, prostate, salivary glands, seminal vesicles, skeleton, spleen, stomach, testis, thymus, tongue, ureter, uterus, and the like. The number of cDNA libraries selected can range from as few as 3 to greater than 10,000. Preferably, the number of the cDNA libraries is greater than 500. [0036]
  • In a preferred embodiment, genes are assembled from related sequences, such as assembled sequence fragments derived from a single transcript. Assembly of the sequences can be performed using sequences of various types including, but not limited to, ESTs, extensions, or shotgun sequences. In a most preferred embodiment, the polynucleotide sequences are derived from human sequences that have been assembled using the algorithm disclosed in “Database and System for Storing, Comparing and Displaying Related Biomolecular Sequence Information”, Lincoln et al. Serial No: 60/079,469, filed Mar. 26, 1998, incorporated herein by reference. [0037]
  • Experimentally, differential expression of the polynucleotides can be evaluated by methods including, but not limited to, differential display by spatial immobilization or by gel electrophoresis, genome mismatch scanning, representational difference analysis, and transcript imaging. Additionally, differential expression can be assessed by microarray technology. These methods may be used alone or in combination. [0038]
  • Known atherosclerosis-associated genes are selected based on the use of these genes as diagnostic or prognostic markers or as therapeutic targets. [0039]
  • The procedure for identifying novel genes that exhibit a statistically significant coexpression pattern with known atherosclerosis-associated genes is as follows. First, the presence or absence of a gene in a cDNA library is defined: a gene is present in a cDNA library when at least one cDNA fragment corresponding to that gene is detected in a cDNA sample taken from the library, and a gene is absent from a library when no corresponding cDNA fragment is detected in the sample. [0040]
  • Second, the significance of gene coexpression is evaluated using a probability method to measure a due-to-chance probability of the coexpression. The probability method can be the Fisher exact test, the chi-squared test, or the kappa test. These tests and examples of their applications are well known in the art and can be found in standard statistics texts (Agresti (1990) Categorical Data Analysis, John Wiley & Sons, New York N.Y.; Rice (1988) Mathematical Statistics and Data Analysis, Duxbury Press, Pacific Grove Calif.). A Bonferroni correction (Rice, supra, p. 384) can also be applied in combination with one of the probability methods for correcting statistical results of one gene versus multiple other genes. In a preferred embodiment, the due-to-chance probability is measured by a Fisher exact test, and the threshold of the due-to-chance probability is set preferably to less than 0.001, more preferably to less than 0.00001. [0041]
  • To determine whether two genes, A and B, have similar coexpression patterns, occurrence data vectors can be generated as illustrated in Table 1. The presence of a gene occurring at least once in a library is indicated by a one, and its absence from the library, by a zero. [0042]
    TABLE 1
    Occurrence data for genes A and B
              Library 1   Library 2   Library 3   . . .   Library N
    gene A    1           1           0           . . .   0
    gene B    1           0           1           . . .   0
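  • As a minimal sketch of how an occurrence vector of this kind could be derived, the following assumes that each library is represented by a count of cDNA fragments matching the gene. The counts and the function name `occurrence_vector` are hypothetical.

```python
from typing import List, Sequence


def occurrence_vector(fragment_counts: Sequence[int]) -> List[int]:
    """Map per-library fragment counts to presence (1) or absence (0)."""
    # A gene is present when at least one matching cDNA fragment is detected.
    return [1 if count >= 1 else 0 for count in fragment_counts]


# Hypothetical fragment counts for one gene across four libraries.
counts_gene_a = [3, 1, 0, 0]
vector_gene_a = occurrence_vector(counts_gene_a)
```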
  • For a given pair of genes, the occurrence data in Table 1 can be summarized in a 2×2 contingency table. [0043]
    TABLE 2
    Contingency table for co-occurrences of genes A and B
                      Gene A present   Gene A absent   Total
    Gene B present    8                2               10
    Gene B absent     2                18              20
    Total             10               20              30
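  • The summarization of two occurrence vectors into a 2×2 contingency table can be sketched as follows. The vectors are hypothetical and were constructed only so that the resulting counts match Table 2; the function name is likewise an assumption.

```python
from typing import Sequence, Tuple


def contingency_table(
    a: Sequence[int], b: Sequence[int]
) -> Tuple[int, int, int, int]:
    """Return (both, a_only, b_only, neither) for two 0/1 occurrence vectors."""
    if len(a) != len(b):
        raise ValueError("occurrence vectors must cover the same libraries")
    both = sum(1 for x, y in zip(a, b) if x and y)
    a_only = sum(1 for x, y in zip(a, b) if x and not y)
    b_only = sum(1 for x, y in zip(a, b) if y and not x)
    neither = len(a) - both - a_only - b_only
    return both, a_only, b_only, neither


# Hypothetical vectors over 30 libraries, arranged to reproduce Table 2.
gene_a = [1] * 8 + [1] * 2 + [0] * 2 + [0] * 18
gene_b = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 18
table = contingency_table(gene_a, gene_b)
```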
  • Table 2 presents co-occurrence data for gene A and gene B in a total of 30 libraries. Both gene A and gene B occur 10 times in the libraries. Table 2 summarizes and presents: 1) the number of times gene A and B are both present in a library; 2) the number of times gene A and B are both absent in a library; 3) the number of times gene A is present, and gene B is absent; and 4) the number of times gene B is present, and gene A is absent. The upper left entry is the number of times the two genes co-occur in a library, and the center entry (gene A absent, gene B absent) is the number of times neither gene occurs in a library. The off-diagonal entries are the number of times one gene occurs, and the other does not. Both A and B are present eight times and absent 18 times. Gene A is present, and gene B is absent, two times; and gene B is present, and gene A is absent, two times. The probability (“p-value”) that the above association occurs due to chance as calculated using a Fisher exact test is 0.0003. Associations are generally considered significant if a p-value is less than 0.01 (Agresti, supra; Rice, supra). [0044]
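  • The p-value quoted for Table 2 can be checked with a short sketch of the Fisher exact test built on the hypergeometric distribution. The text does not state whether a one-sided or two-sided test was used; the code below assumes the one-sided tail (co-occurrence at least as large as observed), and the function name is hypothetical.

```python
from math import comb


def fisher_exact_one_sided(
    both: int, a_only: int, b_only: int, neither: int
) -> float:
    """Probability of at least `both` co-occurrences with fixed marginal totals."""
    n_a = both + a_only          # libraries containing gene A
    n_b = both + b_only          # libraries containing gene B
    total = both + a_only + b_only + neither
    denominator = comb(total, n_a)
    upper = min(n_a, n_b)        # largest possible co-occurrence count
    tail = sum(
        comb(n_b, k) * comb(total - n_b, n_a - k)
        for k in range(both, upper + 1)
    )
    return tail / denominator


# Counts from Table 2: 8 both, 2 A only, 2 B only, 18 neither.
p_value = fisher_exact_one_sided(8, 2, 2, 18)
print(f"{p_value:.4f}")  # prints 0.0003
```

  • Under this assumption the tail probability is 8751/30045015, which rounds to the 0.0003 reported above.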
  • This method of estimating the probability for coexpression of two genes makes several assumptions. The method assumes that the libraries are independent and are identically sampled. However, in practical situations, the selected cDNA libraries are not entirely independent, because more than one library may be obtained from a single subject or tissue. Nor are they entirely identically sampled, because different numbers of cDNAs may be sequenced from each library. The number of cDNAs sequenced typically ranges from 5,000 to 10,000 cDNAs per library. In addition, because a Fisher exact coexpression probability is calculated for each gene versus 45,233 other assembled genes, a Bonferroni correction for multiple statistical tests is used. [0045]
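  • A Bonferroni correction of the kind mentioned above can be sketched as follows. The number of comparisons (45,233) comes from the text; the raw p-value, the function name, and the choice to divide the 0.001 threshold by the number of tests are illustrative assumptions, since the text does not specify how the correction and the threshold are combined.

```python
def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Multiply a raw p-value by the number of tests, capped at 1.0."""
    return min(1.0, p_value * n_tests)


N_TESTS = 45233               # each gene is tested against 45,233 other genes
raw_p = 1e-9                  # hypothetical raw Fisher exact p-value
adjusted_p = bonferroni_adjust(raw_p, N_TESTS)

# Equivalent view: the per-test cutoff needed for an overall level of 0.001.
per_test_threshold = 0.001 / N_TESTS
```

  • Under this sketch a raw p-value of 0.0003, as in the Table 2 example, would be capped at 1.0 after adjustment, which shows how much stricter the per-test cutoff becomes when tens of thousands of comparisons are made.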
  • The present invention identifies 34 novel atherosclerosis-associated polynucleotides that exhibit strong association with genes known to be specific to atherosclerosis. The results presented in Table 4 show that the expression of the 34 novel atherosclerosis-associated polynucleotides has direct or indirect association with the expression of known atherosclerosis-associated genes. Therefore, the novel atherosclerosis-associated polynucleotides can potentially be used in diagnosis, treatment, prognosis, or prevention of diseases associated with atherosclerosis or in the evaluation of therapies for atherosclerosis. Further, the gene products of the 34 novel atherosclerosis-associated polynucleotides are either potential therapeutics or targets of therapeutics against atherosclerosis. [0046]
  • Therefore, in one embodiment, the present invention encompasses a polynucleotide sequence comprising the sequence of SEQ ID NO: 1-34. These 34 polynucleotides are shown by the method of the present invention to have strong coexpression association with known atherosclerosis-associated genes and with each other. The invention also encompasses a variant of the polynucleotide sequence, its complement, or 18 consecutive nucleotides of a sequence provided in the above described sequences. Variant polynucleotide sequences typically have at least about 75%, more preferably at least about 85%, and most preferably at least about 95% polynucleotide sequence identity to NSEQ. [0047]
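  • The identity thresholds for variants can be illustrated with a minimal sketch that computes percent identity over two sequences that have already been aligned. The definition used here (matching positions divided by aligned columns, ignoring columns where both sequences have a gap) is one common convention and is an assumption; the text does not specify a formula. The sequences and the function name are hypothetical.

```python
def percent_identity(aligned_1: str, aligned_2: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_1) != len(aligned_2):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_1.upper(), aligned_2.upper()):
        if x == "-" and y == "-":
            continue              # skip columns that are gaps in both
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical aligned pair differing at one of ten positions.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
is_variant_at_75 = identity >= 75.0
```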
  • NSEQ or the encoded PSEQ may be used to search against the GenBank primate (pri), rodent (rod), mammalian (mam), vertebrate (vrtp), and eukaryote (eukp) databases, SwissProt, BLOCKS (Bairoch et al. (1997) Nucleic Acids Res. 25:217-221), PFAM, and other databases that contain previously identified and annotated motifs, sequences, and gene functions. Methods that search for primary sequence patterns with secondary structure gap penalties (Smith et al. (1992) Protein Engineering 5:35-51) as well as algorithms such as Basic Local Alignment Search Tool (BLAST; Altschul (1993) J. Mol. Evol. 36:290-300; Altschul et al. (1990) J. Mol. Biol. 215:403-410), BLOCKS (Henikoff and Henikoff (1991) Nucleic Acids Res. 19:6565-6572), Hidden Markov Models (HMM; Eddy (1996) Cur. Opin. Str. Biol. 6:361-365; Sonnhammer et al. (1997) Proteins 28:405-420), and the like, can be used to manipulate and analyze nucleotide and amino acid sequences. These databases, algorithms and other methods are well known in the art and are described in Ausubel et al. (1997; Short Protocols in Molecular Biology, John Wiley & Sons, New York N.Y., unit 7.7) and in Meyers (1995; Molecular Biology and Biotechnology, Wiley VCH, New York N.Y., p 856-853). [0048]
  • Also encompassed by the invention are polynucleotide sequences that are capable of hybridizing to SEQ ID NO: 1-34, and fragments thereof under stringent conditions. Stringent conditions can be defined by salt concentration, temperature, and other chemicals and conditions well known in the art. Conditions can be selected, for example, by varying the concentrations of salt in the prehybridization, hybridization, and wash solutions or by varying the hybridization and wash temperatures. With some substrates, the temperature can be decreased by adding formamide to the prehybridization and hybridization solutions. [0049]
  • Hybridization can be performed at low stringency, with buffers such as 5×SSC with 1% sodium dodecyl sulfate (SDS) at 60° C., which permits complex formation between two nucleic acid sequences that contain some mismatches. Subsequent washes are performed at higher stringency with buffers such as 0.2×SSC with 0.1% SDS at either 45° C. (medium stringency) or 68° C. (high stringency), to maintain hybridization of only those complexes that contain completely complementary sequences. Background signals can be reduced by the use of detergents such as SDS, Sarcosyl, or Triton X-100, and/or a blocking agent, such as salmon sperm DNA. Hybridization methods are described in detail in Ausubel (supra, units 2.8-2.11, 3.18-3.19 and 4.6-4.9) and Sambrook et al. (1989; Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y.). [0050]
  • NSEQ can be extended utilizing a partial nucleotide sequence and employing various PCR-based methods known in the art to detect upstream sequences such as promoters and other regulatory elements (see, e.g., Dieffenbach and Dveksler (1995) PCR Primer, a Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y.). Additionally, one may use an XL-PCR kit (PE Biosystems, Foster City Calif.), nested primers, and commercially available cDNA libraries (Life Technologies, Rockville Md.) or genomic libraries (Clontech, Palo Alto Calif.) to extend the sequence. For all PCR-based methods, primers may be designed using commercially available software, such as OLIGO 4.06 Primer Analysis software (National Biosciences, Plymouth Minn.) or another program, to be about 18 to 30 nucleotides in length, to have a GC content of about 50%, and to form a hybridization complex at temperatures of about 68° C. to 72° C. [0051]
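  • The length and GC-content criteria for primers can be checked with a short sketch. This is not the OLIGO software named above; the function names, the example primer, and the ±10% tolerance around 50% GC are assumptions. The hybridization temperature criterion is not modeled, because the text does not give a formula for it.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    if not primer:
        raise ValueError("primer must not be empty")
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_basic_criteria(
    primer: str,
    min_len: int = 18,
    max_len: int = 30,
    gc_target: float = 0.5,
    gc_tolerance: float = 0.1,   # assumed reading of "about 50%"
) -> bool:
    """Check primer length (18 to 30 nt) and GC content (about 50%)."""
    length_ok = min_len <= len(primer) <= max_len
    gc_ok = abs(gc_fraction(primer) - gc_target) <= gc_tolerance
    return length_ok and gc_ok


# Hypothetical 20-nucleotide primer with 10 G/C bases.
candidate = "ATGCGTACCGTTAGCCTAGA"
candidate_ok = meets_basic_criteria(candidate)
```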
  • In another aspect of the invention, NSEQ can be cloned in recombinant DNA molecules that direct the expression of PSEQ, or structural or functional portions thereof, in host cells. Due to the inherent degeneracy of the genetic code, other DNA sequences which encode substantially the same or a functionally equivalent amino acid sequence may be produced and used to express the polypeptide encoded by NSEQ. The nucleotide sequences of the present invention can be engineered using methods generally known in the art in order to alter the nucleotide sequences for a variety of purposes including, but not limited to, modification of the cloning, processing, and/or expression of the gene product. DNA shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be used to engineer the nucleotide sequences. For example, oligonucleotide-mediated site-directed mutagenesis may be used to introduce mutations that create new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, and so forth. [0052]
  • In order to express a biologically active polypeptide, NSEQ, or derivatives thereof, may be inserted into an expression vector, i.e., a vector which contains the elements for transcriptional and translational control of the inserted coding sequence in a particular host. These elements include regulatory sequences, such as enhancers, constitutive and inducible promoters, and 5′ and 3′ untranslated regions. Methods which are well known to those skilled in the art may be used to construct such expression vectors. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. (See, e.g., Sambrook, supra; and Ausubel, supra). [0053]
  • A variety of expression vector/host cell systems may be utilized to express NSEQ. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with baculovirus vectors; plant cell systems transformed with viral or bacterial expression vectors; or animal cell systems. For long term production of recombinant proteins in mammalian systems, stable expression in cell lines is preferred. For example, NSEQ can be transformed into cell lines using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable or visible marker gene on the same or on a separate vector. The invention is not to be limited by the vector or host cell employed. [0054]
  • In general, host cells that contain NSEQ and that express PSEQ may be identified by a variety of procedures known to those of skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and protein bioassay or immunoassay techniques which include membrane, solution, or chip based technologies for the detection and/or quantification of nucleic acid or amino acid sequences. Immunological methods for detecting and measuring the expression of PSEQ using either specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and fluorescence activated cell sorting (FACS). [0055]
  • Host cells transformed with NSEQ may be cultured under conditions for the expression and recovery of the polypeptide from cell culture. The polypeptide produced by a transgenic cell may be secreted or retained intracellularly depending on the sequence and/or the vector used. As will be understood by those of skill in the art, expression vectors containing NSEQ may be designed to contain signal sequences which direct secretion of the polypeptide through a prokaryotic or eukaryotic cell membrane. [0056]
  • In addition, a host cell strain may be chosen for its ability to modulate expression of the inserted sequences or to process the expressed polypeptide in the desired fashion. Such modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a “prepro” form of the polypeptide may also be used to specify protein targeting, folding, and/or activity. Different host cells which have specific cellular machinery and characteristic mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the American Type Culture Collection (ATCC, Bethesda Md.) and may be chosen to ensure the correct modification and processing of the expressed polypeptide. [0057]
  • In another embodiment of the invention, natural, modified, or recombinant nucleic acid sequences are ligated to a heterologous sequence resulting in translation of a fusion polypeptide containing heterologous polypeptide moieties in any of the aforementioned host systems. Such heterologous polypeptide moieties facilitate purification of fusion polypeptides using commercially available affinity matrices. Such moieties include, but are not limited to, glutathione S-transferase, maltose binding protein, thioredoxin, calmodulin binding peptide, 6-His, FLAG, c-myc, hemagglutinin, and monoclonal antibody epitopes. [0058]
  • In another embodiment, the nucleic acid sequences are synthesized, in whole or in part, using chemical or enzymatic methods well known in the art (Caruthers et al. (1980) Nucl. Acids Res. Symp. Ser. 215-233; Ausubel, supra). For example, peptide synthesis can be performed using various solid-phase techniques (Roberge et al. (1995) Science 269:202-204), and machines such as the ABI 431A Peptide synthesizer (PE Biosystems) can be used to automate synthesis. If desired, the amino acid sequence may be altered during synthesis and/or combined with sequences from other proteins to produce a variant protein. [0059]
  • In another embodiment, the invention entails a substantially purified polypeptide comprising the amino acid sequence of SEQ ID NO: 35 and fragments thereof. [0060]
  • Screening, Diagnostics and Therapeutics [0061]
  • The polynucleotide sequences can be used in diagnosis, prognosis, treatment, prevention, and selection and evaluation of therapies for atherosclerosis including, but not limited to, stroke, myocardial infarction, hypertension, transient cerebral ischemia, mesenteric ischemia, coronary artery disease, angina pectoris, peripheral vascular disease, intermittent claudication, and renal artery stenosis. [0062]
  • The polynucleotide sequences may be used to screen a library of molecules for specific binding affinity. The assay can be used to screen a library of DNA molecules, RNA molecules, PNAs, peptides, ribozymes, antibodies, agonists, antagonists, immunoglobulins, inhibitors, proteins including transcription factors, enhancers, repressors, and drugs and the like which regulate the activity of the polynucleotide sequence in the biological system. The assay involves providing a library of molecules, combining the polynucleotide sequence or a fragment thereof with the library of molecules under conditions suitable to allow specific binding, and detecting specific binding to identify at least one molecule which specifically binds the polynucleotide sequence. [0063]
  • Similarly the polypeptide or a portion thereof may be used to screen libraries of molecules in any of a variety of screening assays. The portion of the polypeptide employed in such screening may be free in solution, affixed to an abiotic or biotic substrate (e.g. borne on a cell surface), or located intracellularly. Specific binding between the polypeptide and molecule may be measured. The assay can be used to screen a library of DNA molecules, RNA molecules, PNAs, peptides, mimetics, ribozymes, antibodies, agonists, antagonists, immunoglobulins, inhibitors, peptides, polypeptides, drugs and the like, which specifically bind the polypeptide. One method for high throughput screening using very small assay volumes and very small amounts of test compound is described in Burbaum et al. U.S. Pat. No. 5,876,946, incorporated herein by reference, which screens large numbers of molecules for enzyme inhibition or receptor binding. [0064]
  • In one preferred embodiment, the polynucleotide sequences are used for diagnostic purposes to determine the absence, presence, and excess expression of the polypeptide. The polynucleotides may be at least 18 nucleotides long and consist of complementary RNA and DNA molecules, branched nucleic acids, and/or peptide nucleic acids (PNAs). In one alternative, the polynucleotides are used to detect and quantify gene expression in samples in which expression of NSEQ is correlated with disease. In another alternative, NSEQ can be used to detect genetic polymorphisms associated with a disease. These polymorphisms may be detected in the transcript cDNA. [0065]
  • The specificity of the probe is determined by whether it is made from a unique region, a regulatory region, or from a conserved motif. Both probe specificity and the stringency of diagnostic hybridization or amplification (maximal, high, intermediate, or low) will determine whether the probe identifies only naturally occurring, exactly complementary sequences, allelic variants, or related sequences. Probes designed to detect related sequences should preferably have at least 75% sequence identity to any of the polynucleotides encoding PSEQ. [0066]
  • Methods for producing hybridization probes include the cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by adding RNA polymerases and labeled nucleotides. Hybridization probes may incorporate nucleotides labeled by a variety of reporter groups including, but not limited to, radionuclides such as ³²P or ³⁵S, enzymatic labels such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, fluorescent labels, and the like. The labeled polynucleotide sequences may be used in Southern or northern analysis, dot blot, or other membrane-based technologies; in PCR technologies; and in microarrays utilizing samples from subjects to detect altered PSEQ expression. [0067]
  • NSEQ can be labeled by standard methods and added to a sample from a subject under conditions for the formation and detection of hybridization complexes. After incubation the sample is washed, and the signal associated with hybrid complex formation is quantitated and compared with a standard value. Standard values are derived from any control sample, typically one that is free of the suspect disease. If the amount of signal in the subject sample is altered in comparison to the standard value, then the presence of altered levels of expression in the sample indicates the presence of the disease. Qualitative and quantitative methods for comparing the hybridization complexes formed in subject samples with previously established standards are well known in the art. [0068]
  • Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the treatment of an individual subject. Once the presence of disease is established and a treatment protocol is initiated, hybridization or amplification assays can be repeated on a regular basis to determine if the level of expression in the subject begins to approximate that which is observed in a healthy subject. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to many years. [0069]
  • The polynucleotides may be used for the diagnosis of a variety of diseases associated with atherosclerosis. These include, but are not limited to, stroke, myocardial infarction, hypertension, transient cerebral ischemia, mesenteric ischemia, coronary artery disease, angina pectoris, peripheral vascular disease, intermittent claudication, and renal artery stenosis. [0070]
  • The polynucleotides may also be used as targets in a microarray. The microarray can be used to monitor the expression patterns of large numbers of genes simultaneously and to identify splice variants, mutations, and polymorphisms. Information derived from analyses of the expression patterns may be used to determine gene function, to understand the genetic basis of a disease, to diagnose a disease, and to develop and monitor the activities of therapeutic agents used to treat a disease. Microarrays may also be used to detect genetic diversity at the genome level, such as single nucleotide polymorphisms which may characterize a particular population. [0071]
  • In yet another alternative, polynucleotides may be used to generate hybridization probes useful in mapping the naturally occurring genomic sequence. Fluorescent in situ hybridization (FISH) may be correlated with other physical chromosome mapping techniques and genetic map data as described in Heinz-Ulrich et al. (In: Meyers, supra, pp. 965-968). [0072]
  • In another embodiment, antibodies or Fabs comprising an antigen binding site that specifically binds PSEQ may be used for the diagnosis of diseases characterized by the over- or under-expression of PSEQ. A variety of protocols for measuring PSEQ, including ELISAs, RIAs, and FACS, are well known in the art and provide a basis for diagnosing altered or abnormal levels of expression. Standard values for PSEQ expression are established by combining samples taken from healthy subjects, preferably human, with antibody to PSEQ under conditions for complex formation. The amount of complex formation may be quantitated by various methods, preferably by photometric means. Quantities of PSEQ expressed in disease samples are compared with standard values. Deviation between standard and subject values establishes the parameters for diagnosing or monitoring disease. Alternatively, one may use competitive drug screening assays in which neutralizing antibodies capable of binding PSEQ specifically compete with a test compound for binding the polypeptide. Antibodies can be used to detect the presence of any peptide which shares one or more antigenic determinants with PSEQ. In one aspect, the anti-PSEQ antibodies of the present invention can be used for treating atherosclerosis or for monitoring its therapeutic treatment. [0073]
  • In another aspect, the NSEQ, or its complement, may be used therapeutically for the purpose of expressing mRNA and polypeptide, or conversely to block transcription or translation of the mRNA. Expression vectors may be constructed using elements from retroviruses, adenoviruses, herpes or vaccinia viruses, or bacterial plasmids, and the like. These vectors may be used for delivery of nucleotide sequences to a particular target organ, tissue, or cell population. Methods well known to those skilled in the art can be used to construct vectors to express nucleic acid sequences or their complements. (See, e.g., Maulik et al. (1997) Molecular Biotechnology, Therapeutic Applications and Strategies, Wiley-Liss, New York N.Y.) Alternatively, NSEQ, or its complement, may be used for somatic cell or stem cell gene therapy. Vectors may be introduced in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors are introduced into stem cells taken from the subject, and the resulting transgenic cells are clonally propagated for autologous transplant back into that same subject. Delivery of NSEQ by transfection, liposome injections, or polycationic amino polymers may be achieved using methods which are well known in the art. (See, e.g., Goldman et al. (1997) Nature Biotechnology 15:462-466.) Additionally, endogenous NSEQ expression may be inactivated using homologous recombination methods which insert an inactive gene sequence into the coding region or other targeted region of NSEQ. (See, e.g., Thomas et al. (1987) Cell 51:503-512.) [0074]
  • Vectors containing NSEQ can be transformed into a cell or tissue to express a missing polypeptide or to replace a nonfunctional polypeptide. Similarly, a vector constructed to express the complement of NSEQ can be transformed into a cell to downregulate the overexpression of PSEQ. Complementary or antisense sequences may consist of an oligonucleotide derived from the transcription initiation site; nucleotides between about positions −10 and +10 from the ATG are preferred. Similarly, inhibition can be achieved using triple helix base-pairing methodology. Triple helix pairing is useful because it causes inhibition of the ability of the double helix to open sufficiently for the binding of polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have been described in the literature. (See, e.g., Gee et al. In: Huber and Carr (1994) Molecular and Immunologic Approaches, Futura Publishing Co., Mt. Kisco N.Y., pp. 163-177.) [0075]
  • Ribozymes, enzymatic RNA molecules, may also be used to catalyze the cleavage of mRNA and decrease the levels of particular mRNAs, such as those comprising the polynucleotide sequences of the invention. (See, e.g., Rossi (1994) Current Biology 4: 469-471.) Ribozymes may cleave mRNA at specific cleavage sites. Alternatively, ribozymes may cleave mRNAs at locations dictated by flanking regions that form complementary base pairs with the target mRNA. The construction and production of ribozymes is well known in the art and is described in Meyers (supra). [0076]
  • RNA molecules may be modified to increase intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5′ and/or 3′ ends of the molecule, or the use of phosphorothioate or 2′-O-methyl rather than phosphodiester linkages within the backbone of the molecule. Alternatively, nontraditional bases such as inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases, may be included. [0077]
  • Further, an antagonist or an antibody that binds specifically to PSEQ may be administered to a subject to treat or prevent atherosclerosis. The antagonist, antibody, or fragment may be used directly to inhibit the activity of the polypeptide or indirectly to deliver a therapeutic agent to cells or tissues which express the PSEQ. An immunoconjugate comprising a PSEQ binding site of the antibody or the antagonist and a therapeutic agent may be administered to a subject in need to treat or prevent disease. The therapeutic agent may be a cytotoxic agent selected from a group including, but not limited to, abrin, ricin, doxorubicin, daunorubicin, taxol, ethidium bromide, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, dihydroxy anthracin dione, actinomycin D, diphtheria toxin, Pseudomonas exotoxin A and 40, radioisotopes, and glucocorticoid. [0078]
  • Antibodies to PSEQ may be generated using methods that are well known in the art. Such antibodies may include, but are not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments produced by a Fab expression library. Neutralizing antibodies, such as those which inhibit dimer formation, are especially preferred for therapeutic use. Monoclonal antibodies to PSEQ may be prepared using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These include, but are not limited to, the hybridoma, the human B-cell hybridoma, and the EBV-hybridoma techniques. In addition, techniques developed for the production of chimeric antibodies can be used. (See, e.g., Pound (1998) Immunochemical Protocols, Methods Mol. Biol. Vol. 80.) Alternatively, techniques described for the production of single chain antibodies may be employed. Fabs which contain specific binding sites for PSEQ may also be generated. Various immunoassays may be used to identify antibodies having the desired specificity. Numerous protocols for competitive binding or immunoradiometric assays using either polyclonal or monoclonal antibodies with established specificities are well known in the art. [0079]
  • Yet further, an agonist of PSEQ may be administered to a subject to treat or prevent a disease associated with decreased expression, longevity or activity of PSEQ. [0080]
  • An additional aspect of the invention relates to the administration of a pharmaceutical or sterile composition, in conjunction with a pharmaceutically acceptable carrier, for any of the therapeutic applications discussed above. Such pharmaceutical compositions may consist of PSEQ or antibodies, mimetics, agonists, antagonists, or inhibitors of the polypeptide. The compositions may be administered alone or in combination with at least one other agent, such as a stabilizing compound, which may be administered in any sterile, biocompatible pharmaceutical carrier including, but not limited to, saline, buffered saline, dextrose, and water. The compositions may be administered to a subject alone or in combination with other agents, drugs, or hormones. [0081]
  • The pharmaceutical compositions utilized in this invention may be administered by any number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means. [0082]
  • In addition to the active ingredients, these pharmaceutical compositions may contain pharmaceutically-acceptable carriers comprising excipients and auxiliaries which facilitate processing of the active compounds into preparations which can be used pharmaceutically. Further details on techniques for formulation and administration may be found in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing Co., Easton Pa.). [0083]
  • For any compound, the therapeutically effective dose can be estimated initially either in cell culture assays or in animal models such as mice, rats, rabbits, dogs, or pigs. An animal model may also be used to determine the concentration range and route of administration. Such information can then be used to determine useful doses and routes for administration in humans. [0084]
  • A therapeutically effective dose refers to that amount of active ingredient which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by standard pharmaceutical procedures in cell cultures or with experimental animals, such as by calculating and contrasting the ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% of the population) statistics. Any of the therapeutic compositions described above may be applied to any subject in need of such therapy, including, but not limited to, mammals such as dogs, cats, cows, horses, rabbits, monkeys, and most preferably, humans. [0085]
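  • One common way to contrast the two statistics is the ratio LD50/ED50, usually called the therapeutic index. The text above does not name this ratio, so the sketch below is an assumption about how the comparison might be made; the function name and dose values are hypothetical.

```python
def therapeutic_index(ld50, ed50):
    """Ratio of the dose lethal to 50% of the population (LD50) to the dose
    therapeutically effective in 50% of the population (ED50).
    Both doses must be expressed in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
print(therapeutic_index(ld50=500.0, ed50=25.0))  # 20.0
```

A larger ratio indicates a wider margin between the effective dose and the lethal dose.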
  • EXAMPLES
  • It is to be understood that this invention is not limited to the particular devices, machines, materials and methods described. Although particular embodiments are described, equivalent embodiments may be used to practice the invention. The described embodiments are provided to illustrate the invention and are not intended to limit the scope of the invention which is limited only by the appended claims. [0086]
  • I cDNA Library Construction
  • The cDNA library SMCCNOS01 was selected as an example to demonstrate the construction of cDNA libraries from which the polynucleotides associated with known atherosclerosis-associated genes were derived. The SMCCNOS01 subtracted coronary artery smooth muscle cell library was constructed using 7.56 × 10^6 clones from the SMCCNOT02 library and was subjected to two rounds of subtraction hybridization for 48 hours with 6.12 × 10^6 clones from SMCCNOT01. The SMCCNOT02 library was constructed using RNA isolated from coronary artery smooth muscle cells removed from a 3-year-old Caucasian male. The cells were treated for 20 hours with TNFα and IL-1β at 10 ng/ml each. The SMCCNOT01 library was constructed using RNA isolated from untreated coronary artery smooth muscle cells from the same donor. Subtractive hybridization conditions were based on the methodologies of Swaroop et al. (1991; Nucleic Acids Res. 19:1954) and Bonaldo et al. (1996; Genome Research 6:791). [0087]
  • For both cDNA libraries, SMCCNOT01 and SMCCNOT02, the frozen coronary artery smooth muscle cells (50-100 mg) were homogenized in GTC buffer (4.0 M guanidine thiocyanate, 0.1 M Tris-HCl pH 7.5, 1% 2-mercaptoethanol). Two volumes of binding buffer (0.4 M LiCl, 0.1 M Tris-HCl pH 7.5, 0.02 M EDTA) were added, and the resulting mixture was vortexed at 13,000 rpm. The supernatant was removed and combined with Oligo(dT)25-bound streptavidin particles (MPG). After rotation at room temperature, the mRNA-Oligo(dT)25-bound streptavidin particles were separated from the supernatant and washed twice with hybridization buffer I (0.15 M NaCl, 0.01 M Tris-HCl pH 8.0, 1 mM EDTA, 0.1% lauryl sarcosinate), using magnetic separation at each step to remove the supernatant from the particles. Bound mRNA was eluted from the particles with release solution and heated to 65° C. The supernatant containing eluted mRNA was magnetically separated from the particles and used to construct the cDNA libraries. [0088]
  • The RNA was used for cDNA synthesis and construction of the cDNA library according to the recommended protocols in the SUPERSCRIPT plasmid system (Life Technologies). The cDNAs were fractionated on a SEPHAROSE CL4B column (Amersham Pharmacia Biotech, Piscataway N.J.), and those cDNAs exceeding 400 bp were ligated into pINCY plasmid (Incyte Pharmaceuticals, Palo Alto, Calif.). Recombinant plasmids were transformed into DH5α competent cells or ELECTROMAX cells (Life Technologies). [0089]
  • II Isolation and Sequencing of cDNA Clones
  • Plasmid DNA was released from the cells and purified using the REAL Prep 96 plasmid kit (Qiagen, Valencia Calif.). The recommended protocol was employed except for the following changes: 1) the bacteria were cultured in 1 ml of sterile Terrific Broth (Life Technologies) with carbenicillin at 25 mg/L and glycerol at 0.4%; 2) after inoculation, the cells were cultured for 19 hours and then lysed with 0.3 ml of lysis buffer; and 3) following isopropanol precipitation, the plasmid DNA pellet was resuspended in 0.1 ml distilled water, and samples were transferred to a 96-well block for storage at 4° C. [0090]
  • The cDNAs were prepared using a MICROLAB 2200 System (Hamilton, Reno Nev.) in combination with the DNA ENGINE thermal cycler (MJ Research, Watertown Mass.). cDNAs were sequenced by the method of Sanger et al. (1975, J. Mol. Biol. 94:441f) using ABI PRISM 377 (PE Biosystems) or MEGABACE 1000 sequencing systems (Amersham Pharmacia Biotech). [0091]
  • Most of the sequences disclosed herein were sequenced using standard ABI protocols and kits (PE Biosystems) at solution volumes of 0.25x-1.0x concentrations. Alternatively, some of the sequences disclosed herein were sequenced using solutions and dyes from Amersham Pharmacia Biotech. [0092]
  • III Selection, Assembly, and Characterization of Sequences
  • The sequences used for co-expression analysis were assembled from EST sequences, 5′ and 3′ longread sequences, and full length coding sequences. Selected assembled sequences were expressed in at least three cDNA libraries. [0093]
  • The assembly process is described as follows. EST sequence chromatograms were processed and verified. Quality scores were obtained using PHRED (Ewing et al. (1998) Genome Res. 8:175-185; Ewing and Green (1998) Genome Res. 8:186-194), and edited sequences were loaded into a relational database management system (RDBMS). The sequences were clustered using BLAST with a product score of 50. All clusters of two or more sequences created a bin which represents one transcribed gene. [0094]
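  • The clustering step above can be sketched as a grouping of sequences whose pairwise BLAST product score meets the threshold. This is a minimal sketch under stated assumptions: the pairwise scores are taken as already computed, the threshold is treated as inclusive, and sequences are linked transitively (single linkage) with a union-find structure. None of these implementation choices are stated in the text, and the identifiers are hypothetical.

```python
def cluster_into_bins(seq_ids, pair_scores, min_score=50):
    """Group sequences linked by a product score >= min_score and keep only
    clusters of two or more sequences as bins."""
    parent = {s: s for s in seq_ids}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in pair_scores.items():
        if score >= min_score:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for s in seq_ids:
        clusters.setdefault(find(s), []).append(s)
    # Singletons do not form a bin.
    return sorted(sorted(members) for members in clusters.values()
                  if len(members) >= 2)


# Hypothetical EST identifiers and product scores.
ests = ["est1", "est2", "est3", "est4", "est5", "est6"]
scores = {
    ("est1", "est2"): 80,
    ("est2", "est3"): 55,
    ("est3", "est4"): 30,
    ("est5", "est6"): 49,
}
bins = cluster_into_bins(ests, scores)
print(bins)  # [['est1', 'est2', 'est3']]
```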
  • Assembly of the component sequences within each bin was performed using a modification of Phrap, a publicly available program for assembling DNA fragments (Green, P. University of Washington, Seattle Wash.). Bins that showed 82% identity from a local pair-wise alignment between any of the consensus sequences were merged. [0095]
  • Bins were annotated by screening the consensus sequence in each bin against public databases, such as GBpri and GenPept from NCBI. The annotation process involved a FASTn screen against the GBpri database in GenBank. Those hits with a percent identity of greater than or equal to 75% and an alignment length of greater than or equal to 100 base pairs were recorded as homolog hits. The residual unannotated sequences were screened by FASTx against GenPept. Those hits with an E value of less than or equal to 10^-8 were recorded as homolog hits. [0096]
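  • The two-stage annotation screen can be sketched as a pair of filters applied in order. The thresholds come from the paragraph above; the record layout (dictionaries with `percent_identity`, `alignment_length`, and `e_value` keys) and the function names are hypothetical, and the sketch assumes the search results have already been parsed.

```python
def screen_nucleotide_hits(hits, min_identity=75.0, min_length_bp=100):
    """Keep nucleotide hits with >= 75% identity over >= 100 base pairs."""
    return [h for h in hits
            if h["percent_identity"] >= min_identity
            and h["alignment_length"] >= min_length_bp]


def screen_protein_hits(hits, max_e_value=1e-8):
    """Keep protein hits with an E value <= 1e-8."""
    return [h for h in hits if h["e_value"] <= max_e_value]


def annotate_bin(nucleotide_hits, protein_hits):
    """Try the nucleotide screen first; fall back to the protein screen only
    when no nucleotide homolog hit is recorded."""
    kept = screen_nucleotide_hits(nucleotide_hits)
    if kept:
        return "nucleotide", kept
    kept = screen_protein_hits(protein_hits)
    if kept:
        return "protein", kept
    return "unannotated", []


# Hypothetical parsed hits for one consensus sequence.
nt_hits = [
    {"subject": "hitA", "percent_identity": 74.9, "alignment_length": 500},
    {"subject": "hitB", "percent_identity": 75.0, "alignment_length": 100},
]
aa_hits = [
    {"subject": "hitC", "e_value": 1e-8},
    {"subject": "hitD", "e_value": 1e-5},
]
source, kept = annotate_bin(nt_hits, aa_hits)
print(source, [h["subject"] for h in kept])  # nucleotide ['hitB']
```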
  • Sequences were then reclustered using BLASTn followed by Cross-Match, a program for rapid amino acid and nucleic acid sequence comparison and database search (Green, supra). Any BLAST alignment between a sequence and a consensus sequence with a score greater than 150 was realigned using Cross-Match. The sequence was added to the bin whose consensus sequence gave the highest Smith-Waterman score (Smith et al. (1992) Protein Engineering 5:35-51) amongst local alignments with at least 82% identity. Non-matching sequences were moved into new bins, and assembly processes were repeated. [0097]
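  • The bin assignment rule can be sketched as a selection over candidate alignments. The thresholds are those given above; the record layout and function name are hypothetical, and the sketch assumes each candidate already carries its BLAST score together with the Smith-Waterman score and percent identity from realignment.

```python
def assign_bin(alignments, min_blast_score=150, min_identity=82.0):
    """Return the bin whose consensus gives the highest Smith-Waterman score
    among realigned candidates with at least 82% identity, or None when no
    candidate qualifies (the sequence would then start a new bin)."""
    eligible = [
        a for a in alignments
        if a["blast_score"] > min_blast_score   # only these are realigned
        and a["identity"] >= min_identity
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda a: a["sw_score"])["bin_id"]


# Hypothetical candidate alignments for one sequence.
candidates = [
    {"bin_id": "bin1", "blast_score": 200, "sw_score": 310, "identity": 90.0},
    {"bin_id": "bin2", "blast_score": 400, "sw_score": 500, "identity": 80.0},
    {"bin_id": "bin3", "blast_score": 120, "sw_score": 900, "identity": 99.0},
    {"bin_id": "bin4", "blast_score": 180, "sw_score": 350, "identity": 85.0},
]
print(assign_bin(candidates))  # bin4
```

Here bin2 is excluded for low identity and bin3 for a BLAST score at or below 150, so the choice falls between bin1 and bin4 on Smith-Waterman score.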
  • IV Coexpression Analyses of Atherosclerosis-Associated Genes
  • Sixty-six known atherosclerosis-associated genes were selected to identify novel genes that are closely associated with atherosclerosis. The known atherosclerosis-associated genes which were examined in this analysis and brief descriptions of their functions are listed in Table 3. [0098]
    TABLE 3
    Descriptions of Known Atherosclerosis-Associated Genes
    GENE DESCRIPTION AND REFERENCES
    Human 22kDa Smooth muscle cell-specific gene which is down-regulated during smooth muscle
    smooth muscle cell dedifferentiation as part of atherogenic process (Sobue et al. (1998) Horm.
    protein (SM22) Res. 50 Suppl. 2:15-24; Sobue et al. (1999) Mol. Cell. Biochem. 190(1-2): 105-
    18)
    calponin (CNN1) Calponin is smooth muscle-specific and may mediate smooth muscle contractility
    through it's binding of the amino-terminal end of the myosin regulatory light
    chain. Involved in phenotypic modulation of smooth muscle cells, a feature of
    atherosclerosis (Szymanski et al. (1999) Biochemistry 38(12): 3778-84)
    desmin Contractile component of myofibrils in differentiated smooth muscle cells.
    (DES) Regarded as a marker for smooth muscle cells (Shi et al. (1997) Circulation
    95(12): 2684-93)
    smooth muscle Contractile component of myofibrils in differentiated smooth muscle cells.
    myosin heavy Regarded as a marker for smooth muscle cells (Sobue et al. (1999) Mol. Cell.
    chain (MYH11) Biochem. 190(1-2): 105-18)
    alpha Contractile component of myofibrils in differentiated smooth muscle cells (Sobue
    tropomyosin et al. (1999) Mol. Cell. Biochem. 190(1-2): 105-18; Kashiwada et al. (1997) J.
    (TPM1) Biol. Chem. 272(24): 15396-404)
    Human tissue TIMPs control the activity of matrix metalloproteinases and are important in local
    inhibitor of matrix remodeling of vasculature. Atheroma extracts shown to have 5× higher
    metallo- TIMP3 expression levels than non-atherosclerotic tissue. Abundant TIMP1, 2, 3
    proteinase 3 expression noted in plaque macrophages and smooth muscle cells. PDGF and
    (TIMP3) TGFbeta augment TIMP3 expression. TIMP3 possible important role in plaque
    stability (Fabunmi et al. (1998) Circ. Res. 83(3): 270-8)
    Human tissue TIMPs control the activity of matrix metalloproteinases and are important in local
    inhibitor of matrix remodeling of vasculature. Abundant TIMP1, 2, 3 expression noted in
    metallo- plaque macrophages and smooth muscle cells. Expression of TIMP2 is greatly
    proteinase-2 increased during neointima formation in organ cultures of human saphenous vein
    (TIMP-2) (Kranzhofer et al. (1999) Arterioscler. Thromb. Vasc. Biol. 19(2): 255-65)
    Human tissue TIMPs control the activity of matrix metalloproteinases and are important in local
    inhibitor of matrix remodeling of vasculature (Greene et al. (1996) J. Biol. Chem. 271(48):
    metallo- 30375-80)
    proteinase-4
    (TIMP4)
    pro alpha 1(I) Member of family of fibrous structural proteins. Most abundant structural
    collagen component of the extracellular matrix. Secreted as procollagen and converted to
    (COL1A1) collagen by matrix metalloproteinases. Collagens are important in atherosclerosis
    for promoting platelet aggregation and for providing sites for platelet adhesion to
    the vessel wall (Wen et al. (1999) Arterioscler. Thromb. Vasc. Biol. 19(3): 519-
    24)
    collagen alpha-2 see COL1A1 above
    type I (COL1A2)
    COL6A1 see COL1A1 above
    procollagen see COL1A1 above
    alpha 2(V)
    (COL5A2)
    collagen VI see COL1A1 above
    alpha-2
    (COL6A2)
    type VI collagen see COL1A1 above
    alpha3
    (COL6A3)
    pro-alpha-1 type see COL1A1 above
    3 collagen
    (COL3A1)
    pro-alpha-1 (V)
    collagen see COL1A1 above
    (COL3A1
    collagenase type Contributes to the degradation of vascular wall/smooth muscle cells associated
    IV/matrix with local matrix remodeling. Expression of metalloproteinases controlled by
    metallo- tissue inhibitors of metalloproteinases (TIMPs). Balance between MMP and
    proteinase TIMP expression becomes distorted during onset and progression of
    9/gelatinase B atherosclerosis. MMP9 localized to lesional macrophages, along with MMP-1,
    (MMP9) MMP-2, MMP-3. Rabbit aortic macrophage foam cells express immunoreactive
    MMP-9 (Moreau et al. (1999) Circulation 99(3): 420-426; Zaltsman et al. (1997)
    Atherosclerosis 130(1-2): 61-70)
    matrix Gla Role in active calcification of vascular smooth muscle cells, suggested by
    protein (MGP) expression study on VSMC in vitro differentiation study. Calcifying phenotype
    associated with high MGP levels. MGP knockout mice develop to term, but die
    up to 2 months after birth due to extensive calcification of the arteries, causing
    blood vessel rupture (Luo et al. (1997) Nature 386(6620): 78-81; Mori et al.
    (1998) FEBS Lett. 433(1-2): 19-22)
    cathepsin K Nonmetalloenzyme, potent elastase present in advanced atherosclerotic plaques.
    (CTSK) Contributes to the breakdown of components of vascular extracellular matrix,
    reducing tensile strength, increasing plaque vulnerability (Sukhova et al. (1998) J.
    Clin. Invest. 102(3): 576-83)
    fibrinogen beta Component of fibrin in the extracellular matrix. Fibrin deposition is an integral
    chain gene part of advanced atherosclerotic lesion development. Variation at the beta
    (FGB) fibrinogen locus associated with peripheral atherosclerosis (Sueishi et al. (1998)
    Semin. Thromb. Hemost. 24(3): 255-260; Fowkes et al. (1992) Lancet 339 (8795):
    693-696)
    fibrinogen beta Participant in adhesion and aggregation of platelets which occurs through binding
    chain gene of platelet receptors. FGG carries the main binding site for the platelet receptor
    (FGG) binding. Mutations in FGG associated with clotting defects and thrombotic
    tendency. Fibrin deposition is an integral part of advanced atherosclerotic lesion
    development (Sueishi et al. (1998) Semin. Thromb. Hemost. 24 (3): 255-60; Cote
    et al. (1998) Blood 92(7): 2195-2212)
    pre-pro-von Blood glycoprotein involved in normal hemostasis. Mediates adhesion of platelets
    Willebrand to sites of vascular damage. Also acts as a cofactor in factor VIII activity in blood
    factor (VWF) coagulation. Increased levels of VWF are found in atherosclerosis and in several
    of its major risk factors, including hypercholesterolemia, diabetes, obesity,
    hypertension. Levels serve as a predictor of adverse clinical outcome following
    vascular surgery, possibly as an indicator of thrombus formation (Sadler, J. E.
    (1998) Annu Rev Biochem 67: 395-424; Blann et al. (1994) Eur. J. Vasc. Surg.
    8(1): 10-15; Kessler et al. (1998) Diabetes Metab. 24(4): 327-36; Folsom et al.
    (1997) Circulation 96(4): 1102-1108)
    coagulation Central role in blood hemostasis by regulating platelet aggregation and blood
    factor II/ coagulation. Converts fibrinogen to fibrin in the final stage of clotting cascade.
    prothrombin (F2) Promotes cellular chemotaxis and proliferation, extracellular matrix turnover and
    release of inflammatory cytokines (Goldsack et al. (1998) Int. J. Biochem. Cell.
    Biol. 30(6): 641-646)
    Coagualtion Activation of blood coagulation is an important part of post-vascular injury with
    factor XII (F12) initiation of atherosclerotic lesion formation and contributes to thrombosis in
    advanced stage atherosclerosis (Sueishi et al. (1998) Semin. Thromb. Hemost.
    24(3): 255-260)
    coagulation Central role in coagulation, influences plasma triglyceride levels, a risk factor in
    factor VII(F7) atherosclerosis. Epidemiological studies have linked F7 with cardiovascular
    risk/atherothrombotic tendency (Ghaddar et al. (1998) Circulation 98(25): 2815-
    2821; Koenig, W. (1998) Eur. Heart J. 19 Suppl C: C39-43; Folsom et al. (1997)
    Circulation 96(4): 1102-1108)
    platelet Signalling molecule in the migration of cells as part of the pathophysiology of
    endothelial cell vascular occulsive diseases such as atherosclerosis. Analysis of
    adhesion endothelial/monocyte co-cultures indicates oxidative stress induces
    molecule transendothelial migration of monocytes as a result of phosphorylation of
    (PECAM-1) PECAM-1 (Rattan et al. (1997) Am. J. Physiol. 273(3 Pt 1): E453-61)
    lipoprotein- Natural anticoagulant, inhibits factor VII/tissue factor complexes. Role in
    associated regulating coagulation in atherosclerotic plaques. Circulates in association with
    coagulation plasma lipoproteins VLDL, HDL and LDL. In situ expression studies indicate
    inhibitor (LACI) TFPI is expressed in adventitial layer of large arteries, and in atherosclerotic
    vessels is expressed by macrophages in focal areas throughout the plaque (Drew et
    al. (1997) Lab. Invest. 77(4): 291-298; Sandset, P.M. (1996) Haemostasis 26
    Suppl 4: 154-165)
    antithrombin III ATIII is the sole blood component through which heparin exerts its anti
    variant (AT3) coagulation effect. Deficiency in ATIII causes recurrent venous thrombosis and
    pulmonary embolism and can be inherited in autosomal dominant fashion (Hultin
    et al. (1988) Thromb. Haemost. 59(3): 468-73; Lane et al. (1996) Blood Rev.
    10(2): 59-74)
    plasminogen Major physiological inhibitor of fibrinolysis. Plasma levels correlate with
    activator incidence of MI and venous thrombosis. Both adipocytes and endothelial cells
    inhibitor-1 produced PAI, possibly under the control of PPARG, as demonstrated using
    (PAI-1) recombinant PPARG expression constructs in endothelial cell lines. Increased
    expression of PAI observed in coronary heart disease. 4G polymorphism in
    promotor causes increased PAI expression associated with MI in some studies
    (Eriksson et al. (1995) Proc. Natl. Acad. Sci. USA 92(6): 1851-5; Marx et al.
    (1999) Arterioscler. Thromb. Vasc. Biol. 19(3): 546-551)
    lipoprotein lipase Hydrolises triglyceride in chylomicrons and therefore regulates metabolism of
    (LPL) circulating lipoproteins. Appears to have an atherogenic effect on the arterial wall
    due to its ability to alter the properties of LDL. Increased activity of LPL is found
    in atherosclerotic arteries when compared to normal. Expressed by macrophages
    in atherosclerotic lesions. Mutations in LPL responsible for familial
    hypercholesterolemia and premature atherosclerosis (Fisher. et al. (1997)
    Atherosclerosis 135(2): 145-159; Goldberg, I. J. (1996) J. Lipid Res. 37(4): 693-
    707; Gerdes et al. (1997) Circulation 96(3): 733-740)
    alpha-2- Foam cell formation - retains LDL cholesterol in the lipid core of atherosclerotic
    macroglobulin plaque (Llorente et al. (1998). Rev. Esp. Cardiol. 51(8): 633-641)
    (A2M)
    apolipoprotein Participates in reverse cholesterol transport from tissues to the liver. Promotes
    AI (APOA1) cholesterol efflux from tissues and acts as a cofactor for lecithin cholesterol
    acyltransferase (LCAT). Mutations in ApoA1 and of ApoAI/CIII/AIV gene
    cluster assoc with atherosclerosis. Transgenic mice expressing high plasma
    APOAI levels are protected from fatty streak development with a high atherogenic
    diet (Gordon et al. (1989) Circulation 79(1): 8-15; Rubin et al. (1991) Nature
    353(6341): 265-7; Karathanasis et al. (1987) Proc. Natl. Acad. Sci. USA 84(20):
    7198-7202)
    apolipoprotein Major component of HDL. Appears to have an opposite effect to that of APOAI,
    AII (APOA2) though exact function unknown. APOAII may have ability to convert HDL from
    an anti- to a pro-inflammatory particle, with paraoxonase having a role in this
    transformation process. Plasma APOAII levels significantly associated with
    plasma free fatty acid levels. Transgenic mice expressing varying levels of
    APOAII show increased atherosclerotic lesions than wt when fed an atherogenic
    diet. Possible interaction between diet/genotype and atherogenic potential
    (Escola-Gil et al. (1998) J. Lipid Res. 39(2): 457-462; Warden et al. (1993) Proc.
    Natl. Acad. Sci. USA 90(22): 10886-10890)
    apolipoprotein B- Main apolipoprotein of chylomicrons and low density lipoproteins. Mutations in
    100 (APOB) APOB100 underly familial defective apolipoprotein B-100 in which patients
    suffer from premature atherosclerosis. Mutations result in defect in binding of
    LDL to LDL receptor, and accumulation of plasma LDL. High-expressing APOB
    transgenic mice exibit elevated VLDL-LDL cholesterol and atherogenic lesions
    (Callow et al. (1995) J Clin Invest 96(3): 1639-1646; Brasaemle et al. (1997) J.
    Biol. Chem. 272(14): 9378-9387)
    lipoprotein Role in lipoprotein metabolism. Cofactor in the activity of lipoprotein lipase the
    apoCII (APOC2) enzyme that hydrolyzes triglycerides in plasma and transfers the fatty acids to
    tissues. Mutations in APOC2 responsible for hyperlipoproteinemia 1B. similar to
    lipoprotein lipase deficiency (Cox et al. (1978) N. Engl. J. Med. 299(26): 1421-
    1424; Arimoto et al. (1998) J. Lipid Res. 39(1): 143-151)
    pre- Inhibits lipoprotein lipase and hepatic lipase, decreases uptake of lymph
    apolipoprotein chylomicrons by hepatic cells. APOA3 possibly delays breakdown of triglyceride
    CIII (APOC3) rich particles. SstI RFLP in apoCIII is associated with plasma triglyceride and
    apoCIII levels and hyperlipidemic phenotypes (Henderson et al. (1987) Hum.
    Genet. 75(1): 62-65)
    apolipoprotein APOC4 is a lipid-binding protein that has the potential to alter lipid metabolism.
    apoC-IV Human APOC4 transgenic mice are hypertriglyceridaemic compared to normal
    (APOC4) controls (Allan et al. (1996) J. Lipid Res. 37(7): 1510-1518)
    macrophage Mediates binding, internalisation and processing of negatively-charged
    scavenger macromolecules. Implicated in the pathological deposition of cholesterol in
    receptor type I arterial walls during atherogenesis (Han et al. (1998) Hum. Mol. Genet. 7(6):
    (MSR1) 1039-1046)
    Human antigen Acts as a scavenger receptor for oxidised LDL. Transient regulation under control
    CD36 gene of M-CSF during monocyte-macrophage differentiation increases foam cell
    (CD36) accumulation. Possible role in atherogenesis: increased M-CSF levels detected in
    atherosclerotic lesions in rabbits and humans (Huh et al. (1996) Blood 87(5):
    2020-2028; Aitman, et al. (1999) Nat. Genet. 21(1): 76-83)
    serum amyloid P Plasma glycoprotein expressed in atherosclerotic lesions. Interacts with
    component lipoproteins in specific manner (Li et al. (1995) Arterioscler. Thromb. Vasc. Biol.
    (SAP) 15(2): 252-257; Li et al. (1998) Biochem. Biophys. Res. Commun. 244(1): 249-
    252)
    carboxyl ester CEL gene expression increases in presence of oxidised and native LDL in vitro. It
    lipase gene is expressed in the vessel wall and in aortic extracts - may interact with cholesterol
    (CEL) to modulate progression of atherosclerosis (Li et al. (1998) Biochem. J. 329(Pt 3):
    675-679)
    paraoxonase 1 Serum esterase exclusively associated with high-density lipoproteins; it might
    (PON1) confer protection against coronary artery disease by destroying pro-inflammatory
    oxidized lipids in oxidized low-density lipoproteins. PON1 gln192-to-arg
    polymorphism associated with CAD. Association between PON1 genetic
    variation and plasma LDL, HDL and non-HDL and apoB levels in genetically
    isolated Alberta Hutterite population. When fed on a high-fat, high-cholesterol
    diet, PON1-null mice were more susceptible to atherosclerosis than wild-type
    (Serrato et al. (1995) J. Clin. Invest. 96(6): 3005-3008; Boright et al. (1998)
    Atherosclerosis 139(1): 131-136; Shih et al. (1998) Nature 394(6690): 284-287)
    paraoxonase 2 Serum esterase exclusively associated with high-density lipoproteins; it might
    (PON2) confer protection against coronary artery disease by destroying pro-inflammatory
    oxidized lipids in oxidized low-density lipoproteins. Common polymorphism at
    codon 311 (cys-ser) in PON2 associated with CHD alone and synergistically with
    the 192 polymorphism in PON1 in Asian Indians. Association between genetic
    variation in PON2 and plasma cholesterol and apolipoprotein A1 in genetically
    isolated Alberta Hutterite population (Sanghera et al. (1998) Am. J. Hum. Genet.
    62(1): 36-44; Boright et al. (1998) Atherosclerosis 139(1): 131-136)
    paraoxonase 3 Serum esterase exclusively associated with high-density lipoproteins; it might
    (PON3) confer protection against coronary artery disease by destroying pro-inflammatory
    oxidized lipids in oxidized low-density lipoproteins. Other members PON2, 3
    associated with CHD and cholesterol levels (Laplaud et al. (1998) Clin. Chem.
    Lab. Med. 36(7): 431-441)
    LDL-receptor Possible important role in atherosclerotic lesion development. Abundant
    related protein expression of mRNA and protein found in vascular smooth muscle cells and
    (LRP1) macrophages of early and advanced atherosclerotic lesions. Receptor for uptake
    of ApoE-containing lipoprotein particles (Beisiegel. et al. (1989) Nature
    341(6238): 162-164; Hiltunen et al. (1998) Atherosclerosis 137 Suppl: S81-88)
    hepatic Hepatic lipase is involved in cholesterol efflux. Downstream of cholesterol ester
    triglyceride transfer protein in pathway: acts on triglyceride-rich HDL to promote formation of
    lipase (HTGL) smaller HDL particles - effectors of cellular cholesterol efflux (Fan et al. (1998) J.
    Atheroscler. Thromb. 5(1): 41-45; Santamarina-Fojo et al. (1998) Curr. Opin.
    Lipidol. 9(3): 211-219)
    3-hydroxy-3- Catalyses rate limiting step in cholesterol biosynthesis as well as being involved in
    methylglutaryl other systems (e.g. primordial germ cell migration). Expression of HMG CoA
    coenzyme A reductase is regulated by oxysterols via sterol-regulatory element in the promoter,
    reductase as is found in APOE. Target for cholesterol-lowering therapies: pravastatin,
    (HMGCR) “statins” (Bocan et al. (1998) Atherosclerosis 139(1): 21-30; Farnier et al. (1998)
    Am. J. Cardiol. 82(4B): 3J-10J)
    very low density Role in triglyceride metabolism. Marked induction of VLDLR expression
    lipoprotein observed in fatty streaks and plaques in rabbit atherosclerosis models (Hiltunen et
    receptor al. (1998) Circulation 97(11): 1079-1086)
    (VLDLR)
    Microsomal Catalyses transport of triglyceride, cholesterol ester and phospholipid between
    triglyceride phospholipid surfaces. Mutations cause abetalipoproteinemia. Linkage found
    transfer protein between MTP genotype and plasma triglyceride levels in a quantitative sib-pair
    (MTP) analysis of female dizygotic twins. Inhibitors of MTP normalise atherogenic
    lipoprotein profiles in an atherosclerotic rabbit model (Wetterau et al. (1992)
    Science 258(5084): 999-1001; Austin et al. (1998) Am. J. Hum. Genet. 62(2):
    406-419; Wetterau et al. (1998) Science 282(5389): 751-754)
    perilipin (PLIN) Lipid storage droplets of steroidogenic cells are surrounded by perilipins, family
    of phosphorylated proteins encoded by a single gene, detected in adipocytes and
    steroidogenic cells. Possible role in lipid metabolism (Brasaemle et al. (1997) J.
    Biol. Chem. 272(14): 9378-9387)
    endothelin-1 Secretion of EDN1 coincides with the location of native and oxidised low density
    (EDN1) lipoproteins and occurs in a specific fashion suggesting that EDN1 may be
    involved in pathophysiological processes such as atherogenesis. Quantitative and
    qualitative immunohistochemical analysis of anti EDN1 antibodies in the wall
    layers of human arteries ex vivo suggest that EDN1 is normally expressed
    exclusively in endothelial cells. However, in cases of coronary artery disease and
    atherosclerosis, EDN1 expression is enhanced and can be found in the tunica
    media and vascular smooth muscle cells. Analysis of recombinant EDN1
    expression in vitro suggests it influences vascular smooth muscle cell
    proliferation. Potent vasoconstriction properties (Unoki et al. (1999) Cell Tissue
    Res. 295(1): 89-99; Rossi et al. (1999) Circulation 99(9): 1147-1155; Yoshizumi
    et al. (1998) Br. J. Pharmacol. 125(5): 1019-1027; Alberts et al. (1994) J. Biol.
    Chem. 269(13): 10112-10118)
    endothelin Mediates action of endothelin1 on vascular smooth muscle migration, proliferation
    receptor A and monocyte/endothelial cell interaction during initiation and progression of
    (EDNRA) atherosclerotic lesion development (Kohno et al. (1998) J. Cardiovasc. Pharmacol.
    31 Suppl 1: S84-9; Alberts et al. (1994) J. Biol. Chem. 269(13): 10112-10118)
    interleukin 6 Inflammatory cytokine present in arterial atherosclerotic wall which is upregulated
    (IL6) by platelets to stimulate smooth muscle cell growth. Increased expression of IL6
    in atherosclerotic aortas of APOE knockout vs aortas from age-matched controls.
    Secretion levels of IL6 are positively associated with increased lesion surface area
    in APOE aortic tissue samples (Sukovich, et al. (1998) Arterioscler. Thromb.
    Vasc. Biol. 18(9): 1498-1505; Loppnow, et al. (1998) Blood 91(1): 134-141)
    interleukin 1 May contribute to regulation of local pathogenesis in the vessel wall by activation
    (IL1) of the cytokine regulatory network. IL-1 antagonist inhibits platelet-induced
    cytokine production of smooth muscle cells (Loppnow et al. (1998) Blood 91(1):
    134-141)
    complement Complement activation of C8 shown to be an initial event in atherogenesis
    protein C8 alpha (Torzewski et al. (1996) Arterioscler. Thromb. Vasc. Biol. 16(5): 673-677)
    (C8A)
    complement Complement activation of C9 shown to be an initial event in atherogenesis
    component C9 (Torzewski et al. (1996) Arterioscler. Thromb. Vasc. Biol. 16(5): 673-677)
    (C9)
    Prostaglandin D2 Catalyses conversion of PGH2 to PGD2, a prostaglandin important in smooth
    synthase muscle contraction/relaxation and potent inhibitor of platelet aggregation.
    (PTGDS) Northern analysis shows strong specific expression in heart.
    Immunocytochemical localisation to myocardial and atrio endocardial cells, and
    accumulates in end-stage atherosclerotic plaques. High plasma levels detected in
    severe angina patients (Eguchi et al. (1997) Proc. Natl. Acad. Sci. USA 94(26):
    14689-14694)
    Annexin Inhibits phospholipase A2 activity and hence the production of arachidonic acid,
    II/lipocortinII the precursor of the inflammatory mediators prostaglandins and leukotrienes.
    (ANX2) ANX2 is an important anti-inflammatory molecule. Independently binds
    plasminogen and t-PA and therefore suspected of having a role in atherogenesis.
    Binding of plasminogen to ANX2 is specifically inhibited by the excess
    atherogenic Lp(a) (Hajjar et al. (1998) J. Investig. Med. 46(8): 364-369)
    Annexin Inhibits phospholipase A2 activity and hence the production of arachidonic acid,
    I/lipocortin the precursor of the inflammatory mediators prostaglandins and leukotrienes.
    (ANX1) ANX1 is an important anti-inflammatory molecule (Wallner, et al. (1986) Nature
    320(6057): 77-81)
    Prostaglandin- Major mechanism for the regulation of prostaglandin synthesis. Arachidonic acid
    endoperoxide pathway. Role in inflammation and endothelial cell migration/angiogenesis.
    Synthase 2 Regulated enzyme - major mediator of inflammation. Antiinflammatory
    (PTGS2) glucocorticoids are potent inhibitors of this cyclooxygenase. Over expression of
    PTGS2 in vitro in rabbit epithelial cells causes increased adhesion to extracellular
    matrix proteins and inhibition of apoptosis, hallmarks of atherosclerotic plaque
    formation (Morham et al. (1995) Cell 83(3): 473-482; O'Banion et al. (1992) Proc.
    Natl. Acad. Sci. USA 89(11): 4888-4892; Tsujii et al. (1995) Cell 83(3): 493-501)
    insulin-like A study of 218 individuals indicates free IGFBP1 levels are associated with high
    growth factor HDL cholesterol and more favourable cardiovascular outcome. The
    binding protein-1 IGF1/IGFBP1 system found to be associated with cardiovascular risk and
    (IGFBP-1) atherosclerosis (Janssen et al. (1998) Arterioscler. Thromb. Vasc. Biol. 18(2):
    277-282)
    Secreted protein, Extracellular glycoprotein secreted by endothelial cells which has a suspected role
    acidic and rich in in calcification of atherosclerotic plaques. Interacts with PDGF-B containing
    cysteine dimers and inhibits binding to its receptors. Expression of SPARC and PDGF is
    (SPARC) minimal in most adult tissues, but is enhanced following injury and advanced
    atherosclerotic lesions. Selective expression of SPARC causes rounding of
    adherent endothelial cells and influences extravasation of macromolecules (Raines
    et al. (1992) Proc. Natl. Acad. Sci. USA 89(4): 1281-1285; Goldblum et al. (1994)
    Proc. Natl. Acad. Sci. USA 91(8): 3448-3452)
    Human NF- Activated NF kappa B occurs in atherosclerotic lesions, and regulates the
    kappa-B expression of genes important in recruitment of monocytes and inflammatory
    transcription response. Responsible for cytokine production by smooth muscle cells during
    factor (NFkB) atherogenesis (Navab et al. (1995) Am. J. Cardiol. 76(9): 18C-23C; Hernandez-
    Presa et al. (1998) Am J Pathol 153(6): 1825-1837; Thurberg et al. (1998) Curr.
    Opin. Lipidol. 9(5): 387-396; Brand et al. (1997) Arterioscler. Thromb. Vasc.
    Biol. 17(10): 1901-1909)
    angiotensinogen Concentration of angiotensinogen influences the renin-angiotensin system (RAS).
    (AGT) Hypertensive mice carrying renin and angiotensinogen transgenes found to have
    higher total cholesterol levels on an atherogenic diet than their wt counterparts,
    and atherogenic lesions were 4× larger in surface area. Suggests hypertension
    induced by activated RAS is important atherogenic factor (Sugiyama et al. (1997)
    Lab. Invest. 76(6): 835-842)
    Nitric Oxide Mediates basal vasodilation. Regulates the production of nitric oxide, an
    Synthase 3 important signal transduction component and scavenger of reactive oxygen
    (NOS3) species. Activity of NOS3 appears to be a factor in endothelin/endothelin receptor
    B mediated endothelial cell migration and angiogenesis. Polymorphism associated
    with smoking dependent coronary artery disease (Goligorsky et al. (1999) Clin.
    Exp. Pharmacol. Physiol. 26(3): 269-271; Stroes et al. (1998) J. Cardiovasc.
    Pharmacol. 32 Suppl 3: S14-21; Sobue, et al. (1998) Horm. Res. 50 Suppl 2:15-
    24)
    Nitric Oxide Mediates basal vasodilation. Regulates the production of nitric oxide, an
    Synthase 2 important signal transduction component and scavenger of reactive oxygen
    (NOS2) species. NOS2, known as inducible NOS is expressed in most cells only after
    induction by immunologic and inflammatory stimuli, and is upregulated in
    pathological conditions such as atherosclerosis (Dusting et al. (1998) Clin. Exp.
    Pharmacol. Physiol. Suppl. 25: S34-41)
  • From a total of 45,233 assembled gene sequences, 34 novel genes, SEQ ID NOs: 1-34, were identified that show strong association with 66 known atherosclerosis-associated genes. Initially, the degree of association was measured by probability values, using a cutoff p-value of less than 0.00001. The sequences were further examined to ensure that the genes that passed the probability test had strong association with known atherosclerosis-associated genes. Details of the co-expression patterns for the 66 known and 34 novel atherosclerosis-associated polynucleotides are presented in Table 4. The entries in Table 4 are the negative log of the p-value (−log p) for the co-expression of the two genes. The novel atherosclerosis-associated polynucleotides are listed in the table by their SEQ ID NOs, and the known genes by their names or the abbreviations shown in Table 3. [0099]
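  • The cutoff and the Table 4 entries are related by a simple transformation: p < 0.00001 corresponds to −log p > 5 if the logarithm is taken base 10 (an assumption; the base is not stated here). The sketch below shows that conversion, together with one way such a p-value could be obtained, namely a one-sided Fisher exact test on presence/absence counts across cDNA libraries. The choice of test, the function names, and the example counts are assumptions made for illustration and are not taken from this section.

```python
from math import comb, log10

CUTOFF_P = 1e-5  # cutoff p-value stated in the text


def fisher_one_sided(both, only_a, only_b, neither):
    """Probability of seeing at least `both` co-occurrences by chance.

    The four counts are numbers of libraries in which genes A and B are
    both detected, only A, only B, or neither (hypothetical inputs).
    """
    n_a = both + only_a                  # libraries containing gene A
    n_b = both + only_b                  # libraries containing gene B
    total = both + only_a + only_b + neither
    upper = min(n_a, n_b)
    # Upper tail of the hypergeometric distribution.
    tail = sum(
        comb(n_a, k) * comb(total - n_a, n_b - k)
        for k in range(both, upper + 1)
    )
    return tail / comb(total, n_b)


def neg_log_p(p):
    """Negative base-10 logarithm of a p-value (base 10 is an assumption)."""
    return -log10(p)


def passes_cutoff(p):
    """True when the p-value is below the stated cutoff."""
    return p < CUTOFF_P


# Hypothetical counts, for illustration only.
p_example = fisher_one_sided(both=20, only_a=2, only_b=3, neither=500)
print(neg_log_p(p_example), passes_cutoff(p_example))
```

  • A one-sided test is used in this sketch because only over-representation of co-occurrence is of interest; a two-sided variant would need a different tail definition.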
    TABLE 4
    Co-expression of 34 novel genes with 66 known atherosclerosis genes. (−log p)
    SM22 CNN1 DES MYH11 TPM1 TIMP3 TIMP2 TIMP4 COL1A1 COL1A2 COL6A1 COL5A2
    SEQ ID NO:1 0 0 1 0 0 0 1 0 1 0 0 0
    SEQ ID NO:2 3 1 1 2 4 4 5 0 1 3 2 2
    SEQ ID NO:3 0 0 0 0 1 1 2 1 1 0 1 1
    SEQ ID NO:4 1 1 1 1 0 0 0 1 0 1 1 1
    SEQ ID NO:5 0 1 0 0 1 0 0 1 0 1 0 1
    SEQ ID NO:6 1 0 1 0 1 0 0 0 1 1 0 1
    SEQ ID NO:7 1 0 1 0 0 0 0 1 1 1 1 0
    SEQ ID NO:8 9 5 8 7 6 4 7 0 9 12 13 3
    SEQ ID NO:9 14 6 5 10 10 8 10 2 4 7 9 4
    SEQ ID NO:10 8 5 5 5 9 6 3 0 4 5 5 2
    SEQ ID NO:11 2 1 1 2 1 2 4 2 1 0 2 1
    SEQ ID NO:12 4 1 1 3 4 5 5 2 1 4 2 1
    SEQ ID NO:13 3 4 4 5 2 0 1 0 1 1 1 2
    SEQ ID NO:14 4 3 0 3 2 3 2 7 4 3 3 3
    SEQ ID NO:15 0 1 1 0 1 2 1 0 1 1 1 1
    SEQ ID NO:16 9 11 12 8 11 7 4 1 4 7 9 2
    SEQ ID NO:17 2 2 2 1 4 2 3 2 6 5 6 9
    SEQ ID NO:18 0 0 1 0 1 0 0 0 0 0 0 0
    SEQ ID NO:19 27 17 14 14 15 11 12 3 7 13 17 2
    SEQ ID NO:20 20 13 9 12 12 19 19 4 22 27 27 15
    SEQ ID NO:21 3 2 0 1 4 7 4 4 1 3 3 0
    SEQ ID NO:22 11 6 3 6 6 10 3 2 5 6 7 3
    SEQ ID NO:23 1 0 0 0 0 1 1 0 1 1 2 1
    SEQ ID NO:24 10 8 6 7 7 8 5 5 7 8 12 3
    SEQ ID NO:25 3 0 1 0 1 3 2 3 2 4 2 4
    SEQ ID NO:26 1 0 1 0 0 1 0 0 1 0 1 0
    SEQ ID NO:27 0 1 1 0 0 0 0 1 1 0 0 0
    SEQ ID NO:28 2 2 0 1 1 1 1 1 1 0 0 0
    SEQ ID NO:29 1 1 1 1 1 1 1 0 0 1 0 1
    SEQ ID NO:30 1 1 1 1 2 4 1 1 1 0 1 3
    SEQ ID NO:31 2 1 1 0 1 5 2 0 7 8 6 11
    SEQ ID NO:32 0 1 0 1 0 1 0 0 1 0 1 0
    SEQ ID NO:33 0 0 1 0 1 0 0 0 1 2 0 0
    SEQ ID NO:34 30 48 44 44 30 15 10 3 11 17 24 3
    COL6A2 COL6A3 COL3A1 COL5A1 MMP9 MGP CTSK FGB FGG VWF F2 F12 F7
    SEQ ID NO:1 0 0 0 0 0 0 0 4 3 1 5 1 0
    SEQ ID NO:2 2 4 3 4 4 5 3 1 1 6 0 0 1
    SEQ ID NO:3 1 0 0 1 0 1 1 4 4 0 6 6 2
    SEQ ID NO:4 0 1 1 1 0 0 1 7 4 0 3 1 0
    SEQ ID NO:5 0 0 0 0 0 0 0 2 2 0 4 1 4
    SEQ ID NO:6 1 0 0 1 1 0 1 4 6 0 4 2 6
    SEQ ID NO:7 1 0 1 0 1 0 0 7 5 0 4 2 2
    SEQ ID NO:8 14 11 12 4 8 10 13 1 2 3 0 1 0
    SEQ ID NO:9 13 5 6 8 11 14 6 1 2 12 1 0 0
    SEQ ID NO:10 7 5 6 5 5 10 1 2 3 1 1 2 1
    SEQ ID NO:11 1 0 1 0 0 1 0 1 0 1 1 1 0
    SEQ ID NO:12 6 4 4 5 4 6 1 0 1 10 0 0 0
    SEQ ID NO:13 2 2 2 2 1 1 0 0 1 0 1 0 1
    SEQ ID NO:14 4 4 3 1 2 4 1 0 1 5 0 0 0
    SEQ ID NO:15 1 1 0 1 2 1 2 1 2 1 2 2 4
    SEQ ID NO:16 9 11 10 7 5 11 3 0 1 5 0 0 1
    SEQ ID NO:17 8 4 6 7 6 1 3 1 1 1 1 2 2
    SEQ ID NO:18 0 0 1 0 0 1 1 4 4 1 4 3 1
    SEQ ID NO:19 25 12 11 6 10 21 6 3 1 7 1 0 0
    SEQ ID NO:20 33 24 34 15 27 20 18 0 2 8 1 0 0
    SEQ ID NO:21 2 6 3 1 3 9 1 0 1 7 0 0 1
    SEQ ID NO:22 7 10 6 2 8 14 5 0 6 5 0 1 1
    SEQ ID NO:23 1 0 1 0 1 0 1 7 7 0 7 5 0
    SEQ ID NO:24 9 11 10 3 11 11 6 1 0 8 1 0 0
    SEQ ID NO:25 2 5 5 3 4 2 1 0 0 4 0 0 1
    SEQ ID NO:26 0 1 0 0 1 1 0 0 0 0 0 0 0
    SEQ ID NO:27 0 0 0 1 0 0 0 6 4 0 7 4 2
    SEQ ID NO:28 1 0 0 0 0 1 1 12 10 0 9 7 1
    SEQ ID NO:29 1 1 1 1 0 1 0 6 4 0 4 0 0
    SEQ ID NO:30 1 2 1 1 2 1 1 4 5 1 7 5 2
    SEQ ID NO:31 8 9 10 7 9 3 7 1 2 3 1 2 4
    SEQ ID NO:32 1 0 1 0 0 0 0 7 4 0 6 4 2
    SEQ ID NO:33 0 0 1 0 1 1 0 8 5 1 8 4 1
    SEQ ID NO:34 26 18 17 13 14 20 6 1 1 8 0 1 2
    PECAM1 LACI AT3 PAI1 LPL A2M APOA1 APOA2 APOB APOC2 APOC3 APOC4 MSR1 CD36
    SEQ ID NO:1 0 1 5 1 0 1 3 4 2 2 5 4 0 0
    SEQ ID NO:2 7 4 0 3 1 4 0 0 0 0 0 2 3
    SEQ ID NO:3 0 1 8 1 1 1 5 4 3 3 5 7 0 0
    SEQ ID NO:4 0 0 4 0 1 2 4 5 4 4 6 2 0 1
    SEQ ID NO:5 1 3 1 2 0 0 5 4 3 5 2 0 0 0
    SEQ ID NO:6 0 3 6 1 1 1 10 6 5 6 4 0 0 1
    SEQ ID NO:7 1 2 5 1 1 1 6 5 6 3 6 3 0 0
    SEQ ID NO:8 2 1 0 0 4 9 1 0 0 1 0 0 4 2
    SEQ ID NO:9 6 3 1 2 5 7 0 1 1 0 1 0 7 3
    SEQ ID NO:10 4 6 3 5 3 8 1 1 2 0 2 1 2 7
    SEQ ID NO:11 0 1 0 0 3 3 0 0 0 2 0 0 1 1
    SEQ ID NO:12 8 7 0 3 5 6 1 1 1 2 0 0 5 6
    SEQ ID NO:13 0 0 0 1 1 1 11 0 10 2 4 0 0 0
    SEQ ID NO:14 3 0 0 1 4 4 0 0 0 0 0 0 9 5
    SEQ ID NO:15 1 1 2 1 1 0 3 1 2 2 1 0 0 2
    SEQ ID NO:16 2 1 1 0 3 6 2 0 1 0 0 0 4 3
    SEQ ID NO:17 0 1 0 5 1 1 3 2 1 2 1 0 1 2
    SEQ ID NO:18 0 1 5 2 1 1 4 5 5 5 3 1 0 0
    SEQ ID NO:19 5 2 1 4 1 13 1 1 1 0 1 0 10 1
    SEQ ID NO:20 7 1 0 4 3 12 1 0 0 1 1 0 9 5
    SEQ ID NO:21 5 3 0 2 6 7 0 0 0 1 0 0 5 9
    SEQ ID NO:22 4 4 0 2 6 10 1 0 1 0 0 0 8 5
    SEQ ID NO:23 1 1 8 1 1 3 5 8 4 6 8 9 1 1
    SEQ ID NO:24 2 1 0 0 12 10 1 1 0 1 0 0 9 14
    SEQ ID NO:25 4 1 0 1 8 5 0 1 0 1 1 0 5 11
    SEQ ID NO:26 0 0 0 0 0 0 0 0 0 0 0 0 1 0
    SEQ ID NO:27 1 1 8 2 0 1 5 7 6 4 7 3 0 0
    SEQ ID NO:28 0 3 13 1 0 2 10 16 10 8 14 7 1 1
    SEQ ID NO:29 1 2 5 0 0 1 5 6 5 3 4 2 0 0
    SEQ ID NO:30 4 2 9 3 0 2 6 8 7 6 4 3 1 1
    SEQ ID NO:31 3 2 1 1 2 2 3 2 1 1 0 0 1 2
    SEQ ID NO:32 0 1 6 2 0 1 6 7 7 4 8 1 0 1
    SEQ ID NO:33 0 2 7 3 0 2 8 10 10 7 8 4 0 0
    SEQ ID NO:34 2 0 0 1 2 20 1 0 3 0 0 0 10 4
    SAP CEL PON1 PON2 PON3 LRP1 HTGL HMGCR VLDLR MTP PLIN EDN1 EDNRA IL6
    SEQ ID NO:1 6 0 5 1 2 1 0 3 0 1 0 0 1 0
    SEQ ID NO:2 0 1 1 6 1 2 1 0 6 0 1 4 3 0
    SEQ ID NO:3 8 1 5 2 2 0 2 2 1 1 0 0 0 1
    SEQ ID NO:4 3 0 2 0 1 0 1 1 0 0 0 0 0 1
    SEQ ID NO:5 1 1 1 1 2 0 2 1 0 1 1 1 2 0
    SEQ ID NO:6 2 3 5 3 8 1 6 4 2 3 2 2 2 0
    SEQ ID NO:7 4 1 2 0 1 0 1 1 0 3 0 1 0 0
    SEQ ID NO:8 0 2 1 1 1 8 1 2 2 1 6 1 2 0
    SEQ ID NO:9 0 2 1 4 0 8 0 1 3 0 11 2 3 1
    SEQ ID NO:10 2 0 1 4 5 2 1 5 1 0 2 3 1 6
    SEQ ID NO:11 1 0 1 5 0 5 1 0 4 0 1 1 0 1
    SEQ ID NO:12 0 1 0 7 1 4 1 2 3 0 6 4 4 0
    SEQ ID NO:13 1 3 1 1 1 1 1 5 1 9 0 0 1 0
    SEQ ID NO:14 0 0 0 1 0 3 0 0 1 0 15 1 0 3
    SEQ ID NO:15 3 6 6 2 7 1 3 2 2 2 4 2 2 0
    SEQ ID NO:16 0 3 1 3 1 4 1 3 4 0 3 1 4 1
    SEQ ID NO:17 1 3 2 3 2 2 2 2 3 1 2 2 7 2
    SEQ ID NO:18 2 1 2 2 1 0 3 1 1 1 1 1 0 0
    SEQ ID NO:19 1 0 0 2 1 8 1 1 2 0 1 0 1 2
    SEQ ID NO:20 0 1 1 2 2 17 0 1 1 0 7 0 4 4
    SEQ ID NO:21 0 1 0 1 0 2 0 0 2 0 6 0 0 2
    SEQ ID NO:22 1 1 1 2 3 3 1 2 1 0 8 0 1 2
    SEQ ID NO:23 11 0 5 1 2 1 0 4 0 0 0 1 0 0
    SEQ ID NO:24 0 0 0 0 0 6 0 0 2 0 19 0 1 2
    SEQ ID NO:25 1 1 0 1 0 2 1 2 2 0 8 3 2 1
    SEQ ID NO:26 0 0 0 0 1 0 0 0 0 0 0 0 0 2
    SEQ ID NO:27 9 1 6 1 5 0 1 3 0 2 0 0 1 0
    SEQ ID NO:28 11 0 7 3 2 0 3 3 0 1 0 1 0 0
    SEQ ID NO:29 1 0 1 1 1 0 2 1 0 1 0 0 0 0
    SEQ ID NO:30 7 0 4 2 2 1 3 4 1 2 1 1 1 0
    SEQ ID NO:31 1 1 0 1 1 3 2 2 2 0 3 1 6 1
    SEQ ID NO:32 7 1 4 1 3 0 1 4 0 2 1 0 1 0
    SEQ ID NO:33 5 1 3 1 1 0 2 2 0 2 0 0 0 0
    SEQ ID NO:34 0 1 1 1 1 10 2 6 3 3 3 0 4 0
    IL1 C8A C9 PTGDS ANX2 ANX1 PTGS2 IGFBP1 SPARC NFkB AGT NOS3 NOS2
    SEQ ID NO:1 0 3 4 0 2 0 1 4 1 0 2 1 0
    SEQ ID NO:2 0 0 0 2 3 4 0 2 8 5 0 4 1
    SEQ ID NO:3 0 5 4 0 1 0 1 4 1 1 4 0 1
    SEQ ID NO:4 0 3 4 0 1 0 1 2 1 0 3 0 0
    SEQ ID NO:5 0 1 0 1 0 0 0 5 0 1 2 2 0
    SEQ ID NO:6 0 4 0 0 0 0 1 8 0 0 5 6 1
    SEQ ID NO:7 0 4 3 1 0 0 0 5 1 1 2 0 0
    SEQ ID NO:8 2 0 0 4 6 3 1 1 9 5 1 2 2
    SEQ ID NO:9 0 0 1 7 6 4 0 0 11 6 3 2 0
    SEQ ID NO:10 1 5 2 1 11 7 6 2 7 2 1 1 0
    SEQ ID NO:11 1 0 0 23 1 1 0 1 4 2 15 0 0
    SEQ ID NO:12 0 1 0 3 3 2 1 1 7 3 5 6 1
    SEQ ID NO:13 0 1 0 0 1 1 0 1 0 1 1 1 1
    SEQ ID NO:14 0 0 0 2 2 1 0 0 2 3 1 1 0
    SEQ ID NO:15 0 3 0 0 0 2 0 4 1 0 2 4 1
    SEQ ID NO:16 0 0 0 3 6 3 1 1 7 3 0 1 0
    SEQ ID NO:17 2 1 0 1 1 3 1 4 4 4 2 3 0
    SEQ ID NO:18 0 6 1 0 0 0 0 4 1 0 5 1 0
    SEQ ID NO:19 1 1 0 12 7 9 3 1 14 6 2 0 1
    SEQ ID NO:20 0 0 0 6 12 9 0 0 22 8 0 1 1
    SEQ ID NO:21 1 0 0 2 3 1 1 1 6 2 1 0 1
    SEQ ID NO:22 0 1 1 6 3 10 1 1 6 2 0 2 1
    SEQ ID NO:23 1 5 11 1 0 1 1 3 0 0 3 0 0
    SEQ ID NO:24 0 0 0 4 4 2 1 0 7 3 1 0 0
    SEQ ID NO:25 1 1 0 0 2 2 0 1 3 5 1 1 0
    SEQ ID NO:26 0 0 0 1 0 0 2 0 0 0 0 0 5
    SEQ ID NO:27 0 9 8 0 1 1 1 6 0 0 3 2 0
    SEQ ID NO:28 0 5 8 2 0 0 0 10 1 0 8 0 0
    SEQ ID NO:29 0 1 2 1 2 1 0 5 2 0 3 0 0
    SEQ ID NO:30 1 7 6 2 1 3 1 6 1 3 5 0 0
    SEQ ID NO:31 0 2 0 1 3 2 0 1 6 4 1 2 0
    SEQ ID NO:32 0 7 6 0 0 0 1 7 1 0 4 2 0
    SEQ ID NO:33 0 7 4 2 1 1 0 6 1 0 5 0 1
    SEQ ID NO:34 0 0 0 11 9 6 1 1 9 3 0 3 1
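  • Table 4 can be read programmatically one block at a time. The sketch below pairs the first header row with the first-block row for SEQ ID NO:34, both copied from the table, and lists the known genes whose entry exceeds 5. It assumes that −log p is base 10, so that entries above 5 correspond to p < 0.00001; because the entries are whole numbers, an entry of exactly 5 is ambiguous and is excluded here. The function names are illustrative.

```python
HEADER = "SM22 CNN1 DES MYH11 TPM1 TIMP3 TIMP2 TIMP4 COL1A1 COL1A2 COL6A1 COL5A2"
ROW = "SEQ ID NO:34 30 48 44 44 30 15 10 3 11 17 24 3"


def parse_row(header, row):
    """Map each known-gene abbreviation to its -log p entry for one row."""
    genes = header.split()
    parts = row.split()
    values = [int(v) for v in parts[3:]]  # skip "SEQ", "ID", "NO:34"
    if len(values) != len(genes):
        # A row with a missing or extra entry cannot be aligned safely.
        raise ValueError(
            f"{' '.join(parts[:3])}: {len(values)} values for {len(genes)} genes"
        )
    return dict(zip(genes, values))


def strong_partners(scores, cutoff=5):
    """Genes whose -log p entry is strictly above the cutoff."""
    return [gene for gene, score in scores.items() if score > cutoff]


scores = parse_row(HEADER, ROW)
print(strong_partners(scores))
```

  • The length check matters for a flattened table like this one: a row that has lost a value cannot be assigned to columns unambiguously and is better rejected than silently shifted.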
  • V Novel Genes Associated with Atherosclerosis
  • Using the co-expression analysis method, 34 novel atherosclerosis-associated polynucleotides were identified, SEQ ID NOs: 1-34, that exhibit strong association, or co-expression, with 66 known atherosclerosis-associated genes. [0100]
  • Polynucleotides comprising the consensus sequences of SEQ ID NO: 1-34 of the present invention were first identified from Incyte bins and assembled as described in Example III. BLAST and other motif searches were performed for SEQ ID NOs: 1-34 according to Example VI. The full length and 5′-complete sequences were translated and sequence identity was sought with known sequences. [0101]
  • SEQ ID NO: 35 of the present invention is a polypeptide of 366 amino acids encoded by the nucleic acid of SEQ ID NO: 11. Motif analyses of SEQ ID NO: 35 show one potential cAMP- and cGMP-dependent protein kinase phosphorylation site at residue S343, two potential casein kinase II phosphorylation sites at residues S179 and T351, and four potential protein kinase C phosphorylation sites at residues T29, S85, T269, and T324. Additionally, SEQ ID NO: 35 contains a potential sugar transport protein signature sequence from residues L201 to S217. [0102]
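  • Phosphorylation-site motifs of the kind listed above are short sequence patterns, so a scan for them can be sketched with regular expressions. The example below uses the PROSITE-style patterns [ST]-x-[RK] (protein kinase C), [ST]-x(2)-[DE] (casein kinase II), and [RK](2)-x-[ST] (cAMP- and cGMP-dependent protein kinase), and reports the phosphorylated residue in the same letter-plus-position form used in the text. It is not the MOTIFS program, and the peptide is a made-up toy sequence, not SEQ ID NO: 35.

```python
import re

# name -> (regular expression, offset of the phosphorylated S/T in the match)
PATTERNS = {
    "PKC_PHOSPHO_SITE": (r"[ST].[RK]", 0),
    "CK2_PHOSPHO_SITE": (r"[ST]..[DE]", 0),
    "CAMP_PHOSPHO_SITE": (r"[RK]{2}.[ST]", 3),
}


def scan(sequence):
    """Return, per motif, residue labels such as 'S343' (1-based positions)."""
    hits = {}
    for name, (pattern, offset) in PATTERNS.items():
        labels = []
        # A lookahead finds overlapping occurrences as well.
        for match in re.finditer("(?=" + pattern + ")", sequence):
            index = match.start() + offset
            labels.append(f"{sequence[index]}{index + 1}")
        hits[name] = labels
    return hits


TOY_PEPTIDE = "MKRSTAKSGGEARRASL"  # hypothetical sequence for illustration
for motif, sites in scan(TOY_PEPTIDE).items():
    print(motif, sites)
```

  • Patterns this short match frequently by chance, which is why the sites are described as potential rather than confirmed.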
  • VI Homology Searching for Atherosclerosis-Associated Polynucleotides and Polypeptides
  • The polynucleotide sequences, SEQ ID NO: 1-34, and polypeptide sequence, SEQ ID NO: 35, were queried against databases derived from sources such as GenBank and SwissProt. These databases, which contain previously identified and annotated sequences, were searched for regions of similarity using BLAST (Altschul, supra). BLAST searched for matches and reported only those that satisfied the probability thresholds of 10^−25 or less for nucleotide sequences and 10^−8 or less for polypeptide sequences. [0103]
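  • The two reporting thresholds can be expressed as a small filter. The sketch below keeps a match only when its probability is at or below the threshold for its sequence type; the record layout, the hit names, and the probability values are hypothetical and do not come from any actual search.

```python
# Probability thresholds stated in the text, keyed by query type.
THRESHOLDS = {"nucleotide": 1e-25, "polypeptide": 1e-8}


def keep_hit(kind, probability):
    """True when a match satisfies the threshold for its sequence type."""
    return probability <= THRESHOLDS[kind]


# Hypothetical matches: (query type, hit label, probability).
hits = [
    ("nucleotide", "hit_A", 3e-40),
    ("nucleotide", "hit_B", 2e-12),
    ("polypeptide", "hit_C", 5e-10),
    ("polypeptide", "hit_D", 1e-3),
]

reported = [label for kind, label, prob in hits if keep_hit(kind, prob)]
print(reported)
```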
  • The polypeptide sequence was also analyzed for known motif patterns using MOTIFS, SPSCAN, BLIMPS, and HMM-based protocols. MOTIFS (Genetics Computer Group, Madison Wis.) searches polypeptide sequences for patterns that match those defined in the Prosite Dictionary of Protein Sites and Patterns (Bairoch, supra) and displays the patterns found and their corresponding literature abstracts. SPSCAN (Genetics Computer Group) searches for potential signal peptide sequences using a weighted matrix method (Nielsen et al. (1997) Prot. Eng. 10: 1-6). Hits with a score of 5 or greater were considered. BLIMPS uses a weighted matrix analysis algorithm to search for sequence similarity between the polypeptide sequences and those contained in BLOCKS, a database consisting of short amino acid segments, or blocks of 3-60 amino acids in length, compiled from the PROSITE database (Henikoff, supra; Bairoch, supra), and those in PRINTS, a protein fingerprint database based on non-redundant sequences obtained from sources such as SwissProt, GenBank, PIR, and NRL-3D (Attwood et al. (1997) J. Chem. Inf. Comput. Sci. 37:417-424). For the purposes of the present invention, the BLIMPS searches reported matches with a cutoff score of 1000 or greater and a cutoff probability value of 1.0×10^−3. HMM-based protocols were based on a probabilistic approach and searched for consensus primary structures of gene families in the protein sequences (Eddy, supra; Sonnhammer, supra). More than 500 known protein families with cutoff scores ranging from 10 to 50 bits were selected for use in this invention. [0104]
  • VII Labeling of Probes and Hybridization Analyses
  • Substrate Preparation [0105]
  • Nucleic acids are isolated from a biological source and applied to a substrate for standard hybridization protocols by one of the following methods. A mixture of target nucleic acids, a restriction digest of genomic DNA, is fractionated by electrophoresis through a 0.7% agarose gel in 1× TAE [Tris-acetate-ethylenediamine tetraacetic acid (EDTA)] running buffer and transferred to a nylon membrane by capillary transfer using 20× saline sodium citrate (SSC). Alternatively, the target nucleic acids are individually ligated to a vector and inserted into bacterial host cells to form a library. Target nucleic acids are then arranged on a substrate by one of two methods. In the first method, bacterial cells containing individual clones are robotically picked and arranged on a nylon membrane. The membrane is placed on bacterial growth medium, LB agar containing carbenicillin, and incubated at 37° C. for 16 hours. Bacterial colonies are denatured, neutralized, and digested with proteinase K. Nylon membranes are exposed to UV irradiation in a STRATALINKER UV-crosslinker (Stratagene) to cross-link DNA to the membrane. [0106]
  • In the second method, target nucleic acids are amplified from bacterial vectors by thirty cycles of PCR using primers complementary to vector sequences flanking the insert. Amplified target nucleic acids are purified using SEPHACRYL-400 beads (Amersham Pharmacia Biotech). Purified target nucleic acids are robotically arrayed onto a glass microscope slide (Corning Science Products, Corning N.Y.). The slide is previously coated with 0.05% aminopropyl silane (Sigma-Aldrich, St. Louis Mo.) and cured at 110° C. The arrayed glass slide (microarray) is exposed to UV irradiation in a STRATALINKER UV-crosslinker (Stratagene). [0107]
  • Probe Preparation [0108]
  • cDNA probes are made from mRNA templates. Five micrograms of mRNA is mixed with 1 μg random primer (Life Technologies), incubated at 70° C. for 10 minutes, and lyophilized. The lyophilized sample is resuspended in 50 μl of 1× first strand buffer (cDNA Synthesis systems; Life Technologies) containing a dNTP mix, [α-32P]dCTP, dithiothreitol, and MMLV reverse transcriptase (Stratagene), and incubated at 42° C. for 1-2 hours. After incubation, the probe is diluted with 42 μl dH2O, heated to 95° C. for 3 minutes, and cooled on ice. mRNA in the probe is removed by alkaline degradation. The probe is neutralized, and degraded mRNA and unincorporated nucleotides are removed using a PROBEQUANT G-50 microcolumn (Amersham Pharmacia Biotech). Probes can be labeled with fluorescent markers, Cy3-dCTP or Cy5-dCTP (Amersham Pharmacia Biotech), in place of the radionucleotide, [32P]dCTP. [0109]
  • Hybridization [0110]
  • Hybridization is carried out at 65° C. in a hybridization buffer containing 0.5 M sodium phosphate (pH 7.2), 7% SDS, and 1 mM EDTA. After the substrate is incubated in hybridization buffer at 65° C. for at least 2 hours, the buffer is replaced with 10 ml of fresh buffer containing the probes. After incubation at 65° C. for 18 hours, the hybridization buffer is removed, and the substrate is washed sequentially under increasingly stringent conditions, up to 40 mM sodium phosphate, 1% SDS, 1 mM EDTA at 65° C. To detect signal produced by a radiolabeled probe hybridized on a membrane, the substrate is exposed to a PHOSPHORIMAGER cassette (Amersham Pharmacia Biotech), and the image is analyzed using IMAGEQUANT data analysis software (Amersham Pharmacia Biotech). To detect signals produced by a fluorescent probe hybridized on a microarray, the substrate is examined by confocal laser microscopy, and images are collected and analyzed using GEMTOOLS gene expression analysis software (Incyte Pharmaceuticals). [0111]
  • VIII Complementary Polynucleotides
  • Molecules complementary to the polynucleotide, or a fragment thereof, are used to detect, decrease, or inhibit gene expression. Although use of oligonucleotides comprising from about 18 to about 60 base pairs is described, the same procedure is used with larger or smaller fragments or their derivatives (PNAs). Oligonucleotides are designed using OLIGO 4.06 software (National Biosciences) and SEQ ID NO: 1-34 or fragments thereof. To inhibit transcription by preventing promoter binding, a complementary oligonucleotide is designed to bind to the most unique 5′ sequence, most preferably about 10 nucleotides before the initiation codon of the open reading frame. To inhibit translation, a complementary oligonucleotide is designed to prevent ribosomal binding to the mRNA encoding the polypeptide. [0112]
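  • The core operation in designing a complementary oligonucleotide is taking the reverse complement of a chosen target window. The sketch below does this for a window of the first 60 nucleotides of SEQ ID NO: 1 as given in the Sequence Listing, and enforces the 18 to 60 nucleotide length range mentioned above. The window position and the function names are illustrative assumptions; the sketch does not reproduce the OLIGO 4.06 software or any selection criteria such as uniqueness or distance from the initiation codon.

```python
# First 60 nucleotides of SEQ ID NO: 1, copied from the Sequence Listing.
FRAGMENT = (
    "aggcctccct ccacctgtct tctcagagca gataatggca agcatggctg ccgtgctcac"
).replace(" ", "")

_COMPLEMENT = str.maketrans("acgtnACGTN", "tgcanTGCAN")


def reverse_complement(seq):
    """Complement each base, then reverse, giving the opposite strand 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_oligo(target, start, length):
    """Reverse complement of target[start:start + length] (0-based start)."""
    if not 18 <= length <= 60:
        raise ValueError("length outside the 18-60 nucleotide range")
    window = target[start:start + length]
    if len(window) != length:
        raise ValueError("window runs past the end of the target")
    return reverse_complement(window)


# Illustrative window only: the first 20 nucleotides of the fragment.
print(antisense_oligo(FRAGMENT, 0, 20))
```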
  • IX Production of Specific Antibodies
  • The polypeptides encoded by SEQ ID NO: 1-34, or portions thereof, substantially purified using polyacrylamide gel electrophoresis or other purification techniques, are used to immunize rabbits and to produce antibodies using standard protocols as described in Pound (1998; Immunochemical Protocols, Methods Mol. Biol. Vol. 80). [0113]
  • Alternatively, the amino acid sequence is analyzed using LASERGENE software (DNASTAR, Madison Wis.) to determine regions of high immunogenicity, and a corresponding oligopeptide is synthesized and used to raise antibodies by means known to those of skill in the art. Methods for selection of epitopes, such as those near the C-terminus or in hydrophilic regions, are well described in the art. Typically, oligopeptides 15 residues in length are synthesized using an ABI 431A peptide synthesizer (PE Biosystems) using Fmoc chemistry and coupled to keyhole limpet hemocyanin (KLH, Sigma-Aldrich) by reaction with m-maleimidobenzoyl-N-hydroxysuccinimide ester (Ausubel, supra) to increase immunogenicity. Rabbits are immunized with the oligopeptide-KLH complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide activity by, for example, binding the peptide to plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG. [0114]
  • X Screening Molecules for Specific Binding with the Polynucleotide or Polypeptide
  • The polynucleotide, or fragments thereof, or the polypeptide, or portions thereof, are labeled with 32P-dCTP, Cy3-dCTP, or Cy5-dCTP (Amersham Pharmacia Biotech), or with BODIPY or FITC (Molecular Probes, Eugene OR), respectively. Libraries of candidate molecules previously arranged on a substrate are incubated in the presence of labeled polynucleotide or polypeptide. After incubation under conditions for either a nucleic acid or amino acid sequence, the substrate is washed, and any position on the substrate retaining label, which indicates specific binding or complex formation, is assayed, and the binding molecule is identified. Data obtained using different concentrations of the polynucleotide or polypeptide are used to calculate affinity between the labeled nucleic acid or protein and the bound molecule.
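  • The affinity calculation from measurements at different concentrations can be sketched as follows. The text does not specify a binding model; this example assumes simple one-site binding, signal = Bmax × c / (Kd + c), and estimates Kd by least squares on the linearized form c/signal = c/Bmax + Kd/Bmax. The function name and the concentration and signal values are hypothetical, generated from an assumed Kd of 5 and Bmax of 100 in arbitrary units, and are not measured data.

```python
# Sketch only: estimate Kd and Bmax from label retained at several concentrations,
# assuming one-site binding (an assumption; the source names no model).


def fit_one_site(conc, signal):
    """Fit c/signal = c/Bmax + Kd/Bmax by ordinary least squares; return (Kd, Bmax)."""
    y = [c / s for c, s in zip(conc, signal)]
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(conc, y))
    slope = sxy / sxx                   # equals 1 / Bmax
    intercept = mean_y - slope * mean_x  # equals Kd / Bmax
    bmax = 1.0 / slope
    kd = intercept * bmax
    return kd, bmax


# Hypothetical, noise-free illustration data (arbitrary units).
concentrations = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
signals = [100.0 * c / (5.0 + c) for c in concentrations]
kd, bmax = fit_one_site(concentrations, signals)
print(kd, bmax)
```

  • With real, noisy measurements a nonlinear fit of the untransformed binding curve is generally preferred, because the linearization distorts the error structure.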
    SEQUENCE LISTING
    <160> NUMBER OF SEQ ID NOS: 35
    <210> SEQ ID NO 1
    <211> LENGTH: 1334
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 674, 735, 788
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 1
    aggcctccct ccacctgtct tctcagagca gataatggca agcatggctg ccgtgctcac 60
    ctgggctctg gctcttcttt cagcgttttc ggccacccag gcacggaaag gcttctggga 120
    ctacttcagc cagaccagcg gggacaaagg cagggtggag cagatccatc agcagaagat 180
    ggctcgcgag cccgcgaccc tgaaagacag ccttgagcaa gacctcaaca atatgaacaa 240
    gttcctggaa aagctgaggc ctctgagtgg gagcgaggct cctcggctcc cacaggaccc 300
    ggtgggcatg cggcggcagc tgcaggagga gttggaggag gtgaaggctc gcctccagcc 360
    ctacatggca gaggcgcacg agctggtggg ctggaatttg gagggcttgc ggcagcaact 420
    gaagccctac acgatggatc tgatggagca ggtggccctg cgcgtgcagg agctgcagga 480
    gcagttgcgc gtggtggggg aagacaccaa ggcccagttg ctggggggcg tggacgaggc 540
    ttgggctttg ctgcagggac tgcagagccg cgtggtgcac cacaccggcc gcttcaaaga 600
    gctcttccac ccatacgccg agagcctggt gagcggcatc gggcgccacg tgcaggagct 660
    gcaccgcagt gtgntccgca cgcccccgcc agccccgcgc gcctcagtcg ctgcgtgcag 720
    gtgctctccc ggaantcacg ctcaaggcca aggccctgca cgcacgcatc cagcagaacc 780
    tggaccantg cgcgaagagc tcagcagagc ctttgcaggc actgggactg aggaaggggc 840
    cggcccggac ccccagatgc tctccgagga ggtgcgccag cgacttcagg ctttccgcca 900
    ggacacctac ctgcagatag ctgccttcac tcgcgccatc gaccaggaga ctgaggaggt 960
    ccagcagcag ctggcgccac ctccaccagg ccacagtgcc ttcgccccag agtttcaaca 1020
    aacagacagt ggcaaggttc tgagcaagct gcaggcccgt ctggatgacc tgtgggaaga 1080
    catcactcac agccttcatg accagggcca cagccgtctg ggggacccct gaggatctac 1140
    ctgcccaggc ccattcccag cttcttgtct ggggagcctt ggctctgagc ctctagcatg 1200
    gttcagtcct tgaaagtggc ctgttgggtg gagggtggaa ggtcctgtgc aggacaggga 1260
    ggccaccaaa ggggctgctg tctcctgcat atccagcctc ctgcgactcc ccaatgcagg 1320
    atgcattcat tcac 1334
    <210> SEQ ID NO 2
    <211> LENGTH: 1702
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 2
    cgttcccact gcaccctgga gaacgagcct ttgcggggtt tctcctggct gtcctccgac 60
    cccggcggtc tcgaaagcga cacgctgcag tgggtggagg agccccaacg ctcctgcacc 120
    gcgcggagat gcgcggtact ccaggccacc ggtggggtcg agcccgcagg ctggaaggag 180
    atgcgatgcc acctgcgcgc caacggctac ctgtgcaagt accagtttga ggtcttgtgt 240
    cctgcgccgc gccccggggc cgcctctaac ttgagctatc gcgcgccctt ccagctgcac 300
    agcgccgctc tggacttcag tccacctggg accgaggtga gtgcgctctg ccggggacag 360
    ctcccgatct cagttacttg catcgcggac gaaatcggcg ctcgctggga caaactctcg 420
    ggcgatgtgt tgtgtccctg ccccgggagg tacctccgtg ctggcaaatg cgcagagctc 480
    cctaactgcc tagacgactt gggaggcttt gcctgcgaat gtgctacggg cttcgagctg 540
    gggaaggacg gccgctcttg tgtgaccagt ggggaaggac agccgaccct tggggggacc 600
    ggggtgccca ccaggcgccc gccggccact gcaaccagcc ccgtgccgca gagaacatgg 660
    ccaatcaggg tcgacgagaa gctgggagag acaccacttg tccctgaaca agacaattca 720
    gtaacatcta ttcctgagat tcctcgatgg ggatcacaga gcacgatgtc tacccttcaa 780
    atgtcccttc aagccgagtc aaaggccact atcaccccat cagggagcgt gatttccaag 840
    tttaattcta cgacttcctc tgccactcct caggctttcg actcctcctc tgccgtggtc 900
    ttcatatttg tgagcacagc agtagtagtg ttggtgatct tgaccatgac agtactgggg 960
    cttgtcaagc tctgctttca cgaaagcccc tcttcccagc caaggaagga gtctatgggc 1020
    ccgccgggcc tggagagtga tcctgagccc gctgctttgg gctccagttc tgcacattgc 1080
    acaaacaatg gggtgaaagt cggggactgt gatctgcggg acagagcaga gggtgccttg 1140
    ctggcggagt cccctcttgg ctctagtgat gcatagggaa acaggggaca tgggcactcc 1200
    tgtgaacagt ttttcacttt tgatgaaacg gggaaccaag aggaacttac ttgtgtaact 1260
    gacaatttct gcagaaatcc cccttcctct aaattccctt tactccactg aggagctaaa 1320
    tcagaactgc acactccttc cctgatgata gaggaagtgg aagtgccttt aggatggtga 1380
    tactggggga ccgggtagtg ctggggagag atattttctt atgtttattc ggagaatttg 1440
    gagaagtgat tgaacttttc aagacattgg aaacaaatag aacacaatat aatttacatt 1500
    aaaaaataat ttctaccaaa atggaaagga aatgttctat gttgttcagg ctaggagtat 1560
    attggttcga aatcccaggg aaaaaaataa aaataaaaaa ttaaaggatt gttgataaaa 1620
    aaaaaaaaaa aaaaagatct ttaattaagc ggcccaagct tattcccttt agtgagggtt 1680
    aattttagct tgcactggcc ac 1702
    <210> SEQ ID NO 3
    <211> LENGTH: 586
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 48, 66, 560, 574, 577, 580
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 3
    tcgaggactc cgccaactac agctgcgtct acgtggacct gaagccgnct ttcgggggct 60
    acgcgnccag cgagcgcttg gagctgcacg tggacggacc ccctcccagg cctcagctcc 120
    gggcgacgtg gagtggggcg gtcctggcgg gccgagatgc cgtcctgcgc tgcgagggac 180
    ccatccccga cgtcaccttc gagctgctgc gcgagggcga gacgaaggcc gtgaagacgg 240
    tccgcacccc cggggccgcg gcgaacctcg agctgatctt cgtggggccc cagcacgccg 300
    gcaactacag gtgccgctac cgctcctggg tgccccacac cttcgaatcg gagctcagcg 360
    accctgtgga gctcctggtg gcagaaagct gatgcagccg cgggcccagg gtgctgttgg 420
    tgtcctcaga agtgccgggg attctggact ggctccctcc cctcctgttg cagcacaagg 480
    ccggggtctc tggggggctg gagaagcctc cctcattcct cccaggaatt aataaatgtg 540
    aagagagctc tgtttaaaan aaaaaaaaag aaanaanaan aaccaa 586
    <210> SEQ ID NO 4
    <211> LENGTH: 433
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 4
    ctcaagaccc agcagtggga cagccagaca gacggcacga tggcactgag ctcccagatc 60
    tgggccgctt gcctcctgct cctcctcctc ctcgccagcc tgaccagtgg ctctgttttc 120
    ccacaacaga cgggacaact tgcagagctg caaccccagg acagagctgg agccagggcc 180
    agctggatgc ccatgttcca gaggcgaagg aggcgagaca cccacttccc catctgcatt 240
    ttctgctgcg gctgctgtca tcgatcaaag tgtgggatgt gctgcaagac gtagaaccta 300
    cctgccctgc ccccgtcccc tcccttcctt atttattcct gctgccccag aacataggtc 360
    ttggaataaa atggctggtt cttttgtttt ccaaaaaaaa aaaaaaaaaa aaaaaaaaaa 420
    aaaaaaaaaa aaa 433
    <210> SEQ ID NO 5
    <211> LENGTH: 752
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 5
    attgtacact ttaaaataat ggaattttac agtaagtgaa gtatgtatcg atgaagctat 60
    taaaacatct atttattagc tcaaatttct acaggtcaga attctggcat ggagtggctg 120
    cattctttgt ttaagctgaa atcaaggtgt tgtttggccg tgttctcacc tgaagctcag 180
    agttcacttc caagctcatt tttgtcctta gcagaattga gtgtcttgca attgtagaac 240
    tgaggtcttg gcttgctgtc tgtcagcagg ggactgctcc ctgcttctag aggccaccgg 300
    tttccctcgt tatgtggccc cttccatttt caggccagca ataatgtgtt gaatacttcc 360
    tatgcttcaa atctctggct tctgctacca gctggagaaa aaactctctg cttgtagagg 420
    gctcatgtga tttacttagt ctttgtctta aggtcaattt atttggtact tgggatttta 480
    attgtatctg tatgtttcca tcaaggcaat aactgtatta gtgtttgaat aaataaccag 540
    gtaatctggt aatttaccat actggtaatc tgacagggag atgggaattc atctttataa 600
    ttctgcttac cacaaaccat gtctgtgctt attttctttg gggaagagtt gtctgtgact 660
    gtcacttagt ttgaggttcc atgttgctga gattctgtcc agtattttga cctcttcccc 720
    aaatctggtc ttcagaacca tctcttagga gc 752
    <210> SEQ ID NO 6
    <211> LENGTH: 944
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 6
    tcttgggccg agaatttttt tttttttttt tttgcttggt cgggtaattt tcattccaaa 60
    taaacttatc acaaaaaaac tcagcttccc aaggtcattc ccccgctgcc agatacatac 120
    ttatctctga aagagtttgg aagatggacc tttcaattcc tctacaatta gtagctgagt 180
    tacagagtaa cctgccagca atcctatcag cattcatcag actatttaaa tagagcaaag 240
    tccacaaaaa gttccactga gacatgctga gcaaaggccg gagccccaga agaaaacaag 300
    tacagactca gaggaaagct gccctggtcc tgagtgtgac tcccatggtc cccgtggggt 360
    ctgtgtggtt ggcaatgagc tctgtgctgt cagctttcat gagggagctc cctggctggt 420
    tcctgttctt tggggtcttc ctccccatga ctttgctgct gctcctcctc atcgcctact 480
    tcaggatcaa actgattgag gttaatgaag aactgtccca gaactgtgat cgccaacata 540
    atcccaagga tggctcttcc ctgtaccaga gaatgaaatg gacgtgaagt tggtgacttt 600
    ccaataacta aagcacaatg agtttctact ggtcagcaag caatggccaa cagttcagct 660
    aataaagtag gttgataaac tagaaccata gcaaaataga aagaatacta agatactcat 720
    tctgaaccat actgaaaagt ggcagctatt atctaagggg acttctcaga gactcagtat 780
    aacagcagct cttgaaaagt accaagaatg gatttcctgg gtatatacac tggacacatt 840
    gtaacttttt aacttttatt gtgactgtgt ctgctctaaa cggcatattt aaaaaataaa 900
    attctgcagc atcttactac ataaaaaaaa aaaaaaaaaa aaaa 944
    <210> SEQ ID NO 7
    <211> LENGTH: 868
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 7
    cctccctccg cgagctggac gctccgcagc ccgcccgcca gccggcccgc cggccgccgc 60
    aggaatccct ggataaagac cagctcaacc atcgctgaga aaacagacct aggcttccca 120
    gggcggttaa cccgccggcc tctgggcaga gactaaaaga caaaacaaaa taaaacaaca 180
    acaaaaaact cccagtgtgt ttcctactct tctttgtctt ggaggaaagc aaagggagag 240
    aaatggactt caccagtggt ctttggcttc atcaattcac aggaaatggc atcaagatgg 300
    ttcaactaag acatgatcac taaaaacatt ataataatac ctttttgaaa aactcagttt 360
    ctcctgttta ctaaatattt atttcatcaa catgggctgc gttccactgt gtcaggattc 420
    tgcatgtggg tggagcactg ttccagcctg agaagatggt tctgaggcca cttagcaaga 480
    cattttccag catgagcagg tttctctgtg gaaatagtga cacctgttct ggtgtgttgt 540
    ctttcctcag ggaacttaag gggtacaaag ctcctgaaaa tgttctttat gctggttgaa 600
    gctcttatgt cgctgtactg attccctacg atgcagattt gaatcacaga gtaattaaaa 660
    tatggatcaa ataaggctgg ggctcaccaa ggctgaaagc tgtagccatt caaggcatca 720
    tttctgtcat gaaaatatag gaccttttca aaacatgcct tcaggaaggt gttctctttt 780
    caaacaaaag tctaatgact gcataactct tcttgaccac atcttacact ttctctagac 840
    ttgcttattt acagctactg gaacaaaa 868
    <210> SEQ ID NO 8
    <211> LENGTH: 3111
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 44
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 8
    cgagggcgga cgcaaagaac gcggaggacc tctgggtgcc tgcnggggag ctgctccagc 60
    cgggccgccg ggagcggtgg ggagagcatc gcgcagccgc ccctccacgc gcccgcccag 120
    ccgcgttcgc ccactgggct ctcccggctg cagtgccagg gcgcaggacg cggccgatct 180
    cccgctcccg ccacctccgc caccatgctg ctcccccagc tctgctggct gccgctgctc 240
    gctgggctgc tcccgccggt gcccgctcag aagttctcgg cgctcacgtt tttgagagtg 300
    gatcaagata aagacaagga ttgtagcttg gactgtgcgg gttcgcccca gaaacctctc 360
    tgcgcatctg acggaaggac cttcctttcc cgttgtgaat ttcaacgtgc caagtgcaaa 420
    gatccccagc tagagattgc atatcgagga aactgcaaag acgtgtccag gtgtgtggcc 480
    gaaaggaagt atacccagga gcaagcccgg aaggagtttc agcaagtgtt cattcctgag 540
    tgcaatgacg acggcaccta cagtcaggtc cagtgtcaca gctacacggg atactgctgg 600
    tgcgtcacgc ccaacgggag gcccatcagc ggcactgccg tggcccacaa gacgccccgg 660
    tgcccgggtt ccgtaaatga aaagttaccc caacgcgaag gcacaggaaa aacagatgat 720
    gccgcagctc cagcgttgga gactcagcct caaggagatg aagaagatat tgcatcacgt 780
    taccctaccc tttggactga acaggttaaa agtcggcaga acaaaaccaa taagaattca 840
    gtgtcatcct gtgaccaaga gcaccagtct gccctggagg aagccaagca gcccaagaac 900
    gacaatgtgg tgatccctga gtgtgcgcac ggcggcctct acaagccagt gcagtgccac 960
    ccctccacgg ggtactgctg gtgcgtcctg gtggacacgg ggcgccccat tcccggcaca 1020
    tccacaaggt acgagcagcc gaaatgtgac aacacgggcc agggcccacc cagccaaagc 1080
    ccgggacctg tacaagggcc gccagctaca aggttgtccg ggtgccaaaa agcatgagtt 1140
    tctgaccagc gttctggacg cgctgtccac ggacatggtc cacgccgcct ccgacccctc 1200
    ctcctcgtca ggcaggctct cagaacccga ccccagccat accctagagg agcgggtggt 1260
    gcactggtac ttcaaactac tggataaaaa ctccagtgga gacatcggca aaaaggaaat 1320
    caaacccttc aagaggttcc ttcgcaaaaa atcaaagccc aaaaaatgtg tgaagaagtt 1380
    tgttgaatac tgtgacgtga ataatgacaa atccatctcc gtacaagaac tgatgggctg 1440
    cctgggcgtg gcgaaagagg acggcaaagc ggacaccaag aaacgccaca cccccagagg 1500
    tcatgctgaa agtacgtcta atagacagcc aaggaaacaa ggataaatgg ctcatacccc 1560
    gaaggcagtt cctagacaca tgggaaattt ccctcaccaa agagcaatta agaaaacaaa 1620
    aacagaaaca catagtattt gcactttgta ctttaaatgt aaattcactt tgtagaaatg 1680
    agctatttaa acagactgtt ttaatctgtg aaaatggaga gctggcttca gaaaattaat 1740
    cacataccaa tgtatgtgtc ctcttttgac cttggaaatc tgtatgtggt ggagaagtat 1800
    ttgaatgcat ttaggcttaa tttcttcgcc ttccacatgt taacagtaga gctctatgca 1860
    ctccggctgc aatcgtatgg ctttctctaa cccctgcagt cacttccaga tgcctgtgct 1920
    tacagcattg tggaatcatg ttggaagctc cacatgtcca tggaagtttg tgatgtacgg 1980
    ccgaccctac aggcagttaa catgcatggg ctggtttgtt tcttgggatt ttctgttagt 2040
    ttgtcttgtt ttgctttcca gagatcttgc tcatacaatg aatcacgcaa ccactaaagc 2100
    tatccagtta agtgcaggta gttcccctgg aggaaataat attttcaaac tgtcgttggt 2160
    gtgatacttt ggctcaaagg atctttgctt ttccatttta agcttctgtt ttgagttttg 2220
    ccctggggct tgaatgagtc ccagagagtc gttcggatgg tgggaggctg cctaggaggc 2280
    agtaaatcca gttcacagtg cctgggaggg gcccatcctt ccaaaatgta aatccagttc 2340
    gcggtgtgac cgagctgggc taacaggctt gtctgcctgg ttttcctacc tacacgtgga 2400
    cattattctc ctgatcctcc tacctggttc caccccaggg ctaccggaag gtaaaatctt 2460
    cacctgaacc aattatgagc agtctcctta ctgaaggtac agccggatac gtggtgcccc 2520
    cggggctggt gttggcagcc ggggggaggt gcctgagggt ccccacggtt cctttctgct 2580
    tttctgaatg catcaagggt acgagaactt gccaatggga aattcatccg agtggcactg 2640
    gcagagaagg ataggagtgg aatgcccaca cagtgaccaa cagaactggt ctgcgtgcat 2700
    aaccagctgc caccctcagg cctgggcccc agagctcagg gcacccagtg tcttaaggaa 2760
    ccatttggag gacagtctga gagcaggaac ttcaagctgt gattctatct cggctcagac 2820
    ttttggttgg aaaaagatct tcatggcccc aaatcccctg agacatgcct tgtagaatga 2880
    ttttgtgatg ttgtgatgct tgtggagcat cgcgtaaggc ttcttgctta tttaaactgt 2940
    gcaaggtaaa aatcaagcct ttggagccac agaaccagct caagtacatg ccaatgttgt 3000
    ttaagaaaca gttatgatcc taaacttttt ggataatctt ttatatttct gacctttgaa 3060
    tttaatcatt gttcttagat taaaataaaa tatgctattg aaactaaaaa a 3111
    <210> SEQ ID NO 9
    <211> LENGTH: 2311
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485,
    <222> LOCATION: 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497,
    <222> LOCATION: 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509,
    <222> LOCATION: 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521,
    <222> LOCATION: 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533,
    <222> LOCATION: 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544,
    <222> LOCATION: 2288, 2295, 2296, 2297, 2298, 2299
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 9
    gccgctcgcc cacggactcc gacgtgtccc tcgactccga ggactccggg gctaagtctc 60
    caggcatcct gggctacaat atctgtcccc gcgggtggaa tggcagcctt cggctcaagc 120
    gtggcagcct ccccgccgag gcctcctgca ccacctagag ccccaccccc gaccccaccc 180
    cgggagggca gagccagaag aaggctcatt agacctgggg gacccaaagg gtctggcctc 240
    tttgggcagc cccagagatg aggggtcagc agaggagagc tctggggttg gggatgggtt 300
    agggacgcaa gcttgagttc tagcccttgc tctcattcag ctgttgtgtg accctgggta 360
    agacccttcc ttgtttgacc ctcagctttc ccatctgttt aatggtggct ttggccaagg 420
    caatccacaa acgtcaaaat tccccttccc atcagtacac acaccgatgc acannnnnnn 480
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 540
    nnnntagtta gtgccttgga tgaggcgggg cagtgtgtat atggacccct ggacttgcta 600
    ccttcagggt tccatactcg tccctcccct cctggctctg ctgtctggag tctggcaagc 660
    ggggtgtgtt cagaaggtcc taggcctgtg tcgcatgtcc aggcactggc ctgaccatcc 720
    ggctccctgg gcaccaagtc ccagggcagg agcagctgtt ttccatccct tcccagacaa 780
    gctctatttt tatcacaatg acctttagag aggtctccca ggccagctca aggtgtccca 840
    ctatcccctc tggagggaag aggcaggaaa attctccccg ggtccctgtc atgctacttt 900
    ctccatccca gttcagactg tccaggacat cttatctgca gccataagag aattataagg 960
    cagtgatttc ccttaggccc aggacttggg cctccagctc atctgttcct tctgggccca 1020
    ttcatgggca ggttctgggc tcaaagctga actggggaga gaagagatac agagctacca 1080
    tgtgacttta cctgattgcc ctcagtttgg ggttgcttat tgggaaagag agagacaaag 1140
    agttacttgt tacgggaaat atgaaaagca tggccaggat gcatagagga gattctagca 1200
    ggggacagga ttggctcaga tgacccctga gggctcttcc agtcttgaaa tgcattccat 1260
    gatattagga agtcgggggt gggtggtggt ggtgggctag ttgggcttga atttaggggc 1320
    cgatgagctt gggtacgtga gcagggtgtt aagttagggt ctgcctgtat ttctggtccc 1380
    cttgggaaat gtccccttct tcagtgtcag acctcagtcc cagtgtccat atcgtgccca 1440
    gaaaagtaga cattatcctg ccccatccct tccccagtgc actctgacct agctagtgcc 1500
    tggtgcccag tgacctgggg gagcctggct gcaggccctc actggttccc taaaccttgg 1560
    tggctgtgat tcaggtcccc aggggggact cagggaggaa tatggctgag ttctgtagtt 1620
    tccagagttg ggctggtaga gctttctaga ggttcagaat attagcttca ggatcagctg 1680
    ggggtatgga attggctgag gatcaaacgt atgtaggtga aaggatacca ggatgttgct 1740
    aaaggtgagg gacagtttgg gtttgggact taccggggtg atgttagatc tggaaccccc 1800
    aagtgaggct ggagggagtt aaggtcagta tggaagatag ggttgggaca gggtgctttg 1860
    gaatgaaaga gtgaccttag agggctcctt gggcctcagg aatgctcctg ctgctgtgaa 1920
    gatgagaagg tgctcttact cagttaatga tgagtgacta tatttaccaa agcccctacc 1980
    tgctgctggg tcccttgtag cacaggagac tggggctaag ggcccctccc agggaaggga 2040
    caccatcagg cctctggctg aggcagtagc atagaggatc catttctacc tgcatttccc 2100
    agaggactag caggaggcag ccttgagaaa ccggcagttc ccaagccagc gcctggctgt 2160
    tctctcattg tcactgccct ctccccaacc tctcctctaa cccactagag attgcctgtg 2220
    tcctgcctct tgcctcttgt agaatgcagc tctggccctc aataaatgct tcctgcattc 2280
    taaaaaanaa aaaannnnna aaaaaaaaag g 2311
    <210> SEQ ID NO 10
    <211> LENGTH: 1866
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 10
    agcttttgtt cacactttaa atagcagtcc cagaatgatt tcactacaga ctctctggaa 60
    agcctgggag ctgaattccg gaagatcccc acatcgatga aagcaaagcg aagccaccaa 120
    gccatcatca tgtccacgtc gctacgagtc agcccatcca tccatggcta ccacttcgac 180
    acagcctctc gtaagaaagc cgtgggcaac atctttgaaa acacagacca agaatcacta 240
    gaaaggctct tcagaaactc tggagacaag aaagcagagg agagagccaa gatcattttt 300
    gccatagatc aagatgtgga ggagaaaacg cgtgccctga tggccttgaa gaagaggaca 360
    aaagacaagc ttttccagtt tctgaaactg cggaaatatt ccatcaaagt tcactgaaga 420
    gaagaggatg gataaggacg ttatccaaga atggacattc aaagaccaag tgagtttgtg 480
    agattctaac agatgcagca ttttgctgct accttacaag cttctcttct gtcaggactc 540
    cagaggctgg aaagggaccg ggactggaaa gggaccagga ctgaacagac tggttacaaa 600
    gactccaaac aatttcatgc cctgtgctgt tacagaggag aacaaaatgc tttcagcaag 660
    gatttgaaaa ctcttccgtc cctgcaggaa aggattgatg ctgatagaag agcctggaca 720
    gatgtaatga gaactaaaga aaacagatgg ctggagatga catttatcca gggtcacttt 780
    gtcaggccct aggacttaaa tcgaagttga actttttttt ttttttaacc aaatagatag 840
    gggaagggag gagggagagg gaggacaggg agagaaaata ccatgcataa attgtttact 900
    gaatttttat atctgagtgt tcaaaatatt tccaagcctg agtattgtct attggtatag 960
    atttttagaa atcaataatt gattatttat ttgcacttat tacaatgcct gaaaaagtgc 1020
    accacatgga tgttaagtag aaattcaaga aagtaagatg tcttcagcaa ctcagtaaaa 1080
    ccttacgcca ccttttggtt tgtaaaaggt tttttataca tttcaaacag gttgcacaaa 1140
    agttaaaata atggggtctt ttataaatcc aaagtactgt gaaaacattt tacatatttt 1200
    ttaaatcttc tgactaatgc taaaacgtaa tctaattaaa tttcatacag ttactgcagt 1260
    aagcattagg aagtgaatat gatatacaaa atagtttata aagactctat agtttctata 1320
    atttatttta ctggcaaatg tcatgcaaca ataataaatt attgtaaact ttgtggcttt 1380
    tggtctgtga tgcttggtct caaaggaaaa aataagatgg taaatgttga tatttacaaa 1440
    cttttctaaa gatgtgtctc taacaataaa agttaatttt agagtagttt tatattaatt 1500
    accaaacttt ttcaaaacaa attcttacgt caaatatctg ggaagtttct ctgtcccaat 1560
    cttaaaatat aaaatataga tatagaagtt catagattga ctccttggca tttctattta 1620
    tgtatccatt aaggatgagt tttaaaaggc tttctcttca tacttttgaa aaatttcttc 1680
    tatgattaca gtagctatgt acatgtgtac atctattttt cccaagcaat atgttttggg 1740
    tttagagtct gagtgatgac caagattctg tgtgttacta ctgtttgttt aataggaaca 1800
    aatatagaaa taatattatc tctttgctta tttcccgtta aaactataat aaaatgtttc 1860
    taggaa 1866
    <210> SEQ ID NO 11
    <211> LENGTH: 1929
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 11
    gctgcctgcc ggtgctcttc gtggctctgg gcatggcctc ggaccccatc ttcacgctgg 60
    cgcccccgct gcattgccac tacggggcct tcccccctaa tgcctctggc tgggagcagc 120
    ctcccaatgc cagcggcgtc agcgtcgcca gcgctgccct agcagccagc gccgccagcc 180
    gtgtcgccac cagtaccgac ccctcgtgca gcggcttcgc cccgccggac ttcaaccatt 240
    gccctcaagg attgggacta taatggcctt cctgtgctca ccaccaacgc catcggccag 300
    tgggatctgg tgtgtgacct gggctggcag gtgatcctgg agcagatcct cttcatcttg 360
    ggctttgcct ccggctacct gttcctgggt taccccgcag acagatttgg ccgtcgcggg 420
    attgtgctgc tgaccttggg gctggtgggc ccctgtggag taggaggggc tgctgcaggc 480
    tcctccacag gcgtcatggc cctccgattc ctcttgggct ttctgcttgc cggtgttgac 540
    ctgggtgtct acctgatgcg cctggagctg tgcgacccaa cccagaggct tcgggtggcc 600
    ctggcagggg agttggtggg ggtgggaggg cacttcctgt tcctgggcct ggcccttgtc 660
    tctaaggatt ggcgattcct acagcgaatg atcaccgctc cctgcatcct cttcctgttt 720
    tatggctggc ctggtttgtt cctggagtcc gcacggtggc tgatagtgaa gcggcagatt 780
    gaggaggctc agtctgtgct gaggatcctg gctgagcgaa accggcccca tgggcagatg 840
    ctgggggagg aggcccagga ggccctgcag gacctggaga atacctgccc tctccctgca 900
    acatcctcct tttcctttgc ttccctcctc aactaccgca acatctggaa aaatctgctt 960
    atcctgggct tcaccaactt cattgcccat gccattcgcc actgctacca gcctgtggga 1020
    ggaggaggga gcccatcgga cttctacctg tgctctctgc tggccagcgg caccgcagcc 1080
    ctggcctgtg tcttcctggg ggtcaccgtg gaccgatttg gccgccgggg catccttctt 1140
    ctctccatga cccttaccgg cattgcttcc ctggtcctgc tgggcctgtg ggattatctg 1200
    aacgaggctg ccatcaccac tttctctgtc cttgggctct tctcctccca agctgccgcc 1260
    atcctcagca ccctccttgc tgctgaggtc atccccacca ctgtccgggg ccgtggcctg 1320
    ggcctgatca tggctctagg ggcgcttgga ggactgagcg gcccggccca gcgcctccac 1380
    atgggccatg gagccttcct gcagcacgtg gtgctggcgg cctgcgccct cctctgcatt 1440
    ctcagcatta tgctgctgcc ggagaccaag cgcaagctcc tgcccgaggt gctccgggac 1500
    ggggagctgt gtcgccggcc ttccctgctg cggcagccac cccctacccg ctgtgaccac 1560
    gtcccgctgc ttgccacccc caaccctgcc ctctgagcgg cctctgagta ccctggcggg 1620
    aggctggccc acacagaaag gtggcaagaa gatcgggaag actgagtagg gaaggcaggg 1680
    ctgcccagaa gtctcagagg cacctcacgc cagccatcgc ggagagctca gagggccgtc 1740
    cccaccctgc ctcctccctg ctgctttgca ttcacttcct tggccagagt caggggacag 1800
    ggagagagct ccacactgta accactgggt ctgggctcca tcctgcgccc aaagacatcc 1860
    acccagacct cattatttct tgctctatca ttctgtttca ataaagacat ttggaataaa 1920
    aaaaaaaaa 1929
    <210> SEQ ID NO 12
    <211> LENGTH: 1831
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 12
    ctggagccgc cctgggtgtc agcggctcgg ctcccgcgca cgctccggcc gtcgcgcacc 60
    tcgggcacct gcaggtccgt ggcgtcccgc ggctgggcgc ccctgactcc gtcccggcca 120
    gggagggcca tgatttccct cccggggccc ctggtgacca acttgctgcg gtttttgttc 180
    ctggggctga gtgccctcgc gcccccctcg cgggcccagc tgcaactgca cttgcccgcc 240
    aaccggttgc aggcggtgga gggaggggaa gtggtgcttc cagcgtggta caccttgcac 300
    ggggaggtgt cttcatccca gccatgggag gtgccctttg tgatgtggtt cttcaaacag 360
    aaagaaaagg aggatcaggt gttgtcctac atcaatgggg tcacaacaag caaacctgga 420
    gtatccttgg tctactccat gccctcccgg aacctgtccc tgcggctgga gggtctccag 480
    gagaaagact ctggccccta cagctgctcc gtgaatgtgc aagacaaaca aggcaaatct 540
    aggggccaca gcatcaaaac cttagaactc aatgtactgg ttcctccagc tcctccatcc 600
    tgccgtctcc agggtgtgcc ccatgtgggg gcaaacgtga ccctgagctg ccagtctcca 660
    aggagtaagc ccgctgtcca ataccagtgg gatcggcagc ttccatcctt ccagactttc 720
    tttgcaccag cattagatgt catccgtggg tctttaagcc tcaccaacct ttcgtcttcc 780
    atggctggag tctatgtctg caaggcccac aatgaggtgg gcactgccca atgtaatgtg 840
    acgctggaag tgagcacagg tcagtgaggg ggcctggagc tgcagtggtt gctggagctg 900
    ttgtgggtac cctggttgga ctggggttgc tggctgggct ggtcctcttg taccaccgcc 960
    ggggcaaggc cctggaggag ccagccaatg atatcaagga ggatgccatt gctccccgga 1020
    ccctgccctg gcccaagagc tcagacacaa tctccaagaa tgggaccctt tcctctgtca 1080
    cctccgcacg agccctccgg ccaccccatg gccctcccag gcctggtgca ttgaccccca 1140
    cgcccagtct ctccagccag gccctgccct caccaagact gcccacgaca gatggggccc 1200
    accctcaacc aatatccccc atccctggtg gggtttcttc ctctggcttg agccgcatgg 1260
    gtgctgtgcc tgtgatggtg cctgcccaga gtcaagctgg ctctctggta tgatgacccc 1320
    accactcatt ggctaaagga tttggggtct ctccttccta taagggtcac ctctagcaca 1380
    gaggcctgag tcatgggaaa gagtcacact cctgaccctt agtactctgc ccccacctct 1440
    ctttactgtg ggaaaaccat ctcagtaaga cctaagtgtc caggagacag aaggagaaga 1500
    ggaagtggat ctggaattgg gaggagcctc cacccacccc tgactcctcc ttatgaagcc 1560
    agctgctgaa attagctact caccaagagt gaggggcaga gacttccagt cactgagtct 1620
    cccaggcccc cttgatctgt accccacccc tatctaacac cacccttggc tcccactcca 1680
    gctccctgta ttgatataac ctgtcaggct ggcttggtta ggttttactg gggcagagga 1740
    tagggaatct cttattaaaa ctaacatgaa atatgtgttg ttttcatttg caaatttaaa 1800
    taaagataca taatgtttgt atgagataag a 1831
    <210> SEQ ID NO 13
    <211> LENGTH: 909
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 13
    gaggaggtgg gcgccaacag acaggcgatt aatgcggctc ttacccaggc aaccaggact 60
    acagtataca ttgtggacat tcaggacata gattctgcag ctcgggcccg acctcactcc 120
    tacctcgatg cctactttgt cttccccaat gggtcagccc tgacccttga tgagctgagt 180
    gtgatgatcc ggaatgatca ggactcgctg acgcagctgc tgcagctggg gctggtggtg 240
    ctgggctccc aggagagcca ggagtcagac ctgtcgaaac agctcatcag tgtcatcata 300
    ggattgggag tggctttgct gctggtcctt gtgatcatga ccatggcctt cgtgtgtgtg 360
    cggaagagct acaaccggaa gcttcaagct atgaaggctg ccaaggaggc caggaagaca 420
    gcagcagggg tgatgccctc agcccctgcc atcccaggga ctaacatgta caacactgag 480
    cgagccaacc ccatgctgaa cctccccaac aaagacctgg gcttggagta cctctctccc 540
    tccaatgacc tggactctgt cagcgtcaac tccctggacg acaactctgt ggatgtggac 600
    aagaacagtc aggaaatcaa ggagcacagg ccaccacaca caccaccaga gccagatcca 660
    gagcccctga gcgtggtcct gttaggacgg caggcaggcg caagtggaca gctggagggg 720
    ccatcctaca ccaacgctgg cctggacacc acggacctgt gacaggggcc cccactcttc 780
    tggacccctt gaagaggccc taccacaccc taactgcacc tgtctccctg gagatgaaaa 840
    tatatgacgc tgccctgcct cctgcttttg gccaatcacg gcagacaggg gttggggaaa 900
    tattttatt 909
    <210> SEQ ID NO 14
    <211> LENGTH: 1453
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, 913, 914,
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 14
    ggagccaagt ggggccctcg gcctcttcct tcgttccagg cccatgattt tccctacact 60
    tctccctggc ccaggctcca gccacaggca cctctcctgc ccccgcccac cctcctgacc 120
    gcagctccca ggccctggag acctccaggc tttcctgccc tgggcagccc cacctcacag 180
    ccagagtcaa tgccttcatg ggaagggctc ccagccacac ccagagtggc ccaaagctgt 240
    tgaagtcagc atcctttgtc ccatcaggac cctcctgcct cctctccagg cccttgttcg 300
    cctccccacc ctcctcagag gcccggggaa gggaagagca ggtcagtaca gaggttctgt 360
    ctacagggag gggccctggg tctatgcaca gctggagctc tgagccttcc acagcccgtg 420
    tgactgctag agggcagggg tgcagggctc aggggggccg ggctggtcct ttggggctgg 480
    tgttcctacg tcagtcccca cctggggaat aaactccagc ctctcctgct catacagaag 540
    gaactggttg ggtttgcttt atgggatctt tgagaccaaa acagatgctc ctgtttgctg 600
    ggggagggtg tgagcacgga gtatttctgt ccctcgtgaa gtcacgtcac acaggggaga 660
    ggcgaggtcg atggaactgg ccacgcacag gctctggctc tggaaggagg gatgatgagt 720
    gggcgttttc ccggcaggcc cccggggtcc tcagcctcag caacccaggg agaggacaga 780
    aatgaaccga tggttgaggg attgtcacgg gaggaacatg acacccgaag ggactctagg 840
    tgccctcgga gtgccacaca tgcccagacc ttctcacacc cacacaaata ggctctgccg 900
    tgnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnttgtt cacactcaga acccaggaca 960
    gccacagcca ccgcttaggg gaagccactg cagatgcccc tggaatgggc acagcacagc 1020
    cagggcgctc ttccaggcag gcgaggataa cttgagagtt tcctagggca ccagggacag 1080
    agctcagagg cccccgaggt gtgtgtagga ggcggaggcc cgcagagcac agagcaggag 1140
    aagggcttgg gccctggagg agaaagccat tctggacacc aggggacctg gacggagggt 1200
    ccccacagcc cgtgccccac gccgcctgga ggccagaggg gtcagtggcc ctgctgtccc 1260
    ggctccatct tggttctagc cgccacctgt atgaacacag tggcccggct taacgcacta 1320
    acccagcctc tccctgtgtc ccacagggag tagcaagacc caccccacac tgccttcacc 1380
    atctacacca gtgacgccgc tgtgtgtctt agcatggaaa taaataaacc tgaatgcaaa 1440
    aaaaaaaaaa agg 1453
    <210> SEQ ID NO 15
    <211> LENGTH: 443
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 15
    gatatagaca acttccagag tcaccagtgt gcaaatggag ccacctgcat tagtcatact 60
    aatggctatt cttgcctctg ttttggaaat tttacaggaa aattttgcag acagagcaga 120
    ttaccctcaa cagtctgtgg gaatgagaag acaaatctca cttgctacaa tggaggcaac 180
    tgcacagagt tccagactga attaaaatgt atgtgccggc caggttttac tggagaatgg 240
    tgtgaaaagg acattgatga gtgtgcctct gatccgtgtg tcaatggagg tctgtgccag 300
    gacttactca acaaattcca gtgcctctgt gatgttgcct ttgctggcga gcgctgcgag 360
    gtggacttgg cagatgactt gatctccgac attttcacca ctattggctc agtgactgtc 420
    gccttgttac tgatcctctt gct 443
    <210> SEQ ID NO 16
    <211> LENGTH: 1537
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 284, 285, 287
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 16
    aaaaaaaaca acccggtagc attgtccctt ccccactgac aaacttatca aatccagaag 60
    ctttagagtt tcgtctctaa ttatttttct cctgaacaaa attacccaag tcaaaacaaa 120
    atgtattttt agaattacgg cagcatacga cctgaatttt gtgagtttcg tggctttatc 180
    ttaaatcacc atttccctaa aaacggtttc tttctcctta gaaatgctgg tggcaacttg 240
    atgaaacagc caaatgcacc agggcaggtc actttcccaa aaannanaag aaaaaaaact 300
    cattgagata gctacagttc tataggttaa tttaaagcct cctttttcta ctcatttttg 360
    aaagcaaaat tacattttac tattttacat aaccagtgaa aagacgttga aagcctacag 420
    ctcactgttt ttggtgctct ggaaatgttg agggtgggtt tttaaccagt gatttttaac 480
    gtgcagtgaa tttgttagac ttttaaacac cagctaaggt agtcaaactt gatccccatt 540
    aaaaatcaag gaattagggg tcgggggagg gtttaggagt gatccagaat gacctcccag 600
    aattactgtg cgtacaactt tatttttcag agttttcatt ggaatggtaa gagttttatg 660
    aaagacagtt ttaaaactta ttctgagtta aatattaata ctttaaaaaa ttattgtact 720
    agacttattg cagccttttg aaagtagcag agtttcatca taccacatat ataacagagc 780
    ataaattttc tataatcagg caccttttgc tgcttttgag taagactgtt ttcctgttta 840
    ggtgttaagc atcgccagac ataaaaatct attctctcct ctcgattgta gcatagcctg 900
    acagctctag atacagcatt tctatgatga aaaatgagta tccatcagga aatctagaag 960
    actagccgtg ttttctcaga ctccaccttt gtttgcactc tgttgcctgt gaggagcttt 1020
    ctggcatgtg attatttact tcaaaactag agttccaagc acctacatta attattttat 1080
    attgtgtgca gaatagtata tcttttaatg tcagatatga tacactgcac atattgcttt 1140
    tgcactctta aaatttttgt actaaataat agaaaatatt tatattcttt gagtgtgagc 1200
    tttgaataga tggcattatc actttattgt ttttttaaca aaaacttttt ctcaattatt 1260
    ctattgcaat gttattctga gcaagtccta tgccaaatat cttgtataat gtttgtatgg 1320
    aagattaaat tttactcttg tgtggtaaga ctatttcagt tactgatttt atagttggaa 1380
    tttgatattc cagcacaaag tccacagtgt attcagaaat ccaagttggt gtcatacatt 1440
    tcattttgat gtgaactttt ctttgctttc ctttgttcta agactccatt ttgcaataaa 1500
    cgttttgaca gtaaaaaaaa taaaaaagga aaaaaaa 1537
    <210> SEQ ID NO 17
    <211> LENGTH: 972
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 17
    acgcaaattc ggcacgaggg ttctaaaacc cagtttggtt tacgttgtct ttcacagtag 60
    tatatttagc tcttctctgg aaagttgtgg gttaatataa ttcttaaaca tgaaaatgta 120
    attaaacaca ccacgagaga acaatattcc aggagactta atagtgatta ctttcttcaa 180
    tcaggaaatc gtttcagtgc ctcctttgta ggaatgcttt gttttgtgat gggttttctt 240
    aaagaagagc acacctccgt ccaatctcct gagacagcca cgtctccgct gacatcccac 300
    tgtgatgctt tcagatagtc agtgaatgtt tctgataacc ttcatccagt atctgaaaca 360
    caatgtgaga gattatattg ttttagataa taacatccca tttagttgac taaaatcttc 420
    caaactctga aagctgcaca ctgctactcc agagagtgca ggtcttagct cttctccttt 480
    ctgacttcaa gatgaatctt tgggacgatg tttctggtgc ttggtccaca gtgattcact 540
    tttgaaggag aggccacatg acatgaactg cctggtgtta caacctagct aacatatttg 600
    atgctactcc tgttgtctgt actgcttatt caagtagtat tctaagttat gttactaaaa 660
    aacatggtgg gtaaagcaca atcctaccca tcattgtcct ccaaaataat tgtatgacat 720
    acacggccca gcccattgcc ctccctgcat ctctgtgctg ctttgccatt tccccttcta 780
    cccagcctcc tcaaggggta ccttggtgga tatttcagta cttaaaacca gactgtaatc 840
    ataacctccc tctgtgtggc atcaataaat agccaaactc aaaaaaaaaa aaaaaaaaaa 900
    aaaaaaaaaa aaaaaatatc ggtcgcaagc ttattccctt tagtgagggt taattttagc 960
    ttgcactgcc ta 972
    <210> SEQ ID NO 18
    <211> LENGTH: 1544
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 18
    tactttgact ttggatcatt tccctgactg ggctaatgtg acacatattg agacttagga 60
    agagccacaa gaccacacac acagccctta ccctcctcag gactaccgaa ccttctggca 120
    caccttgtac agagttttgg ggttcacacc ccaaaatgac ccaacgatgt ccacacacca 180
    ccaaaaccca gccaatgggc cacctcttcc tccaagccca gatgcagaga tggacatggg 240
    cagctggagg gtaggctcag aaatgaaggg aacccctcag tgggctgctg gacccatctt 300
    tcccaagcct tgccattatc tctgtgaggg aggccaggta gccgagggat caggatgcag 360
    gctgctgtac ccgctctgcc tcaagcatcc cccacacagg gctctggttt tcactcgctt 420
    cgtcctagat agtttaaatg ggaatcagat cccctggttg agagctaaga caaccaccta 480
    ccagtgccca tgtcccttcc agctcacctt gagcagcctc agatcatctc tgtcactctg 540
    gaagggacac cccagccagg gacggaatgc ctggtcttga gcaacctccc actgctggag 600
    tgcgagtggg aatcagagcc tcctgaagcc tctgggaact cctcctgtgg ccaccaccaa 660
    aggatgagga atctgagttg ccaacttcag gacgacacct ggcttgccac ccacagtgca 720
    ccacaggcca acctacgccc ttcatcactt ggttctgttt taatcgactg gccccctgtc 780
    ccacctctcc agtgagcctc cttcaactcc ttggtcccct gttgtctggg tcaacatttg 840
    ccgagacgcc ttggctggca ccctctgggg tccccctttt ctcccaggca ggtcatcttt 900
    tctgggagat gcttcccctg ccatccccaa atagctagga tcacactcca agtatgggca 960
    gtgatggcgc tctgggggcc acagtgggct atctaggtcc tccctcacct gaggcccaga 1020
    gtggacacag ctgttaattt ccactggcta tgccacttca gagtctttca tgccagcgtt 1080
    tgagctcctc tgggtaaaat cttccctttg ttgactggcc ttcacagcca tggctggtga 1140
    caacagagga tcgttgagat tgagcagcgc ttggtgatct ctcagcaaac aacccctgcc 1200
    cgtgggccaa tctacttgaa gttactcgga caaagacccc aaagtggggc aacaactcca 1260
    gagaggctgt gggaatcttc agaagccccc ctgtaagaga cagacatgag agacaagcat 1320
    cttctttccc ccgcaagtcc attttatttc cttcttgtgc tgctctggaa gagaggcagt 1380
    agcaaagaga tgagctcctg gatggcattt tccagggcag gagaaagtat gagagcctca 1440
    ggaaacccca tcaaggaccg agtatgtgtc tggttccttg ggtgggacga ttcctgacca 1500
    cactgtccag ctcttgctct cattaaatgc tctgtctccc gcgg 1544
    <210> SEQ ID NO 19
    <211> LENGTH: 1109
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762,
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 19
    cggacgcgtg ggggccagcc tggaggccca gacgtggcgc agcgactcgg aggttcgcct 60
    ccagcttgcg catcatctgc ggccgggtcc cgatgagcct cctgttgcct ccgctggcgc 120
    tgctgctgct tctcgcggcg cttgtggccc cagccacagc cgccactgcc taccggccgg 180
    actggaaccg tctgagcggc ctaacccgcg cccgggtaga gacctgcggg ggatgacagc 240
    tgaaccgcct aaaggaggtg aaggctttcg tcacgcagga cattccattc tatcacaacc 300
    tggtgatgaa acacctccct ggggccgacc ctgagctcgt gctgctgggc cgccgctacg 360
    aggaactaga gcgcatccca ctcagtgaaa tgacccgcga agagatcaat gcgctagtgc 420
    aggagctcgg cttctaccgc aaggcggcgc ccgacgcgca ggtgcccccc gagtacgtgt 480
    gggcgcccgc gaagccccca gaggaaactt cggaccacgc tgacctgtag gtccgggggc 540
    gcggcggagc tgggacctac ctgcctgagt cctggagaca gaatgaagcg ctcagcatcc 600
    cgggaatact tctcttgctg agagccgatg cccgtccccg ggccagcagg gatggggttg 660
    gggaggttct cccaacccca ctttcttcct tccccagctc cactaaattc cctcctgcct 720
    taaaaaaaac aagaaaaacc aaacaaacaa nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 780
    nnnnnnnnnn nnnnnnnnnn nnnnnntctt ctatagtgtc acctaaattc aattcactgg 840
    ccgtcgtttt acaacgtcgt gactggcgac ggacaaagtt atcnntttaa tcgccttgca 900
    gcacataccc ctttggccag ctggggtaat aggggaagcg ggccggaccc gatcggcctt 960
    cccaaacagt tggggaagct tgaaatgcgc gacattgggc cgacggcctt ctatacggga 1020
    ggatctctaa acgcggccgg ggtgttggtt gggttaaggc ggagtgtgac cccgcataat 1080
    aacttttgca caggggccct ataggggcc 1109
    <210> SEQ ID NO 20
    <211> LENGTH: 1740
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 20
    aagagaagtt accccgatga cttggtttgg aaggggttaa ggcaccagtg catcctcttc 60
    taaagtgatt tatgatgatg tgtggagttt aaaaacttta ccccacccca aagaacagcc 120
    ctctcactcc tcactgagtc cactctgaac gtgctaaaat gggaaggagg cggtgttttg 180
    ctgatctgtt aaattcttag tgaagtttcc ttgatttcca gtggctgctg ttgtttgagt 240
    ttggtttgga gcaaaactga ggtagtccta acatttctgg gactgaatcc aggcaagaga 300
    aagaagaaaa agaagaagaa aaagaggagg aaaaaggtag ggagaaataa agggaggaga 360
    gaagcacagt gaaagaaaaa aaaagtccct tttgcgacat cacattcctg tgttttccct 420
    cagcctggaa aacatattaa tcccagtgct tttacgcccg gaaacaaaga gactaagcca 480
    gactatgggg gaaagggaga taagaaggat cctggaactt taaagaggga aagagtgaga 540
    ttcagaaatc gccaggactg gactttaagg gacgtcctgt gtcagcacaa gggactggca 600
    cacacagaca cacgagaccg aggagaaact gcagacaaat ggagatacaa agacttagaa 660
    ggacagctcc tttcacctca tcctacttgt ccagaaggta aaaagacaca gccagaaaga 720
    aaaggcatcg gctcagctct cagatcagga caggctgtgg atctgtggcg gtactctgaa 780
    agctggagct gcagcacacc ccttttgtat tgctcaccct cggtaaagag agagagggct 840
    gggaggaaaa gtagttcatc taggaaactg tcctgggaac caaacttctg atttcttttg 900
    caaccctctg cattccatct ctatgagcca ccattggatt acacaatgac atggagaatg 960
    ggaccccgtt tcactatgct gttggccatg tggctagtgt gtggatcaga accccacccc 1020
    catgccacta ttagaggcag ccacggagga cggaaagtgc ctttggtttc tccggacagc 1080
    agtaggccag ctcggtttct gaggcacact gggaggtctc gcggaattga gagatccact 1140
    ctggaggaac caaaccttca gcctctccag agaaggagga gtgtgcccgt gttgagacta 1200
    gctcgcccaa cagagccgcc agcccgctcg gacatcaatg gggccgccgt gagacctgag 1260
    caaagaccag cagccagggg ctctccgcgt gagatgatca gagatgaggg gtcctcagct 1320
    cggtcaagaa tgttgcgttt cccttacggg gtccagctct cccaacatcc ttgccagctt 1380
    tgcagggaag aacagagtat gggtcatctc agcccctcat gcctcggaag gctactaccg 1440
    cctcatgatg agcctgctga aggacgatgt gtactgtgag ctggcggaga ggcacatcca 1500
    acagattgtg ctcttccacc aggcaggtga ggaaggaggc aaggtgagaa ggatcaccag 1560
    cgagggccag atcctggagc agcccctggg accctagcct catccctaag ctgatgagct 1620
    tcctgaagct ggagaagggc aagtttggca tggtgctgct gaagaagacg ctgcaggtgg 1680
    aggagcgcta tccatatccc gttaggctgg aagccatgta cgaggtcatc gaccaaggcc 1740
    <210> SEQ ID NO 21
    <211> LENGTH: 4467
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 971, 978, 1295, 1296, 1297, 1298, 1299, 1300, 1301, 1302,
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 21
    gcgtcgcgct caccctgcgc gtgcccccgc ggcgcctggg cgtcttcctg gactacgagg 60
    ccggagagct gtccttcttc aacgtgtccg acggctccca catcttcacc ttccacgaca 120
    ccttctcggg cgcgctctgt gcgtacttca ggcccagggc ccacgacggc ggcgaacatc 180
    cggatcccct gaccatctgc ccgctgccgg ttagagggac gcgcgtcccc gaagagaacg 240
    acagtgacac ctggctacag ccctatgagc ccgcggatcc cgccctggac tggtggtgag 300
    gcgccctcgt ggccgcggga ctggccccgg gggggccccc tggatcccag gccagcgctt 360
    tgctctcctg ctccgtctga agggagcagg tgcaccagcc aaaatgtcag cgagggggac 420
    aaagagaggg acctttgcct acgtagatgt gtatgtgtag tgcgattttc ttcaaggaaa 480
    ggagacaagt ccaaagctcg tttgtggatt gtgggactga gcaaaggagt acaaatatat 540
    ccacgtcgct cagagctggg gtgctcacgg tgggtggtgg gaaagaagcc agcatggaag 600
    aaagaaggga gaaaactttg gtgactgcct tagagggatc agttaatttg tatagtttta 660
    tattttttgt atatgtttgc tagctctaaa aaggtcgaga tgcaataaca cttcgtaagc 720
    aacgagttca cctaagtaag gctcagatcc tagttttaaa aaccatttcc cattaaaatg 780
    aagttggagg aacagctgct tctggagccg gggcaaaaaa tttcaaggtg agcctggagc 840
    attgtgtgtg gtgaagtaaa ataaaggctc aaaacgtgac ggcaacccgg caaaagggta 900
    gggagccagg ccgaagggcc tcactgacca attgtgggac aatttgaaca tcaggatgaa 960
    taatgacagg ngaggttnta acacactgaa taaaaacata atccatgagt tcatgctgat 1020
    actcaaattt ctttttaaaa aggagaaaca ggaaggtttc ttttggaggt gaaatctaat 1080
    tattggtgag agtcttggag aacaggctgt ttccagtctc aaagcagtaa ccttatacac 1140
    tacttataag tttgaaaggg gaaaggttac ctttacaatg gagacatcta ccagatgcat 1200
    ccaagtgatt aaatttaaca tcatcaatga tgggaccaag gacattatta gtttgacaac 1260
    tggggaaaga agtgttcttc accccctacc cccannnnnn nnnnnnnnnn nnnnnnnnnn 1320
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1380
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1440
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1500
    nnnnnnnnnn nnnnnnnnnn nnttcatttt cagagtgaga catttgtact gtggctatgt 1560
    aggagaacat tcttgttctt agcaaacata ctgaagtttt tagatattaa ttaccacagt 1620
    gtctgccact gaatttccag tgactaagtg gaaaaatata aaacatatga atataaagaa 1680
    agaaagagac aagtcaaatg tagtaaaatg acaacacttg gtgactctag gtgactggtc 1740
    gacagatgtt cattgtacta tcaatgtggc tttgctgtgg gtttgaaatt ttgcaaacta 1800
    agagttgggt ggcggggaga aggatacacc aaaaaactaa gtgattatct ttggatggga 1860
    aaatgtttgg taattgcatt cttaaaatgt cttctttgta ttttttaatg ttcaataatg 1920
    tatatgtatc agttctgtaa taaaggggaa aacacttttt ttaaatactc ataaaaaacc 1980
    atccgtagga tcgagaagat caggcagaag ggctttgtcc agaaatgtaa ggcctctggt 2040
    gtagagggcc aggtggtggc ggaggggaat gacggtggag ggggagcagg aaggccaagc 2100
    ctgggcagcg agaagaagaa agaggaccca aggagagcac aagtcccacc aaccagagag 2160
    agtcgggtga aggtcctgag aaaactggcc gccactgcac cagctttgcc ccaacctccc 2220
    tcaaccccca gagccaccac ccttcctcct gccccaggcc acaacagtga ctcggtccac 2280
    gtcccgggcg gtaacagttg ctgcaagacc tatgaccacc actgcctttc ccaccacggc 2340
    agaggccctg gaccccctca ccctcccaca ggccccctac aaccactgag gtgatcactg 2400
    ccaggagacc ctcagtttca gagaatcttt accctccatc ccggaaggat cagcacaggg 2460
    agaggccaca gacaaccagg aggcccagca aggccaccag cttggagagc ttcacaaatg 2520
    cccctcccac caccatctca gaacccagca caagggctgc tggcccaggc cgtttccggg 2580
    acaaccgcat ggacaggcgg gaacatggcc accgagaccc aaatgtggtg ccaggtcctc 2640
    ccaagccagc aaaggagaaa cctcccaaaa agaaggccca ggacaaaatt cttagtaatg 2700
    agtatgagga gaagtatgac ctcagccggc ctactgcctc tcagctggag gacgagctgc 2760
    aggtggggaa tgttcccctt aaaaaagcaa aggagtctaa aaagcatgaa aagcttgaga 2820
    aaccagagaa ggagaagaaa aaaaagatga agaatgagaa cgcagacaag ttacttaaga 2880
    gtgaaaagca aatgaagaag tctgagaaaa agagcaagca agagaaagag aagagcaaga 2940
    agaaaaaagg aggtaaaaca gaacaggatg gctatcagaa acccaccaac aaacacttca 3000
    cgcagagtcc caagaagtca gtggccgacc tgctggggtc ctttgaaggc aaacgaagac 3060
    tccttctgat cactgctccc aaggctgaga acaatatgta tgtgcaacaa cgtgatgaat 3120
    atctggaaag tttctgcaag atggctacca ggaaaatctc tgtgatcacc atcttcggcc 3180
    ctgtcaacaa cagcaccatg aaaatcgacc actttcagct agataatgag aagcccatgc 3240
    gagtggtgga tgatgaagac ttggtagacc aagcgtctca tcagcgagct gaggaaagag 3300
    tacggaatga cctacaatga cttcttcatg gtgctaacag atgtggatct gagagtcaag 3360
    caatactatg aggtaccaat aacaatgaag tctgtgtttg atctgatcga tactttccag 3420
    tcccgaatca aagatatgga gaagcagaag aaggagggca ttgtttgcaa agaggacaaa 3480
    aagcagtccc tggagaactt cctatccagg ttccggtgga ggaggaggtt gctggtgatc 3540
    tctgctccta acgatgaaga ctgggcctat tcacagcagc tctctgccct cagtggtcag 3600
    gcgtgcaatt ttggtctgcg ccacataacc attctgaagc ttttaggcgt tggagaggaa 3660
    gttgggggag tgttagaact gttcccaatt aatgggagct ctgttgttga gcgagaagac 3720
    gtaccagccc atttggtgaa agacattcgt aactattttc aagtgagccc ggagtacttc 3780
    tccatgcttc tagtcggaaa agacggaaat gtcaaatcct ggtatccttc cccaatgtgg 3840
    tccatggtga ttgtgtacga tttaattgat tcgatgcaac ttcggagaca ggaaatggcg 3900
    attcagcagt cactggggat gcgctgccca gaagatgagt atgcaggcta tggttaccat 3960
    agttaccacc aaggatacca ggatggttac caggatgact accgtcatca tgagagttat 4020
    caccatggat acccttactg agcagaaata tgtaacctta gactcagcca gtttcctctg 4080
    cagctgctaa aactacatgt ggccagctcc attcttccac actgcgtact acatttcctg 4140
    cctttttctt tcagtgtttt tctaagacta aataaatagc aaactttcac ctattcatga 4200
    gttattattg aaacctcaaa tcataaagac atttaaaaga attgtttttc taactggagg 4260
    ggctctagtg ctaaataata gtactgaaaa ttgatattat tttccttttc ttatatgaag 4320
    gaccttattt ggcatataaa attttataaa atatgtattt aaagcttttt cttatttttt 4380
    gtattaattg gtaagtgaaa actctgttaa agatcacacc acaatgtttt caagaaacat 4440
    ctgaaaagat aaaacaaaga acaaata 4467
    <210> SEQ ID NO 22
    <211> LENGTH: 2965
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 1473, 1474, 1475, 1476, 1477, 1478, 1479, 1480, 1481, 1482,
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 22
    aaacaaagtt caatttagct ggatttctga actatggttt tgaatgttta aagaagaatg 60
    atgggtacag ttaggaaagt ttttttctta cacccgtgac ttgagggaaa cattgcttgt 120
    ctttgagaaa ttgactgaca tactggaaga gaacaccatt ttatctcagg ttagtgaaga 180
    atcagtgcag gtccctgact cttattttcc cagaggccat ggagctgaga ttgagactag 240
    ccttgtggtt ttcacactaa agagtttcct tgttatgggc aacatgcatg acctaatgtc 300
    ttgcaaaatc caatagaagt attgcagctt ccttctctgg ctcaagggct gagttaagtg 360
    aaaggaaaaa cagcacaatg gtgaccactg ataaaggctt tattaggtat atctgaggaa 420
    gtgggtcaca tgaaatgtaa aaagggaatg aggtttttgt tgttttttgg aagtaaaggc 480
    aaacataaat attaccatga tgaattctag tgaaatgacc ccttgacttt gcttttctta 540
    atacagatat ttactgagag gaactatttt tataacacaa gaaaaattta caattgatta 600
    aaagtatcca tgtcttggat acatacgtat ctatagagct ggcatgtaat tcttcctcta 660
    taaagaatag gtataggaaa gactgaataa aaatggaggg atatcccctt ggatttcact 720
    tgcattgtgc aataagcaaa gaagggttga taaaagttct tgatcaaaaa gttcaaagaa 780
    accagaattt tagacagcaa gctaaataaa tattgtaaaa ttgcactata ttaggttaag 840
    tattatttag gtattataat atgctttgta aattttatat tccaaatatt gctcaatatt 900
    tttcatctat taaattaatt tctagtataa ataagtagct tctatatctg tcttagtcta 960
    ttataattgt aaggagtaaa attaaatgaa tagtctgcag gtataaattt gaacaatgca 1020
    tagatgatcg aaaattacgg aaaatcatag ggcagagagg tgtgaagatt catcattatg 1080
    tgaaatttgg atctttctca aatccttgct gaaatttagg atggttctca ctgtttttct 1140
    gtgctgatag taccctttcc aaggtgacct tcagggggat taaccttcct agctcaagca 1200
    aggagctaaa aggagcctta tgcatgatct tcccacatat caaaataact aaaaggcact 1260
    gagtttggca tttttctgcc tgctctgcta agaccttttt ttttttttac tttcattata 1320
    acatattata catgacatta tacaaaaatg attaaaatat attaaaacaa catcaacaat 1380
    ccaggatatt tttctataaa actttttaaa aataattgta tctatatatt caattttaca 1440
    tcctttttca aaggctttgt ttttctaaag gcnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1500
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1560
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1620
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1680
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1740
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnatc tgggccttac gtaatatatt 1800
    ttcttaatgg ctgcataata tcacatcaaa taggcatttt tcaaacctct ttccttatta 1860
    aacatgtaga ctatatccat tttttactaa aataaataac atttcagata atatctttgc 1920
    actgataatg ttgccaagcc atttctaaag tgaccttatc aatttaatta ccattggatg 1980
    agggtgttgc tttcatcgca ccattgtaga ttgtcttttt tatttcaatt tgcgtttatt 2040
    tataactggt tgcaaaggta cacagaacac acgctccttc aacttatctt tgataaaccc 2100
    aagcaaggat acaaaaagtt ggacgacatt gagtagagtc atggtatacg gtgctgaccc 2160
    tacagtatca gtggaaaaga taaggaaaat gtcactactc acctatgtta tgcaaaacag 2220
    ttaggtgtgc tggggctgga tactgctctt ttacttgagc attggttgat taaagtttag 2280
    gtaccatcca gggctggtct agagaagtct ttggagttaa ccatgctctt tttgttaaag 2340
    aagagagtaa tgtgtttatc ctggctcata gtccgtcacc gaaaatagaa aatgccatcc 2400
    ataggtaaaa tgctgaccta tagaaaaaaa tgaactctac ttttatagcc tagtaaaaat 2460
    gctctacctg agtagttaaa agcaattcat gaagcctgaa gctaaagagc actctgatgg 2520
    ttttggcata atagctgcat ttccagacct gacctttggc cccaaccaca agtgctccaa 2580
    gccccaccag ctgaccaaag aaagcccaag ttctccttct gtccttccca caacctccct 2640
    gctcccaaaa ctatgaaatt aatttgacca tattaacaca gctgactcct ccagtttact 2700
    taaggtagaa agaatgagtt tacaacagat gaaaataagt gctttgggcg aactgtattc 2760
    cttttaacag atccaaacta ttttacattt aaaaaaaaag ttaaactaaa cttctttact 2820
    gctgatatgt ttcctgtatt ctagaaaaat ttttacactt tcacattatt tttgtacact 2880
    ttccccatgt taagggatga tggcttttat aaatgtgtat tcattaaatg ttactttaaa 2940
    aataaaanaa naaaaaaaaa naaaa 2965
    <210> SEQ ID NO 23
    <211> LENGTH: 1734
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582,
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 23
    cgcctccgga aactgccccc cgggctgctg gccaacttca ccctcctgcg cacccttgac 60
    cttggggaga accagttgga gaccttgcca cctgacctcc tgaggggtcc gctgcaatta 120
    gaacggctac atctagaagg caacaaattg caagtactgg gaaaagatct cctcttgccg 180
    cagccggacc tgcgctacct cttcctgaac ggcaacaagc tggccagggt ggcagccggt 240
    gccttccagg gcctgcggca gctggacatg ctggacctct ccaataactc actggccagc 300
    gtgcccgagg ggctctgggc atccctaggg cagccaaact gggacatgcg ggatggcttc 360
    gacatctccg gcaacccctg gatctgtgac cagaacctga gcgacctcta tcgttggctt 420
    caggcccaaa aagacaagat gttttcccag aatgacacgc gctgtgctgg gcctgaagcc 480
    gtgaagggcc agacgctcct gggcagtggc caagtcccag tgagaccagg ggcttgggtt 540
    gagggtgggg ggtctggtag aacactgcaa nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 600
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 660
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 720
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 780
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 840
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnna 900
    taatcctgct tttacaggtg aaactcgggg ctgtccatag cggctgggac cccgtttcat 960
    ccatccatgc ttcctagaac acacgatggg ctttccttac ccatgcccaa ggtgtgccct 1020
    ccgtctggaa tgccgttccc tgtttcccag atctcttgaa ctctgggttc tcccagcccc 1080
    ttgtccttcc ttccagctga gccctggcca cactggggct gcctttctct gactctgtct 1140
    tccccaagtc agggggctct ctgagtgcag ggtctgatgc tgagtcccac ttagcttggg 1200
    gtcagaacca aggggtttaa taaataaccc ttgaaaactg gatcggatga attggctttc 1260
    attgtgttcc tagcatcttc tcaaatcaac ttcccaggac tccagggtga aggaggaaaa 1320
    gaggcatggc ccaggccctg gggtgtggga tatggtctcc ctaggggatg acagttggga 1380
    tcaatggcct gtgacttctc ctctcccttc ccccatcctg ggacctaact ggaaataaaa 1440
    ccttgactgt tgcccgggtg tcattttacc agtggatttc tgccagggct tgtgtcctag 1500
    gagaaggttt aagttaaacc agattgccca ggtctccaaa cgatttgtca tgctgacctg 1560
    agatcatcga agggggcacc tgcccccggg caaggttgca ggggcaggat ggggctgaag 1620
    ggatgagcag ggtcccgggc ccacctgctg atacagcatt ggccatgtgg gggctgcaat 1680
    cggatttgga agaccctggg gcttgggggc atgtccattt tcccagtccc taaa 1734
    <210> SEQ ID NO 24
    <211> LENGTH: 4005
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 24
    ggacaccgtc tgcagtggag tcactggtgc cgtaaatgtg gccaaggggg ccgtccagac 60
    gggctgtaga cacggccaag accgtgctga ccggcaccaa ggacacagtc actactgggc 120
    tcatgggggc agtgaatgtc gccaaaggga ccgtccagac cagtgtggac accaccaaga 180
    ctgtcctaac tggtaccaag gacaccgtct gcagtggggt gaccggtgct gcgaatgtgg 240
    ccaagggggc cgtccagggg ggcctggaca ctacaaagtc tgtcctgact ggcactaaag 300
    acaccgtatc cactgggctc acaggggctg tgaacttggc caaagggact gtccagaccg 360
    gcgtggacac cagcaagact gtcctgaccg gtaccaagga caccgtctgc agtggagtca 420
    ctggtgccgt aaatgtggcc aaaggcaccg tccagacagg tgtggacaca gccaagacgg 480
    tgctgagtgg cgctaaggat gcagtgacta ctggagtcac gggggcagtg aatgtggcca 540
    aaggaaccgt gcagaccggc gtggacgcct ccaaggctgt gcttatgggt accaaggaca 600
    ctgtcttcag tggggttacc ggtgccatga gcatggccaa aggggccgtc caggggggcc 660
    tggacaccac caagacagtg ctgaccggaa ccaaagacgc agtgtccgct gggctcatgg 720
    ggtcagggaa cgtggcgaca ggggccaccc acactggcct cagcaccttc cagaactggt 780
    tacctagtac ccccgccacc tcctggggtg gactcaccag ttccaggacc acagacaatg 840
    gtggggagca gactgccctg agcccccaag aggccccgtt ctctggcatc tccacgcccc 900
    cggatgtgct cagtgtaggc ccggagcctg cctgggaagc cgcagccact accaagggcc 960
    ttgcgactga cgtggcgacg ttcacccaag gggccgcccc aggcagggag gacacggggc 1020
    ttttggccac cacacacggc cccgaagaag ccccacgctt ggcaatgctg cagaatgagt 1080
    tggaggggct gggggacatc ttccacccca tgaatgcgga ggagcaagct cagctggctg 1140
    cctcccagcc cgggccaaag gtgctgtcgg cggaacaggg gagctacttc gttcgtttag 1200
    gtgacctggg tcccagcttc cgccagcggg catttgaaca cgcggtgagc cacctgcagc 1260
    acggccagtt ccaagccagg gacactctgg cccagctcca ggactgcttc aggctgattg 1320
    aaaaggccca gcaggctcca gaagggcagc cacgtctgga ccagggctca ggtgccagtg 1380
    cggaggacgc tgctgtccag gaggagcggg atgccggggt tctgtccagg gtctgcggcc 1440
    ttctccggca gctgcacacg gcctacagtg gcctggtctc cagcctccag ggcctgcccg 1500
    ccgagctcca gcagccagtg gggcgggcgc ggcacagcct ctgtgagctc tatggcatcg 1560
    tggcctcagc tggctctgta gaggagctgc ccgcagagcg gctggtgcag agccgcgagg 1620
    gtgtgcacca ggcttggcag gggttagagc agctgctgga gggcctacag cacaatcccc 1680
    cgctcagctg gctggtaggg cccttcgcct tgcccgctgg cgggcagtag ctgtaggagc 1740
    ctgcaggccc ggcgcggggt cgccctgctc tgtccaggga ggagctgcct cagaactttc 1800
    tccccgcccc caaacctgga tcggttccct aaagccctag acctttgggg ctgcagctgg 1860
    ctgagcgccg aggggctgcg gaggcagtga ccttcttaac tgagccaccc cacgccctgc 1920
    tccgggcctg cctgcatctc ccacctcctc cccagcgctg cctgcccctc tcggagcctg 1980
    gggtcactca gaccaccagc caagagcctt cccttgaagt ccccaagcaa gcactgcaat 2040
    taggaaagag aaaaagcagc gtgcccagcc tggaagggca tctgtttgcc ccgctagcaa 2100
    cccttttata tctagcaggg ctcttccagt cctgcagcac gggcccccag ctatcagcgg 2160
    tgcaggcagt gctgtggcat cccaggctcc gggcagctcc gttctcatgc tgaaagtggg 2220
    tctccggcct tagcacacac accttgaggg tcttaagaac cacattccct catagtagaa 2280
    agtactagaa aaagcgacac tgccatcatc atcccaaggc aggctgctac tgcctttgct 2340
    gacccccggg gtggcctcac ggtggggaca aagctgccag gagccacagc agccacagct 2400
    ggggctttgc accagcctgg cttgagactg agcagtttgc agggggtggg gggtgcaaaa 2460
    aacaagcaaa caggctgctg ctgcctccag ctgcccacca caggcctgcc ccaggcacct 2520
    ggggctctga ggcccctggg gaggctgggc ccagcagctg cccctggaga acacagacaa 2580
    aggacttccc cgcagggaac tgtgccctat ggagggatca gacagggctg ggaacagcca 2640
    cagaggctgc gtgcctatgg cacagccctt cctccgccgc acactccccc tgggtcctca 2700
    ggcccaccca agcgccgggc tgcagaggaa gcggggctgg ggaggctgca ggcatcagag 2760
    acactggtgg tggcggaccc ggccgccggg ccccgtgctc tcaggctagc ccaggtcgtg 2820
    gaggctggca ggctcaggtc gggtgtgaga cgtgccgtgg ctgcgctcag tccagcgggg 2880
    aggagccgtt cagcccggcc tccccaggaa gccatatccc cactcacccg gtaagagaac 2940
    cttgtcgtcc cctttccatg ctctcctagg acacgagccc aggaacccca gacccagggg 3000
    gaggaagggt ggaggggccc caggggtcac catgtgcacc aggggccgtg aggggccggg 3060
    gcattcagct cagctctgaa ccggggaagc tggcacggca aggactgcct caggtgacgg 3120
    gccgtgagag gggacgggtc aggagccttc ccaagccttc tcctcagccc gacacccatg 3180
    gccatcggag gctaggatgc cagacacagc catttgcaga aatcaggcac agtgactgca 3240
    gctcacgtcc agccaaccaa gcatggggcc gcagctcagg aagtcccttc ccgccacacc 3300
    acagcctaat tcttactggg acggaggcaa ctcggctacg ctgggcagga cgacaaacac 3360
    gagacgccac tgtggaatga gcaacttcgg agcacggggt gacttgcttg ggaccgtgcc 3420
    cacgtgacag ccccttatgc agaggaggaa agagaagccc cgagtgggag gggaacctgt 3480
    ccaaagtcac acggtgtgtg ggtgacacag ctggggtgag tcgaggctgg cccctgaggc 3540
    ccatgctccc tgaacgctgg agaccactgt cggctagcag cggctctcag ggaaggcctg 3600
    gtctccaccc tcccagccta gcctcgcgga ccctcgtcct ccccacatcg gacctgctca 3660
    cctgcctgga ccctgggctg ccagatgcag gaagcatcaa accccccagc ctcgtgggtg 3720
    cggggcaggg cgcaggcagc acagcttaga tgccctggtt tgtccctctt gtctcctggg 3780
    aagagcttgc tcccgcccag ctctcctgcc actggccttt cagggttggg ctgggcccag 3840
    agtgcctttt agtcgcttct cacggtggcc tgatggctca acccagtccc aaacgggccc 3900
    agtgacactg ccgactgcac cccagctcag gcccccactg caccagcaat gctagaaaac 3960
    caagccaata aaagtgattt cttttttcat taaaaaaaaa aaaaa 4005
    <210> SEQ ID NO 25
    <211> LENGTH: 846
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 25
    caaaaatgag cggggtgtgg tggcccatgc ctgtagtccc agctgctcgg gagactgaat 60
    ctcttgagcc tgggaagcag aggttgcagt gaactgagat cgcgtcactg cactccagcc 120
    tggtgacaga gcgagattcc atctcaaaaa aaaaaaacag tatgcacgta caaatttctt 180
    aacctgttat caatgtctga gctacataat tatctttcta gttggagttt gttttaggtg 240
    tgtaccaact gacatttcag tttttctgtt tgaagtccaa tgtattagtg actctgtggc 300
    tgctctcttc acctgcccct tgtggcctgt ctacaattct aaatggattt tgaactcaat 360
    gtcgtcgctt ctggtttcct gcatatacca atagcattac ctatgacttt ttttttcctg 420
    agctattttc actgagctga gctaatgaac taaaactgag ttatgtttaa tatttgtatc 480
    aaatacataa aaggaatact gctttttcct tttgtggctc aaaggtagct gcattttaaa 540
    atatttgtga aaataaaaac ttttgttatt agaaaaaaaa aaaaaaaaaa aaaaaaaaaa 600
    aaaaaaaaaa agaaagacca aaaaggaaga gaagggaaaa agaagaagag aaacggaggg 660
    acaacgggaa acacagagag cgagccggtg acgaaaagcg ggaaggccaa cgcaggagaa 720
    gaaagagagg ggggcggcgt cgctcattgt gggagtgtcc tcagagttat gcgagtgggg 780
    gatgatgggc aggagtgcta tgcgcccctt tgtatgaggg ggtgcctcaa ttgttgatgg 840
    gccggg 846
    <210> SEQ ID NO 26
    <211> LENGTH: 599
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 103
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 26
    cggacggtgg gcggacgcgt gggcggacgc gtggggggtc ttgctatgtt taacaaccca 60
    ggctgatctc aaactcctgg gctcaaatga tcctccacct tgncccctca aagtgatggg 120
    attataggca tgagccactg gctggcatca ggtgccaaga tttctgtact gcctctaatt 180
    tctgctacca cttaaactca ggcaggtgga gcctacacac tgatatttcc ttgtggatat 240
    cacacttcag aacgtgtccg ctagataaag ctctcaaact taccaaggaa agtgatgaca 300
    gcttgactcg gccttacaca gaaccctatg taggtctcac acaatagaac aatgtacaaa 360
    taagcatttt tctttcccaa agaagcatgt aaagatttcc cattcctgcc actcaacttc 420
    tctttgttgt gacagggtgg aagaattact gtatatagaa aagatgtccg cagcgttcag 480
    taaacacaga cactaatgag actcagaggc tcatctgtgg tcaggtatta taacagctta 540
    aaactaaaaa aaaaaaaaaa aaaagggcgg tccaagctta ttccctttag tgaggttaa 599
    <210> SEQ ID NO 27
    <211> LENGTH: 603
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 27
    gttccacgtt gcttgaaatt gaaaatcaag ataaaaatgt tcacaattaa gctccttctt 60
    tttattgttc ctctagttat ttcctccaga attgatcaag acaattcatc atttgattct 120
    ctatctccag agccaaaatc aagatttgct atgttagacg atgtaaaaat tttagccaat 180
    ggcctccttc agttgggaca tggtcttaaa gactttgtcc ataagacgaa gggccaaatt 240
    aatgacatat ttcaaaaact caacatattt gatcagtctt tttatgatct atcgctgcaa 300
    accagtgaaa tcaaagaaga agaaaaggaa ctgagaagaa ctacatataa actacaagtc 360
    aaaaatgaag aggtaaagaa tatgtcactt gaactcaact caaaacttga aagcctccta 420
    gaagaaaaaa ttctacttca acaaaaagtg aaatatttag aagagcaact aactaactta 480
    attcaaaatc aacctgaaac tccagaacac ccagaagtaa cttcacttaa agtaagtaga 540
    aaataaagag ggttcatgtt tatgttttca atgtggatct tttaaaaaaa atatttctaa 600
    ggc 603
    <210> SEQ ID NO 28
    <211> LENGTH: 879
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 28
    gccacgcgtc cgcaaacaca aaaagaaaac aaacaaacaa aaaaacaaaa agactggctg 60
    gcgggaaggg tgactcgggc ctttgctccc gagccagagc ccccaaccct gacctgatcc 120
    ccctctctgc gcaggtggag ttctacttcc tttcccagta cgtgtcgcca gccgactccc 180
    cgttccgcca catcttcatg ggccgtggag accacacgct gggcgccctg ctggaccacc 240
    tgcggctgct gcgctccaac agctccggga cccccggggc cacctcctcc actggcttcc 300
    aggagagccg tttccggcgt cagctagccc tgctcacctg gacgctgcaa ggggcagcca 360
    atgcgcttag cggggatgtc tggaacattg ataacaactt ctgaggccct ggggatcctc 420
    acatccccgt cccccagtca agagctcctc tgctcctcgc ttgaatgatt cagggtcagg 480
    gaggtggctc agagtccacc tctcattgct gatcaatttc tcattacccc tacacatctc 540
    tccacggagc ccagacccca gcacagatat ccacacaccc cagccctgca gtgtagctga 600
    ccctaatgtg acggtcatac tgtcggttaa tcagagagta gcatcccttc aatcacagcc 660
    ccttcccctt tctggggtcc tccataccta gagaccactc tgggaggttt gctaggccct 720
    gggacctggc cagctctgtt agtgggagag atcgctggca ccatagcctt atggccaaca 780
    ggtggtctgt ggtgaaaggg gcgtggagtt tcaatatcaa taaaccacct gatatcaata 840
    aaaaaaaaaa aaaaaaattc tgggcgcaag aaatcgctg 879
    <210> SEQ ID NO 29
    <211> LENGTH: 397
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 319, 331
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 29
    cctactcaac agggggtccc aaatgcccac tgcaactagg tacagggtct gtgtgtggtg 60
    gtggaggatt ggcattggga gacatgggag gcaaagagct gggcctggcc aggccaggcc 120
    tctggcttcc aagaactcct agttccaggg gacacccagt gggggaagat ctggctgctg 180
    ggaggcccac agcctagggc tggtcggcca aacagccagc tctggtccct gctcacaagt 240
    gccctatggc ttctaatgca tttctcttct tcctcccttg ctgcacctgc agatgcagaa 300
    ggcagggcct ccccagcanc ctactggctg ncccactcca tttggactgg cacattggac 360
    tggggcatca cattccctca gaacagcctg ataaatg 397
    <210> SEQ ID NO 30
    <211> LENGTH: 1740
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227,
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 30
    tacaaagttt taagaaagcc agcatctcag aaaggccttt caaacaagga cacttaatta 60
    gccatcttat gtataagaaa agaaatataa agaacatgaa aatttaaaaa cagatttggc 120
    agttttataa cagtctagga ggtggtgtta ttttttccta ttaagaatta gagggcaggt 180
    taggaataaa taaaatacag tttgaaaata atgagnnnnn nnnnnnnnnn nnnnnnnnnn 240
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 300
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 360
    nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn gcacttcccc tactgattgc tgccttctct 420
    gtggctacaa gggacccaca gaattacagg gaagttacag ggaagcaggt ttcatctcaa 480
    tattgggaga gatttcaaac aatcacacct gcctgagaag gagtgggctg tcactaggaa 540
    tttttattcc cagtccgtca ggaattttgt agaagggctt catgtgctgg taccaatagg 600
    acaggaagat tttaatcagc tttactatct atgttttttt atggaaactg tgtgtatgta 660
    tacatacatt ttccaaaaag aaaaattaaa tgattataga gattatgttt ttcagactac 720
    tcacgtatct gcttttctta ctccccacct ctgctgataa ttcctagttc attggttttt 780
    cccccacact ggaattacct ggggagctta aaaaaccctg atgcctgggt cccaccctca 840
    gagatnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnt gatatgaatg 900
    cagcctgggc actgggaaat ttaaaaactc cccagataat tactgtgaca gccaaggttg 960
    agacctccta gtctagagct ttgctataca cagggtggtc cacaagccag gagcatcagc 1020
    atcacttggc agcgcttaga gattcagacc ccagacccac tgaatctgac cctgcatttt 1080
    cactagaccc caggtgatca gctgcactag actcaacctc taaaccaaga cctcccaccc 1140
    tcacagtcta tgatccttta gtgaccctca gctgagtcct gtgctgaact gtgtttgttc 1200
    tccttgagca catgcccgct gaccagggac agactggatg agcaagcaac ctgctggcta 1260
    tggagaagag ccaggctggg taaatgtttg ctgtgactaa gccaggatca aagaactgcc 1320
    tgttgcttgc actggctggc actgagcttg ccactctgtg aactgtgctt ccttcccctg 1380
    catggacctg tgcctcagtc actattatcc gcaggccttc tccaagggca gccctctcct 1440
    tgtttatccc tcttaagcct gcgtgcagga aggcacatta accctgtggc cccctgcagg 1500
    caggagggtg ttgggtgccc ttacctacct tgcccttttt cttgtaccgt aggctgtgcc 1560
    gtttatgagt aagtgatgtg tgtctgtgtg tgtgtctaga agtgctgcac tcaccttgtg 1620
    ttattggagg ttgtgtaacc ccctagcttt gagcctggtc tcagatgttc cttttcccgt 1680
    tctctgtcca gccgttaacg cccccagtct gtaataaaag cctatcagcc gtgcacttta 1740
    <210> SEQ ID NO 31
    <211> LENGTH: 2394
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 31
    aaatgtagaa gggaaagtcc ttcctggtag taatggaaaa ccgaatggac agagaattat 60
    caatggccct caaggaacaa agtgggttgt ggaccttgat cgtgggttag tattgaatgc 120
    agaaggaagg tacctccaag attcacatgg aaatcctctt cggattaaac taggaggaga 180
    tggtcgaacc attgtagatc tggaagggac ccccgtggtg agtcctgacg gcctcccact 240
    ctttgggcag gggcgacatg gcacacctct ggccaatgcc caggataagc caattttgag 300
    tcttggagga aagccgctgg tgggcttgga ggtcatcaaa aaaaccaccc atccccctac 360
    cactaccatg cagcccacca ctactacgac gcccctgcct accactacaa ccccgaggcc 420
    caccactgcc accacccgcc gcacgaccac caggcgtcca acaaccacag tccgaaccac 480
    tacgcggaca accaccacca ccacccccaa acccaccact cccatcccca cctgtccccc 540
    tgggaccttg gaacggcacg acgatgatgg caacctgata atgagctcca atgggatccc 600
    agagtgctac gctgaagaag atgagttctc aggcttggag actgacactg cagtacctac 660
    ggaagaggcc tacgttatat atgatgaaga ttatgaattt gagacgtcaa ggccaccaac 720
    caccactgag ccttcgacca ctgctaccac accgagggtg atcccagagg aaggcgccat 780
    cagttccttt cctgaagaag aatttgatct ggctggaagg aaacgatttg ttgctcctta 840
    cgtgacgtac ctaaataaag acccatcagc cccgtgctct ctgactgatg cactggatca 900
    cttccaagtg gacagcctcg atgaaatcat ccccaatgac ctgaagaaga gtgatctgcc 960
    tccccagcat gctccccgca acatcaccgt ggtggccgtg gaaggttgcc actcatttgt 1020
    cattgtggac tgggacaaag ccaccccagg agatgtggtc acaggttact tggtttacag 1080
    tgcatcctat gaagacttca tcaggaacaa gtggtccact caagcttcat cagtaactca 1140
    cttgcccatt gagaacctaa agcccaacac gaggtattat tttaaagtgc aagcacaaaa 1200
    tcctcatggc tacggaccta tcagcccttc ggtctcattt gtcaccgaat cagataatcc 1260
    tctgcttgtt gtgaggcccc caggcggtga gcctatctgg atcccattcg ctttcaaaca 1320
    tgatcccagc tacacggact gccatggacg gcaatatgtg aagcgcacgt ggtatcgaaa 1380
    gttcgtggga gttgttcttt gtaattcact gaggtataaa atctacctca gtgacaacct 1440
    gaaagataca ttctacagca ttggagacag ctggggaaga ggtgaagacc attgccaatt 1500
    tgtggattca caccttgatg gaagaacagg gcctcagtcc tatgtagaag ccctccctac 1560
    tattcaaggc tactatcgcc agtatcgtca ggagcctgtc aggtttggga acatcggctt 1620
    cggaaccccc tactactatg tgggctggta cgagtgtggg gtctccatcc ctggaaagtg 1680
    gtaatcacag gaccgtcatg ctgcaagctt gccctgccca gccccaccaa ctaagtcgca 1740
    ctaggggctg tgagcaaaga cagccagcat gctcagcccc gctgccctag gtgccaggaa 1800
    ggtcacagat ggacactggc cattctggtc atctcagtct ggaactcagt cccacttctt 1860
    ggcctggaca atgaacagga ttcagttttg ctgttaactt tgcttctcta cttttttttg 1920
    tttgtttgta atagcacatc ccagagacat cagaaaccag caactgattc agtgtgattt 1980
    ccagactttt taggcatgaa attcggacac ttcagtattt ccaggaatag catatgcacg 2040
    ctgttcttgc ttcatggaat gctacatgct ttctgttttt ctcattttgg atttctccaa 2100
    aactaactga atttaagctt caggtccctt tgtatgcagt agaaaggaat tattaaaaac 2160
    accaccaaag aaaataaata tatcctactt gaaatttact ctatggactt acccactgct 2220
    agaataaatg tatcaaatct tatttgtaaa ttctcaattt tgatatatat atgtatatat 2280
    gcatatacat atccacactt gtctgcaaga atattgatta aaattgctaa atttgtactt 2340
    gttcaccaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aagg 2394
    <210> SEQ ID NO 32
    <211> LENGTH: 499
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 32
    ctgggatagc aataacctgt gaaaatgctc ccccggctaa tttgtatcaa tgattatgaa 60
    caacatgcta aatcagtact tccaaagtct aaatatgact attacaggtc tggggcaaat 120
    gatgaagaaa ctttggctga taatattgca gcattttcca gatggaagct gtatccaagg 180
    atgctccgga atgttgctga aacagatctg tcgacttctg ttttaggaca gagggtcagc 240
    atgccaatat gtgtgggggc tacggccatg cagcgcatgg ctcatgtgga cggcgagctt 300
    gccactgtga gagcctgtca gtccctggga acgggcatga tgttgagttc ctgggccacc 360
    tcctcaattg aagaagtggc ggaagctggt cctgaggcac ttcgttggct gctactgtat 420
    atctacaagg accgagaagt caccaagaag ctagtgcggc aggcagagaa gatgggctac 480
    aaggccatat ttgtgacag 499
    <210> SEQ ID NO 33
    <211> LENGTH: 1774
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 1679, 1681, 1684, 1685, 1686, 1691, 1692, 1693, 1705, 1706,
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 33
    ctttcctaga caaggctgaa aggggccaac attatttctg aagacttcat tattggaatt 60
    ctatgggagt gatctcactg agctattttg gaatagaaat gtggctagtt gcctgacctc 120
    cctcaatggt ttcacgtggc tttcaaaggg aaggaagggc agtgctgact tttggtaaaa 180
    tgggcgaaag ggtccatgcc agcaacacaa tcactcaaag tccagatgag ggatcagtaa 240
    atacaacgtg cctgaaaggt ggcccttgag cacattcctc cggtagacat taacttatta 300
    aattgattct gattacaaat ataaactttg cccccatctc acccagtaac aatgcaagag 360
    ttgatgtcag tctataaaag gaagtaggaa ctgtccctgg ctttcaggct ccaacatcct 420
    ccccctgtca agatgtggca cctcaaactt tgtgcagtcc tcatgatctt cctgttgctg 480
    ttgggccagg taaggaggga aggatactta tgtgtgtgtg tggagtgtgg agatgatagt 540
    ggtggtggaa cttgaaagct agattcagtc ctgaggaatg gttcctctgt tctgagtcta 600
    cagcatctgc ggaatggaat gatcactctt ccaaggtgtg cagcagggtg tcaacacttt 660
    catatctgaa tgtctttgcc cttacagata gatggctccc caataccaga agtgagttcg 720
    gcaaagagaa ggccacggag aatgacccca ttttggagag gggtttccct caggcctatt 780
    ggagcctcct gccgggatga ttctgagtgt atcacaaggc tatgcagaaa aagacgctgt 840
    tccttaagtg tggcccagga atgatgtaca taccagggaa agaaaggaca gcagtcacct 900
    ccgacaatgc tccgttctat ggaatattga ttaactgcat tttggctgga gacacccaag 960
    tgaagcaatc ttgtattttt aatatttaaa ggcagatgta cgctttaaat tggtctccat 1020
    ttcttcttag aatgttgata tatggataag cataactaaa cttgtcaatt tagagtttat 1080
    ttttctatgg atactattaa atgtctcaaa ttgaaatttt agcagtctgg aattcaagct 1140
    tttgagggaa agaaggattc actttgtata ctaaagaaaa aaacagcatt gcccaataat 1200
    gtgttaactt ctcaatctgg aaagtgtagt gagagctaca taatcaatag ctacgtaatc 1260
    aacttcagca agttcctaag ctgtggccct ggatcccttc actccatact cttcagggag 1320
    gtgtcaaagg tggtcaagct tgggaggctg aggcaggaga atggcgtgaa ccgggagacg 1380
    gacttgcagt gagccgagat ccgccactga ctccagcctg gggaaagagc gagactccgt 1440
    tcacaaaaaa aaagaatcaa aaaaaaaaag gggagccccc ccttggtatc ggaagaccca 1500
    gtcctgtaat tcacacaggt tgagttcaag gcattaagcc ctgtaagggc cacttcggcc 1560
    cctcagagtt gctgttctga tccaacggaa gccgcttaca aatttccctt cggaatttgc 1620
    ctccggcatt ccctaggggc ggtatttgga agcaaagtcc ttttaacagc cagtgtatnc 1680
    naannncggg nnngcccttg cggcnnnngn ccananattg ctccttcttc ncctcttctn 1740
    tttnttcccc cccgtgtcga cagggggtgt ggtc 1774
    <210> SEQ ID NO 34
    <211> LENGTH: 4158
    <212> TYPE: DNA
    <213> ORGANISM: HOMO SAPIENS
    <220> FEATURE:
    <222> LOCATION: 3667, 3668, 3669, 3670, 3671, 3672, 3673, 3674, 3675, 3676,
    <223> OTHER INFORMATION: a or g or c or t, unknown, or other
    <400> SEQUENCE: 34
    ctcccacaac aatttcattg ttgttagcat atctatttct ccatacattg taaaactgta 60
    atccttaggt atttctaaaa cataaagagg agaattaagt cagctgcaga acaatggggc 120
    tgattcttct gctttttctc tggaaaatct ttcattgctt ttggtggaaa tttacctaga 180
    ggttacaacc acaggatgta gcttggtctc ttatttgcct ttttgggaaa ccaattaaga 240
    ttaatacagg ataaaggaaa aaagcaatct attcattata taacacagtt gtttgtatta 300
    cttgttccct gcaaaggaaa tctgttgaat gcttgcattt tgaattcttt tctaatagaa 360
    caaccaaaaa aggcttctta tggtgcagca ggaaaaaaga tcatttttat agctttgcat 420
    tcttaacata gcatttaaag agcggcatga attagaggaa agacatggaa cacacaggta 480
    gtcggtttga gatcatcggc ttaaaagtat cctaggatgg taatgaccca gaagtatttc 540
    cagttgtcta gtggtgtggt atgcaggaat gagaagtgtt ttctttccat ttcctgttgg 600
    acaggtggca atcttagcag agccactatt tggagttgat aactaaagat gcaaataacg 660
    tgactatgcc ttctggtcat cctacgacta tttggagttc tccaaaacct tgtaagaggc 720
    atgtcaggca tgcagtaaaa gcatctacaa cttcagctgg gcactggcag cataggtctc 780
    atcttggacc atacagtccc actttataga agagagtgga agttctccaa aacaatatcc 840
    acaacaaagt ctgacctcac tctgagggag atgggaagtg ggaggaagaa ggactaacca 900
    gctccctgga gtaagaggaa tttgctttcc ctgtctgccc accaggggct atatgtgcca 960
    cctttcaggt tggggccaag gaagtgatgt cagtgtgaca gaagggagag ttagacctcc 1020
    agacgtcagc ctccctccca tggggtacat tttcaatctg agtgttgttg ccttagctgt 1080
    gttggtatta gcttgattgg ttggtccgct ggttatgagg tgtagggagg cagtttttgt 1140
    ttagttttta ggactttgcc tcttcctttg tccttagcat aatttctagg cagagcatcc 1200
    acgaagtcgg ttttcattgc cagctcaaga gcgacaatca tttacgagtt cctatgttat 1260
    gttaggtgcc ttatgtatat tatcccaaat ccactgcatg gtttaaatac aggcactgga 1320
    atataaatga aaaaggtcat tacagtcact gactttctgc aggaccttaa acatttctct 1380
    ttccacaagt ttccccttaa tcatgtgtca aacctctctt cctgacggga atgttgtgct 1440
    ataatgaatc tgcataacgc ttgggattct aggaggaagg aaggttccat ggacatgtaa 1500
    gtacagcata ttcccctcag tcttctagga gggcagagtg aatcccagaa ctggtaagat 1560
    tgggaatctg agcattgcca ctttaatctt agaatattta tcattttgac acatcctgtt 1620
    ttttagagag gaaaacaaac acagtttctg cattggtagt gtaaagcata ccttgttagg 1680
    aacgtgtttt gtaagacaca tttgggttgt cattctagag catgtcaaac tttgtacttc 1740
    aaaatatatt tagtatgatt gttagtggta acatatatca aggctttgaa ttaactgttt 1800
    tatttaattt tcacaagaag cacttatttt agccatagga aaaccaatct gagctacaaa 1860
    tagttcttta aaataagccc aggttattta gctattctag aaagtgccga cttctttcaa 1920
    gaagcaggca ttgtaggaca gctgagaatt atcacatagc ctaaattcta gcctggcagc 1980
    aagagtcaca tctgagatgt ccaaaaaaaa aaaaaaaaaa cacctgatct acattgaaag 2040
    ggggtagact aacgtatgtg agaccatttt cctatttgca gttacaaggt taaagaactt 2100
    tgaaggtcat tcggctgcta agaggcatgt cgaacactct gtgtggctct ttcacagtaa 2160
    accctcctaa gagcagaaga cacatggctg ttagtgtctg cgtttagatt taatttctca 2220
    aataaaggcc cttggctgcg tatcatttca tccagttata aactagggct cctgcaagca 2280
    cccccattct aagggtgaat tattgaaatc agttgctatt tgatgagtca caactggccc 2340
    agcaggcagg gcatttgaag tcatggtcat caaaaagaaa tgattgtttt ttgaaaagct 2400
    aaatgcttaa aatgcttcta gagggaagtc gtggggcgtg tgctcattct ctttaaaatc 2460
    agggttgttg agtttgtttt taaacatttt tataagttca tgagaaaaaa tatataaatt 2520
    ctaagaacca acactgtatt cccagaaaca tgaccctcgc tggtcttggg tccacatatc 2580
    attggactct gggggacaca aagatgcctg tgacactttg gtgttgccga gttagtcaac 2640
    aattattctg ggaaaaagca gaattgaatt cttctctaga tgtcctacca gggttggcca 2700
    agggccacaa agcaggctaa taaattccca caggatccag acaccaggca aaattgctct 2760
    aagaagccag ttactgtcat ccctctatgg ttctagaaaa aatagtacaa aaatgacagg 2820
    tcatcctatg agcgtcatgc caatgaaacc ccatcttctg gagaagccct tgaatcagaa 2880
    ttatcttttt tcttgatgtc gtcagatgca gccagtttct taattttttt aaaaactgta 2940
    tgtttctgtg gtatgtatat ttgtacacct aactacctgg cacttggaaa tcacagcact 3000
    actcagaggc aattgaataa agagaaattt aattttaaat atcaagtcct gtcaaacatt 3060
    tctcaaactt ctgattttat caaaggtttg ccagccaata aagtgcatcc caagtataca 3120
    ggggagaaag ctagactcct acagggtcct agagtttaag taattttttt gttattaata 3180
    taggtaataa tttttctaat ttttattttt tggttccaaa tgtaaagctc cttgtgttta 3240
    cctctgttta tgtcattctt gacatgttta tctaaattat gtgtgctctg tgacaggtga 3300
    aatgtaaatc tgggatccat agtcaagata tcataaggac ctacttccca gcctaccttt 3360
    cttcctctac ctgataatga taatactcaa aataacaaca ttcaaaggaa acacaaagaa 3420
    atcctgcttt cacatctcct atttcttggg ctccttaata actactgatg gtttgttcat 3480
    gaaaaaaaat ttttaaatca aaagattgta cttggccctg agttgaaaaa atttcaaaaa 3540
    tcaaaagttt gtacttggcc ctgagttgaa aaaaaaaatt cacattctaa gaataaacag 3600
    aaaaatgttc ttcttggaag taaataacaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3660
    aaaaaannnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3720
    nntcttctat agtgtcacct aaattcaatt cactggccgt cgttttacaa cgtcgtgact 3780
    gggaaaaccc tggcgttacc caacttaatc gccttgcagc acatccccct ttcgcaagct 3840
    ggcgtaatac gcgaagaggc ccgaaccgtt ggcccttccc aacagttgcg cagcctgaat 3900
    ggcgaatggg acgcgccctg tagcggcgca ttaagcgcgg cgggtgtggt ggttagtaga 3960
    ggtgtgccgt aaaaaataga ataatttttt ttcaagagat gagcagaatt gagtaggaat 4020
    gattacgggg aggaaaagat ctagaagata gacaatagag aggagagaaa aagagggacg 4080
    aggaggctga gaggaaaaga gtagaagcgt gatatgaata tatacagaaa cagaaaaagg 4140
    agagagggta agacataa 4158
    <210> SEQ ID NO 35
    <211> LENGTH: 366
    <212> TYPE: PRT
    <213> ORGANISM: HOMO SAPIENS
    <400> SEQUENCE: 35
    Met Ala Leu Arg Phe Leu Leu Gly Phe Leu Leu Ala Gly Val Asp Leu
    1 5 10 15
    Gly Val Tyr Leu Met Arg Leu Glu Leu Cys Asp Pro Thr Gln Arg Leu
    20 25 30
    Arg Val Ala Leu Ala Gly Glu Leu Val Gly Val Gly Gly His Phe Leu
    35 40 45
    Phe Leu Gly Leu Ala Leu Val Ser Lys Asp Trp Arg Phe Leu Gln Arg
    50 55 60
    Met Ile Thr Ala Pro Cys Ile Leu Phe Leu Phe Tyr Gly Trp Pro Gly
    65 70 75 80
    Leu Phe Leu Glu Ser Ala Arg Trp Leu Ile Val Lys Arg Gln Ile Glu
    85 90 95
    Glu Ala Gln Ser Val Leu Arg Ile Leu Ala Glu Arg Asn Arg Pro His
    100 105 110
    Gly Gln Met Leu Gly Glu Glu Ala Gln Glu Ala Leu Gln Asp Leu Glu
    115 120 125
    Asn Thr Cys Pro Leu Pro Ala Thr Ser Ser Phe Ser Phe Ala Ser Leu
    130 135 140
    Leu Asn Tyr Arg Asn Ile Trp Lys Asn Leu Leu Ile Leu Gly Phe Thr
    145 150 155 160
    Asn Phe Ile Ala His Ala Ile Arg His Cys Tyr Gln Pro Val Gly Gly
    165 170 175
    Gly Gly Ser Pro Ser Asp Phe Tyr Leu Cys Ser Leu Leu Ala Ser Gly
    180 185 190
    Thr Ala Ala Leu Ala Cys Val Phe Leu Gly Val Thr Val Asp Arg Phe
    195 200 205
    Gly Arg Arg Gly Ile Leu Leu Leu Ser Met Thr Leu Thr Gly Ile Ala
    210 215 220
    Ser Leu Val Leu Leu Gly Leu Trp Asp Tyr Leu Asn Glu Ala Ala Ile
    225 230 235 240
    Thr Thr Phe Ser Val Leu Gly Leu Phe Ser Ser Gln Ala Ala Ala Ile
    245 250 255
    Leu Ser Thr Leu Leu Ala Ala Glu Val Ile Pro Thr Thr Val Arg Gly
    260 265 270
    Arg Gly Leu Gly Leu Ile Met Ala Leu Gly Ala Leu Gly Gly Leu Ser
    275 280 285
    Gly Pro Ala Gln Arg Leu His Met Gly His Gly Ala Phe Leu Gln His
    290 295 300
    Val Val Leu Ala Ala Cys Ala Leu Leu Cys Ile Leu Ser Ile Met Leu
    305 310 315 320
    Leu Pro Glu Thr Lys Arg Lys Leu Leu Pro Glu Val Leu Arg Asp Gly
    325 330 335
    Glu Leu Cys Arg Arg Pro Ser Leu Leu Arg Gln Pro Pro Pro Thr Arg
    340 345 350
    Cys Asp His Val Pro Leu Leu Ala Thr Pro Asn Pro Ala Leu
    355 360 365
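
For illustration only, the nucleotide entries of the Sequence Listing above can be read programmatically. The following Python sketch assumes that each sequence line consists of whitespace-separated residue blocks followed by a running residue count, as in the entries above. The function name `parse_sequence_lines` and the constant `EXAMPLE_LINES` (the first two lines of SEQ ID NO: 32) are chosen for this example and are not part of the listing itself.

```python
# Minimal sketch: rebuild a nucleotide sequence from listing lines and
# check it against the running count printed at the end of each line.
# Assumption: every line is "<block> <block> ... <running count>".

EXAMPLE_LINES = [
    "ctgggatagc aataacctgt gaaaatgctc ccccggctaa tttgtatcaa tgattatgaa 60",
    "caacatgcta aatcagtact tccaaagtct aaatatgact attacaggtc tggggcaaat 120",
]


def parse_sequence_lines(lines):
    """Join the residue blocks of each line and verify the running counts."""
    pieces = []
    total = 0
    for line in lines:
        *blocks, count = line.split()
        chunk = "".join(blocks)
        total += len(chunk)
        if int(count) != total:
            raise ValueError(
                f"running count {count} does not match {total} residues read"
            )
        pieces.append(chunk)
    return "".join(pieces)


if __name__ == "__main__":
    sequence = parse_sequence_lines(EXAMPLE_LINES)
    print(len(sequence), sequence[:10])
```

A mismatch between the printed count and the accumulated length raises an error, which is one way to catch truncated or merged lines. Unknown residues written as n are kept unchanged.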

Claims (20)

What is claimed is:
1. A substantially purified polynucleotide comprising a gene that is coexpressed with one or more known atherosclerosis-associated genes in a plurality of samples.
2. The polynucleotide of claim 1, comprising a polynucleotide sequence selected from
(a) a polynucleotide encoding a peptide selected from SEQ ID NOs: 1-34;
(b) a polynucleotide sequence complementary to the polynucleotide sequence of (a); and
(c) a probe comprising at least 18 sequential nucleotides of the polynucleotide sequence of (a) or (b).
3. A pharmaceutical composition comprising one of the polynucleotides of claim 2 and a pharmaceutical carrier.
4. A method of using a polynucleotide to screen a library of molecules or compounds to identify at least one ligand which specifically binds the polynucleotide, the method comprising:
(a) combining the polynucleotide of claim 1 with a library of molecules or compounds under conditions to allow specific binding, and
(b) detecting specific binding, thereby identifying a ligand which specifically binds the polynucleotide.
5. The method of claim 4 wherein the library is selected from DNA molecules, RNA molecules, PNAs, mimetics, and proteins.
6. A ligand identified by the method of claim 4 which modulates the activity of the polynucleotide.
7. A method of using a polynucleotide to purify a ligand which specifically binds the polynucleotide, the method comprising:
(a) combining the polynucleotide of claim 1 with a sample under conditions to allow specific binding,
(b) detecting specific binding between the polynucleotide and a ligand,
(c) recovering the bound polynucleotide, and
(d) separating the polynucleotide from the ligand, thereby obtaining purified ligand.
8. A method for diagnosing a disease or condition associated with the altered expression of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a plurality of biological samples, the method comprising the steps of:
(a) hybridizing a polynucleotide of claim 2 to a sample under conditions effective to form one or more hybridization complexes;
(b) detecting the hybridization complexes; and
(c) comparing the levels of the hybridization complexes with the level of hybridization complexes in a non-diseased sample, wherein the altered level of hybridization complexes compared with the level of hybridization complexes of a non-diseased sample indicates the presence of the disease or condition.
9. An expression vector comprising the polynucleotide of claim 2.
10. A host cell comprising the expression vector of claim 9.
11. A method for producing the polypeptide, the method comprising:
(a) culturing the host cell of claim 10 under conditions for expression of the polypeptide, and
(b) recovering the polypeptide from cell culture.
12. A substantially purified polypeptide comprising the product of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a plurality of samples.
13. The polypeptide of claim 12, comprising a polypeptide sequence selected from
(a) the polypeptides encoded by SEQ ID NOs: 1-34; and
(b) an oligopeptide sequence comprising at least 6 sequential amino acids of the polypeptide sequence of (a).
14. The polypeptide comprising the amino acid sequence of SEQ ID NO: 35.
15. A pharmaceutical composition comprising a polypeptide of claim 12 and a pharmaceutical carrier.
16. A method for using a polypeptide to screen a library of molecules or compounds to identify at least one ligand which specifically binds the polypeptide, the method comprising:
(a) combining the polypeptide of claim 12 with the library of molecules or compounds under conditions to allow specific binding, and
(b) detecting specific binding between the polypeptide and ligand, thereby identifying a ligand which specifically binds the polypeptide.
17. The method of claim 16 wherein the library is selected from DNA molecules, RNA molecules, PNAs, mimetics, proteins, agonists, antagonists, and antibodies.
18. A ligand identified by the method of claim 16 which modulates the activity of the polypeptide.
19. A method of using the polypeptide to purify a ligand from a sample, the method comprising:
(a) combining the polypeptide of claim 12 with a sample under conditions to allow specific binding,
(b) detecting specific binding between the polypeptide and a ligand,
(c) recovering the bound polypeptide, and
(d) separating the polypeptide from the ligand, thereby obtaining purified ligand.
20. A method for treating or preventing a disease associated with the altered expression of a gene that is coexpressed with one or more known atherosclerosis-associated genes in a subject in need, the method comprising the step of administering to the subject in need the pharmaceutical composition of claim 15 in an amount effective for treating or preventing the disease.
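
By way of illustration only, the sequence fragments recited in claims 2 and 13 (a complementary polynucleotide sequence, a probe of at least 18 sequential nucleotides, and an oligopeptide of at least 6 sequential amino acids) can be enumerated as in the following Python sketch. The function names are hypothetical. The complement is written as a reverse complement, which is a common convention and an assumption here. Only fragments of exactly the minimum length are listed, although the claims also cover longer fragments. The example inputs are the first 20 nucleotides of SEQ ID NO: 32 and the first 8 residues of SEQ ID NO: 35.

```python
# Minimal sketch of the fragment types named in claims 2 and 13.
# All names below are illustrative assumptions, not part of the claims.

COMPLEMENT = str.maketrans("acgtn", "tgcan")  # n (unknown) stays n


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def windows(items, size):
    """Return every run of `size` sequential items (empty if too short)."""
    return [items[i:i + size] for i in range(len(items) - size + 1)]


def probes(seq, min_len=18):
    """Candidate probes of exactly `min_len` sequential nucleotides."""
    return windows(seq, min_len)


def oligopeptides(residues, min_len=6):
    """Candidate oligopeptides of exactly `min_len` sequential residues."""
    return [" ".join(w) for w in windows(residues, min_len)]


DNA = "ctgggatagcaataacctgt"                      # SEQ ID NO: 32, residues 1-20
PEPTIDE = "Met Ala Leu Arg Phe Leu Leu Gly".split()  # SEQ ID NO: 35, residues 1-8

if __name__ == "__main__":
    print(reverse_complement(DNA))
    print(probes(DNA))
    print(oligopeptides(PEPTIDE))
```

A sequence shorter than the minimum length yields an empty list, so no fragment below the claimed minimum is ever produced.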
US09/349,015 1999-01-20 1999-07-07 Atherosclerosis-associated genes Abandoned US20020015950A1 (en)

Priority Applications (9)

Application Number Publication Number Priority Date Filing Date Title
US09/349,015 US20020015950A1 (en) 1999-07-07 1999-07-07 Atherosclerosis-associated genes
CA002378985A CA2378985A1 (en) 1999-07-07 2000-06-28 Atherosclerosis-associated genes
PCT/US2000/017887 WO2001004264A2 (en) 1999-07-07 2000-06-28 Atherosclerosis-associated genes
AU58988/00A AU5898800A (en) 1999-07-07 2000-06-28 Atherosclerosis-associated genes
EP00944981A EP1196564A2 (en) 1999-07-07 2000-06-28 Atherosclerosis-associated genes
JP2001509468A JP2003504044A (en) 1999-07-07 2000-06-28 Genes associated with atherosclerosis
US09/642,703 US6524799B1 (en) 1999-07-07 2000-08-16 DNA encoding sparc-related proteins
US10/219,664 US20030129176A1 (en) 1999-07-07 2002-08-14 Atherosclerosis-associated genes
US10/247,451 US20040018188A9 (en) 1999-01-20 2002-09-18 Sparc-related proteins

Applications Claiming Priority (1)

Application Number Publication Number Priority Date Filing Date Title
US09/349,015 US20020015950A1 (en) 1999-07-07 1999-07-07 Atherosclerosis-associated genes

Related Parent Applications (1)

Application Number Relation Publication Number Priority Date Filing Date Title
US09/840,787 Continuation-In-Part US20020058264A1 (en) 1997-09-23 2001-09-26 Human regulatory molecules

Related Child Applications (3)

Application Number Relation Publication Number Priority Date Filing Date Title
US09/642,703 Continuation-In-Part US6524799B1 (en) 1999-01-20 2000-08-16 DNA encoding sparc-related proteins
US10/219,664 Continuation-In-Part US20030129176A1 (en) 1999-07-07 2002-08-14 Atherosclerosis-associated genes
US10/247,451 Continuation-In-Part US20040018188A9 (en) 1999-01-20 2002-09-18 Sparc-related proteins

Publications (1)

Publication Number Publication Date
US20020015950A1 (en) 2002-02-07

Family

ID=23370530

Family Applications (3)

Application Number Status Publication Number Priority Date Filing Date Title
US09/349,015 Abandoned US20020015950A1 (en) 1999-01-20 1999-07-07 Atherosclerosis-associated genes
US09/642,703 Expired - Fee Related US6524799B1 (en) 1999-01-20 2000-08-16 DNA encoding sparc-related proteins
US10/219,664 Abandoned US20030129176A1 (en) 1999-07-07 2002-08-14 Atherosclerosis-associated genes

Country Status (6)

Country Link
US (3) US20020015950A1 (en)
EP (1) EP1196564A2 (en)
JP (1) JP2003504044A (en)
AU (1) AU5898800A (en)
CA (1) CA2378985A1 (en)
WO (1) WO2001004264A2 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030113327A1 (en) * 2001-05-25 2003-06-19 Adams Sean H. Compositions and methods for adipose abundant protein
US20030175860A1 (en) * 2000-12-18 2003-09-18 Sheppard Paul O. Seleno-cysteine containing protein zsel1
US20070218513A1 (en) * 2003-10-16 2007-09-20 Novartis Ag Differentially Expressed Genes Related to Coronary Artery Disease
US20100316622A1 (en) * 2007-11-02 2010-12-16 University Of Miami Diagnosis and treatment of cardiac disorders
US8663122B2 (en) 2005-01-26 2014-03-04 Stuart Schecter LLC Cardiovascular haptic handle system
US8942828B1 (en) 2011-04-13 2015-01-27 Stuart Schecter, LLC Minimally invasive cardiovascular support system with true haptic coupling
US10013082B2 (en) 2012-06-05 2018-07-03 Stuart Schecter, LLC Operating system with haptic interface for minimally invasive, hand-held surgical instrument
CN111458512A (en) * 2019-01-21 2020-07-28 中国科学院分子细胞科学卓越创新中心 A kind of atherosclerosis biomarker and use thereof
US12104210B2 (en) 2016-09-01 2024-10-01 The George Washington University Blood RNA biomarkers of coronary artery disease

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8609614B2 (en) * 1998-07-22 2013-12-17 Vanderbilt University GBS toxin receptor compositions and methods of use
DK1244705T3 (en) * 1999-12-06 2010-05-31 Agensys Inc Serpentine transmembrane antigens expression in human prostate cancer and its applications
MXPA02005758A (en) * 1999-12-09 2002-09-18 Sankyo Co Method of testing remedy or preventive for hyperlipemia.
AU2001239742A1 (en) * 2000-03-06 2001-09-17 Eli Lilly And Company Nucleic acids, encoding human crsp1 and uses thereof
MXPA03005388A (en) * 2000-12-14 2003-09-25 Amylin Pharmaceuticals Inc Peptide yy and peptide yy agonists for treatment of metabolic disorders.
US20030113728A1 (en) * 2001-12-14 2003-06-19 Jukka Salonen Method for assessing the risk of cardiovascular disease
EP1576131A4 (en) * 2002-08-15 2008-08-13 Genzyme Corp REASONS FOR EXPRESSION OF BRAIN ENDOTHELIAL CELLS
GB0408685D0 (en) * 2004-04-19 2004-05-19 Guys & St Thomas Hospital Trus Diagnostic method
AU2006249235B2 (en) 2004-05-14 2010-11-11 Abraxis Bioscience, Llc Sparc and methods of use thereof
US8697139B2 (en) 2004-09-21 2014-04-15 Frank M. Phillips Method of intervertebral disc treatment using articular chondrocyte cells
US20060130157A1 (en) 2004-10-22 2006-06-15 Kevin Wells Ungulates with genetically modified immune systems
US7553496B2 (en) * 2004-12-21 2009-06-30 University Of Kentucky Research Foundation VEGF-A as an inhibitor of angiogenesis and methods of using same
US8165517B2 (en) * 2005-01-19 2012-04-24 The Trustees Of The University Of Pennsylvania Methods for identifying inhibitors of vascular injury
WO2006112930A2 (en) * 2005-02-18 2006-10-26 Abraxis Bioscience, Inc. Q3 sparc deletion mutant and uses thereof
WO2006102497A2 (en) * 2005-03-22 2006-09-28 The Board Of Trustees Of The Leland Stanford Junior University Methods and compositions for diagnosis, monitoring and development of therapeutics for treatment of atherosclerotic disease
US8586534B2 (en) * 2007-05-08 2013-11-19 Albert Einstein College Of Medicine Of Yeshiva University Intracellular domain of a mammalian Fat1 (Fat1IC)
WO2009148050A1 (en) * 2008-06-04 2009-12-10 株式会社バイオマーカーサイエンス Method of evaluating effects of relieving and preventing arteriosclerosis, kit for the evaluation and method of screening substance
WO2010017102A2 (en) * 2008-08-06 2010-02-11 The U.S.A. As Represented By The Secretary, Department Of Health And Human Services Methods for intracellular modulation of bone morphogenetic protein signaling
ES2548377T3 (en) 2008-10-27 2015-10-16 Revivicor, Inc. Immunosuppressed ungulates
JP2010190804A (en) * 2009-02-19 2010-09-02 Biomarker Science:Kk Method and kit for evaluating improvement/preventive effect of arteriosclerosis, and substance screening method
WO2011027132A1 (en) 2009-09-03 2011-03-10 Cancer Research Technology Limited Clec14a inhibitors
WO2011107586A1 (en) * 2010-03-05 2011-09-09 Novartis Forschungsstiftung, Zweigniederlassung, Friedrich Miescher Institute For Biomedical Research, Smoc1, tenascin-c and brain cancers
WO2017141980A1 (en) * 2016-02-16 2017-08-24 公立大学法人横浜市立大学 Blood biomarker for detecting arteriosclerosis
JP6756377B2 (en) * 2016-12-12 2020-09-16 日本電気株式会社 Information processing equipment, genetic information creation method and program
ES2688737A1 (en) * 2017-05-04 2018-11-06 Universidad Del País Vasco / Euskal Herriko Unibertsitatea Method for diagnosing unstable atherosclerotic plaque

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0976824A1 (en) * 1998-07-10 2000-02-02 Amsterdam Molecular Therapeutics Gene and protein involved in liver regeneration

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030175860A1 (en) * 2000-12-18 2003-09-18 Sheppard Paul O. Seleno-cysteine containing protein zsel1
US20030113327A1 (en) * 2001-05-25 2003-06-19 Adams Sean H. Compositions and methods for adipose abundant protein
WO2002096355A3 (en) * 2001-05-25 2005-11-17 Genentech Inc Compositions and methods for adipose abundant protein
US20070218513A1 (en) * 2003-10-16 2007-09-20 Novartis Ag Differentially Expressed Genes Related to Coronary Artery Disease
US8663122B2 (en) 2005-01-26 2014-03-04 Stuart Schecter LLC Cardiovascular haptic handle system
US8956304B2 (en) 2005-01-26 2015-02-17 Stuart Schecter LLC Cardiovascular haptic handle system
US20100316622A1 (en) * 2007-11-02 2010-12-16 University Of Miami Diagnosis and treatment of cardiac disorders
US8942828B1 (en) 2011-04-13 2015-01-27 Stuart Schecter, LLC Minimally invasive cardiovascular support system with true haptic coupling
US10013082B2 (en) 2012-06-05 2018-07-03 Stuart Schecter, LLC Operating system with haptic interface for minimally invasive, hand-held surgical instrument
US12104210B2 (en) 2016-09-01 2024-10-01 The George Washington University Blood RNA biomarkers of coronary artery disease
CN111458512A (en) * 2019-01-21 2020-07-28 中国科学院分子细胞科学卓越创新中心 A kind of atherosclerosis biomarker and use thereof

Also Published As

Publication number Publication date
CA2378985A1 (en) 2001-01-18
EP1196564A2 (en) 2002-04-17
WO2001004264A3 (en) 2001-07-26
US20030129176A1 (en) 2003-07-10
AU5898800A (en) 2001-01-30
JP2003504044A (en) 2003-02-04
WO2001004264A2 (en) 2001-01-18
US6524799B1 (en) 2003-02-25

Similar Documents

Publication Publication Date Title
US20020015950A1 (en) Atherosclerosis-associated genes
US6566325B2 (en) 49 human secreted proteins
ES2397441T3 (en) Polynucleotide and polypeptide sequences involved in the bone remodeling process
US20020102569A1 (en) Diagnostic marker for cancers
US20080051338A1 (en) 98 Human Secreted Proteins
EP1373293A1 (en) Steap-related protein
US20040248256A1 (en) Secreted proteins and polynucleotides encoding them
CA2364541A1 (en) Genes associated with diseases of the kidney
US20020076705A1 (en) 31 human secreted proteins
US20040152164A1 (en) 62 human secreted proteins
US6225451B1 (en) Chromosome 11-linked coronary heart disease susceptibility gene CHD1
CA2518101A1 (en) Compositions and methods for the treatment of systemic lupus erythematosis
US20030186333A1 (en) Down syndrome critical region 1-like protein
US20020019000A1 (en) Polynucleotides coexpressed with matrix-remodeling genes
US20050037454A1 (en) Gene associated with bone disorders
US20040225118A1 (en) 31 human secreted proteins
US20030054446A1 (en) Novel retina-specific human proteins C7orf9, C12orf7, MPP4 and F379
CA2413158A1 (en) Ecm-related tumor marker
JP2004505637A (en) Cancer-related SIM2 gene
US20030170627A1 (en) cDNAs co-expressed with placental steroid synthesis genes
US20030129655A1 (en) Nucleic acids encoding GTPase activating proteins
AU2002328200B2 (en) DNA sequences for human angiogenesis genes
JP2003520031A (en) 28 human secreted proteins
US6444443B1 (en) Gene
EP1538161A2 (en) 32 human secreted proteins

Legal Events

Date Code Title Description
AS Assignment

Owner name: INCYTE PHARMACEUTICALS, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:JONES, KAREN ANN;VOLKMUTH, WAYNE;WALKER, MICHAEL G.;REEL/FRAME:010308/0237;SIGNING DATES FROM 19990702 TO 19990907

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION