MXPA98005611A - Compositions and methods for the treatment and diagnosis of breast cancer
- Publication number
- MXPA98005611A MXPA/A/1998/005611A MX9805611A
- Authority
- MX
- Mexico
- Prior art keywords
- seq
- polypeptide
- topology
- nucleic acid
- linear
- Prior art date
- 238000000034 method Methods 0.000 title claims abstract description 83
- 238000003745 diagnosis Methods 0.000 title claims description 9
- 238000011282 treatment Methods 0.000 title abstract description 12
- 239000000203 mixture Substances 0.000 title abstract description 9
- 206010028980 Neoplasm Diseases 0.000 title description 23
- 201000011510 cancer Diseases 0.000 title description 3
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 186
- 102000004196 processed proteins & peptides Human genes 0.000 claims abstract description 185
- 229920001184 polypeptide Polymers 0.000 claims abstract description 177
- 208000026310 Breast neoplasm Diseases 0.000 claims abstract description 138
- 206010006187 Breast cancer Diseases 0.000 claims abstract description 132
- 229960005486 vaccine Drugs 0.000 claims abstract description 26
- 239000008194 pharmaceutical composition Substances 0.000 claims abstract description 21
- 108091028043 Nucleic acid sequence Proteins 0.000 claims abstract description 18
- 238000001514 detection method Methods 0.000 claims abstract description 16
- 238000012544 monitoring process Methods 0.000 claims abstract description 14
- 239000002773 nucleotide Substances 0.000 claims description 76
- 125000003729 nucleotide group Chemical group 0.000 claims description 76
- 108020004414 DNA Proteins 0.000 claims description 43
- 239000002299 complementary DNA Substances 0.000 claims description 43
- 239000000523 sample Substances 0.000 claims description 36
- 239000012472 biological sample Substances 0.000 claims description 26
- 102000053602 DNA Human genes 0.000 claims description 18
- 210000000481 breast Anatomy 0.000 claims description 15
- 239000007787 solid Substances 0.000 claims description 15
- 230000002163 immunogen Effects 0.000 claims description 13
- 230000028993 immune response Effects 0.000 claims description 10
- 238000012986 modification Methods 0.000 claims description 10
- 230000004048 modification Effects 0.000 claims description 10
- 238000006467 substitution reaction Methods 0.000 claims description 10
- 230000000890 antigenic effect Effects 0.000 claims description 8
- 238000012217 deletion Methods 0.000 claims description 8
- 230000037430 deletion Effects 0.000 claims description 8
- 238000011161 development Methods 0.000 claims description 8
- 238000009007 Diagnostic Kit Methods 0.000 claims description 7
- 239000013604 expression vector Substances 0.000 claims description 7
- 238000003780 insertion Methods 0.000 claims description 7
- 230000037431 insertion Effects 0.000 claims description 7
- 238000003752 polymerase chain reaction Methods 0.000 claims description 7
- 108020004635 Complementary DNA Proteins 0.000 claims description 6
- 239000003153 chemical reaction reagent Substances 0.000 claims description 6
- 230000000717 retained effect Effects 0.000 claims description 5
- 108020005187 Oligonucleotide Probes Proteins 0.000 claims description 4
- 230000000295 complement effect Effects 0.000 claims description 4
- 239000003623 enhancer Substances 0.000 claims description 4
- 239000002751 oligonucleotide probe Substances 0.000 claims description 4
- 108091034117 Oligonucleotide Proteins 0.000 claims description 3
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 claims description 3
- 230000002401 inhibitory effect Effects 0.000 claims description 3
- 238000003259 recombinant expression Methods 0.000 claims description 3
- 125000003275 alpha amino acid group Chemical group 0.000 claims 3
- 150000001875 compounds Chemical class 0.000 abstract description 9
- 238000002560 therapeutic procedure Methods 0.000 abstract description 7
- 230000002265 prevention Effects 0.000 abstract description 4
- 238000004519 manufacturing process Methods 0.000 abstract description 3
- 150000007523 nucleic acids Chemical group 0.000 description 208
- 102000039446 nucleic acids Human genes 0.000 description 205
- 108020004707 nucleic acids Proteins 0.000 description 205
- 210000001519 tissue Anatomy 0.000 description 50
- 150000001413 amino acids Chemical group 0.000 description 36
- 108090000623 proteins and genes Proteins 0.000 description 27
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 24
- 238000004458 analytical method Methods 0.000 description 23
- 230000001177 retroviral effect Effects 0.000 description 18
- 230000003321 amplification Effects 0.000 description 16
- 210000004027 cell Anatomy 0.000 description 16
- 238000003199 nucleic acid amplification method Methods 0.000 description 16
- 108020004999 messenger RNA Proteins 0.000 description 13
- 102000004169 proteins and genes Human genes 0.000 description 13
- 230000027455 binding Effects 0.000 description 12
- 125000006853 reporter group Chemical group 0.000 description 12
- 239000012528 membrane Substances 0.000 description 11
- 238000002360 preparation method Methods 0.000 description 11
- 210000001744 T-lymphocyte Anatomy 0.000 description 10
- 210000003719 b-lymphocyte Anatomy 0.000 description 10
- 230000015572 biosynthetic process Effects 0.000 description 10
- 239000000427 antigen Substances 0.000 description 9
- 102000036639 antigens Human genes 0.000 description 9
- 108091007433 antigens Proteins 0.000 description 9
- 238000012360 testing method Methods 0.000 description 9
- 239000013598 vector Substances 0.000 description 8
- BYXHQQCXAJARLQ-ZLUOBGJFSA-N Ala-Ala-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O BYXHQQCXAJARLQ-ZLUOBGJFSA-N 0.000 description 7
- 238000006243 chemical reaction Methods 0.000 description 7
- 210000003491 skin Anatomy 0.000 description 7
- 241001465754 Metazoa Species 0.000 description 6
- 230000009257 reactivity Effects 0.000 description 6
- 230000004044 response Effects 0.000 description 6
- 238000003786 synthesis reaction Methods 0.000 description 6
- 238000010790 dilution Methods 0.000 description 5
- 239000012895 dilution Substances 0.000 description 5
- 238000002347 injection Methods 0.000 description 5
- 239000007924 injection Substances 0.000 description 5
- 210000004185 liver Anatomy 0.000 description 5
- 210000004072 lung Anatomy 0.000 description 5
- 239000000047 product Substances 0.000 description 5
- 210000002307 prostate Anatomy 0.000 description 5
- 210000002784 stomach Anatomy 0.000 description 5
- 239000003981 vehicle Substances 0.000 description 5
- 102100022900 Actin, cytoplasmic 1 Human genes 0.000 description 4
- 108010085238 Actins Proteins 0.000 description 4
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 4
- 241000894006 Bacteria Species 0.000 description 4
- 102000004190 Enzymes Human genes 0.000 description 4
- 108090000790 Enzymes Proteins 0.000 description 4
- LVSYIKGMLRHKME-IUCAKERBSA-N Gln-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N LVSYIKGMLRHKME-IUCAKERBSA-N 0.000 description 4
- SOYWRINXUSUWEQ-DLOVCJGASA-N Glu-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 4
- CDHURCQGUDNBMA-UBHSHLNASA-N Phe-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 CDHURCQGUDNBMA-UBHSHLNASA-N 0.000 description 4
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 4
- KHCSOLAHNLOXJR-BZSNNMDCSA-N Tyr-Leu-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KHCSOLAHNLOXJR-BZSNNMDCSA-N 0.000 description 4
- 239000007853 buffer solution Substances 0.000 description 4
- 208000029742 colonic neoplasm Diseases 0.000 description 4
- 201000010099 disease Diseases 0.000 description 4
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 4
- 238000005755 formation reaction Methods 0.000 description 4
- 238000001502 gel electrophoresis Methods 0.000 description 4
- 230000003053 immunization Effects 0.000 description 4
- 238000002649 immunization Methods 0.000 description 4
- 238000000338 in vitro Methods 0.000 description 4
- 238000011534 incubation Methods 0.000 description 4
- 210000003734 kidney Anatomy 0.000 description 4
- 210000001672 ovary Anatomy 0.000 description 4
- 210000000496 pancreas Anatomy 0.000 description 4
- 210000002027 skeletal muscle Anatomy 0.000 description 4
- 239000000243 solution Substances 0.000 description 4
- 238000001179 sorption measurement Methods 0.000 description 4
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 3
- PNHQRQTVBRDIEF-CIUDSAMLSA-N Asn-Leu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)N)N PNHQRQTVBRDIEF-CIUDSAMLSA-N 0.000 description 3
- DGKCOYGQLNWNCJ-ACZMJKKPSA-N Asp-Glu-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O DGKCOYGQLNWNCJ-ACZMJKKPSA-N 0.000 description 3
- XZLLTYBONVKGLO-SDDRHHMPSA-N Gln-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)N)N)C(=O)O XZLLTYBONVKGLO-SDDRHHMPSA-N 0.000 description 3
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 3
- UROVZOUMHNXPLZ-AVGNSLFASA-N His-Leu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 UROVZOUMHNXPLZ-AVGNSLFASA-N 0.000 description 3
- 102100034343 Integrase Human genes 0.000 description 3
- 241000124008 Mammalia Species 0.000 description 3
- 241000699670 Mus sp. Species 0.000 description 3
- ZENDEDYRYVHBEG-SRVKXCTJSA-N Phe-Asp-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZENDEDYRYVHBEG-SRVKXCTJSA-N 0.000 description 3
- SWCOXQLDICUYOL-ULQDDVLXSA-N Phe-His-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SWCOXQLDICUYOL-ULQDDVLXSA-N 0.000 description 3
- 206010035226 Plasma cell myeloma Diseases 0.000 description 3
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 3
- BQCADISMDOOEFD-UHFFFAOYSA-N Silver Chemical compound [Ag] BQCADISMDOOEFD-UHFFFAOYSA-N 0.000 description 3
- 241000714205 Woolly monkey sarcoma virus Species 0.000 description 3
- 238000000137 annealing Methods 0.000 description 3
- 238000003556 assay Methods 0.000 description 3
- 210000004369 blood Anatomy 0.000 description 3
- 239000008280 blood Substances 0.000 description 3
- 239000003795 chemical substances by application Substances 0.000 description 3
- 108700004026 gag Genes Proteins 0.000 description 3
- 210000004408 hybridoma Anatomy 0.000 description 3
- 108010053037 kyotorphin Proteins 0.000 description 3
- 108010076756 leucyl-alanyl-phenylalanine Proteins 0.000 description 3
- 210000001165 lymph node Anatomy 0.000 description 3
- 239000004005 microsphere Substances 0.000 description 3
- 201000000050 myeloid neoplasm Diseases 0.000 description 3
- 238000000746 purification Methods 0.000 description 3
- 230000002285 radioactive effect Effects 0.000 description 3
- 238000012216 screening Methods 0.000 description 3
- 230000035945 sensitivity Effects 0.000 description 3
- 229910052709 silver Inorganic materials 0.000 description 3
- 239000004332 silver Substances 0.000 description 3
- 239000011780 sodium chloride Substances 0.000 description 3
- 230000003612 virological effect Effects 0.000 description 3
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 2
- LGFCAXJBAZESCF-ACZMJKKPSA-N Ala-Gln-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O LGFCAXJBAZESCF-ACZMJKKPSA-N 0.000 description 2
- CLICCYPMVFGUOF-IHRRRGAJSA-N Arg-Lys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O CLICCYPMVFGUOF-IHRRRGAJSA-N 0.000 description 2
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 2
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 2
- 101100505161 Caenorhabditis elegans mel-32 gene Proteins 0.000 description 2
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 2
- JEFZIKRIDLHOIF-BYPYZUCNSA-N Gln-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(O)=O JEFZIKRIDLHOIF-BYPYZUCNSA-N 0.000 description 2
- CAXXTYYGFYTBPV-IUCAKERBSA-N Gln-Leu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CAXXTYYGFYTBPV-IUCAKERBSA-N 0.000 description 2
- PYTZFYUXZZHOAD-WHFBIAKZSA-N Gly-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)CN PYTZFYUXZZHOAD-WHFBIAKZSA-N 0.000 description 2
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 2
- VSLXGYMEHVAJBH-DLOVCJGASA-N His-Ala-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O VSLXGYMEHVAJBH-DLOVCJGASA-N 0.000 description 2
- 206010020751 Hypersensitivity Diseases 0.000 description 2
- ZFNLIDNJUWNIJL-WDCWCFNPSA-N Leu-Glu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZFNLIDNJUWNIJL-WDCWCFNPSA-N 0.000 description 2
- 102000043129 MHC class I family Human genes 0.000 description 2
- 108091054437 MHC class I family Proteins 0.000 description 2
- FYYHWMGAXLPEAU-UHFFFAOYSA-N Magnesium Chemical compound [Mg] FYYHWMGAXLPEAU-UHFFFAOYSA-N 0.000 description 2
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 2
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 2
- 239000000020 Nitrocellulose Substances 0.000 description 2
- 238000000636 Northern blotting Methods 0.000 description 2
- 108700026244 Open Reading Frames Proteins 0.000 description 2
- 241000283973 Oryctolagus cuniculus Species 0.000 description 2
- VHWOBXIWBDWZHK-IHRRRGAJSA-N Phe-Arg-Asp Chemical compound NC(N)=NCCC[C@@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 VHWOBXIWBDWZHK-IHRRRGAJSA-N 0.000 description 2
- RIYZXJVARWJLKS-KKUMJFAQSA-N Phe-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 RIYZXJVARWJLKS-KKUMJFAQSA-N 0.000 description 2
- KDYPMIZMXDECSU-JYJNAYRXSA-N Phe-Leu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 KDYPMIZMXDECSU-JYJNAYRXSA-N 0.000 description 2
- MCIXMYKSPQUMJG-SRVKXCTJSA-N Phe-Ser-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MCIXMYKSPQUMJG-SRVKXCTJSA-N 0.000 description 2
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 2
- 229920001213 Polysorbate 20 Polymers 0.000 description 2
- 239000004793 Polystyrene Substances 0.000 description 2
- RETPETNFPLNLRV-JYJNAYRXSA-N Pro-Asn-Trp Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O RETPETNFPLNLRV-JYJNAYRXSA-N 0.000 description 2
- KIPIKSXPPLABPN-CIUDSAMLSA-N Pro-Glu-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 KIPIKSXPPLABPN-CIUDSAMLSA-N 0.000 description 2
- HAEGAELAYWSUNC-WPRPVWTQSA-N Pro-Gly-Val Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAEGAELAYWSUNC-WPRPVWTQSA-N 0.000 description 2
- 241000700159 Rattus Species 0.000 description 2
- 102000006382 Ribonucleases Human genes 0.000 description 2
- 108010083644 Ribonucleases Proteins 0.000 description 2
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 2
- UBRMZSHOOIVJPW-SRVKXCTJSA-N Ser-Leu-Lys Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O UBRMZSHOOIVJPW-SRVKXCTJSA-N 0.000 description 2
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 2
- JXNRXNCCROJZFB-RYUDHWBXSA-N Tyr-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JXNRXNCCROJZFB-RYUDHWBXSA-N 0.000 description 2
- AOIZTZRWMSPPAY-KAOXEZKKSA-N Tyr-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)O AOIZTZRWMSPPAY-KAOXEZKKSA-N 0.000 description 2
- 241000700605 Viruses Species 0.000 description 2
- 238000001042 affinity chromatography Methods 0.000 description 2
- 108010044940 alanylglutamine Proteins 0.000 description 2
- 210000000709 aorta Anatomy 0.000 description 2
- 108010059459 arginyl-threonyl-phenylalanine Proteins 0.000 description 2
- 239000011324 bead Substances 0.000 description 2
- 238000001574 biopsy Methods 0.000 description 2
- 229940098773 bovine serum albumin Drugs 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 238000012512 characterization method Methods 0.000 description 2
- 239000007795 chemical reaction product Substances 0.000 description 2
- 238000010367 cloning Methods 0.000 description 2
- 239000011248 coating agent Substances 0.000 description 2
- 238000000576 coating method Methods 0.000 description 2
- 210000001072 colon Anatomy 0.000 description 2
- 210000001151 cytotoxic T lymphocyte Anatomy 0.000 description 2
- 238000004925 denaturation Methods 0.000 description 2
- 230000036425 denaturation Effects 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 239000000975 dye Substances 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 239000012634 fragment Substances 0.000 description 2
- 125000000524 functional group Chemical group 0.000 description 2
- 230000004927 fusion Effects 0.000 description 2
- 101150098622 gag gene Proteins 0.000 description 2
- 108010078144 glutaminyl-glycine Proteins 0.000 description 2
- 108010081551 glycylphenylalanine Proteins 0.000 description 2
- FDGQSTZJBFJUBT-UHFFFAOYSA-N hypoxanthine Chemical compound O=C1NC=NC2=C1NC=N2 FDGQSTZJBFJUBT-UHFFFAOYSA-N 0.000 description 2
- 230000005847 immunogenicity Effects 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 238000001727 in vivo Methods 0.000 description 2
- 230000002147 killing effect Effects 0.000 description 2
- 208000037841 lung tumor Diseases 0.000 description 2
- 239000011777 magnesium Substances 0.000 description 2
- 229910052749 magnesium Inorganic materials 0.000 description 2
- HQKMJHAJHXVSDF-UHFFFAOYSA-L magnesium stearate Chemical compound [Mg+2].CCCCCCCCCCCCCCCCCC([O-])=O.CCCCCCCCCCCCCCCCCC([O-])=O HQKMJHAJHXVSDF-UHFFFAOYSA-L 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 210000004165 myocardium Anatomy 0.000 description 2
- 229920001220 nitrocellulos Polymers 0.000 description 2
- 239000006174 pH buffer Substances 0.000 description 2
- 239000004033 plastic Substances 0.000 description 2
- 229920003023 plastic Polymers 0.000 description 2
- 239000000256 polyoxyethylene sorbitan monolaurate Substances 0.000 description 2
- 235000010486 polyoxyethylene sorbitan monolaurate Nutrition 0.000 description 2
- 229920002223 polystyrene Polymers 0.000 description 2
- 239000004800 polyvinyl chloride Substances 0.000 description 2
- 229920000915 polyvinyl chloride Polymers 0.000 description 2
- 208000023958 prostate neoplasm Diseases 0.000 description 2
- 238000010839 reverse transcription Methods 0.000 description 2
- 238000001612 separation test Methods 0.000 description 2
- 210000002966 serum Anatomy 0.000 description 2
- 108010026333 seryl-proline Proteins 0.000 description 2
- 210000000813 small intestine Anatomy 0.000 description 2
- 230000000638 stimulation Effects 0.000 description 2
- 239000000758 substrate Substances 0.000 description 2
- 239000006228 supernatant Substances 0.000 description 2
- 210000001550 testis Anatomy 0.000 description 2
- YNJBWRMUSHSURL-UHFFFAOYSA-N trichloroacetic acid Chemical compound OC(=O)C(Cl)(Cl)Cl YNJBWRMUSHSURL-UHFFFAOYSA-N 0.000 description 2
- 210000004881 tumor cell Anatomy 0.000 description 2
- 241001430294 unidentified retrovirus Species 0.000 description 2
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 2
- NWUYHJFMYQTDRP-UHFFFAOYSA-N 1,2-bis(ethenyl)benzene;1-ethenyl-2-ethylbenzene;styrene Chemical compound C=CC1=CC=CC=C1.CCC1=CC=CC=C1C=C.C=CC1=CC=CC=C1C=C NWUYHJFMYQTDRP-UHFFFAOYSA-N 0.000 description 1
- AZQWKYJCGOJGHM-UHFFFAOYSA-N 1,4-benzoquinone Chemical compound O=C1C=CC(=O)C=C1 AZQWKYJCGOJGHM-UHFFFAOYSA-N 0.000 description 1
- TVZGACDUOSZQKY-LBPRGKRZSA-N 4-aminofolic acid Chemical compound C1=NC2=NC(N)=NC(N)=C2N=C1CNC1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 TVZGACDUOSZQKY-LBPRGKRZSA-N 0.000 description 1
- 102100039819 Actin, alpha cardiac muscle 1 Human genes 0.000 description 1
- 229920000936 Agarose Polymers 0.000 description 1
- CXRCVCURMBFFOL-FXQIFTODSA-N Ala-Ala-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CXRCVCURMBFFOL-FXQIFTODSA-N 0.000 description 1
- FQNILRVJOJBFFC-FXQIFTODSA-N Ala-Pro-Asp Chemical compound C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N FQNILRVJOJBFFC-FXQIFTODSA-N 0.000 description 1
- MTDDMSUUXNQMKK-BPNCWPANSA-N Ala-Tyr-Arg Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N MTDDMSUUXNQMKK-BPNCWPANSA-N 0.000 description 1
- GUBGYTABKSRVRQ-XLOQQCSPSA-N Alpha-Lactose Chemical compound O[C@@H]1[C@@H](O)[C@@H](O)[C@@H](CO)O[C@H]1O[C@@H]1[C@@H](CO)O[C@H](O)[C@H](O)[C@H]1O GUBGYTABKSRVRQ-XLOQQCSPSA-N 0.000 description 1
- 102100022987 Angiogenin Human genes 0.000 description 1
- 102000006306 Antigen Receptors Human genes 0.000 description 1
- 108010083359 Antigen Receptors Proteins 0.000 description 1
- 108020000948 Antisense Oligonucleotides Proteins 0.000 description 1
- 108020005544 Antisense RNA Proteins 0.000 description 1
- 206010003445 Ascites Diseases 0.000 description 1
- HXWUJJADFMXNKA-BQBZGAKWSA-N Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O HXWUJJADFMXNKA-BQBZGAKWSA-N 0.000 description 1
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 1
- VNXQRBXEQXLERQ-CIUDSAMLSA-N Asp-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N VNXQRBXEQXLERQ-CIUDSAMLSA-N 0.000 description 1
- 108090001008 Avidin Proteins 0.000 description 1
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 1
- 102100031680 Beta-catenin-interacting protein 1 Human genes 0.000 description 1
- 101100315624 Caenorhabditis elegans tyr-1 gene Proteins 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- 102000014914 Carrier Proteins Human genes 0.000 description 1
- 108010078791 Carrier Proteins Proteins 0.000 description 1
- 241000759568 Corixa Species 0.000 description 1
- 241000557626 Corvus corax Species 0.000 description 1
- DXSBGVKEPHDOTD-UBHSHLNASA-N Cys-Trp-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CS)N DXSBGVKEPHDOTD-UBHSHLNASA-N 0.000 description 1
- 102000004127 Cytokines Human genes 0.000 description 1
- 108090000695 Cytokines Proteins 0.000 description 1
- FBPFZTCFMRRESA-KVTDHHQDSA-N D-Mannitol Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)[C@H](O)CO FBPFZTCFMRRESA-KVTDHHQDSA-N 0.000 description 1
- 108010041986 DNA Vaccines Proteins 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 102000016911 Deoxyribonucleases Human genes 0.000 description 1
- 108010053770 Deoxyribonucleases Proteins 0.000 description 1
- 201000004624 Dermatitis Diseases 0.000 description 1
- SHIBSTMRCDJXLN-UHFFFAOYSA-N Digoxigenin Natural products C1CC(C2C(C3(C)CCC(O)CC3CC2)CC2O)(O)C2(C)C1C1=CC(=O)OC1 SHIBSTMRCDJXLN-UHFFFAOYSA-N 0.000 description 1
- 238000002965 ELISA Methods 0.000 description 1
- 241000588724 Escherichia coli Species 0.000 description 1
- 101100218161 Escherichia coli (strain K12) atpG gene Proteins 0.000 description 1
- 101710177291 Gag polyprotein Proteins 0.000 description 1
- YJIUYQKQBBQYHZ-ACZMJKKPSA-N Gln-Ala-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YJIUYQKQBBQYHZ-ACZMJKKPSA-N 0.000 description 1
- FKXCBKCOSVIGCT-AVGNSLFASA-N Gln-Lys-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FKXCBKCOSVIGCT-AVGNSLFASA-N 0.000 description 1
- KBKGRMNVKPSQIF-XDTLVQLUSA-N Glu-Ala-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KBKGRMNVKPSQIF-XDTLVQLUSA-N 0.000 description 1
- ZWQVYZXPYSYPJD-RYUDHWBXSA-N Glu-Gly-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZWQVYZXPYSYPJD-RYUDHWBXSA-N 0.000 description 1
- DVLZZEPUNFEUBW-AVGNSLFASA-N Glu-His-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N DVLZZEPUNFEUBW-AVGNSLFASA-N 0.000 description 1
- QGAJQIGFFIQJJK-IHRRRGAJSA-N Glu-Tyr-Gln Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O QGAJQIGFFIQJJK-IHRRRGAJSA-N 0.000 description 1
- SITLTJHOQZFJGG-XPUUQOCRSA-N Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SITLTJHOQZFJGG-XPUUQOCRSA-N 0.000 description 1
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 1
- QVDGHDFFYHKJPN-QWRGUYRKSA-N Gly-Phe-Cys Chemical compound NCC(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CS)C(O)=O QVDGHDFFYHKJPN-QWRGUYRKSA-N 0.000 description 1
- VPZXBVLAVMBEQI-VKHMYHEASA-N Glycyl-alanine Chemical compound OC(=O)[C@H](C)NC(=O)CN VPZXBVLAVMBEQI-VKHMYHEASA-N 0.000 description 1
- 108010017213 Granulocyte-Macrophage Colony-Stimulating Factor Proteins 0.000 description 1
- 102100039620 Granulocyte-macrophage colony-stimulating factor Human genes 0.000 description 1
- 102000008949 Histocompatibility Antigens Class I Human genes 0.000 description 1
- 108010088652 Histocompatibility Antigens Class I Proteins 0.000 description 1
- 101000760663 Hololena curta Mu-agatoxin-Hc1a Proteins 0.000 description 1
- 101000959247 Homo sapiens Actin, alpha cardiac muscle 1 Proteins 0.000 description 1
- 101000993469 Homo sapiens Beta-catenin-interacting protein 1 Proteins 0.000 description 1
- 101000795167 Homo sapiens Tumor necrosis factor receptor superfamily member 13B Proteins 0.000 description 1
- 108010001336 Horseradish Peroxidase Proteins 0.000 description 1
- UFHFLCQGNIYNRP-UHFFFAOYSA-N Hydrogen Chemical compound [H][H] UFHFLCQGNIYNRP-UHFFFAOYSA-N 0.000 description 1
- UGQMRVRMYYASKQ-UHFFFAOYSA-N Hypoxanthine nucleoside Natural products OC1C(O)C(CO)OC1N1C(NC=NC2=O)=C2N=C1 UGQMRVRMYYASKQ-UHFFFAOYSA-N 0.000 description 1
- 108060003951 Immunoglobulin Proteins 0.000 description 1
- 108010065920 Insulin Lispro Proteins 0.000 description 1
- 102000013462 Interleukin-12 Human genes 0.000 description 1
- 108010065805 Interleukin-12 Proteins 0.000 description 1
- 102000000588 Interleukin-2 Human genes 0.000 description 1
- 108010002350 Interleukin-2 Proteins 0.000 description 1
- 102000000704 Interleukin-7 Human genes 0.000 description 1
- 108010002586 Interleukin-7 Proteins 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- GUBGYTABKSRVRQ-QKKXKWKRSA-N Lactose Natural products OC[C@H]1O[C@@H](O[C@H]2[C@H](O)[C@@H](O)C(O)O[C@@H]2CO)[C@H](O)[C@@H](O)[C@H]1O GUBGYTABKSRVRQ-QKKXKWKRSA-N 0.000 description 1
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 1
- AIXUQKMMBQJZCU-IUCAKERBSA-N Lys-Pro Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O AIXUQKMMBQJZCU-IUCAKERBSA-N 0.000 description 1
- 102000043131 MHC class II family Human genes 0.000 description 1
- 108091054438 MHC class II family Proteins 0.000 description 1
- 101710125418 Major capsid protein Proteins 0.000 description 1
- 229930195725 Mannitol Natural products 0.000 description 1
- 241000713869 Moloney murine leukemia virus Species 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- 241000699666 Mus <mouse, genus> Species 0.000 description 1
- 101100462972 Mus musculus Pcdh8 gene Proteins 0.000 description 1
- 241000187479 Mycobacterium tuberculosis Species 0.000 description 1
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 1
- 108010079364 N-glycylalanine Proteins 0.000 description 1
- 101100342977 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) leu-1 gene Proteins 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 241000282579 Pan Species 0.000 description 1
- 241000237988 Patellidae Species 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 101100082836 Pediococcus acidilactici pedC gene Proteins 0.000 description 1
- 201000005702 Pertussis Diseases 0.000 description 1
- 241000276498 Pollachius virens Species 0.000 description 1
- ZCXQTRXYZOSGJR-FXQIFTODSA-N Pro-Asp-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZCXQTRXYZOSGJR-FXQIFTODSA-N 0.000 description 1
- 108010076504 Protein Sorting Signals Proteins 0.000 description 1
- 238000010240 RT-PCR analysis Methods 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- RHAPJNVNWDBFQI-BQBZGAKWSA-N Ser-Pro-Gly Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O RHAPJNVNWDBFQI-BQBZGAKWSA-N 0.000 description 1
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 1
- 229920002472 Starch Polymers 0.000 description 1
- 229930006000 Sucrose Natural products 0.000 description 1
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 201000008754 Tenosynovial giant cell tumor Diseases 0.000 description 1
- OLFOOYQTTQSSRK-UNQGMJICSA-N Thr-Pro-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLFOOYQTTQSSRK-UNQGMJICSA-N 0.000 description 1
- IXEGQBJZDIRRIV-QEJZJMRPSA-N Trp-Asn-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O IXEGQBJZDIRRIV-QEJZJMRPSA-N 0.000 description 1
- 102100029675 Tumor necrosis factor receptor superfamily member 13B Human genes 0.000 description 1
- 206010053613 Type IV hypersensitivity reaction Diseases 0.000 description 1
- HZZKQZDUIKVFDZ-AVGNSLFASA-N Tyr-Gln-Ser Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N)O HZZKQZDUIKVFDZ-AVGNSLFASA-N 0.000 description 1
- HJSLDXZAZGFPDK-ULQDDVLXSA-N Val-Phe-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C(C)C)N HJSLDXZAZGFPDK-ULQDDVLXSA-N 0.000 description 1
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 1
- 108700005077 Viral Genes Proteins 0.000 description 1
- UZQJVUCHXGYFLQ-AYDHOLPZSA-N [(2s,3r,4s,5r,6r)-4-[(2s,3r,4s,5r,6r)-4-[(2r,3r,4s,5r,6r)-4-[(2s,3r,4s,5r,6r)-3,5-dihydroxy-6-(hydroxymethyl)-4-[(2s,3r,4s,5s,6r)-3,4,5-trihydroxy-6-(hydroxymethyl)oxan-2-yl]oxyoxan-2-yl]oxy-3,5-dihydroxy-6-(hydroxymethyl)oxan-2-yl]oxy-3,5-dihydroxy-6-(hy Chemical compound O([C@H]1[C@H](O)[C@@H](CO)O[C@H]([C@@H]1O)O[C@H]1[C@H](O)[C@@H](CO)O[C@H]([C@@H]1O)O[C@H]1CC[C@]2(C)[C@H]3CC=C4[C@@]([C@@]3(CC[C@H]2[C@@]1(C=O)C)C)(C)CC(O)[C@]1(CCC(CC14)(C)C)C(=O)O[C@H]1[C@@H]([C@@H](O[C@H]2[C@@H]([C@@H](O[C@H]3[C@@H]([C@@H](O[C@H]4[C@@H]([C@@H](O[C@H]5[C@@H]([C@@H](O)[C@H](O)[C@@H](CO)O5)O)[C@H](O)[C@@H](CO)O4)O)[C@H](O)[C@@H](CO)O3)O)[C@H](O)[C@@H](CO)O2)O)[C@H](O)[C@@H](CO)O1)O)[C@@H]1O[C@H](CO)[C@@H](O)[C@H](O)[C@H]1O UZQJVUCHXGYFLQ-AYDHOLPZSA-N 0.000 description 1
- 238000011256 aggressive treatment Methods 0.000 description 1
- 108010087924 alanylproline Proteins 0.000 description 1
- 150000001299 aldehydes Chemical class 0.000 description 1
- WNROFYMDJYEPJX-UHFFFAOYSA-K aluminium hydroxide Chemical compound [OH-].[OH-].[OH-].[Al+3] WNROFYMDJYEPJX-UHFFFAOYSA-K 0.000 description 1
- 150000001412 amines Chemical class 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 229960003896 aminopterin Drugs 0.000 description 1
- 108010072788 angiogenin Proteins 0.000 description 1
- 230000000259 anti-tumor effect Effects 0.000 description 1
- -1 antibodies Chemical class 0.000 description 1
- 239000000074 antisense oligonucleotide Substances 0.000 description 1
- 238000012230 antisense oligonucleotides Methods 0.000 description 1
- 230000005975 antitumor immune response Effects 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 108010038633 aspartylglutamate Proteins 0.000 description 1
- 238000002820 assay format Methods 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 235000021028 berry Nutrition 0.000 description 1
- WQZGKKKJIJFFOK-VFUOTHLCSA-N beta-D-glucose Chemical compound OC[C@H]1O[C@@H](O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-VFUOTHLCSA-N 0.000 description 1
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 1
- 230000001588 bifunctional effect Effects 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 239000002981 blocking agent Substances 0.000 description 1
- 230000005773 cancer-related death Effects 0.000 description 1
- 231100000504 carcinogenesis Toxicity 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 238000012832 cell culture technique Methods 0.000 description 1
- 230000007910 cell fusion Effects 0.000 description 1
- 239000001913 cellulose Substances 0.000 description 1
- 229920002678 cellulose Polymers 0.000 description 1
- 238000002512 chemotherapy Methods 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 230000002860 competitive effect Effects 0.000 description 1
- 239000003184 complementary RNA Substances 0.000 description 1
- 239000012141 concentrate Substances 0.000 description 1
- 238000009833 condensation Methods 0.000 description 1
- 230000005494 condensation Effects 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 239000000356 contaminant Substances 0.000 description 1
- 238000011109 contamination Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 239000012228 culture supernatant Substances 0.000 description 1
- 230000001461 cytolytic effect Effects 0.000 description 1
- 230000007423 decrease Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 239000003599 detergent Substances 0.000 description 1
- 238000002405 diagnostic procedure Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 208000035647 diffuse type tenosynovial giant cell tumor Diseases 0.000 description 1
- QONQRTHLHBTMGP-UHFFFAOYSA-N digitoxigenin Natural products CC12CCC(C3(CCC(O)CC3CC3)C)C3C11OC1CC2C1=CC(=O)OC1 QONQRTHLHBTMGP-UHFFFAOYSA-N 0.000 description 1
- SHIBSTMRCDJXLN-KCZCNTNESA-N digoxigenin Chemical compound C1([C@@H]2[C@@]3([C@@](CC2)(O)[C@H]2[C@@H]([C@@]4(C)CC[C@H](O)C[C@H]4CC2)C[C@H]3O)C)=CC(=O)OC1 SHIBSTMRCDJXLN-KCZCNTNESA-N 0.000 description 1
- 239000003085 diluting agent Substances 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 238000004043 dyeing Methods 0.000 description 1
- 238000013399 early diagnosis Methods 0.000 description 1
- 239000012636 effector Substances 0.000 description 1
- ZMMJGEGLRURXTF-UHFFFAOYSA-N ethidium bromide Chemical compound [Br-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CC)=C1C1=CC=CC=C1 ZMMJGEGLRURXTF-UHFFFAOYSA-N 0.000 description 1
- 229960005542 ethidium bromide Drugs 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 206010016256 fatigue Diseases 0.000 description 1
- 239000000835 fiber Substances 0.000 description 1
- 239000011152 fibreglass Substances 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 239000012530 fluid Substances 0.000 description 1
- 125000002485 formyl group Chemical group [H]C(*)=O 0.000 description 1
- 102000054766 genetic haplotypes Human genes 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- 102000006602 glyceraldehyde-3-phosphate dehydrogenase Human genes 0.000 description 1
- 108020004445 glyceraldehyde-3-phosphate dehydrogenase Proteins 0.000 description 1
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 1
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 1
- 108010037850 glycylvaline Proteins 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 230000005802 health problem Effects 0.000 description 1
- 108060003552 hemocyanin Proteins 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 230000028996 humoral immune response Effects 0.000 description 1
- 210000004754 hybrid cell Anatomy 0.000 description 1
- 239000001257 hydrogen Substances 0.000 description 1
- 229910052739 hydrogen Inorganic materials 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- 210000000987 immune system Anatomy 0.000 description 1
- 230000002998 immunogenetic effect Effects 0.000 description 1
- 102000018358 immunoglobulin Human genes 0.000 description 1
- 238000009169 immunotherapy Methods 0.000 description 1
- 238000007901 in situ hybridization Methods 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 239000003112 inhibitor Substances 0.000 description 1
- 239000003999 initiator Substances 0.000 description 1
- 238000007918 intramuscular administration Methods 0.000 description 1
- 238000001990 intravenous administration Methods 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 239000003456 ion exchange resin Substances 0.000 description 1
- 229920003303 ion-exchange polymer Polymers 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 201000004614 iritis Diseases 0.000 description 1
- 239000008101 lactose Substances 0.000 description 1
- 239000004816 latex Substances 0.000 description 1
- 229920000126 latex Polymers 0.000 description 1
- 230000021633 leukocyte mediated immunity Effects 0.000 description 1
- GZQKNULLWNGMCW-PWQABINMSA-N lipid A (E. coli) Chemical compound O1[C@H](CO)[C@@H](OP(O)(O)=O)[C@H](OC(=O)C[C@@H](CCCCCCCCCCC)OC(=O)CCCCCCCCCCCCC)[C@@H](NC(=O)C[C@@H](CCCCCCCCCCC)OC(=O)CCCCCCCCCCC)[C@@H]1OC[C@@H]1[C@@H](O)[C@H](OC(=O)C[C@H](O)CCCCCCCCCCC)[C@@H](NC(=O)C[C@H](O)CCCCCCCCCCC)[C@@H](OP(O)(O)=O)O1 GZQKNULLWNGMCW-PWQABINMSA-N 0.000 description 1
- 239000002502 liposome Substances 0.000 description 1
- 208000020816 lung neoplasm Diseases 0.000 description 1
- 230000001926 lymphatic effect Effects 0.000 description 1
- ZLNQQNXFFQJAID-UHFFFAOYSA-L magnesium carbonate Chemical compound [Mg+2].[O-]C([O-])=O ZLNQQNXFFQJAID-UHFFFAOYSA-L 0.000 description 1
- 239000001095 magnesium carbonate Substances 0.000 description 1
- 229910000021 magnesium carbonate Inorganic materials 0.000 description 1
- 229910001629 magnesium chloride Inorganic materials 0.000 description 1
- 235000019359 magnesium stearate Nutrition 0.000 description 1
- 239000006249 magnetic particle Substances 0.000 description 1
- 210000004962 mammalian cell Anatomy 0.000 description 1
- 238000007726 management method Methods 0.000 description 1
- 239000000594 mannitol Substances 0.000 description 1
- 235000010355 mannitol Nutrition 0.000 description 1
- 239000002609 medium Substances 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- 239000002480 mineral oil Substances 0.000 description 1
- 235000010446 mineral oil Nutrition 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 210000005087 mononuclear cell Anatomy 0.000 description 1
- 229940035032 monophosphoryl lipid a Drugs 0.000 description 1
- 238000002703 mutagenesis Methods 0.000 description 1
- 231100000350 mutagenesis Toxicity 0.000 description 1
- 230000037311 normal skin Effects 0.000 description 1
- 101150013860 oppB gene Proteins 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 101150049619 p30 gene Proteins 0.000 description 1
- 101150061305 papC gene Proteins 0.000 description 1
- 238000007911 parenteral administration Methods 0.000 description 1
- 230000036961 partial effect Effects 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 230000007170 pathology Effects 0.000 description 1
- 238000010647 peptide synthesis reaction Methods 0.000 description 1
- 210000003819 peripheral blood mononuclear cell Anatomy 0.000 description 1
- 230000002093 peripheral effect Effects 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 239000013600 plasmid vector Substances 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 235000010482 polyoxyethylene sorbitan monooleate Nutrition 0.000 description 1
- 229920000053 polysorbate 80 Polymers 0.000 description 1
- 239000013641 positive control Substances 0.000 description 1
- 230000001323 posttranslational effect Effects 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 239000003755 preservative agent Substances 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 238000000159 protein binding assay Methods 0.000 description 1
- 238000001959 radiotherapy Methods 0.000 description 1
- 239000011541 reaction mixture Substances 0.000 description 1
- 230000002829 reductive effect Effects 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 238000003757 reverse transcription PCR Methods 0.000 description 1
- 238000004007 reversed phase HPLC Methods 0.000 description 1
- CVHZOJJKTDOEJC-UHFFFAOYSA-N saccharin Chemical compound C1=CC=C2C(=O)NS(=O)(=O)C2=C1 CVHZOJJKTDOEJC-UHFFFAOYSA-N 0.000 description 1
- 239000006152 selective media Substances 0.000 description 1
- 238000011451 sequencing strategy Methods 0.000 description 1
- 230000000405 serological effect Effects 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 210000004927 skin cell Anatomy 0.000 description 1
- 239000007790 solid phase Substances 0.000 description 1
- 238000010532 solid phase synthesis reaction Methods 0.000 description 1
- 239000008107 starch Substances 0.000 description 1
- 235000019698 starch Nutrition 0.000 description 1
- 238000007920 subcutaneous administration Methods 0.000 description 1
- 239000007929 subcutaneous injection Substances 0.000 description 1
- 238000010254 subcutaneous injection Methods 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 239000005720 sucrose Substances 0.000 description 1
- 238000001356 surgical procedure Methods 0.000 description 1
- 230000004083 survival effect Effects 0.000 description 1
- 230000008961 swelling Effects 0.000 description 1
- 239000000454 talc Substances 0.000 description 1
- 235000012222 talc Nutrition 0.000 description 1
- 229910052623 talc Inorganic materials 0.000 description 1
- 208000002918 testicular germ cell tumor Diseases 0.000 description 1
- 229940104230 thymidine Drugs 0.000 description 1
- 238000004448 titration Methods 0.000 description 1
- 238000013518 transcription Methods 0.000 description 1
- 230000035897 transcription Effects 0.000 description 1
- 238000001890 transfection Methods 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 230000009261 transgenic effect Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 229960001005 tuberculin Drugs 0.000 description 1
- 230000005951 type IV hypersensitivity Effects 0.000 description 1
- 208000027930 type IV hypersensitivity disease Diseases 0.000 description 1
- 241000701161 unidentified adenovirus Species 0.000 description 1
- 210000002700 urine Anatomy 0.000 description 1
- 239000013603 viral vector Substances 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
Abstract
Compositions and methods for the detection and therapy of breast cancer are described. Compounds provided include nucleotide sequences that are preferentially expressed in breast tumor tissue, as well as polypeptides encoded by said nucleotide sequences. Vaccines and pharmaceutical compositions comprising said compounds are also provided and may be used, for example, for the prevention and treatment of breast cancer. The polypeptides can also be used for the production of antibodies, which are useful for diagnosing and monitoring the progression of breast cancer in a patient.
Description
COMPOSITIONS AND METHODS FOR THE TREATMENT AND DIAGNOSIS OF BREAST CANCER
Technical Field
The present invention relates generally to the detection and therapy of breast cancer. The invention relates more specifically to nucleotide sequences that are preferentially expressed in breast tumor tissue and to polypeptides encoded by said nucleotide sequences. The nucleotide and polypeptide sequences can be used in vaccines and pharmaceutical compositions for the prevention and treatment of breast cancer. The polypeptides can also be used for the production of compounds, such as antibodies, useful for diagnosing and monitoring the progression of breast cancer in a patient.
Background of the Invention
Breast cancer is a significant health problem for women in the United States and throughout the world. Although advances have been made in the detection and treatment of the disease, breast cancer remains the second leading cause of cancer-related deaths in women, affecting more than 180,000 women in the United States each year. For women in North America, the lifetime odds of getting breast cancer are now one in eight. No vaccine or other universally successful method for the prevention or treatment of breast cancer is currently available. Management of the disease currently relies on a combination of early diagnosis (through routine breast screening procedures) and aggressive treatment, which may include one or more of a variety of treatments such as surgery, radiotherapy, chemotherapy and hormone therapy. The course of treatment for a particular breast cancer is often selected based on a variety of diagnostic parameters, including an analysis of specific tumor markers. See, for example, Porter-Jordan and Lippman, Breast Cancer 8:73-100 (1994).
However, the use of established markers often leads to a result that is difficult to interpret, and the high mortality observed in breast cancer patients indicates that improvements are needed in the treatment, diagnosis and prevention of the disease. Accordingly, there is a need in the art for improved methods for therapy and diagnosis of breast cancer. The present invention fulfills these needs and further provides other related advantages.
SUMMARY OF THE INVENTION
Briefly stated, the present invention provides compositions and methods for the diagnosis and therapy of breast cancer. In one aspect, isolated DNA molecules are provided, comprising (a) a nucleotide sequence preferentially expressed in breast cancer tissue, relative to normal tissue; (b) a variant of said sequence containing one or more nucleotide substitutions, deletions, insertions and/or modifications at no more than 20% (preferably no more than 5%) of the nucleotide positions, such that the antigenic and/or immunogenic properties of the polypeptide encoded by the nucleotide sequence are retained; or (c) a nucleotide sequence encoding an epitope of a polypeptide encoded by at least one of the above sequences. In one embodiment, the isolated DNA molecule comprises a human endogenous retroviral sequence as recited in SEQ ID NO: 1. In other embodiments, the isolated DNA molecule comprises a nucleotide sequence recited in any one of SEQ ID NO: 3 - SEQ ID NO: 77 or SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219, 221-227.
In related embodiments, the isolated DNA molecule encodes an epitope of a polypeptide, wherein the polypeptide is encoded by a nucleotide sequence that: (a) hybridizes under stringent conditions to a sequence recited in any one of SEQ ID NO: 1 or SEQ ID NO: 3 - SEQ ID NO: 77 or SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219, 221-227; and (b) is at least 80% identical to a sequence recited in any one of SEQ ID NO: 1 or SEQ ID NO: 3 - SEQ ID NO: 77 or SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219, 221-227; and wherein RNA corresponding to said nucleotide sequence is expressed at a higher level in human breast tumor tissue than in normal breast tissue.
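By way of illustration only, the 20% (preferably 5%) variation limit and the 80% identity threshold recited above can be expressed as a simple calculation. The following Python sketch assumes that the two sequences have already been aligned to equal length by some method of the reader's choosing (the text does not prescribe one); the function names and the gap convention are hypothetical and are not part of the original disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned positions at which the two sequences match.

    Assumption: both strings come from a prior pairwise alignment and have
    equal length; '-' marks a gap and always counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_variant_within_limit(
    aligned_ref: str, aligned_candidate: str, max_changed_pct: float = 20.0
) -> bool:
    """True if no more than max_changed_pct of the positions differ."""
    changed_pct = 100.0 - percent_identity(aligned_ref, aligned_candidate)
    return changed_pct <= max_changed_pct
```

Passing `max_changed_pct=5.0` applies the preferred, stricter limit. A sequence-level check of this kind says nothing about whether the antigenic and/or immunogenic properties are retained, which would have to be established experimentally.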
In another embodiment, the present invention provides an isolated DNA molecule encoding an epitope of a polypeptide, the polypeptide being encoded by (a) a nucleotide sequence transcribed from SEQ ID NO: 141; or (b) a variant of said nucleotide sequence containing one or more nucleotide substitutions, deletions, insertions and/or modifications at no more than 20% of the nucleotide positions, such that the antigenic and/or immunogenic properties of the polypeptide encoded by the nucleotide sequence are retained. Isolated DNA and RNA molecules comprising a nucleotide sequence complementary to a DNA molecule as described above are also provided.
In related aspects, the present invention provides recombinant expression vectors comprising a DNA molecule as described above, and host cells transformed or transfected with said expression vectors. In additional aspects, polypeptides comprising an amino acid sequence encoded by a DNA molecule as described above, and monoclonal antibodies that bind to said polypeptides, are provided.
In yet another aspect, methods are provided for determining the presence of breast cancer in a patient. In one embodiment, the method comprises detecting, within a biological sample, a polypeptide as described above. In another embodiment, the method comprises detecting, within a biological sample, an RNA molecule encoding a polypeptide as described above. In yet another embodiment, the method comprises (a) intradermally injecting a patient with a polypeptide as described above; and (b) detecting an immune response on the patient's skin and therefrom detecting the presence of breast cancer in the patient. In additional embodiments, the present invention provides methods for determining the presence of breast cancer in a patient as described above, wherein the polypeptide is encoded by a nucleotide sequence selected from the group consisting of SEQ ID NOS: 78-86, SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220 and sequences that hybridize thereto under stringent conditions.
In a related aspect, diagnostic kits useful in the determination of breast cancer are provided. The diagnostic kits generally comprise either one or more monoclonal antibodies as described above, or one or more monoclonal antibodies that bind to a polypeptide encoded by a nucleotide sequence selected from the group consisting of the sequences provided in SEQ ID NOS: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, together with a detection reagent. Within a related aspect, the diagnostic kit comprises a first polymerase chain reaction primer and a second polymerase chain reaction primer, the first and second primers each comprising at least about 10 contiguous nucleotides of a DNA molecule as described above, or of an RNA molecule encoding a polypeptide encoded by a nucleotide sequence selected from the group consisting of SEQ ID NOS: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220. Within another related aspect, the diagnostic kit comprises at least one oligonucleotide probe, the probe comprising at least about 15 contiguous nucleotides of a DNA molecule as described above, or of a DNA molecule selected from the group consisting of SEQ ID NOS: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220.
In another related aspect, the present invention provides methods for monitoring the progression of breast cancer in a patient. In one embodiment, the method comprises: (a) detecting an amount, in a biological sample, of a polypeptide as described above at a first point in time; (b) repeating step (a) at a subsequent point in time; and (c) comparing the amounts of polypeptide detected in steps (a) and (b) and therefrom monitoring the progression of breast cancer in the patient.
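The primer and probe criteria recited above (at least about 10 contiguous nucleotides for a primer, at least about 15 for a probe) lend themselves to a simple sequence check. The following Python sketch is illustrative only: the function names are hypothetical, only unambiguous A/C/G/T bases are handled, and testing both strands of the target is an assumption made here for convenience rather than a requirement stated in the text.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def comprises_contiguous_run(oligo: str, target: str, min_length: int) -> bool:
    """True if the oligo contains a run of at least min_length nucleotides
    that occurs contiguously in the target or in its reverse complement."""
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    oligo, target = oligo.upper(), target.upper()
    strands = (target, reverse_complement(target))
    # Slide a window of min_length along the oligo; one hit is sufficient.
    for start in range(len(oligo) - min_length + 1):
        window = oligo[start:start + min_length]
        if any(window in strand for strand in strands):
            return True
    return False
```

For example, `comprises_contiguous_run(primer, cdna, 10)` would screen a candidate primer and `comprises_contiguous_run(probe, cdna, 15)` a candidate probe, where `primer`, `probe` and `cdna` are placeholder variables holding sequences supplied by the reader.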
In another embodiment, the method comprises (a) detecting an amount, within a biological sample, of an RNA molecule encoding a polypeptide as described above at a first point in time; (b) repeating step (a) at a subsequent point in time; and (c) comparing the amounts of RNA molecules detected in steps (a) and (b) and therefrom monitoring the progression of breast cancer in the patient. In still other embodiments, the present invention provides methods for monitoring the progression of breast cancer in a patient as described above, wherein the polypeptide is encoded by a nucleotide sequence selected from the group consisting of SEQ ID NOS: 78-86, SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220 and sequences that hybridize thereto under stringent conditions.
In yet other aspects, pharmaceutical compositions are provided comprising a polypeptide as described above in combination with a physiologically acceptable carrier, as are vaccines comprising a polypeptide as described above in combination with an immune response enhancer or adjuvant. In still other aspects, the present invention provides pharmaceutical compositions and vaccines comprising a polypeptide encoded by a nucleotide sequence selected from the group consisting of SEQ ID NOS: 78-86, SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220 and sequences that hybridize thereto under stringent conditions. In related aspects, the present invention provides methods for inhibiting the development of breast cancer in a patient, comprising administering to a patient a pharmaceutical composition or vaccine as described above.
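The monitoring embodiments above amount to comparing a polypeptide or RNA amount measured at a first point in time with the amount measured at a subsequent point in time. The following Python sketch illustrates that comparison only. The names are hypothetical, and the `rel_tolerance` parameter (the band within which two measurements are treated as unchanged) is an assumption introduced here, not a value taken from the text; the clinical interpretation of any change is likewise outside the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelChange:
    fold_change: float  # later amount divided by first amount
    direction: str      # "increased", "decreased" or "unchanged"


def compare_levels(
    first_amount: float, later_amount: float, rel_tolerance: float = 0.1
) -> LevelChange:
    """Compare marker amounts from two time points (same units, same assay).

    rel_tolerance is a hypothetical allowance for measurement noise.
    """
    if first_amount <= 0 or later_amount < 0:
        raise ValueError("first_amount must be > 0 and later_amount >= 0")
    fold = later_amount / first_amount
    if fold > 1.0 + rel_tolerance:
        direction = "increased"
    elif fold < 1.0 - rel_tolerance:
        direction = "decreased"
    else:
        direction = "unchanged"
    return LevelChange(fold_change=fold, direction=direction)
```

Both amounts are assumed to have been obtained with the same assay and normalization, since otherwise the ratio is not meaningful.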
These and other aspects of the present invention will become apparent upon reference to the following detailed description and accompanying drawings. All references disclosed herein are hereby incorporated by reference in their entirety as if each were incorporated individually.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows the differential display PCR products, separated by gel electrophoresis, obtained from cDNA prepared from normal breast tissue (lanes 1 and 2) and from cDNA prepared from breast tumor tissue from the same patient (lanes 3 and 4). The arrow indicates the band corresponding to B18Ag1.
Figure 2 is a Northern blot comparing the level of B18Ag1 mRNA in breast tumor tissue (lane 1) with the level in normal tissue.
Figure 3 shows the level of B18Ag1 mRNA in breast tumor tissue compared to that in various normal and non-breast tumor tissues, as determined by RNase protection assays.
Figure 4 is a genomic clone map showing the location of additional retroviral sequences obtained from the ends of XbaI restriction digests (provided in SEQ ID NO: 3 - SEQ ID NO: 10) relative to B18Ag1.
Figures 5A and 5B show the sequencing strategy, genomic organization and predicted open reading frame for the retroviral element containing B18Ag1.
Figure 6 shows the nucleotide sequence of the representative breast tumor-specific cDNA B18Ag1.
Figure 7 shows the nucleotide sequence of the representative breast tumor-specific cDNA B17Ag1.
Figure 8 shows the nucleotide sequence of the representative breast tumor-specific cDNA B17Ag2.
Figure 9 shows the nucleotide sequence of the representative breast tumor-specific cDNA B13Ag2a.
Figure 10 shows the nucleotide sequence of the representative breast tumor-specific cDNA B13Ag1b.
Figure 11 shows the nucleotide sequence of the representative breast tumor-specific cDNA B13Ag1a.
Figure 12 shows the nucleotide sequence of the representative breast tumor-specific cDNA B11Ag1.
Figure 13 shows the nucleotide sequence of the representative breast tumor-specific cDNA B3CA3c.
Figure 14 shows the nucleotide sequence of the representative breast tumor-specific cDNA B9CG1.
Figure 15 shows the nucleotide sequence of the representative breast tumor-specific cDNA B9CG3.
Figure 16 shows the nucleotide sequence of the representative breast tumor-specific cDNA B2CA2.
Figure 17 shows the nucleotide sequence of the representative breast tumor-specific cDNA B3CA1.
Figure 18 shows the nucleotide sequence of the representative breast tumor-specific cDNA B3CA2.
Figure 19 shows the nucleotide sequence of the representative breast tumor-specific cDNA B3CA3.
Figure 20 shows the nucleotide sequence of the representative breast tumor-specific cDNA B4CA1.
Figure 21A depicts RT-PCR analysis of breast tumor genes in breast tumor tissues (lanes 1-8), normal breast tissues (lanes 9-13) and H2O (lane 14).
Figure 21B depicts RT-PCR analysis of breast tumor genes in prostate tumors (lanes 1, 2), colon tumors (lane 3), lung tumor (lane 4), normal prostate (lane 5), normal colon (lane 6), normal kidney (lane 7), normal liver (lane 8), normal lung (lane 9), normal ovary (lanes 10, 18), normal pancreas (lanes 11, 12), normal skeletal muscle (lane 13), normal skin (lane 14), normal stomach (lane 15), normal testes (lane 16), normal small intestine (lane 17), HBL-100 (lane 19), MCF-12A (lane 20), breast tumors (lanes 21-23), H2O (lane 24) and colon tumor (lane 25).
Detailed Description of the Invention
As noted above, the present invention is generally directed to compositions and methods for the diagnosis, monitoring and therapy of breast cancer. The compositions described herein include polypeptides, nucleic acid sequences and
antibodies. The polypeptides of the present invention generally comprise at least a portion of a protein that is expressed at a higher level in human breast tumor tissue than in normal breast tissue (i.e., the level of RNA encoding the polypeptide is at least 2-fold higher in tumor tissue). Such polypeptides are referred to herein as breast tumor-specific polypeptides, and the cDNA molecules encoding such polypeptides are referred to herein as breast tumor-specific cDNAs. The nucleic acid sequences of the present invention generally comprise a DNA or RNA sequence that encodes all or a portion of a polypeptide as described above, or that is complementary to such a sequence. Antibodies are generally immune system proteins, or fragments thereof, that are capable of binding to a portion of a polypeptide as described above. Antibodies can be produced by cell culture techniques, including the generation of monoclonal antibodies as described herein, or via transfection of antibody genes into suitable bacterial or mammalian host cells, in order to allow the production of recombinant antibodies.
Polypeptides within the scope of this invention include, but are not limited to, polypeptides (and epitopes thereof) encoded by a human endogenous retroviral sequence, such as the sequence designated B18Ag1 (Figure 5 and SEQ ID NO: 1). Also within the scope of the present invention are polypeptides
encoded by other sequences within the retroviral genome containing B18Ag1 (SEQ ID NO: 141). Such sequences include, but are not limited to, the sequences recited in SEQ ID NO: 3 - SEQ ID NO: 10. B18Ag1 has homology to the gag p30 gene of the endogenous retroviral element S71, as described in Werner et al., Virology 174:225-238 (1990), and also shows homology to approximately thirty other retroviral gag genes. As discussed in more detail below, the present invention also includes a number of additional breast tumor-specific polypeptides, such as those encoded by the nucleotide sequences recited in SEQ ID NO: 1 - SEQ ID NO: 77 and SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219, 221-227.
As used herein, the term "polypeptide" encompasses amino acid chains of any length, including full-length proteins containing the sequences recited herein. A polypeptide comprising an epitope of a protein containing a sequence as described herein may consist entirely of the epitope, or may contain additional sequences. The additional sequences may be derived from the native protein or may be heterologous, and such sequences may (but need not) possess immunogenic or antigenic properties.
An "epitope", as used herein, is a portion of a polypeptide that is recognized (i.e., specifically bound) by a B cell and/or T cell surface antigen receptor. Epitopes can generally be identified using well-known techniques, such as those summarized in Paul, Fundamental Immunology, 3rd ed., 243-247 (Raven Press, 1993) and references cited therein. Such techniques include screening polypeptides derived from the native polypeptide for the ability to react with antigen-specific antisera and/or T cell lines or clones. An epitope of a polypeptide is a portion that reacts with such antisera and/or T cells at a level that is similar to the reactivity of the full-length polypeptide (e.g., in an ELISA and/or T cell reactivity assay). Such screens can generally be performed using methods well known to those of ordinary skill in the art, such as those described in Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988. B cell and T cell epitopes can also be predicted via computer analysis. Polypeptides comprising an epitope of a polypeptide that is preferentially expressed in a tumor tissue (with or without additional amino acid sequence) are within the scope of the invention.
The compositions and methods of the present invention also encompass variants of the above polypeptides and of the nucleic acid sequences encoding such polypeptides. As used herein, a polypeptide "variant" is a polypeptide that differs from the native polypeptide in substitutions and/or modifications, such that the antigenic and/or immunogenic properties of the polypeptide are retained. Such variants can generally be identified by modifying one of the above polypeptide sequences and evaluating the reactivity of the modified polypeptide with antisera and/or T cells as described above. Nucleic acid variants may contain one or more substitutions, deletions, insertions and/or modifications such that the antigenic and/or immunogenic properties of the encoded polypeptide are retained. A preferred variant of the polypeptides described herein is one containing nucleotide substitutions, deletions, insertions and/or modifications at no more than 20% of the nucleotide positions.
Preferably, a variant contains conservative substitutions. A "conservative substitution" is one in which an amino acid is substituted by another amino acid having similar properties, such that one skilled in the art of peptide chemistry would expect the secondary structure and hydropathic nature of the polypeptide to be substantially unchanged. In general, the following groups of amino acids represent conservative changes: (1) ala, pro, gly, glu, asp, gln, asn, ser, thr; (2) cys, ser, tyr, thr; (3) val, ile, leu, met, ala, phe; (4) lys, arg, his; and (5) phe, tyr, trp, his.
Variants may also (or alternatively) be modified by, for example, the deletion or addition of amino acids that have minimal influence on the immunogenic or antigenic properties, secondary structure or hydropathic nature of the polypeptide. For example, a polypeptide can be conjugated to a signal (or leader) sequence at the N-terminus of the protein which directs co-translational or post-translational transfer of the protein. The polypeptide can also be conjugated to a linker or other sequence for ease of synthesis, purification or identification of the polypeptide (e.g., poly-His), or to enhance binding of the polypeptide to a solid support. For example, a polypeptide can be conjugated to an immunoglobulin Fc region.
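The conservative-substitution groups and the 20% limit on changed positions described above can be sketched in code. The following is only an illustrative sketch: the function names are hypothetical, and the comparison assumes two equal-length sequences that differ by substitutions only (deletions and insertions would require an alignment step that is not shown).

```python
# Conservative-substitution groups exactly as listed in the text
# (three-letter amino acid codes). An amino acid may appear in several groups.
CONSERVATIVE_GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},
    {"cys", "ser", "tyr", "thr"},
    {"val", "ile", "leu", "met", "ala", "phe"},
    {"lys", "arg", "his"},
    {"phe", "tyr", "trp", "his"},
]


def is_conservative(original, replacement):
    """Return True if both residues share at least one of the listed groups."""
    a, b = original.lower(), replacement.lower()
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def fraction_changed(native, variant):
    """Fraction of positions that differ (assumption: substitutions only)."""
    if not native or len(native) != len(variant):
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(1 for x, y in zip(native, variant) if x != y)
    return differences / len(native)


def within_variant_limit(native, variant, limit=0.20):
    """Check the 'no more than 20% of the nucleotide positions' criterion."""
    return fraction_changed(native, variant) <= limit


if __name__ == "__main__":
    print(is_conservative("lys", "arg"))                       # same group (4)
    print(within_variant_limit("ATGGCTATTT", "ATGGCTATTA"))    # 1 of 10 changed
```

Such a check covers only the sequence-level criteria; whether a variant retains its antigenic or immunogenic properties still has to be determined experimentally, as described above.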
In general, nucleotide sequences that encode all or a portion of the polypeptides described herein can be prepared using any of several techniques. For example, the cDNA molecules encoding such polypeptides can be cloned on the basis of the breast tumor-specific expression of the corresponding mRNAs, using differential display PCR. This technique compares the amplified products from RNA template prepared from normal and breast tumor tissue. The cDNA can be prepared by reverse transcription of RNA using a (dT)12AG primer. Following amplification of the cDNA using a random primer, a band corresponding to an amplified product specific to the tumor RNA can be cut out from a silver-stained gel and subcloned into a suitable vector (e.g., the T-vector, Novagen, Madison, WI). Nucleotide sequences encoding all or a portion of the breast tumor-specific polypeptides described herein can be amplified from cDNA prepared as described above using the random primers shown in SEQ ID NO: 87-125.
Alternatively, a gene encoding a polypeptide as described herein (or a portion thereof) can be amplified from human genomic DNA, or from breast tumor cDNA, via the polymerase chain reaction. For this approach, B18Ag1 sequence-specific primers can be designed based on the sequence provided in SEQ ID NO: 1, and can be purchased or synthesized. One suitable primer pair for amplification from breast tumor cDNA is (5' ATG GCT ATT TTC GGG GGC TGA CA) (SEQ ID NO: 126) and (5' CCG GTA TCT CCT CGT GGG TAT T) (SEQ ID NO: 127). An amplified portion of B18Ag1 can then be used to isolate the full-length gene from a human genomic DNA library or from a breast tumor cDNA library, using well-known techniques, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, NY (1989).
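As an illustrative aid for the primer pair recited above (SEQ ID NO: 126 and SEQ ID NO: 127), the sketch below computes a few basic primer properties. The helper names are hypothetical. The melting-temperature estimate uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is an assumption of this sketch rather than part of the described method, and is only a rough approximation intended for short oligonucleotides.

```python
# Primer sequences as recited in the text, written without spaces.
PRIMER_SEQ_ID_126 = "ATGGCTATTTTCGGGGGCTGACA"
PRIMER_SEQ_ID_127 = "CCGGTATCTCCTCGTGGGTATT"

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return sum(1 for base in seq.upper() if base in "GC")


def wallace_tm(seq):
    """Rough melting temperature in degrees C by the Wallace rule (assumption)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, primer in (("SEQ ID NO: 126", PRIMER_SEQ_ID_126),
                         ("SEQ ID NO: 127", PRIMER_SEQ_ID_127)):
        print(name, len(primer), "nt,",
              "GC fraction %.2f," % (gc_count(primer) / len(primer)),
              "Wallace Tm", wallace_tm(primer))
```

In practice, primer design tools apply more accurate thermodynamic models; the point here is only to show how the recited sequences can be inspected programmatically.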
Other sequences within the retroviral genome of which B18Ag1 is a part can similarly be prepared by screening human genomic libraries using B18Ag1-specific sequences as probes. The nucleotides translated into protein from the retroviral genome shown in SEQ ID NO: 141 can be determined by cloning the corresponding cDNAs, predicting the open reading frames and cloning the appropriate cDNAs into a vector containing a viral promoter, such as T7. The resulting constructs can be employed in a translation reaction, using techniques known to those skilled in the art, to identify nucleotide sequences that result in expressed protein. Similarly, primers specific for the remaining breast tumor-specific polypeptides described herein can be designed based on the nucleotide sequences provided in SEQ ID NO: 1 - SEQ ID NO: 86 and SEQ ID NO: 142 - SEQ ID NO: 226.
Recombinant polypeptides encoded by the DNA sequences described above can be readily prepared from the DNA sequences. For example, supernatants from suitable host/vector systems that secrete recombinant protein or polypeptide into the culture medium can first be concentrated using a commercially available filter. Following concentration, the concentrate can be applied to a suitable purification matrix, such as an affinity matrix or an ion exchange resin. Finally, one or more reverse-phase HPLC steps can be employed to further purify a recombinant polypeptide.
In general, any of a variety of expression vectors known to those of ordinary skill in the art can be used to express recombinant polypeptides of this invention. Expression can be achieved in any appropriate host cell that has been transformed or transfected with an expression vector containing a DNA molecule encoding a recombinant polypeptide. Suitable host cells include prokaryotes, yeast and higher eukaryotic cells. Preferably, the host cells employed are E. coli, yeast or a mammalian cell line such as COS or CHO.
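The open-reading-frame prediction step mentioned above for the retroviral genome can be illustrated with a minimal sketch. This is not the procedure actually used for SEQ ID NO: 141, which the text does not specify; the function name, the minimum-length parameter and the restriction to the forward strand are all assumptions made for brevity. A fuller analysis would also scan the reverse complement.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Find ATG-to-stop open reading frames in the three forward frames.

    Returns a list of (start, end, frame) tuples, where start is the index of
    the A of ATG and end is the exclusive index just past the stop codon.
    min_codons counts every codon including the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # look for the next ATG
    return orfs


if __name__ == "__main__":
    # Toy sequence (hypothetical, not taken from the sequence listing).
    print(find_orfs("CCATGAAATTTTAGGG"))
```

Candidate reading frames found in this way would then be confirmed experimentally, for example by the translation reaction described above.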
Such techniques can also be used to prepare polypeptides comprising epitopes or variants of the native polypeptides. For example, variants of a native polypeptide can generally be prepared using standard mutagenesis techniques, such as oligonucleotide-directed site-specific mutagenesis, and sections of the DNA sequence can be removed to permit preparation of truncated polypeptides. Portions and other variants having fewer than about 100 amino acids, and generally fewer than about 50 amino acids, can also be generated by synthetic means, using techniques well known to those of ordinary skill in the art. For example, such polypeptides can be synthesized using any of the commercially available solid-phase techniques, such as the Merrifield solid-phase synthesis method, in which amino acids are sequentially added to a growing amino acid chain. See Merrifield, J. Am. Chem. Soc. 85:2149-2154 (1963). Equipment for the automated synthesis of polypeptides is commercially available from suppliers such as Applied BioSystems, Inc., Foster City, CA, and can be operated according to the manufacturer's instructions.
In specific embodiments, the polypeptides of the present invention encompass amino acid sequences encoded by a DNA molecule having a sequence recited in any one of SEQ ID NO: 1 or SEQ ID NO: 3 - SEQ ID NO: 77 or SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219, 221-227, variants of such polypeptides that are encoded by DNA molecules containing one or more nucleotide substitutions, deletions, insertions and/or modifications at no more than 20% of the nucleotide positions, and epitopes of the above polypeptides. Polypeptides within the scope of the present invention also include polypeptides (and epitopes thereof) encoded by DNA sequences that hybridize to a DNA molecule having a sequence recited in any one of SEQ ID NO: 1 or SEQ ID NO: 3 - SEQ ID NO: 77 or SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219, 221-227 under stringent conditions, wherein the DNA sequences are at least 80% identical in overall sequence to a recited sequence and wherein the RNA corresponding to the nucleotide sequence is expressed at a higher level in human breast tumor tissue than in normal breast tissue. As used herein, "stringent conditions" refers to prewashing in a solution of 6X SSC, 0.2% SDS; hybridizing at 65 °C in 6X SSC, 0.2% SDS overnight; followed by two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65 °C and two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65 °C. DNA molecules according to the present invention include molecules that encode any of the above polypeptides.
In another aspect of the present invention, antibodies are provided. Such antibodies can be prepared by any of a variety of techniques known to those of ordinary skill in the art. See, e.g., Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988. In one such technique, an immunogen comprising the polypeptide is initially injected into any of a wide variety of mammals (e.g., mice, rats, rabbits, sheep or goats). In this step, the polypeptides of this invention can serve as the immunogen without modification. Alternatively, particularly for relatively short polypeptides, a superior immune response can be elicited if the polypeptide is joined to a carrier protein, such as bovine serum albumin or keyhole limpet hemocyanin. The immunogen is injected into the animal host, preferably according to a predetermined schedule incorporating one or more booster immunizations, and the animals are bled periodically. Polyclonal antibodies specific for the polypeptide can then be purified from such antisera, for example, by affinity chromatography using the polypeptide coupled to a suitable solid support.
Monoclonal antibodies specific for the polypeptide
antigen of interest can be prepared, for example, using the technique of Kohler and Milstein, Eur. J. Immunol. 6:511-519 (1976), and improvements thereto. Briefly, these methods involve the preparation of immortal cell lines capable of producing antibodies having the desired specificity (i.e., reactivity with the polypeptide of interest). Such cell lines can be produced, for example, from spleen cells obtained from an animal immunized as described above. The spleen cells are then immortalized, for example, by fusion with a myeloma cell fusion partner, preferably one that is syngeneic with the immunized animal. A variety of fusion techniques can be employed. For example, the spleen cells and myeloma cells can be combined with a non-ionic detergent for a few minutes and then plated at low density on a selective medium that supports the growth of hybrid cells, but not myeloma cells. One selection technique uses HAT (hypoxanthine, aminopterin, thymidine) selection. After a sufficient time, usually about 1 to 2 weeks, colonies of hybrids are observed. Single colonies are selected and their culture supernatants are tested for binding activity against the polypeptide. Hybridomas having high reactivity and specificity are preferred.
Monoclonal antibodies can be isolated from the supernatants of growing hybridoma colonies. In addition, various techniques can be employed to enhance the yield, such as injection of the hybridoma cell line into the peritoneal cavity of a suitable vertebrate host, such as a mouse. The monoclonal antibodies can then be harvested from the ascites fluid or the blood. Contaminants can be removed from the antibodies by conventional techniques, such as chromatography, gel filtration, precipitation and extraction. The polypeptides of this invention can be used in the purification process, for example, in an affinity chromatography step.
Antibodies can be used, for example, in methods for detecting breast cancer in a patient. These methods involve
using an antibody to detect the presence or absence of a breast tumor-specific polypeptide as described herein in a suitable biological sample. As used herein, suitable biological samples include tumor or normal tissue biopsy, mastectomy, blood, lymph node, serum or urine samples, or other tissue, homogenate, or extract thereof obtained from a patient.
There are a variety of assay formats known to those of ordinary skill in the art for using an antibody to detect polypeptide markers in a sample. See, e.g., Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988. For example, the assay can be performed in a Western blot format, wherein a protein preparation from the biological sample is subjected to gel electrophoresis, transferred to a suitable membrane and allowed to react with the antibody. The presence of the antibody on the membrane can then be detected using a suitable detection reagent, as described below.
In another embodiment, the assay involves the use of antibody immobilized on a solid support to bind to the polypeptide and remove it from the remainder of the sample. The bound polypeptide can then be detected using a second antibody or reagent that contains a reporter group. Alternatively, a competitive assay can be used, in which a polypeptide is labeled with a reporter group and allowed to bind to the immobilized antibody after incubation of the antibody with the sample. The extent to which components of the sample inhibit the binding of the labeled polypeptide to the antibody is indicative of the reactivity of the sample with the immobilized antibody and, as a result, indicative of the concentration of polypeptide in the sample.
The solid support can be any material known to those of ordinary skill in the art to which the antibody can be attached. For example, the solid support can be a test well in a microtiter plate, or a nitrocellulose filter or other suitable membrane. Alternatively, the support can be a bead or disc, such as glass, fiberglass, latex or a plastic material such as polystyrene or polyvinyl chloride. The support can also be a magnetic particle or a fiber optic sensor, such as those described, for example, in U.S. Patent No. 5,359,681.
The antibody can be immobilized on the solid support using a variety of techniques known to those skilled in the art, which are amply described in the patent and scientific literature. In the context of the present invention, the term "immobilization" refers to both non-covalent association, such as adsorption, and covalent attachment (which can be a direct linkage between the antigen and functional groups on the support, or a linkage by way of a cross-linking agent). Immobilization by adsorption to a well in a microtiter plate or to a membrane is preferred. In such cases, adsorption can be achieved by contacting the antibody, in a suitable buffer, with the solid support for a suitable amount of time. The contact time varies with temperature, but is typically between about 1 hour and 1 day. In general, contacting a well of a plastic microtiter plate (such as polystyrene or polyvinyl chloride) with an amount of antibody ranging from about 10 ng to about 1 μg, and preferably about 100-200 ng, is sufficient to immobilize an adequate amount of polypeptide.
Covalent attachment of antibody to a solid support can also generally be achieved by first reacting the support with a bifunctional reagent that will react with both the support and a functional group, such as a hydroxyl or amino group, on the antibody. For example, the antibody can be covalently attached to supports having an appropriate polymer coating using benzoquinone or by condensation of an aldehyde group on the support with an amine and an active hydrogen on the binding partner (see, e.g., Pierce Immunotechnology Catalog and Handbook (1991) at A12-A13).
In certain embodiments, the assay for detection of polypeptide in a sample is a two-antibody sandwich assay. This assay can be performed by first contacting an antibody that has been immobilized on a solid support, commonly the well of a microtiter plate, with the biological sample, such that polypeptide within the sample is allowed to bind to the immobilized antibody. Unbound sample is then removed from the immobilized polypeptide-antibody complexes and a second antibody (containing a reporter group) capable of binding to a different site on the polypeptide is added.
The amount of second antibody that remains bound to the solid support is then determined using a method appropriate for the specific reporter group.
More specifically, once the antibody is immobilized on the support as described above, the remaining protein binding sites on the support are typically blocked. Any suitable blocking agent known to those of ordinary skill in the art, such as bovine serum albumin or Tween 20™ (Sigma Chemical Co., St. Louis, MO), can be employed. The immobilized antibody is then incubated with the sample, and the polypeptide is allowed to bind to the antibody. The sample can be diluted with a suitable diluent, such as phosphate-buffered saline (PBS), prior to incubation. In general, an appropriate contact time (i.e., incubation time) is a period of time that is sufficient to detect the presence of polypeptide within a sample obtained from an individual with breast cancer. Preferably, the contact time is sufficient to achieve a level of binding that is at least 95% of that achieved at equilibrium between bound and unbound polypeptide. Those of ordinary skill in the art will recognize that the time necessary to achieve equilibrium can be readily determined by assaying the level of binding that occurs over a period of time. At room temperature, an incubation time of about 30 minutes is generally sufficient.
Unbound sample can then be removed by washing the solid support with an appropriate buffer, such as PBS containing 0.1% Tween 20™. The second antibody, which contains a reporter group, can then be added to the solid support. Preferred reporter groups include enzymes (such as horseradish peroxidase), substrates, cofactors, inhibitors, dyes, radionuclides, luminescent groups, fluorescent groups and biotin. The conjugation of antibody to reporter group can be achieved using standard methods known to those of ordinary skill in the art.
The second antibody is then incubated with the immobilized antibody-polypeptide complex for an amount of time sufficient to detect the bound polypeptide. An appropriate amount of time can generally be determined by assaying the level of binding that occurs over a period of time. Unbound second antibody is then removed and bound second antibody is detected using the reporter group. The method employed for detecting the reporter group depends upon the nature of the reporter group. For radioactive groups, scintillation counting or autoradiographic methods are generally appropriate. Spectroscopic methods can be used to detect dyes, luminescent groups and fluorescent groups. Biotin can be detected using avidin coupled to a different reporter group (commonly a radioactive or fluorescent group or an enzyme). Enzyme reporter groups can generally be detected by the addition of substrate (generally for a specific period of time), followed by spectroscopic or other analysis of the reaction products.
To determine the presence or absence of breast cancer, the signal detected from the reporter group that remains bound to the solid support is generally compared to a signal that corresponds to a predetermined cut-off value established from non-tumor tissue. In one preferred embodiment, the cut-off value is the mean signal obtained when the immobilized antibody is incubated with samples from patients without breast cancer. In general, a sample generating a signal that is three standard deviations above the predetermined cut-off value can be considered positive for breast cancer. In an alternative preferred embodiment, the cut-off value is determined using a Receiver Operator Curve, according to the method of Sackett et al., Clinical Epidemiology: A Basic Science for Clinical Medicine, p. 106-7 (Little Brown and Co., 1985).
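The first cut-off approach described above (mean signal of samples from patients without breast cancer, with positivity at three standard deviations above that mean) can be sketched as follows. The function names and the example signal values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator is intended.

```python
from statistics import mean, stdev


def positive_threshold(control_signals, n_sd=3.0):
    """Return (cutoff, threshold) from signals of cancer-free control samples.

    cutoff is the mean control signal; threshold lies n_sd sample standard
    deviations above it (assumption: sample rather than population SD).
    """
    if len(control_signals) < 2:
        raise ValueError("at least two control signals are required")
    cutoff = mean(control_signals)
    return cutoff, cutoff + n_sd * stdev(control_signals)


def is_positive(sample_signal, control_signals, n_sd=3.0):
    """True if the sample signal is at or above the positivity threshold."""
    _, threshold = positive_threshold(control_signals, n_sd)
    return sample_signal >= threshold


if __name__ == "__main__":
    controls = [0.10, 0.12, 0.08, 0.10]   # hypothetical absorbance readings
    print(positive_threshold(controls))
    print(is_positive(0.20, controls), is_positive(0.12, controls))
```

The number of control samples strongly affects how stable the threshold is, so in practice a larger control panel than the four values shown here would be expected.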
Briefly, in this embodiment, the cut-off value can be determined from a plot of pairs of true positive rates (i.e., sensitivity) and false positive rates (100% - specificity) that correspond to each possible cut-off value for the diagnostic test result. The cut-off value on the plot that is closest to the upper left-hand corner (i.e., the value that encloses the largest area) is the most accurate cut-off value, and a sample generating a signal that is higher than the cut-off value determined by this method can be considered positive. Alternatively, the cut-off value can be shifted to the left along the plot, to minimize the false positive rate, or to the right, to minimize the false negative rate. In general, a sample generating a signal that is higher than the cut-off value determined by this method is considered positive for breast cancer.
In a related embodiment, the assay is performed in a flow-through or strip test format, wherein the antibody is immobilized on a membrane, such as nitrocellulose. In the flow-through test, the polypeptide within the sample binds to the immobilized antibody as the sample passes through the membrane. A second, labeled antibody then binds to the antibody-polypeptide complex as a solution containing the second antibody flows through the membrane. The detection of bound second antibody can then be performed as described above. In the strip test format, one end of the membrane to which antibody is bound is immersed in a solution containing the sample. The sample migrates along the membrane through a region containing the second antibody and to the area of immobilized antibody. Concentration of the second antibody at the area of immobilized antibody indicates the presence of breast cancer. Typically, the concentration of second antibody at that site generates a pattern, such as a line, that can be read visually. The absence of such a pattern indicates a negative result.
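The selection of a cut-off from the plot of true positive rate against false positive rate, as described above, can be sketched as follows. This is a simplified illustration and not the procedure of Sackett et al. as published: the function names and example signals are hypothetical, candidate cut-offs are taken to be the observed signal values, and "closest to the upper left-hand corner" is interpreted as the smallest Euclidean distance to the point (false positive rate 0, true positive rate 1).

```python
import math


def roc_points(positive_signals, negative_signals):
    """Return (cutoff, false_positive_rate, true_positive_rate) triples.

    A sample is called positive when its signal is higher than the cutoff.
    """
    candidates = sorted(set(positive_signals) | set(negative_signals))
    points = []
    for cutoff in candidates:
        tpr = sum(s > cutoff for s in positive_signals) / len(positive_signals)
        fpr = sum(s > cutoff for s in negative_signals) / len(negative_signals)
        points.append((cutoff, fpr, tpr))
    return points


def best_cutoff(positive_signals, negative_signals):
    """Cut-off whose ROC point lies closest to the upper left-hand corner."""
    points = roc_points(positive_signals, negative_signals)
    return min(points, key=lambda p: math.hypot(p[1], 1.0 - p[2]))[0]


if __name__ == "__main__":
    tumor = [0.4, 0.7, 0.8, 0.9, 0.95]    # hypothetical signals, cancer patients
    normal = [0.1, 0.2, 0.3, 0.5]          # hypothetical signals, controls
    print(best_cutoff(tumor, normal))
```

Shifting the cut-off to trade false positives against false negatives, as the text allows, corresponds to choosing a different point from `roc_points` rather than the distance-minimizing one.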
In general, the amount of antibody immobilized on the membrane is selected to generate a visually discernible pattern when the biological sample contains a level of polypeptide that would be sufficient to generate a positive signal in the two-antibody sandwich assay, in the format discussed above. Preferably, the amount of antibody immobilized on the membrane ranges from about 25 ng to about 1 μg, and more preferably from about 50 ng to about 1 μg. Such tests can typically be performed with a very small amount of biological sample.
The presence or absence of breast cancer in a patient can also be determined by evaluating the level of mRNA encoding a breast tumor-specific polypeptide as described herein within the biological sample (e.g., a biopsy, mastectomy and/or blood sample from a patient) relative to a predetermined cut-off value. Such an evaluation can be achieved using any of a variety of methods known to those of ordinary skill in the art such as, for example, in situ hybridization and amplification by polymerase chain reaction.
For example, the polymerase chain reaction can be used to amplify sequences from cDNA prepared from RNA that is isolated from one of the above biological samples. Sequence-specific primers for use in such amplification can be designed based on the sequences provided in any one of SEQ ID NO: 1 or SEQ ID NO: 11 - SEQ ID NO: 86 and SEQ ID NO: 142 - SEQ ID NO: 226, and can be purchased or synthesized. In the case of B18Ag1, as noted herein, one suitable primer pair is B18Ag1-2 (5' ATG GCT ATT TTC GGG GGC TGA CA) (SEQ ID NO: 126) and B18Ag1-3 (5' CCG GTA TCT CCT CGT GGG TAT T) (SEQ ID NO: 127). The PCR reaction products can then be separated by gel electrophoresis and visualized according to methods well known to those of ordinary skill in the art. Amplification is typically performed on samples obtained from matched pairs of tissue (tumor and non-tumor tissue from the same individual) or from unmatched pairs of tissue (tumor and non-tumor tissue from different individuals). The amplification reaction is preferably performed on several dilutions of cDNA spanning two orders of magnitude. A two-fold or greater increase in expression in several dilutions of the tumor sample, as compared to the same dilutions of the non-tumor sample, is considered positive.
Conventional RT-PCR protocols using agarose gels and ethidium bromide staining, while important in defining gene specificity, do not lend themselves to the development of diagnostic kits because of the time and effort required to make them quantitative (i.e., construction of saturation and/or titration curves) and their limited sample throughput. This problem is overcome by the development of procedures such as real-time RT-PCR, which allows the assay to be performed in single tubes and in turn can be modified for use in 96-well plate formats. Instrumentation for performing such methodologies is available from ABI/Perkin Elmer. Alternatively, other high-throughput assays using labeled probes (e.g., digoxigenin) in combination with labeled (e.g., enzyme, fluorescent or radioactive) antibodies to such probes can also be used in the development of 96-well plate assays.
In yet another method for determining the presence or absence of breast cancer in a patient, one or more of the breast tumor-specific polypeptides described herein can be used in a skin test. As used herein, a "skin test" is any assay performed directly on a patient in which a delayed-type hypersensitivity (DTH) reaction (such as swelling, reddening or dermatitis) is measured following intradermal injection of one or more polypeptides as described above. Such injection can be achieved using any suitable device sufficient to contact the polypeptide or polypeptides with dermal cells of the patient, such as a tuberculin syringe or a 1 mL syringe. Preferably, the reaction is measured at least 48 hours after injection, more preferably 48-72 hours.
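The dilution-series criterion given above for the amplification assay (a two-fold or greater increase in expression at several dilutions of the tumor sample relative to the same dilutions of the non-tumor sample) can be sketched as follows. The function names, the band-intensity values and the reading of "several" as "at least two dilutions" are assumptions of this sketch; the text does not fix a number.

```python
def fold_changes(tumor, non_tumor):
    """Tumor/non-tumor signal ratio at each dilution present in both series.

    Both arguments map a dilution factor (e.g. 1, 10, 100) to a measured
    signal such as band intensity. Dilutions with a zero non-tumor signal
    are skipped to avoid division by zero.
    """
    shared = sorted(set(tumor) & set(non_tumor))
    return {d: tumor[d] / non_tumor[d] for d in shared if non_tumor[d] > 0}


def is_overexpressed(tumor, non_tumor, min_fold=2.0, min_dilutions=2):
    """True if the fold increase reaches min_fold at min_dilutions or more."""
    ratios = fold_changes(tumor, non_tumor)
    return sum(r >= min_fold for r in ratios.values()) >= min_dilutions


if __name__ == "__main__":
    # Hypothetical intensities for cDNA dilutions spanning two orders of magnitude.
    tumor_series = {1: 100.0, 10: 40.0, 100: 6.0}
    normal_series = {1: 90.0, 10: 15.0, 100: 2.0}
    print(fold_changes(tumor_series, normal_series))
    print(is_overexpressed(tumor_series, normal_series))
```

Comparing at matched dilutions helps avoid calls made only where the amplification signal is saturated, which is one reason to span two orders of magnitude.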
The DTH reaction is a cell-mediated immune response, which is greater in patients who have previously been exposed to a test antigen (i.e., an immunogenic portion of a polypeptide employed, or a variant thereof). The response can be measured visually, using a ruler. In general, a response that is greater than about 0.5 cm in diameter, preferably greater than about 5.0 cm in diameter, is a positive response, indicative of breast cancer.
The breast tumor-specific polypeptides described herein are preferably formulated, for use in a skin test, as pharmaceutical compositions containing at least one polypeptide and a physiologically acceptable carrier, such as water, saline, alcohol or a buffer. Such compositions typically contain one or more of the above polypeptides in an amount ranging from about 1 μg to 100 μg, preferably from about 10 μg to 50 μg, in a volume of 0.1 mL. Preferably, the carrier employed in such pharmaceutical compositions is a saline solution with appropriate preservatives, such as phenol and/or Tween 80™.
In other aspects of the present invention, the progression of a breast cancer and/or its response to treatment can be monitored by performing any of the above assays over time and evaluating the change in the level of the response (i.e., the amount of polypeptide or mRNA detected or, in the case of a skin test, the extent of the immune response detected). For example, the assays can be performed monthly over a period of 1 to 2 years. In general, breast cancer is progressing in those patients in whom the level of the response increases over time. In contrast, breast cancer is not progressing when the signal detected either remains constant or decreases with time.
In further aspects of the present invention, the compounds described herein can be used for the immunotherapy of breast cancer. In these aspects, the compounds (which can be polypeptides, antibodies or nucleic acid molecules) are preferably incorporated into pharmaceutical compositions or vaccines. Pharmaceutical compositions comprise one or more such compounds and a physiologically acceptable carrier. Vaccines can comprise one or more polypeptides and an immune response enhancer, such as an adjuvant or a liposome (into which the compound is incorporated). Pharmaceutical compositions and vaccines can additionally contain a delivery system, such as the biodegradable microspheres described, for example, in U.S. Patent Nos. 4,897,268 and 5,075,109. Pharmaceutical compositions and vaccines within the scope of the present invention can also contain other compounds, including one or more separate polypeptides.
Alternatively, a vaccine can contain DNA encoding one or more of the polypeptides as described above, such that the polypeptide is generated in situ. In such vaccines, the DNA can be present within any of a variety of delivery systems known to those of ordinary skill in the art, including nucleic acid expression systems, bacteria and viral expression systems. Suitable nucleic acid expression systems contain the DNA sequences necessary for expression in the patient (such as a suitable promoter and termination signal). Bacterial delivery systems involve the administration of a bacterium (such as Bacillus Calmette-Guérin) that expresses an immunogenic portion of the polypeptide on its cell surface. In a preferred embodiment, the DNA can be introduced using a viral expression system (e.g., vaccinia or other pox virus, retrovirus, or adenovirus), which may involve the use of a non-pathogenic (defective), replication-competent virus. Techniques for incorporating DNA into such expression systems are well known to those of ordinary skill in the art. The DNA can also be "naked", as described, for example, in Ulmer et al., Science 259:1745-1749 (1993), and reviewed by Cohen, Science 259:1691-1692 (1993). The uptake of naked DNA can be increased by coating the DNA onto biodegradable beads, which are efficiently transported into the cells.
While any suitable vehicle known to someone skilled in the art can be employed in the pharmaceutical compositions of this invention, the type of vehicle will vary depending on the mode of administration. For parenteral administration, such as subcutaneous injection, and vehicle preferably comprises water, saline, alcohol, a fat, a wax or a buffer solution. For oral administration, any of the vehicles or a solid vehicle, such as mannitol, lactose, starch, magnesium stearate, sodium saccharine, talcum, cellulose, glucose, sucrose and magnesium carbonate, can be used. Biodegradable microspheres (e.g., polylactate polyglycolate) can also be employed as carriers for the pharmaceutical compositions of this invention. Any of a variety of auxiliaries can be employed in the vaccines of this invention to nonspecifically increase the immune response. Most auxiliaries contain a substance designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or mineral oil, and a stimulator of immune responses, such as lipid A, proteins * derived from Bortadella pertussis or Mycobacterium tuberculosis. Suitable auxiliaries are commercially available, e.g., as Incomplete Auxiliary and Freund's Complete Auxiliary (Difco Laboratories, Detroit, Ml), Auxiliary 65 from Merck (Merck and 5 Company, Inc., Rahway, NJ), alume, biodegradable microspheres, Monophosphoryl lipid A and quil A. Cytokines such as GM-CSF or interleukin-2, -7, or -12 can also be used as auxiliaries. The above pharmaceutical compositions and vaccines can
used, for example, for the therapy of breast cancer in a patient. As used herein, a "patient" refers to any warm-blooded animal, preferably a human. A patient may or may not be afflicted with breast cancer. Accordingly, the pharmaceutical compositions and vaccines can be used to prevent the development of breast cancer or to treat a patient afflicted with breast cancer. To prevent the development of breast cancer, a pharmaceutical composition or vaccine comprising one or more polypeptides as described herein can be administered to a patient. Alternatively, naked DNA or a plasmid or viral vector encoding the polypeptide can be administered. To treat a patient with breast cancer, the pharmaceutical composition or vaccine may comprise one or more polypeptides, antibodies or nucleotide sequences complementary to the DNA encoding a polypeptide as described herein (e.g., antisense RNA or antisense deoxyribonucleotide oligonucleotides).

The routes and frequency of administration, as well as the dose, will vary from individual to individual. In general, the pharmaceutical compositions and vaccines can be administered by injection (e.g., intracutaneous, intramuscular, intravenous or subcutaneous), intranasally (e.g., by aspiration) or orally. Between 1 and 10 doses can be administered over a period of 52 weeks. Preferably, 6 doses are administered at intervals of 1 month, and booster vaccinations can be given periodically thereafter. Alternative protocols may be appropriate for individual patients. A suitable dose is an amount of a compound that, when administered as described above, is capable of promoting an anti-tumor immune response. Such a response can be monitored by measuring the anti-tumor antibodies in a patient or by the vaccine-dependent generation of cytolytic effector cells capable of killing the patient's tumor cells in vitro. Such vaccines may also be capable of eliciting an immune response that leads to an improved clinical outcome (e.g., more frequent remissions, complete or partial, or longer disease-free survival) in vaccinated patients compared to unvaccinated patients. In general, for pharmaceutical compositions and vaccines comprising one or more polypeptides, the amount of each polypeptide present in a dose ranges from about 100 μg to 5 mg. Suitable dose volumes will vary with the size of the patient, but will typically range from about 0.1 mL to about 5 mL.

The following Examples are offered by way of illustration and not by way of limitation.

EXAMPLES

Example 1
Preparation of Breast Tumor-Specific cDNAs Using Differential Display RT-PCR

This example illustrates the preparation of cDNA molecules that encode breast tumor-specific polypeptides using a differential display screen.

A. Preparation of B18Ag1 cDNA and Characterization of mRNA Expression

Tissue samples were prepared from breast tumor and normal tissue of a patient with breast cancer that was confirmed by pathology after removal from the patient. Normal RNA and tumor RNA were extracted from the samples, and mRNA was isolated and converted to cDNA using a (dT)12AG-anchored 3' primer (SEQ ID NO: 130). Differential display PCR was then performed using a randomly chosen primer (CTTCAACCTC) (SEQ ID NO: 103). The amplification conditions were standard buffer containing 1.5 mM MgCl2, 20 pmol of primer, 500 pmol of dNTP and one unit of Taq DNA polymerase (Perkin-Elmer, Branchburg, NJ). Forty-five amplification cycles were carried out using denaturation at 94 °C for 30 seconds, annealing at 42 °C for 1 minute and extension at 72 °C for 30 seconds. An RNA fingerprint containing 76 amplified products was obtained. Although the RNA fingerprint of the breast tumor tissue was more than 98% identical to that of normal breast tissue, a band specific to the tumor RNA fingerprint pattern was repeatedly observed. This band was cut out of a silver-stained gel, subcloned into the T-vector (Novagen, Madison, WI) and sequenced. The sequence of the cDNA, designated B18Ag1, is provided
in SEQ ID NO: 1. A database search of GENBANK and EMBL revealed that the initially cloned B18Ag1 fragment is 77% identical to the endogenous human retroviral element S71, which is a truncated retroviral element homologous to the Simian Sarcoma Virus (SSV). S71 contains a gag gene, a portion of the pol gene and an LTR-like structure at the 3' end (see Werner et al., Virology 174:225-238 (1990)). B18Ag1 is also 65% identical to SSV in the region corresponding to the P30 (gag) locus. B18Ag1 contains three separate and incomplete reading frames covering a region that shares considerable homology to a wide variety of gag proteins of retroviruses that infect mammals. In addition, the homology to S71 is not confined to the gag gene, but extends over several kb of sequence including an LTR.

B18Ag1-specific PCR primers were synthesized using computer analysis guidelines. RT-PCR amplification (94 °C, 30 seconds; 60 °C to 42 °C, 30 seconds; 72 °C, 30 seconds, for 40 cycles) confirmed that B18Ag1 represents a real mRNA sequence present at relatively high levels in the patient's breast tumor tissue. The primers used in the amplification were B18Ag1-1 (CTG CCT GAG CCA CAA ATG) (SEQ ID NO: 128) and B18Ag1-4 (CCG GAG GAG GAA GCT AGA GGA ATA) (SEQ ID NO: 129) at a magnesium concentration of 3.5 mM and a pH of 8.5, and B18Ag1-2 (ATG GCT ATT TTC GGG GGC TGA CA) (SEQ ID NO: 126) and B18Ag1-3 (CCG GTA TCT CCT CGT GGG TAT T) (SEQ ID NO: 127) at 2 mM magnesium and pH 9.5. The same experiments showed exceedingly low to nonexistent levels of expression in this patient's normal breast tissue (see Figure 1). RT-PCR experiments were then used to show that B18Ag1 mRNA is present in nine breast tumor samples (from Brazilian and American patients) but absent from, or present at exceedingly low levels in, the normal breast tissue corresponding to each cancer patient. RT-PCR analysis also showed that the B18Ag1 transcript is not present in various normal tissues (including lymph node, myocardium and liver) and is present at relatively low levels in PBMC and lung tissue. The presence of B18Ag1 mRNA in breast tumor samples, and its absence from normal breast tissue, has been confirmed by Northern blot analysis, as shown in Figure 2.

Differential expression of B18Ag1 in breast tumor tissue was also confirmed by RNase protection assays. Figure 3 shows the level of B18Ag1 mRNA in various tissue types as determined in four RNase protection assays. Lanes 1-12 represent normal breast tissue samples; lanes 13-25 represent various breast tumor samples; lanes 26-27 represent normal prostate samples; lanes 28-29 represent prostate tumor samples; lanes 30-32 represent colon tumor samples; lane 33 represents normal aorta; lane 34 represents normal small intestine; lane 35 represents normal skin; lane 36 represents normal lymph node; lane 37 represents normal ovary; lane 38 represents normal liver; lane 39 represents normal skeletal muscle; lane 40 represents a first normal stomach sample; lane 41 represents a second normal stomach sample; lane 42 represents normal lung; lane 43 represents normal kidney; and lane 44 represents normal pancreas. Inter-experimental comparison was facilitated by including a positive control RNA for β-actin message abundance in each assay and normalizing the results of the different assays with respect to this control.

RT-PCR and Southern blot analysis have shown that the B18Ag1 locus is present in human genomic DNA as a single-copy endogenous retroviral element. A genomic clone of
approximately 12-18 kb was isolated using the B18Ag1 sequence as a probe. Four additional subclones were also isolated by XbaI digestion. The additional retroviral sequences obtained from the ends of the XbaI digests of these clones (located as shown in Figure 4) are shown as SEQ ID NO: 3 - SEQ ID NO: 10, wherein SEQ ID NO: 3 shows the location of the sequence labeled 10 in Figure 4, SEQ ID NO: 4 shows the location of the sequence labeled 11-29, SEQ ID NO: 5 shows the location of the sequence labeled 3, SEQ ID NO: 6 shows the location of the sequence labeled 6, SEQ ID NO: 7 shows the location of the sequence labeled 12, SEQ ID NO: 8 shows the location of the sequence labeled 13, SEQ ID NO: 9 shows the location of the sequence labeled 15 and SEQ ID NO: 10 shows the location of the sequence labeled 11-22.

Subsequent studies showed that the 12-18 kb genomic clone contains a retroviral element of approximately 7.75 kb, as shown in Figures 5A and 5B. The sequence of this retroviral element is shown in SEQ ID NO: 141. The numbered line at the top of Figure 5A represents the sense-strand sequence of the retroviral genomic clone. The box accompanying this line shows the position of selected restriction sites. The arrows depict the different overlapping clones used to sequence the retroviral element. The direction of each arrow shows whether the single-pass subclone sequence corresponded to the sense or the antisense strand. Figure 5B is a schematic diagram of the retroviral element showing the arrangement of viral genes within the element. The open boxes correspond to predicted reading frames, starting with a methionine, found throughout the element. Each of the six possible reading frames is shown, as indicated to the left of the boxes, with frames 1-3 corresponding to those found on the sense strand.

Using the cDNA of SEQ ID NO: 1 as a probe, a longer cDNA was obtained (SEQ ID NO: 227) which contains minor nucleotide differences (less than 1%) compared to the genomic sequence shown in SEQ ID NO: 141.

B. Preparation of cDNA Molecules Encoding Other Breast Tumor-Specific Polypeptides

Normal RNA and tumor RNA were prepared, and mRNA was isolated and converted to cDNA using a (dT)12AG-anchored 3' primer,
as described above. Differential display PCR was then performed using the randomly chosen primers of SEQ ID NO: 87-125. The amplification conditions were as noted above, and bands observed to be specific to the tumor RNA fingerprint pattern were cut out of a silver-stained gel, subcloned into either the T-vector (Novagen, Madison, WI) or the pCR II vector (Invitrogen, San Diego, CA) and sequenced. The sequences are provided in SEQ ID NO: 11 - SEQ ID NO: 86. Of the 79 sequences isolated, 67 were found to be novel (SEQ ID NO: 11-77) (see also Figures 6-20).

Subsequent studies identified 84 additional sequences (SEQ ID NOS: 142-226), of which 72 appeared to be novel (SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 219, 218, 219, 221-227). To the best of the inventors' knowledge, none of the previously identified sequences has heretofore been shown to be expressed at a higher level in human breast tumor tissue than in normal breast tissue.

Table 1 shows the level of representative breast tumor-specific transcripts in normal breast tissue (columns BN1-BN7), breast tumor samples (columns BT1-BT12) and normal prostate, kidney, liver, lung, small intestine, stomach, myocardium, lymph node, pancreas, skeletal muscle, ovary and aorta, as determined by RT-PCR analysis. A scale of 0-3 for message abundance was used, with 0 denoting no detectable message and 3 denoting a message level comparable to the control message (glyceraldehyde 3-phosphate dehydrogenase). The absence of data in a given box indicates that the tissue has not been tested for the presence or absence of that specific antigen.
Example 2
Preparation of B18Ag1 DNA from Human Genomic DNA

This Example illustrates the preparation of B18Ag1 DNA by amplification of genomic DNA.

B18Ag1 DNA can be prepared from 250 ng of human genomic DNA using 20 pmol of B18Ag1-specific primers, 500 pmol of dNTPs and 1 unit of Taq DNA polymerase (Perkin Elmer, Branchburg, NJ) with the following amplification parameters: denaturation at 94 °C for 30 seconds; touchdown annealing for 30 seconds from 60 °C down to 42 °C, in 2 °C decrements every two cycles; and extension at 72 °C for 30 seconds. The last step (at an annealing temperature of 42 °C) should be cycled 25 times. The primers were selected using computer analysis. The primers synthesized were B18Ag1-1, B18Ag1-2, B18Ag1-3 and B18Ag1-4. The primer pairs that can be used are 1+3, 1+4, 2+3 and 2+4.
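The touchdown annealing profile described above can be laid out cycle by cycle. The following is a minimal sketch and not part of the original disclosure: it assumes that each annealing temperature from 60 °C down to 44 °C is held for two cycles before the 25 cycles at 42 °C, and the function name and its parameters are hypothetical.

```python
def touchdown_schedule(start_c=60, end_c=42, step_c=2,
                       cycles_per_step=2, final_cycles=25):
    """Return the annealing temperature (deg C) of every cycle, in order.

    Assumption: the temperature drops by step_c after every
    cycles_per_step cycles until end_c is reached, and end_c is then
    held for final_cycles cycles.
    """
    if step_c <= 0 or start_c < end_c:
        raise ValueError("expected start_c >= end_c and step_c > 0")
    temps = []
    current = start_c
    # Touchdown phase: every temperature above the final one.
    while current > end_c:
        temps.extend([current] * cycles_per_step)
        current -= step_c
    # Plateau phase at the final annealing temperature.
    temps.extend([end_c] * final_cycles)
    return temps


schedule = touchdown_schedule()
for cycle, temp in enumerate(schedule, start=1):
    # Denaturation and extension are fixed at 94 C and 72 C, 30 s each.
    print(f"cycle {cycle:2d}: 94 C 30 s / {temp} C 30 s / 72 C 30 s")
```

Under this reading the profile comprises 18 touchdown cycles followed by 25 cycles at 42 °C; a different reading of the decrement rule would only require changing the default arguments.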
After gel electrophoresis, the band corresponding to B18Ag1 DNA can be excised and cloned into a suitable vector.

Example 3
Preparation of B18Ag1 DNA from Breast Tumor cDNA

This Example illustrates the preparation of B18Ag1 DNA by amplification of human breast tumor cDNA.

First-strand cDNA was synthesized from RNA prepared from human breast tumor tissue in a reaction mixture containing 500 ng of poly A+ RNA, 200 pmol of the primer (T)12AG (i.e., TTT TTT TTT TTT AG) (SEQ ID NO: 130), 1X first-strand reverse transcriptase buffer, 6.7 mM DTT, 500 mmoles of dNTP and 1 unit of AMV or MMLV reverse transcriptase (from a supplier such as Gibco-BRL (Grand Island, NY)) in a final volume of 30 μl. After first-strand synthesis, the cDNA was diluted approximately 25-fold and 1 μl was used for amplification as described in Example 2. While some primer pairs may result in a heterogeneous population of transcripts, the primers B18Ag1-2 (5' ATG GCT ATT TTC GGG GGC TGA CA) (SEQ ID NO: 126) and B18Ag1-3 (5' CCG GTA TCT CCT CGT GGG TAT T) (SEQ ID NO: 127) produced a single amplification product of 151 bp.

Example 4
Identification of B-Cell and T-Cell Epitopes of B18Ag1

This Example illustrates the identification of B18Ag1 epitopes.

The B18Ag1 sequence can be screened using a variety of computer algorithms. To determine B-cell epitopes, the sequence can be screened for hydrophobicity and hydrophilicity values using the method of Hopp, Prog. Clin. Biol. Res. 172B:367-77 (1985) or, alternatively, Cease et al., J. Exp. Med. 164:1779-84 (1986) or Spouge et al., J. Immunol. 138:204-12 (1987). Additional Class II MHC (antibody or B-cell) epitopes can be predicted using programs such as AMPHI (see Margalit et al., J. Immunol. 138:2213 (1987)) or the methods of Rothbard and Taylor (e.g., EMBO J. 7:93 (1988)).

Once peptides (15-20 amino acids long) have been identified using these techniques, individual peptides can be synthesized using automated peptide synthesis equipment (available from manufacturers such as Applied Biosystems, Inc., Foster City, CA) and techniques such as Merrifield synthesis. After synthesis, the peptides can be used to screen sera harvested from normal individuals and from breast cancer patients to determine whether breast cancer patients have antibodies reactive with the peptides. The presence of such antibodies in patients with breast cancer would confirm the immunogenicity of the specific B-cell epitope in question. The peptides can also be tested for their ability to generate a serological or humoral immune response in animals (mice, rats, rabbits, chimpanzees, etc.) following immunization in vivo. The generation of a peptide-specific antiserum after such immunization further confirms the immunogenicity of the specific B-cell epitope in question.

To identify T-cell epitopes, the B18Ag1 sequence can be screened using different computer algorithms that are useful for identifying 8-10 amino acid motifs within the B18Ag1 sequence that are capable of binding to HLA Class I MHC molecules (see, for example, Rammensee et al., Immunogenetics 41:178-228 (1995)). After synthesis, such peptides can be tested for their ability to bind to Class I MHC using standard binding assays (e.g., Sette et al., J. Immunol. 153:5586-92 (1994)) and, more importantly, for their ability to generate antigen-reactive cytotoxic T-cells following in vitro stimulation of patient or normal peripheral mononuclear cells using, for example, the methods of Bakker et al., Cancer Res. 55:5330-34 (1995); Visseren et al., J. Immunol. 154:3991-98 (1995); Kawakami et al., J. Immunol. 154:3961-68 (1995); and Kast et al., J. Immunol. 152:3904-12 (1994). The successful in vitro generation of T-cells capable of killing autologous tumor cells (bearing the same Class I MHC molecules) following in vitro peptide stimulation further confirms the immunogenicity of the B18Ag1 antigen. In addition, such peptides can be used to generate murine peptide- and B18Ag1-reactive cytotoxic T-cells following in vivo immunization of mice rendered transgenic for the expression of a particular human MHC Class I haplotype (Vitiello et al., J. Exp. Med. 173:1007-15 (1991)).
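The first step of the T-cell epitope screen described in this Example, listing every 8-10 amino acid stretch of the protein as a candidate motif, can be sketched as follows. This is only an illustration under stated assumptions: it enumerates candidates and does not score HLA binding, it is not the algorithm of any of the cited references, and the function name is hypothetical. The fragment used corresponds to residues 24-41 of SEQ ID NO: 2 in one-letter code.

```python
def candidate_motifs(protein, lengths=(8, 9, 10)):
    """Return (1-based start, peptide) for every contiguous window.

    Purely an enumeration step; ranking the windows by predicted
    MHC binding would need a separate scoring method.
    """
    protein = protein.upper()
    windows = []
    for k in lengths:
        for start in range(len(protein) - k + 1):
            windows.append((start + 1, protein[start:start + k]))
    return windows


# Residues 24-41 of SEQ ID NO: 2 (His Arg Tyr Leu Leu Val Gly Ile Gln
# Gly Ala Ala Gln Lys Pro Ile Asn Leu), written in one-letter code.
fragment = "HRYLLVGIQGAAQKPINL"

nine_mers = [pep for _, pep in candidate_motifs(fragment, lengths=(9,))]
print(len(nine_mers))            # 10 overlapping 9-mers
print("YLLVGIQGA" in nine_mers)  # True
print("GAAQKPINL" in nine_mers)  # True
```

Both 9-mers found here appear in the list of predicted HLA A2.1 motifs given in this Example, which is consistent with those motifs being contiguous stretches of the SEQ ID NO: 2 polypeptide.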
A representative list of predicted B18Ag1 B-cell and T-cell epitopes, broken down according to predicted HLA Class I MHC binding antigen, is shown below.

Predicted Th Motifs (B-cell epitopes) (SEQ ID NOS: 131-133)
SSGGRTFDDFHRYLLVGI
QGAAQKPINLSKXIEVVQGHDE
SPGVFLEHLQEAYRIYTPFDLSA

Predicted HLA A2.1 Motifs (T-cell epitopes) (SEQ ID NOS: 134-140)
YLLVGIQGA
GAAQKPINL
NLSKXIEVV
EVVQGHDES
HLQEAYRIY
NLAFVAQAA
FVAQAAPDS

Example 5
Characterization of Breast Tumor Genes Discovered by Differential Display PCR

The specificity and sensitivity of the breast tumor genes discovered by differential display PCR were determined using RT-PCR. This procedure allowed rapid, semiquantitative evaluation of breast tumor gene mRNA expression without using large amounts of RNA. Using gene-specific primers, mRNA expression levels were examined in a variety of tissues, including 8 breast tumors, 5 normal breast samples, 2 prostate tumors, 2 colon tumors, 2 lung tumors and 14 other normal adult human tissues, including normal prostate, colon, kidney, liver, lung, ovary, pancreas, skeletal muscle, skin, stomach and testes.

To ensure the semiquantitative nature of the RT-PCR, β-actin was used as an internal control for each of the tissues examined. Several dilutions of the first-strand cDNA were prepared and RT-PCR assays were carried out using β-actin-specific primers. A dilution was then selected that allowed amplification of the β-actin template within the linear range and that was sufficiently sensitive to reflect differences in the initial copy number. Using this condition, β-actin levels were determined for each reverse transcription reaction from each tissue. DNA contamination was minimized by DNase treatment and by ensuring a negative result when using first-strand cDNA prepared without the addition of reverse transcriptase.

Using gene-specific primers, mRNA expression levels were determined in a variety of tissues. To date, 32 genes have been successfully examined by RT-PCR, three of which exhibited good specificity and sensitivity for breast tumors. Figures 21A and 21B depict the results for these three genes: B15AG-1 (SEQ ID NO: 27), B31GA1b (SEQ ID NO: 148) and B38GA2a (SEQ ID NO: 157).

From the foregoing, it will be appreciated that, although specific embodiments of the invention have been described herein for the purpose of illustration, various modifications may be made without departing from the spirit and scope of the invention.
SEQUENCE LISTING

(1) GENERAL INFORMATION:
(i) APPLICANT: Corixa Corporation
(ii) TITLE OF THE INVENTION: COMPOSITIONS AND METHODS FOR THE TREATMENT AND DIAGNOSIS OF BREAST CANCER
(iii) NUMBER OF SEQUENCES: 227
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: SEED and BERRY LLP
(B) STREET: 6300 Columbia Center, 701 Fifth Avenue
(C) CITY: Seattle
(D) STATE: Washington
(E) COUNTRY: USA
(F) ZIP: 98104-7092
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE: 10 JANUARY 1996
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Maki, David J.
(B) REGISTRATION NUMBER: 31,392
(C) REFERENCE/DOCKET NUMBER: 210121.419PC
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (206) 622-4900
(B) TELEFAX: (206) 682-6031

(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 363 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..363
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
TTA GAG ACC CAG TTG GGA CCT AAT TGG GAC CCA AAT TTC TGA AGT GGA 48
Leu Glu Thr Gln Leu Gly Pro Asn Trp Asp Pro Asn Phe Ser Ser Gly
1 5 10 15

GGG AGA ACT TTT GAC GAT TTC CAC CGG TAT CTC CTC GTG GGT ATT CAG 96
Gly Arg Thr Phe Asp Asp Phe His Arg Tyr Leu Leu Val Gly Ile Gln
20 25 30

GGA GCT GCC CAG AAA CCT ATA AAC TTG TCT AAG GCG ATT GAA GTC GTC 144
Gly Ala Ala Gln Lys Pro Ile Asn Leu Ser Lys Ala Ile Glu Val Val
35 40 45

CAG GGG CAT GAT GAG TCA CCA GGA GTG TTT TTA GAG CAC CTC CAG GAG 192
Gln Gly His Asp Glu Ser Pro Gly Val Phe Leu Glu His Leu Gln Glu
50 55 60

GCT TAT CGG ATT TAC ACC CCT TTT GAC CTG GCA GCC CCC GAA AAT AGC 240
Ala Tyr Arg Ile Tyr Thr Pro Phe Asp Leu Ala Ala Pro Glu Asn Ser
65 70 75 80

CAT GCT CTT AAT TTG GCA TTT GTG GCT CAG GCA GCC CCA GAT AGT AAA 288
His Ala Leu Asn Leu Ala Phe Val Ala Gln Ala Ala Pro Asp Ser Lys
85 90 95

AGG AAA CTC CAA AAA CTA GAG GGA TTT TGC TGG AAT GAA TAC CAG TCA 336
Arg Lys Leu Gln Lys Leu Glu Gly Phe Cys Trp Asn Glu Tyr Gln Ser
100 105 110

GCT TTT AGA GAT AGC CTA AAA GGT T 363
Ala Phe Arg Asp Ser Leu Lys Gly Phe
115 120
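Two quick consistency checks can be run against the SEQ ID NO: 1 listing above: translating codons with the standard genetic code, and locating a primer from Example 1 on the listed strand. The sketch below is illustrative only and not part of the original listing; the helper names are hypothetical, and only short stretches of the listing that are legible in this copy are used as input.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA (spaces allowed) in frame 1; unknown codons give 'X'."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, usable, 3))


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T sequence."""
    dna = "".join(dna.split()).upper()
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# First 13 codons of SEQ ID NO: 1 as listed.
first_codons = "TTA GAG ACC CAG TTG GGA CCT AAT TGG GAC CCA AAT TTC"
print(translate(first_codons))  # LETQLGPNWDPNF

# Primer B18Ag1-1 (SEQ ID NO: 128) and codons from the row ending at 288.
primer_1 = "CTG CCT GAG CCA CAA ATG"
segment = "GCA TTT GTG GCT CAG GCA GCC CCA GAT AGT AAA"
site = "".join(segment.split()).find(reverse_complement(primer_1))
print(site)  # 1 (0-based offset within the segment)
```

The translated residues agree with the amino acid row of the listing, and the match of the reverse complement indicates that B18Ag1-1 anneals as an antisense primer relative to the strand shown.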
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 121 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Leu Glu Thr Gln Leu Gly Pro Asn Trp Asp Pro Asn Phe Ser Ser Gly
1 5 10 15

Gly Arg Thr Phe Asp Asp Phe His Arg Tyr Leu Leu Val Gly Ile Gln
20 25 30

Gly Ala Ala Gln Lys Pro Ile Asn Leu Ser Lys Ala Ile Glu Val Val
35 40 45

Gln Gly His Asp Glu Ser Pro Gly Val Phe Leu Glu His Leu Gln Glu
50 55 60

Ala Tyr Arg Ile Tyr Thr Pro Phe Asp Leu Ala Ala Pro Glu Asn Ser
65 70 75 80

His Ala Leu Asn Leu Ala Phe Val Ala Gln Ala Ala Pro Asp Ser Lys
85 90 95

Arg Lys Leu Gln Lys Leu Glu Gly Phe Cys Trp Asn Glu Tyr Gln Ser
100 105 110

Ala Phe Arg Asp Ser Leu Lys Gly Phe
115 120

(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1101 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

TCpAGAATC pCATACCCC GAACTCTTGG GAAAACTpA ATCAGTCACC TACAGTCTAC 60
CACCCATHA GGAGGAGCAA AGCTACCTCA GCTCCTCCGG AGCCGTpTA AGATCCCCCA 120
* 10 TCpCAAAGC CTAACAGATC AAGCAGCTCT CCGGTGCACA ACCTGCGCCC AGGTAAATGC 180
CAAAAAAGGT CCTAAACCCA GCCCAGGCCA CCGTCTCCAA GAAAACTCAC CAGGAGAAAA 240
GTGGGAAAp GACTITACAG AAGTAAAACC ACACCGGGCT GGGTACAAAT ACCpCTAGT 300
ACTGGTAGAC ACCTTCTCTG GATGGACTGA AGCATpGCT ACCAAAAACG AAACTGTCAA 360
TATGGTAGp AAGppTAC TCAATGAAAT CATCCCTCGA CGTGGGCTGC CTGpGCCAT 420
AGGGTCTGAT AATGGAACGG CCpCGCCp GTCTATAGp TAATCAGTCA GTAAGGCGp 480
AAACApCAA TGGAAGCTCC ApGTGCCTA TCGACCCAGA GCTCTGGGCA AGTAGAACGC 540
ATGAACTGCA CCCTAAAAAA ACACTCpAC AAAApAATC pAAAAACCG GTGpAApG 600
TGpAGTCTC CpCCCpAG CCCTACTTAG AGpAAGGTG CACCCCTTAC TGGGCTGGGT 660
TCpTACCp pGAAATCAT NTTTGGAAG GGGCTGCCTA TCpTNCpA ACTAAAAAAN 720
AAAAATCTNC TGCCCTpTC AAGGAACCAT CCCATCCAp CCTNAACAAA AGGCCTGCCN 840
TTCpCCCCC AGpAACTNT ppTNpAA AApCCCAAA AAANGAACCN CCTGCTGGAA 900
AAACNCCCCC CTCCAANCCC CGGCCNAAGN GGAAGGpCC CpGAATCCC NCCCCCNCNA 960
ANGGCCCGGA ACCNpAAAN TNGpCCNGG GGGTNNGGCC TAAAAGNCCN ATpGGTAAA 1020
CCTANAAAp ppcppN TAAAAACCAC NpTNNpT pCTTAAACA AAACCCTNp 1080
TNTAGNANCN TATpCCCNC C 1101
(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1087 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

TCTAGAGCTG CGCCTGGATC CCGCCACAGT GAGGAGACCT GAAGACCAGA GAAAACACAG 60
CAAGTAGGCC CpTAAACTA CTCACCTGTG pGTCTTCTA ATTTApCTG TpTATpTG 120
pTCCATCAT pTAAGGGGT TAAAATCATC pßpCAGAC CTCAGCATAT AAAATGACCC 180
ATCTGTAGAC CTCAGGCTCC AACCATACCC CAAGAGpGT CTGGTpTGT pAAApACT 240
GCCAGGpTC AGCTGCAGAT ATCCCTGGAA GGAATApCC AGApCCCTG AGTAGTTTCC 300 2 ooieíonu opioB -Odll (a) seseq ep sejed 0101- On lONO "! (V) • VI0N3n03S 30 SV0llSJt_J310VdV0 (!) • 9-ON 0103S VUVd NÓlOVI? JdOdNI (z) 03
0801 1V1WWN133U1L31VD 31NWW3333N1Í131LLL 1131333333333131ULU.
0201 W333319NV 33333N33199933NW33311199999¡V1N1N1N33333133N1LLL
096 NV33333NNN Nlllllllll 1VN33VW1L 11333JUL1L9993NN1W91131NV991N1 91. 006 31LLLWVN11V331331LL 1L9NN9N33333333V1L399NV1NN99N11N1WN3133
0fr8 39N31333331N31VN1N3333V9131NNV 333991V99V p9VlLLL91331LL9993V
08 ¿WWV19V1L n milB LLL 13993333333333N 3333399VNV 11W99133V
UL 11W99NN3333V33133V333IIMI33N 33V11WW3331N3913N1 WNW39113 01-
099 3N13933131 W33991LLL NLLU.W999139NV3319119331391131NV993WW *
009 U1LL1NN13 INlNllllNN JL133N3N313131LWV1V9 VN31V3331111VÍV1LL19
OW 19W9W91131L91WJJLL 31131W9V1199199111191VW331LV V1333V1L9V 9
08! 7 31LLV33913 WV1LV99V99V3JJ.3011V WVHWV9V 919V11913V 19VW1LL39
02 * V13333V1311LLWWWV V1V1393WV 9191L391W 1LL3W1LLL 1V99LU1V9
09C 1V3VWW9V 9V3191331L 9V9W99V911L191311311399V1V1331WW1L99V # 09 * (C) THREAD FORM: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: TCTAGACCAA GAAATGGGAG GATpTAGAG TGACTGATGA TpCTCTATC ATCTGCAGp 60
AGTAAACAp CTCCACAGp TATGCAAAAA GTAACAAAAC CACTGCAGAT GACAAACACT 120
AGGTAACACA CATACTATCT CCCAAATACC TACCCACAAG CTCAACAAp pAAACTGp 180
AGGATCACTG GCTCTAATCA CCATGACATG AGGTCACCAC CAAACCATCA AGCGCTAAAC 240
AGACAGAATG pTCCACTCC TGATCCACTG TGTGGGAAGA AGCACCGAAC pACCCACTG 300
GGGGGCCTGC NTCANAANAA AAGCCCATGC CCCCGGGTNT NCC rTNAAC CGGAACGAAT 360
NAACCCACCA TCCCCACANC TCCTCTGpC NTGGGCCCTG CATCTTGTGG CCTCNTNTNC 420
TTTNGGGGAN ACNTGGGGAA GGTACCCCAT pCNpGACC CCNCNANAAA ACCCCNGTGG 480
CCCTTTGCCC TGApCNCNT GGGCCTpTC TCTpTCCCT TTTGGGpGT pAAApCCC 540
• AATGTCCCCN GAACCCTCTC CNTNCTGCCC AAAACCTACC TAAATTNCTC NCTANGNNp 600
pCTTGGTGT TNCT? TCAA AGGTNACCTT NCCTGpCAN NCCCNACNAA AATrTrípCC 660
NTATNNTGGN CCCNNAAAAA NNNATCNNCC CNAApGCCC GAApGGpN GGppTCCT 720
NCTGGGGGAA ACCCpTAAA pTCCCCCp GGCCGGCCCC CCl II 11 ICC CCCCpTNGA 780
# AGGCAGGNGG pCTTCCCGA ACTTCCAAp NCAACAGCCN TGCCCApGN TGAAACCCp 840
pCCTAAAAT TAAAAAATAN CCGGpNNGG NNGGCCTCp TCCCCTCCNG GNGGGNNGNG 900
AAANTCCTTA CCCCNAAAAA GGpGCpAG CCCCCNGTCC CCACTCCCCC NGGAAAAATN 960
AACCTiTTCN AAAAAAGGAA TATAANpTN CCACTCCTTN GpCTCpCC 1010
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 950 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
TCTAGAGCTC GCGGCCGCGA GCTCTAATAC GACTCACTAT AGGGCGTC6A CTCGATCTCA 60
GCTCACTGCA ATCTCTGCCC CCGGGGTCAT GCGApCTCC TGCCTCAGCC pCCAAGTAG_120_* CTGGGApAC AGGCGTGCAA CACCACACCC GGCTAATTp GTATppAA TAGAGATGGG 180
GTTpCCCp GpGGCCANN ATGGTCTCNA ACCCCTGACC TCNNGTGATC CCCCCNCCCN 240
NGANCTCNNA CTGCTGGGGA TNNCCGNNNN NNNCCTCCCN NCNCNNNNNN NCNCNNTCCN 300
TNNTCCTTNC TCNNNNNNNN CNNTCNNTCC NNCTTCTCNC CNNNTNpNT CNNCNNCCNN 360
CNNNCCNCNT NCCCNCNNNT TCNCNTNCNN TNTCCNNCNN NNTCNNCNNN CNNNNCNTNN 420
CCNNTACNTC NTNNNCNNNT CCNTCTNTNN CCTCNNCNNT CNCTNCNCNT TNTCTCCTCN 480
NTNNNNNNCT CCNNNNNTCT CNTCNCNNCN TNCCTCNNTN NCCNCNCCCC NCCTCNCNNC 540
CTNNTTTNNN CNNCNNNTCC NTNCCNpCN NNTCCNNTNN CNNCNTCNCN NNCNpMTC 600
CCNCCNNpC CpNCNCNTN NNNTNTCNNN CNCNTCNNTC NTpNCTCCT NNNTCCCNNC 660
-10 TCNNpCNCC CNNNTCCNCC CCCCNCCTNT CTCTCNCCCN NNTNNNTNTN NNNCNTCCNC 720
TNTCNCNpC NTCNNTNCNT TNCTNTCNNC NNCNNTNCNC TNCCNTNTNT CTNNNTCNCN 780
TCNCNTNTCN CCNTCCNpN CTNTCTCCTN TNTCCpCCC CTCNCCTNCT CNpCNCCNC 840
CCNNTNTNTN TNNCNCCNNT NCTNNNCNNC CNTCNTTTCN TCTCTNCTNN NNNTNNCCTC 900
NNCCCNTNCC CTNNTNCNCT NCTNNTACCN TNCTNCTCCN TCTTCCTTCC 950

(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1086 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
TCTAGAGCTC GCGGCCGCGA GCTCAApAA CCCTCACTAA AGGGAGTCGA CTCGATCAGA 60 CTGpACTGT GTCTATGTAG AAAGAAGTAG ACATAAGAGA pCCAppG pCTGTACTA 120 5 AGAAAAAApC pCTGCCTTG AGATGCTGp AATCTGTAAC CCTAGCCCCA ACCCTGTGCT 180
CACAGAGACA TGTGCTGTGT TGACTCAAGG pCAATGGAT pAGGGCTAT GCTpGpAA 240
AAAAGTGCTT GAAGATAATA TGCTTGTTAA AAGTCATCAC CApCTCTAA TCTCAAGTAC 300
CCAGG6ACAC AATACACTGC GGAAGGCCGC AGGGACCTCT GTCTAGGAAA GCCAGGTAp 360
GTCCAAGAp TCTCCCCATG TGATAGCCTG AGATATGGCC TCATGGGAAG GGTAAGACCT 420
GACTGTCCCC CAGCCCGACA TCCCCCAGCC CGACATCCCC CAGCCCGACA CCCGAAAAGG 480
GTCTGTGCTG AGGAAGApA NTAAAAGAGG AAGGCTCTp GCApGAAGT AAGAAGAAGG 540
CTCTGTCTCC TGCTCGTCCC TGGGCAATAA AATGTCTTGG TGpAAACCC GAATGTATGT 600
TCTACTTACT GAGAATAGGA GAAAACATCC pAGGGCTGG AGGTGAGACA CCCTGGCGGC 660
ATACTGCTCT pAATGCACG AGATGTTTGT NTAApGCCA TCCAGGGCCA NCCCCTUCC 720 pAAcprp ATGANACAAA AAcprGpc NcppccTG CGAACCTCTC CCCCTA? AN 780
CCTApGGCC TGCCCATCCC CTCCCCAAAN GGTGAAAANA TGpCNTAAA TNCGAGGGAA 840
TCCAAAACNT pTCCCGpG GTCCCCTpC CAACCCCGTC CCTGGGCCNN pTCCTCCCC 900
AACNTGTCCC GGNTCCTTCN pCCCNCCCC CTTCCCNGAN AAAAAACCCC GTNTGANGGN 960
GCCCCCTCAA ApATAACCT pCCNAAACA AANNGGpCN AAGGTGGTp GNpCCGGTG 1020
CGGCTGGCCT TGAGGTCCCC CCTNCACCCC AATTTGGAAN CCNGppp pApGCCCN 1080
NTCCCC 1086

(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1177 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
NCCNpTAGA TGpGACAAN NTAAACAAGC NGCTCAGGCA GCTGAAAAAA GCCACTGATA 60
AAGCATCCTG GAGTATCAGA GTTTACTGp AGATCAGCCT CATpGACTT CCCCTCCCAC 120
ATGGTGpTA AATCCAGCTA CACTACTTCC TGACTCAAAC TCCACTApC CTGpCATGA 180
CTGTCAGGAA CTGpGGAAA CTACTGAAAC TGGCCGACCT GATCTTCAAA ATGTGCCCCT 240
AGGAAAGGTG GATGCCACCG TGpCACAGA CAGTACCNCC pCCTCGAGA AGGGACTACG 300
AGGGGCCGGT GCANCTGpA CCAAGGAGAC TNATGTGpG TGGGCTCAGG CTpACCANC 360
* AAACACCTCA NCNCNNAAGG CTGAApGAT CGCCCTCACT CAGGCTCTCG GATGGGGTAA 420
GGGATApAA CGpAACACT GACAGCAGGT ACGCCpTGC TACTGTGCAT GTACGTGGAG 480
CCATCTACCA GGAGCGTGGG CTACTCACTC GGCAGGTGGC TGTNATCCAC TGTAAANG6A 540
# CATCAAAAGG AAAACNNGGC TGpGCCCGT GGTAACCANA AANCTGATCN NCAGCTCNAA 600
GATGCTGTGT TGACTTTCAC TCNCNCCTCT TAAACpGCT GCCCACANTC TCCTpCCCA 660
ACCAGATCTG CCTGACAATC CCCATACTCA AAAAAAAAAN AANACTGGCC CCGAACCCNA 720
ACCAATAAAA ACGGGGANGG TNGGTNGANC NNCCTGACCC AAAAATAATG GATCCCCCGG 780
GCTGCAGGAA pCAApCAN CCpATCNAT ACCCCCAACN NGGNGGGGGG GGCCNGTNCC 840
CApNCCCCT NTATTNApC pTNNCCCCC CCCCCGGCNT CCTTTpNAA CTCGTGAAAG 900
GGAAAACCTG NCTTACCAAN pATCNCCTG GACCNTCCCC pCCNCGGTN GNpANAAAA 960
AAAAGCCCNC ANTCCCNTCC NAAApTGCA CNGAAAGGNA AGGAApTAA CCpTApp 1020
pNNTCCTTT ANTpGTNNN CCCCCTpTA CCCAGGCGAA CNGCCATCNT pAANAAAAA 1080
AAANAGAANG pTATppC CpNGAACCA TCCCAATANA AANCACCCGC NGGGGAACGG 1140
GGNGGNAGGC CNCTCACCCC CTTTNTGTNG GNGGGNC 1177
(2) INFORMATION FOR SEQ ID NO: 9: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1146 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
NCCNNpNNT GATGpGTCT piTGGCCTC TCTpGGATA CpTCCCTCT CpCAGAGGT 60
GAAAAGGGTC AAAAGGAGCT GpGACAGTC ATCCCAGGTG GGCCAATGTG TCCAGAGTAC 120
AGACTCCATC AGTGAGGTCA AAGCCTGGGG CTTpCAGAG AAGGGAGGAT TATGGGTTp 180
CCAApATAC AAGTCAGAAG TAGAAAGAAG GGACATAAAC CAGGAAGGGG GTGGAGCACT 240
CATCACCCAG AGGGACTTGT GCCTCTCTCA GTGGTAGTAG AGGGGCTACT TCCTCCCACC 300
ACGGpGCAA CCAAGAGGCA ATGGGTGATG AGCCTACAGG GGACATANCC GAGGAGACAT 360
GGGATGACCC TAAGGGAGTA GGCTGGTpT AAGGCGGTGG GACTGGGTGA GGGAAACTCT 420
CCTCpcpC AGAGAGAAGC AGTACAGGGC GAGCTGAACC GGCTGAAGGT CGAGGCGAAA 480
ACACGGTCTG GCTCAGGAAG ACCTTGGAAG TAAAApATG AATGGTGCAT GAATGGAGCC 540
ATGGAAGGGG TGCTCCTGAC CAAACTCAGC CApGATCAA TGpAGGGAA ACTGATCAGG 600
GAAGCCGGGA ApTCApAA CAACCCGCCA CACAGCpGA ACApGTGAG GpCAGTGAC 660
CCpCAAGGG GCCACTCCAC TCCAACTpG GCCApCTAC pTGCNAAAT pCCAAAACT 720
TCCTTppA A GCCGAATC CNTANTCCCT NAAAAACNAA AAAAAATCTG CNCCTApCT 780
GGAAAAGGCC CANCCCpAC CAGGCTGGAA GAAATTpNC Cl l l l l l l l l ppTGAAGG 840
09e 1331393V33 39V19V9199 V9139133V3 39V3V11V99 19V3VU133 191V9111L3
OO? 31L1L91933 9V3W39993 V33V1L9913 1LL39W1W 9139139V91 131V3913V3
0 * 2 V319VW913 V919399V9V U19W39V3 9991913V9V 99WV99191 399131V9V3
081 99V3191LV9 99391V19V9 1L91311311 31LLL9V339 9V9131L9V9 1313991LV1
021 liliV91331 133W33V33 W9W9DVD1 9V9V131L9V 13V331V999 9_ 30_VD91
09 331LW931V 1V91139W1 V931V1993V 93199V9313 33339993V1 9991LV31L3
(2) INFORMATION FOR SEQ ID NO: 10: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 545 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
9 * 11 NV9V1V
0 * 11 1LL1NN133. NWWV999N 99N1N91LLN WVWNN331 JJ11V33WV 331133W3N
0801 VNJ1V9N3J-9 9N1LLLJJ13 33N33NW99 VN3NV3331L 1LL3WVN33 UJLÍV33333
0201 3333331119 NN1V9N13N9 9N33111W3 333339WN1 V9V9V999V1 1WLLL131L
096 N1333N331L 33311NJ133 3N333WWV 3WWW30V 1133311WN 3WWV33JJ.
006 1V99099999 9N33N333W WVWW333 3333N3J1W N133W911V W1LN1LLN3
GGTAGATGGC TCCACGTACA TGCACAGTAG CAAAGGCGTA CCTGCTGTCA GTGpAACGT 420
TAATATCCTT ACCCCATCGG AGAGCCTGAG TGAGGGCGAT CAApCAGCC CTpTGTGCT 480
GAGGTGpTG CTGGpAAGC CCTGAACCCA CAACACATCT GTCTCCATGG TAACAGCTGC 540
ACCGG 545
(2) INFORMATION FOR SEQ ID NO: 11: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 196 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
TCTCCTAGGC TGGGCACAGT GGCTCATACC TGTAATCCTG ACCGTpCAG AGGCTCAGGT 60
GGGGGGGATCG CpGAGCCCA AGApTCAAG ACTAGTCTGG GTAACATAGT GAGACCCTAT 120
CTCTACGAAA AAATAAAAAA ATGAGCCTGG TGTAGTGGCA CACACCAGCT GAGGAGGGAG 180
AATCGAGCCT AGGAGA 196
(2) INFORMATION FOR SEQ ID NO: 12: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 388 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
TCTCCTAGGC TTGGGGGCTC TGACTAGAAA pCAAGGAAC CTGGGApCA AGTCCAACTG 60
TGACACCAAC pACACTGTG GNCTCCAATA AACTGCpCT pCCTApCC CTCTCTApA 120
AATAAAATAA GGAAAACGAT GTCTGTGTAT AGCCAAGTCA GNTATCCTAA AAGGAGATAC 180
TAAGTGACAT TAAATATCAG AATGTAAAAC CTGGGAACCA GGpCCCAGC CTGGGApAA 240
ACTGACAGCA AGAAGACTGA ACAGTACTAC TGTGAAAAGC CCGAAGNGGC AATATGpCA 300
CTCTACCGp GAAGGATGGC TGGGAGAATG AATGCTCTGT CCCCCAGTCC CAAGCTCACT 360
TACTATACCT CCTITATAGC CTAGGAGA 388
(2) INFORMATION FOR SEQ ID NO: 13: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 337 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
081 V1_3_13V3V 1111133119933V1LLLL999V931393V 1U1V1V3J11V1V31WW 0Z
021 33WV1LLV31L313911L3 V3V99W1W 3WV1913V11LU19199139V31L919V
09 913VW1V3993W9133V33333W1LLV JLLLV331LL33919V3V1V3391L9V19V1: *? N 0103S: VI0N3fl03S 30 NOlOdlídOSSO (.x) * | B? Ui | : V) 9010d01 (?) G OIÜOUTS: 0 IH 30 VI? IyOd (O) oo? E | onu opioB:? Dll (a) seseq ep seied (.g: anil9NO "l (v): VI0N3n03S 30 svoiisjasiovy VO (!): f! ON Ql O? s V_JVd NOlOVI? Id OdNI (3) 0l
Ee V13V13V V3999111V1 13199V9V91
00 £ 3V9V99919319199V99V11L3V391LLV 131V399V3313131V313913V3V9131L
* 2 39V19131LL 9V3991LLL9 V9V9V9W311L9LL139V1 W191W9319999191199 9 81 3111VI.IIU 33HLV1U91331113311131391V9V31V3V13H1V 91V1V9W3V
21 31V9HW1V 9V3JLL91W399V999V3W V91V1V1VW 999V91V1W WV91333V1
9 11913111W 33WJJ.VJLÍ11V3V31L1LV 11V3131U91V31W1V1339119V19V1
AAACAGpp TAAGTCGTp GGAACAAGAT ATTTTpcp TCCTGGCAGC TpTAACAp 240
ATAGCAAAp TGTGTCTGGG GGACTGCTGG TCACTGTpC TCACAGpGC AAATCAA6GC 300
ATpGCAACC AAGAAAAAAA AAl 11111 IG ppApTGA AACTGGACCG GATAAACGGT 360
GpTGGAGCG GCTGCTGTAT ATAGTpTAA ATGGTTTAp GCACCTCCTT AAGpGCACT 420
TATGTGGGGG GGGGNTpTG NATAGAAAGT NTHANTCAC ANAGTCACAG GGACTpTNT 480
CpiTGGNNA CTGAGCTAAA AAGGGCTGNT TfTCGGGTGG GGGCAGATGA AGGCTCACAG 540
GAGGCCTTTC TCpAGAGGG GGGAACTNCT A 571
(2) INFORMATION FOR SEQ ID NO: 15: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 548 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
TATATAGGTA ATAAcpAM TATAT? TGA TCACCCACTG GGGTGATAAG ACAATAGATA 60
TAAAAGTAp TCCAAAAAGC ATAAAACCAA AGTATCATAC CAAACCAAAT TCATACTGCT 120
TCCCCCACCC GCACTGAAAC pCACCTTCT AACTGTCTAC CTAACCAAAT TCTACCCpC 180
AAGTCpTGG TGCGTGCTCA CTACTCTHl 111 III i ilf I? INIIIIGG AGATGGAGTC 240
TGGCTGTGCA GCCCAGGGGT GGAGTACAAT GGCACAACCT CAGCTCACTG NAACCTCCGC 300
CTCCCAGGp CATGAGApC TCCTGNpCA GCCTTCCCAG TAGCTGGGAC TACAGGTGTG 360
CATCACCATG CCTGGNTAAT pTNGGGTAG AGATGGGGGT TpACATGp 420
GGCCAGGNTG GTNTCGAACT CCTGACCTCA AGTGATCCAC CCACCTCAGG CTCCCAAAGT 480
GCTAGGApA CAGACATGAG CCACTGNGCC CAGNCCTGGT GCATGCTCAC pCTCTAGGC 540
AACTACTA 548
(2) INFORMATION FOR SEQ ID NO: 16: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 638 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
pcCGTTATG CACATGCAGA ATApCTATC GGTACpCAG CTATTACTCA TTpGATGGC 60
GCAATCCGAG CCTATCCTCA AGATGAGTAT pAGAAAGAA pGApTAGC GATAGACCAA 120
GCTGGTAAGC ACTCTGACTA CACGAAApG pCAGATGTG ATGGApTAT GACAGpGAT 180
CpTGGAAGA GApApAAG TGApApp AAAGGGAATC CApAApCC AGAATATCp 240
GGpTAGCTC AAGATGATAT AGAAATAGAA CAGAAAGAGA CTACAAATGA AGATGTATCA 300
CCAACTGATA pGAAGAGCC TATAGTAGAA AATGAApAG CTGCATTTAT TAGCCpACA 360
CATAGCGAp pCCTGATGA ATCTTATAp CAGCCATCGA CATAGCApA CCTGATGGGC 420
AACCTTACGA ATAATAGAAA CTGGGTGCGG GGCTApGAT GAApCATCC NCAGTAAAp 480
TGGATATNAC AAAATATAAC TCGApGCAT pGGATGATG GAATACTAAA TCTGGCAAAA 540
GTAACpTGG AGCTACTAGT AACCTCTCTT TpGAGATGC AAAAppCT pTAGGGTp 600
CpApCTCT ACpTACGGA TApGGAGCA TAACGGGA 638
(2) INFORMATION FOR SEQ ID NO: 17: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 286 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
ACTGATGGAT GTCGCCGGAG GCGAGGGGCC pATCTGATG CTCGGCTGCC TGTTCGTGAT 60
GTGCGCGGCG ApGGGCTGT pATCTCAAA CACCGCCACG GCGGTGCTGA TGGCGCCTAT 120
TGCCpAGCG GCGGCGAAGT CAATGGGCGT CTCACCCTAT CCTTTTGCCA TGGTGGTGGC 180
GATGGCGGCT TCGGCGGCGT pATGACCCC GGTCTCCTCG CCGGpAACA CCCTGGTGCT 240
TGGCCCTGGC AAGTACTCAT pAGCGAHT TGTCAAAATA GGCGTG 286
(2) INFORMATION FOR SEQ ID NO: 18: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 262 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
TCGGTCATAG CAGCCCCTTC pCTCAATTT CATCTGTCAC TACCCTGGTG TAGTATCTCA 60
TAGCCpACA ppTATAGC CTCCTCCCTG GTCTGTCTp TGATprCCT GCCTGTAATC 120
CATATCACAC ATAACTGCAA GTAAACATp CTAAAGTGTG GpATGCTCA TGTCACTCCT 180
GTGNCAAGAA ATAGpTCCA pACCGTCp AATAAAApC GGApTGpC pTCTAp 240
TCACTCTTCA CCTATGACCG AA 262
(2) INFORMATION FOR SEQ ID NO: 19: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 261 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
TCGGTCATAG CAAAGCCAGT GGTpGAGCT CTCTACTGTG TAAACTCCTA AACCAAGGCC 60
ApTATGATA AATGGTGGCA GGAppTAT TATAAACATG TACCCATGCA AATpCCTAT 120
AACTCTGAGA TATApCTTC TACATTTAAA CAATAAAAAT AATCTATpT TAAAAGCCTA 180
ATpGCGTAG pAGGTAAGA GTGpTAATG AGAGGGTATA AGGTATAAAT CACCAGTCAA 240
CGTpCTCTG CCTATGACCG A 261
(2) INFORMATION FOR SEQ ID NO: 20: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 294 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
TCGGTCATAG CAGCCCCpC pCTCAATp CATCTGTCAC TACCCTGGTG TAGTATCTCA 60
TAGCCpACA TTTpATAGC CTCCTCCCTG GTCTGTCpr TGATTpCCT GCCTGTAATC 120
CATATCACAC ATAACTGCAA GTAAACATp CTAAAGTGTG GpATGCTCA TGTCACTCCT 180
GTGNCAAGAA ATAGTpCCA pACCGTCp AATAAAApC GGApTGpC pTNCTApN 240
TCACTCTTCA CCTATGACCG AA 262
(2) INFORMATION FOR SEQ ID NO: 21: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 208 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
TCGGTCATAG CAAAGCCAGT GGTpGAGCT CTCTACTGTG TAAACTCCTA AACCAAGGCC 60
ApTATGATA AATGGTGGCA GGATprTAT TATAAACATG TACCCATGCA AApTCCTAT 120
AACTCTGAGA TATApCTTC TACApTAAA CAATAAAAAT AATCTATpT TAAAAGCCTA 180
ATITGCGTAG pAGGTAAGA GTGpTAATG AGAGGGTATA AGGTATAAAT CACCAGTCAA 240
CGTpCTCTG CCTATGACCG A 261
(2) INFORMATION FOR SEQ ID NO: 22: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 287 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
NCCNpGAGC TGAGTGApG AGATNTGTAA TGGpGTAAG GGTGApCAG GCGGApAGG 60
GTGGCGGGTC ACCCGGCAGT GGGTCTCCCG ACAGGCCAGC AGGATpGGG GCAGGTACGG 120
NGTGCGCATC GCTCGACTAT ATGCTATGGC AGGCGAGCCG TGGAAGGNGG ATCAGGTCAC 180
GGCGCTGGAG CTTTCCACGG TCCATGNAp GNGATGGCTG pCTAGGCGG CTGpGCCAA 240
GCGTGATGGT ACGCTGGCTG GAGCApGAT pCTGGTGCC AAGGTGG 287
(2) INFORMATION FOR SEQ ID NO: 23: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 204 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
pGGGTAAAG GGAGCAAGGA GAAGGCATGG AGAGGCTCAN GCTGGTCCTG GCCTACGACT 60
GGGCCAAGCT GTCGCCGGGG ATGGTGGAGA ACTGAAGCGG GACCTCCTCG AGGTCCTCCG 120
NCGpACTTC NCCGTCCAGG AGGAGGGTCT pCCGTGGTC TNGGAG6AGC GGGGGGAGAA 180
GATNCTCCTC ATGGTCNACA TCCC 204
(2) INFORMATION FOR SEQ ID NO: 25: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 376 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
Z V33V 9191V19V1139V1V11V3V
0 * 2 9JJ19V9119 J_lLV199_.il 11W31V1V11VJL1N119911VW9V91L1 V991LWV1L
081 V1LL11V13N 1LV91W1L9131L1LV1L3 V9V1L91V3V 991V3931W 33U 3W11 01. 021 1LL1VJ119V 991LV1V1L9 V91V333V91 M3NIIIW 91391L9V1V 91WV13319 i 9 LILVIIVV WW311V1V 9999V91LV33V39919V9V 199939V99V 3199J1V991 Jfß '
(2) INFORMATION FOR SEQ ID NO: 24: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 264 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
pACAACGAG GGGAAACTCC GTCTCTACAA AAApAAAAA ApAGCCAGG T6TGGTGGTG 60
TGCACCCGCA ATCCCAGCTA CpGGGAGGT TGAGACACAA GANTCACCTA NATGTGGGAG 120
GTCAAGGpG CATGAGTCAT GApGTGCCA CTGCACTCCA GCCTGGGTGA CAGACCGAGA 180
CCCTGCCTCA ANAGANAANG AATAGGAAGT TCAGAAATCN TGGNTGTGGN GCCCAGCAAT 240
CTGCATCTAT NCAACCCCTG CAGGCAANGC TGATGCAGCC TANGpCAAG AGCTGCTGp 300
TCTGGAGGCA GCAGpNGGG CpCCATCCA GTATCACGGC CACACTCGCA CNAGCCATCT 360
GTCCTCCGTN TGTNAC 376
(2) INFORMATION FOR SEQ ID NO: 26: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 372 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
pACAACGAG GGGAAACTCC GTCTCTACAA AAApAAAAA ApAGCCAGG TGTGGTGGTG 60
TGCACCTGTA ATCCCAGCTA CpGGGCGGC TGAGACACAA GAACCACCTA AATGTGGGAG 120
GGTCAAGGp GCATGAGTCA TGATCGCGCC ACTGCACTCC AGCCTGGGTG ACAGACTGAG 180
ACCCTGCCTC AAAAGAAAAA GAATAGGAAG pCAGAAACC CTGGGTGTGG NGCCCAGCAA 240
TCTGCATpA AACAATCCCT GCAGGCAATG CTGATGCAGC CTAAGpCAA GAGCTGCTGT 300
TCTGGAGGCA GNAGTAAGGG CpCCATCCA GCATCACGGN CAACACTGCA AAAGCACCTG 360
TCCTCGpGG TA 372
(2) INFORMATION FOR SEQ ID NO: 27: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 477 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
pCTGTCCAC ATCTACAAGT pTATTTAp pGTGGGTp TCAGGGTGAC TAAGppTC 60
CCTACApGA AAAGAGAAGT TGCTAAAAGG TGCACAGGAA ATCAIIMII TAAGTGAATA 120
TGATAATATG GGTCCGTGCT TAATACAACT GAGACATAp TGpCTCTGT TppTAGAG 180
TCACCTCpA AAGTCCAATC CCACAATGGT GAAAAAAAAA TAGAAAGTAT pGpCTACC 240
pTAAGGAGA CTGCAGGGAT TCTCCpGAA AACGGAGTAT G6AATCAATC pAAATAAAT 300
ATGAAApGG pGGTCTTCT GGGATAAGAA ApCCCAACT CAGTGTGCTG AAApCACCT 360
GACMIIIII GGGAAAAAAT AGTCGAAAAT GTCAATpGG TCCATAAAAT ACATGpACT 420
ApAAAAGAT ApTAAAGAC AAApcpTC AGAGCTCTAA GApGGTGTG GACAGAA 477
(2) INFORMATION FOR SEQ ID NO: 28: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 438 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
TCTNCAACCT CpGANTGTC AAAAACCpN TAGGCTATCT CTAAAAGCTG ACTGGTApC 60
ApCCAGCAA AATCCCTCTA GTppGGAG pTCC pA CTATCTGGGG CTGCCTGAGC 120
CACAAATGCC AAApAAGAG CATGGCTAp pCGGGGGCT GACAGGTCAA AAGGGGTGTA 180
AATCCGATAA GCCTCCTGGA GGTGCTCTAA AAACACTCCT GGTGACTCAT CATGCCCCTG 240
GACGACpCA ATCGNCpAG ACAAGpTAT AGGTTTCTGG GCAGCTCCCT GAATACCCAC 300
GAGGAGATAC CGGTGGAAAT CGTCAAAAGT TCTCCCTCCA CpGAGAAAT pGGGTCCCA 360
ApAGGTCCC AApGGGTCT CTAATCACTA pCCTCTAGC TTCCTCCTCC GGNCTApGG 420
pGATGTGAG GpGAAGA 438
(2) INFORMATION FOR SEQ ID NO: 29: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 620 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
AAGAGGGTAC CAGCCCCAAG CCpGACAAC pCCATAGGG TGTCAAGCCT GTGGGTGCAC 60
AGAAGTCAAA AApGAGpT TGGGATCCTC AGCCTAGAp TCAGAGGATA TAAAGAAACA 120
CCTAACACCT AGATApCAG ACAAAAGpT ACTACAGGGA TGAAGCpTC ACGGAAAACC 180
TCTACTAGGA AAGTACAGAA GAGAAATGTG GGTpGGAGC CCCCAAACAG AATCCCCTCT 240
AGAACACTGC CTAATGAAAC TGTGAGAAGA TGGCCACTGT CATCCAGACA CCAGAATGAT 300
AGACCCACCA AAAACpATG CCATApGCC TATAAAACCT ACAGACACTC AATGCCAGCC 360
CCATGAAAAA AAAACTGAGA AGAAGACTGT NCCCTACAAT GCCACCGGAG CAGAACTGCC 420
CCAGGCCATG GAAGCACAGC TCTTATATCA ATGTGACCTG GATGpGAGA CATGGAATCC 480
NANGAAATCN TTpAANACT TCCACGGpN AATGACTGCC CTApANAp CNGAACTTAN 540
ATCCNGGCCT GTGACCTCp TGCpTGGCC ApCCCCCp pTGGAATGG 600
CCCATGCCTG TNCCCTCpA 620
(2) INFORMATION FOR SEQ ID NO: 30: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 100 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
pACAACGAG GGGGTCAATG TCATAAATGT CACAATAAAA CAATCTCTTC IMIIMIII 60
1111 H 1111 1111111111 IIIHHin 100
(2) INFORMATION FOR SEQ ID NO: 31: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 762 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
TAGTCTATGC GCCGGACAGA GCAGAApAA ApGGAAGp GCCCTCCGGA CpTCTACCC 60
ACACTCpCC TGAAAAGAGA AAGAAAAGAG GCAGGAAAGA GGpAGGAp TCATpTCAA 120
GAGTCAGCTA ApAGGAGAG CAGAGpTAG ACAGCAGTAG GCACCCCATG ATACAAACCA 180
TGGACAAAGT CCCTGTpAG TAACTGCCAG ACATGATCCT GCTCAGGTp TGAAATCTCT 240
CTGCCCATAA AAGATGGAGA GCAGGAGTGC CATCCACATC AACACGTGTC CAAGAAAGAG 300
TCTCAGGGAG ACAAGGGTAT CAAAAAACAA GApCTTAAT GGGAAGGAAA TCAAACCAAA 360
AAApAGAp pTCTCTACA TATATATAAT ATACAGATAT pAACACAp ApCCAGAGG 420
TGGCTCCAGT CCpGGGGCT TGAGAGATGG TGAAAACTp TGpCCACAT TAACTTCTGC 480
TCTCAAApC TGAAGTATAT CAGAATGGGA CAGGCAATGT pTGCTCCAC ACTGGGGCAC 540
AGACCCAAAT GGpCTGTGC CCGAAGAAGA GAAGCCCGAA AGACATGAAG GATGCpAAG 600
GGGGGpGGG AAAGCCAAAT TGGTANTATC TTITCCTCCT GCCTGTGpC CNGAAGTCTC 660
CNCTGAAGGA ApCTTAAAA CCCpTGTGA GGAAATGCCC CCpACCATG ACAANTGGTC 720
CCApGCpT TAGGGNGATG GAAACACCAA GGGTpTGAT CC 762
(2) INFORMATION FOR SEQ ID NO: 32: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 276 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
TAGTCTATGC GTGTApAAC CTCCCCTCCC TCAGTAACAA CCAAAGAGGC AGGAGCTGp 60
ApACCAACC CCApTTACA GATGCATCAA TAATGACAGA GAAGTGAAGT GACpGCGCA 120
CACAACCAGT AAApGGCAG AGTCAGATp GAATCCATGG AGTCTGGTCT GCACTpCAA 180
TCACCGAATA CCCTITCTAA GAAACGTGTG CTGAATGAGT GCATGGATAA ATCAGTGTCT 240
ACTCAACATC pTGCCTAGA TATCCCGCAT AGACTA 276
(2) INFORMATION FOR SEQ ID NO: 33: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 477 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
TAGTAGpGC CAAATATpG AAAApTACC CAGAAGTGAT TGAAAACTp pGGAAACAA 60
AAACAAATAA AGCCAAAAGG TAAAATAAAA ATATCTpGC ACTCTCGpA pACCTATCC 120
ATAACTpp CACCGTAAGC TCTCCTGCp GpAGTGTAG TGTGGpATA pAAACTTp 180
TAGpApAT prpApcA cppccACT AGAAAGTCAT TApGApTA GCACACATGT 240
TGATCTCAp ppTATAGG CAAAApTGA TGCTATGCAA CAAAAATACT 300
CAAGCCCAp ATCTppTC CCCCCGAAAT CTGAAAApG CAGGGGACAC AGGGAAGpA 360
AApGTAAA TATGpCAGT pATGpTAA AAATGCACAA AACATAAGAA 420
AApGTGTp ACpGAGCTG CTGApGTAA GCAGTpTAT CTCAGGGGCA ACTACTA 477
(2) INFORMATION FOR SEQ ID NO: 34: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 631 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
TAGTAGpGC CAApCAGAT GATCAGAAAT GCTGCTpCC TCAGCApGT CTTGpAAAC 60
CGCATGCCAT pGGAACTp GGCAGTGAGA AGCCAAAAGG AAGAGGTGAA TGACATATAT 120
ATATATATAT ApCAATGAA AGTAAAATGT ATATGCTCAT ATACpTCTA GpATCAGAA 180
TGAGpAAGC pTATGCCAT TGGGCTGCTG CATATTÍTAA TCAGAAGATA AAAGAAAATC 240
TGGGCAGGGT TAGAATGTGA TACATGT? T pTAAAACTG pAMTApA prcGATAp 300
TGTCTAAGAA CCGGAATGp CpAAAApT ACTAAAACAG TApGpTGA GGAAGAGAAA 360
ACTGTACTGT pGCCApAT TACAGTCGTA CAAGTGCATG TCAAGTCACC CACTCTCTCA 420
(2) INFORMATION FOR SEQ ID NO: 35: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 578 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
TAGTAGpGC CATCCCATAT TACAGAAGGC TCTGTATACA TGACTTATp GGAAGTGATC 60
TGTpTCTCT CCAAACCCAT pATCGTAAT pCACCAGTC pGGATCAAT CTTGGpTCC 120
ACTGATACCA TGAAACCTAC pGGAGCAGA CApGCACAG piTCTGTGG TAAAAACTAA 180
AGGTpApT GCTAAGCTGT CATCpATGC pAGTAp l i l i l lACAG TGGGGAApG 240
CTGAGApAC AppGpAT TCApAGATA cprGGGATA ACTTGACACT GTcpcpp 300
TpCGCTpT AApGCTATC ATCATGCTp TGAAACAAGA ACACApAGT CCTCAAGTAT 360
TACATAAGCT TGCTTGpAC GCCTGGTGGT pAAAGGACT ATCTTTGGCC TCAGGpCAC 420
AAGAATGGGC AAAGTGTpC CTTATGpCT GTAGpCTCA ATAAAAGAp GCCAGGGGCC 480
GGGTACTGTG GCTCGCACTG TAATCCCAGC ACTpGGGAA GCTGAGGCTG GCGGATCATG 540
pAGGGCAGG TGpCGAAAC CAGCCTGGGC AACTACTA 578
(2) INFORMATION FOR SEQ ID NO: 36: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 583 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
TAGTAGpGC CTGTAATCCC AGCAACTCAG GAGGCTGGGG CAGGAGAATC AGpGAACCT 60
GGGAGGCAGA AGTTGTAAp AGCAAAGATC GCACCApGC ACTTCAGCCT GGGCAACAAG 120
AGTGAGApC CATCTCAAAA ACAAAAAAAA GAAAAAGAAA AGAAAAGGAA AAAACGTATA 180
AACCCAGCCA AAACAAAATG ATCApCpR TAATAAGCAA GACTAATpA ATGTGpTAT 240
pAATCAAAG CAGpGAATC pCTGAGTTA pGGTGAAAA TACCCATGTA GpAApTAG 300
GGpcpACT TGGGTGAACG pTGATGpC ACAGGpATA AAATGGpAA CAAGGAAAAT 360
GATGCATAAA GAATCTTATA AACTACTAAA AATAAATAAA ATATAAATGG ATAGGTGCTA 420
TGGATGGAGT TpTGTGTAA TTTAAAATCT TGAAGTCAp pGGATGCTC ApGGpGTC 480
TGGTAATpC CApAGGAAA AGGpATGAT ATGGGGAAAC TGTpCTGGA AApGCGGAA 540
TGpTCTCAT CTGTAAAATG CTAGTATCTC AGGGCAACTA CTA 583
(2) INFORMATION FOR SEQ ID NO: 37: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 716 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
GATCTACTAG TCATNTGGAT TCTATCCATG GCAGCTAAGC CTTTCTBAAT GGApCTACT 60
GCTpCTTGT TCpTAATCC AGACCCTTAT ATATGTpAT GpCACAGGC AGGGCAATGT 120
pAGTGAAAA CAApCTAAA TppTATp TGCATpTCA TGCTAATpC CGTCACACTC 180
CAGCAGGCp CCTGGGAGAA TAAGGAGAAA TACAGCTAAA GACApGTCC CTGCpACTT 240
ACAGCCTAAT GGTATGCAAA ACCACpCAA TAAAGTAACA GGAAAAGTAC TAACCAGGTA 300
GAATGGACCA AAACTGATAT AGAAAAATCA GAGGAAGAGA GGAACAAATA TTTACTGAGT 360
CCTAGAATGT ACAAGGCTp pAApACAT ATpTATGTA AGGCCTGCAA AAAACAGGTG 420
AGTAATCAAC ATpGTCCCA TpTACATAT AAGGAAACTG AAGCTTAAAT TGAATAATTT 480
AATGCATAGA pTTATAGp AGACCATGp CAGGTCCCTA TGpATACp ACTAGCTGTA 540
TGAATATGAG AAAATAATp TGpAppC pGGCATCAG TATTITCATC TGCAAAATAA 600
AGCTAAAGp ApTAGCAAA CAGTCAGCAT AGTGCCTGAT ACATAGTAGG TGCTCCAAAC 660
ATGApACNC TANTApNGG TApANAAAA ATCCAATATA GGCNTGGATA AAACCG 716
(2) INFORMATION FOR SEQ ID NO: 38: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 688 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
pCTGTCCAC ATATCATCCC ACpTAApG pAATCAGCA AAACpTCAA TGAAAAATCA 60
CCAGGATCAC ACCAGGAAAC TGAAGGTGTA l i l i l I I M? CCTTAAAAAA 120
AAAAAAAAAA ACCAAACAAA CCAAAACAGA pAACAGCAA AGAGpCTAA AAAApTACA 180
pTCTCTTAC AACTGTCAp CAGAGAACAA TAGpcpAA GTCTGpAAA TCpGGCAp 240
AACAGAGAAA CpGATGAAN AGpGTACTT GGAATApGT GGAI I I I I U ppGTCTAA 300
TCTCCCCCTA pGppGCC AACAGTAAp TAAGTTTGTG TGGAACATCC CCGTAGp &A 360
AGTGTAAACA ATGTATAGGA AGGAATATAT GATAAGATGA TGCATCACAT ATGCApACA 420
TGTAGGGACC pCACAACp CATGCACTCA GAAAACATGC pGAAGAGGA GGAGAGGACG 480
GCCCAGGGTC ACCATCCAGG TGCCTTGAGG ACAGAGAATG CAGAAGTGGC ACTGpGAAA 540
TTTAGAAGAC CATGTGTGAA TGGTpCAGG CCTGGGATGT pGCCACCAA GAAGTGCCTC 600
CGAGAAATp CpTCCCAp TGGAATACAG GGTGGCTTGA TGGGTACGGT GGGTGACCCA 660
ACGAAGAAAA TGAAApCTG CCCTpCC 688
(2) INFORMATION FOR SEQ ID NO: 39: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 585 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
TAGTAGpGC CGCNNACCTA AAANpGGAA AGCATGATGT CTAGGAAACA TANTAAAATA 60
GGGTATGCCT ATGTGCTACA GAGAGATGp AGCApTAAA GTGCATANp pATGTATp 120
TGACAAATGC ATATNCCTCT ATAATCCACA ACTGApACG AAGCTATTAC AApAAAAAG 180
HTGGCCGGG CGTGGTGGGC GGTGGCTGAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA 240
GGCACGCGGA TCACGAGGTC GGGAGpCAA GACCATCCTG GCTAACACGG TGAAAGTCCA 300
TCTCTACTAA AAATACGAAA AAATTACCCC GGCGTGGTGG CGGGCGCCTG TAGTCCCAGC 360
TACTCCGGAG GCTGAGGCAG GAGAATGGCG TGAACCCAGG ACACGGAGCT TGCAGTGTGC 420
CAACATCACG TCACTGCCCT CCAGCCTGGG GGACAGGAAC AAGANTCCCG TCCTCANAAA 480
AGAAAAATAC TACTNATANT pCNACpTA T TAApA CACAGAACTN CCTCpGGTA 540
CCCCCTTACC ApCATCTCA CCCACCTCCT ATAGGGCACN NCTAA 585
ooi? pnu opioB -Odll (a) s? seq? p s? jed ZZ: anil9N01 (V): VI0N3lT03S 30 SV0llSjd310Vd VO 0): |. *: 0N 0103S VidVd NOlOVI? od OdNI (3) 03
SZ * V9V3V 99191V9V191L3WW1W V1WW3V333WW91333 V319V1L3W
02 * VW999V191 W3111L313 U3W3991L 1133V3919133U1V91W WVW1L3V3
09S 1LV1V01V11 V1V333V993 V39WJ1V19119V313191 V1WV3W9V 9V3VWWW
91- 00e 1313V9199V 9W1113V991LV999Í911 V33V31LU11Í1LV133113V1VW3W9
0 * 2 V199WV1L331319V3913331W9V99V V3111L93313V1V331LV91LV9W1LLV
081 1LLV1V3111 W33W33V9 W9V3331V1131L1W9991L9V913V3V 39V3111W9
021 199V319WV VW33UIH 1913V931LL 1V3V911VW 33V991VU11V191V3W1 01
09 9V1W1LLL31V1VW1LL319L11W9W W913139W 9V1L31W33 V3V3319131: 0 * -ON I heard 03S: VI N3fl03S 30 Npl0dl-J0S30 (! X) | B? Ui | -VjOOlOdOl (O) OIMOUTS: 01IH 30 VlAi-JOd () oo¡? | Onu oppB:? Dll (a) s? Seq? P S? JBd g¿: aflll9N01 (V): VI0N3n03S 30 svousjasiova VO (!): 0 * ON 01 G3S VidVd NQIOVI? I-JOdNI (Z)
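(2) INFORMATION FOR SEQ ID NO: 41: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 423 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
TAAGAGGGTA CATCGGGTAA GAACGTAGGC ACATCTAGAG CpAGAGAAG TCTGGGGTAG 60
GAAAAAAATC TAAGTATTTA TAAGGGTATA GGTAACATp AAAAGTAGGG CTAGCTGACA 120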
pApTAGAA AGAACACATA CGGAGAGATA AGGGCAAAGG ACTAAGACCA GAGGAACACT 180
AATATpAGT GATCACTTCC ApCTTGGTA AAAATAGTAA CTpTAAGp AGCpCAAGG 240
AAGATTpTG GCCATGApA GpGTCAAAA GpAGpCTC pGGGpTAT ApACTAAp 300
pGppAAG ATCCpGpA GTGCpTMT AAAGTCATGT TATATCAAAC GCTCTAAAAC 360
ApGTAGCAT GpAAATGTC ACAATATACT TACCATpGT TGTATATGGC TGTACCCTCT 420
CTA 423
(2) INFORMATION FOR SEQ ID NO: 42: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 527 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
TCTCCTAGGC TAATGTGTGT GTpCTGTAA AAGTAAAAAG pAAAAApT TAAAAATAGA 60
AAAAAGCpA TAGAATAAGA ATATGAAGAA AGAAAATAp pTGTACAp TGCACAATGA 120
GpTATGTp TAAGCTAAGT GpApACAA AAGAGCCAAA AAGGppAA AAApAAAAC 180
GpTGTAAAG pACAGTACC CTTATGpAA pTATAApG AAGAAAGAAA AACIHUII 240
TATAAATGTA GTGTAGCCTA AGCATACAGT ApTATAAAG TCTGGCAGTG pCAATAATG 300
TCCTAGGCCT TCACApCAC TCACTGACTC ACCCAGAGCA ACTTCCAGTC CTGTAAGCTC 360
CApCGTGGT AAGTGCCCTA TACAGGTGCA CCATTTATp TACAGTATp pACTGTACC 420
pCTCTATGT pCCATATGT pCGATATAC AAATACCACT GGpACTATN GCCCNACAGG 480
TAApCCAGT AACACGGCCT GTATACGTCT GGTANCCCTA GNGAAGA 527
(2) INFORMATION FOR SEQ ID NO: 43: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 331 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
**: 0N 01 03S: VION3D03S 30 NplOdldOSSO (jx) IB? Uü: VI9010d01 (?) Gl OI? OUTS: 01IH 30 VIAldOd (O) or iepnu opioe:? Dll (a) s? Seq? P S? JBd 369 : anil9N01 (V) VI0N3n03S 30 SV0llSjd310VídV0 (!): **: 0N 01 03S VÜVd N lOVI? i andOdNI (Z) 0 | - iee 3 W1W1WV1 WIVIVJIVI 3V13V13311 #
OOe 3W1LV1L9V 33W9V9V3V 3W991L391911V1W19119W9WW31L990V11LV 0 * 2 1V9131LV311V99V9W999133331W31L9131139V V1LV3WV191L3V3VW31 9
081 1VW3V31V9 V3JU1V1L9191LLLW1U.1L91V31WV 9V9V1LWW 991L9V1313 021 3113131119 W3V331U13W9VWWV 9W9WVWV VWV9W1LL 3113339139 09 91L33V13V1199V1LU1V 13V399913391V1V31313 W3V99V193133W31L31 93
(2) INFORMATION FOR SEQ ID NO: 45: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 567 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
269 W 13V13W399 3991V9191L 1V991WV93 91U393W3 3V911333V3
0 * S V1V9W9V33 1999V99913 9V333_tr1333 V133V9H91L 13V31991LV 991XV89913
08 * 91V991LNV9 V999V1LL9V VW331V999 W99V9V191 WJJ19V313 13131V9133
02 * 91VW99139 W39V91913 V333991V9V VW1V991V1 WV9139V11 991L39111V 0.
098 H19V1W13 1L991W11L 199WWW1 19V999V313 9V391L91V9 1LLL39131L
00C V9V9V19W1 3V333W911 1V91LV1L3V 991399V31V 991L9V9V1V 1W99W993
0 * 2 991W1V9V9 191999WU V1V991W99 V1V319V9W 9V99131VW 9V39W1V1V
081 11LW131U 1 £ M) 113913 VV9WW9V1 3V3V139191 V9V99V91L3 33V919V9V3
021 JJL99W) 199V 9V13333V9V 39V33319W V13V9V133V 1V133V9V13 1L391LL913
09 33DV3W333:) 3V039V93V 313013J1V9 1L93MV1VW V399V3391L 9V19VJ1399
GGCpAGTAG pGCCApGC GAGTGCTTGC TCAACGAGCG pGAACATGG CGGApGTCT 60
AGApCAACG GApTGAGp pACCAGCAA AGCGAACCAA GCGCGGCCCA GAGAApATG 120
GGpGGpGG CpTGAAAAG ATGGAAATCC TGTAGGCCTA GTCAGAAAAG CCTTCTTGCA 180
GAACAGpGG pCTCGGGCG AACGCTCATC AAGATGCCCA pGGAAAGGC TAGCGTGTAT 240
pGGGAGAGC CTGATAGCGT GTCTTCTGAT GATGTpGTG CTTGGACAGT GACAAAAGAT 300
ATGCAAAGCA AGTCCGAACT AGACGTCAAG CTTCGTGAGC AAApApGT AGACTCCTAC 360
pATACTGTG AGGAATGATA GCCAAGGGTG GGGACTpAA GACTAAGGTG GTpGTACTT 420
GCGCCGATGA TCCCAGGCAG AAAGAMCTGA TCGCTAGTp TATACGGGCA ACTACTAAGC 480
CGAATTCCAG CACACTGGCG GCCGpACTA ApGGATCCG ANCTCGGTAC CAGCpGATG 540
CATASCpGA GpWTCTATA NTGTCNC 567
(2) INFORMATION FOR SEQ ID NO: 46: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 908 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
806 19VW1V3
006 W99339V93 V1V33V3333 331LW3331 39331VÜ91 1VW919191 331LL9139V
0 * 8 1V31991V31 V119399133 9V1VW13V3 1919V1V131 V19V91L39V 1V391V91L3 oz
08 ¿9W33V1993 139V9331V9 91W13V119 399939913V 3V39V331LV V93339W13
2Z V13W39991 V1V33V3V91 9W9V31319 1V191V1LV1 V9199W9V9 99191L9V39
99 139V331919 V31W9V339 VW9V91999 V91L31W91 W99VW399 W913331W 91-09 1V9V991311 W39V991W 1U1V3911L V193W1V1V 13313331V3 19191W119
* 9 1V19V9V913 V11W11V9V 39V39V1V91 V99V1V9133 1W11VW1V 931VM3V991
8 * 3913111V11 33939V3133 191V9V1LV3 V131LL13ÍV 191LWV1LV 3V313V3133
2 * 9V13331331? 9V9133V1V V9V1399V13 3V13919V99 1V33V93391 19V19V1L39 01.
9C 931LW9919 1919V33933 993W19V13 V331V9931L 9V933V1991 139W31V39
00e 1V1L9W313 W1W9V1V1 1V3V9199V1 1LV139W33 93V1LV91V3 3V91V139V3
0 * 2 VW99V3V3V 31LLW3W1 V9939V9191 1W9991991 91V1991L99 931139W11
081 1V3WU399 9333V39V1L V311V3399V 19V9191W1 1W39W393 9999V39939
021 9W991V933 31LL99V3V9 9V99199V39 9W1LV31LV 9339199999 9333311139
09 33W39WW V33999V9V9 939W9NV93 9N9NV1NN9N 9V3999V933 V9VW339V9
(2) INFORMATION FOR SEQ ID NO: 47: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 480 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
TGCCAACAAG GAAAGTpTA AATGTCCCCT TGAGGApCT TGCTGATCAT CAAApCAGT 60
GGTTTpAAG GpßppCT GTCAAATAAC TCTAACTpA AGCCAAACAG TATATGGAAG 120
CACAGATAKA ATApACACA GATAAAAGAG GAGpGATCT AAAGTARAGA TAGpGGGGG 180
CpTAATpC TGGAACCTAG GTCTCCCCAT CpcpCTGT GCTGAGGAAC pcpGGAAG 240
CGGGGApCT AAAGpcpT GGAAGACAGT pGAAAACCA CCATGpGp CTCAGTACCT 300
pAppTAA AAAGTAGGTG AACATpTGA GAGAGAAAAG GGCpGGpG AGATGAAGTC 360
CCCCCCCCCC CH 11 lll II ppAGCTGA AATAGATACC CTATGpNAA RGAARGGAp 420
APApTACC ATGCCAYTAR SCACATGCTC TpGATGGGC NYCTCCSTAC CCTCCpAAG 480
(2) INFORMATION FOR SEQ ID NO: 48: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 591 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
AAGAGGGTAC CGAGTGGAAT pCCGCTTCA CTAGTCTGGT GTGGCTAGTC GGTFTCGTGG 60
TGGCCAACAT TACGAACTTC CAACTCAACC CTTCTTGGAC GpCAAGCGG GAGTACCGGC 120
GAGGATGGTG GCGTGAApC TGGCCpTCT pGCCGTGGG ATCGGTAGCC GCCATCATCG 180
GTATGpTAT CAAGATCTTC TpACTAACC CGACCTCTCC GATpACCTG CCCGAGCCGT 240
GGpTAACGA GGGGAGGGGG ATCCAGTCAC GCGAGTACTG GTCCCAGATC TTCGCCATCG 300
TCGTGACAAT GCCTATCAAC pCGTCGTCA ATAAGpGTG GACCTTCCGA ACGGTGAAGC 360
ACTCCGAAAA CGTCCGGTGG CTGCTGTGCG GTGACTCCCA AAATCpGAT AACAACAAGG 420
TAACCGAATC GCGCTAAGGA ACCCCGGCAT CTCGGGTACT CTGCATATGC GTACCCCpA 480
AGCCGAApC CAGCACACTG GCGGCCGpA CTAApGGAT CCGAACTCCG TAACCAAGCC 540
TGATGCGTAA CpGAGpAT TCTATAGTGT CCCTAAAATA ACCTGGCGp A 591
(2) INFORMATION FOR SEQ ID NO: 49: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 454 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
AAGAGGGTAC CTGCCpGAA ATpAAATGT CTAAGGAAAR TGGGAGATGA pAAGAGpG 60
GTGTGGCYTA GTCACACCAA AATGTATpA pACATCCTG CTCCTpCTA GpGACAGGA 120
AAGAAAGCTG CTGTGGGGAA AGGAGGGATA AATACTGAAG GGApTACTA AACAAATGTC 180
CATCACAGAG TpTCCTTp AGACAGAGTC pGCTCTGTC ACCCAGGCTG 240
GAATGAAGWG GTATGATCTC AGpGAATGC AACCTCTACC TCCTAGGpc AAGCGApCT 300
CATGCCTCAG CCTCCTGAGC AGCTGGGACT ATAGGCGCAT GCTACCATGC CAGGCTAAp 360
pTATATpT TApAGAGAC GGGGTGpGC CATGpGGCC AGGCAGGTCT CGAACTCCTG 420
GGCCTCAGAT GATCTGCCCC ACCGTACCCT CTTA 454
(2) INFORMATION FOR SEQ ID NO: 50: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 463 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
AAGAGGGTAC CAAAAAAAAG AAAAAGGAAA AAAAGAAAAA CAACpGTAT AAGGCTpCT 60
GCTGCATACA GCI II I HU pTAAATAAA TGGTGCCAAC AAATGTpp GCApCACAC 120
CAApGCTGG pTTGAAATC GTACTCpCA AAGGTATpG TGCAGATCAA TCCAATAGTG 180
ATGCCCCGTA GGTpTGTGG ACTGCCCACG pGTCTACCT TCTCATGTAG GAGCCApGA 240
GAGACTGTp GGACATGCCT GTGpCATGT AGCCGTGATG TCCGGGGGCC GTGTACATCA 300
TGpACCGTG GGGTGGGGTC TGCApGGCT GCTGGGCATA TGGCTGGGTG CCCATCATGC 360
CCATCTGCAT CTGCATAGGG TApGGGGCG pTGATCCAT ATAGCCATGA pGCTGTGGT 420
AGCCACTGp CATCApGGC TGGGACATGC TGpACCCTC pA 463
(2) INFORMATION FOR SEQ ID NO: 51: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 399 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
OOe 91399WV19 VW31911ÍV 3133931991 1399911393 V3V133133V W1919U11
0 * 2 991LV9139V 33W13V331 1333913331 9W131V991 V333V99J19 3W9V139V1
081 13V313WV9 W9339V31V 19V19V31W 9939WW31 1W1W91LL V1V99W911
021 31W9V3131 3JJ1V39113 WV11V9WV 919919VWV WW9V9131 V1LW1W39 03
09 91W1V1V91 3H13WJ13 V31V31WW 1V91LW199 1L33W31W 3133W3113
(2) INFORMATION FOR SEQ ID NO: 52: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 392 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
66C 91L99V911 13191V91V3 1V1V1LW1L 1V19V3V333
09C 39VW1913V V319LLV1D1 91W91V3V1 V1V1399VW V1LLL13V11 V139V9H33
002 V133W13W lilVlUlVl 93V1V13V3V 91V19V39V9 1319V11V13 13H33V91L
0 * 2 131V3H1W 9W1LÍ1V31 1L39V91VW 9V1L3V1L3V 991V19V9W 1LV1LW1V3 9
081 3H991L91V 39V1L331LL 1LW3W1LV V1LV1LLV9V 1L3W991W 1W3V11V91
021 1LW3919V3 9V9WV1399 99V91V919V 3V33V131V9 39W131333 V13V31LLLL
09 31339W133 9V31393V33 V339V913V9 9V3V1LV999 13919VW33 3133W31L3
AAATAGGAAG ATAATGAACC GTGTCTlTp GGTCTCTpT CCATCCApA CTCTGATTp 360
ACAAAGAGGC CTGTApCCC CTGGTGAGGT TG 392
(2) INFORMATION FOR SEQ ID NO: 53: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 179 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
pCGGGTGAT GCCTCCTCAG GCTACAGTGA AGACTGGAp ACAGAAAGGT GCCAGCGAGA 60
TpCAGApC CTGTAAACCT CTAAAGAAAA GGAGTCGCGC CTCAACTGAT GTAGAAATGA 120
CTAGpCAGC ATACNGAGAC ACNTCTGACT CCGApCTAG AGGACTGAGT GACCTGCAN 179
(2) INFORMATION FOR SEQ ID NO: 54: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 112 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
pCGGGTGAT GCCTCCTCAG GCTACATCAT NATAGAAGCA AAGTAGAANA ATCNNGpTG 60
TGCAT? TCC CACANACAAA ApCAAATGA NTGGAAGAAA pGGGANAGT AT 112
(2) INFORMATION FOR SEQ ID NO: 55: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 225 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
TGAGCpCCG CpCTGACAA CTCAATAGAT AATCAAAGGA CAACpTAAC AGGGApCAC 60
AAAGGAGTAT ATCCAAATGC CAATAAACAT ATAAAAAGGA ApCAGCTTC ATCATCATCA 120
GAAGWATGCA AApAAAACC ATAATGAGAA ACCACTATGT CCCACTAGAA TAGATAAAAT 180
CpAAAAGAC TGGTAAAACC AAGTGpGGT AAGGCAAGAG GAGCA 225
(2) INFORMATION FOR SEQ ID NO: 56: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 175 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
: 89ON 0103S: VI0N3í) 03S 30 NOIOdl_.OS30 (¡x) ÍTTU? : Vj9010d01 (?) OIIPUTS: oilH 30 VI? IdOd (O) ooi? Pnu oppe:? Dll (a) s? Seq? P S? JBd nz OnilONOl (V) 03: VI0N3fl03S 30 SVOllSj _.310V? DV0 (! ): 89-ON 0103S V Vd NplOVI? JÍdOdNI (3)
E22 VW VWVWWW 3913V3313911313131991L1199V119
081 9V313W191131LL91131913V1LV339 V1V1LL33313V31991331391L9V3331 9
021 919V99WV199V9991V1V 991LV3V1V3 V3VW91913 II II 191191 HLW1L9J1
09 1W9191L11 V191L9139V 101LW191L 1LV991W91 V991V333V33V1LLV339V JS ON 0103S: VI0N3n03S 30 NOlOdlUOSSO (! X) | B? Ui | : vj9010d01 (?) 01- onpu? s: oilH 30 VIAlidOd (O) ooi? pnu oppe:? dll (a) # s? seq? p S? JBd e33 On lONOl (V): VI0N3H03S 30 SVOIISI-OIOVH VO ( !) • Z9: 0N 0103S VHVd NQIOVI? J-.OdNI (Z)
S_I V3JLD9 W9939W9V 319JU.919W. DJL31V999V9999V993V91333V1VD399
021 9V9919W91 WWV33V191V1L91V9W V1V1331913333VU11V9991LV13.il
09 9133131LV39W13319V9 V119103WV W3JL3J1V3V 3W33V113391L3133139
GpCGAAGGT GAACGTGTAG GTAGCGGATC TCACAACTGG GGAACTGTCA AAGACGAAp 60
AACTGACTTG GATCAATCAA ATGTGACTGA GGAAACACCT GAAGGTGAAG AACATCATCC 120
AGTGGCAGAC ACTGAAAATA AGGAGAATGA ACTTGAAGAG GTAAAAGAGG AGGGTCCAAA 180
(2) INFORMATION FOR SEQ ID NO: 59: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 208 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
GCTCCTCpG CCpACCAAC pTGCACCCA TCATCAACCA TGTGGCCAGG pTGCAGCCC 60
AGGCTGCACA TCAGGG6ACT GCCTCGCAAT ACpCATGCT GpGCTGCTG ACTGATGGTG 120
CTGTGACGGA TGTGGAAGCC ACACGTGAGG CTGTGGTGCG TGCCTCGAAC CTGCCCATGT 180
CAGTGATCAT TATGGGTGGT AAATGGCT 208
(2) INFORMATION FOR SEQ ID NO: 60: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 171 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
AGCCApTAC CACCCATACT AAApCTAGT TCAAACTCCA ACTTCTTCCA TAAAACATCT 60
AACCACTGAC ACCAGpGGC AATAGCTTCT TCCTTCpTA ACCTCTTAGA GTATpATGG 120
TCAATGCCAC ACATpCTGC AACTGAATAA AGpGGTAAG GCAAGAGGAG C 171
(2) INFORMATION FOR SEQ ID NO: 61: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 134 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
CGGGTGATGC CTCCTCAGGC pTGGTGTGT CCACTCNACT CACTGGCCTC pCTCCAGCA 60
ACTGGTGAAN ATGTCCTCAN GAAAANCNCC ACACGCNGCT CAGGGTGGGG TGGGAANCAT 120
CANAATCATC NGGC 134
(2) INFORMATION FOR SEQ ID NO: 62: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 145 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
AGAGGGTACA TATGCAACAG TATATAAAGG AAGAAGTGCA CTGAGAGGAA CTTCATCAAG 60
GCCATpAAT CAATAAGTGA TAGAGTCAAG GCTCAACCCA GGTGTGACGG ApCCAGGTC 120
CCAAGCTCCT TACTGGTACC CTCTT 145
(2) INFORMATION FOR SEQ ID NO: 63: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 297 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
TGCACTGAGA GGAApCAAA GGGpTATGC CAAAGAACAA ACCAGTCCTC TGCAGCCTAA 60
CTCApTGp TpGGGCTGC GAAGCCATGT AGAGGGCGAT CAGGCAGTAG ATGGTCCCTC 120
CCACAGTCAG CGCCATGGTG GTCCGGTAAA GCATpGGTC AGGCAGGCCT CGTpCAGGT 180
AGACGGGCAC ACATCAGCp TCTGGAAAAA CTpTGTAGC TCTGGAGCp TGTTpTCCC 240
AGCATAATCA TACACTGTGG AATCGGAGGT CAGTpAGp GGTAAGGCAA GAGGAGC 297
(2) INFORMATION FOR SEQ ID NO: 64: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 300 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
GCACTGAGAG GAACTTCCAA TACTATGpG AATAGGAGTG GTGAGAGAGG GCATCCTTGT 60
CTTGTGCCGG TTpCAAAGG GAATGCpCC AGCTTTTGCC CApCAGTAT AATApAAAG 120
AATGTTpAC CAppcTGT cpGccTGp prcTGTGp TpGpGGTc TcpcApcT 180
CCAppTAG GCCpTACAT GpAGGAATA TATrTCTpT AATGATACp CACCpTGGT 240
ATCTpTGTG AGACTCTACT CATAGTGTGA TAAGCACTGG GpGGTAAGG CAAGAGGAGC 300
(2) INFORMATION FOR SEQ ID NO: 65: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 203 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:
GCTCCTCpG CCpACCAAC TCACCCAGTA TGTCAGCAAT TITATCRGCT pACCTACGA 60
AACAGCCTGT ATCCAAACAC pAACACACT CACCTGAAAA GpCAGGCAA CAATCGCCp 120
CTCATGGGTC TCTCTGCTCC AGpCTGAAC CTITCTCTH TCCTAGAACA TGCATTTARG 180
TCGATAGAAG pCCTCTCAG TGC 203
(2) INFORMATION FOR SEQ ID NO: 66: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 344 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
TACGGGGACC CCTGCApGA GAAAGC6AGA CTCACTCTGA AGCTGAAATG CTGpGCCCT 60
TGCAGTGCTG GTAGCAGGAG pCTGTGCTT TGTGGGCTAA GGCTCCTGGA TGACCCCTGA 120
CATGGAGAAG GCAGAGpGT GTGCCCCpC TCATGGCCTC GTCAAGGCAT CATGGACTGC 180
CACACACAAA ATGCCGTpT TApAACGAC ATGAAApGA AGGAGAGAAC ACAApCACT 240
GATGTGGCTC GTAACCATGG ATATGGTCAC ATACAGAGGT GTGApATGT AAAGGpAAT 300
TCCACCCACC TCATGTGGAA ACTAGCCTCA ATGCAGGGGT CCCA 344
(2) INFORMATION FOR SEQ ID NO: 67: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 157 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
GCACTGAGAG GAACpCGTA GGGAGGpGA ACTGGCTGCT GAGGAGGGGG AACAACAGGG 60
TAACCAGACT GATAGCCAp GGATGGATAA TATGGTGGp GAGGAGGGAC ACTACpATA 120
GCAGAGGGp GTGTATAGCC TGAGGAGGCA TCACCCG 157
(2) INFORMATION FOR SEQ ID NO: 68: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 137 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
GCACTGAGAG GAACpCTAG AAAGTGAAAG TCTAGACATA AAATAAAATA AAAAT? AAA 60
ACTCAGGAGA GACAGCCCAG CACGGTGGCT CACGCCTGTA ATCCCAGAAC pTGGGAGCC 120
TGAGGAGGCA TCACCCG 137
(2) INFORMATION FOR SEQ ID NO: 69: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 137 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
CGGGTGATGC CTCCTCAGGC TGTATpTGA AGACTATCGA CTGGACpCT TATCAACTGA 60
AGAATCCGp AAAAATACCA GpGTApAT pCTACCTGT CAAAATCCAT pCAAATGp 120
GAAGpCCTC TCAGTGC 137
(2) INFORMATION FOR SEQ ID NO: 70: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 220 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
AGCATGpGA GCCCAGACAC GCAATCTGAA T6AGTGTGCA CCTCAAGTAA ATGTCTACAC 60
GCTGCCTGGT CTGACATGGC ACACCATCNC GTGGAGGGCA CASCTCTGCT CNGCCTACWA 120
CGAGGGCANT CTCATWGACA GGpCCACCC ACCAAACTGC AAGAGGCTCA NNAAGTACTR 180
CCAGGGTMYA SGGACMASGG TGGGAYTYCA YCACWCATCT 220
(2) INFORMATION FOR SEQ ID NO: 71: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 353 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
CGpAGGGTC TCTATCCACT GCTAAACCAT ACACCTGGGT AAACAGGGAC CATpAACAT 60
TCCCANCTAA ATATGCCAAG TGACpCACA TGpTATCp AAAGATGTCC AAAACGCAAC 120
TGATTpCTC CCCTAAACCT GTGATGGTGG GATGApAAN CCTGAGTGGT CTACAGCAAG 180
pAAGTGCAA GGTGCTAAAT GAANGTGACC TGAGATACAG CATCTACAAG GCAGTACCTC 240
TCAACNCAGG GCAACTTTGC pCTCANAGG GCATpAGCA GTGTCTGAAG TAATpCTGT 300
ApACAACTC ACGGGGCGGG GGGTGAATAT CTANTGGANA GNAGACCCTA ACG 353
(2) INFORMATION FOR SEQ ID NO: 72: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 343 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
GCACTGAGAG 6AACTTCCAA TACYATKATC AGAGTGAACA RGCARCCYAC AGAACAGGAG 60
AAAATGpYG CAATCTCTCC ATCTGACAAA AGGCTAATAT CCAGAWTCTA AWAGGAACp 120
AAACAAApT ATGAGAAAAG AACARACAAC CTCAWCAAAA AGTGGGTGAA GGAWATGCTS 180
AAARGAAGAC ATYTApCAG CCAGTAAACA YATGAAAAAA AGGCTCATSA TCACTGAWCA 240
pAGAGAAAT GCAAATCAAA ACCACAATGA GATACCATCT YAYRCCAGp AGAAYGGTGA 300
TCApAAAAR STCAGGAAAC AACAGATGCT GGACAAGGTG TCA 343
(2) INFORMATION FOR SEQ ID NO: 73: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 321 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
GCACTGAGAG GAACTTCAGA GAGAGAGAGA GAGpCCACC CTGTACpGG GGAGAGAAAC 60
AGAAGGTGAG AAAGTCTpG GpCTGAAGC AGCTTCTAAG ATCTpTCAT pGCTTCAp 120
TCAAAGpCC CATGCTGCCA AAGTGCCATC CTpGGGGTA CTGTpTCTG AGCTCCAGTG 180
ATAACTCAp TATACAAGGG AGATACCCAG AAAAAAAGTG AGCAAATCp AAAAAGGTGG 240
CTTGAGpCA GCCpAAATA CCATCpGAA ATGACACAGA GAAAGAANGA TGpGGGTGG 300
GAGTGGATAG AGACCCTAAC G 321
(2) INFORMATION FOR SEQ ID NO: 74: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 321 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
GCACTGAGAG GAACpCAGA GAGAGAGAGA GAGpCCACC CTGTACpGG GGAGAGAAAC 60
AGAAGGTGAG AAAGTCTpG GpCTGAAGC AGCpCTAAG ATCTpTCAT pGCTTCAp 120
TCAAAGpCC CATGCTGCCA AAGTGCCATC CTpGGGGTA CTGTpTCTG AGCTCCAGTG 180
ATAACTCAp TATACAAGGG AGATACCCAG AAAAAAAGTG AGCAAATCp AAAAAGGTGG 240
CpGAGpCA GYCpAAATA CCATCpGAA ATGAMACAGA GAAAGAAGGA TGpGGGTGG 300
GAGTGGATAG AGACCCTAAC G 321
(2) INFORMATION FOR SEQ ID NO: 75: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 317 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
GCACTGAGAG GAACTTCCAC ATGCACTGAG AAATGCATGT TCACAAGGAC TGAAGTCTGG 60
AACTCAGTp CTCAGpCCA ATCCTGApC AGGTGTTTAC CAGCTACACA ACCpAAGCA 120
AGTCAGATAA CCpAGCTTC CTCATATGCA AAATGAGAAT GAAAAGTACT CATCGCTGAA 180
pGppGAG GApAGAAAA ACATCTGGCA TGCAGTAGM ApCAApAG TApCATTp 240
CApcpCTA AApAAACAA ATAGGATGGT TAGTGGTGGA AOTCAGACA CCAGAAATGG 300
GAGTGGATAG AGACCCT 317
(2) INFORMATION FOR SEQ ID NO: 76: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 244 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
CGpAGGGTC TCTATCCACT CCCACTACTG ATCAAACTCT A? TA? GAA pAppTAT 60
CATACpTAA GpCTGGGAT ACACGTGCAG CATGCGCAGG TpGpGCAT AGGTATACAC 120
pGCCATGGT GGTpGCTGC ACCCATCAGT CCATCATCTA CApAGGTAT pCTCCTAAT 180
GCTATCCCTC CCCTAGCCCC pACACCCCC AACAGGCTCT AGTGTGTGAA GpCCTCTCA 240
GTGC 244
(2) INFORMATION FOR SEQ ID NO: 77: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 254 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
TCTATCCACT GAAATCTGAA GCACAGGAGG AAGAGAAGCA GTYCTAGTGA 60
GATGGCAAGT TCWpTACCA CACTCTpAA CApTYG AGTpTAACC mATITATG 120
GATAATAAAG GpAATApA ATAATGATp AppAAGGC ApCCCRAAT pGCATAAp 180
CTCCTpTGG AGATACCCp pATCTCCAG TGCAAGTCTG GATCAAAGTG ATASAKAGAA 240
GpCCTCTCA GTGC 254
(2) INFORMATION FOR SEQ ID NO: 78: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 355 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:
pCGATACAG GCAAACATGA ACTGCAGGAG GCTGGTGACG ATCATGATGT TGCCGATGGT 60
CCGGATGGNC ACGAAGACGC ACTGGANCAC GTGCTTACGT CCTpTGCTC TGpGATGGC 120
CCTGAGGGGA CGCAGGACCC pATGACCCT CAGAATCTTC ACAACGGGAG ATGGCACTGG 180
ApGANTCCC ANTGACACCA GAGACACCCC AACCACCAGN ATATCANTAT ApGATGTAG 240
pCCTGTAGA NGGCCCCCp GTGGAGGAAA GCTCCATNAG pGGTCATCT TCAACAGGAT 300
CTCAACAGp TCCGATGGCT GTGATGGGCA TAGTCATANT TAACCNTGTN TCGAA 355
(2) INFORMATION FOR SEQ ID NO: 79: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 406 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
TAAGAGGGTA CCAGCAGAAA GGpAGTATC ATCAGATAGC ATCpATACG AGTAATATGC 60
CTGCTApTG AAGTGTAAp GAGAAGGAAA ATpTAGCGT GCTCACTGAC CTGCCTGTAG 120
CCCCAGTGAC AGCTAGGATG TGCApCTCC AGCCATCAAG AGACTGAGTC AAGpGpCC 180
pAAGTCAGA ACAGCAGACT CAGCTCTGAC ApCTGApc GAATGACACT GpCAGGAAT 240
CGGAATCCTG TCGApAGAC TGGACAGCTT GTGGCAAGTG AATrTGCCTG TAACAAGCCA 300
GAIHIIIAA AApTATAp GTAAATAATG TGTGTGTGTG TCTGTGTATA TATATATATA 360
TGTACAGpA TCTAAGpAA pTAAAAGp GpTGGTACC CTCTTA 406
(2) INFORMATION FOR SEQ ID NO: 80: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 327 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:
ppppp pTACTCGGC TCAGTCTAAT CCTT? TGTA GTCACTCATA GGCCAGACTT 60
AGGGCTAGGA TGATGApAA TAAGAGGGAT GACATAACTA pAGTGGCAG GpAGpGp 120
TGTAGGGCTC ATGGTAGGGG TAAAAGGAGG GCAATpCTA GATCAAATAA TAAGAAGGTA 180
ATAGCTACTA AGAAGAATp TATGGAGAAA GGGACGCGGG CGGGGGATAT AGGGTCGAAG 240
CCGCACTCGT AAGGGGTGGA ppTCTATG TAGCCGpGA GpGTGGTAG TCAAAATGTA 300
ATAApApA GTAGTAAGCC TAGGAGA 327
(2) INFORMATION FOR SEQ ID NO: 81: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 318 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
TAGTCTATGC GGpGApCG GCAATCCAp ApTGCTGGA TpTGTCATG TGTpTGCCA 60
ApGCApCA TAATpApA TGCATpATG CTTGTATCTC CTAAGTCATG GTATATAATC 120
CATGCTTTp ATGTpTGTC TGACATAAAC TCpATCAGA GCCCTpGCA CACAGGGAp 180
CAATAAATAT TAACACAGTC TACATpAp TGGTGAATAT TGCATATCTG CTGTACTGAA 240
AGCACApAA GTAACAAAGG CAAGTGAGAA GAATGAAAAG CACTACTCAC AACAGpATC 300
ATGApGCGC ATAGACTA 318
(2) INFORMATION FOR SEQ ID NO: 82: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 338 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
TCpCAACCT CTACTCCCAC TAATAGCTp pGATGACTT CTAGCAAGCC TCGCTAACCT 60
CGCCpACCC CCCACTApA ACCTACTGGG AGAACTCTCT GTGCTAGTAA CCACGpCTC 120
CTGATCAAAT ATCACTCTCC TACpACAGG ACTCAACATA CTAGTCACAG CCCTATACTC 180
CCTCTACATA TITACCACAA CACAATGGGG CTCACTCACC CACCACApA ACAACATAAA 240
ACCCTCApC ACACGAGAAA ACACCCTCAT GpCATACAC CTATCCCCCA pCTCCTCCT 300
ATCCCTCAAC CCCGACATCA pACCGGGp pCCTCp 338
(2) INFORMATION FOR SEQ ID NO: 83: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 111 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
AGCCATpAC CACCCATCCA CAAAAAAAAA AAAAAAAAAG AAAAATATCA AGGAATAAAA 60
ATAGACTGTG AACAAAAAGG AACATGTGCT GGCCTGAGGA GGCATCACCC G 111
(2) INFORMATION FOR SEQ ID NO: 84: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 224 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
CCTCCTCAGG CCAAGAAGAT AAAGCTTCAG ACCCCTAACA CATGÍCCAAA 60
AAGGAAGAAA GGAGAAAAAA GGGCATCATC CCCGpCCGA AGGGTCAGGG AGGAGGAAAT 120
TGAGGTGGAT TCACGAGpG CGGACAACTC CpTGATGCC AAGCGAGGTG CAGCCGGAGA 180
CTGGGGAGAG CGAGCCAATC AGGTpTGAA GpCCTCTCA GTGC 224
(2) INFORMATION FOR SEQ ID NO: 85: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 348 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
GCACTGAGAG GAACTTCGp GGAAACGGGT HIIHCATG TAAGGCTAGA CAGAAGAAp 60
CTCAGTAACT TCCTTGTGp GTGTGTApC AACTCACASA GpGAACGAT CCpTACACA 120
GAGCAGACp GTAACACTCT TVTTGTGGAA pTGCAAGTG GAGATTTCAG SCGCpTGAA 180
GTSAAAGGTA GAAAAGGAAA TATCpCCTA TAAAAACTAG ACAGAATGAT TCTCAGAAAC 240
TCCpTGTGA TGTGTGCGp CAACTCACAG AGpTAACCT pCWHTCAT AGAAGCAGp 300
AGGAAACACT CTGTpGTAA AGTCTGCAAG TGGATAGAGA CCCTAACG 348
(2) INFORMATION FOR SEQ ID NO: 86: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 293 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:
GCACTGAGAG GAACTTCYp GTGWTGTKTG YApCAACTC ACAGAGpGA ASSV'fl p 60
ACABAGWKCA GGCTTKCAAA CACTCTpp GTMGAATYTG CAAGWGGAKA TpSRRCCRC 120
pTGWGGYCW WYSKTMGAAW MGGRWATATC pCWYATMRA AMCTAGACAG AAKSApCTC 180
AKAAWSTYYY YTGTGAWGWS TGCRpCAAC TCACAGAGKT KAACMWTYCT ICYTSATRGAG 240
CAGTTWKGAA ACTCTMTpC pTGGApCT GCAAGTGGAT AGAGACCCTA ACG 293
(2) INFORMATION FOR SEQ ID NO: 87: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
CTCTAGGCT 10
(2) INFORMATION FOR SEQ ID NO: 88: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:
AGTAGTTGCC 10
(2) INFORMATION FOR SEQ ID NO: 89: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 11 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
TTCCGTTATGC 11
(2) INFORMATION FOR SEQ ID NO: 90: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:
TGGTAAAGGG 10
(2) INFORMATION FOR SEQ ID NO: 91: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:
TCGGTCATAG 10
(2) INFORMATION FOR SEQ ID NO: 92: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
TACAACGAGG 10
(2) INFORMATION FOR SEQ ID NO: 93: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:
TGGATTGGTC 10
(2) INFORMATION FOR SEQ ID NO: 94: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
CTTTCTACCC 10
(2) INFORMATION FOR SEQ ID NO: 95: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
TTTTGGCTCC 10
(2) INFORMATION FOR SEQ ID NO: 96: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:
GGAACCAATC 10
(2) INFORMATION FOR SEQ ID NO: 97: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:
TCGATACAGG 10
(2) INFORMATION FOR SEQ ID NO: 98: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
GGTACTAAGG 10
(2) INFORMATION FOR SEQ ID NO: 99: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
AGTCTATGCG 10
(2) INFORMATION FOR SEQ ID NO: 100: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
CTATCCATGG 10
(2) INFORMATION FOR SEQ ID NO: 101: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
TCTGTCCACA 10
(2) INFORMATION FOR SEQ ID NO: 102: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
AAGAGGGTAC 10
(2) INFORMATION FOR SEQ ID NO: 103: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
CTTCAACCTC 10
(2) INFORMATION FOR SEQ ID NO: 104: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
GCTCCTCTTG CCTTACCAAC 20
(2) INFORMATION FOR SEQ ID NO: 105: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
GTAAGTCGAG CAGTGTGATG 20
(2) INFORMATION FOR SEQ ID NO: 106: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:
GTAAGTCGAG CAGTCTGATG 20
(2) INFORMATION FOR SEQ ID NO: 107: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:
GACTTAGTGG AAACAATGTA 20
(2) INFORMATION FOR SEQ ID NO: 108: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
GTAATTCCGC CAACCGTAGT 20
(2) INFORMATION FOR SEQ ID NO: 109: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:
ATGGTTGATC GATAGTGGAA 20
(2) INFORMATION FOR SEQ ID NO: 110: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:
ACGGGGACCC CTGATTGAG 20
(2) INFORMATION FOR SEQ ID NO: 111: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
TATTCTAGAC CATTCGCTAC 20
(2) INFORMATION FOR SEQ ID NO: 112: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:
ACATAACCAC TTTAGCGTTC 20
(2) INFORMATION FOR SEQ ID NO: 113: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:
CGGGTGATGC CTCCTCAGGC 20
(2) INFORMATION FOR SEQ ID NO: 114: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:
AGCATGTTGA GCCCAGACAC 20
(2) INFORMATION FOR SEQ ID NO: 115: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:
GACACCTTGT CCAGCATCTG 20
(2) INFORMATION FOR SEQ ID NO: 116: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:
TACGCTGCAA CACTGTGGAG 20
(2) INFORMATION FOR SEQ ID NO: 117: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:
CGTTAGGGTC TCTATCCACT 20
(2) INFORMATION FOR SEQ ID NO: 118: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:
AGACTGACTC ATGTCCCCTA 20
(2) INFORMATION FOR SEQ ID NO: 119: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:
TCATCGCTCG GTGACTCAAG 20
(2) INFORMATION FOR SEQ ID NO: 120: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:
CAAGATTCCA TAGGCTGACC 20
(2) INFORMATION FOR SEQ ID NO: 121: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:
ACGTACTGGT CTTGAAGGTC 20
(2) INFORMATION FOR SEQ ID NO: 122: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:
GACGCTTGGC CACTTGACAC 20
(2) INFORMATION FOR SEQ ID NO: 123: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
GTATCGACGTAGTGGTCTCC 20
(2) INFORMATION FOR SEQ ID NO: 124: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:
TAGTGACATTACGACGCTGG 20
(2) INFORMATION FOR SEQ ID NO: 125: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:
CGGGTGATGC CTCCTCAGGC 20
(2) INFORMATION FOR SEQ ID NO: 126: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 23 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:
ATGGCTATTT TCGGGGGCTG ACA 23
(2) INFORMATION FOR SEQ ID NO: 127: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 22 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
CCGGTATCTC CTCGTGGGTA TT 22
(2) INFORMATION FOR SEQ ID NO: 128: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 18 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:
CTGCCTGAGC CACAAATG 18
(2) INFORMATION FOR SEQ ID NO: 129: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 24 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
CCGGAGGAGGAAGCTAGAGGAATA 24
(2) INFORMATION FOR SEQ ID NO: 130: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 14 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:
TTTTTTTTTT TTAG 14
(2) INFORMATION FOR SEQ ID NO: 131: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 18 amino acids (B) TYPE: amino acids (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:
Ser Ser Gly Gly Arg Thr Phe Asp Asp Phe His Arg Tyr Leu Leu Val
1 5 10 15
(2) INFORMATION FOR SEQ ID NO: 132: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 22 amino acids (B) TYPE: amino acids (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:
Gln Gly Ala Ala Gln Lys Pro Ile Asn Leu Ser Lys Xaa Ile Glu Val
1 5 10 15
Val Gln Gly His Asp Glu
20
(2) INFORMATION FOR SEQ ID NO: 133: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 23 amino acids (B) TYPE: amino acids (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:
Ser Pro Gly Val Phe Leu Glu His Leu Gln Glu Ala Tyr Arg Ile Tyr
1 5 10 15
Thr Pro Phe Asp Leu Ser Ala
20
(2) INFORMATION FOR SEQ ID NO: 134: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 9 amino acids (B) TYPE: amino acids (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:
Tyr Leu Leu Val Gly Ile Gln Gly Ala
1 5
(2) INFORMATION FOR SEQ ID NO: 135: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 9 amino acids (B) TYPE: amino acids (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:
Gly Ala Ala Gln Lys Pro Ile Asn Leu
1 5
(2) INFORMATION FOR SEQ ID NO: 136: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 9 amino acids (B) TYPE: amino acids (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:
Asn Leu Ser Lys Xaa Ile Glu Val Val
1 5
(2) INFORMATION FOR SEQ ID NO: 137: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 9 amino acids (B) TYPE: amino acids (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:
Glu Val Val Gln Gly His Asp Glu Ser
1 5
(2) INFORMATION FOR SEQ ID NO: 138: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 9 amino acids (B) TYPE: amino acids (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:
His Leu Gln Glu Ala Tyr Arg Ile Tyr
1 5
(2) INFORMATION FOR SEQ ID NO: 139: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 9 amino acids (B) TYPE: amino acids (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:
Asn Leu Ala Phe Val Ala Ala Gln Ala Ala
1 5
(2) INFORMATION FOR SEQ ID NO: 140: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 9 amino acids (B) TYPE: amino acids (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:
Phe Val Ala Gln Ala Ala Pro Asp Ser
1 5
(2) INFORMATION FOR SEQ ID NO: 141: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 9388 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:
GCTCGCGGCC GCGAGCTCAA pAACCCTCA CTAAAGGGAG TCGACTCGAT CAGACTCTTA 60
CTGTGTCTAT GTAGAAAGAA GTAGACATAA GAGApCCAT TpGpCTGT ACTAAGAAAA 120
ApCTTCTGC CpGAGATGC TGpAATCTG TAACCCTAGC CCCAACCCTG TGCTCACAGA 180
GACATGTGCT GTGp &ACTC AAGGpCAAT GGATTTAGGG CTATGCTpG pAAAAAAGT 240
GCpGAAGAT AATATGCTTG pAAAAGTCA TCACCApCT CTAATCTCAA GTACCCAGGG 300
ACACAATACA CTGCGGAAGG CCGCAGGGAC CTCTGTCTAG GAAAGCCAGG TApGTCCAA 360
GATpCTCCC CATGTGATAG CCT6AGATAT GGCCTCATGG GAAGGGTAAG ACCTGACTGT 420
CCCCCAGCCC GACATCCCCC AGCCCGACAT CCCCCAGCCC GACACCCGAA AAGGGTCTGT 480
GCTGAGGAGG ApAGTAAAA GAGGAAGGCC TCTpGCAGT TGAGGTAAGA GGAAGGCATC 540
TGTCTCCTGC TCGTCCCTGG GCAATAGAAT GTCpGGTGT AAAACCCGAT TGTATGpCT 600
0891 1393999193 13V9933V33 19331V3131 1193399333 33391399V3 13V99913W 93
0291 9993993V33 91LV33991L VWS991999 3V3319391L 1133991331 9V991939V3
0991 9V31999V99 1393V9991V 3V99339133 9193V33993 HV1399911 319V313319
00SI V91919W99 919WV3919 9V33133991 99V3913199 1L99W3939 91339W993
03
0 ** I 31LLV33313 3393339W9 9331119339 33991LV333 1L193V31V9 131V3W19V
08CT 1339131913 V993VW991 3391191913 J-131LL3311 93939V913V 91399V9JJLLL
02 i V131933131 WV91399H V331V3V391 33W999V99 191W13991 1V1L31L3V3
092T 9W333399V 199V399V19 1139191391 11139VW99 V39391WW V3939V3311 # 9L
0021 V339V199V3 9939399111 V9ULLVW1 9J1V91999V V33V39V3W V99V3919W
Opi V9W33131V W39199133 939V331319 99V999199V 339991V399 V399199933
0801 V31L91991L 319991199V V9V3919919 913V9W3W 3V913931L9 1339V13V1L 01.
020? 1W1W31L3 333V3399V3 9999V99191 99V3V333V1 WV9V93V91 33V331L931
096 31319V3131 JJLL31LL313 1919131319 1U.3V1V131 3111311113 V3139991L3 *
006 3319933939 V913919193 3133199V19 13331919V3 3V9V9V313V V999V913V1
0 * 8 VW1W31V9 1W1V9V9V1 991V9V9331 3D30331V3 V339133991 1V1333V1LV
Q8 ¿13V3333131 333V913913 31U193V31 191LL33V9V 9V3V3V91V1 J1V1L3VW1
02Z 133ULL33V3 9V3V399W3 1V3V3919W 1V1911L91V 9V933V391V V1L 1391D
099 V1W399399 1393V3V9V9 199V991399 9V1L331V3V VW9V99V1V 9V913V1L3V
09Z2 9333999339 V9V9399933 V3193VW9V 99933V3193 WV9V939V3 33399V991L 93
00Z2 1V3V93V199 V3139313W 1313111193 991919VW3 13191HW1 1H3V991W
0 * 92 V1VW139V3 3993991131 31V339V1LV 39V9111LW 9V9V9939V9 V3J131L333
0892 39V131V9W 9WW1V19V W33WV131 1133331333 3133V9V13V V1U1911W
02SZ W19131LLV 1113-11391 I U H3W1L 1L331V11ÍV 33W191331 1L3W11V13
09 * 2 W99133V39 133339VW1 131W1W19 1133V31V13 193933339V VW33391V1
00 * 2 1V9LLV9331 V31LW99V9 331L9V3933 1V1V139W3 991913V1V9 W1L913W1
0 * 82 1W391W9V V331V3V39V 33313V9139 W3919113V V113V99W3 1331133339 * 91.
0822 V399DV311 1313991131 V3313V991V 9V33939V9V V139V1L333 1313V911W
02Z2 D133V33111 91L339V133 V31VW1L33 1113331313 V9V1LLLL91 333W13V1L
0912 LLL99133W 33391W191 9V19V1L31V 1V91LL1991 33119 19V V99VW1L11 01.
0012 91LW9911L 9V119V939V 3V399W191 9131L1WV9 99991L31L1 V1V33WW1
0 * 02 V1LL399V31 31L3W3W3 9V1W13319 1U91991LL 19VU I HW 33VW1LW1
0861 3911911911 11W991WV V119V339V3 V99W1V913 131913V991 9V11LV1LV1
026T 91L9V339J1 1LL3V3131V VW3V1L133 W3333V39V W1W1V1L1 91V1191V1L
0981 1991331LV3 ViWWlLVi 99V91199 V39333V919 V999V99913 393V391L91
0081 1LL9199V39 9V13U1V9Í 11L3393VW V1999313V3 339V31L11L 399S93WW
0 * ZI V199WV1LV 9V3313191L V19333V913 133V913311 19V191V913 939913V393
CTGTGTGCTC CCCCGGAAGG ACAGCCAGCT TGTAGGGGGG AGTGCCACCT GAAAAAAAAA 2820
TpCCAGGTC CCCAAAGGGT GACCGTCTTC CGGAGGACAG CGGATCGACT ACCATGCGGG 2880
TGCCCACCAA AApCCACCT CTGAGTCCTC AACTGCTGAC CCCGGGGTCA GGTAGGTCAG 2940
ATpGACpT GGpCTGGCA GAGGGAAGCG ACCCTGATGA GGGTGTCCCT (T? TGACTC 3000
TGCCCATTTC TCTAGGATGC TAGAGGGTAG AGCCCTGGp pCTGpAGA CGCCTCTGTG 3060
TCTCTGTCTG GGAGGGAAGT GGCCCTGACA GGGGCCATCC GTCAG TCCACATCCC 3120
AGGATGCTGG GGGACTGAGT CCTGGTpCT GGCAGACTGG TCTCTCTCTC TCTCPyTTC 3180
TATCTCTAAT CTTTC TCAGGTpCT TGGAGAATCT CTGGGAAAGA AAAAAGAAAA 3240
ACTGpATAA ACTCTGTGTG AATGGTGAAT GAATGGGGGA GGACAAGGGC pGCG 3300
CCTCCAGTp GTAGCTCCAC GGCGAAAGCT ACGGAGpCA AGTGGGCCCT CACCTGCGGT 3360
TCCGTGGCGA CCTCATAAGG CpAAGGCAG CATCCGGCAT AGCTCGATCC GAGCCGGGGG 3420
pTATACCGG CCTGTCAATG CTAAGAGGAG CCCAAGTCCC CTAAGGGGGA GCGGCCAGGC 3480
GGGCATCTGA CTGATCCCAT CACGGGACCC CCTCCCpTGTCTAAA AAAAAAAAAA 3540
GAAGAAACTG TCATAACTGT pACATGCCC TAGGGTCAAC TGpTGTpT ATGTTTApG 3600
pCTGpCGG TGTCTApGT TAGT GGpGTCAAG GTpTGCATG TCAGGACGTC 3660
GATApGCCC AAGACGTCTG GGTAAGAACT TCTGCAAGGT CCTTAGTGCT GAIUIUGT 3720
CACAGGAGGT TAAATpCTC ATCAATCAp TAGGCTGGCC ACCACAGTCC TGTCTpTCT 3780
GCCAGAAGCA AGTCAGGTGT TGpACGGGA ATGAGTGTAA AAAAACApC GCCTGApGG 026 3840 * 139V311993 39V31333V9 3133931LLL 91933V9V33 9V91331331 99W33133V 098 * 1LHW1V31 1V33V391V3 13V3313991 V33113333V 39V99V3199 1119V91399 008 * 1W31V91LV 3V13331LL9 V31V913331 339933313V W913V1191 1999399191 0 * Z * 9139W3a9 aW3V3133 W913V3199 9W91L3333 9919V91L9V 9911L9VW3 03
089 * 3991W9V19 VW391313V 3V9913139V V399WWW 113393319V 991V1LV99V 9V3131LL99 029 * 313 131 I I I I W9V391391 V1V19V3313 13339V99U 999W19113 09S * 39V3331331 13V3VW999 WWWWW WWW31L3 3911WVJ1V p31VW311 OOS * 91V131V1V9 1LL9991391 9131913131 1L9V919V99 9V3333339V 3313V1V931 9
0 *** 11311913a V91L91V319 913 a3139 131 1 I I 1991 9a9139Va 1V19919W3 08C * 99191V919V 3V139WVa 31W313331 3313319V13 1339913999 3131V9V99V 028 * 333V913W3 31V9V133V3 W9V9V9V9V V9W9V131V 9V3V1999V1 aV199WW 0
092 * 3V3V9V991V 319V91H33 91J1V139V9 ai999139V WW9WW1 1333W9a3 002 * 313V313139 131399a3V 99V3313193 9V3a9W39 V3V93919V3 31U193131
0 * 1 * 1333339133 31331331st V3V3V3333V 13319919W 331V93WW Va31V3331 080 * 313V139919 13133a913 13V113aV9 V11193a3V ai3913V19 31333V133V 020 * W999V1919 13V0V39V1V W1Ü33919 133.I..I H3W 1913aV319 9ai3133V3 0968 139a33a V9V391333V 339V3WW3 a99919919 913VW13W V13133133V 0068 99 VDa99 V331V3V333 3VlV319aV 9VJHV19a 991V91V33V 39913U1V9 8 * 1 . f CTCACCCAGT CCCACCGCCT 4980
TAAAACCAGC CTACTCCCTT AGGGTCATCC CATGTCTCCT CGGCTATGTC CCCTGTAGGC 5040
TCATCACCCA pGCCTCTTG GpGCAACCG TGGTGGGAGG AAGTAGCCCC TCTACTACCA 5100
CTGAGAGAGG CACAAGTCCC TCTGGGTGAT GACTGCTCCA CCCCCpCCT GGpTATGTC 5160
CCpcpTCT ACpCTGACT TGTATAApG GAAAACCCAT AATCCTCCCT TCTCTGAAAA 5220
GCCCCAGGCT pGACCTCAC TGATGGAGTC TGTACTCTGG ACACApGGC CCACCTGGGA 5280
TGACTGTCAA CAGCTCCTp TGACCCTTp CACCTCTGAA GAGAGGGAAA GTATCCAAAG 5340
AGAGGCCAAA AAGTACAACC TCACATCAAC CAATAGGCCG GAGGAGGAAG CTAGAGGAAT 5400
AGTGApAGA GACCCAApG GGACCTAAp GGGACCCAAA TTTCTCAAGT GGAGGGAGAA 5460
CppGACGA TGTCCACCGG TATCTCCTCG TGGGTApCA GGGAGCTGCT CAGAAACCTA 5520
TAAACpGTC TAAGGCGACT GAAGTCGTCC AGGGGCATGA TGAGTCACCA GGAGTGTpT 5580
TAGAGCACCT CCAGGAGGCT TATCGGATp ACACCCCHT TGACCTGGCA GCCCCCGAAA 5640
ATAGCCATGC TCpAApTG GCApTGTGG CTCAGGCAGC CCCAGATAGT AAAAGGAAAC 5700
TCCAAAAACT AGAGGGATp TGCTGGAATG AATACCAGTC AGCTpTAGA GATAGCCTAA 5760
AAGGTTpTG ACAGTCAAGA GGpGAAAAA CAAAAACAAG CAGCTCAGGC AGCTGAAAAA 5820
AGCCACTGAT AAAGCATCCT GGAGTATCAG AGTTTACTGT TAGATCAGCC TCATpGACT 5880
TCCCCTCCCA CATGGTGTp AAATCCAGCT ACACTACpC CTGACTCAAA CTCCACTAp 5940
CCTGpCATG ACTCTCAGGA ACTGpGGAA ACTACTGAAA CTGGCCGACC TGATCTTCAA 6000
AATGTGCCCC TAGGAAAGGT GGATGCCACC GTGpCACAG ACAGTAGCAG CpCCTCGAG 6060
AAGGGACTAC GAAAGGCCGG TGCAGCTGp ACCATGGAGA CAGATGTGp GTGGGCTCAG 6120
GCpTACCAG CAAACACCTC AGCACAAAAG GCTGAApGA TCGCCCTCAC TCAGGCTCTC 6180
CGATGGGGTA AGGATApAA CGpAACACT 6ACAGCAGGT ACGCCTpGC TACTGTGCAT 6240
GTACGTGGAG CCATCTACCA GGAGCGTGGG CTACTCACCT CAGCAGGTGG CTGTAATCCA 6300
CTGTAAAGGA CATCAAAAGG AAAACACGGC TGpGCCCGT GGTAACCAGA AAGCTGApC 6360
AGCAGCTCAA GATGCAGTGT GACTÍTCAGT CACGCCTCTA AACpGCTGC CCACAGTCTC 6420
CipCCACAG CCAGATCTGC CTGACAATCC CGCATACTCA ACAGAAGAAG AAAACTGGCC 6480
TCAGAACTCA GAGCCAATAA AAATCAGGAA GGpGGTGGA pcpCCTGA CTCTAGAATC 6540
pCATACCCC GAACTCpGG GAAAACpTA ATCAGTCACC TACAGTCTAC CACCCATTTA 6600
GGAGGAGCAA AGCTACCTCA GCTCCTCCGG AGCCGTpTA AGATCCCCCA TCpCAAAGC 6660
CTAACAGATC AAGCAGCTCT CCGGTGCACA ACCTGCGCCC AGGTAAATGC CAAAAAAGGT 6720
CCTAAACCCA GCCCAGGCCA CCGTCTCCAA GAAAACTCAC CAGGAGAAAA GTGGGAAAp 6780
GACTTTACAG AAGTAAAACC ACACCGGGCT GGGTACAAAT ACCTTCTAGT ACTGGTAGAC 6840
ACCpCTCTG GATGGACTGA AGCATTTGCT ACCAAAAACG AAACTGTCAA TATGGTAGp 6900
AAGTppAC TCAATGAAAT CATCCCTCGA CGTGGGCTGC CTGpGCCAT AGGGTCTGAT 6960
AATGGACCGG CCpCGCCTT GTCTATAGp TAGTCAGTCA GTAAGGCGp AAACApCAA 7020
TGGAAGCTCC ApGTGCCTA TCGACCCCAG AGCTCTGGGC AAGTAGAACG CATGAACTGC 7080
ACCCTAAAAA ACACTCTTAC AAAApAATC pAGAAACCG GTGTAAApG TGTAAGTCTC 7140
CpCCpTAG CCCTACTGAG AGTAAGGTGC ACCCCpACT GGGCTGGGp CTTACCpp 7200
GAAATCATGT ATGGGAGGGC GCTGCCTATC pGCCTAAGC TAAGAGATGC CCAApGGCA 7260
AAAATATCAC AAACTAATp ApACAGTAC CTACAGTCTC CCCAACAGGT ACAAGATATC 7320
ATCCTGCCAC pGpCGAGG AACCCATCCC AATCCAApC CTGAACAGAC AGGGCCCTGC 7380
cApcApcc CGCCAGGTGA ccTGpG r CTTAAAAAGT TCCAGAGAGA AGGACTCCCT 7440
CCTGCpGGA AGAGACCTCA CACCGTCATC ACGATGCCAA CGGCTCTGAA GGTGGATGGC 7500
ATTCCTGCGT GGApCATCA CTCCCGCATC AAAAAGGCCA ACGGAGCCCA ACTAGAAACA 7560
TGGGTCCCCA GGGCTGGGTC AGGCCCCpA AAACTGCACC TAAGTTGGGT GAAGCCApA 7620
GApAApCT TpTCpAAT pTGTAAAAC AATGCATAGC pCTGTCAAA CpATCTATC 7680
p VAGACTCA ATATAACCCC CTTGpATAA CTGAGGAATC AATGATpGA pCCCCAAAA 7740
ACACAAGTGG GGAATGTAGT GTCCAACCTG GppTACTA ACCCTGpp TAGACTCTCC 7800
CTpCCpTA ATCACTCAGC CpGHTCCA CCTGAApGA CTCTCCCpA GCTAAGAGCG 7860
CCAGATGGAC TCCATCpGG CTCTpCACT GGCAGCCGCT TCCTCAAGGA CTTAACTTGT 7920
GCAAGCTGAC TCCCAGCACA TCCAAGAATG CAApAACTG ATAAGATACT GTGGCAAGCT 7980
ATATCCGCAG pCCCAGGAA pCGTCCAAT TGApACACC CAAAAGCCCC GCGTCTATCA 8040
CCpGTAATA ATCpAAAGC CCCTGCACCT GGAACTApA ACGpCCTGT AACCApTAT 8100
CCppAACT ppTGCCTA CpTATpCT GTAAAApCT pTAACTAGA CCCCCCCTCT 8160
CCTpCTAAA CCAAAGTATA AAAGCAAATC TAGCCCCTTC pCAGGCCGA GAGAATrTCG 8220
AGCGpAGCC GTCTCTTGGC CACCAGCTAA ATAAACGGAT TCpCATGTG TCTCAAAGTG 8280
TGGCGTpTC TCTAACTCGC TCAGGTACGA CCGTGGTAGT ATpTCCCCA ACGTCpAp 8340
TpAGGGCAC GTATGTAGAG TAACTpTAT GAAAGAAACC ACTTAAGGAG GTpTGGGAT 8400
pCCpTATC AACTGTAATA CTGGTpTGA pApTATp ApTATTTAT I H? I IGAG 8460
AAGGAGTTTC ACTCpGpG CCCAGGCTGG AGTGCAATGG TGCGATCpG GCTCACTGCA 8520
ACpcCGCCT CCCAGGpCA AGCGApCTC CTGCCTCAGC CTCGAGAGTA GCTGGGApA 8580
TAGGCATGCG CCACCACACC CAGCTAATp TGTATTpTA GTAAAGATGG GGTpcpCA 8640
TGpGGTCAA GCTGGTCTGG AACTCCCCGC CTCGGGTGAT CTGCCCGCCT CGGCCTCCGA 8700
AAGTGCTGGG ApACAGGTG TGATCCACCA CACCCAGCCG ATpATATGT ATATAAATCA 8760
CApCCTCTA ACCAAAATGT AGTGpTCCT TCCATCpGA ATATAGGCTG TAGACCCCGT 8820
GGGTATGGGA CApGpAAC AGTGAGACCA CAGCAGTTp TATGTCATCT GACAGCATCT 8880
CCAAATAGCC pCATGGpG TCACTGCpC CCAAGACAAT TCCAAATAAC ACTTCCCAGT 8940
GATGACTTGC TACpGCTAT TGpACpAA TGTGpAAGG TGGCTGpAC AGACACTAp 9000
AGTATGTCAG GAApACACC AAAApTAGT GGCTCAAACA ATCATpTAT TATGTATGTG 9060
GATTCTCATG GTCAGGTCAG GATTTCAGAC AGGGCACAAG GGTAGCCCAC pGTCTCTGT 9120
CTATGATGTC TGGCCTCAGC ACAGGAGACT CAACAGCTGG GGTCTGGGAC CATpGGAGG 9180
CpGpCCCT CACATCTGAT ACCTGGCpG GGATGpGGA AGAGGGGGTG AGCTGAGACT 9240
GAGTGCCTAT ATGTAGTGp TCCATATGGC CTTGACTTCC pACAGCCTG GCAGCCTCAG 9300
GGTAGTCAGA ApCTTAGGA GGCACAGGGC TCCAGGGCAG ATGCTGAGGG CTCTTpATG 9360
AGGTAGCACA GCAAATCCAC CCAGGATC 9388
(2) INFORMATION FOR SEQ ID NO: 142: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 419 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:
TGTAAGTCGA GCAGTGTGAT GGAAGGAATG GTCpTGGAG AGAGCATATC CATCTCCTCC 60
TCACTGCCTC CTAATGTCAT GAGGTACACT GAGCAGAAp AAACAGGGTA GTCpAACCA 120
CACTATpp AGCTACCGTG TCAAGCTAAT GGpAAAGAA CACppGGT pACACpGT 180
TGGCTCATAG AAGpGCpT CCGCCATCAC GCAATAAGp TGTGTGTAAT CAGAAGGAGT 240
TACCpATGG TTTCAGTGTC ApcpTAGT TAACpGGGA GCTGTGTAAT pAGGCpTG 300
CGTApApT CACpCTGp CTCCACpAT GAAGTGApG TGTGpCGCG TGTGTGTGCG 360
TGCGCATGTG CpCCGGCAG pAACATAAG CAAATACCCA ACATCACACT GCTCGACp 419
(2) INFORMATION FOR SEQ ID NO: 143: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 402 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:
TGTAAGTCGA GCAGTGTGAT GTCCACTGCA GTGTGpGCT GGGAACAGp AATGAGCAAA 60
pGTATACAA TGGCTAGTAC ApGACCGGG ApTGpGAA GCTGGTGAGT GpATGACTT 120
AGCCTGpAG ACTAGTCTAT GCACATGGCT CTGGTCAACT ACCGCTCTCT CApTCTCCA 180
GATAAATCCC CCATGCpTA TApCTCpC CAAACATACT ATCCTCATCA CCACATAGp 240
ccpTGp TGcprGpc TAGAcprcc cppcTGp pcpApcA AACCTATATC 300
TCHTGCATA GApGTAAAT TCAAATGCCC TCAGGGTGCA GGCAGpCAT GTAAGGGAGG 360
GAGGCTAGCC AGTGAGATCT GCATCACACT GCTCGACpA CA 402
(2) INFORMATION FOR SEQ ID NO: 144: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 224 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:
TCGGGTGATG CCTCCTCAGG CCAAGAAGAT AAAGCTTCAG ACCCCTAACA CATiTCCAM 60
AAGGAAGAAA GGAGAAAAAA GGGCATCATC CCCGpCCGA AGGGTCAGGG AGGAGGAAAT 120
TGAGGTGGAT TCACGACTTG CGGACAACTC CpTGATGCC AAGCGAGGTG CAGCCGGAGA 180
CTGGGGAGAG CGAGCCAATC AGGTpTGAA GpCCTCTCA GTGC 224
(2) INFORMATION FOR SEQ ID NO: 145: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 111 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:
AGCCATpAC CACCCATCCA CAAAAAAAAA AAAAAAAAAG AAAAATATCA AGGAATAAAA 60G AACAAAAAGG AACATpGCT GGCCTGAGGA GGCATCACCC G 111
(2) INFORMATION FOR SEQ ID NO: 146: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 585 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:
TAGCATGpG AGCCCAGACA CpGTAGAGA GAGGAGGACA GpAGAAGAA GAAGAAAAGT 60
TpTAAATGC TGAAAGpAC TATAAGAAAG CTTTGGCpT GGATGAGACT pTAAAGATG 120
CAGAGGATGC pTGCAGAAA CpCATAAAT ATATGCAGGT GApCCTTAT pCCTCCTAG 180
AAApTAGTG ATATpGAAA TAATGCCCAA ACpAApp CTCCTGAGGA AAACTApCT 240
ACApACpA AGTAAGGCAT TATGAAAAGT pcpHTAG GTATAGTTp TCCTAApGG 300
GpTGACAp GCpCATAGT GCCTCTGTTT pGTCCATAA TCGAAAGTAA AGATAGCTGT 360
GAGAAAACTA pACCTAAAT pGGTATGp GTpTGAGAA ATGTCCpAT AGGGAGCTCA 420
CCTGGTGGp pTAAApAT TGpGCTACT ATAApGAGC TAApATAAA AACCppTG 480
AGACATATGT TAAApGTCT TpCCTGTAA TACTGATGAT GATGTTpCT CATGCATpT 540
CpCTGAAp GGGACCApG CTGCTGTGTC TGGGCTCACA TGCTA 585
(2) INFORMATION FOR SEQ ID NO: 147: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 579 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:
TAGCATGpG AGCCCAGACA CTGGGCAGCG GGGGTGGCCA CGGCAGCTCC TGCCGAGCCC 60
AAGCGTGTp GTCTGTGAAG GACCCTGACG TCACCTGCCA GGCTAGGGAG GGGTCAATGT 120
GGAGTGAATG pCACCGACT pCGCAGGAG TGTGCAGAAG CCAGGTGCAA CpGGpTGC 180
pGTGpCAT CACCCCTCAA GATATGCACA CTGCTpCCA AATAAAGCAT CAACTGTCAT 240
CTCCAGATGG GGAAGACTp pCTCCAACC AGCAGGCAGG TCCCCATCCA CTCAGACACC 300
AGCACGTCCA CCTTCTCGGG CAGCACCACG TCCTCCACCT TCTGCTGGTA CACGGTGATG 360
ATGTCAGCAA AGCCGpCTG CANGACCAGC TGCCCCGTGT GCTGTGCCAT CTCACTGGCC 420
TCCACCGCGT ACACCGCTCT AGGCCGCGCA TANTGTGCAC AGAANAAATG ATGATCCAGT 480
CCCACAGCCC ACGTCCAAGA NGACTTTATC CGTCAGGGAT TCpTApCT GCAGGATGAC 540
CTGTGGTAp AApGpCGT GTCTGGGCTC AACATGCTA 579
(2) INFORMATION FOR SEQ ID NO: 148: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 249 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:
TGACACCTTG TCCAGCATCT GCAAGCCAGG AAGAGAGTCC TCACCAAGAT CCCCACCCCG 60
pGGCACCAG GATCpGGAC pCCAATCTC CAGAACTGTG AGAAATAAGT ApTGTCGCT 120
AAATAAATCT pGTGGTTTC AGATATTTAG CTATAGCAGA TCAGGCTGAC TAAGAGAAAC 180
CCCATAAGAG pACATACTC ApAATCTCC GTCTCTATCC CCAGGTCTCA GATGCTGGAC 240
AAGGTGTCA 249
(2) INFORMATION FOR SEQ ID NO: 149: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 255 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:
TGACACCpG TCCAGCATCT GCTATpTGT GACT? TTAA TAATAGCCAT TCTGACTGGT 60
GTGAGATGGT AACTCApGT GGGTpGGTC TGCATFTCTC TAATGATCAG TGATApAAG 120
ATATGCpGT TGACCACATG TATATCATCT pTGAGAAGT GTCTGpCAT 180
ATccprGCC CAcprpAA pppTATc pGTAAAr GTpMpTc cpACAGATG 240
CTGGACAAGG TGTCA 255
(2) INFORMATION FOR SEQ ID NO: 150: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 318 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:
TTACGCTGCA ACACTGTGGA GGCCAAGCTG GGATCACpC pCApCTAA CTGGAGAGGA 60
GGGAAGpCA AGTCCAGCAG AGGGTGGGTG GGTAGACAGT GGCACTCAGA AATGTCAGCT 120
GGACCCCTGT CCCCGCATAG GCAGGACAGC AAGGCTGTGG CTCTCCAGGG CCAGCTGAAG 180
AACAGGACAC TGTCTCCGCT GCCACAAAGC GTCAGAGACT CCCATCTTTG AAGCACGGCC 240
pcpGGTCT TCCTGCACp CCCTGpCTG pAGAGACCT GGpATAGAC AAGGCpCTC 300
CACAGTGpG CAGCGTAA 318
(2) INFORMATION FOR SEQ ID NO: 151: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 323 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:
TNACGCNGCN ACNNTGTAGA GANGGNAAGG CNpCCCCAC ApNCCCCp CATNANAGAA 60
pApC ACC AAGNNTGACC NATGCCNTp ATGACTTACA TGCNNACTNC NTAATCTGTN 120
TCNNGCCpA AAAGCNNNTC CACTACATGC NTCANCACTG TNTGTGTNAC NTCATNAACT 180
GTCNGNAATA GGGGCNCATA ACTACAGAAA TGCANpCAT ACTGCTTCCA NTGCCATCNG 240
CGTGTGGCCT TNCCTACTCT TCpNTApC CAAGTAGCAT CTCTGGANTG CTTCCCCACT 300
CTCCACApG pGCAGCNAT AAT 323
(2) INFORMATION FOR SEQ ID NO: 152: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 311 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:
TCAAGApCC ATAGGCTGAC CAGTCCAAGG AGAGpGAAA TCATGAAGGA GAGTCTATCT 60
GGAGAGAGCT GTAGTpTGA GGGpGCAAA GACTTAGGAT GGAGpGGTG GCTGTGGpA 120
GTCTCTAAGG pGATTpGT TCATAAATp CATGCCCTGA ATGCCpGCT TGCCTCACCC 180
TGGTCCAAGC CpAGTGAAC ACCTAAAAGT CTCTGTCpC pGCTCTCCA AACpCTCCT 240
GAGGATTTCC TCAGApGTC TACApCAGA TCGAAGCCAG pGGCAAACA AGATGCAGTC 300
CAGAGGGTCA G 311
(2) INFORMATION FOR SEQ ID NO: 153: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 332 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:
CAAGApCCA TAGGCTGACC AGGAGGCTAT TCAAGATCTC TGGCAGpGA GGAAGTCTCT 60
pAAGAAAAT AGUTA CA ApTGpAAA AlTTpCTGT CpACpCAT pCTGTAGCA 120
GpGATATCT GGCTGTCCp pTATAATGC AGAGTGGGAA CTpCCCTAC CATGTiTGAT 180
AAATGpGTC CAGGCTCCAT TGCCAATAAT GTGpGTCCA AAATGCCTGT lTAGppTA 240
AAGACGGAAC TCCACCCTp GCpGGTCp AAGTATGTAT GGAATGpAT GATAGGACAT 300
AGTAGTAGCG GTGGTCAGCC TATGGAATCT TG 332
(2) INFORMATION FOR SEQ ID NO: 154: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 345 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:
TCAAGApCC ATAGGCTGAC CTGGACAGAG ATCTCCTGGG TCTGGCCCAG GACAGCAGGC 60
TCAAGCTCAG TGGAGAAGGT pCCATGACC CTCAGApCC CCCAAACCTT GGATTGGGTG 120
ACApGCATC TCCTCAGAGA GGGAGGAGAT GTANGTCTGG GCTTCCACAG GGACCTGGTA 180
ppAGGATC AGGGTACCGC TGGCCTGAGG CpGGATCAT TCANAGCCTG GGGGTGGAAT 240
GGCTGGCAGC CTGTGGCCCC ApGAAATAG GCTCTGGGGC ACTCCCTCTG pCCTANpG 300
AACpGGGTA AGGAACAGGA ATGTGGTCAN CCTATGGAAT CpGA 345
(2) INFORMATION FOR SEQ ID NO: 155: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 295 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:
GACGCTTGGC CACTTGACAC ApAAACAGT TTTGCATAAT CACTANCATG TATpCTAGT 60
pGCTGTCTG CTGTGATGCC CTGCCCTGAT TCTCTGGCGT TAATGATGGC AAGCATAATC 120
AAACGCTGp CTGpAApc CAAGpATAA CTGGCApGA pAAAGCAp ATCTITCACA 180
ACTAAACTGT TCTTCATANA ACAGCCCATA pApATCAA ApAAGAGAC AATCTApCC 240
AATATCCTta ANGGCCAATA TATTTNATGT CCCTTAApA AGAGCTACTG TCCGT 295
(2) INFORMATION FOR SEQ ID NO: 156: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 406 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:
GACGCTTGGC CACTTGACAC TGCAGTGGGA AAACCAGCAT GAGCCGCTGC CCCCAAGGAA 60
CCTCGAAGCC CAGGCAGAGG ACCAGCCATC CCAGCCTGCA GGTAAAGTGT GTCACCTGTC 120
AGGTGGGCp GGGGTGAGTG GGTGGGGGAA GTGTGTGTGC AAAGGGGGTG TNAATGTNTA 180
TGCGTGTGAG CATGAGTGAT GGCTAGTGTG ACTGCATGTC AGGGAGTGTG AACAAGCGTG 240
CGGGGGTGTG TGTGCAAGTG CGTATGCATA TGAGAATATG TGTCTGTGGA TGAGTGCAp 300
TGAAAGTCTG TGTGTGTGCG TGTGGTCATG ANGGTAANp ANTGACTGCG CAGGATGTGT 360
GAGTGTGCAT GGAACACTCA NTGTGTGTGT CAAGTGGCCN ANCGTC 406
(2) INFORMATION FOR SEQ ID NO: 157: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 208 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:
TGACGCpGG CCACpGACA CACTAAAGGG TGpACTCAT CACTpCTTC TCTCCTCGGT 60
GGCATGTGAG TGCATCTAp CACpGGCAC TCATpGpT GGCAGTGACT GTAANCCANA 120
TCTGATGCAT ACACCAGCp GTAAAp & AA TAAATGTCTC TAATACTATG TGCTCACAAT 180
ANGGTANGGG TGAGGAGAAG GGGAGAGA 208
(2) INFORMATION FOR SEQ ID NO: 158: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 547 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:
CTTCAACCTC CpCAACCTC CTTCAACCTC CTGGApCAA ACAATCATCC CACCTCAGAC 60
CTGAGACTAC AGACTCACGC CACTACATCT GGCTAAApT pGTAGAGAT 120
AGGGpTCAT CATGpGCCC TGGCTGGTCT CAAACTCCTG ACCTCAAGCA ATGTGCCCAC 180
CTCAGCCTCC CAAAGTGCTG GGApACAGG CATAAGCCAC CATGCCCAGT CCATNTTTAA 240
TC? TCCTAC CACApcpA CCACACTTTC ppATGTp AGATACATAA ATGCTTACCA 300
pATGATACA ApGCCCACA GTApAAGAC AGTAACATGC TGCACAGGp TGTAGCCTAG 360
GAACAGTAGG CAATACCACA TAGCpAGGT GTGTGGTAGA CTATACCATC TAGGpTGTG 420
TAAGpACAC pTATGCTGT pACACAATG ACAAAACCAT CTAATGATGC ATpCTCAGA 480
ATGTATCCp GTCAGTAAGC TATGATGTAC AGGGAACACT GCCCAAGGAC ACAGATApG 540
TACCTGT 547
(2) INFORMATION FOR SEQ ID NO: 159: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 203 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:
GCTCCTCpG CCpACCAAC TCACCCAGTA TGTCAGCAAT pTATCRGCT pACCTACGA 60
AACAGCCTGT ATCCAAACAC pAACACACT CACCTGAAAA GpCAGGCAA CAATCGCCp 120
CTCATGGGTC TCTCTGCTCC AGpCTGAAC CTpCTCTp TCCTAGAACA TGCApTARG 180
TCGATAGAAG pCCTCTCAG TGC 203
(2) INFORMATION FOR SEQ ID NO: 160: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 402 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:
TGTAAGTCGA GCAGTGTGAT GGGTGGAACA GGGpGTAAG CAGTAApGC AAACTGTAp 60
TAAACAATAA TAATAATAp TAGCATpAT AGAGCACpT ATATCTTCAA AGTACpGCA 120
AACApAYCT AApAAATAC CCTCTCTGAT TATAATCTGG ATACAAATGC ACTTAAACTC 180
AGGACAGGGT CATGAGARAA GTATGCATp GAAAGpGGT GCTAGCTATG CTrTAAAAAC 240
CTATACAATG ATGGGRAAGT TAGAGpCAG ApCTGpGG ACTGTppG TGCATpCAG 300
pCAGCCTGA TGGCAGAAp AGATCATATC TGCACTCGAT GACTYTGCp GATAACpAT 360
CACTGAAATC TGAGTGpGA TCATCACACT GCTCGACpA CA 402
(2) INFORMATION FOR SEQ ID NO: 161: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 193 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:
AGCATGpGA GCCCAGACAC TGACCAGGAG AAAAACCAAC CAATAGAAAC ACGCCCAGAC 60
ACTGACCAGG AGAAAAACCA ACCAATAAAA ACAGGCCCGG ACATAAGACA AATAATAAAA 120
pAGCGGACA AGGACATGAA AACAGCTAp GTAAGAGCGG ATATAGTGGT GTGTGTCTGG 180
GCTCAACATG CTA 193
(2) INFORMATION FOR SEQ ID NO: 162: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 147 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:
TGpGAGCCC AGACACTGAC CAGGAGAAAA ACCAACCAAT AAAAACAGGC CCGGACATAA 60
GACAAATAAT AAAApAGCG GACAAGGACA TGAAAACAGC TApGTAAGA GCGGATATAG 120
TGGTGTGTGT CTGGGCTCAA CATGCTA 147
(2) INFORMATION FOR SEQ ID NO: 163: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 294 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:
TAGCATGpG AGCCCAGACA CAAATCpTC CTTAAGCAAT AAATCATpC TGCATATGp 60
pTAAAACCA CAGCTAAGCC ATGApApC AAAAGGACTA pGTApGGG TATTpGAp 120
TGGGpcpA TCTCCCTCAC ApATCTTCA pTCTATCAT TGACCTCTTA TCCCAGAGAC 180
TCTCAAACTT pATGpATA CAAATCACAT TCTGTCTCAA AAAATATCTC ACCCACTTCT 240
CTTCTGTpC TGCGTGTGTA TGTGTGTGTG TGTGTGTCTG GGCTCAACAT GCTA 294
(2) INFORMATION FOR SEQ ID NO: 164: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 412 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:
CGGGApGGC TpGAGCTGC AGATGCTGCC TGTGACCGCA CCCGGCGTGG AACAGAAAGC 60
CACCTGGCTG CAAGTGCGCC AGAGCCGCCC TGACTACGTG CTGCTGTGGG GCTGGGGCGT 120
GATGAACTCC ACCGCCCTGA AGGAAGCCCA GGCCACCGGA TACCCCCGCG ACAAGATGTA 180
CGGCGTGTGG TGGGCCGGTG CGGAGCCCGA TGTGCGTGAC GTGGGCGAAG GCGCCAAGGG 240
CTACAACGCG CTGGCTCTGA ACGGCTACGG CACGCAGTCC AAGGTGATCC ANGACATCCT 300
GAMCACGTG CACGACAAGG GCCAGGGCAC GGGGCCCCAAA GACGAAGTGG GCTCGGTGCT 360
GTACACCCGC GGCGTGATCA TCCAGATGCT GGACAAGGTG TCAATCACTA AT 412
(2) INFORMATION FOR SEQ ID NO: 165: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 361 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:
GACACCp GTCCAGCATC TGCATCTGAT GAGAGCCTCA GATGGCTACC ACTAATGGCA 60
GAAGGCAAAG GAGAACAGGC ApGTATGGC AAGAAAGGAA GAAAGAGAGA GGGGAGAAAG 120
GTGCTAGGp CpTTCAACA ACCAGpCTT GATGGAACTG AGAGTAAGAG CTCAAGGCCA 180
GGTGTGGTGA CTCCAACCAG TAATCCCAAC ATpTAGGAG GCTGAGGCAG GCAGATGTCT 240
TGACCCCATG AG1TTGTGAC CAGCCTGAAC AACATCATGA GACTCCATCT CTACAATAAT 300
TACAAAAAp AATCAGGCAT TGTGGTATGC CCTGTAGTCC CAGATGCTGG ACAAGGTGTC 360
A 361
(2) INFORMATION FOR SEQ ID NO: 166: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 427 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:
TWGACTGACT CATGTCCCCT ACACCCAACT ATCpCTCCA GGTGGCCAGG CATGATAGAA 60
TCTGATCCTG ACTTAGGGGA ATATpTCp pTACTTCCC ATCpGApC CCTGCCGGTG 120
AGTpCCTGG pCAGGGTAA GAAAGGAGCT CAGGCCAAAG TAATGAACAA ATCCATCCTC 180
ACAGACGTAC AGAATAAGAG AACWTG6ACW TAGCCAGCAG AACMCAAKTG AAAMCAGAAC 240
MCpAMCTAG GATRACAAMC MCRRARATAR KTGCYCMCMC WTATAATAGA AACCAAACp 300
GTATCTAAp AAATATpAT CCACYGTCAG GGCApAGTG GTpTGATAA ATACGCpTG 360
GCTAGGApC CTGAGGpAG AATGGAARAA CAApGCAHC GAGGGTAGGG GACATGAGTC 420
AKTCTAA 427
(2) INFORMATION FOR SEQ ID NO: 167: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 500 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:
AACGTCGCAT GCTCCCGGCC GCCATGGCCG CGGGATAGAC TGACTCATGT CCCCTAAGAT 60
AGAGGAGACA CCTGCTAGGT GTAAGGAGAA GATGGpAGG TCTACGGAGG CTCCAGGGTG 120
GGAGTAGpC CCTGCTAAGG GAGGGTAGAC TGpCAACCT GpCCTGCTC CGGCCTCCAC 180
TATAGCAGAT GCGAGCAGGA GTAGGAGAGA GGGAGGTAAG AGTCAGAAGC pATGpGp 240
TATGCGGGGA AACGCCRTAT CGGGGGCAGC CRAGpApA GGGGACACTRTR TAGWYARTCW 300
AGNTAGCATC CAAAGCGNGG GAGpNTCCC ATATGGpGG ACCTGCAGGC GGCCGCApA 360
GTGApAGCA TGTGAGCCCC AGACACGCAT AGCAACAAGG ACCTAAACTC AGATCCTCTG 420
CTGApACp AACATGAAp ApGTATpA pTAACAACT pGAGpATG AGGCATApA 480
pAGGTCCAT ATTACCTGGA 500
(2) INFORMATION FOR SEQ ID NO: 168: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 358 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:
pCATCGCTC GGTGACTCAA GCCTGTAATC CCAGAACTp GGGAGGCCGA GGGGAGCAGA 60
TCACCTGAGG pGGGAG GAGACCAGCC TGGCCAACAT GGTGACAACC CGTCTCTGCT 120
AAAAATACAA AAApAGCCA AGCATGGTGG CATGCACpG TAATCCCAGC TACTCGGGAG 180
GCTGAGGCAG GAGAATCACT TGAGGCCAGG AGGCAGAGGT TGCAGTGAGG CAGAGGpGA 240
GATCATGCCA CTGCACTCCA GCCTGGGCAA CAGAGTAAGA CTCCATCTCA AAAAAAAAAA 300
AAAAAAAGAA TGATCAGAGC CACAAATACA GAAAACCpG AGTCACCGAG CGATGAAA 358
(2) INFORMATION FOR SEQ ID NO: 169: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1265 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:
pCTGTCCAC ACCAATCpA GAGCTCTGAA AGAATpGTC pTAAATATC TpTAATAGT 60
AACATGTAp pATGGACCA AATTGACAp pCGACTAp TTpCCCAAA AAAAGTCAGG 120
TGAATpCAG CACACTGAGT TGGGAATpC pATCCCAGA AGWCGGCACG AGCAATpCA 180
TATpApTA AGA GApC CATACTCCGT prCAAGGAG AATCCCTGCA GTCTCCTTAA 240
AGGTAGAACA AATACpTCT CACCApGTG GGApGGACT pAAGAGGTG 300
ACTCTAAAAA AACAGAGAAC AAATATGTCT CAGpGTAp AAGCACGGAC CCATApATC 360
ATApCACp AAAAAAATGA pTCCTGTGC ACCTpTGGC AACTTCTCTT pCAATGTAG 420
GGAAAAACTT AGTCACCCTG AAAACCCACA AAATAAATAA AACpGTAGA TGTGGGCAGA 480
ARGTTTGGGG GTGGACApG TATGTGTpA AApAAACCC TGTATCACTG AGAAGCTGp 540
GTATGGGTCA GAGAAAATGA ATGCpAGAA GCTGpCACA TCpCAAGAG CAGAAGCAAA 600
CCACATGTCT CAGCTATAp ApApTA rpTATGCAT AAAGTGAATC ATpcpCTG 660
TApAApTC CAAAGGGTp TACCCTCTAT pAAATGCp TGAAAAACAG TGCApGACA 720
ATGGGpGAT ATppc T AAAAGAAAAA TATAApATG AAAGCCAAGA TAATCTGAAG 780
CCTGT? AT prAAAAcp prATGpcT GTGGpGATG pGpTGpr GpTG rcT 840
AppGpGG pppAcp TGTTppGT rGppGT ptGGtpoG CATACTACAT 900
GCAGTpcp TAACCAATGT CTGpTGGCT AATGTAApA AAGpGpAA pTATATGAG 960
TGCATpCAA CTATGTCAAT GGTpCTTAA TATTTApGT GTAG AGTAC TGGTAATpT 1020
pTATpACA ATATGTpAA AGAGATAACA GTpGATATG TTpCATGTG pTATAGCAG 1080
A GpApTA pTCTATGGC ATTCCAGCGG ATATpTGGT GTpGCGAGG CATGCAGTCA 1140
ATATpTGTA CAGpAGTGG ACAGTApCA GCAACGCCTG ATAGCpCTT TGGCCpATG 1200
pAAATAAAA AGACCTGTp GGGATGTAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1260
AAAAA 1265
(2) INFORMATION FOR SEQ ID NO: 170: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 383 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:
TGTAAGTCGA GCAGTGTGAT GACGATApc pCTTApAA TGTGGTAAp GAACAAATGA 60
TCTGTGATAC TGATCCTGAG CTAGGAGGCG CTGpCAGp AATGGGACp CTTCGTACTC 120
TAApGATCC AGAGAACATG CTGGCTACAA CTAATAAAAC CGAAAAAAGT GAATITCTAA 180
ApppCTA CAACCApGT ATGCATGpC TCACAGCACC ACTTpGACC AATACpCAG 240
AAGACAAATG TGAAAAGGAT AATATAGpG GATCAAACAA AAACAACACA ATpGTCCCG 300
ATAApATCA AACAGCACAG CTACTTGCCT TAATpTAGA GpACTCACA TpTGTGTGG 360
AACATCACAC TGCTCGACp ACA 383
(2) INFORMATION FOR SEQ ID NO: 171: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 383 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:
TGGGCACCTT CAATATCGCA AGpAAAAAT AATGpGAGT pApATACT TTTGACCTGT 60
pAGCTCAAC AGGGTGAAGG CATGTAAAGA ATGTGGACTT CTGAGGAAp pCTpTAAA 120
AAGAACATAA TGAAGTAACA TAApAC TCAAGGACTA CTpTGGpG AAGpTATAA 180
TCTAGATACC TCTACTTpT GppTGCTG pCGACAGp CACAAAGACC pCAGCAAp 240
TACAGGGTAA AATCGpGAA GTAGTGGAGG TGAAACTGAA ApTAAAAp ApCTGTAAA 300
TACTATAGGG AAAGAGGCTG AGCpAGAAT CTpTGGpG pCATGTGp CTGTGCTCp 360
ATCATCACAC TGCTCGACp ACA 383
(2) INFORMATION FOR SEQ ID NO: 172: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 699 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:
TCGGGTGATG CCTCCTCAGG CTTGTCGpA GTGTACACAG AGCTGCTCAT GAAGCGACAG 60
CGGCTGCCCC TGGCACpCA GAACCTCpC CTCTACACp pGGTGCGCT TCTGAATCTA 120
GGTCTGCATG CTGGCGGCGG CTCTGGCCCA GGCCTCCTGG AAAGTpCTC AGGATGGGCA 180
6CACTCGTGG TGCTGAGCCA GGCACTAAAT GGACTGCTCA TGTCTGCTGT CATGGAGCAT 240
GGCAGCAGCA TCACACGCCT CTTTGTGGTG TCCTGCTCGC TGGTGGTCAA CGCCGTGCTC 300
TCAGCAGTCC TGCTACGGCT GCAGCTCACA GCCGCCTTCT TCCTGGCCAC ApGCTCAp 360
GGCCTGGCCA TGCGCCTGTA CTATGGCAGC CGCTAGTCCC TGACAACpC CACCCTGAp 420
CCGGACCCTG TAGApGGGC GCCACCACCA GATCCCCCTC CCAGGCCpC CTCCCTCTCC 480
CATCAGCGGC CCTGTAACAA GTGCCpGTG AGAAAAGCTG GAGAAGTGAG GGCAGCCAGG 540
pApCTCTG GAGGpGGTG GATGAAGGGG TACCCCTAGG AGATGTGAAG TGTGGGTTTG 600
GpAAGGAAA TGCpACCAT CCCCCACCCC CAACCAAGp NpCCAGACT AAAGAApAA 660
GGTAACATCA ATACCTAGGC CTGAGGAGGC ATCACCCGA 699
(2) INFORMATION FOR SEQ ID NO: 173: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 701 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:
TCGGGTGATG CCTCCTCAGG CCAGATCAAA CpGGGGpG AAAACTGTGC AAAGAAATCA 60
ATGTCGGAGA AAGAATTpG CAAAAGAAAA ATGCCTAATC AGTACTAAp TAATAGGTCA 120
CApAGCAGT GGAAGAAGAA ATGpGATAT pTATGTCAG CTATpTATA ATCACCAGAG 180
TGCpAGCp CATGTAAGCC ATCTCGTAp CApAGAAAT AAGAACAAp pApCCTCG 240
GAAAGAACp pCAApTAT AGCATCpAA pGCTCAGGA TpTAAATp TGATAAAGAA 300
AGCTCCACp pGGCAGGAG TAGGGGGCAG GGAGAGAGGA GGCTCCATCC ACAAGGACAG 360
AGACACCAGG GCCAGTAGGG TAGCTGGTGG CTGGATCAGT CACAACGGAC TGACpATGC 420
CATGAGAAGA AACAACCTCC AAATCTCAGT TGCpAATAC AACACAAGCT CATpCTTGC 480
TCACGpACA TGTCCTATGT AGATCAACAG CAGGTGACTC AGGGACCCAG GCTCCATCTC 540
CATATGAGCT TCCATAGTCA CCAGGACACG GGCTCTGAAA CTGTCCTCCA TGCAGGGACA 600
CATGCCTCp CCpTCApG GGCAGAGCAA GTCACpATG GCCAGAAGTC ACACTGCAGG 660
GCAGTGCCAT CCTGCTGTAT GCCTGAGGAG GCATCACCCG A 701
(2) INFORMATION FOR SEQ ID NO: 174: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 700 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:
CAGGAGACAG 60
GGAAAGACAT AGATpTAAC CGGCCCCCp CAGGAGApC TGAGGCTCAG pCACpTGT 120
TGCAGTpGA ACAGAGGCAG CAAGGCTAGT GGpAGGGGC ACGGTCTCTA AAGCTGCACT 180
GCCTGGATCT GCCTCCCAGC TCTGCCAGGA ACCAGCTGCG TGGCCpGAG CTGCTGACAC 240
GCAGAAAGCC CCCTCTGGAC CCAGTCTCCT CGTCTGTAAG ATGAGGACAG GACTCTAGGA 300
ACCCTpCCC pGGpTGGC CTCAC? TCA CAGGCTCCCA TCpGAACTC TATCTACTCT 360
TpCCTGAAA CCpGTAAAA GAAAAAAGTG CTAGCCTGGG CAACATGGCA AAACCCTGTC 420
TCTACAAAAA ATACAAAAAT TAGpGGGTG TGGTGGCATG TGCCTGTAGT CCCAGCCACT 480
TGGGAGGTGC TGAGGTGGGA GGATCACpG AGCCCGGGAG GTGGAGGpG CAGTGAGCCA 540
AGATCATGCC ACTGCACTCC AGCCTGAGTA ATAGAGTAAG ACTCTGTCTC AAAAACAACA 600
ACAACAACAG TGAGTGTGCC TCTGTTTCCG GGpGGATGG GGCACCACAT pATGCATCT 660
CTCAGATpG GACGCTGCAG CCTGAGGAGG CATCACCCGA 700
(2) INFORMATION FOR SEQ ID NO: 175: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 484 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:
TATAGGGCGA ApGGGCCCG AGpGCATGN TCCCGGCCGC CATGGCCGCG GGApCGGGT 60
GATGCCTCCT CAGGCTTGTC TGCCACAAGC TACpCTCTG AGCTCAGAAA GTGCCCCTTG 120
ATGAGGGAAA ATGTCCTACT GCACTGCGAA pTCTCAGp CCATpTACC TCCCAGTCCT 180
CCpCTAAAC CAGpAATAA ApCAiTCCA CAAGTATTTA CTGApACCT GCpGTGCCA 240
GGGACTApC TCAGGCTGAA GAAGGTGGGA GGGGAGGGCG GAACCTGAGG AGCCACCTGA 300
GCCAGCTpA TATpCAACC ATGGCTGGCC CATCTGAGAG CATCTCCCCA CTCTCGCCAA 360
CCTATCGGGG CATAGCCCAG GGATGCCCCC AGGCGGCCCA GGpAGATGC GTíXCpTGG 420
CpGTCAGTG ATGACATACA CCpAGCTGC pAGCTGGTG CTGGCCTGAG GAGGCATCAC 480
CCGA 484
(2) INFORMATION FOR SEQ ID NO: 176: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 432 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:
TCGGGTGATG CCTCCTCAGG GCTCAAGGGA TGAGAAGTGA CpcpTCTG GAGGGACCGT 60
TCATGCCACC CAGGATGAAA ATGGATAGGG ACCCACTTGG AGGACpGCT GATATGpTG 120
GACAAATGCC AGGTAGCGGA ApGGTACTG GTCCAGGAGT TATCCAGGAT AGATTpCAC 180
CCACCATGGG ACGTCATCGT TCAAATCAAC TCpCAATGG CCATGGGGGA CACATCATGC 240
CTCCCACACA ATCGCAGTp GGAGAGATGG GAGGCAAGp TATGAAAAGC CAGGGGCTAA 300
6CCAGCTCTA CCATAACCAG AGTCAGGGAC TCpATCCCA GCTGCAAGGA CAGTCGAAGG 360
ATATGCCACC TCGGTpTCT AAGAAAGGAC AGCpAATGC AGATGAGAp AGCCTGAGGA 420
GGCATCACCC GA 432
(2) INFORMATION FOR SEQ ID NO: 177: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 788 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:
TAGCATGpG AGCCCAGACA CAGTAGCAp TGTGCCAAp TCTGGpGGA ATGGTGACAA 60
CATGCTGGAG CCAAGTGCTA ACATGCCTTG GpCAAGGGA TGGAAAGTCA CCCGTAAGGA 120
TGGCAATGCC AGTGGAACCA CGCTGCpGA GGCTCTGGAC TGCATCCTAC CACCAACTCG 180
CCCAACTGAC AAGCCCpGC GCCTGCCTCT CCAGGATGTC TACAAAApG GTGGTApGG 240
TACTGpCCT GpGGCCGAG TGGAGACTGG TGpCTCAAA CCCGGTATGG TGGTCACCTT 300
TGCTCCAGTC AACCTTACAA CGGAAGTAAA ATCTGTCGAA ATGCACCATG AAGCTGTGAG 360
TGAAGCTCp CCTGGGGACA ATGTGGGCp CAATGTCAAG AATGTGTCTG TCAAGGATGT 420
TCGTCGTGGC AACGpGCTG GTGACAGCAA AAATGACCCA CCAATGGAAG CAGCTGGCTT 480
CACTGCTCAG GTGApATCC TGAACCATCC AGGCCAAATA AGTGCCGGCT ATGCCCCTGT 540
Ap8ApGC CACACGGCTC ACApGCATG CAAGTpGCT GAGCTGAAGG AAAAGApGA 600
TCGCCGpCT GGTAAAAAGC TGGAAGATGG CCCTAAApC pGAAGTCTG GTGATGCTGC 660
CApGpGAT ATGGpCCTG GCAAGCCCAT GTGTGpGAG AGCTTCTCAG ACTATCCACC 720
pTGGGTCGC pTGCTGpC GTGATATGAG ACAGACAGp GCGGTGGGTG TCTGGGCTCA 780
ACATGCTA 788
(2) INFORMATION FOR SEQ ID NO: 178:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 786 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:
TAGCATGpG AGCCCAGACA CCTGTGTTTC TGGGAGCTCT GGCAGTGGCG GApCATAGG 60
CACpGGGCT GCACpTGAA TGACACACp GGCpTApA GApCACTAG ppTAAAAA 120
ApGpGpc GpTcppc ApAAAGGp TAATCAGACA GATCAGACAG CATAAT? GC 180
TApTAATGA CAGAAACGp GGTACATpC pCATGAATG AGCTTGCAp CTGAAGCAAG 240
AGCCTACAAA AGGCACpGT TATAAATGAA AGpCTGGCT CTAGAGGCCA GTACTCTGGA 300
GTTTCAGAGC AGCCAGTGAT TGpCCAGTC AGTGATGCCT AGpATATAG AGGAGGAGTA 360
CACTGTGCAC TCpCTAGGT GTAAGGGTAT GCAACpTGG ATCpAAAAT TCTGTACACA 420
TACACACTTT ATATATATGT ATGTATGTAT GAAAACATGA AApAGpTG TCAAATATGT 480
GTGTGTTTAG TATpTAGCT TAGTGCAACT ApTCCACAT TATpApAA ApGATCTAA 540
GACACTpCT TGpGACACC pGAATApA ATGpCAAGG GTGCAATGTG TApCCpTA 600
GApGpAAA GCTTAApAC TATGATTTGT AGTAAApAA CppAAAAT GTATGTGAGC 660
CCpCTGTAG TGTCGTAGGG CTCTTACAGG GTGGGAAAGA TpTAATpT CCACTTGCTA 720
ATTGAACAGT ATGGCCTCAT TATATATpT GATpATAGG AGTpGTGTC TGGGCTCAAC 780
ATGCTA 786
(2) INFORMATION FOR SEQ ID NO: 179:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 796 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:
TAGCATGpG AGCCCAGACA CTGGpACAA GACCAGACCT GCTTCCTCCA TATGTAAACA 60
GCTTpAAAA AGCCAGTGAA CCTTpTAAT ACpTGGCAA CCTTCTpCA CAGGCAAAGA 120
ACACCCCCAT CCGCCCCpG pTGGAGTGC AGAGTpGGC pTGGpcp TGCCpGCCT 180
GGAGTATACT TCTAApCCT GpGTCCTGC ACAAGCTGAA TACCGAGCTA CCCACCGCCA 240
CCCAGGCCAG GpTCCACTC ApTApACT pATGpTCT GpCCApGC TGGTCCACAG 300
AAATAAGTp TCCTpGGAG GAATGTGAp ATACCCCTp AATpCCTCC ppGCpp 360
ppAATATC ApGGTATGT GpTGGCCCA GAGGAAACTG AAApCACCA TCATCpGAC 420
TGGCAATCCC ApACCATGC l i l l l lAAA AAACGTAAp pTCpGCCT TACApGGCA 480
GAGTAGCCCT TCCTGGCTAC TGGCTTAATG TAGTCACTCA GpTCTAGGT GGCApAGGC 540
ATGAGACCTG AAGCACAGAC TGTCpACCA CAAAAGGTGA CAAGATCTCA AACCpAGCC 600
AAAGGGCTAT GTCAGGpTC AATGCTATCT GCpCTGpC CTGCTCACTG pCTGGATIT 660
TGTccpcp CATCCCTAGC ACCAGAATIT CCCAGTCTCC CTCCCTACCT TcccpGpr 720
TAApCTAAT CTATCAGCAA AATAACTpT CAAATGTpT AACCGGTATC TCCATGTGTC 780
TGGGCTCAAC ATGCTA 796
(2) INFORMATION FOR SEQ ID NO: 180:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 488 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:
GGATGTGCTG CAAGGCGAp AAGpGGGTA ACGCCAGGGT pTCCCAGTC ACGACGpGT 60
AAAACGACGG CCAGTGAAp GTAATACGAC TCACTATAGG GCGAApGGG CCCGACGTCG 120
CATGCTCCCG GCCGCCATGG CCGCGGGATA GCATGpGAG CCCAGACACC TGCAGGTCAT 180
pGGAGAGAT TpTCACGp ACCAGCpGA TGGTCTpp CAGGAGGAGA GACACTGAGC 240
ACTCCCAAGG TGAGGpGAA GATpCCTCT AGATAGCCGG ATAAGAAGAC TAGGAGGGAT 300
GCCTAGAAAA TGApAGCAT GCAAATpCT ACCTGCCAp TCAGAACTGT GTGTCAGCCC 360
ACApCAGCT GCpcpGTG AACTGAAAAG AGAGAGGTAT TGAGACTTp CTGATGGCCG 420
CTCTAACAp GTAACACAGT AATCTGTGTG TGTGTGGGTG TGTGTGTGTG TCTGGGCTCA 480
ACATGCTA 488
(2) INFORMATION FOR SEQ ID NO: 181:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 317 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:
TAGCATGpG AGCCCAGACA CGGCGACGGT ACCTGATGAG TGGGGTGATG GCACCTGTGA 60
AAAGGAGGAA CGTCATCCCC CATGATApG GGGACCCAGA TGATGAACCA TGGCTCCGCG 120
TCAATGCATA pTAATCCAT GATACTGCTG ApGGAAGGA CCTGAACCTG AAGT TGTGC 180
TGCAGGTTTA TCGGGACTAT TACCTCACGG GTGATCAAAA CpCCTGAAG GACATGTGGC 240
CTGTGTGTCT AGTAAGGGAT GCACATGCAG TGGCCAGTGT GCCAGGGGTA TGGpGGTGT 300
CTGGGCTCCA CATGCTA 317
(2) INFORMATION FOR SEQ ID NO: 182:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 507 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:
TAGCATGpG AGCCCAGACA CTGGCTGpA GCCAAATCCT CTCTCAGCTG CTCCCTGTGG 60
pTGGTGACT CAGGApACA GAGGCATCCT GTÍTCAGGGA ACAAAAAGAT pTAGCTGCC 120
AGCAGAGAGC ACCACATACA pAGAATGGT AAGGACTGCC ACCTCCTTCA AGAACAGGAG 180
TGAGGGTGGT GGTGAATGGG AATGGAAGCC TGCApCCCT GATGCATpG TGCTCTCTCA 240
AATCCTGTCT TAGTCTTAGG AAAGGAAGTA AAGTpCAAG GACGGpCCG AACTGCTTp 300
TGTGTCTGGG CTCAACATGC TATCCCGCGG CCATGGCGGC CGGGAGCATG CGACGTCGGG 360
CCCAApCGC CCTATAGTGA GTCGTApAC AApCACTGG CCGTCGTTp ACAACGTCGT 420
GACTGGGAAA ACCCTGGCGT TACCCAACTT AATCGCCpG CAGCACATCC CCCTpCCCA 480
TANCGAAAAG GCCCGCA 507
(2) INFORMATION FOR SEQ ID NO: 183:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 227 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:
GApTACGCT GCAACACTGT GGAGGTAGCC CTGGAGCAAG GCAGGCATGG ATGCpCTGC 60
AATCCCCAAA TGGAGCCTGG TATpCAGCC AGGAATCTGA GCAGAGCCCC CTCTAApGT 120
AGCAATGATA AGpApCTC pTGpCTTC AACCTTCCAA TAGCCpGAG CpCCAGGGG 180
AGTGTCGpA ATCApACAG CCTGGTCTCC ACAGTGp & C AGCGTAA 227
(2) INFORMATION FOR SEQ ID NO: 184:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 225 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:
pACGCTGCA ACACTGTGGA GCAGApAAC ATCAGACTTT TCTATCAACA TGACTGGGGT 60
TACTAAAAAG ACAACAAATC AATGGCpCA AAAGTCTAAG GAATAATpC GATACTTCAA 120
CpTATAAAA CCTGACAAAA CTATCAATCA AGCATAAAGA CAGATGAAGA ACATpCCAG 180
ATpTGGCCA ATCAGATAp pACCTCCAC AGTGpGCAG CGTAA 225
(2) INFORMATION FOR SEQ ID NO: 185:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 597 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:
GGCCCGACGT CGCATGCTCC CGGCCGCCAT GGCCGCGGGA pCGpAGGG TCTCTATCCA 60
CTGGGACCCA TAGGCTAGTC AGAGTATpA GAGpGAGp CCTTTCTGCT TCCCAGAAp 120
TGAAAGAAAA GGAGTGAGGT GATAGAGCTG AGAGATCAGA pTGCCTCTG AAGCCTGpC 180
AAGATGTATG TGCTCAGACC CCACCACTGG GGCCTGTGGG TGAGGTCCTG GGCATCTAp 240
TGAATGAAp GCTGAAGGGG AGCACTATGC OWGGAAGGG GAACCCATCC TGGCACTGGC 300
ACAGGGGTCA CCpATCCAG TGCTCAGTGC TTCTpGCTG CTACCTGGp pCTCTCATA 360
TGTGAGGGGC AGGTAAGAAG AAGTGCCCRG TGpGTGCGA GTpTAGAAC ATCTACCAGT 420
AAGTGGGGAA GTpCACAAA GCAGCAGCp TGTpTGTGT Ap TCACCT TCAGpAGAA 480
GAGGAAGGCT GTGAGATGAA TGpAGpGA GTGGAAAAGA CGGGTAAGCT TAGTGGATAG 540
AGACCCTAAC GAATCACTAG TGCGGCCGCC pGCAGGTCG ACCATATGGG AGAGCTC 597
(2) INFORMATION FOR SEQ ID NO: 186:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 597 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:
GGCCCGAAGT TGCATGpCC CGGCCGCCAT GGCCGCGGGA pCGpAGGG TCTCTATCCA 60
CTACCTAAAA AATCCCAAAC ATATAACTGA ACTCCTCACA CCCAApGGA CCAATCCATC 120
ACCCCAGAGG CCTACAGATC CTCCTpGAT ACATAAGAAA ApTCCCCAA ACTACCTAAC 180
TATATCATp TGCAAGATp GTpTACCAA ATpTGATGG CCTTTCTGAG CTTGTCAGTG 240
TGAACCACTA pACGAACGA TCGGATApA ACTGCCCCTC ACCGTCCAGG TGTAGCTGGC 300
AACATCAAGT GCAGTAAATA pCApAAGT TITCACCTAC TAAGGTGCp AAACACCCTA 360
GGGTGCCATG TCGGTAGCAG ATCTTpGAT pGIlílIAT pCCCATAAG GGTCCTGpC 420
AAGGTCAATC ATACATGTAG TGTGAGCAGC TAGTCACTAT CGCATGACp GGAGGGTGAT 480
AATAGAGGCC TCCTpGCTG pAAAGAACT CTTGTCCCAG CCTGTCAAAG TGGATAGAGA 540
CCCTAACGAA TCACTAGTGC GGCCGCCTGC AGGTCGACCA TATGGGAGAG CTCCCAA 597
(2) INFORMATION FOR SEQ ID NO: 187:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 324 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:
TCGpAGGGT CTCTATCCAC pGCAGGTAA AATCCAATCC TGTGTATATC pATAGTCTT 60
CCATATGTAG TGGpCAAGA GACTGCAGp CCAGAAAGAC TAGCCGAGCC CATCCATGTC 120
pCCACpAA CCCTGCTTTG GGpACACAT CTTAACpp CTGpCAAGT pCTCTGTGT 180
AGTpATAGC ATGAGTApG GGAWAATGCC CTGAAACCTG ACATGAGATC TGGGAAACAC 240
AAACpACTC AATAAGAAp TCTCCCATAT TpTATGATG GAAAAATpC ACATGCACAG 300
AGGAGTGGAT AGAGACCCTA ACGA 324
(2) INFORMATION FOR SEQ ID NO: 188:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 178 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:
GCGCGGGGAT TCGGGGTGAT ACCTCCTCAT GCCAAAATAC AACGTNTAAT pCACAACTT 60
GCcpccAAT pACGCApr TCAA? TGCT CTCCCCATGT Gp & GTCAc AACAAACACC 120
ApGCCCAGA AACATGTAp ACCTAACATG CACATACTCT TAAAACTACT CATCCCp 178
(2) INFORMATION FOR SEQ ID NO: 189:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 367 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:
TGACACCTTG TCCAGCATCT GACACACTCT TGGCTCTTGG AAAATApGG ATAAATGAAA 60
ATGAATpCT pAGCAAGTG GTATAAGCTG AGAATATACG TATCACATAT CCTCApCTA 120
AGACACApC AGTGTCCCTG AAApAGAAT AGGACTTACA ATAAGTCTCT TCACTFTCTC 180
AATAGCTGp ApCAApGA TGGTAGGCCT TAAAAGTCAA AGAAATGAGA GGGCATGTGA 240
AAAAAAGCTC AACATCACTG ATCApAGAA AACTTCCAp CAAACCCCCA ATGAGATACC 300
ATCTCATACC AGTCAGAATG GCTApApA AAAAGTCAAA AAATAACAGA TGCTGGACAA 360
GGTGTCA 367
(2) INFORMATION FOR SEQ ID NO: 190:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 369 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:
GACACCTTGT CCAGCATCTG ACAACGCTAA CAGCCTGAGG AGATCTTTAT pApTATTT 60
AGTT? TACT CTGGCTAGGC AGATGGTGGC TAAAACApc ApTACccAT pApcApr 120
AApGpCT GCAAGGCCTA TGGATAGAGT ApGTCCAGC ACTGCTCTGG AAGCTAGGAG 180
CATGGGGATG AACAAGATAG GCTACATCCT GpCCCACAG AACpCCACT pAGTCTGGG 240
AAACAGATGA TATATACAAA TATATAAATG AApCAGGTA GTpTAAGTA CGAAAAGAAT 300
AAGAAAGCAG AGTCATGAp TANAATGCTG GAAACAGGGG CTApGCTTG AGATApGAA 360
GGTGCCCAA 369
(2) INFORMATION FOR SEQ ID NO: 191:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 369 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:
AAGAAACTAT TATCAGAGTG AACAGGCAAC 60
CTACAGAATG GGAGAAAAp pTGCAATCT ATCCATCTGA CAAAGGGCTA ATATCCAGAA 120
TCTACAAAGA ACpATACAA ATTTACAAGA AACAAACAAA CAAACAACTC CTCAAAAAGT 180
GGGTGAAGGA TGTGAACAGA CACTTCTCAA AAGAAGACAT pATGGGGCC AACAAACATA 240
TGAAAAAAAG CTCATCATCA CTGGTCACTA GATAAATGCA AATCAAAACC ACAATGAGAT 300
ACCATCTCAT TCCAGpAGA ATGGCAATCA pAAAAAGTC AGGAAACAAC AGATGCTGGA 360
CAAGGTGTC 369
(2) INFORMATION FOR SEQ ID NO: 192:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 449 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:
TGACGCpGG CCACpGACA CTTCATCTp GCACAGAAAA ACTTCpTAC AGATpAAp 60
CAAGACTGGT CTAGTGACAG TCCTCCAGAC Al II 11 ICAT pGpCCATA TACGTGGAAT 120
pTAAAATCA TGTpCATCA GpTGAAATG ATpGGGCTG CTAATCAACA CAApGGATC 180
GACTGpCTA CTAAACAACA GGAAAATGTG TATCTG6CAG CCTGTGGAGA AACACTAAAC 240
ApGAprp cpTGccpr TACGGAcpr GpccAGCTA CATGTAATAC CAAGpCTCT 300
pAAGAGGAG AAGATGpGA TCTTCApTG TTTCTACCAG ACTGCCACCC TAGTAAATAT 360
TCTpApTA TGCTGGTAAA AAApGCCAT CCAAATAAGA TGApCATGA TACTGGTAp 420
CCTGCTGAGT GTCAAGTGGC CAAGCGTCA 449
(2) INFORMATION FOR SEQ ID NO: 193:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 372 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:
TGACGCpGG CCACpGACA CCAGGGATGT AKCAGpGAA TATAATCCTG CAApCTACA 60
TApGGCAAT pCCCATCAA ACApCTAGA AAGAGACAAC CAGGApGCT AGGCCATAAA 120
AGCTGCAATA AATAACTGGT AApGCAGTA ATCATpCAG GCCAApCAA TCCAGTpGG 180
CTCAGAGGTG CCpTGGCTG AGAGAAGAGG TGAGATATAA TGTGTpTCT TGCAACpCT 240
TGGAAGAATA ACTCCACAAT AGTCTGAGGA CTAGATACAA ACCTATpGC CApAAAGCA 300
CCAGAGTCTG pAApCCAG TACTGATAAG TGpGGAGAT TAGACTCCAG TGTGTCAAGT 360
GGCCAAGCGT CA 372
(2) INFORMATION FOR SEQ ID NO: 194:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 309 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:
TGACGCpGG CCACpGACA CTTATGTAGA ATCCATCGTG GGCTGATGCA AGCCCpTAT 60
pAGGCpAG TGpGTGGGC ACCTTCAATA TCACACTAGA GACAAACGCC ACAAGATCTG 120
CAGAAACAp CAGpCTGAN CACTCGAATG GCAGGATAAC ppTGTGp GTAATCCpC 180
ACATATACAA AAACAAACTC TGCANTCTCA CGpACAAAA AAACGTACTG CTGTAAAATA 240
pAAGAAGGG GTAAAGGATA CCATCTATAA CAAAGTAACT TACAACTAGT GTCAAGTGGC 300
CAAGCGTCA 309
(2) INFORMATION FOR SEQ ID NO: 195:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 312 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:
TGACGCpGG CCACpGACA CCCAATCTCG CACTTCATCC TCCCAGCACC TGATGAAGTA 60
GGACTGCAAC TATCCCCACT TCCCAGATGA GGGGACCAAN GTACACApA GGACCCG6AT 120
GGGAGCACAG ApTGTCCGA TCCCAGACTC CAAGCACTCA GCGTCACTCC AGGACAGCGG 180
CTpCAGATA AGGTCACAAA CATGAATGGC TCCGACAACC GGAGTCAGTC CGTGCTGAGT 240
TAAGGCAATG GTGACACGGA TGCACGTCTN ACCTGTAATG GpCATCGTA AGTGTCAAGT 300
GGCCAAGCGT CA 312
(2) INFORMATION FOR SEQ ID NO: 196:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 288 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:
TGTATCGACG TAGTGGTCTC CTCAGCCATG CAGAACTGTG ACTCAApAA ACCTCpTCC 60
pTATGAAp ACCCAATCTC GGGTAGTGTC pTATAGTAG TGTGAGAATG GACTAATACA 120
AGTACATTp ACpAGTAAT AATAATAAAC AAATATApA CAppTGTG TATpACTAC 180
ACCATATTp pApGpAT TGTAGTGTAC ACCpCTACT TApAAAAGA AATAGGCCCG 240
AGGCGGGCAG ATCACGAGGT CAGGAGATGG AGACCACTAC GTCGATAC 288
(2) INFORMATION FOR SEQ ID NO: 197:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 289 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:
pGGGCACCT TCAATATCAT GACAGGTGAT GTGATAACCA AGAAGGCTAC TAAGTGApA 60
ATGGGTGGGT AATGTATACA GAGTAGGTAC ACTGGACAGA GGGGTAApC ATAGCCAAGG 120
CAGGAGAAGC AGAATGGCAA AACATpCAT CACACTACTC AGGATAGCAT GCAGpTAAA 180
ACCTATAAGT AGTTTApp TGGAATpTC CACpAATAT pTCAGACTG CAGGTAACTA 240
AACTGTGGAA CACAAGAACA TAGATAAGGG GAGACCACTA CGTCGATAC 289
(2) INFORMATION FOR SEQ ID NO: 198:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 288 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:
GTATCGACGT AGTGGTCTCC CAAGCAGTGG GAAGAAAACG TGAACCAAp AAAATGTATC 60
AGATACCCCA AAGAAAGGCG CpGAGTAAA GApCCAAGT GGGTCACAAT CTCAGATCTT 120
AAAApCAGG CTGTCAAAGA GATpGCTAT GAGGpGCTC TCAAT6ACTT CAGGCACAGT 180
CGGCAGGAGA pGAAGCCCT GGCCApGTC AAGATGAAGG AGCpTGTGC CATGTATGGC 240
AAGAAAGACC CCAATGAGCG GGACTCCTGG AGACCACTAC GTCGATAC 288
(2) INFORMATION FOR SEQ ID NO: 199:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1027 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:
GCTTpTGGG AAAAACNCAA NTGGGGGAAA GGGGGNpNN TNGCAAGGGG ATAAAGGGGG 60
AANCCCAGGG pTCCCCAp CAGGGAGCTG TAAAAAGNCG GCCAGGGGAT TGTAANAGGA 120
pCAATAATA GGGGGAATGG GCCCNGAAGT TGCAAGGpC CNGCCCGCCA TGNCCGCGGG 180
ATpAGTGAC ApACGACGS TGGTAATAAA GTGGGSCCAA WAAATATpG TGATGTGAp 240
pTSGACCAG TGAACCCAp GWACAGGACC TCATpCCTY TGAGATGRTA GCCATAATCA 300
GATAAAAGRT TAGAAGTYp TCTGCACGp AACAGCATCA pAAATGGAG TGGCATCACC 360
AATpCACCC TTTGpAGCC GATACCpCC CCpGAAGGC ApCAApAA GTGACCAATC 420
GTCATACGAG AGGGGATGGC ATGGGGApG ATGATGATAT CAGGGGTGAT ACCpCACAG 480
GTGAAAGGCA TATCCTCTTG TCTATACTGA ATACCACAAG TACCCTTTTG ACCATGTCGA 540
CTAGCAAAp TGTCTCCAAT CTGTGTWATC CCTAACAGAG CGTACCCpA ppACAAAA 600
TpATATCCT TCCTGApGA GAGpACCAT AACCTGATCC ACAATGCCCG TCTCGCTWGT 660
TCTGAGAAAA GTGCTACAGT CTCTCpGGT ATAGCGTCTA pGGTGCTCT CCAApCATC 720
pcApprc AGGCAAGGTG AACTGppG CCTATAATAA CMTCATCTCC TGATACMCGA 780
AACCCCKGGA RCTATCAAAC CATCATCATC CAGCGpCKT WATGTYMCTA AATCCCTAp 840
GCGGCCGCCT GCAGGTCAAC ATATNGGAAA ACCCCCCACC CCTTNGGAGC NTACCpGAA 900
ppCCATAT GTCCCNTAAA pANCTNGNC pANCCTGGC CNTAACCTNT TCCGGpTAA 960
ApGpTCCG CCCCCNpCC CCNCCpNNA ACCGGAAACC pAAppA ACCNGGGGp 1020
CCTATCC 1027
(2) INFORMATION FOR SEQ ID NO: 200:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 207 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200:
AGTGACApA CGACGCTGGC CATCpGAAT CCTAGGGCAT GAAGpGCCC CAAAGpCAG 60
CACTTGGpA AGCCTGATCC CTCTGGTGTA TCACAAAGAA TAGGATGGGA TAAAGAAAGT 120
GGACACpAA ATAAGCTATA AApATATGG TCCpGTCTA GCAGGAGACA ACTGCACAGG 180
TATACTACCA GCGTCGTAAT GTCACTA 207
(2) INFORMATION FOR SEQ ID NO: 201:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 209 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:
TGGGCACCp CAATATCTAT TAAAAGCACA AATACTGAAG AACACACCAA GACTATCAAT 60
GAGGpACAT CTGGAGTCCT CGATATATCA GGAAAAAATG AAGTGAACAT TCACAGAGp 120
pAcpcpr GGGAACTCAA ATGCTAGAAA AGAAAAGGGT 6ccctctpc TCTGGCGTCC 180
TGGTCCTATC CAGCGTCGTA ATGTCACTA 209
(2) INFORMATION FOR SEQ ID NO: 202:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 349 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:
NTACGCTGCA ACACTGTGGA GCCACTGGp pTApCCCG GCAGGpATC CAGCAAACAG 60
TCACTGAACA CACCGAAGAC CGTGGTATGG TAACCGpCA CAGTAATCGT TCCAGTCGTC 120
TGCGGGACCC CGACGAGCGT CACTGGGTAC AGACCAGAp CAGCCGGAAG AGAAAGCGCC 180
GCAGGGAGAG ACTCGAACTC CACTCCGCTG GTGAGCAGCC CCATGTpTC AACTCGAAGT 240
TCAAACGGCA pGGGpATA TACCATCAGC TGAACTTCAC ACACATCTCC pGAACCCAC 300
TGGAAATCTA TpTCpGp CCGCTCTTCT CCACAGTGp GCAGCGTAA 349
(2) INFORMATION FOR SEQ ID NO: 203:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 241 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:
TGCTCCTCp GCCTTACCAA CCCAAAGCCC ACTCTGAAAT ATGAAGTGAA TGACAAAAp 60
CAGTpTCAA CGCAATATAG TATAGpTAT CTGApcpT TGATCTCCAG GACACpTAA 120
ACAACTGCTA CCACCACCAC CAACCTAGGG ApTAGGAp CTCCACAGAC CAGAAATTAJ 180
pCTCCTTTG AGpTCAGGC TCCTCTGGGA CTCCTCTTCA TCAATGGGTG GTAAATGGCT 240
A 241
(2) INFORMATION FOR SEQ ID NO: 204:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 248 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:
TAGCCApTA CCACCCATCT GCAAACCSWG ACMWWCARGR CYWGMACKYA GGCGATpGA 60
AGTACTGGTA ATGCTCTGAT CATGpAGp ACATAAGTGT GGTCAGTpA CAAAAApCA 120
CAGAACTAAA TACTCAATGC TATGTGpCA TGTCTGTGp TATGTGTGTG TAATGTpCA 180
ApAAGpp pTAAAAAAA AGAGATGAp TCCAAATAAG AAAGCCGTGT TGGTAAGGCA 240
AGAGGAGC 248
(2) INFORMATION FOR SEQ ID NO: 205:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 505 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:
TACGCTGCAA CACTGTGGAG CCApCATAC AGGTCCCTAA pAAGGAACA AGTGApATG 60
CTACCTpGC ACGGpAGGG TACCGCGGCC GpAAACATG TGTCACTGGG CAGGCGGTGC 120
CTCTAATACT GCTGATGCTA GAGGTGATGT TpTGGTAAA CAGGCGGGGT AAGATpGCC 180
AGpapT TACI 111111 AACCTGTCCT TATGAGCATG CCTGTGpGG GpGACAGTG 240
GGGGTAATAA TGACpGpG GpGApGTA GATApGGGC TGpAApGT CAGpCAGTG 300
TpTAATCTG ACGCAGGCp ATGCGGAGGA GAATGTpTC ATGpACpA TACTAACAp 360
AGpcpCTA TAGGGTGATA GApGGTCCA ApGGCTGTG AGGAGpCAG pATATGTp 420
GGGATTpp AGGTAGTGGG TGpGANCp GAACGCpTC pAApGGTG GCTGCTpTA 480
RGCCTACTAT GGGTGGTAAA TGGCT 505
(2) INFORMATION FOR SEQ ID NO: 206:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 179 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:
TAGACTGACT CATGTCCCCT ACCAAAGCCC ATGTAAGGAG CTGAGpcp AAAGACTGAA 60
GACAGACTAT TCTCTGGAGA AAAATAAAAT GGAAApGTA CpTAAAAAA AAAAAAAATC 120
GGCCGGGCAT GGTAGCACAC ACCTGTAATC CCAGCTACTA GGGGACATGA GTCAGTCTA 179
(2) INFORMATION FOR SEQ ID NO: 207:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 176 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:
AGACTGACTC ATGTCCCCTA CCCCACCTTC TGCTGTGCTG CCGTGpCCT AACAGGTCAC 60
AGACTGGTAC TGGTCAGTGG CCTGGGGGp GGGGACCTCT ApATATGGG ATACAAATp 120
AGGAGpGGA ApGACACGA pTAGTGACT GATGGGATAT GGGTGGTAAA TGGCTA 176
(2) INFORMATION FOR SEQ ID NO: 208:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 196 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:
AGACTGACTC ATGTCCCCTA TpAACAGGG TCTCTAGTGC TGTGAAAAAA AAAAATGCTG 60
AACApGCAT ATAACpATA pGTAAGAAA TACTGTACAA TGACTpAp GCATCTGGCT 120
AGCTGTAAGG CATGAAGGAT GCCAAGAAGT pAAGGAATA TGGGTGOTAA ATGGCTAGGG 180
GACATGAGTC AGTCTA 196
(2) INFORMATION FOR SEQ ID NO: 209:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 345 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:
GACGCTTGGC CACpGACAC CpTTApp pAAGGApC pAAGTCAp TANGTNACp 60
TGTAAGpp TCCTGTGCCC CCATAAGAAT GATAGCpTA AAAApATGC TGGGGTAGCA 120
AAGAAGATAC pCTAGCTp AGAATGTGTA GGTATAGCCA GGApCpGT GAGGAGGGGT 180
GATTTAGAGC AAApTCTTA pCTCCpGC CTCATCTGTA ACATGGGGAT AATAATAGAA 240
CTGGCpGAC AAGGpGGAA pAGTApAC ATGGTAAATA CATGTAAAAT GTpAGAATG 300
GTGCCAAGTA TCTAGGAAGT ACTTGGGCAT GGGTGGTAAA TGGCT 345
(2) INFORMATION FOR SEQ ID NO: 210:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 178 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:
GACGCTTGGC CACpGACAC TAGAGTAGGG pTGGCCAAC TTTpCTATA AAGGACCAGA 60
GAGTAAATAT pCAGGCTp GTGGGpGTG CAGTCTCTCT TGCAACTACT CAGCTCTGCC 120
ApGTAGCAT AGAAATCAGC CATAGACAGG ACAGAAATGA ATGGGTGGTA AATGGCTA 178
(2) INFORMATION FOR SEQ ID NO: 211:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 454 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211:
TGGGCACCp CAATATCTAT CCAGCGCATC TAAApCGCT ppTCTTGA ITAAAAApT 60
CACCACTTGC TGTTTpGCT CATGTATACC AAGTAGCAGT GGTCTGAGGC CATGCTTGp 120
ppGApCG ATATCAGCAC CGTATAAGAG CAGTGCTpG GCCApAAp TATCTTCAp 180
GTAGACAGCA TAGTGTAGAG TGGTATCTCC ATACTCATCT GGAATATpG GATCAGTGCC 240
ATGpCCAGC AACApAACG CACApCATC pCCTGGCAT TGTACGGCCT pGTCAGAGC 300
TGTCCTCTp pGpGTCAA GGACApAAG pGACATCGT CTGTCCAGCA CGAGTpTAC 360
TACpCTGAA pCCCApGG CAGAGGCCAG ATGTAGAGCA GTCCTCTTp GCpGTCCCT 420
CTTGpCACA TCAGTGTCCC TGAGCATAAC GGAA 454
(2) INFORMATION FOR SEQ ID NO: 212:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 337 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:
TCCGpATGC CACCCAGAAA ACCTACTGGA GpACTTAp AACATCAAGG CTGGAACCTA 60
pTGCCTCAG TCCTATCTGA pCATGAGCA CATGGpAp ACTGATCGCA pGAAAACAT 120
TGATCACCTG GGTpcpTA pTATCGACT GTGTCAT6AC AAGGAAACp ACAAACTGCA 180
ACGCAGAGAA ACTApAAAG GTApCAGAA ACGTGAAGCC AGCAApGp TCGCAApCG 240
GCAppGAA AACAAApTG CCGTGGAAAC TpAApTGT TCTTGAACAG TCAAGAAAAA 300
CApApGAG GAAAApAAT ATCACAGCAT AACGGAA 337
(2) INFORMATION FOR SEQ ID NO: 213:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 715 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:
TCGGGT6ATG CCTCCTCAGG CATCTTCCAT CCATCTCTTC AAGApAGCT GTCCCAAATG 60
ppTCCpC TCpcpTAC TGATAAATp GGACTCCpC pGACACTGA TGACAGCTp 120
AGTATCCpC pGTCACCp GCAGACpTA AACATAAAAA TACTCApGG ppAAAAGG 180
AAAAAAGTAT ACApAGCAC TApAAGCp GGCCpGAAA CAppCTAT CTpTApAA 240
ATGTCGGpA GCTGAACAGA ApCA TTA CAATGCAGAG TGAGAAAAGA AGGGAGCTAT 300
ATGCATpGA GAATGCAAGC ApGTCAAAT A CATTíTA AATGCTpCT TAAAGTGAGC 360
ACATACAGAA ATACApAAG ATApAGAAA GTGTTTpGC pGTGTACTA CTAApAGGG 420
AAGCACCTTG TATAGpCCT CpCTAAAAT TGAAGTAGAT pTAAAAACC CATGTAATTT 480
AApGAGCTC TCAGpCAGA TpTAGGAGA AppAACAG GGApTGGp pGTCTAAAT 540
pTGTCMp TNpTAGpA ATCTGTATAA TTGTATAAAT GTCAAACTGT AIGTACTCCG 600
TTpCATGCT GCTATGAAAG AAATACCCAN GACAGGGpA pTATAAANG GAAAGANGp 660
(2) INFORMATION FOR SEQ ID NO: 214:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 345 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214:
GGGApCGGG TGATGCCTCC 60
TCAGGCCCAC pGGGCCTGc T? TCCCAAA TGGCAGCTCC TCTGGACATG CCApCCTTC 120
TCCCACCTGC CTGApcpc ATATGpGGG TGTCCCTGp pTCTGGTGC TATTTCCTGA 180
CTGCTGpCA GCTGCCACTG TCCTGCAAAG CCTGCCTTp TAAATGCCTC ACCApCCTT 240
CATpG rC pAAATATGG GAAGTGAAAG TGCCACCTGA GGCCGGGCAC AGTGGCTCAC 300
CCAGCACTp GGGAGCCTGA GGAGGCATCA CCCGA 345
(2) INFORMATION FOR SEQ ID NO: 215:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215:
ATCCTTCTGA CCT? TGGGT pTAAGCAGG AGGTGTCAGA AAAGpACCA CAGGGATAAC 180
TGGCTTGTGG CGGCCAAGCG pCATAGCGA CGTCGCTpT TGATCCTTCG ATGTCGGCTC 240
pCCTATCAT TGTGAAGCAG AApCACCAA GCGpGGAp GpCACCCAC TAATAGGGAA 300
CGTGAGCTGG GpTAGACCG TCGTGAGACA GGpAGpiT ACCCTACTGA TGATGTGTKG 360
pGCCATGGT AATCCTGCTC AGTACGAGAG GAACCGCAGG pCASACAp TGGTGTATGT 420
GCpGCCTT 429
(2) INFORMATION FOR SEQ ID NO: 216:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 593 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216:
TGACACCTAT GTCCNGCATC TGpCACAGT pCCACAAAT AGCCAGCCTT TGGCCACCTC 60
TCTCTCCTGA GGTATACAAG TATATCAGGA GGTGTATACC pCTCTTCTC pCCCCACCA 120
AAGAGAACAT GCAGGCTCTG GAAGCTGTCT TAGGAGCCTT TGGGCTCAGA ATpCAGAGT 180
CpGGGTACC pGGATGTGG TCTGGAAGGA GAAACATTGG CTCTGGATAA GGAGTACAGC 240
CGGAGGAGGG TCACAGAGCC CTCAGCTCAA GCCCCTGTGC CTTAGTCTAA AAGCAGCTH 300
GGAT6AGGAA GCAGGpAAG TAACATACGT AAGCGTACAC AGGTAGAAAG TGCTGGGAGT 360
CAGAApGCA CAGTGTGTAG GAGTAGTACC TCAATCAATG AGGGCAAATC AACTGAAAGA 420
AGAAGACCNA pAATGAAp GCpANGGGG AAGGATCAAG GCTATCATGG AGATCTpCT 480
AGGAAGApA pGpTANAA pATGAAAGG ANTAGGGCAG GGACAGGGCC AGAAGTANAA 540
GANAACApG CCTATANCCC pGTCpGCA CCCAGATGCT GGACAAGGTG TCA 593
(2) INFORMATION FOR SEQ ID NO: 217:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 335 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217:
TGACACCTTG TCCAGCATCT GACGTGAAGA TGAGCAGGTC AGAGGAGGTG TCCTGGApT 60
CCTGGpCTG TGGGCTCCGT GGCAATGAAT TCTTCTGTGA AGTGGATGAA GACTACATCC 120
AGGACAAAp TAATCpACT GGACTCAATG AGCAGGTCCC TCACTATCGA CAAGCTCTAG 180
ACATGATCp GGACCTGGAG CCTGATGAAG AACTGGAAGA CAACCCCAAC CAGAGTGACC 240
TGApGAGCA GGCAGCCGAG ATGCTpATG GApGATCCA CGCCCGCTAC ATCCpACCA 300
ACCGTGGCAT CGCCCAGATG CTGGACAAGG TGTCA 335
(2) INFORMATION FOR SEQ ID NO: 218:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 248 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218:
TCAAAGACTA 60
TGTATGAAAT GGGACTGTAA GTACAGAGGG AAGGGTGGCC CpATCGCCA GAAGpGGTA 120
GATGCGTCCC CCTCATGAAA TGpCTGTCA CTGCCCGACA pTGCCGAAT TACTGAAAp 180
CCGTAGAAp AGTGCAAAp CTAACGpGT TCATCTAAGA pATGGpCC ATGpTCTAG 240
TAC? TTA 248
(2) INFORMATION FOR SEQ ID NO: 219:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 530 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219:
TGACGCpGG CCACpGACA CAAGTAGGGG ATAAGGACAA AGACCCATNA GGTGGCCTGT 60
CAGCCpTTG pACTGpGC pCCCTGTCA CCACGGCCCC CTCTGTAGGG GTGTGCTGTG 120
CTCTGTGGAC ApGGTGCAT pTCACACAT ACCApCTCT pCTGCTTCA CAGCAGTCCT 180
GAGGCGGGAG CACACAGGAC TACCpGTCA GATGANGATA ATGATGTCTG GCCAACTCAC 240
CCCCCAACCT TCTCACTAGT TATANGAAGA GCCANGCCTA NAACCTTCTA TCCTGNCCCC 300
pGCCCTATG ACCTCATCCC TGpCCATGC CCTApCTGA pTCTGGTGA ACTpGGAGC 360
AGCCTGGTp NTCCTCCTCA CTCCAGCCTC TCTCCATACC ATGGTANGGG GGTGCTCTTC 420
CACNCAAANG GTCAGGTGTG TCTGGGGAAT CCTNANANCT GCCNGGAGp TCCNANGCAT 480
TCpAAAAAC CpcpGCCT AATCANATNG TGTCCAGTGG CCAACCNTCN 530
(2) INFORMATION FOR SEQ ID NO: 220:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 531 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220:
TCTTCTAAAG GCCTGA? CA GAG? GTGGA 60
AAA? CTCCC AGTGTCAGGG A? GTCAGGA ACAGGGCTGC TCCTGTGCTC ACT? ACCTG 120
CTGTGTGGCT GCTGGAAAAG GAGGGAAGAG GAATGGCTGA ITG? ACCTA ATGTCTCCCA 180
GTT? TCATA pcpcpGG ATCCTCTTCT CTGACAACTG pcccppG GTcpcpcT 240
TCpGCTCAG AGAGCAGGTC TCpTAAAAC TGAGAAGGGA GAATGAGCAA ATGApAAAG 300
AAAACACACT TCTGAGGCCC AGAGATCAAA TApAGGTAA ATACTAAACC GCpGCCTGC 360
TGTGGTCACT pTCTCCTCT pCACATGCT CTATCCCTCT ATCCCCCACC TApCATATG 420
GCTpTATCT GCCAAGpAT CCGGCCTCTC ATCAACCpC TCCCCTAGCC TACTGGGGGA 480
TATCCATCTG GGTCTGTCTC TGGTGTApG GTCTCAAGTG GCCAAGCGTC A 531
(2) INFORMATION FOR SEQ ID NO: 221:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 530 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:
ApGACGCp GGCCACp & CACCCGCCTG CCTGCAATAC TGGGGCAAGG GCCpCACTG 60
CpTCCTGCC ACCAGCTGCC ACTGCACACA GAGATCAGAA ATGCTACCAA CCAAGACTGT 120
TGGTCCTCAG CCTCTCTGAG GAGAAAGAGC AGAAGCCTGG AAGTCAGAAG AGAAGCTAGA 180
TCGGCTACGG CCTTGGCAGC CAGCpCCCC ACCTGTGGa ATAAAGTCGT GaTGGCTTA 240
ACAATGGGGG CACCTCCTGA GAAACACAp GpAGGCAAT TCGGCGTGTG pCATCAGAG 300
CATATpAa CAAACCTCGA TAGTGCAGCC TACTATCaC TApGCTCCT ACGCTGCAAA 360
CCTGAACAGC ATGGGACTGT ACTGAATACT GGAAGCAGCT GGTGATGGTA CpApTGTG 420
TATCTAAAa CAGAGAAGGT ACAGTAAGAA TATGGTATa TAAACpAa GGGACCGCCA 480
TCCTATATGC AGTCTGpGT GACCAAAATG TGTCAAGTGG CCAAGCGTCA 530
(2) INFORMATION FOR SEQ ID NO: 222:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 578 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222:
ACCTGGpCA 60
CTGAAAGGCG aTCTCCCTC CCCGCGTCGC CCTGAAGCAG GGGGAGGACT TCGCCCAGCC 120
AAGGCAGpG TATGAGTTp AGCTGCGGCA CpCGAGACC TCTGAGCCCA CCTCCpCAG 180
GAGCCpCCC CGApAAGGA AGCCAGGGTA AGGApCCTT CCTCCCCCAG ACACCACGAA 240
CAAACCACCA CCCCCCCTAT TCTGGCAGCC CATATACATC AGAACGAAAC AAAAATAAa 300
AATAAACNAA AACCAAAAAA AAAAGAGAAG GGGAAATGTA TATGTCTGTC aTCCTGpG 360
CpTAGCCTG TCAGCTCCTA NAGGGCAGGG ACCGTGTCp CCGAATGGTC TGTGCA6CGC 420
CGACTGCGGG AAGTATCGGA GGAGGAAGCA GAGTCAGCAG AAGpGAACG GTGGGCCCGG 480
CGGCTCpGG GGGCTGGTGT TGTACpCGA GACCGCpTC GCTTTpGTC TTAGApTAC 540
GT TGCTCp TGGAGTGGGA NACCACTACN TCNATACA 578
(2) INFORMATION FOR SEQ ID NO: 223:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 578 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223:
TGTATCGACG TAGTGGTCTC CTCpGCAAA GGACTGGCTG GTGAATGGp TCCCTGAAp 60
ATGGACTTAC CCTAAAaTA TCTTATaTC ApACCAGp GCAAAATAp AGAATGTGp 120
GTCACTGTp apTGApC CTAGAAGGp AGTCTTAGAT ATGpACTp AACCTGTATG 180
CTGTAGTGCT pGAATGaT ppTGpTG ppTGp TGCCCAACCT GTCAApATA 240
GCTGCpAGG TCTGGACTGT CCTGGATAAA GCTGpAAAA TApCACCAG TCCAGCCATC 300
pACAAGCTA ApAAGTCAA CTAAATGCTT CCTTGppG CaGACpGT TATGTCAATC 360
CTCAATpCT GGGpapr TGGGTGCCCT AAATCTTAGG GTGTGAcpr cpAGaTcc 420
TGTAAaTCC ApCCCAAGC AAGaCAACT TaaTAATA CTpCCAGAA Gp pGCT 480
GAAGCCpTC CpaCCCAG CGGAGCAACT TGATpTCTA CAACpCCCT aTCAGAGCC 540
ACAAGAGTAT GGGATATGGA GACaCTACG TCGATAa 578
(2) INFORMATION FOR SEQ ID NO: 224:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 345 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224:
CaCTCCCAG 60
GTGGATcpr pcpTATAC pAcpap AGGTpcTCT TApcAAGAA GTGTAGTGGT 120
AAAAGTCTp TCAATCTAa TGGpAAATA ATGATAGCCT GGGAAATAAA TAGAAATpT 180
pcpraTc rAGGpGA ATAAAGAAAC AGAAAAAATA GAAaTACTG AAAATAATCT 240
AAGpCCAAC aTAGAAGAA CTGCA6AA6A AATGAAGAAA GTGATGATGA THAGATm 300
GATApGAp TAGAAGAaC AGGAGGAGAC aCTACGTCG ATAa 345
(2) INFORMATION FOR SEQ ID NO: 225:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 347 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225:
TGTATCGACG TAGTGGTCTC CAAACTGAGG TATGTGTGCC ACTAfíCACAC AAAGCCTTCC 60
AACAGGGACG aGG OGG CAGpTAAAG GGAATCTGp TCTAAApAA pTC CCTT 120
CTCTAAGTAT TCTpCCTAA AACTGATCAA GGTGTGAAGC CTGTGCTCp TCCCAACTCC 180
CCpTGACAA OGCCpCAA CTAACACAAG AAAAGGaTG TCTGAaCTC pCCTGAGTC 240
TGACTCTGAT ACGpGpCT GATGTCTAAA GAGCTCCAGA ACACCAAAGG GACAApCAG 300
AATGCTGGTG TATAACAGAC TCCAATGGAG ACaCTACGT CGATAa 347
(2) INFORMATION FOR SEQ ID NO: 226:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 281 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226:
AGGNGNGGGA NTGTATCGAC CTAGTGCTCT CCCAACAGTC TGT p G TCTGCAGGTG 60
TCAGTGTTp GGACAATGAG GCACapGT cpApGA CTCCTCAGCT CTAAATGCTG 120
AAApAAATC pGTaTaC AAGTCTGGAA pCCTGATGA GGTTpACAA AGTATpTGG 180
ATCAATACTC CAACAAATCA GAAAGCaGA AAGAGGATCC TTTCAATAp GaGAACaC 240
GAGTGGApT AaaCCTa GGAGACCACT ACGTCGATAC A 281
(2) INFORMATION FOR SEQ ID NO: 227: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 3646 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227:
GGGAAAaCT TCCTCCaGC CTTGTAAGGG pGGAGCCCT CTCCAGTATA TGCTGaGAA 60
TTTpCTCTC GGTpCTaG AGGApATGG AGTCCGCCp AAAAAAGGCA AGCTCTGGAC 120
ACTCTGCAAA GTAGAATGGC CAAAGTpGG AGpGAGTGG CCCCpGAAG GGTaCTGAA 180
CCTaCAAp GpCAAGCTG TGTGGCGGGT TGpACTGAA ACTCCCGGCC TCCCTGATa 240
GTpCCCTAC ApGATCAAT GGCTGAGTp GGTCAGGAGC ACCCCTTCCG TGGCTCaCT 300
TG c T TaTAApp ACCTCCAAGG TCCTCCTGAG CCAGACCGTG ppcGCCTC 360
GACCCTCAGC CGGpCGGCT CGCCCTGTAC TGCCTCTCTC TGAAGAA6AG GAGAGTCTCC 420
CTCACCCAGT CCaCCGCCT TAAAACCAGC CTACTCCCTT AGGGTaTCC aTGTCTCCT 480
CGGCTATGTC CCCTGTAGGC TaTOCCa pGCCTCTTG GpGCAACCG TGGTGGGAGG 540
AACTAGCCCC TCTACTACa CTGAGAGAGG aCAAGTCCC TCTGGGTGAT GAGTGCTCa 600
CCCCCpCCT GGpTATGTC CCTTCpTCT ACpCTGACT TGTATAApG GAAAACCaT 660
AATCCTCCCT TCTCTGAAAA GCCCCAGGCT pGACCTCAC TGATGGAGTC TGTACTCTGG 720
ACAapGGC CaCCTGGGA TGACTGTCAA CAGCTCCTIT TGACCCTpT aCCTCTGAA 780
aGAGGGAAA GTATCCAAAG AGAGGCCAAA AAGTACAACC TaaTOAC CAATAGGCCG 840
GAGGAGGAAG CTAGAGGAAT AGT pAa GACCCAApG GGACCTAAp GGGACCCAAA 900
pTCTCAAGT GGAGGGAGAA CTpTGACGA TTTCaCCGG TATCTCCTCG TGGGTApCA 960
GGGAGCTGCT CAGAAACCTA TAAACpGTC TAAGGCGACT GAAGTCGTCC AGGGGaTGA 1020
TaGTCAc GGAGTGG? T TAGAGCACCT CCAGGAGGCT TATCAGATp Aaccccpr 1080
TaCCTGG GCCCCCGAAA ATAGCaTGC TCpAApTG GaTTTGTGG CTCAGGCAGC 1140
CCCAGATAGT AAAAGGAAAC TCCAAAAACT AGAGGGATTT TGCTGGAATG AATACCAGTC 1200
AGCTpTAGA GATAGCCTAA AAGGTmTG AaGTCAAGA GGpGAAAAA CAAAMCAAG 1260
CAGCTCAGGC AGCTGAAAAA AGCaCTGAT AAAGaTCCT GGAGTATaG AGpTACTGT 1320
TAaT GCC TCATpaCT TCCCCTCCa aTGGTGTp AAATCaGCT AaCTACTTC 1380
CTGACTCAAA CTCaCTAp CCTGpaTG ACTGTCAGGA ACTGpGGAA ACTACTGAAA 1440
CTGGCCGACC TGATCTTCAA AATGTGCCCC TAGGAAAGGT GGATGCCACC ATGpaaG 1500
AaGTAGaG CpCCTCGAG AAGOaCTAC GAAAGGCCGG TGCAGCTGp ACaTGGAGA 1560
CAGATGTGp GTGGGCTCAG GCTpACCAG OAAaCCTC AGaCAAAAG GCTGAApa 1620
TCGCCCTaC TCAGGCTCTC CGATGGGGTA AGGATApAA COTTV CT GACAGCAGGT 1680
ACGCCTpGC TACTGTGaT GTACGTGGAG CaTCTACa GGAGCGTGGG CTACTCACCT 1740
CAGaGGTGG CTGTAATCa CTGTAAAGGA aTOAAAGG AAAAaCGGC TGpGCCCGT 1800
GGTAACCAa AAGCTWTC AGCAGCTCAA aTGaGTGT GACTpCAGT aCGCCTCTA 1860
AAcpGCTGC caaGTCTc ctpcaoG COGATCTGC CTGACAATCC CGaTACTa 1920
ACAGAAGAAG AAAACTGGCC TCAGAACTa GAGCCAATAA AAATCAGGAA GGpGGTGa 1980
pcpccrra CTCTAGAATC paTACCCC GAActcp &G GAAAAc rA ATCAGTCACC 2040
TAaGTCTAC aCCaTTTA GGAGGAGCAA AGCTACCTa GCTCCTCCGG AGCCGTpTA 2100
AaYcccca TC? OAAGC CTAAaaTC AAGCAGCTCT CCGGTGaa ACCTGCGCCC 2160
AGGTAAATGC CAAAAAAGGT CCTAAACCa GCCaGGCa CCGTCTCCAA GAAAACTaC 2220
GaGAAAA GTGGGAAAp ac TAaG AAGTAAAACC AaCCGGGCT GGGTACAAAT 2280
ACCpCTAGT ACTGGTAaC ACCTTCTCTG aTGGACTa AGaiTTGCT ACCAAAAACG 2340
AAACTGTCAA TATGGTAGp AAGppTAC TCAATGAAAT aTCCCTCa aTGGGCTGC 2400
CTGlTTGCa TAGGGTCTa TAATGaCCG GCCpCGCCT TGTCTATAGT 1TAGTCAGTC 2460
AGTAAGGCGT TAAAapCA ATGGAAGCTC apGTGCCT ATCGACCCa aGCTCTGGG 2520
CAAGTAGAAC GaTGAACTG CACCCTAAAA AAaCTCpA CAAAApAAT CpAGAAACC 2580
GGTGTAAAp GTGTAAGTCT CCTTCCTpA GCCCTACTTA GAGTAAGGTG CACCCCpAC 2640
TGGGCTGGGT TCTTACCTp TGAAATaTG TATGGGAGGG TGCTGCCTAT CTTGCCTAAG 2700
CCCAApGGC AAAAATATa OAACTAAp TApACAGTA CCTACAGTCT 2760
CCCCAACAGG TACAAaTAT aTCCTGCa CTTCTTCaG GAACCaTCC CAATCCAAp 2820
CCTGAACAGA CAGGGCCCTG C papC CCGCCAGGTG ACCTGpGp TGpAAAAAG 2880
pCCAGAGAG AAGGACTCCC TCCTGCTTGG AAGAGACCTC AaCCGT T CaTGC 2940
ACGGCTCTGA AGGTGaTGG apCCTGCG TGGApaTC ACTCCCGaT CAAAAAGGCC 3000
AACAGAGCCC AACTAGAAAC ATGGGTCCCC AGGGCTGGCT CAGGCCCCp AAAACTGCAC 3060
CTAAGpGGG TGAAGCap AapAApC pTCpAA ppCTAAAA CAATGaTAG 3120
CpCTGTCAA ACTTATGTAT CTTAAaCTC AATATAACCC CCpGpATA ACTGAGGAAT 3180
CAATaTpG ApCCCCCAA AAAaOAGT GGGGAATGTA GTGTCCAACC TGGTTpTAC 3240
TAACccTGp pTAacTCT cccrpccp TAATaar GcpGprcc AccTG pG 3300
ACTCTcccp AGCTAA GC Gca TG CTcaTcpG GCTCTGTCAC TGGCAGCCGC 3360
pCCTCAAGG ACpAACpG TGCAAGCTa CTCCCAGaC ATCCAAGAAT GCAApAACT 3420
GATAAaTAC TGTGGCAAGC TATATCCGa GpCCCAGGA ApCCTCCAA p Ta AG 3480
CCCCTCTACC CpCAGCAAC acaCCCTG ATCAGTaGC AGCaTCAGC ACCGAGGCAA 3540
GGCCCTCaC CAGOAAAAG ApCTGACTC ACTGAAaCT TGGATaTCA pAGTATTIT 3600
TAGCAGTAAA GH I H U I I CppTCTp CT I H U ICT CGTGCC 3646
Claims (17)
- 1. An isolated DNA molecule, comprising: (a) a nucleotide sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3 - SEQ ID NO: 77 and SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219, 221-227; (b) a variant of said nucleotide sequence containing one or more substitutions, deletions, insertions and/or modifications of nucleotides in no more than 20% of the nucleotide positions, so that the antigenic and/or immunogenic properties of the polypeptide encoded by the nucleotide sequence are retained; or (c) a nucleotide sequence encoding an epitope of a polypeptide encoded by at least one sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3 - SEQ ID NO: 77 and SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219, 221-227.
- 2. An isolated DNA molecule encoding an epitope of a polypeptide, wherein the polypeptide is encoded by a nucleotide sequence that: (a) hybridizes under stringent conditions to a sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3 - SEQ ID NO: 77 and SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219, 221-227; and (b) is at least 80% identical to a sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3 - SEQ ID NO: 77 and SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219, 221-227; and wherein the RNA corresponding to the nucleotide sequence is expressed at a higher level in human breast tumor tissue than in normal breast tissue.
- 3. An isolated DNA molecule that encodes an epitope of a polypeptide, wherein the polypeptide is encoded by: (a) a nucleotide sequence transcribed from the sequence of SEQ ID NO: 141; or (b) a variant of the nucleotide sequence containing one or more substitutions, deletions, insertions and/or nucleotide modifications in no more than 20% of the nucleotide positions, so that the antigenic and/or immunogenic properties of the polypeptide encoded by the nucleotide sequence are retained.
- 4. An isolated DNA or RNA molecule comprising a nucleotide sequence complementary to a DNA molecule according to any one of claims 1-3.
- 5. A recombinant expression vector comprising a DNA molecule according to any of claims 1-3.
- 6. A host cell transformed or transfected with the expression vector according to claim 5.
- 7. A polypeptide comprising an amino acid sequence encoded by a DNA molecule according to any of claims 1-3.
- 8. A polypeptide according to claim 7, wherein the polypeptide comprises an epitope of an amino acid sequence encoded by at least one nucleotide sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3 - SEQ ID NO: 77 and SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219, 221-227.
- 9. A monoclonal antibody that binds to a polypeptide according to claim 7.
- 10. A method for determining the presence of breast cancer in a patient, comprising detecting, within a biological sample, at least one polypeptide according to claim 7, and thereby determining the presence of breast cancer in the patient.
- 11. A method for determining the presence of breast cancer in a patient, comprising detecting, within a biological sample, at least one polypeptide encoded by a nucleotide sequence selected from the group consisting of SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, and sequences that hybridize thereto under stringent conditions.
- 12. The method of claim 10 or 11, wherein the biological sample is a portion of a breast tumor.
- 13. The method of claim 10, wherein the detection step comprises contacting the biological sample with a monoclonal antibody according to claim 9.
- 14. The method of claim 11, wherein the detection step comprises contacting the biological sample with a monoclonal antibody that binds to a polypeptide encoded by a nucleotide sequence selected from the group consisting of SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, and sequences that hybridize thereto under stringent conditions.
- 15. A method for determining the presence of breast cancer in a patient, comprising detecting, within a biological sample, an RNA molecule encoding at least one polypeptide according to claim 7, and thereby determining the presence of breast cancer in the patient.
- 16. A method for determining the presence of breast cancer in a patient, comprising detecting, within a biological sample, at least one RNA molecule encoding a polypeptide encoded by a nucleotide sequence selected from the group consisting of SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, and sequences that hybridize thereto under stringent conditions; and thereby determining the presence of breast cancer in the patient.
- 17. The method of claim 15 or 16, wherein the biological sample is a portion of a breast tumor.
- 18. The method of claim 15, wherein the detection step comprises: (a) preparing cDNA from RNA molecules within the biological sample; and (b) specifically amplifying cDNA molecules that are capable of encoding at least a portion of a polypeptide according to claim 7, and thereby determining the presence of breast cancer in the patient.
- 19. The method of claim 16, wherein the detection step comprises: (a) preparing cDNA from RNA molecules within the biological sample; and (b) specifically amplifying cDNA molecules that are capable of encoding at least a portion of a polypeptide encoded by a nucleotide sequence selected from the group consisting of SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, and sequences that hybridize thereto under stringent conditions; and thereby determining the presence of breast cancer in the patient.
- 20. A polypeptide according to claim 7, for use within a method for detecting the presence of breast cancer in a patient.
- 21. A polypeptide encoded by a nucleotide sequence selected from the group consisting of SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, and sequences that hybridize thereto under stringent conditions, for use within a method for detecting the presence of breast cancer in a patient.
- 22. A method for monitoring the progression of breast cancer in a patient, comprising: (a) detecting an amount, in a biological sample, of at least one polypeptide according to claim 7 at a first point in time; (b) repeating step (a) at a subsequent point in time; and (c) comparing the amounts of polypeptides detected in steps (a) and (b) and thus monitoring the progression of breast cancer in the patient.
- 23. A method for monitoring the progression of breast cancer in a patient, comprising: (a) detecting in a biological sample an amount of at least one polypeptide at a first point in time, the polypeptide being encoded by a nucleotide sequence selected from the group consisting of SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, and sequences that hybridize thereto under stringent conditions; (b) repeating step (a) at a subsequent point in time; and (c) comparing the amounts of polypeptides detected in steps (a) and (b) and thus monitoring the progression of breast cancer in the patient.
- 24. The method of claim 22 or 23, wherein the biological sample is a portion of a breast tumor.
- 25. The method of claim 22, wherein the detection step comprises contacting a portion of the biological sample with a monoclonal antibody according to claim 9.
- 26. The method of claim 23, wherein the detection step comprises contacting the biological sample with a monoclonal antibody that binds to a polypeptide encoded by a nucleotide sequence selected from the group consisting of SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, and sequences that hybridize thereto under stringent conditions.
- 27. The method of any of claims 20 or 22, wherein the polypeptide comprises an epitope of an amino acid sequence encoded by at least one nucleotide sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3 - SEQ ID NO: 77 and SEQ ID NOS: 142, 143, 146-152, 154-166, 168-176, 178-192, 194-198, 200-204, 206, 207, 209-214, 216, 218, 219, 221-227.
- 28. A method for monitoring the progression of breast cancer in a patient, comprising: (a) detecting an amount, in a biological sample, of at least one RNA molecule encoding a polypeptide according to claim 7 at a first point in time; (b) repeating step (a) at a subsequent point in time; and (c) comparing the amounts of RNA molecules detected in steps (a) and (b) and thus monitoring the progression of breast cancer in the patient.
- 29. The method of claim 28, wherein the detection step comprises: (a) preparing cDNA from RNA molecules within the biological sample; and (b) specifically amplifying cDNA molecules that are capable of encoding at least a portion of a polypeptide according to claim 7.
- 30. A method for monitoring the progression of breast cancer in a patient, comprising: (a) detecting an amount within a biological sample of at least one RNA molecule at a point in time, the RNA molecule encoding a polypeptide encoded by a nucleotide sequence selected from the group consisting of SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, and sequences that hybridize thereto under stringent conditions; (b) repeating step (a) at a subsequent point in time; and (c) comparing the amounts of RNA molecules detected in steps (a) and (b) and thus monitoring the progression of breast cancer in the patient.
- 31. A pharmaceutical composition, comprising a polypeptide according to claim 7 and a physiologically acceptable carrier.
- 32. A pharmaceutical composition for inhibiting the development of breast cancer, comprising a polypeptide and a physiologically acceptable carrier, the polypeptide being encoded by a nucleotide sequence selected from the group consisting of SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, and sequences that hybridize thereto under stringent conditions.
- 33. A vaccine, comprising a polypeptide according to claim 7 and an immune response enhancer.
- 34. A vaccine, comprising a DNA molecule according to any of claims 1-3.
- 35. A vaccine, comprising a recombinant expression vector comprising a DNA molecule according to any of claims 1-3.
- 36. A vaccine for inhibiting the development of breast cancer, comprising a polypeptide and an immune response enhancer, the polypeptide being encoded by a nucleotide sequence selected from the group consisting of SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220, and sequences that hybridize thereto under stringent conditions.
- 37. A diagnostic kit comprising: (a) one or more monoclonal antibodies according to claim 9; and (b) a detection reagent.
- 38. A diagnostic kit comprising: (a) one or more monoclonal antibodies that bind to a polypeptide encoded by a nucleotide sequence selected from the group consisting of the sequences provided in SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220; and (b) a detection reagent.
- 39. The kit of claim 37 or 38, wherein the monoclonal antibodies are immobilized on a solid support.
- 40. A diagnostic kit comprising a first polymerase chain reaction primer and a second polymerase chain reaction primer, the first and second primers each comprising at least about 10 contiguous nucleotides of an RNA molecule according to claim 4.
- 41. A diagnostic kit comprising a first polymerase chain reaction primer and a second polymerase chain reaction primer, the first and second primers each comprising at least about 10 contiguous nucleotides of an RNA molecule encoding a polypeptide encoded by a nucleotide sequence selected from the group consisting of SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220.
- 42. A diagnostic kit comprising at least one oligonucleotide probe, the oligonucleotide probe comprising at least 15 contiguous nucleotides of a DNA molecule according to claim 4.
- 43. A diagnostic kit comprising at least one oligonucleotide probe, the oligonucleotide probe comprising at least about 15 contiguous nucleotides of a DNA sequence selected from the group consisting of SEQ ID NO: 78-86 and SEQ ID NOS: 144, 145, 153, 167, 177, 193, 199, 205, 208, 215, 217, 220.
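The variant limitation in claims 1(b) and 3(b) (changes in "no more than 20% of the nucleotide positions") can be illustrated with a minimal sketch. This is not the method defined in the specification: it assumes, purely for illustration, that substitutions, deletions and insertions are counted as the edit (Levenshtein) distance between a reference and a candidate sequence, divided by the reference length. The function names `edit_distance`, `fraction_changed` and `within_variant_threshold` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-base substitutions, insertions and deletions
    needed to turn sequence a into sequence b (Levenshtein distance)."""
    prev = list(range(len(b) + 1))
    for i, base_a in enumerate(a, start=1):
        curr = [i]
        for j, base_b in enumerate(b, start=1):
            cost = 0 if base_a == base_b else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def fraction_changed(reference: str, candidate: str) -> float:
    """Edits per reference position (an illustrative assumption, not the
    patent's definition of the 20% limit)."""
    if not reference:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(reference.upper(), candidate.upper()) / len(reference)


def within_variant_threshold(reference: str, candidate: str,
                             max_fraction: float = 0.20) -> bool:
    """True if the candidate differs from the reference in no more than
    max_fraction of the reference positions under the assumption above."""
    return fraction_changed(reference, candidate) <= max_fraction
```

For example, `within_variant_threshold("AAAAAAAAAA", "AAAAAAAAAC")` counts one substitution in ten positions. In practice, percent identity is usually computed from a sequence alignment, and a different counting convention may give a different result.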
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US700014 | 1991-05-14 | | |
| US585392 | 1996-01-11 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| MXPA98005611A (en) | 1999-05-31 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| AU728777B2 (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| WO1997025426A9 (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| WO1998045328A2 (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| WO1997025431A1 (en) | | Compositions and methods for the treatment and diagnosis of cancer |
| US20030125536A1 (en) | | Compositions and methods for the therapy and diagnosis of breast cancer |
| US6225054B1 (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| US6828431B1 (en) | | Compositions and methods for the therapy and diagnosis of breast cancer |
| US6586570B1 (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| US6344550B1 (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| US20020068285A1 (en) | | Compositions and methods for the therapy and diagnosis of breast cancer |
| AU774824B2 (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| US6656480B2 (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| US6423496B1 (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| US20020165371A1 (en) | | Compositions and methods for the therapy and diagnosis of breast cancer |
| US20040073016A1 (en) | | Compositions and methods for the therapy and diagnosis of breast cancer |
| JPH11509730A (en) | | Early-onset Alzheimer's disease gene and gene product |
| MXPA98005611A (en) | | Compositions and methods for the treatment and diagnosis of m cancer |
| AU7150600A (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| MXPA99009237A (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| HK1018288A (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| US5644045A (en) | | X-linked adrenoleukodystrophy gene and corresponding protein |
| US6861506B1 (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| HK1047133A (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| HK1037880A (en) | | Compositions and methods for the treatment and diagnosis of breast cancer |
| JP2008289472A (en) | | Composition and method for treating and diagnosing breast cancer |