MXPA97002205A - Human galactokinase gene
- Publication number
- MXPA97002205A MXPA/A/1997/002205A MX9702205A
- Authority
- MX
- Mexico
- Prior art keywords
- sequence
- seo
- nucleic acid
- human
- dna
- Prior art date
Links
- 108090000623 proteins and genes Proteins 0.000 title claims abstract description 121
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 65
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 61
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 61
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 59
- 108700023157 Galactokinases Proteins 0.000 claims abstract description 47
- 102000048120 Galactokinases Human genes 0.000 claims abstract description 42
- 230000035772 mutation Effects 0.000 claims abstract description 33
- 101001024874 Homo sapiens Galactokinase Proteins 0.000 claims abstract description 29
- 108020004414 DNA Proteins 0.000 claims description 74
- 238000000034 method Methods 0.000 claims description 52
- 241000282414 Homo sapiens Species 0.000 claims description 50
- 239000000523 sample Substances 0.000 claims description 41
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 29
- 239000013598 vector Substances 0.000 claims description 26
- 230000014509 gene expression Effects 0.000 claims description 25
- 239000012634 fragment Substances 0.000 claims description 23
- 239000002773 nucleotide Substances 0.000 claims description 21
- 125000003729 nucleotide group Chemical group 0.000 claims description 21
- 230000007812 deficiency Effects 0.000 claims description 18
- 238000009396 hybridization Methods 0.000 claims description 14
- 238000012360 testing method Methods 0.000 claims description 14
- 108091032973 (ribonucleotides)n+m Proteins 0.000 claims description 12
- 208000027472 Galactosemias Diseases 0.000 claims description 11
- 201000007412 galactokinase deficiency Diseases 0.000 claims description 9
- 238000001502 gel electrophoresis Methods 0.000 claims description 7
- 230000002068 genetic effect Effects 0.000 claims description 7
- 238000000338 in vitro Methods 0.000 claims description 7
- 108091027305 Heteroduplex Proteins 0.000 claims description 6
- 230000029087 digestion Effects 0.000 claims description 6
- 239000013612 plasmid Substances 0.000 claims description 6
- 210000002966 serum Anatomy 0.000 claims description 6
- 230000009261 transgenic effect Effects 0.000 claims description 6
- 238000001712 DNA sequencing Methods 0.000 claims description 5
- 102000008394 Immunoglobulin Fragments Human genes 0.000 claims description 5
- 108010021625 Immunoglobulin Fragments Proteins 0.000 claims description 5
- 238000007894 restriction fragment length polymorphism technique Methods 0.000 claims description 5
- 239000000499 gel Substances 0.000 claims description 4
- 238000007901 in situ hybridization Methods 0.000 claims description 4
- 108091008146 restriction endonucleases Proteins 0.000 claims description 4
- 238000011084 recovery Methods 0.000 claims description 3
- 108020004705 Codon Proteins 0.000 claims description 2
- 241000124008 Mammalia Species 0.000 claims description 2
- 238000001962 electrophoresis Methods 0.000 claims description 2
- 229920002401 polyacrylamide Polymers 0.000 claims description 2
- 238000003757 reverse transcription PCR Methods 0.000 claims description 2
- 230000001225 therapeutic effect Effects 0.000 abstract description 11
- 108020004485 Nonsense Codon Proteins 0.000 abstract description 5
- 230000037434 nonsense mutation Effects 0.000 abstract description 3
- 210000004027 cell Anatomy 0.000 description 77
- 150000001413 amino acids Chemical class 0.000 description 28
- 108091026890 Coding region Proteins 0.000 description 23
- 230000000694 effects Effects 0.000 description 20
- 238000003752 polymerase chain reaction Methods 0.000 description 16
- 239000002299 complementary DNA Substances 0.000 description 14
- 238000001415 gene therapy Methods 0.000 description 14
- 230000001105 regulatory effect Effects 0.000 description 13
- 230000002950 deficient Effects 0.000 description 12
- 108090000765 processed proteins & peptides Proteins 0.000 description 12
- 239000000047 product Substances 0.000 description 11
- 201000010099 disease Diseases 0.000 description 10
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 10
- 229930182830 galactose Natural products 0.000 description 10
- 239000000203 mixture Substances 0.000 description 10
- 102000004196 processed proteins & peptides Human genes 0.000 description 10
- 150000001875 compounds Chemical class 0.000 description 9
- 239000013604 expression vector Substances 0.000 description 9
- 239000000243 solution Substances 0.000 description 9
- 241000700605 Viruses Species 0.000 description 8
- 230000008859 change Effects 0.000 description 8
- 239000003153 chemical reaction reagent Substances 0.000 description 8
- 239000008194 pharmaceutical composition Substances 0.000 description 8
- 102000004190 Enzymes Human genes 0.000 description 7
- 108090000790 Enzymes Proteins 0.000 description 7
- 108091029865 Exogenous DNA Proteins 0.000 description 7
- 241001465754 Metazoa Species 0.000 description 7
- 239000000427 antigen Substances 0.000 description 7
- 102000036639 antigens Human genes 0.000 description 7
- 108091007433 antigens Proteins 0.000 description 7
- 238000003780 insertion Methods 0.000 description 7
- 229920001184 polypeptide Polymers 0.000 description 7
- 230000010076 replication Effects 0.000 description 7
- 210000001519 tissue Anatomy 0.000 description 7
- 241000588724 Escherichia coli Species 0.000 description 6
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 6
- 210000000349 chromosome Anatomy 0.000 description 6
- 238000010367 cloning Methods 0.000 description 6
- 230000006870 function Effects 0.000 description 6
- 230000037431 insertion Effects 0.000 description 6
- 238000006467 substitution reaction Methods 0.000 description 6
- 108700028369 Alleles Proteins 0.000 description 5
- 241000894006 Bacteria Species 0.000 description 5
- 108091092195 Intron Proteins 0.000 description 5
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 5
- 108091034117 Oligonucleotide Proteins 0.000 description 5
- ZIFYDQAFEMIZII-GUBZILKMSA-N Ser-Leu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZIFYDQAFEMIZII-GUBZILKMSA-N 0.000 description 5
- 238000004458 analytical method Methods 0.000 description 5
- 238000012217 deletion Methods 0.000 description 5
- 230000037430 deletion Effects 0.000 description 5
- 239000003814 drug Substances 0.000 description 5
- 238000005516 engineering process Methods 0.000 description 5
- 101150045500 galK gene Proteins 0.000 description 5
- 210000004962 mammalian cell Anatomy 0.000 description 5
- 238000000746 purification Methods 0.000 description 5
- 239000000126 substance Substances 0.000 description 5
- 238000001890 transfection Methods 0.000 description 5
- 108020004635 Complementary DNA Proteins 0.000 description 4
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 4
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 4
- LLRJEFPKIIBGJP-DCAQKATOSA-N Gln-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N LLRJEFPKIIBGJP-DCAQKATOSA-N 0.000 description 4
- JQECLVNLAZGHRQ-CIUDSAMLSA-N Met-Asp-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O JQECLVNLAZGHRQ-CIUDSAMLSA-N 0.000 description 4
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 4
- XXNYYSXNXCJYKX-DCAQKATOSA-N Ser-Leu-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O XXNYYSXNXCJYKX-DCAQKATOSA-N 0.000 description 4
- LNYOXPDEIZJDEI-NHCYSSNCSA-N Val-Asn-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N LNYOXPDEIZJDEI-NHCYSSNCSA-N 0.000 description 4
- ZHQWPWQNVRCXAX-XQQFMLRXSA-N Val-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZHQWPWQNVRCXAX-XQQFMLRXSA-N 0.000 description 4
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 4
- 230000003321 amplification Effects 0.000 description 4
- 108010016616 cysteinylglycine Proteins 0.000 description 4
- 108010054813 diprotin B Proteins 0.000 description 4
- 210000002950 fibroblast Anatomy 0.000 description 4
- 239000002502 liposome Substances 0.000 description 4
- 210000004185 liver Anatomy 0.000 description 4
- 210000004072 lung Anatomy 0.000 description 4
- 239000002609 medium Substances 0.000 description 4
- 108020004999 messenger RNA Proteins 0.000 description 4
- 238000003199 nucleic acid amplification method Methods 0.000 description 4
- 239000013615 primer Substances 0.000 description 4
- 108010033670 threonyl-aspartyl-tyrosine Proteins 0.000 description 4
- 238000011282 treatment Methods 0.000 description 4
- HHGYNJRJIINWAK-FXQIFTODSA-N Ala-Ala-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N HHGYNJRJIINWAK-FXQIFTODSA-N 0.000 description 3
- YHKANGMVQWRMAP-DCAQKATOSA-N Ala-Leu-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YHKANGMVQWRMAP-DCAQKATOSA-N 0.000 description 3
- VJTWLBMESLDOMK-WDSKDSINSA-N Asn-Gln-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VJTWLBMESLDOMK-WDSKDSINSA-N 0.000 description 3
- 208000002177 Cataract Diseases 0.000 description 3
- 241000701959 Escherichia virus Lambda Species 0.000 description 3
- WVTIBGWZUMJBFY-GUBZILKMSA-N Glu-His-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O WVTIBGWZUMJBFY-GUBZILKMSA-N 0.000 description 3
- JUBDONGMHASUCN-IUCAKERBSA-N Gly-Glu-His Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O JUBDONGMHASUCN-IUCAKERBSA-N 0.000 description 3
- LXTRSHQLGYINON-DTWKUNHWSA-N Gly-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN LXTRSHQLGYINON-DTWKUNHWSA-N 0.000 description 3
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 3
- AAKRWBIIGKPOKQ-ONGXEEELSA-N Leu-Val-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 3
- AAORVPFVUIHEAB-YUMQZZPRSA-N Lys-Asp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O AAORVPFVUIHEAB-YUMQZZPRSA-N 0.000 description 3
- 102000008300 Mutant Proteins Human genes 0.000 description 3
- 108010021466 Mutant Proteins Proteins 0.000 description 3
- 238000000636 Northern blotting Methods 0.000 description 3
- DMKWYMWNEKIPFC-IUCAKERBSA-N Pro-Gly-Arg Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O DMKWYMWNEKIPFC-IUCAKERBSA-N 0.000 description 3
- MCWHYUWXVNRXFV-RWMBFGLXSA-N Pro-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 MCWHYUWXVNRXFV-RWMBFGLXSA-N 0.000 description 3
- 108010003201 RGH 0205 Proteins 0.000 description 3
- NUEHQDHDLDXCRU-GUBZILKMSA-N Ser-Pro-Arg Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NUEHQDHDLDXCRU-GUBZILKMSA-N 0.000 description 3
- DCLBXIWHLVEPMQ-JRQIVUDYSA-N Thr-Asp-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DCLBXIWHLVEPMQ-JRQIVUDYSA-N 0.000 description 3
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 3
- 230000000692 anti-sense effect Effects 0.000 description 3
- 230000001580 bacterial effect Effects 0.000 description 3
- 230000000875 corresponding effect Effects 0.000 description 3
- 230000003247 decreasing effect Effects 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 238000003745 diagnosis Methods 0.000 description 3
- 229940079593 drug Drugs 0.000 description 3
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 3
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 3
- 108010050848 glycylleucine Proteins 0.000 description 3
- 108010040030 histidinoalanine Proteins 0.000 description 3
- 210000004408 hybridoma Anatomy 0.000 description 3
- 238000002955 isolation Methods 0.000 description 3
- 210000003734 kidney Anatomy 0.000 description 3
- 238000004519 manufacturing process Methods 0.000 description 3
- 238000002844 melting Methods 0.000 description 3
- 230000008018 melting Effects 0.000 description 3
- 108010005942 methionylglycine Proteins 0.000 description 3
- 230000037230 mobility Effects 0.000 description 3
- 210000002826 placenta Anatomy 0.000 description 3
- 210000005059 placental tissue Anatomy 0.000 description 3
- 108010029020 prolylglycine Proteins 0.000 description 3
- 238000010839 reverse transcription Methods 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- 208000024891 symptom Diseases 0.000 description 3
- 238000003786 synthesis reaction Methods 0.000 description 3
- 238000013518 transcription Methods 0.000 description 3
- 230000035897 transcription Effects 0.000 description 3
- 238000013519 translation Methods 0.000 description 3
- 241000701161 unidentified adenovirus Species 0.000 description 3
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 3
- KVWLTGNCJYDJET-LSJOCFKGSA-N Ala-Arg-His Chemical compound C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N KVWLTGNCJYDJET-LSJOCFKGSA-N 0.000 description 2
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 2
- NLOMBWNGESDVJU-GUBZILKMSA-N Ala-Met-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NLOMBWNGESDVJU-GUBZILKMSA-N 0.000 description 2
- XAXHGSOBFPIRFG-LSJOCFKGSA-N Ala-Pro-His Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O XAXHGSOBFPIRFG-LSJOCFKGSA-N 0.000 description 2
- NLYYHIKRBRMAJV-AEJSXWLSSA-N Ala-Val-Pro Chemical compound C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N NLYYHIKRBRMAJV-AEJSXWLSSA-N 0.000 description 2
- NABSCJGZKWSNHX-RCWTZXSCSA-N Arg-Arg-Thr Chemical compound NC(N)=NCCC[C@@H](C(=O)N[C@@H]([C@H](O)C)C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N NABSCJGZKWSNHX-RCWTZXSCSA-N 0.000 description 2
- GOWZVQXTHUCNSQ-NHCYSSNCSA-N Arg-Glu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GOWZVQXTHUCNSQ-NHCYSSNCSA-N 0.000 description 2
- QADCERNTBWTXFV-JSGCOSHPSA-N Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCNC(N)=N)N)C(O)=O)=CNC2=C1 QADCERNTBWTXFV-JSGCOSHPSA-N 0.000 description 2
- DPSUVAPLRQDWAO-YDHLFZDLSA-N Asn-Tyr-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC(=O)N)N DPSUVAPLRQDWAO-YDHLFZDLSA-N 0.000 description 2
- VZNOVQKGJQJOCS-SRVKXCTJSA-N Asp-Asp-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VZNOVQKGJQJOCS-SRVKXCTJSA-N 0.000 description 2
- FMWHSNJMHUNLAG-FXQIFTODSA-N Asp-Cys-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FMWHSNJMHUNLAG-FXQIFTODSA-N 0.000 description 2
- JHFNSBBHKSZXKB-VKHMYHEASA-N Asp-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(O)=O JHFNSBBHKSZXKB-VKHMYHEASA-N 0.000 description 2
- CZECQDPEMSVPDH-MNXVOIDGSA-N Asp-Leu-Val-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O CZECQDPEMSVPDH-MNXVOIDGSA-N 0.000 description 2
- KGHLGJAXYSVNJP-WHFBIAKZSA-N Asp-Ser-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O KGHLGJAXYSVNJP-WHFBIAKZSA-N 0.000 description 2
- 241000283707 Capra Species 0.000 description 2
- 108091035707 Consensus sequence Proteins 0.000 description 2
- MBRWOKXNHTUJMB-CIUDSAMLSA-N Cys-Pro-Glu Chemical compound [H]N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O MBRWOKXNHTUJMB-CIUDSAMLSA-N 0.000 description 2
- 102000053602 DNA Human genes 0.000 description 2
- 241000255581 Drosophila <fruit fly, genus> Species 0.000 description 2
- 238000002965 ELISA Methods 0.000 description 2
- 102100031780 Endonuclease Human genes 0.000 description 2
- 108700024394 Exon Proteins 0.000 description 2
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 2
- LOJYQMFIIJVETK-WDSKDSINSA-N Gln-Gln Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LOJYQMFIIJVETK-WDSKDSINSA-N 0.000 description 2
- HPBKQFJXDUVNQV-FHWLQOOXSA-N Gln-Tyr-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O HPBKQFJXDUVNQV-FHWLQOOXSA-N 0.000 description 2
- DXVOKNVIKORTHQ-GUBZILKMSA-N Glu-Pro-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O DXVOKNVIKORTHQ-GUBZILKMSA-N 0.000 description 2
- IDEODOAVGCMUQV-GUBZILKMSA-N Glu-Ser-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IDEODOAVGCMUQV-GUBZILKMSA-N 0.000 description 2
- QXUPRMQJDWJDFR-NRPADANISA-N Glu-Val-Ser Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXUPRMQJDWJDFR-NRPADANISA-N 0.000 description 2
- AQLHORCVPGXDJW-IUCAKERBSA-N Gly-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN AQLHORCVPGXDJW-IUCAKERBSA-N 0.000 description 2
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 2
- 241000238631 Hexapoda Species 0.000 description 2
- 101000829171 Hypocrea virens (strain Gv29-8 / FGSC 10586) Effector TSP1 Proteins 0.000 description 2
- GRZSCTXVCDUIPO-SRVKXCTJSA-N Leu-Arg-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRZSCTXVCDUIPO-SRVKXCTJSA-N 0.000 description 2
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 2
- HUEBCHPSXSQUGN-GARJFASQSA-N Leu-Cys-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N1CCC[C@@H]1C(=O)O)N HUEBCHPSXSQUGN-GARJFASQSA-N 0.000 description 2
- KAFOIVJDVSZUMD-DCAQKATOSA-N Leu-Gln-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-DCAQKATOSA-N 0.000 description 2
- KAFOIVJDVSZUMD-UHFFFAOYSA-N Leu-Gln-Gln Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)NC(CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-UHFFFAOYSA-N 0.000 description 2
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 2
- APFJUBGRZGMQFF-QWRGUYRKSA-N Leu-Gly-Lys Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN APFJUBGRZGMQFF-QWRGUYRKSA-N 0.000 description 2
- AKVBOOKXVAMKSS-GUBZILKMSA-N Leu-Ser-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O AKVBOOKXVAMKSS-GUBZILKMSA-N 0.000 description 2
- GZRABTMNWJXFMH-UVOCVTCTSA-N Leu-Thr-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZRABTMNWJXFMH-UVOCVTCTSA-N 0.000 description 2
- GCMWRRQAKQXDED-IUCAKERBSA-N Lys-Glu-Gly Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)N[C@@H](CCC([O-])=O)C(=O)NCC([O-])=O GCMWRRQAKQXDED-IUCAKERBSA-N 0.000 description 2
- HAUUXTXKJNVIFY-ONGXEEELSA-N Lys-Gly-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAUUXTXKJNVIFY-ONGXEEELSA-N 0.000 description 2
- ATIPDCIQTUXABX-UWVGGRQHSA-N Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN ATIPDCIQTUXABX-UWVGGRQHSA-N 0.000 description 2
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 2
- YRAWWKUTNBILNT-FXQIFTODSA-N Met-Ala-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YRAWWKUTNBILNT-FXQIFTODSA-N 0.000 description 2
- ZBLSZPYQQRIHQU-RCWTZXSCSA-N Met-Thr-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O ZBLSZPYQQRIHQU-RCWTZXSCSA-N 0.000 description 2
- 241001529936 Murinae Species 0.000 description 2
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 2
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 2
- 101100068676 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) gln-1 gene Proteins 0.000 description 2
- 101710163270 Nuclease Proteins 0.000 description 2
- 241000283973 Oryctolagus cuniculus Species 0.000 description 2
- LZDIENNKWVXJMX-JYJNAYRXSA-N Phe-Arg-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC1=CC=CC=C1 LZDIENNKWVXJMX-JYJNAYRXSA-N 0.000 description 2
- MPGJIHFJCXTVEX-KKUMJFAQSA-N Phe-Arg-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O MPGJIHFJCXTVEX-KKUMJFAQSA-N 0.000 description 2
- 108010076504 Protein Sorting Signals Proteins 0.000 description 2
- 238000010240 RT-PCR analysis Methods 0.000 description 2
- 108020004511 Recombinant DNA Proteins 0.000 description 2
- 241000235070 Saccharomyces Species 0.000 description 2
- GHPQVUYZQQGEDA-BIIVOSGPSA-N Ser-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N)C(=O)O GHPQVUYZQQGEDA-BIIVOSGPSA-N 0.000 description 2
- YRBGKVIWMNEVCZ-WDSKDSINSA-N Ser-Glu-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YRBGKVIWMNEVCZ-WDSKDSINSA-N 0.000 description 2
- QYSFWUIXDFJUDW-DCAQKATOSA-N Ser-Leu-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYSFWUIXDFJUDW-DCAQKATOSA-N 0.000 description 2
- GYDFRTRSSXOZCR-ACZMJKKPSA-N Ser-Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GYDFRTRSSXOZCR-ACZMJKKPSA-N 0.000 description 2
- ANOQEBQWIAYIMV-AEJSXWLSSA-N Ser-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ANOQEBQWIAYIMV-AEJSXWLSSA-N 0.000 description 2
- 238000002105 Southern blotting Methods 0.000 description 2
- 241000187747 Streptomyces Species 0.000 description 2
- CAJFZCICSVBOJK-SHGPDSBTSA-N Thr-Ala-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAJFZCICSVBOJK-SHGPDSBTSA-N 0.000 description 2
- OJRNZRROAIAHDL-LKXGYXEUSA-N Thr-Asn-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OJRNZRROAIAHDL-LKXGYXEUSA-N 0.000 description 2
- MEBDIIKMUUNBSB-RPTUDFQQSA-N Thr-Phe-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MEBDIIKMUUNBSB-RPTUDFQQSA-N 0.000 description 2
- AHERARIZBPOMNU-KATARQTJSA-N Thr-Ser-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O AHERARIZBPOMNU-KATARQTJSA-N 0.000 description 2
- LVRFMARKDGGZMX-IZPVPAKOSA-N Thr-Tyr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=C(O)C=C1 LVRFMARKDGGZMX-IZPVPAKOSA-N 0.000 description 2
- HIINQLBHPIQYHN-JTQLQIEISA-N Tyr-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HIINQLBHPIQYHN-JTQLQIEISA-N 0.000 description 2
- RGYCVIZZTUBSSG-JYJNAYRXSA-N Tyr-Pro-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O RGYCVIZZTUBSSG-JYJNAYRXSA-N 0.000 description 2
- XGJLNBNZNMVJRS-NRPADANISA-N Val-Glu-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O XGJLNBNZNMVJRS-NRPADANISA-N 0.000 description 2
- NHXZRXLFOBFMDM-AVGNSLFASA-N Val-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C NHXZRXLFOBFMDM-AVGNSLFASA-N 0.000 description 2
- VHIZXDZMTDVFGX-DCAQKATOSA-N Val-Ser-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N VHIZXDZMTDVFGX-DCAQKATOSA-N 0.000 description 2
- LLJLBRRXKZTTRD-GUBZILKMSA-N Val-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N LLJLBRRXKZTTRD-GUBZILKMSA-N 0.000 description 2
- STTYIMSDIYISRG-UHFFFAOYSA-N Valyl-Serine Chemical compound CC(C)C(N)C(=O)NC(CO)C(O)=O STTYIMSDIYISRG-UHFFFAOYSA-N 0.000 description 2
- 239000002253 acid Substances 0.000 description 2
- 230000004523 agglutinating effect Effects 0.000 description 2
- 230000004520 agglutination Effects 0.000 description 2
- 230000004075 alteration Effects 0.000 description 2
- 230000000890 antigenic effect Effects 0.000 description 2
- 239000000074 antisense oligonucleotide Substances 0.000 description 2
- 238000012230 antisense oligonucleotides Methods 0.000 description 2
- 108010047857 aspartylglycine Proteins 0.000 description 2
- 230000015572 biosynthetic process Effects 0.000 description 2
- 210000004556 brain Anatomy 0.000 description 2
- 210000000170 cell membrane Anatomy 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 2
- 239000003795 chemical substances by application Substances 0.000 description 2
- 238000003776 cleavage reaction Methods 0.000 description 2
- 239000013599 cloning vector Substances 0.000 description 2
- 230000000295 complement effect Effects 0.000 description 2
- 238000007796 conventional method Methods 0.000 description 2
- 238000012258 culturing Methods 0.000 description 2
- 230000007547 defect Effects 0.000 description 2
- 230000006866 deterioration Effects 0.000 description 2
- 238000002405 diagnostic procedure Methods 0.000 description 2
- 239000000975 dye Substances 0.000 description 2
- 238000004520 electroporation Methods 0.000 description 2
- 239000003623 enhancer Substances 0.000 description 2
- 210000002919 epithelial cell Anatomy 0.000 description 2
- 210000003527 eukaryotic cell Anatomy 0.000 description 2
- 238000000855 fermentation Methods 0.000 description 2
- 230000004151 fermentation Effects 0.000 description 2
- 239000012530 fluid Substances 0.000 description 2
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Chemical compound NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 2
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 2
- 108010089804 glycyl-threonine Proteins 0.000 description 2
- 108010081551 glycylphenylalanine Proteins 0.000 description 2
- 230000012010 growth Effects 0.000 description 2
- 230000036541 health Effects 0.000 description 2
- 210000003494 hepatocyte Anatomy 0.000 description 2
- 238000001727 in vivo Methods 0.000 description 2
- 238000001990 intravenous administration Methods 0.000 description 2
- 210000003292 kidney cell Anatomy 0.000 description 2
- 238000007834 ligase chain reaction Methods 0.000 description 2
- 108010009298 lysylglutamic acid Proteins 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 239000012528 membrane Substances 0.000 description 2
- 230000004060 metabolic process Effects 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 238000007911 parenteral administration Methods 0.000 description 2
- 102000054765 polymorphisms of proteins Human genes 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 230000000069 prophylactic effect Effects 0.000 description 2
- 238000003127 radioimmunoassay Methods 0.000 description 2
- 102000005962 receptors Human genes 0.000 description 2
- 108020003175 receptors Proteins 0.000 description 2
- 230000004044 response Effects 0.000 description 2
- 241000894007 species Species 0.000 description 2
- 210000002536 stromal cell Anatomy 0.000 description 2
- 229940124597 therapeutic agent Drugs 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 2
- 238000011144 upstream manufacturing Methods 0.000 description 2
- 108010073969 valyllysine Proteins 0.000 description 2
- SDMAQFGBPOJFOM-GUBZILKMSA-N Ala-Arg-Arg Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SDMAQFGBPOJFOM-GUBZILKMSA-N 0.000 description 1
- WDIYWDJLXOCGRW-ACZMJKKPSA-N Ala-Asp-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WDIYWDJLXOCGRW-ACZMJKKPSA-N 0.000 description 1
- LGFCAXJBAZESCF-ACZMJKKPSA-N Ala-Gln-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O LGFCAXJBAZESCF-ACZMJKKPSA-N 0.000 description 1
- CXQODNIBUNQWAS-CIUDSAMLSA-N Ala-Gln-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N CXQODNIBUNQWAS-CIUDSAMLSA-N 0.000 description 1
- PEIBBAXIKUAYGN-UBHSHLNASA-N Ala-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 PEIBBAXIKUAYGN-UBHSHLNASA-N 0.000 description 1
- DHBKYZYFEXXUAK-ONGXEEELSA-N Ala-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 DHBKYZYFEXXUAK-ONGXEEELSA-N 0.000 description 1
- 108020004491 Antisense DNA Proteins 0.000 description 1
- 108020000948 Antisense Oligonucleotides Proteins 0.000 description 1
- GXCSUJQOECMKPV-CIUDSAMLSA-N Arg-Ala-Gln Chemical compound C[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O GXCSUJQOECMKPV-CIUDSAMLSA-N 0.000 description 1
- XPSGESXVBSQZPL-SRVKXCTJSA-N Arg-Arg-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O XPSGESXVBSQZPL-SRVKXCTJSA-N 0.000 description 1
- MUXONAMCEUBVGA-DCAQKATOSA-N Arg-Arg-Gln Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(N)=O)C(O)=O MUXONAMCEUBVGA-DCAQKATOSA-N 0.000 description 1
- IYMAXBFPHPZYIK-BQBZGAKWSA-N Arg-Gly-Asp Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O IYMAXBFPHPZYIK-BQBZGAKWSA-N 0.000 description 1
- CRCCTGPNZUCAHE-DCAQKATOSA-N Arg-His-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CN=CN1 CRCCTGPNZUCAHE-DCAQKATOSA-N 0.000 description 1
- NOZYDJOPOGKUSR-AVGNSLFASA-N Arg-Leu-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O NOZYDJOPOGKUSR-AVGNSLFASA-N 0.000 description 1
- KMFPQTITXUKJOV-DCAQKATOSA-N Arg-Ser-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O KMFPQTITXUKJOV-DCAQKATOSA-N 0.000 description 1
- XLDMSQYOYXINSZ-QXEWZRGKSA-N Asn-Val-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N XLDMSQYOYXINSZ-QXEWZRGKSA-N 0.000 description 1
- ZEDBMCPXPIYJLW-XHNCKOQMSA-N Asp-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O ZEDBMCPXPIYJLW-XHNCKOQMSA-N 0.000 description 1
- HCOQNGIHSXICCB-IHRRRGAJSA-N Asp-Tyr-Arg Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)O HCOQNGIHSXICCB-IHRRRGAJSA-N 0.000 description 1
- NJLLRXWFPQQPHV-SRVKXCTJSA-N Asp-Tyr-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O NJLLRXWFPQQPHV-SRVKXCTJSA-N 0.000 description 1
- 241000193830 Bacillus <bacterium> Species 0.000 description 1
- 244000063299 Bacillus subtilis Species 0.000 description 1
- 235000014469 Bacillus subtilis Nutrition 0.000 description 1
- 201000004569 Blindness Diseases 0.000 description 1
- 101000782236 Bothrops leucurus Thrombin-like enzyme leucurobin Proteins 0.000 description 1
- 101100315624 Caenorhabditis elegans tyr-1 gene Proteins 0.000 description 1
- 241000282693 Cercopithecidae Species 0.000 description 1
- 101100007328 Cocos nucifera COS-1 gene Proteins 0.000 description 1
- 241000699802 Cricetulus griseus Species 0.000 description 1
- KFYPRIGJTICABD-XGEHTFHBSA-N Cys-Thr-Val Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CS)N)O KFYPRIGJTICABD-XGEHTFHBSA-N 0.000 description 1
- 201000003883 Cystic fibrosis Diseases 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 239000003298 DNA probe Substances 0.000 description 1
- 230000004543 DNA replication Effects 0.000 description 1
- 102100024746 Dihydrofolate reductase Human genes 0.000 description 1
- 244000257039 Duranta repens Species 0.000 description 1
- 108010042407 Endonucleases Proteins 0.000 description 1
- 241000283073 Equus caballus Species 0.000 description 1
- 229920001917 Ficoll Polymers 0.000 description 1
- 101150039314 GK2 gene Proteins 0.000 description 1
- PRBLYKYHAJEABA-SRVKXCTJSA-N Gln-Arg-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O PRBLYKYHAJEABA-SRVKXCTJSA-N 0.000 description 1
- IPHGBVYWRKCGKG-FXQIFTODSA-N Gln-Cys-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(O)=O IPHGBVYWRKCGKG-FXQIFTODSA-N 0.000 description 1
- OZEQPCDLCDRCGY-SOUVJXGZSA-N Gln-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CCC(=O)N)N)C(=O)O OZEQPCDLCDRCGY-SOUVJXGZSA-N 0.000 description 1
- PBYFVIQRFLNQCO-GUBZILKMSA-N Gln-Pro-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O PBYFVIQRFLNQCO-GUBZILKMSA-N 0.000 description 1
- MRVYVEQPNDSWLH-XPUUQOCRSA-N Gln-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(N)=O MRVYVEQPNDSWLH-XPUUQOCRSA-N 0.000 description 1
- KHHDJQRWIFHXHS-NRPADANISA-N Gln-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)N)N KHHDJQRWIFHXHS-NRPADANISA-N 0.000 description 1
- WDTAKCUOIKHCTB-NKIYYHGXSA-N Glu-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N)O WDTAKCUOIKHCTB-NKIYYHGXSA-N 0.000 description 1
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 1
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 1
- DWBBKNPKDHXIAC-SRVKXCTJSA-N Glu-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCC(O)=O DWBBKNPKDHXIAC-SRVKXCTJSA-N 0.000 description 1
- YBTCBQBIJKGSJP-BQBZGAKWSA-N Glu-Pro Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O YBTCBQBIJKGSJP-BQBZGAKWSA-N 0.000 description 1
- CQAHWYDHKUWYIX-YUMQZZPRSA-N Glu-Pro-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O CQAHWYDHKUWYIX-YUMQZZPRSA-N 0.000 description 1
- ZAPFAWQHBOHWLL-GUBZILKMSA-N Glu-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N ZAPFAWQHBOHWLL-GUBZILKMSA-N 0.000 description 1
- KIEICAOUSNYOLM-NRPADANISA-N Glu-Val-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KIEICAOUSNYOLM-NRPADANISA-N 0.000 description 1
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 1
- VNBNZUAPOYGRDB-ZDLURKLDSA-N Gly-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)CN)O VNBNZUAPOYGRDB-ZDLURKLDSA-N 0.000 description 1
- IEFJWDNGDZAYNZ-BYPYZUCNSA-N Gly-Glu Chemical compound NCC(=O)N[C@H](C(O)=O)CCC(O)=O IEFJWDNGDZAYNZ-BYPYZUCNSA-N 0.000 description 1
- FQKKPCWTZZEDIC-XPUUQOCRSA-N Gly-His-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CN=CN1 FQKKPCWTZZEDIC-XPUUQOCRSA-N 0.000 description 1
- IGOYNRWLWHWAQO-JTQLQIEISA-N Gly-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IGOYNRWLWHWAQO-JTQLQIEISA-N 0.000 description 1
- GGAPHLIUUTVYMX-QWRGUYRKSA-N Gly-Phe-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H](NC(=O)C[NH3+])CC1=CC=CC=C1 GGAPHLIUUTVYMX-QWRGUYRKSA-N 0.000 description 1
- YOBGUCWZPXJHTN-BQBZGAKWSA-N Gly-Ser-Arg Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YOBGUCWZPXJHTN-BQBZGAKWSA-N 0.000 description 1
- FFJQHWKSGAWSTJ-BFHQHQDPSA-N Gly-Thr-Ala Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O FFJQHWKSGAWSTJ-BFHQHQDPSA-N 0.000 description 1
- IZVICCORZOSGPT-JSGCOSHPSA-N Gly-Val-Tyr Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IZVICCORZOSGPT-JSGCOSHPSA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 208000028782 Hereditary disease Diseases 0.000 description 1
- PZAJPILZRFPYJJ-SRVKXCTJSA-N His-Ser-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O PZAJPILZRFPYJJ-SRVKXCTJSA-N 0.000 description 1
- IAYPZSHNZQHQNO-KKUMJFAQSA-N His-Ser-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC2=CN=CN2)N IAYPZSHNZQHQNO-KKUMJFAQSA-N 0.000 description 1
- 241000701044 Human gammaherpesvirus 4 Species 0.000 description 1
- 241001611138 Isma Species 0.000 description 1
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 1
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 1
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 1
- IEWBEPKLKUXQBU-VOAKCMCISA-N Leu-Leu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IEWBEPKLKUXQBU-VOAKCMCISA-N 0.000 description 1
- JVTYXRRFZCEPPK-RHYQMDGZSA-N Leu-Met-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC(C)C)N)O JVTYXRRFZCEPPK-RHYQMDGZSA-N 0.000 description 1
- NJMXCOOEFLMZSR-AVGNSLFASA-N Leu-Met-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O NJMXCOOEFLMZSR-AVGNSLFASA-N 0.000 description 1
- PWPBLZXWFXJFHE-RHYQMDGZSA-N Leu-Pro-Thr Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O PWPBLZXWFXJFHE-RHYQMDGZSA-N 0.000 description 1
- UETQMSASAVBGJY-QWRGUYRKSA-N Lys-Gly-His Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CNC=N1 UETQMSASAVBGJY-QWRGUYRKSA-N 0.000 description 1
- 239000006154 MacConkey agar Substances 0.000 description 1
- 208000036626 Mental retardation Diseases 0.000 description 1
- QEVRUYFHWJJUHZ-DCAQKATOSA-N Met-Ala-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(C)C QEVRUYFHWJJUHZ-DCAQKATOSA-N 0.000 description 1
- ZGVYWHODYWRPLK-GUBZILKMSA-N Met-Pro-Cys Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(O)=O ZGVYWHODYWRPLK-GUBZILKMSA-N 0.000 description 1
- WEDDFMCSUNNZJR-WDSKDSINSA-N Met-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(O)=O WEDDFMCSUNNZJR-WDSKDSINSA-N 0.000 description 1
- 241000699666 Mus <mouse, genus> Species 0.000 description 1
- WYBVBIHNJWOLCJ-UHFFFAOYSA-N N-L-arginyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCCN=C(N)N WYBVBIHNJWOLCJ-UHFFFAOYSA-N 0.000 description 1
- 102100037774 N-acetylgalactosamine kinase Human genes 0.000 description 1
- 108010079364 N-glycylalanine Proteins 0.000 description 1
- 108091061960 Naked DNA Proteins 0.000 description 1
- 239000000020 Nitrocellulose Substances 0.000 description 1
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 1
- 108020002230 Pancreatic Ribonuclease Proteins 0.000 description 1
- 102000005891 Pancreatic ribonuclease Human genes 0.000 description 1
- 108010033276 Peptide Fragments Proteins 0.000 description 1
- 102000007079 Peptide Fragments Human genes 0.000 description 1
- RFEXGCASCQGGHZ-STQMWFEESA-N Phe-Gly-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O RFEXGCASCQGGHZ-STQMWFEESA-N 0.000 description 1
- NAXPHWZXEXNDIW-JTQLQIEISA-N Phe-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 NAXPHWZXEXNDIW-JTQLQIEISA-N 0.000 description 1
- ZYNBEWGJFXTBDU-ACRUOGEOSA-N Phe-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CC=CC=C2)N ZYNBEWGJFXTBDU-ACRUOGEOSA-N 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- LSIWVWRUTKPXDS-DCAQKATOSA-N Pro-Gln-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LSIWVWRUTKPXDS-DCAQKATOSA-N 0.000 description 1
- AFXCXDQNRXTSBD-FJXKBIBVSA-N Pro-Gly-Thr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O AFXCXDQNRXTSBD-FJXKBIBVSA-N 0.000 description 1
- 238000002123 RNA extraction Methods 0.000 description 1
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 102000006382 Ribonucleases Human genes 0.000 description 1
- 108010083644 Ribonucleases Proteins 0.000 description 1
- 229920005654 Sephadex Polymers 0.000 description 1
- 239000012507 Sephadex™ Substances 0.000 description 1
- RZUOXAKGNHXZTB-GUBZILKMSA-N Ser-Arg-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O RZUOXAKGNHXZTB-GUBZILKMSA-N 0.000 description 1
- CRZRTKAVUUGKEQ-ACZMJKKPSA-N Ser-Gln-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O CRZRTKAVUUGKEQ-ACZMJKKPSA-N 0.000 description 1
- 208000036623 Severe mental retardation Diseases 0.000 description 1
- 108091081024 Start codon Proteins 0.000 description 1
- 208000037065 Subacute sclerosing leukoencephalitis Diseases 0.000 description 1
- 206010042297 Subacute sclerosing panencephalitis Diseases 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- VPZKQTYZIVOJDV-LMVFSUKVSA-N Thr-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(O)=O VPZKQTYZIVOJDV-LMVFSUKVSA-N 0.000 description 1
- FQPQPTHMHZKGFM-XQXXSGGOSA-N Thr-Ala-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O FQPQPTHMHZKGFM-XQXXSGGOSA-N 0.000 description 1
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 1
- BKVICMPZWRNWOC-RHYQMDGZSA-N Thr-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)O BKVICMPZWRNWOC-RHYQMDGZSA-N 0.000 description 1
- 108010022394 Threonine synthase Proteins 0.000 description 1
- 108700009124 Transcription Initiation Site Proteins 0.000 description 1
- 102000004142 Trypsin Human genes 0.000 description 1
- 108090000631 Trypsin Proteins 0.000 description 1
- 108010058532 UTP-hexose-1-phosphate uridylyltransferase Proteins 0.000 description 1
- 102000006321 UTP-hexose-1-phosphate uridylyltransferase Human genes 0.000 description 1
- DDRBQONWVBDQOY-GUBZILKMSA-N Val-Ala-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O DDRBQONWVBDQOY-GUBZILKMSA-N 0.000 description 1
- YFOCMOVJBQDBCE-NRPADANISA-N Val-Ala-Glu Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N YFOCMOVJBQDBCE-NRPADANISA-N 0.000 description 1
- VXCAZHCVDBQMTP-NRPADANISA-N Val-Cys-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VXCAZHCVDBQMTP-NRPADANISA-N 0.000 description 1
- UPJONISHZRADBH-XPUUQOCRSA-N Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UPJONISHZRADBH-XPUUQOCRSA-N 0.000 description 1
- OQWNEUXPKHIEJO-NRPADANISA-N Val-Glu-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N OQWNEUXPKHIEJO-NRPADANISA-N 0.000 description 1
- LAYSXAOGWHKNED-XPUUQOCRSA-N Val-Gly-Ser Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LAYSXAOGWHKNED-XPUUQOCRSA-N 0.000 description 1
- LCHZBEUVGAVMKS-RHYQMDGZSA-N Val-Thr-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)[C@@H](C)O)C(O)=O LCHZBEUVGAVMKS-RHYQMDGZSA-N 0.000 description 1
- AEFJNECXZCODJM-UWVGGRQHSA-N Val-Val-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](C(C)C)C(=O)NCC([O-])=O AEFJNECXZCODJM-UWVGGRQHSA-N 0.000 description 1
- 241000021375 Xenogenes Species 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 239000013543 active substance Substances 0.000 description 1
- 230000001464 adherent effect Effects 0.000 description 1
- 238000001042 affinity chromatography Methods 0.000 description 1
- 239000011543 agarose gel Substances 0.000 description 1
- 238000000246 agarose gel electrophoresis Methods 0.000 description 1
- 108010044940 alanylglutamine Proteins 0.000 description 1
- 108010047495 alanylglycine Proteins 0.000 description 1
- BFNBIHQBYMNNAN-UHFFFAOYSA-N ammonium sulfate Chemical compound N.N.OS(O)(=O)=O BFNBIHQBYMNNAN-UHFFFAOYSA-N 0.000 description 1
- 229910052921 ammonium sulfate Inorganic materials 0.000 description 1
- 235000011130 ammonium sulphate Nutrition 0.000 description 1
- 229960000723 ampicillin Drugs 0.000 description 1
- 208000036878 aneuploidy Diseases 0.000 description 1
- 231100001075 aneuploidy Toxicity 0.000 description 1
- 210000000628 antibody-producing cell Anatomy 0.000 description 1
- 239000003816 antisense DNA Substances 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 239000008365 aqueous carrier Substances 0.000 description 1
- 239000008135 aqueous vehicle Substances 0.000 description 1
- 108010013835 arginine glutamate Proteins 0.000 description 1
- 108010008355 arginyl-glutamine Proteins 0.000 description 1
- 108010072041 arginyl-glycyl-aspartic acid Proteins 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 238000011888 autopsy Methods 0.000 description 1
- 210000003719 b-lymphocyte Anatomy 0.000 description 1
- 230000037429 base substitution Effects 0.000 description 1
- 239000003833 bile salt Substances 0.000 description 1
- 229940093761 bile salts Drugs 0.000 description 1
- 102000023732 binding proteins Human genes 0.000 description 1
- 108091008324 binding proteins Proteins 0.000 description 1
- 238000001574 biopsy Methods 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 239000001506 calcium phosphate Substances 0.000 description 1
- 229910000389 calcium phosphate Inorganic materials 0.000 description 1
- 235000011010 calcium phosphates Nutrition 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 230000007910 cell fusion Effects 0.000 description 1
- 108091092328 cellular RNA Proteins 0.000 description 1
- 230000036755 cellular response Effects 0.000 description 1
- 238000004737 colorimetric analysis Methods 0.000 description 1
- 238000012937 correction Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 210000000695 crystalline len Anatomy 0.000 description 1
- 230000001086 cytosolic effect Effects 0.000 description 1
- 230000003111 delayed effect Effects 0.000 description 1
- 238000003936 denaturing gel electrophoresis Methods 0.000 description 1
- 239000000032 diagnostic agent Substances 0.000 description 1
- 229940039227 diagnostic agent Drugs 0.000 description 1
- 235000005911 diet Nutrition 0.000 description 1
- 230000037213 diet Effects 0.000 description 1
- 108020001096 dihydrofolate reductase Proteins 0.000 description 1
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 1
- 208000037765 diseases and disorders Diseases 0.000 description 1
- 239000003937 drug carrier Substances 0.000 description 1
- 235000013601 eggs Nutrition 0.000 description 1
- 210000002257 embryonic structure Anatomy 0.000 description 1
- 210000002889 endothelial cell Anatomy 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 210000003743 erythrocyte Anatomy 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 239000013613 expression plasmid Substances 0.000 description 1
- 238000009472 formulation Methods 0.000 description 1
- 238000004108 freeze drying Methods 0.000 description 1
- 238000011990 functional testing Methods 0.000 description 1
- FBPFZTCFMRRESA-GUCUJZIJSA-N galactitol Chemical compound OC[C@H](O)[C@@H](O)[C@@H](O)[C@H](O)CO FBPFZTCFMRRESA-GUCUJZIJSA-N 0.000 description 1
- 230000009395 genetic defect Effects 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 1
- 108010077435 glycyl-phenylalanyl-glycine Proteins 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 208000016354 hearing loss disease Diseases 0.000 description 1
- 210000002216 heart Anatomy 0.000 description 1
- 210000003917 human chromosome Anatomy 0.000 description 1
- 230000008348 humoral response Effects 0.000 description 1
- 230000001900 immune effect Effects 0.000 description 1
- 230000028993 immune response Effects 0.000 description 1
- 210000000987 immune system Anatomy 0.000 description 1
- 238000003018 immunoassay Methods 0.000 description 1
- 230000002163 immunogen Effects 0.000 description 1
- 238000010249 in-situ analysis Methods 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 208000015181 infectious disease Diseases 0.000 description 1
- 238000001802 infusion Methods 0.000 description 1
- 239000003999 initiator Substances 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 210000000936 intestine Anatomy 0.000 description 1
- 238000007918 intramuscular administration Methods 0.000 description 1
- 238000010255 intramuscular injection Methods 0.000 description 1
- 239000007927 intramuscular injection Substances 0.000 description 1
- 238000000021 kinase assay Methods 0.000 description 1
- 108010053037 kyotorphin Proteins 0.000 description 1
- 208000032839 leukemia Diseases 0.000 description 1
- 210000004698 lymphocyte Anatomy 0.000 description 1
- 108010064235 lysylglycine Proteins 0.000 description 1
- 210000001161 mammalian embryo Anatomy 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 230000002503 metabolic effect Effects 0.000 description 1
- STEPQTYSZVCJPV-UHFFFAOYSA-N metazachlor Chemical compound CC1=CC=CC(C)=C1N(C(=O)CCl)CN1N=CC=C1 STEPQTYSZVCJPV-UHFFFAOYSA-N 0.000 description 1
- MYWUZJCMWCOHBA-VIFPVBQESA-N methamphetamine Chemical compound CN[C@@H](C)CC1=CC=CC=C1 MYWUZJCMWCOHBA-VIFPVBQESA-N 0.000 description 1
- 244000005700 microbiome Species 0.000 description 1
- 238000013508 migration Methods 0.000 description 1
- 230000005012 migration Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 230000000869 mutational effect Effects 0.000 description 1
- PGSADBUBUOPOJS-UHFFFAOYSA-N neutral red Chemical compound Cl.C1=C(C)C(N)=CC2=NC3=CC(N(C)C)=CC=C3N=C21 PGSADBUBUOPOJS-UHFFFAOYSA-N 0.000 description 1
- 229920001220 nitrocellulos Polymers 0.000 description 1
- 238000007899 nucleic acid hybridization Methods 0.000 description 1
- 239000002853 nucleic acid probe Substances 0.000 description 1
- 231100000590 oncogenic Toxicity 0.000 description 1
- 230000002246 oncogenic effect Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 210000001672 ovary Anatomy 0.000 description 1
- 239000003002 pH adjusting agent Substances 0.000 description 1
- 239000013618 particulate matter Substances 0.000 description 1
- 210000001428 peripheral nervous system Anatomy 0.000 description 1
- 108010070409 phenylalanyl-glycyl-glycine Proteins 0.000 description 1
- 102000020233 phosphotransferase Human genes 0.000 description 1
- 230000004962 physiological condition Effects 0.000 description 1
- 230000001817 pituitary effect Effects 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 229920000036 polyvinylpyrrolidone Polymers 0.000 description 1
- 239000001267 polyvinylpyrrolidone Substances 0.000 description 1
- 235000013855 polyvinylpyrrolidone Nutrition 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 230000002028 premature Effects 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 230000037452 priming Effects 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 230000009465 prokaryotic expression Effects 0.000 description 1
- 230000002062 proliferating effect Effects 0.000 description 1
- 108010090894 prolylleucine Proteins 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 230000002797 proteolythic effect Effects 0.000 description 1
- 230000002685 pulmonary effect Effects 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 230000006798 recombination Effects 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000000241 respiratory effect Effects 0.000 description 1
- 210000001525 retina Anatomy 0.000 description 1
- 230000002441 reversible effect Effects 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 210000003705 ribosome Anatomy 0.000 description 1
- 210000003296 saliva Anatomy 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 230000007017 scission Effects 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 239000012679 serum free medium Substances 0.000 description 1
- 108010048818 seryl-histidine Proteins 0.000 description 1
- 108010026333 seryl-proline Proteins 0.000 description 1
- 210000002027 skeletal muscle Anatomy 0.000 description 1
- 238000010532 solid phase synthesis reaction Methods 0.000 description 1
- 230000010473 stable expression Effects 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 238000007920 subcutaneous administration Methods 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 230000001502 supplementing effect Effects 0.000 description 1
- 230000005030 transcription termination Effects 0.000 description 1
- 230000010474 transient expression Effects 0.000 description 1
- 230000014621 translational initiation Effects 0.000 description 1
- 230000005945 translocation Effects 0.000 description 1
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 1
- 239000012588 trypsin Substances 0.000 description 1
- 241000701447 unidentified baculovirus Species 0.000 description 1
- 241001430294 unidentified retrovirus Species 0.000 description 1
- 210000002700 urine Anatomy 0.000 description 1
- IBIDRSSEHFLGSD-UHFFFAOYSA-N valinyl-arginine Natural products CC(C)C(N)C(=O)NC(C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-UHFFFAOYSA-N 0.000 description 1
- 239000003981 vehicle Substances 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
Abstract
The present invention relates to human galactokinase and to the identification of galactokinase mutations (a missense mutation and a nonsense mutation), as well as to nucleic acids encoding them, to a recombinant host cell transformed with DNA coding for such proteins, and to uses of the expressed proteins and nucleic acid sequences in therapeutic and diagnostic applications.
Description
HUMAN GALACTOKINASE GENE
This invention was made in part with government support under grant EY-09404 awarded by the National Institutes of Health. The U.S. government has certain rights in the invention.
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part of application Ser. No. PCT/US94/10825, filed on September 23, 1994.
FIELD OF THE INVENTION
The present invention relates to human galactokinase and to the identification of galactokinase mutations (a missense mutation and a nonsense mutation), as well as to isolated nucleic acids encoding them, to a recombinant host cell transformed with DNA coding for such proteins, and to uses of the expressed proteins and nucleic acid sequences in therapeutic and diagnostic applications.
BACKGROUND OF THE INVENTION
There are numerous inherited diseases of human metabolism, most of which are recessive. Many have devastating effects that may include a combination of several clinical features, such as severe mental retardation, peripheral nervous system deterioration, blindness, hearing impairment and organomegaly. Most of these diseases are rare. However, most of them cannot be treated with medication.
Galactokinase deficiency is one of the three known forms of galactosemia. The other forms are galactose-1-phosphate uridyltransferase deficiency and UDP-galactose-4-epimerase deficiency. These three enzymes are involved in galactose metabolism, i.e., the conversion of galactose to glucose in the body. Galactokinase deficiency is inherited as an autosomal recessive trait with a heterozygote frequency estimated to be 0.2% in the general population (see, e.g., Levy et al., J. Pediatr., 92:871-877 (1978)). Patients with homozygous galactokinase deficiency usually become symptomatic in early childhood, showing galactosemia, galactosuria, increased levels of galactitol, cataracts and, in a few cases, mental retardation (Segal et al., J. Pediatr., 95:750-752 (1979)). These symptoms generally improve dramatically with the administration of a galactose-free diet. Heterozygotes for galactokinase deficiency tend to develop cataracts beginning between 20 and 50 years of age (Stambolian et al., Invest. Ophthalmol. Vis. Sci., 27:429-433 (1986)).
Galactokinase activity has been found in several mammalian tissues, including liver, kidney, brain, lens, placenta, erythrocytes and leukocytes. While the protein has been purified from E. coli, purification of the protein from mammalian tissues has proven difficult because of its low cellular concentration. In addition, the molecular basis of galactokinase deficiency is unknown. The present invention provides a human galactokinase gene.
The DNAs of the present invention, such as the specific sequences described herein, are useful because they encode the genetic information required for the expression of this protein. Additionally, the sequences can be used as probes to isolate and identify additional members of the family, type and/or subtype, as well as mutations that may form the basis of galactokinase deficiency, which may be characterized by site-specific mutations or by atypical expression of the galactokinase gene. The galactokinase gene is also useful as a diagnostic agent for identifying mutant galactokinase proteins, or as a therapeutic agent via gene therapy.
The first clinical trials of gene therapy began in 1990. Since that time, more than 70 clinical trial protocols have been reviewed and approved by a regulatory authority such as the Recombinant DNA Advisory Committee (RAC) of the National Institutes of Health (NIH); see, e.g., Anderson, W.F., Human Gene Therapy, 5:281-282 (1994). The therapeutic treatment of diseases and disorders by gene therapy involves the transfer and stable insertion of new genetic information into cells. The correction of a genetic defect by reintroducing the normal allele of a gene has since shown that this concept is clinically feasible (see, e.g., Rosenberg et al., New Eng. J. Med., 323:570 (1990)). These and other uses for the reagents described herein will become apparent to those of ordinary skill in the art upon reading this specification.
BRIEF DESCRIPTION OF THE INVENTION
The present invention provides isolated nucleic acid molecules that code for human galactokinase, as well as nucleic acid molecules that encode missense and nonsense mutations, including mRNAs and DNAs (e.g., cDNA, genomic DNA, etc.), as well as antisense analogs thereof and diagnostically and therapeutically useful fragments thereof. The present invention also provides recombinant vectors, such as cloning and expression plasmids useful as reagents in the recombinant production of galactokinase proteins, as well as recombinant prokaryotic and/or eukaryotic host cells that comprise a human galactokinase nucleic acid sequence. The present invention also provides a method for preparing human galactokinase proteins, which comprises culturing recombinant prokaryotic and/or eukaryotic host cells containing a human galactokinase nucleic acid sequence under conditions that promote the expression of said protein, and subsequently recovering said protein. Another related aspect of the present invention is isolated human galactokinase proteins produced by said method. In yet another aspect, the present invention provides antibodies that are directed to (i.e., bind) galactokinase proteins. The present invention also provides isolated human galactokinase proteins that have a missense or nonsense mutation, and antibodies (monoclonal or polyclonal) that are specifically reactive with said proteins. The present invention also provides nucleic acid probes and PCR primers comprising nucleic acid molecules of sufficient length to hybridize to human galactokinase sequences.
The present invention also provides a method for diagnosing human galactokinase deficiency which comprises isolating a nucleic acid sample from an individual, testing the sequence of said sample against the reference gene of the present invention, and comparing differences between said sample and the nucleic acid of the instant invention, wherein said differences indicate mutations in the human galactokinase gene isolated from the individual. The sample can be tested by direct sequence comparison (e.g., DNA sequencing); by hybridization against the galactokinase reference gene (e.g., mobility shift assays such as heteroduplex gel electrophoresis or SSCP, or other techniques such as Northern or Southern blot analysis, which are based on the length of the nucleic acid sequence); or by other known gel electrophoresis methods such as RFLP (e.g., restriction endonuclease digestion of a sample amplified by PCR (for DNA) or RT-PCR (reverse transcription PCR) (for RNA)). Alternatively, the diagnostic method comprises isolating cells containing genomic DNA from an individual and testing said sample (e.g., cellular RNA) by in situ hybridization using as a probe the DNA sequence of the present invention, or at least one exon, or a fragment containing at least 15, preferably 18, and more preferably 21 contiguous base pairs. The present invention also provides an antisense oligonucleotide having a sequence capable of binding to mRNAs encoding human galactokinase in order to identify mutant galactokinase genes. The
present invention also provides yet another method for diagnosing human galactokinase deficiency which comprises obtaining a serum or tissue sample; allowing such sample to come into contact with an antibody or antibody fragment which specifically binds to a mutant human galactokinase protein of the present invention under conditions such that an antigen-antibody complex is formed between said antibody (or antibody fragment) and said mutant galactokinase protein; and detecting the presence or absence of said complex. The present invention also provides non-human transgenic animals comprising a nucleic acid molecule encoding human galactokinase. Methods for using such transgenic animals as models for disease states, mutation and SAR are also provided. The present invention also provides a method for treating conditions related to insufficient human galactokinase activity which comprises administering to a patient in need thereof a pharmaceutical composition containing the galactokinase protein of the invention which is effective to supplement the patient's endogenous galactokinase and thereby alleviate said condition. The present invention also provides a method for treating conditions related to insufficient human galactokinase activity via gene therapy. An additional, or reference, gene comprising the non-mutant galactokinase gene of the instant invention is inserted into the cells of a patient, with the result that the protein encoded by the reference gene corrects the defect (i.e., galactokinase deficiency), thus allowing the transfected cells to function normally and alleviating disease conditions (or symptoms).
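The direct sequence comparison described above can be illustrated with a minimal Python sketch. It assumes the sample and reference sequences are already aligned and of equal length; the function name and the two short sequences are hypothetical and are not taken from any SEQ ID NO of this specification.

```python
def find_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for every mismatch.

    Assumes the two sequences are pre-aligned and of equal length, so only
    substitutions (not insertions or deletions) are reported.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(
            zip(reference.upper(), sample.upper())
        )
        if ref_base != sample_base
    ]


# Hypothetical 12-nucleotide fragments, for illustration only.
reference_fragment = "ATGGCTGCTTTG"
sample_fragment = "ATGGCTGATTTG"

print(find_substitutions(reference_fragment, sample_fragment))  # [(8, 'C', 'A')]
```

Any difference reported this way is only a candidate mutation; as the specification notes elsewhere, polymorphisms that do not impair the gene product are still considered part of the reference gene.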
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 represents the intron/exon organization of the human galactokinase gene. Figure 2 is the genomic DNA sequence (and single-letter amino acid abbreviations) for human galactokinase [SEQ ID NO: 7]. The DNA sequence in bold corresponds to the exon regions, while the sequence in normal typeface corresponds to the intron regions of human galactokinase.
DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to human galactokinase (amino acid and nucleotide sequences) and its use as a diagnostic and therapeutic. The particular cDNA and amino acid sequence of human galactokinase is identified by SEQ ID NO: as more fully described below. The present invention also relates to the genomic DNA sequence for human galactokinase [SEQ ID NO:] and to mutant human galactokinase genes and amino acid sequences [SEQ ID NOS: 5 and 6] and their use for diagnostic purposes. In order to describe the present invention more fully, the following additional terms will be employed, and are intended to be defined as indicated below.
An "antigen" refers to a molecule containing one or more epitopes that will stimulate a host's immune system to make a humoral and/or cellular antigen-specific response. The term is also used interchangeably with "immunogen".
The term "epitope" refers to the site on an antigen or hapten to which a specific antibody molecule binds. The term is also used interchangeably with "antigenic determinant" or "antigenic determinant site".
A coding sequence is "operably linked" to another coding sequence when RNA polymerase will transcribe the two coding sequences into a single mRNA, which is then translated into a single polypeptide having amino acids derived from both coding sequences. The coding sequences need not be contiguous with one another so long as the expressed sequence is ultimately processed to produce the desired protein.
"Recombinant" polypeptides refer to polypeptides produced by recombinant DNA techniques; i.e., produced from cells transformed by an exogenous DNA construct encoding the desired polypeptide. "Synthetic" polypeptides are those prepared by chemical synthesis.
A "replicon" is any genetic element (e.g., plasmid, chromosome, virus) that functions as an autonomous unit of DNA replication in vivo; i.e., capable of replication under its own control.
A "vector" is a replicon, such as a plasmid, phage or cosmid, to which another DNA segment may be attached so as to bring about the replication of the attached segment.
A "replication-deficient virus" is a virus in which the excision and/or replication functions have been altered such that, after transfection into a host cell, the virus is not able to reproduce and/or infect additional cells.
A "reference" gene refers to the galactokinase sequence of the present invention and is understood to include the various sequence polymorphisms that exist, in which nucleotide substitutions in the gene sequence do not affect the essential function of the gene product.
A "mutant" gene refers to galactokinase sequences different from the reference gene in which nucleotide substitutions and/or deletions and/or insertions result in impairment of the essential function of the gene product such that the levels of galactose in an individual (or patient) are atypically elevated. For example, the substitution of G for A at position 122 of human galactokinase [SEQ ID NO: 5] is a missense mutation associated with patients who are galactokinase deficient. Another substitution, of G for T, produces an in-frame nonsense codon at amino acid position 80 of the mature protein. The result is a truncated protein consisting of the first 79 amino acids of human galactokinase.
A "DNA coding sequence" or a "nucleotide sequence encoding" a particular protein is a DNA sequence which is transcribed and translated into a polypeptide when placed under the control of appropriate regulatory sequences.
A "promoter sequence" is a DNA regulatory region capable of binding RNA polymerase in a cell and initiating transcription of a downstream (3' direction) coding sequence.
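The distinction between missense and nonsense mutations drawn above can be illustrated with a minimal Python sketch using the standard genetic code. The codons in the usage lines are hypothetical examples and are not asserted to be the actual codons altered in SEQ ID NO: 5 or 6; the function names are likewise illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon: str, mut_codon: str) -> str:
    """Classify a codon change as 'silent', 'missense' or 'nonsense'.

    Simplification: a change from a stop codon to an amino acid is
    also reported as 'missense'.
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    mut_aa = CODON_TABLE[mut_codon.upper()]
    if mut_aa == ref_aa:
        return "silent"
    if mut_aa == "*":
        return "nonsense"
    return "missense"


def truncated_length(stop_position: int) -> int:
    """Residues remaining when a stop codon arises at a 1-based amino acid position."""
    return stop_position - 1


# Hypothetical single-base changes, for illustration only.
print(classify_substitution("GTG", "ATG"))  # missense (Val -> Met)
print(classify_substitution("GAG", "TAG"))  # nonsense (Glu -> stop)
print(truncated_length(80))                 # 79
```

The last line mirrors the arithmetic in the text: a stop codon at amino acid position 80 leaves a product of the first 79 amino acids.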
For purposes of defining the present invention, the promoter sequence is bounded at its 3' terminus by the translation start codon (e.g., ATG) of a coding sequence and extends upstream (5' direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter sequence will be found a transcription initiation site (conveniently defined by mapping with S1 nuclease), as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase. Eukaryotic promoters will often, but not always, contain "TATA" boxes and "CAT" boxes. Prokaryotic promoters contain Shine-Dalgarno sequences in addition to the -10 and -35 consensus sequences.
"DNA control sequences" refers collectively to promoter sequences, ribosome binding sites, polyadenylation signals, transcription termination sequences, upstream regulatory domains, enhancers, and the like, which collectively provide for the expression (i.e., the transcription and translation) of a coding sequence in a host cell.
A control sequence "directs the expression" of a coding sequence in a cell when RNA polymerase binds the promoter sequence and transcribes the coding sequence into mRNA, which is then translated into the polypeptide encoded by the coding sequence.
A "host cell" is a cell that has been transformed or transfected, or is capable of transformation or transfection, by an exogenous DNA sequence.
A cell has been "transformed" by exogenous DNA when such exogenous DNA has been introduced inside the cell membrane. The exogenous DNA may or may not be integrated (covalently linked) into the chromosomal DNA making up the genome of the cell. In prokaryotes and yeasts, for example, the exogenous DNA may be maintained on an episomal element, such as a plasmid.
With respect to eukaryotic cells, a stably transformed or transfected cell is one in which the exogenous DNA has become integrated into the chromosome so that it is inherited by daughter cells through chromosome replication. This stability is demonstrated by the ability of the eukaryotic cell to establish cell lines or clones comprising a population of daughter cells containing the exogenous DNA.
"Transfection" or "transfected" refers to a process by which cells take up foreign DNA and integrate it into their chromosome. Transfection can be accomplished, for example, by various techniques in which cells take up DNA (e.g., calcium phosphate precipitation, electroporation, liposome uptake, etc.) or by infection, in which viruses are used to transfer DNA into cells.
A "target cell" is a cell (or cells) that is selectively transfected over other cell types (or cell lines).
A "clone" is a population of cells derived from a single cell or common ancestor by mitosis. A "cell line" is a clone of a primary cell that is capable of stable growth in vitro for many generations.
A "heterologous" region of a DNA construct is an identifiable segment of DNA within or attached to another DNA molecule that is not found in association with the other molecule in nature. Thus, when the heterologous region encodes a gene, the gene will usually be flanked by DNA that does not flank the gene in the genome of the source animal. Another example of a heterologous coding sequence is a construct in which the coding sequence itself is not found in nature (e.g., synthetic sequences having codons different from the native gene). Allelic variation or naturally occurring mutational events do not give rise to a heterologous region of DNA as used herein.
"Conditions related to insufficient human galactokinase activity" or a "deficiency in galactokinase activity" means mutations of the galactokinase protein which affect galactokinase activity or may affect galactokinase expression, or both, such that a patient's galactose levels are atypically elevated. This definition is also intended to cover atypically low levels of galactokinase expression in a patient due to defective control sequences for the reference galactokinase protein.
The present invention provides an isolated nucleic acid molecule encoding a human galactokinase protein, and substantially similar sequences.
Isolated nucleic acid sequences are "substantially similar" if: (i) they are approximately the same length (e.g., at least 80% of the coding region of SEQ ID NO: 4); (ii) they encode a protein with the same galactokinase activity (e.g., within an order of magnitude) as the protein encoded by SEQ ID NO: 4; and (iii) they are capable of hybridizing under moderately stringent conditions to SEQ ID NO: 4; or they encode DNA sequences which are degenerate to SEQ ID NO: 4. Degenerate DNA sequences encode the same amino acid sequence as SEQ ID NO: 4, but have variation(s) in the nucleotide coding sequences. Hybridization under moderately stringent conditions can be carried out as follows. Nitrocellulose filters are prehybridized at 65 °C in a solution containing 6X SSPE, 5X Denhardt's solution (10.0 g Ficoll, 10.0 g BSA and 10.0 g polyvinylpyrrolidone per liter of solution), 0.05% SDS and 100 micrograms of tRNA. Hybridization probes are labeled, preferably radiolabeled (e.g., using the Bios TAG-IT kit). Hybridization is then carried out for approximately 18 hours at 65 °C. The filters are washed in a solution of 2X SSC and 0.5% SDS at room temperature for 15 minutes (repeated once). Subsequently, the filters are washed at 58 °C, air dried and exposed to X-ray film overnight at -70 °C with an intensifying screen. Alternatively, sequences are "substantially similar" when 66% (preferably 75%, and most preferably 90%) of the nucleotides or amino acids match over a defined length (e.g., at least 80% of the coding region of SEQ ID NO: 4) of the molecule and the protein encoded by such sequence has the same galactokinase activity (e.g., within an order of magnitude) as the protein encoded by SEQ ID NO: 4. As used herein, substantially similar refers to sequences having similar identity to the sequences of the instant invention.
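The percent-match alternative described above (66%, 75% or 90% of positions matching over at least 80% of the coding region) can be sketched in Python as follows. This is only an illustration under stated assumptions: the two sequences are taken to be already aligned without gaps, the function names and example sequences are hypothetical, and the separate requirement of equivalent galactokinase activity is a laboratory measurement that the code does not model.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of matching positions between two equal-length, ungapped sequences."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(aligned_a.upper(), aligned_b.upper()))
    return 100.0 * matches / len(aligned_a)


def meets_identity_criterion(
    aligned_reference: str,
    aligned_candidate: str,
    coding_region_length: int,
    min_identity: float = 66.0,   # the text also names 75% and 90% as preferred levels
    min_coverage: float = 0.80,   # "at least 80% of the coding region"
) -> bool:
    """Check the match-percentage and length conditions only (activity is not modeled)."""
    coverage = len(aligned_reference) / coding_region_length
    identity = percent_identity(aligned_reference, aligned_candidate)
    return coverage >= min_coverage and identity >= min_identity


# Hypothetical example: a 10-nucleotide aligned region out of a 12-nucleotide coding region.
print(percent_identity("ATGCATGCAT", "ATGCATGCAA"))              # 90.0
print(meets_identity_criterion("ATGCATGCAT", "ATGCATGCAA", 12))  # True
```

Raising `min_identity` to 75.0 or 90.0 applies the more preferred thresholds named in the text.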
Thus, nucleotide sequences that are substantially the same can be identified by hybridization or by sequence comparison. Protein sequences that are substantially the same can be identified by one or more of the following: proteolytic digestion, gel electrophoresis and/or microsequencing. The present invention also provides nucleic acid molecules that encode a missense mutation (SEQ ID NO: 5) or a nonsense mutation (SEQ ID NO: 6) of the human galactokinase protein, and DNA sequences that are degenerate to SEQ ID NO: 5 or 6. Degenerate DNA sequences encode the same amino acid sequence (or termination site) as SEQ ID NO: 5 or 6, but have variation(s) in the nucleotide coding sequences. One means for isolating a nucleic acid molecule encoding a human galactokinase is to probe a human genomic DNA or cDNA library with a natural or artificially designed probe using art-recognized procedures (see, for example, "Current Protocols in Molecular Biology", Ausubel, F.M., et al. (Eds.), Greene Publishing Assoc. and John Wiley Interscience, New York, 1989, 1992). One skilled in the art will appreciate that SEQ ID NO: 4, or fragments thereof (comprising at least 15 contiguous nucleotides), is a particularly useful probe. Several probes particularly useful for this purpose are set forth in Table 1, as are hybridizable fragments thereof (i.e., comprising at least 15 contiguous nucleotides). It is also appreciated that such probes can be, and preferably are, labeled with an analytically detectable reagent to facilitate identification of the probe. Useful reagents include, but are not limited to, radioactivity, fluorescent dyes or enzymes capable of catalyzing the formation of a detectable product. The probes are thus capable of isolating complementary copies of genomic DNA, cDNA or RNA from human, mammalian or other animal sources, or of screening such sources for related sequences (e.g., additional members of the family, type and/or subtype), including transcriptional
regulatory and control elements defined above, as well as other stability, processing, translation and tissue specificity-determining regions from 5' and/or 3' regions relative to the coding sequences disclosed in the present invention. The present invention also contemplates gene therapy. "Gene therapy" means gene supplementation. That is, an additional (i.e., reference) copy of the gene of interest is inserted into a patient's cells. As a result, the protein encoded by the reference gene corrects the defect (i.e., galactokinase deficiency) and allows the cells to function normally, thereby alleviating disease symptoms. The gene therapy of the present invention can occur in vivo or ex vivo. Ex vivo gene therapy requires the isolation and purification of patient cells, the introduction of a therapeutic gene, and the introduction of the genetically altered cells back into the patient. A replication-deficient virus such as a modified retrovirus can be used to introduce the therapeutic gene (galactokinase) into such cells. For example, Moloney murine leukemia virus (MMLV) is a well-known vector in clinical gene therapy trials (see, e.g., Boris-Lawrie et al., Curr. Opin. Genet. Dev., 3: 102-109 (1993)). In contrast, in vivo gene therapy does not require isolation and purification of patient cells. The therapeutic gene is typically "packaged" for administration to a patient in liposomes or in a replication-deficient virus such as adenovirus (see, e.g., Berkner, K.L., Curr. Top. Microbiol. Immunol., 158: 39-66
(1992)) or adeno-associated virus (AAV) vectors (see, e.g., Muzyczka, N., Curr. Top. Microbiol. Immunol., 158: 97-129 (1992) and U.S. Patent 5,252,479, "Safe Vector for Gene Therapy"). Another approach is the administration of so-called "naked DNA", in which the therapeutic gene is injected directly into the bloodstream or muscle tissue. Cell types useful for the gene therapy of the present invention include hepatocytes, fibroblasts, lymphocytes, any cell of the eye (e.g., retina), and epithelial and endothelial cells. Preferably the cells are hepatocytes, any cell of the eye, or respiratory (or pulmonary) epithelial cells. Transfection of pulmonary epithelial cells can occur via inhalation of a nebulized preparation of DNA vectors in liposomes, DNA-protein complexes or replication-deficient adenoviruses (see, e.g., U.S. Patent 5,240,846, "Gene Therapy Vector for Cystic Fibrosis"). The present invention also contemplates a process for preparing human galactokinase proteins. The non-mutant proteins are defined with reference to the amino acid sequence listed in SEQ ID NO: 4 and include variants with a substantially similar amino acid sequence that have the same galactokinase activity. Additional proteins of the present invention include the mutant human galactokinase proteins set forth in SEQ ID NO: 5 or 6. The proteins of the present invention are preferably made by recombinant genetic engineering techniques. The isolated nucleic acids, particularly the DNAs, can be introduced into expression vectors by operatively linking the DNA to the necessary expression control regions (e.g., regulatory regions) required for gene expression. The vectors can be introduced into appropriate host cells such as prokaryotic (e.g., bacterial) or eukaryotic (e.g., yeast or mammalian) cells by methods well known in the art (Ausubel et al., supra). The coding sequences for the desired proteins, having been prepared or isolated, can be cloned into any suitable vector or replicon.
Numerous cloning vectors are known to those skilled in the art, and the selection of an appropriate cloning vector is a matter of choice. Examples of recombinant DNA vectors for cloning, and the host cells which they can transform, include, but are not limited to, bacteriophage lambda (E. coli), pBR322 (E. coli), pACYC177 (E. coli), pKT230 (gram-negative bacteria), pGV1106 (gram-negative bacteria), pLAFR1 (gram-negative bacteria), pME290 (non-E. coli gram-negative bacteria), pHV14 (E. coli and Bacillus subtilis), pBD9 (Bacillus), pIJ61 (Streptomyces), pUC6 (Streptomyces), YIp5 (Saccharomyces), a baculovirus insect cell system, a Drosophila insect system, and YCp19 (Saccharomyces). See, generally, "DNA Cloning": Vols. I & II, Glover et al., ed., IRL Press, Oxford (1985) (1987); and T. Maniatis et al., "Molecular Cloning", Cold Spring Harbor Laboratory (1982).
The gene can be placed under the control of a promoter, a ribosome binding site (for bacterial expression) and, optionally, an operator (collectively referred to herein as "control" elements), so that the DNA sequence encoding the desired protein is transcribed into RNA in the host cell transformed by a vector containing this expression construct. The coding sequence may or may not contain a signal peptide or leader sequence. The subunit antigens of the present invention can be expressed using, for example, the E. coli tac promoter or the protein A gene (spa) promoter and signal sequence. Leader sequences can be removed by the bacterial host in post-translational processing. See, e.g., U.S. Patent Nos. 4,431,739; 4,425,437; 4,338,397. In addition to control sequences, it may be desirable to add regulatory sequences which allow the expression of the protein sequences to be regulated relative to the growth of the host cell. Regulatory sequences are known to those skilled in the art, and examples include those which cause the expression of a gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. Other types of regulatory elements may also be present in the vector, for example, enhancer sequences.
An expression vector is constructed so that the particular coding sequence is located in the vector with the appropriate regulatory sequences, the positioning and orientation of the coding sequence with respect to the control sequences being such that the coding sequence is transcribed under the "control" of the control sequences (i.e., RNA polymerase which binds to the DNA molecule at the control sequences transcribes the coding sequence). Modification of the sequences encoding the particular antigen of interest may be desirable to achieve this end. For example, in some cases it may be necessary to modify the sequence so that it can be attached to the control sequences in the proper orientation; i.e., to maintain the reading frame. The control sequences and other regulatory sequences may be ligated to the coding sequence prior to insertion into a vector, such as the cloning vectors described above. Alternatively, the coding sequence can be cloned directly into an expression vector which already contains the control sequences and an appropriate restriction site.
In some cases, it may be desirable to produce other mutants or analogs of the galactokinase protein. Mutants or analogs can be prepared by deleting a portion of the sequence encoding the protein, by inserting a sequence, and/or by substituting one or more nucleotides within the sequence. Techniques for modifying nucleotide sequences, such as site-directed mutagenesis, are well known to those skilled in the art. See, e.g., T. Maniatis et al., supra; DNA Cloning, Vols. I and II, supra; Nucleic Acid Hybridization, supra.
A number of prokaryotic expression vectors are known in the art. See, e.g., U.S. Patent Nos. 4,578,355; 4,440,859; 4,436,815; 4,431,740; 4,431,739; 4,428,941; 4,425,437; 4,418,149; 4,411,994; 4,366,246; 4,342,832; see also British patent applications GB 2,121,054; GB 2,008,123; GB 2,007,675; and European application 103,395. Yeast expression vectors are also known in the art. See, e.g., U.S. Patent Nos. 4,446,235; 4,443,539; 4,430,428; see also European patent applications 103,409; 100,561; 96,491. pSV2neo (as described in J. Mol. Appl. Genet., 1: 327-341) uses the SV40 late promoter to drive expression in mammalian cells, and pCDNA1neo, a vector derived from pCDNA1 (Mol. Cell Biol., 7: 4125-29), uses the CMV promoter to drive expression. Both of these latter two vectors can be used for transient or stable (using G418 resistance) expression in mammalian cells. Insect cell expression systems, e.g., Drosophila, are also useful; see, for example, PCT applications WO 90/06358 and WO 92/06212 as well as EP 290,261-B1.
Depending on the expression system and host selected, the proteins of the present invention are produced by growing host cells transformed by an expression vector described above under conditions whereby the protein of interest is expressed. Preferred mammalian cells include human embryonic kidney cells (HEK-293), monkey kidney fibroblast (COS) cells, Chinese hamster ovary (CHO) cells, Drosophila or murine L cells. If the expression system secretes the protein into the growth medium, the protein can be purified directly from the medium. If the protein is not secreted, it is isolated from cell lysates or recovered from the cell membrane fraction. The selection of appropriate growth conditions and recovery methods is within the skill of the art.
An alternative method for identifying proteins of the present invention is by constructing gene libraries, using the resulting clones to transform E. coli, and pooling and screening individual colonies using polyclonal serum or monoclonal antibodies to galactokinase. The proteins of the present invention can also be produced by chemical synthesis, such as solid phase peptide synthesis, using known amino acid sequences or amino acid sequences derived from the DNA sequence of the genes of interest. Such methods are known to those skilled in the art. Chemical synthesis of peptides is not particularly preferred. The proteins of the present invention, or fragments thereof comprising at least one epitope, can be used to produce antibodies, both polyclonal and monoclonal. If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit, goat, horse, etc.) is immunized with a protein of the present invention, or a fragment thereof, capable of eliciting an immune response (i.e., having at least one epitope). Serum from the immunized animal is collected and treated according to known procedures. If serum containing polyclonal antibodies is used, the polyclonal antibodies can be purified by immunoaffinity chromatography or other known procedures. Monoclonal antibodies to the proteins of the present invention, and to fragments thereof, can also be readily produced by one skilled in the art. The general methodology for making monoclonal antibodies using hybridoma technology is well known. Immortal antibody-producing cell lines can be created by cell fusion, and also by other techniques such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. See, e.g., M. Schreier et al., "Hybridoma Techniques" (1980); Hammerling et al., "Monoclonal Antibodies and T-cell Hybridomas" (1981); Kennett et al., "Monoclonal Antibodies" (1980); see also U.S. Patent Nos.
4,341,761; 4,399,121; 4,427,783; 4,444,887; 4,452,570; 4,466,917; 4,472,500; 4,491,632 and 4,493,890. Panels of monoclonal antibodies produced against the antigen of interest, or a fragment thereof, can be screened for various properties, e.g., isotype, epitope, affinity, etc. Accordingly, one skilled in the art can produce monoclonal antibodies specifically reactive with mutant galactokinase proteins, e.g., the missense mutation of SEQ ID NO: 5 or the nonsense mutation of SEQ ID NO: 6. Monoclonal antibodies are useful in the purification, using immunoaffinity techniques, of the individual antigens against which they are directed. Alternatively, the genes encoding the monoclonal antibodies of interest can be isolated from the hybridomas by PCR techniques known in the art, and cloned and expressed in the appropriate vectors. The antibodies of the present invention, whether polyclonal or monoclonal, have additional utility in that they can be used as reagents in immunoassays, RIAs, ELISAs and the like. As used herein, "monoclonal antibody" is understood to include antibodies derived from one species (e.g., murine, rabbit, goat, rat, human, etc.) as well as antibodies derived from two (or perhaps more) species (e.g., chimeric and humanized antibodies). Chimeric antibodies, in which non-human variable regions are joined or fused to human constant regions (see, e.g., Liu et al., Proc. Natl. Acad. Sci. USA,
84: 3439 (1987)), can also be used in assays or therapeutically. Preferably, a monoclonal antibody would be "humanized" as described in Jones et al., Nature, 321: 522 (1986); Verhoeyen et al., Science, 239: 1534
(1988); Kabat et al., J. Immunol., 147: 1709 (1991); Queen et al., Proc. Natl. Acad. Sci. USA, 88: 34181 (1991); and Hodgson et al., Bio/Technology, 9: 421 (1991). Therefore, the present invention also contemplates antibodies, polyclonal or monoclonal (including chimeric and "humanized"), directed to epitopes corresponding to the amino acid sequences of human galactokinase disclosed herein. Methods for the production of polyclonal and monoclonal antibodies are well known; see, for example, Chapter 11 of Ausubel et al. (supra). When the antibody is labeled with an analytically detectable reagent such as radioactivity, fluorescence or an enzyme, the antibody can be used to detect the presence or absence of human galactokinase and/or its quantitative level. In addition, antibodies (polyclonal or monoclonal) specific for the missense or nonsense mutations of the present invention are useful for diagnostic purposes. A serum or tissue sample (e.g., liver, lung, etc.) is obtained and allowed to come into contact with an antibody or antibody fragment which specifically binds to a mutant human galactokinase protein of the present invention under conditions such that an antigen-antibody complex is formed between said antibody (or antibody fragment) and said mutant galactokinase protein. The detection of the presence or absence of said complex is within the skill of the art (e.g., ELISA, RIA, Western blot analysis, optical biosensor (e.g., BIAcore, Pharmacia Biosensor, Uppsala, Sweden)) and is not limiting of the present invention. The present invention also contemplates pharmaceutical compositions comprising an effective amount of the galactokinase protein of the invention and a pharmaceutically acceptable carrier. The pharmaceutical compositions of proteinaceous drugs of the present invention are particularly useful for parenteral administration, e.g.,
subcutaneous, intramuscular or intravenous. Optionally, the galactokinase protein is enclosed in a membrane-bound vesicle, such as a liposome. Compositions for parenteral administration will commonly comprise a solution of the compounds of the invention, or a mixture thereof, dissolved in an acceptable carrier, preferably an aqueous carrier. A variety of aqueous carriers may be employed, e.g., water, buffered water, 0.4% saline, 0.3% glycine, and the like. These solutions are sterile and generally free of particulate matter. These solutions may be sterilized by conventional, well-known sterilization techniques. The compositions may contain pharmaceutically acceptable auxiliary substances as required to approximate physiological conditions, such as pH adjusting and buffering agents, etc. The concentration of the compound of the present invention in such a formulation can vary widely, e.g., from less than 0.5%, usually at or at least 1%, to as much as 15 or 20% by weight, and will be selected primarily based on fluid volumes, viscosities, etc., according to the particular mode of administration selected. Thus, a pharmaceutical composition of the present invention for intramuscular injection could be prepared to contain 1 ml of sterile buffered water and 50 mg of a compound of the invention. Similarly, a pharmaceutical composition of the present invention for intravenous infusion could be made up to contain 250 ml of sterile Ringer's solution and 150 mg of a compound of the invention. Actual methods for preparing parenterally administrable compositions are well known or will be apparent to those skilled in the art and are described in more detail in, for example, Remington's Pharmaceutical Science, 15th ed., Mack Publishing Company, Easton, Pennsylvania. The compounds described in the present invention can be lyophilized for storage and reconstituted in a suitable carrier prior to use.
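As a quick check on the two example formulations above, the following Python sketch converts each into a concentration. It is a rough illustration only: the function names are hypothetical, and the percentage is computed as weight per volume, which approximates the "by weight" figure in the text only under the assumption that the aqueous carrier has a density of about 1 g/ml.

```python
def concentration_mg_per_ml(mass_mg: float, volume_ml: float) -> float:
    """Concentration of compound in mg per ml of carrier."""
    return mass_mg / volume_ml


def percent_weight_per_volume(mass_mg: float, volume_ml: float) -> float:
    """Grams of compound per 100 ml of carrier, expressed as a percentage."""
    return (mass_mg / 1000.0) / volume_ml * 100.0


# Intramuscular example from the text: 50 mg in 1 ml of sterile buffered water.
print(f"{concentration_mg_per_ml(50, 1):.2f} mg/ml, "
      f"{percent_weight_per_volume(50, 1):.2f}% w/v")

# Intravenous example from the text: 150 mg in 250 ml of sterile Ringer's solution.
print(f"{concentration_mg_per_ml(150, 250):.2f} mg/ml, "
      f"{percent_weight_per_volume(150, 250):.2f}% w/v")
```

Under that density assumption the injection example works out to about 5% and the infusion example to about 0.06%, which sit at the middle and the low end, respectively, of the concentration range stated in the text.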
This technique has been shown to be effective with conventional proteins, and art-known lyophilization and reconstitution techniques can be employed.
The physician will determine the most appropriate dosage of the present therapeutic agents, which will vary with the particular patient under treatment. The physician will generally wish to initiate treatment with small doses substantially less than the optimum dose of the compound and increase the dosage by small increments until the optimum effect under the circumstances is reached. It will generally be found that when the composition is administered orally, larger amounts of the active agent will be required to produce the same effect as a smaller amount given parenterally. The therapeutic dosage will generally be from 1 to 10 milligrams per day, although it may be administered in several different dosage units.
Depending on the condition of the patient, the pharmaceutical composition of the present invention can be administered for prophylactic and/or therapeutic treatments. In therapeutic applications, the compositions are administered to a patient who already suffers from a disease, in an amount sufficient to cure or at least partially arrest the disease and its complications. In prophylactic applications, compositions containing the present compounds or a mixture thereof are administered to a patient who is not already in a disease state, to enhance the patient's resistance. Single or multiple administrations of the pharmaceutical compositions can be carried out, with the dosage levels and regimen being selected by the treating physician. In any event, the pharmaceutical composition of the invention should provide a quantity of the compound of the invention sufficient to effectively treat the patient.
The present invention also contemplates the use of the galactokinase genes of the instant invention as a diagnostic. For example, some diseases are the result of inherited defective genes.
These genes can be detected by comparing the sequence of the defective gene with that of a normal gene. Subsequently, one can verify that a "mutant" gene is associated with galactokinase deficiency by measuring galactose. That is, a mutant gene would be associated with high (atypical) levels of galactose in a patient. In addition, mutant galactokinase genes can be inserted into a suitable vector for expression in a functional assay system (e.g., colorimetric assay, expression on MacConkey plates, complementation experiments, e.g., in a galactokinase-deficient strain of yeast or E. coli) as yet another means to verify or identify galactokinase mutations. As an example, RNA from an individual can be reverse-transcribed to cDNA, which can then be amplified by polymerase chain reaction (PCR), cloned into an E. coli expression vector and transformed into a galactokinase-deficient strain. When grown on MacConkey indicator plates, galactokinase-deficient cells produce white colonies, whereas cells that have been transformed/complemented with a functional galactokinase gene are red (see, e.g., the Examples section). If most or all colonies from an individual are red, the individual is considered normal with respect to galactokinase activity. If approximately 50% of the colonies are red (and the other 50% white), the individual is likely to be a carrier of galactokinase deficiency. If most colonies are white, the individual is likely to have galactokinase deficiency. Once "mutant" genes have been identified, the population can be screened for carriers of the "mutant" galactokinase gene. (A carrier is an apparently healthy person whose chromosomes contain a "mutant" galactokinase gene that can be passed on to that person's offspring.) In addition, monoclonal antibodies that are specific for the galactokinase proteins can be used for diagnostic purposes as described above.
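The colony-colour readout described above can be sketched as a simple decision rule. This is only an illustration under stated assumptions: the function name and the 75%/25% cut-offs are hypothetical and were chosen merely to separate "mostly red", "about half red" and "mostly white"; the description itself gives no numeric thresholds.

```python
def classify_colonies(red: int, white: int,
                      high: float = 0.75, low: float = 0.25) -> str:
    """Interpret MacConkey plate counts for one individual.

    red   -- colonies complemented by a functional galactokinase cDNA
    white -- colonies that remain galactokinase deficient
    high/low -- assumed cut-offs (not taken from the description)
    """
    total = red + white
    if total == 0:
        raise ValueError("no colonies were counted")
    red_fraction = red / total
    if red_fraction >= high:
        return "normal"      # most or all colonies are red
    if red_fraction <= low:
        return "deficient"   # most colonies are white
    return "carrier"         # roughly half red, half white


if __name__ == "__main__":
    for counts in [(95, 5), (48, 52), (3, 97)]:
        print(counts, classify_colonies(*counts))
```

In practice the cut-offs would have to be set from control individuals of known genotype.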
Individuals carrying mutations in the human galactokinase gene can be detected at the DNA level by a variety of techniques. Nucleic acids used for diagnosis (genomic DNA, mRNA, etc.) can be obtained from a patient's cells, such as from blood, urine, saliva, tissue biopsy (e.g., chorionic villus sampling or removal of amniotic fluid cells) and autopsy material. Genomic DNA can be used directly for detection or can be amplified enzymatically by PCR, ligase chain reaction (LCR), strand displacement amplification (SDA), etc. (see, e.g., Saiki et al., Nature, 324: 163-166 (1986); Bej et al., Crit. Rev. Biochem. Molec. Biol., 26: 301-334 (1991); Birkenmeyer et al., J. Virol. Meth., 35: 117-126 (1991); Van Brunt, J., Bio/Technology, 8: 291-294 (1990)) prior to analysis. RNA can also be used for the same purpose. Reverse transcription of the RNA and amplification can be done at the same time with RT-PCR (reverse transcriptase polymerase chain reaction), or the RNA can be reverse-transcribed to a non-amplified cDNA. As an example, PCR primers complementary to the nucleic acid of the instant invention can be used to identify and analyze galactokinase mutations. For example, deletions and insertions can be detected by a change in the size of the amplified product in comparison to the normal galactokinase genotype. Point mutations can be identified by hybridizing amplified DNA to radiolabeled galactokinase RNA (of the present invention) or, alternatively, to radiolabeled galactokinase antisense DNA sequences (of the present invention). Perfectly matched sequences can be distinguished from mismatched duplexes by RNase A digestion or by differences in melting temperature (Tm). Such a diagnostic would be particularly useful for prenatal and even neonatal testing. In addition, point mutations and other sequence differences between the reference gene and "mutant" genes can be identified by other well-known techniques, e.g.,
direct DNA sequencing or single-strand conformation polymorphism (SSCP; Orita et al., Genomics, 5: 874-879 (1989)). For example, a sequencing primer is used with a double-stranded PCR product or a single-stranded template molecule generated by a modified PCR. The sequence is determined by conventional procedures with radiolabeled nucleotides or by automated sequencing procedures with fluorescent tags. Cloned DNA segments can also be used as probes to detect specific DNA segments. The sensitivity of this method is greatly enhanced when combined with PCR. The presence of nucleotide repeats can correlate with a change in galactokinase activity (a causative change) or serve as a marker for various polymorphisms. Genetic testing based on DNA sequence differences can be carried out by detecting alterations in the electrophoretic mobility of DNA fragments in gels, with or without denaturing agents. Small sequence deletions and insertions can be visualized by high-resolution gel electrophoresis. DNA fragments of different sequences can be distinguished on denaturing formamide gradient gels, in which the mobilities of different DNA fragments are retarded at different positions in the gel according to their specific melting or partial melting temperatures (see, e.g., Myers et al., Science, 230: 1242 (1985)). In addition, sequence alterations, in particular small deletions, can be detected as changes in the migration pattern of DNA heteroduplexes in non-denaturing gel electrophoresis (i.e., heteroduplex electrophoresis) (see, e.g., Nagamine et al., Am. J. Hum. Genet., 45: 337-339 (1989)). Sequence changes at specific locations can also be revealed by nuclease protection assays, such as RNase and S1 protection, or by the chemical cleavage method (e.g., Cotton et al., Proc. Natl. Acad. Sci. USA,
85: 4397-4401 (1988)). Thus, a specific DNA sequence can be detected by methods such as hybridization (e.g., heteroduplex electrophoresis; see White et al., Genomics, 12: 301-306 (1992)), RNase protection (e.g., Myers et al., Science, 230: 1242 (1985)), chemical cleavage (e.g., Cotton et al., Proc. Natl. Acad. Sci. USA, 85: 4397-4401 (1988)), direct DNA sequencing, or the use of restriction enzymes (e.g., restriction fragment length polymorphisms (RFLP), in which variations in the number and size of restriction fragments can indicate insertions, deletions, the presence of nucleotide repeats and any other mutation that creates or destroys an endonuclease restriction sequence). Southern blotting of genomic DNA can also be used to identify large deletions and insertions (e.g., greater than 100 base pairs).
In addition to conventional gel electrophoresis and DNA sequencing, mutations (e.g., microdeletions, aneuploidies, translocations, inversions) can also be detected by in situ analysis (see, e.g., Keller et al., DNA Probes, 2nd Ed., Stockton Press, New York, NY, USA (1993)). That is, DNA (or RNA) sequences in cells can be analyzed for mutations without isolation and/or immobilization onto a membrane. Fluorescence in situ hybridization (FISH) is presently the most commonly applied method, and numerous reviews of FISH have appeared. See, e.g., Tkachuk et al., Science, 250: 559-562 (1990), and Trask et al., Trends Genet., 7: 149-154 (1991), which are incorporated herein by reference for background purposes. Hence, by using nucleic acids based on the structure of specific genes, e.g., galactokinase, one can develop diagnostic tests for galactokinase deficiency.
In addition, some diseases are the result of, or are characterized by, changes in gene expression that can be detected by changes in the mRNA. Alternatively, the galactokinase gene can be used as a reference to identify individuals expressing a decreased level of galactokinase, e.g.,
by Northern blot analysis or in situ hybridization. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., "Current Protocols in Mol. Biol.", Vol. I to II, Wiley Interscience, Ausubel et al. (eds.) (1992). Probing technology is well known in the art, and it is appreciated that the size of the probes can vary widely, but it is preferred that the probe be at least 15 nucleotides in length. It is also appreciated that such probes can be, and preferably are, labeled with an analytically detectable reagent to facilitate identification of the probe. Useful reagents include, but are not limited to, radioactivity, fluorescent dyes or enzymes capable of catalyzing the formation of a detectable product. As a general rule, the more stringent the hybridization conditions, the more closely related the recovered genes will be.
Also within the scope of the present invention are antisense oligonucleotides based on the sequences described herein for human galactokinase. Synthetic oligonucleotides or related antisense chemical structural analogs are designed to recognize and specifically bind to a target nucleic acid encoding galactokinase and its mutations. The general field of antisense technology is illustrated by the following disclosures, which are incorporated herein by reference for background purposes (Cohen, J.S., Trends in Pharm. Sci., 10: 435 (1989) and Weintraub, H.M., Scientific American, Jan.
(1990) at page 40).
Transgenic non-human animals can be obtained by transfecting appropriate fertilized eggs or embryos of a host with nucleic acids encoding the human galactokinase disclosed herein; see, for example, U.S. Pat. Nos. 4,736,866; 5,175,385; 5,175,384 and 5,175,386.
The resulting transgenic animal can be used as a model for the study of galactokinase. Particularly useful transgenic animals are those that display a detectable phenotype associated with expression of the receptor. Drugs can then be screened for their ability to reverse or exacerbate the relevant phenotype. The present invention also contemplates operably linking the receptor coding gene to regulatory elements that respond differentially to various temperature or metabolic conditions, thereby effectively turning the phenotypic expression on or off in response to those conditions.
Although not necessarily limiting of the present invention, the following are experimental data illustrative of this invention.
EXAMPLES
Purification of human galactokinase from placental tissue
Galactokinase (galK) is obtained from human placenta as described by Stambolian et al. (Biochim. Biophys. Acta, 831: 306-312 (1985)), which is incorporated herein by reference in its entirety. In essence, human placental tissue (obtained within the first hour after delivery) is homogenized and centrifuged, and the supernatant is adsorbed onto DEAE-Sephacel. The material is eluted, precipitated with ammonium sulfate and then run through a sizing column (Sephadex G-100 SF). The active fractions are pooled and concentrated. The purified protein is obtained by separation on SDS-polyacrylamide gel electrophoresis followed by Western blotting using standard techniques (see Laemmli, Nature, 227: 680-685 (1970), or LeGendre et al., Biotechniques, 6: 154 (1988)). Small (microgram) quantities of galactokinase were isolated from multiple rounds of protein purification. After tryptic digestion, 7 peptide sequences were eventually isolated and identified. The three longest fragments are presented below:
[SEQ ID NO: 1] Val Asn Leu Ile Gly Glu His Thr Asp Tyr Asn Gln Gly Leu Val Leu Pro Met Ala Leu Glu Leu Met Thr Val Leu Val Gly Ser Pro Arg
[SEQ ID NO: 2] His Ile Gln Glu His Tyr Gly Gly Thr Ala Thr Phe Tyr Leu Ser Gln Ala Ala Asp Gly Ala Lys
[SEQ ID NO: 3] Ala Gln Val Cys Gln Gln Ala Glu His Ser Phe Ala Gly Met Pro Cys Gly Ile Met Asp Gln Phe Ile Ser Leu Met Gly Gln Lys
The fragments were compared with peptide sequences encoded by cDNAs that had been partially sequenced. The cDNAs (also known as expressed sequence tags or ESTs) were obtained from Human Genome Sciences, Inc. (Rockville, MD, USA). The best alignments occurred with an EST sequence from a human osteoclastoma stromal cell library (SEQ ID NO: 1 showed 100% identity over 18 contiguous amino acids) and an EST sequence from a human pituitary library (SEQ ID NO: 2 showed 95.5% identity over 22 contiguous amino acids). A full-length cDNA from the human osteoclastoma stromal cell library was identified and sequenced (SEQ ID NO: 4) on an ABI 373A automated sequencer. The sequence was confirmed on both strands. The corresponding amino acid sequence (SEQ ID NO: 4) was compared against the peptide fragments identified above. SEQ ID NO: 1 corresponds to amino acids 38-68 of the complete human galactokinase protein. Similarly, SEQ ID NOs: 2 and 3 correspond to amino acids 367-388 and 167-195, respectively, of human galactokinase.
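The figures quoted for the peptide alignments can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: `percent_identity` and `span_length` are hypothetical helper names, and the one-mismatch comparison peptide is invented purely to show that 21 matches out of 22 positions rounds to the reported 95.5%; the actual EST-encoded peptide is not given in the text.

```python
# SEQ ID NO: 2 as listed in the description (three-letter codes).
SEQ_ID_2 = ("His Ile Gln Glu His Tyr Gly Gly Thr Ala Thr Phe Tyr "
            "Leu Ser Gln Ala Ala Asp Gly Ala Lys").split()


def percent_identity(a, b):
    """Percent of positions that match in two equal-length residue lists."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def span_length(first, last):
    """Number of residues in an inclusive range such as 38-68."""
    return last - first + 1


# Hypothetical EST-encoded peptide: identical except for one residue.
est_peptide = list(SEQ_ID_2)
est_peptide[4] = "Xaa"  # made-up mismatch, for illustration only

if __name__ == "__main__":
    print(round(percent_identity(SEQ_ID_2, est_peptide), 1))  # 21/22 matches
    print(span_length(38, 68), span_length(367, 388), span_length(167, 195))
```

The three span lengths agree with the lengths of the peptides listed above (31, 22 and 29 residues).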
Analysis of the human galactokinase gene
A comparison of the human galactokinase sequence with that of E. coli galactokinase (Debouck et al., Nucl. Acids Res., 13: 1841-1853 (1985)) shows 61% similarity and 44.5% identity. A subsequent comparison with another reported human galactokinase gene (GK2) (Lee et al., Proc. Natl. Acad. Sci. USA, 89: 10887-10891 (1992)) shows 54% similarity and 34.6% identity at the amino acid level. Furthermore, the GK2 gene is mapped to human chromosome 17, position q24, as determined by fluorescence in situ hybridization (FISH) analysis. SEQ ID NO: 4 was hybridized against a Northern blot containing human messenger RNA from placenta, brain, skeletal muscle, kidney, intestine, heart, lung and liver according to standard procedures (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, 1989). The strongest hybridization was with human liver and lung tissue.
Complementation of galactokinase:
SEQ ID NO: 4 was subcloned into an E. coli vector, plasmid pBluescript [Stratagene]. When transformed into C600K, a galactokinase-deficient strain, the transformed E. coli grew on MacConkey agar plates containing 1% galactose (and ampicillin at 50 µg/ml for plasmid selection) and produced brick-red colonies, indicating fermentation of the sugar. Specifically, the red color is due to the action of the acids produced by galactose fermentation on the bile salts and the indicator (neutral red) in the MacConkey medium.
Expression in mammalian cells
SEQ ID NO: 4 was also subcloned into COS-1 cells (ATCC CRL 1650). The cells were transfected, cultured and prepared. The lysates were assayed by a 14C-galactokinase assay as described in Stambolian et al. (Exp. Eye Res., 38: 231-237 (1984)), which is incorporated herein by reference in its entirety. When expressed in transiently transfected COS cells, galactokinase activity was ten-fold higher than control levels (6600 vs. 640 counts per minute; repeated three times). These results definitively confirm that SEQ ID NO: 4 encodes a complete, biologically active human galactokinase gene.
The nucleic acid molecule of the invention can also be subcloned into an expression vector to produce high levels of human galactokinase (either fused to another protein, e.g., operably linked at the 5' end to another coding sequence, or non-fused) in transfected cells. For mammalian cells, the expression vector would optionally encode a neomycin resistance gene, to select for transfectants on the basis of their ability to grow in G418, and a dihydrofolate reductase gene, which provides for amplification of the transfected gene in DHFR- cells. The plasmid could then be introduced into host cell lines, e.g., CHO ACC98, a non-adherent DHFR- cell line adapted to grow in serum-free medium, and human embryonic kidney 293 cells (ATCC CRL 1573), and transfected cell lines could then be selected by G418 resistance.
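The "ten-fold" figure reported above for the COS-cell assay follows directly from the two counts given (6600 vs. 640 counts per minute). A minimal sketch of that arithmetic, using a hypothetical helper name and assuming no background subtraction (none is described in the text):

```python
def fold_over_control(sample_cpm: float, control_cpm: float) -> float:
    """Ratio of sample counts to control counts (no background correction)."""
    if control_cpm <= 0:
        raise ValueError("control counts must be positive")
    return sample_cpm / control_cpm


if __name__ == "__main__":
    # Counts per minute quoted in the description.
    print(round(fold_over_control(6600, 640), 1))  # 10.3
```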
Human galactokinase gene - genomic sequence
A coding region of a full-length genomic galactokinase gene was identified from a human genomic library (made from placental tissue) in lambda phage (Lambda Fix II) using the galK cDNA as a probe. An isolated clone, designated clone 17, was deposited on May 3, 1995 with the American Type Culture Collection (ATCC), Rockville, MD, USA, under accession number ATCC 97135 and has been accepted as a patent deposit in accordance with the Budapest Treaty of 1977 governing the deposit of microorganisms for the purposes of patent procedure. The coding region of the genomic gene is divided into at least 8 exons, isolated from four DNA fragments. The arrangement is depicted in Figure 1. The DNA sequence was determined using multiple PCR primer oligonucleotides corresponding to the galK cDNA sequence (i.e., corresponding to the galK genomic exons), as well as subsequently designed PCR primer oligonucleotides corresponding to non-coding regions (i.e., galK genomic introns). The structure of the genomic galactokinase gene is summarized below in Table 1 (see also Fig. 2 and SEQ ID NO: 7):
TABLE 1
Galactokinase genomic exon #   Amino acids encoded   PCR primer # / [SEQ ID NO]
1                              1-55                  3333/[8], 3334/[9], 3598/[10], 3599/[11]
2                              56-118                1888/[12], 3332/[13], 3604/[14], 3605/[15]
3                              119-158               3331/[16], 3606/[17]
4                              159-204               1657/[18], 3034/[19]
5                              205-264               3330/[20], 3607/[21]
6                              265-315               1539/[22], 2665/[23]
7                              316-369               1891/[24], 2665/[25]
8                              370-392               2665/[26], 2666/[27], 2667/[28]
Galactokinase deficiency gene/marker
A fibroblast cell line (GM00334) derived from a patient with galactokinase deficiency was obtained from the Coriell Institute for Medical Research, 401 Haddon Avenue, Camden, New Jersey 08103. Total RNA was isolated from cultured cells using the RNAzol RNA isolation kit (Biotecx, Houston, TX). Reverse transcription of cytoplasmic RNA (1 µg) was performed with primer oligonucleotides 1823 [SEQ ID NO: 29] and 1825 [SEQ ID NO: 30]. The sample was amplified by 35 cycles of 94 °C for 1 minute, 50 °C for 1 minute and 72 °C for 7 minutes. The DNA product was purified electrophoretically, ligated into the TA cloning vector (Invitrogen) and sequenced. Twelve cDNAs were sequenced in total (representing PCR products cloned from multiple independent PCR reactions). This procedure was also repeated with fibroblasts cultured from normal controls (i.e., people who did not exhibit galactokinase deficiency). A comparison with normal controls identified a single base substitution of A for G at position 122 of the "normal" human galactokinase gene [SEQ ID NO: 4]. The result is a missense mutation at amino acid 32 from Val to Met [SEQ ID NO: 5]. The G to A change creates an MscI restriction endonuclease site (i.e., TGG CCA) in the mutant allele. This restriction site was then used to rapidly screen for the mutant allele in the parents of the patient with galactokinase deficiency. In essence, the exon encoding galactokinase residues 1 to 55 (i.e., exon 1; see Table 1) was cloned from a genomic lambda phage library and its DNA sequence determined, including a portion of the flanking intron sequences. Oligonucleotide primers (X2-50UT [SEQ ID NO: 31] and X2-30UT [SEQ ID NO: 32]) were designed to hybridize to the intron sequences for amplification of a 346 bp DNA fragment from genomic DNA.
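The relationship between nucleotide position 122, codon 32 and the new MscI site can be checked against the sequence listing. The sketch below is a hedged illustration: it assumes the coding sequence starts at nucleotide 29 (the CDS location given for SEQ ID NO: 4) and uses only the 12-nucleotide window spanning codons 30-33 as it appears in the listing; the helper names and the small codon table are introduced here for illustration.

```python
CDS_START = 29            # first base of the ATG in SEQ ID NO: 4 (CDS 29..1204)
WINDOW_START = 116        # nucleotide position of the first base of codon 30
NORMAL = "CTGGCCGTGTCA"   # codons 30-33 of SEQ ID NO: 4: CTG GCC GTG TCA
MSCI_SITE = "TGGCCA"      # MscI recognition sequence named in the description

# Only the codons that occur in this window are needed.
CODONS = {"CTG": "Leu", "GCC": "Ala", "GTG": "Val", "ATG": "Met", "TCA": "Ser"}


def codon_of(position: int) -> tuple[int, int]:
    """Return (codon number, base within codon), both 1-based."""
    offset = position - CDS_START
    return offset // 3 + 1, offset % 3 + 1


def substitute(window: str, position: int, base: str) -> str:
    """Replace the base at an absolute nucleotide position within the window."""
    i = position - WINDOW_START
    return window[:i] + base + window[i + 1:]


def translate(window: str) -> list[str]:
    """Translate an in-frame window using the small codon table above."""
    return [CODONS[window[i:i + 3]] for i in range(0, len(window), 3)]


MUTANT = substitute(NORMAL, 122, "A")

if __name__ == "__main__":
    print(codon_of(122))                                   # (32, 1)
    print(translate(NORMAL), translate(MUTANT))
    print(MSCI_SITE in NORMAL, MSCI_SITE in MUTANT)        # False True
```

Under these assumptions, position 122 is the first base of codon 32, the G to A change converts GTG (Val) to ATG (Met), and the sequence TGGCCA appears only in the mutant window.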
The point mutation in the PCR product was analyzed by RFLP, i.e., by the presence of the newly created MscI site as detected by 1.5% agarose gel electrophoresis. A "normal" allele remains uncut by the MscI enzyme and thus migrates as a 346 bp fragment on an agarose gel. The PCR product from the patient with galactokinase deficiency (i.e., carrying the G to A change) is cleaved by MscI, resulting in two fragments of 193 and 153 bp, respectively. The absence of the 346 bp fragment indicates that the patient is homozygous for this allele. In contrast, the PCR products from the parents of this patient, following MscI digestion, yielded three fragments (346, 193 and 153 bp), which is consistent with a heterozygous pattern for the G to A change. That is, both parents were carriers of the same mutation.
To determine whether the missense mutation results in decreased enzyme activity, a cDNA clone containing the G to A change was subcloned into COS cells and galactokinase activity was assayed as previously described. COS cells transfected with cDNA encoding the missense mutation have the same level of galactokinase activity as host COS cells, namely 0.02 units/µg protein. In contrast, COS cells transfected with the non-mutant galactokinase cDNA [SEQ ID NO: 4] have fifty-fold higher activity than host COS cells (i.e., control). These results support the Val32 to Met32 substitution as the cause of the decreased enzymatic activity.
Another mutation was discovered in an unrelated patient who had cataracts and was diagnosed with galactokinase deficiency (galactokinase activity was found to be near zero). Genomic DNA was isolated from lymphoblastoid cell lines and sequenced by automated sequencing on an ABI 373A sequencer. This revealed a single substitution of T for G, creating an in-frame nonsense codon (i.e., TAG) at amino acid position 80 [SEQ ID NO: 6].
This mutation causes premature termination of human galactokinase, resulting in a truncated protein of 79 amino acids that would be expected to be non-functional. (The genomic DNA of this patient's parents was heterozygous for this mutation; the parents therefore did not have galactokinase deficiency.)
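The MscI RFLP readout for the Val32 to Met32 allele described above (346 bp uncut; 193 and 153 bp after cleavage) lends itself to a small genotype-calling sketch. The function name and the 5 bp sizing tolerance are assumptions made for illustration; the text specifies only the fragment sizes and their interpretation.

```python
UNCUT_BP = 346          # amplicon from a "normal" allele (no MscI site)
CUT_BP = (193, 153)     # fragments from the mutant allele after MscI digestion


def call_genotype(bands, tolerance: int = 5) -> str:
    """Interpret observed band sizes (bp) from an MscI digest of the amplicon.

    tolerance -- assumed gel sizing error in bp (not from the description)
    """
    def seen(size: int) -> bool:
        return any(abs(band - size) <= tolerance for band in bands)

    has_uncut = seen(UNCUT_BP)
    has_cut = all(seen(size) for size in CUT_BP)
    if has_uncut and has_cut:
        return "heterozygous carrier"
    if has_cut:
        return "homozygous mutant"
    if has_uncut:
        return "homozygous normal"
    return "uninterpretable"


if __name__ == "__main__":
    print(call_genotype([346]))             # normal control
    print(call_genotype([193, 153]))        # affected patient
    print(call_genotype([346, 193, 153]))   # each parent
```

The two cleavage products sum to the uncut amplicon length (193 + 153 = 346), which is a quick consistency check on any reported band pattern.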
The above description and examples fully disclose the invention, including preferred embodiments thereof. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the present invention. Such equivalents are within the scope of the following claims.
SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: Bergsma, Derk J.; Stambolian, Dwight
(ii) TITLE OF THE INVENTION: Human Galactokinase Gene
(iii) NUMBER OF SEQUENCES: 32
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: SmithKline Beecham Corp. / Corporate Intellectual Property
(B) STREET: 709 Swedeland Road / UW2220
(C) CITY: King of Prussia
(D) STATE: Pennsylvania
(E) COUNTRY: UNITED STATES OF AMERICA
(F) POSTAL CODE: 19406-0939
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
His Ile Gln Glu His Tyr Gly Gly Thr Ala Thr Phe Tyr
1               5                   10
Leu Ser Gln Ala Ala Asp Gly Ala Lys
    15                  20
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
Ala Gln Val Cys Gln Gln Ala Glu His Ser Phe Ala Gly
1               5                   10
Met Pro Cys Gly Ile Met Asp Gln Phe Ile Ser Leu Met
    15                  20                  25
Gly Gln Lys
(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1349 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 29..1204
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
GAATTCGGCA CGAGTGCAGG CGCGCGTC ATG GCT GCT TTG AGA CAG CCC CAG  52
                               Met Ala Ala Leu Arg Gln Pro Gln
                               1               5
GTC GCG GAG CTG CTG GCC GAG GCC CGG CGA GCC TTC CGG GAG GTC TTC  100
Val Ala Glu Leu Leu Ala Glu Ala Arg Arg Ala Phe Arg Glu Glu Phe
    10                  15                  20
GGG GCC GAG CCC GAG CTG GCC GTG TCA GCG CCG GGC CGC GTC AAC CTC  148
Gly Ala Glu Pro Glu Leu Ala Val Ser Ala Pro Gly Arg Val Asn Leu
25                  30                  35                  40
ATC GGG GAA CAC ACG GAC TAC AAC CAG GGC CTG GTG CTG CCT ATG GCT  196
Ile Gly Glu His Thr Asp Tyr Asn Gln Gly Leu Val Leu Pro Met Ala
                45                  50                  55
CTG GAG CTC ATG ACG GTG CTG GTG GGC AGC CCC CGC AAG GAT GGG CTG  244
Leu Glu Leu Met Thr Val Leu Val Gly Ser Pro Arg Lys Asp Gly Leu
            60                  65                  70
GTG TCT CTC CTC ACC ACC TCT GAG GGT GCC GAT GAG CCC CAG CGG CTG  292
Val Ser Leu Leu Thr Thr Ser Glu Gly Ala Asp Glu Pro Gln Arg Leu
        75                  80                  85
CAG TTT CCA CTG CCC ACA GCC CAG CGC TCG CTG GAG CCT GGG ACT CCT  340
Gln Phe Pro Leu Pro Thr Ala Gln Arg Ser Leu Glu Pro Gly Thr Pro
    90                  95                  100
CGG TGG GCC AAC TAT GTC AAG GGA GTG ATT CAG TAC TAC CCA GCT GCC  388
Arg Trp Ala Asn Tyr Val Lys Gly Val Ile Gln Tyr Tyr Pro Ala Ala
105                 110                 115                 120
CCC CTC CCT GGC TTC AGT GCA GTG GTG GTC AGC TCA GTG CCC CTG GGG 436 Pro Leu Pro Gly Phe? Er Wing Val Val Val Ser Ser Val Pro Leu Gly 125 130 135
GGT GGC CTG TCC AGC TCA GCA TCC TTG GAA GTG GCC ACG TAC ACC TTC 484 Gly Gly Leu Ser Ser Wing Ser Leu Glu Val Wing Thr Tyr Thr Phe 140 145 150
CTC CAG CAG CTC TGT CCA GAC TCG GGC ACÁ ATA GCT GCC CGC GCC CAG 532 Leu Gln Gln Leu Cys Pro Asp Ser Gly Thr lie Ala Ala Arg Ala Gln 155 160 165
GTG TGT CAG CAG GCC GAG CAC AGC TTC GCA GGG ATG CCC TGT GGC ATC 580 Val Cys Gln Gln Wing Glu His Ser Phe Wing Gly Met Pro Cys Gly He 170 175 180
ATG GAC CAG TTC ATC TCA CTT ATG GGA CAG AAA GGC CAC GCG CTG CTC
628 Met Asp Gln Phe He Ser Leu Met Gly Glr. Lys Gly His Ala Leu Leu
185 190 195 200 ATT GAC TGC AGG TCC TTG GAG ACC AGC CTG GTG CCA CTC TCG GAC CCC 676 He Asp Cys Arg Ser Leu Glu Thr Ser Leu Val Pro Leu Ser Asp Pro 205 210 215
AAG CTG GCC GTG CTC ATC ACC AAC TCT AAT GTC CGC CAC TCC CTG GCC 724 Lys Leu Wing Val Leu He Thr Asn Ser Asn to Arg His Ser Leu Wing
220 225 230
TCC AGC GAG TAC CCT GTG CGG CGG CGC CAA TGT GAA GAA GTG GCC CGG 772 Ser Ser Glu Tyr Pro Val Arg Arg Arg Gln Cys Glu Giu Val Ala Arg 235 240 245
GCG CTG GGC AAG GAA AGC CTC CGG GAG GTA CAG CTG GAA GAG CTA GAG 820 Wing Leu Gly Lys Glu Ser Leu Arg Glu Val Gln Leu Giu Glu Leu Glu
250 255 260
GCT GCC AGG GAC CTG GTG AGC AAA GAG GGC TTC CGG CGG GCC CGG CAC
868 Ala Ala Arg Asp Leu Val Ser Lys Glu Gly Phe Arg Arg Ala Arg His
265 270 275 280
GTG GTG GGG GAG ATT CGG CGC ACG GCC CAG GCA CC3 GCC GCC CTG AGA 916 Val Vile Gly Glu He Arg Arg Thr Ala Gin Ala Ala Ala Ala Ala Leu Arg 285 290 295
CGT GGC GAC TAC AG GCC TTT GGC CGC CTC ATG GTG GAG AGC CAC CGC 964 Arg Gly Asp Tyr Arg Wing Phe Gly Arg Leu Met Val Glu Ser His Arg 300 305 310
TCA CTC AGA GAC GAC TAT GAG GTG AGC TGC CCA GAG CTG GAC CAG CTG 1012 Ser Leu Arg Asp Asp Tyr Glu Val Ser Cys Pro Glu Leu Asp Gln Leu 315 320 325 GTG GAG GCT GCG CTT GCT GTG CCT GGG G7T TAT GGC AGC CGC ATG ACG 1060 Val Glu Ala Ala Leu Ala Val Pro Gly Val Tyr Gly Ser Arg Met Thr 333 - 335 340
GGC GGT GGC TTC GGT GGC TGG ACG GTG CT CTG GAG GCC TCC GCT 1108 Gly Gly Gly Phe Gly Gly Cys Thr Val Thr Leu Leu Glu Wing Be Ala
345 350 355 360
GCT CCC CAC GCC ATG CGG CAC ATC CAG GAG CAC TAC GGC GGG ACT GCC 1156 Ala Pro His Ala Met Arg Kis He Gln Glu His Tyr Giy Gly Thr Ala 365 370 375
ACC TTC TAC CTC TCT CAA GCC GCC GAT GGA GCC AAG GTG CTG TGC TTG 1204 Thr Phe Tyr Leu Ser Gin Wing Wing Asp Gly Wing Lys Val Leu Cys Leu 380 385 390
TGAGGGACCC CCAGGACAGC ACACGGTGAG GGTGCGGGGC CTGCAGGCCA GTCCCACGGC 1264
TCTGTGCCCG GTGCCATCTT CCATATCCGG GTGCTCAATA AACTTGTGCC TCCAATGTGG 1324
AAAAAAAAAA AAAAAAAAAC TCGAG 1349
(2) INFORMATION FOR SEO ID NO: 5: (i) CHARACTERISTICS OF THE SEQUENCE: (A) LENGTH: 1349 base pairs (B) TYPE: nucleic acid (C) TYPE OF CHAIN: double (D) TOPOLOGY: linear ( ii) TYPE OF MOLECULE: cDNA (i) CHARACTERISTIC: (A) NAME / KEY: CDS (3) LOCATION: 29..1204 (xí.) SEQUENCE DESCRIPTION: SEO ID NO: 5: GAATTCGGCA CGAGTGCAGG CGCGCGTC ATG GCT GCT TTG AGA CAG CCC CAG 52 Met Ala Ala Leu Arg Gln Pro Gln 1 5
GTC GCG GAG CTG CTG GCC GAG GCC CGG CGA GCC TTC CGG GAG GTC TTC 100 Val Ala Glu Leu Leu Ala Giu Ala Arg Arg Ala Phe Arg Glu Giu Phe 10 15 20
GGG GCC GAG CCC GAG CTG GCC ATG TCA GCG CCG GGC CGC GTC AAC CTC 148 Gly Wing Glu Pro Glu Leu Wing Met Ser Wing Pro Gly Arg Val Asn Leu
2 30 35 40
ATC GGG GAA CAC ACG GAC TAC AAC CAG GGC CTG GTG CTG CCT ATG GCT 196
Ile Gly Glu His Thr Asp Tyr Asn Gln Gly Leu Val Leu Pro Met Ala
45 50 55
CTG GAG CTC ATG ACG GTG CTG GTG GGC AGC CCC CGC AAG GAT GGG CTG 244
Leu Glu Leu Met Thr Val Leu Val Gly Ser Pro Arg Lys Asp Gly Leu
60 65 70
GTG TCT CTC CTC ACC ACC TCT GAG GGT GCC GAT GAG CCC CAG CGG CTG 292
Val Ser Leu Leu Thr Thr Ser Glu Gly Ala Asp Glu Pro Gln Arg Leu
75 80 85
CAG TTT CCA CTG CCC ACA GCC CAG CGC TCG CTG GAG CCT GGG ACT CCT 340
Gln Phe Pro Leu Pro Thr Ala Gln Arg Ser Leu Glu Pro Gly Thr Pro
90 95 100
CGG TGG GCC AAC TAT GTC AAG GGA GTG ATT CAG TAC TAC CCA GCT GCC 388
Arg Trp Ala Asn Tyr Val Lys Gly Val Ile Gln Tyr Tyr Pro Ala Ala
105 110 115 120
CCC CTC CCT GGC TTC AGT GCA GTG GTG GTC AGC TCA GTG CCC CTG GGG 436
Pro Leu Pro Gly Phe Ser Ala Val Val Val Ser Ser Val Pro Leu Gly
125 130 135
GGT GGC CTG TCC AGC TCA GCA TCC TTG GAA GTG GCC ACG TAC ACC TTC 484
Gly Gly Leu Ser Ser Ser Ala Ser Leu Glu Val Ala Thr Tyr Thr Phe
140 145 150
CTC CAG CAG CTC TGT CCA GAC TCG GGC ACA ATA GCT GCC CGC GCC CAG 532
Leu Gln Gln Leu Cys Pro Asp Ser Gly Thr Ile Ala Ala Arg Ala Gln
155 160 165
GTG TGT CAG CAG GCC GAG CAC AGC TTC GCA GGG ATG CCC TGT GGC ATC 580
Val Cys Gln Gln Ala Glu His Ser Phe Ala Gly Met Pro Cys Gly Ile
170 175 180
ATG GAC CAG TTC ATC TCA CTT ATG GGA CAG AAA GGC CAC GCG CTG CTC 628
Met Asp Gln Phe Ile Ser Leu Met Gly Gln Lys Gly His Ala Leu Leu
185 190 195 200
ATT GAC TGC AGG TCC TTG GAG ACC AGC CTG GTG CCA CTC TCG GAC CCC 676
Ile Asp Cys Arg Ser Leu Glu Thr Ser Leu Val Pro Leu Ser Asp Pro
205 210 215
AAG CTG GCC GTG CTC ATC ACC AAC TCT AAT GTC CGC CAC TCC CTG GCC 724
Lys Leu Ala Val Leu Ile Thr Asn Ser Asn Val Arg His Ser Leu Ala
220 225 230
TCC AGC GAG TAC CCT GTG CGG CGG CGC CAA TGT GAA GAA GTG GCC CGG 772
Ser Ser Glu Tyr Pro Val Arg Arg Arg Gln Cys Glu Glu Val Ala Arg
235 240 245
GCG CTG GGC AAG GAA AGC CTC CGG GAG GTA CAG CTG GAA GAG CTA GAG 820
Ala Leu Gly Lys Glu Ser Leu Arg Glu Val Gln Leu Glu Glu Leu Glu
250 255 260
GCT GCC AGG GAC CTG GTG AGC AAA GAG GGC TTC CGG CGG GCC CGG CAC 868
Ala Ala Arg Asp Leu Val Ser Lys Glu Gly Phe Arg Arg Ala Arg His
265 270 275 280
GTG GTG GGG GAG ATT CGG CGC ACG GCC CAG GCA GCG GCC GCC CTG AGA 916
Val Val Gly Glu Ile Arg Arg Thr Ala Gln Ala Ala Ala Ala Leu Arg
285 290 295
CGT GGC GAC TAC AGA GCC TTT GGC CGC CTC ATG GTG GAG AGC CAC CGC 964
Arg Gly Asp Tyr Arg Ala Phe Gly Arg Leu Met Val Glu Ser His Arg
300 305 310
TCA CTC AGA GAC GAC TAT GAG GTG AGC TGC CCA GAG CTG GAC CAG CTG 1012
Ser Leu Arg Asp Asp Tyr Glu Val Ser Cys Pro Glu Leu Asp Gln Leu
315 320 325
GTG GAG GCT GCG CTT GCT GTG CCT GGG GTT TAT GGC AGC CGC ATG ACG 1060
Val Glu Ala Ala Leu Ala Val Pro Gly Val Tyr Gly Ser Arg Met Thr
330 335 340
GGC GGT GGC TTC GGT GGC TGC ACG GTG ACA CTG CTG GAG GCC TCC GCT 1108
Gly Gly Gly Phe Gly Gly Cys Thr Val Thr Leu Leu Glu Ala Ser Ala
345 350 355 360
GCT CCC CAC GCC ATG CGG CAC ATC CAG GAG CAC TAC GGC GGG ACT GCC 1156
Ala Pro His Ala Met Arg His Ile Gln Glu His Tyr Gly Gly Thr Ala
365 370 375
ACC TTC TAC CTC TCT CAA GCC GCC GAT GGA GCC AAG GTG CTG TGC TTG 1204
Thr Phe Tyr Leu Ser Gln Ala Ala Asp Gly Ala Lys Val Leu Cys Leu
380 385 390
TGAGGCACCC CCAGGACAGC ACACGGTGAG GGTGCGGGGC CTGCAGGCCA GTCCCACGGC 1264
TCTGTGCCCG GTGCCATCTT CCATATCCGG GTGCTCAATA AACTTGTGCC TCCAATGTGG 1324
AAAAAAAAAA AAAAAAAAAC TCGAG 1349
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1349 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 29..265
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GAATTCGGCA CGAGTGCAGG CGCGCGTC ATG GCT GCT TTG AGA CAG CCC CAG 52
Met Ala Ala Leu Arg Gln Pro Gln
1 5
GTC GCG GAG CTG CTG GCC GAG GCC CGG CGA GCC TTC CGG GAG GAG TTC 100
Val Ala Glu Leu Leu Ala Glu Ala Arg Arg Ala Phe Arg Glu Glu Phe
10 15 20
GGG GCC GAG CCC GAG CTG GCC GTG TCA GCG CCG GGC CGC GTC AAC CTC 148
Gly Ala Glu Pro Glu Leu Ala Val Ser Ala Pro Gly Arg Val Asn Leu
25 30 35 40
ATC GGG GAA CAC ACG GAC TAC AAC CAG GGC CTG GTG CTG CCT ATG GCT 196
Ile Gly Glu His Thr Asp Tyr Asn Gln Gly Leu Val Leu Pro Met Ala
45 50 55
CTG GAG CTC ATG ACG GTG CTG GTG GGC AGC CCC CGC AAG GAT GGG CTG 244
Leu Glu Leu Met Thr Val Leu Val Gly Ser Pro Arg Lys Asp Gly Leu
60 65 70
GTG TCT CTC CTC ACC ACC TCT TAGGGTGCCG ATGAGCCCCA GCGGCTGCAG 295
Val Ser Leu Leu Thr Thr Ser
75
TTTCCACTGC CCACAGCCCA GCGCTCGCTG GAGCCTGGGA CTCCTCGGTG GGCCAACTAT 355
GTCAAGGGAG TGATTCAGTA CTACCCAGCT GCCCCCCTCC CTGGCTTCAG TGCAGTGGTG 415
GTCAGCTCAG TGCCCCTGGG GGGTGGCCTG TCCAGCTCAG CATCCTTGGA AGTGGCCACG 475
TACACCTTCC TCCAGCAGCT CTGTCCAGAC TCGGGCACAA TAGCTGCCCG CGCCCAGGTG 535
TGTCAGCAGG CCGAGCACAG CTTCGCAGGG ATGCCCTGTG GCATCATGGA CCAGTTCATC 595
TCACTTATGG GACAGAAAGG CCACGCGCTG CTCATTGACT GCAGGTCCTT GGAGACCAGC 655
CTGGTGCCAC TCTCGGACCC CAAGCTGGCC GTGCTCATCA CCAACTCTAA TGTCCGCCAC 715
TCCCTGGCCT CCAGCGAGTA CCCTGTGCGG CGGCGCCAAT GTGAAGAAGT GGCCCGGGCG 775
CTGGGCAAGG AAAGCCTCCG GGAGGTACAA CTGGAAGAGC TAGAGGCTGC CAGGGACCTG 835
GTGAGCAAAG AGGGCTTCCG GCGGGCCCGG CACGTGGTGG GGGAGATTCG GCGCACGGCC 895
CAGGCAGCGG CCGCCCTGAG ACGTGGCGAC TACAGAGCCT TTGGCCGCCT CATGGTGGAG 955
AGCCACCGCT CACTCAGAGA CGACTATGAG GTGAGCTGCC CAGAGCTGGA CCAGCTGGTG 1015
GAGGCTGCGC TTGCTGTGCC TGGGGTTTAT GGCAGCCGCA TGACGGGCGG TGGCTTCGGT 1075
GGCTGCACGG TGACACTGCT GGAGGCCTCC GCTGCTCCCC ACGCCATGCG GCACATCCAG 1135
GAGCACTACG GCGGGACTGC CACCTTCTAC CTCTCTCAAG CAGCCGATGG AGCCAAGGTG 1195
CTGTGCTTGT GAGGCACCCC CAGGACAGCA CACGGTGAGG GTGCGGGGCC TGCAGGCCAG 1255
TCCCACGGCT CTGTGCCCGG TGCCATCTTC CATATCCGGG TGCTCAATAA ACTTGTGCCT 1315
CCAATGTGGA AAAAAAAAAA AAAAAAAACT CGAG 1349
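The listing above annotates its CDS as 29..265, whereas the preceding 1349-base cDNA listing (SEQ ID NO: 5 by its position in the sequence listing) is read through to nucleotide 1204. A minimal Python sketch of how the point of divergence could be located is given below. The function name `codon_differences` and the two 27-nucleotide excerpts (nucleotides 245 to 271 as printed in the two listings) are illustrative choices, not part of the patent text, and the sketch assumes each excerpt contains whole codons of the annotated reading frame.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

CDS_START = 29  # first CDS nucleotide, taken from the FEATURE annotation


def codon_differences(ref, alt, first_pos):
    """Return a list of (position, ref_base, alt_base, codon_no, ref_aa, alt_aa).

    ref and alt are equal-length excerpts; first_pos is the 1-based listing
    coordinate of their first nucleotide. Hypothetical helper for illustration.
    """
    if len(ref) != len(alt):
        raise ValueError("excerpts must have equal length")
    found = []
    for i, (r, a) in enumerate(zip(ref, alt)):
        if r == a:
            continue
        pos = first_pos + i
        codon_index = (pos - CDS_START) // 3            # 0-based codon index
        start = CDS_START + 3 * codon_index - first_pos  # offset in excerpt
        found.append((pos, r, a, codon_index + 1,
                      CODON_TABLE[ref[start:start + 3]],
                      CODON_TABLE[alt[start:start + 3]]))
    return found


# Nucleotides 245-271 as printed in the two cDNA listings.
SEQ5_EXCERPT = "GTGTCTCTCCTCACCACCTCTGAGGGT"
SEQ6_EXCERPT = "GTGTCTCTCCTCACCACCTCTTAGGGT"

differences = codon_differences(SEQ5_EXCERPT, SEQ6_EXCERPT, 245)
for diff in differences:
    print(diff)
```

Under these assumptions the single reported difference is a G to T change at nucleotide 266, which turns codon 80 from GAG (Glu) into the stop codon TAG. This agrees with the CDS ending at nucleotide 265 in the listing above.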
(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 7676 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
CCGAGCATCC CGCGCCGACG GGTCTGTGCC GGAGCAGCTG TGCAGAGCTG CAGGCGCGCG 60
TCATGGCTGC TTTGAGACAG CCCCAGGTCG CGGAGCTGCT GGCCGAGGCC CGGCGAGCCT 120
TCCGGGAGGA GTTCGGGGCC GAGCCCGAGC TGGCCGTGTC AGCGCCGGGC CGCGTCAACC 180
TCATCGGGGA ACACACGGAC TACAACCAGG GCCTGGTGCT GCCTATGGTG AGGGGCTGCA 240
CGGCiGAGCCC CTAGCCCGCC GCCGCCTGTC CCGGTCGCCG AGGAGGGCGG GCCTCGGGGA 300
CGCTGGGGGC GAGTTCTTCC CGCGGGAGAT GTGGGGCGGG CAGCTGCGCC TGGAGCACCG 360
GTGCACGGAA GAGTCCCCGG GACAGGCTGT TCCCCACGTT GGAAGGGAGG AAGCGAAGAA 420
GTGGTCCCCA GAGGGTGCGC GGCCGCCTCT TGGCTCAAGC CCGCCCTCTG GGGGCTGGGG 480
CTCCTCGCCT TCAACCTGGG AGCATGTTCC CCTTAAACTG TGAGGCCCTG TGTGCCACGC 540
AGAAGGGGAC ACTCCGCGCC TCCGGCCACC GTGGGGCCCC AACCGCAGAC TGGGCGAAC 600
GTAGCCTTCT GGCCCAGCCC GTTCAATTTA CAGAGGAGGA AACTGAGGCC TAGAGAGGCC 660
CAGTGAACTG CTGGAGGTCA C: "CAGGT TCTTGGCGGG GCTGCGACTT GGGAGTGAGG 720
ACTCCCAGCT TTCAGCGGGG GGCGCTTTCC GCCCCATCTG CAGCTTGGGG AGTGCACAGG 780
TACAGGATGT CCAGAGCCAC CCAAAATGTA AAGGCTTTGG AGCTCCAGTG ATCTGTTTTC 840
CCTTTGGGCT ÍAGCTCTCCC CCCTTGCCCC ACAGCTCAGG GCAGAGTCCA GGTCTGTGCT 900
CCAGCTGCAG CCGCCCCGCC CCTGAAGACC TAAGGGGGCA GGGCTCAAGC CCCCAAGGTC 960
AGCTGGCCCT CAGGATCTTC CCTGCGACGC TGAACCTGGA GGTTCAGAAC CTGATGACTG 1020
TGGAGGCATC AGAACCTCGG CTGGAGGCAG TGTCATTGGA GAGGCTTACT CCAGCTGGCG 1080
GAAGCCTCAC GTACTGCTTG TCTCTCCTGC CAGGCTCTGG AGCTCATGAC GGTGCTGGTG 1140
GGCAGCCCCC GCAAGGATGG GCTGGTGTCT CTCCTCACCA CCTCTGAGGG TGCCGATGAG 1200
CCCCAGCGGC TGCAGTTTCC ACTGCCCACA GCCCAGCGCT CGCTGGAGCC TGGGACTCCT 1260
CGGTGGGCCA ACTATGTCAA GGGAGTGATT CAGTACTACC CAGGTATGGG GCCCAGGCCT 1320
GAGCCAAGTC CTCACTGATA CTAGGAGTGC CACCTCACAG CCACAGAGCC CATTCATTTG 1380
TCTGATACAC TGTGGGGAAG GCTTGTAGAG TGGAGCATCC CATTGTACAG ATGAGGAAAC 1440
TGATGCCCCC: AGAAGGTCGG GAACTTGCCC TGGGTTTCCC GTGACCTGAT TGGAGGAGCC 1500
AGGATTTGAA CCCCAGCCTT TTTTCCCTCC AGAGCCCTAA ACCAGGAGGA CAATTAGAAG 1560
TGTCCCAGCA ACCTCAGAGG GTGGGAAAAT GGAGGGGAGT GGGTCCCTTG GGCCAGCAGG 1620
TTGGTGGGGT TCTTGACAAT TGAGA CAC ACCTAGAAAC AGTTGCTAGG CCGTTGCTGC 1680
CCTTCCCGCC AGGACACCTG CCCTTCCTGT CCAATCCTCC CAGGCAGCCT CTCTTACCAT 1740
CACCTGTTCT TTCCCCCTGC AGCTGCCCCC CTCCCTGGCT TCAGTGCAGT GGTGGTCAGC 1800
TCAGTGCCCC GGGGGGGTGG CCTGTCCAGC TCAGCATCCT TGGAAGTGGC CACGTACACC 1860
TTCCTCCAGC AGCTCTGTCC AGGTACCAGC TAGGCCCCAG CCCTGACCCA GCCCTCCTTC 1920
CCTGAGGTCT CCAGGTGGTC CCAGCTTCTA CTATGCCTTA TGGAGGGGGT GGCAGGGAAT 1980
CTCCCTGGAG TGTCATTGAA GCCACTGCTG CTTCCACCAG CCCTAGCCTC CCCACCTCAC 2040
CCTGTACTGC AGACTCGGGC ACAATAGCTG CCCGCGCCCA GGTGTGTCAG CAGGCCGAGC 2100
ACAGCTTCGC AGGGATGCCC TGTGGCATCA TGGACCAGTT CATCTCACTT ATGGGACAGA 2160
AAGGCCACGC GCTGCTCATT GACTGCAGGT TGGGCTCGCT CCCCTCGTCC CCTCCCGCCC 2220
TGCACTCAGC AGCTCCTGGG TGGAGTGTGC CCACTGCCTG GCGCAGCAAG CACACGCTTG 2280
GCCTCGTCAT CTCCCCCATT GTAACTCCAC CCCAGGTCCT TGGAGACCAG CCTGGTGCCA 2340
CTCTCGGACC CCAAGCTGGC CGTGCTCATC ACCAACTCTA ATGTCCGCCA CTCCCTGGCC 2400
TCCAGCGAGT ACCCTGTGCG GCGGCGCCAA TGTGAAGAAG TGGCCCGGGC GCTGGGCAAG 2460
GAAAGCCTCC GGGAGGTACA ACTGGAAGAG CTAGAGGGTG AGAACTGCCA GGGTGCTCTA 2520
TCCTGGACSGC GGCTGTGCTC CCTGCTGGCG CCTCAGTGTG GCCTTGACCC TGCCTGGGAC 2580
AGGGGCTTCT GCCATGCTCT CCCCAGTCCC TTCAAACACT GCGCACCCAG 2640
GGTTCCAATC TCAGCAGGGG TGCTTGAAAT CCTAAAATGG TCTTATCTAÁ "TCAGAAAAAT 2700
CATGTTTCCA TTGTGGAAAA TGTAGAAAAG TACAAAGTAG AAAATAATAA GCTATAAGGG 2760
CACTACCCAG AGATAGGCAC TGCTGACATT TTCACGTTTC CTTTCAGTAT TTTTCCACAT 2820
CTGTCTTCAA AGCTGAGTAT ATGTAATATA CATCACTTT CCCCCCCCAC CCCCTTTTTT 2880
TTAAGAGGCA GGGTCTCATT CTGTTGCCCA AGCTGGAGTG TAGTGGTGTG ATCATAGCTT 2940
ACTGCAAACT TGAACTCTTG AGCTCAAGGG ATCCTCCCAG CTCAGCCTTC CAAGTAGCTG 3000
AGATTACAGG TGTGCCACCA TGCCCGGCTA ATTTTTATCT TCGTAAAGAC GGCCTTGTAG 3060
TGTTGCCCAG GATGATCCTG AACTCTGGCC TCAAGAGGTC CTCCTGCCTT GGGCTCCCAA 3120
AGTGTTGGGA TTATAGGCAT GAGCCACTGC GGCCAGCCCA TTTGCCGTGT TTTTTTTTTG 3180
GACACAGAGT TTCGGTCTTG TCACCCATGC TGGAGTGCAA TGGTGCGATC TCAGCTCACT 3240
GTAACCrCTG CCTCCCGGGT TCAAGTGATT CTCCTGCCTC AGCCTCCCGA GTAGCTGGGA 3300
CTACAG3CGC CCGCCACTAC GCCTGGCACA TTTTTTATAG TTCTAGTAGA GACTGGGGTT 3360
TCACCATGTT GGCCAGGCTG GTCTCAAACG CCTGACCTCA GGTGATCCTC CCGCCTCAGC 3420
CTTCCA? AGT GCTGGGATTA CAGGCGTGAG CCATAGTGCC GGTCTCTTTT TTTTTTTTTT 3480
TTAAACTAAA CATAATCTCA GAACCCAGAA CCCTATCTTA TCTTATGCCA TGAAAGGCAT 3540
ATCTCGGCGT GGCTCTTTTT TTTTTTTTTTT CTTTTTTTTTT GGGCGAGGTG GAGGCTTGCC 3600
CTGTTGCCCA GGCTGGAGTG CAGCGGCGCA ATCTCGGTTC ACTGCATCCT CCACCTCCTG 3660
GGTCCAAATG ATCCTCCTGC CTTAGCTTCC TGAGTAGGTG GGATTACTGG AACCCACCAC 3720
CACGCCCAGC CAATTTTTAT ATTTTTAGTA GAGACGGGGT TTCATGTTGG CCAGGCTGGC 3780
CTCGA? CTCC GACCTCGTG ATCTGCCCGC CTCAGCCTCC CAATGTGCTA GGATTACATG 3840
TGTGAGCCAC TGCACCTGGC CTCCGTGTGG CTCTTTAAAG CTCCACAATA TTTTAGCATT 3900
CAGGTGCTCT GTCATTTACT TAACTATTTT CTGATACACC TCACACTGCG ATTAACTTTC 3960
CTTATTTATC TTTTTTATTA TTTATTTATT TATTTATTTG AGACAGAGTC TTGCTCTGTC 4020
ACCCAGGCTG GAGTGCAGTG GCACGATCTC GGCTCACTGC AACCTCTGCC TCCCAGGTTC 4080
AAGTGATGCT CCTGCCTCAG CCTCCTGAGT AGCTAGGATT AGAGGCATGT GCCACCACAC 4140
CTGGCTA? TC TTCGTATTTT TAGCAGAGAT GAGGTTTTAC CATGTTGGTC GGGCTGGTCG 4200
TGAACTCCTG ACCTGGTGAT CTGCCCACCT CAGCCTCCCA AAGTACTGGG ATGACAGGCA 4260
TGAACCACTG TGCCTGGCCA TCTTTTTTTAT TTTTTAAAGA GATGGGTTCT GCTAAGTTGC 4320
CCAGGCTGGA CCTGAACTCT TGGGCTCAAG TAATCTTCTC ACCTAGTCTC CTGGGTAGCT 4380
GCAACCAAAG GCACCCGGTT TATCTGCATT CTCTTTTTTT TCTTTGAGAC TGAGTCTTGC 4440
TCTGTAGCCC AGGCTGGAGC GCAGTGGCGT GATCTCGGCT CACTGCAACC TCCGTCTTCA 4500
GGGTTCAAGC AATTCTCCTG CCTCAGCCTC TGGAGTGGCT GGGACTACAG GCGTGTGCCA 4560
CCAGAGCGAG TTAATTTTTT TTTTTTTTTG TATTTTTAGT GGACACTGGG TTTCACTATA 4620
TTGGCCAGGC TGGTCTTGGA CTCCTGACCT CAAGTGATCC GCCTGCCTTG GCCTCCCAAA 4680
GTGCTGGGAT TACAGGCACA GGCGTGAGCC ACTACACCTG GCCTATCTGC ATTCTCTTAA 4740
TAGTTTCTTA GAAATGGATT CTTAGGAGTA GGATTACAGA GTCAAGAGAC ACAAGTTTTG 4800
TAGGCTGGGT GCGGTGGCTC ACGTCTGTGC CTGTAATCCC AGTACTTTAG GAGGCCAAGG 4860
TGGGCAGATT CATTGAGCTC AGGAATTCGA GACCAGCCTG GGCAACATGG CAAAACCCCA 4920
TCTCTAAAGA AATACAAAAA TTAGCCAGGT GTGGTGGTGT GTGCCTGTAG TCCTAGCTAC 4980
TTAGGAGGCT GGGGTGGGAG GATCAATTGA GCCCAGGAGG TTGAGACTGC AGTGAGCTGT 5040
GATTGCACCA TGGCACTCCA GCCTGGGCCT CAAAGTGAGA TCCTGTCTCC AAAACAAAAA 5100
AGATACAAGT ATCCTTAAGG CTCCTGCTAC ACATGGCCAG GAAGGTAGTC TATTGGACAG 5160
TTTTAAGGTC ATTATCAATA TTAGCTCATT TAATTCCCTC CAAAACTCTG TAAAGCACAT 5220
TCTGCTACCA TAGTTGTCAT ATTTTTGATG GGGGAATCTA CAGTGAGAGG CAGTGCTGGG 5280
ATCTGAACCC CATCTGGACA GATTAGCTCC AGGGCCCATG CTCTTGACTG GCTGGCCGCG 5340
CTGCCCACAC TGAGTTGTTC CTTCCTGGCA GGGTAGGTGT GCCTATCTCA GGGACACTAG 5400
ACAGCTCCGA GGGACCTCCC TGTCCTTTTC CTTTGTGAAC TGTGTCACGT TCTCCAGAGC 5460
AGGGCTCAGA CCTGCCCTGC CTGCTCTGTG CAGATGCCCT TGGCCAAGGT TTTCACACTG 5520
GAAC / AGTTG GTCCCTCCTC CCCACCCCAG CCTGTCCTTG GCCCTCCTCC AGGTCTCCTT 5580
CTGCATAGGA GCAGCTCACC CTGCCTCCTC CAGAGTCCTG CCCTAGAAGC GCAATCCCTC 5640
TCCTTCCATC CCCTGCCTGG CTGCCTGGCT CCTTCCCTCA GCCTCCAAGA CATGCTCAGT 5700
TTTCTTCCCT CCTAAAACAC CACCCACTGT CTCATTTCCA TTCATTTCTT TCTTTCTTTC 5760
TTTCTTTTTT TTTTTTGAGA GGGAGCCTCA CTCTGTCACC CAGGCTGAAG TGCAGTGGCA 5820
TGATCICCAC TCACTGCAAC CTCCGCCTCC CAGGTTCAAG CAATTCTCCT GCCTCAGCCT 5880
CCTGAGTAGC TGGGATTACA GGCGCCTGCC ACGATGCCCG GCTAACTTTT GTATTTTTAG 5940
TAGAGACGGG GTTTCGCCAT GTTGGCCAGG CTGGTCTCGA GCTCCTGACC TCAGGCAATC 6000
TGCCTGCCTC AGCTTCCCAA AGTGCTGGGA TTACAGGTGT GAGCCACCGC GCCCACCCAT 6060
TCATTTCTCA GTCCTTTGAA TCTACTTGCC CCTCCATCCC GCCATGCCAC CTACCCTAAC 6120
AACCTTCCCC CTTAAACCTG CGGGTTTGGC CGGGCGCAGT ACACTGAGTC AGTACTGGTA 6180
CTGACCCAGG TACCCCTCCA GCCTCAGCTC CAGTCAGATG GGACAGCCTG CTGGTCCCTG 6240
GCTGCTT TG CCCCCTCTTC TGGAGCCCCA GCCCTGGAGG CTCCATGTGG CTCAGCAGAA 6300
CTTCTTCTCC TCCTGCTCTG TGGTGGCCTC TTGAGGGCAG CACTCACCTT GGAAAGCATG 6360
GAGTGTTTCA ACCCTCACTG CTCCCTGAAG GACCAAGGTG TCCCATTTTA CAGTCGGGGG 6420
AGGAGGCACT GTGATAAAGG GGCTCTTCAG ACCCACGTCT GAGAGAGCCA GGCTGCGCCG 6480
CCCCCGCGGC CTTCCACCCT TCACCGTCCA GCCAGGGCCA CTGCCATCAC CGCCTGCTGG 6540
TCCTCACAGG CGTCGGGGCC CCAGGCAGTG AGAAGGCGGC TGCTGACTCC TCTTTCCTCC 6600
CCAGCTGCCA GGGACCTGGT GAGCAAAGAG GGCTTCCGGC GGGCCCGGCA CGTGGTGGGG 6660
GAGATTCGGC GCACGGCCCA GGCAGCGGCC GCCCTGAGAC GTGGCGACTA CAGAGCCTTT 6720
GGCCGCCTCA TGGTGGAGAG CCACCGCTCA CTCAGGTGAG GCCCTCTGGG CGCCCCGCTC 6780
CTGCCGGGCA CAGGCCGGCC CAGGCCCACC CCTTCAATAT CCTCTCTGCA GAGACGACTA 6840
TGAGGTGAGC TGCCCAGAGC TGGACCAGCT GGTGGAGGCT GCGCTTGCTG TGCCTGGGGT 6900
TTATGGCAGC CGCATGACGG GCGGTGGCTT CGGTGGCTGC ACGGTGACAC TGCTGGAGGC 6960
CTCCGCTGCT CCCCACGCCA TGCGGCACAT CCAGGTGGGC GGGCACCAGG GCCTGGGCGG 7020
GCAGGAGCGG CAGCTTCCCG GGGCCCTGCC ACTCACCCCC AGCCCGCCTC TACAGGAGC 7080
ACTACGGCGG GACTGCCACC TTCTACCTCT CTCAAGCAGC CGATGGAGCC AAGGTGCTGT 7140
GCTTGTGAGG CACCCCCAGG ACAGCACACG GTGAGGGTGC GGGGCCTGCA GGCCAGTCCC 7200
ACGGCTCTGT GCCCGGTGCC ATCTTCCATA TCCGGGTGCT CAATAAACTT GTGCCTCCAA 7260
TGTGGTACCT GCCTCCTCTA GAGGTGGGTG TATGCTTGGG TGTCAGAGAA TGGGGGATGT 7320
CAGAACCGCT CCCCTACCCT AGGGGAGCAC CTCTCAGGCC CCAGAAGAAT GGGCAAGGCA 7380
GGGCCTAGCA GTAGCAAAAC CATTTATTAA GTGCAGAACA AAGGCTGGGT CCTTGTGCTG 7440
CTCCCAGCTC TTTGGTTACA AATAGGTTTG GGCCCACAGA GGACGGACCT TGCCCCCTTC 7500
ATGCCTCCCA GGAGACACCT AGCCCCTGCT CTGTGCATGC C3CTGGGCTG GGCCCCCAGC 7560
GGTGCAAGGA TGGAGTAGCT GAGGAGGCTC CGGGAGAGGA GTC3GGAGGA CGCCTAGTGG 7620
GACATTGCGG CGGTGGCGCA GGGTGCGGTC AAGTTTGGAA GAAAC-GTTG GGTC A 7676
(2) INFORMATION FOR SEQ ID NO: 8: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
AGCCTTCCGG GAGGAGTTCG G 21
(2) INFORMATION FOR SEQ ID NO: 9: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
CTGGTTGTAG TCCGTTTGTT C 21
(2) INFORMATION FOR SEQ ID NO: 10: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
GCCAGCAGCT CCGCGACCTG G 21
(2) INFORMATION FOR SEQ ID NO: 11: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GCTTCCTCCC TTCCAACGTG G 21
(2) INFORMATION FOR SEQ ID NO: 12: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CCCAGGCTCC AGCGAGCGCT G 21
(2) INFORMATION FOR SEQ ID NO: 13: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
ACCTCTGAGG GTGCCGATGA G 21
(2) INFORMATION FOR SEQ ID NO: 14: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CCCACAGCTC AGGGCAGAGT C 21
(2) INFORMATION FOR SEQ ID NO: 15: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GGACACTTCT AATTGTCCTC C 21
(2) INFORMATION FOR SEQ ID NO: 16: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
TGAACTGG TCCATGATGC C 21
(2) INFORMATION FOR SEQ ID NO: 17: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
AGGGGCACTG AGCTGACCAC C 21
(2) INFORMATION FOR SEQ ID NO: 18: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
CACTTCTACA CATTGGCGCC G 21
(2) INFORMATION FOR SEQ ID NO: 19: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
CTTCGCAGGG ATGCCCTGTG G 21
(2) INFORMATION FOR SEQ ID NO: 20: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
TCATCACCAA CTCTAATGTC C 21
(2) INFORMATION FOR SEQ ID NO: 21: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
TGTCAGCAGT GCCTATCTCT G 21
(2) INFORMATION FOR SEQ ID NO: 22: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
AGCAGCGGAG GCCTCCAGCA G 21
(2) INFORMATION FOR SEQ ID NO: 23: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CCTCACCGTG TGCTGTCCTG G 21
(2) INFORMATION FOR SEQ ID NO: 24: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
GGCTGCGCTT GCTGTGCCTG G 21
(2) INFORMATION FOR SEQ ID NO: 25: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
CCTCACCGTG TGCTGTCCTG G 21
(2) INFORMATION FOR SEQ ID NO: 26: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
CCTCACCGTG TGCTGTCCTG G 21
(2) INFORMATION FOR SEQ ID NO: 27: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
GCGGGACTGC CACCTTCTAC C 21
(2) INFORMATION FOR SEQ ID NO: 28: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CTCAATAAAC TTGTGCCTCC A 21
(2) INFORMATION FOR SEQ ID NO: 29: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 23 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
CGGATATGGA AGATGGCACC GGG 23
(2) INFORMATION FOR SEQ ID NO: 30: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 22 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
AGAGCTGCAG GCGCGCGTCA TG 22
(2) INFORMATION FOR SEQ ID NO: 31: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 19 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
CCGAGCATCC CGCGCCGAC 19
(2) INFORMATION FOR SEQ ID NO: 32: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
CAGCTGCCCG CCCCACATCT 20
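The short single-stranded entries above are described as genomic DNA, and some of them read in the opposite orientation to the 7676-base genomic listing. The following Python sketch shows one way to check where such an oligonucleotide lies and on which strand. The helper names `reverse_complement` and `locate` are hypothetical, and the template is only a 60-nucleotide excerpt (nucleotides 61 to 120 of the genomic listing as printed), so the result applies to that excerpt alone.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate(primer, template, first_pos=1):
    """Return (strand, start, end) in 1-based coordinates, or None.

    '+' means the primer matches the template as printed; '-' means its
    reverse complement matches. Hypothetical helper for illustration.
    """
    for strand, query in (("+", primer), ("-", reverse_complement(primer))):
        idx = template.find(query)
        if idx != -1:
            start = first_pos + idx
            return strand, start, start + len(query) - 1
    return None


# Nucleotides 61-120 of the genomic listing, as printed.
GENOMIC_61_120 = ("TCATGGCTGC" "TTTGAGACAG" "CCCCAGGTCG"
                  "CGGAGCTGCT" "GGCCGAGGCC" "CGGCGAGCCT")
PRIMER_SEQ_ID_10 = "GCCAGCAGCTCCGCGACCTGG"  # SEQ ID NO: 10 as printed

hit = locate(PRIMER_SEQ_ID_10, GENOMIC_61_120, first_pos=61)
print(hit)
```

With these inputs the oligonucleotide of SEQ ID NO: 10 matches the complementary strand at nucleotides 83 to 103, which suggests it is an antisense oligonucleotide relative to the listing. Exact string matching is used here for simplicity; it would not tolerate the OCR damage present in several other entries.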
Claims (21)
- NOVELTY OF THE INVENTION CLAIMS 1. An isolated nucleic acid molecule encoding human genomic galactokinase, said nucleic acid molecule characterized in that it is selected from the group comprising: (a) a nucleic acid molecule comprising the sequence as set forth in SEQ ID NO: ?; and (b) a nucleic acid molecule that differs from the nucleic acid molecule of (a) in codon sequence due to the degeneracy of the genetic code.
- 2. A vector characterized in that it comprises the nucleic acid molecule according to claim 1.
- 3. A recombinant host cell characterized in that it comprises the vector according to claim 2.
- 4. An isolated nucleic acid molecule characterized in that it comprises a DNA sequence encoding nucleotides 29 to 1204 of SEQ ID NO: 5 or nucleotides 29 to 265 of SEQ ID NO: 6.
- 5. A vector characterized in that it comprises the nucleic acid molecule according to claim 4.
- 6. The vector according to claim 5 further characterized in that it is a plasmid.
- 7. A recombinant host cell characterized in that it comprises the vector according to claim 5.
- 8. A method for preparing a human galactokinase protein, characterized in that it comprises culturing the recombinant host cell according to claim 7 under conditions that promote the expression of said protein, and recovering said protein.
- 9. An isolated protein encoded by the DNA sequence according to claim 4.
- 10. A monoclonal antibody that is specifically reactive with the protein according to claim 9.
- 11. A method for diagnosing conditions associated with human galactokinase deficiency, characterized in that it comprises isolating a serum or tissue sample from an individual; allowing the sample to come into contact with an antibody or antibody fragment that binds specifically to the human galactokinase protein according to claim 9, under conditions such that an antigen-antibody complex is formed between said antibody or antibody fragment and said galactokinase protein; and detecting the presence or absence of said complex.
- 12. An in vitro method for diagnosing conditions associated with human galactokinase deficiency, characterized in that it comprises isolating a sample of nucleic acid from an individual; testing said sample and the DNA sequence, or corresponding RNA sequence, encoding a human galactokinase gene; and comparing differences between said sample and said DNA (or RNA) encoding nucleotides 29 to 1204 of SEQ ID NO: 4, wherein said differences indicate mutations in the human galactokinase gene.
- 13. The in vitro method according to claim 12, further characterized in that said sample is RNA which is subsequently amplified by RT-PCR.
- 14. The in vitro method according to claim 13 further characterized in that testing said sample comprises a digestion with restriction endonuclease.
- 15. The in vitro method according to claim 14, further characterized in that said restriction endonuclease is MscI.
- 16. The in vitro method according to claim 12, further characterized in that testing said sample comprises a hybridization test.
- 17. The in vitro method according to claim 16, further characterized in that the hybridization test is heteroduplex electrophoresis, which comprises determining the differential mobility of heteroduplex products in polyacrylamide gels, said heteroduplex products being the result of hybridization between the nucleic acid sample and the DNA sequence, or corresponding RNA sequence, which encodes nucleotides 29 to 1204 of the SEQ ID NO:.
- 18. The in vitro method according to claim 12, further characterized in that testing said sample comprises gel electrophoresis of restriction fragment length polymorphisms of said nucleic acid sample and the DNA sequence, or corresponding RNA sequence, which encodes nucleotides 29 to 1204 of SEQ ID NO: 4.
- 19. The in vitro method according to claim 12 further characterized in that testing said sample comprises DNA sequencing.
- 20. An in vitro method for diagnosing conditions associated with human galactokinase deficiency, characterized in that it comprises isolating cells from an individual containing genomic DNA and testing said sample by in situ hybridization using, as a probe, the DNA sequence encoding nucleotides 29 to 1204 of SEQ ID NO: 4, nucleotides 29 to 1204 of SEQ ID NO: 5, or nucleotides 29 to 265 of SEQ ID NO: 6; or a fragment encoding at least one exon of said sequence; or a fragment containing at least 15 contiguous base pairs of said sequence.
- 21. A non-human transgenic mammal capable of expressing, in any one cell thereof, the DNA according to claim 4.
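Claims 14 and 15 test the amplified sample by digestion with the restriction endonuclease MscI, whose recognition sequence is TGGCCA with a blunt cut after the third base. The Python sketch below illustrates, under stated assumptions, how a single-base change can create such a site and so change the fragment pattern. The function names are hypothetical, the reference is only a 48-nucleotide excerpt (nucleotides 101 to 148 of the cDNA listing as printed), and the G to A substitution at nucleotide 122 is introduced purely for illustration; it is not taken from the claims.

```python
MSCI_SITE = "TGGCCA"  # MscI recognition sequence (its own reverse complement)


def find_sites(seq, site=MSCI_SITE, first_pos=1):
    """Return the 1-based start coordinate of every occurrence of site."""
    positions = []
    idx = seq.find(site)
    while idx != -1:
        positions.append(first_pos + idx)
        idx = seq.find(site, idx + 1)
    return positions


def fragment_lengths(seq_len, site_starts, first_pos=1, cut_offset=3):
    """Return fragment sizes after cutting cut_offset bases into each site."""
    cuts = [start - first_pos + cut_offset for start in site_starts]
    edges = [0] + cuts + [seq_len]
    return [b - a for a, b in zip(edges, edges[1:])]


# Nucleotides 101-148 of the cDNA listing, as printed.
REFERENCE = "GGGGCCGAGCCCGAGCTGGCCGTGTCAGCGCCGGGCCGCGTCAACCTC"
# Hypothetical G->A substitution at nucleotide 122, for illustration only.
VARIANT = REFERENCE[:21] + "A" + REFERENCE[22:]

reference_sites = find_sites(REFERENCE, first_pos=101)
variant_sites = find_sites(VARIANT, first_pos=101)
print(reference_sites, fragment_lengths(len(REFERENCE), reference_sites, 101))
print(variant_sites, fragment_lengths(len(VARIANT), variant_sites, 101))
```

In this sketch the reference excerpt has no site and stays as one 48-nucleotide fragment, while the altered excerpt gains a site starting at nucleotide 117 and is cut into fragments of 19 and 29 nucleotides. A real assay would work on the full amplicon and would need to account for any other sites it contains.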
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCPCT/US1994/010825 | 1994-09-23 | ||
| USUS94/10825 | 1994-09-23 | ||
| PCT/US1994/010825 WO1996009408A1 (en) | 1994-09-23 | 1994-09-23 | Human galactokinase gene |
| PCT/US1995/006743 WO1996009374A1 (en) | 1994-09-23 | 1995-05-26 | Human galactokinase gene |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| MXPA97002205A true MXPA97002205A (en) | 1997-06-01 |
| MX9702205A MX9702205A (en) | 1997-06-28 |
Family
ID=38963044
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| MX9702205A MX9702205A (en) | 1994-09-23 | 1995-05-26 | Human galactokinase gene. |
Country Status (1)
| Country | Link |
|---|---|
| MX (1) | MX9702205A (en) |
-
1995
- 1995-05-26 MX MX9702205A patent/MX9702205A/en unknown
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN107602690B (en) | PTGIS gene mutation associated with pulmonary arterial hypertension and its application | |
| EP0828003A2 (en) | Human serine protease | |
| JPH10511936A (en) | Human somatostatin-like receptor | |
| CA2388363C (en) | Dna polymerase lambda and uses thereof | |
| JP2001512676A (en) | Materials and methods for 1-α-hydroxylase | |
| US5789223A (en) | Human galactokinase gene | |
| US6008012A (en) | Human somatostatin-like receptor | |
| US5830649A (en) | Human galactokinase gene | |
| EP1268776A2 (en) | Corneodesmosin based test and model for inflammatory disease | |
| US20050032155A1 (en) | Mutation in the beta2 nicotinic acetycholine receptor subunit associated with nocturnal frontal lobe epilepsy | |
| US6586581B1 (en) | Prolactin regulatory element binding protein and uses thereof | |
| MXPA97002205A (en) | Gene of galactocinasa hum | |
| WO1996009374A1 (en) | Human galactokinase gene | |
| US20040180338A1 (en) | Mutated eukariotic transalation initiation factor 2 alpha kinase3, eif2ak3, in patients with neonatal insuluin-dependant diabetes and multiple epiphyseal dyslapsia (wolcott-rallison syndrome) | |
| US20030022311A1 (en) | Human CIS protein | |
| WO2005024024A1 (en) | Mutations in the nedd4 gene family in epilepsy and other cns disorders | |
| CA2200583A1 (en) | Human galactokinase gene | |
| JP2003518628A (en) | Compound | |
| US20130254908A1 (en) | MO-1, A Gene Associated With Morbid Obesity | |
| US20030148331A1 (en) | Novel human hepatoma associated protein and the polynucleotide encoding said polypeptide | |
| WO1997044347A1 (en) | Human cis protein | |
| US20030125296A1 (en) | Insulin-responsive DNA binding protein-1 and methods to regulate insulin-responsive genes | |
| WO1997020573A1 (en) | Growth factor receptor-binding protein 2 homolog | |
| JP2004081001A (en) | Method for testing mutation of Arx gene associated with X-linked encephalopathy | |
| US20030157075A1 (en) | Isolated nucleic acids and polypeptides associated with glucose homeostasis disorders and method of detecting the same |