HK1128490B - Selection of host cells expressing protein at high levels - Google Patents

Publication number
HK1128490B
Authority
HK
Hong Kong
Prior art keywords
sequence
polypeptide
ala
leu
selectable marker
Prior art date
Application number
HK09107655.2A
Other languages
Chinese (zh)
Other versions
HK1128490A1 (en
Inventor
A. P. Otte (A‧P‧奥特)
H. J. M. van Blokland (H‧J‧M‧范布洛克兰)
T. H. J. Kwaks (T‧H‧J‧克瓦克斯)
R. G. A. B. Sewalt (R‧G‧A‧B‧西沃尔特)
Original Assignee
ChromaGenics B.V. (科罗迈吉尼科斯公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 11/359,953 (US20060141577A1)
Priority claimed from US 11/416,490 (US20060195935A1)
Application filed by ChromaGenics B.V. (科罗迈吉尼科斯公司)
Priority claimed from PCT/EP2007/051696 (WO2007096399A2)
Publication of HK1128490A1
Publication of HK1128490B

Description

Selection of host cells expressing high levels of protein
Technical Field
The present invention relates to the fields of molecular biology and biotechnology. More specifically, the present invention relates to methods and means for improving the selection of host cells that express proteins at high levels.
Background
Proteins can be produced in a variety of host cells and are widely used in biology and biotechnology, such as biopharmaceuticals. Eukaryotic, and in particular mammalian host cells, are preferred for the expression of a variety of proteins, for example when such proteins have certain post-translational modifications such as glycosylation. Methods for such production are well established and generally require the expression of a nucleic acid encoding a protein of interest (also referred to as a "transgene") in a host cell. Typically, the transgene is introduced into a precursor cell along with a selectable marker gene, the cell is selected for expression of the selectable marker gene, and one or more clones expressing high levels of the protein of interest are identified and used to express the protein of interest.
One problem with transgene expression is that it is unpredictable, because there is a high probability that the transgene becomes inactive through gene silencing (McBurney et al, 2002); numerous host cell clones therefore need to be tested in order to obtain high expression of the transgene.
Methods for selecting recombinant host cells that express relatively high levels of a desired protein are known, and some such methods are discussed in the introduction to WO2006/048459, which is incorporated herein by reference.
Certain advantageous prior art methods describe bicistronic expression vectors for the rapid and efficient production of stable mammalian cell lines expressing recombinant proteins. These vectors contain an Internal Ribosome Entry Site (IRES) between the upstream coding sequence of the protein of interest and the downstream coding sequence of the selectable marker (Rees et al, 1996). Such vectors are commercially available, for example the pIRES1 vector is available from Clontech (CLONTECHniques, October 1996). Introducing such vectors into host cells and then selecting for sufficient expression of the downstream marker protein automatically selects for high transcript levels of the polycistronic mRNA, and can thus increase the likelihood of high expression of the protein of interest. The IRES used in such a method is preferably one that gives a relatively low level of translation of the selectable marker gene, thereby further improving the chances of selecting host cells with high expression levels of the protein of interest by selecting for expression of the selectable marker protein (see, e.g., WO03/106684 and WO2006/005718).
The present invention aims to provide improved methods and means for selecting host cells that express high levels of a protein of interest.
Brief description of the invention
WO2006/048459, filed at the priority date of the present application but published thereafter, is incorporated herein by reference in its entirety. WO2006/048459 discloses the concept of selecting host cells that express a polypeptide of interest at high levels, which concept is referred to herein as "dependent translation". In this concept, a polycistronic transcription unit is used in which a sequence encoding a selectable marker polypeptide is located upstream of a sequence encoding a polypeptide of interest, wherein translation of the selectable marker polypeptide is impaired by mutations therein, whereas translation of the polypeptide of interest is very high (see, e.g., the schematic of figure 13 herein). The present invention provides alternative methods and means for selecting host cells that express high levels of a polypeptide.
In one aspect, the invention provides a DNA molecule comprising a polycistronic transcription unit encoding i) a polypeptide of interest and ii) a selectable marker polypeptide functional in a eukaryotic host cell, wherein the polypeptide of interest has a translation initiation sequence that is independent of the translation initiation sequence of the selectable marker polypeptide, wherein in the polycistronic transcription unit the coding sequence for the polypeptide of interest is upstream of the coding sequence for the selectable marker polypeptide, wherein an Internal Ribosome Entry Site (IRES) is present downstream of the coding sequence for the polypeptide of interest and upstream of the coding sequence for the selectable marker polypeptide, and wherein the nucleic acid sequence encoding the selectable marker polypeptide comprises, on the coding strand, a translation initiation sequence selected from the group consisting of: a) a GTG start codon; b) a TTG start codon; c) a CTG start codon; d) an ATT start codon; and e) an ACG start codon.
The translation initiation sequence of the selectable marker polypeptide in the coding strand comprises an initiation codon that differs from the ATG initiation codon, such as a GTG, TTG, CTG, ATT or ACG sequence, with the first two being most preferred. Such a non-ATG initiation codon is preferably flanked by sequences that provide relatively good recognition of the non-ATG sequence as an initiation codon, such that at least some ribosomes start translation from it; i.e., the translation initiation sequence preferably comprises ACC[non-ATG initiation codon]G or GCC[non-ATG initiation codon]G.
In preferred embodiments, the selectable marker protein provides resistance to lethal and/or growth inhibitory effects of a selective agent, such as an antibiotic.
The invention further provides an expression cassette comprising a DNA molecule of the invention, the expression cassette further comprising a promoter upstream of the polycistronic expression unit, which is functional in a eukaryotic host cell and which can initiate transcription of the polycistronic expression unit, said expression cassette further comprising a transcription termination sequence downstream of the polycistronic expression unit.
In a preferred embodiment, such an expression cassette further comprises at least one chromatin control element selected from the group consisting of: a matrix or scaffold attachment region (MAR/SAR), an insulator sequence, a ubiquitous chromatin opening element (UCOE), and an anti-repressor (STAR) sequence. Preferred in this regard are anti-repressor sequences, which in certain embodiments are selected from the group consisting of: a) any one of SEQ. ID. NO. 1 to SEQ. ID. NO. 66; b) a fragment of any one of SEQ. ID. NO. 1 to SEQ. ID. NO. 66, wherein said fragment has anti-repressor activity; c) a sequence that is at least 70% identical in nucleotide sequence to a) or b), wherein said sequence has anti-repressor activity; and d) the complement of any one of a) to c).
The invention also provides host cells comprising the DNA molecules of the invention.
The invention further provides a method of producing a host cell expressing a polypeptide of interest, the method comprising: introducing a DNA molecule or expression cassette of the invention into a plurality of precursor host cells, culturing the cells under conditions that select for expression of the selectable marker polypeptide, and selecting at least one host cell that produces the polypeptide of interest.
In another aspect, the invention provides a method of producing a polypeptide of interest, the method comprising culturing a host cell comprising an expression cassette of the invention, and expressing the polypeptide of interest from the expression cassette. In a preferred embodiment, the polypeptide of interest is further isolated from the host cell and/or the host cell culture medium.
Brief Description of Drawings
FIG. 1. Results for expression constructs of the invention. The expression construct contains a sequence encoding a polypeptide of interest (here exemplified by d2EGFP) upstream of an IRES, which is in turn upstream of a sequence encoding a selectable marker of the invention (here exemplified by a zeocin resistance gene) having a TTG initiation codon (TTG Zeo) or under the control of its conventional ATG initiation codon (ATG Zeo). See example 1 for details. The dots represent individual data points; straight lines represent mean expression levels; the constructs used are indicated on the horizontal axis and depicted schematically above the graph; the vertical axis represents the d2EGFP signal.
FIG. 2. Results with the tricistronic expression cassette with dhfr as a maintenance marker. The expression construct contains a zeocin selectable marker gene that has a TTG initiation codon and lacks internal ATG sequences, located upstream of the sequence encoding the polypeptide of interest (here exemplified by d2EGFP), which is further operably linked by an IRES to a downstream metabolic selectable marker gene, dhfr (with ATG initiation codon). The dots represent individual data points (GFP fluorescence signal of ZeoR colonies on the vertical axis); the straight line indicates the mean expression level. The constructs used are shown above the graph, with conditions indicated on the horizontal axis (d: days). See example 2 for details.
FIG. 3 shows the same result as in FIG. 2, except that the dhfr gene has a GTG start codon.
FIG. 4 shows the case of FIG. 2, but the dhfr gene has a TTG start codon.
FIG. 5 copy number of clones with dhfr enzyme (ATG start codon) under different conditions. See example 3 for details.
FIG. 6 shows the case of FIG. 5, but the dhfr gene has a GTG start codon.
FIG. 7 shows the case of FIG. 5, but the dhfr gene has a TTG start codon.
Detailed Description
In one aspect, the invention provides a DNA molecule according to claim 1. Such DNA molecules of the invention may be used to obtain eukaryotic host cells expressing high levels of a polypeptide of interest by selecting for expression of a selectable marker polypeptide. One or more host cells expressing the polypeptide of interest can then or concurrently be identified for further use in expressing high levels of the polypeptide of interest.
The term "monocistronic gene" is defined as a gene that provides an RNA molecule encoding a polypeptide. A "polycistronic transcription unit", also known as a polycistronic gene, is defined as a gene that can provide an RNA molecule encoding at least 2 polypeptides. The term "bicistronic gene" is defined as a gene that provides an RNA molecule encoding 2 polypeptides. Accordingly, the bicistronic gene is encompassed in the definition of the polycistronic gene. As used herein, a "polypeptide" comprises at least 5 amino acids joined by peptide bonds, and may be, for example, a protein or a portion of a protein, such as a subunit. In most cases, the terms polypeptide and protein are used interchangeably herein. "Gene" or "transcription unit" used in the present invention may comprise chromosomal DNA, cDNA, artificial DNA, a combination thereof, and the like. A transcription unit comprising several cistrons is transcribed as a single mRNA.
The polycistronic transcription unit of the present invention is preferably a bicistronic transcription unit, encoding from 5 'to 3' the polypeptide of interest and the selectable marker polypeptide. Thus, the polypeptide of interest is encoded upstream of the coding sequence for the selectable marker polypeptide. The IRES is operably linked to a sequence encoding a selectable marker polypeptide, such that translation of the selectable marker polypeptide is dependent on the IRES.
Preferably, separate transcription units are used to express different polypeptides of interest, and the same applies when these form part of a multimeric protein (see, e.g., example 13 of WO2006/048459, incorporated herein by reference: the heavy and light chains of an antibody are each encoded by separate transcription units, each expression unit being a bicistronic expression unit).
The DNA molecule of the invention may exist in the form of a double-stranded DNA having, with respect to the selectable marker polypeptide and the polypeptide of interest, a coding strand and a non-coding strand, the coding strand being the strand with the same sequence as the translated RNA, except that T is present in place of U. Thus, the AUG initiation codon is encoded by the ATG sequence on the coding strand, and the strand containing this ATG sequence corresponding to the AUG initiation codon on the RNA is referred to as the coding strand of the DNA. It will be clear to the skilled person that the initiation codon or translation initiation sequence is actually present in the RNA molecule, but these can equally be considered to be present on the DNA molecule encoding such an RNA molecule; thus, wherever reference is made herein to an initiation codon or translation initiation sequence, this is meant to include the corresponding DNA molecule having the same sequence as the RNA sequence but with a T instead of a U on the coding strand of the DNA molecule, and vice versa, unless otherwise indicated. In other words, the initiation codon is, for example, an AUG sequence in RNA, but the corresponding ATG sequence on the DNA coding strand is also referred to as the initiation codon in the present invention. The same usage applies to "in frame" coding sequences, meaning triplets (3 bases) that are translated into amino acids on the RNA molecule, but which are also interpreted as the corresponding trinucleotide sequences on the coding strand of the DNA molecule.
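The coding-strand convention described above can be illustrated with a short Python sketch. It is only an illustration: the function names and the toy sequence are assumptions introduced here and are not part of the disclosure.

```python
def coding_strand_to_mrna(dna):
    """Return the RNA sequence that corresponds to a DNA coding strand (T -> U)."""
    return dna.upper().replace("T", "U")


def mrna_to_coding_strand(rna):
    """Return the DNA coding strand that corresponds to an RNA sequence (U -> T)."""
    return rna.upper().replace("U", "T")


def codons_in_frame(seq, start=0):
    """Split a sequence into complete triplets, beginning at index `start`.

    Trailing bases that do not fill a whole triplet are ignored.
    """
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


# Hypothetical toy sequence: an ATG start codon followed by two more codons.
toy_coding_strand = "ATGGCCTTGA"
toy_mrna = coding_strand_to_mrna(toy_coding_strand)
toy_codons = codons_in_frame(toy_coding_strand)
```

Here the ATG on the coding strand and the AUG on the RNA denote the same initiation codon, and the triplets returned by `codons_in_frame` are the "in frame" trinucleotides in the sense used above.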
The selectable marker polypeptide and the polypeptide of interest encoded by the polycistronic gene each have their own translation initiation sequence and thus each have their own initiation codon (and terminator), i.e., they are encoded by separate open reading frames.
The term "selection marker" or "selectable marker" typically refers to a gene and/or protein whose presence in a cell can be detected directly or indirectly, e.g., a polypeptide that inactivates a selective agent and protects the host cell from lethal or growth inhibitory effects of the selective agent (e.g., an antibiotic resistance gene and/or protein). Another possibility is that the selectable marker induces fluorescence or a color deposit (e.g., Green Fluorescent Protein (GFP) and derivatives (e.g., d2EGFP), luciferase, lacZ, alkaline phosphatase, etc.), which can be used to select for cells expressing the polypeptide that induces it, e.g., by sorting GFP-expressing cells with a fluorescence-activated cell sorter (FACS). Preferably, the selectable marker polypeptide of the present invention provides resistance to lethal and/or growth inhibitory effects of the selective agent. The selectable marker polypeptide is encoded by the DNA of the present invention. The selectable marker polypeptide of the present invention must be functional in eukaryotic host cells and therefore capable of being selected for in eukaryotic host cells. Any selectable marker polypeptide meeting these criteria may in principle be used in the present invention. Such selectable marker polypeptides are well known in the art and are routinely used to obtain eukaryotic host cell clones, several examples of which are provided herein. In certain embodiments, the selectable marker used in the present invention is zeocin. In other embodiments, blasticidin is used. Those skilled in the art will recognize that other selectable markers are available and may be used, such as neomycin, puromycin, bleomycin, hygromycin and the like. In other embodiments, kanamycin is used. In other embodiments, the DHFR gene is used as a selectable marker, which can be selected for using methotrexate, in particular with increasing concentrations of methotrexate.
The DHFR gene can also be used to complement a dhfr deficiency in media containing folate and lacking glycine, hypoxanthine and thymidine, for example in CHO cells with a dhfr- phenotype. Similarly, the glutamine synthetase (GS) gene can be used, and selection can be performed in GS-deficient cells (e.g., NS-0 cells) by culturing in a medium without glutamine, or in cells with sufficient GS (e.g., CHO cells) by adding the GS inhibitor methionine sulfoximine (MSX). Other selectable marker genes and their selection agents that may be used are described, for example, in table 1 of U.S. patent No. 5,561,053, which is incorporated herein by reference; see also Kaufman, Methods in Enzymology, 185: 537-566 (1990). If the selectable marker polypeptide is dhfr, in an advantageous embodiment the host cell is cultured in a medium containing folate, which medium is substantially free of hypoxanthine and thymidine, and preferably also free of glycine.
When two polycistronic transcription units are selected in a single host cell according to the present invention, each preferably contains a coding sequence for a different selectable marker, allowing for selection of two polycistronic transcription units. Of course, two polycistronic transcription units may be present on one nucleic acid molecule or each on a separate nucleic acid molecule.
The term "selection" is typically defined as the process of identifying a host cell with specific genetic properties (e.g., the host cell contains a transgene integrated into its genome) using a selection marker/selectable marker and a selection agent. It is clear to the person skilled in the art that combinations of various selection markers are feasible. One particularly advantageous antibiotic is zeocin, because the zeocin-resistance protein (zeocin-R) acts by binding the drug such that it becomes harmless. The amount of drug that kills cells expressing zeocin-R at low levels, while allowing cells expressing it at high levels to survive, can therefore be readily titrated. All other commonly used antibiotic resistance proteins are enzymes and therefore act catalytically (i.e., not in a 1:1 ratio with the drug). Therefore, the antibiotic zeocin is a preferred selection marker. Another preferred selectable marker is dihydrofolate reductase (dhfr), the enzyme that synthesizes 5,6,7,8-tetrahydrofolate. However, the present invention is applicable to other selection markers.
The selectable marker polypeptide of the invention is a protein encoded by a nucleic acid of the invention, which polypeptide is functionally useful for selection, for example because it provides resistance to a selection agent such as an antibiotic. Thus, when an antibiotic is used as a selection agent, the DNA encodes a polypeptide that confers resistance to the selection agent, which is a selectable marker polypeptide. DNA sequences encoding such selectable marker polypeptides are known, and several examples of wild-type DNA sequences encoding selectable marker proteins are provided herein (e.g., FIGS. 26-32 of WO2006/048459, incorporated herein by reference). It is clear that mutants or derivatives of the selectable marker may also be suitable according to the present invention and are therefore included within the scope of the term "selectable marker polypeptide" as long as the selectable marker protein is functional.
For convenience, and as is also generally accepted by those skilled in the art, in many publications and herein, genes and proteins encoding resistance to a selection agent are often referred to as "selection agent (resistance) genes" or "selection agent (resistance) proteins", respectively, although the formal names may differ; e.g., the gene encoding the protein conferring resistance to neomycin (as well as to G418 and kanamycin) is often referred to as the neomycin (resistance) (or neoR) gene, whereas its formal name is aminoglycoside 3'-phosphotransferase gene.
For the present invention, it is advantageous that the expression level of the selectable marker polypeptide is low, so that stringent selection can be performed. In the present invention, this is due to the use of a selectable marker coding sequence having a non-ATG initiation codon. In the selection, only those cells are selected which still have a sufficient level of the selectable marker polypeptide, meaning that these cells must have sufficient transcription of the polycistronic transcription unit and sufficient translation of the selectable marker polypeptide, which provides for selection of cells in which the polycistronic transcription unit has been integrated or is present at a location in the host cell where the expression level of this transcription unit is high.
The DNA molecules of the invention have the coding sequence for a selectable marker polypeptide downstream of the coding sequence for the polypeptide of interest. Thus, the polycistronic transcription unit comprises in the 5 'to 3' direction (in both the transcribed strand of DNA and the resulting transcribed RNA) a sequence encoding a polypeptide of interest and a coding sequence for a selectable marker polypeptide. The IRES is located upstream of the coding sequence for the selectable marker polypeptide.
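The element order described above, together with the promoter and transcription termination sequence of the expression cassette mentioned in the summary, can be written as a simple ordering check. The following Python sketch is hypothetical: the element labels and the function name are illustrative assumptions, and the check only looks at order, not at sequence content.

```python
# Required elements in 5' -> 3' order (labels are illustrative assumptions).
REQUIRED_ORDER = [
    "promoter",
    "polypeptide_of_interest",
    "IRES",
    "selectable_marker",
    "terminator",
]


def is_valid_cassette(elements):
    """Return True if every required element occurs exactly once and in the
    required 5' -> 3' order. Additional elements (e.g. an intron or chromatin
    control elements) may appear anywhere and are ignored by this check."""
    positions = []
    for name in REQUIRED_ORDER:
        if elements.count(name) != 1:
            return False
        positions.append(elements.index(name))
    return positions == sorted(positions)


# Cassette laid out as in the text, with optional flanking elements.
good = ["STAR", "promoter", "intron", "polypeptide_of_interest",
        "IRES", "selectable_marker", "terminator", "STAR"]

# Marker placed upstream of the polypeptide of interest: not the order
# described for the bicistronic unit of this document.
bad = ["promoter", "selectable_marker", "IRES",
       "polypeptide_of_interest", "terminator"]
```

With these inputs, `is_valid_cassette(good)` is true and `is_valid_cassette(bad)` is false.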
According to the present invention, the coding region of the gene of interest is preferably translated from the cap-dependent ORF and the polypeptide of interest is produced in large quantities. The selectable marker polypeptide is translated from an IRES. To reduce translation of the selectable marker cistron, according to the present invention, the nucleic acid sequence encoding the selectable marker polypeptide comprises a mutation in the initiation codon that reduces the efficiency of translation initiation of the selectable marker polypeptide in a eukaryotic host cell. Preferably, the GTG start codon or, more preferably, the TTG start codon is engineered into the selectable marker polypeptide. In the same cell, the translation efficiency is lower than for the corresponding wild-type sequence, i.e.the mutation results in less polypeptide per cell per time unit and thus less selection marker polypeptide.
The translation initiation sequence is often referred to in the art as a "Kozak sequence", and one optimal Kozak sequence is RCCATGG, in which ATG is the start codon and R is a purine, i.e., A or G (see Kozak M, 1986, 1987, 1989, 1990, 1997, 2002). Thus, in addition to the initiation codon itself, its context is also relevant, particularly the nucleotides at positions -3 to -1 and +4, and an optimal translation initiation sequence contains an optimal initiation codon (i.e., ATG) in the optimal context (i.e., RCC directly before the ATG and G directly after it). Translation by ribosomes is most efficient when an optimal Kozak sequence is present (see Kozak M, 1986, 1987, 1989, 1990, 1997, 2002). However, in a small number of cases, non-optimal translation initiation sequences are recognized by the ribosome and used to initiate translation. The present invention makes use of this principle, allowing the amount of translation, and hence the expression of the selection marker polypeptide, to be reduced or even fine-tuned, which can be used to increase the stringency of the selection system.
In the present invention, the ATG initiation codon of the selectable marker polypeptide is mutated to another codon that has been reported to provide some translation initiation, such as GTG, TTG, CTG, ATT or ACG (collectively referred to herein as "non-ATG initiation codons"). In a preferred embodiment, the ATG start codon is mutated to a GTG start codon. This provides a lower level of expression (lower translation) than an intact ATG start codon in a non-optimal context. More preferably, the ATG start codon is mutated to a TTG start codon, which provides an even lower level of expression of the selectable marker polypeptide than the GTG start codon (Kozak M, 1986, 1987, 1989, 1990, 1997, 2002; see also examples 9-13 of WO2006/048459, incorporated herein by reference). The use of a non-ATG initiation codon in the coding sequence for a selectable marker polypeptide in a polycistronic transcription unit of the present invention has not been disclosed or suggested in the prior art and, preferably in combination with chromatin control elements, results in very high levels of expression of the polypeptide of interest, as also shown in WO2006/048459, incorporated herein by reference.
When a non-ATG initiation codon is used according to the invention, it is strongly preferred to provide an optimal context for it, i.e., the non-ATG initiation codon is preferably directly preceded by the nucleotides RCC at positions -3 to -1 and directly followed by a G nucleotide (position +4). However, it has been reported that with the sequence TTTGTGG (start codon GTG) at least some initiation is observed in vitro, so although strongly preferred, it may not be absolutely necessary to provide an optimal context around the non-ATG start codon.
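As a minimal sketch of the start codon change described above, the following Python function replaces the ATG at the start of a marker coding sequence with one of the non-ATG initiation codons named in the text. The function name and the toy sequence are hypothetical; in the laboratory the change is made in the DNA itself, and this sketch only shows the sequence-level result.

```python
# Non-ATG initiation codons listed in the text (coding-strand notation).
ALLOWED_NON_ATG = ("GTG", "TTG", "CTG", "ATT", "ACG")


def replace_start_codon(cds, new_start="TTG"):
    """Return `cds` with its leading ATG replaced by `new_start`.

    `cds` is the coding strand of an open reading frame and must begin
    with ATG; `new_start` must be one of the non-ATG initiation codons.
    """
    cds = cds.upper()
    new_start = new_start.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    if new_start not in ALLOWED_NON_ATG:
        raise ValueError("unsupported start codon: " + new_start)
    return new_start + cds[3:]


# Hypothetical toy open reading frame (not a real marker gene).
toy_marker = "ATGGCCAAGTTGACCTAA"
gtg_variant = replace_start_codon(toy_marker, "GTG")
ttg_variant = replace_start_codon(toy_marker)  # default: TTG
```

Only the first three nucleotides change, so the rest of the marker coding sequence is left as it was.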
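The context rule (RCC at positions -3 to -1 and G at position +4) can be checked mechanically. The sketch below is an illustration under stated assumptions: the function name, the returned dictionary keys and the example sequences other than TTTGTGG are introduced here for illustration only, and the check says nothing about actual translation efficiency.

```python
KNOWN_NON_ATG_CODONS = {"GTG", "TTG", "CTG", "ATT", "ACG"}


def initiation_context(seq, start_index):
    """Describe the start codon at `start_index` (0-based, coding strand)
    and whether its flanking nucleotides match RCC (positions -3 to -1)
    and G (position +4), where R is A or G."""
    seq = seq.upper()
    if start_index < 3 or start_index + 4 > len(seq):
        raise ValueError("need 3 nt before and 1 nt after the start codon")
    codon = seq[start_index:start_index + 3]
    upstream = seq[start_index - 3:start_index]  # positions -3 to -1
    plus4 = seq[start_index + 3]                 # position +4
    optimal = upstream[0] in "AG" and upstream[1:] == "CC" and plus4 == "G"
    return {
        "codon": codon,
        "is_atg": codon == "ATG",
        "is_listed_non_atg": codon in KNOWN_NON_ATG_CODONS,
        "optimal_context": optimal,
    }


optimal_ttg = initiation_context("ACCTTGG", 3)   # ACC[TTG]G
reported_gtg = initiation_context("TTTGTGG", 3)  # TTT[GTG]G from the text
optimal_atg = initiation_context("GCCATGG", 3)   # RCCATGG with R = G
```

In this sketch `optimal_ttg` and `optimal_atg` report an optimal context, while `reported_gtg` does not, which matches the distinction drawn in the paragraph above.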
ATG sequences within the coding sequence of a polypeptide (excluding the ATG start codon) are referred to as "internal ATGs", and if these are in frame with the ORF, and thus encode methionine, the methionine in the resulting polypeptide is referred to as an "internal methionine". In the invention of WO2006/048459, the coding region encoding the selectable marker polypeptide (the region following the start codon, not necessarily including the start codon itself) does not contain any ATG sequence on the coding strand of the DNA up to (but not including) the start codon of the polypeptide of interest. WO2006/048459 discloses how to achieve this and how to test the functionality of the resulting selectable marker polypeptide. For the purposes of the present invention, where the selectable marker polypeptide coding sequence is located downstream of the IRES and downstream of the coding sequence for the polypeptide of interest, internal ATGs may be retained in the sequence encoding the selectable marker polypeptide.
Clearly, it is strongly preferred according to the invention that the translation initiation sequence of the polypeptide of interest comprises an optimal translation initiation sequence, i.e., has the consensus sequence RCCATGG (ATG being the start codon). This results in very efficient translation of the polypeptide of interest.
The stringency of selection can be increased by providing marker coding sequences with different mutations that lead to different degrees of reduced translation efficiency. The selection system can thus be fine-tuned using the polycistronic transcription unit of the invention: for example, when GTG is used as the start codon for the selectable marker polypeptide, only a few ribosomes will start translation from this start codon, resulting in a low level of selectable marker protein and thus a high stringency of selection; the use of a TTG start codon increases the stringency of selection even further, as even fewer ribosomes will translate the selectable marker polypeptide from this start codon.
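The distinction between internal ATGs that are in frame (and thus encode an internal methionine) and those that are out of frame can be made explicit with a small scan of the coding strand. This is a sketch only: the function name and the toy sequence are hypothetical and do not represent any marker gene of the disclosure.

```python
def internal_atgs(cds):
    """List internal ATG trinucleotides in a coding sequence.

    `cds` is the coding strand beginning at the start codon (index 0).
    Returns (position, in_frame) tuples, where `position` is the 0-based
    index of the A and `in_frame` is True when the ATG lies in the reading
    frame set by the start codon (position divisible by 3). The start
    codon itself at index 0 is not reported.
    """
    cds = cds.upper()
    hits = []
    pos = cds.find("ATG", 1)
    while pos != -1:
        hits.append((pos, pos % 3 == 0))
        pos = cds.find("ATG", pos + 1)
    return hits


# Hypothetical toy ORF with a TTG start codon, one in-frame internal ATG
# (index 6) and one out-of-frame internal ATG (index 10).
toy_orf = "TTGGCCATGAATGTAA"
found = internal_atgs(toy_orf)
```

For the toy sequence, `found` contains one in-frame and one out-of-frame internal ATG; a design that must be free of ATG sequences would require an empty list, whereas for a marker placed downstream of the IRES the text indicates that internal ATGs may be retained.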
It is shown in WO2006/048459 incorporated herein by reference that the polycistronic expression units disclosed therein can be used in a very robust selection system, resulting in a very large proportion of clones expressing the polypeptide of interest at high levels as desired. In addition, the expression levels of the polypeptide of interest obtained are significantly higher than those of clones obtained when screening even larger numbers of colonies with the selection systems known so far.
In addition to reducing translation initiation efficiency, it may be beneficial to reduce the efficiency of translation elongation of the selectable marker polypeptide, for example by mutating its coding sequence such that it contains several codons that are non-preferred in the host cell, to further reduce the level of translation of the marker polypeptide and, if desired, allow more stringent selection conditions to be used. In certain embodiments, the selectable marker polypeptide further comprises a mutation that reduces the activity of the selectable marker polypeptide (compared to the wild-type) in addition to the mutation of the invention that reduces translation efficiency. This can be used to further increase the stringency of selection. By way of non-limiting example, the proline at position 9 of the zeocin resistance polypeptide may be mutated, for example, to Thr or Phe (see example 14 of WO2006/048459, incorporated herein by reference), and for neomycin resistance polypeptides, amino acid residues 182 or 261 or both may be further mutated (see, for example, WO 01/32901).
In some embodiments of the invention, a so-called spacer sequence is placed downstream of the sequence encoding the start codon of the selectable marker polypeptide, which spacer sequence is preferably in-frame with the start codon and encodes several amino acids, without secondary structure (Kozak, 1990). If secondary structure is present in the RNA of the selection marker polypeptide (e.g.for zeocin or for blasticidin), such a spacer sequence may be used to further reduce the translation initiation frequency (Kozak, 1990) and thus increase the stringency of the selection system of the invention (see example 14 of WO2006/048459, incorporated herein by reference).
It will be clear that any DNA molecule as described above but having a mutation in the sequence downstream of the first ATG sequence (start codon) encoding the selectable marker protein may also be used and is therefore also encompassed within the scope of the present invention, as long as the respective encoded selectable marker polypeptide is still active. For example, any silent mutation that does not alter the encoded protein due to redundancy of the genetic code is also contemplated. Other mutations that result in conservative amino acid mutations or other mutations are also contemplated, as long as the encoded protein is still active, which may or may not be less active than the wild-type protein encoded by the sequence. In particular, it is preferred that the encoded protein is at least 70%, more preferably at least 80%, more preferably at least 90%, more preferably at least 95% identical to the protein encoded by the respective representative sequence (e.g. as provided in SEQ ID No.68-80 of the sequence listing in this application). The activity of the selectable marker protein can be detected by conventional methods.
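The identity thresholds mentioned above (at least 70%, 80%, 90% or 95%) can be illustrated with a simple calculation. The sketch below assumes that the two protein sequences have already been aligned to equal length, with "-" marking gaps, and it counts identical non-gap columns over the full alignment length. That is only one of several conventions for reporting percent identity, the alignment step itself is not shown, and the function names and toy sequences are assumptions introduced here.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences carry the same residue (and not a gap)
    count as matches; the denominator is the alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=70.0):
    """True if the percent identity is at least `threshold`."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy peptides differing at one of ten positions.
reference = "MAKLTSAVPV"
variant = "MAKLTSAIPV"
identity = percent_identity(reference, variant)
```

For the toy pair the identity is 90%, so the variant would pass the 70%, 80% and 90% thresholds but not the 95% one; whether such a variant is still an active selectable marker has to be established experimentally, as the text notes.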
A preferred aspect of the invention provides an expression cassette comprising a DNA molecule of the invention having a polycistronic transcription unit. Such expression cassettes can be used to express a sequence of interest, for example in a host cell. As used herein, an "expression cassette" is a nucleic acid sequence comprising at least one promoter functionally linked to a sequence to be expressed. Preferably, the expression cassette further comprises transcription termination and polyadenylation sequences. Other regulatory sequences such as enhancers may also be included. Accordingly, the present invention provides an expression cassette comprising, in the following order: 5'-promoter-polycistronic transcription unit of the invention, encoding the polypeptide of interest and, downstream thereof, the selectable marker polypeptide-transcription termination sequence-3'. The promoter must be capable of functioning in a eukaryotic host cell, i.e., it must be capable of driving transcription of the polycistronic transcription unit. The promoter is thus operably linked to the polycistronic transcription unit. The expression cassette may optionally further contain other elements known in the art, such as splice sites to include introns and the like. In some embodiments, an intron is present after the promoter and before the sequence encoding the polypeptide of interest. The IRES is operably linked to the cistron containing the coding sequence for the selectable marker polypeptide. In further embodiments, a sequence encoding a second selectable marker is present in the polycistronic transcription unit (i.e., in these embodiments it is at least a tricistronic transcription unit).
In a preferred embodiment, the sequence encoding the second selectable marker polypeptide: a) has a translation initiation sequence separate from the translation initiation sequence of the polypeptide of interest, b) is located upstream of said sequence encoding the polypeptide of interest, c) has no ATG sequence in the coding strand between the start codon of said second selectable marker polypeptide and the start codon of the polypeptide of interest, and d) has a non-optimal translation initiation sequence, such as a GTG start codon or a TTG start codon. For such embodiments, a preferred selectable marker polypeptide is the 5,6,7,8-tetrahydrofolate-synthesizing enzyme dihydrofolate reductase (dhfr). This allows for continuous selection for high levels of expression of the polypeptide of interest, as described in example 2.
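Criteria b) to d) can be sketched as a simple sequence check. The following Python snippet is a minimal illustration that assumes the positions of the two start codons on the coding strand are already known; the function name, the result keys and the toy sequences are hypothetical and not part of the described method.

```python
# The two non-optimal start codons named as examples in the text.
NON_OPTIMAL_START_CODONS = {"GTG", "TTG"}

def check_upstream_marker(coding_strand, marker_start, poi_start):
    """Check criteria b) to d) for an upstream second selectable marker.

    coding_strand: DNA of the coding strand, written 5' to 3'.
    marker_start:  0-based index of the marker's start codon.
    poi_start:     0-based index of the start codon of the polypeptide of interest.
    """
    seq = coding_strand.upper()
    marker_codon = seq[marker_start:marker_start + 3]
    # Every triplet that begins after the marker start codon and before the
    # start codon of the polypeptide of interest, in any reading frame.
    window = seq[marker_start + 1:poi_start + 2]
    return {
        "marker_is_upstream": marker_start < poi_start,
        "no_atg_between": "ATG" not in window,
        "non_optimal_start": marker_codon in NON_OPTIMAL_START_CODONS,
    }

# Toy sequences: GTG marker start at index 2, ATG of the polypeptide of interest later.
good = "CCGTGAAACCTTGGCCATGGCTTAA"   # ATG of interest at index 16
bad = "CCGTGAAATGCCATGGCTTAA"        # stray ATG at index 7, ATG of interest at index 12
```

Here `check_upstream_marker(good, 2, 16)` satisfies all three checks, whereas `check_upstream_marker(bad, 2, 12)` fails the `no_atg_between` check because of the intervening ATG.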
To obtain expression of a nucleic acid sequence encoding a protein, it is known to those skilled in the art that sequences capable of driving such expression can be functionally linked to the nucleic acid sequence encoding the protein to produce the recombinant nucleic acid sequence encoding the protein in an expressible form. In the present invention, the expression cassette comprises a polycistronic transcription unit. Typically, the promoter sequence is located upstream of the expressed sequence. Widely used expression vectors are available in the art, such as the pcDNA and pEF vector series of Invitrogen, pMSCV and pTK-Hyg of BD Sciences, pCMV-Script of Stratagene, and the like, which can be used to obtain appropriate promoter and/or transcription terminator sequences, polyA sequences, and the like.
The sequence encoding the polypeptide of interest is suitably inserted into a sequence controlling the transcription and translation of the encoded polypeptide, and the resulting expression cassette can be used to produce the polypeptide of interest, referred to as expression. Sequences that drive expression may include promoters, enhancers, and the like, as well as combinations thereof. These should be capable of functioning in the host cell, thus driving expression of those nucleic acid sequences to which they are functionally linked. One skilled in the art recognizes that different promoters can be used to obtain expression of a gene in a host cell. Promoters may be constitutive or regulated, and may be obtained from different sources, including viral, prokaryotic or eukaryotic sources, or artificially designed. Expression of the nucleic acid of interest may be initiated from a native promoter or derivative thereof or from a completely heterologous promoter (Kaufman, 2000). According to the present invention, strong promoters giving high transcription levels in selected eukaryotic cells are preferred. Suitable promoters are well known and available to those skilled in the art, some of which are described in WO2006/048459 (e.g., pages 28-29), incorporated herein by reference, including the CMV Immediate Early (IE) promoter (referred to herein as the CMV promoter) (e.g., obtained from pcDNA from Invitrogen) and many others.
In certain embodiments, the DNA molecule of the invention is part of a vector, such as a plasmid. Such vectors are readily manipulated by methods well known to those skilled in the art and can, for example, be designed to replicate in prokaryotic and/or eukaryotic cells. In addition, many vectors can be used to transform eukaryotic cells, either directly or as fragments of interest isolated therefrom, and integrate all or part of these into the genome of these cells, resulting in a stable host cell comprising the desired nucleic acid in its genome.
Conventional expression systems are DNA molecules in the form of recombinant plasmids or recombinant viral genomes. Plasmids or viral genomes are introduced into (eukaryotic) cells and preferably integrated into their genome by methods known in the art, some aspects of which are described in WO2006/048459 (e.g. pages 30-31), incorporated herein by reference.
It is widely recognized that chromatin structure and other epigenetic regulatory mechanisms can influence transgene expression in eukaryotic cells (e.g., Whitelaw et al, 2001). The polycistronic expression unit in this aspect forms part of a selection system with a stringent selection regime. This usually requires high transcription levels in the host cell of choice. In order to increase the chance of finding a viable host cell clone under stringent selection regimes, and possibly to increase the stability of expression in the resulting clones, it is generally preferred to increase the predictability of transcription. Thus, in a preferred embodiment, the expression cassette of the invention further comprises at least one chromatin control element. "Chromatin control element" is used herein as a generic term for DNA sequences that affect the chromatin structure in eukaryotic cells and thereby the expression level and/or expression stability of a transgene in their vicinity (they act in "cis" and are therefore preferably located within 5 kb, more preferably 2 kb, more preferably 1 kb of the transgene). These elements are sometimes used to increase the number of clones with desired transgene expression levels. Several types of these elements that may be used in the present invention are described in WO2006/048459 (e.g., pages 32-34), incorporated herein by reference. For the purposes of the present invention, chromatin control elements are selected from the following: matrix or scaffold attachment regions (MAR/SAR), insulators such as the beta-globin insulator element (5' HS4 of the chicken beta-globin locus), scs, scs', and the like, Universal Chromatin Opening Elements (UCOE) and anti-repressor sequences (also known as "STAR" sequences).
Preferably, the chromatin control element is an anti-repressor sequence, preferably selected from the group consisting of: a) any one of SEQ. ID. NO. 1 to SEQ. ID. NO. 66; b) a fragment of any one of SEQ. ID. NO. 1 to SEQ. ID. NO. 66, wherein said fragment has anti-repressor activity ("functional fragment"); c) a sequence which is at least 70% identical in nucleotide sequence to a) or b), wherein said sequence has anti-repressor activity ("functional derivative"); d) the complement of any one of a) to c). Preferably, the chromatin control element is selected from the group consisting of: STAR67 (SEQ. ID. NO. 66), STAR7 (SEQ. ID. NO. 7), STAR9 (SEQ. ID. NO. 9), STAR17 (SEQ. ID. NO. 17), STAR27 (SEQ. ID. NO. 27), STAR29 (SEQ. ID. NO. 29), STAR43 (SEQ. ID. NO. 43), STAR44 (SEQ. ID. NO. 44), STAR45 (SEQ. ID. NO. 45), STAR47 (SEQ. ID. NO. 47), STAR61 (SEQ. ID. NO. 61), or a functional fragment or derivative of said STAR sequences. In a preferred embodiment, the STAR sequence is STAR67 (SEQ. ID. NO. 66) or a functional fragment or derivative thereof. In certain preferred embodiments, STAR67 or a functional fragment or derivative thereof is located upstream of a promoter that drives expression of the polycistronic transcription unit. In other preferred embodiments, the expression cassette of the invention is flanked on both sides by at least one anti-repressor sequence, for example by one of SEQ. ID. NO. 1 to SEQ. ID. NO. 65, preferably with the 3' end of each of these sequences facing the transcription unit. In certain embodiments, the expression cassettes of the invention comprise, in order from 5' to 3': anti-repressor sequence A-anti-repressor sequence B-[promoter-polycistronic transcription unit of the invention (encoding the polypeptide of interest and, downstream thereof, a functional selectable marker protein)-transcription termination sequence]-anti-repressor sequence C, wherein A, B and C may be the same or different.
Sequences having anti-repressor activity (anti-repressor sequences) and their characteristics, as well as functional fragments or derivatives thereof, as well as their structural and functional definitions, and methods of obtaining and using them (these sequences can be used in the present invention) are described in WO2006/048459 (e.g. pages 34-38), incorporated herein by reference.
For the production of multimeric proteins, two or more expression cassettes may be used. Preferably, both expression cassettes are polycistronic expression cassettes of the invention, each encoding a different selectable marker protein, such that both expression cassettes can be selected for. This embodiment has been shown to give good results, for example for expression of antibody heavy and light chains. It will be clearly understood that the two expression cassettes may be located on one nucleic acid molecule or may be present on separate nucleic acid molecules prior to introduction into the host cell. The advantage of placing them on one nucleic acid molecule is that the two expression cassettes are present in a predetermined ratio (1:1) when introduced into the host cell. On the other hand, when they are present on two different nucleic acid molecules, the molar ratio of the two expression cassettes can be varied when they are introduced into the host cell. This is advantageous where the preferred molar ratio is not 1:1 or where the preferred molar ratio is not known beforehand, because the person skilled in the art can then easily vary the ratio and find the optimum empirically. According to the present invention, preferably at least one expression cassette, but more preferably each expression cassette, comprises a chromatin control element, more preferably an anti-repressor sequence.
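When the two expression cassettes are on separate nucleic acid molecules, setting a chosen molar ratio comes down to simple arithmetic on DNA mass and length. The Python sketch below is one way to do that calculation; it assumes double-stranded DNA with an average mass of about 650 g/mol per base pair (a common rule-of-thumb value, not a figure from this document), and the function names and plasmid sizes are hypothetical.

```python
# Approximate average molar mass of one base pair of double-stranded DNA (g/mol).
# Rule-of-thumb value assumed for illustration; it is not taken from the text.
AVG_BP_MASS = 650.0

def pmol_of_dna(mass_ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) to picomoles."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)

def mass_for_molar_ratio(mass_a_ng, len_a_bp, len_b_bp, ratio_a=1.0, ratio_b=1.0):
    """Mass of molecule B (ng) that gives a molar ratio A:B of ratio_a:ratio_b."""
    pmol_a = pmol_of_dna(mass_a_ng, len_a_bp)
    pmol_b = pmol_a * ratio_b / ratio_a
    return pmol_b * len_b_bp * AVG_BP_MASS / 1000.0

# Hypothetical example: 1000 ng of a 6000 bp plasmid and a second plasmid of 9000 bp.
equimolar_ng = mass_for_molar_ratio(1000.0, 6000, 9000)             # 1:1 molar ratio
double_b_ng = mass_for_molar_ratio(1000.0, 6000, 9000, 1.0, 2.0)    # 1:2 molar ratio
```

Because the larger plasmid contributes fewer molecules per nanogram, an equimolar mix needs proportionally more mass of it; the same function can be used to scan a range of ratios when the optimum is to be found empirically.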
In another embodiment, different subunits or portions of the multimeric protein are present in one expression cassette.
Configurations of useful anti-repressor sets and expression cassettes have been described in WO2006/048459 (e.g., page 40), incorporated herein by reference.
In certain embodiments, the transcription unit or expression cassette of the invention further comprises a transcription pause (TRAP) sequence, substantially as described in WO2006/048459, pages 40-41, incorporated herein by reference. A non-limiting example of a TRAP sequence is given in SEQ. ID. NO. 81. Other examples of TRAP sequences, methods for their discovery and their use are described in WO 2004/055215.
DNA molecules comprising the polycistronic transcription unit and/or the expression cassette of the invention may be used to improve expression of the nucleic acid, preferably in a host cell. The terms "cell"/"host cell" and "cell line"/"host cell line" are typically defined as cells and their cognate population, respectively, that can be maintained in cell culture by methods known in the art, and that have the ability to express heterologous or homologous proteins.
Several examples of host cells that can be used are described in WO2006/048459 (e.g., pages 41-42), incorporated herein by reference. Such cells include, for example, mammalian cells, including but not limited to CHO cells, e.g., CHO-K1, CHO-S, CHO-DG44, CHO-DUKXB11, including CHO cells having a dhfr⁻ phenotype, as well as myeloma cells (e.g., Sp2/0, NS0), HEK 293 cells, and PER.C6 cells.
These eukaryotic host cells can express the desired polypeptide and are often used for that purpose. They can be obtained by introducing the DNA molecules of the invention, preferably in the form of expression cassettes, into cells. Preferably, the expression cassette is integrated into the genome of the host cell. Integration can occur at different sites in different host cells, and clones can be selected in which the transgene has integrated at a suitable site, resulting in host cell clones with the desired properties in terms of expression level, stability and growth characteristics. Alternatively, the polycistronic transcription unit may be targeted to, or randomly selected for integration into, a transcriptionally active chromosomal region, e.g., behind a promoter present in the genome. Selection of cells containing the DNA of the invention is carried out by selecting for the selectable marker polypeptide using conventional methods known to those skilled in the art. When such a polycistronic transcription unit is integrated behind a promoter in the genome, the expression cassette of the invention is generated in situ, i.e., in the genome of the host cell.
Preferably, the host cells are obtained from stable clones selected and passaged according to standard methods known to those skilled in the art. If these cells contain the polycistronic transcription unit of the invention, a culture of such a clone will produce the polypeptide of interest.
Introduction of the nucleic acid to be expressed into the cell may be carried out by one of several methods, which are known to the person skilled in the art and depend on the form of the nucleic acid to be introduced. Such methods include, but are not limited to, transfection, infection, injection, transformation, and the like. Suitable host cells for expression of the polypeptide of interest may be obtained by selection.
In a preferred embodiment, the DNA molecule comprising the polycistronic transcription unit of the invention, preferably in the form of an expression cassette, is integrated into the genome of the eukaryotic host cell of the invention. This will provide stable inheritance of polycistronic transcription units.
Selection for the presence of the selectable marker polypeptide, and hence for expression, can be performed when the cells are initially obtained. In certain embodiments, the selective agent is present in the culture medium at least part of the time during the culturing process, either at a concentration sufficient to select for cells expressing the selectable marker polypeptide, or at a lower concentration. In a preferred embodiment, the selective agent is no longer present in the culture medium during the production phase, when the polypeptide is expressed.
The polypeptide of interest of the invention may be any protein, and may be a monomeric protein or a multimeric protein (or a part thereof). A multimeric protein comprises at least two polypeptide chains. Non-limiting examples of proteins of interest of the present invention are enzymes, hormones, immunoglobulin chains, therapeutic proteins such as anti-cancer proteins, blood coagulation proteins such as factor VIII, multifunctional proteins such as erythropoietin, diagnostic proteins, or proteins or fragments thereof for vaccination purposes, all of which are known to those skilled in the art.
In certain embodiments, the expression cassette of the invention encodes an immunoglobulin heavy or light chain or antigen-binding portion, derivative and/or analog thereof. In a preferred embodiment, a protein expression unit of the invention is provided, wherein the protein of interest is an immunoglobulin heavy chain. In another preferred embodiment, a protein expression unit of the invention is provided, wherein the protein of interest is an immunoglobulin light chain. When these two protein expression units are present in the same (host) cell, the multimeric protein, more specifically the immunoglobulin, is synthesized. Thus, in certain embodiments, the protein of interest is an immunoglobulin, e.g., an antibody, that is a multimeric protein. Preferably, such an antibody is a human or humanized antibody. In certain embodiments, it is an IgG, IgA, or IgM antibody. The immunoglobulin may be encoded by the heavy and light chains on different expression cassettes or on one expression cassette. Preferably, the heavy and light chains are present on different expression cassettes, each having its own promoter (which may be the same or different for both expression cassettes), each comprising a polycistronic transcription unit of the invention, the heavy and light chains being a polypeptide of interest, preferably each encoding a different selectable marker protein, such that the heavy and light chain expression cassettes can be selected when the expression cassettes are introduced into and/or present in a eukaryotic host cell.
The polypeptide of interest may be from any source, and in certain embodiments is a mammalian protein, an artificial protein (e.g., a fusion protein or a mutein), preferably a human protein.
Obviously, the expression cassette configuration of the invention can also be used when the ultimate goal is not to produce the polypeptide of interest but the RNA itself, e.g., to produce an increased amount of RNA from the expression cassette, which may be used for the purpose of regulating other genes (e.g., RNAi, antisense RNA), gene therapy, in vitro protein expression, etc.
In one aspect, the invention provides a method for producing a host cell expressing a polypeptide of interest, the method comprising introducing a DNA molecule or expression cassette of the invention into a plurality of precursor cells, culturing the produced cells under selective conditions and selecting at least one host cell that produces the polypeptide of interest. The advantages of this new process are similar to the alternative process described in WO2006/048459 (e.g. pages 46-47) and incorporated herein by reference.
When clones with relatively low copy numbers of the polycistronic transcription unit and high expression levels have been obtained, the selection system of the present invention can still be combined with amplification methods to further improve expression levels. This can be achieved, for example, by amplifying a co-integrated dhfr gene using methotrexate, for example by placing dhfr on the same nucleic acid molecule as the polycistronic transcription unit of the invention, or by co-transfection when dhfr is on a different DNA molecule. The dhfr gene may also be part of a polycistronic expression unit of the invention.
The invention also provides a method of producing one or more polypeptides of interest, the method comprising culturing a host cell of the invention.
Culturing a cell is done to enable it to metabolize and/or grow and/or divide and/or produce the recombinant protein of interest. This can be accomplished by methods well known to those skilled in the art, including but not limited to providing nutrients to the cells. The methods comprise adherent growth, suspension growth, or a combination thereof. The cultivation may be carried out, for example, in a petri dish, a spinner flask or a bioreactor, using batch, fed-batch, or continuous systems such as perfusion systems, etc. In order to achieve large-scale (continuous) production of recombinant proteins by cell culture, it is preferred in the art to use cells which can be cultured in suspension, preferably cells which can be cultured in the absence of serum of animal or human origin or serum components of animal or human origin.
The conditions under which cells are grown or propagated (see, for example, Tissue Culture, Academic Press, Kruse and Paterson, editors (1973)) and the conditions under which the recombinant product is expressed are known to those skilled in the art. In general, the principles, procedures and operating techniques for maximizing the productivity of mammalian cell cultures can be found in Mammalian Cell Biotechnology: a Practical Approach (M. Butler, ed., IRL Press, 1991).
In preferred embodiments, the expressed protein is collected (isolated) from the cell or from the culture medium or from both. It may be further purified by known methods such as filtration, column chromatography, etc. well known to those skilled in the art.
The selection methods of the invention work without chromatin control elements, but improved results are obtained when polycistronic expression units are provided with such elements. The selection method of the invention works particularly well when the expression cassette of the invention used comprises at least one anti-repressor sequence. Depending on the selection agent and conditions, the selection may in some cases be so stringent that only very few or even no host cells survive the selection unless an anti-repressor sequence is present. Thus, the combination of the new selection method and the anti-repressor sequence provides a very attractive way of obtaining a limited number of clones with a greatly improved chance of highly expressing the polypeptide of interest, while the obtained clones comprising expression cassettes with anti-repressor sequences provide stable expression of the polypeptide of interest, i.e. they are less prone to silencing or other expression-reducing mechanisms than conventional expression cassettes.
In one aspect, the present invention provides a polycistronic transcription unit having a different configuration from that disclosed in WO2006/048459: in the configuration of the present invention, the sequence encoding the polypeptide of interest is located upstream of the sequence encoding the selectable marker polypeptide, and the latter is operably linked to a cap-independent translation initiation sequence, preferably an internal ribosome entry site (IRES). Such polycistronic transcription units are known as such (e.g., Rees et al, 1996, WO 03/106684), but have not been combined with a non-ATG initiation codon. According to the present invention, the initiation codon of the selectable marker polypeptide is changed to a non-ATG initiation codon to further reduce the translation initiation rate of the selectable marker. This results in the desired reduction in the level of expression of the selectable marker polypeptide and can result in efficient selection of host cells that express high levels of the polypeptide of interest, as in the embodiments disclosed in WO2006/048459. One possible benefit of this aspect of the invention compared with the embodiments of WO2006/048459 is that the coding sequence for the selectable marker polypeptide need not be further modified with respect to internal ATG sequences: any internal ATG sequences therein may remain unchanged, since they are no longer relevant to the translation of a downstream polypeptide. This is particularly advantageous when the coding sequence for the selectable marker polypeptide contains several internal ATG sequences, since for the present invention it is no longer necessary to alter these sequences and test the functionality of the resulting construct: in this case, it is sufficient to mutate only the ATG initiation codon. Example 1 below shows that this alternative provided by the present invention also produces very good results.
The coding sequence for the selectable marker polypeptide in the DNA molecule of the invention is translated under the control of an IRES, however the coding sequence for the polypeptide of interest is preferably translated in a cap-dependent manner. The coding sequence for the polypeptide of interest comprises a stop codon such that translation of the first cistron terminates upstream of the IRES operably linked to the second cistron.
It will be readily apparent to those skilled in the art upon reading the present disclosure that most aspects of these polycistronic expression units can advantageously be varied in the same manner as for the polycistronic expression units having the coding sequences for the polypeptide of interest and the selectable marker polypeptide in the reverse order (i.e., the polycistronic transcription units of WO2006/048459, incorporated herein by reference). For example, the preferred initiation codon for the selectable marker polypeptide, the incorporation into expression cassettes, the host cells, the promoters, the presence of chromatin control elements, and the like, can be varied and used in preferred embodiments as described above. The use of these polycistronic expression units and expression cassettes is also as described above. Thus, this aspect is indeed an alternative to the methods and means described in WO2006/048459, the main difference being that the order of the polypeptides in the polycistronic expression unit is reversed, and that the IRES is now essential for translation of the selectable marker polypeptide.
As used herein, "internal ribosome entry site" or "IRES" refers to an element that promotes direct internal ribosome entry to the initiation codon of a cistron (protein coding region), which is generally ATG but in the present invention is preferably GTG or TTG, thereby leading to cap-independent translation of the gene. See, for example, Jackson R J, Howell M T, Kaminski A (1990) Trends Biochem Sci 15 (12): 477-83, and Jackson R J and Kaminski, A. (1995) RNA 1 (10): 985-1000. The present invention encompasses the use of any cap-independent translation initiation sequence, in particular any IRES element that can promote direct internal ribosome entry to the initiation codon of a cistron. As used herein, "under translational control of an IRES" means that translation is associated with the IRES and proceeds in a cap-independent manner. The term "IRES" as used herein includes functional variants of IRES sequences, provided that such variants are capable of promoting direct internal ribosome entry to the initiation codon of a cistron. As used herein, "cistron" refers to a polynucleotide sequence or gene for a protein, polypeptide, or peptide of interest. "Operably linked" refers to a state in which the described components are in a relationship such that they can function in the intended manner. Thus, for example, a promoter is "operably linked" to a cistron in such a way that expression of the cistron is achieved under conditions compatible with the promoter. Similarly, the nucleotide sequence of the IRES is operably linked to the cistron in such a way that translation of the cistron is achieved under conditions compatible with the IRES.
Internal ribosome entry site (IRES) elements are known from viral and mammalian genes (Martinez-Salas, 1999), and have also been identified in screens of small synthetic oligonucleotides (Venkatesan & Dasgupta, 2001). The IRES of encephalomyocarditis virus has been analyzed in detail (Mizuguchi et al, 2000). An IRES is an element encoded in DNA that results in a structure in the transcribed RNA to which eukaryotic ribosomes can bind and at which they can initiate translation. An IRES allows two or more proteins to be produced from a single RNA molecule (the first protein is translated by ribosomes that bind the RNA at the cap structure at its 5' end (Martinez-Salas, 1999)). Protein translation from IRES elements is less efficient than cap-dependent translation: the amount of protein obtained from IRES-dependent open reading frames (ORFs) ranges from less than 20% to 50% of the amount from the first ORF (Mizuguchi et al, 2000). The reduced efficiency of IRES-dependent translation provides an advantage that is exploited by this embodiment of the invention. Furthermore, mutations in IRES elements may attenuate their activity, reducing expression from the IRES-dependent ORF to less than 10% of that from the first ORF (Lopez de Quinto & Martinez-Salas, 1998, Rees et al, 1996). It is therefore clear to one skilled in the art that an IRES can be altered without changing its basic function (providing a protein translation initiation site with reduced translation efficiency), resulting in a modified IRES. The use of such modified IRESs that still provide a small percentage of translation (compared with 5' cap-dependent translation) is also encompassed by the present invention. The present invention uses a non-ATG start codon to further and significantly reduce translation initiation of the selectable marker ORF, thus further improving the chances of obtaining a preferred host cell, i.e., a host cell expressing high levels of the recombinant protein of interest.
U.S. Pat. Nos. 5,648,267 and 5,733,779 describe the use of a dominant selectable marker sequence with an impaired consensus Kozak sequence ([Py]xxATG[Py], wherein [Py] is a pyrimidine nucleotide (i.e., C or T), x is any nucleotide (i.e., G, A, T or C), and ATG is the initiation codon). U.S. Pat. No. 6,107,477 describes the use of a non-optimal Kozak sequence (AGATCTTTATGGACC, where ATG is the initiation codon) for a selectable marker gene. None of these patents describes the use of a non-ATG initiation codon, nor do they provide any suggestion to do so. Nor do they describe combination with an IRES. Moreover, because the IRES itself already has reduced translation initiation compared with cap-dependent translation, it was not possible to predict prior to the present invention whether combining an IRES with a non-ATG initiation codon for the selectable marker would provide sufficient translation to produce a selectable level of the selectable marker polypeptide. The present invention shows that it does, providing a surprisingly effective selection system.
The invention also provides a DNA molecule comprising a sequence encoding a selectable marker polypeptide operably linked to an IRES sequence, wherein the coding sequence encoding the selectable marker polypeptide comprises a translation initiation sequence selected from the group consisting of: a) a GTG start codon; b) a TTG start codon; c) a CTG start codon; d) an ATT start codon; and e) an ACG start codon.
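The arrangement described above (a first cistron for the polypeptide of interest that ends in a stop codon, followed by an IRES and a selectable marker cistron with one of the listed non-ATG start codons) can be sketched as a small validity check. The Python class below is a hypothetical illustration only; the class name, field names and placeholder sequences are assumptions, and the IRES is represented by a dummy string rather than a real sequence.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}
# Start codons listed for the IRES-linked selectable marker.
MARKER_START_CODONS = {"GTG", "TTG", "CTG", "ATT", "ACG"}

@dataclass
class BicistronicUnit:
    poi_cds: str     # coding sequence of the polypeptide of interest (first cistron)
    ires: str        # IRES sequence (a placeholder string in this sketch)
    marker_cds: str  # coding sequence of the selectable marker (second cistron)

    def problems(self):
        """Return a list of deviations from the described configuration."""
        issues = []
        poi = self.poi_cds.upper()
        marker = self.marker_cds.upper()
        if poi[-3:] not in STOP_CODONS:
            issues.append("first cistron lacks a stop codon upstream of the IRES")
        if not self.ires:
            issues.append("no IRES between the two cistrons")
        if marker[:3] not in MARKER_START_CODONS:
            issues.append("marker start codon is not one of the listed non-ATG codons")
        return issues

# Toy examples with made-up sequences.
valid_unit = BicistronicUnit("ATGGCTTAA", "NNNN", "TTGGCCAAGTGA")
invalid_unit = BicistronicUnit("ATGGCT", "NNNN", "ATGGCCTGA")
```

In this sketch `valid_unit.problems()` is empty, while `invalid_unit.problems()` reports both the missing stop codon and the ATG start codon on the marker.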
Those skilled in the art will appreciate that further modifications to the present invention are possible, such as those described in US2006/0195935, particularly examples 20-27 thereof, which are incorporated herein by reference.
In certain embodiments, the mammalian 5,6,7,8-tetrahydrofolate-synthesizing enzyme dihydrofolate reductase (dhfr) can be used as a selectable marker in cells that have a dhfr⁻ phenotype (e.g., CHO-DG44 cells) by removing hypoxanthine and thymidine from the culture medium (preferably glycine is also removed) and including folate (or (dihydro)folate) in the culture medium (Simonsen et al, 1988). The dhfr gene may for example be derived from the mouse genome or mouse cDNA and used in the present invention; it is preferably provided with a GTG or TTG start codon (see the dhfr gene sequence in the sequence listing). In all of these embodiments, "removed from the medium" means that the medium is substantially free of the indicated component, i.e., that insufficient amounts of the indicated component are present in the medium to sustain cell growth, so that a good selection is possible when the genetic information for the indicated enzyme is expressed in the cell and the indicated precursor component is present in the medium. For example, the indicated component is present at a concentration of less than 0.1% of the concentration at which it would normally be used in the culture medium of a certain cell type. Preferably, the indicated component is not present in the culture medium at all. Media without the indicated components can be prepared by standard methods by those skilled in the art or can be obtained from commercial media suppliers. One potential advantage of using these types of metabolic enzymes as selectable marker polypeptides is that they can be used to keep polycistronic transcription units under continuous selection, which may result in higher expression of the polypeptide of interest.
In another aspect, the invention uses a dhfr metabolic selectable marker as an additional selectable marker on a polycistronic transcription unit of the invention. In such embodiments, selection of host cell clones with high expression is first established by using, for example, an antibiotic selection marker such as zeocin, neomycin, etc.; according to the invention, the coding sequences of these markers will have either a GTG or a TTG start codon. After appropriate clones have been selected, antibiotic selection is discontinued, and continuous or intermittent selection using the metabolic enzyme selection marker can be performed by culturing the cells in a medium lacking the appropriate indicated component described above and containing the appropriate precursor component described above. In this aspect, the metabolic selection marker is operably linked to an IRES and may have its usual ATG content, and its initiation codon may suitably be chosen from GTG or TTG. In this aspect, the polycistronic transcription unit is at least tricistronic.
The practice of the present invention will employ, unless otherwise indicated, conventional immunological, molecular biological, microbiological, cell biological and recombinant DNA techniques, which are within the knowledge of one of ordinary skill in the art. See Sambrook, Fritsch and Maniatis, Molecular Cloning: A Laboratory Manual, second edition, 1989; Current Protocols in Molecular Biology, Ausubel FM, et al, eds, 1987; the series Methods in Enzymology (Academic Press, Inc.); PCR 2: A Practical Approach, MacPherson MJ, Hames BD, Taylor GR, eds, 1995; Antibodies: A Laboratory Manual, Harlow and Lane, eds, 1988.
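The "substantially free" criterion given as an example above (less than 0.1% of the usual concentration) can be expressed as a simple check on a medium recipe. The Python sketch below is illustrative only: the function names, the dictionary layout and the concentration values are assumptions chosen for the example, not values taken from this document.

```python
def is_substantially_free(actual_conc, usual_conc, threshold_fraction=0.001):
    """True if a component is below 0.1% of its usual concentration."""
    if usual_conc <= 0:
        raise ValueError("usual concentration must be positive")
    return actual_conc < threshold_fraction * usual_conc

def suitable_for_dhfr_selection(medium, usual):
    """Check a medium for dhfr-based selection as outlined in the text:
    hypoxanthine and thymidine removed, folate present."""
    removed = all(
        is_substantially_free(medium.get(name, 0.0), usual[name])
        for name in ("hypoxanthine", "thymidine")
    )
    return removed and medium.get("folate", 0.0) > 0.0

# Illustrative concentrations in arbitrary but consistent units (assumed values).
usual = {"hypoxanthine": 100.0, "thymidine": 16.0}
selective_medium = {"hypoxanthine": 0.05, "thymidine": 0.0, "folate": 9.0}
supplemented_medium = {"hypoxanthine": 100.0, "thymidine": 16.0, "folate": 9.0}
```

Under these assumed values, `selective_medium` passes the check and `supplemented_medium` does not. The 0.1% threshold is only the example given in the text; the underlying requirement is that too little of the component remains to sustain cell growth.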
The invention will be further elucidated in the following examples. The examples are not intended to limit the invention in any way. They are merely intended to illustrate the invention.
Examples
Example 1 describes a selection system with a polycistronic transcription unit of the present invention. It will be clear that the variations described in examples 8-26 of WO2006/048459, which is incorporated herein by reference, can also be used and tested with a polycistronic transcription unit of the present invention. The same is true for examples 20-27 of US 2006/0195935.
Example 1: Stringent selection by placing a modified Zeocin resistance gene behind an IRES sequence
Examples 8-26 of WO2006/048459 (incorporated herein by reference in its entirety) have shown a selection system in which the sequence encoding the selectable marker protein on the polycistronic transcription unit is upstream of the sequence encoding the protein of interest, in which the translation start sequence of the selectable marker is non-optimal, and in which the remaining internal ATGs have been removed from the selectable marker coding sequence. This results in a selection system of high stringency. For example, a Zeo selection marker in which the translation initiation codon is changed to TTG shows very high selection stringency, as well as very high expression levels of the downstream encoded protein of interest.
In another possible selection system (i.e., the system of the present invention), a selection marker such as Zeo is placed downstream of an IRES sequence. This results in a polycistronic mRNA from which the Zeo gene product is translated via IRES-dependent initiation. In the usual d2EGFP-IRES-Zeo construct (i.e. one of the prior art, e.g. WO 2006/005718), the Zeo start codon is the optimal ATG. We tested whether changing the Zeo ATG start codon to, for example, TTG (referred to as IRES-TTG Zeo) results in increased selection stringency compared to the usual IRES-ATG Zeo.
Results
The constructs used are shown in FIG. 1. The control construct consisted of the CMV promoter, the d2EGFP gene, an IRES sequence (the sequence of the IRES used in this example (Rees et al, 1996) is GCCCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATATGTGATTTTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAAGCTTGCCACAACCCCGGGATA; SEQ. ID. NO. 82) and the TTG Zeo selection marker, i.e. the zeocin resistance gene with a TTG start codon ('d2EGFP-IRES-TTG Zeo'). The other construct was identical, but contained a combination of STAR7 and STAR67 upstream of the expression cassette and STAR7 downstream of the expression cassette ('STAR7/67 d2EGFP-IRES-TTG Zeo STAR7'). Both constructs were transfected into CHO-K1 cells, and selection was performed in medium containing 100 μg/ml Zeocin. Four colonies appeared after transfection with the control construct, and six colonies appeared after transfection with the construct containing STAR elements. These independent colonies were isolated and propagated before analysis of d2EGFP expression levels. As shown in FIG. 1, incorporation of STAR elements in the construct resulted in the formation of colonies with high d2EGFP expression levels, whereas only one of the control colonies without STAR elements ('d2EGFP-IRES-TTG Zeo') showed some d2EGFP expression.
The expression levels were also much higher than those obtained with other control constructs containing an IRES with a conventional Zeo having a standard ATG start codon, with or without STAR elements ('d2EGFP-IRES-ATG Zeo' and 'STAR7/67 d2EGFP-IRES-ATG Zeo STAR7'; an enhancing effect of the STAR elements was also found with these ATG Zeo constructs, but it was small compared with that seen for the new TTG Zeo variants).
These results indicate that placing a Zeo selection marker with a TTG initiation codon downstream of the IRES sequence, in combination with STAR elements, works well and establishes a stringent selection system.
From these data, together with examples 8-26 of WO2006/048459 and examples 20-27 of US 2006/0195935, it is clear that the markers can be varied in the same way as in those examples. For example, instead of a TTG start codon, a GTG start codon may be used, and the marker may be changed from Zeo to a different marker, e.g., Neo, Blas, dhfr, puro, etc., all with GTG or TTG as start codon. The STAR elements can be varied by using different STAR sequences or different arrangements thereof, or they can be replaced with other chromatin control elements, such as MAR sequences. This leads to an improvement over the prior art selection systems having an IRES with a marker that has the usual ATG start codon.
As a non-limiting example, instead of the modified Zeo resistance gene (TTG Zeo), a modified neomycin resistance gene is placed downstream of the IRES sequence. The modification is the replacement of the ATG translation initiation codon of the Neo coding sequence by a TTG translation initiation codon, resulting in TTG Neo. CHO-K1 cells are transfected with CMV-d2EGFP-IRES-TTG Neo constructs with or without STAR elements. Colonies are picked, and the cells are propagated and tested for d2EGFP values. This system ('IRES-TTG Neo') results in an improvement over the known selection system in which Neo with an ATG start codon is placed downstream of an IRES ('IRES-ATG Neo'). The improvement is particularly evident when the TTG Neo construct contains STAR elements.
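The start codon substitution described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, the toy sequence is made up and is not the Zeo or Neo coding sequence, and the sketch swaps only the first codon without addressing any other feature of the coding sequence.

```python
def replace_start_codon(cds, new_start="TTG"):
    """Return cds with its ATG start codon replaced by GTG or TTG."""
    cds = cds.upper()
    if new_start not in ("GTG", "TTG"):
        raise ValueError("new_start must be 'GTG' or 'TTG'")
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence does not start with ATG")
    # Only the first three nucleotides change; the rest is kept as-is.
    return new_start + cds[3:]


# Made-up toy sequence, NOT a real selectable marker gene.
toy_cds = "ATGAAACCCGGGTAA"
print(replace_start_codon(toy_cds, "TTG"))
print(replace_start_codon(toy_cds, "GTG"))
```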
Example 2: Stability of expression with a modified dhfr gene placed behind an IRES sequence
Changing the translation initiation codon of the Zeocin selection marker to a translation initiation codon that is used much less frequently than the usual ATG codon results in a selection system of high stringency. In the selection system described in WO2006/048459, TTG Zeo is placed upstream of the gene of interest. In another possible selection system, a Zeo selection marker is placed downstream of an IRES sequence (see example 1 herein). This results in a bicistronic mRNA from which the Zeo gene product is translated by initiation at the IRES sequence.
In this experiment, we combined embodiments of the two systems. We placed a TTG selection marker upstream of the reporter gene and coupled a GTG- or TTG-modified metabolic marker to the reporter gene by means of an IRES. Different selectable marker genes can be used, such as the Zeocin and neomycin resistance genes, and the dhfr gene. Here we placed the modified Zeocin resistance gene TTG Zeo (see WO 2006/048459) upstream of the gene of interest, and the dhfr selection gene downstream of the gene of interest, coupled by an IRES (FIG. 2). The purpose of this expression cassette was to select mammalian cell clones that produce high levels of protein by first selecting with Zeocin. The TTG Zeo-gene of interest configuration achieves this goal most efficiently. After this initial selection period, the properties of the dhfr protein were used to maintain high expression levels in the absence of the antibiotic Zeocin.
Active selection pressure appears to be beneficial in maintaining the protein expression levels of TTG Zeo selected colonies at the same high levels over a long period of time. This can be achieved, for example, by keeping a minimum amount of Zeocin in the culture medium, but this is not favoured in an industrial setting for economic or possible regulatory reasons (Zeocin is toxic and expensive).
Another approach is to couple the gene of interest to a selection marker that is an enzyme catalyzing one or more essential steps in a metabolic pathway. Essential means here that the cell cannot synthesize specific essential metabolic components by itself, so that these components must be present in the culture medium for the cell to survive. Well-known examples are the essential amino acids, which cannot be synthesized by mammalian cells and must be present in the culture medium for the cells to survive. Another example relates to the dhfr gene, which is required for 5,6,7,8-tetrahydrofolate synthesis. The corresponding dhfr protein is an enzyme in the folate metabolic pathway. The dhfr protein specifically converts folate to 5,6,7,8-tetrahydrofolate, a methyl shuttle required for de novo synthesis of purines (hypoxanthine), thymidylate (thymidine), and the amino acid glycine. Operationally, the non-toxic substance folic acid must be present in the culture medium (Urlaub et al, 1980). Furthermore, the medium must lack hypoxanthine and thymidine, because the requirement for the dhfr enzyme is bypassed when these components are available to the cell. CHO-DG44 cells lack the dhfr gene, and therefore these cells require glycine, hypoxanthine and thymidine in the culture medium to survive. However, if the end products glycine, hypoxanthine and thymidine are absent from the culture medium, folate is present, and the dhfr gene is provided to the cell on an expression cassette, the cell can convert folate to 5,6,7,8-tetrahydrofolate and survive in such a medium. This principle has been used for many years as a selection method for the generation of stably transfected mammalian cell lines.
Here we apply this principle not to select stable clones initially (this is done with Zeocin), but to maintain the cells under metabolic selection pressure. This has the advantage that very high protein expression can initially be achieved with the TTG Zeo selection system and that these high expression levels can be maintained without the need to keep Zeocin in the culture medium. Instead, Zeocin can be removed from the culture medium, and the absence of glycine, hypoxanthine and thymidine (GHT), or of hypoxanthine and thymidine (HT) alone, is sufficient to maintain a selection pressure high enough to ensure high levels of protein expression. Such a configuration requires the presence of two selectable markers, the Zeocin resistance gene and the dhfr gene, in the expression cassette. As described above, this is achieved efficiently when the two marker genes and the gene of interest are present in a configuration in which a tricistronic mRNA is transcribed from a single promoter. When the modified Zeocin resistance gene (TTG Zeo) is located upstream of the d2EGFP gene, the dhfr gene needs to be coupled downstream of the d2EGFP gene by, for example, an IRES sequence (FIG. 1).
Results
We generated constructs in which the TTG Zeo selection marker was located upstream of the d2EGFP reporter gene and the dhfr selection marker was located downstream of the d2EGFP gene, coupled by an IRES sequence (FIG. 2). These constructs were flanked by STAR 7/67/7. Three versions of this construct were made: ATG dhfr, GTG dhfr and TTG dhfr, each name indicating the start codon used for the dhfr gene. These constructs were transfected into CHO-DG44 cells. DNA transfection was performed using Lipofectamine 2000 (Invitrogen), and cells were grown in IMDM medium (Gibco) + 10% FBS (Gibco) + HT-supplement in the presence of 400 μg/ml Zeocin.
In 14 TTG Zeo IRES ATG dhfr clones measured in the presence of 400 μg/ml Zeocin, the average d2EGFP value was 341 (day 1). After the measurement, the cells were split and cultured under three conditions:
(1) medium containing 400 μg/ml Zeocin as well as hypoxanthine and thymidine (HT-supplement),
(2) medium without Zeocin, but with HT-supplement,
(3) medium without Zeocin and without HT-supplement.
Briefly, in condition 1 the cells were under Zeocin selection pressure only, in condition 2 the cells were under no selection pressure, and in condition 3 the cells were under dhfr selection pressure. This last condition requires continuous expression of the dhfr gene, so that the dhfr protein is produced and the cells survive.
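The relation between the three medium compositions and the resulting selection pressure can be summarized in a short Python sketch. It assumes dhfr-deficient host cells (such as the CHO-DG44 cells used here) carrying a construct with both the Zeo and dhfr markers; the function and variable names are hypothetical.

```python
def selection_pressure(zeocin, ht_supplement):
    """List the markers under selection for a given medium composition.

    zeocin: True if the medium contains Zeocin.
    ht_supplement: True if hypoxanthine and thymidine are supplied.
    Assumes dhfr-deficient host cells carrying both Zeo and dhfr markers.
    """
    pressures = []
    if zeocin:
        pressures.append("Zeocin")
    if not ht_supplement:
        # Without HT the cells depend on the dhfr gene on the construct.
        pressures.append("dhfr")
    return pressures


conditions = {
    1: {"zeocin": True, "ht_supplement": True},
    2: {"zeocin": False, "ht_supplement": True},
    3: {"zeocin": False, "ht_supplement": False},
}
for number, medium in conditions.items():
    print(number, selection_pressure(**medium) or ["none"])
```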
After 65 days we measured the d2EGFP values again. The average d2EGFP value of the TTG Zeo IRES ATG dhfr clones under Zeocin selection was now 159 (FIG. 2). The average d2EGFP value of the TTG Zeo IRES ATG dhfr clones without Zeocin selection but with HT supplement was 20 (FIG. 2). The average d2EGFP value of the TTG Zeo IRES ATG dhfr clones without Zeocin selection and without HT supplement was 37 (FIG. 2). Overall we observed a decrease in d2EGFP values, which was most severe in the absence of Zeocin, irrespective of the absence of HT supplement.
We performed the same experiment with the TTG Zeo IRES GTG dhfr construct. The average d2EGFP value was 455 (day 1) in 15 TTG Zeo IRES GTG dhfr clones measured in the presence of 400 μg/ml Zeocin (FIG. 3). After the measurement the cells were split and further cultured under the three conditions described above. The d2EGFP values were measured again after 65 days. The average d2EGFP value of the TTG Zeo IRES GTG dhfr clones under Zeocin selection was now 356 (FIG. 3). The average d2EGFP value of the TTG Zeo IRES GTG dhfr clones without Zeocin selection but with HT supplement was 39 (FIG. 3). The average d2EGFP value of the TTG Zeo IRES GTG dhfr clones without Zeocin selection and without HT supplement was 705 (FIG. 2).
In this case, we observed a decrease in d2EGFP values only in the absence of Zeocin but in the presence of HT supplement (condition 2). The d2EGFP values became very high in the absence of both Zeocin and HT supplement (condition 3). This may indicate that, because of the impaired translation frequency of the GTG dhfr mRNA, the expression level of the dhfr protein is low enough to result in very high selection stringency. This selection pressure, in the absence of any toxic substance, is high enough to maintain high protein expression levels over a long period and even appears to improve these expression levels over time.
We tested the TTG Zeo IRES TTG dhfr construct in the same way. The average d2EGFP value was 531 (day 1) in 18 TTG Zeo IRES TTG dhfr clones measured in the presence of 400 μg/ml Zeocin (FIG. 4). After the measurement the cells were split and further cultured under the three conditions described above. The d2EGFP values were measured again after 65 days. The average d2EGFP value of the TTG Zeo IRES TTG dhfr clones under Zeocin selection was now 324 (FIG. 4). The average d2EGFP value of the TTG Zeo IRES TTG dhfr clones without Zeocin selection but with HT supplement was 33 (FIG. 4). The average d2EGFP value of the TTG Zeo IRES TTG dhfr clones without Zeocin selection and without HT supplement was 1124 (FIG. 4).
Again, we observed a decrease in d2EGFP values only in the absence of Zeocin but in the presence of HT supplement (condition 2). In the absence of both Zeocin and HT supplement (condition 3), the d2EGFP values became even higher than with the TTG Zeo IRES GTG dhfr construct. Because TTG variants are more stringent than GTG variants, the TTG dhfr variant is expected to be translated into even less dhfr protein than the GTG dhfr variant. The increased selection pressure of the TTG dhfr variant is high enough to maintain high protein expression levels over a long period in the absence of any toxic substance, and even appears to improve protein expression levels over time.
The data show that coupling a non-ATG start codon variant of the dhfr gene to the d2EGFP gene by means of an IRES results in high levels of d2EGFP expression with high stability in CHO-DG44 cells. This occurs when the medium contains neither Zeocin nor the essential metabolic end products. Preselection with Zeocin by means of the modified TTG Zeo selection marker allowed efficient establishment of colonies with high levels of d2EGFP expression. These high levels of d2EGFP expression can now be maintained, and even improved, simply by changing the medium (removing Zeocin and HT).
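To make the trend in these averages easier to see, the following Python sketch computes the ratio of the average d2EGFP value at day 65 to the average at day 1 for each dhfr start codon variant and culture condition. The numbers are the averages reported in this example; the dictionary layout and names are illustrative assumptions, and a ratio of averages is only a crude summary, not a per-clone analysis.

```python
# Average d2EGFP values reported in this example (arbitrary units).
day1 = {"ATG": 341, "GTG": 455, "TTG": 531}
day65 = {
    # condition 1: +Zeocin +HT, 2: -Zeocin +HT, 3: -Zeocin -HT
    "ATG": {1: 159, 2: 20, 3: 37},
    "GTG": {1: 356, 2: 39, 3: 705},
    "TTG": {1: 324, 2: 33, 3: 1124},
}

# Ratio of the day-65 average to the day-1 average, per variant and condition.
fold_change = {
    variant: {cond: value / day1[variant] for cond, value in by_cond.items()}
    for variant, by_cond in day65.items()
}

for variant, by_cond in fold_change.items():
    for cond, ratio in by_cond.items():
        print(f"{variant} dhfr, condition {cond}: {ratio:.2f}x of day-1 average")
```

A ratio above 1 indicates that the average d2EGFP value rose over the 65 days; with the reported averages this occurs only for the GTG and TTG dhfr variants under condition 3.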
Example 3: The increased expression obtained with a modified dhfr gene placed behind an IRES sequence is not the result of gene amplification
In the prior art, the use of the dhfr gene as a selectable marker generally relies on the amplification of the dhfr gene. One toxic agent, methotrexate, is used in this system to amplify the dhfr gene, accompanied by the desired transgene, of which up to several thousand copies can be integrated into the CHO cell genome following such amplification. While these high copy numbers produce high expression levels, they are also considered a disadvantage because so many copies may cause increased genomic instability, and subsequent removal of methotrexate from the culture medium results in rapid removal of many of the amplified loci.
In example 2, methotrexate was not used to inhibit dhfr enzyme activity. Only the hypoxanthine and thymidine precursors were removed from the culture medium, which was sufficient to obtain stability of protein expression and even to increase the expression level. We therefore examined whether the use of the dhfr enzyme in our setting leads to gene amplification.
Results
We isolated DNA from the clones described in example 2 on the same day as the d2EGFP values were measured (day 65). We used this DNA to determine the d2EGFP copy number.
The average d2EGFP copy number of the TTG Zeo IRES ATG dhfr clones under Zeocin selection was 86 (condition 1) (FIG. 5). The average d2EGFP copy number of the TTG Zeo IRES ATG dhfr clones without Zeocin selection but with HT supplement was 53 (condition 2) (FIG. 5). The average d2EGFP copy number of the TTG Zeo IRES ATG dhfr clones without Zeocin selection and without HT supplement was 59 (condition 3) (FIG. 5).
The average d2EGFP copy number of the TTG Zeo IRES GTG dhfr clones under Zeocin selection was 23 (condition 1) (FIG. 6). The average d2EGFP copy number of the TTG Zeo IRES GTG dhfr clones without Zeocin selection but with HT supplement was 14 (condition 2) (FIG. 6). The average d2EGFP copy number of the TTG Zeo IRES GTG dhfr clones without Zeocin selection and without HT supplement was 37 (condition 3) (FIG. 6).
The average d2EGFP copy number of the TTG Zeo IRES TTG dhfr clones under Zeocin selection was 33 (condition 1) (FIG. 7). The average d2EGFP copy number of the TTG Zeo IRES TTG dhfr clones without Zeocin selection but with HT supplement was 26 (condition 2) (FIG. 7). The average d2EGFP copy number of the TTG Zeo IRES TTG dhfr clones without Zeocin selection and without HT supplement was 32 (condition 3) (FIG. 7).
In none of the cases did we observe a significant increase in d2EGFP copy number after removal of the HT supplement, the condition that resulted in increased d2EGFP values for the GTG dhfr and TTG dhfr variants. The fact that the d2EGFP values remained stable over time, and even increased significantly with these two constructs, is certainly due to the action of the dhfr protein. However, no increase in d2EGFP copy number was observed in the TTG Zeo TTG dhfr clones, and only a slight increase was observed in the TTG Zeo GTG dhfr clones. Interestingly, the overall d2EGFP copy number in the lowest producers, the TTG Zeo ATG dhfr clones, was higher than in both other variants, while these clones did not maintain their original high d2EGFP fluorescence values (see example 2). We conclude from these data that the generally known gene amplification observed when the dhfr protein is used in combination with methotrexate plays no role in the maintenance of stable d2EGFP expression levels over time or in the observed increase in these expression levels. Rather, with both the GTG and TTG dhfr variants, it appears that more d2EGFP protein is expressed per d2EGFP gene copy.
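The per-copy comparison can be made explicit with a small Python sketch that divides the average d2EGFP value at day 65 (example 2) by the average d2EGFP copy number reported above, for each variant and condition. The data layout and names are illustrative assumptions, and dividing one average by another is only a rough indicator; it is not the same as averaging per-clone ratios, which are not given in the text.

```python
# Average d2EGFP values at day 65 (example 2, arbitrary units).
d2egfp_day65 = {
    "ATG": {1: 159, 2: 20, 3: 37},
    "GTG": {1: 356, 2: 39, 3: 705},
    "TTG": {1: 324, 2: 33, 3: 1124},
}
# Average d2EGFP copy numbers at day 65 (this example).
copy_number = {
    "ATG": {1: 86, 2: 53, 3: 59},
    "GTG": {1: 23, 2: 14, 3: 37},
    "TTG": {1: 33, 2: 26, 3: 32},
}

# Rough estimate of d2EGFP signal per gene copy (ratio of the two averages).
per_copy = {
    variant: {
        cond: d2egfp_day65[variant][cond] / copy_number[variant][cond]
        for cond in d2egfp_day65[variant]
    }
    for variant in d2egfp_day65
}

for variant, by_cond in per_copy.items():
    for cond, value in by_cond.items():
        print(f"{variant} dhfr, condition {cond}: {value:.1f} units per copy")
```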
We further analyzed the d2EGFP mRNA levels of the different clones under the different conditions described above and found that these mRNA levels generally followed the d2EGFP fluorescence values. We therefore conclude that the increase in d2EGFP fluorescence values was due to an increase in mRNA levels, rather than to a change in translation efficiency.
References
Kaufman, RJ. (2000) Overview of vector design for mammalian gene expression. Mol Biotechnol 16, 151-160.
Kozak, M. (1986) Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44, 283-292.
Kozak, M. (1987) An analysis of 5′-noncoding sequences from 699 vertebrate messenger RNAs. Nucleic Acids Res 15, 8125-8148.
Kozak, M. (1989) Context effects and inefficient initiation at non-AUG codons in eucaryotic cell-free translation systems. Mol Cell Biol 9, 5073-5080.
Kozak, M. (1990) Downstream secondary structure facilitates recognition of initiator codons by eukaryotic ribosomes. Proc Natl Acad Sci USA 87, 8301-8305.
Kozak, M. (1997) Recognition of AUG and alternative initiator codons is augmented by G in position +4 but is not generally affected by the nucleotides in positions +5 and +6. EMBO J 16, 2482-2492.
Kozak, M. (2002) Pushing the limits of the scanning mechanism for initiation of translation. Gene 299, 1-34.
Lopez de Quinto, S, and Martinez-Salas, E. (1998) Parameters influencing translational efficiency in aphthovirus IRES-based bicistronic expression vectors. Gene 217, 51-6.
Martinez-Salas, E. (1999) Internal ribosome entry site biology and its use in expression vectors. Curr Opin Biotechnol 10, 458-64.
McBurney, MW, Mai, T, Yang, X, and Jardine, K. (2002) Evidence for repeat-induced gene silencing in cultured Mammalian cells: inactivation of tandem repeats of transfected genes. Exp Cell Res 274, 1-8.
Mizuguchi, H, Xu, Z, Ishii-Watabe, A, Uchida, E, and Hayakawa, T. (2000) IRES-dependent second gene expression is significantly lower than cap-dependent first gene expression in a bicistronic vector. Mol Ther 1, 376-82.
Rees, S, Coote, J, Stables, J, Goodson, S, Harris, S, and Lee, MG. (1996) Bicistronic vector for the creation of stable mammalian cell lines that predisposes all antibiotic-resistant cells to express recombinant protein. Biotechniques 20, 102-104, 106, 108-110.
Urlaub, G, and Chasin, LA. (1980) Isolation of Chinese hamster cell mutants deficient in dihydrofolate reductase activity. Proc Natl Acad Sci USA 77, 4216-20.
Venkatesan, A, and Dasgupta, A. (2001) Novel fluorescence-based screen to identify small synthetic internal ribosome entry site elements. Mol Cell Biol 21, 2826-37.
Whitelaw, E, Sutherland, H, Kearns, M, Morgan, H, Weaving, L, and Garrick, D. (2001) Epigenetic effects on transgene expression. Methods Mol Biol 158, 351-68.
Sequence listing
<110> ChromaGenics B.V.
<120> Selection of host cells expressing protein at high levels
<130>0117 A WO 01 ORD
<150>US11/359,953
<151>2006-02-21
<150>US11/269,525
<151>2005-11-07
<150>US60/626,301
<151>2004-11-08
<150>US60/696,610
<151>2005-07-05
<150>EP04105593.0
<151>2004-11-08
<160>82
<170>PatentIn version 3.3
<210>1
<211>749
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR1
<400>1
atgcggtggg ggcgcgccag agactcgtgg gatccttggc ttggatgttt ggatctttct 60
gagttgcctg tgccgcgaaa gacaggtaca tttctgatta ggcctgtgaa gcctcctgga 120
ggaccatctc attaagacga tggtattgga gggagagtca cagaaagaac tgtggcccct 180
ccctcactgc aaaacggaag tgattttatt ttaatgggag ttggaatatg tgagggctgc 240
aggaaccagt ctccctcctt cttggttgga aaagctgggg ctggcctcag agacaggttt 300
tttggccccg ctgggctggg cagtctagtc gaccctttgt agactgtgca cacccctaga 360
agagcaacta cccctataca ccaggctggc tcaagtgaaa ggggctctgg gctccagtct 420
ggaaaatctg gtgtcctggg gacctctggt cttgcttctc tcctcccctg cactggctct 480
gggtgcttat ctctgcagaa gcttctcgct agcaaaccca cattcagcgc cctgtagctg 540
aacacagcac aaaaagccct agagatcaaa agcattagta tgggcagttg agcgggaggt 600
gaatatttaa cgcttttgtt catcaataac tcgttggctt tgacctgtct gaacaagtcg 660
agcaataagg tgaaatgcag gtcacagcgt ctaacaaata tgaaaatgtg tatattcacc 720
ccggtctcca gccggcgcgc caggctccc 749
<210>2
<211>883
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR2
<400>2
gggtgcttcc tgaattcttc cctgagaagg atggtggccg gtaaggtccg tgtaggtggg 60
gtgcggctcc ccaggccccg gcccgtggtg gtggccgctg cccagcggcc cggcaccccc 120
atagtccatg gcgcccgagg cagcgtgggg gaggtgagtt agaccaaaga gggctggccc 180
ggagttgctc atgggctcca catagctgcc ccccacgaag acggggcttc cctgtatgtg 240
tggggtccca tagctgccgt tgccctgcag gccatgagcg tgcgggtcat agtcgggggt 300
gccccctgcg cccgcccctg ccgccgtgta gcgcttctgt gggggtggcg ggggtgcgca 360
gctgggcagg gacgcagggt aggaggcggg gggcagcccg taggtaccct gggggggctt 420
ggagaagggc gggggcgact ggggctcata cgggacgctg ttgaccagcg aatgcataga 480
gttcagatag ccaccggctc cggggggcac ggggctgcga cttggagact ggccccccga 540
tgacgttagc atgcccttgc ccttctgatc ctttttgtac ttcatgcggc gattctggaa 600
ccagatcttg atctggcgct cagtgaggtt cagcagattg gccatctcca cccggcgcgg 660
ccggcacagg tagcggttga agtggaactc tttctccagc tccaccagct gcgcgctcgt 720
gtaggccgtg cgcgcgcgct tggacgaagc ctgccccggc gggctcttgt cgccagcgca 780
gctttcgcct gcgaggacag agagaggaag agcggcgtca ggggctgccg cggccccgcc 840
cagcccctga cccagcccgg cccctccttc caccaggccc caa 883
<210>3
<211>2126
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR3
<400>3
atctcgagta ctgaaatagg agtaaatctg aagagcaaat aagatgagcc agaaaaccat 60
gaaaagaaca gggactacca gttgattcca caaggacatt cccaaggtga gaaggccata 120
tacctccact acctgaacca attctctgta tgcagattta gcaaggttat aaggtagcaa 180
aagattagac ccaagaaaat agagaacttc caatccagta aaaatcatag caaatttatt 240
gatgataaca attgtctcca aaggaacaag gcagagtcgt gctagcagag gaagcacgtg 300
agctgaaaac agccaaatct gctttgtttt catgacacag gagcataaag tacacaccac 360
caactgacct attaaggctg tggtaaaccg attcatagag agaggttcta aatacattgg 420
tccctcacag gcaaactgca gttcgctccg aacgtagtcc ctggaaattt gatgtccagt 480
atagaaaagc agagcagtca aaaaatatag ataaagctga accagatgtt gcctgggcaa 540
tgttagcagc accacactta agatataacc tcaggctgtg gactccctcc ctggggagcg 600
gtgctgccgg cggcgggcgg gctccgcaac tccccggctc tctcgcccgc cctcccgttc 660
tcctcgggcg gcggcggggg ccgggactgc gccgctcaca gcggcggctc ttctgcgccc 720
ggcctcggag gcagtggcgg tggcggccat ggcctcctgc gttcgccgat gtcagcattt 780
cgaactgagg gtcatctcct tgggactggt tagacagtgg gtgcagccca cggagggcga 840
gttgaagcag ggtggggtgt cacctccccc aggaagtcca gtgggtcagg gaactccctc 900
ccctagccaa gggaggccgt gagggactgt gcccggtgag agactgtgcc ctgaggaaag 960
gtgcactctg gcccagatac tacacttttc ccacggtctt caaaacccgc agaccaggag 1020
attccctcgg gttcctacac caccaggacc ctgggtttca accacaaaac cgggccattt 1080
gggcagacac ccagctagct gcaagagttg tttttttttt tatactcctg tggcacctgg 1140
aacgccagcg agagagcacc tttcactccc ctggaaaggg ggctgaaggc agggaccttt 1200
agctgcgggc tagggggttt ggggttgagt gggggagggg agagggaaaa ggcctcgtca 1260
ttggcgtcgt ctgcagccaa taaggctacg ctcctctgct gcgagtagac ccaatccttt 1320
cctagaggtg gagggggcgg gtaggtggaa gtagaggtgg cgcggtatct aggagagaga 1380
aaaagggctg gaccaatagg tgcccggaag aggcggaccc agcggtctgt tgattggtat 1440
tggcagtgga ccctcccccg gggtggtgcc ggaggggggg atgatgggtc gaggggtgtg 1500
tttatgtgga agcgagatga ccggcaggaa cctgccccaa tgggctgcag agtggttagt 1560
gagtgggtga cagacagacc cgtaggccaa cgggtggcct taagtgtctt tggtctcctc 1620
caatggagca gcggcggggc gggaccgcga ctcgggttta atgagactcc attgggctgt 1680
aatcagtgtc atgtcggatt catgtcaacg acaacaacag ggggacacaa aatggcggcg 1740
gcttagtcct acccctggcg gcggcggcag cggtggcgga ggcgacggca ctcctccagg 1800
cggcagccgc agtttctcag gcagcggcag cgcccccggc aggcgcggtg gcggtggcgc 1860
gcagccaggt ctgtcaccca ccccgcgcgt tcccaggggg aggagactgg gcgggagggg 1920
ggaacagacg gggggggatt caggggcttg cgacgcccct cccacaggcc tctgcgcgag 1980
ggtcaccgcg gggccgctcg gggtcaggct gcccctgagc gtgacggtag ggggcggggg 2040
aaaggggagg agggacaggc cccgcccctc ggcagggcct ctagggcaag ggggcggggc 2100
tcgaggagcg gaggggggcg gggcgg 2126
<210>4
<211>1625
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR4
<400>4
gatctgagtc atgttttaag gggaggattc ttttggctgc tgagttgaga ttaggttgag 60
ggtagtgaag gtaaaggcag tgagaccacg taggggtcat tgcagtaatc caggctggag 120
atgatggtgg ttcagttgga atagcagtgc atgtgctgta acaacctcag ctgggaagca 180
gtatatgtgg cgttatgacc tcagctggaa cagcaatgca tgtggtggtg taatgacccc 240
agctgggtag ggtgcatgtg gtgtaacgac ctcagctggg tagcagtgtg tgtgatgtaa 300
caacctcagc tgggtagcag tgtacttgat aaaatgttgg catactctag atttgttatg 360
agggtagtgc cattaaattt ctccacaaat tggttgtcac gtatgagtga aaagaggaag 420
tgatggaaga cttcagtgct tttggcctga ataaatagaa gacgtcattt ccagttaatg 480
gagacaggga agactaaagg tagggtggga ttcagtagag caggtgttca gttttgaata 540
tgatgaactc tgagagagga aaaacttttt ctacctctta gtttttgtga ctggacttaa 600
gaattaaagt gacataagac agagtaacaa gacaaaaata tgcgaggtta tttaatattt 660
ttacttgcag aggggaatct tcaaaagaaa aatgaagacc caaagaagcc attagggtca 720
aaagctcata tgccttttta agtagaaaat gataaatttt aacaatgtga gaagacaaag 780
gtgtttgagc tgagggcaat aaattgtggg acagtgatta agaaatatat gggggaaatg 840
aaatgataag ttattttagt agatttattc ttcatatcta ttttggcttc aacttccagt 900
ctctagtgat aagaatgttc ttctcttcct ggtacagaga gagcaccttt ctcatgggaa 960
attttatgac cttgctgtaa gtagaaaggg gaagatcgat ctcctgtttc ccagcatcag 1020
gatgcaaaca tttccctcca ttccagttct caaccccatg gctgggcctc atggcattcc 1080
agcatcgcta tgagtgcacc tttcctgcag gctgcctcgg gtagctggtg cactgctagg 1140
tcagtctatg tgaccaggag ctgggcctct gggcaatgcc agttggcagc ccccatccct 1200
ccactgctgg gggcctccta tccagaaggg cttggtgtgc agaacgatgg tgcaccatca 1260
tcattcccca cttgccatct ttcaggggac agccagctgc tttgggcgcg gcaaaaaaca 1320
cccaactcac tcctcttcag gggcctctgg tctgatgcca ccacaggaca tccttgagtg 1380
ctgggcagtc tgaggacagg gaaggagtga tgaccacaaa acaggaatgg cagcagcagt 1440
gacaggagga agtcaaaggc ttgtgtgtcc tggccctgct gagggctggc gagggccctg 1500
ggatggcgct cagtgcctgg tcggctgcaa gaggccagcc ctctgcccat gaggggagct 1560
ggcagtgacc aagctgcact gccctggtgg tgcatttcct gccccactct ttccttctaa 1620
gatcc 1625
<210>5
<211>1571
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR5
<400>5
cacctgattt aaatgatctg tctggtgagc tcactgggtc tttactcgca tgctgggtcc 60
acagctccac tgtcctgcag ggtccgtgag tgtgggcccc ttatctattt catcatcata 120
accctgcgtg tcctcaactc ctggcacata ttgggtggcc ccatccacac acggttgttg 180
agtgaatcca tgagatgaca aaggctatga tgtagactat atcatgagcc agaaccaggc 240
tttcctacct ccagacaatc aagggccttg atttgggatt gagggagaaa ggagtagaag 300
ccaggaagga gaagagattg aggtttacca agggtgcaaa gtcctggccc ctgactgtag 360
gctgaaaact atagaaatga tagaacaatt ttgcaatgaa atgcagaaga ccctgcatca 420
actttaggtg ggacttcggg tatttttatg gccacagaac atcctcccat ttacctgcat 480
ggcccagaca cagacttcaa aacagttgag gccagcaggc tccaggtaag tggtaggatt 540
ccagaatgcc ctcagagtgt tgtgggaggc agcaggcgat tttcctggac ttctgagttt 600
atgagaaccc caaaccccaa ttggcattaa cattgaggtc tcaatgtatc atggcaggaa 660
gcttccgagt ggtgaaaagg aaagtgaaca tcaaagctcg gaagacaaga gggtggagtg 720
atggcaacca agagcaagac ccttccctct cctgtgatgg ggtggctcta tgtgaagccc 780
ccaaactgga cacaggtctg gcagaatgag gaacccactg agatttagcg ccaacatcca 840
gcataaaagg gagactgaca tagaatttga gttagttaaa aataaggcac aatgcttttc 900
atgtattcct gagttttgtg gactggtgtt caatttgcag cattcttagt tgattaaatc 960
tgagatgaag aaagagtgtc caacactttc accttggaaa gctctggaaa agcaaaaggg 1020
agagacaatt agcttcatcc attaactcac ttagtcatta tgcattcatt catgtaacta 1080
ccaaacacgt actgagtgcc taacactcct gagacactga gaagtttctt gggaatacaa 1140
agatgaataa aaaccacgcc aggcaggagt tggaggaagg ttctggatgc caccacgctc 1200
tacctcctgg ctggacacca ggcaatgttg gtaaccttct gcctccaatt tctgcaaata 1260
cataattaat aaacacaagg ttatcttcta aacagttctt aaaatgagtc aactttgttt 1320
aaacttgttc tttttagaga aaaatgtatt tttgaaagag ttggttagtg ctaggggaaa 1380
tgtctgggca cagctcagtc tggtgtgaga gcaggaagca gctctgtgtg tctggggtgg 1440
gtacgtatgt aggacctgtg ggagaccagg ttgggggaag gcccctcctc atcaagggct 1500
cctttgcttt ggtttgcttt ggcgtgggag gtgctgtgcc acaagggaat acgggaaata 1560
agatctctgc t 1571
<210>6
<211>1173
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR6
<400>6
tgacccacca cagacatccc ctctggcctc ctgagtggtt tcttcagcac agcttccaga 60
gccaaattaa acgttcactc tatgtctata gacaaaaagg gttttgacta aactctgtgt 120
tttagagagg gagttaaatg ctgttaactt tttaggggtg ggcgagaggg atgacaaata 180
acaacttgtc tgaatgtttt acatttctcc ccactgcctc aagaaggttc acaacgaggt 240
catccatgat aaggagtaag acctcccagc cggactgtcc ctcggccccc agaggacact 300
ccacagagat atgctaactg gacttggaga ctggctcaca ctccagagaa aagcatggag 360
cacgagcgca cagagcaggg ccaaggtccc agggacagaa tgtctaggag ggagattggg 420
gtgagggtaa tctgatgcaa ttactgtggc agctcaacat tcaagggagg gggaagaaag 480
aaacagtccc tgtcaagtaa gttgtgcagc agagatggta agctccaaaa tttgaaactt 540
tggctgctgg aaagttttag ggggcagaga taagaagaca taagagactt tgagggttta 600
ctacacacta gacgctctat gcatttattt atttattatc tcttatttat tactttgtat 660
aactcttata ataatcttat gaaaacggaa accctcatat acccatttta cagatgagaa 720
aagtgacaat tttgagagca tagctaagaa tagctagtaa gtaaaggagc tgggacctaa 780
accaaaccct atctcaccag agtacacact cttttttttt ttccagtgta atttttttta 840
atttttattt tactttaagt tctgggatac atgtgcagaa ggtatggttt gttacatagg 900
tatatgtgtg ccatagtgga ttgctgcacc tatcaacccg tcatctaggt ttaagcccca 960
catgcattag ctatttgtcc tgatgctctc cctcccctcc ccacaccaga caggccttgg 1020
tgtgtgatgt tcccctccct gtgtccatgt gttctcactg ttcagctccc acttatgagt 1080
gagaacgtgt ggtatttggt tttctgttcc tgtgttagtt tgctgaggat gatggcttcc 1140
agcttcatcc atgtccctgc aaaggacacg atc 1173
<210>7
<211>2101
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR7
<400>7
aggtgggtgg atcacccgag gtcaggagtt caagaccagc ctggccaaca tggtaaaacc 60
tcgtctctac taaaaaatac gaaaaattag ctggttgtgg tggtgcgtgc ttgtaatccc 120
agctactcgg gaggctgagg caggagaatc acttgaatct gggaggcaga ggttgcagtg 180
agctgagata gtgccattgc actccagcct gggcaacaga cggagactct gtctccaaaa 240
aaaaaaaaaa aaatcttaga ggacaagaat ggctctctca aacttttgaa gaaagaataa 300
ataaattatg cagttctaga agaagtaatg gggatatagg tgcagctcat gatgaggaag 360
acttagctta actttcataa tgcatctgtc tggcctaaga cgtggtgagc tttttatgtc 420
tgaaaacatt ccaatataga atgataataa taatcacttc tgacccccct tttttttcct 480
ctccctagac tgtgaagcag aaaccccata tttttcttag ggaagtggct acgcactttg 540
tatttatatt aacaactacc ttatcaggaa attcatattg ttgccctttt atggatgggg 600
aaactggaca agtgacagag caaaatccaa acacagctgg ggatttccct cttttagatg 660
atgattttaa aagaatgctg ccagagagat tcttgcagtg ttggaggaca tatatgacct 720
ttaagatatt ttccagctca gagatgctat gaatgtatcc tgagtgcatg gatggacctc 780
agttttgcag attctgtagc ttatacaatt tggtggtttt ctttagaaga aaataacaca 840
tttataaata ttaaaatagg cccaagacct tacaagggca ttcatacaaa tgagaggctc 900
tgaagtttga gtttgttcac tttctagtta attatctcct gcctgtttgt cataaatgcg 960
tttagtaggg agctgctaat gacaggttcc tccaacagag tgtggaagaa ggagatgaca 1020
gctggcttcc cctctgggac agcctcagag ctagtgggga aactatgtta gcagagtgat 1080
gcagtgacca agaaaatagc actaggagaa agctggtcca tgagcagctg gtgagaaaag 1140
gggtggtaat catgtatgcc ctttcctgtt ttatttttta ttgggtttcc ttttgcctct 1200
caattccttc tgacaataca aaatgttggt tggaacatgg agcacctgga agtctggttc 1260
attttctctc agtctcttga tgttctctcg ggttcactgc ctattgttct cagttctaca 1320
cttgagcaat ctcctcaata gctaaagctt ccacaatgca gattttgtga tgacaaattc 1380
agcatcaccc agcagaactt aggttttttt ctgtcctccg tttcctgacc tttttcttct 1440
gagtgcttta tgtcacctcg tgaaccatcc tttccttagt catctaccta gcagtcctga 1500
ttcttttgac ttgtctccct acaccacaat aaatcactaa ttactatgga ttcaatccct 1560
aaaatttgca caaacttgca aatagattac gggttgaaac ttagagattt caaacttgag 1620
aaaaaagttt aaatcaagaa aaatgacctt taccttgaga gtagaggcaa tgtcatttcc 1680
aggaataatt ataataatat tgtgtttaat atttgtatgt aacatttgaa taccttcaat 1740
gttcttattt gtgttatttt aatctcttga tgttactaac tcatttggta gggaagaaaa 1800
catgctaaaa taggcatgag tgtcttatta aatgtgacaa gtgaatagat ggcagaaggt 1860
ggattcatat tcagttttcc atcaccctgg aaatcatgcg gagatgattt ctgcttgcaa 1920
ataaaactaa cccaatgagg ggaacagctg ttcttaggtg aaaacaaaac aaacacgcca 1980
aaaaccttta ttctctttat tatgaatcaa atttttcctc tcagataatt gttttattta 2040
tttattttta ttattattgt tattatgtcc agtctcactc tgtcgcctaa gctggcatga 2100
t 2101
<210>8
<211>1821
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR8
<400>8
gagatcacct cgaagagagt ctaacgtccg taggaacgct ctcgggttca caaggattga 60
ccgaacccca ggatacgtcg ctctccatct gaggcttgct ccaaatggcc ctccactatt 120
ccaggcacgt gggtgtctcc cctaactctc cctgctctcc tgagcccatg ctgcctatca 180
cccatcggtg caggtccttt ctgaagagct cgggtggatt ctctccatcc cacttccttt 240
cccaagaaag aagccaccgt tccaagacac ccaatgggac attccccttc cacctccttc 300
tccaaagttg cccaggtgtt catcacaggt tagggagaga agcccccagg tttcagttac 360
aaggcatagg acgctggcat gaacacacac acacacacac acacacacac acacacacac 420
acacgactcg aagaggtagc cacaagggtc attaaacact tgacgactgt tttccaaaaa 480
cgtggatgca gttcatccac gccaaagcca agggtgcaaa gcaaacacgg aatggtggag 540
agattccaga ggctcaccaa accctctcag gaatattttc ctgaccctgg gggcagaggt 600
tggaaacatt gaggacattt cttgggacac acggagaagc tgaccgacca ggcattttcc 660
tttccactgc aaatgaccta tggcgggggc atttcacttt cccctgcaaa tcacctatgg 720
cgaggtacct ccccaagccc ccacccccac ttccgcgaat cggcatggct cggcctctat 780
ccgggtgtca ctccaggtag gcttctcaac gctctcggct caaagaagga caatcacagg 840
tccaagccca aagcccacac ctcttccttt tgttataccc acagaagtta gagaaaacgc 900
cacactttga gacaaattaa gagtccttta tttaagccgg cggccaaaga gatggctaac 960
gctcaaaatt ctctgggccc cgaggaaggg gcttgactaa cttctatacc ttggtttagg 1020
aaggggaggg gaactcaaat gcggtaattc tacagaagta aaaacatgca ggaatcaaaa 1080
gaagcaaatg gttatagaga gataaacagt tttaaaaggc aaatggttac aaaaggcaac 1140
ggtaccaggt gcggggctct aaatccttca tgacacttag atataggtgc tatgctggac 1200
acgaactcaa ggctttatgt tgttatctct tcgagaaaaa tcctgggaac ttcatgcact 1260
gtttgtgcca gtatcttatc agttgattgg gctcccttga aatgctgagt atctgcttac 1320
acaggtcaac tccttgcgga agggggttgg gtaaggagcc cttcgtgtct cgtaaattaa 1380
ggggtcgatt ggagtttgtc cagcattccc agctacagag agccttattt acatgagaag 1440
caaggctagg tgattaaaga gaccaacagg gaagattcaa agtagcgact tagagtaaaa 1500
acaaggttag gcatttcact ttcccagaga acgcgcaaac attcaatggg agagaggtcc 1560
cgagtcgtca aagtcccaga tgtggcgagc ccccgggagg aaaaaccgtg tcttccttag 1620
gatgcccgga acaagagcta ggcttccgga gctaggcagc catctatgtc cgtgagccgg 1680
cgggagggag accgccggga ggcgaagtgg ggcggggcca tccttctttc tgctctgctg 1740
ctgccgggga gctcctggct ggcgtccaag cggcaggagg ccgccgtcct gcagggcgcc 1800
gtagagtttg cggtgcagag t 1821
<210>9
<211>1929
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR9
<400>9
cacttcctgg gagtggagca gaggctctgc gtggagcatc catgtgcagt actcttaggt 60
acggaaggga ttgggctaaa ccatggatgg gagctgggaa gggaagggac caacttcagg 120
ccccactggg acactggagc tgccaccctt tagagccctc ctaaccctac accagaggct 180
gagggggacc tcagacatca cacacatgct ttcccatgtt ttcagaaatc tggaaacgta 240
gaacttcagg ggtgagagtg cctagatatt gaatacaagg ctagattggg cttctgtaat 300
atcccaaagg accctccagc tttttcacca gcacctaatg cccatcagat accaaagaca 360
cagcttagga gaggttcacc ctgaagctga ggaggaggca gccggattag agttgactga 420
gcaaggatga ctgccttctc cacctgacga tttcagctgc tgcccttttc ttttcctggg 480
aatgcctgtc gccatggcct tctgtgtcca caggagagtt tgacccagat actcatggac 540
caggcaaagg tgctgttcct cccagcccag ggcccaccat gaagcatgcc tgggagcctg 600
gtaaggaccc agccactcct gggctgttga cattggcttc tcttgcccag cattgtagcc 660
acgccactgc attgtactgt gagataagtc aaggtgggct caccaggacc tgcactaaat 720
tgtgaaattc agctccaaag aactttggaa attacccatg catttaagca aaatgaatga 780
tacctgagca aaccctttca cattggcaca agttacaatc ctgtctcatc ctcttgatta 840
caaattccat ccaggcaaga gctgtatcac cctgaggtct ccccattcat gttttggtca 900
ataatattta gtttcctttt gaaaatagat ttttgtgtta ctccattatg atgggcagag 960
gccagatgct tatattctat ttaaatgact atgtttttct atctgtaact gggtttgtgt 1020
tcaggtggta aatgcttttt ttttgcagtc agaagattcc tggaaggcga ccagaaatta 1080
gctggccgct gtcagacctg aagttacttc taaagggcct ttagaaatga attctttttt 1140
atgccttctc tgaattctga gaagtaggct tgacttcccc taagtgtgga gttgggagtc 1200
aactcttctg aaaagaaagt ttcagagcat tttccaaagc catggtcagc tgtgggaagg 1260
gaagacgatg gatagtacag ttgccggaaa acactgatgg aggcggatgc tccagctcag 1320
ccaaagacct ttgttctgcc caccccagaa atgccccttc ctcaatcgca gaaacgttgc 1380
cccatggctc ctgatactca gaatgcagcc tctgaccagg accatctgca tcctccagga 1440
gctcgtaaga aatgcagcat cgtgggacct gctggcacct ggtgaaccca aacctgcagg 1500
gctcctgggt gtgcttgggg cggctgcagg ggaagaggga gtcagcagcc tcctcctgac 1560
cttcccgggg gctgcttttc tgaggggcca gaatgcaccg gttgaccttg ttgcatcact 1620
ggcccatgac tggctgcttt ggtcaggtgt aaaaaggtgt ttccagaggg tctgctcctc 1680
tcactatcgg accaggtttc catggagagc tcagcctccc agcaaggata gagaacttca 1740
aatggctcaa agaactgaga ggccacacat gtgtgacctg aatagtctct gctgcaaaac 1800
aaagggtttc ttaatgtaaa acgttctctt cctcacagag gggttcccag ctgctagtgg 1860
gcatgttgca ggcatttcct gggctgcatc aggttgtcat aagccagagg atcatttttg 1920
ggggctcat 1929
<210>10
<211>1167
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR10
<220>
<221>misc_feature
<222>(452)..(1143)
<223>n is a, c, g, or t at various positions
<400>10
aggtcaggag ttcaagacca gcctggccaa catggtgaaa ccctgtccct acaaaaaata 60
caaaaattag ccgggcgtgg tggggggcgc ctataatccc agctactcag gatgctgaga 120
caggagaatt gtttgaaccc gggaggtgga ggttgcagtg aactgagatc gcgccactgc 180
actccagcct ggtgacagag agagactccg tctcaacaac agacaaacaa acaaacaaac 240
aacaacaaaa atgtttactg acagctttat tgagataaaa ttcacatgcc ataaaggtca 300
ccttctacag tatacaattc agtggattta gtatgttcac aaagttgtac gttgttcacc 360
atctactcca gaacatttac atcaccccta aaagaagctc tttagcagtc acttctcatt 420
ctccccagcc cctgccaacc acgaatctac tntctgtctc tattctgaat atttcatata 480
aaggagtcct atcatatggg ccttttacgt ctaccttctt tcacttagca tcatgttttt 540
aagattcatc cacagtgtag cacgtgtcag ttaattcatt tcatcttatg gctggataat 600
gctctattgt atgcatatcc ctcactttgc ttatccattc atcaactgat tgacatttgg 660
gttatttcta ctttttgact attatgagta atgctgctat gaacattcct gtaccaatcg 720
ttacgtggac atatgctttc aattctcctg agtatgtaac tagggttgga gttgctgggt 780
catatgttaa ctcagtgttt catttttttg aagaactacc aaatggtttt ccaaagtgga 840
tgcaacactt tacattccca ccagcaagat atgaaggttc caatgtctct acatttttgc 900
caacacttgt gattttcttt tatttattta tttatttatt tatttttgag atggagtctc 960
actctgtcac ccaggctgga gtgcagtggc acaatttcag ctcactgcaa tctccacctc 1020
tcgggctcaa gcgatactcc tgcctcaacc tcccgagtaa ctgggattac aggcgcccac 1080
caccacacca agctaatttt ttgtattttt agtagagacg gggtttcatc atgtcggcca 1140
ggntgtactc gaactctgac ctcaagt 1167
<210>11
<211>1377
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR11
<400>11
aggatcactt gagcccagga gttcaagacc agcctgggca acatagcgag aacatgtctc 60
aaaaaggaaa aaaatggggg aaaaaaccct cccagggaca gatatccaca gccagtcttg 120
ataagctcca tcattttaaa gtgcaaggcg gtgcctccca tgtggatgat tatttaatcc 180
tcttgtactt tgtttagtcc tttgtggaaa tgcccatctt ataaattaat agaattctag 240
aatctaatta aaatggttca actctacatt ttactttagg ataatatcag gaccatcaca 300
gaatgtctga gatgtggatt taccctatct gtagctcact tcttcaacca ttcttttagc 360
aaggctagtt atcttcagtg acaacccctt gctgccctct actatctcct ccctcagatg 420
gactactctg attaagcttg agctagaata agcatgttat cccgggattt catatggaat 480
attttataca tgagtgagcc attatgagtt gtttgaaaat ttattatgtt gagggagggt 540
aaccgctgta acaaccatca ccaaatctaa tcgactgaat acatttgacg tttatttctt 600
gttcacctga cagttcagtg ttacctaaat ttacatgaag acccagaggc ccacgctcct 660
tcattttggg ctccaccgac ctccaaggtt tcagggccct ctgccccgcc ttctgcaccc 720
acaggggaag agagtggagg atgcacacgc ccaggcctgg aagtgacgca tgtggcttcc 780
ccgtccacag acttcaccca cagtccattg gccttcttaa gtcatggact cctgctgagc 840
tgccagggtg catgggaaat ccatgtgact gtgtgccctg gaggaagggg agcgtttcgg 900
tgagcacaca ggagtctttg ccactagacg ctgatgagga ttccccacag gcgatgaagc 960
atggagactc atcttgtaac aaacagatga gttgttgaca tctcttaagt ttactttgtg 1020
tgcagttttt attcagatag gaaaggctgt taaaatctta acacctaact ggaagaaggg 1080
ttttagagaa gtgtggtttt cagtaagcca gttctttcca caatccaaga aacgaaataa 1140
atttccagca tggagcagtt ggcaggtaag gtttttgttg tggtctcgcc caggcttgag 1200
tgtaaccggt gtggtcatag ctcactacat tctcaaactc ctggccttaa gtcatcctcc 1260
tgcctcagcc tcccaaaggc aagtaaggtt aagaataggg gaaaggtgaa gtttcacagc 1320
ttttctagaa ttctttttat tcaagggact ctcagatcat caaacccacc cagaatc 1377
<210>12
<211>1051
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR12
<400>12
atcctgcttc tgggaagaga gtggcctccc ttgtgcaggt gactttggca ggaccagcag 60
aaacccaggt ttcctgtcag gaggaagtgc tcagcttatc tctgtgaagg gtcgtgataa 120
ggcacgagga ggcaggggct tgccaggatg ttgcctttct gtgccatatg ggacatctca 180
gcttacgttg ttaagaaata tttggcaaga agatgcacac agaatttctg taacgaatag 240
gatggagttt taagggttac tacgaaaaaa agaaaactac tggagaagag ggaagccaaa 300
caccaccaag tttgaaatcg attttattgg acgaatgtct cactttaaat ttaaatggag 360
tccaacttcc ttttctcacc cagacgtcga gaaggtggca ttcaaaatgt ttacacttgt 420
ttcatctgcc tttttgctaa gtcctggtcc cctacctcct ttccctcact tcacatttgt 480
cgtttcatcg cacacatatg ctcatcttta tatttacata tatataattt ttatatatgg 540
cttgtgaaat atgccagacg agggatgaaa tagtcctgaa aacagctgga aaattatgca 600
acagtgggga gattgggcac atgtacattc tgtactgcaa agttgcacaa cagaccaagt 660
ttgttataag tgaggctggg tggtttttat tttttctcta ggacaacagc ttgcctggtg 720
gagtaggcct cctgcagaag gcattttctt aggagcctca acttccccaa gaagaggaga 780
gggcgagact ggagttgtgc tggcagcaca gagacaaggg ggcacggcag gactgcagcc 840
tgcagagggg ctggagaagc ggaggctggc acccagtggc cagcgaggcc caggtccaag 900
tccagcgagg tcgaggtcta gagtacagca aggccaaggt ccaaggtcag tgagtctaag 960
gtccatggtc agtgaggctg agacccaggg tccaatgagg ccaaggtcca gagtccagta 1020
aggccgagat ccagggtcca gggaggtcaa g 1051
<210>13
<211>1291
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR13
<400>13
agccactgag gtcctaactg cagccaaggg gccgttctgc acatgtcgct caccctctgt 60
gctctgttcc ccacagagca aacgcacatg gcaacgttgg tccgctcagc cactggttct 120
gtggtggaac ggtggatgtc tgcactgtga catcagctga gtaagtaaca acgactgagg 180
atgccgctga cccagggctg gggaagggga ctcccagctc agacaggctt ggctgtggtt 240
tgctttggga ggagagtgaa catcacaggg aatggctcat gtcagcccca ggagggtggg 300
ctggcccctg gtccccgggc tccttctggc cctgcaggcg atagagagcc tcaacctgct 360
gccgcttctc cttggcccgg gtgatggccg tctggaagag cctgcagtag aggtgcacag 420
ccagcggaga gtcgtcattg ccgggtacag ggtaggtgat gaggcagggg ttgcagttgg 480
tgtccacgat gcccactgtg gggatgttca tcttggctgc gtctctcacg gccacgtgtg 540
gctcaaagat gttgttgagc gtgtgcagga agatgatgag gtccggcagg cggaccgtgg 600
ggccaaagag gaggcgcgcg ttggtcagca tgccgcccct gaagtagcga gtgtgggcgt 660
actcgccaca gtcacgggcc atgttctcaa tcaggtacga gaactgccgg ttgcggctta 720
taaacaagat gatgcccttg cggtaggcca tgtgggcggt gaagttcaag gccagctgga 780
ggtgcgtggc tgtctgttcc aggtcgatga tgtcgtggtc caggcggctc ccaaagatgt 840
acggctccat aaacctgcca gagaccccac caaggcaagg gggatgagag ttcacggggc 900
catctccact ggctccttgc aggaacacag acgcccacca gggactcccg ggctcctctg 960
tgggggcact atgggctggg aagcacaatt tgcaacgctc cccgtgtgca tggacagcag 1020
tgcagaccca tccaggccac ccctctgcat gcctcgtctc gtggcttaac ccctcctacc 1080
ctctacctct tcccgaagga atcctaatag aactgacccc atatggatgt gtggacatcc 1140
aacatgacgc caaaaggaca ttctgccccg tgcagctcac agggcagccg cctccgtcac 1200
tgtcctcttc ccgaggcttt gcggatgagg cccctctggg gttggactta gcggggtgct 1260
ctgggccaaa agcattaagg gatcagggca g 1291
<210>14
<211>711
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR14
<400>14
ccctggacca gggtccgtgg tcttggtggg cactggcttc ttcttgctgg gtgttttcct 60
gtgggtctct ggcaaggcac tttttgtggc gctgcttgtg ctgtgtgcgg gaggggcagg 120
tgctctttcc tcttggagct ggaccctctg gggcgggtcc ccgtcggcct ccttgtgtgt 180
tttctgcacc tggtacagct ggatggcctc ctcaatgccg tcgtcgctgc tggagtcgga 240
cgcctcgggc gcctgtacgg cgctcgtgac tcgctttccc ctccttgcgg tgctggcgtt 300
ccttttaatc ccacttttat tctgtactgc ttctgaaggg cggtgggggt tgctggcttt 360
gtgctgccct ccttctcctg cgtggtcgtg gtcgtgacct tggacctgag gcttctgggc 420
tgcacgtttg tctttgctaa ccgggggagg tctgcagaag gcgaactcct tctggacgcc 480
catcaggccc tgccggtgca ccacctttgt agccggctct tggtgggatt tcgagagtga 540
cttcgccgaa ttttcatgtg tgtctggttt cttctccact gacccatcac atttttgggt 600
ctcatgctgt cttttctcat tcagaaactg ttctatttct gccctgatgc tctgctcaaa 660
ggagtctgct ctgctcatgc tgactgggga ggcagagccc tggtccttgc t 711
<210>15
<211>1876
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR15
<400>15
gagtccaaga tcaaggtgcc agcatcttgt gagggccttc ttgttacgtc actccctagc 60
gaaagggcaa agagagggtg agcaagagaa aggggggctg aactcgtcct tgtagaagag 120
gcccattccc gagacaatgg cattcatcca ttcactccac cctcatggcc tcaccacctc 180
tcatgaggct ccacctccca gccctggttt gttggggatt aaatttccaa cacatgcctt 240
ttgggggaca tgttaaaatt atagcacccc aaatgttaca ctatcttttg atgagcggta 300
gttctgattt taagtctagc tggcctactt tttcttgcac gtgggatgct ttctgcctgt 360
tccagggcag gcagctcttc tctgtccctc tgctggcccc acctcatcct ctgttgtcct 420
cttccctcct tctgtgccct ggggtcctgg tgggggtgtg actgtcaact gcgttgggct 480
aacttttttc cctgctggtg gcccgtaatg aaagaaagct tcttgctccc aagttcctta 540
aatccaagct catagacaac gcggtctcac agcaggcctg gggccagcct cacgtgagcc 600
ccttccctgg tgtagtcact ggcatggggg aatgggattt cctgttgccc tactgtgtgg 660
ctgaggtggg ggttgcttcc tggagccagg ccttgtggaa gggcagtgcc cactgcagtg 720
gatgctgggc cctgaatctg accccagtgt tcattggctc tgtgagaccc agtgagggca 780
gggagggaag tggagctggg gtgagaagta gaggccctgc agggcccacg tgccagccac 840
caggcctcag actaggctca gatgacggag agctgcacac ctgcccaacc caggccctgc 900
agtgcccaca tgccagccgc tggggcccag acttgctcca gagggcggag agctttacac 960
cggcccaacc caggccatgg ctccaaatgc gtgacagttt tgctgttgct tcttttagtc 1020
attgtcaagt tgatgcttgt tttgcagagg accaaggctt tatgaaccta ttaccctgtg 1080
tgaagagttt caccaggtta tggaaatttc tttaaaacca taccacagtt ttttcattat 1140
tcatgtatat ttttaaaaat aattactgca ctcagtagaa taacatgaaa atgttgcctg 1200
ttagcccttt tccagtttgc cccgagaata ctgggggcac ttgtggctgc aatgtttatc 1260
ctgcggcagc tttgccatga agtatctcac ttttattatt atttttgcat tgctcgagta 1320
tattgacttt ggaaacaaaa gacatcattc tatttatagc attatgtttt tagtagtggt 1380
atttccatat acaagataca gtaattttcc gtcaatgaaa atgtcaaatt ctagaaaatg 1440
taacattcct atgcgtggtg ttaacatcgt tctctaacag ttgttggccg aagattcgtt 1500
tgatgaatcc gatttttcca aaatagccga ttctgatgat tcagacgatt ctgatgttct 1560
gtttagaaat aattccaaga acagttttta cattttattt tcacattgaa aatcagtcag 1620
atttgcttca gcctcaaaga gcacgtttat gtaaaattaa atgagtgctg gcagccagct 1680
gcgctttgtt tttctaaatg ggaaaagggt taaatttcac tcagctttta aatgacagcg 1740
cacagcctgt gtcatagagg gttggaggag atgactttaa ctgcctgtgg ttaggatccc 1800
tttcccccag gaatgtctgg gagcccactg ccgggtttgc tgtccgtctc gtttggactc 1860
agttctgcat gtactg 1876
<210>16
<211>1282
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR16
<400>16
cgcccacctc ggctttccaa agtgctggga ttacaggcat gagtcactgc gcccatcctg 60
attccaagtc tttagataat aacttaactt tttcgaccaa ttgccaatca ggcaatcttt 120
gaatctgcct atgacctagg acatccctct ccctacaagt tgccccgcgt ttccagacca 180
aaccaatgta catcttacat gtattgattg aagttttaca tctccctaaa acatataaaa 240
ccaagctata gtctgaccac ctcaggcacg tgttctcagg acctccctgg ggctatggca 300
tgggtcctgg tcctcagatt tggctcagaa taaatctctt caaatatttt ccagaatttt 360
actcttttca tcaccattac ctatcaccca taagtcagag ttttccacaa ccccttcctc 420
agattcagta atttgctaga atggccacca aactcaggaa agtattttac ttacaattac 480
caatttatta tgaagaactc aaatcaggaa tagccaaatg gaagaggcat agggaaaggt 540
atggaggaag gggcacaaag cttccatgcc ctgtgtgcac accaccctct cagcatcttc 600
atgtgttcac caactcagaa gctcttcaaa ctttgtcatt taggggtttt tatggcagtt 660
ccactatgta ggcatggttg ataaatcact ggtcatcggt gatagaactc tgtctccagc 720
tcctctctct ctcctcccca gaagtcctga ggtggggctg aaagtttcac aaggttagtt 780
gctctgacaa ccagccccta tcctgaagct attgaggggt cccccaaaag ttaccttagt 840
atggttggaa gaggcttatt atgaataaca aaagatgctc ctatttttac cactagggag 900
catatccaag tcttgcggga acaaagcatg ttactggtag caaattcata caggtagata 960
gcaatctcaa ttcttgcctt ctcagaagaa agaatttgac caagggggca taaggcagag 1020
tgagggacca agataagttt tagagcagga gtgaaagttt attaaaaagt tttaggcagg 1080
aatgaaagaa agtaaagtac atttggaaga gggccaagtg ggcgacatga gagagtcaaa 1140
caccatgccc tgtttgatgt ttggcttggg gtcttatatg atgacatgct tctgagggtt 1200
gcatccttct cccctgattc ttcccttggg gtgggctgtc cgcatgcaca atggcctgcc 1260
agcagtaggg aggggccgca tg 1282
<210>17
<211>793
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR17
<400>17
atccgagggg aggaggagaa gaggaaggcg agcagggcgc cggagcccga ggtgtctgcg 60
agaactgttt taaatggttg gcttgaaaat gtcactagtg ctaagtggct tttcggattg 120
tcttatttat tactttgtca ggtttcctta aggagagggt gtgttggggg tgggggagga 180
ggtggactgg ggaaacctct gcgtttctcc tcctcggctg cacagggtga gtaggaaacg 240
cctcgctgcc acttaacaat ccctctatta gtaaatctac gcggagactc tatgggaagc 300
cgagaaccag tgtcttcttc cagggcagaa gtcacctgtt gggaacggcc cccgggtccc 360
cctgctgggc tttccggctc ttctaggcgg cctgatttct cctcagccct ccacccagcg 420
tccctcaggg acttttcaca cctccccacc cccatttcca ctacagtctc ccagggcaca 480
gcacttcatt gacagccaca cgagccttct cgttctcttc tcctctgttc cttctctttc 540
tcttctcctc tgttccttct ctttctctgt cataatttcc ttggtgcttt cgccacctta 600
aacaaaaaag agaaaaaaat aaaataaaaa aaacccattc tgagccaaag tattttaaga 660
tgaatccaag aaagcgaccc acatagccct ccccacccac ggagtgcgcc aagacgcacc 720
caggctccat cacagggccg agagcagcgc cactctggtc gtacttttgg gtcaagagat 780
cttgcaaaag agg 793
<210>18
<211>492
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR18
<400>18
atctttttgc tctctaaatg tattgatggg ttgtgttttt tttcccacct gctaataaat 60
attacattgc aacattcttc cctcaacttc aaaactgctg aactgaaaca atatgcataa 120
aagaaaatcc tttgcagaag aaaaaaagct attttctccc actgattttg aatggcactt 180
gcggatgcag ttcgcaaatc ctattgccta ttccctcatg aacattgtga aatgaaacct 240
ttggacagtc tgccgcattg cgcatgagac tgcctgcgca aggcaagggt atggttccca 300
aagcacccag tggtaaatcc taacttatta ttcccttaaa attccaatgt aacaacgtgg 360
gccataaaag agtttctgaa caaaacatgt catctttgtg gaaaggtgtt tttcgtaatt 420
aatgatggaa tcatgctcat ttcaaaatgg aggtccacga tttgtggcca gctgatgcct 480
gcaaattatc ct 492
<210>19
<211>1840
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR19
<400>19
tcacttcctg atattttaca ttcaaggcta gctttatgca tatgcaacct gtgcagttgc 60
acagggcttt gtgttcagaa agactagctc ttggtttaat actctgttgt tgccatcttg 120
agattcatta taatataatt tttgaatttg tgttttgaac gtgatgtcca atgggacaat 180
ggaacattca cataacagag gagacaggtc aggtggcagc ctcaattcct tgccaccctt 240
ttcacataca gcattggcaa tgccccatga gcacaaaatt tgggggaacc atgatgctaa 300
gactcaaagc acatataaac atgttacctc tgtgactaaa agaagtggag gtgctgacag 360
cccccagagg ccacagttta tgttcaaacc aaaacttgct tagggtgcag aaagaaggca 420
atggcagggt ctaagaaaca gcccatcata tccttgttta ttcatgttac gtccctgcat 480
gaactaatca cttacactga aaatattgac agaggaggaa atggaaagat agggcaaccc 540
atagttcttt ttccttttag tctttcctta tcagtaaacc aaagatagta ttggtaaaat 600
gtgtgtgagt taattaatga gttagtttta ggcagtgttt ccactgttgg ggtaagaaca 660
aaatatatag gcttgtattg agctattaaa tgtaaattgt ggaatgtcag tgattccaag 720
tatgaattaa atatccttgt atttgcattt aaaattggca ctgaacaaca aagattaaca 780
gtaaaattaa taatgtaaaa gtttaatttt tacttagaat gacattaaat agcaaataaa 840
agcaccatga taaatcaaga gagagactgt ggaaagaagg aaaacgtttt tattttagta 900
tatttaatgg gactttcttc ctgatgtttt gttttgtttt gagagagagg gatgtggggg 960
cagggaggtc tcattttgtt gcccaggctg gacttgaact cctgggctcc agctatcctg 1020
ccttagcttc ttgagtagct gggactacag gcacacacca cagtgtctga cattttctgg 1080
attttttttt tttttttatt ttttttgtga gacaggttct ggctctgtta ctcaggttgc 1140
agtgcagtgg catgatagcg gctcactgca gcctcaacct cctcagctta agctactctc 1200
ccacttcagc ctcctgagta gccaggacta cagttgtgtg ccaccacacc tgtggctaat 1260
ttttgtagag atggggtctc tccacgttgc cgaggctggt ctccaactcc tggtctcaag 1320
cgaacctcct gacttggcct cccgaagtgc tgggattaca ggcttgagcc actgcatcca 1380
gcctgtcctc tgtgttaaac ctactccaat ttgtctttca tctctacata aacggctctt 1440
ttcaaagttc ccatagacct cactgttgct aatctaataa taaattatct gccttttctt 1500
acatggttca tcagtagcag cattagattg ggctgctcaa ttcttcttgg tatattttct 1560
tcatttggct tctggggcat cacactctct ttgagttact cattcctcat tgatagcttc 1620
ttcctagtct tctttactgg ttcttcctct tctccctgac tccttaatat tgtttttctc 1680
cccaggcttt agttcttagt cctcttctgt tatctattta cacccaattc tttcagagtc 1740
tcatccagag tcatgaactt aaacctgttt ctgtgcagat aattcacatt attatatctc 1800
cagcccagac tctcccgcaa actgcagact gatcctactg 1840
<210>20
<211>780
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR20
<400>20
gatctcaagt ttcaatatca tgttttggca aaacattcga tgctcccaca tccttaccta 60
aagctaccag aaaggctttg ggaactgtca acagagctac agaaaagtca gtaaagacca 120
atggacccct caaacaaaaa cagccaagct tttctgccaa aaagatgact gagaagactg 180
ttaaagcaaa aaactctgtt cctgcctcag atgatggcta tccagaaata gaaaaattat 240
ttcccttcaa tcctctaggc ttcgagagtt ttgacctgcc tgaagagcac cagattgcac 300
atctcccctt gagtgaagtg cctctcatga tacttgatga ggagagagag cttgaaaagc 360
tgtttcagct gggcccccct tcacctttga agatgccctc tccaccatgg aaatccaatc 420
tgttgcagtc tcctttaagc attctgttga ccctggatgt tgaattgcca cctgtttgct 480
ctgacataga tatttaaatt tcttagtgct ttagagtttg tgtatatttc tattaataaa 540
gcattatttg tttaacagaa aaaaagatat atacttaaat cctaaaataa aataaccatt 600
aaaaggaaaa acaggagtta taactaataa gggaacaaag gacataaaat gggataataa 660
tgcttaatcc aaaataaagc agaaaatgaa gaaaaatgaa atgaagaaca gataaataga 720
aaacaaatag caatatgaaa gacaaacttg accgggtgtg gtggctgatg cctgtaatcc 780
<210>21
<211>607
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR21
<400>21
gatcaataat ttgtaatagt cagtgaatac aaaggggtat atactaaatg ctacagaaat 60
tccattcctg ggtataaatc ctagacatat ttatgcatat gtacaccaag atatatctgc 120
aagaatgttc acagcaaatc tctttgtagt agcaaaaggc caaaaggtct atcaacaaga 180
aaattaatac attgtggcac ataatggcat ccttatgcca ataaaaatgg atgaaattat 240
agttaggttc aaaaggcaag cctccagata atttatatca tataattcca tgtacaacat 300
tcaacaacaa gcaaaactaa acatatacaa atgtcaggga aaatgatgaa caaggttaga 360
aaatgattaa tataaaaata ctgcacagtg ataacattta atgagaaaaa aagaaggaag 420
ggcttaggga gggacctaca gggaactcca aagttcatgg taagtactaa atacataatc 480
aaagcactca aaatagaaaa tattttagta atgttttagc tagttaatat cttacttaaa 540
acaaggtcta ggccaggcac ggtggctcac acctgtaatc ccagcacttt gggaggctga 600
ggcgggt 607
<210>22
<211>1380
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR22
<400>22
cccttgtgat ccacccgcct tggcctccca aagtgctggg attacaggcg tgagtcacta 60
cgcccggcca ccctccctgt atattatttc taagtatact attatgttaa aaaaagttta 120
aaaatattga tttaatgaat tcccagaaac taggatttta catgtcacgt tttcttatta 180
taaaaataaa aatcaacaat aaatatatgg taaaagtaaa aagaaaaaca aaaacaaaaa 240
gtgaaaaaaa taaacaacac tcctgtcaaa aaacaacagt tgtgataaaa cttaagtgcc 300
tgaaaattta gaaacatcct tctaaagaag ttctgaataa aataaggaat aaaataatca 360
catagttttg gtcattggtt ctgtttatgt gatggattat gtttattgat ttgtgtatgt 420
tgaacttatc tcaatagatg cagacaaggc cttgataaaa gtttttaaca ccttttcatg 480
ttgaaaactc tcaatagact aggtattgat gaaacatatc tcaaaataat agaagctatt 540
tatgataaac ccatagccaa tatcatactg agtgggcaaa agctggaagc attccctttg 600
aaaactggca caagacaagg atgccctctc tcaccactcc tattaaatgt agtattggaa 660
gttctggcca gagcaatcag gcaggagaaa gaaaaggtat taaaatagga agagaggaag 720
tcaaattgtc tctgtttgca gtaaacatga ttgtatattt agaaaacccc attgtctcat 780
cctaaaaact ccttaagctg ataaacaact tcagcaaagt ctcaggatac aaaatcaatg 840
tgcaaaaatc acaagcattc ctatacaccg ataatagaca gcagagagcc aaatcatgag 900
tgaagtccca ttcacaattg cttcaaagaa aataaaatac ttaggaatac aactttcacg 960
ggacatgaag gacattttca aggacaacta aaaaccactg ctcaaggaaa tgagagagga 1020
cacaaagaaa tggaaaaaca ttccatgctc atggaagaat caatatcatg aaaatggcca 1080
tactgcccaa agtaatttat agattcaatg ctaaccccat caagccacca ttgactttct 1140
tcacagaact agaaaaaaac tattttaaaa ctcatatgta gtcaaaaaga gtcggtatag 1200
ccaagacaat cctaagcata aagaacaaag ctggatgcat cacgctgact tcaaaccata 1260
ctacaaggct acagtaacca aaacagcatg gtactggtac caaaacagat agatagaccg 1320
atagaacaga acagaggcct cggaaataac accacacatc tacaaccctt tgatcttcaa 1380
<210>23
<211>1246
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR23
<400>23
atcccctcat ccttcagggc agctgagcag ggcctcgagc agctggggga gcctcactta 60
atgctcctgg gagggcagcc agggagcatg gggtctgcag gcatggtcca gggtcctgca 120
ggcggcacgc accatgtgca gccgccccca cctgttgctc tgcctccgcc acctggccat 180
gggcttcagc agccagccac aaagtctgca gctgctgtac atggacaaga agcccacaag 240
cagctagagg accttgtgtt ccacgtgccc agggagcatg gcccacagcc caaagaccag 300
tcaggagcag gcaggggctt ctggcaggcc cagctctacc tctgtcttca cacagatggg 360
agatttctgt tgtgattttg agtgatgtgc ccctttggtg acatccaaga tagttgctga 420
agcaccgctc taacaatgtg tgtgtattct gaaaacgaga acttctttat tctgaaataa 480
ttgatgcaaa ataaattagt ttggatttga aattctattc atgtaggcat gcacacaaaa 540
gtccaacatt gcatatgaca caaagaaaag aaaaagcttg cattccttaa atacaaatat 600
ctgttaacta tatttgcaaa tatatttgaa tacacttcta ttatgttaca tataatatta 660
tatgtatatg tatatataat atacatatat atgttacata taatatactt ctattatgtt 720
acatataata tttatctata agtaaataca taaatataaa gatttgagta gctgtagaac 780
attgtcttat gtgttatcag ctactactac aaaaatatct cttccactta tgccagtttg 840
ccatataaat atgatcttct cattgatggc ccagggcaag agtgcagtgg gtacttattc 900
tctgtgagga gggaggagaa aagggaacaa ggagaaagtc acaaagggaa aactctggtg 960
ttgccaaaat gtcaagtttc acatattccg agacggaaaa tgacatgtcc cacagaagga 1020
ccctgcccag ctaatgtgtc acagatatct caggaagctt aaatgatttt tttaaaagaa 1080
aagagatggc attgtcactt gtttcttgta gctgaggctg tgggatgatg cagatttctg 1140
gaaggcaaag agctcctgct ttttccacac cgagggactt tcaggaatga ggccagggtg 1200
ctgagcacta caccaggaaa tccctggaga gtgtttttct tactta 1246
<210>24
<211>939
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR24
<400>24
acgaggtcac gagttcgaga ccagcctggc caagatggtg aagccctgtc tctactaaaa 60
atacaacaag tagccgggcg cggtgacggg cgcctgtaat cccagctact caggaggctg 120
aagcaggaga atctctagaa cccaggaggc ggaggtgcag tgagctgaga ctgccccgct 180
gcactctagc ctgggcaaca cagcaagact ctgtctcaaa taaataaata aataaataaa 240
taaataaata aataaataaa tagaaaggga gagttggaag tagatgaaag agaagaaaag 300
aaatcctaga tttcctatct gaaggcacca tgaagatgaa ggccacctct tctgggccag 360
gtcctcccgt tgcaggtgaa ccgagttctg gcctccattg gagaccaaag gagatgactt 420
tggcctggct cctagtgagg aagccatgcc tagtcctgtt ctgtttgggc ttgatcctgt 480
atcacttgat tgtctctcct ggactttcca tggattccag ggatgcaact gagaagttta 540
tttttaatgc acttacttga agtaagagtt attttaaaac attttagcaa aggaaatgaa 600
ttctgacagg ttttgcactg aagacattca catgtgagga aaacaggaaa accactatgc 660
tagaaaaagc aaatgctgtt gagattgtct cacaaacaca aattgcgtgc cagcaggtag 720
gtttgagcct caggttgggc acattttacc ttaagcgcac tgttggtgga acttaaggtg 780
actgtaggac ttatatatac atacatacat ataatatata tacatattta tgtgtatata 840
cacacacaca cacacacaca cacacagggt cttgctatct tgcccagggt ggtctccaac 900
tctgggtctc aagcgatcct ctgcctcccc ttcccaaag 939
<210>25
<211>1067
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR25
<400>25
cagcccctct tgtgtttttc tttatttctc gtacacacac gcagttttaa gggtgatgtg 60
tgtataatta aaaggaccct tggcccatac tttcctaatt ctttagggac tgggattggg 120
tttgactgaa atatgttttg gtggggatgg gacggtggac ttccattctc cctaaactgg 180
agttttggtc ggtaatcaaa actaaaagaa acctctggga gactggaaac ctgattggag 240
cactgaggaa caagggaatg aaaaggcaga ctctctgaac gtttgatgaa atggactctt 300
gtgaaaatta acagtgaata ttcactgttg cactgtacga agtctctgaa atgtaattaa 360
aagtttttat tgagcccccg agctttggct tgcgcgtatt tttccggtcg cggacatccc 420
accgcgcaga gcctcgcctc cccgctgccc tcagcctccg atgacttccc cgcccccgcc 480
ctgctcggtg acagacgttc tactgcttcc aatcggaggc acccttcgcg ggagcggcca 540
atcgggagct ccggcaggcg gggaggccgg gccagttaga tttggaggtt caacttcaac 600
atggccgaag caagtagcgc caatctaggc agcggctgtg aggaaaaaag gcatgagggg 660
tcgtcttcgg aatctgtgcc acccggcact accatttcga gggtgaagct cctcgacacc 720
atggtggaca cttttcttca gaagctggtc gccgccggca ggtaaagtgg acgcagccgc 780
ggtgggagtg tttgttggca ccgaagctca aatcccgcga ggtcaggacg gccgcaggct 840
ggcgcgcggt gacgtgggtc cgcgttgggg gcggggcagt cggacgaggc gacccagtca 900
aatcctgagc cttaggagtc agggtattca cgcactgata acctgtagcg gaccgggata 960
gctagctact ccttcctaca ggaagccccg ttttcactaa aatttcaggt ggttgggagg 1020
aaagatagag cctttgcaaa ttagagcagg gttttttatt tttttat 1067
<210>26
<211>540
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR26
<400>26
ccccctgaca agccccagtg tgtgatgttc cccactctgt gtccatgcat tctcattgtt 60
caactcccat ctgtgagtga gaacatgcag tgtttggttt tctgtccttg agatagtttg 120
ctgagaatga tggtttccag cttcatccat gtccttgcaa aggaagtgaa cttatccttt 180
tttatggctt catagtattc catggcacat atgtgccaca tttttttaat ccagtctatc 240
attgatggac atttgggttg gttccaagtc tttgctattg tgaatagcac cacaattaac 300
atatgtgtgc atgtatacat ctttatagta gcatgattta taatccttcg ggtatatacc 360
ctgtaatggg atcgctgggt caaatggtat ttctagttct agatccttga ggaatcacca 420
cactgctttc cacaatggtt gaactaattt acgctcccac cagcagtgta aaagcattcc 480
tatttctcca cgtcctctcc agtatctgtt gtttcctgac tttttaatga tcatcattct 540
<210>27
<211>1520
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR27
<400>27
cttggccctc acaaagcctg tggccaggga acaattagcg agctgcttat tttgctttgt 60
atccccaatg ctgggcataa tgcctgccat tatgagtaat gccggtagaa gtatgtgttc 120
aaggaccaaa gttgataaat accaaagaat ccagagaagg gagagaacat tgagtagagg 180
atagtgacag aagagatggg aacttctgac aagagttgtg aagatgtact aggcaggggg 240
aacagcttaa ggagagtcac acaggaccga gctcttgtca agccggctgc catggaggct 300
gggtggggcc atggtagctt tcccttcctt ctcaggttca gagtgtcagc cttgaacttc 360
taattcccag aggcatttat tcaatgtttt cttctagggg catacctgcc ctgctgtgga 420
agactttctt ccctgtgggt cgccccagtc cccagatgag acggtttggg tcagggccag 480
gtgcaccgtt gggtgtgtgc ttatgtctga tgacagttag ttactcagtc attagtcatt 540
gagggaggtg tggtaaagat ggagatgctg ggtcacatcc ctagagaggt gttccagtat 600
gggcacatgg gagggctgga aggataggtt actgctagac gtagagaagc cacatccttt 660
aacaccctgg cttttcccac tgccaagatc cagaaagtcc ttgtggtttc gctgctttct 720
cctttttttt tttttttttt tttctgagat ggagtctggc tctgtcgccc aggctggagt 780
gcagtggcac gatttcggct cactgcaagt tccgcctcct aggttcatac cattctccca 840
cctcagcctc ccgagtagct gggactacag gcgccaccac acccagctaa ttttttgtat 900
ttttagtaga gacggcgttt caccatgtta gccaggatgg tcttgatccg cctgcctcag 960
cctcccaaag tgctgggatt acaggcgtga gccaccgcgc ccggcctgct ttcttctttc 1020
atgaagcatt cagctggtga aaaagctcag ccaggctggt ctggaactct tgacctcaag 1080
tgatctgcct gcctcagcct cccaaagtgc tgagattaca ggcatgagcc agtccgaatg 1140
tggctttttt tgttttgttt tgaaacaagg tctcactgtt gcccaggctg cagtgcagtg 1200
gcatacctca gctccactgc agcctcgacc tcctgggctc aagcaatcct cccaactgag 1260
cctccccagt agctggggct acaagcgcat gccaccacgc ctggctattt tttttttttt 1320
tttttttttt gagaaggagt ttcattcttg ttgcccaggc tggagtgcaa tggcacagtc 1380
tcagctcact gcagcctccg cctcctgggt tcaagcgatt ctcctgcctc agcctcccga 1440
gtagctggga ttataggcac ctgccaccat gcctggctaa tttttttgta tttttagtag 1500
ggatggggtt tcaccatgtt 1520
<210>28
<211>961
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR28
<400>28
aggaggttat tcctgagcaa atggccagcc tagtgaactg gataaatgcc catgtaagat 60
ctgtttaccc tgagaagggc atttcctaac tctccctata aaatgccaag tggagcaccc 120
cagatgaaat agctgatatg ctttctatac aagccatcta ggactggctt tatcatgacc 180
aggatattca cccactgaat atggctatta cccaagttat ggtaaatgct gtagttaagg 240
gggtcccttc cacatggaca ccccaggtta taaccagaaa gggttcccaa tctagactcc 300
aagagagggt tcttagacct catgcaagaa agaacttggg gcaagtacat aaagtgaaag 360
caagtttatt aagaaagtaa agaaacaaaa aaatggctac tccataagca aagttatttc 420
tcacttatat gattaataag agatggatta ttcatgagtt ttctgggaaa ggggtgggca 480
attcctggaa ctgagggttc ctcccacttt tagaccatat agggtatctt cctgatattg 540
ccatggcatt tgtaaactgt catggcactg atgggagtgt cttttagcat tctaatgcat 600
tataattagc atataatgag cagtgaggat gaccagaggt cacttctgtt gccatattgg 660
tttcagtggg gtttggttgg cttttttttt tttttaacca caacctgttt tttatttatt 720
tatttattta tttatttatt tatatttttt attttttttt agatggagtc ttgctctgtc 780
acccaggtta gagtgcagtg gcaccatctc ggctcactgc aagctctgcc tccttggttc 840
acgccattct gctgcctcag cctcccgagt agctgggact acaggtgcct gccaccatac 900
ccggctaatt ttttctattt ttcagtagag acggggtttc accgtgttag ccaggatggt 960
c 961
<210>29
<211>2233
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR29
<400>29
agcttggaca cttgctgatg ccactttgga tgttgaaggg ccgccctctc ccacaccgct 60
ggccactttt aaatatgtcc cctctgccca gaagggcccc agaggagggg ctggtgaggg 120
tgacaggagt tgactgctct cacagcaggg ggttccggag ggaccttttc tccccattgg 180
gcagcataga aggacctaga agggccccct ccaagcccag ctgggcgtgc agggccagcg 240
attcgatgcc ttcccctgac tcaggtggcg ctgtcctaaa ggtgtgtgtg ttttctgttc 300
gccagggggt ggcggataca gtggagcatc gtgcccgaag tgtctgagcc cgtggtaagt 360
ccctggaggg tgcacggtct cctccgactg tctccatcac gtcaggcctc acagcctgta 420
ggcaccgctc ggggaagcct ctggatgagg ccatgtggtc atccccctgg agtcctggcc 480
tggcctgaag aggaggggag gaggaggcca gcccctccct agccccaagg cctgcgaggc 540
tgcaagcccg gccccacatt ctagtccagg cttggctgtg caagaagcag attgcctggc 600
cctggccagg cttcccagct aggatgtggt atggcagggg tgggggacat tgaggggctg 660
ctgtagcccc cacaacctcc ccaggtaggg tggtgaacag taggctggac aagtggacct 720
gttcccatct gagattcaag agcccacctc tcggaggttg cagtgagccg agatccctcc 780
actgcactcc agcctgggca acagagcaag actctgtctc aaaaaaacag aacaacgaca 840
acaaaaaacc cacctctggc ccactgccta actttgtaaa taaagtttta ttggcacata 900
gacacaccca ttcatttaca tactgctgcg gctgcttttg cattaccctt gagtagacga 960
cagaccacgt ggccatggaa gccaaaaata tttactgtct ggccctttac agaagtctgc 1020
tctagaggga gaccccggcc catggggcag gaccactggg cgtgggcaga agggaggcct 1080
cggtgcctcc acgggcctag ttgggtatct cagtgcctgt ttcttgcatg gagcaccagg 1140
ggtcagggca agtacctgga ggaggcaggc tgttgcccgc ccagcactgg gacccaggag 1200
accttgagag gctcttaacg aatgggagac aagcaggacc agggctccca ttggctgggc 1260
ctcagtttcc ctgcctgtaa gtgagggagg gcagctgtga aggtgaactg tgaggcagag 1320
cctctgctca gccattgcag gggcggctct gccccactcc tgttgtgcac ccagagtgag 1380
gggcacgggg tgagatgtca ccatcagccc ataggggtgt cctcctggtg ccaggtcccc 1440
aagggatgtc ccatcccccc tggctgtgtg gggacagcag agtccctggg gctgggaggg 1500
ctccacactg ttttgtcagt ggtttttctg aactgttaaa tttcagtgga aaattctctt 1560
tcccctttta ctgaaggaac ctccaaagga agacctgact gtgtctgaga agttccagct 1620
ggtgctggac gtcgcccaga aagcccaggt actgccacgg gcgccggcca ggggtgtgtc 1680
tgcgccagcc atgggcacca gccaggggtg tgtctacgcc ggccaggggt aggtctccgc 1740
cggcctccgc tgctgcctgg ggagggccgt gcctgacact gcaggcccgg tttgtccgcg 1800
gtcagctgac ttgtagtcac cctgcccttg gatggtcgtt acagcaactc tggtggttgg 1860
ggaaggggcc tcctgattca gcctctgcgg acggtgcgcg agggtggagc tcccctccct 1920
ccccaccgcc cctggccagg gttgaacgcc cctgggaagg actcaggccc gggtctgctg 1980
ttgctgtgag cgtggccacc tctgccctag accagagctg ggccttcccc ggcctaggag 2040
cagccgggca ggaccacagg gctccgagtg acctcagggc tgcccgacct ggaggccctc 2100
ctggcgtcgc ggtgtgactg acagcccagg agcgggggct gttgtaattg ctgtttctcc 2160
ttcacacaga accttttcgg gaagatggct gacatcctgg agaagatcaa gaagtaagtc 2220
ccgcccccca ccc 2233
<210>30
<211>1851
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR30
<400>30
gggtgcattt ccacccaggg gacacttggc aatggtggga gacattgctt gttgtcacaa 60
ctgggcatgg gagtgctgct gcgtctagtg ggtagaggcc agagatgctc ctaatatcct 120
acaaggcaca gaacagcccc ccacaacaga gaattatcca gcctgaaaat gtccacagtg 180
ctgaggttgg gaaaccctat tctagagcca acaggctgtg aagcttgact catggttcca 240
tcaccaatag ctgcgtgacc ttggtgagtt ccttagctgc tctgtgcctc ggattcatgg 300
taggttttcc ttgttaggtt taaatgagtg aagttataca gagggcctga agtctcatgg 360
tattttacta gagcctcatt gtgttttagt tataattaga aattgggtaa ggtaaggaca 420
cagaagaagc catctgatct gggggcttca cacttagaag tgacctcgga gcaattgtat 480
tggggtggaa agggactaac agccaggagc agagggcaca ttggaattgg ggccagaggg 540
cacagactgc cttgtccatc aggcatagca atggacagag gaaggggaat gactagttat 600
ggctgcaagg ccaagtacag gggacttatt tctcatatct atctatctat ctacctaccg 660
tctatttatc tatcatctat ctacttattt atctatctat ttatgcatgt gtaccaaccg 720
aaagttttag taaatgcaca aactgcgata taatgaaaat ggaaattttc aaaagaagag 780
aaatcacctg ccacctgact accttaacaa atgagtggtt ttcatctctc cttccaggcc 840
tgtcattttt acagtgcttt agtcataaaa caggtcctct attctattgt tttatgtcac 900
atgaaattgt accataagca ttttccatga tgtgactcca ctgtttcatt ttccattttt 960
ttccagaatg aagataacct cattgttttt ttcctgattg taaaaatgct ctgtgctctt 1020
tttttttttt tttaacaatg caggcagtac caaaaagtat gaagaagaat gtaatagttc 1080
ccatttccca tctcactctt taaggccagc attttggtga acatccatcc gaacaaatct 1140
ccacgcgttt atcaatttgt tgacttactc cttcttttat gtaaatatga acatgattta 1200
actgccagtc catttggaac cttaaagtga aggtttttta ttgttggggt ttgctatggt 1260
ctgaatatgt gtgtcccccc aaaatttatg ttgaatccta acgcccaatg cgattaggag 1320
gtggggccat taggaggtga ttaagtcatg aagtcatcag ccctaatgaa tgggatttgt 1380
ggccttgaaa agggacccca gagagctgcc ttgccccttc tgccatgtaa ggacacagtg 1440
aggagctagg aagggggcct cagcagagac caaatgtgat ggtgcctcga tattggactt 1500
cccagcctcc agaatgtgag aaatgaattt ctgttgttta taagtcaccc agtctatagt 1560
attttgttct agcagcccaa acagactaag tcagggttgt tgttttagga agtggggaat 1620
ggggccatgc atgggtgtac gccagaacaa aggaagccag caagtcctga aagatactgg 1680
aaaagggaat agtgggcacg tgcagtgtgt tagtttcctg aggctgctat aacaaagcac 1740
cacaggttgg gtggcttaaa taacagaaat tcattctccc atcattctgg ggaccagacg 1800
tctgaaatca agactcctat gccatgctcc ttctgaaggc tccaggggag g 1851
<210>31
<211>1701
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR31
<220>
<221>misc_feature
<222>(159)..(1696)
<223>n is a, c, g, or t at various positions
<400>31
cacccgcctt ggccccccag agtgctggga ttacaagtgt aaaccaccat tcctggctag 60
atttaatttt ttaaaaaata aagagaagta ggaatagttc attttaggga gagcccctta 120
actgggacag gggcaggaca ggggtgaggc ttcccttant tcaagctcac ctcaaaccca 180
cccaggactg tgtgtcacat tctccaataa aggaaaggtt gctgcccccg cctgtgagtg 240
ctgcagtgga gggtagaggg ccgtgggcag agtgcttcat ggactgctca tcaagaaagg 300
cttcatgaca atcggcccag ctgctgtcat cccacattct acttccagct aggagaaggc 360
ggcttgccca cagtcaccca gccggcaagt gtcacccctg ggttggaccc agagctatga 420
tcctgcccag gggtccagct gagaatcagg cccacgttct aggcagaggg gctcacctac 480
tgggactcca gtagctgtag tgcatggagg catcatggct gcagcagcct ggacctggtc 540
tcacactggc tgtccctgtg ggcaggccat cctcaatgcc aggtcaggcc caagcatgta 600
tcccagacaa tgacaatggg gtggaatcct ctcttgtccc agaagccact cctcactgtt 660
ctacctgagg aaggcagggg catggtggaa tcctgaagcc tgctgtgagg gtctccagcg 720
aacttgcaca tggtcagccc tgccttctcc tccctgaact agattgagcg agagcaagaa 780
ggacattgaa ccagcaccca aagaattttg gggaacggcc tctcatccag gtcaggctca 840
cctccttttt aaaatttaat taattaatta attaattttt ttttagagac agagtcttac 900
tgtgtggccc aggctgtagt gcagtggcac aatcatagtt cactgcagcc tcaaactccc 960
cacctcagcc tctggattag ctgagactac aggtgcacca ccaccacacc cagctaatat 1020
ttttattttt gtagagagag ggtttcacca tcttgcccag gctggtctca aactcctggg 1080
ctcaagtgat cccgcccagg tctgaaagcc cccaggctgg cctcagactg tggggttttc 1140
catgcagcca cccgagggcg cccccaagcc agttcatctc ggagtccagg cctggccctg 1200
ggagacagag tgaaaccagt ggtttttatg aacttaactt agagtttaaa agatttctac 1260
tcgatcactt gtcaagatgc gccctctctg gggagaaggg aacgtgactg gattccctca 1320
ctgttgtatc ttgaataaac gctgctgctt catcctgtgg gggccgtggc cctgtccctg 1380
tgtgggtggg gcctcttcca tttccctgac ttagaaacca cagtccacct agaacagggt 1440
ttgagaggct tagtcagcac tgggtagcgt tttgactcca ttctcggctt tcttcttttt 1500
ctttccagga tttttgtgca gaaatggttc ttttgttgcc gtgttagtcc tccttggaag 1560
gcagctcaga aggcccgtga aatgtcgggg gacaggaccc ccagggaggg aaccccaggc 1620
tacgcacttt agggttcgtt ctccagggag ggcgacctga cccccgnatc cgtcggngcg 1680
cgnngnnacn aannnnttcc c 1701
<210>32
<211>771
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR32
<400>32
gatcacacag cttgtatgtg ggagctagga ttggaacccc agaagtctgg ccccaggttc 60
atgctctcac ccactgcata caatggcctc tcataaatca atccagtata aaacattaga 120
atctgcttta aaaccataga attagtagcg taagtaataa atgcagagac catgcagtga 180
atggcattcc tggaaaaagc ccccagaagg aattttaaat cagctttcgt ctaatcttga 240
gcagctagtt agcaaatatg agaatacagt tgttcccaga taatgcttta tgtctgacca 300
tcttaaactg gcgctgtttt tcaaaaactt aaaaacaaaa tccatgactc ttttaattat 360
aaaagtgata catgtctact tgggaggctg aggtggtggg aggatggctt gagtttgagg 420
ctgcagtatg ctactatcat gcctataaat agccgctgca ttccagcttg ggcaacatac 480
ccaggcccta tctcaaaaaa ataaaaagta atacatctac attgaagaaa attaatttta 540
ttgggttttt ttgcattttt attatacaca gcacacacag cacatatgaa aaaatgggta 600
tgaactcagg cattcaactg gaagaacagt actaaatcaa tgtccatgta gtcagcgtga 660
ctgaggttgg tttgtttttt cttttttctt ctcttctctt ctcttttctt tttttttgag 720
acggagcttt gctctttttg cccaggcttg attgcaatgg cgtgatctca g 771
<210>33
<211>1368
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR33
<400>33
gcttttatcc tccattcaca gctagcctgg cccccagagt acccaattct ccctaaaaaa 60
cggtcatgct gtatagatgt gtgtggcttg gtagtgctaa agtggccaca tacagagctc 120
tgacaccaaa cctcaggacc atgttcatgc cttctcactg agttctggct tgttcgtgac 180
acattatgac attatgatta tgatgacttg tgagagcctc agtcttctat agcactttta 240
gaatgcttta taaaaaccat ggggatgtca ttatattcta acctgttagc acttctgttc 300
gtattaccca tcacatccca acatcaattc tcatatatgc aggtacctct tgtcacgcgc 360
gtccatgtaa ggagaccaca aaacaggctt tgtttgagca acaaggtttt tatttcacct 420
gggtgcaggt gggctgagtc tgaaaagaga gtcagtgaag ggagacaggg gtgggtccac 480
tttataagat ttgggtaggt agtggaaaat tacaatcaaa gggggttgtt ctctggctgg 540
ccagggtggg ggtcacaagg tgctcagtgg gagagccttt gagccaggat gagccagaag 600
gaatttcaca aggtaatgtc atcagttaag gcagggactg gccattttca cttcttttgt 660
ggtggaatgt catcagttaa ggcaggaacc ggccattttc acttcttttg tgattcttca 720
cttgcttcag gccatctgga cgtataggtg caggtcacag tcacagggga taagatggca 780
atggcatagc ttgggctcag aggcctgaca cctctgagaa actaaagatt ataaaaatga 840
tggtcgcttc tattgcaaat ctgtgtttat tgtcaagagg cacttatttg tcaattaaga 900
acccagtggt agaatcgaat gtccgaatgt aaaacaaaat acaaaacctc tgtgtgtgtg 960
tgtgtgtgag tgtgtgtgta tgtgtgtgtg tgtgtattag agaggaaaag cctgtatttg 1020
gaggtgtgat tcttagattc taggttcttt cctgcccacc ccatatgcac ccaccccaca 1080
aaagaacaaa caacaaatcc caggacatct tagcgcaaca tttcagtttg catattttac 1140
atatttactt ttcttacata ttaaaaaact gaaaatttta tgaacacgct aagttagatt 1200
ttaaattaag tttgttttta cactgaaaat aatttaatat ttgtgaagaa tactaataca 1260
ttggtatatt tcattttctt aaaattctga acccctcttc ccttatttcc ttttgacccg 1320
attggtgtat tggtcatgtg actcatggat ttgccttaag gcaggagg 1368
<210>34
<211>755
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR34
<400>34
actgggcacc ctcctaggca ggggaatgtg agaactgccg ctgctctggg gctgggcgcc 60
atgtcacagc aggagggagg acggtgttac accacgtggg aaggactcag ggtggtcagc 120
cacaaagctg ctggtgatga ccaggggctt gtgtcttcac tctgcagccc taacacccag 180
gctgggttcg ctaggctcca tcctgggggt gcagaccctg agagtgatgc cagtgggagc 240
ctcccgcccc tccccttcct cgaaggccca ggggtcaaac agtgtagact cagaggcctg 300
agggcacatg tttatttagc agacaaggtg gggctccatc agcggggtgg cctggggagc 360
agctgcatgg gtggcactgt ggggagggtc tcccagctcc ctcaatggtg ttcgggctgg 420
tgcggcagct ggcggcaccc tggacagagg tggatatgag ggtgatgggt ggggaaatgg 480
gaggcacccg agatggggac agcagaataa agacagcagc agtgctgggg ggcaggggga 540
tgagcaaagg caggcccaag acccccagcc cactgcaccc tggcctccca caagccccct 600
cgcagccgcc cagccacact cactgtgcac tcagccgtcg atacactggt ctgttaggga 660
gaaagtccgt cagaacaggc agctgtgtgt gtgtgtgcgt gtatgagtgt gtgtgtgtga 720
tccctgactg ccaggtcctc tgcactgccc ctggg 755
<210>35
<211>1193
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR35
<220>
<221>misc_feature
<222>(312)..(1191)
<223>n is a, c, g, or t at various positions
<400>35
cgacttggtg atgcgggctc ttttttggtt ccatatgaac tttaaagtag tcttttccaa 60
ttctgtgaag aaagtcattg gtaggttgat ggggatggca ttgaatctgt aaattacctt 120
gggcagtatg gccattttca caatgttgat tcttcctatc catgatgatg gaatgttctt 180
ccattagttt gtatcctctt ttatttcctt gagcagtggt ttgtagttct ccttgaagag 240
gtccttcaca tcccttgtaa gttggattcc taggtatttt attctctttg aagcaaattg 300
tgaatgggag tncactcacg atttggctct ctgtttgtct gctgggtgta taaanaatgt 360
ngtgatnttn gtacattgat ttngtatccn tgagacttng ctgaatttgc ttnatcngct 420
tnngggaacc ttttgggctg aaacnatggg attttctaaa tatacaatca tgtcgtctgc 480
aaacagggaa caatttgact tcctcttttc ctaattgaat acactttatc tccttctcct 540
gcctaattgc cctgggcaaa acttccaaca ctatgntngn aataggagnt ggtgagagag 600
ggcatccctg ttcttgttgc cagnttttca aagggaatgc ttccagtttt ggcccattca 660
gtatgatatg ggctgtgggt ngtgtcataa atagctctta tnattttgaa atgtgtccca 720
tcaataccta atttattgaa agtttttagc atgaangcat ngttgaattt ggtcaaaggc 780
tttttctgca tctatggaaa taatcatgtg gtttttgtct ttggctcntg tttatatgct 840
ggatnacatt tattgatttg tgtatatnga acccagcctn ncatcccagg gatgaagccc 900
acttgatcca agcttggcgc gcngnctagc tcgaggcagg caaaagtatg caaagcatgc 960
atctcaatta gtcagcaccc atagtccgcc cctacctccg cccatccgcc cctaactcng 1020
nccgttcgcc cattctcgcc catggctgac taatnttttt annatccaag cggngccgcc 1080
ctgcttganc attcagagtn nagagnnttg gaggccnagc cttgcaaaac tccggacngn 1140
ttctnnggat tgaccccnnt taaatatttg gttttttgtn ttttcanngg nga 1193
<210>36
<211>1712
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR36
<400>36
gatcccatcc ttagcctcat cgatacctcc tgctcacctg tcagtgcctc tggagtgtgt 60
gtctagccca ggcccatccc ctggaactca ggggactcag gactagtggg catgtacact 120
tggcctcagg ggactcagga ttagtgagcc ccacatgtac acttggcctc agtggactca 180
ggactagtga gccccacatg tacacttggc ctcaggggac tcaggattag tgagccccca 240
catgtacact tggcctcagg ggactcagga ttagtgagcc ccacatgtac acttggcctc 300
aggggactca ggactagtga gccccacatg tacacttggc ctcaggggac tcagaactag 360
tgagccccac atgtacactt ggcttcaggg gactcaggat tagtgagccc cacatgtaca 420
cttggacacg tgaaccacat cgatgtgctg cagagctcag ccctctgcag atgaaatgtg 480
gtcatggcat tccttcacag tggcacccct cgttccctcc ccacctcatc tcccattctt 540
gtctgtcttc agcacctgcc atgtccagcc ggcagattcc accgcagcat cttctgcagc 600
acccccgacc acacacctcc ccagcgcctg cttggccctc cagcccagct cccgcctttc 660
ttccttgggg aagctccctg gacagacacc ccctcctccc agccatggct ttttcctgct 720
ctgccccacg cgggaccctg ccctggatgt gctacaatag acacatcaga tacagtcctt 780
cctcagcagc cggcagaccc agggtggact gctcggggcc tgcctgtgag gtcacacagg 840
tgtcgttaac ttgccatctc agcaactagt gaatatgggc agatgctacc ttccttccgg 900
ttccctggtg agaggtactg gtggatgtcc tgtgttgccg gccacctttt gtccctggat 960
gccatttatt tttttccaca aatatttccc aggtctcttc tgtgtgcaag gtattagggc 1020
tgcagcgggg gccaggccac agatctctgt cctgagaaga cttggattct agtgcaggag 1080
actgaagtgt atcacaccaa tcagtgtaaa ttgttaactg ccacaaggag aaaggccagg 1140
aaggagtggg gcatggtggt gttctagtgt tacaagaaga agccagggag ggcttcctgg 1200
atgaagtggc atctgacctg ggatctggag gaggagaaaa atgtcccaaa agagcagaga 1260
gcccacccta ggctctgcac caggaggcaa cttgctgggc ttatggaatt cagagggcaa 1320
gtgataagca gaaagtcctt gggggccaca attaggattt ctgtcttcta aagggcctct 1380
gccctctgct gtgtgacctt gggcaagtta cttcacctct agtgctttgg ttgcctcatc 1440
tgtaaagtgg tgaggataat gctatcacac tggttgagaa ttgaagtaat tattgctgca 1500
aagggcttat aagggtgtct aatactagta ctagtaggta cttcatgtgt cttgacaatt 1560
ttaatcatta ttattttgtc atcaccgtca ctcttccagg ggactaatgt ccctgctgtt 1620
ctgtccaaat taaacattgt ttatccctgt gggcatctgg cgaggtggct aggaaagcct 1680
ggagctgttt cctgttgacg tgccagacta gt 1712
<210>37
<211>1321
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR37
<400>37
aggatcacat ttaaggaagt gtgtggggtc cctggatgac accagcaccc agtgcggctc 60
tgtctggcaa ccgctcccaa ggtggcagga gtgggtgtcc cctgtgtgtc agtgggcagc 120
tcctgctgag cctacagctc actggggagc ctgacagcgg ggccatgtgc ctgacactcc 180
tctctgcttg tggacctggc aaggcaggga gcagaaaaca gagccacttg aaggctttct 240
gtctgcgtct gtgtgcagtg tggatttagt tgtgcttttt tcttgctggg agagcacagc 300
caccatttac aagcagtgtc accctcatgg gtggcgagga cagaacagga gcctctgctc 360
tctgtaccta tctgggcccg gtgggctccc ttgtcctggc ttccatctct gtctcagcga 420
ccattcagcc ctgcgcagga acacatgttg cttagaaaag ccaaattcag cccttgtctc 480
tgcctcctct ggtctcatga tgtgcatctg ttaccttgaa actggaaacc agtctatcaa 540
tgtctgtgcc aattttttat tccctcccca acctccttcc ccatacgact ttttatttat 600
gtaggatgtg tgctgtctaa tgatgggatg accacatttt tccatgttct aaaagtgctc 660
ctctcccgca gggtcccagg gctggtggtt gctttgggtc tacagctacg tcttacccgc 720
ctcctgcctc aacagcctgt gtggtggcaa agccggtgtg gggctgggga acgcagcgtt 780
ctccaggagg gggacccggc tctccttctg cagtgcaggc gaaggcctag atgccagtgt 840
gacctcccac aaggcgtggc ttccagactc cccggctgga agtgatgctt ttttgcctcc 900
ggccctgggt ttgaagcagc ctggctttct cttggtaagt ggctggtgtc ttagcagctg 960
caatctgagc tcagccacct acacaccacc gtggccgaca ctttcattaa aaagtttcct 1020
gagacgactt gcgtgcatgt tgacttcatg atcagcgccg ctgggaagaa cccctgagcc 1080
ggtggggtgg ggctggaagc agcaggtgca gtgatggggc tgggtgccca ggaggcctca 1140
gtgctcaatc aggccaaggt ggccaagccc aggctgcagg gaaggccggc ctgggggttg 1200
tgggtgagca caggcaggca ccagctgggc agtgttagga tgctggagca gcatccgtaa 1260
ccccactgag tggggtagtc tggttggggc agggaccgct gttgctttgg cagagagaga 1320
t 1321
<210>38
<211>1445
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR38
<220>
<221>misc_feature
<222>(348)..(949)
<223>n is a, c, g, or t at various positions
<400>38
gatctatggg agtagcttcc ttagtgagct ttcccttcaa atactttgca accaggtaga 60
gaattttgga gtgaaggttt tgttcttcgt ttcttcacaa tatggatatg catcttcttt 120
tgaaaatgtt aaagtaaatt acctctcttt tcagatactg tcttcatgcg aacttggtat 180
cctgtttcca tcccagcctt ctataaccca gtaacatctt ttttgaaacc agtgggtgag 240
aaagacacct ggtcaggaac gcggaccaca ggacaactca ggctcaccca cggcatcaga 300
ctaaaggcaa acaaggactc tgtataaagt accggtggca tgtgtatnag tggagatgca 360
gcctgtgctc tgcagacagg gagtcacaca gacacttttc tataatttct taagtgcttt 420
gaatgttcaa gtagaaagtc taacattaaa tttgattgaa caattgtata ttcatggaat 480
attttggaac ggaataccaa aaaatggcaa tagtggttct ttctggatgg aagacaaact 540
tttcttgttt aaaataaatt ttattttata tatttgaggt tgaccacatg accttaagga 600
tacatataga cagtaaactg gttactacag tgaagcaaat taacatatct accatcgtac 660
atagttacat ttttttgtgt gacaggaaca gctaaaatct acgtatttaa caaaaatcct 720
aaagacaata catttttatt aactatagcc ctcatgatgt acattagatc gtgtggttgt 780
ttcttccgtc cccgccacgc cttcctcctg ggatggggat tcattcccta gcaggtgtcg 840
gagaactggc gcccttgcag ggtaggtgcc ccggagcctg aggcgggnac tttaanatca 900
gacgcttggg ggccggctgg gaaaaactgg cggaaaatat tataactgna ctctcaatgc 960
cagctgttgt agaagctcct gggacaagcc gtggaagtcc cctcaggagg cttccgcgat 1020
gtcctaggtg gctgctccgc ccgccacggt catttccatt gactcacacg cgccgcctgg 1080
aggaggaggc tgcgctggac acgccggtgg cgcctttgcc tgggggagcg cagcctggag 1140
ctctggcggc agcgctggga gcggggcctc ggaggctggg cctggggacc caaggttggg 1200
cggggcgcag gaggtgggct cagggttctc cagagaatcc ccatgagctg acccgcaggg 1260
cggccgggcc agtaggcacc gggcccccgc ggtgacctgc ggacccgaag ctggagcagc 1320
cactgcaaat gctgcgctga ccccaaatgc tgtgtccttt aaatgtttta attaagaata 1380
attaataggt ccgggtgtgg aggctcaagc cttaatcccc agcacctggc gaggccgagg 1440
aggga 1445
<210>39
<211>2331
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR39
<400>39
gtgaaataga tcactaaagc tgattcctct tgtctaaatg aaactttcta ccctttgatg 60
gacagctatg ctttccccat cctctcccgt cccccagccc ttggtaacca tcatcctact 120
ctctacttgt aggagttcaa cttgtttaga ttttgtgagt gagaacatgt ggtatttgcc 180
tttagagtcc tctaggttta tccatattgt gttaaatgac aggattccct gcctttttaa 240
ggctgaatag tatttcattg taatatatat acatacacac acacatatac acacacatat 300
atatacatat atacatatat gtacatagat acatatatat gtacatatat acacacacat 360
atacacacat atatacacat atatacatat acatatatac acatatatgt acatatatat 420
aacttttttt catttatcca ttcacttaat acatatgatg gagggcttta tatatgccag 480
gctctgtgat gaatgctgga aattcaatag tgagaaagac tcagtctctg cctccaaaga 540
gcatcatggg ctaggtgctg caacgaggaa ttgccaactg ttgtcatgag agcacagaga 600
agggactcaa ccagccttga agaatcaggg gaggcttcta agctaatggt gtgtgcctgg 660
ggatcacatt gtttcaagca gcagtaacag gatgtgctca ggtccagatg tgagagagag 720
agagagcata tgtcttcaag aaactaacag tagctcccta tagctgaagc aggagtacaa 780
aatagtgagt ttaagtgatg aggcaagaga tatgaagaag cttgaccatg cagctacacc 840
gggcagcatg ccctctgaga catctcatgg aagccggaaa tgggagtgcc ttgataccaa 900
gccagagaaa ttataatact aagtagatag actgagcagc actcctcctg ggaagaatga 960
gacaagccct gaatttggag gtaagttgtg gattggtgat tagaggagag gtaacaggca 1020
ccaaagcaag aaatagtatt gatgcaaagc tgaggttaat tggatgacaa aatgaagagc 1080
ataaggggct cagacacaga ctgagcagaa aacgagtagc atctgaacct agattgagtt 1140
actaatggat gagaaagagt tcttaaagtt gatgaccacg ggatccatat ataagaatgt 1200
ccaatctccc caaattgatc cacgagttca gtgcaatgcc aatcaaaatc ccactaacaa 1260
gtttatttta aaatgtaaat gaaaatacaa aatttttaaa aagcaaagca atattgaaaa 1320
cccaggaaaa attaggagga cttacacaac ctgatctcaa aacttaccat tatcaagaca 1380
gagtgttatt gacacaagga gagacaaata gataaacgga atgtggtagt ctggagatgc 1440
acccacatgt atgtggtcaa ttgatttttg gccaaggcac caagtcaatt caaaggagca 1500
aggaaagtag tacagaaaca accaaatatt gttttggaaa ataatgacaa agggcttata 1560
accagaatat aagcatataa atataattct ttcaaatcaa taataagaag gcaaatatct 1620
aataaaaatg agcaaagact tgaaaagtca cttaaaaagg cttattaatt agaaatatgc 1680
aaatgttatt agtcttcagt ggaatttaca ttaaaccaca agggatacta ttatatctta 1740
tgcccactag aataaccaaa ggaaaaaaga cagacaaaac aaaatgctgg tgaggatgtg 1800
aagcaactgg aactctcata cattattggt ggtaatgtaa aatttataca accattatga 1860
ataaaggttt ggcagtttct tacaaagttg aatgcacttc tccacgatga ctaggctttt 1920
cactcatagg cgtctggctc cctagaactg aaaacatatg ttcacaagaa gacttgcaaa 1980
tatatattct cccacgtcag gagatatttg ctatgcattt aactgacata agattagtgc 2040
tagagtttat aatgaggttc ttcaaatcta aaagaaaatg caaagcatat aatagtaagg 2100
ggtgcaggcc aggcgcagtg gctcactctg taatcccagc actttgggag gccgaggtgg 2160
gcggatcaca aggtcaggag ttcgagacca acctggccaa catagtgaaa ccctgtctct 2220
actaaaaata caaaaactag ccaggtgcgg tgtcatgcac ctgtagtccc agctactcgg 2280
gaggccgagg caggagaatc acttgaacct gggaggtgga ggttgcagtg a 2331
<210>40
<211>1071
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR40
<400>40
gctgtgattc aaactgtcag cgagataagg cagcagatca agaaagcact ccgggctcca 60
gaaggagcct tccaggccag ctttgagcat aagctgctga tgagcagtga gtgtcttgag 120
tagtgttcag ggcagcatgt taccattcat gcttgacttc tagccagtgt gacgagaggc 180
tggagtcagg tctctagaga gttgagcagc tccagcctta gatctcccag tcttatgcgg 240
tgtgcccatt cgctttgtgt ctgcagtccc ctggccacac ccagtaacag ttctgggatc 300
tatgggagta gcttccttag tgagctttcc cttcaaatac tttgcaacca ggtagagaat 360
tttggagtga aggttttgtt cttcgtttct tcacaatatg gatatgcatc ttcttttgaa 420
aatgttaaag taaattacct ctcttttcag atactgtctt catgcgaact tggtatcctg 480
tttccatccc agccttctat aacccagtaa catctttttt gaaaccagtg ggtgagaaag 540
acacctggtc aggaacgcgg accacaggac aactcaggct cacccacggc atcagactaa 600
aggcaaacaa ggactctgta taaagtaccg gtggcatgtg tattagtgga gatgcagcct 660
gtgctctgca gacagggagt cacacagaca cttttctata atttcttaag tgctttgaat 720
gttcaagtag aaagtctaac attaaatttg attgaacaat tgtatattca tggaatattt 780
tggaacggaa taccaaaaaa tggcaatagt ggttctttct ggatggaaga caaacttttc 840
ttgtttaaaa taaattttat tttatatatt tgaggttgac cacatgacct taaggataca 900
tatagacagt aaactggtta ctacagtgaa gcaaattaac atatctacca tcgtacatag 960
ttacattttt ttgtgtgaca ggaacagcta aaatctacgt atttaacaaa aatcctaaag 1020
acaatacatt tttattaact atagccctca tgatgtacat tagatctcta a 1071
<210>41
<211>1135
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR41
<400>41
cgtgtgcagt ccacggagag tgtgttctcc tcatcctcgt tccggtggtt gtggcgggaa 60
acgtggcgct gcaggacacc aacatcagtc acgtatttca ttctggaaaa aaaagtagca 120
caagcctcgg ctggttccct ccagctctta ccaggcagcc taagcctagg ctccattccc 180
gctcaaggcc ttcctcaggg gcctgctcac cacaggagct gttcccatgc agggactaag 240
gacatgcagc ctgcatagaa accaagcacc caggaaaaca tgattggatg gagcgggggg 300
gtgtggtctc tagccttgtc cacctccggt cctcatgggt ctcacacctc ctgagaatgg 360
gcaccgcaga ggccacagcc catacagcca agatgacaga ctccgtaagt gacagggatc 420
cacagcagag tgggtgaaat gttccctata aactttacaa aattaatgag ggcaggggga 480
ggggagaaat gaaaatgaac ccagctcgca gcacatcagc atcagtcact aggtcggcgt 540
gctctctgac tgcttcctcg tagctgcttg gtgtctcatt gcctcagaag catgtagacc 600
ctgtcacaag attgtagttc ccctaactgc tccgtagatc acaacttgaa ccttaggaaa 660
tgctgttttc cctttgagat attcctttgg gtcctgtata ctgatggagc tactgactga 720
gctgctccga aggaccccac gaggagctga ctaaaccaag agtgcagttt gtacaccctg 780
atgattacat cccccttgcc ccaccaatca actctcccaa ttttccagcc cctcaccctc 840
cagtcccctt aaaagcccca gcccaggccg ggcacagtgg ctcatgcctg taatcccagc 900
actttgggag gccaaggtgg gcagatcacc tgagggcagg aatttgagac cagcctgacc 960
aacatgaaga aaccccgtct ctattacaaa tacaaaatta gccgggcgtg ttgctgcata 1020
ctggtaatcc cagctacttg ggagggtgag gcaggagaat cacttgaatc tgggaggcgg 1080
aggttgcgat gagccgagac agcgccattg cactgcagcc tgggcaacaa gagca 1135
<210>42
<211>735
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR42
<400>42
aagggtgaga tcactaggga gggaggaagg agctataaaa gaaagaggtc actcatcaca 60
tcttacacac tttttaaaac cttggttttt taatgtccgt gttcctcatt agcagtaagc 120
cctgtggaag caggagtctt tctcattgac caccatgaca agaccctatt tatgaaacat 180
aatagacaca caaatgttta tcggatattt attgaaatat aggaattttt cccctcacac 240
ctcatgacca cattctggta cattgtatga atgaatatac cataatttta cctatggctg 300
tatatttagg tcttttcgtg caggctataa aaatatgtat gggccggtca cagtgactta 360
cgcccgtagt cccagaactt tgggaggccg aggcgggtgg atcacctgag gtcgggagtt 420
caaaaccagc ctgaccaaca tggagaaacc ccgtctctgc taaaaataca aaaattaact 480
ggacacggtg gcgtatgcct gtaatcccag ctactcggga agctgaggca ggagaactgc 540
ttgaacccag gaggcggagg ttgtggtgag tcgagattgc gccattgcac tccagcctgg 600
gcaacaagag cgaaattcca tctcaaaaaa aagaaaaaag tatgactgta tttagagtag 660
tatgtggatt tgaaaaatta ataagtgttg ccaacttacc ttagggttta taccatttat 720
gagggtgtcg gtttc 735
<210>43
<211>1227
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR43
<400>43
caaatagatc tacacaaaac aagataatgt ctgcccattt ttccaaagat aatgtggtga 60
agtgggtaga gagaaatgca tccattctcc ccacccaacc tctgctaaat tgtccatgtc 120
acagtactga gaccaggggg cttattccca gcgggcagaa tgtgcaccaa gcacctcttg 180
tctcaatttg cagtctaggc cctgctattt gatggtgtga aggcttgcac ctggcatgga 240
aggtccgttt tgtacttctt gctttagcag ttcaaagagc agggagagct gcgagggcct 300
ctgcagcttc agatggatgt ggtcagcttg ttggaggcgc cttctgtggt ccattatctc 360
cagcccccct gcggtgttgc tgtttgcttg gcttgtctgg ctctccatgc cttgttggct 420
ccaaaatgtc atcatgctgc accccaggaa gaatgtgcag gcccatctct tttatgtgct 480
ttgggctatt ttgattcccc gttgggtata ttccctaggt aagacccaga agacacagga 540
ggtagttgct ttgggagagt ttggacctat gggtatgagg taatagacac agtatcttct 600
ctttcatttg gtgagactgt tagctctggc cgcggactga attccacaca gctcacttgg 660
gaaaacttta ttccaaaaca tagtcacatt gaacattgtg gagaatgagg gacagagaag 720
aggccctaga tttgtacatc tgggtgttat gtctataaat agaatgcttt ggtggtcaac 780
tagacttgtt catgttgaca tttagtcttg ccttttcggt ggtgatttaa aaattatgta 840
tatcttgttt ggaatatagt ggagctatgg tgtggcattt tcatctggct ttttgtttag 900
ctcagcccgt cctgttatgg gcagccttga agctcagtag ctaatgaaga ggtatcctca 960
ctccctccag agagcggtcc cctcacggct cattgagagt ttgtcagcac cttgaaatga 1020
gtttaaactt gtttattttt aaaacattct tggttatgaa tgtgcctata ttgaattact 1080
gaacaacctt atggttgtga agaattgatt tggtgctaag gtgtataaat ttcaggacca 1140
gtgtctctga agagttcatt tagcatgaag tcagcctgtg gcaggttggg tggagccagg 1200
gaacaatgga gaagctttca tgggtgg 1227
<210>44
<211>1586
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR44
<400>44
cacctgcctc agcctcccaa agtgctgaga ttcaaagaaa ttttcatgga gaggggacag 60
atggagtcaa ttcttgtggg gtgaacatga gtaccacagt tagactgagg ttgggaaaga 120
ttttccagac aattggaaga gcatgtgaaa gacacagatt ttgagaaatg ttaagtctag 180
ggaactgcaa ggcttttggc acaagaaagc cactgtagac tatagaggca ggatgcctag 240
attcaaatcc caactgctac acttctaagc tttgtaattt tggcaagttt ttaccctcta 300
ttttcttatc tataaaatat agattttata tatatagata tagatatata gatagataat 360
aattgtgcat gcctaataaa gttgtcaaag attaaatgtt atatgtgaag tattttgtac 420
ggtgatagga acccaggaag ggctctatga atattatgta ttattattat tctaaagtag 480
ctggaataca atgttcaaag gagatagtgg caggagataa gtttgaattg aaagattgag 540
gccagaacat aaagtgcctc ctatattata ttttacataa ttggaacatc attgaaaaat 600
ttaagtatta tttatgtgtg tatgtgtgtt ttatataatt aattctagtt catcatttta 660
aaatatcttt ctgatgtcac tgtgaacaac agatgagaag aagtgaatcc tgagttaagg 720
agaccagctc tctgattact gccataatcc agggagggta ccataaggat ttcaactgga 780
agtgaatcca tcatgatgga gaggaaggac agggctgaaa aatacttagg aagtagtatc 840
agtaggactg gttaagagag agcagaggca ggctacaggg gttggaggtg tcaatcacag 900
agatagggaa aatgggagga gaagcaggct ttgaaaaagt ggcttgtctt gtaaaattat 960
gtgctgttaa aacagtacaa gaaattaata tattcaatcc caaaatacag ggacaattct 1020
ttttgaaaga gttacccaga tagtcttcct tgaagttttc agttaaagaa atttcttgtt 1080
aacaaataat gtagtcatag aagaaaacac ttaaaacttt attgaataaa gctaataaat 1140
catttaatat aatttatagg aaattgttac ataacacaca cattcaatac tttttgctaa 1200
agtataaatt aatggaagga gagcacgcac acagaggttg aattatgttt atgactttat 1260
tagtcaagaa tacaaaattg agtagctaca tcaagcagaa gcacatgctt tacaatccag 1320
cacagaatcc cttgacatcc aaactcccga aacagacatg taaatacaga tgacattgtc 1380
agaacaaaat agggtctcac ccgacctata atgttctttt cttgatataa atatgcacat 1440
gaattgcata cggtcatatg gttccaatta ccattatttc ctctgggctt agctatccat 1500
ctaaggggaa tttacaccaa cactgtactt ctacttgcaa gaatatatga aagcatagtt 1560
aacttctggc ttaggacccc aactca 1586
<210>45
<211>1981
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR45
<400>45
atggatcata gggtaaataa atttataatt tcttgagaaa gcttcgtact gttttccaag 60
atggctgtac taatttccat tcctaccaac agtgtacagg gtttcttttt ctccacatcc 120
tcaccaacac ttatcttcca tcttttttta taatagccct agtaaaatgt gtgaggtgat 180
atctcattgt ggcattgatt tgcacttctc tgataattag gaatgtttat gattttttca 240
tgtacctggt tggccttttg tatgatgtag gaaatgtcta ttctgattct ttgcttattt 300
tttaataagc atagtttttt tcttattttt gagtaggttg agttgcttat atattattat 360
atgagcccct tacctgatgt atggtttaaa aatattatcc catttgtggg ttctcttaat 420
tctatcattg cttcttttcc tgtggaaaag ttttaagttt tatgcagtct catttgtgtg 480
ttttgctttt gttgcctttt ggaataatct acagaaaatc atagctcagg ccaatgtcat 540
acagtctcct tctatatttc cttgtagtag ttttacattt aaactttaat tttgatttga 600
tgcttgtata aagagcaaaa taaaagtcaa attttattct tctgtatgtg gatagtcagt 660
tttgtctaca ccatttattg aaaataattt tctttcttca ctgtgtattt ttagttattt 720
tatcaaaaaa tcaattgacc acagacacac ggatttattt acaggttcta tatccctttg 780
tactgtttta catgtctgtt tttatgccat tgctatgctg ttttaattcc tatagctttg 840
taatagagtt tggagtcagg tagtctgatg cctccagctt tgttcttttt gttcaagatt 900
gctttggttg gtccaggtct tttgtggttc catacaaatt ttagcagtaa tttttctatt 960
tctgtgaaga atgacattgg aatttgatag tggttgcatt taatctgtag attgctttgg 1020
gtagcattga cacttttaca atactaattt ttgaatccat caatgaagga tgtttctcca 1080
tttatttatg ccattttaat ttttttcatc aatgtgctat agttttcagt atgtaaatct 1140
tttatggttt tgattaaatt tactcctgtc ttttatatat ttatatatct gttttgattc 1200
tattataaat tgaattgcct ttatttttca ggtaatagtt tgtcattagt taatagaaac 1260
aataatgata tttgtatgtt gattttgtaa ctattaactt tattgaattt cttcatcagc 1320
tataaccatt tattttggtg gaatctttaa gattttctct atcttaagat tatattttca 1380
aaaaacagaa acaatcttac ctcttccttc cctatgtgga tttcttttac gtctttgtct 1440
tgtgtaactg ttctggctag gcaattacac ataatgtttt catcatttat aattttacat 1500
cacatccatc tattgtggca cattgattgc tacttttcaa gttgtaaacc tggacattta 1560
tcactactct tcctccaata caggagtcca tggcgtggtg tgggccctac tgtgccacag 1620
tccagggcac ggctgggctg aggttctctt gtgcaagagt ccgtggctct gcggagcaag 1680
agttctccag tgccttagtc cagggttagg caggggtggg gctccttcag tagcttagtc 1740
cagtgcgccg ccctgcgagg gtcctcctga gcaggagtac acgatgaggc agggtcctac 1800
tgtgccttag cccaggaagc ggggggctgg gtcctctggt gccatagtcc aggctgccgg 1860
gagctgggtc ctctggtgcc atagctcagg ccggcgggag ctgggtcctc tggtgccgta 1920
gtccagggtg cagcagaaca ggagtcctgc ggagcagtag tccagggcac gctggggcgt 1980
g 1981
<210>46
<211>1859
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR46
<400>46
attgtttttc tcgcccttct gcattttctg caaattctgt tgaatcattg cagttactta 60
ggtttgcttc gtctccccca ttacaaacta cttactgggt ttttcaaccc tagttccctc 120
atttttatga tttatgctca tttctttgta cacttcgtct tgctccatct cccaactcat 180
ggcccctggc tttggattat tgttttggtc ttttattttt tgtcttcttc tacctcaaca 240
cttatcttcc tctcccagtc tccggtaccc tatcaccaag gttgtcatta acctttcata 300
ttattcctca ttatccatgt attcatttgc aaataagcgt atattaacaa aatcacaggt 360
ttatggagat ataattcaca taccttaaaa ttcaggcttt taaagtgtac ctttcatgtg 420
gtttttggta tattcacaaa gttatgcatt gatcaccacc atctgattcc ataacatgtt 480
caatacctca aaaagaagtc tgtactcatt agtagtcatt tcacattcac cactccctct 540
ggctctgggc agtcactgat ctttgtgtct ctatggattt gcctagtcta ggtattttta 600
tgtaaatggc atcatacaac atgtgacctt ttgtttggct tttttcattt agcaaaatgt 660
tatcaaggtc tgtccctgtt gtagcatgta ttagcacttc atttcttata tgctgaatga 720
tatactttat ttgtccatca gttgttcatg ctttatttgt ccatcagttg atgaacattt 780
gcgtttttgc cactttgggc tattaagaat aatgctactg tgaacaagtg tgtacaagtt 840
cctctacaaa tttttgtgtg gacatatcct ttcagttctc tcaggtgtat atctgggaat 900
tgaattgctg ggtcgtgtag tagctatgtt aaacactttg agaaactgct ataatgttct 960
ccagagctgt accattttaa attctgtgta tgaggattcc acgttctcca cttcctcacc 1020
agtgtatgga tttgggggta tactttttaa aaagtgggat taggctgggc acagtggctc 1080
acacctgtaa tcccaacact tcaggaagct gaggtgggag gatcacttga gcctagtagt 1140
ttgagaccag cctgggcaac atagggagac cctgtctcta caaaaaataa tttaaaataa 1200
attagctggg cgttgtggca cacacctgta gtcccagcta catgggaggc tgaggtggaa 1260
ggattccctg agcccagaag tttgaggttg cagtgagcca tgatggcagc actatactgt 1320
agcctgggtg tcagagcaag actccgtttc agggaagaaa aaaaaaagtg ggatgatatt 1380
tttgacactt ttcttcttgt tttcttaatt tcatacttct ggaaattcca ttaaattagc 1440
tggtaccact ctaactcatt gtgtttcatg gctgcatagt aatattgcat aatataaata 1500
taccattcat tcatcaaagt tagcagatat tgactgttag gtgccaggca ctgctctaag 1560
cgttaaagaa aaacacacaa aaacttttgc attcttagag tttattttcc aatggagggg 1620
gtggagggag gtaagaattt aggaaataaa ttaattacat atatagcata gggtttcacc 1680
agtgagtgca gcttgaatcg ttggcagctt tcttagtagt ataaatacag tactaaagat 1740
gaaattactc taaatggtgt tacttaaatt actggaatag gtattactat tagtcacttt 1800
gcaggtgaaa gtggaaacac catcgtaaaa tgtaaaatag gaaacagctg gttaatgtt 1859
<210>47
<211>1082
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR47
<400>47
atcattagtc attagggaaa tgcaaatgaa aaacacaagc agccaccaat atacacctac 60
taggatgatt taaaggaaaa taagtgtgaa gaaggacgta aagaaattgt aaccctgata 120
cattgatggt agaaatggat aaagttgcag ccactgtgaa aaacagtctg cagtggctca 180
gaaggttaaa tatagaaccc ctgttggacc caggaactct actcttaggc accccaaaga 240
atagagaaca gaaatcaaac agatgtttgt atactaatgt ttgtagcatc acttttcaca 300
ggagccaaaa ggtggaaata atccaaccat cagtgaacaa atgaatgtaa taaaagcaag 360
gtggtctgca tgcaatgcta catcatccat ctgtaaaaaa cgaacatcat tttgatagat 420
gatacaacat gggtggacat tgagaacatt atgcttagtg aaataagcca gacacaaaag 480
gaatatattg tataattgta attacatgaa gtgcctagaa tagtcaaatt catacaagag 540
aaagtgggat aggaatcacc atgggctgga aataggggga aggtgctata ctgcttattg 600
tggacaaggt ttcgtaagaa atcatcaaaa ttgtgggtgt agatagtggt gttggttatg 660
caaccctgtg aatatattga atgccatgga gtgcacactt tggttaaaag gttcaaatga 720
taaatattgt gttatatata tttccccacg atagaaaaca cgcacagcca agcccacatg 780
ccagtcttgt tagctgcctt cctttacctt caagagtggg ctgaagcttg tccaatcttt 840
caaggttgct gaagactgta tgatggaagt catctgcatt gggaaagaaa ttaatggaga 900
gaggagaaaa cttgagaatc cacactactc accctgcagg gccaagaact ctgtctccca 960
tgctttgctg tcctgtctca gtatttcctg tgaccacctc ctttttcaac tgaagacttt 1020
gtacctgaag gggttcccag gtttttcacc tcggcccttg tcaggactga tcctctcaac 1080
ta 1082
<210>48
<211>1242
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR48
<400>48
atcatgtatt tgttttctga attaattctt agatacatta atgttttatg ttaccatgaa 60
tgtgatatta taatataata tttttaattg gttgctactg tttataagaa tttcattttc 120
tgtttacttt gccttcatat ctgaaaacct tgctgatttg attagtgcat ccacaaattt 180
tcttggattt tctatgggta attacaaatc tccacacaat gaggttgcag tgagccaaga 240
tcacaccact gtactccagc ctgggcgaca gagtgagaca ccatctcaca aaaacacata 300
aacaaacaaa cagaaactcc acacaatgac aacgtatgtg ctttcttttt ttcttcctct 360
ttctataata tttctttgtc ctatcttaac tgaactggcc agaaacccca ggacaatgat 420
aaatacgagc agtgtcaaca gacatctcat tccctttcct agcttttata aaaataacga 480
ttatgcttca acattacata tggtggtgtc gatggttttg ttatagataa gcttatcagg 540
ttaagaaatt tgtctgcgtt tcctagtttg gtataaagat tttaatataa atgaatgttg 600
tattttatca tcttattttt ttcctacatc tgctaaggta atcctgtgtt ttcccctttt 660
caatctccta atgtggtgaa tgacattaaa ataccttcta ttgttaaaat attcttgcaa 720
cgctgtatag aaccaatgcc tttattctgt attgctgatg gatttttgaa aaatatgtag 780
gtggacttag ttttctaagg ggaatagaat ttctaatata tttaaaatat tttgcatgta 840
tgttctgaag gacattggtg tgtcatttct ataccatctg gctactagag gagccgactg 900
aaagtcacac tgccggagga ggggagaggt gctcttccgt ttctggtgtc tgtagccatc 960
tccagtggta gctgcagtga taataatgct gcagtgccga cagttctgga aggagcaaca 1020
acagtgattt cagcagcagc agtattgcgg gatccccacg atggagcaag ggaaataatt 1080
ctggaagcaa tgacaatatc agctgtggct atagcagctg agatgtgagt tctcacggtg 1140
gcagcttcaa ggacagtagt gatggtccaa tggcgcccag acctagaaat gcacatttcc 1200
tcagcaccgg ctccagatgc tgagcttgga cagctgacgc ct 1242
<210>49
<211>1015
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR49
<400>49
aaaccagaaa cccaaaacaa tgggagtgac atgctaaaac cagaaaccca aaacaatggg 60
agggtcctgc taaaccagaa acccaaaaca atgggagtga agtgctaaaa ccagaaaccc 120
aaaacaatgg gagtgtcctg ctacaccaga aacccaaaac gatgggagtg acgtgataaa 180
accagacacc caaaacaatg ggagtgacgt gctaaaccag aaacccaaaa caatgggagt 240
gacgtgctaa aacctggaaa cctaaaacaa tgcgagtgag gtgctaacac cagaatccat 300
aacaatgtga gtgacgtgct aaaccagaac ccaaaacaat gggagtgacg tgctaaaaca 360
ggaacccaaa acaatgagag tgacgtgcta aaccagaaac ccaaaacaat gggaatgacg 420
tgctaaaacc ggaacccaaa acaatgggag tgatgtgcta aaccagaaac ccaaaacaat 480
gggaatgaca tgctaaaact ggaacccaaa acaatggtaa ctaagagtga tgctaaggcc 540
ctacattttg gtcacactct caactaagtg agaacttgac tgaaaaggag gatttttttt 600
tctaagacag agttttggtc tgtcccccag agtggagtgc agtggcatga tctcggctca 660
ctgcaagctc tgcctcccgg gttcaggcca ttctcctgcc tcagcctcct gagtagctgg 720
gaatacaggc acccgccacc acacttggct aattttttgt atttttagta gagatggggt 780
ttcaccatat tagcaaggat ggtctcaatc tcctgacctc gtgatctgcc cacctcaggc 840
tcccaaagtg ctgggattac aggtgtgagc caccacaccc agcaaaaagg aggaattttt 900
aaagcaaaat tatgggaggc cattgttttg aactaagctc atgcaatagg tcccaacaga 960
ccaaaccaaa ccaaaccaaa atggagtcac tcatgctaaa tgtagcataa tcaaa 1015
<210>50
<211>2355
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR50
<400>50
caaccatcgt tccgcaagag cggcttgttt attaaacatg aaatgaggga aaagcctagt 60
agctccattg gattgggaag aatggcaaag agagacaggc gtcattttct agaaagcaat 120
cttcacacct gttggtcctc acccattgaa tgtcctcacc caatctccaa cacagaaatg 180
agtgactgtg tgtgcacatg cgtgtgcatg tgtgaaagta tgagtgtgaa tgtgtctata 240
tgggaacata tatgtgattg tatgtgtgta actatgtgtg actggcagcg tggggagtgc 300
tggttggagt gtggtgtgat gtgagtatgc atgagtggct gtgtgtatga ctgtggcggg 360
aggcggaagg ggagaagcag caggctcagg tgtcgccaga gaggctggga ggaaactata 420
aacctgggca atttcctcct catcagcgag cctttcttgg gcaatagggg cagagctcaa 480
agttcacaga gatagtgcct gggaggcatg aggcaaggcg gaagtactgc gaggaggggc 540
agagggtctg acacttgagg ggttctaatg ggaaaggaaa gacccacact gaattccact 600
tagccccaga ccctgggccc agcggtgccg gcttccaacc ataccaacca tttccaagtg 660
ttgccggcag aagttaacct ctcttagcct cagtttcccc acctgtaaaa tggcagaagt 720
aaccaagctt accttcccgg cagtgtgtga ggatgaaaag agctatgtac gtgatgcact 780
tagaagaagg tctagggtgt gagtggtact cgtctggtgg gtgtggagaa gacattctag 840
gcaatgagga ctggggagag cctggcccat ggcttccact cagcaaggtc agtctcttgt 900
cctctgcact cccagccttc cagagaggac cttcccaacc agcactcccc acgctgccag 960
tcacacatag ttacacacat acaatcacat atatgttccc atatagacac attcacactc 1020
ataccttcac acatgcacac gcatgtgcac acacagtcac tcatttctgt gttggagatt 1080
gggtgaggac attcaatggg tgaggaccaa caggtgtgaa gattgctttc tagaaaatga 1140
ctcctgtctc tctttgccat tcttcccaat ccgatggagc tactaggctt ttccctcatt 1200
tcatgtttaa taaaccttcc caatggcgaa atgggctttc tcaagaagtg gtgagtgtcc 1260
catccctgcg gtggggacag gggtggcagc ggacaagcct gcctggaggg aactgtcagg 1320
ctgattccca gtccaactcc agcttccaac acctcatcct ccaggcagtc ttcattcttg 1380
gctctaattt cgctcttgtt ttctttttta tttttatcga gaactgggtg gagagctttt 1440
ggtgtcattg gggattgctt tgaaaccctt ctctgcctca cactgggagc tggcttgagt 1500
caactggtct ccatggaatt tcttttttta gtgtgtaaac agctaagttt taggcagctg 1560
ttgtgccgtc cagggtggaa agcagcctgt tgatgtggaa ctgcttggct cagatttctt 1620
gggcaaacag atgccgtgtc tctcaactca ccaattaaga agcccagaaa atgtggcttg 1680
gagaccacat gtctggttat gtctagtaat tcagatggct tcacctggga agccctttct 1740
gaatgtcaaa gccatgagat aaaggacata tatatagtag ctagggtggt ccacttctta 1800
ggggccatct ccggaggtgg tgagcactaa gtgccaggaa gagaggaaac tctgttttgg 1860
agccaaagca taaaaaaacc ttagccacaa accactgaac atttgttttg tgcaggttct 1920
gagtccaggg agggcttctg aggagagggg cagctggagc tggtaggagt tatgtgagat 1980
ggagcaaggg ccctttaaga ggtgggagca gcatgagcaa aggcagagag gtggtaatgt 2040
ataaggtatg tcatgggaaa gagtttggct ggaacagagt ttacagaata gaaaaattca 2100
acactattaa ttgagcctct actacgtgct cgacattgtt ctagtcactg agataggttt 2160
ggtatacaaa acaaaatcca tcctctatgg acattttagt gactaacaac aatataaata 2220
ataaaagtga acaaaagctc aaaacatgcc aggcactatt atttatttat ttatttattt 2280
atttatttat tttttgaaac agagtctcgc tctgttgccc aggctggagt gtagtggtgc 2340
gatctcggct cactg 2355
<210>51
<211>2289
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR51
<400>51
tcacaggtga caccaatccc ctgaccacgc tttgagaagc actgtactag attgactttc 60
taatgtcagt cttcattttc tagctctgtt acagccatgg tctccatatt atctagtaca 120
acacacatac aaatatgtgt gatacagtat gaatataata taaaaatatg tgttataata 180
taaatataat attaaaatat gtctttatac tagataataa tacttaataa cgttgagtgt 240
ttaactgctc taagcacttt acctgcagga aacagttttt tttttatttt ggtgaaatac 300
aactaacata aatttattta caattttaag catttttaag tgtatagttt agtggagtta 360
atatattcaa aatgttgtgc agccgtcacc atcatcagtc ttcataactc ttttcatatt 420
gtaaaattaa aagtttatgc tcatttaaaa atgactccca atttcccccc tcctcaacct 480
ctggaaacta ccattctatt ttctgcctcc gtagttttgc ccactctaag tacctcacat 540
aagtggaatt tgtcttattt gcctgtttgt gaccggctga tttcatttag tataatgtcc 600
tcaagtttta ttcacgttat atagcatatg tcataatttt cttcactttt aagcttgagt 660
aatatttcat cgtatgtatc tcacattttg cttatccatt catctctcag tggacacttg 720
agttgcttct acattttagc tgttgtgaat actgctgcta tgaacatggg tgtataaata 780
tctcaagacc tttttatcag ttttttaaaa tatatactca gtagtagttt agctggatta 840
tatggtaatt ttatttttaa tttttgagga actgtcctac ccttttattc aatagtagct 900
ataccaattg acaattggca ttcctaccaa cagggcataa gggttctcaa ttctccacat 960
attccctgat acttgttatt ttcaggtgtt tttttttttt tttttttttt atgggagcca 1020
tgttaatggg tgtaaggtga tatttcatta tagttttgat ttgcatttcc ctaatgatta 1080
gtgatgttaa gcatctcttc atgtgcctat tggccatttg tatatcttct ttaaaaatat 1140
atatatactc attcctttgc ccatttttga attatgttta ttttttgtta ttgagtttca 1200
atacttttct atataaccta ggtattaatc ctttatcaga cttaagattt gcaaatattc 1260
tctttcattc cacaggttgc taattctctc tgttggtaat atcttttgat gctgttgtgt 1320
ccagaattga ttcattcctg tgggttcttg gtctcactga cttcaagaat aaagctgcgg 1380
accctagtgg tgagtgttac acttcttata gatggtgttt ccggagtttg ttccttcaga 1440
tgtgtccaga gtttcttcct tccaatgggt tcatggtctt gctgacttca ggaatgaagc 1500
cgcagacctt cgcagtgagg tttacagctc ttaaaggtgg cgtgtccaga gttgtttgtt 1560
ccccctggtg ggttcgtggt cttgctgact tcaggaatga agccgcagac cctcgcagtg 1620
agtgttacag ctcataaagg tagtgcggac acagagtgag ctgcagcaag atttactgtg 1680
aagagcaaaa gaacaaagct tccacagcat agaaggacac cccagcgggt tcctgctgct 1740
ggctcaggtg gccagttatt attcccttat ttgccctgcc cacatcctgc tgattggtcc 1800
attttacaga gtactgattg gtccatttta cagagtgctg attggtgcat ttacaatcct 1860
ttagctagac acagagtgct gattgctgca ttcttacaga gtgctgattg gtgcatttac 1920
agtcctttag ctagatacag aacgctgatt gctgcgtttt ttacagagtg ctgattggtg 1980
catttacaat cctttagcta gacacagtgc tgattggtgg gtttttacag agtgctgatt 2040
ggtgcgtctt tacagagtgc tgattggtgc atttacaatc ctttagctag acacagagtg 2100
ctgattggtg cgtttataat cctctagcta gacagaaaag ttttccaagt ccccacctga 2160
ccgagaagcc ccactggctt cacctctcac tgttatactt tggacatttg tccccccaaa 2220
atctcatgtt gaaatgtaac ccctaatgtt ggaactgagg ccagactgga tgtggctggg 2280
ccatgggga 2289
<210>52
<211>1184
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR52
<400>52
ctcttctttg tttttttatt ttggggtgtg tgggtacgtg taagatgaga aatgtacaaa 60
cacaagtatt tcagaaactc caagtaatat tctgtctgtg agttcacggt aaataaataa 120
aaagggcaaa gtgacagaaa tacaggatta ttaaaagcaa aataatgttc tttgaaatcc 180
cccccttggt gtatttttta tcttaggatg cagcactttc agcatgccca agtattgaaa 240
gcagtgtttt tacgctacca cggtaatttt atttagaaac cccatgttca cttttagttt 300
taaaatggtc tttatgacat aaaattatca gcattcatat ttttgtgttt taatattcct 360
ttggctactt attgaaacag taaacattac gaaaattagt aaacaaatct ttgatagttg 420
cttatttttg tttaattgaa tgtttatttt attaggtaaa tatacaatca aatttattta 480
aaaataatga ggaaaagaat acttttcttt cgctttgcga aagcaaagtg atttttcatt 540
cttctccgtc cgattccttc tcttccagct gccacagccg actgacaggc tcccggcggc 600
ctgaggagta gtatgcaaat tttggatgat tgacacctac agtagaagcc aatcacgtca 660
aagtaggatg ctgattggtt gacaacaata ggcgtaaacc ttgacgtttt aaaaacctga 720
cacccaatcc aggcgattca tgcaaataaa ggaagggagt cacattacca ggggccagag 780
agacttgagt acgacctcac gtgttcagtg gtggatattg cacagacgtc tgcaaggtct 840
atataaacgc tacataatgt tcaactcaat tgcttgcctt ggcctttccc aaacttgtca 900
ctggaatata aattatccct tttttaaaaa taaaaaaata agaattatgt agtgcacata 960
tatgatggtt catgtagaaa tctaaatgga cttccaacgc atggaatttt cctatttccc 1020
cctttcttta aattaatcct cagtgaagga ggctgttttc ccctagattt caaaaggacg 1080
agatttacag agcctttcct tggagaaacc cgctctaggc acagatggtc agtaaattta 1140
gcttcttcag cgaagttcca catggcaccg ccagatggca taag 1184
<210>53
<211>1431
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR53
<400>53
ccctgaggaa gatgacgagt aactccgtaa gagaaccttc cactcatccc ccacatccct 60
gcagacgtgc tattctgtta tgatactggt atcccatctg tcacttgctc cccaaatcat 120
tcccttctta caattttcta ctgtacagca ttgaggctga acgatgagag atttcccatg 180
ctctttctac tccctgccct gtatatatcc ggggatcctc cctacccagg atgctgtggg 240
gtcccaaacc ccaagtaagc cctgatatgc gggccacacc tttctctagc ctaggaattg 300
ataacccagg cgaggaagtc actgtggcat gaacagatgg ttcacttcga ggaaccgtgg 360
aaggcgtgtg caggtcctga gatagggcag aatcggagtg tgcagggtct gcaggtcagg 420
aggagttgag attgcgttgc cacgtggtgg gaactcactg ccacttattt ccttctctct 480
tcttgcctca gcctcaggga tacgacacat gcccatgatg agaagcagaa cgtggtgacc 540
tttcacgaac atgggcatgg ctgcggaccc ctcgtcatca ggtgcatagc aagtgaaagc 600
aagtgttcac aacagtgaaa agttgagcgt catttttctt agtgtgccaa gagttcgatg 660
ttagcgttta cgttgtattt tcttacactg tgtcattctg ttagatacta acattttcat 720
tgatgagcaa gacatactta atgcatattt tggtttgtgt atccatgcac ctaccttaga 780
aaacaagtat tgtcggttac ctctgcatgg aacagcatta ccctcctctc tccccagatg 840
tgactactga gggcagttct gagtgtttaa tttcagattt tttcctctgc atttacacac 900
acacgcacac aaaccacacc acacacacac acacacacac acacacacac acacacacac 960
acacaccaag taccagtata agcatctgcc atctgctttt cccattgcca tgcgtcctgg 1020
tcaagctccc ctcactctgt ttcctggtca gcatgtactc ccctcatccg attcccctgt 1080
agcagtcact gacagttaat aaacctttgc aaacgttccc cagttgtttg ctcgtgccat 1140
tattgtgcac acagctctgt gcacgtgtgt gcatatttct ttaggaaaga ttcttagaag 1200
tggaattgct gtgtcaaagg agtcatttat tcaacaaaac actaatgagt gcgtcctcgt 1260
gctgagcgct gttctaggtg ctggagcgac gtcagggaac aaggcagaca ggagttcctg 1320
acccccgttc tagaggagga tgtttccagt tgttgggttt tgtttgtttg tttcttctag 1380
agatggtggt cttgctctgt ccaggctaga gtgcagtggc atgatcatag c 1431
<210>54
<211>975
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR54
<400>54
ccataaaagt gtttctaaac tgcagaaaaa tccccctaca gtcttacagt tcaagaattt 60
tcagcatgaa atgcctggta gattacctga ctttttttgc caaaaataag gcacagcagc 120
tctctcctga ctctgacttt ctatagtcct tactgaatta tagtccttac tgaattcatt 180
cttcagtgtt gcagtctgaa ggacacccac attttctctt tgtctttgtc aattctttgt 240
gttgtaaggg caggatgttt aaaagttgaa gtcattgact tgcaaaatga gaaatttcag 300
agggcatttt gttctctaga ccatgtagct tagagcagtg ttcacactga ggttgctgct 360
aatgtttctg cagttcttac caatagtatc atttacccag caacaggata tgatagagga 420
cttcgaaaac cccagaaaat gttttgccat atatccaaag ccctttggga aatggaaagg 480
aattgcgggc tcccattttt atatatggat agatagagac caagaaagac caaggcaact 540
ccatgtgctt tacattaata aagtacaaaa tgttaacatg taggaagtct aggcgaagtt 600
tatgtgagaa ttctttacac taattttgca acattttaat gcaagtctga aattatgtca 660
aaataagtaa aaatttttac aagttaagca gagaataaca atgattagtc agagaaataa 720
gtagcaaaat cttcttctca gtattgactt ggttgctttt caatctctga ggacacagca 780
gtcttcgctt ccaaatccac aagtcacatc agtgaggaga ctcagctgag actttggcta 840
atgttggggg gtccctcctg tgtctcccca ggcgcagtga gcctgcaggc cgacctcact 900
cgtggcacac aactaaatct ggggagaagc aacccgatgc cagcatgatg cagatatctc 960
agggtatgat cggcc 975
<210>55
<211>501
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR55
<400>55
cctgaactca tgatccgccc acctcagcct cctgaagtgc tgggattaca ggtgtgagcc 60
accacaccca gccgcaacac actcttgagc aaccaatgtg tcataaaaga aataaaatgg 120
aaatcagaaa gtatcttgag acagacaaaa atggaaacac aacataccaa aatttatggg 180
acacagcaaa agcagtttta ggagggaagt ttatagtgat gaatacctac ctcaaaatca 240
ttagcctgat tggatgacac tacagtgtat aaatgaattg aaaaccacat tgtgccccat 300
acatatatac aatttttatt tgttaattaa aaataaaata aaactttaaa aaagaagaaa 360
gagctcaaat aaacaaccta actttatacc tcaaggaaat agaagagcca gctaagccca 420
aagttgacag aaggaaaaaa atattggcag aaagaaatga aacagagact agaaagacaa 480
ttgaagagat cagcaaaact a 501
<210>56
<211>741
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR56
<400>56
acacaggaaa agatcgcaat tgttcagcag agctttgaac cggggatgac ggtctccctc 60
gttgcccggc aacatggtgt agcagccagc cagttatttc tctggcgtaa gcaataccag 120
gaaggaagtc ttactgctgt cgccgccgga gaacaggttg ttcctgcctc tgaacttgct 180
gccgccatga agcagattaa agaactccag cgcctgctcg gcaagaaaac gatggaaaat 240
gaactcctca aagaagccgt tgaatatgga cgggcaaaaa agtggatagc gcacgcgccc 300
ttattgcccg gggatgggga gtaagcttag tcagccgttg tctccgggtg tcgcgtgcgc 360
agttgcacgt cattctcaga cgaaccgatg actggatgga tggccgccgc agtcgtcaca 420
ctgatgatac ggatgtgctt ctccgtatac accatgttat cggagagctg ccaacgtatg 480
gttatcgtcg ggtatgggcg ctgcttcgca gacaggcaga acttgatggt atgcctgcga 540
tcaatgccaa acgtgtttac cggatcatgc gccagaatgc gctgttgctt gagcgaaaac 600
ctgctgtacc gccatcgaaa cgggcacata caggcagagt ggccgtgaaa gaaagcaatc 660
agcgatggtg ctctgacggg ttcgagttct gctgtgataa cggagagaga ctgcgtgtca 720
cgttcgcgct ggactgctgt g 741
<210>57
<211>1365
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR57
<400>57
tccttctgta aataggcaaa atgtatttta gtttccacca cacatgttct tttctgtagg 60
gcttgtatgt tggaaatttt atccaattat tcaattaaca ctataccaac aatctgctaa 120
ttctggagat gtggcagtga ataaaaaagt tatagtttct gattttgtgg agcttggact 180
ttaatgatgg acaaaacaac acattcttaa atatatattt catcaaaatt atagtgggtg 240
aattatttat atgtgcattt acatgtgtat gtatacataa atgggcggtt actggctgca 300
ctgagaatgt acacgtggcg cgaacgaggc tgggcggtca gagaaggcct cccaaggagg 360
tggctttgaa gctgagtggt gcttccacgt gaaaaggctg gaaagggcat tccaagaaaa 420
ggctgaggcc agcgggaaag aggttccagt gcgctctggg aacggaaagc gcacctgcct 480
gaaacgaaaa tgagtgtgct gaaataggac gctagaaagg gaggcagagg ctggcaaaag 540
cgaccgagga ggagctcaaa ggagcgagcg gggaaggccg ctgtggagcc tggaggaagc 600
acttcggaag cgcttctgag cgggtaaggc cgctgggagc atgaactgct gagcaggtgt 660
gtccagaatt cgtgggttct tggtctcact gacttcaaga atgaagaggg accgcggacc 720
ctcgcggtga gtgttacagc tcttaaggtg gcgcgtctgg agtttgttcc ttctgatgtt 780
cggatgtgtt cagagtttct tccttctggt gggttcgtgg tctcgctggc tcaggagtga 840
agctgcagac cttcgcggtg agtgttacag ctcataaaag cagggtggac tcaaagagtg 900
agcagcagca agatttattg caaagaatga aagaacaaag cttccacact gtggaagggg 960
accccagcgg gttgccactg ctggctccgc agcctgcttt tattctctta tctggcccca 1020
cccacatcct gctgattggt agagccgaat ggtctgtttt gacggcgctg attggtgcgt 1080
ttacaatccc tgcgctagat acaaaggttc tccacgtccc caccagatta gctagataga 1140
gtctccacac aaaggttctc caaggcccca ccagagtagc tagatacaga gtgttgattg 1200
gtgcattcac aaaccctgag ctagacacag ggtgatgact ggtgtgttta caaaccttgc 1260
ggtagataca gagtatcaat tggcgtattt acaatcactg agctaggcat aaaggttctc 1320
caggtcccca ccagactcag gagcccagct ggcttcaccc agtgg 1365
<210>58
<211>1401
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR58
<400>58
aagtttacct tagccctaaa ttatttcatt gtgattggca ttttaggaaa tatgtattaa 60
ggaatgtctc ttaggagata aggataacat atgtctaaga aaattatatt gaaatattat 120
tacatgaact aaaatgttag aactgaaaaa aaattattgt aactccttcc agcgtaggca 180
ggagtatcta gataccaact ttaacaactc aactttaaca acttcgaacc aaccagatgg 240
ctaggagatt cacctattta gcatgatatc ttttattgat aaaaaaatat aaaacttcca 300
ttaaattttt aagctactac aatcctatta aattttaact taccagtgtt ctcaatgcta 360
cataatttaa aatcattgaa atcttctgat tttaactcct cagtcttgaa atctacttat 420
ttttagttac atatatatcc aatctactgc cgctagtaga agaagcttgg aatttgagaa 480
aaaaatcaga cgttttgtat attctcatat tcactaattt attttttaaa tgagtttctg 540
caatgcatca agcagtggca aaacaggaga aaaattaaaa ttggttgaaa agatatgtgt 600
gccaaacaat cccttgaaat ttgatgaagt gactaatcct gagttattgt ttcaaatgtg 660
tacctgttta tacaagggta tcacctttga aatctcaaca ttaaatgaaa ttttataagc 720
aatttgttgt aacatgatta ttataaaatt ctgatataac attttttatt acctgtttag 780
agtttaaaga gagaaaagga gttaagaata attacatttt cattagcatt gtccgggtgc 840
aaaaacttct aacactatct tcaaatcttt ttctccattg ccttctgaac atacccactt 900
gggtatctca ttagcactgc aaattcaaca ttttcgattg ctaatttttc tccctaaata 960
tttatttgtt ttctcagctt tagccaatgt ttcactattg accatttgct caagtatagt 1020
gacgcttcaa tgaccttcag agagctgttt cagtccttcc tggactactt gcatgcttcc 1080
aacaaaatga agcactcttg atgtcagtca ctcaaataaa tggaaatggg cccatttact 1140
aggaatgtta acagaataaa aagatagacg tgacaccagt tgcttcagtc catctccatt 1200
tacttgctta aggcctggcc atatttctca cagttgatat ggcgcagggc acatgtttaa 1260
atggctgttc ttgtaggatg gtttgactgt tggattcctc atcttccctc tccttaggaa 1320
ggaaggttac agtagtactg ttggctcctg gaatatagat tcataaagaa ctaatggagt 1380
atcatctccc actgctcttg t 1401
<210>59
<211>866
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR59
<400>59
gagatcacgc cactgcactc cagcctgggg gacagagcaa gactccatct cagaaacaaa 60
caaacacaca aagccagtca aggtgtttaa ttcgacggtg tcaggctcag gtctcttgac 120
aggatacatc cagcacccgg gggaaacgtc gatgggtggg gtggaatcta ttttgtggcc 180
tcaagggagg gtttgagagg tagtcccgca agcggtgatg gcctaaggaa gcccctccgc 240
ccaagaagcg atattcattt ctagcctgta gccacccaag agggagaatc gggctcgcca 300
cagaccccac aacccccaac ccaccccacc cccacccctc ccacctcgtg aaatgggctc 360
tcgctccgtc aggctctagt cacaccgtgt ggttttggaa cctccagcgt gtgtgcgtgg 420
gttgcgtggt ggggtggggc cggctgtgga cagaggaggg gataaagcgg cggtgtcccg 480
cgggtgcccg ggacgtgggg cgtggggcgt gggtggggtg gccagagcct tgggaactcg 540
tcgcctgtcg ggacgtctcc cctcctggtc ccctctctga cctacgctcc acatcttcgc 600
cgttcagtgg ggaccttgtg ggtggaagtc accatccctt tggactttag ccgacgaagg 660
ccgggctccc aagagtctcc ccggaggcgg ggccttgggc aggctcacaa ggatgctgac 720
ggtgacggtt ggtgacggtg atgtacttcg gaggcctcgg gccaatgcag aggtatccat 780
ttgacctcgg tgggacaggt cagctttgcg gagtcccgtg cgtccttcca gagactcatc 840
cagcgctagc aagcatggtc ccgagg 866
<210>60
<211>2067
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR60
<220>
<221>misc_feature
<222>(92)..(1777)
<223>n is a, c, g, or t at various positions
<400>60
agcagtgcag aactggggaa gaagaagagt ccctacacca cttaatactc aaaagtactc 60
gcaaaaaata acacccctca ccaggtggca tnattactct ccttcattga gaaaattagg 120
aaactggact tcgtagaagc taattgcttt atccagagcc acctgcatac aaacctgcag 180
cgccacctgc atacaaacct gtcagccgac cccaaagccc tcagtcgcac caagcctctg 240
ctgcacaccc tcgtgccttc acactggccg ttccccaagc ctggggcata ctncccagct 300
ctgagaaatg tattcatcct tcaaagccct gctcatgtgt cctnntcaac aggaaaatct 360
cccatgagat gctctgctat ccccatctct cctgccccat agcttaggca nacttctgtg 420
gtggtgagtc ctgggctgtg ctgtgatgtg ttcgcctgcn atgtntgttc ttccccacaa 480
tgatgggccc ctgaattctc tatctctagc acctgtgctc agtaaaggct tgggaaacca 540
ggctcaaagc ctggcccaga tgccaccttt tccagggtgc ttccgggggc caccaaccag 600
agtgcagcct tctcctccac caggaactct tgcagcccca cccctgagca cctgcacccc 660
attacccatc tttgtttctc cgtgtgatcg tattattaca gaattatata ctgtattctt 720
aatacagtat ataattgtat aattattctt aatacagtat ataattatac aaatacaaaa 780
tatgtgttaa tggaccgttt atgttactgg taaagcttta agtcaacagt gggacattag 840
ttaggttttt ggcgaagtca aaagttatat gtgcattttc aacttcttga ggggtcggta 900
cntctnaccc ccatgttgtt caanggtcaa ctgtctacac atatcatagc taattcacta 960
cagaaatgtt agcttgtgtc actagtatct ccccttctca taagcttaat acacatacct 1020
tgagagagct cttggccatc tctactaatg actgaagttt ttatttatta tagatgtcat 1080
aataggcata aaactacatt acatcattcg agtgccaatt ttgccacctt gaccctcttt 1140
tgcaaaacac caacgtcagt acacatatga agaggaaact gcccgagaac tgaagttcct 1200
gagaccagga gctgcaggcg ttagatagaa tatggtgacg agagttacga ggatgacgag 1260
agtaaatact tcatactcag tacgtgccaa gcactgctat aagcgctctg tatgtgtgaa 1320
gtcatttaat cctcacagca tcccacggtg taattatttt cattatcccc atgagggaac 1380
agaaactcag aacggttcaa cacatatgcg agaagtcgca gccggtcagt gagagagcag 1440
gttcccgtcc aagcagtcag accccgagtg cacactctcg acccctgtcc agcagactca 1500
ctcgtcataa ggcggggagt gntctgtttc agccagatgc tttatgcatc tcagagtacc 1560
caaaccatga aagaatgagg cagtattcan gagcagatgg ngctgggcag taaggctggg 1620
cttcagaata gctggaaagc tcaagtnatg ggacctgcaa gaaaaatcca ttgtttngat 1680
aaatagccaa agtccctagg ctgtaagggg aaggtgtgcc aggtgcaagt ggagctctaa 1740
tgtaaaatcg cacctgagtc tcctggtctt atgagtnctg ggtgtacccc agtgaaaggt 1800
cctgctgcca ccaagtgggc catggttcag ctgtgtaagt gctgagcggc agccggaccg 1860
cttcctctaa cttcacctcc aaaggcacag tgcacctggt tcctccagca ctcagctgcg 1920
aggcccctag ccagggtccc ggcccccggc ccccggcagc tgctccagct tccttcccca 1980
cagcattcag gatggtctgc gttcatgtag acctttgttt tcagtctgtg ctccgaggtc 2040
actggcagca ctagccccgg ctcctgt 2067
<210>61
<211>1470
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR61
<220>
<221>misc_feature
<222>(130)..(976)
<223>n is a, c, g, or t at various positions
<400>61
cagcccccac atgcccagcc ctgtgctcag ctctgcagcg gggcatggtg ggcagagaca 60
cagaggccaa ggccctgctt cggggacggt gggcctggga tgagcatggc cttggccttc 120
gccgagagtn ctcttgtgaa ggaggggtca ggaggggctg ctgcagctgg ggaggagggc 180
gatggcactg tggcangaag tgaantagtg tgggtgcctn gcaccccagg cacggccagc 240
ctggggtatg gacccggggc cntctgttct agagcaggaa ggtatggtga ggacctcaaa 300
aggacagcca ctggagagct ccaggcagag gnacttgaga ggccctgggg ccatcctgtc 360
tcttttctgg gtctgtgtgc tctgggcctg ggcccttcct ctgctccccc gggcttggag 420
agggctggcc ttgcctcgtg caaaggacca ctctagactg gtaccaagtc tggcccatgg 480
cctcctgtgg gtgcaggcct gtgcgggtga cctgagagcc agggctggca ggtcagagtc 540
aggagaggga tggcagtgga tgccctgtgc aggatctgcc taatcatggt gaggctggag 600
gaatccaaag tgggcatgca ctctgcactc atttctttat tcatgtgtgc ccatcccaac 660
aagcagggag cctggccagg agggcccctg ggagaaggca ctgatgggct gtgttccatt 720
taggaaggat ggacggttgt gagacgggta agtcagaacg ggctgcccac ctcggccgag 780
agggccccgt ggtgggttgg caccatctgg gcctggagag ctgctcagga ggctctctag 840
ggctgggtga ccaggnctgg ggtacagtag ccatgggagc aggtgcttac ctggggctgt 900
ccctgagcag gggctgcatt gggtgctctg tgagcacaca cttctctatt cacctgagtc 960
ccnctgagtg atgagnacac ccttgttttg cagatgaatc tgagcatgga gatgttaagt 1020
ggcttgcctg agccacacag cagatggatg gtgtagctgg gacctgaggg caggcagtcc 1080
cagcccgagg acttcccaag gttgtggcaa actctgacag catgacccca gggaacaccc 1140
atctcagctc tggtcagaca ctgcggagtt gtgttgtaac ccacacagct ggagacagcc 1200
accctagccc cacccttatc ctctcccaaa ggaacctgcc ctttcccttc attttcctct 1260
tactgcattg agggaccaca cagtgtggca gaaggaacat gggttcagga cccagatgga 1320
cttgcttcac agtgcagccc tcctgtcctc ttgcagagtg cgtcttccac tgtgaagttg 1380
ggacagtcac accaactcaa tactgctggg cccgtcacac ggtgggcagg caacggatgg 1440
cagtcactgg ctgtgggtct gcagaggtgg 1470
<210>62
<211>1011
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR62
<400>62
agtgtcaaat agatctacac aaaacaagat aatgtctgcc catttttcca aagataatgt 60
ggtgaagtgg gtagagagaa atgcatccat tctccccacc caacctctgc taaattgtcc 120
atgtcacagt actgagacca gggggcttat tcccagcggg cagaatgtgc accaagcacc 180
tcttgtctca atttgcagtc taggccctgc tatttgatgg tgtgaaggct tgcacctggc 240
atggaaggtc cgttttgtac ttcttgcttt agcagttcaa agagcaggga gagctgcgag 300
ggcctctgca gcttcagatg gatgtggtca gcttgttgga ggcgccttct gtggtccatt 360
atctccagcc cccctgcggt gttgctgttt gcttggcttg tctggctctc catgccttgt 420
tggctccaaa atgtcatcat gctgcacccc aggaagaatg tgcaggccca tctcttttat 480
gtgctttggg ctattttgat tccccgttgg gtatattccc taggtaagac ccagaagaca 540
caggaggtag ttgctttggg agagtttgga cctatgggta tgaggtaata gacacagtat 600
cttctctttc atttggtgag actgttagct ctggccgcgg actgaattcc acacagctca 660
cttgggaaaa ctttattcca aaacatagtc acattgaaca ttgtggagaa tgagggacag 720
agaagaggcc ctagatttgt acatctgggt gttatgtcta taaatagaat gctttggtgg 780
tcaactagac ttgttcatgt tgacatttag tcttgccttt tcggtggtga tttaaaaatt 840
atgtatatct tgtttggaat atagtggagc tatggtgtgg cattttcatc tggctttttg 900
tttagctcag cccgtcctgt tatgggcagc cttgaagctc agtagctaat gaagaggtat 960
cctcactccc tccagagagc ggtcccctca cggctcattg agagtttgtc a 1011
<210>63
<211>1410
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR63
<400>63
ccacagcctg atcgtgctgt cgatgagagg aatctgctct aagggtctga gcggagggag 60
atgccgaagc tttgagcttt ttgtttctgg cttaaccttg gtggattttc accctctggg 120
cattacctct tgtccagggg aggggctggg ggagtgcctg gagctgtagg gacagagggc 180
tgagtggggg ggactgcttg ggctgaccac ataatattct gctgcgtatt aatttttttt 240
tgagacagtc tttctctgtt gcccaggctg gagtgtaatg gcttgatagc tcactgccac 300
ctccgcctcc tgggttcaag tgattctcct gcttcagctt ccggagtagc tgggactgca 360
ggtgcccgcc accatggctg gctaattttt gtatttttat tagcaatggg gttttgctat 420
gttgcccagg ccggtcccga actcctgccc tcaagtgata cacctgcctc ggcctcccaa 480
agtgctggga ttagaggctt gagccactgc gcctggccag ctgcatattg ttaattagac 540
ataaaatgca aaataagatg atataaacac aaaggtgtga aataagatgg acacctgctg 600
agcgcgcctg tcctgaagca tcgcccctct gcaaaagcag gggtcagcat gtgttctccg 660
gtccttgctc ttacagagga gtgagctgcc tatgcgtctt ccagccactt cctgggctgc 720
tcagaggcct ctcacgggtg ttctgggttg ctgccacttg caggggtgct gaggcggggc 780
tcctcccgtg cggggcatgt ccaggccgcc ctctctgaag gcttggcagg tacaggtggg 840
agtgggggtc tctgggctgc tgtggggact gggcaggctc ctggaagacc tccctgtgtt 900
tgggctgaaa gcgcagcccg aggggaggtc cccagggagg ccgctgtcgg gggtgggggc 960
ttggaggagg gaggggccga ggagccggcg acactccgtg acggcccagg aacgtcccta 1020
aacaaggcgc cgcgttctcg atggggtggg gtccgctttc ttttctcaaa agctgcagtt 1080
actccatgct cggaggactg gcgtccgcgc cctgttccaa tgctgccccg gggccctggc 1140
cttggggaat cggggccttg gactggaccc tgggggcttc gcggagccgg gcctggcggg 1200
gcgagcggag cagaggctgg gcagccccgg ggaagcgctc gccaaagccg ggcgctgctc 1260
ccagagcgcg aggtgcagaa ccagaggctg gtcccgcggc gctaacgaga gaagaggaag 1320
cgcgctgtgt agagggcgcc caccccgtgg ggcgaacccc cttcctcaac tccatggacg 1380
gggctcatgg gttcccagcg gctcagacgc 1410
<210>64
<211>1414
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR64
<400>64
tggatcagat ttgttttata ccctcccttc tactgctctg agagttgtac atcacagtct 60
actgtatctg tttcccatta ttataatttt tttgcactgt gcttgcctga agggagcctc 120
aagttcatga gtctccctac cctcctccca aatgagacat ggacctttga atgctttcct 180
gggaccacca ccccaccttt catgctgctg ttatccagga ttttagttca acagtgtttt 240
aaccccccaa atgagtcatt tttattgttt cgtatagtga atgtgtattt gggtttgctt 300
atatggtgac ctgtttattt gctcctcatt gtacctcatg ctctgctctt tccttctaga 360
ttcagtctct ttcctaatga ggtgtctcgc agcaattctt tacaagacag ccaagatagg 420
ccagctctca gagcacttgt tgtctgaaaa agtcttgtct tatttaattt ctttttctta 480
gagatggggt ctcattatgt tacccacact ggtctcaaac ttctggctta aagcggtcct 540
cccaccttgg cctcccaaag tgctaggatt acaggcgtga gcgacctcgt ccagcctgtc 600
tgagaaagcg tttgttttgc ccttgctctc agatgacagt ttggggatag aattctaggt 660
ggacggtttt tttccttcag ccctttgaag agtctgtatt ttcattatct ccctgcatta 720
gatgttcttt tgcaagtaac gtgtcttttc tctctgggta ttcttaaggt tttctctttg 780
cctttggtga gctgcagtgg atttgctttt ttcaagaggt caagagaaag gaaagtgtga 840
ggtttctgtt ttttactgac aatttgtttg ttgatttgtt ttcccaccca gaggttcctt 900
gccactttgc caggctggaa ggcagacttc ttctggtgtc ctgttcacag acggggcagc 960
ctgcggaagg ccctgccaca tgcagggcct cggtcctcat tcccttgcat gtggacccgg 1020
gcgtgactcc tgttcaggct ggcacttccc agagctgagc cccagcctga ccttcctccc 1080
atactgtctt cacaccccct cctttcttct gatacctgga ggttttcctt tctttcctgt 1140
cacctccact tggattttaa atcctctgtc tgtggaattg tattcggcac aggaagatgc 1200
ttgcaagggc caggctcatc agccctgtcc ctgctgctgg aagcagcaca gcagagcctc 1260
atgctcaggc tgagatggag cagaggcctg cagacgagca cccagctcag ctggggttgg 1320
cgccgatggt ggagggtcct cgaaagctct ggggacgatg gcagagctat tggcagggga 1380
gccgcagggt cttttgagcc cttaaaagat ctct 1414
<210>65
<211>1310
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR65
<400>65
gtgaatgttg atggatcaaa tatctttctg tgttgtttat caaagttaaa ataaatgtgg 60
tcatttaaag gacaaaagat gaggggttgg agtctgttca agcaaagggt atattaggag 120
aaaagcagaa ttctctccct gtgaagggac agtgactcct attttccacc tcatttttac 180
taactctcct aactatctgc ttaggtagag atatatccat gtacatttat aaaccacagt 240
gaatcatttg attttggaat aaagatagta taaaatgtgt cccagtgttg atatacatca 300
tacattaaat atgtctggca gtgttctaat tttacagttg tccaaagata atgttagggc 360
atactggcta tggatgaagc tccaatgttc agattgcaaa gaaacttaga attttactaa 420
tgaaaccaaa tacatcccaa gaaatttttc agaagaaaaa aagagaaact agtagcaaag 480
taaagaatca ccacaatatc atcagatttt ttttatatgt agaatattta ttcagttctt 540
ttttcaagta caccttgtct tcattcattg tactttattt tttgtgaagg tttaaattta 600
tttcttctat gtgtttagtg atatttaaaa tttttattta atcaagttta tcagaaagtt 660
ctgttagaaa atatgacgag gctttaattc cgccatctat attttccgct attatataaa 720
gataattgtt ttctcttttt aaaacaactt gaattgggat tttatatcat aattttttaa 780
tgtctttttt tattatactt taagttctgg gatacatgtg cagaacgtgc aggtgtgtta 840
catagatata cacgtgccat ggtggtttgc tgcacccact aacctgttat cgacattagg 900
tatttctcct aatgctatca ccccctattt ccccaccccc cgagaggccc cagtgtgtga 960
tgttctcctc cctgtgtcca tgtgttctca ttgttcatct cccacttatg gtatctacca 1020
taaccttgaa attgtcttat gcattcactt gtttggttgt tatatagcct ccatcaggac 1080
agggatattt gctgctgctt cttttttttt tctttttgag acagtcttgc tccgtcatcc 1140
aggctggagt gcttctcggc tcaatgcaac ctccacctcc caggtttaag cgattctcca 1200
acttcagcct cccaaatggc tgggactgca ggcatgcacc actacacctg gctaattttt 1260
gtatttgtaa tagagacaat gtttcaccat gttggccagg ctggtctcga 1310
<210>66
<211>1917
<212>DNA
<213>Homo sapiens
<220>
<221>misc_feature
<223>sequence of STAR67
<400>66
aggatcctaa aattttgtga ccctagagca agtactaact atgaaagtga aatagagaat 60
gaaggaatta tttaattaag tccagcaaaa cccaaccaaa tcatctgtaa aatatatttg 120
ttttcaacat ccaggtattt tctgtgtaaa aggttgagtt gtatgctgac ttattgggaa 180
aaataattga gttttcccct tcactttgcc agtgagagga aatcagtact gtaattgtta 240
aaggttaccc atacctacct ctactaccgt ctagcatagg taaagtaatg tacactgtga 300
agtttcctgc ttgactgtaa tgttttcagt ttcatcccat tgattcaaca gctatttatt 360
cagcacttac tacaaccatg ctggaaaccc aagagtaaat aggctgtgtt actcaacagg 420
actgaggtac agccgaactg tcaggcaagg ttgctgtcct ttggacttgc ctgctttctc 480
tctatgtagg aagaagaaat ggacataccg tccaggaaat agatatatgt tacatttcct 540
tattccataa ttaatattaa taaccctgga cagaaactac caagtttcta gacccttata 600
gtaccacctt accctttctg gatgaatcct tcacatgttg atacatttta tccaaatgaa 660
aattttggta ctgtaggtat aacagacaaa gagagaacag aaaactagag atgaagtttg 720
ggaaaaggtc aagaaagtaa ataatgcttc tagaagacac aaaaagaaaa atgaaatggt 780
aatgttggga aagttttaat acattttgcc ctaaggaaaa aaactacttg ttgaaattct 840
acttaagact ggaccttttc tctaaaaatt gtgcttgatg tgaattaaag caacacaggg 900
aaatttatgg gctccttcta agttctaccc aactcaccgc aaaactgttc ctagtaggtg 960
tggtatactc tttcagattc tttgtgtgta tgtatatgtg tgtgtgtgtg tgtgtttgta 1020
tgtgtacagt ctatatacat atgtgtacct acatgtgtgt atatataaat atatatttac 1080
ctggatgaaa tagcatatta tagaatattc ttttttcttt aaatatatat gtgcatacat 1140
atgtatatgc acatatatac ataaatgtag atatagctag gtaggcattc atgtgaaaca 1200
aagaagccta ttacttttta atggttgcat gatattccat cataggagta tagtacaact 1260
tatgtaacac acatttggct tgttgtaaaa ttttggtatt aataaaatag cacatatcat 1320
gcaaagacac ccttgcatag gtctattcat tctttgattt ttaccttagg acaaaattta 1380
aaagtagaat ttctgggtca agcagtatgc tcatttaaaa tgtcattgca tatttccaaa 1440
ttgtcctcca gaaaagtagt aacagtaaca attgatggac tgcgtgtttt ctaaaacttg 1500
catttttttc cttattggtg aggtttggca ttttccatat gtttattggc attttaattt 1560
tttttggttc atgtctttta ttcccttcct gcaaatttgt ggtgtgtctc aactttattt 1620
atactctcat tttcataatt ttctaaagga atttgacttt aaaaaaataa gacagccaat 1680
gctttggttt aatttcattg ctgctttttg aagtgactgc tgtgttttta tatactttta 1740
tattttgttg ttttagcaaa ttcttctata ttataattgt gtatgctgga acaaaaagtt 1800
atatttctta atctagataa aatatttcaa gatgttgtaa ttacagtccc ctctaaaatc 1860
atataaatag acgcatagct gtgtgatttg taattagtta tgtccattga tagatcc 1917
<210>67
<211>375
<212>DNA
<213>Artificial
<220>
<223>wt zeocin resistance gene
<220>
<221>CDS
<222>(1)..(375)
<400>67
atg gcc aag ttg acc agt gcc gtt ccg gtg ctc acc gcg cgc gac gtc 48
Met Ala Lys Leu Thr Ser Ala Val Pro Val Leu Thr Ala Arg Asp Val
1 5 10 15
gcc gga gcg gtc gag ttc tgg acc gac cgg ctc ggg ttc tcc cgg gac 96
Ala Gly Ala Val Glu Phe Trp Thr Asp Arg Leu Gly Phe Ser Arg Asp
20 25 30
ttc gtg gag gac gac ttc gcc ggt gtg gtc cgg gac gac gtg acc ctg 144
Phe Val Glu Asp Asp Phe Ala Gly Val Val Arg Asp Asp Val Thr Leu
35 40 45
ttc atc agc gcg gtc cag gac cag gtg gtg ccg gac aac acc ctg gcc 192
Phe Ile Ser Ala Val Gln Asp Gln Val Val Pro Asp Asn Thr Leu Ala
50 55 60
tgg gtg tgg gtg cgc ggc ctg gac gag ctg tac gcc gag tgg tcg gag 240
Trp Val Trp Val Arg Gly Leu Asp Glu Leu Tyr Ala Glu Trp Ser Glu
65 70 75 80
gtc gtg tcc acg aac ttc cgg gac gcc tcc ggg ccg gcc atg acc gag 288
Val Val Ser Thr Asn Phe Arg Asp Ala Ser Gly Pro Ala Met Thr Glu
85 90 95
atc ggc gag cag ccg tgg ggg cgg gag ttc gcc ctg cgc gac ccg gcc 336
Ile Gly Glu Gln Pro Trp Gly Arg Glu Phe Ala Leu Arg Asp Pro Ala
100 105 110
ggc aac tgc gtg cac ttc gtg gcc gag gag cag gac tga 375
Gly Asn Cys Val His Phe Val Ala Glu Glu Gln Asp
115 120
<210>68
<211>124
<212>PRT
<213>Artificial
<220>
<223>Synthetic Construct
<400>68
Met Ala Lys Leu Thr Ser Ala Val Pro Val Leu Thr Ala Arg Asp Val
1 5 10 15
Ala Gly Ala Val Glu Phe Trp Thr Asp Arg Leu Gly Phe Ser Arg Asp
20 25 30
Phe Val Glu Asp Asp Phe Ala Gly Val Val Arg Asp Asp Val Thr Leu
35 40 45
Phe Ile Ser Ala Val Gln Asp Gln Val Val Pro Asp Asn Thr Leu Ala
50 55 60
Trp Val Trp Val Arg Gly Leu Asp Glu Leu Tyr Ala Glu Trp Ser Glu
65 70 75 80
Val Val Ser Thr Asn Phe Arg Asp Ala Ser Gly Pro Ala Met Thr Glu
85 90 95
Ile Gly Glu Gln Pro Trp Gly Arg Glu Phe Ala Leu Arg Asp Pro Ala
100 105 110
Gly Asn Cys Val His Phe Val Ala Glu Glu Gln Asp
115 120
<210>69
<211>399
<212>DNA
<213>Artificial
<220>
<223>wt blasticidin resistance gene
<220>
<221>CDS
<222>(1)..(399)
<400>69
atg gcc aag cct ttg tct caa gaa gaa tcc acc ctc att gaa aga gca 48
Met Ala Lys Pro Leu Ser Gln Glu Glu Ser Thr Leu Ile Glu Arg Ala
1 5 10 15
acg gct aca atc aac agc atc ccc atc tct gaa gac tac agc gtc gcc 96
Thr Ala Thr Ile Asn Ser Ile Pro Ile Ser Glu Asp Tyr Ser Val Ala
20 25 30
agc gca gct ctc tct agc gac ggc cgc atc ttc act ggt gtc aat gta 144
Ser Ala Ala Leu Ser Ser Asp Gly Arg Ile Phe Thr Gly Val Asn Val
35 40 45
tat cat ttt act ggg gga cct tgt gca gaa ctc gtg gtg ctg ggc act 192
Tyr His Phe Thr Gly Gly Pro Cys Ala Glu Leu Val Val Leu Gly Thr
50 55 60
gct gct gct gcg gca gct ggc aac ctg act tgt atc gtc gcg atc gga 240
Ala Ala Ala Ala Ala Ala Gly Asn Leu Thr Cys Ile Val Ala Ile Gly
65 70 75 80
aat gag aac agg ggc atc ttg agc ccc tgc gga cgg tgc cga cag gtg 288
Asn Glu Asn Arg Gly Ile Leu Ser Pro Cys Gly Arg Cys Arg Gln Val
85 90 95
ctt ctc gat ctg cat cct ggg atc aaa gcc ata gtg aag gac agt gat 336
Leu Leu Asp Leu His Pro Gly Ile Lys Ala Ile Val Lys Asp Ser Asp
100 105 110
gga cag ccg acg gca gtt ggg att cgt gaa ttg ctg ccc tct ggt tat 384
Gly Gln Pro Thr Ala Val Gly Ile Arg Glu Leu Leu Pro Ser Gly Tyr
115 120 125
gtg tgg gag ggc taa 399
Val Trp Glu Gly
130
<210>70
<211>132
<212>PRT
<213>Artificial
<220>
<223>Synthetic Construct
<400>70
Met Ala Lys Pro Leu Ser Gln Glu Glu Ser Thr Leu Ile Glu Arg Ala
1 5 10 15
Thr Ala Thr Ile Asn Ser Ile Pro Ile Ser Glu Asp Tyr Ser Val Ala
20 25 30
Ser Ala Ala Leu Ser Ser Asp Gly Arg Ile Phe Thr Gly Val Asn Val
35 40 45
Tyr His Phe Thr Gly Gly Pro Cys Ala Glu Leu Val Val Leu Gly Thr
50 55 60
Ala Ala Ala Ala Ala Ala Gly Asn Leu Thr Cys Ile Val Ala Ile Gly
65 70 75 80
Asn Glu Asn Arg Gly Ile Leu Ser Pro Cys Gly Arg Cys Arg Gln Val
85 90 95
Leu Leu Asp Leu His Pro Gly Ile Lys Ala Ile Val Lys Asp Ser Asp
100 105 110
Gly Gln Pro Thr Ala Val Gly Ile Arg Glu Leu Leu Pro Ser Gly Tyr
115 120 125
Val Trp Glu Gly
130
<210>71
<211>600
<212>DNA
<213>Artificial
<220>
<223>wt puromycin resistance gene
<220>
<221>CDS
<222>(1)..(600)
<400>71
atg acc gag tac aag ccc acg gtg cgc ctc gcc acc cgc gac gac gtc 48
Met Thr Glu Tyr Lys Pro Thr Val Arg Leu Ala Thr Arg Asp Asp Val
1 5 10 15
ccc agg gcc gta cgc acc ctc gcc gcc gcg ttc gcc gac tac ccc gcc 96
Pro Arg Ala Val Arg Thr Leu Ala Ala Ala Phe Ala Asp Tyr Pro Ala
20 25 30
acg cgc cac acc gtc gat ccg gac cgc cac atc gag cgg gtc acc gag 144
Thr Arg His Thr Val Asp Pro Asp Arg His Ile Glu Arg Val Thr Glu
35 40 45
ctg caa gaa ctc ttc ctc acg cgc gtc ggg ctc gac atc ggc aag gtg 192
Leu Gln Glu Leu Phe Leu Thr Arg Val Gly Leu Asp Ile Gly Lys Val
50 55 60
tgg gtc gcg gac gac ggc gcc gcg gtg gcg gtc tgg acc acg ccg gag 240
Trp Val Ala Asp Asp Gly Ala Ala Val Ala Val Trp Thr Thr Pro Glu
65 70 75 80
agc gtc gaa gcg ggg gcg gtg ttc gcc gag atc ggc ccg cgc atg gcc 288
Ser Val Glu Ala Gly Ala Val Phe Ala Glu Ile Gly Pro Arg Met Ala
85 90 95
gag ttg agc ggt tcc cgg ctg gcc gcg cag caa cag atg gaa ggc ctc 336
Glu Leu Ser Gly Ser Arg Leu Ala Ala Gln Gln Gln Met Glu Gly Leu
100 105 110
ctg gcg ccg cac cgg ccc aag gag ccc gcg tgg ttc ctg gcc acc gtc 384
Leu Ala Pro His Arg Pro Lys Glu Pro Ala Trp Phe Leu Ala Thr Val
115 120 125
ggc gtc tcg ccc gac cac cag ggc aag ggt ctg ggc agc gcc gtc gtg 432
Gly Val Ser Pro Asp His Gln Gly Lys Gly Leu Gly Ser Ala Val Val
130 135 140
ctc ccc gga gtg gag gcg gcc gag cgc gcc ggg gtg ccc gcc ttc ctg 480
Leu Pro Gly Val Glu Ala Ala Glu Arg Ala Gly Val Pro Ala Phe Leu
145 150 155 160
gag acc tcc gcg ccc cgc aac ctc ccc ttc tac gag cgg ctc ggc ttc 528
Glu Thr Ser Ala Pro Arg Asn Leu Pro Phe Tyr Glu Arg Leu Gly Phe
165 170 175
acc gtc acc gcc gac gtc gag tgc ccg aag gac cgc gcg acc tgg tgc 576
Thr Val Thr Ala Asp Val Glu Cys Pro Lys Asp Arg Ala Thr Trp Cys
180 185 190
atg acc cgc aag ccc ggt gcc tga 600
Met Thr Arg Lys Pro Gly Ala
195
<210>72
<211>199
<212>PRT
<213>Artificial
<220>
<223>Synthetic Construct
<400>72
Met Thr Glu Tyr Lys Pro Thr Val Arg Leu Ala Thr Arg Asp Asp Val
1 5 10 15
Pro Arg Ala Val Arg Thr Leu Ala Ala Ala Phe Ala Asp Tyr Pro Ala
20 25 30
Thr Arg His Thr Val Asp Pro Asp Arg His Ile Glu Arg Val Thr Glu
35 40 45
Leu Gln Glu Leu Phe Leu Thr Arg Val Gly Leu Asp Ile Gly Lys Val
50 55 60
Trp Val Ala Asp Asp Gly Ala Ala Val Ala Val Trp Thr Thr Pro Glu
65 70 75 80
Ser Val Glu Ala Gly Ala Val Phe Ala Glu Ile Gly Pro Arg Met Ala
85 90 95
Glu Leu Ser Gly Ser Arg Leu Ala Ala Gln Gln Gln Met Glu Gly Leu
100 105 110
Leu Ala Pro His Arg Pro Lys Glu Pro Ala Trp Phe Leu Ala Thr Val
115 120 125
Gly Val Ser Pro Asp His Gln Gly Lys Gly Leu Gly Ser Ala Val Val
130 135 140
Leu Pro Gly Val Glu Ala Ala Glu Arg Ala Gly Val Pro Ala Phe Leu
145 150 155 160
Glu Thr Ser Ala Pro Arg Asn Leu Pro Phe Tyr Glu Arg Leu Gly Phe
165 170 175
Thr Val Thr Ala Asp Val Glu Cys Pro Lys Asp Arg Ala Thr Trp Cys
180 185 190
Met Thr Arg Lys Pro Gly Ala
195
<210>73
<211>564
<212>DNA
<213>Artificial
<220>
<223>wt DHFR gene (from mouse)
<220>
<221>CDS
<222>(1)..(564)
<400>73
atg gtt cga cca ttg aac tgc atc gtc gcc gtg tcc caa aat atg ggg 48
Met Val Arg Pro Leu Asn Cys Ile Val Ala Val Ser Gln Asn Met Gly
1 5 10 15
att ggc aag aac gga gac cta ccc tgg cct ccg ctc agg aac gag ttc 96
Ile Gly Lys Asn Gly Asp Leu Pro Trp Pro Pro Leu Arg Asn Glu Phe
20 25 30
aag tac ttc caa aga atg acc aca acc tct tca gtg gaa ggt aaa cag 144
Lys Tyr Phe Gln Arg Met Thr Thr Thr Ser Ser Val Glu Gly Lys Gln
35 40 45
aat ctg gtg att atg ggt agg aaa acc tgg ttc tcc att cct gag aag 192
Asn Leu Val Ile Met Gly Arg Lys Thr Trp Phe Ser Ile Pro Glu Lys
50 55 60
aat cga cct tta aag gac aga att aat ata gtt ctc agt aga gaa ctc 240
Asn Arg Pro Leu Lys Asp Arg Ile Asn Ile Val Leu Ser Arg Glu Leu
65 70 75 80
aaa gaa cca cca cga gga gct cat ttt ctt gcc aaa agt ttg gat gat 288
Lys Glu Pro Pro Arg Gly Ala His Phe Leu Ala Lys Ser Leu Asp Asp
85 90 95
gcc tta aga ctt att gaa caa ccg gaa ttg gca agt aaa gta gac atg 336
Ala Leu Arg Leu Ile Glu Gln Pro Glu Leu Ala Ser Lys Val Asp Met
100 105 110
gtt tgg ata gtc gga ggc agt tct gtt tac cag gaa gcc atg aat caa 384
Val Trp Ile Val Gly Gly Ser Ser Val Tyr Gln Glu Ala Met Asn Gln
115 120 125
cca ggc cac ctc aga ctc ttt gtg aca agg atc atg cag gaa ttt gaa 432
Pro Gly His Leu Arg Leu Phe Val Thr Arg Ile Met Gln Glu Phe Glu
130 135 140
agt gac acg ttt ttc cca gaa att gat ttg ggg aaa tat aaa ctt ctc 480
Ser Asp Thr Phe Phe Pro Glu Ile Asp Leu Gly Lys Tyr Lys Leu Leu
145 150 155 160
cca gaa tac cca ggc gtc ctc tct gag gtc cag gag gaa aaa ggc atc 528
Pro Glu Tyr Pro Gly Val Leu Ser Glu Val Gln Glu Glu Lys Gly Ile
165 170 175
aag tat aag ttt gaa gtc tac gag aag aaa gac taa 564
Lys Tyr Lys Phe Glu Val Tyr Glu Lys Lys Asp
180 185
<210>74
<211>187
<212>PRT
<213>Artificial
<220>
<223>Synthetic Construct
<400>74
Met Val Arg Pro Leu Asn Cys Ile Val Ala Val Ser Gln Asn Met Gly
1 5 10 15
Ile Gly Lys Asn Gly Asp Leu Pro Trp Pro Pro Leu Arg Asn Glu Phe
20 25 30
Lys Tyr Phe Gln Arg Met Thr Thr Thr Ser Ser Val Glu Gly Lys Gln
35 40 45
Asn Leu Val Ile Met Gly Arg Lys Thr Trp Phe Ser Ile Pro Glu Lys
50 55 60
Asn Arg Pro Leu Lys Asp Arg Ile Asn Ile Val Leu Ser Arg Glu Leu
65 70 75 80
Lys Glu Pro Pro Arg Gly Ala His Phe Leu Ala Lys Ser Leu Asp Asp
85 90 95
Ala Leu Arg Leu Ile Glu Gln Pro Glu Leu Ala Ser Lys Val Asp Met
100 105 110
Val Trp Ile Val Gly Gly Ser Ser Val Tyr Gln Glu Ala Met Asn Gln
115 120 125
Pro Gly His Leu Arg Leu Phe Val Thr Arg Ile Met Gln Glu Phe Glu
130 135 140
Ser Asp Thr Phe Phe Pro Glu Ile Asp Leu Gly Lys Tyr Lys Leu Leu
145 150 155 160
Pro Glu Tyr Pro Gly Val Leu Ser Glu Val Gln Glu Glu Lys Gly Ile
165 170 175
Lys Tyr Lys Phe Glu Val Tyr Glu Lys Lys Asp
180 185
<210>75
<211>1143
<212>DNA
<213>Artificial
<220>
<223>wt hygromycin resistance gene
<220>
<221>CDS
<222>(1)..(1143)
<400>75
atg aaa aag cct gaa ctc acc gcg acg tct gtc gag aag ttt ctg atc 48
Met Lys Lys Pro Glu Leu Thr Ala Thr Ser Val Glu Lys Phe Leu Ile
1 5 10 15
gaa aag ttc gac agc gtc tcc gac ctg atg cag ctc tcg gag ggc gaa 96
Glu Lys Phe Asp Ser Val Ser Asp Leu Met Gln Leu Ser Glu Gly Glu
20 25 30
gaa tct cgt gct ttc agc ttc gat gta gga ggg cgt gga tat gtc ctg 144
Glu Ser Arg Ala Phe Ser Phe Asp Val Gly Gly Arg Gly Tyr Val Leu
35 40 45
cgg gta aat agc tgc gcc gat ggt ttc tac aaa gat cgt tat gtt tat 192
Arg Val Asn Ser Cys Ala Asp Gly Phe Tyr Lys Asp Arg Tyr Val Tyr
50 55 60
cgg cac ttt gca tcg gcc gcg ctc ccg att ccg gaa gtg ctt gac att 240
Arg His Phe Ala Ser Ala Ala Leu Pro Ile Pro Glu Val Leu Asp Ile
65 70 75 80
ggg gaa ttc agc gag agc ctg acc tat tgc atc tcc cgc cgt gca cag 288
Gly Glu Phe Ser Glu Ser Leu Thr Tyr Cys Ile Ser Arg Arg Ala Gln
85 90 95
ggt gtc acg ttg caa gac ctg cct gaa acc gaa ctg ccc gct gtt ctg 336
Gly Val Thr Leu Gln Asp Leu Pro Glu Thr Glu Leu Pro Ala Val Leu
100 105 110
cag ccg gtc gcg gag gcc atg gat gcg atc gct gcg gcc gat ctt agc 384
Gln Pro Val Ala Glu Ala Met Asp Ala Ile Ala Ala Ala Asp Leu Ser
115 120 125
cag acg agc ggg ttc ggc cca ttc gga ccg caa gga atc ggt caa tac 432
Gln Thr Ser Gly Phe Gly Pro Phe Gly Pro Gln Gly Ile Gly Gln Tyr
130 135 140
act aca tgg cgt gat ttc ata tgc gcg att gct gat ccc cat gtg tat 480
Thr Thr Trp Arg Asp Phe Ile Cys Ala Ile Ala Asp Pro His Val Tyr
145 150 155 160
cac tgg caa act gtg atg gac gac acc gtc agt gcg tcc gtc gcg cag 528
His Trp Gln Thr Val Met Asp Asp Thr Val Ser Ala Ser Val Ala Gln
165 170 175
gct ctc gat gag ctg atg ctt tgg gcc gag gac tgc ccc gaa gtc cgg 576
Ala Leu Asp Glu Leu Met Leu Trp Ala Glu Asp Cys Pro Glu Val Arg
180 185 190
cac ctc gtg cac gcg gat ttc ggc tcc aac aat gtc ctg acg gac aat 624
His Leu Val His Ala Asp Phe Gly Ser Asn Asn Val Leu Thr Asp Asn
195 200 205
ggc cgc ata aca gcg gtc att gac tgg agc gag gcg atg ttc ggg gat 672
Gly Arg Ile Thr Ala Val Ile Asp Trp Ser Glu Ala Met Phe Gly Asp
210 215 220
tcc caa tac gag gtc gcc aac atc ttc ttc tgg agg ccg tgg ttg gct 720
Ser Gln Tyr Glu Val Ala Asn Ile Phe Phe Trp Arg Pro Trp Leu Ala
225 230 235 240
tgt atg gag cag cag acg cgc tac ttc gag cgg agg cat ccg gag ctt 768
Cys Met Glu Gln Gln Thr Arg Tyr Phe Glu Arg Arg His Pro Glu Leu
245 250 255
gca gga tcg ccg cgg ctc cgg gcg tat atg ctc cgc att ggt ctt gac 816
Ala Gly Ser Pro Arg Leu Arg Ala Tyr Met Leu Arg Ile Gly Leu Asp
260 265 270
caa ctc tat cag agc ttg gtt gac ggc aat ttc gat gat gca gct tgg 864
Gln Leu Tyr Gln Ser Leu Val Asp Gly Asn Phe Asp Asp Ala Ala Trp
275 280 285
gcg cag ggt cga tgc gac gca atc gtc cga tcc gga gcc ggg act gtc 912
Ala Gln Gly Arg Cys Asp Ala Ile Val Arg Ser Gly Ala Gly Thr Val
290 295 300
ggg cgt aca caa atc gcc cgc aga agc gcg gcc gtc tgg acc gat ggc 960
Gly Arg Thr Gln Ile Ala Arg Arg Ser Ala Ala Val Trp Thr Asp Gly
305 310 315 320
tgt gta gaa gta ctc gcc gat agt gga aac cga cgc ccc agc act cgt 1008
Cys Val Glu Val Leu Ala Asp Ser Gly Asn Arg Arg Pro Ser Thr Arg
325 330 335
ccg gag gca aag gaa ttc ggg aga tgg ggg agg cta act gaa aca cgg 1056
Pro Glu Ala Lys Glu Phe Gly Arg Trp Gly Arg Leu Thr Glu Thr Arg
340 345 350
aag gag aca ata ccg gaa gga acc cgc gct atg acg gca ata aaa aga 1104
Lys Glu Thr Ile Pro Glu Gly Thr Arg Ala Met Thr Ala Ile Lys Arg
355 360 365
cag aat aaa acg cac ggg tgt tgg gtc gtt tgt tca taa 1143
Gln Asn Lys Thr His Gly Cys Trp Val Val Cys Ser
370 375 380
<210>76
<211>380
<212>PRT
<213>Artificial
<220>
<223>Synthetic Construct
<400>76
Met Lys Lys Pro Glu Leu Thr Ala Thr Ser Val Glu Lys Phe Leu Ile
1 5 10 15
Glu Lys Phe Asp Ser Val Ser Asp Leu Met Gln Leu Ser Glu Gly Glu
20 25 30
Glu Ser Arg Ala Phe Ser Phe Asp Val Gly Gly Arg Gly Tyr Val Leu
35 40 45
Arg Val Asn Ser Cys Ala Asp Gly Phe Tyr Lys Asp Arg Tyr Val Tyr
50 55 60
Arg His Phe Ala Ser Ala Ala Leu Pro Ile Pro Glu Val Leu Asp Ile
65 70 75 80
Gly Glu Phe Ser Glu Ser Leu Thr Tyr Cys Ile Ser Arg Arg Ala Gln
85 90 95
Gly Val Thr Leu Gln Asp Leu Pro Glu Thr Glu Leu Pro Ala Val Leu
100 105 110
Gln Pro Val Ala Glu Ala Met Asp Ala Ile Ala Ala Ala Asp Leu Ser
115 120 125
Gln Thr Ser Gly Phe Gly Pro Phe Gly Pro Gln Gly Ile Gly Gln Tyr
130 135 140
Thr Thr Trp Arg Asp Phe Ile Cys Ala Ile Ala Asp Pro His Val Tyr
145 150 155 160
His Trp Gln Thr Val Met Asp Asp Thr Val Ser Ala Ser Val Ala Gln
165 170 175
Ala Leu Asp Glu Leu Met Leu Trp Ala Glu Asp Cys Pro Glu Val Arg
180 185 190
His Leu Val His Ala Asp Phe Gly Ser Asn Asn Val Leu Thr Asp Asn
195 200 205
Gly Arg Ile Thr Ala Val Ile Asp Trp Ser Glu Ala Met Phe Gly Asp
210 215 220
Ser Gln Tyr Glu Val Ala Asn Ile Phe Phe Trp Arg Pro Trp Leu Ala
225 230 235 240
Cys Met Glu Gln Gln Thr Arg Tyr Phe Glu Arg Arg His Pro Glu Leu
245 250 255
Ala Gly Ser Pro Arg Leu Arg Ala Tyr Met Leu Arg Ile Gly Leu Asp
260 265 270
Gln Leu Tyr Gln Ser Leu Val Asp Gly Asn Phe Asp Asp Ala Ala Trp
275 280 285
Ala Gln Gly Arg Cys Asp Ala Ile Val Arg Ser Gly Ala Gly Thr Val
290 295 300
Gly Arg Thr Gln Ile Ala Arg Arg Ser Ala Ala Val Trp Thr Asp Gly
305 310 315 320
Cys Val Glu Val Leu Ala Asp Ser Gly Asn Arg Arg Pro Ser Thr Arg
325 330 335
Pro Glu Ala Lys Glu Phe Gly Arg Trp Gly Arg Leu Thr Glu Thr Arg
340 345 350
Lys Glu Thr Ile Pro Glu Gly Thr Arg Ala Met Thr Ala Ile Lys Arg
355 360 365
Gln Asn Lys Thr His Gly Cys Trp Val Val Cys Ser
370 375 380
<210>77
<211>804
<212>DNA
<213>Artificial
<220>
<223>wt neomycin resistance gene
<220>
<221>CDS
<222>(1)..(804)
<400>77
atg gga tcg gcc att gaa caa gat gga ttg cac gca ggt tct ccg gcc 48
Met Gly Ser Ala Ile Glu Gln Asp Gly Leu His Ala Gly Ser Pro Ala
1 5 10 15
gct tgg gtg gag agg cta ttc ggc tat gac tgg gca caa cag aca atc 96
Ala Trp Val Glu Arg Leu Phe Gly Tyr Asp Trp Ala Gln Gln Thr Ile
20 25 30
ggc tgc tct gat gcc gcc gtg ttc cgg ctg tca gcg cag ggg cgc ccg 144
Gly Cys Ser Asp Ala Ala Val Phe Arg Leu Ser Ala Gln Gly Arg Pro
35 40 45
gtt ctt ttt gtc aag acc gac ctg tcc ggt gcc ctg aat gaa ctg cag 192
Val Leu Phe Val Lys Thr Asp Leu Ser Gly Ala Leu Asn Glu Leu Gln
50 55 60
gac gag gca gcg cgg cta tcg tgg ctg gcc acg acg ggc gtt cct tgc 240
Asp Glu Ala Ala Arg Leu Ser Trp Leu Ala Thr Thr Gly Val Pro Cys
65 70 75 80
gca gct gtg ctc gac gtt gtc act gaa gcg gga agg gac tgg ctg cta 288
Ala Ala Val Leu Asp Val Val Thr Glu Ala Gly Arg Asp Trp Leu Leu
85 90 95
ttg ggc gaa gtg ccg ggg cag gat ctc ctg tca tct cac ctt gct cct 336
Leu Gly Glu Val Pro Gly Gln Asp Leu Leu Ser Ser His Leu Ala Pro
100 105 110
gcc gag aaa gta tcc atc atg gct gat gca atg cgg cgg ctg cat acg 384
Ala Glu Lys Val Ser Ile Met Ala Asp Ala Met Arg Arg Leu His Thr
115 120 125
ctt gat ccg gct acc tgc cca ttc gac cac caa gcg aaa cat cgc atc 432
Leu Asp Pro Ala Thr Cys Pro Phe Asp His Gln Ala Lys His Arg Ile
130 135 140
gag cga gca cgt act cgg atg gaa gcc ggt ctt gtc gat cag gat gat 480
Glu Arg Ala Arg Thr Arg Met Glu Ala Gly Leu Val Asp Gln Asp Asp
145 150 155 160
ctg gac gaa gag cat cag ggg ctc gcg cca gcc gaa ctg ttc gcc agg 528
Leu Asp Glu Glu His Gln Gly Leu Ala Pro Ala Glu Leu Phe Ala Arg
165 170 175
ctc aag gcg cgc atg ccc gac ggc gat gat ctc gtc gtg acc cat ggc 576
Leu Lys Ala Arg Met Pro Asp Gly Asp Asp Leu Val Val Thr His Gly
180 185 190
gat gcc tgc ttg ccg aat atc atg gtg gaa aat ggc cgc ttt tct gga 624
Asp Ala Cys Leu Pro Asn Ile Met Val Glu Asn Gly Arg Phe Ser Gly
195 200 205
ttc atc gac tgt ggc cgg ctg ggt gtg gcg gac cgc tat cag gac ata 672
Phe Ile Asp Cys Gly Arg Leu Gly Val Ala Asp Arg Tyr Gln Asp Ile
210 215 220
gcg ttg gct acc cgt gat att gct gaa gag ctt ggc ggc gaa tgg gct 720
Ala Leu Ala Thr Arg Asp Ile Ala Glu Glu Leu Gly Gly Glu Trp Ala
225 230 235 240
gac cgc ttc ctc gtg ctt tac ggt atc gcc gct ccc gat tcg cag cgc 768
Asp Arg Phe Leu Val Leu Tyr Gly Ile Ala Ala Pro Asp Ser Gln Arg
245 250 255
atc gcc ttc tat cgc ctt ctt gac gag ttc ttc tga 804
Ile Ala Phe Tyr Arg Leu Leu Asp Glu Phe Phe
260 265
<210>78
<211>267
<212>PRT
<213>Artificial
<220>
<223>Synthetic Construct
<400>78
Met Gly Ser Ala Ile Glu Gln Asp Gly Leu His Ala Gly Ser Pro Ala
1 5 10 15
Ala Trp Val Glu Arg Leu Phe Gly Tyr Asp Trp Ala Gln Gln Thr Ile
20 25 30
Gly Cys Ser Asp Ala Ala Val Phe Arg Leu Ser Ala Gln Gly Arg Pro
35 40 45
Val Leu Phe Val Lys Thr Asp Leu Ser Gly Ala Leu Asn Glu Leu Gln
50 55 60
Asp Glu Ala Ala Arg Leu Ser Trp Leu Ala Thr Thr Gly Val Pro Cys
65 70 75 80
Ala Ala Val Leu Asp Val Val Thr Glu Ala Gly Arg Asp Trp Leu Leu
85 90 95
Leu Gly Glu Val Pro Gly Gln Asp Leu Leu Ser Ser His Leu Ala Pro
100 105 110
Ala Glu Lys Val Ser Ile Met Ala Asp Ala Met Arg Arg Leu His Thr
115 120 125
Leu Asp Pro Ala Thr Cys Pro Phe Asp His Gln Ala Lys His Arg Ile
130 135 140
Glu Arg Ala Arg Thr Arg Met Glu Ala Gly Leu Val Asp Gln Asp Asp
145 150 155 160
Leu Asp Glu Glu His Gln Gly Leu Ala Pro Ala Glu Leu Phe Ala Arg
165 170 175
Leu Lys Ala Arg Met Pro Asp Gly Asp Asp Leu Val Val Thr His Gly
180 185 190
Asp Ala Cys Leu Pro Asn Ile Met Val Glu Asn Gly Arg Phe Ser Gly
195 200 205
Phe Ile Asp Cys Gly Arg Leu Gly Val Ala Asp Arg Tyr Gln Asp Ile
210 215 220
Ala Leu Ala Thr Arg Asp Ile Ala Glu Glu Leu Gly Gly Glu Trp Ala
225 230 235 240
Asp Arg Phe Leu Val Leu Tyr Gly Ile Ala Ala Pro Asp Ser Gln Arg
245 250 255
Ile Ala Phe Tyr Arg Leu Leu Asp Glu Phe Phe
260 265
<210>79
<211>1121
<212>DNA
<213>Artificial
<220>
<223>wt glutamine synthetase gene (human)
<220>
<221>CDS
<222>(1)..(1119)
<400>79
atg acc acc tca gca agt tcc cac tta aat aaa ggc atc aag cag gtg 48
Met Thr Thr Ser Ala Ser Ser His Leu Asn Lys Gly Ile Lys Gln Val
1 5 10 15
tac atg tcc ctg cct cag ggt gag aaa gtc cag gcc atg tat atc tgg 96
Tyr Met Ser Leu Pro Gln Gly Glu Lys Val Gln Ala Met Tyr Ile Trp
20 25 30
atc gat ggt act gga gaa gga ctg cgc tgc aag acc cgg acc ctg gac 144
Ile Asp Gly Thr Gly Glu Gly Leu Arg Cys Lys Thr Arg Thr Leu Asp
35 40 45
agt gag ccc aag tgt gtg gaa gag ttg cct gag tgg aat ttc gat ggc 192
Ser Glu Pro Lys Cys Val Glu Glu Leu Pro Glu Trp Asn Phe Asp Gly
50 55 60
tcc agt act tta cag tct gag ggt tcc aac agt gac atg tat ctc gtg 240
Ser Ser Thr Leu Gln Ser Glu Gly Ser Asn Ser Asp Met Tyr Leu Val
65 70 75 80
cct gct gcc atg ttt cgg gac ccc ttc cgt aag gac cct aac aag ctg 288
Pro Ala Ala Met Phe Arg Asp Pro Phe Arg Lys Asp Pro Asn Lys Leu
85 90 95
gtg tta tgt gaa gtt ttc aag tac aat cga agg cct gca gag acc aat 336
Val Leu Cys Glu Val Phe Lys Tyr Asn Arg Arg Pro Ala Glu Thr Asn
100 105 110
ttg agg cac acc tgt aaa cgg ata atg gac atg gtg agc aac cag cac 384
Leu Arg His Thr Cys Lys Arg Ile Met Asp Met Val Ser Asn Gln His
115 120 125
ccc tgg ttt ggc atg gag cag gag tat acc ctc atg ggg aca gat ggg 432
Pro Trp Phe Gly Met Glu Gln Glu Tyr Thr Leu Met Gly Thr Asp Gly
130 135 140
cac ccc ttt ggt tgg cct tcc aac ggc ttc cca ggg ccc cag ggt cca 480
His Pro Phe Gly Trp Pro Ser Asn Gly Phe Pro Gly Pro Gln Gly Pro
145 150 155 160
tat tac tgt ggt gtg gga gca gac aga gcc tat ggc agg gac atc gtg 528
Tyr Tyr Cys Gly Val Gly Ala Asp Arg Ala Tyr Gly Arg Asp Ile Val
165 170 175
gag gcc cat tac cgg gcc tgc ttg tat gct gga gtc aag att gcg ggg 576
Glu Ala His Tyr Arg Ala Cys Leu Tyr Ala Gly Val Lys Ile Ala Gly
180 185 190
act aat gcc gag gtc atg cct gcc cag tgg gaa ttt cag att gga cct 624
Thr Asn Ala Glu Val Met Pro Ala Gln Trp Glu Phe Gln Ile Gly Pro
195 200 205
tgt gaa gga atc agc atg gga gat cat ctc tgg gtg gcc cgt ttc atc 672
Cys Glu Gly Ile Ser Met Gly Asp His Leu Trp Val Ala Arg Phe Ile
210 215 220
ttg cat cgt gtg tgt gaa gac ttt gga gtg ata gca acc ttt gat cct 720
Leu His Arg Val Cys Glu Asp Phe Gly Val Ile Ala Thr Phe Asp Pro
225 230 235 240
aag ccc att cct ggg aac tgg aat ggt gca ggc tgc cat acc aac ttc 768
Lys Pro Ile Pro Gly Asn Trp Asn Gly Ala Gly Cys His Thr Asn Phe
245 250 255
agc acc aag gcc atg cgg gag gag aat ggt ctg aag tac atc gag gag 816
Ser Thr Lys Ala Met Arg Glu Glu Asn Gly Leu Lys Tyr Ile Glu Glu
260 265 270
gcc att gag aaa cta agc aag cgg cac cag tac cac atc cgt gcc tat 864
Ala Ile Glu Lys Leu Ser Lys Arg His Gln Tyr His Ile Arg Ala Tyr
275 280 285
gat ccc aag gga ggc ctg gac aat gcc cga cgt cta act gga ttc cat 912
Asp Pro Lys Gly Gly Leu Asp Asn Ala Arg Arg Leu Thr Gly Phe His
290 295 300
gaa acc tcc aac atc aac gac ttt tct ggt ggt gta gcc aat cgt agc 960
Glu Thr Ser Asn Ile Asn Asp Phe Ser Gly Gly Val Ala Asn Arg Ser
305 310 315 320
gcc agc ata cgc att ccc cgg act gtt ggc cag gag aag aag ggt tac 1008
Ala Ser Ile Arg Ile Pro Arg Thr Val Gly Gln Glu Lys Lys Gly Tyr
325 330 335
ttt gaa gat cgt cgc ccc tct gcc aac tgc gac ccc ttt tcg gtg aca 1056
Phe Glu Asp Arg Arg Pro Ser Ala Asn Cys Asp Pro Phe Ser Val Thr
340 345 350
gaa gcc ctc atc cgc acg tgt ctt ctc aat gaa acc ggc gat gag ccc 1104
Glu Ala Leu Ile Arg Thr Cys Leu Leu Asn Glu Thr Gly Asp Glu Pro
355 360 365
ttc cag tac aaa aat ta 1121
Phe Gln Tyr Lys Asn
370
<210>80
<211>373
<212>PRT
<213>Artificial
<220>
<223>Synthetic Construct
<400>80
Met Thr Thr Ser Ala Ser Ser His Leu Asn Lys Gly Ile Lys Gln Val
1 5 10 15
Tyr Met Ser Leu Pro Gln Gly Glu Lys Val Gln Ala Met Tyr Ile Trp
20 25 30
Ile Asp Gly Thr Gly Glu Gly Leu Arg Cys Lys Thr Arg Thr Leu Asp
35 40 45
Ser Glu Pro Lys Cys Val Glu Glu Leu Pro Glu Trp Asn Phe Asp Gly
50 55 60
Ser Ser Thr Leu Gln Ser Glu Gly Ser Asn Ser Asp Met Tyr Leu Val
65 70 75 80
Pro Ala Ala Met Phe Arg Asp Pro Phe Arg Lys Asp Pro Asn Lys Leu
85 90 95
Val Leu Cys Glu Val Phe Lys Tyr Asn Arg Arg Pro Ala Glu Thr Asn
100 105 110
Leu Arg His Thr Cys Lys Arg Ile Met Asp Met Val Ser Asn Gln His
115 120 125
Pro Trp Phe Gly Met Glu Gln Glu Tyr Thr Leu Met Gly Thr Asp Gly
130 135 140
His Pro Phe Gly Trp Pro Ser Asn Gly Phe Pro Gly Pro Gln Gly Pro
145 150 155 160
Tyr Tyr Cys Gly Val Gly Ala Asp Arg Ala Tyr Gly Arg Asp Ile Val
165 170 175
Glu Ala His Tyr Arg Ala Cys Leu Tyr Ala Gly Val Lys Ile Ala Gly
180 185 190
Thr Asn Ala Glu Val Met Pro Ala Gln Trp Glu Phe Gln Ile Gly Pro
195 200 205
Cys Glu Gly Ile Ser Met Gly Asp His Leu Trp Val Ala Arg Phe Ile
210 215 220
Leu His Arg Val Cys Glu Asp Phe Gly Val Ile Ala Thr Phe Asp Pro
225 230 235 240
Lys Pro Ile Pro Gly Asn Trp Asn Gly Ala Gly Cys His Thr Asn Phe
245 250 255
Ser Thr Lys Ala Met Arg Glu Glu Asn Gly Leu Lys Tyr Ile Glu Glu
260 265 270
Ala Ile Glu Lys Leu Ser Lys Arg His Gln Tyr His Ile Arg Ala Tyr
275 280 285
Asp Pro Lys Gly Gly Leu Asp Asn Ala Arg Arg Leu Thr Gly Phe His
290 295 300
Glu Thr Ser Asn Ile Asn Asp Phe Ser Gly Gly Val Ala Asn Arg Ser
305 310 315 320
Ala Ser Ile Arg Ile Pro Arg Thr Val Gly Gln Glu Lys Lys Gly Tyr
325 330 335
Phe Glu Asp Arg Arg Pro Ser Ala Asn Cys Asp Pro Phe Ser Val Thr
340 345 350
Glu Ala Leu Ile Arg Thr Cys Leu Leu Asn Glu Thr Gly Asp Glu Pro
355 360 365
Phe Gln Tyr Lys Asn
370
<210>81
<211>154
<212>DNA
<213>Artificial
<220>
<223>combined synthetic polyadenylation sequence and pausing signal
from the human alpha2 globin gene
<220>
<221>synthetic polyadenylation sequence
<222>(1)..(49)
<220>
<221>cloning site
<222>(50)..(62)
<220>
<221>pausing signal from the human alpha2 globin gene
<222>(63)..(154)
<400>81
aataaaatat ctttattttc attacatctg tgtgttggtt ttttgtgtga atcgatagta 60
ctaacatacg ctctccatca aaacaaaacg aaacaaaaca aactagcaaa ataggctgtc 120
cccagtgcaa gtgcaggtgc cagaacattt ctct 154
<210>82
<211>596
<212>DNA
<213>Artificial
<220>
<223>IRES sequence
<400>82
gcccctctcc ctcccccccc cctaacgtta ctggccgaag ccgcttggaa taaggccggt 60
gtgcgtttgt ctatatgtga ttttccacca tattgccgtc ttttggcaat gtgagggccc 120
ggaaacctgg ccctgtcttc ttgacgagca ttcctagggg tctttcccct ctcgccaaag 180
gaatgcaagg tctgttgaat gtcgtgaagg aagcagttcc tctggaagct tcttgaagac 240
aaacaacgtc tgtagcgacc ctttgcaggc agcggaaccc cccacctggc gacaggtgcc 300
tctgcggcca aaagccacgt gtataagata cacctgcaaa ggcggcacaa ccccagtgcc 360
acgttgtgag ttggatagtt gtggaaagag tcaaatggct ctcctcaagc gtattcaaca 420
aggggctgaa ggatgcccag aaggtacccc attgtatggg atctgatctg gggcctcggt 480
gcacatgctt tacatgtgtt tagtcgaggt taaaaaaacg tctaggcccc ccgaaccacg 540
gggacgtggt tttcctttga aaaacacgat gataagcttg ccacaacccc gggata 596
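
The coding sequences in the listing above can be cross-checked against their stated translations, and the start-codon substitution recited in the claims can be illustrated on one of them. The following is a minimal sketch only, not part of the filed listing: it assumes the standard genetic code, uses SEQ ID NO: 67 (wt zeocin resistance gene) as copied from the listing, and the names `translate`, `with_start_codon`, `ZEOCIN_CDS` and `ALTERNATIVE_START_CODONS` are hypothetical helpers introduced for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order
# (third base varies fastest). This table is an assumption of the sketch.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Start codons listed in claim 1 for the selectable marker coding sequence.
ALTERNATIVE_START_CODONS = ("gtg", "ttg", "ctg", "att", "acg")

# SEQ ID NO: 67 as printed in the listing; whitespace is removed below.
ZEOCIN_CDS = "".join("""
atg gcc aag ttg acc agt gcc gtt ccg gtg ctc acc gcg cgc gac gtc
gcc gga gcg gtc gag ttc tgg acc gac cgg ctc ggg ttc tcc cgg gac
ttc gtg gag gac gac ttc gcc ggt gtg gtc cgg gac gac gtg acc ctg
ttc atc agc gcg gtc cag gac cag gtg gtg ccg gac aac acc ctg gcc
tgg gtg tgg gtg cgc ggc ctg gac gag ctg tac gcc gag tgg tcg gag
gtc gtg tcc acg aac ttc cgg gac gcc tcc ggg ccg gcc atg acc gag
atc ggc gag cag ccg tgg ggg cgg gag ttc gcc ctg cgc gac ccg gcc
ggc aac tgc gtg cac ttc gtg gcc gag gag cag gac tga
""".split())


def translate(cds):
    """Translate a coding sequence codon by codon, stopping at the first stop codon.

    The first codon is reported literally from the table (for example V for gtg);
    translation initiation itself is not modelled.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3].lower()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def with_start_codon(cds, start_codon):
    """Return the coding sequence with its first codon replaced by one of the listed start codons."""
    start_codon = start_codon.lower()
    if start_codon not in ALTERNATIVE_START_CODONS:
        raise ValueError("start codon is not one of: " + ", ".join(ALTERNATIVE_START_CODONS))
    return start_codon + cds[3:]


if __name__ == "__main__":
    print(len(ZEOCIN_CDS), "nt ->", len(translate(ZEOCIN_CDS)), "aa")
    print(with_start_codon(ZEOCIN_CDS, "GTG")[:12])
```

Only the first codon is changed by `with_start_codon`, so every residue after the start position is the same as in the wild-type translation. The same check can be run on any other CDS entry in the listing by pasting in its nucleotide lines.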

Claims (15)

  1. A DNA molecule comprising a polycistronic transcription unit comprising at least one coding sequence for each of:
    i) a polypeptide of interest, and
    ii) a selectable marker polypeptide functional in a eukaryotic host cell,
    wherein the polypeptide of interest has a translation initiation sequence that is independent of the translation initiation sequence of the selectable marker polypeptide,
    wherein the at least one coding sequence for the polypeptide of interest is located upstream of the at least one coding sequence for the selectable marker polypeptide in the polycistronic transcription unit,
    wherein an Internal Ribosome Entry Site (IRES) is located downstream of at least one coding sequence for a polypeptide of interest and upstream of at least one coding sequence for a selectable marker polypeptide, and
    characterized in that the coding sequence encoding the selectable marker polypeptide comprises a translation initiation sequence selected from the group consisting of:
    a) a GTG start codon;
    b) a TTG start codon;
    c) a CTG start codon;
    d) an ATT start codon; and
    e) an ACG start codon.
  2. The DNA molecule of claim 1, wherein the translation initiation sequence of the selectable marker polypeptide comprises a GTG start codon or a TTG start codon.
  3. The DNA molecule of claim 1 or 2, wherein the selectable marker polypeptide provides resistance to the lethal or growth-inhibitory effects of a selection agent.
  4. The DNA molecule of claim 3, wherein the selection agent is selected from the group consisting of: zeocin, puromycin, blasticidin, hygromycin, neomycin, methotrexate, methionine sulphoximine and kanamycin.
  5. The DNA molecule of claim 3, wherein said selection agent is zeocin.
  6. The DNA molecule of claim 1 or 2, wherein the selectable marker polypeptide is a 5,6,7,8-tetrahydrofolate synthesizing enzyme.
  7. The DNA molecule of claim 1 or 2, wherein said polycistronic transcription unit further comprises a sequence encoding a second selectable marker polypeptide functional in eukaryotic cells, wherein said sequence encoding a second selectable marker polypeptide:
    a) has a translation initiation sequence which is independent of the translation initiation sequence of the polypeptide of interest,
    b) is located upstream of said sequence encoding the polypeptide of interest,
    c) has no ATG sequence on the coding strand after the start codon of said second selectable marker polypeptide up to the start codon of the polypeptide of interest, and
    d) has a GTG start codon or a TTG start codon.
  8. An expression cassette comprising a DNA molecule according to any one of claims 1 to 7, said expression cassette comprising a promoter upstream of said polycistronic transcription unit and a transcription termination sequence downstream of said polycistronic transcription unit, wherein said expression cassette is functional in a eukaryotic host cell and is capable of initiating transcription of the polycistronic transcription unit.
  9. The expression cassette of claim 8, further comprising at least one chromatin control element selected from the group consisting of: a matrix or scaffold attachment region, an insulator sequence, a ubiquitous chromatin opening element, and an anti-repressor sequence.
  10. A host cell comprising the DNA molecule of any one of claims 1 to 7 or the expression cassette of any one of claims 8 to 9, wherein the host cell is a mammalian cell.
  11. The host cell of claim 10, wherein the cell is a CHO cell.
  12. A method of producing a host cell capable of expressing a polypeptide of interest, the method comprising:
    a) introducing a DNA molecule according to any one of claims 1 to 7 or an expression cassette according to any one of claims 8 to 9 into a plurality of precursor cells;
    b) culturing the plurality of precursor cells under conditions suitable for expression of the selectable marker polypeptide; and
    c) selecting at least one host cell expressing the polypeptide of interest.
  13. A method of expressing a polypeptide of interest, comprising culturing a host cell comprising the expression cassette of any one of claims 8 to 9 and expressing the polypeptide of interest from the expression cassette.
  14. The method of claim 13, further comprising harvesting the polypeptide of interest.
  15. The method of claim 13 or 14, wherein the host cell is a CHO cell having a dhfr⁻ phenotype, and wherein the expression cassette comprises a coding sequence for a selectable marker polypeptide which is a 5,6,7,8-tetrahydrofolate synthetase, wherein the cell is cultured in a medium comprising folate, which medium is free of hypoxanthine and thymidine.
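
The structural features recited in claims 1 and 7 can be illustrated with a minimal Python sketch. It is not part of the claims and is not a tool for assessing claim scope. All function names, the element labels "POI", "IRES" and "MARKER", and the toy sequences in the usage note are hypothetical. Sequences are assumed to be given as the coding strand, 5' to 3'.

```python
# Illustrative sketch only: names and element labels are hypothetical.

# Start codons listed as options a) to e) of claim 1.
NON_ATG_STARTS = {"GTG", "TTG", "CTG", "ATT", "ACG"}


def marker_start_is_listed(marker_cds):
    """Return True if the marker coding sequence begins with a listed start codon."""
    return marker_cds[:3].upper() in NON_ATG_STARTS


def order_matches_claim_1(elements):
    """Return True if the polypeptide of interest precedes the IRES,
    and the IRES precedes the selectable marker, in the transcription unit."""
    try:
        poi = elements.index("POI")
        ires = elements.index("IRES")
        marker = elements.index("MARKER")
    except ValueError:
        # One of the three elements is missing from the unit.
        return False
    return poi < ires < marker


def has_atg_after_start(region):
    """Claim 7 c): scan every frame for ATG after the first three bases.

    `region` is assumed to run from the start codon of the upstream marker
    up to, but not including, the start codon of the polypeptide of interest.
    """
    return "ATG" in region[3:].upper()
```

For example, `marker_start_is_listed("GTGGCCAAG")` is true because the sequence opens with GTG, while the same sequence opening with ATG would fail the check. The ATG scan covers all reading frames, since claim 7 c) refers to any ATG sequence on the coding strand rather than to in-frame codons only.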
HK09107655.2A 2006-02-21 2007-02-21 Selection of host cells expressing protein at high levels HK1128490B (en)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
US11/359,953 2006-02-21
US11/359,953 US20060141577A1 (en) 2004-11-08 2006-02-21 Selection of host cells expressing protein at high levels
EP06113354.2 2006-05-02
US11/416,490 2006-05-02
US11/416,490 US20060195935A1 (en) 2004-11-08 2006-05-02 Selection of host cells expressing protein at high levels
EP06113354 2006-05-02
PCT/EP2007/051696 WO2007096399A2 (en) 2006-02-21 2007-02-21 Selection of host cells expressing protein at high levels

Publications (2)

Publication Number Publication Date
HK1128490A1 (en) 2009-10-30
HK1128490B (en) 2013-07-05

