WO2001053529A9 - Rapid determination of gene structure using cDNA sequence - Google Patents
Rapid determination of gene structure using cDNA sequence
- Publication number
- WO2001053529A9 (PCT/US2001/001461)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequence
- primers
- cdna
- gene
- seq
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Definitions
- eukaryotic genes comprise sequences (exons) destined to be part of the mature RNA interrupted by sequences that are not destined to be part of the mature RNA. Such interrupting sequences are known as intervening sequences or introns.
- the exons comprise coding sequence and 5' regulatory sequence.
- the combination of coding sequence and introns is transcribed into a primary RNA transcript.
- Genes also comprise non-coding sequence 5' of the transcribed region; such upstream regions are known as enhancers and promoters.
- genomic sequence that is not present in the mature RNA product be it mRNA, rRNA or tRNA comprises enhancer and promoter sequences 5' of the translated region as well as introns interspersed within the translated region.
- mRNA messenger RNA
- Primary mRNA is processed into mature mRNA by 5' capping, removal of intervening sequences, and addition of a polyA tail on the 3' terminus of the mRNA.
- the human genome as well as those of most other mammals is in the range of 3 × 10^9 base pairs.
- the average size of a gene or primary transcript is 16.6 kilobase pairs, of which 2.2 kilobase pairs is the average size of the mature mRNA. Therefore, non-coding regions make up the vast majority of the size of genes (about 87%).
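The figure of about 87% follows directly from the two averages quoted above. A minimal Python check is sketched below; the variable names are illustrative and do not come from the source.

```python
# Average sizes quoted in the text, in kilobase pairs.
primary_transcript_kb = 16.6
mature_mrna_kb = 2.2

# Fraction of the average gene that is absent from the mature mRNA.
non_coding_fraction = 1 - mature_mrna_kb / primary_transcript_kb
print(f"{non_coding_fraction:.0%}")
```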
- the study of allelic variations comprises more than the study of variations within the exons and requires the information present in the genomic version of the gene of interest, information that does not ultimately end up in the mRNA or final RNA product. Typically, this information is available only when the complete sequence of a chromosomal copy of a gene of interest is obtained. Therefore, not all sequence pertinent to gene structure and phenotypic variation is available in cDNA or EST sequence, because these sequences are derived from mature transcribed copies of the genes where introns have been removed.
- a typical method for obtaining the desired genetic information comprises cloning and sequencing the entire chromosomal copy of a gene of interest. This method is very costly and time consuming and involves sequencing many thousand kilobases of DNA in order to obtain enough sequence coverage to assemble a given gene.
- the present invention is drawn to a method of determining gene structure, including boundaries between exons and introns of a gene and between 5' or 3' termini of mature RNA transcripts and the adjacent genomic sequence, including intron termini, 5' and 3' untranslated regions (UTR), and promoter and enhancer sequence.
- gene structure refers to the order of exons and introns in the chromosomal copy of a gene as well as about 50 to about 300 nucleotides of sequence 5' and 3' of each exon terminus.
- non-exon regions refers to 5' untranscribed regions of the gene, 3' untranscribed regions of the gene and introns.
- the present invention further provides genomic sequence 5' and 3' of the mature RNA termini, as well as sequence of 5' and 3' ends of introns. Furthermore, the sequence provided herein can be used to obtain additional sequence 5' and 3' of the mature RNA termini as well as additional intron sequence if desired, e.g. using primer walking with sequence obtained by the present method.
- regions of a chromosomal copy of a gene or fragments thereof are sequenced using a set of primers.
- the sequence of the mature transcript is known.
- the primers cover both strands of the cDNA, at evenly spaced or similarly spaced intervals.
- the present invention provides information necessary to determine gene structure and phenotypic expression without the need to sequence the entire chromosomal copy of the gene or fragment thereof. As a result of the method of the present invention, gene structure can be determined without the need to sequence the entire gene.
- the present invention is useful, for example, in germ line sequence variation analysis.
- the method of the present invention is drawn to determining gene structure, where at least some portion of the genomic sequence of the gene of interest is unknown.
- the method involves sequencing the gene across exon-intron boundaries using evenly spaced primers, or tiled primers.
- the tiled primers comprise nucleic acids that hybridize to the known cDNA sequence of the gene at about 100 to about 300 base intervals and the gene comprises the template.
- the present invention is drawn to a method of determining boundaries between at least one exon and at least one non-exon of a gene.
- the method comprises the steps of conducting one or more sequencing reactions, comprising a template and a primer or set of primers.
- the template comprises a gene or fragment thereof and the primer or set thereof comprises at least one oligonucleotide, wherein said oligonucleotide hybridizes to the cDNA encoded by said gene or fragment thereof and wherein said cDNA has known sequence.
- the set of primers of the present invention comprises oligonucleotides that hybridize to the coding and non-coding strand of said cDNA.
- sequence obtained as described above is compared with the known sequence of said cDNA, thereby determining the boundaries between the sequence corresponding to exons (cDNA) and the sequence corresponding to non-exons, wherein sequence obtained as described above that is not within the sequence of the cDNA is non-exon sequence.
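The comparison step can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure used in the patent: the function name `find_boundary`, its parameters, and the toy sequences are hypothetical, the read is assumed to be error-free, and matching is exact. A real read would need an alignment that tolerates sequencing errors, and an intron whose first bases happen to match the next exon would shift the reported boundary.

```python
def find_boundary(read: str, cdna: str, primer_end: int) -> tuple[int, str]:
    """Return (cDNA index where the read leaves the cDNA, non-exon part of the read).

    read       -- sequence obtained from the genomic template, starting
                  immediately 3' of the primer (hypothetical input)
    cdna       -- known cDNA sequence, same strand as the read
    primer_end -- 0-based cDNA index of the first base after the primer
    """
    expected = cdna[primer_end:]
    matched = 0
    for read_base, cdna_base in zip(read, expected):
        if read_base != cdna_base:
            break  # first base that is not in the cDNA: non-exon sequence begins
        matched += 1
    return primer_end + matched, read[matched:]


# Toy data (not real gene sequence): two exons separated by one intron.
exon1, intron, exon2 = "ATGGCCAAG", "GTAAGTCCTTTTCTCTCTTTCAG", "CTGTTCGAA"
cdna = exon1 + exon2
genomic = exon1 + intron + exon2

# A primer covering cDNA bases 0-2 yields a read that starts at genomic index 3.
read = genomic[3:17]
print(find_boundary(read, cdna, primer_end=3))
```

If the read matches the cDNA over its whole length, the second element of the result is empty, meaning no boundary was crossed within that read.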
- the present invention has several advantages.
- the present invention does not require prior knowledge of "genomic sequence" including boundaries between exon and non-exon sequence, nor knowledge of any sequence within the non-exon regions.
- the present invention requires much less work, and therefore saves time and money compared with traditional methods of determining gene structure, because the entire chromosomal copy of a gene need not be sequenced.
- the cost of sequencing the entire chromosomal copy would be at least 20 times more than that of the method of the present invention.
- the 150 kb BAC clone contains coding sequence for a 2 kb cDNA
- the method of the present invention could provide the gene structure from 37 sequencing reactions using 30 primers. This includes 20 primers designed for a first round of sequencing reactions where the primers hybridize at 200 base intervals on both strands of the cDNA.
- This estimate also includes a 25% failure rate in first round sequencing reactions, such that 5 sequencing reactions must be repeated, as well as a 50% failure rate of primers, such that 10 new primers must be synthesized and used in sequencing reactions to fill in any gaps.
- the gene structure can be determined using the method of the present invention in two rounds of sequencing with a total of 25 primers and 25 sequencing reactions.
- One of ordinary skill in the art can readily determine if and when additional primers need to be designed for additional rounds of sequencing and how to design the additional primers.
- to sequence the entire 150 kb BAC clone if each sequencing reaction yields 500 bases of sequence, a minimum of 300 sequencing reactions must be conducted with 300 primers. The time involved to sequence the entire BAC clone is also an important factor and is estimated at 2 months in contrast to the estimated two weeks required in the present invention.
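The primer counts and the BAC minimum quoted above can be reproduced with simple arithmetic. The sketch below uses illustrative variable names and only the figures given in the text. It does not attempt to reproduce the stated total of 37 reactions or the cost factor, because the text does not spell out how those were derived.

```python
import math

# cDNA-primed method: 2 kb cDNA, primers every 200 bases on both strands.
cdna_length_bp = 2_000
primer_spacing_bp = 200
strands = 2
first_round_primers = (cdna_length_bp // primer_spacing_bp) * strands

# Failure rates quoted in the text.
repeated_reactions = round(first_round_primers * 0.25)  # 25% of reactions repeated
new_primers = round(first_round_primers * 0.50)         # 50% of primers replaced
total_primers = first_round_primers + new_primers

# Whole-BAC sequencing: 150 kb clone, 500 bases per reaction, single coverage.
bac_length_bp = 150_000
read_length_bp = 500
bac_min_reactions = math.ceil(bac_length_bp / read_length_bp)

print(first_round_primers, repeated_reactions, total_primers, bac_min_reactions)
```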
- the present invention is also drawn to human cytochrome P450 2C19 sequence. More particularly, the present invention is drawn to SEQ ID NOS: 59, 61, 63, 65, 67, 71, 73, 75, 77, 79, 81, 84, 86, 89 and 91.
- Figure 1 is a schematic diagram of the present invention.
- Figure 2 is a schematic diagram of the hybridization pattern of primers of Tables III and IV with the p53 cDNA, SEQ ID NO: 96.
- Figure 3 is an alignment of primers on the P450 2C19 cDNA, SEQ ID NO: 58.
- Figure 4 shows a sequence obtained using the P450 2C19 gene as template and the cDNA specific primers according to Example II.
- Figure 5 is the gene structure of human P450 2C19 as determined by the present invention in the form of a composite sequence, SEQ ID NOS: 59 and 97, where the underlined sequence is novel sequence and the primer hybridization sites and starting ATG are boxed.
- Figure 6 is a schematic diagram of the human P450 2C19 gene.
- the term “gene” refers to a contiguous stretch of deoxynucleotides comprising the basic unit of heredity of an organism, encoding a given protein or RNA.
- the terms “gene” and also “genomic DNA” comprise one or more exon or part thereof, one or more intron or part thereof, all or a portion of the 5' untranslated region, and all or a portion of the 3' untranslated region.
- the term “gene structure” includes the coding regions or exons together with the exon-intron boundaries with at least 50 nucleotides of sequence of all intron termini as well as 5' and 3' UTR.
- the gene structure as determined by the present invention can also include promoter and enhancer sequences.
- cDNA refers to complementary DNA of an mRNA molecule.
- cDNA can represent the complete mRNA or a fragment thereof.
- RNA product can be mRNA, tRNA, rRNA or other structural RNA.
- polymorphism is an allelic variation in nucleic acid sequence between two or more samples.
- polymorphisms can be, for example, restriction fragment length polymorphism (RFLP), a variation in DNA sequence that alters the length of a restriction fragment (Botstein et al., Am. J. Hum. Genet. 32, 314-331 (1980)).
- RFLP restriction fragment length polymorphism
- Other polymorphisms include short tandem repeats (STRs) that include tandem di-, tri- and tetra-nucleotide repeated motifs. These tandem repeats are also referred to as variable number tandem repeat (VNTR) polymorphisms.
- VNTRs have been used in identity and paternity analysis (US 5,075,217; Armour et al, FEBS Lett. 307, 113-115 (1992); Horn et al, WO 91/14003; Jeffreys, EP 370,719), and in a large number of genetic mapping studies.
- Other polymorphisms include single nucleotide variations between individuals of the same species. Such polymorphisms are far more frequent than RFLPs, STRs and VNTRs.
- SNP single nucleotide polymorphisms
- cSNP protein-coding sequences
- genes in which polymorphisms within coding sequences give rise to genetic disease include β-globin (sickle cell anemia), apoE4 (Alzheimer's Disease), Factor V Leiden (thrombosis), and CFTR (cystic fibrosis).
- cSNPs can alter the codon sequence of the gene and therefore specify an alternative amino acid.
- sequences provide different levels of information regarding the structure of the gene of interest and the variations of the gene sequence that affect phenotype in the organism from which the sequence is derived.
- the term "variation" or "polymorphism" implies that more than one version of the gene has been sequenced for comparison.
- Information on allelic variation is sometimes available for chromosomal copies of genes, if more than one example of a chromosomal copy has been sequenced, though often only one version is sequenced.
- Information on allelic variation for cDNA (and thus only the coding portion of a gene) is also sometimes available if more than one version has been sequenced.
- this information clearly does not include any information for example from regions upstream or downstream of the mature RNA nor the introns and therefore does not provide complete information of the gene structure nor the expression phenotype of a gene, where expression phenotype includes both expression level and structure of the gene product.
- information on allelic variation is also likely to be available for EST sequences, as it is possible that more than one example of a given EST has been sequenced.
- EST sequence does not provide information from, for example, regions upstream or downstream of the mature RNA, nor from the introns.
- the currently available sequence information does not readily provide complete information on phenotypic allelic variation or where phenotypic variation could be available, the information is incomplete and lacks genetic structure information.
- Typical methods for determining complete or additional gene structure include generating PCR products based in part on known gene structure (Shiinoki et al, Metabolism, 48:581-584 (1999)) or sequencing PCR products wherein one primer is derived from cDNA sequence and the other primer is derived from Alu repetitive element sequence (Monani and Burgess, Genome Res. 6:1200-1206 (1996)).
- These methods have the disadvantage of requiring some prior knowledge of the gene structure and the additional step of PCR amplification of portions of the genomic sequence. Therefore, a rapid, cost effective method is needed to determine the useful sequence of a chromosomal copy of a gene of interest, for gene structure determination wherein prior knowledge of the gene structure and/or the sequencing of the entire nontranscribed portions of the gene are not required.
- the method of the present invention provides gene structure, wherein gene structure includes the coding regions or exons together with the exon-intron boundaries (a point on a line that separates two regions) with at least about 50 nucleotides of sequence of all intron termini determined as well as boundaries between the mature transcript and 5' and 3' UTR and at least about 50 nucleotides of sequence of the 5' and 3' UTR adjacent to the mature transcript.
- two regions separated by a boundary are adjacent or contiguous in the genomic copy of the gene of interest.
- the present invention is drawn to a method of determining boundaries between at least one exon and at least one non-exon (where non-exon includes introns as well as sequence 5' and 3' of that which ultimately becomes the mature RNA sequence) region of a gene.
- boundary, therefore, refers to the junction between exon and non-exon sequence.
- sequence refers to the arrangement of specific nucleotides within the specified polynucleic acid.
- exon refers to a segment or region of nucleotides within a eukaryotic gene that is retained in the mature RNA transcript such as mRNA, tRNA and rRNA.
- exons comprise coding sequences that encode part of the final gene product, and regulatory sequences, such as leader sequences.
- leader sequence refers to nucleotide sequence at the 5' end of a gene of interest that is transcribed but is not part of the final gene product.
- the method of the present invention comprises the steps of conducting one or more sequencing reactions, comprising a template and a primer or set of primers.
- the template comprises a gene or fragment thereof (e.g.
- the primer, or set thereof comprises one or more oligonucleotides, wherein said oligonucleotides hybridize to the cDNA or RNA product of interest, wherein said cDNA or RNA product has known sequence.
- the primers of the present invention comprise one or a set of oligonucleotides that hybridize to the coding and non-coding strand of said cDNA or to said RNA product.
- the primers are hybridized to the gene of interest or fragment thereof and used to prime template dependent nucleic acid polymerization. Sequence obtained is compared with the known sequence of said cDNA or RNA product. Sequence obtained that is not within the sequence of the cDNA or RNA product reveals the boundaries between said exons and said non-exon regions and reveals non-exon sequence.
- One of ordinary skill in the art can readily assemble the sequence and boundary information thus obtained to generate the gene structure of said gene of interest.
- cDNA or RNA sequence of interest can be obtained from commercial or public databases, such as GenBank.
- cDNA or RNA can be obtained by standard laboratory protocols, such as those described in Chapters 7 and 8 in Molecular Cloning, a Laboratory Manual (Sambrook et al, Cold Spring Harbor Laboratory Press, (1989)).
- One of ordinary skill in the art would readily be able to either construct the necessary cDNA libraries and/or screen libraries for the desired cDNA or RNA using standard laboratory techniques.
- libraries can be screened using antibodies specific for the encoded protein of interest or oligonucleotide probes that hybridize the cDNA or RNA of interest as described in Chapter 12 of Sambrook et al.
- one of ordinary skill in the art can readily obtain a sequence of the cDNA or RNA from commercial sequencing companies, using commercial sequencing apparatus, or by following standard laboratory techniques, such as those provided in Chapter 13 of Sambrook et al.
- primers suitable for the methods described herein can be designed and produced using techniques well-known to those of skill in the art.
- the term "primer" refers to an oligonucleotide suitable for the purpose of initiating template dependent nucleic acid synthesis.
- Said primer can comprise, for example, deoxyribonucleotides.
- the primers are about 5 to about 50 nucleotides in length.
- the primer is about 20 nucleotides in length.
- the primers have a Tm of about 42 to about 55°C. The primers do not have to be exactly complementary to the cDNA, as long as they specifically hybridize to one location of the template to be sequenced.
- primer picking programs can be used, such as "Oligo 5.0" (MedProbe AS, Norway).
- the term “set of primers” comprises one or more primers, such that the primers hybridize to a polynucleotide strand of interest.
- the primers hybridize to the polynucleotide of interest at discrete intervals.
- the primers hybridize to the polynucleotide of interest at intervals of about 50 to about 500 nucleotides.
- the primers hybridize at intervals of about 100 to about 300 nucleotides. In still another embodiment, the primers hybridize at intervals of about 100 to about 200 nucleotides. In another embodiment of the present invention, the primers hybridize to said cDNA at evenly spaced intervals. In another embodiment of the present invention, the set of primers hybridizes at similarly or evenly spaced intervals on both strands of a double-stranded polynucleotide of interest. Primers that hybridize with said cDNA or RNA product at similarly or evenly spaced intervals are referred to herein as "even spaced" or "tiled primers".
- the primers are designed such that sequence information generated from one primer extends at least until the 5' terminus of the next downstream primer, if there is no intervening sequence. In this way, no intervening sequences are missed by this method.
- primers could be designed such that they hybridize at about nucleotides 1-20, 120-140, 240-260, 360-380, 480-500, 600-620, 720-740, 840-860 and 960-980 of one strand and bases 1000-980, 880-860, 760-740, 640-620, 520-500, 400-380, 280-260, 160-140 and 40-20 of the opposite strand.
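The tiling pattern in this example can be generated programmatically. The sketch below is an illustration only: `tile_primers` and `reverse_complement` are hypothetical helper names, the 20-nucleotide primer length and 120-base interval are taken from the example above, and the 1000-nucleotide input is a placeholder rather than a real cDNA. It does not check Tm or specificity, which a primer-picking program would do.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def tile_primers(cdna: str, primer_len: int = 20, interval: int = 120):
    """Return (strand, 5' position, 3' position, primer) tuples.

    Positions are 1-based cDNA coordinates listed in the primer's own
    5'->3' direction, so opposite-strand primers run from high to low.
    """
    primers = []
    # Primers on the cDNA strand, walking from the 5' end.
    for start in range(0, len(cdna) - primer_len + 1, interval):
        site = cdna[start:start + primer_len]
        primers.append(("+", start + 1, start + primer_len, site))
    # Primers on the opposite strand, walking back from the 3' end.
    for end in range(len(cdna), primer_len - 1, -interval):
        site = cdna[end - primer_len:end]
        primers.append(("-", end, end - primer_len + 1, reverse_complement(site)))
    return primers


cdna = "ACGT" * 250  # placeholder 1000-nt sequence, not a real cDNA
for strand, five_prime, three_prime, _ in tile_primers(cdna):
    print(strand, five_prime, three_prime)
```

With these settings the sketch yields nine primers per strand, at positions close to those listed in the example (the text gives approximate coordinates).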
- One of ordinary skill in the art can optimize the number of primers necessary based on known information about the cDNA. To further reduce costs for example, the number of primers can be reduced. For example, domains of the cDNA that are thought to be typically encoded by one exon or a known number of exons in a given pattern may not require multiple internal primers. In another example, the gene structure of the cDNA may be known for another organism. Therefore, primers can be designed to hybridize near putative boundaries.
- the primers are used to prime sequencing reactions of at least one template comprising all or a portion of the gene of interest.
- the present invention can be used with any nucleic acid sequence from eukaryotic, archaebacterial or viral sources wherein regions of said sequence have been processed, e.g. joined together by excising intervening sequences present in the original parent molecule.
- the primers are designed using the processed molecule and the template is the original parent molecule.
- the eukaryotic source comprises fungal, plant, mammalian and non-mammalian sources.
- the gene or fragment thereof to be used as template can be isolated from any tissue, fluid or extract from an organism comprising said polynucleic acid of interest.
- said polynucleic acid of interest can be derived from libraries in the form of artificial chromosome libraries.
- libraries contain chromosomal DNA in excess of 100 kilobases in length.
- libraries can be yeast artificial chromosome (YAC), bacterial artificial chromosome (BAC) or P1 artificial chromosome (PAC) libraries.
- BAC and PAC libraries are especially useful because these are bacterial plasmid-based vectors that can be easily isolated, manipulated and amplified.
- Such libraries are well known in the art and commercially available.
- one of ordinary skill in the art can isolate the template as described in Example 2. Templates of various lengths can be used. Uncloned genomic DNA can be used as template.
- the template is about 10 to about 500 kilobases in length and about 250 nanograms to 2.5 micrograms is used.
- the method of the present invention is useful to determine the boundaries between regions of nucleic acid that were separated by intervening sequence wherein said intervening sequence has been removed. For example, cDNA can be analyzed, wherein the boundaries between the exons comprising the cDNA and the introns present in the gene are determined. In addition, the method of the present invention is useful for the determination of boundaries present in genes containing group 1 type introns such as Tetrahymena rRNA, where self-splicing occurs in the presence of guanosine cofactor.
- the method of the present invention provides sequence extending into the non-exon regions of the gene of interest. In one embodiment, the present invention provides sequence information of the promoter and enhancer upstream of the 5' UTR of the cDNA. In one embodiment, the present invention provides sequence upstream of the 5' most exon wherein the 5' most exon is up to about 500 base pairs before the transcription initiation site. In another embodiment of the present invention, sequence upstream of the transcription initiation site is provided, comprising the promoter of the gene of interest.
- the sequencing reactions can be conducted simultaneously in a multiplex assay so long as the sequence information can be unambiguously assigned to a given primer.
- the non-exon regions comprise sequence upstream and downstream of the mature RNA, as well as intron sequence. It is well known in the art that eukaryotic gene structure comprises promoter and enhancer sequences 5' to the coding sequence, followed by a terminator sequence on the 3' side of the coding sequence.
- eukaryotic genes are transcribed into a primary RNA transcript which comprises untranslated region (5' UTR) with introns upstream of the start codon or ATG, followed by the coding sequence, interrupted by introns, followed by a stop codon such as TAA, followed by 3' untranslated region (3' UTR) and ending in polyA tail.
- Said primary RNA transcripts are also referred to herein as "pre-mRNA" and as "heterogeneous nuclear RNA" or "hnRNA".
- Introns if present, are removed from the 5' UTR and from the coding sequence to generate a mature transcript.
- the intronic sequences of a gene generally do not contain sequence useful in the removal of introns except for near the 5' and 3' termini (e.g. within 50 bases of the boundary).
- the 5' and 3' termini of the intron sequences contain the donor and the acceptor sites for splicing or removal of the introns. These sites are known to contain consensus sequences that are required by the splicing machinery of the cell to properly excise the intron sequences in order to generate mature RNA product such as mRNA. Mutations to such consensus sequences prevent the accurate removal of introns. Therefore, not only are the sequences of the exons important, but the sequences of the consensus sequences within the introns are also important.
- Donor consensus sequence for example, comprises SEQ ID NO: 1, AGGTAAGT, wherein the first two nucleotides, AG, are present within the exon and the last six nucleotides, GTAAGT, are present within the intron.
- GTAAGT last six nucleotides
- the 3' terminus of an intron comprises the sequence 12Py NCAGN, wherein 12Py stands for 12 pyrimidine bases and N stands for A, G, C or T and wherein the last nucleotide is present in the exon and the remaining nucleotides are present at the 3' terminus of the intron.
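The donor and acceptor patterns described above can be expressed as simple string patterns. The sketch below is illustrative only: the function names and the toy sequence are hypothetical, and it matches the consensus exactly, whereas real splice sites are degenerate and are usually scored rather than matched literally.

```python
import re

DONOR = "AGGTAAGT"  # SEQ ID NO: 1: AG in the exon, GTAAGT in the intron
# 12 pyrimidines, any base, CAG, then one base that belongs to the next exon.
ACCEPTOR = re.compile(r"[CT]{12}[ACGT]CAG[ACGT]")


def donor_boundary(seq: str):
    """0-based index of the first intron base for an exact donor match, else None."""
    i = seq.find(DONOR)
    return None if i < 0 else i + 2


def acceptor_boundary(seq: str):
    """0-based index of the first exon base after an acceptor match, else None."""
    m = ACCEPTOR.search(seq)
    return None if m is None else m.end() - 1


# Toy sequence (not real gene sequence): exon, intron, exon.
exon1 = "CCAAG"
intron = "GTAAGT" + "GGGA" + "TTCTCTCTTTCC" + "A" + "CAG"
exon2 = "GTTC"
genomic = exon1 + intron + exon2

print(donor_boundary(genomic), acceptor_boundary(genomic))
```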
- the method of the present invention provides this sequence without the need to sequence the entire intron or the entire gene.
- 18-24 nucleotides upstream of the 3' splice site within the intron comprises a "branch site.”
- This branch site is another consensus site necessary for the proper removal of the intron.
- this consensus sequence is highly conserved and comprises SEQ ID NO: 2, UACUAAC.
- other eukaryotic branch site consensus sequences are not highly conserved and comprise a sequence of 7 nucleotides in length having a sequence according to Table III.
- the adenosine residue at position 6 in the branch point sequence is required for proper intron removal.
- the adenine at position 6 is the site at which the lariat between the 5' end of the intron and the internal portion of the intron is formed through a 5'-2' phosphodiester bond.
- the method of the present invention also provides this sequence without having to sequence the entire intron or the entire gene.
- the present invention also provides sequence present on the 3' side of the mature RNA of interest.
- the method of the present invention provides at least about 50 nucleotides of sequence from each primer. Therefore, if the primer hybridizes near the boundary between an exon and non-exon, then at least about 50 nucleotides of non-exon sequence is provided. This sequence is sufficient in length to define genomic consensus sequences that are required for transcription, proper removal of introns and translation of said gene to generate functional gene product.
- Example 1: p53 In Silico Experiment. p53 was chosen as an in silico test for the present invention.
- the cDNA of p53 is approximately 1.3 kilobases in length.
- Primers were designed using primer design software (Oligo 5.0, MedProbe AS, Oslo, Norway). The parameters used for the software were: 50 mM monovalent salt, and the Tm was chosen to be between 42 and 55°C. Oligonucleotides of 20 bases in length were generated using both the coding and the non-coding strand of p53 cDNA.
- oligos that were separated on the respective strand of cDNA by about 90 to about 195 nucleotides were chosen for further analysis (Tables IV and V).
- Oligos were user-defined.
- the selected primers were aligned on the genomic sequence of p53 as shown in Figure 2.
- Each of primers 1-5, 7 and 9 from Table IV hybridized completely within an exon.
- Primers 6 and 8 hybridized at an intron/exon boundary and are therefore not expected to result in a successful sequencing reaction. It can readily be seen by one of ordinary skill in the art that sequencing reactions using these primers and a genomic copy of p53 reveal all intron exon boundaries and all useful intronic sequence.
- relevant sequence information from the p53 gene is extracted from about 350 bases of sequence information, including exon intron boundaries, enhancer, promoter and intron consensus sequences. When added to the sequence of the cDNA (1.3 kilobases), the complete gene structure and relevant sequence information for phenotypic allelic variations is obtained.
- Example 2: Boundary Determination of Human Cytochrome P450 2C19. Screening and Isolation of a Bacterial Artificial Chromosome encoding the Human Cytochrome P450 2C19 Gene (CYP450 2C19 gene).
- Four members of the Cytochrome P450 2C subfamily are known: CYP450 2C8, '2C9, '2C18 and '2C19 (Ieiri and Higuchi, J. Toxicol. Sci. 23:129-131, (1998)).
- the CYP450 2C19 gene is flanked by two other members of the CYP450 2C family, CYP450 2C18 and CYP450 2C9 (Gray, et al, Genomics, 28:328-332 (1995)).
- gene specific primers were designed such that amplicons would be generated from the 5' end, the middle and the 3' end of the coding region (Table VI). For an amplicon from the putative boundary between intron 4 and exon 5, primers were taken as published in the partial gene structure (de Morais, et al, Mol. Pharmacol, 46:594-598 (1994)).
- the primers were used for primary PCR screening of 48 human BAC DNA pools from Research Genetics (Huntsville, Alabama). PCR reactions were carried out using 100 µM dNTP, 1.5 mM units AmplitaqTM (PE BioSystems, Foster City, California) in a final volume of 14 µl. Cycling conditions were as follows: 94°C for 2 minutes, then 35 cycles of 94°C for 30 seconds, 30 seconds at the appropriate annealing temperature (Tm, Table VI) and 45 seconds at 72°C, followed by a final extension at 72°C for 7 minutes. For each primer, the positive pools from the primary screening were subjected to secondary screening. For a secondary screening, each pool was split into 48 samples, which consisted of 10 plate pools, 14 row pools and 24 column pools.
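As a quick sanity check on the screening protocol, the programmed hold time of the cycling profile and the composition of the secondary pools can be tallied. The sketch below uses illustrative variable names and only the figures given above. Ramp times between temperatures are not included, so actual run time on a thermal cycler would be longer.

```python
# Hold times from the cycling conditions, in seconds.
initial_denaturation_s = 2 * 60
cycles = 35
per_cycle_s = 30 + 30 + 45  # denature + anneal + extend
final_extension_s = 7 * 60

total_seconds = initial_denaturation_s + cycles * per_cycle_s + final_extension_s
print(f"programmed hold time: {total_seconds / 60:.2f} min")

# Secondary screening: each positive pool is split into plate, row and column pools.
plate_pools, row_pools, column_pools = 10, 14, 24
secondary_samples = plate_pools + row_pools + column_pools
print(f"secondary samples per pool: {secondary_samples}")
```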
- BAC-DNA was isolated on a large scale as follows.
- a single BAC colony was picked and inoculated in a starter culture of 5 ml medium (LB medium containing 12.5 µg/ml chloramphenicol). The culture was shaken vigorously at 37°C until the OD 600 nm read between 1.0-1.5 (6-8 hrs). OD 600 should be maintained at 1.0-2.0; however, if the growth exceeds the limit, less pre-culture volume per 500 ml culture can be used.
- the resuspended bacteria were centrifuged at 4500 x g (GSA rotor at 5100 rpm) for 20 min at 4°C.
- Steps 4 to 7 were repeated one time.
- Each bacterial pellet was gently and completely resuspend in 50 ml of ice-cold QiagenTM Buffer PI (containing RNAse A (100 ug/ml) as per Qiagen instructions) (Valencia, California) and incubated for 10 min. at room temperature (e.g. 24.0°C).
- The addition of Buffer P2 is the most critical step for keeping the E. coli contamination low. Buffer P2 must be quickly and completely distributed throughout the cell suspension after its addition.
- the bottle was incubated undisturbed at room temperature for 15 minutes.
- 50 ml ice-cold Buffer P3 was added to each bottle, mixed immediately by gently inverting 4-6 times, and incubated on ice for 30 min.
- the bottles were centrifuged at 20,000 x g (GSA rotor at 11,000 rpm) for 30 min. at 4°C.
- the bottles were VERY GENTLY recovered from the centrifuge without disturbing the pellet.
- the bottles were placed in such a way that they did not move at all while the supernatant was recovered.
- the supernatant was removed promptly using a 25 ml pipette and transferred to a fresh 250 ml bottle.
- the supernatant was re-centrifuged at 20,000 x g for 15 min. at 4°C. The supernatant was promptly removed and kept on ice. The total volume was about 150 ml. Note: Filter through cheesecloth to remove cell debris if necessary.
- Buffer QF Qiagen
- the eluted DNA (20 ml total) was transferred to a 45 ml centrifuge tube.
- the tubes were centrifuged immediately at >15,000 x g (SA 600 rotor at 11,000 rpm) for 30 min. at 4°C. The supernatant was carefully discarded by decanting. This was done as soon as the centrifuge came to a stop. Note: The pellet will be BOTH at the bottom of the tube AND as a streak on the tube wall, so be very gentle. The pellet may become detached.
- the tube was vortexed gently to ensure that most of the DNA was dissolved.
- the tube was spun for 5 min. to collect the solution to the bottom of the tube.
- the tube was left at 4°C overnight to allow for the DNA to completely dissolve.
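The protocol above quotes centrifugation steps both as relative centrifugal force and as rotor speed (for example 4500 x g at 5100 rpm and 20,000 x g at 11,000 rpm for the GSA rotor). As a rough consistency check, the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm² (r in cm) can be sketched in Python as below. The source does not state a rotor radius, so the code only back-calculates the radius implied by each stated pair; the function names are illustrative, and the result is not a rotor specification.

```python
import math

# Standard conversion factor for a radius in centimetres and a speed in rpm.
RCF_FACTOR = 1.118e-5


def rcf_from_rpm(radius_cm, rpm):
    """Relative centrifugal force (x g) at a given radius and rotor speed."""
    return RCF_FACTOR * radius_cm * rpm ** 2


def rpm_from_rcf(radius_cm, rcf):
    """Rotor speed (rpm) needed to reach a given force at a given radius."""
    return math.sqrt(rcf / (RCF_FACTOR * radius_cm))


def implied_radius_cm(rcf, rpm):
    """Radius (cm) at which the stated speed would give the stated force."""
    return rcf / (RCF_FACTOR * rpm ** 2)


if __name__ == "__main__":
    # (force, speed) pairs quoted in the text for the GSA rotor.
    for rcf, rpm in [(4500, 5100), (20000, 11000)]:
        radius = implied_radius_cm(rcf, rpm)
        print(f"{rcf} x g at {rpm} rpm implies r = {radius:.1f} cm")
```

If a different rotor is substituted, `rpm_from_rcf` gives the speed needed to reach the same force at that rotor's radius.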
- the BAC-DNA was directly sequenced as follows, except that 250 nanograms of BAC-DNA template was used instead of 2.5 micrograms of genomic DNA.
- Genomic DNA should be of high quality (e.g., OD 260/280 of 1.7-1.9) and be quantitated accurately, e.g., by fluorometry and by agarose gel electrophoresis. The DNA does not need to be of a certain size.
- the PCR tubes were capped tightly and then quickly spun to collect all the reagents.
- In a thermocycler, the following program was run: 1) 95°C for 5 minutes
- the samples were transferred from the collection tubes into a plate format so that they were easier to load onto the sequencing gel.
- the samples were dried in a plate vacuum centrifuge, applying medium heat and checked after 30 minutes. High temperature was not used. The sample was completely dry before transfer to sequencing gel.
- the plate was quickly spun in a plate centrifuge at up to 800 x g.
- The plate was thoroughly vortexed with a plate shaker for 5 minutes.
- Primers were chosen such that they were spaced approximately 150 bp apart. One set of primers was complementary to the non-coding strand and a second set was complementary to the coding strand of the CYP450 2C19 cDNA sequence (lower, L and upper, U respectively).
- the primers were chosen with the software Oligo 5.0, using the same parameters as described in the p53 in silico experiment.
- additional primers were chosen manually. All primers are listed in Table VIII.
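To illustrate the tiling described above, with primers spaced roughly 150 bp apart on both strands, the following Python sketch steps along a cDNA and emits one primer in each orientation at every position. It is not the Oligo 5.0 selection procedure used in the source: the fixed primer length of 20 nt, the `fwd`/`rev` naming and the toy sequence are assumptions, and no melting-temperature or secondary-structure screening is performed. How these orientations map onto the U and L labels is left to Table VIII.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tile_primers(cdna, spacing=150, primer_len=20):
    """Return (name, start, sequence) tuples for primers placed every `spacing` bases.

    `start` is the 0-based position of the primer window on the cDNA as given.
    """
    primers = []
    starts = range(0, len(cdna) - primer_len + 1, spacing)
    for n, start in enumerate(starts, start=1):
        window = cdna[start:start + primer_len]
        # Same sense as the cDNA sequence as written.
        primers.append((f"fwd{n}", start, window))
        # Opposite sense: reverse complement of the same window.
        primers.append((f"rev{n}", start, reverse_complement(window)))
    return primers


if __name__ == "__main__":
    toy_cdna = "ACGTTGCA" * 50  # 400 nt synthetic sequence, not real data
    for name, start, seq in tile_primers(toy_cdna):
        print(name, start, seq)
```

In practice the window at each position would be shifted or resized until it passes the primer-design criteria, which is why the spacing in the text is only approximate.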
- the non-underlined sequence is provided herein for the first time and includes sequence belonging to 5' and 3' untranslated regions or intronic regions.
- the novel sequences are assigned SEQ ID Nos. as follows.
- Figure 5 shows the 5' and 3' intron and 5' and 3' untranslated sequences provided by the method of the present invention, assembled with the published cDNA sequence (Romkes et al). Published sequence is in capital letters, the ATG start codon is boxed, and positions of the primers are boxed. All newly discovered sequence is in lower case and underlined. Missing intron sequence is shown as a string of underlined "n".
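The letter-case convention of Figure 5 (published cDNA in capitals, newly determined sequence in lower case, missing intron sequence as runs of "n") lends itself to simple machine parsing. The sketch below, in Python, splits an assembled string into labelled runs. The label names and the toy input are assumptions made for illustration; underlining and boxing cannot be represented in plain text and are ignored.

```python
from itertools import groupby


def classify(base):
    """Label one character according to the Figure 5 case convention."""
    if base == "n":
        return "missing"
    return "published" if base.isupper() else "novel"


def segments(assembled):
    """Return (label, start, end) runs, 0-based and end-exclusive."""
    runs = []
    position = 0
    for label, group in groupby(assembled, key=classify):
        length = sum(1 for _ in group)
        runs.append((label, position, position + length))
        position += length
    return runs


if __name__ == "__main__":
    toy = "ccatATGGCAgtaagnnnnttcagGGCTAAtgat"  # synthetic example, not real data
    for label, start, end in segments(toy):
        print(label, start, end, toy[start:end])
```

Each "published" run bounded by "novel" or "missing" runs then corresponds to a candidate exon, with the flanking lower-case sequence marking the newly determined intron or untranslated sequence.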
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP01942674A EP1294943A2 (fr) | 2000-01-20 | 2001-01-17 | Rapid determination of gene structure using cDNA sequence |
| AU29532/01A AU2953201A (en) | 2000-01-20 | 2001-01-17 | Rapid determination of gene structure using cDNA sequence |
| CA002398683A CA2398683A1 (fr) | 2000-01-20 | 2001-01-17 | Rapid determination of gene structure using cDNA sequence |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US48812700A | 2000-01-20 | 2000-01-20 | |
| US09/488,127 | 2000-01-20 |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| WO2001053529A2 (fr) | 2001-07-26 |
| WO2001053529A9 (fr) | 2002-10-24 |
| WO2001053529A3 (fr) | 2003-01-16 |
Family
ID=23938426
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2001/001461 Ceased WO2001053529A2 (fr) | Rapid determination of gene structure using cDNA sequence | 2000-01-20 | 2001-01-17 |
Country Status (4)
| Country | Link |
|---|---|
| EP (1) | EP1294943A2 (fr) |
| AU (1) | AU2953201A (fr) |
| CA (1) | CA2398683A1 (fr) |
| WO (1) | WO2001053529A2 (fr) |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| ZA931601B (en) * | 1992-03-06 | 1993-10-05 | Squibb & Sons Inc | Microsomal triglyceride transfer protein |
| US5707863A (en) * | 1993-02-25 | 1998-01-13 | General Hospital Corporation | Tumor suppressor gene merlin |
| US5578493A (en) * | 1993-09-01 | 1996-11-26 | The Trustees Of Columbia University In The City Of New York | Wilson's disease gene |
| US5858661A (en) * | 1995-05-16 | 1999-01-12 | Ramot-University Authority For Applied Research And Industrial Development | Ataxia-telangiectasia gene and its genomic organization |
| HUP9903959A3 (en) * | 1997-08-13 | 2002-01-28 | Icos Corp Bothell | Truncated platelet-activating factor acetylhydrolase |
| WO2000001816A1 (fr) * | 1998-07-02 | 2000-01-13 | Imperial Cancer Research Technology Limited | Tumor suppressor gene DBCCR1 located at 9q32-33 |
2001
- 2001-01-17 AU AU29532/01A patent/AU2953201A/en not_active Abandoned
- 2001-01-17 WO PCT/US2001/001461 patent/WO2001053529A2/fr not_active Ceased
- 2001-01-17 CA CA002398683A patent/CA2398683A1/fr not_active Abandoned
- 2001-01-17 EP EP01942674A patent/EP1294943A2/fr not_active Withdrawn
Also Published As
| Publication number | Publication date |
|---|---|
| WO2001053529A2 (fr) | 2001-07-26 |
| EP1294943A2 (fr) | 2003-03-26 |
| CA2398683A1 (fr) | 2001-07-26 |
| WO2001053529A3 (fr) | 2003-01-16 |
| AU2953201A (en) | 2001-07-31 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A2; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A2; Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| | WWE | Wipo information: entry into national phase | Ref document number: 2398683; Country of ref document: CA |
| | WWE | Wipo information: entry into national phase | Ref document number: 29532/01; Country of ref document: AU |
| | WWE | Wipo information: entry into national phase | Ref document number: 520370; Country of ref document: NZ |
| | WWE | Wipo information: entry into national phase | Ref document number: 2001942674; Country of ref document: EP |
| | AK | Designated states | Kind code of ref document: C2; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: C2; Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| | COP | Corrected version of pamphlet | Free format text: PAGES 1/18-18/18, DRAWINGS, REPLACED BY NEW PAGES 1/19-19/19; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE |
| | REG | Reference to national code | Ref country code: DE; Ref legal event code: 8642 |
| | WWP | Wipo information: published in national office | Ref document number: 2001942674; Country of ref document: EP |
| | WWW | Wipo information: withdrawn in national office | Ref document number: 2001942674; Country of ref document: EP |
| | NENP | Non-entry into the national phase in: | Ref country code: JP |