WO2001032929A1 - Methods and compositions for analysis of SNPs and STRs

Info

Publication number: WO2001032929A1
Application number: PCT/US2000/030534
Authority: WO
Inventors: Elliot R. Ramberg; Christian C. Oste
Original assignee: Cygene Inc
Current assignee: Cygene Inc
Priority date: 1999-11-03
Filing date: 2000-11-03
Publication date: 2001-05-10
Anticipated expiration: 2002-05-03
Also published as: AU1468201A

Abstract

The present invention is directed to methods and compositions for detecting differences in sequences, such as genomic sequences. Particularly, the invention is directed to determinations of single nucleotide polymorphisms and short tandem repeats. Methods of detection include triplex protection assays and restriction fragment target assays.

Description

METHODS AND COMPOSITIONS FOR ANALYSIS OF SNPs and STRs

The present invention claims priority from U.S. Provisional Patent Application No. 60/163,356, filed November 3, 1999, U.S. Provisional Patent Application No. 60/163,416, filed November 3, 1999, U.S. Provisional Patent Application No. 60/216,579, filed July 7, 2000 and U.S. Provisional Patent Application No. 60/171,348, filed December 21, 1999.

Field of the Invention

The present invention comprises methods and compositions for detecting nucleic acid sequences. More particularly, the present invention comprises methods and compositions for detection and analysis of single nucleotide polymorphisms (SNP) and of short tandem repeat sequences (STR).

Background of the Invention

The characterization of human Single Nucleotide Polymorphisms (SNP) and their role in phenotype determination represents a breakthrough in genomic analysis, diagnostics and therapeutics in humans, plants and animals. It has been stated that human beings are little more than "sacks of SNPs."

It is thought that the majority of SNPs present in the human genome are not in coding sequences. Most SNPs with clinical relevance are located in the exon coding regions with the adjacent intervening intron sequences. However, those SNPs located in non-coding regions would have a more prominent application in forensic identity analysis, where no pseudogene or similar sequence can interfere with the scoring result, and the SNP would exist in a non-coding site. SNP formation has occurred throughout evolution with the SNPs formed during early evolution being less important due to their low levels of linkage disequilibrium with surrounding sequences. Young SNPs will have a higher level of linkage disequilibrium with surrounding sequences such as the non-coding introns. Approximately 90% of DNA polymorphism (variation in genetic regions) is in the form of SNPs. SNPs are single base pair differences at the same site in genomic DNA. A SNP must be differentiated from a point mutation. This is resolved by defining the SNP as having a frequency of occurrence at or above 1%, whereas the frequency of the point mutation is 1 in 1000 bases (0.1%) or less depending on the stability of the genomic sequence and region under analysis. These differing gene regions (the wild type and mutant sequence) are called different alleles of the gene and exist in normal individuals in some or many populations.

The frequency with which single base changes (point mutations and SNPs) have been observed in genomic DNA from two identical chromosomes (genes) is on the order of 1 in 1000 base pairs. If the frequency of this change is at or above 1% then the change is referred to as a SNP. The rate of nucleotide difference in two randomly chosen chromosomes is called the nucleotide diversity index. There is an average of 0.1% chance of any base being heterozygous in an individual. Furthermore, some genome regions have at least a 100 fold enrichment in these SNP regions, while others have 1000 fold less (0.1%). Some unique non-coding HLA regions show nucleotide diversity levels of 5-10% that can well serve as SNP sites for forensic analysis.
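The two quantities discussed above can be illustrated with a minimal sketch. The function names, the default 1% cutoff parameter, and the simple position-by-position comparison of two pre-aligned, equal-length sequences are assumptions made for illustration only and are not part of the disclosed methods.

```python
def classify_variant(allele_frequency, snp_threshold=0.01):
    """Label a single-base change using the 1% frequency cutoff described above.

    `snp_threshold` is an illustrative parameter name, not a term from the text.
    """
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must be between 0 and 1")
    return "SNP" if allele_frequency >= snp_threshold else "point mutation"


def nucleotide_diversity(seq_a, seq_b):
    """Fraction of positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return differences / len(seq_a)
```

Under these assumptions, one difference across 1000 aligned bases corresponds to the 0.1% diversity level mentioned in the text.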

Furthermore, SNPs have been used to characterize and delineate genes as recessively acting, low penetrance dominant, quantitative trait loci, or risk associated alleles, since all of these will occur in some normal individuals (non-diseased). It is believed by some that disease predisposes single base variants to be SNPs. SNP detection and scoring methods are numerous. Current methods utilize dye conjugated dideoxynucleotide terminator molecules with single base extension by the DNA polymerase to score the SNP target site of interest. Addition of the unique dye terminator base to the SNP site, and detection of the unique dye fluorescent signal that results, predicts or scores the presence of the SNP. Each base has a different dye attached which emits fluorescence at a unique wavelength. This type of scoring procedure requires emission from millions of dye molecules to register a discernible signal.

Currently, most methods involve target sequence PCR amplification, which has flaws such as ease of contamination, high cost, being time consuming, and interference by pseudogene presence in SNP analysis where coding exon containing SNPs are scored. Some of these same problems exist with miniature hybridization array (DNA-chip) technology.

A problem in scoring and detecting clinically relevant SNPs lies in the existence of non-processed pseudogenes. Currently, the sequencing of the human genome has brought to light proof of the existence of these non-coded pseudogenes. Pseudogenes are defined as sequences that are highly similar in sequence to the target gene, thus rendering the pseudogene as a site for PCR primer binding. The pseudogenes are devoid of introns, the non-coding adjacent intervening sequences. It has been estimated that as many as 20% of SNPs analyzed will yield false results due to the presence of a pseudogene and the subsequent absence of the adjacent intervening intron. It is believed that most pseudogenes are non-coding copies of certain genes that are inadvertently inserted into the genome.

Most scoring assays involve allele discrimination via, or secondary to, matched and mismatched base pair detection at the SNP locus by hybridization techniques. The most stable mismatched base pair is the G:T, which is almost as stable as A:T. In actuality this mismatch is the most important to distinguish in order to score the most abundant of the four SNP types [C-T and G-A] transitions.

In a homozygous individual the SNP score will reflect a single result. This might be homozygous for the wild type allele or homozygous for the mutant allele. In a heterozygous individual the SNP score will yield a double scoring result. One result would be representative of the wild type allele, whereas the second result would be representative of the mutation allele.
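The single versus double scoring result described above can be sketched as follows. The function name `call_genotype` and the representation of a scoring result as a collection of observed bases are hypothetical and chosen only for illustration.

```python
def call_genotype(observed_bases, wild_type, mutant):
    """Interpret a SNP scoring result as a genotype.

    `observed_bases` is any iterable of the bases scored at the SNP site
    (one base for a single result, two for a double result).
    """
    observed = {base.upper() for base in observed_bases}
    wild_type, mutant = wild_type.upper(), mutant.upper()
    if observed == {wild_type}:
        return "homozygous wild type"
    if observed == {mutant}:
        return "homozygous mutant"
    if observed == {wild_type, mutant}:
        return "heterozygous"
    raise ValueError("scoring result does not match the expected alleles")
```

For example, a result containing only the wild type base would be read as homozygous wild type, and a result containing both bases as heterozygous.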

Another invaluable technique for discriminating between differing genomes is STR analysis. The first DNA marker for human identification was discovered on chromosome 14 in 1980 by Restriction Fragment Length Polymorphism analysis (RFLP). RFLP detects the presence or absence of a restriction site at the same site in homologous chromosomes.

In the first RFLP analysis, DNAs from several individuals were digested with the restriction endonuclease EcoR I. The resulting restriction fragments were separated by agarose gel electrophoresis and detected by Southern blot analysis using a radioactive probe for the D14S1 locus. Different banding patterns were seen for different individuals. It was found that the alternative forms of the DNA (alleles), characterized by length variations, were heritable traits resulting from the presence or absence of a restriction site at this locus. Additional polymorphic identity markers have been identified and methods for detecting them have been developed. Examples include additional RFLP loci, variable number tandem repeat (VNTR) sequences, and short tandem repeat (STR) sequences.

VNTRs were first described in 1985. These markers contain many alleles at each individual locus, due to the variability of the number of copies of a tandemly repeated 15-70 base consensus DNA sequence, present between two neighboring restriction sites. Thousands of DNA polymorphisms were used to generate genetic maps during the late 1980s and early 1990s. Similar to RFLP analysis, VNTRs are detected and analyzed using flanking restriction sites and Southern Blot transfer.

VNTRs are prevalent throughout the human genome. They have become important in several fields including forensic science, genetic mapping, linkage analysis, and human identity testing including paternity testing. VNTR regions of DNA are classified into several groups depending on the size of the repeat unit and also the length of the repeat region itself. Minisatellite VNTRs feature 9-100 bp core repeats and can reach lengths of up to 30,000 base pairs, while microsatellite VNTRs contain 2-7 bp repeat units and can extend up to 300 base pairs.

A subset of microsatellite VNTRs are called Short Tandem Repeats (STRs). These include dinucleotide repeat polymorphisms, trinucleotide and tetranucleotide repeats (7-11). The tetranucleotide repeats are currently favored by the forensic community due to the fact that these can be analyzed by PCR, whereas other better, shorter repeat sequences would not provide reliable results with the same technique. They have the same basic structure as VNTRs, but the tandemly repeated consensus sequences are only two to four bases long. The shorter repeat lengths of STR markers make them more compatible for detection and analysis by the polymerase chain reaction (PCR), wherein PCR primers specific to the unique flanking sequences are employed. This advantage has made them popular and useful markers for recent genetic maps.
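The size classes given above can be summarized in a short sketch. The function name and the label strings are illustrative assumptions; the numeric ranges are taken directly from the preceding paragraphs (9-100 bp for minisatellites, 2-7 bp for microsatellites, and two to four bases for STRs), so a repeat unit outside all of them, such as 8 bp, receives no label.

```python
def classify_repeat_unit(repeat_unit_bp):
    """Return the class labels for a tandem repeat with the given unit length in bp."""
    if repeat_unit_bp < 1:
        raise ValueError("repeat_unit_bp must be a positive integer")
    labels = []
    if 2 <= repeat_unit_bp <= 7:
        labels.append("microsatellite VNTR")
    if 2 <= repeat_unit_bp <= 4:
        # STRs are described above as a subset of the microsatellite class.
        labels.append("STR")
    if 9 <= repeat_unit_bp <= 100:
        labels.append("minisatellite VNTR")
    return labels
```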

Analysis of STRs has several advantages over the larger variable number tandem repeats (VNTRs). Discrete alleles from STR systems may be obtained due to their smaller size. DNA fragments differing by a single base pair in size may be differentiated with the proper technical procedure. Determination of discrete alleles allows results to be compared easily between laboratories. Additionally, since the PCR process is used, smaller quantities of DNA, as well as degraded DNA, may be typed using STRs as genetic markers. Thus, the quantity and integrity of the DNA sample is less of an issue with PCR-based typing methods than with conventional DNA methods such as RFLP analysis.

The analysis of STR markers with PCR in the forensic sciences and parentage analyses employs a limited number of markers from the thousands, which have been generated for genetic mapping purposes. The selected markers must have several characteristics to be useful for human identification with PCR. First, only STRs, which demonstrate a high degree of variability within the population, should be selected. Second, the amplified products (as a result of PCR) must be easily distinguished from one another. This means rejecting markers, which contain frequent microvariants (i.e., alleles differing from one another by lengths shorter than the repeat length) as the closer and more random spacing of alleles is more difficult to interpret. Finally, the prevalence of stutter bands, i.e., amplification artifacts, which appear one or more repeat lengths above or below the true amplified allele has led to the rejection of dinucleotide repeats as a class for these applications. Even some of the STR markers validated by the FBI occasionally display stutter bands, depending on the PCR conditions used and the level of multiplexing in the PCR reaction.

The forensic community has primarily concentrated on tetranucleotide repeats in dealing with forensic evidence. This type of repeat is small enough to be amenable to PCR and is a faster and more efficient analysis than VNTR analysis. The diversity of tetranucleotide repeat alleles present in a population is such that a high degree of discrimination among individuals is possible. The caveat is that many STR loci must be examined to obtain frequencies of occurrence in a population comparable to traditional RFLP. There are thousands of STR loci (chromosomal locations), which have been mapped in the human genome; however, a much smaller number have been tested for application in forensic and parentage testing. At present, loci of forensic interest are distributed on almost every chromosome in the genome. Tetranucleotide repeats have been most popular among forensic scientists due to their fidelity in PCR amplification, although some pentanucleotide repeats are also used. Some of the common features that are desirable for forensic use of STRs include:

- high variability among repeat lengths (many different alleles)

- uniform repeat units (no sequence variation)

- robust methodology so that alleles can be amplified, reproduced, and marked by common laboratory methods

Identification of the best markers for PCR applications is complicated by the fact that the desired traits of a marker are not fully compatible with simple analysis.

While it is possible to identify highly polymorphic markers with a relatively low presence of stutter bands, such markers generally display microvariants and increased mutation frequency.

Furthermore, the most commonly selected loci do not have the highest possible degree of polymorphism as individual systems. Yet, these loci have been developed into a multiplex PCR system, which is characterized by easy and reliable interpretation, while providing a powerful statistic for discriminating individuals.

However, PCR analysis introduces several difficult problems.

One of the problems with the current methods is the presence of stutter bands, small bands that are approximately 4 basepairs (bp) away from a corresponding allele band. This band, or set of bands, is usually smaller than the main peak. Stutter occurs more commonly in some loci than others. Stutter bands are dependent on the specific locus amplified and also the amount of starting material. Stutter can increase with an underestimation of DNA quantity. If one underestimates the amount of DNA present, either weak amplification will occur or no amplification at all. Stutter bands are rarely > 10% of the peak area of the true allele band and are only one repeat unit different in size. Stutter bands may be caused by slippage of the thermostable DNA Polymerase during replication of the strand and may be due to out of alignment re-annealing of complementary target sequences.
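The stutter characteristics described above (one repeat unit away from the true allele and rarely more than 10% of its peak area) can be turned into a simple screening rule. The following is a sketch under stated assumptions: the `Peak` record, the function name, the 0.5 bp sizing tolerance, and the default 4 bp repeat unit are all hypothetical choices for illustration, not parameters taken from any instrument software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    size_bp: float  # fragment size estimated from the electrophoretogram
    area: float     # integrated peak area, arbitrary units


def flag_stutter(peaks, repeat_unit_bp=4, max_ratio=0.10, size_tolerance_bp=0.5):
    """Return the peaks that look like stutter of a larger neighbouring peak.

    A peak is flagged when some other peak lies one repeat unit away
    (within `size_tolerance_bp`) and the candidate's area is at most
    `max_ratio` of that other peak's area.
    """
    flagged = []
    for candidate in peaks:
        for parent in peaks:
            if parent is candidate or parent.area <= 0:
                continue
            offset = abs(parent.size_bp - candidate.size_bp)
            one_repeat_away = abs(offset - repeat_unit_bp) <= size_tolerance_bp
            small_enough = candidate.area <= max_ratio * parent.area
            if one_repeat_away and small_enough:
                flagged.append(candidate)
                break
    return flagged
```

A rule of this kind only flags candidates; a minor contributor in a mixed sample could produce a peak with the same size offset and area ratio, so flagged peaks would still need review.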

Another problem is the appearance of peaks or other artifacts that complicate the interpretation of the data. Some peaks may have shoulders, which may cause a peak to be indeterminate or mask variant alleles that may be in the vicinity of the main allele. Peaks may also be due to amplification of degraded samples, contamination of the sample, non-specific primer-primer interactions prior to, or during amplification, and the presence of DNA from several different individuals in the sample.

Another artifact that is sometimes found when multi-dye systems are used is a pull-up artifact peak. This is a peak present directly underneath a very high peak of another color in an electrophoretogram. This occurs because the overlap in the emission spectra of the dyes has not been compensated for by the mathematical matrix in the computer software and is usually due to over-amplification of a sample. Dye overlap, also known as cross-talk, can be a problem in many situations.

Sometimes an allele peak may appear to be divided into two peaks with an approximate 1 bp size difference, an n:n+1 peak. This is associated with all loci but especially with vWA locus products. This occurrence is prompted by the tendency for the thermostable DNA polymerase to add an extra A on the end of a PCR product, no matter what that allele is.

Another disadvantage in the use of the current STR systems is the choice of markers used. More than 2,000 STR markers have been described. From these many STRs, a relatively small number of loci have been selected. The disadvantage in using those STRs, which display few or no microvariants and low mutation rates, is that they are not as polymorphic as the best VNTR markers. It is a disadvantage that the number of usable STR markers must be limited because of the technology employed to analyze the markers.

Thus, there is a need for developing high-throughput methods for the detection and analysis of SNPs and STRs. Such methods need to be reliable and reproducible, with low error rates and high confidence.

What is particularly needed are methods for detecting tetranucleotide repeat loci, which display few or no microvariants, minimal stutter bands, and a relatively low mutation rate. Such methods should also provide a capability for reanalysis of other, shorter repeats that are not amenable to PCR analysis, and should not be based on DNA amplification.

Summary

The present inventions include methods, compositions and kits for detecting and analyzing SNPs using nucleic acid target protection strategies. The methods of the present inventions comprise SNP detection and analysis technology for use in population genetics, drug development, forensics, cancer, genetic disease research, genomic analysis, diagnostics and therapeutics in humans, plants and animals. Methods and compositions for specific target detection, such as those taught in U.S. Patent Nos. 5,962,225, 6,100,040, U.S. Patent Application Nos. 09/633,848; 09/569,504; and 09/443,633 can be employed with the methods and compositions of the present invention. Particularly useful are the methods and compositions for Target Protection Assays (TPA) and Restriction Fragment Target Assays (RFTA). Alternative methods for restricting nucleic acids comprise methods of Cutter Probe Assays (CPA). All patents and patent applications are hereby incorporated in their entireties.

In a preferred method of the present invention employing the TPA assay for SNP detection, genomic DNA is isolated. A target section of the genomic DNA containing the SNP is chosen. This choice is based on the location of the SNP, such as in a non-coding region, or a clinically relevant SNP in an exon. Preferably the SNP is flanked with one or more polypurine rich regions, which may be in the exon or in immediately adjacent introns on either side of the exon or flanking the non-coding SNP itself. Excision sites are chosen that flank the SNP and two triplex forming oligonucleotide (TFO) polypurine rich binding sites in the vicinity of the SNP site. The chosen region (target) is excised out of the genomic nucleic acid. Excision can use restriction enzymes, cutter probes or any method known in the art. Next, two polypyrimidine TFOs are hybridized to the isolated target nucleic acid to form triplexes at the TFO polypurine rich binding sites. The triplexes are then treated with an enzyme, such as Exonuclease III having 3' to 5' exonuclease activity, to trim the 3' strand of the blunt ends of the restricted segment. Such an enzyme will generate a triplex protected nucleic acid structure (PNAS) having 5' tails remaining on each end of the PNAS. The double stranded non-specific nucleic acid sequences in the genomic DNA, including the interfering pseudogenes, are degraded and the structure is detected. The 5' tails are hybridized with affinity capture probes and SNP identification probes, also called position identifier probes or PIPs. The 3' end of the PIP is next to the site complementary to the SNP scoring site, the only site at which single base addition will occur because all other 3' ends have been rendered resistant to single base addition by the DNA polymerase. In one embodiment each dideoxynucleotide terminator base has a different dye conjugated to it wherein each dye emits characteristic emission fluorescence at a different wavelength. DNA polymerase is added to the terminator base mixture and single base addition is performed. The base present at the scoring site will dictate which base is added with its conjugated dye. Thus, a mixture of all four bases with differing dyes is used to discriminate the base that is added to the site opposite the scoring site.

In another embodiment, the sample is divided into four sample wells. DNA polymerase is added and each well has only one of four dye conjugated dideoxynucleotide chain terminators (A, T, G, or C) each with a different dye emitting fluorescence at a unique wavelength. DNA polymerase will insert the dideoxynucleotide chain terminator that is complementary to the SNP site. One of the four wells will fluoresce as the complementary base is bound for each allele present (two alleles for the heterozygous state). The unique dye fluorescence emission is detected and the SNP is scored as an A, T, G, or C. The sensitivity of this label is superior to hybridization techniques where base pair mismatches are difficult to detect.
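The four-well readout described above can be sketched as a small scoring routine. This is an illustration under assumptions: the function name, the dictionary of per-well fluorescence signals, and the single numeric `threshold` separating a positive well from background are all hypothetical, and real signal processing would depend on the detection instrument.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def score_four_wells(signals, threshold):
    """Score a SNP from the fluorescence of four single-terminator wells.

    `signals` maps the terminator base in each well ("A", "T", "G", "C")
    to its measured fluorescence. A well at or above `threshold` is taken
    to mean that its terminator was incorporated, so the base at the SNP
    site on the template strand is the complement of that terminator.
    """
    if set(signals) != set(COMPLEMENT):
        raise ValueError("signals must have exactly the keys A, T, G and C")
    incorporated = sorted(base for base, value in signals.items() if value >= threshold)
    if not 1 <= len(incorporated) <= 2:
        raise ValueError("expected one positive well (homozygous) or two (heterozygous)")
    return {
        "incorporated": incorporated,
        "template": sorted(COMPLEMENT[base] for base in incorporated),
        "zygosity": "homozygous" if len(incorporated) == 1 else "heterozygous",
    }
```

One positive well corresponds to the homozygous state and two positive wells to the heterozygous state, as in the text.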

An alternative method adds a normal dye conjugated base, rather than a dideoxynucleotide terminator base, to each of the four wells, and the same dye can be used for each base. Again, two scoring results are expected where the heterozygous state exists. Because a four base cocktail is not used, single base addition is expected to be visualized in one well, and the scoring result is indifferent to multiple base addition if the bases downstream from the scoring site are identical.

In another novel embodiment, the capture probe of the PNAS structure in TPA can be attached via a polar spacer molecule to a magnetic bead and a pyrosequencing reaction can be used to score a SNP either by a single base addition (sequencing by synthesis) or with sequencing by synthesis of up to a 100 base region containing the scoring site.

In an embodiment, an allele specific probe, in this case a PIP probe, has the SNP complementary site in the middle. Since most SNPs are C → T or G → A transitions, up to four PIP probes are used to hybridize to the PNAS as described herein. Generally, only one of the four PIP probes with the SNP complementary base in the center will hybridize completely, whereas the other three will only partly hybridize and have a mismatched base in the center. An increase in stringency conditions post hybridization will destabilize the three mismatched probes but not the perfectly matched probes.

This can be detected by the addition of any differential label to the probes or use of the same label when aliquoting the specimen into four samples and each hybridized by a different but similarly labeled probe.
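The logic of using four probes that differ only at the central base can be sketched as follows. This is a deliberately simplified model and not a description of hybridization chemistry: a probe is treated as retained only if it is exactly complementary to the target, which stands in for the effect of raising stringency. The function names and the short example sequences used to exercise them are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def make_center_probes(reference):
    """Build four probes complementary to `reference`, varying only the centre base.

    The returned dict is keyed by the target-strand base each probe matches.
    """
    reference = reference.upper()
    if len(reference) % 2 == 0:
        raise ValueError("reference must have odd length so that a centre base exists")
    centre = len(reference) // 2
    return {
        base: reverse_complement(reference[:centre] + base + reference[centre + 1:])
        for base in "ACGT"
    }


def count_mismatches(probe, target):
    """Number of positions at which `probe` is not complementary to `target`."""
    expected = reverse_complement(target)
    if len(probe) != len(expected):
        raise ValueError("probe and target must have the same length")
    return sum(1 for p, e in zip(probe, expected) if p != e)


def perfectly_matched(probes, target):
    """Keys of the probes that are fully complementary to `target`."""
    return [base for base, probe in probes.items() if count_mismatches(probe, target) == 0]
```

In this model, each of the three non-matching probes differs from the perfect complement at exactly one position, the centre, which mirrors the single central mismatch described above.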

Another method for SNP site scoring involves the use of a streptavidin-enzyme conjugate to generate an amplified signal from the particular single base added and score the SNP. Again the wild type scoring result must be subtracted from the mutant allele SNP scoring result to appropriately score the SNP mutation.

In this method, the sample is divided into four aliquots, each receiving DNA polymerase and a different biotinylated nucleotide. Single base addition on the PNAS is performed, and in a preferred embodiment a streptavidin-alkaline phosphatase polymer is added, thereby attaching the enzyme to the PNAS of one of the wells. Only one base can be added per allele present, thus base addition will occur in one or two of the sample tubes. A colorless substrate is then added and a color generated. Furthermore, chemiluminescence or chemifluorescence substrates, also known by those skilled in the art, may be used.

Alternative embodiments comprise scoring methods that are not dependent on the presence of polypurine rich regions for triple helix formation. Currently over one million SNPs have been identified, and the ongoing discovery effort will raise that number continually. This number of SNP targets requires a technology for SNP scoring that can score a SNP at any position on the genome.

The methods of the present invention allow for scoring to be performed at any genomic site, without interference by the existence of non-processed pseudogenes in scoring exon containing SNPs. These embodiments are referred to as RFTA SNP scoring and include all those methods discussed earlier for TPA SNP scoring. This non-triplex forming embodiment, RFTA, comprises methods for SNP scoring using three different scoring techniques: the use of dye conjugated dideoxynucleotide terminator bases (ddNTPs), the use of pyrosequencing, and the use of enzyme conjugates. Furthermore, accurate scoring is readily achievable, and pseudogenes can be removed as an interfering influence on the scoring process of exon containing SNPs.

The present invention includes methods and kits for detecting and analyzing short tandem repeat sequences using nucleic acid target protection strategies. The method and kits of the invention may be used in forensic and medical sciences, as well as in genetic mapping, linkage analysis, and human identity testing, including parentage testing.

Brief Description of the Figures

Figure 1 shows an embodiment of detecting and analyzing SNPs using the TPA assay and two TFOs using dye conjugated dideoxynucleotide terminator bases as a signal.

Figure 2 shows an embodiment for detecting and analyzing SNPs using the TPA assay and one TFO using dye conjugated dideoxynucleotide terminator bases as a signal.

Figure 3 is a flow chart diagramming an embodiment for detecting and analyzing any SNPs (exon or non-coding) using the classical RFTA assay and hybridization with a primary probe forming a PDTP structure using dye conjugated dideoxynucleotide terminator bases as a signal.

Figure 4 is a drawing of a Watson strand SNP scoring site configuration of TPA SNP scoring in the exon region or non-coding region.

Figure 5 is a drawing of a Crick strand SNP scoring site configuration of TPA SNP scoring in the exon region or non-coding region.

Figure 6 is a drawing of an embodiment of a Watson strand SNP scoring site configuration where the RP-TFOs are on the same strand of the target comprising 2 TFOs.

Figure 7 is a drawing of an embodiment of a Crick strand SNP scoring site configuration where the RP-TFOs are on the same strand of the target comprising 2 TFOs.

Figure 8A and B is a drawing of an embodiment of a Watson strand SNP scoring site configuration where the RP-TFO is on one strand of the target and comprises 1 TFO.

Figure 9A and B is a drawing of an embodiment of a Crick strand SNP scoring site configuration where the RP-TFO is on one strand of the target and comprises 1 TFO.

Figure 10 is a drawing where TPA SNP scoring is in the exon/intron region, the RP-TFO is on opposite strands of the target, and is a Watson strand SNP scoring site configuration.

Figure 11 is a drawing where TPA SNP scoring is in the exon/intron region, the RP-TFO is on opposite strands of the target, and is a Crick strand SNP scoring site configuration.

Figure 12 is a drawing where TPA SNP scoring is in the exon/intron region, the RP-TFO is on the same strand of the target, and is a Watson strand SNP scoring site configuration.

Figure 13 is a drawing where TPA SNP scoring is in the exon/intron region, the RP-TFO is on opposite strands of the target, and is a Crick strand SNP scoring site configuration.

Figure 14 is a drawing of APSP RFTA SNP scoring in the exon region as well as non-coding SNPs, where the APSPs are on opposite strands of the target, and a Watson strand SNP scoring site configuration.

Figure 15 is a drawing of APSP RFTA SNP scoring in the exon region as well as non-coding SNPs, where the APSPs are on opposite strands of the target, and a Crick strand SNP scoring site configuration.

Figure 16 is a drawing of APSP RFTA SNP scoring in the exon region as well as non-coding SNPs, where the APSP PIP and APSP capture probes are on the same strand of the target, and a Watson strand SNP scoring site configuration.

Figure 17 is a drawing of APSP RFTA SNP scoring in the exon region as well as non-coding SNPs, where the APSP PIP and APSP capture probes are on the same strand of the target, and a Crick strand SNP scoring site configuration.

Figure 18 is a drawing of APSP RFTA SNP scoring in the exon/intron region, where the APSPs are on the opposite strand of the target, and a Watson strand SNP scoring site configuration.

Figure 19 is a drawing of APSP RFTA SNP scoring in the exon/intron region, where the APSPs are on the opposite strand of the target, and a Crick strand SNP scoring site configuration.

Figure 20 is a drawing of APSP RFTA SNP scoring in the exon/intron region, where the APSPs are on the same strand of the target, and a Watson strand SNP scoring site configuration.

Figure 21 is a drawing of APSP RFTA SNP scoring in the exon/intron region, where the APSPs are on the opposite strand of the target, and a Crick strand SNP scoring site configuration.

Figure 22 is a drawing of RP-TFO TPA STR analysis, where the RP-TFO is on the opposite strand of the target, and a Watson strand repeat target.

Figure 23 is a drawing of RP-TFO TPA STR analysis, where the RP-TFO is on the opposite strand of the target, and is a Crick strand repeat target.

Figure 24 is a drawing of RP-TFO TPA STR analysis, where the RP-TFO is on the same strand of the target, and is a Watson strand repeat target.

Figure 25 is a drawing of RP-TFO TPA STR analysis, where the RP-TFO is on the same strand of the target, and is a Crick strand repeat target.

Figure 26A and B is a drawing of RP-TFO TPA STR analysis, where only one RP-TFO is used, and is a Watson strand repeat target.

Figure 27A and B is a drawing of RP-TFO TPA STR analysis, where only one RP-TFO is used, and is a Crick strand repeat target.

Figure 28 is a drawing of APSP RFTA STR analysis, where the APSP probes are located on opposite sides of the target, and is a Watson strand repeat target.

Figure 29 is a drawing of APSP RFTA STR analysis, where the APSP probes are located on opposite sides of the target, and is a Crick strand repeat target.

Figure 30A and B is a drawing of APSP RFTA STR analysis, where the APSP probes are located on the same side of the target, and is a Watson strand repeat target.

Figure 31A and B is a drawing of APSP RFTA STR analysis, where the APSP probes are located on the same side of the target, and is a Watson strand repeat target.

Figure 32 represents an embodiment of RP-TFO TPA SNP DYS271, non-coding region analysis.

Figure 33 represents an embodiment of Factor V Leiden RP-TFO TPA SNP, intron-exon-intron analysis.

Figure 34 represents an embodiment of Factor V Leiden APSP RFTA SNP, intron-exon analysis.

Figure 35 represents an embodiment of RP-TFO STR (CSF1PO Locus analysis) Chromosome 5.

Figure 36 represents an embodiment of APSP RFTA STR (CSF1PO Locus analysis) Chromosome 5.

Figure 37 represents an embodiment of RP-TFO TPA SNP DYS271 and shows the genomic sequence for the DYS271 SNP site and upstream and downstream regions.

Figure 38A and B represents an embodiment of RP-TFO TPA SNP and shows the genomic sequence for the Factor V Leiden (wild type allele) SNP exon site and upstream and downstream adjacent intron regions.

Figure 39A and B represents a preferred embodiment of a ds DNA Cutter Probe Assay.

Figure 40 is a drawing of an embodiment of the method of using a ds cutter probe in the present invention.

Detailed Description

Throughout this document, for clarity of description, SNP scoring will be referred to as homozygous for the mutant allele. Only one scoring result will be anticipated and that will be representative of the SNP mutation. Actual scoring of heterozygous individuals will require that the wild type SNP score be subtracted from the double scoring result obtained to give the SNP score of the mutation allele. In the present invention, SNPs are analyzed by target protecting methods, such as the Triplex Protection Assay (TPA), and the Restriction Fragment Target Assay (RFTA), disclosed in U.S. Patent Nos. 5,962,225, 6,100,040, U.S. Patent Application Nos. 09/633,848; 09/569,504; and 09/443,633. Each of the cited patents, patent applications, provisional patent applications and references are incorporated herein by reference.

SNP-TPA

TPA comprises methods for the detection of very low copy numbers of nucleic acid targets, either in a vast excess of non-specific nucleic acids or in femtomoles of target sequences alone. The methods provide direct DNA and RNA analysis by allowing milligram quantities of nucleic acids to be processed in the search for very low copy number nucleic acid targets. The methods comprise triplex formation in the target region, forming a protected nucleic acid sequence (PNAS). The target sequence is protected from degradation. For example, treatment with a 3' to 5' exonuclease, such as Exo III, degrades all non-specific double strand DNA and leaves only the triplex-protected target and single stranded DNA (a 50% reduction in non-specific nucleic acid, resulting in less non-specific signal background). These methods can be used in assay formats such as test tube, analytical gel, dot blot, and reverse dot blot formats, and can also be automated.

SNP TPA Classical Format

It is preferred that the non-coding SNP or the clinically relevant SNP be present in the target region with a flanking intron region present on either side. The intron's presence helps eliminate false results because there is no interference from non-processed similar sequences, such as the pseudogenes that are frequently present.

Any other SNP that is in a non-coding region has no additional requirement before scoring. It is further preferred that, in the probe hybridization, no gaps or nicks be present within the PNAS structure that are large enough for the high fidelity DNA polymerase to participate in any form of non-specific base insertion, which would lead to a mixed or false result. In other words, no 3' ends may be available for single base addition other than at the position complementary to the SNP scoring site. It is also preferred that the DNA polymerase used have only 3' synthesis function and no 5' exonuclease activity. Also, the DNA exonuclease used, such as Exo III, must be tested to preclude any endonuclease activity, single strand exonuclease activity, or gap widening activity that would compromise the PNAS structures and lead to a false scoring.

RP-TFO TPA SNP Automated Analysis

In the RP-TFO automated embodiment of TPA SNP scoring, one or two RP-TFO probes, an APSP PIP probe, and one or more APSP assisting probes may be used for hybridization to the target. The choice depends on the exon or non-coding region containing the SNP, and the exon/intron sequence to be analyzed. After hybridization, the PNAS is resistant to Exo III, and the displacement of the W and C strands of the scoring target allows the single base insertion to occur, and defines the SNP scoring site as the sole site for single base addition. All other 3' ends are capped or configured so as not to be a substrate for single base addition by DNA polymerase.

Two preferred methods render the PNAS or PDTP structures Exo III resistant. The first method comprises a step of displacement of the W and C strands of genomic DNA, relaxed by heating, by a triplex forming probe, preferably an 8-aminopurine substituted RP-TFO forming a triplex with either strand, or an 8-aminopurine substituted oligonucleotide forming a duplex with either strand. Both strands may have an associating displacement probe, preferably upstream and downstream from the target and preferably not past the RE sites chosen. The strand displacement probes generally enclose the target between them.

The second step comprises identifying RE (restriction endonuclease) sites upstream and downstream from the displacement probes that are cut by an enzyme that cleaves the DNA leaving a blunt end or a 5' overhang, both of which are substrates for Exo III degradation. The third step comprises addition of Exo III to degrade the W and C duplex up to the point of W and C strand displacement. The Exo III action on the PNAS and PDTP structures continues until the point at which there is, on each side of these structures, a 3' free end and a non-associating 5' end (the latter incapable of duplex formation due to the displacement probe on the opposite strand). A single stranded free 3' end and a non-associating complementary 5' end are no longer substrates for Exo III, and the enzyme will drop off, leaving an Exo III resistant PNAS and PDTP.

The second method includes the exclusive use of RE enzymes that cleave the genomic DNA in the regions upstream and downstream from the PNAS and PDTP structures and that leave a 3' overhang only. This 3' overhang renders the duplex ends of the structures resistant to Exo III. If the RE chosen is used to render the PNAS Exo III resistant, due to the generation of 3' overhang ends, then another RE is used to generate blunt ends or 5' overhangs in order to allow Exo III to degrade double strand non-specific genomic DNA, including the pseudogene. Preferably, neither RE has a restriction site within the PNAS or PDTP region.

In the RP-TFO embodiment of the process, the RP-TFO probe, the APSP PIP probe, and the APSP assisting probe all function to destabilize and displace the target W and C strands and form more stable aminopurine substituted triplex and duplex structures, while opening the target W and C strands to allow single base addition for SNP scoring. The single base addition for SNP analysis must occur in the open region of the W and C target duplex between the 8-aminopurine substituted probes.
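The relationship between the type of end left by a restriction enzyme and susceptibility to Exo III, as stated above, can be summarized in a short sketch. The end-type labels and the function name `is_exo_iii_resistant` are hypothetical and chosen only for illustration; the sketch models only the second method, in which resistance comes from the fragment ends alone.

```python
# Hypothetical labels for the three end types named in the text.
EXO_III_SUBSTRATE = {
    "blunt": True,        # blunt ends are degraded by Exo III
    "5_overhang": True,   # 5' overhangs are degraded by Exo III
    "3_overhang": False,  # 3' overhangs are not a substrate
}


def is_exo_iii_resistant(left_end: str, right_end: str) -> bool:
    """Return True when neither end of a fragment is an Exo III substrate."""
    return not (EXO_III_SUBSTRATE[left_end] or EXO_III_SUBSTRATE[right_end])


# A fragment cut on both sides by enzymes leaving 3' overhangs is protected;
# a fragment with at least one blunt or 5' overhang end is degradable.
print(is_exo_iii_resistant("3_overhang", "3_overhang"))
print(is_exo_iii_resistant("3_overhang", "blunt"))
```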

Considerations in choosing the length of the RP-TFOs include sequence specificity in their hybridization. It is believed that an 11 mer or greater substituted purine and pyrimidine region is most preferred. A second consideration is the ability of the triplexes to destabilize and keep the W and C strands separated so that single base addition can occur on the 3' end of the APSP PIP probe in the open W and C target strand area containing the SNP.

The length of the APSP probes and the number of amino-substituted purine residues depend on the length that will bind with sequence specificity (11 mer or greater), and on the ability of the APSP probe to keep the W and C strands destabilized and allow single base addition for SNP scoring to occur between them.

The aminopurine substituted structures separate the W and C strands by strand displacement, and when these substituted purine structures are 100-150 bases apart, single base addition or synthesis occurs in the widened gap between the triplexes or duplexes formed. Thus it is preferred that the SNP site be within 100-150 bases of an RP-TFO site, or within 100-150 bases of another aminopurine substituted probe called the APSP assisting probe. This APSP assisting probe can be placed by hybridization anywhere (within 100-150 bases) along the target region to facilitate strand widening in the proximity of the SNP scoring site.
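The spacing preference above can be expressed as a simple distance check. This is only a sketch: the function name `has_opening_partner`, the use of integer base coordinates, and the reading of "within 100-150 bases" as an upper limit of 150 bases are all assumptions made for illustration.

```python
def has_opening_partner(snp_pos, probe_positions, max_distance=150):
    """Return True if any probe site lies within max_distance bases of the SNP.

    snp_pos: coordinate of the SNP scoring site (assumed integer position).
    probe_positions: coordinates of RP-TFO or APSP assisting probe sites.
    max_distance: assumed upper limit, taken from the 100-150 base range.
    """
    return any(0 < abs(pos - snp_pos) <= max_distance for pos in probe_positions)


# A probe 120 bases away satisfies the preference; one 300 bases away does not.
print(has_opening_partner(1000, [1120]))
print(has_opening_partner(1000, [1300]))
```

If no existing probe site passes the check, the text suggests placing an APSP assisting probe within range of the SNP scoring site.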

It is also preferred that the DNA polymerase have only 3' synthesis function and no 5' exonuclease activity, as discussed previously. Furthermore, since elevated incubation temperatures of 50°C - 60°C favor W and C target strand displacement by the aminopurine substituted probes, a high fidelity, thermostable polymerase is most preferred, and may include any known to those skilled in the art.

The embodiments represented in Figures 4 to 13 and described in Table I, have an aminopurine-substituted PIP probe with the SNP scoring complementary site or single base addition site at its 3' end. Next, either an RP-TFO or another aminopurine-substituted (APSP) assisting probe must be within 100-150 bases from it in order to open the W and C genomic DNA of the target for SNP scoring. Last, the single base addition method can be any known to those skilled in the art that functions to add the single base in the open W and C region between the aminopurine substituted probes, at the SNP score complementary site. Scoring entails any method known by those skilled in the art to determine the single base that is added.

It is preferred in all the embodiments, including the embodiments presented in Table I for TPA, that either an exonuclease treatment (Exo III) is employed or a certain restriction endonuclease is selected for use. The goal of SNP scoring is single base addition exclusively at the site complementary to the SNP scoring site. Thus the complement to the single added base is the SNP score. There must be no additional 3' end sites available for single base addition by the added DNA polymerase and the four dNTPs.
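The statement that the complement to the single added base is the SNP score can be written as a one-line lookup. The names `COMPLEMENT` and `snp_score` are hypothetical; the sketch assumes standard Watson-Crick pairing for DNA and that the identity of the added base has already been determined by whatever detection method is used.

```python
# Standard Watson-Crick complements for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def snp_score(added_base: str) -> str:
    """Return the base at the SNP site given the single base that was added."""
    return COMPLEMENT[added_base.upper()]


# If the polymerase adds an A opposite the SNP site, the SNP score is T.
print(snp_score("A"))
```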

Embodiments found in Table I have a PNAS with two 3' ends suitable for single base addition by the DNA polymerase. To cap the 3' ends, Exo III treatment is performed. This results in the 3' free end, at the position indicated by arrow Number 23 in Figures 4 to 14, being incapable of single base addition.

Alternative embodiments comprise omission of an Exo III treatment and involve restriction endonucleases that generate either 3' overhang cuts or blunt end cuts, both of which prevent or cap the 3' PNAS ends from being a site for single base addition. Restriction endonucleases fall into a number of categories. Four base recognition site cutters cut on average once every 4⁴, or 256, bases, while six base recognition site cutters cut on average once every 4⁶, or 4096, bases.

Furthermore, according to New England Biolabs, a major supplier of restriction endonucleases, 53% of the RE enzymes cut with 5' overhangs, 26% of the RE enzymes cut with 3' overhangs, and 21% of the RE enzymes cut leaving blunt ends. The interaction of RE selection and Exo III usage, which results in 3' end "capping" to prevent additional sites for single base addition, can be seen in Table VI. The number of RE types provides sufficient flexibility to assure the effective capping of the 3' PNAS ends by either or both approaches, i.e. Exo III treatment or lack of Exo III treatment.
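The cutting frequencies quoted above for four base and six base cutters follow from a simple calculation, sketched here. The function name `average_cut_spacing` is hypothetical, and the calculation assumes a random sequence in which the four bases are equally frequent, which real genomic DNA only approximates.

```python
def average_cut_spacing(site_length: int) -> int:
    """Average number of bases between cuts for a recognition site.

    Assumes each of the four bases is equally likely at every position,
    so a specific site of length n occurs on average once every 4**n bases.
    """
    return 4 ** site_length


print(average_cut_spacing(4))  # four base cutter
print(average_cut_spacing(6))  # six base cutter
```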

TABLE V: Capping Strategies For Rendering PNAS And PDTP Structures Incapable Of Single Base Addition At The Restriction Site Ends

In one method, if the Exo III resistance is conferred on the PNAS by selection of the appropriate RE enzymes, then an additional RE, a frequent base cutter, is selected and simultaneously used to restrict the non-specific double strand genomic DNA and make it degradable by Exo III. The non-specific DNA includes the interfering pseudogene sequences. All RE enzymes chosen must not have restriction sites within the PNAS structure, to assure the integrity of the PNAS.

The last requirement is the use of a scoring signal that is sensitive and specific. Several techniques have been presented herein; however, it is to be understood that any technique known to those skilled in the art that accurately measures single base addition may be used.

Furthermore, in currently used assays, a problem in scoring and detecting SNPs lies in the existence of non-processed pseudogenes. These pseudogenes interfere with SNP scoring results obtained by technologies such as PCR, based solely on their similarity of sequence and the subsequent recognition of the non-processed pseudogene sequences by the PCR primers involved. Most pseudogenes are non-coding sequences that are highly similar in sequence to the processed, coded gene (processed gene), but are devoid of the non-coding adjacent intervening sequences called introns. The SNP analysis methods and kits of the present invention isolate the target specific coding sequence within an exon that contains an adjacent intron, thereby eliminating this problem of assays in current use.

Classical TPA SNP Analysis With Two Triplex Formations

The classical preferred embodiment of SNP-TPA is presented in Figure 1A to 1C.

STEP I: Isolate the genomic DNA by any method known to those skilled in the art. DNA must be high quality with only nicks and no gaps.

STEP II: Select two restriction endonuclease sites (the same or different enzymes) flanking two polypurine rich regions (preferably at least an 11 mer), one in an exon containing the target SNP, and the other in an adjacent intron. If a non-coding SNP is being scored, the two RE sites simply flank the SNP scoring site location in the genomic DNA. Neither RE enzyme may have a restriction site between the two flanking polyrich target regions.

STEP III: Hybridize with two TFOs, having made sure that no restriction site exists between them.

• First and Second Levels of Specificity.

The TFO presence also ensures that the DNA polymerase will not insert a base on the 3' ends at the triplex sites formed between the capture and SNP identification probes. The enzyme must also show no nicking activity.

STEP IV: Treat the restricted and double triplex containing region with exonuclease (Exo III) to generate the PNAS tails structure. This step will destroy all double strand pseudogenes and other double strand non-specific DNA in the sample, reducing non-specific DNA by 50%.

STEP V: Hybridize the PNAS tails structure with a capture probe. In this embodiment the 3' end of the capture probe possesses a primary amine and is blocked for synthesis and degradation due to its covalent interaction with the solid support used. This capture probe must be long enough to prevent a gap between the two adjoining sequences that is wide enough to allow insertion of any base, modified or other. Such an insertion would provide a false result.

• Third Level of Specificity

STEP VI: Capture the PNAS structure onto a coated solid support.

Capture may be accomplished with any pair of affinity molecules known in the art, such as an n-oxysuccinimide (NOS) coating having an affinity for the terminal amino group.

STEP VII: Wash to remove unbound capture probe and degraded single strand and other degraded DNA.

STEP VIII: Hybridize the PNAS with an SNP identification probe. The SNP identification probe must be long enough to prevent a gap between the two adjoining sequences that is wide enough to allow insertion of the modified base. Such an insertion would provide a false result. This probe must extend from its flanking DNA region (no missing base) to a site adjacent to the site opposite the SNP.

• Fourth Level of Specificity

STEP IX: Wash to remove unbound probe.

STEP X: In one embodiment, add to each well DNA polymerase and the four dideoxynucleotide chain terminators (ddNTPs), each carrying a different fluorescent label. The DNA polymerase will insert the base complementary to the SNP score site.

STEP XI: Wash to remove unbound enzyme and fluorescent ddNTPs.

STEP XII: Measure the fluorescence emission. A single emission will exist for each allele present.
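Step II above calls for polypurine rich regions, preferably of at least 11 bases, as TFO binding sites. A minimal sketch of how candidate regions might be located in a known sequence is given here. The function name `polypurine_runs` and the strict definition of a polypurine region as an uninterrupted run of A and G are assumptions for illustration; the text does not define how purine rich a region must be.

```python
import re


def polypurine_runs(sequence: str, min_length: int = 11):
    """Return (start, end) index pairs of uninterrupted purine (A/G) runs.

    Indices are 0-based and end-exclusive. Only runs of at least
    min_length bases are reported (11 by default, per the 11 mer preference).
    """
    pattern = re.compile(r"[AG]{%d,}" % min_length)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence.upper())]


# Made-up example sequence containing one 12 base purine run.
example = "CCT" + "AGGAAGGAGAGA" + "TCC"
print(polypurine_runs(example))
```

Candidate regions found this way would still have to be checked against the remaining requirements of Step II, such as the absence of a restriction site between the two regions.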

In an alternative embodiment, the mirror image of Figure 1 is produced, such that the capture probe is blocked on its 5' end. Alternatively, the intron may be on the other side of the exon shown in Fig. 1. Alternatively, instead of isolating the DNA with restriction enzymes, any other method known in the art may be used, such as the BU cutter probes.

The numerical designations in Figure 1A to 1C representing SNP-TPA are as follows:

1, the first restriction site on the genomic DNA flanking the SNP site and in the target exon, or beyond the target exon, in an adjacent intron or even a non-coding SNP.

2, the second restriction site, flanking on the other side of the SNP in the exon or in the adjacent intron, or even a non-coding SNP.

3, the genomic double-stranded DNA containing at least part of the exon region containing the SNP site and at least part of an adjacent intron (either side of the exon), or the non-coding SNP.

4, the target exon containing the SNP and a polypurine rich region for binding a TFO or a non-coding SNP.

5, the intron region adjacent to the target exon or a region adjacent to a non-coding SNP region.

6, the SNP site to be scored.

7, the TFO in the exon region (or any non-coding region) upstream from the SNP site, also in the same region.

8, the other TFO in the intron, may be located anywhere in the intron (on either side of the exon), or in an adjacent region to the non-coding SNP.

9, the capture probe that possesses a primary amine (-NH2).

10, the n-oxysuccinimide (NOS) coated solid support, the magnetic bead.

11, the SNP identification probe (also called the position identifier probe or APSP PIP).

12, the site opposite the SNP site, which will be used to score the SNP by single base insertion.

Classical TPA SNP Analysis With One Polyrich Triplex Formation Site

Another classical embodiment of SNP-TPA is presented in Figure 2A to 2C. In this embodiment, only one TFO is hybridized in the target exon. A TFO is not hybridized to the adjacent intron.

This embodiment also relates to SNP scoring of the vast majority of SNPs residing in non-coding regions of the genomic DNA.

STEP I: Isolate the genomic DNA by any method known to those skilled in the art.

STEP II: Select two restriction endonuclease sites (one or two enzymes) flanking a polypurine rich region (preferably at least an 11 mer). The target SNP must reside in the target region to be analyzed. The polypurine region may, however, extend into one of the adjacent introns if the clinically relevant SNP is located in an exon.

STEP III: Hybridize with a single polypyrimidine TFO.

• First Level of Specificity.

The TFO presence prevents the insertion in Step X of a dideoxynucleotide modified base at the nick site formed after hybridization of the capture probe in Step V, and similarly at the nick site formed by the SNP identification probe hybridization in Step VIII.

STEP IV: Treat the restricted and triplex formation region with an exonuclease having 3' to 5' activity, such as Exo III, to generate the PNAS tails structure. The nuclease must have no nicking activity.

STEP V: Hybridize the PNAS structure with a capture probe with the 3' end blocked for synthesis and degradation and possessing a primary amine. This capture probe must be long enough to prevent a gap between the two adjoining sequences that is wide enough to allow insertion of a dye conjugated dideoxynucleotide terminator base. Such an insertion would provide a false result.

• Second Level of Specificity

STEP VI: Capture the PNAS structure onto a solid support coated with the other pair to the affinity molecule (NOS) used in Step V.

STEP VII: Wash to remove unbound capture probe and degraded DNA, namely single stranded DNA and single nucleotides.

STEP VIII: Hybridize the PNAS with an SNP identification probe, also called the position identifier probe (PIP). The SNP identification probe must be long enough to prevent a gap between the two adjoining sequences that is wide enough to allow insertion of the dye conjugated dideoxynucleotide base at this position. Such an insertion would provide a false result. This probe must extend from its flanking DNA region (no missing base) to a site adjacent to the site opposite the SNP.

• Third Level of Specificity

STEP IX: Wash to remove unbound probe.

STEP X: In one embodiment, add to each well DNA polymerase and the four dideoxynucleotide chain terminators (ddNTPs), each carrying a different fluorescent label. The DNA polymerase will insert the base complementary to the SNP score site.

STEP XI: Wash to remove unbound enzyme and fluorescent ddNTPs.

STEP XII: Measure the fluorescence emission. A single emission will exist for each allele present. The sensitivity of this modified base label detection is superior to that of hybridization techniques, where base pair mismatches are difficult to detect.

In an alternative embodiment, the mirror image of Figure 2 is produced, such that the capture probe is blocked on its 5' end. Alternatively, the intron may be on the other side of the exon shown in Fig. 2. Alternatively, instead of isolating the DNA with restriction enzymes, any other method known in the art may be used, such as the BU cutter probes.

The numerical designations in Figure 2A to 2C representing SNP-TPA are as follows:

1, the first restriction site on the genomic DNA flanking the SNP site. In this embodiment, the first restriction site is in the exon region, and must flank the SNP site, or the SNP may exist in a non-coding region other than the exon with restriction sites flanking on both sides.

2, the second restriction site flanking, on the other side, the SNP site and may extend into the adjacent intron region but in this embodiment it is in the exon, or the SNP site may exist in a non-coding region other than the exon with restriction sites flanking on both sides.

3, the genomic double-stranded DNA containing the exon region containing the SNP site or any SNP in a non-coding site. Exclusion of the intron permits pseudogene interference in the SNP analysis in exon containing SNPs.

4, the exon or any region containing the SNP.

6, the SNP site to be scored.

9, the capture probe whose 3' end is blocked for synthesis and degradation, possessing a primary amine.

10, the affinity (NOS) coated solid support, the magnetic bead. When the capture probe has a primary amine, the affinity coat is n-oxysuccinimide.

11, the SNP identification probe or APSP PIP probe.

12, the site opposite the SNP site, which will be used to score the SNP.

24, the TFO specific for a site in the exon region or any non-coding SNP site within 100 bases of the SNP region to be scored (this may also extend into the intron region).

SNP-RFTA

The invention comprises methods that comprise direct DNA and RNA analysis by binding a target-specific oligonucleotide to a single-stranded target (denatured genomic nucleic acid). The target sequence is isolated by any method known to those of skill in the art, such as restriction enzymes or BU cutter probes. The target sequence is denatured and partially hybridized with a primary probe. This probe binds specifically to part of the targeted sequence, leaving single stranded adjacent regions that serve as capture and reporter regions. The resulting complex is called a Partial Duplex Target Probe (PDTP) complex.

Next, the PDTP complex is hybridized with a secondary anchor probe, which is conjugated with a biochemical hook such as an amine. This aminated secondary probe is attached to a solid substrate via any interaction, resulting in formation of a covalent bond, providing the capture and isolation of the target/probe complex. The PDTP is attached to the solid substrate in order to wash and remove all non-specific nucleic acids while still preserving and concentrating the targets.

The PDTP complex, now attached to the solid phase, such as plastic microtiter plates or magnetic beads is separated from the non-specific complexes by a series of washes, removing the non-specific nucleic acid sequences.

A position identifier probe is then hybridized to the PDTP structure. Its 3' end is the site where the single base addition is performed. The base added at this site is complementary to the SNP site base to be scored, and the SNP score is thereby determined to be A, T, G, or C.

It is preferred that, for a clinically relevant SNP, the SNP be present in the target exon region with a flanking intron region present on either side. The majority of SNPs are non-coding, and these can be detected with the RFTA SNP scoring methods of the present invention. Most preferred is that SNP scoring methods have two flanking restriction sites. Restrictions can be accomplished with BU cutter probes.

It is also preferred that probe hybridization be configured so that no gaps or nicks are present within the PDTP structure that are large enough for the DNA polymerase chosen to participate in non-specific base insertion, which would lead to a mixed or false result. In other words, it is most preferred that no 3' ends be available for single base addition other than at the position complementary to the SNP scoring site. Probe hybridization of adjoining regions preferably results in only a nick that is not a substrate for DNA polymerase gap widening activity, and the polymerase should have no gap widening activity. Most preferably, the DNA polymerase used has only 3' synthesis function and no 5' exonuclease or endonuclease activity.

APSP RFTA SNP Automated Analysis

In the SNP APSP RFTA automated embodiment of SNP scoring, one or two 8-aminopurine substituted probes (APSP), an APSP capture probe, the APSP PIP (position identifier probe), and one or more APSP assisting probes may be necessary for hybridization to the target, depending on the clinically relevant SNP (exon or exon/intron) analysis, or the non-coding SNP to be analyzed. After positioning, all must function to render the PDTP resistant to Exo III. Alternatively, the choice of restriction endonuclease can achieve this, as previously discussed. The probes also displace the W and C strands of the scoring target, allowing single base insertion to occur, and define the SNP scoring site as the sole site for single base addition. All other 3' ends are capped or configured so as not to be a substrate for DNA polymerase single base addition.

If RE functions to render the PDTP Exo III resistant, due to the generation of 3' overhang ends that are not a substrate for Exo III degradation, another RE must also be used in combination to generate blunt ends or 5' overhangs outside the PDTP in order to allow Exo III to degrade non-specific genomic DNA including the pseudogene. Neither RE may have a restriction site within the PDTP region.
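The requirement that neither RE have a restriction site within the PDTP region can be checked against a known sequence, as sketched below. The function names `reverse_complement` and `site_free` are hypothetical, and the sketch assumes exact recognition sequences written with the letters A, C, G and T only (no degenerate bases). Both strands are checked by also searching for the reverse complement of each recognition sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def site_free(region: str, recognition_sites) -> bool:
    """Return True if none of the recognition sites occurs in the region.

    region: sequence of the protected region, given as one strand.
    recognition_sites: recognition sequences of the chosen enzymes.
    """
    region = region.upper()
    for site in recognition_sites:
        site = site.upper()
        # A site on the opposite strand appears as its reverse complement.
        if site in region or reverse_complement(site) in region:
            return False
    return True


# GAATTC is the EcoRI recognition sequence; the region below contains it.
print(site_free("AAGAATTCAA", ["GAATTC"]))
```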

In an APSP RFTA embodiment, the APSP probes, the APSP capture probe, the APSP PIP probe, and the APSP assisting probe all function to destabilize and displace either the target W or C strands and form more stable aminopurine substituted duplex structures, while opening the target W and C strands and making the SNP complementary site available for single base addition by the DNA polymerase. It is preferred that the single base addition for SNP analysis occur in the open region of the W and C target duplex within 100-150 bases separating any two APSP probe formed duplexes. The length of the APSP probes and the number of amino-substituted purine residues depend on the length that will bind with sequence specificity, preferably 11 mer or greater, and on the ability of the APSP probe to keep the W and C strands destabilized and allow single base addition to occur between them.

When aminopurine substituted probes separate the W and C strands by strand displacement and when these substituted purine structures are at least 100-150 bases apart, single base addition, repeated base addition, or synthesis can occur in the widened gap between the two aminopurine substituted duplexes formed. It is preferred that the single base addition site be within 100-150 bases of two APSP probes, or alternatively, that an additional aminopurine substituted probe called the APSP assisting probe be hybridized within 100-150 bases from the site of single base addition. This APSP assisting probe can be placed by hybridization anywhere along the target region to facilitate strand widening in the proximity of the SNP scoring site. Similarly, the DNA polymerase must have only 3' synthesis function and no 5' exonuclease activity, or the process probes may be compromised. Also, the DNA exonuclease (Exo III) must be tested to preclude any endonuclease activity, single strand exonuclease activity, or gap widening activity that would compromise the PNAS structure and lead to a false scoring result due to multiple sites capable of single base addition.

Furthermore, since elevated incubation temperatures of 50°C - 60°C favor W and C target strand displacement by the aminopurine substituted probes, a high fidelity, thermostable polymerase may be useful for single base insertion, and may include any known to those skilled in the art.

The embodiments represented in Figures 14 to 21, described in Table II, have an aminopurine-substituted (APSP) PIP probe with the SNP scoring complementary site or single base addition site at its 3' end. Next, it is preferred that either an APSP capture probe or another aminopurine-substituted (APSP) assisting probe be within 100-150 bases from the SNP scoring site to open the W and C genomic DNA of the target for SNP scoring. Last, the single base addition method can be any known to those skilled in the art that functions to add the single base opposite the SNP scoring site in the open W and C regions between the probes. Scoring entails any method known by those skilled in the art to determine the single base that is added, the complement of which is the base at the scoring site.

The embodiments presented in Table II for RFTA require that either an exonuclease treatment (Exo III) is employed or a certain restriction endonuclease is selected for use. The goal of SNP scoring is single base addition exclusively at the site complementary to the SNP scoring site for each allele being scored. Thus, the complement to the single added base is the SNP score. There must be no additional 3' end available for single base addition by the added DNA polymerase and the four dNTPs. Embodiments found in Table II show a PDTP with two 3' ends suitable for single base addition by the DNA polymerase. To prevent this and cap the 3' ends, Exo III treatment is performed, which results in the 3' overhang indicated in the figures listed in Table II.

Omission of the Exo III treatment requires that alternative methods be employed to prevent single base addition at the 3' ends of the PDTP. This involves limiting the restriction endonucleases used to those that generate either 3' overhang cuts or blunt end cuts, both of which prevent or "cap" the 3' PDTP ends from being a site for single base addition. Additionally, BU cutter probes can be used. If the Exo III resistance is conferred on the PDTP by selection of the appropriate RE enzymes for use, then the non-specific DNA must be degraded. For example, an additional RE, a frequent base cutter, must be selected and simultaneously used to restrict the non-specific genomic DNA and make it degradable by Exo III. This will degrade the interfering sequences. All RE enzymes chosen must not have restriction sites within the PDTP structure, to assure the integrity of the PDTP, as previously discussed. The scoring signal must be sensitive and specific. Several techniques have been presented herein; however, it is to be understood that any technique known to those skilled in the art to accurately measure single base addition may be used.

Classical RFTA SNP Analysis: Scoring without Triplex Formation

The methods of the present invention for detecting and analyzing SNPs are incorporated by reference from the Restriction Fragment Target Assay (RFTA), disclosed in provisional patent applications 60/065,378, 60/075,812, 60/076,872 and PCT Application No. PCT/US98/24226.

The process of classical scoring by SNP-RFTA is presented in Figure 3A to 3C.

STEP I: Isolate the genomic DNA by any method known to those skilled in the art.

STEP II: Isolate a 4 base varying region with the target SNP located in the target exon or any non-coding region. The 4 base varying region may extend into one of the adjacent introns. The targeted DNA section containing the 4 base varying region and the target SNP may be isolated with restriction enzymes or a BU cutter probe. As shown in Figure 3 and described herein, restriction enzymes isolate the target DNA.

STEP III: Denature the isolated double-stranded DNA fragment.

STEP IV: Hybridize with a primary probe that forms the PDTP structure. The primary probe does not overlap the SNP, but extends beyond the isolated DNA fragment on the end opposite the SNP, thereby creating a tail.

• First Level of Specificity

STEP V: Hybridize the tail of the PDTP structure with a capture probe. In Figure 3, the capture probe has its 3' end blocked for synthesis and degradation and has an affinity molecule attached, such as a primary amine. This capture probe must be long enough to prevent a gap that is wide enough to allow insertion of a modified dideoxynucleotide base between the two adjoining sequences. Such an insertion would provide a false result.

• Second Level of Specificity

STEP VI: Capture the PDTP structure onto an affinity coated solid support, such as n-oxysuccinimide (NOS) when the capture molecule has a primary amine as the other member of the affinity pair.

STEP VII: Wash to remove unbound capture probe and degraded DNA.

STEP VIII: Hybridize the PDTP with an SNP identification probe. This probe must be long enough to prevent a gap wide enough to allow insertion of a dye bound dideoxynucleotide base between the two adjoining sequences. Such an insertion would provide a false result. The SNP identification probe must extend from next to the adjoining region up to the site just before the base position paired opposite the SNP site.

• Third Level of Specificity

STEP IX: Wash to remove unbound probe.

STEP XI: Wash to remove unbound enzyme and fluorescent ddNTPs.

STEP XII: Measure the fluorescence emission. A single emission will exist for each allele present. The sensitivity of this label is superior to hybridization techniques where base pair mismatches are difficult to detect.

In an alternative embodiment, the mirror image of Figure 3 is produced, such that the capture probe is blocked on its 5' end. Alternatively, the intron may be on the other side of the exon shown in Fig. 3. Alternatively, instead of isolating the DNA with restriction enzymes, any other method known in the art may be used, such as the BU cutter probes.

The numerical designations for SNP-RFTA in Figures 3A to 3C are:

1, one restriction site upstream from the scoring site.

2, the second restriction site downstream from the scoring site.

4, the exon or any non-coding region containing the SNP.

6, the SNP site to be scored.

9, the capture probe with a 3' end affinity molecule, such as a primary amine.

10, the solid support coated with the other member of the affinity pair, here n-oxysuccinimide (NOS).

11, the SNP identification probe or APSP PIP as described.

12, the site complementary to the SNP where the modified dideoxynucleotide terminator base is placed by the DNA polymerase.

13, the restricted DNA fragment

14, the primary probe that produces the PDTP structure

Automated TPA And RFTA SNP Analysis: Three Strategies For Base Scoring

The first method, known to those skilled in the art, depends on insertion of a dye conjugated dideoxynucleotide terminator base, followed by detection of the fluorescence emission characteristic of the specific dye on the inserted base (each base has a different fluorescence emitting dye). The scoring sample may also be split (four tubes) with a single labeled base added to each, or a single tube may be used with a cocktail of four bases, each with a different label.

The second method requires insertion of a regular dideoxynucleotide terminator base (without conjugated dye) or simply a dye conjugated natural base, followed by pyrosequencing identification of the inserted base, accomplished by measurement of the inorganic pyrophosphate (ppi) release expressed as visible light production, as known to those skilled in the art (each base is unmodified, i.e., no dye or label attached).

The third method provides an amplification of signal by use of a biotinylated (labeled) dideoxynucleotide terminator base, or simply a biotinylated base, whose insertion is followed by binding of a streptavidin-alkaline phosphatase polymer conjugate to generate a color signal (each base is modified with biotin or a similar molecule to initiate the color generation by attaching the enzyme polymer to the PNAS or PDTP structure).

AUTOMATABLE TPA SNP ANALYSIS: RP-TFO TPA SNP

Classical TPA is dependent upon triplex protection of the target (PNAS) at physiologic pH, 7.0 to 7.6. Because triplex nucleic acid forms are inherently more stable at pH 6.0, technical difficulties arise in producing a stable triplex at physiologic pH, a prime requirement for the enzyme activity necessary for diagnostic assays. Attempts to stabilize the triplex at pH 7.0 using halogenated pyrimidines and 5-methyl cytosine substitution in the TFO have failed.

Another embodiment of the present invention, called SNP RP-TFO TPA, resolves the problems of triplex stability and provides a ready automation platform for SNP TPA. A reverse phase triplex forming oligonucleotide (RP-TFO) is used. Embodiments of this probe are described in U.S. Provisional Patent Application Nos. 60/162,627 and 60/197,559, and in the U.S. Patent Application filed October 30, 2000, which claims priority to the above two provisionals. The probes comprise a polypurine section and a polypyrimidine section joined 3' to 3' or 5' to 5' by a linker or spacer region, so the RP-TFO can have two 3' ends or two 5' ends. The first region is a polypurine region, more specifically in the embodiment of the present invention having 8-aminopurine substituted bases, followed by a spacer and then a polypyrimidine region of opposite polarity that has the pyrimidine base sequence complementary to the purine region. For example,

5'GAGGAAA-spacer-TTTCCTA5'

3'GAGGAAA-spacer-TTTCCTA3'

(in which any of the above purine bases can have amines substituted at the 8 position)

The nature of this hairpin is that it folds at the spacer and forms Hoogsteen bonds between the two strands.

The stability at pH 7.0 of the triplex formed between the RP-TFO and the polypyrimidine strand (C) of the target is due to the presence of 8-aminopurine substituted bases in the purine half of the RP-TFO, which form additional hydrogen bonds that stabilize the triplex when the RP-TFO binds to the complementary polypyrimidine strand of the target DNA. Melting point determination and circular dichroism studies indicated that the RP-TFO triplex formed at pH 7.0 has a melting temperature of 54°C to 57°C, and forms a stable triplex at physiologic pH and at standard in vitro diagnostic temperatures of 25°C to 40°C, where nucleolytic enzymes demonstrate adequate activity.

Automatable RFTA SNP Analysis: APSP RFTA SNP

A problem in the reduction to automation of classical RFTA is the necessity for denaturation of the ds target DNA after nuclease restriction of genomic DNA to release the target.

In automated methods, an embodiment comprising APSP, or aminopurine substituted probes, is used. In this embodiment, SNP scoring can be achieved on genomic DNA without the need for a denaturation step after target restriction. A 4 base varying probe is synthesized that is complementary to the Watson or Crick strand of the target region and possesses a number of 8-aminopurine (8aA and 8aG) bases. The length of the APSP probe is governed by two parameters. First, the modified purines within it must be of sufficient number to displace the Watson or Crick strand of the target and form a secondary, more stable duplex melting at an even higher temperature than the regular W and C duplex (containing non-modified purines). Second, the APSP probe need only be long enough to destabilize the original W and C duplex and prevent reversal to the original non-purine modified duplex. The additional hydrogen bonding offered by the amino-substituted purines and the elevated experimental temperatures should keep the newly formed secondary APSP duplex as stable as the RP-TFO triplex.

In both the RP-TFO and APSP embodiments of TPA and RFTA, respectively, the aminopurine substituted oligonucleotide causes displacement of one of the strands of the target region (double stranded genomic DNA) that is relaxed by heating to 50°C to 60°C. Thus, in both embodiments, new stable PNAS and PDTP structures are generated that can participate in SNP scoring, STR repeat sequencing, and many other diagnostic activities.

The present invention comprises methods and compositions for automation of genomic DNA preparation for use in analysis for single nucleotide polymorphism scoring.

Automation of Genomic Nucleic Acid Isolation For Diagnostic Analyses

The isolation of genomic nucleic acid is well known to those skilled in the art. Manual, semi-manual, and automated processes have been developed.

A preferred method of automating the disclosed inventions comprises use of the Automated Genomic Diagnostic Analyzer (AGENDA) of CyGene, Inc. This instrument, with its magtration technology, is made by Precision System Science Company, Ltd., Tokyo, Japan. U.S. Provisional Patent Application No. 60/220,582 filed July 25, 1999, discloses use of these robotic processes.

The robotic device, AGENDA, is capable of nucleic acid isolation from blood, and other sample sources such as urine, foodstuffs, and samples from environmental sources. Any magnetic particle technology may be used with the AGENDA robotic machine and will provide the platform to run the nucleic acid isolation, extraction and other assays.

Configuration of the robotic process for nucleic acid analyte isolation can be achieved by any of a number of processes known by those skilled in sample preparation.

Automated Isolation Of Nucleic Acid

The AGENDA robotic devices deliver a high quality and high efficiency yield of the DNA present in the size of the blood sample used. Those skilled in the art know that 65 microliters of peripheral whole blood contain approximately 1 microgram of DNA. Use of varying sample sizes provides the following yield results:

Sample volume    DNA yield (μg)
50 μl            0.77
100 μl           1.54
150 μl           2.31
200 μl           3.08
250 μl           3.85
500 μl           7.70

Methods for the manual processing of peripheral blood are known by those skilled in the art and may be used as a preliminary step in genomic DNA isolation and diagnostic analyses.

One embodiment involves taking a number of 15 ml tubes of peripheral blood, manually isolating nucleic acid from the buffy coat (WBCs), and providing the nucleic acid in a reduced sample volume of <500 μl. Calculations, based on buffy coat extraction, place the yield of DNA in 15 ml of peripheral blood at approximately 231 micrograms, or:

DNA manual yield

1 x 15 ml blood tube    231 μg
2 x 15 ml blood tube    462 μg (≈ 0.5 mg)
3 x 15 ml blood tube    693 μg
4 x 15 ml blood tube    924 μg (≈ 1 mg)
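Both yield tables follow from the single figure quoted above, that roughly 65 microliters of whole blood contain about 1 microgram of DNA. The short Python sketch below is offered only as an illustration of that arithmetic; the function name and constant are hypothetical and are not part of the AGENDA software. It reproduces the tabulated values to within rounding (the tables appear to scale the rounded 0.77 μg figure, so the 500 μl entry differs in the last digit).

```python
# Illustrative only: scale DNA yield from the stated ratio of
# about 1 microgram of DNA per 65 microliters of whole blood.
UL_BLOOD_PER_UG_DNA = 65.0  # taken from the text; an approximation


def expected_dna_yield_ug(blood_volume_ul: float) -> float:
    """Return the approximate DNA yield in micrograms for a blood volume in microliters."""
    return blood_volume_ul / UL_BLOOD_PER_UG_DNA


if __name__ == "__main__":
    # Automated sample sizes from the first table
    for volume_ul in (50, 100, 150, 200, 250, 500):
        print(f"{volume_ul:>6} ul  {expected_dna_yield_ug(volume_ul):.2f} ug")
    # Manual processing: one to four 15 ml tubes
    for tubes in (1, 2, 3, 4):
        print(f"{tubes} x 15 ml  {expected_dna_yield_ug(tubes * 15_000):.0f} ug")
```
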

A preferred embodiment of this manual nucleic acid isolation procedure is as follows; however, any other method can be used.

STEP I: Centrifuge the blood tubes in a centrifuge at low speed (800xg for 10 minutes at room temperature) to sediment the RBCs (red blood cells) and provide a buffy coat layer at its upper boundary with the plasma.

STEP II: Remove the plasma with a pipette and discard.

STEP III: Remove the buffy coat and pool buffy coats from as many blood collection tubes as is required.

STEP IV: Add a hypotonic RBC lysis buffer to the pooled buffy coats (5 to 10 times volume is appropriate) that will lyse the RBCs and have no effect on the nucleated WBCs. RBC lysis buffer (from Short Protocols in Molecular Biology, 1999):

10 mM Hepes buffer, pH 7.9

1.5 mM MgCl₂

10 mM KCl

STEP V: Centrifuge at low speed (1300 x g for 15 minutes) to pellet the WBCs and resuspend in a minimal volume of WBC lysis buffer containing proteinase K. Incubate at the optimum temperature for proteinase K activity, WBC lysis, and inactivation of all cellular DNA or RNA nucleases. WBC lysis buffer:

20 mM EDTA

0.5% SDS

100 μg/ml proteinase K final conc.

Incubate at 55°C for 30 minutes. Add 250 μl of lysis buffer per 500 μg of buffy coat (2 mg/ml DNA in solution is the maximal attainable solubility).

STEP VI: Place WBC lysate into 2x 250 μl microtiter plate wells in WBC lysis buffer and add 12.5 μl 10M LiCl to each well (500mM LiCl final concentration). Add magnetic beads coated with poly dT to remove mRNA from the nucleic acid containing lysate for future mRNA analysis.
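The LiCl addition in Step VI is a simple dilution, and it can be useful to see how the nominal 500 mM figure arises. The sketch below is an illustration under the assumption that each well holds 250 μl of lysate before the 12.5 μl of 10 M LiCl is added; the function name is hypothetical. Accounting for the added volume gives a value slightly under 500 mM.

```python
def final_concentration_mM(stock_mM: float, stock_ul: float, sample_ul: float) -> float:
    """C1*V1 = C2*V2 dilution, with the final volume taken as sample plus added stock."""
    return stock_mM * stock_ul / (sample_ul + stock_ul)


if __name__ == "__main__":
    # 12.5 ul of 10 M (10,000 mM) LiCl into an assumed 250 ul of lysate per well
    licl_mM = final_concentration_mM(stock_mM=10_000, stock_ul=12.5, sample_ul=250)
    print(f"LiCl final concentration: {licl_mM:.0f} mM")
```

If the added volume is ignored (12.5 μl × 10 M / 250 μl), the result is exactly 500 mM, which matches the figure quoted in the step.
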

TPA And SNP scoring

RP-TFO TPA Embodiments: Numerical Designations

Figures 4 to 13 present RP-TFO TPA SNP scoring embodiments with three scoring signals, hereafter referred to as enzyme conjugates, pyrosequencing, and dye-conjugated dideoxynucleotide terminator bases and other bases; the embodiments are not limited to these methods and include all methods known to those skilled in the art.

Figures 4 to 13 in Table I represent preferred embodiments of RP-TFO TPA SNP scoring. The following numerical designations describe the PNAS structures generated in these embodiments that function as a site for the SNP single base addition, and reactions for scoring:

1, the restriction endonuclease site, one at each end of the PNAS, found in the exon or upstream from a non-coding SNP

2, the second restriction endonuclease site, found in the exon or the intervening intron or downstream from a non-coding SNP

4, the exon region that encodes the production of the SNP gene product and contains the SNP scoring site, or the non-coding SNP (no gene product produced)

6, the SNP scoring site, located on the W or C strand of the target exon, or the target exon of the exon/intron group, or a non-coding SNP to be scored

10, the COOH derivatized magnetic beads to which are covalently coupled the capture probe or solid support

11, the APSP PIP, the 8-aminopurine substituted position identifier probe that functions to, one, displace the W or C strand in the exon or non-coding SNP region, and two, define the single base addition site at its 3' end

12, the site of single base addition that is complementary to the SNP scoring site

15, the substrate for Exo III activity (a blunt end or a 5' overhang end)

17, a polypyrimidine rich region of the W or C strand existing in the exon or intron, or a non-coding SNP

18, the 8-aminopurine substituted region of the RP-TFO

19, the polypyrimidine region of reverse polarity that acts as a TFO

20, the spacer or linker molecule connecting the two reverse polarity regions of the RP-TFO

21, the lack of additional Restriction Endonuclease sites similar to those located on the ends of the PNAS in the region between them

22, the APSP assisting probe, a probe that must displace the W or C strand for scoring and must be hybridized within 100 bases of the 3' end of the APSP PIP which is the scoring site, that functions to displace the W or C strands for scoring by allowing single base addition to occur at the SNP scoring site

23, the 3' free end that results as a post Exo III treatment

25, a primary amine affinity molecule on a spacer

26, the RP-TFO capture probe, possessing the primary amine affinity molecule

APSP RFTA Embodiments: Numerical Designations

Figures 14 to 21 represent APSP RFTA SNP scoring embodiments to be interfaced with three scoring signals, hereafter referred to as enzyme conjugates, pyrosequencing, and dye-conjugated dideoxynucleotide terminator bases and other bases; the embodiments are not limited to these methods and include all methods known to those skilled in the art. Figures 14 to 21 in Table I represent some preferred embodiments of APSP RFTA SNP scoring. The following numerical designations describe the PDTP structures generated in these embodiments that function as a site for the SNP single base addition reaction for scoring:

1, the restriction endonuclease site, one at each end of the PNAS, found in the exon, or upstream from a non-coding SNP

2, the second restriction endonuclease site, found on the other side of the SNP in the exon or the intervening intron or downstream from a non-coding SNP

6, the SNP scoring site, located on the W or C strand of the target exon, or the target exon of the exon/intron group, or the non-coding SNP to be scored

10, the magnetic bead, or solid support with surface-COOH groups, used to capture the PDTP by covalent binding to the primary amine of the capture probe

11, the APSP PIP, the 8-aminopurine substituted position identifier probe that functions to, one, displace the W or C strand in the exon, and two, define the single base addition site at its 3' end

12, the site of single base addition that is complementary to the SNP scoring site

15, the substrate for Exo III activity (a blunt end or a 5' overhang end)

21, the lack of additional Restriction Endonuclease sites similar to those located on the ends of the PNAS in the region between them

22, the APSP assisting probe, a probe that must displace the W or C strand for scoring and must be hybridized within 100 bases of the 3' end of the APSP PIP which is the scoring site, that functions to displace the W and C strands for scoring by allowing single base addition to occur at the SNP scoring site

23, the 3' free end that results as a post Exo III treatment

27, the APSP capture probe, possessing the primary amine affinity molecule

Automation of RP-TFO TPA SNP and APSP RFTA SNP Scoring

The following steps are performed on the AGENDA robotic device to SNP score genomic DNA. They pertain to all the embodiments depicted in Figures 4 to 21 and found in Table I and Table II, representing the various embodiments of RP-TFO TPA and APSP RFTA. Furthermore, these steps, up to the point of single base addition, are compatible with use of dye conjugated dideoxynucleotide terminators, pyrosequencing, and any other method known to those skilled in the art that allows single base addition and permits scoring of the SNP site by determination of the base added. Both TPA and RFTA have similar methods for automation. As such, the process steps will be presented once, with the appropriate differences duly noted. After lysis of the cells and isolation of nucleic acids, as given in Steps I to VI of the buffy coat example above, the automated steps are as follows:

STEP VII: Aspirate oligo (dT) derivatized magnetic beads into pipet tip to capture poly(A)⁺ mRNA. Move pipet tip to well containing lysate. Mix beads with lysate. Put magnet down (on). Draw all liquid into the pipet tip past the magnet.

STEP VIII: Move the pipet, with the liquid, to another microtiter plate well containing another magnetic bead, coated with a polymer that will bind the highly negatively charged DNA molecules. Dispense the liquid into the well. Dispose of the tip unless mRNA analysis is necessary, which is not the case herein.

STEP IX: Get a new pipet tip. Mix the beads and the liquid in the pipet tip. Put the magnet down (on). Bring the liquid into the pipet tip past the magnet. Dispense the liquid into the well.

STEP X: Move the pipet tip, with the beads held in place, to a new microtiter plate well containing a wash buffer. The wash buffer is appropriate to the composition of the beads. Remove the magnet. Mix the beads with the wash buffer. Put the magnet down. Bring liquid back into the pipet tip such that the beads are again held in the pipet tip by the magnet. Dispense the liquid into the well.

STEP XI: Move the pipet tip, with the beads held in place, to a new microtiter plate well containing elution buffer. The elution buffer (e.g. 10 mM Tris-HCl pH 9.8) is determined by the magnetic bead coating. Remove the magnet. Resuspend the beads in 180 μl of elution buffer. Incubate at an appropriate temperature and time (e.g. 50°C for at least 5 minutes). Put magnet down. Draw liquid into pipet tip past the magnet such that all the beads are stuck to the side of the tip.

STEP XII: Move the pipet tip, containing the magnetic beads and liquid, to a new microtiter plate well and aspirate the liquid, 50 μl 10X RE buffer. Release the magnet. Aspirate and dispense the buffer in and out of the pipet tip until the magnetic beads are loosened and suspended in the well. Incubate at 37°C for appropriate time. Example for 10X Alu-I RE buffer:

500 mM NaCl

100 mM Tris pH 7.9 (Tris elutes the DNA from the magnetic bead)

100 mM MgCl₂

10 mM DTT

Apply the magnet. Aspirate the contents of the well such that the magnetic beads are held in the pipet tip. Dispense the liquid back into the well. Dispose of the pipet tip.

STEP XIII: Get new pipet tip. Go to microtiter plate containing RE. Aspirate 50 μl of RE. Move pipet tip, with RE, to the microtiter plate well containing the released DNA. Dispense the RE into the well. Mix the solution in the pipet tip. Incubate the solution at 37°C for 30 to 60 minutes.

STEP XIV: Move pipet tip to a new microtiter plate well containing 25 μl 10X hybridization buffer and magnetic beads derivatized with the capture RP-TFO probe (and any additional RP-TFO or other probe that binds to the PNAS, or the APSP capture probe that binds to the PDTP structure). Mix the suspension of beads and buffer in the pipet tip. Aspirate the mixture into the pipet tip. Move the pipet tip back to the microtiter plate well containing the restricted DNA fragments. Dispense the contents of the pipet tip into the well. Mix the material in the pipet tip. Incubate the material in the well at 40°C to 50°C for 30 to 60 minutes. The hybridization buffer is:

25 μl of a 1 M sodium phosphate-citric acid solution, pH 5.5 to 7.0, to a final buffer concentration of 0.1 M Na phosphate-citric acid, pH 5.5 to 7.0

STEP XV: Put the magnet down. Aspirate all the material in the well past the magnet such that all the magnetic beads are stuck to the side of the pipet tip by the magnet. Dispense all the liquid from the pipet tip into the well. Move the pipet tip, with the magnetic beads, to a new microtiter plate well containing Exo III treatment buffer:

50 mM Hepes pH 7.0

10 mM MgCl₂

1 mM spermidine

or, one could use an alternate Exo III buffer:

5 mM MgCl₂

1 mM DTT

Aspirate all the buffer into the pipet tip. Dispense the liquid back into the well. Move the pipet tip, with magnetic beads, to a new microtiter plate well containing Exo III treatment buffer.

Draw the liquid (200 μl) into the pipet tip. Dispense the liquid back into the well.

STEP XVI: Move the pipet tip, with the magnetic beads held in place, to a new microtiter plate well containing Exo III buffer and Exo III. The Exo III is present at a concentration of 5 U of Exo III per picomole of template. The presumption is that there are 2.5 picomoles of template present. The total volume of reagents in the well is 200 μl. Remove the magnet. Draw the liquid from the well into and back out of the pipet tip until the magnetic beads are loosened. Incubate the magnetic beads with the reagent in the well at 37°C for 5 to 10 minutes. Optionally, one may inactivate the Exo III at this point with addition of 0.5 M EDTA pH 8.0. Add the EDTA at 1/10 the reaction volume to result in a 50 mM final concentration of EDTA. Apply the magnet. Aspirate the liquid into the pipet tip such that all the magnetic beads are stuck to the side of the pipet tip. Dispense liquid into well.

STEP XVII: Move the pipet tip, with the magnetic beads, to a new microtiter plate well containing 200 μl of Exo III buffer. Aspirate and dispense the buffer into and out of the pipet tip to remove degraded non-specific DNA and enzymes. Move the pipet tip, with magnetic beads, to a new microtiter plate well containing 200 μl of hybridization buffer. The hybridization buffer is 3X SSPE (slightly modified):

0.5 M NaCl

1 mM EDTA

30 mM NaH₂PO₄, pH 7.0

STEP XVIII: Remove the magnet. Aspirate and dispense the buffer into and out of the pipet tip until all the magnetic beads are loosened and are in the well. Hybridize the position identifier probe (APSP PIP), used in TPA and RFTA, at 40°C for 5 to 10 minutes. The PIP has a 3' end next to the single base addition site that is complementary to the SNP site to be scored. The length of this probe and the number of 8aP substituted bases are chosen to give the PIP a strand displacement potential similar to that of the RP-TFO. Optimal local duplex strand displacement occurs when either two RP-TFOs or one RP-TFO and one 8aP substituted oligo are hybridized to the genomic DNA within 100 to 150 bases of each other.

STEP XIX: Apply the magnet. Aspirate the liquid into the pipet tip such that all the magnetic beads are stuck to the side of the pipet tip. Dispense liquid back into the well.

STEP XX: Wash to remove excess APSP PIP by moving the pipet tip, with the magnetic beads, to a new microtiter plate well containing 200 μl of 3X modified SSPE buffer heated to 40°C. Remove the magnet. Aspirate the buffer into and out of the pipet tip such that the magnetic beads are loosened and are suspended in the well. Aspirate the mixture from the well into the pipet tip. Apply the magnet. Dispense liquid into well leaving magnetic bead with attached PNAS or PDTP in pipet tip.
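The reagent quantities given for the Exo III treatment in Step XVI can be checked with a short calculation. The sketch below is illustrative only; the function names are hypothetical, and the only inputs taken from the text are 5 U of Exo III per picomole of template, 2.5 picomoles of template, a 200 μl reaction, and 0.5 M EDTA added at 1/10 of the reaction volume. When the added EDTA volume is included in the final volume, the concentration comes out slightly under the nominal 50 mM stated in the step.

```python
def exo3_units_needed(template_pmol: float, units_per_pmol: float = 5.0) -> float:
    """Units of Exo III for a given amount of template (5 U/pmol per the text)."""
    return template_pmol * units_per_pmol


def edta_quench(reaction_ul: float, stock_mM: float = 500.0, fraction: float = 0.1):
    """Return (EDTA volume in ul, nominal mM, mM after accounting for added volume)."""
    edta_ul = reaction_ul * fraction
    nominal_mM = stock_mM * fraction  # ignores the volume of EDTA added
    actual_mM = stock_mM * edta_ul / (reaction_ul + edta_ul)
    return edta_ul, nominal_mM, actual_mM


if __name__ == "__main__":
    print("Exo III units:", exo3_units_needed(2.5))
    volume, nominal, actual = edta_quench(200.0)
    print(f"EDTA: {volume:.0f} ul, nominal {nominal:.0f} mM, actual {actual:.1f} mM")
```
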

At this point, single base addition is performed by any method known to those skilled in the art, and the particular base added is determined, its complement being the SNP scoring site score.
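The scoring rule just stated, that the SNP score is the complement of the base added, can be written out for the four-aliquot format used in the scoring methods that follow. This is a minimal sketch, not the analyzer's software: the function name, the signal values, and the fixed threshold are assumptions made for illustration, and a real assay would need empirically determined cutoffs.

```python
# Watson-Crick complements: the base added by the polymerase pairs with the SNP base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def score_snp(well_signal: dict, threshold: float):
    """Call SNP alleles from four wells, each keyed by the single base offered.

    Returns (alleles, call). Alleles are the complements of the bases that
    produced a signal at or above the (hypothetical) threshold.
    """
    added = [base for base in "ATGC" if well_signal.get(base, 0.0) >= threshold]
    alleles = [COMPLEMENT[base] for base in added]
    if len(alleles) == 1:
        call = "homozygous"
    elif len(alleles) == 2:
        call = "heterozygous"
    else:
        call = "no call"
    return alleles, call


if __name__ == "__main__":
    # Hypothetical signals: bases A and G were incorporated, so the alleles are T and C.
    example = {"A": 0.90, "T": 0.05, "G": 0.80, "C": 0.02}
    print(score_snp(example, threshold=0.5))
```
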

Use of Enzyme Conjugates for SNP Scoring

Process Steps I to XX are identical to those previously discussed. The next step performs the single base addition, and the following steps provide the enzyme conjugate method for determining the particular base added, the complement of which is the SNP score, with a single base added for each allele scored. Again, the TPA and RFTA process steps are similar.

STEP XXI: Move the pipet tip containing the magnetic beads with attached PNAS or PDTP structures from the previous step to a new microtiter plate. Dispense 4 equal aliquots to 4 successive wells containing high fidelity DNA polymerase, biotinylated dNTPs and appropriate buffer:

Aliquot                 1                2                3                4
Enzyme                  DNA Polymerase   DNA Polymerase   DNA Polymerase   DNA Polymerase
dNTPs (biotinylated)    dATP-bio         dTTP-bio         dGTP-bio         dCTP-bio

Incubate at 40°C for a short period of time to facilitate single base extension in one of the four sample wells for each allele present.

STEP XXII: Apply the magnet. Aspirate the mixture from the well into the pipet tip such that all the magnetic beads are held on the side of the pipet tip. Dispense the liquid back into the well. Move the pipet tip containing the magnetic beads to a new microtiter plate containing alkaline phosphatase 7.0 wash buffer:

0.1 M Tris-HCl pH 7.0

0.1 M NaCl

2 mM MgCl₂

Remove the magnet. Aspirate and dispense the buffer into and out of the pipet tip until the magnetic beads are loosened. This removes unbound labeled dNTPs. Apply the magnet. Aspirate the mixture from the well into the pipet tip such that the magnetic beads are held on the side of the pipet tip. Dispense the liquid back into the well.

STEP XXIII: Move the pipet tip, with the magnetic beads, to a new microtiter plate well containing streptavidin-alkaline phosphatase polymer conjugate at 40°C. Release the magnet. Aspirate and dispense the liquid from the well into and out of the pipet tip until all the magnetic beads are loosened and suspended in the well.

STEP XXIV: Apply the magnet. Aspirate the liquid into the pipet tip such that all the magnetic beads are held on the side of the pipet tip. Dispense liquid back into the well.

STEP XXV: Move the pipet tip, with the magnetic beads, to a new microtiter plate well containing alkaline phosphatase 7.0 buffer. Aspirate and dispense the liquid from the well to remove unbound conjugate from the beads. Move the pipet tip, with the magnetic beads, to a new microtiter plate well containing alkaline phosphatase 9.5 buffer at 37°C. Alkaline phosphatase 9.5 buffer is:

0.1 M NaCl

50 mM MgCl₂

STEP XXVI: Add the alkaline phosphatase colorimetric substrate and incubate at 37°C for variable time and quantify color.

Use of Pyrosequencing For SNP Scoring

Again, process Steps I to Step XX are identical as previously described. The next step calls for and allows single base addition and the following steps provide the pyrosequencing method for determining the particular base added, the complement of which is the SNP score in both TPA and RFTA.

STEP XXI: Perform the single base extension to score the SNP by adding a high fidelity DNA polymerase and unlabeled dNTPs, in the following mixtures, to 4 aliquots of the sample processed through binding of the PNAS structures to the magnetic beads, to facilitate single base extension in one of the four sample wells for each allele present:

Aliquot            1                2                3                4
Enzyme             DNA Polymerase   DNA Polymerase   DNA Polymerase   DNA Polymerase
Unlabeled dNTPs    dATP             dTTP             dGTP             dCTP

Other embodiments may use a single sample with sequential addition of each dNTP, each followed by a wash.

STEP XXII: Remove the magnetic beads and assay the supernate for ppi (inorganic pyrophosphate, a result of dNTP addition). The assay, called pyrosequencing, uses a sulfurylase that reacts with ppi to convert adenosine 5'-phosphosulfate (APS) to ATP, and a luciferase that reacts with ATP to produce light. The light generated can be sensitively determined optically by use of a CCD camera or other optical sensing device. Another embodiment will permit dNTP addition and ppi assay to occur in a single step.

STEP XXIII: Measure light production found in one well of the four for each allele scored.

Detection of the release of inorganic pyrophosphates in the pyrosequencing reaction can be automated. Automation steps comprise:

1, all enzymes and other reagents are introduced to the sample containing the ppi in a single step, initiating light production proportionate to ppi quantity.

2, light is detected by a sensitive CCD camera able to electronically amplify the signal.

3, the module interfaces with a computer with software, to analyze the magnitude of the signal peak to evaluate the sample for the following parameters in the SNP and STR reactions:

• SNP Analysis Considerations

- recognition of heterozygosity of alleles present

- presence of identical repeat base in the SNP

• STR Analysis Considerations

- recognition of heterozygosity of alleles present

- presence of identical repeat base in the STR

- presence of sample contamination (more than two alleles present)
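The considerations listed above amount to reading two things from the light signal: how many identical bases were incorporated (since light production is proportionate to ppi quantity) and how many distinct alleles are present. The sketch below illustrates one way such an evaluation might look. It is an assumption-laden example rather than the analyzer's actual software: the function names, the single-base reference peak, and the noise floor are all hypothetical.

```python
def bases_incorporated(peak: float, single_base_peak: float) -> int:
    """Estimate the run length of identical bases from peak height.

    Assumes light output scales linearly with ppi released, so a peak about
    twice the single-base reference indicates two identical bases in a row.
    """
    return round(peak / single_base_peak)


def evaluate_alleles(allele_peaks: dict, noise_floor: float) -> str:
    """Classify a sample by the number of alleles with signal above the noise floor."""
    present = [allele for allele, peak in allele_peaks.items() if peak > noise_floor]
    if len(present) > 2:
        return "possible contamination"  # more than two alleles present
    if len(present) == 2:
        return "heterozygous"
    if len(present) == 1:
        return "homozygous"
    return "no signal"


if __name__ == "__main__":
    print(bases_incorporated(peak=2.1, single_base_peak=1.0))
    print(evaluate_alleles({"A": 1.0, "C": 0.9, "G": 1.1, "T": 0.0}, noise_floor=0.2))
```
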

Use of Dye Conjugated Dideoxynucleotide Terminator Bases For SNP Scoring

Steps I to Step XX are as previously described. The next step calls for and allows single base addition and the following steps provide the dye conjugated dideoxynucleotide terminator method for SNP scoring.

If dye/terminator base mixes are not used, and the sample is instead divided into 4 aliquots, each challenged with a single dye conjugated base (carrying a different dye or the same dye for each), a normal base may be used instead of the terminator base because only one (dye labeled) base is introduced to each aliquot.

The next step calls for and allows single base addition and the following steps provide the dye conjugated nucleotide method for determining the particular base added, the complement of which is the SNP score in TPA and RFTA.

STEP XXI: Perform the single base extension to score the SNP by adding a high fidelity polymerase, dye labeled fluoro-ddNTPs*, and an appropriate buffer (dependent on the polymerase) at pH 7.0, in the following mixtures, to 4 aliquots of the sample processed through binding of the PNAS structures to the magnetic beads, to facilitate single base extension in one of the four sample wells for each allele present (or, in another embodiment, a single sample is used and each ddNTP has a conjugated dye emitting at a different wavelength).

Aliquot          1                2                3                4
Enzyme           DNA Polymerase   DNA Polymerase   DNA Polymerase   DNA Polymerase
fluoro-ddNTP*    ddATP*           ddTTP*           ddGTP*           ddCTP*

STEP XXII: Wash (for example, with modified 3X SSPE) the magnetic beads to remove unincorporated labeled ddNTP*s.

STEP XXIII: Measure fluorescence from bound labeled ddNTP appearing in one of the four wells for each allele scored.

The present invention includes methods and kits for detecting and analyzing short tandem repeat sequences using nucleic acid target protection strategies. The methods and kits of the invention may be used in forensic and medical sciences, as well as in genetic mapping, linkage analysis, and human identity testing, including parentage testing. The present invention is fast and accurate, without the problems plaguing current methods of PCR analysis and detection. These problems have been eliminated in the present invention because the methods disclosed herein are not based upon the amplification of the nucleic acid being studied. In the present invention, Short Tandem Repeats (STR) or microsatellites are analyzed by methods which do not amplify the nucleic acid, such as the Triplex Protection Assay (TPA) and the Restriction Fragment Target Assay (RFTA), disclosed in U.S. Patent Nos. 5,962,225 and 6,100,040, and U.S. Patent Application Nos. 09/633,848; 09/569,504; and 09/443,633.

Automatable TPA STR and RFTA STR Analysis: Using Primer Extension and Gel Electrophoresis

Similar to the SNP analysis, the RP-TFO TPA STR analysis and APSP RFTA STR analysis procedures possess nearly identical process steps.

The rules that apply to PNAS formation and the analysis steps and PDTP formation and the analysis steps presented in the SNP discussion are the same in STR analysis for both TPA and RFTA analysis processes. However, a slightly different rationale exists for selecting the site of the PNAS or PDTP structures, which can be observed in Figures 22 to 31B.

The PNAS generated by the various embodiments of RP-TFO TPA can also serve to support STR analysis, whether automated or not. One embodiment comprises isolating the STR and its flanking regions by methods previously discussed for SNP scoring site isolation and assuring the resistance of the STR region PNAS to Exo III exposure that would remove all double strand non-specific nucleic acid from the sample.

RP-TFO TPA STR and APSP RFTA STR Analysis

In these embodiments of STR analysis, one or two RP-TFO probes (in TPA) or APSP probes (in RFTA), an APSP Extension Primer (EP), and an APSP blocking probe (BP) (a blocking probe with both ends capped) may be necessary for hybridization to the target, depending on the proximity of the STR repeat sequence to the RE site. After hybridization, all must function to render the PNAS or PDTP resistant to Exo III. A choice of restriction endonuclease (RE) can also accomplish this. The probes must also displace the W and C strands of the STR region target and allow full STR sequencing by synthesis, preferably pyrosequencing. The entire STR repeat region will be characterized, using the 3' end of the APSP EP as the sole site for single base additions. All other 3' ends are capped or configured so as not to be a substrate for single base additions by the DNA polymerase.

If the RE functions to render the PNAS Exo III resistant, due to the generation of 3' free ends, another RE must also be used in addition to generate blunt ends or 5' overhangs in order to allow Exo III to degrade non-specific genomic DNA. Neither RE may have a restriction site within the PNAS region, to protect its integrity.

In the TPA and RFTA embodiments of STR analysis, all aminopurine substituted probes function to destabilize and displace the target W and C strands, and open the STR target region for repeat single base additions or synthesis. The use of aminopurine substituted bases during the synthesis of longer than 150 mer STR alleles will continue to unzip the W and C duplex so that the entire repeat sequence can be synthesized in the open region of the W and C STR target region.

It is known by those skilled in the art that all of the aforementioned aminopurine substituted structures separate the W and C strands by strand displacement and, when these substituted purine structures are 100-150 bases apart, generally span the STR alleles for all 13 CODIS loci between their flanking regions. Multiple, consecutive single base additions can occur in a stepwise fashion in the widened gap between the two aminopurine substituted triplexes or duplexes formed.

In embodiments of TPA and RFTA STR analysis, synthesis of the STR occurs in the open STR region and extends from the 3' end of the extension primer located in one flanking region up to the 5' end of the blocking probe located in the opposite flanking region. The blocking probe possesses a capped 3' end to prevent single base addition at this site. Alternatively, if gel analysis is desired, or if a RE site appears in the end flanking region, synthesis will continue from the 3' end of the extension primer to the RE site and the synthesis will be considered a runoff.
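The stepwise synthesis from the extension primer to the blocking probe, read out by cyclic addition of one dNTP at a time, can be sketched as a simple simulation. This is an idealized model under stated assumptions: every dispensed nucleotide is incorporated to completion, the signal of each dispensation is proportional to the number of bases added, and ordinary bases stand in for the aminopurine substituted nucleotides. The function names, the dispensation order "ACGT", and the example repeat unit TCTA are illustrative choices, not values taken from the disclosure.

```python
# Watson-Crick complements for the four bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def flowgram(template, order="ACGT", max_cycles=500):
    """Simulate cyclic single-dNTP addition along a template.

    template lists the template bases in the order the polymerase
    meets them, from the base after the primer 3' end to the base
    before the blocking probe. Returns a list of (dNTP, bases_added)
    pairs, one per dispensation.
    """
    flows = []
    pos = 0
    for _ in range(max_cycles):
        for nt in order:
            if pos == len(template):
                return flows  # gap filled; synthesis stops at the BP
            added = 0
            # A homopolymer stretch is filled within one dispensation.
            while pos < len(template) and COMPLEMENT[template[pos]] == nt:
                added += 1
                pos += 1
            flows.append((nt, added))
    if pos == len(template):
        return flows
    raise RuntimeError("template not completed within max_cycles")


def read_sequence(flows):
    """Rebuild the synthesized strand (5' to 3') from the flowgram."""
    return "".join(nt * added for nt, added in flows)


def longest_tandem_run(sequence, unit):
    """Return the largest number of consecutive copies of unit."""
    best = 0
    for start in range(len(sequence)):
        run, j = 0, start
        while sequence.startswith(unit, j):
            run += 1
            j += len(unit)
        best = max(best, run)
    return best


if __name__ == "__main__":
    synthesized = "TCTA" * 3
    template = "".join(COMPLEMENT[b] for b in synthesized)
    flows = flowgram(template)
    print(read_sequence(flows), longest_tandem_run(read_sequence(flows), "TCTA"))
```

Because only one 3' end is assumed to be extendable, each dispensation maps to a unique position in the growing strand; a second extendable 3' end would superimpose two flowgrams and make the read uninterpretable.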

Similarly, the high fidelity DNA polymerase (thermostable or normal) must have only 3' synthesis function and no 5' exonuclease activity or the process probes may be compromised. Also, the DNA exonuclease (Exo III) must be tested to preclude any endonuclease, single strand exonuclease activity, or gap widening activity that will compromise the PNAS or PDTP structure, leading to a false scoring result due to multiple sites capable of single base additions. The embodiments presented in Table III and Table IV for TPA and RFTA STR analyses require that either an exonuclease treatment (Exo III) is employed or a certain restriction endonuclease is selected for use. The goal of STR sequencing by synthesis is single base addition in the STR repeat region exclusively at the 3' end of the APSP (EP) extension probe. There must be no additional 3' end available for single base addition by the added DNA polymerase and the four dNTPs repeatedly added. The APSP BP has at least a capped 3' end.

Embodiments presented evidence a PNAS and PDTP with two 3' ends suitable for single base addition by the DNA polymerase. To prevent this and cap the 3' ends, Exo III treatment is performed which results in a free 3' end at each site, which is incapable of single base addition. The rationale, herein, lies in the fact that the 5' complementary strand exists far distant from the free end (precludes duplex formation) that results from Exo III degradation of both sides of the PNAS, so no base can be added to a 3' free end, and a far distant complementary 5' base. A 5' overhang is a requisite for 3' single base addition.

When Exo III treatment is omitted, an alternative embodiment is used to prevent single base addition at the 3' ends of the PNAS and PDTP. This tactic involves the limitation of the appropriate restriction endonuclease for use, to those that generate either 3' overhang cuts or blunt end cuts, both of which prevent or cap the 3' PNAS or PDTP ends from being a site for single base addition.

Furthermore, no cleavage site for the selected restriction endonucleases may exist between the two ends of the PNAS or PDTP (that is, between the two RE sites selected). Otherwise, additional sites for single base addition may be created, or the integrity of the capture and reporter elements of the PNAS or PDTP will be compromised, resulting in uninterpretable sequencing by synthesis of the STR sequence.
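The two selection rules for the case in which Exo III treatment is omitted (the cut must leave a 3' overhang or a blunt end, and the enzyme must have no site inside the structure) can be sketched as a simple filter. The enzyme table below lists a few common enzymes with their standard recognition sequences and end types purely for illustration; the function names and the example region are hypothetical. Because every listed site is palindromic, searching one strand is sufficient in this sketch.

```python
# name: (recognition sequence, type of end produced by the cut)
ENZYMES = {
    "AluI": ("AGCT", "blunt"),
    "XbaI": ("TCTAGA", "5' overhang"),
    "NdeI": ("CATATG", "5' overhang"),
    "PstI": ("CTGCAG", "3' overhang"),
    "KpnI": ("GGTACC", "3' overhang"),
}


def is_exo3_substrate(end_type):
    """Blunt and 5' overhang ends are degraded; 3' overhangs are not."""
    return end_type in {"blunt", "5' overhang"}


def usable_without_exo3(internal_region, enzymes=ENZYMES):
    """List enzymes usable when Exo III treatment is omitted.

    internal_region is the sequence lying between the two chosen RE
    sites. An enzyme qualifies if its cut leaves a blunt end or a
    3' overhang and its recognition sequence is absent from the region.
    """
    region = internal_region.upper()
    return sorted(
        name
        for name, (site, end_type) in enzymes.items()
        if end_type in {"blunt", "3' overhang"} and site not in region
    )


if __name__ == "__main__":
    region = "GGATCCTCTAGAGGTACCAA"  # made-up example sequence
    print(usable_without_exo3(region))
    print(is_exo3_substrate("blunt"), is_exo3_substrate("3' overhang"))
```

In the example region, KpnI is rejected because its site occurs inside the region, and XbaI and NdeI are rejected because they leave 5' overhangs.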

The last consideration is the sensitive analysis of the STR product. If pyrosequencing or sequencing by synthesis is used, then the analysis is complete. However, if a fluorescent labeled extension primer is used, then the synthesized STR can be analyzed in a gel format. In that case, the PNAS and PDTP structures are dissociated and run on a denaturing polyacrylamide gel, the Rf or migration distance is calculated and compared to allelic ladders and size standard markers, and the repeat is designated for each allele analyzed.

RP-TFO TPA Embodiments:

Figures 22 to 27B in Table III represent preferred embodiments of RP-TFO TPA STR sequencing. The following numerical designations describe the PNAS structures generated in these embodiments that function as a substrate for the STR single base addition and subsequent additions to ultimately sequence the entire STR region to assess the allelic determination.

10, the magnetic bead, or solid support, coated with a polymer containing -COOH groups, which in the presence of EDC activator, allows covalent attachment, via a condensation reaction, of the primary amine on the capture probe to the bead to allow capture of the PNAS

15, the substrate for Exo III activity (a blunt end or a 5' overhang end); conversely, a 3' overhang or a 3' free end is not a substrate for Exo III

17, a polypyrimidine rich region of the W or C strand existing in the STR flanking region or upstream or downstream from it

18, the 8-aminopurine substituted region of the RP-TFO

19, the polypyrimidine region of reverse polarity that acts as a TFO

20, the spacer or linker molecule connecting the two reverse polarity regions of the RP-TFO

21, the lack of additional Restriction Endonuclease sites similar to those located on the ends of the PNAS in the region between them

23, the 3' free end that results post Exo III treatment

26, the RP-TFO capture probe

28, the STR repeat sequence with its flanking regions (both sides)

29, an APSP EP, an aminopurine substituted extension primer that hybridizes to one of the flanking regions of the STR and allows bases to be added to its 3' end sequentially to sequence the entire adjoining STR region through to the opposite flanking region (site of RE site or blocking probe).

30, the extension primer driven synthesis representing the 3' to 5' directed synthesis of the STR sequence using aminopurine and regular pyrimidine nucleotides.

31, an APSP BP probe, an aminopurine substituted blocking probe which hybridizes to the opposite flanking region of the STR region and stops the single base addition or extension primer driven synthesis by DNA polymerase after the STR has been fully synthesized.

32, the first base added on the extension primer. This may be part of the flanking region or may represent the first base of the STR sequence. This 3' end is the only available site for single base addition.

33, the restriction endonuclease site, one at each end of the PNAS, found in the genomic DNA upstream from the STR region and flanking regions on both sides.

34, the second restriction endonuclease site, found in the genomic DNA downstream from the STR region with its flanking regions on both sides.

Similar to the situation in SNP analysis previously discussed, STR analysis by TPA and RFTA are identically structured.

Numerical Designations for APSP RFTA STR Figures 28 to 31B

Figures 28 to 31B in Table IV represent preferred embodiments of APSP RFTA STR analysis. The following numerical designations describe the PDTP structures generated in these embodiments that function as a substrate for the STR single base addition and subsequent repeat additions to ultimately sequence the entire STR region to determine the alleles present.

10, the magnetic bead, or solid support, coated with COOH groups to form a covalent bond, upon activation by EDAC [1-Ethyl-3-(3-dimethylaminopropyl) carbodiimide], with the terminal primary amine of the capture probe, which will eventually be used to capture the PDTP.

15, the substrate for Exo III activity (a blunt end or a 5' overhang end).

21, the lack of additional Restriction Endonuclease sites similar to those located on the ends of the PNAS in the region between them.

23, the 3' free end that results post Exo III treatment.

27, the APSP capture probe with a capped 3' end.

28, the STR sequence with its flanking regions (both sides).

29, an APSP EP, an aminopurine substituted extension primer that hybridizes to one of the flanking regions of the STR and allows bases to be added to its 3' end sequentially to sequence the entire adjoining STR region through to the opposite flanking region.

30, the primer extension synthesis representing the 3' to 5' directed synthesis of the STR sequence using aminopurine and regular pyrimidine nucleotides.

31, an APSP BP probe, an aminopurine substituted blocking probe which hybridizes to the opposite flanking region of the STR region and stops the single base addition or primer extension synthesis by DNA polymerase after the STR has been fully synthesized (the gap filled).

32, the first base added on the primer extension probe. This may be part of the flanking region or may represent the first base of the STR sequence. This 3' end is the only available site for single base addition.

33, the restriction endonuclease site, one at each end of the PNAS, found in the genomic DNA upstream from the STR region and flanking regions on both sides.

34, the second restriction endonuclease site, found in the genomic DNA downstream from the STR region with its flanking regions on both sides.

Automation of RP-TFO TPA and APSP RFTA STR Analyses: The Process Steps

The following steps are performed on the AGENDA robotic device to sequence the STR sequence of microsatellites in genomic DNA. The following steps pertain to all the embodiments depicted in Figures 22 to 31B and found in Table III and Table IV, representing the various embodiments of RP-TFO TPA and APSP RFTA STR analyses. Steps I to VI are identical to those previously described for isolation of milligram quantities of DNA for analysis. Steps VII to XIX are identical to those described for PNAS or PDTP generation and isolation, and the resulting STR target structures must conform to the embodiments presented in Figures 22 to 31B.

Use of Pyrosequencing For Single Base Addition and For Sequencing

Again, after execution of process Steps I to XX, the PNAS and PDTP structures have been constructed and bound to a fixed magnetic bead for repeated single base addition by either of two methods:

1. Addition of a cocktail of each (4) dNTP and a high fidelity DNA polymerase to fill in the gap between the EP and the BP, to be followed by denaturing polyacrylamide gel electrophoresis for determination of the repeat length.

2. Cyclic and repeated addition of one dNTP at a time in a pyrosequencing reaction as previously discussed, which provides the sequence of the STR region as a result of synthesis. In such a situation, a denaturing polyacrylamide gel could be a confirmatory test.

Use of pyrosequencing to achieve single base addition identification, as well as additions of 100-150 bases and STR sequencing, is also contemplated as part of the present invention. The analysis of the STR product is confirmed by gel electrophoresis of the STR synthesis product, in which case the extension primer must be labeled, in any manner known by those skilled in the art, to identify the band on the gel. Any other method known to those skilled in the art that allows repeat single base addition and determination of its sequence and of the STR repeat size is also contemplated.

The present invention also comprises methods and compositions for restricting nucleic acids without the use of enzymes. Most nucleic acid manipulations involve the use of enzymes for nucleic acid analyses. These enzymes are limited in their applicability. Thus, the present invention is directed to novel embodiments of cutter probes and the Cutter Probe Assay (CPA). Embodiments of CPA were disclosed in U.S. Patent Application Nos. 09/569,504; 09/443,633 and U.S. Provisional Patent Application No. 60/171,348, each of which is herein incorporated in its entirety. The embodiments presented therein involved analysis of single-stranded (ss) nucleic acid targets (DNA, mRNA, other RNA, or other molecules).

Denaturation of the naturally double-stranded nucleic acids is one step in analyzing many nucleic acid polymers. This denaturation step often presents a problem in automating the assays.

The present invention can be used for the non-enzymatic restriction of any nucleic acid target. The present invention can be used in the detection of very low copy numbers of nucleic acid targets, either in a vast excess of non-specific nucleic acids or in femtomoles of target alone. The present invention comprises methods and compositions for direct DNA and RNA analysis in the absence of triplex formation for target protection, such as those used in the Triplex Protection Assay. The present invention comprises assay formats such as test tube, analytical gel, dot blot, and reverse dot blot formats, and can also be automated.

The present invention can be used to remove the requirement for the target denaturation step. In general, preferred embodiments comprise three probes to distinguish the presence of the unique double stranded DNA or RNA target. These probes are characterized as having restriction, capture and reporter activities. The length of these probes can be variable and they can be composed of DNA or RNA or any nucleic acid binding ligand, all of which will specifically bind to the desired sequences. These assays have multiple levels of specificity built in and are a part of the overall technology known as Haystack Processing. The assays are designed with multiple specificity steps to support highly specific and sensitive assays.

A preferred embodiment of the invention comprises methods and compositions for the hybridization of multiple probes, preferably one capture and two reporter/restriction DNA, RNA, or other probes in the form of triplex forming oligonucleotides. These probes are then used to restrict, capture and identify the presence of the ds DNA or RNA region. A ds exoribonuclease or ds exodeoxyribonuclease may optionally be used to eliminate non-specific nucleic acid to minimize nonspecific background signal.

The present invention contemplates the use of any label molecule conjugated to the conserved portion of the reporter/restriction probe to generate the detection signal. This label can be any known to those skilled in the art and may involve any signal detection technology desired. The present invention comprises detection of specific target nucleic acids. These targets may be detected using any labels and detection systems used for molecular biological techniques. Though specific labeling and detection systems are presented in the examples given, the present invention contemplates the use of any labels or detection systems. Such detection systems include, but are not limited to, enzyme-chromogen reactions, bioluminescence, such as aequorin, renilla and luciferase, chemiluminescence, chemifluorescence, and cold laser based direct dye fluorescence detection.

One embodiment of the present invention comprises use of quencher molecules and fluorescent molecules attached to probes. The probes may comprise two or more individual probes comprising reporter and restriction regions. The probe may also comprise one probe with reporter and restriction regions. For example, the quencher molecule may be attached to the end of the probe that is destroyed in the assay. This is the non-conserved end of the probe. A fluorescent molecule is attached to the conserved end of the probe, the end that is maintained throughout the assay. When the Quencher (Q) and fluorescent (F) molecules are in proximity, there is no fluorescence at a given wavelength due to the activity of the Q molecule in quenching the fluorescence of the F molecule. Presence of the target and the two Q and F molecule-containing reporter/restriction probes mediate signal production as a result of separation and destruction of the quencher molecule by the BU substituted, or BU linked bases that comprise the restriction region of the probe that is destroyed by the activation of the BU.

A preferred embodiment comprises the following elements. The quencher (Q) molecule is on the restriction end of the reporter/restriction triplex-forming oligonucleotide (TFO). Q is attached near or onto a Bromouracil base (BU) on the end of the restriction region. Preferably, the Q and BU are not on the side in common with the reporter region. The F molecule is in the reporter region of the probe. Q is destroyed or liberated from close proximity to the fluorescent molecule to initiate the production of its unique emission fluorescence release at a specific frequency after excitation at a different frequency. Other embodiments of the present invention may use any signal and signal detection strategy known to those skilled in the art, for example, reporter and signal amplification compositions and methods such as the reporter/restriction probes with a poly dA or poly A tail, duplex and triplex reporter strategies, and the MTRF advanced signal amplification techniques disclosed in PCT/US98/24226 and PCT/US99/27525 and U.S. Patent Application 09/443,633.

Embodiments of the present invention can be used in assays such as those disclosed in U.S. Patent No. 5,962,225, and in U.S. Provisional Patent Applications, including 60/065,378, 60/075,812, 60/076,872 and PCT Application No. PCT/US98/24226 and PCT/US99/27525 and U.S. Patent Application 09/443,633, all of which are herein incorporated in their entirety.

A preferred embodiment of the present invention, directed to a ds DNA Cutter Probe Assay, is shown in Fig. 39A and 39B.

STEP I: The first step is the isolation of genomic nucleic acid by any method known to those skilled in the art.

STEP II: The genomic DNA is hybridized with a triplex forming oligonucleotide (TFO) specific for the target with an attached affinity molecule that functions as a capture molecule. For example, the capture molecule can be a primary amine that binds to an n-oxysuccinimide (NOS) coated solid support. This is the first level of specificity of the assay. The TFO in this embodiment binds to the ds DNA target region and is preferably at least 18 bases (mer) in length to ensure sequence specific binding of the TFO/capture probe; the affinity molecule can be placed anywhere on the TFO. In general, this probe is the TFO/capture probe.

STEP III: Hybridize the genomic DNA with a second probe, a TFO/reporter probe, which is a TFO specific for a region at any distance from the target sequence. The TFO/reporter probe is comprised of at least two regions, the conserved and the nonconserved regions. The conserved region does not have a nucleic acid restriction function. The conserved region possesses a label at any location within the conserved region. For example, a fluorescent dye molecule may be placed at the probe's end or at any internal position.

The other region, the nonconserved region, is a region that is capable of cleaving nucleic acids and is itself destroyed upon activation. For example, the second region, the nonconserved region, comprises BU (bromouracil) or halogen-linked bases that will self-destruct upon activation with UV (short or long wavelength), x-ray, or gamma radiation. In a most preferred embodiment, the non-conserved region also possesses a fluorescence quencher molecule at its end or within the restriction region of the probe. As shown in Figure 39A, the fluorescence molecule is on the end of the upstream TFO that is closest to the capture TFO and target and the quencher molecule is on the end of the upstream TFO that is furthest distant from the capture TFO and target. This is the second level of specificity of the assay.

STEP IV: Hybridize the genomic DNA with a second TFO/reporter probe that is specific for a region away from the target at any distance from the target sequence. This probe (TFO/reporter) is comprised of two sections, like the TFO/reporter probe described in Step III. This is the third level of specificity of the assay.

Alternatively, all three TFOs may be simultaneously hybridized to the genomic DNA, greatly facilitating the automation and speed of the assay.

STEP V: Activation of the destructive molecule in the nonconserved region of the probes. For example, UV, X-ray, or gamma irradiation of probes having BU bases that are hybridized to the genomic nucleic acid will provide restriction of the region of ds genomic DNA that contains the ds target DNA sequence. The restriction is mediated by the mechanism of free radical formation and occurs only in the non-conserved regions of the TFO reporter/restriction probes.

STEP VI: Capture the target/probe complex, for example, onto a solid support. In one embodiment, a microtiter plate with walls coated with n-oxysuccinimide (NOS) serves as the binding partner for the amine linked to the target/probe complex.

STEP VII: Wash the captured genomic DNA to remove non-hybridized probes and other contaminants such as non-specific nucleic acid.

Optionally, the nonspecific nucleic acids can be removed by other methods known to those skilled in the art. For example, treatment with a double stranded deoxyribonuclease or double stranded exoribonuclease may be used to destroy non-specific genomic nucleic acid that remains. This prevents the production of a high background signal.

STEP VIII: Detection of the bound target. In the embodiment comprising a fluorescence quencher molecule, the quencher molecule has been removed from the probes by the destruction of the nonconserved region of the probe and detection of the bound probe is due to the excitation of the fluorescent molecule of the reporter probe. This fluorescence molecule will provide a specific emission at a different and unique fluorescent wavelength due to the loss upon cleavage of the quencher molecule.

The numerical designations for ds DNA CPA Embodiment in Figure 39A and 39B are:

1, Double stranded genomic DNA

2, The ds DNA poly rich region

3, The anchor TFO with the primary amine affinity molecule

4, The (NH₂-) primary amine affinity molecule

5, The reporter/restriction probe, with a conserved region and a non-conserved restriction region.

6, The non-conserved restriction region of the reporter/restriction TFO fluorescence molecule, downstream from the target

7, A fluorescence quencher molecule on the end of the reporter/restriction TFO furthest distance downstream from the target and capture TFO

8, The fluorescent molecule at the end of the reporter/restriction probe nearest to the target and the capture TFO

9, The reporter/restriction probe, with a conserved region and a non-conserved restriction region

10, The non-conserved restriction region of the reporter/restriction probe TFO fluorescence molecule, upstream from the target

11, A fluorescence quencher molecule on the end of the reporter/restriction TFO furthest distance upstream from the target and capture TFO
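The way the three levels of specificity in Steps II to VIII combine into a signal can be illustrated with a deliberately simplified model. It assumes that activation releases the quencher of every reporter/restriction probe, that unbound probes are completely removed by the wash, and that a fluorophore contributes only when its probe is bound to the fragment held on the support by the capture TFO. The function name `retained_signal` and its arguments are hypothetical.

```python
def retained_signal(capture_bound, reporters_bound, activated):
    """Count unquenched fluorophores left on the support after the wash.

    capture_bound:   True if the TFO/capture probe bound the target.
    reporters_bound: iterable of booleans, one per reporter/restriction
                     probe (upstream and downstream of the target).
    activated:       True if the BU regions were activated (Step V).
    """
    if not capture_bound:
        return 0  # nothing is retained on the solid support
    if not activated:
        return 0  # quenchers remain attached, so no emission
    # Each bound reporter keeps its conserved, labeled region on the
    # captured fragment; unbound reporters are washed away.
    return sum(1 for bound in reporters_bound if bound)


if __name__ == "__main__":
    print(retained_signal(True, (True, True), True))    # target present
    print(retained_signal(False, (True, True), True))   # no capture
    print(retained_signal(True, (True, True), False))   # not activated
```

The model ignores partial hybridization, incomplete washing, and residual background, all of which would need to be characterized experimentally.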

Two types of cutter probes are referred to as the BUCP (bromouracil cutter probe, BUCP1 and BUCP2) and the BU-TFO (bromouracil triplex forming oligonucleotide). The preferred embodiment of the former binds to a single strand nucleic acid, forming a duplex, and upon activation at high frequency, cleaves the opposite sugar-phosphate backbone of the single stranded DNA (or RNA). The preferred embodiment of the latter lies in the major groove and binds to the W and C duplex and, upon activation at high frequency, cleaves both sugar-phosphate backbones of the duplex. Another method of the present invention can be used with genomic DNA duplex strand cleavage, wherein the bromouracil substituted or bromouracil linked bases are located in a region of a probe, such as an 8-aminopurine substituted oligo or probe (BU-APSP).

Relaxation of genomic DNA by heating to 50°C to 60°C and hybridization of the duplex genomic DNA with the BU-APSP oligos will cause either W or C strand displacement, due to formation of a more stable duplex by the BU-APSP oligo.

In TPA and RFTA, use of the APSP probes requires an accompanying use of restriction endonucleases to isolate the target region for analysis. An alternative embodiment uses hybridization of each region or regions of genomic DNA to be restricted by one or more BU-APSP probe pairs, as depicted in Figure 39. In this embodiment, two aminopurine substituted probes (a pair) are used to restrict the W and C strand duplex at each side of the target region (upstream and downstream). This is favored over the use of a BU-TFO due to the fact that a 3' free end is routinely and more easily (no RE enzymes to deal with) generated that will render the PNAS or PDTP structure Exo III resistant and not a site for single base addition in SNP analyses, or primer extension in STR analyses. Use of the cutter probes in such a structure is shown in Figure 40.

Numerical Designations: Figure 40

146, the target region to be analyzed that will ultimately become the PNAS or PDTP structure

147, the restriction region upstream from the target region

148, the restriction region downstream from the target region

149, the BU-APSP 1 restriction probe that is at least 11 mer (to confer sequence specificity) with BU substituted or BU-linked bases on the 3' end with specificity to the upstream restriction region, C strand

150, the BU-APSP 2 restriction probe that is at least 11 mer (to confer sequence specificity) with BU substituted or BU-linked bases on the 5' end with specificity to the upstream restriction region, W strand

151, the BU-APSP 1 restriction probe that is at least 11 mer (to confer sequence specificity) with BU substituted or BU-linked bases on the 3' end with specificity to the downstream restriction region, W strand

152, the BU-APSP 2 restriction probe that is at least 11 mer (to confer sequence specificity) with BU substituted or BU-linked bases on the 5' end with specificity to the downstream restriction region, C strand

153, the site of restriction, namely the sugar-phosphate backbone of the strand complementary to the cutter probe at a position opposite the BU substituted or linked base

154, the 3' free end formed by action of the BU cutter probe pairs, upon activation in the upstream restriction region from the target

155, the 3' free end formed by action of the BU cutter probe pairs, upon activation in the downstream restriction region from the target

EXAMPLES

EXAMPLE I

RP-TFO TPA SNP DYS271 (non-coding region analysis)

Figure 32 represents the preferred embodiment for this analysis. Numerical designations follow, and reference must be made to the methods section detailing the RP-TFO TPA SNP procedure.

The genomic sequence for the DYS271 SNP site and its upstream and downstream regions has been included to assist in understanding the design of the experimental process components, and is found in Figure 37.

Numerical Designations: Figure 32

35, RE Site, 3' side of SNP, Alu-I, blunt end cut

36, RE Site, 5' side of SNP, Alu-I, blunt end cut

37, 2 mer base region between capture RP-TFO and the Alu-I RE site (5' side of SNP)

38, 15 mer base, sequence of the capture RP-TFO

39, 10 mer base region between the capture RP-TFO and the site of single base addition and the scoring site (5' side of SNP)

40, the scoring site and site of single base addition (1 mer)

41, the 17 mer base sequence of the APSP PIP probe with the single base addition site at its 3' end, which is complementary to SNP scoring site 5' aATT aGTT aAaAC aAAaA aAaGT CC 3'

42, the 63 mer base region between the APSP PIP probe and RP-TFO (3' side of SNP)

43, the 19 mer base sequence of the RP-TFO (3' side of SNP)

44, the 121 mer base region between the RP-TFO (3' side of SNP) and the Alu-I restriction site (3' side of SNP)

45, the Exo III substrate

46, the site of single base addition

47, the SNP scoring site

48, the polypyrimidine region on the Crick strand (3' side of the SNP score site)

3'CCC TCT TCT TGC CTT CCT 5'

49, the poly aminopurine substituted region of the RP-TFO (3' side of SNP)

5' AaGA aAGaA aADaG GaAA aGGaA G-linker

D = dummy base (abasic site), either Inosine or Hypoxanthine

50, the polypyrimidine region of the RP-TFO (3' side of SNP)

5' CCC TCT TCT TDC CTT CTT C linker

51, the linker molecule attaching the 3' ends of the RP-TFO regions

52, No RE sites within the PNAS

53, a 3' free end that results from Exo III treatment rendering the PNAS Exo III resistant

54, the polypyrimidine region on the Crick strand (5' side of the SNP score site)

3' CCT CCT ATT TCC CCC 5'

55, the poly aminopurine substituted region of the RP-TFO (5' side of SNP)

NH₂ aGGaA GGaA DaAA aAGG aGGaG CCC linker

NH₂ = affinity molecule, D = dummy base (abasic site), either Inosine or Hypoxanthine

56, the polypyrimidine region of the capture RP-TFO (5' side of SNP)

5'CCT CCT ATT TCC CCC -linker

57, the derivatized magnetic bead solid support

EXAMPLE II

Factor V Leiden RP-TFO TPA SNP (intron-exon-intron analysis)

Figure 33 represents the preferred embodiment for this analysis.

Numerical designations follow, and reference must be made to the methods section detailing the RP-TFO TPA SNP procedure, previously presented.

The genomic sequence for the Factor V Leiden (wild type allele) SNP exon site and its upstream and downstream adjacent intron regions has been included to assist in understanding the design of the experimental process components, and is found in Figures 38A and 38B.

Factor V Leiden RP-TFO TPA SNP

Numerical Designations: Figure 33

58, the RE site (Xba I enzyme), 3' to the SNP site, cleaves with 5' overhang

59, the RE site (Nde I enzyme), 5' to the SNP site, cleaves with 5' overhang

60, the adjacent intron, 3' to the SNP site

61, the exon containing the Factor V Leiden SNP site (Crick strand SNP location)

62, the adjacent intron, 5' to the SNP site

63, the 163 mer region between the Nde I RE site and the capture RP-TFO, 5' to the SNP site

64, the 13 mer region that is the length of the capture RP-TFO, 5' to the SNP site

65, the 51 mer region between the capture RP-TFO and the APSP assisting probe, 5' to the SNP site

66, the 21 mer length of the APSP assisting probe, 5' to the SNP site

67, the 132 mer region between the SNP site of single base addition (also SNP score site) and the APSP assisting probe, 5' to the SNP site

68, the 1 mer region of the site of single base addition and the SNP site

69, the 19 mer length of the APSP PIP probe, 3' to the SNP site

70, the 136 mer region between the APSP PIP probe and the other RP-TFO, 3' to the SNP site

71, the 20 mer length of the RP-TFO, 3' to the SNP site

72, the 46 mer region between the RP-TFO and the Xba I RE site, 3' to the SNP site

73, the 3' free end rendering the C strand of the SNP score region resistant to Exo III

74, the derivatized magnetic bead attached to the PNAS, 5' to the SNP site

75, the 13 mer Crick strand polypyrimidine sequence for capture RP-TFO binding, 5' to the SNP site

3' CTT TCT TTT ATA T 5'

76, the 13 mer 8-aminopurine substituted strand of the capture RP-TFO with attached affinity capture molecule, 5' to the SNP site

NH₂ 5' aGAaA AaGA aAAaA DaAD aA linker

NH₂ = affinity molecule, D = dummy base (abasic site), either Inosine or Hypoxanthine

77, the 13 mer polypyrimidine substituted strand of the capture RP-TFO, 5' to the SNP site

5' CTT TCT TTT DTD T linker

D = dummy base (abasic site), either Inosine or Hypoxanthine

78, the 20 mer Crick strand polypyrimidine sequence for binding the second RP-TFO, 3' to the SNP site

5' TTT TTT TTT TTT TCT TAT TT

79, the 20 mer 8-aminopurine substituted strand of the second RP-TFO binding to the polypyrimidine site, 3' to the SNP site

5' aAAaA AaAA aAAaA AaAA aAAaA aADaA AaA linker

D = dummy base (abasic site), either Inosine or Hypoxanthine

80, the 20 mer polypyrimidine strand of the second RP-TFO, 3' to the SNP site

5' TTT TTT TTT TTT TCT TDT TT linker

D = dummy base (abasic site), either Inosine or Hypoxanthine

81, the 19 mer sequence of the APSP PIP complementary to the Crick strand, 3' to the SNP site

5' TaGT TTT aATaG GaAC aATaA AaGG aA 3'

82, the SNP scoring site

83, the site of single base addition complementary to the SNP score

84, the 21 mer APSP assisting probe, 5' to the SNP site

5'aGAaG TTC TaAC aAAaG GTaG AaAT a ATT 3'

85, the 5' overhangs generated by the Nde I and Xba I cuts; both are substrates for Exo III and both are sites for single base addition. This embodiment therefore requires Exo III treatment to prevent single base addition at these two 3' sites.
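As a rough check of the Figure 33 layout, the segment lengths given in designations 63 through 72 can be added to estimate the size of the Nde I to Xba I fragment and the position of the SNP site within it. The sketch below assumes that the listed segments are contiguous and non-overlapping, which the text does not state explicitly; the function and variable names are illustrative only.

```python
# Segment lengths in bases, copied from designations 63-72 of Figure 33 and
# ordered from the Nde I site (5' of the SNP) to the Xba I site (3' of the SNP).
# Assumption (not stated in the text): the segments are contiguous and
# non-overlapping, so their lengths can simply be added.
FIGURE_33_SEGMENTS = [
    ("Nde I site to capture RP-TFO", 163),
    ("capture RP-TFO", 13),
    ("capture RP-TFO to APSP assisting probe", 51),
    ("APSP assisting probe", 21),
    ("APSP assisting probe to SNP site", 132),
    ("SNP site", 1),
    ("APSP PIP probe", 19),
    ("APSP PIP probe to second RP-TFO", 136),
    ("second RP-TFO", 20),
    ("second RP-TFO to Xba I site", 46),
]


def fragment_length(segments):
    """Total length of the restriction fragment implied by the segments."""
    return sum(length for _, length in segments)


def start_offset(segments, name):
    """Number of bases preceding the named segment, counted from the 5' end."""
    offset = 0
    for segment_name, length in segments:
        if segment_name == name:
            return offset
        offset += length
    raise KeyError(name)


if __name__ == "__main__":
    print("fragment length:", fragment_length(FIGURE_33_SEGMENTS))
    print("bases 5' of the SNP site:", start_offset(FIGURE_33_SEGMENTS, "SNP site"))
```

Under these assumptions the fragment is 602 bases long, with 380 bases lying 5' of the SNP site.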

EXAMPLE III

Factor V Leiden APSP RFTA SNP (intron-exon analysis)

Figure 34 represents the preferred embodiment for this analysis. Numerical designations follow, and reference must be made to the methods section detailing the APSP RFTA procedure, previously presented.

Numerical Designations: Figure 34

86, the RE site for Xba I, cutting with a resulting 5' overhang, a substrate for Exo III, 5' to the SNP score site

87, the RE site for Hind III, cutting with a resulting 5' overhang, a substrate for Exo III, 3' to the SNP score site

88, the Exo III restriction sites of the PNAS (both ends)

89, the region between the APSP capture probe and the restriction site, 117 mer, 3' to the SNP site

90, the region of the APSP capture probe, 20 mer

91, the region between the APSP capture probe and the APSP PIP probe, 115 mer

92, the region of the APSP PIP probe, 19 mer

93, the site of single base addition and SNP score site, 1 mer

94, the distance between the site of single base addition and the APSP assisting probe with a capped 3' end, 10 mer

95, the region of the APSP assisting probe with a capped 3' end, 17 mer

96, the distance between the APSP assisting probe and the RE site, 5' to the SNP score site, 131 mer

97, the adjacent intron region, 5' to the SNP score site

98, the exon region containing the SNP score site

99, the magnetic bead with the APSP capture probe covalently bound via the primary amine capture molecule

100, the APSP capture probe, 20 mer, with the sequence

5' TCT aAAaG aATaG TTC CaAC TTaA TaA 3'

NH₂

NH₂ = primary amine capture molecule; aA or aG = 8-aminopurine substituted bases

101, the Crick strand target region for the APSP capture probe, 20 mer, represented by

3'AGA TTC TAC AAG GTG AAT AT 5'

102, the APSP PIP probe, 19 mer, with a 3' end suitable for single base addition and the sequence

3' CaGaG aACaA aGaGT CCC TaAaG aACaG aA 5'

aA or aG = 8-aminopurine substituted bases

103, the SNP score site

104, the site of single base addition on the W strand of the target

105, the APSP assisting probe with a capped 3' end, 17 mer, with a sequence

CAP 3' TaAaA TaAaA aATC aGaGT CCT CT 5'

aA or aG = 8-aminopurine substituted bases

106, the 3' free end generated by Exo III treatment of the PDTP structure

107, the target sequence complementary to the APSP assisting probe

5' ATT ATT TAG CCA GGA GA 3'

108, the PDTP structure has no RE sites within or the structure would be compromised

EXAMPLE IV

RP-TFO STR (CSF1PO Locus analysis) Chromosome 5

Figure 35 represents the preferred embodiment for this analysis. Numerical designations follow, and reference must be made to the methods section detailing the RP-TFO TPA SNP procedure, previously presented.

Numerical Designations: Figure 35

109, the RE site Pvu II cutting with a 5' overhang, 3' to the RP-TFO capture probe

110, the RE site Sca I cutting with a 5' overhang, 5' to the RP-TFO capture probe

111, the distance between the RE site and the end of the STR region synthesis, 7 mer

112, the distance between the 2nd base of the extension primer synthesis and the RE site just inside the opposite flanking region

113, the first base attached to the APSP EP (only site where 3' addition is allowed)

114, the region of the 18 mer APSP EP, complementary to the STR flanking region

115, the distance between the RP-TFO capture probe and the APSP EP, 166 mer

116, the 11 mer region of the RP-TFO capture probe

117, the distance between the RP-TFO capture probe and the Pvu II RE site 3' to the RP-TFO capture probe, 123 mer

118, the entire PNAS structure must be devoid of internal Pvu II and Sca I restriction sites

119, the entire STR target region including the two flanking regions or in this embodiment one flanking region and one RE site in close proximity to the opposite end of the STR sequence region

120, the entire STR sequence synthesis

121, the 5' overhang ends of the PNAS that is the substrate for Exo III

122, the APSP EP, 18 mer with the following sequence:

5' aATaA aGAaA aGaAT aAGaA TaAaG a ATT 3'

aA or aG = 8-aminopurine substituted bases

123, the target Crick strand polypyrimidine rich sequence for capture RP-TFO attachment, which is 11 mer and possesses the sequence:

3' TTT CTC CTC TC 5'

124, the 8-aminopurine substituted region of the RP-TFO; its 11 mer sequence is:

NH₂

5' aAAaA aGAaG aGAaG aAG linker

NH₂ = primary amine capture molecule; aA or aG = 8-aminopurine substituted bases

125, the polypyrimidine region of the RP-TFO; its 11 mer sequence is:

5' TTT CTC CTC TC linker

126, the 3' free end generated by Exo III treatment of the PNAS rendering the Crick strand Exo III resistant

127, the magnetic bead or solid support with the covalently linked RP-TFO capture probes

EXAMPLE V

APSP RFTA STR (CSF1PO Locus analysis) Chromosome 5

Figure 36 represents the preferred embodiment for this analysis. Numerical designations follow, and reference must be made to the methods section detailing the RP-TFO TPA SNP procedure, previously presented.

APSP RFTA TPA STR (CSF1PO Locus)

Numerical Designations: Figure 36

128, the RE site Mbo II cutting with a 3' overhang, 5' to the extension primer (APSP EP) complementary region

129, the RE site Sca I cutting with a 3' overhang, 3' to the EP complementary region

130, the sites for Exo III degradation (Exo III substrate)

131, the 3' free end generated by Exo III treatment of the PDTP protected target

132, the distance between the RE site and the APSP capture probe, 3' to the extension primer complementary region, 40 mer

133, the length of the APSP assisting probe, 17 mer

134, the distance between the capped 3' end of the APSP capture probe and the APSP EP probe, 71 mer

135, the length of the extension primer, 22 mer

136, the first base synthesized of the STR repeat sequence or adjacent flanking region, 1 mer

137, the remaining STR repeat sequence being synthesized, resulting in a synthesis run-off

138, distance between the end of the STR region and the adjacent RE site, 4 mer

139, the STR target region that usually includes the repeat region with two flanking regions, or one region may have a restriction site at its initiating point (for synthesis run-off generation)

140, the STR repeat region (complete) with possibly some flanking region synthesis

141, the first base synthesized on the 3' end of the APSP EP probe

142, the APSP EP probe whose sequence is:

3' TCC TTC aATaG aAaAT CTT aGTC CaAaG aA 5'

aA or aG = 8-aminopurine substituted bases

143, the magnetic bead solid support to which the APSP capture probe is attached

144, the 3' end capped APSP capture probe (17 mer) possessing a primary amine to allow covalent attachment to magnetic beads, whose sequence is:

CAP 3' aGAaG TCaA aGaAC CaGT aGaGT a AC 5'

NH₂

NH₂ = primary amine; aA or aG = 8-aminopurine substituted bases

145, the PNAS structure must have no RE sites within or the structure will be compromised by endonuclease action.

Terms

As used herein, the following terms have the following meanings.

Allele/A form of a gene used to differentiate a wild type or "normal" sequence from a mutant sequence. In SNP testing one or two alleles are analyzed for the SNP site mutation. A homozygous normal individual has only the wild type SNP score. A homozygous mutated individual has only the mutated SNP score. A heterozygous individual carries both the wild type and the mutant SNP scoring alleles, resulting in a double SNP scoring reaction. In the heterozygous individual, the SNP score of the mutant allele is obtained by subtracting the wild type result.
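The three outcomes described in this definition can be illustrated with a short sketch. It assumes that the scoring reaction has already been reduced to the set of bases detected at the SNP site; the function name, its arguments, and the example bases G and A are placeholders chosen for illustration and are not taken from the procedure itself.

```python
def call_genotype(detected_bases, wild_type_base, mutant_base):
    """Classify an individual from the bases detected at the SNP scoring site.

    detected_bases: iterable of single-letter bases observed in the scoring
    reaction (hypothetical input format, assumed for this sketch).
    """
    wild = wild_type_base.upper()
    mutant = mutant_base.upper()
    detected = {base.upper() for base in detected_bases}

    if not detected:
        raise ValueError("no SNP score detected")
    unexpected = detected - {wild, mutant}
    if unexpected:
        raise ValueError(f"unexpected base(s) at SNP site: {sorted(unexpected)}")

    if detected == {wild}:
        return "homozygous wild type"   # only the wild type SNP score
    if detected == {mutant}:
        return "homozygous mutant"      # only the mutated SNP score
    return "heterozygous"               # double SNP scoring reaction


if __name__ == "__main__":
    # Placeholder bases: G as wild type, A as mutant.
    print(call_genotype(["G"], "G", "A"))
    print(call_genotype(["A"], "G", "A"))
    print(call_genotype(["G", "A"], "G", "A"))
```

In the heterozygous case the mutant score is whatever remains once the wild type base is removed from the detected set, which mirrors the subtraction described above.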

APSP Capture Probe/An APSP probe with a primary amine (capture affinity molecule conjugated to it) or half of any affinity pair and a capped 3' end that will prevent single base addition at this point.

APSP Probe/A probe comprised of 8 aminopurine substituted purines and other bases that form a more stable triplex and duplex with the W and C strands of the target. Again the length of all the embodiments of the probe and the number of 8 aminopurine substitutions is a function of its potential to destabilize the W and C strand target duplex. Some have capped 3' end while others have 3' ends that are the site of single base addition, or primers for extended synthesis.

APSP-Assisting Probe/Aminopurine Substituted Assisting Probe. A "floating" probe that is positioned within 100-150 bases of the single base addition site, if necessary. Its length and number of substituted purines are governed by the same considerations previously stated for APSP 8-amino purine duplex forming probes. Positioning of this probe helps open the local genomic DNA region to single base addition. Any analysis embodiment can be configured for use of an additional APSP assisting probe, if the proximity to the single base addition site is greater than 150 bases. The 3' end of this probe is capped in order to prevent single base addition at this site.
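The spacing rule in this definition can be written as a one-line check. The sketch below is illustrative only: the function name is hypothetical, and the threshold is left as a parameter because this definition gives 150 bases while the table footnotes give 100 bases.

```python
def needs_assisting_probe(gap_bases, max_open_region=150):
    """Return True when the gap between two probes exceeds the open region
    that can be maintained without an additional APSP assisting probe.

    max_open_region defaults to the 150 bases named in the definition above;
    pass 100 to apply the stricter figure used in the table footnotes.
    """
    if gap_bases < 0:
        raise ValueError("gap_bases must be non-negative")
    return gap_bases > max_open_region


if __name__ == "__main__":
    # 136 mer gap between the APSP PIP probe and the second RP-TFO (Figure 33).
    print(needs_assisting_probe(136))
    print(needs_assisting_probe(136, max_open_region=100))
    # 166 mer gap between the RP-TFO capture probe and the APSP EP (Figure 35).
    print(needs_assisting_probe(166))
```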

APSP-EP Probe/Aminopurine substituted extension primer that hybridizes to a flanking end of an STR region and whose 3' end is available for multiple single base additions for the purposes of sequencing and synthesis of the STR repeat region. The 3' end is the only site at which the DNA polymerase can add a base.

APSP-PIP Probe/Aminopurine Substituted Position Identifier Probe (has a 3' end next to the site of single base addition). The length of the APSP probe and the number of 8 amino substituted purines within is a function of that needed to stably displace the W or C target strand and form a more stable duplex and at the same time open the region between the APSP probes (100-150 bases) for single base addition. The SNP scoring site is the complement to the single base added.

APSP-BP/ Aminopurine substituted blocking probe that hybridizes to the opposite flanking end of an STR region and whose presence stops the single base addition for the purpose of sequencing the STR region. Once the repeat region is fully synthesized, base addition is stopped by the blocking probe presence (duplex present, no gap to fill). The 3' end of the probe is capped to prevent single base addition at the site.

Capture RP-TFO/Capture probe possessing a primary amine attached to the aminopurine substituted region of the oligonucleotide. The capture molecule may be half of any affinity pair. The RP-TFO has two 5' ends that will not support single base addition.

ddNTP/Dideoxynucleotide triphosphates (ddATP, ddTTP, ddGTP, ddCTP), also known as chain terminator bases. Each is a substrate for single base addition by DNA polymerase but, once added, is not itself a site for further base addition, so only a single base is added.

Exo III Substrate/Genomic (duplex) DNA ends that are blunt or carry 5' overhangs after RE cleavage are substrates for Exo III activity. 3' free ends are resistant to Exo III action, as are 3' overhangs, which is one strategy for protecting the PNAS or PDTP from further degradation. Placement of the RP-TFO, APSP capture probe, APSP PIP probe, APSP EP, APSP BP, or any APSP assisting probe in relaxed genomic DNA allows Exo III to generate a 3' free end, which protects the PNAS or PDTP, respectively, from further degradation and renders that end resistant to single base addition by the DNA polymerase.
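The classification of DNA ends in this definition can be summarized in a small lookup. This is a minimal sketch of the rule as stated here, not a model of enzyme kinetics; the enumeration and function names are illustrative.

```python
from enum import Enum


class EndType(Enum):
    BLUNT = "blunt"
    FIVE_PRIME_OVERHANG = "5' overhang"
    THREE_PRIME_OVERHANG = "3' overhang"
    THREE_PRIME_FREE_END = "3' free end"


# Per the definition above: blunt ends and 5' overhangs are Exo III
# substrates; 3' overhangs and 3' free ends are resistant.
EXO_III_SUBSTRATES = {EndType.BLUNT, EndType.FIVE_PRIME_OVERHANG}


def is_exo_iii_substrate(end_type):
    """Return True if an end of the given type is degraded by Exo III."""
    return end_type in EXO_III_SUBSTRATES


if __name__ == "__main__":
    for end in EndType:
        status = "substrate" if is_exo_iii_substrate(end) else "resistant"
        print(f"{end.value}: {status}")
```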

Exon/Part or all of the coding region of a gene; some genes possess numerous exons. Only a minority of the total number of SNPs is found in exons. The exon region has an accompanying adjacent intron (on one or both sides).

Intron/ Non-coding regions between the coding exons (pseudogenes have no neighboring intron).

PDTP/Partial Duplex Target Probe complex represents the use of single stranded, duplex forming DNA probes (which do not form triplexes) to create a modified target-containing structure, the PDTP, that possesses a capture element and a reporter element to aid in identification of the target presence or help characterize the target.

PNAS/Protected Nucleic Acid Sequence represents the use of triplex forming oligonucleotides to create a modified target region that possesses a capture element and a reporter element to aid in identification of the target presence or help characterize the target.

RE/restriction endonucleases are those nucleic acid modifying enzymes that provide cleavage of intact genomic (duplex) DNA by cutting at specific base recognition sites over the DNA genome. The cuts resulting fall into three categories: one, blunt ends (21% of all RE), two, 5' overhang ends (53% of all RE), three, 3' overhang ends (26% of all RE).

RFTA/Restriction Fragment Target Assay is a diagnostic assay that involves duplex forming probes producing a PDTP structure, including the target that will determine the presence of or characterize the target.

RP-TFO/Reverse Phase-Triplex Forming Oligonucleotide (forms a stable triplex at physiologic pH). It has at least an 11 mer purine region that is composed of 8 aminopurine substitutions, a reversal of polarity and a pyrimidine region. It has sufficient capability to displace the W or C strands of the relaxed genomic DNA of the target. Its polypyrimidine region possesses identical sequence to the Crick target strand polypyrimidine region; however, it has a reversal of polarity. It is stable at physiologic pH. See related documents.

Single Base Addition/Only a 3' end with a corresponding 5' overhang is a site for single base addition by any DNA polymerase. After Exo III exposure the PNAS or PDTP, generated by insertion of an RP-TFO or APSP probe into genomic DNA, possesses a 3' free end on each side that has no associated 5' overhang; therefore, the 3' end is not a site for single base extension by the DNA polymerase (a 5' overhang is mandatory). In the absence of Exo III, the 3' ends can be rendered sites where single base addition will not occur only if the RE on both sides of the PNAS or PDTP leaves a 3' overhang or a blunt end, thus rendering the ends not a site for single base extension by the DNA polymerase.
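The rule in this definition determines whether a given pair of restriction enzymes obliges the use of Exo III. The following sketch encodes that decision under the stated rule only; the function names and the string labels for end types are assumptions made for illustration.

```python
VALID_END_TYPES = {"blunt", "5' overhang", "3' overhang"}


def supports_single_base_addition(end_type):
    """Per the definition above, only a 3' end paired with a 5' overhang
    is a site for single base addition by the DNA polymerase."""
    if end_type not in VALID_END_TYPES:
        raise ValueError(f"unknown end type: {end_type!r}")
    return end_type == "5' overhang"


def exo_iii_treatment_required(restriction_end_types):
    """Return True if any restriction end would otherwise accept a base.

    restriction_end_types: the end types left by the RE cuts on each side
    of the PNAS or PDTP (hypothetical input format for this sketch).
    """
    return any(supports_single_base_addition(e) for e in restriction_end_types)


if __name__ == "__main__":
    # Example II: Nde I and Xba I are both described as leaving 5' overhangs.
    print(exo_iii_treatment_required(["5' overhang", "5' overhang"]))
    # A pair of enzymes leaving a 3' overhang and a blunt end.
    print(exo_iii_treatment_required(["3' overhang", "blunt"]))
```

With two 5' overhang ends, as in Example II, the check indicates that Exo III treatment is needed; with 3' overhang or blunt ends on both sides it is not.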

SNP/Single Nucleotide Polymorphism is a base change at a specific site in genomic DNA that occurs with a frequency around 1%. The SNP can be of clinical relevance and exist in the coding exon with its adjacent intron, or in a non-coding region, which is not located in the exon and by far accounts for the largest number of SNPs. These non-coding SNPs have relevance to identity testing and have usefulness in forensic analysis.

STR/Short Tandem Repeats represent variable regions of repeat segments in genomic DNA referred to as microsatellites. These are shorter repeat segments and are currently selected as those having tetranucleotide repeat sequences. Current STRs accepted by the forensic community number 13 (CODIS).

STR Target Region/The STR sequence with its two adjacent flanking sequences (one upstream and one downstream).
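Once the repeat region has been synthesized and sequenced, characterizing the STR amounts to counting consecutive copies of the repeat unit between the flanking sequences. The sketch below shows one simple way to do this; the function name, the example motif AGAT, and the flanking bases are illustrative assumptions and are not taken from the figures.

```python
def count_tandem_repeats(sequence, motif):
    """Return the largest number of consecutive copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    seq = sequence.upper()
    unit = motif.upper()
    step = len(unit)

    best = 0
    for start in range(len(seq) - step + 1):
        run = 0
        pos = start
        # Count back-to-back copies of the repeat unit starting at this position.
        while seq[pos:pos + step] == unit:
            run += 1
            pos += step
        best = max(best, run)
    return best


if __name__ == "__main__":
    # Hypothetical target: flank + 10 tetranucleotide repeats + flank.
    target = "GGC" + "AGAT" * 10 + "TTCA"
    print(count_tandem_repeats(target, "AGAT"))
```

The scan restarts at every position rather than skipping ahead, which is slower but avoids assumptions about whether the repeat unit can overlap itself.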

TFO/Triplex Forming Oligonucleotide is an oligonucleotide usually polypyrimidine that is used to form a triple strand DNA structure with a polyrich Watson and Crick region by binding to the major groove of the W and C strand duplex.

TPA/Triplex Protection Assay is a diagnostic assay where triplex formation in the target region functions to protect the target from nuclease digestion and forms the triplex PNAS structure, which can then be detected or characterized.

W and C Strand Displacement/ Watson and Crick Target DNA can be relaxed by heating to 50°C - 60°C and hybridized with 8 aminopurine substituted probes that form a more stable triplex or duplex. The 100 - 150 base open region (displaced strand) between them is the site where single base addition (and other activities) will occur.

TABLE I

RP-TFO TPA SNP SCORING EMBODIMENTS**** AND THEIR REQUIREMENTS

* This refers to mandatory use of APSP assisting probes. In any case, separation of the RP-TFO and the PIP probe by more than 100 bases will require the use of an additional APSP assisting probe.

** With Exo III treatment, use an RE that does not generate 3' overhangs (not a substrate for Exo III). Exo III treatment protects all critical targets from degradation by the presence of a 3' free end, which also prevents single base addition at those sites (PNAS ends).

*** In the absence of Exo III treatment, use an RE that leaves a 3' overhang or blunt end to prevent single base addition at the PNAS ends (at restriction ends).

**** All the embodiments herein presented refer to clinical SNP scoring, namely, those SNPs located in an exon region with an adjacent intron; however, the same embodiments can be applied to the scoring of non-coding SNPs.

TABLE II

APSP RFTA SNP SCORING EMBODIMENTS**** AND THEIR REQUIREMENTS

* This refers to mandatory use of APSP assisting probes. In any case, separation of the RP-TFO and the PIP probe by more than 100 bases will require the use of an additional APSP assisting probe.

** With Exo III treatment, use an RE that does not generate 3' overhangs (not a substrate for Exo III). Exo III treatment protects all critical targets from degradation by the presence of a 3' free end, which also prevents single base addition at those sites (PDTP ends).

*** In the absence of Exo III treatment, use an RE that leaves a 3' overhang or blunt end to prevent single base addition at the PDTP ends.

**** All the embodiments herein presented refer to clinical SNP scoring, namely, those SNPs located in an exon region with an adjacent intron; however, the same embodiments can be applied to the scoring of non-coding SNPs.

TABLE III

RP-TFO TPA STR ANALYSIS EMBODIMENTS AND THEIR REQUIREMENTS

* This refers to mandatory use of APSP assisting probes. In any case, separation of the RP-TFO and the APSP EP probe by more than 100 bases will require the use of an additional APSP assisting probe.

** With Exo III treatment, use an RE that does not generate 3' overhangs (not a substrate for Exo III). Exo III treatment protects all critical targets from degradation by the presence of a 3' free end, which also prevents single base addition at those sites (PNAS ends).

*** In the absence of Exo III treatment, use an RE that leaves a 3' overhang or blunt end to prevent single base addition at the PNAS ends (at restriction ends).

TABLE IV

APSP RFTA STR ANALYSIS EMBODIMENTS AND THEIR REQUIREMENTS

* This refers to mandatory use of APSP assisting probes. In any case, separation of the RP-TFO and the PEP probe by more than 100 bases will require the use of an additional APSP assisting probe.

** With Exo III treatment, use an RE that does not generate 3' overhangs (not a substrate for Exo III). Exo III treatment protects all critical targets from degradation by the presence of a 3' free end, which also prevents single base addition at those sites (PDTP ends).

*** In the absence of Exo III treatment, use an RE that leaves a 3' overhang or blunt end to prevent single base addition at the PDTP ends (at restriction ends).

Claims

What is Claimed is:

1. A method for analyzing target nucleic acid sequences, comprising:
a. isolating DNA;
b. restricting the DNA on both sides of the target sequence;
c. hybridizing at least one TFO to a region near the target sequence;
d. adding a 3' to 5' exonuclease to the DNA to form a PNAS tail structure;
e. capturing the structure;
f. hybridizing the structure with a SNP identification probe; and
g. determining the SNP score.

2. The method of Claim 1, wherein the DNA is genomic DNA.

3. The method of Claim 1, wherein restricting is by two restriction endonucleases.

4. The method of Claim 1, wherein restricting is by cutter probes.

5. The method of Claim 1, wherein capturing is by a capture probe hybridized to a PNAS tail.

6. The method of Claim 1, wherein determining the SNP score comprises:
a. dividing the composition of step f into four portions;
b. adding a DNA polymerase and four differently labeled dideoxynucleotide chain terminator bases (ddNTP) and allowing addition of one base to proceed; and
c. detecting the added base.

7. A method for analyzing target nucleic acid sequences, comprising:
a. isolating DNA;
b. restricting the target sequence DNA;
c. denaturing the DNA;
d. hybridizing a primary probe to form a PDTP structure;
e. capturing the PDTP structure;
f. hybridizing a SNP identification probe; and
g. determining the SNP score.

8. The method of Claim 7, wherein the DNA is genomic DNA.

9. The method of Claim 7, wherein restricting is by two restriction endonucleases.

10. The method of Claim 7, wherein restricting is by cutter probes.

11. The method of Claim 7, wherein capturing is by a capture probe hybridized to the PDTP structure.

12. The method of Claim 7, wherein determining the SNP score comprises:
a. dividing the composition of step e into four portions;
b. adding a DNA polymerase and four differently labeled dideoxynucleotide chain terminator bases (ddNTP) and allowing addition of one base to proceed; and
c. detecting the added base.

13. A method for analyzing target nucleic acid sequences, comprising:
a. isolating DNA;
b. restricting the DNA on both sides of the target sequence;
c. hybridizing at least one TFO to a region near the target sequence;
d. adding a 3' to 5' exonuclease to the DNA to form a PNAS tail structure;
e. capturing the structure;
f. hybridizing the structure with at least an extension probe; and
g. determining the STR sequence.

14. The method of Claim 13, wherein restricting is by two restriction endonucleases.

15. The method of Claim 13, wherein restricting is by cutter probes.

16. The method of Claim 13, wherein capturing is by a capture probe hybridized to a PNAS tail.

17. The method of Claim 13, wherein determining the STR sequence comprises:
a. adding a DNA polymerase and a mixture of four deoxynucleotide bases (dNTP) and synthesizing the complement sequence; and
b. determining the sequence of the complement sequence.

18. A method for analyzing target nucleic acid sequences, comprising:
a. isolating DNA;
b. restricting the target sequence DNA;
c. denaturing the DNA;
d. hybridizing a primary probe to form a PDTP structure;
e. capturing the PDTP structure;
f. hybridizing at least an extension probe; and
g. determining the STR sequence.

19. The method of Claim 7, wherein the DNA is genomic DNA.

20. The method of Claim 7, wherein restricting is by cutter probes.