
WO2009064897A2 - Detection of nucleic acid sequence variations in circulating nucleic acid in bovine spongiform encephalopathy - Google Patents


Info

Publication number
WO2009064897A2
Authority
WO
WIPO (PCT)
Prior art keywords
seq
nucleic acid
sequence
sequences
bse
Prior art date
Application number
PCT/US2008/083420
Other languages
French (fr)
Other versions
WO2009064897A3 (en)
Inventor
Ekkehard Schuetz
Julia Beck
Howard Urnovitz
Original Assignee
Chronix Biomedical
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chronix Biomedical
Publication of WO2009064897A2
Publication of WO2009064897A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Definitions

  • BSE Mad cow disease or bovine spongiform encephalopathy
  • BSE tests are post mortem tests performed on brain tissue from animals that have been slaughtered.
  • the majority of these tests detect abnormal proteins known as prions, which are not found in an animal until the disease has progressed into late stage.
  • Early stage spongiform encephalopathies are difficult to detect by prion testing because prion accumulation is most often associated with late-stage disease.
  • Genetic tests for prion gene polymorphisms are currently used to determine the susceptibility of sheep for scrapie (Hunter, et al. Arch Virol 141:809-824, 1996). No such diversity of prion genes is found in BSE.
  • the detection of nucleic acids in cattle sera Schotz et al.
  • CNA circulating nucleic acids
  • This invention is based on the discovery that single nucleotide variances and polymorphisms in nucleic acid sequences are detected in acellular samples, such as serum or plasma, from animals at risk for transmissible spongiform encephalopathy, e.g., BSE.
  • the invention therefore provides a method of detecting an animal with bovine spongiform encephalopathy (BSE), the method comprising: detection of an individual or multiple single nucleotide polymorphisms (SNPs), also referred to herein as single nucleotide variations (SNVs), in nucleic acids extracted from an acellular sample obtained from the animals.
  • the sample is an acellular fluid such as serum or plasma.
  • the nucleic acid sample can be a DNA sample or RNA sample.
  • BSE specific SNPs are selected from a database of nucleic acid sequences.
  • the database is generated by ultra deep sequencing technology whereby sequences present in BSE animals are compared to sequences present in animals without BSE.
  • SNVs can be detected using methods that are well known in the art.
  • Most assays entail one of several general protocols: hybridization using allele-specific oligonucleotides, primer extension, allele-specific ligation, sequencing, or electrophoretic separation techniques, e.g., single-stranded conformational polymorphism (SSCP) and heteroduplex analysis.
  • Other assays include 5' nuclease assays, template-directed dye terminator incorporation, molecular beacon allele-specific oligonucleotide assays, single-base extension assays, and SNP scoring by real-time pyrophosphate sequences.
  • Analysis of amplified sequences can be performed using various technologies such as microchips, fluorescence polarization assays, and matrix-assisted laser desorption ionization (MALDI) mass spectrometry.
  • MALDI matrix-assisted laser desorption ionization
  • Two methods that can also be used are assays based on invasive cleavage with Flap nucleases and methodologies employing padlock probes.
  • the presence of SNVs is detected by sequencing.
  • the invention provides a method of selecting technologies to use in an amplification reaction to detect an animal with BSE, the method comprising: identifying nucleic acid sequences that have differences in BSE animals compared to normal; and selecting reagents that detect specific SNP/SNV reactivity.
  • the sequences are identified in acellular samples, such as serum. In one embodiment, the invention provides a method for the detection of SNPs/SNVs to detect an animal with BSE, the method comprising: whole genome amplification of circulating nucleic acids, ultra deep sequencing of the amplified products, and identification of SNPs/SNVs in the resulting database in BSE animals as compared to normal controls.
  • the invention provides a method of detecting an animal with BSE, the method comprising: extracting nucleic acids from a sample and detecting the statistical presence of a SNP/SNV in the extracted nucleic acids. Detection of a SNP/SNV can be done directly or indirectly, e.g., through amplification of the target nucleic acids and query at a variant position within the target nucleic acid using an oligonucleotide that selectively hybridizes to a reference sequence or known BSE-associated variant; or by direct sequencing.
  • Fig. 1 provides an example of a query of a repetitive element against a database. Calculations are based on sequences from infected animals and controls, compared using the Chi-square test.
  • the solid line shows the Chi-square value at each position of this example queried sequence, based on the distribution of nucleotide sampling, i.e., A, C, G, T, or an insertion or deletion, of sequences from animals with BSE as compared to sequences from normal controls.
  • the dotted line depicts the total number of hits in the database for each position on this example queried sequence.
  • Fig. 2 shows exemplary SNV analysis. Whenever the dotted line reaches 1.0 the position has a single nucleotide variation found only in animals with BSE as compared to normal controls.
  • the accompanying solid line is the respective Chi-square value, based on the distribution of nucleotide sampling, i.e., A, C, G, T, or an insertion or deletion, calculated per animal. Throughout this example sequence, more than one position contains a SNV that is present in BSE animals but not in normal controls.
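
The per-position comparison described for Figs. 1 and 2 can be illustrated with a minimal sketch. It assumes the Chi-square value is a Pearson statistic computed on a 2 x 5 table of counts (BSE versus control, over A, C, G, T, and an insertion/deletion category written here as "-"). The function name, the input format, and the "-" symbol are hypothetical choices for illustration and are not taken from the patent.

```python
from collections import Counter

# Categories sampled at one position; "-" is an assumed symbol for an
# insertion or deletion.
CATEGORIES = ("A", "C", "G", "T", "-")


def chi_square_at_position(bse_calls, control_calls):
    """Pearson chi-square statistic for one position of a queried sequence.

    bse_calls / control_calls: iterables of category symbols observed at this
    position in reads from BSE animals and from normal controls.
    """
    bse = Counter(bse_calls)
    ctl = Counter(control_calls)
    n_bse = sum(bse[c] for c in CATEGORIES)
    n_ctl = sum(ctl[c] for c in CATEGORIES)
    total = n_bse + n_ctl
    if n_bse == 0 or n_ctl == 0:
        return 0.0  # no comparison is possible without reads in both groups
    stat = 0.0
    for c in CATEGORIES:
        column_total = bse[c] + ctl[c]
        if column_total == 0:
            continue  # category never observed at this position
        for observed, row_total in ((bse[c], n_bse), (ctl[c], n_ctl)):
            expected = row_total * column_total / total
            stat += (observed - expected) ** 2 / expected
    return stat
```

Applying such a function to every position of a queried sequence would give a curve comparable in spirit to the solid line in Fig. 1; the exact calculation used for the figures is not specified in this text.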
  • a "cohort” refers to birth or feeding cohorts that are defined according to the official EU definition as being raised or born on the same farm within 12 months prior to or after a BSE index case.
  • Animals "with BSE” refer to cattle that are incubating BSE etiologic agents but may or may not show any clinical signs of BSE or PrP res reactivity at the time of sampling.
  • reactivity refers to a change in a characteristic of SNP/SNV detection, in the presence of a nucleic acid sequence that is indicative of BSE.
  • a sample is considered reactive when it exhibits a value of at least 3, preferably 5 standard deviations above a reference standard.
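
The reactivity criterion above can be sketched as follows. This is an illustration only: it assumes the "reference standard" is summarized by the mean and sample standard deviation of measurements from reference controls, which the text does not state. The function and parameter names are hypothetical.

```python
from statistics import mean, stdev


def is_reactive(sample_value, reference_values, n_sd=3.0):
    """Return True if sample_value lies at least n_sd standard deviations
    above the mean of the reference measurements.

    n_sd=3.0 mirrors the "at least 3" threshold; pass n_sd=5.0 for the
    preferred, stricter threshold. Using the mean and sample standard
    deviation of reference_values is an assumption of this sketch.
    """
    mu = mean(reference_values)
    sd = stdev(reference_values)  # requires at least two reference values
    return sample_value >= mu + n_sd * sd
```

For reference values of 1.0, 2.0, and 3.0 (mean 2.0, standard deviation 1.0), a sample value of 5.0 meets the 3-standard-deviation threshold but not the 5-standard-deviation one.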
  • a "positive reference” or “positive control” is a sample that is known to contain SNPs/SNVs that are indicative of BSE.
  • a “positive reference” can be from a known cohort animal that was reactive in the assay of the invention.
  • a “positive reference” can be a synthetic construct that shows reactivity in an assay of the invention.
  • a “reference control” is a sample that results in minimal change to the SNP/SNV detection in BSE. Often, such a sample is a known negative, e.g., from healthy animals. For example, in diagnostic applications, such a control is typically derived from a normal animal that is not a cohort with a PrP res animal. A “reference control” is preferably included in an assay, but may be omitted.
  • “Amplifying” refers to a step of submitting a solution to conditions sufficient to allow for amplification of a polynucleotide if all of the components of the reaction are intact.
  • Components of an amplification reaction include, e.g., primers, a polynucleotide template, polymerase, nucleotides, and the like.
  • the term “amplifying” typically refers to an "exponential" increase in target nucleic acid. However, “amplifying” as used herein can also refer to linear increases in the numbers of a select target sequence of nucleic acid.
  • An "amplification characteristic" refers to any parameter of an amplification reaction. Such reactions typically comprise repeated cycles. An amplification characteristic may be the number of cycles, a melting curve, temperature profile, or band characteristics on a gel or other means of post-amplification detection.
  • the term allele-specific probe does not refer to an allele per se, but to a probe to a variant nucleic acid sequence relative to a reference sequence.
  • an "allele" in the context of this invention refers to a variant nucleic acid sequence in comparison to a reference sequence, e.g., the reference sequences set forth in SEQ ID NOs 1-41.
  • a “melting profile” or “melting curve” refers to the melting temperature characteristics of a nucleic acid fragment over a temperature gradient.
  • the melting curve is derived from the first derivative of the melting signal.
  • the melting point of a DNA fragment depends, e.g., on its length, its G/C content, the ionic strength of the buffer and the presence of mismatches (heteroduplexes).
  • mismatches heteroduplexes
  • amplification reaction refers to any in vitro means for multiplying the copies of a target sequence of nucleic acid.
  • methods include but are not limited to polymerase chain reaction (PCR), DNA ligase chain reaction (LCR), Qβ RNA replicase, RNA transcription-based (TAS and 3SR) amplification reactions, and nucleic acid sequence based amplification (NASBA).
  • PCR polymerase chain reaction
  • LCR DNA ligase chain reaction
  • TAS and 3SR RNA transcription-based amplification reactions
  • NASBA nucleic acid sequence based amplification
  • PCR refers to a method whereby a specific segment or subsequence of a target double-stranded DNA, is amplified in a geometric progression.
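
The geometric progression mentioned above can be made concrete with a short sketch. It uses the standard idealized model in which each cycle multiplies the copy number by (1 + efficiency), so perfect efficiency doubles the target each cycle. The function name and the efficiency parameter are illustrative assumptions and are not part of the patent text.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a number of PCR cycles.

    efficiency is the fraction of templates copied per cycle (0.0 to 1.0);
    1.0 corresponds to perfect doubling. Real reactions plateau, which this
    simple geometric model does not capture.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0.0 and 1.0")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles
```

Under this model, 10 starting copies become 80 after three perfectly efficient cycles, and a linear (non-exponential) increase would need a different model altogether.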
  • PCR is well known to those of skill in the art; see, e.g., U.S. Patents 4,683,195 and 4,683,202; PCR Technology: Principles and Applications for DNA Amplification (Erlich, ed., 1992) and PCR Protocols: A Guide to Methods and Applications, Innis et al., eds., 1990.
  • amplification reaction mixture refers to an aqueous solution comprising the various reagents used to amplify a target nucleic acid. These include enzymes, aqueous buffers, salts, amplification primers, target nucleic acid, and nucleoside triphosphates. Depending upon the context, the mixture can be either a complete or incomplete amplification reaction mixture.
  • a "primer" refers to a polynucleotide sequence that hybridizes to a sequence on a target nucleic acid and serves as a point of initiation of nucleic acid synthesis.
  • Primers can be of a variety of lengths and are often less than 50 nucleotides in length, for example 12-25 nucleotides.
  • the length and sequences of primers for use in PCR can be designed based on principles known to those of skill in the art, see, e.g., Innis et al., supra.
  • a primer is preferably a single-stranded oligodeoxyribonucleotide.
  • the primer includes a "hybridizing region" exactly or substantially complementary to the target sequence, preferably about 15 to about 35 nucleotides in length.
  • a primer oligonucleotide can either consist entirely of the hybridizing region or can contain additional features which allow for the detection, immobilization, or manipulation of the amplified product, but which do not alter the ability of the primer to serve as a starting reagent for DNA synthesis.
  • a nucleic acid sequence tail can be included at the 5' end of the primer that hybridizes to a capture oligonucleotide.
  • a primer for use in the invention need not exactly correspond to the sequence(s) that it amplifies in a hybridization reaction.
  • the incorporation of mismatches into a probe can be used to adjust duplex stability when the assay format precludes adjusting the hybridization conditions.
  • the effect of a particular introduced mismatch on duplex stability is well known, and the duplex stability can be routinely both estimated and empirically determined, as described above.
  • Suitable hybridization conditions which depend on the exact size and sequence of the probe, can be selected empirically using the guidance provided herein and well known in the art (see, e.g., the general PCR and molecular biology technique references cited herein).
  • "subsequence" when referring to a nucleic acid refers to a sequence of nucleotides that are contiguous within a second sequence but does not include all of the nucleotides of the second sequence.
  • a "temperature profile” refers to the temperature and lengths of time of the denaturation, annealing and/or extension steps of a PCR reaction.
  • a temperature profile for a PCR reaction typically consists of 10 to 60 repetitions of similar or identical shorter temperature profiles; each of these shorter profiles may typically define a two-step or three-step PCR reaction. Selection of a "temperature profile" is based on various considerations known to those of skill in the art, see, e.g., Innis et al., supra.
  • a “template” refers to a double or single stranded polynucleotide sequence that comprises a polynucleotide to be amplified.
  • an "acellular biological fluid” is a biological fluid that substantially lacks cells.
  • such fluids are fluids prepared by removal of cells from a biological fluid that normally contains cells (e.g., whole blood).
  • exemplary processed acellular biological fluids include processed blood (serum and plasma), e.g., from peripheral blood or blood from body cavities or organs; and samples prepared from urine, milk, saliva, sweat, tears, phlegm, cerebrospinal fluid, semen, feces, and the like.
  • serum or plasma is the acellular sample that is analyzed in the assays of the invention.
  • acellular samples that can be used include samples comprising nucleic acids obtained by washing any cell preparation to remove circulating nucleic acids that are associated with the cell surface.
  • such an acellular sample can be obtained by washing circulating blood cells, such as lymphocytes. The supernatant from the wash can then be analyzed.
  • Nucleic acid refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, or chimeric constructs of polynucleotides chemically linked to reporter molecules, and unless otherwise limited, would encompass known analogs of natural nucleotides that can function in a similar manner as naturally occurring nucleotides.
  • biological sample refers to a sample obtained from an organism or from components (e.g., cells) of an organism.
  • the sample may be of any biological tissue or fluid. Frequently the sample will be a "clinical sample” which is a sample derived from a patient, animal or human, with a disease or suspected of having a disease.
  • samples include, but are not limited to, sputum, blood, serum, plasma, body cavity blood or blood products, blood cells (e.g., white cells), tissue or fine needle biopsy samples, urine, milk, peritoneal fluid, and pleural fluid, or cells therefrom.
  • Biological samples may also include sections of tissues such as frozen sections taken for histological purposes.
  • An "individual" or "patient" as used herein, refers to any animals, often mammals, including, but not limited to humans, nonhuman primates such as chimpanzees and monkeys, horses, cows, deer, sheep, goats, pigs, dogs, minks, elk, cats, lagomorphs, and rodents.
  • a "chronic illness" is a disease, symptom, or syndrome that lasts for months to years.
  • chronic illnesses in animals include, but are not limited to, cancers and wasting diseases as well as autoimmune diseases, and neurodegenerative diseases such as spongiform encephalopathies and others.
  • Repetitive genomic sequences or “repetitive genomic nucleic acid sequences” (RGNAS) refer to highly repeated DNA elements present in the animal genome. These sequences are usually categorized in sequence families and are broadly classified as tandemly repeated DNA or interspersed repetitive DNA (see, e.g., Jelinek and Schmid, Ann. Rev. Biochem. 51:831-844, 1982; Hardman, Biochem J. 234:1-11, 1986; and Vogt, Hum. Genet. 84:301-306, 1990). Tandemly repeated DNA includes satellite, minisatellite, and microsatellite DNA.
  • Repetitive genomic sequences include Alu sequences, short interspersed nuclear elements (SINES), long terminal repeats (LTR), LTR and non-LTR transposable elements, LTR and non-LTR retrotransposons, endogenous retroviruses, and long interspersed nuclear elements (LINES) including L1 LINE sequences.
  • Intergenic sequence or "spacer DNA” or “non-coding sequence” refers to those nucleic acid sequences, including non-intronic sequences that do not code for protein sequences.
  • a "rearranged sequence” or “recombined sequence” is a region of the genomic DNA that is rearranged compared to normal, i.e., the rearranged sequence is not contiguous in genomic DNA in healthy animals or in genomic DNA obtained from animals prior to contracting a disease or prior to exposure to a genotoxic agent.
  • a single nucleotide polymorphism is used herein interchangeably with the term "single nucleotide variance" or "single nucleotide variations" (SNV).
  • SNP single nucleotide polymorphism
  • the reference sequence can be derived experimentally from nucleic acids sequenced from the serum of normal individuals and the test sequence may be derived from the serum of an animal with BSE.
  • Such variability can include regions of short nucleotide (1-5 nucleotide) deletions and insertions.
  • SNPs may occur at any region in the genome or in nucleic acid sequences. In the current invention, the change is in a non-coding region of DNA, including, but not limited to repetitive sequences, and intragenic DNA.
  • Ultra Deep Sequencing (454 Sequencing) is a massively-parallel pyrosequencing system capable of sequencing roughly 100 megabases of raw DNA sequence per 7-hour run using the GSFLX sequencing machine.
  • the system relies on fixing nebulized and adapter-ligated DNA fragments to small DNA-capture beads in a water-in-oil emulsion.
  • the DNA fixed to these beads is then amplified by PCR.
  • each DNA-bound bead is placed into a ~44 μm well on a PicoTiterPlate, a fiber optic chip. A mix of enzymes such as polymerase, ATP sulfurylase, and luciferase is also packed into the well.
  • the PicoTiterPlate is then placed into the GS20 for sequencing.
  • Contig refers to sequences that are computationally assembled from several overlapping physically contiguous sequences into one contiguous sequence. Such a contig is usually, but not necessarily longer than the initial sequences.
  • “Whole genome amplification” is a technique in which minute amounts of DNA can be multiplied to generate quantities suitable for genetic testing and analysis.
  • CNAs refers to DNA or RNA that is found in acellular fluids.
  • stringent hybridization conditions will be those in which the salt concentration is about 0.2X SSC at pH 7 and the temperature is at least about 60°C.
  • a nucleic acid of the invention or fragment thereof can be identified in standard filter hybridizations using the nucleic acids disclosed here under stringent conditions, which for purposes of this disclosure, include at least one wash (usually 2) in 0.2X SSC at a temperature of at least about 60°C, usually about 65°C, sometimes 70°C for 20 minutes, or equivalent conditions.
  • an annealing temperature of about 5°C below Tm is typical for low stringency amplification, although annealing temperatures may vary between about 32°C and 72°C, e.g., 40°C, 42°C, 45°C, 52°C, 55°C, 57°C, or 62°C, depending on primer length and nucleotide composition.
  • For high stringency PCR amplification, a temperature at, or slightly (up to 5°C) above, primer Tm is typical, although high stringency annealing temperatures can range from about 50°C to about 72°C, and are often 72°C, depending on the primer and buffer conditions (Ahsen et al., Clin Chem. 47:1956-61, 2001).
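
As a rough illustration of how an annealing temperature might be derived from primer Tm under the two stringency regimes above, the sketch below estimates Tm with the Wallace rule (2°C per A/T, 4°C per G/C). That rule is a common approximation for short oligonucleotides and is an assumption here; the text does not say how Tm is calculated. Function names are hypothetical.

```python
def wallace_tm(primer):
    """Approximate Tm in °C by the Wallace rule: 2*(A+T) + 4*(G+C).

    The rule is a rough estimate intended for short oligonucleotides and
    ignores salt and primer concentration.
    """
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def annealing_temperature(primer, stringency="low"):
    """Suggest an annealing temperature in °C.

    "low"  -> about 5°C below the estimated Tm.
    "high" -> at the estimated Tm (the text allows up to 5°C above it).
    """
    tm = wallace_tm(primer)
    if stringency == "low":
        return tm - 5.0
    if stringency == "high":
        return tm
    raise ValueError("stringency must be 'low' or 'high'")
```

For a 16-base primer with eight A/T and eight G/C bases, the estimate is a Tm of 48°C, giving 43°C for low stringency and 48°C for high stringency. In practice the value would still be adjusted empirically for the primer and buffer conditions.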
  • Typical cycle conditions for both high and low stringency amplifications include a denaturation phase of 90°C-95°C for 30 sec-2 min., an annealing phase lasting 30 sec-10 min., and an extension phase of about 72°C for 1-15 min.
  • BSE is clinically characterized by increasing perturbation of central nervous function in the affected animal, ultimately leading to severe symptoms, e.g., an inability to stand, forcing the sacrifice of the animal.
  • the bovine form does not appear to be associated with a mutation in the prion gene, but may be caused by a post-translational misfolding of the prion protein, which leads to aggregation in the central nervous system.
  • the diagnosis is based on the fact that misfolded prion protein has enhanced resistance to proteinase K digestion. As disease-specific prion accumulation in the plasma or blood of animals has not been identified, the diagnostic target has been the brain stem.
  • the invention provides a method for diagnosing an increased risk for BSE by amplification and analysis of circulating nucleic acids (CNA) from test animals.
  • CNA circulating nucleic acids
  • Nucleic acid molecules detected in the methods of the invention may be free, single or double stranded, molecules or complexed with protein or lipids or both.
  • the detected nucleic acids can be DNA or RNA molecules.
  • RNA molecules need not be transcribed from a gene, but can be transcribed from any sequence in the chromosomal DNA.
  • Exemplary RNAs include miRNA, intergenic RNA, small nuclear RNA (snRNA), mRNA, tRNA, rRNA, and interference RNA (iRNA).
  • the nucleic acid molecules may comprise sequences transcribed from repetitive genomic sequences or intergenic or non-coding DNA in the genome of the individual from which the sample is derived.
  • the detected nucleic acid molecules may also be the products of rearrangement of germline sequences and/or sequences introduced into the genome, e.g., exogenous viral sequences.
  • a polynucleotide detected using this method may be a particular polynucleotide or may be a population of polynucleotides that are present in the sample.
  • the polynucleotide in a particular sample need not have that sequence, i.e., the sequence of the polynucleotide in the sample may be altered in comparison to the known sequence.
  • Such alterations can include mutations, e.g., insertions, deletions, substitutions, and various other rearrangements.
  • the resulting amplified products may be a result of the amplification reaction and not reflect the original pool of polynucleotides.
  • the test samples are typically biological samples that comprise target nucleic acids.
  • a target nucleic acid can be from any source, but is typically from a biological sample that comprises small quantities of nucleic acid, e.g., nucleic acid samples obtained from samples that are not readily quantified by standard PCR methodology.
  • the test sample is a nucleic acid, e.g., RNA or DNA that is isolated from serum or plasma.

SNP/SNV Detection Reactions

  • Detection techniques for evaluating nucleic acids for the presence of a SNP or SNV involve procedures well known in the field of molecular genetics. Further, many of the methods involve amplification of nucleic acids. Ample guidance for performing such techniques is provided in the art. Exemplary references include manuals such as PCR Technology: Principles and Applications for DNA Amplification (ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (eds.
  • microarrays can be utilized for genomewide SNP detection assays (Genomewide SNP assay reveals mutations underlying Parkinson disease. Simon-Sanchez J, Scholz S, Del Mar Matarin M, Fung HC, Hernandez D, Gibbs JR, Britton A, Hardy J, Singleton A. Hum Mutat. 2007 Nov 9)
  • Suitable amplification methods include ligase chain reaction (see, e.g., Wu & Wallace, Genomics 4:560-569, 1988); strand displacement assay (see, e.g., Walker et al., Proc. Natl. Acad. Sci. USA 89:392-396, 1992; U.S. Pat. No. 5,455,166); and several transcription-based amplification systems, including the methods described in U.S. Pat. Nos. 5,437,990; 5,409,818; and 5,399,491; the transcription amplification system (TAS) (Kwoh et al., Proc. Natl.
  • TAS transcription amplification system
  • oligonucleotide primers and/or probes can be prepared by any suitable method, usually chemical synthesis. Oligonucleotides can be synthesized using commercially available reagents and instruments. Alternatively, they can be purchased through commercial sources. Methods of synthesizing oligonucleotides are well known in the art (see, e.g., Narang et al., Meth. Enzymol. 68:90-99, 1979; Brown et al., Meth. Enzymol. 68:109-151, 1979; Beaucage et al., Tetrahedron Lett.
  • modified phosphodiester linkages e.g., phosphorothioate, methylphosphonates, phosphoamidate, or boranophosphate
  • linkages other than a phosphorous acid derivative may be used to prevent cleavage at a selected site
  • the use of 2'-amino modified sugars tends to favor displacement over digestion of the oligonucleotide when hybridized to a nucleic acid that is also the template for synthesis of a new nucleic acid strand.
  • This technique, also commonly referred to as allele specific oligonucleotide hybridization (ASO) (e.g., Stoneking et al., Am. J. Hum. Genet. 48:370-382, 1991; Saiki et al., Nature 324:163-166, 1986; EP 235,726; and WO 89/11548), relies on distinguishing between two DNA molecules differing at a polymorphic position, typically by one nucleotide, by hybridizing an oligonucleotide probe that is specific for one of the variants to an amplified product obtained from amplifying the nucleic acid sample.
  • This method typically employs short oligonucleotides, e.g., 15-20 bases in length.
  • probes are designed to differentially hybridize to one variant versus another.
  • probes are designed to hybridize to the version of the nucleic acid sequence that is present in normal cows. Principles and guidance for designing such probes are available in the art, e.g., in the references cited herein.
  • Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, and preferably an essentially binary response, whereby a probe hybridizes to only one of the alleles.
  • Some probes are designed to hybridize to a segment of target DNA such that the polymorphic site aligns with a central position (e.g., in a 15-base oligonucleotide at the 7 position; in a 16-base oligonucleotide at either the 8 or 9 position) of the probe, but this design is not required.
  • the amount and/or presence of an allele is determined by measuring the amount of allele-specific oligonucleotide that is hybridized to the sample.
  • the oligonucleotide is labeled with a label such as a fluorescent label. After stringent hybridization and washing conditions, fluorescence intensity is measured for each SNP oligonucleotide.
  • the nucleotide present at the polymorphic site is identified by hybridization under sequence-specific hybridization conditions with an oligonucleotide probe exactly complementary to one of the polymorphic alleles in a region encompassing the polymorphic site.
  • the probe hybridizing sequence and sequence-specific hybridization conditions are selected such that a single mismatch at the polymorphic site destabilizes the hybridization duplex sufficiently so that it is effectively not formed.
  • Under sequence-specific hybridization conditions, stable duplexes will form only between the probe and the exactly complementary allelic sequence.
  • oligonucleotides from about 10 to about 35 nucleotides in length, preferably from about 15 to about 35 nucleotides in length, which are exactly complementary to an allele sequence in a region which encompasses the polymorphic site are within the scope of the invention.
  • the nucleotide present at the polymorphic site is identified by hybridization under sufficiently stringent hybridization conditions with an oligonucleotide substantially complementary to one of the SNP/SNV alleles in a region encompassing the polymorphic site, and exactly complementary to the allele at the polymorphic site. Because mismatches which occur at non-polymorphic sites are mismatches with both allele sequences, the difference in the number of mismatches in a duplex formed with the target allele sequence and in a duplex formed with the corresponding non-target allele sequence is the same as when an oligonucleotide exactly complementary to the target allele sequence is used.
  • the hybridization conditions are relaxed sufficiently to allow the formation of stable duplexes with the target sequence, while maintaining sufficient stringency to preclude the formation of stable duplexes with non-target sequences. Under such sufficiently stringent hybridization conditions, stable duplexes will form only between the probe and the target allele.
  • oligonucleotides from about 10 to about 35 nucleotides in length, preferably from about 15 to about 35 nucleotides in length, which are substantially complementary to an allele sequence in a region which encompasses the polymorphic site, and are exactly complementary to the allele sequence at the polymorphic site, are within the scope of the invention.
  • substantially complementary oligonucleotides may be desirable in assay formats in which optimization of hybridization conditions is limited.
  • probes for each target are immobilized on a single solid support.
  • Hybridizations are carried out simultaneously by contacting the solid support with a solution containing target DNA.
  • the hybridization conditions cannot be separately optimized for each probe.
  • the incorporation of mismatches into a probe can be used to adjust duplex stability when the assay format precludes adjusting the hybridization conditions.
  • duplex stability can be routinely both estimated and empirically determined, as described above.
  • Suitable hybridization conditions, which depend on the exact size and sequence of the probe, can be selected empirically using the guidance provided herein and well known in the art.
  • the use of oligonucleotide probes to detect single base pair differences in sequence is described in, for example, Conner et al., 1983, Proc. Natl. Acad. Sci. USA 80:278-282, and U.S. Pat. Nos. 5,468,613 and 5,604,099, each incorporated herein by reference.
  • the proportional change in stability between a perfectly matched and a single-base mismatched hybridization duplex depends on the length of the hybridized oligonucleotides. Duplexes formed with shorter probe sequences are destabilized proportionally more by the presence of a mismatch. In practice, oligonucleotides between about 15 and about 35 nucleotides in length are preferred for sequence-specific detection. Furthermore, because the ends of a hybridized oligonucleotide undergo continuous random dissociation and re-annealing due to thermal energy, a mismatch at either end destabilizes the hybridization duplex less than a mismatch occurring internally. Preferably, for discrimination of a single base pair change in target sequence, a probe sequence is selected that hybridizes to the target sequence such that the polymorphic site occurs in the interior region of the probe.
  • Suitable assay formats for detecting hybrids formed between probes and target nucleic acid sequences in a sample include the immobilized target (dot-blot) format and immobilized probe (reverse dot-blot or line-blot) assay formats.
  • Dot blot and reverse dot blot assay formats are described in U.S. Pat. Nos. 5,310,893; 5,451,512; 5,468,613; and 5,604,099; each incorporated herein by reference.
  • amplified target DNA is immobilized on a solid support, such as a nylon membrane.
  • the membrane-target complex is incubated with labeled probe under suitable hybridization conditions, unhybridized probe is removed by washing under suitably stringent conditions, and the membrane is monitored for the presence of bound probe.
  • a preferred dot-blot detection assay is described in the examples.
  • the probes are immobilized on a solid support, such as a nylon membrane or a microtiter plate.
  • the target DNA is labeled, typically during amplification by the incorporation of labeled primers.
  • One or both of the primers can be labeled.
  • the membrane-probe complex is incubated with the labeled amplified target DNA under suitable hybridization conditions, unhybridized target DNA is removed by washing under suitably stringent conditions, and the membrane is monitored for the presence of bound target DNA.
  • a preferred reverse line-blot detection assay is described in the examples.
  • An allele-specific probe that is specific for one of the polymorphism variants is often used in conjunction with the allele-specific probe for the other polymorphism variant.
  • the probes are immobilized on a solid support and the target sequence in an individual is analyzed using both probes simultaneously. Examples of nucleic acid arrays are described by WO 95/11995. The same array or a different array can be used for analysis of characterized polymorphisms.
  • Polymorphisms are also commonly detected using allele-specific amplification or primer extension methods. These reactions typically involve use of primers that are designed to specifically target a polymorphism via a mismatch at the 3' end of a primer. The presence of a mismatch affects the ability of a polymerase to extend a primer when the polymerase lacks error-correcting activity. The presence of the particular allele can be determined by the ability of the primer to initiate extension. If the 3' terminus is mismatched, the extension is impeded. Thus, for example, if a primer matches the "C" allele nucleotide at the 3' end, the primer will be efficiently extended.
  • the primer is used in conjunction with a second primer in an amplification reaction.
  • the second primer hybridizes at a site unrelated to the polymorphic position.
  • Amplification proceeds from the two primers leading to a detectable product signifying the particular allelic form is present.
  • Allele-specific amplification- or extension-based methods are described in, for example, WO 93/22456; U.S. Pat. Nos. 5,137,806; 5,595,890; 5,639,611; and U.S. Pat. No. 4,851,331.
  • identification of the alleles requires only detection of the presence or absence of amplified target sequences.
  • Methods for the detection of amplified target sequences are well known in the art. For example, gel electrophoresis and probe hybridization assays described are often used to detect the presence of nucleic acids.
  • detection of the amplified nucleic acid by monitoring the increase in the total amount of double-stranded DNA in the reaction mixture is described, e.g., in U.S. Pat. No. 5,994,056; and European Patent Publication Nos. 487,218 and 512,334.
  • the detection of double-stranded target DNA relies on the increased fluorescence that various DNA-binding dyes, e.g., SYBR Green, exhibit when bound to double-stranded DNA.
  • allele-specific amplification methods can be performed in reactions that employ multiple allele-specific primers to target particular alleles.
  • Primers for such multiplex applications are generally labeled with distinguishable labels or are selected such that the amplification products produced from the alleles are distinguishable by size.
  • both alleles in a single sample can be identified using a single amplification by gel analysis of the amplification product.
  • an allele-specific oligonucleotide primer may be exactly complementary to one of the polymorphic alleles in the hybridizing region or may have some mismatches at positions other than the 3' terminus of the oligonucleotide, which mismatches occur at non-polymorphic sites in both allele sequences.
  • Identification of the presence of a polymorphism can also be performed using a "TaqMan®" or "5'-nuclease assay", as described in U.S. Pat. Nos. 5,210,015; 5,487,972; and 5,804,375; and Holland et al., 1988, Proc. Natl. Acad. Sci. USA 88:7276-7280.
  • in the TaqMan® assay, labeled detection probes that hybridize within the amplified region are added during the amplification reaction. The probes are modified so as to prevent the probes from acting as primers for DNA synthesis.
  • the amplification is performed using a DNA polymerase having 5' to 3' exonuclease activity.
  • any probe which hybridizes to the target nucleic acid downstream from the primer being extended is degraded by the 5' to 3' exonuclease activity of the DNA polymerase.
  • the synthesis of a new target strand also results in the degradation of a probe, and the accumulation of degradation product provides a measure of the synthesis of target sequences.
  • the hybridization probe can be an allele-specific probe that discriminates between the SNP alleles.
  • the method can be performed using an allele-specific primer and a labeled probe that binds to amplified product.
  • any method suitable for detecting degradation product can be used in a 5' nuclease assay.
  • the detection probe is labeled with two fluorescent dyes, one of which is capable of quenching the fluorescence of the other dye.
  • the dyes are attached to the probe, preferably one at the 5' terminus and the other at an internal site, such that quenching occurs when the probe is in an unhybridized state and such that cleavage of the probe by the 5' to 3' exonuclease activity of the DNA polymerase occurs between the two dyes.
  • Amplification results in cleavage of the probe between the dyes with a concomitant elimination of quenching and an increase in the fluorescence observable from the initially quenched dye.
  • the accumulation of degradation product is monitored by measuring the increase in reaction fluorescence.
  • SNPs/SNVs can also be detected by direct sequencing. Methods include, e.g., dideoxy sequencing-based methods and other methods such as Maxam and Gilbert sequencing (see, e.g., Sambrook and Russell, supra).
  • Other detection methods include Pyrosequencing™ of oligonucleotide-length products. Such methods often employ amplification techniques such as PCR. For example, in pyrosequencing, a sequencing primer is hybridized to a single-stranded, PCR-amplified DNA template and incubated with the enzymes DNA polymerase, ATP sulfurylase, luciferase and apyrase, and the substrates adenosine 5' phosphosulfate (APS) and luciferin. The first of four deoxynucleotide triphosphates (dNTP) is added to the reaction.
  • dNTP deoxynucleotide triphosphates
  • DNA polymerase catalyzes the incorporation of the deoxynucleotide triphosphate into the DNA strand, if it is complementary to the base in the template strand. Each incorporation event is accompanied by release of pyrophosphate (PPi) in a quantity equimolar to the amount of incorporated nucleotide.
  • PPi pyrophosphate
  • ATP sulfurylase quantitatively converts PPi to ATP in the presence of adenosine 5' phosphosulfate. This ATP drives the luciferase-mediated conversion of luciferin to oxyluciferin that generates visible light in amounts that are proportional to the amount of ATP.
  • the light produced in the luciferase-catalyzed reaction is detected by a charge coupled device (CCD) camera and seen as a peak in a pyrogram™. Each light signal is proportional to the number of nucleotides incorporated.
  • Apyrase, a nucleotide-degrading enzyme, continuously degrades unincorporated dNTPs and excess ATP. When degradation is complete, another dNTP is added.
  • Another similar method for characterizing SNPs/SNVs does not require use of a complete PCR, but typically uses only the extension of a primer by a single, fluorescence-labeled dideoxynucleotide triphosphate (ddNTP) that is complementary to the nucleotide to be investigated.
  • ddNTP dideoxynucleotide triphosphate
  • the nucleotide at the polymorphic site can be identified via detection of a primer that has been extended by one base and is fluorescently labeled (e.g., Kobayashi et al., Mol. Cell. Probes, 9:175-182, 1995).
  • Amplification products generated using the polymerase chain reaction can be analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can be identified based on the different sequence-dependent melting properties and electrophoretic migration of DNA in solution (see, e.g., Erlich, ed., PCR Technology, Principles and Applications for DNA Amplification, W. H. Freeman and Co, New York, 1992, Chapter 7).
  • Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single-stranded PCR products, as described, e.g., in Orita et al., Proc. Nat. Acad. Sci. 86, 2766-2770 (1989).
  • Amplified PCR products can be generated as described above, and heated or otherwise denatured, to form single-stranded amplification products.
  • Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence.
  • the different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence difference between alleles of target
  • Oligonucleotides can be labeled by incorporating a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means.
  • Useful labels include fluorescent dyes, radioactive labels, e.g., 32P, electron-dense reagents, enzymes, such as peroxidase or alkaline phosphatase, biotin, or haptens and proteins for which antisera or monoclonal antibodies are available. Labeling techniques are well known in the art (see, e.g., Current Protocols in Molecular Biology, supra; Sambrook & Russell, supra).
  • a BSE animal is detected by detecting the presence of any one of the 420 polymorphisms set forth in Table 1, or detecting a combination of those polymorphisms. Analysis is generally performed by querying more than one of the 420 variant positions. Thus, anywhere from 1 to all of the 420 variant positions set forth in Table 1 can be analyzed to detect BSE. Generally at least 10, 20, 30, 40, 50, 60, 70, or 100 or more positions are analyzed.
  • Table 1 and Table 2 provide a description of SNPs found only in BSE animals. Table 1 provides details of SNPs found only in BSE animals. Column A contains the sequence identification tags for each reference query sequence as described in Table 2. Column B describes the position of the SNP in the sequence referred to in Column A. Column C is the consensus sequence derived from the database of sequences of normal and BSE animals and designates the position (shown in a capital letter) at which a diagnostic SNP can be found in sequences from BSE animals only. Column E is the Sequence ID number for those sequences found in Column F.
  • Column F is the consensus sequence derived from the database of sequences of BSE animals and designates the actual polymorphism (shown in a capital letter) found in sequences from BSE animals only: a "-" designates a deletion, an "N" designates an insertion.
  • Column G is the Sequence ID number for those sequences found in Column H.
  • Column H is an alternative consensus sequence to that found in Column F and derived from the database of sequences of BSE animals and designates the actual polymorphism (shown in a capital letter) found in sequences from BSE animals only: a "-" designates a deletion, an "N" designates an insertion.
  • the position of the polymorphism is determined with reference to one of reference sequences SEQ ID NOs 1-41. Thus, the number of the position indicates where that position occurs in the context of the reference sequence (i.e., one of SEQ ID NOs: 1-41). An insertion occurs after the designated position.
  • Table 2 provides a summary table of queried sequences containing diagnostic SNPs found only in BSE animals.
  • Column A contains the sequence identification tags for each query sequence.
  • Column B contains the repetitive element nomenclature that has the highest homology with the query sequence when applicable.
  • Column C is the length of the query sequence.
  • Column D contains the percentage of the query length that has homology (>70%) to the reference repetitive element in Column B.
  • Column E contains, where appropriate, the BLAST reference of the query sequence when searching against the cow genome.
  • Column F contains the percentage of the query length, where appropriate, that has homology (>70%) to the BLAST reference in Column E.
  • Column G contains the highest number of individual sequences in the database, derived from the ultra deep sequencing of both BSE and normal animals, at positions that contain SNPs found only in BSE animals.
  • Column H contains the total number of significant SNPs in the queried sequence found only in BSE animals.
  • Column I contains the maximum number of BSE animals, out of a total of 15 BSE animals but not any normal control animals, that can be detected when using a single SNP from Column H.
  • Column J contains the maximum number of BSE animals, out of a total of 15 BSE animals but not any normal control animals, that can be detected when using a combination of SNPs from Column H.
  • Column K refers to the Sequence Number of oligos (as detailed in Table 1) that are located within the query sequence of Column A and containing the SNP position referred to in Column H.
  • a polymorphic position described herein (i.e., a query position) can be evaluated using sequencing or any number of methods employing oligonucleotides that are competent to discriminate between the residue(s) present in the reference sequence and the indicated polymorphism present only in BSE animals.
  • Such oligonucleotides can bind selectively to the normal sequence or in some embodiments, are designed to bind selectively to the variant sequence known to be associated with BSE.
  • Exemplary oligonucleotides that discriminate between the reference sequence and BSE-associated variant sequence are provided in Table 1.
  • a BSE animal is detected by sequence analysis of one or more polymorphic positions.
  • EXAMPLES Example 1. Detection of polymorphisms to detect animals with BSE
  • This example describes detection of SNP/SNVs associated with BSE.
  • Samples were obtained from an experimental study whereby cows were inoculated orally with BSE-infectious or control brain material.
  • Fleckvieh/Brown Swiss cattle were fed 100 g of either PrPres-positive brain stem macerate or normal brain material (controls). Serum samples were taken 40 months post-inoculation (15 infected, 6 control non-infected and 12 randomly selected normal animals).
  • Serum collection. Special care was taken in collection, processing and storage of serum samples. Blood from the tail vein or artery was collected into 18 mL plastic tubes equipped with a coagulation accelerator and kept at room temperature for 30 min to ensure proper coagulation. Until further processing, the tubes were stored at 2 - 8 °C for not longer than 24 hours. Centrifugation was done at 2 - 8 °C, 1000 x g for 15 min. The serum supernatant was transferred into 1.5 mL microcentrifuge cups in 0.5 mL aliquots and frozen immediately at -20 °C or -80 °C until use.
  • Ultra deep sequencing of products from steps above was performed using a Roche/454 genome sequencer (GS20/GSFLX) with system reagents according to the manufacturer's instructions.
  • BLAST analysis. A total of 117 contigs were compared against a database containing 808,634 sequences (total letters: 86,785,049) using BLAST. Database sequences segregate into 410,984 sequences from 15 animals artificially infected with BSE and 397,650 sequences from 18 un-infected controls.
  • the ultra deep sequencing approach had generated 41 sequences in which SNP/SNVs were found only in animals with BSE and not in normal controls.
  • the sequences were mostly derived from repetitive genomic sequences wherein most of the prevalent sequences had homology to bovine L1 LINE or SINE repetitive elements. Two sequences showed neither repetitive nor coding sequence homology.
  • Table 1 shows that a total of 421 SNPs could be identified from the 41 sequences
  • Seq. ID: 11 1231 aggaaatartcaCycaagtccaaga Seq. ID: 223 aggaaatartca-ccaagtccaaga Seq. ID: 596 aggaaahagtya-ccaagtccaara
  • Seq. ID: 27 640 aagcagggtgacAatatacagcctt Seq. ID: 394 aagcagggtgacTatatacagcctt Seq. ID: 675 aarcagggtgacTatatacagcctt
  • Seq. ID: 27 1340 acagttcttctgTgtattcttgcca Seq. ID: 353 acagttcttctg-gtattcttgcca Seq. ID: 693 ayagttcttctg-gtattcttgcca
  • Seq. ID: 34 576 aamarggtmvyrAarattrkaangw Seq. ID: 422 aamaaggtcvyrGarattakaaagw Seq. ID: 713 aavaargkvryrGarawkataragw
  • TTTTGT 1 1 1 I CATGTGTTTGTT-AGYTNKGTGBTWDHNADTHAAATTCAACAYCCATTTATGATAAAAACTCTCCAGAAA

Abstract

The present invention provides methods and compositions for detecting a transmissible spongiform encephalopathy, e.g., BSE, based on the presence of BSE-associated polymorphisms in nucleic acid samples from acellular samples.

Description

DETECTION OF NUCLEIC ACID SEQUENCE VARIATIONS IN CIRCULATING NUCLEIC ACID IN BOVINE SPONGIFORM ENCEPHALOPATHY
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims benefit of U.S. provisional application no. 60/988,066, filed November 14, 2007, which application is herein incorporated by reference.
BACKGROUND OF THE INVENTION
[0002] Mad cow disease or bovine spongiform encephalopathy (BSE) is a progressive, invariably fatal neurodegenerative disease in cattle. BSE was recognized as a public health concern in 1996 when young Britons were diagnosed with what appeared to be a new form of a familial illness of older age, Creutzfeldt-Jakob Disease (CJD). British scientists linked the development of this "variant Creutzfeldt-Jakob Disease" (vCJD) to exposure to and/or consumption of BSE cattle. As of November 2002, 143 cases of "definite or probable" vCJD had been diagnosed in the UK.
[0003] Currently, the only available BSE tests are post mortem tests performed on brain tissue from animals that have been slaughtered. In addition, the majority of these tests detect abnormal proteins known as prions, which are not found in an animal until the disease has progressed into late stage. Early stage spongiform encephalopathies are difficult to detect by prion testing because prion accumulation is most often associated with late-stage disease. Genetic tests for prion gene polymorphisms are currently used to determine the susceptibility of sheep for scrapie (Hunter, et al. Arch Virol 141:809-824, 1996). No such diversity of prion genes is found in BSE. However, the detection of nucleic acids in cattle sera (Schütz et al. CDLI) has previously been reported. Tests for detection and monitoring of genetic material associated with chronic illnesses other than BSE can be performed using sera. Such circulating nucleic acids (CNA) associated tests are often designed to detect unique nucleic acid targets, usually of exogenous origin, e.g. HIV-1, CMV, HCV and HBV. CNAs of possible endogenous origin also have been found to be associated with chronic illnesses in humans (Urnovitz, et al. Clin Diagn Lab Immunol 6:330-335, 1999; Durie, Urnovitz, & Murphy Acta Oncol 39:789-796, 2000). These studies focused on the detection of only repetitive genomic sequences or repetitive sequences rearranged with other genomic elements. These studies did not analyze variances or polymorphisms.
[0004] It has been suggested that current tests are not sensitive enough to fully protect against the entry of BSE cattle into the human food chain (Knight, Nature 426:216, 2003). Further, current tests cannot identify cohort herd mates of BSE-infected cattle that have an increased risk of BSE. The current invention addresses this need.
BRIEF SUMMARY OF THE INVENTION
[0005] This invention is based on the discovery that single nucleotide variances and polymorphisms in nucleic acid sequences are detected in acellular samples, such as serum or plasma, from animals at risk for transmissible spongiform encephalopathy, e.g., BSE. The invention therefore provides a method of detecting an animal with bovine spongiform encephalopathy (BSE), the method comprising: detection of an individual or multiple single nucleotide polymorphisms (SNPs), also referred to herein as single nucleotide variations (SNVs), in nucleic acids extracted from an acellular sample obtained from the animal. In some embodiments, the sample is an acellular fluid such as serum or plasma. The nucleic acid sample can be a DNA sample or RNA sample.
[0006] Typically, BSE specific SNPs are selected from a database of nucleic acid sequences. In some embodiments, the database is generated by ultra deep sequencing technology whereby sequences present in BSE animals are compared to sequences present in animals without BSE.
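The database comparison described above can be illustrated with a minimal sketch. It assumes a hypothetical record layout of (animal, group, sequence ID, position, base) and uses invented toy records; it is not the procedure used to build Table 1, only one way to express "present in BSE animals, absent from all controls".

```python
from collections import defaultdict


def bse_only_variants(observations):
    """Return {(seq_id, position, base): set of BSE animal ids} for variants
    observed in at least one BSE animal and in no control animal.

    observations: iterable of (animal_id, group, seq_id, position, base),
    where group is "BSE" or "control" (hypothetical record layout).
    """
    seen = defaultdict(lambda: {"BSE": set(), "control": set()})
    for animal_id, group, seq_id, position, base in observations:
        seen[(seq_id, position, base)][group].add(animal_id)
    return {
        variant: groups["BSE"]
        for variant, groups in seen.items()
        if groups["BSE"] and not groups["control"]
    }


# Invented toy records, for illustration only.
observations = [
    ("bse01", "BSE", 27, 640, "T"),
    ("bse02", "BSE", 27, 640, "T"),
    ("bse03", "BSE", 27, 640, "A"),
    ("ctl01", "control", 27, 640, "A"),
    ("ctl02", "control", 27, 640, "A"),
]
result = bse_only_variants(observations)
```

The returned sets of animal identifiers make it straightforward to count how many BSE animals a single variant, or a combination of variants, would detect.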
[0007] SNVs can be detected using methods that are well known in the art. Most assays entail one of several general protocols: hybridization using allele-specific oligonucleotides, primer extension, allele-specific ligation, sequencing, or electrophoretic separation techniques, e.g., single-stranded conformational polymorphism (SSCP) and heteroduplex analysis. Other assays include 5' nuclease assays, template-directed dye terminator incorporation, molecular beacon allele-specific oligonucleotide assays, single-base extension assays, and SNP scoring by real-time pyrophosphate sequencing. Analysis of amplified sequences can be performed using various technologies such as microchips, fluorescence polarization assays, and matrix-assisted laser desorption ionization (MALDI) mass spectrometry. Two methods that can also be used are assays based on invasive cleavage with Flap nucleases and methodologies employing padlock probes. In typical embodiments, the presence of SNVs is detected by sequencing.
[0008] In another aspect, the invention provides a method of selecting technologies to use in an amplification reaction to detect an animal with BSE, the method comprising: identifying nucleic acid sequences that have differences in BSE animals compared to normal; and selecting reagents that detect specific SNP/SNV reactivity. Typically, the sequences are identified in acellular samples, such as serum. In one embodiment, the invention provides a method for the detection of SNPs/SNVs to detect an animal with BSE, the method comprising: whole genome amplification of circulating nucleic acids, ultra deep sequencing of the amplified products and identification of SNPs/SNVs in the resulting database in BSE animals as compared to normal controls.
[0009] The invention provides a method of detecting an animal with BSE, the method comprising: extracting nucleic acids from a sample and detecting the statistical presence of a SNP/SNV in the extracted nucleic acids. Detection of a SNP/SNV can be done directly or indirectly, e.g., through amplification of the target nucleic acids and query at a variant position within the target nucleic acid using an oligonucleotide that selectively hybridizes to a reference sequence or known BSE-associated variant; or by direct sequencing.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] Fig. 1 provides an example of a query repetitive element against a database. Sequences from infected animals and controls were compared using a Chi-square test. The solid line shows the Chi-square value at each position of this example queried sequence, based on the distribution of nucleotide sampling, i.e., A, C, G, T or an insertion or deletion, of sequences from animals with BSE as compared to sequences from normal controls. The dotted line depicts the total number of hits in the database for each position on this example queried sequence.
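The per-position comparison described for Fig. 1 can be sketched as a Pearson Chi-square statistic on a two-row table (BSE versus control) with one column per observed category. The sketch below is an illustration under assumptions: the category labels, the dictionary input format, and the decision to drop categories with no observations are choices made here, not details given in the text.

```python
CATEGORIES = ("A", "C", "G", "T", "ins", "del")  # assumed labels


def chi_square(bse_counts, control_counts):
    """Pearson chi-square statistic for one position.

    bse_counts / control_counts: dicts mapping a category to the number of
    database hits showing that category at the position.
    """
    # Keep only categories observed in at least one group.
    cols = [c for c in CATEGORIES
            if bse_counts.get(c, 0) + control_counts.get(c, 0) > 0]
    rows = [
        [bse_counts.get(c, 0) for c in cols],
        [control_counts.get(c, 0) for c in cols],
    ]
    row_totals = [sum(r) for r in rows]
    col_totals = [rows[0][j] + rows[1][j] for j in range(len(cols))]
    total = sum(row_totals)
    if total == 0 or 0 in row_totals:
        return 0.0  # no comparison possible without hits in both groups
    stat = 0.0
    for i in range(2):
        for j in range(len(cols)):
            expected = row_totals[i] * col_totals[j] / total
            stat += (rows[i][j] - expected) ** 2 / expected
    return stat


# Toy counts: half of the BSE hits carry a T that no control hit carries.
example = chi_square({"A": 10, "T": 10}, {"A": 20})
```

Evaluating this function at every position of a queried sequence would give a profile comparable in spirit to the solid line of Fig. 1, while the sum of all counts at a position corresponds to the hit total shown by the dotted line.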
[0011] Fig. 2 shows exemplary SNV analysis. Whenever the dotted line reaches 1.0, the position has a single nucleotide variation found only in animals with BSE as compared to normal controls. The accompanying solid line is the respective Chi-square value, based on the distribution of nucleotide sampling, i.e., A, C, G, T or an insertion or deletion, calculated per animal. Throughout this example sequence, more than one position contains a SNV that is present in BSE animals but not in normal controls.
DETAILED DESCRIPTION OF THE INVENTION
Definitions
[0012] A "cohort" refers to birth or feeding cohorts that are defined according to the official EU definition as being raised or born on the same farm within 12 months prior to or after a BSE index case.
[0013] Animals "with BSE" refer to cattle that are incubating BSE etiologic agents but may or may not show any clinical signs of BSE or PrPres reactivity at the time of sampling.
[0014] The term "reactivity" as used herein refers to a change in a characteristic of SNP/SNV detection, in the presence of a nucleic acid sequence that is indicative of BSE. A sample is considered reactive when it exhibits a value of at least 3, preferably 5 standard deviations above a reference standard.
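The reactivity criterion above can be written as a simple threshold check. This is a sketch under assumptions: the text does not say how the reference standard is summarized, so the use of the mean and sample standard deviation of a set of reference measurements, and the function and parameter names, are illustrative choices.

```python
from statistics import mean, stdev


def is_reactive(value, reference_values, n_sd=3.0):
    """Return True if value lies at least n_sd sample standard deviations
    above the mean of the reference measurements (assumed interpretation)."""
    threshold = mean(reference_values) + n_sd * stdev(reference_values)
    return value >= threshold


# Invented reference readings, for illustration only.
reference = [1.0, 1.2, 0.8, 1.1, 0.9]
reactive_at_3_sd = is_reactive(1.6, reference)            # threshold about 1.47
reactive_at_5_sd = is_reactive(1.6, reference, n_sd=5.0)  # threshold about 1.79
```

With these toy values the sample is reactive under the 3 standard deviation criterion but not under the preferred 5 standard deviation criterion.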
[0015] A "positive reference" or "positive control" is a sample that is known to contain SNPs/SNVs that are indicative of BSE. In some embodiments, a "positive reference" can be from a known cohort animal that was reactive in the assay of the invention. Alternatively, a "positive reference" can be a synthetic construct that shows reactivity in an assay of the invention.
[0016] A "reference control" is a sample that results in minimal change to the SNP/SNV detection in BSE. Often, such a sample is a known negative, e.g., from healthy animals. For example, in diagnostic applications, such a control is typically derived from a normal animal that is not a cohort with a PrPres animal. A "reference control" is preferably included in an assay, but may be omitted.
[0017] "Amplifying" refers to a step of submitting a solution to conditions sufficient to allow for amplification of a polynucleotide if all of the components of the reaction are intact. Components of an amplification reaction include, e.g., primers, a polynucleotide template, polymerase, nucleotides, and the like. The term "amplifying" typically refers to an "exponential" increase in target nucleic acid. However, "amplifying" as used herein can also refer to linear increases in the numbers of a select target sequence of nucleic acid.
[0018] An "amplification characteristic" refers to any parameter of an amplification reaction. Such reactions typically comprise repeated cycles. An amplification characteristic may be the number of cycles, a melting curve, temperature profile, or band characteristics on a gel or other means of post-amplification detection.
[0019] In the context of this invention, the term "allele-specific probe" does not refer to an allele per se, but to a probe to a variant nucleic acid sequence relative to a reference sequence. Thus an "allele" in the context of this invention refers to a variant nucleic acid sequence in comparison to a reference sequence, e.g., the reference sequences set forth in SEQ ID NOs 1-41.
[0020] A "melting profile" or "melting curve" refers to the melting temperature characteristics of a nucleic acid fragment over a temperature gradient. In some embodiments, the melting curve is derived from the first derivative of the melting signal. The melting point of a DNA fragment depends, e.g., on its length, its G/C content, the ionic strength of the buffer and the presence of mismatches (heteroduplexes). Thus, the proportion of the molecules in the population that are melting over a temperature range generates a melting profile, which is unique to a particular fragment or population of molecules.
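As a rough illustration of deriving a melting curve from the first derivative of the melting signal, the sketch below computes the negative finite-difference derivative between adjacent readings and reports the temperature at which it peaks. It assumes a signal that falls as the duplex melts (for example, fluorescence from a double-stranded DNA-binding dye); the function name and the readings are hypothetical.

```python
def melting_peak(temperatures, signal):
    """Return (estimated melting temperature, list of -dF/dT values).

    The derivative is taken between adjacent readings, and each value is
    assigned to the midpoint of its temperature interval.
    """
    if len(temperatures) != len(signal) or len(temperatures) < 2:
        raise ValueError("need two or more paired readings")
    midpoints = []
    neg_derivative = []
    for i in range(len(temperatures) - 1):
        dt = temperatures[i + 1] - temperatures[i]
        midpoints.append((temperatures[i] + temperatures[i + 1]) / 2)
        neg_derivative.append(-(signal[i + 1] - signal[i]) / dt)
    peak = max(range(len(neg_derivative)), key=neg_derivative.__getitem__)
    return midpoints[peak], neg_derivative


# Invented readings: the signal drops most steeply between 84 and 86 degrees.
tm, curve = melting_peak([80, 82, 84, 86, 88], [100, 95, 60, 20, 18])
```

A population containing mismatched heteroduplexes would be expected to show additional or shifted peaks in such a derivative curve, which is what makes the profile characteristic of a fragment or population.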
[0021] The term "amplification reaction" refers to any in vitro means for multiplying the copies of a target sequence of nucleic acid. Such methods include but are not limited to polymerase chain reaction (PCR), DNA ligase chain reaction (LCR), Qβ RNA replicase, RNA transcription-based (TAS and 3SR) amplification reactions, and nucleic acid sequence based amplification (NASBA). (See, e.g., Current Protocols in Human Genetics, Dracopoli et al. eds., 2000, John Wiley & Sons, Inc.).
[0022] "Polymerase chain reaction" or "PCR" refers to a method whereby a specific segment or subsequence of a target double-stranded DNA is amplified in a geometric progression. PCR is well known to those of skill in the art; see, e.g., U.S. Patents 4,683,195 and 4,683,202; PCR Technology: Principles and Applications for DNA Amplification (Erlich, ed., 1992) and PCR Protocols: A Guide to Methods and Applications, Innis et al., eds, 1990.
[0023] The term "amplification reaction mixture" refers to an aqueous solution comprising the various reagents used to amplify a target nucleic acid. These include enzymes, aqueous buffers, salts, amplification primers, target nucleic acid, and nucleoside triphosphates. Depending upon the context, the mixture can be either a complete or incomplete amplification reaction mixture
[0024] A "primer" refers to a polynucleotide sequence that hybridizes to a sequence on a target nucleic acid and serves as a point of initiation of nucleic acid synthesis. Primers can be of a variety of lengths and are often less than 50 nucleotides in length, for example 12-25 nucleotides, in length. The length and sequences of primers for use in PCR can be designed based on principles known to those of skill in the art, see, e.g., Innis et al., supra. A primer is preferably a single-stranded oligodeoxyribonucleotide. The primer includes a "hybridizing region" exactly or substantially complementary to the target sequence, preferably about 15 to about 35 nucleotides in length. A primer oligonucleotide can either consist entirely of the hybridizing region or can contain additional features which allow for the detection, immobilization, or manipulation of the amplified product, but which do not alter the ability of the primer to serve as a starting reagent for DNA synthesis. For example, a nucleic acid sequence tail can be included at the 5' end of the primer that hybridizes to a capture oligonucleotide. As appreciated by one of skill in the art, a primer for use in the invention need not exactly correspond to the sequence(s) that it amplifies in a hybridization reaction. For example, the incorporation of mismatches into a probe can be used to adjust duplex stability when the assay format precludes adjusting the hybridization conditions. The effect of a particular introduced mismatch on duplex stability is well known, and the duplex stability can be routinely both estimated and empirically determined, as described above. Suitable hybridization conditions, which depend on the exact size and sequence of the probe, can be selected empirically using the guidance provided herein and well known in the art (see, e.g., the general PCR and molecular biology technique references cited herein).
[0025] The term "subsequence" when referring to a nucleic acid refers to a sequence of nucleotides that is contiguous within a second sequence but does not include all of the nucleotides of the second sequence.
[0026] A "temperature profile" refers to the temperature and lengths of time of the denaturation, annealing and/or extension steps of a PCR reaction. A temperature profile for a PCR reaction typically consists of 10 to 60 repetitions of similar or identical shorter temperature profiles; each of these shorter profiles may typically define a two-step or three-step PCR reaction. Selection of a "temperature profile" is based on various considerations known to those of skill in the art, see, e.g., Innis et al., supra.
[0027] A "template" refers to a double or single stranded polynucleotide sequence that comprises a polynucleotide to be amplified.
[0028] An "acellular biological fluid" is a biological fluid that substantially lacks cells. Typically, such fluids are fluids prepared by removal of cells from a biological fluid that normally contains cells (e.g., whole blood). Exemplary processed acellular biological fluids include processed blood (serum and plasma), e.g., from peripheral blood or blood from body cavities or organs; and samples prepared from urine, milk, saliva, sweat, tears, phlegm, cerebrospinal fluid, semen, feces, and the like. Often, serum or plasma is the acellular sample that is analyzed in the assays of the invention. Other acellular samples that can be used include samples comprising nucleic acids obtained by washing any cell preparation to remove circulating nucleic acids that are associated with the cell surface. For example, such an acellular sample can be obtained by washing circulating blood cells, such as lymphocytes. The supernatant from the wash can then be analyzed.
[0029] "Nucleic acid" refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, or chimeric constructs of polynucleotides chemically linked to reporter molecules, and unless otherwise limited, would encompass known analogs of natural nucleotides that can function in a similar manner as naturally occurring nucleotides.
[0030] The term "biological sample", as used herein, refers to a sample obtained from an organism or from components (e.g., cells) of an organism. The sample may be of any biological tissue or fluid. Frequently the sample will be a "clinical sample" which is a sample derived from a patient, animal or human, with a disease or suspected of having a disease. Such samples include, but are not limited to, sputum, blood, serum, plasma, body cavity blood or blood products, blood cells (e.g., white cells), tissue or fine needle biopsy samples, urine, milk, peritoneal fluid, and pleural fluid, or cells therefrom. Biological samples may also include sections of tissues such as frozen sections taken for histological purposes.
[0031] An "individual" or "patient" as used herein, refers to any animal, often a mammal, including, but not limited to humans, nonhuman primates such as chimpanzees and monkeys, horses, cows, deer, sheep, goats, pigs, dogs, minks, elk, cats, lagomorphs, and rodents.
[0032] A "chronic illness" is a disease, symptom, or syndrome that lasts for months to years. Examples of chronic illnesses in animals include, but are not limited to, cancers and wasting diseases as well as autoimmune diseases, and neurodegenerative diseases such as spongiform encephalopathies and others.
[0033] "Repetitive genomic sequences" or "repetitive genomic nucleic acid sequences" (RGNAS) refer to highly repeated DNA elements present in the animal genome. These sequences are usually categorized in sequence families and are broadly classified as tandemly repeated DNA or interspersed repetitive DNA (see, e.g., Jelinek and Schmid, Ann. Rev. Biochem. 51:831-844, 1982; Hardman, Biochem J. 234:1-11, 1986; and Vogt, Hum. Genet. 84:301-306, 1990). Tandemly repeated DNA includes satellite, minisatellite, and microsatellite DNA. Repetitive genomic sequences include Alu sequences, short interspersed nuclear elements (SINES), long terminal repeats (LTR), LTR and non-LTR transposable elements, LTR and non-LTR retrotransposons, endogenous retroviruses, and long interspersed nuclear elements (LINES) including L1 LINE sequences.
[New Paragraph] "Intergenic sequence" or "spacer DNA" or "non-coding sequence" refers to those nucleic acid sequences, including non-intronic sequences, that do not code for protein sequences.
[0034] A "rearranged sequence" or "recombined sequence" is a region of the genomic DNA that is rearranged compared to normal, i.e., the rearranged sequence is not contiguous in genomic DNA in healthy animals or in genomic DNA obtained from animals prior to contracting a disease or prior to exposure to a genotoxic agent.
[0035] A single nucleotide polymorphism (SNP) is used herein interchangeably with the term "single nucleotide variance" or "single nucleotide variations" (SNV). A SNP (SNV) is a difference at a single nucleotide position of a test sequence when compared to a reference nucleic acid sequence in which 1 and up to 5 nucleotides can be different. In one embodiment, the reference sequence can be derived experimentally from nucleic acids sequenced from the serum of normal individuals and the test sequence may be derived from the serum of an animal with BSE. Such variability can include regions of short nucleotide (1-5 nucleotide) deletions and insertions. SNPs (SNVs) may occur at any region in the genome or in nucleic acid sequences. In the current invention, the change is in a non-coding region of DNA, including, but not limited to repetitive sequences, and intragenic DNA.
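As an illustration of comparing a test sequence to a reference sequence, the following minimal sketch lists the positions at which two sequences differ. It assumes the two sequences have already been aligned to the same length and contain substitutions only; insertions and deletions would require an alignment step that is not shown. The function name `find_snvs` and both sequences are hypothetical.

```python
def find_snvs(reference, test):
    """Return (1-based position, reference base, test base) for each difference."""
    if len(reference) != len(test):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), test.upper())
    return [(i + 1, r, t) for i, (r, t) in enumerate(pairs) if r != t]


# Hypothetical reference (e.g., from normal serum) and test sequence
reference = "ACGTTGCATGCA"
test = "ACGTTACATGCA"
variants = find_snvs(reference, test)
print(variants)
```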
[0036] "Ultra Deep Sequencing" (454 Sequencing) is a massively-parallel pyrosequencing system capable of sequencing roughly 100 megabases of raw DNA sequence per 7-hour run using the GS FLX sequencing machine. The system relies on fixing nebulized and adapter-ligated DNA fragments to small DNA-capture beads in a water-in-oil emulsion. The DNA fixed to these beads is then amplified by PCR. Finally, each DNA-bound bead is placed into a ~44 μm well on a PicoTiterPlate, a fiber optic chip. A mix of enzymes such as polymerase, ATP sulfurylase, and luciferase is also packed into the well. The PicoTiterPlate is then placed into the GS20 for sequencing.
[0037] "Contig" refers to sequences that are computationally assembled from several overlapping physically contiguous sequences into one contiguous sequence. Such a contig is usually, but not necessarily, longer than the initial sequences.
[0038] "Whole genome amplification" is a technique in which minute amounts of DNA can be multiplied to generate quantities suitable for genetic testing and analysis.
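The computational assembly of a contig described in paragraph [0037] can be illustrated with a simple sketch that merges reads whose ends overlap. This is a toy greedy merge under the assumption that the reads are error-free and already ordered; it is not the assembly software used with the sequencing system, and the names `merge_overlap` and `assemble` and the reads themselves are hypothetical.

```python
def merge_overlap(left, right, min_overlap=3):
    """Merge two reads if a suffix of `left` equals a prefix of `right`."""
    longest = min(len(left), len(right))
    # Try the longest possible overlap first
    for k in range(longest, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    return None


def assemble(reads, min_overlap=3):
    """Chain ordered reads into one contig; raise if a pair does not overlap."""
    contig = reads[0]
    for read in reads[1:]:
        merged = merge_overlap(contig, read, min_overlap)
        if merged is None:
            raise ValueError("no sufficient overlap with read " + read)
        contig = merged
    return contig


# Hypothetical overlapping reads
reads = ["ACGTTGCA", "TGCATTAG", "TTAGGCC"]
contig = assemble(reads)
print(contig)
```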
[0039] "Circulating nucleic acids" or "CNAs" refers to DNA or RNA that is found in acellular fluids.
[0040] Typically, stringent hybridization conditions will be those in which the salt concentration is about 0.2X SSC at pH 7 and the temperature is at least about 60°C. For example, a nucleic acid of the invention or fragment thereof can be identified in standard filter hybridizations using the nucleic acids disclosed here under stringent conditions, which for purposes of this disclosure, include at least one wash (usually 2) in 0.2X SSC at a temperature of at least about 60°C, usually about 65°C, sometimes 70°C for 20 minutes, or equivalent conditions. For PCR, an annealing temperature of about 5°C below Tm is typical for low stringency amplification, although annealing temperatures may vary between about 32°C and 72°C, e.g., 40°C, 42°C, 45°C, 52°C, 55°C, 57°C, or 62°C, depending on primer length and nucleotide composition. For high stringency PCR amplification, a temperature at, or slightly (up to 5°C) above, primer Tm is typical, although high stringency annealing temperatures can range from about 50°C to about 72°C, and are often 72°C, depending on the primer and buffer conditions (Ahsen et al., Clin. Chem. 47:1956-61, 2001). Typical cycle conditions for both high and low stringency amplifications include a denaturation phase of 90°C-95°C for 30 sec-2 min., an annealing phase lasting 30 sec-10 min., and an extension phase of about 72°C for 1-15 min.
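The relationship between primer Tm and annealing temperature described above can be sketched as follows. The disclosure does not specify how Tm is calculated; this example assumes the Wallace rule (2°C per A/T and 4°C per G/C), a rough estimate for short oligonucleotides, whereas nearest-neighbor models are generally more accurate. The primer sequence and function names are hypothetical.

```python
def wallace_tm(primer):
    """Rough Tm estimate (degrees C) by the Wallace rule; an assumption, not from the source."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def annealing_window(primer):
    """Apply the offsets given in the text: about Tm - 5 for low stringency,
    Tm up to Tm + 5 for high stringency."""
    tm = wallace_tm(primer)
    return {
        "tm": tm,
        "low_stringency": tm - 5,
        "high_stringency": (tm, tm + 5),
    }


primer = "AGCTGACCTGAAGCTTCA"  # hypothetical 18-mer
window = annealing_window(primer)
print(window)
```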
Introduction
[0041] BSE is clinically characterized by increasing perturbation of central nervous function in the affected animal, ultimately leading to severe symptoms, e.g., an inability to stand, forcing the sacrifice of the animal. In contrast to other mammalian transmissible spongiform encephalopathies, the bovine form does not appear to be associated with a mutation in the prion gene, but may be caused by a post-translational misfolding of the prion protein, which leads to aggregation in the central nervous system. The diagnosis is based on the fact that misfolded prion protein has enhanced resistance to proteinase K digestion. As disease-specific prion accumulation in the plasma or blood of animals has not been identified, the diagnostic target has been the brain stem. Therefore, there is an urgent need to define a blood-borne marker for TSEs so that the disease can be diagnosed in living animals, in particular in animals that may be at increased risk for the disease, e.g., cohort animals, such as cows in the same herd as an infected animal.
[0042] The invention provides a method for diagnosing an increased risk for BSE by amplification and analysis of circulating nucleic acids (CNA) from test animals.
Nucleic acids detected in the methods of the invention
[0043] Nucleic acid molecules detected in the methods of the invention may be free, single or double stranded, molecules or complexed with protein or lipids or both. The detected nucleic acids can be DNA or RNA molecules. RNA molecules need not be transcribed from a gene, but can be transcribed from any sequence in the chromosomal DNA. Exemplary RNAs include miRNA, intergenic RNA, small nuclear RNA (snRNA), mRNA, tRNA, rRNA, and interference RNA (iRNA).
[0044] The nucleic acid molecules may comprise sequences transcribed from repetitive genomic sequences or intergenic or non-coding DNA in the genome of the individual from which the sample is derived. The detected nucleic acid molecules may also be the products of rearrangement of germline sequences and/or sequences introduced into the genome, e.g., exogenous viral sequences.
[0045] The method does not require knowledge of the polynucleotide sequences present in the test samples to be evaluated. Thus, a polynucleotide detected using this method may be a particular polynucleotide or may be a population of polynucleotides that are present in the sample. Furthermore, even in instances where the polynucleotide to be detected has a known sequence, the polynucleotide in a particular sample need not have that sequence, i.e., the sequence of the polynucleotide in the sample may be altered in comparison to the known sequence. Such alterations can include mutations, e.g., insertions, deletions, substitutions, and various other rearrangements. Further, the resulting amplified products may be a result of the amplification reaction and not reflect the original pool of polynucleotides.
Test samples
[0046] The test samples are typically biological samples that comprise target nucleic acids. A target nucleic acid can be from any source, but is typically from a biological sample that comprises small quantities of nucleic acid, e.g., nucleic acid samples obtained from samples that are not readily quantified by standard PCR methodology. In particular embodiments, the test sample is a nucleic acid, e.g., RNA or DNA that is isolated from serum or plasma.
SNP/SNV Detection Reactions
[0047] Detection techniques for evaluating nucleic acids for the presence of a SNP or SNV involve procedures well known in the field of molecular genetics. Further, many of the methods involve amplification of nucleic acids. Ample guidance for performing such techniques is provided in the art. Exemplary references include manuals such as PCR Technology: Principles and Applications for DNA Amplification (ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (eds. Innis, et al., Academic Press, San Diego, Calif., 1990); Current Protocols in Molecular Biology, Ausubel, 1994-1999, including supplemental updates through April 2004; Sambrook & Russell, Molecular Cloning, A Laboratory Manual (3rd Ed, 2001). In addition, microarrays can be utilized for genomewide SNP detection assays (Genomewide SNP assay reveals mutations underlying Parkinson disease. Simon-Sanchez J, Scholz S, Del Mar Matarin M, Fung HC, Hernandez D, Gibbs JR, Britton A, Hardy J, Singleton A. Hum Mutat. 2007 Nov 9).
[0048] Although the methods typically employ PCR steps, other amplification protocols may also be used. Suitable amplification methods include ligase chain reaction (see, e.g., Wu & Wallace, Genomics 4:560-569, 1988); strand displacement assay (see, e.g., Walker et al., Proc. Natl. Acad. Sci. USA 89:392-396, 1992; U.S. Pat. No. 5,455,166); and several transcription-based amplification systems, including the methods described in U.S. Pat. Nos. 5,437,990; 5,409,818; and 5,399,491; the transcription amplification system (TAS) (Kwoh et al., Proc. Natl. Acad. Sci. USA 86:1173-1177, 1989); and self-sustained sequence replication (3SR) (Guatelli et al., Proc. Natl. Acad. Sci. USA 87:1874-1878, 1990; WO 92/08800). Alternatively, methods that amplify the probe to detectable levels can be used, such as Qβ-replicase amplification (Kramer & Lizardi, Nature 339:401-402, 1989; Lomeli et al., Clin. Chem. 35:1826-1831, 1989). A review of known amplification methods is provided, for example, by Abramson and Myers in Current Opinion in Biotechnology 4:41-47, 1993.
[0049] Typically, the detection of a SNP/SNV is performed using oligonucleotide primers and/or probes. Oligonucleotides can be prepared by any suitable method, usually chemical synthesis. Oligonucleotides can be synthesized using commercially available reagents and instruments. Alternatively, they can be purchased through commercial sources. Methods of synthesizing oligonucleotides are well known in the art (see, e.g., Narang et al., Meth. Enzymol. 68:90-99, 1979; Brown et al., Meth. Enzymol. 68:109-151, 1979; Beaucage et al., Tetrahedron Lett. 22:1859-1862, 1981; and the solid support method of U.S. Pat. No. 4,458,066). In addition, modifications to the above-described methods of synthesis may be used to desirably impact enzyme behavior with respect to the synthesized oligonucleotides. For example, incorporation of modified phosphodiester linkages (e.g., phosphorothioate, methylphosphonates, phosphoramidate, or boranophosphate) or linkages other than a phosphorous acid derivative into an oligonucleotide may be used to prevent cleavage at a selected site. In addition, the use of 2'-amino modified sugars tends to favor displacement over digestion of the oligonucleotide when hybridized to a nucleic acid that is also the template for synthesis of a new nucleic acid strand.
[0050] Frequently used methodologies for analysis of nucleic acid samples to detect SNPs/SNVs are briefly described. However, any method known in the art can be used in the invention to detect the presence of short nucleotide substitutions.
Variant Specific Hybridization
[0051] This technique, also commonly referred to as allele specific oligonucleotide hybridization (ASO) (e.g., Stoneking et al., Am. J. Hum. Genet. 48:370-382, 1991; Saiki et al., Nature 324:163-166, 1986; EP 235,726; and WO 89/11548), relies on distinguishing between two DNA molecules differing at a polymorphic position, typically by one nucleotide, by hybridizing an oligonucleotide probe that is specific for one of the variants to an amplified product obtained from amplifying the nucleic acid sample. This method typically employs short oligonucleotides, e.g., 15-20 bases in length. The probes are designed to differentially hybridize to one variant versus another. For example, in some embodiments, probes are designed to hybridize to the version of the nucleic acid sequence that is present in normal cows. Principles and guidance for designing such probes are available in the art, e.g., in the references cited herein. Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, and preferably an essentially binary response, whereby a probe hybridizes to only one of the alleles. Some probes are designed to hybridize to a segment of target DNA such that the polymorphic site aligns with a central position (e.g., in a 15-base oligonucleotide at the 7 position; in a 16-base oligonucleotide at either the 8 or 9 position) of the probe, but this design is not required.
[0052] The amount and/or presence of an allele is determined by measuring the amount of allele-specific oligonucleotide that is hybridized to the sample. Typically, the oligonucleotide is labeled with a label such as a fluorescent label. After stringent hybridization and washing conditions, fluorescence intensity is measured for each SNP oligonucleotide.
[0053] In one embodiment, the nucleotide present at the polymorphic site is identified by hybridization under sequence-specific hybridization conditions with an oligonucleotide probe exactly complementary to one of the polymorphic alleles in a region encompassing the polymorphic site. The probe hybridizing sequence and sequence-specific hybridization conditions are selected such that a single mismatch at the polymorphic site destabilizes the hybridization duplex sufficiently so that it is effectively not formed. Thus, under sequence-specific hybridization conditions, stable duplexes will form only between the probe and the exactly complementary allelic sequence. Thus, oligonucleotides from about 10 to about 35 nucleotides in length, preferably from about 15 to about 35 nucleotides in length, which are exactly complementary to an allele sequence in a region which encompasses the polymorphic site are within the scope of the invention.
[0054] In an alternative embodiment, the nucleotide present at the polymorphic site is identified by hybridization under sufficiently stringent hybridization conditions with an oligonucleotide substantially complementary to one of the SNP/SNV alleles in a region encompassing the polymorphic site, and exactly complementary to the allele at the polymorphic site. Because mismatches which occur at non-polymorphic sites are mismatches with both allele sequences, the difference in the number of mismatches in a duplex formed with the target allele sequence and in a duplex formed with the corresponding non-target allele sequence is the same as when an oligonucleotide exactly complementary to the target allele sequence is used. In this embodiment, the hybridization conditions are relaxed sufficiently to allow the formation of stable duplexes with the target sequence, while maintaining sufficient stringency to preclude the formation of stable duplexes with non-target sequences. Under such sufficiently stringent hybridization conditions, stable duplexes will form only between the probe and the target allele. Thus, oligonucleotides from about 10 to about 35 nucleotides in length, preferably from about 15 to about 35 nucleotides in length, which are substantially complementary to an allele sequence in a region which encompasses the polymorphic site, and are exactly complementary to the allele sequence at the polymorphic site, are within the scope of the invention.
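The mismatch-counting argument in this paragraph can be checked with a short sketch. It compares an exactly complementary probe with a substantially complementary probe carrying one mismatch at a non-polymorphic site, and counts mismatches against the target and non-target alleles. All sequences are hypothetical, and the model counts mismatched positions only; it does not estimate duplex stability.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(probe, target):
    """Number of mismatched positions in the probe/target duplex."""
    bound = revcomp(probe)  # the sequence the probe pairs with perfectly
    return sum(a != b for a, b in zip(bound, target))


# Hypothetical alleles differing at position 8 (C versus T)
target_allele = "GATTACACGGATCCA"
other_allele = "GATTACATGGATCCA"

exact_probe = revcomp(target_allele)
# One deliberate mismatch at non-polymorphic position 3 (T -> A)
relaxed_probe = revcomp("GAATACACGGATCCA")

exact_counts = (mismatches(exact_probe, target_allele),
                mismatches(exact_probe, other_allele))
relaxed_counts = (mismatches(relaxed_probe, target_allele),
                  mismatches(relaxed_probe, other_allele))
print(exact_counts, relaxed_counts)
```

In both cases the non-target allele carries exactly one more mismatch than the target allele, which is the property the paragraph relies on.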
[0055] The use of substantially, rather than exactly, complementary oligonucleotides may be desirable in assay formats in which optimization of hybridization conditions is limited. For example, in a typical multi-target immobilized-probe assay format, probes for each target are immobilized on a single solid support. Hybridizations are carried out simultaneously by contacting the solid support with a solution containing target DNA. As all hybridizations are carried out under identical conditions, the hybridization conditions cannot be separately optimized for each probe. The incorporation of mismatches into a probe can be used to adjust duplex stability when the assay format precludes adjusting the hybridization conditions. The effect of a particular introduced mismatch on duplex stability is well known, and the duplex stability can be routinely both estimated and empirically determined, as described above. Suitable hybridization conditions, which depend on the exact size and sequence of the probe, can be selected empirically using the guidance provided herein and well known in the art. The use of oligonucleotide probes to detect single base pair differences in sequence is described in, for example, Conner et al., 1983, Proc. Natl. Acad. Sci. USA 80:278-282, and U.S. Pat. Nos. 5,468,613 and 5,604,099, each incorporated herein by reference.
[0056] The proportional change in stability between a perfectly matched and a single-base mismatched hybridization duplex depends on the length of the hybridized oligonucleotides. Duplexes formed with shorter probe sequences are destabilized proportionally more by the presence of a mismatch. In practice, oligonucleotides between about 15 and about 35 nucleotides in length are preferred for sequence-specific detection. Furthermore, because the ends of a hybridized oligonucleotide undergo continuous random dissociation and re-annealing due to thermal energy, a mismatch at either end destabilizes the hybridization duplex less than a mismatch occurring internally. Preferably, for discrimination of a single base pair change in target sequence, the probe sequence is selected which hybridizes to the target sequence such that the polymorphic site occurs in the interior region of the probe.
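Placing the polymorphic site in the interior of the probe can be sketched as a simple window selection. The example below assumes a 0-based site index and centers the site at offset `length // 2` of the window; the sequence, the index, and the function names are hypothetical, and the probe is taken as the reverse complement of the selected target window.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def centered_window(sequence, site, length=15):
    """Return (target window, offset) with the 0-based `site` at the middle."""
    offset = length // 2
    start = site - offset
    end = start + length
    if start < 0 or end > len(sequence):
        raise ValueError("site is too close to an end for this probe length")
    return sequence[start:end], offset


# Hypothetical target region with the polymorphic base at index 13
region = "TTGACCGATTACACGGATCCATGCA"
window, offset = centered_window(region, site=13, length=15)
probe = revcomp(window)
print(window, offset, probe)
```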
[0057] Suitable assay formats for detecting hybrids formed between probes and target nucleic acid sequences in a sample are known in the art and include the immobilized target (dot-blot) format and immobilized probe (reverse dot-blot or line-blot) assay formats. Dot blot and reverse dot blot assay formats are described in U.S. Pat. Nos. 5,310,893; 5,451,512; 5,468,613; and 5,604,099; each incorporated herein by reference.
[0058] In a dot-blot format, amplified target DNA is immobilized on a solid support, such as a nylon membrane. The membrane-target complex is incubated with labeled probe under suitable hybridization conditions, unhybridized probe is removed by washing under suitably stringent conditions, and the membrane is monitored for the presence of bound probe. A preferred dot-blot detection assay is described in the examples.
[0059] In the reverse dot-blot (or line-blot) format, the probes are immobilized on a solid support, such as a nylon membrane or a microtiter plate. The target DNA is labeled, typically during amplification by the incorporation of labeled primers. One or both of the primers can be labeled. The membrane-probe complex is incubated with the labeled amplified target DNA under suitable hybridization conditions, unhybridized target DNA is removed by washing under suitably stringent conditions, and the membrane is monitored for the presence of bound target DNA. A preferred reverse line-blot detection assay is described in the examples.
[0060] An allele-specific probe that is specific for one of the polymorphism variants is often used in conjunction with the allele-specific probe for the other polymorphism variant. In some embodiments, the probes are immobilized on a solid support and the target sequence in an individual is analyzed using both probes simultaneously. Examples of nucleic acid arrays are described by WO 95/11995. The same array or a different array can be used for analysis of characterized polymorphisms.
Allele-Specific Primers
[0061] Polymorphisms are also commonly detected using allele-specific amplification or primer extension methods. These reactions typically involve use of primers that are designed to specifically target a polymorphism via a mismatch at the 3' end of a primer. The presence of a mismatch affects the ability of a polymerase to extend a primer when the polymerase lacks error-correcting activity. The presence of the particular allele can be determined by the ability of the primer to initiate extension. If the 3' terminus is mismatched, the extension is impeded. Thus, for example, if a primer matches the "C" allele nucleotide at the 3' end, the primer will be efficiently extended.
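The 3'-terminal match rule can be expressed as a small check. This sketch assumes a forward primer written in the same sense as the strand shown, with its 3' end positioned on the polymorphic base; it predicts extension only from the terminal base and ignores polymerase, buffer, and temperature effects. The sequences, the short primer length, and the function name are hypothetical.

```python
def three_prime_matches(primer, sense_strand, site):
    """True if the primer's 3' terminal base matches the allele at `site`.

    primer       : forward primer, 5'->3', same sense as `sense_strand`
    sense_strand : target sequence, 5'->3'
    site         : 0-based index of the polymorphic base
    """
    start = site - len(primer) + 1
    if start < 0:
        raise ValueError("primer extends past the start of the sequence")
    return primer[-1] == sense_strand[site]


# Hypothetical alleles differing at index 7 (C versus T)
allele_c = "GATTACACGGATCCA"
allele_t = "GATTACATGGATCCA"
primer_c = "GATTACAC"  # shortened for illustration; real primers are longer

extends_on_c = three_prime_matches(primer_c, allele_c, site=7)
extends_on_t = three_prime_matches(primer_c, allele_t, site=7)
print(extends_on_c, extends_on_t)
```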
[0062] Typically, the primer is used in conjunction with a second primer in an amplification reaction. The second primer hybridizes at a site unrelated to the polymorphic position. Amplification proceeds from the two primers leading to a detectable product signifying the particular allelic form is present. Allele-specific amplification- or extension-based methods are described in, for example, WO 93/22456; U.S. Pat. Nos. 5,137,806; 5,595,890; 5,639,611; and U.S. Pat. No. 4,851,331.
[0063] Using allele-specific amplification-based genotyping, identification of the alleles requires only detection of the presence or absence of amplified target sequences. Methods for the detection of amplified target sequences are well known in the art. For example, gel electrophoresis and probe hybridization assays described are often used to detect the presence of nucleic acids.
[0064] An alternative probe-less method, in which the amplified nucleic acid is detected by monitoring the increase in the total amount of double-stranded DNA in the reaction mixture, is described, e.g., in U.S. Pat. No. 5,994,056; and European Patent Publication Nos. 487,218 and 512,334. The detection of double-stranded target DNA relies on the increased fluorescence that various DNA-binding dyes, e.g., SYBR Green, exhibit when bound to double-stranded DNA.
[0065] As appreciated by one in the art, allele-specific amplification methods can be performed in reactions that employ multiple allele-specific primers to target particular alleles. Primers for such multiplex applications are generally labeled with distinguishable labels or are selected such that the amplification products produced from the alleles are distinguishable by size. Thus, for example, both alleles in a single sample can be identified using a single amplification by gel analysis of the amplification product.
[0066] As in the case of allele-specific probes, an allele-specific oligonucleotide primer may be exactly complementary to one of the polymorphic alleles in the hybridizing region or may have some mismatches at positions other than the 3' terminus of the oligonucleotide, which mismatches occur at non-polymorphic sites in both allele sequences.
5'-nuclease assay
[0067] Identification of the presence of a polymorphism can also be performed using a "TaqMan®" or "5'-nuclease assay", as described in U.S. Pat. Nos. 5,210,015; 5,487,972; and 5,804,375; and Holland et al., 1988, Proc. Natl. Acad. Sci. USA 88:7276-7280. In the TaqMan® assay, labeled detection probes that hybridize within the amplified region are added during the amplification reaction. The probes are modified so as to prevent the probes from acting as primers for DNA synthesis. The amplification is performed using a DNA polymerase having 5' to 3' exonuclease activity. During each synthesis step of the amplification, any probe which hybridizes to the target nucleic acid downstream from the primer being extended is degraded by the 5' to 3' exonuclease activity of the DNA polymerase. Thus, the synthesis of a new target strand also results in the degradation of a probe, and the accumulation of degradation product provides a measure of the synthesis of target sequences.
[0068] The hybridization probe can be an allele-specific probe that discriminates between the SNP alleles. Alternatively, the method can be performed using an allele-specific primer and a labeled probe that binds to amplified product.
[0069] Any method suitable for detecting degradation product can be used in a 5' nuclease assay. Often, the detection probe is labeled with two fluorescent dyes, one of which is capable of quenching the fluorescence of the other dye. The dyes are attached to the probe, preferably one attached to the 5' terminus and the other is attached to an internal site, such that quenching occurs when the probe is in an unhybridized state and such that cleavage of the probe by the 5' to 3' exonuclease activity of the DNA polymerase occurs in between the two dyes. Amplification results in cleavage of the probe between the dyes with a concomitant elimination of quenching and an increase in the fluorescence observable from the initially quenched dye. The accumulation of degradation product is monitored by measuring the increase in reaction fluorescence. U.S. Pat. Nos. 5,491,063 and 5,571,673, both incorporated herein by reference, describe alternative methods for detecting the degradation of probe which occurs concomitant with amplification.
DNA Sequencing and single base extensions
[0070] SNPs/SNVs can also be detected by direct sequencing. Methods include, e.g., dideoxy sequencing-based methods and other methods such as Maxam and Gilbert sequencing (see, e.g., Sambrook and Russell, supra).
[0071] Other detection methods include Pyrosequencing™ of oligonucleotide-length products. Such methods often employ amplification techniques such as PCR. For example, in pyrosequencing, a sequencing primer is hybridized to a single stranded, PCR-amplified, DNA template; and incubated with the enzymes, DNA polymerase, ATP sulfurylase, luciferase and apyrase, and the substrates, adenosine 5' phosphosulfate (APS) and luciferin. The first of four deoxynucleotide triphosphates (dNTP) is added to the reaction. DNA polymerase catalyzes the incorporation of the deoxynucleotide triphosphate into the DNA strand, if it is complementary to the base in the template strand. Each incorporation event is accompanied by release of pyrophosphate (PPi) in a quantity equimolar to the amount of incorporated nucleotide. ATP sulfurylase quantitatively converts PPi to ATP in the presence of adenosine 5' phosphosulfate. This ATP drives the luciferase-mediated conversion of luciferin to oxyluciferin that generates visible light in amounts that are proportional to the amount of ATP. The light produced in the luciferase-catalyzed reaction is detected by a charge coupled device (CCD) camera and seen as a peak in a pyrogram™. Each light signal is proportional to the number of nucleotides incorporated. Apyrase, a nucleotide degrading enzyme, continuously degrades unincorporated dNTPs and excess ATP. When degradation is complete, another dNTP is added.
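The dispensation cycle described above, in which each light signal is proportional to the number of nucleotides incorporated, can be simulated with an idealized sketch. It assumes a noise-free signal of one unit per incorporated nucleotide and takes as input the strand being synthesized (the complement of the template downstream of the sequencing primer). The function name, the sequence, and the dispensation order are hypothetical.

```python
def pyrogram(synthesized, dispensation_order):
    """Return a list of (dNTP, peak height) for each dispensation.

    synthesized        : sequence of the strand to be built, 5'->3'
    dispensation_order : dNTPs in the order they are added, e.g. "ACGTACGT"
    """
    position = 0
    peaks = []
    for nucleotide in dispensation_order:
        incorporated = 0
        # A homopolymer run is incorporated in a single dispensation
        while position < len(synthesized) and synthesized[position] == nucleotide:
            incorporated += 1
            position += 1
        peaks.append((nucleotide, incorporated))
    return peaks


peaks = pyrogram("GGATCC", "ACGT" * 3)
heights = [height for _, height in peaks]
print(heights)
```

Reading the non-zero peaks in order, with each height giving the run length, recovers the synthesized sequence.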
[0072] Another similar method for characterizing SNPs/SNVs does not require use of a complete PCR, but typically uses only the extension of a primer by a single, fluorescence-labeled dideoxynucleotide (ddNTP) that is complementary to the nucleotide to be investigated. The nucleotide at the polymorphic site can be identified via detection of a primer that has been extended by one base and is fluorescently labeled (e.g., Kobayashi et al., Mol. Cell. Probes, 9:175-182, 1995).
Denaturing Gradient Gel Electrophoresis
[0073] Amplification products generated using the polymerase chain reaction can be analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can be identified based on the different sequence-dependent melting properties and electrophoretic migration of DNA in solution (see, e.g., Erlich, ed., PCR Technology, Principles and Applications for DNA Amplification, W. H. Freeman and Co, New York, 1992, Chapter 7).
Single-Strand Conformation Polymorphism Analysis
[0074] Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single-stranded PCR products, as described, e.g., in Orita et al., Proc. Nat. Acad. Sci. 86, 2766-2770 (1989). Amplified PCR products can be generated as described above, and heated or otherwise denatured, to form single-stranded amplification products. Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence. The different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence differences between alleles of target sequences.
[0075] SNP detection methods often employ labeled oligonucleotides. Oligonucleotides can be labeled by incorporating a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. Useful labels include fluorescent dyes, radioactive labels, e.g., 32P, electron-dense reagents, enzymes, such as peroxidase or alkaline phosphatase, biotin, or haptens and proteins for which antisera or monoclonal antibodies are available. Labeling techniques are well known in the art (see, e.g., Current Protocols in Molecular Biology, supra; Sambrook & Russell, supra).
Detection of a BSE animal
[0076] In typical embodiments, a BSE animal is detected by detecting the presence of any one of the 420 polymorphisms set forth in Table 1, or detecting a combination of those polymorphisms. Analysis is generally performed by querying more than one of the 420 variant positions. Thus, anywhere from 1 to all of the 420 variant positions set forth in Table 1 can be analyzed to detect BSE. Generally at least 10, 20, 30, 40, 50, 60, 70, or 100 or more positions are analyzed.
[0077] Table 1 and Table 2 provide a description of SNPs found only in BSE animals. Table 1 provides details of SNPs found only in BSE animals. Column A contains the sequence identification tags for each reference query sequence as described in Table 2. Column B describes the position of the SNP in the sequence referred to in Column A. Column C is the consensus sequence derived from the database of sequences of normal and BSE animals and designates the position (shown in a capital letter) at which a diagnostic SNP can be found in sequences from BSE animals only. Column E is the Sequence ID number for those sequences found in Column F. Column F is the consensus sequence derived from the database of sequences of BSE animals and designates the actual polymorphism (shown in a capital letter) found in sequences from BSE animals only: a "-" designates a deletion, an "N" designates an insertion. Column G is the Sequence ID number for those sequences found in Column H. Column H is an alternative consensus sequence to that found in Column F and derived from the database of sequences of BSE animals and designates the actual polymorphism (shown in a capital letter) found in sequences from BSE animals only: a "-" designates a deletion, an "N" designates an insertion. The position of the polymorphism is determined with reference to one of reference sequences SEQ ID NOs 1-41. Thus, the number of the position indicates where that position occurs in the context of the reference sequence (i.e., one of SEQ ID NOs: 1-41). An insertion occurs after the designated position.
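The Table 1 notation described above (a capital letter marking the query position, "-" for a deletion, "N" for an insertion after the designated position) can be read programmatically. The following Python sketch is illustrative only; the function name `classify_variant` and its return format are assumptions and do not appear in the specification. It assumes exactly one capital letter in the Column C oligo and that an insertion is written as a capital "N" immediately after the capitalized base.

```python
def classify_variant(consensus_oligo, snp_oligo):
    """Classify a Table 1 entry from its Column C and Column F (or H) oligos.

    Hypothetical helper. Returns (kind, consensus_base, variant_base), where
    kind is "substitution", "deletion" or "insertion".
    """
    # The query position is the only capital letter in the consensus oligo.
    idx = next(i for i, ch in enumerate(consensus_oligo) if ch.isupper())
    consensus_base = consensus_oligo[idx]

    # "-" at the query position designates a deletion.
    if snp_oligo[idx] == "-":
        return ("deletion", consensus_base, None)

    # A capital "N" directly after the query position designates an
    # insertion occurring after that position.
    if len(snp_oligo) > idx + 1 and snp_oligo[idx + 1] == "N":
        return ("insertion", consensus_base, "N")

    # Otherwise the capital letter in the SNP oligo is the variant base.
    return ("substitution", consensus_base, snp_oligo[idx])


if __name__ == "__main__":
    # Entries taken from Table 1 (Seq. ID: 1, positions 310, 410 and 247).
    print(classify_variant("gatctggacaggAgggtcgactccc", "gatctggacaggCgggtcgactccc"))
    print(classify_variant("acgtggcctcgtGggtggttccaca", "acgtggcctcgt-ggtggttccaca"))
    print(classify_variant("ctccctcgagagGaacacygaggtt", "ctccctcgagagGNaacacygaggtt"))
```

Lower-case letters other than a, c, g and t in the oligos (e.g., m, r, y, s) are treated here as opaque characters; the sketch does not interpret them.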
[0078] Table 2 provides a summary table of queried sequences containing diagnostic SNPs found only in BSE animals. Column A contains the sequence identification tags for each query sequence. Column B contains the repetitive element nomenclature that has the highest homology with the query sequence when applicable. Column C is the length of the query sequence. Column D contains the percentage of the query length that has homology (>70%) to the reference repetitive element in Column B. Column E contains, where appropriate, the BLAST reference of the query sequence when searching against the cow genome. Column F contains the percentage of the query length, where appropriate, that has homology (>70%) to the BLAST reference in Column E. Column G contains the highest number of individual sequences in the database, derived from the ultra deep sequencing of both BSE and normal animals, at positions that contain SNPs found only in BSE animals. Column H contains the total number of significant SNPs in the queried sequence found only in BSE animals. Column I contains the maximum number of BSE animals, out of a total of 15 BSE animals but not any normal control animals, that can be detected when using a single SNP from Column H. Column J contains the maximum number of BSE animals, out of a total of 15 BSE animals but not any normal control animals, that can be detected when using a combination of SNPs from Column H. Column K refers to the Sequence Number of oligos (as detailed in Table 1) that are located within the query sequence of Column A and containing the SNP position referred to in Column H.
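The difference between Column I (best single SNP) and Column J (combination of SNPs) can be illustrated with a short Python sketch. It assumes, as one plausible reading of the column definitions, that an animal counts as detected by a combination when it carries at least one of the SNPs. The function name `detection_coverage`, the SNP labels, and the animal identifiers are hypothetical and are not data from the study.

```python
def detection_coverage(snp_to_animals):
    """Return (best_single, combined) animal counts (hypothetical helper).

    snp_to_animals: dict mapping a SNP label to the set of BSE animals in
        which that SNP was observed.
    best_single: largest number of animals detected by any one SNP.
    combined: number of animals detected by at least one SNP.
    """
    best_single = max((len(animals) for animals in snp_to_animals.values()), default=0)
    combined = len(set().union(*snp_to_animals.values()))
    return best_single, combined


if __name__ == "__main__":
    # Made-up example: three SNPs in one query sequence, animals numbered 1-5.
    example = {"snp_a": {1, 2, 3}, "snp_b": {3, 4}, "snp_c": {5}}
    print(detection_coverage(example))
```

Under this reading, the combined count can never be smaller than the best single-SNP count, which is consistent with querying more than one variant position.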
[0079] As noted above, a polymorphic position described herein, (i.e., a query position) can be evaluated using sequencing or any number of methods employing oligonucleotides that are competent to discriminate between the residue(s) present in the reference sequence and the indicated polymorphism present only in BSE animals. Such oligonucleotides can bind selectively to the normal sequence or in some embodiments, are designed to bind selectively to the variant sequence known to be associated with BSE. Exemplary oligonucleotides that discriminate between the reference sequence and BSE-associated variant sequence are provided in Table 1.
[0080] In some embodiments, a BSE animal is detected by sequence analysis of one or more polymorphic positions.
EXAMPLES
Example 1. Detection of polymorphisms to detect animals with BSE
[0081] This example describes detection of SNP/SNVs associated with BSE. Samples were obtained from an experimental study whereby cows were inoculated orally with BSE-infectious or control brain material. Fleckvieh/Brown Swiss cattle were fed 100 g of either PrPres-positive brain stem macerate or normal brain material (controls). Serum samples were taken 40 months post-inoculation (15 infected, 6 control non-infected and 12 randomly selected normal animals).
Experimental Protocol
[0082] Serum collection: Special care was taken in collection, processing and storage of serum samples. Blood from the tail vein or artery was collected into 18 mL plastic tubes equipped with a coagulation accelerator and kept at room temperature for 30 min to ensure proper coagulation. Until further processing, the tubes were stored at 2 - 8 °C for not longer than 24 hours. Centrifugation was done at 2 - 8 °C, 1000 x g for 15 min. The serum supernatant was transferred into 1.5 mL microcentrifuge cups in 0.5 mL aliquots and frozen immediately at -20 °C or -80 °C until use.
[0083] Preparation of serum fractions: Frozen serum was thawed at 4 °C in an ice-water bath or refrigerator and 250 μL were transferred into a 1.5 mL microcentrifuge tube. The tube was centrifuged at 4,000 x g for 25 min at 4 °C in a Model 5214 bench top centrifuge (Eppendorf, Hamburg, Germany) to remove cell debris. The supernatant was transferred into a fresh tube and subjected to 35 min centrifugation at 20,000 x g. The supernatant was carefully removed and the pellet was used for further analyses.
[0084] Nucleic acid extraction: 20,000 x g pellets were dissolved in 5 μl 1x PBS and subsequently lysed by adding 7.5 μl of Solution D2 (0.4 M KOH, 0.01 M EDTA, 0.08 M DTT). Samples were mixed by pipetting up and down five times. Samples were kept on ice for ten minutes. Samples were neutralized by adding 7.5 μl of Solution B (0.4 M HCl, 0.6 M Tris/HCl pH 7.5). Samples were either further purified using the PLG (light) tubes (2 mL capacity; Eppendorf) according to the manufacturer's instructions or used directly in subsequent steps.
[0085] Alternatively, total nucleic acids from whole serum were extracted using the High Pure Viral Nucleic Acids Extraction Kit (Roche) according to the manufacturer's instructions except for the use of poly A-RNA as a carrier.
[0086] Whole DNA amplification: 1 μl of the extracted serum DNA solution was amplified using WGA4 Kit (Sigma) according to the manufacturer's protocol. In preparation for sequencing, samples were labelled using identifier lead sequences for later tracking of the individual samples.
[0087] Ultra deep sequencing of products from steps above was performed using a Roche/454 genome sequencer (GS20/GSFLX) with system reagents according to the manufacturer's instructions.
[0088] Raw sequences were processed using SeqMan (DNAStar Inc., Madison WI, USA). Briefly, after trimming for the used adapters/primers, an automatic contig assembly was performed in order to reduce redundancy within the dataset. A total of 227283 sequences were assembled into 26673 contigs. Out of this initial assembly 138 contigs assembled from more than 49 sequences each were selected. Sequential local alignments using the Blast program (Altschul, et al. Nucleic Acids Res 25(17): 3389-3402, 1997) were undertaken to discover sharing of domains between the contig sequences. Parameters used were -r 1 -q -3 -G 5 -E 2 -W 7 -F "m D" -e 0.01. Using this local alignment approach the sequence redundancy within the contigs was further reduced.
[0089] Blast analysis: A total of 117 contigs were compared against a database containing 808,634 sequences (total letters: 86,785,049) using Blast. Database sequences segregate into 410,984 sequences from 15 animals artificially infected with BSE and 397,650 sequences from 18 un-infected controls.
[0090] SNV mining: Out of the 117 initial contigs, 70 gave significant hits when compared to the aforementioned database. Variations between database sequences were extracted from the Blast output for each contig query that gave more than five hits. Variations were tested for distribution differences between the BSE and control groups by chi-square test. For SNV positions displaying a specificity of 100%, flanking nucleotides (12 bp in each direction) were extracted from the contig sequence, which served as query in the Blast analysis.
Results
[0091] The ultra deep sequencing approach generated 41 sequences in which SNP/SNVs were found only in animals with BSE and not in normal controls. The sequences were mostly derived from repetitive genomic sequences wherein most of the prevalent sequences had homology to bovine L1 LINE or SINE repetitive elements. Two sequences showed neither repetitive nor coding sequence homology. Table 1 shows that a total of 421 SNPs could be identified from the 41 sequences. These data suggest that multiple SNPs/SNVs may be involved in defining the difference between normal and BSE infected cattle.
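The selection step described in paragraph [0090] (a chi-square test on the distribution of a variation, followed by extraction of 12 flanking nucleotides in each direction) can be sketched in Python as follows. This is a hedged illustration, not the analysis code used in the study. The function names, the use of an uncorrected Pearson chi-square on a 2x2 table of read counts, and the 1-based position convention are all assumptions.

```python
import math


def chi_square_2x2(bse_variant, bse_reference, control_variant, control_reference):
    """Pearson chi-square (1 degree of freedom, no continuity correction).

    Hypothetical helper. Arguments are read counts; returns (statistic, p_value).
    """
    a, b, c, d = bse_variant, bse_reference, control_variant, control_reference
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        # A row or column total is zero, so no association can be assessed.
        return 0.0, 1.0
    statistic = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


def flanked_oligo(contig, position, flank=12):
    """Return the query base (capitalized) with up to `flank` bases per side.

    Hypothetical helper. `position` is assumed to be 1-based.
    """
    index = position - 1
    sequence = contig.lower()
    start = max(0, index - flank)
    return sequence[start:index] + sequence[index].upper() + sequence[index + 1:index + 1 + flank]


if __name__ == "__main__":
    # Made-up counts: variant seen in 10 BSE reads and in no control reads.
    print(chi_square_2x2(10, 0, 0, 10))
    print(flanked_oligo("a" * 12 + "g" + "c" * 12, 13))
```

With 12 bases on each side of the query position, the extracted oligo is 25 nucleotides long, which matches the length of most oligos listed in Table 1.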
Discussion
[0092] There is a dearth of literature on the detection of CNAs, with or without polymorphisms, that can be used as a diagnostic test with acceptable performance criteria (reviewed in Fleischhaker and Schmidt, 2007). The majority of studies to date focus on coding genes that have putative roles in a suspected disease process. In the study described here, all CNAs were analyzed without gene preference bias. SNPs/SNVs were determined from the presence of multiple occurrences from ultra deep sequencing.
[0093] The presence of CNA in the sera of cattle was reported previously (e.g., Schutz et al., CDLI). The study described here was conducted to determine whether SNPs or SNVs could be used to detect the presence of BSE in living cattle. BSE was evaluated because BSE is a naturally-occurring veterinary chronic illness and clinical experimental samples with sufficient volumes, e.g., sera, are available.
[0094] The above examples are provided to illustrate the invention but not to limit its scope. Other variants of the invention will be readily apparent to one of ordinary skill in the art and are encompassed by the appended claims.
[0095] All publications, patents, accession numbers, and patent applications cited herein are hereby incorporated by reference for all purposes.
Table 1
A B C E F G H
Name SNP Position Consensus Oligo SNPOligo ID SNP-Oligo1 SNPOligo ID SNP-Oligo2
Seq. ID: 1 157 gcctmrggagmaGtcycrcgttagg Seq. ID: 50 gcctmrggasmaTtcycrcgttagg Seq. ID: 464 gcctmrvgasmaTtcycrcgwymks
Seq. ID: 1 239 atcccggtctccCtcgagaggaaca Seq. ID: 68 atcccggtctccGtcgagaggaaca Seq. ID: 465 atmccggtctccGtcgagaggaaca
Seq. ID: 1 247 ctccctcgagagGaacacygaggtt Seq. ID: 70 ctccctcgagagGNaacacygaggtt
Seq. ID: 1 259 gaacacygaggtTttccggcaccmc Seq. ID: 71 gaacacygaggtAttccggcaccmc
Seq. ID: 1 267 aggttttccggcAccmcytcctctg Seq. ID: 72 aggttttccggc-ccmcytcctctg Seq. ID: 466 aggttttccggc-ccmcctcctctg
Seq. ID: 1 310 gatctggacaggAgggtcgactccc Seq. ID: 73 gatctggacaggCgggtcgactccc
Seq. ID: 1 400 gacattccagacGtggcctcgtggg Seq. ID: 74 gacattccagacTtggcctcgtggg
Seq. ID: 1 410 acgtggcctcgtGggtggttccaca Seq. ID: 75 acgtggcctcgt-ggtggttccaca
Seq. ID: 1 419 cgtgggtggttcCacattccgtagg Seq. ID: 76 cgtgggtggttc-acattccgtagg
Seq. ID: 1 471 aagaacccgatgCccggacacctct Seq. ID: 77 aagaacccgatgGccggacacctct
Seq. ID: 1 493 tctccgaactccAscctgtgaatgm Seq. ID: 78 tctccgaactcc-scctgtgaatgm Seq. ID: 467 tctccgaactcc-ccctgtgaatga
Seq. ID: 1 505 ascctgtgaatgMagtcaacacgaa Seq. ID: 79 ascctgtgaatgTagtcaacacgaa Seq. ID: 468 accctgtgaatgTagtcaacacgaa
Seq. ID: 1 565 accccaggttccAaatacagctcga Seq. ID: 80 accccaggttcc-aatacagctcga
Seq. ID: 1 578 aatacagctcgaCaagcggcctctc Seq. ID: 81 aatacagctcgaCNaagcggcctctc Seq. ID: 469 aatacagctcgaCNaagyggcctstc
Seq. ID: 1 582 cagctcgacaagCggcctctctccc Seq. ID: 82 cagctcgacaagCNggcctctctccc
Seq. ID: 1 669 ctgtccccagtcTgcagggaccctg Seq. ID: 83 ctgtccccagtcTNgcagggaccctg
Seq. ID: 1 720 cctgmggttcctGcctcaactggag Seq. ID: 84 cctgmggttcct-cctcaactggag Seq. ID: 470 ccygmgvttccy-cctcaactsgag
Seq. ID: 1 820 tcagagccaccaTgagaagccccct Seq. ID: 85 tcwgagccaccaGgagaagccccct Seq. ID: 471 tcwgagccaccmGgwgaagcyccct
Seq. ID: 1 856 cacaagtcgaggGaacccmgggttt Seq. ID: 86 cacaagtcgagg-aacccmgggttt Seq. ID: 472 cacaagtcgagg-aacccngrgttt
Seq. ID: 1 978 tcgcmactcgmaTggagayysgact Seq. ID: 87 tcgcmactcgmaTNggagayysgact
Seq. ID: 1 991 ggagayysgactTccctggsscmmc Seq. ID: 88 ggagayysgactTNccctggsscmmc Seq. ID: 473 ggagabcsgactTNccctggsbcmmc
Seq. ID: 1 1074 mrctcgagaamaAccmcgwgrytcc Seq. ID: 42 mrctcgagaama-ccmcgwgrytcc Seq. ID: 474 crctcgagaaca-ccccgagrytcc
Seq. ID: 1 1142 rvgasmartcycRcgwymkcwctcc Seq. ID: 43 rvgasmartcycRNcgwymkcwctcs Seq. ID: 475 gvgascartcycDNcgwyckctctcs
Seq. ID: 1 1163 ctccrarckcswMaggrbrcttgrc Seq. ID: 44 ctcsrarckcswTaggrbrcttgrc Seq. ID: 476 ctcsvarckcswTaggasvcttgrc
Seq. ID: 1 1220 tacccgtcgcgaYtcgagagcagag Seq. ID: 45 tacccgtcgcgaYNtcgagagcagag Seq. ID: 477 tmcccgtcgcnmCNtcgagagvaras
Seq. ID: 1 1303 caaccccgagatCcctgtcgcccct Seq. ID: 46 caaccccgagatCNcctgtcgcccct Seq. ID: 478 saaccccgagwtCNcctgtckcmmct
Seq. ID: 1 1336 acattggcttctGgacacaagccta Seq. ID: 47 acattggcttctGNgacacaagccta Seq. ID: 479 acmytgrcttcbBNghcdcaasycka
Seq. ID: 1 1341 ggcttctggacaCaagcctagatga Seq. ID: 48 ggcttctggacaCNaagcctagatga Seq. ID: 480 grcttcbbghcdCNaasyckagatga
Seq. ID: 1 1560 gcctmrvgasmaGtcycrcgttagg Seq. ID: 49 gcctmrvgasmaTtcycrcgttagg Seq. ID: 481 gcctmrvgasmaTtcycdcgwymks
Seq. ID: 1 1642 atcccggtctccCtcgagaggaaca Seq. ID: 51 atcccggtctccGtcgagaggaaca Seq. ID: 482 atmccggtctccGtcgagaggaaca
Seq. ID: 1 1650 ctccctcgagagGaacacygaggtt Seq. ID: 52 ctccctcgagagGNaacacygaggtt
Seq. ID: 1 1672 gttttccggcacCccctcctctgag Seq. ID: 53 gttttccggcac-ccctcctctgag
Seq. ID: 1 1713 gatctggacaggAgggtcgactccc Seq. ID: 54 gatctggacaggCgggtcgactccc
Seq. ID: 1 1803 gacattccagacGtggcctcgtggg Seq. ID: 55 gacattccagacTtggcctcgtggg
Seq. ID: 1 1813 acgtggcctcgtGggtggttccaca Seq. ID: 56 acgtggcctcgt-ggtggttccaca
Seq. ID: 1 1874 aagaacccgatgCccggacacctct Seq. ID: 57 aagaacccgatgGccggacacctct
Table 1 (continued)
Seq. ID: 1 1908 ascctgtgaatgMagtcaacacgaa Seq. ID: 58 ascctgtgaatgTagtcaacacgaa Seq. ID: 483 accctgtgaatgTagtcaacacgaa
Seq. ID: 1 1965 accccaggttccAaatacagctcga Seq. ID: 59 accccaggttcc-aatacagctcga
Seq. ID: 1 1978 aatacagctcgaCaagcggcctctc Seq. ID: 60 aatacagctcgaCNaagcggcctctc Seq. ID: 484 aatacagctcgaCNaagyggcctstc
Seq. ID: 1 2120 cctgmggttcctGcctcaactggag Seq. ID: 61 cctgmggttcct-cctcaactggag Seq. ID: 485 ccygmgvttccy-cctcaactsgag
Seq. ID: 1 2184 cgagaggcccctCccacctccagkw Seq. ID: 62 cgagasscccct-ccamckccagdw Seq. ID: 486 cgagwbscccct-ccamctcsagdw
Seq. ID: 1 2241 cccygagrtcmcYkkcrcmmstsga Seq. ID: 63 cccygagrtcmcYNkkcrcmmstsga Seq. ID: 487 ccchgrgrtchcCNkkcvcaavtcga
Seq. ID: 1 2286 cwcaascckagaH-gaggtcwrwtd Seq. ID: 64 cwcaascckagaHN-gaggtcwrwtd
Seq. ID: 1 2293 ckagah-gaggtCwrwtdsccctgc Seq. ID: 65 ckagah-gaggt-wrwtdsccctdc Seq. ID: 488 cgwsahrgassy-wghkdcccctdc
Seq. ID: 1 2294 kagah-gaggtcWrwtdsccctgcm Seq. ID: 66 kagah-gaggtc-rwtdsccctdcm
Seq. ID: 1 2379 gsscmmcacragAggcwbmctgamy Seq. ID: 67 gsscmmcacrag-ggcwbmctgamy Seq. ID: 489 gsscmmcacrag-ggcwymctgamy
Seq. ID: 1 2451 gaa-maaccccgWgrytcccccgtc Seq. ID: 69 gaa-maaccmcgWNgrytcccccgtc Seq. ID: 490 gaanmaaccccgANgaytcccccgtc
Seq. ID: 2 49 trcgagacagcaAaagagamacwga Seq. ID: 93 trcragacagcaGargagacacwga
Seq. ID: 2 75 rtadagaayagwCttwtggactytg Seq. ID: 94 gtadagaacagwTttdtggactntg Seq. ID: 491 rtagagaacaraTttdtkganwhts
Seq. ID: 2 88 ttwtggactytgTgggagagggmga Seq. ID: 95 ttdtggactntgCgggagagggaga Seq. ID: 492 ttdtkganwhtsCrggrgrdggwga
Seq. ID: 2 118 ggatgatttgrgAgaatrgcattga Seq. ID: 89 ggawgatttgggTgaatggcattga Seq. ID: 493 ggatgawttgrgTgaatrgcattga
Seq. ID: 2 200 wggatgcttgggGctggtgcactgg Seq. ID: 90 wggatgcttgggCctggtgcactgg Seq. ID: 494 wggawgcttvrrCcwrgtgcwctgk
Seq. ID: 2 241 ggtatggggaggGaggagggaggag Seq. ID: 91 ggtatggggaggTaggagggaggag Seq. ID: 495 ggkatggggaggTaggagggaggag
Seq. ID: 2 257 agggaggagggtTca Seq. ID: 92 agggaggagggtCca
Seq. ID: 3 76 taaaaactctccAgaaagyagghat Seq. ID: 112 taaaaactctccGgaaagyaggmat Seq. ID: 496 traaaacwcnmmGcaaavtagrmmt
Seq. ID: 3 108 acatacctcaacAtaataaaagcya Seq. ID: 96 acatacctcaacGtaataaaagcya Seq. ID: 497 amhkwbctcaam-nnnwrwadgnba
Seq. ID: 3 207 agacaaggrtgcCcactytcaccac Seq. ID: 97 agacaaggrtgcTcactytcaccac Seq. ID: 498 agrcaaggatgyTcaywytydcmac
Seq. ID: 3 233 hytattcaacatAgttttggargtt Seq. ID: 98 wctattcaacatGgtdttggargtt Seq. ID: 499 wcyattcaayahGgwvytggargtt
Seq. ID: 3 280 aaaaagaaataaAaggaatccaaat Seq. ID: 99 aaaaagaaataaGaggaatccaaat
Seq. ID: 3 283 aagaaataaaagGaatccaaattgg Seq. ID: 100 aagaaataaaagCaatccaaattgg
Seq. ID: 3 293 aggaatccaaatTggaaaagaagaa Seq. ID: 101 aggaatccaaatTNggaaaagaagaa
Seq. ID: 3 299 ccaaattggaaaAgaagaagtaaaa Seq. ID: 102 ccaaattggaaaTgaagaagtaaaa
Seq. ID: 3 309 aaagaagaagtaAaactctcactrt Seq. ID: 103 aaagaagaagtaTaactctcactrt
Seq. ID: 3 311 agaagaagtaaaActctcactrttt Seq. ID: 104 agaagaagtaaaCctctcactrttt
Seq. ID: 3 318 gtaaaactctcaCtrtttgcagatg Seq. ID: 105 gtaaaactctca-trtttgcagatg
Seq. ID: 3 321 aaactctcactrTttgcagatgaca Seq. ID: 106 aaactctcactrCttgcagatgaca
Seq. ID: 3 344 catgatcctmtaCatagaaaaccct Seq. ID: 107 catgatcctcta-atagaaaaccct Seq. ID: 500 catgatcctmta-atagaaaaycct
Seq. ID: 3 361 aaaaccctaaagActccaccagaaa Seq. ID: 108 aaaaccctaaagGctccaccagaaa Seq. ID: 501 aaaaycctaaagGmtccaccagaaa
Seq. ID: 3 445 aatchcttgcatTyctatacactaa Seq. ID: 109 aatcmcttgcatAcctatacactaa
Seq. ID: 3 456 ttyctatacactAayaatgaraaaa Seq. ID: 110 ttcctatacactGayaatgaraaaa
Seq. ID: 3 472 atgaraaaacagAaagagaaattaa Seq. ID: 111 atgaraaaacag-aagagaaattaa Seq. ID: 502 atgarraawcan-aagagaaattaa
Seq. ID: 4 31 aaayaaaattgaCaaaccattagcc Seq. ID: 120 aaayaaaattgaAaaaccattagcc Seq. ID: 503 aaayaaaattgaAaaachwttagcc
Seq. ID: 4 58 actcatcaagaaAmaagggagaara Seq. ID: 129 actcatcaagaaTmaagrgagaara Seq. ID: 504 actsatcaagaaTmaagrgagaara
Seq. ID: 4 64 caagaaamaaggGagaaramtcaaa Seq. ID: 130 caagaaamaagr-agaaraatcaaa Seq. ID: 505 caagaaamaagr-agaaraawcaaa
Seq. ID: 4 81 ramtcaaatmaaYaaaattagaaat Seq. ID: 131 raatcaaatcaaYNaaaattagaaat Seq. ID: 506 raawcaaatmaaYNaaaatyagaaat
Table 1 (continued)
Seq. ID: 4 115 garrtyacaacdGayamhrcagaaa Seq. ID: 113 garatyacaacaCacamyacagaaa Seq. ID: 507 gavrtyacaacwCayamcacagaaa
Seq. ID: 4 150 cataagagamtaCtatnarcaayta Seq. ID: 114 cataagagamtaGtatvarcaayta Seq. ID: 508 cataagagaytaGtakvarcaayta
Seq. ID: 4 166 narcaaytatatGccaataaaatgg Seq. ID: 115 varcaaytatatTccaataaaatgg Seq. ID: 509 varcaaytatayTccaataaaatgg
Seq. ID: 4 217 ttagaaamgtacAacytbccaaaac Seq. ID: 116 ttagaaaagtacTacyttccaarac Seq. ID: 510 ytagaaaadtwcTacyttccaarac
Seq. ID: 4 249 rgaagaaatagaAaatmtkaacaga Seq. ID: 117 ggaagaaatagaTaatmtkaacara Seq. ID: 511 rgaagaaatagaTawtmtgaacavh
Seq. ID: 4 261 aaatmtkaacagAccmatcacaagc Seq. ID: 118 aaatmtkaacarTccmathacaagy Seq. ID: 512 aawtmtgaacavTccmatbacaarh
Seq. ID: 4 299 ctgtratcaaaaAtcttccarcaaa Seq. ID: 119 ctgtaatcaraaCtctyccarcaaa Seq. ID: 513 ctgtaathaaaaCwctyccmvmaaa
Seq. ID: 4 310 aatcttccarcaAacaaaagcccag Seq. ID: 121 aatctyccarcaTacaaaagcccag Seq. ID: 514 aawctyccmvmaTabaaaagyccag
Seq. ID: 4 355 gaattctaccaaAmatttaragaag Seq. ID: 122 gaattctaycaaChatttagagaag Seq. ID: 515 raattctaycaaChatttaragaag
Seq. ID: 4 385 acacctatcctdCtcaaactcttcc Seq. ID: 123 ayacctatcctd-tcaaactcttcc Seq. ID: 516 ayachwatcctd-wyaaactmttcc
Seq. ID: 4 386 cacctatcctdcTcaaactcttcca Seq. ID: 124 yacctatcctdc-caaactcttcca Seq. ID: 517 yachwatcctdc-yaaactmttcca
Seq. ID: 4 406 ttccaraaaattGcagaggaaggwa Seq. ID: 125 ttccaraaaatt-cagaggaaggwa Seq. ID: 518 ttccaraaaatt-magaggraggaa
Seq. ID: 4 413 aaattgcagaggAaggwaaacttcc Seq. ID: 126 aaattgcagaggGaggwamacttcc Seq. ID: 519 aaattnmagaggGaggaamacttcc
Seq. ID: 4 482 acaaagayrccaCaaaaaagaaaay Seq. ID: 127 acaaagayvccaGaaaaaagaaaay Seq. ID: 520 acaaagayvhcaGaaaaaagaaaay
Seq. ID: 4 500 agaaaaytacagGccaatatcaytg Seq. ID: 128 agaaaaytacagTccaatatcactg Seq. ID: 521 agaaaaytayagTccaatatcactg
Seq. ID: 5 61 ygactgagygacTgaactgaactga Seq. ID: 132 ygactgagygacTNgaactgaactga
Seq. ID: 6 380 tgggtacagaatGggatcaggaggt Seq. ID: 150 tgggtacagaatCggatcaggaggt
Seq. ID: 6 474 ctttgctaatggAttttaagtttct Seq. ID: 151 ctttgctaatggTttttaagtttct
Seq. ID: 6 1009 cccasccttaccAggccvcagagga Seq. ID: 133 cccasccttaccGggccvcagagga
Seq. ID: 6 1060 gggtaaggaacaAggaactaacaag Seq. ID: 134 gggtaaggaacaANggaactaacaag
Seq. ID: 6 1073 ggaactaacaagCtcccaccaacca Seq. ID: 135 ggaactaacaagTtcccaccaacca Seq. ID: 522 ggaactaacaagTtcccascaacca
Seq. ID: 6 1111 gtcaacaagaggTcagcaagagatg Seq. ID: 136 gtcaacaagaggCcagcaagagatg
Seq. ID: 6 1162 aacttccagcgaGcagcaaagaccc Seq. ID: 137 aacttccagcgaAcagcaaagaccc Seq. ID: 523 aacttccagcgaAcavmaaagaccc
Seq. ID: 6 1166 tccagcgagcagCaaagacccagca Seq. ID: 138 tccagcgagcagAaaagacccagca Seq. ID: 524 tccagcgagcavAaaagacccagca
Seq. ID: 6 1199 acgaattcaatgTtacccaaaacaa Seq. ID: 139 acgaattcaatgCtacccaaaacaa Seq. ID: 525 acgaattcaatgCkasccaaaacma
Seq. ID: 6 1201 gaattcaatgttAcccaaaacaaay Seq. ID: 140 gaattcaatgttANcccaaaacaaay Seq. ID: 526 gaattcaatgtkANsccaaaacmant
Seq. ID: 6 1253 acangactgaggTtdggggtgaggc Seq. ID: 141 acatgactgaggGtdggggtgaggc
Seq. ID: 6 1282 catgtatgtgccAggatactcttaa Seq. ID: 142 catgtatgtgccGggatactcttaa
Seq. ID: 6 1529 ggtggatttctaCaaagaacatcca Seq. ID: 143 ggtggatttctaAaaagaacatcca
Seq. ID: 6 1536 ttctacaaagaaCatccatgcaaag Seq. ID: 144 ttctacaaagaaAatccatgcaaag
Seq. ID: 6 1580 gtgcacatctgcCcccwcccccccw Seq. ID: 145 gtgcacatctgcGccchccccccct Seq. ID: 527 gtgcacatctbcGccchccccccct
Seq. ID: 6 1698 ttttctaaaaaaAaraaaaacaaaa Seq. ID: 146 ttttctaaaaaaCaraaaaacaaaa Seq. ID: 528 ttttctaaaaaaCarraaaacaaaa
Seq. ID: 6 1714 aaaacaaaaarcAaattaaaaaaaa Seq. ID: 147 aaaacaaaaarcCaattaaaaaaaa Seq. ID: 529 aaaacaaaaaasCaantwnaaaaaa
Seq. ID: 6 1717 acaaaaarcaaaTtaaaaaaaaaaa Seq. ID: 148 acaaaaarcaaa-taaaaaaaaaaa Seq. ID: 530 acaaaaaasmaa-twnaaaaaaaaa
Seq. ID: 6 1720 aaaarcaaattaAaaaaaaaaaacc Seq. ID: 149 aaaarcaaatta-aaaaaaaaaacc Seq. ID: 531 aaaaasmaantw-aaaaaaaaaacc
Seq. ID: 7 23 ctactccatttcTtctaagggattc Seq. ID: 157 ctactccatttcAtctaagggattc
Seq. ID: 7 47 cytgcccacagtAgtagatataatg Seq. ID: 158 cytgcccacagtCgtagatataatg Seq. ID: 532 cttgcccacagtCgtagatataatg
Seq. ID: 7 50 gcccacagtagtAgatataatggtc Seq. ID: 159 gcccacagtagtTgatataatggtc
Seq. ID: 7 57 gtagtagatataAtggtcatctgag Seq. ID: 160 gtagtagatataCtggtcatctgag Seq. ID: 533 gtagtagatataCtggtcatctgar
Table 1 (continued)
Seq. ID: 7 74 catctgagttaaAttcacccattcc Seq. ID: 169 catctgagttaaTttcacccattcc Seq. ID: 534 catctgarttaaTttcacccattcc
Seq. ID: 7 93 cattccagtccaTtttagttcrctg Seq. ID: 171 cattccagtccaGtttagttcrctg Seq. ID: 535 cattccagtccaGtttagttcactg
Seq. ID: 7 121 cctaraatgtcgAyrttcactcttg Seq. ID: 152 cctaraatgtygTyrttcactcttg Seq. ID: 536 cctaraatgtygTygttcactcttg
Seq. ID: 7 174 rccttgattcatGgacctaacattc Seq. ID: 153 gccttgattcatTgacctaacattc Seq. ID: 537 rccttgattcatTgacctaacattc
Seq. ID: 7 176 cttgattcatggAcctaacattcca Seq. ID: 154 cttgattcatggGcctaacattcca
Seq. ID: 7 185 tggacctaacatTccaggttcctat Seq. ID: 155 tggacctaacatCccaggttcctat
Seq. ID: 7 227 agcatcrgacytTrcttcyaycacc Seq. ID: 156 agcatcrgacytCrcttcyatcacc
Seq. ID: 7 613 ctctgatgccctCttgcaacaccta Seq. ID: 161 ctctgatgccctAttgcaacaccta
Seq. ID: 7 615 ctgatgccctctTgcaacacctacc Seq. ID: 162 ctgatgccctctAgcaacacctacc
Seq. ID: 7 627 tgcaacacctacCrtcttacttggg Seq. ID: 163 tgcaacacctacGrtcttacttggg
Seq. ID: 7 635 ctaccrtcttacTtgggtttctctt Seq. ID: 164 ctaccrtcttacAtgggtttctctt
Seq. ID: 7 638 ccrtcttacttgGgtttctcttacc Seq. ID: 165 ccrtcttacttgCgtttctcttacc
Seq. ID: 7 657 cttaccttggrcGtggggtatctct Seq. ID: 166 cttaccttggrcCtggggtatctct
Seq. ID: 7 658 ttaccttggrcgTggggtatctctt Seq. ID: 167 ttaccttggrcgTNggggtatctctt Seq. ID: 538 ttaccttggrcvTNggggtatctctt
Seq. ID: 7 678 ctcttcacggctGctccagcaaagc Seq. ID: 168 ctcttcacggctCctccagcaaagc Seq. ID: 539 ctcttcacrgctCctccagcaaagc
Seq. ID: 7 759 accttsaacgtgGrrtagctcctct Seq. ID: 170 accttsaacgtgTrrtagctcctct Seq. ID: 540 accttsraygtgTrrtagctcctct
Seq. ID: 8 147 ggtgcctgactgStgcctcagctca Seq. ID: 172 ggtgcctgactg-tgcctcagctca Seq. ID: 541 ggyrcctkactg-ngcytcagctca
Seq. ID: 8 148 gtgcctgactgsTgcctcagctcag Seq. ID: 173 gtgcctgactgs-gcctcagctcag Seq. ID: 542 gyrcctkactg--gcytcagctcag
Seq. ID: 8 178 gccttcccttcgGacctaggagcca Seq. ID: 174 gccttcccttcgAacctaggagcca Seq. ID: 543 gccttyccntysAwmctaggacnna
Seq. ID: 8 180 cttcccttcggaCctaggagccagg Seq. ID: 175 cttcccttcggaActaggagccagg Seq. ID: 544 cttyccntysnwActaggacnnagg
Seq. ID: 8 188 cggacctaggagCcaggcgtccttg Seq. ID: 176 cggacctaggagAcaggcgtccttg Seq. ID: 545 ysnwmctaggacAnaggmswnyttg
Seq. ID: 8 189 ggacctaggagcCaggcgtccttgc Seq. ID: 177 ggacctaggagcAaggcgtccttgc Seq. ID: 546 snwmctaggacnAaggmswnyttgs
Seq. ID: 8 194 taggagccaggcGtccttgcagagg Seq. ID: 178 taggagccaggcCtccttgcagagg Seq. ID: 547 taggacnnaggmCwnyttgsagarg
Seq. ID: 8 201 caggcgtccttgCagagggcttagg Seq. ID: 179 caggcgtccttgGagagggcttagg Seq. ID: 548 naggmswnyttgGagarggsywagg
Seq. ID: 8 205 cgtccttgcagaGggcttaggggct Seq. ID: 180 cgtccttgcagaAggcttaggggct Seq. ID: 549 mswnyttgsagaAggsywagggkct
Seq. ID: 8 208 ccttgcagagggCttaggggctggg Seq. ID: 181 ccttgcagagggGttaggggctggg Seq. ID: 550 nyttgsagarggGywagggkctksd
Seq. ID: 8 215 gagggcttagggGctgggggtgtgc Seq. ID: 182 gagggcttagggTctgggggtgtgc Seq. ID: 551 garggsywagggTctksdkbtbtvc
Seq. ID: 8 229 tgggggtgtgcaCactgccaaagcc Seq. ID: 183 tgggggtgtgcaAactgccaaagcc Seq. ID: 552 tksdkbtbtvcaAacngccaargsc
Seq. ID: 8 232 gggtgtgcacacTgccaaagcctcc Seq. ID: 184 gggtgtgcacacCgccaaagcctcc Seq. ID: 553 dkbtbtvcanacCgccaargsctmc
Seq. ID: 8 238 gcacactgccaaAgcctcccgcggg Seq. ID: 185 gcacactgccaaGgcctcccgcggg Seq. ID: 554 vcanacngccaaGgsctmcngcaat
Seq. ID: 8 240 acactgccaaagCctcccgcgggga Seq. ID: 186 acactgccaaagGctcccgcgggga Seq. ID: 555 anacngccaargGctmcngcaatga
Seq. ID: 8 243 ctgccaaagcctCccgcggggagag Seq. ID: 187 ctgccaaagcctAccgcggggagag Seq. ID: 556 cngccaargsctAcngcaatgagav
Seq. ID: 8 249 aagcctcccgcgGggagagctcttt Seq. ID: 188 aagcctcccgcgAggagagctcttt Seq. ID: 557 argsctmcngcaAtgagavctntbt
Seq. ID: 8 250 agcctcccgcggGgagagctctttc Seq. ID: 189 agcctcccgcggTgagagctctttc Seq. ID: 558 rgsctmcngcaaTgagavctntbtc
Seq. ID: 8 255 cccgcggggagaGctctttccgagg Seq. ID: 190 cccgcggggagaCctctttccgagg Seq. ID: 559 mcngcaatgagaCctntbtcmgagg
Seq. ID: 8 258 gcggggagagctCtttccgagggaa Seq. ID: 191 gcggggagagctGtttccgagggaa Seq. ID: 560 gcaatgagavctGtbtcmgaggkaa
Seq. ID: 8 263 gagagctctttcCgagggaaaggaa Seq. ID: 192 gagagctctttcAgagggaaaggaa Seq. ID: 561 gagavctntbtcAgaggkaarggaa
Seq. ID: 8 281 aaaggaagcgccCcaggctgcattc Seq. ID: 193 aaaggaagcgccAcaggctgcattc Seq. ID: 562 aarggaagvgynAnaggmtgsnttc
Seq. ID: 8 286 aagcgccccaggCtgcattcccaag Seq. ID: 194 aagcgccccaggAtgcattcccaag Seq. ID: 563 aagvgynmnaggAtgsnttcccaag
Table 1 (continued)
Seq. ID: 8 289 cgccccaggctgCattcccaagctg Seq. ID: 195 cgccccaggctgGattcccaagctg Seq. ID: 564 vgynmnaggmtgGnttcccaagcwg
Seq. ID: 8 300 gcattcccaagcTggtgcttccccg Seq. ID: 196 gcattcccaagcAggtgcttccccg Seq. ID: 565 gsnttcccaagcAggtgcntssyca
Seq. ID: 9 34 tctccaaagaagAcatacagatggc Seq. ID: 202 tctccaaagaagTcatacagatggc
Seq. ID: 9 52 agatggcyaacaAacacatgaaaag Seq. ID: 207 agatggcyaacaTrcacatgaaaag Seq. ID: 566 avatgrcyaahaTryacatgaaaar
Seq. ID: 9 167 gtctacaaayaaTaaatgctgga-g Seq. ID: 197 rtcyacaaahaaGaaatgctgga-g Seq. ID: 567 swctrmaaabadGaaatgytggmwg
Seq. ID: 9 209 gaaccctcttacActgttggtggga Seq. ID: 198 gaaccctcttac-ctgttggtggga Seq. ID: 568 gaachctchtwc-ntgytggtggga
Seq. ID: 9 211 accctcttacacTgttggtgggaat Seq. ID: 199 accctcttacac-gttggtgggaat Seq. ID: 569 achctchtwcan-gytggtgggaat
Seq. ID: 9 244 gtacagccactaTggaraacagtrt Seq. ID: 200 gtacagccactaTNggaraacagtrt Seq. ID: 570 gtrcagccactaTNggaraacagtdt
Seq. ID: 9 319 ccactvctgggcAtayacmchgagg Seq. ID: 201 ccactvctgggc-tayacmchgagg Seq. ID: 571 ccactvctgggy-tanaymchgagr
Seq. ID: 9 397 ayaatagccargAcatggaagcaac Seq. ID: 203 ayaatagccargCcatggaagcaac Seq. ID: 572 acaatagccaarCcatggaarcaac
Seq. ID: 9 399 aatagccargacAtggaagcaacct Seq. ID: 204 aatagccargac-tggaagcaacct Seq. ID: 573 aatagccaarac-tggaarcaaccy
Seq. ID: 9 432 atcarcagatgaAtggataarvaag Seq. ID: 205 atcarcagatga-tggataarraag Seq. ID: 574 atcaacagrdga-tggataarnaar
Seq. ID: 9 455 aghtgtggtacaTatacacaatgga Seq. ID: 206 aghtgtggtaca-atayacaatgga Seq. ID: 575 arhtgtggtaya-atayacaatgga
Seq. ID: 9 582 ccaatacagtatAytaaygcatata Seq. ID: 208 ccaatacagtat-ytaaygcatata Seq. ID: 576 cmaatacartat-htaayryatata
Seq. ID: 9 588 cagtataytaayGcatatatatgga Seq. ID: 209 cagtataytaayGNcatatatatgga Seq. ID: 577 cartatahtaayRNyatatatatgga
Seq. ID: 9 611 gaatttagaaagAtggtaacrahaa Seq. ID: 210 gaatttagaaagTtggtaacrayaa
Seq. ID: 9 613 atttagaaagatGgtaacrahaacc Seq. ID: 211 atttagaaagatCgtaacrayaacc Seq. ID: 578 mtttaraaaratCrtamhaatramc
Seq. ID: 9 627 taacrahaacccTrtdtrcraraca Seq. ID: 212 taacrayaacccCrtatrcraraca
Seq. ID: 10 52 ascctgtgaatgMagtcaacacgaa Seq. ID: 213 ascctgtgaatgTagtcaacacgaa Seq. ID: 579 accctgtgaatgTagtcaacacgaa
Seq. ID: 10 72 acgaaggggcagTttt Seq. ID: 214 acgaaggggcagAttt
Seq. ID: 11 142 tgrattgatccyTtkaycattatgt Seq. ID: 226 tgrattgatccyAtkaycattatgt Seq. ID: 580 trnatyrmtyctAtkwtsawkayrt
Seq. ID: 11 172 ccttctttgtctCttttnayvkyyt Seq. ID: 227 ccttctttgtct-ttttbayvdyyt Seq. ID: 581 ccttcttngtct-ttttnacndttt
Seq. ID: 11 224 tkagtattgcwaCtccwgctttctt Seq. ID: 228 tdagtattgctaGtcctgctttctt Seq. ID: 582 tragtatwgctaGycctgctttctt
Seq. ID: 11 250 tsdtyycyatttGcatgraatatyt Seq. ID: 229 tsntyycyattt-catgraatatct Seq. ID: 583 tbrtytcyrttt-natgraatatht
Seq. ID: 11 251 sdtyycyatttgCatgraatatytt Seq. ID: 230 sntyycyatttg-atgraatatctt
Seq. ID: 11 286 ctcactttcagtCtrtrtgtgtcyy Seq. ID: 231 ytcactttcagtGtrtrtgtgtcyy Seq. ID: 584 ytywctttcartGtdtrtktktcyh
Seq. ID: 11 366 ccattcagccagTctktgtcttttg Seq. ID: 232 ccattcagccagCctttgtcttttg
Seq. ID: 11 452 tttactttattgTtttgggttygag Seq. ID: 233 tttwctttattgCtttgggttyrag
Seq. ID: 11 461 ttgttttgggttYgagtttatacac Seq. ID: 234 ttgttttgggttGragtttatacac
Seq. ID: 11 753 tygttvyttttcYcttgctgctttt Seq. ID: 235 ttgttgyttttcActtgctgctttt Seq. ID: 585 ttgttgcttttcActtgctgctttt
Seq. ID: 11 795 tttratytttgtTartttgattadt Seq. ID: 236 tttratytttgtAartttgattart Seq. ID: 586 tttartyttyrtAadtttgattant
Seq. ID: 11 874 tcttggacttgrGtgaytatttcct Seq. ID: 237 tcttggacttgg-tgaytatttcct Seq. ID: 587 tyttggacttgr-traytdttycht
Seq. ID: 11 895 tcctttcccatkTtagggaagtttt Seq. ID: 238 tccttycccatkGtagggaagtttt Seq. ID: 588 ychttyymcatdGtagggaartttt
Seq. ID: 11 931 tcytcaartattTtctcakgbyctt Seq. ID: 239 tcytcaartattGtctcakgbyctt Seq. ID: 589 tcttcaaatattGtctcansbπytt
Seq. ID: 11 933 ytcaartattttCtcakgbyctttc Seq. ID: 240 ytcaartatttt-tcakgbyctttc
Seq. ID: 11 1010 yattgtcccagaGgtctctgagrtt Seq. ID: 215 nattgtcccagaAgtctctgagrtt Seq. ID: 590 nrttgtcccagaAgtctctsagntt
Seq. ID: 11 1028 tgagrttgtcctCatttctttthat Seq. ID: 216 tgagrttgtcctTatttcttttwat Seq. ID: 591 tsagnttgtcctTatttyttttwan
Seq. ID: 11 1040 catttctttthaTtcktttttcttt Seq. ID: 217 catttcttttwa-tcdtttttcttt Seq. ID: 592 catttyttttwa-tnktttttcttt
Seq. ID: 11 1172 gaaagrvcmtgaGaaaatayttgaa Seq. ID: 218 gaaagrvcmtga-aaaatayttgar
Seq. ID: 11 1174 aagrvcmtgagaAaatayttgaaga Seq. ID: 219 aagrvcmtgagaCaatayttgarga Seq. ID: 593 aardvbhtgagaCaatatttgaaga
Seq. ID: 11 1210 aaaacttccctaAmatgggaaagga Seq. ID: 220 aaaacttccctaCmatgggraagga Seq. ID: 594 aaaayttccctaCmatgkgraagga
Seq. ID: 11 1220 taamatgggaaaGgaaatartcacy Seq. ID: 221 taamatgggraaTgaaatartcacc Seq. ID: 595 tammatgkgraaTgaaahagtyacc
Seq. ID: 11 1225 tgggaaaggaaaTartcacycaagt Seq. ID: 222 tgggraaggaaaAartcacccaagt
Seq. ID: 11 1231 aggaaatartcaCycaagtccaaga Seq. ID: 223 aggaaatartca-ccaagtccaaga Seq. ID: 596 aggaaahagtya-ccaagtccaara
Seq. ID: 11 1326 aagatyaaacacAaasavmaaawat Seq. ID: 224 aagatyaaacacTaagaamaaatat Seq. ID: 597 aarayyaaachcTaasnnmraatat
Seq. ID: 11 1346 aawattaaaaagCagcmagrgaraa Seq. ID: 225 aatattaaaaagTagcaagggaaaa Seq. ID: 598 aatattπaaaagTagcaagggaraa
Seq. ID: 12 72 tttacctgctgrRtatttctctgyc Seq. ID: 260 tttayctgctgrCtattyctctgyc
Seq. ID: 12 108 tttadattgctgTgtttggggtgkc Seq. ID: 241 tttarattgctgGgtttggggtgkc
Seq. ID: 12 109 ttadattgctgtGtttggggtgkcc Seq. ID: 242 ttarattgctgwTtttggggtgkcc
Seq. ID: 12 193 gggkttgkacrrGtggcttgtcaag Seq. ID: 243 gggkttgkacrgAtggcttgtcaag Seq. ID: 599 gggkttgkacarAtggcttgtcaag
Seq. ID: 12 346 cctgtathttraWgctcagggctrt Seq. ID: 244 cctgtathttdaWNgctcagggctrt Seq. ID: 600 cctgththttndRNgctcagggytrt
Seq. ID: 12 403 gtcttgcyctggAacttgttggcyc Seq. ID: 245 gtcttgcyctggGacttgttggcyc Seq. ID: 601 gtcttgchctggGrcttgttggcyy
Seq. ID: 12 410 yctggaacttgtTggcycttgggtg Seq. ID: 246 yctggaacttgtAggcycttgggtg
Seq. ID: 12 447 agtgtaggtatgGaggcdtttgatg Seq. ID: 247 agtgtaggtatgCaggcdtttgatg Seq. ID: 602 agtgtaggtatgCaggcdtttgrtg
Seq. ID: 12 448 gtgtaggtatggAggcdtttgatga Seq. ID: 248 gtgtaggtatggTggcdtttgatga Seq. ID: 603 gtgtaggtatggTggcdtttgrtga
Seq. ID: 12 450 gtaggtatggagGcdtttgatgagc Seq. ID: 249 gtaggtatggagTcdtttgatgagc
Seq. ID: 12 463 cdtttgatgagcTcytrtcrattaa Seq. ID: 250 cdtttgatgagcAcytgtcdattaa Seq. ID: 604 cdtttgrtgagcAcytgtcdattaa
Seq. ID: 12 479 rtcrattaatgtTccctggagtcag Seq. ID: 251 gtcdattaatgtCccctggagtcag
Seq. ID: 12 480 tcrattaatgttCcctggagtcagg Seq. ID: 252 tcdattaatgttTcctggagtcagg
Seq. ID: 12 509 cyctggwgtcagGdtttggacttaa Seq. ID: 253 cyctggwgtcagTdtttggayttaa Seq. ID: 605 cyctgrwgtcarTdtttggabttra
Seq. ID: 12 528 acttaagcctccTgcytcyggttwt Seq. ID: 254 ayttaagcctccAgcytcyrgttwt Seq. ID: 606 abttragcctccAgcytcyrgbtwt
Seq. ID: 12 543 ytcyggttwtcrGtcttattyttac Seq. ID: 255 ytcyrgttwtcrAtcttattyttac Seq. ID: 607 ytcyrgbtwtcrAtcttattyttac
Seq. ID: 12 547 ggttwtcrgtctTattyttacagta Seq. ID: 256 rgttwtcrgtctCattyttacagta Seq. ID: 608 rgbtwtcrgtctCattyttacagta
Seq. ID: 12 558 ttattyttacagTagyytcaaract Seq. ID: 257 ttattyttacagAagyytcaaract
Seq. ID: 12 568 agtagyytcaarActtctccwtcta Seq. ID: 258 agtagyytcaarGcttctccwtcta Seq. ID: 609 agtagyhtcaarGcttctccwtcna
Seq. ID: 12 593 tacagcaccrwtGataaaacatcta Seq. ID: 259 tacarcaccrwtAataaaacatcta Seq. ID: 610 tacagcacmrwwAataaaacatcta
Seq. ID: 13 352 gtggtrcacgccTtrttcagccaag Seq. ID: 261 gtggtgcacgcmCygytcagccaag Seq. ID: 611 gtggtgcangcmCygytcagccaag
Seq. ID: 13 498 tgtsrtaaaacaActcatgcaaatg Seq. ID: 262 tgtsgtaaaacaCctcatgcaaatg Seq. ID: 612 tgtbgtaaaacaCctcatgcaaatg
Seq. ID: 14 13 ggttcaggatggGgaacacatgtat Seq. ID: 264 ggttcaggatggGNgaacacatgtat
Seq. ID: 14 29 cacatgtataccTgtggcggattca Seq. ID: 265 cacatgtataccTNgtggcggattca
Seq. ID: 14 48 gattcatkttgaTrtwtggcaaaac Seq. ID: 266 gattcatkttgaTNrtwtggcaaaac
Seq. ID: 14 69 aaacyaatacaaTwwtgtaaagttw Seq. ID: 267 aaacyaatacaaTNattgtaaagttw
Seq. ID: 14 111 aaawwwaaaawaRadwhawwawaad Seq. ID: 263 aaawwwaaaawaRNadwhawwawaad
Seq. ID: 15 20 tttcccggtcccCtcttgataagaa Seq. ID: 268 tttcccggtcccGtcttgataagaa
Seq. ID: 16 37 ragtcatrgtagGggaatctggcct Seq. ID: 269 ragtcatrgtagCggaatctggcct
Seq. ID: 17 82 yakgagytgcttGtatattttkgar Seq. ID: 297 yakgagytgyttCtatattttkgag Seq. ID: 613 yatgagttsyttCtatattttkgab
Seq. ID: 17 94 gtatattttkgaRattantystttg Seq. ID: 298 gtatattttkgaCattartystttg Seq. ID: 614 vtatattttkgaCattarthctttr
Seq. ID: 17 104 garattantystTtgtcagttgytt Seq. ID: 270 gagattartystGtgtcagttgctt Seq. ID: 615 gabattarthctGtrtcagwtrydt
Seq. ID: 17 148 ccattctgwrggYtgtcttttcayy Seq. ID: 271 ccattctgargg-tgtcttttcacc Seq. ID: 616 ccattctgtggg — tctttnn— c
Seq. ID: 17 150 attctgwrggytGtcttttcayytt Seq. ID: 272 attctgarggyt-tcttttcacctt Seq. ID: 617 attctgtggg— tctttnn— ctk
Seq. ID: 17 161 tgtcttttcayyTtgyttatagttt Seq. ID: 273 tgtcttttcacc-tgyttatrgttt Seq. ID: 618 -tctttnn-c-ktcttatggttt
Seq. ID: 17 237 ttgcttttatttCcawtaytctrgg Seq. ID: 274 ttgyttttattt-cawtaytctggg Seq. ID: 619 ttgyttttattt-cawtaytytdgg
Seq. ID: 17 262 aggtgggtcataGaggatcytgctg Seq. ID: 275 aggtgggtcataGNaggatcctgctg Seq. ID: 620 agrtgggtcataGNaggatcytgctg
Seq. ID: 17 301 gagtgttytgccTatgttytcctct Seq. ID: 276 gagtgttttgcc-atgttytcctct Seq. ID: 621 gagtgttttgcc-atgttytcytct
Seq. ID: 17 301 gagtgttytgccTatgttytcctct Seq. ID: 277 gagtgttttgccAatgttytcctct Seq. ID: 622 gagtgttttgcc-atgttytcytct
Seq. ID: 17 316 gttytcctctagGagttttatagtt Seq. ID: 278 gttytcctctagGNagttttatagtt Seq. ID: 623 gttytcytctarGNagttttatagtt
Seq. ID: 17 340 ttctggtcttacRtttagrtcttta Seq. ID: 279 ttytggtcttacRNtttagrtcttta Seq. ID: 624 ttctggtcttacANtttagrtcttta
Seq. ID: 17 492 tgcctcctttgtCaaagatharktg Seq. ID: 280 tgcctcctttgtGaaagathagktg Seq. ID: 625 tgbctcctttgtGaaaratnagktg
Seq. ID: 17 495 ctcctttgtcaaAgatharktgwcy Seq. ID: 281 ctcctttgtcaaCgathagktghcc Seq. ID: 626 ctcctttgtsaaCratnagktgvcy
Seq. ID: 17 524 gtgygtggrtttAtytctgggcttt Seq. ID: 282 gtgygtggrtttCtytctgggcttt Seq. ID: 627 rtgyrtggrtttCtytstggrytyt
Seq. ID: 17 529 tggrtttatytcTgggctttctatt Seq. ID: 283 tggrtttatytcGgggctttctatt Seq. ID: 628 tggrtttatytsGggrytytctatt
Seq. ID: 17 536 atytctgggcttTctattytgttcc Seq. ID: 284 atytctgggctt-ctattytgttcc Seq. ID: 629 atytstggryty-ctattntgttcc
Seq. ID: 17 541 tgggctttctatTytgttccattga Seq. ID: 285 tgggctttctatGytgttccattga Seq. ID: 630 tggrytytctatGntgttccattga
Seq. ID: 17 563 tgatctatatktCtgtytttgtgcc Seq. ID: 286 tgatctatatttAtgtytttgtgcc Seq. ID: 631 tgatctatrtktAtgtytttgtrcc
Seq. ID: 17 573 ktctgtytttgtGccagtaccatac Seq. ID: 287 ttctgtytttgt-ccagtaccatac Seq. ID: 632 ktctgtytttgt-ccagtaccatac
Seq. ID: 17 617 ttgtagtakagyCtgaagtcaggda Seq. ID: 288 ttgtagtanagyGtgaagtcaggha Seq. ID: 633 ttgtagtakagyGtgaagtcaggda
Seq. ID: 17 623 takagyctgaagTcaggdaggytga Seq. ID: 289 tanagyctgaagCcagghaggttga Seq. ID: 634 takagystgaagCcaggdagvbtga
Seq. ID: 17 627 gyctgaagtcagGdaggytgattcc Seq. ID: 290 gyctgaagtcagChaggttgattcc Seq. ID: 635 gystgaagtcagCdagvbtgattcc
Seq. ID: 17 641 aggytgattcctCcagytcyrttyt Seq. ID: 291 aggttgattcctGcagytccattct Seq. ID: 636 agvbtgattcctGcagbtyydttyt
Seq. ID: 17 645 tgattcctccagYtcyrttyttctt Seq. ID: 292 tgattcctccagGtccattcttctt Seq. ID: 637 tgattcctccagGtyydttyttctt
Seq. ID: 17 651 ctccagytcyrtTyttctttctcaa Seq. ID: 293 ctccagytccatGcttctttctcaa
Seq. ID: 17 668 tttctcaagattGctttggctatty Seq. ID: 294 tttctcaagattCctttggctatty Seq. ID: 638 tttctcaagattCytttggctatty
Seq. ID: 17 706 tttccatayaaaTtktraaattdtt Seq. ID: 295 tttccatacaaaGtktraaattwtt Seq. ID: 639 tttccayacaaaGtdtrdrrttwtt
Seq. ID: 17 797 ttgggtagtataC-hawywtmrtga Seq. ID: 296 ttgggtagtataThcatyttmacra
Seq. ID: 18 14 tttggggkbccCctgsctggrgtc Seq. ID: 299 gtgnggggbymcTctgsctggggtc Seq. ID: 640 gtntggggbyccTctgvctggggtc
Seq. ID: 18 35 rgtccttctctgTtgctyrgydyrt Seq. ID: 305 ggtccttctctrCtrcttdgygyrt Seq. ID: 641 ggtccttctctgCtgcttdgydyrt
Seq. ID: 18 88 ggtccttytctgTtgctyrgtgyrt Seq. ID: 306 ggtccttctctgCtgctyrgtkyrt Seq. ID: 642 ggtccttctctgCtgcttdgydyrt
Seq. ID: 18 148 ctctgttgcttgGyryrtcaggcrc Seq. ID: 300 ctctgttgcttgCygyrtcagghrc Seq. ID: 643 ctctgttgcttgCygyrtcaggcrc
Seq. ID: 18 199 ctctgttgctyrGtkyrtcaggyrc Seq. ID: 301 ctctgttgctyrCtkyrycagghrc Seq. ID: 644 ctctgttgctyrCtkyrtcagghrc
Seq. ID: 18 285 ccmcmytscctgGgktcctthtctg Seq. ID: 302 ccmcmytscctgAgktcctthtctg Seq. ID: 645 ccmcmytssctgAgktccttmtctg
Seq. ID: 18 312 gcttggtgcatcAggcactaaagcc Seq. ID: 303 gcttggtgcatcTggcactaaagcc Seq. ID: 646 gcttggtgyrtcTgghrctwaagvs
Seq. ID: 18 315 tggtgcatcaggCactaaagcctgc Seq. ID: 304 tggtgcatcaggAactaaagcctgc Seq. ID: 647 tggtgyrtcaggArctwaagvskgc
Seq. ID: 19 23 rcaaagagtcrgAcacgactgagcg Seq. ID: 309 rcaaagagtcrgANcacgactgagcg
Seq. ID: 19 147 cctttcaaggcgTgaatgtttcaga Seq. ID: 307 cctttcaaggcgCgaatgtttcaga
Seq. ID: 19 200 tcttcagtacacTgtcgtcacttag Seq. ID: 308 tcttcagtacac-gtcgtcacttag
Seq. ID: 19 295 tcgctcagtcgtGtcygactctttg Seq. ID: 310 tcgctcagtcgtGNtcygactctttg
Seq. ID: 19 429 ctgtcgtcccctTctcctctgcccy Seq. ID: 311 ctgtcgtcccct-ctcctctgcccy
Seq. ID: 19 465 mscakcakcagrTcttttccaatgw Seq. ID: 312 mbcakcakcagr-cttttccaatga Seq. ID: 648 cscakcakcaga-cytttccaatga
Seq. ID: 19 471 akcagrtcttttCcaatgwrdyhas Seq. ID: 313 akcagrtcttttTcaatgardyhas Seq. ID: 649 akcagatcytttTcaatgagtyaac
Seq. ID: 19 510 chaaaavcwcatAtcacatttratt Seq. ID: 314 cbaaaagvtcayCksacatttmaky
Seq. ID: 19 511 haaaavcwcataTcacatttrattt Seq. ID: 315 baaaagvtcaymGsacatttmakyt Seq. ID: 650 gmcaaggt-ahcGgac-tttmagyt
Seq. ID: 19 839 cttgagcactatTatcaaaaatcac Seq. ID: 316 cttgagcactatGatcaaaaatcac
Seq. ID: 20 112 aagaaatcaaagAdgacacaaabar Seq. ID: 317 aagaaatcaaag-dgacacwaayar Seq. ID: 651 aagaaatyaaan-rgacabraayar
Seq. ID: 20 118 tcaaagadgacaCaaabaratggar Seq. ID: 318 tcaaagadgacaGwaayaratggar Seq. ID: 652 tyaaanrrgacaGraayaratggar
Seq. ID: 20 147 atwccatgttchTggattggaagaa Seq. ID: 319 atwccatgttcaGggattggaagaa Seq. ID: 653 athcyrtgttcaGggatwggaagaa
Seq. ID: 20 195 ctacccaaacaaTytayagattcaa Seq. ID: 320 ctacccaaacaaCytayagattcaa
Seq. ID: 20 197 acccaaacaatyTayagattcaayr Seq. ID: 321 acccaaacaatyCayagattcaayg Seq. ID: 654 acccaaacarthCayagattcaaya
Seq. ID: 21 124 atacacyaayaaYrrrmaaacagar Seq. ID: 322 atacacyaayaa-rrrmaaacagar
Seq. ID: 21 289 baratggararaTatwccatgttcm Seq. ID: 323 yaratggararaGatwccatgttcm Seq. ID: 655 haratggaaaraGathccrtgytca
Seq. ID: 21 401 ryattyttcacaGaactagaamaaa Seq. ID: 324 ryattyttcacaTaactagaacaaa Seq. ID: 656 rhwtttttcacaTaactagaamaaa
Seq. ID: 21 426 haatttyamaatTyrtatggaamya Seq. ID: 325 haatttyamaatCyrtatggaamya Seq. ID: 657 haatyyyamaatCyrtatggaahca
Seq. ID: 22 133 gtartattccatTgtgtatatgtac Seq. ID: 326 rtartattccatAgtgtatatgtac Seq. ID: 658 rtartattccatAgtrwrtatrtdc
Seq. ID: 22 137 tattccattgtgTatatgtaccaca Seq. ID: 327 tattccattgtgAatatgtaccaca Seq. ID: 659 tattccatdgtrArtatrtdccaca
Seq. ID: 22 222 traayagtgctgCdatraacatwbg Seq. ID: 328 traatagtgctgCNdatraacatwsg Seq. ID: 660 traatartgctgCNwatgaacatwsg
Seq. ID: 22 246 gkgtrcatgtitCtytwtvrhakha Seq. ID: 329 gkgtrcatgtgt-tttwtvrhakha Seq. ID: 661 gkgtrcawgtrt-tttwtdkwabnv
Seq. ID: 22 304 tgggattgctggGtcawatggtakt Seq. ID: 330 tgggattgctggTtcawatggtakt Seq. ID: 662 tggrattgctggTtcatatgstadt
Seq. ID: 22 312 ctgggtcawatgGtakttctakttc Seq. ID: 331 ctgggtcawatgCtakttctakttc Seq. ID: 663 ctggntcatatgCtadttytrktty
Seq. ID: 23 25 tacagcaaagtgAhtcagttataca Seq. ID: 332 tacagcaaagtgGwtcagttataca
Seq. ID: 24 6 gtggaAgagggcctcatctccagtt Seq. ID: 334 gtggaTgagggcctcatctccagtt
Seq. ID: 24 51 cagggtacctctGattgaggc Seq. ID: 333 cagggtacctctCatbgyggcgt Seq. ID: 664 cagggtwcctctCatggtgnngt
Seq. ID: 25 352 t--gggggkagaTgctcgagaatwt Seq. ID: 335 t--gggggkakaGgctcgagaatwt
Seq. ID: 25 402 ccgcmtttgcgtAtgccgagcctcc Seq. ID: 336 ccgcmtttgcgtTtgccgagcctcc
Seq. ID: 26 42 tggacaggagggTcgactcccctgc Seq. ID: 337 tggacaggagggGcgactcccctgc
Seq. ID: 27 113 tgatgggaccrgAtgccatgatctt Seq. ID: 344 tgatgggaccrgTtgccatgatctt
Seq. ID: 27 155 bttknttwwtccAactctcacatcc Seq. ID: 364 ytt-sttwdtccGactctcacatcc Seq. ID: 665 bttgnttthtccGactctcacatcc
Seq. ID: 27 197 aaaccatagcytTgactagayggac Seq. ID: 371 aaaccatagcytAgactagayggac Seq. ID: 666 aaaccatagcytAgactagacrgac
Seq. ID: 27 201 catagcyttgacTagayggaccttt Seq. ID: 373 catagcyttgacGagayggaccttt Seq. ID: 667 catagcyttgacGagacrgaccttt
Seq. ID: 27 213 tagayggaccttTgttggcaaagta Seq. ID: 375 tagayggaccttAgttggcaaagta Seq. ID: 668 tagacrgaccttAgttggcaaagta
Seq. ID: 27 265 aggttggtcataActttycttccaa Seq. ID: 386 aggttggtcataANctttycttccaa
Seq. ID: 27 306 aatttcatggctGcagtcaccatct Seq. ID: 387 aatttcatggct-cagtcaccatct Seq. ID: 669 aatttcatggct-cartcaccatct
Seq. ID: 27 326 catctgcagtgaTtttggagcccmm Seq. ID: 388 catctgcagtgaTNtttggagcccmv Seq. ID: 670 catctgcagtgaTNtttggagcccaa
Seq. ID: 27 394 tsccatgaagtgAtgggaccrgatg Seq. ID: 389 tsccatgaagtgANtgggaccrgatg
Seq. ID: 27 426 ctthgttttytgAatgttgagyttt Seq. ID: 390 ctthgttttytgCatgttgagyttt Seq. ID: 671 cttagttttytkCatrttkagyttt
Seq. ID: 27 549 gttattgatattTctcccrgcaatc Seq. ID: 391 gttattgatatt-ctccyrgcaatc Seq. ID: 672 gttrttgatatt-ctccyrgcaatc
Seq. ID: 27 624 tgcatataagttAaataagcagggt Seq. ID: 392 tgcatataagtt-aataagcagggt Seq. ID: 673 tgcatataagtt-wataarcagggt
Seq. ID: 27 625 gcatataagttaAataagcagggtg Seq. ID: 393 gcatataagttaTataagcagggtg Seq. ID: 674 gcatataagttaTataarcagggtg
Seq. ID: 27 640 aagcagggtgacAatatacagcctt Seq. ID: 394 aagcagggtgacTatatacagcctt Seq. ID: 675 aarcagggtgacTatatacagcctt
Seq. ID: 27 670 actcctttyccwAtttggaaccagt Seq. ID: 395 actcctttycctANtttggaaccagt Seq. ID: 676 actccttttccwANtttkgaaccagt
Seq. ID: 27 709 ccagttctaactGttgcttcctgac Seq. ID: 396 ccagttctaactTttgcttcctgac Seq. ID: 677 ccagttctaactTttgcttcytgac
Seq. ID: 27 710 cagttctaactgTtgcttcctgacc Seq. ID: 397 cagttctaactgGtgcttcctgacc Seq. ID: 678 cagttctaactgGtgcttcytgacc
Seq. ID: 27 837 atagtcaataaaGcagaartagatg Seq. ID: 398 atagtcaataaa-cagaartagatg
Seq. ID: 27 837 atagtcaataaaGcagaartagatg Seq. ID: 399 atagtcaataaaCcagaartagatg Seq. ID: 679 atagtcaataaa-cagaartagatg
Seq. ID: 27 838 tagtcaataaagCagaartagatgt Seq. ID: 400 tagtcaataaag-agaartagatgt Seq. ID: 680 tagtcaataaas-agaartagatgt
Seq. ID: 27 922 gttcctctgcctTttctaaawccag Seq. ID: 401 gttcctctgcctGttctaaawccag Seq. ID: 681 gttcctctgcctGttctaaa hccag
Seq. ID: 27 944 cagcttgaacatCtggaagttcacg Seq. ID: 402 cagcttgaacat-tggaagttcayg Seq. ID: 682 cagcttgaacat-tggaagttcayr
Seq. ID: 27 955 tctggaagttcaCggttcayrtayt Seq. ID: 403 tctggaagttca-ggttcacrtayt Seq. ID: 683 tctggaagttca-rgttcayrtayt
Seq. ID: 27 968 ggttcayrtaytGytgaagcctggc Seq. ID: 404 ggttcacrtayt-ytgaagcctggc Seq. ID: 684 rgttcayrtayt-ytgaagcctggc
Seq. ID: 27 990 ggcttggagaatTttgagcattact Seq. ID: 405 ggcttggagaatAttgagcattact
Seq. ID: 27 1006 agcattactttrCtagcrtgtgaga Seq. ID: 338 agcattacttta-tagcrtgtgaga Seq. ID: 685 agcattactttr-tagyrtgtgaga
Seq. ID: 27 1012 actttrctagcrTgtgagatgagtg Seq. ID: 339 actttactagcrGgtgagatgagtg Seq. ID: 686 actttrctagyrGgtgagatgagtg
Seq. ID: 27 1025 gtgagatgagtgCaattgtgyggta Seq. ID: 340 gtgagatgagtgCNaattgtgyggta Seq. ID: 687 gtgagatgagtgCNaattgtgyrgta
Seq. ID: 27 1089 aatgaaaactgaCcttttccagtcc Seq. ID: 341 aatgaaaactgaCNcttttccagtcc
Seq. ID: 27 1098 tgaccttttccaGtcctgtggccac Seq. ID: 342 tgaccttttccaGNtcctgtggccac
Seq. ID: 27 1101 ccttttccagtcCtgtggccactgc Seq. ID: 343 ccttttccagtcCNtgtggccactgc
Seq. ID: 27 1168 acagcatcatctTtyaggatttgaa Seq. ID: 345 acagcatcatct-tyaggatttgaa
Seq. ID: 27 1172 catcatctttyaGgatttgaaatag Seq. ID: 346 catcatctttyaGNgatttgaaatag
Seq. ID: 27 1264 tcacattccaggAtgtctggctcta Seq. ID: 347 tcacattccaggTtgtctggctcta
Seq. ID: 27 1306 tcrtgattatctGggtcatgaagat Seq. ID: 348 tcgtgattatct-ggtcatraagat Seq. ID: 688 tcrtgrttatct-ggtcrtgaagat
Seq. ID: 27 1307 crtgattatctgGgtcatgaagatc Seq. ID: 349 cgtgattatctg-gtcatraagatc Seq. ID: 689 crtgrttatctn-gtcrtgaagatc
Seq. ID: 27 1311 attatctgggtcAtgaagatctttt Seq. ID: 350 attatctgggtc-traagatctttt Seq. ID: 690 rttatctnggtc-tgaagatctttt
Seq. ID: 27 1312 ttatctgggtcaTgaagatcttttt Seq. ID: 351 ttatctgggtca-raagatcttttt Seq. ID: 691 ttatctnggtcr-gaagatcttttt
Seq. ID: 27 1322 catgaagatcttTtttgtacagttc Seq. ID: 352 catraagatcttTNtttgtacagttc Seq. ID: 692 crtgaagatcttTNtttgtayagttc
Seq. ID: 27 1340 acagttcttctgTgtattcttgcca Seq. ID: 353 acagttcttctg-gtattcttgcca Seq. ID: 693 ayagttcttctg-gtattcttgcca
Seq. ID: 27 1345 tcttctgtgtatTcttgccacctct Seq. ID: 354 tcttctgtgtat-cttgccacctct
Seq. ID: 27 1372 ttaatatcttctGcttctgttaggt Seq. ID: 355 ttaatatcttct-cttctgttaggt
Seq. ID: 27 1374 aatatcttctgcTtctgttaggtcc Seq. ID: 356 aatatcttctgcGtctgttaggtcc
Seq. ID: 27 1431 tgcatgaaatgtTcccttggtatct Seq. ID: 357 tgcatgaaatgtGcccttggtatct Seq. ID: 694 tgcatgnaatgtGccyttggtatct
Seq. ID: 27 1462 ttcttgaagagaTctctagtctttc Seq. ID: 358 ttcttgaagagaGctctagtctttc
Seq. ID: 27 1478 tagtctttcccaTtctrttgttttc Seq. ID: 359 tagtctttcccaGtctrttgttttc Seq. ID: 695 tagtctttcccaGtctdttgttttc
Seq. ID: 27 1482 ctttcccattctRttgttttcctct Seq. ID: 360 ctttcccattctRNttgttttcctct Seq. ID: 696 ctttcccattctDNttgttttcctct
Seq. ID: 27 1489 attctrttgtttTcctctatttctt Seq. ID: 361 attctrttgtttTNcctctatttctt Seq. ID: 697 attctdttgtttTNcctctatttctt
Seq. ID: 27 1498 ttttcctctattTctttgcattgat Seq. ID: 362 ttttcctctattTNctttgcattgat
Seq. ID: 27 1510 tctttgcattgaTcrctgargaagg Seq. ID: 363 tctttgcattgaGcrctgargaagg
Seq. ID: 27 1629 agctatttgtaaGgcctccycagac Seq. ID: 365 agctatttgtaa-gcctccycagac
Seq. ID: 27 1630 gctatttgtaagGcctccycagaca Seq. ID: 366 gctatttgtaag-cctccycagaca
Seq. ID: 27 1677 cttttycwtgggGatggtcttgatc Seq. ID: 367 cttttycwtgggGNatggtcttgatc Seq. ID: 698 cttttycwtgggGNatggtyttgrtc
Seq. ID: 27 1740 ttcwtcaggcacTctrtctatcaga Seq. ID: 368 ttcwtcaggcacCctrtctatcaga
Seq. ID: 27 1836 tggtctagtggtTttccctactttc Seq. ID: 369 tggtctagtggtCttccctactttc
Seq. ID: 27 1968 aagaatataatcAatctgatttygg Seq. ID: 370 aagaatataatcCatctgatttygg
Seq. ID: 27 1988 ttyggtrttgacCatctggtgatgt Seq. ID: 372 ttyggtgttgacAatctggtgatgt
Seq. ID: 27 2103 tgcttcattcygTaytccaaggcca Seq. ID: 374 tgcttcattctgTNaytccaaggcca Seq. ID: 699 tgcttcattytdTNaytccaaggcca
Seq. ID: 27 2176 cagtcccctataAtgaaaaggacat Seq. ID: 376 cagtcccctataGtgaaaaggacat
Seq. ID: 27 2179 tcccctataatgAaaaggacatctt Seq. ID: 377 tcccctataatgTaaaggacatctt Seq. ID: 700 tsccctatartgTaaagracatctt
Seq. ID: 27 2230 tgtaggtcttcaTagaaccrttcaa Seq. ID: 378 tgtaggtcttcaAagaaccrttcaa Seq. ID: 701 tgtdggtcttcaAagaaccrttcaa
Seq. ID: 27 2233 aggtcttcatagAaccrttcaactt Seq. ID: 379 aggtcttcatagCaccrttcaactt
Seq. ID: 27 2256 ttcagcttcttcAgcrttactggtt Seq. ID: 380 ttcagcttcttcTgcrttactggtt Seq. ID: 702 ttcagcttcttcTgcrttastggtt
Seq. ID: 27 2296 attactgtgataTtgaatggtttgc Seq. ID: 381 attactgtgataGtgaatggtttgc
Seq. ID: 27 2315 gtttgccttggaAaygaacagagat Seq. ID: 382 gtttgccttgga-aygaacagagat Seq. ID: 703 gtttgccttgga-ayraacagagat
Seq. ID: 27 2316 tttgccttggaaAygaacagagatc Seq. ID: 383 tttgccttggaa-ygaacagagatc Seq. ID: 704 tttgccttggaa-yraacagagatc
Seq. ID: 27 2318 tgccttggaaayGaacagagatcat Seq. ID: 384 tgccttggaaay-aacagagatcat
Seq. ID: 27 2418 tcttttgttgacYatgatggcyact Seq. ID: 385 tcttttgttgacAatgatggctact
Seq. ID: 28 35 gcttcagtagttGcngcwyrbgggc Seq. ID: 406 gcttcagtagttTyrgcacrygggc Seq. ID: 705 gcttcagtagttTyrgcrcayrggc
Seq. ID: 28 63 gbagttghggytCmhggsbctagwg Seq. ID: 407 gtagttgyggctTvtggsyytagak
Seq. ID: 29 20 tccrggagttggTgatggacaggga Seq. ID: 408 tccgggagttggTNgatggacaggga Seq. ID: 706 tcygggagttggTNgatggacaggga
Seq. ID: 29 42 ggaggcctggygTgctgcrrtycat Seq. ID: 409 ggaggcctggygTNgctgcrrttcat
Seq. ID: 30 490 gyrggccagtccAtggggtcrcama Seq. ID: 410 gyrggccagtcc-tggggtcrcaaa Seq. ID: 707 gyrggccngtcy-nngggtcrcaaa
Seq. ID: 31 35 wagbaaadthhaAtdhtataaagaa Seq. ID: 411 dagdaaadthmaANtdhtataaadaa Seq. ID: 708 navhraavtchrRNtgntataaadaa
Seq. ID: 31 74 ttcagttcagttCagtcgctcagtc Seq. ID: 412 ttcagttcagttCNagtcrctcagtc
Seq. ID: 32 54 dakdhadtbhabTttadhtddddtc Seq. ID: 413 daddbagtbhab-ttwdhtddddtc Seq. ID: 709 garkbdgtvhr--ttaavtvrvktc
Seq. ID: 33 25 atcaagggtgccAagtgccctttcg Seq. ID: 414 atcaagggtgccCagtgccctttcg
Seq. ID: 33 36 caagtgccctttCgacctccaattc Seq. ID: 415 caagtgccctttAgacctccaattc
Seq. ID: 34 97 cyagccagtaatCasttccytawgt Seq. ID: 436 ctagccagtaatAacttccytatgt
Seq. ID: 34 279 tctggaacacacAgaaattcacaga Seq. ID: 416 tctggaacacacANranattcacaga
Seq. ID: 34 316 rgagaggggttaGgraggagacasa Seq. ID: 417 agagargggtkwAggaggagacasa Seq. ID: 710 agagargrgkkaAggaggagayasa
Seq. ID: 34 370 aaagggagagagAgcartcaagcca Seq. ID: 418 aaagggrgagagTgcartcaagcca
Seq. ID: 34 468 mwaaaagcaaarAttaaaaatctag Seq. ID: 419 mwaaaagcaaarTttaaaaatctag Seq. ID: 711 maaaaagcaaarTttaaaaatctag
Seq. ID: 34 477 aarattaaaaatCtagagtagagkt Seq. ID: 420 aarattaaaaatTtagagtagagkt
Seq. ID: 34 529 aaaagaagaaggAaaagaaagagag Seq. ID: 421 aaaagaagaagrCaaagaaagagag Seq. ID: 712 aaaaraagaarrCaaagaaagarag
Seq. ID: 34 576 aamarggtmvyrAarattrkaangw Seq. ID: 422 aamaaggtcvyrGarattakaaagw Seq. ID: 713 aavaargkvryrGarawkataragw
Seq. ID: 34 592 ttrkaangwwmdTaaaggtacaaaa Seq. ID: 423 ttakaaagwamwTNaaaggtacaaaa Seq. ID: 714 wkataragwamaTNawaggtacaaaa
Seq. ID: 34 616 attgrtaacaaaTacmwaaaagcaa Seq. ID: 424 attgataacaaaCacmaaaaagcaa Seq. ID: 715 attgataacwaaCaccaaaaagcaa
Seq. ID: 34 617 ttgrtaacaaatAcmwaaaagcaaa Seq. ID: 425 ttgataacaaatANcmaaaaagcaaa Seq. ID: 716 ttgataacwaatANccaaaaagcaaa
Seq. ID: 34 620 rtaacaaatacmWaaaagcaaarat Seq. ID: 426 ataacaaatacmCaaaagcaaarat
Seq. ID: 34 622 aacaaatacmwaAaagcaaaratta Seq. ID: 427 aacaaatacmaaTaagcaaaratta
Seq. ID: 34 625 aaatacmwaaaaGcaaarattaaaa Seq. ID: 428 aaatacmaaaaaGNcaaarattaaaa Seq. ID: 717 waataccaaaaaGNcaaarrttaaaa
Seq. ID: 34 678 atacratgttaaAaaaraagaagra Seq. ID: 429 atacratgttaaGaaaraagaagra
Seq. ID: 34 681 cratgttaaaaaAraagaagraaaa Seq. ID: 430 cratgttaaaaaTraagaagraaaa
Seq. ID: 34 726 aaacaaamaanaAmaaacaargtmv Seq. ID: 431 aaacaaasaaraGcaaacaargt-c
Seq. ID: 34 727 aacaaamaanaaMaaacaargtmvv Seq. ID: 432 aacaaasaaraaTaaacaargt-cv Seq. ID: 718 aavaaanaaraaTaaavaaagtnca
Seq. ID: 34 733 maanaamaaacaArgtmvvhawaar Seq. ID: 433 saaraacaaacaCrgt-cvhawaaa
Seq. ID: 34 746 rgtmvvhawaarTtataaagaaaat Seq. ID: 434 rgt-cvhawaaaAtataaagaaaat
Seq. ID: 34 758 ttataaagaaaaTaayaggtacaaa Seq. ID: 435 ttataaagaaaaAaahaggtacaaa
Seq. ID: 35 40 cagagacattacTttgccaacaaaa Seq. ID: 437 cagagacattacAttgccaacaaar Seq. ID: 719 cagagacattacAttgccaacaaaa
Seq. ID: 36 12 ctgtraagaarGctgagyrcygaa Seq. ID: 438 ctgtraagaarGcNtgagyrccgaa Seq. ID: 720 ctgtraagaaaGcNtgagygcygaa
Seq. ID: 36 77 gagagtcccttgGactgcaaggaga Seq. ID: 439 gagagtcccttgGNactgcaaggaga
Seq. ID: 37 101 waawadmahthaHhtcaaactg Seq. ID: 440 daatadhahtwaHNctcaaactg Seq. ID: 721 taayarnachha-ctcawactg
Seq. ID: 38 69 ttcctgggtgtgAcagtgtttaacc Seq. ID: 456 ttcctgggtgtgGcagtgtttaacc
Seq. ID: 38 77 tgtgacagtgttTaacctacaaact Seq. ID: 458 tgtgacagtgttCaacctacaaact Seq. ID: 722 tgtgnnngtgttCaacctacaaact
Seq. ID: 38 81 acagtgtttaacCtacaaactcctt Seq. ID: 459 acagtgtttaacTtacaaactcctt Seq. ID: 723 nnngtgtttaacTtacaaactcctt
Seq. ID: 38 252 gtgaatggttttTcacttgttgggc Seq. ID: 441 gtgaatggttttTNcacttgttgggc Seq. ID: 724 gtgaakggttttTNcacttgttgggc
Seq. ID: 38 284 tgctgctaagttTccatatccctta Seq. ID: 442 tgctgctaagtt-ccatatccctta
Seq. ID: 38 290 taagtttccataTcccttacctgct Seq. ID: 443 taagtttccataTNcccttacctgct
Seq. ID: 38 293 gtttccatatccCttacctgctgtg Seq. ID: 444 gtttccatatccCNttacctgctgtg
Seq. ID: 38 296 tccatatcccttAcctgctgtgtcc Seq. ID: 445 tccatatcccttTcctgctgtgtcc
Seq. ID: 38 363 ttaatgtttgtaAcctgggaccctt Seq. ID: 446 ttaatgtttgtaTcctgggaccctt
Seq. ID: 38 369 tttgtaacctggGacccttgagtta Seq. ID: 447 tttgtaacctggGNacccttgagtta
Seq. ID: 38 397 ctttttcttgttAtagcccaccaca Seq. ID: 448 ctttttcttgtt-tagcccaccaca
Seq. ID: 38 404 ttgttatagcccAccacacctttgc Seq. ID: 449 ttgttatagcccGccacacctttgc
Seq. ID: 38 408 tatagcccaccaCacctttgctctg Seq. ID: 450 tatagcccaccaTacctttgctctg
Seq. ID: 38 455 gcttttttggagGgtggctcctgac Seq. ID: 451 gcttttttggagTgtggctcctgac
Seq. ID: 38 469 tggctcctgaccAaccacctttaga Seq. ID: 452 tggctcctgaccGaccacctttaga Seq. ID: 725 tggctcctgaccGaycacctttaga
Seq. ID: 38 484 cacctttagagaAaaataagttttc Seq. ID: 453 cacctttagaga-aaataagttttc
Seq. ID: 38 502 agttttctgaagAaaaggtcttaaa Seq. ID: 454 agttttctgaagGaaaggtcttaaa
Seq. ID: 38 514 aaaaggtcttaaAatgttaacaggc Seq. ID: 455 aaaaggtcttaaANatgttaacaggc
Seq. ID: 38 750 gctgaattaytcAgyctcttttctc Seq. ID: 457 gctgaattaytcGgyctcttttctc
Seq. ID: 39 259 cgtgatcagtgcAtgatagc Seq. ID: 460 cgtgatcagtgcTtgatagc
Seq. ID: 40 59 atgatagccacgTgatcagtgcatg Seq. ID: 461 atgatagycacgAgatcagtgcatg
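The rows of Table 1 can be read programmatically. The following is a minimal sketch, not part of the original disclosure, and it rests on assumptions inferred from the rows themselves: each row begins with a reference sequence number, a position, and a sequence context; each further "Seq. ID:" entry is a variant context; the single upper-case letter in the reference context marks the variable position; a "-" at that position in a variant is read as a deletion; and an "N" directly after the marked base in a variant that is one character longer is read as an insertion. The function names parse_row and classify are hypothetical.

```python
import re

# Assumed entry layout: "Seq. ID: <id> [<position>] <context>".
# Only the first entry of a row carries a position.
ENTRY_RE = re.compile(r"Seq\. ID: (\d+)(?: (\d+))? ([A-Za-z-]+)")


def parse_row(line):
    """Split one Table 1 row into its reference entry and variant entries."""
    entries = ENTRY_RE.findall(line)
    if not entries:
        raise ValueError("no 'Seq. ID:' entries found")
    ref_id, position, ref_context = entries[0]
    if not position:
        raise ValueError("reference entry has no position")
    variants = [(int(seq_id), context) for seq_id, _, context in entries[1:]]
    return {
        "ref_id": int(ref_id),
        "position": int(position),
        "reference": ref_context,
        "variants": variants,
    }


def classify(reference, variant):
    """Classify a variant context against its reference context.

    The labels reflect the assumed reading of the table notation
    described above, not a definition given in the source.
    """
    marked = [i for i, ch in enumerate(reference) if ch.isupper()]
    if len(marked) != 1:
        raise ValueError("expected exactly one upper-case marker")
    i = marked[0]
    if len(variant) == len(reference) + 1 and variant[i + 1] == "N":
        return "insertion"
    if len(variant) == len(reference) and variant[i] == "-":
        return "deletion"
    if len(variant) == len(reference) and variant[i].isupper():
        return "substitution"
    return "unclassified"


row = ("Seq. ID: 9 209 gaaccctcttacActgttggtggga "
       "Seq. ID: 198 gaaccctcttac-ctgttggtggga "
       "Seq. ID: 568 gaachctchtwc-ntgytggtggga")
parsed = parse_row(row)
for seq_id, context in parsed["variants"]:
    print(seq_id, classify(parsed["reference"], context))
```

Rows whose contexts still contain extraction damage (embedded spaces or non-nucleotide characters) would need cleaning before such a parser is applied.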
Table 2
A B C D H I K
Seq Homology to Homo#
Repetitive ID Length Repetitive Genomic contig with highest score logy to Sequences uniqu
A C 9K Max INIDr Total Element from DB 6
IT Dependent Oligos
No. cow IT
Element genome positi ons
1 BTSAT4 2473 full length 8840 47 8 15 Seq. ID Nos. 42-88 and Seq ID Nos. 463-490
2 L1-BT 259 full length 492 7 8 13 Seq. ID Nos. 89-95 and Seq. ID Nos. 491-495
3 L1-BT 530 91.5% 1075 17 7 15 Seq. ID Nos. 96-112 and Seq. ID Nos. 496-502
4 L1-BT 614 full length 420 19 6 15 Seq. ID Nos. 113-131 and Seq. ID Nos. 503-521
5 BOV2 103 75.0% 11585 1 6 6 Seq. ID: 132
6 BOVA2 2285 11.8% ref|NW_001495372.1 |Bt7_WGA975_3 88.0% 1360 19 7 15 Seq. ID Nos. 133-151 and Seq. ID Nos. 522-531
7 BOVB 912 92.6% 1081 20 8 15 Seq. ID Nos. 152-171 and Seq. ID Nos. 532-540
8 BTSAT6 633 full length 226 25 6 8 Seq. ID Nos. 172-196 and Seq. ID Nos. 541-565
9 L1-BT 655 full length 3166 16 8 15 Seq. ID Nos. 197-212 and Seq. ID Nos. 566-578
10 BTSAT4 75 full length 4123 2 5 8 Seq. ID Nos. 213-214 and Seq. ID: 579
11 L1-BT 1475 78.6% 675 26 8 15 Seq. ID Nos. 215-240 and Seq. ID Nos. 580-598
12 L1-BT 730 full length 1046 20 9 15 Seq. ID Nos. 241-260 and Seq. ID Nos. 599-610
13 None 572 ref|NW_001493362.1 |Btl5_WGA1870_3 69.8% 640 2 5 7 Seq. ID Nos. 261-262 and Seq. ID Nos. 611-612
14 L1-BT 145 full length 4690 5 9 15 Seq. ID: 263 to Seq. ID: 267
15 BTSAT4 53 full length 5138 1 9 9 Seq. ID: 268
16 BTSAT4 55 90.9% 333 1 6 6 Seq. ID: 269
17 L1-BT 929 full length 925 29 8 15 Seq. ID Nos. 270-298 and Seq. ID Nos. 613-639
18 None 361 305 8 7 14 Seq. ID Nos. 299-306 and Seq. ID Nos. 640-647
19 ART2A 1300 21.5% 19158 10 10 14 Seq. ID Nos. 307-316 and Seq. ID Nos. 648-650
20 L1-BT 211 full length 607 5 6 13 Seq. ID Nos. 317-321 and Seq. ID Nos. 651-654
21 L1-BT 499 full length 450 4 6 13 Seq. ID Nos. 322-325 and Seq. ID Nos. 655-657
22 L1PA8 334 full length 549 6 6 15 Seq. ID Nos. 326-331 and Seq. ID Nos. 658-663
23 None 125 ref|NW_001508842.1 |BtX_WGA3030_3 77.0% 379 1 7 7 Seq. ID: 332
24 BTSAT4 61 85.2% 3375 2 5 8 Seq. ID: 333 to Seq. ID: 334 and Seq ID: 664
25 BTLTR1 723 74.1% 1593 2 7 9 Seq. ID: 335 to Seq. ID: 336
26 BTSAT4 57 full length 2746 1 5 5 Seq. ID: 337
27 BOVB 2434 92.6% 4675 68 8 15 Seq. ID Nos. 338-405 and Seq. ID Nos. 665-704
28 CHR-2 120 88.3% 133 2 6 9 Seq. ID: 406 to Seq. ID: 407 and Seq ID: 705
29 ART2A 141 full length 13335 2 5 8 Seq. ID: 408 to Seq. ID: 409 and Seq ID: 706
30 Bov-tA 570 26.3% ref|NW_001495573.1 |Bt9_WGA1221_3 99.0% 5376 1 7 7 Seq. ID: 410 and Seq ID: 707
31 BOV2 122 50.0% ref|NW_001494076.1 |Bt22_WGA2384_3 94.0% 12323 2 7 11 Seq. ID: 411 to Seq. ID: 412 and Seq ID: 708
32 ART2A 125 44.8% ref|NW_001503113.1 |BtUn_WGA4440_3 84.0% 907 1 6 6 Seq. ID: 413 and Seq ID: 709
33 BTSAT4/5 52 full length 4498 2 7 11 Seq. ID: 414 to Seq. ID: 415
34 L1-BT 795 full length 405 21 7 15 Seq. ID Nos. 416-436 and Seq. ID Nos. 710-718
35 BOVB 62 full length 666 1 5 5 Seq. ID: 437 and Seq ID: 719
36 BOV2 235 full length 4881 2 6 10 Seq. ID: 438 to Seq. ID: 439 and Seq ID: 720
37 BCS 110 73.6% 1039 1 5 5 Seq. ID: 440 and Seq ID: 721
38 BTLTR1 905 full length 3077 19 11 15 Seq. ID Nos. 441-459 and Seq. ID Nos. 722-725
39 BTSAT3 266 63.9% 1102 1 5 5 Seq. ID: 460
40 BTSAT4 81 full length 1320 1 5 5 Seq. ID: 461
41 ART2B_BT 81 full length 8451 1 6 6 Seq. ID: 462
Corr5_FastaConsAll.txt
>Seq. ID: 1
TGTGTTGGGKGKGBTHBKGDGGWTGTGGRMCMACMCAMGAGGCMDCNNCTGRAWTSYCKTCGTRASWCSRGMMTCMYSCY GYMRCTCGAGAAMAACCMCGWGRYTCCCCCGTCATCGCRAGATGARGSCCTTKYCCSCTRCAKSGCCTMRGGASMAGTCY CRCGTTAGGWATTGGAGSTCGAAAGGGBACTTGRCACCCTTGATGCGACCCACAAAGTTCCCCGAVATCCCGGTCTCCCT CGAGAGGAACACYGAGGTTTTCCGGCACCMCYTCCTCTGAGCCCTTTCTCCCCTCCTGATCTGGACAGGAGGGTCGACTC CCCTGCTTTGTCTGGAAGGGGTTCCCGACCTTCCGGTCGCACCTCAGGATGAGGCCGGTCTCACGAAGACATTCCAGACG TGGCCTCGTGGGTGGTTCCACATTCCGTAGGACCCCGATTTCCCGGTCCCCTCTTGATAAGAACCCGATGCCCGGACACC TCTCCGAACTCCASCCTGTGAATGMAGTCAACACGAAGGGGCAGTNNYTTKYCCGTGCATCGTTCGGAAAAAACCCCAGG
ACGAGGCCTGACTCTCCTGTCCCCAGTCTGCAGGGACCCTGCGATCGGAGTCTGAAATCAGAGGWACCCTGMGGTTCCTG CCTCAACTGGAGATGAGGCCCTCTTCCAATGCACCAAVCCCAGTGGAGTCCCGAGAGGCCCCTCCCACCKCCAGTTTCCC TGRCTTCTCAGAGCCACCATGAGAAGCCCCCTGAGGTCACCTGCACAAGTCGAGGGAACCCMGGGTTTCCTGCCTCMACY
YYCCYTCGCMACTCGMATGGNGAYYSGACTTCCCTGGSSCMMCACRAGAGGCWBMCTGAMYTCSCCGTCGTAMCTCGNNA
CAKSGCCTMRVGASMARTCYCRCGWYMKCWCTCSRARCKCSWMAGGRBRCTTGRCWCCCTTGAKKCSACCCAGWGAGCTC CMAGAGATACCCGTCGCGAYTCGAGAGCAGAGCGGGGTTCTTTGCTTCCACTCGAGATGAATGCCTGTCTCCCCGGGTGC GTCTGGAATGCAACCCCGAGATCCCTGTCGCCCCTGGAGAGGAACATTGGCTTCTGGACACAAGCCTAGATGAGGTCTAT TGGCCCTGCAGTCACTCGAGAGCAATCCCCAGCTTTCCTTCGCAACTCGAATGGAAGATTGGACTTSCCTGGGCCAACAC AAGAGGCABCCTGAATTCCCCGTCGTAACTCGAGAATCCCGCCGYMRCTCGAGAAMAACCMCGWGRYTCCCCCGTCATCG
CCTTGATGCGACCCACAAAGTTCCCCGAAATCCCGGTCTCCCTCGAGAGGAACACYGAGGTTTTCCGGCACCCCCTCCTC TGAGCCCTTTCTCCCCTCCTGATCTGGACAGGAGGGTCGACTCCCCTGCTTTGTCTGGAAGGGGTTCCCGACCTTCCGGT
ATTTCCCGGTCCCCTCTTGATAAGAACCCGATGCCCGGACACCTCTCCGAACTCCASCCTGTGAATGMAGTCAACACGAA GGGGCAGTTTTTCCGTGCATCGTTCGGAAAAAACCCCAGGTTCCAAATACAGCTCGACAAGCGGCCTCTCTCCCCGGGGA CATCTCGAGAGGCAAGCGGAGTTCCATGCCTCAACCCAAGACGAGGCCTGACTCTCCTGTCCCCAGTCTGCAGGGACCCT GCGATCGGAGTCTGAAATCAGAGGWACCCTGMGGTTCCTGCCTCAACTGGAGATGAGGCCCTCTTCCAATGCACCAAVCC
YKKCRCMMSTSGAGRGRAVCMHKGGYTTCYKGMCWCAASCCKAGAHNGAGGTCWRWTDSCCCTDCMRTSACTCGAGAGCA ATSMCSMGCTYYCCYTCGCMACTCGMATGNAGAYYSGACTTSCCTGGSSCMMCACRAGAGGCWBMCTGAMYTCSCCGTCG TAMCTCGWGAGAAHCCGCACNCTNGSGCCGCMRCTCGAGAA-MAACCMCGWGRYTCCCCCGTCATCGMGGAGA
>Seq. ID: 2
TATGGAATTTAGAMGATGGTMYF^TAACCCTRTRTRCi^GACAGCMAAGAGAMACWGAYRTADAGAACAGWCTTVfTG
GACTBTGTGGGAGAGGGAGAGGGTGGGATGATTTGGGAGAATGGCATTGAAACATGTAWAATATCATRTATGAAAYGART
YGCCAGTCCAGGTTCGATGCAYGATACWGGATGCTTGGGGCTGGTGCACTGGGAYGACCCAGAGGGATGGTATGGGGAGG
GAGGAGGGAGGAGGGTTCA
>Seq. ID: 3
TTTTGT 1 1 1 I CATGTGTTTGTT-AGYTNKGTGBTWDHNADTHAAATTCAACAYCCATTTATGATAAAAACTCTCCAGAAA
AAAAYTGAAAGCATTTCCYCTAARRTCAGGAACAAGACAAGGRTGCCCACTYTCACCACTWCTATTCAACATAGTTTTGG ARGTTTTGGCCACAGCAATCAGAGCAGAAAAAGAAATAAAAGGAATCCAAATTGGAAAAGAAGAAGTAAAACTCTCACTR
Figure imgf000039_0001
TTAAGGAAACAAAGACGCTTACTCCTTGGAAGGAAAGTTATGACCAACCA
>Seq. ID: 4
GDGTTCTTTGARARGATAAAYAAAATTGACAAACCATTAGCCAGACTCATCAAGAAAMAAGRGAGAARAATCAAATCAAY
ATATGCCAATAAAATGGACAACBTRGAAGAAATGGACAAATTCTTAGAAAMGTACAACYTTCCAAAACTGAACCARGAAG
GGWCCAGAYGGCTTCACAGSTGAATTCTACCAAAMATTTARAGAAGAGCTAAYACCTATCCTDCTCAAACTCTTCCARAA AATTGCAGAGGAAGGWAMACTTCCAAACTCATTCTATGAGGCCACCATCACCCTRATACCAAAACCWGACAAAGAYNCCA CAAAAAAGAAAAYTACAGGCCAATATCACTGATGAACATAGATGCAAAAATCCI
AACAACACATYAAAAAGRATCATACACCATGA YCAAGTGGGMKTTTATΎCCCA
>Seq. ID: 5
GGGAGGCCTGGC
AHTGTDTMTATDTAHHAMAACAA
>Seq. ID: 6
TGTGTTGGGTNTGTTNGGTTTTGGGTCAAAGAGA I I I I I GACTATCCAATAAGAGTGTCTGGCACCATCCTGACATGTGA
AACGTAAGAAAGcwAA1111111 ΓGTΠTGTTTGGTTCTGATCTTGGGTGTGTGTGTGTGTGTGTGTGTGTGCGTGYGTG
YGGGGGAGGGGAGGGGCGCTACTATCACTGATAGGTATGTTCACCAATCTGTCGAATGCCAATATAAAAGTCAGTTCTTG GTTGCTGMGGGAAMTGTATGACTTGGTATTAAAAGAAGGTATGTGGCATGAAATCGCATTTTCTGAAGGAAAACGACGT CGTTCTTTCTTCCAGGTGGCAGTTTCAGGATGGGAGAAGCAAAGGAATGGGTACAGAATGGGATCAGGAGGTTGTAGGGA AMGTGGACCCCCAGAAMGAGTGTTGTGCCTAGCACMGGTTTCTTGAGAGGTTGAACTGCCTTTGCTAATGGATTTTAA
GTTTCTTTACCTCTTCCGTGATCTCTTCTAGGTTGACCTAGAASAGAMTCTTTTGTTAGCTTTGGCTAAGTGGAGAAGT ATTGCTTCACGGTGACTTGTGTGATCCTACCTGACTGTTCTAAACCCTTTTGATATCTATGCACTGCTGCTGCTGCTGCT GCTGCTGCTGCTGCTGCKAAGTCGCTTCAGTCGTGTCCGASTCTGTGYGACCCMMTVKDYGSSACCCCATGGACTGYAGC CYACCAGGCTCCTCYGTCCATGGGATTYTCCAGGCAAGAATACTGGAGTGGGTTGCCATTBCCTYCTCCARKGSATGAAA GTGAAAWGTGAWAGTKMAGTCGCTCAGTCGTGTCCGACTCTTTGYGACCCCATGGACTGYAGCMYRCCAGGCYYCYCTGT CCATSRVATTΎTCCAGGCAAGAATACTGGAGTGGGTTGCCATTBCCTTCTCCMRKRKMTMTTCMCAASCCWGCCTAAATC ATTGACTTCTAGCCACCTTTGGGATGCCAATGMCCACCCASCCTTACCAGGCCVCAGAGGAGAGAAAACACTTACCACAG GAGGTCCGGGTAAGGAACAAGGAACTAACAAGCTCCCACCAACCAGGATTCTGCAGAGGTCAACAAGAGGTCAGCAAGAG ATGGGAGACTGCAGTCCAGAGGTCCTAGCAACTTCCAGCGAGCAGCAAAGACCCAGCATAGTCAAAACGAATTCAATGTT
CAGGATACTCTTAAAGTATACTCTTAACRTAGATGAGGAGAACAAGGTAGGAGGAAAACTTACTGTTTGGTTCAGGTCGC TTCKGAGCCTTTTGGGGGAACCGACCTGAAGCCAGAGTCTAGCCGAAATACACAGGCTTTGAACATCGCTGAAGACCAAC ATTTCCTAAGGAAAAAACGCATTGCTTAGCTCAAGGTTTGTCTGGGAAATTCTTACACATCAGGAGATGGCTCCGAGGTG GATTTCTACAAAGAACATCCATGCAAAGGACAAGCCTGCACAGCTCTGTGCACATCTGCCCCCHCCCCCCCTCACAGCCA GCRGCATTTCCAAGTGACCTCCTAAGGCCTTCGCCAAACTACAGGGGCACCTCTGGGCTTCCTGTTTCCCTCAI ITTTCG GAATCTTTTCTAAAAAAMRAAAAACAAAAARCAMTTAAAAAAAAAAAACCCGCACATTTCTTCGTACAATGCCACAGT
GGMGGAACACAGGCCTTCMTATTMGACCTAMTTTGCAAAAAACTTGGAGCTTGG I I I Il I GTTTTGTTTGGTTTTG
TTTTGTTTGAATCAAACCACCACATCATTTAAAA Π III ACACGTTCTAATTATTCGGTTCCAAAACACCAAACTTGTAC
TGCACACGGTCCCCATGGAAAGGCCCCCTCAGCAGGACAGCCACAGATGCTGCCATGAAAGGGCAGCAGGGTGCCCTCCT
CCCTTGTTTGCTGCCTGGGCGGCCACCGGGTTTTGAAACCCATGCACCCTACTMGTTTCTAGGGTGCMTTCGTGGACC
CTTCCCTTTCCCAYCCCASCAGGTGGTCAAGGGCCCGCTGGCCTCCAAAGTGCCAGAAGCAGGGCTGGGCCAGCCGCCCA
GGCAGCAA--ACAAGGGAGGAGGGCACCCTGCTGCCCTTTCATGGCAGCATCTGTGGCTGTCCTGCTGAGGGGGCCTTTC
CATGGGGACCGTGTGCAGTACAAGTTTGGTGTTTKGGAACCGACC
>Seq. ID: 7
ACYATGATGGCTACTCCATTTCTTCTAAGGGATTCYTGCCCACAGTAGTAGATATAATGGTCATCTGAGTTAMTTCACC
CATTCCAGTCCATTTTAGTTCRCTGATTCCTARAATGTYGAYRTTCACTCTTGCCATCTCCTGTTTGACCACTTCCAATT
TRCCTTGATTCATGGACCTAACATTCCAGGTTCCTATGCMTATTGCTCTTTACAGCATCRGACYTTRCTTCYAYCACCA
GTCACATCCACAACTGGGTRTTG Il I I I GBTTTGGCTCCATCCCTTCATTCTTTCTGGAGTTATTTCTCCACTGATCTCC
AGTAGCATATTGGGCACCTACYGACCTGGGGAGTTCMTCTTTCAGTRTCCTATCWTTTTGCCTTTTCATWCTGTCCATGG
GATTCTCCAGGCAAGAATACTGGAGTGGGTTGCCATTYCCTTCTCCAGKGGATCAHAKTCTGTCAGABMTCTCMACCATG
WCYCDYCCRTCTTGGGTKGCCCCACRGGCATGGCTTAGTTTCATTGAGTTAGACMGGCTGTGGTCCNNGTGTGATTAGA
TTGRCTAGTTTTCTGTGAKTATGGTTTCAGTGTGTCTGCCCTCTGATGCCCTCTTGCAACACCTACCRTCTTACTTGGGT TTCTCTTACCTTGGRCGTGGGGTATCTCTTCACGGCTGCTCCAGCAAAGCRCAGCCRYTGCTCCTTACCTTGGAYGAGGG GTATCTCCTCACCRCCRCCCYTCCTGACCTTSAACGTGGRRTAGCTCCTCTMGGCCCTCCTGCRCCYRYGCAGCCACBRC TCCTTGGACGTGGGGTWGCTCCTCHCVGCCRCCVCYCCTGRCCTYVRRCRTGGGGTAGCTCCTCHCRSCNYSCYGCCCCT GRCCTCVGRCKTVGGGGHSRTGGGGTWGSTCC
>Seq. ID: 8
CCCTAGCASAGCCGCAGCGCAGGGGAACCAGCCCATGATCAGTGAGCTCCGGGCANTACTTTGGTCCCAGGTTTGGAGCC
CWATGC-CCGGAACGCGGGAGAGARBRGGYTGGNTGNSTGTGGGDGCYGTGDGTGGTGCCTGACTGSTGCCTCAGCTCAG
CTCCCGCGGGGAGAGCTCTTTCCGAGGGAAAGGAAGCGCCCCAGGCTGCATTCCCAAGCTGGTGCTTCCCCGCCCAGACT
GTGCCCAGAAAGCGCCTGCCAGACCGGGCCTTCCCTGGAGCTGTCCTCCCGCCTMKMGGCMGGNGGMTCTGTGCTABCGC
AGCCAGCCTCTTCTCTCCCGCGTTCCGGGCATWGGGCTCCAAACCTGGGACCAAAGTNACTGCCCGGAGCTCA
>Seq. ID: 9
TGTGTTGKGTGDWTAGACATTTCTCCAAAGAAGACATACAGATGGCYAACAAACACATGAAAAGATGCTCAACATCACTM
CTATGGARAACAGTRTGGAGATTCCTTAAAAAACTRRAAATAGAACTRCCATATGAYCCAGCAATCCCACTVCTGGGCAT AYACMCHGAGGAAACCADAAKKGAAARAGACACRTGTACCCCAATGTTCATHGCAGCACTRTTTAYAATAGCCARGACAT GGAAGCAACCTARATGTCCATCANCAGATGAATGGATAARRAAGHTGTGGTACATATACACAATGGAATATTACTCAGCC ATWAAAAAGAATRMATTTGARTCAKTTCTAATGAGGTGGATGAAMCTRGAGCCTATTATACAGAGTGAAGTAAGYCAGAA AGAAAAACACCAATACAGTATAYTAAYRCATATATATGGAATTTAGAAAGATGGTAACRAHAACCCTRTDTRCNARACAV MAAAASACTCACG
>Seq. ID: 10
TCGAWAAGAACCCGATGCCCGGACACCTCTCCGAACTCCASCCTGTGAATGMAGTCAACACGAAGGGGCAGTTTT
>Seq. ID: 11
GTDTTAMGTCYCCYACTATTATTGTGTTAYTGTYAATTTCTCCTTTYAWRNYTGTTAGYATTTGTCTTATRTATTGWGG TGCTCCTATGTTGGGTGCATAKATATTTAYAATTGTTATATCTTCTTSTTGRATTGATCCYTTKAYCATTATGTARTGDC CTTCTTTGTCTCTTTTBAWDYYTTTRKTTTAAAGTCTRTTTTRTCTGATATKAGTATTGCTACTCCWGCTTTCTTKTSD TYYCYATTTGCATGRAATATY I I I I I CCATCCCYTCACTTTCAGTCTRTRTGTGTCYYYWGDTYTGAGGTGGGTYTCTTG
TAGACARCATATRTAKGGGTCTTG I I I I I GTATCCATTC^GCCAGTCTKTGTCTTTTGGTTGGRGCATTYMYCCATTTA
CATTTMGGTMTTATTGATAWGTATGWTCCYRTTGCCATTTACTTTATTGTT^
GTGTTTCCTGTCTAGAGAAKWTCCTTTAGYATTTGTTGDARAGCTGGTTTGGTGGTGCTGAATTCTCTΎAGCTTTTGCTT
GTCTGWMAGCTTTTI^TTTCTCCTTCAWWTHTGAATGAKAKYCTTGCTGGGTABARTAWTCTKGGYTGTAGRTTHTTYT
CTTRCATYAYTTTRAGTATGTCYTGCCATTCCCTYCTGGCYTGWAGAGTTTCTRTTGAAAGATCAGCTGTTAKCCTKATG
GGVWTYCCCTTGTRTGTTATTTGTTGYTTTTCYCTTGCTGCTTTT^
GATTARTATGTGTCTTGGBGTGTTTCKCCTTGGGTTTATCCTGTWTGGGACTCTYTGKGYTTCTTGGACTTGRGTGAYTA
TTTCCTTYCCCATKTTAGGGAAGTTTTCARCTATTATCTCYTCAARTATTTTCTCAKGBYCTTTCI I I I ISTCTTCTTCT TCTGGGACYCCTATGATTYGAATGTTGGDGCRTTTMYATTGTCCCAGAGGTCTCTGAGRTTGTCCTCATTTCTTTTHAT TCDI 11 I ICI I I I I ICCTCTCTGHTTCATTTATTTCYACCATTCTATCTTCYACHHCACHWATCCTATCTTCTGCCCCAM
ACTTCCCTAAMATGGGRAAGGAAATARTCACCCAAGTCCAAGAARCMCAGAGAGTCCCAMCAGGATAAACCCAAGGMRA
AACACVCCAAGACACATAYTAATCAAATTMCAAAGATYAAACACAAASAAMAAATATTNAAAAGCAGCAAGRGARAARC
AACMRTAACACACAAGGGRAWTCCCATMAGRHTAACAGCTGATCTTTCAAYAGAAACTCTTCARGCCAGRAGGGAATGG
CARGACATACTWAARTRATGAAAGAAAAWMCCWA
>Seq. ID: 12
TAGATTCCCTATCTCTTCCTCTTT--NNTNTTKGGKTTGGTGGGCATTTATCMTGTTCCTTTAYCTGCTGRRTATT^CTC
TGYCTYTTCATCTTGTTTANATΓGCTGWGTTTGGGGTGKCCTTTCTGTATKCTGGMAGTTTGTGGWKTYCTCTTTATTGT
GGMGKTTCCTCVCTGTGKGTGGGKTTGKACRGGTGGCTTGTCAAGGTTTYYTGGTTAGGGAAGCTTGTGTCRGTGTTCTG
GTGGGTGGAGCTGGATΎTCTTCTCTCTGGAGTGCMTGMGTGTCCAGTARTGAGTTWTGAGATGTCTRTGGKTTTGGDG
TGACTTTGGGCAGCCTGTATHTTRAWGCTCAGGGCTRTGTTCCTKTGTTGCTGGAGAATTWGCTTGGTATGTCTTGCYCT
GGMCTTGTTGGCYCTTGGGTGGWGCTTGGTTTCAGTGTAGGTATGGAGGCDTTTGATGAGCTCYTGTCRATTMTGTTC
CCTGGAGTCAGGAGTTCYCTGGWGTCAGGDTTΓGGACTTAAGCCTCCTGCYTCYRGTTWTCRGTCTTATTYTTACAGTAG
YYTCAARACTTCTCCWTCTATACARCACCRWTGATAAAACATCTASGTTAAWGRTGAAAAGWTTCTCCACAGTGAGGRHC
ACYCAGAGAGGTTCACAGAGTTAYATRGAGMGAGAAGAGGGAGGARGGAGWYAGAGGTGRCCAGSSAGNGGTCACTCCC
TTATGTGCAC
>Seq. ID: 13
GTKTGCTGGYGDWRGGGGGDKGTGBTGCMGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAA
ACGACGGCCAGTGAATTGKAWTACGACTCACTATAGGGCGAATTGGGCCCGACGTCGCATGCTCCCGGCCGCCATGGCCG
CGGGATTKGCTGACAMGCAAGAGATTTTDTTGGGAAASGGCRCYSGGGYRGAGAGCAGKAGGGTAAGGGAACMCAGGAG
AACAGCTCTGCCRCRTGGCTCRCAGKCTYGGGTTTTATGGTGATGGGATTAGTTTCCGGGTTGTCTTTRGCCMTCATTC
TGAYTCAGAGTCCTTCCTGGTGGTRCACGCCTTGTTCAGCCMGATGGATGCCAGAGMGMGGATTCTGGGAGGTGGTCG
GACAYRTGGTGTCTCCTKTTGACCTYTSCHGAANNNTYCYGGTTGGTGGKRGCTTRTΓAGTTCCKTGTTCCTTACCMGGA CCTCCTGTSRTAAAACAACTCATGCAAATGGTTACTATGGTGCCTGGCCAGGGTGGGYGGTTTCARTCAGTGTGCTTCCC CWAACAHAATCA
>Seq. ID: 14
GGTTCAGGATGGC
WAAAAHAWAAAAWWWMAAAAWWWAAAAWARADWHAWWAWAADDWHMWAWAADWWGWWACTAHAG
>Seq. ID: 15
TTYCCGATTTCCCGGTCCCCTCTTGATAAGAACCCGATGCCCGGACMMMCYCC
>Seq. ID: 16
TGGGGGGAGCGCGTCWTTRCTCTCRAGTCATRGTAGGGGAATCTGGCCTCGAG
>Seq. ID: 17
GGAGAAATGTCTRTTTAGKTCTTTGGCCCAI I I I I I GATTGGGTΎRTTTR I I I I I CTGGWRTTGAGCTGYAKGAGYTGYT
TGTATATTTTKGAGATTARTYSTTTGTCAGTTGYTTCATTTGCWATTATTTTCTCCCATTCTGWRGGYTGTCTTTTCAYC TTGYTTATRGTTTCCTTTGYTGTGCAFl^AGCTTTTAAGTTTAATTAGGTCCCATTTGTTTA I I I 11 GYTTTTATTTCCAW TAYTCTRGGAGGTGGGTCATAGAGGATCYTGCTGTGATKTATGTCRGAGAGTGTTTTGCCTATGTTYTCCTCTAGGAGTT TTATAGTTTCTGGTCTTACRTTTAGRTCTTTAATCCATTTTGAGTTTA I I I I I GTGTATGGTGTTAGRAAGTGKTCTART
TTCATTCTTTTACAWGTRGYTGWCCAGTTTTCCCARCACCACTTRTTRAAGAGAYTGTCTTTWMTCCATTGTATATTCTT
GCCTCCTTTGTCAAAGATHAGKTGDCCATAKGTGYGTGGRTTTATYTCTGGGCTTTCTATTYTGTTCCATTGATCTATAT
KTCTGTYTTTGTGCCAGTACCATACTGTCTTGATDACTGTRGCTTTGTAGTAKAGYCTGAAGTCAGGHAGGTTGATTCCT
CCAGYTCYRTTΎTTCTTTCTCAAGATTGCTTTGGCTATTΎRRGGTY I I I I GTRTTTCCATACAAATTKTRAAATTWTTTG
TTCTAGYTCTGTGAAAAATRCCRTTGGTADYTTGATAGGGATTGCATTGAATCTRTARATTGCTTTGGGTAGTATASWCA
TYWTMRYRAYADTGATTMTTCTNMCCAAAARYMTGKWHTTTHCATCTRTDTCTYYTTTGCTKYCYTGCATVTAGGATCRT
TGWYYTATNANNTTNNNTGNANNAGNNTCTΓTNNTTTCNTTCCTTGTAA
>Seq. ID: 18
GGTKWGGGGBYCC
TTCTCTGTTGCTYRGTGYRTCAGGYRCTTAAADGVCYMCCCTGSCTGGGGTCCTTCTCTGTTGCTTGGYGYRTCAGGCRC
TTAAAKGGCCHCTGVCTGGGGTCCTTCTCTGTTGCTYRGTKYRYCAGGHRCTTAAAKGVCSMCCCTGSCTKGRGTYYTTC
WCTGTTGCTTGGTGYRTCWGGHRCTTAAATGGCCMCMYTSCCTGGGKTCCTTHTCTGTTGCTTGGTGCATCAGGCACTAA
AGCCTGCCCCCBTCCCCCCCCACMCCMAACACACCCACACA
>Seq. ID: 19
CCATGGGGTCRCAAAGAGTCRGACACGACTGAGCGACTGAACTGAACTGAACWKAHAATHDTTTCTCMDGRTGBTDHHAD
AHACTCHTVTDTGDATAGTCAAAMTCTCTTTTGTTTBTGTGTAAMGGGAGGTCCTTTCAAGGCGTGAATGTTTCAGAA
CTTKAATTTATTTGGAMTGACCCAGCTCTTCAGTACACTGTCGTCACTTAGTTTAGCACAGGATAGAAACTCGGGTAAC
CAAAACACCTGGAGAAAHDATTDTHWGTTCAGTTCAGTTCAGTCGCTCAGTCGTGTCYGACTCTTTGYGACCCCATGRAY
YGCAGCACRCCAGGCYTCCCTGTCCATCACCAACTCCHGGAGTTYRCTCAAACTCAYGTCCATYGAGTCRGTGATGCCAT
CCARCCATCTCATCCTCTGTCGTCCCCTTCTCCNCTGCCCYCAATCWWNCCYMSCAKCAKNNGRTCTTTTCCAATGWRDY
HASTCTTCAYADKKTRBCHAAMVSTCAYATSACATTTMAKYTTTMGTWTTGTTCCTGTTCTTTCGAGGTACTTTCTYGT
TGACAAGCTTGTGACAGATAGAACGATATAGCAAI I I I IACCTTAGAACAAACCGAGGCACTATGAACATTTTGTGCTTC
ATGTTGATGACTCTTAGACATGTCTACAGTAGAGGAGCAAAAACAAAACTACTAGATATTTCATATTGACTAGTTCCCAG
TTCACGGGACTCTGACATTCCCTGAGGTCAAAGTTTTCTTGTATTGGAAGCAGTTGGGTTTGCAAGGGCTGCCTTGTCTT
GAGACCATTGAAATAAGAACTCAGAACTTGAGCACTATTATCAAAAATCACAAGGCTCACACTGACACAGACATCAATCC
AGACAGACMGACMGACATCTTCCAGTTTTCCGCCTGAGATGGAMAGATTTCTTTGAGCCG I I I I I I CTGGGGAGTGG
GGGGTGGGGCTGGCGGCCAGGCAGGCTTTGGGAAGGAGCTCATGGTTTGGAATTGCATATGAAAAGAACCAGCTTTCCGG
GTTCCAAGGAATCMGTTTCCTTGGAAACCAACTTTGTCCGGTTCTRTAGAATATCATGGCCCTCCTAGGTCAGGTCGTC
TTTCCTTTCCGTAGCCCTCGTTKGATCCCTGMTTCTAGACAGCTTGGATTGCCTCTGTGGGCTGGATGGATTGTCCTCC
CGTTTCACCGGGCGGCGGGAGCGAGGTCCCAGAGGCTCTCCTGGAACCGGGCGKRSRRKCGGGGCTCACCGGGAGCCCGT
GGTGAAGGTGGGNNGACCC
>Seq. ID: 20
TTCCATTCACCATTGCAWCNAAAAGAATAAAATACYTAGGAATAWAYCTACCTAARGARACDAAAGACCTVTANAHAGM
AACTAYAAAACACTGVTGAMGAAATCAAAGADGACACWAAYARATGGARARATATWCCATGTTCATGGATTGGAAGMT
CAATATWGTGAAAATGASTATACTACCCMNCAATYTAYAGATTCAAYRCA
>Seq. ID: 21
TGTMTAYMTAGAAMCCCTAAAGACTCMACCMRAAAWYTMCTWRARCTRATMARYRAHTWYAGYAAAGTYKCAGGATAYA AAATCAAYRYACARAAATCMCWWGCATTCYTATACACYAAYAAYRRRMAAACAGARAGMSAAATYAWGRRWRMAMTYCCA TTCACMATTGCWWCRMRAGMTAAAATACYTAGGMTMHAWCTWMCWARRGADRYDAARGACCTVTWYAHRGARAACTA
TWGTGAAMTGASTATACTACCCMAGCMTYTAYAGATTCMTGCMTCCCTATCMRHTACCMYGRYATTYTTCACA
GAACTAGMMAAAHAATTTYAMMTTYRTATGGAAMYAMMAARASCYCRAATAGCCAARDCAATCYTRAGMAARMGM
YRRARCTGGAGGMATCACG
>Seq. ID: 22
TGTTGGGTGTGTTTGNGTGTKTTTGKTITI ITSTTTYTGDSTTABTTCACTBWGTATRATRGBYTCYAGKTTCATCCAYS
TYVYTRSAAMTGAYWYDAWTKYATTCTTTTTWATGGCTGAGTARTATTCCATTGTGTATATGTACCACAKYTTBYTWATC
CAKTCWWYYRYTGATGGACATYTRGGTTGBTTCCAWGTCYTKGCTATTRTRMTAGTGCTGCDATRAACATWSGKGTRCA
TGTRTCTTTVfTVRHAKHATGDTTTMKWWTCCTYWGGGTATATRCCCAGBARTGGGATTGCTGGGTCAWATGGtAKTTCTA
KTTCTAGWTCCYTG
>Seq. ID: 23
TWGTΓΓBWGSTGTACAGCAMGTGAWTCAGTTATACATAYACATATATCYANTCTTTTTYAGATTCTTTTCCCATATAGG
TYATTACAGARTATTGAGTAGAGTKTTCHKGTAKAYATATNSTCA
>Seq. ID: 24
GTGGAAGAGGGCCTCATCTCCAGTTGAGGCAGGAACCGCAGGGTACCTCTGATBGHGGCGT
>Seq. ID: 25
GTAGCTTGGTTTACGCGGAAGACCAATCAAACTTCAAGACAAGAAGTTTGCACCACTTACGTAGGCCGCAGGCGCCCTCT CGAATAGCGAAAGGTGCCTCACCCTAGACACCTTCTCGAGTGGGTCTTAGCAGCCCAGGCATAATTAGTAAGCGTGGTGG GTTCCGCGCTCCAGATGGAGACTCAGCTGGAAGTTAAAGGGAAGAATGACAAGGAACTTTATGAATTGGAGCTGTAAGTT AACTCTTTGACAGAGAGAGCGAGATGGTGGTGGGGGACAGCCCCCMGTRAARTCAGAGGTGAGAGCACAAAGCAADAMAG TAGGCAGACTCYGGTTTTNNNNGGGGGKAKATGCTCGAGAATWTCCRGGKGGACTCCTGAGGCTCGATCCCGCMTTTGCG TATGCCGAGCCTCCTTCCTCATGACCTTTGTCMWGRGYGGARTKCCTCMCYGGCTCCMGSCACRTGATCAGTGCATGMTC AGYCACGTGATCAGTGCMTGMTCAGCCACGTGATCAGTGHMTGVWCAGYCACGTGATCAGTGCMTVTCAGYCACGTGATC
GYCACGTGATCAGTGCMTGMWCAGYCACGTGATCAGTGCMTNTCAGYCNSKGATCASTGCMTGATARHCACGTGATCAGT
GCC
>Seq. ID: 26
ACCTCWGAGCCCCTTCTCCCCTCCTGATCTGGACAGGAGGGTCGACTCCCCTGCTTT
>Seq. ID: 27
TGTTTTCATGGCTGCAGTCACCATCTGCAGTGATTTTGGAGCCCNMVAAAAATAAAGTCTGWCACTGTTTCCACTGTTTC
CCCATCTATTTSCCATGAAGTGATGGGACCRGATGCCATGATCTTMGTTTTCTGAANTTGAGYTTKNTTWDTCCAACTCT
CACATCCATACATGACYACTGGAAAAACCATAGCYTTGACTAGAYGGACCTTTGTTGGCAAAGTAATGTCTCTGCTTTTK
AATATGCTRTCTAGGTTGGTCATAACTTTYCTTCCAAGGAGYAAGCRTCTTTTAATTTCATGGCTGCAGTCACCATCTGC
AGTGATTTTGGAGCCCMVVAAAATAMGTCTGHCACTGTTTCCACTGTTTCCCCATCTATTTSCCATGAAGTGATGGGAC
CRGATGCCATGATCTTHGTTTTYTGAATGTTGAGYTTTAAGCCARC I Π I I CACTCTCCTCTTTCACTTTCATCAAGAGG
CTYTTTAGTTCYTCTTCRCTTTCTGCCATAAGGGTGGTGTCATCTGCATATCTGAGGTTATTGATATTTCTCCYRGCAAT
CTTGATTCCAGCTTGTGCTTCWTCCAGCCCAGCRTTTCTCATGATGTACTCTGCATATAAGTTAAATAAGCAGGGTGACA
ATATACAGCCTTGACGTACTCCTTTYCCWATTTGGAACCAGTCTGTTGTTCCATGTCCAGTTCTAACTGTTGCTTCCTGA
CCTGCATACAGRTTTCTCMGAGGCAGGTCAGGTGGTCTGGTATTCCCATCTCTTTMAGAATTTTCCACAGTTTRTTGTG
ATCCACACAGTCAAAGGCTTTGGCATAGTCAATAAAGCAGAARTAGATGTTTTTCTGGAACTCTCTTGC I I I I I CBATGA TCCARYRGATGTTGGCMTTTGATCTCTGGTTCCTCTGCCTTTTCTAAAWCCAGCTTGAACATCTGGAAGTTCAYGGTTC ACRTAYTGYTGAAGCCTGGCTTGGAGAATTTTGAGCATTACTTTRCTAGCRTGTGAGATGAGTGCAATTGTGYGGTAGTT TGAGCATTCTTTGGCATTGCCTTTCTTTGGGATTGGMTGAAAACTGACCTTTTCCAGTCCTGTGGCCACTGCTGAGTTT TCCAMTTTGCTGGCATATTGAGTGCAGCACTTTCACAGCATCATCTTTYAGGATTTGAMTAGCTCAACTGGAATTCCA TCACCTCCACTAGCTTTGTTCRTAGTGATGCTTYCTMGGCCCACTTGACTTCACATTCCAGGATGTCTGGCTCTAGGTG AGTGATCACACCATCRTGATTATCTGGGTCATGMGATC I l I I I I GTACAGTTCTTCTGTGTATTCTTGCCACCTCTTCT TAATATCTTCTGCTTCTGTTAGGTCCATACCATTTCTGTCCTTTATYGWGCCCATCTTTGCATGAMTGTTCCCTTGGTA TCTCTAATTTTCTTGAAGAGATCTCTAGTCTITCCCATTCTRTTGTTTTCCTCTATTTCTTTGCATTGATCRCTGARGAA GGCTTTCTTATCTCTYCTTGCTATTCTTTGGMCTCTGCATTCAGATG B KTATATCTTTCCTTTTCTCCTTTGCTTTTYR CTTCTCTTCTTTTCACAGCTATTTGTAAGGCCTCCYCAGACARCCATTTTGC I I I I I I GCATTTC I I I I YCWTGGGGATG
GTCTTGATCMCTGTCTCCTGTACAATGTCACRAACCTCHDTCCATAGTTCWTCAGGCACTCTRTCTATCAGATCTARTCC CTTRMTCTATTTSTCACTTCCACTGTATMTCATMGGGATTTGATTTAGGTCATACCTGMTGGTCTAGTGGTTTTCC CTACTTTCTTCAATTTAAGTCTGMTTTGGCMTAAGGAGTTCATGATCTGAGCCACAGTCAGCTCCYGGTCTTG I I I I I GCTGACTGTATAGAGCTTCTCCATCTTTGGCTGCAAAGAATATAATCAATCTGATΠΎGGTRTTGACCATCTGGTGATGT CCATGTGTAGAGTCTTCTCTTGTGTTGTTGGAAGAGGGTGTTTGCTATGACCAGTGCRTTCTCTTGGCAAAACTCTRTTA GCCTTTGCCCTGCTTCATTCYGTAYTCCMGGCCAAATTTGCCTGTTACTCCAGGTRTTTCTTGACTTCCTACTTTTGCA
TTCCAGTCCCCTATAATGMMGGACATC I I I I I I GGGTGTTAGTTCTARMGGTCTTGTAGGTCTTCATAGMCCRTTC
AACTTCAGCTTCTTCAGCRTTACTGGTTGGGGCATAGACTTGGATTACTGTGATATTGMTGGTTTGCCTTGGAAAYGM
CAGAGATCATTCT-GTCR I I l I I GAGATTGCATCCAAGTACTGCATTTCDGACTCTTTTGTTGCATCCMGTACTGCATT
CVGACTCTTTTGTTGACYATGATGGCYACTCCHA
>Seq. ID: 28
TGCACAGGCTCTAGGCDCVYGGGCTTCAGTAGTTGCRGCACRYGGGCTCAGTAGTTGYGGCTCMTGGSYYTAGWKKSYVC
AGGCNTCARTAGTTGYRGCANVCAGRCCVTAGAGTGCGAG
>Seq. ID: 29
GGTGAACTCCRGGAGTTGGTGATGGACAGGGAGGCCTGGYGTGCTGCRRTYCATGGGGTCMAARRRKCGGACACGACTGA
GCGACTGAACTGAACTGAAHHTDMAHTHTCADTHHTTKDHKDAGGRAAGCCCCCTGGCTTC
>Seq. ID: 30
GTTTTRTTCCTGTGTGTTCTTGCCTCCANTGTCCACAGCTRTCAGAACTAGTGTGTTTTY Ni l GTGGGAGCTCKCAATG
WCCTTTTAYATATTCCANNNNCASAGTCTGCCTAGTTGATCRTGTGGATTTAATCTGCAGCTTGTACAGCTGGTGGGMG
GTTTTGGGTCTTCTTCCTTAGCCACACTGCCCCTGGGTTTC^ATTGTGGTTTTATTTCCACCTCTGCATGTGGGTCRTCC
ACTGGGGTTTGCTCCTGAGGCTGCCCTGGAGGACTTGGGTTTGCCCCTGTGAGGGCCAGGTGTGGAGGTGGTGCAGCTGC
TTGGGTYRCAGGGGTTCTGGCAGCACCAGGTACTCAGGGGAGTTGGYRGCTAGRGCAGCAGGAAATAYAGTGBTCTRGAA
GGDTATGGGARRAAGVAAWKGSCAACCCACTCCAGTATTCτTGCCTGGARAAτCCNW--NNτGGACRGAGGAGCCτGGYR
GGNCAGTCCATGGGGTCRCAMAGAGTCRGACAYGACTGNAGCGACTDAGCACACAYAVACACAAVACTCTTTTTGCMTGT
GGCARBTCTG
>Seq. ID: 31
ATCACADTGTGACTGGTGATDGDAGBAAADTHHAATDHTATAAADAAHAHDATTDHWTCAGTTCAGTTCAGTTCAGTCRC
TCAGTCGTGTCCGACTCTTTGCGACCCCATGAAYYGCAGCAC
>Seq. ID: 32
CACCAAHACABCAADWTCAGTBTDGAGAAAAADAAATDDTDDADDHADTBHABTTTADHTDDDDTCAGWTCAGWTCAGTT
CAGTTCAGTCRCTCAGTCGTGTCCGACTCTTTGCGACCCCATGAA
>Seq. ID: 33
TTTGTGGGTCGCATCAAGGGTGCCAAGTGCCCTTTCGACCTCCAATTCCTA
>Seq. ID: 34
TGTTAYATRGAGAAGAGAAGAGGGAGGAGGGAGWTAGAGGTGACCMRRAKGAGAWGAGGKGGAATCAAWAGDGGAGAGAG
HRRKCTAGCCAGTAATCASTTCCYTAWGTGYWCTCCACMRCTGGAMCACNCAGARAKNNTCACRGAGTTRBRYAGAGAAG
ATCASTTCCCTAAGTGTTCTCCACMGTCTGGAACACACAGARATTCACAGAGTTRGRTAGAGWAGAGARGGGTKAGGGAG
AGTAGAGKTTGGAATTΓCAAAMTACRATGTTAAAGAAAAGAAGAAGRAAAAGAAAGAGAGAAAAAAHDAACAAACAAAA
AMAAAMAARGTMVYRAARATTAI^NNGWWMWTAAAGGTACAAMTTGATAACAAATACMWAAAAGCAAARATTAAAAATC
TAGAGTAGAGKTTGGAATTTCAAAAATACRATGTTANAAAARAAGAAGRAAAAGAAAGAGARAAAAAAMRAACAAACAAA
NAARAAMAAACAARGTMCVHAWAAATTATAAAGAAAATNAYAGGTACNAAAATWGAYAACWAAHMCCAAAMACAT
>Seq. ID: 35
GACCAACCTAGAYAGCATATTMAAAAGCAGAGACATTACTTTGCCAACAAAAVYCCMATGGA
>Seq. ID: 36
CTGTRAAGAARGCTGAGYRCYRAAGAATTGATGCTTTTGAACTGTGGTGTTGGAGAAGACTCTTGAGAGTCCCTTGGACT
GCAAGGAGATCCAACCAGTCCATYCTAAAGGARATCAGYCCTGRGTDTTCWTTGGAAGGAMTGATGCTRAAGCTGAAACT
CCARTACTTTGGCCACCTSATGYGAAGAGYTGACTCATTGGAAAAGACYCTGATGCTGGGARRGATTGRGGGCAG
>Seq. ID: 37
AADAHADAWAAWADHAHTWAHHTCAAACTG
>Seq. ID: 38
CCCCACCACCTCTTTCGGAGWAGIMRVWARMBTWGAGCTTACAGTTCMGTTAATAATTCCTGGGTGTGACAGTGTTTAAC
CTACAAACTccTTTGGAARTccTCTAGCCTGCCTGAATAGGi I I I ΓCCGGCCACATGKGATTGTTCAGAGCCTCCCAACT
GTGAGAGGCAGGAGATGTTCTMACTGTCTAAACACAGATTCTTTTGAGKAGTTACAAGATTGATTAGAAATTGTATTGG
TGAATGG I I I I I CACTTGTTGGGCCATTGTTTGCTGCTAAGTTTCCATATCCCTTACCTGCTGTGTCCCTGGCAGTGTAT
TGATTAATATMTTGGTGTAAGTAGTAGCTTTAATGTTTGTAACCTGGGACCCTTGAGTTAATTCT I I I l CTTGTTATAG
CCCACCACACCTTTGCTCTGTAGGAATGCMCTTTATCTMTGC I I I I I TGGAGGGTGGCTCCTGACCMCCACCTTTAG
AGAAAMTMGTTTTCTGAAGAAAAGGTCTTMAATGTTMCAGGCCTCCGGGCCAGAAGATGATGCAMTCACCTAAGC
TTTTGCATATGATAAGTTTGCAGGAAGAAAGCCTGGTTTGCTGCAAGACTCGACCCCHHCCCCCNMMNATTATCCTCTAT
GCATMCTTMGGTATMAAACTACTTTGAAAAATAAAGTGCGGGCCTTGTTCACCGAAACTTGGTCTCACCATGTCGTT
CTTTCTCTTACCTTCTGGCTGMTTAYTCAGYCTCTTTTCTCCACYGAATTTYCYYACTGAGCTMTCCTCATWCTATTAY
TCTTKAYATCYYTRATTARCATWTA/
GATCAGYYGGGGCTGGTCCCCGGCA
>Seq. ID: 39
ATCAGTGCATGANAGBCACGTGATCAGTGCAT-NGMTCAGYCACGTGATCAGTGCMTGATAGYCACGTGATCAGTGCATG
MTCAGYCACGTGATCAGTGCMTGANAGYCACGTGATCAGTGCATGMTCAGYCACGTGATCAGTGCMTGMTCAGYCACGTG
ATCAGTGCMTGANAGYCACGTGATCAGTGCATGMTCAGYCACGTGATCAGTGCMTGANAGYCACGTGATCASTGCATGMT
CAGYCACGTGATCAGTGCMTGATAGC
>Seq. ID: 40
ATGATCAGYCACGTGATCAGTGCATGATCAGYCACGTGATCAGTGCATGANAGCCACGTGATCAGTGCATGATCAGYCAC
A
>Seq. ID: 41
GAGGAGAAGGGGACGACAGAGGATGAGATGGCTGGATGGCATCANSANCRATGGACRTGAGTYTGAGTRMCTCCGGGAG
C
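
Note on reading the listing: the consensus sequences above appear to use the standard IUPAC nucleotide ambiguity codes (for example R = A or G, Y = C or T, W = A or T, N = any base) to mark positions where the aligned reads differ. The following is a minimal Python sketch, not part of the original filing, that lists the ambiguous positions in one sequence. The function name `ambiguous_positions` is hypothetical, and the example string is Seq. ID 26 copied as it appears in this listing, so any OCR damage in the listing carries over.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def ambiguous_positions(seq):
    """Return (1-based position, code, possible bases) for each ambiguity code.

    Characters outside the IUPAC table (gaps such as '-', spaces, OCR debris)
    are skipped rather than guessed at.
    """
    hits = []
    for pos, ch in enumerate(seq.upper(), start=1):
        bases = IUPAC.get(ch)
        if bases is None or len(bases) == 1:
            continue  # unambiguous base, gap, or unreadable character
        hits.append((pos, ch, bases))
    return hits


# Seq. ID 26 as printed in the listing above (hypothetical usage example).
seq_26 = "ACCTCWGAGCCCCTTCTCCCCTCCTGATCTGGACAGGAGGGTCGACTCCCCTGCTTT"
print(ambiguous_positions(seq_26))
```

Positions are counted over the characters of the string as given, so gaps and unreadable characters still advance the position counter; whether that matches the numbering used in Table 1 is an assumption that would need to be checked against the table itself.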

Claims

WHAT IS CLAIMED IS:
1. A method of detecting prion disease in a subject, the method comprising detecting the presence of a single nucleotide polymorphism (SNP) in nucleic acid present in an acellular fluid sample from the subject.
2. The method of claim 1, wherein the prion disease is bovine spongiform encephalopathy.
3. The method of claim 1, wherein the SNP is present in a non-coding region.
4. The method of claim 1, wherein the SNP is present in a reference sequence set forth in SEQ ID NOs 1-41.
5. The method of claim 4, wherein the SNP comprises one of the polymorphic positions set forth in Table 1.
6. The method of claim 1, wherein the acellular fluid sample is serum.
7. The method of claim 1, wherein the acellular fluid sample is plasma.
8. The method of claim 1, wherein the nucleic acid sample comprises DNA.
9. An oligonucleotide that specifically hybridizes under stringent conditions to a sequence comprising a SNP position, or to the complement thereof, as set forth in Table 1, wherein the oligonucleotide is competent to discriminate between the reference sequence set forth in Table 1 and a polymorphism present at a SNP site as set forth in Table 1.
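
Illustrative note (not part of the claims): claim 9 recites an oligonucleotide that distinguishes the reference base from the variant base at a SNP position, on either strand. As a rough sketch of what a pair of such allele-specific sequences looks like, the Python below builds a window around a SNP for both alleles and returns the reverse complements as well. Everything here is an assumption for illustration: the function names, the flank length, and the toy reference and SNP are hypothetical, Table 1 is not reproduced in this section, and real probe design would also have to account for melting temperature and hybridization stringency.

```python
# Complement table for unambiguous DNA bases only.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def allele_probes(reference, snp_pos, alt_base, flank=10):
    """Build reference- and variant-allele sequences centred on a SNP.

    reference -- unambiguous A/C/G/T reference sequence
    snp_pos   -- 1-based SNP position within `reference` (hypothetical input)
    alt_base  -- variant base at that position
    flank     -- bases kept on each side of the SNP (arbitrary default)
    """
    i = snp_pos - 1
    if not 0 <= i < len(reference):
        raise ValueError("snp_pos lies outside the reference sequence")
    if alt_base == reference[i]:
        raise ValueError("alt_base equals the reference base")
    start = max(0, i - flank)
    end = min(len(reference), i + flank + 1)
    ref_probe = reference[start:end]
    alt_probe = reference[start:i] + alt_base + reference[i + 1:end]
    return {
        "ref": ref_probe,
        "alt": alt_probe,
        "ref_rc": reverse_complement(ref_probe),
        "alt_rc": reverse_complement(alt_probe),
    }


# Toy example: A>G change at position 5 of a made-up 10-base reference.
probes = allele_probes("ACGTACGTAC", snp_pos=5, alt_base="G", flank=2)
print(probes)
```

The two forward sequences differ only at the central base, which is the property an allele-discriminating oligonucleotide relies on; the reverse complements correspond to targeting the complementary strand.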
PCT/US2008/083420 2007-11-14 2008-11-13 Detection of nucleic acid sequence variations in circulating nucleic acid in bovine spongiform encephalopathy WO2009064897A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US98806607P 2007-11-14 2007-11-14
US60/988,066 2007-11-14

Publications (2)

Publication Number Publication Date
WO2009064897A2 true WO2009064897A2 (en) 2009-05-22
WO2009064897A3 WO2009064897A3 (en) 2009-12-30

Family

ID=40639436

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/083420 WO2009064897A2 (en) 2007-11-14 2008-11-13 Detection of nucleic acid sequence variations in circulating nucleic acid in bovine spongiform encephalopathy

Country Status (1)

Country Link
WO (1) WO2009064897A2 (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040161758A1 (en) * 2000-03-06 2004-08-19 Scott Seiwert Detection of infectious prion proteins and associated antibodies using nucleic acid sensor molecules
WO2003054143A2 (en) * 2001-10-25 2003-07-03 Neurogenetics, Inc. Genes and polymorphisms on chromosome 10 associated with alzheimer's disease and other neurodegenerative diseases

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12065703B2 (en) 2005-07-29 2024-08-20 Natera, Inc. System and method for cleaning noisy genetic data and determining chromosome copy number
US11306359B2 (en) 2005-11-26 2022-04-19 Natera, Inc. System and method for cleaning noisy genetic data from target individuals using genetic data from genetically related individuals
US11519035B2 (en) 2010-05-18 2022-12-06 Natera, Inc. Methods for simultaneous amplification of target loci
US12221653B2 (en) 2010-05-18 2025-02-11 Natera, Inc. Methods for simultaneous amplification of target loci
US11286530B2 (en) 2010-05-18 2022-03-29 Natera, Inc. Methods for simultaneous amplification of target loci
US11312996B2 (en) 2010-05-18 2022-04-26 Natera, Inc. Methods for simultaneous amplification of target loci
US11322224B2 (en) 2010-05-18 2022-05-03 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US12410476B2 (en) 2010-05-18 2025-09-09 Natera, Inc. Methods for simultaneous amplification of target loci
US12270073B2 (en) 2010-05-18 2025-04-08 Natera, Inc. Methods for preparing a biological sample obtained from an individual for use in a genetic testing assay
US11326208B2 (en) 2010-05-18 2022-05-10 Natera, Inc. Methods for nested PCR amplification of cell-free DNA
US11332793B2 (en) 2010-05-18 2022-05-17 Natera, Inc. Methods for simultaneous amplification of target loci
US11332785B2 (en) 2010-05-18 2022-05-17 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11339429B2 (en) 2010-05-18 2022-05-24 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11306357B2 (en) 2010-05-18 2022-04-19 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US12152275B2 (en) 2010-05-18 2024-11-26 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US12110552B2 (en) 2010-05-18 2024-10-08 Natera, Inc. Methods for simultaneous amplification of target loci
US11408031B2 (en) 2010-05-18 2022-08-09 Natera, Inc. Methods for non-invasive prenatal paternity testing
US11525162B2 (en) 2010-05-18 2022-12-13 Natera, Inc. Methods for simultaneous amplification of target loci
US12020778B2 (en) 2010-05-18 2024-06-25 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11482300B2 (en) 2010-05-18 2022-10-25 Natera, Inc. Methods for preparing a DNA fraction from a biological sample for analyzing genotypes of cell-free DNA
US11939634B2 (en) 2010-05-18 2024-03-26 Natera, Inc. Methods for simultaneous amplification of target loci
US11746376B2 (en) 2010-05-18 2023-09-05 Natera, Inc. Methods for amplification of cell-free DNA using ligated adaptors and universal and inner target-specific primers for multiplexed nested PCR
US12100478B2 (en) 2012-08-17 2024-09-24 Natera, Inc. Method for non-invasive prenatal testing using parental mosaicism data
US11414709B2 (en) 2014-04-21 2022-08-16 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11371100B2 (en) 2014-04-21 2022-06-28 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11319595B2 (en) 2014-04-21 2022-05-03 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11408037B2 (en) 2014-04-21 2022-08-09 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11486008B2 (en) 2014-04-21 2022-11-01 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US12305229B2 (en) 2014-04-21 2025-05-20 Natera, Inc. Methods for simultaneous amplification of target loci
US11319596B2 (en) 2014-04-21 2022-05-03 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US12203142B2 (en) 2014-04-21 2025-01-21 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11390916B2 (en) 2014-04-21 2022-07-19 Natera, Inc. Methods for simultaneous amplification of target loci
US11530454B2 (en) 2014-04-21 2022-12-20 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US12260934B2 (en) 2014-06-05 2025-03-25 Natera, Inc. Systems and methods for detection of aneuploidy
US11479812B2 (en) 2015-05-11 2022-10-25 Natera, Inc. Methods and compositions for determining ploidy
US11946101B2 (en) 2015-05-11 2024-04-02 Natera, Inc. Methods and compositions for determining ploidy
US12146195B2 (en) 2016-04-15 2024-11-19 Natera, Inc. Methods for lung cancer detection
US11485996B2 (en) 2016-10-04 2022-11-01 Natera, Inc. Methods for characterizing copy number variation using proximity-litigation sequencing
US11519028B2 (en) 2016-12-07 2022-12-06 Natera, Inc. Compositions and methods for identifying nucleic acid molecules
US11530442B2 (en) 2016-12-07 2022-12-20 Natera, Inc. Compositions and methods for identifying nucleic acid molecules
US12084720B2 (en) 2017-12-14 2024-09-10 Natera, Inc. Assessing graft suitability for transplantation
US12024738B2 (en) 2018-04-14 2024-07-02 Natera, Inc. Methods for cancer detection and monitoring
US12385096B2 (en) 2018-04-14 2025-08-12 Natera, Inc. Methods for cancer detection and monitoring
US12234509B2 (en) 2018-07-03 2025-02-25 Natera, Inc. Methods for detection of donor-derived cell-free DNA
US20210198733A1 (en) 2018-07-03 2021-07-01 Natera, Inc. Methods for detection of donor-derived cell-free dna
US12305235B2 (en) 2019-06-06 2025-05-20 Natera, Inc. Methods for detecting immune cell DNA and monitoring immune system
KR102241471B1 (en) * 2019-10-31 2021-04-15 전북대학교산학협력단 Primer for detecting bovine spongiform encephalopathy-associated somatic mutation

Also Published As

Publication number Publication date
WO2009064897A3 (en) 2009-12-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08849871

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08849871

Country of ref document: EP

Kind code of ref document: A2