Nicodeme et al., 1999 - Google Patents
Motif statisticsNicodeme et al., 1999
View PDF- Document ID
- 16892342279041251238
- Author
- Nicodeme P
- Salvy B
- Flajolet P
- Publication year
- Publication venue
- European Symposium on Algorithms
External Links
Snippet
We present a complete analysis of the statistics of number of occurrences of a regular expression pattern in a random text. This covers “motifs” widely used in computational biology. Our approach is based on:(i) classical constructive results in theoretical computer …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G06F17/3069—Query execution using vector based model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/22—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or SNP [Single-Nucleotide Polymorphism] discovery or sequence alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30533—Other types of queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F17/30 and subgroups
- G06F2216/03—Data mining
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Nicodeme et al. | Motif statistics | |
Nicodeme et al. | Motif statistics | |
Chiu et al. | Probabilistic discovery of time series motifs | |
Bille | A survey on tree edit distance and related problems | |
Sakakibara et al. | Stochastic context-free grammers for tRNA modeling | |
Bille | Tree edit distance, alignment distance and inclusion | |
López et al. | Grammatical inference with bioinformatics criteria | |
Chauve et al. | New perspectives on gene family evolution: losses in reconciliation and a link with supertrees | |
Landau et al. | On the common substring alignment problem | |
Radhakrishna et al. | A dissimilarity measure for mining similar temporal association patterns | |
Cormode et al. | L p samplers and their applications: A survey | |
Fischer et al. | Optimal string mining under frequency constraints | |
Baeza-Yates et al. | A fast algorithm on average for all-against-all sequence matching | |
Cao et al. | Indexing DNA sequences using q-grams | |
Elzinga et al. | Versatile string kernels | |
Mukherjee et al. | Hidden Markov Models, grammars, and biology: a tutorial | |
Eidhammer et al. | A constraint based structure description language for biosequences | |
Sakharov et al. | The Viterbi algorithm for subsets of stochastic context-free languages | |
Maaß et al. | Text indexing with errors | |
Schimd et al. | Bounds and estimates on the average edit distance | |
Landau et al. | Sparse LCS common substring alignment | |
Salvy et al. | Motif Statistics | |
Denise et al. | Random generation of structured genomic sequences | |
Chauve et al. | Counting, generating, analyzing and sampling tree alignments | |
Oommen et al. | Dictionary-based syntactic pattern recognition using tries |