Fast and compact matching statistics analytics
Cunial et al., 2022
- Document ID
- 13498632182449248599
- Author
- Cunial F
- Denas O
- Belazzougui D
- Publication year
- 2022
- Publication venue
- Bioinformatics
Snippet
Motivation: Fast, lightweight methods for comparing the sequence of ever larger assembled genomes from ever growing databases are increasingly needed in the era of accurate long reads and pan-genome initiatives. Matching statistics is a popular method for computing …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30312—Storage and indexing structures; Management thereof
- G06F17/30321—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/22—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or SNP [Single-Nucleotide Polymorphism] discovery or sequence alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/28—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for programming tools or database systems, e.g. ontologies, heterogeneous data integration, data warehousing or computing architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/14—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for phylogeny or evolution, e.g. evolutionarily conserved regions determination or phylogenetic tree construction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
Similar Documents
| Publication | Title |
|---|---|
| Cox et al. | Large-scale compression of genomic sequence databases with the Burrows–Wheeler transform |
| Marçais et al. | Locality-sensitive hashing for the edit distance |
| Kuhnle et al. | Efficient construction of a complete index for pan-genomics read alignment |
| Rahman et al. | Representation of k-mer sets using spectrum-preserving string sets |
| Bernard et al. | Alignment-free inference of hierarchical and reticulate phylogenomic relationships |
| Simpson et al. | Efficient construction of an assembly string graph using the FM-index |
| Li | Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences |
| Chikhi et al. | On the representation of de Bruijn graphs |
| Conway et al. | Succinct data structures for assembling large genomes |
| Muggli et al. | Building large updatable colored de Bruijn graphs via merging |
| Nicolae et al. | LFQC: a lossless compression algorithm for FASTQ files |
| US20180373839A1 (en) | Systems and methods for encoding genomic graph information |
| Kingsford et al. | Reference-based compression of short-read sequences using path encoding |
| Löchel et al. | Fractal construction of constrained code words for DNA storage systems |
| Elworth et al. | To petabytes and beyond: recent advances in probabilistic and signal processing algorithms and their application to metagenomics |
| Sun et al. | Allsome sequence bloom trees |
| Vinga et al. | Pattern matching through Chaos Game Representation: bridging numerical and discrete data structures for biological sequence analysis |
| Marschall et al. | Efficient exact motif discovery |
| Mustafa et al. | Dynamic compression schemes for graph coloring |
| Ginart et al. | Optimal compressed representation of high throughput sequence data via light assembly |
| Braga et al. | The solution space of sorting by DCJ |
| Cunial et al. | Fast and compact matching statistics analytics |
| Recanati et al. | A spectral algorithm for fast de novo layout of uncorrected long nanopore reads |
| Holley et al. | Dynamic alignment-free and reference-free read compression |