Gagie et al., 2015 - Google Patents
Efficient and compact representations of prefix codesGagie et al., 2015
View PDF- Document ID
- 14835278487822870621
- Author
- Gagie T
- Navarro G
- Nekrich Y
- Ordónez A
- Publication year
- Publication venue
- IEEE Transactions on Information Theory
External Links
Snippet
Most of the attention in statistical compression is given to the space used by the compressed sequence, a problem completely solved with optimal prefix codes. However, in many applications, the storage space used to represent the prefix code itself can be an issue. In …
- 238000007906 compression 0 abstract description 43
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
- G06F17/30625—Trees
-
- H—ELECTRICITY
- H03—BASIC ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/42—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code using table look-up for the coding or decoding process, e.g. using read-only memory
- H03M7/425—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code using table look-up for the coding or decoding process, e.g. using read-only memory for the decoding process only
-
- H—ELECTRICITY
- H03—BASIC ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/4031—Fixed length to variable length coding
- H03M7/4037—Prefix coding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
-
- H—ELECTRICITY
- H03—BASIC ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3084—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
- H03M7/3086—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing a sliding window, e.g. LZ77
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30312—Storage and indexing structures; Management thereof
- G06F17/30321—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
- G06F17/30129—Details of further file system functionalities
- G06F17/3015—Redundancy elimination performed by the file system
- G06F17/30153—Redundancy elimination performed by the file system using compression, e.g. sparse files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30964—Querying
- G06F17/30979—Query processing
- G06F17/30985—Query processing by using string matching techniques
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Valmeekam et al. | Llmzip: Lossless text compression using large language models | |
| Barbay et al. | Efficient fully-compressed sequence representations | |
| Fraenkel et al. | Novel compression of sparse bit-strings—Preliminary report | |
| Ferragina et al. | The myriad virtues of wavelet trees | |
| Bille et al. | Random access to grammar-compressed strings | |
| US9094041B2 (en) | Encoding method, information processing apparatus, and recording medium | |
| JP2011501837A (en) | Two-pass hash extraction of text strings | |
| Klein et al. | On the usefulness of Fibonacci compression codes | |
| Brisaboa et al. | Directly addressable variable-length codes | |
| De Moura et al. | Direct pattern matching on compressed text | |
| Külekci | Enhanced variable-length codes: Improved compression with efficient random access | |
| Chapin et al. | Higher Compression from the Burrows-Wheeler Transform by Modified Sorting. | |
| Gagie et al. | Efficient and compact representations of prefix codes | |
| Sirén | Compressed Full-Text Indexes for Highly Repetitive Collections. | |
| Baruch et al. | A space efficient direct access data structure | |
| Navarro et al. | Compressing Huffman models on large alphabets | |
| Tischler | Faster average case low memory semi-external construction of the Burrows–Wheeler transform | |
| Klein et al. | Using Fibonacci compression codes as alternatives to dense codes | |
| Mishra et al. | Fast pattern matching in compressed text using wavelet tree | |
| Nithya et al. | The Study of Text Compression Algorithms and their Efficiencies Under Different Types of Files | |
| Sinha et al. | Local decodability of the burrows-wheeler transform | |
| Anisimov et al. | Natural-language text compression using reverse multi-delimiter codes | |
| Bharathi et al. | A plain-text incremental compression (pic) technique with fast lookup ability | |
| Rani et al. | A survey on lossless text data compression techniques | |
| Yoshida et al. | Improving parse trees for efficient variable-to-fixed length codes |