Chen et al., 2024 - Google Patents
Oasis: An optimal disjoint segmented learned range filter
- Document ID
- 10925115068848592720
- Author
- Chen G
- He Z
- Li M
- Luo S
- Publication year
- 2024
- Publication venue
- Proceedings of the VLDB Endowment
Snippet
The learning-enhanced data structure has inspired the development of the range filter, bringing significantly better false positive rate (FPR) than traditional non-learned range filters. Its core idea is to employ piece-wise linear functions that uniformly map the entire key …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30312—Storage and indexing structures; Management thereof
- G06F17/30321—Indexing structures
- G06F17/3033—Hash tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30533—Other types of queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
- G06F17/30129—Details of further file system functionalities
- G06F17/3015—Redundancy elimination performed by the file system
- G06F17/30156—De-duplication implemented within the file system, e.g. based on file segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
- G06F17/30129—Details of further file system functionalities
- G06F17/3015—Redundancy elimination performed by the file system
- G06F17/30153—Redundancy elimination performed by the file system using compression, e.g. sparse files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30587—Details of specialised database models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30289—Database design, administration or maintenance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30575—Replication, distribution or synchronisation of data between databases or within a distributed database; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from or digital output to record carriers, e.g. RAID, emulated record carriers, networked record carriers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Raju et al. | PebblesDB: Building key-value stores using fragmented log-structured merge trees | |
Ren et al. | SlimDB: A space-efficient key-value storage engine for semi-sorted data | |
CN113612749B (en) | An intrusion-oriented data clustering method and device for traceability | |
Chen et al. | Oasis: An optimal disjoint segmented learned range filter | |
CN107391554B (en) | Efficient Distributed Locality-Sensitive Hashing Method | |
CN101084499B (en) | Systems and methods for searching and storing data | |
Hua et al. | Locality-sensitive bloom filter for approximate membership query | |
Almodaresi et al. | An efficient, scalable, and exact representation of high-dimensional color information enabled using de Bruijn graph search | |
JP7122325B2 (en) | Lossless reduction of data using the base data sieve and performing multidimensional search and content-associative retrieval on the losslessly reduced data using the base data sieve | |
US10210188B2 (en) | Multi-tiered data storage in a deduplication system | |
US20060112264A1 (en) | Method and Computer Program Product for Finding the Longest Common Subsequences Between Files with Applications to Differential Compression | |
CN114281989B (en) | Data deduplication method and device based on text similarity, storage medium and server | |
Knorr et al. | Proteus: A self-designing range filter | |
Vaidya et al. | SNARF: a learning-enhanced range filter | |
CN105989015B (en) | Database capacity expansion method and device and method and device for accessing database | |
JP6726690B2 (en) | Performing multidimensional search, content-associative retrieval, and keyword-based retrieval and retrieval on losslessly reduced data using basic data sieves | |
US11880368B2 (en) | Compressing data sets for storage in a database system | |
US12353418B2 (en) | Handling null values in processing join operations during query execution | |
JP2023525791A (en) | Exploiting Base Data Locality for Efficient Retrieval of Lossless Reduced Data Using Base Data Sieves | |
CN113535670A (en) | Virtual resource mirror image storage system and implementation method thereof | |
WO2012114402A1 (en) | Database management device and database management method | |
Eslami et al. | Memento Filter: A Fast, Dynamic, and Robust Range Filter | |
Firth et al. | TAPER: query-aware, partition-enhancement for large, heterogenous graphs | |
CN118349578A (en) | Data query method, device, electronic equipment and storage medium | |
CN112416879A (en) | Block-level data deduplication method based on NTFS (New technology File System) |