[go: up one dir, main page]

Chen et al., 2024 - Google Patents

Oasis: An optimal disjoint segmented learned range filter

Chen et al., 2024

View PDF
Document ID
10925115068848592720
Author
Chen G
He Z
Li M
Luo S
Publication year
Publication venue
Proceedings of the VLDB Endowment

External Links

Snippet

The learning-enhanced data structure has inspired the development of the range filter, bringing significantly better false positive rate (FPR) than traditional non-learned range filters. Its core idea is to employ piece-wise linear functions that uniformly map the entire key …
Continue reading at www.vldb.org (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30312Storage and indexing structures; Management thereof
    • G06F17/30321Indexing structures
    • G06F17/3033Hash tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386Retrieval requests
    • G06F17/30424Query processing
    • G06F17/30533Other types of queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30067File systems; File servers
    • G06F17/30129Details of further file system functionalities
    • G06F17/3015Redundancy elimination performed by the file system
    • G06F17/30156De-duplication implemented within the file system, e.g. based on file segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30067File systems; File servers
    • G06F17/30129Details of further file system functionalities
    • G06F17/3015Redundancy elimination performed by the file system
    • G06F17/30153Redundancy elimination performed by the file system using compression, e.g. sparse files
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30587Details of specialised database models
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30289Database design, administration or maintenance
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30575Replication, distribution or synchronisation of data between databases or within a distributed database; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30943Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
    • G06F17/30946Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from or digital output to record carriers, e.g. RAID, emulated record carriers, networked record carriers
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance

Similar Documents

Publication Publication Date Title
Raju et al. Pebblesdb: Building key-value stores using fragmented log-structured merge trees
Ren et al. SlimDB: A space-efficient key-value storage engine for semi-sorted data
CN113612749B (en) An intrusion-oriented data clustering method and device for traceability
Chen et al. Oasis: An optimal disjoint segmented learned range filter
CN107391554B (en) Efficient Distributed Locality-Sensitive Hashing Method
CN101084499B (en) Systems and methods for searching and storing data
Hua et al. Locality-sensitive bloom filter for approximate membership query
Almodaresi et al. An efficient, scalable, and exact representation of high-dimensional color information enabled using de Bruijn graph search
JP7122325B2 (en) Lossless reduction of data using the base data sieve and performing multidimensional search and content-associative retrieval on the losslessly reduced data using the base data sieve
US10210188B2 (en) Multi-tiered data storage in a deduplication system
US20060112264A1 (en) Method and Computer Program Product for Finding the Longest Common Subsequences Between Files with Applications to Differential Compression
CN114281989B (en) Data deduplication method and device based on text similarity, storage medium and server
Knorr et al. Proteus: A self-designing range filter
Vaidya et al. SNARF: a learning-enhanced range filter
CN105989015B (en) Database capacity expansion method and device and method and device for accessing database
JP6726690B2 (en) Performing multidimensional search, content-associative retrieval, and keyword-based retrieval and retrieval on losslessly reduced data using basic data sieves
US11880368B2 (en) Compressing data sets for storage in a database system
US12353418B2 (en) Handling null values in processing join operations during query execution
JP2023525791A (en) Exploiting Base Data Locality for Efficient Retrieval of Lossless Reduced Data Using Base Data Sieves
CN113535670A (en) Virtual resource mirror image storage system and implementation method thereof
WO2012114402A1 (en) Database management device and database management method
Eslami et al. Memento Filter: A Fast, Dynamic, and Robust Range Filter
Firth et al. TAPER: query-aware, partition-enhancement for large, heterogenous graphs
CN118349578A (en) Data query method, device, electronic equipment and storage medium
CN112416879A (en) Block-level data deduplication method based on NTFS (New technology File System)