Yan, 1995 - Google Patents
Duplicate detection in information dissemination
- Document ID
- 2494630744496179479
- Author
- Yan T
- Publication year
- 1995
Snippet
Our experience with the SIFT [YGM95] information dissemination system (in use by over 7,000 users daily) has identified an important and generic dissemination problem: duplicate information. In this paper we explain why duplicates arise, we quantify the problem, and we …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30477—Query execution
- G06F17/30483—Query execution of query operations
- G06F17/30486—Unary operations; data partitioning operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30011—Document retrieval systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11947513B2 (en) | | Search phrase processing |
Yan et al. | | The SIFT information dissemination system |
Yan | | Duplicate detection in information dissemination |
US6757675B2 (en) | | Method and apparatus for indexing document content and content comparison with World Wide Web search service |
US6721749B1 (en) | | Populating a data warehouse using a pipeline approach |
Chen et al. | | Ti: an efficient indexing mechanism for real-time search on tweets |
US6314421B1 (en) | | Method and apparatus for indexing documents for message filtering |
US6226630B1 (en) | | Method and apparatus for filtering incoming information using a search engine and stored queries defining user folders |
US20040205044A1 (en) | | Method for storing inverted index, method for on-line updating the same and inverted index mechanism |
JP2000003321A (en) | | Message storage structure of high performance |
JP2010520549A (en) | | Data storage and management methods |
KR20040017008A (en) | | System and method for offering information using a search engine |
CN111460255A (en) | | Music work information data acquisition and storage method |
Shivakumar | | Detecting digital copyright violations on the Internet |
Garcia-Molina | | Duplicate Removal in Information Dissemination |
Yan et al. | | Information finding in a digital library: the Stanford perspective |
Uehara et al. | | Information retrieval based on temporal attributes in WWW archives |
Yan | | Efficient techniques for wide-area information dissemination |
Kourdounakis | | Subscription Indexes for Web Syndication Systems |
Schäuble | | Integrating Information Retrieval and Database Functions |