Entity Resolution is the process by which a dataset is processed and records are identified that represent the same real-world entity.

OYSTER (Open sYSTem Entity Resolution) is an entity resolution system that supports probabilistic direct matching, transitive linking, and asserted linking. To facilitate prospecting for match candidates (blocking), the system builds and maintains an in-memory index of attribute values to identities. Because OYSTER has an identity management system, it also supports persistent identity identifiers. OYSTER is unique among other ER systems in that it is built to incorporate Entity Identity Information Management (EIIM). OYSTER supports EIIM by providing methods that enforce identifiers to be unique among identities, maintain persistent IDs over the life of an identity, and allowing the ability to fix false-positive and false-negative resolutions, which cannot be done with matching rules, through the use of assertion, traceability, and other features.

Project Samples

Project Activity

See All Activity >

License

GNU General Public License version 2.0 (GPLv2)

Follow OYSTER Entity Resolution

OYSTER Entity Resolution Web Site

You Might Also Like
Gen AI apps are built with MongoDB Atlas Icon
Gen AI apps are built with MongoDB Atlas

The database for AI-powered applications.

MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered apps—without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
Start Free
Rate This Project
Login To Rate This Project

User Ratings

★★★★★
★★★★
★★★
★★
2
0
0
0
0
ease 1 of 5 2 of 5 3 of 5 4 of 5 5 of 5 4 / 5
features 1 of 5 2 of 5 3 of 5 4 of 5 5 of 5 5 / 5
design 1 of 5 2 of 5 3 of 5 4 of 5 5 of 5 5 / 5
support 1 of 5 2 of 5 3 of 5 4 of 5 5 of 5 3 / 5

User Reviews

  • Great tool !
  • OYSTER has been critical to our enterprise-wide data warehouse at the University of Arkansas for Medical Sciences. We have also used it for detecting and resolving duplicate addresses and participant ids in the National Children's Study.
Read more reviews >

Additional Project Details

Operating Systems

Linux, Mac, Windows

Intended Audience

Information Technology, Management

User Interface

Command-line

Programming Language

Java

Database Environment

Flat-file, MySQL

Related Categories

Java Report Generators

Registered

2011-08-25