The OpenEMPI Matching Engine

Turn duplicate records into one source of truth.

OpenEMPI is an enterprise entity-resolution engine that deduplicates and links records at scale. Deterministic, probabilistic, and artificial intelligence-based matching in one configurable pipeline — built as a Master Patient Index for healthcare, proven anywhere identity matters.

Deterministic · Probabilistic · Artificial intelligence
Scattered duplicate patient records on the left — each showing Name, DOB, Address, Phone and Email — resolving through a matching node with similarity scores into a single golden master record on the right.

5–20%

Duplicate-record rate in a typical EMR — the gap OpenEMPI closes.

3 paradigms

Deterministic, probabilistic, and artificial intelligence-based matching in a single engine.

Any record

Patients, providers, customers, or any entity you define.

How identity gets resolved

Every record runs through the same transparent pipeline. Each stage is configurable, so you can tune accuracy to your data quality and review tolerance.

01

Standardize

Names, dates, phones, and addresses are parsed and normalized so records compare on equal footing.
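As a rough illustration of what standardization involves, the sketch below normalizes names, phone numbers, and dates with the Python standard library. The function names and the list of accepted date formats are assumptions made for this example, not OpenEMPI's actual implementation or configuration.

```python
import re
import unicodedata
from datetime import datetime
from typing import Optional


def normalize_name(name: str) -> str:
    """Strip accents and punctuation, collapse whitespace, and uppercase."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Treat hyphens as word separators, then drop any other non-letter.
    spaced = no_accents.replace("-", " ")
    letters_only = re.sub(r"[^A-Za-z\s]", "", spaced)
    return " ".join(letters_only.upper().split())


def normalize_phone(phone: str) -> str:
    """Keep digits only, so formatting differences do not matter."""
    return re.sub(r"\D", "", phone)


def normalize_date(value: str) -> Optional[str]:
    """Return an ISO 8601 date, or None if no assumed format matches."""
    # Hypothetical list of input formats; a real deployment would configure its own.
    for fmt in ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None
```

With these helpers, "José  Núñez" and "JOSE NUNEZ" compare as equal, and "03/07/1985" and "1985-03-07" resolve to the same date string.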

02

Block

Records are grouped into blocks by similarity so that only likely candidate pairs are compared, which keeps matching fast even in instances with tens of millions of records.
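A minimal sketch of the blocking idea, assuming a deliberately simple blocking key (first letter of the last name plus birth year). The key, the field names, and the record layout are hypothetical and chosen only to show why blocking cuts the number of comparisons.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record: dict) -> str:
    # Hypothetical key: first letter of the last name plus the birth year.
    return f"{record['last_name'][:1].upper()}|{record['dob'][:4]}"


def candidate_pairs(records: list) -> list:
    """Return ID pairs for records that share a blocking key."""
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record["id"])

    pairs = []
    for ids in blocks.values():
        # Only records inside the same block are ever compared.
        pairs.extend(combinations(sorted(ids), 2))
    return pairs


records = [
    {"id": "a", "last_name": "Smith", "dob": "1985-03-07"},
    {"id": "b", "last_name": "Smyth", "dob": "1985-03-07"},
    {"id": "c", "last_name": "Jones", "dob": "1985-11-20"},
    {"id": "d", "last_name": "Smith", "dob": "1990-01-15"},
]
print(candidate_pairs(records))  # [('a', 'b')]
```

Comparing all four records exhaustively would take six comparisons; this key leaves one. The trade-off is that a key which is too strict can separate true matches, so keys are usually chosen to tolerate common data-entry errors.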

03

Score

Field-level distance metrics feed probabilistic, deterministic, and artificial intelligence models to score every pair.
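To make pair scoring concrete, here is a small sketch of a Fellegi-Sunter style probabilistic score, a classic approach to record linkage. It is not presented as OpenEMPI's own model. The m and u probabilities below are invented for illustration, and agreement is simplified to exact equality, where a real pipeline would typically apply a distance metric with a threshold.

```python
import math

# Hypothetical per-field probabilities (illustrative values only):
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
FIELD_PROBS = {
    "last_name": (0.95, 0.01),
    "dob": (0.97, 0.003),
    "phone": (0.80, 0.0005),
}


def field_weight(m: float, u: float, agrees: bool) -> float:
    """Log-likelihood ratio contributed by one field."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_score(a: dict, b: dict) -> float:
    """Sum the field weights; higher totals mean a more likely match."""
    total = 0.0
    for name, (m, u) in FIELD_PROBS.items():
        # Simplification: a missing value counts as a disagreement.
        agrees = bool(a.get(name)) and a.get(name) == b.get(name)
        total += field_weight(m, u, agrees)
    return total
```

Fields that rarely agree by chance (a low u, such as phone number) add the most weight when they do agree, which is why the score separates true matches from coincidental ones better than a simple count of matching fields.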

04

Resolve

High-confidence matches are grouped into a single golden record; uncertain pairs route to review.
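The resolve step can be sketched as two thresholds plus a grouping pass. The threshold values, the function name, and the use of a union-find structure are assumptions for this example. It only groups record IDs and does not show how field values are chosen for the golden record.

```python
def resolve(scored_pairs, match_threshold=0.9, review_threshold=0.6):
    """Group high-confidence pairs into clusters; collect uncertain pairs for review."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    review = []
    for a, b, score in scored_pairs:
        find(a)
        find(b)  # register both records even if they are never merged
        if score >= match_threshold:
            union(a, b)
        elif score >= review_threshold:
            review.append((a, b, score))

    clusters = {}
    for record_id in list(parent):
        clusters.setdefault(find(record_id), set()).add(record_id)
    return sorted(sorted(group) for group in clusters.values()), review


pairs = [("a", "b", 0.95), ("b", "c", 0.92), ("c", "d", 0.70), ("e", "f", 0.20)]
clusters, review = resolve(pairs)
print(clusters)  # [['a', 'b', 'c'], ['d'], ['e'], ['f']]
print(review)    # [('c', 'd', 0.7)]
```

Note that this sketch merges transitively: a and c land in the same cluster because each matches b, even though they were never compared directly. Whether that is acceptable is a policy decision, and it is one reason uncertain pairs are routed to review rather than merged automatically.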

An engine built for matching at scale

Cutting-edge algorithms, an extensible architecture, and a data model you define — the same flexibility across every component.

Three records for the same person — registered under varied spellings and ages at a clinic, lab and cardiologist — resolved into a single unified patient profile.

Configurable matching algorithms

Run deterministic, probabilistic, and artificial intelligence-based matching in one pipeline. Tune the strategy and distance metrics to your data instead of bending your data to the tool.

  • Probabilistic linkage
  • Artificial intelligence models
  • Jaro–Winkler · Levenshtein · Soundex · Double Metaphone
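For readers unfamiliar with the distance metrics listed above, the following is a plain-Python sketch of Levenshtein edit distance and a normalized similarity derived from it. It is a textbook dynamic-programming version written for clarity, not the engine's implementation. The `similarity` helper and its normalization by the longer string's length are assumptions for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale the distance to a 0..1 similarity (1.0 means identical)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


print(levenshtein("SMITH", "SMYTH"))  # 1
```

Edit distance suits typos and transpositions, while phonetic codes such as Soundex and Double Metaphone target names that sound alike but are spelled differently, which is why a configurable pipeline usually offers both families.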
A stewardship review queue listing candidate record pairs with confidence scores and merge or reject actions.

Human-in-the-loop stewardship

Uncertain matches surface in a review queue with confidence scores, merge/reject controls, and a full audit trail — so data stewards stay in control of every merge.

  • Confidence-scored candidate pairs
  • Merge / reject with full audit logging
  • Real-time & batch processing modes
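The review workflow described above can be pictured with a small in-memory sketch. The `ReviewQueue` class, its method names, and the shape of the audit entries are all hypothetical. They illustrate the idea of confidence-ordered review with an append-only audit log rather than any actual OpenEMPI interface.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ReviewQueue:
    """Hypothetical in-memory queue of uncertain pairs awaiting a steward's decision."""
    pending: dict = field(default_factory=dict)    # (id_a, id_b) -> confidence
    audit_log: list = field(default_factory=list)  # append-only decision history

    def add(self, id_a: str, id_b: str, confidence: float) -> None:
        self.pending[(id_a, id_b)] = confidence

    def next_pair(self):
        """Return the pending pair with the highest confidence, or None."""
        if not self.pending:
            return None
        return max(self.pending, key=self.pending.get)

    def decide(self, pair, action: str, steward: str) -> None:
        """Record a merge or reject decision and remove the pair from the queue."""
        if action not in ("merge", "reject"):
            raise ValueError(f"unknown action: {action}")
        confidence = self.pending.pop(pair)
        self.audit_log.append({
            "pair": pair,
            "confidence": confidence,
            "action": action,
            "steward": steward,
            "decided_at": datetime.now(timezone.utc).isoformat(),
        })


queue = ReviewQueue()
queue.add("rec-17", "rec-42", 0.74)
queue.add("rec-03", "rec-88", 0.81)
pair = queue.next_pair()  # ('rec-03', 'rec-88'), the highest-confidence pair
queue.decide(pair, "merge", steward="steward@example.org")
```

Every decision leaves a log entry recording who decided, what they decided, and the confidence at the time, which is the property an audit trail needs.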
A hub-and-spoke diagram connecting a central node to REST API, FHIR, HL7 v2, Webhooks and JSON endpoints.

Integrates with your stack

OpenEMPI lives inside your architecture, not in place of it. A REST API and webhooks connect it to your data lakes, warehouses, and operational systems.

  • RESTful API & webhooks
  • HL7 v2/v3 & FHIR for healthcare
  • Customizable, domain-agnostic data model

Deduplication for every domain

OpenEMPI began as a Master Patient Index. The same engine resolves identity wherever duplicate records undermine trust in your data.

Master Patient Index

Flagship

The flagship use case. Resolve duplicate patient records — which run 5–20% in a typical EMR — into one accurate identity across every system of care.

Provider & facility directories

Maintain a clean, authoritative registry of providers and facilities, free of duplicate and conflicting entries.

Customer 360

Unify customer records scattered across CRMs, billing, and support into a single trusted profile.

Business & entity listings

Deduplicate suppliers, accounts, and business listings to keep master data reliable across the enterprise.

For executives

Cut the operational cost and compliance risk of duplicate records. OpenEMPI gives the enterprise a single, reliable source of truth — backed by commercial support.

  • Rapid time-to-value
  • Commercial support & SLAs
  • Improved quality of care & cost savings
  • Scalable licensing
About Sysnet →

For engineers

A high-performance matching engine with runtime-configurable algorithms and an extensible architecture. Integrate via REST or webhooks into your existing data pipeline.

  • Probabilistic & ML matching
  • RESTful API & webhooks
  • High-throughput, horizontally scalable
  • Customizable data model
Read the docs →

Ready to deduplicate with confidence?

Tell us about your data and the records you need to resolve. We'll show you how OpenEMPI fits your stack.