Technical brief

The OpenEMPI Matching Engine

A high-performance, runtime-configurable entity-resolution platform for the most demanding data environments — deterministic, probabilistic, and artificial intelligence-based matching in one engine.

Deterministic · Probabilistic · Machine learning
A matching-engine console comparing Record A and Record B field by field — Name, DOB, Address, Phone, Email — with per-field similarity bars, weights, and a 94% match-probability gauge.

Three matching paradigms, one engine

OpenEMPI doesn't force a single matching strategy. Choose the approach that fits your data, or combine several; all of them share the same configuration and review workflow.

Probabilistic

Probabilistic record linkage

OpenEMPI implements the classic probabilistic record-linkage framework with numerous proprietary enhancements. Each field agreement or disagreement contributes a weight to an overall match score, which is then compared against tunable upper and lower thresholds.
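The weighting and thresholding described above can be sketched as follows. This is a minimal illustration in the style of Fellegi-Sunter log-likelihood weights, which is an assumption: the brief does not name the framework or describe the proprietary enhancements. The field names, m/u probabilities, and threshold values are invented for illustration and are not OpenEMPI defaults.

```python
import math

# Hypothetical per-field probabilities (illustrative values only):
#   m = P(field agrees | the two records are a true match)
#   u = P(field agrees | the two records are not a match)
FIELD_PROBS = {
    "name": (0.95, 0.01),
    "dob": (0.98, 0.005),
    "phone": (0.85, 0.001),
}

UPPER_THRESHOLD = 8.0  # at or above this score: declare a match
LOWER_THRESHOLD = 0.0  # at or below this score: declare a non-match


def field_weight(m: float, u: float, agrees: bool) -> float:
    """Log2 likelihood ratio contributed by one field comparison."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1.0 - m) / (1.0 - u))


def match_score(agreement: dict[str, bool]) -> float:
    """Sum agreement and disagreement weights over all compared fields."""
    return sum(
        field_weight(*FIELD_PROBS[field], agrees)
        for field, agrees in agreement.items()
    )


def classify(score: float) -> str:
    """Map a score to match, non-match, or manual review."""
    if score >= UPPER_THRESHOLD:
        return "match"
    if score <= LOWER_THRESHOLD:
        return "non-match"
    return "review"
```

With these illustrative numbers, a pair that agrees on every field classifies as a match, a pair that disagrees on every field as a non-match, and a pair that agrees only on date of birth lands between the two thresholds and goes to review.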

Deterministic

Deterministic rules

Exact and fuzzy matching rules are configured to fit your data and use case — from simple exact matches to complex, multi-field conditions.
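As a rough sketch of how exact and fuzzy conditions might combine into multi-field rules: the `Condition` class, the rule layout, and the 0.85 cutoff below are hypothetical and do not reflect OpenEMPI's actual rule syntax. Python's standard `difflib.SequenceMatcher` stands in for whichever fuzzy comparator a deployment would configure.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Condition:
    field: str
    min_similarity: float = 1.0  # 1.0 requires an exact (normalized) match


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] after trimming whitespace and lowercasing."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def rule_matches(rule: list[Condition], rec_a: dict, rec_b: dict) -> bool:
    """A rule fires only when every one of its conditions holds."""
    for cond in rule:
        a, b = rec_a.get(cond.field), rec_b.get(cond.field)
        if not a or not b:
            return False  # missing values never satisfy a condition
        if similarity(a, b) < cond.min_similarity:
            return False
    return True


def any_rule_matches(rules: list[list[Condition]], rec_a: dict, rec_b: dict) -> bool:
    """Records match deterministically if at least one rule fires."""
    return any(rule_matches(rule, rec_a, rec_b) for rule in rules)


# Hypothetical rule set: exact email, OR exact DOB plus a fuzzy name.
RULES = [
    [Condition("email")],
    [Condition("dob"), Condition("name", min_similarity=0.85)],
]
```

Here "Jonathan Smith" and "Jonathon Smith" with the same date of birth satisfy the second rule even though their email addresses differ; change the date of birth and neither rule fires.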

AI-powered

Artificial intelligence

Artificial intelligence-based models learn from labelled match data to capture patterns that fixed rules miss, lifting accuracy on messy, real-world records.
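To make "learning from labelled match data" concrete, here is a deliberately small sketch: a logistic-regression classifier trained on per-field similarity vectors. The brief does not say which model family OpenEMPI uses, so the choice of logistic regression, the feature layout (name, DOB, address similarity), and the toy training pairs are all assumptions made for illustration.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: list[float], bias: float, x: list[float]) -> float:
    """Estimated probability that a similarity vector describes a true match."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(features, labels, epochs: int = 2000, lr: float = 0.5):
    """Fit weights and bias by batch gradient descent on log loss."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy labelled pairs (invented): similarities for (name, dob, address).
TRAIN_X = [
    [0.95, 1.0, 0.90],
    [0.88, 1.0, 0.40],
    [0.92, 1.0, 0.85],
    [0.30, 0.0, 0.20],
    [0.60, 0.0, 0.10],
    [0.20, 0.0, 0.70],
]
TRAIN_Y = [1, 1, 1, 0, 0, 0]  # 1 = reviewer confirmed a match

weights, bias = train(TRAIN_X, TRAIN_Y)
```

After training, `predict(weights, bias, [0.9, 1.0, 0.8])` should come out above 0.5 and `predict(weights, bias, [0.25, 0.0, 0.3])` below it. A production model would need far more labelled pairs and a held-out evaluation set.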

Flexible entity resolution

OpenEMPI doesn't force you into a fixed record shape. The platform is built around flexible algorithms, extensible components, and configurable entity definitions.

Cutting-edge algorithms

Deterministic, probabilistic, and machine-learning matching in one engine, so each deployment can choose the strategy that best fits its data.

Extensible architecture

Advanced matching algorithms plug into an OpenEMPI instance, and that same extension model applies across the components that support matching workflows.

Configurable data model

Record definitions are customized to your datasets — patient demographics, providers, facilities, customers, business listings, or any entity you define.
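One way to picture a configurable entity definition is as plain data that names each field, its type, and the comparator applied to it. The sketch below is an assumption about shape only: `FieldDef`, `EntityDef`, the type strings, and the comparator names are hypothetical and are not OpenEMPI's configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDef:
    name: str
    data_type: str   # e.g. "string", "date", "phone"
    comparator: str  # label for the comparison function applied to this field
    required: bool = False


@dataclass(frozen=True)
class EntityDef:
    name: str
    fields: tuple[FieldDef, ...]

    def validate(self, record: dict) -> list[str]:
        """Return a list of problems; an empty list means the record conforms."""
        problems = []
        known = {f.name for f in self.fields}
        for f in self.fields:
            if f.required and not record.get(f.name):
                problems.append(f"missing required field: {f.name}")
        for key in record:
            if key not in known:
                problems.append(f"unknown field: {key}")
        return problems


# Hypothetical definition for a "provider" entity.
PROVIDER = EntityDef(
    name="provider",
    fields=(
        FieldDef("name", "string", "jaro_winkler", required=True),
        FieldDef("phone", "phone", "phone_aware"),
        FieldDef("address", "string", "levenshtein"),
    ),
)
```

Defining a new entity type, such as a facility or a business listing, then means adding another `EntityDef` rather than changing matching code.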

Highly customizable

Matching components expose configuration parameters tuned to your data quality, duplicate-rate tolerance, and operational review process.

Supported distance metrics and field comparators

Jaro–Winkler
Levenshtein
Soundex
Double Metaphone
Phone-aware matching
Numeric ranges
Date proximity
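For readers unfamiliar with these comparators, the sketch below shows three of them in plain Python: Levenshtein edit distance, a date-proximity score, and phone normalization. Only the Levenshtein algorithm is standard; the linear decay over 365 days and the digits-only phone normalization are assumptions, since the brief does not describe how OpenEMPI implements them.

```python
from datetime import date


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def date_proximity(d1: date, d2: date, max_days: int = 365) -> float:
    """1.0 for identical dates, falling linearly to 0.0 at max_days apart."""
    gap = abs((d1 - d2).days)
    return max(0.0, 1.0 - gap / max_days)


def normalize_phone(raw: str) -> str:
    """Keep digits only so formatting differences do not affect comparison."""
    return "".join(ch for ch in raw if ch.isdigit())
```

For example, `levenshtein("kitten", "sitting")` is 3, and `normalize_phone("(555) 010-1234")` and `normalize_phone("555.010.1234")` both reduce to the same digit string.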

Enterprise features

  • Real-time & batch processing modes
  • Deterministic, probabilistic, and AI-powered matching in one engine
  • Manual review workflow with audit logging
  • Horizontally scalable architecture
  • Data-source weighting & reliability scoring

A hub-and-spoke diagram connecting OpenEMPI to REST API, FHIR, HL7 v2, Webhooks and JSON endpoints.

Integration first

OpenEMPI is designed to live inside your stack, not replace it. A RESTful API and webhook system integrate cleanly with modern data lakes, warehouses, and operational systems — with HL7 and FHIR for healthcare.
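On the receiving side, a webhook integration often comes down to parsing a JSON event and routing it. The sketch below is purely illustrative: the payload fields (`decision`, `record_id`, `entity_id`) and the returned action strings are hypothetical and are not taken from OpenEMPI's API documentation, which should be consulted for the real event schema.

```python
import json


def handle_event(body: str) -> str:
    """Route a hypothetical match-event payload to a downstream action."""
    event = json.loads(body)
    decision = event.get("decision")
    record_id = event["record_id"]
    if decision == "match":
        # Assumed field: the resolved entity the record was linked to.
        return f"link {record_id} to {event['entity_id']}"
    if decision == "review":
        return f"queue {record_id} for manual review"
    return f"ignore {record_id}"


# Example payload, invented for this sketch.
example = json.dumps(
    {"decision": "match", "record_id": "rec-17", "entity_id": "ent-4"}
)
action = handle_event(example)
```

In a real deployment this function would sit behind an HTTP endpoint and write to the warehouse or operational system instead of returning a string.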

REST API · Webhooks · HL7 v2/v3 · FHIR · JSON/XML

See the engine on your data

Tell us about the records you need to resolve and we'll walk you through how OpenEMPI matches, scores, and merges them.