Our tech and methods are proven, tested, and peer-reviewed. They represent the state of the art in evaluating entity resolution systems. We give you the clarity you need to demonstrate value, deploy, and iterate with confidence.
Evaluation from A to Z
Understand the characteristics of your entity resolution results. Analyze errors and detect performance disparities.
Compare behaviors and performance across time or across competing algorithms, and make informed decisions.
Accurately estimate performance and model characteristics, even when using biased benchmark datasets.
Entity resolution is a clustering problem and should be evaluated as such. Our evaluation tools use a clustering of entity mentions as the starting point of analysis. No labeling of record pairs with a complex sampling scheme. Instead, we use disambiguated entity clusters for increased efficiency and compatibility with all types of ER systems.
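To illustrate what starting from clusters looks like, here is a minimal sketch, not our actual tooling. It assumes each clustering is a plain mapping from mention ID to cluster ID; the function names and the toy data are hypothetical. Pairwise precision and recall are computed from cluster sizes alone, without enumerating or labeling record pairs.

```python
from collections import Counter
from math import comb


def pairwise_counts(predicted, reference):
    """Count linked pairs in each clustering and pairs linked in both.

    Both arguments map a mention ID to a cluster ID. Only mentions
    present in both clusterings are considered.
    """
    common = predicted.keys() & reference.keys()
    pred_sizes = Counter(predicted[m] for m in common)
    ref_sizes = Counter(reference[m] for m in common)
    # Mentions sharing both a predicted and a reference cluster are
    # linked in both clusterings.
    joint_sizes = Counter((predicted[m], reference[m]) for m in common)
    n_pred = sum(comb(n, 2) for n in pred_sizes.values())
    n_ref = sum(comb(n, 2) for n in ref_sizes.values())
    n_both = sum(comb(n, 2) for n in joint_sizes.values())
    return n_pred, n_ref, n_both


def pairwise_precision_recall(predicted, reference):
    """Pairwise precision and recall of `predicted` against `reference`."""
    n_pred, n_ref, n_both = pairwise_counts(predicted, reference)
    precision = n_both / n_pred if n_pred else 1.0
    recall = n_both / n_ref if n_ref else 1.0
    return precision, recall


# Toy data (hypothetical): five mentions, three predicted clusters.
predicted = {"m1": "A", "m2": "A", "m3": "A", "m4": "B", "m5": "C"}
reference = {"m1": "x", "m2": "x", "m3": "y", "m4": "y", "m5": "z"}
precision, recall = pairwise_precision_recall(predicted, reference)
```

In this toy case the predicted clustering contains three links, only one of which is correct, and it recovers one of the two true links.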
Efficient Data Labeling
Data labeling is expensive and hard to get right, especially for entity resolution. That's why we collect ground truth data using a principled methodology tailored to our evaluation tools. Our labeled datasets can easily be used and reused for many different tasks: performance estimation, model comparison, model monitoring, error analysis, sensitivity analysis, and model training.
The performance of entity resolution systems degrades over time as more data is collected. In addition to monitoring key metrics and sending automated alerts, we predict future performance degradation. This lets you plan ahead for model re-evaluation, tuning, and retraining.
Unbiased Performance Estimators
Estimating the performance of ER systems is notoriously hard to get right. If you naively compute performance metrics such as pairwise precision or F-score on small benchmark datasets, you will overestimate performance. Even worse, when comparing models, you are likely to choose the wrong one because of performance rank reversals. To avoid these pitfalls, Valires relies on statistically principled performance estimators that have been shown to provide accurate performance estimates for a wide range of metrics, including pairwise, cluster-based, and B-cubed performance metrics.
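The overestimation effect can be reproduced with a small simulation. This is a toy sketch under stated assumptions, not the Valires estimator: every true entity has two mentions, the predicted clustering wrongly merges entities in pairs, and the benchmark is a 5% random sample of entities. All names and numbers are hypothetical.

```python
import random
from collections import Counter
from math import comb


def pairwise_precision(predicted, reference):
    """Fraction of predicted links that join mentions of the same entity.

    Only mentions present in both clusterings are considered, which is
    what happens when precision is computed on a benchmark subset.
    """
    common = predicted.keys() & reference.keys()
    pred_sizes = Counter(predicted[m] for m in common)
    joint_sizes = Counter((predicted[m], reference[m]) for m in common)
    n_pred = sum(comb(n, 2) for n in pred_sizes.values())
    n_both = sum(comb(n, 2) for n in joint_sizes.values())
    return n_both / n_pred if n_pred else 1.0


n_entities = 1000
reference = {}
predicted = {}
for entity in range(n_entities):
    for k in range(2):
        mention = (entity, k)
        reference[mention] = entity
        # Assumed error pattern: entities 2j and 2j+1 are merged.
        predicted[mention] = entity // 2

# Precision over the full population of mentions.
full_precision = pairwise_precision(predicted, reference)

# Naive estimate: keep only mentions of a small sample of entities.
rng = random.Random(0)
sampled = set(rng.sample(range(n_entities), 50))
benchmark = {m: e for m, e in reference.items() if e in sampled}
naive_precision = pairwise_precision(predicted, benchmark)

print(f"full: {full_precision:.3f}, naive: {naive_precision:.3f}")
```

Most wrong links connect a sampled entity to one outside the benchmark, so they vanish when the prediction is restricted to benchmark mentions. The naive estimate therefore sits well above the full-population precision of one third.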
Read the Science
Binette, O., York, S. A., Hickerson, E., Baek, Y., Madhavan, S., & Jones, C. (2023). The American Statistician, 1-22.
Binette, O., Madhavan, S., Butler, J., Card, B. A., Melluso, E., & Jones, C. (2023). An End-to-End Evaluation Framework for Entity Resolution Systems With Application to Inventor Name Disambiguation. arXiv preprint arXiv:2301.03591.