Towards census-like statistics for foreign-born populations

{blocking} on CRAN

blocking
record-linkage
New package finally submitted to CRAN
Author

Maciej Beręsewicz

Published

June 13, 2025

Version 1.0.0 of the {blocking} package has finally been submitted to CRAN.

The goal of {blocking} is to provide blocking methods for record linkage and deduplication using approximate nearest neighbour (ANN) algorithms and graph techniques.
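To make the idea concrete, here is a minimal base-R sketch of nearest-neighbour blocking for deduplication. It is not the package's implementation: the exact (brute-force) cosine search stands in for an ANN index, and the toy `records`, the helper `shingle()`, and the 0.8 distance cut-off are all assumptions made for illustration.

```r
# Toy data: five name strings, some of which are near-duplicates.
records <- c("montgomery", "montgomry", "smith", "smyth", "kowalski")

# Split a string into overlapping character n-grams (hypothetical helper).
shingle <- function(s, n = 2) {
  starts <- seq_len(max(nchar(s) - n + 1, 1))
  substring(s, starts, starts + n - 1)
}

# Build a record-by-shingle count matrix.
sh <- lapply(records, shingle)
vocab <- sort(unique(unlist(sh)))
m <- t(vapply(sh, function(x) as.numeric(table(factor(x, levels = vocab))),
              numeric(length(vocab))))

# Exact cosine distance between all records; an ANN index would
# approximate this step without comparing every pair.
norms <- sqrt(rowSums(m^2))
d <- 1 - (m %*% t(m)) / (norms %o% norms)
diag(d) <- Inf  # a record is not its own neighbour

# Nearest neighbour of each record.
nn <- apply(d, 1, which.min)

# Graph step: link each record to its neighbour when they are close enough
# (0.8 is an arbitrary illustrative cut-off) and treat the connected
# components of the resulting graph as blocks.
block <- seq_along(records)
for (i in seq_along(records)) {
  j <- nn[i]
  if (d[i, j] < 0.8) block[block == block[j]] <- block[i]
}

print(data.frame(records, block))
```

Records that share a block label are the only ones passed on to detailed comparison, which is what keeps the number of candidate pairs small.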

It supports multiple ANN implementations via

  • {rnndescent},
  • {RcppHNSW},
  • {RcppAnnoy},
  • {mlpack}

packages, and provides integration with the {reclin2} package.

The package generates shingles from character strings and similarity vectors for record comparison. It also includes evaluation metrics for assessing blocking performance, including estimates of the false positive rate (FPR) and the false negative rate (FNR).
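As a rough illustration of what these two rates measure, the base-R sketch below computes pair-level FPR and FNR when the true blocks are known. The block labels are made up, and the package's own estimators may be defined or computed differently.

```r
# Hypothetical block labels for six records: ground truth vs. a blocking result.
true_block <- c(1, 1, 2, 2, 3, 3)
pred_block <- c(1, 1, 2, 3, 3, 3)

# Enumerate all unordered record pairs.
pairs <- t(combn(length(true_block), 2))
same_true <- true_block[pairs[, 1]] == true_block[pairs[, 2]]
same_pred <- pred_block[pairs[, 1]] == pred_block[pairs[, 2]]

# Pair-level confusion matrix.
tp <- sum(same_pred & same_true)    # correctly kept together
fp <- sum(same_pred & !same_true)   # wrongly placed in the same block
fn <- sum(!same_pred & same_true)   # true matches split across blocks
tn <- sum(!same_pred & !same_true)  # correctly kept apart

fpr <- fp / (fp + tn)  # share of non-matching pairs that share a block
fnr <- fn / (fn + tp)  # share of matching pairs lost by blocking

print(c(FPR = fpr, FNR = fnr))
```

For blocking, FNR is usually the more critical of the two: a true match that is split across blocks cannot be recovered by later comparison steps.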

To install the package, use:

install.packages("blocking")

For more details, see the CRAN website and the package documentation.

Copyright 2025 © Maciej Beręsewicz