This data is taken from the RecordLinkage R package developed by Murat Sariyar and Andreas Borg. The package is licensed under GPL-3.
The RLdata500 table contains artificial personal data.
Some records have been duplicated with randomly generated errors. RLdata500 contains fifty duplicates.
Format
A data.table with 500 records. Each row represents one record, with the following columns:
fname_c1 – first name, first component
fname_c2 – first name, second component
lname_c1 – last name, first component
lname_c2 – last name, second component
by – year of birth
bm – month of birth
bd – day of birth
rec_id – record id
ent_id – entity id
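The columns above suggest how duplicates can be located: assuming that records referring to the same person share an ent_id while rec_id is unique per row, the number of duplicates is the row count minus the number of distinct entities. The following is a minimal sketch on a hypothetical three-row table (`toy`) laid out like RLdata500. The values, the column types, and the use of NA for missing name components are invented for illustration and are not taken from the actual data.

```r
library(data.table)

# Hypothetical toy rows mimicking the RLdata500 layout (values are invented).
toy <- data.table(
  fname_c1 = c("ANNA", "ANNA", "PETER"),
  fname_c2 = c(NA_character_, NA_character_, NA_character_),
  lname_c1 = c("MEIER", "MEYER", "SCHMIDT"),   # second row carries a typo
  lname_c2 = c(NA_character_, NA_character_, NA_character_),
  by       = c(1980L, 1980L, 1975L),
  bm       = c(5L, 5L, 11L),
  bd       = c(12L, 12L, 3L),
  rec_id   = 1:3,
  ent_id   = c(1L, 1L, 2L)                     # rows 1 and 2 are the same entity
)

# Number of duplicated records: rows minus distinct entities.
n_duplicates <- nrow(toy) - uniqueN(toy$ent_id)

# All records that belong to an entity with more than one record.
dup_groups <- toy[, if (.N > 1) .SD, by = ent_id]

print(n_duplicates)
print(dup_groups)
```

Under the same assumption, the first expression applied to the full table should agree with the fifty duplicates stated in the description.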
References
Sariyar M., Borg A. (2022). RecordLinkage: Record Linkage Functions for Linking and Deduplicating Data Sets. R package version 0.4-12.4, https://CRAN.R-project.org/package=RecordLinkage