Algorithms for aggregating duplicate identities based on non-numerical data?
Posted by ropeladder at datascience.stackexchange.com
I have a large dataset (2M entries) of people, but many people have multiple entries in the database with slightly (or…