Scalable Reduction of Large Datasets to Interesting Subsets
Preprint in SSRN Electronic Journal (January 2010)