Mash Screen: what's in my sequencing run?
Last year we published Mash (Ondov et al. 2016) for the rapid comparison of genomes and metagenomes. Mash builds on Andrei Broder’s foundational paper “On the resemblance and containment of documents” (Broder 1997), and uses the MinHash technique to rapidly estimate resemblance, e.g. the similarity of two whole genomes. Mash is great for this purpose, but is not well suited for estimating containment, e.g. detecting genomes contained within a metagenome. Implementing containment was the next logical step for Mash — it is in the title of the Broder paper, after all! Here we introduce a new Mash screen operation that implements containment in the context of genomics.