March 3, 2017

MapReduce Genome Assembly using Contrail

Contrail has to be the most impressive application of MapReduce I've seen yet.

Contrail takes the widely accepted approach of constructing a de Bruijn graph of sequence n-mers, popularised by Daniel Zerbino's Velvet, but distributes the construction and reduction of the graph across multiple nodes instead of trying to do it all in memory on a single node. The idea seems obvious and logical, but it is a very difficult solution to architect. Contrail has done it, and the result looks impressive.
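To make the idea concrete, here is a toy sketch of how de Bruijn graph construction can be phrased as a map step and a reduce step. It is not Contrail's actual code: the function names (`map_read`, `shuffle`, `reduce_node`, `build_graph`) are made up for illustration, everything runs in a single process, and it ignores reverse complements, sequencing errors and the graph simplification passes that a real assembler needs.

```python
from collections import defaultdict


def map_read(read, k):
    """Map step: emit one (prefix, suffix) edge per k-mer in the read.

    Each k-mer links its first k-1 bases to its last k-1 bases.
    """
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        yield kmer[:-1], kmer[1:]


def shuffle(pairs):
    """Group emitted values by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_node(node, successors):
    """Reduce step: collapse duplicate edges into one sorted adjacency list."""
    return node, sorted(set(successors))


def build_graph(reads, k):
    """Run map, shuffle and reduce in-process (hypothetical driver)."""
    pairs = (pair for read in reads for pair in map_read(read, k))
    return dict(reduce_node(node, succ) for node, succ in shuffle(pairs).items())


if __name__ == "__main__":
    # Two overlapping toy reads; k=3 gives nodes of length 2.
    for node, successors in sorted(build_graph(["ATGGC", "TGGCA"], k=3).items()):
        print(node, "->", successors)
```

The point of the split is that each read can be mapped independently, and each node's edges can be reduced independently, so in principle neither step needs the whole graph in memory on one machine. The hard part, which this sketch skips entirely, is compressing and cleaning up the graph once it is spread across nodes.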

I look forward to seeing some detailed research investigating the accuracy and effectiveness of this new approach to assembly. If it proves to be a viable solution, it could significantly speed up the process of assembling new genomes.

