The Pistoia Alliance, a precompetitive alliance of life science companies, technology vendors, publishers, and academic groups, has launched the Pistoia Alliance Sequence Squeeze Competition to find the best algorithm for compressing next-generation sequencing (NGS) data. The winning entrant will receive a prize of US$15,000, which will be awarded by a prestigious judging panel that includes representatives from the BGI and the Wellcome Trust Sanger Institute.
The Pistoia Alliance Sequence Squeeze Competition aims to encourage anyone with expertise in data compression, including bioinformaticians, mathematicians, physicists, and computer scientists, to tackle a major problem in the management of NGS data. With sequencing costs dropping faster than the data processing rates predicted by Moore's Law, modern sequencers can now generate more data in one day than any one sequencer could have produced during the whole of 2005.
Labs rely on compression to store the data from sequencing runs, which includes sequencing reads and their associated quality scores. Yet compression technologies are themselves faltering under the data volumes produced by NGS.
"There is a very real need today for novel methods of compressing sequence reads and their quality scores in a way that preserves 100% of the information while achieving much-improved linear-or, even better, non-linear-compression ratios," said Nick Lynch, president of the Pistoia Alliance and chair of the Sequence Squeeze judging panel. "We're excited to be supporting this competition-we believe in championing this type of grassroots, from-the-trenches innovation."
In addition to Lynch, the judging panel includes Guy Coates, information systems lead at the Wellcome Trust Sanger Institute, and Yingrui Li, deputy operation officer of the BGI. A fourth judge is yet to be announced.
The competition requires entrants to devise and implement a computer algorithm for compressing and then decompressing sequencing data stored in the commonly used FASTQ format. Entries must be fully open source so that the entire scientific community can benefit from the winning algorithm. Anyone can enter, subject to the terms and conditions of entry available on the competition website at www.sequencesqueeze.org.
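For readers unfamiliar with the task, the compress-then-decompress requirement can be illustrated with a minimal baseline sketch in Python. It is not the competition's reference method or scoring procedure. It assumes the simple four-line FASTQ record layout and uses the general-purpose zlib compressor from the Python standard library as a stand-in for a specialised algorithm. The function names and the sample reads are hypothetical, chosen only for illustration.

```python
import zlib


def parse_fastq(text):
    """Split FASTQ text into (header, sequence, quality) tuples.

    Assumes four lines per record; multi-line records are not handled.
    """
    lines = text.splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    records = []
    for i in range(0, len(lines), 4):
        header, sequence, separator, quality = lines[i:i + 4]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch at line {i + 1}")
        records.append((header, sequence, quality))
    return records


def compress_fastq(text):
    """Losslessly compress FASTQ text with zlib (baseline, not specialised)."""
    return zlib.compress(text.encode("ascii"), 9)


def decompress_fastq(blob):
    """Recover the original FASTQ text from the compressed bytes."""
    return zlib.decompress(blob).decode("ascii")


def compression_ratio(text, blob):
    """Original size in bytes divided by compressed size in bytes."""
    return len(text.encode("ascii")) / len(blob)


# Hypothetical two-read sample, made up for this illustration.
SAMPLE = (
    "@read1\nACGTACGTAC\n+\nIIIIIIIIII\n"
    "@read2\nTTGGCCAATT\n+\nFFFFFFFFFF\n"
)

if __name__ == "__main__":
    data = SAMPLE * 1000  # repeat the sample so there is something to compress
    blob = compress_fastq(data)
    # A lossless scheme must reproduce the input exactly.
    assert decompress_fastq(blob) == data
    print(f"{len(parse_fastq(data))} reads, ratio {compression_ratio(data, blob):.1f}")
```

The round-trip check reflects the requirement that 100% of the information be preserved. A competitive entry would presumably go well beyond a generic compressor, for example by modelling read sequences and quality scores separately.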
In addition to sponsoring the Sequence Squeeze competition, the Pistoia Alliance recently commenced Phase 2 of its sequence services project, which aims to develop pilot systems offering secure, cloud-based access to public and private sources of gene sequencing data, including NGS data. Pilot systems should be available for demonstration early in 2012.
The competition closes on 15 March 2012 and is administered on the Pistoia Alliance's behalf by Eagle Genomics Ltd., a bioinformatics services and software company. For full details on how to enter, and to submit an entry, please visit www.sequencesqueeze.org.