I shall be attending a next-generation sequencing conference in Nottingham in August and hopefully presenting a talk. The talk would describe a project that I have been involved in over the last month or so. I've just written an abstract for this talk, so rather than describe it all again I have included it here:

Title: An Ensembl-based pipeline for microRNA prediction and expression profiling using Next Generation Sequencing data

Authors: Nick James (1), Madhu Donepudi (1), William Spooner (1) and Michael Watson (2)

(1) Eagle Genomics Ltd, Babraham Research Campus, Cambridge CB22 3AT, UK
(2) Bioinformatics Group, Institute for Animal Health (IAH), Compton, Newbury, RG20 7NN, UK

Abstract: We have developed a workflow appliance for predicting miRNA loci and profiling miRNA expression based on short read sequences generated from small RNA libraries. Our pipeline runs on the Ensembl eHive distributed processing system, for which we have built wrappers for a number of best-in-class, open-source miRNA analysis tools, including RNAfold, MiPred, miRDeep and DroshaSVM. Ensembl databases are used for data storage, automatically integrating results with the latest genome annotations and providing an excellent and widely used interface for data access. The workflow system is compatible with several cluster architectures, including Sun Grid Engine, Condor, Platform LSF and Amazon Web Services, and it can also run standalone.