March 3, 2017

Provisioning of Bioinformatics - a four-horse race

Let's start with three assertions:

  1. The value of high-throughput genomics is no longer in doubt, with Forbes magazine, for instance, touting DNA sequencing as the next $100 billion technology market.
  2. More attention and funding have gone to data production than to data analysis, leaving the latter lagging behind.
  3. Data deluge, tsunami, torrent; the pundits are forecasting significantly more than an afternoon of persistent data drizzle!

The promise of cash (1) has driven investment in instrumentation (2), which has created the deluge (3). But be in no doubt: given enough cash (vested in 1), funding for computation will materialise (addressing 2) to efficiently harness the deluge (taming 3). The question becomes 'how', not 'if'. To recap: the brave new world of bioinformatics is going to have a market, funding (both private and public), and a whole heap of data.

I will now introduce the four contenders vying for the title of "most appropriate strategic platform for provisioning bioinformatics". Place your bets, ladies and gentlemen! Some are gambling for a share of $100 billion, others for career advancement, and a few for pure 'told you so' bragging rights...

  1. In-house compute/storage. The traditional and currently dominant platform. Affords total control, but has the disadvantage of limited scalability.
  2. Cloud compute/storage (Infrastructure as a Service, IaaS). Now 'accepted' by most IT decision makers; users effectively 'rent' infrastructure by the hour. Addresses scalability concerns, but data transfer is an issue.
  3. Specialty cloud (Software as a Service, SaaS). At its simplest, SaaS is pre-installed software on top of IaaS. PerkinElmer, for instance, now offer sequencing twinned with a private SaaS cloud, which effectively solves the data transfer issue, but at the expense of vendor lock-in.
  4. Outsourced analysis services. Outsourcing is de rigueur in many industries (banking, telecoms). Benefits abound, but the approach is not suitable in all cases.

As scientists, we would like some hard data to help us forecast which of the runners is the favorite to win, and we encourage as many people as possible to participate in our brief "provisioning bioinformatics" survey. In the spirit of open bioinformatics, our final report will be open access, with initial results presented at our April symposium; responses so far have been intriguing!

Topics: AWS, Big data technology, Bioinformatics, Cloud, data management, high throughput, outsourcing, survey