I'm no patent lawyer, but when a colleague passed me a link to a patent application from Illumina that appears to lay claim to having invented the concept of analyzing biological data in the cloud, I couldn't help but wonder what on earth they think they're doing.
The patent application opens with a very clear abstract:
"The present invention provides a novel approach for storing, analyzing, and/or accessing biological data in a cloud computing environment. Sequence data generated by a particular sequencing device may be uploaded to the cloud computing environment during a sequencing run, which reduces the on-site storage needs for the sequence data. Analysis of the data may also be performed in the cloud computing environment, and the instructions for such analysis may be set at the originating sequencing device. The sequence data in the cloud computing environment may be shared according to permissions. Further, the sequence data may be modified or annotated by authorized secondary users."
On a closer reading of the detailed sections later in the application, it turns out that the claim to have invented a "novel approach" doesn't relate to any particular software or algorithm, but to the very idea of using the cloud to analyse biological data. The scope is sweeping. It covers data from any biology-related lab instrument (not just sequencers, and not just those sold by Illumina); data uploaded to the cloud indirectly from other locations as well as directly from the instruments; and all clouds, meaning not just those owned or controlled by Illumina or the usual public ones such as Amazon, but completely internal private clouds too.
The "invention" around analysis of, and access to, data is also incredibly broad. This is the text from the claim:
"The present invention also includes a system for analyzing biological samples, comprising: at least one networked computer system configured to: receive sequence data from a remote sequencing device, wherein the sequence data comprises permissions for accessing the sequence data; receive a request from a secondary user to access the sequence data, the secondary user being different from the remote sequencing device; and allowing the secondary user access to the sequence data if the secondary user is authorized under the permissions. Such permissions may be defined by a primary user.
The present invention also includes a computer implemented method for providing genetic data, comprising: receiving, at a server, a request from a user for data related to a particular gene or set of genes on a cloud computing environment; monitoring, on the cloud computing environment, available data relating to the particular gene or set of genes; and conveying to the user the available data based upon the request."
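To make the first of those claims concrete, here is a minimal Python sketch of the behaviour it describes: a store receives sequence data that carries its own permissions, a secondary user asks for it, and access is allowed only if the permissions say so. Every name here (`SequenceRecord`, `SequenceStore`, `request_access`) is hypothetical and chosen purely for illustration. None of it comes from the application or from any real product, and a real system would obviously add networking, authentication and storage on top.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SequenceRecord:
    """Sequence data bundled with its access permissions (hypothetical model)."""
    owner: str                                   # the "primary user" who defines permissions
    data: str                                    # stand-in for uploaded sequence data
    allowed_users: set[str] = field(default_factory=set)


class SequenceStore:
    """Stand-in for the "networked computer system" in the claim (hypothetical)."""

    def __init__(self) -> None:
        self._records: dict[str, SequenceRecord] = {}

    def receive(self, record_id: str, record: SequenceRecord) -> None:
        # Step 1: receive sequence data, including its permissions.
        self._records[record_id] = record

    def request_access(self, record_id: str, user: str) -> str:
        # Steps 2 and 3: a user requests the data; allow it only if authorized.
        record = self._records[record_id]
        if user == record.owner or user in record.allowed_users:
            return record.data
        raise PermissionError(f"{user} may not read {record_id}")


store = SequenceStore()
store.receive("run-1", SequenceRecord(owner="alice", data="ACGT", allowed_users={"bob"}))
print(store.request_access("run-1", "bob"))
```

In this sketch the whole of the claimed logic comes down to a single membership check on an access list.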
Just how is any of this novel, or indeed an "invention"? (Note: the patent application is dated October 17th 2013, but replaces an earlier application from April 11th 2012.) The entire concept already existed even at the earlier April 2012 date. DNAnexus launched its cloud analysis solution in April 2010. Systems designed and implemented against the Pistoia Alliance Sequence Services Phase 1 specification also existed, and cloud-based versions of them were demonstrated in public by Eagle (and others) in April 2011, a whole year before Illumina applied for this patent.
It is highly debatable whether the patent application really represents any kind of invention, especially as there is public evidence of others having developed the exact same systems and publicised their achievements long before the application date. Personally, I think this is a clear-cut case of prior art. I can only hope that the US Patent Office agrees and declines to grant the patent to Illumina.