Eagle has a proven history of discovering actionable biomarkers in complex molecular datasets. Our success is based on a flexible, disciplined and collaborative approach that puts the researcher’s scientific priorities first. We achieve this through a deep scientific understanding of the problem space and close communication with the customer’s scientists. Unlike monolithic data analysis “solutions”, our approach never takes the researcher out of the loop.

The BiomarkerDiscoveryPlus offering emerged from repeated client requests. In each case there is a pressing business need to extract scientific insight and value from the available datasets:

  • An interesting third-party publication or report that it would be valuable to reproduce,
  • In-house analysis processes that need hardening/productising, or
  • Existing datasets (often from collaborators) that require further specialised processing or analysis.



We build for reliability and scale, leveraging the eaglediscover and eaglehive platforms. Large, complex and multidimensional datasets are processed using high-performance (mostly cloud-based) computing and modern machine learning approaches. Feature selection aims to use as few features as possible while maintaining high predictive power. This balance is crucial when the goal of the analysis is to identify small but highly accurate panels of biomarkers with potential clinical utility. We pioneer the adoption of cutting-edge solutions. Our battle-hardened toolbox for biomarker discovery includes modules for:


  • Genomic profiling (variation): whole genome/exome sequence processing; calling and annotation of germline, somatic and haplotype variants; genome-wide association studies (GWAS).
  • Transcriptomic profiling (expression): digital gene expression (RNA/miRNA by sequencing or array), differential expression analysis, ontology and pathway enrichment.
  • Metagenomic profiling: 16S rRNA, shotgun metagenomics.
  • Proteomics/metabolomics: targeted/untargeted mass spectrometry, post-translational modification, tissue/urinary peptidomics.
  • Multi-omics: data integration and machine learning in R and Apache Spark.
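
The trade-off between panel size and predictive power mentioned above can be illustrated with a minimal sketch. It is not Eagle's implementation: the greedy forward selection, the nearest-centroid classifier, the function names, the `max_features` and `min_gain` thresholds, and the synthetic data are all assumptions chosen to keep the example self-contained.

```python
import random


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on the given feature indices."""
    n = len(X)
    correct = 0
    for i in range(n):
        # Build per-class centroids from every sample except sample i.
        sums, counts = {}, {}
        for j in range(n):
            if j == i:
                continue
            s = sums.setdefault(y[j], [0.0] * len(features))
            for k, f in enumerate(features):
                s[k] += X[j][f]
            counts[y[j]] = counts.get(y[j], 0) + 1
        # Predict the label whose centroid is closest (squared Euclidean distance).
        best_label, best_dist = None, float("inf")
        for label, s in sums.items():
            d = sum((X[i][f] - s[k] / counts[label]) ** 2
                    for k, f in enumerate(features))
            if d < best_dist:
                best_label, best_dist = label, d
        correct += best_label == y[i]
    return correct / n


def select_small_panel(X, y, max_features=3, min_gain=0.01):
    """Greedy forward selection: add a feature only while accuracy improves by min_gain."""
    selected, best_acc = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in remaining]
        # Highest accuracy wins; ties go to the lowest feature index.
        acc, f = max(scored, key=lambda t: (t[0], -t[1]))
        if acc - best_acc < min_gain:
            break  # a larger panel would not be meaningfully more accurate
        selected.append(f)
        remaining.remove(f)
        best_acc = acc
    return selected, best_acc


# Synthetic data (hypothetical): 40 samples, 6 noise features,
# with feature 2 shifted by 10 units in the second class.
rng = random.Random(0)
y = [0] * 20 + [1] * 20
X = [[rng.uniform(-1, 1) for _ in range(6)] for _ in y]
for row, label in zip(X, y):
    row[2] += 10 * label

panel, accuracy = select_small_panel(X, y)
print("selected features:", panel, "LOO accuracy:", accuracy)
```

The stopping rule is where the balance is encoded: a candidate feature joins the panel only if it buys a measurable gain in cross-validated accuracy, so uninformative features are left out even when the size limit has not been reached.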



Whether the aim is to gain insight into a disease model or to uncover the mechanism of action of a therapeutic, BiomarkerDiscoveryPlus maximises the chances of success. The benchmark for a biomarker discovery project is the selection of an actionable candidate for further development. In early 2016 a new molecular entity (NME) for autoimmune disease will enter clinical trials. Previously deprioritised, the drug was rescued by a small team, including Eagle, that worked tirelessly for 18 months to discover a novel multi-omics biomarker for patient stratification.

Adopting the BiomarkerDiscoveryPlus workflows can boost the operational efficiency of R&D teams: “Unilever’s digital data program now processes genetic sequences twenty times faster without incurring higher compute costs. In addition, its robust architecture supports ten times as many scientists, all working simultaneously.” - Pete Keeley, eScience Technical Lead, R&D IT, Unilever.



Will Spooner
