March 3, 2017

Mind your metadata

Given the increasing number of experiments and the need to bring data together and analyse it as a whole, knowing where all your data is and what it is becomes ever more important. There are many spreadsheets out there where scientists have dutifully noted down their own experiment descriptions in their own personal notation or shorthand (there must be thousands of pieces of work filed under the name of 'Experiment 1' or 'My Experiment'). Trying to bring these together to understand what the entire group, team, or even company has been working on, and where the future direction needs to be headed, can be a very challenging task indeed. Collaborating with external partners can be just as challenging when the standards each partner enforces internally conflict head-on with one another.

In recent times Eagle has been actively involved in helping people get to grips with this situation, and we currently have a number of customer projects that build upon tools from the ISA framework. ISA helps manage experimental data more effectively, integrate legacy and current experimental metadata and view them as a cohesive whole, and start to better understand what has been achieved, how the pieces relate to each other, and where the gaps in knowledge lie.

Although the ISA website will probably explain all that you need to know, in a nutshell, ISA:

  • Has been adopted by BGI (GigaScience) and Nature Publishing Group, and is compatible with the European Nucleotide Archive (ENA) and ArrayExpress, amongst others
  • Curates and manages experimental metadata at source through common structured information that transcends biological domains
  • Can be configured to integrate additional community standards and ontologies
  • Exports and imports data directly to and from public repositories and journal archives, as well as files in other formats (e.g. MAGE-TAB, PRIDE XML, RDF, OWL, etc.)
  • Can support external collaboration via Google Spreadsheets (and through Eagle's own ElasticAP!)
  • Supports integrated analyses via Bioconductor, Galaxy, and more (not forgetting ElasticAP of course)

Of course, given the heavy commercial use to which Eagle has been subjecting the standard ISA tools, it was inevitable that issues would arise. This was never a problem, though: Eagle was originally set up to provide a reliable interface between the demands of the commercial world and the offerings of the open-source world, and we know exactly how to handle these situations to the benefit of both our commercial customers and the open-source projects. We actively patch the problems that we can solve ourselves and work with the excellent ISA project team at the University of Oxford to make the toolkit better for everyone.

If you're having problems organising and summarising your experiments across your team or organisation, get in touch.

Topics: Bioinformatics, collaboration management, data analysis, data efficiency, Eagle secret, experiment organisation, ISA adoption, ISA benefits, ISA framework, metadata, Open source, project management