Business value of ontologies
Ontologies are essential to many aspects of R&D data integration and data governance. Structured metadata, including ontologies and controlled vocabularies or nomenclatures, can rapidly increase an organisation's data management maturity. To bridge the gap between “big data” and “biological insight”, Eagle uses pioneering value-driven biocuration to help answer business and scientific questions where data harmonisation is essential to improving the quality of the underlying datasets.
Ontologies Mapping Project
Eagle Genomics has been a member of the Pistoia Alliance for a number of years.
The Pistoia Alliance is a global, not-for-profit alliance of life science companies, vendors, publishers and academic groups that work together to lower barriers to innovation in R&D.
One of the Pistoia Alliance's active projects is the Ontologies Mapping Project, for which Eagle Genomics was invited onto the Project Team. I have represented Eagle in this project for the last two years and have been involved at various stages, including contributing to documentation and assessing the functionality of available academic and commercial ontology mapping tools. It has been a very interesting journey that has given us both knowledge and insight into which ontologies are among the better ones and how some of the best ontology mapping tools work.
I was given a month's notice to prepare as a panelist on the Pistoia Alliance Debates Webinar "Ontologies Mapping for more effective data integration and knowledge management", to be aired live in February 2017. I was very nervous at the thought of being heard live by lots of people over the web. My largest audience to date had been a class of 28 ten-year-olds at my son’s school, and around the same number at a conference where I talked about my poster. I had never spoken in front of a large audience (200+ in this case) before, and this was also my first experience of participating as a panelist on a webinar!
The outline of my presentation in the webinar took the following format:
Ontologies support the bridge between data and insight
The Life Sciences face increasing volumes, variety, veracity, velocity and sources of data, all of which require integration and interoperability. Eagle provides software solutions that bridge the gap between “big data” and “innovative biological insight”.
Ontologies allow disparate data from a range of sources to be harmonised, federated and integrated into a resource on which high-performance computational analyses, such as data processing, statistical analysis and data mining, can be carried out in pursuit of novel biological insights.
Curation is a multistep activity performed on datasets, involving their collection, characterisation, contextualisation and categorisation, thereby allowing for better data management.
Eagle plays an active role in curating, organising and federating a variety of customer multi-omics datasets and associated metadata into a knowledge management platform (eaglecatalog), making data more visible and available for searching, sharing and further analyses, including data valuation.
Eagle pioneers the measurement of data value (i.e. its usefulness and relevance) in the context of specific scientific questions. Value-driven biocuration is used for data harmonisation and to improve the quality of the underlying datasets for value modelling. This has been the basis for our product eaglediscover, which objectively assesses the value of the data and discovers relationships between the components contributing to that value.
We can measure the value of data before and after the use of ontologies, according to quality metrics and value metrics such as AHP (analytic hierarchy process; a structured technique for organising and analysing complex decisions, based on mathematics and psychology) and QFD (Quality Function Deployment; a structured approach to defining customer needs or requirements and translating them into specific plans to produce products to meet those needs).
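To make the AHP idea concrete, here is a minimal sketch of how pairwise judgements can be turned into weights and then used to compare a dataset before and after ontology annotation. It is only an illustration and not how eaglediscover works: the three criteria, the pairwise judgements and the before/after scores are all invented for the example, and the weights are approximated with the row geometric mean method rather than the principal eigenvector.

```python
from math import prod

# Hypothetical value criteria, chosen only for illustration.
CRITERIA = ["completeness", "consistency", "findability"]

# Hypothetical pairwise comparison matrix on Saaty's 1-9 scale.
# PAIRWISE[i][j] says how much more important criterion i is than criterion j.
PAIRWISE = [
    [1.0, 1 / 3, 2.0],
    [3.0, 1.0, 4.0],
    [1 / 2, 1 / 4, 1.0],
]


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method."""
    n = len(pairwise)
    geometric_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


def weighted_score(scores, weights):
    """Combine per-criterion scores (0 to 1) into a single weighted value."""
    return sum(w * scores[name] for name, w in zip(CRITERIA, weights))


weights = ahp_weights(PAIRWISE)

# Invented scores for the same dataset before and after ontology annotation.
before = {"completeness": 0.6, "consistency": 0.4, "findability": 0.3}
after = {"completeness": 0.7, "consistency": 0.8, "findability": 0.7}

score_before = weighted_score(before, weights)
score_after = weighted_score(after, weights)

for name, weight in zip(CRITERIA, weights):
    print(f"{name}: weight {weight:.3f}")
print(f"value before: {score_before:.3f}, value after: {score_after:.3f}")
```

In a real assessment the pairwise judgements would come from the people asking the scientific question, and a consistency check on those judgements would normally be added before the weights are trusted.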
Data governance is emerging as an important activity within the biopharma and healthcare industries. It is a complex initiative concerned with validity (are we doing the right things?) and consistency (are we doing things right?) throughout the organisation, in support of efficient data and knowledge management.
It helps ensure that everyone refers to the same entity (drug, disease or gene) in the same way across all organisational departments and sites (R&D -> clinical trials -> sale of drug to treat disease), which is essential. Data governance should be an activity by design and not a “tick box” one. It can be initiated by using ontologies or controlled vocabularies to tag and link experiments and datasets across the different departments of an organisation.
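As a rough illustration of tagging with a controlled vocabulary, the sketch below resolves the free-text disease names used by different departments to one shared identifier, so that their records can be linked. Everything in it is hypothetical: the "EX:" identifiers, the synonym lists and the department records are made up for the example and do not come from any real ontology or from eaglecatalog.

```python
# Hypothetical controlled vocabulary: identifier -> preferred label and synonyms.
VOCABULARY = {
    "EX:0001": {
        "label": "type 2 diabetes mellitus",
        "synonyms": ["T2DM", "type II diabetes", "NIDDM"],
    },
    "EX:0002": {
        "label": "breast carcinoma",
        "synonyms": ["breast cancer"],
    },
}


def normalise(text):
    """Lower-case the text and collapse whitespace so trivial variants match."""
    return " ".join(text.lower().split())


def build_index(vocabulary):
    """Map every normalised label and synonym to its vocabulary identifier."""
    index = {}
    for term_id, entry in vocabulary.items():
        for name in [entry["label"], *entry["synonyms"]]:
            index[normalise(name)] = term_id
    return index


def tag_records(records, index):
    """Return copies of the records with a 'term_id' field (None if unmatched)."""
    return [
        {**record, "term_id": index.get(normalise(record["disease"]))}
        for record in records
    ]


# Invented records in which each department names the same disease differently.
records = [
    {"department": "R&D", "disease": "T2DM"},
    {"department": "clinical trials", "disease": "Type II  Diabetes"},
    {"department": "sales", "disease": "type 2 diabetes mellitus"},
    {"department": "R&D", "disease": "unknown syndrome"},
]

tagged = tag_records(records, build_index(VOCABULARY))
for record in tagged:
    print(record["department"], "|", record["disease"], "->", record["term_id"])
```

Records that come back with no identifier are the useful part for governance: they show where a department is using a name the vocabulary does not yet cover.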
Despite the need for better data standards to enable better data management, integration and interoperability, such standards remain poorly defined and, even when present, are often overlooked or ignored. Ontologies are an important solution because they serve as the “smart glue” for data integration, giving data and knowledge management a semantic foundation. However, this is hampered by multiple standards and many varying ontologies which can overlap in the same data domain. Hence, the Ontologies Mapping Project has been working towards supporting better tools, services and best practices for ontology management and mapping in the Life Sciences.
Other panelists on this webinar included Martin Romacker (Roche), Simon Jupp (EMBL-EBI) and Lee Harland (SciBite), who each described different aspects of the importance of ontologies for standardising datasets in the life sciences.
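To give a feel for what mapping between overlapping ontologies involves, here is a deliberately simple sketch that proposes mappings between two term lists by comparing their labels. It is an assumption-laden toy, not a description of any tool assessed in the project: the two ontologies, their identifiers and the 0.9 threshold are invented, and the matching is purely lexical (string similarity from Python's standard difflib module), so it cannot find synonyms that are spelled differently.

```python
from difflib import SequenceMatcher

# Two hypothetical ontologies covering the same domain (identifier -> label).
ONTOLOGY_A = {
    "A:001": "Myocardial Infarction",
    "A:002": "Breast carcinoma",
    "A:003": "Asthma",
}
ONTOLOGY_B = {
    "B:10": "myocardial infarction",
    "B:11": "breast-carcinoma",
    "B:12": "Hypertension",
}


def normalise(label):
    """Lower-case, treat hyphens as spaces and collapse whitespace."""
    return " ".join(label.lower().replace("-", " ").split())


def similarity(label_a, label_b):
    """Lexical similarity between two labels, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalise(label_a), normalise(label_b)).ratio()


def propose_mappings(source, target, threshold=0.9):
    """For each source term, propose the best-scoring target term above the threshold."""
    mappings = []
    for source_id, source_label in source.items():
        best_id, best_score = None, 0.0
        for target_id, target_label in target.items():
            score = similarity(source_label, target_label)
            if score > best_score:
                best_id, best_score = target_id, score
        if best_id is not None and best_score >= threshold:
            mappings.append((source_id, best_id, best_score))
    return mappings


mappings = propose_mappings(ONTOLOGY_A, ONTOLOGY_B)
for source_id, target_id, score in mappings:
    print(f"{source_id} -> {target_id} (similarity {score:.2f})")
```

Terms that receive no proposal, such as "Asthma" here, still need a curator's judgement, which is one reason tools and best practices for mapping matter.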
Listen to the Pistoia Alliance Ontology Mapping webinar in full here.
If you would like to learn more about how Eagle Genomics utilises ontologies in both eaglecatalog and eaglediscover, please get in touch! Call us on +44 (0) 1223 654481 or drop us a line using our specialised form.