
Using QIIME to study populations of microbiomes

March 23, 2017 / in microbiomics

A microbiome is a collection of microbes, such as bacteria, fungi, protozoans and viruses, that inhabit a given environment. One such environment is the human body; the human microbiome is important for maintaining health, and when things go wrong it can contribute to disease.

To understand how populations of many microbes influence human disease, we need to understand the "microbial make-up" of individuals. Microbes are studied using metagenomics, a technique that reveals the biological functions of an entire microbial community. Looking at the whole community, rather than at individual microbes, is important because our nasal, oral, skin, gastrointestinal and urogenital areas are populated by multiple species that often carry different genes with different metabolic functions. Metagenomic research indicates that it is often the collection of functions the microbes provide that is important for health, rather than the presence or absence of a single species.

The tool QIIME™ (Quantitative Insights Into Microbial Ecology, pronounced "chime") is designed to take users from raw sequencing data through to publication-quality graphics and statistics that can be used to gain insight into a microbiome of interest. QIIME has been applied to studies based on billions of sequences from tens of thousands of samples.

Citation: QIIME allows analysis of high-throughput community sequencing data. J Gregory Caporaso, Justin Kuczynski, Jesse Stombaugh, et al.; Nature Methods, 2010; doi:10.1038/nmeth.f.303

To identify the many microbes that contribute to a sample's microbiome, researchers use DNA sequencing. A common technique is to sequence a short, characteristic stretch of DNA called a marker. The marker is picked out of the whole genome by sequencing primers that flank it. The marker serves to identify the genome that contains it, thereby providing a name for the microbe, ideally at species level. One commonly used DNA marker is the gene that codes for 16S ribosomal RNA, a component of the small ribosomal subunit and an important part of the cell’s protein-building machinery.
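As a rough illustration of how primers pick a marker out of a longer sequence, here is a minimal Python sketch. It is not taken from QIIME. The function names and the toy sequences are invented for this example, and it assumes exact primer matches, whereas real primers are usually matched with some tolerance for mismatches and degenerate bases.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def extract_marker(genome, forward_primer, reverse_primer):
    """Return the region between the two primer sites, or None if one is missing.

    The reverse primer anneals to the opposite strand, so on the forward
    strand we search for its reverse complement. Exact matching only
    (a simplifying assumption).
    """
    start = genome.find(forward_primer)
    if start == -1:
        return None
    start += len(forward_primer)
    end = genome.find(reverse_complement(reverse_primer), start)
    if end == -1:
        return None
    return genome[start:end]


# Toy sequences, invented for illustration only.
genome = "TTGACCGGATACGTTAGGCATCCAAGT"
marker = extract_marker(genome, "CCGG", "GATG")
print(marker)
```

In practice the extracted marker would then be compared against a reference database to find the closest known organism.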

QIIME is a multi-platform, open-source bioinformatics pipeline for performing microbiome analysis. The tool allows a user to analyse and interpret marker nucleic acid sequence data from fungal, viral, bacterial and archaeal communities. The pipeline runs from demultiplexing and quality filtering, through OTU picking, taxonomic assignment and phylogenetic reconstruction, to diversity analyses and visualisations.
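The first two stages named above, demultiplexing (assigning pooled reads to samples by their barcode) and quality filtering, can be sketched in a few lines of Python. This is a simplified illustration, not QIIME's implementation. The barcodes, reads, quality scores and the mean-quality cut-off of 20 are all assumptions made up for the example.

```python
def demultiplex(reads, barcodes, barcode_length):
    """Group reads by sample using the barcode at the start of each read.

    `reads` is a list of (sequence, per-base quality scores) pairs.
    Reads with an unknown barcode are discarded; the barcode is trimmed off.
    """
    by_sample = {}
    for seq, quals in reads:
        sample = barcodes.get(seq[:barcode_length])
        if sample is None:
            continue
        by_sample.setdefault(sample, []).append(
            (seq[barcode_length:], quals[barcode_length:])
        )
    return by_sample


def mean_quality_ok(quals, min_mean=20):
    """Keep a read only if its mean quality score reaches the threshold."""
    return bool(quals) and sum(quals) / len(quals) >= min_mean


# Toy data, invented for illustration only.
barcodes = {"AAGG": "gut", "CCTT": "skin"}
reads = [
    ("AAGGACGTACGT", [30] * 12),
    ("CCTTTTGCATGC", [30] * 12),
    ("AAGGACGTTCGT", [30] * 4 + [5] * 8),   # low-quality read
    ("GGGGACGTACGT", [30] * 12),            # unknown barcode
]
clean = {
    sample: [seq for seq, quals in sample_reads if mean_quality_ok(quals)]
    for sample, sample_reads in demultiplex(reads, barcodes, 4).items()
}
print(clean)
```

Real pipelines also handle barcode errors, primer removal and position-wise quality trimming, which this sketch leaves out.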

Here is a common sequence of events for a QIIME pipeline:

  1. Preprocessing involving demultiplexing (splitting pooled sequences into their respective samples), primer removal, quality trimming and filtering.
  2. Denoising to correct for remaining PCR and sequencing errors.
  3. Cluster the sequences into operational taxonomic units (OTUs) and choose representative sequences. An OTU can be thought of as a group of closely related microorganisms. OTUs are commonly generated by clustering sequences at 97% identity (or higher), though other methodologies are sometimes used.
  4. Align OTU sequences to build a phylogenetic tree.
  5. Assign taxonomy to the OTUs and build a table (known as an OTU table) showing the abundance of each OTU in each sample.
  6. Look at diversity within samples (alpha diversity), and between samples (beta diversity).
  7. Visualise data with PCoA plots, rarefaction plots, bar charts and more!
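To make the OTU clustering, OTU table and diversity steps in the list above more concrete, here is a minimal Python sketch. It is not QIIME's code. It assumes equal-length, pre-aligned reads and uses a simple greedy clustering at 97% identity, where production tools rely on proper sequence alignment and more careful algorithms. The Shannon index and Bray-Curtis dissimilarity are used here as examples of alpha and beta diversity measures, and all data are invented.

```python
import math
from collections import Counter


def identity(a, b):
    """Fraction of matching positions; assumes equal-length, pre-aligned reads."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))


def pick_otus(reads, threshold=0.97):
    """Greedy clustering: a read joins the first OTU whose representative it
    matches at >= threshold identity; otherwise it founds a new OTU."""
    representatives = []   # one representative sequence per OTU
    table = {}             # sample -> Counter mapping OTU index -> read count
    for sample, seq in reads:
        for otu, rep in enumerate(representatives):
            if identity(seq, rep) >= threshold:
                break
        else:
            representatives.append(seq)
            otu = len(representatives) - 1
        table.setdefault(sample, Counter())[otu] += 1
    return representatives, table


def shannon(counts):
    """Alpha diversity: Shannon index (natural log) of one sample."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n)


def bray_curtis(a, b):
    """Beta diversity: Bray-Curtis dissimilarity between two samples."""
    shared = sum(min(a[otu], b[otu]) for otu in set(a) | set(b))
    return 1 - 2 * shared / (sum(a.values()) + sum(b.values()))


# Toy 40-base reads, invented for illustration only.
base_a = "ACGT" * 10
variant_a = "ACGT" * 9 + "ACGA"   # one mismatch in 40 bases: 97.5% identity
base_b = "TTGCA" * 8
reads = [("gut", base_a), ("gut", variant_a), ("gut", base_b),
         ("skin", base_b), ("skin", base_b)]

representatives, otu_table = pick_otus(reads)
print(otu_table)
print(shannon(otu_table["gut"]), bray_curtis(otu_table["gut"], otu_table["skin"]))
```

Here the two near-identical reads fall into the same OTU, so the gut sample contains two OTUs and the skin sample one.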

The output of a metagenomics analysis identifies the microbes that exist together in a sample of interest, that is, the different species in a community and their relative abundance. From the data of multiple samples we can ask a scientific question: which microbes show the largest differences between environmental conditions? Microbes often depend on the metabolic products from their neighbours as resources for survival. It is the collection of functions the microbes provide that’s important for health. For example, a healthy gut contains bacteria that produce vitamins. It appears not to matter which types of bacteria are present making the vitamins, as long as the job gets done.
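The question of which microbes differ most between conditions can be approached, at its simplest, by comparing mean relative abundances. The Python sketch below is a descriptive illustration only: the taxon names, counts and the two condition labels are invented, and a real study would add appropriate statistical testing.

```python
def relative_abundance(counts):
    """Convert raw counts for one sample into fractions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def mean_profile(samples):
    """Mean relative abundance of each taxon across a group of samples."""
    profiles = [relative_abundance(s) for s in samples]
    taxa = {taxon for p in profiles for taxon in p}
    return {t: sum(p.get(t, 0.0) for p in profiles) / len(profiles) for t in taxa}


def rank_differences(group_a, group_b):
    """Rank taxa by the absolute change in mean relative abundance (b minus a)."""
    mean_a, mean_b = mean_profile(group_a), mean_profile(group_b)
    diffs = {t: mean_b.get(t, 0.0) - mean_a.get(t, 0.0)
             for t in set(mean_a) | set(mean_b)}
    return sorted(diffs.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical counts per taxon for two samples in each condition.
condition_1 = [{"taxon_A": 50, "taxon_B": 40, "taxon_C": 10},
               {"taxon_A": 60, "taxon_B": 30, "taxon_C": 10}]
condition_2 = [{"taxon_A": 20, "taxon_B": 40, "taxon_C": 40},
               {"taxon_A": 10, "taxon_B": 20, "taxon_C": 70}]

ranked = rank_differences(condition_1, condition_2)
for taxon, change in ranked:
    print(f"{taxon}: {change:+.2f}")
```

With these made-up numbers, taxon_C shows the largest shift between the two conditions, followed by taxon_A.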

Eagle Genomics have broad experience working with QIIME to perform metagenomics analysis for customers. Eagle metagenomics pipelines are built using the eaglehive architecture. eaglehive organises “runnables” (pieces of analytics code) into a coordinated workflow.
Runnables are available from an ever-growing library of components (which includes QIIME) and dockerised images of fully functional analytical containers. See another blog from Eagle Genomics that discusses deployment of our tools. These docker images can be created, stored and used on demand; our proprietary Docker Swarm Orchestrator engine dynamically scales compute resources and schedules docker containers, fulfilling the needs of complicated workflows that support analysis in genomics, metagenomics, data visualisation etc. eaglehive pipelines can reduce the time needed to gain insight into your data of interest.

If you would like to discuss how eaglehive can help further your metagenomic research, get in touch on +44 (0) 1223 654481 or drop us a line with our specialised form.


Further reading: Biocote have written a series of blogs about the microbiome that make interesting reading.



Eleanor Stanley

About Eleanor Stanley

Scientific data and information security specialist, Eleanor Stanley is a biocurator at Eagle Genomics, and is also responsible for information security. She joined the company in mid-2014 from the Wellcome Trust Sanger Institute (WTSI), where she worked as a bioinformatician building a pipeline for genome annotation within the 50 Helminth Genomes Initiative, which is part of the Global health research project at WTSI. Eleanor’s entire career since university has been in biocuration, though she had a flutter and gained a Master’s degree in bioinformatics in 2012. She began as a literature curator with FlyBase at the University of Cambridge and then UniProt at the European Bioinformatics Institute (EMBL-EBI), focusing on Drosophila, worms, alternative splicing and complete proteome sets. From there she mixed bioinformatics and biocuration at WTSI, building a gene annotation pipeline and taking her automatically generated gene models for Onchocerca volvulus and manually improving them for WormBase. "While fly biology and biocuration of worm datasets isn't the most common route into human genomics, it's all about getting new data and understanding its potential for scientific discovery. Eagle has given me a great opportunity to keep learning, such an energetic company."