
What can ChIP-seq data tell us?

Gene expression is tightly regulated and controls the development and maintenance of cells in all organisms. Misregulation of genes can have adverse effects; in humans, for example, it can contribute to the initiation and progression of disease. One technology used to help identify these misregulated genes is ChIP-sequencing, also known as ChIP-seq.

ChIP-seq is a powerful method that provides the genomic location of regulatory features (for example, transcription-factor binding sites and histone modifications) in living cells. It combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. DNA-protein complexes are first crosslinked, and an antibody against the protein of interest is then used to immunoprecipitate them; the sites of enrichment are identified by sequencing the recovered DNA.

Figure: ChIP-seq workflow. Image from Szalkowski, A.M. and Schmid, C.D. (2010), http://ccg.vital-it.ch/chipseq/doc/chipseq_tutorial_intro.php

The sensitivity of this technology depends on the:

  • depth of the sequencing run (i.e. the number of mapped sequence tags)
  • size of the genome
  • specificity of the antibody.

For more details, this paper is useful reading: http://genome.cshlp.org/content/22/9/1813.long.

As with many high-throughput sequencing approaches, genome-wide ChIP-seq generates extremely large data sets (on the order of 30–100 million mapped reads). Appropriate computational analysis methods are required to identify the uniquely mapped reads; for a mammalian sample, the suggested minimum is 20 million uniquely mapped reads.
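As a rough illustration of that depth check, the sketch below counts uniquely mapped reads in SAM-formatted text and compares the total with the 20 million figure mentioned above. It is a minimal sketch, not a recommended pipeline: the function names are hypothetical, and treating "uniquely mapped" as "primary alignment with MAPQ of at least 10" is an assumption, since the right cut-off depends on the aligner used.

```python
from typing import Iterable

# The 20 million figure comes from the text above.
MIN_UNIQUE_READS = 20_000_000
# ASSUMPTION: a MAPQ cut-off of 10 stands in for "uniquely mapped";
# aligners differ in how they report mapping quality.
MIN_MAPQ = 10

FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


def count_uniquely_mapped(sam_lines: Iterable[str], min_mapq: int = MIN_MAPQ) -> int:
    """Count primary, mapped alignments whose MAPQ is at least min_mapq."""
    count = 0
    for line in sam_lines:
        # Skip blank lines and SAM header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM alignment has 11 mandatory fields
            continue
        flag = int(fields[1])
        mapq = int(fields[4])
        if flag & (FLAG_UNMAPPED | FLAG_SECONDARY | FLAG_SUPPLEMENTARY):
            continue
        if mapq >= min_mapq:
            count += 1
    return count


def meets_depth(n_unique: int, minimum: int = MIN_UNIQUE_READS) -> bool:
    """Return True if the sample reaches the suggested minimum depth."""
    return n_unique >= minimum


# Tiny made-up example records (not real data).
SEQ, QUAL = "A" * 36, "I" * 36
example_sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    f"r1\t0\tchr1\t100\t60\t36M\t*\t0\t0\t{SEQ}\t{QUAL}",    # kept
    f"r2\t4\t*\t0\t0\t*\t*\t0\t0\t{SEQ}\t{QUAL}",            # unmapped
    f"r3\t256\tchr1\t300\t0\t36M\t*\t0\t0\t{SEQ}\t{QUAL}",   # secondary
    f"r4\t16\tchr1\t500\t3\t36M\t*\t0\t0\t{SEQ}\t{QUAL}",    # low MAPQ
    f"r5\t16\tchr2\t900\t42\t36M\t*\t0\t0\t{SEQ}\t{QUAL}",   # kept
]

if __name__ == "__main__":
    n = count_uniquely_mapped(example_sam)
    print(f"uniquely mapped reads: {n}; meets suggested depth: {meets_depth(n)}")
```

In practice the lines would be streamed from an aligner's output rather than held in a list, but the filtering logic would be the same.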

To predict DNA-binding sites from these sequence reads, differential peak calling methods have been developed that identify significant differences in ChIP-seq signals between distinct biological conditions. Eagle Genomics have worked with experts to develop proficiency with the SeqMonk tool. SeqMonk allows visualisation and analysis of any mapped sequence data (BAM, SAM, etc.) against an annotated genome. Quantitation and statistical analysis can be performed to find the regulatory regions of interest and to compare these regions between data sets. The image below shows enrichment of reads (shown by the height of the bars) from four samples that map to a gene of interest; we can see differential enrichment of reads across the gene between the samples.

Figure: ChIP-seq read enrichment across a gene of interest in four samples.
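To make the idea of comparing enrichment between samples concrete, here is a simplified sketch that normalises read counts in a region to reads per million mapped reads and then computes a log2 fold change between two samples. This is not how SeqMonk or any particular peak caller works internally; the function names, the pseudocount, and the sample counts are all hypothetical and chosen only for illustration.

```python
import math


def reads_per_million(region_count: int, total_mapped: int) -> float:
    """Scale a region's read count by library size (reads per million)."""
    if total_mapped <= 0:
        raise ValueError("total_mapped must be positive")
    return region_count * 1_000_000 / total_mapped


def log2_fold_change(rpm_a: float, rpm_b: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of two normalised signals.

    ASSUMPTION: a pseudocount of 1.0 is added to avoid division by zero;
    the appropriate value is a judgment call.
    """
    return math.log2((rpm_a + pseudocount) / (rpm_b + pseudocount))


# Made-up counts: (reads in the region of interest, total mapped reads).
samples = {
    "treated_1": (900, 30_000_000),
    "treated_2": (1_100, 40_000_000),
    "control_1": (150, 30_000_000),
    "control_2": (180, 36_000_000),
}

if __name__ == "__main__":
    rpm = {name: reads_per_million(c, total) for name, (c, total) in samples.items()}
    for name, value in rpm.items():
        print(f"{name}: {value:.2f} reads per million")
    lfc = log2_fold_change(rpm["treated_1"], rpm["control_1"])
    print(f"log2 fold change (treated_1 vs control_1): {lfc:.2f}")
```

Normalising by library size matters because samples are rarely sequenced to the same depth; a real analysis would also use replicates and a statistical test rather than a single ratio.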

ChIP-seq has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Analysing these large datasets to determine which regulatory regions are important for your condition of interest is a complicated process. Many tools are available; at Eagle Genomics we have applied our expertise to SeqMonk for quantitation analysis. Reliable results are essential to fully understand biological processes and disease states, and helping our customers obtain them has been a rewarding process.


Eleanor Stanley

About Eleanor Stanley

A scientific data and information security specialist, Eleanor Stanley is a biocurator at Eagle Genomics and is also responsible for information security. She joined the company in mid-2014 from the Wellcome Trust Sanger Institute (WTSI), where she worked as a bioinformatician building a pipeline for genome annotation within the 50 Helminth Genomes Initiative, which is part of the Global health research project at WTSI. Eleanor’s entire career since university has been in biocuration, though she had a flutter and gained a Master's degree in bioinformatics in 2012. She began as a literature curator with FlyBase at the University of Cambridge and then with UniProt at the European Bioinformatics Institute (EMBL-EBI), focusing on Drosophila, worms, alternative splicing and complete proteome sets. From there she mixed bioinformatics and biocuration at WTSI, building a gene annotation pipeline and manually improving her automatically generated gene models for Onchocerca volvulus for WormBase. "While fly biology and biocuration of worm datasets isn't the most common route into human genomics, it's all about getting new data and understanding its potential for scientific discovery. Eagle has given me a great opportunity to keep learning, such an energetic company."