tranSMART 1.2 Chef Recipe

As part of a customer project we were required to install the latest version of tranSMART (v1.2) for the analysis and visualisation of clinical trial data. At Eagle Genomics, we use Opscode Chef for all but the simplest of our custom project deployments, so I wrote a Chef recipe for tranSMART 1.2. The recipe took 19 minutes to run on a medium-sized Red Hat system launched on Amazon Web Services, compared with several hours for a manual install, and the install can be tested automatically. (Note: for an earlier version of the tranSMART Chef recipe, see our initial blog post.)

The install procedure is based on the tranSMART wiki instructions, but a few additions and changes were needed. We also reported some minor issues with the instructions, which the wiki author promptly fixed. There is one outstanding issue with loading RNASeq data, which I describe in detail below.

Operating Systems

The recipe is designed to work on Ubuntu 14.04 Trusty Tahr and RHEL 6.5 (RHEL 7.0 is available but was not tested as extensively). The default recipe identifies the operating system (OS) of the machine it is running on and starts the appropriate OS-specific recipe. I will describe the RHEL 6.5 recipe in detail; the other recipes follow a similar pattern.
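
The dispatch idea can be sketched in plain Ruby. This is a minimal illustration, not the code from the actual cookbook: the cookbook name "transmart", the recipe names and the helper recipe_for are all assumptions. In Chef, the platform string would come from node['platform'], and the result would be passed to include_recipe.

```ruby
# Hypothetical mapping from the platform name reported by Chef/Ohai
# (node['platform']) to an OS-specific recipe. The cookbook name
# "transmart" and both recipe names are assumptions for illustration.
OS_RECIPES = {
  'ubuntu' => 'transmart::ubuntu',
  'redhat' => 'transmart::rhel'
}.freeze

# Return the recipe name for a platform, or raise if it is unsupported.
def recipe_for(platform)
  OS_RECIPES.fetch(platform) do
    raise ArgumentError, "unsupported platform: #{platform}"
  end
end

# In a Chef default recipe this would be used roughly as:
#   include_recipe recipe_for(node['platform'])
puts recipe_for('redhat')
```

Failing loudly on an unknown platform is a deliberate choice here: it stops the run before any packages are installed on a system the recipe was never tested on.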

Recipe Stages

The recipe performs the following steps:

  • A number of packages are required. Most are installed with the Chef package resource, but the PostgreSQL RPM has to be installed from a URL.
  • PostgreSQL is configured: the service is initialised and its access permissions are modified.
  • Tomcat is downloaded from a URL and unzipped, the tomcat7 user and group are created, and the ownership and permissions are set.
  • The source code for tranSMART itself is downloaded from GitHub.
  • The last step runs a long bash shell script, which does the following:
    • Starts PostgreSQL and builds the tranSMART databases
    • Builds tranSMART and SOLR
    • Configures the Tomcat JVM, then patches and starts Tomcat

We would not normally recommend such a large bash script, but the various steps set environment variables that are used in subsequent steps.
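
To see why splitting the script is awkward, consider what happens to an exported variable when each step runs in its own shell. The sketch below is only a demonstration of that behaviour, assuming bash is on the PATH; DEMO_TRANSMART_HOME and its value are made-up names, not variables from the real install script.

```ruby
require 'open3'

# DEMO_TRANSMART_HOME is a made-up variable used only for this demo.
step_export = 'export DEMO_TRANSMART_HOME=/opt/demo'
step_use    = 'echo "home=${DEMO_TRANSMART_HOME:-unset}"'

# Two separate shells: the export made in the first shell is lost
# before the second shell starts.
Open3.capture2('bash', '-c', step_export)
separate, = Open3.capture2('bash', '-c', step_use)

# One shell running both steps: the export is still visible later on.
combined, = Open3.capture2('bash', '-c', "#{step_export}\n#{step_use}")

puts separate  # home=unset (as long as the parent environment does not set it)
puts combined  # home=/opt/demo
```

Each separate bash block in a recipe behaves like the first case, so steps that depend on earlier exports have to share one script or have the variables passed to them explicitly.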

Quality Control and Testing

We ran all the recipes through both RuboCop and Foodcritic as a quality control step; they catch many common errors in recipes. In addition, we wrote an automated test suite using Serverspec. Automated testing is highly recommended for all recipes and is standard practice at Eagle.
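
A typical post-install check is "is the service listening on its port?". The snippet below is a plain-Ruby stand-in for that kind of check, using only the standard library; it is not Serverspec syntax and not taken from our test suite. The helper name port_listening? is hypothetical, and ports 8080 (Tomcat) and 5432 (PostgreSQL) are the usual defaults, assumed here rather than taken from the recipe.

```ruby
require 'socket'

# Return true if something accepts TCP connections on host:port
# within the timeout, false otherwise.
def port_listening?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end

# Assumed default ports; adjust to match the actual installation.
checks = { 'Tomcat' => 8080, 'PostgreSQL' => 5432 }
checks.each do |name, port|
  state = port_listening?('127.0.0.1', port) ? 'listening' : 'not listening'
  puts "#{name} (port #{port}): #{state}"
end
```

Running checks like these after every Chef run turns "the install seemed to work" into something that can be verified automatically.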

Warnings

We identified two issues with the install process, and therefore with the Chef recipes:

1. Version Changes

The URL from which the tranSMART source code is downloaded is not version specific, so it always fetches the latest version (1.2.4 at the time of writing). If the version changes in the future, it could break the recipe. If you find it is not working, please let me know by leaving a comment below.
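
One way to make such breakage easier to diagnose is to compare the version that was actually downloaded with the one the recipe was tested against. The following is a minimal sketch of that comparison only: the helper tested_version? is hypothetical, 1.3.0 is an invented future version, and how the downloaded version is detected is left open.

```ruby
require 'rubygems'

# The version the recipe was tested against, as stated in this post.
TESTED_VERSION = Gem::Version.new('1.2.4')

# Hypothetical helper: true only if the version found matches the tested one.
def tested_version?(found)
  Gem::Version.new(found) == TESTED_VERSION
end

# '1.3.0' is an invented future version, used only for the demonstration.
%w[1.2.4 1.3.0].each do |version|
  status = tested_version?(version) ? 'tested' : 'untested, the recipe may need updating'
  puts "tranSMART #{version}: #{status}"
end
```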

2. RNASeq Data Loading

When RNASeq data is loaded into tranSMART, it is not visible in the analysis window. If this happens, run the following on the Linux command line to resolve the issue:

sudo su postgres
psql -U postgres -d transmart
transmart=> select * from deapp.de_gpl_info;
transmart=> UPDATE deapp.de_gpl_info SET marker_type = 'RNASEQ_RCNT' WHERE platform = 'NAME';
transmart=> select * from deapp.de_gpl_info;

The marker_type column for the relevant platform (NAME in the UPDATE statement is a placeholder for it) should change from “Chromosomal” to “RNASEQ_RCNT”. I have reported the bug, so hopefully it will be corrected soon.

Conclusion

I hope this post saves you some time installing tranSMART, or encourages you to install and explore this powerful system for analysing and visualising clinical data.

About Bart Ailey