March 3, 2017

ProServer DAS, Ensembl and EC2

If you use the ProServer DAS server with Ensembl-based adaptors (maybe you want to serve data from Ensembl, or maybe you have genomic data and want to serve it with some extra annotation that is in Ensembl), then you may have come across the 'MySQL server has gone away' timeout error. What is happening is that the ProServer child processes live longer than the MySQL connection timeout, whether that timeout is set on the connection itself or in the MySQL server configuration. The server closes the idle connection, and the next query from the child process fails.

As is often the case, there are a number of solutions to this problem. One is to change the wait_timeout parameter in the database configuration. This is fine if you have access to the configuration (e.g. if you run your own local MySQL server) and you don't mind the change affecting every other process that connects to that server.

However, this solution is not always appropriate. For example, we have recently moved to accessing some Amazon EC2-based Ensembl databases, where the server configuration is not ours to change.

So another solution is to use a feature already present in the Ensembl code, specifically in the DBConnection.pm module, which as the name suggests handles Ensembl's database connections. This also sets the MySQL wait_timeout setting, but this time it only affects the current connection. Here is an example of how to do this:

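use Bio::EnsEMBL::Registry;

my $ens_registry = 'Bio::EnsEMBL::Registry';

# wait_timeout is given in seconds; 2678200 is roughly 31 days
$ens_registry->load_registry_from_db(
    -host         => $ENSEMBL_MYSQL_HOST,
    -user         => $ENSEMBL_MYSQL_USERNAME,
    -port         => $ENSEMBL_MYSQL_PORT,
    -wait_timeout => 2678200,
);

# Adaptors fetched from the registry use connections opened with this timeout
my $sliceAdaptor = $ens_registry->get_adaptor('human', 'core', 'Slice');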

Now, when the connection is opened, DBConnection.pm runs a SQL statement that sets wait_timeout for this connection only.
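One way to confirm that the setting has taken effect is to ask MySQL for the session value through the same connection. The following is a minimal sketch rather than a tested recipe: the host name and user are hypothetical placeholders, and it assumes the adaptor's dbc() and db_handle() accessors give you the underlying DBI handle.

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Hypothetical connection details; replace with your own server.
my %conn = (
    -host         => 'ensembl-db.example.org',
    -user         => 'anonymous',
    -port         => 3306,
    -wait_timeout => 2678200,
);

Bio::EnsEMBL::Registry->load_registry_from_db(%conn);

my $slice_adaptor =
    Bio::EnsEMBL::Registry->get_adaptor('human', 'core', 'Slice');

# dbc() returns the adaptor's DBConnection object;
# db_handle() returns the DBI handle it wraps.
my $dbh = $slice_adaptor->dbc->db_handle;

# Ask MySQL which timeout is in force for this session only.
my ($session_timeout) =
    $dbh->selectrow_array('SELECT @@session.wait_timeout');

print "Session wait_timeout: $session_timeout seconds\n";
```

If the printed value is still the server default, the option has probably not reached the connection that the adaptor is using.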

Finally, a quick comment on one of the other possible solutions. As DBConnection.pm is a wrapper around DBI's database handle, it is possible to add a line to the attributes it passes when connecting, telling the MySQL driver to use mysql_auto_reconnect:

mysql_auto_reconnect => 1
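For illustration, this is roughly what the attribute looks like at the plain DBI level, outside of Ensembl. It is a sketch only: the host, port and user are hypothetical, and it is not the code from DBConnection.pm.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical connection details; replace with your own server.
my $host = 'ensembl-db.example.org';
my $port = 3306;
my $user = 'anonymous';

my $dbh = DBI->connect(
    "DBI:mysql:host=$host;port=$port",
    $user,
    undef,    # no password in this sketch
    {
        RaiseError           => 1,    # die on errors instead of returning undef
        mysql_auto_reconnect => 1,    # let DBD::mysql reconnect a dropped connection
    },
);

# The attribute can also be switched on for an existing handle.
$dbh->{mysql_auto_reconnect} = 1;
```

Two caveats are worth keeping in mind. A reconnect starts a fresh MySQL session, so any session-level settings are back at the server defaults. The DBD::mysql documentation also notes that the attribute is ignored when AutoCommit is turned off.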

However, while this approach stops the 'MySQL server has gone away' error messages and greatly increases the number of successful requests to ProServer, it still doesn't appear to work 100% of the time. We would therefore recommend setting wait_timeout on the connection through DBConnection.pm, as in the example above.

Topics: Bioinformatics, Cloud, DAS, EC2, Ensembl, ProServer