
I've known and used the Generic Genome Browser (aka gbrowse, see here) for years now, and it occurred to me a while ago that it should be oh so simple to create the same functionality with a much easier setup if we could use ruby instead of perl.
Gbrowse depends on bioperl's Bio::Graphics module. Although gbrowse has been instrumental for many people's research, it does take a bit of work to get it installed. Apart from bioperl, it depends on Apache for showing the results in a browser. Compare that to any Rails application, where you basically just need ruby and a "gem install rails". I've created rails applications in the past that contain exactly the kind of data that would typically be visualized by something like gbrowse. They take no time at all to set up, and you can even get away with writing virtually no code. And there's no Apache to install, and no configuration files that you can't access because you're not root.
Such a rails application makes it possible to browse, edit and delete the data. The problem comes with the visualization bit. There's
no bioruby graphics library (yet?) that automatically parses features on a reference and creates a nice picture of where your genes are on that chromosome. Of course, the genes should be clickable so you can link through to NCBI or Ensembl.
I've spent some time in the last year creating such a Bio::Graphics thing for ruby. I wanted it to behave the same as the one from bioperl: there is one panel that has one or more tracks, and each track has features on it. Creating a proof-of-concept library was quite easy; the most difficult part was actually finding the right backend.
What should I use to create the pictures themselves? As I'd worked with SVG before, that seemed the right way to go. I downloaded a library from http://raa.ruby-lang.org/project/ruby-svg/ and got a prototype running quite easily. Problem: I needed an SVG viewer or Firefox to actually view the picture, and zooming in/out screwed up all text. So after weeks of digging around, I found rcairo, a ruby binding to Cairo. Migrating to this backend was easy peasy and the pictures look really nice (see at the top). Unfortunately, it's impossible to create clickable glyphs using Cairo itself, but that can easily be worked around by creating an HTML file that contains an image map for the picture. That's exactly what gbrowse does as well, isn't it?
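The image-map workaround might look roughly like this. It's only a sketch under assumptions, not the code from the library: Region and image_map_html are made-up names, and the pixel bounding box of each glyph is assumed to be known from the drawing step.

```ruby
# Hypothetical record: a glyph's pixel bounding box plus its (optional) link.
Region = Struct.new(:name, :left, :top, :right, :bottom, :url)

HTML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Escape characters that are unsafe inside HTML attribute values.
def h(text)
  text.to_s.gsub(/[&<>"]/, HTML_ESCAPES)
end

# Build an HTML page that shows the PNG and overlays a client-side image map.
# Regions without a URL are skipped, so those glyphs stay unclickable.
def image_map_html(png_file, regions, map_name = 'panel_map')
  areas = regions.select(&:url).map do |r|
    coords = [r.left, r.top, r.right, r.bottom].join(',')
    %(<area shape="rect" coords="#{coords}" href="#{h(r.url)}" alt="#{h(r.name)}">)
  end
  <<~HTML
    <html>
    <body>
    <img src="#{h(png_file)}" usemap="##{map_name}" alt="feature panel">
    <map name="#{map_name}">
    #{areas.join("\n")}
    </map>
    </body>
    </html>
  HTML
end
```

Writing the returned string to a file next to the PNG gives a page in which every glyph with a URL is a clickable rectangle.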
The picture at the top has been created using the following simple script:
g = BioExt::Graphics::Panel.new(800, 1200, true, 1, 610)
track1 = g.add_track('generic')
track2 = g.add_track('directed',[0,1,0],'directed_generic')
track3 = g.add_track('triangle',[0.5, 0.5, 0.5],'triangle')
track4 = g.add_track('spliced',[1,0,0],'spliced')
track5 = g.add_track('directed_spliced',[1,0,1],'directed_spliced')
track1.add_feature('bla1','250..375', 'http://www.newsforge.com')
track1.add_feature('bla2','54..124', 'http://www.thearkdb.org')
track1.add_feature('bla3','100..449', 'http://www.google.com')
track2.add_feature('bla4','50..60', 'http://www.google.com')
track2.add_feature('bla5','complement(80..120)', 'http://www.sourceforge.net')
track3.add_feature('piep','56')
track3.add_feature('bla','103', 'http://digg.com')
track4.add_feature('gene1','join(34..52,109..183)','http://news.bbc.co.uk')
track4.add_feature('gene2','complement(join(170..231,264..299,350..360,409..445))')
track4.add_feature('gene3','join(134..152,209..283)')
track5.add_feature('gene1','join(34..52,109..183)', 'http://www.vrtnieuws.net')
track5.add_feature('gene2','complement(join(170..231,264..299,350..360,409..445))','http://www.roslin.ac.uk')
track5.add_feature('gene3','join(134..152,209..283)')
g.draw('my_panel.png')
What happens here?
The Panel.new call: Create a new panel for a sequence of 800 bp, with the picture being 1200 points wide. Make all glyphs clickable if a URL is defined (the true), and zoom into the region from 1 to 610 bp.
The add_track calls: Create different tracks, each with a name, a colour (in RGB at the moment, given here as values between 0 and 1) and a type.
The add_feature calls: Add features to those tracks, each with a name, a locus and an optional URL to link out to external websites. Notice how it handles spliced features and features on the reverse strand?
The draw call: Create the PNG (and in this case: also HTML) file.
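To give an idea of what handling those location strings involves, here is a plain-ruby sketch. It is not how the library does it internally (that code isn't shown here): Locus, parse_locus and bp_to_x are hypothetical names, and the linear bp-to-point scaling is my assumption about how a zoomed region maps onto the picture width.

```ruby
# Hypothetical value object: exon ranges in bp plus the strand (+1 or -1).
Locus = Struct.new(:ranges, :strand)

# Parse location strings such as '56', '54..124', 'complement(80..120)'
# or 'complement(join(170..231,264..299))'.
def parse_locus(text)
  strand = 1
  body = text.strip
  if (m = body.match(/\Acomplement\((.*)\)\z/))
    strand = -1          # feature lies on the reverse strand
    body = m[1]
  end
  if (m = body.match(/\Ajoin\((.*)\)\z/))
    body = m[1]          # spliced feature: comma-separated exons
  end
  ranges = body.split(',').map do |part|
    start, stop = part.split('..').map { |n| Integer(n) }
    stop ||= start       # a single position such as '56'
    start <= stop ? (start..stop) : (stop..start)
  end
  Locus.new(ranges.sort_by(&:begin), strand)
end

# Linear mapping from a bp position to an x coordinate, given the zoomed
# region (view_start..view_stop) and the picture width in points.
def bp_to_x(pos, view_start, view_stop, width)
  (pos - view_start) * width.to_f / (view_stop - view_start)
end
```

With the panel from the script above (1200 points wide, zoomed to 1..610), bp_to_x(1, 1, 610, 1200) is the left edge and bp_to_x(610, 1, 610, 1200) the right edge of the picture.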
Here's a nicer way to produce the same type of output:
# Initialize graphic for a nucleotide sequence of 1000 bp, zoomed in on the region from 1 to 600 bp
my_panel = BioExt::Graphics::Panel.new(1000, 1200, false, 1, 600)
#Create and configure tracks
track_SNP = my_panel.add_track('SNP')
track_gene = my_panel.add_track('gene')
track_transcript = my_panel.add_track('transcript')
track_SNP.feature_colour = [1,0,0]
track_SNP.feature_glyph = 'triangle'
track_gene.feature_glyph = 'directed_spliced'
track_transcript.feature_glyph = 'spliced'
track_transcript.feature_colour = [0,0.5,0]
# Add data to tracks
DATA.each do |line|
  line.chomp!
  ref, type, name, location, link = line.split(/\s+/)
  if link == ''
    link = nil
  end
  if type == 'SNP'
    track_SNP.add_feature(name, location, link)
  elsif type == 'gene'
    track_gene.add_feature(name, location, link)
  elsif type == 'transcript'
    track_transcript.add_feature(name, location, link)
  end
end
# And draw
my_panel.draw('my_panel.png')
__END__
chr1 gene CYP2D6 complement(80..120)
chr1 gene ALDH 100..449
chr1 SNP rs1234 107
chr1 gene bla complement(400..430)
chr1 SNP rs9876 44
chr1 gene some_gene complement(join(170..231,264..299,350..360,409..445))
chr1 transcript transcript1 join(250..300,390..425)
chr1 transcript transcript2 253..330
chr1 transcript transcript3 266..344
chr1 transcript transcript4 complement(join(410..430,239..286,129..151))
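The if/elsif chain in the loop above could also be replaced by a lookup keyed on the type column. The sketch below only covers the parsing and grouping part so that it runs without the library; Record and parse_records are names I made up for the illustration.

```ruby
# Hypothetical record for one whitespace-separated data line.
Record = Struct.new(:ref, :type, :name, :location, :link)

# Split each non-empty line into at most five fields; a missing link stays nil.
def parse_records(text)
  text.each_line.map(&:strip).reject(&:empty?).map do |line|
    Record.new(*line.split(/\s+/, 5))
  end
end

sample = <<~DATA
  chr1 gene CYP2D6 complement(80..120)
  chr1 SNP rs1234 107
  chr1 transcript transcript2 253..330
DATA

by_type = parse_records(sample).group_by(&:type)
by_type.each do |type, records|
  # With the real library, one would look up the track for `type` here and
  # call add_feature(record.name, record.location, record.link) per record.
  puts "#{type}: #{records.map(&:name).join(', ')}"
end
```

Keeping the tracks in a hash with the same keys (for example 'SNP', 'gene', 'transcript') would then reduce the loop body to a single add_feature call.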
If anyone is interested in getting the library behind this, just let me know. It should be really easy to incorporate this in a rails application where the data are actually stored in a database.
I wonder what role, if any, _why's Shoes thing would/could play...
UPDATE: This library has now been improved a bit and is hosted on rubyforge. You can find a tutorial, the whole API documentation, and instructions on how to install and use it at http://bio-graphics.rubyforge.org.
UPDATE TWO: Forget the previous update. I have moved the bio-graphics code to
github. See
http://github.com/jandot/bio-graphics. That should make it much easier to fork the code and get more input from other developers.