Wednesday, 8 July 2009

First test release of circular genome browser

I've worked a couple of days on pARP, the circular genome browser, and I think it's ready to be tested out by others. Consider this an alpha release: expect a lot of issues. It's easy to create regions with a negative length, for example. Also, I haven't focused yet on user-friendliness or general input files. The ways of interacting are not yet made clear to new users, and the input files still need to have fixed names and be stored in a particular folder.

pARP is designed to be a genome browser for features that are linked to other features on a genome (e.g. readpair mappings). Using a circular display, lines can be drawn connecting these features.
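To make the idea of linked features a bit more concrete, here's a minimal sketch in Ruby. This is not pARP's actual code: the `ReadPair` struct and the `point_on_circle` and `link_endpoints` helpers are hypothetical names, and the sketch assumes a uniform scale around the circle with no zooming.

```ruby
# Hypothetical sketch: names and layout are assumptions, not pARP's code.
# A readpair is reduced here to the genome positions of its two reads.
ReadPair = Struct.new(:pos1, :pos2)

# Map a genome position to an [x, y] point on a circle.
# Assumes a uniform scale: position 0 sits at angle 0, the end of the
# genome at a full turn.
def point_on_circle(position, genome_length, radius, cx = 0.0, cy = 0.0)
  angle = 2.0 * Math::PI * position / genome_length
  [cx + radius * Math.cos(angle), cy + radius * Math.sin(angle)]
end

# Endpoints of the straight line that would connect the two reads of a pair.
def link_endpoints(pair, genome_length, radius)
  [point_on_circle(pair.pos1, genome_length, radius),
   point_on_circle(pair.pos2, genome_length, radius)]
end
```

A drawing routine would then take the two points returned by `link_endpoints` and draw a line between them, one line per readpair.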

pARP always shows the whole genome. You can zoom into selected regions, but the rest is still shown, albeit squeezed together a bit more. The reason for this is that I want to show the context at all times. Suppose you zoom into two regions A and B that are linked by a large number of readpairs. If the part of the genome outside A and B were not shown, any readpair with only one of its reads in A or B would simply not be shown either. By showing the whole genome, even squeezed into a few pixels, you can at least see that some reads are linked to places outside A and B.
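One way to get this "zoomed but still complete" behaviour is a piecewise-linear scale: every base keeps a place on the circle, but bases inside a zoomed region count for more. The Ruby sketch below illustrates that idea only; it is not how pARP is actually implemented, and `Segment`, `build_segments`, `angle_for` and the single `zoom_factor` are all assumptions made for the example. It also assumes the zoomed regions don't overlap.

```ruby
# Hypothetical sketch: one possible approach, not pARP's actual code.
# A stretch of genome with a zoom level (1.0 means not zoomed).
Segment = Struct.new(:start, :stop, :zoom) do
  # Share of the circle this segment claims, before normalisation.
  def weight
    (stop - start) * zoom
  end
end

# Split the genome into consecutive segments. `zoomed` is a list of
# non-overlapping [from, to] regions; everything else gets zoom 1.0.
def build_segments(genome_length, zoomed, zoom_factor)
  segments = []
  cursor = 0
  zoomed.sort_by(&:first).each do |from, to|
    segments << Segment.new(cursor, from, 1.0) if from > cursor
    segments << Segment.new(from, to, zoom_factor)
    cursor = to
  end
  segments << Segment.new(cursor, genome_length, 1.0) if cursor < genome_length
  segments
end

# Angle in degrees for a genome position. Zoomed segments take up more of
# the circle, the rest is squeezed, but nothing is ever dropped.
def angle_for(position, segments)
  total = segments.inject(0.0) { |acc, seg| acc + seg.weight }
  covered = 0.0
  segments.each do |seg|
    if position < seg.stop
      return (covered + (position - seg.start) * seg.zoom) / total * 360.0
    end
    covered += seg.weight
  end
  360.0
end
```

With a genome of 1000 bases and the first half zoomed by a factor of 3, the first half takes up three quarters of the circle and the second half is squeezed into the remaining quarter. A read at position 750 still gets an angle, so a link to it can still be drawn.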

I've put some information on the github wiki page, such as how to interact and what the datafiles should look like.

For a little taste: here's a very brief screencast:

A lot of things still need to happen:
  • Catch a lot of edge cases
  • Incorporate a library for fast loading of features (i.e. LocusTree, which doesn't exist yet)
  • Make interaction more straightforward: use mouse for panning/zooming for example
  • About 1,472 other things that I currently forget
Also: I'm looking for a new name for pARP. pARP stands for "processing abnormal readpairs" (which is what it was originally meant for), but it's actually just a genome browser using a circular representation to show linked features. Suggestions I've already received are encircle and SqWheel or Squeal (the last two based on sequence-wheel; Squeal was my own idea, so I like that most at the moment :-) ).

A very, very big thanks goes to Jeremy Ashkenas, the author of ruby-processing. With pARP I have been pushing the boundaries of what that library does, and he has adapted it for my needs as I went. See here for his ruby-processing library. Other thanks go to my colleagues Erin, Klaudia, Jon, Nelo and Chris for their ideas.

pARP can be downloaded or cloned from github. Versions for Mac, Windows and Linux are available there as well.