Disclaimer nr 2: We are not yet git veterans ourselves, so if you see simpler ways of doing what we describe below (or spot any errors), please let us know so we can update this post and put it onto the bioruby wiki as well.
Disclaimer nr 3: This is a proposal. Bioruby has not moved to git yet. However, we are working on it and trying to get the support from the main developers. Update: bioruby has been converted to git (thanks, Anthony) and is now available on github. So you can clone or fork now. However, the official development is still on CVS.
Update: I have discovered a very good presentation on how to work and collaborate with git. If you're interested in using git, have a look at this talk. You can fast forward to 1hr10min27sec where he starts talking about the practical use. Very strongly recommended.
In this blog post, we try to give some guidelines on how people can contribute to the bioruby code if/when that code becomes available on github. The rationale for what we describe here is based on the premise that the job of the bioruby maintainer(s) should be as simple as possible. Keeping their workload light means that there are some additional steps that any contributor has to go through.
What follows is only a proposal. This is not a standard operating procedure; it’s only a guideline. Feel free to digress from it or use a completely different workflow. But remember: keep it simple for the maintainers.
Distributed source control. Git is a truly distributed source control system, and in contrast with CVS or SVN, there is no central repository. With CVS or SVN, every time someone checks out or exports the repository, his own copy is so-to-speak subordinate to the central one. Not so with git: every single clone is equivalent; none is more important than another. In technical terms, the copy of bioruby on your laptop is as important as the one that for example Toshiaki maintains. One of the big advantages is that continued support is more likely should a key developer move on to pastures new (or github goes up in smoke), since the community can simply elect a new "blessed" repository (see below).
A blessed repository. Noticed that I said “in technical terms”? In some cases, like for bioruby, we would obviously like to have some repository that we would consider the ‘true’ one. Enter the notion of a “blessed” repository. This is purely by convention: the community appoints one particular repository as the main one.
A good place to put this repository is Github. For bioruby, this blessed repository will start out to be http://github.com/bioruby/bioruby. Official bioruby builds will take place from there. However, development can take place in additional, personal repositories.
Forking. Any development of bioruby would happen in clones of this blessed repository. Using the “fork” button on Github not only creates a clone, but also automatically puts that clone on Github itself. (Forking has the added value of the github social aspect where the network of changes can be viewed.) So if I wanted to contribute, I would fork from bioruby/bioruby (that is: username/projectname), which would automatically create http://github.com/jandot/bioruby.
Guidelines for contribution
There are several ways of contributing: you can either create a patch or use a fork/clone.
Here we’ll try to explain how contribution could work with forking for Bioruby, both from the individual contributor’s view and from the view of the person(s) managing the blessed repository. What follows is not a Standard Operating Procedure. You do not have to do it like this. However, it will make it easier for the blessed maintainers to merge your code.
A. Using patches
A.1 The contributor
The simplest way to contribute is to send in patches. RailsCasts has a great screencast explaining this.
Creating a fork
Click the “fork” button on the bioruby/bioruby page. This will create a new repository in your own namespace: jandot/bioruby. It’s on this clone that you will be working; you will not touch bioruby/bioruby itself.
To actually start making changes (e.g. you want to add functionality for Ensembl cigar format), you create a local clone on your own computer (step 2 in the picture):
git clone email@example.com:jandot/bioruby.git
This will be your local master branch. The first thing to do after cloning your own fork is to create an additional branch for the feature you want to work on: add_cigar_format (step 3).
The command to do this:
git checkout -b add_cigar_format
This will create the new branch and check it out so it becomes your active one. From the fluxbox wiki (http://fluxbox-wiki.org/index.php/Git_-_using): “Branching and merging is very powerful in git. You can create thousands of local branches, one for each bug you work on or feature you implement. It is good practice to do this because it safes your from accidentally pushing changes to another repository.”
So you’ll end up with 2 branches (do a “git branch”):
- master: a reflection of the master branch of your remote repository
- add_cigar_format: where the actual work is done
The “git branch” should have a star in front of add_cigar_format because that’s your current branch. If master is starred, do a “git checkout add_cigar_format” to change to this branch.
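The branching step can be tried out safely in a throwaway repository before touching a real clone. The following is only a minimal sketch: the scratch directory, the placeholder README and the user identity are assumptions made for illustration, not part of bioruby.

```shell
# Sketch: create a feature branch in a scratch repository and confirm it is active.
set -e
repo=$(mktemp -d)                          # hypothetical scratch directory
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example Contributor" # hypothetical identity, local to this repo
git config user.email "contributor@example.invalid"
echo "bioruby placeholder" > README
git add README
git commit -q -m "Initial commit"

git checkout -q -b add_cigar_format        # create the branch and switch to it
git branch                                 # the star marks the active branch
current=$(git symbolic-ref --short HEAD)
echo "current branch: $current"
```

If the star ends up in front of master instead, “git checkout add_cigar_format” switches back.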
Now you can edit and change to your heart’s content. The current branch you’re working on maintains an index of files that git is tracking. You can find the current status of the branch by typing
git status
which will list the current status of all the files. Changes can be added to the index by using the command
git add file
The index is an intermediary between the working copy files you are editing and the changes committed to the repository. Changes can be committed from the index to your local repository using the command
git commit
This command will also prompt you for a message describing the commit. Try not to do too much work before committing. A single commit should concern (part of) a single conceptual change with its tests. It’s good practice to commit often (and several commits per conceptual change), but do try not to mix different changes into one commit. Mixing them makes it harder afterwards if a commit has to be reverted.
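The path from working copy to index to repository is easiest to see by watching “git status” between the steps. A minimal sketch, assuming a scratch repository and a hypothetical file called cigar.rb:

```shell
# Sketch: follow one file from untracked, to staged in the index, to committed.
set -e
repo=$(mktemp -d)                          # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Example Contributor" # hypothetical identity
git config user.email "contributor@example.invalid"

echo "first version" > cigar.rb            # hypothetical file name
git status --short                         # shows "?? cigar.rb": not tracked yet
git add cigar.rb                           # copy the change into the index
git status --short                         # shows "A  cigar.rb": staged for commit
git commit -q -m "Add cigar format parser skeleton"
git status --short                         # prints nothing: working copy is clean
commits=$(git rev-list --count HEAD)
echo "commits so far: $commits"
```

Passing the message with -m avoids the editor prompt, which is convenient for small commits.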
Commits are applied only to the currently checked-out branch (i.e. add_cigar_format), and do not affect any other branches or the original repository. Also, if you have to make site-specific changes (e.g. hard-coding a proxy server in one of the files), try to put those changes in one single commit. This will make it easier to remove them later.
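To illustrate why a single commit for site-specific changes pays off, here is a hedged sketch in which such a commit is undone later with “git revert” while the surrounding work stays intact. The file names and the proxy value are made up for the example.

```shell
# Sketch: isolate a site-specific change in one commit, then revert only that commit.
set -e
repo=$(mktemp -d)                          # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Example Contributor" # hypothetical identity
git config user.email "contributor@example.invalid"

echo "feature code" > feature.rb           # hypothetical feature file
git add feature.rb
git commit -q -m "Add feature"

echo "proxy = 'proxy.example.invalid'" > site_config.rb   # made-up local setting
git add site_config.rb
git commit -q -m "Site-specific: hard-code proxy"
site_sha=$(git rev-parse HEAD)             # remember the SHA1 of that commit

echo "more feature code" >> feature.rb
git commit -q -am "Extend feature"

git revert --no-edit "$site_sha"           # new commit that undoes only the proxy change
git log --oneline
```

Note that the revert adds a new commit rather than erasing the old one, so the history still shows both.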
Preparing the patch
When you think your change is ready for inclusion in the blessed repository (and you’ve included tests as well), you can create a patch file. To make sure that the blessed repository maintainers will have no problem merging your version, you will want to make the patch reflect the latest version of the blessed repository (step 5).
git remote add blessed git://github.com/bioruby/bioruby.git
git fetch blessed
So now you can check that the patch you will submit contains only the changes that you want to be included in the blessed repository. One of the things to look out for is that there are no site-specific configurations in your branch (e.g. a hard-coded proxy or directory path, a stray “STDERR.puts”, ...). Hopefully, you put all those site-specific changes in a separate commit as described above. To get rid of them, you just revert that commit. “git log” will show you the SHA1 of that particular commit (the long crazy string), and you just run “git revert [that_SHA1]”. After that, check your changes:
git log -p blessed/master..add_cigar_format
When that’s done, you can create the actual patch (step 6):
git format-patch blessed/master..add_cigar_format
This creates one patch file per commit, which you can send to the maintainer (step 7). And you’re done…
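The patch preparation can be rehearsed entirely on a local machine. In the sketch below a local directory stands in for the blessed repository on github; that directory, the file names and the user identity are assumptions for illustration only.

```shell
# Sketch: compare a feature branch with blessed/master and create a patch from it.
set -e
base=$(mktemp -d)                          # hypothetical scratch area

# A local stand-in for the blessed repository.
git init -q "$base/seed"
cd "$base/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Maintainer"  # hypothetical identity
git config user.email "maintainer@example.invalid"
echo "bioruby placeholder" > README
git add README
git commit -q -m "Initial commit"

# The contributor's clone with a feature branch.
git clone -q "$base/seed" "$base/work"
cd "$base/work"
git config user.name "Example Contributor"
git config user.email "contributor@example.invalid"
git checkout -q -b add_cigar_format
echo "cigar parser" > cigar.rb             # hypothetical file name
git add cigar.rb
git commit -q -m "Add cigar format"

git remote add blessed "$base/seed"
git fetch -q blessed
git log --oneline blessed/master..add_cigar_format   # commits that would be sent
patches=$(git format-patch blessed/master..add_cigar_format)
echo "$patches"                            # 0001-Add-cigar-format.patch
```

The file name is derived from the commit message, so a descriptive first line helps the maintainer as well.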
A.2 The maintainer
The maintainer gets an email from someone containing a patch. The first thing to do is to create a new branch and apply the patch to that branch.
git checkout -b feature_c
git am < 0001-feature_C_commit_message.patch
Of course he would want to check the changes by comparing the new version of the code with the one that is in the blessed repository (i.e. the master).
git log -p master..feature_c
If everything looks OK, he can then merge the changes into master itself and push it up onto github.
git checkout master
git merge feature_c
And he’s ready. Only thing left to do is remove the branch he created during the process.
git branch -d feature_c
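The maintainer’s side of the patch workflow can be sketched end to end as follows. Everything here is local and hypothetical: the two scratch repositories, the file names and the identities only serve to make the example runnable.

```shell
# Sketch: apply a contributor's patch on a temporary branch, review it, merge, clean up.
set -e
base=$(mktemp -d)                          # hypothetical scratch area

# Stand-in for the maintainer's copy of the blessed repository.
git init -q "$base/blessed"
cd "$base/blessed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Maintainer"  # hypothetical identity
git config user.email "maintainer@example.invalid"
echo "bioruby placeholder" > README
git add README
git commit -q -m "Initial commit"

# Contributor side: produce the patch file that would arrive by email.
git clone -q "$base/blessed" "$base/contrib"
cd "$base/contrib"
git config user.name "Example Contributor"
git config user.email "contributor@example.invalid"
echo "cigar parser" > cigar.rb             # hypothetical file name
git add cigar.rb
git commit -q -m "Add cigar format"
git format-patch -o "$base" origin/master > /dev/null

# Maintainer side: apply, inspect, merge and remove the temporary branch.
cd "$base/blessed"
git checkout -q -b feature_c
git am -q < "$base/0001-Add-cigar-format.patch"
git log --oneline master..feature_c        # what the patch adds on top of master
git checkout -q master
git merge -q feature_c                     # fast-forward in this simple case
git branch -d feature_c
```

Because “git am” keeps the author recorded in the patch, the contributor still gets credit in the history.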
B. Using a pull request
B.1 The contributor
This type of contribution starts out exactly the same as the one with patches: you fork/clone, create a feature branch and hack away.
Preparing the pull
When you think your change is ready for inclusion in the blessed repository, you will create a branch specific for this pull (e.g. called to_pull; step 5): “git checkout -b to_pull”.
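To make sure that the blessed repository maintainers will have no problem merging your version, you have to rebase your branch onto the latest blessed master and reset any files that you changed only for your own environment to their blessed version (steps 6 and 7).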
git remote add blessed git://github.com/bioruby/bioruby.git
git fetch blessed
git rebase blessed/master
git checkout blessed/master fileA_for_user_environment_only
git checkout blessed/master fileB_for_user_environment_only
At this point, a “git log -p blessed/master..to_pull” can help you check that the differences between your to_pull branch and the blessed branch contain only the changes that you intend to be pulled (e.g. that no “STDERR.puts” statements are left in).
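What the rebase achieves can be checked by counting commits on either side of the range. The sketch below assumes local scratch repositories in place of github, with made-up file names; after the rebase, to_pull should be one commit ahead of blessed/master and none behind.

```shell
# Sketch: rebase a to_pull branch onto a blessed master that has moved on.
set -e
base=$(mktemp -d)                          # hypothetical scratch area

git init -q "$base/blessed"                # local stand-in for the blessed repository
cd "$base/blessed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Maintainer"  # hypothetical identity
git config user.email "maintainer@example.invalid"
echo "bioruby placeholder" > README
git add README
git commit -q -m "Initial commit"

git clone -q "$base/blessed" "$base/work"  # the contributor's clone
cd "$base/work"
git config user.name "Example Contributor"
git config user.email "contributor@example.invalid"
git checkout -q -b to_pull
echo "cigar parser" > cigar.rb             # hypothetical file name
git add cigar.rb
git commit -q -m "Add cigar format"

cd "$base/blessed"                         # meanwhile, the blessed master moves on
echo "upstream change" > NEWS
git add NEWS
git commit -q -m "Upstream change"

cd "$base/work"
git remote add blessed "$base/blessed"
git fetch -q blessed
git rebase -q blessed/master               # replay to_pull on top of the new master
git log --oneline blessed/master..to_pull
ahead=$(git rev-list --count blessed/master..to_pull)
behind=$(git rev-list --count to_pull..blessed/master)
echo "ahead: $ahead, behind: $behind"
```

A branch that is zero commits behind can be merged by the maintainer as a simple fast-forward.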
When you’re satisfied, you can put the to_pull branch onto your remote repository so it becomes available for the maintainers of the blessed repository (step 8):
git push origin to_pull:refs/heads/to_pull
and push the “Send pull request” button on github.
After that, wait to hear whether your change has been accepted. When your remote to_pull branch becomes obsolete, you can remove it (step 10) with
git push origin :to_pull
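Publishing and later removing the branch can be rehearsed against a local bare repository, which here plays the role of your fork on github. This is a sketch under that assumption; the paths and identity are hypothetical.

```shell
# Sketch: publish a to_pull branch on a remote, then delete it again.
set -e
base=$(mktemp -d)                          # hypothetical scratch area
git init -q --bare "$base/fork.git"        # local stand-in for the fork on github

git init -q "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Contributor" # hypothetical identity
git config user.email "contributor@example.invalid"
echo "bioruby placeholder" > README
git add README
git commit -q -m "Initial commit"
git remote add origin "$base/fork.git"
git push -q origin master

git checkout -q -b to_pull
git push -q origin to_pull:refs/heads/to_pull      # publish the branch
git ls-remote --heads origin                       # lists master and to_pull
published=$(git ls-remote --heads origin to_pull | wc -l | tr -d ' ')

git push -q origin :to_pull                        # empty source = delete on the remote
remaining=$(git ls-remote --heads origin to_pull | wc -l | tr -d ' ')
echo "published: $published, remaining: $remaining"
```

Deleting the remote branch leaves your local to_pull branch untouched.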
B.2 The maintainer
The first thing the maintainer has to do, is get the latest version of his own (i.e. the blessed) repository.
git clone firstname.lastname@example.org:bioruby/bioruby.git
Then he can get a copy of your to_pull branch:
git remote add your_name git://github.com/your_name/bioruby.git
git checkout -b your_name/to_pull
git pull your_name to_pull
...and check what the change looks like.
git log -p master..your_name/to_pull
If he’s satisfied, he can merge your changes into the blessed master branch.
git checkout master
git merge your_name/to_pull
If there are no conflicts, he can then push the new version up onto github.
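The push command itself is not shown above. As a hedged sketch of the whole maintainer round trip, the example below uses “git fetch” and the remote-tracking branch your_name/to_pull, which is a variation on the checkout/pull commands given earlier, and assumes the blessed remote is called origin. Local directories stand in for both github repositories.

```shell
# Sketch: fetch a contributor's to_pull branch, merge it into master, push the result.
set -e
base=$(mktemp -d)                          # hypothetical scratch area
git init -q --bare "$base/blessed.git"     # local stand-in for bioruby/bioruby on github

git init -q "$base/maintainer"             # the maintainer's working clone
cd "$base/maintainer"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Maintainer"  # hypothetical identity
git config user.email "maintainer@example.invalid"
echo "bioruby placeholder" > README
git add README
git commit -q -m "Initial commit"
git remote add origin "$base/blessed.git"
git push -q origin master

git clone -q "$base/maintainer" "$base/your_name"   # stand-in for the contributor's fork
cd "$base/your_name"
git config user.name "Example Contributor"
git config user.email "contributor@example.invalid"
git checkout -q -b to_pull
echo "cigar parser" > cigar.rb             # hypothetical file name
git add cigar.rb
git commit -q -m "Add cigar format"

cd "$base/maintainer"
git remote add your_name "$base/your_name"
git fetch -q your_name                     # creates your_name/to_pull locally
git log --oneline master..your_name/to_pull
git checkout -q master
git merge -q your_name/to_pull
git push -q origin master                  # publish the merged master
```

Fetching first keeps the contributor’s work out of master until it has been reviewed.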
- The thing about git
- Using git within a team -> must-read
- A large but very informative and simple document
- Git repositories for maintainers
- Using git for samba development
- Git howto
- GitGuide: intermediate and advanced git