Saturday, October 23, 2010

At the Google Summer of Code Mentor Summit

I am spending this weekend at the Google Summer of Code Mentor Summit. It is a great event hosted at the Google campus in Mountain View. Below is more info about the sessions I am attending.


Overview of all sessions

Student retention

Notes at

Measuring Usability

Criteria for usability:

Notes at
  1. Abstraction Level
  2. Closeness of mapping
  3. Consistency
  4. Diffuseness/terseness
  5. Error-Proneness
  6. Hard mental operations
  7. Hidden dependencies
  8. Juxtaposability
  9. Premature Commitment
  10. Progressive Evaluation
  11. Role expressiveness
  12. Secondary Notation
  13. Viscosity
  14. Visibility

How to make a team use agile development if they have never done it before
Notes at:

Liberate your data
Notes at:

Sunday ...

OpenStreetMap
Using Cherokee for quick OSM rendering

Advanced Trolling
(This seems to be the most popular session so far ;-)
Notes at:
the CRAPL license

Open Source Science

Jim Procter and I organized the Open Source Science session:

Monday, October 11, 2010

New Paper: Precalculated Protein Structure Alignments at the RCSB PDB website

Bioinformatics just made our latest paper available as an early preview:
Precalculated Protein Structure Alignments at the RCSB PDB website

Summary: With the continuous growth of the RCSB Protein Data Bank (PDB; Berman et al., 2000), providing an up-to-date systematic structure comparison of all protein structures poses an ever-growing challenge. Here we present a comparison tool for calculating both 1D protein sequence and 3D protein structure alignments. This tool supports various applications at the RCSB PDB website. First, a structure alignment web service calculates pairwise alignments. Second, a stand-alone application runs alignments locally and visualizes the results. Third, pre-calculated 3D structure comparisons for the whole PDB are provided and updated on a weekly basis. These three applications allow users to discover novel relationships between proteins available either at the RCSB PDB or provided by the user.

Availability and Implementation: A web user interface is available at
The source code is available under the LGPL license from
A source bundle, prepared for local execution, is available from

UPDATE: the link below should provide free access:
Read the full paper here

Improved Reporting Features at the RCSB PDB site

One feature of the RCSB PDB site that many people are not aware of is the powerful tabular reporting tool. Any search result can be used to generate one of several reports (e.g. image collages, pre-defined reports, fully customizable tables, and export to Excel; see screenshot below).

In this release, Chuxiao added better reporting for ligands. There are also plenty of new options for the fully customizable reports, based on feedback we have received from our users.

Friday, October 8, 2010

BioJava's Google Summer of Code summary

Today, a slightly belated summary of what happened during the Google Summer of Code at the BioJava project:

Our two students, Mark Chapman and Jianjiong Gao, did an amazing job on their two projects, "All Java Multiple Sequence Alignment" (MSA) and "Identification and Classification of Posttranslational Modification of Proteins" (PTM).

For multiple sequence alignments, we now have a flexible, multi-threaded MSA implementation that works in linear space and optionally lets users define anchors that are used when building up the multiple alignment. The code is available as part of the new biojava3-alignment module.
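To give a feeling for what "linear space" means for alignments, here is a minimal sketch in Python. It is not the BioJava code and makes no claim about how biojava3-alignment works internally; the function name and the scoring values (match, mismatch, gap) are assumptions chosen for illustration. It computes the score of a global pairwise alignment while keeping only two rows of the dynamic programming matrix in memory, so memory grows with the length of one sequence instead of the product of both lengths.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of a and b (Needleman-Wunsch style).

    Only the previous and the current row of the DP matrix are kept,
    so memory use is linear in len(b). Scoring values are illustrative.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
        prev = curr
    return prev[len(b)]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Note that this only returns the score. Recovering the alignment itself in linear space needs a divide-and-conquer scheme such as Hirschberg's algorithm, which is beyond this sketch.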

The Posttranslational Modification module (biojava3-protmod) can detect three different types of protein modifications in protein structures. It comes with an XML file and Java data structures to store information about different types of protein modifications, and contains entries from RESID, PDBCC and PSI-MOD. There is also a visualisation component to display cross-linked PTMs on a sequence viewer.

Both Mark and Jianjiong have expressed interest in maintaining and further developing their modules, and I am looking forward to interacting more with them in the future. I want to thank the Mentors and Co-Mentors Peter Rose, Kyle Ellrott and Scooter Willis for their help and guidance on the projects; without them this would not have been possible. Thanks also to Robert Buels and the Open Bioinformatics Foundation for organizing our applications for GSoC and, last but not least, Google for sponsoring this Summer of Code.

Thursday, October 7, 2010

New iPhone app at RCSB PDB (beta)

The latest RCSB PDB release features a first version of an iPhone application. It is provided as an HTML5-based application, which means you can install it without going to the App Store. Simply point your iPhone's Safari browser to and click "yes" a couple of times. It is best to do this while you are on a wireless connection, since the application installs some data for quicker data access.

Gregg, the author of this application, also made a screencast with the installation instructions. You can watch it here:

Wednesday, October 6, 2010

New RCSB PDB Feature: Faceted Browsing

One of the features I find most exciting in the latest RCSB PDB web site release is "faceted browsing". Similar to an online shopping site, which lets you drill down through product categories, it is now possible to drill down through lists of protein structures using categories like Resolution, Organism, and Polymer Type, to name just a few.

You can easily start browsing by clicking the total number of structures on top of every page. Since this feature became available (thanks, Dimitris!), I have found myself using it all the time, and I perform far fewer "advanced queries" because this new feature is so easy and quick to use. Let us know if you want to have additional categories.
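For readers who have not come across the idea before, here is a small sketch of how faceted browsing works in principle. It is written in Python and is not the RCSB PDB implementation; the records, field names and function names are all made up for illustration. Each facet shows how many entries fall into each category, and selecting a category narrows the list before the counts are computed again.

```python
from collections import Counter

# Hypothetical records; ids, field names and values are invented for this sketch.
STRUCTURES = [
    {"id": "A", "organism": "Homo sapiens", "polymer_type": "Protein"},
    {"id": "B", "organism": "Homo sapiens", "polymer_type": "DNA"},
    {"id": "C", "organism": "Mus musculus", "polymer_type": "Protein"},
]


def facet_counts(entries, facet):
    """Count how many entries fall into each category of one facet."""
    return Counter(entry[facet] for entry in entries)


def drill_down(entries, **selected):
    """Keep only the entries that match every selected facet value."""
    return [
        entry
        for entry in entries
        if all(entry.get(key) == value for key, value in selected.items())
    ]


if __name__ == "__main__":
    # Start with everything, pick one organism, then look at the remaining facet.
    print(facet_counts(STRUCTURES, "organism"))
    human = drill_down(STRUCTURES, organism="Homo sapiens")
    print(facet_counts(human, "polymer_type"))
```

A real site would compute these counts in the database or search index rather than in memory, but the drill-down logic is the same.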

Tuesday, October 5, 2010

October release of RCSB PDB website

The latest release of the RCSB PDB website features a number of exciting new features, some of which I will present in more detail in follow-up blog posts.

Above is a screenshot of the new Category Browser for the Molecule of the Month.

Here is a list of all new features:

Molecule of the Month Improvements
PDBMobile for the iPhone
Query Result Browser Improvements
Chemical Components
Tabular Report Improvements
Comparison Tool Improvements
General Site Improvements