Sunday, October 28, 2012

RCSB PDB web site update Fall 2012

New Features at the RCSB PDB web site

This week the RCSB PDB released its latest major web site update. Here is a quick description of some of the new features.

Protein Feature View

One of the main new features is the Protein Feature View. It lets you compare the full-length protein sequence, as defined by UniProt, with the regions that have been determined in 3D and are available together with their coordinates from the Protein Data Bank. Besides visualizing the PDB and UniProt relationships, the new view also adds annotations for a more comprehensive understanding of the protein. External data, such as Pfam domains or regions for which homology models are available from the ProteinModelPortal, are indicated. Some annotations are calculated on the fly: protein disorder regions, as predicted by Peter Troshin's BioJava implementation of RONN, are available as a histogram-style track. Finally, regions with increased hydrophobicity can be spotted by looking at the Hydropathy track.
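To give an idea of what a hydropathy track represents, here is a minimal Python sketch of a sliding-window hydropathy profile. It assumes the widely used Kyte-Doolittle scale and a window of 9 residues; the scale, the window size, and the function name `hydropathy_profile` are illustrative assumptions, not necessarily what the RCSB PDB site uses.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter codes).
# Assumption: the site's Hydropathy track may use a different scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of every window along the sequence.

    The result has len(sequence) - window + 1 values, one per window
    start position. Positive values indicate hydrophobic stretches.
    """
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    unknown = set(sequence) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError("unsupported residue codes: " + ", ".join(sorted(unknown)))

    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical example sequence: a hydrophobic stretch flanked by charged residues.
    example = "KKDEELLIVVAILLFGDDKKRR"
    for position, value in enumerate(hydropathy_profile(example), start=1):
        print(position, round(value, 2))
```

Plotting these values against the window position gives a curve in which peaks correspond to hydrophobic regions, for example potential membrane-spanning segments.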

The Protein Feature View is built with SVG graphics and makes extensive use of the jQuery-SVG library. Using SVG for a prominent feature on the site (it is on every protein-explorer page) has become possible because most modern browsers now support this type of graphics. However, a number of users are still stuck with old browser versions. According to our web site traffic logs, this number is rapidly declining, and we estimate that currently fewer than 15% of our users can't use the new view. These users won't see error messages on the protein-explorer page, though. The graphics will simply not be displayed, and the page falls back gracefully to the way it looked before the graphics were introduced.


Better Pfam integration

Another new feature of this release is better integration with Pfam. Pfam family names are now searchable, and one can quickly look up all protein structures related to these families. Since Pfam is used in structural genomics projects to prioritize targets for crystallization, a possible use case is to look up domains of unknown function (DUFs) and check whether 3D coordinates have already been determined for them. As mentioned above, Pfam domains can be viewed as part of the new Protein Feature View. Up-to-date Pfam-PDB mappings are calculated weekly by submitting newly released PDB entries to the HMMER3 web site. This process is described in more detail at the Pfam blog site.

Searching and Reporting

This RCSB PDB web site update also improves searching and reporting. RCSB searches now better support poly-proteins and their sub-components (see screenshot above). There is also better support for searching drug names, and the Ligand Summary page (e.g. for Lipitor) now shows more information about drugs, coming from DrugBank. Once a search has been performed, there are now four different types of reports available for investigating the results. Besides the "traditional" search results, there is now a "condensed" view, which provides a compact summary of results. The "gallery" provides images for the proteins that have been found in the search. A "timeline" gives a historic overview of when proteins were released in the PDB.

A full description of all the new features is (as always) available on the What's New Page.