Experimentally determined protein tertiary structures are rapidly accumulating in a database, partly due to structural genomics projects. These include proteins of unknown function, whose function has not been investigated experimentally and could not be predicted by conventional sequence-based searches. Such uncharacterized protein structures highlight the urgent need for computational methods that annotate proteins from their tertiary structures, including function annotation methods that characterize local protein surfaces. Toward structure-based protein annotation, we have developed the VisGrid algorithm, which uses the visibility criterion to characterize local geometric features of protein surfaces. Unlike existing methods, which are concerned only with identifying pockets that could be potential ligand-binding sites, VisGrid also aims to identify large protrusions, hollows, and flat regions, which together characterize the geometric features of a protein structure. The visibility used in VisGrid is defined as the fraction of visible directions from a target position on a protein surface. A pocket or a hollow is recognized as a cluster of positions with small visibility. A large protrusion in a protein structure is recognized as a pocket in the negative image of the structure. VisGrid correctly identified 95.0% of ligand-binding sites as one of the three largest pockets in 5616 benchmark proteins. To examine how the natural flexibility of proteins affects pocket identification, VisGrid was tested on structures distorted by molecular dynamics simulation. Sensitivity decreased by approximately 20% for structures with a root mean square deviation of 2.0 Å from the original crystal structure, but specificity was not much affected. Because of its intuitiveness and simplicity, the visibility criterion will lay the foundation for characterization and function annotation of the local shape of proteins.
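The visibility definition above (the fraction of visible directions from a target position) can be illustrated with a minimal Python sketch. This is not the VisGrid implementation: the boolean voxel grid, the 26 neighbor directions, the `max_steps` ray length, and the function name `visibility` are all assumptions chosen for illustration.

```python
from itertools import product


def visibility(occupied, position, max_steps=10):
    """Fraction of the 26 grid directions that are unobstructed.

    occupied  : nested lists, occupied[x][y][z] is True for a protein voxel
                (hypothetical representation, assumed for this sketch)
    position  : (x, y, z) index of the target grid point
    max_steps : how far each ray is followed (assumed cutoff)
    """
    nx, ny, nz = len(occupied), len(occupied[0]), len(occupied[0][0])
    # All neighbor directions on a cubic grid, excluding the zero vector.
    directions = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]

    visible = 0
    for dx, dy, dz in directions:
        x, y, z = position
        blocked = False
        for _ in range(max_steps):
            x, y, z = x + dx, y + dy, z + dz
            if not (0 <= x < nx and 0 <= y < ny and 0 <= z < nz):
                break  # the ray left the grid, so this direction is open
            if occupied[x][y][z]:
                blocked = True  # the ray hit a protein voxel
                break
        if not blocked:
            visible += 1
    return visible / len(directions)


# Toy example: a flat "floor" of occupied voxels at z == 0.
floor = [[[z == 0 for z in range(5)] for _ in range(5)] for _ in range(5)]
v = visibility(floor, (2, 2, 1))  # the 9 downward directions are blocked
```

Under these assumptions, a grid point deep inside a pocket would have most rays blocked and therefore a small visibility, while a point on a flat or protruding surface would have a larger one.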
Cite this work
Researchers should cite this work as follows:
- Li, B., Turuvekere, S., Agrawal, M., Kihara, D. (2013). Characterization of Local Geometry of Protein Surfaces with the Visibility Criteria. Purdue University Research Repository. doi:10.4231/D3BZ61792
- 48set.dat: lists the 48 bound and unbound proteins with their RMSD values. Format: Bound Unbound RMSD (A)
- index48.txt: lists the 48 bound and unbound proteins with their corresponding chains and ligand names. Some proteins have more than one ligand; there are 65 ligands in total. Format: Bounded Chain UnBounded Chain Ligand
- 86set.dat: lists the 86 bound and unbound proteins with their RMSD values. Format: Bound Unbound RMSD (A)
- index86.txt: lists the 86 bound and unbound proteins with their corresponding chains and ligand names. Format: Bounded Chain UnBounded Chain Ligand
- Example of input parameters for NAMD: NAMD needs a namd file, for example 182l.X.BZF.namd, to run the simulation. The namd file in turn requires the pdb file 182l.X.BZF.pdb, the psf file 182l.X.BZF.psf, and par_all22_prot.inp. Running namd2 182l.X.BZF.namd usually performs the simulation. The ligand file is 182l.BZF. All of these files are archived in 182l.tar.gz.
- Tar files of distorted structures: the 960 distorted structures are archived in distorted.tar.gz (48M).
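As a minimal sketch of how the set files might be read, the Python below parses lines in the stated "Bound Unbound RMSD (A)" layout. It assumes the three columns are separated by whitespace, which the description does not confirm; the function name `parse_set_lines` and the placeholder identifiers in the example are hypothetical.

```python
def parse_set_lines(lines):
    """Parse lines of the form '<bound> <unbound> <rmsd>'.

    Assumes whitespace-separated columns (not confirmed by the file
    description). Lines with fewer than three fields, or whose third
    field is not a number (such as a header row), are skipped.
    """
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            rmsd = float(fields[2])
        except ValueError:
            continue  # header or malformed row
        records.append((fields[0], fields[1], rmsd))
    return records


# Placeholder rows, not taken from the real data files.
example = ["Bound Unbound RMSD (A)", "xxxx yyyy 1.25"]
records = parse_set_lines(example)

# To read an actual file, pass the open file handle instead:
#     with open("48set.dat") as fh:
#         records = parse_set_lines(fh)
```

Returning plain tuples keeps the sketch free of dependencies; if the real files turn out to use a different delimiter, only the `line.split()` call would need to change.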