
PFAM assignments in the PDB improved by consensus sequences, HMM-HMM alignments, and structure alignments.


Protein Common Interfaces Database

(Based on Pfam v31.0 and the PDB of June 2018)

ProtCID now contains clusters of Pfam domain interfaces, Pfam-peptide interfaces, and Pfam-DNA/RNA/ligand interactions.

ProtCID is a database of similar protein-protein interfaces in crystal structures of homologous proteins. Its main goal is to identify and cluster homodimeric and heterodimeric interfaces observed in multiple crystal forms of homologous proteins. Such interfaces, especially of non-identical proteins or protein complexes, have been associated with biologically relevant interactions [1]. However, we do not explicitly assign a probability of biological relevance to each interface cluster. For more details about the algorithm and benchmarking, please refer to our paper "Statistical Analysis of Interface Similarity in Crystals of Homologous Proteins" and the Help File.

A common interface here indicates chain-chain or domain-domain interactions that occur in different crystal forms. For chain-chain interfaces, all protein sequences in the PDB are assigned a "PFAM chain architecture", which denotes the ordered PFAM assignments for that sequence, e.g. (Pkinase) or (Cyclin_N)_(Cyclin_C). Then we compare homodimeric interfaces in all crystals that contain a particular architecture, for instance (Pkinase), regardless of whether there are other protein types in the crystals. We also compare all interfaces between two different PFAM architectures in all PDB entries that contain them, for instance (Pkinase) and (Cyclin_N)_(Cyclin_C). For both homodimers and heterodimers, the interfaces are clustered into common interfaces based on a similarity score (Q score).
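As an illustration of the architecture notation above, the following minimal Python sketch builds an architecture string from the ordered Pfam assignments of one chain. The function name and the (start position, Pfam ID) input format are assumptions made for this example; they are not part of ProtCID itself.

```python
from typing import List, Tuple


def chain_architecture(domains: List[Tuple[int, str]]) -> str:
    """Build a Pfam chain architecture string for one sequence.

    `domains` is a hypothetical input format: (start_position, pfam_id)
    pairs for each Pfam assignment on the chain, in any order.
    """
    # Order the assignments along the sequence, N-terminus first.
    ordered = sorted(domains, key=lambda d: d[0])
    # Wrap each Pfam ID in parentheses and join with underscores.
    return "_".join(f"({pfam_id})" for _, pfam_id in ordered)


# Start positions below are made up for illustration.
print(chain_architecture([(1, "Pkinase")]))
# (Pkinase)
print(chain_architecture([(210, "Cyclin_C"), (40, "Cyclin_N")]))
# (Cyclin_N)_(Cyclin_C)
```

Chains that yield the same string would then be compared with each other when looking for common homodimeric interfaces.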

For domain-domain interfaces, we use Pfam domains defined in our PDBfam database. PDB entries are grouped into Pfam-Pfam relations. One entry may belong to several Pfam-Pfam relations if it contains more than one Pfam domain. Domain-domain interfaces are also clustered based on Q scores.
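The grouping of entries into Pfam-Pfam relations can be sketched as follows. This is a simplified illustration, not the ProtCID implementation: it assumes that every pair of Pfam domains found in an entry (including a domain paired with itself) defines a relation, whereas a real pipeline would presumably keep only pairs that actually form an interface in the crystal. The function name, input format, and entry IDs are hypothetical.

```python
from itertools import combinations_with_replacement
from typing import Dict, List, Set, Tuple


def pfam_pfam_relations(
    entry_domains: Dict[str, Set[str]]
) -> Dict[Tuple[str, str], List[str]]:
    """Map each unordered Pfam pair to the entries that contain both domains.

    `entry_domains` maps an entry ID to the set of Pfam IDs found in it.
    """
    relations: Dict[Tuple[str, str], List[str]] = {}
    for entry_id in sorted(entry_domains):
        pfams = sorted(entry_domains[entry_id])
        # Sorting makes (A, B) and (B, A) collapse to one key.
        for pair in combinations_with_replacement(pfams, 2):
            relations.setdefault(pair, []).append(entry_id)
    return relations


# Entry IDs here are placeholders, not real PDB codes.
example = {"entryA": {"Pkinase", "Cyclin_N"}, "entryB": {"Pkinase"}}
for pair, entries in sorted(pfam_pfam_relations(example).items()):
    print(pair, entries)
```

In this example the entry with two domains appears under three relations, while the single-domain entry appears only under the Pkinase-Pkinase relation.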

We report the number of crystal forms that contain a common interface, the number of PDB entries, the number of PDB and PISA biological unit annotations that contain the same interface, the average surface area, and the minimum sequence identity of proteins that contain the interface. We find that PDB and PISA are not always consistent in their biological units in a homologous family, even when an interface is present in all crystal forms. Therefore, our data provide an independent check on publicly available annotations of biological interactions for PDB entries.

Pfam-peptide and Pfam-ligand interactions are grouped by the Pfam HMM positions at which they interact, with the overlap between position sets measured by the Jaccard index. The RMSD of peptides bound to a Pfam is also calculated and used to cluster Pfam-peptide interfaces.
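For readers unfamiliar with the Jaccard index, the sketch below shows how it compares two sets of interacting Pfam HMM positions: the size of the intersection divided by the size of the union. The function name and the example positions are assumptions for illustration, and any threshold used for grouping is not specified here.

```python
from typing import Set


def jaccard_index(positions_a: Set[int], positions_b: Set[int]) -> float:
    """Jaccard index of two sets of Pfam HMM positions.

    Returns |A intersect B| / |A union B|, or 0.0 when both sets are empty
    (a convention chosen for this sketch).
    """
    union = positions_a | positions_b
    if not union:
        return 0.0
    return len(positions_a & positions_b) / len(union)


# Hypothetical HMM positions contacted by two peptides bound to the same Pfam.
peptide_1 = {10, 11, 12, 45}
peptide_2 = {11, 12, 45, 46}
print(jaccard_index(peptide_1, peptide_2))  # 3 shared of 5 total positions
```

A value near 1 means the two ligands or peptides contact nearly the same HMM positions; a value near 0 means they bind at largely different sites on the domain.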

You can search ProtCID in five ways:

  • PDB Code. Searching by PDB code returns a list of PFAM architectures for each sequence of the entry. Selecting one or two PFAM architectures returns the interface clusters.
  • One PFAM ID or PFAM Accession Code. A list of PDB entries that contain the query PFAM is returned. Selecting one PDB ID from this list is similar to inputting a PDB code.
  • Two PFAM IDs or PFAM Accession Codes. Searching by a Pfam pair returns the common Pfam-Pfam domain interactions. A list of PDB entries that contain the query PFAMs is returned. Selecting one PDB ID from this list is similar to inputting a PDB code.
  • Sequences. Input one or two sequences to find the PFAM architectures of those sequences that are contained in the PDB. Select one architecture to see its interface clusters.
  • UniProt IDs. Input one or more UniProt IDs to find the interactions and common interfaces among them. Two types of interactions are provided: structure-based and Pfam-based. Structure-based interactions contain only the interfaces of the input proteins themselves, while Pfam-based interactions return the interfaces of the Pfams in the input proteins, including the interfaces of homologous proteins in the same Pfams.

Or you can browse a list of all PFAMs and ligands in the PDB.

Citing ProtCID

If you find ProtCID useful, please cite the reference that describes the work:

The protein common interface database (ProtCID) - a comprehensive database of interactions of homologous proteins in multiple crystal forms: Q. Xu and R. Dunbrack. Nucleic Acids Research (2011) Database Issue.


1. Xu, Q., et al. Statistical Analysis of Interface Similarity in Crystals of Homologous Proteins. J. Mol. Biol. (2008) 381: 487-507.