ProtCID is a database of similar protein-protein interfaces in crystal structures of homologous proteins.
Its main goal is to identify and cluster homodimeric and heterodimeric interfaces observed in multiple
crystal forms of homologous proteins. Such interfaces, especially of non-identical proteins or protein
complexes, have been associated with biologically relevant interactions.
However, we do not explicitly assign a probability of biological relevance to each interface cluster.
For more details about the algorithm and benchmarking, please refer to our paper "Statistical Analysis of Interface Similarity in Crystals of Homologous Proteins" and the Help File.
A common interface here indicates chain-chain interactions that occur in different crystal forms.
All protein sequences in the PDB are assigned a "PFAM chain architecture",
which denotes the ordered PFAM assignments for that sequence, e.g. (Pkinase) or (Cyclin_N)_(Cyclin_C). Then
we compare homodimeric interfaces in all crystals that contain a particular architecture, for instance (Pkinase),
regardless of whether there are other protein types in the crystals.
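
As a rough illustration of how an architecture string such as (Cyclin_N)_(Cyclin_C) can be assembled from ordered PFAM assignments, here is a minimal Python sketch. The function name and the (start residue, PFAM ID) input format are assumptions made for this example; they are not ProtCID's actual code or data model.

```python
def chain_architecture(domains):
    """Build an architecture string from PFAM assignments on one chain.

    `domains` is a hypothetical list of (start_residue, pfam_id) tuples;
    the real ProtCID input format is not described in this document.
    """
    # Order the assignments by where they start along the sequence.
    ordered = sorted(domains, key=lambda d: d[0])
    # Wrap each PFAM ID in parentheses and join with underscores.
    return "_".join(f"({pfam_id})" for _, pfam_id in ordered)


single = chain_architecture([(1, "Pkinase")])
# Residue positions below are made up for illustration.
double = chain_architecture([(150, "Cyclin_C"), (20, "Cyclin_N")])
print(single)  # (Pkinase)
print(double)  # (Cyclin_N)_(Cyclin_C)
```
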
We also compare all interfaces between two different PFAM architectures in all PDB entries that contain them,
for instance (Pkinase) and (Cyclin_N)_(Cyclin_C). For both homodimers and heterodimers,
the interfaces are clustered into common interfaces based on a similarity score (Q score).
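
To make the idea of grouping interfaces by a pairwise similarity score concrete, the sketch below merges any two interfaces whose score meets a cutoff (single-linkage grouping with a union-find structure). The linkage rule, the threshold value, and the scores are assumptions chosen for illustration; the clustering procedure ProtCID actually uses is described in the paper and Help File, not here.

```python
def cluster_by_similarity(n_items, q_scores, threshold):
    """Group items 0..n_items-1 whose pairwise score is >= threshold.

    `q_scores` maps (i, j) index pairs to a similarity score.
    Single-linkage merging is an assumption, not ProtCID's method.
    """
    parent = list(range(n_items))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), q in q_scores.items():
        if q >= threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    clusters = {}
    for i in range(n_items):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Made-up scores for four interfaces; 0.5 is a hypothetical cutoff.
scores = {(0, 1): 0.8, (1, 2): 0.1, (2, 3): 0.6, (0, 3): 0.05}
groups = cluster_by_similarity(4, scores, threshold=0.5)
print(groups)  # [[0, 1], [2, 3]]
```
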
We report the number of crystal forms that contain a common interface,
the number of PDB entries, the number of PDB and PISA biological unit annotations that contain
the same interface, the average surface area, and the minimum sequence identity of proteins
that contain the interface. We find that PDB and PISA are not always consistent in
their biological units in a homologous family, even when an interface is present
in all crystal forms. Therefore, our data provide an independent check on publicly
available annotations of biological interactions for PDB entries.
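
The per-cluster statistics listed above could be computed along the following lines. This is only a sketch: the `Interface` record, its field names, the `pairwise_identity` lookup, and all values in the example are hypothetical and do not reflect ProtCID's database schema.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Interface:
    # All fields are assumed for illustration.
    pdb_id: str
    crystal_form: str
    surface_area: float
    in_pdb_biounit: bool
    in_pisa_biounit: bool


def summarize_cluster(interfaces, pairwise_identity):
    """Summarize one cluster of similar interfaces.

    `pairwise_identity` maps frozenset({pdb_a, pdb_b}) to percent identity.
    """
    entries = {i.pdb_id for i in interfaces}
    identities = [
        pairwise_identity[frozenset(pair)]
        for pair in combinations(sorted(entries), 2)
    ]
    return {
        "crystal_forms": len({i.crystal_form for i in interfaces}),
        "entries": len(entries),
        "pdb_biounits": len({i.pdb_id for i in interfaces if i.in_pdb_biounit}),
        "pisa_biounits": len({i.pdb_id for i in interfaces if i.in_pisa_biounit}),
        "avg_surface_area": sum(i.surface_area for i in interfaces) / len(interfaces),
        "min_seq_identity": min(identities) if identities else None,
    }


# Made-up entries, crystal forms, areas, and identities.
cluster = [
    Interface("1abc", "CF1", 800.0, True, True),
    Interface("2def", "CF2", 900.0, True, False),
    Interface("3ghi", "CF2", 1000.0, False, False),
]
identity = {
    frozenset({"1abc", "2def"}): 95.0,
    frozenset({"1abc", "3ghi"}): 40.0,
    frozenset({"2def", "3ghi"}): 42.0,
}
summary = summarize_cluster(cluster, identity)
print(summary)
```

In this made-up cluster the interface appears in every entry, yet the PDB and PISA biological unit counts differ, which is the kind of inconsistency the statistics are meant to expose.
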
You can search ProtCID in three ways:
PDB Code. Searching by PDB code returns a list of PFAM architectures for each sequence of the entry.
Selecting one or two PFAM architectures returns the interface clusters.
PFAM ID or PFAM Accession Code. A list of PDB entries that contain the query PFAM is returned.
Selecting one PDB ID from this list is similar to inputting a PDB code.
Sequences. Input one or two sequences to find the PFAM architectures of the sequences that
are contained in the PDB. Select one architecture to find the interface clusters.
Alternatively, you can browse a list of all PFAMs in the PDB.
If you find ProtCID useful, please cite the reference that describes the work:
The protein common interface database (ProtCID) - a comprehensive database of interactions of
homologous proteins in multiple crystal forms:
Q. Xu and R. Dunbrack. Nucleic Acids Research (2011) Database Issue 761-770.
1. Assignment of protein sequences to existing domain and family classification systems:
Pfam and the PDB. Qifang Xu; Roland L. Dunbrack Jr. Bioinformatics 2012; doi: 10.1093/bioinformatics/bts533.
2. Xu, Q, et al, Statistical Analysis of Interface Similarity in Crystals of Homologous Proteins.
J. Mol. Biol. (2008) 381: 487-507.