Overview
ProtCAD is a comprehensive database of protein assemblies in the PDB.
It uses EPPIC assemblies, together with PDB biological assemblies that are not defined in EPPIC,
groups assemblies that have the same Pfam architecture,
and clusters assemblies that share the same stoichiometry, the same symmetry, and similar interfaces.
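
As a rough illustration of the grouping step, the sketch below buckets assemblies by Pfam architecture, stoichiometry, and symmetry. It is a minimal sketch, not ProtCAD's implementation: the record fields (id, pfam_arch, stoichiometry, symmetry), the function name, and the example records are all hypothetical, and the interface-similarity comparison that ProtCAD also applies is deliberately left out because the overview does not specify how it is computed.

```python
from collections import defaultdict


def group_assemblies(assemblies):
    """Bucket assembly ids by (Pfam architecture, stoichiometry, symmetry).

    Assumption: each assembly is a dict with hypothetical keys
    'id', 'pfam_arch', 'stoichiometry' and 'symmetry'.
    Interface similarity is NOT considered here, so each bucket is only
    a candidate set that a real clustering step would split further.
    """
    groups = defaultdict(list)
    for asm in assemblies:
        key = (asm["pfam_arch"], asm["stoichiometry"], asm["symmetry"])
        groups[key].append(asm["id"])
    return dict(groups)


# Made-up records, for illustration only.
example = [
    {"id": "asm1", "pfam_arch": "Pkinase", "stoichiometry": "A2", "symmetry": "C2"},
    {"id": "asm2", "pfam_arch": "Pkinase", "stoichiometry": "A2", "symmetry": "C2"},
    {"id": "asm3", "pfam_arch": "Pkinase", "stoichiometry": "A4", "symmetry": "D2"},
    {"id": "asm4", "pfam_arch": "SH2", "stoichiometry": "A2", "symmetry": "C2"},
]

groups = group_assemblies(example)
for key, ids in groups.items():
    print(key, ids)
```

Keying on a tuple keeps the exact-match criteria (architecture, stoichiometry, symmetry) separate from the similarity-based criterion, which would need a pairwise comparison inside each bucket.
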
Each cluster is annotated with several statistics, including
the number of distinct crystal forms (#CFs_clus), the number of entries (#ENT_clus), the minimum sequence identity among the assembly sequences,
the number of unique UniProt entries (#UNP_clus),
the number of crystal forms of these UniProt entries in the cluster (#CFs_UNPclus),
the number of crystal forms of the same UniProt entries in the Pfam architecture (#CFs_UNParch),
the ratio of #CFs_UNPclus to #CFs_UNParch (R_CF_UNPclus),
the number of PDB entries whose PDB biological assemblies match the cluster (#PDBBAs),
the ratio of #PDBBAs to #ENT_clus (R_PDB),
the number of PISA assemblies that match the cluster (#PISABAs),
the ratio of #PISABAs to #ENT_clus (R_PISA),
the number of EPPIC-predicted biological assemblies that match the cluster (#EPPICBAs),
and the ratio of #EPPICBAs to #ENT_clus (R_EPPIC).
These numbers can be combined as evidence for identifying biological assemblies.
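
The four ratios defined above follow directly from the counts. The sketch below shows the arithmetic under stated assumptions: the class, field, and function names are hypothetical, the counts are made up, and returning 0.0 for a zero denominator is a choice made here for convenience, not something the ProtCAD description specifies.

```python
from dataclasses import dataclass


@dataclass
class ClusterCounts:
    """Counts for one cluster (hypothetical field names)."""
    ent_clus: int       # #ENT_clus
    cfs_unp_clus: int   # #CFs_UNPclus
    cfs_unp_arch: int   # #CFs_UNParch
    pdb_bas: int        # #PDBBAs
    pisa_bas: int       # #PISABAs
    eppic_bas: int      # #EPPICBAs


def safe_ratio(numerator, denominator):
    # Assumption: a zero denominator yields 0.0 rather than an error.
    return numerator / denominator if denominator else 0.0


def cluster_ratios(c):
    """Compute the ratios as defined in the overview."""
    return {
        "R_CF_UNPclus": safe_ratio(c.cfs_unp_clus, c.cfs_unp_arch),
        "R_PDB": safe_ratio(c.pdb_bas, c.ent_clus),
        "R_PISA": safe_ratio(c.pisa_bas, c.ent_clus),
        "R_EPPIC": safe_ratio(c.eppic_bas, c.ent_clus),
    }


# Made-up counts, for illustration only.
counts = ClusterCounts(ent_clus=20, cfs_unp_clus=8, cfs_unp_arch=10,
                       pdb_bas=15, pisa_bas=10, eppic_bas=5)
ratios = cluster_ratios(counts)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts, the UniProt entries in the cluster appear in 8 of their 10 crystal forms within the architecture (R_CF_UNPclus = 0.80), and 15 of the 20 entries have a matching PDB biological assembly (R_PDB = 0.75).
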