Glossary

Pfam Architecture

The Pfam architecture of a PDB sequence (entity) is the list of its Pfam domains, ordered by the starting position of each domain. For instance, PDB entry 2RII has two sequences, with two chains for each sequence. The Pfam domains on each sequence are listed in the table:

PDB ID  Asym ID  Author ID  Entity ID  Pfam ID     Pfam Accession  SeqBeg  SeqEnd
2rii    A        A          1          AhpC-TSA    PF00578         8       142
2rii    A        A          1          1-cysPrx_C  PF10417         162     197
2rii    B        B          1          AhpC-TSA    PF00578         8       142
2rii    B        B          1          1-cysPrx_C  PF10417         162     197
2rii    C        X          2          ParBc       PF02195         15      106
2rii    D        Y          2          ParBc       PF02195         15      106

The chain Pfam architecture for sequence 1 (Entity ID = 1) is (AhpC-TSA)_(1-cysPrx_C), and the chain Pfam architecture for sequence 2 (Entity ID = 2) is (ParBc). Each Pfam is enclosed in a pair of parentheses, and the Pfams on one sequence are connected by "_". The Pfam architectures of the individual sequences are connected by ";", so the Pfam architecture for the entry (entry Pfam architecture) is (AhpC-TSA)_(1-cysPrx_C);(ParBc). Please note that the Pfam architecture of a PDB entry is defined on the protein sequences (entities) of the entry, not on its protein chains.
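
The naming rule above can be sketched in a few lines of Python. This is only an illustration, not the database's own code: the function names are hypothetical, the rows are taken from the 2RII table, and ordering the sequences by Entity ID is an assumption (the text does not say how sequences are ordered within an entry architecture).

```python
from collections import defaultdict

# (entity_id, pfam_id, seq_beg) taken from the 2RII table. One chain per
# entity is enough, because chains of the same entity share one sequence.
# The rows are deliberately out of order to show the sorting step.
rows = [
    (1, "1-cysPrx_C", 162),
    (1, "AhpC-TSA", 8),
    (2, "ParBc", 15),
]

def chain_architecture(domains):
    """Join Pfam IDs, sorted by start position, as (A)_(B)."""
    ordered = sorted(domains, key=lambda d: d[1])
    return "_".join(f"({pfam_id})" for pfam_id, _ in ordered)

def entry_architecture(rows):
    """Join the chain architectures of all entities with ';'.

    Assumption: entities are listed in order of Entity ID.
    """
    by_entity = defaultdict(list)
    for entity_id, pfam_id, seq_beg in rows:
        by_entity[entity_id].append((pfam_id, seq_beg))
    return ";".join(chain_architecture(by_entity[e]) for e in sorted(by_entity))

print(entry_architecture(rows))  # (AhpC-TSA)_(1-cysPrx_C);(ParBc)
```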

Pfam Architecture Group

We group PDB entries so that each group contains either a single Pfam chain architecture or a pair of chain architectures. First, we define a group for each unique chain architecture found in one or more PDB entries. All PDB entries that contain a particular chain architecture are added to that group. Entries that contain more than one chain architecture will thus appear in multiple groups. For instance, there is a group "(Cyclin_N)_(Cyclin_C)" that contains 82 PDB entries. Some of these entries contain other proteins as well (such as Pkinase proteins), but they all share the entity (Cyclin_N)_(Cyclin_C). Second, there is a group for each pair of chain architectures that occur together in at least one PDB entry. So there is a group "(Cyclin_N)_(Cyclin_C);(Pkinase)" containing 73 PDB entries that have both of these proteins.
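
As a rough sketch of the two grouping steps, the Python below builds single-architecture groups and pair groups from a small made-up data set. The entry IDs are placeholders rather than real PDB entries, the function name is hypothetical, and naming a pair group by joining the two architectures in alphabetical order is an assumption.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: entry ID -> set of chain architectures in that entry.
entries = {
    "entryA": {"(Cyclin_N)_(Cyclin_C)", "(Pkinase)"},
    "entryB": {"(Cyclin_N)_(Cyclin_C)"},
    "entryC": {"(Pkinase)"},
}

def build_groups(entries):
    """Return group name -> set of entry IDs."""
    groups = defaultdict(set)
    for entry_id, archs in entries.items():
        # Step 1: one group per chain architecture.
        for arch in archs:
            groups[arch].add(entry_id)
        # Step 2: one group per pair of architectures found in the same entry.
        # Assumption: the pair name lists the two architectures alphabetically.
        for a, b in combinations(sorted(archs), 2):
            groups[f"{a};{b}"].add(entry_id)
    return dict(groups)

for name, members in sorted(build_groups(entries).items()):
    print(name, sorted(members))
```

Here entryA lands in three groups: one for each of its two chain architectures and one for the pair.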

Crystal Form (CF)

A crystal form is defined in two steps. First, initial crystal forms are defined: entries belong to the same initial crystal form if they have the same entry architecture, the same space group, the same asymmetric unit, and unit cell parameters within 1% of each other. Then similar initial crystal forms are merged if more than 70% of the interfaces of the two crystal forms are similar to each other. The CFs in a group are therefore a list of distinct CFs.
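
The first step can be illustrated with a minimal Python sketch. Several details are assumptions made only for the example: the `Crystal` record and its field names are hypothetical, the asymmetric unit is represented by a simple label, and "within 1%" is read as a relative difference of at most 1% on each of the six cell parameters. The second step (merging forms that share more than 70% of their interfaces) is not shown.

```python
from dataclasses import dataclass

@dataclass
class Crystal:
    architecture: str   # entry Pfam architecture
    space_group: str
    asu: str            # hypothetical label for the asymmetric unit content
    cell: tuple         # (a, b, c, alpha, beta, gamma)

def cells_within(cell1, cell2, tol=0.01):
    """True if every cell parameter differs by at most tol (relative)."""
    return all(abs(x - y) <= tol * max(abs(x), abs(y))
               for x, y in zip(cell1, cell2))

def same_initial_form(c1, c2):
    """Step 1 only: same architecture, space group, ASU, and similar cell."""
    return (c1.architecture == c2.architecture
            and c1.space_group == c2.space_group
            and c1.asu == c2.asu
            and cells_within(c1.cell, c2.cell))

x = Crystal("(ParBc)", "P 21 21 21", "A2", (50.0, 60.0, 70.0, 90.0, 90.0, 90.0))
y = Crystal("(ParBc)", "P 21 21 21", "A2", (50.3, 60.2, 70.1, 90.0, 90.0, 90.0))
z = Crystal("(ParBc)", "P 21 21 21", "A2", (52.0, 60.0, 70.0, 90.0, 90.0, 90.0))
print(same_initial_form(x, y), same_initial_form(x, z))
```

In this toy data, x and y agree to within 1% on every parameter, while z differs from x by about 4% on the a axis.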

Representative Entry

The PDB entry with the best resolution in a CF is the representative structure for that CF.

Chain-Chain Interfaces

All unique interfaces are computed from the crystal by calculating chain-chain interactions. The crystal is generated by applying symmetry operations to the asymmetric unit. Two chains interact if and only if there are at least 10 pairs of Cbeta or Calpha atoms with a distance of less than 12 Angstroms and at least one atomic contact with a distance of less than 5 Angstroms. Unique interfaces are those that involve different asymmetric chains, or that have low similarity to each other if they involve the same asymmetric chains.
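
The interaction criterion can be written as a small Python check. This is a sketch under stated assumptions, not the actual implementation: the function name and arguments are hypothetical, each residue is assumed to contribute one representative atom (Cbeta, or Calpha where no Cbeta exists), and coordinates are plain (x, y, z) tuples in Angstroms.

```python
import math

def chains_interact(rep_a, rep_b, atoms_a, atoms_b,
                    min_pairs=10, rep_cutoff=12.0, atom_cutoff=5.0):
    """Apply the two-part contact criterion to a pair of chains.

    rep_a, rep_b:     one representative atom per residue (assumed Cbeta,
                      or Calpha when Cbeta is absent)
    atoms_a, atoms_b: all atoms of each chain
    """
    # Part 1: at least min_pairs representative-atom pairs closer than 12 A.
    close_pairs = sum(1 for p in rep_a for q in rep_b
                      if math.dist(p, q) < rep_cutoff)
    if close_pairs < min_pairs:
        return False
    # Part 2: at least one atom-atom contact closer than 5 A.
    return any(math.dist(p, q) < atom_cutoff
               for p in atoms_a for q in atoms_b)

# Toy example: two short parallel strands 4.5 A apart, and one 20 A away.
strand_a = [(i * 3.8, 0.0, 0.0) for i in range(4)]
strand_b = [(i * 3.8, 4.5, 0.0) for i in range(4)]
strand_far = [(i * 3.8, 20.0, 0.0) for i in range(4)]

print(chains_interact(strand_a, strand_b, strand_a, strand_b))
print(chains_interact(strand_a, strand_far, strand_a, strand_far))
```

The all-pairs loops are quadratic in chain length; for whole crystals a spatial index would be the usual choice, but that is outside the scope of this sketch.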

Interface Cluster

Similar interfaces are grouped into clusters by an average-linkage hierarchical clustering algorithm. Each cluster is described by the number of distinct CFs in the cluster (M) and the total number of CFs in the group (N), as well as the minimum sequence identity in the cluster.
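
To illustrate the idea, here is a minimal pure-Python sketch of average-linkage clustering followed by the M count. The dissimilarity values, the stopping threshold of 0.5, and the CF labels are all made up for the example; the glossary does not state which similarity measure or cutoff is actually used.

```python
def average_linkage(dist, threshold):
    """Merge clusters until the closest pair is farther apart than threshold.

    dist is a symmetric matrix of pairwise dissimilarities between interfaces.
    The distance between two clusters is the mean over all cross pairs.
    """
    clusters = [[i] for i in range(len(dist))]

    def avg(c1, c2):
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, a, b = min((avg(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        if d > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def distinct_cfs(cluster, cf_of):
    """M: number of distinct crystal forms among a cluster's interfaces."""
    return len({cf_of[i] for i in cluster})

# Hypothetical dissimilarities between four interfaces.
dist = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
cf_of = ["CF1", "CF1", "CF2", "CF3"]   # crystal form of each interface
n_total = len(set(cf_of))              # N: CFs in the group

for cluster in average_linkage(dist, threshold=0.5):
    print(cluster, f"M/N = {distinct_cfs(cluster, cf_of)}/{n_total}")
```

With these numbers, interfaces 0 and 1 form one cluster and interfaces 2 and 3 form another, giving M/N values of 1/3 and 2/3.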