

Structural Classification of Proteins




The enormous diversity of protein sequences might suggest that the variety of structural patterns is equally enormous. However, rapid progress in the determination of 3D structures has shown that proteins display far more regularities than expected. Several attempts have therefore been made to classify proteins according to their 3D structure, and more than 90 % of known 3D structures can be classified by these criteria. Several such classifications are available as Internet databases. Of these, we'll mention SCOP and CATH.

The SCOP (Structural Classification of Proteins) database is maintained by Drs. Alexey G. Murzin, Bartlett G. Ailey, Steven E. Brenner, Tim J.P. Hubbard, and Cyrus Chothia, from Cambridge University (MRC Laboratory of Molecular Biology and Centre for Protein Engineering, Hills Road, Cambridge, CB2 2QH, England). It is very comprehensive and is cross-referenced by many other databases, such as the Protein Data Bank.

The CATH (Protein Structure Classification) database was created by Drs. C.A. Orengo, A.D. Michie, S. Jones, M.B. Swindells, G. Hutchinson, A. Martin, D.T. Jones, and J.M. Thornton, from University College, London (Biomolecular Structure and Modelling Unit). It is similar to SCOP but easier to present in an elementary course. In this demo we'll follow the CATH systematics.

The central object of the CATH classification is the Domain, that is, the basic unit of tertiary structure in proteins. The "taxonomic" hierarchy in CATH is as follows:

  • I. Class (C level): The highest level of classification, corresponding to the overall secondary structure content. There are four classes: (C1) Mainly α; (C2) Mainly β; (C3) α-β; and a fourth class (C4) that includes proteins with little or no secondary structure.
  • II. Architecture (A level): Describes the overall packing of secondary structure elements in the domain, without distinguishing the connectivity between them.
  • III. Topology (T level): Describes the connectivity of the different elements within an architecture. We'll not go into this level.
  • IV. Homologous Superfamily (H level): This level takes primary structure homology into account. The categories at this level are groups of proteins that share a common phylogenetic ancestor (superfamilies). We'll not go into this level.
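
The four levels above can be pictured as a dotted identifier with one number per level. The following Python sketch is only an illustration of that idea: `CathNode`, `parse_cath_code` and the example code string are names and inputs assumed here, not part of any CATH software, and the class numbering simply follows the C1 to C4 order given in the list.

```python
from dataclasses import dataclass
from typing import List

# Class names in the C1-C4 order used in this section.
CATH_CLASSES = {
    1: "Mainly alpha",
    2: "Mainly beta",
    3: "Alpha-beta",
    4: "Little or no secondary structure",
}


@dataclass(frozen=True)
class CathNode:
    """Position of one domain in the C.A.T.H hierarchy (hypothetical helper)."""
    class_: int        # C level
    architecture: int  # A level
    topology: int      # T level
    superfamily: int   # H level

    @property
    def class_name(self) -> str:
        return CATH_CLASSES[self.class_]

    def ancestors(self) -> List[str]:
        """Dotted identifiers from the class level down to the superfamily."""
        parts = [self.class_, self.architecture, self.topology, self.superfamily]
        return [".".join(str(p) for p in parts[:i]) for i in range(1, 5)]


def parse_cath_code(code: str) -> CathNode:
    """Split a 'C.A.T.H' string into its four numeric levels."""
    fields = code.split(".")
    if len(fields) != 4:
        raise ValueError("expected four dot-separated levels, got %r" % code)
    c, a, t, h = (int(f) for f in fields)
    if c not in CATH_CLASSES:
        raise ValueError("unknown class number: %d" % c)
    return CathNode(c, a, t, h)


# Illustrative input only; the string is an assumed example.
node = parse_cath_code("3.40.50.720")
print(node.class_name)   # Alpha-beta
print(node.ancestors())  # ['3', '3.40', '3.40.50', '3.40.50.720']
```

Each entry returned by `ancestors()` names one level of the hierarchy, so two domains that share the first two numbers share class and architecture even if their topologies differ.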

Let's see now the main features of the different classes and architectures described in CATH.


Class: Mainly α Proteins

In these proteins the α-helix is the predominant secondary structure (more than 50 % α-helix and less than 5 % β-structure). We'll consider three of the architectures in this class:

Not bundled. The α-helices do not form bundles but are contained in a globular structure. An example is Erythrocruorin, which is similar to Hemoglobins and Myoglobins.

Its secondary structure is:

Bundle. The α-helices form bundles. An example is Cytochrome C', an electron transport protein (not to be confused with Cytochrome C):

Secondary structure:

Small α-Proteins. They consist of one or two helices, as can be seen in the Inovirus Coat Protein (ICP).

Secondary structure:


Class: Mainly β Proteins

The predominant secondary structure is the β-sheet. This class is defined by more than 50 % β-sheet and less than 5 % α-helix. Some representative architectures are:

Ribbon. In this architecture a number of β-sheets form an elongated overall structure. This is the case of Plasminogen Activator, a protein involved in fibrinolysis:

Secondary structure:

Simple Sheet. The structure has only one sheet, as we can see in Heregulin, a protein related to the Epidermal Growth Factor (EGF):

Secondary structure:

Roll. A curved β-sheet that does not form a barrel. This is the case of Phosphatidyl Inositol Kinase:

Secondary structure:

Barrel. An antiparallel β-sheet forming a closed structure. An example is Porin, an integral membrane protein forming pores with an exclusion limit of 600 Da:

Secondary structure:

Shell. A large β-sheet forming a shell, with a few α-helices around it. It appears in A-Protein, present in photoautotrophic bacteria:

Secondary structure:

ββ Sandwich. Two β-sheets forming a sandwich. This is the typical domain of Immunoglobulins. An example is β-2-Microglobulin:

Secondary structure:

Distorted sandwich. Like the previous architecture but more disordered. An example is Complement Regulatory Protein:

Secondary structure:

Trefoil. Three β-hairpins forming a central barrel. It is the structure of Fibroblast Growth Factor:

Secondary structure:

Orthogonal Prism. Three parallel sheets arranged as a trigonal prism. The direction of the β-strands is perpendicular to the axis of the prism. An example is Galanthus nivalis Agglutinin:

Secondary structure:

Aligned Prism. Three parallel sheets arranged as a trigonal prism. The direction of the β-strands is parallel to the axis of the prism. This is the case of Vitelline Membrane Protein:

Secondary structure:

Propeller (4). Four sheets arranged radially as the blades of a propeller. An example is Hemopexin:

Secondary structure:

Propeller (6). Six sheets arranged as the blades of a propeller. An example is Neuraminidase:

Secondary structure:

Propeller (7). Seven sheets arranged as the blades of a propeller. An example is Methylamine Dehydrogenase:

Secondary structure:

Propeller (8). Eight β-sheets arranged as the blades of a propeller. It is the case of Methanol dehydrogenase:

Secondary structure:

Solenoid (2). Two β-sheets closely packed. It is distinguished from sandwich structures by its connectivity: the peptide chain alternates between the two sheets, giving a solenoidal arrangement. It is sometimes called a β-helix. An example is Alkaline Protease:

Secondary structure:

Solenoid (3). Three β-sheets with solenoidal connectivity. It is the case of Pectate Lyase:

Secondary structure:

Complex. The structure is predominantly β but cannot be classified in any of the preceding groups because of its complex structural pattern. An example is Acid proteinase:

Secondary structure:

Class: α and β Proteins

These are proteins with α-helices (15-55 %) and β-structure (10-45 %). The SCOP classification makes a distinction between α/β Proteins, in which there is connectivity between both types of structure, and α+β Proteins, in which both types appear separated. The main architectures are the following:

Roll. A curved β-sheet surrounding one or several α-helices. An example is Scytalone dehydratase:

Secondary structure:

Barrel. A parallel β-sheet forming a central cylinder whose strands are connected by α-helices. This is the Triose phosphate isomerase barrel. Another example is the enzyme endo-1,4-β-Glucanase:

Secondary structure:

Sandwich (2 layers, βα). It is formed by two layers: one is the β-sheet and the other the α-helices. An example is Barstar, the natural inhibitor of the ribonuclease Barnase:

Secondary structure:

Sandwich (3 layers, αβα). An example is the Gene Regulatory Protein:

Secondary structure:

Sandwich (3 layers, ββα). The B-subunit of the enzyme Carboxy Lyase:

Secondary structure:

Sandwich (4 layers, αββα). It is the case of Deoxyribonuclease I:

Secondary structure:

Box. An extended β-sheet with the shape of a box, with several α-helices inside. An example is Proliferating Cell Nuclear Antigen (PCNA):

Secondary structure:

Horseshoe. A very ordered β-structure flanked by α-helices, with the overall shape of a horseshoe. It is the case of Ribonuclease Inhibitor:

Secondary structure:

Complex. Proteins with α- and β-structure whose arrangement is too complex to be classified in any of the other groups. An example is Cytochrome P450 CAM:

Secondary structure:

Small β Peptides. The A-subunit of the enzyme Carboxy Lyase:

Secondary structure:


Class: Proteins with little or no secondary structure

These proteins have little or no secondary structure at all. There is only one architecture:

Irregular. The B-subunit of pea lectin:

Secondary structure:
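
The percentage criteria quoted above for the first three classes can be collected into a small function. This is a minimal sketch, assuming the α-helix and β-structure percentages of a domain are already known; `classify_by_content` is a hypothetical name, and the cut-offs are only the ones stated in this text. No numeric limit is given here for the fourth class, so the function returns `None` whenever none of the stated ranges applies rather than guessing.

```python
from typing import Optional


def classify_by_content(alpha_pct: float, beta_pct: float) -> Optional[str]:
    """Assign a class from secondary structure percentages.

    Uses only the cut-offs quoted in this section. Returns None when the
    values fall outside all three stated ranges.
    """
    if not (0 <= alpha_pct <= 100 and 0 <= beta_pct <= 100):
        raise ValueError("percentages must lie between 0 and 100")
    if alpha_pct + beta_pct > 100:
        raise ValueError("alpha and beta content cannot exceed 100 % together")

    # Mainly alpha: more than 50 % helix, less than 5 % beta.
    if alpha_pct > 50 and beta_pct < 5:
        return "Mainly alpha"
    # Mainly beta: more than 50 % beta, less than 5 % helix.
    if beta_pct > 50 and alpha_pct < 5:
        return "Mainly beta"
    # Alpha and beta: 15-55 % helix together with 10-45 % beta.
    if 15 <= alpha_pct <= 55 and 10 <= beta_pct <= 45:
        return "Alpha and beta"
    return None


print(classify_by_content(70, 2))   # Mainly alpha
print(classify_by_content(3, 60))   # Mainly beta
print(classify_by_content(30, 25))  # Alpha and beta
print(classify_by_content(60, 8))   # None
```

The last call shows that the stated ranges leave gaps: a domain with 60 % helix and 8 % β-structure matches none of them, which is one reason a classification based on visual inspection of the fold is still needed alongside simple percentages.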

