Structural Classification of Proteins
The enormous diversity of protein sequences might suggest that the variety of structural patterns is equally enormous.
However, rapid progress in the determination of 3D structures has shown that proteins contain far more regularities than expected. Thus, there
have been several attempts at a structural classification of proteins according to their 3D structure. More than 90 % of known 3D structures can be classified
according to those criteria. Several such classifications are available as Internet databases. Of these, we'll mention SCOP and CATH.
The SCOP (Structural Classification of Proteins) database is maintained by
Drs. Alexey G. Murzin, Bartlett G. Ailey, Steven E. Brenner, Tim J.P. Hubbard, and Cyrus Chothia, from Cambridge University (MRC Laboratory of
Molecular Biology and Centre for Protein Engineering, Hills Road, Cambridge, CB2 2QH, England). It is very comprehensive and is cross-referenced by many
other databases, such as the Protein Data Bank.
The CATH (Protein Structure Classification) database was created by
Drs. C.A. Orengo, A.D. Michie, S. Jones, M.B. Swindells, G. Hutchinson, A. Martin, D.T. Jones, and J.M. Thornton, from University
College London (Biomolecular Structure and Modelling Unit). It is similar to SCOP but easier to present in an elementary course.
In this demo we'll follow the CATH systematics.
The central object of the CATH classification is the Domain, that is, the basic unit of tertiary structure in proteins.
The "taxonomic" hierarchy in CATH is as follows:
- I. Class (C level): This is the highest level of classification and reflects the overall secondary structure content. There are
four classes: (C1) Mainly α; (C2) Mainly β; (C3) α-β; and a fourth
class (C4) that includes proteins with little or no secondary structure.
- II. Architecture (A level): Describes the global packing of secondary structure in the domain, but does not distinguish the connectivity
of the different elements.
- III. Topology (T level): This level describes the connectivity of the different elements in an architecture. We won't go into this level.
- IV. Homologous Superfamily (H level): This level takes primary structure homology into account. Its categories
group proteins that share a common phylogenetic ancestor (superfamilies). We won't go into this level either.
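The four levels listed above nest inside one another, so a domain's position in the hierarchy can be pictured as four numbers, one per level. The following is a minimal Python sketch of that idea, not the CATH software itself. The dotted "C.A.T.H" notation, the names `CathNode` and `parse_cath`, and the example identifier are assumptions made for illustration; only the four class names come from the text.

```python
from dataclasses import dataclass

# Class names as listed in the text (C1 to C4).
CLASS_NAMES = {
    1: "Mainly alpha",
    2: "Mainly beta",
    3: "Alpha-beta",
    4: "Little or no secondary structure",
}


@dataclass(frozen=True)
class CathNode:
    """Hypothetical container for one position in the C/A/T/H hierarchy."""
    c: int  # Class
    a: int  # Architecture
    t: int  # Topology
    h: int  # Homologous superfamily

    @property
    def class_name(self) -> str:
        return CLASS_NAMES.get(self.c, "unknown class")

    def architecture_key(self) -> tuple:
        # Domains with the same key share class and architecture,
        # whatever their connectivity (topology) may be.
        return (self.c, self.a)

    def topology_key(self) -> tuple:
        return (self.c, self.a, self.t)


def parse_cath(code: str) -> CathNode:
    """Parse a dotted 'C.A.T.H' string (assumed notation) into a CathNode."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    c, a, t, h = (int(p) for p in parts)
    return CathNode(c, a, t, h)


# Illustrative identifiers only; they were not looked up in the database.
first = parse_cath("3.20.20.70")
second = parse_cath("3.20.30.10")
print(first.class_name)
print(first.architecture_key() == second.architecture_key())
print(first.topology_key() == second.topology_key())
```

Comparing keys at different depths mirrors the text: two domains can share an architecture while differing in topology.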
Let's see now the main features of the different classes and architectures described in CATH.
Class: Mainly α Proteins
In these proteins the α-helix is the predominant secondary structure (more than 50 % α-helix
and less than 5 % β-structure). We'll consider three architectures in this class (among others):
Not bundled. The α-helices do not form bundles, but they are packed into a globular structure.
An example is Erythrocruorin, which is similar to Hemoglobins and Myoglobins.
Bundle. The α-helices form bundles. An example is Cytochrome C', an electron
transport protein (not to be confused with Cytochrome C).
Small α-Proteins. They consist of one or two helices, as can be seen in the
Inovirus Coat Protein (ICP).
Class: Mainly β Proteins
The predominant secondary structure is the β-sheet. This class is defined by more than 50 %
β-sheet and less than 5 % α-helix. Some representative architectures are:
Ribbon. In this architecture a number of β-sheets form an elongated overall
structure. This is the case of Plasminogen Activator, a protein involved in fibrinolysis.
Simple Sheet. The structure has only one sheet, as we can see in Heregulin, a protein related to the
Epidermal Growth Factor (EGF).
Roll. A curved β-sheet that does not form a barrel. This is the case of Phosphatidyl Inositol
Kinase.
Barrel. An antiparallel β-sheet forming a closed structure. An example is
Porin, an integral membrane protein that forms pores with an exclusion limit of 600 Da.
Shell. A large β-sheet forming a shell, with a few α-helices
around it. It appears in A-Protein, present in photoautotrophic bacteria.
ββ Sandwich. Two β-sheets forming a sandwich. This is the
typical domain of Immunoglobulins. An example is β-2-Microglobulin.
Distorted sandwich. Like the ββ sandwich, but more disordered. An example is Complement Regulatory Protein.
Trefoil. Three β-hairpins forming a central barrel. It is the structure of
Fibroblast Growth Factor.
Orthogonal Prism. Three parallel sheets arranged as a trigonal prism. The direction of the β-strands
is perpendicular to the axis of the prism. An example is Galanthus nivalis Agglutinin.
Aligned Prism. Three parallel sheets arranged as a trigonal prism. The direction of the β-strands
is parallel to the axis of the prism. This is the case of Vitelline Membrane Protein.
Propeller (4). Four sheets arranged radially, like the blades of a propeller. An example is Hemopexin.
Propeller (6). Six sheets arranged like the blades of a propeller. An example is Neuraminidase.
Propeller (7). Seven sheets arranged like the blades of a propeller. An example is Methylamine Dehydrogenase.
Propeller (8). Eight β-sheets arranged like the blades of a propeller. This is the case of
Methanol Dehydrogenase.
Solenoid (2). Two closely packed β-sheets. It is distinguished from sandwich structures by
its connectivity: the peptide chain alternates between the two sheets, giving a solenoidal arrangement. It is sometimes called a β-Helix.
An example is Alkaline Protease.
Solenoid (3). Three β-sheets with solenoidal connectivity. This is the case of
Pectate Lyase.
Complex. The structure is predominantly β, but it cannot be classified in any of the previous
groups because of its complex structural pattern. An example is Acid Proteinase.
Class: α and β Proteins
These are proteins with both α-helices (15-55 %) and β-structure (10-45 %).
The SCOP classification distinguishes between α/β Proteins, in which the two
types of structure are interconnected, and α and β (α+β) Proteins, in which the two types appear separately. The
main architectures are the following:
Roll. A curved β-sheet surrounding one or several α-helices.
An example is Scytalone Dehydratase.
Barrel. A parallel β-sheet forming a central cylinder whose strands are connected by
α-helices. This is the Triose Phosphate Isomerase barrel. Another example is the enzyme
endo-1,4-β-Glucanase.
Sandwich (2 layers, βα). It is formed by two layers: one is the β-sheet
and the other consists of α-helices. An example is Barstar, the natural inhibitor of the ribonuclease Barnase.
Sandwich (3 layers, αβα). An example is the Gene Regulatory Protein.
Sandwich (3 layers, ββα). An example is the B-subunit of the enzyme Carboxy Lyase.
Sandwich (4 layers, αββα). This is the case of Deoxyribonuclease I.
Box. An extended β-sheet in the shape of a box, with several α-helices
inside. An example is Proliferating Cell Nuclear Antigen.
Horseshoe. A highly ordered β-structure with flanking α-helices,
with the overall shape of a horseshoe. This is the case of Ribonuclease Inhibitor.
Complex. Proteins with α- and β-structures arranged in a complex
pattern that cannot be assigned to any other group. An example is Cytochrome P450 CAM.
Small β Peptides. An example is the A-subunit of the enzyme Carboxy Lyase.
Class: Proteins with little or no secondary structure
These proteins have little or no secondary structure at all. There is only one architecture:
Irregular. An example is the B-subunit of pea lectin.
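The percentage ranges quoted for the three main classes can be collected into a small function. This is only a sketch of how those thresholds might be applied, not the procedure CATH actually uses. The function name `assign_class` and its return labels are hypothetical. Because the text gives no numeric cutoff for the fourth class, the sketch returns `None` for any composition outside the three quoted ranges instead of guessing.

```python
from typing import Optional


def assign_class(alpha_pct: float, beta_pct: float) -> Optional[str]:
    """Map helix and sheet percentages to a class label, using the ranges in the text.

    Returns None when the composition falls outside all three quoted ranges;
    the text gives no numeric definition for class C4, so none is assumed here.
    """
    in_range = 0 <= alpha_pct <= 100 and 0 <= beta_pct <= 100
    if not in_range or alpha_pct + beta_pct > 100:
        raise ValueError("percentages must lie in 0-100 and sum to at most 100")

    # Mainly alpha: more than 50 % helix and less than 5 % beta.
    if alpha_pct > 50 and beta_pct < 5:
        return "C1 Mainly alpha"
    # Mainly beta: more than 50 % sheet and less than 5 % helix.
    if beta_pct > 50 and alpha_pct < 5:
        return "C2 Mainly beta"
    # Alpha-beta: 15-55 % helix together with 10-45 % beta.
    if 15 <= alpha_pct <= 55 and 10 <= beta_pct <= 45:
        return "C3 alpha-beta"
    return None


# Hypothetical compositions, chosen only to exercise each branch.
for alpha, beta in [(70, 0), (2, 60), (30, 25), (3, 4)]:
    print(alpha, beta, assign_class(alpha, beta))
```

The ranges as quoted leave gaps (for example, 60 % helix with 8 % sheet matches none of them), which is why the function reports "no assignment" explicitly.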