Proteins can be divided into two categories depending on shape and function: globular or fibrous. Globular proteins generally take part in chemical processes in an organism, e.g. enzymes and redox proteins, whereas fibrous proteins play a more mechanical and structural role, e.g. actin and myosin in muscle and collagen in tendons. The latter category tends to be very insoluble and is therefore almost impossible to crystallise. Globular proteins, on the other hand, are better candidates for crystallisation owing to their solubility and have therefore been the subject of the majority of macromolecular crystallographic studies to date.
A protein molecule is a linear chain of amino acids which folds in a reproducible manner. Twenty different amino acids occur naturally in proteins. All except glycine, which is not chiral, have the L configuration, and they share the general formula

$$\mathrm{H_2N{-}CH(R){-}COOH}$$

The letter $R$ represents the amino acid side chain and is the distinguishing feature among amino acids. The central carbon atom to which the side group is bound is called the $\alpha$-carbon ($C_\alpha$).
Amino acids link together to form a polypeptide chain via a peptide bond, which is formed between the amino (nitrogen) end of one amino acid and the carboxyl (carbon) end of the next with the elimination of a water molecule (see the figure below).
Figure: Formation of the peptide link joining two amino acids. One water molecule is produced per peptide bond formed.
A complete protein molecule can be made of anything from a few to a few thousand amino acids linked in this fashion. The two unbonded amino and carboxyl groups left after the formation of the protein are known as the $N$ and $C$ termini respectively. The central $C_\alpha$ and its adjacent $N$ and $C$ atoms of each amino acid residue define the protein main chain or backbone.
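The condensation bookkeeping described above can be made concrete with a short sketch. It assumes a linear, unbranched chain, so that $n$ residues are joined by $n - 1$ peptide bonds and release $n - 1$ water molecules. The function names and the small table of approximate average molecular masses (in daltons) are illustrative assumptions, not part of the original text.

```python
# Approximate average molecular masses of a few FREE amino acids, in daltons.
# Illustrative subset only; values are rounded.
FREE_AMINO_ACID_MASS = {"Gly": 75.07, "Ala": 89.09, "Ser": 105.09}

# Approximate molecular mass of water, in daltons.
WATER_MASS = 18.015


def peptide_bond_count(n_residues):
    """Number of peptide bonds in a linear chain of n_residues amino acids."""
    if n_residues < 1:
        raise ValueError("a chain needs at least one residue")
    return n_residues - 1


def chain_mass(residues):
    """Approximate mass of a linear polypeptide.

    Sums the free amino acid masses and subtracts one water molecule
    for every peptide bond formed.
    """
    total = sum(FREE_AMINO_ACID_MASS[name] for name in residues)
    return total - peptide_bond_count(len(residues)) * WATER_MASS
```

For example, `chain_mass(["Gly", "Ala"])` subtracts the mass of a single water molecule from the sum of the two free amino acid masses, since a dipeptide contains exactly one peptide bond.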
Protein structure may be described at four levels. The order in which
the amino acid residues appear in the polypeptide chain is known as the primary structure. The residues are numbered
beginning at the N-terminus. The local conformation of the amino acid residues along the chain is the protein's
secondary structure.
Two types of secondary structure that recur frequently in globular proteins are the $\alpha$-helix and the $\beta$-pleated sheet. Regions of the sequence where the main chain atoms zig-zag form $\beta$-sheet. The $\beta$ conformation is supported by hydrogen bonding between $N{-}H$ and $C{=}O$ groups of adjacent strands of the polypeptide chain.
In the $\alpha$-helix structure the main chain atoms form a helical chain with 3.6 residues per turn. Hydrogen bonding occurs between $N{-}H$ and $C{=}O$ groups of amino acids which are four residues apart in the sequence. The main chain conformation of amino acid residues
can be described by two parameters per residue, the $\phi$ and $\psi$ dihedral angles. The $\phi$ and $\psi$ angles are defined as clockwise rotations about the $N{-}C_\alpha$ and $C_\alpha{-}C$ bonds respectively, looking along these bonds away from the $C_\alpha$ atom. The $\phi = \psi = 180^\circ$ conformation is defined as that in which all main chain atoms are coplanar, as shown in the figure below (reproduced from [18], Fig. 2.8(b)).
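To make the dihedral angle definition concrete, the following sketch computes a signed dihedral angle from four atomic positions, with clockwise rotation taken as positive. It assumes the commonly used atom ordering, which is not spelled out in the text: $\phi$ of residue $i$ from $C(i-1)$, $N(i)$, $C_\alpha(i)$, $C(i)$, and $\psi$ from $N(i)$, $C_\alpha(i)$, $C(i)$, $N(i+1)$. All function names are hypothetical, and coordinates are plain $(x, y, z)$ tuples.

```python
import math


def _sub(a, b):
    """Vector a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Scalar product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond.

    Positive values correspond to a clockwise rotation of the far bond
    relative to the near bond when viewed along p1 -> p2.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane of p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane of p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """phi of a residue: rotation about its N-CA bond (assumed atom order)."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """psi of a residue: rotation about its CA-C bond (assumed atom order)."""
    return dihedral(n, ca, c, n_next)
```

With four coplanar points arranged in a zig-zag (trans) fashion, `dihedral` returns an angle of magnitude $180^\circ$, which corresponds to the fully extended, coplanar conformation described above.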
The double bond of the $C{=}O$ group of a single amino acid is delocalised into the $C{-}N$ bond, giving each of these bonds a partial double bond character. This has the effect of restricting the $C_\alpha$, $C$, $O$ and $N$ atoms to lie in a plane. When polypeptide chains are formed the planarity of these atoms is retained, although deviations from planarity amounting to a few degrees may be observed in regions of a protein where the nature of the polypeptide fold has introduced strain into the molecule.
Figure: The $\phi$-$\psi$ dihedral angles defining backbone conformation. The conformation shown is $\phi = \psi = 180^\circ$, where all the main chain atoms lie in the same plane. The structure of a complete protein molecule may thus be described using only two parameters per residue.
The overall structure of a protein molecule is determined by the way in which the individual regions of $\alpha$-helix and $\beta$-sheet fold up. The fold of a protein is its tertiary structure. The tertiary structure is determined and held together by a number of weak interactions: van der Waals interactions, hydrogen bonds, hydrophobic interactions and salt bridges.
Additional stability is sometimes achieved by the presence of covalent bonds between the sulphur atoms of sulphur-containing cysteine (Cys) residues which lie at different points in the amino acid sequence but are close together in space. This type of bond is known as a disulphide bridge.
Some proteins may be made up from a number of different polypeptide chains. The arrangement and overall structure
of these chains to form a complete protein molecule is its quaternary structure.