All model refinement was performed using the Automated Refinement Procedure (ARP) developed by Lamzin
and Wilson [74]. The procedure combines existing least-squares
refinement techniques with `bad atom' rejection criteria based on a
test. The program automatically
detects where new atoms should be and places them there. Since the complete lysozyme model was already available,
the procedure was principally used to locate and refine the positions and B-factors of solvent water molecules.
Although ARP may be applied to the refinement of incomplete protein models, it will be discussed here only in
terms of its use in adding ordered solvent.
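ARP has three main steps:
restrained least-squares refinement of the model against the observed X-ray data,
automated evaluation of electron density maps to remove and add solvent atoms,
and
real-space refinement of the solvent molecules.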
Least-squares refinement using X-ray data with a maximum resolution of 2.0 Å or worse must generally
be accompanied by the application of stereochemical restraints, since at these lower resolutions the
number of parameters to be refined can sometimes exceed the number of observables. The inclusion
of restraints in the refinement complements the X-ray data by effectively increasing the number of
observables. The three programs used in the least-squares minimisation stages
were SFALL [1] [108], PROTIN [78]
and PROLSQ [51] [69].
The restraints take the form of a library of stereochemical data which contains a collection
of target bond lengths and bond angles established by small-molecule studies of the twenty amino acids and
high-resolution studies of small peptide structures and proteins [30].
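The function $\Phi$
to be minimised in restrained least-squares refinement is made up of a diffraction
term $\Phi_{\mathrm{diff}}$, plus restraining terms for bond distances $\Phi_{\mathrm{dist}}$, planarity of the atoms
in ring molecules such as histidine and phenylalanine, close contact restraints for non-bonded atoms, chirality and
torsion angle restraints and restraints on the isotropic Debye-Waller factors or B-factors.
The diffraction term is defined as
\[
\Phi_{\mathrm{diff}} = \sum_{hkl} \frac{1}{\sigma_F^2} \left( |F_o| - |F_c| \right)^2
\]
where $|F_o|$ and $|F_c|$ are the observed and calculated structure factor amplitudes,
$\sigma_F^2$ is the variance of $|F_o|$
and the sum is taken over all reflections.
The other restraint terms take a similar form. For example
\[
\Phi_{\mathrm{dist}} = \sum_{\mathrm{bonds}} \frac{1}{\sigma_d^2} \left( d^{\mathrm{ideal}} - d^{\mathrm{model}} \right)^2
\]
where $d^{\mathrm{model}}$
are the bond distances in the protein model, $d^{\mathrm{ideal}}$
are bond distances recommended
in the restraints library and $\sigma_d^2$ is the variance assigned to the bond distance restraint. The sum is taken over all bonds within the structure.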
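As an illustration of how a restrained target of this kind can be evaluated, the following minimal Python sketch sums a diffraction term and a bond-distance term, assuming both are variance-weighted sums of squared differences. The function names, the input layout and the default `sigma_d` value are assumptions made for the example; this is not the interface or the weighting scheme of PROLSQ.

```python
from math import dist


def diffraction_term(f_obs, f_calc, sigma_f):
    """Sum over reflections of ((|Fo| - |Fc|) / sigma_F)^2."""
    return sum(((fo - fc) / s) ** 2 for fo, fc, s in zip(f_obs, f_calc, sigma_f))


def bond_term(coords, bonds, sigma_d=0.02):
    """Sum over bonds of ((d_ideal - d_model) / sigma_d)^2.

    coords: dict mapping atom name -> (x, y, z) in Angstrom
    bonds:  list of (atom_a, atom_b, ideal_distance) tuples
    sigma_d: assumed standard deviation of the restraint (illustrative value)
    """
    total = 0.0
    for atom_a, atom_b, d_ideal in bonds:
        d_model = dist(coords[atom_a], coords[atom_b])
        total += ((d_ideal - d_model) / sigma_d) ** 2
    return total


def restrained_target(f_obs, f_calc, sigma_f, coords, bonds, sigma_d=0.02):
    """Diffraction term plus bond-distance restraints (other restraint terms omitted)."""
    return diffraction_term(f_obs, f_calc, sigma_f) + bond_term(coords, bonds, sigma_d)


if __name__ == "__main__":
    # Toy numbers chosen only to exercise the functions.
    coords = {"N": (0.0, 0.0, 0.0), "CA": (1.5, 0.0, 0.0)}
    bonds = [("N", "CA", 1.46)]
    print(restrained_target([10.0, 20.0], [9.0, 22.0], [1.0, 2.0], coords, bonds))
```

Because the restraint sum enters the target in the same way as the reflection sum, each restrained bond behaves like one additional observation, which is the sense in which restraints increase the effective number of observables.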
The calculation and automated evaluation of the $(3F_o - 2F_c)$
and $(F_o - F_c)$
electron density maps, using
the calculated phases from the model, form the basis of the ARP procedure. The maps are used to judge whether solvent
molecules should be removed from or included into the model. Solvent molecules in the model which sit in
weak
density (
rms density) are good candidates for removal due to the weak
relative contribution to that point in the density coming from the observed structure factors. During each ARP
cycle a maximum of four solvent molecules were removed in this way. Similarly a maximum of four molecules were
included at positions corresponding to large positive density (
rms density) in the
difference map. Additional criteria such as water peak sphericity and nearest neighbour distances are used as
measures of the integrity of the solvent structure.
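The selection logic described above can be sketched roughly as follows. This is a simplified illustration, not the ARP implementation: the cut-offs `low_cut` and `high_cut` (in units of the map rms) and the distance limits `min_dist` and `max_dist` are hypothetical parameters that the caller must supply, the sphericity criterion is not modelled, and only the limit of four changes per cycle is taken from the text.

```python
from math import dist

MAX_CHANGES = 4  # at most four removals and four additions per cycle (from the text)


def select_removals(density_at, rms, low_cut):
    """Return up to MAX_CHANGES water names sitting in the weakest density.

    density_at: dict mapping water name -> map density at the atom centre
    low_cut:    hypothetical threshold, in multiples of the map rms
    """
    weak = [name for name, rho in density_at.items() if rho < low_cut * rms]
    weak.sort(key=lambda name: density_at[name])  # weakest density first
    return weak[:MAX_CHANGES]


def select_additions(peaks, atoms, rms, high_cut, min_dist, max_dist):
    """Return up to MAX_CHANGES difference-map peak positions accepted as new waters.

    peaks: list of ((x, y, z), height) difference-map peaks
    atoms: non-empty list of (x, y, z) positions already in the model
    A peak is accepted if it is high enough and its nearest neighbour lies
    within [min_dist, max_dist]; both limits are hypothetical inputs.
    """
    chosen = []
    for pos, height in sorted(peaks, key=lambda p: p[1], reverse=True):
        if height < high_cut * rms:
            break  # remaining peaks are lower still
        nearest = min(dist(pos, other) for other in atoms + chosen)
        if min_dist <= nearest <= max_dist:
            chosen.append(pos)
        if len(chosen) == MAX_CHANGES:
            break
    return chosen
```

Checking each candidate against the waters already accepted in the same cycle, as well as against the existing model, prevents two new molecules from being placed on top of one another.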
Finally in the ARP cycle real space refinement is applied to the water molecules where the function minimised is

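The agreement between the protein model and the experimental data is quantified by the R-factor ($R$),
defined as
\[
R = \frac{\sum_{hkl} \bigl|\, |F_o| - |F_c| \,\bigr|}{\sum_{hkl} |F_o|}
\]
where $|F_o|$ and $|F_c|$ are the observed and calculated structure factor amplitudes and the sums are taken over all reflections.
Typical values of $R$
for a refined protein structure are less than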
but rarely less than
.
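For concreteness, a short Python sketch of the R-factor calculation is given here, assuming the conventional definition $R = \sum \bigl||F_o| - |F_c|\bigr| / \sum |F_o|$ with both sums over the same set of reflections. The function name and the plain-list input are illustrative choices, and any scaling between observed and calculated amplitudes is assumed to have been applied beforehand.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor for paired amplitude lists."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


if __name__ == "__main__":
    # Toy amplitudes: |differences| sum to 20, observed amplitudes sum to 175.
    print(r_factor([100.0, 50.0, 25.0], [90.0, 55.0, 30.0]))
```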