All model refinement was performed using the Automated Refinement Procedure (ARP) developed by Lamzin and Wilson. The procedure combines existing least-squares refinement techniques with 'bad atom' rejection criteria based on a test. The program automatically detects where new atoms should be located and places them there. Since the complete lysozyme model was already available, the procedure was used principally to locate and refine the positions and B-factors of solvent water molecules. Although ARP may be applied to the refinement of incomplete protein models, it is discussed here only in terms of its use in adding ordered solvent.
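ARP has three main steps: restrained least-squares refinement, calculation and automated evaluation of electron density maps to update the solvent model, and real space refinement of the water molecules.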
Least-squares refinement using X-ray data with a maximum resolution of 2.0Å or worse must generally be accompanied by the application of stereochemical restraints, since at these lower resolutions the number of parameters to be refined can exceed the number of observables. The inclusion of restraints complements the X-ray data by effectively increasing the number of observables. The three programs used in the least-squares minimisation stages were SFALL, PROTIN and PROLSQ. The restraints take the form of a library of stereochemical data containing target bond lengths and bond angles established by small molecule studies of the twenty amino acids and by high resolution studies of small peptide structures and proteins. The function to be minimised in restrained least-squares refinement is made up of a diffraction term plus restraining terms for bond distances, planarity of the atoms in rings such as those of histidine and phenylalanine, close contacts between non-bonded atoms, chirality, torsion angles, and the isotropic Debye-Waller factors or B-factors. The diffraction term is defined as
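\[
\Phi_{\mathrm{diff}} = \sum_{hkl} \frac{1}{\sigma_F^2} \left( |F_o| - |F_c| \right)^2
\]
where $\sigma_F^2$ is the variance of $|F_o|$ and the sum is taken over all reflections. The other restraint terms take a similar form. For example, the bond distance term is
\[
\Phi_{\mathrm{dist}} = \sum_{\mathrm{bonds}} \frac{1}{\sigma_d^2} \left( d^{\mathrm{ideal}} - d^{\mathrm{model}} \right)^2
\]
where $d^{\mathrm{model}}$ are the bond distances in the protein model, $d^{\mathrm{ideal}}$ are the bond distances recommended in the restraints library, and $\sigma_d^2$ is the variance assigned to the distance restraint. The sum is taken over all bonds within the structure.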
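As a minimal sketch of how such a restrained target can be evaluated, the Python below sums a variance-weighted diffraction term and a variance-weighted bond distance term. It is not the PROLSQ implementation: the function names, the argument layout and the restriction to a single restraint type are assumptions made for illustration.

```python
import math


def diffraction_term(f_obs, f_calc, sigma_f):
    """Sum over reflections of ((|Fo| - |Fc|) / sigma)^2."""
    return sum(((fo - fc) / s) ** 2 for fo, fc, s in zip(f_obs, f_calc, sigma_f))


def bond_distance_term(coords, bonds, ideal, sigma_d):
    """Sum over bonds of ((d_ideal - d_model) / sigma)^2.

    coords : list of (x, y, z) atom positions
    bonds  : list of (i, j) index pairs into coords (hypothetical layout)
    ideal  : library target length for each bond
    sigma_d: standard deviation assigned to each distance restraint
    """
    total = 0.0
    for (i, j), d0, s in zip(bonds, ideal, sigma_d):
        d_model = math.dist(coords[i], coords[j])
        total += ((d0 - d_model) / s) ** 2
    return total


def restrained_target(f_obs, f_calc, sigma_f, coords, bonds, ideal, sigma_d):
    """Diffraction term plus one restraint term; other restraints would add similarly."""
    return (diffraction_term(f_obs, f_calc, sigma_f)
            + bond_distance_term(coords, bonds, ideal, sigma_d))
```

Because each restraint enters as one more squared residual, every bond adds an effective observation to the minimisation, which is how the restraints compensate for limited diffraction data.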
The calculation and automated evaluation of two electron density maps, one of which is the difference map, using the calculated phases from the model form the basis of the ARP procedure. The maps are used to judge whether solvent molecules should be removed from or added to the model. Solvent molecules in the model that sit in weak density (below a threshold expressed in units of the rms density) are good candidates for removal, because the observed structure factors contribute little density at those positions. During each ARP cycle a maximum of four solvent molecules were removed in this way. Similarly, a maximum of four molecules were added at positions corresponding to large positive density (above a threshold expressed in units of the rms density) in the difference map. Additional criteria, such as water peak sphericity and nearest neighbour distances, were used as measures of the integrity of the solvent structure.
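The selection logic of one solvent update can be sketched as follows. This is an illustration under stated assumptions, not the ARP code: the function name, the density thresholds `remove_below` and `add_above`, and the distance limits `min_dist` and `max_dist` are hypothetical parameters supplied by the caller, densities and peak heights are assumed to have been sampled from the maps beforehand, and the sphericity test is omitted.

```python
import math

MAX_CHANGES_PER_CYCLE = 4  # cap on removals and on additions, as stated in the text


def update_solvent(waters, water_density, protein_atoms, diff_peaks,
                   remove_below, add_above, min_dist, max_dist):
    """Return (new_waters, removed, added) for one update cycle.

    waters        : list of (x, y, z) water positions
    water_density : map density at each water position, same order as waters
    protein_atoms : list of (x, y, z) positions of non-solvent atoms
    diff_peaks    : list of ((x, y, z), height) peaks from the difference map
    """
    # Removal: weakest-density waters first, at most four per cycle.
    weak = [i for i, rho in enumerate(water_density) if rho < remove_below]
    weak.sort(key=lambda i: water_density[i])
    weak = weak[:MAX_CHANGES_PER_CYCLE]
    drop = set(weak)
    removed = [waters[i] for i in weak]
    kept = [w for i, w in enumerate(waters) if i not in drop]

    # Addition: strongest difference-map peaks first, at most four per cycle.
    added = []
    for pos, height in sorted(diff_peaks, key=lambda p: p[1], reverse=True):
        if len(added) == MAX_CHANGES_PER_CYCLE or height <= add_above:
            break
        neighbours = list(protein_atoms) + kept + added
        nearest = min((math.dist(pos, q) for q in neighbours), default=math.inf)
        # Nearest-neighbour criterion: neither clashing nor isolated.
        if min_dist <= nearest <= max_dist:
            added.append(pos)

    return kept + added, removed, added
```

Capping the number of changes per cycle keeps each update small, so that the following refinement cycle can recompute phases before further solvent is judged.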
As the final step of the ARP cycle, real space refinement is applied to the water molecules, where the function minimised is
The agreement between the protein model and the experimental data is quantified by the R-factor ($R$), defined as
\[
R = \frac{\sum_{hkl} \left| \, |F_o| - |F_c| \, \right|}{\sum_{hkl} |F_o|}
\]
Typical values of $R$ for a refined protein structure are less than but rarely less than .
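The conventional crystallographic R-factor is the sum of the absolute differences between observed and calculated structure factor amplitudes, divided by the sum of the observed amplitudes. A minimal Python sketch follows; the function name is illustrative, and the calculated amplitudes are assumed to have already been placed on the same scale as the observed ones.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor: sum| |Fo| - |Fc| | / sum|Fo| over all reflections.

    Assumes f_calc has already been scaled to f_obs.
    """
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator
```

For example, observed amplitudes of 100, 50 and 50 with calculated amplitudes of 90, 55 and 45 give a numerator of 20 and a denominator of 200, so the function returns 0.1.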