Treatment of errors in the classical approach to MAD phasing.




 

In reality, phase circles like those drawn for the classical approach will not intersect exactly at one point, because the diffraction data contain experimental error. The following errors affect the phasing procedure:

  1. Statistical and systematic errors in the measurement and scaling of the observed intensities.
  2. Errors in the estimation of the heavy atom parameters from Patterson or direct methods, i.e. the heavy atom positions, temperature factors and occupancies.
  3. In the case of the MIR method, errors will be introduced due to lack of isomorphism between the native and derivative structures. In the case of MAD however, where all data are collected from the same protein structure, these errors should be negligible.

Blow and Crick (1959) [16] dealt with only two errors in their mathematical treatment of error in the isomorphous replacement (IR) method: points (2) and (3) in the above list were treated numerically as a single error. They made two further assumptions. Firstly, they assumed that all errors were Gaussian. Secondly, they assumed that all the errors in the measurement of the structure factors and in the calculation of the heavy atom structure factor are contained in the derivative structure factor. The second assumption is reasonable because, for most proteins, the heavy atom structure factor is small in comparison, so the native and derivative structure factors are approximately co-linear. The figure below shows the important parameters for the representation of experimental error in the phasing procedure. The vector $\mathbf{F}_P$ represents the native or reference structure factor and $\mathbf{F}_{PH}$ stands for the derivative structure factor. The calculated heavy atom structure factor is labelled $\mathbf{F}_H$.

  
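Figure: Phase triangles showing the native structure factor, $\mathbf{F}_P$, the derivative structure factor, $\mathbf{F}_{PH}$, the calculated heavy atom structure factor, $\mathbf{F}_H$, and the lack of closure, $\epsilon$, for two differing values of $\phi$, the angle between $\mathbf{F}_P$ and $\mathbf{F}_H$. Note that the phase of the reference structure factor, $\alpha_P$, may at any time be obtained from $\phi$ by $\alpha_P = \phi + \alpha_H$, where $\alpha_H$ is the phase of $\mathbf{F}_H$.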

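The two errors to be considered are the error in the observed measurements, $\delta$, and the combined error in $\mathbf{F}_H$ and the lack of isomorphism, $e$, acting along $\mathbf{F}_{PH}$. Since the errors are Gaussian, and taking them to be independent, they add in quadrature and we can define the total error $E$ in $F_{PH}$ as

$$E^2 = \langle \delta^2 \rangle + \langle e^2 \rangle$$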

 

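The component of the error in $\mathbf{F}_H$ acting perpendicular to $\mathbf{F}_{PH}$ is neglected at this stage, since any uncertainty in this direction in the position of the corresponding vertex of the phase triangle has very little effect on the phase $\phi$. Assuming a Gaussian error distribution with r.m.s. total error $E$, we can write the probability distribution of the phase in terms of the lack of closure $\epsilon(\phi)$ as

$$P(\phi) = N \exp\left(-\frac{\epsilon^2(\phi)}{2E^2}\right)$$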

 

where $N$ is a normalisation factor applied such that

$$\int_0^{2\pi} P(\phi)\,d\phi = 1$$

 

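The relationship between the lack of closure $\epsilon$ and the phase $\phi$ is given by the cosine rule applied to the phase triangle. Writing $F_P$, $F_{PH}$ and $F_H$ for the native, observed derivative and calculated heavy atom amplitudes, we obtain the expression

$$\epsilon(\phi) = F_{PH} - \left(F_P^2 + F_H^2 + 2F_PF_H\cos\phi\right)^{1/2}$$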
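The single-derivative distribution can be sketched numerically as follows. This is a minimal illustration rather than the procedure of any particular phasing program. It assumes that the lack of closure is the observed derivative amplitude minus the amplitude of the vector sum of the native and heavy atom structure factors, and that the distribution is normalised to unit sum over the grid rather than unit integral. The function names, the 5 degree grid and the example amplitudes are hypothetical.

```python
import math


def lack_of_closure(f_p, f_ph_obs, f_h, phi):
    """Observed derivative amplitude minus |F_P + F_H| from the cosine rule.

    phi is the angle (radians) between the native and heavy atom vectors.
    """
    f_ph_calc_sq = f_p ** 2 + f_h ** 2 + 2.0 * f_p * f_h * math.cos(phi)
    # Guard against a tiny negative value caused by rounding.
    return f_ph_obs - math.sqrt(max(f_ph_calc_sq, 0.0))


def phase_probability(f_p, f_ph_obs, f_h, rms_error, n_steps=72):
    """Sample exp(-eps^2 / 2E^2) on a regular grid, normalised to unit sum."""
    phis = [2.0 * math.pi * k / n_steps for k in range(n_steps)]
    weights = [
        math.exp(-lack_of_closure(f_p, f_ph_obs, f_h, phi) ** 2
                 / (2.0 * rms_error ** 2))
        for phi in phis
    ]
    total = sum(weights)
    return phis, [w / total for w in weights]


# Hypothetical amplitudes: the curve is bimodal, peaking near +/-65 degrees.
phis, probs = phase_probability(f_p=100.0, f_ph_obs=110.0, f_h=20.0,
                                rms_error=5.0)
```

With a single derivative the two peaks are of equal height, which is the phase ambiguity that further derivatives or anomalous data are needed to resolve.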

 

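Using the above equations it is possible to calculate a probability distribution for each derivative $j$ with respect to the reference structure factor. Expressed in terms of the common reference phase $\alpha_P$, these may be combined to give a final phase probability curve using

$$P(\alpha_P) = N' \prod_j P_j(\alpha_P) = N' \exp\left(-\sum_j \frac{\epsilon_j^2(\alpha_P)}{2E_j^2}\right)$$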

 

where $N'$ is the normalisation factor for the combined distribution.
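As a rough sketch of this combination step, the individual curves can be multiplied point by point and renormalised, provided they are all sampled on the same grid of reference phases. The function name and the four-point example curves below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def combine_distributions(distributions):
    """Multiply phase distributions sampled on one common grid and renormalise."""
    n_points = len(distributions[0])
    if any(len(d) != n_points for d in distributions):
        raise ValueError("all distributions must share the same phase grid")
    combined = [math.prod(d[k] for d in distributions) for k in range(n_points)]
    total = sum(combined)
    if total == 0.0:
        raise ValueError("distributions have no overlap on this grid")
    return [c / total for c in combined]


# Two hypothetical bimodal curves on a four-point grid that share one mode.
derivative_1 = [0.5, 0.0, 0.5, 0.0]
derivative_2 = [0.5, 0.5, 0.0, 0.0]
combined = combine_distributions([derivative_1, derivative_2])
```

Only the mode common to both curves survives the product, which is how a second derivative resolves the ambiguity left by the first.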

Given a set of phase probability curves for each reflection there still remains a choice of which phase to use in calculating the Fourier transform giving the electron density map. Writing $\alpha$ for the phase of the reference structure factor and $F_P$ for its amplitude, the most probable Fourier would use coefficients $F_P \exp(i\alpha_M)$, where $\alpha_M$ is the phase at which the distribution is at its maximum. However, the phase probability curves are often bimodal. Blow and Crick defined the best Fourier as the Fourier transform that gives the lowest mean square error in the electron density. This corresponds to using the centroid of the phase probability distribution. The best Fourier uses coefficients

$$\mathbf{F}_{best} = F_P \, \frac{\int_0^{2\pi} P(\alpha) \exp(i\alpha)\,d\alpha}{\int_0^{2\pi} P(\alpha)\,d\alpha}$$

Introducing $m$, the figure of merit for a reflection, and $\alpha_{best}$, the centroid phase, we write

$$\mathbf{F}_{best} = m F_P \exp(i\alpha_{best})$$

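In practice the phase probability distribution is calculated at regular intervals $\alpha_i$ of the phase, so that the components of $m$ are given by

$$m\cos\alpha_{best} = \frac{\sum_i P(\alpha_i)\cos\alpha_i}{\sum_i P(\alpha_i)}$$

and

$$m\sin\alpha_{best} = \frac{\sum_i P(\alpha_i)\sin\alpha_i}{\sum_i P(\alpha_i)}$$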

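The figure of merit is in fact equal to the mean of the cosine of the error in the phase angle. This can be seen if we define the error as $\Delta\alpha = \alpha - \alpha_{best}$ and shift the origin such that $\alpha_{best} = 0$. $m$ is then given by [18]

$$m = \frac{\int_0^{2\pi} P(\alpha)\cos(\Delta\alpha)\,d\alpha}{\int_0^{2\pi} P(\alpha)\,d\alpha} = \langle \cos\Delta\alpha \rangle$$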
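The centroid phase and figure of merit of a sampled distribution can be computed in a few lines. The sketch below treats each sampled phase as a unit vector in the complex plane weighted by its probability. The function name and the two-point example distribution are hypothetical.

```python
import cmath
import math


def centroid_phase_and_fom(phases, probabilities):
    """Return (centroid phase in radians, figure of merit) of a sampled curve."""
    total = sum(probabilities)
    centroid = sum(p * cmath.exp(1j * a)
                   for a, p in zip(phases, probabilities)) / total
    # The argument of the centroid is the best phase, its modulus the weight m.
    return cmath.phase(centroid), abs(centroid)


# Hypothetical bimodal distribution with equal peaks at +/-60 degrees.
alpha_best, m = centroid_phase_and_fom([math.pi / 3, -math.pi / 3], [0.5, 0.5])
```

For this example the centroid lies midway between the two peaks with a figure of merit of 0.5, so a bimodal reflection enters the best Fourier with a reduced weight. For a nearly uniform distribution the figure of merit approaches zero and the centroid phase becomes ill defined.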

When anomalous scattering data are present, Blow and Rossmann [17] showed that the probability distribution for a reflection may be written as the product of the two individual distributions for the Bijvoet pair reflections $F_{PH}(+)$ and $F_{PH}(-)$, with corresponding lack of closure values $\epsilon_+$ and $\epsilon_-$, such that

$$P(\phi) = N \exp\left(-\frac{\epsilon_+^2 + \epsilon_-^2}{2E^2}\right)$$

where $E$ now corresponds to the total error in the determination of $F_{PH}(+)$ and $F_{PH}(-)$. North [88] later observed that this method of combining the isomorphous and anomalous differences did not account for the greater inherent accuracy of the anomalous differences: because they are measured from the same crystal sample, they are not affected by non-isomorphism or by errors in the scaling together of data from different crystals.

Matthews [84] showed that, by writing the total lack of closure in a different form, the isomorphous and anomalous contributions to the phasing potential could be separated out. Each could then be assigned its own probability distribution, with a variance reflecting the differing accuracy of the isomorphous and anomalous data. The total lack of closure became

$$\epsilon_+^2 + \epsilon_-^2 = 2\epsilon_{iso}^2 + \tfrac{1}{2}\epsilon_{ano}^2$$

defining $\epsilon_{iso} = (\epsilon_+ + \epsilon_-)/2$ and $\epsilon_{ano} = \epsilon_+ - \epsilon_-$ as the isomorphous and anomalous lack of closures.

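The figure below shows the situation when Bijvoet pairs have been measured. $F_H'$ and $F_H''$ are defined as the real and imaginary parts of the calculated heavy atom structure factor, $F_{PH}$ is the amplitude of $\mathbf{F}_P + \mathbf{F}_H'$, and $\theta$ is the angle between $\mathbf{F}_{PH}$ and the vector $\mathbf{F}_H''$. From the figure we have, to first order in $F_H''/F_{PH}$,

$$F_{PH}^{calc}(+) = F_{PH} + F_H''\cos\theta$$

and

$$F_{PH}^{calc}(-) = F_{PH} - F_H''\cos\theta$$

Figure: Lack of closure errors when anomalous data are present.

If we define the mean observed amplitude $\bar{F}_{PH} = [F_{PH}^{obs}(+) + F_{PH}^{obs}(-)]/2$ and the observed Bijvoet difference $\Delta_{obs} = F_{PH}^{obs}(+) - F_{PH}^{obs}(-)$, we can write

$$\epsilon_{iso} = \bar{F}_{PH} - F_{PH}$$

and

$$\epsilon_{ano} = \Delta_{obs} - 2F_H''\cos\theta$$

Taking $\mathbf{F}_H''$ to be perpendicular to $\mathbf{F}_H'$ and using the sine rule we have

$$\frac{F_P}{\cos\theta} = \frac{F_{PH}}{\sin\phi}$$

and substituting for $\cos\theta$ in the expression for $\epsilon_{ano}$ and rearranging gives

$$\epsilon_{ano} = \Delta_{obs} - \frac{2F_PF_H''\sin\phi}{F_{PH}}$$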

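We can now write down the joint probability distribution for $\phi$ in terms of the isomorphous and anomalous lack of closures as

$$P(\phi) = N \exp\left(-\frac{\epsilon_{iso}^2}{2E_{iso}^2}\right) \exp\left(-\frac{\epsilon_{ano}^2}{2E_{ano}^2}\right)$$

where $E_{iso}$ and $E_{ano}$ are the r.m.s. errors assigned to the isomorphous and anomalous lack of closures.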
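The effect of separating the two contributions can be illustrated with a short numerical sketch. It rests on several assumptions that are not taken from the text: the imaginary part of the heavy atom structure factor is placed at +90 degrees to the real part, the calculated Bijvoet difference is taken to first order as 2 F_P F_H'' sin(phi) / F_PH, and all names and amplitudes are hypothetical.

```python
import math


def iso_ano_lack_of_closure(f_p, f_h_real, f_h_imag, f_plus_obs, f_minus_obs, phi):
    """Isomorphous and anomalous lack of closure at trial angle phi (radians)."""
    # Amplitude of F_P + F_H' from the cosine rule (guarded against rounding).
    f_ph_calc = math.sqrt(max(f_p ** 2 + f_h_real ** 2
                              + 2.0 * f_p * f_h_real * math.cos(phi), 0.0))
    # First-order calculated Bijvoet difference; assumes F_H'' << F_PH.
    bijvoet_calc = 0.0
    if f_ph_calc > 0.0:
        bijvoet_calc = 2.0 * f_p * f_h_imag * math.sin(phi) / f_ph_calc
    eps_iso = 0.5 * (f_plus_obs + f_minus_obs) - f_ph_calc
    eps_ano = (f_plus_obs - f_minus_obs) - bijvoet_calc
    return eps_iso, eps_ano


def joint_probability(f_p, f_h_real, f_h_imag, f_plus_obs, f_minus_obs,
                      e_iso, e_ano, n_steps=72):
    """Product of the two Gaussian terms on a regular grid, normalised to unit sum."""
    weights = []
    for k in range(n_steps):
        phi = 2.0 * math.pi * k / n_steps
        eps_iso, eps_ano = iso_ano_lack_of_closure(
            f_p, f_h_real, f_h_imag, f_plus_obs, f_minus_obs, phi)
        weights.append(math.exp(-eps_iso ** 2 / (2.0 * e_iso ** 2)
                                - eps_ano ** 2 / (2.0 * e_ano ** 2)))
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical Bijvoet pair synthesised for a true angle of 60 degrees.
true_phi = math.pi / 3
f_native = 100.0 * complex(math.cos(true_phi), math.sin(true_phi))
f_plus = abs(f_native + complex(20.0, 4.0))
f_minus = abs(f_native + complex(20.0, -4.0))
probs = joint_probability(100.0, 20.0, 4.0, f_plus, f_minus, e_iso=2.0, e_ano=0.5)
```

The isomorphous term alone is symmetric about zero and cannot distinguish +60 from -60 degrees. The anomalous term, given a smaller error to reflect its higher accuracy, removes the false solution.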

This basic theory is easily applicable to the MAD method, since the data set measured at each X-ray energy produces its own isomorphous and anomalous probability distributions. The reference data set, however, makes no isomorphous contribution to the phasing: it has zero isomorphous lack of closure and an isomorphous probability distribution $P_{iso}(\phi)$ that is unity for all $\phi$.

The overall phase probability distribution is

$$P(\phi) = N \prod_{\lambda} P_{iso}^{\lambda}(\phi)\, P_{ano}^{\lambda}(\phi)$$

where the product over $\lambda$ is taken over all X-ray energies.

Calculation of the probability distributions $P_{iso}$ and $P_{ano}$ requires an estimation of the errors $E_{iso}$ and $E_{ano}$, which include statistical error, errors in the calculation of the heavy atom position, occupancy and temperature factors, and lack of isomorphism. In theory the error in the anomalous differences should have a zero contribution from lack of isomorphism. The statistical error in the structure factors is usually estimated while making the measurements, e.g. from counting statistics. The second source of error, from the heavy atom parameters, is more difficult to estimate. For the isomorphous case Blow and Crick [16] showed that $E_{iso}$ may be determined for centric reflections and then applied to the non-centric reflections.

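For centric reflections $\mathbf{F}_P$, $\mathbf{F}_{PH}$ and $\mathbf{F}_H$ are co-linear, allowing an estimation of the overall r.m.s. error to be obtained from

$$E_{iso}^2 = \left\langle \left( |F_{PH} - F_P| - F_H \right)^2 \right\rangle$$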

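For the centrosymmetric case

$$E_{iso}^2 = \langle \delta^2 \rangle + \langle e^2 \rangle$$

and thus $\langle e^2 \rangle$, the combined mean square error from the heavy atom parameters and the lack of isomorphism, may be estimated once the measurement error $\langle \delta^2 \rangle$ is known.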
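A minimal sketch of the centric estimate follows. It assumes that the residual for each centric reflection takes the form |F_PH - F_P| - F_H, which holds when the native and derivative structure factors have the same sign. The function name and the amplitude triples are hypothetical.

```python
import math


def rms_centric_lack_of_closure(centric_reflections):
    """R.m.s. of |F_PH - F_P| - F_H over centric reflections (amplitudes only)."""
    residuals = [abs(f_ph - f_p) - f_h for f_p, f_ph, f_h in centric_reflections]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Hypothetical (F_P, F_PH, F_H) amplitude triples for three centric reflections.
e_iso = rms_centric_lack_of_closure(
    [(100.0, 112.0, 10.0), (80.0, 71.0, 10.0), (50.0, 60.0, 12.0)])
```

In practice the estimate would normally be made in resolution shells, since both the amplitudes and their errors vary with resolution.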

For anomalous differences the contribution to $E_{ano}$ may be determined by inspecting centric reflections, where the observed Bijvoet difference should in theory be zero. The total r.m.s. deviation from zero gives an estimate of this contribution, which in turn allows $E_{ano}$ to be obtained.






Gwyndaf Evans
Fri Oct 7 15:42:16 MET 1994