After the correct Miller indices have been assigned and the integrated intensities and their estimated errors have been evaluated for the complete series of oscillation images, each image must be placed on a common relative scale. This scaling removes systematic errors introduced into the data by effects such as absorption arising from irregular crystal shape, non-linearity in the monitoring of the incident beam intensity by the ionisation chambers, and changes in the average diffracted intensity due to variation in the total diffracting volume of the crystal, arising for example when part of the crystal moves into or out of the incident beam. Thermal motion of the atoms within the structure causes an increase in the fall-off of intensity with increasing scattering angle. This is conveniently represented by an average temperature factor specific to each diffraction image. Additional effects such as radiation damage contribute to this reduction in intensity as a function of resolution and are also absorbed by the image temperature factor. The program used for this purpose was SCALEPACK [90].

Measured diffraction intensities are generally on an arbitrary scale, and no attempt is made at this stage to place the data on an absolute scale, since this requires knowledge of the contents of the unit cell. For the $i$th diffraction image we define a scale factor $k_i$ and a pseudo-temperature factor $B_i$, which are measured relative to a reference image for which $k = 1$ and $B = 0$.

The average intensity of reflections on an image may be written as the average of the squares of the structure factor for each reflection on that image such that

The last term on the right hand side is essentially constant for all images when the number of reflections is large, which is typically the case for macromolecular structures. The factor $K$ scales the intensity between the measured arbitrary scale and the absolute scale and is also the same for every image.

We thus define a correction factor $G_i$ where

which must be applied to each image relative to our reference to bring them onto a common scale.
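As an illustration of how such a per-image correction might be evaluated, the sketch below assumes the common convention in which intensities on image $i$ are attenuated by $k_i \exp(-2 B_i \sin^2\theta/\lambda^2)$ relative to the reference image, so that the correction is the reciprocal of that quantity. The function name, the argument names and the sign convention are assumptions made for this example and are not taken from SCALEPACK.

```python
import math

def correction_factor(k_i, b_i, sin_theta_over_lambda):
    """Multiplicative correction that brings an intensity measured on image i
    onto the scale of the reference image (k = 1, B = 0).

    Assumed model: I_measured = k_i * exp(-2 * B_i * s**2) * I_reference,
    with s = sin(theta) / lambda in 1/Angstrom and B_i in Angstrom**2.
    """
    s2 = sin_theta_over_lambda ** 2
    return math.exp(2.0 * b_i * s2) / k_i

# The reference image is left unchanged by construction.
reference = correction_factor(1.0, 0.0, 0.25)

# A hypothetical image with half the scale and some extra fall-off:
# the correction grows with resolution.
low_res = correction_factor(0.5, 2.0, 0.05)
high_res = correction_factor(0.5, 2.0, 0.25)
```

Under this convention a positive $B_i$ boosts high resolution reflections more than low resolution ones, which is the behaviour expected when compensating for radiation damage.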

Refinement of the scales and merging of redundant reflections (i.e. those reflections which are equivalent by symmetry and therefore have equal intensities) is done as follows. During the $n$th cycle of refinement, redundant reflections are multiplied by the correction factor $G_i$ and then merged by calculating the weighted average

where the weight $w_j$ is given by

$I_j$ is the measured intensity of the $j$th redundant reflection and $\sigma_j$ is its error. The scale factor $E_1$ is dependent on the gain of the particular detector in use. For example, in imaging plate detectors the photo-multiplier/ADC gain is generally not set to produce one ADC unit per incident photon on the plate. The value of $E_1$ accounts for this. $E_2$ is a scale factor which takes account of the presence of systematic error in the data and may vary as a function of resolution. Both these values are input parameters to the program and are adjusted so as to achieve a $\chi^2$ value of unity, where $\chi^2$ is again defined as the ratio of the actual observed variance in the data to that predicted by the error model through the values $E_1$ and $E_2$. This ensures that correct estimates of the errors in the data have been made.
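The interplay between the error model and the goodness of fit can be sketched as follows. The sketch assumes a two-parameter error model of the form $(E_1 \sigma_j)^2 + (E_2 I_j)^2$ for the adjusted variance and takes the weight as its reciprocal. The function names, this functional form and the use of a reduced $\chi^2$ with $n - 1$ degrees of freedom are assumptions for illustration, not a description of the program's internals.

```python
def adjusted_variance(intensity, sigma, e1, e2):
    """Variance under the assumed two-parameter error model.

    e1 rescales the input error estimate; e2 adds a term proportional
    to the intensity to represent systematic error.
    """
    return (e1 * sigma) ** 2 + (e2 * intensity) ** 2

def merge(observations, e1=1.0, e2=0.0):
    """Merge redundant measurements of one unique reflection.

    observations: list of (scaled intensity, sigma), at least two entries.
    Returns (weighted mean, reduced chi-square).
    """
    weights = [1.0 / adjusted_variance(i, s, e1, e2) for i, s in observations]
    total_w = sum(weights)
    mean = sum(w * i for w, (i, _) in zip(weights, observations)) / total_w
    chi2 = sum(w * (i - mean) ** 2 for w, (i, _) in zip(weights, observations))
    return mean, chi2 / (len(observations) - 1)

# Three hypothetical measurements whose scatter exceeds their quoted errors.
obs = [(100.0, 5.0), (110.0, 5.0), (90.0, 5.0)]
mean_raw, chi2_raw = merge(obs)          # errors taken at face value
mean_adj, chi2_adj = merge(obs, e1=2.0)  # errors inflated by a factor of two
```

In this toy case the quoted errors are too small by a factor of two, so doubling `e1` brings the reduced $\chi^2$ from about 4 down to about 1 while leaving the merged intensity unchanged.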

Values of the scale factor $k_i$ and temperature factor $B_i$ for each image other than the reference are adjusted during each cycle so as to minimise the function

The values of are then recalculated and the cycle repeated until the shift in its value is less than 0.01 from cycle to cycle. The error assigned to each of the scaled intensities is that given by Eq. after the last cycle of refinement.
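The alternation between merging and scale adjustment can be illustrated with a deliberately simplified sketch. It refines only one multiplicative correction per image (temperature factors are held at zero) and assumes the target function $\sum w\,(g_i I - \langle I \rangle)^2$ with $w = 1/\sigma^2$. It stops when the largest shift in any correction factor falls below 0.01. The function name, the target function and the choice of quantity monitored for convergence are all assumptions, and this is not SCALEPACK's algorithm.

```python
def refine_scales(observations, n_images, reference=0, tol=0.01, max_cycles=50):
    """Alternating least-squares refinement of per-image correction factors.

    observations: list of (image index, reflection id, intensity, sigma).
    Returns a list g such that g[i] * intensity is on the reference scale.
    """
    g = [1.0] * n_images
    for _ in range(max_cycles):
        # Merge step: weighted mean of corrected intensities per reflection.
        num, den = {}, {}
        for img, hkl, inten, sig in observations:
            w = 1.0 / sig ** 2
            num[hkl] = num.get(hkl, 0.0) + w * g[img] * inten
            den[hkl] = den.get(hkl, 0.0) + w
        mean = {hkl: num[hkl] / den[hkl] for hkl in num}

        # Scale step: least-squares correction for each non-reference image.
        top = [0.0] * n_images
        bottom = [0.0] * n_images
        for img, hkl, inten, sig in observations:
            w = 1.0 / sig ** 2
            top[img] += w * inten * mean[hkl]
            bottom[img] += w * inten ** 2
        new_g = [1.0 if i == reference or bottom[i] == 0.0 else top[i] / bottom[i]
                 for i in range(n_images)]

        shift = max(abs(a - b) for a, b in zip(new_g, g))
        g = new_g
        if shift < tol:
            break
    return g

# Noise-free toy data: image 1 records half and image 2 twice the true intensity.
true_intensity = {"a": 100.0, "b": 200.0, "c": 50.0}
image_scale = [1.0, 0.5, 2.0]
observations = [(img, hkl, image_scale[img] * value, 1.0)
                for img in range(3) for hkl, value in true_intensity.items()]
factors = refine_scales(observations, n_images=3)
```

With the reference image fixed at unity, the refined corrections for the other two images approach 2 and 0.5 respectively. The residual discrepancy reflects the loose convergence threshold of 0.01.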

At this stage of the data analysis it is possible to obtain a quantitative idea of which reflections can be clearly defined as statistical outliers from the mean. The greater the redundancy the easier it is to pinpoint such outlier reflections. During the scaling and merging procedure a list of suggested outlier reflections is produced and these may be rejected from the following cycles of refinement.

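The weighted average intensity of symmetry equivalent reflections is then calculated along with its error by

$$\langle I \rangle = \frac{\sum_j w_j I_j}{\sum_j w_j}, \qquad \sigma(\langle I \rangle) = \left( \sum_j w_j \right)^{-1/2}$$

where $w_j = 1/\sigma_j^2$ and the sum is over symmetry equivalents. The quality of the data may be assessed at this stage by the use of a merging R-factor, which in the case of SCALEPACK is given by

$$R_{\mathrm{merge}} = \frac{\sum_{hkl} \sum_j \left| I_j - \langle I \rangle \right|}{\sum_{hkl} \sum_j I_j}$$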

where all reflections with only one observation are not included in the calculation and the average $\langle I \rangle$ excludes the $j$th redundant measurement. $R_{\mathrm{merge}}$ is normally calculated in twenty or so resolution bins and is therefore useful for indicating the resolution range over which the data are still useful. Typical values of $R_{\mathrm{merge}}$ are lowest for low resolution data and rise towards the high resolution diffraction limit. Overall values should be of the order of a few percent.
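A merging R-factor of this kind can be sketched in a few lines. The version below compares each measurement with the unweighted mean of the other measurements of the same reflection and skips reflections observed only once. The function name, the data layout and the use of an unweighted mean are assumptions made for the example.

```python
def r_merge(groups):
    """Merging R-factor over groups of symmetry equivalent measurements.

    groups: dict mapping a unique reflection to its list of scaled intensities.
    Each measurement is compared with the mean of the other measurements of
    the same reflection; reflections with a single observation are skipped.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        n = len(intensities)
        if n < 2:
            continue
        total = sum(intensities)
        for value in intensities:
            mean_of_others = (total - value) / (n - 1)
            numerator += abs(value - mean_of_others)
            denominator += value
    return numerator / denominator

# Hypothetical data: one doubly, one singly and one triply measured reflection.
groups = {
    (1, 0, 0): [100.0, 110.0],
    (2, 0, 0): [50.0],
    (0, 1, 0): [200.0, 190.0, 210.0],
}
r = r_merge(groups)
```

To obtain the value as a function of resolution, the same function would be applied separately to the reflections falling in each resolution bin.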

The rejection of outlier reflections is based upon an F-test. Assume we have two sets of measurements each having values normally distributed about their means. The F-test provides a quantitative method of assessing the level of belief in the hypothesis that the variances of two such distributions are equal. In the case where we have a number of experimental measurements of the same quantity, the statistical error of each single measurement is in turn subjected to an F-test against the variance in the mean of the other measurements of that quantity. If the two quantities are significantly different from one another then the measurement may be considered for rejection. The test is implemented so as to provide a percentage probability that a reflection is an outlier or not.

Once merging of the intensity data has been performed, one of two procedures may be adopted for the calculation of structure factors, depending on which software is to be used. Bijvoet pairs can either be treated separately, which gives $|F^{+}|$ and $|F^{-}|$ for each measurement (acentric) or $|F|$ (centric), or they can be merged and a Bijvoet difference calculated, giving values for $|\bar{F}|$ and, where appropriate, $\Delta F$, where $|\bar{F}|$ and $\Delta F$ are given respectively by Eqs. and .

One problem which arises during the calculation of structure factors by $|F| = \sqrt{I}$ is that there is a finite chance that the integrated intensity of a very weak reflection will be negative. This arises from the fitting of the background around a reflection profile where the slope of the background is irregular. It may be dealt with in one of three ways: negative intensity measurements may be ignored, set to zero, or modified so as to make them positive but still meaningful. The first two possibilities have been shown to introduce bias into the data [55]. The third uses prior knowledge of the theoretical distribution of X-ray intensities, the Wilson distribution [117], to modify the observed intensity distribution using a Bayesian approach [35]. This method has been implemented in the CCP4 program TRUNCATE. The approach has little effect on measurements where (acentric) or (centric).
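The idea behind the Bayesian treatment can be illustrated for the acentric case. If the prior on the true intensity $J \ge 0$ is the acentric Wilson distribution, proportional to $\exp(-J/\Sigma)$, and the measurement error is Gaussian, the posterior is a normal distribution truncated at zero, whose mean has a closed form. The sketch below evaluates that posterior mean. The function and argument names are assumptions, the local mean intensity $\Sigma$ is taken as given, and this is not the TRUNCATE implementation, which also derives amplitude estimates and handles centric reflections.

```python
import math

def posterior_mean_intensity(i_obs, sigma, mean_intensity):
    """Posterior mean of the true intensity J >= 0 for an acentric reflection.

    Prior (acentric Wilson distribution): p(J) proportional to exp(-J / mean_intensity).
    Likelihood: i_obs is normally distributed about J with standard deviation sigma.
    The posterior is a normal distribution truncated at zero, with location
    mu = i_obs - sigma**2 / mean_intensity and width sigma.
    Not guarded against underflow for extremely negative i_obs / sigma.
    """
    mu = i_obs - sigma ** 2 / mean_intensity
    z = mu / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return mu + sigma * pdf / cdf

# A weak reflection measured as slightly negative becomes a small positive value.
weak = posterior_mean_intensity(-1.0, 1.0, 10.0)

# A strong, well measured reflection is left essentially unchanged.
strong = posterior_mean_intensity(100.0, 1.0, 100.0)
```

Note that the square root of this posterior mean intensity is not the posterior mean amplitude. An amplitude estimate requires a separate expectation over the same posterior.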

Fri Oct 7 15:42:16 MET 1994