Data analysis.




The raw diffraction data produced by the image plate scanner were analysed to give integrated intensities as described elsewhere in this work. The quality of the orientational refinement was assessed by the values output for each image by the program. The values were determined using a value of mm for the model error in the spot position and a model error in the determination of reflection partiality of . The resulting average value for the refinement of all images from all data sets was and rarely exceeded 2.0. This suggested good agreement with the error model. Predicted profiles were calculated from spots within a radius of mm and profile-fitted intensities were measured for approximately 35000 reflections per data set, of which only about 100 were flagged as negative intensity measurements. A low-resolution cutoff of 20 Å was applied during the image processing to avoid including reflections that might have been partially intercepted by the direct-beam backstop, since such reflections would have had systematically smaller intensities.

Image scaling within a data set was done by the program SCALEPACK. Partially recorded reflections were excluded from the scaling procedure, since their use requires accurate knowledge of their fractional partiality so that they may be compared with their redundant equivalents. In practice the errors in the estimates of reflection partiality are sufficiently large to make them unsuitable for inclusion in the scaling procedure. It is normal, however, to add partial reflections from neighbouring images and include them in any subsequent data analysis. The error in the final summed intensity will normally be larger than that for an equivalent fully recorded reflection, since it combines the errors of the two individual measurements, but the summed intensity is not necessarily less accurate. The possibility of systematic error in the measurement of partial reflections does, however, exist. Movement of the crystal between the exposure of neighbouring images can seriously affect the measurement of partial reflections, and also of fully recorded reflections which lie close to the Ewald sphere. For this reason, and given the high redundancy of symmetry-equivalent reflections (about 4), it was thought inappropriate to use summed partial reflections in the subsequent analysis and they were rejected after the scaling.
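The error propagation for a summed partial reflection can be sketched as follows. This is a minimal illustration, assuming the two partial measurements are statistically independent so that their standard deviations add in quadrature; the function name and the numbers are hypothetical and are not taken from SCALEPACK or from the data described here.

```python
import math


def sum_partials(i_first, sigma_first, i_second, sigma_second):
    """Add the two parts of a reflection split across neighbouring images.

    Assumes independent measurement errors, so the standard deviations
    combine in quadrature.
    """
    total = i_first + i_second
    sigma = math.hypot(sigma_first, sigma_second)
    return total, sigma


# Hypothetical reflection recorded partly on one image, partly on the next.
total, sigma = sum_partials(600.0, 30.0, 400.0, 40.0)
print(total, sigma)
```

The combined standard deviation is never smaller than either of the two contributions. Note that this simple model covers only random error: a systematic error such as crystal movement between exposures is not captured by it.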

The values of and which define the model of the error in the data were adjusted until the parameter was close to unity across the whole resolution range. Deviations of from unity of up to on average were accepted. A value of of 1.5 was typically found to be appropriate along with values of ranging from in the lower resolution range up to in the highest resolution bin. Reflections were rejected if the probability of them being outliers was greater than . The refined image scale factors from SCALEPACK are shown in the first figure below as a function of the phi angle bisecting each oscillation range for the seven data sets collected. It can clearly be seen that the crystal centering problem encountered during measurement of the first data set had a severe effect on the average intensity of the reflections over the course of the data collection. The average errors in the calculation of the scale factors ranged between and . The second figure below shows the image B-factors calculated during the scaling procedure. The refined B-values became more negative throughout the data sets, indicating that there was some radiation damage. The errors in the refined B-values ranged between 0.2 and 0.5 Å². The fluctuations in the B-values are of the same order of magnitude as these errors.
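The idea of tuning an error model until the goodness of fit approaches unity can be illustrated with a short sketch. The form of the error model used here (a multiplicative factor on the measured sigma plus a term proportional to the intensity, added in quadrature) is an assumption made for illustration, as are all function names, parameter names and numbers; it is not claimed to reproduce the exact model or statistics used by SCALEPACK.

```python
import math


def adjusted_sigma(intensity, sigma, scale, frac):
    """Assumed error model: sigma'^2 = (scale*sigma)^2 + (frac*I)^2."""
    return math.sqrt((scale * sigma) ** 2 + (frac * intensity) ** 2)


def reduced_chi2(groups, scale=1.0, frac=0.0):
    """Goodness of fit of symmetry-equivalent observations to their mean.

    groups: list of lists of (intensity, sigma) tuples, one inner list
    per unique reflection. Groups with a single observation are skipped.
    """
    chi2 = 0.0
    dof = 0
    for obs in groups:
        if len(obs) < 2:
            continue
        weights = [1.0 / adjusted_sigma(i, s, scale, frac) ** 2 for i, s in obs]
        mean = sum(w * i for w, (i, _) in zip(weights, obs)) / sum(weights)
        chi2 += sum(w * (i - mean) ** 2 for w, (i, _) in zip(weights, obs))
        dof += len(obs) - 1
    if dof == 0:
        raise ValueError("no redundant observations")
    return chi2 / dof


# Hypothetical equivalents of one reflection.
groups = [[(100.0, 10.0), (120.0, 10.0), (110.0, 10.0)]]
print(reduced_chi2(groups, scale=1.0))
print(reduced_chi2(groups, scale=2.0))
```

Increasing either parameter inflates the estimated errors and lowers the statistic, so in such a scheme the parameters are raised when the statistic is well above unity and lowered when it is well below.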

  
Figure: Result of the image scaling procedure applied to the seven data sets from X31 and X11. The calculated scale factors for each image are shown as a function of the phi angle at the centre of each oscillation. All of the images within one data set are scaled to a reference image which in all cases shown is the first image.

  
Figure: The B-factors for each image calculated by SCALEPACK are shown as a function of the phi angle at the centre of each oscillation.

  
Figure: Graph showing the average error as a function of in terms of a percentage of the average intensity for seven derivative data sets. It is clearly visible that data set 6 collected at 0.92Å on the X11 beam line is of a higher quality than the other data sets.

The percentage error in the merged intensities as a function of resolution is shown in the figure above. The plot shows clearly that data sets 1, 3, 4 and 5 are of similar quality, whereas sets 2 and 7 are of a similar but higher quality. Data set 6 has the lowest statistical error values with the average percentage error only reaching at the highest resolution of Å. Data set 6 displays the expected factor of two improvement in the signal-to-noise ratio over data sets 1, 3, 4 and 5. Sets 2 and 7, however, do not behave as expected. The probable cause of the improved signal-to-noise level of data set 2 over the other X31 data sets is the reduced absorption by the iridium atoms in the crystal, since data set 2 was collected below the absorption edge in energy. This effect evidently increased the average intensity of the diffracted X-rays measured on the imaging plate. The poorer observed signal-to-noise ratio of data set 7 relative to set 6 is due to the large X-ray energy difference between the two sets. The increased absorption at the lower X-ray energy where set 7 was collected was not compensated for during the data collection by proportionally longer exposure times.

The merged intensity data were then converted to a form which gave for each acentric reflection values of , , and . Once in this form the program TRUNCATE was used to obtain structure factors and their associated Bijvoet differences and to deal with the negative intensity measurements. Wilson plots for each of the derivative data sets were calculated and allowed estimates to be made of the B-values associated with the data, along with approximate values of the absolute scale of the data sets. The table below summarises the data analysis for all seven derivative data sets plus the native set.
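How a Wilson plot yields an overall B-value and an approximate absolute scale can be sketched as follows. The sketch assumes the usual Wilson relation, ln(<I_obs> / Σf²) = ln K − 2B sin²θ/λ², evaluated in resolution shells and fitted with a straight line; the function names, the shell layout and the synthetic numbers are hypothetical, and this is not the code used by TRUNCATE.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def wilson_fit(shells):
    """Estimate (B, K) from resolution shells.

    shells: list of (d_spacing, mean_intensity, sum_f_squared), where
    sum_f_squared is the sum of squared atomic scattering factors for
    the unit-cell contents at that resolution.
    """
    xs = [1.0 / (4.0 * d * d) for d, _, _ in shells]  # sin^2(theta)/lambda^2
    ys = [math.log(mean_i / sum_f2) for _, mean_i, sum_f2 in shells]
    slope, intercept = fit_line(xs, ys)
    return -slope / 2.0, math.exp(intercept)


# Synthetic shells generated with B = 25 A^2 and K = 0.02 (hypothetical).
true_b, true_k, sum_f2 = 25.0, 0.02, 5000.0
shells = [
    (d, true_k * sum_f2 * math.exp(-2.0 * true_b / (4.0 * d * d)), sum_f2)
    for d in (4.0, 3.5, 3.0, 2.75, 2.5)
]
b_value, scale = wilson_fit(shells)
print(b_value, scale)
```

With real protein data the low-resolution shells usually deviate from the straight line, so the fit is normally restricted to the higher-resolution part of the plot.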

The B-values calculated from the Wilson plots increased on average during the course of data collection on X31 and again later when the X11 data were collected. This demonstrates that the crystal had survived remarkably well during the 8-week period between the two experiments but had nevertheless been susceptible to radiation damage.

  
Table: Results of preliminary data analysis on all seven of the data sets from X31 and X11. The completeness value represents the total percentage of reflections recorded between 20 and 2.5Å (2.25Å for the native set).

Given the quantity of anomalous derivative data collected and the availability of a native set, which would not normally be the case for a true MAD experiment, it was possible to treat the data either as SIRAS data or as conventional MAD data. With this in mind the scaling of the data sets was performed twice, with and without the native data included. The resulting data were dealt with separately so that the effects of non-isomorphism with the native data did not play a role in the analysis of the MAD data alone. In the scaling of the MAD data the reference data set was chosen to be set 4 since this had the most negative value of . This choice was made since it was more convenient to have all isomorphous occupancies positive. Choosing a data set with an intermediate value of would have implied that the isomorphous occupancies of the other data sets would be positive or negative depending on whether their respective values were higher or lower. During the scaling, anisotropic refinement of the data B-values was performed. The restraints of the point group symmetry of the crystal (422) implied that and . The resulting refined B-values showed to be on average double the value of for each of the scaled sets. The results of the scaling are summarised in the table at the end of this section.
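The way point group symmetry restricts an anisotropic B tensor can be checked numerically. The sketch below averages an arbitrary symmetric tensor over the eight rotations of point group 422 (a fourfold axis along z and a twofold axis along x, in a Cartesian frame aligned with the tetragonal cell axes). The averaged tensor has equal xx and yy components and vanishing off-diagonal components, which is the standard result for tetragonal symmetry. The function names and the starting tensor are hypothetical; the notation of the scaling program is not assumed.

```python
def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [list(row) for row in zip(*a)]


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
C4_Z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]    # fourfold rotation about z
C2_X = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]   # twofold rotation about x


def point_group_422():
    """Return the eight rotation matrices of point group 422."""
    ops = []
    rot = IDENTITY
    for _ in range(4):
        ops.append(rot)
        ops.append(matmul(rot, C2_X))
        rot = matmul(rot, C4_Z)
    return ops


def symmetrise(tensor):
    """Average R * tensor * R^T over all operations of the point group."""
    ops = point_group_422()
    total = [[0.0] * 3 for _ in range(3)]
    for rot in ops:
        rotated = matmul(matmul(rot, tensor), transpose(rot))
        for i in range(3):
            for j in range(3):
                total[i][j] += rotated[i][j] / len(ops)
    return total


# Arbitrary symmetric starting tensor (hypothetical values).
b_tensor = [[10.0, 1.0, 2.0], [1.0, 14.0, 3.0], [2.0, 3.0, 30.0]]
for row in symmetrise(b_tensor):
    print(row)
```

Only two independent parameters survive the averaging, one in the plane perpendicular to the fourfold axis and one along it.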

The ratio between the average intensities of the scaled data set and the reference data set was calculated after the scale factors had been applied, to obtain an estimate of the quality of the scaling. Their values (column 4 of the table at the end of this section) are all close to unity, demonstrating that the procedure had been effective in scaling the data. Large deviations from unity would suggest that the program had had difficulties in converging to the correct solution. Also shown are the r.m.s. anomalous differences as a percentage of the r.m.s. structure factor magnitude of the reference data set. The average Bijvoet difference measured for each data set is plotted in the figure below as a function of . Mean values of are on the same relative scale. The Bijvoet differences are significantly larger at lower resolution and gradually drop to a minimum at Å () at which point they begin to rise slowly again as increases. This suggested that pairs or groups of iridium atoms in the protein may have been separated by only a few ångströms, since at lower resolutions such pairs would tend to scatter in phase as though they were one body and thus cause the larger observed anomalous signals. Such effects have also previously been observed by Smith et al. [103].
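The two diagnostics described above, the ratio of mean intensities and the r.m.s. Bijvoet difference expressed as a percentage of the r.m.s. reference structure factor magnitude, can be sketched in a few lines. The function names and the input values are hypothetical, and the sketch assumes the lists are already matched reflection by reflection; it is not the code of the scaling program.

```python
import math


def rms(values):
    """Root-mean-square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def mean_intensity_ratio(i_scaled, i_reference):
    """Ratio of mean intensities after scaling; close to 1 for good scaling."""
    return (sum(i_scaled) / len(i_scaled)) / (sum(i_reference) / len(i_reference))


def bijvoet_ratio_percent(f_plus, f_minus, f_reference):
    """r.m.s. Bijvoet difference as a percentage of r.m.s. reference |F|."""
    diffs = [p - m for p, m in zip(f_plus, f_minus)]
    return 100.0 * rms(diffs) / rms(f_reference)


# Hypothetical matched reflections.
f_plus = [103.0, 97.0]
f_minus = [100.0, 101.0]
f_ref = [100.0, 100.0]
print(bijvoet_ratio_percent(f_plus, f_minus, f_ref))
print(mean_intensity_ratio([98.0, 104.0], [100.0, 100.0]))
```

The same r.m.s. ratio applied to differences between two wavelengths, or between a derivative and the native set, would give dispersive and isomorphous difference ratios of the kind tabulated at the end of this section.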

  
Figure: The mean plotted as a function of for the seven data sets. Bijvoet differences are on average larger at lower resolutions and essentially constant in the higher resolution regime. The differences are plotted on the same relative scale which is not absolute. The differences between the values of between the seven data sets are also reflected in the average values of in each range.
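The argument that closely spaced heavy atoms scatter in phase at low resolution can be illustrated with the Debye formula for a pair of identical atoms. Averaged over all orientations, the scattered intensity of the pair relative to that of a single atom is 2(1 + sin x / x), with x = 2πr/d for a separation r and a resolution d. The sketch below evaluates the factor 1 + sin x / x, which is 2 when the pair scatters as one body and 1 when the two atoms scatter independently. The separation of 3 Å is a hypothetical value chosen for illustration, not a measured iridium-iridium distance, and random orientation of the pairs is an assumption.

```python
import math


def pair_enhancement(separation, d_spacing):
    """Debye interference factor 1 + sin(x)/x for two identical atoms.

    separation and d_spacing are in the same length unit (here angstroms).
    """
    x = 2.0 * math.pi * separation / d_spacing
    if x == 0.0:
        return 2.0
    return 1.0 + math.sin(x) / x


# Hypothetical pair separated by 3 A, evaluated across the data range.
for d in (20.0, 10.0, 5.0, 3.0, 2.5):
    print(d, round(pair_enhancement(3.0, d), 3))
```

In this simple model the factor falls from nearly 2 at 20 Å towards 1 as d approaches the separation, then oscillates weakly about 1 at higher resolution, which is qualitatively the trend described for the Bijvoet differences.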

  
Table: The results of scaling the MAD data alone and scaling it to the native data (SIRAS) are shown. When no true native data are available the scaling procedure requires that one of the data sets be chosen as a reference on which to base the scaling. Data set 4 was chosen here since it has the largest value of . For the MAD scaling, Bijvoet difference ratios, (), and dispersive difference ratios, (), are given along with the ratio of the average intensities calculated for each set against the reference set, (). An analogous intensity ratio is shown for the SIRAS scaling, together with the isomorphous difference ratios, ().






Gwyndaf Evans
Fri Oct 7 15:42:16 MET 1994