XDS
- processes a sequence of adjacent, nonoverlapping rotation images collected from a single-crystal at a fixed X-ray wavelength,
- automatically derives crystal orientation, symmetry, and cell parameters from the data images,
- delivers a list of corrected integrated intensities of the reflections occurring in the data images.
XDS handles a variety of imaging plate, CCD and multiwire area detector data and imposes very few limitations on the experimental set-up. XDS handles
- fine sliced data as well as images covering a large oscillation range and only requires that each image covers the same positive amount of crystal rotation,
- arbitrary but fixed orientations of the detector and rotation axis, and only requires that incident beam and rotation axis intersect at one point in the center of the crystal.
As described in this chapter, rotation data images are processed in 8 steps (XYCORR, INIT, COLSPOT, IDXREF, DEFPIX, XPLAN, INTEGRATE, and CORRECT), which are called in succession by XDS.
Information between the steps is communicated by files, which allows repetition of selected steps with a different set of input parameters without rerunning the whole program. The files generated by XDS are either ASCII type files that can be inspected and modified by using a text editor, or binary, unformatted files which are compressed by Jan Pieter Abrahams' algorithm. All files have a fixed name defined by XDS, which makes it mandatory to process each data set in a newly created directory to avoid name clashes. Clearly, one should not run more than one XDS job simultaneously in any given directory. Also, output files generated by rerunning selected steps (see Table 1) should first be given another name if their original contents are meant to be saved.
Data processing begins by copying an appropriate input file into the new directory. Input file templates are provided with the XDS package for a number of frequently used data collection facilities. The copied input file must be renamed XDS.INP and edited to provide the correct parameter values for the actual data collection experiment. Since the length of file names is restricted in XDS to 50 characters, it is recommended to abbreviate lengthy path names by a symbolic link, e.g. if the image directory resides on a different computer.
All parameters in XDS.INP are named by keywords containing an equal sign as the last character, and many of them will be mentioned here in context to clarify their meaning. Execution of XDS (JOB= ALL) invokes each of the 8 program steps as described below. Results and diagnostics from each step are saved in files with the extension .LP attached to the program step name. These files should always be studied carefully to see whether processing was satisfactory or - in case of failure - to find out what could have gone wrong.
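The keyword convention described above can be illustrated with a small reader for XDS.INP. This is a minimal sketch and not part of the XDS package: the function name `parse_xds_inp` is hypothetical, and it assumes that `!` starts a comment, that several keywords may share one line, and that values never contain an equal sign.

```python
import re


def parse_xds_inp(text):
    """Collect KEYWORD= value pairs from the text of an XDS.INP file.

    Assumptions (not stated in the text above): '!' starts a comment,
    several keywords may share one line, and values never contain '='.
    """
    params = {}
    for line in text.splitlines():
        line = line.split("!", 1)[0]  # drop a trailing comment
        # Split on tokens that end with '='; the resulting list alternates
        # [leading text, keyword, value, keyword, value, ...].
        parts = re.split(r"(\S+?=)", line)
        for keyword, value in zip(parts[1::2], parts[2::2]):
            # A keyword may occur more than once, so keep every value.
            params.setdefault(keyword, []).append(value.strip())
    return params


example = """
JOB=DEFPIX XPLAN INTEGRATE CORRECT
ORGX= 1000 ORGY= 1001   ! hypothetical detector origin
"""
params = parse_xds_inp(example)
print(params["JOB="])
```

Keeping the trailing equal sign in the dictionary keys mirrors the way the keywords are written throughout this chapter.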
XYCORR
calculates lookup tables of spatial corrections for each detector pixel which are stored in the files X-CORRECTIONS.pck and Y-CORRECTIONS.pck . In subsequent data processing steps, when the true coordinates of a pixel with respect to the laboratory coordinate system are needed, the correction values for the X- and Y-coordinates are retrieved from the tables and added to the pixel's array coordinates in the data image.
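The use of the correction tables can be sketched as follows. The nested lists `x_corr` and `y_corr` are hypothetical in-memory stand-ins for the contents of X-CORRECTIONS.pck and Y-CORRECTIONS.pck; reading the compressed files themselves is not shown.

```python
def corrected_coordinates(ix, iy, x_corr, y_corr):
    """Return the corrected (x, y) position of the pixel at array position (ix, iy).

    x_corr and y_corr are hypothetical tables indexed as table[iy][ix];
    the correction values are simply added to the array coordinates.
    """
    return ix + x_corr[iy][ix], iy + y_corr[iy][ix]


# Tables of zeros, as produced for images that are already free of distortions.
zeros = [[0.0] * 4 for _ in range(3)]
print(corrected_coordinates(2, 1, zeros, zeros))
```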
Dependent on the detector, XYCORR computes the spatial corrections in three different ways.
- If the data images are already corrected for geometrical distortions, as is often the case, XYCORR produces tables of zeros.
- For spiral read-out imaging plate detectors like the MAR and MAR345 detectors, XYCORR computes the small corrections resulting from radial (ROFF=) and tangential (TOFF=) offset errors of the scanner.
- For some multiwire- and CCD detectors that deliver geometrically distorted images, corrections are derived from a calibration image (BRASS_PLATE_IMAGE= file name). This image displays the response to a brass plate containing a square grid of holes which is mounted in front of the detector and illuminated by an X-ray point source, e.g., 55Fe. The number of calibration holes and the distance between adjacent holes can be specified by the user via the input parameters MXHOLE= and HOLE_DISTANCE=, respectively, to override the detector specific default values. In addition, the user has the option to specify a minimum number of calibration spots (input parameter MNHOLE=) that must be located by XYCORR in the brass-plate image. This serves as a check that the calibration image was not underexposed.
XYCORR assumes that the source has been placed exactly at the location to be occupied by the crystal during the actual data collection, as photons emanating from the calibration source are meant to simulate all possible diffracted beam directions. XYCORR also generates the file FRAME.pck which contains the original brass plate image in which the calibration spots located by the program are marked for visual control.
Problems:
- A misplaced calibration source leads to an incorrect lookup table, impairing the correct prediction of the observed diffraction pattern in subsequent program steps.
- Underexposure of the calibration image results in an incomplete and unreliable list of calibration spots. Inspect the control image FRAME.pck to find out whether this has been the case.
INIT
determines three lookup tables, saved as files BLANK.pck, GAIN.pck, and BKGINIT.pck, that are required by the subsequent processing steps for classifying pixels in the data images as background or belonging to a diffraction spot ('strong' pixels).
- BLANK.pck contains the detector background noise at each image pixel in the absence of any X-rays. Some of the detectors, like the SIEMENS multiwire detector, do not have background noise and the table will contain just zeros. Otherwise, the table is derived in the following order.
- If a data image was taken in the absence of any X-rays (input parameter DARK_CURRENT_IMAGE=file name), this image is used for the table BLANK.pck.
- If the detector noise in each pixel was estimated by a non-negative constant (input parameter OFFSET=), the table BLANK.pck is set to the given constant.
- If neither of the above two cases applies, a constant detector noise is determined from the mean value at the four corners of a few data images. The table BLANK.pck is set to the determined constant.
- GAIN.pck codes for the expected variation of the pixel contents in the background region of a data image. The variance of the contents of a pixel in the background region is GAIN*(pixel_contents-detector_noise). The variance is determined from the scatter of pixel values within a rectangular box (input parameters NBX=, NBY=) of size (2*NBX+1)*(2*NBY+1) centered at each image pixel in succession. The table GAIN.pck is used for distinguishing background pixels from "strong" pixels that are part of a diffraction spot.
- BKGINIT.pck estimates the initial background at each pixel from a few data images specified by the input parameter BACKGROUND_RANGE=. The lookup table is obtained by adding the X-ray background from each image. Shaded regions on the detector (e.g., the beam stop) or pixels outside a user defined circular region (RMAX=) or pixels with an undefined spatial correction value are classified as untrustworthy and marked by -3. The table should be inspected using the VIEW program.
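The variance relation given for GAIN.pck and the size of the sampling box can be written out in a few lines. This is only a sketch: the function names are hypothetical, and `estimate_gain` simply inverts the stated relation for a list of background pixel values, which is an assumption about how the table could be obtained rather than the documented procedure.

```python
from statistics import mean, pvariance


def box_size(nbx, nby):
    """Number of pixels in the box controlled by NBX= and NBY=."""
    return (2 * nbx + 1) * (2 * nby + 1)


def expected_variance(gain, pixel_contents, detector_noise):
    """Variance of a background pixel: GAIN*(pixel_contents-detector_noise)."""
    return gain * (pixel_contents - detector_noise)


def estimate_gain(box_values, detector_noise):
    """Invert the relation above for the pixel values found in one box.

    Assumption: the scatter is measured as the population variance.
    """
    signal = mean(box_values) - detector_noise
    if signal <= 0:
        raise ValueError("mean pixel contents must exceed the detector noise")
    return pvariance(box_values) / signal


print(box_size(3, 3))
print(estimate_gain([8, 12, 8, 12], detector_noise=0))
```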
Problems:
Some detectors with insufficient protection from electromagnetic pulses may generate badly spoiled images whose inclusion leads to a completely wrong X-ray background table. These images can be identified in INIT.LP by their unexpectedly high mean pixel contents, and this step should be repeated with a different set of images.
COLSPOT
locates strong diffraction spots occurring in a subset of the data images and saves their centroids on the file SPOT.XDS.
Up to ten image ranges can be specified (input parameter SPOT_RANGE=). COLSPOT identifies 'strong' pixels (STRONG_PIXEL=) that are not in the background region (BACKGROUND_PIXEL=). If the total number of 'strong' pixels occurring in the specified data images exceeds the upper limit as given by the input parameter MAXIMUM_NUMBER_OF_STRONG_PIXELS=, the weaker ones are discarded. Spots are defined as sets of 'strong' pixels adjacent in three dimensions. A spot is accepted if it contains a minimum number of 'strong' pixels (MINIMUM_NUMBER_OF_PIXELS_IN_A_SPOT=) and if the spot centroid is sufficiently close to the location of the strongest pixel in the spot (SPOT_MAXIMUM-CENTROID=).
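Grouping 'strong' pixels into spots amounts to finding connected sets of pixels in three dimensions (x, y, image number). The sketch below illustrates the idea under two assumptions that the text does not settle: pixels count as adjacent when they share a face, and the centroid is an unweighted mean. The function names are hypothetical.

```python
from collections import deque

# Assumption: two pixels are adjacent if they share a face in (x, y, image).
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def find_spots(strong_pixels, min_pixels):
    """Group (x, y, image) triples into spots of at least min_pixels pixels."""
    remaining = set(strong_pixels)
    spots = []
    while remaining:
        seed = remaining.pop()
        spot = [seed]
        queue = deque([seed])
        while queue:  # breadth-first search over adjacent strong pixels
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOURS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    spot.append(neighbour)
                    queue.append(neighbour)
        if len(spot) >= min_pixels:
            spots.append(sorted(spot))
    return spots


def centroid(spot):
    """Unweighted centroid of a spot (intensity weighting is omitted here)."""
    n = len(spot)
    return tuple(sum(p[i] for p in spot) / n for i in range(3))


strong = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (5, 5, 5)]
for spot in find_spots(strong, min_pixels=2):
    print(spot, centroid(spot))
```

The isolated pixel at (5, 5, 5) forms a set of size one and is rejected by the minimum-size test.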
Problems:
Sharp edges like ice rings in the images can lead to an excessive number of 'strong' pixels erroneously classified as contributing to diffraction spots. These aliens could prevent IDXREF from recognizing the crystal lattice.
IDXREF
uses the initial parameters describing the diffraction experiment as provided by XDS.INP and the observed centroids of the spots from the file SPOT.XDS to find the orientation, metric, and symmetry of the crystal lattice. IDXREF refines some of these parameters (input parameter REFINE(IDXREF)=) and returns the complete parameter set via the file XPARM.XDS. Moreover, the original file SPOT.XDS is replaced by a file of identical name, now with indices attached to each observed spot. Spots not belonging to the crystal lattice are given indices 0,0,0. The run is considered successful if at least 70% of the given spots can be explained with reasonable accuracy (MAXIMUM_ERROR_OF_SPOT_POSITION=).
To determine a crystal lattice that explains the observed locations of the diffraction spots listed in file SPOT.XDS, IDXREF proceeds as follows.
- The laboratory coordinates of the diffracted beam wave vector (normalized to 1/lambda) that produced the spot at pixel coordinates IX, IY are calculated by using the input parameter values for ORGX=,ORGY=, DETECTOR_DISTANCE=, DIRECTION_OF_DETECTOR_X-AXIS=, DIRECTION_OF_DETECTOR_Y-AXIS=, X-RAY_WAVELENGTH=, QX=, and QY=.
- Subtraction of the incident beam wave vector (determined from the input parameter values INCIDENT_BEAM_DIRECTION= and X-RAY_WAVELENGTH=) from the diffracted beam wave vector results in the corresponding reciprocal lattice vector when the Laue equations are satisfied.
- The reciprocal lattice vector for the unrotated crystal is then found from the centroid of image numbers of the spot (as given in SPOT.XDS) and the input parameter values ROTATION_AXIS=, OSCILLATION_RANGE=, STARTING_ANGLE=, and STARTING_FRAME=.
- Differences between any two such reciprocal lattice vectors which are above a specified minimal length ( SEPMIN=) are then accumulated in a 3-dimensional histogram. These difference vectors will form clusters in the histogram since there are many different pairs of reciprocal lattice vectors of nearly identical vector difference.
- The clusters are found as maxima in the smoothed histogram (CLUSTER_RADIUS=), and a basis of three linearly independent cluster vectors is selected that allows all other cluster vectors to be expressed as nearly integral multiples of small magnitude with respect to this basis. The basis vectors and the 60 most populated clusters with attached indices are listed in IDXREF.LP. If many of the indices deviate significantly from integral values, the program is unable to find a reasonable lattice basis and all further processing will be meaningless.
- If space-group and cell constants are specified (input parameters SPACE_GROUP_NUMBER= and UNIT_CELL_CONSTANTS=), a reduced cell is derived, and the reciprocal basis vectors found above are reinterpreted accordingly; otherwise, a reduced triclinic cell is determined directly from the reciprocal basis. Parameters of the reduced cell, coordinates of the reciprocal basis vectors, and their indices with respect to the reduced cell are reported in IDXREF.LP.
- Based on the orientation and metric of the reduced cell now available, IDXREF indexes up to 3,000 of the strongest spots by the local indexing method. This method considers each spot as a node of a tree and identifies the largest subtree of nodes which can be assigned reliable indices. The number of reflections in the ten largest subtrees is reported and usually shows a dominant first tree corresponding to a single lattice, whereas alien spots are found in small subtrees. Input parameters that control the local indexing are INDEX_ERROR=, INDEX_MAGNITUDE=, INDEX_QUALITY=.
- Reflections in the largest subtree are used for initial refinement of the basis vectors of the reduced cell, the incident beam wave vector, and the origin of the detector, which is the point in the detector plane nearest to the crystal.
- After initial refinement based on the reflections in the largest subtree, all spots which can now be indexed are included.
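The first three steps of the procedure above (diffracted beam wave vector, subtraction of the incident beam wave vector, and back-rotation to the unrotated crystal) can be sketched with elementary vector algebra. The sketch rests on several assumptions that the text does not spell out: the detector normal is taken as the cross product of the detector X- and Y-axes, QX= and QY= are pixel sizes in the same length unit as DETECTOR_DISTANCE=, the spindle angle grows linearly from STARTING_ANGLE= at STARTING_FRAME=, and the back-rotation uses the negative spindle angle. All function names are hypothetical.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def scale(v, f):
    return tuple(f * x for x in v)


def unit(v):
    return scale(v, 1.0 / math.sqrt(dot(v, v)))


def diffracted_wave_vector(ix, iy, orgx, orgy, qx, qy, distance, ex, ey, wavelength):
    """Wave vector of length 1/wavelength pointing from the crystal to pixel (ix, iy)."""
    ez = cross(ex, ey)  # assumed detector normal
    u, v = (ix - orgx) * qx, (iy - orgy) * qy
    position = tuple(u * ex[i] + v * ey[i] + distance * ez[i] for i in range(3))
    return scale(unit(position), 1.0 / wavelength)


def reciprocal_lattice_vector(s, beam_direction, wavelength):
    """Diffracted minus incident beam wave vector (Laue equations)."""
    s0 = scale(unit(beam_direction), 1.0 / wavelength)
    return tuple(a - b for a, b in zip(s, s0))


def rotate(v, axis, angle_deg):
    """Rotate v about axis by angle_deg degrees (Rodrigues' formula)."""
    k, t = unit(axis), math.radians(angle_deg)
    kxv, kdv = cross(k, v), dot(k, v)
    return tuple(v[i] * math.cos(t) + kxv[i] * math.sin(t) + k[i] * kdv * (1 - math.cos(t))
                 for i in range(3))


def unrotated_vector(p, axis, image_centroid, starting_frame, starting_angle, oscillation_range):
    """Rotate p back to spindle angle zero; the frame offset convention is assumed."""
    phi = starting_angle + (image_centroid - starting_frame) * oscillation_range
    return rotate(p, axis, -phi)


s = diffracted_wave_vector(1100, 1000, 1000, 1000, 1.0, 1.0, 100.0, (1, 0, 0), (0, 1, 0), 1.0)
p = reciprocal_lattice_vector(s, (0, 0, 1), 1.0)
print(unrotated_vector(p, (0, 1, 0), image_centroid=11.0, starting_frame=1,
                       starting_angle=0.0, oscillation_range=0.5))
```

In the example the spot lies at a scattering angle of 45 degrees, so the length of `p` equals 2*sin(22.5 degrees)/wavelength, and the back-rotation leaves that length unchanged.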
If the crystal symmetry is unknown, IDXREF makes no automatic decision for the space-group and continues data processing with the crystal being described by its reduced cell basis vectors and triclinic symmetry. Instead, IDXREF uses the refined metric parameters of the reduced cell for testing each of the 44 possible lattice types (Kabsch, 1993). For each lattice type IDXREF reports the likelihood of being correct, the conventional cell parameters, and the linear transformation relating original indices to the new indices with respect to the conventional cell. On completion, when integrated intensities are available, the user chooses any plausible space-group according to the rated list of the 44 possible lattice types and repeats only the CORRECT step with the appropriate conventional cell parameters and reindexing transformation.
Problems:
- Indices of many difference vector clusters deviate significantly from integral values. This can be caused by incorrect input parameters, like rotation axis, oscillation angle, detector position, by a large fraction of alien spots in SPOT.XDS, by placing the detector too close to the crystal, or by inappropriate choice of parameters SEPMIN= and CLUSTER_RADIUS= in densely populated images.
- Indexing and refinement is unsatisfactory despite well-indexed difference vector clusters. This probably results from selection of an incorrect index origin caused by inaccurate values for the input parameters ORGX= and ORGY=, and IDXREF should be rerun with a plausible alternative for INDEX_ORIGIN= proposed in IDXREF.LP. Also, a visual check of a data image with the VIEW program could be quite helpful for finding the approximate region for the detector origin.
- Despite successful indexing and refinement, IDXREF stopped with the error message "INSUFFICIENT PERCENTAGE OF INDEXED REFLECTIONS", indicating that less than 70% of the given spots could be explained. Alien spots often arise because of the presence of ice or small satellite crystals, and continuation of data processing may still be meaningful. To continue data processing, just specify the missing processing steps in XDS.INP by JOB=DEFPIX XPLAN INTEGRATE CORRECT and call XDS again.
DEFPIX
recognizes regions in the initial background table (file BKGINIT.pck) that are obscured by intruding hardware and marks the shaded pixels as untrusted. In addition, pixels outside a user-defined resolution range (INCLUDE_RESOLUTION_RANGE=) are marked and eliminated from the trusted region. The marked background table thus obtained is saved on file BKGPIX.pck which is needed by the subsequent program steps.
For recognizing the obscured regions in the initial background, DEFPIX generates a control image (file ABS.pck) that contains values around 10000 for unshaded pixels and lower values for shaded pixels. The classification of the pixels into reliable and untrusted ones is based on the two input parameters VALUE_RANGE_FOR_TRUSTED_DETECTOR_PIXELS= (default: 6000 30000) and INCLUDE_RESOLUTION_RANGE= (default: 20.0 0.0). Pixels in the table ABS.pck with a value outside the ranges specified by the two parameters are marked unreliable (by -3) in the background table BKGPIX.pck.
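The classification rule can be sketched as a simple filter. The function below is illustrative only: it works on flat lists, takes the resolution of each pixel as a ready-made input (a hypothetical argument, since computing it requires the detector geometry), and interprets INCLUDE_RESOLUTION_RANGE= as a low-resolution limit followed by a high-resolution limit.

```python
UNTRUSTED = -3  # marker used for unreliable pixels in the background table


def mark_untrusted(background, abs_values, resolution,
                   value_range=(6000, 30000), resolution_range=(20.0, 0.0)):
    """Return a copy of the background table with unreliable pixels set to -3.

    background, abs_values and resolution are flat lists of equal length
    holding, per pixel, the BKGINIT.pck value, the ABS.pck value and the
    resolution. The defaults are the ones quoted in the text.
    """
    vmin, vmax = value_range
    low_res, high_res = resolution_range
    marked = []
    for bkg, val, d in zip(background, abs_values, resolution):
        trusted = vmin <= val <= vmax and high_res <= d <= low_res
        marked.append(bkg if trusted else UNTRUSTED)
    return marked


print(mark_untrusted([50, 60, 70, -3],
                     [10000, 3000, 10000, 10000],
                     [3.0, 3.0, 25.0, 3.0]))
```

In this example the second pixel is rejected as shaded, the third lies outside the resolution range, and the fourth was already marked in the initial background table.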
Problems:
If the parameter VALUE_RANGE_FOR_TRUSTED_DETECTOR_PIXELS= specifies a too narrow value range, "good" regions will erroneously be excluded from the trusted detector region. Check BKGPIX.pck with the VIEW program (type VIEW BKGPIX.pck) and, if necessary, repeat the DEFPIX step with more appropriate values.
XPLAN
supports the planning of data collection. It is based upon information provided by the input files XPARM.XDS and BKGPIX.pck, both of which are available by processing a few test images. XPLAN estimates the completeness of new reflection data, expected to be collected for each given starting angle ( STARTING_ANGLES_OF_SPINDLE_ROTATION=) and total crystal rotation ( TOTAL_SPINDLE_ROTATION_RANGES=), and reports the results for a number of selected resolution shells ( RESOLUTION_SHELLS=) in the file XPLAN.LP. To minimize recollection of data, the name of a file can be provided by the input parameter REFERENCE_DATA_SET=, the reference data set, which contains already measured reflections.
Problems:
- Incorrect results may occur for some space groups, e.g. P42, if the unit cell determined by XDS from processing a few test images implies reflection indices inconsistent with those from the reference data set. The correct cell choice can be found, however, by using the old data as a reference and repeating CORRECT with the appropriate reindexing transformation, followed by copying GXPARM.XDS to XPARM.XDS. The same applies if IDXREF was run for an unknown space group and then reindexed in CORRECT.
- XPLAN ignores potential reflection overlap due to the finite oscillation range covered by each image.
INTEGRATE
determines the intensity of each reflection predicted to occur in the rotation data images ( DATA_RANGE=) and saves the results on file INTEGRATE.HKL.
The diffraction parameters needed for predicting the reflection positions are initially provided by the file XPARM.XDS. These parameters are either kept constant or refined periodically using strong diffraction spots encountered in the data images. Whether refinement should be carried out at all and which parameters are to be refined can be specified by the user (input parameter REFINE(INTEGRATE)=). Centroids of the strong spots in the data images are computed from pixels that exceed the background by a given multiple of standard deviations (input parameters SIGNAL_PIXEL=, BACKGROUND_PIXEL=). Strong spots are used in the refinement if their centroids are reasonably close to their calculated position (input parameter MAXIMUM_ERROR_OF_SPOT_POSITION=).
For determination of the intensity, approximate values describing extension and form of the diffraction spot must be specified. The shapes of all spots become very similar when the contents of each of their contributing image pixels are mapped into a 3-dimensional, reflection-specific coordinate system centered on the surface of the Ewald sphere, at the terminus of the diffracted beam wave vector (Kabsch, 1988b). In this coordinate system alpha and beta span the plane tangential to the Ewald sphere with the alpha-axis perpendicular to the incident- and the diffracted beam wave vector. The gamma axis runs perpendicular to the alpha-axis and to the rotated reciprocal lattice vector representing the reflection when the Laue equations are satisfied. The number of grid points in this coordinate system used for representing the transformed reflection profile is usually chosen automatically by XDS; the user has the option to override the automatic assignment by specifying the two input parameters NUMBER_OF_PROFILE_GRID_POINTS_ALONG_ALPHA/BETA= and NUMBER_OF_PROFILE_GRID_POINTS_ALONG_GAMMA=.
The transformed spot can roughly be described as a Gaussian. Four parameters are used for this purpose:
- BEAM_DIVERGENCE= is twice the opening angle of a cone with the diffracted beam wave vector as cone axis. The intersection of the cone with the data image traces the boundary of the spot and includes some neighbouring background pixels. The parameter value can be estimated as BEAM_DIVERGENCE= arctan(spot_diameter/detector_distance).
- BEAM_DIVERGENCE_E.S.D.= characterizes the Gaussian spot shape by its standard deviation.
- REFLECTING_RANGE= is the approximate rotation angle required for a strong spot recorded perpendicular to the rotation axis to pass completely through the Ewald sphere.
- REFLECTING_RANGE_E.S.D.= is the standard deviation of the Gaussian intensity distribution when the reflection is rotated through the Ewald sphere on shortest route. This is also defined as the mosaicity of the crystal.
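The estimate quoted for BEAM_DIVERGENCE= is a one-line calculation. The sketch below assumes that spot diameter and detector distance are given in the same length unit and that the result is wanted in degrees; the function name and the example values are hypothetical.

```python
import math


def estimate_beam_divergence(spot_diameter, detector_distance):
    """arctan(spot_diameter/detector_distance), converted to degrees (assumed unit)."""
    return math.degrees(math.atan(spot_diameter / detector_distance))


# Hypothetical example: a spot 0.5 mm across recorded at 100 mm distance.
print(round(estimate_beam_divergence(0.5, 100.0), 4))
```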
All four parameters describing shape and extension of the spots can be determined automatically from a subset of the data images. The subset of images can be specified by the user with the parameter PROFILE_RANGE=.
Integration is carried out by a two-step procedure. In the first pass, spot templates are generated by superimposing profiles of fully recorded, strong reflections, and all grid points with a value above a minimum percentage of the maximum in the template ( CUT=) are defined as elements of the integration domain. To allow for variations of their shape, profile templates are generated from reflections located at nine regions of equal size covering the detector surface and additional sets of nine to cover equally-sized ( DELPHI=) batches of images. The actual integration is carried out in the second pass by profile fitting with respect to the spot shape determined in the first pass.
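The first pass can be pictured as summing profiles into a template and then thresholding it. The following sketch shows only that idea: profiles are represented as dictionaries from grid point to value, the nine detector regions and the batches of images are ignored, and the function names are hypothetical.

```python
def superimpose(profiles):
    """Sum several reflection profiles, given as {grid_point: value}, into a template."""
    template = {}
    for profile in profiles:
        for point, value in profile.items():
            template[point] = template.get(point, 0.0) + value
    return template


def integration_domain(template, cut_percent):
    """Grid points whose template value exceeds cut_percent of the template maximum."""
    threshold = max(template.values()) * cut_percent / 100.0
    return {point for point, value in template.items() if value > threshold}


# Two hypothetical profiles on a tiny grid of (alpha, beta, gamma) indices.
profiles = [{(0, 0, 0): 10.0, (1, 0, 0): 1.0},
            {(0, 0, 0): 10.0, (1, 0, 0): 3.0}]
template = superimpose(profiles)
print(sorted(integration_domain(template, cut_percent=25.0)))
```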
On return from the INTEGRATE step, the data image last processed with all expected spots encircled is saved in the file FRAME.pck for inspection using the VIEW program.
Problems:
- Off-centered profiles indicate that reflection positions were incorrectly predicted from the parameters provided by the file XPARM.XDS (e.g., misindexing caused by a wrong origin of the indices), crystal slippage, or a change in the incident beam direction.
- Profiles extending to the borders of the box indicate too-small values for BEAM_DIVERGENCE= or REFLECTING_RANGE=. This leads to incorrect integrated intensities because of truncated reflection profiles and unreliable background determination.
- Display of the file FRAME.pck shows spots which are not encircled. If these unexpected reflections are not close to the spindle and are not ice reflections, it is likely that the parameters provided by the file XPARM.XDS are wrong.
CORRECT
applies correction factors to intensities and standard deviations of all reflections found in the file INTEGRATE.HKL, refines the unit cell constants, reports the quality and completeness of the data set, and saves the final integrated intensities on the file XDS_ASCII.HKL.
CORRECT accepts reflections from file INTEGRATE.HKL that are
- recorded (parameter MINPK=) on specified images (parameter DATA_RANGE=),
- within a given resolution range (parameter INCLUDE_RESOLUTION_RANGE=),
- outside ice-rings (parameter EXCLUDE_RESOLUTION_RANGE=),
- not overloaded (parameter OVERLOAD=),
- not marked for exclusion in the file REMOVE.HKL.
Thus, the user has the option to exclude unreliable reflections from the final data set by repeating the CORRECT step with appropriate parameter values.
For refinement of the unit cell constants (parameter REFINE(CORRECT)=), CORRECT uses a subset of the accepted reflections whose observed centroids are sufficiently close to the predicted spot position (parameter MAXIMUM_ERROR_OF_SPOT_POSITION=). The refined set of parameters is saved on file GXPARM.XDS, which has the same layout as the file XPARM.XDS produced by IDXREF. If the crystal has not slipped during data collection, these parameters are quite accurate.
Intensities of the accepted reflections are corrected for effects due to polarization of the incident beam (parameters FRACTION_OF_POLARIZATION=, POLARIZATION_PLANE_NORMAL=) and absorption effects (parameter AIR=) arising from differences in path lengths of the diffracted beam wave vector.
Other correction factors which partially compensate for radiation damage, absorption effects, and variations in sensitivity of the detector surface are determined from symmetry-equivalent reflections usually found in the data images. Whether Friedel-pairs are considered as symmetry-equivalent reflections in the calculation of the correction factors depends on the values of the two parameters STRICT_ABSORPTION_CORRECTION= and FRIEDEL'S_LAW=. The number of these correction factors is controlled by the input parameter DELPHI=. The corrections are chosen such that the integrated intensities of symmetry-equivalent reflections come out as similar as possible.
The residual scatter in the intensity of symmetry-equivalent reflections is used to determine a correction factor for the standard error in intensity previously estimated from profile fitting in the INTEGRATE step. Based on the more realistic error estimates of the intensities, outliers are recognized by comparison with other symmetry-equivalent reflections. These outliers are included in the main output file XDS_ASCII.HKL in which they are marked by a negative sign attached to the estimated standard deviations of their intensity. Classification of a reflection as a misfit is controlled by a decision constant which has the default value of WFAC1=1.5. A lower value (e.g. WFAC1=1.0) specified by the user will lead to an increasing number of misfits and lower R-factors, as outliers are not included in the reported statistics.
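Because misfits are flagged by a negative standard deviation, they are easy to separate when reading the output file. The sketch below assumes a layout that is not described in this chapter: header lines start with `!`, and each data line begins with h, k, l, the intensity, and its standard deviation. The function name and the sample lines are hypothetical.

```python
def split_misfits(lines):
    """Separate accepted reflections from misfits (negative standard deviation).

    Assumed layout: '!' starts a header line; data lines begin with
    h, k, l, intensity, sigma. Any further columns are ignored.
    """
    accepted, misfits = [], []
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue
        fields = line.split()
        h, k, l = (int(f) for f in fields[:3])
        intensity, sigma = float(fields[3]), float(fields[4])
        record = (h, k, l, intensity, abs(sigma))
        (misfits if sigma < 0 else accepted).append(record)
    return accepted, misfits


sample = [
    "!hypothetical header line",
    "  1   0   2   1523.4    41.2",
    "  1   0  -2   9811.0  -120.5",
]
accepted, misfits = split_misfits(sample)
print(len(accepted), len(misfits))
```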
Data quality as a function of resolution is described by the agreement of intensities of symmetry-related reflections and quantified by the R-factors, Rsym, and the more robust indicator, Rmeas (Diederichs and Karplus, 1997). These R-factors as well as the intensities of all reflections with indices of type h 0 0, 0 k 0, and 0 0 l and those expected to be systematically absent provide important information for identification of the correct space-group. Clearly, large R-factors or many rejected reflections (MISFITS) or large observed intensities for reflections expected to be systematically absent suggest that the assumed space-group or the indexing is incorrect. It is easy to test other possible space-groups (SPACE_GROUP_NUMBER=) by simply repeating the CORRECT step after copying the appropriate reindexing transformation (REIDX=) and conventional cell constants (UNIT_CELL_CONSTANTS=) found in the rated table of the 44 possible lattice types in IDXREF.LP to XDS.INP. The presence or absence of anomalous scatterers is specified by the parameter FRIEDEL'S_LAW=.
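For orientation, the two indicators can be computed from groups of symmetry-equivalent intensities as sketched below. The formulas follow the usual definitions, with Rmeas weighting each group by sqrt(n/(n-1)) as proposed by Diederichs and Karplus (1997). The function name is hypothetical, and details such as skipping groups with a single observation are assumptions, not a description of what CORRECT does internally.

```python
import math


def r_factors(groups):
    """Return (Rsym, Rmeas) for a list of groups of symmetry-equivalent intensities."""
    num_sym = num_meas = denom = 0.0
    for intensities in groups:
        n = len(intensities)
        if n < 2:
            continue  # a single observation carries no agreement information
        mean_i = sum(intensities) / n
        deviation = sum(abs(i - mean_i) for i in intensities)
        num_sym += deviation
        num_meas += math.sqrt(n / (n - 1)) * deviation  # multiplicity weighting
        denom += sum(intensities)
    return num_sym / denom, num_meas / denom


rsym, rmeas = r_factors([[10.0, 12.0], [20.0, 20.0, 26.0]])
print(round(rsym, 4), round(rmeas, 4))
```

Rmeas is never smaller than Rsym, since every weighting factor is at least one.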
Another useful feature is the possibility to compare the new data with those from a previously measured crystal (REFERENCE_DATA_SET= file name). For some space-groups, like P42, possessing an ambiguity in the choice of axes, comparison with the reference data set allows one to identify the consistent solution from the complete set of alternatives already listed in IDXREF.LP together with their required index transformation. Reference data are also found quite useful for recognizing misindexing or for testing potential heavy-atom derivatives.
Finally, CORRECT analyzes the distribution of reflection intensities as a function of their resolution and reports outliers from the Wilson plot. Often these aliens arise from ice rings in the data images. To suppress the unwanted reflections from the final output file XDS_ASCII.HKL, the user copies them to a file named REMOVE.HKL in the current directory and repeats the CORRECT step.
Problems:
- Incomplete data sets may lead to wrong conclusions about the space-group, as some of its symmetry operators might not be involved in the R-factor calculations.
- Conventional cell parameters, as listed in IDXREF.LP, often violate constraints imposed by the space-group and must be edited accordingly after copying to XDS.INP.
- For space-group determination, usually several alternatives are investigated by repeating CORRECT in each case. One should remember that it may be necessary to repeat the CORRECT step for the final choice of the space-group, as XDS overwrites the earlier versions of the output files XDS_ASCII.HKL and GXPARM.XDS.
Wolfgang Kabsch
page last updated: September 15, 2003