OU Crystallography Lab

Department of Chemistry & Biochemistry
Chemical Crystallography Laboratory

Steps in Structure Refinement

The following sections include a brief introduction to the steps involved in crystal structure refinement. These general steps may be applied to data processing with any computer package.

Structure Completion

Models from a structure solution often give only a partial set of the atoms in the unit cell. However, this partial set usually contains sufficient phase information to allow the user to locate the remaining atoms. From the atom types and relative positions in the initial model, a set of structure factors may be calculated. An electron density map can then be prepared using the calculated phase angles and the observed structure factor amplitudes. Unfortunately, electron density maps with coefficients of |Fo| tend to give back only the parts of the model that are already known. It is usually more helpful to calculate a difference electron density map, which uses coefficients of (|Fo| - |Fc|) with the calculated phase angles. Difference maps tend to produce peaks where too little electron density has been included in the model (e.g., missing atoms) and negative holes or valleys where too much has been included (e.g., an atom that is too heavy for the site).
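Written out, the two Fourier syntheses described above take the following standard forms. The symbols V (unit cell volume), h (reflection indices), r (position in the cell), and φ_c (calculated phase) are notation introduced here for illustration, and |Fo| and |Fc| are assumed to be on the same scale.

```latex
\rho_{o}(\mathbf{r}) = \frac{1}{V} \sum_{\mathbf{h}} |F_o(\mathbf{h})| \, e^{i \varphi_c(\mathbf{h})} \, e^{-2\pi i \, \mathbf{h} \cdot \mathbf{r}}

\Delta\rho(\mathbf{r}) = \frac{1}{V} \sum_{\mathbf{h}} \bigl( |F_o(\mathbf{h})| - |F_c(\mathbf{h})| \bigr) \, e^{i \varphi_c(\mathbf{h})} \, e^{-2\pi i \, \mathbf{h} \cdot \mathbf{r}}
```

The only difference between the two sums is the coefficient, which is why the second map highlights what the model lacks rather than what it already contains.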

Thus, starting from a partial model of the structure, structure factor calculations (and sometimes refinement cycles) are performed and followed by a difference electron density map calculation. New atoms are located from the map and included in the model. This process is repeated until all non-hydrogen atoms are located.

Structure Refinement

Least-Squares Refinement

The method of refinement most generally used in small molecule crystallography is refinement by the principle of least squares. Assume that $z$ is a linear function of $n$ variables $x_1, \dots, x_n$, with $n$ parameters $p_1, \dots, p_n$ that define the function.

$$z = p_1 x_1 + p_2 x_2 + \dots + p_n x_n$$

The values of the function are known at $m$ points with $m > n$. The least squares principle asserts that the "best" values for the parameters $p_1, p_2, \dots, p_n$ are those that minimize the function

$$\chi^2 = \sum_{j=1}^{m} w_j \, (z_{o,j} - z_{c,j})^2$$

where $z_{o,j}$ is the measured value of the function at point $j$, $z_{c,j}$ is the calculated value of the function at point $j$, and $w_j$ is the weight assigned to the measured value.
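As a minimal sketch of this principle, the following Python example solves the weighted normal equations for a small synthetic linear problem. The function name `weighted_least_squares` and the data are illustrative assumptions, not part of any crystallographic package.

```python
import numpy as np

def weighted_least_squares(X, z_obs, w):
    """Return the parameters p that minimize sum_j w_j * (z_obs_j - (X @ p)_j)**2."""
    X = np.asarray(X, dtype=float)
    z_obs = np.asarray(z_obs, dtype=float)
    w = np.asarray(w, dtype=float)
    # Normal equations: (X^T W X) p = X^T W z_obs, with W = diag(w)
    A = X.T @ (w[:, None] * X)
    b = X.T @ (w * z_obs)
    return np.linalg.solve(A, b)

# Synthetic example: m = 5 observations, n = 2 parameters, error-free data.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
p_true = np.array([2.0, -0.5])
z_obs = X @ p_true
w = np.ones(5)
p_fit = weighted_least_squares(X, z_obs, w)
```

Because the synthetic data are exact, `p_fit` reproduces `p_true`. With noisy data the weights control how strongly each observation pulls on the result.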

The parameters being refined in a crystal structure determination are the x, y, and z positional parameters and either the isotropic displacement parameter $U$ or the six anisotropic displacement parameters $U_{ij}$ for each atom. A typical refinement of k isotropic atoms would utilize 4k atomic parameters: 3 positional parameters and 1 displacement parameter per atom. A typical refinement of m anisotropic atoms would require 9m parameters: 3 positional parameters and 6 anisotropic displacement parameters per atom. In addition to these atomic parameters, one overall scale factor is refined. This scale factor accounts for a variety of items, from the size of the crystal to the intensity of the radiation source. The |Fc| values are not linear functions of the atomic parameters. In order to utilize the least-squares method, these functions are approximated by a Taylor series truncated after the linear term. Because this approximation is crude, several cycles of refinement are required to achieve convergence.
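The linearize-and-repeat idea can be sketched with a toy two-parameter model: a scale factor k and a single displacement-like parameter B in y = k exp(-B s²). This model, the function names, and the starting values are assumptions chosen for illustration. Real refinement programs linearize the full structure factor expression instead.

```python
import numpy as np

def model(params, s2):
    """Toy nonlinear model: scale factor k times a Gaussian fall-off in s^2."""
    k, B = params
    return k * np.exp(-B * s2)

def jacobian(params, s2):
    """Partial derivatives of the model with respect to k and B."""
    k, B = params
    e = np.exp(-B * s2)
    return np.column_stack([e, -k * s2 * e])

def gauss_newton(start, s2, y_obs, w, cycles=15):
    """Repeat linearized least-squares cycles (first-order Taylor expansion)."""
    params = np.asarray(start, dtype=float)
    for _ in range(cycles):
        J = jacobian(params, s2)
        r = y_obs - model(params, s2)       # observed minus calculated
        A = J.T @ (w[:, None] * J)          # normal matrix
        g = J.T @ (w * r)
        params = params + np.linalg.solve(A, g)  # apply the parameter shifts
    return params

s2 = np.linspace(0.0, 0.5, 20)
true_params = np.array([10.0, 3.0])
y_obs = model(true_params, s2)              # error-free synthetic data
w = np.ones_like(s2)
fitted = gauss_newton([8.0, 2.0], s2, y_obs, w)
```

Each pass through the loop is one refinement cycle. The shifts shrink from cycle to cycle because the linear approximation improves as the parameters approach the minimum.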

Refinement Based on $F$ or $F^2$

In crystallography, the parameters may be refined against either the structure factors, $\Delta = \bigl| |F_o| - |F_c| \bigr|$, or the intensities, $\Delta = |F_o^2 - F_c^2|$. In either case the function being minimized is $\sum w \Delta^2$. Before the 1990s most structures were refined using structure factors, $F$. For these refinements, the very weak data present a problem. In particular, for some data the intensity measured for the peak is less than the intensity measured for the background. This produces a negative value for $F_o^2$, and for these data no reasonable value for $F_o$ can be determined. To avoid this problem, data with $F_o < n \, \sigma(F_o)$, where $n$ was usually 2-4, were simply omitted from the refinement. This of course introduces systematic error into the results. Since it is reasonable for some structure factors to sum to essentially $F_c = 0$, a significant portion of valuable data is, in principle, being rejected from the refinement without justification.

Refinements based on $F^2$ can utilize all data, including the very weak data. When the data are strong with few unobserved data, either refinement method works well, and both produce similar answers. However, when the data are weak or have significant problems, refinement based on $F^2$ is preferred. More importantly, for weak data sets, refinements based on $F^2$ give more chemically reasonable bond lengths, bond angles, and displacement parameters than do refinements based on $F$.
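A small sketch of the difference in data usage follows. It assumes first-order error propagation, σ(F) ≈ σ(F²)/(2F), so that the cutoff Fo < n σ(Fo) is applied as Fo² < (n/2) σ(Fo²). The function name, the default n = 3, and the five reflections are made up for illustration.

```python
import numpy as np

def usable_for_f_refinement(fo2, sigma_fo2, n=3.0):
    """Mask of reflections kept by an Fo >= n * sigma(Fo) cutoff.

    Uses sigma(F) ~ sigma(F^2) / (2 F), a first-order approximation, so the
    cutoff is applied as Fo^2 >= (n / 2) * sigma(Fo^2). Reflections with
    Fo^2 <= 0 have no usable Fo and are always rejected.
    """
    fo2 = np.asarray(fo2, dtype=float)
    sigma_fo2 = np.asarray(sigma_fo2, dtype=float)
    return (fo2 > 0.0) & (fo2 >= 0.5 * n * sigma_fo2)

# Hypothetical measured intensities and their standard uncertainties.
fo2 = np.array([250.0, 40.0, 3.0, -1.5, 0.2])
sigma_fo2 = np.array([5.0, 2.0, 1.5, 1.2, 1.0])

kept_on_f = usable_for_f_refinement(fo2, sigma_fo2)
n_used_on_f = int(kept_on_f.sum())   # data surviving the cutoff
n_used_on_f2 = fo2.size              # refinement on F^2 keeps every reflection
```

In this made-up set the two weakest reflections, one of them negative, are lost to a refinement on F but remain available to a refinement on F².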

Weights

The weights used in least squares refinement are generally chosen to represent the relative influence an observation should have on the results. Weights typically include some term representing the statistical error of the measured data.

$$w = \frac{1}{\sigma^2(Y)}$$

where $Y$ is $F$ or $F^2$ depending upon the function being minimized.

Unfortunately, the data nearly always include systematic errors along with the statistical errors. The systematic errors most often found in data are an ignored or improper absorption correction and a secondary extinction error. The strong, low scattering angle data are significantly affected by secondary extinction problems. Ideally these errors should be removed by applying the proper corrections to the data. Since these and other systematic corrections cannot always be applied, it is found that reducing the weights according to the relative intensities of the peaks, so that strong reflections are down-weighted, produces a more reasonable refinement. Thus a common weighting scheme is given by:

$$w = \frac{1}{\sigma^2(Y) + k F_o^2}$$

The value of $k$ is often determined empirically, and typically lies in the range 0.0001 to 0.02.

More complicated weighting schemes are also possible. Parameters in the weighting scheme are usually adjusted so that the variance (the sum of the squares of the $\Delta$ values) for any group of data is similar to the variance for every other comparable group of data. In the refinement program SHELXL [1], which performs refinements based on $F^2$, weights usually assume the form

$$w_{hkl} = \frac{1}{\sigma^2(F_{o,hkl}^2) + (aP)^2 + bP}$$

where $P = [2 F_c^2 + \max(F_o^2, 0)] / 3$. The values for $a$ and $b$ are chosen to give an even distribution of the variances across all groups of data based on the relative intensities. Wilson [2] found that the use of $P$ rather than $F^2$ reduced statistical bias.
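The weighting expression above translates directly into a few lines of Python. This is a sketch of the formula only, not SHELXL code; the function name and the values of a and b used in the example are placeholders.

```python
import numpy as np

def shelxl_style_weights(fo2, fc2, sigma_fo2, a, b):
    """w = 1 / [sigma^2(Fo^2) + (a P)^2 + b P], with P = [2 Fc^2 + max(Fo^2, 0)] / 3."""
    fo2 = np.asarray(fo2, dtype=float)
    fc2 = np.asarray(fc2, dtype=float)
    sigma_fo2 = np.asarray(sigma_fo2, dtype=float)
    p = (2.0 * fc2 + np.maximum(fo2, 0.0)) / 3.0
    return 1.0 / (sigma_fo2 ** 2 + (a * p) ** 2 + b * p)

# Two hypothetical reflections: one weak with a negative Fo^2, one strong.
fo2 = np.array([-3.0, 90.0])
fc2 = np.array([1.5, 90.0])
sigma_fo2 = np.array([2.0, 3.0])
weights = shelxl_style_weights(fo2, fc2, sigma_fo2, a=0.1, b=0.0)
```

For the weak reflection the weight is dominated by σ², while for the strong reflection the (aP)² term dominates, which is the intended down-weighting of intense data.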

Refinement Statistics

One way to judge how well the model fits the observed data is to calculate discrepancy factors. For refinement based on $F^2$, the following discrepancy factors are routinely calculated.

$$wR_2 = \left\{ \frac{\sum [w (F_o^2 - F_c^2)^2]}{\sum [w (F_o^2)^2]} \right\}^{1/2}$$

$$R_1 = \frac{\sum \bigl| |F_o| - |F_c| \bigr|}{\sum |F_o|}$$

The $wR_2$ is a weighted R factor based upon all data that allows the crystallographer to follow the progress of the refinement. The numerator of the $wR_2$ expression is the function being minimized in the refinement. The $R_1$ expression, which is based only on the observed data, $F_o > 2\sigma(F_o)$, is called the unweighted R.

The $R_1$ expression has always been reported with refinements on $F$. So that refinements on $F^2$ can be compared with each other and with refinements on $F$, $R_1$ is still calculated and reported.
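Both discrepancy factors are simple sums over the reflection list. The sketch below uses hypothetical function names and three made-up reflections. The optional `observed` mask stands in for whatever "observed data" criterion is applied.

```python
import numpy as np

def wr2(fo2, fc2, w):
    """Weighted R factor on F^2, computed over all supplied data."""
    fo2, fc2, w = (np.asarray(a, dtype=float) for a in (fo2, fc2, w))
    return float(np.sqrt(np.sum(w * (fo2 - fc2) ** 2) / np.sum(w * fo2 ** 2)))

def r1(fo, fc, observed=None):
    """Unweighted R factor on |F|, optionally restricted to observed data."""
    fo = np.abs(np.asarray(fo, dtype=float))
    fc = np.abs(np.asarray(fc, dtype=float))
    if observed is not None:
        mask = np.asarray(observed, dtype=bool)
        fo, fc = fo[mask], fc[mask]
    return float(np.sum(np.abs(fo - fc)) / np.sum(fo))

fo = np.array([10.0, 20.0, 5.0])
fc = np.array([9.0, 21.0, 5.0])
r1_value = r1(fo, fc)
wr2_value = wr2(fo ** 2, fc ** 2, np.ones(3))
```

Even for identical data, `wr2_value` comes out larger than `r1_value`, because differences in F² are roughly twice as large in relative terms as differences in F.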

Another statistic that is reported with crystal structure refinements is the goodness of fit, $S$. Technically, the goodness of fit is the standard deviation of an observation of unit weight. In practice the goodness of fit shows how reliable the standard deviations of the positional and displacement parameters of the atoms really are. The standard deviations of the atomic parameters should be multiplied by the goodness of fit to give more realistic estimates of the standard deviations. These adjusted standard deviations can be compared with similar values from other structures. The goodness of fit is strongly influenced by the weighting scheme. Thus crystallographers will modify the weighting scheme to force the goodness of fit to have a value near 1.0, so that the standard deviations can be used directly as they are determined. For a refinement on $F^2$ the goodness of fit has the form:

$$\mathrm{GoF} = S = \left\{ \frac{\sum [w (F_o^2 - F_c^2)^2]}{n - p} \right\}^{1/2}$$

where $n$ is the number of measured data and $p$ is the number of parameters.
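The goodness of fit can be computed from the same quantities as the weighted R factor. The following sketch uses a hypothetical function name and four made-up reflections with σ(Fo²) = 2.

```python
import numpy as np

def goodness_of_fit(fo2, fc2, w, n_params):
    """S = sqrt( sum w (Fo^2 - Fc^2)^2 / (n - p) ) for a refinement on F^2."""
    fo2, fc2, w = (np.asarray(a, dtype=float) for a in (fo2, fc2, w))
    n_data = fo2.size
    return float(np.sqrt(np.sum(w * (fo2 - fc2) ** 2) / (n_data - n_params)))

fo2 = np.array([100.0, 400.0, 25.0, 50.0])
fc2 = np.array([98.0, 403.0, 25.0, 51.0])
w = np.full(4, 1.0 / 2.0 ** 2)   # w = 1 / sigma^2 with sigma = 2
s = goodness_of_fit(fo2, fc2, w, n_params=1)
```

A value of `s` close to 1 means the discrepancies are about as large as the assigned uncertainties predict.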

Correlations

When the shifts in pairs of parameters are not independent of each other, the parameters are said to be correlated. Correlations can be either positive (the shifts of the parameters in question have the same sign) or negative (the shifts have opposite signs). Correlation coefficients range from -1, complete negative correlation, through 0, no correlation, to +1, complete positive correlation. Large correlations, those with a magnitude between 0.5 and 1.0, are specifically noted by most refinement programs. To successfully refine parameters with large correlations, the starting model must be very close to its local minimum. Refining parameters with large correlations requires more cycles of refinement to achieve convergence.
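Correlation coefficients are obtained by normalizing the covariance matrix, which in least squares is proportional to the inverse of the normal matrix. The sketch below assumes a small made-up normal matrix. The function names are hypothetical, and the 0.5 threshold mirrors the range quoted above.

```python
import numpy as np

def correlation_matrix(normal_matrix):
    """Correlation coefficients from the inverse of the normal matrix."""
    cov = np.linalg.inv(np.asarray(normal_matrix, dtype=float))
    d = np.sqrt(np.diag(cov))
    return cov / np.outer(d, d)

def large_correlations(corr, threshold=0.5):
    """List (i, j, value) for parameter pairs with |correlation| >= threshold."""
    n = corr.shape[0]
    return [
        (i, j, float(corr[i, j]))
        for i in range(n)
        for j in range(i + 1, n)
        if abs(corr[i, j]) >= threshold
    ]

# Parameters 0 and 1 are strongly coupled; parameter 2 is independent.
normal = np.array([[2.0, 1.8, 0.0],
                   [1.8, 2.0, 0.0],
                   [0.0, 0.0, 1.0]])
corr = correlation_matrix(normal)
flagged = large_correlations(corr)
```

Only the pair (0, 1) is flagged here. Its coefficient is negative: a large off-diagonal term in the normal matrix means the two parameters can compensate for each other.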

Some large correlations are expected and quite reasonable. In nearly all structures with heavy atoms, the overall scale factor and the displacement parameters of the heavy atom(s) are correlated. Large correlations can also occur between the different anisotropic displacement parameters of any particular atom. If the unit cell angles are far from 90° then it is not uncommon to see large correlations between the corresponding x, y, and z parameters for a given atom. For example, in a monoclinic structure with β > 100°, the x and z parameters of a heavy atom are usually strongly correlated. In disordered structures, it is common to see large correlations between the positional and displacement parameters of atoms in close proximity with other atoms.

Some large correlations can also signal problems with the model. In particular, large correlations between the positional parameters of different atoms (e.g., the x parameter of one atom and the x parameter of another atom), when the atoms are not disordered, suggest that the space group may be wrong. A higher symmetry space group can usually be found that has symmetry operations relating the two atoms that are being modeled separately in the lower symmetry space group.

Constraints in Refinement

Some crystal structure parameters must either be explicitly set and not refined or must be related to other variables and not allowed to independently refine. These conditions are called constraints. An example of a constraint is the fixing of the x, y, and z coordinates of an atom that sits on a crystallographic center of symmetry.

Restraints in Refinement

When the data are not of good quality or parts of the structure are disordered, it may be necessary to impose restraints. Restraints are additional information that is not exact, but is subject to a probability distribution. The lengths of two chemically but not crystallographically equivalent bonds could be restrained to be approximately equal. A restraint is treated as an extra experimental observation with a standard uncertainty that determines its weight relative to the measured data.
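Treating a restraint as an extra observation amounts to appending rows to the least-squares problem. The sketch below does this for a deliberately simple linear case with two bond lengths observed directly and one similarity restraint. All names and numbers are assumptions for illustration.

```python
import numpy as np

def refine_with_restraints(X, y, w, R, targets, restraint_sigmas):
    """Weighted least squares with restraints appended as extra observations.

    Each restraint row r with target t and standard uncertainty s adds the
    term (t - r @ p)**2 / s**2 to the minimized sum.
    """
    X_aug = np.vstack([X, R])
    y_aug = np.concatenate([y, targets])
    w_aug = np.concatenate([w, 1.0 / np.asarray(restraint_sigmas, dtype=float) ** 2])
    A = X_aug.T @ (w_aug[:, None] * X_aug)
    b = X_aug.T @ (w_aug * y_aug)
    return np.linalg.solve(A, b)

# Two bond lengths "measured" as 1.50 and 1.56, each with sigma = 0.02.
X = np.eye(2)
y = np.array([1.50, 1.56])
w = np.full(2, 1.0 / 0.02 ** 2)

# Restraint: d1 - d2 should be about 0, with standard uncertainty 0.01.
R = np.array([[1.0, -1.0]])
bonds = refine_with_restraints(X, y, w, R, targets=np.array([0.0]),
                               restraint_sigmas=[0.01])
```

The restraint pulls the two lengths toward each other without forcing them to be equal. A smaller restraint uncertainty gives the restraint more weight relative to the data.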

Oscillations and Damping

During the refinement, parameter shifts are compared with the respective standard deviations in the parameters to see when the refinement has converged. For most data sets, convergence occurs after 4-5 cycles of refinement of a given set of parameters. Very occasionally parameter shifts oscillate from a positive value in one cycle to a negative value in the next cycle. This problem is usually overcome by applying damping to the parameter shifts. Note that oscillating refinements are sometimes an indication of a problem structure, so be sure to verify that the space group is correctly chosen. Damping may also be necessary for disordered structures where large correlations between the positional parameters of the disordered atoms exist. Note that damping arbitrarily causes the standard deviations of the parameters to be underestimated. Thus for the final cycles of refinement, damping should be reduced as far as possible or, better yet, completely removed.
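One common way to damp shifts is a Marquardt-type modification of the normal equations, in which the diagonal of the normal matrix is inflated before solving. The sketch below assumes that form. The function name and the damping value are illustrative, and individual refinement programs may implement damping differently.

```python
import numpy as np

def damped_shifts(normal_matrix, gradient, damping=0.0):
    """Solve (A + damping * diag(A)) shifts = g for the parameter shifts."""
    A = np.asarray(normal_matrix, dtype=float)
    g = np.asarray(gradient, dtype=float)
    A_damped = A + damping * np.diag(np.diag(A))
    return np.linalg.solve(A_damped, g)

A = np.array([[4.0, 0.0],
              [0.0, 1.0]])
g = np.array([4.0, 2.0])

full_shifts = damped_shifts(A, g)                  # undamped cycle
half_shifts = damped_shifts(A, g, damping=1.0)     # damped cycle
```

With `damping=1.0` every shift in this diagonal example is halved. Inverting the inflated matrix would also give smaller apparent variances, which is why damping should be removed before the final standard deviations are taken.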

Anomalous Scattering

The effects of anomalous scattering should be included in the model of a crystal structure whenever the structure contains atoms heavier than carbon. For centrosymmetric space groups, anomalous dispersion effects on the intensities are still measurable. Note, however, that Friedel's law still holds, so the anomalous scattering terms are included only to improve the model and slightly improve the final R factor. For noncentrosymmetric space groups, Friedel's law is no longer true when there is significant anomalous scattering. Compounds that crystallize in a space group with no center of symmetry and have sufficient anomalous dispersion can be assigned the proper absolute structure using a variety of tests, including the Hamilton R test [4], the Rogers η test [5], the Flack test [6], and the Hooft test [7]. The Flack test is generally the most reliable, but in sensitive cases, such as a sample with oxygen as the anomalous scatterer (heaviest atom) and data measured with Mo radiation, the Hooft test can discriminate between the two different chiralities.

Blocked Refinements

For problems with very large numbers of parameters it is possible to block the refinement, that is, to refine a portion of the parameters in one cycle and another portion in another cycle. Care must be taken to ensure that parameters of covalently bonded atoms are refined in the same block. If all covalently bonded atoms of a group cannot be refined in a single block, then some parameters must be refined in both blocks to ensure that correlations between the atoms are properly included. Chemically similar groups should also be refined in the same block to include correlations.

Structure Completion and Refinement Strategy

The steps to complete and refine a crystal structure are somewhat dependent on the program(s) being used for refinement and Fourier map generation. There are two important principles for all refinement methods. First, the model must be chemically reasonable. Second, the answer is in the data. The data will often tell you, through the difference map and an analysis of differences between the $F_o^2$ and $F_c^2$ values, what changes to make to improve the model. Be very careful when adding atoms to the model that are not seen in a difference map. The steps listed below are deliberately conservative, i.e., they are designed for poor quality data sets. For good quality data sets (average $F_o^2 / \sigma > 10$), least squares refinements usually converge rapidly and many of these steps can be combined.

Refinement of Disordered Fragments

There are a few general indicators that can point to disorder in molecular fragments. Displacement parameters on some atoms that are 2-3 times the values observed on other atoms of the same molecular fragment indicate some rather serious problem, usually disorder. Residual peaks in the difference map that are too close to existing atoms to form plausible bonds suggest disorder. Atoms that are too close to symmetry elements in the space group to produce chemically reasonable bonding after accounting for the symmetry-related atoms require a disordered model. Note that if the disorder appears to mimic a symmetry operation, consider modeling the structure with twinning.

  1. Remove all atoms in the region of the suspected disorder, and calculate structure factors and an electron density difference map. From the difference map identify approximate positions for one or more orientations of the uncertain atoms.

  2. Include the newly located atoms in the model with fixed isotropic displacement parameters. Geometrical restraints on the positions of these atoms must be applied. The restraints should have standard uncertainties that are of the order of magnitude expected for the final standard uncertainties of the bonds among the ordered atoms of similar atom type. Set the occupancies of these atoms to refine as a single variable for each orientation. In the refinement model, set up atom connections so that only chemically-reasonable bonds will be reported.

  3. Refine the model with at least twice the number of cycles per run that were used before the disorder was added to the model. Continue refining the model to convergence (maximum shift/error ratio < 0.05). If the refinement oscillates or is too slow, consider including a damping factor.

  4. If the disorder model is chemically complete continue with the next step. Otherwise, locate remaining peaks in the disorder model from the difference map and return to step 2 above.

  5. Include the isotropic displacement parameters in the list of refined variables and refine the model to convergence. If displacement parameters of one or more atoms become implausibly large, remove these atoms from the model and return to step 3 above.

  6. Include hydrogen atoms in the disordered parts of the model. These hydrogens should be modeled by geometry and either not refined or refined with a riding model. Refine to convergence.

  7. Once a plausible, chemically-reasonable model with isotropic displacement parameters has been attained, refine the disordered atoms with anisotropic displacement parameters. Restraints on the displacement parameters of the disordered atoms will probably be needed. Also, damping of the refinement shifts may be required. Refine to convergence.

Structure Refinement Checklist

Completed crystal structures must pass the following tests.

  1. The bonds in the model must be chemically reasonable. Similar bonds should have similar geometries, and all bond lengths, angles, etc., must match literature values.

  2. There should be no atoms with non-positive definite (NPD) displacement parameters. The displacement parameters should be checked for signs of systematic error. For example, ellipsoids of several heavy atoms aligned in one direction may indicate the need for a better absorption correction. Nonspherical or large ellipsoids suggest that the model may need to include disorder.

  3. The structure should be refined to convergence, that is, the maximum shift/error ratio should be < 0.05. All non-hydrogen atoms should be refined with anisotropic displacement parameters provided that there are at least 10 data per parameter. Lower data-to-parameter ratios indicate either that the data were not collected to a high enough scattering angle, or that Friedel-related (or equivalent) data were not collected for a structure in a noncentrosymmetric space group.

  4. Noncentrosymmetric space groups should be refined with the correct absolute structure. If anomalous scattering is being used to determine the absolute configuration, then the sample must contain atoms with sufficient differences in anomalous scattering to yield a meaningful absolute structure result. For example, the absolute structure cannot be determined under either of the following conditions: only one type of atom (besides hydrogen) with any type of radiation, or only first row atoms with Mo Kα radiation. The data must also be collected with sufficient quality (sufficient count times) to produce a large (> 10) mean $F^2/\sigma$. The best test for absolute structure is the Flack test [6].

  5. The weighting scheme should be adjusted so as to produce nearly constant values for the variances as functions of intensity and resolution. Doing this will also make the goodness of fit, S, have a value around 1.0.

  6. There should be no peaks with strong intensities in a list of worst-fitting data.

  7. The final difference map should have no abnormally high peaks or low valleys.

  8. The final R1 and wR2 should be reasonably low for the quality of data. Good small molecule crystal structures usually have R1 < 0.05. Acceptable small molecule crystal structures typically have R1 < 0.10.
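Several items in this checklist are numerical and can be screened automatically. The sketch below is a hypothetical helper, not part of any refinement program. It counts parameters as 9 per anisotropic atom, 4 per isotropic atom, and 1 overall scale factor, ignores any other refined variables, and uses an assumed tolerance of ±0.2 for "around 1.0" in the goodness of fit.

```python
def check_refinement(n_data, n_aniso_atoms, n_iso_atoms,
                     max_shift_over_error, r1, gof, gof_tolerance=0.2):
    """Screen the numerical checklist items; thresholds follow the text above,
    except gof_tolerance, which is an assumption of this sketch."""
    n_params = 9 * n_aniso_atoms + 4 * n_iso_atoms + 1
    data_per_param = n_data / n_params
    return {
        "n_params": n_params,
        "data_per_param": data_per_param,
        "converged": max_shift_over_error < 0.05,
        "enough_data": data_per_param >= 10.0,
        "gof_near_one": abs(gof - 1.0) <= gof_tolerance,
        "r1_good": r1 < 0.05,
        "r1_acceptable": r1 < 0.10,
    }

# Hypothetical structure: 20 anisotropic atoms refined against 3000 data.
report = check_refinement(n_data=3000, n_aniso_atoms=20, n_iso_atoms=0,
                          max_shift_over_error=0.001, r1=0.035, gof=1.03)
```

A report like this covers only the countable criteria. Chemical reasonableness, ellipsoid shapes, and the difference map still have to be inspected by the crystallographer.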

References

  1. G. M. Sheldrick, SHELXTL Reference Manual, Bruker-AXS, Inc., Madison, WI, 1997.
  2. A. J. C. Wilson, Acta Cryst., 1976, A32, 994-996.
  3. H. D. Flack and D. Schwarzenbach, Acta Cryst., 1988, A44, 499-506.
  4. W. C. Hamilton, Acta Cryst., 1965, 18, 502-510.
  5. D. Rogers, Acta Cryst., 1981, A37, 734-741.
  6. H. D. Flack, Acta Cryst., 1983, A39, 876-881.
  7. R. W. W. Hooft, L. H. Straver, and A. L. Spek, J. Appl. Cryst., 2008, 41, 96-103.