## Steps in Structure Refinement

The following sections include a brief introduction to the steps involved in crystal structure refinement. These general steps may be applied to data processing with any computer package.

### Table of Contents

- Structure Completion
- Structure Refinement
- Structure Completion and Refinement Strategy
- Refinement of Disordered Fragments
- Structure Refinement Checklist
- References

### Structure Completion

Models from a structure solution often give only a partial
set of atoms in the unit cell. However, this partial set of atoms usually contains
sufficient phase information to allow the user to locate the remaining atoms.
From the atom types and relative positions in the initial model, a set of structure
factors may be calculated. An
electron density map can then be
prepared using the calculated phase angles and the observed
structure factors. Unfortunately, electron density maps,
with coefficients of |*F*_{o}|, tend to give
back only the parts of the model that are already known. It is usually more helpful
to calculate a *difference* electron density map. Difference maps are calculated using
coefficients of (|*F*_{o}| - |*F*_{c}|) with the
calculated phase angles. Difference maps tend to produce peaks where an insufficient
amount of electron density has been included in the model (e.g., missing atoms) and
produce negative holes or valleys where too much electron density has been included in
the model (e.g., too heavy an atom for the site).

Thus, from a partial model of the structure, structure factor and sometimes refinement calculations are performed that are then followed by a difference electron density map calculation. New atoms are located from the map and included in the model. This process is repeated until all non-hydrogen atoms are located.
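The cycle above can be illustrated with a toy calculation. The following is a minimal sketch, assuming a one-dimensional centrosymmetric cell with point atoms, so that every calculated phase is either 0 or π; the atom weights, positions, and function names are hypothetical and the map is left unscaled.

```python
import math

def structure_factor(atoms, h):
    """F(h) for a 1D centrosymmetric cell; each (Z, x) atom has a mate at -x."""
    return sum(2.0 * z * math.cos(2.0 * math.pi * h * x) for z, x in atoms)

def difference_map(f_obs, partial_model, n_grid=100):
    """Unscaled 1D difference synthesis with coefficients (|Fo| - |Fc|)."""
    coeffs = {}
    for h, fo in f_obs.items():
        fc = structure_factor(partial_model, h)
        sign_c = 1.0 if fc >= 0.0 else -1.0  # calculated phase: 0 or pi
        coeffs[h] = (fo - abs(fc)) * sign_c
    return [
        sum(c * math.cos(2.0 * math.pi * h * i / n_grid) for h, c in coeffs.items())
        for i in range(n_grid)
    ]

# Toy "observed" data: a heavy atom (Z=30) at x=0.10 and a light atom (Z=8) at x=0.35.
truth = [(30.0, 0.10), (8.0, 0.35)]
f_obs = {h: abs(structure_factor(truth, h)) for h in range(-10, 11)}

# The partial model contains only the heavy atom.
rho = difference_map(f_obs, [(30.0, 0.10)])
peak = max(range(len(rho)), key=rho.__getitem__)
print("highest difference peak at x =", peak / len(rho))
```

In this toy case the highest peak falls on the missing light atom (x = 0.35) or its centrosymmetric mate (x = 0.65), and the map is flat once the model is complete.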

### Structure Refinement

**Least-Squares Refinement**

The method of refinement most generally used in small molecule
crystallography is refinement by the principle of least squares. Assume that
*z* is a linear function of *x*_{n} variables and
*p*_{n} parameters that define the function.

*z* = *p*_{1}*x*_{1} +
*p*_{2}*x*_{2} + ... +
*p*_{n}*x*_{n}

The values of the function are known at m points with m > n.
The least squares principle asserts that the "best" values for the
*p*_{1}, *p*_{2}, ... , *p*_{n}
parameters are given by those that minimize the function

χ^{2} = ∑
w_{j} (z_{o,j} - z_{c,j})^{2}

where z_{o,j} is the measured value of the function at j,
z_{c,j} is the calculated value of the function at j, and w_{j}
is an assigned weight for the measured value.
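For a linear function the minimum of χ^{2} can be found directly from the normal equations. The following is a minimal sketch for the two-parameter case, assuming the function name and data layout shown here (they are illustrative, not taken from any refinement package).

```python
def weighted_least_squares_2(rows, z_obs, weights):
    """Fit z = p1*x1 + p2*x2 by minimising sum w_j (z_o,j - z_c,j)^2.

    rows    : list of (x1, x2) pairs, one per measurement
    z_obs   : measured values z_o,j
    weights : weights w_j
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x1, x2), z, w in zip(rows, z_obs, weights):
        # Accumulate the 2x2 normal matrix and the right-hand side.
        a11 += w * x1 * x1
        a12 += w * x1 * x2
        a22 += w * x2 * x2
        b1 += w * x1 * z
        b2 += w * x2 * z
    det = a11 * a22 - a12 * a12
    # Solve the normal equations by Cramer's rule.
    p1 = (b1 * a22 - b2 * a12) / det
    p2 = (a11 * b2 - a12 * b1) / det
    return p1, p2

rows = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
z_obs = [2.0 * x1 - 3.0 * x2 for x1, x2 in rows]  # error-free data, m = 4 > n = 2
print(weighted_least_squares_2(rows, z_obs, [1.0] * 4))
```

With error-free data the fit recovers the parameters used to generate it; with real data the weights decide how strongly each measurement pulls on the result.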

The parameters being refined in a crystal structure determination are
the *x, y,* and *z* positional parameters and the *U*
isotropic or the six *U*_{i,j} anisotropic parameters for each
atom. A typical refinement of *k* *isotropic* atoms would
utilize 4 *k* atom parameters, 3 positional and 1 displacement parameter
per atom. A typical refinement of *m* *anisotropic* atoms would
require 9 *m* parameters, 3 positional parameters and 6 anisotropic
parameters per atom. In addition to these atomic parameters, one overall scale
factor is refined. This scale factor accounts for a variety of items from the
size of the crystal to the intensity of the radiation source. The
|*F*_{c}| values are not linear functions of the atomic
parameters. In order to utilize the least-squares method, approximate values
for these
functions are obtained using a Taylor series and truncating the series after the
linear term. Because of the crude Taylor series approximation, several cycles of
refinement are required to achieve convergence.
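The parameter counts described above are easy to tabulate. A minimal sketch, assuming a model that contains only atomic parameters and one overall scale factor (the function name is hypothetical):

```python
def count_parameters(n_iso=0, n_aniso=0):
    """Number of refined parameters for a simple model.

    Each isotropic atom contributes x, y, z and U (4 parameters);
    each anisotropic atom contributes x, y, z and six U_ij (9 parameters);
    one overall scale factor is added.
    """
    return 4 * n_iso + 9 * n_aniso + 1

print(count_parameters(n_iso=10))    # 10 isotropic atoms
print(count_parameters(n_aniso=10))  # the same atoms refined anisotropically
```

Real refinements often carry additional parameters (for example an extinction or absolute structure parameter), and constraints on special positions reduce the count, so this is only a rough guide.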

**Refinement Based on F or F^{2}**

In crystallography, the parameters may be refined against either the
structure factors, Δ = ||*F*_{o}| -
|*F*_{c}||,
or the intensities, Δ = |*F*_{o}^{2} -
*F*_{c}^{2}|. In either case the function being minimized
is ∑ *w* Δ^{2}. Before
the 1990's most structures were refined using structure factors, *F*. For
these refinements, the very weak data present a problem. In particular, for some
data the intensity measured for the peak is less than the intensity measured for
the backgrounds. This produces a negative value for
*F*_{o}^{2}. For these data no reasonable value for
*F*_{o} can be determined. To avoid this problem, data with
*F*_{o} < n σ(*F*_{o}), where n
was usually 2-4, were simply omitted from the refinement. This of course
introduces systematic error into the results. Since it is reasonable for some
structure factors to sum to essentially *F*_{c} = 0, in principle
a significant portion of valuable data are being unreasonably rejected from
the refinement.

Refinements based on *F*^{2} can utilize
all data, including the very weak data. When the data are strong with few
unobserved data, either refinement method works well, and both
produce similar answers. However, when the data are weak or have significant
problems, refinement based on *F*^{2} is preferred. More important,
for weak data sets, refinements based on *F*^{2} give
more chemically reasonable bond lengths, bond angles, and displacement parameters than
do refinements based on *F*.
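The difference between the two residuals for a weak reflection can be seen in a few lines. This is a minimal sketch with hypothetical function names; it only illustrates why a negative *F*_{o}^{2} is unusable in a refinement on *F* but perfectly usable in a refinement on *F*^{2}.

```python
import math

def f_residual(fo_sq, fc):
    """Delta for refinement on F: ||Fo| - |Fc||.

    Returns None when Fo^2 is negative, because no real Fo can be derived.
    """
    if fo_sq < 0.0:
        return None
    return abs(math.sqrt(fo_sq) - abs(fc))

def f_sq_residual(fo_sq, fc):
    """Delta for refinement on F^2: |Fo^2 - Fc^2|, defined for any Fo^2."""
    return abs(fo_sq - fc * fc)

# A weak reflection whose peak intensity is below the background.
print(f_residual(-0.5, 0.2))     # no usable Fo
print(f_sq_residual(-0.5, 0.2))  # still contributes to the refinement
```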

**Weights**

The weights used in least squares refinement are generally chosen to represent the relative influence an observation should have on the results. Weights typically include some term representing the statistical error of the measured data.

w = 1 / σ^{2}(*Y*)

where *Y* is *F* or *F*^{2} depending upon the
function being minimized.

Unfortunately, the data nearly always include systematic errors along with the statistical errors. The systematic errors most often found in data are an ignored or improper absorption correction and a secondary extinction error. The strong, low scattering angle data are significantly affected by secondary extinction problems. Ideally these errors should be removed from the data by applying the proper corrections to the data. Since these and other systematic corrections cannot always be applied, it is found that reducing the weights based upon relative intensities of the peaks produces a more reasonable refinement. Thus a common weighting scheme is given by:

w = 1 / (σ^{2}(*Y*) +
*k F*_{o}^{2})

The value of *k* is often determined empirically, and
typically has values in the range of 0.0001 - 0.02.

More complicated weighting schemes are also possible. Parameters in the
weighting scheme are usually modified so that the variances, or sums of
the squares of the Δ's, for any group of data are similar
to variances of all similar groups of data. In the refinement program
SHELXL^{1}, which performs refinements based on *F*^{2},
weights usually assume the form

w_{hkl} = 1 / [σ^{2}(*F*_{o,hkl}^{2}) + (a P)^{2} + b P]

where P = [2 *F*_{c}^{2} +
Max(*F*_{o}^{2}, 0)] / 3. The values for a and b are chosen
to give an even distribution of the variances across all groups of data based on the
relative intensities. Wilson^{2} found that the use of P rather than
*F*^{2} reduced statistical bias.
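The weighting expressions in this section translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the numerical values of *k*, a and b are arbitrary examples rather than recommended settings.

```python
def simple_weight(sigma_y, fo_sq, k=0.001):
    """w = 1 / (sigma^2(Y) + k * Fo^2)."""
    return 1.0 / (sigma_y ** 2 + k * fo_sq)

def p_term(fo_sq, fc_sq):
    """P = [2 Fc^2 + Max(Fo^2, 0)] / 3."""
    return (2.0 * fc_sq + max(fo_sq, 0.0)) / 3.0

def p_based_weight(fo_sq, fc_sq, sigma_fo_sq, a, b):
    """w = 1 / [sigma^2(Fo^2) + (a P)^2 + b P]."""
    p = p_term(fo_sq, fc_sq)
    return 1.0 / (sigma_fo_sq ** 2 + (a * p) ** 2 + b * p)

# A strong reflection is down-weighted relative to pure counting statistics.
print(1.0 / 0.5 ** 2)                           # statistical weight only
print(p_based_weight(9.0, 3.0, 0.5, 0.1, 0.2))  # with the (a P)^2 and b P terms
```

Note that a negative *F*_{o}^{2} enters P as zero, so the weight stays finite and positive for the weakest data.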

**Refinement Statistics**

One way to judge how well the model fits the observed data is to
calculate discrepancy factors. For refinement based on
*F*^{2}, the following discrepancy factors are routinely
calculated.

*wR*_{2} = { ∑ [w
(*F*_{o}^{2} -
*F*_{c}^{2})^{2}] /
∑ [w (*F*_{o}^{2})^{2}] }^{1/2}

*R*_{1} = ∑
|(|*F*_{o}| - |*F*_{c}|)| /
∑ |*F*_{o}|

The *wR*_{2} is a weighted *R* factor based upon
all data that allows the crystallographer to follow the progress of the
refinement. The numerator of the *wR*_{2} expression is the
function being minimized in the refinement. The *R*_{1}
expression, which is based only on the observed data, those with *F*_{o} > 2
σ(*F*_{o}), is called the unweighted *R*.

The *R*_{1} expression has always been reported with refinements on
*F*. To be able to compare refinements on *F*^{2} with each
other and with refinements on *F*, the *R*_{1}
is still calculated and reported.
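Both discrepancy factors follow directly from the expressions above. A minimal sketch, assuming the reflection lists have already been filtered as required (all data for *wR*_{2}, observed data only for *R*_{1}); the function names are hypothetical.

```python
import math

def wr2(fo_sq, fc_sq, weights):
    """wR2 = { sum[w (Fo^2 - Fc^2)^2] / sum[w (Fo^2)^2] }^(1/2)."""
    num = sum(w * (o - c) ** 2 for o, c, w in zip(fo_sq, fc_sq, weights))
    den = sum(w * o ** 2 for o, w in zip(fo_sq, weights))
    return math.sqrt(num / den)

def r1(fo, fc):
    """R1 = sum ||Fo| - |Fc|| / sum |Fo|."""
    num = sum(abs(abs(o) - abs(c)) for o, c in zip(fo, fc))
    den = sum(abs(o) for o in fo)
    return num / den

fo = [10.0, 20.0]
fc = [9.0, 22.0]
print(r1(fo, fc))
print(wr2([f * f for f in fo], [f * f for f in fc], [1.0, 1.0]))
```

For the same data *wR*_{2} is typically noticeably larger than *R*_{1}, because it is based on squared quantities.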

Another statistic that is reported with crystal structure
refinements is the *goodness of fit*, *S*. Technically, the
goodness of fit is the standard deviation of an observation of unit
weight.

In practice the goodness of fit shows how reliable the standard
deviations of the positional and displacement parameters of the atoms really
are. The standard deviations of the atomic parameters should be multiplied by
the goodness of fit to give more realistic estimates of the standard deviations.
These adjusted standard deviations can be compared with similar values from other
structures. The goodness of fit is strongly influenced
by the weighting scheme. Thus crystallographers will modify the
weighting scheme to force the goodness of fit to have a value near to 1.0 and hence
the standard deviations can be used directly as they are determined. For a
refinement on *F*^{2} the goodness of fit has the form:

*GoF* = *S* = {∑
[w(*F*_{o}^{2} -
*F*_{c}^{2})^{2}] / (n-p)}^{1/2}

where n = number of measured data and p = number of parameters.
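The goodness of fit and the rescaling of standard deviations described above can be sketched as follows; the function names and the toy numbers are illustrative assumptions.

```python
import math

def goodness_of_fit(fo_sq, fc_sq, weights, n_params):
    """S = { sum[w (Fo^2 - Fc^2)^2] / (n - p) }^(1/2)."""
    n = len(fo_sq)
    chi_sq = sum(w * (o - c) ** 2 for o, c, w in zip(fo_sq, fc_sq, weights))
    return math.sqrt(chi_sq / (n - n_params))

def rescaled_esd(esd, s):
    """Multiply a parameter standard deviation by S for a more realistic estimate."""
    return esd * s

fo_sq = [1.0, 2.0, 3.0, 4.0, 5.0]
fc_sq = [1.0, 2.0, 3.0, 4.0, 7.0]
s = goodness_of_fit(fo_sq, fc_sq, [1.0] * 5, n_params=1)
print(s, rescaled_esd(0.002, s))
```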

**Correlations**

When the shifts in pairs of parameters are not independent
of each other, the parameters are said to be *correlated*. Correlations can
be either positive (shifts of the parameters in question have the same sign) or
negative (shifts in the parameters have opposite signs). Correlations can assume
any value from -1 (complete negative correlation), through 0 (no correlation), to +1
(complete positive correlation). Large correlations, those with a magnitude between
0.5 and 1.0, are specifically noted by most refinement programs. To successfully
refine parameters with large correlations, the starting model must be very close
to its local minimum. Refining parameters with large correlations requires more
cycles of refinement to achieve convergence.
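Correlation coefficients are normally derived from the variance-covariance matrix of the refined parameters. The sketch below assumes such a matrix is available as a nested list (how a given program exposes it is not covered here) and flags pairs whose correlation magnitude is 0.5 or more; the function names are hypothetical.

```python
import math

def correlation_matrix(cov):
    """Convert a variance-covariance matrix into correlation coefficients."""
    n = len(cov)
    return [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
        for i in range(n)
    ]

def large_correlations(cov, threshold=0.5):
    """Return (i, j, r) for every parameter pair with |r| >= threshold."""
    corr = correlation_matrix(cov)
    n = len(cov)
    return [
        (i, j, corr[i][j])
        for i in range(n)
        for j in range(i + 1, n)
        if abs(corr[i][j]) >= threshold
    ]

cov = [
    [4.0, 3.0, 0.0],
    [3.0, 9.0, 0.0],
    [0.0, 0.0, 1.0],
]
print(large_correlations(cov))
```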

Some large correlations are expected and quite reasonable. In nearly all structures with heavy atoms, the overall scale factor and the displacement parameters of the heavy atom(s) are correlated. Large correlations can also occur between the different anisotropic displacement parameters of any particular atom. If the unit cell angles are far from 90° then it is not uncommon to see large correlations between the corresponding x, y, and z parameters for a given atom. For example, in a monoclinic structure with β > 100°, the x and z parameters of a heavy atom are usually strongly correlated. In disordered structures, it is common to see large correlations between the positional and displacement parameters of atoms in close proximity with other atoms.

Some large correlations can also signal problems with the model. In particular, large correlations between the positional parameters of different atoms, e.g., the x parameter of one atom and the x parameter of another atom, when the atoms are not disordered, suggest that the space group may be wrong. A higher symmetry space group can usually be found that has symmetry operations that relate the two atoms being modeled separately in the lower symmetry space group.

**Constraints in Refinement**

Some crystal structure parameters must either be explicitly set and
not refined or must be related to other variables and not allowed to independently
refine. These conditions are called constraints.

An example of a
constraint is the fixing of the *x, y,* and *z* coordinates of an
atom that sits on a crystallographic center of symmetry.

**Restraints in Refinement**

When the data are not of good quality or parts of the structure are
disordered, it may be necessary to impose restraints.

Restraints are
additional information that is not exact, but is subject to a probability
distribution. The lengths of two chemically but not crystallographically
equivalent bonds could be restrained to be approximately equal. A restraint
is treated as an extra experimental observation with a standard uncertainty
that determines its weight relative to the measured data.
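Treating a restraint as an extra observation means that it simply adds one more weighted squared difference to the sum being minimized. A minimal sketch of that idea, with hypothetical function names and an arbitrary example standard uncertainty of 0.02 Å:

```python
def restraint_term(value, target, sigma):
    """Weighted squared deviation of a restrained quantity, with w = 1 / sigma^2."""
    return ((value - target) / sigma) ** 2

def minimised_sum(data_terms, restraint_terms):
    """Total function: weighted data residuals plus weighted restraint residuals."""
    return sum(data_terms) + sum(restraint_terms)

# Two chemically equivalent bonds restrained to be approximately equal:
# the "target" for one bond length is the current length of the other.
d1, d2 = 1.52, 1.50
penalty = restraint_term(d1, d2, sigma=0.02)
print(penalty)
print(minimised_sum([0.8, 1.1, 0.9], [penalty]))
```

A smaller standard uncertainty makes the restraint weigh more heavily against the diffraction data; a constraint corresponds to the limit where no deviation is allowed at all.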

**Oscillations and Damping**

During the refinement, parameter shifts are compared with the respective standard deviations in the parameters to see when the refinement has converged. For most data sets, convergence occurs after 4-5 cycles of refinement of a given set of parameters. Very occasionally parameter shifts oscillate from a positive value in one cycle to a negative value in the next cycle. This problem is usually overcome by applying damping to the parameter shifts. Note that oscillating refinements are sometimes an indication of a problem structure--be sure to verify that the space group is correctly chosen. Damping may also be necessary for disordered structures where large correlations between the positional parameters of the disordered atoms exist. Note that damping arbitrarily causes the standard deviations of the parameters to be underestimated. Thus for the final cycles of refinement, damping should be reduced to as small as possible, or better yet, completely removed.
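The convergence test and the effect of damping can be sketched in a few lines. This is an illustration only: it assumes the simplest possible damping, in which every shift is scaled by a constant factor, and refinement programs may implement damping differently. The function names are hypothetical.

```python
def max_shift_over_error(shifts, esds):
    """Largest |shift| / standard deviation over all refined parameters."""
    return max(abs(s) / e for s, e in zip(shifts, esds))

def apply_damped_shifts(params, shifts, damping=0.0):
    """Apply parameter shifts scaled by (1 - damping); damping=0 applies full shifts."""
    scale = 1.0 - damping
    return [p + scale * s for p, s in zip(params, shifts)]

shifts = [0.001, -0.004]
esds = [0.1, 0.1]
print(max_shift_over_error(shifts, esds))  # compare with a limit such as 0.05
print(apply_damped_shifts([1.0, 2.0], [0.4, -0.2], damping=0.5))
```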

**Anomalous Scattering**

The effects of
anomalous scattering should be
included in the model of a crystal structure whenever the structure contains
atoms heavier than carbon. For centrosymmetric space groups, anomalous dispersion
effects on the intensities are still measurable. Note, however, that Friedel's
law still holds. Thus the anomalous scattering terms are included only to
improve the model and slightly improve the final *R* factor. For
noncentrosymmetric space groups, Friedel's law is no longer true when there
is significant anomalous scattering. Compounds that crystallize in a
space group with no center of symmetry and have sufficient anomalous dispersion
can be assigned the proper
absolute structure using a variety of tests including the Hamilton *R*
test^{4}, the Rogers η test^{5}, the Flack test^{6},
and the Hooft test^{7}. The Flack test is generally the most reliable, but in
sensitive cases, such as a sample with oxygen as the anomalous scatterer (heaviest
atom) and data measured with Mo radiation, the Hooft test can discriminate
between the two different chiralities.

**Blocked Refinements**

For problems with very large numbers of parameters it is possible
to *block* the refinement, or refine a portion of the parameters in one
cycle, and then another part of the parameters in another cycle. Care must be
taken to ensure that parameters of covalently bonded atoms are refined in the
same block. If all covalently bonded atoms of a group cannot be refined in a
single block, then the different blocks must refine some parameters in both
blocks to ensure correlations between the atoms are properly included. Also
chemically-similar groups should be refined in the same block to include
correlations.

### Structure Completion and Refinement Strategy

The steps to complete and refine a crystal structure are somewhat
dependent on the program(s) being used for refinement and Fourier map generation.
There are two important principles for all refinement methods. First, the
model must be chemically reasonable. Second, the answer is in the data. The
data will often tell you, through the difference map and an analysis of
differences between the *F*_{o}^{2} and
*F*_{c}^{2} values, what changes to make to
improve the model. Be very careful when adding atoms to the model that are
not seen in a difference map. The steps listed below are deliberately
conservative, i.e., they are designed for poor quality data sets. For good
quality (average (*F*_{o}^{2} /
σ) > 10) data sets, least squares refinements usually
converge rapidly and many of these steps can be combined.

1. Translate or rotate the coordinates of all groups in the structure until the centroids of the groups are within one unit cell (centroid coordinates between 0.0 and 1.0).

2. If the space group is polar (the origin is not defined in one or more directions by the space group operators), then define the origin by either holding the appropriate coordinate(s) of a heavy atom fixed or by restraining all coordinates in the polar axis to sum to a constant value. This latter *floating origin* approach is described by H. D. Flack and D. Schwarzenbach^{3}, and is used in the SHELXL program.

3. For atoms sitting on special positions in the space group, apply the appropriate constraints to hold these atoms on the respective special positions. Usually this requires that one or more coordinates are held fixed. The SHELXL program automatically constrains atoms at (or very near) special positions to be on the special positions. Change the occupancies of these atoms to have values equal to the ratio of the number of symmetry operators for the special position to the number of symmetry operators in the space group. The occupancies are automatically adjusted by SHELXL.

4. Assign reasonable isotropic displacement parameters to all atoms. For room temperature data sets of organic or organometallic compounds these displacement parameters should be in the range of *U* = 0.03 to 0.05. Low temperature data for these types of compounds typically have displacement parameters in the range 0.02 to 0.04. Structures with strong bonding networks, such as minerals, usually have displacement parameters in the range 0.001 to 0.02. If the intensity data are weak (mean (*I* / σ) < 7), fix the displacement parameters for all atoms with Z < 10.

5. Refine the structure using a reasonable weighting scheme. For programs that refine on *F*, unit weights are suitable in the early stages of refinement, but statistical weights should be used for any final refinement cycles. For programs that refine on *F*^{2}, statistical weights should be used for all cycles of refinement. For the SHELXL program begin with `WGHT 0.10`.

6. Begin the structure factor, least-squares, and Fourier map calculations.

7. From a difference map, add non-hydrogen atoms to the model that have chemically-reasonable bond distances and angles. Repeat the structure factor, least squares refinement, and Fourier map calculations until all non-hydrogen atoms are located and until the positional parameters have roughly converged (all shift/error ratios are < 0.1). Achieving convergence may require that rigid group or distance restraints be applied to poorly determined regions of the structure.

8. Use the difference map as a guide for the following steps in refinement. The largest peaks and valleys in the difference map indicate where the next changes should be made. It is usually best to refine to convergence before beginning each of the following steps.

9. Refine any heavy atoms with anisotropic displacement parameters. Atoms on special positions may require constraints on the parameters. These constraints are automatically applied in the SHELXL program.

10. Locate and refine the positions of the hydrogen atoms. For many hydrogen atoms, it is possible to simply calculate the positions from known geometry. If hydrogen atom positions are to be refined, be sure that their final positions represent chemically-reasonable geometry.

11. Refine the isotropic displacement parameters of the light, non-hydrogen atoms that were fixed in the early stages of refinement.

12. Regions of the structure exhibiting disorder (more than one orientation for a given group) should be carefully modeled. The occupancies of all atoms in a given orientation must be assigned equivalent values. Often the geometry of the disordered atoms must be restrained to give chemically-reasonable values. The displacement parameters should be initially set at reasonable fixed isotropic values. As the model converges, the displacement parameters may be refined isotropically and finally anisotropically (often with restraints).

13. Be careful not to *over*model the structure. Do not add unnecessary parameters in the search for a lower R value.

14. Include a secondary extinction correction in the model, if needed. Secondary extinction is a multiple diffraction problem that shows up as reduced measured intensities especially for the strong, low scattering angle data. This effect is more pronounced in data from larger crystals. Often empirical absorption corrections at least partially correct for secondary extinction.

15. If necessary remove obviously poor fitting data. An example would be peaks that had measured positions behind the beam stop. These data would have the lowest 2θ values and would have very small *F*_{o}^{2} and moderate to large *F*_{c}^{2}.

16. If the space group is noncentrosymmetric, check the absolute structure for correct handedness and for possible racemic twinning. The best test for correct absolute structure is the Flack test.^{6} If the wrong absolute structure was chosen, the correct absolute structure is usually obtained by inverting through the center of the unit cell. When the space group is one of 11 pairs of enantiomorphic space groups (e.g. *P*3_{1} | *P*3_{2}) then the symmetry operators must also be changed to those of the enantiomorphic space group. Finally, there are 7 high symmetry space groups that must be inverted through some other point than the origin. These space groups and the corresponding inversion points are listed in different references.^{7,8}

17. Refine all appropriate non-hydrogen atoms with anisotropic displacement parameters. If the displacement parameters for some of these atoms become *non-positive definite* or *npd*, then carefully consider the model. Is the correct hybridization being used for all nearby atoms? If the correct hybridizations are being used, then try modifying the displacement parameters to correct the *npd* problem and apply restraints to the displacement parameters to force a chemically-reasonable result. Check the difference map for the appearance of peaks that may indicate the need to use a disorder model.

18. Refine the structure to full convergence (all shift / error ratios are < 0.05). Check the analysis of variances and list of worst fitting data for outliers. Check the difference map for large peaks or valleys. Modify the weighting scheme terms so that the goodness of fit is approximately 1.0 and so that the effect of *R* values is similar for all ranges of the data as a function of scattering angle. The SHELXL program suggests an optimized weighting scheme. If you add about 5 % to both of the terms of the SHELXL suggested weighting scheme, you will get a goodness of fit that is much closer to 1.0.
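The first step of the strategy, moving each group so that its centroid lies within one unit cell, can be sketched as follows. This is a minimal illustration that uses only whole lattice translations on fractional coordinates (it does not apply rotations or other symmetry operations); the function name and coordinates are hypothetical.

```python
import math

def move_group_into_cell(frac_coords):
    """Shift a group by whole unit cells so its centroid lies in [0, 1) on each axis.

    frac_coords: list of (x, y, z) fractional coordinates for one group.
    """
    n = len(frac_coords)
    centroid = [sum(atom[k] for atom in frac_coords) / n for k in range(3)]
    # Integer lattice translation that brings the centroid into the cell.
    shift = [math.floor(c) for c in centroid]
    return [tuple(atom[k] - shift[k] for k in range(3)) for atom in frac_coords]

group = [(1.2, -0.3, 0.5), (1.4, -0.1, 0.7)]
print(move_group_into_cell(group))
```

The whole group is translated together, so bond lengths and angles within the group are unchanged.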

### Refinement of Disordered Fragments

There are a few general indicators that can point to disorder in molecular fragments. Displacement parameters on some atoms that are 2-3 times the values observed on other atoms of the same molecular fragment indicate some rather serious problem, usually disorder. Residual peaks in the difference map that are too close to existing atoms to form plausible bonds suggest disorder. Atoms that are too close to symmetry elements in the space group to produce chemically-reasonable bonding after accounting for the symmetry-related atoms require a disordered model. Note that if the disorder appears to mimic a symmetry operation, consider modeling the structure with twinning.
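The first indicator, unusually large displacement parameters within a fragment, is easy to screen for automatically. The sketch below is one possible heuristic, assuming the median *U* of the fragment is used as the reference value and a factor of 2 as the cutoff; both choices, and the function name, are illustrative assumptions.

```python
import statistics

def flag_possible_disorder(u_iso, ratio=2.0):
    """Return labels of atoms whose U is at least `ratio` times the fragment median.

    u_iso: dict mapping atom label -> isotropic (or equivalent) U value.
    """
    reference = statistics.median(u_iso.values())
    return sorted(label for label, u in u_iso.items() if u >= ratio * reference)

fragment = {"C1": 0.030, "C2": 0.035, "C3": 0.032, "C4": 0.090}
print(flag_possible_disorder(fragment))
```

A flagged atom is only a prompt to inspect the difference map around it; the map, not the screen, decides whether a disorder model is warranted.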

1. Remove all atoms in the region of the suspected disorder, and calculate structure factors and an electron density difference map. From the difference map identify approximate positions for one or more orientations of the uncertain atoms.

2. Include the newly located atoms in the model with fixed isotropic displacement parameters. Geometrical restraints on the positions of these atoms must be applied. The restraints should have standard uncertainties that are of the order of magnitude expected for the final standard uncertainties of the bonds among the ordered atoms of similar atom type. Set the occupancies of these atoms to refine as a single variable for each orientation. In the refinement model, set up atom connections so that only chemically-reasonable bonds will be reported.

3. Refine the model with at least twice the number of cycles per run that were used before the disorder was added to the model. Continue refining the model to convergence (maximum shift/error ratio < 0.05). If the refinement oscillates or is too slow, consider including a damping factor.

4. If the disorder model is chemically complete continue with the next step. Otherwise, locate remaining peaks in the disorder model from the difference map and return to step 2 above.

5. Include the isotropic displacement parameters in the list of refined variables and refine the model to convergence. If displacement parameters of one or more atoms become implausibly large, remove these atoms from the model and return to step 3 above.

6. Include hydrogen atoms in the disordered parts of the model. These hydrogens should be modeled by geometry and either not refined or refined with a riding model. Refine to convergence.

7. Once a plausible, chemically-reasonable model with isotropic displacement parameters has been attained, refine the disordered atoms with anisotropic displacement parameters. Restraints on the displacement parameters of the disordered atoms will probably be needed. Also, damping of the refinement shifts may be required. Refine to convergence.

### Structure Refinement Checklist

Completed crystal structures must pass the following tests.

- The bonds in the model must be chemically reasonable. Similar bonds should have similar geometries, and all bond lengths, angles, etc., must match literature values.

- There should be no atoms with displacement parameters that are non-positive definite, *npd*. The displacement parameters should be checked for signs of systematic error. For example, ellipsoids of several heavy atoms aligned in one direction may indicate the need for a better absorption correction. Nonspherical or large ellipsoids suggest that the model may need to include disorder.

- The structure should be refined to convergence, that is, the maximum shift/error ratio should be < 0.05. All non-hydrogen atoms should be refined with anisotropic displacement parameters provided that there are at least 10 data per parameter. Lower data-to-parameter ratios indicate that either the data were not collected to a high enough scattering angle, or that Friedel-related (or equivalent) data were not collected for a structure in a noncentrosymmetric space group.

- Noncentrosymmetric space groups should be refined with the correct absolute structure. If anomalous scattering is being used to determine the absolute configuration, then the sample must contain atoms with sufficient differences in anomalous scattering to yield a meaningful absolute structure result. For example, the absolute structure cannot be determined under either of the following conditions: only one type of atom (besides hydrogen) and any type of radiation, or only first row atoms while using Mo *K*α radiation. Also the data must be collected with sufficient quality (sufficient count times) to produce large (>10) mean *F*^{2}/σ. The best test for absolute structure is the Flack test.^{6}

- The weighting scheme should be adjusted so as to produce nearly constant values for the variances as functions of intensity and resolution. Doing this will also make the goodness of fit, *S*, have a value around 1.0.

- There should be no peaks with strong intensities in a list of worst-fitting data.

- The final difference map should have no abnormally high peaks or low valleys.

- The final *R*_{1} and *wR*_{2} should be reasonably low for the quality of data. *Good* small molecule crystal structures usually have *R*_{1} < 0.05. *Acceptable* small molecule crystal structures typically have *R*_{1} < 0.10.
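The numerical items in this checklist can be collected into a simple screening function. This is a sketch only: the thresholds for convergence, data-to-parameter ratio, and *R*_{1} come from the text above, while the tolerance of 0.1 on the goodness of fit and the function name are assumptions made for illustration. The chemical and difference-map checks still require inspection by the crystallographer.

```python
def check_refinement(max_shift_error, n_data, n_params, gof, r1, gof_tol=0.1):
    """Return a list of checklist items that the refinement fails."""
    issues = []
    if max_shift_error >= 0.05:
        issues.append("not converged (max shift/error >= 0.05)")
    if n_data / n_params < 10:
        issues.append("fewer than 10 data per parameter")
    if abs(gof - 1.0) > gof_tol:  # tolerance is an assumed value
        issues.append("goodness of fit not close to 1.0")
    if r1 >= 0.10:
        issues.append("R1 above the acceptable range (>= 0.10)")
    elif r1 >= 0.05:
        issues.append("R1 acceptable but not good (>= 0.05)")
    return issues

print(check_refinement(0.01, 3000, 250, 1.02, 0.035))
print(check_refinement(0.20, 1000, 250, 1.50, 0.120))
```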

### References

1. G. M. Sheldrick, *SHELXTL Reference Manual*, **1997**, Bruker-AXS, Inc., Madison, WI.
2. A. J. C. Wilson, *Acta Cryst.*, **1976**, A*32*, 994-996.
3. H. D. Flack & D. Schwarzenbach, *Acta Cryst.*, **1988**, A*44*, 499-506.
4. W. C. Hamilton, *Acta Cryst.*, **1965**, *18*, 502-510.
5. D. Rogers, *Acta Cryst.*, **1981**, A*37*, 734-741.
6. H. D. Flack, *Acta Cryst.*, **1983**, A*39*, 876-881.
7. R. W. W. Hooft, L. H. Straver, and A. L. Spek, *J. Appl. Cryst.*, **2008**, *41*, 96-103.