Canadian Macromolecular Crystallography Facility
User Guide – Data Collection Primer
The following sections are meant to provide practical advice on data collection on protein crystals at CMCF for those with experience collecting diffraction data. If you are new to collecting X-ray diffraction data, we advise working with your supervisor or experienced mentor until familiar with the process.
Native datasets are used to obtain high-resolution data in order to discern as much detail as possible about a structure or ligand. They are also used to solve new structures using Molecular Replacement methods when a similar structure is available. Therefore, obtaining the highest resolution spots is the primary motivation. However, it is also important to minimize overloaded pixels in the lower resolution spots for successful molecular replacement, as well as minimize radiation damage for a complete high-quality dataset.
To accomplish these goals, the collection parameters must be carefully balanced. Detector distance is adjusted such that all the desired spots fall on the detector surface with maximal spot separation. An oscillation or rotation angle per image, along with start and total angles, must be selected to minimize overlaps while maintaining a reasonable total time for data collection. In general, a sufficient total angle should be collected to achieve a multiplicity in the dataset of 4 or more whenever possible (above 1 in the case of P1). Finally, an exposure is chosen to obtain maximum intensity, balanced against the need to minimize overloaded pixels and radiation damage, both of which decrease the quality of the dataset.
Wavelength/Energy. The exact energy setting used for collecting native datasets is not critical, but it is generally chosen to lie within the higher-flux region specific to the beamline. Typically the energy is optimized to 12.658 keV on CMCF beamlines, an energy corresponding to the selenium K absorption edge (selenium is an element very commonly used for SAD phasing). On CMCF-BM there is an additional optional fixed-energy high-flux mode (8.157 keV) which can be used for collecting native datasets.
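As a rough illustration of how the energy settings above relate to wavelength, the following sketch uses the standard relation wavelength (angstrom) = 12.398 / energy (keV). The function names are illustrative only and are not part of MxDC or any beamline software.

```python
# Photon energy <-> wavelength conversion (illustrative helpers, not part of MxDC).
HC_KEV_ANGSTROM = 12.398419843  # h*c expressed in keV * angstrom


def energy_to_wavelength(energy_kev: float) -> float:
    """Return the wavelength in angstrom for a photon energy in keV."""
    if energy_kev <= 0:
        raise ValueError("energy must be positive")
    return HC_KEV_ANGSTROM / energy_kev


def wavelength_to_energy(wavelength_angstrom: float) -> float:
    """Return the photon energy in keV for a wavelength in angstrom."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


if __name__ == "__main__":
    # The two energies mentioned in this guide for native data collection.
    for energy in (12.658, 8.157):
        print(f"{energy:.3f} keV -> {energy_to_wavelength(energy):.4f} angstrom")
```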
Detector Distance. During the initial screening, take note of the quality of diffraction, and how well the crystal diffraction fills the image. Use the zoom feature, brightness & contrast settings to inspect the image. The middle mouse wheel can be conveniently used to adjust the image brightness to visualize how far the spots extend from the centre of the image. You may not see the weakest spots if the image is too bright; darkening the image and zooming into the areas farther from the centre allows you to inspect these spots. Shorter detector distances result in higher resolution at the edges. Note there are detector distance limits built into the data collection software.
Guideline: detector distance should be adjusted so that spots extend to cover approximately 80% of the distance from the centre of image to edge of the detector. The faintest spots may not be visible to the eye but may still be present some distance from the visible ones.
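The relationship between detector distance and resolution can be sketched with Bragg's law. The example below is a minimal geometric estimate that assumes a flat detector perpendicular to the beam, with the direct beam at the image centre; the function names and the detector radius passed in are hypothetical, and the limits enforced by the data collection software are not modelled.

```python
import math


def resolution_at_radius(wavelength_a, distance_mm, radius_mm):
    """Resolution d (angstrom) of a spot at a given radius on a flat detector.

    Bragg's law: d = wavelength / (2 sin(theta)), with tan(2 theta) = radius / distance.
    """
    two_theta = math.atan2(radius_mm, distance_mm)
    return wavelength_a / (2.0 * math.sin(two_theta / 2.0))


def distance_for_resolution(wavelength_a, target_d_a, detector_radius_mm,
                            fill_fraction=0.8):
    """Distance (mm) that places spots of resolution target_d_a at
    fill_fraction of the detector radius (0.8 follows the 80% guideline)."""
    s = wavelength_a / (2.0 * target_d_a)  # sin(theta)
    if not 0.0 < s < 1.0:
        raise ValueError("target resolution is not physically reachable")
    two_theta = 2.0 * math.asin(s)
    if two_theta >= math.pi / 2.0:
        raise ValueError("2-theta of 90 degrees or more cannot hit a flat detector")
    return fill_fraction * detector_radius_mm / math.tan(two_theta)


if __name__ == "__main__":
    # Hypothetical numbers: 0.9795 angstrom beam, 150 mm detector radius, 2.0 angstrom target.
    d_mm = distance_for_resolution(0.9795, 2.0, 150.0)
    print(f"suggested distance: {d_mm:.0f} mm")
    print(f"resolution at detector edge: {resolution_at_radius(0.9795, d_mm, 150.0):.2f} angstrom")
```

Because the faintest spots may extend beyond the visible ones, a value computed this way is only a starting point for the visual inspection described above.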
Oscillation or Rotation Angle per image, Starting & Total Angles. The Autoprocess algorithm is run with each sample screen that has 3 or more images and the results can be viewed in the analysis window of MxDC. You may also run Autoprocess or other programs such as Mosflm manually in order to get an idea of your unit cell, symmetry parameters and obtain recommendations for oscillation/rotation angle per image, total angle, and the best starting angle. In general, smaller oscillation/rotation angles per image are used for larger unit cells. Larger total angles are generally needed for lower symmetry crystals.
CMCF-BM Guidelines: Optimize the oscillation angle, starting angle and total angle according to the Autoprocess recommendations. Oscillations between 0.25 and 1.0 degree are recommended. When considering the total angle, aim for a multiplicity of 4 or higher (above 1 in the case of P1). If there is uncertainty, an oscillation of 0.5 degrees with a total angle of 180 degrees is usually reasonable. Keep in mind that the smaller the oscillation angle, and the larger the exposure time and total angle, the longer your total data collection will take.
CMCF-ID Guidelines: Optimize the oscillation angle, starting angle and total angle according to the Autoprocess recommendations. A rotation angle per image around 0.2 degrees is usually reasonable (0.5 to 1 degree for screening). For large unit cells, 0.1 or 0.15 degrees may provide better results. When considering the total angle, aim for multiplicity of 4 or higher (above 1 in the case of P1). If there is uncertainty, a total angle of 180 degrees is usually sufficient.
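To see how the angle per image, total angle and exposure time combine into a total collection time, a simple estimate can be written as follows. This is only a sketch: the function name is hypothetical, and the per-frame overhead (detector readout, goniometer motion) is an assumed parameter that defaults to zero because this guide does not give a value for it.

```python
import math


def plan_sweep(total_angle_deg, delta_deg, exposure_s, overhead_s=0.0):
    """Return (number of images, total time in seconds) for one sweep.

    overhead_s is an assumed per-frame overhead; it is not a documented
    beamline value.
    """
    if min(total_angle_deg, delta_deg, exposure_s) <= 0 or overhead_s < 0:
        raise ValueError("angles and exposure must be positive, overhead non-negative")
    # Round before ceil so floating-point noise cannot add a spurious extra frame.
    n_images = math.ceil(round(total_angle_deg / delta_deg, 6))
    total_time_s = n_images * (exposure_s + overhead_s)
    return n_images, total_time_s


if __name__ == "__main__":
    # Fallback values quoted in the guidelines above; the exposure times are
    # illustrative choices for each beamline.
    for label, delta, exposure in (("CMCF-BM", 0.5, 2.0), ("CMCF-ID", 0.2, 0.2)):
        n, t = plan_sweep(180.0, delta, exposure)
        print(f"{label}: {n} images, about {t / 60.0:.1f} min of exposure")
```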
Exposure. The MxDC software offers powerful options for optimizing the experiment. Beneath the diffraction image pane is an Information Button. The information window can be kept open during data collection and displays information about each image.
CMCF-BM Guidelines (MARCCD detector): Generally expect an average intensity between 25 and 200 (lower for well-diffracting crystals). The maximum intensity should generally be between 20,000 and 50,000 with few or no overloads (red pixels in MxDC). On beamline CMCF-BM, typical exposure per frame is between 2 and 15 seconds (0.2 - 1 degree oscillations), and the 200 or 150 micron beam size is recommended in most cases. Smaller beam sizes can be used, but note that there will be lower flux on the sample. If using the high-flux mode at 8.157 keV, exposure times will be shorter. Appropriate exposure times may be judged by the maximum intensity as described above.
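The CMCF-BM intensity guideline can be expressed as a small check. The sketch below is illustrative only: the function name is hypothetical, the maximum intensity and overload count are assumed to be read by the user from the MxDC information window, and the default tolerance of zero overloads is a conservative reading of "few or no overloads".

```python
def assess_frame(max_intensity, n_overloads, low=20_000, high=50_000,
                 max_overloads=0):
    """Compare one frame against the maximum-intensity guideline.

    low/high default to the native-dataset range quoted for CMCF-BM;
    max_overloads=0 is an assumption, not a documented limit.
    """
    if n_overloads > max_overloads or max_intensity > high:
        return "reduce exposure or attenuate"
    if max_intensity < low:
        return "consider a longer exposure"
    return "within guideline"


if __name__ == "__main__":
    print(assess_frame(max_intensity=35_000, n_overloads=0))
    print(assess_frame(max_intensity=8_000, n_overloads=0))
    print(assess_frame(max_intensity=65_000, n_overloads=12))
```

Other ranges can be passed through the low and high arguments when a different target applies.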
CMCF-ID Guidelines (PILATUS detector): Generally expect an average intensity between about 10 and 30. On beamline CMCF-ID, typical exposure is 0.2 to 0.4 seconds per 0.2 degrees. The exposure time in seconds should be chosen to be roughly equal to the angle increment, delta omega, in degrees (for example 0.2 seconds per 0.2 degrees). A higher ratio of exposure time to angle increment will increase radiation damage to the sample with minimal gains in resolution. The beam size is fixed at 50 microns.
- Z Dauter (2010) Carrying out an optimal experiment. Acta Crystallogr. D66, 389-392.
- Z Dauter & KS Wilson (2006) Principles of monochromatic data collection. In: MG Rossmann, E Arnold (eds) International Tables for Crystallography Volume F: Crystallography of biological macromolecules. International Tables for Crystallography, vol F. Springer, Dordrecht.
Single-wavelength Anomalous Diffraction (SAD) datasets are used to identify heavy-atom positions and/or solve new structures for which a suitable Molecular Replacement model is unavailable. Data are generally collected at the peak of the absorption curve of a naturally occurring heavy atom or heavy-atom derivative. Very accurate measurement of the lower-resolution data is important in order to measure anomalous intensity differences. To accomplish this, radiation damage must be minimized and good-quality, high-multiplicity data obtained.
Planning. Both CMCF beamlines can be used to collect anomalous datasets, but in general, the CMCF-BM beamline is chosen. It is more intuitive to avoid over-exposure and radiation damage on this beamline, while still obtaining sufficient intensities for SAD phasing. Before starting, be familiar with the heavy atom being used and its absorption edge energies. This information can be found in X-ray Absorption Edge Tables, and on the Scans page of MxDC. The CMCF beamlines can generally reach energies between 6 and 18 keV. If the absorption edge lies below this range, it cannot be reached and Sulfur-SAD (S-SAD) methods should be used instead.
Wavelength/Energy. Check the X-ray Absorption Edge Tables to choose an appropriate accessible energy for your heavy atom derivative (between 6 and 18 keV), or examine the periodic table display on the Scans page of MxDC. Adjust the beamline energy to a value near the edge of interest and, in the case of CMCF-ID, optimize the beam before continuing.
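Whether an edge can be reached is a simple range check. The sketch below uses only edge energies quoted elsewhere in this guide and the general 6 - 18 keV range; the names are illustrative, and the X-ray Absorption Edge Tables remain the reference for other elements.

```python
# Approximate accessible energy range of the CMCF beamlines, as stated in this guide.
BEAMLINE_RANGE_KEV = (6.0, 18.0)

# Edge energies (keV) quoted in this guide; consult edge tables for other elements.
EDGES_KEV = {
    "Se K": 12.658,
    "S K": 2.47,
    "I L1": 5.19,
    "Cs L1": 5.71,
}


def edge_is_accessible(edge_kev, energy_range=BEAMLINE_RANGE_KEV):
    """Return True if the edge energy lies within the beamline energy range."""
    low, high = energy_range
    return low <= edge_kev <= high


if __name__ == "__main__":
    for name, edge in EDGES_KEV.items():
        if edge_is_accessible(edge):
            print(f"{name} ({edge} keV): reachable, a MAD Scan at the edge is possible")
        else:
            print(f"{name} ({edge} keV): outside the range, use S-SAD-style collection")
```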
Before starting the MAD Scan, make sure your sample is centred properly. It is a good time to take the diffraction screening images to ensure the crystal is centred, to check the quality of diffraction, and to also get the Autoprocess screening algorithm started. Remember to use the "anomalous" option in Autoprocess or Mosflm to obtain an appropriate data collection strategy.
From the Scans page in MxDC, perform a MAD Scan. The following is an example of a scan obtained from a Zn-containing sample.
Along the x-axis, the energy is displayed in units of keV. The y-axis represents fluorescence counts. The fluorescence detector will saturate around 15,000 counts, so attenuation is needed if the readings are too high. If the result is a flat line at 0 counts, check that the beam is appropriately optimized, that the beamline shutters are open and that your sample is centred properly. If there is still no reading, call your user support person for help. If there is no metal in the sample, you will get a low-level reading that is essentially noise with no distinct peak. Exposure can be increased (and/or attenuation decreased) to ensure fluorescence count levels of around 50 - 100. This is a good starting base level for MAD Scans in general.
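The checks described above can be summarized in a short sketch. It assumes the fluorescence counts have been exported as a plain list of numbers, and it treats the mean of the first few points as the base level; both of these, along with the function name, are assumptions for illustration and do not describe how MxDC evaluates a scan.

```python
from statistics import mean

SATURATION_COUNTS = 15_000   # approximate detector saturation level from this guide
BASELINE_RANGE = (50, 100)   # suggested starting base level from this guide


def check_scan(counts, n_baseline=5):
    """Return a list of advisory messages for one fluorescence scan."""
    if not counts:
        raise ValueError("scan contains no points")
    messages = []
    if max(counts) <= 0:
        # A flat line at zero suggests no beam is reaching the sample.
        messages.append("flat at 0: check beam optimization, shutters and centring")
        return messages
    if max(counts) >= SATURATION_COUNTS:
        messages.append("near saturation: add attenuation")
    # Assumption: the first few points approximate the base level of the scan.
    baseline = mean(counts[:n_baseline])
    if baseline < BASELINE_RANGE[0]:
        messages.append("base level low: increase exposure or decrease attenuation")
    return messages


if __name__ == "__main__":
    example = [60, 70, 65, 80, 75, 900, 5200, 9100, 4300, 3900]
    print(check_scan(example) or "scan looks reasonable")
```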
Note: An XRF Scan (Excitation Scan) can be used instead of a MAD Scan to identify metals in the sample if there is uncertainty. To perform an Excitation Scan, the beamline energy must be optimized above the absorption edge(s) of the atom(s) of interest, instead of near the absorption edge of the metal of interest.
Once the MAD Scan is complete, Chooch is run automatically and outputs the peak energy, along with the inflection energy and a suggested remote energy. Inflection and remote energies would be used in addition to the peak energy for a Multiple-wavelength Anomalous Diffraction (MAD) experiment if desired. The calculated peak energy should correspond to the peak energy visible in the plot, and is the energy used for SAD data collection.
Adjust the beamline to the peak energy and, in the case of CMCF-ID, optimize. In the Data page of MxDC, make sure the peak energy and other values are correctly defined before starting the collection.
Detector Distance. In general, detector distance can be chosen as for native datasets, described above.
Oscillation or Rotation Angle per image, Starting & Total Angles. When the Autoprocess screening algorithm is run with the anomalous mode enabled, suggested values will be provided for collecting anomalous data. Mosflm has a similar anomalous option. An important difference for setting these values, as compared with native data, is the total angle.
Guideline: Larger total angle is needed for more multiplicity; 360 degrees is not uncommon. The amount of data collected may be judged by the anomalous signal obtained after collecting some images and/or solving the metal sites. Autoprocess in anomalous mode provides "anomalous signal" and "anomalous correlation" values. Look for anomalous signal at least above ~1 in low resolution shells, with significant correlation. The higher the better. At some point, radiation damage will outweigh the benefit of collecting more frames. In some cases, combining data from multiple crystals may be necessary.
Exposure. This should be much less than for native datasets. Radiation damage must be minimized as much as possible, therefore:
CMCF-BM Guideline: Aim for a maximum intensity between 5,000 and 20,000, with no overloads. Attenuation may be needed.
CMCF-ID Guideline: 0.2 seconds or more should be used to keep detector readout error <1%; beam attenuation is thus usually required to minimize radiation damage.
- Z Dauter (2006) Estimation of anomalous signal in diffraction data. Acta Crystallogr. D62, 867-876.
- JM Holton (2009) A beginner's guide to radiation damage. J. Synchrotron Radiat. 16, 133-142.
A Sulfur-SAD (S-SAD) experiment is a special case of SAD experiment in which anomalously scattering sulfur atoms, naturally present in most proteins, are used to solve the structure. In this case, the absorption edge (sulfur K edge = 2.47 keV) is below the lower energy limit of a typical crystallography beamline, including CMCF beamlines. Similar methods can be used for iodine (L1 edge 5.19 keV), cesium (L1 edge 5.71 keV), and several other elements which also have absorption edges near or below the lower accessible limit, with no edges within the optimal beamline energy range.
Since the absorption edge is below the energy range of the beamlines, a MAD Scan is not performed. Instead, the energy is optimized near the lower limit of the beamline (usually between 6.0 – 7.0 keV), where some anomalous scattering can still be observed. Since the signal is weaker farther from the absorption edge, and air scattering becomes more prominent at lower energies, care must be taken to measure the reflections accurately.
Planning. Both CMCF beamlines can be used to collect S-SAD datasets, but in general, the CMCF-BM beamline is chosen. It is more intuitive to avoid over-exposure and radiation damage on this beamline, while still obtaining sufficient intensities for S-SAD phasing.
Wavelength/Energy. The recommended energy is 7.0 keV.
Detector Distance. In general, the diffraction pattern should be of high quality with well-separated spots. The detector distance should be somewhat closer than for experiments using higher energies because the effects of air scattering become more prominent at lower energies.
Oscillation or Rotation Angle per image, Starting & Total Angles. These can be chosen in a similar manner as for SAD experiments, described above. However, typically more data are needed for S-SAD.
Guideline: Larger total angles are needed for more multiplicity, but this has to be balanced carefully against radiation damage. One or more 360 degree passes are not uncommon. The detector distance (and/or Chi angle on CMCF-BM) can be adjusted slightly between passes so that diffraction spots are measured on different parts of the detector in each pass. Multiple crystals can be used if radiation damage becomes an issue, and the datasets combined. The amount of data collected may be judged by the anomalous signal and/or by solving the sulfur sites. Autoprocess in anomalous mode provides "anomalous signal" and "anomalous correlation" values. Look for anomalous signal above about 1 in low resolution shells, with significant correlation.
Exposure. This should be much less than for native datasets. Radiation damage must be minimized as much as possible.
CMCF-BM Guideline: Aim for a maximum intensity around 5,000 with no overloads. Attenuation may be needed.
CMCF-ID Guideline: 0.2 seconds or more should be used to keep detector readout error <1%; beam attenuation is thus required to minimize radiation damage.
- Q Liu et al. (2012) Structures from anomalous diffraction of native biological macromolecules. Science 336, 1033-1037.
- M Cianci et al. (2008) The interdependence of wavelength, redundancy and dose in sulfur SAD experiments. Acta Crystallogr. D64, 1196-1209.
- UA Ramagopal et al. (2003) Phasing on anomalous signal of sulfurs: what is the limit? Acta Crystallogr. D59, 1020-1027.