CANADIAN MACROMOLECULAR CRYSTALLOGRAPHY FACILITY

Data collected at CMCF are sent to MXproc (previously AutoProcess), an automatic processing pipeline developed in-house. MXproc generates the appropriate input files for XDS and runs it. It uses POINTLESS to identify the space group from systematic absences and BEST to determine a data collection strategy. Future iterations of MXproc will also support DIALS. All generated input and output files are saved in a new directory, including .mtz, .shelx and .cns reflection output files, which can be used directly in your favourite structure solution/refinement programs.

Users do not need to do anything to run MXproc by default. However, you can choose to resubmit your data with various changes (such as restricting the frames used, merging data from multiple crystals, or specifying the unit cell).

Running MXproc

Run MXproc either from the MxDC data collection software or from a Data Processing terminal command line. Any local terminal can be moved to the Data Processing cluster by typing hpc, or you can open a new Data Processing terminal using the Desktop icon.

In MxDC, running MXproc is a seamless process. Go to the Analysis tab, select the dataset(s), modify the processing options and click Run. Processing in MxDC automatically uploads the results to MxLIVE, associated with the sample on which the data were collected. Additional information is available in the MxDC "Analysis" tab.

Command-line input

MXproc will create a subdirectory for output files within the current working directory. Therefore, change to the appropriate directory before issuing the commands. auto.process can be run from any folder, while all other commands will need to be run from an existing processing directory.

The variables accepted by each command can be listed by adding -h or --help, and are summarized below.

Command-line inputs

auto.process
    Function: Run a complete processing. When run in an existing processing folder, will attempt to resume processing from an existing checkpoint.
    Required variables: image_file (e.g. lysozyme_0001.cbf, lysozyme_000001.h5)
    Common examples:
        auto.process <image_file>
            Processes a native dataset.
        auto.process <image_file_1> <image_file_2> ...
            Batch processes more than one dataset.
        auto.process -m <image_file_1> <image_file_2> ...
            Merges data from more than one dataset.
        auto.process -a <image_file>
            Processes an anomalous dataset without merging Friedel pairs.
        auto.process -g <###> <image_file>
            Forces processing to proceed using the specified space group number.
        auto.process -min-sigma <value> -max-sigma <value> <image_file>
            Integrates data with the minimum/maximum I/σ specified.

auto.init
    Function: Generate essential processing scripts (XDS.INP) and directories for editing and manual processing.
    Required variables: image_file
    Common examples:
        auto.init <image_file>
        auto.init -a <image_file>
        auto.init -f <frame_range> <image_file>

auto.integrate
    Function: Repeat a previous processing from the integration step onward.
    Required variables: must be run within an existing processing directory
    Common examples:
        auto.integrate -f <1, 4-10, 22>
            Processes using only the selected data frames.
        auto.integrate -g <###>
            Forces processing with the specified space group.

auto.scale
    Function: Repeat processing from the scaling step onward.
    Required variables: must be run within an existing processing directory
    Common examples:
        auto.scale -r <###>
            Truncates high-resolution data to the specified value.

auto.spots
    Function: Repeat processing from the spot-finding step onward.
    Required variables: must be run within an existing processing directory
    Common examples:
        auto.spots -c <X_val Y_val>
            Overrides existing beam centre data.
        auto.spots -f <1, 4-10, 22>
            Processes using only the selected data frames.

auto.strategy
    Function: Repeat previous processing from the strategy determination step onward.
    Required variables: must be run within an existing processing directory
    Common examples:
        auto.strategy

auto.symmetry
    Function: Repeat previous processing from the unit cell determination step onward.
    Required variables: must be run within an existing processing directory
    Common examples:
        auto.symmetry -g <###>
            Forces processing using the specified space group.
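
The frame selections passed to -f (for example 1, 4-10, 22) mix single frames and inclusive ranges. As an illustration of what such a selection means, the following Python sketch expands one into explicit frame numbers. The function name expand_frames is hypothetical, and the sketch assumes ranges are inclusive and comma-separated; it is not taken from MXproc, which may parse the option differently.

```python
def expand_frames(selection):
    """Expand a selection such as "1, 4-10, 22" into sorted frame numbers.

    Assumption: "a-b" denotes the inclusive range a..b.
    """
    frames = set()
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"descending range: {part!r}")
            frames.update(range(first, last + 1))
        else:
            frames.add(int(part))
    return sorted(frames)


if __name__ == "__main__":
    print(expand_frames("1, 4-10, 22"))
```

Expanding the selection before submitting a job is a quick way to check how many frames will actually be used.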

Multiple arguments can be combined in a single command to allow for customized data processing. Repeat calls to auto.process and auto.init will, by default, create a new subdirectory for each new call. All other commands will overwrite existing data from the selected step.
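
When many datasets are processed with the same options, it can help to assemble the combined command in a script. The Python sketch below builds an auto.process argument list using only the -m, -a and -g options listed above. The function name and its keyword arguments are hypothetical, and whether every combination of options is accepted should be confirmed with auto.process --help.

```python
def build_auto_process_command(image_files, merge=False, anomalous=False,
                               space_group=None):
    """Return an argument list for auto.process (illustrative helper only)."""
    image_files = [str(f) for f in image_files]
    if not image_files:
        raise ValueError("at least one image file is required")
    if merge and len(image_files) < 2:
        raise ValueError("merging needs more than one dataset")

    command = ["auto.process"]
    if merge:
        command.append("-m")                     # merge multiple datasets
    if anomalous:
        command.append("-a")                     # keep Friedel pairs separate
    if space_group is not None:
        command += ["-g", str(space_group)]      # force space group number
    command.extend(image_files)                  # options first, then images
    return command


if __name__ == "__main__":
    print(" ".join(build_auto_process_command(["lysozyme_0001.cbf"],
                                              anomalous=True, space_group=92)))
```

On a Data Processing terminal, the resulting list could be passed to subprocess.run(command, cwd=target_directory, check=True), where target_directory is the folder in which the new output subdirectory should be created.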

Output

By default, MXproc creates a new working directory that contains the XDS input and output files. The directory name depends on the requested mode of operation. Directories created by automatic processing within MxDC are named "proc-native" (native data processing), "proc-screen" (crystal screening/strategy), "proc-mad" (MAD data processing), "proc-merge" (merging multiple datasets), or "proc-anom" (anomalous data processing). Output from user-defined auto.process calls is saved in an "xds-1" folder (or under any user-defined name given with the -d flag).
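
For scripts that sort through many processing folders, the naming scheme above can be written as a small lookup table. This Python sketch is illustrative only: the function name is hypothetical, and any name not in the table is simply reported as a command-line or user-defined run.

```python
from pathlib import Path

# Directory names used by automatic processing within MxDC, as listed above.
MXDC_DIRECTORIES = {
    "proc-native": "native data processing",
    "proc-screen": "crystal screening/strategy",
    "proc-mad": "MAD data processing",
    "proc-merge": "merging multiple datasets",
    "proc-anom": "anomalous data processing",
}

OTHER = "command-line run or user-defined name"


def describe_processing_dir(path):
    """Return the processing mode implied by a directory name."""
    return MXDC_DIRECTORIES.get(Path(path).name, OTHER)


if __name__ == "__main__":
    for name in ("proc-native", "proc-mad", "xds-1"):
        print(name, "->", describe_processing_dir(name))
```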

For multi-dataset processing such as MAD or merging, sub-directories for the individual datasets are created within the working directory described above. The output .mtz, .shelx and .cns reflection files can be found in the top-level working directory. This directory also contains report files in HTML (report.html) and text (report.txt) formats that summarize the results. The HTML file can be viewed in a browser. Text files and the various .log files can be viewed using your favourite text editor or viewer, such as "gedit" or "more".
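
To check quickly which of these outputs a run produced, the top-level working directory can be scanned by file suffix. The Python sketch below does this with pathlib. The function name collect_outputs is hypothetical; the suffixes and report names are those given above, and sub-directories are deliberately not searched.

```python
from pathlib import Path

REFLECTION_SUFFIXES = {".mtz", ".shelx", ".cns"}
REPORT_NAMES = ("report.html", "report.txt")


def collect_outputs(working_dir):
    """List reflection files and reports in the top-level working directory."""
    root = Path(working_dir)
    reflections = sorted(
        p.name for p in root.iterdir()
        if p.is_file() and p.suffix in REFLECTION_SUFFIXES
    )
    reports = [name for name in REPORT_NAMES if (root / name).is_file()]
    return {"reflections": reflections, "reports": reports}


if __name__ == "__main__":
    import sys
    # Scan the directory given on the command line, or the current one.
    print(collect_outputs(sys.argv[1] if len(sys.argv) > 1 else "."))
```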

To view the browser-compatible report, type:

firefox report.html

Besides the reflection files and reports, XDS creates numerous other files, including:

  • image files (.cbf)
  • XDS input files (.INP)
  • XDS output files (.LP)

Viewing .cbf and .h5 Files

You may view the .cbf files with imgview or xds-viewer. For example, to view spots on the last frame processed, type

imgview FRAME.cbf     OR     xds-viewer FRAME.cbf

HDF5 formatted .h5 files contain a collection of images with header data specified in a "<mydata>_master.h5" file. These files can also be opened using imgview.

imgview <mydata>_master.h5

HDF5 images can also be viewed using adxv: select an .h5 data file and check +slabs to cycle through the images with the forward/back arrows.

Using XDS Input Files

There may be instances when you want to run XDS manually. You can view a full list of XDS input parameters on the XDS homepage.

auto.inputs  <image_file>
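
Before editing an XDS.INP file by hand, it can be useful to read its current settings programmatically. The Python sketch below parses the KEYWORD= value syntax, treating everything after "!" as a comment and allowing several keywords on one line. The function name parse_xds_inp is hypothetical, and the sketch covers only this basic syntax; the XDS documentation remains the reference for the format.

```python
import re

# A keyword is upper-case text (may include ' ( ) / - _) ending with "=".
KEYWORD = re.compile(r"([A-Z][A-Z0-9_'()/\-]*)=")


def parse_xds_inp(text):
    """Return {keyword: [values]}; a list is used because keywords may repeat."""
    params = {}
    for raw_line in text.splitlines():
        line = raw_line.split("!", 1)[0]          # drop comments
        matches = list(KEYWORD.finditer(line))
        for i, match in enumerate(matches):
            # A value runs up to the next keyword on the line, or to its end.
            end = matches[i + 1].start() if i + 1 < len(matches) else len(line)
            value = line[match.end():end].strip()
            params.setdefault(match.group(1), []).append(value)
    return params


if __name__ == "__main__":
    with open("XDS.INP") as handle:
        for keyword, values in parse_xds_inp(handle.read()).items():
            print(keyword, values)
```

Values are kept as strings so that nothing is reinterpreted; a caller can convert, for example, the six numbers of UNIT_CELL_CONSTANTS to floats as needed.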

  • An example of an input file appears below:
!- XDS.INP ----------- Generated by MX Process
JOB= CORRECT
!------------------- Dataset parameters
X-RAY_WAVELENGTH= 0.95371
DETECTOR_DISTANCE= 168.6
STARTING_ANGLE= 0.0
STARTING_FRAME= 1
OSCILLATION_RANGE= 0.20
FRIEDEL'S_LAW= TRUE
NAME_TEMPLATE_OF_DATA_FRAMES=./Lysozyme/Lysozyme_1/data/Lysozyme_1_??????.h5 GENERIC
DATA_RANGE= 1 1801
SPOT_RANGE= 1 1801
LIB= /cmcf_apps/xds/dectris-neggia.so
SPACE_GROUP_NUMBER= 92
UNIT_CELL_CONSTANTS= 78.600 78.700 37.100 90.000 90.000 90.000
REIDX= 0 -1 0 0 0 0 -1 0 1 0 0 0
!----------------- Beamline parameters
DETECTOR= EIGER
NX=3110 NY= 3269
QX=0.07500 QY=0.07500
ORGX= 1544 ORGY= 1577
SENSOR_THICKNESS= 0.450
OVERLOAD= 25378
STRONG_PIXEL= 1.0 ! NOTE: SPOT.XDS managed externally
TRUSTED_REGION=0.00 1.2
TEST_RESOLUTION_RANGE= 50.0 1.00
TOTAL_SPINDLE_ROTATION_RANGES= 90 180 30
STARTING_ANGLES_OF_SPINDLE_ROTATION= 0 180 15
VALUE_RANGE_FOR_TRUSTED_DETECTOR_PIXELS= 6000 30000
INCLUDE_RESOLUTION_RANGE=50.0 0.00
FRACTION_OF_POLARIZATION=0.99
POLARIZATION_PLANE_NORMAL= 0.0 1.0 0.0
ROTATION_AXIS= 1.000 0.000 0.000
INCIDENT_BEAM_DIRECTION= 0.000 0.000 1.000
DIRECTION_OF_DETECTOR_X-AXIS= 1.000 0.000 0.000
DIRECTION_OF_DETECTOR_Y-AXIS= 0.000 1.000 -0.000
!----------------- Extra parameters
NUMBER_OF_PROFILE_GRID_POINTS_ALONG_ALPHA/BETA= 13
SEPMIN= 3.0
CLUSTER_RADIUS= 1.5
MINIMUM_NUMBER_OF_PIXELS_IN_A_SPOT= 6
REFINE(IDXREF)= CELL BEAM ORIENTATION AXIS
REFINE(INTEGRATE)= POSITION BEAM ORIENTATIO