SIR92 Description

Description of the SIR92 Program

The main modules of the program are:

SIR92, LIST, DATA, NORMAL, SEMINVARIANTS, INVARIANTS, PHASE, FOURIER/LEAST-SQUARES, EXPORT, RESTART, PATTERSON.

SIR92 module

It interprets commands and calls the requested routines.

LIST module

This is the software interface between SIR92 and the direct access file on which data and results are stored.

DATA module

This routine reads the basic crystallographic information, such as cell parameters, space group symbol, unit cell contents and reflections. It includes a modified version of the subroutine SYMM by Burzlaff and Hountas 1982. Symmetry operators and the information necessary to identify structure seminvariants (estimated in the SEMINVARIANTS routine) are derived directly from the space group symbol.

Diffraction data are checked in order to identify equivalent reflections, systematically absent reflections (which are then excluded from the data set) and, where applicable, weak reflections not included in the data set (Cascarano, Giacovazzo and Guagliardi, 1991).
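The check for systematically absent and equivalent reflections can be illustrated with a minimal Python sketch. It is not the SIR92 code: the operator list (space group P2_1, unique axis b), the function names and the choice to treat Friedel mates as equivalent (i.e. no anomalous scattering) are assumptions made for the example. A reflection h is flagged as systematically absent when some operator (R, t) leaves h unchanged while h.t is not an integer.

```python
from fractions import Fraction

# Illustrative operators for P2_1 (unique axis b): x,y,z and -x,y+1/2,-z.
# Each entry is (rotation matrix, translation vector).
P21_OPS = [
    (((1, 0, 0), (0, 1, 0), (0, 0, 1)),
     (Fraction(0), Fraction(0), Fraction(0))),
    (((-1, 0, 0), (0, 1, 0), (0, 0, -1)),
     (Fraction(0), Fraction(1, 2), Fraction(0))),
]

def transform_index(hkl, rot):
    """Return the row vector h multiplied by the rotation matrix R."""
    return tuple(sum(hkl[i] * rot[i][j] for i in range(3)) for j in range(3))

def is_systematically_absent(hkl, ops):
    """True if an operator maps hkl onto itself with a non-integer h.t."""
    for rot, trans in ops:
        if transform_index(hkl, rot) == tuple(hkl):
            shift = sum(h * t for h, t in zip(hkl, trans))
            if shift.denominator != 1:
                return True
    return False

def canonical_index(hkl, ops):
    """Pick one representative among the symmetry equivalents of hkl.

    Friedel mates are treated as equivalent (assumption: no anomalous
    scattering), so equivalent reflections share the same representative.
    """
    images = []
    for rot, _ in ops:
        eq = transform_index(hkl, rot)
        images.append(eq)
        images.append(tuple(-x for x in eq))
    return max(images)
```

With these operators, 0k0 reflections with k odd are reported as absent, and (1,2,3) and (-1,2,-3) map to the same representative.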

NORMAL module

In this module diffraction intensities are normalized using the Wilson method. A statistical analysis of the intensities is made in order to suggest the presence or absence of an inversion centre and to identify the possible presence and type of pseudotranslational symmetry (Cascarano, Giacovazzo and Luic, 1988a,b; Fan, Yao and Qian, 1988). Possible deviations (of the displacive type) from ideal pseudotranslational symmetry are also detected. None of the above information is used as prior information in the next steps of SIR92 unless the directive PSEUDO is given to the program.
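As an illustration of Wilson normalization, the following Python sketch fits the usual Wilson plot, ln(<I>/<Sigma>) = ln(K) - 2*B*s^2 with s = sin(theta)/lambda, over resolution shells and then converts an intensity to a normalized |E|. It is a simplified sketch, not the SIR92 routine: the function names, the shell scheme and the assumption that the sum of squared scattering factors (sigma2) is supplied for each reflection are all choices made for the example.

```python
import math

def wilson_fit(s2, intensity, sigma2, n_shells=10):
    """Estimate the scale K and overall B from a Wilson plot.

    s2        : (sin(theta)/lambda)^2 for each reflection
    intensity : observed intensity for each reflection
    sigma2    : sum of squared scattering factors at that s2 (assumed given)
    """
    order = sorted(range(len(s2)), key=lambda i: s2[i])
    size = max(1, len(order) // n_shells)
    xs, ys = [], []
    for start in range(0, len(order), size):
        shell = order[start:start + size]
        mean_i = sum(intensity[i] for i in shell) / len(shell)
        mean_sig = sum(sigma2[i] for i in shell) / len(shell)
        if mean_i > 0:
            xs.append(sum(s2[i] for i in shell) / len(shell))
            ys.append(math.log(mean_i / mean_sig))
    # Straight-line least-squares fit y = intercept + slope * x.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return math.exp(intercept), -slope / 2.0  # K, B

def normalized_e(s2, intensity, sigma2, scale, b_iso, epsilon=1):
    """|E| for one reflection; epsilon is its statistical weight."""
    expected = scale * epsilon * sigma2 * math.exp(-2.0 * b_iso * s2)
    return math.sqrt(max(intensity, 0.0) / expected)
```

Negative intensities are clipped to zero here purely for simplicity; how weak or negative measurements are treated in practice is outside the scope of this sketch.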

When some additional prior information, besides positivity and atomicity of electron density, is available, then a suitable renormalization of structure factors is made.

SIR92 is able to deal with the following types of prior information:

a) pseudotranslational symmetry, identified by the normalization routine or from another source;

b) a well oriented and well positioned molecular fragment (Camalli, Giacovazzo and Spagna, 1985; Burla, Cascarano, Fares, Giacovazzo, Polidori and Spagna, 1989).

SEMINVARIANTS module

This routine is not used in the default run of SIR92. One-phase structure seminvariants (s.s.) are estimated by means of their first and second representations, as described by Giacovazzo 1978 and Cascarano et al. 1984. By default the second representation is calculated.

Two-phase s.s. are estimated by means of their first representation, as described by Giacovazzo et al. 1979 and by Burla, Giacovazzo and Polidori 1988.

The estimated s.s. are stored in the direct access file; those evaluated with highest reliability will be actively used in the phasing process while the others will contribute to compute, with other phase relationships, the figure of merit CPHASE.

INVARIANTS module

Up to 20000 triplets relating reflections with normalized E values greater than a given threshold (strong triplets) are stored for active use in the phasing process. Triplets relating two reflections with large E and one with E close to zero (psi-zero triplets) are also generated: they are actively used in the phasing process (Giacovazzo, 1993) and define a special figure of merit (PSCOMB). Special types of triplets (psi-E triplets), based on two strong reflections and one intermediate reflection (just below the threshold for strong reflections), are calculated and used in the FOURIER/LEAST-SQUARES module in order to extend phase information (Altomare, Cascarano, Giacovazzo and Viterbo, 1991).

Negative quartets are generated by combining the psi-zero triplets in pairs, and those with cross-magnitudes smaller than a given threshold are estimated by means of their first representation, as described by Giacovazzo 1976. These quartets are actively used in the phasing process (Giacovazzo, Burla and Cascarano, 1992) and provide an important contribution to the FOM CPHASE.

Active triplets may be estimated according to the p-3 distribution of Cochran 1955: the concentration parameter of the von Mises distribution is then

            C =  2 * E(h) * E(k) * E(h-k) / sqrt(N)     (1)
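Equation (1) can be evaluated directly. The short Python sketch below assumes that N is the number of atoms in the unit cell of an equal-atom structure, which the text does not state explicitly; the function name is illustrative only.

```python
import math

def cochran_concentration(e_h, e_k, e_hk, n_atoms):
    """Concentration parameter C of equation (1).

    e_h, e_k, e_hk : normalized magnitudes |E(h)|, |E(k)|, |E(h-k)|
    n_atoms        : N, taken here as the number of atoms in the unit cell
                     (assumption: equal-atom structure)
    """
    return 2.0 * e_h * e_k * e_hk / math.sqrt(n_atoms)
```

For example, |E| values of 2.0, 1.8 and 1.5 with N = 100 give C = 1.08. Larger C means that the triplet phase is more tightly concentrated around zero.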

Triplets can also be estimated according to their second representation (i.e. the p-10 formula, as described by Cascarano, Giacovazzo, Camalli, Spagna, Burla, Nunzi and Polidori, 1984). The concentration parameter of the new distribution, which is again of the von Mises type (i.e. of the same form as Cochran's), is given by

                   G = C (1 + q)                        (2)

where q is a function (positive or negative) of all the magnitudes in the second representation of the triplet. The G values are rescaled on the C values and the triplets are ranked in decreasing order of G. The top relationships represent a better selection of triplets with phase value close to zero than that obtained when ranking according to C. These triplets will be actively used in the phase determination process.

Triplets estimated with a negative G represent a sufficiently good selection of relationships close to 180 degrees to be used both actively in the phasing process (Giacovazzo, Burla and Cascarano, 1992) and for the calculation of a powerful FOM (CPHASE).

Triplets with G close to zero are expected to have values widely dispersed around 90 or 270 degrees and are used to compute an enantiomorph sensitive FOM. A similar FOM is also computed using quartets estimated with a very small concentration parameter.
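The three uses of G described above (large positive, negative, close to zero) can be sketched in Python as follows. This is only an illustration of the sorting logic: the value of q is taken as given, the rescaling of G on C is omitted because the text does not specify it, and the cut-off near_zero is a hypothetical parameter, not a SIR92 default.

```python
def classify_triplets(triplets, near_zero=0.2):
    """Compute G = C * (1 + q) and sort triplets into three groups.

    triplets  : list of (label, C, q)
    near_zero : hypothetical cut-off below which |G| counts as close to zero
    """
    scored = [(label, c * (1.0 + q)) for label, c, q in triplets]
    scored.sort(key=lambda item: item[1], reverse=True)
    groups = {"positive": [], "enantiomorph_sensitive": [], "negative": []}
    for label, g in scored:
        if abs(g) < near_zero:
            groups["enantiomorph_sensitive"].append((label, g))
        elif g > 0:
            groups["positive"].append((label, g))
        else:
            groups["negative"].append((label, g))
    # Most negative G first: these are the most reliable estimates near 180 degrees.
    groups["negative"].sort(key=lambda item: item[1])
    return groups
```

The "positive" list is ranked in decreasing order of G, so its top entries correspond to the triplets expected to lie closest to zero.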

By default, triplets are estimated according to the p-10 formula.

The parameter C of the Cochran distribution (say p-3) is suitably modified when prior information, such as that described above in the section "NORMAL module", is available. Then triplet phases are no longer expected to be around zero (see quoted references) and may lie anywhere between zero and two pi.

PHASE module

In the SIR92 program the most reliable one-phase s.s. are treated as known phases. In addition to triplets, the most reliable negative quartets and two-phase s.s. may be actively used. Each relationship is used with its proper weight: the concentration parameter of the first representation for quartets and two-phase s.s., and C or G for triplets.

Convergence/divergence procedure:
The convergence procedure (Germain, Main and Woolfson, 1970) is a convenient way of defining an optimum starting set of phases to be expanded by the tangent formula or by any other algorithm. When the p-10 formula is used (the default), a special convergence process is devised which chooses the starting set according to

       alpha = Sum ( G*D1(G)*D1(alpha k) ) * D1(alpha h-k)  (3)

as suggested by Giacovazzo 1979 and by Burla, Cascarano, Giacovazzo, Nunzi and Polidori 1987, with

                    D1(G) = I1(G) / I0(G)

I1 and I0 represent modified Bessel functions of order one and zero respectively. The summation in (3) is over all relationships defining the reflection h. If the p-3 formula is used the default choice is

                   alpha  = Sum ( C * D1(C) )              (4)
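The ratio D1 and the sum in equation (4) can be sketched in Python using the power series of the modified Bessel functions. The function names and the number of series terms are choices made for this example; the truncated series is adequate only for moderate arguments (roughly below 30), and a library routine would be preferable in production code.

```python
def bessel_i0(x, terms=60):
    """Modified Bessel function I0 from its power series."""
    q = x * x / 4.0
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total

def bessel_i1(x, terms=60):
    """Modified Bessel function I1 from its power series."""
    q = x * x / 4.0
    term = x / 2.0
    total = term
    for k in range(1, terms):
        term *= q / (k * (k + 1))
        total += term
    return total

def d1(x):
    """D1(x) = I1(x) / I0(x), the expected cosine for concentration x."""
    return bessel_i1(x) / bessel_i0(x)

def expected_alpha(concentrations):
    """Equation (4): sum of C * D1(C) over the relationships defining h."""
    return sum(c * d1(c) for c in concentrations)
```

D1 rises from 0 at C = 0 towards 1 for large C, so weak relationships contribute little to alpha even when they are numerous.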

Once the starting set has been defined, a good pathway for phase expansion is determined by a divergence procedure. In the divergence map, starting from the reflections in the starting set, each new reflection is linked to the preceding ones with the highest value of alpha.
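The divergence step can be pictured as a greedy expansion. The Python sketch below is a simplified illustration and not the SIR92 algorithm: each triplet carries a single precomputed weight (for instance C * D1(C)), a reflection's alpha is the sum of the weights of the triplets whose other two reflections are already in the path, and the data layout is hypothetical.

```python
def divergence_path(starting_set, triplets):
    """Order reflections for phase expansion, highest alpha first.

    starting_set : iterable of reflection labels with known phases
    triplets     : list of (refl_a, refl_b, refl_c, weight)
    Returns a list of (reflection, alpha) in the order they are added.
    """
    known = set(starting_set)
    candidates = {r for t in triplets for r in t[:3]} - known
    path = []
    while candidates:
        best, best_alpha = None, 0.0
        for r in sorted(candidates):
            # Only triplets whose two other reflections are already known count.
            alpha = sum(w for a, b, c, w in triplets
                        if r in (a, b, c) and {a, b, c} - {r} <= known)
            if alpha > best_alpha:
                best, best_alpha = r, alpha
        if best is None:
            break  # remaining reflections are not linked to the known set
        known.add(best)
        candidates.remove(best)
        path.append((best, best_alpha))
    return path
```

Reflections that are never linked to the growing set are simply left out of the path.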

Phase extension and refinement:

The starting set defined by the preceding step is usually formed by the origin (and enantiomorph) fixing reflections, a few one-phase s.s. and a number of other phases which may be obtained:

a) by magic integer permutation (White and Woolfson, 1975; Main, 1978),

b) by a random approach (Baggio, Woolfson, Declercq and Germain, 1978; Burla, Cascarano and Giacovazzo, 1992).

Option a) is the default; b) runs if the directive RANDOM is used. In the latter case a large number of trials (depending on the available computer time) can be requested.
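The two ways of generating trial phases can be contrasted with a small Python sketch. In the magic integer representation each phase (in cycles) is written as phi_i = m_i * x mod 1, so that a single variable x scans several phases at once; in the random approach every phase is drawn independently. The integers used in the usage note, the sampling of x and the function names are illustrative assumptions and do not reproduce the sequences or settings used by SIR92.

```python
import random

def magic_integer_phases(integers, n_steps):
    """Trial phase sets (in cycles, 0 <= phi < 1) from phi_i = m_i * x mod 1.

    integers : the magic integers m_i, one per unknown phase (illustrative)
    n_steps  : number of equally spaced samples of x in [0, 1)
    """
    trials = []
    for step in range(n_steps):
        x = step / n_steps
        trials.append([(m * x) % 1.0 for m in integers])
    return trials

def random_phases(n_phases, n_trials, seed=0):
    """Trial phase sets (in cycles) drawn uniformly at random."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_phases)] for _ in range(n_trials)]
```

For instance, magic_integer_phases((3, 4, 5), 4) samples three phases with only four values of x, whereas a grid over three independent phases at the same spacing would need 64 trials.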

If a partial structure is available (the directive PARTIAL should have been used in the normalization routine) the PHASE routine is automatically able to take that information into account. No further directives are strictly necessary. However, the directives SYMBOLS, SPECIALS and MAXTRIAL may be used to change default values.

Phase expansion and refinement are carried out by means of a tangent formula using triplets, negative quartets, psi-zero triplets and the most reliable two-phase structure seminvariants. In the weighting scheme the experimental distributions of the alpha parameters are forced to match the theoretical ones (Burla, Cascarano, Giacovazzo, Nunzi and Polidori, 1987).
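A single application of a weighted tangent formula to one reflection can be sketched in Python as below. Only triplet contributions are shown, and the weights are taken as given: the SIR92 weighting scheme and the handling of quartets, psi-zero triplets and seminvariants are not reproduced. The function name and data layout are assumptions made for the example.

```python
import math

def tangent_phase(contributors):
    """Estimate the phase of reflection h from its triplet contributors.

    contributors : list of (G, phi_k, phi_h_minus_k, weight), phases in radians
    Returns (phi_h, alpha_h), with phi_h in [0, 2*pi).
    """
    top = sum(w * g * math.sin(pk + pq) for g, pk, pq, w in contributors)
    bottom = sum(w * g * math.cos(pk + pq) for g, pk, pq, w in contributors)
    alpha = math.hypot(top, bottom)
    return math.atan2(top, bottom) % (2.0 * math.pi), alpha
```

A relationship with negative G pushes the estimate towards phi_k + phi_h_minus_k + pi, which is how triplets estimated near 180 degrees enter the same sum. The returned alpha measures how consistently the contributors agree.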

For each phase set several FOM's are computed using all invariants and seminvariants estimated by means of the representation method. Their meaning and an optimized way of combining all the computed FOM's to give a highly selective combined figure of merit (CFOM) are described in the papers by Cascarano, Giacovazzo and Viterbo 1987 and by Cascarano, Giacovazzo and Guagliardi 1992.

All FOM's, as well as the combined CFOM, are expected to be equal to 1.0 for correct solutions. CFOM larger than 0.5 can be considered encouraging.

If pseudotranslational symmetry is present then CFOM > 0.3 may characterize the correct solution.

FOURIER/LEAST-SQUARES module

The sets of phases generated by the tangent routine are first expanded through psi-E relationships and then passed to the fast Fourier transform routine written by L. F. Ten Eyck 1977 and subsequently modified by the MULTAN team (Main et al. 1980).

Several additional features have been introduced in the present version.

a) Special positions are handled: peaks very close to symmetry elements are moved onto them, the site symmetry is defined and the atomic occupancy factor is calculated; key numbers designating free, coupled or fixed positional parameters for the least-squares subroutines are also calculated, together with the symmetry conditions on the thermal ellipsoid.

b) The set of peaks provided by the peak search routine is automatically analyzed in order to provide sound molecular fragments. If (as is usually the case) the atomic species present in the unit cell are known, their atomic radii and (where applicable) their chemical coordination are used in order to automatically identify fragments and relate peaks to the atomic species (automatic labelling of peaks).
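The special-position handling of item a) can be illustrated with a minimal Python sketch. It is an assumption-laden simplification rather than the SIR92 routine: symmetry operators are passed as (rotation, translation) pairs in fractional coordinates, closeness is tested per coordinate in fractional units (a real program would compare distances in Angstrom), and the peak is snapped to the average of its coinciding images. The occupancy is taken as one over the number of operators that map the site onto itself.

```python
def snap_to_special_position(xyz, ops, tol=0.03):
    """Move a peak onto a nearby symmetry element and return its occupancy.

    xyz : fractional coordinates of the peak
    ops : list of (3x3 rotation, 3-vector translation); must include identity
    tol : hypothetical tolerance in fractional units
    Returns (snapped_xyz, occupancy).
    """
    images = []
    for rot, trans in ops:
        image = [sum(rot[i][j] * xyz[j] for j in range(3)) + trans[i]
                 for i in range(3)]
        # Apply the lattice translation that brings the image closest to xyz.
        shifted = [image[i] - round(image[i] - xyz[i]) for i in range(3)]
        if all(abs(shifted[i] - xyz[i]) <= tol for i in range(3)):
            images.append(shifted)
    n = len(images)
    snapped = [sum(img[i] for img in images) / n for i in range(3)]
    return snapped, 1.0 / n
```

In P-1, for example, a peak at (0.01, 0.49, 0.005) lies close to the inversion centre at (0, 1/2, 0): it is moved onto the centre and given occupancy 0.5, while a peak in a general position keeps its coordinates and occupancy 1.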

The above information is automatically processed via least-squares/Fourier cycles in order to complete the crystal structure, reject false peaks and refine structural parameters (Altomare, Cascarano, Giacovazzo and Guagliardi 1993). An isotropic diagonal matrix refinement is used which does not involve H atoms.
The final R values for a correct solution usually vary from 0.08 to 0.15.
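For reference, the conventional residual R = Sum ||Fo| - k|Fc|| / Sum |Fo| can be computed as in the Python sketch below. The scale factor k is taken here as Sum |Fo| / Sum |Fc|, which is one simple choice and an assumption of this example; the text does not say how SIR92 scales the calculated structure factors.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic residual R.

    f_obs, f_calc : observed and calculated structure factor magnitudes
    """
    scale = sum(f_obs) / sum(f_calc)  # simple overall scale (assumption)
    numerator = sum(abs(o - scale * c) for o, c in zip(f_obs, f_calc))
    return numerator / sum(f_obs)
```

On this scale, the range quoted above corresponds to an average disagreement of roughly 8 to 15 per cent between observed and calculated magnitudes.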

EXPORT/RESTART module

Atomic parameters produced by the preceding module, stored in the direct access file, can be exported to an ASCII file in a format suitable for other programs such as CRYSTALS, SHELXL93, MOLPLOT, MOLDRAW, SCHAKAL, etc.

If a graphic interface is available it is possible to delete or relabel some atoms and restart the FOURIER/LEAST-SQUARES procedure. If the graphic interface is not available, it is possible to re-run the program after having modified the atom list through the command RESTART.

PATTERSON module

In SIR92 it is possible to compute a PATTERSON map using various coefficients.
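A Patterson function is a cosine Fourier series whose coefficients depend only on magnitudes, so no phases are needed. The Python sketch below evaluates it at a single point by direct summation; the coefficient attached to each reflection (for example |F|^2) is supplied by the caller, the 1/V scale factor is omitted, and the function name is an assumption. SIR92 itself uses a fast Fourier transform routine for map calculation, so this is only meant to show the formula.

```python
import math

def patterson_value(reflections, u):
    """Patterson function at fractional vector u, on a relative scale.

    reflections : iterable of ((h, k, l), coefficient); the list is assumed
                  to contain both hkl and its Friedel mate, or equivalently
                  a full sphere of data
    u           : fractional coordinates (u, v, w)
    """
    return sum(
        coef * math.cos(2.0 * math.pi * (h * u[0] + k * u[1] + l * u[2]))
        for (h, k, l), coef in reflections
    )
```

For two atoms separated by a vector d, the map built from |F|^2 coefficients shows maxima at the origin and at u = d and u = -d.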