Conformer/Rotamer Search

Many molecules, especially in organic chemistry, have a large degree of conformational freedom. At relevant temperatures, they are not represented by a single minimum on the potential energy surface, but rather by an equilibrium between rapidly interconverting minimum structures. Thus, comparing experimental measurements with theoretical data often requires consideration not only of a single structure, but of an entire thermodynamically accessible conformational ensemble.

CREST (Conformer–Rotamer Ensemble Sampling Tool) is a program used to perform conformer searches using the methods implemented in the xTB package. The wrapper function find_conformers provided by HQS Molecules interfaces CREST (which uses xTB as its engine) to search for low-energy conformers and rotamers of a given molecule. Afterwards, a post-processing is performed to refine the conformer ensemble.

Conformers and Rotamers

Conformers are distinct minima on the potential energy surface that are separated by sufficiently low barriers, such that they can interconvert at a given temperature. Often, this occurs through torsional rotations around single bonds.

In this context, rotamers are defined as interconverting structures that can be represented as permutations of the atoms in a conformer [1]. A simple example of this structures related by the rotation of a methyl group (–CH3) in an organic molecule. Somewhat less intuitively, the two chair structures of cyclohexane are also regarded as rotamers in this context, as their structures can be overlayed.

Input

The find_conformers wrapper function performs three successive steps:

  1. Conformer/rotamer ensemble search using CREST.
  2. Reassignment of the structural ensemble and completion of the rotamer sets.
  3. Calculation of vibrational frequencies and discarding structures corresponding to saddle points.

It employs the following arguments:

  • molecule: mandatory specification of the input molecule as a Molecule object.
  • solvent: optional specification of the solvent as a string. By default, the calculation is performed in vacuum.
  • create_tmpdir: set to True by default, so that the calculation is performed in a temporary directory that is deleted upon finishing the calculation.
  • force_field: it is set to True by default; set to False to perform the conformer search with GFN2-xTB instead.
  • freq_threshold: the frequency threshold, in cm−1 below which conformers are considered saddle points and thus discarded.
  • allow_mirror_rotamers: it is set to False by default, such that mirror images are considered to be different structures and not equivalent rotamers.
  • similarity_tolerances: the numerical tolerances for bond distances, bond angles, and dihedral angles for structures to be identified as equivalent.
  • do_rotamer_completion: it is set to True by default, such that it is attempted to complete the set of rotamers using permutational logic.

The second to last argument is of type SimilarityTolerances, which is a data class containing default values for the absolute and relative tolerances for bond distances, bond angles, and dihedral angles to be considered equivalent in the structure-matching procedure. Please note that bond and dihedral angles are compared via the absolute value of the complex exponential of the respective angle difference, and the tolerances refer to those values.

It is important to emphasize that the function uses GFN-FF by default, as conformer searches are very expensive; in contrast, the wrapper functions for the xTB package use GFN2-xTB by default.

Output

The output is an instance of the ConformerEnsemble class.

  • Important methods of the class provide access to flat lists of all conformer structures and energies (get_conformer_structures, get_conformer_energies),

  • as well as the number of rotamers per conformer (get_degeneracies).

  • Detailed information on conformers with their individual rotamers is contained in the items of the conformers list. Each element of this list is of the Conformer type.

    • Apart from having the structure
    • and energy attributes,
    • it also has a list of atom_mappings dictionaries, which represent the atom permutations of the reference structure to the different rotamers.

Note that the outcome of calculations with CREST may differ somewhat, even if otherwise identical calculations are repeated. Moreover, the starting geometry affects the conformer search. However, the conformer refinement procedure vastly reduces this uncertainty.

Algorithm

Different workflows have been implemented for CREST over the years. It is worth noting that find_conformers calls CREST with the static metadynamics simulations algorithm, instead of the computationally less demanding extensive metadynamic sampling utility that is the default when running CREST directly.

Note also that the grouping of conformer and rotamer structures as determined in the CREST calculation is ignored, and all the structures are regrouped by our own procedure as follows:

  • First, a plain list of structures is generated from all conformers and rotamers obtained from CREST.

  • Then, the structures are all compared among each other, grouping them by conformers purely based on their 3D structural parameters (bond lengths, bond angles, and dihedral angles).

  • If they are the same within a given numerical threshold, they are considered to be equivalent structures (rotamers), and the atomic mapping (or permutation) relative to the representative conformer is stored for each of the rotamers instead of all the geometries themselves.

  • Successively, the set of rotamer mappings is checked for completeness by multiplying all permutation matrices and verifying if the result is already present in the set of permutations. That way, a full set of rotamers is determined for each conformer as long as the initial set of structures is sufficiently representative.

  • As a last step of the procedure, structures corresponding to energetic saddle points are discarded.

Example

An example of a conformer/rotamer search, starting from a molecule name:

from hqs_molecules import PubChem, find_conformers, smiles_to_molecule

pc = PubChem.from_name("1-butene")
molecule = smiles_to_molecule(pc.smiles)
conf_data = find_conformers(molecule)

print(f"Found {len(conf_data)} distinct conformers.")  # or conf_data.num_conformers
print(f"Conformer 0 has {conf_data.get_degeneracies()[0]} rotamers.")