Semi-empirical Calculations with xTB
This HQS Molecules
module contains a Python interface to perform calculations with the xTB and
CREST software packages.
The xTB (extended tight binding) program package is developed by the research group of Prof. Stefan Grimme at Universität Bonn. It contains implementations of the semi-empirical quantum-mechanical methods GFNn-xTB (n = 0–2) as well as of the GFN-FF force field ("GFN" stands for Geometries, Frequencies, and Non-covalent interactions, for which it has been parametrized). HQS Molecules
contains wrapper functions to perform single-point energy calculations, geometry optimizations, and frequency calculations.
In this section the user will find further details on the associated functions:
- Single-point energy calculations,
- Structure optimizations,
- Gradient calculations,
- Frequency calculations,
- Stable optimizations including Hessian calculations,
- Biased Hessian, and
- Conformer/Rotamer search.
Single-point energy calculations
The energy of any meaningful set of atomic coordinates with a specific charge and multiplicity
can be calculated using the single_point
function. Calculations are performed with the GFN2-xTB method by default, or alternatively with the GFN-FF method.
Input
The first mandatory argument of the function is a Molecule
object, as the charge and multiplicity need to be known to perform a calculation.
Note: the functions for reading an XYZ file return MolecularGeometry
objects, which contain neither charge nor spin information. A MolecularGeometry
objects can be easily converted to a Molecule
object by providing information on the charge and, optionally, on the multiplicity, as described in the XYZ File Handling chapter. On the other hand, the 2D-to-3D structure conversion functions return Molecule
objects with charge but without spin information.
Keyword arguments
It is possible to further refine the calculation with the following keyword arguments:
force_field
: Use GFN-FF force-field (defaults toFalse
, in which case the GFN2-xTB method is used).solvent
: By default, calculations are performed in vacuum. If the name of a solvent is provided, the implicit solvation model ALPB (analytical linearized Poisson-Boltzmann) will be used in the calculation. The available solvents are listed here.create_tmpdir
: Run the calculation in a temporary folder that is deleted with completion of the calculation (the default is set toTrue
).
Output
The single_point
function returns a float with the single-point energy in Hartree.
Example
An example is provided here:
from hqs_molecules import single_point
energy = single_point(molecule, solvent="Water", force_field=False)
Structure optimization
xTB can vary the position of atoms in the system to find the equilibrium structure with the lowest energy.
This functionality is embedded into HQS Molecules
in the optimization
function.
Extra arguments
The function works similarly to the single_point
function but it has two extra arguments:
-
opt_level
with the possible optimization levels as defined by xTB. In this wrapper, the levels are defined in theXTBOptLevel
class. By default, the very tight criterion is employed (XTBOptLevel.VTight
), whereas the xTB software itself usesXTBOptLevel.Normal
. The options are:- Crude
- Sloppy
- Loose
- Lax
- Normal
- Tight
- VTight
- Extreme
-
max_cycles
is the maximum number of optimization iterations. By default, it is set toNone
, which means thatmax_cycles
is automatically determined by the xTB software depending on the size of the molecule (check above link).
Output: Trajectory
Calls to optimization
function return a Trajectory
object containing the optimization trajectory (structures
) as a list of Molecule
objects and the energies (in Hartree) for each step (energies
) as a list of floats.
Most of the time, the quantities of interest are the optimized geometry and its energy. To access these, a Trajectory
object contains two types of attributes: Trajectory.last
and Trajectory.last_energy
contain the last geometry and its energy, respectively; Trajectory.lowest_energy
and Trajectory.lowest
contain the lowest energy and the corresponding geometry. Trajectory.lowest_step
returns the index of the energetically lowest geometry. In many cases, the last step may also be the one with the lowest energy.
The trajectory can be written to an XYZ file using the write
(or write_with_info
) functions of the xyz
module.
Example
The following box illustrates some general ideas behind executing a geometry optimization, starting from a Molecule
object.
from hqs_molecules import optimization, xyz
# perform optimization
opt_data = optimization(molecule, solvent="water")
# Write the trajectory into an XYZ file
xyz.write(file="trajectory.xyz", mol=opt_data.structures)
# Optimized structure: here it is assumed to be in the last step.
opt_molecule = opt_data.last # Molecule object
opt_energy = opt_data.last_energy # float
Gradient calculation
The nuclear gradient (first derivative of the energy with respect to the atomic coordinates) is calculated at each step of an optimization procedure. A necessary condition for an optimization to converge is typically to reduce the (norm of the) gradient to almost zero.
A single-point gradient can be calculated using the gradient
function with the same arguments as a single_point
procedure. It returns a nested list of floating point numbers, representing an N × 3 matrix (where N is the number of atoms in the molecule). Each value in the list corresponds to the derivative of an atomic position with respect to the x, y or z coordinates, represented in units of Hartree/Bohr.
Frequency calculations
The Hessian matrix (a matrix of second derivatives of the energy with respect to the atomic coordinates) is used to determine the nature of a stationary point, vibrational frequencies, molecular vibrational spectra, and thermochemical properties. In xTB, the Hessian matrix is calculated numerically.
Extra argument
HQS molecules
provides the frequencies
wrapper function to calculate vibrational frequencies and thermochemical corrections; note that it does not return the Hessian matrix itself. Besides the arguments already mentioned in the single_point
wrapper, this function also accepts a temperature
argument, which allows the user to define one or multiple temperatures for the calculation of thermochemical corrections. By default, the calculation will be performed at 298.15 K, but other temperature or a list of temperatures can also be specified.
Stationary point
If frequency calculations are meant to be performed for a stationary point (an optimized equilibrium structure or a saddle point), it is important to perform the calculations with the same method as the optimization itself; the potential energy surfaces of different methods have differing stationary points.
Frequency calculations are often used to ascertain if a stationary point is a minimum or a saddle point. A molecule of N atoms has 3N degrees of freedom: three translational degrees, three rotational degrees (two for linear molecules), and 3N − 6 vibrational degrees (or 3N − 5 for linear molecules). For a minimum, those frequencies need to be positive; for a transition state, exactly one of them needs to be imaginary.
Output
The return value is a MolecularFrequencies
object, which contains three important fields:
total_energy
: the total electronic energy of the system (in Hartree).frequencies
: a list with the harmonic frequencies (in cm−1). This list contains 3N elements, where 6 values should be zero (or 5 for linear molecules). By convention, imaginary frequencies are represented as negative numbers.thermochem
: a list of objects with temperature-dependent thermodynamic properties (BasicThermochemistry
). Each of theseBasicThermochemistry
objects contains thetemperature
(in K), the totalenthalpy
and the totalgibbs_energy
(both in Hartree), and theentropy
(in Hartree / K). Note that the electronic energy is included in the enthalpy and in the Gibbs energy; however, the solvation contribution currently is not.
Example
from hqs_molecules import frequencies
thermo_data = frequencies(opt_geom, temperature=[100, 125], solvent="water")
print("Single point energy:", thermo_data.total_energy)
print("Frequencies:", thermo_data.frequencies)
for res in thermo_data.thermochem:
print(
f"At {res.temperature} K: "
f"the enthalpy is {res.enthalpy}, "
f"the entropy is {res.entropy}, and "
f"the Gibbs energy is {res.gibbs_energy} (in Hartree)"
)
Stable optimization with Hessian calculations
With a simple geometry optimization, there is no guarantee of obtaining a minimum on the potential energy surface; a converged structure can also be a saddle point. Vibrational frequencies need to be calculated for a full characterization. Only if all of them are non-negative, a minimum has been found.
If imaginary frequencies are found, a displacement along the associated normal mode is a good starting point for a new optimization. In difficult cases, optimizations followed by frequency calculations may need to be repeated multiple times in order to find a genuine local minimum.
Such a composite calculation is provided by HQS Molecules
with the opt_with_hess
function. It repeats the Hessian and optimization calculations until no imaginary frequencies are found. If the maximum number of restarts (indicated by the max_restarts
keyword, 4 by default) is reached without converging to a minimum, an exception is raised by default.
The latter behavior can be altered by setting fail_with_saddle
to False
, in which case the non-converged trajectory is returned together
with the Hessian result from the lowest-energy structure. The latter can be obtained from the lowest
property
and can be used for a successive calculation.
Input
Since the function combines optimizations with frequencies, it requires the same arguments as the optimization
function, with an option to provide a temperature
argument (for the thermodynamic properties) and max_restarts
.
Output
The output of optimization with Hessian calculation is a tuple with a Trajectory
object (as for an optimization) and a
MolecularFrequencies
object (as for a frequency calculation). The trajectory concatenates the geometries and energies of all optimizations performed within the procedure. On the other hand, the MolecularFrequencies
object contains information for the converged geometry, determined by the overall lowest energy.
If fail_with_saddle
is set to False
, the analogous results from the possibly non-converged geometry may be returned.
Example
A simple call to this procedure can be made as exemplified below:
from hqs_molecules import opt_with_hess, xyz
molecule = xyz.read("input_geometry.xyz").to_molecule(charge=0, multiplicity=1)
trajectory, frequency_data = opt_with_hess(molecule)
Biased Hessian
Hessian calculations for vibrational frequencies or thermochemical contributions are normally performed at stationary points of the potential energy surface. Therefore, the geometry optimization and the Hessian calculation need to be performed with the same method.
However, Spicher and Grimme proposed a method ("single-point Hessian") to calculate the Hessian for stationary points that were optimized with a different method. Therefore, the geometry can be optimized with a more expensive method, while xTB is used to obtain a quantity resembling the Hessian. In a nutshell, it employs the following steps:
- The input structure is re-optimized under application of a biasing potential, which constrains the atoms to remain close to their original positions.
- The Hessian is calculated at the new, slightly distorted geometry under the application of the biasing potential.
- Finally, the frequencies are re-scaled to reduce the impact of the biasing potential.
HQS Molecules
provides the function biased_hessian
to interface the described method in the xTB package. Its usage is analogous to the function frequencies
described above.