XYZ File Handling
HQS Molecules provides functionality for reading and writing XYZ files, a common format for representing molecular structures. Support for this format ensures compatibility with a wide range of computational chemistry tools and facilitates the exchange of molecular data between software packages.
Reading and Writing XYZ Files
XYZ files are read using the function xyz.read. It can be called with a Path object or a string representing a file path; alternatively, it may be given a file-like object.
>>> from hqs_molecules import xyz
>>> # providing a file name
>>> molgeom = xyz.read("molecule.xyz")
>>> # providing a file-like object
>>> with open("molecule.xyz", "r") as f:
...     molgeom = xyz.read(f)
...
XYZ files contain atomic positions and chemical element symbols, but no information on charge or spin multiplicity. Therefore, the function returns a MolecularGeometry object with element symbols and atomic positions. It can be converted to a Molecule object by providing the charge and, optionally, the multiplicity:
>>> mol = molgeom.to_molecule(charge=0)
>>> mol = molgeom.to_molecule(charge=0, multiplicity=1)
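To see why charge and multiplicity have to be supplied separately, it helps to look at the layout of an XYZ file: one line with the number of atoms, one comment line, then one line per atom with an element symbol and three Cartesian coordinates (conventionally in ångström). The following is a minimal sketch of that layout using only the standard library. The function parse_xyz_text and the approximate water geometry are illustrative assumptions; they are not part of HQS Molecules and do not reflect how xyz.read is implemented.

```python
# Illustrative stand-in for explaining the file layout; not part of hqs_molecules.
def parse_xyz_text(text):
    """Parse a single XYZ geometry into (symbols, positions, comment)."""
    lines = text.strip("\n").splitlines()
    n_atoms = int(lines[0])   # line 1: number of atoms
    comment = lines[1]        # line 2: free-form comment
    symbols, positions = [], []
    # remaining lines: element symbol followed by x, y, z coordinates
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        symbols.append(symbol)
        positions.append((float(x), float(y), float(z)))
    if len(symbols) != n_atoms:
        raise ValueError("fewer atom lines than declared in the header")
    return symbols, positions, comment


# Approximate water geometry, used purely as sample input.
WATER_XYZ = """3
water molecule
O   0.000000   0.000000   0.117300
H   0.000000   0.757200  -0.469200
H   0.000000  -0.757200  -0.469200
"""

symbols, positions, comment = parse_xyz_text(WATER_XYZ)
print(symbols)
print(comment)
```

Nothing in this layout records charge or spin multiplicity, which is why a geometry read from such a file needs that information added before it describes a complete molecule.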
XYZ files can be written to disk using the function xyz.write by providing either a MolecularGeometry or a Molecule object (the latter is a subclass of the former). The function can either create a file at the specified path or write to a file-like object.
Converting Strings with XYZ Information
In some cases, the content of an XYZ file may be present in a string, or may need to be converted to a string. Such cases can be handled generally using the aforementioned functions xyz.read and xyz.write operating on an io.StringIO object (instead of a Path object or a string). However, the HQS Molecules module also provides the convenience functions xyz.parse_string and xyz.get_string, which read and format strings with XYZ content directly:
>>> # create a string from a MolecularGeometry or Molecule object
>>> s = xyz.get_string(mol)
>>> # create a MolecularGeometry object from a string with XYZ content
>>> molgeom = xyz.parse_string(s)
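The io.StringIO route works because an in-memory buffer offers the same read and write methods as an open text file. The sketch below illustrates the idea with a hypothetical stand-in writer, write_xyz, which is not part of HQS Molecules; the number formatting is an arbitrary choice made for the example.

```python
import io


# Hypothetical stand-in writer; not part of hqs_molecules.
def write_xyz(stream, symbols, positions, comment=""):
    """Write one XYZ geometry to any object that has a write() method."""
    stream.write(f"{len(symbols)}\n")
    stream.write(f"{comment}\n")
    for symbol, (x, y, z) in zip(symbols, positions):
        stream.write(f"{symbol:<2s} {x:14.8f} {y:14.8f} {z:14.8f}\n")


symbols = ["H", "H"]
positions = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.74)]

# Writing: the buffer takes the place of an open file.
buffer = io.StringIO()
write_xyz(buffer, symbols, positions, comment="hydrogen molecule")
xyz_text = buffer.getvalue()

# Reading: wrap the string in a new buffer and treat it like a file.
stream = io.StringIO(xyz_text)
n_atoms = int(stream.readline())
comment = stream.readline().rstrip("\n")
print(n_atoms, comment)
```

Any function that accepts a file-like object can be used this way, so a dedicated string function is a convenience rather than a necessity.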
Working with Comments in XYZ Files
The XYZ file format permits a single line with an arbitrary comment inside a molecular geometry definition. Some programs use this comment line to store specific information. When importing an XYZ file with HQS Molecules, the comment can be requested by setting the argument return_comment to True. This causes the function to return a tuple with the molecular geometry and the comment line.
>>> from hqs_molecules import xyz
>>> molgeom, comment = xyz.read("molecule.xyz", return_comment=True)
Likewise, a comment line can be exported to an XYZ file by providing it in the comment argument.
>>> xyz.write("output.xyz", molgeom, comment="comment line ...")
Information on charge and spin multiplicity can be written into the comment line of an XYZ file using the function xyz.write_with_info. Unlike the read and write functions, it requires a Molecule object, as the MolecularGeometry class lacks information on charge and multiplicity.
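The general idea of carrying charge and multiplicity in the comment line can be sketched as follows. The key=value convention and the helper names format_info_comment and parse_info_comment are assumptions made for this illustration only; the format actually written by xyz.write_with_info is not specified here and may differ.

```python
import re


# Hypothetical comment-line convention; NOT the format used by hqs_molecules.
def format_info_comment(charge, multiplicity):
    """Encode charge and spin multiplicity as a single comment line."""
    return f"charge={charge} multiplicity={multiplicity}"


def parse_info_comment(comment):
    """Return (charge, multiplicity), or None if the comment carries no such info."""
    match = re.search(r"charge=(-?\d+)\s+multiplicity=(\d+)", comment)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Round trip for a singly charged anion in a doublet state.
line = format_info_comment(charge=-1, multiplicity=2)
print(line)
print(parse_info_comment(line))
```

Because the comment line is free-form, such a convention is only meaningful to programs that agree on it; other tools will treat the line as an ordinary comment.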
XYZ Trajectory Files
Trajectories may be stored in XYZ files simply by concatenating XYZ geometries. Such files can be read using the function xyz.read_geometries. File paths (represented by a Path object or a string) as well as file-like objects are supported as its input argument. By default, the function returns a list of molecular geometries. Setting return_comments to True causes it to return a list of tuples instead, each tuple containing a geometry and its associated comment.
>>> # Returns a list of geometries.
>>> geom_list = xyz.read_geometries("trajectory.xyz")
>>> # Returns a list of tuples containing geometries and comments.
>>> tuple_list = xyz.read_geometries("trajectory.xyz", return_comments=True)
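Since a trajectory file is just a sequence of concatenated geometries, each frame can be located from its atom-count header. The sketch below shows this with a hypothetical standard-library generator, iter_xyz_frames; it is not part of HQS Molecules and makes no claim about how xyz.read_geometries works internally.

```python
import io


# Hypothetical helper for illustration; not part of hqs_molecules.
def iter_xyz_frames(stream):
    """Yield (symbols, positions, comment) for each geometry in an XYZ stream."""
    while True:
        header = stream.readline()
        if not header.strip():
            # Stop at the end of the input or at the first blank header line.
            return
        n_atoms = int(header)
        comment = stream.readline().rstrip("\n")
        symbols, positions = [], []
        for _ in range(n_atoms):
            symbol, x, y, z = stream.readline().split()[:4]
            symbols.append(symbol)
            positions.append((float(x), float(y), float(z)))
        yield symbols, positions, comment


# Two-frame sample trajectory of a stretching hydrogen molecule.
TRAJECTORY = (
    "2\nstep 0\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n"
    "2\nstep 1\nH 0.0 0.0 0.0\nH 0.0 0.0 0.80\n"
)

frames = list(iter_xyz_frames(io.StringIO(TRAJECTORY)))
for symbols, positions, comment in frames:
    print(comment, positions[1])
```

Each frame carries its own comment line, which is why the comments of a trajectory come back paired with their geometries.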
To write a trajectory file, the function xyz.write is used (as for individual geometries), but with a list of geometries instead of an individual geometry.
>>> xyz.write("output.xyz", geom_list)
If individual comments need to be provided for each geometry, this is best done by working with a file stream, as shown in the example below:
>>> with open("output.xyz", "w") as f:
...     for step, geometry in enumerate(geom_list):
...         xyz.write(f, geometry, comment=f"step number {step}")
...