sk_builder.utils

Generic utilities for sk_builder.

The utils sub-module of the sk_builder module provides helper functions for Slater-Koster tight-binding models.

TODO:

What to do with the hard-coded constants?


def generate_sample( dist: dict[str, float], size: int, rng: numpy.random._generator.Generator) -> tuple[numpy.ndarray, dict[int, str]]:

Generate sample for alloy.

Arguments:

dist (dict[str, float]): Dictionary representing distribution of various species.
size (int): Total number of atoms in sample.
rng (np.random.Generator): Random number generator to be used.

Raises:

ValueError: If negative probabilities are given.

Returns:

tuple[np.ndarray, dict[int, str]]: Tuple of random array and map from id to species name.
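
The following is a minimal sketch of how such a sample could be drawn, not the library's implementation. The name sample_species, the normalization of the weights, and the id assignment by dictionary order are assumptions made for illustration.

```python
import numpy as np


def sample_species(dist, size, rng):
    """Hypothetical stand-in for generate_sample (illustration only)."""
    names = list(dist)
    weights = np.array([dist[name] for name in names], dtype=float)
    if np.any(weights < 0):
        raise ValueError("Negative probabilities are not allowed.")
    # Assumption: weights are normalized so that they sum to one.
    probabilities = weights / weights.sum()
    # One integer id per atom, drawn according to the distribution.
    ids = rng.choice(len(names), size=size, p=probabilities)
    # Assumption: ids follow the insertion order of the dictionary.
    id_to_species = {i: name for i, name in enumerate(names)}
    return ids, id_to_species


rng = np.random.default_rng(0)
ids, id_to_species = sample_species({"Si": 0.7, "Ge": 0.3}, 10, rng)
```

Passing the generator explicitly, as the documented signature does, keeps the sample reproducible for a given seed.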

def is_positive_shift( translation: Union[tuple[int, ...], list[int]], size: Union[tuple[int, ...], list[int]]) -> bool:

Determines whether a translation corresponds to a positive index shift.

Can be used as normalization condition to make non-directed bonds unique.

NOTE: The result depends on the total system size, because the index shift is computed from the translation and the size along each axis.

NOTE: This does not consider boundary conditions, i.e., the index shift might be larger than total system size.

This function returns a flag indicating whether a bond should be flipped. The consequences of flipping a translation might differ depending on the type of bond corresponding to the given translation.

Arguments:

translation (tuple[int, ...]): Translation along each axis.
size (tuple[int, ...]): Size of simulation cell along each axis.

Raises:

ValueError: If the lengths of size and translation are incompatible.

Returns:

bool: Indicating whether translation is normalized.
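
The dependence on the system size can be illustrated with a short sketch. It assumes that cells are numbered in row-major order (last axis varies fastest) and that a zero shift counts as normalized; both are assumptions, and the names linear_index_shift and is_positive_shift_sketch are hypothetical.

```python
def linear_index_shift(translation, size):
    """Shift of the linear cell index caused by a translation (sketch)."""
    if len(translation) != len(size):
        raise ValueError("translation and size must have the same length.")
    shift, stride = 0, 1
    # Assumption: row-major ordering, the last axis varies fastest.
    for t, n in zip(reversed(translation), reversed(size)):
        shift += t * stride
        stride *= n
    return shift


def is_positive_shift_sketch(translation, size):
    # Assumption: a zero shift is treated as already normalized.
    return linear_index_shift(translation, size) >= 0


# The same translation can change sign when the system size changes.
small = is_positive_shift_sketch((0, -1, 3), (4, 4, 2))  # shift = 3 - 2 = 1
large = is_positive_shift_sketch((0, -1, 3), (4, 4, 4))  # shift = 3 - 4 = -1
```

As in the note above, no wrapping at the boundaries is applied in this sketch.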

def normalize_translation( translation: Union[tuple[int, ...], list[int]], size: Union[tuple[int, ...], list[int]]) -> tuple[int, ...]:

Normalizes translation.

Returns the translation as a tuple, flipped if it does not correspond to a positive index shift.

Arguments:

translation (Union[tuple[int, ...], list[int]]): Translation along each axis.
size (tuple[int, ...]): Size of simulation cell along each axis.

Returns:

tuple[int, ...]: Normalized translation along each axis.

def normalize_bond( bond: dict[str, typing.Any], system_size: tuple[int, ...]) -> dict[str, typing.Any]:

Normalize dictionary representing bond between atoms.

Normalization uses the system size to determine whether the index shift in the matrix representation of the bond is positive. If not, the bond is flipped.

NOTE: While the user is free to define bonds either from atom A to B or from atom B to A, internally we use a convention rendering the bond unique.

Arguments:

bond (dict[str, Any]): Dictionary representing bond.
system_size (tuple[int, ...]): Size of system along (three) lattice_vectors.

Raises:

TypeError: Raised if the 'bond' dictionary does not contain all of the following keys: 'id_from', 'id_to', 'translation'.

Returns:

dict[str, Any]: Dictionary representing normalized bond.
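
A rough sketch of what flipping a bond could look like is given below. It assumes that flipping swaps 'id_from' and 'id_to' and negates the translation, and that the sign test uses a row-major linear index shift. The helper names are hypothetical, and the library may handle additional cases (for example zero translations) differently.

```python
def _index_shift(translation, size):
    # Assumption: row-major linear index, the last axis varies fastest.
    shift, stride = 0, 1
    for t, n in zip(reversed(translation), reversed(size)):
        shift, stride = shift + t * stride, stride * n
    return shift


def normalize_bond_sketch(bond, system_size):
    """Hypothetical illustration of bond normalization."""
    required = ("id_from", "id_to", "translation")
    if any(key not in bond for key in required):
        raise TypeError(f"bond must contain the keys {required}.")
    normalized = dict(bond)  # copy, so the input stays untouched
    if _index_shift(bond["translation"], system_size) < 0:
        # Flip: describe the same bond from the other atom's perspective.
        normalized["id_from"], normalized["id_to"] = bond["id_to"], bond["id_from"]
        normalized["translation"] = tuple(-t for t in bond["translation"])
    return normalized


bond = {"id_from": 0, "id_to": 1, "translation": (0, -1, 0), "type": "sigma"}
flipped = normalize_bond_sketch(bond, (4, 4, 4))
```

Any additional keys (such as the made-up 'type' entry here) are simply carried over in this sketch.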

def parse_bonds( bonds_input: list[dict[str, typing.Any]], unique_translations: list[tuple[int, ...]], system_size: tuple[int, ...]) -> dict[tuple, list[tuple[int, ...]]]:

Parse list of "bond" dictionaries into map of translations to list of bonds.

Arguments:

bonds_input (list[dict[str, Any]]): List of dictionaries representing bonds.
unique_translations (list[tuple[int, ...]]): List of unique translations.
system_size (tuple[int, ...]): System size along each direction.

Returns:

dict[tuple, list[tuple[int, ...]]]: Map of unique translations to list of bond indices.
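
The grouping step can be sketched as follows. The sketch assumes that the bonds are already normalized, that each bond index is the pair (id_from, id_to), and that a bond with an unlisted translation is an error; none of this is confirmed by the documentation, and group_bonds_by_translation is a hypothetical name.

```python
def group_bonds_by_translation(bonds_input, unique_translations):
    """Hypothetical sketch: map each translation to its list of bonds."""
    grouped = {tuple(t): [] for t in unique_translations}
    for bond in bonds_input:
        key = tuple(bond["translation"])
        if key not in grouped:
            # Assumption: unknown translations are rejected.
            raise ValueError(f"Translation {key} is not in unique_translations.")
        grouped[key].append((bond["id_from"], bond["id_to"]))
    return grouped


bonds_input = [
    {"id_from": 0, "id_to": 1, "translation": (0, 0, 0)},
    {"id_from": 1, "id_to": 0, "translation": (1, 0, 0)},
    {"id_from": 0, "id_to": 0, "translation": (1, 0, 0)},
]
grouped = group_bonds_by_translation(bonds_input, [(0, 0, 0), (1, 0, 0)])
```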

def get_label_from_displacement(d: numpy.ndarray, precision: int = 4) -> str:

Turns array into string representation.

NOTE: should only be used for small float arrays.

Arguments:

d (np.ndarray): Numpy array
precision (int): Precision of string representation. Defaults to 4.

Returns:

str: Label for displacement.

def get_clusters_from_features( features: numpy.ndarray, min_n_clusters: int, max_n_clusters: int, verbose: bool = False) -> tuple[numpy.ndarray, numpy.ndarray]:

Compute labels and average values (centers) using k-means clustering.

NOTE: The silhouette score is used to determine the optimal number of clusters. The score is computed using a hard-coded number of samples.

NOTE: Using a hard-coded seed for reproducibility.

Arguments:

features (np.ndarray): Array of features (shape: (n_samples, n_features))
min_n_clusters (int): Minimal number of clusters.
max_n_clusters (int): Maximal number of clusters.
verbose (bool): Flag triggering output.

Returns:

tuple[np.ndarray, np.ndarray]: Tuple of two arrays:
np.ndarray: Array of labels for features (shape: (n_samples,)).
np.ndarray: Array of cluster features (shape: (n_clusters, n_features)).
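
The selection of the cluster count by silhouette score could look roughly like the sketch below, which uses scikit-learn. The seed, the sample size, and the function name cluster_features_sketch are assumptions; the hard-coded values used by the library are not documented here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def cluster_features_sketch(features, min_n_clusters, max_n_clusters,
                            seed=0, sample_size=1000):
    """Hypothetical sketch: k-means with silhouette-based model selection."""
    best = None
    # The silhouette score needs at least two clusters.
    for k in range(max(2, min_n_clusters), max_n_clusters + 1):
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(features)
        score = silhouette_score(
            features,
            model.labels_,
            sample_size=min(sample_size, len(features)),  # subsample for speed
            random_state=seed,
        )
        if best is None or score > best[0]:
            best = (score, model.labels_, model.cluster_centers_)
    _, labels, centers = best
    return labels, centers


# Three tight, well-separated blobs as made-up test data.
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
features = np.concatenate(
    [c + 0.1 * rng.standard_normal((20, 2)) for c in true_centers]
)
labels, centers = cluster_features_sketch(features, 2, 5)
```

Fixing the seed for both the clustering and the silhouette subsampling is what makes repeated runs return the same labels.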

def get_displacement_from_label(label: str) -> numpy.ndarray:

Turns string representation into array.

NOTE: should only be used for string representations of small float arrays.

Arguments:

label (str): String representation of array.

Returns:

np.ndarray: Numpy float array.
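
A round trip between a displacement and its label might look like the sketch below. The label format (fixed-point components joined by commas) is an assumption chosen for illustration and need not match the format used by get_label_from_displacement; both function names are hypothetical.

```python
import numpy as np


def label_from_displacement(d, precision=4):
    """Hypothetical label format: fixed-point components joined by commas."""
    rounded = np.round(np.asarray(d, dtype=float), precision)
    # Adding 0.0 maps -0.0 to 0.0, so both produce the same label.
    return ",".join(f"{x + 0.0:.{precision}f}" for x in rounded)


def displacement_from_label(label):
    """Inverse of label_from_displacement, up to the rounding precision."""
    return np.array([float(part) for part in label.split(",")])


d = np.array([0.5, -0.25, -1e-9])
label = label_from_displacement(d)
recovered = displacement_from_label(label)
```

The round trip is only exact up to the chosen precision, which is one reason such labels suit small float arrays only.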

def generate_positions( uc_pos: numpy.ndarray, lattice_vectors: numpy.ndarray, size: tuple) -> numpy.ndarray:

Generate position of atoms given unit cell positions, lattice vectors and system size.

Arguments:

uc_pos (np.ndarray): positions of the atoms in the unit cell.
lattice_vectors (np.ndarray): lattice vectors.
size (tuple): size in unit cells of the system.

Returns:

np.ndarray: Array holding position of all atoms.
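
As a sketch of the underlying idea, every atom of the unit cell is copied into each cell of the system and shifted by that cell's origin. The code below assumes Cartesian unit cell positions, lattice vectors stored as rows, and a cell-major ordering of the output; the actual conventions of generate_positions may differ.

```python
import numpy as np


def generate_positions_sketch(uc_pos, lattice_vectors, size):
    """Hypothetical sketch: replicate unit cell atoms over all cells."""
    # Integer cell indices, shape (n_cells, n_dim), last axis fastest.
    grids = np.meshgrid(*[np.arange(n) for n in size], indexing="ij")
    cells = np.stack(grids, axis=-1).reshape(-1, len(size))
    # Assumption: rows of lattice_vectors are the lattice vectors.
    origins = cells @ lattice_vectors
    # Assumption: atoms belonging to one cell are stored contiguously.
    positions = origins[:, None, :] + uc_pos[None, :, :]
    return positions.reshape(-1, uc_pos.shape[1])


uc_pos = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]])
lattice_vectors = 2.0 * np.eye(3)
positions = generate_positions_sketch(uc_pos, lattice_vectors, (2, 1, 1))
```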

def bonds_to_bond_matrices( bonds: dict[tuple, list[tuple]], n_atoms_in_unitcell: int) -> dict[tuple, scipy.sparse._coo.coo_array]:

Convert map of bonds into COO sparse arrays, one per translation.

Arguments:

bonds (dict[tuple, list[tuple]]): Map from translation to list of atom indices representing bonds.
n_atoms_in_unitcell (int): Number of atoms in unitcell.

Returns:

dict[tuple, SPM.coo_array]: Map from translation to COO sparse matrices representing bonds.
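
One plausible construction is sketched below using scipy.sparse.coo_array. It assumes that an entry of 1 at position (i, j) marks a bond from atom i to atom j within an n_atoms_in_unitcell by n_atoms_in_unitcell array; the stored values and the name bonds_to_matrices_sketch are assumptions.

```python
import numpy as np
from scipy.sparse import coo_array


def bonds_to_matrices_sketch(bonds, n_atoms_in_unitcell):
    """Hypothetical sketch: one COO array per translation."""
    shape = (n_atoms_in_unitcell, n_atoms_in_unitcell)
    matrices = {}
    for translation, pairs in bonds.items():
        rows = np.array([i for i, _ in pairs], dtype=int)
        cols = np.array([j for _, j in pairs], dtype=int)
        # Assumption: each bond is marked by a 1 at (id_from, id_to).
        data = np.ones(len(pairs), dtype=int)
        matrices[translation] = coo_array((data, (rows, cols)), shape=shape)
    return matrices


bonds = {(0, 0, 0): [(0, 1)], (1, 0, 0): [(1, 0), (0, 0)]}
matrices = bonds_to_matrices_sketch(bonds, 2)
dense = matrices[(1, 0, 0)].toarray()
```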

def print_lattice_vectors(lattice_vectors: numpy.ndarray, periodic_direction: list[bool]) -> None:

Print information about lattice vectors.

Arguments:

lattice_vectors (np.ndarray): Numpy array representing lattice vectors.
periodic_direction (list[bool]): List of boolean flags specifying whether system is periodic along direction represented by corresponding lattice vector.

def unpack_off_diagonal_contribution( off_diagonal_contribution: sk_builder.models.OffDiagonalContribution) -> tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, tuple[int, ...], numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray]:

Unpack OffDiagonalContribution model into "legacy" format.

Arguments:

off_diagonal_contribution (OffDiagonalContribution): Contribution to off-diagonal of Hamiltonian.

Returns:

tuple[ np.ndarray, np.ndarray, np.ndarray, tuple[int, ...], np.ndarray, np.ndarray, np.ndarray, np.ndarray ]: "Legacy" format of off-diagonal contribution.

def save_off_diagonal_contribution( numpy_container_dict: dict, off_diagonal_contribution: sk_builder.models.OffDiagonalContribution, base_name: str, cnt: int) -> dict:

Save OffDiagonalContribution model into numpy container dictionary.

Arguments:

numpy_container_dict (dict): Dictionary to save data into.
off_diagonal_contribution (OffDiagonalContribution): off-diagonal contribution.
base_name (str): Base name of keys storing the components.
cnt (int): Counter for off diagonal contributions.

Returns:

dict: numpy container dictionary updated with off-diagonal contribution.

def load_off_diagonal_contribution( numpy_container_dict: dict, base_name: str, cnt: int, linear_size_expansion: int) -> sk_builder.models.OffDiagonalContribution:

Load OffDiagonalContribution model from deserialized npz file.

Arguments:

numpy_container_dict (dict): Dictionary holding numpy arrays.
base_name (str): Base name of keys storing the components.
cnt (int): Counter for off diagonal contributions.
linear_size_expansion (int): linear size of system (number of atoms).

def check_state_shape( n_atoms: int, n_orbitals: int, batch_size: int, state_shape: tuple[int, int]) -> None:

Check consistency of state shape with n_atoms, n_orbitals and batch_size.

Arguments:

n_atoms (int): Number of atoms.
n_orbitals (int): Number of orbitals per atom.
batch_size (int): Number of states in batch.
state_shape (tuple[int, int]): Shape of state template.

Raises:

ValueError: If state shape does not match expected shape.
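
A check of this kind can be sketched in a few lines. The expected shape used here, (n_atoms * n_orbitals, batch_size), is an assumption for illustration; the layout expected by check_state_shape is not stated in the documentation.

```python
def check_state_shape_sketch(n_atoms, n_orbitals, batch_size, state_shape):
    """Hypothetical sketch of a state shape consistency check."""
    # Assumption: one row per orbital of every atom, one column per state.
    expected = (n_atoms * n_orbitals, batch_size)
    if tuple(state_shape) != expected:
        raise ValueError(
            f"Expected state shape {expected}, got {tuple(state_shape)}."
        )


check_state_shape_sketch(n_atoms=4, n_orbitals=2, batch_size=3, state_shape=(8, 3))
```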

def hash_input( sk_params: dict, structure: dict, precision: int, number_discretized_bonds: int) -> str:

Generate hash for input parameters.

Arguments:

sk_params (dict): Dictionary holding Slater-Koster parameters.
structure (dict): Dictionary holding structure information.
precision (int): Parameter controlling precision of bond length.
number_discretized_bonds (int): Parameter controlling maximum number of bonds.

Returns:

str: Hex-digest of input parameters.
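
One common way to obtain such a digest is to serialize the inputs deterministically and hash the result. The sketch below assumes JSON-serializable inputs and SHA-256; the serialization and hash algorithm used by hash_input are not documented, and the parameter values shown are made up.

```python
import hashlib
import json


def hash_input_sketch(sk_params, structure, precision, number_discretized_bonds):
    """Hypothetical sketch: deterministic hash of the input parameters."""
    payload = {
        "sk_params": sk_params,
        "structure": structure,
        "precision": precision,
        "number_discretized_bonds": number_discretized_bonds,
    }
    # sort_keys makes the result independent of dictionary insertion order.
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


digest_a = hash_input_sketch({"V_ss": -1.0, "V_sp": 0.5}, {"size": [2, 2, 2]}, 4, 10)
digest_b = hash_input_sketch({"V_sp": 0.5, "V_ss": -1.0}, {"size": [2, 2, 2]}, 4, 10)
digest_c = hash_input_sketch({"V_ss": -1.0, "V_sp": 0.5}, {"size": [2, 2, 2]}, 5, 10)
```

In this sketch the first two digests agree because only the key order differs, while changing precision yields a different digest.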

def sanitize_system_size(system_size: Union[list, tuple]) -> tuple[int, int, int]:

Sanitize system size to a valid 3D tuple.

Arguments:

system_size (Union[list[int], tuple[int, ...]]): Input system size.

Returns:

tuple[int, int, int]: Valid 3D tuple representing system size.
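
What counts as sanitizing is not spelled out above. The sketch below shows one possible interpretation, assuming that missing trailing dimensions are padded with 1 and that non-positive or excess entries are rejected; both rules and the ValueError are assumptions, not documented behavior.

```python
def sanitize_system_size_sketch(system_size):
    """Hypothetical sketch: coerce a system size to a 3D tuple of ints."""
    values = [int(n) for n in system_size]
    if not 1 <= len(values) <= 3:
        raise ValueError("system_size must have between one and three entries.")
    if any(n < 1 for n in values):
        raise ValueError("All entries of system_size must be positive.")
    # Assumption: missing dimensions are treated as a single cell.
    return tuple(values + [1] * (3 - len(values)))


size_2d = sanitize_system_size_sketch([4, 4])
size_3d = sanitize_system_size_sketch((2, 3, 5))
```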

def print_bond_matrix_analysis( bond_matrices: dict[tuple, scipy.sparse._coo.coo_array], positions: numpy.ndarray, lattice_vectors: numpy.ndarray) -> None:

Print bond matrix analysis.

Arguments:

bond_matrices (dict[tuple, SPM.coo_array]): Dictionary holding information about bonds.
positions (np.ndarray): Positions of atoms (in unit cell).
lattice_vectors (np.ndarray): Lattice vectors (of unit cell).

def normalize_bool(b: Any) -> bool:

Normalize input for boolean.

Arguments:

b (Any): input for boolean.

Returns:

bool: normalized boolean.

Raises:

ValueError: If input cannot be converted to boolean.
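
A normalization of this kind is sketched below. The set of accepted spellings ("true", "yes", "1" and their negative counterparts) is an assumption; the inputs actually accepted by normalize_bool are not listed in the documentation.

```python
def normalize_bool_sketch(b):
    """Hypothetical sketch: map common boolean spellings to bool."""
    if isinstance(b, bool):
        return b
    if isinstance(b, str):
        text = b.strip().lower()
        # Assumption: these spellings are accepted, case-insensitively.
        if text in {"true", "yes", "1"}:
            return True
        if text in {"false", "no", "0"}:
            return False
    elif isinstance(b, int) and b in (0, 1):
        return bool(b)
    raise ValueError(f"Cannot convert {b!r} to a boolean.")


flags = [normalize_bool_sketch(x) for x in (True, "Yes", "false", 0)]
```

Rejecting everything else explicitly, rather than relying on truthiness, avoids surprises such as the non-empty string "false" evaluating to True.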