lineshape_tools.cli
Contains functionality for the command line interface.
Functions
| collect | Collect and process data for fine-tuning. |
| get_force_opt_modes | Bin phonons by energy and find a vector in the subspace that minimizes the spread in forces. |
| get_random_phonons | Bin phonons by energy and select one randomly from each bin. |
| gen_confs | Generate additional configurations to enhance fine-tuning dataset. |
| reestimate_e0s_linear_system | Estimate atomic reference energies (E0s) by solving a linear system. |
| gen_ft_config | Generate a configuration file for mace_run_train. |
| parse_force_constants_file | Parse a force constants file from phonopy. |
| convert_from_phonopy | Convert phonopy FORCE_CONSTANTS to dynmat.npz. |
| compute_dynmat | Calculate the dynamical matrix using MACE. |
| compute_lineshape | Calculate the spectral density/function and lineshape for a given dynamical matrix. |
| | Produce analysis plots of the dynamical matrix. |
Module Contents
- lineshape_tools.cli.collect(
- files: Annotated[list[pathlib.Path], Parameter(negative='')],
- output_file: pathlib.Path = Path('./database.extxyz'),
- strategy: str = 'none',
- read_index: str = ':',
- max_force: float = 2.0,
- min_force: float = -np.inf,
- dx_tol: float = 0.1,
- rtol: float = 1e-05,
- config_weight: float = 1.0,
- force_weighting: bool = False,
- allow_constraint: bool = False,
- )
Collect and process data for fine-tuning.
Collect files into an extxyz database that can be used for fine-tuning. An optional filtering strategy can be applied. This is potentially useful if relaxation data is included in the dataset, because such data can be noisy: it contains many closely spaced geometries near the equilibrium geometry. By default, configurations with excessively large forces are discarded to avoid potential anharmonic contributions to the potential energy surface (PES).
- Parameters:
files (list) – list of paths to files that are parseable by ase.io. The files should contain atomic geometries, total energies, and forces at a minimum (for example, vasprun.xml).
output_file (Path) – optional path where data is written (./database.extxyz by default)
strategy (str) – optional specification of strategy to be used for filtering. Available options are ‘none’, ‘qr’, or ‘dx’.
read_index (str) – pythonic index passed to ase.io.read to determine which structures are read from the input files (the same value is used for each file). The default “:” reads all of the structures, while “:3” would read the first three for example.
max_force (float) – remove structures where the maximum force acting on any atom is above the specified value (in eV/Å)
min_force (float) – remove structures where the maximum force acting on any atom is below the specified value (in eV/Å)
dx_tol (float) – tolerance (in Å) for how far atoms must move to accept configuration in ‘dx’ filtering strategy
rtol (float) – tolerance ratio for determining rank of displacement vectors in ‘qr’ strategy
config_weight (float) – set the configuration weight for training
force_weighting (bool) – store a config_weight that’s inversely proportional to the max force that any atom feels in the configuration [min(0.02 / max_fpa, 1)]. Overwrites the value specified by config_weight.
allow_constraint (bool) – allow constraints on atoms (e.g. selective dynamics) to modify the forces. This is typically not desirable.
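The force-based filter and the force_weighting formula above can be sketched in plain Python. This is an illustration only, not the package's implementation: the helper names are hypothetical, and the sketch assumes that the "maximum force acting on any atom" (max_fpa) is the largest per-atom force norm and that both bounds are inclusive.

```python
import math

def max_force_per_atom(forces):
    """Largest Euclidean norm among the per-atom force vectors (eV/Å)."""
    return max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)

def keep_structure(forces, max_force=2.0, min_force=-math.inf):
    """Assumed filter: keep a structure whose max force lies within the bounds."""
    return min_force <= max_force_per_atom(forces) <= max_force

def force_weight(forces):
    """config_weight from the documented formula min(0.02 / max_fpa, 1)."""
    max_fpa = max_force_per_atom(forces)
    if max_fpa == 0.0:
        # Assumption: a structure with zero forces gets full weight.
        return 1.0
    return min(0.02 / max_fpa, 1.0)

# Two atoms; the first one feels the larger force.
forces = [(0.03, 0.04, 0.0), (0.0, 0.0, 0.01)]
print(keep_structure(forces), force_weight(forces))
```

With the default max_force of 2.0 eV/Å, this weighting reaches 1 only for structures whose largest force is at most 0.02 eV/Å, so near-equilibrium geometries carry the most weight.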
- lineshape_tools.cli.get_force_opt_modes(
- n: int,
- omega2: numpy.ndarray,
- U: numpy.ndarray,
- sqrt_mass: numpy.ndarray,
- F: float = 0.5,
- tol: float = 1e-06,
- seed: int = 897689932,
- start_with_min_spread: bool = False,
- save_plot: bool = False,
- )
Bin phonons by energy and find a vector in the subspace that minimizes the spread in forces.
- Parameters:
n (int) – number of modes to select
omega2 (np.ndarray) – frequencies squared of the modes (directly from np.linalg.eigh)
U (np.ndarray) – matrix whose columns are the eigenvectors of the modes (directly from np.linalg.eigh)
sqrt_mass (np.ndarray) – sqrt of the vector of atomic masses
F (float) – target forces to optimize amplitudes for
tol (float) – convergence tolerance for scipy minimize call
seed (int) – seed value for random number generator
start_with_min_spread (bool) – determines if mode with smallest force spread is used as the starting point for optimization. Uses a random vector otherwise.
save_plot (bool) – save a plot for analyzing resulting modes
- Returns:
modes (np.ndarray) – generated modes as columns of the matrix
mode_dqs (np.ndarray) – optimized displacement amplitudes following the above criteria
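Both mode-selection helpers start by binning the phonons by energy. One way that step could look is sketched below, assuming n equal-width bins in frequency and clipping negative eigenvalues to zero. The binning scheme actually used by the package is not documented here, and bin_modes_by_frequency is a hypothetical name.

```python
import math

def bin_modes_by_frequency(omega2, n):
    """Group mode indices into n equal-width frequency bins (assumed scheme)."""
    # Clip negative eigenvalues so that the square root is defined.
    freqs = [math.sqrt(max(w2, 0.0)) for w2 in omega2]
    lo, hi = min(freqs), max(freqs)
    width = (hi - lo) / n
    bins = [[] for _ in range(n)]
    for idx, f in enumerate(freqs):
        # The highest frequency sits on the upper edge, so clamp it into the last bin.
        b = min(int((f - lo) / width), n - 1) if width > 0 else 0
        bins[b].append(idx)
    return bins

# Five modes with frequencies 0, 1, 2, 3, 4 split into two bins.
print(bin_modes_by_frequency([0.0, 1.0, 4.0, 9.0, 16.0], 2))
# prints [[0, 1], [2, 3, 4]]
```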
- lineshape_tools.cli.get_random_phonons(
- n: int,
- omega2: numpy.ndarray,
- U: numpy.ndarray,
- sqrt_mass: numpy.ndarray,
- F: float = 0.5,
- seed: int = 897689932,
- )
Bin phonons by energy and select one randomly from each bin.
The amplitude of each phonon is chosen to produce a max force per atom as close to F as possible. The max displacement on a given atom is kept within a reasonable range (0.005, 0.05).
- Parameters:
n (int) – number of modes to select
omega2 (np.ndarray) – frequencies squared of the modes (directly from np.linalg.eigh)
U (np.ndarray) – matrix whose columns are the eigenvectors of the modes (directly from np.linalg.eigh)
sqrt_mass (np.ndarray) – sqrt of the vector of atomic masses
F (float) – target forces to optimize amplitudes for
seed (int) – seed value for random number generator
- Returns:
modes (np.ndarray) – generated modes as columns of the matrix
mode_dqs (np.ndarray) – optimized displacement amplitudes following the above criteria
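The amplitude rule described above can be illustrated in the harmonic approximation. For a mass-weighted eigenvector e with eigenvalue omega2 and amplitude dq, the Cartesian displacement of atom i is dq * e_i / sqrt(m_i) and the force on it has magnitude omega2 * dq * sqrt(m_i) * |e_i|. The sketch below picks dq so that the largest per-atom force equals F, then rescales if the largest atomic displacement leaves the (0.005, 0.05) window. The function name, the use of per-atom norms, and the choice to let the displacement window override the force target are all assumptions, not the package's code.

```python
import math

def mode_amplitude(omega2, evec, sqrt_mass, F=0.5, dx_range=(0.005, 0.05)):
    """Amplitude dq for one mode with omega2 > 0.

    evec[i] is the 3-vector of the mass-weighted eigenvector on atom i and
    sqrt_mass[i] is the square root of that atom's mass.
    """
    norms = [math.sqrt(sum(c * c for c in e)) for e in evec]
    # Largest per-atom force magnitude produced by a unit amplitude.
    max_force_per_dq = omega2 * max(s * n for s, n in zip(sqrt_mass, norms))
    # Largest per-atom Cartesian displacement produced by a unit amplitude.
    max_dx_per_dq = max(n / s for s, n in zip(sqrt_mass, norms))
    dq = F / max_force_per_dq
    # Keep the largest atomic displacement inside the allowed window.
    lo, hi = dx_range
    max_dx = dq * max_dx_per_dq
    if max_dx < lo:
        dq = lo / max_dx_per_dq
    elif max_dx > hi:
        dq = hi / max_dx_per_dq
    return dq

# Antisymmetric stretch of two unit-mass atoms along x.
c = 1.0 / math.sqrt(2.0)
evec = [(c, 0.0, 0.0), (-c, 0.0, 0.0)]
print(mode_amplitude(20.0, evec, [1.0, 1.0]))  # force target is reached
print(mode_amplitude(1.0, evec, [1.0, 1.0]))   # displacement cap applies
```

Soft modes need large displacements to reach the target force, which is why some displacement cap is needed for low-frequency modes.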
- lineshape_tools.cli.gen_confs(
- struct_path: pathlib.Path,
- num_conf: int,
- strategy: str = 'rand',
- output_dir: pathlib.Path = Path('./confs'),
- accepting_mode: pathlib.Path | None = None,
- dynmat_file: pathlib.Path | None = None,
- orthogonalize: bool = False,
- default_max_dx: float = 0.015,
- start_with_min_spread: bool = False,
- opt_tol: float = 1e-06,
- seed: int = 897689932,
- )
Generate additional configurations to enhance fine-tuning dataset.
- Parameters:
struct_path (Path) – path to file containing structure that will be displaced
num_conf (int) – total number of additional configurations to generate
strategy (str) – strategy used to generate the additional configurations. Available options are ‘rand’, ‘phon_rand’, and ‘phon_opt’.
output_dir (Path) – output directory where the new configurations will be written to
accepting_mode (Path) – path to file containing the structure that defines the accepting mode. For example, if struct_path refers to the ground-state equilibrium geometry, then accepting_mode should refer to the excited-state equilibrium geometry and vice versa.
dynmat_file (Path) – path to the .npz file containing the dynamical matrix presumably calculated using the “compute-dynmat” function.
orthogonalize (bool) – perform Gram-Schmidt orthogonalization at the last step
default_max_dx (float) – default value for max displacement of a given atom
start_with_min_spread (bool) – determines if mode with smallest force spread is used as the starting point for optimization in phon_opt strategy. Uses a random vector otherwise.
opt_tol (float) – convergence tolerance for the call to scipy minimize in the phon_opt strategy
seed (int) – seed value for random number generator
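As an illustration of the orthogonalize option, the following is a generic modified Gram-Schmidt pass over a set of displacement vectors, written with the standard library only. It is a sketch of the textbook procedure, not the package's routine; in particular, whether the package normalizes or rescales the vectors afterwards is not stated here, and gram_schmidt is a hypothetical helper name.

```python
import math
import random

def gram_schmidt(vectors, tol=1e-12):
    """Return orthonormal vectors spanning the same space (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            # Remove the component of w along each vector already accepted.
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:  # skip vectors that depend linearly on earlier ones
            basis.append([wi / norm for wi in w])
    return basis

rng = random.Random(897689932)  # the default seed shown in the signature above
# Three random displacement vectors for a two-atom structure (6 Cartesian components).
disps = [[rng.uniform(-1.0, 1.0) for _ in range(6)] for _ in range(3)]
ortho = gram_schmidt(disps)
print(len(ortho))
```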
- lineshape_tools.cli.reestimate_e0s_linear_system(
- calculator: mace.calculators.MACECalculator,
- database_atoms: list[ase.atoms.Atoms],
- elements: list | None = None,
- initial_e0s: dict | None = None,
- )
Estimate atomic reference energies (E0s) by solving a linear system.
Notes
Slightly adapted from code by Noam Bernstein based on private communications with Ilyes Batatia and Joe Hart.
This functionality will eventually be removed once merged into MACE.
- Parameters:
calculator (MACECalculator) – Calculator object for the MACE model.
database_atoms (list) – List of ase Atoms objects with energy and atomic_numbers.
elements (list) – List of element atomic numbers to consider, defaults to the set present in database_atoms.
initial_e0s (dict) – Dictionary mapping element atomic numbers to E0 values, defaults to the values returned by foundation_model for isolated atom configs.
- Returns:
Dictionary with re-estimated E0 values for each element
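The general idea of fitting E0s with a linear system can be sketched without MACE. Assume each configuration contributes one equation, sum over elements of (atom count times E0) equals a target energy, and solve the resulting system in the least-squares sense. The sketch below does this through the normal equations with a small Gaussian elimination. How the package builds the right-hand side from the calculator and initial_e0s is not shown in this page, so the energies here are made-up inputs and fit_e0s is a hypothetical name.

```python
def fit_e0s(counts, energies):
    """Least-squares solution of counts @ e0 = energies via the normal equations."""
    n_el = len(counts[0])
    # Normal equations: (A^T A) x = A^T b.
    ata = [[sum(row[i] * row[j] for row in counts) for j in range(n_el)]
           for i in range(n_el)]
    atb = [sum(row[i] * e for row, e in zip(counts, energies)) for i in range(n_el)]
    aug = [ata[i] + [atb[i]] for i in range(n_el)]
    # Gaussian elimination with partial pivoting.
    for col in range(n_el):
        pivot = max(range(col, n_el), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("compositions do not determine all E0s")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n_el):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n_el + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * n_el
    for i in reversed(range(n_el)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n_el))
        x[i] = (aug[i][n_el] - s) / aug[i][i]
    return x

# Columns: number of H and O atoms in each configuration (toy data).
counts = [[2, 0], [0, 2], [1, 1]]
energies = [-2.0, -4.0, -3.0]
elements = [1, 8]
e0s = dict(zip(elements, fit_e0s(counts, energies)))
print(e0s)
```

The system is only solvable when the database contains enough distinct compositions; a dataset where every configuration has the same stoichiometry leaves the individual E0s undetermined, which the sketch reports as an error.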
- lineshape_tools.cli.gen_ft_config(
- out: pathlib.Path | str = './config.default',
- estimate_e0s: bool = False,
- device: str = 'cuda',
- name: str = 'fine-tuned',
- mace_model: str = 'medium-omat-0',
- database: pathlib.Path | str = './database.extxyz',
- head: str = 'default',
- )
Generate a configuration file for mace_run_train.
- Parameters:
out (Path) – path where the mace_run_train configuration file is written to.
estimate_e0s (bool) – estimate the E0s for training of the foundation model.
device (str) – device string passed to MACE to determine where calculation is performed.
name (str) – name of the model.
mace_model (str) – pre-trained MACE model that is used, can be a local path
database (Path) – path to the training dataset file (likely generated with collect())
head (str) – which head from the model to use for prediction
- lineshape_tools.cli.parse_force_constants_file(fname: pathlib.Path | str) -> numpy.ndarray
Parse a force constants file from phonopy.
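For orientation, a minimal reader for the plain-text FORCE_CONSTANTS layout might look like the sketch below. It assumes the full (non-compact) format: a header line with one or two integers giving the number of atoms, followed by one block per atom pair consisting of a line with the 1-based indices "i j" and three lines holding the 3x3 matrix. It parses a string rather than a file and returns nested lists rather than a numpy array, and it is not the package's parser; parse_force_constants_text is a hypothetical name.

```python
def parse_force_constants_text(text):
    """Parse FORCE_CONSTANTS-style text into nested lists fc[i][j][a][b]."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header = [int(tok) for tok in lines[0]]
    n_rows, n_cols = header[0], header[-1]  # header holds one or two integers
    fc = [[None] * n_cols for _ in range(n_rows)]
    pos = 1
    for _ in range(n_rows * n_cols):
        i, j = (int(tok) - 1 for tok in lines[pos])  # indices are 1-based
        fc[i][j] = [[float(tok) for tok in lines[pos + k]] for k in (1, 2, 3)]
        pos += 4
    return fc

# Toy two-atom example: a single spring along x.
sample = """\
2 2
1 1
 1.0 0.0 0.0
 0.0 0.0 0.0
 0.0 0.0 0.0
1 2
-1.0 0.0 0.0
 0.0 0.0 0.0
 0.0 0.0 0.0
2 1
-1.0 0.0 0.0
 0.0 0.0 0.0
 0.0 0.0 0.0
2 2
 1.0 0.0 0.0
 0.0 0.0 0.0
 0.0 0.0 0.0
"""
fc = parse_force_constants_text(sample)
print(fc[0][1][0][0])
```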
- lineshape_tools.cli.convert_from_phonopy(
- fname: pathlib.Path | str,
- atoms_file: pathlib.Path | str,
- save_file: pathlib.Path | str = 'dynmat.npz',
- )
Convert phonopy FORCE_CONSTANTS to dynmat.npz.
FORCE_CONSTANTS is written by phonopy when specifying the “--writefc” tag.
- Parameters:
fname (Path) – path to FORCE_CONSTANTS file.
atoms_file (Path) – path to file containing the equilibrium structure that was used to evaluate the force constants. (Only needed to extract sqrt masses.)
save_file (Path) – path where dynamical matrix is saved to (should end in .npz)
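The square roots of the masses enter because a dynamical matrix is the mass-weighted force-constant matrix, D_ij = Phi_ij / (sqrt(m_i) * sqrt(m_j)). The sketch below applies that standard definition to a toy one-dimensional diatomic. It illustrates the relation only: the exact contents and layout of dynmat.npz are not specified in this page, and mass_weight is a hypothetical helper.

```python
import math

def mass_weight(phi, masses):
    """D[i][j] = phi[i][j] / (sqrt(m_i) * sqrt(m_j)).

    phi is a square matrix and masses has one entry per degree of freedom.
    """
    sqrt_m = [math.sqrt(m) for m in masses]
    n = len(phi)
    return [[phi[i][j] / (sqrt_m[i] * sqrt_m[j]) for j in range(n)] for i in range(n)]

# Toy 1D diatomic: spring constant k between masses m1 and m2.
k, m1, m2 = 2.0, 1.0, 4.0
phi = [[k, -k], [-k, k]]
D = mass_weight(phi, [m1, m2])

# The eigenvalues are 0 (translation) and k * (1/m1 + 1/m2) (stretch),
# so the trace equals the stretch eigenvalue and the determinant vanishes.
trace = D[0][0] + D[1][1]
det = D[0][0] * D[1][1] - D[0][1] * D[1][0]
print(trace, det)
```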
- lineshape_tools.cli.compute_dynmat(
- input_struct: pathlib.Path,
- save_file: pathlib.Path = Path('./dynmat.npz'),
- mace_model: str = 'medium-omat-0',
- device: str = 'cuda',
- head: str = 'default',
- relax_struct: bool = True,
- analytical_hessian: bool = True,
- relax_algo: str = 'LBFGSLineSearch',
- fmax: float = 0.001,
- )
Calculate the dynamical matrix using MACE.
- Parameters:
input_struct (Path) – structure about which to compute the dynamical matrix
save_file (Path) – path where dynamical matrix is saved to (should end in .npz)
mace_model (str) – pre-trained MACE model that is used, can be a local path
device (str) – device string passed to MACE to determine where calculation is performed
head (str) – which head from the model to use for prediction
relax_struct (bool) – determines if an atomic relaxation is performed prior to computing the Hessian matrix. This is recommended if the model does not predict the same equilibrium structure as your explicit DFT calculation, which is generally the case unless good fine tuning has been performed.
analytical_hessian (bool) – determines if the Hessian is computed analytically or numerically using finite differences
relax_algo (str) – name of algorithm from ase.optimize that is used for atomic relaxation.
fmax (float) – force convergence criterion for atomic relaxation in eV/Å
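When analytical_hessian is False, the Hessian is obtained by finite differences. A generic central-difference scheme built on a force callback is sketched below to show the idea. The step size, the symmetrization, and the function names are assumptions made for this illustration; the package's numerical path may differ.

```python
def numerical_hessian(forces_fn, x0, h=1e-3):
    """Central-difference Hessian H[i][j] = -dF_i/dx_j from a force callback."""
    n = len(x0)
    hess = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp = list(x0)
        xm = list(x0)
        xp[j] += h
        xm[j] -= h
        fp, fm = forces_fn(xp), forces_fn(xm)
        for i in range(n):
            hess[i][j] = -(fp[i] - fm[i]) / (2.0 * h)
    # Symmetrize to suppress numerical noise.
    return [[0.5 * (hess[i][j] + hess[j][i]) for j in range(n)] for i in range(n)]

# Toy harmonic potential E = 0.5 * x^T K x, so the forces are F = -K x.
K = [[2.0, -1.0], [-1.0, 3.0]]

def harmonic_forces(x):
    return [-sum(K[i][j] * x[j] for j in range(2)) for i in range(2)]

H = numerical_hessian(harmonic_forces, [0.1, -0.2])
print(H)
```

For a purely harmonic potential the central difference recovers K up to round-off; for a real model the result depends on the step size, which is one reason to prefer the analytical option when it is available.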
- lineshape_tools.cli.compute_lineshape(
- ground: pathlib.Path,
- excited: pathlib.Path,
- dynmat_file: pathlib.Path,
- emission: Annotated[bool, Parameter(name='--luminescence', negative='--absorption')] = True,
- dE: float | None = None,
- gamma_zpl: float = 0.001,
- sigma_zpl: float = 0.0,
- sigma_psb: tuple[float, float] = (0.005, 0.001),
- gamma_psb: tuple[float, float] | None = None,
- omega_mult: float = 5.0,
- norm: str = 'area',
- T: Annotated[float, Parameter(name=['--T', '-T'])] = 0.0,
- plot: str | None = None,
- )
Calculate the spectral density/function and lineshape for a given dynamical matrix.
- Parameters:
ground (Path) – path to structure containing the ground state equilibrium geometry.
excited (Path) – path to structure containing the excited state equilibrium geometry.
dynmat_file (Path) – path to dynamical matrix file produced by compute_dynmat() or by phonopy and converted with convert_from_phonopy().
emission (bool) – write luminescence (True) or absorption (False) spectrum.
dE (float) – zero-phonon line energy in eV, inferred from ground/excited if not provided.
gamma_zpl (float) – Lorentzian broadening in the ZPL to capture homogeneous broadening.
sigma_zpl (float) – Gaussian broadening in the ZPL to capture inhomogeneous broadening.
sigma_psb (float, float) – Gaussian broadening used to broaden the partial Huang-Rhys factors. The broadening factor is linearly interpolated from sigma_psb[0] at zero frequency to sigma_psb[1] at the highest (non-LVM) frequency.
gamma_psb (float, float) – Turns on Lorentzian broadening of local vibrational modes (LVMs) identified by their inverse participation ratio. gamma_psb[0] is ipr_cut and gamma_psb[1] is gamma_lvm. See Broadening.
omega_mult (float) – plotting range measured from the ZPL, in multiples of the maximum phonon frequency.
norm (str) – normalization of luminescence (area or max).
T (float) – Temperature in kelvin.
plot (str) – if provided, specifies the type of plot to be generated and saved in the current working directory. Can be “subplot”, “inset”, “dos”, “S”, or “L”. See plot_spec_funcs() for more info.
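The sigma_psb interpolation described above can be sketched as follows: each partial Huang-Rhys factor is spread into a normalized Gaussian whose width is interpolated linearly between sigma_psb[0] at zero frequency and sigma_psb[1] at the highest frequency. The mode energies and Huang-Rhys factors are made-up numbers, energies are assumed to be in eV, the distinction between bulk modes and LVMs is ignored, and the function names are hypothetical. This is not the package's implementation.

```python
import math

def sigma_at(omega, omega_max, sigma_psb=(0.005, 0.001)):
    """Linear interpolation from sigma_psb[0] at zero to sigma_psb[1] at omega_max."""
    t = omega / omega_max
    return sigma_psb[0] + (sigma_psb[1] - sigma_psb[0]) * t

def spectral_density(grid, mode_omegas, partial_hr, sigma_psb=(0.005, 0.001)):
    """S(w) = sum_k S_k * Gaussian(w - w_k; sigma(w_k)) evaluated on grid."""
    omega_max = max(mode_omegas)
    out = []
    for w in grid:
        total = 0.0
        for wk, sk in zip(mode_omegas, partial_hr):
            s = sigma_at(wk, omega_max, sigma_psb)
            gauss = math.exp(-0.5 * ((w - wk) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
            total += sk * gauss
        out.append(total)
    return out

mode_omegas = [0.02, 0.05, 0.08]  # mode energies in eV (made-up values)
partial_hr = [0.5, 1.0, 0.3]      # partial Huang-Rhys factors (made-up values)
dw = 1e-4
grid = [i * dw for i in range(1201)]  # 0 to 0.12 eV
S = spectral_density(grid, mode_omegas, partial_hr)

# Integrating S(w) should recover the total Huang-Rhys factor.
total_S = sum(S) * dw
print(total_S)
```

Because each Gaussian is normalized, the broadening changes the shape of the spectral density but not its integral, so the total Huang-Rhys factor is preserved as long as the grid covers the peaks.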