QPREP

class qprep(create_dat=True, **kwargs)

A class for preparing quantum chemistry input files.

This class handles the creation of input files for various quantum chemistry programs (currently Gaussian and ORCA) from multiple input formats. It supports file conversion, molecular property extraction, and conformer processing.

Attributes:

args: Configuration object containing user parameters and settings

Examples:
>>> prep = qprep(program='gaussian', qm_input='B3LYP/6-31G opt freq')
>>> prep.process_files(['molecule.xyz'])

check_level_of_theory()

Validate quantum chemistry method against known options.

Cross checks the specified functional and basis set against predefined lists. Note that these lists are not exhaustive - missing items may still be valid.

Raises:

Warning: If the functional or basis set is not found in the predefined lists
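
The cross-check can be pictured with a minimal sketch. This is not the library's code: the `KNOWN_FUNCTIONALS` and `KNOWN_BASIS_SETS` sets and the `check_level` helper are hypothetical, and the sketch assumes a Gaussian-style `functional/basis` token in the keywords line.

```python
import warnings

# Hypothetical, deliberately incomplete reference lists (assumption).
KNOWN_FUNCTIONALS = {"b3lyp", "m062x", "wb97xd", "pbe0"}
KNOWN_BASIS_SETS = {"6-31g", "6-31g*", "def2svp", "def2tzvp"}


def check_level(qm_input):
    """Warn about unrecognized functionals/basis sets and return their names."""
    unknown = []
    for token in qm_input.split():
        if "/" not in token:
            continue  # skip plain keywords such as 'opt' or 'freq'
        functional, basis = token.split("/", 1)
        if functional.lower() not in KNOWN_FUNCTIONALS:
            unknown.append(functional)
        if basis.lower() not in KNOWN_BASIS_SETS:
            unknown.append(basis)
    for name in unknown:
        # A warning rather than an error: the lists are not exhaustive.
        warnings.warn(f"{name} not found in the predefined lists; it may still be valid")
    return unknown
```

Because the lists are incomplete, the sketch only warns and never stops the run.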

get_header(qprep_data)

Generate input file header section.

Creates program-specific header with resource specs and calculation setup.

Args:

qprep_data (dict): Molecular data including name and properties

Returns:

str: Formatted header text
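
As a rough illustration of what a program-specific header can contain, the sketch below assembles one by hand. The `build_header` function and its exact line layout are assumptions and may differ from what `get_header` emits; `%chk`, `%mem` and `%nprocshared` are standard Gaussian Link 0 commands, and `%pal` is the ORCA parallelization block. The ORCA memory setting is left out because it needs a unit conversion that is not described here.

```python
def build_header(name, program, qm_input, mem="4GB", nprocs=None,
                 chk=False, charge=0, mult=1):
    """Assemble an illustrative header for a Gaussian or ORCA input file."""
    lines = []
    if program == "gaussian":
        if chk:
            lines.append(f"%chk={name}.chk")
        lines.append(f"%mem={mem}")
        if nprocs is not None:
            lines.append(f"%nprocshared={nprocs}")
        lines.append(f"# {qm_input}")
        # Blank line, title, blank line, then charge and multiplicity.
        lines += ["", name, "", f"{charge} {mult}"]
    elif program == "orca":
        lines.append(f"# {name}")
        if nprocs is not None:
            lines.append(f"%pal nprocs {nprocs} end")
        lines.append(f"! {qm_input}")
        lines.append(f"* xyz {charge} {mult}")
    else:
        raise ValueError(f"unsupported program: {program}")
    return "\n".join(lines) + "\n"
```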

get_tail(qprep_data)

Generate input file tail section.

Creates program-specific closing section with basis sets and modifiers.

Args:

qprep_data (dict): Molecular data

Returns:

str: Formatted tail section
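
For a Gaussian gen or genecp calculation, the tail typically lists which atoms use which basis set. The sketch below shows one way to build such a block from the `gen_atoms`, `bs_gen` and `bs_nogen` settings. The `build_gen_tail` function and its `ecp` flag are hypothetical, and the output of `get_tail` may be laid out differently.

```python
def build_gen_tail(atom_types, gen_atoms, bs_gen, bs_nogen, ecp=False, qm_end=""):
    """Build an illustrative Gaussian gen/genecp basis-set section."""
    elements = list(dict.fromkeys(atom_types))  # unique elements, input order kept
    gen = [el for el in elements if el in gen_atoms]
    nogen = [el for el in elements if el not in gen_atoms]
    lines = []
    if nogen:
        lines += [" ".join(nogen) + " 0", bs_nogen, "****"]
    if gen:
        lines += [" ".join(gen) + " 0", bs_gen, "****"]
        if ecp:
            # genecp: the ECP definition follows after a blank line.
            lines += ["", " ".join(gen) + " 0", bs_gen]
    lines.append("")  # blank line closes the section
    if qm_end:
        lines += [qm_end, ""]
    return "\n".join(lines) + "\n"
```

Only elements actually present in the molecule are listed, so an entry in `gen_atoms` that does not occur (such as 'Pd' in an iodine-only system) is silently skipped.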

qprep_coords(file, mol, file_format)

Extract molecular coordinates and properties from various file formats.

Retrieves atomic coordinates, atom types, charge and multiplicity from:
  • RDKit mol objects

  • QM output files (LOG/OUT)

  • JSON files

  • Custom atom_types/cartesians lists

Args:

file (str): Input file path

mol (rdkit.Mol, optional): RDKit molecule object

file_format (str): Input file format

Returns:

tuple: (atom_types, cartesians, charge, mult, found_coords)
  • atom_types (List[str]): Atomic symbols

  • cartesians (np.ndarray): XYZ coordinates

  • charge (int): Molecular charge

  • mult (int): Spin multiplicity

  • found_coords (bool): Whether coordinates were found
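
The shape of the returned tuple can be illustrated for the simplest input, custom `atom_types`/`cartesians` lists. The `coords_from_lists` helper is hypothetical and covers only that branch. It applies the documented fallbacks (charge 0, multiplicity 1) and, to stay dependency-free, returns nested lists where the real method returns a NumPy array.

```python
def coords_from_lists(atom_types, cartesians, charge=None, mult=None):
    """Return (atom_types, cartesians, charge, mult, found_coords) from plain lists."""
    # Coordinates count as found only if there is one XYZ row per atom.
    found_coords = len(atom_types) > 0 and len(atom_types) == len(cartesians)
    charge = 0 if charge is None else charge  # documented default
    mult = 1 if mult is None else mult        # documented default
    xyz = [[float(value) for value in row] for row in cartesians]
    return list(atom_types), xyz, charge, mult, found_coords
```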

sdf_2_com(sdf_file, destination, file_format)

Process SDF file to generate QM input files.

Handles conversion of SDF structures to Gaussian/ORCA input files, with support for conformer selection and energy filtering.

Args:

sdf_file (str): Path to input SDF file

destination (Path): Output directory path

file_format (str): Original file format

Note:

Uses lowest_only, lowest_n and e_threshold_qprep settings to filter conformers if specified.
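
The three filters can be sketched on a plain list of conformer energies. The `select_conformers` helper is hypothetical, and the order in which the filters are combined here (`lowest_only` first, then the threshold, then `lowest_n`) is an assumption rather than documented behavior.

```python
def select_conformers(energies, lowest_only=False, lowest_n=None, e_threshold=None):
    """Return indices of the conformers to keep, ordered by increasing energy."""
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    if not order:
        return []
    if lowest_only:
        return order[:1]
    if e_threshold is not None:
        # Keep conformers within e_threshold of the lowest-energy conformer.
        e_min = energies[order[0]]
        order = [i for i in order if energies[i] - e_min <= e_threshold]
    if lowest_n is not None:
        order = order[:lowest_n]
    return order
```

The threshold is applied to energies relative to the lowest conformer, in whatever units the SDF energies use.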

write(qprep_data)

Write quantum chemistry input file.

Creates Gaussian (.com) or ORCA (.inp) input files with molecular coordinates and calculation parameters.

Args:
qprep_data (dict): Molecular data including:
  • name (str): Base filename

  • atom_types (List[str]): Atomic symbols

  • cartesians (np.ndarray): XYZ coordinates

  • charge (int): Molecular charge

  • mult (int): Spin multiplicity

Returns:

str: Name of created input file
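
A minimal sketch of the writing step for the Gaussian case, using the `qprep_data` keys listed above. The `write_gaussian_input` function, its column formatting and the way the suffix is appended are assumptions for illustration only. Coordinates are given as nested lists instead of a NumPy array so that the example runs without extra dependencies.

```python
from pathlib import Path
import tempfile


def write_gaussian_input(qprep_data, qm_input, destination, mem="4GB", suffix=""):
    """Write an illustrative Gaussian .com file and return its file name."""
    name = qprep_data["name"] + (f"_{suffix}" if suffix else "")
    lines = [f"%mem={mem}", f"# {qm_input}", "", name, "",
             f"{qprep_data['charge']} {qprep_data['mult']}"]
    for symbol, (x, y, z) in zip(qprep_data["atom_types"], qprep_data["cartesians"]):
        lines.append(f"{symbol:<2} {x:>12.6f} {y:>12.6f} {z:>12.6f}")
    lines.append("")  # Gaussian expects a blank line after the geometry
    path = Path(destination) / f"{name}.com"
    path.write_text("\n".join(lines) + "\n")
    return path.name


# Usage: write a water molecule into a temporary directory.
water = {
    "name": "water",
    "atom_types": ["O", "H", "H"],
    "cartesians": [[0.0, 0.0, 0.1173], [0.0, 0.7572, -0.4692], [0.0, -0.7572, -0.4692]],
    "charge": 0,
    "mult": 1,
}
with tempfile.TemporaryDirectory() as tmp:
    created = write_gaussian_input(water, "B3LYP/6-31G opt freq", tmp)
    text = (Path(tmp) / created).read_text()
```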

Parameters

files : mol object, str or list of str, default=None

Input file(s) from which the QM input file(s) are prepared. Formats accepted: mol object(s), Gaussian or ORCA LOG/OUT output files, JSON, XYZ, SDF, PDB. Lists (e.g. [FILE1.log, FILE2.log]) and wildcards of the form *.FORMAT (e.g. *.json) can also be used.

atom_types : list of str, default=[]

(If files is None) List containing the atoms of the system

cartesians : list of str, default=[]

(If files is None) Cartesian coordinates used for further processing

w_dir_main : str, default=os.getcwd()

Working directory

destination : str, default=None

Directory to create the input file(s)

varfile : str, default=None

Option to parse the variables using a yaml file (specify the filename)

program : str, default=None

Program required to create the new input files. Current options: 'gaussian', 'orca'

qm_input : str, default=''

Keywords line for new input files (e.g. 'B3LYP/6-31G opt freq')

qm_end : str, default=''

Final line(s) in the new input files

charge : int, default=None

Charge of the calculations used in the following input files. If charge isn't defined, it defaults to 0

mult : int, default=None

Multiplicity of the calculations used in the following input files. If mult isn't defined, it defaults to 1

suffix : str, default=''

Suffix for the new input files (e.g. FILENAME_SUFFIX.com for FILENAME.log)

prefix : str, default=''

Prefix added to all the names

chk : bool, default=False

Include the chk input line in new input files for Gaussian calculations

oldchk : bool, default=False

Include the oldchk input line in new input files for Gaussian calculations

chk_path : str, default=''

Path to store CHK files. For example, if chk_path='root/user/FILENAME.chk', the chk line of the input file would be %chk=root/user/FILENAME.chk

oldchk_path : str, default=''

Path to read CHK files with %oldchk. For example, if oldchk_path='root/user/FILENAME.chk', the oldchk line of the input file would be %oldchk=root/user/FILENAME.chk

mem : str, default='4GB'

Memory for the QM calculations. For Gaussian, this is the total memory; for ORCA, it is the memory per processor

nprocs : int, default=None

Number of processors used in the QM calculations

gen_atoms : list of str, default=[]

Atoms included in the gen(ECP) basis set (e.g. ['I','Pd'])

bs_gen : str, default=''

Basis set used for gen(ECP) atoms (e.g. 'def2svp')

bs_nogen : str, default=''

Basis set used for non-gen(ECP) atoms in gen(ECP) calculations (e.g. '6-31G*')

lowest_only : bool, default=False

Only create input for the conformer with lowest energy of the SDF file

lowest_n : int, default=None

Only create inputs for the n conformers with lowest energy of the SDF file

e_threshold_qprep : float, default=None

Only create inputs for conformers of the SDF file whose energy, relative to the lowest-energy conformer, is below this threshold