QPREP
- class qprep(create_dat=True, **kwargs)
A class for preparing quantum chemistry input files.
This class handles the creation of input files for various quantum chemistry programs (currently Gaussian and ORCA) from multiple input formats. It supports file conversion, molecular property extraction, and conformer processing.
- Attributes:
args: Configuration object containing user parameters and settings
- Examples:
>>> prep = qprep(program='gaussian', qm_input='B3LYP/6-31G opt freq')
>>> prep.process_files(['molecule.xyz'])
- check_level_of_theory()
Validate quantum chemistry method against known options.
Cross-checks the specified functional and basis set against predefined lists. These lists are not exhaustive, so an item that is missing from them may still be valid.
- Raises:
Warning: If the functional or basis set is not found in the predefined lists
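The cross-check described above can be pictured with a minimal sketch. This is not the library's implementation: the reference lists and the helper name check_keywords_line are hypothetical, and the sketch assumes the first token of the keywords line has the form 'FUNCTIONAL/BASIS'.

```python
import warnings

# Hypothetical, deliberately short reference lists (the real lists are not shown here).
KNOWN_FUNCTIONALS = {"b3lyp", "m062x", "wb97xd", "pbe0"}
KNOWN_BASIS_SETS = {"6-31g", "6-31g*", "def2svp", "def2tzvp"}


def check_keywords_line(qm_input):
    """Split 'FUNCTIONAL/BASIS ...' and warn about parts missing from the lists."""
    tokens = qm_input.split()
    if not tokens:
        raise ValueError("qm_input is empty")
    # Assumption: the level of theory is the first token, written as functional/basis.
    functional, _, basis = tokens[0].partition("/")
    if functional.lower() not in KNOWN_FUNCTIONALS:
        warnings.warn(f"Functional '{functional}' not in the reference list; it may still be valid.")
    if basis.lower() not in KNOWN_BASIS_SETS:
        warnings.warn(f"Basis set '{basis}' not in the reference list; it may still be valid.")
    return functional, basis
```

A warning rather than an exception matches the note above: an unlisted functional or basis set is flagged but not rejected.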
- get_header(qprep_data)
Generate input file header section.
Creates program-specific header with resource specs and calculation setup.
- Args:
qprep_data (dict): Molecular data including name and properties
- Returns:
str: Formatted header text
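As an illustration of what such a header can contain, here is a minimal sketch for the Gaussian case only. The function name gaussian_header and its arguments are hypothetical; the layout (Link 0 lines, route line, blank line, title, blank line) follows the general Gaussian input format, and whether the class emits exactly these lines in this order is an assumption.

```python
def gaussian_header(name, qm_input, mem="4GB", nprocs=None, chk=False):
    """Build the top of a Gaussian input file (hypothetical helper)."""
    lines = []
    if chk:
        lines.append(f"%chk={name}.chk")        # checkpoint file named after the molecule
    if nprocs is not None:
        lines.append(f"%nprocshared={nprocs}")  # number of shared-memory processors
    lines.append(f"%mem={mem}")                 # total memory for the job
    lines.append(f"# {qm_input}")               # route section (keywords line)
    lines.append("")                            # blank line ends the route section
    lines.append(name)                          # title line
    lines.append("")                            # blank line ends the title section
    return "\n".join(lines) + "\n"
```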
- get_tail(qprep_data)
Generate input file tail section.
Creates program-specific closing section with basis sets and modifiers.
- Args:
qprep_data (dict): Molecular data
- Returns:
str: Formatted tail section
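For a gen(ECP) calculation in Gaussian, the tail typically lists a basis set per group of atoms and then repeats the ECP atoms. The sketch below shows one way to build that text. The helper name genecp_tail is hypothetical, and the exact text the class writes may differ; only the general Gaussian Gen/GenECP layout is assumed.

```python
def genecp_tail(atom_types, gen_atoms, bs_gen, bs_nogen, ecp=True):
    """Build a Gen/GenECP basis-set section (hypothetical helper)."""
    present = list(dict.fromkeys(atom_types))          # unique symbols, original order
    heavy = [a for a in present if a in gen_atoms]     # atoms that get bs_gen
    light = [a for a in present if a not in gen_atoms] # atoms that get bs_nogen
    lines = []
    if light:
        lines += [" ".join(light) + " 0", bs_nogen, "****"]
    if heavy:
        lines += [" ".join(heavy) + " 0", bs_gen, "****"]
    if ecp and heavy:
        # GenECP: blank line, then the ECP definition for the same atoms
        lines += ["", " ".join(heavy) + " 0", bs_gen]
    return "\n".join(lines) + "\n\n"
```

Only atoms that actually occur in the molecule are listed, so a gen_atoms list such as ['I','Pd'] can be reused across molecules that contain just one of them.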
- qprep_coords(file, mol, file_format)
Extract molecular coordinates and properties from various file formats.
Retrieves atomic coordinates, atom types, charge and multiplicity from:
RDKit mol objects
QM output files (LOG/OUT)
JSON files
Custom atom_types/cartesians lists
- Args:
file (str): Input file path
mol (rdkit.Mol, optional): RDKit molecule object
file_format (str): Input file format
- Returns:
tuple: (atom_types, cartesians, charge, mult, found_coords)
atom_types (List[str]): Atomic symbols
cartesians (np.ndarray): XYZ coordinates
charge (int): Molecular charge
mult (int): Spin multiplicity
found_coords (bool): Whether coordinates were found
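To make the shape of the returned tuple concrete, here is a minimal sketch that reads a plain XYZ block. The function read_xyz is hypothetical and covers only this one format. It also returns nested Python lists rather than an np.ndarray to stay dependency-free, and it takes charge and mult as arguments because an XYZ file does not store them.

```python
def read_xyz(text, charge=0, mult=1):
    """Parse XYZ text into (atom_types, cartesians, charge, mult, found_coords)."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])                  # first line: number of atoms
    atom_types, cartesians = [], []
    for line in lines[2:2 + n_atoms]:        # second line is a comment; atoms follow
        symbol, x, y, z = line.split()[:4]
        atom_types.append(symbol)
        cartesians.append([float(x), float(y), float(z)])
    found_coords = n_atoms > 0 and len(atom_types) == n_atoms
    return atom_types, cartesians, charge, mult, found_coords
```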
- sdf_2_com(sdf_file, destination, file_format)
Process SDF file to generate QM input files.
Handles conversion of SDF structures to Gaussian/ORCA input files, with support for conformer selection and energy filtering.
- Args:
sdf_file (str): Path to input SDF file
destination (Path): Output directory path
file_format (str): Original file format
- Note:
Uses lowest_only, lowest_n and e_threshold_qprep settings to filter conformers if specified.
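The three filters can be sketched as operations on a list of conformer energies. This is an illustration under assumptions: the helper select_conformers is hypothetical, the threshold is taken to be in the same units as the energies, and the way the options combine here (lowest_only wins, then the threshold, then lowest_n) is a guess rather than documented behavior.

```python
def select_conformers(energies, lowest_only=False, lowest_n=None, e_threshold=None):
    """Return indices of the conformers to keep, sorted by increasing energy."""
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    if not order:
        return []
    if lowest_only:
        return order[:1]                     # single lowest-energy conformer
    if e_threshold is not None:
        e_min = energies[order[0]]
        # keep conformers within e_threshold of the lowest one
        order = [i for i in order if energies[i] - e_min <= e_threshold]
    if lowest_n is not None:
        order = order[:lowest_n]             # keep at most n conformers
    return order
```

With no option set, every conformer is kept and only the ordering changes.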
- write(qprep_data)
Write quantum chemistry input file.
Creates Gaussian (.com) or ORCA (.inp) input files with molecular coordinates and calculation parameters.
- Args:
- qprep_data (dict): Molecular data including:
name (str): Base filename
atom_types (List[str]): Atomic symbols
cartesians (np.ndarray): XYZ coordinates
charge (int): Molecular charge
mult (int): Spin multiplicity
- Returns:
str: Name of created input file
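A minimal sketch of the Gaussian case shows how the qprep_data fields end up in the file. The function write_gaussian_input is hypothetical, takes the keywords line and destination as explicit arguments (an assumption made for self-containment), and omits the Link 0 lines and any tail section for brevity.

```python
from pathlib import Path


def write_gaussian_input(qprep_data, qm_input, destination="."):
    """Write NAME.com from a qprep_data-like dict and return the file name."""
    name = qprep_data["name"]
    lines = [
        f"# {qm_input}",                                   # route section
        "",
        name,                                              # title line
        "",
        f"{qprep_data['charge']} {qprep_data['mult']}",    # charge and multiplicity
    ]
    for symbol, (x, y, z) in zip(qprep_data["atom_types"], qprep_data["cartesians"]):
        lines.append(f"{symbol:<2} {x:>12.6f} {y:>12.6f} {z:>12.6f}")
    lines.append("")                                       # blank line closes the geometry
    out = Path(destination) / f"{name}.com"
    out.write_text("\n".join(lines) + "\n")
    return out.name
```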
Parameters
- files : mol object, str or list of str, default=None
File(s) from which the new QM input file(s) are prepared. Accepted formats: mol object(s), Gaussian or ORCA LOG/OUT output files, JSON, XYZ, SDF and PDB. Lists (e.g. [FILE1.log, FILE2.log]) and wildcards of the form *.FORMAT (e.g. *.json) can also be used.
- atom_types : list of str, default=[]
(If files is None) List containing the atoms of the system
- cartesians : list of str, default=[]
(If files is None) Cartesian coordinates used for further processing
- w_dir_main : str, default=os.getcwd()
Working directory
- destination : str, default=None
Directory to create the input file(s)
- varfile : str, default=None
Option to parse the variables using a yaml file (specify the filename)
- program : str, default=None
Program required to create the new input files. Current options: 'gaussian', 'orca'
- qm_input : str, default=''
Keywords line for new input files (e.g. 'B3LYP/6-31G opt freq')
- qm_end : str, default=''
Final line(s) in the new input files
- charge : int, default=None
Charge used in the new input files. If charge is not defined, it defaults to 0
- mult : int, default=None
Multiplicity used in the new input files. If mult is not defined, it defaults to 1
- suffix : str, default=''
Suffix for the new input files (e.g. FILENAME_SUFFIX.com for FILENAME.log)
- prefix : str, default=''
Prefix added to all the names
- chk : bool, default=False
Include the chk input line in new input files for Gaussian calculations
- oldchk : bool, default=False
Include the oldchk input line in new input files for Gaussian calculations
- chk_path : str, default=''
Path to store CHK files. For example, if chk_path='root/user/FILENAME.chk', the chk line of the input file would be %chk=root/user/FILENAME.chk
- oldchk_path : str, default=''
Path to read CHK files with %oldchk. For example, if oldchk_path='root/user/FILENAME.chk', the oldchk line of the input file would be %oldchk=root/user/FILENAME.chk
- mem : str, default='4GB'
Memory for the QM calculations: (i) Gaussian: total memory; (ii) ORCA: memory per processor
- nprocs : int, default=None
Number of processors used in the QM calculations
- gen_atoms : list of str, default=[]
Atoms included in the gen(ECP) basis set (e.g. ['I','Pd'])
- bs_gen : str, default=''
Basis set used for gen(ECP) atoms (e.g. 'def2svp')
- bs_nogen : str, default=''
Basis set used for non-gen(ECP) atoms in gen(ECP) calculations (e.g. '6-31G*')
- lowest_only : bool, default=False
Only create an input for the lowest-energy conformer of the SDF file
- lowest_n : int, default=None
Only create inputs for the n lowest-energy conformers of the SDF file
- e_threshold_qprep : float, default=None
Only create inputs for conformers of the SDF file whose energy, relative to the lowest conformer, is below this threshold
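To show how program, prefix and suffix can combine into an output file name, here is a minimal sketch. The helper input_name is hypothetical. The .com and .inp extensions and the FILENAME_SUFFIX pattern come from the descriptions above, while joining the prefix with an underscore is an assumption.

```python
from pathlib import Path


def input_name(source, program, prefix="", suffix=""):
    """Derive the new input file name from a source file name."""
    extension = {"gaussian": ".com", "orca": ".inp"}[program]
    stem = Path(source).stem                 # FILENAME.log -> FILENAME
    if prefix:
        stem = f"{prefix}_{stem}"            # assumed separator: underscore
    if suffix:
        stem = f"{stem}_{suffix}"            # FILENAME_SUFFIX, as documented above
    return stem + extension
```

An unsupported program raises KeyError in this sketch, which keeps the two documented options explicit.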