reforge.mdsystem package
Submodules
reforge.mdsystem.gmxmd module
File: gmxmd.py
- Description:
This module provides classes and functions for setting up, running, and analyzing molecular dynamics (MD) simulations using GROMACS. The main classes include:
GmxSystem: Provides methods to prepare simulation files, process PDB files, run GROMACS commands, and perform various analyses on MD data.
GmxRun: A subclass of MDRun dedicated to executing MD simulations with GROMACS and performing post-processing tasks (e.g., RMSF, RMSD, covariance analysis).
- Usage:
Import this module and instantiate the GmxSystem or GmxRun classes to set up and run your MD simulations.
- Requirements:
Python 3.x
MDAnalysis
NumPy
Pandas
GROMACS (with CLI tools such as gmx_mpi, gmx, etc.)
The reforge package and its dependencies
Author: DY Date: 2025-02-27
- class reforge.mdsystem.gmxmd.GmxSystem(sysdir, sysname)[source]
Bases: MDSystem
Class to set up and analyze protein-nucleotide-lipid systems for MD simulations using GROMACS.
Most attributes are paths to files and directories needed to set up and run the MD simulation.
- MMDPDIR = PosixPath('/scratch/dyangali/reforge/reforge/martini/datdir/mdp')
- gmx(command='-h', clinput=None, clean_wdir=True, **kwargs)[source]
Executes a GROMACS command from the system’s root directory.
- Parameters:
command (str) (The GROMACS command to run (default: “-h”).)
clinput (str, optional) (Input to pass to the command’s stdin.)
clean_wdir (bool, optional) (If True, cleans the working directory after execution.)
kwargs (Additional keyword arguments to pass to gromacs.)
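A minimal sketch of how a command, optional stdin input, and a working directory can be combined using only the standard library. The helper run_in_dir is hypothetical and is not reforge's implementation; a Python one-liner stands in for the GROMACS binary so the example runs without GROMACS installed.

```python
import subprocess
import sys
import tempfile


def run_in_dir(argv, wdir, clinput=None):
    """Hypothetical helper: run argv inside wdir, optionally feeding clinput to stdin."""
    result = subprocess.run(
        argv,
        cwd=wdir,             # run from the chosen root directory
        input=clinput,        # text passed to the command's stdin (or None)
        text=True,
        capture_output=True,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Stand-in for an interactive tool that reads a group selection from stdin.
fake_tool = [
    sys.executable,
    "-c",
    "import sys; print('selected:', sys.stdin.read().strip())",
]

with tempfile.TemporaryDirectory() as wdir:
    out = run_in_dir(fake_tool, wdir, clinput="1\n")
```

Passing the selection through stdin is what makes interactive GROMACS tools usable from a script, which is presumably the role of the clinput parameter.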
- make_box(**kwargs)[source]
Sets up simulation box with GROMACS editconf command.
- Parameters:
kwargs (Additional keyword arguments for the GROMACS editconf command. Defaults:) –
d: Distance parameter (default: 1.0)
bt: Box type (default: “dodecahedron”).
- make_cg_topology(add_resolved_ions=False, prefix='chain')[source]
Creates the system topology file by including all relevant ITP files and defining the system and molecule sections.
- Parameters:
add_resolved_ions (bool, optional) (If True, counts and includes resolved ions.)
prefix (str, optional) (Prefix for ITP files to include (default: “chain”).)
Writes the topology file (self.systop) with include directives and molecule counts.
- make_gro_file(d=1.25, bt='dodecahedron')[source]
Generates the final GROMACS GRO file from coarse-grained PDB files.
- Parameters:
d (float, optional) (Distance parameter for the editconf command (default: 1.25).)
bt (str, optional) (Box type for the editconf command (default: “dodecahedron”).)
Converts PDB files to GRO files, merges them, and adjusts the system box.
- solvate(**kwargs)[source]
Solvates the system using GROMACS solvate command.
- Parameters:
kwargs (Additional parameters for the solvate command. Defaults:) –
cp: “solute.pdb”
cs: “water.gro”
- add_bulk_ions(solvent='W', **kwargs)[source]
Adds bulk ions to neutralize the system using GROMACS genion.
- Parameters:
solvent (str, optional) – Solvent residue name (default: “W”).
kwargs (dict) – Additional parameters for genion. Defaults include:
conc: 0.15
pname: “NA”
nname: “CL”
neutral: “”
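The default-plus-override pattern used by these keyword arguments can be sketched as follows. The helper render_flags is hypothetical, and the way reforge actually turns keyword arguments into command-line flags is an assumption here; the sketch only shows one plausible mapping, in which an empty string marks a bare flag such as -neutral.

```python
def render_flags(defaults, **overrides):
    """Hypothetical helper: merge defaults with overrides and render '-key value' pairs."""
    options = {**defaults, **overrides}  # caller-supplied values win
    argv = []
    for key, value in options.items():
        argv.append(f"-{key}")
        if value != "":  # an empty string is treated as a flag without a value
            argv.append(str(value))
    return argv


# Defaults as listed for add_bulk_ions.
genion_defaults = {"conc": 0.15, "pname": "NA", "nname": "CL", "neutral": ""}

# Override the salt concentration and the cation name; keep the rest.
argv = render_flags(genion_defaults, conc=0.1, pname="K")
```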
- make_system_ndx(backbone_atoms=('CA', 'P', "C1'"), water_resname='W')[source]
Creates an index (NDX) file for the system, separating solute, backbone, solvent, and individual chains.
- Parameters:
backbone_atoms (tuple, optional) – Atom names to include in the backbone (default: (“CA”, “P”, “C1’”)).
water_resname (str, optional) – Residue name of the water (default: “W”).
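For orientation, a GROMACS index file is a sequence of groups, each a “[ name ]” header followed by 1-based atom numbers. The sketch below builds such text for a made-up six-atom system; the group names, their order, and the selection rules are assumptions and may differ from what make_system_ndx writes.

```python
def format_ndx_group(name, indices, per_line=15):
    """Format one index group: a '[ name ]' header followed by 1-based atom numbers."""
    lines = [f"[ {name} ]"]
    for start in range(0, len(indices), per_line):
        chunk = indices[start:start + per_line]
        lines.append(" ".join(str(i) for i in chunk))
    return "\n".join(lines) + "\n"


# Hypothetical toy system: (atom name, residue name) for each atom, in file order.
atoms = [("CA", "ALA"), ("CB", "ALA"), ("P", "A"), ("C1'", "A"), ("W", "W"), ("W", "W")]
backbone_atoms = ("CA", "P", "C1'")
water_resname = "W"

everything = [i + 1 for i in range(len(atoms))]
solute = [i + 1 for i, (_, res) in enumerate(atoms) if res != water_resname]
backbone = [
    i + 1
    for i, (name, res) in enumerate(atoms)
    if res != water_resname and name in backbone_atoms
]
solvent = [i + 1 for i, (_, res) in enumerate(atoms) if res == water_resname]

ndx_text = "".join(
    format_ndx_group(name, group)
    for name, group in [
        ("System", everything),
        ("Solute", solute),
        ("Backbone", backbone),
        ("Solvent", solvent),
    ]
)
```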
- class reforge.mdsystem.gmxmd.GmxRun(sysdir, sysname, runname)[source]
Bases: MDRun
Subclass of MDRun for running MD simulations with GROMACS.
- gmx(command='-h', clinput=None, clean_wdir=True, **kwargs)[source]
Executes a GROMACS command from the run’s root directory.
- Parameters:
command (str) (The GROMACS command to run (default: “-h”).)
clinput (str, optional) (Input to pass to the command’s stdin.)
clean_wdir (bool, optional) (If True, cleans the working directory after execution.)
kwargs (Additional keyword arguments to pass to gromacs.)
- empp(**kwargs)[source]
Prepares the energy minimization run using GROMACS grompp.
- Parameters:
kwargs (Additional parameters for grompp. Defaults include:) –
f: Path to .mdp file.
c: Structure file.
r: Reference structure.
p: Topology file.
n: Index file.
o: Output TPR file (“em.tpr”).
- hupp(**kwargs)[source]
Prepares the heat-up phase using GROMACS grompp.
- Parameters:
kwargs (Additional parameters for grompp. Defaults include:) –
f: Path to .mdp file
c: Starting structure (“em.gro”).
r: Reference structure (“em.gro”).
p: Topology file.
n: Index file.
o: Output TPR file (“hu.tpr”).
- eqpp(**kwargs)[source]
Prepares the equilibration phase using GROMACS grompp.
- Parameters:
kwargs (Additional parameters for grompp. Defaults include:) –
f: Path to .mdp file.
c: Starting structure (“hu.gro”).
r: Reference structure (“hu.gro”).
p: Topology file.
n: Index file.
o: Output TPR file (“eq.tpr”).
- mdpp(**kwargs)[source]
Prepares the production MD run using GROMACS grompp.
- Parameters:
kwargs (Additional parameters for grompp. Defaults include:) –
f: Path to .mdp file.
c: Starting structure (“eq.gro”).
r: Reference structure (“eq.gro”).
p: Topology file.
n: Index file.
o: Output TPR file (“md.tpr”).
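The defaults listed for empp, hupp, eqpp, and mdpp chain each stage to the output of the previous one. A small sketch of that naming scheme follows; the function grompp_files is hypothetical, and the starting structure for energy minimization is left as None because its default file name is not given above.

```python
# Stage order implied by the documented defaults: em -> hu -> eq -> md.
STAGES = ["em", "hu", "eq", "md"]


def grompp_files(stage):
    """Return the default start/reference structure and output TPR name for a stage."""
    idx = STAGES.index(stage)
    # Each stage after the first starts from the final structure of the previous one.
    start = f"{STAGES[idx - 1]}.gro" if idx > 0 else None
    return {"c": start, "r": start, "o": f"{stage}.tpr"}


plan = {stage: grompp_files(stage) for stage in STAGES}
```

Keeping the stage names in one ordered list makes it harder to wire a stage to the wrong input file by accident.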
- mdrun(**kwargs)[source]
Executes the production MD run using GROMACS mdrun.
- Parameters:
kwargs (Additional parameters for mdrun. Defaults include:) –
deffnm: “md”
nsteps: “-2”
ntomp: “8”
- trjconv(clinput=None, **kwargs)[source]
Converts trajectories using GROMACS trjconv.
- Parameters:
clinput (str, optional) (Input to be passed to trjconv.)
kwargs (Additional parameters for trjconv.)
- rmsf(clinput=None, **kwargs)[source]
Calculates RMSF using GROMACS rmsf.
- Parameters:
clinput (str, optional) (Input for the rmsf command.)
kwargs (Additional parameters for rmsf. Defaults include:) –
s: Structure file.
f: Trajectory file.
o: Output xvg file.
- rmsd(clinput=None, **kwargs)[source]
Calculates RMSD using GROMACS rms.
- Parameters:
clinput (str, optional) (Input for the rms command.)
kwargs (Additional parameters for rms. Defaults include:) –
s: Structure file.
f: Trajectory file.
o: Output xvg file.
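The xvg files written by rmsf and rmsd are plain text: lines starting with “#” are comments, lines starting with “@” are plotting directives, and the rest are whitespace-separated numeric columns. The reader below is a standard-library sketch, not part of reforge, and the sample data is made up. It does not handle multi-dataset files separated by “&”.

```python
def read_xvg(text):
    """Parse XVG text into rows of floats, skipping '#' comments and '@' directives."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "@")):
            continue
        rows.append([float(x) for x in line.split()])
    return rows


# Made-up example content, not real simulation output.
sample = """# illustrative data only
@    title "RMSD"
@    xaxis  label "Time (ps)"
0.0   0.000
10.0  0.123
20.0  0.150
"""

rows = read_xvg(sample)
times = [row[0] for row in rows]
values = [row[1] for row in rows]
```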
- rdf(clinput=None, **kwargs)[source]
Calculates the radial distribution function using GROMACS rdf.
- Parameters:
clinput (str, optional) (Input for the rdf command.)
kwargs (Additional parameters for rdf. Defaults include:) –
f: Trajectory file.
s: Structure file.
n: Index file.
- cluster(clinput=None, **kwargs)[source]
Performs clustering using GROMACS cluster.
- Parameters:
clinput (str, optional) (Input for the clustering command.)
kwargs (Additional parameters for cluster.)
- extract_cluster(clinput=None, **kwargs)[source]
Extracts frames belonging to a cluster using GROMACS extract-cluster.
- Parameters:
clinput (str, optional) (Input for the extract-cluster command.)
kwargs (Additional parameters for extract-cluster. Defaults include:)
- clusters (“cluster.ndx”)
- covar(clinput=None, **kwargs)[source]
Calculates and diagonalizes the covariance matrix using GROMACS covar.
- Parameters:
clinput (str, optional) (Input for the covar command.)
kwargs (Additional parameters for covar. Defaults include:)
- f (Trajectory file.)
- s (Structure file.)
- n (Index file.)
- anaeig(clinput=None, **kwargs)[source]
Analyzes eigenvectors using GROMACS anaeig.
- Parameters:
clinput (str, optional) (Input for the anaeig command.)
kwargs (Additional parameters for anaeig. Defaults include:)
- f (Trajectory file.)
- s (Structure file.)
- v (Eigenvector file, e.g. the one written by covar.)
- make_edi(clinput=None, **kwargs)[source]
Prepares files for essential dynamics analysis using GROMACS make-edi.
- Parameters:
clinput (str, optional) (Input for the make-edi command.)
kwargs (Additional parameters for make-edi. Defaults include:)
- f (Eigenvector file.)
- s (Structure file.)
- get_rmsf_by_chain(**kwargs)[source]
Calculates RMSF for each chain in the system using GROMACS rmsf.
- Parameters:
kwargs (Additional parameters for the rmsf command. Defaults include:)
- f (Trajectory file.)
- s (Structure file.)
- n (Index file.)
- res (Whether to output per-residue RMSF (default: “no”).)
- fit (Whether to fit the trajectory (default: “yes”).)
reforge.mdsystem.mdsystem module
File: mdsystem.py
- Description:
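This module provides base classes and functions for setting up, running, and analyzing molecular dynamics (MD) simulations. The main classes include:
MDSystem: Provides methods to prepare simulation files, process and coarse-grain PDB files, and aggregate analysis results.
MDRun: A subclass of MDSystem dedicated to executing MD simulations and performing post-processing analyses (e.g., covariance matrices, DFI, DCI).
- Usage:
Import this module and instantiate the MDSystem or MDRun classes, or use the engine-specific subclasses in gmxmd and mmmd.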
- Requirements:
Python 3.x
MDAnalysis
NumPy
Pandas
The reforge package and its dependencies
Author: DY Date: 2025-02-27
- class reforge.mdsystem.mdsystem.MDSystem(sysdir, sysname)[source]
Bases: object
Most attributes are paths to files and directories needed to set up and run the MD simulation.
- MDATDIR = PosixPath('/scratch/dyangali/reforge/reforge/martini/datdir')
- MITPDIR = PosixPath('/scratch/dyangali/reforge/reforge/martini/itp')
- NUC_RESNAMES = ['A', 'C', 'G', 'U', 'RA3', 'RA5', 'RC3', 'RC5', 'RG3', 'RG5', 'RU3', 'RU5']
- property chains
Retrieves and returns a sorted list of chain identifiers from the input PDB.
- Returns:
list: Sorted chain identifiers extracted from the PDB file.
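As an illustration of what extracting chain identifiers involves, the sketch below reads the chain ID from column 22 of ATOM/HETATM records, as defined by the PDB format. Both helpers are hypothetical and the records are made up; reforge may rely on pdbtools or MDAnalysis instead.

```python
def pdb_line(record, serial, name, resname, chain, resseq):
    """Build the first 26 columns of a PDB coordinate record (enough for this example)."""
    return f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}"


def chain_ids(pdb_text):
    """Return sorted unique chain IDs found in ATOM/HETATM records (column 22)."""
    chains = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chains.add(line[21])  # column 22 is index 21
    return sorted(chains)


# Made-up records spanning three chains, deliberately out of order.
pdb_text = "\n".join(
    [
        pdb_line("ATOM", 1, "CA", "ALA", "B", 1),
        pdb_line("ATOM", 2, "CA", "GLY", "A", 1),
        pdb_line("ATOM", 3, "P", "G", "C", 1),
        pdb_line("ATOM", 4, "CA", "SER", "A", 2),
    ]
)

chains = chain_ids(pdb_text)
```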
- property segments
Same as chains, but for segment IDs.
- prepare_files(pour_martini=False)[source]
Prepares the simulation by creating necessary directories and copying input files.
- The method:
Creates directories for proteins, nucleotides, topologies, maps, mdp files, coarse-grained PDBs, GRO files, MD runs, data, and PNG outputs.
Copies ‘water.gro’ and ‘atommass.dat’ from the master data directory.
Copies .itp files from the master ITP directory to the system topology directory.
- sort_input_pdb(in_pdb='inpdb.pdb')[source]
Sorts and renames atoms and chains in the input PDB file.
- Parameters:
in_pdb (str) (Path to the input PDB file (default: “inpdb.pdb”).)
Uses pdbtools to perform sorting and renaming, saving the result to self.inpdb.
- clean_pdb_mm(pdb_file, add_missing_atoms=False, add_hydrogens=False, pH=7.0, **kwargs)[source]
Clean the starting PDB file using PDBfixer by OpenMM.
- Parameters:
pdb_file (str) – Path to the input PDB file.
add_missing_atoms (bool, optional) – Whether to add missing atoms (default: False).
add_hydrogens (bool, optional) – Whether to add missing hydrogens (default: False).
pH (float, optional) – pH value for adding hydrogens (default: 7.0).
**kwargs (dict, optional) – Additional keyword arguments (ignored).
- clean_pdb_gmx(pdb_file, **kwargs)[source]
Cleans the PDB file using GROMACS pdb2gmx tool.
- Parameters:
pdb_file (str) (Input PDB file to clean. If None, uses self.inpdb.)
kwargs (Additional keyword arguments for the GROMACS command.)
After running pdb2gmx, cleans up temporary files (e.g., “topol*” and “posre*”).
- split_chains()[source]
Splits the input PDB file into separate files for each chain.
Nucleotide chains are saved to self.nucdir, while protein chains are saved to self.prodir. The determination is based on the residue names.
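A sketch of a residue-name-based classification using the NUC_RESNAMES list documented above. The rule used here, that a chain counts as a nucleotide chain only if every residue name is in the list, is an assumption; the actual criterion in split_chains may differ.

```python
NUC_RESNAMES = ["A", "C", "G", "U", "RA3", "RA5", "RC3", "RC5", "RG3", "RG5", "RU3", "RU5"]


def chain_kind(resnames):
    """Classify a chain as 'nucleotide' if all of its residue names are nucleotide names."""
    if resnames and all(name in NUC_RESNAMES for name in resnames):
        return "nucleotide"
    return "protein"


# Made-up residue-name lists for two chains.
kinds = {
    "A": chain_kind(["ALA", "CYS", "GLY"]),
    "B": chain_kind(["RG5", "C", "U", "RA3"]),
}
```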
- clean_chains_mm(**kwargs)[source]
Cleans chain-specific PDB files using PDBfixer (OpenMM).
Kwargs are passed to pdbtools.clean_pdb. Also renames chain IDs based on the file name.
- clean_chains_gmx(**kwargs)[source]
Cleans chain-specific PDB files using GROMACS pdb2gmx tool.
- Parameters:
kwargs (Additional keyword arguments for the GROMACS command.)
Processes all files in the protein and nucleotide directories, renaming chains and cleaning temporary files afterward.
- get_go_maps(append=False)[source]
Retrieves GO contact maps for proteins using the RCSU server.
http://info.ifpan.edu.pl/~rcsu/rcsu/index.html
- Parameters:
append (bool, optional) (If True, filters out maps that already exist in self.mapdir.)
- martinize_proteins_go(append=False, **kwargs)[source]
Performs virtual site-based GoMartini coarse-graining on protein PDBs.
Uses Martinize2 from https://github.com/marrink-lab/vermouth-martinize. All keyword arguments are passed directly to Martinize2. Run martinize2 -h to see the full list of parameters.
- Parameters:
append (bool, optional) (If True, only processes proteins for which corresponding topology files do not already exist.)
kwargs (Additional parameters for the martinize_go function.)
Generates .itp files and cleans temporary directories after processing.
- martinize_proteins_en(append=False, **kwargs)[source]
Generates an elastic network for proteins using the Martini elastic network model.
Uses Martinize2 from https://github.com/marrink-lab/vermouth-martinize. All keyword arguments are passed directly to Martinize2. Run martinize2 -h to see the full list of parameters.
- Parameters:
append (bool, optional) (If True, processes only proteins that do not already have corresponding topology files.)
kwargs (Elastic network parameters (e.g., elastic bond force constants, cutoffs).)
Modifies the generated ITP files by replacing the default molecule name with the actual protein name and cleans temporary files.
- martinize_nucleotides(**kwargs)[source]
Performs coarse-graining on nucleotide PDBs using the martinize_nucleotide tool.
- Parameters:
append (bool, optional) (If True, skips already existing topologies.)
kwargs (Additional parameters for the martinize_nucleotide function.)
After processing, renames files and moves the resulting ITP files to the topology directory.
- martinize_rna(append=False, **kwargs)[source]
Coarse-grains RNA molecules using the martinize_rna tool.
- Parameters:
append (bool, optional) (If True, processes only RNA files without existing topologies.)
kwargs (Additional parameters for the martinize_rna function.)
Exits the process with an error message if coarse-graining fails.
- find_resolved_ions(mask=('MG', 'ZN', 'K'))[source]
Identifies resolved ions in the input PDB file and writes them to “ions.pdb”.
- Parameters:
mask (list, optional) (List of ion identifiers to look for (default: [“MG”, “ZN”, “K”]).)
- count_resolved_ions(ions=('MG', 'ZN', 'K'))[source]
Counts the number of resolved ions in the system PDB file.
- Parameters:
ions (list, optional) – List of ion names to count (default: [“MG”, “ZN”, “K”]).
- Returns:
A dictionary mapping ion names to their counts.
- Return type:
dict
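The returned mapping can be pictured with a standard-library sketch that counts residue names (columns 18-20 of the PDB format) matching the ion list. The helpers and records below are hypothetical. Whether the real method also reports ions with a count of zero is not stated, so this sketch simply omits them.

```python
from collections import Counter


def pdb_line(record, serial, name, resname, chain, resseq):
    """Build the first 26 columns of a PDB coordinate record (enough for this example)."""
    return f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}"


def count_ions(pdb_text, ions=("MG", "ZN", "K")):
    """Count coordinate records whose residue name is one of the given ion names."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            resname = line[17:20].strip()  # columns 18-20 hold the residue name
            if resname in ions:
                counts[resname] += 1
    return dict(counts)


# Made-up records: one protein atom, two magnesium ions, one potassium ion.
pdb_text = "\n".join(
    [
        pdb_line("ATOM", 1, "CA", "ALA", "A", 1),
        pdb_line("HETATM", 2, "MG", "MG", "A", 101),
        pdb_line("HETATM", 3, "MG", "MG", "A", 102),
        pdb_line("HETATM", 4, "K", "K", "A", 103),
    ]
)

ion_counts = count_ions(pdb_text)
```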
- get_mean_sem(pattern='dfi*.npy')[source]
Calculates the mean and standard error of the mean (SEM) from numpy files.
- Parameters:
pattern (str, optional) (Filename pattern to search for (default: “dfi*.npy”).)
Saves the calculated averages and errors as numpy files in the data directory.
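The statistics involved can be sketched with NumPy on in-memory arrays. The function name is hypothetical, and using the sample standard deviation (ddof=1) is an assumption about the convention; the file discovery and saving steps are left out.

```python
import numpy as np


def mean_and_sem(arrays):
    """Element-wise mean and standard error of the mean across equally shaped arrays."""
    stacked = np.stack(arrays)  # shape: (n_runs, ...)
    mean = stacked.mean(axis=0)
    # SEM = sample standard deviation (ddof=1, an assumption) / sqrt(number of runs)
    sem = stacked.std(axis=0, ddof=1) / np.sqrt(stacked.shape[0])
    return mean, sem


# Made-up per-run profiles, e.g. one array per MD run.
runs = [np.array([1.0, 2.0]), np.array([3.0, 2.0]), np.array([5.0, 2.0])]
mean, sem = mean_and_sem(runs)
```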
- get_td_averages(pattern)[source]
Calculates time-dependent averages from a set of numpy files.
- Parameters:
pattern (str) – Filename pattern to pull files from the MD runs directory.
loop (bool, optional) – If True, processes files sequentially (default: True).
- Returns:
The time-dependent average.
- Return type:
numpy.ndarray
- class reforge.mdsystem.mdsystem.MDRun(sysdir, sysname, runname)[source]
Bases: MDSystem
Subclass of MDSystem for executing molecular dynamics (MD) simulations and performing post-processing analyses.
- get_covmats(u, ag, **kwargs)[source]
Calculates covariance matrices by splitting the trajectory into chunks.
- Parameters:
u (MDAnalysis.Universe, optional) (Pre-loaded MDAnalysis Universe; if None, creates one.)
ag (AtomGroup, optional) (Atom selection; if None, selects backbone atoms.)
sample_rate (int, optional) (Sampling rate for positions.)
b (int, optional) (Begin time/frame.)
e (int, optional) (End time/frame.)
n (int, optional) (Number of covariance matrices to calculate.)
outtag (str, optional) (Tag prefix for output files.)
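The idea of splitting a trajectory into chunks and computing one covariance matrix per chunk can be sketched with NumPy on random coordinates. This is an illustration under simplifying assumptions (no structural alignment, no sample_rate, b, or e handling); the function name is hypothetical and the actual implementation may differ.

```python
import numpy as np


def chunked_covariances(positions, n=2):
    """Split frames into n chunks and return one (3N x 3N) covariance matrix per chunk.

    positions has shape (n_frames, n_atoms, 3).
    """
    n_frames = positions.shape[0]
    flat = positions.reshape(n_frames, -1)  # one row of 3N coordinates per frame
    covmats = []
    for chunk in np.array_split(flat, n, axis=0):
        centered = chunk - chunk.mean(axis=0)  # remove the mean structure of the chunk
        covmats.append(centered.T @ centered / (chunk.shape[0] - 1))
    return covmats


# Random stand-in for a trajectory: 100 frames, 4 atoms.
rng = np.random.default_rng(0)
positions = rng.normal(size=(100, 4, 3))
covmats = chunked_covariances(positions, n=2)
```

Computing several matrices from separate chunks gives a set of independent estimates, which is what makes averages and error bars over downstream quantities possible.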
- get_pertmats(intag='covmat', outtag='pertmat')[source]
Calculates perturbation matrices from the covariance matrices.
- Parameters:
intag (str, optional) (Input file tag for covariance matrices.)
outtag (str, optional) (Output file tag for perturbation matrices.)
- get_dfi(intag='pertmat', outtag='dfi')[source]
Calculates Dynamic Flexibility Index (DFI) from perturbation matrices.
- Parameters:
intag (str, optional) (Input file tag for perturbation matrices.)
outtag (str, optional) (Output file tag for DFI values.)
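A sketch of one common definition of DFI: the total response of each residue, summed over all perturbed residues, divided by the total response of the whole structure. It assumes pertmat[i, j] holds the response magnitude of residue i to a perturbation at residue j; the index convention and normalization used by get_dfi are not stated above and may differ.

```python
import numpy as np


def dfi_from_pertmat(pertmat):
    """Row sums of the perturbation matrix, normalized so that the values sum to 1."""
    response = pertmat.sum(axis=1)  # total response of residue i over all perturbed j
    return response / response.sum()


# Made-up 2x2 perturbation matrix.
pertmat = np.array([[1.0, 1.0], [3.0, 3.0]])
dfi = dfi_from_pertmat(pertmat)
```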
- get_dci(intag='pertmat', outtag='dci', asym=False)[source]
Calculates the Dynamic Coupling Index (DCI) from perturbation matrices.
- Parameters:
intag (str, optional) (Input file tag for perturbation matrices.)
outtag (str, optional) (Output file tag for DCI values.)
asym (bool, optional) (If True, calculates asymmetric DCI.)
- get_group_dci(groups, labels, **kwargs)[source]
Calculates DCI between specified groups based on perturbation matrices.
- Parameters:
groups (list) (List of groups (atom indices or similar) to compare.)
labels (list) (Corresponding labels for the groups.)
asym (bool, optional) (If True, computes asymmetric group DCI.)
reforge.mdsystem.mmmd module
File: mmmd.py
- Description:
This module provides classes and functions for setting up, running, and analyzing molecular dynamics (MD) simulations using OpenMM. The main classes include:
MmSystem: A subclass of MDSystem that provides methods to prepare simulation files and process PDB files for OpenMM.
MmRun: A subclass of MDRun dedicated to executing MD simulations (energy minimization, heat-up, equilibration) and performing post-processing tasks.
- Usage:
Import this module and instantiate the MmSystem or MmRun classes to set up and run your MD simulations.
Author: DY
- class reforge.mdsystem.mmmd.MmSystem(sysdir, sysname, **kwargs)[source]
Bases: MDSystem
Subclass for OpenMM.
- clean_pdb(pdb_file, add_missing_atoms=False, add_hydrogens=False, pH=7.0, **kwargs)[source]
Clean the starting PDB file using PDBfixer by OpenMM.
- Parameters:
pdb_file (str) – Path to the input PDB file.
add_missing_atoms (bool, optional) – Whether to add missing atoms (default: False).
add_hydrogens (bool, optional) – Whether to add missing hydrogens (default: False).
pH (float, optional) – pH value for adding hydrogens (default: 7.0).
**kwargs (dict, optional) – Additional keyword arguments (ignored).
- class reforge.mdsystem.mmmd.MmRun(sysdir, sysname, runname)[source]
Bases: MDRun
- em(simulation, tolerance=10, max_iterations=1000)[source]
Perform energy minimization for the simulation.
- Parameters:
simulation (openmm.app.Simulation) – The simulation object.
tolerance (float, optional) – RMS force tolerance for energy minimization, in kJ/mol/nm (default: 10).
max_iterations (int, optional) – Maximum number of iterations (default: 1000).
Notes
Minimizes the energy, saves the minimized state, and logs progress.
- hu(simulation, temperature, n_cycles=100, steps_per_cycle=100, **kwargs)[source]
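Run the heat-up phase.
- Parameters:
simulation (openmm.app.Simulation) – The simulation object.
temperature – Temperature for the heat-up phase.
n_cycles (int, optional) – Number of cycles (default: 100).
steps_per_cycle (int, optional) – Number of steps per cycle (default: 100).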
**kwargs (dict, optional) – Additional keyword arguments.
Notes
Loads the minimized state, runs heatup, and saves the final state.
- eq(simulation, n_cycles=100, steps_per_cycle=100, **kwargs)[source]
Run equilibration.
- Parameters:
simulation (openmm.app.Simulation) – The simulation object.
n_cycles (int, optional) – Number of cycles (default: 100).
steps_per_cycle (int, optional) – Number of steps per cycle (default: 100).
**kwargs (dict, optional) – Additional keyword arguments.
Notes
Loads the heated state, runs equilibration, and saves the equilibrated state.
- class reforge.mdsystem.mmmd.MmReporter(file, reportInterval, enforcePeriodicBox=None, selection: str = None, writer_kwargs: dict = None)[source]
Bases: object
Most of this code is adapted from https://github.com/sef43/openmm-mdanalysis-reporter. MDAReporter outputs a series of frames from a Simulation to any file format supported by MDAnalysis. To use it, create an MDAReporter, then add it to the Simulation’s list of reporters.
- describeNextReport(simulation)[source]
Get information about the next report this object will generate.
- Parameters:
simulation (Simulation) – The Simulation to generate a report for.
- Returns:
A six element tuple. The first element is the number of steps until the next report. The next four elements specify whether that report will require positions, velocities, forces, and energies respectively. The final element specifies whether positions should be wrapped to lie in a single periodic box.
- Return type:
tuple
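The six-element tuple follows the usual OpenMM reporter pattern, in which the number of steps until the next report is derived from the simulation's current step and the report interval. The class below is a hypothetical, dependency-free illustration of that interface, not MmReporter itself; a SimpleNamespace stands in for the Simulation object, and requesting only positions is an assumption made for the example.

```python
from types import SimpleNamespace


class StepRecorder:
    """Hypothetical minimal reporter that only records the steps at which it is called."""

    def __init__(self, reportInterval, enforcePeriodicBox=None):
        self._reportInterval = reportInterval
        self._enforcePeriodicBox = enforcePeriodicBox
        self.seen = []

    def describeNextReport(self, simulation):
        # Steps remaining until the current step is a multiple of the interval.
        steps = self._reportInterval - simulation.currentStep % self._reportInterval
        # (steps, positions, velocities, forces, energies, wrap positions)
        return (steps, True, False, False, False, self._enforcePeriodicBox)

    def report(self, simulation, state):
        self.seen.append(simulation.currentStep)


# Stand-in for an openmm.app.Simulation that is currently at step 130.
fake_simulation = SimpleNamespace(currentStep=130)
reporter = StepRecorder(reportInterval=50)
description = reporter.describeNextReport(fake_simulation)
reporter.report(fake_simulation, state=None)
```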