COLLECTIVE VARIABLES MODULE
Reference manual for VMD
Code version: 2024-10-16
Alejandro Bernardin, Haochuan Chen, Jeffrey R. Comer, Giacomo Fiorin, Haohao Fu, Jérôme
Hénin, Axel Kohlmeyer, Fabrizio Marinelli, Hubert Santuz, Joshua V. Vermaas, Andrew D.
White
In molecular dynamics simulations, it is often useful to reduce the large number of degrees of freedom of a physical system to a few parameters whose statistical distributions can be analyzed individually, or used to define biasing potentials to alter the dynamics of the system in a controlled manner. These have been called ‘order parameters’, ‘collective variables’, ‘(surrogate) reaction coordinates’, and many other terms.
Here we primarily use the term ‘collective variable’, often shortened to colvar, to indicate any differentiable function of atomic Cartesian coordinates, $\mathbf{x}_{i}$, with $i$ between $1$ and $N$, the total number of atoms:
$$\xi(\mathbf{X}) = \xi(\mathbf{x}_{1}, \mathbf{x}_{2}, \ldots, \mathbf{x}_{N})$$
This manual documents the collective variables module (Colvars), software that provides an implementation for the functions $\xi(\mathbf{X})$ with a focus on flexibility, robustness and high performance. The module is designed to perform multiple tasks concurrently during or after a simulation, the most common of which are:
Note: although restraints and PMF algorithms are primarily used during simulations, they are also available in VMD to test a new input for a simulation, or to evaluate the relative free energy of a new structure based on data from a previous calculation. Options that only have an effect during a simulation are also included for compatibility purposes.
Detailed explanations of the design of the Colvars module are provided in reference [1]. Please cite this reference whenever publishing work that makes use of this module, alongside any other publications for specific features being used, according to the usage summary printed when running a Colvars-enabled MD simulation or analysis.
Using the Colvars module in VMD
Within VMD [2], the Colvars module can be accessed in two ways:
The Colvars configuration is a plain text file or string that defines collective variables, biases, and general parameters of the Colvars module. It is passed to the module using back-end-specific commands documented in section 4. Writing the configuration for a collective variable in VMD is made much easier using the Dashboard and its configuration editor (section 3). However, note that the Dashboard does not handle biases: if necessary, they should be managed separately using the scripting interface.
Example: steering two groups of atoms away from each other. Now let us look at a complete, non-trivial configuration. Suppose that we want to run a steered MD experiment where a small molecule is pulled away from a protein binding site. In Colvars terms, this is done by applying a moving restraint to the distance between the two objects. The configuration will contain two blocks, one defining the distance variable (see section 5 and 5.3.1), and the other the moving harmonic restraint (7.7). Note that in VMD, no biasing forces are applied, but biases may be useful in the context of an analysis script, e.g. to collect histograms or to compute bias energies.
colvar {
  name dist
  distance {
    group1 { atomNumbersRange 42-55 }
    group2 { indexGroup C-alpha_15-30 }
  }
}

harmonic {
  colvars dist
  forceConstant 20.0
  centers 4.0          # initial distance
  targetCenters 15.0   # final distance
  targetNumSteps 500000
}
Reading this input in plain English: the variable here named dist consists of a distance function between the centers of two groups: the ligand (atoms 42 to 55) and the $\alpha$-carbon atoms of residues 15 to 30 in the protein (listed here in the index group C-alpha_15-30). To the “dist” variable, we apply a harmonic potential with a force constant of 20 energy unit/(length unit)$^{2}$, initially centered around a value of 4 length units, which will increase to 15 length units over 500,000 simulation steps.
The atom selection keywords are detailed in section 6.
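As a rough illustration of the schedule described above, the following Tcl sketch computes where the restraint center would sit at a given step. It assumes that the center moves linearly from centers to targetCenters over targetNumSteps and then stays at the target; the procedure name restraint_center is hypothetical and is not a Colvars command.

```tcl
# Hypothetical helper (not part of Colvars): position of a moving harmonic
# restraint center, ASSUMING linear interpolation between "centers" and
# "targetCenters" over "targetNumSteps" steps.
proc restraint_center { step {start 4.0} {target 15.0} {num_steps 500000} } {
    # Clamp the step so that the center does not move past either end
    if { $step < 0 } { set step 0 }
    if { $step > $num_steps } { set step $num_steps }
    return [expr { $start + ($target - $start) * double($step) / $num_steps }]
}

# Print the center at the start, halfway, and at the end of the schedule
foreach step {0 250000 500000} {
    puts "step $step: center = [restraint_center $step]"
}
```

The default arguments mirror the values in the example configuration; other schedules can be explored by passing different start, target and step counts.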
Example: using multiple variables and multiple biasing/analysis methods together. A more complex example configuration is included below, showing how a variable may be constructed by combining multiple existing functions, and how multiple variables or multiple biases may be used concurrently. The colvar indicated below as “$d$" is defined as the difference between two distances (see 5.3): the first distance (${d}_{1}$) is taken between the center of mass of atoms 1 and 2 and that of atoms 3 to 5, the second (${d}_{2}$) between atom 7 and the center of mass of atoms 8 to 10 (see 6). The difference $d={d}_{1}-{d}_{2}$ is obtained by multiplying the two by a coefficient $C=+1$ or $C=-1$, respectively (see 5.17). The colvar called “$c$" is the coordination number calculated between atoms 1 to 10 and atoms 11 to 20. A harmonic restraint (see 7.7) is applied to both $d$ and $c$: to allow using the same force constant $K$, both $d$ and $c$ are scaled by their respective fluctuation widths ${w}_{d}$ and ${w}_{c}$. A third colvar “alpha" is defined as the $\alpha $-helical content of residues 1 to 10 (see 5.8.1). The values of “$c$" and “alpha" are also recorded throughout the simulation as a joint 2-dimensional histogram (see 7.12).
colvar {
  # difference of two distances
  name d
  width 0.2  # estimated fluctuation width
  distance {
    componentCoeff 1.0
    group1 { atomNumbers 1 2 }
    group2 { atomNumbers 3 4 5 }
  }
  distance {
    componentCoeff -1.0
    group1 { atomNumbers 7 }
    group2 { atomNumbers 8 9 10 }
  }
}

colvar {
  name c
  coordNum {
    cutoff 6.0
    group1 { atomNumbersRange 1-10 }
    group2 { atomNumbersRange 11-20 }
    tolerance 1.0e-6
    pairListFrequency 1000
  }
}

colvar {
  name alpha
  alpha {
    psfSegID PROT
    residueRange 1-10
  }
}

harmonic {
  colvars d c
  centers 3.0 4.0
  forceConstant 5.0
}

histogram {
  colvars c alpha
}
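To make the role of the width parameter in this example more concrete, the sketch below evaluates a harmonic restraint energy by hand. It assumes the form $\frac{1}{2} K \sum_i \left((\xi_i - \xi_{0,i})/w_i\right)^2$, with each variable scaled by its width. The procedure harmonic_energy and the sample values of d and c are hypothetical; the width of c is taken as 1.0 (the default) because the configuration above does not set it.

```tcl
# Hypothetical helper (not a Colvars command): energy of a harmonic
# restraint acting on several variables, ASSUMING each deviation from the
# center is divided by the width of the corresponding variable.
proc harmonic_energy { k values centers widths } {
    set energy 0.0
    foreach x $values x0 $centers w $widths {
        set scaled [expr { ($x - $x0) / double($w) }]
        set energy [expr { $energy + 0.5 * $k * $scaled * $scaled }]
    }
    return $energy
}

# Illustrative values only: d = 3.1 and c = 4.5, restrained at 3.0 and 4.0
# with forceConstant 5.0, width 0.2 for d and the default 1.0 for c
puts [harmonic_energy 5.0 {3.1 4.5} {3.0 4.0} {0.2 1.0}]
```

With these numbers, a deviation of 0.1 in d contributes as much energy as a deviation of 0.5 in c, which is the point of scaling by the widths.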
The Colvars Dashboard is a graphical interface for interactive visualization and refinement of collective variables aided by molecular structures and trajectories. It is accessible in VMD's Main Menu under “Extensions/Analysis/Colvars Dashboard". Throughout the interface, keyboard shortcuts for common operations are indicated in square brackets. Context menus in the colvar and bias tables can be accessed using either right click or Control-click.
Here are the steps for a quick first tour of the Dashboard:
Alternatively, use the Automatic colvars features in the Actions tab.
Now, clicking “Edit" in the Dashboard window (or right-clicking on a colvar in the list view), you can modify the collective variable to reflect interesting geometric properties of the system. The power of the collective variables approach lies in the variety of geometric functions (“components") and their combinations. The editor window provides a number of helpers to make it easy and quick to define the most relevant variables. See section 3.5 for details.
The Dashboard window displays a table listing currently defined variables, and their values for the current frame indicated at the bottom of the window. By default the frame is updated to track VMD's currently displayed frame, but that can be changed by toggling the “Track frame” checkbox, e.g. to animate the trajectory without recomputing expensive variables. Vector-valued variables can be expanded to list their scalar elements. This is necessary when individual scalar quantities have to be selected for plotting. Other operations act on variables as a whole and ignore specific selected scalar elements. Right-clicking in the table brings up a context menu with specific actions for the selected variables.
Buttons above the table allow for general operations on the state of the Colvars module. Below the table, the Actions tab allows operations on selected variables, while the Settings tab offers advanced settings for the various visualizations.
If variables are modified, added or deleted by an external script, hit “Refresh" or press F5 to update the displayed variables and values. Starting the Dashboard also enables trajectory animation using the left/right arrow keys within VMD's graphical window. Atomic coordinates can be modified using VMD's “Mouse/Move" functions, and the Colvars module can then be updated by pressing F5 directly from the graphical window.
A dropdown list allows for changing the current unit system if no variables are defined (see section 4.1). When changing units, it is strongly recommended to check the configuration for all colvars to ensure that all quantities are expressed in the desired set of units.
The Colvars Module interacts with one VMD Molecule at a time. At the bottom of the Actions tab, a dropdown lets the user change which VMD molecule is associated with the Colvars module. Internally, changing it means recording the configuration of currently defined colvars, deleting the current instance of the Colvars module, creating a new one linked to the target molecule, and applying the saved configuration. Beware of incompatible colvar definitions, such as atom groups listing atom IDs that exist in one molecule, but not the other. Auto-updating selections (see below) can be used to adapt the colvar definitions to a different system using VMD selection texts.
The “Save” button saves the current configuration of the Colvars module to a file. This includes comments found in the input, general parameters of the Colvars module, the collective variables themselves, and collective variable biases. The resulting configuration file can be read by Colvars, either back in VMD, or within a supported MD engine.
The Actions tab offers two buttons to generate sets of colvars from the current VMD session:
These automatically generated colvars can be used as-is, or customized using the configuration editor (Edit in the context menu, or in the Actions tab).
The configuration editor can be started with the “Edit" or “New" buttons. Using the “Edit" button, the configuration of selected variables is loaded, and those variables will be replaced when applying the new configuration.
The editor window offers links to online documentation, as well as helpers to write correct configuration files.
As a first step, the most useful helper is the collection of template files. Some parameters that must be supplied are indicated by the symbol @. Colvar templates can be inserted at the beginning of the configuration, whereas “component" templates define basis functions that belong inside a colvar block. Templates are indented using 4 spaces per level to indicate their position in the nested structure of the configuration: general options, colvars and biases at level 0, bias and colvar parameters like components at level 1, component parameters such as atom groups at level 2, and atom group parameters at level 3.
The next helper buttons allow importing atom selections from VMD, either by typing a VMD atom selection text, by copying the selection of an existing graphical representation, or by inserting the list of atoms currently labeled in VMD using the “Pick atom” feature. Atom selections should be inserted within an atom group block, within a component block (such as distance). By default atom selections are preceded by a comment line marking them as auto-updating. This instructs the Dashboard to update the list of atoms whenever the configuration is applied, that is, when it is edited, when the file is loaded by the Dashboard, or when changing the VMD molecule linked to Colvars. This is useful when working across systems with different atom numberings, but topologies that make the relevant atom groups identifiable using VMD selection texts. Special fields (O, B, and user) may be used, as well as atom positions (e.g. z > 0). If the selection text is modified manually, the atom list will be updated when applying the new configuration. This auto-updating behavior can be disabled by removing the special comment line or altering the keywords “auto-updating selection”.
Note that atom lists are not auto-updated:
The “Insert labeled..." button combined with the selection box allows for inserting components matching VMD's geometry measurements: Bonds (distances), angles, and dihedrals. Hidden labels are not used for inserting components.
Timeline plots show the selected variables as a function of time. A vertical bar indicates the current frame, which can be changed either using VMD's trajectory animation controls, or directly in the plot window by clicking the mouse inside the graph, or using the keyboard left/right arrows. Shift+arrow steps by 10 frames for faster animation, and Ctrl+arrow steps over 50 frames. The Home/End keys jump to the first and last frames, respectively. The up/down arrows zoom in and out along the time axis. Visible data can be fitted vertically using the v key. All data can be fitted horizontally using the h key.
Pairwise scatterplots are useful to identify correlation between variables. To create a pairwise plot, select exactly two scalar variables (or scalar components of vector variables), and click “Pairwise plot". Frames are represented by circles, and lines connect consecutive frames. The blue dot tracks the current frame: clicking a circle jumps to the corresponding frame. Arrow keys and Home/End animate the trajectory as in the timeline plot.
Histogram displays a histogram of the selected scalar variable over the trajectory, using approximately the number of bins specified in the Settings tab (adjusted after selecting a round number for the bin width). The current value of the colvar is indicated by a vertical bar. Clicking anywhere in the plot jumps to the frame with the nearest value of the colvar. Keyboard navigation of the trajectory is similar to the other plots, except that the trajectory is traversed according to colvar values rather than frame numbers. This enables exploration of frames corresponding to a neighborhood of colvar values, even when they are not contiguous in time.
“Show atoms” creates representations of the atoms involved in the definition of the selected colvars. Each atom group is shown in a different color. “Show gradient” is available for scalar variables only. It creates a graphical representation of the atomic gradients of the selected variables, visualizing how the value of the collective variable would vary in response to a change in atomic coordinates. The graphical representation of the gradients is controlled by parameters in the Settings tab. Vectors representing the gradient are rescaled as indicated by the radio buttons Max. vector norm and Scaling factor. Max. vector norm rescales gradients so that the largest vector component of each colvar's gradient is represented by an arrow of the specified length, in Å. Scaling factor rescales gradients by the specified factor, divided by the colvar's width parameter (1 by default, see 5.20). This use of width makes it easier to compare the gradients of collective variables that are not commensurate. The scaling factor has the unit $L/(cv/width)$, where $L$ is the current length unit, and $cv$ represents the natural unit of the collective variable. By default $width$ is unity, but $(cv/width)$ may be seen as dimensionless if $width$ is expressed in cv units.
The Dashboard offers visualizations for specific types of variables:
Parameters for fine-tuning these visualizations can be set in the Settings tab.
The main Dashboard window features a "Biases" tab that contains a list of currently defined biases, their current energy, and the colvars that they depend upon. A new harmonic bias acting on a given set of colvars can be created by selecting the colvars and opening the context menu in the colvar table (right click or control-click). A button enables creating a Timeline plot with the energies of currently selected biases, similar to the colvar Timeline plot. Only one plot can be active at a time. Another button creates a visualization of the biasing forces computed by the selected biases, similar to the gradient display for colvars. The graphical representation of forces is controlled by the same parameters as the gradients, shown in the Settings tab. The difference is the scaling factor unit, which is Ångström per Colvars force unit (that is, Ångström times Colvars length unit per Colvars energy unit).
Note that the energy of history-dependent biases such as ABF or metadynamics may not be updated in VMD. However, values of the bias energy from a previous simulation can be displayed if a state file has been loaded with cv load (section 4.2), or in the case of ABF, if inputPrefix was specified.
Hitting the Save button will save the configuration of any bias listed in this tab, except the harmonic wall biases that are created automatically from a colvar for which the lower or upper wall options are specified.
Sometimes, important time-dependent quantities are available for a trajectory, but must be computed by an external tool (e.g. energies from a simulation). These can be visualized alongside colvars by taking advantage of the scripted colvar feature.
Such a custom variable can either be computed on the fly (by calling an external tool or a VMD feature) or pre-computed and cached. For example, we can use VMD to compute the protein solvent-accessible surface area (SASA) on the fly:
set sasa_sel [atomselect top "protein and noh"]

proc calc_user_colvar { x } {
    # Follow current Colvars frame
    $::sasa_sel frame [cv frame]
    return [measure sasa 1 $::sasa_sel]
}
The Tcl script above should be entered into VMD's text console, or sourced from a file. Then we can define a colvar that will be calculated by the scripted procedure above, by entering the following configuration as a new colvar in the Dashboard Editor Window:
colvar {
  name user
  scriptedFunction user_colvar
  distance {
    # The distance component is just a placeholder
    group1 {
      dummyatom (0, 0, 0)
    }
    group2 {
      dummyatom (0, 0, 0)
    }
  }
}
If the SASA computation is slow, it can be cached to make subsequent interaction responsive. To that effect, we source the following script to pre-compute the SASA for every frame and store it in VMD's User field for atom 0.
set atom0_sel [atomselect top "index 0"]

for {set f 0} {$f < [molinfo top get numframes]} {incr f} {
    $::sasa_sel frame $f
    $::atom0_sel frame $f
    $::atom0_sel set user [measure sasa 1 $::sasa_sel]
}
Then we use the following Tcl procedure to compute a custom variable that is simply equal to the User field for atom 0 for the current frame:
proc calc_user_colvar { x } {
    # Follow current Colvars frame
    $::atom0_sel frame [cv frame]
    return [$::atom0_sel get user]
}
This version results in extremely fast updates.
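As an alternative to the User field, the cached values could also be kept in a Tcl list. The following is only a sketch under that assumption: the names fill_sasa_cache and ::sasa_cache are hypothetical, and the measure sasa call simply mirrors the one used above.

```tcl
# Hypothetical cache: one SASA value per trajectory frame
set ::sasa_cache {}

# Fill the cache once; requires VMD with a trajectory loaded as "top"
proc fill_sasa_cache {} {
    set sel [atomselect top "protein and noh"]
    set ::sasa_cache {}
    set num_frames [molinfo top get numframes]
    for { set f 0 } { $f < $num_frames } { incr f } {
        $sel frame $f
        lappend ::sasa_cache [measure sasa 1 $sel]
    }
    $sel delete
}

# Scripted-colvar procedure: look up the value cached for the current
# Colvars frame (this replaces the earlier definitions of the same name)
proc calc_user_colvar { x } {
    return [lindex $::sasa_cache [cv frame]]
}
```

In this variant, fill_sasa_cache would need to be called once after the trajectory is loaded, and again whenever frames are added or removed, since the list is not updated automatically.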
Here, we document the syntax of the commands and parameters used to set up and use the Colvars module in VMD. One of these parameters is the configuration file or the configuration text for the module itself, whose syntax is described in 4.3 and in the following sections.
The “internal units" of the Colvars module are the units in which values are expressed in the configuration file, and in which collective variable values, energies, etc. are expressed in the output and colvars trajectory files. Generally the Colvars module uses internally the same units as its back-end MD engine, with the exception of VMD, where different unit sets are supported to allow for easy setup, visualization and analysis of Colvars simulations performed with any simulation engine.
Note that angles are expressed in degrees, and derived quantities such as force constants are based on degrees as well. Some colvar components have default values, expressed in Ångström (Å) in this documentation. They are converted to the current length unit, if different from Å. Atomic coordinates read from XYZ files (and PDB files where applicable) are expected to be expressed in Ångström, no matter what unit system is in use by the back-end (VMD) or the Colvars module. They are converted internally to the current length unit as needed. Note that force constants in harmonic and harmonicWalls biases (7.7) are rescaled according to the width parameter of colvars, so that they are formally in energy units, although if width is given its default value of 1.0, force constants are effectively expressed in energy unit/(colvar unit)$^{2}$.
To avoid errors due to reading configuration files written in a different unit system, the unit system can be specified within the input.
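A minimal sketch of such a declaration is given below, passed as an inline configuration string. It assumes that the global keyword is named units and that the value real selects Å and kcal/mol; both should be checked against the list of unit systems supported by the version in use. The call is guarded so that the script does nothing outside VMD.

```tcl
# ASSUMPTION: "units" is the global Colvars keyword that declares the unit
# system expected by the rest of the configuration, and "real" is one of
# its accepted values.
set colvars_units_conf "units real"

# Only attempt the call when the cv command is available (i.e. inside VMD,
# with a molecule already loaded)
if { [llength [info commands cv]] > 0 } {
    cv molid top
    cv config $colvars_units_conf
}
```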
After the first initialization of the Colvars module, the internal state of Colvars objects may be queried or modified in a VMD script:
cv <method> arg1 arg2 ...
where <method> is the name of a specific procedure and arg1, arg2, … are its required and/or optional arguments.
In VMD, the cv command is used by the Dashboard graphical interface (section 3), but can also be used in scripts or interactively from the command-line terminal (for example, in remote terminal sessions) or in the Tk Console.
In the remainder of this section, the most frequently used commands of the Colvars scripting interface are discussed and exemplified. For a full list of scripting commands available, see section 8.
The first step to using Colvars in VMD is choosing which “molecule” (i.e. which system) to work with: because VMD can handle multiple “molecules”, the Colvars module needs to remain attached to a specific VMD molecule. For example:
cv molid "top"
will attach the Colvars module onto the molecule currently holding the “top" status (alternatively, you can refer to a molecule by its numeric ID in lieu of top). All following invocations of the cv command will continue operating on the same molecule, regardless of whether other molecules are loaded, or which one has the “top" status. The cv molid command without argument will return the molid currently associated with Colvars.
To define new collective variables and/or biases for immediate use in the current session, configuration can be loaded from an external configuration file:
cv configfile "colvars-file.in"
This can in principle be called at any time, if only flags internal to Colvars are being modified. In practice, when new atoms or any new atomic properties (e.g. total forces) are being requested, initialization steps will be required that are not carried out during a simulation. Therefore, it is generally good practice in a simulation to change the Colvars configuration only between segments of the same computation. However, in VMD initialization is always immediate, allowing interactive usage by Tcl scripts or the Dashboard.
To load the configuration directly from a string the “config" method may be used:
cv config "keyword { ... }"
The vast majority of the syntax in Colvars is backward-compatible, with keywords added when new features are introduced. However, when using multiple versions simultaneously it may be useful to test within the script whether the version is recent enough to support the desired feature. The “version” method can be used to get the Colvars version for this purpose:
if { [cv version] >= "2020-02-25" } {
cv config "(use a recent feature)"
}
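The comparison above relies on version strings being dates in YYYY-MM-DD form, which sort correctly as plain strings. A slightly more explicit way to write the same test is sketched below; the procedure name colvars_version_at_least is hypothetical, and the sketch assumes that cv version keeps returning a date in this format.

```tcl
# Hypothetical helper: compare two Colvars version strings, ASSUMING both
# are dates in YYYY-MM-DD format (which sort correctly as strings).
proc colvars_version_at_least { current required } {
    return [expr { [string compare $current $required] >= 0 }]
}

# Stand-alone check using the code version quoted in this manual
puts [colvars_version_at_least "2024-10-16" "2020-02-25"]

# Inside VMD, the current version would come from the cv command
if { [llength [info commands cv]] > 0 } {
    if { [colvars_version_at_least [cv version] "2020-02-25"] } {
        puts "This Colvars version is recent enough"
    }
}
```

Using string compare makes the intent explicit and avoids relying on how expr treats operands that might look like numbers.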
After a configuration is fully defined, the “load” method may be used to load a state file from a previous simulation (containing e.g. data from history-dependent biases), to either continue that simulation or analyze its results:
cv load "<oldjob>.colvars.state"
or more simply using the prefix of the state file itself:
cv load "<oldjob>"
The “save” method, analogous to “load”, saves all restart information to a state file. This is normally not required during a simulation if colvarsRestartFrequency is defined (either directly or indirectly by the VMD restart frequency). Because not only a state file (used to continue simulations) but also other data files (used to analyze the trajectory) are written, it is generally recommended to call the save method using a prefix, rather than a complete file name:
cv save "<job>"
For computational efficiency the Colvars module keeps internal copies of the numeric IDs, masses, charges, positions, and optionally total forces of the atoms requested for a Colvars computation. At each simulation step, up-to-date versions of these properties are copied from the central memory of VMD into the internal memory of the Colvars module. In a post-processing workflow or outside a simulation (e.g. when using VMD), this copy can be carried out as part of the update method:
cv update
which also performs the (re-)computation of all variables and biases defined in Colvars.
For example, the current sequence of numeric IDs of the atoms processed by Colvars can be obtained as:
cv getatomids
and their current positions as:
cv getatompositions
This may prove useful to test the correctness of the coordinates passed to Colvars, particularly in regard to periodic boundary conditions (6.3). There is currently no mechanism to modify the above fields via the scripting interface, but such capability will be added in the future.
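For a quick visual check of the coordinates, the two lists can be printed side by side. The sketch below assumes that getatompositions returns one coordinate triplet per atom, in the same order as getatomids; the procedure name format_atom_table is hypothetical.

```tcl
# Hypothetical helper: build a text table pairing each atom ID with its
# position. ASSUMES both lists have the same length and ordering.
proc format_atom_table { ids positions } {
    set lines {}
    foreach id $ids pos $positions {
        lappend lines [format "%6d  %s" $id $pos]
    }
    return [join $lines "\n"]
}

# Inside VMD: refresh the internal copies, then print the table
if { [llength [info commands cv]] > 0 } {
    cv update
    puts [format_atom_table [cv getatomids] [cv getatompositions]]
}
```

Comparing this table with the coordinates shown by VMD for the same atoms is one way to spot unexpected wrapping across periodic boundaries.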
While running a simulation, or when setting one up in VMD, it is possible to examine all the forces that were last applied by Colvars to the atoms, or are about to be applied:
cv getatomappliedforces
where the length and order of this sequence matches that provided by the getatomids method. A simpler way of testing the stability of a Colvars configuration before or during a simulation makes use of aggregated data, such as the energy:
cv getenergy
the root-mean-square of the Colvars applied forces:
cv getatomappliedforcesrms
or the maximum norm of the applied forces:
cv getatomappliedforcesmax
which can be matched to a specific atom via its numeric ID obtained as:
cv getatomappliedforcesmaxid
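These aggregate queries can be combined into a small sanity check, sketched below. The procedure name check_max_applied_force and the threshold value are hypothetical; the threshold is assumed to be expressed in Colvars force units.

```tcl
# Hypothetical helper: report whether the largest Colvars applied force
# exceeds a user-chosen threshold, and on which atom it acts.
proc check_max_applied_force { threshold } {
    set fmax [cv getatomappliedforcesmax]
    if { $fmax > $threshold } {
        set id [cv getatomappliedforcesmaxid]
        return "Largest applied force $fmax acts on atom $id (above $threshold)"
    }
    return "Largest applied force $fmax is within $threshold"
}

# Inside VMD, after variables and biases have been updated:
if { [llength [info commands cv]] > 0 } {
    cv update
    puts [check_max_applied_force 100.0]
}
```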
See 8.1 for a complete list of scripting commands used to manage atomic data and runtime parameters of the Colvars module.
One of the typical uses of Colvars in VMD is computing the values of one or more variables along an existing trajectory. This is most easily done using the Dashboard (section 3), but can also be done by a script, as in the example below:
# Activate the module on the current VMD molecule
cv molid top
# Load a Colvars config file
cv configfile test.in

set out [open "test.colvars.traj" "w"]
# Write the labels to the file
puts -nonewline ${out} [cv printframelabels]
for { set fr 0 } { ${fr} < [molinfo top get numframes] } { incr fr } {
    # Point Colvars to this trajectory frame
    cv frame ${fr}
    # Recompute variables and biases (required in VMD)
    cv update
    # Print variables and biases to the file
    puts -nonewline ${out} [cv printframe]
}
close ${out}
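When only one variable is of interest, its values can be collected in a Tcl list instead of a file. The following sketch follows the same loop; the procedure name colvar_time_series is hypothetical, and it relies on the value method of cv colvar to return the current value of the named variable.

```tcl
# Hypothetical helper: values of one colvar over all frames of the "top"
# molecule. Requires VMD, with Colvars already attached and configured.
proc colvar_time_series { name } {
    set values {}
    set num_frames [molinfo top get numframes]
    for { set fr 0 } { $fr < $num_frames } { incr fr } {
        # Point Colvars to this frame and recompute all variables
        cv frame $fr
        cv update
        lappend values [cv colvar $name value]
    }
    return $values
}
```

For instance, after cv molid top and cv configfile test.in, calling colvar_time_series with the name of a variable defined in test.in would return one value per frame.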
After one or more collective variables are defined, they can be accessed with the following syntax.
cv colvar "<name>" <method> arg1 arg2 ...
where “<name>” is the name of the variable.
For example, to recompute the collective variable “xi" after a change in its parameters, the following command can be used:
cv colvar "xi" update
This ordinarily is not needed during a simulation run, where all variables are recomputed at every step (along with biasing forces acting on them). However, when analyzing an existing trajectory, e.g. in VMD, a call to update is generally required.
While in all typical cases all configuration of the variables is done with the “config" or “configfile" methods, a limited set of changes can be enacted at runtime using:
cv colvar "<name>" modifycvcs arg1 arg2 ...
where each argument is a string passed to the function or functions that are used to compute the variable, and are called colvar components, or CVCs (5.1). For example, a variable “DeltaZ" made of a single “distanceZ" component can be made periodic with a period equal to the unit cell dimension along the $Z$-axis:
cv colvar "DeltaZ" modifycvcs "period ${Lz}"
Please note that this option is currently limited to changing the polynomial superposition parameters componentCoeff and componentExp, and to updating on the fly the period, wrapAround or forceNoPBC options for components where they are relevant.
If the variable is computed using many components, it is possible to selectively turn some of them on or off:
cv colvar "<name>" cvcflags <flags>
where “<flags>” is a list of 0/1 values, one per component. This is useful for example when script-based path collective variables in Cartesian coordinates (5.11.3) are used, to minimize computational cost by disabling the computation of terms that are very close to zero.
Important: None of the changes enacted by the “modifycvcs” or “cvcflags” methods are saved to state files: they will be lost when restarting a simulation, deleting the corresponding collective variable, or resetting the module with the “delete” or “reset” methods.
As soon as a colvar “xi” and its associated biasing potentials are up to date (i.e. during an MD run, or after the respective “update” methods have been called), the force applied onto the colvar is known and may be accessed through the getappliedforce method:
cv colvar "xi" getappliedforce
See also the use of the outputAppliedForce option to have this force be saved to file during a simulation.
Aside from the biasing methods already implemented within Colvars (section 7), this force may be incremented ad hoc, for example as part of a custom restraint implemented by scriptedColvarForces:
cv colvar "xi" addforce <force>
where “<force>” is a scalar or a vector (depending on the type of variable “xi”). Note that in VMD, any forces applied via addforce do not have any effect on the movement of atoms: this feature is only available for compatibility.
For certain types of variable, the force applied directly on a colvar may be combined with those acting indirectly on it via the interatomic force field, making up the total force. When the outputTotalForce keyword is enabled, or when a biasing method that makes explicit use of the total force is enabled, the total force may be obtained as:
cv colvar "xi" gettotalforce
Note that not all types of variable support total-force computation, and the value of the total force may not be available immediately within the same simulation step: see the documentation of outputTotalForce for more details.
See 8.2 for a complete list of scripting commands used to manage collective variables.
Because biases depend only upon data internal to the Colvars module (i.e. they do not need atomic coordinates from VMD), it is generally easy to create them or update their configuration at any time. For example, given the most current value of the variable “xi", an already-defined harmonic restraint on it named “h_xi" can be updated as:
cv bias "h_xi" update
During a running simulation this step is not needed, because an automatic update of each bias is already carried out.
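In an analysis session, the energies of all defined biases for the current frame can be gathered in one pass. The sketch below is one possible way to do so: the procedure name current_bias_energies is hypothetical, and it relies on cv list biases to return the bias names and on the energy method of cv bias to return each bias energy.

```tcl
# Hypothetical helper: name/energy pairs for all biases at the current
# frame. Requires VMD, with Colvars already attached and configured.
proc current_bias_energies {} {
    # Recompute variables and biases for the current frame
    cv update
    set result {}
    foreach bias_name [cv list biases] {
        lappend result $bias_name [cv bias $bias_name energy]
    }
    return $result
}
```

The returned flat list can be iterated with foreach {name energy}, or converted to a Tcl dict for lookup by bias name.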
Some types of bias are history-dependent, and the magnitude of their forces depends not only on the values of their corresponding variables, but also on previous simulation history. It is thus useful to load information from a state file that contains information specifically for one bias only, for example:
cv bias "metadynamics1" load "old.colvars.state"
or alternatively, using the prefix of the file instead of its full name:
cv bias "metadynamics1" load "old"
A corresponding “save" function is also available:
cv bias "metadynamics1" save "new"
Please note that the file above must contain only the state information for that particular bias: loading a state file for the whole module is not allowed.
This pair of functions is also used internally by Colvars to implement e.g. multiple-walker metadynamics (7.5.7), but they can be called from a scripted function to implement alternative coupling schemes.
See 8.3 for a complete list of scripting commands used to manage biases.
Configuration for the Colvars module is passed using an external file, or inlined as a string in a VMD script using the Tcl command cv config "...". Configuration lines follow the format “keyword value" or “keyword { ... }", where the keyword and its value must be separated by one or more space characters. The following formatting rules apply:
The following keywords are available in the global context of the Colvars configuration, i.e. they are not nested inside other keywords:
Several of the sampling methods implemented in Colvars are time- or history-dependent, i.e. they work by accumulating data as a simulation progresses, and use these data to determine their biasing forces. If the simulation engine uses a checkpoint or restart file (as GROMACS and LAMMPS do), any data needed by Colvars are embedded into that file. Otherwise, a dedicated state file can be loaded into Colvars directly.
When a dedicated Colvars state file is used, it may be in either one of two formats:
In either format, the state file contains accumulated data as well as the step number at the end of the run. The step number read from a state file overrides any value that VMD provides, and will be incremented if the simulation proceeds. This means that the step number used internally by Colvars may not always match the step number reported by VMD.
In some cases, it is useful to modify the configuration of variables or biases between consecutive runs, for example by adding or removing a restraint. Some special provisions apply in that case. When a state file is loaded, no information is available about any newly added variable or bias, which will thus remain uninitialized until the first compute step. Conversely, any information that the state file may contain about variables or biases that are no longer defined will be silently ignored. Please note that these checks are performed based only on the names of variables and biases: it is your responsibility to ensure that these names have consistent definitions between runs.
The flexibility just described carries some limitations: namely, it is only supported when reading text-format Colvars state files. In contrast, restarting from binary files after a configuration change will trigger an error. It is also important to remember that when switching to a different build of VMD, the binary format may change slightly, even if the release version is the same.
To work around the potential issues just described, a text-format Colvars state file should be loaded. This is the default in VMD unless the environment variable “COLVARS_BINARY_RESTART" is set to 1; this information is provided here only for troubleshooting purposes.
When the output prefix outputName is defined, the following output files are written during a simulation run:
This section summarizes the file formats of various files that Colvars may be reading or writing.
Configuration files are text files that are generally read as input by VMD, and may be optionally inlined in a VMD script (see 4.2.1). Starting from version 2017-02-01, changes in newline encodings are handled transparently, i.e. it is possible to write a configuration file in Windows (CR-LF newlines) and then use it with Linux or macOS (LF-only newlines).
Formatted state files, although not written manually, follow otherwise the same text format as configuration files. Binary state files can only be read by the Colvars code itself.
For atom selections that cannot be specified only by using internal Colvars keywords, external index files may also be used following the NDX format used in GROMACS:
[ group_1_name ]
i1 i2 i3 i4 ...
... ... iN
[ group_2_name ]
...
where i1 through iN are 1-based indices. Each group name may not contain spaces or tabs: otherwise, a parsing error will be raised.
Multiple index files may be provided to Colvars, each using the keyword indexFile. Two index files may contain groups with the same names, however these must also represent identical atom selections, i.e. the same sequence of indices including order.
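The rules above can be illustrated with a short Python sketch. This is not the parser used by Colvars: the function names parse_ndx and merge_index_files are hypothetical, and the sketch only mirrors the rules stated in the text (1-based indices, group names without spaces or tabs, identical selections for groups that share a name).

```python
def parse_ndx(text):
    """Parse NDX-formatted text into {group_name: [1-based indices]}."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1].strip()
            # A valid name is exactly one token (no spaces or tabs)
            if len(name.split()) != 1:
                raise ValueError(f"invalid group name: {name!r}")
            groups[name] = []
            current = groups[name]
            continue
        if current is None:
            raise ValueError("indices found before any group header")
        current.extend(int(token) for token in line.split())
    return groups


def merge_index_files(*parsed_files):
    """Merge parsed files; groups with the same name must be identical."""
    merged = {}
    for groups in parsed_files:
        for name, indices in groups.items():
            if name in merged and merged[name] != indices:
                raise ValueError(f"conflicting definitions of group {name!r}")
            merged[name] = indices
    return merged
```

List comparison in Python is order-sensitive, which matches the requirement that two groups with the same name contain the same sequence of indices.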
Other than with GROMACS, an index group may also be generated from the VMD command-line interface, using the helper function write_index_group provided in the colvartools folder:
source colvartools/write_index_group.tcl
set sel [atomselect top "resname XXX and not hydrogen"]
write_index_group indexfile.ndx $sel "Ligand"
XYZ coordinate files are text files with the extension “.xyz". They are read by the Colvars module using an internal reader, and expect the following format:
$N$
Comment line
${E}_{1}$ ${x}_{1}$ ${y}_{1}$ ${z}_{1}$
${E}_{2}$ ${x}_{2}$ ${y}_{2}$ ${z}_{2}$
…
${E}_{N}$ ${x}_{N}$ ${y}_{N}$ ${z}_{N}$
where $N$ is the number of atomic coordinates in the file and ${E}_{i}$ is the chemical element of the $i$-th atom. Because ${E}_{i}$ is not used in Colvars, any string that does not contain tabs or spaces is acceptable.
Note: all XYZ coordinates are assumed to be expressed in Å units.
An XYZ file may contain either one of the following scenarios:
XYZ-file coordinates are read directly by Colvars and stored internally as double-precision floating point numbers (unlike VMD's reader, which uses single precision).
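As an illustration of the XYZ layout described in this section, the following Python sketch reads one set of coordinates from a string. It is not the internal Colvars reader: the function name read_xyz is hypothetical. Python floats are double precision, and the element label is skipped because it is not used.

```python
def read_xyz(text):
    """Read N coordinate triplets (assumed to be in Angstrom) from XYZ text."""
    lines = text.splitlines()
    n_atoms = int(lines[0].split()[0])
    # lines[1] is the comment line and is ignored
    coords = []
    for line in lines[2:2 + n_atoms]:
        fields = line.split()
        if len(fields) < 4:
            raise ValueError(f"malformed coordinate line: {line!r}")
        # fields[0] is the element label, which is not used
        coords.append(tuple(float(value) for value in fields[1:4]))
    if len(coords) != n_atoms:
        raise ValueError("fewer coordinate lines than declared")
    return coords
```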
PDB coordinate files are read by the Colvars module using existing functionality in VMD, and therefore follow the same format. The values of the atomic coordinates and other fields, such as occupancy or temperature factors, are then communicated to Colvars by VMD.
PDB files may be used either as one of the available mechanisms to select atoms (see the atomsFile keyword), or more frequently to read reference coordinates for least-squares fit alignment (see the refPositionsFile keyword).
To select atoms via the atomsFile keyword, the option atomsCol is required, and atoms are selected based on either one of the following cases.
To read coordinates via the refPositionsFile keyword, there are four possible scenarios.
Due to the fixed-precision PDB format, it is not recommended to use PDB files to read coordinates when precision is a concern, and the XYZ format (see 4.7.2) is recommended instead.
Many simulation methods and analysis tools write files that contain functions of the collective variables tabulated on a grid (e.g. potentials of mean force or multidimensional histograms) for the purpose of analyzing results. Such files are produced by ABF (7.2), metadynamics (7.5), multidimensional histograms (7.12), as well as any restraint with optional thermodynamic integration support (7.1).
In some cases, these files may also be read as input of a new simulation. Suitable input files for this purpose are typically generated as output files of previous simulations, or directly by the user in the specific case of ensemble-biased metadynamics (7.5.5). This section explains the “multicolumn" format used by these files. For a multidimensional function $f({\xi}_{1}, {\xi}_{2}, \dots)$ the multicolumn grid format is defined as follows:
# ${N}_{cv}$
# $min\left({\xi}_{1}\right)$ $width\left({\xi}_{1}\right)$ $npoints\left({\xi}_{1}\right)$ $periodic\left({\xi}_{1}\right)$
# $min\left({\xi}_{2}\right)$ $width\left({\xi}_{2}\right)$ $npoints\left({\xi}_{2}\right)$ $periodic\left({\xi}_{2}\right)$
# … … … …
# $min\left({\xi}_{{N}_{cv}}\right)$ $width\left({\xi}_{{N}_{cv}}\right)$ $npoints\left({\xi}_{{N}_{cv}}\right)$ $periodic\left({\xi}_{{N}_{cv}}\right)$
${\xi}_{1}^{1}$ ${\xi}_{2}^{1}$ … ${\xi}_{{N}_{cv}}^{1}$ f(${\xi}_{1}^{1}$, ${\xi}_{2}^{1}$, …, ${\xi}_{{N}_{cv}}^{1}$)
${\xi}_{1}^{1}$ ${\xi}_{2}^{1}$ … ${\xi}_{{N}_{cv}}^{2}$ f(${\xi}_{1}^{1}$, ${\xi}_{2}^{1}$, …, ${\xi}_{{N}_{cv}}^{2}$)
… … … … …
Lines beginning with the character “#" are the header of the file. ${N}_{cv}$ is the number of collective variables sampled by the grid. For each variable ${\xi}_{i}$, $min\left({\xi}_{i}\right)$ is the lowest value sampled by the grid (i.e. the left-most boundary of the grid along ${\xi}_{i}$), $width\left({\xi}_{i}\right)$ is the width of each grid step along ${\xi}_{i}$, $npoints\left({\xi}_{i}\right)$ is the number of points and $periodic\left({\xi}_{i}\right)$ is a flag whose value is 1 or 0 depending on whether the grid is periodic along ${\xi}_{i}$. In most situations:
How the grid's boundaries affect the sequence of points depends on how the contents of the file were computed. In many cases, such as histograms and PMFs computed by metadynamics (7.5.5), the values of ${\xi}_{i}$ in the first few columns correspond to the midpoints of the corresponding bins, i.e. ${\xi}_{i}^{1}=min\left({\xi}_{i}\right)+width\left({\xi}_{i}\right)/2$. However, there is a slightly different format in PMF files computed by ABF (7.2) or other biases that use thermodynamic integration (7.1). In these cases, it is free-energy gradients that are accumulated on an (npoints)-long grid along each variable $\xi$: after these gradients are integrated, the resulting PMF is discretized on a slightly larger grid with (npoints+1) points along $\xi$ (unless the interval is periodic). Therefore, the grid's outer edges extend by $width\left({\xi}_{i}\right)/2$ above and below the specified boundaries, so that for instance $min\left({\xi}_{i}\right)$ in the header appears to be shifted back by $width\left({\xi}_{i}\right)/2$ compared to what would be expected. Please keep this difference in mind when comparing PMFs computed by different methods.
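The difference between the two conventions can be sketched in Python as follows. This is one reading of the paragraph above rather than code taken from Colvars: the function names are hypothetical, only the non-periodic case is covered, and the integrated PMF is assumed to be tabulated at the edges of the bins on which the gradients were accumulated.

```python
def histogram_points(lower_boundary, width, npoints):
    """Bin midpoints, as listed for histograms and metadynamics PMFs."""
    return [lower_boundary + (k + 0.5) * width for k in range(npoints)]


def ti_pmf_points(lower_boundary, width, npoints):
    """Points of a PMF integrated from gradients (non-periodic case).

    Gradients are accumulated on npoints bins; the PMF is assumed to be
    listed at the npoints + 1 edges of those bins.
    """
    return [lower_boundary + k * width for k in range(npoints + 1)]


def ti_pmf_header_min(lower_boundary, width):
    """Header value of min(xi) for such a PMF: shifted back by width / 2."""
    return lower_boundary - width / 2.0
```

With lower_boundary = 15, width = 0.1 and npoints = 330, the first function yields 330 midpoints starting near 15.05, while the second yields 331 points starting at 15.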
After the header, the rest of the file contains values of the tabulated function $f({\xi}_{1}, {\xi}_{2}, \dots, {\xi}_{{N}_{cv}})$, one for each line. The first ${N}_{cv}$ columns contain values of ${\xi}_{1}$, ${\xi}_{2}$, …, ${\xi}_{{N}_{cv}}$ and the last column contains the value of the function $f$. Points are sorted in ascending order with the fastest-changing values at the right (“C-style" order). Each sweep of the right-most variable ${\xi}_{{N}_{cv}}$ is terminated by an empty line. For two-dimensional grid files, this allows quick visualization by programs such as gnuplot.
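A minimal Python sketch of a reader for this layout is given below. It is not part of Colvars: the function name parse_multicolumn is hypothetical, the “#" character is assumed to be separated from the header values by whitespace, and each data line is assumed to carry a single function value in its last column.

```python
def parse_multicolumn(text):
    """Return (n_cv, header, rows) from multicolumn grid text.

    header: one (min, width, npoints, periodic) tuple per variable
    rows:   list of ((xi_1, ..., xi_Ncv), f) pairs in file order
    """
    n_cv = None
    header = []
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # empty lines only separate sweeps of the last variable
        if fields[0] == "#":
            if n_cv is None:
                n_cv = int(fields[1])  # first header line: number of variables
            else:
                header.append((float(fields[1]), float(fields[2]),
                               int(fields[3]), bool(int(fields[4]))))
        else:
            values = [float(v) for v in fields]
            rows.append((tuple(values[:-1]), values[-1]))
    return n_cv, header, rows
```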
Example 1: multicolumn text file for a one-dimensional histogram with lowerBoundary = 15, upperBoundary = 48 and width = 0.1.
# 1
# 15 0.1 330 0
15.05 6.14012e-07
15.15 7.47644e-07
… …
47.85 1.65944e-06
47.95 1.46712e-06
Example 2: multicolumn text file for a two-dimensional histogram of two dihedral angles (periodic interval with 6${}^{\circ}$ bins):
# 2
# -180.0 6.0 60 1
# -180.0 6.0 60 1
-177.0 -177.0 8.97117e-06
-177.0 -171.0 1.53525e-06
… … …
-177.0 177.0 2.44295e-06
-171.0 -177.0 2.04702e-05
… … …
The Colvars trajectory file (with a suffix .colvars.traj) is a plain text file (scientific notation with 14-digit precision) whose columns represent quantities such as colvar values, applied forces, or individual restraints' energies. Under most scenarios, plotting or analyzing this file is straightforward: for example, the following contains a variable “$A$" and the energy of a restraint “$rA$":
# step A E_rA
0 1.42467449615693e+01 6.30982865292123e+02
100 1.42282559728026e+01 6.20640585041317e+02
…
Occasionally, if the Colvars configuration is changed mid-run, certain quantities may be added or removed, changing the column layout. Labels in comment lines can assist in such cases: for example, consider the trajectory above with the addition of a second variable, “$B$", after 10,000 steps:
# step A E_rA
0 1.42467449615693e+01 6.30982865292123e+02
100 1.42282559728026e+01 6.20640585041317e+02
…
# step A B E_rA
10000 1.38136915830229e+01 9.99574098859265e-01 4.11184644791030e+02
10100 1.36437184346326e+01 9.99574091957314e-01 3.37726286543895e+02
Analyzing the above file with standard tools is possible, but laborious: a convenience script is provided for this and related purposes. It may be used either as a command-line tool or as a Python module:
>>> from plot_colvars_traj import Colvars_traj
>>> traj = Colvars_traj('test.colvars.traj')
>>> print(traj['A'].steps, traj['A'].values)
[ 0 100 ... 10000 10100] [14.246745 14.228256 ... 13.813692 13.643718]
>>> print(traj['B'].steps, traj['B'].values)
[10000 10100] [0.999574 0.9995741]
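For readers who prefer not to depend on the convenience script, the role of the comment-line labels can be illustrated with a few lines of plain Python. This is a simplified sketch and not the implementation of plot_colvars_traj: the function name parse_colvars_traj is hypothetical, and only scalar columns are handled (vector-valued quantities are assumed to be absent).

```python
def parse_colvars_traj(text):
    """Return {label: (steps, values)}, following layout changes mid-file."""
    labels = []
    data = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "#":
            # A comment line such as "# step A B E_rA" redefines the columns
            labels = fields[2:]
            continue
        step = int(fields[0])
        for label, value in zip(labels, fields[1:]):
            steps, values = data.setdefault(label, ([], []))
            steps.append(step)
            values.append(float(value))
    return data
```

Each quantity keeps its own list of steps, so a variable introduced mid-run only has entries for the steps at which it was actually written.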
A collective variable is defined by the keyword colvar followed by its configuration options contained within curly braces:
colvar {
  name xi
  <other options>
  function_name {
    <parameters>
    <atom selection>
  }
}
There are multiple ways of defining a variable:
Choosing a component (function) is the only parameter strictly required to define a collective variable. It is also highly recommended to specify a name for the variable:
In this context, the function that computes a colvar is called a component. A component's choice and definition consists of including in the variable's configuration a keyword indicating the type of function (e.g. rmsd), followed by a definition block specifying the atoms involved (see 6) and any additional parameters (cutoffs, “reference" values, …). At least one component must be chosen to define a variable: if none of the keywords listed below is found, an error is raised.
The following components implement functions with a scalar value (i.e. a real number):
Some components do not return scalar, but vector values:
The types of components used in a colvar (scalar or not) determine the properties of that colvar, and particularly which biasing or analysis methods can be applied.
What if “X" is not listed? If a function type is not available on this list, it may be possible to define it as a polynomial superposition of existing ones (see 5.17), a custom function (see 5.18), or a scripted function (see 5.19).
In the rest of this section, all available component types are listed, along with their physical units and their ranges of values, if limited. Such ranges are often used to define automatically default sampling intervals, for example by setting the parameters lowerBoundary and upperBoundary in the parent colvar.
For each type of component, the available configuration keywords are listed: when two components share certain keywords, the second component refers to the documentation of the first one that uses that keyword. The very few keywords that are available for all types of components are listed in a separate section 5.14.
In all colvar components described below, the following rules apply concerning periodic boundary conditions (PBCs):
The distance {...} block defines a distance component between the two atom groups, group1 and group2.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The value returned is a positive number (in length unit), ranging from $0$ to the largest possible interatomic distance within the chosen boundary conditions (with PBCs, the minimum image convention is used unless the forceNoPBC option is set).
The distanceZ {...} block defines a distance projection component, which can be seen as measuring the distance between two groups projected onto an axis, or the position of a group along such an axis. The axis can be defined using either one reference group and a constant vector, or dynamically based on two reference groups. One of the groups can be set to a dummy atom to allow the use of an absolute Cartesian coordinate.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
This component returns a number (in length unit) whose range is determined by the chosen boundary conditions. For instance, if the $z$ axis is used in a simulation with periodic boundaries, the returned value ranges between $-{b}_{z}/2$ and ${b}_{z}/2$, where ${b}_{z}$ is the box length along $z$ (this behavior is disabled if forceNoPBC is set).
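The projection and its periodic range can be sketched in Python as below. This is an illustration under simplifying assumptions, not the Colvars implementation: the function name distance_z is hypothetical, main and ref stand for the already-computed centers of the two groups, and the wrapping step assumes that the axis is aligned with one edge of an orthorhombic box of length box_length.

```python
import math


def distance_z(main, ref, axis=(0.0, 0.0, 1.0), box_length=None):
    """Project the vector ref -> main onto the (normalized) axis."""
    norm = math.sqrt(sum(c * c for c in axis))
    unit = [c / norm for c in axis]
    d = sum((m - r) * u for m, r, u in zip(main, ref, unit))
    if box_length is not None:
        # Wrap into [-box_length/2, box_length/2]; skipping this step
        # corresponds to the behavior described for forceNoPBC
        d -= box_length * round(d / box_length)
    return d
```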
The distanceXY {...} block defines a distance projected on a plane, and accepts the same keywords as the component distanceZ, i.e. main, ref, either ref2 or axis, and oneSiteTotalForce. It returns the norm of the projection of the distance vector between main and ref onto the plane orthogonal to the axis. The axis is defined using the axis parameter or as the vector joining ref and ref2 (see distanceZ above).
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The distanceVec component computes the 3-dimensional vector joining the centers of mass of group1 and group2. Its values are therefore multi-dimensional and are subject to the restrictions listed in 5.16. Moreover, when computing differences between two different values of a distanceVec variable the minimum-image convention is assumed (unless forceNoPBC is enabled).
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The distanceDir {...} block defines a distance unit vector component, which accepts the same keywords as the component distance: group1, group2, and forceNoPBC. It returns a 3-dimensional unit vector $d=({d}_{x},{d}_{y},{d}_{z})$, with $\left|d\right|=1$.
This multi-dimensional variable has two intrinsic degrees of freedom: however, these cannot be sampled independently as one-dimensional variables. A decomposition in two dimensions can be done using polarTheta and polarPhi angles, with the caveat that the latter is ill-defined when the former approaches 0${}^{\circ}$ or 180${}^{\circ}$.
The distance between two values of distanceDir is calculated internally as the angle (in radians) between the two unit vectors: this definition adapts the standard Euclidean distance to the unit sphere, to ensure that restraint forces comply with the mathematical constraint.
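As a small illustration of this definition, the angle between two unit vectors can be computed in Python as follows. The function name unit_vector_distance is hypothetical, and both inputs are assumed to be already normalized.

```python
import math


def unit_vector_distance(d1, d2):
    """Angle (in radians) between two unit vectors."""
    dot = sum(a * b for a, b in zip(d1, d2))
    # Clamp to [-1, 1] to guard against round-off before calling acos
    dot = max(-1.0, min(1.0, dot))
    return math.acos(dot)
```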
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The distanceInv {...} block defines a generalized mean distance between two groups of atoms 1 and 2, where each distance is taken to the power $-n$:
$${d}_{1,2}^{[n]} = {\left(\frac{1}{{N}_{1}{N}_{2}}\sum_{i,j}{d}_{ij}^{-n}\right)}^{-1/n}$$ (2)
where ${d}_{ij}$ is the distance between atoms $i$ and $j$ in groups 1 and 2 respectively, and $n$ is an even integer.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
This component returns a number ranging from $0$ to the largest possible distance within the chosen boundary conditions.
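Equation (2) can be written as a short Python function, shown here as a sketch only. The name distance_inv is hypothetical, positions are plain coordinate tuples, periodic boundary conditions are ignored, and the exponent n is passed explicitly (an even integer, as stated above).

```python
import math


def distance_inv(group1, group2, n):
    """Generalized mean distance of eq. (2) between two sets of positions."""
    total = 0.0
    for a in group1:
        for b in group2:
            total += math.dist(a, b) ** (-n)
    mean = total / (len(group1) * len(group2))
    return mean ** (-1.0 / n)
```

Because each distance enters with a negative power, the shortest pair distances dominate the result, increasingly so for larger n.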
The angle {...} block defines an angle, and contains the three blocks group1, group2 and group3, defining the three groups. It returns an angle (in degrees) within the interval $[0:180]$.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The dipoleAngle {...} block defines an angle, and contains the three blocks group1, group2 and group3, defining the three groups, with group1 being the group whose dipole is calculated. It returns an angle (in degrees) within the interval $[0:180]$.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The dihedral {...} block defines a torsional angle, and contains the blocks group1, group2, group3 and group4, defining the four groups. It returns an angle (in degrees) within the interval $[-180:180]$. The Colvars module calculates all the distances between two angles taking into account periodicity. For instance, reference values for restraints or range boundaries can be defined by using any real number of choice.
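The effect of periodicity on differences between angles can be illustrated with the following Python sketch. It is not taken from Colvars: the function name angle_diff is hypothetical, and the convention of returning values in the half-open interval [-180, 180) is a choice made for this example.

```python
def angle_diff(a, b, period=360.0):
    """Shortest signed difference a - b (degrees), in [-period/2, period/2)."""
    return (a - b + period / 2.0) % period - period / 2.0
```

For example, 170 and -170 differ by 20 degrees under this definition rather than 340, and a restraint center written as 540 is equivalent to one written as 180.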
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The polarTheta {...} block defines the polar angle in spherical coordinates, for the center of mass of a group of atoms described by the block atoms. It returns an angle (in degrees) within the interval $[0:180]$. To obtain spherical coordinates in a frame of reference tied to another group of atoms, use the fittingGroup (6.2) option within the atoms block. An example is provided in file examples/11_polar_angles.in of the Colvars public repository.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The polarPhi {...} block defines the azimuthal angle in spherical coordinates, for the center of mass of a group of atoms described by the block atoms. It returns an angle (in degrees) within the interval $[-180:180]$. The Colvars module calculates all the distances between two angles taking into account periodicity. For instance, reference values for restraints or range boundaries can be defined by using any real number of choice. To obtain spherical coordinates in a frame of reference tied to another group of atoms, use the fittingGroup (6.2) option within the atoms block. An example is provided in file examples/11_polar_angles.in of the Colvars public repository.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
Note: polarPhi is ill-defined when the corresponding polarTheta component is close to 0${}^{\circ}$ or 180${}^{\circ}$; please take measures to avoid sampling these configurations in your simulations.
The coordNum {...} block defines a coordination number (or number of contacts), which calculates the function $(1-{(d/{d}_{0})}^{n})/(1-{(d/{d}_{0})}^{m})$, where ${d}_{0}$ is the “cutoff" distance, and $n$ and $m$ are exponents that can control its long range behavior and stiffness [3]. This function is summed over all pairs of atoms in group1 and group2:
$$C(\mathrm{group1},\mathrm{group2}) = \sum_{i\in \mathrm{group1}}\sum_{j\in \mathrm{group2}}\frac{1-{(|{x}_{i}-{x}_{j}|/{d}_{0})}^{n}}{1-{(|{x}_{i}-{x}_{j}|/{d}_{0})}^{m}}$$ (3)
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
This component returns a dimensionless number, which ranges from approximately 0 (all interatomic distances are much larger than the cutoff) to ${N}_{group1}\times {N}_{group2}$ (all distances are less than the cutoff), or ${N}_{group1}$ if group2CenterOnly is used. For performance reasons, at least one of group1 and group2 should be of limited size or group2CenterOnly should be used: the cost of the loop over all pairs grows as ${N}_{group1}\times {N}_{group2}$. Setting $tolerance>0$ ameliorates this to some degree, although every pair is still checked to regenerate the pair list.
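The pairwise sum and its limiting values can be sketched in Python as follows. This is an illustration, not the optimized Colvars code: the function names are hypothetical, the default exponents n = 6 and m = 12 are assumptions of this sketch, no pair list or tolerance is used, and periodic boundary conditions are ignored.

```python
import math


def switching(d, d0, n, m):
    """Switching function (1 - (d/d0)^n) / (1 - (d/d0)^m)."""
    x = d / d0
    if abs(x - 1.0) < 1e-12:
        return n / m  # analytic limit of the 0/0 form at d = d0
    return (1.0 - x ** n) / (1.0 - x ** m)


def coord_num(group1, group2, d0, n=6, m=12):
    """Sum of the switching function over all pairs (N1 x N2 terms)."""
    return sum(switching(math.dist(a, b), d0, n, m)
               for a in group1 for b in group2)
```

The double loop makes the cost proportional to the product of the two group sizes, which is the reason for the performance remark above.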
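The selfCoordNum {...} block defines a coordination number similarly to the component coordNum, but the function is summed over atom pairs within group1:
$$C(\mathrm{group1}) = \sum_{i\in \mathrm{group1}}\sum_{j>i}\frac{1-{(|{x}_{i}-{x}_{j}|/{d}_{0})}^{n}}{1-{(|{x}_{i}-{x}_{j}|/{d}_{0})}^{m}}$$ (4)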
The keywords accepted by selfCoordNum are a subset of those accepted by coordNum, namely group1 (here defining all of the atoms to be considered), cutoff, expNumer, and expDenom.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
This component returns a dimensionless number, which ranges from approximately 0 (all interatomic distances much larger than the cutoff) to ${N}_{group1}\times ({N}_{group1}-1)/2$ (all distances within the cutoff). For performance reasons, group1 should be of limited size, because the cost of the loop over all pairs grows as ${N}_{group1}^{2}$.
The hBond {...} block defines a hydrogen bond, implemented as a coordination number (eq. 3) between the donor and the acceptor atoms. Therefore, it accepts the same options cutoff (with a different default value of 3.3 Å), expNumer (with a default value of 6) and expDenom (with a default value of 8). Unlike coordNum, it requires two atom numbers, acceptor and donor, to be defined. It returns a dimensionless number, with values between 0 (acceptor and donor far outside the cutoff distance) and 1 (acceptor and donor much closer than the cutoff).
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The block rmsd {...} defines the root mean square displacement (RMSD) of a group of atoms with respect to a reference structure. For each set of coordinates $\{{x}_{1}(t),{x}_{2}(t),\dots ,{x}_{N}(t)\}$, the colvar component rmsd calculates the optimal rotation ${U}^{\{{x}_{i}(t)\}\to \{{x}_{i}^{(ref)}\}}$ that best superimposes the coordinates $\{{x}_{i}(t)\}$ onto a set of reference coordinates $\{{x}_{i}^{(ref)}\}$. Both the current and the reference coordinates are centered on their centers of geometry, ${x}_{cog}(t)$ and ${x}_{cog}^{(ref)}$. The root mean square displacement is then defined as:
$$RMSD(\{{x}_{i}(t)\},\{{x}_{i}^{(ref)}\}) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}{\left|U\left({x}_{i}(t)-{x}_{cog}(t)\right)-\left({x}_{i}^{(ref)}-{x}_{cog}^{(ref)}\right)\right|}^{2}}$$ (5)
The optimal rotation ${U}^{\{{x}_{i}(t)\}\to \{{x}_{i}^{(ref)}\}}$ is calculated within the formalism developed in reference [4], which guarantees a continuous dependence of ${U}^{\{{x}_{i}(t)\}\to \{{x}_{i}^{(ref)}\}}$ with respect to $\{{x}_{i}(t)\}$.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
This component returns a positive real number (in length unit).
In the standard usage as described above, the rmsd component calculates a minimum RMSD, that is, current coordinates are optimally fitted onto the same reference coordinates that are used to compute the RMSD value. The fit itself is handled by the atom group object, whose parameters are automatically set by the rmsd component. For very specific applications, however, it may be useful to control the fitting process separately from the definition of the reference coordinates, to evaluate various types of non-minimal RMSD values. This can be achieved by setting the related options (refPositions, etc.) explicitly in the atom group block. This allows for the following non-standard cases:
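The block eigenvector {...} defines the projection of the coordinates of a group of atoms (or more precisely, their deviations from the reference coordinates) onto a vector in ${\mathbb{R}}^{3n}$, where $n$ is the number of atoms in the group. The computed quantity is the total projection:
$$p(\{{x}_{i}(t)\},\{{x}_{i}^{(ref)}\}) = \sum_{i=1}^{n}{v}_{i}\cdot \left(U\left({x}_{i}(t)-{x}_{cog}(t)\right)-\left({x}_{i}^{(ref)}-{x}_{cog}^{(ref)}\right)\right)$$ (6)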
where, as in the rmsd component, $U$ is the optimal rotation matrix, ${x}_{cog}(t)$ and ${x}_{cog}^{(ref)}$ are the centers of geometry of the current and reference positions respectively, and ${v}_{i}$ are the components of the vector for each atom. Example choices for $({v}_{i})$ are an eigenvector of the covariance matrix (essential mode), or a normal mode of the system. It is assumed that $\sum_{i}{v}_{i}=0$: otherwise, the Colvars module centers the ${v}_{i}$ automatically when reading them from the configuration.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The block gyration {...} defines the parameters for calculating the radius of gyration of a group of atomic positions $\{{x}_{1}(t),{x}_{2}(t),\dots ,{x}_{N}(t)\}$ with respect to their center of geometry, ${x}_{cog}(t)$:
$${R}_{gyr} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}{\left|{x}_{i}(t)-{x}_{cog}(t)\right|}^{2}}$$ (7)
This component must contain one atoms {...} block to define the atom group, and returns a positive number, expressed in length unit.
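The radius of gyration defined above can be checked on a small set of points with the following Python sketch. The function names are hypothetical and positions are given as plain coordinate tuples.

```python
import math


def center_of_geometry(positions):
    """Unweighted average of the positions."""
    n = len(positions)
    return tuple(sum(c) / n for c in zip(*positions))


def radius_of_gyration(positions):
    """Root mean square distance of the positions from their center."""
    cog = center_of_geometry(positions)
    mean_sq = sum(math.dist(p, cog) ** 2 for p in positions) / len(positions)
    return math.sqrt(mean_sq)
```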
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The block inertia {...} defines the parameters for calculating the total moment of inertia of a group of atomic positions $\{{x}_{1}(t),{x}_{2}(t),\dots ,{x}_{N}(t)\}$ with respect to their center of geometry, ${x}_{cog}(t)$:
$$I = \sum_{i=1}^{N}{\left|{x}_{i}(t)-{x}_{cog}(t)\right|}^{2}$$ (8)
Note that all atomic masses are set to 1 for simplicity. This component must contain one atoms {...} block to define the atom group, and returns a positive number, expressed in length unit$^{2}$.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The dipoleMagnitude {...} block defines the dipole magnitude of a group of atoms (norm of the dipole moment's vector), with atoms being the group whose dipole magnitude is calculated. It returns the magnitude in elementary charge $e$ times length unit.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The block inertiaZ {...} defines the parameters for calculating the component along the axis $e$ of the moment of inertia of a group of atomic positions $\{{x}_{1}(t),{x}_{2}(t),\dots ,{x}_{N}(t)\}$ with respect to their center of geometry, ${x}_{cog}(t)$:
$${I}_{e} = \sum_{i=1}^{N}{\left(\left({x}_{i}(t)-{x}_{cog}(t)\right)\cdot e\right)}^{2}$$ (9)
Note that all atomic masses are set to 1 for simplicity. This component must contain one atoms {...} block to define the atom group, and returns a positive number, expressed in length unit$^{2}$.
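The two unit-mass definitions of inertia and inertiaZ given in this part of the manual can be compared side by side with a short Python sketch. The function names are hypothetical, and the axis is normalized inside the function as a convenience of this example.

```python
import math


def _centered(positions):
    """Positions relative to their center of geometry."""
    n = len(positions)
    cog = [sum(c) / n for c in zip(*positions)]
    return [[x - c for x, c in zip(p, cog)] for p in positions]


def inertia(positions):
    """Sum of squared distances from the center (unit masses)."""
    return sum(sum(c * c for c in r) for r in _centered(positions))


def inertia_z(positions, axis=(0.0, 0.0, 1.0)):
    """Sum of squared projections onto the unit vector along axis."""
    norm = math.sqrt(sum(c * c for c in axis))
    e = [c / norm for c in axis]
    return sum(sum(a * b for a, b in zip(r, e)) ** 2
               for r in _centered(positions))
```

For four points at (0, 0, ±1) and (±2, 0, 0), the total is 10, of which 2 comes from the projections on z and 8 from the projections on x.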
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The variables discussed in this section quantify the rotations of macromolecules (or other quasi-rigid objects) from a given set of reference coordinates to the current coordinates. Such rotations are computed following the same method used for best-fit RMSDs (see rmsd and fittingGroup). The underlying mathematical formalism is described in reference [4], and the implementation in reference [1].
The first of the functions described is the orientation, which describes the full rotation as a unit quaternion $q=({q}_{0},{q}_{1},{q}_{2},{q}_{3})$, i.e. 4 numbers with one constraint (3 degrees of freedom). The quaternion $q$ is one of only two representations that are both complete and accurate, the other being a $3\times 3$ orthogonal (rotation) matrix with 3 independent parameters. Although $q$ is used internally in the Colvars module for features such as the rmsd function and the fittingGroup option, its direct use as a collective variable is more difficult, and mostly limited to fixed or moving restraints.
The two functions orientationAngle and orientationProj, with the latter being the cosine of the former, represent the amplitude of the full rotation described by $q$, regardless of the direction of its axis. As one-dimensional scalar variables, both orientationAngle and orientationProj are a much reduced simplification of the full rotation. However, they can be used in a variety of methods including both restraints and PMF computations.
A slightly more complete parametrization is achieved by decomposing the full rotation into the two parameters, tilt and spinAngle. These quantify the amplitudes of two independent sub-rotations away from a certain axis $e$, and around the same axis $e$, respectively. The axis $e$ is chosen by the user, and is by default the Z axis: under that choice, tilt is equivalent to the sine of the Euler “pitch" angle $\mathit{\theta}$, and spinAngle to the sum of the other two angles, $(\varphi +\psi )$. This parameterization is mathematically well defined for almost all full rotations, including small ones when the current coordinates are almost completely aligned with the reference ones. However, a mathematical singularity prevents using the spinAngle function near configurations where the value of tilt is -1 (i.e. a -180$^{\circ}$ rotation around an axis orthogonal to $e$). For these reasons, tilt and spinAngle are useful when the allowed rotations are known to have approximately the same axis, and differ only in the magnitude of the corresponding angle. In this use case, spinAngle measures the angle of the sub-rotation around the chosen axis $e$, whereas tilt measures the dot product between $e$ and the actual axis of the full rotation.
Lastly, the traditional Euler angles are also available as the functions eulerPhi, eulerTheta and eulerPsi. Altogether, these are sufficient to represent all three degrees of freedom of a full rotation. However, they also suffer from the potential “gimbal lock" problem, which emerges whenever $\mathit{\theta}\simeq \pm 90^{\circ}$, which includes also the case where the full rotation is small. Under such conditions, the angles $\varphi $ and $\psi $ are both ill-defined and cannot be used as collective variables. For these reasons, it is highly recommended that Euler angles are used only in simulations where their range of applicability is known ahead of time, and excludes configurations where $\mathit{\theta}\simeq \pm 90^{\circ}$ altogether.
The block orientation {...} returns the same optimal rotation used in the rmsd component to superimpose the coordinates $\{{x}_{i}(t)\}$ onto a set of reference coordinates $\{{x}_{i}^{(ref)}\}$. This component returns a four-dimensional vector $q=({q}_{0},{q}_{1},{q}_{2},{q}_{3})$, with $\sum_{i}{q}_{i}^{2}=1$; this quaternion expresses the optimal rotation $\{{x}_{i}(t)\}\to \{{x}_{i}^{(ref)}\}$ according to the formalism in reference [4]. The quaternion $({q}_{0},{q}_{1},{q}_{2},{q}_{3})$ can also be written as $\left(\cos(\mathit{\theta}/2),\, \sin(\mathit{\theta}/2)\,u\right)$, where $\mathit{\theta}$ is the angle and $u$ the normalized axis of rotation; for example, a rotation of 90$^{\circ}$ around the $z$ axis is expressed as “(0.707, 0.0, 0.0, 0.707)". The script quaternion2rmatrix.tcl provides Tcl functions for converting to and from a $4\times 4$ rotation matrix in a format suitable for usage in VMD.
As for the component rmsd, the available options are atoms, refPositions, refPositionsFile, refPositionsCol and refPositionsColValue.
Note: refPositions and refPositionsFile define the set of positions from which the optimal rotation is calculated, but this rotation is not applied to the coordinates of the atoms involved: it is used instead to define the variable itself.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
Example: stopping the rotation of a protein. To stop the rotation of an elongated macromolecule in solution (and use an anisotropic box to save water molecules), it is possible to define a colvar with an orientation component, and restrain it through the harmonic bias around the identity rotation, (1.0, 0.0, 0.0, 0.0). Only the overall orientation of the macromolecule is affected, and not its internal degrees of freedom.
colvar {
name Orient
orientation {
atoms { … }
refPositionsFile reference.pdb
}
}
harmonic { # Define a harmonic restraint
colvars Orient # acting on colvar "Orient"
centers (1.0, 0.0, 0.0, 0.0) # center the unit quaternion (no rotation)
forceConstant 500.0 # unit is energy: quaternions are dimensionless
}
The block orientationAngle {...} accepts the same base options as the component orientation: atoms, refPositions, refPositionsFile, refPositionsCol and refPositionsColValue. The returned value is the angle of rotation $\mathit{\theta}$ between the current and the reference positions. This angle is expressed in degrees within the range [0${}^{\circ}$:180${}^{\circ}$].
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The block orientationProj {...} accepts the same base options as the component orientation: atoms, refPositions, refPositionsFile, refPositionsCol and refPositionsColValue. The returned value is the cosine of the angle of rotation $\mathit{\theta}$ between the current and the reference positions. The range of values is [-1:1].
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The complete rotation described by orientation can optionally be decomposed into two sub-rotations: one is a “spin" rotation around e, and the other a “tilt" rotation around an axis orthogonal to e. The component spinAngle measures the angle of the “spin" sub-rotation around e.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The component spinAngle returns an angle (in degrees) within the periodic interval $[-180:180]$.
Note: the value of spinAngle is a continuous function almost everywhere, with the exception of configurations with the corresponding “tilt" angle equal to 180${}^{\circ}$ (i.e. the tilt component is equal to $-1$): in those cases, spinAngle is undefined. If such configurations are expected, consider defining a tilt colvar using the same axis e, and restraining it with a lower wall away from $-1$.
The component tilt measures the cosine of the angle of the “tilt" sub-rotation, which combined with the “spin" sub-rotation provides the complete rotation of a group of atoms. The cosine of the tilt angle rather than the tilt angle itself is implemented, because the latter is unevenly distributed even for an isotropic system: consider as an analogy the angle $\mathit{\theta}$ in the spherical coordinate system. The component tilt relies on the same options as spinAngle, including the definition of the axis e. The values of tilt are real numbers in the interval $[-1:1]$: the value $1$ represents an orientation fully parallel to e (tilt angle = 0${}^{\circ}$), and the value $-1$ represents an anti-parallel orientation.
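To illustrate the decomposition into “spin” and “tilt” sub-rotations, the sketch below applies a standard swing-twist decomposition to a unit quaternion. It is an assumption-based illustration in Python, not necessarily the procedure used inside Colvars; the function name `spin_angle_and_tilt` is hypothetical, and the quaternion is assumed to be given with its scalar part first.

```python
import math

def spin_angle_and_tilt(q, e):
    """Return (spin angle in degrees, cosine of the tilt angle) for a unit
    quaternion q = (q0, q1, q2, q3) and a unit axis e."""
    q0, qv = q[0], q[1:]
    # Projection of the vector part of q onto the axis e
    proj = sum(a * b for a, b in zip(qv, e))
    # The twist (spin) quaternion is proportional to (q0, proj * e)
    spin = math.degrees(2.0 * math.atan2(proj, q0))
    # Fold the angle into the periodic interval (-180, 180]
    if spin > 180.0:
        spin -= 360.0
    elif spin <= -180.0:
        spin += 360.0
    # cos(tilt/2) = sqrt(q0^2 + proj^2), hence cos(tilt) = 2*(q0^2 + proj^2) - 1
    tilt = 2.0 * (q0 * q0 + proj * proj) - 1.0
    return spin, tilt

half = math.radians(90.0) / 2.0
z_axis = (0.0, 0.0, 1.0)
q_spin = (math.cos(half), 0.0, 0.0, math.sin(half))  # 90 degrees around z
q_tilt = (math.cos(half), math.sin(half), 0.0, 0.0)  # 90 degrees around x
print(spin_angle_and_tilt(q_spin, z_axis))  # spin near 90, tilt near 1
print(spin_angle_and_tilt(q_tilt, z_axis))  # spin near 0, tilt near 0
```

When both q0 and the projection vanish (tilt equal to -1), the arguments of `atan2` are both zero and the spin angle carries no information, which is the singularity mentioned in the note on spinAngle.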
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
Assuming the axes of the original frame are denoted as x, y, z and the axes of the rotated frame as X, Y, Z, the line of nodes, N, can be defined as the intersection of the planes xy and XY. The axis perpendicular to N and z is defined as P. In this case, as illustrated in the figure below, the complete rotation described by orientation can optionally be decomposed into three Euler angles:
Although Euler angles are more straightforward to use than quaternions, they are also potentially subject to the “gimbal lock” problem (https://en.wikipedia.org/wiki/Gimbal_lock), which arises whenever $\theta \simeq \pm 90^{\circ}$, including the common case when the simulated coordinates are near the reference coordinates. Therefore, a safe use of Euler angles as collective variables requires the use of restraints to avoid such singularities, as done in reference [6] and in the protein-ligand binding NAMD tutorial.
The eulerPhi component accepts exactly the same options as orientation, and measures the rotation angle from the x axis to the N axis. This angle is expressed in degrees within the periodic range $[-18{0}^{\circ}:18{0}^{\circ}]$.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
refPositionsCol: same definition as refPositionsCol (rmsd component)
refPositionsColValue: same definition as refPositionsColValue (rmsd component)
The eulerTheta component accepts exactly the same options as orientation, and measures the rotation angle from the P axis to the Z axis. This angle is expressed in degrees within the range $[-90^{\circ}:90^{\circ}]$.
Warning: When this angle reaches $-90^{\circ}$ or $90^{\circ}$, the definition of orientation by Euler angles suffers from the gimbal lock issue. You may need to apply a restraint to keep eulerTheta away from the singularities.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
refPositionsCol: same definition as refPositionsCol (rmsd component)
refPositionsColValue: same definition as refPositionsColValue (rmsd component)
The eulerPsi component accepts exactly the same options as orientation, and measures the rotation angle from the N axis to the X axis. This angle is expressed in degrees within the periodic range $[-180^{\circ}:180^{\circ}]$.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
refPositionsCol: same definition as refPositionsCol (rmsd component)
refPositionsColValue: same definition as refPositionsColValue (rmsd component)
The block alpha {...} defines the parameters to calculate the helical content of a segment of protein residues. The $\alpha $-helical content across the $N+1$ residues ${N}_{0}$ to ${N}_{0}+N$ is calculated by the formula:
$$\begin{array}{l}\alpha \left(C_{\alpha}^{(N_0)},O^{(N_0)},C_{\alpha}^{(N_0+1)},O^{(N_0+1)},\dots\,N^{(N_0+5)},C_{\alpha}^{(N_0+5)},O^{(N_0+5)},\dots\,N^{(N_0+N)},C_{\alpha}^{(N_0+N)}\right)=\\ \qquad \frac{(1-C_{\text{hb}})}{N-1}\sum_{n=N_0}^{N_0+N-2}angf\left(C_{\alpha}^{(n)},C_{\alpha}^{(n+1)},C_{\alpha}^{(n+2)}\right)+\frac{C_{\text{hb}}}{N-3}\sum_{n=N_0}^{N_0+N-4}hbf\left(O^{(n)},N^{(n+4)}\right),\end{array}$$ | (10), (11) |
where ${C}_{\text{hb}}$ is defined by hBondCoeff, the score function $angf$ for the ${C}_{\alpha}-{C}_{\alpha}-{C}_{\alpha}$ angle is defined as:
and the score function $hbf$ for the $O^{(n)}\leftrightarrow N^{(n+4)}$ hydrogen bond is defined through an hBond colvar component on the same atoms.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
This component returns values that always lie between 0 (lowest $\alpha $-helical score) and 1 (highest $\alpha $-helical score).
The block dihedralPC {...} defines the parameters to calculate the projection of backbone dihedral angles within a protein segment onto a dihedral principal component, following the formalism of dihedral principal component analysis (dPCA) proposed by Mu et al. [7] and documented in detail by Altis et al. [8]. Given a peptide or protein segment of $N$ residues, each with Ramachandran angles ${\varphi}_{i}$ and ${\psi}_{i}$, dPCA rests on a variance/covariance analysis of the $4(N-1)$ variables $\cos({\psi}_{1}),\sin({\psi}_{1}),\cos({\varphi}_{2}),\sin({\varphi}_{2}),\cdots,\cos({\varphi}_{N}),\sin({\varphi}_{N})$. Note that angles ${\varphi}_{1}$ and ${\psi}_{N}$ have little impact on chain conformation, and are therefore discarded, following the implementation of dPCA in the analysis software Carma [9].
For a given principal component (eigenvector) of coefficients ${\left({k}_{i}\right)}_{1\le i\le 4(N-1)}$, the projection of the current backbone conformation is:
dihedralPC expects the same parameters as the alpha component for defining the relevant residues (residueRange and psfSegID) in addition to the following:
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The cartesian {...} block defines a component returning a flat vector containing the Cartesian coordinates of all participating atoms, in the order $({x}_{1},{y}_{1},{z}_{1},\cdots,{x}_{n},{y}_{n},{z}_{n})$.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The distancePairs {...} block defines an ${N}_{1}\times {N}_{2}$-dimensional variable that includes all mutual distances between the atoms of two groups. This can be useful, for example, to develop a new variable defined over two groups, by using the scriptedFunction feature.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
This component returns an ${N}_{1}\times {N}_{2}$-dimensional vector of numbers, each ranging from $0$ to the largest possible distance within the chosen boundary conditions.
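As an illustration of the shape of this vector, the Python sketch below computes all mutual distances between two small groups of atoms. It is only a sketch under stated assumptions: the function name `distance_pairs` is hypothetical, periodic boundary conditions are ignored, and the ordering of the elements (looping over the second group fastest) is an assumption rather than a documented property of the component.

```python
import math

def distance_pairs(group1, group2):
    """Return the N1 * N2 distances between every atom of group1 and every atom of group2."""
    return [math.dist(a, b) for a in group1 for b in group2]

group1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]                    # N1 = 2
group2 = [(0.0, 0.0, 0.0), (0.0, 3.0, 4.0), (0.0, 0.0, 2.0)]   # N2 = 3
d = distance_pairs(group1, group2)
print(len(d))   # 6
print(d[:3])    # [0.0, 5.0, 2.0]
```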
The geometric path collective variables define the progress along a path, $s$, and the distance from the path, $z$. These CVs were proposed by Leines and Ensing [10]; they differ from those proposed by Branduardi et al. [11] and rely on a set of geometric algorithms. The path is defined as a series of frames in the atomic Cartesian coordinate space or the CV space. $s$ and $z$ are computed as
$$s=\frac{m}{M}\pm \frac{1}{2M}\left(\frac{\sqrt{{({v}_{1}\cdot {v}_{3})}^{2}-|{v}_{3}|^{2}\left(|{v}_{1}|^{2}-|{v}_{2}|^{2}\right)}-({v}_{1}\cdot {v}_{3})}{|{v}_{3}|^{2}}-1\right)$$ | (14) |
where ${v}_{1}={s}_{m}-z$ is the vector connecting the current position to the closest frame, ${v}_{2}=z-{s}_{m-1}$ is the vector connecting the second closest frame to the current position, ${v}_{3}={s}_{m+1}-{s}_{m}$ is the vector connecting the closest frame to the third closest frame, and ${v}_{4}={s}_{m}-{s}_{m-1}$ is the vector connecting the second closest frame to the closest frame. $m$ and $M$ are the current index of the closest frame and the total number of frames, respectively. If the current position is on the left of the closest reference frame, the $\pm $ in $s$ turns to the positive sign. Otherwise it turns to the negative sign.
The equations above assume that: (i) the frames are equidistant and (ii) the second and the third closest frames are neighbors of the closest frame. When these assumptions are not satisfied, these path CVs should be used with care.
In the gspath {...} and gzpath {...} blocks, all vectors, namely $z$ and ${s}_{k}$, are defined in atomic Cartesian coordinate space. More specifically, $z=\left[{r}_{1},\cdots,{r}_{n}\right]$, where ${r}_{i}$ is the position of the $i$-th atom specified in the atoms block, and ${s}_{k}=\left[{r}_{k,1},\cdots,{r}_{k,n}\right]$, where ${r}_{k,i}$ is the position of the $i$-th atom in the $k$-th reference frame.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The usage of gzpath and gspath is illustrated below:
colvar {
# Progress along the path
name gs
# The path is defined by 5 reference frames (from string-00.pdb to string-04.pdb)
# Use atomic coordinate from atoms 1, 2 and 3 to compute the path
gspath {
atoms {atomnumbers { 1 2 3 }}
refPositionsFile1 string-00.pdb
refPositionsFile2 string-01.pdb
refPositionsFile3 string-02.pdb
refPositionsFile4 string-03.pdb
refPositionsFile5 string-04.pdb
}
}
colvar {
# Distance from the path
name gz
# The path is defined by 5 reference frames (from string-00.pdb to string-04.pdb)
# Use atomic coordinate from atoms 1, 2 and 3 to compute the path
gzpath {
atoms {atomnumbers { 1 2 3 }}
refPositionsFile1 string-00.pdb
refPositionsFile2 string-01.pdb
refPositionsFile3 string-02.pdb
refPositionsFile4 string-03.pdb
refPositionsFile5 string-04.pdb
}
}
This is a helper CV which can be defined as a linear combination of other CVs. It may be useful when you want to define the gspathCV {...} and the gzpathCV {...} as combinations of other CVs. Total forces (required by ABF) are not available for this CV.
This is a helper CV which can be defined as a mathematical expression (see 5.18) of other CVs by using customFunction. Currently only the scalar type of customFunction is supported. If customFunction is not provided, this component falls back to linearCombination. It may be useful when you want to define gspathCV {...}, gzpathCV {...} and NeuralNetwork {...} as combinations of other CVs. Total forces (required by ABF) are not available for this CV.
In the gspathCV {...} and gzpathCV {...} blocks, all vectors, namely $z$ and ${s}_{k}$, are defined in CV space. More specifically, $z=\left[{\xi}_{1},\cdots,{\xi}_{n}\right]$, where ${\xi}_{i}$ is the $i$-th CV, and ${s}_{k}=\left[{\xi}_{k,1},\cdots,{\xi}_{k,n}\right]$, where ${\xi}_{k,i}$ is the value of the $i$-th CV in the $k$-th reference frame. Note that these two CVs require the pathFile option, which specifies a path file. Each line in the path file contains the space-separated CV values of one reference frame. The sequence of reference frames matches the sequence of the lines.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
The usage of gzpathCV and gspathCV is illustrated below:
colvar {
# Progress along the path
name gs
# Path defined by the CV space of two dihedral angles
gspathCV {
pathFile ./path.txt
dihedral {
name 001
group1 {atomNumbers {5}}
group2 {atomNumbers {7}}
group3 {atomNumbers {9}}
group4 {atomNumbers {15}}
}
dihedral {
name 002
group1 {atomNumbers {7}}
group2 {atomNumbers {9}}
group3 {atomNumbers {15}}
group4 {atomNumbers {17}}
}
}
}
colvar {
# Distance from the path
name gz
gzpathCV {
pathFile ./path.txt
dihedral {
name 001
group1 {atomNumbers {5}}
group2 {atomNumbers {7}}
group3 {atomNumbers {9}}
group4 {atomNumbers {15}}
}
dihedral {
name 002
group1 {atomNumbers {7}}
group2 {atomNumbers {9}}
group3 {atomNumbers {15}}
group4 {atomNumbers {17}}
}
}
}
The arithmetic path collective variable in CV space uses a formula similar to the one proposed by Branduardi et al. [11], except that it computes $s$ and $z$ in CV space instead of using RMSDs in Cartesian space. Moreover, this implementation allows different coefficients for each CV component, as described in [12]. Assuming a path is composed of $N$ reference frames and defined in an $M$-dimensional CV space, the equations of $s$ and $z$ of the path are
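$$s=\frac{1}{N-1}\frac{\sum _{i=0}^{N-1}i\,\exp\left(-\lambda \sum _{j=1}^{M}{c}_{j}^{2}{\left({x}_{j}-{x}_{i,j}\right)}^{2}\right)}{\sum _{i=0}^{N-1}\exp\left(-\lambda \sum _{j=1}^{M}{c}_{j}^{2}{\left({x}_{j}-{x}_{i,j}\right)}^{2}\right)}$$ | (16) |
$$z=-\frac{1}{\lambda}\ln\left(\sum _{i=0}^{N-1}\exp\left(-\lambda \sum _{j=1}^{M}{c}_{j}^{2}{\left({x}_{j}-{x}_{i,j}\right)}^{2}\right)\right)$$ | (17) |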
where ${c}_{j}$ is the coefficient (weight) of the $j$-th CV, ${x}_{i,j}$ is the value of the $j$-th CV in the $i$-th reference frame and ${x}_{j}$ is the value of the $j$-th CV in the current frame. $\lambda $ is a parameter that smooths the variation of $s$ and $z$. Note that the index $i$ ranges from $0$ to $N-1$, and the definition of $s$ is normalized by $1/(N-1)$. Consequently, the range of $s$ is $[0:1]$.
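The following Python sketch evaluates both quantities for a toy path. It is an illustration only: the function name `arithmetic_path_cv` is hypothetical, $s$ is computed as the exponentially weighted mean of the frame index $i$ divided by $N-1$ (consistent with the normalization described above), and the periodicity of individual CVs such as dihedrals is ignored.

```python
import math

def arithmetic_path_cv(x, frames, weights, lam):
    """Return (s, z) for current CV values x, reference frames, per-CV weights c_j and lambda."""
    # One exponential term per reference frame: exp(-lambda * sum_j c_j^2 (x_j - x_ij)^2)
    terms = [
        math.exp(-lam * sum((c * (xj - xij)) ** 2
                            for c, xj, xij in zip(weights, x, frame)))
        for frame in frames
    ]
    total = sum(terms)
    n_frames = len(frames)
    # Frame indices run from 0 to N-1, so s lies in [0, 1]
    s = sum(i * t for i, t in enumerate(terms)) / ((n_frames - 1) * total)
    z = -math.log(total) / lam
    return s, z

# Toy path of three equidistant frames in a 2-dimensional CV space
frames = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
s, z = arithmetic_path_cv((1.0, 0.0), frames, weights=(1.0, 1.0), lam=10.0)
print(s, z)  # s is 0.5 by symmetry; z is slightly negative
```

On the middle frame the two outer frames contribute equally, so $s$ is exactly halfway along the path, while $z$ is slightly below zero because several frames contribute to the sum inside the logarithm.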
This colvar component computes the $s$ variable.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
This colvar component computes the $z$ variable. Options are the same as in 5.11.1.
The usage of azpathCV and aspathCV is illustrated below:
colvar {
# Progress along the path
name as
# Path defined by the CV space of two dihedral angles
aspathCV {
pathFile ./path.txt
weights {1.0 1.0}
lambda 0.005
dihedral {
name 001
group1 {atomNumbers {5}}
group2 {atomNumbers {7}}
group3 {atomNumbers {9}}
group4 {atomNumbers {15}}
}
dihedral {
name 002
group1 {atomNumbers {7}}
group2 {atomNumbers {9}}
group3 {atomNumbers {15}}
group4 {atomNumbers {17}}
}
}
}
colvar {
# Distance from the path
name az
azpathCV {
pathFile ./path.txt
weights {1.0 1.0}
lambda 0.005
dihedral {
name 001
group1 {atomNumbers {5}}
group2 {atomNumbers {7}}
group3 {atomNumbers {9}}
group4 {atomNumbers {15}}
}
dihedral {
name 002
group1 {atomNumbers {7}}
group2 {atomNumbers {9}}
group3 {atomNumbers {15}}
group4 {atomNumbers {17}}
}
}
}
The path collective variables defined by Branduardi et al. [11] are based on RMSDs in Cartesian coordinates. Denoting by ${d}_{i}$ the RMSD between the current set of Cartesian coordinates and those of image number $i$ of the path:
$$s=\frac{1}{N-1}\frac{\sum _{i=1}^{N}(i-1)\exp\left(-\lambda {d}_{i}^{2}\right)}{\sum _{i=1}^{N}\exp\left(-\lambda {d}_{i}^{2}\right)}$$ | (18) |
$$z=-\frac{1}{\lambda}\ln\left(\sum _{i=1}^{N}\exp\left(-\lambda {d}_{i}^{2}\right)\right)$$ | (19) |
where $\lambda $ is the smoothing parameter.
These coordinates are implemented as Tcl-scripted combinations of rmsd components. The implementation is available as file colvartools/pathCV.tcl, and an example is provided in file examples/10_pathCV.namd of the Colvars public repository. It implements an optimization procedure, whereby the distance to a given image is only calculated if its contribution to the sum is larger than a user-defined tolerance parameter. All distances are calculated every freq timesteps to update the list of nearby images.
This CV computes a special case of Eq. 16, where $\mathbf{x}_{j}$ is the $j$-th atomic position and $\mathbf{x}_{i,j}$ is the $j$-th atomic position of the $i$-th reference frame. The subtraction $\mathbf{x}_{j}-\mathbf{x}_{i,j}$ is actually calculated as $\mathbf{x}_{j}-{R}_{i}\mathbf{x}_{i,j}$, where ${R}_{i}$ is a $3\times 3$ rotation matrix that minimizes the RMSD between the current atomic positions of the simulation and the $i$-th reference frame. Bold $\mathbf{x}_{j}$ is used since an atomic position is a vector.
For NAMD and VMD users, this component can be regarded as an improved C++ implementation of the PCV progress component of the Tcl-scripted version (see 5.11.3).
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
Similar to aspath, this CV computes a special case of Eq. 17, and shares the same options as aspath.
The usage of azpath and aspath is illustrated below:
colvar {
# Progress along the path
name as
# The path is defined by 5 reference frames (from string-00.pdb to string-04.pdb)
# Use atomic coordinate from atoms 1, 2 and 3 to compute the path
aspath {
atoms {atomnumbers { 1 2 3 }}
refPositionsFile1 string-00.pdb
refPositionsFile2 string-01.pdb
refPositionsFile3 string-02.pdb
refPositionsFile4 string-03.pdb
refPositionsFile5 string-04.pdb
}
}
colvar {
# Distance from the path
name az
# The path is defined by 5 reference frames (from string-00.pdb to string-04.pdb)
# Use atomic coordinate from atoms 1, 2 and 3 to compute the path
azpath {
atoms {atomnumbers { 1 2 3 }}
refPositionsFile1 string-00.pdb
refPositionsFile2 string-01.pdb
refPositionsFile3 string-02.pdb
refPositionsFile4 string-03.pdb
refPositionsFile5 string-04.pdb
}
}
This colvar component computes a non-linear combination of other scalar colvar components, where the transformation is defined by a dense neural network.[13] The network can be optimized using any framework, and its parameters are provided to Colvars in plain text files, as detailed below. An example Python script to export the parameters of a TensorFlow model is provided in colvartools/extract_weights_biases.py in the Colvars source tree.
The output of the $j$-th node of a $k$-th layer that has ${N}_{k}$ nodes is computed by
$${y}_{k,j}={f}_{k}\left(\sum _{i=1}^{{N}_{k-1}}{w}_{(k,j),(k-1,i)}{y}_{k-1,i}+{b}_{k,j}\right),$$ | (20) |
where ${f}_{k}$ is the activation function of the $k$-th layer, ${w}_{(k,j),(k-1,i)}$ is the weight of the $j$-th node with respect to the $i$-th output of the previous layer, and ${b}_{k,j}$ is the bias of the $j$-th node of the $k$-th layer.
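The layer formula above can be written as a short forward pass. This Python sketch is illustrative only: the names `dense_layer` and `forward` are hypothetical, and the in-memory layout of the weights (one row per node of the current layer) is an assumption that says nothing about the format of the weight files read by Colvars.

```python
import math

def dense_layer(y_prev, weights, biases, activation=math.tanh):
    """Apply y_j = f(sum_i w[j][i] * y_prev[i] + b[j]) for every node j of one layer."""
    return [
        activation(sum(w * y for w, y in zip(row, y_prev)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Propagate the inputs through a list of (weights, biases, activation) layers."""
    y = list(inputs)
    for weights, biases, activation in layers:
        y = dense_layer(y, weights, biases, activation)
    return y

# Two inputs, a hidden layer of two tanh nodes, and one linear output node
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], math.tanh),
    ([[1.0, 1.0]], [0.0], lambda v: v),
]
output = forward([0.5, 0.5], layers)
print(output[0])  # equals tanh(0.5), since the first hidden node outputs tanh(0) = 0
```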
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
An example of configuration using NeuralNetwork is shown below:
colvar {
# Define a neural network with 2 layers
# The inputs are two torsion angles
# and the first node at the output layer is used as the final CV
name nn_output_1
NeuralNetwork {
output_component 0
layer1_WeightsFile dense_1_weights.txt
layer1_BiasesFile dense_1_biases.txt
layer1_activation tanh
layer2_WeightsFile dense_2_weights.txt
layer2_BiasesFile dense_2_biases.txt
layer2_activation tanh
# The component coefficient is used for normalization
componentCoeff 180.0
dihedral {
name 001
# normalization factor 1.0/180.0
componentCoeff 0.00555555555555555556
group1 {atomNumbers {5}}
group2 {atomNumbers {7}}
group3 {atomNumbers {9}}
group4 {atomNumbers {15}}
}
dihedral {
name 002
# normalization factor 1.0/180.0
componentCoeff 0.00555555555555555556
group1 {atomNumbers {7}}
group2 {atomNumbers {9}}
group3 {atomNumbers {15}}
group4 {atomNumbers {17}}
}
}
}
Volumetric maps of the Cartesian coordinates, typically defined on a mesh grid along the three Cartesian axes, may be used to define collective variables. Please cite [14] when using this implementation of collective variables based on volumetric maps.
Given a function of the Cartesian coordinates $\varphi \left(x\right)=\varphi (x,y,z)$, a mapTotal collective variable component $\Phi \left(X\right)$ is defined as the sum of the values of the function $\varphi \left(x\right)$ evaluated at the coordinates of each atom, ${x}_{i}$:
$$\Phi \left(X\right)=\sum _{i=1}^{N}{w}_{i}\varphi \left({x}_{i}\right)$$ | (21) |
where ${w}_{i}$ are weights assigned to each atom (1 by default). This formulation makes it possible, for example, to “count” the number of atoms within a region of space by using a positive-valued function $\varphi \left(x\right)$, e.g. the number of water molecules in a hydrophobic cavity [14].
Because the volumetric map itself and the atoms affected by it are defined externally to Colvars, this component has a very limited number of keywords.
List of keywords (see also 5.2, 5.14, 5.15 and 5.17 for additional options):
To study processes that involve changes in shape of a macromolecular aggregate (for example, deformations of lipid membranes) it is useful to define collective variables based on more than one volumetric map at a time, measuring the relative similarity with each map while still achieving correct thermodynamic sampling of each state.
This is achieved by combining multiple mapTotal components, each based on a differently-shaped volumetric map, into a single collective variable $\xi $. To track transitions between states, the contribution of each map to $\xi $ should be discriminated from the others, for example by assigning to it a different weight. The “Multi-Map” progress variable [14] uses a weighted sum of these components, with linearly-increasing weights:
$$\xi \left(X\right)=\sum _{k=1}^{K}k{\Phi}_{k}\left(X\right)=\sum _{k=1}^{K}k\sum _{i=1}^{N}{\varphi}_{k}\left({x}_{i}\right)$$ | (22) |
where $K$ is the number of maps employed and each ${\Phi}_{k}$ is a mapTotal component.
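The sums defining a mapTotal component and the Multi-Map variable can be sketched in a few lines of Python. The names `map_total`, `multi_map` and `in_box` are hypothetical, and the box indicator used here stands in for a real volumetric map only for illustration: actual maps are interpolated from a grid and vary smoothly, whereas an indicator function is not differentiable.

```python
def map_total(positions, phi, weights=None):
    """Sum of w_i * phi(x_i) over all atoms (weights default to 1)."""
    if weights is None:
        weights = [1.0] * len(positions)
    return sum(w * phi(p) for w, p in zip(weights, positions))

def multi_map(positions, maps):
    """Weighted sum of map totals with linearly increasing weights k = 1..K."""
    return sum(k * map_total(positions, phi) for k, phi in enumerate(maps, start=1))

def in_box(lo, hi):
    """Hypothetical stand-in for a map: 1 inside an axis-aligned box, 0 outside."""
    return lambda p: 1.0 if all(l <= c <= h for l, c, h in zip(lo, p, hi)) else 0.0

positions = [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5), (5.0, 5.0, 5.0)]
phi_1 = in_box((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
phi_2 = in_box((0.0, 0.0, 0.0), (2.0, 2.0, 2.0))
print(map_total(positions, phi_1))           # 1.0 (one atom inside the small box)
print(multi_map(positions, [phi_1, phi_2]))  # 5.0 = 1 * 1.0 + 2 * 2.0
```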
Here is a link to the Multi-Map tutorial page: https://colvars.github.io/multi-map/multi-map.html
An example configuration for illustration purposes is also included below.
Example: transitions between macromolecular shapes using volumetric maps.
A series of map files, each representing a different shape, is loaded into NAMD:
mGridForce yes
for { set k 1 } { $k <= $K } { incr k } {
mGridForcePotFile Shape_$k map_$k.dx # Density map of the k-th state
mGridForceFile Shape_$k atoms.pdb # PDB file used for atom selection
mGridForceCol Shape_$k O # Use the occupancy column of the PDB file atoms.pdb
mGridForceChargeCol Shape_$k B # Use beta as “charge" (default: electric charge)
mGridForceScale Shape_$k 0.0 0.0 0.0 # Do not use GridForces for this map
}
The GridForces maps thus loaded are then used to define the Multi-Map collective variable, with coefficients ${\xi}_{k}=k$ [14]:
# Collect the definition of all components into one string
set components ""
for { set k 1 } { $k <= $K } { incr k } {
# Append one mapTotal block per map, weighted by its index k
set components "${components}
mapTotal {
mapName Shape_$k
componentCoeff $k
}
"
}
# Use this string to define the variable
cv config "
colvar {
name shapes
${components}
}"
The above example illustrates a use case where a weighted sum (i.e. a linear combination) is used to define a single variable from multiple components. Depending on the problem under study, non-linear functions may be more appropriate. These may be defined as custom functions if implemented (see 5.18), or as scripted functions (see 5.19).
The following options can be used for any of the above colvar components in order to obtain a polynomial combination or any user-supplied function provided by scriptedFunction.
Certain components, such as dihedral or spinAngle, compute angles that lie in a periodic interval between $-180^{\circ}$ and $180^{\circ}$. When computing pairwise distances between values of those angles (e.g. for the sake of computing restraint potentials, or sampling PMFs), periodicity is taken into account by following the minimum-image convention.
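The minimum-image convention for periodic values can be illustrated with a short Python sketch. The function name `periodic_difference` is hypothetical, and the choice of the half-open interval (-period/2, period/2] for the result is an assumption made for this example.

```python
def periodic_difference(a, b, period=360.0):
    """Return a - b folded into the interval (-period/2, period/2]."""
    d = (a - b) % period   # Python's % returns a value in [0, period) for period > 0
    if d > period / 2.0:
        d -= period
    return d

print(periodic_difference(170.0, -170.0))  # -20.0, not 340.0
print(periodic_difference(-170.0, 170.0))  # 20.0
print(periodic_difference(10.0, 350.0))    # 20.0
```

A restraint centered at 170° therefore sees a value of -170° as being 20° away, rather than 340° away.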
Additionally, several other components, such as distanceZ, support optional periodicity if this is provided in the configuration.
The following keywords can be used within periodic components, within custom variables (5.18), or within scripted variables (5.19).
Note: using linear/polynomial combinations of periodic components (see 5.17), or other custom or scripted functions, may invalidate the periodicity. Use such combinations carefully: estimate the range of possible values of each component in a given simulation, and make use of wrapAround to limit this problem whenever possible.
When one of the following components is used, the defined colvar returns a value that is not a scalar number:
The distance between two 3-dimensional unit vectors is computed as the angle between them. The distance between two quaternions is computed as the angle between the two 4-dimensional unit vectors: because the orientation represented by $q$ is the same as the one represented by $-q$, distances between two quaternions are computed considering the closest of the two symmetric images.
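Both distance definitions can be sketched in Python as follows. The function names are hypothetical, the result is returned in radians as an assumption made for this example, and taking the absolute value of the dot product is one way to select the closest of the two symmetric images $q$ and $-q$.

```python
import math

def unit_vector_distance(u, v):
    """Angle (radians) between two unit vectors of the same dimension."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against round-off

def quaternion_distance(q1, q2):
    """Angle (radians) between two unit quaternions, treating q and -q as the same orientation."""
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    return math.acos(min(1.0, dot))

identity = (1.0, 0.0, 0.0, 0.0)
c = math.cos(math.pi / 4.0)
quarter_turn_z = (c, 0.0, 0.0, c)                 # 90 degrees around z
flipped = tuple(-x for x in quarter_turn_z)       # same orientation, opposite sign
print(quaternion_distance(quarter_turn_z, flipped))   # 0.0
print(quaternion_distance(identity, quarter_turn_z))  # about 0.785 (pi/4)
```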
Non-scalar components carry the following restrictions:
Note: while these restrictions apply to individual colvars based on non-scalar components, no limit is set to the number of scalar colvars. To compute multi-dimensional histograms and PMFs, use sets of scalar colvars of arbitrary size.
In addition to the restrictions due to the type of value computed (scalar or non-scalar), a final restriction can arise when calculating total forces (outputTotalForce option or application of an abf bias). Total forces are currently available only for the following components: distance, distanceZ, distanceXY, angle, dihedral, rmsd, eigenvector and gyration.
To extend the set of possible definitions of colvars $\xi \left(r\right)$, multiple components ${q}_{i}\left(r\right)$ can be summed with the formula:
$$\xi \left(r\right)=\sum _{i}{c}_{i}{\left[{q}_{i}\left(r\right)\right]}^{{n}_{i}}$$ | (23) |
where each component appears with a unique coefficient ${c}_{i}$ (1.0 by default) and a positive integer exponent ${n}_{i}$ (1 by default).
Any set of components can be combined within a colvar, provided that they return the same type of values (scalar, unit vector, vector, or quaternion). By default, the colvar is the sum of its components. Linear or polynomial combinations (following equation (23)) can be obtained by setting the following parameters, which are common to all components:
Example: To define the average of a colvar across different parts of the system, simply define within the same colvar block a series of components of the same type (applied to different atom groups), and assign to each component a componentCoeff of $1/N$.
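The polynomial combination of scalar components, including the averaging example, can be sketched numerically in Python. The function name `combine` is hypothetical; its `coeffs` and `exponents` arguments play the role of the per-component coefficients $c_i$ and exponents $n_i$.

```python
def combine(values, coeffs=None, exponents=None):
    """Return sum_i c_i * q_i ** n_i, with c_i = 1.0 and n_i = 1 by default."""
    n = len(values)
    coeffs = coeffs or [1.0] * n
    exponents = exponents or [1] * n
    return sum(c * q ** e for c, q, e in zip(coeffs, values, exponents))

distances = [2.0, 4.0, 6.0]                 # three components of the same type
print(combine(distances))                    # 12.0: default is the plain sum
print(combine(distances, coeffs=[1.0 / 3.0] * 3))  # about 4.0: the average
print(combine([2.0, 3.0], coeffs=[1.0, 0.5], exponents=[2, 1]))  # 5.5
```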
Collective variables may be defined by specifying a custom function of multiple components, i.e. an analytical expression that is more general than the linear combinations described in 5.17. Such expression is parsed and calculated by Lepton, the lightweight expression parser written by Peter Eastman (https://simtk.org/projects/lepton) that produces efficient evaluation routines for both the expression and its derivatives. Although Lepton is generally available in most applications and builds where Colvars is included, it is best to check section 10 to confirm.
The expression may use the collective variable components as variables, referred to by their user-defined name. Scalar elements of vector components may be accessed by appending a 1-based index to their name, as in the example below. When implementing generic functions of Cartesian coordinates rather than functions of existing components, the cartesian component may be particularly useful. A scalar-valued custom variable may be manually defined as periodic by providing the keyword period, and the optional keyword wrapAround, with the same meaning as in periodic components (see 5.15 for details). A vector variable may be defined by specifying the customFunction parameter several times: each expression defines one scalar element of the vector colvar, as in this example:
colvar {
name custom
# A 2-dimensional vector function of a scalar x and a 3-vector r
customFunction cos(x) * (r1 + r2 + r3)
customFunction sqrt(r1 * r2)
distance {
name x
group1 { atomNumbers 1 }
group2 { atomNumbers 50 }
}
distanceVec {
name r
group1 { atomNumbers 10 11 12 }
group2 { atomNumbers 20 21 22 }
}
}
Numeric constants may be given in either decimal or exponential form (e.g. 3.12e-2). An expression
may be followed by definitions for intermediate values that appear in the expression, separated by
semicolons. For example, the expression:
a^2 + a*b + b^2; a = a1 + a2; b = b1 + b2
is exactly equivalent to:
(a1 + a2)^2 + (a1 + a2) * (b1 + b2) + (b1 + b2)^2.
The definition of an intermediate value may itself involve other intermediate values. All uses of a value
must appear before that value's definition.
Lepton supports the usual arithmetic operators +, -, *, /, and ^ (power), as well as the following functions:
sqrt | Square root |
exp | Exponential |
log | Natural logarithm |
erf | Error function |
erfc | Complementary error function |
sin | Sine (angle in radians) |
cos | Cosine (angle in radians) |
sec | Secant (angle in radians) |
csc | Cosecant (angle in radians) |
tan | Tangent (angle in radians) |
cot | Cotangent (angle in radians) |
asin | Inverse sine (in radians) |
acos | Inverse cosine (in radians) |
atan | Inverse tangent (in radians) |
atan2 | Two-argument inverse tangent (in radians) |
sinh | Hyperbolic sine |
cosh | Hyperbolic cosine |
tanh | Hyperbolic tangent |
abs | Absolute value |
floor | Floor |
ceil | Ceiling |
min | Minimum of two values |
max | Maximum of two values |
delta | $delta\left(x\right)=1$ if $x=0$, 0 otherwise |
step | $step\left(x\right)=0$ if $x<0$, 1 if $x>=0$ |
select | $select(x,y,z)=z$ if $x=0$, $y$ otherwise |
When scripting is supported (default in VMD), a colvar may be defined as a scripted function of its components, rather than a linear or polynomial combination. When implementing generic functions of Cartesian coordinates rather than functions of existing components, the cartesian component may be particularly useful. A scalar-valued scripted variable may be manually defined as periodic by providing the keyword period, and the optional keyword wrapAround, with the same meaning as in periodic components (see 5.15 for details).
An example of an elaborate scripted colvar is given in example 10, in the form of path-based collective variables as defined by Branduardi et al. [11] (Section 5.11.3).
Many algorithms require the definition of two boundaries and a bin width for each colvar, which are necessary to compute discrete “states" for a collective variable's otherwise continuous values. The following keywords define these parameters for a specific variable, and will be used by all biases that refer to that variable unless otherwise specified.
Further, many restraints such as harmonic potentials (7.7), harmonic walls (7.9) and linear restraints (7.10) also use the width parameter to define the expected fluctuations of the colvar, so that the force constant can be expressed in terms of this unit. This is most useful with multi-dimensional restraints acting on variables that have very different units (for example, working with length units and degrees ${}^{\circ}$ simultaneously): a single force constant can be used for all, which is converted to the respective unit of each variable when forces are applied (the converted values are printed at initialization time).
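The unit conversion can be made concrete with a short Python sketch. It assumes the harmonic form $\frac{1}{2}k{\left((\xi -{\xi}_{0})/w\right)}^{2}$, which this manual uses for wall restraints; the function name and the numerical values are hypothetical.

```python
def physical_force_constant(k, width):
    """Convert a force constant expressed per squared width unit
    into the physical unit of the variable (energy / unit^2)."""
    return k / width**2


k = 10.0  # hypothetical force constant shared by both variables, kcal/mol

# A distance with width 0.5 Angstrom and an angle with width 5 degrees
k_distance = physical_force_constant(k, 0.5)  # kcal/mol/Angstrom^2
k_angle = physical_force_constant(k, 5.0)     # kcal/mol/degree^2
```

With these numbers, a deviation of one width costs the same energy (5 kcal/mol) for both variables, although the physical force constants differ by a factor of 100.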
The following options enable extended-system dynamics, where a colvar is coupled to an additional degree of freedom (fictitious particle) by a harmonic spring. This extended coordinate masks the colvar and replaces it transparently from the perspective of biasing and analysis methods. Biasing forces are then applied to the extended degree of freedom, and the actual geometric colvar (function of Cartesian coordinates) only feels the force from the harmonic spring. This is particularly useful when combined with an abf bias to perform eABF simulations (7.3).
Note that for some biases (harmonicWalls, histogram), this masking behavior is controlled by the keyword bypassExtendedLagrangian. Specifically for harmonicWalls, the default behavior is to bypass extended Lagrangian coordinates and act directly on the actual colvars.
Run-time calculations of statistical properties that depend explicitly on time can be performed for individual collective variables. Currently, several types of time correlation functions, running averages and running standard deviations are implemented. For run-time computation of histograms, please see the histogram bias (7.12).
To define collective variables, atoms are usually selected as groups. Each group is defined using an identifying keyword that is unique in the context of the specific colvar component (e.g. for a distance component, the two groups are identified by the group1 and group2 keywords).
The group's identifying keyword is followed by a brace-delimited block containing selection keywords and other parameters, one of which is name:
Other keywords are documented in the following sections.
In the example below, the gyration component uses the identifying keyword atoms to define its associated group, which is defined based on the index group named “Protein-H". Optionally, the group is also given the unique name “my_protein", so that atom groups defined later in the Colvars configuration may refer to it.
colvar {
name rgyr
gyration {
atoms {
name my_protein
indexGroup Protein-H
}
}
}
Selection keywords may be used individually or in combination with each other, and each can be
repeated any number of times. Selection is incremental: each keyword adds the corresponding
atoms to the selection, so that different sets of atoms can be combined. However, atoms
included by multiple keywords are only counted once. Below is an example configuration
for an atom group called “atoms". Note: this is an unusually varied combination of selection
keywords, demonstrating how they can be combined together: most simulations only use one of
them.
atoms {
# add atoms 1 and 3 to this group (note: first atom in the system is 1)
atomNumbers {
1 3
}
# add atoms starting from 20 up to and including 50
atomNumbersRange 20-50
# add all the atoms with occupancy 2 in the file atoms.pdb
atomsFile atoms.pdb
atomsCol O
atomsColValue 2.0
# add all the C-alphas within residues 11 to 20 of segments "PR1" and "PR2"
psfSegID PR1 PR2
atomNameResidueRange CA 11-20
atomNameResidueRange CA 11-20
# add index group (requires a .ndx file to be provided globally)
indexGroup Water
}
The resulting selection includes atoms 1 and 3, those between 20 and 50, the ${C}_{\alpha}$ atoms between residues 11 and 20 of the two segments PR1 and PR2, and those in the index group called “Water". The indices of this group are read from the file provided by the global keyword indexFile.
In the current version, the Colvars module does not manipulate VMD atom selections directly: however, these can be converted to atom groups within the Colvars configuration string, using selection keywords such as atomNumbers. The complete list of selection keywords available in VMD is:
The following options define an automatic calculation of an optimal translation (centerToReference) or optimal rotation (rotateToReference), that superimposes the positions of this group onto a provided set of reference coordinates. Alternatively, centerToOrigin applies a translation to place the geometric center of the group at (0, 0, 0). This makes it possible, for example, to effectively remove the effects of molecular tumbling and of diffusion from certain colvars. Given the set of atomic positions ${x}_{i}$, the colvar $\xi $ can be defined on a set of roto-translated positions ${x}_{i}^{\prime}=R({x}_{i}-{x}^{C})+{x}^{ref}$, where ${x}^{C}$ is the geometric center of the ${x}_{i}$, $R$ is the optimal rotation matrix to the reference positions and ${x}^{ref}$ is the geometric center of the reference positions.
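The roto-translation ${x}_{i}^{\prime}=R({x}_{i}-{x}^{C})+{x}^{ref}$ can be illustrated with a deliberately simplified Python sketch in two dimensions, where the least-squares rotation angle has a closed form. This is only an illustration of the formula: the function name is hypothetical, and the module itself works on three-dimensional coordinates.

```python
import math


def fit_to_reference(positions, reference):
    """Return 2D positions centered, rotated and moved onto the reference.

    Illustrative 2D analogue of centerToReference + rotateToReference.
    """
    n = len(positions)
    cx = sum(x for x, _ in positions) / n   # geometric center x^C
    cy = sum(y for _, y in positions) / n
    rx = sum(x for x, _ in reference) / n   # reference center x^ref
    ry = sum(y for _, y in reference) / n
    dot = cross = 0.0
    for (x, y), (u, v) in zip(positions, reference):
        px, py, qx, qy = x - cx, y - cy, u - rx, v - ry
        dot += px * qx + py * qy
        cross += px * qy - py * qx
    # Angle that maximizes the overlap between rotated and reference positions
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * (x - cx) - s * (y - cy) + rx,
             s * (x - cx) + c * (y - cy) + ry) for x, y in positions]


reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
# Same shape, rotated by 90 degrees and translated by (5, -3)
moved = [(5.0, -3.0), (5.0, -2.0), (3.0, -3.0)]
fitted = fit_to_reference(moved, reference)
```

Because `moved` is a rigid-body copy of `reference`, the fitted coordinates coincide with the reference up to rounding: tumbling and diffusion have been removed.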
Components that are defined based on pairwise distances are naturally invariant under global roto-translations. Other components are instead affected by global rotations or translations: however, they can be made invariant if they are expressed in the frame of reference of a chosen group of atoms, using the centerToReference and rotateToReference options. Finally, a few components are defined by convention using a roto-translated frame (e.g. the minimal RMSD): for these components, centerToReference and rotateToReference are enabled by default. In typical applications, the default settings result in the expected behavior.
Warning on rotating frames of reference and periodic boundary conditions. rotateToReference affects coordinates that depend on minimum-image distances in periodic boundary conditions (PBC). After rotation of the coordinates, the periodic cell vectors become irrelevant: the rotated system is effectively non-periodic. A safe way to handle this is to ensure that the relevant inter-group distance vectors remain smaller than the half-size of the periodic cell. If this is not desirable, one should avoid the rotating frame of reference, and apply orientational restraints to the reference group instead, in order to keep the orientation of the reference group consistent with the orientation of the periodic cell.
Warning on rotating frames of reference and ABF. Note that centerToReference and rotateToReference may affect the Jacobian derivative of colvar components in a way that is not taken into account by default. Be careful when using these options in ABF simulations or when using total force values.
The following example illustrates the use of fittingGroup as part of a Distance to Bound Configuration (DBC) coordinate for use in ligand restraints for binding affinity calculations.[16] The group called “atoms" describes coordinates of a ligand's atoms, expressed in a moving frame of reference tied to a binding site (here within a protein). An optimal roto-translation is calculated automatically by fitting the C${}_{\alpha}$ trace of the rest of the protein onto the coordinates provided by a PDB file. To define a DBC coordinate, this atom group would be used within an rmsd function.
# Example: defining a group "atoms" (the ligand) whose coordinates are expressed
# in a roto-translated frame of reference defined by a second group (the receptor)
atoms {
atomNumbers 1 2 3 4 5 6 7 # atoms of the ligand (1-based)
centerToReference yes
rotateToReference yes
fittingGroup {
# define the frame by fitting alpha carbon atoms
# in 2 protein segments close to the site
psfSegID PROT PROT
atomNameResidueRange CA 1-40
atomNameResidueRange CA 59-100
}
refPositionsFile all.pdb # can be the entire system
}
The following options have default values appropriate for the vast majority of applications, and are only provided to support rare, special cases.
In simulations with periodic boundary conditions (PBCs), Colvars computes all distances between two points following the nearest-image convention, using PBC parameters provided by VMD. However, many common variables rely on a consistent definition of the center of mass or geometry of a group of atoms. This requires the use of unwrapped coordinates, which are not subject to “jumps" when they diffuse across periodic boundaries.
In general, internal coordinate wrapping by VMD does not affect the calculation of colvars if each atom group satisfies one or more of the following:
If none of these conditions are met, wrapping may affect the calculation of collective variables: a possible solution is to use pbc wrap or pbc unwrap (or the alternative qwrap or qunwrap: https://github.com/jhenin/qwrap) prior to processing a trajectory with the Colvars module.
In simulations performed with MD simulation engines such as GROMACS, LAMMPS or NAMD, the computation of energy and forces is distributed (i.e., parallelized) over multiple nodes, as well as over the CPU/GPU cores of each node. When Colvars is enabled, atomic coordinates are collected on a single CPU core, where collective variables and their biases are computed. This means that in the case of simulations that are already being run over large numbers of nodes, or inside a GPU, a Colvars calculation may produce a significant overhead. This overhead comes from the combined cost of two operations: transmitting atomic coordinates, and computing functions of the same.
Performance can be improved in multiple ways:
A biasing or analysis method can be applied to existing collective variables by using the following configuration:
<biastype> {
name <name>
colvars <xi1> <xi2> ...
<parameters>
}
The keyword <biastype> indicates the method of choice. There can be multiple instances of the same method, e.g. using multiple harmonic blocks allows defining multiple restraints.
All biasing and analysis methods implemented recognize the following options:
The methods implemented here provide a variety of estimators of conformational free-energies. These are carried out at run-time, or with the use of post-processing tools over the generated output files. The specifics of each estimator are discussed in the documentation of each biasing or analysis method.
A special case is the traditional thermodynamic integration (TI) method, used for example to compute potentials of mean force (PMFs). Most types of restraints (7.7, 7.9, 7.10, ...) as well as metadynamics (7.5) can optionally use TI alongside their own estimator, based on the keywords documented below.
In adaptive biasing force (ABF) (7.2) the above keywords are not recognized, because their functionality is either included already (conventional ABF) or not available (extended-system ABF).
For a full description of the Adaptive Biasing Force method, see reference [17]. For details about this implementation, see references [18] and [19]. When publishing research that makes use of this functionality, please cite references [17] and [19].
An alternate usage of this feature is the application of custom tabulated biasing potentials to one or more colvars. See inputPrefix and updateBias below.
Combining ABF with the extended Lagrangian feature (5.22) of the variables produces the extended-system ABF variant of the method (7.3).
ABF is based on the thermodynamic integration (TI) scheme for computing free energy profiles. The free energy as a function of a set of collective variables $\boldsymbol{\xi}={\left({\xi}_{i}\right)}_{i\in [1,n]}$ is defined from the canonical distribution of $\boldsymbol{\xi}$, $\mathcal{P}\left(\boldsymbol{\xi}\right)$:
$$A\left(\boldsymbol{\xi}\right)=-\frac{1}{\beta}\ln \mathcal{P}\left(\boldsymbol{\xi}\right)+{A}_{0}$$ | (24) |
In the TI formalism, the free energy is obtained from its gradient, which is generally calculated in the form of the average of a force $\boldsymbol{F}_{\xi}$ exerted on $\boldsymbol{\xi}$, taken over an iso-$\boldsymbol{\xi}$ surface:
$${\boldsymbol{\nabla}}_{\xi}A\left(\boldsymbol{\xi}\right)={\left\langle -\boldsymbol{F}_{\xi}\right\rangle}_{\boldsymbol{\xi}}$$ | (25) |
Several formulae that take the form of (25) have been proposed. This implementation relies partly on the classic formulation [20], and partly on a more versatile scheme originating in a work by Ruiz-Montero et al. [21], generalized by den Otter [22] and extended to multiple variables by Ciccotti et al. [23]. Consider a system subject to constraints of the form ${\sigma}_{k}\left(\boldsymbol{x}\right)=0$. Let ${\left(\boldsymbol{v}_{i}\right)}_{i\in [1,n]}$ be arbitrarily chosen vector fields (${\mathbb{R}}^{3N}\to {\mathbb{R}}^{3N}$) verifying, for all $i$, $j$, and $k$:
$$\begin{array}{rcll}\boldsymbol{v}_{i}\cdot {\boldsymbol{\nabla}}_{x}\,{\xi}_{j}& =& {\delta}_{ij}& \text{(26)}\\ \boldsymbol{v}_{i}\cdot {\boldsymbol{\nabla}}_{x}\,{\sigma}_{k}& =& 0& \text{(27)}\end{array}$$
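then the following holds [23]:
$$\frac{\partial A}{\partial {\xi}_{i}}={\left\langle \boldsymbol{v}_{i}\cdot {\boldsymbol{\nabla}}_{x}V-\frac{1}{\beta}\,{\boldsymbol{\nabla}}_{x}\cdot \boldsymbol{v}_{i}\right\rangle}_{\boldsymbol{\xi}}$$ | (28) |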
where $V$ is the potential energy function. $\boldsymbol{v}_{i}$ can be interpreted as the direction along which the force acting on variable ${\xi}_{i}$ is measured, whereas the second term in the average corresponds to the geometric entropy contribution that appears as a Jacobian correction in the classic formalism [20]. Condition (26) states that the direction along which the total force on ${\xi}_{i}$ is measured is orthogonal to the gradient of ${\xi}_{j}$ for every $j\ne i$, which means that the force measured on ${\xi}_{i}$ does not act on ${\xi}_{j}$.
Equation (27) implies that constraint forces are orthogonal to the directions along which the free energy gradient is measured, so that the measurement is effectively performed on unconstrained degrees of freedom.
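In the framework of ABF, $\boldsymbol{F}_{\xi}$ is accumulated in bins of finite size $\delta \xi $, thereby providing an estimate of the free energy gradient according to equation (25). The biasing force applied along the collective variables to overcome free energy barriers is calculated as:
$${\boldsymbol{F}}^{ABF}=\alpha \left({N}_{\xi}\right)\times {\boldsymbol{\nabla}}_{x}\widetilde{A}\left(\boldsymbol{\xi}\right)$$ | (29) |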
where ${\boldsymbol{\nabla}}_{x}\widetilde{A}$ denotes the current estimate of the free energy gradient at the current point $\boldsymbol{\xi}$ in the collective variable subspace, and $\alpha \left({N}_{\xi}\right)$ is a scaling factor that is ramped from 0 to 1 as the local number of samples ${N}_{\xi}$ increases to prevent non-equilibrium effects in the early phase of the simulation, when the gradient estimate has a large variance. See the fullSamples parameter below for details.
As sampling of the phase space proceeds, the estimate ${\boldsymbol{\nabla}}_{x}\widetilde{A}$ is progressively refined. The biasing force introduced in the equations of motion guarantees that in the bin centered around $\boldsymbol{\xi}$, the forces acting along the selected collective variables average to zero over time. Eventually, as the underlying free energy surface is canceled by the adaptive bias, evolution of the system along $\boldsymbol{\xi}$ is governed mainly by diffusion. Although this implementation of ABF can in principle be used in arbitrary dimension, a higher-dimension collective variable space is likely to be difficult to sample and visualize. Most commonly, the number of variables is one or two, sometimes three.
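The accumulation and ramping logic described above can be sketched in Python for a single variable. This is a toy model under stated assumptions, not the module's implementation: the class name is hypothetical, `total_force` stands for the instantaneous force measured along $\xi $, and the linear ramp reaching 1 at `full_samples` is an assumed form (the actual ramp is governed by fullSamples).

```python
class ABF1D:
    """Toy one-dimensional ABF accumulator (illustration only)."""

    def __init__(self, lower, width, n_bins, full_samples=200):
        self.lower, self.width = lower, width
        self.force_sum = [0.0] * n_bins
        self.count = [0] * n_bins
        self.full_samples = full_samples

    def bin_index(self, xi):
        return int((xi - self.lower) // self.width)

    def gradient(self, i):
        """Current estimate of dA/dxi in bin i: minus the average force."""
        return -self.force_sum[i] / self.count[i] if self.count[i] else 0.0

    def update(self, xi, total_force):
        """Accumulate one force sample and return the biasing force."""
        i = self.bin_index(xi)
        if not 0 <= i < len(self.count):
            return 0.0  # outside the grid: no bias in this sketch
        self.force_sum[i] += total_force
        self.count[i] += 1
        # Assumed linear ramp of the scaling factor alpha from 0 to 1
        alpha = min(1.0, self.count[i] / self.full_samples)
        return alpha * self.gradient(i)


abf = ABF1D(lower=0.0, width=1.0, n_bins=4, full_samples=4)
bias_forces = [abf.update(1.5, -2.0) for _ in range(4)]
```

In this example the measured force is constant at -2.0, so the biasing force grows with the number of samples until it exactly cancels the average force in that bin.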
The following conditions must be met for an ABF simulation to be possible and to produce an accurate estimate of the free energy profile. Note that these requirements do not apply when using the extended-system ABF method (7.3).
ABF depends on parameters from each collective variable to define the grid on which free energy gradients are computed: see 5.20 for details. Other parameters to control the ABF runtime can be set in the ABF configuration block:
The ABF bias produces the following files, all in multicolumn text format (4.7.5):
Also in the case of one-dimensional calculations, the ABF bias can report its current energy via outputEnergy; in higher dimensions, such computation is not implemented and the energy reported is zero.
If several ABF biases are defined concurrently, their names are inserted to produce unique filenames for output, as in outputName.abf1.grad. This should not be done routinely and could lead to meaningless results: only do it if you know what you are doing!
If the colvar space has been partitioned into sections (windows) in which independent ABF simulations have been run, the resulting data can be merged using the inputPrefix option described above (a run of 0 steps is enough).
The ABF method only produces an estimate of the free energy gradient. The free energy surface itself can be computed depending on the value of integrate and related options.
$${\nabla}^{2}{A}_{t}=\nabla \cdot {G}_{t}$$ | (30) |
where ${G}_{t}$ is the estimated gradient at time $t$, and ${A}_{t}$ is the corresponding free energy surface. The free energy surface is written under the file name <outputName>.pmf, in a plain text format (see 4.7.5) that can be read by most data plotting and analysis programs (e.g. Gnuplot). Periodic boundary conditions are applied to periodic coordinates, and Neumann boundary conditions otherwise (imposed free energy gradient at the boundary of the domain). The grid used for free energy discretization is extended by one point along non-periodic coordinates, but not along periodic coordinates. See ref. [24] for details.
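For a single non-periodic variable, integrating the gradient reduces to a cumulative sum, which also shows why the free energy grid has one more point than the gradient grid. The Python sketch below uses this simplified one-dimensional discretization; the function name is hypothetical and shifting the minimum to zero is an assumed convention, not a documented behavior.

```python
def integrate_gradient_1d(gradient, width):
    """Integrate a binned 1D gradient into a free energy profile.

    gradient: values of dA/dxi at the centers of n bins
    width:    bin width
    Returns n + 1 free energy values at the bin edges, minimum set to zero.
    """
    pmf = [0.0]
    for g in gradient:
        pmf.append(pmf[-1] + g * width)
    lowest = min(pmf)
    return [a - lowest for a in pmf]


pmf = integrate_gradient_1d([2.0, 2.0, -1.0], width=0.5)
```

In two or three dimensions a simple cumulative sum is no longer sufficient, because the noisy gradient estimate is not guaranteed to derive from a potential; this is where solving equation (30) becomes useful.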
In dimension 4 or greater, integrating the discretized gradient becomes non-trivial. The standalone utility abf_integrate is provided to perform that task. Because 4D ABF calculations are uncommon, this tool is practically deprecated by the Poisson integration described above.
abf_integrate reads the gradient data and uses it to perform a Monte-Carlo (M-C) simulation in discretized collective variable space (specifically, on the same grid used by ABF to discretize the free energy gradient). By default, a history-dependent bias (similar in spirit to metadynamics) is used: at each M-C step, the bias at the current position is incremented by a preset amount (the hill height). Upon convergence, this bias counteracts optimally the underlying gradient; it is negated to obtain the estimate of the free energy surface.
abf_integrate is invoked using the command-line:
abf_integrate <gradient_file> [-n <nsteps>] [-t <temp>] [-m (0|1)] [-h <hill_height>] [-f <factor>]
The gradient file name is provided first, followed by other parameters in any order. They are described below, with their default value in square brackets:
Using the default values of all parameters should give reasonable results in most cases.
abf_integrate produces the following output files:
Note: Typically, the “deviation" vector field does not vanish as the integration converges. This happens because the numerical estimate of the gradient does not exactly derive from a potential, due to numerical approximations used to obtain it (finite sampling and discretization on a grid). See Ref.[24] for details.
Extended-system ABF (eABF) is a variant of ABF (7.2) where the bias is not applied directly to the collective variable, but to an extended coordinate (“fictitious variable") $\lambda $ that evolves dynamically according to Newtonian or Langevin dynamics. Such an extended coordinate is enabled for a given colvar using the extendedLagrangian and associated keywords (5.22). The theory of eABF and the present implementation are documented in detail in reference [25].
Defining an ABF bias on a colvar wherein the extendedLagrangian option is active will perform eABF automatically; there is no dedicated option.
The extended variable $\lambda $ is coupled to the colvar $z=\xi \left(q\right)$ by the harmonic potential $(k/2){(z-\lambda )}^{2}$. Under eABF dynamics, the adaptive bias on $\lambda $ is the running estimate of the average spring force:
$${F}^{bias}\left({\lambda}^{\ast}\right)={\left\langle k(\lambda -z)\right\rangle}_{{\lambda}^{\ast}}$$ | (31) |
where the angle brackets indicate a canonical average conditioned by $\lambda ={\lambda}^{\ast}$. At long simulation times, eABF produces a flat histogram of the extended variable $\lambda $, and a flattened histogram of $\xi $, whose exact shape depends on the strength of the coupling as defined by extendedFluctuation in the colvar. Coupling should be somewhat loose for faster exploration and convergence, but strong enough that the bias does help overcome barriers along the colvar $\xi $.[25] Distribution of the colvar may be assessed by plotting its histogram, which is written to the outputName.zcount file in every eABF simulation. Note that a histogram bias (7.12) applied to an extended-Lagrangian colvar will access the extended degree of freedom $\lambda $, not the original colvar $\xi $; however, the joint histogram may be explicitly requested by listing the name of the colvar twice in a row within the colvars parameter of the histogram block.
The eABF PMF is that of the coordinate $\lambda $; it is not exactly the free energy profile of $\xi $. The latter can be calculated using the CZAR estimator.
The corrected z-averaged restraint (CZAR) estimator is described in detail in reference [25]. It is computed automatically in eABF simulations, regardless of the number of colvars involved. Note that ABF may also be applied on a combination of extended and non-extended colvars; in that case, CZAR still provides an unbiased estimate of the free energy gradient.
CZAR estimates the free energy gradient as:
$${A}^{\prime}\left(z\right)=-\frac{1}{\beta}\frac{d\ln \tilde{\rho}\left(z\right)}{dz}+k\left({\left\langle \lambda \right\rangle}_{z}-z\right).$$ | (32) |
where $z=\xi \left(q\right)$ is the colvar, $\lambda $ is the extended variable harmonically coupled to $z$ with a force constant $k$, and $\tilde{\rho}\left(z\right)$ is the observed distribution (histogram) of $z$, affected by the eABF bias.
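Equation (32) can be evaluated from a list of sampled pairs $(z,\lambda )$, as in the following Python sketch. It is a simplified illustration, not the module's code: the function name is hypothetical, the logarithmic derivative is taken by finite differences between neighboring bins, and `kT` stands for $1/\beta $.

```python
import math


def czar_gradient(samples, k, lower, width, n_bins, kT):
    """Estimate A'(z) per bin from (z, lambda) samples; needs n_bins >= 2.

    Bins without enough sampling (empty bin or empty neighbor) yield None.
    """
    count = [0] * n_bins
    lambda_sum = [0.0] * n_bins
    for z, lam in samples:
        i = int((z - lower) // width)
        if 0 <= i < n_bins:
            count[i] += 1
            lambda_sum[i] += lam
    gradient = []
    for i in range(n_bins):
        lo, hi = max(i - 1, 0), min(i + 1, n_bins - 1)
        if count[i] == 0 or count[lo] == 0 or count[hi] == 0:
            gradient.append(None)
            continue
        z_center = lower + (i + 0.5) * width
        # d ln(rho)/dz from the z-histogram (one-sided at the grid edges)
        dlnrho = (math.log(count[hi]) - math.log(count[lo])) / ((hi - lo) * width)
        mean_lambda = lambda_sum[i] / count[i]
        gradient.append(-kT * dlnrho + k * (mean_lambda - z_center))
    return gradient


# Flat z-histogram with lambda shifted by +0.1: only the spring term remains
flat = czar_gradient([(0.25, 0.35), (0.75, 0.85), (1.25, 1.35)],
                     k=10.0, lower=0.0, width=0.5, n_bins=3, kT=0.6)
```

The first term only depends on the shape of the $z$-histogram, and the second only on the average position of $\lambda $ in each bin, which is why both quantities are accumulated in every eABF run.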
Parameters for the CZAR estimator are:
Similar to ABF, the CZAR estimator produces two output files in multicolumn text format (4.7.5):
The sampling histogram associated with the CZAR estimator is the $z$-histogram, which is written in the file outputName.zcount.
This implements the Adiabatic Bias Molecular Dynamics (ABMD) method of Marchi and Ballone [26], sometimes referred to as ratchet-and-pawl or ratcheted MD. ABMD is a non-equilibrium process that enhances the motion of a scalar colvar in a given direction. For simplicity, the case of an increasing value is described below, but enhancing downward motion of the variable is also supported via the decreasing flag.
ABMD does not directly push the variable forward, but prevents it from backtracking by applying a time-dependent half-harmonic potential ${V}_{t}$, the center of which is the highest value attained by the variable so far (its high-water mark). This design implies that the bias is conservative at all times and therefore exerts zero net work, hence the “adiabatic" qualifier:
$${V}_{t}\left(\xi \right)=\begin{cases}\frac{k}{2}{\left(\xi -{\xi}_{t}^{ref}\right)}^{2}& \text{if } \xi <{\xi}_{t}^{ref}\\ 0& \text{otherwise}\end{cases}$$ | (33) |
where ${\xi}_{t}^{ref}$ is the high-water mark at time $t$, bounded by a user-defined stopping value ${\xi}^{stop}$:
$${\xi}_{t}^{ref}=\min \left(\max_{0\le s\le t}{\xi}_{s},\;{\xi}^{stop}\right).$$ | (34) |
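The ratchet mechanism can be summarized in a few lines of Python. This sketch assumes the usual half-harmonic form with prefactor $k/2$ and covers only the increasing case; the class and attribute names are hypothetical.

```python
class ABMD:
    """Toy ratchet-and-pawl bias on an increasing scalar variable."""

    def __init__(self, k, xi_stop, xi_initial):
        self.k, self.xi_stop = k, xi_stop
        self.xi_ref = min(xi_initial, xi_stop)  # high-water mark

    def update(self, xi):
        """Advance the high-water mark, then return (energy, force on xi)."""
        # Equation (34): running maximum, capped at the stopping value
        self.xi_ref = min(max(self.xi_ref, xi), self.xi_stop)
        if xi >= self.xi_ref:
            return 0.0, 0.0  # no bias at or beyond the high-water mark
        d = xi - self.xi_ref
        return 0.5 * self.k * d * d, -self.k * d


bias = ABMD(k=2.0, xi_stop=5.0, xi_initial=1.0)
trace = [bias.update(xi) for xi in (2.0, 1.5, 7.0, 4.0)]
```

The high-water mark only moves at instants when the bias energy and force are both zero, which is why updating it does not inject energy into the system.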
Note: because the ABMD potential in eq. 33 is never defined for more than one variable, no internal unit conversion is applied to $k$: this behavior is different from other restraints available in Colvars, such as the harmonic wall restraints in 7.9.
Besides the name of the biased variable specified by the colvars keyword, the tunable parameters of ABMD are the force constant $k$ and the stopping value ${\xi}^{stop}$, set by the following user keywords:
ABMD also supports the following common bias parameters:
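The metadynamics method uses a history-dependent potential [27] that generalizes to any type of colvars the conformational flooding [28] and local elevation [29] methods, originally formulated to use as colvars the principal components of a covariance matrix or a set of dihedral angles, respectively. The metadynamics potential on the colvars $\boldsymbol{\xi}=({\xi}_{1},{\xi}_{2},\dots ,{\xi}_{{N}_{cv}})$ is defined as:
$${V}_{meta}\left(\boldsymbol{\xi}\left(t\right)\right)=\sum_{{t}^{\prime}=\delta t,2\delta t,\dots}^{{t}^{\prime}<t}W\prod_{i=1}^{{N}_{cv}}\exp \left(-\frac{{\left({\xi}_{i}\left(t\right)-{\xi}_{i}\left({t}^{\prime}\right)\right)}^{2}}{2{\sigma}_{{\xi}_{i}}^{2}}\right)$$ | (35) |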
where ${V}_{meta}$ is the history-dependent potential acting on the current values of the colvars $\boldsymbol{\xi}$, and depends only parametrically on the previous values of the colvars. ${V}_{meta}$ is constructed as a sum of ${N}_{cv}$-dimensional repulsive Gaussian “hills", whose height is a chosen energy constant $W$, and whose centers are the previously explored configurations $\left(\boldsymbol{\xi}\left(\delta t\right),\boldsymbol{\xi}\left(2\delta t\right),\dots \right)$.
During the simulation, the system evolves towards the nearest minimum of the “effective" potential of mean force $\widetilde{A}\left(\boldsymbol{\xi}\right)$, which is the sum of the “real" underlying potential of mean force $A\left(\boldsymbol{\xi}\right)$ and the metadynamics potential, ${V}_{meta}\left(\boldsymbol{\xi}\right)$. Therefore, at any given time the probability of observing the configuration ${\boldsymbol{\xi}}^{\ast}$ is proportional to $\exp \left(-\widetilde{A}\left({\boldsymbol{\xi}}^{\ast}\right)/{\kappa}_{B}T\right)$: this is also the probability that a new Gaussian “hill" is added at that configuration. If the simulation is run for a sufficiently long time, each local minimum is canceled out by the sum of the Gaussian “hills". At that stage the “effective" potential of mean force $\widetilde{A}\left(\boldsymbol{\xi}\right)$ is constant, and $-{V}_{meta}\left(\boldsymbol{\xi}\right)$ is an estimator of the “real" potential of mean force $A\left(\boldsymbol{\xi}\right)$, save for an additive constant:
$$A\left(\boldsymbol{\xi}\right)\simeq -{V}_{meta}\left(\boldsymbol{\xi}\right)+K$$ | (36) |
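The construction of the bias and of the estimator in eq. 36 can be sketched in Python for one variable. The sketch assumes standard Gaussian hills of the form $W\exp \left(-{(\xi -{\xi}^{\prime})}^{2}/2{\sigma}^{2}\right)$ and sums them explicitly rather than on a grid; function names are hypothetical, and the additive constant $K$ is chosen here so that the minimum of the estimate is zero.

```python
import math


def v_meta(xi, hill_centers, weight, sigma):
    """History-dependent potential: sum of Gaussian hills evaluated at xi."""
    return sum(weight * math.exp(-((xi - c) ** 2) / (2.0 * sigma**2))
               for c in hill_centers)


def pmf_estimate(grid, hill_centers, weight, sigma):
    """Estimate A(xi) as -V_meta(xi), shifted so that its minimum is zero."""
    values = [-v_meta(x, hill_centers, weight, sigma) for x in grid]
    lowest = min(values)
    return [v - lowest for v in values]


# Two hills deposited at the same point add up to twice the hill weight
stacked = v_meta(0.0, [0.0, 0.0], weight=0.1, sigma=0.5)
profile = pmf_estimate([0.0, 10.0], [0.0], weight=1.0, sigma=1.0)
```

Regions that collected many hills appear as minima of the estimated profile, because the accumulated bias there is the largest.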
Such an estimate of the free energy can be provided by enabling writeFreeEnergyFile. Assuming that the set of collective variables includes all relevant degrees of freedom, the predicted error of the estimate is a simple function of the correlation times of the colvars ${\tau}_{{\xi}_{i}}$, and of the user-defined parameters $W$, ${\sigma}_{{\xi}_{i}}$ and $\delta t$ [30]. In typical applications, a good rule of thumb can be to choose the ratio $W/\delta t$ much smaller than ${\kappa}_{B}T/{\tau}_{\boldsymbol{\xi}}$, where ${\tau}_{\boldsymbol{\xi}}$ is the longest among $\boldsymbol{\xi}$'s correlation times: ${\sigma}_{{\xi}_{i}}$ then dictates the resolution of the calculated PMF.
If the metadynamics parameters are chosen correctly, after an equilibration time, ${t}_{e}$, the estimator provided by eq. 36 oscillates in time around the “real" free energy, so that a better estimate of the latter can be obtained as the time average of the bias potential after ${t}_{e}$ [31, 32]:
$$A\left(\boldsymbol{\xi}\right)=-\frac{1}{{t}_{tot}-{t}_{e}}{\int}_{{t}_{e}}^{{t}_{tot}}{V}_{meta}(\boldsymbol{\xi},t)\,dt$$ | (37) |
where ${t}_{e}$ is the time after which the bias potential grows (approximately) evenly during the simulation and ${t}_{tot}$ is the total simulation time. The free energy calculated according to eq. 37 can thus be obtained by averaging over time multiple time-dependent free energy estimates, which can be printed out through the keyword keepFreeEnergyFiles. An alternative is to obtain the free energy profiles by summing the hills added during the simulation; the hills trajectory can be printed out by enabling the option writeHillsTrajectory.
In typical scenarios the Gaussian hills of a metadynamics potential are interpolated and summed together onto a grid, which is much more efficient than computing each hill independently at every step (the keyword useGrids is on by default). This numerical approximation typically yields negligible errors in the resulting PMF [1]. However, due to the finite thickness of the Gaussian function, the metadynamics potential would suddenly vanish each time a variable exceeds its grid boundaries.
To avoid such discontinuity the Colvars metadynamics code will keep an explicit copy of each hill that straddles a grid's boundary, and will use it to compute metadynamics forces outside the grid. This measure is taken to protect the accuracy and stability of a metadynamics simulation, except in cases of “natural" boundaries (for example, the $[0:180]$ interval of an angle colvar) or when the flags hardLowerBoundary and hardUpperBoundary are explicitly set by the user. Unfortunately, processing explicit hills alongside the potential and force grids could easily become inefficient, slowing down the simulation and increasing the state file's size.
In general, it is a good idea to define a repulsive potential to prevent hills from coming too close to the
grid's boundaries, for example as a harmonicWalls restraint (see 7.9).
Example: Using harmonic walls to protect the grid's boundaries.
colvar {
name r
distance { ... }
upperBoundary 15.0
width 0.2
}
metadynamics {
name meta_r
colvars r
hillWeight 0.001
hillWidth 2.0
}
harmonicWalls {
name wall_r
colvars r
upperWalls 13.0
upperWallConstant 2.0
}
In the colvar r, the distance function used has a lowerBoundary automatically set to 0 by default, thus the keyword lowerBoundary itself is not mandatory and hardLowerBoundary is set to yes internally. However, upperBoundary does not have such a “natural" choice of value. The metadynamics potential meta_r will individually process any hill whose center is too close to the upperBoundary, more precisely within fewer grid points than 6 times the Gaussian $\sigma $ parameter plus one. It goes without saying that if the colvar r represents a distance between two freely-moving molecules, it will cross this “threshold" rather frequently.
In this example, where the value of hillWidth ($2\sigma $) amounts to 2 grid points, the threshold is 6+1 = 7 grid points away from upperBoundary. In explicit units, the width of $r$ is ${w}_{r}=$ 0.2 Å, and the threshold is 15.0 - 7$\times $0.2 = 13.6 Å.
The wall_r restraint included in the example prevents this: the position of its upperWalls is 13 Å, i.e. 3 grid points below the buffer's threshold (13.6 Å). For the chosen value of upperWallConstant, the energy of the wall_r bias at the threshold $r$ = 13.6 Å, with the wall located at ${r}_{upper}$ = 13 Å, is:
$${E}^{\ast}=\frac{1}{2}k{\left(\frac{r-{r}_{upper}}{{w}_{r}}\right)}^{2}=\frac{1}{2}\times 2.0\times {3}^{2}=9\ \mathrm{kcal/mol}$$
which results in a relative probability $\exp (-{E}^{\ast}/{\kappa}_{B}T)\simeq 3\times {10}^{-7}$ that r crosses the threshold. The probability that r exceeds upperBoundary, which is further away, has also become vanishingly small. At that point, you may want to set hardUpperBoundary to yes for r, and let meta_r know that no special treatment near the grid's boundaries will be needed.
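The arithmetic of this example can be checked with a few lines of Python. The input values are those of the configuration above; the temperature of 300 K is an assumption, since the example does not state it, and the variable names are chosen for this sketch only.

```python
import math

upper_boundary = 15.0      # Angstrom, upperBoundary of r
width = 0.2                # Angstrom, width of r (grid spacing)
hill_width = 2.0           # hillWidth, i.e. 2*sigma in grid points
upper_wall = 13.0          # Angstrom, upperWalls
wall_constant = 2.0        # kcal/mol, upperWallConstant
kT = 0.001987204 * 300.0   # kcal/mol, assuming T = 300 K

# Hills closer to the boundary than 6*sigma + 1 grid points are kept explicitly
sigma_points = hill_width / 2.0
threshold = upper_boundary - (6.0 * sigma_points + 1.0) * width

# Energy of the wall at the threshold, with distances in units of width
energy = 0.5 * wall_constant * ((threshold - upper_wall) / width) ** 2
probability = math.exp(-energy / kT)
```

Moving `upper_wall` closer to `threshold` lowers `energy` quadratically, so the probability of crossing rises quickly: placing the wall only one grid point below the threshold gives 1 kcal/mol and a relative probability of roughly 0.2 at this temperature.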
What is the impact of the wall restraint onto the PMF? Not a very complicated one: the PMF reconstructed by metadynamics will simply show a sharp increase in free-energy where the wall potential kicks in (r $>$ 13 Å). You may then choose between using the PMF only up until that point and discard the rest, or subtracting the energy of the harmonicWalls restraint from the PMF itself. Keep in mind, however, that the statistical convergence of metadynamics may be less accurate where the wall potential is strong.
In summary, although it would be simpler to set the wall's position upperWall and the grid's boundary upperBoundary to the same number, the finite width of the Gaussian hills calls for setting the former strictly within the latter.
To enable a metadynamics-based calculation, a metadynamics {...} block must be included in the Colvars configuration file.
By default, metadynamics bias energy and forces will be recorded onto a grid, the parameters of which can be defined within the definition of each colvar, as described in 5.20.
Other required keywords will be specified within the metadynamics block: these are colvars (the names of the variables involved), hillWeight (the weight parameter $W$), and the widths $2\sigma $ of the Gaussian hills in each dimension that can be given either as the single dimensionless parameter hillWidth, or explicitly for each colvar with gaussianSigmas.
When interpolating grids are enabled (default behavior), the PMF is written by default every colvarsRestartFrequency steps to the file outputName.pmf in multicolumn text format (4.7.5). The following two options make it possible to disable or control this behavior and to track statistical convergence:
The following options control the computational cost of metadynamics calculations, but do not affect results. Default values are chosen to minimize such cost with no loss of accuracy.
The ensemble-biased metadynamics (EBMetaD) approach [33] is designed to reproduce a target probability distribution along selected collective variables. Standard metadynamics can be seen as a special case of EBMetaD with a flat distribution as target. This is achieved by weighting the Gaussian functions used in the metadynamics approach by the inverse of the target probability distribution:
where $\rho_{exp}(\xi)$ is the target probability distribution and $S_{\rho} = -\int \rho_{exp}(\xi) \log \rho_{exp}(\xi)\, d\xi$ its corresponding differential entropy. The method is designed so that during the simulation the resulting distribution of the collective variable $\xi$ converges to $\rho_{exp}(\xi)$. A practical application of EBMetaD is to reproduce an “experimental” probability distribution, for example the distance distribution between spectroscopic labels inferred from Förster resonance energy transfer (FRET) or double electron-electron resonance (DEER) experiments [33].
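As an illustration of the weighting described above, the sketch below computes the differential entropy of a tabulated target distribution and a per-bin factor that rescales the hill weight. The normalization by $\exp(S_{\rho})$ is an assumption made here so that a flat target gives a factor of 1 (i.e. standard metadynamics); the function names and the min_val floor are hypothetical and do not correspond to Colvars keywords.

```python
import math

def differential_entropy(rho, width):
    """S_rho = -integral of rho*log(rho), as a Riemann sum over grid bins."""
    return -sum(p * math.log(p) for p in rho if p > 0.0) * width

def hill_scaling(rho, width, min_val=1.0e-5):
    """Per-bin factor multiplying the hill weight: 1 / (exp(S_rho) * rho).

    min_val is a hypothetical floor that keeps the factor bounded in bins
    where the target distribution vanishes.
    """
    norm = math.exp(differential_entropy(rho, width))
    return [1.0 / (norm * max(p, min_val)) for p in rho]

# Flat target on [0, 10) with 100 bins: rho = 1/10 everywhere
width = 0.1
flat = [0.1] * 100
flat_scaling = hill_scaling(flat, width)

# Peaked (triangular) target, normalized so that sum(rho) * width = 1
raw = [1.0 + min(i, 99 - i) for i in range(100)]
total = sum(raw) * width
peaked = [v / total for v in raw]
peaked_scaling = hill_scaling(peaked, width)

print(flat_scaling[0], peaked_scaling[0], peaked_scaling[50])
```

With the flat target every factor equals 1, while with the peaked target hills are smaller where the target is larger, which is what drives the sampled distribution toward the target.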
The PMF along $\xi $ can be estimated from the bias potential and the target distribution [33]:
and obtained by enabling writeFreeEnergyFile. Similarly to eq. 37, a more accurate estimate of the free energy can be obtained by averaging (after an equilibration time) multiple time-dependent free energy estimates (see keepFreeEnergyFiles).
The following additional options define the configuration for the ensemble-biased metadynamics approach:
As with standard metadynamics, multidimensional probability distributions can be targeted using a single metadynamics block using multiple colvars and a multidimensional target distribution file (see 4.7.5). Alternatively, multiple probability distributions on different variables can be targeted separately in the same simulation by introducing multiple metadynamics blocks with the ebMeta option.
Example: EBMetaD configuration for a single variable.
colvar {
  name r
  distance {
    group1 { atomNumbers 991 992 }
    group2 { atomNumbers 1762 1763 }
  }
  upperBoundary 100.0
  width 0.1
}

metadynamics {
  name ebmeta
  colvars r
  hillWeight 0.01
  hillWidth 3.0
  ebMeta on
  targetDistFile targetdist1.dat
  ebMetaEquilSteps 500000
}
where targetdist1.dat is a text file in “multicolumn” format (4.7.5) with the same width as the variable r (0.1 in this case):
# 1
# 0.0 0.1 1000 0
0.05 0.0012
0.15 0.0014
… …
99.95 0.0010
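A file with this layout can be generated with a few lines of code. The sketch below rests on an interpretation of the header shown above: the first comment line is read as the number of variables, and the second as lower boundary, width, number of points and a periodicity flag. The function name and the Gaussian target are hypothetical, and the values are normalized here for convenience only.

```python
import math

def format_target_dist(lower, width, n_points, density, periodic=False):
    """Return the text of a one-dimensional multicolumn grid file.

    density is a callable giving the (unnormalized) target at a bin center;
    values are normalized so that they integrate to 1 over the grid.
    """
    centers = [lower + (i + 0.5) * width for i in range(n_points)]
    values = [density(x) for x in centers]
    norm = sum(values) * width
    lines = ["# 1", f"# {lower} {width} {n_points} {int(periodic)}"]
    lines += [f"{x:.2f} {v / norm:.6e}" for x, v in zip(centers, values)]
    return "\n".join(lines) + "\n"

# Hypothetical Gaussian target centered at 40 with a standard deviation of 10
text = format_target_dist(
    0.0, 0.1, 1000, lambda x: math.exp(-0.5 * ((x - 40.0) / 10.0) ** 2)
)
print(text[:60])
```

The returned string can then be written to the file named by targetDistFile.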
Tip: Besides setting a meaningful value for targetDistMinVal, the exploration of unphysically low values of the target distribution (which would lead to very large hills and possibly numerical instabilities) can also be prevented by restricting sampling to a given interval, e.g. using a harmonicWalls restraint (7.9).
The following options define the configuration for the “well-tempered" metadynamics approach [34]:
Metadynamics calculations can be performed concurrently by multiple replicas that share a common history. This variant of the method is called multiple-walker metadynamics [35]: the Gaussian hills of all replicas are periodically combined into a single biasing potential, intended to converge to a single PMF.
In the implementation described here [1], replicas communicate through files. This arrangement allows launching the replicas either (1) as a bundle (i.e. a single job in a cluster's queueing system) or (2) as fully independent runs (i.e. as separate jobs for the queueing system). One advantage of use case (1) is that an identical Colvars configuration can be used for all replicas (otherwise, replicaID needs to be manually set to a different string for each replica). However, use case (2) is less demanding in terms of high-performance computing resources: a typical scenario would be a computer cluster (including virtual servers from a cloud provider) where not all nodes are connected to each other at high speed, and thus each replica runs on a small group of nodes or a single node.
Whichever way the replicas are started (coupled or not), a shared filesystem is needed so that each replica can read the files created by the others: paths to these files are stored in the shared file replicasRegistry. This file, and those listed in it, are read every replicaUpdateFrequency steps. Each time the Colvars state file is written (for example, every colvarsRestartFrequency steps), the file named:
outputName.colvars.name.replicaID.state
is written as well; this file contains only the state of the metadynamics bias, which the other replicas will
read in turn. In between the times when this file is modified/replaced, new hills are also temporarily written
to the file named:
outputName.colvars.name.replicaID.hills
Both files are only used for communication, and may be deleted after the replica begins writing files with a
new outputName.
Example: Multiple-walker metadynamics with file-based communication.
metadynamics {
  name mymtd
  colvars x
  hillWeight 0.001
  newHillFrequency 1000
  hillWidth 3.0
  multipleReplicas on
  replicasRegistry /shared-folder/mymtd-replicas.txt
  replicaUpdateFrequency 50000 # Best if larger than newHillFrequency
}
The following are the multiple-walkers related options:
This biasing method implements on-the-fly probability enhanced sampling (OPES) with a metadynamics-like target distribution [36]. The bias samples target distributions defined via their marginal distribution $p^{tg}(\xi)$ over some CVs, $\xi = \xi(x)$. By default opes_metad targets the well-tempered distribution, $p^{WT}(\xi) = [P(\xi)]^{1/\gamma}$, where $\gamma$ is known as the bias factor. Similarly to metadynamics, opes_metad optimizes the bias on the fly, with a given newHillFrequency. It does so through a reweighted kernel density estimate of the unbiased distribution in the CV space, $P(\xi)$. A compression algorithm is used to prevent the number of kernels from growing linearly with the simulation time. The bias at step $n$ is
$$V_n(\xi) = -k_B T \left(1 - 1/\gamma\right) \ln\left(\frac{P_n(\xi)}{Z_n} + \epsilon\right), \tag{40}$$
where the probability ${P}_{n}\left(\xi \right)$ and the normalization factor ${Z}_{n}$ are computed as
$$P_n(\xi) = \frac{\sum_{k}^{n} w_k G(\xi, \xi_k)}{\sum_{k}^{n} w_k} \tag{41}$$
$$Z_n = \frac{1}{|\Omega_n|} \int_{\Omega_n} P_n(\xi)\, d\xi, \tag{42}$$
where the weights $w_k$ are given by $w_k = \exp\left(\beta V_{k-1}(\xi_k)\right)$, and the Gaussian kernels $G(\xi, \xi') = h \exp\left[-\frac{1}{2} (\xi - \xi')^{T} \Sigma^{-1} (\xi - \xi')\right]$ have a diagonal covariance matrix $\Sigma_{ij} = \sigma_i^{2} \delta_{ij}$ and fixed height $h = \prod_i \left(\sigma_i \sqrt{2\pi}\right)^{-1}$ (see Ref. [36] for a complete description of the method).
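The reweighted kernel density estimate of eq. 41 can be sketched in one dimension as follows. This is a simplified illustration, not the Colvars implementation: there is no kernel compression or neighbor list, the function names are hypothetical, and the kernel centers, bias values and $k_B T$ are made-up numbers.

```python
import math

def gaussian_kernel(xi, center, sigma):
    """One-dimensional kernel G with fixed height h = 1 / (sigma * sqrt(2*pi))."""
    height = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return height * math.exp(-0.5 * ((xi - center) / sigma) ** 2)

def kernel_weights(previous_bias, kbt):
    """w_k = exp(beta * V_{k-1}(xi_k)) from the bias felt at each deposition."""
    return [math.exp(v / kbt) for v in previous_bias]

def reweighted_density(xi, centers, weights, sigma):
    """P_n(xi) of eq. 41: weighted sum of kernels over the sum of weights."""
    numerator = sum(
        w * gaussian_kernel(xi, c, sigma) for c, w in zip(centers, weights)
    )
    return numerator / sum(weights)

# Made-up kernel centers and the bias energy felt when each was deposited
centers = [-1.0, 0.0, 0.5, 2.0]
previous_bias = [0.0, 0.3, 0.6, 0.1]   # kcal/mol
kbt = 0.596                            # kcal/mol, assumed
sigma = 0.4
weights = kernel_weights(previous_bias, kbt)

# Numerical check that the estimate is normalized
step = 0.01
grid = [-6.0 + step * i for i in range(1301)]
integral = sum(reweighted_density(x, centers, weights, sigma) for x in grid) * step
print(f"integral of P_n = {integral:.6f}")
```

Because each kernel has unit area and the weights are divided by their sum, the estimate integrates to 1 regardless of the bias values.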
If the exploration mode (keyword explore) is on, then the on-the-fly target probability distribution ${p}^{WT}\left(\xi \right)$ is used to define the biasing energy:
$$V_n(\xi) = -k_B T \left(\gamma - 1\right) \ln\left(\frac{p_n^{WT}(\xi)}{Z_n} + \epsilon\right), \tag{43}$$
(See Ref.[37] for a complete description of the exploration mode.)
The implementation of opes_metad and its documentation are largely based on the OPES module of the PLUMED package.
Compared to the OPES module in PLUMED [38], this implementation currently has the following limitations:
The following table summarizes the differences of the option names in Colvars and the corresponding option names in the PLUMED OPES module:
Colvars keyword | PLUMED keyword |
barrier | BARRIER |
newHillFrequency | PACE |
gaussianSigma | SIGMA |
gaussianSigmaMin | SIGMA_MIN |
kernelCutoff | KERNEL_CUTOFF |
compressionThreshold | COMPRESSION_THRESHOLD |
adaptiveSigma | (use SIGMA=ADAPTIVE) |
adaptiveSigmaStride | ADAPTIVE_SIGMA_STRIDE |
neighborList | NLIST |
neighborListNewHillReset | NLIST_PACE_RESET |
neighborListParameters | NLIST_PARAMETERS |
noZed | NO_ZED |
fixedGaussianSigma | FIXED_SIGMA |
recursiveMerge | (the opposite of RECURSIVE_MERGE_OFF) |
calcWork | CALC_WORK |
multipleReplicas | WALKERS_MPI |
The PLUMED options for restarting (STATE_RFILE, STATE_WFILE and STATE_WSTRIDE) are not necessary in Colvars, since Colvars has a unified mechanism to save the information required for restarting in the .colvars.state file.
To enable an OPES-based calculation, an opes_metad {...} block must be included in the Colvars configuration file. The opes_metad block supports the following options:
The following options can be used for multiple-walker OPES, and require direct communication between replicas through MPI:
The following options are used for setting the PMF and trajectory output. The PMF is calculated by reweighting along the CVs specified by pmfColvars: the biasing energy $V(\xi)$ and the values of the CVs are collected on the fly at every step, and used to build a weighted histogram that estimates the unbiased probability of $\xi$:
where $N$ is the total number of simulation steps, $\xi'$ is the grid center in the histogram, and $\delta$ is the Dirac delta function. The PMF is then obtained by
$$A(\xi') = -k_B T \ln \langle \mathcal{P}(\xi') \rangle + A_0 \tag{45}$$
where $A_0$ is a constant chosen so that $A(\xi')$ is non-negative at every grid point of the histogram.
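The reweighting procedure can be illustrated with a minimal one-dimensional sketch. It assumes that each step is weighted by $\exp(V/k_B T)$, the same form as the kernel weights $w_k$ defined earlier in this section; the function name, the grid parameters and the sample values are hypothetical.

```python
import math

def reweighted_pmf(values, bias, kbt, lower, width, n_bins):
    """PMF on a 1D grid from CV values and the bias energy at the same steps.

    Each step adds exp(V / kBT) to the bin containing its CV value (assumed
    weighting); bins that were never visited are returned as None.
    """
    hist = [0.0] * n_bins
    for xi, v in zip(values, bias):
        index = math.floor((xi - lower) / width)
        if 0 <= index < n_bins:
            hist[index] += math.exp(v / kbt)
    total = sum(hist)
    if total == 0.0:
        raise ValueError("no sample falls inside the grid")
    free = [-kbt * math.log(h / total) if h > 0.0 else None for h in hist]
    a_min = min(a for a in free if a is not None)
    # A_0 = -a_min shifts the PMF so that no grid point is negative
    return [a - a_min if a is not None else None for a in free]

kbt = 0.596  # kcal/mol, assumed

# Without bias, three visits to the first bin and one to the second
pmf_unbiased = reweighted_pmf([0.1, 0.1, 0.1, 0.3], [0.0] * 4, kbt, 0.0, 0.2, 2)

# One visit to each bin, but the second was made under a bias of kBT*ln(3)
pmf_biased = reweighted_pmf(
    [0.1, 0.3], [0.0, kbt * math.log(3.0)], kbt, 0.0, 0.2, 2
)
print(pmf_unbiased, pmf_biased)
```

In the second case the biased visit counts three times as much as the unbiased one, so the second bin ends up as the free-energy minimum.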
Similar to the output file specified by the FILE option in the PLUMED implementation, the output file outputName.colvars.<name>.<replicaID>.kernels.dat includes all deposited uncompressed kernels (the step number of deposition, the kernel center, the Gaussian kernel $\sigma $, the height of the kernel, and the biasing energy in ${k}_{B}T$). The format of .kernels.dat files is designed to be compatible with the post-processing tools provided in the PLUMED OPES tutorial, so that it is possible to run State_from_Kernels.py and then FES_from_State.py to get the PMF along the biased CVs. Besides the pmf option, there are two ways to manually estimate the PMF, namely (i) reweighting the trajectories using the biasing energy either in the .colvars.traj file (with outputEnergy) or the .misc.traj file (the <name>.bias column), and (ii) summing up all the kernels (see Ref. [36] for more information).
Example: OPES with adaptive kernel bandwidth, neighbor list and PMF based on reweighting.
colvarsTrajFrequency 500

colvar {
  name phi
  lowerBoundary -180
  upperBoundary 180
  width 5.0
  dihedral {
    group1 {atomNumbers { 5 }}
    group2 {atomNumbers { 7 }}
    group3 {atomNumbers { 9 }}
    group4 {atomNumbers { 15 }}
  }
}

colvar {
  name psi
  lowerBoundary -180
  upperBoundary 180
  width 5.0
  dihedral {
    group1 {atomNumbers { 7 }}
    group2 {atomNumbers { 9 }}
    group3 {atomNumbers { 15 }}
    group4 {atomNumbers { 17 }}
  }
}

opes_metad {
  colvars phi psi
  newHillFrequency 500
  barrier 11.950286806883364
  adaptiveSigma on
  neighborList on
  printTrajectoryFrequency 500
  pmf on
  pmfColvars phi psi
  pmfHistoryFrequency 1000
  outputEnergy on
}
The harmonic biasing method may be used to enforce fixed or moving restraints, including variants of Steered and Targeted MD. Within energy minimization runs, it allows for restrained minimization, e.g. to calculate relaxed potential energy surfaces. In the context of the Colvars module, harmonic potentials are meant according to their textbook definition:
$$V(\xi) = \frac{1}{2} k \left(\frac{\xi - \xi_0}{w_\xi}\right)^{2} \tag{46}$$
There are two noteworthy aspects of this expression:
This property can be used for setting the force constant in umbrella-sampling ensemble runs: if the restraint centers are chosen in increments of ${w}_{\xi}$, the resulting distributions of $\xi $ are most often optimally overlapped. In regions where the underlying free-energy landscape induces highly skewed distributions of $\xi $, additional windows may be added as needed, with spacings finer than ${w}_{\xi}$.
Beyond one dimension, the use of a scaled harmonic potential also allows a standard definition of a multi-dimensional restraint with a unified force constant:
$$V(\xi_1, \dots, \xi_M) = \frac{1}{2} k \sum_{i=1}^{M} \left(\frac{\xi_i - \xi_0}{w_\xi}\right)^{2} \tag{47}$$
If one-dimensional or homogeneous multi-dimensional restraints are defined, and there are no other uses for the parameter ${w}_{\xi}$, width can be left at its default value of $1$.
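The effect of the width scaling in a multi-dimensional restraint can be seen in the following sketch. It reads eq. 47 as having one center and one width per variable, which is an assumption about the notation; the function names and the numerical values (a distance and an angle restrained with a single force constant) are hypothetical.

```python
def harmonic_energy(values, centers, widths, force_constant):
    """V = (k/2) * sum_i ((xi_i - xi0_i) / w_i)^2."""
    return 0.5 * force_constant * sum(
        ((x - c) / w) ** 2 for x, c, w in zip(values, centers, widths)
    )

def harmonic_forces(values, centers, widths, force_constant):
    """Force on each variable, -dV/dxi_i = -k * (xi_i - xi0_i) / w_i^2."""
    return [
        -force_constant * (x - c) / w ** 2
        for x, c, w in zip(values, centers, widths)
    ]

# A distance (width 0.5 length units) and an angle (width 5.0 degrees)
values = [12.5, 95.0]
centers = [12.0, 90.0]
widths = [0.5, 5.0]
k = 10.0

energy = harmonic_energy(values, centers, widths, k)
forces = harmonic_forces(values, centers, widths, k)
print(energy, forces)
```

Both variables are displaced by exactly one width, so they contribute equally to the energy even though their units and magnitudes differ.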
A harmonic restraint is defined by a harmonic {...} block, which may contain the following keywords:
Tip: A complex set of restraints can be applied to a system, by defining several colvars, and applying one or more harmonic restraints to different groups of colvars. In some cases, dozens of colvars can be defined, but their value may not be relevant: to limit the size of the colvars trajectory file, it may be wise to disable outputValue for such “ancillary” variables, and leave it enabled only for “relevant” ones.
The following options allow the centers of the harmonic restraints to be changed gradually during a simulation. When the centers are changed continuously, a steered MD in a collective variable space is carried out.
Note on restarting moving restraint simulations: Information about the current step and stage of a simulation with moving restraints is stored in the restart file (state file). Thus, such simulations can be run in several chunks, and restarted directly using the same colvars configuration file. In case of a restart, the values of parameters such as targetCenters, targetNumSteps, etc. should not be changed manually.
The centers of the harmonic restraints can also be changed in discrete stages: in this case a one-dimensional umbrella sampling simulation is performed. The sampling windows are simulated in sequence. The colvars trajectory file may then be used both to evaluate the correlation times between consecutive windows, and to calculate the frequency distribution of the colvar of interest in each window. Furthermore, frequency distributions on a predefined grid can be automatically obtained by using the histogram bias (see 7.12).
To activate an umbrella sampling simulation, the same keywords as in the previous section can be used, with the addition of the following:
The force constant of the harmonic restraint may also be changed to equilibrate [39].
If the restraint centers or force constant are changed continuously (targetNumStages undefined), it is possible to record the net work performed by the changing restraint:
The harmonicWalls {...} bias is closely related to the harmonic bias (see 7.7), with the following two differences: (i) instead of a center a lower wall and/or an upper wall are defined, outside of which the bias implements a half-harmonic potential;
where $\xi_{lower}$ and $\xi_{upper}$ are the lower and upper wall thresholds, respectively; (ii) because an interval between two walls is defined, only scalar variables can be used (but any number of variables can be defined, and the wall bias is intrinsically multi-dimensional).
Note: this bias replaces the keywords lowerWall, lowerWallConstant, upperWall and upperWallConstant defined in the colvar context. Those keywords are deprecated.
The harmonicWalls bias implements the following options:
Example 1: harmonic walls for one variable with two different force constants.
harmonicWalls {
  name mywalls
  colvars dist
  lowerWalls 22.0
  upperWalls 38.0
  lowerWallConstant 2.0
  upperWallConstant 10.0
}
Example 2: harmonic walls for two variables with a single force constant.
harmonicWalls {
  name mywalls
  colvars phi psi
  lowerWalls -180.0 0.0
  upperWalls 0.0 180.0
  forceConstant 5.0
}
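The wall energy for a single variable can be sketched as below, using the numbers of Example 1. The piecewise form is an assumption based on the description in words given above (a half-harmonic potential outside each wall, zero in between) and on the width scaling of the harmonic bias (eq. 46); the function name is hypothetical and the width is left at its default value of 1.

```python
def wall_energy(value, lower_wall, upper_wall, lower_k, upper_k, width=1.0):
    """Half-harmonic wall energy for one scalar variable; zero between the walls."""
    if value < lower_wall:
        return 0.5 * lower_k * ((value - lower_wall) / width) ** 2
    if value > upper_wall:
        return 0.5 * upper_k * ((value - upper_wall) / width) ** 2
    return 0.0

# Walls and force constants of Example 1
walls = dict(lower_wall=22.0, upper_wall=38.0, lower_k=2.0, upper_k=10.0)

below = wall_energy(20.0, **walls)   # two units below the lower wall
inside = wall_energy(30.0, **walls)  # between the walls
above = wall_energy(40.0, **walls)   # two units above the upper wall
print(below, inside, above)
```

For several variables, as in Example 2, the total energy would be the sum of such terms over the variables.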
The linear keyword defines a linear potential:
$$V(\xi) = k \left(\frac{\xi - \xi_0}{w_\xi}\right) \tag{49}$$
whose force is simply given by the constant $k/w_\xi$ itself:
$$f(\xi) = k/w_\xi \tag{50}$$
This type of bias is therefore most useful in situations where a constant force is desired. Like all other restraints, it can be defined on one or more CVs, with each contribution added to the total potential and the parameters ${w}_{\xi}$ determining the relative magnitude for each.
Example: A possible use case of the linear bias is mimicking a constant electric field acting on a specific particle, or the center of mass of many particles. In the following example, a linear restraint is applied on a distanceZ variable (5.3.2), generating a constant force parallel to the Z axis of magnitude 5 energy unit/length unit:
colvar {
  name z
  distanceZ {
    ...
  }
}

linear {
  colvars z
  forceConstant 5.0
  centers 0.0
}
Another useful application of a linear restraint is to enforce experimental constraints in a simulation, with a lower non-equilibrium work than e.g. harmonic restraints [40]. There is generally a unique strength of bias for each CV center, which means you must know the bias force constant specifically for the center of the CV. This force constant may be found by using experiment directed simulation described in section 7.11.
Experiment directed simulation applies a linear bias with a changing force constant. Please cite White and Voth [41] when using this feature. As opposed to that reference, the force constant here is scaled by the width corresponding to the biased colvar. In White and Voth, each force constant is scaled by the colvars set center. The bias converges to a linear bias, after which it will be the minimal possible bias. You may also stop the simulation, take the median of the force constants (ForceConst) found in the colvars trajectory file, and then apply a linear bias with that constant. All the notes about units described in sections 7.10 and 7.7 apply here as well. This is not a valid simulation of any particular statistical ensemble and is only an optimization algorithm until the bias has converged.
The histogram feature is used to record the distribution of a set of collective variables in the form of an N-dimensional histogram. Defining such a histogram is generally useful for analysis purposes, but it has no effect on the simulation.
Example 1: the two-dimensional histogram of a distance and an angle can be generated using the configuration below. The histogram code requires that each variable is a scalar number that is confined within a pre-defined interval. The interval's boundaries may be specified by hand (e.g. through lowerBoundary and upperBoundary in the variable definition), or auto-detected based on the type of function. In this example, the lower boundary for the distance variable “r” is automatically set to zero, and the interval for the three-body angle “theta” is between $0^{\circ}$ and $180^{\circ}$; note, however, that an upper boundary for the distance “r” still needs to be specified manually. The grid spacings for the two variables are $0.2$ length units and $3.0^{\circ}$, respectively.
colvar {
  name r
  width 0.2
  upperBoundary 20.0
  distance { ... }
}

colvar {
  name theta
  width 3.0
  angle { ... }
}

histogram {
  name hist2d
  colvars r theta
}
Example 2: This example is similar to the previous one, but with the important difference that the parameters for the histogram's grid are defined explicitly for this histogram instance. Therefore, this histogram's grid may differ from the one defined from parameters embedded in the colvar { ... } block (for example, narrower intervals and finer grid spacings may be selected).
colvar {
  name r
  upperBoundary 20.0
  distance { ... }
}

colvar {
  name theta
  angle { ... }
}

histogram {
  name hist2d
  colvars r theta
  histogramGrid {
    widths 0.1 1.0
    lowerBoundaries 2.0 30.0
    upperBoundaries 10.0 90.0
  }
}
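To make the grid parameters of Example 2 concrete, the sketch below computes the number of bins along each dimension and accumulates a few sample points. It is an illustration of regular-grid binning only: the function names and sample values are hypothetical, and skipping points that fall outside the grid is a choice made for this sketch.

```python
import math

def grid_shape(lower, upper, widths):
    """Number of bins along each dimension."""
    return tuple(round((hi - lo) / w) for lo, hi, w in zip(lower, upper, widths))

def bin_index(point, lower, upper, widths):
    """Bin indices of a point, or None if it falls outside the grid."""
    shape = grid_shape(lower, upper, widths)
    index = []
    for x, lo, hi, w, n in zip(point, lower, upper, widths, shape):
        if not lo <= x < hi:
            return None
        # min() guards against rounding at the upper edge of the last bin
        index.append(min(math.floor((x - lo) / w), n - 1))
    return tuple(index)

# Grid parameters of Example 2: (r, theta)
widths = (0.1, 1.0)
lower = (2.0, 30.0)
upper = (10.0, 90.0)
shape = grid_shape(lower, upper, widths)

# Made-up samples; the last one lies outside the grid and is skipped
samples = [(2.05, 30.5), (9.95, 89.5), (2.05, 30.9), (12.0, 45.0)]
counts = {}
for sample in samples:
    index = bin_index(sample, lower, upper, widths)
    if index is not None:
        counts[index] = counts.get(index, 0) + 1
print(shape, counts)
```

With these parameters the grid has 80 bins along r and 60 along theta, i.e. narrower intervals and finer spacings than in Example 1.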
The standard keywords below are used to control the histogram's computation and to select the variables that are sampled. See also 7.12.1 for keywords used to define the grid, 7.12.2 for output parameters and 7.12.3 for more advanced keywords.
Grid parameters for the histogram may be provided at the level of the individual variables, or via a dedicated configuration block histogramGrid { …} inside the configuration of this histogram. The options supported inside this block are:
The accumulated histogram is written in the Colvars state file, allowing for its accumulation across continued runs. Additionally, the following files are written depending on the histogram's dimensionality:
As with any other biasing and analysis method, when a histogram is applied to an extended-system colvar (5.22), it accesses the value of the extended coordinate rather than that of the actual colvar. This can be overridden by enabling the bypassExtendedLagrangian option. A joint histogram of the actual colvar and the extended coordinate may be collected by specifying the colvar name twice in a row in the colvars parameter (e.g. colvars myColvar myColvar): the first instance will be understood as the actual colvar, and the second, as the extended coordinate.
The histogramRestraint bias implements a continuous potential of many variables (or of a single high-dimensional variable) aimed at reproducing a one-dimensional statistical distribution that is provided by the user. The $M$ variables $(\xi_1, \dots, \xi_M)$ are interpreted as multiple observations of a random variable $\xi$ with unknown probability distribution. The potential is minimized when the histogram $h(\xi)$, estimated as a sum of Gaussian functions centered at $(\xi_1, \dots, \xi_M)$, is equal to the reference histogram $h_0(\xi)$:
$$V(\xi_1, \dots, \xi_M) = \frac{1}{2} k \int \left(h(\xi) - h_0(\xi)\right)^{2} d\xi \tag{51}$$
$$h(\xi) = \frac{1}{M \sqrt{2\pi \sigma^{2}}} \sum_{i=1}^{M} \exp\left(-\frac{(\xi - \xi_i)^{2}}{2\sigma^{2}}\right) \tag{52}$$
When used in combination with a distancePairs multi-dimensional variable, this bias implements the refinement algorithm against ESR/DEER experiments published by Shen et al. [42].
This bias behaves similarly to the histogram bias with the gatherVectorColvars option, with the important difference that all variables are gathered, resulting in a one-dimensional histogram. Future versions will include support for multi-dimensional histograms.
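Equations 51 and 52 can be evaluated numerically as in the following sketch, where the integral is replaced by a Riemann sum on a regular grid. The function names, the observations, the Gaussian width and the force constant are hypothetical; the reference histogram is built here from a second set of made-up values, whereas in practice it is provided by the user.

```python
import math

def gaussian_histogram(grid, observations, sigma):
    """h(xi) of eq. 52 evaluated at every grid point."""
    norm = 1.0 / (len(observations) * math.sqrt(2.0 * math.pi * sigma ** 2))
    return [
        norm * sum(
            math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2)) for xi in observations
        )
        for x in grid
    ]

def restraint_energy(hist, reference, spacing, force_constant):
    """V of eq. 51, with the integral replaced by a Riemann sum on the grid."""
    return 0.5 * force_constant * spacing * sum(
        (h - h0) ** 2 for h, h0 in zip(hist, reference)
    )

spacing = 0.1
grid = [spacing * i for i in range(401)]   # 0 to 40 length units
sigma = 1.5
k = 100.0

hist = gaussian_histogram(grid, [18.0, 20.0, 22.0, 25.0], sigma)
reference = gaussian_histogram(grid, [19.0, 20.0, 21.0, 24.0], sigma)

integral = sum(hist) * spacing
energy = restraint_energy(hist, reference, spacing, k)
energy_matched = restraint_energy(hist, hist, spacing, k)
print(integral, energy, energy_matched)
```

The energy vanishes only when the two histograms coincide, and grows as the observations drift away from the reference distribution.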
The list of options is as follows:
Rather than using the biasing methods described above, it is possible to apply biases provided at run time as a Tcl script. This option, also available in NAMD, can be useful to test a new algorithm to be used in a MD simulation.
If concurrent computation over multiple threads is available (this is indicated by the message “SMP parallelism is available.” printed at initialization time), it is useful to take advantage of the scripting interface to combine many components, all computed in parallel, into a single variable.
The default SMP schedule is the following:
The following options allow fine-tuning this schedule:
This section lists all the commands used in VMD to control the behavior of the Colvars module from within a run script.
cv addenergy <E>
Add an energy to the MD engine (no effect in VMD)
Parameters
E : float - Amount of energy to add
cv config <conf>
Read configuration from the given string
Parameters
conf : string - Configuration string
cv configfile <conf_file>
Read configuration from a file
Parameters
conf_file : string - Path to configuration file
cv delete
Delete this Colvars module instance (VMD only)
cv featurereport
Return a summary of Colvars features used so far and their citations
Returns
report : string - Feature report and citations
cv frame [frame]
Get or set current frame number (VMD only)
Parameters
frame : integer - Frame number (optional)
Returns
frame : integer - Frame number
cv getatomappliedforces
Get the list of forces applied by Colvars to atoms
Returns
forces : array of arrays of floats - Atomic forces
cv getatomappliedforcesmax
Get the maximum norm of forces applied by Colvars to atoms
Returns
force : float - Maximum atomic force
cv getatomappliedforcesmaxid
Get the atom ID with the largest applied force
Returns
id : int - ID of the atom with the maximum atomic force
cv getatomappliedforcesrms
Get the root-mean-square norm of forces applied by Colvars to atoms
Returns
force : float - RMS atomic force
cv resetatomappliedforces
Reset forces applied by Colvars to atoms
cv getatomids
Get the list of indices of atoms used in Colvars
Returns
indices : array of ints - Atom indices