System Building with Dabble

Dabble includes a lot of functionality for building membrane protein systems, proteins in a water box, and manipulating various molecule file formats.

Basic system building

The command line invocation of Dabble constructs a DabbleBuilder object. DabbleBuilder.write then does all of the work. This class is a good place to look for methods that handle multiple VMD molecules, and use other Dabble building functions.

class Dabble.DabbleBuilder(**kwargs)

Builds protein systems, optionally in a membrane.

Tiles box appropriately, inserts protein into box, removes conflicts, trims excess solvent and adds ions.

molids

dict str->int -- Molecule IDs comprising system components

size

array len 3 of floats -- Size of the x, y, and z dimensions of the system

solute_sel

str -- VMD atom selection string for original solute

opts

dictionary -- All options passed to the system builder

tmp_dir

str -- Directory in which to save temporary files

water_only

bool -- If the solvent is just a water box

add_molecule(filename, desc)

Adds a molecule file to the system.

Parameters:
  • filename (str) -- File to load
  • desc (str) -- Type of molecule: solute, solvent, etc
Returns:

True if the molecule was successfully added

convert_ions(salt_conc, cation, molid)

Calculates the charge of the molecule and adds salt ions to get the desired concentration by converting water molecules to salt

Parameters:
  • salt_conc (float) -- Desired salt concentration in M
  • cation (str) -- Cation to add, either Na or K
  • molid (int) -- VMD molecule id to consider
Returns:

(int) number of ions added

Raises:

ValueError if invalid cation is specified

get_cell_size(mem_buf, wat_buf, molid=None, filename=None, zh_mem_full=25.0, zh_mem_hyd=15.0)

Gets the cell size of the final system given initial system and buffers. Detects whether or not a membrane is present. Sets the size of the system.

Parameters:
  • mem_buf (float) -- Membrane (xy) buffer amount
  • wat_buf (float) -- Water (z) buffer amount
  • molid (int) -- VMD molecule ID to consider (can't use with filename)
  • filename (str) -- Filename of system to consider (can't use w molid)
  • zh_mem_full (float) -- Membrane thickness
  • zh_mem_hyd (float) -- Membrane hydrophobic region thickness

Returns: return dx_sol, dy_sol, dx_tm, dy_tm, dz_full

(float tuple): x solute dimension, y solute dimension,
TM x solute dimension, TM y solute dimension, solute z dimension
Raises:ValueError -- if filename and molid are both specified
remove_molecule(desc)

Removes a molecule file from the system.

Parameters:desc (str) -- Key for molecule to remove
Returns:true if a molecule was deleted, false otherwise
Raises:KeyError if there is no molecule with that key string
write()

Writes the final built system.

Parameters:filename (str) -- Name of file

Manipulating single molecules

Dabble provides many functions for manipulating molecules using both the VMD atom selection language as well as use of intermediate molecules. These may be found in the molutils module.

This module contains functions for manipulating molecules using the VMD python API.

Dabble.molutils.atomsel_remaining(molid, sel)

Selects all remaining atoms. Whether or not an atom is counted is determined by value of the beta flag.

Parameters:
  • molid (int) -- VMD molecule id to consider
  • sel (str) -- VMD atom selection string to grab, defaults to all
Returns:

VMD atomsel object representing the selection, or None if no atoms matched the selection

Dabble.molutils.center_system(molid, tmp_dir, center_z=False)

Centers an entire system in the XY-plane, and optionally in the Z dimension. Needs to save and reload the file in case the current positions need to be concatenated to produce a new file.

Parameters:
  • molid (int) -- VMD molecule id to center
  • tmp_dir (str) -- Directory to create temp file in
  • center_z (bool) -- Whether or not to center along the Z axis as well
Returns:

VMD molecule id of centered system

Return type:

(int)

Dabble.molutils.combine_molecules(input_ids, tmp_dir)

Combines input molecules, closes them and returns the molecule id of the new molecule that combines them, putting it on top.

Parameters:
  • input_ids (list of int) -- Molecule IDs to combine, will be closed
  • tmp_dir (str) -- Directory to put combined molecule into
Returns:

(int) molid of combined system

Dabble.molutils.diameter(coords, chunkmem=30000000.0)

Returns the diameter of a set of xy coordinates.

Parameters:
  • coords (numpy array Nx2) -- XY coordinates to get diameter of
  • chunkmem (int) -- Chunk memory for calculation
Returns:

The diameter of the stuff in both dimensions

Return type:

(float, float)

Dabble.molutils.get_net_charge(sel, molid)

Gets the net charge of an atom selection, using the charge field of the data.

Parameters:
  • sel (str) -- VMD atom selection to compute the charge of
  • molid (int) -- VMD molecule id to select within
Returns:

The rounded net charge of the selection

Return type:

(int)

Throws:
ValueError: If charge does not round to an integer value
Dabble.molutils.get_num_salt_ions_needed(molid, conc, water_sel='water and element O', cation='Na', anion='Cl')

Gets the number of salt ions needed to put the system at a given concentration of salt.

Parameters:
  • molid (int) -- The VMD molecule ID to consider
  • conc (float) -- Desired salt concentration
  • water_sel (str) -- VMD atom selection for water
  • cation (str) -- Cation to use, either Na or K right now
  • anion (str) -- Anion to use, only Cl currently supported
Returns:

# cations needed, # anions needed, number of waters

that will remain, total # cations, total # anions, cation concentration, anion concentration

Return type:

(float tuple)

Raises:

Exception if number of cations and the net cation charge are -- not equal (should never happen)

Dabble.molutils.get_system_dimensions(molid)

Gets the periodic box dimensions of a system.

Parameters:molid (int) -- VMD molecule ID to consider
Returns:A, B, C box dimensions
Return type:(float tuple)
Raises:ValueError if no box is found
Dabble.molutils.get_system_net_charge(molid)

Gets the net charge of the entire system. What is in the system is defined by the beta field, as atoms that won't be written have beta 0.

Parameters:molid (int) -- VMD molecule id to compute the charge of
Returns:The net charge of the molecule
Return type:(int)
Dabble.molutils.lipid_composition(lipid_sel, molid)

Calculates the lipid composition of each leaflet of the membrane

Parameters:
  • lipid_sel (str) -- VMD selection string for lipid
  • molid (int) -- VMD molecule ID to consider
Returns:

(int tuple) number of lipids on inner and outer membrane leaflets

Dabble.molutils.num_atoms_remaining(molid, sel='all')

Returns the number of atoms remaining in the system, indicated by the value of the beta flag.

Parameters:
  • molid (int) -- VMD molecule id to consider
  • sel (str) -- VMD atom selection to count, defaults to all
Returns:

(int) number of atoms remaining in the system

Dabble.molutils.num_lipids_remaining(molid, lipid_sel)

Returns the number of lipids remaining in the system, indicated by the value of the beta flag. Uses the fragment notation to count the number of lipids.

Parameters:
  • molid (int) -- VMD molecule ID to consider
  • lipid_sel (str) -- VMD atom selection that counts as lipid
Returns:

(int) number of lipid molecules remaining in the system

Raises:
  • ValueError if empty lipid selection
  • ValueError if non-existent molecule given
Dabble.molutils.num_waters_remaining(molid, water_sel='water and element O')

Returns the number of waters remaining in the system, indicated by the value of the beta flag.

Parameters:
  • molid (int) -- VMD molecule id to consider
  • sel (str) -- VMD atom selection that counts as water, defaults to 'water and element O'
Returns:

(int) number of water molecules remaining in the system

Raises:
  • ValueError if empty water selection
  • ValueError if non-existent molecule given
Dabble.molutils.print_lipid_composition(lipid_sel, molid)

Describes the composition of the inner and outer leaflet

Parameters:
  • molid (int) -- VMD molecule id to look at
  • lipid_sel (str) -- VMD atom selection for lipid
Returns:

(str) string describing the lipid composition

Dabble.molutils.set_cations(molid, element, filter_sel='none')

Sets all of the specified atoms to a cation

Parameters:
  • molid (int) -- VMD molecule ID to consider
  • element (str in Na, K) -- Cation to convert
  • filter_sel (str) -- VMD atom selection string for atoms to convert
Raises:

ValueError if invalid cation specified

Dabble.molutils.set_ion(molid, atom_id, element)

Sets an atom to be the desired ion

Parameters:
  • molid (int) -- VMD molecule to operate on
  • atom_id (int) -- Atom index to change to ion
  • element (str in Na, K, Cl) -- Ion to apply
Raises:

ValueError if the index to change is not present

Dabble.molutils.solute_xy_diameter(solute_sel, molid)

Returns the XY diameter of a set of atoms.

Parameters:
  • solute_sel (str) -- VMD atom selection to get diameter of
  • molid (int) -- VMD molecule ID to select within
Returns:

(float, float) X and Y diameter of the selection

Dabble.molutils.tile_system(input_id, times_x, times_y, times_z, tmp_dir)

Tiles the membrane or solvent system the given number of times in each direction to produce a larger system.

Parameters:
  • input_id (int) -- VMD molecule id to tile
  • times_x (int) -- Number of times to tile in x direction
  • times_y (int) -- Number of times to tile in y direction
  • times_z (int) -- Number of times to tile in z direction
  • tmp_dir (str) -- Directory in which to put temporary files
Returns:

(int) VMD molecule ID of tiled system

Working with multiple molecule files

When constructing a system, Dabble performs operations such as tiling template membrane pieces, merging those pieces with a protein, and deleting atoms that are in the way. The fileutils module provides functionality for doing these things.

All intermediate files are in the mae file format, as it provides explicit connectivity and charge information, and allows combination of multiple molecules for easy combining of systems without renumbering all of the atoms. It is also easily extensible and not too difficult to parse.

This module contains functions for manipulating files using the VMD python API.

Dabble.fileutils.check_out_type(value, forcefield, hmr=False)

Checks the file format of the requiested output is supported, and sets internal variables as necessary.

Parameters:
  • value (str) -- Filename requested
  • forcefield (str) -- Force field requested
  • hmr (bool) -- If hydrogen mass repartitioning is requested
Returns:

The requested output format

Raises:
  • ValueError -- if the output format requested is currently unsupported
  • NotImplementedError -- if hydrogen mass repartitioning is requested for amber files
Dabble.fileutils.check_write_ok(filename, out_fmt, overwrite=False)

Checks if the output files for the requested format exists, and prints out an error message if the current options don't allow overwriting them.

Parameters:
  • filename (str) -- Output filename requested
  • out_fmt (str) -- Output format requested. All intermediate
  • involved in writing to this format will be checked for (files) --
  • existence. --
  • overwrite (bool) -- True if overwriting is allowed
Returns:

True if it okay to overwrite Quits the program otherwise

Dabble.fileutils.concatenate_mae_files(output_filename, input_filenames=None, input_ids=None)

Concatenates several mae files together into one. Since this file format allows concatenation, the new file is a combined system of all the input molecules superimposed. Either takes a list of files to concatenate, or a list of VMD molecule ids. If molecule ids are given, VMD's interface is used to get the filename corresponding to each id.

Parameters:
  • output_filename (str) -- Filename to write
  • input_filenames (list of str) -- List of input mae files to combine, OR:
  • input_ids (list of int) -- List of input molecules to combine
Returns:

True if successful

Raises:
  • ValueError -- if input_filenames and input_ids are both specified
  • AssertionError -- if there are no input files
Dabble.fileutils.load_solute(filename, tmp_dir)

Loads a molecule input file, guessing the format from the extension.

Parameters:
  • filename (str) -- Filename to load
  • tmp_dir (str) -- Directory to put temporary files in
Returns:

(int) VMD molecule ID that was loaded

Raises:

ValueError if filetype is currently unsupported

Dabble.fileutils.write_ct_blocks(molid, sel, output_filename, tmp_dir)

Writes a mae format file containing the specified selection.

Parameters:
  • molid (int) -- VMD molecule ID to write
  • sel (str) -- the selection to write
  • output_filename (str) -- the file to write to, including .mae extension
  • tmp_dir (str) -- Directory to put files in
Returns:

the number of CT blocks written

Return type:

length (int)

Dabble.fileutils.write_final_system(out_fmt, out_name, molid, **kwargs)

Writes the final output in whatever format(s) are requested. Always writes a mae format file as well

Parameters:
  • out_name (str) -- Filename to output to
  • out_fmt (str) -- format to write the output to
  • molid (int) -- VMD molecule_id to write
  • tmp_dir (str) -- Directory to put temporary files in
  • extra_topos (list of str) -- Extra topology files to use
  • extra_params (list of str) -- Extra parameter files to use
  • extra_streams (list of str) -- Extra stream files to use
  • lipid_sel (str) -- Lipid selection
  • hmassrepartition (bool) -- Whether or not to repartition hydrogen masses
  • debug_verbose (bool) -- Extra debug output from tleap
Returns:

(str) main final filename written