structures — load and cache PDB files

This module provides a function that reads a directory of PDB files and returns a pandas data frame containing score, distance, and sequence metrics for each structure. The results are cached because they take a while to calculate up front. Note that the cache files are pickles and appear to be sensitive to the version of pandas used to generate them. For example, caches generated with pandas 0.15 can’t be read by pandas 0.14.
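One way to guard against a version mismatch like this is to record the pandas version next to the pickle and treat the cache as missing when the versions differ. The following is a minimal sketch of that idea, not this module's implementation; `save_cache`, `load_cache`, and the `.version` sidecar file are hypothetical names introduced for illustration.

```python
import os
import tempfile

import pandas as pd


def save_cache(df, cache_path):
    """Pickle the frame and record the pandas version in a sidecar file (hypothetical helper)."""
    df.to_pickle(cache_path)
    with open(cache_path + ".version", "w") as f:
        f.write(pd.__version__)


def load_cache(cache_path):
    """Return the cached frame, or None if it is missing or was written by another pandas version."""
    version_path = cache_path + ".version"
    if not (os.path.exists(cache_path) and os.path.exists(version_path)):
        return None
    with open(version_path) as f:
        if f.read().strip() != pd.__version__:
            return None  # caller should recalculate instead of unpickling
    return pd.read_pickle(cache_path)


# Made-up metrics, used only to exercise the round trip.
df = pd.DataFrame({"total_score": [-101.2, -99.8], "restraint_dist": [0.9, 1.4]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "metrics.pkl")
    save_cache(df, path)
    reloaded = load_cache(path)

    # Simulate a cache written by a different pandas version.
    with open(path + ".version", "w") as f:
        f.write("0.0.0")
    rejected = load_cache(path)
```

Returning `None` on a mismatch lets the caller fall back to recalculating the metrics rather than failing while unpickling.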

class pull_into_place.structures.Design(directory)

Represent a single validated design. Each design is associated with 500 scores, 500 restraint distances, and a “representative” (i.e. lowest scoring) model. The representative has its own score and restraint distance, plus a path to a PDB structure.
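The notion of a “representative” model can be illustrated with a short sketch. It assumes the scores and restraint distances are held in two parallel arrays, which is an assumption made for illustration and not necessarily how `Design` stores them; the values are synthetic.

```python
import numpy as np

# Synthetic stand-ins for the 500 scores and 500 restraint distances.
rng = np.random.default_rng(0)
scores = rng.normal(-100.0, 5.0, size=500)
distances = rng.uniform(0.5, 3.0, size=500)

# The representative is the lowest-scoring model.
rep_index = int(np.argmin(scores))
rep_score = scores[rep_index]
rep_distance = distances[rep_index]  # distance of that same model
```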

exception pull_into_place.structures.IOError
pull_into_place.structures.load(pdb_dir, use_cache=True, job_report=None, require_io_dir=True)

Return a variety of score and distance metrics for the structures found in the given directory. As much information as possible is cached. Note that new information is calculated only for file names that haven’t been seen before. If a file changes or is deleted, the cache is not updated to reflect this, and you may be presented with stale data.
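The staleness described here follows directly from keying the cache on file names alone. The sketch below reproduces that behavior with a plain dictionary cache; `load_with_name_cache` and its `compute` callback are hypothetical and stand in for the real metric calculation, which this example does not attempt to reproduce.

```python
import os
import pickle
import tempfile


def load_with_name_cache(pdb_dir, cache_path, compute):
    """Compute a metric only for file names not already in the cache (hypothetical helper)."""
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            cache = pickle.load(f)
    for name in sorted(os.listdir(pdb_dir)):
        if name.endswith(".pdb") and name not in cache:
            cache[name] = compute(os.path.join(pdb_dir, name))
    with open(cache_path, "wb") as f:
        pickle.dump(cache, f)
    return cache


with tempfile.TemporaryDirectory() as tmp:
    pdb_path = os.path.join(tmp, "a.pdb")
    cache_path = os.path.join(tmp, "cache.pkl")

    with open(pdb_path, "w") as f:
        f.write("ATOM")
    # File size stands in for an expensive metric.
    first = load_with_name_cache(tmp, cache_path, os.path.getsize)

    with open(pdb_path, "w") as f:
        f.write("ATOM ATOM")  # the file changes on disk
    # The name "a.pdb" is already cached, so the old value is returned.
    second = load_with_name_cache(tmp, cache_path, os.path.getsize)
```

In this sketch `second` still holds the size recorded on the first call. With `load()`, passing `use_cache=False` is presumably the way to avoid such stale entries, although the exact behavior of that flag is not described here.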

pull_into_place.structures.read_and_calculate(workspace, pdb_paths)

Calculate a variety of score and distance metrics for the given structures.

pull_into_place.structures.xyz_to_array(xyz)

Convert a list of strings representing a 3D coordinate to floats and return the coordinate as a numpy array.
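A conversion of this kind might look like the following sketch. `xyz_to_array_sketch` is a stand-in written for illustration; the actual function body may differ.

```python
import numpy as np


def xyz_to_array_sketch(xyz):
    """Convert coordinate strings such as ["1.0", "-2.5", "3.25"] to a float array."""
    return np.array([float(x) for x in xyz])


# For example, the x, y, z fields sliced from a PDB ATOM record.
coord = xyz_to_array_sketch(["1.0", "-2.5", "3.25"])
```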