tools

pyrosetta.distributed.cluster.tools.get_protocols(protocols: Optional[Union[List[Union[Callable[[...], Any], str]], Callable[[...], Any], str]] = None, input_file: Optional[Union[str, Pose, PackedPose]] = None, scorefile: Optional[str] = None, decoy_name: Optional[str] = None) -> Union[List[Union[Callable[[...], Any], str]], NoReturn]

Given an ‘input_file’ that was written by PyRosettaCluster, or a full ‘scorefile’ and a ‘decoy_name’ that were written by PyRosettaCluster: if ‘protocols’ is provided, validate it against the protocols recorded in the ‘input_file’ or ‘scorefile’; otherwise, if ‘protocols’ is None, attempt to return the PyRosettaCluster protocols from the current scope that match the protocol names recorded in the ‘input_file’ or ‘scorefile’.

Args:

protocols: An iterable of str objects specifying the names of user-provided PyRosetta protocols to validate or return. Default: None

input_file: A str object specifying the path to the ‘.pdb’, ‘.pdb.bz2’, ‘.pkl_pose’, ‘.pkl_pose.bz2’, ‘.b64_pose’, ‘.b64_pose.bz2’, ‘.init’, or ‘.init.bz2’ file, or a Pose or PackedPose object, from which to extract PyRosettaCluster instance kwargs. If ‘input_file’ is provided, then ignore the ‘scorefile’ and ‘decoy_name’ keyword argument parameters. Default: None

scorefile: A str object specifying the path to the JSON-formatted scorefile (or pickled pandas.DataFrame scorefile) from a PyRosettaCluster simulation from which to extract PyRosettaCluster instance kwargs. If ‘scorefile’ is provided, ‘decoy_name’ must also be provided. In order to use a scorefile, it must contain full simulation records from the original production run; i.e., the attribute ‘simulation_records_in_scorefile’ was set to True. Default: None

decoy_name: A str object specifying the decoy name for which to extract PyRosettaCluster instance kwargs. If ‘decoy_name’ is provided, ‘scorefile’ must also be provided. Default: None

Returns:

A list of user-defined PyRosetta protocol names from the ‘input_file’ or ‘scorefile’. If protocols is None, then attempt to return the PyRosettaCluster protocols from the current scope matching the protocol names in the ‘input_file’ or ‘scorefile’.
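The validate-or-look-up behavior described above can be pictured with a small stand-alone sketch. It does not call PyRosetta: `resolve_protocols`, `recorded_names`, `scope`, and the two toy protocols are hypothetical names used only for illustration, and the in-order name comparison and the exception types are assumptions rather than the library's documented behavior.

```python
from typing import Callable, Dict, List, Optional, Sequence, Union

Protocol = Union[Callable[..., object], str]


def resolve_protocols(
    recorded_names: Sequence[str],
    protocols: Optional[Sequence[Protocol]] = None,
    scope: Optional[Dict[str, object]] = None,
) -> List[Protocol]:
    """Illustrative stand-in for the logic described for get_protocols()."""
    if protocols is not None:
        # Validate: the provided names must match the recorded names
        # (assumed here to be compared in order).
        names = [p if isinstance(p, str) else p.__name__ for p in protocols]
        if names != list(recorded_names):
            raise ValueError(
                f"Protocols {names} do not match recorded {list(recorded_names)}"
            )
        return list(protocols)
    # No protocols given: look each recorded name up in the supplied scope.
    scope = scope if scope is not None else {}
    missing = [name for name in recorded_names if not callable(scope.get(name))]
    if missing:
        raise NameError(f"Protocols not found in scope: {missing}")
    return [scope[name] for name in recorded_names]


def relax(packed_pose, **kwargs):  # hypothetical protocol
    return packed_pose


def design(packed_pose, **kwargs):  # hypothetical protocol
    return packed_pose


# Look-up mode: protocols is None, so names are resolved from the scope.
found = resolve_protocols(["relax", "design"], scope={"relax": relax, "design": design})
# Validation mode: callables and names may be mixed.
checked = resolve_protocols(["relax", "design"], protocols=[relax, "design"])
```

In an interactive session the `scope` dictionary would typically be something like `globals()`, which is why protocols defined under different names than in the original run cannot be found automatically.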

pyrosetta.distributed.cluster.tools.get_instance_kwargs(input_file: Optional[Union[str, Pose, PackedPose]] = None, scorefile: Optional[str] = None, decoy_name: Optional[str] = None, skip_corrections: Optional[bool] = None, with_metadata_kwargs: Optional[bool] = None) -> Union[Dict[str, Any], Tuple[Dict[str, Any], Dict[str, Any]], NoReturn]

Given an input file that was written by PyRosettaCluster, or a scorefile and a decoy name that was written by PyRosettaCluster, return the PyRosettaCluster instance kwargs needed to reproduce the decoy using PyRosettaCluster.

Args:

input_file: A str object specifying the path to the ‘.pdb’, ‘.pdb.bz2’, ‘.pkl_pose’, ‘.pkl_pose.bz2’, ‘.b64_pose’, ‘.b64_pose.bz2’, ‘.init’, or ‘.init.bz2’ file, or a Pose or PackedPose object, from which to extract PyRosettaCluster instance kwargs. If ‘input_file’ is provided, then ignore the ‘scorefile’ and ‘decoy_name’ keyword argument parameters. Default: None

scorefile: A str object specifying the path to the JSON-formatted scorefile (or pickled pandas.DataFrame scorefile) from a PyRosettaCluster simulation from which to extract PyRosettaCluster instance kwargs. If ‘scorefile’ is provided, ‘decoy_name’ must also be provided. In order to use a scorefile, it must contain full simulation records from the original production run; i.e., the attribute ‘simulation_records_in_scorefile’ was set to True. Default: None

decoy_name: A str object specifying the decoy name for which to extract PyRosettaCluster instance kwargs. If ‘decoy_name’ is provided, ‘scorefile’ must also be provided. Default: None

skip_corrections: A bool object specifying whether or not to skip any ScoreFunction corrections specified in the PyRosettaCluster task initialization options (extracted from the ‘input_file’ or ‘scorefile’ keyword argument parameter). Default: None

with_metadata_kwargs: A bool object specifying whether or not to return a tuple object with the instance kwargs as the first element and the metadata kwargs as the second element. Default: None

Returns:

A dict object of PyRosettaCluster instance kwargs, or a tuple object of dict objects with the PyRosettaCluster instance kwargs as the first element and the PyRosettaCluster metadata kwargs as the second element when with_metadata_kwargs=True.
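The precedence rules among ‘input_file’, ‘scorefile’, and ‘decoy_name’ can be summarized in a few lines. The sketch below is not part of PyRosetta: `select_record_source` is a hypothetical helper, it handles only str paths (not Pose or PackedPose inputs), and raising ValueError for incomplete arguments is an assumption.

```python
from typing import Optional, Tuple


def select_record_source(
    input_file: Optional[str] = None,
    scorefile: Optional[str] = None,
    decoy_name: Optional[str] = None,
) -> Tuple[str, Tuple[str, ...]]:
    """Return which source would be read, following the rules stated above."""
    if input_file is not None:
        # 'input_file' wins; 'scorefile' and 'decoy_name' are ignored.
        return "input_file", (input_file,)
    if scorefile is not None and decoy_name is not None:
        # A scorefile is only usable together with a decoy name.
        return "scorefile", (scorefile, decoy_name)
    raise ValueError("Provide 'input_file', or both 'scorefile' and 'decoy_name'.")


# Hypothetical file and decoy names, used only to exercise the rules.
from_file = select_record_source(
    input_file="decoy.pdb.bz2", scorefile="scores.json", decoy_name="decoy_0001"
)
from_scorefile = select_record_source(scorefile="scores.json", decoy_name="decoy_0001")
```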

pyrosetta.distributed.cluster.tools.reserve_scores(func: P) -> Union[P, NoReturn]

Use this as a Python decorator of any user-provided PyRosetta protocol. If scoreterms and values present in the input packed_pose are deleted during execution of the decorated protocol, they are appended back into the pose.cache dictionary after execution. If scoreterms present in the input packed_pose are also present in the returned or yielded output Pose or PackedPose objects, the original values are not appended back; that is, the output scoreterms and values are kept in the pose.cache dictionary. Any new scoreterms and values acquired in the decorated protocol are never overwritten. This allows users to preserve scoreterms and values acquired in earlier user-defined PyRosetta protocols when they need to execute Rosetta Movers that happen to delete scores from pose objects.

For example:

    @reserve_scores
    def my_pyrosetta_protocol(packed_pose, **kwargs):
        from pyrosetta import MyMover
        pose = packed_pose.pose
        MyMover().apply(pose)
        return pose

Args:

func: A user-provided PyRosetta function.

Returns:

The output from the user-provided PyRosetta function, reserving the scores.
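To make the restore rule concrete, here is a stand-alone sketch that mimics it with plain dictionaries. It is not the library's implementation: `FakePose`, `reserve_scores_sketch`, and `lossy_protocol` are hypothetical, and the sketch covers only a single returned object, not yielded outputs.

```python
import functools
from typing import Any, Callable, Dict


class FakePose:
    """Hypothetical stand-in exposing only a 'cache' dict of scores."""

    def __init__(self, cache: Dict[str, Any]) -> None:
        self.cache = dict(cache)


def reserve_scores_sketch(func: Callable[..., FakePose]) -> Callable[..., FakePose]:
    @functools.wraps(func)
    def wrapper(pose: FakePose, **kwargs: Any) -> FakePose:
        saved = dict(pose.cache)  # snapshot the scores present on input
        result = func(pose, **kwargs)
        for key, value in saved.items():
            # Restore only scores that were deleted; a value the protocol
            # wrote under the same key is kept as-is.
            result.cache.setdefault(key, value)
        return result

    return wrapper


@reserve_scores_sketch
def lossy_protocol(pose: FakePose, **kwargs: Any) -> FakePose:
    pose.cache.clear()  # imitates a Mover that wipes all scores
    pose.cache["total_score"] = -120.0  # re-scored inside the protocol
    pose.cache["new_metric"] = 1.5  # newly acquired score
    return pose


out = lossy_protocol(FakePose({"total_score": -100.0, "sasa": 5400.0}))
```

After the call, `out.cache` keeps the protocol's own `total_score` and `new_metric`, while the deleted `sasa` entry is restored from the input.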

pyrosetta.distributed.cluster.tools.requires_packed_pose(func: P) -> Union[PackedPose, None, P]

Use this as a Python decorator of any user-provided PyRosetta protocol. If a user-provided PyRosetta protocol requires that the first argument parameter be a non-empty PackedPose object, then return any received empty PackedPose objects or NoneType objects and skip the decorated protocol, otherwise run the decorated protocol.

If using PyRosettaCluster(filter_results=False) and the preceding protocol returns or yields either None, an empty Pose object, or an empty PackedPose object, then an empty PackedPose object is distributed to the next user-provided PyRosetta protocol, in which case the next protocol and/or any downstream protocols are skipped if they are decorated with this decorator. If using PyRosettaCluster(ignore_errors=True) and an error is raised in the preceding protocol, then a NoneType object is distributed to the next user-provided PyRosetta protocol, in which case the next protocol and/or any downstream protocols are skipped if they are decorated with this decorator.

For example:

    @requires_packed_pose
    def my_pyrosetta_protocol(packed_pose, **kwargs):
        assert packed_pose.pose.size() > 0
        return packed_pose

Args:

func: A user-provided PyRosetta function.

Returns:

The input packed_pose argument parameter if it is an empty PackedPose object or a NoneType object, otherwise the results from the decorated protocol.
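The skip-or-run decision can be illustrated without PyRosetta. In this sketch, `FakePackedPose`, its `is_empty()` method, `requires_packed_pose_sketch`, and `count_residues` are all hypothetical stand-ins; how the real decorator detects an empty PackedPose is not shown in this documentation.

```python
import functools
from typing import Any, Callable, List, Optional


class FakePackedPose:
    """Hypothetical stand-in that only knows how many residues it holds."""

    def __init__(self, n_residues: int) -> None:
        self.n_residues = n_residues

    def is_empty(self) -> bool:
        return self.n_residues == 0


def requires_packed_pose_sketch(func: Callable[..., Any]) -> Callable[..., Any]:
    @functools.wraps(func)
    def wrapper(packed_pose: Optional[FakePackedPose], **kwargs: Any) -> Any:
        if packed_pose is None or packed_pose.is_empty():
            # Pass the unusable input straight through and skip the protocol.
            return packed_pose
        return func(packed_pose, **kwargs)

    return wrapper


calls: List[int] = []  # records which inputs actually reached the protocol


@requires_packed_pose_sketch
def count_residues(packed_pose: FakePackedPose, **kwargs: Any) -> FakePackedPose:
    calls.append(packed_pose.n_residues)
    return packed_pose


empty = FakePackedPose(0)
skipped_none = count_residues(None)
skipped_empty = count_residues(empty)
ran = count_residues(FakePackedPose(12))
```

Only the third call reaches the protocol body; the first two return their input unchanged, which is what lets downstream decorated protocols be skipped in turn.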

pyrosetta.distributed.cluster.tools.reproduce(input_file: Optional[str] = None, scorefile: Optional[str] = None, decoy_name: Optional[str] = None, protocols: Any = None, client: Optional[Client] = None, clients: Optional[List[Client]] = None, input_packed_pose: Optional[Union[Pose, PackedPose]] = None, instance_kwargs: Optional[Dict[Any, Any]] = None, clients_indices: Optional[List[int]] = None, resources: Optional[Dict[Any, Any]] = None, skip_corrections: bool = False, init_from_file_kwargs: Optional[Dict[str, Any]] = None) -> Optional[NoReturn]

Given an input file that was written by PyRosettaCluster (or a full scorefile and a decoy name that was written by PyRosettaCluster) and any additional PyRosettaCluster instance kwargs, run the reproduction simulation for the given decoy with a new instance of PyRosettaCluster.

Args:

input_file: A str object specifying the path to the ‘.pdb’, ‘.pdb.bz2’, ‘.pkl_pose’, ‘.pkl_pose.bz2’, ‘.b64_pose’, ‘.b64_pose.bz2’, ‘.init’ or ‘.init.bz2’ file from which to extract PyRosettaCluster instance kwargs. If ‘input_file’ is provided, then ignore the ‘scorefile’ and ‘decoy_name’ argument parameters. If a ‘.init’ or ‘.init.bz2’ file is provided and PyRosetta is not yet initialized, this first initializes PyRosetta with the PyRosetta initialization file (see the ‘init_from_file_kwargs’ keyword argument). Note that ‘.pkl_pose’, ‘.pkl_pose.bz2’, ‘.b64_pose’, ‘.b64_pose.bz2’, ‘.init’ and ‘.init.bz2’ files contain pickled Pose objects that are deserialized using PyRosetta’s secure unpickler upon running the reproduce() function, but please still only input these file types if you know and trust their source. Default: None

scorefile: A str object specifying the path to the JSON-formatted scorefile (or pickled pandas.DataFrame scorefile) from a PyRosettaCluster simulation from which to extract PyRosettaCluster instance kwargs. If ‘scorefile’ is provided, ‘decoy_name’ must also be provided. In order to use a scorefile, it must contain full simulation records from the original production run; i.e., the attribute ‘simulation_records_in_scorefile’ was set to True. Note that in order to securely load pickled pandas.DataFrame objects, please ensure that pyrosetta.secure_unpickle.add_secure_package("pandas") has been run. Default: None

decoy_name: A str object specifying the decoy name for which to extract PyRosettaCluster instance kwargs. If ‘decoy_name’ is provided, ‘scorefile’ must also be provided. Default: None

protocols: An optional iterable object of function or generator objects specifying an ordered sequence of user-defined PyRosetta protocols to execute for the reproduction. This argument only needs to be provided if the user-defined PyRosetta protocols are not defined with the same scope as in the original production run. Default: None

client: An optional initialized dask distributed.client.Client object to be used as the dask client interface to the local or remote compute cluster. If None, then PyRosettaCluster initializes its own dask client based on the settings from the original production run. Deprecated by the clients attribute, but supported for legacy purposes. Default: None

clients: A list or tuple object of initialized dask distributed.client.Client objects to be used as the dask client interface(s) to the local or remote compute cluster(s). If None, then PyRosettaCluster initializes its own dask client based on the settings from the original production run. Optionally used in combination with the clients_indices attribute. Default: None

input_packed_pose: An optional input PackedPose object that is accessible via the first argument of the first user-defined PyRosetta protocol. Default: None

instance_kwargs: An optional dict object of valid PyRosettaCluster attributes which will override any PyRosettaCluster attributes that were used to generate the original decoy. Default: None

clients_indices: An optional list or tuple object of int objects, where each int object represents a zero-based index corresponding to the initialized dask distributed.client.Client object(s) passed to the PyRosettaCluster(clients=…) class attribute. If not None, then the length of the clients_indices object must equal the number of protocols passed to the PyRosettaCluster().distribute method. Default: None

resources: An optional list or tuple object of dict objects, where each dict object represents an abstract, arbitrary resource to constrain which dask workers run the user-defined PyRosetta protocols. If None, then do not impose resource constraints on any protocols. If not None, then the length of the resources object must equal the number of protocols passed to the PyRosettaCluster().distribute method, such that each resource specified indicates the unique resource constraints for the protocol at the corresponding index of the protocols passed to PyRosettaCluster().distribute. Note that this feature is only useful when one passes in their own instantiated client(s) with dask workers set up with various resource constraints. If dask workers were not instantiated to satisfy the specified resource constraints, protocols will hang indefinitely because the dask scheduler is waiting for workers that meet the specified resource constraints so that it can schedule these protocols. Unless workers were created with these resource tags applied, the protocols will not run. See https://distributed.dask.org/en/latest/resources.html for more information. Default: None

skip_corrections: A bool object specifying whether or not to skip any ScoreFunction corrections specified in the PyRosettaCluster task ‘options’ or ‘extra_options’ values (extracted from either the ‘input_file’ or ‘scorefile’ keyword argument parameter), which are set in-code upon PyRosetta initialization. If the current PyRosetta build and conda environment are identical to those used for the original simulation, this parameter may be set to True to enable the reproduced decoy output file to be used for successive reproductions. If reproducing from a ‘.init’ file, it is recommended to also set ‘skip_corrections’ of the ‘init_from_file_kwargs’ keyword argument to the same value. Default: False

init_from_file_kwargs: An optional dict object to override the default pyrosetta.init_from_file() keyword arguments if the ‘input_file’ keyword argument parameter is a path to a ‘.init’ file, otherwise it is not used. See the pyrosetta.init_from_file docstring for more information. Default:

    {
        "output_dir": os.path.join(tempfile.TemporaryDirectory().name, "pyrosetta_init_input_files"),
        "skip_corrections": skip_corrections,  # Defaults to the 'skip_corrections' value from reproduce()
        "relative_paths": True,
        "dry_run": False,
        "max_decompressed_bytes": pow(2, 30),  # 1 GiB
        "database": None,
        "verbose": True,
        "set_logging_handler": "logging",
        "notebook": None,
        "silent": False,
    }

Returns:

None
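The length requirement on ‘clients_indices’ and ‘resources’ means that each protocol is paired with exactly one client index and one resource constraint. The following sketch shows that pairing in isolation. `pair_protocol_settings` is a hypothetical helper, not a PyRosetta function; the protocol names and resource tags are made up, and raising ValueError on a length mismatch is an assumption.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple

Plan = List[Tuple[str, Optional[int], Optional[Dict[str, Any]]]]


def pair_protocol_settings(
    protocol_names: Sequence[str],
    clients_indices: Optional[Sequence[int]] = None,
    resources: Optional[Sequence[Optional[Dict[str, Any]]]] = None,
) -> Plan:
    """Pair each protocol with its client index and resource constraint."""
    n = len(protocol_names)
    for label, values in (("clients_indices", clients_indices), ("resources", resources)):
        # When given, each sequence needs one entry per protocol.
        if values is not None and len(values) != n:
            raise ValueError(
                f"'{label}' has {len(values)} entries but there are {n} protocols"
            )
    indices = clients_indices if clients_indices is not None else [None] * n
    constraints = resources if resources is not None else [None] * n
    return list(zip(protocol_names, indices, constraints))


plan = pair_protocol_settings(
    ["dock", "relax"],  # hypothetical protocol names
    clients_indices=[0, 1],  # first protocol on clients[0], second on clients[1]
    resources=[{"GPU": 1}, {"CPU": 1}],  # hypothetical dask resource tags
)
```

Checking the lengths before submitting work is a cheap way to catch a misaligned list early. It does not detect the separate case described above, where no dask worker advertises the requested resource and the protocols wait indefinitely.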

pyrosetta.distributed.cluster.tools.produce(**kwargs: Any) -> Optional[NoReturn]

PyRosettaCluster().distribute() shim requiring the ‘protocols’ keyword argument, and optionally any PyRosettaCluster keyword arguments or the ‘clients_indices’ keyword argument (when using the PyRosettaCluster(clients=…) keyword argument), or the ‘resources’ keyword argument, or the ‘priorities’ keyword argument.

Args:

**kwargs: See PyRosettaCluster docstring. The keyword arguments must also include ‘protocols’, an iterable object of function or generator objects specifying an ordered sequence of user-defined PyRosetta protocols to execute for the simulation (see PyRosettaCluster().distribute docstring). The keyword arguments may also optionally include ‘clients_indices’, ‘resources’, and ‘priorities’ (see PyRosettaCluster().distribute docstring).

Returns:

None
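A thin wrapper can make the required ‘protocols’ keyword explicit before handing everything to produce(). This is a sketch under stated assumptions: `run_production` is a hypothetical helper, the import path is the one shown in the signature above, and the import is deferred so the helper can be defined on a machine without PyRosetta. Any additional keyword arguments are passed through unchanged and must be valid PyRosettaCluster or distribute arguments.

```python
from typing import Any, Callable, Sequence


def run_production(protocols: Sequence[Callable[..., Any]], **cluster_kwargs: Any) -> None:
    """Check that protocols were supplied, then delegate to produce()."""
    if not protocols:
        # produce() requires 'protocols'; fail early with a clear message.
        raise ValueError("'protocols' must contain at least one protocol")
    # Deferred import: only needed once a simulation is actually launched.
    from pyrosetta.distributed.cluster.tools import produce

    produce(protocols=list(protocols), **cluster_kwargs)
```

With PyRosetta installed, a call would look like `run_production([my_protocol], **my_cluster_kwargs)`, where both names are placeholders for user-defined objects.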

pyrosetta.distributed.cluster.tools.run(**kwargs: Any) -> Optional[NoReturn]

PyRosettaCluster().distribute() shim requiring the ‘protocols’ keyword argument, and optionally any PyRosettaCluster keyword arguments or the ‘clients_indices’ keyword argument (when using the PyRosettaCluster(clients=…) keyword argument), or the ‘resources’ keyword argument, or the ‘priorities’ keyword argument.

Args:

**kwargs: See PyRosettaCluster docstring. The keyword arguments must also include ‘protocols’, an iterable object of function or generator objects specifying an ordered sequence of user-defined PyRosetta protocols to execute for the simulation (see PyRosettaCluster().distribute docstring). The keyword arguments may also optionally include ‘clients_indices’, ‘resources’, and ‘priorities’ (see PyRosettaCluster().distribute docstring).

Returns:

None

pyrosetta.distributed.cluster.tools.iterate(**kwargs: Any) -> Union[NoReturn, Generator[Tuple[PackedPose, Dict[Any, Any]], None, None]]

PyRosettaCluster().generate() shim requiring the ‘protocols’ keyword argument, and optionally any PyRosettaCluster keyword arguments or the ‘clients_indices’ keyword argument (when using the PyRosettaCluster(clients=…) keyword argument), or the ‘resources’ keyword argument, or the ‘priorities’ keyword argument.

Args:

**kwargs: See PyRosettaCluster docstring. The keyword arguments must also include ‘protocols’, an iterable object of function or generator objects specifying an ordered sequence of user-defined PyRosetta protocols to execute for the simulation (see PyRosettaCluster().generate docstring). The keyword arguments may also optionally include ‘clients_indices’, ‘resources’, and ‘priorities’ (see PyRosettaCluster().generate docstring).

Yields:

(PackedPose, dict) tuples from the most recently run user-provided PyRosetta protocol if PyRosettaCluster(save_all=True), otherwise from the final user-defined PyRosetta protocol.
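Because iterate() returns a generator, results are consumed one tuple at a time. The sketch below shows that consumption pattern with a stand-in generator so it runs without PyRosetta: `fake_iterate` and `take_results` are hypothetical, and plain strings take the place of PackedPose objects.

```python
import itertools
from typing import Any, Dict, Iterator, List, Tuple

Result = Tuple[Any, Dict[Any, Any]]


def take_results(results: Iterator[Result], limit: int) -> List[Result]:
    """Pull at most `limit` (packed_pose, dict) tuples from a results generator."""
    return list(itertools.islice(results, limit))


def fake_iterate(n: int) -> Iterator[Tuple[str, Dict[str, int]]]:
    # Hypothetical stand-in for iterate(**kwargs).
    for i in range(n):
        yield f"packed_pose_{i}", {"step": i}


first_two = take_results(fake_iterate(5), limit=2)

# Each item unpacks into the object and its accompanying dict.
labels = [f"{packed_pose}:{info['step']}" for packed_pose, info in first_two]
```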