io

class pyrosetta.distributed.cluster.io.IO

Bases: object

Input/Output methods for PyRosettaCluster.

DATETIME_FORMAT: str = '%Y-%m-%d %H:%M:%S.%f'
REMARK_FORMAT: str = 'REMARK PyRosettaCluster: '
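
A minimal sketch of how these two constants might be used, relying only on the standard library. The helper names `format_timestamp`, `make_remark_line`, and `parse_remark_line` are hypothetical and not part of PyRosettaCluster, and the sketch assumes the remark prefix is followed by JSON-serialized data:

```python
import json
from datetime import datetime

# Values copied from the class attributes documented above.
DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
REMARK_FORMAT = "REMARK PyRosettaCluster: "


def format_timestamp(moment: datetime) -> str:
    """Render a datetime with microsecond precision (hypothetical helper)."""
    return moment.strftime(DATETIME_FORMAT)


def make_remark_line(data: dict) -> str:
    """Prefix JSON-serialized data with the remark tag (hypothetical helper)."""
    return REMARK_FORMAT + json.dumps(data)


def parse_remark_line(line: str) -> dict:
    """Recover the data from a line produced by make_remark_line."""
    if not line.startswith(REMARK_FORMAT):
        raise ValueError("Line does not start with the expected remark prefix.")
    return json.loads(line[len(REMARK_FORMAT):])


moment = datetime(2024, 1, 2, 3, 4, 5, 678900)
stamp = format_timestamp(moment)
line = make_remark_line({"timestamp": stamp})
```

Parsing `stamp` back with `datetime.strptime(stamp, DATETIME_FORMAT)` recovers the original datetime, since `%f` keeps all six microsecond digits.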
_get_instance_and_metadata(kwargs: Dict[str, Any]) Tuple[Dict[str, Any], Dict[str, Any]]

Get the current state of the PyRosettaCluster instance, and split the input keyword arguments into the PyRosettaCluster instance attributes and ancillary metadata.

_get_output_dir(decoy_dir: str) str

Get the output directory in which to write files to disk.

static _filter_scores_dict(scores_dict: Dict[str, Any]) Dict[str, Any]

Filter for JSON-serializable scoring data.
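
One way such a filter could work is to attempt to serialize each value and drop the entries that fail. The sketch below is an assumption about the general technique, not the actual implementation of `_filter_scores_dict`; the function name `filter_json_serializable` is hypothetical:

```python
import json
from typing import Any, Dict


def filter_json_serializable(scores_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the entries whose values json.dumps can encode."""
    filtered = {}
    for key, value in scores_dict.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError, OverflowError):
            # Not encodable as JSON (e.g. bytes or arbitrary objects): skip it.
            continue
        filtered[key] = value
    return filtered


scores = {
    "total_score": -123.4,
    "tag": "decoy_1",
    "blob": b"\x00\x01",  # bytes are not JSON-serializable
    "obj": object(),      # arbitrary Python objects are not either
}
filtered = filter_json_serializable(scores)
```

Here `filtered` retains only the `total_score` and `tag` entries.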

_format_result(result: Union[Pose, PackedPose]) Tuple[PackedPose, str, Dict[str, Any], Dict[str, Any]]

Given a Pose or PackedPose object, return a tuple containing the PackedPose object, its PDB string, its Pose.cache dictionary, and its JSON-serializable Pose.cache dictionary.

Warning: This method uses the pickle module to deserialize pickled Pose objects and arbitrary Python types in the Pose.cache dictionary. Using the pickle module is not secure, so please only run with input files you trust. See the Python documentation for the pickle module to learn more about its security.

_parse_results(results: Optional[Union[bytes, Pose, PackedPose, Iterable[Union[bytes, Pose, PackedPose]]]]) List[Tuple[str, Dict[str, Any]]]

Format output results from a Dask worker.

Warning: This method uses the pickle module to deserialize pickled Pose objects and arbitrary Python types in the Pose.cache dictionary. Using the pickle module is not secure, so please only run with input files you trust. See the Python documentation for the pickle module to learn more about its security.

Args:
results: Pose | PackedPose | bytes | Iterable[Pose | PackedPose | bytes] | None

A Pose, PackedPose, bytes, or None object, or an iterable of Pose, PackedPose, or bytes objects.

Returns:

A list object of tuple objects, where each tuple object contains a PDB string, Pose.cache dictionary, and JSON-serializable Pose.cache dictionary.

_process_kwargs(kwargs: Dict[str, Any]) Dict[str, Any]

Parse a returned task dictionary.

_get_init_file_json(packed_pose: PackedPose) str

Return a PyRosetta initialization file as a JSON-serialized string.

Warning: This method uses the pickle module to deserialize pickled Pose objects. Using the pickle module is not secure, so please only run with input files you trust. See the Python documentation for the pickle module to learn more about its security.

static _add_pose_comment(packed_pose: PackedPose, pdbfile_data: str) PackedPose

Cache simulation data as a Pose comment.

Warning: This method uses the pickle module to deserialize pickled Pose objects. Using the pickle module is not secure, so please only run with input files you trust. See the Python documentation for the pickle module to learn more about its security.

static _dump_json(data: Dict[str, Any]) str

Return JSON-serialized data.

_save_results(results: Optional[bytes], kwargs: Dict[str, Any]) None

Write output results to disk.

Warning: This method uses the pickle module to deserialize pickled Pose objects and arbitrary Python types in the Pose.cache dictionary. Using the pickle module is not secure, so please only run with input files you trust. See the Python documentation for the pickle module to learn more about its security.

_cache_toml() None

Cache the Pixi/uv TOML file string and TOML file format.

_write_environment_file(filename: str) None

Write the Conda/Mamba YML or uv/Pixi lock file string to the input filename. If Pixi/uv is used as the environment manager, also write the TOML file string to a separate filename.

_write_init_file() None

Dump a PyRosetta initialization file, if applicable.

Warning: This method uses the pickle module to deserialize pickled Pose objects. Using the pickle module is not secure, so please only run with input files you trust. See the Python documentation for the pickle module to learn more about its security.

_dump_init_file(filename: str, input_packed_pose: Optional[PackedPose] = None, output_packed_pose: Optional[PackedPose] = None, verbose: bool = True) None

Dump compressed PyRosetta initialization input files and Pose or PackedPose objects to the input filename.

Warning: This method uses the pickle module to deserialize pickled Pose objects. Using the pickle module is not secure, so please only run with input files you trust. See the Python documentation for the pickle module to learn more about its security.

pyrosetta.distributed.cluster.io.verify_init_file(init_file: str, input_packed_pose: Optional[PackedPose], output_packed_pose: Optional[PackedPose], metadata: Dict[str, Any]) None

Verify that a PyRosetta initialization file was written by PyRosettaCluster.

Warning: This function uses the pickle module to deserialize pickled Pose objects. Using the pickle module is not secure, so please only run with input files you trust. See the Python documentation for the pickle module to learn more about its security.

pyrosetta.distributed.cluster.io.sign_init_file_metadata_and_poses(input_packed_pose: Optional[PackedPose] = None, output_packed_pose: Optional[PackedPose] = None) Tuple[Dict[str, Any], List[PackedPose]]

Sign PyRosetta initialization file “metadata” and “poses” keys.

Warning: This function uses the pickle module to deserialize pickled Pose objects. Using the pickle module is not secure, so please only run with input files you trust. See the Python documentation for the pickle module to learn more about its security.

pyrosetta.distributed.cluster.io.get_poses_from_init_file(init_file: str, verify: bool = False) Tuple[Optional[PackedPose], Optional[PackedPose]]

Return a tuple object of the input PackedPose object and the output PackedPose object from a “.init” or “.init.bz2” file, and optionally verify PyRosettaCluster metadata in the “.init” or “.init.bz2” file.

Warning: This function uses the pickle module to deserialize pickled Pose objects. Using the pickle module is not secure, so please only run with input files you trust. See the Python documentation for the pickle module to learn more about its security.
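
To inspect the top-level keys of an initialization file without unpickling any Pose objects, a standard-library sketch like the following may be enough. It assumes the file holds JSON text, with ".init.bz2" files being the same text compressed with bzip2. The helper name `read_init_file_json` is hypothetical, and the demo file written below is a stand-in rather than a real PyRosettaCluster output:

```python
import bz2
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def read_init_file_json(init_file: str) -> Dict[str, Any]:
    """Load the JSON content of a '.init' or '.init.bz2' file (hypothetical helper)."""
    path = Path(init_file)
    if path.suffix == ".bz2":
        # Assumption: the compressed variant is bzip2-compressed JSON text.
        with bz2.open(path, "rt", encoding="utf-8") as f:
            return json.load(f)
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Demo with a stand-in file containing only empty "metadata" and "poses" keys.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.init.bz2")
    with bz2.open(demo_path, "wt", encoding="utf-8") as f:
        json.dump({"metadata": {}, "poses": []}, f)
    loaded = read_init_file_json(demo_path)
```

This only parses the JSON layer. Recovering the PackedPose objects themselves still requires `get_poses_from_init_file`, with the pickle caveats noted above.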

pyrosetta.distributed.cluster.io.secure_read_pickle(filepath_or_buffer: str, compression: Optional[Union[str, Dict[str, Any]]] = 'infer', storage_options: Optional[Dict[str, Any]] = None) DataFrame

Proxy for pandas.read_pickle for file-like objects using the SecureSerializerBase class in PyRosetta. Usage requires adding “pandas” as a secure package to unpickle in PyRosetta.

Warning: This function uses the pickle module to deserialize pickled pandas.DataFrame objects. Using the pickle module is not secure, so please only run with input files you trust. See the Python documentation for the pickle module to learn more about its security.

Args:
filepath_or_buffer: str

See pandas.read_pickle.

compression: str | dict[str, Any] | None

See pandas.read_pickle.

Default: “infer”

storage_options: dict[str, Any] | None

See pandas.read_pickle.

Default: None

Example:
>>> import pyrosetta
>>> from pyrosetta.distributed.cluster.io import secure_read_pickle
>>> pyrosetta.secure_unpickle.add_secure_package("pandas")
>>> secure_read_pickle("/path/to/my/scorefile.gz")

Note:

If using pandas version >=3.0.0, PyArrow-backed datatypes may be enabled by default; in this case, please ensure that pyrosetta.secure_unpickle.add_secure_package("pyarrow") has been run first.

See https://pandas.pydata.org/pdeps/0010-required-pyarrow-dependency.html and https://pandas.pydata.org/pdeps/0014-string-dtype.html for more information.

Returns:

A deserialized pandas.DataFrame object.

pyrosetta.distributed.cluster.io.sanitize_urls(yml_str: str) str

Scan the input string and sanitize any URLs that include credentials for source domains, returning the updated string.
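
As an illustration of the general idea, the sketch below strips a `user:password@` component from any URL in a string using a regular expression. It is not the actual implementation of `sanitize_urls`, which may handle credentials differently (for example, by masking rather than removing them). The name `strip_url_credentials` and the example channel URLs are hypothetical:

```python
import re

# Match "<scheme>://<userinfo>@", where userinfo contains no "/", "@" or whitespace.
_CREDENTIALS = re.compile(r"(?P<scheme>[A-Za-z][A-Za-z0-9+.\-]*://)[^/\s@]+@")


def strip_url_credentials(text: str) -> str:
    """Remove the userinfo component from every URL found in the text."""
    return _CREDENTIALS.sub(r"\g<scheme>", text)


yml_str = (
    "channels:\n"
    "  - https://user:secret@conda.example.com/channel\n"
    "  - https://conda.anaconda.org/conda-forge\n"
)
sanitized = strip_url_credentials(yml_str)
```

URLs without credentials pass through unchanged, so the function can be applied to a whole environment file string.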

pyrosetta.distributed.cluster.io._is_pandas_object_pyarrow_backed(obj: Union[DataFrame, Series]) bool

Determine if a pandas.DataFrame or pandas.Series object uses Arrow-backed pandas dtypes.

Warning: This function is experimental and subject to change in future versions. See https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.convert_dtypes.html for more information.

Args:
obj: pandas.DataFrame | pandas.Series

An input pandas.DataFrame or pandas.Series object to test.

Returns:

True if the input object uses Arrow-backed pandas dtypes, otherwise False.