Lib I/O

Functions and classes for input/output.

class idpconfgen.libs.libio.FileIterator(origin, ext='.pdb')[source]

Iterate over files.

class idpconfgen.libs.libio.FileIteratorBase[source]

File iterator base class.

class idpconfgen.libs.libio.FileReaderIterator(origin, **kwargs)[source]

Dispatches file read iterator.

Here I created a class interface instead of a function for convenience.

class idpconfgen.libs.libio.TarFileIterator(origin, ext='.pdb')[source]

Iterate over files in tarfile.

idpconfgen.libs.libio.add_existent_files(storage, source)[source]

Add files that exist to a list.

Given a list of source Paths, adds each Path that exists to storage.

Adds Path instances.
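The behaviour described above can be sketched with the standard library alone. The name `add_existing_paths_sketch` is hypothetical, and this is an illustration rather than the library's implementation; it assumes `storage` is a list that is extended in place.

```python
from pathlib import Path


def add_existing_paths_sketch(storage, source):
    """Append to `storage` a Path for every entry of `source` that exists."""
    for entry in source:
        path = Path(entry)
        if path.exists():
            # Only existing paths are registered; missing ones are skipped.
            storage.append(path)
```

Because the list is modified in place, the caller keeps its own reference to `storage` and no return value is needed.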

idpconfgen.libs.libio.concatenate_entries(entry_list)[source]

Concatenate entries.

Entries can be given in a list of entries or file paths with entry lists. Single entries in the input list are used directly while files are read and their lines added one by one to the concatenated list.

Notice:

Does not discriminate between single entries and misspelled file paths. Every string that cannot be opened as a path is considered an individual entry.

Parameters:

entry_list (list) – List containing strings or file paths

Returns:

list – Concatenated strings plus lines in files.
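A minimal sketch of the rule stated in the notice, assuming that "cannot be opened" means `open()` raises an `OSError`. The name `concatenate_entries_sketch` is hypothetical, and stripping whitespace and skipping blank lines are assumptions of this sketch, not documented behaviour.

```python
def concatenate_entries_sketch(entry_list):
    """Flatten entries: readable files contribute their lines, others are kept."""
    concatenated = []
    for entry in entry_list:
        try:
            with open(entry) as handle:
                lines = [line.strip() for line in handle]
        except OSError:
            # Not openable as a file (missing, a directory, ...):
            # keep the string itself as a single entry.
            concatenated.append(entry)
        else:
            concatenated.extend(line for line in lines if line)
    return concatenated
```

The sketch makes the caveat concrete: a misspelled file path raises `FileNotFoundError`, a subclass of `OSError`, and is therefore silently kept as an entry.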

idpconfgen.libs.libio.extract_from_tar(tar_file, output=None, ext='.pdb')[source]

Extract files from tarfile.

Parameters:
  • tar_file (str) – The tarfile to extract.

  • output (str, optional) – The folder where the tar files are extracted. Defaults to the current working directory.

Returns:

list – A list of Path-objects pointing to the extracted files.

idpconfgen.libs.libio.file_exists(path, ifdir=<function get_false>, doelse=<function get_false>)[source]

Confirm file exists.

Parameters:

path (path-object or str)

Returns:

bool

idpconfgen.libs.libio.glob_folder(folder, ext)[source]

List files with extension ext in folder.

Does NOT perform recursive search.

Parameters:
  • folder (str) – The path to the folder to investigate.

  • ext (str) – The file extension. Can be given with or without the dot [.] prefix.

Returns:

list of Path objects – SORTED list of matching results.
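A non-recursive, sorted listing of this kind could look like the sketch below. `glob_folder_sketch` is a hypothetical name and the code relies only on `pathlib`; it is not taken from the library source.

```python
from pathlib import Path


def glob_folder_sketch(folder, ext):
    """Return a sorted list of files directly inside `folder` with suffix `ext`."""
    # Accept both "pdb" and ".pdb".
    suffix = "." + ext.lstrip(".")
    # Path.glob with a plain "*" pattern does not descend into subfolders.
    return sorted(p for p in Path(folder).glob("*" + suffix) if p.is_file())
```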

idpconfgen.libs.libio.has_suffix(path, ext=None)[source]

Evaluate file suffix according to ext condition.

Parameters:
  • path (str or Path) – The file path.

  • ext (str) – The file extension. Can be dotted ‘.’ (.csv) or plain (csv).

Returns:

bool

True

Always if ext is None; otherwise, if the path suffix equals ext.

False

Otherwise
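The truth table above can be expressed as a short sketch. `has_suffix_sketch` is a hypothetical name; the comparison through `Path.suffix` is an assumption about how the check might be done, not the library's code.

```python
from pathlib import Path


def has_suffix_sketch(path, ext=None):
    """Return True if `ext` is None or if the suffix of `path` equals `ext`."""
    if ext is None:
        return True
    # Normalise "csv" and ".csv" to the dotted form before comparing.
    normalized = "." + ext.lstrip(".")
    return Path(path).suffix == normalized
```

A keyword-only variant with a fixed default, such as a FASTA-specific check, could be derived with `functools.partial(has_suffix_sketch, ext=".fasta")`.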

idpconfgen.libs.libio.has_suffix_fasta(path, *, ext='.fasta')

Evaluate file suffix according to ext condition.

Parameters:
  • path (str or Path) – The file path.

  • ext (str) – The file extension. Can be dotted ‘.’ (.csv) or plain (csv).

Returns:

bool

True

Always if ext is None; otherwise, if the path suffix equals ext.

False

Otherwise

idpconfgen.libs.libio.is_valid_fasta_file(path)[source]

Return True if FASTA file is valid.

idpconfgen.libs.libio.list_files_recursively(folder, ext=None)[source]

List files recursively from source folder.

Parameters:
  • folder (string or Path) – The folder from where to start searching.

  • ext (string) – The file extension to consider. Files without the defined ext will be ignored. Defaults to None, in which case all files are considered.

Returns:

unsorted list – Of the file paths relative to the source folder.
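A recursive search with an optional extension filter might be sketched as follows. `list_files_recursively_sketch` is a hypothetical name, and the returned paths here include the `folder` prefix as produced by `Path.rglob`, which may differ from the library's exact output.

```python
from pathlib import Path


def list_files_recursively_sketch(folder, ext=None):
    """Collect files below `folder`, optionally keeping only those with `ext`."""
    suffix = None if ext is None else "." + ext.lstrip(".")
    return [
        p
        for p in Path(folder).rglob("*")  # rglob walks all subfolders
        if p.is_file() and (suffix is None or p.suffix == suffix)
    ]
```

The order of the result follows the file system traversal, so callers that need a stable order should sort the list themselves.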

idpconfgen.libs.libio.log_nonexistent(files)[source]

Log to ERROR files that do not exist.

Parameters:

files (iterable of Paths)

idpconfgen.libs.libio.make_destination_folder(dest)[source]

Make a destination folder.

Returns:

Path-object – A path pointing to the folder created. If dest is None, returns a Path pointing to the CWD.

idpconfgen.libs.libio.make_folder_or_cwd(folder)[source]

Make a folder or give the CWD.

Parameters:

folder (str or Path) – Make the folder. If folder is None return the CWD.

Returns:

Path

idpconfgen.libs.libio.parse_suffix(ext)[source]

Represent a suffix of a file.

Example

>>> parse_suffix('.pdf')
'.pdf'

>>> parse_suffix('pdf')
'.pdf'

Parameters:

ext (str) – String to extract the suffix from.

Returns:

str – File extension with leading period.

idpconfgen.libs.libio.paths_from_flist(path)[source]

Read Paths from file listing paths.

Returns:

Map generator – Path representation of the entries in the file.

idpconfgen.libs.libio.read_FASTAS_from_file(fpath)[source]

Read FASTA sequence from file.

idpconfgen.libs.libio.read_FASTAS_from_file_to_strings(fpath)[source]

Read FASTA sequences from file.

FASTA is output as string.

Note that there should be no blank spaces between different sequences. The final return will be a dictionary where the key value will be “>XYZ” header and the value will be a list of individual residue letters [‘X’, ‘Y’, ‘Z’].

idpconfgen.libs.libio.read_PDBID_from_folder(folder)[source]

Read PDBIDs from folder.

Parameters:

folder (str) – The folder to read.

Returns:

idpconfgen.libs.libpdb.PDBList

idpconfgen.libs.libio.read_PDBID_from_source(source)[source]

Read PDBIDs from destination.

Accepted destinations:
  • folder

  • tarfile

Returns:

idpconfgen.libs.libpdb.PDBList.

idpconfgen.libs.libio.read_PDBID_from_tar(tar_file)[source]

Read PDBIDs from tarfile.

Note

Case-specific function, not an abstraction.

Parameters:

tar_file (idpconfgen.Path) – The tarfile to read.

Returns:

idpconfgen.libs.libpdb.PDBList – If file is not found returns an empty PDBList.

idpconfgen.libs.libio.read_dict_from_json(path)[source]

Read dict from json.

idpconfgen.libs.libio.read_dict_from_pickle(path)[source]

Read dictionary from pickle.

idpconfgen.libs.libio.read_dict_from_tar(path)[source]

Read dictionary from .tar file.

idpconfgen.libs.libio.read_dictionary_from_disk(path)[source]

Read a dictionary from disk.

Accepted formats:
  • pickle

  • json

Returns:

dict

idpconfgen.libs.libio.read_lines(fpath)[source]

Read lines from path.

idpconfgen.libs.libio.read_path_bundle(path_bundle, ext=None, listext='.list')[source]

Read path bundle.

Read paths encoded in strings, folders or files that are lists of files.

  • If a string or Path object points to an existing file, registers that path.

  • If a string points to a folder, registers all files in that folder that have extension ext, recursively.

  • If a string points to a file with extension listext, registers all files referred to in the listext file that exist on disk.

Non-existing files are logged as error messages.

Parameters:
  • path_bundle (list-like) – A list containing strings or paths that point to files or folders.

  • ext (string) – The file extension to consider. If None considers all files. Defaults to None.

  • listext (string) – The file extension to consider as a file listing other files. Defaults to .list.

Returns:

generator – A generator that complies with the specifications described above.
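The three rules for files, folders and list files can be sketched as a generator. `read_path_bundle_sketch` is a hypothetical name. The sketch assumes that list files contain one path per line, that `ext` filters only the folder search, and that list files are checked before plain files; none of these details is confirmed by the text above. Missing entries are simply skipped here instead of being logged.

```python
from pathlib import Path


def read_path_bundle_sketch(path_bundle, ext=None, listext=".list"):
    """Yield file paths from a mix of files, folders and list files."""
    suffix = None if ext is None else "." + ext.lstrip(".")
    for entry in path_bundle:
        path = Path(entry)
        if path.is_dir():
            # Folders contribute all matching files, searched recursively.
            for found in sorted(path.rglob("*")):
                if found.is_file() and (suffix is None or found.suffix == suffix):
                    yield found
        elif path.is_file() and path.suffix == listext:
            # List files contribute the existing files they name.
            for line in path.read_text().splitlines():
                name = line.strip()
                if name and Path(name).is_file():
                    yield Path(name)
        elif path.is_file():
            # A plain existing file is registered as it is.
            yield path
```

Being a generator, nothing is read from disk until the result is iterated, for example with `list(read_path_bundle_sketch(["structures", "extra.list"], ext=".pdb"))`, where both names are placeholders.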

idpconfgen.libs.libio.read_text(fpath)[source]

Read text from path.

idpconfgen.libs.libio.save_dict_to_json(mydict, output='mydict.json', indent=True, sort_keys=True)[source]

Save dictionary to JSON.

idpconfgen.libs.libio.save_dict_to_pickle(mydict, output='mydict.pickle', **kwargs)[source]

Save dictionary to pickle file.

idpconfgen.libs.libio.save_dictionary(mydict, output='mydict.pickle')[source]

Save dictionary to disk.

Accepted formats:
  • pickle

  • json

Parameters:
  • mydict (dict) – The dict to be saved.

  • output (str or Path) – The output file. Format is deduced from file extension.

Raises:

KeyError – If extension is not a compatible format.
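One way to deduce the format from the file extension, and to get a `KeyError` for unsupported ones, is a dispatch dictionary keyed by suffix. The sketch below assumes that design; `save_dictionary_sketch`, `WRITERS` and the two helper functions are hypothetical names, not the library's internals.

```python
import json
import pickle
from pathlib import Path


def _save_json(mydict, output):
    with open(output, "w") as handle:
        json.dump(mydict, handle, indent=4, sort_keys=True)


def _save_pickle(mydict, output):
    with open(output, "wb") as handle:
        pickle.dump(mydict, handle)


# Hypothetical dispatch table: file suffix -> writer function.
WRITERS = {".json": _save_json, ".pickle": _save_pickle}


def save_dictionary_sketch(mydict, output="mydict.pickle"):
    """Save `mydict`, choosing the format from the suffix of `output`."""
    # An unknown suffix is not a key of WRITERS, so this raises KeyError.
    writer = WRITERS[Path(output).suffix]
    writer(mydict, output)
```

With this layout, supporting another format only requires adding one entry to the table.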

idpconfgen.libs.libio.save_file_to_tar(tar, fout, data)[source]

Save content to a file inside a tar.

Parameters:
  • tar (opened tar-file instance)

  • fout (str) – The name under which to save the data.

  • data (str or bytes) – Data to save to fout named file inside tar.
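Writing in-memory data into an open tar file can be done with the standard `tarfile` module, as in the sketch below. `save_file_to_tar_sketch` is a hypothetical name, and encoding `str` data with the default UTF-8 codec is an assumption of this sketch.

```python
import io
import tarfile


def save_file_to_tar_sketch(tar, fout, data):
    """Write `data` (str or bytes) as member `fout` of the open tar file."""
    payload = data.encode() if isinstance(data, str) else data
    # A TarInfo header needs the member name and the payload size.
    info = tarfile.TarInfo(name=fout)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
```

The tar file has to be opened in a writable mode beforehand, for instance `tarfile.open(path, "a")`, which appends to an existing archive and creates it otherwise.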

idpconfgen.libs.libio.save_pairs_to_disk(pairs, destination)[source]

Save pairs to files.

Parameters:
  • pairs (list or tuple) – First indexes are used as file names, second indexes as info to write to files.

  • destination (str or Path-like) – Destination on disk where the files are saved. Current options are: a folder, a TAR file.

idpconfgen.libs.libio.save_pairs_to_files(pairs, destination=None)[source]

Save pairs to files.

Where each pair (tuple or list of length 2) is composed of:

(file name, data content)

Parameters:
  • pairs (list or tuple) – The pairs of information to save to the disk. Index 0 is the file name, and index 1 is the data content.

  • destination (str or Path, optional) – The folder where to save the files. It is NOT the file name. Defaults to the CWD.
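The (file name, data content) convention can be illustrated with a short sketch. `save_pairs_to_files_sketch` is a hypothetical name; creating the destination folder when it is missing and accepting both `str` and `bytes` content are assumptions, not documented behaviour.

```python
from pathlib import Path


def save_pairs_to_files_sketch(pairs, destination=None):
    """Write each (file name, content) pair as a file inside `destination`."""
    folder = Path.cwd() if destination is None else Path(destination)
    folder.mkdir(parents=True, exist_ok=True)
    for name, content in pairs:
        # Choose text or binary mode according to the content type.
        mode = "wb" if isinstance(content, bytes) else "w"
        with open(folder / name, mode) as handle:
            handle.write(content)
```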

idpconfgen.libs.libio.save_pairs_to_tar(pairs, destination)[source]

Save pairs to files inside a TAR file.

Where each pair (tuple or list of length 2) is composed of:

(file name, data content)

Parameters:
  • pairs (list or tuple) – The pairs of information to save to the disk. Index 0 is the file name, and index 1 is the data content.

  • destination (str or Path) – The TAR file where to save the files. It is NOT the file name. If the TAR file exists, files are appended to it; if not, it is created.