Welcome to Cromwell Manager’s documentation!¶
Indices and tables¶
Modules¶
-
class
cromwell_manager.cromwell.
Cromwell
(cromwell_url, username=None, password=None, api_version='v1')[source]¶ Wrapper for the Cromwell REST API
-
abort_workflow
(workflow_id, *args, **kwargs)[source]¶ Abort a workflow.
Parameters: - workflow_id (str) – hash for workflow to abort
- verbose (bool) – if True, print the query, response code, and content (default False)
- args – additional arguments to pass to requests.post
- kwargs – additional arguments to pass to requests.post
Return requests.Response: requests response object
-
backends
(*args, **kwargs)[source]¶ Retrieve backends for this cromwell instance.
Parameters: - verbose (bool) – if True, print the query, response code, and content (default False)
- open_browser (bool) – if True, display the GET result in browser (default False)
- args – additional positional args to pass to requests.get
- kwargs – additional keyword args to pass to requests.get
Return requests.Response: requests response object
-
cromwell_url
¶ URL for the cromwell REST endpoints.
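As a rough illustration of how a base URL and an API version can combine into a per-workflow endpoint, the sketch below follows the path layout of Cromwell's public REST API (`/api/workflows/<version>/<id>/<action>`). The helper name `workflow_endpoint` and the example host are hypothetical and not part of cromwell_manager; the wrapper's own URL handling may differ.

```python
def workflow_endpoint(cromwell_url, workflow_id, action, api_version="v1"):
    """Build a per-workflow endpoint URL (hypothetical helper, not library API)."""
    # Drop any trailing slash so the joined path has no double slash.
    base = cromwell_url.rstrip("/")
    return f"{base}/api/workflows/{api_version}/{workflow_id}/{action}"


# Example host and workflow id are placeholders.
status_url = workflow_endpoint("https://cromwell.example.org/", "1234-abcd", "status")
print(status_url)
```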
-
get
(url, verbose=False, open_browser=False, *args, **kwargs)[source]¶ Make a REST GET query to url.
Parameters: - url (str) – GET query url
- verbose (bool) – if True, print the query, response code, and content (default False)
- open_browser (bool) – if True, display the GET result in browser (default False)
- args – additional positional args to pass to requests.get
- kwargs – additional keyword args to pass to requests.get
Return requests.Response: requests response object
-
logs
(workflow_id, *args, **kwargs)[source]¶ Retrieve logs for workflow_id.
Parameters: - workflow_id (str) – hash for workflow to retrieve logs for
- verbose (bool) – if True, print the query, response code, and content (default False)
- open_browser (bool) – if True, display the GET result in browser (default False)
- args – additional positional args to pass to requests.get
- kwargs – additional keyword args to pass to requests.get
Return requests.Response: requests response object
-
metadata
(workflow_id, *args, **kwargs)[source]¶ Retrieve metadata for workflow_id.
Parameters: - workflow_id (str) – hash for workflow to retrieve metadata for
- verbose (bool) – if True, print the query, response code, and content (default False)
- open_browser (bool) – if True, display the GET result in browser (default False)
- args – additional positional args to pass to requests.get
- kwargs – additional keyword args to pass to requests.get
Return requests.Response: requests response object
-
outputs
(workflow_id, *args, **kwargs)[source]¶ Retrieve outputs for workflow_id.
Parameters: - workflow_id (str) – hash for workflow to retrieve outputs for
- verbose (bool) – if True, print the query, response code, and content (default False)
- open_browser (bool) – if True, display the GET result in browser (default False)
- args – additional positional args to pass to requests.get
- kwargs – additional keyword args to pass to requests.get
Return requests.Response: requests response object
-
post
(url, verbose=False, *args, **kwargs)[source]¶ Make a REST POST query to url.
Parameters: - url (str) – POST query url
- verbose (bool) – if True, print the query, response code, and content (default False)
- args – additional arguments to pass to requests.post
- kwargs – additional arguments to pass to requests.post
Return requests.Response: requests response object
-
static
print_failure
(response, message='')[source]¶ Print information on a failing request to console.
Parameters: - response (requests.Response) – response from request operation
- message (str) – (optional) message to append to failure report
-
static
print_request
(request_type, request_string, response)[source]¶ Print a request to console.
Triggered by the verbose=True flag on cromwell or workspace functions and properties.
Parameters: - request_type (str) – {GET, POST} type of REST operation
- request_string (str) – full request url
- response (requests.Response) – response from request operation
-
query
(start=None, end=None, names=None, ids=None, status=None, labels=None, *args, **kwargs)[source]¶ Query cromwell for workflows matching specified metadata information.
Parameters: - start (str) – datetime string in format #todo
- end (str) – datetime string in format #todo
- names (list) – list of one or more workflow name(s)
- ids (list) – list of one or more workflow id(s)
- status (list) – list of one or more workflow status(es). Must be a valid status: {Submitted, Running, Aborting, Failed, Succeeded, Aborted}
- labels (dict) – dictionary of custom label:value pairs
- verbose (bool) – if True, print the query, response code, and content (default False)
- open_browser (bool) – if True, display the GET result in browser (default False)
Return requests.Response:
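To make the filter arguments above concrete, here is a minimal sketch of how they could be validated and flattened into repeated query parameters. The parameter names (`name`, `id`, `status`, `label` as `key:value`) follow Cromwell's public query endpoint; the helper `build_query_params` is hypothetical, and how cromwell_manager actually assembles its query is an assumption.

```python
from urllib.parse import urlencode

# Statuses listed as valid in the parameter description above.
VALID_STATUSES = {"Submitted", "Running", "Aborting", "Failed", "Succeeded", "Aborted"}


def build_query_params(start=None, end=None, names=None, ids=None, status=None, labels=None):
    """Assemble (key, value) pairs for a workflow query (illustrative only)."""
    invalid = set(status or []) - VALID_STATUSES
    if invalid:
        raise ValueError(f"invalid status value(s): {sorted(invalid)}")

    params = []
    if start is not None:
        params.append(("start", start))
    if end is not None:
        params.append(("end", end))
    # List-valued filters become one repeated parameter per element.
    params.extend(("name", name) for name in names or [])
    params.extend(("id", workflow_id) for workflow_id in ids or [])
    params.extend(("status", value) for value in status or [])
    params.extend(("label", f"{key}:{value}") for key, value in (labels or {}).items())
    return params


params = build_query_params(status=["Running", "Failed"], labels={"project": "demo"})
print(urlencode(params))
```

Rejecting an unknown status before any request is made keeps the error local instead of relying on the server's response.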
-
server_is_running
(*args, **kwargs)[source]¶ Return True if the server is running, else False.
Parameters: - verbose (bool) – if True, print the query, response code, and content (default False)
- open_browser (bool) – if True, display the GET result in browser (default False)
-
stats
(*args, **kwargs)[source]¶ Retrieve cromwell statistics on number of running jobs
Parameters: - verbose (bool) – if True, print the query, response code, and content (default False)
- open_browser (bool) – if True, display the GET result in browser (default False)
- args – additional positional args to pass to requests.get
- kwargs – additional keyword args to pass to requests.get
Return requests.Response: requests response object
-
status
(workflow_id, *args, **kwargs)[source]¶ Retrieve status for workflow_id.
Parameters: - workflow_id (str) – hash for workflow to retrieve status for
- verbose (bool) – if True, print the query, response code, and content (default False)
- open_browser (bool) – if True, display the GET result in browser (default False)
- args – additional positional args to pass to requests.get
- kwargs – additional keyword args to pass to requests.get
Return requests.Response: requests response object
-
submit
(files, wait=True, timeout=15, delay=3, verbose=False, *args, **kwargs)[source]¶ Submit a new workflow.
Parameters: - files (dict) – dictionary of files from workflow._submission_json
- wait (bool) – if True, wait until Cromwell recognizes the workflow as submitted
- timeout (int) – maximum time to wait
- delay (int) – time between status queries
- verbose (bool) – if True, print request results
- args – additional positional args to pass to requests.post
- kwargs – additional keyword args to pass to requests.post
Return requests.Response: requests response object
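The shape of the `files` dictionary is not spelled out above. As a hedged sketch, the example below builds a multipart-style mapping using the form field names from Cromwell's public submission endpoint (`workflowSource`, `workflowInputs`, `workflowOptions`, `labels`). Whether `workflow._submission_json` uses exactly these keys is an assumption, and `build_submission_files` is a hypothetical helper.

```python
import io
import json


def build_submission_files(wdl_text, inputs, options=None, labels=None):
    """Return a dict of readable binary objects keyed by form field name.

    Hypothetical helper; the keys mirror Cromwell's REST submission form.
    """
    files = {
        "workflowSource": io.BytesIO(wdl_text.encode("utf-8")),
        "workflowInputs": io.BytesIO(json.dumps(inputs).encode("utf-8")),
    }
    # Optional fields are only included when provided.
    if options is not None:
        files["workflowOptions"] = io.BytesIO(json.dumps(options).encode("utf-8"))
    if labels is not None:
        files["labels"] = io.BytesIO(json.dumps(labels).encode("utf-8"))
    return files


files = build_submission_files("workflow w {}", {"w.x": 1}, labels={"project": "demo"})
print(sorted(files))
```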
-
timing
(workflow_id)[source]¶ Open timing in browser window for workflow_id.
Parameters: workflow_id (str) – run id to open timing for
-
version
(*args, **kwargs)[source]¶ Retrieve the cromwell version
Parameters: - verbose (bool) – if True, print the query, response code, and content (default False)
- open_browser (bool) – if True, display the GET result in browser (default False)
- args – additional positional args to pass to requests.get
- kwargs – additional keyword args to pass to requests.get
Return requests.Response: requests response object
-
wait_for_status
(status, workflow_id, verbose=False, timeout=15, delay=3)[source]¶ Wait until any status in a list of potentially many statuses is achieved for a workflow.
Parameters: - status (Iterable) – Iterable of one or more statuses to wait for
- workflow_id (str) – identifier hash code for a workflow
- verbose (bool) – if True, print the requests made
- timeout (int) – maximum time to wait
- delay (int) – time between status queries
Return requests.Response: response object generated when workflow_id achieves the first valid status
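The `timeout` and `delay` parameters describe a polling loop. A minimal sketch of that pattern follows, assuming both values are in seconds and that a timeout raises an exception; neither detail is stated above. `get_status` is a hypothetical zero-argument callable standing in for a status request, so the example runs without a server.

```python
import time


def wait_for_status(get_status, targets, timeout=15, delay=3):
    """Poll get_status() until it returns a value in targets or timeout elapses."""
    targets = set(targets)
    deadline = time.monotonic() + timeout
    while True:
        current = get_status()
        if current in targets:
            return current
        # Give up if another sleep would run past the deadline.
        if time.monotonic() + delay > deadline:
            raise TimeoutError(
                f"no status in {sorted(targets)} within {timeout}s (last: {current})"
            )
        time.sleep(delay)


# Simulated status sequence in place of a live Cromwell server.
_statuses = iter(["Submitted", "Running", "Succeeded"])
final = wait_for_status(lambda: next(_statuses), ["Succeeded", "Failed"], timeout=1, delay=0.01)
print(final)
```

Passing several target statuses lets the caller stop on either success or failure instead of waiting out the full timeout when a workflow fails early.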
-
-
class
cromwell_manager.workflow.
WorkflowBase
(workflow_id, cromwell_server, storage_client=None)[source]¶ -
cromwell_server
¶ Authenticated, currently-running Cromwell server.
-
inputs
¶ workflow inputs
-
logs
¶ workflow logs
-
metadata
¶ Workflow metadata.
-
outputs
¶ workflow outputs
-
root
¶ root directory for workflow outputs
-
save_resource_utilization
(filename, retrieve=True)[source]¶ Save resource utilizations for each task to file.
Parameters: - filename (str | BufferedIOBase) – filename or open file object in which to save resource utilization
- retrieve (bool) – if True, get the current metadata from Cromwell, otherwise retrieve stored metadata (default True)
- workflow_id (str) – identifier hash code for a workflow
- verbose (bool) – if True, print the requests made
- timeout (int) – maximum time to wait
- delay (int) – time between status queries
Return requests.Response: status response from Cromwell
-
status
¶ Status of workflow.
-
storage_client
¶ Authenticated google storage client.
-
tasks
¶ Get the workflow task summaries.
Return dict: Cromwell metadata for workflow
-
-
class
cromwell_manager.workflow.
Workflow
(workflow_id, cromwell_server, storage_client=None)[source]¶ Object to define an instance of a top-level workflow run on Cromwell.
-
abort
(*args, **kwargs)[source]¶ Abort this workflow.
Parameters: - args – arguments to pass to cromwell.abort
- kwargs – keyword arguments to pass to cromwell.abort
Return requests.Response: abort response
-
classmethod
from_submission
(wdl, inputs_json, cromwell_server, storage_client, options_json=None, workflow_dependencies=None, custom_labels=None, *args, **kwargs)[source]¶ Submit a new workflow, returning a Workflow object.
Parameters: - wdl (str) – wdl that defines this workflow
- inputs_json (str) – inputs to this wdl
- cromwell_server (Cromwell) – an authenticated cromwell server
- storage_client (storage.Client) – authenticated google storage client
- workflow_dependencies (str | dict) –
- custom_labels (dict) –
- options_json (str) – options file for the workflow
- wait (bool) – if True, wait until Cromwell recognizes the workflow as submitted (default: True)
- timeout (int) – maximum time to wait
- delay (int) – time between status queries
- verbose (bool) – if True, print request results
- args – additional positional args to pass to requests.post
- kwargs – additional keyword args to pass to requests.post
Return dict: Cromwell submission result
-
classmethod
validate
(wdl, inputs_json, storage_client, options_json=None, workflow_dependencies=None, custom_labels=None, *args, **kwargs)[source]¶ Validate a workflow, catching errors before submission.
If using positional arguments, the same argument set that is used for submission can be used to call validate.
Parameters: - wdl (str) – wdl that defines this workflow
- inputs_json (str) – inputs to this wdl
- storage_client (storage.Client) – authenticated google storage client
- workflow_dependencies (str | dict) –
- custom_labels (dict) –
- options_json (str) – options file for the workflow
- args – argument sink for arguments to from_submission that are not used.
- kwargs – argument sink for arguments to from_submission that are not used.
-
wait_until_complete
(*args, **kwargs)[source]¶ Wait until the workflow completes running.
Optional Arguments: - workflow_id (str) – identifier hash code for a workflow
- verbose (bool) – if True, print the requests made
- timeout (int) – maximum time to wait
- delay (int) – time between status queries
Return requests.Response: status response from Cromwell
-
-
class
cromwell_manager.workflow.
SubWorkflow
(workflow_id, cromwell_server, storage_client=None)[source]¶ A workflow without custom constructors
-
class
cromwell_manager.calledtask.
CalledTask
(name, shard_metadata, client)[source]¶ Object to define an instance of a called workflow task.
-
class
cromwell_manager.calledtask.
Shard
(metadata, client)[source]¶ At the moment, Shard is a simple named dictionary class containing shard information.
-
class
cromwell_manager.resource_utilization.
ResourceUtilization
(task_name, max_memory, total_memory, max_disk, total_disk, robust)[source]¶ Class to store resource utilization information for a task, run on Cromwell.
-
classmethod
from_file
(task_name, open_log_file_object)[source]¶ Create a ResourceUtilization object from a monitoring log file.
Parameters: - task_name (str) – Name of this task
- open_log_file_object (file) – an open monitoring log from cromwell
Return ResourceUtilization: memory and disk utilization for this task
-
static
merge
(x, y=None)[source]¶ Merge two ResourceUtilization objects for the same task, returning the maximum utilization.
Parameters: - x (ResourceUtilization) –
- y (ResourceUtilization) –
Return ResourceUtilization: maximum resource utilization
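One way to read "returning the maximum utilization" is a field-wise maximum over the two records. The sketch below shows that interpretation with a stand-in dataclass. `Utilization` is hypothetical: its fields borrow names from the documented constructor but omit `robust`, and the real merge logic may differ.

```python
from dataclasses import dataclass


@dataclass
class Utilization:
    """Hypothetical stand-in for ResourceUtilization (not the library class)."""
    task_name: str
    max_memory: float
    total_memory: float
    max_disk: float
    total_disk: float


def merge(x, y=None):
    """Return the field-wise maximum of two records for the same task."""
    if y is None:
        return x
    if x.task_name != y.task_name:
        raise ValueError("can only merge utilization records for the same task")
    return Utilization(
        task_name=x.task_name,
        max_memory=max(x.max_memory, y.max_memory),
        total_memory=max(x.total_memory, y.total_memory),
        max_disk=max(x.max_disk, y.max_disk),
        total_disk=max(x.total_disk, y.total_disk),
    )


# Two made-up records for the same task, e.g. from two shards.
a = Utilization("align", 3.5, 8.0, 10.0, 50.0)
b = Utilization("align", 4.0, 8.0, 7.5, 50.0)
merged = merge(a, b)
print(merged)
```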
-
classmethod
-
class
cromwell_manager.io_util.
GSObject
(gs_filestring, client=None)[source]¶ -
-
download_to_bytes_readable
()[source]¶ Return a bytes file-like object readable by requests and REST APIs
Return BufferedIOBase: readable file object
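For readers unfamiliar with the term, a "bytes file-like object" can be as simple as an in-memory buffer. The sketch below wraps already-downloaded bytes in `io.BytesIO`; the download step itself is left out, and `to_bytes_readable` is a hypothetical helper rather than the library's implementation.

```python
import io


def to_bytes_readable(data: bytes) -> io.BufferedIOBase:
    """Wrap raw bytes in an in-memory binary file positioned at the start."""
    buffer = io.BytesIO(data)
    buffer.seek(0)  # make sure readers start from the first byte
    return buffer


readable = to_bytes_readable(b"workflow hello {}")
print(readable.read())
```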
-
-
class
cromwell_manager.io_util.
HTTPObject
(url)[source]¶
-
cromwell_manager.io_util.
open_gs_console
(link, project)[source]¶ Open the Google Storage console to view the contents of link.
Parameters: - link (str) – gs file or directory
- project (str) – project owner of link
-
cromwell_manager.io_util.
package_workflow_dependencies
(**dependencies)[source]¶ Download wdls, zip, and return a bytes-readable output
Parameters: dependencies – dict of dependency (name, path) pairs to be included in the archive
- name should be the expected name for the imported dependency
- path should give the object’s location; Google Storage, HTTPS, and local paths are supported
Return File: file object with binary data written.
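The zip-and-return step can be sketched with the standard library alone. This simplified version takes file contents directly instead of paths, so the download from Google Storage, HTTPS, or local disk is omitted. The function name `package_dependencies` and the example file names are hypothetical.

```python
import io
import zipfile


def package_dependencies(**dependencies):
    """Zip (name, content) pairs into an in-memory archive.

    Unlike the documented function, values here are file contents (bytes),
    not paths, so nothing is downloaded.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in dependencies.items():
            archive.writestr(name, content)
    buffer.seek(0)  # rewind so the caller can read the archive from the start
    return buffer


# Names containing dots cannot be written as keyword arguments, so unpack a dict.
packaged = package_dependencies(**{"tasks.wdl": b"task hello {}", "utils.wdl": b"task bye {}"})
with zipfile.ZipFile(packaged) as zf:
    print(sorted(zf.namelist()))
```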