Welcome to Cromwell Manager’s documentation!

Modules

class cromwell_manager.cromwell.Cromwell(cromwell_url, username=None, password=None, api_version='v1')[source]

Wrapper for the Cromwell REST API

abort_workflow(workflow_id, *args, **kwargs)[source]

Abort a workflow.

Parameters:
  • workflow_id (str) – hash for workflow to abort
  • verbose (bool) – if True, print the query, response code, and content (default False)
  • args – additional positional args to pass to requests.post
  • kwargs – additional keyword args to pass to requests.post
Return requests.Response:
 

requests response object

backends(*args, **kwargs)[source]

Retrieve backends for this cromwell instance.

Parameters:
  • verbose (bool) – if True, print the query, response code, and content (default False)
  • open_browser (bool) – if True, display the GET result in browser (default False)
  • args – additional positional args to pass to requests.get
  • kwargs – additional keyword args to pass to requests.get
Return requests.Response:
 

requests response object

cromwell_url

URL for the cromwell REST endpoints.
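
For orientation, the sketch below shows how a base cromwell_url and api_version could map onto Cromwell's REST routes. The helpers workflow_endpoint and engine_endpoint and the example host are hypothetical and not part of cromwell_manager. The route layout follows Cromwell's public REST API; how this wrapper assembles its URLs internally is not documented here.

```python
def workflow_endpoint(cromwell_url, api_version="v1", workflow_id=None, action=None):
    """Build a workflow route such as <url>/api/workflows/v1/<id>/status (hypothetical helper)."""
    parts = [cromwell_url.rstrip("/"), "api", "workflows", api_version]
    if workflow_id is not None:
        parts.append(workflow_id)
    if action is not None:
        parts.append(action)
    return "/".join(parts)


def engine_endpoint(cromwell_url, action, api_version="v1"):
    """Build an engine route such as <url>/engine/v1/version (hypothetical helper)."""
    return "/".join([cromwell_url.rstrip("/"), "engine", api_version, action])


base = "https://cromwell.example.org"  # placeholder host, not a real server
status_url = workflow_endpoint(base, workflow_id="abc-123", action="status")
version_url = engine_endpoint(base, "version")
print(status_url)
print(version_url)
```

Methods such as status, logs, outputs, and metadata would each correspond to one workflow route of this shape, while stats and version sit under the engine routes.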

get(url, verbose=False, open_browser=False, *args, **kwargs)[source]

Make a REST GET query to url.

Parameters:
  • url (str) – GET query url
  • verbose (bool) – if True, print the query, response code, and content (default False)
  • open_browser (bool) – if True, display the GET result in browser (default False)
  • args – additional positional args to pass to requests.get
  • kwargs – additional keyword args to pass to requests.get
Return requests.Response:
 

requests response object

logs(workflow_id, *args, **kwargs)[source]

Retrieve logs for workflow_id.

Parameters:
  • workflow_id (str) – hash of the workflow to retrieve logs for
  • verbose (bool) – if True, print the query, response code, and content (default False)
  • open_browser (bool) – if True, display the GET result in browser (default False)
  • args – additional positional args to pass to requests.get
  • kwargs – additional keyword args to pass to requests.get
Return requests.Response:
 

requests response object

metadata(workflow_id, *args, **kwargs)[source]

Retrieve metadata for workflow_id.

Parameters:
  • workflow_id (str) – hash of the workflow to retrieve metadata for
  • verbose (bool) – if True, print the query, response code, and content (default False)
  • open_browser (bool) – if True, display the GET result in browser (default False)
  • args – additional positional args to pass to requests.get
  • kwargs – additional keyword args to pass to requests.get
Return requests.Response:
 

requests response object

outputs(workflow_id, *args, **kwargs)[source]

Retrieve outputs for workflow_id.

Parameters:
  • workflow_id (str) – hash of the workflow to retrieve outputs for
  • verbose (bool) – if True, print the query, response code, and content (default False)
  • open_browser (bool) – if True, display the GET result in browser (default False)
  • args – additional positional args to pass to requests.get
  • kwargs – additional keyword args to pass to requests.get
Return requests.Response:
 

requests response object

post(url, verbose=False, *args, **kwargs)[source]

Make a REST POST query to url.

Parameters:
  • url (str) – POST query url
  • verbose (bool) – if True, print the query, response code, and content (default False)
  • args – additional positional args to pass to requests.post
  • kwargs – additional keyword args to pass to requests.post
Return requests.Response:
 

requests response object

static print_failure(response, message='')[source]

Print information on a failing request to console.

Parameters:
  • response (requests.Response) – response from request operation
  • message (str) – (optional) message to append to failure report
static print_request(request_type, request_string, response)[source]

Print a request to console.

Triggered by the verbose=True flag on cromwell or workspace functions and properties.

Parameters:
  • request_type (str) – {GET, POST} type of REST operation
  • request_string (str) – full request url
  • response (requests.Response) – response from request operation
query(start=None, end=None, names=None, ids=None, status=None, labels=None, *args, **kwargs)[source]

Query cromwell for workflows matching specified metadata information.

Parameters:
  • start (str) – datetime string in format #todo
  • end (str) – datetime string in format #todo
  • names (list) – list of one or more workflow name(s)
  • ids (list) – list of one or more workflow id(s)
  • status (list) – list of one or more workflow status(es). Must be a valid status: {Submitted, Running, Aborting, Failed, Succeeded, Aborted}
  • labels (dict) – dictionary of custom label:value pairs
  • verbose (bool) – if True, print the query, response code, and content (default False)
  • open_browser (bool) – if True, display the GET result in browser (default False)
Return requests.Response:
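
As an illustration of how list- and dict-valued filters can be flattened into a GET query string, here is a minimal sketch. The helper build_query_string is hypothetical. The parameter names name, id, status, and label mirror Cromwell's query endpoint, but whether this wrapper translates its arguments exactly this way is an assumption. start and end are left out because their datetime format is not specified above.

```python
from urllib.parse import urlencode

VALID_STATUSES = {"Submitted", "Running", "Aborting", "Failed", "Succeeded", "Aborted"}


def build_query_string(names=None, ids=None, status=None, labels=None):
    """Flatten filters into repeated GET parameters (hypothetical helper)."""
    params = []
    for name in names or []:
        params.append(("name", name))
    for workflow_id in ids or []:
        params.append(("id", workflow_id))
    for state in status or []:
        if state not in VALID_STATUSES:
            raise ValueError(f"invalid status: {state!r}")
        params.append(("status", state))
    for key, value in (labels or {}).items():
        # assumption: each label is sent as a single "key:value" string
        params.append(("label", f"{key}:{value}"))
    return urlencode(params)


query_string = build_query_string(status=["Failed", "Aborted"], labels={"project": "demo"})
print(query_string)
```

Rejecting an unknown status before the request is sent gives a clearer error than a failed HTTP call would.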
 
server_is_running(*args, **kwargs)[source]

Return True if the server is running, else False.

Parameters:
  • verbose (bool) – if True, print the query, response code, and content (default False)
  • open_browser (bool) – if True, display the GET result in browser (default False)
stats(*args, **kwargs)[source]

Retrieve cromwell statistics on the number of running jobs.

Parameters:
  • verbose (bool) – if True, print the query, response code, and content (default False)
  • open_browser (bool) – if True, display the GET result in browser (default False)
  • args – additional positional args to pass to requests.get
  • kwargs – additional keyword args to pass to requests.get
Return requests.Response:
 

requests response object

status(workflow_id, *args, **kwargs)[source]

Retrieve status for workflow_id.

Parameters:
  • workflow_id (str) – hash of the workflow to retrieve status for
  • verbose (bool) – if True, print the query, response code, and content (default False)
  • open_browser (bool) – if True, display the GET result in browser (default False)
  • args – additional positional args to pass to requests.get
  • kwargs – additional keyword args to pass to requests.get
Return requests.Response:
 

requests response object

submit(files, wait=True, timeout=15, delay=3, verbose=False, *args, **kwargs)[source]

Submit a new workflow.

Parameters:
  • files (dict) – dictionary of files from workflow._submission_json
  • wait (bool) – if True, wait until Cromwell recognizes the workflow as submitted
  • timeout (int) – maximum time to wait
  • delay (int) – time between status queries
  • verbose (bool) – if True, print request results
  • args – additional positional args to pass to requests.post
  • kwargs – additional keyword args to pass to requests.post
Return requests.Response:
 

requests response object

swagger()[source]

Open the swagger page for this cromwell server.

timing(workflow_id)[source]

Open timing in browser window for workflow_id.

Parameters: workflow_id (str) – run id to open timing for
version(*args, **kwargs)[source]

Retrieve the cromwell version.

Parameters:
  • verbose (bool) – if True, print the query, response code, and content (default False)
  • open_browser (bool) – if True, display the GET result in browser (default False)
  • args – additional positional args to pass to requests.get
  • kwargs – additional keyword args to pass to requests.get
Return requests.Response:
 

requests response object

wait_for_status(status, workflow_id, verbose=False, timeout=15, delay=3)[source]

Wait until a workflow reaches any one of the given statuses.

Parameters:
  • status (Iterable) – Iterable of one or more statuses to wait for
  • workflow_id (str) – identifier hash code for a workflow
  • verbose (bool) – if True, print the requests made
  • timeout (int) – maximum time to wait
  • delay (int) – time between status queries
Return requests.Response:
 

response object generated when workflow_id achieves the first valid status
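
The polling pattern described above can be sketched without a server. The function wait_for_any_status below is a hypothetical stand-in, not the library's implementation. It assumes timeout and delay are in seconds and that running out of time raises TimeoutError; neither detail is stated in this documentation.

```python
import time


def wait_for_any_status(get_status, targets, timeout=15, delay=3):
    """Poll get_status() until it returns a value in targets or timeout seconds pass."""
    targets = set(targets)
    deadline = time.monotonic() + timeout
    while True:
        current = get_status()
        if current in targets:
            return current
        # stop if another sleep would carry us past the deadline
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"status not in {sorted(targets)} after {timeout}s")
        time.sleep(delay)


# stand-in for repeated status queries against a server
replies = iter(["Submitted", "Running", "Succeeded"])
final_status = wait_for_any_status(
    lambda: next(replies), {"Succeeded", "Failed"}, timeout=1, delay=0.01
)
print(final_status)
```

Accepting a set of target statuses lets one call return on either success or failure, rather than waiting out the full timeout when a workflow fails early.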

class cromwell_manager.workflow.WorkflowBase(workflow_id, cromwell_server, storage_client=None)[source]
cromwell_server

Authenticated, currently-running Cromwell server.

inputs

Workflow inputs.

logs

Workflow logs.

metadata

Workflow metadata.

outputs

Workflow outputs.

refresh_tasks()[source]

Update tasks in self.tasks.

root

Root directory for workflow outputs.

save_resource_utilization(filename, retrieve=True)[source]

Save resource utilizations for each task to file.

Parameters:
  • filename (str | BufferedIOBase) – filename or open file object in which to save resource utilization
  • retrieve (bool) – if True, get the current metadata from Cromwell, otherwise retrieve stored metadata (default True)
  • workflow_id (str) – identifier hash code for a workflow
  • verbose (bool) – if True, print the requests made
  • timeout (int) – maximum time to wait
  • delay (int) – time between status queries
Return requests.Response:
 

status response from Cromwell

status

Status of workflow.

storage_client

Authenticated google storage client.

tasks

Get the workflow task summaries.

Return dict: Cromwell metadata for workflow
timing()[source]

Open timing for this task in browser window.

class cromwell_manager.workflow.Workflow(workflow_id, cromwell_server, storage_client=None)[source]

Object to define an instance of a top-level workflow run on Cromwell.

abort(*args, **kwargs)[source]

Abort this workflow.

Parameters:
  • args – positional arguments to pass to Cromwell.abort_workflow
  • kwargs – keyword arguments to pass to Cromwell.abort_workflow
Return requests.Response:
 

abort response

classmethod from_submission(wdl, inputs_json, cromwell_server, storage_client, options_json=None, workflow_dependencies=None, custom_labels=None, *args, **kwargs)[source]

Submit a new workflow, returning a Workflow object.

Parameters:
  • wdl (str) – wdl that defines this workflow
  • inputs_json (str) – inputs to this wdl
  • cromwell_server (Cromwell) – an authenticated cromwell server
  • storage_client (storage.Client) – authenticated google storage client
  • workflow_dependencies (str | dict) –
  • custom_labels (dict) –
  • options_json (str) – options file for the workflow
  • wait (bool) – if True, wait until Cromwell recognizes the workflow as submitted (default: True)
  • timeout (int) – maximum time to wait
  • delay (int) – time between status queries
  • verbose (bool) – if True, print request results
  • args – additional positional args to pass to requests.post
  • kwargs – additional keyword args to pass to requests.post
Return dict:

Cromwell submission result

classmethod validate(wdl, inputs_json, storage_client, options_json=None, workflow_dependencies=None, custom_labels=None, *args, **kwargs)[source]

Validate a workflow, catching errors before submission.

If using positional arguments, the same argument set used for submission can be passed to validate.

Parameters:
  • wdl (str) – wdl that defines this workflow
  • inputs_json (str) – inputs to this wdl
  • storage_client (storage.Client) – authenticated google storage client
  • workflow_dependencies (str | dict) –
  • custom_labels (dict) –
  • options_json (str) – options file for the workflow
  • args – argument sink for arguments to from_submission that are not used.
  • kwargs – argument sink for arguments to from_submission that are not used.
wait_until_complete(*args, **kwargs)[source]

Wait until the workflow completes running.

Optional arguments:
  • workflow_id (str) – identifier hash code for a workflow
  • verbose (bool) – if True, print the requests made
  • timeout (int) – maximum time to wait
  • delay (int) – time between status queries

Return requests.Response:
 status response from Cromwell
class cromwell_manager.workflow.SubWorkflow(workflow_id, cromwell_server, storage_client=None)[source]

A workflow without custom constructors

class cromwell_manager.calledtask.CalledTask(name, shard_metadata, client)[source]

Object to define an instance of a called workflow task.

class cromwell_manager.calledtask.Shard(metadata, client)[source]

At the moment, Shard is a simple named-dictionary class containing shard information.

class cromwell_manager.resource_utilization.ResourceUtilization(task_name, max_memory, total_memory, max_disk, total_disk, robust)[source]

Class to store resource utilization information for a task run on Cromwell.

classmethod from_file(task_name, open_log_file_object)[source]

Create a ResourceUtilization object from a monitoring log file.

Parameters:
  • task_name (str) – Name of this task
  • open_log_file_object (file) – an open monitoring log from cromwell
Return ResourceUtilization:
 

memory and disk utilization for this task

static merge(x, y=None)[source]

Merge two ResourceUtilization objects for the same task, returning the maximum utilization.

Parameters:
Return ResourceUtilization:
 

maximum resource utilization
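
A field-wise maximum merge could look like the sketch below. Utilization is a hypothetical stand-in for ResourceUtilization, with field names taken from the constructor signature above (the robust argument is omitted because its meaning is not documented). Returning x unchanged when y is None, and rejecting records from different tasks, are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Utilization:
    """Stand-in record; not the library's ResourceUtilization class."""
    task_name: str
    max_memory: float
    total_memory: float
    max_disk: float
    total_disk: float


def merge_max(x, y=None):
    """Return the field-wise maximum of two records for the same task."""
    if y is None:
        return x  # assumption: nothing to merge with
    if x.task_name != y.task_name:
        raise ValueError("can only merge utilization records for the same task")
    return Utilization(
        task_name=x.task_name,
        max_memory=max(x.max_memory, y.max_memory),
        total_memory=max(x.total_memory, y.total_memory),
        max_disk=max(x.max_disk, y.max_disk),
        total_disk=max(x.total_disk, y.total_disk),
    )


shard_a = Utilization("align", max_memory=3.5, total_memory=8.0, max_disk=40.0, total_disk=100.0)
shard_b = Utilization("align", max_memory=6.1, total_memory=8.0, max_disk=25.0, total_disk=100.0)
merged = merge_max(shard_a, shard_b)
print(merged)
```

Taking the maximum across shards gives the peak requirement a single task definition must satisfy.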

class cromwell_manager.io_util.GSObject(gs_filestring, client=None)[source]
download_as_string()[source]

Download data as a string

Return str: downloaded blob data
download_to_bytes_readable()[source]

Return a bytes file-like object readable by requests and REST APIs

Return BufferedIOBase:
 readable file object
download_to_file(file_object)[source]

Download data to file

Parameters: file_object (io.BufferedIOBase) – open bytes-writable file object
static split_path(path)[source]

Utility to split a google storage path into bucket + key.

Parameters: path (str) – google storage path (must have gs:// prefix)
Return str: bucket
Return str: blob
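
Splitting a gs:// path into bucket and blob needs only string operations, as in this minimal sketch. The function split_gs_path is a hypothetical illustration, not the library's split_path. Raising ValueError when the prefix is missing is an assumption about how the "must have gs:// prefix" requirement might be enforced.

```python
def split_gs_path(path):
    """Split 'gs://bucket/some/key' into ('bucket', 'some/key')."""
    prefix = "gs://"
    if not path.startswith(prefix):
        raise ValueError(f"expected a path starting with {prefix!r}, got {path!r}")
    # everything up to the first slash is the bucket; the rest is the blob key
    bucket, _, blob = path[len(prefix):].partition("/")
    return bucket, blob


bucket, blob = split_gs_path("gs://my-bucket/results/run1/output.txt")
print(bucket, blob)
```
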
class cromwell_manager.io_util.HTTPObject(url)[source]
download_as_string()[source]

Download data as a string

Return str: downloaded url data
download_to_bytes_readable()[source]

Return a bytes file-like object readable by requests and REST APIs

Return BufferedIOBase:
 readable file object
download_to_file(file_object)[source]

Download data to file

Parameters: file_object (io.BufferedIOBase) – open bytes-writable file object
cromwell_manager.io_util.open_gs_console(link, project)[source]

Open the Google Storage console to view the contents of link.

Parameters:
  • link (str) – gs file or directory
  • project (str) – project owner of link
cromwell_manager.io_util.package_workflow_dependencies(**dependencies)[source]

Download WDLs, zip them, and return a bytes-readable output.

Parameters:
  • dependencies – dict of dependency (name, path) pairs to be included in the archive
      - name should be the expected name for the imported dependency
      - path should give the object’s location; google storage, https, and local paths are supported
Return File: file object with binary data written.
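
The zip-and-return step can be sketched with the standard library alone. The function zip_to_bytes below is hypothetical and takes file contents directly, leaving out the download from google storage, https, or local paths that the real function performs. Returning an io.BytesIO rewound to the start is an assumption about what "bytes-readable output" means here.

```python
import io
import zipfile


def zip_to_bytes(**contents):
    """Write name -> bytes pairs into an in-memory zip archive and return it rewound."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w") as archive:
        for name, data in contents.items():
            archive.writestr(name, data)
    buffer.seek(0)  # rewind so the caller can read from the beginning
    return buffer


# names containing dots are not valid keyword literals, so pass them via ** unpacking
archive_file = zip_to_bytes(**{"tasks.wdl": b"task hello {}", "utils.wdl": b"task bye {}"})
names = sorted(zipfile.ZipFile(archive_file).namelist())
print(names)
```

Because the signature is **dependencies, any dependency whose expected import name contains a dot has to be supplied through dictionary unpacking, as in the call above.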