
Run-data-managers is a tool for provisioning data on a Galaxy instance.

Run-data-managers can reload the data tables after a data manager has finished, so it can run multiple interdependent data managers in sequence. For example, when a reference genome is needed for bwa-mem, run-data-managers can first run a data manager to fetch the FASTA file, reload the data table, and then run another data manager that indexes the FASTA file for bwa-mem.

Run-data-managers needs a YAML file that specifies which data managers are run and with which settings. An example file can be found here.

By default, run-data-managers skips entries in the YAML file that have already been run. It checks this in the following way:

  • If the data manager has input variables “name” or “sequence_name”, it will check if the “name” column in the data table already has this entry. “name” will take precedence over “sequence_name”.
  • If the data manager has input variables “value”, “sequence_id” or “dbkey”, it will check if the “value” column in the data table already has this entry. “value” takes precedence over “sequence_id”, which takes precedence over “dbkey”.
  • If none of the above input variables are specified, the data manager will always run.
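
The precedence rules above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the tool's actual implementation: the function names and the representation of a data table as a list of dictionaries are hypothetical, and the sketch assumes that when both a name-like and a value-like input are present, both must already be in the table for the entry to be skipped.

```python
# Illustrative sketch only; names and data layout are assumptions.
NAME_KEYS = ("name", "sequence_name")            # "name" wins over "sequence_name"
VALUE_KEYS = ("value", "sequence_id", "dbkey")   # "value" > "sequence_id" > "dbkey"


def first_present(params, keys):
    """Return the value of the first key from `keys` found in `params`, else None."""
    for key in keys:
        if key in params:
            return params[key]
    return None


def should_skip(params, table_rows):
    """Return True if the entry described by `params` already appears in `table_rows`."""
    name = first_present(params, NAME_KEYS)
    value = first_present(params, VALUE_KEYS)
    # No recognised input variable: the data manager always runs.
    if name is None and value is None:
        return False
    # Assumption: every input that is present must be found in its column.
    if name is not None and not any(row.get("name") == name for row in table_rows):
        return False
    if value is not None and not any(row.get("value") == value for row in table_rows):
        return False
    return True


rows = [{"value": "hg38", "name": "Human (hg38)"}]
print(should_skip({"dbkey": "hg38"}, rows))                   # entry exists
print(should_skip({"value": "mm10", "dbkey": "hg38"}, rows))  # "value" takes precedence
print(should_skip({}, rows))                                  # nothing to check
```

Under these assumptions the three calls report that the first entry is skipped and the other two are run.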


Running Galaxy data managers in a defined order with defined parameters.

usage: run-data-managers [-h] [-v] [-g GALAXY] [-u USER] [-p PASSWORD]
                         [-a API_KEY] --config CONFIG [--overwrite]
Optional Arguments
--config Path to the YAML config file with the list of data managers and data to install.
--overwrite Disables checking whether the item already exists in the tool data table.
Galaxy connection
-v, --verbose Increase output verbosity. Default: False
-g, --galaxy Target Galaxy instance URL/IP address. Default: "http://localhost:8080"
-u, --user Galaxy user name
-p, --password Password for the Galaxy user
-a, --api_key Galaxy admin user API key (required if not defined in the tools list file)
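
When run-data-managers is called from a script, the argument list can be assembled programmatically. The following is a minimal sketch that uses only the flags listed above; the helper name `build_command`, the file name `data_managers.yaml` and the placeholder API key are hypothetical.

```python
import shlex


def build_command(config, galaxy="http://localhost:8080", api_key=None,
                  user=None, password=None, overwrite=False, verbose=False):
    """Assemble an argument list for run-data-managers from the documented flags."""
    cmd = ["run-data-managers", "--config", config, "-g", galaxy]
    if api_key:
        cmd += ["-a", api_key]
    if user:
        cmd += ["-u", user]
    if password:
        cmd += ["-p", password]
    if overwrite:
        # Disables the check for entries that already exist in the data table.
        cmd.append("--overwrite")
    if verbose:
        cmd.append("-v")
    return cmd


# Hypothetical config file name and placeholder API key.
cmd = build_command("data_managers.yaml", api_key="YOUR_API_KEY", overwrite=True)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where run-data-managers is installed and the Galaxy instance is reachable.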