Commit 944bea29 authored by Izzard, Robert Dr (Maths & Physics)

slurm grids are now working

beware though: there are many debugging options still built in, and the gridcode should (but doesn't) have a unique filename
parent e0fc34da
......@@ -4,7 +4,7 @@ Docstring coverage:
Test coverage:
![test coverage](./badges/test_coverage.svg)
Binary population synthesis code that interfaces with binary_c. Based on a original work by Jeff Andrews (can be found in old_solution/ directory). Updated and extended for Python3 by David Hendriks, Robert Izzard.
Binary population synthesis code that interfaces with binary_c. Based on a original work by Jeff Andrews. Updated and extended for Python3 by David Hendriks, Robert Izzard.
The current release is version [version](VERSION), make sure to use that version number when installing!
......
This diff is collapsed.
......@@ -33,6 +33,7 @@ grid_options_defaults_dict = {
"parse_function": None, # Function to parse the output with.
"multiplicity_fraction_function": 0, # Which multiplicity fraction function to use. 0: None, 1: Arenou 2010, 2: Rhagavan 2010, 3: Moe and di Stefano 2017
"tmp_dir": temp_dir(), # Setting the temp dir of the program
"status_dir" : None, #
"_main_pid": -1, # Placeholder for the main process id of the run.
"save_ensemble_chunks": True, # Force the ensemble chunk to be saved even if we are joining a thread (just in case the joining fails)
"combine_ensemble_with_thread_joining": True, # Flag on whether to combine everything and return it to the user or if false: write it to data_dir/ensemble_output_{population_id}_{thread_id}.json
......@@ -128,6 +129,8 @@ grid_options_defaults_dict = {
"_grid_variables": {}, # grid variables
"gridcode_filename": None, # filename of gridcode
"symlink latest gridcode": True, # symlink to latest gridcode
"save_population_object" : None, # filename to which we should save a pickled grid object as the final thing we do
'joinlist' : None,
## Monte carlo type evolution
# TODO: make MC options
## Evolution from source file
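Note on `save_population_object`: the option is described as a filename to which a pickled grid object is written as the final step. A minimal sketch of that idea follows, using only the standard-library `pickle` module. The helper names and the plain dict standing in for the grid object are assumptions made for illustration; they are not taken from the repository, which may well use a different mechanism (for example a compressed pickle).

```python
import os
import pickle
import tempfile


def save_population_object(state, filename):
    """Hypothetical helper: pickle `state` to `filename`, skipping if no filename is set."""
    if filename is None:
        # Mirrors the default of None: nothing is saved unless a filename is given.
        return False
    with open(filename, "wb") as handle:
        pickle.dump(state, handle, protocol=pickle.HIGHEST_PROTOCOL)
    return True


def load_population_object(filename):
    """Hypothetical helper: restore a previously pickled state."""
    with open(filename, "rb") as handle:
        return pickle.load(handle)


# A plain dict stands in for the real grid object (an assumption for this sketch).
state = {"grid_options": {"save_population_object": None, "slurm_njobs": 4}}
target = os.path.join(tempfile.mkdtemp(), "population.pkl")

saved = save_population_object(state, target)
restored = load_population_object(target)
```

Only load pickle files from sources you trust, since unpickling can execute arbitrary code.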
......@@ -164,11 +167,11 @@ grid_options_defaults_dict = {
"slurm_dir": "", # working directory containing scripts output logs etc.
"slurm_njobs": 0, # number of scripts; set to 0 as default
"slurm_jobid": "", # slurm job id (%A)
"slurm_memory": 512, # in MB, the memory use of the job
"slurm_warn_max_memory": 1024, # in MB : warn if mem req. > this
"slurm_memory": '512MB', # memory required for the job
"slurm_warn_max_memory": '1024MB', # warn if we set it to more than this (usually by accident)
"slurm_use_all_node_CPUs": 0, # 1 = use all of a node's CPUs. 0 = use a given number of CPUs
"slurm_postpone_join": 0, # if 1 do not join on slurm, join elsewhere. want to do it off the slurm grid (e.g. with more RAM)
"slurm_jobarrayindex": "", # slurm job array index (%a)
"slurm_jobarrayindex": None, # slurm job array index (%a)
"slurm_jobname": "binary_grid", # default
"slurm_partition": None,
"slurm_time": 0, # total time. 0 = infinite time
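Note on the memory options: this hunk changes `slurm_memory` and `slurm_warn_max_memory` from integers in MB to strings with a unit suffix, so the two values can no longer be compared directly as numbers. The sketch below shows one way such strings could be compared. The helpers `parse_memory` and `exceeds_warning` are hypothetical and not part of the repository, and the choice of decimal units (1 MB = 1000**2 bytes) is an assumption.

```python
import re

# Assumed decimal multipliers; the real code may interpret units differently.
_UNITS = {"B": 1, "KB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}


def parse_memory(size):
    """Hypothetical helper: convert a string such as '512MB' to a number of bytes."""
    match = re.fullmatch(
        r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)\s*", str(size), re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"cannot parse memory string: {size!r}")
    value, unit = match.groups()
    return int(float(value) * _UNITS[unit.upper()])


def exceeds_warning(requested, warn_max):
    """Return True if the requested memory is larger than the warning threshold."""
    return parse_memory(requested) > parse_memory(warn_max)


# Values taken from the new defaults in this hunk.
options = {"slurm_memory": "512MB", "slurm_warn_max_memory": "1024MB"}
if exceeds_warning(options["slurm_memory"], options["slurm_warn_max_memory"]):
    print("warning: slurm_memory exceeds slurm_warn_max_memory")
```

With the defaults above no warning is printed; a request such as `'2GB'` would trigger it.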
......@@ -197,24 +200,24 @@ grid_options_defaults_dict = {
"condor_universe": "vanilla", # usually vanilla universe
"condor_extra_settings": {}, # Place to put extra configuration for the CONDOR submit file. The key and value of the dict will become the key and value of the line in te slurm batch file. Will be put in after all the other settings (and before the command). Take care not to overwrite something without really meaning to do so.
# snapshots and checkpoints
condor_snapshot_on_kill:0, # if 1 snapshot on SIGKILL before exit
condor_load_from_snapshot:0, # if 1 check for snapshot .sv file and load it if found
condor_checkpoint_interval:0, # checkpoint interval (seconds)
condor_checkpoint_stamp_times:0, # if 1 then files are given timestamped names
'condor_snapshot_on_kill':0, # if 1 snapshot on SIGKILL before exit
'condor_load_from_snapshot':0, # if 1 check for snapshot .sv file and load it if found
'condor_checkpoint_interval':0, # checkpoint interval (seconds)
'condor_checkpoint_stamp_times':0, # if 1 then files are given timestamped names
# (warning: lots of files!), otherwise just store the lates
condor_streams:0, # stream stderr/stdout by default (warning: might cause heavy network load)
condor_save_joined_file:0, # if 1 then results/joined contains the results
'condor_streams':0, # stream stderr/stdout by default (warning: might cause heavy network load)
'condor_save_joined_file':0, # if 1 then results/joined contains the results
# (useful for debugging, otherwise a lot of work)
condor_requirements:'', # used?
'condor_requirements':'', # used?
# # resubmit options : if the status of a condor script is
# # either 'finished','submitted','running' or 'crashed',
# # decide whether to resubmit it.
# # NB Normally the status is empty, e.g. on the first run.
# # These are for restarting runs.
# condor_resubmit_finished:0,
condor_resubmit_submitted:0,
condor_resubmit_running:0,
condor_resubmit_crashed:0,
'condor_resubmit_submitted':0,
'condor_resubmit_running':0,
'condor_resubmit_crashed':0,
##########################
# Unordered. Need to go through this. Copied from the perl implementation.
##########################
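Note on the quoted `condor_*` keys: in a Python dict literal, a bare name such as `condor_streams` is evaluated as a variable, not treated as a string key, so an entry like `condor_streams:0` raises `NameError` unless a variable of that name exists. The sketch below illustrates this general behaviour; the two function names are hypothetical and the dicts are cut-down stand-ins for `grid_options_defaults_dict`.

```python
def build_with_bare_keys():
    # The bare name is looked up as a variable when the literal is evaluated.
    return {condor_streams: 0}  # noqa: F821 (intentionally undefined)


def build_with_string_keys():
    # Quoted keys are ordinary strings and need no variable to exist.
    return {"condor_streams": 0, "condor_save_joined_file": 0}


try:
    build_with_bare_keys()
    bare_key_error = None
except NameError as error:
    bare_key_error = error

string_keyed = build_with_string_keys()
```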
......@@ -271,7 +274,7 @@ grid_options_defaults_dict = {
# C_auto_logging : undef,
# custom_output_C_function_pointer : binary_c_function_bind(),
# # control flow
rungrid : True, # usually run the grid, but can be 0
'rungrid' : 1, # usually run the grid, but can be 0
# # to skip it (e.g. for condor/slurm runs)
# merge_datafiles:'',
# merge_datafiles_filelist:'',
......@@ -450,6 +453,7 @@ grid_options_defaults_dict = {
# Grid containing the descriptions of the options # TODO: add input types for all of them
grid_options_descriptions = {
"tmp_dir": "Directory where certain types of output are stored. The grid code is stored in that directory, as well as the custom logging libraries. Log files and other diagnostics will usually be written to this location, unless specified otherwise", # TODO: improve this
"status_dir" : "Directory where grid status is stored",
"_binary_c_dir": "Director where binary_c is stored. This options are not really used",
"_binary_c_config_executable": "Full path of the binary_c-config executable. This options is not used in the population object.",
"_binary_c_executable": "Full path to the binary_c executable. This options is not used in the population object.",
......@@ -489,7 +493,7 @@ grid_options_descriptions = {
"slurm": "Int flag whether to use a Slurm type population evolution.", # TODO: describe this in more detail
"weight": "Weight factor for each system. The calculated probability is multiplied by this. If the user wants each system to be repeated several times, then this variable should not be changed, rather change the _repeat variable instead, as that handles the reduction in probability per system. This is useful for systems that have a process with some random element in it.", # TODO: add more info here, regarding the evolution splitting.
"repeat": "Factor of how many times a system should be repeated. Consider the evolution splitting binary_c argument for supernovae kick repeating.", # TODO: make sure this is used.
"evolution_type": "Variable containing the type of evolution used of the grid. Multiprocessing or linear processing",
"evolution_type": "Variable containing the type of evolution used of the grid. Multiprocessing, linear processing or possibly something else (e.g. for Slurm or Condor).",
"combine_ensemble_with_thread_joining": "Boolean flag on whether to combine everything and return it to the user or if false: write it to data_dir/ensemble_output_{population_id}_{thread_id}.json",
"log_runtime_systems": "Whether to log the runtime of the systems . Each systems run by the thread is logged to a file and is stored in the tmp_dir. (1 file per thread). Don't use this if you are planning to run a lot of systems. This is mostly for debugging and finding systems that take long to run. Integer, default = 0. if value is 1 then the systems are logged",
"_total_mass_run": "To count the total mass that thread/process has ran",
......
......@@ -263,9 +263,12 @@ setup(
install_requires=[
"astropy",
"colorama",
"compress_pickle",
"datasize",
"h5py",
"halo",
"humanize",
"lib_programname",
"matplotlib",
"msgpack",
"numpy",
......