- Jun 17, 2022
-
-
dh00601 authored
-
- Dec 29, 2021
- Nov 30, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
-
- Nov 29, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
-
- Nov 28, 2021
-
-
Izzard authored
clean up open() functions to use self.auto() which does compression for us based on the file extension
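A minimal sketch of extension-based opening, for illustration only. The name `open_auto` and the set of supported extensions are assumptions; the commit only says that self.auto() chooses compression from the file extension.

```python
import bz2
import gzip


def open_auto(path, mode="rt", **kwargs):
    """Hypothetical stand-in for self.auto(): choose an opener by extension."""
    if path.endswith(".gz"):
        # gzip.open accepts text modes such as "rt" and "wt"
        return gzip.open(path, mode, **kwargs)
    if path.endswith(".bz2"):
        return bz2.open(path, mode, **kwargs)
    # no recognised compression suffix: fall back to the builtin open()
    return open(path, mode, **kwargs)
```

Callers then write `with open_auto(filename, "wt", encoding="utf-8") as f:` and do not need to know whether the file is compressed.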
-
- Nov 27, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
-
Izzard, Robert Dr (Maths & Physics) authored
-
Izzard authored
-
- Nov 24, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
update _const_dt to include a wrapper just before it's cached, allowing output to the screen for debugging; other minor fixes
-
- Nov 22, 2021
-
-
Izzard authored
more attempts to clean up the code to work better on NFS: seems OK now, but expect more "bug" fixes to come (and more cleanup, as the code is a bit hacky)
-
- Nov 21, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
-
- Nov 20, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
few other cleanups
-
Izzard, Robert Dr (Maths & Physics) authored
add automatic caching of various functions, and a cache_test() routine to test the runtimes with and without caching
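A rough sketch of how such a timing comparison could look, assuming functools.lru_cache as the caching mechanism. The commit does not say which cache is used, and `slow_square` and `time_calls` are hypothetical names, not the project's cache_test().

```python
import functools
import time


def slow_square(x):
    """Hypothetical expensive function used only for the comparison."""
    time.sleep(0.001)  # stand-in for real work
    return x * x


# cached variant of the same function (assumption: an unbounded LRU cache)
cached_square = functools.lru_cache(maxsize=None)(slow_square)


def time_calls(func, args, repeats=5):
    """Return the wall-clock seconds taken to call func on every arg, repeats times."""
    start = time.perf_counter()
    for _ in range(repeats):
        for a in args:
            func(a)
    return time.perf_counter() - start
```

Comparing `time_calls(slow_square, range(10))` with `time_calls(cached_square, range(10))` gives the runtimes without and with caching.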
-
- Nov 19, 2021
-
-
Izzard authored
attempt to fix (finally?) the joiningfile logic
-
- Nov 18, 2021
-
-
Izzard authored
-
Izzard authored
-
Izzard authored
-
Izzard authored
the joiningfile is now made by whichever job gets to this task first, so the error where it wasn't fully made (when jobs were slowly run from the HPC queue) should not happen now unless it takes a long time to write that one short file. I guess we could put in a checkfile if required, but this should be *very* rare.
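One way to get "first job wins" behaviour is exclusive file creation. This is a sketch under that assumption, not necessarily how the code does it; `try_make_joiningfile` is a hypothetical name, and how reliable exclusive creation is on a network filesystem depends on the setup.

```python
import os


def try_make_joiningfile(path, lines):
    """Create path only if it does not exist yet.

    Returns True if this job created the file, False if another job got
    there first.
    """
    try:
        # O_CREAT | O_EXCL fails with FileExistsError if the file already exists
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return True
```

A job that gets False back can still find the file only partly written, which is the rare case a checkfile would cover.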
-
- Nov 16, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
-
Izzard, Robert Dr (Maths & Physics) authored
-
- Nov 15, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
-
Izzard, Robert Dr (Maths & Physics) authored
-
Izzard, Robert Dr (Maths & Physics) authored
-
Izzard, Robert Dr (Maths & Physics) authored
-
- Nov 14, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
fix some bugs in restarting from snapshots: the start_at was calculated incorrectly (needed to take into account modulo and previous start_at)
-
dh00601 authored
-
Izzard, Robert Dr (Maths & Physics) authored
-
Izzard, Robert Dr (Maths & Physics) authored
all other files access the HPC API, not the slurm and condor objects directly
-
- Nov 13, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
-
- Nov 12, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
set up json functions to not convert ascii (should be faster and preserve UTF8)
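For reference, this is the standard-library behaviour involved: json.dumps escapes non-ASCII characters by default, and ensure_ascii=False leaves them as they are. The sample dictionary is made up for illustration.

```python
import json

# made-up sample data containing non-ASCII characters
data = {"name": "Δt", "unit": "M☉"}

# default: non-ASCII characters are written as \uXXXX escapes
escaped = json.dumps(data)

# ensure_ascii=False: characters are kept as-is, so the output is shorter
raw = json.dumps(data, ensure_ascii=False)
```

When writing `raw` to disk, the file should be opened with encoding="utf-8" so the characters survive the round trip.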
-
- Nov 10, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
Slurm restarts work: just take your old (unfinished) Slurm dir and set it with slurm_restart_dir=<whatever>
-
- Nov 09, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
doesn't work for slurm yet, but small steps; also fixed a bug in keys_to_floats()
-
Izzard, Robert Dr (Maths & Physics) authored
add CPU time to metadata output, and fix an issue with a tuple rather than a string from platform.something()
-
Izzard, Robert Dr (Maths & Physics) authored
-
Izzard, Robert Dr (Maths & Physics) authored
-
- Nov 08, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
-
- Nov 07, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
I've added a new grid_option, num_processes, which is the number of processes launched by Python's multiprocessing. num_cores is used to set this:
  if > 0 use the number specified (as previously, so backwards compatibility is fine)
  if == 0 use the number of logical cores
  if == -1 use the number of physical cores
Try running it with a command like:
---
rm -rf /tmp/slurm ; nice python3.9 ./src/python/ensemble.py dists=Moe binaries=False r=100 verbosity=1 max_evolution_time=10 slurm_dir=/tmp/slurm slurm_partition=debug slurm_memory=100MB monte_carlo_kicks=0 save_ensemble_chunks=False num_cores=-1 slurm=1 slurm_njobs=2 num_cores=2
---
You will want to change num_cores and slurm_njobs to suit. Each Slurm job gets num_processes cores allocated to it.
Note: you should set your slurm directory to be empty. This isn't really required, but makes debugging a lot easier. You also have to set the slurm_partition by hand - this is something you need to find out based on your cluster. In the above example I use "debug" because this is the default.
There are quite a few changes internally, particularly new functions to load, save and merge Population objects and their data (mostly) correctly, and updates to the dict merging functions that this required.
Please report bugs because there will be many!
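The num_cores rule described above can be sketched as follows. `resolve_num_processes` and its `logical`/`physical` arguments are hypothetical names used for illustration. The standard library only reports logical cores (os.cpu_count()), so the physical count is passed in here; psutil.cpu_count(logical=False) is one possible source, and the project may use something else.

```python
import os


def resolve_num_processes(num_cores, logical=None, physical=None):
    """Map the num_cores option onto a number of processes (illustrative only)."""
    if num_cores > 0:
        # explicit request: use it unchanged (the backwards-compatible case)
        return num_cores
    if num_cores == 0:
        # logical cores; os.cpu_count() may return None, so fall back to 1
        return logical if logical is not None else (os.cpu_count() or 1)
    if num_cores == -1:
        # physical cores have to be supplied by the caller in this sketch
        if physical is None:
            raise ValueError("physical core count not provided")
        return physical
    raise ValueError(f"unsupported num_cores value: {num_cores}")
```

For example, `resolve_num_processes(-1, physical=4)` returns 4, and `resolve_num_processes(2)` returns 2 on any machine.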
-
- Nov 06, 2021
-
-
Izzard, Robert Dr (Maths & Physics) authored
-
Izzard, Robert Dr (Maths & Physics) authored
beware though: there are many debugging options still built in, and the gridcode should (but doesn't) have a unique filename
-