Commit graph

40 commits

Author SHA1 Message Date
fe374560a1
I *hate* Tensorflow SO MUCH...... 2024-11-14 22:38:27 +00:00
7c4f3d325d
slurm/dlr: fix workaround logic 2024-11-14 22:28:39 +00:00
17d2d2bcaf
slurm/dlr: tensorflow is dumb
Workaround for this crash on Tensorflow 2.13:

Could not load library libcublasLt.so.12. Error: libcublasLt.so.12: cannot open shared object file: No such file or directory
2024-11-14 22:26:16 +00:00
159f8a4679
slurm/dlr: DIR_RAINFALLWATER default → ~/data/.... 2024-11-14 21:59:32 +00:00
a7ab5ee341
slurm/dlr: cpu cores 14→9 2024-11-14 21:49:54 +00:00
2b69d2c4f2
slurm/dlr: correct logging msg 2024-11-14 21:37:11 +00:00
bd2c6b1c3f
slurm/dlr: don't run module load .... on csgpu cluster 2024-11-14 21:34:04 +00:00
4ac7082754
write commit info to file in DIR_OUTPUT 2024-11-14 21:17:53 +00:00
8befef5fc1
slurm/dlr: logging
it Should™ work now?
TODO test this!
2024-11-14 19:55:59 +00:00
04ea305b70
dlr/slurm: implement USE_CONDA, module command opt-support 2024-11-08 21:47:55 +00:00
e5f6e6394f
Implement initial UNTESTED support for split_validation and split_test 2024-08-29 19:33:40 +01:00
0f9f185983
dlr: add PARALLEL_READS env var, update docs 2023-11-30 16:33:22 +00:00
7869505cfb
dlr: add PREDICT_AS_ONE 2023-06-16 18:23:40 +01:00
5a73388a80
dlr: add RANDSEED to slurm 2023-05-11 16:02:13 +01:00
8593999eb6
dlr: add JIT_COMPILE 2023-05-04 18:22:18 +01:00
dddc08c663
dlr: set steps_per_execution to 16 by default 2023-05-04 18:13:08 +01:00
e2e6a56b40
dlr: add UPSAMPLE env var
...AND actually add the functionality this time!
2023-05-04 17:40:16 +01:00
623208ba6d
dlr: add env var for water thresholding 2023-03-14 20:18:39 +00:00
c5fc62c411
dlr CHANGE: Add optional log(cosh(dice_loss))
Ref https://doi.org/10.1109/cibcb48159.2020.9277638
2023-03-10 20:24:13 +00:00
b5f23e76d1
dlr eo: allow setting DIR_OUTPUT directly 2023-03-01 16:54:15 +00:00
747ddfd41b
weird, XLA_FLAGS cuda data dir wasn't needed before
libdevice not found at ./libdevice.10.bc
2023-02-10 13:28:34 +00:00
64c57bbc21
dlr: add no-requeue
Ref https://support.hull.ac.uk/tas/public/ssp/content/detail/incident?unid=652db7ac6e73485c9f7658db78b2b628
2023-01-17 18:20:26 +00:00
835b376c72
slurm dlr: log exit code 2023-01-17 15:18:26 +00:00
40a550f155
slurm dlr: fixup 2023-01-16 18:45:08 +00:00
6ff2864d23
slurm dlr: shell out in conda; redirect stderr & stdout to disk inside the experiments folder
Also, if the job restarts, we still save the previous run's results because we append rather than overwrite
2023-01-16 17:32:22 +00:00
7b10f5c5fe
dlr: add learning_rate env var 2023-01-13 18:29:39 +00:00
be77f035c8
dlr: add cross-entropy + dice loss fn option 2023-01-13 17:58:00 +00:00
f7672db599
annoying 2023-01-13 17:00:47 +00:00
3c4d1c5140
dlr: Add support for stripping isolated water pixels
That is, water pixels that have no other water pixels immediately adjacent thereto (diagonals count).
2023-01-13 16:57:26 +00:00
f0dd9711ed
dlr: fixup 2023-01-12 18:55:33 +00:00
176dc022a0
add moar env vars 2023-01-12 18:54:39 +00:00
7be0509ac8
dlr: slurm PATH_CHECKPOINT 2023-01-11 17:27:26 +00:00
a69c809008
dlr: slurm, load checkpoint 2023-01-11 17:26:57 +00:00
2591cbe6bc
slurm dlr: quiet, pip 2023-01-10 18:12:35 +00:00
db0b010814
slurm dlr: log file names correct 2023-01-05 19:47:51 +00:00
e01ecfb615
slurm dlr: fix output dir 2023-01-05 19:42:42 +00:00
aa76d754c1
slurm dlr: fix pathing 2023-01-05 19:35:56 +00:00
0d4cc63b76
dl rainfall: fix env var name 2023-01-05 17:42:20 +00:00
11ccd4cbee
slurm deeplab rainfall: fix variable naming 2023-01-05 17:08:57 +00:00
677e39f820
work on slurm for deeplabv3+ rainfall, but it's NOT FINISHED YET 2022-12-16 19:52:44 +00:00
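
Note on commit 3c4d1c5140: its message defines an isolated water pixel as one with no other water pixel in any of the 8 surrounding cells (diagonals count). The sketch below shows one way that rule could be applied to a 2D binary mask. It is an illustration only, not the repository's code: the function name `strip_isolated_water`, the use of NumPy, and the convention that 1 means water are all assumptions.

```python
import numpy as np


def strip_isolated_water(mask: np.ndarray) -> np.ndarray:
    """Clear water pixels that have no water pixel among their 8 neighbours.

    Hypothetical helper: assumes a 2D mask where nonzero means water.
    """
    water = mask.astype(bool)
    height, width = water.shape
    # Pad with "not water" so border pixels have a full 3x3 neighbourhood.
    padded = np.pad(water, 1, mode="constant", constant_values=False)

    # Count water neighbours by summing the 8 shifted copies of the mask.
    neighbours = np.zeros((height, width), dtype=np.int32)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # skip the pixel itself
            neighbours += padded[1 + dy:1 + dy + height, 1 + dx:1 + dx + width]

    # Keep a water pixel only if at least one neighbour is also water.
    return (water & (neighbours > 0)).astype(mask.dtype)


if __name__ == "__main__":
    demo = np.array([
        [1, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 0, 0],
    ])
    print(strip_isolated_water(demo))
```

In the demo, the lone pixel in the top-left corner is cleared while the adjacent pair is kept. Two pixels that touch only diagonally also count as neighbours and are both kept.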