aa7d9b8cf6
fixup
2022-11-10 19:46:09 +00:00
0894bd09e8
train_predict: add error message for params.json not found
2022-11-10 19:45:41 +00:00
0353072d15
allow pretrain to run on gpu
...
we've slashed the size of the 2nd encoder, so it should fit naow?
2022-11-04 17:02:07 +00:00
44ad51f483
CallbackNBatchCsv: bugfix .sort() → sorted()
2022-11-04 16:40:21 +00:00
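For context, a minimal generic illustration of the bug class this commit fixes (not the project's actual callback code): list.sort() sorts in place and returns None, whereas sorted() returns a new list.

```python
keys = ["loss", "batch", "accuracy"]

broken = keys.sort()   # .sort() mutates keys in place and returns None
fixed  = sorted(keys)  # sorted() returns a new sorted list

print(broken)  # None
print(fixed)   # ['accuracy', 'batch', 'loss']
```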
4dddcfcb42
pretrain_predict: missing \n
2022-11-04 16:01:28 +00:00
1375201c5f
CallbackNBatchCsv: open_handle mode
2022-11-03 18:29:00 +00:00
3206d6b7e7
slurm: rename segmenter job name
2022-11-03 17:12:27 +00:00
f2ae74ce7b
how could I be so stupid..... round 2
2022-11-02 17:38:26 +00:00
5f8d6dc6ea
Add metrics every 64 batches
...
this is important, because with large batches it can be difficult to tell what's happening inside each epoch.
2022-10-31 19:26:10 +00:00
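A rough sketch of what a per-N-batches metrics callback can look like in Keras. The class name matches CallbackNBatchCsv from the surrounding commits, but the body here is an assumption, not the project's actual implementation.

```python
import csv
import tensorflow as tf

class CallbackNBatchCsv(tf.keras.callbacks.Callback):
    """Write the current training metrics to a CSV file every n_batches batches."""
    def __init__(self, filepath, n_batches=64):
        super().__init__()
        self.n_batches = n_batches
        self.handle = open(filepath, "w", newline="")
        self.writer = None  # created lazily, once the metric names are known

    def on_train_batch_end(self, batch, logs=None):
        if not logs or batch % self.n_batches != 0:
            return
        if self.writer is None:
            self.writer = csv.DictWriter(self.handle, fieldnames=["batch"] + sorted(logs.keys()))
            self.writer.writeheader()
        self.writer.writerow({"batch": batch, **logs})
        self.handle.flush()  # so progress is visible mid-epoch
```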
cf872ef739
how could I be so *stupid*......
2022-10-31 18:40:58 +00:00
da32d75778
make_callbacks: display steps, not samples
2022-10-31 18:36:28 +00:00
dfef7db421
moar debugging
2022-10-31 18:26:34 +00:00
172cf9d8ce
tweak
2022-10-31 18:19:43 +00:00
dbe35ee943
loss: comment l2 norm
2022-10-31 18:09:03 +00:00
5e60319024
fixup
2022-10-31 17:56:49 +00:00
b986b069e2
debug party time
2022-10-31 17:50:29 +00:00
458faa96d2
loss: fixup
2022-10-31 17:18:21 +00:00
55dc05e8ce
contrastive: comment weights that aren't needed
2022-10-31 16:26:48 +00:00
33391eaf16
train_predict/jsonl: don't argmax
...
I'm interested in the raw values
2022-10-26 17:21:19 +01:00
74f2cdb900
train_predict: .list() → .tolist()
2022-10-26 17:12:36 +01:00
4f9d543695
train_predict: don't pass model_code
...
it's redundant
2022-10-26 17:11:36 +01:00
1b489518d0
segmenter: add LayerStack2Image to custom_objects
2022-10-26 17:05:50 +01:00
48ae8a5c20
LossContrastive: normalise features as per the paper
2022-10-26 16:52:56 +01:00
843cc8dc7b
contrastive: rewrite the loss function.
...
The CLIP paper *does* kinda make sense I think
2022-10-26 16:45:45 +01:00
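Based on the commit messages here (L2-normalised features, a learnt temperature, symmetric cross-entropy), a sketch of a CLIP-style contrastive loss. This is my reading of the CLIP paper, not necessarily the project's exact LossContrastive code.

```python
import tensorflow as tf

def clip_style_loss(feat_a, feat_b, logit_scale):
    """Symmetric CLIP-style contrastive loss over a batch of paired embeddings.
    feat_a, feat_b: [batch, dim] outputs of the two encoders.
    logit_scale: scalar temperature (e.g. a trainable tf.Variable)."""
    # L2-normalise the features, as per the paper
    feat_a = tf.math.l2_normalize(feat_a, axis=1)
    feat_b = tf.math.l2_normalize(feat_b, axis=1)

    # Pairwise cosine similarities, scaled by the learnt temperature
    logits = tf.matmul(feat_a, feat_b, transpose_b=True) * logit_scale

    # The matching pairs sit on the diagonal
    labels = tf.range(tf.shape(feat_a)[0])
    loss_a = tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
    loss_b = tf.keras.losses.sparse_categorical_crossentropy(labels, tf.transpose(logits), from_logits=True)
    return (tf.reduce_mean(loss_a) + tf.reduce_mean(loss_b)) / 2
```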
fad1399c2d
convnext: whitespace
2022-10-26 16:45:20 +01:00
1d872cb962
contrastive: fix initial temperature value
...
It should be 1/0.07, but we had it set to 0.07......
2022-10-26 16:45:01 +01:00
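The gist of the fix, with a hypothetical variable name: the temperature that scales the logits should start at 1/0.07 ≈ 14.3 (the CLIP paper actually stores its logarithm and exponentiates it), not at 0.07.

```python
import tensorflow as tf

# Before: tf.Variable(0.07, ...)  # far too small, squashes all the logits
weight_temperature = tf.Variable(1.0 / 0.07, trainable=True, dtype=tf.float32)  # ≈ 14.29
```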
f994d449f1
Layer2Image: fix
2022-10-25 21:32:17 +01:00
6a29105f56
model_segmentation: stack not reshape
2022-10-25 21:25:15 +01:00
98417a3e06
prepare for NCE loss
...
.....but Tensorflow's implementation looks to be for supervised models :-(
2022-10-25 21:15:05 +01:00
bb0679a509
model_segmentation: don't softmax twice
2022-10-25 21:11:48 +01:00
f2e2ca1484
model_contrastive: make water encoder significantly shallower
2022-10-24 20:52:31 +01:00
a6b07a49cb
count water/nowater pixels in Jupyter Notebook
2022-10-24 18:05:34 +01:00
a8b101bdae
dataset_predict: add shape_water_desired
2022-10-24 18:05:13 +01:00
587c1dfafa
train_predict: revamp jsonl handling
2022-10-21 16:53:08 +01:00
8195318a42
SparseCategoricalAccuracy: losses → metrics
2022-10-21 16:51:20 +01:00
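For reference, the distinction this commit is about, in a throwaway compile() call: SparseCategoricalAccuracy lives under tf.keras.metrics and belongs in the metrics list, not among the losses.

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(3)])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],  # a metric, not a loss
)
```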
612735aaae
rename shuffle arg
2022-10-21 16:35:45 +01:00
c98d8d05dd
segmentation: use the right accuracy
2022-10-21 16:17:05 +01:00
bb0258f5cd
flip squeeze operator ordering
2022-10-21 15:38:57 +01:00
af26964c6a
batched_iterator: reset i_item after every time
2022-10-21 15:35:43 +01:00
c5b1501dba
train-predict fixup
2022-10-21 15:27:39 +01:00
42aea7a0cc
plt.close() fixup
2022-10-21 15:23:54 +01:00
12dad3bc87
vis/segmentation: fix titles
2022-10-21 15:22:35 +01:00
0cb2de5d06
train-predict: close matplotlib after we've finished
...
they act like file handles
2022-10-21 15:19:31 +01:00
81e53efd9c
PNG: create output dir if doesn't exist
2022-10-21 15:17:39 +01:00
3f7db6fa78
fix embedding confusion
2022-10-21 15:15:59 +01:00
847cd97ec4
fixup
2022-10-21 14:26:58 +01:00
0e814b7e98
Contraster → Segmenter
2022-10-21 14:25:43 +01:00
1b658a1b7c
train-predict: can't destructure array when iterating generator
...
....it seems to lead to undefined behaviour or something
2022-10-20 19:34:04 +01:00
aed2348a95
train_predict: fixup
2022-10-20 15:42:33 +01:00
cc6679c609
batch data; use generator
2022-10-20 15:22:29 +01:00
d306853c42
use right dataset
2022-10-20 15:16:24 +01:00
59cfa4a89a
basename paths
2022-10-20 15:11:14 +01:00
4d8ae21a45
update cli help text
2022-10-19 17:31:42 +01:00
200076596b
finish train_predict
2022-10-19 17:26:40 +01:00
488f78fca5
pretrain_predict: default to parallel_reads=0
2022-10-19 16:59:45 +01:00
63e909d9fc
datasets: add shuffle=True/False to get_filepaths.
...
This is important because otherwise it SCRAMBLES the filenames, which is a disaster for making predictions in the right order....!
2022-10-19 16:52:07 +01:00
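A sketch, under assumptions, of the shape of this change: get_filepaths is the project's function name, but the body and the .tfrecord.gz extension here are guesses. The point is that prediction needs a deterministic sorted order, while training wants the files shuffled.

```python
import os
import random

def get_filepaths(dirpath, shuffle=True):
    """Return dataset filepaths, shuffled for training or in sorted order for prediction."""
    filepaths = sorted(
        os.path.join(dirpath, filename)
        for filename in os.listdir(dirpath)
        if filename.endswith(".tfrecord.gz")
    )
    if shuffle:
        random.shuffle(filepaths)
    return filepaths

filepaths_train   = get_filepaths("/path/to/dataset", shuffle=True)
filepaths_predict = get_filepaths("/path/to/dataset", shuffle=False)  # keep order!
```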
fe43ddfbf9
start implementing driver for train_predict, but not finished yet
2022-10-18 19:37:55 +01:00
b3ea189d37
segmentation: softmax the output
2022-10-13 21:02:57 +01:00
f121bfb981
fixup summaryfile
2022-10-13 17:54:42 +01:00
5c35c0cee4
model_segmentation: document; remove unused args
2022-10-13 17:50:16 +01:00
f12e6ab905
No need for a CLI arg for feature_dim_in - metadata should contain this
2022-10-13 17:37:16 +01:00
e201372252
write quick Jupyter notebook to test data
...
....I'm paranoid
2022-10-13 17:27:17 +01:00
ae53130e66
layout
2022-10-13 14:54:20 +01:00
7933564c66
typo
2022-10-12 17:33:54 +01:00
dbe4fb0eab
train: add slurm job file
2022-10-12 17:27:10 +01:00
6423bf6702
LayerConvNeXtGamma: avoid adding an EagerTensor to config
...
Very weird how this is a problem when it wasn't before..
2022-10-12 17:12:07 +01:00
32f5200d3b
pass model_arch properly
2022-10-12 16:50:06 +01:00
5933fb1061
fixup
2022-10-11 19:23:41 +01:00
c45b90764e
segmentation: adds xxtiny, but unsure if it's small enough
2022-10-11 19:22:37 +01:00
f4a2c742d9
typo
2022-10-11 19:19:23 +01:00
11f91a7cf4
train: add --arch; default to convnext_i_xtiny
2022-10-11 19:18:01 +01:00
5666c5a0d9
typo
2022-10-10 18:12:51 +01:00
131c0a0a5b
pretrain-predict: create dir if not exists
2022-10-10 18:00:55 +01:00
deede32241
slurm-pretrain: limit memory usage
2022-10-10 17:45:29 +01:00
13a8f3f511
pretrain-predict: only queue pretrain-plot if we output jsonl
2022-10-10 17:11:10 +01:00
ffcb2e3735
pretrain-predict: queue for the actual input
2022-10-10 16:53:28 +01:00
f883986eaa
Bugfix: set mode to enable TFRecordWriter instead of bare handle
2022-10-06 20:07:59 +01:00
e9a8e2eb57
fixup
2022-10-06 19:23:31 +01:00
9f3ae96894
finish wiring for --water-size
2022-10-06 19:21:50 +01:00
5dac70aa08
typo
2022-10-06 19:17:03 +01:00
2960d3b645
exception → warning
2022-10-06 18:26:40 +01:00
0ee6703c1e
Add todo and comment
2022-10-03 19:06:56 +01:00
2b182214ea
typo
2022-10-03 17:53:10 +01:00
92c380bff5
fiddle with Conv2DTranspose
...
you need to set the `strides` argument to actually get it to upscale..... :P
2022-10-03 17:51:41 +01:00
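A minimal demonstration of the gotcha: with the default strides, Conv2DTranspose with padding="same" keeps the spatial size of its input; strides=2 is what actually doubles the resolution.

```python
import tensorflow as tf

x = tf.keras.Input(shape=(32, 32, 64))

same_size = tf.keras.layers.Conv2DTranspose(64, kernel_size=3, padding="same")(x)
upscaled  = tf.keras.layers.Conv2DTranspose(64, kernel_size=3, strides=2, padding="same")(x)

print(same_size.shape)  # (None, 32, 32, 64)
print(upscaled.shape)   # (None, 64, 64, 64)
```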
d544553800
fixup
2022-10-03 17:33:06 +01:00
058e3b6248
model_segmentation: cast float → int
2022-10-03 17:31:36 +01:00
04e5ae0c45
model_segmentation: redo reshape
...
much cheese was applied :P
2022-10-03 17:27:52 +01:00
deffe69202
typo
2022-10-03 16:59:36 +01:00
fc6d2dabc9
Upscale first, THEN convnext...
2022-10-03 16:38:43 +01:00
6a0790ff50
convnext_inverse: add returns; change ordering
2022-10-03 16:32:09 +01:00
fe813cb46d
slurm predict: fix plotting subcall
2022-10-03 16:03:26 +01:00
e51087d0a9
add reshape layer
2022-09-28 18:22:48 +01:00
a336cdee90
and continues
2022-09-28 18:18:10 +01:00
de47a883d9
missing units
2022-09-28 18:17:22 +01:00
b5e08f92fe
the long night continues
2022-09-28 18:14:09 +01:00
dc159ecfdb
and again
2022-09-28 18:11:46 +01:00
4cf0485e32
fixup... again
2022-09-28 18:10:11 +01:00
030d8710b6
fixup
2022-09-28 18:08:31 +01:00
4ee7f2a0d6
add water thresholding
2022-09-28 18:07:26 +01:00
404dc30f08
and again
2022-09-28 17:39:09 +01:00
4cd8fc6ded
segmentation: param name fix
2022-09-28 17:37:42 +01:00
41ba980d69
segmentation: implement dataset parser
2022-09-28 17:19:21 +01:00
d618e6f8d7
pretrain-predict: params.json → metadata.jsonl
2022-09-28 16:35:22 +01:00
e9e6139c7a
typo
2022-09-28 16:28:18 +01:00
3dee3d8908
update cli help
2022-09-28 16:23:47 +01:00
b836f7f70c
again
2022-09-27 19:06:41 +01:00
52dff130dd
slurm pretrain predict: moar logging
2022-09-27 19:05:54 +01:00
2d24174e0a
slurm pretrain predict: add $OUTPUT
2022-09-27 18:58:02 +01:00
7d0e3913ae
fix logging
2022-09-27 18:51:58 +01:00
d765b3b14e
fix crash
2022-09-27 18:43:43 +01:00
2cd59a01a5
slurm-pretrain-plot: add ARGS
2022-09-27 18:41:35 +01:00
f4d1d1d77e
just wh
2022-09-27 18:25:45 +01:00
4c24d69ae6
$d → +d
2022-09-27 18:17:07 +01:00
cdb19b4d9f
fixup
2022-09-27 18:13:21 +01:00
c4d3c16873
add some logging
2022-09-27 18:10:58 +01:00
3772c3227e
fixup
2022-09-27 17:57:21 +01:00
dbfa45a016
write params.json properly
2022-09-27 17:49:54 +01:00
a5455dc22a
fix import
2022-09-27 17:41:24 +01:00
d6ff3fb2ce
pretrain_predict fix write mode
2022-09-27 17:38:12 +01:00
f95fd8f9e4
pretrain-predict: add .tfrecord output function
2022-09-27 16:59:31 +01:00
30b8dd063e
fixup
2022-09-27 15:54:37 +01:00
3cf99587e4
Contraster: add gamma layer to load_model
2022-09-27 15:53:52 +01:00
9e9852d066
UMAP: 100k random
...
no labels.
2022-09-27 15:52:45 +01:00
58c65bdc86
slurm: allow predictions on gpu
2022-09-23 19:21:57 +01:00
d59de41ebb
embeddings: change title rendering; make even moar widererer
...
We need to see that parallel coordinates plot in detail
2022-09-23 18:56:39 +01:00
b4ddb24589
slurm plot: compute → highmem
2022-09-22 18:27:58 +01:00
df12470e78
flip conda and time
...
hopefully we can capture the exit code this way
2022-09-21 14:40:28 +01:00
5252a81238
vis: don't call add_subplot
2022-09-20 19:06:21 +01:00
32bb55652b
slurm predict: autoqueue UMAP plot
2022-09-16 19:36:57 +01:00
24c5263bf0
slurm predict-plot: fixup
2022-09-16 19:27:26 +01:00
7e9c119b04
slurm: fix job name
2022-09-16 19:20:59 +01:00
1574529704
slurm pretrain-predict: move to compute; make exclusive just in case
...
also shorten to 3 days
2022-09-16 19:17:42 +01:00
d7f5958af0
slurm: write new job files
2022-09-16 19:00:43 +01:00
a552cc4dad
ai vis: make parallel coordinates wider
2022-09-16 18:51:49 +01:00
a70794e661
umap: no min_dist
2022-09-16 17:09:09 +01:00
5778fc51f7
embeddings: fix title; remove colourmap
2022-09-16 17:08:04 +01:00
4fd852d782
fixup
2022-09-16 16:44:35 +01:00
fcab227f6a
cheese: set label for everything to 1
2022-09-16 16:42:05 +01:00
b31645bd5d
pretrain-plot: fix crash; remove water code
...
the model doesn't save the water encoder at this time
2022-09-16 16:24:07 +01:00
a5f03390ef
pretrain-plot: handle open w -> r
2022-09-16 16:14:30 +01:00
1103ae5561
ai: tweak progress bar
2022-09-16 16:07:16 +01:00
1e35802d2b
ai: fix embed i/o
2022-09-16 16:02:27 +01:00
ed94da7492
fixup
2022-09-16 15:51:26 +01:00
366db658a8
ds predict: fix filenames in
2022-09-16 15:45:22 +01:00
e333dcba9c
tweak projection head
2022-09-16 15:36:01 +01:00
6defd24000
bugfix: too many values to unpack
2022-09-15 19:56:17 +01:00
e3c8277255
ai: tweak the segmentation model structure
2022-09-15 19:54:50 +01:00
1bc8a5bf13
ai: fix crash
2022-09-15 19:37:06 +01:00
bd64986332
ai: implement batched_iterator to replace .batch()
...
...apparently .batch() means you get a BatchedDataset or whatever when you iterate it like a tf.function instead of the actual tensor :-/
2022-09-15 19:16:38 +01:00
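A hedged sketch of what a batched_iterator along these lines can look like (the project's real implementation may differ): iterate the unbatched dataset in plain Python and yield concrete stacked tensors, rather than relying on .batch().

```python
import tensorflow as tf

def batched_iterator(dataset, batch_size=64):
    """Collect items from an unbatched tf.data.Dataset and yield them as stacked tensors."""
    batch = []
    for item in dataset:
        batch.append(item)
        if len(batch) >= batch_size:
            yield tf.stack(batch, axis=0)
            batch = []
    if batch:  # the final, possibly incomplete, batch
        yield tf.stack(batch, axis=0)

# Usage: every yielded value is a concrete [batch_size, ...] tensor
for batch in batched_iterator(tf.data.Dataset.range(10), batch_size=4):
    print(batch.numpy())
```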
ccd256c00a
embed rainfall radar, not both
2022-09-15 17:37:04 +01:00
2c74676902
predict → predict_on_batch
2022-09-15 17:31:50 +01:00
f036e79098
fixup
2022-09-15 17:09:26 +01:00
d5f1a26ba3
disable prefetching when predicting a thing
2022-09-15 17:09:09 +01:00
8770638022
ai: call .predict(), not the model itself
2022-09-14 17:41:01 +01:00
a96cefde62
ai: predict oops
2022-09-14 17:37:48 +01:00
fa3165a5b2
dataset: simplify dataset_predict
2022-09-14 17:33:17 +01:00
279e27c898
fixup
2022-09-14 17:16:49 +01:00
fad3313ede
fixup
2022-09-14 17:14:04 +01:00
1e682661db
ai: kwargs in from_checkpoint
2022-09-14 17:11:06 +01:00
6bda24d4da
ai: how did I miss that?!
...
bugfix ah/c
2022-09-14 16:53:43 +01:00
decdd434d8
ai from_checkpoint: bugfix
2022-09-14 16:49:01 +01:00
c9e00ea485
pretrain_predict: fix import
2022-09-14 16:02:36 +01:00
f568d8d19f
io.open → handle_open
...
this way we get transparent .gz support
2022-09-14 16:01:22 +01:00
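handle_open is the project's helper; here is a rough sketch of the idea only (the real implementation may handle modes differently): pick gzip.open or io.open based on the file extension, so .gz files are (de)compressed transparently.

```python
import gzip
import io

def handle_open(filepath, mode="r"):
    """Open a file, transparently handling gzip compression based on the extension."""
    if filepath.endswith(".gz"):
        # "rt"/"wt" keeps gzip in text mode, matching io.open's default behaviour
        if "b" not in mode and "t" not in mode:
            mode += "t"
        return gzip.open(filepath, mode)
    return io.open(filepath, mode)

with handle_open("predictions.jsonl.gz", "w") as handle:  # hypothetical filename
    handle.write('{"example": true}\n')
```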
9b25186541
fix --only-gpu
2022-09-14 15:55:21 +01:00
a9c9c70d13
typo
2022-09-14 15:17:59 +01:00
fb8f884487
add umap dependencies
2022-09-14 15:16:45 +01:00
1876a8883c
ai pretrain-predict: fix - → _ in cli parsing
2022-09-14 15:12:07 +01:00
f97b771922
make --help display help
2022-09-14 15:03:07 +01:00
3c1aef5913
update help
2022-09-13 19:36:56 +01:00
206257f9f5
ai pretrain_predict: no need to plot here anymore
2022-09-13 19:35:44 +01:00
7685ec3e8b
implement ability to embed & plot pretrained embeddings
2022-09-13 19:18:59 +01:00
7130c4fdf8
start implementing core image segmentation model
2022-09-07 17:45:38 +01:00
22620a1854
ai: implement saving only the rainfall encoder
2022-09-06 19:48:46 +01:00
4c4358c3e5
whitespace
2022-09-06 16:24:11 +01:00
4202821d98
typo
2022-09-06 15:37:36 +01:00
3e13ad12c8
ai Bugfix LayerContrastiveEncoder: channels → input_channels
...
for consistency
2022-09-05 23:53:16 +01:00
ead8009425
pretrain: add CLI arg for the water prediction width/height
2022-09-05 15:36:40 +01:00
9d39215dd5
dataset: drop incomplete batches
2022-09-05 15:36:10 +01:00
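Most likely this is tf.data's built-in switch (illustrative only): drop_remainder=True discards the final short batch, so every batch has the same static shape.

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(10).batch(4, drop_remainder=True)
print([batch.numpy().tolist() for batch in dataset])  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```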
c94f5d042e
ai: slurm fixup again
2022-09-02 19:13:56 +01:00
7917820e59
ai: slurm fixup
2022-09-02 19:11:54 +01:00
cd104190e8
slurm: no need to be exclusive anymore
2022-09-02 19:09:45 +01:00
a9dede7bfe
ai: n=10 slurm
2022-09-02 19:09:07 +01:00
42de502f99
slurm-pretrain: set name
2022-09-02 19:08:16 +01:00
457c2eef0d
ai: fixup
2022-09-02 19:06:54 +01:00
a3f03b6d8d
slurm-pretrain: prepare
2022-09-02 19:05:18 +01:00
1d1533d160
ai: how did things get this confusing
2022-09-02 18:51:46 +01:00
1c5defdcd6
ai: cats
2022-09-02 18:45:23 +01:00
6135bcd0cd
fixup
2022-09-02 18:41:31 +01:00
c33c0a0899
ai: these shapes are so annoying
2022-09-02 18:39:24 +01:00
88acd54a97
ai: off-by-one
2022-09-02 18:13:48 +01:00
23bcd294c4
AI: 0.75 → 0.5?
2022-09-02 18:12:51 +01:00
b0e1aeac35
ai: knew it
2022-09-02 18:10:29 +01:00
ef0f15960d
send correct water shape to
2022-09-02 18:09:36 +01:00
ad156a9a00
ai dataset: centre crop the water data to 75% original size
...
this should both help the model and reduce memory usage
2022-09-02 18:05:32 +01:00
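One way to express the centre crop described above, shown with tf.image.central_crop (illustrative; the project may crop by hand): keep the central 75% of the water raster along each spatial dimension.

```python
import tensorflow as tf

water = tf.random.uniform([128, 128, 1])                              # [height, width, channels]
water_cropped = tf.image.central_crop(water, central_fraction=0.75)
print(water_cropped.shape)                                            # (96, 96, 1)
```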
efe41b96ec
fixup
2022-09-02 17:58:02 +01:00
f8ee0afca1
ai: fix arch_name plumbing
2022-09-02 17:57:07 +01:00
3d44831080
Add NO_PREFETCH env var
2022-09-02 17:55:04 +01:00
3e0ca6a315
ai: fix summary file writing; make water encoder smaller
2022-09-02 17:51:45 +01:00
389216b391
reordering
2022-09-02 17:31:19 +01:00
b9d01ddadc
summary logger → summarywriter
2022-09-02 17:28:00 +01:00
9f7f4af784
dataset: properly resize
2022-09-02 17:09:38 +01:00
c78b6562ef
debug adjustment
2022-09-02 17:00:21 +01:00
c89677abd7
dataset: explicit reshape
2022-09-02 16:57:59 +01:00
c066ea331b
dataset: namespace → dict
...
Python is so annoying
2022-09-02 16:07:44 +01:00
dbff15d4a9
add bug comment
2022-09-01 19:05:50 +01:00
3e4128c0a8
resize rainfall to be 1/2 size of current
2022-09-01 18:47:07 +01:00
8a86728b54
ai: how did I miss that
2022-09-01 18:07:53 +01:00
b2f1ba29bb
debug a
2022-09-01 18:05:33 +01:00
e2f621e212
even moar debug logging
2022-09-01 17:49:51 +01:00
2663123407
moar logging
2022-09-01 17:40:23 +01:00
c2fcb3b954
ai: channels_first → channels_last
2022-09-01 17:06:18 +01:00
f1d7973f22
ai: add dummy label
2022-09-01 17:01:00 +01:00
17d42fe899
ai: json.dumps
2022-09-01 16:21:52 +01:00
940f7aa1b5
ai: set self.model
2022-09-01 16:20:23 +01:00
f1be5fe2bd
ai: add missing arguments to LossContrastive
2022-09-01 16:14:00 +01:00
ddb375e906
ai: another typo
2022-09-01 16:12:05 +01:00
cb3e1d3a23
ai: fixup
2022-09-01 16:10:50 +01:00
e4c95bc7e3
typo
2022-09-01 16:09:24 +01:00
cfbbe8e8cf
ai: global? really?
2022-09-01 16:06:24 +01:00
4952ead094
ai: try the nonlocal keyword? I'm unsure what's going on here
...
....clearly I need to read up on scoping in python.
2022-09-01 16:02:37 +01:00
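For reference, a tiny generic example of the Python scoping rule at play here (nothing to do with the project's code): assignment inside a nested function creates a fresh local unless the name is declared nonlocal (enclosing scope) or global (module scope).

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count   # without this, "count += 1" raises UnboundLocalError
        count += 1
        return count

    return increment

counter = make_counter()
print(counter(), counter(), counter())  # 1 2 3
```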
1d9d0fbb73
requirements.txt: add tensorflow_addons
2022-09-01 15:56:02 +01:00
8bdded23eb
ai: fix 'nother crash; name ConvNeXt submodels
2022-08-31 18:57:27 +01:00
b2a320134e
ai: typo
2022-08-31 18:54:03 +01:00
e4edc68df5
ai: add missing gamma layer
2022-08-31 18:52:35 +01:00
51cf08a386
ResNetRSV2 → ConvNeXt
...
ironically this makes the model simpler o/
2022-08-31 18:51:01 +01:00
3d614d105b
channels first
2022-08-31 18:06:59 +01:00
654eefd9ca
properly handle water dimensions; add log files to .gitignore
...
TODO: add heightmap
2022-08-31 18:03:39 +01:00
5846828f9e
debug logging
2022-08-31 17:48:09 +01:00
12c77e128d
handle feature_dim properly
2022-08-31 17:41:51 +01:00
c52a9f961c
and another
2022-08-31 17:37:28 +01:00
58dbabd561
fix another crash
2022-08-31 17:33:07 +01:00
5fc0725917
slurm: chmod +x
2022-08-31 16:33:07 +01:00
dbf929325a
typo; add pretrain slurm job file
2022-08-31 16:32:17 +01:00
e0162bc70b
requirements.txt: add missing dependencies
2022-08-31 16:25:47 +01:00
f2312c1184
fix crash
2022-08-31 16:25:27 +01:00
15a3519107
ai: the best thing about implementing a model is that you don't have to test it on the same day :P
2022-08-11 18:26:28 +01:00
c0a9cb12d8
ai: start creating initial model implementation.
...
it's not hooked up to the CLI yet though.
Focus is still on ensuring the dataset is in the right format.
2022-08-10 19:03:25 +01:00
b52c7f89a7
Move dataset parsing function to the right place
2022-08-10 17:24:55 +01:00
222a6146ec
write glue for .jsonl.gz → .tfrecord.gz converter
2022-08-08 15:33:59 +01:00
28a3f578d5
update .gitignore
2022-08-04 16:49:53 +01:00
323d708692
dataset: add todo
...
just why, Tensorflow?!
tf.data.TextLineDataset looks almost too good to be true..... and it is, as despite supporting decompressing via gzip(!) it doesn't look like we can convince it to parse JSON :-/
2022-07-26 19:53:18 +01:00
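A sketch of the dead end described above, under assumed filenames and field names: tf.data.TextLineDataset will decompress gzip for you, but there is no JSON parser inside the tf.data graph, so every line has to fall back into Python via tf.py_function (slow), which is why the data gets converted to .tfrecord instead.

```python
import json
import tensorflow as tf

# Hypothetical filename; the gzip decompression part genuinely works out of the box
dataset = tf.data.TextLineDataset(
    ["rainfall_water.jsonl.gz"],
    compression_type="GZIP",
)

def parse_line(line):
    obj = json.loads(line.numpy().decode("utf-8"))
    return tf.constant(obj["rainfall"], dtype=tf.float32)  # "rainfall" key is an assumption

# ...but JSON parsing has to escape the graph via tf.py_function
dataset = dataset.map(
    lambda line: tf.py_function(parse_line, [line], tf.float32)
)
```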
b53c77a2cb
index.py: call static function named run
2022-07-26 19:51:28 +01:00
a7ed58fc03
ai: move requirements.txt to the right place
2022-07-26 19:25:11 +01:00
e93a95f1b3
ai dataset: add if __name__ == "__main__" guard
2022-07-26 19:24:40 +01:00
de4c3dab17
typo
2022-07-26 19:14:55 +01:00
18a7d3674b
ai: create (untested) dataset
2022-07-26 19:14:10 +01:00
dac6919fcd
ai: start creating initial scaffolding
2022-07-25 19:01:10 +01:00
8a9cd6c1c0
Lay out some basic scaffolding
...
I *really* hope this works. This is the 3rd major revision of this
model. I've learnt a ton of stuff between now and my last attempt, so
here's hoping that all goes well :D
The basic idea behind this attempt is *Contrastive Learning*. If we
don't get anything useful with this approach, then we can assume that
it's not really possible / feasible.
Something we need to watch out for is the variance (or rather lack
thereof) in the dataset. We have 1.5M timesteps, but not a whole lot
will be happening in most of those....
We may need to analyse the variance of the water depth data and extract
a subsample that's more balanced.
2022-05-13 19:06:15 +01:00