Starbeamrainbowlabs
5fc0725917
slurm: chmod +x
2022-08-31 16:33:07 +01:00
Starbeamrainbowlabs
dbf929325a
typo; add pretrain slurm job file
2022-08-31 16:32:17 +01:00
Starbeamrainbowlabs
e0162bc70b
requirements.txt: add missing dependencies
2022-08-31 16:25:47 +01:00
Starbeamrainbowlabs
f2312c1184
fix crash
2022-08-31 16:25:27 +01:00
Starbeamrainbowlabs
fe7a8b3fc0
Add 9 simple steps to use the rainfall radar model..... oh boy.
2022-08-31 16:18:28 +01:00
Starbeamrainbowlabs
15a3519107
ai: the best thing about implementing a model is that you don't have to test it on the same day :P
2022-08-11 18:26:28 +01:00
Starbeamrainbowlabs
28bcdf2192
vscode settings file
2022-08-11 16:38:25 +01:00
Starbeamrainbowlabs
c0a9cb12d8
ai: start creating initial model implementation.
...
it's not hooked up to the CLI yet though.
Focus is still on ensuring the dataset is in the right format though
2022-08-10 19:03:25 +01:00
Starbeamrainbowlabs
6cdf2b2389
wrangler python child: explicitly close stdout+stderr.
...
Hopefully this will avoid any more hanging issues.
2022-08-10 18:51:30 +01:00
Starbeamrainbowlabs
5880bf9020
wrangler: add current date to process indicator.
...
There's a bug that causes it to hang, but we don't know why
2022-08-10 18:50:57 +01:00
Starbeamrainbowlabs
231c832888
wrangler bugfix: crashes; logging output
2022-08-10 17:33:10 +01:00
Starbeamrainbowlabs
b52c7f89a7
Move dataset parsing function to the right place
2022-08-10 17:24:55 +01:00
Starbeamrainbowlabs
50f214450f
wrangler: fix crash
2022-08-10 17:05:01 +01:00
Starbeamrainbowlabs
0bac8c8c0c
fixup
2022-08-08 17:23:24 +01:00
Starbeamrainbowlabs
405f1a0bb0
fixup
2022-08-08 17:22:31 +01:00
Starbeamrainbowlabs
5e1356513c
slurm: use compute, because 28 tf processes in parallel is too much for the GPU memory
2022-08-08 17:22:18 +01:00
Starbeamrainbowlabs
133ef59af3
fixup
2022-08-08 16:33:05 +01:00
Starbeamrainbowlabs
80e1a33ee2
slurm-jsonl2tfrecord.job: auto install dependencies
2022-08-08 16:31:49 +01:00
Starbeamrainbowlabs
1442d20524
slurm: request gpu
2022-08-08 15:56:46 +01:00
Starbeamrainbowlabs
f6f2e3694c
json2tfrecord: write slurm job file
2022-08-08 15:53:32 +01:00
Starbeamrainbowlabs
222a6146ec
write glue for .jsonl.gz → .tfrecord.gz converter
2022-08-08 15:33:59 +01:00
Starbeamrainbowlabs
f3652edf82
fixup
2022-08-05 19:10:40 +01:00
Starbeamrainbowlabs
9399d1d8f5
Create (untested) JS interface to Python jsonl→tfrecord converter
...
also test Python .jsonl.gz → .tfrecord.gz
2022-08-05 19:10:28 +01:00
Starbeamrainbowlabs
a02c3436ab
get python bridge working to convert .jsonl.gz → .tfrecord.gz
2022-08-05 18:07:04 +01:00
Starbeamrainbowlabs
28a3f578d5
update .gitignore
2022-08-04 16:49:53 +01:00
Starbeamrainbowlabs
2ccc1be414
json2tfrecord: write (untested) python to convert .jsonl → .tfrecord
2022-07-28 19:48:25 +01:00
Starbeamrainbowlabs
323d708692
dataset: add todo
...
just why, Tensorflow?!
tf.data.TextLineDataset looks almost too good to be true..... and it is, as despite supporting decompressing via gzip(!) it doesn't look like we can convince it to parse JSON :-/
2022-07-26 19:53:18 +01:00
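The limitation noted in the commit above (line-by-line gzip reading works, JSON parsing does not) can be sidestepped by parsing outside TensorFlow. What follows is only a minimal sketch using the Python standard library; the function name read_jsonl_gz is hypothetical and is not taken from this repository, and the later json2tfrecord commits may well do this differently.

```python
import gzip
import json


def read_jsonl_gz(path):
    """Yield one parsed JSON object per non-empty line of a .jsonl.gz file.

    Hypothetical helper for illustration only; not the repository's code.
    """
    # "rt" opens the gzip stream in text mode, so iteration yields str lines.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines rather than failing in json.loads
                yield json.loads(line)
```

Each yielded object could then be serialised into whatever record format the training pipeline expects.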
Starbeamrainbowlabs
b53c77a2cb
index.py: call static function named run
2022-07-26 19:51:28 +01:00
Starbeamrainbowlabs
a7ed58fc03
ai: move requirements.txt to the right place
2022-07-26 19:25:11 +01:00
Starbeamrainbowlabs
e93a95f1b3
ai dataset: add if main == main
2022-07-26 19:24:40 +01:00
Starbeamrainbowlabs
de4c3dab17
typo
2022-07-26 19:14:55 +01:00
Starbeamrainbowlabs
18a7d3674b
ai: create (untested) dataset
2022-07-26 19:14:10 +01:00
Starbeamrainbowlabs
dac6919fcd
ai: start creating initial scaffolding
2022-07-25 19:01:10 +01:00
Starbeamrainbowlabs
1ec502daea
Remove rogue package*.json files
2022-07-25 19:00:21 +01:00
Starbeamrainbowlabs
927c30e189
recompress files in the right order
2022-07-25 18:44:23 +01:00
Starbeamrainbowlabs
3332fa598a
Add new recompress subcommand
...
also fix typos, CLI definitions
2022-07-25 17:54:23 +01:00
Starbeamrainbowlabs
d9b9a4f9fc
note to self
2022-07-22 19:04:41 +01:00
Starbeamrainbowlabs
593dc2d5ce
fixup
2022-07-22 18:51:29 +01:00
Starbeamrainbowlabs
a593077d46
add slurm job file for uniq
2022-07-22 18:46:05 +01:00
Starbeamrainbowlabs
03e398504a
Bugfix: fix crash when target dir isn't specified
2022-07-22 18:36:00 +01:00
Starbeamrainbowlabs
82e826fd69
Fix bugs in remainder of rainfallwrangler:uniq :D
2022-07-22 18:05:03 +01:00
Starbeamrainbowlabs
31bd7899b6
Merge branch 'main' of git.starbeamrainbowlabs.com:sbrl/PhD-Rainfall-Radar
2022-07-22 17:10:52 +01:00
Starbeamrainbowlabs
ce303814d6
Bugfix: don't make 1 group for each duplicate....
2022-07-22 17:06:02 +01:00
Starbeamrainbowlabs
38a0bd0942
uniq: bugfix a lot, but it's not working right just yet
...
There's still a bug in the file line deleter
2022-07-09 00:31:32 +01:00
Starbeamrainbowlabs
a966cdff35
uniq: bugfix a lot, but it's not working right just yet
...
There's still a bug in the file line deleter
2022-07-08 19:54:24 +01:00
Starbeamrainbowlabs
3b2715c6cd
recordify: fix process exiting and incomplete files issues
...
• Node.js not exiting at all
• Node.js exiting on end_safe ing stream.Writable (?????)
• Incomplete files - "unexpected end of file" errors and invalid JSON
2022-07-08 18:54:00 +01:00
Starbeamrainbowlabs
cb922ae8c8
fixup
2022-07-08 16:52:19 +01:00
Starbeamrainbowlabs
b9a018f9a9
properly close all the streams
2022-07-08 16:51:17 +01:00
Starbeamrainbowlabs
1a657bd653
add new uniq subcommand
...
It deduplicates lines in the files, with the potential to add the ability to filter on a specific property later.
The reasoning for this is thus:
1. There will naturally be periods of time where nothing happens
2. Too many duplicates will interfere with and confuse the contrastive learning algorithm, as each batch will have less variance in its samples
This is especially important because contrastive learning causes it to compare every item in each batch with every other item in the batch.
2022-07-04 19:46:06 +01:00
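The reasoning in the uniq commit above can be sketched roughly as follows. This is an illustrative Python sketch only, not the repository's uniq implementation (which belongs to the Node.js rainfallwrangler tool); the names uniq_lines and pair_count, and the optional key parameter standing in for "filter on a specific property", are assumptions made for the example.

```python
import hashlib
import json


def uniq_lines(lines, key=None):
    """Yield each distinct line once, keeping first-seen order.

    If key is given (hypothetical), lines are treated as JSON objects and
    compared on that property only instead of on the whole line.
    """
    seen = set()
    for line in lines:
        text = line.rstrip("\n")
        if not text:
            continue
        if key is None:
            target = text
        else:
            # Canonicalise the chosen property so equal values hash equally.
            target = json.dumps(json.loads(text)[key], sort_keys=True)
        # Store a fixed-size digest rather than the full line to save memory.
        digest = hashlib.sha256(target.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        yield text


def pair_count(batch_size):
    """Number of distinct item pairs compared within one batch."""
    return batch_size * (batch_size - 1) // 2
```

Because the number of pairs grows quadratically with batch size, every duplicate that slips into a batch is compared against all of the other items in it, which is why removing duplicates beforehand matters for this kind of training.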
Starbeamrainbowlabs
234e2b7978
Write \n end of line character
...
we actually forgot this, wow....
2022-07-04 17:05:05 +01:00