sbrl-github/research-rainfallradar

mirror of https://github.com/sbrl/research-rainfallradar synced 2024-11-25 18:33:01 +00:00

Author	SHA1	Message	Date
Starbeamrainbowlabs	9edda1f397	rainfallwrangler json2tfrecord.py: normalise data	2022-09-01 19:03:15 +01:00
Starbeamrainbowlabs	3e4128c0a8	resize rainfall to be 1/2 size of current	2022-09-01 18:47:07 +01:00
Starbeamrainbowlabs	6cdf2b2389	wrangler python child: explicitly close stdout+stderr. Hopefully this will avoid any more hanging issues.	2022-08-10 18:51:30 +01:00
Starbeamrainbowlabs	5880bf9020	wrangler: add current date to process indicator. There's a bug that causes it to hang, but we don't know why	2022-08-10 18:50:57 +01:00
Starbeamrainbowlabs	231c832888	wrangler bugfix: crashes; logging output	2022-08-10 17:33:10 +01:00
Starbeamrainbowlabs	b52c7f89a7	Move dataset parsing function to the right place	2022-08-10 17:24:55 +01:00
Starbeamrainbowlabs	50f214450f	wrangler: fix crash	2022-08-10 17:05:01 +01:00
Starbeamrainbowlabs	0bac8c8c0c	fixup	2022-08-08 17:23:24 +01:00
Starbeamrainbowlabs	405f1a0bb0	fixup	2022-08-08 17:22:31 +01:00
Starbeamrainbowlabs	5e1356513c	slurm: use compute, because 28 tf processes in parallel is too much for the GPU memory	2022-08-08 17:22:18 +01:00
Starbeamrainbowlabs	133ef59af3	fixup	2022-08-08 16:33:05 +01:00
Starbeamrainbowlabs	80e1a33ee2	slurm-jsonl2tfrecord.job: auto install dependencies	2022-08-08 16:31:49 +01:00
Starbeamrainbowlabs	1442d20524	slurm: request gpu	2022-08-08 15:56:46 +01:00
Starbeamrainbowlabs	f6f2e3694c	json2tfrecord: write slurm job file	2022-08-08 15:53:32 +01:00
Starbeamrainbowlabs	222a6146ec	write glue for .jsonl.gz → .tfrecord.gz converter	2022-08-08 15:33:59 +01:00
Starbeamrainbowlabs	f3652edf82	fixup	2022-08-05 19:10:40 +01:00
Starbeamrainbowlabs	9399d1d8f5	Create (untested) JS interface to Python jsonl→tfrecord converter; also test Python .jsonl.gz → .tfrecord.gz	2022-08-05 19:10:28 +01:00
Starbeamrainbowlabs	a02c3436ab	get python bridge working to convert .jsonl.gz → .tfrecord.gz	2022-08-05 18:07:04 +01:00
Starbeamrainbowlabs	2ccc1be414	json2tfrecord: write (untested) python to convert .jsonl → .tfrecord	2022-07-28 19:48:25 +01:00
Starbeamrainbowlabs	927c30e189	recompress files in the right order	2022-07-25 18:44:23 +01:00
Starbeamrainbowlabs	3332fa598a	Add new recompress subcommand; also fix typos, CLI definitions	2022-07-25 17:54:23 +01:00
Starbeamrainbowlabs	593dc2d5ce	fixup	2022-07-22 18:51:29 +01:00
Starbeamrainbowlabs	a593077d46	add slurm job file for uniq	2022-07-22 18:46:05 +01:00
Starbeamrainbowlabs	03e398504a	Bugfix: fix crash when target dir isn't specified	2022-07-22 18:36:00 +01:00
Starbeamrainbowlabs	82e826fd69	Fix bugs in remainder of rainfallwrangler:uniq :D	2022-07-22 18:05:03 +01:00
Starbeamrainbowlabs	31bd7899b6	Merge branch 'main' of git.starbeamrainbowlabs.com:sbrl/PhD-Rainfall-Radar	2022-07-22 17:10:52 +01:00
Starbeamrainbowlabs	ce303814d6	Bugfix: don't make 1 group for each duplicate....	2022-07-22 17:06:02 +01:00
Starbeamrainbowlabs	38a0bd0942	uniq: bugfix a lot, but it's not working right just yet. There's still a bug in the file line deleter	2022-07-09 00:31:32 +01:00
Starbeamrainbowlabs	a966cdff35	uniq: bugfix a lot, but it's not working right just yet. There's still a bug in the file line deleter	2022-07-08 19:54:24 +01:00
Starbeamrainbowlabs	3b2715c6cd	recordify: fix process exiting and incomplete files issues • Node.js not exiting at all • Node.js exiting on end_safe ing stream.Writable (?????) • Incomplete files - "unexpected end of file" errors and invalid JSON	2022-07-08 18:54:00 +01:00
Starbeamrainbowlabs	cb922ae8c8	fixup	2022-07-08 16:52:19 +01:00
Starbeamrainbowlabs	b9a018f9a9	properly close all the streams	2022-07-08 16:51:17 +01:00
Starbeamrainbowlabs	1a657bd653	add new uniq subcommand. It deduplicates lines in the files, with the potential to add the ability to filter on a specific property later. The reasoning for this is thus: 1. There will naturally be periods of time where nothing happens 2. Too many duplicates will interfere with and confuse the contrastive learning algorithm, as each batch will have less variance in its samples. This is especially important because contrastive learning causes it to compare every item in each batch with every other item in the batch.	2022-07-04 19:46:06 +01:00
Starbeamrainbowlabs	234e2b7978	Write \n end of line character. We actually forgot this, wow....	2022-07-04 17:05:05 +01:00
Starbeamrainbowlabs	920cc3feaf	Properly close last writer, otherwise Node.js doesn't quit	2022-07-04 17:04:11 +01:00
Starbeamrainbowlabs	588ee87b83	Bugfix: fix end-of-file	2022-07-01 19:34:26 +01:00
Starbeamrainbowlabs	5b2d71f41f	it works .....I think	2022-07-01 19:08:36 +01:00
Starbeamrainbowlabs	1297f41105	.tfrecord files are too much hassle; let's go with a standard of .jsonl.gz instead	2022-07-01 18:28:39 +01:00
Starbeamrainbowlabs	f5f267c6b6	Update dependencies	2022-07-01 16:56:51 +01:00
Starbeamrainbowlabs	ba258fbba0	Remove debug logging	2022-05-19 19:25:44 +01:00
Starbeamrainbowlabs	e030e6c2d5	Fix remaining(?) crashes in our code	2022-05-19 19:13:28 +01:00
Starbeamrainbowlabs	3cb7e42505	it doesn't crash as much now, but it still isn't behaving.	2022-05-19 18:52:15 +01:00
Starbeamrainbowlabs	bb018c53f6	Fix many bugs. Many bugs remain though	2022-05-19 17:54:14 +01:00
Starbeamrainbowlabs	cc5efbae8a	Implement tfrecodify subcommand. It's all still untested, but that's the next step	2022-05-19 17:15:15 +01:00
Starbeamrainbowlabs	0fa7ae9d6a	Implement plumbing, but it's all untested	2022-05-18 17:47:02 +01:00
Starbeamrainbowlabs	bf4866bdbc	Add data readers	2022-05-18 17:04:11 +01:00
Starbeamrainbowlabs	9411ad3218	tweak licence	2022-05-13 19:08:04 +01:00
Starbeamrainbowlabs	8a9cd6c1c0	Lay out some basic scaffolding. I really hope this works. This is the 3rd major revision of this model. I've learnt a ton of stuff between now and my last attempt, so here's hoping that all goes well :D The basic idea behind this attempt is Contrastive Learning. If we don't get anything useful with this approach, then we can assume that it's not really possible / feasible. Something we need to watch out for is the variance (or rather lack thereof) in the dataset. We have 1.5M timesteps, but not a whole lot will be happening in most of those.... We may need to analyse the variance of the water depth data and extract a subsample that's more balanced.	2022-05-13 19:06:15 +01:00

48 commits
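
Commit 1a657bd653 describes the uniq subcommand as deduplicating lines in the data files, and commit 1297f41105 settles on .jsonl.gz as the file format. As a rough illustration of that idea only, and not the repository's actual implementation (which this log does not show), line-level deduplication of a single .jsonl.gz file might be sketched in Python as below. The function name uniq_jsonl_gz, the use of SHA-256 digests, and the single-file scope are all assumptions made for this sketch.

```python
import gzip
import hashlib
from pathlib import Path
from typing import Tuple


def uniq_jsonl_gz(src: Path, dst: Path) -> Tuple[int, int]:
    """Copy src to dst, dropping any line whose exact text was already seen.

    Hypothetical helper for illustration; returns (lines_read, lines_written).
    Blank lines are skipped and not counted.
    """
    seen = set()  # SHA-256 digests of the lines written so far
    lines_read = 0
    lines_written = 0
    with gzip.open(src, "rt", encoding="utf-8") as fin, \
            gzip.open(dst, "wt", encoding="utf-8") as fout:
        for raw in fin:
            line = raw.rstrip("\n")
            if not line:
                continue
            lines_read += 1
            digest = hashlib.sha256(line.encode("utf-8")).digest()
            if digest in seen:
                continue  # exact duplicate of an earlier line
            seen.add(digest)
            fout.write(line + "\n")
            lines_written += 1
    return lines_read, lines_written
```

Storing a fixed-size digest per line, rather than the line itself, keeps memory use modest when individual records are large. Deduplicating across several files would probably mean sharing the seen set between calls, which this sketch does not attempt.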