sbrl-github/research-rainfallradar

mirror of https://github.com/sbrl/research-rainfallradar synced 2024-10-05 19:34:07 +00:00

Author	SHA1	Message	Date
Starbeamrainbowlabs	784b8ed35c	recordify: catch NaN --count-file	2022-11-01 19:53:21 +00:00
Starbeamrainbowlabs	91152ebb1c	wrangler:recordify: update CLI help. We only output .jsonl.gz to a DIRECTORY, so update the CLI help to reflect this	2022-11-01 18:29:47 +00:00
Starbeamrainbowlabs	0c11ddca4b	rainfallwrangler does NOT mess up the ordering of the data	2022-10-18 19:07:14 +01:00
Starbeamrainbowlabs	222a6146ec	write glue for .jsonl.gz → .tfrecord.gz converter	2022-08-08 15:33:59 +01:00
Starbeamrainbowlabs	927c30e189	recompress files in the right order	2022-07-25 18:44:23 +01:00
Starbeamrainbowlabs	3332fa598a	Add new recompress subcommand; also fix typos and CLI definitions	2022-07-25 17:54:23 +01:00
Starbeamrainbowlabs	82e826fd69	Fix bugs in remainder of rainfallwrangler:uniq :D	2022-07-22 18:05:03 +01:00
Starbeamrainbowlabs	a966cdff35	uniq: fix a lot of bugs, but it's not working right just yet. There's still a bug in the file line deleter	2022-07-08 19:54:24 +01:00
Starbeamrainbowlabs	3b2715c6cd	recordify: fix process exiting and incomplete files issues • Node.js not exiting at all • Node.js exiting on end_safe-ing a stream.Writable (?????) • Incomplete files - "unexpected end of file" errors and invalid JSON	2022-07-08 18:54:00 +01:00
Starbeamrainbowlabs	b9a018f9a9	properly close all the streams	2022-07-08 16:51:17 +01:00
Starbeamrainbowlabs	1a657bd653	add new uniq subcommand. It deduplicates lines in the files, with the potential to add the ability to filter on a specific property later. The reasoning for this is thus: 1. There will naturally be periods of time where nothing happens 2. Too many duplicates will interfere with and confuse the contrastive learning algorithm, as each batch will have less variance in its samples. This is especially important because contrastive learning compares every item in each batch with every other item in the batch.	2022-07-04 19:46:06 +01:00
Starbeamrainbowlabs	1297f41105	.tfrecord files are too much hassle. Let's go with a standard of .jsonl.gz instead	2022-07-01 18:28:39 +01:00
Starbeamrainbowlabs	3cb7e42505	it doesn't crash as much now, but it still isn't behaving.	2022-05-19 18:52:15 +01:00
Starbeamrainbowlabs	bb018c53f6	Fix many bugs. Many bugs remain though	2022-05-19 17:54:14 +01:00
Starbeamrainbowlabs	cc5efbae8a	Implement tfrecodify subcommand. It's all still untested, but that's the next step	2022-05-19 17:15:15 +01:00
Starbeamrainbowlabs	8a9cd6c1c0	Lay out some basic scaffolding. I really hope this works. This is the 3rd major revision of this model. I've learnt a ton of stuff between now and my last attempt, so here's hoping that all goes well :D The basic idea behind this attempt is Contrastive Learning. If we don't get anything useful with this approach, then we can assume that it's not really possible / feasible. Something we need to watch out for is the variance (or rather lack thereof) in the dataset. We have 1.5M timesteps, but not a whole lot will be happening in most of those.... We may need to analyse the variance of the water depth data and extract a subsample that's more balanced.	2022-05-13 19:06:15 +01:00

16 commits
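
The uniq commit (1a657bd653) describes deduplicating lines in the .jsonl.gz files so that batches fed to contrastive learning contain fewer identical samples. The following is only a minimal in-memory sketch of that idea, not the repository's actual implementation: the function names `uniq_lines` and `uniq_jsonl_gz` are hypothetical, hashing lines with SHA-256 is an assumption, and the real subcommand may well stream across many files rather than load a whole buffer.

```javascript
"use strict";
const zlib = require("node:zlib");
const crypto = require("node:crypto");

// Hypothetical helper: drop lines that have been seen before, keeping the
// first occurrence of each and preserving order. `seen` holds line hashes,
// so passing the same Set for several files deduplicates across files too.
function uniq_lines(lines, seen = new Set()) {
	const result = [];
	for (const line of lines) {
		if (line.length === 0) continue; // skip blanks, e.g. after a trailing newline
		const hash = crypto.createHash("sha256").update(line).digest("hex");
		if (seen.has(hash)) continue; // duplicate: skip it
		seen.add(hash);
		result.push(line);
	}
	return result;
}

// Hypothetical helper: deduplicate one gzipped JSONL buffer in memory and
// return a new gzipped buffer with one JSON object per line.
function uniq_jsonl_gz(buffer_gz, seen = new Set()) {
	const lines = zlib.gunzipSync(buffer_gz).toString("utf-8").split("\n");
	const unique = uniq_lines(lines, seen);
	return zlib.gzipSync(unique.map(line => line + "\n").join(""));
}

// Usage: three identical records collapse to one.
const input = zlib.gzipSync('{"a":1}\n{"a":1}\n{"a":2}\n{"a":1}\n');
const output = zlib.gunzipSync(uniq_jsonl_gz(input)).toString("utf-8");
console.log(output);
```

Storing hashes rather than whole lines keeps the memory used per seen record fixed, which matters if the records are large. Whether the original tool makes the same trade-off is not stated in the commit message.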