Rainfall Radar

A model to predict water depth data from rainfall radar information.

This is the 3rd major version of this model.

Unfortunately, using this model is rather complicated and involves a large number of steps. There is no way around this. This README (will) explain the process as best I can, though.

System Requirements

  • Linux (Windows may work but is untested. You will probably have a bad day if you use Windows)
  • Node.js (a recent version - i.e. v16+ - the version in the default Ubuntu repositories is too old)
  • Python 3.8+
  • Nvidia GPU (16GiB+ RAM is strongly recommended) + CUDA and CuDNN (see this table for which versions you need)
  • Experience with the command line
  • 1TiB disk space free
  • Lots of time and patience
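
Some of the requirements above can be checked automatically before you start. The following is a minimal sketch, not part of this repository: the function name check_prerequisites and its parameters are made up for illustration. It only checks the Python version, whether a node binary is on the PATH, and free disk space. It does not check the Node.js version, the GPU, CUDA, or CuDNN.

```python
import shutil
import sys


def check_prerequisites(path=".", min_free_bytes=1024**4, min_python=(3, 8)):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper: the defaults mirror the list above (Python 3.8+,
    1TiB free disk space), but nothing here is part of this repository.
    """
    problems = []

    # Python 3.8+ is required.
    if sys.version_info < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )

    # Only checks that a node binary exists, not that it is v16+.
    if shutil.which("node") is None:
        problems.append("node not found on PATH")

    # Free space on the filesystem that contains `path`.
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(
            f"only {free / 1024**3:.0f} GiB free at {path!r}, "
            f"want {min_free_bytes / 1024**3:.0f} GiB"
        )

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

Point path at wherever you intend to store the dataset, since that is the disk that needs the space.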

Overview

The process of using this model is as follows.

  1. Apply for access to CEDA's 1km rainfall radar dataset
  2. Obtain rainfall radar data (use nimrod-data-downloader)
  3. Obtain a heightmap (or Digital Elevation Model, as it's sometimes known) from the Ordnance Survey (can't remember the link, please PR to add this)
  4. Use terrain50-cli to slice the output from steps #2 and #3 to be exactly the same size [TODO: Preprocess to extract just a single river basin from the data]
  5. Push through HAIL-CAESAR (this fork has the ability to handle streams of .asc files rather than each time step having its own filename)
  6. Use rainfallwrangler in this repository (finally!) to convert the output to .json.gz then .tfrecord files
  7. Pretrain a contrastive learning model
  8. Encode the rainfall radar data with the contrastive learning model you pretrained
  9. Train the actual model to predict water depth

Only steps #6 to #9 actually use code in this repository.
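
Before moving on to step #5, it may be worth confirming that step #4 really did produce grids of the same size. The sketch below is not terrain50-cli and does not use its API. It assumes that both outputs are standard Esri ASCII grid (.asc) files with the usual "key value" header lines (ncols, nrows, cellsize and so on). The function names read_asc_header and same_grid are hypothetical.

```python
import io

# Header keys used by the Esri ASCII grid format (compared in lowercase).
HEADER_KEYS = {
    "ncols", "nrows", "xllcorner", "yllcorner",
    "xllcenter", "yllcenter", "cellsize", "nodata_value",
}


def read_asc_header(stream):
    """Parse the leading 'key value' header lines of an .asc grid.

    Stops at the first line that is not a recognised header line,
    which is assumed to be the first row of data.
    """
    header = {}
    for line in stream:
        parts = line.split()
        if len(parts) != 2 or parts[0].lower() not in HEADER_KEYS:
            break
        header[parts[0].lower()] = float(parts[1])
    return header


def same_grid(a, b, keys=("ncols", "nrows", "cellsize")):
    """True if both headers contain every key and agree on its value."""
    return all(k in a and k in b and a[k] == b[k] for k in keys)


# Tiny in-memory example; real files would be opened with open(path).
radar = read_asc_header(io.StringIO(
    "ncols 3\nnrows 2\nxllcorner 0\nyllcorner 0\n"
    "cellsize 50\nNODATA_value -9999\n1 2 3\n4 5 6\n"
))
heightmap = read_asc_header(io.StringIO(
    "ncols 3\nnrows 2\nxllcorner 0\nyllcorner 0\n"
    "cellsize 50\nNODATA_value -9999\n7 8 9\n1 2 3\n"
))
print(same_grid(radar, heightmap))
```

For a stream of concatenated .asc files, as used by the HAIL-CAESAR fork, this only inspects the first header in the stream.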

rainfallwrangler

rainfallwrangler is a Node.js application that wrangles the dataset into something more appropriate for training an AI efficiently. Both the rainfall radar data and the water depth data are treated as sequences of regular time steps. Here's a diagram explaining the terminology:

                       NOW
│                       │         │Water depth
│▼ Rainfall Radar Data ▼│[Offset] │▼
├─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┼─┬─┬─┬─┬─┼─┐
│ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │
└─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┼─┴─┴─┴─┴─┴─┘
                        │
◄────────── Timesteps ─────────────►
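
The terminology in the diagram can be sketched in code. This is an illustration only, written in Python rather than Node.js, and it is not how rainfallwrangler is implemented. The function name make_samples and the parameters window and offset are hypothetical. It also assumes one reading of the diagram: a sample is a run of window consecutive rainfall time steps ending at NOW, paired with the single water depth time step that comes offset steps after the first step following NOW.

```python
def make_samples(rainfall, water_depth, window, offset):
    """Pair each rainfall window with one later water depth time step.

    rainfall and water_depth are equal-length sequences with one entry
    per time step. Samples are returned in time order.
    """
    if len(rainfall) != len(water_depth):
        raise ValueError("rainfall and water_depth must cover the same time steps")
    if window < 1 or offset < 0:
        raise ValueError("window must be >= 1 and offset must be >= 0")

    samples = []
    # `start` is the first time step of the rainfall window.
    for start in range(len(rainfall) - window - offset):
        now = start + window          # boundary marked NOW in the diagram
        target = now + offset         # skip `offset` steps after NOW
        samples.append((rainfall[start:now], water_depth[target]))
    return samples


# Ten time steps, numbered so the pairing is easy to follow.
rainfall = list(range(10))
water_depth = [step * 10 for step in range(10)]
for radar_window, depth in make_samples(rainfall, water_depth, window=3, offset=2):
    print(radar_window, "->", depth)
```

With window=3 and offset=2, the first sample pairs rainfall steps 0 to 2 with the water depth at step 5, and the samples keep their original time order.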

Note to self: 150.12 hashes/sec on i7-4770 4c8t, ???.?? hashes/sec on Viper compute

After double checking, rainfallwrangler does NOT mess with the ordering of the data.

License

All the code in this repository is released under the GNU Affero General Public License unless otherwise specified. The full license text is included in the LICENSE.md file in this repository. GNU have a great summary of the license, which I strongly recommend reading before using this software.