Data Analysis
The battle of the weather data has begun. Motivated to bring factual data, I built a small validation script.
This script will do the following:
- Get the weather forecasts I make during the day.
- Get the BOM forecasts during the day for Dayboro.
- Get the actual recorded weather from the Dayboro Weather station for the day.
- Compare all of them and provide the variation.
Let's collect some data
Basically what I do is collect the following data:
- Get the actual weather data. I do that at 2.5-second intervals.
- Get the detailed forecast from the BOM. I do that at 3-hour intervals to give them the best chance.
- Then compare the BOM forecast for Dayboro against the actual data collected at Dayboro.
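A minimal sketch of the first step in Python, not the actual script: before high-frequency readings can be compared with a forecast, they have to be reduced to the same kind of values the forecast gives. This assumes the readings arrive as (timestamp, temperature) pairs and that the comparison uses the daily lowest and highest temperature. The function name `summarise_day`, the dictionary keys and the sample values are all made up for illustration.

```python
from datetime import datetime, timedelta


def summarise_day(readings):
    """Reduce (timestamp, temperature_c) readings to the daily low and high."""
    temps = [temp for _, temp in readings]
    if not temps:
        raise ValueError("no readings for this day")
    return {"lowest_temp_c": min(temps), "highest_temp_c": max(temps)}


# Hypothetical sample: four readings taken 2.5 seconds apart.
start = datetime(2024, 1, 15, 14, 0, 0)
readings = [
    (start + timedelta(seconds=2.5 * i), temp)
    for i, temp in enumerate([28.4, 28.6, 28.5, 28.9])
]

print(summarise_day(readings))
# {'lowest_temp_c': 28.4, 'highest_temp_c': 28.9}
```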
I do the same with my own generated forecast. I have been collecting weather data since 2004, but have only been doing forecasts since 2008 or so. Every night the forecast is checked against a few months of data and the variations are put back into the system for the next forecast. This generally should give a good forecast.
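To illustrate the idea of feeding variations back in, here is one very simple way it could be done in Python. This is only a sketch under assumptions: it takes the average of (actual minus forecast) over past days and adds it to the next raw forecast. The real nightly check may work quite differently. The names `nightly_correction` and `corrected_forecast` and the sample numbers are hypothetical.

```python
from statistics import mean


def nightly_correction(history):
    """Average of (actual - forecast) over past (forecast, actual) pairs."""
    if not history:
        return 0.0  # nothing to learn from yet
    return mean(actual - forecast for forecast, actual in history)


def corrected_forecast(raw_forecast, correction):
    """Shift the raw forecast by the average past variation."""
    return raw_forecast + correction


# Hypothetical history: the forecast ran 2, 1 and 3 degrees too warm.
history = [(30.0, 28.0), (25.0, 24.0), (27.0, 24.0)]
correction = nightly_correction(history)

print(correction)                            # -2.0
print(corrected_forecast(29.0, correction))  # 27.0
```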
I was doing a forecast every two hours, but have now changed that to align with the BOM and generate one every three hours at the same time the BOM is creating one. This should eliminate any future arguments about timing.
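One way to keep both forecasts on the same clock is to round every forecast time down to the start of its three-hour slot and compare only forecasts that share a slot. The sketch below assumes the slots start at midnight local time, which may not match the BOM's real issue times. `forecast_slot` and `SLOT_HOURS` are made-up names.

```python
from datetime import datetime

SLOT_HOURS = 3  # assumed slot length, matching the three-hour cycle


def forecast_slot(ts):
    """Round a timestamp down to the start of its three-hour slot."""
    slot_hour = ts.hour - ts.hour % SLOT_HOURS
    return ts.replace(hour=slot_hour, minute=0, second=0, microsecond=0)


print(forecast_slot(datetime(2024, 1, 15, 14, 37)))
# 2024-01-15 12:00:00
```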
Results
Over time I collect more data comparing the actual values against the predictions.
The positive values mean over-reporting (the actual came in above the forecast), and the negative values mean under-reporting (the actual came in below the forecast). In this example of test data, the UV was over-reported by 6.5 and the Lowest Temperature was under-reported by 10C.
This means that if the BOM forecast 30C and it only got to 20C, the result is 20C - 30C = -10C.
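The calculation can be sketched in a few lines of Python. The sign follows the 30C versus 20C example: the result is the actual value minus the forecast value. The function names, dictionary keys and the second sample measurement are made up for illustration.

```python
def variation(forecast, actual):
    """Actual minus forecast: negative when the actual came in under."""
    return actual - forecast


def compare(forecast, actual):
    """Variation for every measurement present in both dictionaries."""
    return {
        name: variation(forecast[name], actual[name])
        for name in forecast
        if name in actual
    }


# Hypothetical values; the first pair is the 30C versus 20C example.
bom_forecast = {"highest_temp_c": 30.0, "lowest_temp_c": 18.0}
station_actual = {"highest_temp_c": 20.0, "lowest_temp_c": 18.5}

print(compare(bom_forecast, station_actual))
# {'highest_temp_c': -10.0, 'lowest_temp_c': 0.5}
```

The same `compare` call works for my own forecast against the station data, so both sets of results come from identical arithmetic.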
I do the same for my forecasts versus actual. Once I get some more data I will publish it in real time. At the moment it is still in testing, and I have to see if it actually works in an unbiased way :-).