Calculating RMSE for single variable time series data

Aaron Haegele on 17 Jun 2020
Hello,
I am rather new to MATLAB and need help calculating RMSE for single-variable time series data. The data is cloud cover percentage from a weather model (predicted variable) and a camera (observed variable). I was able to calculate the RMSE comparing the same times, but now I want to compare different times of the predicted and observed data. I would like to do a 1-day-ahead persistence forecast. A persistence forecast is a forecast that assumes the future weather condition will be the same as the present condition (today equals tomorrow). The persistence forecast is often used as a standard of comparison in measuring the degree of skill of forecasts, because the persistence forecast (non-skilled) should be much less accurate than the skilled forecast (weather model). Attached is a CSV file of the cloud cover data. There are cloud cover percentages from the weather model "HRRR" and the camera. There are 3 cycles per day (15Z, 18Z, 21Z) for December 2018 through February 2019.
I am interested in calculating the persistence RMSE for 1 day ahead. That is, the 15Z, 18Z, and 21Z HRRR cloud cover percentages for December 1, 2018 will be compared to the 15Z, 18Z, and 21Z camera cloud cover percentages for December 2, 2018, and so on. Any help or advice on how to approach this would be greatly appreciated! I can do it easily in Excel, but I am trying to learn to do it in MATLAB so I can also process other winter and summer seasons more easily and quickly. Thanks!

Aditya Patil on 18 Aug 2020
There are various ways to approach this problem.
If there were exactly one row for each date and cycle combination, this could easily be done by creating a new column. Because some data is missing and some data is repeated, one way to do this is to calculate the prediction for each row, and then calculate the RMSE. See the following sample code:
data.date = data.date + calyears(2000); % Fix year
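sumSquaredError = 0; % avoid shadowing the built-in sum
count = 0;
for index = 1:height(data)
    % find the rows for the next day, then the row with the same cycle
    nextdayindex = (data.date == data{index, "date"} + 1)';
    nextday = data(nextdayindex, :);
    cycleindex = (nextday.cycle == data{index, "cycle"})';
    nextcycle = nextday(cycleindex, :); % sometimes two rows match
    % calculate squared error
    % Note: both values are read from hrrr_cloud_cover; to score against
    % the camera, read the camera column into current instead.
    previous = data{index, "hrrr_cloud_cover"};
    current = nextcycle.hrrr_cloud_cover; % may hold 0, 1, or more rows
    if isempty(current)
        continue; % no matching row on the next day
    elseif length(current) > 1
        current = current(1); % keep the first duplicate
    end
    squaredError = (current - previous) .^ 2; % avoid shadowing error
    count = count + 1;
    sumSquaredError = sumSquaredError + squaredError;
end
rmse = sqrt(sumSquaredError / count)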
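The same pairing can also be written without a loop by joining the table with a copy of itself shifted by one day. The following is only a sketch under assumptions: `persistenceRmse` is a hypothetical helper (not a built-in function), `data` is assumed to be a table with a datetime variable `date` and a variable `cycle` as in the loop above, and the two cloud cover variable names are passed in as arguments. Unlike the loop, it drops repeated date/cycle rows before pairing, so the result can differ slightly when duplicates are present.

```matlab
function rmse = persistenceRmse(data, predVar, obsVar)
    % Hypothetical helper (save as persistenceRmse.m).
    % predVar, obsVar: names of numeric variables in data.

    % Keep only the first row for each date/cycle combination.
    [~, keep] = unique(data(:, {'date', 'cycle'}), 'stable');
    data = data(keep, :);

    % Today's value, keyed by the date it will be compared against.
    today = table(data.date + caldays(1), data.cycle, data.(predVar), ...
        'VariableNames', {'date', 'cycle', 'predicted'});
    % The value recorded on that date.
    tomorrow = table(data.date, data.cycle, data.(obsVar), ...
        'VariableNames', {'date', 'cycle', 'observed'});

    % The inner join keeps only pairs where both days are present.
    pairs = innerjoin(today, tomorrow, 'Keys', {'date', 'cycle'});

    % Root mean squared error over the matched pairs, ignoring NaN values.
    err = pairs.observed - pairs.predicted;
    rmse = sqrt(mean(err .^ 2, 'omitnan'));
end
```

Calling `persistenceRmse(data, 'hrrr_cloud_cover', 'hrrr_cloud_cover')` compares the same column as the loop above. To score the HRRR values against the next day's camera values, pass the name of the camera variable in the CSV file as the third argument.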