# Wind speed prediction using ARIMA model

58 views (last 30 days)
Aubai on 19 May 2014
Answered: Alex on 16 Apr 2019
Dear All,
I am trying to predicte the next 2 hours wind speed of 10-min wind speed reading (12-point ahead forecasting). for that i am trying to compare an ANN-NAR model with ARIMA model. for the last one i am getting problems in the predicted wind speed.
in my code i am using a very simple method which has the following steps:
1- Calculated the Autocorrelation & Partial Autocorrelation functions on the row data in order to:
A- see if there is a need for data differencing (Identifiy the d value of the ARIMA model)
B- try to identify the p,q order of the AR and MA filters respectivly.
2- re-calculate the Autocorrelation & Partial Autocorrelation function on the differenced data in order to see if it changes and to identifiy the correct d-value of the ARIMA model.
3- as this Autocorrelation calculation is time consuming it can be shutdown by the if condition.
4- after reading the results of the "correlation- test" an ARIMA model is created, in this mfile i have created random ARIMA models which need to be tested.
5- after building the ARIMA model the standerd steps are carried out:
A- estimating the model
B- calculating the residual (optional at the time being don't care of the results)
C- testing the residual (optional at the time being don't care of the results)
D- forecasting the wind speed for the next 2 hours (12- points ahead of 10-min reading)
E- running the simulation (Monte Carlo simulation) (optional)
F- comparing the forecasted results with the measured results.
MY QUESTION: Why i am getting a non-realistic forecasting results of wind speed when using the forecast function on the selected (estimated) ARIMA model?
my code is as the following:
if true
%%introduction for the usage of this mfile
% Creating an ARIMA model in matlab to intgrate with the ANN script to
% compare the different prediction results of those two differnt models in
% the report was called:
% 1- STSFMs-ARIMA
% 2- ANNTSFMs_NAR
clear
clc
tic
%%pre-processing of the wind speed data by doing the ac.f and pac.f
% this part will try to describe the available data which will help use
% understand the best fit model to be used in next steps.
%----------------------- End ------------------------
% for doing the AutoCorrelation Function and the Partial AutoCorrelation
% Function please use the following functions in matlab:
k_max = round(length(Final_test)/3);% maximum number of lags to be correlated
Test = 0;
if Test
tic
[ACF, lags, bounds] = autocorr(Final_test,k_max);
Feedback_delays = lags((ACF > bounds(1,1)) | (ACF < bounds(2,1)));
q = Feedback_delays;
[PACF, Plags, Pbounds] = parcorr(Final_test,k_max);
P = lags((PACF > Pbounds(1,1)) | (PACF < Pbounds(2,1)));
figure();
subplot(2,1,1)
lineHandles = stem(lags,ACF,'filled','r-o');
set(lineHandles(1),'MarkerSize',4)
grid('on')
xlabel('Lag')
ylabel('Sample Autocorrelation')
title('Sample Autocorrelation Function')
hold('on')
plot(lags,ACF);hold on; plot(lags,repmat(bounds(1,1),1,max(lags)+1));hold on;plot(lags,repmat(bounds(2,1),1,max(lags)+1))
subplot(2,1,2)
lineHandles = stem(Plags,PACF,'filled','r-o');
set(lineHandles(1),'MarkerSize',4)
grid('on')
xlabel('Lag')
ylabel('Sample Partial Autocorrelations')
title('Sample Partial Autocorrelation Function')
hold('on')
plot(Plags,PACF);hold on; plot(Plags,repmat(Pbounds(1,1),1,max(Plags)+1));hold on;plot(Plags,repmat(Pbounds(2,1),1,max(Plags)+1))
toc
end
% if we have a linearly decaying sample ACF indicates a nonstationary
% process (time-serie) for that a diffrencing should be done using the diff
% function in matlab as follows:
if Test
tic
Final_test_stationary = diff(Final_test);
[ACF_D, lags_D, bounds_D] = autocorr(Final_test_stationary,k_max);
Feedback_delays_D = lags_D((ACF_D > bounds_D(1,1)) | (ACF_D < bounds_D(2,1)));
q_D = Feedback_delays_D;
[PACF_D, Plags_D, Pbounds_D] = parcorr(Final_test_stationary,k_max);
P_D = lags((PACF_D > Pbounds_D(1,1)) | (PACF_D < Pbounds_D(2,1)));
figure();
subplot(2,1,1)
lineHandles = stem(lags_D,ACF_D,'filled','r-o');
set(lineHandles(1),'MarkerSize',4)
grid('on')
xlabel('Lag')
ylabel('Sample Autocorrelation')
title('Sample Autocorrelation Function')
hold('on')
plot(lags_D,ACF_D);hold on; plot(lags_D,repmat(bounds_D(1,1),1,max(lags_D)+1));hold on;plot(lags_D,repmat(bounds_D(2,1),1,max(lags_D)+1))
subplot(2,1,2)
lineHandles = stem(lags_D,PACF_D,'filled','r-o');
set(lineHandles(1),'MarkerSize',4)
grid('on')
xlabel('Lag')
ylabel('Sample Partial Autocorrelations')
title('Sample Partial Autocorrelation Function')
hold('on')
plot(Plags_D,PACF_D);hold on; plot(Plags_D,repmat(Pbounds_D(1,1),1,max(Plags_D)+1));hold on;plot(Plags_D,repmat(Pbounds_D(2,1),1,max(Plags_D)+1))
toc
end
%%creating the ARIMA-mode
% this part of this mfile will show how to creat an STSFMs-ARIMA model and
% how to adjust the different setting of such a model in matlab
if 0 == 1
P = 1;
D = 1;
Q = 1;
STSFMs_ARIMA = arima(P,D,Q); % the creation of an ARIMA model with the following factors: AR-Factor = 2 (first input), Integration_Factor = 2 (second input), MA_Factor = 2 (third input).
% the built ARIMA model will looks like this:
% ARIMA(2,2,2) Model:
% --------------------
% Distribution: Name = 'Gaussian'
% P: 4
% D: 2
% Q: 2
% Constant: NaN
% AR: {NaN NaN} at Lags [1 2]
% SAR: {}
% MA: {NaN NaN} at Lags [1 2]
% SMA: {}
% Variance: NaN
% now to adjust the settings of such a model (seasonallity, lags ,and so
% on) we can use the following
STSFMs_ARIMA.Seasonality = 12;
% the model now will looks like this:
% ARIMA(2,2,2) Model Seasonally Integrated:
% ------------------------------------------
% Distribution: Name = 'Gaussian'
% P: 16
% D: 2
% Q: 2
% Constant: NaN
% AR: {NaN NaN} at Lags [1 2]
% SAR: {}
% MA: {NaN NaN} at Lags [1 2]
% SMA: {}
% Seasonality: 12
% Variance: NaN
% as it can be seen from this detailed represntation of the ARIMA model the
% following paramerters can be adjusted:
% 1- D (defrencing Factor)
% 2- Constant (Constant term of the selected model depend on the predicted time series)
% 3- AR (Non-seasonal AutoRegression coefficents)
% 4- SAR (Seasonal AutoRegression Coefficents)
% 5- MA (Non-seasonal MovingAverage coefficents)
% 6- SMA (Seasonal MovingAverage coefficents)
% 7- Seasonality
% 8- Variance.
% all the previouslly stated parameters could be ajusted in the same way
% the seasonality of the model was changed
STSFMs_ARIMA.AR = [];
STSFMs_ARIMA.MA = [];
Lags = num2cell(nan(1,STSFMs_ARIMA.Seasonality));
STSFMs_ARIMA.SAR = Lags;
STSFMs_ARIMA.SMA = Lags;
else
STSFMs_ARIMA = arima('SMALags',12,'D',1,'SARLags',12,...
'Seasonality',12);
D1 = LagOp({1 -1},'Lags',[0,1]);
D12 = LagOp({1 -1},'Lags',[0,12]);
D = D1*D12;
dFinal_test = filter(D,Final_test);
STSFMs_ARIMA_1 = arima('SMALags', 12, 'D',1,'SARLags',12,'MALags',1,'ARLags',1,...
'Seasonality',12);
STSFMs_ARIMA_Minitab = arima('SMALags', 12,'SMA',0.9852, 'D',1,'SARLags',12,'SAR',0.1264,'MALags',1,'MA',0.0500,'ARLags',1,'AR',0.9688,...
'Seasonality',12,'Constant',-0.0002256,'Variance',2);
max_order = 3;
[aic_matrix,bic_matrix] = ARMA_model_order_tuning(Final_test,max_order);
end
%%tuning (training or estimating) stage of the built ARIMA model
% for tuning our ARIMA model we will need to use the estimate function in
% matlab with a training time series. as following
Est_STSFMs_ARIMA = estimate(STSFMs_ARIMA,Final_test,'Display','full');
Est_STSFMs_ARIMA_1 = estimate(STSFMs_ARIMA_1,Final_test,'Display','full');
Est_STSFMs_ARIMA_Minitab = estimate(STSFMs_ARIMA_Minitab,Final_test,'Display','full');
% ther reuslts of tuning such model will be shown in matlab as following:
% ARIMA(2,2,2) Model Seasonally Integrated:
% ------------------------------------------
% Conditional Probability Distribution: Gaussian
%
% Standard t
% Parameter Value Error Statistic
% ----------- ----------- ------------ -----------
% Constant 0.000339604 0.000342367 0.99193
% AR{1} -1.07354 0.012271 -87.4857
% AR{2} -0.163044 0.0125264 -13.0161
% MA{1} 0.00845896 0.00237307 3.56456
% MA{2} -0.980359 0.00239701 -408.992
% Variance 0.551794 0.00897504 61.481
% for changing the estimation settings use the optimset as following:
options = optimset('fmincon');
options = optimset(options,'Algorithm','sqp','TolCon',1e-7,'MaxFunEvals',...
1000,'Display','iter','Diagnostics','on');
%%check goodness of fit
% now we will try to check of the residuals are normally distributed and
% uncorrelated by using the infer function in matlab and finaly plot the
% resutls by the following code:
res = infer(Est_STSFMs_ARIMA,Final_test);
res_1 = infer(Est_STSFMs_ARIMA_1,Final_test);
res_Minitab = infer(Est_STSFMs_ARIMA_Minitab,Final_test);
% for plotting the results
figure();
subplot(2,2,1)
plot(res./sqrt(Est_STSFMs_ARIMA.Variance));grid on
title('Standardized Residuals')
subplot(2,2,2)
qqplot(res);grid on
subplot(2,2,3)
autocorr(res)
subplot(2,2,4)
parcorr(res)
figure();
subplot(2,2,1)
plot(res_1./sqrt(Est_STSFMs_ARIMA_1.Variance));grid on
title('Standardized Residuals')
subplot(2,2,2)
qqplot(res_1);grid on
subplot(2,2,3)
autocorr(res_1)
subplot(2,2,4)
parcorr(res_1)
figure();
subplot(2,2,1)
plot(res_1./sqrt(Est_STSFMs_ARIMA_Minitab.Variance));grid on
title('Standardized Residuals')
subplot(2,2,2)
qqplot(res_Minitab);grid on
subplot(2,2,3)
autocorr(res_Minitab,108)
subplot(2,2,4)
parcorr(res_Minitab,108)
% the Durbin-Watson Statistic test with the P-value
[P_value,DW]=dwtest(res,Final_test);
% the Ljung-Box-test
[hLBQ,P_valueLBQ] = lbqtest(res,'lags',[12,24,36,48]);
% Engle's ARCH test is used to identify residual heteroscedasticity
[HARCH,p_valueARCH] = archtest(res,'lags',[12,24,36,48]);
% if the results of the first vlaues of the used test method are 1 means
% that evidence is establisted that the test is positive and the tested
% relation exist yet 0 indicate the the test fail and no evidence of the
% tested relation exists.
%%Forecasting stage using the Est_STSFMs_ARIMA (where the coefficents are now known)
% for foreacasting use the forecast matlab function as following:
N = 12;% forecast horizon
[Yc,YcMSE,U] = forecast (Est_STSFMs_ARIMA,N);% the second input is the forecast horizon for more details refer to the created report by the author and matlab documentation.
[Yc_1,YcMSE_1,U_1] = forecast (Est_STSFMs_ARIMA_1,N);
[Yc_Minitab,YcMSE_Minitab,U_Minitab] = forecast (Est_STSFMs_ARIMA_Minitab,12);
% for showing the forecasted results use the plot function in matlab as
% following:
figure();
subplot(3,1,1);
plot(Target_wind);grid on;hold on;plot(YcMSE,'-.r');xlabel('Prediction Step Index');ylabel('Wind Speed in m/s');title('Est_STSFMs_ARIMA_12 model for wind speed forecasting');legend('Expected Outputs','ARIMA Predictions');hold off
subplot(3,1,2);
plot(Target_wind);grid on;hold on;plot(Yc_1,'-.r');xlabel('Prediction Step Index');ylabel('Wind Speed in m/s');title('Est_STSFMs_ARIMA_12 model for wind speed forecasting (with non-seasonal terms)');legend('Expected Outputs','ARIMA Predictions');hold off
subplot(3,1,3);
plot(Target_wind);grid on;hold on;plot(Yc_Minitab,'-.r');xlabel('Prediction Step Index');ylabel('Wind Speed in m/s');title('Est_STSFMs_ARIMA_12 model for wind speed forecasting (with non-seasonal terms as in Minitab)');legend('Expected Outputs','ARIMA Predictions');hold off
%%Monte Carlo simulation of ARIMA
% for simulating different paths of the built model the Monte Carlo
% simulation can be used in Matlab as following:
number_of_paths = 10;
number_of_observation = N;%length(Final_test);
[Y,E,C] = simulate(Est_STSFMs_ARIMA,number_of_observation,'NumPaths',number_of_paths);%length(Final_test));
% for showing the different results of such simulation use the following
% commands:
figure();
subplot(3,1,1);
plot(Y);grid on
title('Simulated Response')
subplot(3,1,2);
plot(E);grid on
title('Simulated Innovations')
subplot(3,1,3);
plot(Y(:,1),'-.r');grid on;hold on; plot(Target_wind);hold off
title('Compare between Simulated and measured wind speed reading')
%%compare model output and measured output
% this part will try to show the usage of the compare function in matlab
nx = 1:10;% the model order
sys = ssest(iddata(Final_test),nx);%#ok<*NASGU> % this will give the optimum model order number
%nx = 1; % chnage this value to that given by the previous analysis
sys = ssest(iddata(Final_test),nx);
compare(Target_wind,sys,N);%length(Final_test));%N)
compare(Final_test,sys,length(Final_test));
toc
end
##### 2 CommentsShowHide 1 older comment
Peter Mills on 24 Jul 2018
what is the function "ARMA_model_order_tuning"? Thank you in advance

Hang Qian on 20 May 2014
To forecast an ARIMA model, we want to provide both a fitted model as well as data. The former only carries model coefficients, but not observations.
Replace
forecast (Est_STSFMs_ARIMA,N)
by
forecast (Est_STSFMs_ARIMA,N,’Y0’,Y)
It will resolve the anomaly of unrealistic values.
Hang Qian
JiashenTeh on 2 Mar 2015
Hi Hang,
Why when I simulate the ARMA model which I have fitted to my wind speed data , the wind speed I get is negative? this doesn't make any sense to me.
Can you please explain this? Thank you.
using the example above, let's say i do this simulate(Est_STSFMs_ARIMA,N)
and I get negative wind speed..

Mohsen on 2 Dec 2014
Edited: Mohsen on 2 Dec 2014
sorry what is the target_wind ? here :plot(Target_wind)
Aubai on 24 Aug 2017
it is the target value of the wind which Need to be predicted

sharad kolte on 26 Feb 2015
HI CAN U SEND THE DAT SET OF WIND SPEED,
Aubai on 24 Aug 2017
sorry mate but i can't

Lourence Ferrera on 30 Oct 2017
Edited: Lourence Ferrera on 30 Oct 2017
Is your data has seasonality ? How can I code the data to show that it has seasonality and forecast this using SARIMA or ARIMA in matlab?
Aubai on 6 Nov 2017
I am still an electrical engineer so I am not the expert on statistical stuff.
here you can find a link for usage of such test. finally I can say that the unit root test can be used to "The task of the test is to determine whether the stochastic component contains a unit root or is stationary" according to the given reference.
What I am trying to say that:
- Trend Analysis for the D-part of your time series
- Seasonal Analysis for the S-part of your time series
- Stochastic Analysis for the Z-part (component) of your time series.
hope this helps

Alex on 16 Apr 2019
Hi Aubui,
Could you explain this statement "k_max = round(length(Final_test)/3);% maximum number of lags to be correlated"?
I could not understand the "max no. of lags to be correlated".
Thanks